You know you have made it in heterodox economics when
someone claims that you have a plan to destroy post-Keynesianism as we know it
– muhahahaha. Hyperbole aside, this is essentially what one Philip Pilkington
thinks that I’m doing, as he explains in this
rant, itself a spin-off of a
discussion that started at the INET YSI Facebook page and went somewhat astray
(the thread containing it has since been closed by the moderator of the page).
Pilkington, a journalist-cum-research assistant currently
working on his dissertation at Kingston University, frames the discussion around two alleged sins
that I have committed,
namely, (i) not knowing what I’m talking about and (ii) mistaking models for
reality and making grandiose claims.
As evidence of the first sin, he offers this comment of mine
from the Facebook discussion (emphasis added here, you’ll see why in a second):
OK, this ergodicity
nonsense gets thrown around a lot, so I should comment on it. You only need a
process (time series, system, whatever) to be ergodic if you are trying to make
estimates of properties of a given probability distribution based on past data.
The idea is that enough observations through time (the so-called time averages)
give you information about properties of the probability distribution over the
sample space (the so-called ensemble averages). So for example you observe a stock
price long enough and get better and better estimates of its moments (mean,
variance, kurtosis, etc). Presumably you then use these estimates in whatever
formula you came up with (Black-Scholes or whatever) to compute something else
about the future (say the price of an option). The same story holds for almost
all mainstream econometric models: postulate some relationship, use historical
time series to estimate the parameters, plug the parameters into the
relationship and spill out a prediction/forecast.
Of course none of this
works if the process you are studying is non-ergodic, because the time averages
will NOT be reliable estimates of the properties of the probability distribution.
So the whole thing goes up in flames and people like Paul Davidson go around
repeating “non-ergodic, non-ergodic” ad infinitum. The thing is, none of this is
necessary if you take a Bayes’s theorem view of prediction/forecast. You start
by assigning prior probabilities to models (even models that have nothing to do
with each other, like an IS/LM model and a DSGE model with their respective parameters),
make predictions/forecasts based on
these prior probabilities, and then update them when new information becomes
available. Voila, no need for ergodicity. Bayesian statistics could not
care less if the prior probabilities change because they are time-dependent,
the world changed, or you were too stupid to assign them to begin with. It is
only a narrow frequentist view of prediction that requires ergodicity (and a
host of other assumptions like asymptotic normality of errors) to be
applicable. Unfortunately, that’s what’s used by most econometricians. But it
doesn’t need to be like that. My friend Chris Rogers from Cambridge has a
t-shirt that illustrates this point. It says: “Estimate Nothing!”. I think I’ll
order a bunch and distribute them to my students.
Pilkington then goes on to say:
It is not clear that
Grasselli’s approach here can be used in any meaningful way in empirical work.
What we are concerned with as economists is trying to make predictions about
the future.
These range from the
likely effects of policy, to the moves in markets worldwide. What Grasselli is
interested in here is the robustness of his model. He wants to engage in
schoolyard posturing saying “my model is better than your model because it made
better predictions”.
Wait, what? Exactly which part of “make
predictions/forecasts based on these prior probabilities, and then update them
when new information becomes available” is not clear? Never mind that I’d give
a pound of my own flesh for this to be the Grasselli approach (it’s actually
just the Bayesian approach): its sole purpose is to make precise predictions
and then update them based on new evidence, so it’s baffling that Pilkington
has difficulties understanding how it can be used in empirical work. Not to
mention the glaring contradiction of saying in one breath that
“What we are concerned with as economists is trying to make
predictions about the future” and admonishing me in the next for allegedly
claiming that “my model is better than your model because it made better
predictions”. So you want to make predictions, but somehow don’t think that a
model that makes better predictions is better than one that makes worse
predictions. Give me a minute to collect my brains from across the room…
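For the avoidance of doubt, here is a minimal sketch of that predict-and-update loop in Python. It is my own illustration: the two “models” are made-up stand-ins (a random walk and a mean-reverting rule), not anyone’s actual economic models. Note that nothing in it estimates a moment from a time average, and nothing requires ergodicity:

import numpy as np

# Two rival forecasting rules for a series y (illustrative stand-ins only):
# Model A (random walk):     y[t+1] ~ N(y[t], 1)
# Model B (mean reversion):  y[t+1] ~ N(0.5 * y[t], 1)
rng = np.random.default_rng(42)
y = np.cumsum(rng.normal(size=50))   # some observed series

def gaussian_lik(y_next, mean, sigma=1.0):
    return np.exp(-0.5 * ((y_next - mean) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

p = np.array([0.5, 0.5])             # prior probabilities over the two models
for t in range(len(y) - 1):
    lik = np.array([gaussian_lik(y[t + 1], y[t]),          # Model A's predictive likelihood
                    gaussian_lik(y[t + 1], 0.5 * y[t])])    # Model B's predictive likelihood
    p = p * lik / np.sum(p * lik)    # Bayes's theorem: update after each observation

print("posterior model probabilities:", p)

The posterior weights do all the work: whichever model predicts better accumulates probability, and at no point does anything need to be estimated from historical time averages.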
Back to my comment on the “ergodicity nonsense”: the key
point was that it is the frequentist approach to statistics that forces one to
make estimates of parameters based on past time series, and this requires a lot of
assumptions, including ergodicity. In Bayesian statistics, the modeler is free
(in fact encouraged) to come up with her own priors, based on a combination of
past experience, theoretical understanding, and personal judgment. Fisher, the
father of the frequentist approach, wanted to ban any subjectivity from
statistics, advocating instead (I’m paraphrasing here) that one should
“Estimate everything!”. By contrast, Bayesians will tell you that you should
estimate when you can, but supplement it with whatever else you like. To
illustrate how historical estimates are not only misleading (for example when
the underlying process is non-ergodic) but also unnecessary in Bayesian
statistics, Chris Rogers has the mantra “Estimate nothing!”. But nowhere does it
say “Predict nothing!”. On the contrary, once again, the nexus “make
predictions based on probabilities – compare with reality – change the
probabilities” is what the
approach is all about. So on the topic of advice for t-shirt making, Pilkington
should wear one that says “I ought to read the paragraphs I quote” in front,
followed by “and try to avoid self-contradictions” on the back.
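And to see how badly historical estimates can fail without ergodicity, here is a toy simulation of my own (not Rogers’s): each realization of the process carries its own randomly drawn level, so the time average of any single path tells you about that path, not about the ensemble distribution:

import numpy as np

# A simple non-ergodic process: X_t = A + noise, where the level A is drawn
# once per realization. The ensemble mean is 0, but the time average of any
# single path converges to that path's own A, not to 0.
rng = np.random.default_rng(0)
n_paths, n_steps = 1000, 500
A = rng.normal(0, 5, size=n_paths)                       # one level per realization
X = A[:, None] + rng.normal(0, 1, size=(n_paths, n_steps))

print("time average of one path:  %.2f" % X[0].mean())      # close to A[0], not 0
print("ensemble average of paths: %.2f" % X[:, -1].mean())  # close to 0

A frequentist armed with a single historical path would confidently estimate the mean as that path’s own level and be wrong about every other realization; a Bayesian is free to say so upfront.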
Moving on, as evidence for the second sin, Pilkington quotes
another long comment of mine with the “clearest explanation” (his words, but I
agree!) of what I’m doing, namely:
I’m not comparing
models, I’m comparing systems within the same model. Say System 1 has only one
locally stable equilibrium, whereas System 2 has two (a good one and a bad
one). Which one has more systemic risk? There’s your first measure. Now say for
System 2 you have two sets of initial conditions: one well inside the basin of
attraction for the good equilibrium (say low debt) and another very close to
the boundary of the basin of attraction (say with high debt). Which set of
initial conditions poses higher systemic risk? There’s your second measure.
Finally, you are monitoring a parameter that is known to be associated with a
bifurcation, say the size of the government response when employment is low,
and the government needs to decide between two stimulus packages, one above and
one below the bifurcation threshold. Which policy leads to higher systemic risk?
There’s your third measure.
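For readers who want something concrete, here is a toy version of these measures in Python, using the textbook one-dimensional system dx/dt = h + x - x^3 (purely illustrative, and far simpler than the models I actually work with). For small h it has a “good” stable equilibrium near x = 1 and a “bad” one near x = -1, separated by an unstable equilibrium; once h crosses roughly 0.385 a saddle-node bifurcation destroys the bad equilibrium:

# Toy illustration of the three systemic risk measures (not my actual model).
def simulate(x0, h, dt=0.01, steps=5000):
    x = x0
    for _ in range(steps):
        x += dt * (h + x - x**3)      # forward Euler, good enough for a sketch
    return x

# Measure 2: initial conditions well inside vs. just outside the good basin
# (for h = 0 the basin boundary is the unstable equilibrium at x = 0)
print(simulate(x0=0.5, h=0.0))    # safely in the good basin -> ends near +1
print(simulate(x0=-0.05, h=0.0))  # just across the boundary -> ends near -1

# Measures 1 and 3: a policy parameter h below vs. above the bifurcation
print(simulate(x0=-1.0, h=0.2))   # two stable equilibria: stays near the bad one
print(simulate(x0=-1.0, h=0.5))   # bad equilibrium gone: escapes to the good one

Counting stable equilibria, locating initial conditions relative to basin boundaries, and tracking parameters relative to bifurcation thresholds are all things you compute within the model, which, as it happens, is exactly what the next accusation is about.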
Pilkington then goes on to paraphrase this explanation in a much clumsier way
(question: if someone has already given you the clearest explanation of
something, why explain it again? Just to make it worse?):
What Grasselli is
doing here is creating a model in which he can simulate various scenarios to
see which one produces high-risk and which will produce low-risk environments
within said model. But is this really “measuring systemic risk”? I don’t think
that it is. To say that it is a means to measure systemic risk would be like me
saying that I found a way to measure the size of God and then when incredulous
people came around to my house to see my technique they would find a computer
simulation I had created of what I think to be God in which I could measure
Him/Her/It.
Now I don’t know what the God example is all about, but Pilkington seems to think that the only way
to measure something is to go out with an instrument (a ruler, for example) and
take a measurement. The problem is that risk, almost by definition, is a
property of future events, and you cannot take a measurement in the future. ALL
you can do is create a model of the future and then “measure” the risk of
something within the model. As Lady Gaga would say “oh there ain’t no other
way”. For example, when you drive along the Pacific Coast Highway and read a
sign on the side of the road that says “the risk of forest fire today is high”,
all it means is that someone has a model (based on previous data, the theory of
fire propagation, simulations and judgment) that takes as inputs the
measurements of observed quantities (temperature, humidity, etc) and calculates
probabilities of scenarios in which a forest fire arises. As time goes by and
the future turns into the present you then observe the actual occurrence of
forest fires and see how well the model performs according to the accuracy of
the predictions, at which point you update the model (or a combination of
models) based on, you guessed it, Bayes’s theorem.
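For concreteness, here is the same update in code, with made-up numbers: two rival fire-risk models, a hypothetical record of days with and without fires, and Bayes’s theorem reweighting the models as the record accumulates:

import numpy as np

# Hypothetical numbers for the forest-fire story (nothing here is real data).
p_fire = np.array([0.30, 0.10])    # Model 1 says 30% chance of fire today, Model 2 says 10%
weights = np.array([0.5, 0.5])     # start agnostic between the two models

observed = [False, False, True, False, True]   # made-up record: fire or no fire each day
for fire in observed:
    lik = p_fire if fire else 1 - p_fire       # likelihood of the day's outcome under each model
    weights = weights * lik / np.sum(weights * lik)

print("updated model probabilities:", weights)

After two fires in five days, the model that assigned fires the higher probability gains weight, exactly as it should.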
So that’s it for the accusation of mistaking models for reality. But still on the
second fundamental sin according to Pilkington, recall that its second part consists
of making “grandiose claims about what they have achieved or will potentially
achieve that ring hollow when scrutinized”, against which his advice is to “tone
down the claims they are making lest they embarrass the Post-Keynesian
community at large”. This is all fine, but it sounds a bit rich coming from
someone who published a piece titled
Teleology and Market Equilibrium: Manifesto for a General Theory of Prices, which upon scrutiny contains some
common platitudes about neoclassical denials of empirical evidence, followed by
this claim:
My goal is to lay out
a general theory of prices in the same way Keynes laid out a general theory of
employment and output. This will provide a framework in which the neoclassical
case of downward-sloping demand curves and upward-sloping supply curves is a
highly unlikely special case. With such a framework we can then approach
particular cases as they arise in a properly empirical manner. In doing this I
hope to be able to introduce the Keynesian theory to pricing; and with that, I
think, the neoclassical doctrines will be utterly destroyed and a full,
coherent alternative will be available. Fingers crossed!
So instead of an actual general theory of prices,
Pilkington’s “manifesto” states his goal to be as great as Keynes and utterly
destroy the neoclassical doctrines. Hear the low tone!
Which brings me to Keynes’s advice on the use of mathematics that Pilkington
also quotes in his rant. Again, if you read the quote carefully you see that
Keynes warns against “symbolic pseudo-mathematical methods” and complains that
“Too large a proportion of recent ‘mathematical’ economics are mere concoctions”.
Notice the prefix pseudo and the inverted commas around the word mathematics in
the original quote, which suggests that Keynes’s peeve was not with mathematics
itself, but with the “imprecise…initial assumptions they rest on”. In
particular, Keynes singles out methods that “expressly assume strict
independence between the factors involved”, which is admittedly a very stupid assumption,
but in no way necessary for the application of (true, i.e. not pseudo)
mathematical methods. For example, NONE of the models I work with assume
independence between factors; on the contrary, they highlight the complex and
surprisingly rich interdependencies, as well as their consequences.
I conclude in meta fashion with a Bayesian framing of this discussion itself.
Pilkington has a very strong prior that my mathematical methods are useless and
he’s honest enough to say so: “I heard about the work Grasselli and others were
doing at the Fields Institute some time ago and I was instantly skeptical”. He
bases this prior on a well-documented post-Keynesian intellectual tradition, as
well as a slightly misguided notion of what constitutes “giant formal models”
(my models are actually pretty simple low-dimensional dynamical systems, but if
you are Bart Simpson then I guess anything looks like a giant formal model).
By contrast, I have a very strong prior that my models are useful, based on a
similarly well-documented intellectual tradition of applications of mathematics
in other areas of study. Pilkington concedes that he might be wrong and
promises unreserved praise if that turns out to be true. Likewise, I might be
wrong, in which case I’ll abandon the models and do something else. In either
case, we’ll both be traveling the same Road to Wisdom, as in the exceptional
poem by Piet Hein:
THE ROAD TO WISDOM?
Well, it's plain
and simple to express:
Err
and err
and err again,
but less
and less
and less.
Nate Silver uses this poem as inspiration for the title of the chapter in his
book The Signal and the Noise explaining the Bayesian approach, which he ends
with a meta statement of his own: “Bayes’s theorem predicts that Bayesians will win”.
Estimate nothing!