Wednesday, August 21, 2013

Small-brain economics: Pilkington's strong prior against mathematics

You know you have made it in heterodox economics when someone claims that you have a plan to destroy post-Keynesianism as we know it – muhahahaha. Hyperbole aside, this is essentially what one Philip Pilkington thinks that I’m doing, as he explains in this rant, itself a spin-off of a discussion that started at the INET YSI Facebook page and went somewhat astray (the thread containing it has since been closed by the moderator of the page).

Pilkington, a journalist-cum-research assistant currently working on his dissertation at Kingston University, frames the discussion around two alleged sins that I have committed, namely, (i) not knowing what I’m talking about and (ii) mistaking model for reality and making grandiose claims.

As evidence of the first sin, he offers this comment of mine from the Facebook discussion (emphasis added here, you’ll see why in a second):

OK, this ergodicity nonsense gets thrown around a lot, so I should comment on it. You only need a process (time series, system, whatever) to be ergodic if you are trying to make estimates of properties of a given probability distribution based on past data. The idea is that enough observations through time (the so called time-averages) give you information about properties of the probability distribution over the sample space (so called ensemble averages). So for example you observe a stock price long enough and get better and better estimates of its moments (mean, variance, kurtosis, etc). Presumably you then use these estimates in whatever formula you came up with (Black-Scholes or whatever) to compute something else about the future (say the price of an option). The same story holds for almost all mainstream econometric models: postulate some relationship, use historical time series to estimate the parameters, plug the parameters into the relationship and spill out a prediction/forecast.

Of course none of this works if the process you are studying is non-ergodic, because the time averages will NOT be reliable estimates of the probability distribution. So the whole thing goes up in flames and people like Paul Davidson go around repeating “non-ergodic, non-ergodic” ad infinitum. The thing is, none of this is necessary if you take a Bayes’s theorem view of prediction/forecast. You start by assigning prior probabilities to models (even models that have nothing to do with each other, like an IS/LM model and a DSGE model with their respective parameters), make predictions/forecasts based on these prior probabilities, and then update them when new information becomes available. Voila, no need for ergodicity. Bayesian statistics could not care less if the prior probabilities change because they are time-dependent, the world changed, or you were too stupid to assign them to begin with. It is only a narrow frequentist view of prediction that requires ergodicity (and a host of other assumptions like asymptotic normality of errors) to be applicable. Unfortunately, that’s what’s used by most econometricians. But it doesn’t need to be like that. My friend Chris Rogers from Cambridge has a t-shirt that illustrates this point. It says: “Estimate Nothing!”. I think I’ll order a bunch and distribute to my students.

Pilkington then goes on to say:

It is not clear that Grasselli’s approach here can be used in any meaningful way in empirical work. What we are concerned with as economists is trying to make predictions about the future. These range from the likely effects of policy, to the moves in markets worldwide. What Grasselli is interested in here is the robustness of his model. He wants to engage in schoolyard posturing saying “my model is better than your model because it made better predictions”.

Wait, what? Exactly which part of “make predictions/forecasts based on these prior probabilities, and then update them when new information becomes available” is not clear? Never mind that I’d give a pound of my own flesh for this to be the Grasselli approach (it’s actually the Bayesian approach), the sole purpose of it is to make precise predictions and then update them based on new evidence, so it’s baffling that Pilkington has difficulties understanding how it can be used in empirical work. Not to mention the glaring contradiction of saying in one breath that “What we are concerned with as economists is trying to make predictions about the future” and admonishing me in the next for allegedly claiming that “my model is better than your model because it made better predictions”. So you want to make predictions, but somehow don’t think that a model that makes better predictions is better than one that made worse predictions. Give me a minute to collect my brains from across the room…

Back to my comment on the “ergodicity nonsense”, the key point was that it is the frequentist approach to statistics that forces one to make estimates of priors based on past time series, and this requires a lot of assumptions, including ergodicity. In Bayesian statistics, the modeler is free (in fact encouraged) to come up with her own priors, based on a combination of past experience, theoretical understanding, and personal judgment. Fisher, the father of the frequentist approach, wanted to ban any subjectivity from statistics, advocating instead (I’m paraphrasing here) that one should “Estimate everything!”. By contrast, Bayesians will tell you that you should estimate when you can, but supplement it with whatever else you like. To illustrate how historical estimates are not only misleading (for example when the underlying process is non-ergodic) but also unnecessary in Bayesian statistics, Chris Rogers has the mantra “Estimate nothing!”. But nowhere does it say “Predict nothing!”. On the contrary, once again, the nexus “make predictions based on probabilities--compare with reality--change the probabilities” is what the approach is all about. So on the topic of advice for t-shirt making, Pilkington should wear one that says “I ought to read the paragraphs I quote” in front, followed by “and try to avoid self-contradictions” on the back.
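The predict-compare-update nexus can be sketched in a few lines of Python. This is only an illustrative toy, not a model from the discussion: the two "models", the probabilities they assign to an event in each period, the equal starting priors and the sequence of outcomes are all made-up assumptions.

```python
# Toy Bayesian updating over two competing models (all numbers hypothetical).

def predict(weights, model_probs):
    """Forecast the event probability as a weighted mix of the models."""
    return sum(w * p for w, p in zip(weights, model_probs))

def update(weights, model_probs, outcome):
    """Apply Bayes's theorem: reweight each model by how well it predicted."""
    likelihoods = [p if outcome else 1.0 - p for p in model_probs]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Assumption: model A gives the event a 20% chance each period, model B 70%.
model_probs = [0.2, 0.7]
weights = [0.5, 0.5]                    # prior probabilities, chosen by judgment
outcomes = [True, True, False, True]    # hypothetical observations

forecasts = []
for outcome in outcomes:
    forecasts.append(predict(weights, model_probs))   # predict
    weights = update(weights, model_probs, outcome)   # compare and update

print("forecasts made before each observation:", forecasts)
print("posterior model probabilities:", weights)
```

Nothing in the loop is estimated from a long historical time series: the priors are simply assigned and the data only enter through the update step.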

Moving on, as evidence for the second sin, Pilkington quotes another long comment of mine with the “clearest explanation” (his words, but I agree!) of what I’m doing, namely:

I’m not comparing models, I’m comparing systems within the same model. Say System 1 has only one locally stable equilibrium, whereas System 2 has two (a good one and a bad one). Which one has more systemic risk? There’s your first measure. Now say for System 2 you have two sets of initial conditions: one well inside the basin of attraction for the good equilibrium (say low debt) and another very close to the boundary of the basin of attraction (say with high debt). Which set of initial conditions poses higher systemic risk? There’s your second measure. Finally, you are monitoring a parameter that is known to be associated with a bifurcation, say the size of the government response when employment is low, and the government needs to decide between two stimulus packages, one above and one below the bifurcation threshold. Which policy leads to higher systemic risk? There’s your third measure.

He then goes on to paraphrase it in a much more clumsy way (question: if someone already gave you the clearest explanation about something, why should you explain again? Just to make it worse?):

What Grasselli is doing here is creating a model in which he can simulate various scenarios to see which one produces high-risk and which will produce low-risk environments within said model. But is this really “measuring systemic risk”? I don’t think that it is. To say that it is a means to measure systemic risk would be like me saying that I found a way to measure the size of God and then when incredulous people came around to my house to see my technique they would find a computer simulation I had created of what I think to be God in which I could measure Him/Her/It.

Now I don’t know what the God example is all about, but Pilkington seems to think that the only way to measure something is to go out with an instrument (a ruler, for example) and take a measurement. The problem is that risk, almost by definition, is a property of future events, and you cannot take a measurement in the future. ALL you can do is to create a model of the future and then “measure” the risk of something within the model. As Lady Gaga would say “oh there ain’t no other way”. For example, when you drive along the Pacific Coast Highway and read a sign on the side of the road that says “the risk of forest fire today is high”, all it means is that someone has a model (based on previous data, the theory of fire propagation, simulations and judgment) that takes as inputs the measurements of observed quantities (temperature, humidity, etc) and calculates probabilities of scenarios in which a forest fire arises. As time goes by and the future turns into the present you then observe the actual occurrence of forest fires and see how well the model performs according to the accuracy of the predictions, at which point you update the model (or a combination of models) based on, you guessed it, Bayes’s theorem.
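To make "measuring risk within a model" concrete, here is a minimal Python sketch in the spirit of the second measure quoted above. It is a hypothetical one-dimensional toy, not one of the actual models: the drift function, the threshold of 0.4, the size of the uncertainty about the starting point and the two initial conditions are all assumptions chosen for illustration. The state has a bad stable equilibrium at 0, a good one at 1, and the threshold marks the boundary between their basins of attraction.

```python
import random

def drift(x, threshold):
    """Toy dynamics: stable equilibria at 0 (bad) and 1 (good), unstable at threshold."""
    return x * (1.0 - x) * (x - threshold)

def ends_in_bad_equilibrium(x0, threshold, dt=0.05, steps=400):
    """Integrate the toy system with Euler steps and report which basin it is in."""
    x = x0
    for _ in range(steps):
        x += dt * drift(x, threshold)
    return x < threshold

def risk(x0, threshold=0.4, noise=0.1, trials=500, seed=0):
    """Monte Carlo estimate of the probability of heading to the bad equilibrium
    when the initial condition is only known up to Gaussian uncertainty."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        start = min(max(x0 + rng.gauss(0.0, noise), 0.0), 1.0)
        bad += ends_in_bad_equilibrium(start, threshold)
    return bad / trials

safe = risk(0.9)      # well inside the good basin (think: low debt)
fragile = risk(0.45)  # close to the basin boundary (think: high debt)
print("risk far from the boundary:", safe)
print("risk near the boundary:", fragile)
```

The number that comes out is a property of the model, not a direct measurement of the world, which is exactly the point: it becomes useful once its predictions are compared with what actually happens and the model is updated accordingly.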

So that’s it for the accusation of mistaking model for reality. But still on the second fundamental sin according to Pilkington, recall that its second part consists of making “grandiose claims about what they have achieved or will potentially achieve that ring hollow when scrutinized”, against which his advice is to “tone down the claims they are making lest they embarrass the Post-Keynesian community at large”. This is all fine, but it sounds a bit rich coming from someone who published a piece titled Teleology and Market Equilibrium: Manifesto for a General Theory of Prices, which upon scrutiny contains some common platitudes about neoclassical denials of empirical evidence, followed by this claim:

My goal is to lay out a general theory of prices in the same way Keynes laid out a general theory of employment and output. This will provide a framework in which the neoclassical case of downward-sloping demand curves and upward-sloping supply curves is a highly unlikely special case. With such a framework we can then approach particular cases as they arise in a properly empirical manner. In doing this I hope to be able to introduce the Keynesian theory to pricing; and with that, I think, the neoclassical doctrines will be utterly destroyed and a full, coherent alternative will be available. Fingers crossed!

So instead of an actual general theory of prices, Pilkington’s “manifesto” states his goal to be as great as Keynes and utterly destroy the neoclassical doctrines. Hear the low tone!

Which brings me to Keynes’s advice on the use of mathematics that Pilkington also quotes in his rant. Again, if you read the quote carefully you see that Keynes warns against “symbolic pseudo-mathematical methods” and complains that “Too large a proportion of recent ‘mathematical’ economics are mere concoctions”. Notice the prefix pseudo and the inverted commas around the word mathematical in the original quote, which suggest that Keynes’s peeve was not with mathematics itself, but with the “imprecise…initial assumptions they rest on”. In particular, Keynes singles out methods that “expressly assume strict independence between the factors involved”, which is admittedly a very stupid assumption, but in no way necessary for the application of (true, i.e. not pseudo) mathematical methods. For example, NONE of the models I work with assume independence between factors; on the contrary, they highlight the complex and surprisingly rich interdependencies, as well as their consequences.

I conclude in meta fashion with a Bayesian framing of this discussion itself. Pilkington has a very strong prior that my mathematical methods are useless and he’s honest enough to say so: “I heard about the work Grasselli and others were doing at the Fields Institute some time ago and I was instantly skeptical”. He bases this prior on a well-documented post-Keynesian intellectual tradition, as well as a slightly misguided notion of what constitutes “giant formal models” (my models are actually pretty easy low-dimensional dynamical systems, but if you are Bart Simpson then I guess anything looks like a giant formal model). By contrast, I have a very strong prior that my models are useful, based on a similarly well-documented intellectual tradition of applications of mathematics in other areas of study. Pilkington concedes that he might be wrong and promises unreserved praise if that turns out to be true. Likewise, I might be wrong, in which case I’ll abandon the models and do something else. In either case, we’ll both be traveling the same Road to Wisdom, as in the exceptional poem by Piet Hein:

Well, it's plain
and simple to express.
Err and err and err again,
but less and less and less.

Nate Silver uses this poem as inspiration for the title of the chapter in his book explaining the Bayesian approach, which he ends with a meta statement of his own: “Bayes’s theorem predicts that Bayesians will win”.

Estimate nothing!


  1. You say this Pilkington guy is a journalist, and this or that, but how do you know? Frankly, I am not sure he's very good in the head. The teleology crap is a telltale sign: narcissism, grandiosity.

  2. To follow up the last person's remarks...not too long ago, Philip Pilkington made this strange "challenge" that looks like it's built up on strawman arguments.

    It looks like someone else chose to respond to that post by Philip Pilkington recently.