by martingale
It's common these days to condemn the finance industry for all sorts of
outrageous practices. The Gods know that they deserve it, and it's been
most of a century since people were broadly receptive to new ideas on
this topic anyway. Yet in their zeal to pour scorn, commentators can
sometimes try too hard. One such commentator is Nassim Taleb,
whose essay "The
Fourth Quadrant: A Map Of The Limits Of Statistics" is a good example
of throwing the baby out with the bathwater.
I find it difficult to characterize the essay in a single sentence. Taleb is a former computer trader, and his expertise qualifies him to comment on the probabilistic and statistical methods that are used in the finance industry. Yet I find the essay a hodgepodge of rhetorical tricks, various analogies, sundry mathematical claims, and free advice. While the overall aim is certainly well meaning, I can't say I'm actually convinced by the details. That's not to say that all's well in the way statistics is used in practice. But it's ridiculous to scapegoat a tool to spite an industry.
The essay starts off in the worst possible way: not only is the title
very ambitious, but the introductory paragraph confirms it.
Taleb wants to tell us no less than the limits of human knowledge.
If statistics
is the core of knowledge, then knowing its limits is either trivial or
profound. It's a very risky way to convince people to read an essay.
"Statistical and applied probabilistic knowledge is the core of knowledge; statistics is what tells you if something is true, false, or merely anecdotal; it is the 'logic of science'; it is the instrument of risk-taking; it is the applied tools of epistemology; you can't be a modern intellectual and not think probabilistically [...] (let's face it: use of probabilistic methods for the estimation of risks did just blow up the banking system)."

The most memorable part of the essay is the turkey metaphor. It's best to get this out of the way early. For a thousand days, a turkey gets fed and all is well, yet on the 1001st, something awful happens. RIP. See how statistics is wrong? No amount of extrapolating from the first thousand days can obtain the 1001st result. But this is only the first part. In the second part, the graph is relabelled, switching the turkey into the present economy. Suddenly, the example is all too real.

The turkey story is offered as a spectacular failure of statistics. But is it really? To obtain a failure, we first need to formulate a problem that we're actually going to fail to solve, otherwise it's just so much griping after the fact. In this case, the turkey story serves to set up the problem: can the turkey predict the date of its own demise? The answer is obviously no, and now comes the rhetorical switcheroo: the turkey is really a magic rabbit masquerading as a bear market.

By relabelling the graph but keeping its shape intact, Taleb makes the reader transfer the turkey problem (what the goal of prediction is and how well the solution works) to a financial time series. But wait, what is the actual prior statistical problem to be solved here? There isn't any particular one, it's just a graph. But the reader doesn't realize it because he's still thinking about turkeys. In this way, Taleb implies that statistics was powerless to predict this year's huge losses after the fact.
Yet it's easy to think of many other problems that might have been posed regarding the same graph. For example, in 2005, could statistics have predicted the next year's aggregate income rise? A simple straight line fits reasonably well in all years except the last, so the answer's clearly yes. What about 2002, predicting income for 2003? If anybody had predicted a huge loss then, they would have failed quicker than they could say margin call. Taleb was a trader for 20 years. Was his job all this time really to predict the single date of the financial crash of 2008? If not, why is he talking about it?

Statistics is a descriptive science. It's a way of stating what a dataset looks like for a particular purpose. Sometimes that's easy (like in the turkey problem), sometimes not (like in the turkey problem).

Taleb's pièce de résistance is The Map: a convenient two by two table containing all the statistical problems in the world, with the fourth quadrant containing the supposedly impossible problems. In keeping with the grandiose claims, he fills the map with all the big problems of humanity (after all, statistics is the "core" of human knowledge). His Tableau of Payoffs tells us the true place of Medicine, Gambling, Insurance, Climate problems, Innovation, Epidemics, Terrorism, etc. Who knew that one could learn so much as a computer trader, eh?

But what is The Map? In the first column, he puts the so-called light tailed distributions. Intuitively, a distribution is a smooth theoretical curve (or surface in higher dimensions) which describes the frequency of occurrence of the values of some quantity (called a random variable) which can be observed repeatedly. Light tailed distributions (Mediocristan) fit variables with a limited natural range. Heavy tailed distributions (Extremistan), which Taleb puts in the second column, fit variables whose natural range is very wide.
Statistically speaking, a distribution representing a real quantity can only be identified by looking at empirical data. When the number of available data points starts to grow, they form clusters in the light tail case, and also in the heavy tail case. But heavy tailed data points can also spread out in much more unexpected places. This is why heavy tailed distributions are traditionally used for modelling extremes, like storms, floods, bankruptcies, etc. In one dimension, there are only two unexpected places. Of course, the problem is much more complicated in higher dimensions. In that case, all distributions spread out widely because there is so much more freedom of space.

The quadrants in Taleb's Map represent the difficulty of a decision problem (decision problems maximize an expected payoff based on an assumed statistical distribution). Taleb claims that the fourth quadrant is hopeless.

"Fourth Quadrant: Complex decisions in Extremistan: Welcome to the Black Swan domain. Here is where your limits are. Do not base your decisions on statistically based claims. Or, alternatively, try to move your exposure type to make it third-quadrant style ('clipping tails')."

I'm going to take a few paragraphs to explain what he means, but if you're sharp, you might already wonder what all the fuss is about, when all you need to escape the fourth quadrant is clipping the tails...

The Swiss Army knife of statistics is the Central Limit Theorem (CLT). It works with all light tailed observations, and forms the basis for fitting the parameters of theoretical distributions. Since the CLT applies to nearly everything of interest in that context, much of statistical methodology is concerned with computing means and variances, which are the quantities which specify the Gaussian limit distribution. But the classical CLT requires a finite variance, and it fails for distributions whose tails are heavy enough to make the variance infinite. This is the basis for Taleb's claim.
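The contrast between the two columns can be checked with a small simulation. The sketch below is not from Taleb's essay: it assumes a standard Gaussian as the light tailed example and a standard Cauchy (which has infinite variance) as the heavy tailed one, and the function names are made up for illustration. It measures how much the sample mean varies across repeated experiments as the sample size grows.

```python
import math
import random
import statistics


def cauchy(rng):
    """Draw one standard Cauchy value by inverting its CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def gaussian(rng):
    """Draw one standard Gaussian value."""
    return rng.gauss(0.0, 1.0)


def spread_of_sample_mean(draw, n, reps=200, seed=0):
    """Interquartile range of the sample mean over `reps` samples of size n."""
    rng = random.Random(seed)
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(reps)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


if __name__ == "__main__":
    # Columns: sample size, Gaussian spread, Cauchy spread.
    for n in (100, 2500):
        print(n,
              round(spread_of_sample_mean(gaussian, n), 3),
              round(spread_of_sample_mean(cauchy, n), 3))
```

In the Gaussian column the spread should shrink roughly like 1/sqrt(n). In the Cauchy column it stays about the same size however large n gets, because the mean of n standard Cauchy draws is again standard Cauchy.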
As the datapoints multiply, there always comes an extreme point which is just large enough to seriously perturb the mean and variance yet again. No single Gaussian limit appears, and the usual techniques don't work in the long run.

Yet extremes have meaning. If the random variable is an earthquake magnitude, a high value can ruin your whole day. The recent stock crash is an extreme datapoint. People want to know how likely the extremes are, and maybe even predict them (in advance if they're newbies). But extremes are rare. If they weren't, then we'd soon exhaust the range of possibilities in a dataset, and then we'd really have a light tailed distribution spread wide.

For example, the Dutch are worried about floods breaking their dykes, which is an entirely different kind of Wall Street worry. Sooner or later, a flood bigger than any previous one will come, so it's hard to make the walls high enough by looking at historical records. Taleb would think that it's outright impossible (note: making the walls 1km high is not a realistic option). How can we fit an arbitrary distribution in the part of space where we have no datapoints? It's obviously impossible!

Enter Extreme Value Theory (EVT), the branch of statistics which specializes in this kind of problem. To understand what's going on, it might help to review the CLT first. If you have a bunch of datapoints and the CLT holds, then plotting a histogram of frequencies will show a bell shaped Gaussian distribution. If you've ever tried to do this for yourself, you've come across the bandwidth problem. Just how wide do you make the bins? If they're wide enough to contain a lot of observations each, then you might see a Gaussian. But if the bins are so narrow that some bins have only one observation, and most bins have none at all, then the plot shows nothing usable! It seems crazy that the CLT can tell us what's going on in the regions of space in between the datapoints, yet plainly, it does.
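The bandwidth problem is easy to reproduce. As a rough sketch (the sample size, range and bin counts are arbitrary choices for illustration, and the helper names are hypothetical), the following bins the same Gaussian sample twice, once with wide bins and once with very narrow ones, and prints a crude text histogram for the wide case.

```python
import random


def histogram(data, lo, hi, bins):
    """Count how many datapoints fall in each of `bins` equal-width bins on [lo, hi)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in data:
        if lo <= x < hi:
            # min() guards against floating point rounding at the upper edge.
            counts[min(int((x - lo) / width), bins - 1)] += 1
    return counts


def sparse_fraction(counts):
    """Fraction of bins holding at most one observation."""
    return sum(1 for c in counts if c <= 1) / len(counts)


rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]

wide = histogram(sample, -4.0, 4.0, 16)      # bin width 0.5
narrow = histogram(sample, -4.0, 4.0, 4000)  # bin width 0.002

if __name__ == "__main__":
    # One row per wide bin: left edge, then one '#' per four observations.
    for i, c in enumerate(wide):
        print(f"{-4.0 + 0.5 * i:5.1f} {'#' * (c // 4)}")
    print("sparse bins, wide:  ", sparse_fraction(wide))
    print("sparse bins, narrow:", sparse_fraction(narrow))
```

With the wide bins the bell shape is visible. With the narrow ones nearly every bin holds zero or one observation, so the same data shows no shape at all.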
In fact, the mathematical statement of the CLT does not talk about histograms or bandwidths at all. In an analogous way, Extreme Value Theory tells us what the tails look like, even if we don't have datapoints throughout.

As many of you know, the CLT concerns the behaviour of sums of random variables. In EVT, the fundamental theorem concerns the behaviour of the maximum of a collection of random variables. The extremes we care about are all maxima: order all the observations seen so far in a row, then the rightmost is the maximum, and the leftmost is the minimum. Exchange left and right, and the maximum becomes the minimum and vice versa.

The CLT states that the only possible limit for a (suitably scaled and shifted) sum of independent random variables with finite variance is Gaussian. The Gaussian family has two parameters, the mean and variance, and statistics concerns the problem of extracting the maximum information from all the variables so as to estimate the asymptotic mean and variance.

The fundamental theorem of EVT states that the only possible non-degenerate limit for a (suitably scaled and shifted) maximum of random variables is one of three fixed distributions: the Gumbel, the Fréchet or the negative Weibull. There is in fact a single formula for all three, the Extreme Value Distribution, which contains a single parameter, called the tail index. Moreover, this limiting result holds regardless of whether the random variables in question have light or heavy tails, so it is more general than the CLT.

So what of Taleb's claims and mathematical appendix? In short, the fourth quadrant is not as impossible as he leads us to believe. That's not to say it's ever easy or routine. All worthwhile math problems are hard, otherwise anybody could solve them for breakfast.

What bothers Taleb is the "robustness" of statistical methods near the tails. The theory of robustness is another big field of statistics, which is concerned with what happens to estimates when the datapoints are shifted a little bit or a lot.
In other words, it's about quantifying the quality of the fit. Here's a typical bogus argument, though:

"For instance, if you move alpha from 2.3 to 2 in the publishing business, the sales of books in excess of 1 million copies triple!"

What exactly does that mean? He's comparing two quantities, yet doesn't tell us anything about their units. Alpha is merely a "parameter", yet we are supposed to believe that a (presumably insignificant) difference of 0.3 causes a serious misestimation! And how do we know it's a serious misestimation? Oh, it's the book publishing business, so anything in excess of 1 million is obviously a big deal. As a former trader, one might have expected that his best example would come from a banking related business, although with the kind of numbers being talked about in banking nowadays, 1-3 million seems positively puny and hardly worth bothering with. Much better to pick an ominous sounding example, and claim this as a proof of unpredictability throughout the Fourth Quadrant, no?

To be sure, we also get a graph of alpha values in some aggregated dataset from "40 thousand economic variables". Are they all comparable? Are the estimation methods comparable? What's their interpretation? Should we transfer the book publishing insights to all of those economic quantities as-is? Do you smell another turkey sandwich?

I should say at this point that there is in fact an underlying set of mathematical ideas that concern this kind of issue. Physicists, engineers and statisticians know it well under various names such as well posedness, condition numbers, and robustness. In all cases however, the mathematics must be tempered with the problem interpretation and units used, especially at the infinite end of the real numbers, where log scales play a major role.
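To put numbers on the units question, here is a hedged sketch. It assumes, purely for illustration, a pure Pareto tail in which the chance of exceeding a level x is (x / x_min)^(-alpha) for some scale x_min; the quoted passage does not spell out its model, so this is not a reconstruction of Taleb's calculation. The function name and the sample thresholds are hypothetical.

```python
import math


def exceedance_ratio(x_over_xmin, alpha_from=2.3, alpha_to=2.0):
    """Ratio of Pareto tail probabilities P(X > x) after vs. before changing alpha.

    Assumes P(X > x) = (x / x_min) ** (-alpha) for x >= x_min.
    """
    return x_over_xmin ** (alpha_from - alpha_to)


# Threshold (in units of x_min) at which the tail probability exactly triples.
tripling_threshold = 3.0 ** (1.0 / (2.3 - 2.0))

if __name__ == "__main__":
    for r in (10.0, tripling_threshold, 100.0, 1000.0):
        ratio = exceedance_ratio(r)
        print(f"x/x_min = {r:8.1f}  ratio = {ratio:5.2f}  log ratio = {math.log(ratio):4.2f}")
```

Under this assumption the factor of three appears only at one particular threshold, roughly 39 times x_min, where its natural logarithm is about 1.1. At other thresholds the same change in alpha gives a different factor, which is why the units matter.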
For instance, an abstract version of the book publishing example simply states that a small change of 0.3 in the parameter leads to a small shift in the range of the observations in log scale, since log(3) ≈ 1.10. That doesn't sound nearly so bad, does it?

In the case of EVT, the simple functional form of the asymptotic distribution is valuable for estimation. For example, at the tail end, it can be shown that the last few datapoints (order statistics) can always be transformed into a sample from a Poisson point process. This is useful for assessing the quality of the fit at the tail end, and therefore plays a prominent role in questions of robustness. Just as you would look for a bell shaped distribution when expecting Gaussians, you might look at the spacings near the tail for confirmation that you're on the right track. In fact, all the usual statistical methods have some sort of counterpart in EVT, such as maximum likelihood fitting, etc. You'll find textbooks on extreme value theory in the usual places.

Taleb also often has the wrong viewpoint on other things:

"This absence of 'typical' event in Extremistan is what makes prediction markets ludicrous, as they make events look binary. 'A war' is meaningless: you need to estimate its damage - and no damage is typical. Many predicted that the First War would occur - but nobody predicted its magnitude. One of the reasons economics does not work is that the literature is almost completely blind to the point."

The idea of the "typical" event is really a remnant of elementary statistics. In one dimensional statistics, the location of the peak of a distribution has the highest likelihood of occurrence, which is great for Anschaulichkeit, i.e. the sense that we can actually see what's going on. So this can be thought of as typical, which is significant because it cuts out the complexity. What's a typical point on the surface of a sphere, though? There isn't one, they're all the same!
In higher dimensions, and most big problems are high dimensional, a mode doesn't matter nearly so much. The observations are most likely not near the mode. There is no single "typical" observation that's easy to locate from looking at the distribution. That's true regardless of the shape of the tails, so don't believe Taleb when he says it's all about Extremistan.

For example, take the simplest light tailed distribution (just to make life hard in the first column, where things are supposed to be easy): the uniform on the interval [0,1]. Every observational value is equally likely throughout the interval. Now do the same in two dimensions (uniform on the unit square), three, etc. You might think that in 12 dimensions, the observations are spread out evenly in the corresponding hypercube, but you'd be wrong. Once the CLT starts to work, the observations all lie geometrically on a thin spherical shell with spikes, like a hedgehog. Worried yet? Mutatis mutandis with other distributions.

I don't recommend that you read Taleb's mathematical appendix. It's written in an elliptical lecture notes style that's difficult to follow, and since I haven't touched on some of his other ideas, such as his beliefs about asymptotics, it's difficult to summarize. Presumably he's expounded those ideas in more detail somewhere else, so I'll leave the review of it to the relevant experts.

The essay ends with some free advice for quants faced with the fourth quadrant. I find this somewhat ironic, given that a few paragraphs earlier, in the section marked Beware the Charlatan, he writes:

"So all I am saying is, 'What is it that we don't know', and my advice is what to avoid, no more."

At the risk of repeating myself, I don't have an issue with claims of difficulty or incompetence in the way that statistics is used in finance. The same statement could be made in lots of other fields. I do have an issue with Taleb's arguments though.

p.s. This rant was commissioned by Migeru.
For those whose eyes haven't glazed over yet, I added a paragraph or two on the issues raised in the linked thread, so I won't be making a direct reply over there.
on the limits of statistics | 16 comments (16 topical, 0 editorial, 0 hidden)