Mon Oct 22nd, 2007 at 04:31:39 PM EST
The honest use of bar graphs and a warning note
Last time, I warned the reader to beware of bar graphs, especially when they are used to support claims of correlation. What I did not include was an example of legitimate use of bar graphs - because I didn't have one close at hand and I was too lazy to search the web for it.
Fortunately, the kind people over at Scienceblogs (specifically Matt Nisbet) have provided a ready-made example that both presents honest use of bar graphs and permits me to drive home another point that I've been wanting to make.
Diary rescue by Migeru
Why, one might ask, is Nisbet's use of bar charts more honest than Svenskt Näringsliv's use? Simple: Nisbet's purpose is to rank different responses to a survey according to the number of replies. And bar graphs are very good for that, because they provide a direct link between the size of each bar - which is easy for the human mind to evaluate to fairly high precision - and the value of the entry it represents.
(The first figure in Nisbet's post is something of another story - connecting the points in a time series is not what I'd call unethical or sloppy, but I find it aesthetically displeasing. That, however, is somewhat more a matter of taste.)
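To make the ranking use concrete, here is a minimal sketch in Python. The survey tallies and the `bar_chart` helper are invented for illustration and have nothing to do with Nisbet's actual data. The one thing the sketch is meant to show is that each bar starts at zero and has a length proportional to its count, so comparing bars by eye is the same as comparing the numbers.

```python
# Hypothetical survey tallies, invented purely for illustration.
responses = {
    "Option A": 412,
    "Option B": 268,
    "Option C": 175,
    "No opinion": 45,
}


def bar_chart(counts, width=40):
    """Return one text bar per entry, largest first.

    Bar length is proportional to the count, measured from a zero
    baseline, so the longest bar is exactly `width` characters.
    """
    largest = max(counts.values())
    lines = []
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    for label, n in ranked:
        bar = "#" * round(width * n / largest)
        lines.append(f"{label:<12}{bar} {n}")
    return lines


chart = bar_chart(responses)
print("\n".join(chart))
```

Printing the count next to each bar keeps the picture honest: the reader can check the bar against the number instead of taking the drawing on faith.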
The other point I was hinting at can be introduced by way of the comments section on his post. Here Nisbet is challenged to justify the attention he gives to the data. This goes to show that presenting your data honestly is not the be-all and end-all of making (valid) numerical arguments. It is the minimum entry requirement. The numbers that you present also have to be relevant to the conclusion you're drawing.
Furthermore (and this is another thing that's brought up in the comments over there), some data ages more gracefully than other data. For instance, data on what the climate was like a century ago is still relevant to climate models today, whereas data on the weather at some specific location at a specific time during the last century is little more than a curiosity today (unless it was fairly extreme, such as the flooding of the Low Countries in 1953).
Bar graphs are properly used to compare quantities (naturally, the quantities compared must be comparable). This makes them particularly useful for presenting the results of polls, surveys and elections.
That someone isn't lying doesn't mean he isn't wrong - just because you can't catch someone red-handed in manipulating data is no excuse to disengage your other critical thinking processes.
An Aside: Since bar graphs provide a perfectly good way of presenting the results of surveys and elections, I wish people would stop using pie charts for that - they are inferior to bar graphs in almost every way I can think of.
Beware of bar graphs - if someone tells you that X causes Y and presents you with bar graphs, scrutinize them carefully. The proper graph to show correlation is in most cases a scatterplot. If he's using something else, chances are he's trying to pull a fast one on you.
Especially beware of highlighting - I'm sure highlighting single data points has legitimate uses, but off the top of my head, I cannot think of a single one. It is a very good indication that Someone Is Up To No Good.
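The warnings about correlation and highlighting can be illustrated with a small Python sketch. All the numbers below are made up, and the `pearson` helper is just the textbook Pearson correlation coefficient written out by hand. Nothing here comes from any of the graphs discussed in this series. The sketch compares the correlation over the whole data set (what a scatterplot would show) with the correlation over three hand-picked points (what a chart with highlighting might show).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up (x, y) observations: the full data set.
points = [(1, 5), (2, 2), (3, 7), (4, 3), (5, 6), (6, 1), (7, 8), (8, 4)]

# Three of those points, hand-picked because they happen to line up.
highlighted = [(2, 2), (5, 6), (7, 8)]

r_all = pearson(*zip(*points))
r_highlighted = pearson(*zip(*highlighted))

print(f"all points:       r = {r_all:.2f}")
print(f"highlighted only: r = {r_highlighted:.2f}")
```

With these invented numbers the full data set shows almost no linear relationship, while the three highlighted points suggest a nearly perfect one. Both results come from the same data. The only difference is which points the reader is allowed to look at.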
The Entire Series:
How To Lie With Numbers - a short guide to politics and other things - introduction - bar graphs - highlighting.
How To Lie With Numbers 1½ - more bar graphs - a cautionary note
How To Lie With Numbers 2 - Laffer Nonsense From The WSJ - scatterplots - fitting methods - data grouping