although I can't find the bar graph that set me off yesterday. It was a graph of US casualties that showed a dramatic decrease over the last couple of months, following a steady rise for the rest of the year. Commentators were pointing out that the casualties for last month were about half those of the peak, and using this as evidence of the success of the current surge.
Looking at it, I more or less instantly thought, hmmm, bar graph, they're pulling a fast one. The obvious trick would be that the graph had been chosen to start in January. I wondered whether the weather a year ago had had any effect, so I went to have a look at the figures, and no, I was wrong. But looking back further, the halved casualties are still at the top edge of the figures from more than a year back.
      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
2005  107   58   35   52   80   78   54   85   49   96   84   68
2006   62   55   31   76   69   61   43   65   72  106   70  112
2007   83   81   81  104  126  101   78   84   65   24    0    0
The way these guys you're talking about picked out a single month for comparison, however, is precisely the kind of dishonest highlighting I was talking about in the first installment. When doing data analysis, you are not permitted to pick and choose single points and compare them to other single points or to peaks or whatever, because every data set has outliers and random fluctuations, and there is zero guarantee that the point you decide to pick is actually representative of anything.
A couple of general things to keep in mind with the casualty data from Vietraq: it fluctuates somewhat from month to month (that's why I'd prefer to use a scatterplot rather than a bar graph myself), and it also depends on operational posture, number of troops in the field, etc. It seems like a reasonable proxy for how well the war is going, but one should be careful not to take it too far. After all, the Americans could simply sit in their compounds and get everything they need by air, and casualties would plummet. They would also, however, lose the war that way (to any extent that it isn't already lost, that is).
I'm tired and going to bed now, but I may return tomorrow (well, later today, technically) with some plots of data on Vietraq casualties over time.
- Jake If you only spend 20 minutes of the rest of your life on economics, go spend them here.
FWIW, when GNUPLOT fits a first order polynomial to these data, it gives an upwards slope that is more than two asymptotic standard errors greater than zero. While this is not a proper statistical significance analysis, it does show that calling it an upwards trend is not grossly misleading.
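For anyone who wants to redo that check without GNUPLOT, here is a minimal sketch in plain Python. It assumes the monthly figures quoted earlier in this thread (January 2005 through September 2007, with the incomplete months at the end of 2007 left out), which may not be the exact series behind the fit above. The names `MONTHLY` and `linear_fit` are made up for illustration. For an unweighted straight-line fit, the textbook least-squares standard error of the slope should, as far as I understand it, be what GNUPLOT reports as the asymptotic standard error.

```python
import math

# Monthly figures as quoted earlier in this thread, January 2005 to
# September 2007. The incomplete months at the end of 2007 are left out.
# ASSUMPTION: this may not be the exact series behind the GNUPLOT fit.
MONTHLY = [
    107, 58, 35, 52, 80, 78, 54, 85, 49, 96, 84, 68,    # 2005
    62, 55, 31, 76, 69, 61, 43, 65, 72, 106, 70, 112,   # 2006
    83, 81, 81, 104, 126, 101, 78, 84, 65,              # 2007, Jan-Sep
]


def linear_fit(ys):
    """Ordinary least squares for y = a + b*x with x = 0, 1, 2, ...

    Returns (intercept, slope, standard error of the slope).
    """
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares, then the usual n - 2 degrees of freedom.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope_se


intercept, slope, slope_se = linear_fit(MONTHLY)
print(f"slope = {slope:.2f} per month, standard error = {slope_se:.2f}")
print("slope exceeds two standard errors:", slope > 2 * slope_se)
```

As said above, this is not a proper significance analysis either; it only checks the slope against its own standard error.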
Data source.
Apologies for the double post.
Day by day data might draw an interesting curve.
I'd have some trouble fitting a downwards curve to that data ....
Not if you were with the American Enterprise Institute :-P
Nah, random scatter would obscure any trend if you go to that resolution.
I tried two- and three-month running averages, but that doesn't make things a lot prettier, so I decided to just go ahead and post the raw data. This leads me to believe that a resolution of about one month (maybe you could push it down to two weeks, or even one week, but I don't think you could go much further) is about optimal as far as grouping goes.
Of course, one could use running averages (i.e. have a data point for each day that represents the average deaths per day for the 30-day period leading up to the point). But I'm not convinced that there is sufficient additional information to be obtained by doing so to justify the bother.
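For reference, a running average of this kind takes only a few lines. The sketch below is a rough illustration, not the script actually used for the plots: it assumes the monthly figures quoted earlier in the thread (through September 2007), and `running_mean` is a made-up helper name. Each output point is the mean of the current month and the `window - 1` months before it.

```python
# Monthly figures as quoted earlier in this thread, January 2005 to
# September 2007 (ASSUMPTION: not necessarily the series that was plotted).
MONTHLY = [
    107, 58, 35, 52, 80, 78, 54, 85, 49, 96, 84, 68,    # 2005
    62, 55, 31, 76, 69, 61, 43, 65, 72, 106, 70, 112,   # 2006
    83, 81, 81, 104, 126, 101, 78, 84, 65,              # 2007, Jan-Sep
]


def running_mean(values, window):
    """Mean of each run of `window` consecutive values.

    The first result covers values[0:window], so the output is
    window - 1 entries shorter than the input.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[end - window:end]) / window
        for end in range(window, len(values) + 1)
    ]


for window in (2, 3):
    smoothed = running_mean(MONTHLY, window)
    print(f"{window}-month running mean:", [round(v, 1) for v in smoothed])
```

The same function would work on daily counts with `window=30`; only the input series changes.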
Not saying it's meaningful though ;-)
The subtle mathematical technique used is called drawing a random line that shows the message you intend and bluffing competence. I think it's the same technique used in the original graph in the story. Any idiot can face a crisis - it's day to day living that wears you out.
Unfortunately, the trend only becomes clearer if you include all the data...
Apologies all around for lack of proofreading.
Iraqi Deaths By Year (iCasualties.org as of 2007/10/24) Truth unfolds in time through a communal process.
http://icasualties.org/oif/IraqiDeathsByYear.aspx
(I could not figure out how to insert it as a static image file.)
I think the only figures that are probably reasonably accurate are those for US casualties, as it's hard to hide bodies turning up on the home front.
The bars are week by week casualties and the blue line is a four week moving average. It might look a bit off to the right because it is the average of the current week and the three weeks preceding it. Following the blue line we can see that there has been an extended higher level of violence starting around September 2006. We also see that the level of violence varies greatly, and picking four relatively calm weeks in September 2007 as proof of anything is highly dubious. A vote for PES is a vote for EPP! A vote for EPP is a vote for PES! Support the coalition, vote EPP-PES in 2009!
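To see why a trailing average sits to the right of the bars it summarises, here is a small sketch. The input is a toy series with a single spike, not real casualty data, and the names `trailing_average`, `WINDOW` and `SPIKE_WEEK` are invented for the example. With a four week window, the smoothed response to the spike is centred one and a half weeks after the spike itself.

```python
def trailing_average(values, window):
    """Map each index i >= window - 1 to the mean of the current value
    and the window - 1 values before it."""
    return {
        i: sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    }


WINDOW = 4       # four week moving average, as in the plot described above
SPIKE_WEEK = 4   # toy input: one bad week in an otherwise flat series

toy = [0] * 10
toy[SPIKE_WEEK] = 8   # made-up value, chosen only to keep the numbers round

smoothed = trailing_average(toy, WINDOW)

# Centre of mass of the smoothed response, compared with the spike position.
total = sum(smoothed.values())
centre = sum(week * value for week, value in smoothed.items()) / total
lag = centre - SPIKE_WEEK

print("smoothed:", smoothed)
print("lag in weeks:", lag)
```

In general a trailing window of N points lags the data by (N - 1) / 2 points, so shifting the line left by that amount when plotting would line it up with the bars.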
The collection of data for Iraqi and US deaths is about as fraudulent as one could imagine. What the tame Iraqis have done, along with the PCA, is to delete large pieces of data and massage the rest, thereby cooking the books: deaths by IED, for instance, are gone, and in the case of civilian deaths, criminal vs. sectarian deaths have been redefined by preposterous criteria, like whether they got shot in the front of the head vs. the back of the head.
Garbage in, garbage out. Capitalism searches out the darkest corners of human potential, and mainlines them.
It is possible to write entire books on the subject of data collection, and I decided that it was beyond the scope of this guide to include it, especially considering that many of the techniques for detecting doctored data acquisition require that you get your hands on the primary sources, which is a lot of bother for a newspaper article.
And often hacks will employ both bad data and bad presentation. It's usually easier to nail them on the presentation side of things...
But you're certainly right that any total figure for Iraqi casualties less than half a million or so is pure fiction. The official numbers certainly are. By at least an order of magnitude.
I have not been able to verify this.
OTOH, soldier deaths are useful (to the extent that deaths can be useful...) primarily as a proxy for how things are going in general. So it does not really matter whether they lie a bit about the real numbers, as long as they've been lying in the same way since the war started.
The absolute values of coalition fatality figures from Vietraq are suspect anyway due to the fairly widespread employment of mercenary militias by the Coalition, as their numbers do not count towards casualties when they get killed.