The way these guys you're talking about picked out a single month for comparison, however, is precisely the kind of dishonest highlighting I was talking about in the first installment. When doing data analysis, you are not permitted to pick and choose single points and compare them to other single points or to peaks or whatever, because every data set has outliers and random fluctuations, and there is zero guarantee that the point you decide to pick is actually representative of anything.
A couple of general things to keep in mind with the casualty data from Vietraq are that it fluctuates somewhat from month to month (that's why I'd prefer to use a scatterplot rather than a bar graph myself) and that it also depends on operational posture, number of troops in the field, etc. It seems like a reasonable proxy for how well the war is going, but one should be careful not to take it too far - after all, the Americans could simply sit in their compounds and get everything they need by air, and casualties would plummet. They would also, however, lose the war that way (to any extent that it isn't already lost, that is).
I'm tired and going to bed now, but I may return tomorrow (well, later today, technically) with some plots of data on Vietraq casualties over time.
- Jake
FWIW, when GNUPLOT fits a first-order polynomial (a straight line) to these data, it gives an upwards slope that is more than two asymptotic standard errors greater than zero. While this is not a proper statistical significance analysis, it does show that calling it an upwards trend is not grossly misleading.
Data source.
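For anyone who wants to repeat the check without GNUPLOT, here is a minimal sketch of the same idea in Python: fit a straight line by ordinary least squares and compare the slope to its standard error. The function name `linear_fit` and the numbers in the usage lines are made up for illustration (they are not the casualty figures), and I'm assuming the textbook least-squares standard error is close enough to what GNUPLOT reports as its asymptotic standard error for an unweighted linear fit.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, slope_se), where slope_se is the usual
    standard error of the slope. Needs at least three points.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three (x, y) pairs")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom.
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    slope_se = math.sqrt(rss / (n - 2) / sxx)

    return slope, intercept, slope_se


if __name__ == "__main__":
    # Synthetic illustration only: month index vs. an invented count.
    months = [0, 1, 2, 3, 4]
    counts = [1, 3, 2, 5, 4]
    slope, intercept, se = linear_fit(months, counts)
    print(f"slope = {slope:.3f} +/- {se:.3f}")
    print("slope exceeds two standard errors:", slope > 2 * se)
```

Same caveat as above applies: a slope more than two standard errors from zero is a sanity check, not a proper significance analysis.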
Apologies for the double post.
Day by day data might draw an interesting curve.
I'd have some trouble fitting a downwards curve to that data ....
Not if you were with the American Enterprise Institute :-P
Nah, random scatter would obscure any trend if you go to that resolution.
I tried running two- and three-month running averages, but that doesn't make things a lot prettier, so I decided to just go ahead and post the raw data. This leads me to believe that a resolution of about one month (maybe you could push it down to two weeks, or even one week, but I don't think you could go much further) is about optimal as far as grouping goes.
Of course, one could use running averages (i.e. have a data point for each day that represents the average deaths per day for the 30-day period leading up to the point). But I'm not convinced that there is sufficient additional information to be obtained by doing so to justify the bother.
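If somebody does want to try the daily running average described above, a rough sketch could look like this. The name `trailing_average` is just something I made up, and it assumes you already have a plain list with one count per day, in date order, with no missing days.

```python
def trailing_average(daily, window=30):
    """Trailing running average over `window` days.

    For each day that has a full window behind it, returns the mean of
    that day and the window - 1 days before it. The first window - 1
    days are skipped, so the result is shorter than the input.
    """
    if window <= 0:
        raise ValueError("window must be positive")

    averages = []
    total = 0
    for i, value in enumerate(daily):
        total += value
        if i >= window:
            # Drop the day that just fell out of the window.
            total -= daily[i - window]
        if i >= window - 1:
            averages.append(total / window)
    return averages


if __name__ == "__main__":
    # Toy input only, not real data.
    print(trailing_average([1, 2, 3, 4, 5], window=2))
```

Keep in mind that neighbouring points in the output share 29 of their 30 days, so the curve looks smoother than the amount of independent information in it would justify.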
Not saying it's meaningful though ;-)
The subtle mathematical technique used is called drawing a random line that shows the message you intend and bluffing competence. I think it's the same technique used in the original graph in the story.
Unfortunately, the trend only becomes clearer if you include all the data...
Apologies all around for lack of proofreading.
Iraqi Deaths By Year (iCasualties.org as of 2007/10/24)