Coefficients:
    (Intercept)  log(casualties)
        -4.0792           0.7741

Response: log(indicted)
                Df Sum Sq Mean Sq F value  Pr(>F)
log(casualties)  1 3.3419  3.3419  13.880 0.02038 *
Residuals        4 0.9631  0.2408
With only 6 points it is really difficult to argue anything. For instance, what is the chance that all three "Serb indictee" points are above the line? 1 in 8. This is not sufficient to show bias at 90% confidence (you would need the probability to be less than 1 in 10), let alone 95% confidence (1 in 20).

Most economists teach a theoretical framework that has been shown to be fundamentally useless. -- James K. Galbraith
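The "1 in 8" figure can be checked with a short calculation. This is only a sketch, under the assumption (not stated in the comment) that each point independently has a 50% chance of falling above the line; the function names are made up for illustration.

```python
from math import comb


def prob_all_above(n_points, p_above=0.5):
    """Chance that every one of n_points lies above the line, assuming each
    point is above it independently with probability p_above (assumption)."""
    return p_above ** n_points


def prob_at_least(k, n, p_above=0.5):
    """Binomial tail: chance that at least k of n points lie above the line."""
    return sum(
        comb(n, i) * p_above**i * (1 - p_above) ** (n - i)
        for i in range(k, n + 1)
    )


p = prob_all_above(3)
print(p)         # 0.125, i.e. 1 in 8
print(p < 0.10)  # False: not enough for 90% confidence
print(p < 0.05)  # False: not enough for 95% confidence
```

With three points per group, even the most extreme outcome cannot get below 1 in 8, so no result from this kind of sign argument could reach either threshold.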
Most of the civilian casualty figures were obtained from Wikipedia, which itself uses the ICTY, the Red Cross and, for the Serbian civilian casualty figures in Kosovo, an EU-funded project run out of Belgrade.
According to a Serbian government report, from January 1, 1998 to June 10, 1999 the KLA killed 988[5] people and kidnapped 287[5]; in the period from June 10, 1999, to November 11, 2001, when NATO had been in control in Kosovo, 847[5] people were reported to have been killed and 1,154[5] kidnapped. This comprised both civilians and security forces personnel: of those killed in the first period, 335[5] were civilians, 351 were soldiers, 230 were police and 72 were unidentified; by nationality, 87 of killed civilians were Serbs, 230 Albanians, and 18 of other nationalities.[5] The Humanitarian Law Center in Belgrade, an organization funded by the European Commission, has announced that it had identified 8,000 Serbians out of a total of 12,000 casualties they had identified in the Kosovo War.[53]
If you must insist on treating these three points as independent - which I can see little justification for doing, but maybe that's just me - you do a (casualties, indictments, convictions) plot and run a linear fit against all three points in a given series at the same time. This way you get some more meaningful (implicit) assumptions about the way the uncertainties look.
- Jake

If you only spend 20 minutes of the rest of your life on economics, go spend them here.
Or, rather, three fits.
One for all six rows in Vladimir's table - that's the "null hypothesis".
One for the 3 Serb rows and one for the 3 non-Serb rows. That's more or less equivalent to what was done in the diary.
Or you could do a test on whether the 3 Serb and 3 non-Serb points fall above or below the "null hypothesis" regression line. The trouble is, with only 6 points you probably can't say anything with 95% confidence.
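The above/below check could be sketched roughly as follows. This is an illustration only: the argument names (`casualties`, `indicted`, `is_serb`) are hypothetical, the six rows of the table are not reproduced here, and the log-log form of the pooled fit is assumed from the regression output quoted in this thread.

```python
from math import log


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def count_above_pooled_line(casualties, indicted, is_serb):
    """Fit one log-log line to all rows (the "null hypothesis") and count how
    many points of each group lie above it.

    Returns (serb_above, other_above)."""
    x = [log(c) for c in casualties]
    y = [log(i) for i in indicted]
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    serb_above = sum(1 for r, s in zip(residuals, is_serb) if s and r > 0)
    other_above = sum(1 for r, s in zip(residuals, is_serb) if not s and r > 0)
    return serb_above, other_above
```

Called with the six rows of the table, it would return something like `(3, 0)` if every Serb point sat above the pooled line and every other point below it. Note that least-squares residuals sum to zero, so the six signs are not independent, which makes a result from so few points even harder to interpret.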
I would be opposed to fitting the slope as well as the intercept in your model, because we already only have three points for every fit parameter - and you fit a number of parameters comparable to your number of data points at the peril of talking nonsense...
This actually makes it look worse for Vladimir's hypothesis. The fit is this:
Coefficients:
(Intercept)
     -6.095

Response: log(indicted) - log(casualties)
          Df  Sum Sq Mean Sq F value Pr(>F)
Residuals  5 1.24772 0.24954
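One way to see how little the free slope buys is a nested-model F test, using only the residual sums of squares quoted in this thread: 0.9631 on 4 degrees of freedom with the slope fitted, and 1.24772 on 5 with the slope fixed at 1. This is a sketch of a standard comparison, not something the posters ran; `nested_f` is a made-up helper name.

```python
def nested_f(rss_restricted, df_restricted, rss_full, df_full):
    """F statistic for comparing a restricted model against a fuller one
    that contains it (here: slope fixed at 1 versus slope fitted)."""
    extra_fit = (rss_restricted - rss_full) / (df_restricted - df_full)
    noise = rss_full / df_full
    return extra_fit / noise


# Residual sums of squares and degrees of freedom from the two quoted tables.
f_stat = nested_f(1.24772, 5, 0.9631, 4)
print(round(f_stat, 2))  # 1.18

# The 5% critical value of F(1, 4) is about 7.71, so on these numbers the
# fitted slope is not distinguishable from 1.
print(f_stat > 7.71)  # False
```

On that reading, dropping the slope and keeping only the intercept costs almost nothing in fit quality, which is consistent with the argument against spending a parameter on it.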
I used the t test because it's appropriate for small data samples from a normal (Gaussian) distribution, which is an assumption that seems ok to me.