NY Times Coverage Biased Against Lancet Study

UPDATE: The Lancet Study can be downloaded here (pdf link). A companion paper, which provides some additional details, can be downloaded here (pdf link).

The New York Times coverage of the new Lancet study of Iraqi deaths, while maintaining an objective tone, is heavily slanted against the study; many of the painfully bad right-wing arguments against the earlier survey are repeated by the Times, usually without rebuttal. For example:

Violent deaths have soared since the American invasion, but the rise is in part a matter of spotty statistical history. Under Saddam Hussein, the state had a monopoly on killing, and the deaths of thousands of Iraqi Shiites and Kurds that it caused were never counted.

The implication is that perhaps these new numbers underestimate pre-invasion deaths due to “spotty statistical history.” But the Lancet study does not draw on the counts of Hussein’s government for its pre-war mortality estimates, so this is irrelevant.

Gilbert Burnham, the principal author of the study, said the figures showed an increase of deaths over time that was similar to that of another civilian casualty project, Iraq Body Count, which collates deaths reported in the news media, and even to that of the military. But even Iraq Body Count puts the maximum number of deaths at just short of 49,000.

The Iraq Body Count tallies only those deaths which are reported by at least two reputable news organizations. No one associated with the Iraq Body Count claims that their results represent “the maximum number of deaths.” From the Iraq Body Count website:

Our maximum therefore refers to reported deaths – which can only be a sample of true deaths unless one assumes that every civilian death has been reported. It is likely that many if not most civilian casualties will go unreported by the media.

Back to the Times coverage:

Robert Blendon, director of the Harvard Program on Public Opinion and Health and Social Policy, said interviewing urban dwellers chosen at random was “the best of what you can expect in a war zone.”

But he said the number of deaths in the families interviewed — 547 in the post-invasion period versus 82 in a similar period before the invasion — was too few to extrapolate up to more than 600,000 deaths across the country.

But as this example from The Roper Center’s “polling 101” illustrates, it’s accepted statistical methodology to extrapolate from small to large numbers – in their example, from 30 purple jelly beans in their sample to the conclusion that there are approximately 20 million purple jelly beans in the huge jelly bean jar.
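
To make that concrete, here’s a minimal sketch of the arithmetic. The jar and sample sizes are made up for illustration, since the Roper page’s exact figures aren’t reproduced here; only the method matters.

```python
# Back-of-the-envelope survey extrapolation, in the spirit of the
# Roper jelly-bean example. The jar and sample sizes below are
# invented for illustration.
jar_size = 667_000_000        # assumed number of beans in the jar
sample_size = 1_000           # beans drawn at random
purple_in_sample = 30         # purple beans observed in the sample

p_hat = purple_in_sample / sample_size   # sample proportion (3%)
estimate = p_hat * jar_size              # scale up to the whole jar
print(f"estimated purple beans: {estimate:,.0f}")  # ~20,010,000
```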

The new Lancet survey is based on interviews with over 1000 Iraqis. The Times – and all major news organizations – routinely report numbers extrapolated from surveys which interview 1000, or sometimes just 500, people. Mainstream newspaper FAQs about polling methodology (example 1, example 2) suggest that a sample of just 500 is sufficient for surveys representing all Americans.
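
For reference, the usual back-of-the-envelope margin of error for a survey proportion is z·√(p(1−p)/n). A quick sketch (this simple formula ignores the design effect of cluster sampling, which is part of why the Lancet interval is wider than an ordinary poll’s):

```python
import math

# Margin of error for a proportion at 95% confidence, using the
# worst case p = 0.5: MoE = z * sqrt(p * (1 - p) / n).
def margin_of_error(n, z=1.96, p=0.5):
    return z * math.sqrt(p * (1 - p) / n)

# 12,801 is roughly the number of individuals living in the
# households the Lancet team surveyed.
for n in (500, 1000, 12801):
    print(f"n = {n:5d}: +/- {margin_of_error(n):.1%}")
# n =   500: +/- 4.4%
# n =  1000: +/- 3.1%
# n = 12801: +/- 0.9%
```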

Of course, the Lancet survey – due to methodological issues having to do with collecting data in a war zone – has a wider confidence interval than most surveys. But that doesn’t mean that the study is unreliable, or its methods incorrect; it just means that the results have a wide confidence interval. We can be reasonably certain there have been between 426,369 and 793,663 excess Iraqi deaths since our invasion. That’s extraordinary, and appalling. If the occupation is intended to protect Iraqis, it is a dismal failure.

Curtsy: Deltoid

[Crossposted at Creative Destruction. If your comments aren’t being approved here, try there.]


28 Responses to NY Times Coverage Biased Against Lancet Study

  1. Pingback: All The News The Buzz Approves » BuzzTracker

  2. Pingback: Tailrank - Top News for Today

  3. Pingback: Pacific Views

  4. Pingback: Zuky

  5. Pingback: Dan O'Huiginn

  6. Pingback: Giltner Review

  7. Sailorman says:

    Math is not “right wing”, it’s just math. My understanding is that while many right-wing folks may be motivated to attack the study through their politics, SOME (not all) of their protests may be perfectly valid.

    it’s accepted statistical methodology to extrapolate from small to large numbers – in their example, from 30 purple jelly beans in their sample to the conclusion that there are approximately 20 million purple jelly beans in the huge jelly bean jar.

    This is true, but only partially.

    The underlying assumption of statistical prediction is that the sample size is randomly selected and thus bears a random amount of variation. Because it is difficult to achieve a true random sampling, however, the assumption is generally incorrect.

    You can randomly select households from a phone book and call them for a political survey during dinnertime, for example. But then you’ll only actually sample people who 1) have a phone; 2) are listed in the book; 3) are home during dinner time; 4) answer the phone; and 5) agree to talk to you.

    Absent other studies showing these variables to be irrelevant, the above group is the ONLY group to which those results can be extrapolated.

    The statistical margin of error is generally calculated by using the sample size, prediction size, and so on. The margin of error accounts for the possibility that random chance will result in a non-representative sample. The margin of error does not usually account for errors in sampling procedure.
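
    [To illustrate that last point with a quick, made-up simulation: a sample of 1,000 has a margin of error of about ±3%, yet the dinnertime phone poll of the hypothetical population below misses the true rate by twenty points, because the bias lives in the sampling procedure, not in the sampling noise.]

    ```python
    import random

    # Made-up illustration: the margin of error covers random sampling
    # noise, not systematic bias in who can be reached at all.
    random.seed(0)
    N = 100_000

    # Hypothetical population: half are home at dinnertime; 10% of that
    # group holds opinion X, versus 50% of everyone else (true rate ~30%).
    population = []
    for _ in range(N):
        home = random.random() < 0.5
        population.append((home, random.random() < (0.10 if home else 0.50)))

    true_rate = sum(x for _, x in population) / N

    # A dinnertime phone poll only ever reaches the "home" group.
    reachable = [x for home, x in population if home]
    estimate = sum(random.sample(reachable, 1000)) / 1000

    print(f"true rate ~ {true_rate:.2f}, dinnertime-poll estimate ~ {estimate:.2f}")
    # The +/- 3% margin of error for n = 1000 says nothing about that gap.
    ```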

    Given the situation in Iraq and the type and degree of sampling, the protests are (correctly) tending to question the accuracy and representative qualities of the sample. Answering or rebutting these protests requires a detailed analysis of the sampling procedure, which I suspect will be forthcoming in the next few days.

    Not incidentally, the harsh questioning is also appropriate for other reasons: When something differs drastically from the status quo, it is appropriate to question whether the difference is real or the result of error. It’s not that “general consensus” is always right. It’s just that the “general consensus” is usually right.

  8. Ampersand says:

    Given the situation in Iraq and the type and degree of sampling, the protests are (correctly) tending to question the accuracy and representative qualities of the sample.

    Which of the three protests I quoted address the “accuracy and representative qualities of the sample”? I think your claim that this is the basis of the critiques is inaccurate.

    Furthermore, the sampling technique here seems quite good, assuming it is the same technique used in the previous survey by the same people. For instance, their practice of visiting households in person eliminates the “do they have a telephone” problem you refer to.

    I agree we’ll know more after the study is available to be read in full.

  9. Sailorman says:

    Which of the three protests I quoted address the “accuracy and representative qualities of the sample”? I think your claim that this is the basis of the critiques is inaccurate.

    Having done a fair bit of statistics, the first thing that popped into my mind when I saw the results was “that’s odd that the difference is so huge. I wonder if they screwed up the methods or sampling?”

    I say this because the underlying math used to turn the sample into a national average is actually pretty simple. If there was anything wrong with the math I’d be surprised; it’s not as if the Lancet doesn’t have statisticians who can multiply correctly. And anyone with decent access to the paper and data–the NYT, for example–would have been able to pick up any egregious errors.

    You are correct that the articles do not give details. But I’m betting that (for example) the Blendon and Barry quotes (NYT article) were made on the basis of methods and representative qualities. I’m also betting that most protests from just about anyone with a stats background are also made on that basis.

  10. Daran says:

    The underlying assumption of statistical prediction is that the sample size is randomly selected and thus bears a random amount of variation. Because it is difficult to achieve a true random sampling, however, the assumption is generally incorrect.

    Er, sample size is not randomly selected. The sample, ideally, should be randomly selected. Since that isn’t always possible, there are various techniques available to adjust for any non-randomness in the sample.

    In choosing a sample size, three principles govern: bigger is better; a law of diminishing returns applies; and the margin of error is more or less independent of the size of the population, provided the latter is large. What that means is that a sample of 1000 will be just as accurate for the population of the world as it is for the population of the US.
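
    [A numerical sketch of that last point, using the textbook finite population correction; the population figures below are round 2006-era numbers.]

    ```python
    import math

    # Margin of error with the finite population correction:
    # MoE = z * sqrt(p(1-p)/n) * sqrt((N - n) / (N - 1)).
    # For n = 1000 the correction is negligible whether N is the
    # population of the US or of the world.
    def moe(n, N, z=1.96, p=0.5):
        return z * math.sqrt(p * (1 - p) / n) * math.sqrt((N - n) / (N - 1))

    print(f"US    (N = 3.0e8): +/- {moe(1000, 300_000_000):.4%}")
    print(f"world (N = 6.5e9): +/- {moe(1000, 6_500_000_000):.4%}")
    # Both print +/- 3.0990% -- effectively identical.
    ```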

  11. molly blythe says:

    Actually, they interviewed about 1800 households and a total of 12,000 Iraqis. This means that unless there’s some incredible sampling error, the likes of which would be difficult to imagine, there’s no way in hell they’re too far off.

  12. Sailorman says:

    Daran: Yes, I know; I meant “sample”, not “sample size”; it was a mistype.

    Have you read the study? I just finished it. Here it is:
    http://www.thelancet.com/webfiles/images/journals/lancet/s0140673606694919.pdf

  13. RonF says:

    So, people went door to door and asked “how many people in your family died and when” to get pre- and post-invasion stats, eh?

    This would tend to bias negatively against areas such as in Kurdistan, where, when poison gas was used, whole villages died, and there was no one left to ask.

  14. RonF says:

    How representative was the sample geographically? Ethnically/religiously (Shia vs. Sunni vs. Kurd)? Urban/rural? Rich/poor? Like they are saying, it’s hard to do dependable research in a war zone, and I’m not sure that they can account for that by simply expanding the error bars.

  15. Brandon Berg says:

    Having a fairly solid grasp on the basics of statistics, I agree that the claim that a sample of 1800 is insufficient is completely bogus. And the confidence interval strikes me as fairly typical for social science.

    But I’m still fairly skeptical. If these numbers are to be believed, then there were just under 900 violent deaths per day in the year between June 2005 and June 2006 (it’s doubled each year for the last two years). That’s 450 a day from gunshot wounds, over 150 from car bombs, and over 100 each from air strikes and other explosions. I don’t have a dog in this fight, so this isn’t an ideologically motivated skepticism—but this just doesn’t square with anything I’ve heard before. Is there anyone here who isn’t surprised by this?
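
    [Brandon’s per-day figure can be roughly reproduced from his own stated assumptions; a sketch:]

    ```python
    # Rough check of the "just under 900 a day" figure, using only the
    # comment's own assumptions: ~600,000 violent deaths over the ~40
    # months from March 2003 to June 2006, with the rate doubling yearly.
    total = 600_000
    # Relative weights: a ~4-month stub at rate 1, then years at 1, 2, 4.
    weights = [4 / 12, 1, 2, 4]
    unit = total / sum(weights)     # deaths in a "rate 1" year
    final_year = 4 * unit           # June 2005 - June 2006
    print(f"final year ~ {final_year:,.0f} deaths, ~ {final_year / 365:,.0f}/day")
    # final year ~ 327,273 deaths, ~ 897/day -- "just under 900" checks out.
    ```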

  16. Charles S says:

    Brandon,

    [edited to remove erroneous calculation based on figure 4 and resulting snark, I should never snark at someone for a miscalculation when I myself am guilty of the miscalculation].

    The number of insurgency/military/death squad related deaths reported in the news for Baghdad has frequently reached 100+ per day for much of 2006 (do you read juancole.com?). It was lower last fall, but around the same range or higher in the summer of 2005. The Lancet study found that 4 provinces had higher rates than Baghdad, and several others had similar rates to Baghdad. 900 per day is shocking, but that includes murders resulting from lawlessness as well as US bombing and insurgent bombing and death squads.

    Also, notice figure 4, and the related discussion. A matching rise in the rate of violent deaths for 2005 and 2006 was reported by the DoD program, which used a completely different methodology and covered a much more restricted category of fatalities, and by IBC estimates, which are known to dramatically underestimate death rates (most deaths don’t make the papers). So the rate of change is replicated by two other sources. It is a shocking number, but it cannot simply be rejected on a “That’s too big” basis.

    RonF, after your participation in the discussion of the previous study, you really can’t be treated as a credible discussant on this subject. Go read the study, and show some sign that you understood a word of it first (you never did last time).

  17. Sailorman says:

    Just FYI: the validity and strength of a given sample size are affected by the prevalence of the issue being researched. You only need a small sample of people to get political views, say, because everyone has views of some kind. You need a much larger sample to evaluate the prevalence of a condition which is much more rare. That’s why the “jellybean” example isn’t actually apposite here. And that’s why it’s not “obvious” that the sample size is sufficient, though in this case it appears to be sufficient, because–as I said–the Lancet’s statisticians are clearly capable folk.
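
    [A sketch of the prevalence point, under the standard formula for estimating a proportion to a fixed relative precision; the ±20% target below is arbitrary.]

    ```python
    import math

    # Sample size needed to estimate a proportion p to within +/- 20%
    # of p itself (relative precision): n ~ z^2 * (1 - p) / (p * r^2).
    # Rare outcomes need far bigger samples than common ones.
    def n_needed(p, r=0.20, z=1.96):
        return math.ceil(z**2 * (1 - p) / (p * r**2))

    for p in (0.50, 0.10, 0.01, 0.001):
        print(f"prevalence {p:6.1%}: need n ~ {n_needed(p):,}")
    # prevalence  50.0%: need n ~ 97
    # prevalence  10.0%: need n ~ 865
    # prevalence   1.0%: need n ~ 9,508
    # prevalence   0.1%: need n ~ 95,944
    ```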

    Having now read and reread the study, I’m beginning to think it is better and more accurate than I thought.

  18. RonF says:

    Another question: how was it concluded that everyone whose death was reported was a civilian? One might think that if someone died because they were shot while planting an IED or sniping, their family might not be all that willing to admit that to someone who they have no particular reason to trust.

  19. Jesse says:

    Hi there, just to insert a little perspective on the IBC studies vs the Lancet on the media reportage issue, I’ve just had a look at StatCan’s homicide rate/total for recent years here in Canada. (see the numbers here: http://www.statcan.ca/Daily/English/051006/d051006b.htm,
    [for 2000 alone:] http://www.statcan.ca/Daily/English/011031/d011031b.htm)

    To those who don’t know, StatCan is a part of the Canadian government and responsible for carrying out censuses etc.

    Here’s my point: StatCan reported that in 2000 there were 542 homicides. Note that homicides are just one sort of violent/unexpected death; the Lancet includes other types of death in their study, presumably some accidental or collateral. So, that’s 542 murders in Canada for 2000.

    Another study about the media (data for which is also from StatCan and the National Media Archive, based at Simon Fraser University in BC: http://www.media-awareness.ca/english/resources/educational/handouts/crime/our_top_story_tonight.cfm) claims that in 2000 the two major news organizations in Canada (CTV and CBC) reported on homicide 230 times. This is TV coverage and I expect that a number of these stories overlap each other. However, disregarding that fact and imagining that each story was unique, only 42.4% of homicides are reported. This is from the two most respected (and possibly representative) news organizations in the country.

    The Lancet study estimates approximately 600,000 deaths from violence in Iraq attributable to the war. IBC estimates 48,783 at the most, which they derive from media reportage. In other words, the English-language media reports only 8.13% of deaths in Iraq.
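
    [The percentages above check out against the comment’s own figures:]

    ```python
    # Checking the percentages, using the comment's own figures.
    homicides_2000 = 542        # StatCan, Canada, 2000
    tv_stories = 230            # CTV + CBC homicide stories, 2000
    print(f"homicides covered: {tv_stories / homicides_2000:.1%}")      # 42.4%

    lancet_estimate = 600_000   # Lancet violent-death estimate (rounded)
    ibc_max = 48_783            # IBC maximum at the time
    print(f"deaths captured by IBC: {ibc_max / lancet_estimate:.2%}")   # 8.13%
    ```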

    The way I see it, that doesn’t seem unreasonable given that in Canada, one of the safest, most literate and technologically advanced countries (i.e. open, cheap, ubiquitous communication technology), we report on less than half of all homicides. Also consider the fact that with regard to the Iraq war it is common and accepted knowledge that censorship of the news is necessary (e.g. DOD claiming not to keep stats, coffins returning to US etc). These are policies that the US administration readily admits to and defends. With regard to the West in general (and Canada and US in particular) local violence and murders are often big news–some might say hysterically so–and yet in our safe, open societies we still don’t report on every single homicide that occurs.

    Of course the situation in Iraq, even without censorship of the media, is pretty much completely untenable for the type of reporting we expect in our domestic news media. With reporters almost never ascertaining ‘facts’ by first-hand on-the-scene reporting (most seldom leave the Green Zone), Western media in Iraq relies on second- or third-hand accounts, or often enough, reports from vested interests such as the American military or various parts of the Iraqi government.

    In a veritable warzone is it unrealistic to expect that violent deaths are under-reported to this extent? In fact, if anything, I’m surprised at and commend the courage of journalists there to be able to report even 8% of such news. These people are in fact war correspondents and can hardly be expected to get up close and personal with every story. However, the media clearly fails in that it doesn’t acknowledge its own limitations, and studies like IBC partially collude in that. This level of reporting probably can’t really be improved in the short term, but it also shouldn’t (and IBC shouldn’t) purport to tell the whole story.

  20. ungeziefer says:

    Jesse,

    Very good comment — we tend to think of the media as over-hyping crime and violence (“if it bleeds, it leads”), and they do. But I’m sure that even in the U.S. if we counted up every reported murder it would be less than half of the country’s homicide figure.

    Like most people, my initial reaction to the figure was “that sounds too high to be plausible” — and I’m still not sure you can completely reconcile the fact that they did the same study last year (granted, with a smaller sample) and reported a figure 1/6 as high as this one (???).

    But just stop and think about this: ask any soldier who comes back from Iraq “Did you kill anyone?”, and they will most likely say “Yes.” How many troops have gone over there? It’s around 140,000 – 150,000 at any given time, but those are not the same troops all serving three tours of duty. I don’t know how many individual soldiers have been deployed (maybe I’ll try to look it up), but if you figure most of them killed AT LEAST 1 person (and of course some of them killed a whole bunch of people) . . . . . I realize that’s ridiculously anecdotal and un-scientific, but it’s worth thinking about.

    I do think it must be hard to get accurate numbers. But it’s interesting how things can be spun for political reasons — such as, compare this report by two well-respected medical institutions and how it’s being discredited, with the numbers that were thrown out about Milosevic’s ethnic cleansing: “100,000, 200,000, 300,000 … Well, we’re not sure, but it’s definitely up there — at any rate a massive genocide. … Etc.” Turns out the figure is no more than 11,000, and probably far less — which is still a horrific atrocity, and deserving of the term “genocide,” but my point of course is that when it’s convenient to demonize a hated enemy and make him look like the next Hitler, politicians and pundits will pull figures straight from their own arses and repeat them over and over and insist that they’re true and accurate. When it’s OUR fault, well, that’s a whole different story …

    To “RonF”: I think your logic is backwards:

    Like they are saying, it’s hard to do dependable research in a war zone, and I’m not sure that they can account for that by simply expanding the error bars.

    So in other words, so many people are being killed that it’s too dangerous to do a poll of people killed, therefore the number of people killed must be inflated. Wha?? If anything, the opposite would be true: unable to talk to people in the most violent areas, your sample would be biased on the low side. It’s a bit like, “The problem with Iraq is that the media never report all the GOOD news — mainly cause they fear for their lives and can’t really travel freely in Iraq. If they COULD do their jobs without the constant fear of being kidnapped or killed, they’d be reportin on how everything’s goin just great over there.”

  21. ungeziefer says:

    Another comment by “RonF” was equally nonsensical:

    So, people went door to door and asked “how many people in your family died and when” to get pre- and post-invasion stats, eh?

    This would tend to bias negatively against areas such as in Kurdistan, where, when poison gas was used, whole villages died, and there was no one left to ask.

    I’m guessing you’re not disputing the poll here but rather trying to say that the PRE-invasion number should be much higher? In which case I can only infer that you’re talking about deaths in the 80’s and/or the first Gulf War — not terribly relevant to the “pre-invasion” figures, since the invasion happened more than 10 years later. But aside from that, your logic applies equally to the POST-invasion, since if every member of a family/household was killed (whether by U.S. troops on the ground or air strikes or insurgents or whatever), there would be “no one left to ask” — thus deflating the number.

    One last point: we’re talking about a country in which the most crippling sanctions ever devised had been in effect for over a decade, in which malnutrition and disease were widespread and so forth (Madeleine Albright, remember, acknowledged that half a million children had died because of the sanctions.) The fact that, more than three years after “Mission Accomplished,” the death rate is higher AT ALL is a major scandal and a tragedy. Reasonable people can ask, “are things actually worse now than they were before, in an economically crippled country run by a brutal dictator?” And that is just too sad for words.

  22. Raznor says:

    The problem with many people’s assumptions here is what it means to be dead because of the war. The standard method is to count total fatalities during the war compared to a time not during the war. But when they say 600,000 deaths, remember not all of those are directly due to violence. Would you say someone killed by an otherwise treatable infection because the local hospital was destroyed is not a death caused by the war? Or that a person killed by exposure to toxic chemicals leaked during bombings, or because considerable damage to the country’s infrastructure prevented cleanup, is also not a death caused by the war? I think all can agree that a war adversely affects people who aren’t involved in the violence itself.

    Oh, and the math teacher in me has to comment on this:

    But that doesn’t mean that the study is unreliable, or its methods incorrect; it just means that the results have a wide confidence interval. We can be reasonably certain there have been between 426,369 and 793,663 excess Iraqi deaths since our invasion.

    This is a distinction that most people miss, because it is a very subtle one, but the (98%?) confidence interval does not mean we are 98% certain that the actual number of deaths is between 426,369 and 793,663, but that if this study were repeated with the same sample size, we are 98% confident that this other study would find between 426,369 and 793,663 deaths. Like I said, a very subtle distinction, and thus one I repeated ad nauseam when I taught my students about confidence intervals last year.

  23. Daran says:

    Raznor:

    Oh, and the math teacher in me has to comment on this:

    But that doesn’t mean that the study is unreliable, or its methods incorrect; it just means that the results have a wide confidence interval. We can be reasonably certain there have been between 426,369 and 793,663 excess Iraqi deaths since our invasion.

    This is a distinction that most people miss, because it is a very subtle one, but the (98%?) confidence interval does not mean we are 98% certain that the actual number of deaths is between 426,369 and 793,663,…

    Correct. The assumption that it does is known as the prosecutor’s fallacy.

    …but that if this study were repeated with the same sample size, we are 98% confident that this other study would find between 426,369 and 793,663 deaths. Like I said, a very subtle distinction, and thus one I repeated ad nauseam when I taught my students about confidence intervals last year.

    Incorrect. The correct statement is that if this study were repeated (with or without the same sample size) then we are 98% confident that the other study’s 98% confidence interval would include the actual number of deaths.

    This is a very subtle point. The difference between the hypothetical study and the actual one is that we have more information about the latter. We know what its confidence interval actually is. In general, it takes an application of Bayes’ theorem to compute an a posteriori probability from an a priori probability plus additional information. The problem with confidence intervals is that we can’t do this calculation because we don’t have a value for the a priori probability that the actual death rate was between 426,369 and 793,663.
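
    [For readers who would rather see the distinction than parse it: a small simulation of the coverage property described above, using a 95% interval and a made-up proportion for simplicity.]

    ```python
    import math
    import random

    # Across repeated samples, ~95% of the *intervals* cover the fixed
    # true value; no single interval gets to claim a 95% probability of
    # containing it. (95% is used here for familiarity; the exact level
    # doesn't change the point.)
    random.seed(1)
    TRUE_P = 0.30                    # fixed, unknown-in-real-life rate
    n, z, trials = 1000, 1.96, 10_000
    covered = 0
    for _ in range(trials):
        hits = sum(random.random() < TRUE_P for _ in range(n))
        p_hat = hits / n
        moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
        covered += p_hat - moe <= TRUE_P <= p_hat + moe
    print(f"intervals covering the true rate: {covered / trials:.1%}")  # ~95%
    ```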

  24. ungeziefer says:

    In general, it takes an application of Bayes’ theorem to compute an a posteriori probability from an a priori probability plus additional information. The problem with confidence intervals is that we can’t do this calculation because we don’t have a value for the a priori probability that the actual death rate was between 426,369 and 793,663.

    In all honesty, I minored in Philosophy in college, and I actually have no idea what you’re talking about. While that probably indicates my ignorance, I think you’re using jargon to deliberately obfuscate the truth (which is not technically a “fallacy,” but should be).

    There is no “a priori” question here at all. What is needed is a serious objective look at the evidence (or “a posteriori” knowledge, if you want to sound smart), by which we can determine how many people have died.

    So, in short, what the hell are you talking about????

    And to both “Raznor ” and “Daran”: I’m afraid you’re obligated to explain in more detail what in Christ’s name you’re talking about, because at this point essentially all you’re saying is that “if you repeated the study, you’d get the same results,” without stating why this is so, or why these results are therefore incorrect.

  25. Robert says:

    I think you’re using jargon to deliberately obfuscate the truth…

    Nope. Statistics is mathematical reasoning. If you don’t have the tools, you can’t engage in the reasoning; Bayes’ theorem is akin to a sharpened rock flake. (“Those swank Cro-Magnons, coming around here with their pointy rocks thinking they’re such hot stuff…they’re just obfuscating the tool-using process!”)

    I have a sufficient grounding in statistics to have a glimmering of awareness about how abysmally ignorant I am. If you don’t know what Daran’s talking about, it’s because you lack the cognitive tools to understand it, not because he’s being obscure. (NOI.)

  26. Daran says:

    Robert:

    If you don’t know what Daran’s talking about, it’s because you lack the cognitive tools to understand it…

    Correct. I wasn’t trying to obfuscate. I wasn’t even replying to ungeziefer, but to someone who had indicated that he did have the cognitive tools.

    And to both “Raznor ” and “Daran”: I’m afraid you’re obligated to explain in more detail what in Christ’s name you’re talking about, because at this point essentially all you’re saying is that “if you repeated the study, you’d get the same results,” without stating why this is so, or why these results are therefore incorrect.

    I was replying to Raznor, who in turn was responding to Amp’s original post. You weren’t even in this branch of the conversation, so I don’t see how either of us is “obligated” to you in any way.

  27. ungeziefer says:

    I apologize for both my ignorance and my rude tone.

    If you care to explain what this means —

    This is a distinction that most people miss, because it is a very subtle one, but the (98%?) confidence interval does not mean we are 98% certain that the actual number of deaths is between 426,369 and 793,663, but that if this study were repeated with the same sample size, we are 98% confident that this other study would find between 426,369 and 793,663 deaths. Like I said, a very subtle distinction, and thus one I repeated ad nauseam when I taught my students about confidence intervals last year.

    The correct statement is that if this study were repeated (with or without the same sample size) then we are 98% confident that the other study’s 98% confidence interval would include the actual number of deaths.

    — I’m interested. If not, carry on with your sharp rocks, and I will return to my cave dwelling.

  28. Robert says:

    Re-reading what I wrote, ungeziefer, my tone was unpleasant and hectoring. That wasn’t my intention. Apologies.