Ivermectin is an antiparasitic drug, and a very good one. If you are infected with the roundworms that cause river blindness or the parasitic mites that cause scabies, it is wonderfully effective. It is cheap; it is accessible; and its discoverers won the Nobel Prize in 2015. It has also been widely promoted as a coronavirus prophylactic and treatment.
This promotion has been broadly criticized as a fever dream conceived in the memetic bowels of the internet and as a convenient buttress for bad arguments against vaccination. This is not entirely fair. Perhaps 70 to 100 studies have been conducted on the use of ivermectin for treating or preventing COVID-19; several dozen of them support the hypothesis that the drug is a plague mitigant. Two meta-analyses, which looked at data aggregated across subsets of these studies, concluded that the drug has value in the fight against the pandemic.
So if you’re the sort of person who “follows the science,” it might seem perfectly rational to join the fervent supporters of ivermectin. It might even strike you as reasonable to suggest, as one physician and congressional witness did recently, that “people are dying because they don’t know about this medicine.”
The problem is, not all science is worth following.
I work on a small team of researchers who do what one might call “forensic peer review.” In the standard process for scientific publishing, peer reviewers take a manuscript mostly at face value: They ensure that the study makes sense as it’s described. We do something else: We check everything, and try to ferret out any potential biases in reported patterns of digits, statistical impossibilities, inconsistencies between what researchers said they’d do and what they actually did, and plagiarized sentences or paragraphs. And we often find fatal flaws hidden behind a veil of two-dollar words and statistical jargon.
The ivermectin literature has been no exception. Over the past six months, we’ve examined about 30 studies of the drug’s use for treating or preventing COVID-19, focusing on randomized studies, or nonrandomized ones that have been influential, with at least 100 participants. We’ve reached out directly to the authors of these studies to discuss our findings, sometimes engaging in lengthy back-and-forths; when appropriate, we’ve sent messages to the journals in which studies have been published. In our opinion, a bare minimum of five ivermectin papers are either misconceived, inaccurate, or otherwise based on studies that cannot exist as described. One study has already been withdrawn on the basis of our work; the other four very much should be.
In the withdrawn study, a team in Egypt compared outcomes among COVID-19 patients who did and did not receive ivermectin—but, for the latter group, they included deaths that had occurred before the study began. (According to the journal Nature, the lead author “defended the paper” in an email, and claimed that the withdrawal took place without his knowledge. He did not respond to an inquiry from The Atlantic.) Other papers also have egregious flaws. Researchers in Argentina said they recruited participants from hospitals that had no record of having participated in the research, and then blamed mistakes on a statistician who claimed never to have been consulted. A few studies show clear evidence of severe data irregularities. In one from Lebanon, for example, the same section of patient records repeats over and over again in the data set, as if it had been copied and pasted. (An author on that paper conceded that the data were flawed, and claimed to have requested a retraction.)
All of the above may not sound that bad. If five out of 30 trials have serious problems, perhaps that means the other 25 are up to snuff. That’s 83 percent! You might be tempted to think of these papers as being like cheaply made light bulbs: Once we’ve discarded the duds with broken filaments, we can just use the “good” ones.
That’s not how any of this works. We can locate obvious errors in a research paper only by reanalyzing the numbers on which the paper is based, so it’s likely that we’ve missed some other, more abstract problems. Also, we have only so much time in the day, and forensic peer review can take weeks or months per paper. We don’t pick papers to examine at random, so it’s possible that the data from the 30 papers we chose are somewhat more reliable, on average, than the rest. A better analogy would be to think of the papers as new cars: If five out of 30 were guaranteed to explode as soon as they entered a freeway on-ramp, you would prefer to take the bus.
Most problematic, the studies we are certain are unreliable happen to be the same ones that show ivermectin as most effective. In general, we’ve found that many of the inconclusive trials appear to have been adequately conducted. Those of reasonable size with spectacular results, implying the miraculous effects that have garnered so much public attention and digital notoriety, have not.
Given all the care that goes into maintaining scientific literature, how did this house of cards acquire planning permission? The answer is that the pandemic has created a very difficult environment for scientific publishing. In early 2020, a hunger for high-quality information arose immediately. How scared of the coronavirus should we be, and how should we behave? How does the virus spread? How dangerous is it? What decisions should governments make? To answer those questions, scientific studies were produced at record pace, peer-reviewed almost immediately after they were submitted or else put into the public domain via preprint as soon as they had been completed. Publishing science is slow; highly contagious diseases are fast.
It’s not that, under such conditions, a few bad studies were bound to slip through the net. Rather, there is no net. Peer review, especially when conducted at pandemic speed, does not exert the rather boring scientific scrutiny needed to identify the problems described above. Forensic work like ours is not organized by scientific journals. We do not get paid. We are not employed by universities, hired by governments, or supported by private money to do this. We do it because we feel it should be done.
As volunteers, we have no inherent authority. When we ask a research group for access to its original data, in accordance with a long-held standard for maintaining scientific integrity, our requests are commonly refused or ignored. And when we do find what we think are serious anomalies in a given paper, getting the authors, the institutions they work for, or their publication outlets to return our emails tends to be somewhere between challenging and impossible. When we looked at an ivermectin study published over the summer in the Asian Pacific Journal of Tropical Medicine, for example, we found a highly unusual pattern of numbers that implied a failure of randomization. We reported this issue to the journal more than three months ago and have heard nothing substantive back. One of the journal’s executive editors in chief, Bo Cui, told The Atlantic that the study “represents the best available evidence at the time of publication and has undergone scientific peer review and editor review before its publication”; he also said that the journal has asked the authors to address “the randomization issue” and that any eventual retraction would come only after “due process, free from coercion or pressure.” The study’s lead author, Morteza Niaee, told The Atlantic via email that the randomization procedure was “completely acceptable and well-performed.”
This is a consistent theme in our work. We contact the authors of each paper, along with the journal or preprint service where their work is published, long before these issues are discussed in public. Sometimes, the authors or journals even reply, although these communications rarely result in any kind of investigation, let alone a serious consideration of the issues raised.
In this environment, no sinister conspiracy is needed to allow for the construction of an irreparably flawed body of literature. In fact, the suspect quality of the ivermectin/COVID-19 literature may be alarmingly commonplace. Remember, our low estimate is that about 17 percent of the major ivermectin trials are unreliable. John Carlisle, famous in metascientific circles for identifying the most prolific research fraud in the history of medical research (the case of Yoshitaka Fujii, an anaesthesiologist who managed to garner an astonishing 183 retractions), reviewed more than 500 trials submitted to the journal Anaesthesia in the three years leading up to the pandemic and concluded that 14 percent of them contained false data. A 2012 survey of researchers at five academic medical centers in Belgium reported that 1 percent admitted to having fabricated data in the prior three years, though 24 percent said they had observed a colleague doing so. A meta-analysis on the same topic concluded, similarly, that 2 percent of researchers admit to having engaged in serious misconduct while 14 percent say they have observed it in a colleague.
Richard Smith, the former editor of the British Medical Journal, suggested in July that the scientific community is long past due for a reckoning on the prevalence of false data in the literature. “We have now reached a point where those doing systematic reviews must start by assuming that a study is fraudulent until they can have some evidence to the contrary,” he wrote. This is less hostile than it sounds. Smith isn’t saying that everything is fraudulent, but rather that everything should be evaluated starting from a baseline of “I don’t believe you” until we definitely see otherwise. Think of the airport-security workers who assume that you may be carrying contraband or weapons until you prove that you aren’t. The point is not that everyone is armed but that everybody has to go through the machine.
But it has not yet sunk into the public consciousness that our system for building biomedical knowledge largely ignores any evidence of widespread misconduct. In other words, the literature on ivermectin may be quite bad, and in being so, it may also be quite unremarkable.
If this is the case, how does medical science manage to navigate all the bad research? How have we not returned to the ages of leeches and bloodletting?
The secret, again, is simple: Much research is simply ignored by other scientists because it either looks “off” or is published in the wrong place. A huge gray literature exists in parallel to reliable clinical research, including work published in low-quality or outright predatory journals that will publish almost anything for money. Likewise, the authors of fabricated or heavily distorted papers tend to have modest ambitions: The point is to get their work in print and added to their CV, not to make waves. We often say these studies are designed to be “written but not read.”
Although some of the papers we examined may claim, for instance, that ivermectin is a perfect COVID-19 prophylactic, they do so based on a smallish study of a few hundred people, and the work is published in journals that during pre-pandemic times would have been deeply obscure. When a group claims to have reviewed, say, 100,000 patient records, and then publishes its dubious results in a high-profile journal, the risk of drawing scrutiny is significant.
In a pandemic, when the stakes are highest, the somewhat porous boundary between these publication worlds has all but disappeared. There is no gray literature now: Everything is a magnet for immediate attention and misunderstanding. An unbelievable, inaccurate study no longer has to linger in obscurity; it may bubble over into the public consciousness as soon as it appears online, and get passed around the internet like a lost kitten in a preschool. An instantly forgettable preprint, which would once have been read by only a few pedantic experts, can now be widely shared among hundreds of thousands on social media.
And our work will begin all over again.
The fact that there is no true institutional vigilance around a research literature that affects the health of nations, that it is necessary for us to do this, is obscene. It is a testament to how badly the scientific commons are managed that their products are fact-checked for the first time by a group of weary volunteers.