Yesterday the government lifted the latest in a series of lockdowns, and this last one looked like a parody. While the more serious members of the government's advisory group proposed a full lockdown for 12 days, the government turned it into a burlesque with a one-day reopening in the middle, in which de facto only schools, libraries, museums and a few service activities without powerful patrons (hair salons, hospitality and car repair shops) were closed. Not to mention the epic stupidity of reintroducing mask-wearing outdoors. A parody. Since practically the entire economy, which is the main hotspot of infections, stayed open, the level of infections of course did not fall. Such a lockdown was pointless, unnecessary and harmful. Above all, it was not based on data at all. The advisory group and the government decided on it without even knowing where the centres of infection are. A complete parody. Blind men looking for a key under a street lamp.
Now, if at home we have a burlesque (albeit with tragic consequences for the population and the economy) with an incompetent government that does not know how to collect and analyse data and justify its measures on that basis, on the other side we have scientific lunatics, in love with the beauty of mathematical models, who like to play around with them a little, produce catastrophic scenarios on the basis of odd assumptions and voodoo magic, and publish them in a super-prestigious scientific journal such as Nature. The public takes these publications as pure gold (after all, it is “published in Nature”!!!), when in fact a group of theorists with no sense of reality has sold them “snake oil”, bullshit without relevance. L'art pour l'art par excellence.
One such example is the article by a group of authors led by Seth Flaxman (Flaxman et al., 2020), who as early as June last year published a paper in the super-prestigious scientific journal Nature claiming that lockdowns saved 3 million lives during the first wave last spring. The article went viral and was used as gospel proof that lockdowns are the most useful epidemiological measure (much as, a decade ago in economics, the voodoo study by Alesina and Ardagna (2010) on the “expansionary effects of restrictive fiscal policy” was taken by top European politicians as gospel proof that austerity has positive effects on economic growth). In December last year Philippe Lemoine put the Flaxman et al. paper under the microscope and soon found how catastrophically badly and manipulatively it had been done. He also found that the authors had irresponsibly played with econometric estimates and, on the basis of odd assumptions, produced catastrophic numbers that have no relevant value. They arrived at their results by voodoo magic. Below I publish only the introduction and the conclusion of Lemoine's critique, but I ask you to also read the key part in between, which concerns the methodology. The devil, just as with Alesina and Ardagna, is hidden precisely in the methodology. In the voodoo magic.
The cult of science, with which some people are so in love, can be just as bad as designing measures without regard to data. These are two extremes of madness.
I hadn’t read it until very recently, but since the debate about lockdown has been rekindled by the second wave in Europe, I decided to read it and I was astonished by how bad it was. Since it still plays a very important role in that debate, I think it’s important to explain why it’s bad, so in this post I’m going to take it apart and explain why even Flaxman et al.’s own analysis shows that, while lockdowns do make a difference, they don’t cut transmission that much more than other, less stringent and therefore less costly restrictions. (I basically taught myself Stan over a weekend to do this, so someone should probably check my code, though I’m pretty sure there are no major mistakes.) After pointing out that it changes the policy debate quite a lot, I conclude with a few remarks about the role bad science has played in this debate and how this paper illustrates why the cult of science that many people have fallen into is bad.
What Flaxman et al. did
I’ve seen many people claim that Flaxman et al.’s paper showed that lockdowns were very effective, but I don’t think many of them know what they actually did in that paper, because it showed no such thing despite what the authors claim. In fact, what they did is assume that non-pharmaceutical interventions work (as we shall see, the model was bound to find that lockdowns specifically did most of the work, but more on that later) and fit the data on deaths to a model that makes this assumption in order to infer various epidemiological variables of interest, such as the reproduction number R_t and the number of infections. Once this was done, they just compared the number of deaths predicted by this model, which by construction is going to be very close to the actual number of deaths, to the number of deaths in a counterfactual where there were no interventions.
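The comparison described above can be illustrated with a deliberately crude sketch. It is not the paper's model: the growth rule, the seed, the infection fatality rate and the R_t paths below are all hypothetical values chosen for illustration, and there is no depletion of susceptibles. The only point is that the "deaths averted" figure is the difference between two model outputs, so it inherits whatever the model assumes about why R_t fell.

```python
def cumulative_deaths(r_by_day, seed=100.0, ifr=0.01, generation_days=5.0):
    """Toy epidemic (hypothetical): daily infections grow by R ** (1 / generation_days)."""
    infections = seed
    total_infections = seed
    for r in r_by_day:
        # One day of growth (R > 1) or decline (R < 1).
        infections *= r ** (1.0 / generation_days)
        total_infections += infections
    # Deaths are a fixed fraction of all infections in this toy setup.
    return ifr * total_infections


# "Fitted" path: R_t drops from 3.0 to 0.7 on day 20 (assumed values).
fitted_path = [3.0] * 20 + [0.7] * 40
# Counterfactual: R_t is assumed to stay at its initial value forever.
counterfactual_path = [3.0] * 60

fitted = cumulative_deaths(fitted_path)
counterfactual = cumulative_deaths(counterfactual_path)
deaths_averted = counterfactual - fitted
```

Because the counterfactual keeps R_t fixed at its starting value, any drop in transmission that would have happened anyway is credited to the interventions.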
Again, it’s not a big deal if you don’t understand everything, what you need to remember is that each non-pharmaceutical intervention is assumed to affect R_t immediately, only non-pharmaceutical interventions are supposed to affect it and each is assumed to have the same effect in every country where it’s implemented.
Many people have criticized the first assumption, but while it’s no doubt unrealistic, the second and third assumptions seem even worse. Indeed, it’s obvious that even in the absence of any government interventions, people would still change their behavior in response to the pandemic because they’re afraid. This would reduce transmission and therefore it’s not true that only non-pharmaceutical interventions affect R_t. Moreover, there is also no doubt that each non-pharmaceutical intervention is not equally effective in every country where it’s implemented, so it’s not true that each intervention has the same effect in every country. First, although various restrictions are modeled as the same in the paper, they were actually pretty different. For instance, the intervention that Flaxman et al. call “public events banned” consisted in forbidding gatherings of more than 5 people in Austria, 100 people in France, 500 people in Sweden and 1,000 people in Germany, but in the model all those different policies are treated as identical. Indeed, even “lockdown” was not the same thing everywhere, since under “lockdown” people in France couldn’t leave their place without filling out a form while in Denmark people could meet as long as they were no more than 10.
Not only are very different interventions treated as the same in the model, but even if they had really been identical, their effect in different countries would still have been different for idiosyncratic reasons. Everybody has noted how people in some cultures might be more willing to follow rules, but there are many other possible factors that could also affect how effective the same intervention would be in different countries, such as differences in the proportion of people who need to take public transportation to go to work, differences in the age distribution of the population, etc. As you can see above, Flaxman et al. did include a country-specific effect in the model, which is supposed to model the fact that, precisely for these kinds of idiosyncratic reasons, we don’t expect that even the same non-pharmaceutical interventions will be equally effective in every country. I will come back to this country-specific effect later, because it plays a very important and largely unacknowledged role in Flaxman et al.’s paper, but for the moment it’s enough to say that it’s only associated with the last intervention, which is a lockdown in every country except in Sweden where it’s the ban of public events. So the model only acknowledges that idiosyncratic factors could affect how effective the same interventions are in different countries for lockdowns but not for the other interventions.
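The structure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: it assumes a multiplicative form in which R_t equals a baseline R_0 scaled by the exponential of minus the summed effects of the active interventions, and the function name, parameter names and numbers are all hypothetical.

```python
import math


def reproduction_number(r0, shared_effects, active, country_effect=0.0,
                        last_intervention_active=False):
    """Toy R_t: baseline r0 scaled by exp(-sum of active effects).

    shared_effects: one effect per intervention, identical in every country.
    active: flags saying which interventions are in force on this day.
    country_effect: extra term applied only with the country's last intervention.
    """
    total = sum(effect for effect, on in zip(shared_effects, active) if on)
    if last_intervention_active:
        total += country_effect
    return r0 * math.exp(-total)


# Hypothetical shared effects for three interventions (the last is "lockdown").
shared = [0.1, 0.2, 1.0]

# Country A has all three in force, with no country-specific adjustment.
r_a = reproduction_number(3.0, shared, [True, True, True],
                          country_effect=0.0, last_intervention_active=True)

# Country B never locks down; its last intervention is the second one, so a
# large country-specific effect is the only way to match a similar drop.
r_b = reproduction_number(3.0, shared, [True, True, False],
                          country_effect=1.0, last_intervention_active=True)
```

In this toy setup the two countries end up with the same R_t only because the second one is given a country-specific effect as large as the shared lockdown effect. That is the kind of pattern the following paragraphs discuss for Sweden.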
As you can see, the country-specific effect for Sweden is gigantic, barely less than what the model estimates for the effect of a complete lockdown, whereas it hovers around zero for the other countries. Indeed, according to their prior on the country-specific effect, there was only a 1 in 4,000 chance that it would be that large. Moreover, the effect is pretty tightly estimated for Sweden, but the credible intervals are extremely wide for the other countries.
Now, this completely undermines the conclusion Flaxman et al. draw from their results, so I understand why they swept that fact under the rug and didn’t show this chart anywhere in the paper or even in the supplementary materials… (What I don’t understand, or would not understand if I didn’t know how peer review actually works, is that no reviewer asked for it.) Indeed, the country-specific effect is supposed to model how the same interventions may not be equally effective in every country for idiosyncratic reasons, but unless you believe there is some kind of magic floating in the air in Sweden, it doesn’t make sense to believe that, for some mysterious reasons, banning public events was several orders of magnitude more effective over there than in the rest of Europe, despite the fact that it only banned events with more than 500 people and was therefore less stringent than in any other country except Germany. The obvious explanation is that the model is misspecified and that in fact it’s not the case that lockdowns did most of the work in reducing transmission. Instead, most of the heavy lifting was or would have been done by other, less stringent interventions plus more or less spontaneous behavioral changes. You don’t need a complicated model to reach this conclusion, you just have to look at the death curve in Sweden during the first wave or at the epidemic curve in other countries where there was no lockdown or where it wasn’t nearly as strict as in the Spring but where R_t nevertheless fell below 1 to see that complete lockdowns aren’t necessary to break the epidemic, as Flaxman et al. claim based on results that obviously show no such thing.
I actually think that not mentioning this fact about the country-specific effect in Sweden comes very close to scientific malpractice. It makes their main conclusion, which as just noted can be seen to be implausible without any complicated modeling, very hard to maintain and I have a hard time believing they weren’t aware of that and that it’s not why they carefully avoided the topic in describing their results. In any case, what is clear is that, once you realize that the model was only able to find that no intervention except lockdowns had a meaningful effect on transmission by estimating a huge country-specific effect for Sweden, it becomes impossible to take that conclusion seriously. As we have seen, the assumption that only non-pharmaceutical interventions affect R_t is baked into the model, but prima facie it doesn’t assume that only lockdowns specifically have a significant effect on transmission. Flaxman et al. would probably say that it reached that conclusion from the data, but I think it’s largely artificial and that in practice the model was bound to find that only lockdowns had a large effect on transmission, even if they did not.
I have no doubt that lockdowns saved lives, but they didn’t save nearly as many as people think and they certainly didn’t save 3 million lives in Europe alone during the first wave, as Flaxman et al. claim. They use sophisticated statistical techniques to reach a conclusion that can be rejected with a high degree of certainty just by eyeballing a chart. Their paper is a prime example of propaganda masquerading as science that weaponizes complicated mathematics to promote questionable policies. Complicated mathematics always impresses people because they don’t understand it and it makes the analysis look scientific, but often it’s used to launder totally implausible assumptions, which anyone could recognize as such if they were stated in plain language. I think it’s exactly what happened with Flaxman et al.’s paper, which has been used as a cudgel to defend lockdowns, even though it has no practical relevance whatsoever. The truth is that, with the data and methods they used, it’s impossible to estimate the effect of non-pharmaceutical interventions and anyone who claims otherwise is selling snake oil.
To be clear, while I personally think it’s best to avoid lockdowns, I don’t claim to have demonstrated that. I only claim that, even if you think that lockdowns are the way to go, you shouldn’t use Flaxman et al.’s paper to argue for that view, because it has no practical relevance whatsoever. People sometimes ask why I joke that I’m anti-science, but this is why. Scientific journals, even very prestigious ones, routinely publish this kind of paper that claims to have policy relevance but doesn’t actually have any. It’s essentially voodoo magic for the “I love science” crowd, where the colorful rituals have been replaced by scary-looking mathematical formulas. This is especially likely to happen when, as in the case of the pandemic, the topic is politicized. As they say about sausages, once you know how they’re made, you often no longer want to eat them. Well, with science, it’s the same thing.
Finally, a lot of people have criticized preprints and stressed the importance of peer review during the pandemic, but note that the problem here isn’t the lack of peer review. Flaxman et al.’s paper was peer reviewed and published in one of the most prestigious scientific journals in the world, but it’s still garbage. In fact, not only was the problem not the lack of peer review, but I think this episode illustrates some of the problems with pre-publication peer review. Although I can’t prove it, it’s very likely that Flaxman et al. decided to hide the country-specific effect for Sweden because they knew that it would make it more difficult for them to publish their paper in Nature and scientists have very strong incentives to publish in prestigious journals, which would not exist if we abolished pre-publication peer review and journals as they currently exist. Moreover, precisely because it was peer reviewed and published in a prestigious journal, this paper was able to play a huge role in the policy debate. If pre-publication peer review didn’t exist, on the other hand, it would just have been another paper without the credibility granted to it by peer review and the affiliation with a prestigious journal, which always impress people who can’t judge the quality of a scientific paper and even many who can.
Source: Philippe Lemoine