Friday, September 16, 2016

A curmudgeonly read of the ZIKV case control study, Brazil 2016 (Lancet ID)

A ‘definitive’ study examining the relationship between ZIKV and microcephaly in Brazil has just been published [1].

I need to preface this by saying this obviously represents a massive amount of work by a large number of people under very difficult circumstances, with many families making massive sacrifices to be involved, and I am in no way denigrating those efforts. The paper is also very explicit about being a preliminary analysis, yet it is being touted as a definitive causal statement.


But, why do I say that? Bias, bias, and more bias. 

I. Sample size, power and preliminary analyses

The authors state: “The original study aimed to include 200 cases and 400 controls to have 90% power, 95% precision to detect an association with an odds ratio of 2 or greater, assuming that 67% of cases were exposed.”

Power calculations exist for a very good reason- small numbers lie to you. There is a fantastic discussion of this in [2]:

“However, as small studies are particularly susceptible to inflated effect size estimates and publication bias, it is difficult to be confident in the evidence for a large effect if small studies are the sole source of that evidence.”

This is why protocols get approved and pre-filed. Interim analyses are dangerous, as small numbers are unstable- I can confidently predict the final OR will be much, much closer to 1.
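To put rough numbers on this, here is a minimal sketch of the power at the planned and interim sample sizes. It assumes a simple unmatched comparison of two proportions with a normal approximation, which is not the authors' own (matched) calculation. The control exposure prevalence is back-calculated from the stated OR of 2 and 67% exposure in cases, the interim sizes (32 cases, 62 controls) are inferred from the figures quoted later in this post, and the function names are mine.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def control_exposure(p_cases, odds_ratio):
    """Exposure prevalence in controls implied by case exposure and the OR."""
    odds_controls = (p_cases / (1.0 - p_cases)) / odds_ratio
    return odds_controls / (1.0 + odds_controls)

def approx_power(n_cases, n_controls, p_cases, odds_ratio, z_alpha=1.959964):
    """Approximate two-sided power for comparing two independent proportions."""
    p_controls = control_exposure(p_cases, odds_ratio)
    se = sqrt(p_cases * (1 - p_cases) / n_cases
              + p_controls * (1 - p_controls) / n_controls)
    return norm_cdf(abs(p_cases - p_controls) / se - z_alpha)

planned = approx_power(200, 400, 0.67, 2.0)   # protocol sample size
interim = approx_power(32, 62, 0.67, 2.0)     # assumed interim sample size
print(f"planned: {planned:.2f}, interim: {interim:.2f}")
```

Under these assumptions the planned design comfortably clears 90% power for an OR of 2, while the interim sample falls well short of it.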

II. Biologically implausible effect sizes

The overall odds ratios (whether 55.5 or 86.5) are simply biologically implausible. The only comparable OR I've ever seen, and the most iron-clad relationship in epi, is mesothelioma and long-term occupational asbestos exposure, with an OR = 50.0 (25.8–96.8) [3]. If you have long-term exposure, you'll get mesothelioma, and essentially no-one else gets mesothelioma.

Looking at another very strong relationship that everyone is familiar with, we have lung cancer and smoking, with an RR = 8.96 (95% CI: 6.73–12.11) (RR, since it's pooled in a meta-analysis) [4].

III. Biases in analysis

The authors analyze using “median unbiased estimator for binary data in an unconditional logistic regression model” which is also called ‘exact logistic’ to reduce instability due to small (or zero) cell counts. Excellent discussion here:

However, this exceedingly wide CI, with an upper bound of +∞, suggests a major problem and a potentially biased estimate, which requires closer examination. There are newer alternatives, particularly so-called Firth logistic regression.

Rerunning the published numbers (ignoring matching and covariates) using Stata’s –firthlogit- gives an OR of 86.5 (95% CI: 4.9 to 1523.4). While still disconcertingly wide, this CI is acceptable for such sparse data.
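For a single binary exposure, the Firth point estimate coincides with the odds ratio obtained after adding 0.5 to every cell of the 2x2 table, so these figures can be checked without Stata. A minimal sketch, assuming the cell counts implied elsewhere in this post (13 ZIKV(+) and 19 ZIKV(-) cases; 0 ZIKV(+) and 62 ZIKV(-) controls) and a Wald interval on the log scale. This is not a general Firth implementation, and the function name is hypothetical.

```python
from math import exp, log, sqrt

def corrected_or(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.959964):
    """Odds ratio and Wald 95% CI after adding 0.5 to each cell."""
    a, b, c, d = (x + 0.5 for x in (case_pos, case_neg, ctrl_pos, ctrl_neg))
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return exp(log_or), exp(log_or - z * se), exp(log_or + z * se)

# Assumed counts: 13/19 cases ZIKV(+)/(-), 0/62 controls ZIKV(+)/(-)
or_, lo, hi = corrected_or(13, 19, 0, 62)
print(f"OR = {or_:.1f} (95% CI: {lo:.1f} to {hi:.1f})")
```

With these counts the sketch appears to reproduce the estimate and interval quoted above, which suggests the width of the CI is driven almost entirely by the single zero cell.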

IV. Loss of controls

The overall OR of 55.5 (8.6 to +∞) [or Firth: 86.5 (4.9 to 1523.4)] is based on 62 controls. However, the authors report moderate levels of refusal (76% agreed, so 20 refused). So what happens if some of those twenty controls that declined to participate were actually ZIKV (+)?

N of 94:                          OR= 86.5 (95% CI: 4.9 to 1523.4;  p= 0.002)
N of 114 (5 ZIKV (+) controls):   OR= 9.8  (95% CI: 3.2 to 29.6;    p< 0.001)
N of 114 (10 ZIKV (+) controls):  OR= 4.8  (95% CI: 1.9 to 12.3;    p= 0.001)
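These what-if rows can be sketched by adding 0.5 to each cell of the 2x2 table, which for a single binary exposure matches the Firth point estimate. The sketch assumes 13 ZIKV(+) and 19 ZIKV(-) cases (inferred from the percentages in this post), that all 20 refusers join the 62 enrolled controls, and that a chosen number of them were ZIKV(+). The names are mine, not from the paper.

```python
from math import exp, log, sqrt

CASE_POS, CASE_NEG = 13, 19   # assumed: 32 cases, 19 (59%) ZIKV(-)
ENROLLED_CONTROLS = 62        # all ZIKV(-) in the published data
REFUSED_CONTROLS = 20         # controls who declined to participate

def corrected_or(a, b, c, d, z=1.959964):
    """Odds ratio and Wald 95% CI after adding 0.5 to each cell."""
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return exp(log_or), exp(log_or - z * se), exp(log_or + z * se)

def scenario(refuser_pos):
    """OR if `refuser_pos` of the 20 refusers had been ZIKV(+) controls."""
    ctrl_neg = ENROLLED_CONTROLS + REFUSED_CONTROLS - refuser_pos
    return corrected_or(CASE_POS, CASE_NEG, refuser_pos, ctrl_neg)

results = {k: scenario(k) for k in (5, 10)}
for k, (or_, lo, hi) in results.items():
    print(f"{k} ZIKV(+) refusers: OR = {or_:.1f} (95% CI: {lo:.1f} to {hi:.1f})")
```

Changing the tuple passed to the loop makes it easy to see how few positive controls it takes to pull the estimate down by an order of magnitude.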

While all are still significant, the estimates very rapidly progress from jaw-dropping through interesting to ‘ho-hum’ [5], and statistically significant does not always mean biologically important.

Is there reason to think those that refused might be different from those that participated? Yes, I think so- perhaps they lived in outlying neighborhoods, or have different SES or other characteristics that might have a direct impact on likelihood of being ZIKV(+).

Other issues.

1. High levels of arboviral coinfection were not included in analysis- this could, and should, have been considered in the regression models, both as interactions and as covariates. These data are rich enough to support a more comprehensive analysis.

2. No controls were ZIKV(+), and 19 (59%) of cases were ZIKV(-)- this is truly bizarre. I suspect what’s going on here is that ZIKV is not playing nice in serological tests [6]. Specifically, optical density (titer) responses for anything are a continuum, which requires a cut-off to determine sero-positivity (generally 3 SDs above the mean of a pool of sero-naïves).

If this cutoff is ‘wrong’ for ZIKV antibodies then there could be massive bias in classifying exposure, so the exposures captured might represent only the very highest levels of viremia, where the risk could indeed be very high. Moreover, the high levels of co-infections suggest something is interfering with the serology in an important way.
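To make the cut-off point concrete, here is a small sketch of the usual "mean plus 3 SDs of a sero-naïve pool" rule. All optical density values below are invented purely for illustration and have nothing to do with the study data; the function and variable names are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical optical densities (invented for illustration only)
naive_pool = [0.08, 0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.07]
samples = {"A": 0.15, "B": 0.22, "C": 0.60}

def cutoff(naive_ods, n_sd=3.0):
    """Sero-positivity threshold: mean of the naive pool plus n_sd SDs."""
    return mean(naive_ods) + n_sd * stdev(naive_ods)

def classify(sample_ods, threshold):
    """Label each sample as sero-positive (True) or not (False)."""
    return {name: od > threshold for name, od in sample_ods.items()}

strict = classify(samples, cutoff(naive_pool, n_sd=3.0))
lenient = classify(samples, cutoff(naive_pool, n_sd=2.0))
print("3 SD:", strict)
print("2 SD:", lenient)
```

In this toy example the borderline sample flips from negative to positive when the threshold moves from 3 SDs to 2, which is exactly the kind of exposure misclassification that a poorly tuned cut-off could produce.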

While not directly applicable to arboviruses, one example (Helicobacter pylori) found large differences in ORs when using a generic ELISA vs. one tuned for populations-at-risk [7].

Update 1: I should be clear here, I am not questioning that ZIKV is associated with MC as it clearly is in NE Brazil, but I am not yet convinced it is the sole risk factor, and the magnitude of that association is entirely unsettled.


1.         de Araújo TVB, Rodrigues LC, de Alencar Ximenes RA, de Barros Miranda-Filho D, Montarroyos UR, de Melo APL, et al. Association between Zika virus infection and microcephaly in Brazil, January to May, 2016: preliminary report of a case-control study. Lancet Infect Dis. doi:10.1016/S1473-3099(16)30318-8
2.         Button KS, Ioannidis JPA, Mokrysz C, Nosek BA, Flint J, Robinson ESJ, et al. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013;14: 365–376. doi:10.1038/nrn3475
3.         Rake C, Gilham C, Hatch J, Darnton A, Hodgson J, Peto J. Occupational, domestic and environmental mesothelioma risks in the British population: a case–control study. Br J Cancer. 2009;100: 1175–1183. doi:10.1038/sj.bjc.6604879
4.         Gandini S, Botteri E, Iodice S, Boniol M, Lowenfels AB, Maisonneuve P, et al. Tobacco smoking and cancer: A meta-analysis. Int J Cancer. 2008;122: 155–164. doi:10.1002/ijc.23033
5.         Nakagawa S, Cuthill IC. Effect size, confidence interval and statistical significance: a practical guide for biologists. Biol Rev. 2007;82: 591–605. doi:10.1111/j.1469-185X.2007.00027.x
6.         De Smet B, Van den Bossche D, van de Werve C, Mairesse J, Schmidt-Chanasit J, Michiels J, et al. Confirmed Zika virus infection in a Belgian traveler returning from Guatemala, and the diagnostic challenges of imported cases into Europe. J Clin Virol. 2016;80: 8–11. doi:10.1016/j.jcv.2016.04.009
7.         Yuan J-M, Yu MC, Xu W-W, Cockburn M, Gao Y-T, Ross RK. Helicobacter pylori infection and risk of gastric cancer in Shanghai, China: updated results based upon a locally developed and validated assay and further follow-up of the cohort. Cancer Epidemiol Biomarkers Prev. 1999;8: 621–624.

Wednesday, June 22, 2016

ZIKV IgM in microcephaly positive infants (Lancet April 18, 2016)

An important contribution to the unfolding public health situation in Brazil appeared recently [1]; in this short report the authors present IgM values for 31 infants born with microcephaly from ca. Sept 12 to Oct 27, 2015. However, the authors present no quantitative analyses, so I've done a secondary analysis of these data:

Full details here:


[1]. "Positive IgM for Zika virus in the cerebrospinal fluid of 30 neonates with microcephaly in Brazil" Marli Tenorio Cordeiro, Lindomar J Pena, Carlos A Brito, Laura H Gil, Ernesto T Marques

Tuesday, March 22, 2016

The Lancet: Vectors, islands and Zika

The newest study exploring the association between ZIKV and microcephaly has just been published [1]; in it, the authors use diverse datasets to model the estimated risk of microcephaly based on eight reported cases combined with the attack rates from serology in French Polynesia (FP). While the authors are refreshingly explicit about their model assumptions, there are some potentially major data limitations that may constrain the utility of this analysis.

Islands and serology
The lynchpin issue for these estimates of the risk for microcephaly is: do the attack rates from captured serology serve as an accurate proxy for ZIKV exposure in pregnant women in FP?

Three serology datasets are used:

Pre-epidemic: 593 people (18–79 yr) from Tahiti (July 2011 - October, 2013).
2nd half epidemic: 196 people (7–86 yr; median 41 yr); general pop from the five most inhabited islands (February and March, 2014).
Post-epidemic: 476 children from Tahiti (6–16 yr; median 11 yr) (May and June, 2014).

How big can the differences really be if Tahiti (the main island) has > 60% of the population?
Well, the scale bar below is an indicator: French Polynesia is massive. Like 2,000 km-wide massive.

One recent review states (paraphrasing): dengue transmission in FP is in sharp contrast to the Caribbean islands which act as an epidemiological unit with the American continent, due to short inter-island distances plus large population movements [2]. That is, there's rapid and extensive vector-human-viral mixing in the Caribbean, specifically unlike this situation. This extensive heterogeneity in FP has also been clearly documented in entomology & arbovirology, and even in the weather on Tahiti itself [3–5], which directly impacts Aedes vectors.

Moreover, in all vector-borne disease, entomological and consequent serological heterogeneity are the rule, even on local scales. In the 2007 ZIKV outbreak on Yap, attack rates ranged from 0 to 22 per 1000 over ~ 15-km, and important differences were reported in attack rates between gender and age groups [6] (see note).

Suspect and confirmed cases per 1000 population; 2007 ZIKV outbreak in Yap, figure 3 in [6] (see note).

In the current study, there’s no indication of how these sero-samples were collected: multistage cluster-sample, hospital samples, or convenience sampling? This is absolutely critical to ensure they aren't heavily biased. Moreover, there are no details on which islands the microcephaly cases were captured from, so the ‘sampling frames’ are undefined, and potentially very discordant.

For diagnosis, ELISA IgG to ZIKV was used for these cohorts; however, a recent ZIKV case series report with authors from the French National Arboviral Laboratory specifically highlighted major challenges, including indeterminate IgG results, suggestive of cross-reactivity with other flaviviruses [7].

Thus, there are three very large sources of uncertainty in the underlying serological data: almost certain major geographic heterogeneity; incompatible demography; and inherent uncertainty in the serological diagnosis itself. Together, these strongly suggest that the captured serological data from proxy populations underlying the models in [1] are very unlikely to accurately represent the diversity of viral exposures in women of reproductive age in a diverse archipelago stretching across > 2,000 km of ocean.

Total cases from sentinel sites
The second lacuna is the total suspect cases (top panel of the figure in [1]).

While the authors state in their model assumptions that 'The number of Zika virus infections in a given week is proportional to the number of consultations for suspected infection in the same week,' I think it's important to examine how realistic this assumption is.

The total suspected cases are extrapolated from a network of clinic sites across FP (presumably fairly proportional to population density?). These sentinel clinics appear to report only ‘dengue-like’ illness [11] (please correct me if this is wrong). The proportion of these cases confirmed as ZIKV was 4%, but with DENV (4 serotypes), CHIKV, JEV [8], and Ross River virus (RRV) [9] all circulating in FP [10], there’s no reason to assume the remainder were ZIKV. This is especially true as many arboviruses have very diverse presentations, making clinical differential diagnosis difficult or impossible (eg [12]; and “the most common clinical manifestations of RRV infections are fever, arthralgia and rash.” [9]).

There is also limited detail on how this extrapolation was carried out, but the heterogeneity in epidemiology, entomology, and ‘island-ness’ combined with the 4% confirmation would make this a challenging endeavor with very large uncertainties.

In short, there is no evidence (in this article at least) that the 'spike' in cases has anything to do with ZIKV in isolation on any of the islands beyond communities in Tahiti where serology was done. The remainder could be ZIKV, or DENV, or RRV, or CHIKV, or... ?? Transmission of all these arboviruses was very likely driven by climatic conditions that favored vector reproduction and survival, allowing all the co-circulating arboviruses to 'have a go' in island-specific ways (due to vectors, travel, and population-level immunities). Moreover, the proportion of individual viruses almost certainly changed during the time period under study, adding to the complexities of estimating trimester-specific risk.

Other issues
A key issue in all disease reporting is the so-called “modifiable areal unit problem.” The spatial scale (eg, city block, citywide, or national-scale) at which data are aggregated and examined can cause major changes in observed trends [13]. Rare events (like these 8 cases) are also inherently problematic for this reason (and others), specifically in birth defect studies, and artifacts are not uncommon [14].

Directly following on this point is the use of 2 per 10,000 as the baseline. While this may be due to an abundance of caution, it is the lowest end of the rates from the US (2-12 per 10,000 births) [15]. The combination of these three issues together severely undermines the thresholds for 'unusual rates' used in these models.

While the authors have done an admirable job of assembling data to address a critically important and pressing public health crisis, their use of 'French Polynesia' as the unit of analysis is highly problematic due to well-documented geographic, entomological and epidemiological heterogeneity which severely undermines their serology, extrapolated case counts, and 'alert' thresholds.

While the exact impact of these data issues is difficult to predict in the published models, it certainly greatly increases the uncertainty around the risk estimates, and without island-specific serology tied to where microcephaly cases have occurred, it may simply not be possible to robustly estimate risk from this limited set of eight cases.


1. Cauchemez S, Besnard M, Bompard P, Dub T, Guillemette-Artur P, Eyrolle-Guignot D, et al. Association between Zika virus and microcephaly in French Polynesia, 2013–15: a retrospective study. The Lancet. 2016; doi:10.1016/S0140-6736(16)00651-6
2. Tortosa P, Pascalis H, Guernier V, Cardinale E, Le Corre M, Goodman SM, et al. Deciphering arboviral emergence within insular ecosystems. Infect Genet Evol. 2012;12: 1333–1339. doi:10.1016/j.meegid.2012.03.024
3. Vazeille-Falcoz M, Mousson L, Rodhain F, Chungue E, Failloux A-B. Variation in oral susceptibility to dengue type 2 virus of populations of Aedes aegypti from the islands of Tahiti and Moorea, French Polynesia. Am J Trop Med Hyg. 1999;60: 292–299.
4. Brelsfoard CL, Dobson SL. Population genetic structure of Aedes polynesiensis in the Society Islands of French Polynesia: implications for control using a Wolbachia-based autocidal strategy. Parasit Vectors. 2012;5: 1–12.
5. Hopuare M, Pontaud M, Céron J, Ortéga P, Laurent V. Climate change, Pacific climate drivers and observed precipitation variability in Tahiti, French Polynesia. Clim Res. 2015;63: 157–170. doi:10.3354/cr01288
6. Duffy MR, Chen T-H, Hancock WT, Powers AM, Kool JL, Lanciotti RS, et al. Zika Virus Outbreak on Yap Island, Federated States of Micronesia. N Engl J Med. 2009;360: 2536–2543. doi:10.1056/NEJMoa0805715
7. Maria AT, Maquart M, Makinson A, Flusin O, Segondy M, Leparc-Goffart I, et al. Zika virus infections in three travellers returning from South America and the Caribbean respectively, to Montpellier, France, December 2015 to January 2016. Eurosurveillance. 2016;21. doi:10.2807/1560-7917.ES.2016.21.6.30131
8. Aubry M, Finke J, Teissier A, Roche C, Broult J, Paulous S, et al. Seroprevalence of arboviruses among blood donors in French Polynesia, 2011–2013. Int J Infect Dis. 2015;41: 11–12. doi:10.1016/j.ijid.2015.10.005
9. Aubry M, Finke J, Teissier A, Roche C, Broult J, Paulous S, et al. Silent circulation of Ross River virus in French Polynesia. Int J Infect Dis. 2015;37: 19–24. doi:10.1016/j.ijid.2015.06.005
10. Roth A, Mercier A, Lepers C, Hoy D, Duituturaga S, Benyon E, et al. Concurrent outbreaks of dengue, chikungunya and Zika virus infections-an unprecedented epidemic wave of mosquito-borne viruses in the Pacific 2012-2014. Euro Surveill. 2014;19: 20929.
11. Surveillance et veille sanitaire en Polynésie française. In: Pacific Public Health Surveillance Network [Internet]. [cited 22 Mar 2016]. Available:
12. Duong V, Andries A-C, Ngan C, Sok T, Richner B, Asgari-Jirhandeh N, et al. Reemergence of chikungunya virus in Cambodia. Emerg Infect Dis. 2012;18: 2066–2069. doi:10.3201/eid1812.120471
13. Jones SG, Kulldorff M. Influence of spatial resolution on space-time disease cluster detection. PLoS ONE. 2012;7: e48036. doi:10.1371/journal.pone.0048036
14. De Wals P. Investigation of clusters of adverse reproductive outcomes, an overview. Eur J Epidemiol. 1999;15: 871–875.

Detail from [6]:
"The sex-specific attack rates were 17.9 per 1000 females and 11.4 per 1000 males. Cases occurred among all age groups, but the incidence of confirmed and probable Zika virus disease detected by health care surveillance was highest among persons 55 to 59 years of age."

NB: If the map from Yap infringes on any copyright from The Lancet, I will gladly remove it.

Saturday, March 5, 2016

NEJM pregnancy in Rio cohort: subtleties and design issues

The most recent 'big' paper examining the Zika-microcephaly connection is just out [1]. It is by far the most comprehensive study published so far, and shows a large range of malformations and other adverse outcomes amongst a cohort of pregnant women in Rio de Janeiro.

I want to preface this by saying the authors are clear that this is a preliminary report; and moreover they are doing very difficult research under what I can only imagine as 'war footing.' That said, there are some fairly major caveats to these findings, and it's not a smoking gun yet. First and foremost, while the title is 'Zika Virus Infection in Pregnant Women in Rio de Janeiro — Preliminary Report', a far more accurate title would be 'Risk of any adverse fetal event in pregnant women who present with rash and agree to follow up.' Why does any of this matter?

It's not clear what the measured outcome is here (perhaps it is in the overall cohort itself), but 'any measured malformation' appears to be the closest. This is a major issue- without a well-defined outcome to measure, any abnormality becomes a study event. Clearly some are major; however, one of the reported cases has 'mega cisterna magna,' which a quick lit search suggests is rare but incidental, with likely very limited impacts.

Differences in follow-up and lack of blinding (?)
This is the critical issue in any cohort- differential measurement of outcomes severely weakens the entire design. In this case, all 42 of the ZIKV (+) pregnancies have had extensive (presumably research hospital level) followup, while the ZIKV (-) women 'had undergone fetal ultrasonography as part of regular prenatal care, and the results were reported as normal.' 

And then: "Of the 70 remaining women with ZIKV infection, 42 (60%) had prenatal ultrasonographic examinations, with a total of 56 studies performed." Loss of 40% of the positives is exceedingly worrisome- especially if the women who agreed to follow-up had potentially felt changes in fetal movement etc. These two issues are major red flags (threats to validity), and mean this should be considered more of a case series report than a cohort (at this point).

A secondary issue is the potential lack of blinding. The authors don't say whether the ultrasonographers were blinded to the mother's ZIKV status at the readings, but given the design this would be difficult. Lack of blinding has the potential to lead to serious biases in any study, but as specifically stated in the bible of epi: 'The skill of the examiner and the thoroughness of the examination can have a large effect on the apparent birth prevalence of a particular defect' [2].

It's difficult to do a power analysis since the outcome is unclear, but comparing the proportions of pregnancies with any adverse outcome, it's adequately powered at 0.87 (which is presumably why they published these interim results). Interestingly, the difference in stillbirths (4 in 42 pregnancies vs. 0 in 16) is not significant (p=0.20).
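The stillbirth comparison can be checked with a pooled two-sample test of proportions. A minimal sketch, assuming the large-sample z-test (the type of test Stata's -prtest- performs); with counts this small an exact test would be a reasonable alternative, and the function name is mine.

```python
from math import erf, sqrt

def two_prop_ztest(x1, n1, x2, n2):
    """Pooled two-sample z-test of proportions; returns (z, two-sided p)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# 4 stillbirths in 42 ZIKV(+) pregnancies vs 0 in 16 ZIKV(-) pregnancies
z, p = two_prop_ztest(4, 42, 0, 16)
print(f"z = {z:.2f}, p = {p:.2f}")
```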

Other issues
'Among ZIKV-positive women, more than half reported similar illnesses in other family members, and 21% reported that their partner had been ill.' However, it's clearly apparent that the presentation between ZIKV (+) and (-) patients in Table 1 is very similar. Moreover, dengue is currently co-circulating in Brazil and differential diagnosis by clinical symptoms is very error-prone [2a]. The difference between the dengue serology and the values in Table 1 for 'history of dengue' clearly reinforces this issue.

'Information in prenatal records regarding rubella, cytomegalovirus, and Venereal Disease Research Laboratory serologic testing was reviewed.' This statement is very, very different from any quantified serological measures; and record linkages are notoriously problematic in all health systems.

How representative is this cohort?
As only women presenting with rash were enrolled, how common is rash in ZIKV infections?
There are very limited data, but one study from Micronesia found rash in 90% of 31 confirmed patients [3]. However, the Rio cohort (is or began as?) a dengue-focused one, and rash is less common in dengue amongst adult cases: in adults, one study in Nicaragua found 40% [4], while one in Brazil (adults and children) found 61% had rash at presentation [5]. This suggests that a very large number of pregnant women would not be eligible for the study.

Final thoughts
I think it's clear from the very large number of adverse events that something serious is occurring in this cohort, but I don't think these results make a water-tight case for ZIKV in isolation.

Given the high dengue seropositivity, I'd be interested to see the serotype-specific dengue serology data.  Finally, how comparable are the 'rashy' periods relative to active viremia in ZIKV and dengue infections?

[2] Rothman KJ, Greenland S, Lash TL. Modern epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008. p. 637.
[2a] eg,

Sunday, February 28, 2016

French Polynesia, 2014: how strong is the evidence for Zika-associated microcephaly?

The issue
The 5 February 2016 WHO Situation Report for Zika virus infections reports:

"In light of the increased incidence of microcephaly reported in Brazil, a review of birth data by authorities in French Polynesia indicated an increase in the number of central nervous system malformations in children born between March 2014 and May 2015. 18 cases were reported including 9 microcephaly cases compared to the national average of 0 to 2 cases of microcephaly per year." [1].

Does this represent an increase in proportion of live births with microcephaly at standard levels of statistical significance?

Using the birth rate and population in FP, let's compare 9 cases of microcephaly in 14 months to 0-2 in an 'average' year (with about 4,332 births per annum) [2].

In this plot, the range of microcephaly cases reported annually in 'average' prior years is on the x-axis, and the p-value for a difference in reported proportions with microcephaly is on the y. If we assume zero cases, p= 0.006; if one case, then p for a difference is 0.022; for two cases it's 0.063; and then if we include three cases for completeness, p=0.14. 
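These p-values can be approximated outside Stata with a pooled two-sample test of proportions, the large-sample test that -prtest- performs. A minimal sketch, assuming 5,054 births in the 14-month window and 4,332 births in an 'average' year (both figures from this post); the function and constant names are mine. It returns values close to those quoted, and small differences may reflect rounding or slightly different inputs.

```python
from math import erf, sqrt

BIRTHS_14_MONTHS = 5054   # births in the 14-month post-Zika window
BIRTHS_PER_YEAR = 4332    # births in an 'average' prior year
CASES_OBSERVED = 9        # microcephaly cases reported post-Zika

def two_prop_pvalue(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-sample z-test of proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# p-value for each assumed number of cases in an 'average' prior year
p_values = {
    prior: two_prop_pvalue(CASES_OBSERVED, BIRTHS_14_MONTHS,
                           prior, BIRTHS_PER_YEAR)
    for prior in range(4)
}
for prior, p in p_values.items():
    print(f"{prior} prior cases: p = {p:.3f}")
```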

US rates are 2 to 12 per 10,000 live births [3]; if comparable to French Polynesia, we'd expect to see ~ 1-6 microcephaly cases in these 5,054 births (14 months). Three cases has a 95% CI of 1-9 cases, and the top end of this range of 6 cases has a 95% CI of 2-13 cases.
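The expected counts and intervals here can be sketched as follows, assuming exact Poisson confidence limits found by bisection on the Poisson tail probabilities. That is my guess at how such intervals would be obtained, not a record of the original calculation, and all names are hypothetical.

```python
from math import exp

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    term = total = exp(-mu)
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def bisect(f, lo, hi, tol=1e-9):
    """Root of a monotone function f on [lo, hi] by bisection."""
    sign_lo = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(k, alpha=0.05):
    """Exact two-sided CI for a Poisson mean given an observed count k."""
    lower = 0.0 if k == 0 else bisect(
        lambda mu: (1 - poisson_cdf(k - 1, mu)) - alpha / 2, 1e-9, k)
    upper = bisect(lambda mu: poisson_cdf(k, mu) - alpha / 2, k, 10 * k + 20)
    return lower, upper

births = 5054
expected = {rate: births * rate / 10_000 for rate in (2, 12)}  # per 10,000
intervals = {k: exact_poisson_ci(k) for k in (3, 6)}
print(expected)
print(intervals)
```

Rounded to whole cases, the limits from this sketch are consistent with the ranges given above.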

Looking at this comparison to US rates another way:

In this plot, up to a rate of 8 cases per 10,000 live births (p=0.014), there is evidence for elevated rates in French Polynesia; at 9 and 10 it's marginal, and above that there's no evidence for differences in rates between the US and FP post-Zika.
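A minimal sketch of this comparison, assuming a one-sample z-test of the observed proportion (9 cases in 5,054 births) against each fixed reference rate, which is what a one-sample -prtest- would do. The function name is mine, and the loop simply walks the US range of 2 to 12 per 10,000.

```python
from math import erf, sqrt

def one_prop_pvalue(x, n, p0):
    """Two-sided p-value for a one-sample z-test of x/n against p0."""
    se = sqrt(p0 * (1 - p0) / n)
    z = (x / n - p0) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

cases, births = 9, 5054
p_by_rate = {rate: one_prop_pvalue(cases, births, rate / 10_000)
             for rate in range(2, 13)}
for rate, p in p_by_rate.items():
    print(f"{rate:>2} per 10,000: p = {p:.3f}")
```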

There is weak to no evidence to support the statement that there is an increase in microcephaly in French Polynesia post-Zika emergence. With zero cases, there is a statistically significant difference; with a single case it's just scraping significance. Zero cases is improbable as head size is a continuum; vastly different rates have been captured in Brazil depending on the criteria used [4]; and there is no indication of criteria for these microcephaly cases in FP. Moreover, standardized head measurements have been shown to be inaccurate and may lead to misdiagnoses [5]. Finally, the number of cases reported is within the confidence intervals for rates in the US.

With a total of 8,750 suspect Zika infections [1] amongst a total population of ~278,000 in French Polynesia, I'd expect to see a *much* more robust signal than this if Zika (or Zika in isolation) was causal. However, only 383 (4.4%) of these cases were confirmed, so it's difficult to conclude too much.

Limitations and caveats
This is an inherently crude and 'dirty' analysis. These are very small numbers, of a poorly defined clinical finding subject to many types of reporting biases. These calculations do not use standardized rates (ie, there's no adjustment for age differences in the maternal populations). Finally, reporting of '0-2' in an average year is a very ambiguous statement- if data for actual annual cases were available, this (very simplistic) analysis could be greatly improved.


Analysis in Stata 14 (College Station, TX), using -prtest- to compare the proportions of infants born with microcephaly between birth cohorts.