 Research article
 Open Access
A comparison of the test-negative and the traditional case-control study designs for estimation of influenza vaccine effectiveness under nonrandom vaccination
BMC Infectious Diseases volume 17, Article number: 757 (2017)
Abstract
Background
As annual influenza vaccination is recommended for all U.S. persons aged 6 months or older, it is unethical to conduct randomized clinical trials to estimate influenza vaccine effectiveness (VE). Observational studies are being increasingly used to estimate VE. We developed a probability model for comparing the bias and the precision of VE estimates from two case-control designs: the traditional case-control (TCC) design and the test-negative (TN) design. In both study designs, acute respiratory illness (ARI) patients who seek medical care and test positive for influenza infection are considered cases. In the TN design, ARI patients seeking medical care who test negative serve as controls, while in the TCC design, controls are randomly selected individuals from the community who did not contract an ARI.
Methods
Our model assigns each study participant a covariate corresponding to the person’s health status. The probabilities of vaccination and of contracting influenza and noninfluenza ARI depend on health status. Hence, our model allows nonrandom vaccination and confounding. In addition, the probability of seeking care for ARI may depend on vaccination and health status. We consider two outcomes of interest: symptomatic influenza (SI) and medically-attended influenza (MAI).
Results
If vaccination does not affect the probability of noninfluenza ARI, then VE estimates from TN studies usually have smaller bias than estimates from TCC studies. We also found that if vaccinated influenza ARI patients are less likely to seek medical care than unvaccinated patients because the vaccine reduces symptoms’ severity, then estimates of VE from both types of studies may be severely biased when the outcome of interest is SI. The bias is not present when the outcome of interest is MAI.
Conclusions
The TN design produces valid estimates of VE if (a) vaccination does not affect the probabilities of noninfluenza ARI and of seeking care against influenza ARI, and (b) the confounding effects resulting from nonrandom vaccination are similar for influenza and noninfluenza ARI. Since the bias of VE estimates depends on the outcome against which the vaccine is supposed to protect, it is important to specify the outcome of interest when evaluating the bias.
Background
Influenza vaccine effectiveness (VE) has to be re-estimated in every season because predominant influenza virus types, subtypes and phenotypes change from one season to the next, necessitating a new vaccine targeting different strains in most seasons. As annual influenza vaccination is now widely recommended, randomized clinical trials for estimating VE are no longer ethical in many populations, and observational studies based on patients seeking medical care for acute respiratory illnesses (ARI) are the most efficient, and hence most widely used, option. However, observational studies for estimating VE are prone to multiple sources of bias.
In this paper we present a new probability model for comparing the bias and precision of VE estimates from two popular case-control study designs under nonrandom vaccination, i.e., vaccination probabilities may depend on a covariate. In both study designs, ARI patients seeking medical care who test positive for influenza infection are considered cases. In the test-negative (TN) design, ARI patients seeking medical care who test negative for influenza infection serve as controls, while in the traditional case-control (TCC) design, controls are randomly selected individuals who did not contract an ARI, usually from the same community from which the cases came. The TN design was introduced in 2007 [1], and most of the influenza VE case-control studies conducted since then have used this study design. However, TCC studies are still being used occasionally [2–4]. TCC studies are usually costlier and more resource intensive due to the need to recruit controls through a separate mechanism.
Estimates of VE from case-control studies may be subject to the following sources of bias:
(a) Probabilities of noninfluenza ARI may depend on vaccination status: In TN studies, individuals with noninfluenza ARI serve as controls. Therefore, TN studies may produce biased estimates of VE unless vaccinees and nonvaccinees are equally likely to develop noninfluenza ARI. The validity of this assumption has not yet been confirmed. De Serres et al. [5] used data from randomized clinical trials to argue that this assumption is usually satisfied. However, a randomized influenza vaccine trial [6] found that vaccinees had a significantly increased risk of virologically-confirmed noninfluenza infection (that may lead to ARI) as compared to those who received the placebo.
(b) Probabilities of influenza and noninfluenza ARIs may depend on confounders: Covariates such as health status, age, exposure, education and socioeconomic status may be associated with both the likelihood of being vaccinated and the likelihood of developing influenza and noninfluenza ARIs.
(c) Vaccination may affect the probability of seeking medical care in influenza patients: Several studies [7–9] suggest that vaccinated individuals who contract influenza may have milder symptoms than unvaccinated influenza patients, and therefore may be less likely to seek medical care. This effect of vaccination is not expected to change the healthcare-seeking behavior of noninfluenza ARI patients.
(d) Probabilities of seeking medical care against ARIs may depend on confounders: Since only ARI patients who seek medical care may be included in a TN study, and only influenza patients who seek care may be included as cases in TCC studies, covariates that are associated with both the likelihood of being vaccinated and the likelihood of seeking care against ARI may contribute to the bias of influenza VE estimates.
(e) Misclassification bias: Diagnostic tests for influenza viruses are not 100% sensitive and specific. Vaccination status may also be misclassified.
In this work we consider the first four sources of bias. To focus on these sources, we ignore misclassification biases which are known to result in negative bias (i.e., bias toward lower estimation of VE) and are common to all studies that rely on results of diagnostic tests.
The goal of this article is to evaluate and compare the bias and precision of estimates of VE resulting from TN and TCC studies. As we will see, the bias of VE estimates may depend on the outcome of interest, i.e., the outcome against which the vaccine is expected to protect. We consider two outcomes of interest, symptomatic influenza (SI) and medically-attended influenza (MAI). In both the TN and TCC study designs, only influenza patients seeking medical care are considered cases. Therefore, one expects these study designs to produce estimates of VE against MAI. However, the media usually reports VE estimates from these case-control studies as ‘vaccine effectiveness against influenza’, without including the ‘medically-attended’ clause. As a result, the public may interpret these estimates as the effectiveness of the vaccine against any influenza illness, i.e., VE against SI. One of the objectives of this work is to highlight the importance of (a) clearly specifying the outcome against which the vaccine is supposed to protect, and (b) understanding that the bias of a VE estimate may be different for the two outcomes of interest.
We will (a) evaluate the bias of each of the VE estimates for each of the outcomes by comparing the expected value of the estimate with the true VE, and (b) evaluate the standard errors of the VE estimates. To perform these evaluations and comparisons, we developed a detailed stepwise probability model of the process involved in collecting data in these studies and deriving VE estimates. The model includes a covariate, health status, that may be associated with both the likelihood of being vaccinated and the propensity of seeking medical care against ARI. This allows us to assess the effects of nonrandom vaccination on the bias of VE estimates.
Methods
We first describe the real-life process involved in conducting the two types of case-control studies and obtaining the estimates of VE. We then describe the model we developed to mimic this process.
The study population
The source population for both types of case-control studies consists of all individuals receiving most of their medical care at a single clinic or at a specific network of clinics. Since influenza VE varies by age, we can assume that the model pertains to a subpopulation corresponding to a single age group.
The study designs
When a member of the study population develops an ARI, s/he may decide to report to a clinic for treatment. At the clinic, the health care provider may ask the person to be tested for influenza viruses. If the person agrees then a swab is taken and sent to a laboratory for testing. In both study designs, a person who tests positive is eligible to be considered a case. In a TN study, an individual who tests negative is eligible to be considered a control. In a TCC study, controls are randomly selected members of the study population who have not developed ARI prior to their inclusion in the study. Usually, one or more controls are selected right after a case is identified. In both study designs, the vaccination status of every case or control is determined from manual or electronic records, or from oral histories.
Outcome of interest and true VE
In this work we evaluate estimates of VE when the outcome of interest is either SI or MAI. SI is sometimes called ‘influenza illness’ or ‘influenza ARI’. Surveillance for SI is needed in the entire study population, and for persons with compatible illnesses, samples are taken to verify influenza infection. A person is considered a true case of SI if s/he has an ARI and is infected by an influenza virus. For MAI, a true case is defined as a person who is influenza-infected, develops an ARI, and seeks medical care. For both outcomes, the true VE is defined as one minus the ratio of the probability of the outcome of interest in vaccinees to that in nonvaccinees.
Estimation of VE and bias of VE estimates
In this work we focus on identifying the main sources of bias and their effects on the performance of the VE estimates. Some of these biases can be adjusted for in the analysis, but this is beyond the scope of the current work. In case-control studies, VE is usually estimated as one minus the odds ratio (OR) of being vaccinated in cases vs. controls. The bias of the estimate is defined as the difference between the expectation of the estimated VE and the true VE.
The model
The model we developed for comparing the estimates from the two study designs follows the scheme described above with a few simplifications. We assume that (a) when a person seeks medical care for ARI then her/his probability of being tested for influenza viruses does not depend on vaccination status, health status, or on the actual cause of ARI (influenza/noninfluenza); (b) given a person’s symptoms and influenza infection status, the sensitivity and specificity of the test do not depend on the tested person’s vaccination or health status; (c) a person’s vaccination status is determined without error; and (d) controls in a TCC study are selected at random from all asymptomatic individuals in the study population (See “The study population” section).
Our model includes a covariate, health status, and we assume that a person’s probabilities of being vaccinated, developing an ARI, and seeking medical care against ARI may be associated with her/his health status. In this way, the model generates possible confounding effects linking vaccination status with the probabilities of being included in the study and of becoming a case or a control.
The model consists of five steps, where the value of a single variable is determined at each step. The probability distribution of this variable may depend on the values of the variables from the previous steps. Below we define the five steps, the associated variables, and the probabilities determining each variable’s distribution.
Step 1: Health status
A person can be classified as “healthy” or “frail”. Define a binary variable X, where X=1 for a “healthy” person and X=0 for a “frail” person. Denote π=P(X=1).
Step 2: Vaccination
A person may be vaccinated against influenza. Define a binary variable V, where V=1 for a vaccinated person. The probability of being vaccinated may depend on health status; therefore, denote \(\alpha_{x} = P(V=1 \mid X=x)\), x=0,1.
Step 3: Influenza infection and ARI
During the influenza season, a person may become infected with an influenza virus and develop an ARI. This outcome is referred to as “influenza ARI” (FARI), where “F” stands for flu. A person may also develop an ARI not resulting from influenza infection. This outcome is referred to as “noninfluenza ARI” (NFARI). We therefore define an outcome variable Y with 3 categories as follows: Y=0 for no ARI, Y=1 for NFARI, and Y=2 for FARI. The distribution of Y depends on the person’s vaccination status, V, and health status, X. We denote \(\beta_{vx} = P(Y=1 \mid V=v, X=x)\) and \(\gamma_{vx} = P(Y=2 \mid V=v, X=x)\) for v=0,1 and x=0,1, with \(\beta_{vx} + \gamma_{vx} \le 1\) for all v, x. Here we assume the “leaky vaccine” model, in which the vaccine provides a reduction in the probability of influenza transmission to the vaccinated person, rather than complete immunity [10]. Under this model, a vaccinee has a lower probability of becoming infected than a nonvaccinee, but is not rendered completely immune from influenza infection.
Step 4: Seeking medical care for ARI
A person with ARI may seek medical care and, in this case, be tested for influenza viruses. We define a binary variable M with M=1 for a person seeking medical care for her/his ARI. The probability of this event depends on Y (only individuals with ARI seek medical care), and it may be different for FARI and NFARI patients. In addition, the conditional distribution of M given Y may depend on X and V. We therefore define \(\delta_{yvx} = P(M=1 \mid Y=y, V=v, X=x)\), where y=1,2, v=0,1 and x=0,1.
In order to reduce the number of parameters, we make two simplifying assumptions regarding the probabilities of seeking medical care: (1) the effect of health status on probability of seeking medical care does not depend on vaccination status or type of ARI; (2) the effect of vaccination status on probability of seeking medical care does not depend on health status (but it may depend on type of ARI).
Define a “standard person” as a person with X=0 and V=0. For a “standard person”, we define \(\delta_{SN}\) and \(\delta_{SF}\) as follows:

\(\delta_{SN} = P(M=1 \mid Y=1, V=0, X=0) = \delta_{100}\)

\(\delta_{SF} = P(M=1 \mid Y=2, V=0, X=0) = \delta_{200}\)
In addition, we define two multipliers:

\(\lambda\) = multiplier for x=1; \(\lambda\) does not depend on V and Y.

\(\Psi_{F}\) = multiplier for v=1 only when y=2; \(\Psi_{F}\) does not depend on X.
\(\lambda\) is the ratio of the probabilities of seeking medical care comparing a healthy and a frail person. \(\Psi_{F}\) is the ratio of the probabilities of seeking care comparing a vaccinated and an unvaccinated influenza ARI patient.
Then, \(\{\delta_{yvx}\}\) can be written in terms of \(\delta_{SN}\), \(\delta_{SF}\) and the multipliers \(\lambda\), \(\Psi_{F}\) as follows:

\(\delta_{100}=\delta_{SN}\), \(\delta_{101}=\delta_{SN}\lambda\), \(\delta_{110}=\delta_{SN}\), \(\delta_{111}=\delta_{SN}\lambda\).

\(\delta_{200}=\delta_{SF}\), \(\delta_{201}=\delta_{SF}\lambda\), \(\delta_{210}=\delta_{SF}\Psi_{F}\), \(\delta_{211}=\delta_{SF}\lambda\Psi_{F}\).
Note: The multiplier Ψ _{ F } reflects the effect of severity of ARI in an influenza infected person. We assume that vaccination may reduce severity of symptoms, hence a vaccinated influenza patient may be less likely to seek care than an unvaccinated patient.
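The multiplier structure above can be sketched in a few lines of Python. This is only an illustration: the function name `care_seeking_probs` is hypothetical, and the values of λ and Ψ_F in the usage line are arbitrary choices, not values taken from the model's results.

```python
from itertools import product


def care_seeking_probs(delta_sn, delta_sf, lam, psi_f):
    """Return {(y, v, x): P(M=1 | Y=y, V=v, X=x)} for y in {1, 2}.

    delta_sn, delta_sf: care-seeking probabilities of a "standard person"
    (X=0, V=0) with NFARI (y=1) and FARI (y=2), respectively.
    lam: multiplier applied to healthy persons (x=1).
    psi_f: multiplier applied to vaccinated persons (v=1) with FARI only.
    """
    delta = {}
    for y, v, x in product((1, 2), (0, 1), (0, 1)):
        p = delta_sn if y == 1 else delta_sf  # baseline for the type of ARI
        if x == 1:
            p *= lam  # health-status effect, same for both types of ARI
        if y == 2 and v == 1:
            p *= psi_f  # vaccination effect, FARI patients only
        delta[(y, v, x)] = p
    return delta


# Illustrative (assumed) multipliers: lam=1.5, psi_f=0.5.
delta = care_seeking_probs(delta_sn=0.2, delta_sf=0.3, lam=1.5, psi_f=0.5)
```

Storing the eight probabilities in a dictionary keyed by (y, v, x) mirrors the subscript order of δ in the text, which makes it easy to check each entry against the list above.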
Step 5: Testing for influenza infection.
Although only individuals who seek medical care for an ARI are tested for influenza infection, it will be convenient to define a binary variable T as the (possibly unobserved) test result for any person with an ARI, regardless of whether or not s/he is actually tested. Define T=1 (T=0) if a person would test positive (negative) for influenza if tested. Because of assumption (b) above, the probability of testing positive given the person’s influenza infection status does not depend on X, V, or M. Denote \(\tau_{y} = P(T=1 \mid Y=y)\) for y=1,2. Note that \(\tau_{1}\) is one minus the test’s specificity and \(\tau_{2}\) is the test’s sensitivity. In this study, we assume the test has 100% sensitivity and 100% specificity, i.e. \(P(T=1 \mid Y=1) = \tau_{1} = 0\) and \(P(T=1 \mid Y=2) = \tau_{2} = 1\).
Figure 1 shows the directed acyclic graph (DAG) of the model. Recent papers by Sullivan et al. [11] and Lipsitch et al. [12] discuss the use of DAGs to explore sources of bias of VE estimates from TN studies. A summary of the variables and parameters in our model is given in Table 1.
True VE in our model
When we evaluate the true VE, we assume that vaccination is done at random, i.e. for true VE we assume that vaccination status does not depend on health status X (α _{0}=α _{1}=α).
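The true VE against SI is:
\[ VET_{SI} = 1 - \frac{P(Y=2 \mid V=1)}{P(Y=2 \mid V=0)} \]
The true VE against MAI is:
\[ VET_{MAI} = 1 - \frac{P(Y=2, M=1 \mid V=1)}{P(Y=2, M=1 \mid V=0)} \]
Using the parameters defined above, \(VET_{SI}\) and \(VET_{MAI}\) can be written as:
\[ VET_{SI} = 1 - \frac{\pi\gamma_{11} + (1-\pi)\gamma_{10}}{\pi\gamma_{01} + (1-\pi)\gamma_{00}} \]
\[ VET_{MAI} = 1 - \frac{\pi\gamma_{11}\delta_{211} + (1-\pi)\gamma_{10}\delta_{210}}{\pi\gamma_{01}\delta_{201} + (1-\pi)\gamma_{00}\delta_{200}} \]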
The proofs of these results can be found in Appendix 1.
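As a rough numerical check of these definitions, the sketch below computes the true VE against SI and MAI under random vaccination by averaging over health status. The function name `true_ve` and the dictionary layout are assumptions made for illustration; the parameter values are the baseline values quoted in the Results, with Ψ_F=0.5 and no health-status effects (λ=1, equal FARI risk for healthy and frail persons).

```python
def true_ve(pi, gamma, delta):
    """True VE against SI and MAI under random vaccination.

    pi: P(X=1), the probability of being healthy.
    gamma[(v, x)]: P(Y=2 | V=v, X=x), probability of FARI.
    delta[(v, x)]: P(M=1 | Y=2, V=v, X=x), care seeking given FARI.
    """
    weight = {1: pi, 0: 1.0 - pi}  # P(X=x); V is independent of X here

    def p_si(v):
        return sum(weight[x] * gamma[(v, x)] for x in (0, 1))

    def p_mai(v):
        return sum(weight[x] * gamma[(v, x)] * delta[(v, x)] for x in (0, 1))

    return 1 - p_si(1) / p_si(0), 1 - p_mai(1) / p_mai(0)


rho_gamma, psi_f, delta_sf = 0.4, 0.5, 0.3
gamma = {(v, x): 0.1 * (rho_gamma if v == 1 else 1.0)
         for v in (0, 1) for x in (0, 1)}
delta = {(v, x): delta_sf * (psi_f if v == 1 else 1.0)
         for v in (0, 1) for x in (0, 1)}
ve_si, ve_mai = true_ve(pi=0.7, gamma=gamma, delta=delta)
```

With these inputs the true VE against SI is 0.6 and the true VE against MAI is 0.8, which illustrates that the two outcomes can have different true VEs whenever vaccination changes the probability of seeking care.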
Estimates of VE in our model
In both the TN and TCC study designs, VE is estimated as one minus the odds ratio (OR) in the C×V table cross-classifying the individuals included in the study, where C is a binary indicator of case/control status with C=1 for a case. For convenience, the TN and TCC studies will be represented by the letters A and B, respectively. In a TN study, the case/control variable is denoted \(C_{A}\), where \((C_{A}=1) = (M=1, T=1)\) and \((C_{A}=0) = (M=1, T=0)\). Then the estimate of VE is \(VE_{A} = 1 - OR_{A}\), where
\[ OR_{A} = \frac{P(T=1, V=1 \mid M=1)\,P(T=0, V=0 \mid M=1)}{P(T=1, V=0 \mid M=1)\,P(T=0, V=1 \mid M=1)} \]
Note that all the probabilities condition on M=1 as only individuals who seek medical care for ARI are included in the TN study.
In a TCC study, the case/control variable is denoted \(C_{B}\). Cases are defined in the same way as in the TN study, i.e., \((C_{B}=1) = (M=1, T=1) = (C_{A}=1)\). Controls are individuals included in a random sample drawn from all the asymptomatic individuals in the study population. In other words, \((C_{B}=0)\) is a random subset of \((Y=0)\). In addition, we define a binary variable B indicating whether or not a person is included in the TCC study, i.e., \((B=1) = (C_{B}=1 \text{ or } C_{B}=0)\). The VE estimate is based on the OR in the \(C_{B}\)×V table when all the probabilities condition on B=1: \(VE_{B} = 1 - OR_{B}\), where
\[ OR_{B} = \frac{P(C_{B}=1, V=1 \mid B=1)\,P(C_{B}=0, V=0 \mid B=1)}{P(C_{B}=1, V=0 \mid B=1)\,P(C_{B}=0, V=1 \mid B=1)} \]
Note that in a real-life study, the odds ratios are estimated from the relative frequencies of the corresponding events, rather than from their (unknown) probabilities. Therefore, the model-based estimates of VE defined above are actually the expected values of the observed estimates. For convenience we will continue to refer to them as “the VE estimates”.
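Using the parameters defined above, and with a perfect test (\(\tau_{1}=0\), \(\tau_{2}=1\)), \(VE_{A}\) and \(VE_{B}\) can be written as follows:
\[ VE_{A} = 1 - \frac{\left[\sum_{x} P(V=1, X=x)\,\gamma_{1x}\delta_{21x}\right]\left[\sum_{x} P(V=0, X=x)\,\beta_{0x}\delta_{10x}\right]}{\left[\sum_{x} P(V=0, X=x)\,\gamma_{0x}\delta_{20x}\right]\left[\sum_{x} P(V=1, X=x)\,\beta_{1x}\delta_{11x}\right]} \]
\[ VE_{B} = 1 - \frac{\left[\sum_{x} P(V=1, X=x)\,\gamma_{1x}\delta_{21x}\right]\left[\sum_{x} P(V=0, X=x)\,(1-\beta_{0x}-\gamma_{0x})\right]}{\left[\sum_{x} P(V=0, X=x)\,\gamma_{0x}\delta_{20x}\right]\left[\sum_{x} P(V=1, X=x)\,(1-\beta_{1x}-\gamma_{1x})\right]} \]
where \(P(V=1, X=1)=\pi\alpha_{1}\), \(P(V=0, X=1)=\pi(1-\alpha_{1})\), \(P(V=1, X=0)=(1-\pi)\alpha_{0}\), and \(P(V=0, X=0)=(1-\pi)(1-\alpha_{0})\).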
The proofs can be found in Appendix 2.
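For readers who want to experiment with the model, the following is a minimal sketch of how the expected VE estimates from the two designs could be computed from the five-step model, assuming a perfect test (τ_1=0, τ_2=1). The function name `ve_estimates` and the dictionary-based parameter layout are hypothetical. The sampling fraction of TCC controls is omitted because it cancels in the odds ratio. The usage example sets ρ_β=0.5 with all other sources of bias absent; the vaccination probability of 0.5 is an arbitrary assumption.

```python
from itertools import product


def ve_estimates(pi, alpha, beta, gamma, delta):
    """Expected VE estimates (TN, TCC), assuming a perfect influenza test.

    pi: P(X=1); alpha[x]: P(V=1 | X=x)
    beta[(v, x)]: P(Y=1 | V=v, X=x)   (NFARI)
    gamma[(v, x)]: P(Y=2 | V=v, X=x)  (FARI)
    delta[(y, v, x)]: P(M=1 | Y=y, V=v, X=x)
    """
    cases = {0: 0.0, 1: 0.0}         # FARI patients who seek care
    tn_controls = {0: 0.0, 1: 0.0}   # NFARI patients who seek care
    tcc_controls = {0: 0.0, 1: 0.0}  # persons without ARI
    for v, x in product((0, 1), (0, 1)):
        p_x = pi if x == 1 else 1 - pi
        p_v = alpha[x] if v == 1 else 1 - alpha[x]
        w = p_x * p_v  # P(V=v, X=x)
        cases[v] += w * gamma[(v, x)] * delta[(2, v, x)]
        tn_controls[v] += w * beta[(v, x)] * delta[(1, v, x)]
        tcc_controls[v] += w * (1 - beta[(v, x)] - gamma[(v, x)])
    or_tn = (cases[1] * tn_controls[0]) / (cases[0] * tn_controls[1])
    or_tcc = (cases[1] * tcc_controls[0]) / (cases[0] * tcc_controls[1])
    return 1 - or_tn, 1 - or_tcc


# Bias A only: vaccination halves the probability of NFARI (rho_beta = 0.5).
rho_beta, rho_gamma = 0.5, 0.4
alpha = {0: 0.5, 1: 0.5}  # assumed random vaccination
beta = {(v, x): 0.2 * (rho_beta if v else 1.0) for v in (0, 1) for x in (0, 1)}
gamma = {(v, x): 0.1 * (rho_gamma if v else 1.0) for v in (0, 1) for x in (0, 1)}
delta = {(y, v, x): (0.2 if y == 1 else 0.3)
         for y in (1, 2) for v in (0, 1) for x in (0, 1)}
ve_tn, ve_tcc = ve_estimates(0.7, alpha, beta, gamma, delta)
```

With these inputs the TN estimate is 0.2 and the TCC estimate is about 0.67, in line with the bias A example reported in the Results.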
Bias and standard errors of estimates
The bias of an estimate of VE is the difference between the expected value of the estimate and the true VE. As the true VE depends on the outcome of interest (SI or MAI), the bias of each estimated VE will be evaluated separately for each of the two outcomes.
In Appendix 3 we use approximations based on the “Delta method” for the standard errors (SEs) of odds ratios [13] to derive expressions for the SEs of both VE estimates in terms of the parameters and the corresponding sample size(s). For evaluating the SEs we consider the observed odds ratios, where the probabilities are replaced by the corresponding observed relative frequencies.
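To illustrate the general idea only, the sketch below applies the textbook Delta-method approximation for an odds ratio computed from a 2×2 table of counts: the variance of log(OR) is approximately the sum of the reciprocals of the four cell counts, and SE(OR) ≈ OR × SE(log OR). Since VE = 1 − OR, the SE of the VE estimate equals the SE of the OR. This is not the parameter-based derivation of Appendix 3; the function name `ve_and_se` and the counts are hypothetical.

```python
import math


def ve_and_se(vacc_cases, unvacc_cases, vacc_controls, unvacc_controls):
    """VE estimate (1 - OR) and its approximate SE from 2x2 cell counts."""
    odds_ratio = (vacc_cases * unvacc_controls) / (unvacc_cases * vacc_controls)
    # Var(log OR) is approximated by the sum of reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / vacc_cases + 1 / unvacc_cases + 1 / vacc_controls + 1 / unvacc_controls
    )
    # Delta method: SE(OR) is approximately OR * SE(log OR); SE(VE) = SE(OR).
    return 1 - odds_ratio, odds_ratio * se_log_or


# Hypothetical counts chosen so that OR = 0.4, i.e. VE = 0.6.
ve_hat, se_hat = ve_and_se(40, 100, 100, 100)
```

The approximation shows why precision depends on the smallest cell: with few vaccinated cases, the term 1/vacc_cases dominates the standard error.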
The values of bias reported in the text and tables are absolute (not relative) differences. For example, if the true VE is 60% (i.e., 0.6) and the range of bias is (−0.40, 0.20), then the estimated VE varies from 0.20 (underestimating the true VE = 0.6 by 0.40) to 0.80 (overestimating the true VE by 0.20).
Probability ratios
Next, we define a few probability ratios comparing vaccinees and nonvaccinees or healthy and frail individuals. These ratios will be helpful in the presentation of the results (see Table 1 for a full list of the notations used in this paper).

\(\rho _{\beta } = {\frac {\beta _{1x}}{\beta _{0x}}}\), the ratio of the probabilities of NFARI comparing a vaccinated and an unvaccinated person of the same health status.

\(\eta _{\beta } = {\frac {\beta _{v1}}{\beta _{v0}}}\), the ratio of the probabilities of NFARI comparing a healthy and a frail person of the same vaccination status.

\(\rho _{\gamma } = {\frac {\gamma _{1x}}{\gamma _{0x}}}\), the ratio of the probabilities of FARI comparing a vaccinated and an unvaccinated person of the same health status.

\(\eta _{\gamma } = {\frac {\gamma _{v1}}{\gamma _{v0}}}\), the ratio of the probabilities of FARI comparing a healthy and a frail person of the same vaccination status.
The parameters λ and Ψ _{ F } defined earlier are also probability ratios:

\(\lambda = {\frac {\delta _{yv1}}{\delta _{yv0}}}\), the ratio of the probabilities of seeking medical care comparing a healthy and a frail person of the same vaccination status. We assume that this ratio is the same for FARI and NFARI patients.

\(\Psi _{F} = {\frac {\delta _{21x}}{\delta _{20x}}}\), the ratio of the probabilities of seeking medical care comparing a vaccinated and an unvaccinated FARI patient of the same health status.
Table 2 presents the main sources of bias that can be identified from our model. The absence of bias A is essential for the validity of the TN design, since the VE estimate from this design is based on comparing the odds of being vaccinated in FARI patients (cases) and NFARI patients (controls). This bias may be a result of virus interference [6] (if vaccinees are more likely than nonvaccinees to contract NFARI, then the estimated VE will be falsely high). Biases B1 and B2 represent the effects of health status on the probabilities of NFARI and FARI, respectively. These effects, which are sometimes called the ‘healthy vaccinee effect’, represent the confounding resulting from the association of health status with both the probability of exposure (vaccination) and the outcome. Bias BS is a special case of B1∩B2. It results when health status affects both the probabilities of FARI and NFARI but the risk ratios comparing a healthy and a frail person are the same for both types of ARI. Bias C represents the effect of vaccination status on the probability of seeking care in patients with SI. This effect may be due to less severe symptoms in vaccinated persons compared to unvaccinated ones. As stated earlier, we assume perfect sensitivity and specificity of the influenza test (\(\tau_{1}=0\), \(\tau_{2}=1\)), as it is well known that misclassifications result in negatively-biased estimates of effectiveness.
Results
Sources of bias
We first state conditions for the unbiasedness of the VE estimate based on the TN design. The proofs of these results can be found in Appendix 4.

Under random vaccination (α _{0}=α _{1}), the estimate of VE when the outcome of interest is SI is unbiased if biases A and C are absent. When the outcome of interest is MAI, the estimate of VE is unbiased if bias A is absent.

Under nonrandom vaccination (α _{0}≠α _{1}), the estimate of VE when the outcome of interest is SI is unbiased if biases A, B1, B2, and C are absent. When the outcome of interest is MAI, the estimate of VE is unbiased if biases A, B1, and B2 are absent.
It is interesting to note that, in the absence of any source of bias, the OR-based VE estimate from a TN study is unbiased even if the ‘rare disease’ assumption is not satisfied, while the OR-based estimate from a TCC study is biased. To show this, we use the following simplified notation: α = probability of being vaccinated, β = probability of NFARI, \(\gamma_{0}\) and \(\gamma_{1}\) = probabilities of FARI in unvaccinated and vaccinated persons, respectively, and δ = probability of seeking care. Then the true VE is 1−ρ, where \(\rho = \gamma_{1}/\gamma_{0}\) is the risk ratio. In a TN study, the probabilities of vaccinated and unvaccinated cases are \(\alpha\delta\gamma_{1}\) and \((1-\alpha)\delta\gamma_{0}\), respectively. The corresponding probabilities of controls are \(\alpha\delta\beta\) and \((1-\alpha)\delta\beta\), respectively. Then the OR in the table of case-control status by vaccination status equals ρ, i.e. the true risk ratio, implying that the estimated VE is unbiased. In a TCC study, the probabilities of cases are the same as in the TN study, while the probabilities of vaccinated and unvaccinated controls (individuals without ARI) are \(\phi\alpha(1-\gamma_{1}-\beta)\) and \(\phi(1-\alpha)(1-\gamma_{0}-\beta)\), respectively, where ϕ is the sampling fraction of controls. Hence, the OR in the TCC study is \(\rho(1-\gamma_{0}-\beta)/(1-\rho\gamma_{0}-\beta)\). This OR is less than ρ (the true RR) if ρ < 1, hence the estimated VE exceeds the true VE in a TCC study as long as the true VE is positive.
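The argument can be checked numerically with a short sketch that builds the cell probabilities exactly as described and forms the two odds ratios. The function name `odds_ratios` is hypothetical, and the values chosen for α, δ and ϕ are arbitrary assumptions; they cancel in both odds ratios.

```python
def odds_ratios(alpha, beta, gamma0, rho, delta, phi):
    """ORs of the TN and TCC designs in the simplified no-bias setting."""
    gamma1 = rho * gamma0
    # Cases are identical in both designs: FARI patients who seek care.
    case_v = alpha * delta * gamma1
    case_u = (1 - alpha) * delta * gamma0
    # TN controls: NFARI patients who seek care.
    tn_v = alpha * delta * beta
    tn_u = (1 - alpha) * delta * beta
    # TCC controls: sampled persons without any ARI.
    tcc_v = phi * alpha * (1 - gamma1 - beta)
    tcc_u = phi * (1 - alpha) * (1 - gamma0 - beta)
    or_tn = (case_v * tn_u) / (case_u * tn_v)
    or_tcc = (case_v * tcc_u) / (case_u * tcc_v)
    return or_tn, or_tcc


or_tn, or_tcc = odds_ratios(alpha=0.5, beta=0.2, gamma0=0.1, rho=0.4,
                            delta=0.3, phi=0.01)
```

Here the TN odds ratio equals ρ = 0.4, while the TCC odds ratio equals 0.4 × 0.7 / 0.76 ≈ 0.368, so the TCC design yields a VE estimate of about 0.63 against a true VE of 0.6.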
Next we explore the magnitude of the effects of various sources of bias and their combinations. We consider three scenarios for vaccination probabilities (see Table 3). In Table 4 we present the range and the maximum absolute value of the bias of VE estimates resulting from TN and TCC studies under the three vaccination scenarios and various combinations of sources of bias. For these results we used the following baseline values of some of the parameters: \(\pi=0.7\), \(\beta_{00}=0.2\), \(\gamma_{00}=0.1\), \(\delta_{SN}=0.2\), \(\delta_{SF}=0.3\), \(\rho_{\gamma}=0.4\). Here \(\pi\) is the probability of being ‘healthy’; \(\beta_{00}\) and \(\gamma_{00}\) are the probabilities of NFARI and FARI, respectively, for an unvaccinated ‘frail’ person; \(\delta_{SN}\) and \(\delta_{SF}\) are the probabilities of seeking medical care for NFARI and FARI, respectively, for an unvaccinated ‘frail’ person; and \(\rho_{\gamma}\) is the risk ratio comparing the probability of FARI for a vaccinated and an unvaccinated person. Thus, the true VE against SI is 1 − 0.4 = 0.6 (60%). The values of the β’s and γ’s are based on data from various randomized placebo-controlled trials (see [14], Table A1), and the values of δ are based on data from several observational studies. In all the tables, figures and examples, values of VE are presented as fractions, rather than percentages.
In the calculations for Tables 4 and 5, when a source of bias was present we used a reasonable range for the corresponding probability ratio. When bias A was present, \(\rho_{\beta}\) was allowed to vary from 0.5 to 2.0. For biases B1, B2, and BS, we allowed \(\eta_{\beta}\) and \(\eta_{\gamma}\) to vary between 0.5 and 1.0, since one would not expect frail persons to have lower probabilities of ARI than healthy persons. For bias C, the ratio \(\Psi_{F}\) could vary between 0.5 and 1.0, since one would expect vaccination to reduce the probability that a person with SI will seek medical care compared to a person with ARI resulting from a different pathogen. For bias D, we let \(\lambda\) vary between 0.5 and 2.0.
For each combination of two or more sources of bias, we calculated the minimum, mean, and maximum of the bias and the absolute values of the bias by allowing the probability ratios that are not fixed to vary independently in the ranges specified above. For example, when biases A, B1, and B2 are absent, we used \(\rho_{\beta}=\eta_{\beta}=\eta_{\gamma}=1\), \(0.5\le\Psi_{F}\le 1\), and \(0.5\le\lambda\le 2\).
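One way such a calculation could be organized is sketched below for the example just given (biases A, B1 and B2 absent; Ψ_F and λ free). The evenly spaced grid, the function names, and the vaccination probabilities (0.4 for frail and 0.8 for healthy persons) are assumptions made for illustration; the text does not specify how the ranges were traversed. Bias is measured against the true VE against SI (0.6), and a perfect test is assumed.

```python
from itertools import product


def ve_tn_tcc(rho_beta=1.0, eta_beta=1.0, eta_gamma=1.0, psi_f=1.0, lam=1.0,
              alpha=(0.4, 0.8), pi=0.7, beta00=0.2, gamma00=0.1,
              delta_sn=0.2, delta_sf=0.3, rho_gamma=0.4):
    """Expected VE estimates (TN, TCC); alpha[x] = P(V=1 | X=x)."""
    cases, tn, tcc = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    for v, x in product((0, 1), (0, 1)):
        w = (pi if x else 1 - pi) * (alpha[x] if v else 1 - alpha[x])
        b = beta00 * (rho_beta if v else 1) * (eta_beta if x else 1)
        g = gamma00 * (rho_gamma if v else 1) * (eta_gamma if x else 1)
        care = lam if x else 1.0  # health-status multiplier for care seeking
        cases[v] += w * g * delta_sf * care * (psi_f if v else 1)
        tn[v] += w * b * delta_sn * care
        tcc[v] += w * (1 - b - g)
    return (1 - cases[1] * tn[0] / (cases[0] * tn[1]),
            1 - cases[1] * tcc[0] / (cases[0] * tcc[1]))


def bias_summary(grid_points=16, true_ve=0.6):
    """(min, mean, max) bias for TN and TCC over a grid of psi_f and lam."""
    psi_values = [0.5 + 0.5 * i / (grid_points - 1) for i in range(grid_points)]
    lam_values = [0.5 + 1.5 * i / (grid_points - 1) for i in range(grid_points)]
    tn_bias, tcc_bias = [], []
    for psi_f, lam in product(psi_values, lam_values):
        ve_tn, ve_tcc = ve_tn_tcc(psi_f=psi_f, lam=lam)
        tn_bias.append(ve_tn - true_ve)
        tcc_bias.append(ve_tcc - true_ve)

    def summarize(values):
        return min(values), sum(values) / len(values), max(values)

    return summarize(tn_bias), summarize(tcc_bias)


tn_summary, tcc_summary = bias_summary()
```

In this sketch the TN bias against SI ranges from 0 (Ψ_F=1) to 0.2 (Ψ_F=0.5) regardless of λ, while the TCC bias also varies with λ because health status then affects which cases reach the clinic.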
Summary of results
The impact of sources of bias
Our model allows us to evaluate the impact of the sources of bias listed in Table 2. Each source of bias is a result of a possible effect of vaccination or health status on the probability of FARI or NFARI or seeking care. Below we summarize our results for each of the sources of bias. We also use numerical examples to illustrate the magnitude and direction associated with each source of bias. Unless otherwise specified, the true VEs against SI and MAI are 0.6 (60%). In each of these examples we assume that only one source of bias is present.

(1) Vaccination affects the probability of NFARI (bias A)

This bias does not depend on the vaccination scenario or on the outcome of interest (SI or MAI).

Estimates of VE from TN studies may suffer from severe bias.

This effect also biases VE estimates from TCC studies, though to a lesser extent.

Example: As the ratio of the probability of NFARI comparing vaccinated and unvaccinated persons varies from 0.5 to 2.0, VE estimates from TN studies range from 0.2 to 0.8, respectively, while VE estimates from TCC studies range from 0.67 to 0.50, respectively (Fig. 2).


(2) Health status affects the probabilities of FARI and NFARI (biases B1, B2: the ‘healthy vaccinee effect’)

The bias does not depend on the outcome of interest (SI or MAI).

Under nonrandom vaccination, these effects may result in substantial bias of VE estimates from TN or TCC studies. However, this bias is usually less severe compared to the biases resulting from sources A, C and D.

If the effect of health status on the probability of ARI is the same for FARI and NFARI, i.e., bias BS is present, then the TN-based estimates of VE are unbiased.

Example: Suppose that the probabilities of vaccination are 0.8 and 0.4 for healthy and frail persons, respectively. Consider three cases regarding the risk ratios P(ARI in a healthy person) / P(ARI in a frail person): (a) When these risk ratios are 0.5 for NFARI and 0.8 for FARI, the estimated VEs from TN and TCC studies are 0.51 and 0.67, respectively. (b) When the risk ratios are 0.8 for NFARI and 0.5 for FARI, the estimated VEs from TN and TCC studies are 0.67 and 0.72, respectively. (c) When the risk ratios for NFARI and FARI are equal and their common value ranges from 0.5 to 1.0, the estimated VEs from TN studies are always unbiased (i.e., they equal 0.6), while estimates from TCC studies range from 0.63 to 0.73. In Figs. 3 and 4, we set the risk ratio for NFARI to 0.75 and let the risk ratio for FARI vary between 0.5 and 1.0.


(3) Vaccination affects the probability of seeking medical care for FARI, but it does not affect the probability of seeking care for NFARI (bias C)

When this effect is present then the true VEs against SI and MAI may be different as the vaccine directly affects (reduces) the probability of seeking care in influenza cases, but not in controls. Thus the estimates’ bias may depend on the outcome of interest.

If all other sources of bias are absent, the bias of VE estimates does not depend on the vaccination scenario.

Estimates of VE from TN or TCC studies may be severely biased when the outcome of interest is SI.

When the outcome of interest is MAI, estimates of VE from TN studies are unbiased, while the bias of estimates from TCC studies is usually small and is not affected by the magnitude of the effect underlying this source of bias.

Example: Let the ratio R = P(seeking medical care against FARI if vaccinated) / P(seeking medical care against FARI if unvaccinated) vary from 0.5 to 1.0. Then the true VE against SI remains fixed at 0.6, while the true VE against MAI varies with R from 0.8 to 0.6. The estimated VEs from TN studies equal the true VE against MAI for all values of R, while the estimated VEs from TCC studies vary from 0.82 to 0.63 (see Fig. 5). For example, when R=0.5 then the true VE against MAI is 0.80, and the VE estimates from TN and TCC studies are 0.80 and 0.82, respectively. This translates into severe bias when the outcome of interest is SI but small bias when the outcome of interest is MAI.


(4) Health status affects the probabilities of seeking care against FARI and NFARI (bias D)

The bias of VE estimates does not depend on the outcome of interest (SI or MAI).

In the absence of other sources of bias, VE estimates from TN studies are unbiased regardless of the vaccination scenario.

Under nonrandom vaccination, this effect may result in substantial bias in VE estimates from TCC studies.

Example: We assume that the probabilities of seeking care do not depend on vaccination status. As the ratio of the probabilities of seeking care comparing healthy and frail individuals varies from 0.5 to 2.0, VE estimates from TN studies remain fixed at 0.6 (i.e., they are unbiased) under both random and nonrandom vaccination. When the probabilities of vaccination are 0.8 and 0.4 for healthy and frail persons, respectively, the VE estimates from TCC studies vary from 0.72 to 0.53 (Figs. 6 and 7).

In addition, we found that in some cases the true VEs against SI and MAI are different. Hence, the bias of VE estimates may depend on the outcome against which the vaccine is supposed to protect. For example, if the only sources of bias are BS, C, and D, then the VE estimate from TN studies is unbiased when considering effectiveness against MAI, but the same estimate may overestimate the true VE against SI by 0.20 (i.e., 20 percentage points).
Comparison of the bias of VE estimates from TN and TCC studies:

If one is concerned that vaccination may affect the probability of non-influenza ARI, then one should prefer the TCC study design. However, TCC-based VE estimates may still be biased in this case. For example, when the ratio of the probability of NFARI comparing a vaccinated and an unvaccinated person is 0.5, the bias of the VE estimate from a TN study is 0.4, while the bias of the VE estimate from a TCC study is 0.07.

Under nonrandom vaccination, effects of health status on the probabilities of influenza and non-influenza ARI (the ‘healthy vaccinee effect’) may bias VE estimates from both study designs. In general, TN-based estimates perform slightly better than TCC-based estimates when this effect is believed to be the main source of bias. If the effect of health status is similar for FARI and NFARI, then the TN design produces less biased estimates than the TCC design. For example, suppose the probabilities of vaccination are 0.4 and 0.8 for healthy and frail persons, respectively. When the risk ratios for NFARI and FARI are both 0.75, the VE estimate from a TN study is unbiased, while the bias of the VE estimate from a TCC study is 0.07.

If one assumes that vaccination does not affect the probability of non-influenza ARI but one is concerned that vaccinated influenza patients are less likely to seek care than unvaccinated patients (because of reduced symptom severity), then VE estimates may suffer from severe bias in both study designs when the outcome of interest is SI. In this case, the bias of TN-based estimates may be somewhat smaller than that of TCC-based estimates. When the outcome of interest is MAI, this source of bias does not affect TN-based estimates and has only a small effect on TCC-based estimates. For example, suppose that the ratio comparing vaccinated and unvaccinated FARI cases with respect to the probability of seeking medical care is 0.5. When the outcome of interest is SI, the bias of a VE estimate from a TN study is 0.2 and the bias of a VE estimate from a TCC study is 0.22. When the outcome of interest is MAI, the VE estimate from a TN study is unbiased, while the bias of a VE estimate from a TCC study is 0.02.

Under nonrandom vaccination, the TN study design is preferable to the TCC design if one is concerned about bias resulting from a possible effect of a person’s health status on her/his probability of seeking care for ARI. For example, suppose that the probabilities of vaccination are 0.8 and 0.4 for healthy and frail persons, respectively. When the ratio of the probabilities of seeking medical care comparing healthy and frail persons is 0.5, the VE estimate from a TN study is unbiased, while the bias of the VE estimate from a TCC study is 0.12.
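The first comparison in this list (vaccination affecting the probability of NFARI) can be checked with simple arithmetic. The sketch below assumes random vaccination, no other source of bias, and a true VE of 0.6 as in the other examples; under these assumptions the TN odds ratio is the influenza risk ratio divided by the NFARI risk ratio. The function name `tn_ve_estimate` is hypothetical.

```python
def tn_ve_estimate(true_ve: float, nfari_risk_ratio: float) -> float:
    """VE estimated by a test-negative study when vaccination changes NFARI risk.

    nfari_risk_ratio : P(NFARI | vaccinated) / P(NFARI | unvaccinated)
    Assumes random vaccination and no other source of bias, so that
    OR_TN = (1 - true_ve) / nfari_risk_ratio.
    """
    return 1.0 - (1.0 - true_ve) / nfari_risk_ratio


true_ve = 0.6
estimate = tn_ve_estimate(true_ve, nfari_risk_ratio=0.5)
# Magnitude of the bias; the sign shows that the TN design underestimates VE here.
print(f"TN estimate={estimate:.2f}, bias={estimate - true_ve:+.2f}")
```

With a risk ratio of 0.5 the magnitude of the bias is 0.4, in line with the value quoted for the TN design; the TCC figure of 0.07 depends on model parameters not restated here and is not reproduced by this sketch.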
Precision of VE estimates
Table 5 presents the standard errors of VE estimates from TN and TCC studies. From this table we conclude that:

Nonrandom vaccination may reduce precision of VE estimates.

If the probability of NFARI is associated with vaccination status, then VE estimates from TN studies are somewhat less precise compared to VE estimates from TCC studies, although the differences in precision were small.

If the probability of NFARI is not associated with vaccination status, then the precision of VE estimates from TN and TCC studies is similar.
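For readers who want to compute such standard errors from observed counts, the following is a minimal sketch using the standard large-sample (delta-method) approximation for an odds ratio from a 2×2 table. It is offered as an assumption-laden illustration and is not necessarily the expression derived in Appendix 3; the function name and the example counts are hypothetical.

```python
import math


def ve_and_se(vax_cases, unvax_cases, vax_controls, unvax_controls):
    """Return (VE, approximate SE of VE) from a 2x2 case-control table.

    VE = 1 - OR. The SE of log(OR) is Woolf's formula; the delta method then
    gives SE(VE) ~= OR * SE(log OR). All four counts must be positive.
    """
    odds_ratio = (vax_cases * unvax_controls) / (unvax_cases * vax_controls)
    se_log_or = math.sqrt(
        1 / vax_cases + 1 / unvax_cases + 1 / vax_controls + 1 / unvax_controls
    )
    return 1.0 - odds_ratio, odds_ratio * se_log_or


# Hypothetical counts, for illustration only.
ve, se = ve_and_se(vax_cases=40, unvax_cases=60, vax_controls=50, unvax_controls=50)
print(f"VE={ve:.3f}, SE={se:.3f}")
```

The approximation degrades when any cell count is small, which is one reason nonrandom vaccination (which can leave some cells sparse) may reduce precision.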
Discussion and conclusions
We developed a new model for the evaluation of the bias and precision of influenza VE estimates from case-control studies. The new model is more comprehensive than previously suggested models [5, 14–18] for the following reasons:

It allows assessment of the impact of nonrandom vaccination.

It incorporates a confounder (health status) which links vaccination status with the probabilities of ARI and of seeking medical care for these ARIs.

By including parameters corresponding to the probabilities of seeking medical care, the model allows us to examine the effect of association of these probabilities with vaccination and health status on the bias of VE estimates.

The model allows evaluating and comparing the precision of VE estimates.
Some of the sources of bias discussed here have been identified and addressed in earlier publications, but, to the best of our knowledge, none of the previous papers presents a comprehensive discussion of all the possible sources of bias that may arise under a given model. In addition, the current model attributes the associations between the various factors involved in estimation of VE (vaccination, contracting influenza and non-influenza ARIs, and seeking medical care) to an underlying covariate. Previously published models, including an earlier version of our model [14], included parameters representing these associations, but these associations were not based on a common underlying factor. Therefore, we believe that the current results and conclusions may differ from those derived from less structured models.
Our calculations confirm earlier findings [15] that when the probability of non-influenza ARI depends on vaccination status, VE estimates from test-negative studies may be severely biased. However, even when this probability is not affected by vaccination, VE estimates from the two types of case-control studies considered in this work may suffer from substantial bias. In addition to the well-known ‘healthy vaccinee effect’ (probabilities of vaccination and of ARI depend on health status), bias of VE estimates may result from heterogeneities in healthcare-seeking behaviors. Specifically, if vaccination reduces the probability that an influenza patient seeks medical care (because her/his symptoms are less severe than those of an unvaccinated influenza patient), then VE estimates from TN or TCC studies may grossly overestimate the true VE against SI. On the other hand, when the outcome of interest is MAI, the biases resulting from vaccine-related reduction in symptom severity are very small. Recent papers [7–9] found evidence of vaccine-associated reduction in influenza patients’ symptom severity. The effects of healthcare-seeking behaviors on VE estimates from studies in which only ARI patients who seek medical care may become cases need to be further investigated.
The results of this work lead to the following conclusions:

In general, estimates of influenza VE from case-control studies where only ARI patients seeking medical care are tested for influenza infection may suffer from severe bias, i.e., an absolute bias of 0.20 (20 percentage points) or more, especially when the outcome of interest is SI.

The bias of VE estimates may depend on the outcome against which the vaccine is supposed to protect. When the outcome of interest is MAI, seeking medical care is a component of the outcome. In other words, the true VE against MAI reflects the vaccine effect on seeking medical care and on contracting influenza. This explains why true VE against MAI may differ from true VE against SI. When bias C is present, the vaccine directly affects (reduces) the probability of seeking care in influenza cases, but not in controls. As a result, VE against MAI is lower than VE against SI.

Influenza VE estimates from TN studies are usually presented as ‘VE against medically-attended influenza’. However, the media and lay persons may interpret these VE estimates simply as the protective effectiveness of vaccination against contracting influenza illness, i.e., VE against SI. Health authorities and the public should be made aware of this distinction.

When the outcome of interest is SI, the TN design provides valid estimates (i.e., no or small bias) if the following assumptions are satisfied: (a) vaccination does not affect the probability of non-influenza ARI, (b) effects of confounding variables on the probabilities of influenza and non-influenza ARI are similar, and (c) vaccination does not affect the probabilities of seeking medical care for influenza ARI due to reduced severity of symptoms. When the outcome of interest is MAI, only assumptions (a) and (b) are necessary for obtaining a valid VE estimate from a TN study.

Estimates of VE from TCC studies have small bias when the outcome of interest is SI if assumptions (a) and (c) are satisfied, assumption (b) is replaced by the stronger assumption (b*) that there is no confounding, and an additional assumption (d) holds: the probabilities of seeking medical care for ARI are not affected by potential confounders. When the outcome of interest is MAI, TCC-based estimates of VE have small bias under assumptions (a), (b*), and (d).

It is important to collect more data on healthcare-seeking behaviors of ARI patients and to study the effects of vaccination and potential confounders on these behaviors.
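The assumptions listed in the conclusions above can be summarized as a small lookup, which may be convenient when planning a study. This is only a restatement of the text in code form; the dictionary and function names are hypothetical, and the wording of each assumption is abridged.

```python
# Abridged wording of the assumptions named in the conclusions.
ASSUMPTIONS = {
    "a": "vaccination does not affect the probability of non-influenza ARI",
    "b": "confounders affect influenza and non-influenza ARI similarly",
    "b*": "no confounding",
    "c": "vaccination does not reduce care seeking for influenza ARI",
    "d": "care seeking for ARI is not affected by potential confounders",
}

# Assumptions under which each design has no or small bias, by outcome.
REQUIRED = {
    ("TN", "SI"): ("a", "b", "c"),
    ("TN", "MAI"): ("a", "b"),
    ("TCC", "SI"): ("a", "b*", "c", "d"),
    ("TCC", "MAI"): ("a", "b*", "d"),
}


def required_assumptions(design: str, outcome: str) -> list[str]:
    """Return the assumptions needed for a given design and outcome."""
    return [f"({key}) {ASSUMPTIONS[key]}" for key in REQUIRED[(design, outcome)]]


for line in required_assumptions("TN", "MAI"):
    print(line)
```

The lookup makes the asymmetry visible at a glance: assumption (c) drops out whenever the outcome is MAI, and the TCC design always needs the stronger (b*) plus (d).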
In summary, the test-negative design produces less biased VE estimates than the traditional case-control design, provided that vaccination does not modify the probability of non-influenza ARI. However, this very popular study design may still produce biased estimates of influenza VE, especially when the outcome of interest is symptomatic influenza. One can expect monitored cohort studies, where every study participant reporting an ARI is tested for influenza infection, to provide less biased estimates of VE against SI. In a future publication we plan to compare the bias of these cohort studies, which are much more expensive, with that of TN studies.
Our study has a few limitations:

In order to focus on bias associated with the study designs, we ignored bias resulting from misclassification of infection and vaccination status.

Our model does not account for the dynamics of outbreaks of influenza and other ARIcausing infections.

We only consider unadjusted VE estimates as we tried to focus on sources of bias rather than on how one can reduce bias using standard or novel statistical techniques [19].
In the future we plan to improve the model by incorporating dynamics of the related processes. We also plan to use stochastic simulations to assess bias and precision of influenza VE estimates for other study designs (e.g. cohort studies) and to propose new study designs resulting in less biased VE estimates.
Appendix 1
True VEs in our model
The true VE against SI is:
Since
we have
Since, for the true VE, we have \(\alpha_{0}=\alpha_{1}\), we obtain
So that,
Therefore,
The true VE against MAI is:
Since
we can get,
Therefore,
Hence,
Appendix 2
Model-based estimates of VE
The model-based estimate from the TN study is:
\(OR_{A}\) can be written as:
So that,
Therefore,
The model-based estimate from the TCC study is:
\(OR_{B}\) can be written as:
Since,
and \(P(Y=0\mid V=v,X=x)=1-P(Y=1\mid V=v,X=x)-P(Y=2\mid V=v,X=x)=1-\gamma_{vx}-\beta_{vx}\), we have:
Therefore,
Appendix 3
Standard errors of the VE estimates
For the TN study, the approximate standard error of \(VE_{A}\) is:
where N _{ A } is the number of persons who were tested for influenza (M=1), i.e., the total sample size for the TN study. The probabilities (\(p^{A}_{V1}\), \(p^{A}_{01}\), \(p^{A}_{11}\)) can be written in terms of the parameters defined earlier.
Based on the earlier results, we know
Thus,
Therefore, we have
In the TCC study, the approximate standard error of \(VE_{B}\) is:
where \(N^{b}_{C1}\) is the number of cases and \(N^{b}_{C0}\) is the number of controls. The probabilities \(\left (p^{B}_{10},p^{B}_{11}\right)\) can be written in terms of the parameters defined earlier:
Appendix 4
Unbiasedness under random and nonrandom vaccination
Unbiasedness under random vaccination
If the vaccination is done at random, then \(\alpha_{0}=\alpha_{1}\). The VE estimates can be written as:
(1) If \(\rho_{\beta}=\Psi_{F}=1\) and one of the following conditions is satisfied, then \(VE_{A}=VET_{SI}\).

(a)
\(\lambda=1\);

(b)
\(\eta_{\gamma}=1\).
Proof
Since \(\rho_{\beta_{x}} = \Psi_{F} = 1\), we have \(\frac{\beta_{10}}{\beta_{00}} = \frac{\beta_{11}}{\beta_{01}} = \Psi_{F} = 1\). We have:
So,
If (a) \(\lambda=1\) is satisfied, then
If (b) \(\eta_{\gamma}=1\) is satisfied, then \(\gamma_{01}=\gamma_{00}\) and \(\gamma_{11}=\gamma_{10}\). Thus,
Hence,
□
(2) If \(\rho_{\beta}=1\), then \(VE_{A}=VET_{MAI}\).
Proof
Since \(\rho_{\beta}=1\), we have \(\frac{\beta_{10}}{\beta_{00}} = \frac{\beta_{11}}{\beta_{01}} = 1\).
So,
□
(3) If \(\lambda=1\) and \(1-\gamma_{1x}-\beta_{1x}=\Psi_{F}(1-\gamma_{0x}-\beta_{0x})\), where \(x=0,1\), then \(VE_{B}=VET_{SI}\).
Proof
Since \(1-\gamma_{1x}-\beta_{1x}=\Psi_{F}(1-\gamma_{0x}-\beta_{0x})\), where \(x=0,1\), and \(\lambda=1\), then
□
(4) If \(\gamma_{1x}+\beta_{1x}=\gamma_{0x}+\beta_{0x}\), where \(x=0,1\), then \(VE_{B}=VET_{MAI}\).
Proof
Since \(\gamma_{1x}+\beta_{1x}=\gamma_{0x}+\beta_{0x}\), \(x=0,1\), then
So:
□
Unbiasedness under nonrandom vaccination
If the vaccination is not done at random, then \(\alpha_{0}\neq\alpha_{1}\).
(5) If \(\rho_{\beta}=\eta_{\beta}=\eta_{\gamma}=\Psi_{F}=1\), then \(VE_{A}=VET_{SI}\).
Proof
Since \(\rho_{\beta}=\eta_{\beta}=\eta_{\gamma}=1\), then \(\beta_{00}=\beta_{10}=\beta_{01}=\beta_{11}\stackrel{\Delta}{=}\beta\), \(\gamma_{01}=\gamma_{00}\) and \(\gamma_{11}=\gamma_{10}\). Thus,
So,
□
(6) If \(\rho_{\beta}=1\) and \(\eta_{\beta}=\eta_{\gamma}\), then \(VE_{A}=VET_{MAI}\).
Proof
Since \(\eta_{\beta}=\eta_{\gamma}\), we have \(\frac{\beta_{01}}{\beta_{00}} = \frac{\beta_{11}}{\beta_{10}} = \frac{\gamma_{01}}{\gamma_{00}} = \frac{\gamma_{11}}{\gamma_{10}}\). Then we have: \(\frac{\gamma_{11}}{\beta_{11}} = \frac{\gamma_{10}}{\beta_{10}} \stackrel{\Delta}{=} a\), \(\frac{\gamma_{00}}{\beta_{00}} = \frac{\gamma_{01}}{\beta_{01}} \stackrel{\Delta}{=} b\), and \(\frac{\beta_{10}}{\beta_{00}} = \frac{\beta_{11}}{\beta_{01}} = 1\). Then:
and,
So, \(VE_{A}=VET_{MAI}\). □
Abbreviations
 ARI:

Acute respiratory illness
 FARI:

Influenza ARI
 MAI:

Medicallyattended influenza
 NFARI:

Noninfluenza ARI
 OR:

Odds ratio
 SE:

Standard error
 SI:

Symptomatic influenza
 TCC:

Traditional casecontrol
 TN:

Testnegative
 VE:

Vaccine effectiveness
References
 1
Skowronski DM, Masaro C, Kwindt TL, Mak A, Petric M, Li Y, Sebastian R, Chong M, Tam T, De Serres G. Estimating vaccine effectiveness against laboratory-confirmed influenza using a sentinel physician network: results from the 2005–2006 season of dual A and B vaccine mismatch in Canada. Vaccine. 2007; 25(15):2842–51. doi:10.1016/j.vaccine.2006.10.002.
 2
Belongia EA, Kieke BA, Donahue JG, Greenlee RT, Balish A, Foust A, Lindstrom S, Shay DK, Marshfield Influenza Study Group. Effectiveness of inactivated influenza vaccines varied substantially with antigenic match from the 2004–2005 season to the 2006–2007 season. J Infect Dis. 2009; 199(2):159–67. doi:10.1086/595861.
 3
Choi WS, Noh JY, Seo YB, Baek JH, Lee J, Song JY, Park DW, Lee JS, Cheong HJ, Kim WJ. Case-control study of the effectiveness of the 2010–2011 seasonal influenza vaccine for prevention of laboratory-confirmed influenza virus infection in the Korean adult population. Clin Vaccine Immunol. 2013; 20(6):877–1. doi:10.1128/CVI.00009-13.
 4
Havers F, Sokolow L, Shay DK, Farley MM, Monroe M, Meek J, Daily Kirley P, Bennett NM, Morin C, Aragon D, Thomas A, Schaffner W, Zansky SM, Baumbach J, Ferdinands J, Fry AM. Case-control study of vaccine effectiveness in preventing laboratory-confirmed influenza hospitalizations in older adults, United States, 2010–2011. Clin Infect Dis. 2016; 63(10):1304–11. doi:10.1093/cid/ciw512.
 5
De Serres G, Skowronski DM, Wu XW, Ambrose CS. The test-negative design: validity, accuracy and precision of vaccine efficacy estimates compared to the gold standard of randomised placebo-controlled clinical trials. Euro Surveill. 2013; 18(37). https://doi.org/10.2807/1560-7917.ES2013.18.37.20585.
 6
Cowling BJ, Fang VJ, Nishiura H, Chan KH, Ng S, Ip DKM, Chiu SS, Leung GM, Peiris JSM. Increased risk of noninfluenza respiratory virus infections associated with receipt of inactivated influenza vaccine. Clin Infect Dis. 2012; 54(12):1778–83. doi:10.1093/cid/cis307.
 7
Castilla J, Godoy P, Dominguez A, Martinez-Baz I, Astray J, Martin V, Delgado-Rodriguez M, Baricot M, Soldevila N, Maria Mayoral J, Maria Quintana J, Carlos Galan J, Castro A, Gonzalez-Candelas F, Garin O, Saez M, Tamames S, Pumarola T, Influenza CCC. Influenza vaccine effectiveness in preventing outpatient, inpatient, and severe cases of laboratory-confirmed influenza. Clin Infect Dis. 2013; 57(2):167–75. doi:10.1093/cid/cit194.
 8
VanWormer JJ, Sundaram ME, Meece JK, Belongia EA. A cross-sectional analysis of symptom severity in adults with influenza and other acute respiratory illness in the outpatient setting. BMC Infect Dis. 2014; 14:231. doi:10.1186/1471-2334-14-231.
 9
Deiss RG, Arnold JC, Chen WJ, Echols S, Fairchok MP, Schofield C, Danaher PJ, McDonough E, Ridoré M, Mor D, Burgess TH, Millar EV. Vaccine-associated reduction in symptom severity among patients with influenza A/H3N2 disease. Vaccine. 2015; 33(51):7160–7. doi:10.1016/j.vaccine.2015.11.004.
 10
Haber M, Longini Jr IM, Halloran ME. Measures of the effects of vaccination in a randomly mixing population. Int J Epidemiol. 1991; 20(1):300–10.
 11
Sullivan SG, Tchetgen Tchetgen EJ, Cowling BJ. Theoretical basis of the test-negative study design for assessment of influenza vaccine effectiveness. Am J Epidemiol. 2016; 184(5):345–53. doi:10.1093/aje/kww064.
 12
Lipsitch M, Jha A, Simonsen L. Observational studies and the difficult quest for causality: lessons from vaccine effectiveness and impact studies. Int J Epidemiol. 2016. doi:10.1093/ije/dyw124.
 13
Agresti A. Categorical Data Analysis, 3rd ed. Wiley series in probability and statistics, vol. 792. Hoboken, NJ: Wiley; 2013.
 14
Haber M, An Q, Foppa IM, Shay DK, Ferdinands JM, Orenstein WA. A probability model for evaluating the bias and precision of influenza vaccine effectiveness estimates from case-control studies. Epidemiol Infect. 2015; 143(7):1417–26. doi:10.1017/S0950268814002179.
 15
Orenstein EW, De Serres G, Haber MJ, Shay DK, Bridges CB, Gargiullo P, Orenstein WA. Methodologic issues regarding the use of three observational study designs to assess influenza vaccine effectiveness. Int J Epidemiol. 2007; 36(3):623–31. doi:10.1093/ije/dym021.
 16
Ferdinands JM, Shay DK. Magnitude of potential biases in a simulated case-control study of the effectiveness of influenza vaccination. Clin Infect Dis. 2012; 54(1):25–32. doi:10.1093/cid/cir750.
 17
Foppa IM, Haber M, Ferdinands JM, Shay DK. The case test-negative design for studies of the effectiveness of influenza vaccine. Vaccine. 2013; 31(30):3104–9. doi:10.1016/j.vaccine.2013.04.026.
 18
Jackson ML, Nelson JC. The test-negative design for estimating influenza vaccine effectiveness. Vaccine. 2013; 31(17):2165–8. doi:10.1016/j.vaccine.2013.02.053.
 19
Talbot HK, Nian H, Chen Q, Zhu Y, Edwards KM, Griffin MR. Evaluating the case-positive, control test-negative study design for influenza vaccine effectiveness for the frailty bias. Vaccine. 2016; 34(15):1806–9. doi:10.1016/j.vaccine.2016.02.037.
Acknowledgement
The authors wish to thank Dr. Ivo Foppa and two reviewers for helpful comments.
Funding
This research was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health (NIH) under Award R01AI110474, and by IPA 111037605 with the Centers for Disease Control and Prevention (CDC). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the CDC.
Availability of data and materials
Not applicable. No data or other materials are used in this work.
Ethics declarations
Ethics approval and consent to participate
Not applicable. No data on human subjects are used in this work.
Consent for publication
All the authors have read and approved the manuscript and agreed to be included in the publication.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional information
Authors’ contributions
MS, QA and MH developed the model and wrote the manuscript. WO and KA made significant revisions. All authors have read the manuscript and agreed to its content.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Keywords
 Casecontrol study
 Testnegative study
 Probability model
 Symptomatic influenza
 Medicallyattended influenza