Estimates of state-level chronic hepatitis C virus infection, stratified by race and sex, United States, 2010

Background: Hepatitis C virus (HCV) is the most common blood-borne viral infection in the United States. Previously, we used data from the National Health and Nutrition Examination Survey (NHANES) and mortality data from the National Vital Statistics System (NVSS) to estimate the prevalence of HCV antibodies (anti-HCV) and HCV RNA in all U.S. states. However, demographic differences in HCV burden at the state level have not been systematically described. This analysis quantified the HCV burden stratified by sex and race (and associated disparities) for each U.S. state.
Methods: Building on our previous method, we used three publicly available data sources to estimate HCV RNA prevalence among noninstitutionalized adults stratified by sex and race group. We used a small-area estimation approach that combined direct standardization of NHANES demographic data with logistic regression modeling of HCV-related mortality data as an adjustment factor to estimate the state-level prevalence and total number of persons with chronic HCV infection for sex and race groups in all U.S. states.
Results: Nationally, males had an estimated HCV RNA prevalence of 1.56% (95% CI: 1.37–1.84%) and females had a prevalence of 0.75% (95% CI: 0.63–0.96%). Stratified by race, national estimated prevalence of HCV RNA was highest among non-Hispanic black persons (2.43%, 95% CI: 2.10–2.90%), followed by non-Hispanic white persons (1.05%, 95% CI: 0.90–1.27%) and Hispanic/other persons (0.74%, 95% CI: 0.59–1.04%). Males in most jurisdictions (41/51) have an HCV RNA prevalence that is between 1.5 and 2.5 times higher than that of their female counterparts.
Conclusions: HCV infection disparities by sex are mostly consistent across the country. However, race differences in HCV infection differ by state, and tailored prevention and treatment efforts specific to the local HCV epidemic are needed to reduce race disparities.
Electronic supplementary material The online version of this article (10.1186/s12879-018-3133-6) contains supplementary material, which is available to authorized users.


Background
Hepatitis C virus (HCV) infection is the most common blood-borne infection in the United States [1]. In 2010, an estimated 3.9 million adults had antibodies to HCV (anti-HCV), indicating a previous or current acute infection [2]. Approximately 26% of individuals with acute HCV spontaneously clear infection within 6 months of exposure, but the remaining persons develop a chronic infection [3]. Despite the recent development of curative therapies, chronic HCV remains a leading cause of hepatocellular carcinoma, cirrhosis and liver failure requiring a transplant [4,5]. Chronic HCV infection has been associated with an increase in HCV-related mortality, and in 2007 chronic HCV surpassed HIV as a cause of death in the United States [6,7].
Current national surveillance efforts provide an incomplete picture of the burden of both HCV infection and diagnoses in the United States. Both acute and chronic HCV infection diagnoses are reportable conditions in the Centers for Disease Control and Prevention (CDC) National Notifiable Disease Surveillance System (NNDSS) [8]. However, many HCV infections are not captured, and surveillance data do not accurately represent the burden of hepatitis C in the United States [9]. Chronic HCV is often asymptomatic, and about half of infected persons are unaware of their infection and therefore could not be reported to surveillance systems [10]. Furthermore, some states do not submit viral hepatitis reports to NNDSS (only 37 states reported HCV cases in 2014) and some states report implausibly low numbers [9]. In 2015, CDC directly funded enhanced case surveillance activities in 8 jurisdictions [11]. Both the O'Neill Institute and the US National Viral Hepatitis Action Plan (Goals 4.2 and 4.3) call for improvements in the mechanisms and timeliness of data used to monitor the epidemic [12,13].
In the absence of complete surveillance data, probability-based surveys such as the National Health and Nutrition Examination Survey (NHANES) can be used to fill this gap. NHANES data have been used both to estimate HCV prevalence at the national level and as an input to our recent model, which estimated HCV prevalence at the state level using a small-area estimation approach [2,14,15].
Nationally, the prevalence and risk of new HCV infection differ across several key demographic groups. About 70% of all chronic infections are among individuals who were born between 1945 and 1965 [1]. Overall, the prevalence of anti-HCV among adults from the 1945-1965 birth cohort was estimated to be 3.5% in 2010 [1]. NHANES analyses have demonstrated a national prevalence of anti-HCV that is higher in men (1.9% vs. 1.1% in women) and in non-Hispanic blacks (2.2% vs. 1.3% among non-Hispanic whites) [1]. Disparities by race have been demonstrated repeatedly in programmatic data and in jurisdiction-specific epidemiologic profiles [16], which collectively form the basis for national goals of reducing HCV-related mortality among African Americans and American Indians/Alaska Natives [12]. However, published demographic differences in HCV burden at local levels are not typically described by prevalence indicators (anti-HCV or HCV RNA) and have not been systematically described for all states. This analysis extends our earlier method for state-level prevalence estimation to quantify the burden of chronic hepatitis C infection, stratified by race and by sex, in each U.S. state.

Data sources
We extended a previously described small-area estimation approach that synthesizes NHANES national estimates of HCV infection with state-level data on HCV-related mortality from the National Vital Statistics System [2]. The data sources and analytic methods are briefly described below.
NHANES uses a complex, multistage sampling design to collect nationally representative questionnaire and laboratory data on the health of the non-institutionalized United States population [17]. Data and corresponding sampling weights are released every 2 years. Data from seven NHANES release cycles (1999-2012) were pooled in order to ensure sufficient data for all demographic groups. All analyses were restricted to respondents aged ≥18 years at the time of the survey. Race/ethnicity was categorized into non-Hispanic black, non-Hispanic white and Hispanic/other. Similar to previous analyses, birth year was classified into three cohorts: before 1945, 1945-1965 and after 1965 [18, 19].
NHANES tested lab samples for anti-HCV using an anti-HCV screening chemiluminescence immunoassay and a confirmatory recombinant immunoblot assay (RIBA) [20]. Samples with positive results on RIBA were tested for HCV RNA using an in vitro nucleic acid amplification test. Participants with a positive or indeterminate anti-HCV test and a positive HCV RNA test were considered to have chronic HCV infection. As in previous analyses, participants who tested positive for anti-HCV but negative for HCV RNA (resolved infections; n = 112) and those who tested positive for anti-HCV but did not have an HCV RNA test (n = 160) were not included in this analysis [14].
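The case definition above can be written as a small decision rule. The following is only an illustrative sketch: the function name, the string labels and the treatment of result combinations not mentioned in the text (for example, an indeterminate anti-HCV test with a negative HCV RNA test) are assumptions and are not taken from NHANES documentation.

```python
from typing import Optional


def classify_hcv_status(anti_hcv: str, hcv_rna: Optional[str]) -> str:
    """Classify one participant from two lab results (hypothetical labels).

    anti_hcv: "positive", "indeterminate" or "negative".
    hcv_rna:  "positive", "negative", or None when no RNA test was done.
    """
    # Chronic infection: positive or indeterminate antibody test plus positive RNA.
    if anti_hcv in ("positive", "indeterminate") and hcv_rna == "positive":
        return "chronic"
    # Resolved infection (antibody positive, RNA negative): excluded from the analysis.
    if anti_hcv == "positive" and hcv_rna == "negative":
        return "excluded_resolved"
    # Antibody positive without an RNA result: excluded from the analysis.
    if anti_hcv == "positive" and hcv_rna is None:
        return "excluded_no_rna_test"
    # Assumption: every other combination is treated as not chronically infected.
    return "not_chronic"
```

Under these assumptions, a participant with an indeterminate antibody result and a positive RNA result is counted as chronically infected, matching the definition in the text.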
The National Vital Statistics System (NVSS) collects demographic, geographic and cause of death data from all death certificates in the United States [21]. Mortality Multiple Cause Microdata files (from 1999 to 2012) were used and data were categorized into the same sex, race and birth cohort categories described above. Any death records that listed the ICD-10 code for acute viral hepatitis C (B17.1) or chronic viral hepatitis C (B18.2) as the underlying or a multiple cause of death were considered to indicate HCV-related mortality.
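As a rough illustration of the mortality definition above, the sketch below flags a death record as HCV-related when B17.1 or B18.2 appears as the underlying cause or among the multiple causes. The function names and the record layout (one underlying-cause string plus a list of multiple-cause strings) are assumptions made for this example and do not reflect the actual layout of the NVSS microdata files.

```python
from typing import Iterable, Optional

# ICD-10 codes named in the text, stored without the decimal point.
HCV_ICD10_CODES = {"B171", "B182"}


def normalize_icd10(code: str) -> str:
    """Strip whitespace and the decimal point so 'B18.2' and 'b182' compare equal."""
    return code.replace(".", "").strip().upper()


def is_hcv_related(underlying: Optional[str], multiple_causes: Iterable[str]) -> bool:
    """Return True if any listed cause of death is acute or chronic hepatitis C."""
    codes = [underlying, *multiple_causes]
    return any(normalize_icd10(c) in HCV_ICD10_CODES for c in codes if c)
```

For example, a record with underlying cause K74.6 and B18.2 among its multiple causes would be counted as HCV-related under this rule.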
Annual intercensal population estimates (1999-2012) from the US Vintage 2000, 2009 and 2014 data sets were used as denominators for HCV-related mortality rates within each demographic stratum [22]. Microdata from the 2010 5-year American Community Survey (ACS, years 2006-2010) were used to generate estimates of the noninstitutionalized population [23,24], which were combined with our estimated number of HCV cases to estimate HCV prevalence rates for each state. The incorporation of ACS population totals into this analysis is an update to our previously published approach and reflects current NHANES guidance [25]. Estimated population totals for each sex and race group are reported in Additional file 1: Table S1.

Analysis
The number of persons with chronic hepatitis C in each state was estimated using a standardization-based estimator described in detail earlier [2]. Briefly, we used NHANES data to calculate weighted estimates of national HCV RNA prevalence for 18 strata defined by sex (2), race (3) and birth cohort (3) [25]. We multiplied these weighted estimates by 2010 ACS 5-year population estimates for the corresponding demographic stratum in each state to generate crude state-by-stratum totals. We used NVSS mortality data and intercensal population totals to fit a high-order logistic regression model of the average HCV-related death rates in the corresponding 12 strata over the 14-year period. We compared observed state-by-stratum HCV-related mortality totals to model predictions to assess collinearity and model fit [26,27]. State-by-demographic-stratum effects were calculated as the ratio of state-specific HCV-related death rates to national HCV-related death rates within the same stratum. The crude state-by-stratum HCV RNA totals were adjusted by the state-by-stratum effects to calculate mortality-adjusted HCV RNA prevalence totals in each stratum in each state. Totals were summed within single strata of race (white non-Hispanic, black non-Hispanic and Hispanic/other race) or sex and divided by the corresponding population totals. This yielded HCV RNA prevalence rate estimates by sex and race for each state. Prevalence rate ratios were generated to compare HCV RNA prevalence for men vs. women and for non-Hispanic black vs. non-Hispanic white persons. An analogous approach and model were used to estimate the prevalence rate and number of persons with HCV antibodies (anti-HCV), which is often used as an indicator of past or current infection [14]. The results for anti-HCV prevalence by race and sex in each state are presented in Additional file 1: Tables S2-S5.
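The arithmetic of the estimator can be illustrated with a short sketch. It assumes a purely multiplicative adjustment (crude total times the ratio of state to national death rate) and uses observed rather than model-predicted death rates; the published method [2] may differ in these details. All function names, the tuple layout of a stratum (sex, race, birth cohort) and the input dictionaries are hypothetical.

```python
from collections import defaultdict


def adjusted_totals(national_prev, state_pop, state_death_rate, national_death_rate):
    """Mortality-adjusted case totals keyed by (state, stratum).

    national_prev[stratum]             national HCV RNA prevalence (proportion)
    state_pop[(state, stratum)]        noninstitutionalized population
    state_death_rate[(state, stratum)] HCV-related death rate in the state
    national_death_rate[stratum]       HCV-related death rate nationally
    """
    totals = {}
    for (state, stratum), pop in state_pop.items():
        crude = national_prev[stratum] * pop  # direct standardization step
        effect = state_death_rate[(state, stratum)] / national_death_rate[stratum]
        totals[(state, stratum)] = crude * effect  # assumed multiplicative adjustment
    return totals


def prevalence_by_group(totals, state_pop, group_index):
    """Sum cases and population over strata sharing one attribute, then divide.

    group_index selects the attribute within the stratum tuple
    (0 = sex, 1 = race, 2 = birth cohort in this sketch).
    """
    cases = defaultdict(float)
    pop = defaultdict(float)
    for (state, stratum), n in totals.items():
        key = (state, stratum[group_index])
        cases[key] += n
        pop[key] += state_pop[(state, stratum)]
    return {key: cases[key] / pop[key] for key in cases}


def rate_ratio(prevalence, state, numerator_group, denominator_group):
    """Prevalence rate ratio between two groups in one state, e.g. men vs. women."""
    return prevalence[(state, numerator_group)] / prevalence[(state, denominator_group)]
```

In this sketch a state whose HCV-related death rate in a stratum is twice the national rate receives twice the crude case total for that stratum, and the sex- or race-specific prevalence is the sum of adjusted totals divided by the matching population.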
We calculated 95% confidence intervals (CI) for state-level estimates to account for the joint statistical uncertainty in the NHANES prevalence estimates and the HCV-related mortality rate estimates from the logistic regression model. Confidence intervals were calculated using a Monte Carlo simulation that sampled from logit-normal and normal distributions, respectively (k = 10,000). We calculated the coefficient of variation for all prevalence rate and ratio estimates. Similar to the guidelines used by NVSS, estimates with a coefficient of variation of 23% or greater were considered potentially unreliable and marked accordingly [28].
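A minimal sketch of this kind of simulation is shown below for a single stratum. It assumes that the NHANES prevalence is drawn on the logit scale from a normal distribution (a logit-normal draw) and that the mortality adjustment ratio is drawn directly from a normal distribution; how the published analysis parameterized the mortality draws is not stated here, so that choice, the percentile interval and all names and inputs are assumptions for illustration.

```python
import math
import random
import statistics


def expit(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_adjusted_prevalence(logit_prev_mean, logit_prev_se,
                                 ratio_mean, ratio_se, k=10_000, seed=0):
    """Draw k simulated mortality-adjusted prevalence values for one stratum."""
    rng = random.Random(seed)
    draws = []
    for _ in range(k):
        prev = expit(rng.gauss(logit_prev_mean, logit_prev_se))  # logit-normal draw
        ratio = rng.gauss(ratio_mean, ratio_se)                  # normal draw (assumed)
        draws.append(prev * ratio)
    return draws


def summarize(draws, cv_threshold=0.23):
    """Percentile 95% interval, coefficient of variation and reliability flag."""
    ordered = sorted(draws)
    k = len(ordered)
    lower = ordered[int(0.025 * k)]
    upper = ordered[int(0.975 * k) - 1]
    cv = statistics.stdev(draws) / statistics.fmean(draws)
    # Estimates with a CV of 23% or greater are flagged as potentially unreliable.
    return {"lower": lower, "upper": upper, "cv": cv, "unreliable": cv >= cv_threshold}
```

Calling `summarize(simulate_adjusted_prevalence(...))` returns the interval bounds together with the flag; wider input standard errors push the coefficient of variation past the 23% threshold.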

Results
Among males, estimated HCV RNA prevalence ranged from 0.65% (Illinois) to 3.12% (District of Columbia, Table 1). Among females, reliable prevalence estimates ranged from 0.31% (Wisconsin) to 2.40% (District of Columbia). In all 51 jurisdictions, estimated HCV RNA prevalence was higher in males compared to females (Rate Ratio (RR) > 1, Fig. 1). In all but two jurisdictions (Alaska and District of Columbia) the corresponding rate ratio 95% confidence interval did not include the null value (RR = 1). Of the 51 jurisdictions, 41 have an estimated male-to-female prevalence ratio between 1.5 and 2.5.
State-level HCV RNA prevalence among non-Hispanic white persons ranged from 0.39% (Wisconsin) to 2.31% (Tables 3 and 4). In all jurisdictions except Mississippi and Hawaii, the estimated HCV RNA prevalence was higher among non-Hispanic black persons than among non-Hispanic white persons (RR range from 1.13 to 12.03, Fig. 2). The District of Columbia had the largest relative difference in prevalence between non-Hispanic black and non-Hispanic white persons (RR = 12.03, 95% CI: 8.36-16.51). Race-stratified prevalence estimates are displayed geographically in Fig. 3. Among all state-level race-stratified HCV RNA prevalence estimates, the entire top two deciles are estimates for the non-Hispanic black group (Fig. 3).

Discussion
These results are consistent with previous analyses of chronic HCV infection across demographic groups and provide a more detailed picture of the HCV burden in each US state. Using an NHANES-based approach, Denniston et al. estimated a higher prevalence of chronic HCV in men compared to women, which is consistent with all of our individual state-level estimates [14]. In general, these results indicate that the disparity in HCV prevalence between males and females is homogeneous across the country, with males having roughly double the prevalence of females. This consistency across states is not seen when comparing prevalence of HCV infection by race. Differences in chronic HCV prevalence by subgroup can result from differences in risk behaviors, biological mechanisms, and access to screening and treatment. A higher proportion of men report having ever injected drugs (3.6% vs. 1.6% for women), which is a risk factor for infection [29]. However, female injection drug users have a higher risk of HCV infection than their male counterparts [30]. Additionally, the consistent difference in HCV prevalence between men and women is likely the result of a biological mechanism in HCV infection and viral clearance. Spontaneous acute HCV resolution has been shown to be higher in females (40%) compared to males (19%) [3,31]. The exact mechanism for higher viral clearance in women is unknown, but it has been hypothesized to be related to estrogen [3].
In contrast to the homogeneity seen in the sex-stratified estimates, these results indicate that racial disparities in HCV infection differ by state. For example, while the estimated HCV RNA prevalence among non-Hispanic whites is similar in Rhode Island (1.21%) and Arkansas (1.15%), the estimated prevalence among non-Hispanic blacks differs greatly between the two states (5.32% and 1.65%, respectively). Even areas with similar relative disparities may have very different epidemics. California and North Carolina have similar estimated prevalence rate ratios (2.19 and 2.17, respectively) even though prevalence rates are higher across all three race groups in California.
Racial differences in HCV prevalence could result from a variety of societal and biological factors that lead to differences in exposure to risk factors and in disease progression. While difficult to assess, state-level differences in racial disparities of transmission risk factors (such as injection drug use) could contribute to differences in HCV prevalence. While the risk of HCV infection during incarceration is not well known, several surveys indicate that incarcerated individuals have a higher HCV prevalence than the general population [32]. State-level differences in incarceration rates by race might also track with local differences in HCV prevalence. Acute HCV clearance has been hypothesized to be lower in African Americans compared to Hispanic and white counterparts [33-35]. This is supported by viral clearance being lower in genotype 1 infections, which occur more frequently in African Americans. African Americans are also more likely to be screened for HCV, leading to higher rates of diagnosis [36]. Furthermore, African Americans, Hispanics and Asians are all less likely to receive traditional (interferon-based) treatment than Caucasians [37,38]. This is partially because of a higher rate of treatment ineligibility (due to comorbidities) [37,39] and a lower efficacy for viral genotype 1, which is more common in African Americans [40,41]. African Americans are also more likely to defer interferon-based treatment [42-44] and to receive direct-acting antiviral treatment [45]. To address this disparity, CDC has generated a set of materials and public health messages that specifically aim to reduce the HCV burden in the African American community [46].

Limitations
This approach uses data from several large, public and population-based data sources. However, there are some limitations to consider when interpreting these results. First, these estimates only represent the non-institutionalized population. Data from NHANES do not represent homeless persons or persons in correctional facilities, nursing homes or other institutions in which chronic HCV prevalence may differ from that of the general population [47]. Geographic differences in incarceration rates by race and sex could influence statewide HCV prevalence. However, previous work at the national level demonstrates a promising approach for including underrepresented populations that can be extended to the state level in the future [47]. Second, we aggregated 14 years of NHANES data in order to have a sufficient sample size to produce reliable estimates. Slight changes in local HCV incidence over that time period might not be captured by this approach. In the NHANES data, the national overall prevalence of HCV RNA changed slightly from 1999-2006 (1.26%; 95% CI: 1.06-1.48%) to 2007-2012 (1.05%; 95% CI: 0.82-1.34%). However, there are not enough data to reliably determine whether there are temporal changes in prevalence within subgroups. Interpreting these results in conjunction with further analysis of local risk behaviors will help state health departments better understand their epidemics. Additionally, we do not present separate results for American Indians/Alaska Natives (AI/AN), who face an elevated HCV burden and are a key population identified in the National Viral Hepatitis Action Plan [12]. Due to undersampling and sparse NHANES data within this group, we were not able to separate AI/AN into an additional race category. We combined AI/AN and other undersampled race groups with Hispanic in order to have sufficient data in each stratum of NHANES data. However, 76% of the estimated HCV RNA infections in this combined group were among persons who identified as Hispanic.
Similarly, although we control for birth cohort in our model, we do not present results stratified by age. It is challenging to present meaningful estimates for age because we pooled data across 14 years. Finally, despite pooling data, some individual point estimates might still be unreliable. However, those values have been indicated as potentially unreliable, and confidence intervals have been provided for all estimates. Improved direct surveillance data or additional model input data are needed to overcome the demographic limitations of using NHANES data.