Study design and ethical statements
This study involved human participants, and all procedures were conducted in accordance with the guidance provided by the relevant ethics boards. The Institutional Review Board (IRB) of Seoul National University Bundang Hospital (IRB approval number: X-1911-579-902) and the Health Insurance Review and Assessment Service (NHIS-2020-2-067) approved the study protocol. The requirement for informed consent was waived by the IRB of Seoul National University Bundang Hospital because data analyses were performed retrospectively using anonymized health records derived from the South Korean NHIS database. Data were extracted by an independent medical record technician at the NHIS center who was unaffiliated with this study.
Data source: NHIS-HEALS database and study population
The NHIS-National Health Screening Cohort (NHIS-HEALS) was used in this study [17]. As the sole public insurance system in South Korea, the NHIS collects information regarding demographics; socioeconomic status; diagnoses of diseases according to the International Classification of Diseases, 10th Revision (ICD-10) codes; and treatment for those diseases. Subscribers to the NHIS who are ≥40 years old are recommended to undergo a standardized medical examination every 2 years [18]. Using the results of these examinations, the NHIS constructed the NHIS-HEALS database for medical research. The cohort comprised 514,795 individuals who underwent a standardized medical examination between 2002 and 2003 and were followed up until 2015. The database contains information regarding body mass index (BMI), laboratory test results including hemoglobin, and questionnaires on lifestyle (exercise, alcohol consumption, and smoking). We included individuals who underwent a standardized medical examination during 2002–2003. Individuals who died between 2002 and 2003 or who had missing hemoglobin data were excluded from the analysis.
Exposure: history of anemia
All individuals were divided into two groups: the anemic group (those with a history of anemia) and the non-anemic group. Individuals who had hemoglobin levels < 12 g/dL for women and < 13 g/dL for men during 2002–2003 were considered to have anemia based on the World Health Organization (WHO) criteria. The severity of anemia at that time was categorized as mild (12 g/dL > hemoglobin ≥ 11 g/dL in women and 13 g/dL > hemoglobin ≥ 11 g/dL in men), moderate (hemoglobin 8–10.9 g/dL), or severe (hemoglobin < 8 g/dL), using the WHO criteria [19]. Serum hemoglobin concentration was measured using the cyanmethemoglobin method. If an individual's hemoglobin level was measured twice between 2002 and 2003, the 2003 measurement was used to diagnose and classify anemia.
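The exposure definition above can be expressed as a small classification rule. The following Python sketch is illustrative only and is not the authors' code; the function names and the "F"/"M" sex coding are assumptions made for the example.

```python
from typing import Optional


def index_hemoglobin(hb_2002: Optional[float], hb_2003: Optional[float]) -> Optional[float]:
    """Pick the hemoglobin value (g/dL) used for classification.

    When both years were measured, the 2003 value is used, as described
    in the text. Returns None when neither value is available.
    """
    return hb_2003 if hb_2003 is not None else hb_2002


def classify_anemia(sex: str, hb: float) -> str:
    """Return 'none', 'mild', 'moderate', or 'severe' for hemoglobin in g/dL.

    The sex coding ("F" for women, "M" for men) is an assumption of this sketch.
    """
    if sex not in ("F", "M"):
        raise ValueError("sex must be 'F' or 'M'")
    # WHO cut-offs cited in the text: women < 12 g/dL, men < 13 g/dL.
    threshold = 12.0 if sex == "F" else 13.0
    if hb >= threshold:
        return "none"
    if hb >= 11.0:
        return "mild"
    if hb >= 8.0:
        return "moderate"
    return "severe"
```

For example, a man with a hemoglobin level of 12.5 g/dL would fall into the mild category, whereas a woman with the same value would be classified as non-anemic.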
Study endpoint: mortality due to infection
In this study, mortality due to a primary infection was considered as the study endpoint. The NHIS database provided data on the death date and main cause of death for all individuals. Mortality due to infection was evaluated over a period of 12 years, from January 1, 2004 to December 31, 2015. The specific diagnoses for mortality due to infection are presented as ICD-10 codes in Table S1.
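As an illustration of how the endpoint translates into time-to-event data, the sketch below derives a follow-up duration and an event indicator for one individual. It is a minimal example under stated assumptions, not the authors' implementation: the function name is hypothetical, the infection flag is assumed to have been derived beforehand from the ICD-10 codes in Table S1, and deaths from other causes are assumed to be censored at the date of death.

```python
from datetime import date
from typing import Optional, Tuple

FOLLOW_UP_START = date(2004, 1, 1)
FOLLOW_UP_END = date(2015, 12, 31)


def time_to_event(
    death_date: Optional[date],
    infection_death: bool,
    start: date = FOLLOW_UP_START,
    end: date = FOLLOW_UP_END,
) -> Tuple[int, int]:
    """Return (duration_in_days, event) for one individual.

    event is 1 for death due to infection within follow-up, otherwise 0.
    Censoring non-infection deaths at the death date is an assumption.
    """
    if death_date is not None and death_date < start:
        raise ValueError("death before the start of follow-up")
    if death_date is None or death_date > end:
        # Alive at the end of follow-up: administratively censored.
        return (end - start).days, 0
    return (death_date - start).days, 1 if infection_death else 0
```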
Covariates
The following variables were collected as covariates for this study: demographic information (age, sex, and BMI), socioeconomic status-related information (residence and annual income level), comorbidity-related information (underlying disability and Charlson comorbidity index), and lifestyle information (smoking status, alcohol consumption, and exercise frequency). Residence was divided into three groups (Seoul, other metropolitan cities, and other areas), and BMI was categorized into four groups (< 18.5, 18.5–24.9, 25.0–29.9, and ≥ 30 kg/m²). The national income level was registered in the NHIS database to determine the insurance premium of all individuals. Annual income level was divided into five groups using quintiles (1st: 0–20% [lowest], 2nd: 20–40%, 3rd: 40–60%, 4th: 60–80%, and 5th: 80–100% [highest]), and underlying disability was divided into two groups (mild to moderate, and severe). In South Korea, all physical disabilities must be registered in the NHIS for individuals to receive various benefits, and they are divided into six levels according to severity. Thus, in this study, disabilities in the 1st (most severe) to 3rd levels were classified in the severe disability group, while those in the 4th to 6th (mildest) levels were classified in the mild to moderate disability group. Smoking status was divided into four groups (never smoked, previous smoker, current smoker, and unknown [no-response group]), and alcohol consumption was divided into six groups (does not drink, 2–3 drinks per month, 1–2 drinks per week, 3–4 drinks per week, drinks almost every day, and unknown [no-response group]). Exercise frequency was divided into six groups (no exercise, 1–2 times per week, 3–4 times per week, 5–6 times per week, almost every day, and unknown [no-response group]). The Charlson comorbidity index was calculated using ICD-10 codes registered during 2002–2003, as shown in Table S2 [20].
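The categorization rules described above might be coded along the following lines. This Python sketch is illustrative and rests on assumptions not stated in the text: the function names are hypothetical, a BMI of exactly 30.0 kg/m² is assigned to the highest category, income is supplied as a percentile with quintile boundaries treated as half-open intervals, and individuals without a registered disability are returned as None.

```python
from typing import Optional


def bmi_category(bmi: float) -> str:
    """Map BMI (kg/m^2) to the four groups used in the text."""
    if bmi < 18.5:
        return "<18.5"
    if bmi < 25.0:
        return "18.5-24.9"
    if bmi < 30.0:
        return "25.0-29.9"
    return ">=30"  # assumption: exactly 30.0 belongs to the top group


def income_quintile(percentile: float) -> int:
    """Map an income percentile (0-100) to quintile 1 (lowest) to 5 (highest).

    Assumption: boundaries are half-open, e.g. 20.0 falls in the 2nd quintile.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be within 0-100")
    return min(int(percentile // 20) + 1, 5)


def disability_group(level: Optional[int]) -> Optional[str]:
    """Collapse the six registered disability levels into two groups.

    Levels 1-3 are 'severe'; levels 4-6 are 'mild to moderate'.
    None (no registered disability) is passed through; this is an assumption.
    """
    if level is None:
        return None
    if level not in range(1, 7):
        raise ValueError("disability level must be 1-6")
    return "severe" if level <= 3 else "mild to moderate"
```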
Statistical analysis
The clinico-epidemiological characteristics of the individuals are presented as mean values with standard deviations for continuous variables and numbers with percentages for categorical variables. First, we performed 1:1 propensity score (PS) matching between the anemic group (those with a history of anemia) and the non-anemic group to reduce confounding [21]. For PS matching, the nearest neighbor method was used without replacement, with a caliper of 0.25. All covariates were included in the PS model, and logistic regression analysis was performed to calculate the PSs. The absolute value of the standardized mean difference (ASD) was used to evaluate the balance between the groups before and after PS matching. An ASD of < 0.1 was taken to indicate adequate balance between the groups. After confirming adequate balance, we performed Cox proportional hazards regression analysis for mortality due to infection in the PS-matched cohort. In this time-to-event analysis, death due to infection was set as the event, and the time from January 1, 2004 to the date of death was set as the duration. As a first sensitivity analysis, we investigated the association between a history of anemia and mortality due to infection during 2005–2015, rather than 2004–2015, in the PS-matched cohort. This analysis was intended to avoid reverse causation bias, because the latency between the anemia assessment and deaths due to infection in 2004 was short [22].
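To make the matching and balance-checking steps concrete, the following Python sketch shows one way greedy 1:1 nearest-neighbor matching without replacement and the ASD could be computed. It is not the authors' R code. The propensity scores are taken as given (in the study they were estimated by logistic regression), the function names are hypothetical, and two details are assumptions of this example: the caliper of 0.25 is interpreted in units of the standard deviation of the pooled propensity scores, and individuals in the anemic group are matched in input order.

```python
from math import sqrt
from statistics import mean, pstdev, variance
from typing import List, Sequence, Tuple


def match_nearest(
    ps_treated: Sequence[float],
    ps_control: Sequence[float],
    caliper: float = 0.25,
) -> List[Tuple[int, int]]:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (treated_index, control_index) pairs. A pair is accepted only
    if the score difference is within caliper * SD of the pooled scores
    (assumed interpretation of the caliper).
    """
    width = caliper * pstdev(list(ps_treated) + list(ps_control))
    available = set(range(len(ps_control)))
    pairs = []
    for i, p in enumerate(ps_treated):
        if not available:
            break
        # Closest unused control; ties are broken by the lower index.
        j = min(available, key=lambda k: (abs(ps_control[k] - p), k))
        if abs(ps_control[j] - p) <= width:
            pairs.append((i, j))
            available.remove(j)  # without replacement
    return pairs


def abs_std_mean_diff(x_treated: Sequence[float], x_control: Sequence[float]) -> float:
    """ASD = |mean_t - mean_c| / sqrt((var_t + var_c) / 2)."""
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2)
    return abs(mean(x_treated) - mean(x_control)) / pooled_sd
```

In this sketch, a covariate would be regarded as balanced when `abs_std_mean_diff` computed on the matched groups falls below 0.1. Individuals in the anemic group without a control inside the caliper remain unmatched and drop out of the matched cohort.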
As a second sensitivity analysis, we constructed a multivariable Cox regression model for mortality due to infection using the entire cohort to determine: (1) whether the results obtained from the PS-matched cohort were generalizable to the entire cohort, and (2) the risk of mortality due to infection in the anemic group in the context of other important covariates, rather than in isolation. All covariates were included in the multivariable Cox model for adjustment. Using multivariable Cox regression modeling, we performed subgroup analyses to investigate whether past mild, moderate, and severe anemia were each associated with mortality due to infection compared with the non-anemic group. In addition, because sex is associated with the development of anemia [23], we performed a subgroup analysis stratified by sex to examine the impact of sex on the association between history of anemia and mortality due to infection. We confirmed that there was no multicollinearity in any of the multivariable models involving the entire cohort, with all variance inflation factors < 2.0. The results of the Cox regression are presented as hazard ratios (HRs) with 95% confidence intervals (CIs). The C-statistic (C-index) was calculated to evaluate the multivariable Cox regression model. All statistical analyses were performed using R software (version 4.0.3 with R packages, the R Project for Statistical Computing, Vienna, Austria). P < 0.05 was considered statistically significant.
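For readers unfamiliar with the C-index, the sketch below computes Harrell's concordance index from follow-up durations, event indicators, and model risk scores. The text does not specify which concordance estimator was used, so the choice of Harrell's version is an assumption, as is the function name. Pairs with tied follow-up times are ignored for simplicity, and the quadratic loop is meant for illustration rather than for a cohort of this size.

```python
from typing import Sequence


def harrell_c_index(
    durations: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when i had the event and a shorter
    follow-up time than j. Higher risk scores should mean earlier events.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to ranking no better than chance, and a value of 1.0 corresponds to perfect ranking of event times by predicted risk.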