Etiological analysis and predictive diagnostic model building of community-acquired pneumonia in adult outpatients in Beijing, China

Background: Etiological epidemiology and diagnosis are important issues in adult community-acquired pneumonia (CAP), and identifying pathogens based on patient clinical features is especially challenging. The main CAP-associated pathogens in adults include viruses as well as bacteria. However, large-scale epidemiological investigations of adult viral CAP in China are still lacking. In this study, we analyzed the etiology of adult CAP in Beijing, China and constructed diagnostic models based on combinations of patient clinical factors.
Methods: A multicenter cohort was established with 500 adult CAP outpatients enrolled in Beijing between November 2010 and October 2011. Multiplex and quantitative real-time fluorescence PCR were used to detect 15 respiratory viruses and Mycoplasma pneumoniae, respectively. Bacteria were detected with culture and enzyme immunoassay of the Streptococcus pneumoniae urinary antigen. Univariate analysis, multivariate analysis, discriminatory analysis and Receiver Operating Characteristic (ROC) curves were used to build predictive models for etiological diagnosis of adult CAP.
Results: Pathogens were detected in 54.2% (271/500) of study patients. Viruses accounted for 36.4% (182/500), Mycoplasma pneumoniae for 18.0% (90/500) and bacteria for 14.4% (72/500) of the cases. In the 182 patients with viruses, 219 virus strains were detected, including 166 single and 53 mixed viral infections. Influenza A virus represented the greatest proportion, with 42.0% (92/219) and 9.1% (20/219) in single and mixed viral infections, respectively. Factors selected for the predictive etiological diagnostic model of viral CAP included cough, dyspnea, absence of chest pain and white blood cell count (4.0-10.0) × 10^9/L, and those for Mycoplasma pneumoniae CAP were age younger than 45 years and the absence of a coexisting disease. However, these models showed low accuracy levels for etiological diagnosis (areas under the ROC curve for virus and Mycoplasma pneumoniae were both 0.61, P < 0.05).
Conclusions: Greater consideration should be given to viral and Mycoplasma pneumoniae infections in adult CAP outpatients. While predictive etiological diagnostic models of viral and Mycoplasma pneumoniae CAP based on combinations of demographic and clinical factors may provide indications of etiology, diagnostic confirmation of CAP remains dependent on laboratory pathogen test results.


Background
Community-acquired pneumonia (CAP) has a major impact on public health and results in more than 10 million visits to physicians and 600,000 hospitalizations each year in the USA [1], where it is the seventh leading cause of death, and its economic burden has been estimated to be more than $17 billion annually [2,3]. Etiological epidemiology and diagnosis are important issues in adult CAP, with particular challenges in identifying the causative pathogens based on patient clinical features. Studies conducted prior to the year 2000 showed that the predominant pathogen of CAP was Streptococcus pneumoniae, which accounted for approximately two-thirds of the detected CAP pathogens [4][5][6]. Later, atypical pathogens began showing increasing trends among CAP cases [7]. In the last two decades, with the rapid development of molecular diagnostic techniques and the outbreaks of severe acute respiratory syndrome (SARS) coronavirus, avian influenza A (H5N1) virus and the 2009 pandemic influenza A (H1N1) virus, viral CAP has garnered more attention. Shin [8] pointed out that if clinicians do not consider a viral etiology in patients with CAP, they would be unlikely to consider investigations to diagnose respiratory viruses, which would result in the inappropriate use of antibiotics, missed opportunities to consider antiviral treatment and failure to institute appropriate infection control measures. Ruuskanen et al. [9] also stressed the importance of viral CAP and the urgent need to gather epidemiological data on etiological pathogens from developing countries. At present, China still lacks large-scale epidemiological investigations of adult viral CAP. Furthermore, antibiotic resistance is a serious issue in China, as 24.9% of S. pneumoniae has been found to be resistant to penicillin and 87.5% to macrolides [10], while the macrolide resistance rate of Mycoplasma pneumoniae (M. pneumoniae, MP) in adults has been estimated at 69% [11]. Accordingly, underestimation of the prevalence of viral CAP and utilization of inappropriate treatment may increase the spread of antibiotic resistance.
Based on the considerations above, we investigated the etiology of adult CAP in Beijing, China from November 2010 to October 2011. Furthermore, combinations of clinical factors of this population were analyzed in order to build etiological diagnostic models to predict the viral or M. pneumoniae sources of infection.

Study patients
Five hundred adult outpatients were enrolled between November 2010 and October 2011 from 12 hospitals, including 8 teaching hospitals and 4 secondary hospitals, in Beijing, China. These 12 hospitals serve approximately 20 million outpatients annually. Our study subjects were from the outpatient departments of infectious diseases or respiratory diseases of these hospitals, at which at least 20,000 CAP patients are seen annually.
The inclusion criteria were as follows: 1) patients were at least 18 years old; 2) site-of-care decisions for outpatients were made according to the 2007 Infectious Diseases Society of America/American Thoracic Society (IDSA/ATS) guidelines on the management of CAP in adults [12]; 3) CAP was defined by a new infiltrate on a chest X-ray examined by two radiologists and the presence of one of the following clinical characteristics: new cough or aggravated cough with or without sputum production; fever (> 37.8°C) or hypothermia (< 35.6°C); leukocytosis (> 10 × 10^9/L) or leukopenia (< 4 × 10^9/L) [13]; 4) patients agreed to participate in this investigation and accepted the laboratory tests and etiological examination voluntarily. Patients were excluded if they were HIV infected or in an immunosuppressed state, had clinical symptoms for more than 1 week from the time of onset, were pregnant or in the lactation period, had been hospitalized within the prior 90 days (hospital stay longer than 2 days), lived in a nursing home or rehabilitation hospital, or had been previously treated with antivirals.
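The clinical part of the inclusion criteria (criteria 1 and 3) can be read as a simple boolean rule. The following is a minimal sketch, not the study's screening procedure: the `Presentation` record and its field names are hypothetical, and "one of the following clinical characteristics" is assumed to mean at least one.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Hypothetical record of one outpatient visit (field names are assumptions)."""
    age_years: int
    new_infiltrate_on_xray: bool
    new_or_worse_cough: bool
    temperature_c: float
    wbc_1e9_per_l: float  # white blood cell count in units of 10^9/L


def meets_cap_definition(p: Presentation) -> bool:
    """Apply criteria 1 and 3: adult, new infiltrate, and at least one clinical feature."""
    abnormal_temperature = p.temperature_c > 37.8 or p.temperature_c < 35.6
    abnormal_wbc = p.wbc_1e9_per_l > 10.0 or p.wbc_1e9_per_l < 4.0
    has_clinical_feature = (
        p.new_or_worse_cough or abnormal_temperature or abnormal_wbc
    )
    return (
        p.age_years >= 18
        and p.new_infiltrate_on_xray
        and has_clinical_feature
    )
```

The exclusion criteria (for example, symptoms lasting more than 1 week or prior antiviral treatment) would be applied as a separate filter and are not modeled here.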

Data collection
Data on demographic factors (sex, age, smoking, coexisting disease and antibiotic pretreatment), clinical symptoms and signs (fever, body temperature, max temperature, heart rate, respiratory rate, systolic pressure ≤ 90 mmHg, cough, expectoration, dyspnea, chest pain, diarrhea, vomiting, dizziness, headache, moist rales and dry rales) and laboratory test results [white blood cell (WBC) counts, neutrophils, lymphocytes, hematocrit, platelets, alanine aminotransferase, aspartate aminotransferase, blood creatinine, blood sodium, pH, PaO2, erythrocyte sedimentation rate, C-reactive protein, prothrombin time and activated partial thromboplastin time] were collected using data abstraction forms for patients meeting the inclusion criteria. This process was accompanied by stringent quality controls, including training specific doctors to record the information; selecting study patients strictly in accordance with the inclusion and exclusion criteria; using standardized abstraction forms to guide data collection in case of conflicting, ambiguous or missing information; defining coexisting disease, based on specialist opinion, as tumor, coronary heart disease, heart failure, cerebrovascular disease, chronic nephropathy, chronic hepatopathy, diabetes mellitus, CAP hospitalization within the prior year, chronic obstructive pulmonary disease or autoimmune disease; ensuring that laboratory tests were obtained from a nationally accredited laboratory; and establishing a monitoring system for the collection of information.

Sample collection
A single throat swab, using Sterile Rayon Swabs (167KS01, Guangzhou, China), was collected from each study patient before receiving antivirals or antibiotic drugs. The swab was immediately placed in a virus transport media tube (167KS01, Guangzhou, China). Each sample was frozen at −80°C within 24 h until analyzed.
Sputum or blood was obtained for bacterial culture before antibiotic therapy; blood culture was usually performed when the patient's temperature was higher than 38.5°C. After gargling, patients were instructed to expectorate deeply into a sterile, dry, impermeable, non-absorbent container within 2 h before the test. Ten milliliters of blood was obtained and inoculated into two culture bottles (5 mL into an aerobic bottle and 5 mL into an anaerobic bottle). Thirty minutes later, an additional 10 mL of blood was obtained from a different site, and the procedure was repeated. Altogether, we obtained 369 sputum samples and 85 sets of blood samples. Clean-catch midstream urine samples for S. pneumoniae testing were obtained 1 day after enrollment for all study patients.
DNA and RNA were extracted from samples using the QIAamp DNA Mini kit (Cat. No. 51306, Qiagen, Hilden, Germany) and QIAamp Viral RNA Mini Kit (Cat. No. 52906, Qiagen), following the manufacturer's instructions. The extracted RNA was used as the template for reverse transcriptase polymerase chain reaction (RT-PCR) with a commercial kit (Fermentas, Shenzhen, China) as follows. Total RNA (8 μl), random hexamers (1 μl of 0.2 μg/μl) and DEPC-treated water (3 μl) were added to an RT tube on ice, incubated at 80°C for 3 min and then chilled on ice again for 2 min. Thereafter, 5× RT buffer (4 μl), 10 mM dNTP (2 μl), RNase inhibitor (1 μl of 20 U/μl) and reverse transcriptase (1 μl of 200 U/μl) were added to the tube, which was then incubated at 37°C for 90 min, followed by 94°C for 2 min. After chilling on ice for a further 2 min, the complementary DNA (cDNA) generated from the reverse transcription was stored at −20°C until ready for use.
The PCR amplification system included the cDNA template (3 μl), 5× RV Primer (4 μl), 8-Mop Solution (3 μl) and 2× Multiplex Master Mix (10 μl), and the conditions were: 94°C for 15 min, followed by 40 cycles of 94°C for 0.5 min, 60°C for 1.5 min and 72°C for 1.5 min, with a final extension at 72°C for 10 min. The product was stored at 4°C until used. All PCR products, markers and negative controls were analyzed by electrophoresis in 1% agarose gels stained with ethidium bromide and visualized using the Molecular Imager Gel Doc XR System (Bio-Rad 170-8170, Hercules, CA, USA). The Quantitative Diagnosis Kit for M. pneumoniae DNA (PCR Fluorescence Probing, Da An Co., Ltd, Guangzhou, China) was used for M. pneumoniae detection with the following PCR amplification system: DNA template (2 μl), MP PCR reaction mix (40 μl) and Taq enzyme (3 μl). The PCR reaction was carried out using a quantitative PCR instrument (ABI Prism 7500, USA) with the following conditions: 93°C for 2 min, followed by 10 cycles of 93°C for 45 sec and 55°C for 1 min, and another 30 cycles of 93°C for 0.5 min and 55°C for 10 min.
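As a consistency check on the reverse transcription and multiplex PCR recipes described above, the component volumes can be summed and the nominal program time computed. This is only an illustrative sketch: the variable and function names are hypothetical, and the time estimate assumes hold times only, ignoring instrument ramp times.

```python
# Reverse transcription mix, in microlitres, as listed in the text.
rt_first_step = {"total RNA": 8, "random hexamers": 1, "DEPC-treated water": 3}
rt_second_step = {
    "5x RT buffer": 4,
    "dNTP": 2,
    "RNase inhibitor": 1,
    "reverse transcriptase": 1,
}
rt_total_ul = sum(rt_first_step.values()) + sum(rt_second_step.values())

# A 5x buffer is at 1x when it makes up one fifth of the final volume.
rt_buffer_final_x = 5 * rt_second_step["5x RT buffer"] / rt_total_ul

# Multiplex PCR mix, in microlitres, as listed in the text.
pcr_mix = {
    "cDNA template": 3,
    "5x RV Primer": 4,
    "8-Mop Solution": 3,
    "2x Multiplex Master Mix": 10,
}
pcr_total_ul = sum(pcr_mix.values())


def program_minutes(initial, cycles, step_minutes, final):
    """Nominal hold time of a cycling program (ramp times not included)."""
    return initial + cycles * sum(step_minutes) + final


# 94°C 15 min; 40 cycles of 0.5 + 1.5 + 1.5 min; final extension 10 min.
multiplex_minutes = program_minutes(15, 40, (0.5, 1.5, 1.5), 10)

print(rt_total_ul, rt_buffer_final_x, pcr_total_ul, multiplex_minutes)
```

Both mixes come to 20 μl, so the 5× and 2× reagents end up at 1× in the final volume, and the multiplex program holds for 165 min in total.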

Detection of bacteria
All sputum samples were examined by microscopy, and representative sputum originating from the lower respiratory tract was defined as that containing > 25 granulocytes and < 10 epithelial cells per field of view under a low power microscope. The standard four zoning line method was used for semi-quantitative culture. Conventional blood culture was processed in an automated system. S. pneumoniae urine antigen was detected using an enzyme immunoassay. Detection of bacteria was performed at each hospital laboratory.
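The sputum quality criterion stated above reduces to two thresholds per low-power field. A minimal sketch follows; the function and parameter names are hypothetical, and the counts are assumed to be per-field values as in the text.

```python
def is_representative_sputum(granulocytes_per_field: int,
                             epithelial_cells_per_field: int) -> bool:
    """Return True if the sample meets the lower respiratory tract criterion.

    Criterion from the text: > 25 granulocytes and < 10 epithelial cells
    per field of view under a low power microscope.
    """
    return granulocytes_per_field > 25 and epithelial_cells_per_field < 10
```

Both thresholds are strict, so a sample with exactly 25 granulocytes or exactly 10 epithelial cells per field would not qualify under this reading.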

Statistical analysis
Statistical analysis was performed using SPSS statistical software version 16.0 (SPSS Inc., Chicago, IL, USA). Discrete variables were expressed as counts (percentage) and continuous variables as means ± SD or median (interquartile range). Frequency comparisons were made with the chi-square test. Two-group comparisons of normally distributed data were performed with the independent samples t-test. For multi-group comparisons, one-way analysis of variance (ANOVA) with the least significant difference post hoc test was applied. For data not normally distributed, the Mann-Whitney U test was used if only two groups were compared, and the Kruskal-Wallis test was used if more than two groups were compared. Variables with P values less than 0.2 in univariate analysis were entered into multivariate logistic regression analysis. The removal probability for multivariate stepwise logistic regression analysis was 0.1. Discriminatory analysis and Receiver Operating Characteristic (ROC) curves were used to build and assess the predictive etiological diagnostic models. A value of P < 0.05 was considered statistically significant. For convenience in the analysis, pathogens were grouped as follows: CoV 229E/NL63, CoV OC43 (CoV); PIV1, 2, 3, 4 (PIV); RSVA and RSVB (RSV).
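For readers less familiar with ROC analysis, the area under the curve can be understood as the probability that a randomly chosen patient with the target pathogen receives a higher model score than a randomly chosen patient without it. The sketch below computes this directly by pairwise comparison. It is an illustration in plain Python, not the SPSS procedure used in the study, and the function name and example scores are hypothetical.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison.

    Each (positive, negative) pair scores 1 if the positive case has the
    higher model score, 0.5 for a tie and 0 otherwise; the AUC is the mean.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for sp in scores_positive:
        for sn in scores_negative:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical scores for three pathogen-positive and three negative patients.
example_auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(round(example_auc, 3))
```

An AUC of 0.5 corresponds to a model that ranks patients no better than chance, and 1.0 to perfect separation of the two groups.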

Ethics statement
The study protocol was in accordance with the Declaration of Helsinki and was approved by the ethics committees of the 12

Distribution of detected viruses
We detected 219 virus strains in the 182 patients with viruses, including 166 single and 53 mixed viral infections.

Patient demographic and clinical factors
We collected data from the 500 patients, including demographic characteristics, symptoms, signs and laboratory test results, as shown in Tables 1, 2, and 3. By comparing symptoms, physical examinations and laboratory test results of monomicrobial infections, we found that CAP patients infected with M. pneumoniae were significantly younger than those with viruses or bacteria [M. pneumoniae

Predictive diagnostic model of viral CAP
We performed univariate logistic analysis of all variables in Tables 1, 2, and 3 for viral CAP and selected the following variables with P < 0.2 for multivariate logistic regression analysis: coexisting disease, cough, expectoration, dyspnea, chest pain, dry rales and WBC counts (Table 4). The multivariate logistic analysis identified four independent factors with P < 0.1 associated with viral CAP, namely cough (OR 2.40, P < 0.1), dyspnea (OR 2.20, P < 0.05), chest pain (OR 0.54, P < 0.1) and WBC counts (OR 0.93, P < 0.05) (Table 5). Subsequently, discriminatory analysis and ROC curves were used to establish and assess predictive diagnostic models of CAP by using these four independent factors. The predictive diagnostic model of viral CAP included cough, dyspnea, absence of chest pain and WBC count (4.0-10.0) × 10^9/L, with an area under the ROC curve (AUC) of 0.61 (95% CI: 0.55 to 0.68). The sensitivity and specificity of this model were 37.4% (95% CI: 29.1% to 45.7%) and 77.2% (95% CI: 71.5% to 82.9%), respectively (P < 0.05).
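Sensitivity and specificity are proportions, and a 95% confidence interval of the kind reported above can be obtained with the normal-approximation (Wald) formula, p ± 1.96 × sqrt(p(1 − p)/n). The sketch below assumes that method, which this section does not confirm, and uses hypothetical counts because the underlying 2 × 2 table is not given here. The counts were chosen only so that the result lands near the reported sensitivity; they are not the study's data.

```python
import math


def proportion_with_wald_ci(successes: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the Wald interval.

    The interval is clipped to [0, 1]; z = 1.96 gives a 95% interval.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts (assumption): model-positive among pathogen-positive.
true_positives, false_negatives = 49, 82
sensitivity, sens_low, sens_high = proportion_with_wald_ci(
    true_positives, true_positives + false_negatives
)
print(f"{sensitivity:.1%} (95% CI: {sens_low:.1%} to {sens_high:.1%})")
```

Specificity would be handled the same way, with true negatives as the numerator and all pathogen-negative patients as the denominator.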

Predictive diagnostic model of M. pneumoniae CAP
Univariate logistic analysis was performed with all variables in Tables 1, 2, and 3 for M. pneumoniae CAP, and the following variables with P < 0.2 were selected for multivariate logistic regression analysis: age, coexisting disease, smoking, fever, max temperature, neutrophil and C-reactive protein (Table 4). Through the multivariate logistic analysis, two independent factors with P < 0.1 associated with M. pneumoniae CAP were identified, namely age (OR 0.94, P < 0.05) and coexisting disease (OR 0.33, P < 0.1) (

Discussion
In this study, at least one pathogen was found in 54.2% of the patients, and viruses accounted for most of the [14][15][16]. Available large-scale epidemiological investigations of adult viral CAP have mainly been conducted in developed countries. In China, Cao et al. [17] found that the most common pathogen was M. pneumoniae (29.4%), and respiratory viruses were the second most prevalent (9.6%) in 197 outpatients with CAP. In this study, with a larger sample size, we found a greater proportion of virus-associated CAP. Previously, the incidence of CAP due to atypical pathogens among 4,337 patients worldwide between September 1996 and April 2004 was found to be 22% [18]. Another multicenter study on pathogenic agents in 665 adult patients with CAP in China between December 2003 and November 2004 showed that M. pneumoniae was the most common type of pathogen (20.7%) [19]. Bao et al. also found M. pneumoniae to be the leading etiological pathogen among 402 febrile outpatients with CAP in China from January 2007 to January 2008 [20]. In our study, M. pneumoniae was the second most prevalent etiologic agent at 18%, indicating that attention should still be paid to M. pneumoniae infections in adult CAP outpatients in China. In the last decade, changes in the prevalence of etiological agents of CAP have been observed, such as the decreasing incidence of bacterial CAP and increasing trends in atypical respiratory pathogens and respiratory viruses [21,22]. These changes may be accounted for in part by the emergence of PCR technology, either in single or multiplex format, which has greatly improved the sensitivity of diagnostic tests for respiratory viruses (e.g., influenza virus, parainfluenza virus, adenovirus), especially for those that are hard to culture, such as rhinoviruses, coronaviruses and metapneumoviruses [23][24][25].
Another explanation for the observed changes in CAP-related pathogens is that the use of oral antibiotics by patients at the beginning of the febrile episode can significantly reduce the sensitivity of culture [16,26]. As only about 11% of patients with CAP will usually have positive blood cultures, which are more commonly associated with severe illness [27], it is difficult to obtain a positive blood culture in mild to moderate cases of CAP. In our study, two-thirds of the study patients were previously treated with antibiotics, which significantly limited the sensitivity and specificity of detection. Mixed infections have been reported in many previous studies [28][29][30]. In our current investigation, 13.4% of study patients had mixed infections. However, interactions between different pathogens in vivo are poorly understood. It is unclear whether a virus alone causes pneumonia or acts in conjunction with other respiratory pathogens, and a favored hypothesis is that a viral infection is followed by a secondary bacterial infection [9,[31][32][33].
In this study, we detected 15 types of respiratory viruses and analyzed their distribution to supplement the epidemiological investigations of adult viral CAP in China. Consistent with previous studies [9,30], we have shown that Flu A was the most commonly detected virus associated with CAP and have drawn attention to this particular pathogen as a cause of pneumonia, especially in the winter. Rhinovirus infections, which are usually limited to the upper respiratory tract but also can cause pneumonia [34], ranked second. The virus distribution in our study was similar to that in previous reports [35][36][37].
A major current challenge in determining the etiological pathogens of adult CAP centers on making the diagnosis based on patient clinical features [38]. In our study, by comparison of monoinfections, we found that M. pneumoniae CAP patients were significantly younger than viral or bacterial CAP patients. Cough with expectoration was more common in bacterial CAP patients than in the other two groups, and dyspnea was significantly different between individuals with viral and bacterial CAP. Similar conclusions have been made in previous reports, yet the results differed among study populations [39,40]. Therefore, exploring etiological diagnostic models based on combinations of clinical factors in order to identify the causative pathogens has been a focus of CAP research. While diagnostic models of bacterial CAP have been established [41][42][43], data for such predictive models for viral and M. pneumoniae CAP are still lacking. In our study, we attempted to build viral and M. pneumoniae diagnostic models based on combinations of clinical characteristics from a study population in China. Factors for the predictive etiological diagnostic model of viral CAP consisted of cough, dyspnea, absence of chest pain and WBC counts (4.0-10.0) × 10^9/L, and those for M. pneumoniae CAP included being younger than 45 years old and the absence of a coexisting disease. Corresponding predictors for each model have been reported in the literature. For example, Yang et al. [44] described the association of cough with Flu A infection. Johnstone et al. [40] reported that a viral infection was usually accompanied [45]. Meanwhile, Cao et al. [17] reported that M. pneumoniae infection was most common in young pneumonia patients without a coexisting disease. In our study, we combined these clinical characteristics to build the diagnostic models.
From the AUC of these models, we can see that with the use of clinical characteristics alone, it would be difficult to determine the causative agent(s) accurately, but they can provide a preliminary etiological diagnosis for CAP patients before laboratory results are available. As the accuracy of CAP etiological diagnostic models may be dependent on treatment type (inpatient or outpatient), age and the number of patient samples [41][42][43], the specificity and sensitivity of such models need to be further studied. Currently, etiological diagnosis of CAP still depends on laboratory pathogen test results. While the data obtained were informative, this study had several limitations. First, we did not investigate the pathogen prevalence in asymptomatic adults from the same population. However, Lieberman et al. had previously found a significantly lower proportion of respiratory viruses in asymptomatic control subjects (7.1%) than in CAP patients [31.7% with at least one respiratory virus, including influenza virus (4.4% vs. 0.4% in control) and rhinovirus (4.9% vs. 2.0% in control)] [15]. Second, we used different assays (multiplex PCR, quantitative real-time fluorescence PCR, culture and S. pneumoniae urinary antigen immunoassay) for different pathogens but did not compare the differences in sensitivity and specificity between these microbiological techniques. Third, two-thirds of study patients self-medicated with antibiotics, which significantly limited the sensitivity and specificity of bacterial detection assays. Fourth, microbiological analysis was not performed for certain respiratory pathogens, such as Chlamydia pneumoniae, Legionella pneumophila and Mycobacterium tuberculosis. Fifth, as the individuals in this population were not severely ill, the findings of this study may not be generalizable to hospitalized or severely ill CAP patients.
Sixth, the data collection was limited to one year, and the number of study patients may have been insufficient to draw firm conclusions. Therefore, further studies with a larger sample size will be needed to confirm and extend our findings.

Conclusions
In this study, the distribution of CAP-associated pathogens in adults was consistent with trends from other etiological studies showing that bacterial CAP is decreasing, while CAP-associated respiratory viruses and M. pneumoniae are increasing. The results indicate that potential viral and M. pneumoniae infections should be given more attention in adult CAP outpatients. Our survey of 15 types of respiratory viruses and analysis of their distribution supplement the epidemiological investigations of adult viral CAP in China. Among the virus strains detected, Flu A virus was found to be the most prevalent from November 2010 to March 2011, emphasizing the importance of this virus in the winter. As identifying the potential causative pathogens of CAP is of major clinical value, we also made progress on building viral and M. pneumoniae diagnostic models based on combinations of clinical characteristics. These two models revealed that it would be difficult to arrive at the diagnosis accurately using clinical characteristics alone, but they can provide a preliminary indication of the etiological pathogen(s) before laboratory results become available. At the same time, our results highlight the need to conduct etiological examinations not only for bacteria but also for viruses and M. pneumoniae in the etiological diagnosis of CAP patients.

Competing interest
The authors declare that they have no competing interests.

Authors' contributions
YFL participated in the experiments, analyzed the data and drafted the initial manuscript. YG conceived and designed the study, helped to analyze the data and modified the manuscript. MFC supervised enrollment and data collection. BC helped to design the study, coordinated the experiments and contributed reagents and materials. XHY participated in enrollment and data collection and helped to analyze the data. LW participated in the design and coordination of the study. All authors have read and approved the final manuscript.