The diagnostic value of metagenomic next-generation sequencing in infectious diseases

Background Although traditional techniques for diagnosing infection are mature and inexpensive, most of them are time-consuming and have a low positive rate. Metagenomic next-generation sequencing (mNGS) has been widely studied because it can identify and type pathogens without relying on culture and retrieves all DNA without bias. We therefore aimed to compare mNGS with the traditional culture method and to explore the relationship between mNGS results and the severity and prognosis of patients with infection. Methods A total of 109 adult patients were enrolled at Shanghai Tenth People’s Hospital from October 2018 to December 2019. Diagnostic results, negative predictive values, positive predictive values, false positive rates, false negative rates, pathogens and sample types were analyzed for both traditional culture and mNGS. Samples and clinical information were then collected from the 93 patients in the infected (ID) group. According to whether mNGS detected pathogens, patients in the ID group were divided into a positive group of 67 cases and a negative group of 26 cases. Peripheral blood leukocytes, C-reactive protein (CRP), procalcitonin (PCT) and neutrophil counts were measured, and the serum concentrations of IL-2, IL-4, IL-6, TNF-α, IL-17A, IL-10 and IFN-γ were determined by ELISA. The correlations between positive pathogen detection by mNGS and severity of illness, length of hospital stay and mortality were analyzed. Results The 109 samples were assigned to an infected group (ID, 92/109, 84.4%), a non-infected group (NID, 16/109, 14.7%) and an unknown group (1/109, 0.9%). Blood was the most common sample type (37 cases), followed by bronchoalveolar lavage fluid (36 cases), tissue, sputum, pleural effusion, cerebrospinal fluid (CSF), pus, bone marrow and nasal swab. In the ID group, most patients were diagnosed with lower respiratory tract infections (73/109, 67%), followed by bloodstream infections, pleural effusion and central nervous system infections. The sensitivity of mNGS was significantly higher than that of culture (67.4% vs 23.6%; P < 0.001), especially in bronchoalveolar lavage fluid (P = 0.002), blood (P < 0.001) and sputum (P = 0.037), whereas the specificity of mNGS did not differ significantly from that of culture (68.8% vs 81.3%; P = 0.41). Length of hospital stay and 28-day mortality were significantly higher in the mNGS-positive group than in the negative group (P < 0.05). Age was a significant factor in the multivariate logistic analysis of positive mNGS results. Conclusions mNGS had a higher sensitivity than the traditional culture method, especially in blood, bronchoalveolar lavage fluid and sputum samples. The mNGS-positive group had a longer hospital stay and a higher 28-day mortality, which suggests that detection of pathogen nucleic acid sequences may be a risk factor for poor prognosis in adult patients and has clinical value. mNGS should be used more widely for early pathogen diagnosis in the future.


Background
Infectious diseases are a leading cause of morbidity and mortality worldwide and can spread quickly. As the first line of pathogen detection, the microbiology laboratory plays an important role in infection control through microscopic examination, culture, identification, drug susceptibility testing and related methods [1]. However, a limitation of molecular diagnosis and genotyping methods remains: pathogens go undetected in up to 60% of cases [2][3][4]. Failure to identify pathogens in time may delay targeted antibiotic treatment, leading to unnecessary use of broad-spectrum antibiotics, induced resistance and increased medical costs [5].
With the completion of the Human Genome Project in the early twenty-first century and the rapid development of sequencing technology, high-throughput and low-cost second-generation sequencing emerged [6]. It has been used in whole genome sequencing, whole exome sequencing, metagenomic sequencing and other applications, among which metagenomic next-generation sequencing (mNGS) has been studied most widely. The advantage of mNGS is that a single run yields the sequences of the microbial nucleic acid fragments in a sample, which can then be analyzed and compared against reference sequences to detect all microbial species present [7]. In addition, mNGS can be used for the identification and typing of all pathogens because it does not rely on culture and retrieves all DNA without bias [8]. Information derived from mNGS results, such as antimicrobial resistance, virulence and typing, can be used for epidemiological investigation, which lays a foundation for investigating outbreaks of infectious disease in hospitals. This technology may therefore play a major role in infection prevention and in the medical microbiology laboratory.
We therefore compared the sensitivity and specificity of mNGS and the traditional culture method for pathogen detection, and examined how mNGS results relate to the severity and prognosis of patients with infection.

Study patients
We retrospectively reviewed 161 cases of suspected acute or chronic infection from the respiratory and critical care medicine department, geriatric department, emergency intensive care unit and emergency department of Shanghai 10th People's Hospital, Shanghai, China, between October 2018 and December 2019. After excluding patients who were pregnant, had mental illness or were under 18 years of age, 109 samples were included in the analysis and categorized into three groups according to final diagnosis: an infectious disease (ID) group, a noninfectious disease (NID) group and an unknown group. Specimens were subjected to mNGS testing (BGI, Intertek, Biotecan, China) and routine clinical microbiological assays in a pairwise manner, and the final diagnosis was determined by clinicians based on both results together with imaging and the patients' clinical features. Clinical data of all enrolled patients were also collected, including complete blood count, C-reactive protein (CRP), procalcitonin (PCT), neutrophil count, interleukin (IL)-2, IL-4, IL-6, tumor necrosis factor-α (TNF-α), IL-17A, IL-10 and interferon-γ (IFN-γ). The flow diagram of case inclusion and exclusion was shown in

Metagenomic next-generation sequencing and analysis
Nucleic acid detection and sequencing were performed on the BGISEQ-50 platform (BGI-Tianjin, Tianjin, China). After sample collection, nucleic acid was extracted, the library was constructed and sequenced, and the data were analyzed against the microbial genome database (ftp://ftp.ncbi.nlm.nih.gov/genomes/). The experimental process is shown in Fig. 2.
Sample processing and library construction (Fig. 2a) For infected patients or patients with fever of unknown cause, samples from the infected site or blood were collected according to standard procedures. At least 5 mL was collected for each blood, bronchoalveolar lavage fluid (BALF) or urine sample, and at least 3 mL for each sample of cerebrospinal fluid, sputum or other sterile fluid. Blood was collected in anticoagulant tubes containing ethylenediaminetetraacetic acid (EDTA) and a dedicated deoxyribonucleic acid (DNA) protective agent, and was stored at room temperature. Other samples were collected in sterile tubes and stored at −80°C. Blood was transported at room temperature, and other samples were transported on dry ice. Because most of the collected samples contained pathogens, they were inactivated (56°C, 30 min) before nucleic acid extraction. In addition, blood samples were centrifuged to separate plasma and leukocytes when clinicians strongly suspected intracellular bacterial infection. Sputum samples were liquefied with 0.1% dithiothreitol (DTT) for 30 min at room temperature after inactivation [5]. DNA was then extracted with the TIANamp Micro DNA Kit (DP316, Tiangen Biotech) according to the manufacturer's recommendations. DNA libraries were constructed by enzymatic DNA fragmentation, end repair, dA-tailing and adapter ligation. The constructed libraries were sequenced to obtain the sequence data of the DNA fragments, and the results were analyzed with bioinformatics software. Each run included internal, negative and positive controls. The internal control is a specific molecular tag added to the sample before nucleic acid extraction to track the entire process and to control DNA quality. No pathogens should be detected in the negative control; if relevant pathogens are detected, this indicates possible DNA contamination from the environment. The positive control contained DNA from specific microorganisms.
Bioinformatic analysis (Fig. 2b) Quality control (A) The rate of human host sequence subtraction needed to be above 90%. (B) Microbial reads needed to be longer than 50 bp, and the effective sequencing data volume needed to be no less than 20 M before removal of the human genome component.

Data filtering
To obtain high-quality sequence data, the data that passed quality control were further filtered by bioinformatic analysis to remove low-quality sequences. Data were analyzed in FASTQ format. The initial preprocessing steps included low-quality read filtering, low-complexity read filtering and adapter trimming. Host subtraction was performed by mapping reads to the host genome and/or transcriptome. The remaining unmapped reads were aligned directly against large reference databases, such as the National Center for Biotechnology Information (NCBI) GenBank database.
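The read-level filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the thresholds (`min_len`, `min_mean_q`, `min_complexity`), the Phred+33 quality encoding and the distinct-3-mer complexity measure are assumptions chosen for illustration, and adapter trimming and host subtraction are omitted.

```python
def parse_fastq(lines):
    """Yield (read_id, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield header.strip()[1:], seq, qual


def mean_phred(qual, offset=33):
    """Mean base quality, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def complexity(seq, k=3):
    """Fraction of distinct k-mers among all k-mers; low for repetitive reads."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return len(set(kmers)) / len(kmers)


def passes_filters(seq, qual, min_len=50, min_mean_q=20.0, min_complexity=0.3):
    """Apply length, quality and complexity filters (illustrative thresholds)."""
    if len(seq) < min_len:
        return False  # too short to align reliably
    if mean_phred(qual) < min_mean_q:
        return False  # low-quality read
    if complexity(seq) < min_complexity:
        return False  # low-complexity read, e.g. a homopolymer run
    return True


def filter_reads(lines, **thresholds):
    """Return the (id, sequence, quality) records that pass every filter."""
    return [r for r in parse_fastq(lines) if passes_filters(r[1], r[2], **thresholds)]
```

In a production workflow these steps are normally delegated to dedicated tools; the sketch only shows the order of the checks and the kind of criterion each one applies.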

Sequence alignment
The filtered sequences were compared with reference sequences in the pathogen database, which covers bacteria, fungi, viruses, protozoa and other pathogenic microorganisms. From the final alignment results, parameters were calculated for each detected pathogen, including the number of sequences, relative abundance, and genome coverage and depth.
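As an illustration of how such per-pathogen parameters can be derived, the sketch below computes read count, relative abundance, genome coverage and mean depth from a list of alignment intervals. The tuple layout, the genome lengths passed in and the definition of relative abundance as a taxon's share of all mapped microbial reads are assumptions for illustration; the study does not specify its formulas.

```python
from collections import defaultdict


def pathogen_stats(alignments, genome_lengths):
    """Summarize alignments per taxon.

    alignments: iterable of (taxon, start, end) with 0-based, half-open coordinates.
    genome_lengths: dict mapping taxon to reference genome length in bases.
    """
    by_taxon = defaultdict(list)
    for taxon, start, end in alignments:
        by_taxon[taxon].append((start, end))
    total_reads = sum(len(ivs) for ivs in by_taxon.values())

    stats = {}
    for taxon, ivs in by_taxon.items():
        length = genome_lengths[taxon]
        aligned_bases = sum(e - s for s, e in ivs)

        # Merge overlapping intervals so each covered position is counted once.
        covered = 0
        cur_s, cur_e = None, None
        for s, e in sorted(ivs):
            if cur_e is None or s > cur_e:
                if cur_e is not None:
                    covered += cur_e - cur_s
                cur_s, cur_e = s, e
            else:
                cur_e = max(cur_e, e)
        covered += cur_e - cur_s

        stats[taxon] = {
            "reads": len(ivs),
            "relative_abundance": len(ivs) / total_reads,
            "coverage": covered / length,        # fraction of genome covered
            "mean_depth": aligned_bases / length,  # aligned bases per genome base
        }
    return stats
```

Coverage and depth answer different questions: a few reads piled on one locus give high local depth but low coverage, which is one reason both are reported alongside the raw read count.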
Report generation The report listed all microorganisms detected in the test, classified as bacteria, viruses, fungi, parasites, mycoplasma, chlamydia and rickettsia. Within each class they were ranked from high to low by read count, so that organisms listed earlier had a higher relative abundance. When the report reached the clinic, clinicians judged whether each suspected pathogen was related to the infection, and the final diagnosis was determined by combining this clinical judgment with the detection parameters.

Determination of cytokines
Serum TNF-α, IL-2, IL-4, IL-6, IL-8, IL-10, IL-17A and IFN-γ were measured by solid-phase sandwich chemiluminescence immunoassay on the IMMULITE/IMMULITE 1000 analyzer. The analyzer and chemiluminescence kits were both from Siemens, Germany. Processed specimens were tested on the analyzer according to the manufacturer's instructions, and the corresponding cytokine concentrations were recorded.

Cell classification and count detection
Cells were classified using an automatic flow cytometer (Thermo Fisher Scientific, USA) into total white blood cells, neutrophils, CD4+ T cells, CD8+ T cells, B cells, and NK, T cells, and each population was counted.

Statistical analysis
Comparative analyses were conducted with the Pearson χ2 test and the t test. Data were analyzed with SPSS 22.0 software. All tests were two-tailed, and P values < 0.05 were considered significant. Logistic regression analysis was used to explore the risk factors associated with positive detection by mNGS.
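For readers who want to reproduce this kind of comparison without SPSS, the following sketch computes the Pearson χ2 statistic and its two-sided P value for a 2×2 table using only the Python standard library. The function name is hypothetical, and no continuity correction is applied, which is an assumption because the source does not say whether one was used.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p) with 1 degree of freedom and no continuity correction.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical counts (not study data): 30/50 positive vs 15/50 positive.
chi2, p = pearson_chi2_2x2(30, 20, 15, 35)
significant = p < 0.05  # same two-tailed threshold as in the text
```

With small expected cell counts, an exact test is usually preferred over the χ2 approximation.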

Diagnostic performance comparison of mNGS and culture
Comparison of diagnostic performance for differentiating ID from NID The numbers of cases tested by mNGS and culture in this study are illustrated in Fig. 4a. In the chi-square test of positive rates, mNGS and culture differed significantly across all cases and within the ID group, but not in the NID and unknown groups, where case numbers were limited. A total of 105 samples were included in the further comparison of diagnostic efficiency for differentiating ID from NID (Fig. 4b).
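The performance measures compared here follow directly from a 2×2 table of test result against final diagnosis. The sketch below shows the standard definitions; the function names and the counts in the usage line are hypothetical and are not taken from this study.

```python
def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from true/false positives and negatives."""
    return {
        "sensitivity": _ratio(tp, tp + fn),          # positives among diseased
        "specificity": _ratio(tn, tn + fp),          # negatives among non-diseased
        "ppv": _ratio(tp, tp + fp),                  # positive predictive value
        "npv": _ratio(tn, tn + fn),                  # negative predictive value
        "false_positive_rate": _ratio(fp, fp + tn),  # 1 - specificity
        "false_negative_rate": _ratio(fn, fn + tp),  # 1 - sensitivity
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=5, fn=10, tn=45)
```

Sensitivity and specificity depend only on the test, whereas the predictive values also depend on how common infection is in the tested population.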

Concordance between mNGS and culture for pathogen detection
In this study, mNGS and culture were both positive in 21 of 109 cases (19.3%) and both negative in 25 of 109 cases (22.9%). Fifty-eight cases (53.2%) were positive by mNGS only and 5 (4.6%) were positive by culture only. Among the 21 double-positive cases, the two results matched completely (all pathogens overlapped) in 3 and were totally mismatched (no pathogen overlapped) in 3 (Fig. 4c). In the remaining 15 cases, which had polymicrobial results, at least one but not all pathogens overlapped; these were defined as "partly matched".
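The matching rule for double-positive cases amounts to a set comparison of the pathogens reported by each method. A minimal sketch follows; the function name and the example organisms are illustrative, while the four case counts are those reported above.

```python
def classify_overlap(mngs_pathogens, culture_pathogens):
    """Classify a double-positive case by pathogen overlap between methods."""
    mngs, culture = set(mngs_pathogens), set(culture_pathogens)
    if not mngs or not culture:
        raise ValueError("both methods must report at least one pathogen")
    if mngs == culture:
        return "matched"          # all pathogens overlap
    if not (mngs & culture):
        return "mismatched"       # no pathogen overlaps
    return "partly matched"       # at least one, but not all, overlap


# Case counts reported in the text (n = 109).
counts = {"both positive": 21, "both negative": 25, "mNGS only": 58, "culture only": 5}
total = sum(counts.values())
percentages = {k: round(100 * v / total, 1) for k, v in counts.items()}

# Illustrative organisms, not a case from the study.
label = classify_overlap(
    {"Klebsiella pneumoniae", "Epstein-Barr virus"},
    {"Klebsiella pneumoniae"},
)
```

Comparing sets rather than ordered lists makes the classification independent of the order in which each report lists organisms; in practice, names would first need to be normalized to a common taxonomy.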
"False positives" and "false negatives" of mNGS In the ID group, three culturable pathogens were missed by mNGS. Of the three "mNGS false-negative" samples, two had culture results that were inconsistent with the clinical diagnosis, and in the remaining one the pathogen was not identified by mNGS at all. The possible reasons for the 7 "mNGS false-positive" cases in the NID group included potential concomitant infection with NIDs (3/7), overinterpretation (3/7) and unknown causes (1/7) (Table 2).

Comparison analysis at the sample-type level
In BALF, tissue, blood and sputum samples, mNGS had significantly higher sensitivity than culture (P = 0.002 for BALF, P = 0.025 for tissue, P < 0.001 for blood, P = 0.018 for sputum), and the sensitivity of mNGS differed significantly across sample types (P = 0.03). In pleural fluid, CSF, pus, bone marrow and nasal swab samples, sensitivity did not differ significantly between the two methods (P > 0.05). For culture, the positive rate was higher in BALF than in whole blood (P = 0.019), but overall sensitivity did not differ across sample types, as shown in Fig. 5b.

Comparison of infection indexes in positive and negative group by mNGS in ID
Classification and counting of leukocytes and lymphocytes in the mNGS-positive and mNGS-negative groups Complete blood count, CRP and PCT were measured on the day of pathogen testing to compare the total numbers of white blood cells, lymphocytes and neutrophils between the mNGS-positive and mNGS-negative groups. There were no statistically significant differences in leukocyte or lymphocyte counts between the two groups (P > 0.05; Table 3).

Comparison of cytokine concentrations in positive and negative group by mNGS
To explore the correlation between patients' immune status and positive pathogen results, we measured peripheral blood cytokines (TNF-α, IL-2, IL-4, IL-6, IL-8, IL-10, IL-17A and IFN-γ) in infected patients. The peripheral blood concentration of IL-10 was higher in the positive group than in the negative group (P = 0.044), whereas the other cytokines showed no difference between groups (Table 3).

Analysis of correlative factors for positive result of pathogen extraction by mNGS
To further explore the factors associated with a positive mNGS test in infected patients, we used multivariate logistic regression to relate patient characteristics to whether a pathogen was detected. After adjustment for confounding factors, the variable significantly associated with detection was age (P = 0.037; OR 1.076; 95% CI 1.005-1.152): older age was associated with a higher likelihood of pathogen detection (Table 4).
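To make the reported odds ratio concrete: in a logistic model the OR per unit of a covariate is exp(β), so the OR for a larger difference in that covariate is obtained by exponentiation. The sketch below also back-calculates an approximate Wald standard error and P value from the reported 95% CI. It assumes that age was entered in years as a continuous variable and that the interval is a Wald interval, neither of which is stated in the source; the function names are hypothetical.

```python
import math


def or_for_difference(or_per_unit, delta):
    """Odds ratio implied for a covariate difference of `delta` units."""
    return math.exp(math.log(or_per_unit) * delta)


def wald_from_ci(or_point, ci_low, ci_high, z_crit=1.96):
    """Approximate beta, SE, z and two-sided P from an OR and its 95% CI.

    Assumes a symmetric Wald interval on the log-odds scale.
    """
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


# Values reported in the text: OR 1.076 per unit of age, 95% CI 1.005-1.152.
or_10_years = or_for_difference(1.076, 10)
beta, se, z, p = wald_from_ci(1.076, 1.005, 1.152)
```

Under these assumptions a 10-year age difference corresponds to roughly a doubling of the odds of a positive result (about 2.1), and the back-calculated P value of about 0.035 is close to the reported P = 0.037, as expected given rounding of the published interval.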

Potential implications of clinical mNGS test
Potential inappropriate antibiotic usage for patients with virus isolates
Four viruses were identified by mNGS in 23 patients: herpes simplex virus (n = 15), Epstein-Barr virus/herpes simplex virus (n = 5), Epstein-Barr virus (n = 1), hepatitis A virus (n = 1) and torque teno virus (n = 1). About half of the patients (12/23) were diagnosed with a hospital-acquired infection, and 17 of 23 were given broad-spectrum antibiotics on the basis of symptoms and imaging. Ten of 23 patients were suspected of inappropriate antibiotic usage: their symptoms did not improve or even worsened after broad-spectrum antibiotic treatment, and their condition improved once the actual pathogen was identified by mNGS and treatment was adjusted accordingly. Seven of 23 were considered immunocompromised hosts, with deficiency of the immune system or immune response caused by infectious factors, mycotoxins, drugs or nutritional deficiencies (Table 5).

The influence of positive by mNGS on the hospital days and survival of patients
As shown in Table 6, the positive group comprised 67 samples (57 from males) and the negative group 26 samples (20 from males). Mean age did not differ significantly between the two groups (59.70 vs 60.50 years, P = 0.84). The positive group had a longer hospital stay (HOD, 176.63 days vs 150.96 days, P = 0.047) and a higher 28-day mortality (9.0% vs 0%, P = 0.049) than the negative group, but there were no statistical differences in 14-day mortality (4.5% vs 0%, P = 0.278) or 90-day mortality (13.4% vs 3.9%, P = 0.180). The average survival times of the two groups were 176.64 days and 150.96 days, respectively, with no statistical difference (t test, P = 0.425). The survival curves of the two groups are shown in Fig. 6. We also analyzed the relationship between pathogen read number and HOD, 14-day mortality, 28-day mortality and 90-day mortality; a higher pathogen read number was associated with higher 90-day mortality and longer HOD (Table 7).

Fig. 5 The overlap of positivity between mNGS and culture by pathogen and sample type. a. The 19 pathogens detected in the ID group are shown in histograms with their corresponding frequencies. Klebsiella, bacteria without MTB/NTM, EBV, CMV, NTM, anaerobes, Saccharomyces cerevisiae, Proteus, Pneumocystis carinii, Abiotrophia, Nocardia, Staphylococcus aureus, Enterococcus and Escherichia coli showed a trend toward a higher positivity rate by mNGS than by culture, without statistical differences (P > 0.05). Acinetobacter baumannii and MTB were found equally by the two methods. b. The sensitivity of mNGS differed significantly across sample types (P = 0.03), whereas sample type did not affect the sensitivity of culture. In BALF, blood and sputum samples in particular, mNGS had significantly higher sensitivity than culture (P = 0.002 for BALF, P < 0.001 for blood, P = 0.037 for sputum). Abbreviations: BALF, bronchoalveolar lavage fluid; CSF, cerebrospinal fluid; mNGS, metagenomic next-generation sequencing; HSV, herpes simplex virus; CMV, cytomegalovirus; EBV, Epstein-Barr virus; MTB, Mycobacterium tuberculosis; NTM, nontuberculous mycobacteria; ns, no significant difference

Discussion
The traditional clinical model for diagnosing infectious diseases is for doctors to make a differential diagnosis and then conduct a series of tests to try to identify the pathogen [9][10][11][12]. Traditional diagnostic techniques in the microbiology laboratory mainly comprise smear microscopy, culture, antigen and antibody detection, and PCR. Most traditional methods are time-consuming and have a lower positive rate than mNGS [2][3][4]. Although molecular diagnostic assays are a quick way to diagnose the most common infections, almost all conventional microbiological tests in use today target only a limited number of pathogens at a time or require successful culture of microorganisms from clinical samples [13]. In contrast, mNGS analyzes the entire microbiome in a patient's sample [8], and it has therefore been widely used to discover novel viral pathogens and to diagnose viral infections in humans [14][15][16]. We therefore explored the application of mNGS and its differences from the traditional culture method in clinical infectious diseases in adults. BALF, blood, sputum, tissue, CSF, pleural fluid, pus, bone marrow or nasal swab samples were collected from 109 patients with suspected infection and subjected to routine clinical microbiological assays and mNGS testing in a pairwise manner. We then systematically compared the clinical features and test results of mNGS and traditional culture. The results suggested that there were no significant differences in age, gender, length of stay or fatality rate between the two groups, and that mNGS had an advantage in sensitivity over the traditional culture method. Other researchers have also found that mNGS detects potential pathogenic bacteria with advantages in speed and sensitivity over culture and pathology [17]. Miao's team [5] showed that mNGS had a higher sensitivity than traditional culture for the diagnosis of infectious diseases (50.7% vs 35.2%), with clear advantages in the diagnosis of MTB, viruses, anaerobic bacteria, Nocardia and fungi. These findings are similar to ours: the sensitivity of mNGS was 67.4%, significantly higher than that of culture (23.6%). The high sensitivity of mNGS may be because pathogen DNA survives for a long time in plasma, so that antibiotic use has little impact on mNGS results, whereas traditional cultures are greatly affected by antibiotic use. Because of the small sample size, mNGS showed no statistical difference from culture at the level of pathogen type, although there was a trend of superiority for Klebsiella, bacteria without MTB/NTM, EBV, CMV, NTM, anaerobes, Saccharomyces cerevisiae, Proteus, Pneumocystis carinii, Abiotrophia, Nocardia, Staphylococcus aureus, Enterococcus and Escherichia coli. However, mNGS had significantly higher sensitivity than culture in BALF (P = 0.002), tissue (P = 0.025), blood (P < 0.001) and sputum (P = 0.018) samples.
Based on the advantages shown by mNGS, we then investigated the influence of positive mNGS results on the severity and prognosis of patients with infection. Comparing leukocyte and lymphocyte classification and counts and cytokine concentrations between the positive and negative groups, we found that the IL-10 concentration in peripheral blood was higher in the positive group, with no statistical differences in other cytokine concentrations or in leukocytes and lymphocytes. According to the analysis of factors associated with a positive mNGS test, older age may favor the detection of pathogens. In the survival analysis, the positive group had a higher 28-day mortality (9.0% vs 0%, P = 0.049) than the negative group, but there was no statistical difference in average survival time. The pathogen read number by mNGS was positively related to HOD and 90-day mortality in patients with infectious diseases. Together, these findings indicate that older people were more likely to have positive results and that a positive mNGS result may signal a worse outcome. Fortunately, mNGS has moved from research application to clinical practice and is changing the way disease is diagnosed and treated [18][19][20]. In addition to the points above, mNGS has merits in several other respects. Firstly, mNGS does not need prior clinical information to detect infectious pathogens, and the results can be reported quickly and accurately, greatly shortening the time to diagnosis. Early and rapid reporting of mNGS results provides clinical clues for the next steps in diagnosis and treatment, in particular avoiding overuse of antibiotics for viral infections [21,22], and can promote timely adjustment of treatment in clinical practice. As our data showed, almost one-half of patients with virus infection were suspected of inappropriate antibiotic usage. Secondly, mNGS has been used to detect rare infectious pathogens. In previous reports it detected Naegleria fowleri [23], brucellosis [24], cysticercosis, taenia bocinea [25] and Toxoplasma gondii [26] in CSF, and hepatic tuberculosis in blood [27]. Thirdly, studies have shown that mNGS can be used not only for pathogen identification but also for microbiome characterization, parallel analysis of human host responses, and detection of drug resistance genes and virulence factors. All of these have led to the rapid development of mNGS in difficult-to-diagnose cases and immunocompromised patients [13]. Fourthly, antibiotic usage had little influence on mNGS results because of the long survival time of pathogen DNA in plasma, whereas traditional cultures were affected by antibiotic use [21,22]. The higher sensitivity of mNGS than culture in this study may be because mNGS is less affected by prior antibiotic usage. However, mNGS still has limitations at present, such as human background, background bacterial contamination, the lack of uniform standards for detailed experimental procedures [2][28][29][30][31], inability to distinguish infection from colonization, the lack of a standardized bioinformatics analysis process, and difficulty in interpreting reports. The results must be interpreted in the context of the clinical situation. Notably, background microbial contamination is a common problem for mNGS; it can be partially eliminated through negative controls, but interpretation still requires clinicians to be familiar with common background bacteria and to combine the results with clinical practice [24].

Fig. 6 The survival curves of the mNGS-positive and mNGS-negative groups in ID. The survival curves suggested that overall survival declined faster in the positive group; however, there were no statistical differences between the two groups
In this study, we systematically compared mNGS and the traditional culture method in terms of sensitivity, specificity, pathogen type and sample type. On this basis, we also compared the mNGS-positive and mNGS-negative groups, which few studies have done to date. Patients in the positive group showed a trend toward worse prognosis, suggesting that they need more clinical attention. The small sample size was the main weakness of our study, so that many results indicated a trend without reaching statistical significance. More patients therefore need to be included in future studies. The lack of a randomized controlled design was another limitation. As a retrospective study, it was also limited by the available data and by data collection that was not controlled by the researchers. In addition, limited generalizability due to the single-center design, the lack of a gold-standard comparator for diagnosis, the lack of detail on antibiotic usage and classification bias were further limitations.

Conclusions
In summary, mNGS had a higher sensitivity than culture, especially in BALF, blood and sputum samples, and there was a trend toward higher sensitivity for the detection of Klebsiella, CMV and EBV. The trend toward worse outcomes in patients with positive mNGS results than in those with negative results indicates that patients with positive mNGS results require more clinical attention. Therefore, based on these findings and on other advantages of mNGS, such as quick results and lower susceptibility to prior antibiotic exposure, we suggest that mNGS should be used more widely for early pathogen diagnosis in the future. Nonetheless, interpreting mNGS data will remain a challenge for doctors in guiding the clinical treatment of infectious diseases.