Human metapneumovirus prevalence and patterns of subgroup persistence identified through surveillance of pediatric pneumonia hospital admissions in coastal Kenya, 2007–2016

Background: Human metapneumovirus (HMPV) is an important respiratory pathogen that causes seasonal epidemics of acute respiratory illness and contributes significantly to childhood pneumonia. Current knowledge of its patterns of spread, prevalence and persistence in communities in low-resource settings is limited.
Methods: We present findings of a molecular-epidemiological analysis of nasal samples from children < 5 years of age admitted with syndromic pneumonia between 2007 and 2016 to Kilifi County Hospital, coastal Kenya. HMPV infection was detected using real-time RT-PCR, and positives were sequenced in the fusion (F) and attachment (G) genes, followed by phylogenetic analysis. The association between disease severity and HMPV subgroup was assessed using Fisher’s exact test.
Results: Over 10 years, 274/6756 (4.1%) of the samples screened were HMPV positive. Annual prevalence fluctuated between 1.2 and 8.7% and was lowest in the most recent years (2014–2016). HMPV detections were most frequent from October of one year to April of the following year. Genotyping was successful for 205/274 (74.8%) positives, revealing clades A2b (41.0%) and A2c (10.7%), and subgroups B1 (23.4%) and B2 (24.9%). The dominance patterns were: clade A2b between 2007 and 2011, subgroup B1 between 2012 and 2014, and clade A2c in more recent epidemics. Subgroup B2 viruses were present in all years. Temporal phylogenetic clustering within the subgroups was seen for both local and global sequence data. Subgroups occurring in each epidemic season comprised multiple variants. Pneumonia severity did not vary by subgroup (p = 0.264). In both the F and G genes, the sequenced regions were predominantly under purifying selection.
Conclusion: Subgroup patterns from this rural African setting temporally map with global strain distribution, suggesting a well-mixed global virus transmission pool of HMPV. Persistence in the local community is characterized by repeated introductions of HMPV variants from the global pool. The factors underlying the declining prevalence of HMPV in this population should be investigated.
Electronic supplementary material: The online version of this article (10.1186/s12879-019-4381-9) contains supplementary material, which is available to authorized users.

Key words: Human metapneumovirus, prevalence, subgroup, epidemic, temporal.

Background
Human metapneumovirus (HMPV) is a single-stranded, negative-sense RNA virus with a genome of about 13 kb (1,2). The virus belongs to the family Pneumoviridae and the genus Metapneumovirus. HMPV infections occur across all ages, with severe disease predominantly occurring in children below 2 years of age and in the elderly (3–5). Since its first description in 2001 (5), HMPV has been detected on all continents, and its disease prevalence varies widely (6). Nearly every child has been infected by HMPV by the age of 5 years (5,7). Clinical presentation of HMPV infection ranges from mild upper respiratory tract illness to severe lower respiratory tract disease (8) and overlaps with that of other common respiratory viruses, especially respiratory syncytial virus (RSV) (9,10).
HMPV is classified into two major groups, A and B, based on antigenic variation and nucleotide differences in the fusion (F), nucleoprotein (N) and attachment (G) glycoprotein genes (2,11,12). Phylogenetic analyses of F and G open reading frame (ORF) sequences further divide the two groups into subgroups: A1 and A2 (group A), and B1 and B2 (group B) (11,13). In addition, a division within subgroup A2 has been reported, i.e. the presence of two phylogenetically distinct clades, A2a and A2b (14). A few studies have revealed increased heterogeneity of A2, including identification of another distinct clade, provisionally assigned as A2c (15,16). Furthermore, two novel clades within subgroup A2 with 180- or 111-nucleotide duplications in the G gene have been detected (17–20). Geographically, HMPV subgroups have been reported to circulate widely and cluster temporally (21), with the exception of subgroup A1, which has been identified in only a few countries (21–23). There is frequent co-circulation of subgroups, with replacement of the predominant subgroup after a period of one or two seasons, although the drivers of this phenomenon are unclear (21). Virus prevalence also varies from year to year within the same location (24,25).
Few studies have reported long-term subgroup circulation patterns of HMPV, which are necessary for improved epidemiological understanding and, in due course, of potential value to the design and implementation of control measures. In addition, there are few studies on HMPV in Africa, which bears a high burden of pneumonia morbidity and mortality (26,27).
In this study, we analysed the surface F and G gene nucleotide sequences collected through paediatric pneumonia surveillance at the Kilifi County Hospital in rural coastal Kenya, from 2007 to 2016, to describe the molecular epidemiology and gain insights into the spread and evolutionary dynamics of HMPV. The two genes (F and G) code for immunogenic surface proteins that are the targets for vaccine development. The G gene is highly variable and is used to assess the diversity of HMPV. The F gene, which is quite conserved, has also been used to characterise HMPV and is the target gene in many molecular diagnostic assays. This report extends our previously published work from the same location for the period 2007–2011 (28), and the analysis includes sequences from this previous work.

Methods
Study population. The study was conducted at the Kilifi County Hospital (KCH) as part of long-term surveillance aimed at understanding the epidemiology and disease burden of RSV-associated pneumonia cases (29). KCH, located in coastal Kenya, is a referral hospital serving a population of around 260,000 (circa 2012) within the Kilifi Health and Demographic Surveillance System (KHDSS) (30), and beyond in the wider Kilifi County. The population is mainly rural-agrarian. Upon presentation of a child to the paediatric ward at KCH, a detailed medical review is undertaken by the clinician, upon which the decision to admit is made. For this study, children (≤59 months of age) admitted to the paediatric ward between January 1st 2007 and December 31st 2016 were eligible if they presented with modified WHO-defined syndromic severe or very severe pneumonia, as previously described (29). Between 2007 and 2009, only admissions arising from KHDSS residents were eligible, whereas in later years non-KHDSS residents were included. Following written informed consent from the parent or guardian, a nasopharyngeal flocked swab, a nasal wash or a combination of nasopharyngeal and oropharyngeal swabs was collected from each child and transferred into viral transport medium for laboratory screening. Ethical approval for the study was obtained from the Kenya Medical Research Institute Scientific and Ethics Review Unit.

HMPV CN/gz01/08/A2) were included (36). A fragment length of 345 bp was analysed for the F gene and at least 640 bp for the G gene. Subgroups were confirmed if sequences clustered with the reference sequences within a major branch with >70% bootstrap support on the maximum likelihood (ML) tree. Mean nucleotide genetic distances were also determined to assess sequence similarity between the Kilifi and global data sets. To further assess the clustering of HMPV subgroups, an ML tree was reconstructed using only full F gene (1593 bp) sequences.

Results
HMPV prevalence in paediatric hospital admissions. Between January 2007 and December 2016, 9079 individuals below 60 months of age were admitted to the paediatric wards at KCH with severe or very severe pneumonia. Samples were collected from 6756 (74%) individuals, of which 274 (4.1%) were HMPV positive (Table 1). HMPV prevalence was lower in more recent years, i.e. from 2014 to 2016 (Table 1).
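The headline proportions above follow directly from the reported counts. As a minimal sketch, the Python snippet below recomputes the sampling fraction and the overall prevalence and attaches a Wilson score interval to the prevalence. The interval is an illustration only: the text does not report one, and `wilson_interval` is a hypothetical helper name.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Counts as reported in the text.
admitted = 9079  # eligible pneumonia admissions
sampled = 6756   # admissions with a sample collected
positive = 274   # HMPV-positive samples

sampling_fraction = sampled / admitted
prevalence = positive / sampled

# Assumption: a Wilson interval is used here purely for illustration.
low, high = wilson_interval(positive, sampled)

print(f"sampled: {sampling_fraction:.1%}")
print(f"prevalence: {prevalence:.1%}")
print(f"Wilson interval (illustrative): {low:.1%} to {high:.1%}")
```

The same helper could be applied to each year's counts in Table 1 to judge whether the lower prevalence in 2014 to 2016 exceeds sampling noise.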
G gene sequence analysis (Kilifi). G gene sequences were less conserved than F gene sequences, with an overall mean sequence identity of 73% at the nucleotide (nt) level and 56% at the amino acid (aa) level.
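To make the identity figures concrete, the sketch below shows one common way to compute a mean pairwise identity from an alignment: the fraction of matching sites per sequence pair, averaged over all pairs. This is an assumption about the calculation, since the text does not state how gaps were handled or which software was used. The function names and the toy sequences are invented for illustration and are not study data.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x == y for x, y in sites) / len(sites)


def mean_pairwise_identity(seqs: list[str]) -> float:
    """Average identity over all unordered pairs of sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy aligned nucleotide sequences (hypothetical, not study data).
toy_alignment = ["ATGGAC", "ATGGAT", "ATG-AT"]
mean_identity = mean_pairwise_identity(toy_alignment)
print(f"mean pairwise identity: {mean_identity:.1%}")
```

The same functions work on aligned amino acid sequences, which is how a nucleotide and an amino acid figure can be reported side by side.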

HMPV subgroup and disease severity.
We did not observe a statistically significant association between HMPV subgroup and pneumonia severity (p = 0.264) (Table 3).
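For readers unfamiliar with Fisher's exact test, the sketch below implements the two-sided test for a 2 x 2 table in plain Python. It is a simplification: the comparison reported here spans more than two subgroups, which requires the r x c generalisation of the test, and the counts used are a placeholder table rather than the values in Table 3. The function name `fisher_exact_2x2` is hypothetical.

```python
from math import comb


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    p_value = 0.0
    # Sum the probabilities of all tables no more likely than the observed one.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = prob(x)
        if p <= observed * (1 + 1e-9):  # tolerance for floating-point ties
            p_value += p
    return p_value


# Placeholder counts (NOT study data): rows are two virus groups,
# columns are severe and very severe pneumonia.
p_value = fisher_exact_2x2(3, 1, 1, 3)
print(f"two-sided p = {p_value:.4f}")
```

A p-value above the chosen significance threshold, as with the reported p = 0.264, means the data give no evidence that severity differs between subgroups; it does not show that the subgroups are equally severe.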

Discussion
As is the norm for HMPV detection in respiratory samples, we used molecular PCR-based diagnostics (6). Recent studies have shown that mutations at primer and probe binding sites can lead to false-negative diagnostic results and hence underestimation of disease burden (50,51). Evaluation of the in-house diagnostic rRT-PCR assay would be important to determine whether any variants or subgroups are being missed and whether this could contribute to the apparent gradual decline in HMPV prevalence observed in the current dataset. There is also some evidence that a reduction in bacterial pneumonia (e.g. due to S. pneumoniae), as has been seen in this coastal location over the study period (52), results in a reduction in viral pneumonia (53,54). Future investigations will be necessary to give more insight into the causes of the decline.
The analysed Kilifi F and G protein-encoding genes were generally found to be under purifying selection, which shapes RNA virus evolution by purging deleterious mutations that arise from RNA replication errors (55). However, for the B2 subgroup, a higher dN/dS ratio was observed in G gene sequences, suggestive of diversifying selection within B2 viruses. The distinct diversifying selection and persistence of the B2 viruses require further investigation. In this study, our F and G gene sequence analyses were based on partial gene lengths, 345 bp and 640 bp, respectively. Therefore, our genetic distance estimates and our evolutionary and selection pressure analyses should be interpreted with caution. The partial lengths may have reduced our ability to discriminate genetic clusters, a possible explanation for the higher sequence similarity observed between and within the subgroups for the F gene. Although our sequence analyses were limited to partial lengths, the newly designed F and G gene primers allowed full-length sequencing of the two genes for newly generated data. In addition, the newly designed G gene subtype-specific primers allowed sequencing of all HMPV subtypes and improved G gene PCR recovery two-fold compared with the previously reported assay. This improved the power of the study to characterise the different circulating HMPV variants.
Overall, we successfully collected clinical samples from 74% (6756/9076) of the enrolled participants and characterised 75% (205/274) of the HMPV viruses identified. We failed to collect samples from 26% of those eligible. We have previously reported similar results, which arise from refusals and from difficulty in collecting nasal specimens from children with very severe disease (29). Hence, if some HMPV variants are associated with disease severity, there may be bias in the prevalence and variant composition estimates. However, the proportion not collected shows no systematic change over time. In addition, the proportion of very severe pneumonia cases was higher among sampled eligibles than among unsampled eligibles. Therefore, it is unlikely that there was any bias in the prevalence estimates of HMPV groups or subgroups.
Our analyses of HMPV epidemiology in Kilifi were limited to two viral genes. Whole genome sequencing might give more insight into transmission, HMPV subgroup characterisation and molecular evolution. Our inference on the association between HMPV subgroup and disease severity was based on inpatient surveillance data only, and future studies should therefore include outpatient surveillance data. In addition, inpatient surveillance and sampling of HMPV infections might not be representative of the full variant population circulating in this community and might not correctly reflect incidence and prevalence, as HMPV infections have also been reported in outpatient settings (56).
Future studies across different locations in Kenya and in Africa will be important for tracing the introduction and transmission patterns of the virus.

Conclusions
In conclusion, this report shows HMPV activity characterised by marked annual variation in

The Kenya Medical Research Institute Scientific and Ethics Review Unit (SERU) approved the study. Written informed consent was obtained from each participant's parent or guardian.

Consent for publication
Not applicable.

Availability of data and material
All data generated or analysed during this study has been deposited to the Virus

Competing Interests
The authors declare that they have no competing interests.

Funding
This study was supported by the Wellcome Trust, United Kingdom (grants 102975, 100542, 084633, and 077092). The funder had no role in other aspects of the study, including its design, data collection, data analysis, data interpretation, or writing of this manuscript.

the Virus Epidemiology and Control (VEC) Research Group in Kilifi who were involved at various stages of this study that has allowed this current report to be written. We thank Dr Matt Cotten for sharing his scripts for generating the hiliter plots. We also thank the parents and guardians of the children for agreeing to participate in this study. This work is published with the permission of the Director of KEMRI.