Divergences on expected pneumonia cases during the COVID-19 epidemic in Catalonia: a time-series analysis of primary care electronic health records covering about 6 million people
BMC Infectious Diseases volume 21, Article number: 283 (2021)
Pneumonia is one of the complications of COVID-19. Primary care electronic health records (EHR) have shown their utility as a surveillance system. We therefore analyse the trends of pneumonia during two waves of the COVID-19 pandemic in order to use them as a clinical surveillance system and an early indicator of severity.
Time series analysis of pneumonia cases, from January 2014 to December 2020. We collected pneumonia diagnoses from primary care EHR, a software system covering > 6 million people in Catalonia (Spain). We compared the trend of pneumonia in the season 2019–2020 with that in the previous years. We estimated the expected pneumonia cases with data from 2014 to 2018 using a time series regression adjusted by seasonality and influenza epidemics.
Between 4 March and 5 May 2020, 11,704 excess pneumonia cases (95% CI: 9909 to 13,498) were identified. Before that, from January to March 2020, we identified a 20% excess in the population older than 15 years. We observed another period of excess pneumonia from 22 October to 15 November, with 1377 excess cases (95% CI: 665 to 2089). In contrast, we observed two long periods with reductions of pneumonia cases in children, accounting for 131 days and 3534 fewer pneumonia cases (95% CI: 1005 to 6064) from March to July, and 54 days and 1960 fewer pneumonia cases (95% CI: 917 to 3002) from October to December.
Diagnoses of pneumonia from the EHR could be used as an early and low cost surveillance system to monitor the spread of COVID-19.
The coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), started as an outbreak in Wuhan (China) and rapidly evolved into a worldwide pandemic and a public health emergency of international concern. As of 11 December 2020, more than 69 million people had contracted the virus worldwide, with more than 1.5 million confirmed deaths. Asymptomatic cases seem to account for approximately 20% of infections. Most cases present mild influenza-like illness (ILI) symptoms including fever, cough, sore throat, dyspnea, fatigue and myalgia. Other symptoms such as diarrhea or dysgeusia and/or anosmia may be present 1 or 2 days before the fever and dyspnea. Around 10–20% of symptomatic patients present severe forms of disease that require hospital admission, mainly due to pneumonia with severe inflammation [4,5,6,7]. Lung damage can progress rapidly, and early detection is essential for better management and also for surveillance of medium and long-term sequelae.
The first case in Europe was reported on 20 January 2020 in France and the first case in Catalonia (Spain) was reported on 25 February 2020. However, some studies have suggested a previous circulation of the virus [9, 10]. Due to the rapid spread of the disease, on March 14 Spain established a national lockdown, as did many other countries. During the first surge, as Spain lacked the capacity to test all cases, reverse transcriptase–polymerase chain reaction (RT-PCR) test confirmation was only required when patients were admitted to hospital or were healthcare staff. In a previous study, only 38.5% of clinical COVID-19 cases diagnosed in primary care between March 1 and April 24, 2020 received an RT-PCR test. On May 11, primary care acquired the capacity to perform RT-PCR tests and the number of tests increased, as stated in the official figures of the Catalan Health Department website (https://dadescovid.cat/).
Knowledge of the infection status and continuous measurement of transmission is one of the five necessary components described for containing COVID-19. This measurement should also include surveillance based on clinical diagnoses. Primary care clinical diagnoses have been used as a surveillance system for influenza epidemics for many years. In February and March 2020, we found an excess of influenza diagnoses using this clinical surveillance system that suggested an early spread of COVID-19 in Catalonia prior to the first reported case. This finding showed the utility of flu clinical diagnoses as a potential low-cost surveillance system capable of monitoring the spread of the epidemic early. However, as flu cases are limited to winter, we need comprehensive and complementary information to understand the actual trend of the COVID-19 epidemic, including severity.
The aim of our study is to analyse pneumonia diagnoses and compare the trend in the 2019–2020 season with those of previous years, in order to complement the current system with another surveillance measurement of the evolution of the epidemic that adds a severity component.
We performed a time-series study of clinical diagnoses of pneumonia. The study period covered 6 years, from 1 September 2014 to 4 December 2020, considered as seasons running from autumn to summer, as is routinely done for influenza epidemics.
Our main outcome was clinical diagnoses of pneumonia. Daily counts of diagnoses of pneumonia were retrospectively extracted from the primary care electronic health records (EHR) of the Catalan Institute of Health (ICS for its Catalan initials). ICS is the main health provider in Catalonia. It manages about 75% of all primary care practices in the Catalan public health system and covers about 6 million people. All general practices use the same EHR, known as ECAP. ECAP is a software system that serves as a repository for structured data on diagnoses (coded according to the International Classification of Diseases, 10th revision, ICD-10), clinical variables, prescription data, laboratory test results, visits and diagnostic requests.
We defined pneumonia diagnoses according to the ICD-10 classification and included all codes presented in Supplementary Table 1. COVID-19 confirmed cases were obtained from the official website of the Catalan Health department Dadescovid.cat (https://dadescovid.cat/).
Daily counts of pneumonia cases were computed based on the frequency of cases recorded in the previous 7-day period to avoid weekly effects on recording practice.
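The 7-day aggregation described above can be illustrated with a minimal Python sketch. It assumes the window is the trailing 7 days including the current day (the text does not say whether the current day is included), and the input values are hypothetical rather than study data.

```python
from collections import deque


def rolling_7day_counts(daily_counts, window=7):
    """Return, for each day, the sum of counts over the trailing `window` days.

    Days before a full window is available yield None.
    Assumption: the window includes the current day.
    """
    recent = deque(maxlen=window)
    running_total = 0
    result = []
    for count in daily_counts:
        if len(recent) == window:
            running_total -= recent[0]  # oldest value is about to be evicted
        recent.append(count)
        running_total += count
        result.append(running_total if len(recent) == window else None)
    return result


# Hypothetical daily diagnoses with a weekend dip on days 6 and 7 of each week.
daily = [10, 12, 11, 13, 12, 3, 2, 10, 12, 11, 13, 12, 3, 2]
weekly = rolling_7day_counts(daily)
```

Because every window spans exactly one full week, the weekend dip in the hypothetical input no longer shows up in the aggregated series.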
We obtained the expected cases for the study period using a time series regression adjusted by seasonality and influenza epidemics. Flu cases were also computed as daily counts in a 7-day period as done for pneumonia and COVID-19 cases and were extracted from the same EHR.
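The exact model specification is not given in this text (the authors' code is in their repository), so the following is only a generic sketch of one common way to fit a regression adjusted for seasonality and influenza: ordinary least squares with a trend, one annual harmonic pair, and the flu count as a covariate. The function names, the choice of a single harmonic, and the synthetic series are all assumptions made for illustration.

```python
import math


def design_row(day_index, flu_count, period=365.25):
    """Intercept, trend (in years), one annual harmonic pair, flu covariate."""
    angle = 2 * math.pi * day_index / period
    return [1.0, day_index / period, math.sin(angle), math.cos(angle), float(flu_count)]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    """Expected count for one design row."""
    return sum(b * x for b, x in zip(beta, row))


# Synthetic, noise-free series generated from known coefficients (not study data).
true_beta = [200.0, 5.0, 30.0, 20.0, 0.5]
flu = [(d * 37) % 101 for d in range(730)]
rows = [design_row(d, f) for d, f in zip(range(730), flu)]
y = [predict(true_beta, r) for r in rows]
beta_hat = fit_ols(rows, y)
```

In practice a count model with confidence bands would be fitted with a statistics library; this sketch only shows how seasonal terms and the flu covariate enter the design matrix.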
The dataset was divided into three sets: a training set (from September 2014 to August 2018), a validation set (from September 2018 to August 2019) and an analysis set (from September 2019 to December 2020). We used the training set to fit the model and the validation set to test our method as a sensitivity analysis: we checked whether our method identified any excess or lack of pneumonia cases in a regular year not affected by the COVID-19 pandemic. Finally, we projected the estimated time series onto our analysis set.
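As an illustration of this three-way split, the sketch below assigns a date to its set. The month boundaries follow the text; the exact day-of-month cut-offs (first and last day of each month, and 4 December 2020 as the end of the analysis set) are assumptions, as are the names `SPLITS` and `assign_split`.

```python
from datetime import date

# Month boundaries follow the text; exact day cut-offs are assumptions.
SPLITS = [
    ("training", date(2014, 9, 1), date(2018, 8, 31)),
    ("validation", date(2018, 9, 1), date(2019, 8, 31)),
    ("analysis", date(2019, 9, 1), date(2020, 12, 4)),
]


def assign_split(day):
    """Return the name of the set a date falls in, or None if outside all sets."""
    for name, start, end in SPLITS:
        if start <= day <= end:
            return name
    return None


first_wave_day = assign_split(date(2020, 3, 4))
```

Keeping the validation season strictly between the training and analysis sets means the model is checked on a pre-pandemic year it was not fitted on.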
The expected cases were estimated using data from the training seasons adjusted for the influenza epidemics. Excess pneumonia cases were defined as the number of observed minus expected cases in all periods where observed cases were greater than the upper limit of the 95% confidence interval (95% CI). Similarly, the lack or reduction of pneumonia cases was defined as the number of expected minus observed cases in all periods in which the observed number was below the lower limit of the 95% CI. Excess and lack of pneumonia were only calculated for the analysis set (2019–2020 season).
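The excess and lack definitions can be sketched as follows. This is an illustrative reading of the definitions, not the authors' code: it assumes the observed, expected, lower and upper series are aligned day by day and on the same scale, and it groups consecutive days outside the 95% band into periods. The function name and the example numbers are hypothetical.

```python
def flag_periods(observed, expected, lower, upper):
    """Group consecutive days outside the 95% band into excess or lack periods.

    Excess: observed above the upper limit, cases = observed - expected.
    Lack: observed below the lower limit, cases = expected - observed.
    """
    periods = []
    current = None
    for day, (obs, exp, lo, hi) in enumerate(zip(observed, expected, lower, upper)):
        if obs > hi:
            kind, diff = "excess", obs - exp
        elif obs < lo:
            kind, diff = "lack", exp - obs
        else:
            kind, diff = None, 0
        if current is not None and current["kind"] == kind:
            # Same kind of deviation as the previous day: extend the period.
            current["end"] = day
            current["days"] += 1
            current["cases"] += diff
        else:
            if current is not None:
                periods.append(current)
            if kind is None:
                current = None
            else:
                current = {"kind": kind, "start": day, "end": day, "days": 1, "cases": diff}
    if current is not None:
        periods.append(current)
    return periods


# Hypothetical series: expected 100 per day with a 95% band of 80 to 120.
observed = [100, 130, 140, 105, 60, 55, 100]
periods = flag_periods(observed, [100] * 7, [80] * 7, [120] * 7)
```

Days that stay inside the band contribute nothing, which matches the definition that excess and lack are only counted in periods outside the 95% CI.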
Time-series analysis was performed globally and for age groups (< 15 years old, 15–64 years old, > 64 years old). We calculated 95% CIs for each estimate.
Between 1 September 2014 and 4 December 2020 we observed 260,910 pneumonia cases, of which 28.7% were diagnosed in the population younger than 15 years, 37.1% in the population between 15 and 64 years and 34.2% in the population older than 64 years. The mean number of pneumonia diagnoses per season was 40,224 for the seasons from 2014–2015 to 2018–2019. The 2019–2020 season included 50,039 cases, an increase of 24.4%. The percentage of pneumonia affecting the population between 15 and 64 years increased during 2020 compared with previous years, even during the first months of the year (38.4% from January to March 2019 compared with 45% in 2020). This is a result of an increased number of pneumonia cases in the population older than 15 years since early 2020 (Supplementary material 3).
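The 24.4% figure follows directly from the two season totals quoted above, as this short check shows (the helper name is an assumption for illustration):

```python
def percent_change(new, baseline):
    """Relative change of `new` with respect to `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# 2019-2020 season total versus the mean of the five previous seasons.
increase = percent_change(50_039, 40_224)
```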
In five out of six seasons included in our study, the peak of pneumonia cases coincided with the peak of the influenza epidemics, except for the season 2019–2020 (Fig. 1).
Figure 2 shows the observed and estimated number of 7-day new pneumonia cases (with 95% CI) by age group, for the validation and the analysis sets. In the whole population, we did not observe long periods of excess or lack of pneumonia during the validation set (2018–2019). Nonetheless, between 4 March and 5 May 2020, we observed a long period of excess pneumonia cases, accounting for 63 days and 11,704 excess cases (95% CI: 9909 to 13,498). There were other earlier periods of excess pneumonia, during December 2019, late January 2020 and early February 2020, although they accounted for fewer days and fewer cases. In addition, 25 days of excess were observed from 22 October 2020 to 15 November 2020, accounting for 1377 excess cases (95% CI: 665 to 2089) (Table 1).
Table 1 also presents the different periods with excess pneumonia cases stratified by age group. The population between 15 and 64 years and the population older than 64 had similar periods of excess cases, accounting for 72 and 73 days and 8274 and 5010 cases, respectively, during the March–May excess, and for 37 days and 1562 cases and 42 days and 1147 cases, respectively, during the October–November excess period. Conversely, in children we did not observe any excess during the whole study period (Fig. 2 and Table 1).
In contrast, regarding the lack of pneumonia, we observed a reduction of pneumonia cases among people younger than 15 years between 18 March and 26 July, accounting for 131 consecutive days with fewer observed pneumonia cases than expected, 3534 fewer observed cases during that period (95% CI: 1005 to 6064) and a reduction of 84.43% (95% CI: 60.67 to 90.3%) compared with the expected number. In addition, another period of lack of pneumonia was observed in children from 12 October to 4 December, with 54 days of reductions and 1960 fewer cases (95% CI: 917 to 3002). We did not observe any period of reduction of pneumonia for other age groups in the analysis set (Table 2).
Finally, Fig. 3 shows the trend of pneumonia and COVID-19 cases from March 2020. We observed two waves of COVID-19 cases coinciding with two pneumonia peaks in the population older than 15 years.
In our work, we found an excess of 11,704 pneumonia diagnoses between 4 March 2020 and 5 May 2020. This excess seems to be related to the COVID-19 pandemic, as it temporally overlapped with the first wave in Spain. Prior to this large excess, we found a small excess period in mid-December that does not seem related to COVID-19, as it lasted only 7 days and observed cases then returned to the expected level for at least 1 month. After that, however, it is remarkable that between 28 January and 19 February 2020 (1 month before the first confirmed COVID-19 case was reported in Catalonia) we found excess pneumonia on 22 out of 23 days, split into two periods that should be considered as one. Although this period matches the peak of the influenza epidemic, it could also be related to the transmission of other pathogens, since our method already adjusts for the influenza epidemic. In addition, we found that this excess in early 2020 was caused by an excess of cases in the population older than 15 years (64% more pneumonia cases in the population between 15 and 64 years and a 53.4% increase in those older than 64 compared with 2019), while in children pneumonia cases decreased by 10%. The excess found in early 2020 and the huge excess found only a few days after the first official case of COVID-19 in Catalonia also suggest that SARS-CoV-2 could already have been circulating in the Catalan population when the first imported case was reported on 25 February 2020. People with COVID-19-related pneumonia may have been masked under other diagnoses, allowing transmission before the first control measures were implemented. This hypothesis was raised in a previous study in which we found an excess of influenza cases from the beginning of February in Catalonia, and it was also suggested in other studies in different countries. In addition, a CDC report estimated that community transmission of SARS-CoV-2 started around mid-January to February. Finally, scientists found virus traces in wastewater collected in January in Barcelona and in December in Italy. Our results are therefore in line with those published so far and strengthen this hypothesis.
After May, we did not find any other excess until mid-October, coinciding with the second wave of the COVID-19 pandemic in Catalonia. However, this excess was smaller than the one found in the March–May period, suggesting a lower severity of the situation. Official data from the Spanish government on excess mortality show an excess of 120% in Catalonia from 13 March to 8 May and an excess of 25.8% from 6 October to 28 November. So, in line with our findings for pneumonia, the second wave was less severe than the first, and the pneumonia signal arrived some days earlier as an alert of increased severity, without the delays that affect mortality notifications. Nonetheless, the number of COVID-19 cases was greater in the second wave because testing had increased in Catalonia since May. Using pneumonia as a surveillance indicator could help to understand the severity of a wave regardless of the capacity to test.
Trends of pneumonia in children followed a different pattern. We found a greater lack of pneumonia cases during the lockdown in patients younger than 15 years. This is an expected finding, as schools closed on March 13 and did not reopen until September 14. In addition, mobility restrictions during the lockdown were stricter for children, while adults were able to move for essential work or to shop. This reduction was also found in other infectious diseases. For example, in Diagnosticat (https://www.ics.gencat.cat/sisap/diagnosticat/principal), an official website of the Catalan Health Department that has published weekly clinical diagnoses of 7 notifiable infectious diseases since 2010, we saw a decrease in the weekly rates of chickenpox from lockdown onset, as happened with pneumonia. But most importantly, we did not find any excess pneumonia in patients younger than 15 years during February and early March, before the lockdown. This seems to suggest that COVID-19 is less severe in children, as other studies have pointed out. More surprising are the results found in October–December. Although schools reopened in September in Catalonia (with measures), diagnoses of pneumonia in children remained below the expected level. Several studies have observed low transmission and severity of COVID-19 in educational settings, which could explain our findings. In addition, this reduction of pneumonia in children also suggests that other common pathogens that cause pneumonia in children could have had low transmission, possibly due to anti-COVID measures. This is consistent with data from the southern hemisphere. In Australia, researchers observed reductions of 99% in respiratory syncytial virus infections during its winter, which they considered related to COVID-19 control measures. In Catalonia, measures in schools were the use of masks for children older than 6 years, bubble groups, not mixing children from different groups and daily symptom screening. More studies are needed to confirm whether some of these measures affected the transmission of other pathogens or whether other factors caused a large reduction of all types of pneumonia even with schools open, such as viral competition between SARS-CoV-2 and other viruses. This could be of interest for the following seasons.
Our research has several limitations. Firstly, the design of our study does not allow us to establish a causal link between the COVID-19 epidemic and the excess of pneumonia, but only a temporal coincidence. Moreover, as we lacked tests, we were not able to differentiate the etiology of each pneumonia. However, our method offers a low-cost surveillance system that could help to detect unusual trends, supporting public health responses. Secondly, as our study uses data from several years, changes in population structure could limit the use of this method. Nonetheless, population structure in terms of age and gender remained stable during the study period. Thirdly, using data from the primary care EHR could introduce some bias, as we lacked information about emergency departments or hospital admissions. Finally, our system could miss unusual trends caused by mild, asymptomatic or even undocumented infections, especially at the beginning of an epidemic. In that sense, it is also important to monitor other related diagnoses such as diarrhea or mild ILI infections to have a complete picture of the current pandemic. However, our system could be used as a complementary surveillance system and to assess the severity of the epidemic before other indicators such as excess mortality become available.
The strengths of this study include population-based data automatically obtained from primary care EHR. Several studies have used the Catalan EHR to do useful research in real-world conditions and for the surveillance of different diseases [10, 26]. Our database covers over 75% of the population of Catalonia, allowing us to detect general and local excesses and lack of pneumonia. It is also a quick and low-cost method to integrate into the current information systems of any region using EHR. In addition, we tested our method in a season not affected by COVID-19 (2018–2019) and did not find any unusual pattern, strengthening our subsequent findings. Finally, this study presents an analysis of trends of expected pneumonia during two waves of the COVID-19 epidemic. The consistency of our results is reaffirmed by their reproducibility during the second wave and their similarity with excess mortality figures, suggesting that an increase of pneumonia could be an alert of a COVID-19 outbreak and future mortality.
Monitoring clinical diagnoses of pneumonia and comparing them with what is expected could help in interpreting the epidemiological situation. This surveillance system, based on routinely collected data from the EHR, could be used as a low-cost warning system that complements other surveillance systems, allowing the public health response to be brought forward independently of the capacity to test. Using this system, we found an excess of cases in early 2020 that suggests that SARS-CoV-2 could have been circulating before the first official case was identified. In addition, more studies are needed to understand the causes of the reduction of pneumonia in children during the COVID-19 epidemic.
Availability of data and materials
The data and analytical code underlying this article are available in: https://github.com/ErmengolComa/pneumonia.git.
95% CI: 95% Confidence interval
COVID-19: Coronavirus disease 2019
EHR: Electronic health records
ICS: Catalan Institute of Health (for its Catalan initials)
ICD-10: International Classification of Diseases 10th revision
RT-PCR: Reverse transcriptase–polymerase chain reaction
European Centre for Disease Prevention and Control. Novel coronavirus disease 2019 (COVID-19) pandemic: increased transmission in the EU/EEA and the UK – sixth update – 12 March 2020. Stockholm: ECDC; 2020.
COVID-19 Map - Johns Hopkins Coronavirus Resource Center. https://coronavirus.jhu.edu/map.html. Accessed 11 Dec 2020.
Byambasuren O, Cardona M, Bell K, et al. Estimating the extent of asymptomatic COVID-19 and its potential for community transmission: Systematic review and meta-analysis. J Assoc Med Microbiol Infect Dis Can (JAMMI). 2020;5(4):223–34.
Young BE, Ong SWX, Kalimuddin S, Low JG, Tan SY, Loh J, Ng OT, Marimuthu K, Ang LW, Mak TM, Lau SK, Anderson DE, Chan KS, Tan TY, Ng TY, Cui L, Said Z, Kurupatham L, Chen MIC, Chan M, Vasoo S, Wang LF, Tan BH, Lin RTP, Lee VJM, Leo YS, Lye DC, for the Singapore 2019 Novel Coronavirus Outbreak Research Team. Epidemiologic features and clinical course of patients infected with SARS-CoV-2 in Singapore [published correction appears in JAMA. 2020 Apr 21;323(15):1510]. JAMA. 2020;323(15):1488–94. https://doi.org/10.1001/jama.2020.3204.
Sagnelli C, Celia B, Monari C, Cirillo S, de Angelis G, Bianco A, Coppola N. Management of SARS-CoV-2 pneumonia. J Med Virol. 2021;93(3):1276–87. https://doi.org/10.1002/jmv.26470.
Prieto-Alhambra D, Balló E, Coma E, et al. Filling the gaps in the characterization of the clinical management of COVID-19: 30-day hospital admission and fatality rates in a cohort of 118 150 cases diagnosed in outpatient settings in Spain. Int J Epidemiol. 2021;49(6):1930–9.
Salzberger B, Buder F, Lampl B, et al. Epidemiology of SARS-CoV-2. Infection. 2020;1–7. [published online ahead of print, 2020 Oct 8]
Pan Y, Guan H, Zhou S, Wang Y, Li Q, Zhu T, Hu Q, Xia L. Initial CT findings and temporal changes in patients with the novel coronavirus pneumonia (2019-nCoV): a study of 63 patients in Wuhan, China. Eur Radiol. 2020;30(6):3306–9. https://doi.org/10.1007/s00330-020-06731-x.
Bernard Stoecklin S, Rolland P, Silue Y, et al. First cases of coronavirus disease 2019 (COVID-19) in France: surveillance, investigations and control measures, January 2020. Euro Surveill. 2020;25(6):2000094.
Coma Redon E, Mora N, Prats-Uribe A, et al. Excess cases of influenza and the coronavirus epidemic in Catalonia: a time-series analysis of primary-care electronic medical records covering over 6 million people. BMJ Open. 2020;10:e039369.
Boletín oficial del estado (BOE). Real Decreto 463/2020, de 14 marzo, por el que se declara el estado de alarma para la gestión de la situación de crisis sanitaria ocasionada por el COVID-19: https://www.boe.es/diario_boe/txt.php?id=BOE-A-2020-3692 (Accessed 15 May 2020).
Han E, Tan MMJ, Turk E, Sridhar D, Leung GM, Shibuya K, Asgari N, Oh J, García-Basteiro AL, Hanefeld J, Cook AR, Hsu LY, Teo YY, Heymann D, Clark H, McKee M, Legido-Quigley H. Lessons learnt from easing COVID-19 restrictions: an analysis of countries and regions in Asia Pacific and Europe. Lancet. 2020;396(10261):1525–34. https://doi.org/10.1016/S0140-6736(20)32007-9.
Closas P, Coma E, Méndez L. Sequential detection of influenza epidemics by the Kolmogorov-Smirnov test. BMC Med Inform Decis Mak. 2012;12:112 Published 2012 Oct 3.
Bolíbar B, Fina Avilés F, Morros R, Garcia-Gil Mdel M, Hermosilla E, Ramos R, Rosell M, Rodríguez J, Medina M, Calero S, Prieto-Alhambra D, Grupo SIDIAP. Base de datos SIDIAP: La historia clínica informatizada de Atención Primaria como fuente de información para la investigación epidemiológica. Med Clin (Barc). 2012;138(14):617–21. https://doi.org/10.1016/j.medcli.2012.01.020.
R Core Team. R software: version 3.5.1. R Found Stat Comput, 2018.
CDC COVID-19 Response Team, Jorden MA, Rudman SL, et al. Evidence for limited early spread of COVID-19 within the United States, January-February 2020. MMWR Morb Mortal Wkly Rep. 2020;69(22):680–4 Published 2020 Jun 5.
Chavarria-Miró G, Anfruns-Estrada E, Guix S, et al. Sentinel surveillance of SARS-CoV-2 in wastewater anticipates the occurrence of COVID-19 cases. MedRxiv 2020.06.13.20129627. https://doi.org/10.1101/2020.06.13.20129627.
La Rosa G, Iaconelli M, Mancini P, et al. First detection of SARS-CoV-2 in untreated wastewaters in Italy. Sci Total Environ. 2020;736:139652. https://doi.org/10.1016/j.scitotenv.2020.139652.
Instituto de Salud Carlos III. Vigilancia de los excesos de mortalidad por todas las causas: MoMo: https://www.isciii.es/QueHacemos/Servicios/VigilanciaSaludPublicaRENAVE/EnfermedadesTransmisibles/MoMo/Documents/informesMoMo2020/MoMo_Situacion%20a%209%20de%20diciembre_CNE.pdf (Accessed 15 Dec 2020).
Ludvigsson JF. Systematic review of COVID-19 in children shows milder cases and a better prognosis than adults. Acta Paediatr. 2020;109(6):1088–95. https://doi.org/10.1111/apa.15270.
Kuitunen I, Artama M, Mäkelä L, Backman K, Heiskanen-Kosma T, Renko M. Effect of social distancing due to the COVID-19 pandemic on the incidence of viral respiratory tract infections in children in Finland during early 2020. Pediatr Infect Dis J. 2020;39(12):e423–7. https://doi.org/10.1097/INF.0000000000002845.
Yeoh DK, Foley DA, Minney-Smith CA, et al. The impact of COVID-19 public health measures on detections of influenza and respiratory syncytial virus in children during the 2020 Australian winter [published online ahead of print, 2020 Sep 28]. Clin Infect Dis. 2020:ciaa1475.
Pinky L, Dobrovolny HM. Coinfections of the respiratory tract: viral competition for resources. PLoS One. 2016;11(5):e0155589 Published 2016 May 19.
IDESCAT Territorial and demographic indicators. Estructura per edats, envelliment I dependència, 2020: https://www.idescat.cat/pub/?id=inddt&n=915&lang=en (Accessed 24 Nov 2020).
Li R, Pei S, Chen B, Song Y, Zhang T, Yang W, Shaman J. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science. 2020;368(6490):489–93. https://doi.org/10.1126/science.abb3221.
Violán C, Foguet-Boreu Q, Fernández-Bertolín S, Guisado-Clavero M, Cabrera-Bean M, Formiga F, Valderas JM, Roso-Llorach A. Soft clustering using real-world data for the identification of multimorbidity patterns in an elderly population: cross-sectional study in a Mediterranean population. BMJ Open. 2019;9(8):e029594. https://doi.org/10.1136/bmjopen-2019-029594.
The authors of this paper would like to thank all primary care professionals in Catalonia for their work and resilience in providing care to the Catalan population during these difficult times. We would also like to acknowledge the efforts of all members of the SISAP team during the last months.
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Ethical approval and consent to participate
All analysis and methods of this study have been performed in accordance with the Declaration of Helsinki (last update, Fortaleza, Brazil 2013) and existing relevant guidelines and regulations in Spain. This work was also approved by the Clinical Research Ethics Committee of the IDIAPJGol (project code: 20/172-PCV), including a waiver for the informed consent of patients taking part in the study, and the data extracted from the EHR were fully anonymised. There was no patient or public involvement in this study.
Consent for publication
Cite this article
Coma, E., Méndez-Boo, L., Mora, N. et al. Divergences on expected pneumonia cases during the COVID-19 epidemic in Catalonia: a time-series analysis of primary care electronic health records covering about 6 million people. BMC Infect Dis 21, 283 (2021). https://doi.org/10.1186/s12879-021-05985-0
- Pneumonia
- Public health surveillance
- Electronic health records
- Primary care