Hospital surveillance data
This was not an experimental study but an observational, descriptive epidemiological study using anonymous data extracted retrospectively from our hospital database. In accordance with French legislation, this type of analysis does not require approval by an ethics committee, and the study is registered under No. 13–156 in our hospital's registry of data processing exempted from declaration to the CNIL (Commission Nationale de l’Informatique et des Libertés, French Commission on Information Technology and Liberties).
Hospital surveillance data were extracted and processed from the electronic medical records of patients older than 15 years of age who visited the adult emergency department (ED) of the North Hospital Group, Lyon University Hospitals, between June 1st, 2007 and March 31st, 2011 (N = 101,001). Three time series were built from the hospital data: UrgIndex-hospitalisations, ICD10-consultations, and ICD10-hospitalisations.
Syndromic surveillance based on the automatic extraction from medical records of the emergency department (UrgIndex-hospitalisations)
The first hospital time series, UrgIndex-hospitalisations, is based on data extracted from patients' medical records. The ED computerized medical record is made up of two types of variables: structured variables (e.g., age, emergency diagnosis code), and free-text variables written in natural language (e.g., chief complaint, observation notes). The data extracted from the ED computerized medical record were processed by UrgIndex. The corresponding algorithm processes both types of variables (structured and free-text) in three steps:
1) matching keywords that describe different syndromes with the text of the computerized medical record, by means of an application that automatically processes natural language variables [5];
2) for each patient, calculating the probability of having a potentially transmissible infection; and
3) determining whether the probability is above a detection threshold set by the user depending on the sensitivity and positive predictive value of the detection algorithm [6].
The time series consisted of the daily number of patients hospitalised with a respiratory syndrome after an emergency visit in the same hospital. The infectious diseases detected by UrgIndex and classified as “upper airways or respiratory syndrome” were influenza, viral pneumonias other than influenza, bacterial pneumonias, bronchitis, infections of the upper airways, and tuberculosis.
Influenza surveillance based on ICD-10 codes assigned by physicians after ED consultation (ICD10-consultations)
Discharge summaries, produced at the end of each ED visit, provide the information necessary for regional and national health surveillance. In 2006, the content and format of this discharge summary were defined at the national level. Data to be transmitted to a regional server (Oscour server) must be extracted from the computer systems deployed in the EDs. These data include the ICD-10 coding of the medical cause of the visit [14]. This coding is done by the emergency physician at the end of the consultation.
The second hospital time series (ICD10-consultations) was composed of the daily number of patients who visited the ED for a medical cause coded as influenza (J09, J10, J11, J10.0, J10.1, J10.8, J11.0, J11.1, and J11.8) in the discharge summary of the ED consultation.
Influenza surveillance based on ICD-10 codes assigned by physicians in patients hospitalised after ED consultation (ICD10-hospitalisations)
The third time series (ICD10-hospitalisations) was composed of the daily number of patients hospitalised after an ED visit for whom a medical cause coded as influenza (J09, J10, J11, J10.0, J10.1, J10.8, J11.0, J11.1, and J11.8) was recorded in the discharge summary of the ED consultation.
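As an illustration of how the two ICD-10 based series can be derived from visit-level records, the following is a minimal sketch. The record layout (a dictionary with `date`, `icd10` and `hospitalised` keys) and the function name `daily_counts` are assumptions made for the example, not the structure of the hospital database.

```python
from collections import Counter
from datetime import date

# Influenza codes listed in the text for both ICD-10 based series.
INFLUENZA_CODES = {"J09", "J10", "J11", "J10.0", "J10.1",
                   "J10.8", "J11.0", "J11.1", "J11.8"}

def daily_counts(visits, hospitalised_only=False):
    """Count influenza-coded ED visits per day.

    `visits` is an iterable of dicts with hypothetical keys:
    'date' (datetime.date), 'icd10' (str), 'hospitalised' (bool).
    """
    counts = Counter()
    for visit in visits:
        if visit["icd10"] not in INFLUENZA_CODES:
            continue  # not coded as influenza
        if hospitalised_only and not visit["hospitalised"]:
            continue  # keep only patients admitted after the ED visit
        counts[visit["date"]] += 1
    return counts

# Illustrative records only (not study data).
visits = [
    {"date": date(2009, 11, 2), "icd10": "J11.1", "hospitalised": False},
    {"date": date(2009, 11, 2), "icd10": "J10.0", "hospitalised": True},
    {"date": date(2009, 11, 3), "icd10": "I10", "hospitalised": True},
]
consultations = daily_counts(visits)                             # ICD10-consultations
hospitalisations = daily_counts(visits, hospitalised_only=True)  # ICD10-hospitalisations
```

Days without any influenza-coded visit are simply absent from the counter and would need to be filled with zeros before applying a detection algorithm.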
Regional surveillance data
Four time series were obtained using regional data from existing influenza surveillance systems.
Oscour® surveillance system
The first series included the daily number of patients who visited EDs for influenza in hospitals participating in the Oscour® network (Oscour-consultations). This network collects summary data extracted from computer systems deployed in the EDs and transmitted to the Oscour® regional server.
The second series included a subgroup of the first series: the daily number of patients hospitalised for influenza after an ED visit within the Oscour® network (Oscour-hospitalisation).
The Oscour® system started operating in June 2009 in the Rhône-Alpes region. The time series began on June 29th, 2009 (week 27) and ended on April 3rd, 2011 (week 13). The data analysed in this study came from the 19 hospitals that participated in the network throughout the study period. The local hospital data presented in this study came from one of these 19 hospitals. Discharge diagnostic codes were available for at least 70% of the patients at all but one hospital, where they were available for only 10% of the patients.
Regional sentinel network
The second source of regional series was the Sentinel network for the Rhône-Alpes region. The corresponding third series included the weekly number of patients who consulted their general practitioner for influenza-like illness (i.e., sudden fever > 39°C, with myalgia and respiratory signs). The data in this series were downloaded for the Rhône-Alpes region and the study period from the website of the Sentinel network, INSERM, UPMC (http://www.sentiweb.fr).
Google Flu Trends website
The fourth regional series included the weekly number of queries about influenza made on the Google search engine by users living in the Rhône-Alpes region. The data were downloaded for the study period from the Google Flu Trends website (http://www.google.org/flutrends/fr/#FR).
Descriptive analysis of local hospital and regional time series
Regional and local hospital data were aggregated by week and described graphically for seasonal fluctuations, amplitude of the fluctuations, and the beginning of peaks of activity. The total number of ED visits, hospitalisations or Google queries was calculated for every outbreak period and for every time series.
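The weekly aggregation can be sketched as follows. This is a minimal example that assumes ISO weeks (Monday to Sunday), a convention under which June 29th, 2009 falls in week 27 as in the Oscour® series described earlier; the function name `weekly_totals` is hypothetical.

```python
from collections import defaultdict
from datetime import date

def weekly_totals(daily):
    """Sum daily counts into (ISO year, ISO week) totals.

    `daily` maps datetime.date -> count. ISO week numbering is an
    assumption of this sketch, not a convention stated in the text.
    """
    weekly = defaultdict(int)
    for day, n in daily.items():
        iso_year, iso_week, _weekday = day.isocalendar()
        weekly[(iso_year, iso_week)] += n
    return dict(weekly)

# Illustrative counts only (not study data).
daily = {date(2009, 6, 29): 2, date(2009, 7, 5): 3, date(2009, 7, 6): 1}
weekly = weekly_totals(daily)
```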
Method to detect community outbreaks from local hospital data
The cumulative sum (CUSUM) method was applied to local hospital data for outbreak detection by calculating the numbers of daily patient visits and identifying those that were above an outbreak threshold (set by a computational method described below). The three algorithms described by Hutwagner for the Early Aberration Reporting System (EARS), the surveillance system developed by the United States Centers for Disease Control and Prevention (CDC), were applied to the hospital time series [15]. The formula for the three algorithms was:

$$S_t = \max\left(0,\; S_{t-1} + \frac{X_t - (\mu_0 + k\,\sigma_{x_t})}{\sigma_{x_t}}\right)$$
St is the CUSUM statistic computed at time t, Xt the number of observations at time t, μ0 the expected mean, σxt the variance, and k the detectable difference from the mean. The CUSUM algorithms used three different moving average calculation methods for μ0 (C1-mild, C2-medium, C3-ultra). The C1-mild method used a moving average calculated as the mean of ED visits during the previous 7 days (t-7 to t-1). The C2-medium method used a moving average calculated as the mean of visits during 7 days with a free interval, or lag, of 2 days (t-9 to t-3). The C3-ultra method used a moving average calculated over the same period as C2, but the C3 statistic was the sum of the statistics from three sampling days (St, St-1, and St-2). An algorithm for detecting a new outbreak was built in three steps:
1) Calculating St, using either C1, C2 or C3 for the estimation of μ0, and different values of k.
2) Defining the threshold of outbreak detection: this threshold was set at μ0 (average number of visits) plus three standard deviations (SD). A signal was defined as a value of St exceeding this threshold.
3) Defining the duration of the signal required to trigger an alert: different numbers of consecutive days with a signal were assessed for declaring a potential community outbreak: 1-day signal, 3-day signal and 5-day signal.
Varying the method used to calculate St, the value of k, and the number of consecutive days with a signal made it possible to evaluate different algorithms. A total of 54 algorithms were assessed for the three time series.
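The three steps above can be sketched in code. This is an illustrative reading, not the implementation used in the study, and it rests on several assumptions: the statistic follows the standard one-sided CUSUM recursion, the spread is the sample standard deviation of the same 7-day baseline, a floor `min_sd` avoids division by zero on flat baselines, and the threshold is expressed in SD units (so 3.0 stands for "mean plus three SD"). The function names, `min_sd` and the default values are hypothetical.

```python
from statistics import mean, stdev

def cusum_statistics(counts, method="C1", k=1.0, min_sd=0.5):
    """Daily CUSUM statistic for a list of daily counts.

    C1: baseline t-7..t-1; C2 and C3: baseline t-9..t-3 (2-day lag).
    C3 additionally sums the statistics of days t, t-1 and t-2.
    """
    lag = 0 if method == "C1" else 2
    stats = [0.0] * len(counts)
    previous = 0.0
    for t in range(7 + lag, len(counts)):
        window = counts[t - lag - 7 : t - lag]    # 7 baseline days
        mu0 = mean(window)
        sd = max(stdev(window), min_sd)           # assumed floor on the SD
        excess = (counts[t] - (mu0 + k * sd)) / sd
        previous = max(0.0, previous + excess)    # one-sided CUSUM recursion
        stats[t] = previous
    if method == "C3":
        stats = [sum(stats[max(0, t - 2) : t + 1]) for t in range(len(stats))]
    return stats

def alert_days(stats, threshold=3.0, run_length=3):
    """Days on which the signal has lasted `run_length` consecutive days."""
    alerts, run = [], 0
    for t, s in enumerate(stats):
        run = run + 1 if s > threshold else 0     # signal: statistic above threshold
        if run >= run_length:
            alerts.append(t)
    return alerts

# Illustrative series only: flat baseline, then a 5-day surge from day 14.
counts = [10] * 14 + [30] * 5 + [10] * 5
for method in ("C1", "C2", "C3"):
    stats = cusum_statistics(counts, method=method, k=1.0)
    print(method, alert_days(stats, threshold=3.0, run_length=1)[:1])
```

Looping over the three methods, a set of k values and the three signal durations (1, 3 and 5 days) would enumerate a family of candidate algorithms in the same spirit as the evaluation described above.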
Evaluation of detection performance
To evaluate the ability of ED data from a single hospital to detect community influenza outbreaks, the hospital series were compared with regional data. The outbreak reference periods were taken from publications of the regional unit (Cire Rhône-Alpes) of the French Institute for Public Health Surveillance (InVS), which are used to inform regional healthcare facilities when the detection threshold for a regional influenza epidemic is reached. The outbreak thresholds were based on the application of the Serfling method to the regional Sentinel network data [16]. As regional data are available as weekly aggregated data, the beginning of an outbreak was defined as the first day of the week. We considered as true alerts only signals that began before the outbreak period and were still ongoing during it, or signals that began during the outbreak period. If a signal began after the outbreak period (with no signal during the period), it was considered a false alarm and the outbreak was considered not detected.
The algorithms’ performance was assessed using sensitivity, specificity and timeliness as defined by Hutwagner et al. and Cowling et al. [15, 17]. Sensitivity was defined as the number of outbreaks in which ≥ 1 day was flagged by the CUSUM algorithm divided by the total number of outbreaks. Specificity was defined as the number of days in non-epidemic periods that were not flagged by the CUSUM algorithm divided by the total number of days in non-epidemic periods. The mean timeliness was defined as the mean number of complete days between the beginning of an outbreak and the first day the outbreak was flagged. If the surveillance system detects an outbreak before the community alert, the timeliness is negative; if it detects the outbreak after the community alert, the timeliness is positive.
To take into account the particular epidemiological situation related to pandemic influenza in 2009, a sensitivity analysis was conducted. This analysis consisted of excluding the 2009 pandemic year, starting from the end of the monitoring period of the previous year (which ended in 2009-S18), i.e. from 2009-S19 to 2010-S18 (see Figure 1).
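The three performance measures defined above can be computed from a list of flagged days and the reference outbreak periods, as in the minimal sketch below. Days are represented as integer indices, outbreak periods as inclusive `(start, end)` pairs, and the function name `evaluate` is hypothetical. A signal that starts before an outbreak and is still ongoing when it begins is traced back to its first day, which yields a negative timeliness.

```python
from statistics import mean

def evaluate(flagged, outbreaks, n_days):
    """Return (sensitivity, specificity, mean timeliness).

    flagged   : iterable of flagged day indices
    outbreaks : list of (start, end) day indices, both inclusive
    n_days    : total number of days in the surveillance period
    """
    flagged = set(flagged)
    epidemic = set()
    for start, end in outbreaks:
        epidemic.update(range(start, end + 1))

    delays = []
    for start, end in outbreaks:
        hits = [d for d in range(start, end + 1) if d in flagged]
        if not hits:
            continue                    # outbreak not detected
        first = hits[0]
        while first - 1 in flagged:     # signal already ongoing before the outbreak
            first -= 1
        delays.append(first - start)    # negative if flagged before the outbreak

    sensitivity = len(delays) / len(outbreaks)
    non_epidemic = [d for d in range(n_days) if d not in epidemic]
    unflagged = sum(1 for d in non_epidemic if d not in flagged)
    specificity = unflagged / len(non_epidemic)
    timeliness = mean(delays) if delays else None
    return sensitivity, specificity, timeliness

# Illustrative values only (not study data).
sens, spec, timeliness = evaluate(
    flagged=[18, 19, 20, 21, 65, 90],
    outbreaks=[(20, 40), (60, 80)],
    n_days=100,
)
```

In this sketch, flagged days that precede an outbreak still count against specificity, because they fall in a non-epidemic period.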