
Table 1 String-matching counts. The first two columns refer to the first pattern: matches for the dyspnea synonyms (“Dyspnea”, “Respiratory Discomfort”, and “Shortage of Breath”) in the reported-symptoms field of the ECU patient dataset. The last two columns refer to the second pattern, which is searched only if the first one is found: matches for the words “Cough”, “Fever”, or both, with “-” denoting no match. The most frequent combined pattern is “Dyspnea” with “Cough”, corresponding to 30,828 (~41%) of the suspected cases. The total number of suspected cases in the studied period is 75,289 (~7% of 1,026,804, the total number of confirmed cases), calculated as the sum of all records that match both the first and the second pattern.

From: COVID-19 outbreaks surveillance through text mining applied to electronic health records
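The two-stage matching described in the caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`classify`, `count_patterns`), the case-insensitive substring matching, and the rule that the first listed synonym wins when several appear are all assumptions, since the caption does not specify them.

```python
from collections import Counter

# Terms taken from the table caption; the matching rules below are assumptions.
PATTERN_1 = ("dyspnea", "respiratory discomfort", "shortage of breath")
PATTERN_2 = ("cough", "fever")


def classify(symptoms: str):
    """Return (pattern-1 label, pattern-2 label), or None if pattern 1 is absent."""
    text = symptoms.lower()  # assumed: matching is case-insensitive
    # Assumed: if several synonyms occur, the first one listed in PATTERN_1 wins.
    first = next((term for term in PATTERN_1 if term in text), None)
    if first is None:
        return None  # the second pattern is searched only when the first is found
    found = [term for term in PATTERN_2 if term in text]
    second = " & ".join(found) if found else "-"
    return first, second


def count_patterns(records):
    """Count records per (pattern 1, pattern 2) combination."""
    counts = Counter()
    for record in records:
        label = classify(record)
        if label is not None:
            counts[label] += 1
    return counts


def count_suspected(counts):
    """Suspected cases: records where both patterns matched (second label is not '-')."""
    return sum(n for (_, second), n in counts.items() if second != "-")
```

For instance, `classify("Fever and cough, shortage of breath")` yields the "shortage of breath" label combined with "cough & fever", while a record mentioning only fever is not counted at all.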

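| Pattern 1 | Total | Pattern 2 | Total Combined |
|---|---|---|---|
| Dyspnea | 101,523 | Cough | 30,828 |
| | | Fever | 9,137 |
| | | Cough & Fever | 12,701 |
| | | - | 48,857 |
| Respiratory Discomfort | 6,999 | Cough | 1,907 |
| | | Fever | 648 |
| | | Cough & Fever | 795 |
| | | - | 3,649 |
| Shortage of Breath | 44,916 | Cough | 10,373 |
| | | Fever | 4,414 |
| | | Cough & Fever | 4,486 |
| | | - | 25,643 |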

Total Suspected Cases = 75,289
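The totals can be cross-checked with a short calculation. The sketch below copies the counts from the table; the dictionary layout and variable names are illustrative assumptions. It confirms that each Pattern 1 total equals the sum of its four Pattern 2 rows, and that the suspected-case total is the sum of the rows where Pattern 2 matched.

```python
# Counts copied from Table 1; "-" means the second pattern was not found.
table = {
    "Dyspnea": {"total": 101_523, "Cough": 30_828, "Fever": 9_137,
                "Cough & Fever": 12_701, "-": 48_857},
    "Respiratory Discomfort": {"total": 6_999, "Cough": 1_907, "Fever": 648,
                               "Cough & Fever": 795, "-": 3_649},
    "Shortage of Breath": {"total": 44_916, "Cough": 10_373, "Fever": 4_414,
                           "Cough & Fever": 4_486, "-": 25_643},
}
confirmed_cases = 1_026_804  # figure quoted in the caption

# Each Pattern 1 total should equal the sum of its Pattern 2 breakdown.
for name, row in table.items():
    breakdown = sum(v for k, v in row.items() if k != "total")
    assert breakdown == row["total"], name

# Suspected cases: both patterns matched, so the "-" rows are excluded.
suspected = sum(
    v for row in table.values() for k, v in row.items() if k not in ("total", "-")
)
share_dyspnea_cough = 100 * table["Dyspnea"]["Cough"] / suspected
share_of_confirmed = 100 * suspected / confirmed_cases

print(suspected)                      # 75289
print(round(share_dyspnea_cough, 1))  # 40.9
print(round(share_of_confirmed, 1))   # 7.3
```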