
The effect of sociodemographic factors on COVID-19 incidence of 342 cities in China: a geographically weighted regression model analysis



Since December 2019, the coronavirus disease 2019 (COVID-19) has spread quickly among the population and brought a severe global impact. However, considerable geographical disparities in the distribution of COVID-19 incidence existed among different cities. In this study, we aimed to explore the effect of sociodemographic factors on COVID-19 incidence of 342 cities in China from a geographic perspective.


Official surveillance data about the COVID-19 and sociodemographic information in China’s 342 cities were collected. Local geographically weighted Poisson regression (GWPR) model and traditional generalized linear models (GLM) Poisson regression model were compared for optimal analysis.


Compared with the GLM Poisson regression model, the GWPR model reported a substantially lower corrected Akaike Information Criterion (AICc) (61953.0 in GLM vs. 43218.9 in GWPR). No spatial auto-correlation of residuals was found in the GWPR model (global Moran’s I = −0.005, p = 0.468), suggesting that the GWPR model captured the spatial auto-correlation. Cities with a higher gross domestic product (GDP), limited health resources, and a shorter distance to Wuhan were at a higher risk for COVID-19. Furthermore, with the exception of some southeastern cities, the incidence of COVID-19 decreased as population density increased.


Sociodemographic factors may affect COVID-19 incidence. Moreover, our findings and methodology could help other countries understand the local transmission of COVID-19 and develop tailored, country-specific intervention strategies.



The coronavirus disease 2019 (COVID-19) pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), began in December 2019 and has spread quickly among the population [1]. Since the outbreak of COVID-19 in Wuhan, Hubei Province, the Chinese government has taken unprecedented measures in response to this serious public health issue [2]. As a result, the COVID-19 epidemic in China has largely been brought under control, with a total of 80,744 confirmed cases as of March 25th, 2020 [3], after which almost all newly confirmed cases were imported from abroad. Still, COVID-19’s impact is global, with approximately 29 million confirmed cases and over 820,000 deaths among 188 countries by the end of September 14th, 2020 [4]. Therefore, to prevent and control the pandemic, it is crucial to study the features of and risk factors for COVID-19. Although numerous researchers have already studied the epidemiological characteristics, clinical diagnosis and treatment methods for COVID-19 [5,6,7,8,9], few have reported on the geographical distribution of COVID-19 in relation to the sociodemographic factors of different regions.

In medical research, most studies utilize conventional regression models, such as ordinary least squares regression and generalized linear models (GLM) [10,11,12]. However, these conventional regression models generate bias by producing parameters averaged over the whole studied region without considering potential geographical variation. Geographically weighted regression (GWR) is a powerful approach to explore possible geographical variations in the mortality and incidence of infectious diseases and other health problems across space [13, 14]. The geographically weighted Poisson regression (GWPR), extended from GWR, was initially developed to model small-scale mortality that followed the Poisson distribution. Recently, GWPR has been increasingly used to explore the relationships between the incidence or mortality of diseases and geographically varying factors [15,16,17,18].

Therefore, the main issues addressed in this study are as follows: a) to describe the geographical characteristics of COVID-19 incidence across different cities in China; b) to explore the spatially varying relationship of COVID-19 incidence to distances to Wuhan, GDP, health resources, and population density.


Data sources and data setting

Using the available surveillance data on COVID-19 in China, we conducted a geographic epidemiological study with the city as the basic geographical unit. Data on the confirmed cases of COVID-19 as of March 25th, 2020 in each city was extracted from reports of the National Health Commission of the People’s Republic of China and Provincial health committees [3]. From the 2019 China Statistical Yearbooks [19], we also extracted data on the gross domestic product (GDP), population of inhabitants, land area, and health resources indicators (including number of health personnel per 1000 people, number of hospital beds per 1000 people, and number of health institutions per 1000 people) in each city of China.

Each city’s population density was calculated by dividing the population of inhabitants per year by local land area. There is a 3-level administrative structure in China, consisting of the province, city, and district/county. According to the Ministry of Civil Affairs of the People’s Republic of China, we divided China into 346 cities. However, due to the lack of information on COVID-19 in Hong Kong, Macau, Taiwan, and Dongsha Islands, only 342 cities were incorporated in the analysis.

The study’s basic geographic unit was the city, whose geographic location was defined as the coordinates (i.e., latitude/longitude) of the city center where the governmental agencies are located. The geographic information on Chinese cities, including the longitude, latitude, and distance to Wuhan, was acquired from Google Earth.

Data analyses

The incidence of COVID-19 in each city was measured as the number of confirmed cases per million people. A principal component analysis was performed in SPSS 20.0 to extract a synthesized variable from three indicators related to health resources: the number of health personnel per 1000 people, the number of hospital beds per 1000 people, and the number of health institutions per 1000 people. The Kaiser-Meyer-Olkin value and Bartlett’s test of sphericity were used to evaluate the suitability of the data for principal component analysis. In this study, the Kaiser-Meyer-Olkin value was 0.665 and the P-value of Bartlett’s test of sphericity was < 0.001. The first principal component, with a variance contribution of 81.12%, was adopted to represent the overall condition of health resources in each city. GDP was used as a proxy for the socioeconomic status of each studied city. The synthesized health resources variable, GDP, population density, and distance to Wuhan of each city were defined as the explanatory variables in this study. The ArcGIS 10.2 software [20] (Environmental Systems Research Institute, Inc., Redlands, CA, US) was used to map the geographic distributions of COVID-19 incidence and the explanatory variables by city. Patients’ identification numbers, area codes, and research variables (such as incidence, GDP, and population density) were entered in Excel, sorted, and then imported into ArcGIS. The area codes were obtained from the database of the Regulation of the Ministry of Civil Affairs of the People’s Republic of China, and the data table was linked to the map file by area code to draw the visual map.
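The two derived measures used in this study, incidence per million people and population density in units of 100 inhabitants/km², are simple ratios. The sketch below is only an illustration: the function names and the example city are hypothetical and are not taken from the study data.

```python
def incidence_per_million(cases, population):
    """Confirmed cases per million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 1_000_000 / population


def density_per_100(population, land_area_km2):
    """Population density in units of 100 inhabitants per square kilometre."""
    if land_area_km2 <= 0:
        raise ValueError("land area must be positive")
    return population / land_area_km2 / 100


# Hypothetical city: 500 cases, 10 million inhabitants, 5000 km² of land.
print(incidence_per_million(500, 10_000_000))
print(density_per_100(10_000_000, 5000))
```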

The traditional GLM Poisson regression analysis was performed by R 3.5.3 software based on the assumption that the COVID-19 incidence follows the Poisson distribution. The fitting formula of the analysis is expressed as

$$ {\mathrm{lnO}}_{\mathrm{i}}={\upbeta}_0+{\upbeta}_1\left(\mathrm{DEN}\right)+{\upbeta}_2\left(\mathrm{GDP}\right)+{\upbeta}_3\left(\mathrm{DIST}\right)+{\upbeta}_4\left(\mathrm{HEA}\right)+{\upvarepsilon}_{\mathrm{i}} $$

where Oi denotes the incidence of COVID-19 in city i, β0 is the global intercept, and βj (j = 1,2,3,4) are the model parameters corresponding to the explanatory variables. DEN is the average population density (100 inhabitants/km²) of city i. GDP is the gross domestic product (100 million Renminbi Yuan) of city i. DIST is the straight-line distance (100 km) from the municipal building of city i to the municipal building of Wuhan. HEA is the synthesized variable obtained through principal component analysis to reflect the health resources of city i, and εi is the error term of city i.
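To make the log-linear structure of the global model concrete, the following sketch fits a Poisson regression with a single covariate by Newton-Raphson, which for the log link coincides with iteratively reweighted least squares. It is not the R code used in the study: the function name, the one-covariate simplification, and the synthetic data are all assumptions made for illustration.

```python
import math


def fit_poisson_1d(x, y, iterations=25, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson (Fisher scoring)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2, symmetric).
        i00 = sum(mu)
        i01 = sum(mi * xi for xi, mi in zip(x, mu))
        i11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve information * delta = score.
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Synthetic, noise-free responses generated from b0 = 0.5, b1 = 0.3.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]
print(fit_poisson_1d(x, y))
```

With four explanatory variables the same iteration applies, but the 2 x 2 solve becomes a 5 x 5 linear system, which is why statistical software is normally used.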

In the GWPR model, coefficient changes with geographic locations, which means the GWPR model can capture the spatial data’s instability and find the local association between the dependent variable and explanatory variables. The formula of the GWPR model is expressed as

$$ {\mathrm{lnO}}_{\mathrm{i}}={\upbeta}_0\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right)+{\upbeta}_1\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right)\ \left(\mathrm{DEN}\right)+{\upbeta}_2\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right)\ \left(\mathrm{GDP}\right)+{\upbeta}_3\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right)\ \left(\mathrm{DIST}\right)+{\upbeta}_4\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right)\ \left(\mathrm{HEA}\right)+{\upvarepsilon}_{\mathrm{i}} $$

where (ui,vi) denotes the two-dimensional coordinates of each city, and the definitions of the other model parameters are the same as in the GLM Poisson regression model above. The GWR 4.0 software was used to calibrate the GWPR model with the iteratively reweighted least-squares method. A distance-based weighting scheme allocated weights to each city by including only samples within a defined neighbourhood and giving more weight to nearby samples than to distant ones. An adaptive bisquare kernel was used for the geographic weighting when estimating the local coefficients for each city. The optimal bandwidth size was determined automatically using the golden section search method, based on the lowest corrected Akaike Information Criterion (AICc). Because spatial auto-correlation is an important issue in the GLM Poisson regression model, the spatial auto-correlation among observations is expected to be removed after adjusting for the non-stationary effect in the GWPR model. To assess the spatial auto-correlation of both the GLM Poisson regression model and the GWPR model, Moran’s I coefficient, which ranges from −1 to 1 [21], was used. A Moran’s I of zero signifies no spatial auto-correlation. In this study, the AICc and Moran’s I coefficient were used to assess the goodness of fit of the GWPR model and the GLM.
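The adaptive bisquare weighting can be sketched as follows. The bandwidth for each regression point is taken as the distance to its k-th nearest observation, and weights fall smoothly to zero at that distance. This is a minimal illustration, not the GWR 4.0 implementation: the function name, the use of planar Euclidean distance, and the example value of k are assumptions.

```python
import math


def adaptive_bisquare_weights(target, points, k):
    """Bisquare weights of `points` relative to `target`.

    The bandwidth is the distance from `target` to its k-th nearest point
    (the target itself counts if it is in `points`). Points at or beyond
    the bandwidth get weight 0.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    dists = [math.dist(target, p) for p in points]
    bandwidth = sorted(dists)[k - 1]
    if bandwidth == 0:
        raise ValueError("bandwidth is zero; choose a larger k")
    return [
        (1 - (d / bandwidth) ** 2) ** 2 if d < bandwidth else 0.0
        for d in dists
    ]


# Four hypothetical cities on a line; the bandwidth reaches the 3rd nearest.
cities = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(adaptive_bisquare_weights((0.0, 0.0), cities, k=3))
```

Because the bandwidth adapts to the local spacing of cities, regression points in sparsely sampled regions draw on a wider area than those in densely sampled regions.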

The complete analysis proceeded as follows. First, a traditional GLM Poisson regression analysis was performed using R 3.5.3 software to estimate the effects of the explanatory variables on COVID-19 incidence in China’s 342 cities. Second, considering that spatial auto-correlation might not be adjusted for by the traditional GLM Poisson regression model, all explanatory variables were entered into the GWPR model in the GWR 4.0 software to explore the geographical disparities in the effects of the explanatory variables on COVID-19 incidence. Lastly, ArcGIS 10.2 software was used to display the distribution of COVID-19 incidence and the sociodemographic factors on the map of China and to visualize the geographical differences in the relationship between sociodemographic factors and COVID-19 incidence.


By March 25th, 2020, 80,744 confirmed cases of COVID-19 had been diagnosed in 342 cities across China, an incidence of 57.9 per million. Among the studied cities, Wuhan had the highest incidence of COVID-19 (4512.8 per million people), while some cities in the west had the lowest incidence, with few confirmed cases (Fig. 1). The top ten cities with the highest incidence are shown in Table 1. According to the global Moran’s I statistic (Moran’s I = 0.039, p < 0.05), the incidence of COVID-19 showed positive spatial auto-correlation, i.e., a clustered pattern, across China.
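For readers unfamiliar with the statistic, the global Moran’s I can be computed from a vector of values and a spatial weights matrix as sketched below. The function name, the binary neighbour weights, and the toy values are assumptions made for illustration; the sketch does not reproduce the study's weights or its significance test.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Four hypothetical cities in a chain; each is a neighbour of the next one.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], chain))    # similar neighbours: positive I
print(morans_i([1, -1, 1, -1], chain))  # alternating neighbours: negative I
```

Values near zero indicate no spatial pattern, which is the behaviour expected of residuals once a model has accounted for the spatial structure.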

Fig. 1 Spatial distribution of the COVID-19 incidence in China

Table 1 Summary of top ten cities with the highest incidence

Considerable geographical disparities were found in the distribution of our explanatory variables among the studied cities. Compared with the western cities, China’s central and eastern cities have a higher socioeconomic standing (Fig. 2a), denser population (Fig. 2b) and better health resources (Fig. 2c). The distance between Wuhan and each studied city is presented in Fig. 2d. A more detailed description of these study variables is provided in Table 2.

Fig. 2 Spatial distribution of the explanatory variables in China

Table 2 Summary of descriptive statistics of the independent variables and dependent variables

The GLM Poisson regression model shows that the intercept and all four explanatory variables are significant at the 1% level (Table 3). The distance of each studied city to Wuhan is negatively associated with the incidence of COVID-19. When the distance increases by 100 km, the incidence of COVID-19 decreases approximately by a factor of 0.7818. Furthermore, local population density and health resources in each city also show an inverse association with the incidence of COVID-19, suggesting that higher population density and better health resources might reduce the incidence of COVID-19. Interestingly, a higher GDP is associated with an increased incidence, although the association is very weak (the coefficient is 0.0002). After controlling for all explanatory variables using the GLM Poisson regression model, the residuals still exhibit positive spatial auto-correlation (global Moran’s I = 0.128, p < 0.001), indicating that the GLM Poisson analysis is inadequate to address the non-stationary spatial relationships.
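Because the model is linear on the log scale, a coefficient translates into a multiplicative change in expected incidence once it is exponentiated. The sketch below applies this to the GDP coefficient of 0.0002 quoted above; the function names and the choice of a 1000-unit increase are illustrative assumptions.

```python
import math


def rate_ratio(beta, delta=1.0):
    """Multiplicative change in expected incidence when the covariate
    rises by `delta` units, for a log-link model with coefficient `beta`."""
    return math.exp(beta * delta)


def percent_change(beta, delta=1.0):
    """The same change expressed as a percentage."""
    return (rate_ratio(beta, delta) - 1.0) * 100.0


beta_gdp = 0.0002  # per 100 million Renminbi Yuan, as quoted in the text

# One unit (100 million Yuan) barely moves the expected incidence.
print(percent_change(beta_gdp))

# 1000 units (100 billion Yuan) corresponds to roughly a 22% increase.
print(percent_change(beta_gdp, delta=1000))
```

A coefficient that looks very small can therefore still matter, depending on the measurement unit and on how widely the covariate varies between cities.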

Table 3 Summary statistics of traditional GLM Poisson regression model

Fitting the GWPR model with a spatially varying intercept and explanatory variables (Table 4) yielded a substantially lower AICc than fitting the GLM Poisson regression model (43,218.9 in GWPR vs. 61,953.0 in GLM). No spatial auto-correlation of residuals was found in the GWPR model (global Moran’s I = −0.005, p = 0.468), suggesting that the spatial auto-correlation had been captured by the GWPR model.
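The comparison rests on the corrected AIC, which adds a small-sample penalty to the ordinary AIC. The sketch below uses the textbook definition; the function name and the log-likelihood value are hypothetical, and GWR 4.0 may compute the quantity differently (for example from the deviance and an effective number of parameters).

```python
def aicc(log_likelihood, k, n):
    """Corrected AIC: AIC plus the small-sample correction term.

    k is the number of parameters (it may be a non-integer effective
    number for a local model) and n is the number of observations.
    """
    if n - k - 1 <= 0:
        raise ValueError("n must exceed k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)


# Hypothetical log-likelihood; k = 5 (intercept plus four covariates), n = 342.
print(aicc(-100.0, k=5, n=342))

# Difference between the two AICc values reported in the text.
print(round(61953.0 - 43218.9, 1))
```

A lower AICc indicates a better trade-off between fit and complexity, so the gap between the two reported values favours the GWPR model.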

Table 4 Summary statistics of local GWPR model

Figure 3 shows the spatially varying coefficients of the four explanatory variables in the GWPR model. The economic indicator GDP is positively associated with the incidence of COVID-19, with higher coefficients in the central and northern cities (Fig. 3a). As population density increases, the incidence of COVID-19 decreases in most cities, with the exception of the southeastern cities (Fig. 3b). Health resources are also negatively associated with the incidence of COVID-19, with higher coefficients in the central and eastern cities and lower coefficients in the western and northeastern cities (Fig. 3c). A greater distance between Wuhan and the studied cities might decrease the risk of COVID-19, with the coefficient ranging from −1.0596 to −0.6655 among different cities (Fig. 3d).

Fig. 3 Spatial distribution of the coefficients of explanatory variables in the GWPR model


To explore the potential risk factors of COVID-19, a GIS (Geographic Information System) was used to visualize the geographic distributions of COVID-19 incidence in relation to sociodemographic factors including GDP, population density, distance to Wuhan, and health resources. In this study, the local GWPR model and the traditional GLM Poisson regression model were compared to find the better-fitting model for exploring the association between the sociodemographic factors and COVID-19 incidence. The results revealed that, compared with the GLM Poisson regression model, calibration of the GWPR model clearly improves the model fit.

According to the GLM Poisson regression model and the GWPR model, cities with a higher GDP might have an increased risk for COVID-19. A recent study found that the rapid spread of COVID-19 worldwide tended to appear first in the most economically developed regions, where high-level international trade and commercial activities were prevalent. Following the initial spread of COVID-19 along international trade routes between the developed regions, the virus later spread to the developing areas [22]. In our study, the GWPR model showed a higher coefficient in the central and northern cities than in the southern cities of China. A possible explanation for this phenomenon is that the southern cities have a more robust economy than the northern cities. Economic improvement might therefore exert a more extensive and significant influence on the northern cities and accordingly increase the incidence of COVID-19 [23]. Further investigation is required to identify the causes in more detail.

Our results also revealed that, as the distance to Wuhan increases, the incidence of COVID-19 decreases among all of the studied cities in both the GLM Poisson regression model and the GWPR model. The spatially varying coefficients show a decreasing trend from the southeast to the northwest in the GWPR model. Since more than 5 million people had already left Wuhan before it was officially sealed off, we were unable to track where exactly these people had gone. Therefore, the distance to Wuhan could be used in part to represent this massive human migration. Cities located at a greater distance from Wuhan will experience less or even no contact with the infectious sources, which hinders the spread of COVID-19. In contrast, in cities near Wuhan with convenient transportation systems and a high volume of traffic, residents were more likely to come into contact with the infectious sources, which promotes the spread of COVID-19. Consistent with our current and previous findings [24], other studies have also revealed the aggregation characteristics of the virus and highlighted the importance of shutting down the epidemic areas and isolating the infectious sources [25].

According to the GWPR model, the coefficients of health resources were negative in all 342 cities and showed a decreasing trend from the southeast to the northwest, indicating that better health resources might mitigate the spread of COVID-19. Better health resources could help identify the sources of infection and give suspected patients and close contacts better access to quarantine measures, which in turn prevents the spread of COVID-19 and reduces its incidence. Other studies have also emphasized the importance of controlling the sources of infection and cutting off the routes of transmission [26]. However, it is worth noting that health resources were scarcer in the western cities than in the central and eastern cities of China. Previous reports have also confirmed the substantial regional disparities in both the availability and accessibility of health resources in China [27]. Since the outbreak of COVID-19, the Chinese government has undertaken tremendous efforts in constructing new medical facilities, mobilizing the country’s vast and robust medical forces and accelerating the delivery of medical supplies, and as a consequence has quickly brought the epidemic under control. This concurs with our findings. In order to effectively control the spread of COVID-19, we urge all governments to increase the amount of available and accessible medical and health resources in various regions. China’s situation could provide a guide to other countries on how to prepare for possible local outbreaks, especially for resource-limited countries [28].

With regard to population density, both the GWPR model and the GLM Poisson regression model showed a negative association between the population density of each city and the incidence of COVID-19. In the GWPR model, this effect decreases from the north, which has a lower population density, to the south, which has a higher population density. Paradoxically, COVID-19 incidence is higher in cities with a lower population density. This unusual spreading pattern in China is possibly due to the following reasons. First, many large, usually highly populated cities are much less populated during the Spring Festival in China because of the massive migration of people from these cities to less populated medium and small cities and rural areas for family reunions. Second, after the outbreak of COVID-19 in Wuhan, many residents of highly populated large cities, including Wuhan, undertook “evasive activity” by returning to less populated small cities or rural areas. Notably, a study from the United States reported that household size, rather than overall population density, is more strongly associated with the prevalence of COVID-19 [29]. Moreover, another study found that population density is a more useful predictor of COVID-19 infections and mortality for metropolitan areas than for rural areas [30]. Thus, the relationship between population density and the incidence of COVID-19 needs to be explored further.

This study has some limitations. First, the observed differences may be subject to many unobserved and unavailable confounding factors such as age, gender, nationality, and other natural factors, none of which were accounted for in the multivariate analysis. Second, because this study is based on surveillance data, a causal relationship between sociodemographic characteristics and the incidence of COVID-19 could not be demonstrated. Third, because policies and measures in response to COVID-19 differ between countries, our results could not be extrapolated to other countries. Nevertheless, to the best of our knowledge, this study is the first to combine COVID-19 surveillance and sociodemographic data in a GIS and analyze the possible risk factors of COVID-19 incidence in China from a spatial perspective, filling a gap in knowledge about this geographical region.


Our results show that the local GWPR model fits better than the traditional GLM Poisson regression model when investigating the effects of sociodemographic factors on COVID-19. Cities with a higher GDP, limited health resources, and a shorter distance to Wuhan were at a higher risk for COVID-19. Moreover, the relationship between population density and COVID-19 incidence might be mediated by the peculiar set of circumstances during the spread of the virus in China, i.e., the Spring Festival and Spring Transportation. In conclusion, these findings shed light on the effect of sociodemographic factors on COVID-19 incidence from a geographic perspective and have important public health policy implications for COVID-19 management and prevention in China. In addition, the study could be used as a guide for other countries to understand the local spread of COVID-19.

Availability of data and materials

The city-level numbers of confirmed COVID-19 cases are available from the National Health Commission of the People’s Republic of China and Provincial health committees [3]. Data on the GDP, population of inhabitants, land area, and health resources indicators in each city of China are available from the 2019 China Statistical Yearbooks [19].



COVID-19: Coronavirus disease 2019

SARS-CoV-2: Severe acute respiratory syndrome coronavirus 2

GLM: Generalized linear models

GWR: Geographically weighted regression

GWPR: Geographically weighted Poisson regression

GDP: Gross domestic product

AICc: Corrected Akaike Information Criteria


  1. Chan JF-W, Yuan S, Kok K-H, To KK-W, Chu H, Yang J, et al. A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster. Lancet. 2020;395(10223):514–23.


  2. Chen S, Yang J, Yang W, Wang C, Barnighausen T. COVID-19 control in China during mass population movements at new year. Lancet. 2020;395(10226):764–6.


  3. National Health Commission of the People's Republic of China. The latest situation of pneumonia epidemic of new coronavirus infection at 24:00 on March 25. 2020. Available from:

  4. Johns Hopkins Coronavirus Resource Center (JHCRC). COVID-19 Dashboard. 2020. Available from: (Accessed 14 Sept 2020).

  5. Han Y, Liu Y, Zhou L, Chen E, Liu P, Pan X, et al. Epidemiological assessment of imported coronavirus disease 2019 (COVID-19) cases in the most affected city outside of Hubei Province, Wenzhou, China. JAMA Netw Open. 2020;3(4):e206785.

  6. Hung IF-N, Lung K-C, Tso EY-K, Liu R, Chung TW-H, Chu M-Y, et al. Triple combination of interferon beta-1b, lopinavir-ritonavir, and ribavirin in the treatment of patients admitted to hospital with COVID-19: an open-label, randomised, phase 2 trial. Lancet (London, England). 2020;395(10238):1695–704.


  7. Pan A, Liu L, Wang C, Guo H, Hao X, Wang Q, et al. Association of public health interventions with the epidemiology of the COVID-19 outbreak in Wuhan, China. JAMA. 2020;323(19):1915–23.


  8. Wu Z, McGoogan JM. Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72 314 cases from the Chinese Center for Disease Control and Prevention. JAMA. 2020;323(13):1239–42.


  9. Xie J, Tong Z, Guan X, Du B, Qiu H. Clinical Characteristics of Patients Who Died of Coronavirus Disease 2019 in China. JAMA Netw Open. 2020;3(4):e205619.

  10. Choi M, Lee M, Lee MJ, Jung D. Physical activity, quality of life and successful aging among community-dwelling older adults. Int Nurs Rev. 2017;64(3):396–404.


  11. Zheng XY, Qin GY, Tu DS. A generalized partially linear mean-covariance regression model for longitudinal proportional data, with applications to the analysis of quality of life data from cancer clinical trials. Stat Med. 2017;36(12):1884–94.


  12. Takele K, Zewotir T, Ndanguza D. Understanding correlates of child stunting in Ethiopia using generalized linear mixed models. BMC Public Health. 2019;19(1):626.


  13. Yang TC, Matthews SA. Understanding the non-stationary associations between distrust of the health care system, health conditions, and self-rated health in the elderly: a geographically weighted regression approach. Health Place. 2012;18(3):576–85.


  14. Zhou Y-B, Wang Q-X, Liang S, Gong Y-H, Yang M-X, Chen Y, et al. Geographical variations in risk factors associated with HIV infection among drug users in a prefecture in Southwest China. Infect Dis Poverty. 2015;4(1):38.


  15. Yang TC, Shoff C, Matthews SA. Examining the spatially non-stationary associations between the second demographic transition and infant mortality: a Poisson GWR approach. Spat Demogr. 2013;1(1):17–40.


  16. Manyangadze T, Chimbari MJ, Macherera M, Mukaratirwa S. Micro-spatial distribution of malaria cases and control strategies at ward level in Gwanda district, Matabeleland South, Zimbabwe. Malar J. 2017;16(1):476.


  17. Wang N, Mengersen K, Tong S, Kimlin M, Zhou M, Liu Y, et al. County-level variation in the long-term association between PM2.5 and lung cancer mortality in China. Sci Total Environ. 2020;738:140195.


  18. Alves AT, Nobre FF, Waller LA. Exploring spatial patterns in the associations between local AIDS incidence and socioeconomic and demographic variables in the state of Rio de Janeiro, Brazil. Spat Spatiotemporal Epidemiol. 2016;17:85–93.


  19. National Bureau of Statistics. China Statistical Yearbook, 2020. China Statistics Press. (Accessed 22 Apr 2021).

  20. ArcGIS Resources: ArcGIS Help 10.2, 10.2.1, and 10.2.2. Available from: (Accessed 19 Jan 2021).

  21. Bui LV, Mor Z, Chemtob D, Ha ST, Levine H. Use of geographically weighted Poisson regression to examine the effect of distance on tuberculosis incidence: a case study in Nam Dinh, Vietnam. PLoS One. 2018;13(11):e0207068.


  22. Ludovic J, Bourdin S, Nadou F & Noiret G. Economic globalization and the COVID-19 pandemic: global spread and inequalities. [Preprint]. Bull World Health Organ. E-pub: 23 April 2020.

  23. Xing JN, Guo W, Qian SS, Ding ZW, Chen FF, Peng ZH, et al. Association between macroscopic-factors and identified HIV/AIDS cases among injecting drug users: an analysis using geographically weighted regression model. Biomed Environ Sci. 2014;27(4):311–8.


  24. Gokmen Y, Turen U, Erdem H, Tokmak İ. National Preferred Interpersonal Distance Curbs the Spread of COVID-19: A Cross-Country Analysis. Disaster Med Public Health Prep. 2020. p. 1–7.

  25. Medeiros de Figueiredo A, Daponte Codina A, Moreira Marculino Figueiredo DC, Saez M, Cabrera León A. Impact of lockdown on COVID-19 incidence and mortality in China: an interrupted time series study. [Preprint]. Bull World Health Organ.

  26. Harapan H, Itoh N, Yufika A, Winardi W, Keam S, te H, et al. Coronavirus disease 2019 (COVID-19): a literature review. J Infect Public Health. 2020;13(5):667–73.


  27. Yu M, He S, Wu D, Zhu H, Webster C. Examining the multi-scalar unevenness of high-quality healthcare resources distribution in China. Int J Environ Res Public Health. 2019;16(16):2813.


  28. Makoni M. Africa prepares for coronavirus. Lancet. 2020;395(10223):483.


  29. Maroko AR, Nash D, Pavilonis BT. COVID-19 and inequity: a comparative spatial analysis of New York City and Chicago hot spots. J Urban Health. 2020;97(4):461–70.


  30. Zhang CH, Schwartz GG. Spatial disparities in coronavirus incidence and mortality in the United States: an ecological analysis as of May 2020. J Rural Health. 2020;36(3):433–45.


  31. World Medical Association. World medical association declaration of Helsinki: ethical principles for medical research involving human subjects. JAMA. 2013;310(20):2191–4.




We thank the health workers for their outstanding contribution and sacrifice during the COVID-19 pandemic.


This work was supported by the COVID-2019 Emergency Prevention Science and Technology Project of Xi’an City (grant number: 2020005YX005), the National Natural Science Foundation of China (grant numbers: 81602928 and 81541069), and the Natural Science Foundation of Shaanxi (grant number: 2017JM8102). The funding bodies had no role in the design of the study; the collection, analysis, and interpretation of data; or the writing of the manuscript.

Author information

Authors and Affiliations



Z.H., P.L. and L.Y. conceived and designed the study; C.F., M.B. and Z.L. conducted literature search and collected the data; Z.H., and P.L. performed statistical analysis and wrote the manuscript. All authors reviewed and approved this manuscript.

Corresponding author

Correspondence to Leilei Pei.

Ethics declarations

Ethics approval and consent to participate

Our study was conducted following the Declaration of Helsinki guidelines [31]. All the data used in our study were public data from official government websites and did not involve copyright issues. To protect the privacy of individuals, all data analyzed were anonymized during data use, processing, sharing and interaction. None of the study personnel could see the personal information of individuals.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Zhang, H., Liu, Y., Chen, F. et al. The effect of sociodemographic factors on COVID-19 incidence of 342 cities in China: a geographically weighted regression model analysis. BMC Infect Dis 21, 428 (2021).
