Study area and data collection
Shenyang (41°N, 123°E; Figure 1), the capital of Liaoning Province and the largest city in northeast China, was selected as the study area. It has a population of about 7.6 million. Shenyang has a semi-humid continental climate of the north temperate zone, with distinct seasons and precipitation concentrated by the monsoon. The annual average temperature is about 8.1°C; the highest monthly average temperature is 24.0°C in July, and the lowest is -8.5°C in January. Annual rainfall is 501.5 mm and the frost-free period is 183 days. The four seasons are spring (March to May), summer (June to August), autumn (September to November) and winter (December to February). Demographic information for Shenyang was collected from local government reports.
Bacillary dysentery is a legally mandated notifiable disease in China. The Law on the Prevention and Control of Infectious Diseases [6] requires health-care staff to report any of the 37 notifiable infectious diseases, including bacillary dysentery, to the Center for Disease Control and Prevention (CDC) through the National Notifiable Infectious Disease Reporting system (NIDR). Monthly incidence data for bacillary dysentery in Shenyang from 1950 to 1996 were obtained from the Liaoning Center for Disease Control and Prevention. Because the study period covers a long time span, it was divided into four time subgroups based on the historical and economic development of China. To control for seasonal effects, two models were set up in each time subgroup: one based on data from January to July and the other based on data from August to December. Meteorological data were collected from the Shenyang Meteorological Bureau and consisted of the corresponding monthly air pressure, average temperature, maximum temperature, minimum temperature, precipitation, evaporation and relative humidity.
Data analysis
Correlation analysis
The relationship between the monthly mean meteorological factors and the monthly incidence of bacillary dysentery was examined. Spearman's correlation was used to quantify the relationship between the monthly weather variables and the monthly incidence of bacillary dysentery at lags of one to four months.
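As an illustration of the lagged correlation described above, the following minimal sketch computes Spearman's coefficient for several lags using only the Python standard library. It is not the SPSS procedure used in the study. The function names, the two twelve-month series, and the assumption that the weather value in month t - lag is paired with incidence in month t are all hypothetical choices made for the example.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def lagged_spearman(weather, incidence, lag):
    """Pair weather in month t - lag with incidence in month t (assumed direction)."""
    if lag == 0:
        return spearman(weather, incidence)
    return spearman(weather[:-lag], incidence[lag:])


# Made-up series for illustration only (not study data). Incidence is built
# to follow temperature by exactly one month.
temperature = [-8, -5, 2, 10, 17, 22, 24, 23, 17, 9, 0, -6]
incidence = [10] + [2 * t + 30 for t in temperature[:-1]]

for lag in range(1, 5):
    print(lag, round(lagged_spearman(temperature, incidence, lag), 3))
```

Each additional month of lag drops one pair of observations, so the coefficients for longer lags rest on slightly fewer data points.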
Multi-collinearity diagnosis
Multi-collinearity was diagnosed from the tolerance value or variance inflation factor (VIF) and the condition index. The ith tolerance value is defined as $1 - R_i^2$, where $R_i^2$ is the coefficient of determination for the regression of the ith independent variable on all the other independent variables. The VIF is the reciprocal of the tolerance value. Following the recommendations of several statisticians [7], a largest VIF greater than 10 and/or an average VIF greater than 6 indicates strong collinearity, and variables are considered collinear if their condition indexes exceed 20 and the corresponding variance proportions exceed 0.5.
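The tolerance and VIF definitions above can be sketched as follows. The example relies on the standard identity that the VIF of the ith predictor equals the ith diagonal element of the inverse of the predictors' correlation matrix, so no separate regressions are needed. The correlation values, variable names and helper functions are hypothetical, and the condition index is not covered here.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_and_tolerance(corr):
    """VIF_i is the ith diagonal of the inverse correlation matrix;
    tolerance_i = 1 - R_i^2 = 1 / VIF_i."""
    inverse = invert(corr)
    vifs = [inverse[i][i] for i in range(len(corr))]
    tolerances = [1.0 / v for v in vifs]
    return vifs, tolerances


# Hypothetical correlation matrix among three predictors (not study data).
names = ["average_temperature", "maximum_temperature", "precipitation"]
corr = [
    [1.00, 0.98, 0.30],
    [0.98, 1.00, 0.28],
    [0.30, 0.28, 1.00],
]

vifs, tolerances = vif_and_tolerance(corr)
for name, vif, tol in zip(names, vifs, tolerances):
    print(f"{name}: VIF = {vif:.2f}, tolerance = {tol:.3f}")

# Apply the VIF > 10 rule of thumb quoted in the text.
flagged = [name for name, vif in zip(names, vifs) if vif > 10]
print("strongly collinear:", flagged)
```

With only two predictors the identity reduces to VIF = 1 / (1 - r^2), which gives a quick sanity check on the routine.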
Ridge regression analysis
There was collinearity among the meteorological factors, especially between the constant and air pressure, between average temperature and maximum temperature, and between minimum temperature and relative humidity. To address this multi-collinearity, ridge regression was used to quantify the relationship between weather variables and bacillary dysentery incidence. Using a modified least-squares method, ridge regression seeks standardized coefficients $b'_i$ (i = 1, 2, ..., m); its formal equation is written as:
$$\hat{y}' = b'_1 x'_1 + b'_2 x'_2 + \cdots + b'_m x'_m \qquad (1)$$
where $\hat{y}'$ is the predicted standardized response and $x'_i$ is the ith standardized independent variable.
Compared with ordinary least squares (OLS) linear regression, the basic principle of ridge regression is to artificially reduce the correlation coefficient $r_{ij}$ of each pair of variables $x_i$ and $x_j$ (including the response variable as well as the independent variables) to $r_{ij}/(1+k)$, where k is called the ridge parameter and usually 0 < k < 1. The value of k was selected at the point where all the regression coefficients were relatively stable and their signs no longer changed.
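The shrinkage of correlations described above can be illustrated with a small ridge trace. The sketch below computes standardized coefficients in two ways: by adding k to the diagonal of the predictor correlation matrix, which is the usual textbook form, and by dividing every off-diagonal correlation (including those with the response) by 1 + k. The two systems differ only by the factor 1 + k and so give the same coefficients. The correlation values and function names are hypothetical, and the code is not the SPSS routine used in the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def ridge_add_to_diagonal(r_xx, r_xy, k):
    """Standardized ridge coefficients from (R_xx + k I) b = r_xy."""
    m = len(r_xx)
    a = [[r_xx[i][j] + (k if i == j else 0.0) for j in range(m)] for i in range(m)]
    return solve(a, r_xy)


def ridge_shrink_correlations(r_xx, r_xy, k):
    """Same coefficients, obtained by replacing each r_ij with r_ij / (1 + k)."""
    m = len(r_xx)
    a = [[1.0 if i == j else r_xx[i][j] / (1 + k) for j in range(m)] for i in range(m)]
    b = [r / (1 + k) for r in r_xy]
    return solve(a, b)


# Hypothetical correlations (not study data): two highly collinear predictors.
r_xx = [[1.00, 0.98],
        [0.98, 1.00]]
r_xy = [0.80, 0.75]  # correlation of each predictor with the response

# Ridge trace: watch for the k at which coefficients stabilize and stop changing sign.
for step in range(0, 11):
    k = step / 10
    coefficients = ridge_add_to_diagonal(r_xx, r_xy, k)
    print(k, [round(c, 3) for c in coefficients])
```

In this toy example the OLS solution (k = 0) gives the second predictor a negative coefficient even though it correlates positively with the response; a small positive k removes the sign reversal, which is the kind of stabilization the selection rule for k looks for.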
Hierarchical cluster analysis
Hierarchical cluster analysis was adopted to group the weather variables. In this study, the agglomerative method with the between-groups average linkage algorithm was used, and the similarity measure was the Pearson correlation.
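A minimal sketch of agglomerative clustering with between-groups average linkage is given below. At each step it merges the two clusters whose members have the highest mean pairwise similarity. The similarity matrix holds hypothetical Pearson correlations among four weather variables, and the function name and return format are illustrative assumptions rather than a reproduction of the SPSS output.

```python
from itertools import combinations


def average_linkage(names, sim):
    """Agglomerate items by repeatedly merging the two clusters with the
    highest between-group average similarity. Returns the merge history
    as (left members, right members, average similarity) tuples."""
    clusters = [[i] for i in range(len(names))]
    history = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            avg = sum(sim[i][j] for i, j in pairs) / len(pairs)
            if best is None or avg > best[0]:
                best = (avg, a, b)
        avg, a, b = best
        history.append(([names[i] for i in clusters[a]],
                        [names[i] for i in clusters[b]],
                        avg))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return history


# Hypothetical Pearson correlations among weather variables (not study data).
names = ["average_temperature", "maximum_temperature", "precipitation", "air_pressure"]
sim = [
    [1.00, 0.98, 0.60, -0.85],
    [0.98, 1.00, 0.55, -0.80],
    [0.60, 0.55, 1.00, -0.50],
    [-0.85, -0.80, -0.50, 1.00],
]

history = average_linkage(names, sim)
for left, right, avg in history:
    print(f"merge {left} + {right} at average similarity {avg:.3f}")
```

Using the signed correlation as the similarity means that strongly negatively correlated variables are merged last; clustering on absolute correlations instead would be a different design choice.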
All of the above statistical analyses were performed with the Statistical Product and Service Solutions software (SPSS 12.0 for Windows, SPSS Inc., Chicago, IL, USA).
Ethical review
The present study was reviewed by the research institutional review board of China Medical University, which found that it used only disease surveillance data and meteorological data and therefore did not require oversight by an ethics committee.