Association between \(PM_{10}\) and respiratory diseases admission in Peninsular Malaysia during haze
Study population
Data on respiratory disease cases coded under ICD J00–J99 from 1 January 2000 to 31 December 2019 were obtained from the Health Informatics Center, Ministry of Health, following approval from the National Medical Research Register (NMRR). The study involved 92 government hospitals in Peninsular Malaysia, covering 13 states and federal territories: Kedah, Perak, Pulau Pinang, Perlis, Kelantan, Pahang, Terengganu, Melaka, Johor, Negeri Sembilan, Selangor, Wilayah Persekutuan Kuala Lumpur and Wilayah Persekutuan Putrajaya. Peninsular Malaysia is home to 79% of Malaysia's population. It is also the most rapidly developing part of the country, with many industrial areas, and is therefore likely to be the region most affected by air pollution from vehicular and industrial emissions. In addition, the Klang Valley, which includes the Federal Territory of Kuala Lumpur, the Federal Territory of Putrajaya and Selangor, has become a metropolitan area with extensive urbanisation, to which people from across Malaysia migrate to improve their standard of living. The study also includes the southern tip of Johor, which faces Singapore and is exposed to relatively high pollution levels as a result of rapid industrialisation in Johor. Future research will extend the investigation to respiratory health in the non-peninsular regions of Malaysia. Each patient's age, gender, admission date and respiratory disease diagnosis were obtained from the medical records. Patients were divided into three age groups (0–14: young age; 15–64: working age; 65 and above: old age), following the three main age groups used by the Department of Statistics Malaysia (DOSM).
According to DOSM's analysis of the Labor Force Survey (LFS) in Malaysia, working age refers to the age structure of the economically active population, which is classified as ages 15 to 64. Respiratory diseases were classified following the Malaysia Health Indicators, coded from J00 to J99, into nine disease categories: CD1 = Diseases of the upper respiratory tract; CD2 = Influenza, pneumonia and other acute lower respiratory infections; CD3 = Other diseases of the upper respiratory tract; CD4 = Chronic lower respiratory disease; CD5 = Lung diseases due to external agent; CD6 = Other respiratory diseases principally affecting the interstitium; CD7 = Suppurative and necrotic conditions of lower respiratory tract; CD8 = Other diseases of pleura; CD9 = Other diseases of the respiratory system.
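The inclusion rule and the age grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `is_respiratory` and `age_group` are hypothetical, ages are assumed to be recorded in completed years, and ICD codes are assumed to be stored as strings such as "J18.9".

```python
def is_respiratory(icd_code):
    """True if the code falls in the J00-J99 range used in this study."""
    code = icd_code.strip().upper()
    # Any code of the form J + two digits (optionally followed by a subcode)
    return len(code) >= 3 and code[0] == "J" and code[1:3].isdigit()


def age_group(age_years):
    """Map an age in completed years to one of the three study age groups."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years <= 14:
        return "young age"
    if age_years <= 64:
        return "working age"
    return "old age"
```

For example, a 70-year-old patient admitted with code "J44.1" would be kept by `is_respiratory` and assigned to "old age" by `age_group`.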
Non-Malaysian patients were deliberately excluded from this study because it focuses exclusively on the impact of air pollution on the respiratory health of Malaysian citizens. Restricting the sample to Malaysians makes it more uniform and allows better control of other potentially influential variables. For instance, individuals who have recently arrived in Malaysia, or who plan to stay for a relatively short time, might not be as strongly affected by air pollution. Additionally, their attitudes towards air pollution could be shaped by their prior experiences in the countries where they lived before coming to Malaysia. From a practical standpoint, the findings are more relevant to policymakers when they pertain specifically to Malaysian citizens, who are the group directly affected by the government's regulations, plans and incentives for environmental protection, particularly tax-related measures. In addition, this study involved only patients admitted to government hospitals; patients admitted to private hospitals for respiratory diseases were excluded.
Air pollutants and meteorological data
For this study, historical daily data on pollutants such as particulate matter (\(PM_{10}\)) and meteorological records such as humidity, wind speed and temperature were collected at 92 monitoring stations across 13 states in Peninsular Malaysia. The haze period was defined with reference to the southwest monsoon season in Malaysia, which lasts from June to September, and the dry season in the equatorial Southeast Asia (SEA) region. The increased number of fires during the dry season contributes to the formation of haze. The haze period for this study was therefore defined as June to September. Because fine particles comprise most of the particulate matter (PM) in haze samples, this study characterised the haze period using \(PM_{10}\). Consistent with this definition, Fig. 1 shows that the concentration of \(PM_{10}\) increased most markedly between June and September every year. All these data were obtained from Malaysia's Department of Environment (DOE). Missing pollutant data (less than 7%) were imputed using the Multivariate Imputation by Chained Equations (MICE) package in R, which assumes that the data are missing at random; that is, the probability that a value is missing depends only on the observed values, so the missing values can be predicted from them. Predictive mean matching (PMM), which is applicable to numerical variables, was the method used within MICE to impute the missing values.
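To make the idea behind predictive mean matching concrete, the following is a deliberately simplified, single-predictor sketch in plain Python. It is not the MICE implementation: the real procedure also draws the regression coefficients at random, chains models across several variables and produces multiple imputed data sets. The function names, the choice of temperature as the predictor and the numbers are all illustrative assumptions.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def pmm_impute(x, y, k=5, seed=0):
    """Fill None entries of y by (simplified) predictive mean matching on x."""
    rng = random.Random(seed)
    observed = [i for i, v in enumerate(y) if v is not None]
    a, b = fit_line([x[i] for i in observed], [y[i] for i in observed])
    predicted = [a + b * xi for xi in x]
    filled = list(y)
    for i, v in enumerate(y):
        if v is None:
            # Donors: the k observed rows whose predicted mean is closest
            donors = sorted(observed,
                            key=lambda j: abs(predicted[j] - predicted[i]))[:k]
            # Borrow the actually observed value of one randomly chosen donor
            filled[i] = y[rng.choice(donors)]
    return filled


# Made-up values for illustration only (not study data)
temp = [27.0, 28.5, 29.0, 30.2, 31.0, 28.0, 29.5]
pm10 = [40.0, 48.0, None, 61.0, 66.0, 45.0, None]
completed = pmm_impute(temp, pm10, k=3)
```

Because every imputed value is copied from an observed donor, the completed series never contains values outside the range that was actually measured, which is one reason PMM suits numerical variables such as pollutant concentrations.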
The limited availability of data has constrained the depth of the analysis in this study, specifically in addressing area-level characteristics, and it is important to note this constraint in order to accurately interpret the associations between pollutants and hospitalisation outcomes.
Statistical analysis
Air pollutants and meteorological factors influence human morbidity and mortality worldwide, and most of their effects on human health are lagged and non-linear. In this study, a quasi-Poisson generalized linear model (GLM) combined with a distributed lag non-linear model (DLNM) was used to analyse the lag-exposure-response association between daily respiratory disease admissions and daily air pollutants and meteorological factors during the haze period [17,18]. Daily counts of respiratory disease hospitalisations were treated as Poisson-distributed, and the quasi-Poisson model was used to account for overdispersion in these counts. Because many researchers have found a non-linear association between environmental factors and health conditions, DLNM was used to explore the bi-dimensional exposure-lag-response relationship. The delayed effect of air pollutants on respiratory diseases during haze was estimated using the cross-basis function of the DLNM. The maximum lag for this study was set at 30 days, and the degrees of freedom for the exposure variables were determined by minimising the generalized cross-validation score. The final model structure is as follows:
$$\begin{aligned} \log \left[ E\left( Y_t\right) \right] = \alpha&+\beta X_{t,p}+ns\left( \text{Time}_t, df\right) +ns(DOY, df)+ns\left( Temp_t, df\right) \\&+ns\left( RH_t, df\right) +ns\left( WS_t, df\right) + \text{factor}\left( DOW_t\right) \end{aligned}$$
(1)
where \(Y_t\) is the number of hospital visits for respiratory diseases on calendar day t (t = 1, 2, 3, …, 3060) of the haze period; \(E(Y_t)\) is the expected number of daily hospital visits for respiratory diseases; \(\alpha\) is the intercept; \(X_{t,p}\) is the cross-basis matrix for the pollutants produced by DLNM to fit the distributed lag effects, and \(\beta\) is its vector of coefficients; ns is the natural cubic spline smoother; Time is calendar time, included to control for the long-term trend and seasonality of daily admissions; df is the degrees of freedom; DOY is the day of the year, from 1 to 365 (or 366 in a leap year); Temp is the temperature on day t; RH is the relative humidity; WS is the wind speed on day t; and DOW is the day of the week, starting from Monday to Saturday.
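The role of the cross-basis term in Eq. (1) can be illustrated with a small Python sketch. It builds, for each day, a handful of columns that summarise the exposure over the current and previous days, so that the regression estimates a few coefficients instead of one per lag. The sketch assumes a linear exposure effect and a polynomial basis over the lag dimension; the bases actually chosen inside the "dlnm" package for this study may differ (spline bases are common), and the function name `cross_basis` is hypothetical.

```python
def cross_basis(x, max_lag, degree):
    """Linear-exposure cross-basis with a polynomial basis over the lags.

    For day t, column j holds sum over l = 0..max_lag of x[t - l] * l**j.
    Days with fewer than max_lag earlier observations get None, because
    their lag history is incomplete.
    """
    rows = []
    for t in range(len(x)):
        if t < max_lag:
            rows.append(None)
            continue
        # Exposure on day t (lag 0), day t-1 (lag 1), ..., day t-max_lag
        lagged = [x[t - l] for l in range(max_lag + 1)]
        rows.append([sum(v * l ** j for l, v in enumerate(lagged))
                     for j in range(degree + 1)])
    return rows


# Toy exposure series (illustrative numbers, not study data)
basis = cross_basis([1, 2, 3, 4], max_lag=2, degree=1)
```

With `max_lag=30`, as in this study, the first 30 days of a series have no complete lag history, which is why such rows are left empty rather than filled with partial sums.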
The degrees of freedom for each confounder were selected based on the lowest Akaike Information Criterion for quasi-Poisson models (Q-AIC), which is:
$$\begin{aligned} QAIC=-2 \mathscr{L}(\hat{\theta })+2 \widehat{\emptyset } k \end{aligned}$$
(2)
where \(\mathscr{L}\) is the log-likelihood of the fitted model with parameters \(\hat{\theta }\), \(\widehat{\emptyset }\) is the estimated overdispersion parameter and k is the number of parameters. The degrees of freedom were found to be 7 for \(Time_t\), \(WS_t\) and \(DOY_t\). The degrees of freedom for \(Temp_t\) and \(RH_t\) were set to 3, following previous studies [19,20,21,22,23]. Because air pollution may have a delayed effect, each single-pollutant model was evaluated for impacts on the current day (lag 0), seven days prior (lag 7) and up to 30 days prior (lag 30). The maximum lag was fixed at 30 days during the haze period because atmospheric particulates have a lagged influence on human health under short-term exposure. The cumulative lag effects of the pollutant concentration were examined from the two-day moving average (lag 01) through the eight-day moving average (lag 07) and up to one month (lag 030). The relative risk of respiratory disease admissions was explored by gender, age group and the nine categories of respiratory disease. The association between air pollutants and hospital admissions is presented as the relative risk (RR) with its 95% confidence interval, associated with a 10 \(\mu \mathrm{g}/\mathrm{m}^3\) increase in \(PM_{10}\). All data cleaning and statistical analyses were conducted in R Statistical Software Version 2.5.1. The R packages used for this study were “dlnm”, “mice”, “splines” and “mgcv”.
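Two quantities in this paragraph lend themselves to a short numerical sketch: the Q-AIC of Eq. (2) and the conversion of a log-scale coefficient into a relative risk per 10 µg/m³. The Python below is an illustration, not the study's code. It assumes the overdispersion parameter is estimated as the Pearson chi-square statistic divided by the residual degrees of freedom (the usual quasi-Poisson estimate), and the RR helper assumes a linear exposure-response on the log scale. All function names are hypothetical.

```python
import math


def poisson_loglik(y, mu):
    """Poisson log-likelihood of counts y given fitted means mu."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))


def pearson_dispersion(y, mu, k):
    """Pearson chi-square divided by residual degrees of freedom (n - k)."""
    chi_sq = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi_sq / (len(y) - k)


def qaic(y, mu, k):
    """Q-AIC = -2 * log-likelihood + 2 * dispersion * k, as in Eq. (2)."""
    return -2.0 * poisson_loglik(y, mu) + 2.0 * pearson_dispersion(y, mu, k) * k


def rr_per_increment(beta, se, increment=10.0, z=1.96):
    """(RR, lower, upper) for an `increment` rise in exposure.

    beta and se are a log-scale coefficient per unit exposure and its
    standard error; z = 1.96 gives an approximate 95% interval.
    """
    return tuple(math.exp(increment * b)
                 for b in (beta, beta - z * se, beta + z * se))
```

Candidate models with different degrees of freedom would each be fitted, `qaic` evaluated on their fitted means, and the specification with the lowest value retained.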
Sensitivity analysis
The sensitivity analysis showed that the model was stable and consistent across a range of modifications to the smoothers' degrees of freedom, as shown in Supplementary S1 and S2. The expected number of admissions was unaffected when the degrees of freedom used to control for seasonal and long-term trends, and for temperature, were varied (3, 4, 5 and 6).
Ethical approval
This is an observational study. We confirm that all methods employed in this study were conducted in strict accordance with the applicable guidelines and regulations. Informed consent was obtained from all subjects participating in the research, as well as their respective legal guardian(s) in the case of minors. The Medical Research and Ethics Committee Ministry of Health Malaysia has confirmed that no ethical approval is required. The Medical Research and Ethics Committee (MREC), Ministry of Health Malaysia (MOH) has provided ethical approval for this study. All records and data are to be kept strictly confidential and can only be used for the purpose of this study. All precautions are to be taken to maintain data confidentiality. Permission from the District Health Officer / Hospital Administrator / Hospital Director and all relevant heads of departments / units where the study will be carried out must be obtained prior to the study. The authors are required to follow and comply with their decision and all other relevant regulations, including the Access to Biological and Benefit Sharing Act 2017.