2016 2nd International Conference on Sustainable Energy and Environmental Engineering (SEEE 2016) ISBN: 978-1-60595-408-0

The Research on Air Pollution Laws in Urban Agglomeration Based on High Frequency AQI Data

Qiu-ling HU* and Zhe YANG

School of International Business, Normal University, Xi’an Shaanxi, , 710119

*Corresponding author

Keywords: High frequency AQI data, Guanzhong urban agglomeration, AQI hour index, VAR model, Associated rules.

Abstract. Based on high frequency data on AQI and pollutant concentrations, this paper studies the general situation of air pollution, the intraday fluctuation rules of air quality and the associated rules of air pollution between cities, using statistical methods such as the AQI hour index and a VAR model. The conclusions are as follows. Firstly, air pollution, which mostly manifests as particulate pollution, shows a clear seasonal effect and a clustering property. Secondly, the intraday fluctuation of air quality differs from season to season; however, apart from differences in phase position, the intraday fluctuation curves of the different seasons are similar in shape. Thirdly, a deterioration of air quality in one city can cause a deterioration in the other cities; the peak of this influence generally appears within one day, and the influence weakens as spatial distance increases.

Introduction

Air pollution harms human health and the sustainable development of the environment and the economy, and urban air pollution in China is very serious. As the main form of urbanization, urban agglomerations face a higher risk of pollution because of their agglomeration effect. The researches on air pollution in urban agglomerations in China are mainly concentrated in Hebei Urban Agglomeration, Yangtze River Delta Urban Agglomeration and Urban Agglomeration [1,2]. A few studies address air pollution in the Guanzhong urban agglomeration, but we have found none based on hourly data. In this paper, by analyzing hourly AQI (Air Quality Index) data and pollutant concentration data, we aim to uncover air pollution laws in this city group at several levels and to provide a scientific basis for air pollution control in the region.

Literature Review

In recent years, researchers have analyzed the characteristics and causes of air pollution from several perspectives. Kassomenos P, Vardoulakis S, Chaloulakou A et al. analyzed the sources and seasonal characteristics of particulate pollution in three European cities [3]; Kimbrough Sue, Baldauf Richard W, Hagler Gayle S W et al. analyzed the impact of seasonal changes on local air quality in Las Vegas [4]; Zhou HJ, He J, Zhao BY et al. analyzed the distribution characteristics of particulate matter in Baotou in the pollution season [5]; Xu JS, Xu HH, Xiao H et al. studied the composition and sources of aerosol during periods of high and low pollution in Ningbo [6].

In terms of methodology, descriptive statistics, correlation analysis and principal component analysis have been widely used to study air pollution. Azid A, Juahir H, Toriman ME et al. forecast the air pollution level using principal component analysis and artificial neural networks [7]; Assareh N, Prabamroong T, Manomaiphiboon K et al. made a statistical analysis of surface ozone in the eastern region of Thailand during the dry seasons from 1997 to 2012 [8]. In addition, neural networks, fuzzy mathematics, support vector machines and modern econometric models have also been applied to varying degrees. For example, Bai Y, Li Y, Wang XX et al. used a back-propagation neural network model to analyze and predict atmospheric pollutant concentrations [9], and Kaburlasos VG, Athanasiadis IN and Mitkas PA estimated ozone content by fuzzy reasoning [10].

With the advent of high frequency air quality data, some researchers have used data mining methods to study the associated rules between different air pollutants. Jia Jin used data mining and other methods to analyze the spatial-temporal characteristics of atmospheric complex pollution [11]. These studies provide a methodological basis for studying air pollution in the Guanzhong urban agglomeration based on high frequency AQI data.

Study Region and Data

Study Region

The study region of this paper is the Guanzhong urban agglomeration, which includes five cities: Xi'an, Xianyang, Tongchuan, Baoji and Weinan. The Guanzhong urban agglomeration is located on the alluvial plain formed by the Weihe River and the Jinghe River. It lies at a low altitude and is bounded by mountains to the north and the Qinling Mountains to the south [12]. This topography makes the region a relatively closed system in which the mutual influence of air pollution between cities is very pronounced. Because the system is closed, pollution within it is difficult to disperse, so in pollution-prone seasons air pollution accumulates easily. As a result, pollution episodes cover a wide area, are severe and last a long time. In addition, relevant research data and the experience of living in this area both suggest that the air pollution situation in the Guanzhong urban agglomeration is not optimistic.

Data

In this paper, the AQI data and the concentration data of SO2, CO, NO2, O3, PM2.5 and PM10 are collected from a website (http://www.pm25.in) which collects and integrates, in real time, the data published on the website of the Ministry of Environmental Protection of the People’s Republic of China. The sample runs from 0:00 on January 2, 2015 to 23:00 on December 31, 2015. After comparing several common interpolation methods, we chose cubic interpolation, which gave the smallest error, to fill in the few missing values. The AQI, which is divided into six grades in China, is a dimensionless index that describes air quality quantitatively: the larger the value, the more serious the air pollution. In addition, we adopt the season division in which spring runs from March to May, and so on for the other seasons.
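The grade and season conventions described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' code. The function names `aqi_grade` and `season_of` are hypothetical. The only breakpoint stated in this paper is that level four starts at 151; the remaining breakpoints are assumed here from the HJ 633-2012 regulation. The months assigned to summer, autumn and winter are inferred from the "March to May, and so on" convention.

```python
from datetime import datetime

# Upper bounds of AQI grades 1 to 5 (assumed from HJ 633-2012); anything above 300 is grade 6.
AQI_GRADE_UPPER_BOUNDS = [50, 100, 150, 200, 300]


def aqi_grade(aqi: float) -> int:
    """Return the AQI grade, from 1 (best) to 6 (worst)."""
    for grade, upper in enumerate(AQI_GRADE_UPPER_BOUNDS, start=1):
        if aqi <= upper:
            return grade
    return 6


def season_of(timestamp: datetime) -> str:
    """Map an hourly timestamp to a season, with spring defined as March to May."""
    month = timestamp.month
    if 3 <= month <= 5:
        return "spring"
    if 6 <= month <= 8:
        return "summer"
    if 9 <= month <= 11:
        return "autumn"
    return "winter"  # December, January and February
```

With these helpers, each hourly record can be tagged with a grade and a season before any seasonal statistics are computed.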

Statistical Analysis of Air Pollution in Guanzhong Urban Agglomeration

In this part, we analyze the statistical rules of air pollution in the Guanzhong urban agglomeration from two aspects: the general situation of air quality and the intraday fluctuation of air quality.

General Situation of Air Quality

Analysis Based on AQI Time Series. Figure 1 shows the AQI sequence diagrams of the five cities. In Figure 1, the moderate pollution line marks the point at which the AQI reaches level four (151), and the mean value line marks the average level of air quality of that city over the year. According to Figure 1, the AQI sequence diagrams of the five cities have similar wave characteristics, and the following laws can be listed.

Firstly, the annual average AQI values of the five cities lie between 87.70 and 94.83. Although the differences between cities are small, all of them are higher than the national average of 79.36. This shows that the overall air quality in the Guanzhong urban agglomeration is not good.

Secondly, the periods in which the AQI of the five cities exceeds the moderate pollution line occur mainly in spring and winter, while the AQI level in summer and autumn is generally low. This is a clear seasonal characteristic: air quality in winter and spring is worse than in summer and autumn. Air pollution is also concentrated in certain periods of time, showing a "clustering property".

Thirdly, the AQI values of the five cities reach their peaks and valleys in the same periods. A typical example is that all five cities experienced four episodes of serious pollution between the 7700th and the 8700th hour. This similarity suggests that there are association rules of air pollution between the five cities.

Note: L1 is the moderate pollution line, which means that the AQI value reaches level four; L2 is the mean value line, which measures the average level of air quality of that city over the year.

Figure 1. AQI sequence diagrams of five cities.

Analysis Based on Primary Pollutants. According to the Technical Regulation on Ambient Air Quality Index (HJ 633-2012), the primary pollutant is the pollutant which has the largest IAQI when the AQI is higher than 50. The statistical results for the three main primary pollutants over the 8736 hours of the five cities are shown in Table 1. The seasonal distribution of “AQI<50” (which means that the air quality is good) and of the three main primary pollutants is shown in Table 2. From Table 1 and Table 2, we can get the following laws:

Table 1. Percent of each primary pollutant in five cities (%).

Pollutant   Xi'an   Xianyang   Tongchuan   Baoji   Weinan
O3           1.15       0.41        0.27    0.44     0.46
PM2.5       25.59      39.81       42.66   40.73    41.79
PM10        73.27      59.76       57.07   58.83    57.73

The most frequent primary pollutant over the year is PM10, which appears mainly in spring, summer and autumn; in Xi'an, PM10 is the primary pollutant in as much as 73.27% of cases throughout the year. The second most frequent primary pollutant is PM2.5, which appears mainly in spring, autumn and winter. O3 appears mainly in summer, and O3 pollution in Xi'an is the most serious among the five cities. Moreover, "AQI<50" occurs mainly in summer, autumn and spring, which shows that air quality in winter is the worst of the year.
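The primary-pollutant rule and the kind of tally reported in Table 1 can be illustrated with a short Python sketch. It is not the authors' procedure: the function names and the sample IAQI values are hypothetical, the AQI is assumed to equal the largest IAQI (as in HJ 633-2012), and the shares are assumed to be taken over the hours that have a primary pollutant.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def primary_pollutant(iaqi: Dict[str, float]) -> Optional[str]:
    """Return the pollutant with the largest IAQI, or None when the AQI is not above 50."""
    aqi = max(iaqi.values())  # assumption: AQI is the largest individual index
    if aqi <= 50:
        return None
    return max(iaqi, key=iaqi.get)


def primary_pollutant_shares(hourly_iaqi: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Percentage of hours in which each pollutant is the primary pollutant."""
    counts = Counter(
        name for name in map(primary_pollutant, hourly_iaqi) if name is not None
    )
    total = sum(counts.values())
    return {name: 100.0 * count / total for name, count in counts.items()}


# Hypothetical IAQI values for four hours (not data from this paper).
sample_hours = [
    {"PM2.5": 80, "PM10": 120, "O3": 30},
    {"PM2.5": 160, "PM10": 90, "O3": 20},
    {"PM2.5": 30, "PM10": 40, "O3": 45},  # AQI <= 50: no primary pollutant
    {"PM2.5": 60, "PM10": 110, "O3": 20},
]
shares = primary_pollutant_shares(sample_hours)
```

Hours with an AQI of 50 or below are excluded from the tally, which is why the shares of the remaining pollutants add up to 100%.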

Table 2. Seasonal distribution of “AQI<50” and primary pollutants (%).

            Spring   Summer   Autumn   Winter
AQI<50       24.26    39.14    33.78     2.82
O3           20.00    78.57     1.43     0.00
PM2.5        33.41    13.27    26.67    26.65
PM10         36.61    28.61    21.92    12.86

Fluctuation Rules in One Day of Air Pollution

Construction of AQI Hour Index. The seasonal index is used to reflect the relationship between the level of a variable in a quarter and its total average level. Drawing the seasonal index chart helps to summarize clearly the impact of monthly changes on the research variables. In this paper, similarly, we define the AQI hour index, which reflects the relationship between the AQI level at a given hour and the seasonal average level. Through the hour index chart we can clearly summarize the fluctuation laws of air quality within one day in a given season. The construction process is as follows:

Step 1: calculate the average AQI at each time point ($\bar{x}_k$) in a certain season. For the AQI time series, 24 hours is one cycle. Let $n$ be the number of cycles in the season and $x_{ik}$ the AQI at hour $k$ of cycle $i$. The calculation formula is as follows:

$$\bar{x}_k = \frac{1}{n}\sum_{i=1}^{n} x_{ik}, \quad k = 1, 2, \ldots, 24. \tag{1}$$

Step 2: calculate the average AQI of the season ($\bar{x}$). The calculation formula is as follows:

$$\bar{x} = \frac{1}{24n}\sum_{i=1}^{n}\sum_{k=1}^{24} x_{ik}. \tag{2}$$

Step 3: calculate the AQI hour index ($H_k$). The calculation formula is as follows:

$$H_k = \frac{\bar{x}_k}{\bar{x}}, \quad k = 1, 2, \ldots, 24. \tag{3}$$
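Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `aqi_hour_index` and the input layout (one list of 24 hourly AQI values per day of the season, with missing values already filled) are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def aqi_hour_index(daily_cycles: Sequence[Sequence[float]]) -> List[float]:
    """Compute the hour index H_k (k = 1..24) from n daily cycles of hourly AQI."""
    n = len(daily_cycles)
    if n == 0 or any(len(day) != 24 for day in daily_cycles):
        raise ValueError("expected a non-empty list of 24-value daily cycles")
    # Step 1: average AQI at each hour k over the n cycles (equation 1).
    hourly_mean = [sum(day[k] for day in daily_cycles) / n for k in range(24)]
    # Step 2: seasonal average over all 24 * n observations (equation 2).
    season_mean = sum(sum(day) for day in daily_cycles) / (24 * n)
    # Step 3: ratio of each hourly mean to the seasonal mean (equation 3).
    return [mean / season_mean for mean in hourly_mean]
```

Because the seasonal mean is the average of the 24 hourly means, the 24 values of the index always average to 1, so the index describes only the shape of the daily cycle and not its level.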

A value of $H_k$ greater than 1 indicates that the AQI at that hour is usually higher than the seasonal average; a value below 1 indicates that it is usually lower. By examining the AQI hour index chart, we can summarize the fluctuation laws of air pollution within one day.

Analysis of Intraday Fluctuation of Air Pollution. In this part, we take Xi'an as an example. The fluctuation graphs of the AQI hour index of Xi’an in the four seasons are shown in Figure 2, and the following laws can be listed:

Figure 2. Fluctuation graphs of AQI hour index in four seasons of Xi’an.

Figure 3. Stationarity test of VAR model.

Firstly, the fluctuations of the hour index in spring and winter are roughly the same but differ from that in summer, and the fluctuation in autumn lies somewhere between the two.

Secondly, in spring and winter the hour index is basically greater than 1 from 10 p.m. to 1 p.m. the next day. It reaches a small peak between 2 a.m. and 3 a.m. and a larger peak at about 11 a.m., when the air quality is the worst. Between 2 p.m. and 9 p.m. the index is less than 1; the minimum appears at 5 p.m., when the air quality is the best. In summer, the hour index is basically greater than 1 from 8 a.m. to 10 p.m., with two peaks at 11 a.m. and 7 p.m. respectively; from 10 p.m. to 7 a.m. the next day the index is less than 1, and the minimum occurs at about 1 a.m. To sum up the change of air quality within one day: in spring and winter, air quality in the first half of the day (midnight to 1 p.m.) is worse than in the rest of the day (2 p.m. to 11 p.m.), while in summer air quality at night is better than in the daytime.

Thirdly, the AQI hour index graphs of the four seasons are similar in shape but differ in phase position. In other words, the graphs of spring and winter can be obtained by shifting the graph of autumn or summer to the left or to the right. Put differently, the relative daily change of air pollution is consistent across the four seasons, while the best and worst air quality appear at different times of the day. Similar laws can be obtained by analyzing the AQI hour index of the other four cities.

Analysis of Associated Rules of Air Pollution between Cities Based on VAR Model

In this part, we use a vector autoregressive (VAR) model to analyze the associated rules of air pollution between the five cities. The analysis of primary pollutants shows that PM10 and PM2.5 are the two most frequent primary pollutants, and PM2.5 is more harmful to human health than PM10; hence PM2.5 is selected as the object of analysis.

Introduction of VAR Model

The mathematical expression of the VAR model used in this paper is:

$$y_t = \Phi_1 y_{t-1} + \cdots + \Phi_p y_{t-p} + u_t, \quad t = 1, 2, \ldots, T. \tag{4}$$

In this model, $y_t$ is a five-dimensional column vector of endogenous variables, $p$ is the lag order, $T$ is the number of samples, the $5 \times 5$ matrices $\Phi_1, \ldots, \Phi_p$ are the coefficient matrices to be estimated, and $u_t$ is the five-dimensional disturbance column vector, also known as the vector of innovations. As mentioned earlier, we use the full-year 2015 PM2.5 time series of Xi'an, Xianyang, Tongchuan, Baoji and Weinan to build the VAR model.

VAR Model Building

When fitting a VAR model, the participating series must be stationary. The ADF test is used to test the stationarity of the above series, and the results show that the PM2.5 series of the five cities are stationary at the 1% significance level; hence the VAR model can be established. The results of the ADF test are shown in Table 3.

Table 3. Results of ADF test.

Series name (city)   Test statistic   Critical value (1%)   Prob.    Conclusion
XA (Xi'an)                  -8.1333               -3.4309   0.0000   stationary
XY (Xianyang)               -6.7312               -3.4309   0.0000   stationary
TC (Tongchuan)              -8.5250               -3.4309   0.0000   stationary
BJ (Baoji)                  -9.7914               -3.4309   0.0000   stationary
WN (Weinan)                 -8.1202               -3.4309   0.0000   stationary

Then, the lag order of the model is determined as 3 by the Schwarz criterion (SC). The empirical results show that the goodness of fit of each of the 5 equations in the VAR model exceeds 0.967, and the stationarity test of the VAR model shows that all the inverse roots of the characteristic AR polynomial lie inside the unit circle, which means that the estimated VAR is stable. The stationarity test of the VAR model is shown in Figure 3.

Analysis of Impulse Response

For the VAR model, we focus on the dynamic characteristics of the system, that is, on how the other variables respond after a shock is applied to one endogenous variable. The results of the impulse response analysis are shown in Figure 4, which plots the responses of Xianyang, Tongchuan, Baoji and Weinan to a shock to the PM2.5 concentration of Xi'an. The following laws can be listed.

The PM2.5 concentration of Xi'an has a positive influence on that of the other four cities: the PM2.5 concentration of the other cities increases as the concentration in Xi'an increases. The responses of the four cities follow the same pattern over time: the effect first increases, reaches a peak and then decays gradually.

Moreover, the peaks of influence on the four cities all appear within 24 hours after the shock to Xi'an. The time needed to reach the peak differs between cities: Xianyang (5 hours), Weinan (6 hours), Tongchuan (17 hours) and Baoji (20 hours). Combined with the map of the Guanzhong city group, we find that the greater the spatial distance, the longer it takes to reach the peak.

The peak values of the influence on the four cities also differ. As spatial distance increases, the peak value gradually decreases: Xianyang (6.7), Weinan (4.9), Tongchuan (4.0) and Baoji (3.7).

Figure 4. Responses of the other cities to Xi’an.

Furthermore, similar laws can be obtained by analyzing the impulse responses for the other four cities.
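The mechanics of an impulse response in model (4) can be illustrated with a small Python sketch. It is not the computation behind Figure 4. It uses simple, non-orthogonalized unit-shock responses, given by the recursion $\Psi_0 = I$, $\Psi_h = \sum_{j=1}^{\min(h,p)} \Phi_j \Psi_{h-j}$, and the paper does not state which identification scheme was used. The function names and the two-variable VAR(1) coefficients below are hypothetical and are not estimates from this study.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def impulse_responses(phis: List[Matrix], horizon: int) -> List[Matrix]:
    """Return Psi_0..Psi_horizon; Psi_h[i][j] is the response of variable i,
    h periods after a unit shock to variable j."""
    dim = len(phis[0])
    identity = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    psis = [identity]
    for h in range(1, horizon + 1):
        total = [[0.0] * dim for _ in range(dim)]
        for lag, phi in enumerate(phis, start=1):
            if lag > h:
                break
            term = matmul(phi, psis[h - lag])
            total = [[total[i][j] + term[i][j] for j in range(dim)] for i in range(dim)]
        psis.append(total)
    return psis


def peak_response(psis: List[Matrix], response: int, shock: int) -> Tuple[int, float]:
    """Return (horizon, value) of the largest response after the shock period."""
    series = [psi[response][shock] for psi in psis]
    peak_h = max(range(1, len(series)), key=lambda h: series[h])
    return peak_h, series[peak_h]


# Hypothetical VAR(1): variable 0 is the source city, variable 1 a neighbouring city.
phi_1 = [[0.9, 0.0], [0.2, 0.8]]
psis = impulse_responses([phi_1], horizon=24)
peak_hour, peak_value = peak_response(psis, response=1, shock=0)
```

In this toy system the neighbouring city's response rises, peaks at horizon 6 and then decays, which is the same qualitative pattern as the one described above for the Guanzhong cities.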

Conclusions

Based on high frequency data on AQI and pollutant concentrations, we have studied the general situation of air pollution, the intraday fluctuation rules of air quality and the associated rules of air pollution between cities, using statistical methods such as the AQI hour index and a VAR model. The following laws are obtained.

Firstly, the overall air quality of the Guanzhong urban agglomeration is not good, and the pollution characteristics are obvious. The annual average AQI of each of the five cities is higher than the national average. The annual pollution fluctuations of the five cities are highly similar, and the fluctuations of air quality in the five cities are clearly correlated. Air pollution in spring and winter is more serious than in summer, and the "seasonal effect" and "clustering property" of air pollution are obvious. Air pollution in the Guanzhong urban agglomeration mainly manifests as particulate matter pollution: most of the primary pollutants over the year are PM10 and PM2.5. In addition, O3 pollution is concentrated mainly in summer.

Secondly, the intraday laws of air pollution differ clearly from season to season. The fluctuation of the AQI hour index in spring and winter is clearly different from that in summer: in spring and winter, air quality in the first half of the day is worse than in the rest of the day, while in summer air quality at night is better than in the daytime. However, apart from differences in phase position, the intraday fluctuation curves of the different seasons are similar in shape.

Thirdly, there are obvious association rules between the air pollution of the five cities. A deterioration of air quality in one city can cause a deterioration in the other cities. The peak of this influence appears within one day, and the influence weakens as spatial distance increases.

References

[1] Wan Qing, Wu Chuanqing, Zeng Juxin. Study on the urbanization efficiency and determinants of China's urban agglomerations[J]. China population, resources and environment, 2015, 25 (2), 66-74.
[2] Fang ChuangLin, Guan Xingliang. Comprehensive measurement and spatial distinction of input-output efficiency of urban agglomerations in China[J]. Acta geographica sinica, 2011, 66 (8), 1011-1022.
[3] Kassomenos P, Vardoulakis S, Chaloulakou A et al. Levels, sources and seasonality of coarse particles (PM10-PM2.5) in three European capitals - Implications for particulate pollution control[J]. Atmospheric environment, 2012, 54, 337-347.
[4] Kimbrough Sue, Baldauf Richard W, Hagler Gayle S W et al. Long-term continuous measurement of near-road air pollution in Las Vegas: seasonal variability in traffic emissions impact on local air quality[J]. Air quality atmosphere and health, 2013 (1), 295-305.
[5] Zhou HJ, He J, Zhao BY et al. The distribution of PM10 and PM2.5 carbonaceous aerosol in Baotou, China[J]. Atmospheric research, 2016, 178, 102-113.
[6] Xu JS, Xu HH, Xiao H et al. Aerosol composition and sources during high and low pollution periods in Ningbo, China[J]. Atmospheric research, 2016, 178, 559-569.
[7] Azid A, Juahir H, Toriman ME et al. Prediction of the level of air pollution using principal component analysis and artificial neural network techniques: a case study in Malaysia[J]. Water air and soil pollution, 2014, 225 (8).
[8] Assareh N, Prabamroong T, Manomaiphiboon K et al. Analysis of observed surface ozone in the dry season over Eastern Thailand during 1997-2012[J]. Atmospheric research, 2016, 178, 17-30.
[9] Bai Y, Li Y, Wang XX et al. Air pollutants concentrations forecasting using back propagation neural network based on wavelet decomposition with meteorological conditions[J]. Atmospheric pollution research, 2016, 7 (3), 557-566.
[10] Kaburlasos VG, Athanasiadis IN, Mitkas PA. Fuzzy lattice reasoning (FLR) classifier and its application for ambient ozone estimation[J]. International journal of approximate reasoning, 2007, 45 (1), 152-188.
[11] Jia Jin. Analyzing temporal, spatial characteristics and sequence pattern of atmospheric compound pollution with air quality data[D]. Zhejiang: Zhejiang University, 2014, 1-135.
[12] Han Chao. The statistical distributions of air pollutant concentrations and the relationship between air qualities and meteorological elements in Guanzhong of Shaanxi, China[D]. Xi’an: Chang'an University, 2012, 1-73.