CEJC 2 (2003) 121-136

Water Quality Study of the Struma River Basin, Bulgaria (1989-1998)

P. Simeonova¹, V. Simeonov²*, G. Andreev³

¹ Institute of Solid State Physics, Bulgarian Academy of Sciences, 1172 Sofia, Tzarigradsko Chaussee Blvd. 72, Bulgaria
² Chair of Analytical Chemistry, Faculty of Chemistry, University of Sofia "St. Okhridski", 1164 Sofia, J. Bourchier Blvd. 1, Bulgaria
³ Institute of Oceanology, Bulgarian Academy of Sciences, 9000 Varna, P.O. Box 152, Bulgaria

Received 8 January 2003; accepted 18 March 2003

Abstract: The present paper deals with an estimation of the water quality of the Struma river. Long-term trends, seasonal patterns and data set structures are studied by the use of statistical analysis. Nineteen sampling sites along the main river stream and different tributaries were included in the study. The sites are part of the monitoring net of the region of interest. Seventeen chemical indicators of the surface water have been measured in the period 1989-1998 at monthly intervals. It is shown that the water quality is relatively stable throughout the monitoring period, which is indicated by a lack of statistically significant trends for many of the sites and chemical variables. Several seasonal patterns are observed at the sampling sites and four latent factors are identified as responsible for the data set structure.

© Central European Science Journals. All rights reserved.

Keywords: Surface water quality, statistical data treatment, linear regression analysis, time-series analysis, cluster analysis, principal components analysis

1 Introduction

The Struma river is located in the southern part of Bulgaria. It runs from north to south and has a length of 290 km to the Greek border. The stream length from that point to the Aegean Sea is about 110 km. The total catchment area is nearly 10,250 km² within

* E-mail: [email protected]fia.bg
P. Simeonova et al. / Central European Journal of Chemistry 2 (2003) 121-136

Bulgaria; the [...] Mountain and the [...], and surrounding mountain ranges form it. Being a cross-border river, the Struma basin is of substantial importance for both countries. That is why careful monitoring of the water quality over long-term or short-term periods at different sampling sites is not only an ecological but also a political issue. Similar studies for the Yantra river [1-3] have indicated that very useful information can be extracted from the data collected in the monitoring period.

Fig. 1 presents the Struma flow in Bulgarian territory along with the location of the sampling sites of the Struma river monitoring network controlled by the Ministry of Environment and Waters. As seen, the monitoring system involves many sites where regular testing of the water quality is performed on a daily, weekly or monthly basis.

The industrial and agricultural activity in the region of the Struma river is relatively high. The population within the basin totals some 532,000 (6.47% of the population of Bulgaria) with nearly 300,000 (71%) in urban areas. The main towns of over 20,000 population are: [...], [...], Kjustendil, Dupniza, and Sandanski. As far as land use is concerned, some 29,700 ha of land is under irrigation and the natural conditions in this region are favorable for growing vegetables, fruits, tobacco, cotton and almonds.

The water in the region is used for production of electricity and for irrigation. Electricity is generated at power stations ("Kalin", "Kamenitza", "Pastra", "Rila", etc.) on some of its tributaries. There are several uncompleted irrigation systems: the "Dolna Dikanja-Kovachevzi-Radomir" schemes, which are intended to use water from the reservoir "Pchelin", and the "Dyakovo-Dupnitza" scheme, which should use waters from the reservoir "Dyakovo". At present the irrigation system "Pirinska Bistrica" is in the process of reconstruction and modernization.
Due to regular monitoring a substantial amount of analytical data is already available, but there is still a lack of summarizing studies that take into account all aspects of the river system and extract all possible information from the data sets. Of course, some of the analytical records are not complete; in the data set one could detect missing data or unmeasured cases. The only way to reach a new level of information concerning the water quality is the application of multivariate statistical methods (chemometrics and environmetrics).

In the case of the Struma river it is of substantial importance to detect trends in the concentrations of the major chemical constituents determined in the monitoring net as well as to reveal the seasonal behaviour of the components and identify possible sources of pollution. The aim of the present study is to determine and explain all of these statistical characteristics.


Fig. 1 Struma River monitoring net

2 Experimental

2.1 Sampling and chemical analysis

The location of the sampling sites delivering the data is indicated in Fig. 1. Nineteen sites were chosen covering almost completely the river stream from its spring down to the Greek border. The period of observation for the region of interest was 10 years (1989-1998). The chemical indicators involved were: pH, dissolved oxygen, BOD5 (biological oxygen demand), COD (chemical oxygen demand), conductivity, acidity, dissolved matter, non-dissolved matter, total hardness, chloride, sulfate, ammonium, nitrate, nitrite, iron, magnesium, calcium. The chemical analysis used standard analytical methods as routinely applied in the control laboratories of the monitoring net. Potentiometry, titrimetry, gravimetry, and spectrophotometry are the standard analytical methods widely used in surface water quality analysis, especially for major indicators like those mentioned above. The sample preparation and sample measurements are described in detail elsewhere [4].

2.2 Statistical analysis

For trend analysis, weighted annual average values from different sampling sites for chemical observation of major parameters in the water of the Struma river were accumulated. Trends in the substance concentrations were evaluated using a least-squares regression approach and statistical testing of the regression coefficient significance after estimation of the standard error of the coefficient value at p < 0.05 and p < 0.01 and respective estimation of the residuals by F-test. The correlation coefficient r was also calculated as a measure of trend significance [5]. For each chemical component determined at a given sampling site, the predicted values of the substance concentration were determined from the linear regression analysis.

Until recently the mathematical methods of time-series analysis (TSA) have been used only rarely in the environmental sciences; the methods have mostly been applied in economics. The mathematical fundamentals of time-series analysis are mainly described in various books and papers dealing with statistics and econometrics [6-10]. In general, time-series analysis has the following main purposes:
1. display of the series;
2. preprocessing of data;
3. modeling and description of the series;
4. forecasting with suitable models;
5. control of predicted data.

Usually, the first step in time-series analysis is drawing a data plot, which gives an idea of the shape of the time series. It may properly display periodicity, trends, fluctuations and outliers. In the next step it is convenient to construct, if necessary, the seasonal sub-series plots to get additional information on seasonal fluctuations. Stripping away random fluctuations is achieved by smoothing the series or by filtering the periodicities. Thus, a short-term forecast is possible with a memory of the last values of the series. In this case simple moving average, exponential smoothing, seasonal differencing, the cumulative sum (CUSUM) technique and seasonal decomposition are applied as modeling methods.
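The least-squares trend test described at the start of this section can be illustrated with a minimal sketch. It assumes ten annual mean values per site (so 8 degrees of freedom) and uses tabulated two-sided critical t values in place of a statistics package; the function name `linear_trend` and the pH values are hypothetical and are not taken from the monitoring data.

```python
import math

# Two-sided critical t values for n - 2 = 8 degrees of freedom (10 annual means).
T_CRIT = {0.05: 2.306, 0.01: 3.355}

def linear_trend(years, values):
    """Fit values = intercept + slope * year by least squares.

    Returns the slope, the intercept, the t statistic of the slope
    (slope divided by its standard error) and the correlation coefficient r.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    syy = sum((y - mean_y) ** 2 for y in values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and standard error of the slope.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, t_stat, r

# Hypothetical annual mean pH values for one site (illustrative only).
years = list(range(1989, 1999))
ph = [8.1, 8.0, 8.05, 7.9, 7.95, 7.8, 7.85, 7.7, 7.75, 7.6]

slope, intercept, t_stat, r = linear_trend(years, ph)
significant_05 = abs(t_stat) > T_CRIT[0.05]
significant_01 = abs(t_stat) > T_CRIT[0.01]
```

For a simple linear regression the F statistic of the model equals the square of the slope's t statistic, so the t test shown here and an F test on the regression lead to the same decision.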
Regression and correlation techniques are also known in TSA. In the present study the input data for all tested indicators and for all sampling sites with a long enough period of monitoring (at least 60 consecutive months of observation) were treated in such a way as to obtain a seasonal decomposition. Only five sampling sites are considered since only they fulfill the requirement of an extended period of observation for TSA. The seasonal effects were determined by the use of the additive model:

$$x(t) = l(t) + m(t) + \mathrm{sea}(t) + e(t)$$

where
$l(t)$ is the level component with smoothing parameter $\alpha$,
$m(t)$ is the trend component with smoothing parameter $\beta$,
$\mathrm{sea}(t)$ is the seasonal component with smoothing parameter $\gamma$,
$e(t)$ is the error of the model.

The constants for the level, trend and seasonal components must lie within the interval between 0 and 1. The constants $\alpha$, $\beta$ and $\gamma$ were optimized by stepwise variation and were, respectively, 0, 0 and 0.5. The software package applied was STATISTICA 5.0.

Cluster analysis (CA) and principal components analysis (PCA) were used for multivariate statistical modeling of the input data [11,12]. The main goal of hierarchical agglomerative cluster analysis is to spontaneously classify the data into groups of similarity (clusters) by searching for objects located in closest neighborhood in the n-dimensional space and to separate a stable cluster from other clusters. Usually, the sampling sites are considered as objects for classification, each one determined by a set of variables (chemical concentrations). It is also possible to search for links between the variables by treating them as objects of classification. In order to achieve this a series of procedures is necessary:
1. Normalization of the raw input data to dimensionless units in order to avoid the influence of the different ranges of the chemical dimensions (concentration);
2. Determination of the distance between the objects of classification by application of some similarity measure, e.g. Euclidean distance or correlation coefficient;
3. Performing appropriate linkage between the objects by some of the cluster algorithms like single, average or centroid linkage;
4. Plotting the results as a dendrogram;
5. Determination of the cluster significance by the 0.33 $D_{\max}$ or 0.66 $D_{\max}$ criterion;
6. Interpretation of the clusters both for objects and variables.

Using cluster analysis one could display the object similarity in a reliable way to make the initial interpretation of the data set structure. But PCA proves to be a more reliable display method. It enables the reduction of the dimensionality of the space of the variables in the direction of the highest variance of the system, the new variables being linear combinations of the previous variables and replacing the old coordinates of the factor space. The new coordinates are called latent factors or principal components [12]. The interpretation of the new factors is the main goal of the chemists since they deliver useful information about latent relationships within the data set. The results are indicated by two sets: factor scores, giving the new coordinates of the factor space with the location of the objects, and factor loadings, informing on the relationship between the variables.
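The clustering steps listed above (normalization, distance, linkage, cut at a fraction of Dmax) can be sketched as follows. This is only an illustration under stated assumptions: the six-site, three-indicator matrix is invented, and SciPy's Ward linkage operates on plain Euclidean distances, so its merge heights are not numerically identical to those obtained with squared Euclidean distances in other packages.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical site-by-indicator matrix (rows: sites; columns: pH,
# conductivity, BOD5). The numbers are invented for illustration.
X = np.array([
    [7.8, 300.0, 5.0],
    [7.9, 320.0, 5.5],
    [7.7, 290.0, 4.6],
    [7.2, 800.0, 25.0],
    [7.1, 780.0, 24.0],
    [7.3, 820.0, 26.0],
])

# Step 1: z-transform each variable to dimensionless units.
z_scores = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Steps 2-3: Euclidean distances combined with Ward's linkage.
tree = linkage(z_scores, method="ward")

# Step 5: keep clusters that are separated above 0.33 * Dmax,
# where Dmax is the largest merge height in the tree.
d_max = tree[:, 2].max()
labels = fcluster(tree, t=0.33 * d_max, criterion="distance")
```

Passing `tree` to `scipy.cluster.hierarchy.dendrogram` would produce the plot of step 4; transposing `z_scores` before the linkage call clusters the variables instead of the sites.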

Only statistically significant loadings (> 0.70) are important for the modeling procedure. The new principal components (latent factors) explain a substantial part of the total variance of the system for an adequate statistical model. Usually, the first principal component (PC1) explains the maximal part of the system variation and each additional PC makes a respective contribution to the variance explanation but with less significance. A reliable model normally requires such a number of PCs that over 75 % of the total variation can be explained. In our modeling we have applied the Varimax rotation mode of PCA, which allows a better explanation of the system in consideration since it strengthens the role of the latent factors with higher impact on the variation explanation and diminishes the role of PCs with lower impact.

In order to understand the contribution of each latent factor to the total mass of the chemical components in the surface water an apportioning model is offered. The source apportioning follows the approach of Thurston and Spengler [13], which uses multiple regression of the total mass on the absolute principal components scores (APCS) obtained by the performance of PCA.
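As a rough illustration of the PCA workflow described above (standardization, retention of enough components to explain over 75 % of the variance, Varimax rotation, and the 0.70 loading threshold), a minimal NumPy sketch is given below. The data are synthetic, and the `varimax` helper is a common textbook formulation assumed here; it is not the routine of the software used in the study.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation

# Synthetic data: two hidden factors, each driving three measured variables.
rng = np.random.default_rng(1)
n_obs = 200
f1, f2 = rng.normal(size=(2, n_obs))
X = np.column_stack([f1, f1, f1, f2, f2, f2]) + 0.3 * rng.normal(size=(n_obs, 6))

# PCA on the correlation matrix of the standardized data.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]

# Keep the smallest number of PCs whose cumulative explained variance reaches 75 %.
explained = eigval / eigval.sum()
n_keep = int(np.searchsorted(np.cumsum(explained), 0.75) + 1)

loadings = eigvec[:, :n_keep] * np.sqrt(eigval[:n_keep])
rotated = varimax(loadings)
significant = np.abs(rotated) > 0.70   # loadings treated as significant
```

Because the rotation is orthogonal, the communality of each variable (the row sum of squared loadings) is unchanged; only the distribution of variance among the retained factors changes.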

3 Results and discussion

The input data (annual mean values of the concentrations) are available on request from the authors for nineteen sampling sites and 17 parameters of the surface water. The summary basic statistics of the input data are presented in Table 1.

3.1 Trend analysis

The summarized results from the least-squares linear regression analysis for the time interval in consideration (1989-1998) are presented in Table 2. Only statistically significant trends are listed in the table. It is interesting to note that the physical indicators of the surface waters (color and smell) that were also measured (in relative units) throughout the monitoring did not show any trend at any of the sites. They obviously do not possess enough information weight to be used in further seasonal or data-structure studies.

For 9 out of 19 sites pH reveals a non-significant trend. For a group of sites (293, 296, 297, 298 and 299) the trend is statistically significant: at two of the sites pH is increasing while for the other three pH is decreasing. The dynamic range of pH values for the sites mentioned is probably related to their location in the neighborhood of anthropogenically influenced regions (near the towns of Pernik and [...]). Another group of sites (123, 400-403) also indicates a decreasing trend of pH. The sites in consideration are tributaries of the Struma like Djerman (near the town of Dupnitsa), [...] (near the town of Sandanski), Kustendilska (near the town of Kustendil) and Elishnitsa. Again, a slight acidification effect is found, probably due to anthropogenic wastes. Site 123 is close to Simitli village, where possible acidifying wastes can contribute to decreasing pH.

Indicator          Mean value   Stand. dev.   Minimum   Maximum
pH                       7.84          0.35      6.76      9.30
Dissolved O2             7.93          1.55      4.06     12.22
Conductivity           492.54        224.71    112.00   1085.40
BOD5                     9.82          8.64      0.58     50.00
Total acidity           14.41         11.54      1.89     90.37
COD                     29.51         40.00      5.00    340.00
Diss. substance        346.66        156.09     80.00    900.00
Susp. solids            57.32         54.71      2.00    472.73
Cl-                     28.63         10.91      9.88    107.97
SO4(2-)                 69.15         44.70     27.33    270.68
NH4+                     1.20          1.23      0.05      7.64
NO2-                     0.12          0.33      0.05      3.41
NO3-                     5.47          3.31      0.03     18.32
Fe3+                     0.38          0.42      0.03      2.60
Ca2+                    40.28         22.68     14.80     92.10
Mg2+                    13.94          9.21      4.16     67.00
Hardness                 4.02          2.50      1.39     12.04

Table 1 Basic Descriptive Statistics (concentrations in mg/dm³).

Site  Species with increasing trend                         Species with decreasing trend                   Conditional site pattern
120   BOD5, Acid.                                           Fe3+                                            Urban
121   Cond., NH4+                                           COD, Fe3+                                       Urban
122   -                                                     BOD5, Diss. substances                          Rural
123   Susp. solids                                          pH                                              Rural
124   Cond., Acid., Susp. solids, Cl-, NO3-, Hardness       COD, Fe3+                                       Urban
125   Acid., NH4+                                           -                                               Urban
126   Diss. O2                                              Susp. solids, SO4(2-), Hardness                 Tributary
127   Cond., BOD5, Acid., Diss. substances, SO4(2-), NO3-   -                                               Urban
293   pH                                                    -                                               Background
296   BOD5, COD, SO4(2-), Fe3+                              pH, Diss. substances, Hardness                  Urban
297   pH, Diss. substances                                  BOD5, COD, NH4+, NO2-                           Background
298   -                                                     pH, BOD5, Cond., Diss. subst., NH4+, NO3-       Rural
299   pH, Cond., Diss. subst., NH4+                         BOD5, Susp. solids, Fe3+                        Tributary
301   Conductivity                                          Cl-                                             Tributary
399   Diss. O2, BOD5, Acid., Susp. solids, NH4+             Cond., Diss. subst., Ca2+, Hardness             Tributary
400   Cond., BOD5, Diss. subst., NO3-                       pH                                              Tributary
401   -                                                     pH, Cond., Diss. subst., NO3-, Ca2+, Hardness   Tributary
402   Diss. O2                                              pH, Cond., Diss. subst.                         Tributary
403   COD, NO2-                                             pH, Cond., Diss. subst., Ca2+, Mg2+, Hardness   Tributary

Table 2 Summarized Trend Distribution for all Sites.

A non-significant trend is found for almost all sampling sites in regard to the dissolved O2. Exceptions are sites 126, 399 and 402 with statistically significant trends of increase in dissolved oxygen content. All three sites are located on Struma tributaries.

Significant trends in conductivity are seen at 11 out of 19 sites. The significant trends correspond in some cases to an increase (6 sites) and in others to a decrease (5 sites) in conductivity. The increase is more characteristic for sites located near settlements with possible pollution sources and the decrease is related to locations near clean areas far from the anthropogenic impact.

Considering the parameters BOD5 and COD one can find a domination of non-significant trends. The significant cases for BOD5 show either trends of increase (120, 127, 296, 399, 400) or decrease in BOD5 levels (122, 297, 298, 299). In the first group of sites the role of the tributaries is shown again. Probably, they accumulate more pollution that is inserted into the main stream. Then a process of dilution and self-purification starts (the second group of sites indicates this process) and the BOD5 and COD content decreases.

The trends with respect to dissolved substances and suspended solids are non-significant in most of the cases. Both parameters are related to seasonal events. On a few occasions the trend is decreasing, which may mean that as a whole a process of self-purification is at hand.

When typical chemical indicators are treated with respect to the water quality, a specific pattern is observed. For chloride only 4 out of 19 sites possess a significant trend: site 124 (a site near the Greek border) with a trend of decrease and site 301 (river Strumeshnitsa) with a trend of increase. For sites 401 and 403 the total number of observations for chloride concentration is much less than the whole period of monitoring due to technical reasons, though a trend of increase is found. It confirms in some way the assumption that the inlets introduce more polluting species into the main stream, which undergoes self-purification processes.

The sulfate ion concentrations indicate a statistically significant trend for 3 out of 19 sites. For sites with an increasing trend (127, 296) the situation is typical for anthropogenic sites close to the larger towns of Pernik and Blagoevgrad, whereas the trend of decrease is found for a tributary (Djerman river).

The situation is quite similar for the trend in ammonium ion concentrations (5 out of 19 sites). The dominant tendency for concentration increase is typical for urban sites like Kustendil (125) and two tributaries (299, 399), whereas the rural sites 297 and 298 indicate an ammonium concentration decrease.

No tendency is found for the nitrite ion concentration. One could assume that this chemical variable is not so informative in our trend study and can be eliminated for large-scale studies. Obviously, only sporadic events could contribute to nitrite concentration changes along the Struma river.

Five out of nineteen sites reveal significant trends for nitrate concentration changes. Again, for the urban sites like the border area (site 124), Blagoevgrad (site 127, a large town) and the Kustendilska river (site 400) the trend is increasing. For the more remote sites 401 and 298 the trend is decreasing.
The concentrations of metal ions indicate some interesting tendencies. Unfortunately, the monitoring of calcium and magnesium as separate chemical indicators began only recently and no reliable data are available for a serious trend study. Nevertheless, even a limited data set for Ca and Mg gave some preliminary information: a decreasing trend for both ions at sites 401 and 403. A much clearer tendency could be seen for iron, with significance of the trend at 5 sites, mainly decreasing even for urban sites. The only increasing trend is observed for site 296 close to the industrial town of Pernik.

Measurements of "water hardness" indicated the same problems as calcium and magnesium ions: a limited data set with missing data. The only two sites with a significant trend for hardness (both showing a decreasing tendency) are 401 and 403 (river inlets).

3.2 Time-series analysis

The sampling sites cover all varieties of the sampling pattern in the area of the Struma river basin: rural (124), urban (294), tributary (403, 401) and background (293). The monitoring at these sites allows the performance of time-series analysis with respect to the seasonal decomposition of the data sets. The results can be summarized in the following way:

Site 124 (rural): Winter maximums in the concentration values are typical for dissolved oxygen (January), dissolved substances (November), sulfate (November), chloride (October) and ammonium (October); summer maximums are found for pH (July), COD (June), BOD5 (July), suspended solids (July), calcium (June), magnesium (June), total hardness (June), iron (June), nitrate (July), nitrite (July) and free acidity (June). This pattern corresponds, in principle, to some seasonal events like enhanced agricultural activity at the end of the summer period when larger amounts of fertilizers are used, or decreases in the water level in high summer leading to relative increases of some concentrations.

Site 294 (urban): This site is located near the industrial town of Pernik and reveals somewhat different seasonal behaviour with respect to winter and summer maximums in the chemical concentration levels. The winter maximums are typical for species related to anthropogenic activity like ammonium (January), nitrate (February), nitrite (December), iron (January), COD (March), BOD5 (February), free acidity (February), pH (March) and dissolved oxygen (March). In the summer period maximums can be detected for calcium (June), magnesium (July), total hardness (August), dissolved substances (August), suspended solids (September), chloride (September) and sulfate (August). Obviously, in such a region the water quality is substantially influenced by the anthropogenic activity, forming seasonal patterns with domination of higher levels of anthropogenically influenced species in the winter season and increases of levels for more naturally occurring parameters during the summer period.

Sites 401 and 403 (inlet sites): In the previous trend study special attention was paid to tributary sites where the increase of wastes carried by the tributaries to the main stream is presumed to take place.
Thus, a similar seasonal pattern for these sites could be expected. Although there are some differences, it may be stated that sites 401 and 403 reveal similar summer and winter peaks in the values of the indicators of water quality. For instance, both sites show winter maximums in pH, BOD5, COD, free acidity and nitrates, although in different months. Summer maximums are typical for calcium, magnesium, total hardness, dissolved substances and suspended solids. In this respect the tributary sites resemble the seasonal pattern of the urban site 294. The other indicators (sulfate, chloride, ammonium, nitrite, iron and dissolved oxygen) showed different behaviour for the different sites and do not belong to a certain seasonal pattern. Again, anthropogenic and natural influences could be taken into account to explain some of the peaks, e.g. for site 401 increased summer concentrations are found for nitrite, iron and dissolved oxygen and increased winter concentrations for sulfate, chloride and ammonium. For site 403 the winter season is linked to higher values of ammonium, iron, and dissolved oxygen and the summer peaks are in sulfate and ammonium. Obviously, some local events play an important role in the determination of the tributary water quality.

Site 293 (background): This site is actually the Struma river spring and according to water quality standards should be less influenced by anthropogenic factors. Indeed, it seems that mainly natural influences determine the seasonal pattern in this case. In the summer period the lower water level causes concentration peaks for sulfate, nitrate, nitrite, calcium, magnesium, total hardness, iron, chloride, free acidity, dissolved oxygen, COD, BOD5, dissolved substances and suspended solids. Winter peaks are observed only for pH and ammonium. Thus, the background site resembles the rural site already considered.
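The month of the seasonal maximum reported for each indicator in this section can be illustrated with a simple additive decomposition. The sketch below is not the exponential-smoothing procedure of the software used in the study; it substitutes the classical approach of removing a centered 12-month moving average and averaging the remainder by calendar month. The series is synthetic and the function names are hypothetical.

```python
import math

def centered_ma12(x):
    """Centered 2x12 moving average; None where the 13-point window does not fit."""
    n = len(x)
    trend = [None] * n
    for t in range(6, n - 6):
        window = x[t - 6:t + 7]   # 13 points, the two end points weighted 1/2
        trend[t] = (0.5 * window[0] + sum(window[1:12]) + 0.5 * window[12]) / 12.0
    return trend

def seasonal_indices(x, first_month=1):
    """Additive seasonal component: mean detrended value per calendar month."""
    trend = centered_ma12(x)
    buckets = {m: [] for m in range(1, 13)}
    for t, (value, level) in enumerate(zip(x, trend)):
        if level is not None:
            month = (first_month - 1 + t) % 12 + 1
            buckets[month].append(value - level)
    raw = {m: sum(v) / len(v) for m, v in buckets.items()}
    mean_raw = sum(raw.values()) / 12.0
    return {m: raw[m] - mean_raw for m in raw}   # indices sum to zero

# Synthetic 60-month series starting in January, with a weak upward trend
# and a seasonal peak in July (illustrative data only).
series = [10.0 + 0.01 * t + 2.0 * math.cos(2 * math.pi * ((t % 12) - 6) / 12)
          for t in range(60)]

sea = seasonal_indices(series, first_month=1)
peak_month = max(sea, key=sea.get)   # calendar month of the seasonal maximum
```

With at least 60 consecutive monthly values, as required in the study, each calendar month contributes four detrended values to its seasonal index.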

3.3 Cluster and principal components analysis

The clustering procedure (Ward's linkage algorithm, squared Euclidean distance as similarity measure) has led to the formation of the following significant and stable clusters with respect to the nineteen objects (sampling sites; in this case each site is represented by the average chemical concentration of each variable over the whole period of monitoring):
K1 (122, 123, 124, 298, 299)
K2 (120, 121, 125, 127, 294)
K3 (126, 301, 399, 400, 401, 402, 403)
K4 (293, 297)

It may be concluded that the groups of similarity are formed mainly according to the site separation into several categories: rural (K1), urban (K2), inlet (K3) and background (K4) sites. It is obvious that the water quality is related to a certain extent to the site location. An important consequence could be a certain reduction of the sites along the river stream, provided there is a clear objective regarding the location characteristics. It seems much more important to organize several well-equipped and monitored sites from the categories (patterns) "urban", "rural", "inlet" and "background" instead of distributing many sites with a similar pattern of quality. This suggestion has to be confirmed with a more complete monitoring data set obtained without any lack of analytical data for several years. Probably, the influence of different anthropogenic and natural factors on the water quality is related to typical wastes (e.g. waste waters coming with the tributaries or after intensive agricultural activity etc.) or seasonal changes in the water level (especially for rural sites where no serious anthropogenic contribution is expected).
The cluster analysis of the sixteen variables indicates the formation of four significant groups of similarity:
K1 (dissolved matter)
K2 (sulfate, calcium, suspended matter, COD)
K3 (magnesium, chloride, free acidity, BOD5)
K4 (iron, nitrite, nitrate, ammonium, hardness, dissolved oxygen, pH)

The linkage of certain chemical components is probably due to the correlation between them and it is an indication of the existence of several reasons for similarity. These reasons will be commented on after presentation of the PCA results.

In principle, PCA confirms the clustering of the variables as shown in the factor loadings table (Table 3). Four latent parameters (factors or principal components) explain a substantial part of the total variation of the set (the explained variance is about 79 %). The first factor could be conditionally named "anthropogenic". It consists of chemical components that are related to products from anthropogenic activity like dissolved oxygen, BOD5, free acidity, suspended matter, chloride, ammonium, and sulfate. The significance of this latent parameter for the water quality is quite high as it explains over 30 % of the total variance of the system. The second latent factor is probably of natural origin and is conditionally named "water hardness". It contains all chemical parameters that are responsible for the water hardness like calcium, magnesium and the parameter total hardness itself and, additionally, dissolved matter. The third latent factor is related mainly to the nitrite and nitrate concentrations and to the iron content as well as to the chemical oxygen demand (COD). This factor is named "biological" since the parameters mentioned are dominantly linked to biological activity. The last factor is conditionally named "acidity" and is related to the pH value of the water.

Variable            PC1       PC2       PC3       PC4
pH                                                0.65
Dissolved O2        0.76
BOD5                0.86
Free acidity        0.93
COD                                     0.86
Diss. matter                  0.94
Ndiss. matter       0.76
Cl-                 0.93
SO4(2-)             0.88
NH4+                0.74
NO2-                                    0.62
NO3-                                    0.87
Fe3+                                    0.76
Ca2+                          0.92
Mg2+                          0.87
Hardness                      0.65
Expl. variance      32.52 %   21.74 %   14.87 %   9.72 %

Table 3 Factor loadings (only statistically significant figures are given).

After estimation of the factors responsible for the data structure, an apportioning procedure is carried out to evaluate the contribution of each possible source to the total mass of the surface water. The modeling was performed according to the apportioning approach of Thurston and Spengler, where the total mass of the sample is distributed between the sources identified by PCA (four in this case) after multiple linear regression of the total mass on the absolute principal components scores from the PCA. More or less, the apportioning reflects the weight of each latent factor on the sample mass. Thus, it is possible to determine the impact of different factors, both anthropogenic and natural, on the surface water quality. In Table 4 the results of the source apportioning are presented.

Variable         Anthr. factor   Hardness   Biological   Acidic   R²
pH                        15.8          -            -     84.2   0.39
Dissolved O2              73.4          -         17.7      8.9   0.68
BOD5                      81.7          -         18.3        -   0.64
Free acidity              86.4          -            -     14.6   0.81
COD                       13.7          -         71.7     14.6   0.66
Diss. matter              15.8       69.6         14.6        -   0.76
Ndiss. matter             73.7       12.3         14.0        -   0.84
Cl-                       88.1          -         11.9        -   0.58
SO4(2-)                   74.2       12.8            -     10.0   0.64
NH4+                      86.7          -         13.3        -   0.67
NO2-                      10.1          -         79.9     10.0   0.41
NO3-                      11.0          -         82.4      6.6   0.52
Fe3+                       8.8          -         91.2        -   0.44
Ca2+                         -       92.4          7.6        -   0.81
Mg2+                         -       94.2          5.8        -   0.82
Hardness                     -        100            -        -   0.81

Table 4 Source apportioning results (in % contribution).

It is readily seen that the contribution of each latent factor to the portion of the mass for each chemical parameter is different according to the different impact of the source on the concentration. For instance, the chloride concentration is distributed between the anthropogenic factor (88.1 %) and the biological factor (11.9 %), but the anthropogenic impact is much higher. Similar conclusions could be made for any chemical parameter involved in water quality.

In the last column of the table the multiple correlation coefficient R² is presented (calculated between the experimentally found values and the values of the total mass calculated by the regression model). It gives an idea about the adequacy of the respective models for each of the chemical parameters. The non-significant coefficients are underlined. As a whole, most of the models are statistically adequate and can be used for predictive purposes.
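The apportioning step can be sketched along the lines of the absolute principal component scores (APCS) regression approach. The example below is an illustration under several assumptions: it uses unrotated components rather than Varimax-rotated ones, synthetic two-source data, a hypothetical function name `apcs_mlr`, and a normalization in which the absolute factor contributions of each variable are scaled to sum to 100 %.

```python
import numpy as np

def apcs_mlr(X, n_factors):
    """Percentage contribution of each factor to each variable, plus R² per variable."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigval)[::-1][:n_factors]
    # Score coefficients that give unit-variance component scores.
    coef = eigvec[:, order] / np.sqrt(eigval[order])
    scores = Z @ coef
    # Scores of an artificial sample with all concentrations equal to zero.
    z0 = (0.0 - mean) / std
    apcs = scores - z0 @ coef            # absolute principal component scores
    # Multiple linear regression of every variable on the APCS (with intercept).
    design = np.column_stack([np.ones(len(X)), apcs])
    beta, *_ = np.linalg.lstsq(design, X, rcond=None)
    fitted = design @ beta
    r2 = 1.0 - ((X - fitted) ** 2).sum(axis=0) / ((X - mean) ** 2).sum(axis=0)
    # Mean contribution of each factor: regression slope times mean APCS.
    contrib = beta[1:] * apcs.mean(axis=0)[:, None]
    share = 100.0 * np.abs(contrib) / np.abs(contrib).sum(axis=0)
    return share.T, r2

# Synthetic data: two sources with fixed chemical profiles (illustrative only).
rng = np.random.default_rng(2)
strengths = rng.uniform(1.0, 5.0, size=(120, 2))
profiles = np.array([[4.0, 3.0, 0.2, 0.1],
                     [0.1, 0.3, 5.0, 2.0]])
X = strengths @ profiles + 0.05 * rng.normal(size=(120, 4))

share, r2 = apcs_mlr(X, n_factors=2)   # rows: variables, columns: factors
```

Each row of `share` corresponds to one line of a table such as Table 4, and `r2` plays the role of its last column.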

4 Conclusion

There is clear evidence that the Struma river water quality indicators undergo typical changes in the period of monitoring (1989-1998), which do not dramatically influence its general water quality characteristics. As a whole, the lack of significant trends for most of the chemical indicators and for most of the sampling sites is an important indication of the stability of the water quality. However, the organization of the monitoring net and the data collection require more effort in order to get reliable data sets with good data quality.

The relative stability of the water quality throughout the long-term observation period is subject to changes within certain time periods for certain indicators and sites. As can be expected, the typical urban sites where the anthropogenic activity is higher show in many cases various tendencies (site 127 near the town of Blagoevgrad or site 124 near the Greek border) related to large numbers of chemical indicators (six or seven). But it is quite difficult to accept thoroughly the hypothesis that one can distinguish a typical "urban pattern" for the water quality in the region of interest. The tributary sites (399, 400, 401, 402, 403, 299, and 126) are characterized by higher dynamics of the water quality and it may be stated that a "tributary pattern" is formed with statistically significant trends of many chemical indicators. Probably, seasonal and local events play a major role in this behavior, linked to collection of waste within a smaller water volume. It is also hard to detect a typical "rural" or "clean" pattern for sites that are neither urban nor tributaries. Some of the assumed "clean" sites (122, 123, 293) indeed show stability of the water quality and significant trends for only one to three chemical indicators. Others, however (sites 297, 298), are characterized by higher dynamics of indicator changes.
Therefore, the trend study is an important indication of the stability of the Struma river water quality and makes it possible to select regions of the main stream and the tributaries which are more easily influenced by seasonal or sporadic local events.

Although it is quite difficult to expect specific seasonal patterns in the water quality of a large river system, some conclusions could be derived from the present study.

The rural sites tend toward seasonal patterns including winter maximums in the concentration values for dissolved oxygen, dissolved substances, sulfate, chloride and ammonium, reflecting the late autumn agricultural activity, and summer maximums for pH, COD, BOD5, suspended solids, calcium, magnesium, total hardness, iron, nitrate, nitrite and free acidity, corresponding to the decrease in the water level, which leads to an effective concentration increase.

For the urban sites, winter maximums are probably typical for species related to anthropogenic influence, like ammonium, nitrate, nitrite, iron (January), COD, BOD5, free acidity, pH and dissolved oxygen, while in the summer period maximums can be detected for calcium, magnesium, total hardness, dissolved substances, suspended solids, chloride and sulfate, resulting from lower water levels.

The tributary sites are subject to waste waters and their seasonal pattern. It may be said that the water quality indicators possess a local specificity. Resemblance both to rural and urban sites could be found, depending on the specific location of the tributary sites. Again, anthropogenic and natural influences could be taken into account to explain some of the seasonal peaks.

Finally, the pattern of the background sites is influenced only by natural factors, and in this case the changes of the water level play a substantial role in the seasonal behaviour of the water quality.
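A simple way to look for winter or summer maximums of the kind described above is to average each indicator by calendar month over the whole monitoring period and to find the month with the highest mean. The following Python sketch shows this idea only; the function names, the assignment of months to seasons and the two synthetic series are assumptions made for illustration and do not reproduce the time-series analysis or the data of the study.

```python
from collections import defaultdict
from math import cos, pi
from statistics import fmean

# Assumption: meteorological seasons for the Northern Hemisphere.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def monthly_means(records):
    """Average (month, value) records by calendar month (1..12)."""
    by_month = defaultdict(list)
    for month, value in records:
        by_month[month].append(value)
    return {month: fmean(values) for month, values in sorted(by_month.items())}


def peak_season(records):
    """Return the month with the highest mean value and its season."""
    means = monthly_means(records)
    peak_month = max(means, key=means.get)
    return peak_month, SEASON_OF_MONTH[peak_month]


# Synthetic two-year series (hypothetical values, mg/L).
ammonium = [(month, 0.5 + 0.3 * cos(2 * pi * (month - 1) / 12))
            for _year in range(2) for month in range(1, 13)]   # highest in January
calcium = [(month, 40.0 - 10.0 * cos(2 * pi * (month - 1) / 12))
           for _year in range(2) for month in range(1, 13)]    # highest in July

print(peak_season(ammonium), peak_season(calcium))
```

Comparing the peak season of each indicator across the rural, urban, tributary and background groups would give a first impression of the patterns discussed in this section.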
The multivariate statistical modelling of the data set has indicated that the sampling stations could be clustered in groups of similarity according to four main patterns: rural, urban, inlet and background sites. These results may indicate that the organization of the monitoring requires, on the one hand, the involvement of every type of site and, on the other, that the total number of sites can be reduced if enough representative sites are kept for the four main patterns.

Four latent factors are responsible for the data set structure. These factors are conditionally named anthropogenic, water hardness, biological and acidic. Their configuration is confirmed both by cluster analysis and by principal components analysis.

The source apportioning indicated the extent to which chemical concentrations in the surface water are influenced by anthropogenic, acidic and natural factors. In most cases the mass formation is a function not only of the acidic factor but also of anthropogenic or natural factors.

The surface water quality in the Bulgarian region of the Struma river stream and the respective tributaries, therefore, depends on several statistically significant factors as follows:

1. Sampling site location: rural, urban, inlet or background position;
2. Relationship between the chemical variables leading to the formation of four latent factors determining the data structure: anthropogenic, water hardness (natural), biological and acidic; the last two factors could be considered as anthropogenically influenced;
3. Quantitative distribution (apportioning) of the chemical concentrations between the latent factors, indicating the origin of the chemical variables: anthropogenic and natural.

The application of multivariate statistical approaches to the monitoring data made it possible to collect new types of information about sampling and the chemical constitution of the water phase.

References

[1] S. Stefanov, V. Simeonov, S. Tsakovski: "Chemometrical Analysis of Waste Water Monitoring Data from Yantra River Basin, Bulgaria", Toxicological and Environmental Chemistry, Vol. 70, (1999), pp. 473–482.
[2] V. Simeonov, S. Stefanov, S. Tsakovski: "Environmetrical Treatment of Water Quality Survey Data from Yantra River, Bulgaria", Mikrochimica Acta, Vol. 132, (2000), pp. 15–21.
[3] S. Tsakovski, V. Simeonov, S. Stefanov: "Time-series Analysis of Long-Term Water Quality Records from Yantra River Basin, Bulgaria", Fresenius Environmental Bulletin, Vol. 8, (1999), pp. 28–36.
[4] Bulgarian State Standards, EN and ISO, Sofia, 1985.
[5] V. Simeonov: Principles of Analytical Data Treatment, University Press, Sofia, 1998.
[6] J.W. Einax, H.W. Zwanziger, S. Geiss: Chemometrics in Environmental Analysis, VCH, Weinheim, 1997.
[7] E. Foerster and B. Roenz: Methoden der Korrelations- und Regressionsanalyse, Die Wirtschaft, Berlin, 1979.
[8] R. Schlittingen and B.H. Jansen: Zeitreihenanalyse, Oldenburg, Muenchen, 1989.
[9] C. Chatfield: The Analysis of Time Series: An Introduction, 4th Edition, Chapman and Hall, London, 1989.
[10] P.J. Brockwell and R.A. Davis: Time Series: Theory and Methods, Springer, Heidelberg, 1987.
[11] D.L. Massart and L. Kaufman: The Interpretation of Analytical Chemical Data by the Use of Cluster Analysis, J. Wiley, New York, 1983.
[12] B. Vanderginste, D.L. Massart, L. Buydens, S. De Jong, P. Lewi, J. Smeyers-Verbeke: Handbook of Chemometrics and Qualimetrics, Elsevier, Amsterdam, 1998.
[13] G. Thurston and J. Spengler: "A Quantitative Assessment of Source Contributions to Inhalable Particulate Matter Pollution in Metropolitan Boston", Atmospheric Environment, Vol. 19, (1985), pp. 9–25.
