Extreme Hydrological Events: Precipitation, Floods and Droughts (Proceedings of the Yokohama Symposium, July 1993). IAHS Publ. no. 213, 1993.

Some characteristics of heavy rainfalls in the Yamato river basin found by the principal component and cluster analyses

M. KADOYA & H. CHIKAMORI Disaster Prevention Research Institute, Kyoto University, Gokasho, Uji, Kyoto 611, T. ICHIOKA Sogochosasekkei Co., Ltd., Umedakita building, Shibata 1-8-15 Kita-ku, 530, Japan

Abstract The spatial distribution of heavy rainfalls causing floods in the Yamato River basin, located in the Kinki district, is examined by applying both principal component and cluster analyses. The principal component analysis shows that the rain gauge stations in this basin can be arranged into eight groups according to the common characteristics of rain storms. The result of the cluster analysis differs slightly from this, but supports it in the practical sense. Finally, the correlations of flood peak discharges at Kashiwara with 12-, 24-, and 48-hour maximum rainfalls averaged over the basin are examined. The result shows that the flood peak discharges have a strong correlation with areal 12-hour rainfalls, especially with those in newly urbanized areas along the main channel.

INTRODUCTION

Clarifying the characteristics of heavy rainfalls is of fundamental importance in the planning of flood control and the design of river structures. In this paper, the Yamato River basin, located in the Kinki district, is chosen as the research basin, and the relation between the characteristics of heavy rainfall in the basin and flood peak discharges at Kashiwara is examined by applying both principal component and cluster analyses to the rainfall data obtained in the basin. First, principal component analysis is applied to the spatial distribution pattern of the maximum 12-, 24-, and 48-hour rainfalls observed in and around the basin, and the basin is regionalized and the rain storms are grouped on the basis of the characteristics of the spatial distribution of rainfall. Next, the regionalization of the basin and the grouping of rain storms are also performed by cluster analysis, and the results are compared with those of the principal component analysis. Finally, the correlation of flood peak discharge at Kashiwara with areal rainfall for a given duration is investigated, and the effect of the spatial distribution of rainfall is clarified.

RESEARCH BASIN AND RAINFALL DATA

The Yamato River rises from the Kasagi Mountainous Zone, flows through the Nara and the Kawachi Plains, and goes down into the . Since the basin of this river has an area as large as 1070 km², the flood formation process is not simple. Flood risk in this basin has increased because the middle zone of the basin has been urbanized rapidly over the last three decades. The basin suffered a severe flood disaster caused by a record heavy rainfall on 1-3 August 1982. We use the data of 12-, 24-, and 48-hour maximum rainfalls observed at 26 points in and around the basin (Fig. 1) for 19 rain storm cases during 1966-1985 in which the flood peak discharge at Kashiwara exceeded 1000 m³/s or the 2-day rainfall at Nara exceeded 100 mm. For every storm, the time period of each rainfall duration is defined at Nara. Unrecorded rainfall data are substituted by values estimated from isohyetal maps. Moreover, the hourly rainfall data unrecorded at Sakai before 1974 are substituted by those at Otori, located near the Sakai station.

Fig. 1 Locations of rain gauge stations in and around the Yamato river basin.

PRINCIPAL COMPONENT ANALYSIS

Interpretation of principal components

In this analysis, rain gauge stations are treated as individuals and rainfalls as variables. Fig. 2 shows the distributions of scores of principal components for 12-hour rainfall. In this figure, the Yamato River basin is divided into 26 polygons using the Thiessen method, and the darker the polygon governed by a gauging point, the higher the score of the principal component at this point.

Fig. 2 Distributions of scores of principal components for 12-hour rainfall: (a) for the first, (b) for the second, (c) for the third, (d) for the fourth.

In general, the first principal component is said to be an index of magnitude. This tendency is well seen in Fig. 3. Fig. 3(a) shows the relation between the scores of the first principal component and the mean point rainfalls, defined as the mean rainfall over all storms at a given point, and Fig. 3(b) shows the relation between the factor loadings and the coefficients of correlation between the point rainfalls of each storm and the mean point rainfalls. On the other hand, Fig. 2(a) shows that the scores of the first principal component tend to increase toward the north part of this basin. Thus we conclude that the first principal component also represents the variation of rainfall in the south-north direction. The score of the second principal component increases toward the west part of this basin, as seen in Fig. 2(b). The score of the third principal component tends to decrease near the main channel of the Yamato River, as seen in Fig. 2(c), and the score of the fourth principal component tends to increase near Nara and the east mountainous area. However, the tendencies of the third and the fourth principal components are less clear than those of the first and the second.
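The roles of scores and factor loadings described above can be illustrated with a minimal sketch. It assumes that the analysis is run on the correlation matrix of the storms (the paper does not say whether the correlation or the covariance matrix was used), extracts only the first component by power iteration, and uses hypothetical function names. Rows are stations (individuals) and columns are storms (variables), as in the text.

```python
import math


def standardize_columns(data):
    """Scale each column (storm) to zero mean and unit variance across rows (stations)."""
    n, p = len(data), len(data[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in data]
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        cols.append([(x - mean) / sd for x in col])
    # Transpose back to the station-by-storm layout.
    return [[cols[j][i] for j in range(p)] for i in range(n)]


def first_component(data, iterations=500):
    """Return (contribution, loadings, scores) of the first principal component.

    Assumption: PCA on the correlation matrix between storms.
    contribution: share of total variance (0..1) carried by the component.
    loadings: one value per storm; scores: one value per station.
    """
    z = standardize_columns(data)
    n, p = len(z), len(z[0])
    # Correlation matrix between storms (variables).
    corr = [[sum(z[i][a] * z[i][b] for i in range(n)) / n for b in range(p)]
            for a in range(p)]
    # Power iteration for the leading eigenvector.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    contribution = eigenvalue / p  # the trace of a correlation matrix equals p
    loadings = [math.sqrt(eigenvalue) * x for x in v]
    scores = [sum(z[i][j] * v[j] for j in range(p)) for i in range(n)]
    return contribution, loadings, scores
```

With rainfall that rises together at all stations from storm to storm, the first score simply orders the stations by how wet they are on average, which is the "index of magnitude" reading given in the text.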

Fig. 3 Scatter diagrams for the first principal component for 12-hour rainfalls: (a) score of the first principal component against mean rainfall at each point; (b) factor loading of the first principal component against the coefficient of correlation between point and mean rainfalls of every storm at given points.

Table 1 shows that the cumulative contribution reaches almost 80% by the fourth principal component. We therefore conclude that the characteristics of heavy rainfalls in this basin can be adequately explained using the scores of the first four principal components.

Table 1 Cumulative contributions of principal components (%).

Duration of        Principal component
rainfall (hr)   1st    2nd    3rd    4th    5th
12              42.3   59.3   72.0   81.0   84.8
24              41.8   56.9   68.4   77.5   82.1
48              41.6   57.7   71.0   78.9   83.8
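The share carried by each component on its own follows from Table 1 by differencing successive cumulative values. The short sketch below does only that arithmetic; the numbers are copied from Table 1, and the function name is hypothetical.

```python
# Cumulative contributions (%) copied from Table 1, keyed by rainfall duration (hours).
CUMULATIVE = {
    12: [42.3, 59.3, 72.0, 81.0, 84.8],
    24: [41.8, 56.9, 68.4, 77.5, 82.1],
    48: [41.6, 57.7, 71.0, 78.9, 83.8],
}


def individual_contributions(cumulative):
    """Difference successive cumulative values to get each component's own share (%)."""
    shares = []
    previous = 0.0
    for value in cumulative:
        shares.append(round(value - previous, 1))
        previous = value
    return shares


for hours, values in CUMULATIVE.items():
    print(hours, individual_contributions(values))
```

For every duration the first component carries a little over 40% and the fifth adds less than 5 percentage points, which is consistent with stopping at the fourth component.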

On the basis of the interpretation of the principal components given above, we discuss the regionalization of this basin according to differences in the characteristics of heavy rainfalls. Fig. 4(a) shows the relation between the scores of the first and the second principal components, and Fig. 4(b) the relation between those of the third and the fourth. Rain gauge stations can be classified into six groups, I to VI, as shown in Fig. 4(a). Since rain gauge 15 is far from the other stations belonging to group III in Fig. 4(b), it is separated from group III and treated as another group. Group III can be further divided into two groups, stations (4, 5, 10) and (6, 18, 19), because these two groups are far from each other in Fig. 4(b). Furthermore, group VI can be divided into two parts according to whether the score of the third principal component is positive or negative. As a result, rain gauge stations are arranged into 9 groups for 12-hour rainfall.

Fig. 4 Scores of principal components for 12-hour rainfall: (a) relations between the first and second principal components; (b) relations between the third and fourth principal components.

Fig. 5 Regions divided by the principal component analysis of 12-hour rainfall.

We also investigate the distribution of scores of the principal components for 24- and 48-hour rainfalls in the same way as for 12-hour rainfall; the basin is regionalized into 7 regions for 24-hour rainfall and 8 regions for 48-hour rainfall. The differences among these regionalizations are summarized as follows.
a) Regions 8 and 9 in Fig. 5 are contained in one region for 24-hour rainfall, while this region is divided into two different regions for 48-hour rainfall, namely rain gauge station No. 20 and a region consisting of 5 stations (21, 22, 24, 25, 26).
b) Regions 2 and 3 in Fig. 5 are contained in one region for both 24- and 48-hour rainfall.
c) Station No. 3, belonging to region 4 in Fig. 5, joins region 5, located in the upper stream of the main channel, for 24-hour rainfall.

However, in order to provide useful information for the planning of flood control, it is desirable that the regionalization expressing the characteristics of heavy rainfalls be summarized into a single type that eliminates the effect of rainfall duration. Fortunately, the differences among the regionalizations for the three durations are not severe, so the summarization is performed as follows.
a) Six gauges (20, 21, 22, 24, 25, 26) contained in either region 8 or 9 are far from the main channel, so the difference between the effects of rainfall in the two regions on flood peaks in the main channel is not severe. Therefore, these six stations are merged into one group.
b) Four gauges (12, 13, 14, 15) contained in either region 2 or 3 are also far from the main channel. Therefore, these four stations are merged into one group.
c) Rain gauge No. 3 is located in the midstream of the main channel and is considered to be a key point for flood discharge. Therefore, it is treated as the representative point of one separate region.

As a result, rain gauges in this basin are arranged into 8 groups, as shown in Fig. 6. Fig. 7 shows the distributions of areal 12-hour rainfall of every storm in these regions. In the next section, we will refer to this figure in the process of grouping rain storms.

Grouping of rain storms

We try to group rain storms considering the distribution of factor loadings obtained by the principal component analysis. Before grouping rain storms, we consider the characteristics represented by factor loadings. The characteristics for 12-hour rainfall are as follows. The factor loading of the first principal component corresponds well with the coefficient of correlation between point and mean rainfalls of every storm at a given point. Therefore, if this value is high, rainfall is large, especially in the north area. The factor loading of the second principal component expresses a variation of rainfall in the east-west direction. If this factor loading is high, rainfall is large in the west area. Following the same idea, if the factor loading of the third principal component is high, rainfall around the middle and downstream reaches of the main channel is large. For 24 hours, however, rainfall in the south-west area of this basin becomes large when the factor loading of the third principal component is high. Similarly, when the factor loading of the fourth principal component is high, rainfall near Nara tends to be large.

Fig. 6 Eight regions summarized by the principal component analyses of 12-, 24-, and 48-hour rainfalls.

Fig. 7 Regional distribution patterns of maximum 12-hour rainfall. The region is defined in Fig. 6. Each panel plots rainfall (axis 0 to 300) against region number for one storm. Legible panel titles: No. 1, Jul 1, 1966; No. 2, Jul 8, 1967; No. 3, Oct 27, 1967; No. 4, Jul 5, 1968; No. 7, Sep 16, 1972; No. 8, Apr 8, 1974; No. 9, Jul 3, 1975; No. 10, Aug 6, 1975; No. 11, Sep 8, 1976; No. 12, Jun 23, 1978; No. 13, Jun 27, 1979; No. 14, Jul 10, 1980; No. 15, Aug 2, 1982; No. 16, Jun 20, 1983; No. 17, Sep 28, 1983; No. 18, Jun 26, 1984; No. 19, Jun 25, 1985.

The criterion for rain storm grouping is set as follows. First, a rain storm is selected if the absolute value of its factor loading exceeds 0.7. If the number of rain storms selected by this criterion is not enough for grouping, the value is lowered to 0.6. These thresholds are almost equivalent to requiring that the contribution of the principal component exceed 49% and 36%, respectively. The actual work is performed as follows. The first selection is done by the line of 0.6 in Fig. 8, and two groups, (6, 8, 9, 14, 16, 18, 19) and (3, 11, 12), are selected for the first principal component. In the same way, the group (2, 13, 15) is selected for the second, (7, 17) and (5) for the third, and (10) for the fourth. Since rain storms (1) and (4) are hard to classify based on Fig. 8, they are classified considering the distribution of 12-hour rainfall in Fig. 7. As a result, the 19 rain storms are grouped into six types, named after the regions in which the rainfall of the grouped storms concentrates:
A. North Part: 6, 8, 9, 14, 16, 18, 19
B. South Part: 3, 11, 12
C. West Part: 2, 13, 15
D. North and South Parts: 7, 17
E. Middle and Down Stream of the Main Channel: 4, 5
F. Whole Basin: 1, 10

The same technique is applied to 24- and 48-hour rainfalls, and the results are almost the same as those for 12-hour rainfall.
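The link between the loading thresholds and the quoted percentages is that, for a standardized variable, the squared factor loading is the share of that storm's variance carried by the component: 0.7² = 0.49 and 0.6² = 0.36. The sketch below illustrates this together with the two-level selection rule. The function names, the `minimum` parameter (the paper says only that the number of storms must be "enough"), and the split of selected storms by the sign of their loading are all assumptions made for illustration.

```python
def explained_percent(loading):
    """Percent of a standardized storm's variance carried by one component."""
    return 100.0 * loading ** 2


def select_storms(loadings, component, threshold=0.7, fallback=0.6, minimum=2):
    """Pick storms whose |loading| on `component` exceeds the threshold.

    loadings: dict mapping storm number -> list of factor loadings.
    If fewer than `minimum` storms pass (hypothetical rule), the threshold
    is lowered once to `fallback`.
    Returns (threshold_used, positive_storms, negative_storms).
    """
    for level in (threshold, fallback):
        positive = sorted(s for s, row in loadings.items() if row[component] > level)
        negative = sorted(s for s, row in loadings.items() if row[component] < -level)
        if len(positive) + len(negative) >= minimum:
            break
    return level, positive, negative
```

Called with a table of loadings per storm, `select_storms(loadings, 0)` returns the storms that load strongly on the first component, split into a positive and a negative set.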

Fig. 8 Factor loadings of principal components for each rain storm: (a) the first and second principal components; (b) the third and fourth principal components.

CLUSTER ANALYSIS AND ITS RESULTS

It is known that principal component analysis is effective in clarifying the characteristics of heavy rainfalls in mountainous rainy areas (Kadoya et al., 1987, 1989). However, the results obtained in the Yamato River basin, where the flat area is extensive and rainfall is generally smaller than in mountainous areas, are not all clear-cut. Cluster analysis (Okuno et al., 1976; Tanaka & Wakimoto, 1983) is therefore performed to compare its result with, and to supplement, that of the principal component analysis. Cluster analysis includes both hierarchical and non-hierarchical methods; the former is used in this study.

Regionalization

A dendrogram for the grouping of rain gauge stations for 12-hour rainfall is shown in Fig. 9. Cutting this dendrogram at a dissimilarity level of 25 gives six regions, as shown in Fig. 10. This result corresponds to that of the principal component analysis as follows. Region III in Fig. 9 corresponds to region 2 in Fig. 5. Region I consists of regions 1 and 3, and region VI consists of regions 8 and 9. In the cluster analysis, rain gauges Nos. 7 and 10 are grouped with the cluster containing gauges Nos. 6, 18, and 19, but this cluster does not coincide with any region obtained by the principal component analysis. Rain gauges Nos. 7 and 10 are not so clearly characterized in the principal component analysis, so this difference between the two results is considered to be inevitable. Regionalizations by 24- and 48-hour rainfalls are also performed, and their results are almost the same as those of the principal component analysis.

Fig. 9 Dendrogram for clustering rain gauge stations by 12-hour rainfalls (horizontal axis: gauging station).

Fig. 10 Regions divided by the cluster analysis of 12-hour rainfall.
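Cutting a dendrogram at a fixed dissimilarity level, as done for Fig. 9, can be sketched as follows. The paper states only that a hierarchical method is used; the Euclidean distance and the average linkage in this sketch are assumptions, and the function names are hypothetical. Because average linkage never merges at a lower level than an earlier merge, stopping as soon as the closest pair of clusters lies above the cut level gives the same groups as cutting the full dendrogram there.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length rainfall vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cut_clusters(points, cut_level):
    """Agglomerative clustering (average linkage) stopped at a dissimilarity level.

    points: dict mapping a label (e.g. station number) -> list of values.
    Returns a sorted list of clusters, each a sorted list of labels.
    """
    clusters = [[label] for label in points]

    def dissimilarity(c1, c2):
        # Average linkage: mean pairwise distance between the two clusters.
        total = sum(euclidean(points[a], points[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dissimilarity(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > cut_level:
            break  # every remaining merge lies above the cut
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)
```

A different linkage rule would generally give a different dendrogram, so the level of 25 used in the paper is meaningful only for the authors' own choice of dissimilarity.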

Grouping of rain storms

In the grouping of rain storms, storm cases are treated as individuals and rainfalls as variables. Rainfall is standardized to a variate with a mean of zero and a standard deviation of one in every storm, so that the influence of rainfall magnitude is eliminated. A dendrogram for 12-hour rainfall is shown in Fig. 11. Cutting this dendrogram at a dissimilarity level of 25 gives five clusters. Comparing these clusters with the rain storm types grouped by the principal component analysis clarifies the following relations. Cluster III in Fig. 11 corresponds to group B, and cluster V is almost the same as group A. Moreover, cluster I consists of groups D and F. Consequently, it is concluded that the groups of rain storms obtained by the cluster analysis are almost the same as those obtained by the principal component analysis. The same technique is used for 24- and 48-hour rainfalls, and the results are almost the same as those of the principal component analysis.
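The effect of standardizing within each storm can be seen in a small sketch. It assumes the population standard deviation (the paper does not say which form was used), and the function name and the rainfall values are hypothetical. Two storms with the same spatial pattern but different totals become identical after standardization, so only the pattern enters the clustering.

```python
import math


def standardize_storm(rainfalls):
    """Rescale one storm's station rainfalls to mean 0 and standard deviation 1."""
    n = len(rainfalls)
    mean = sum(rainfalls) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in rainfalls) / n)
    return [(x - mean) / sd for x in rainfalls]


# Hypothetical storms: the second has twice the rainfall of the first at every station.
weak = standardize_storm([50.0, 100.0, 150.0])
strong = standardize_storm([100.0, 200.0, 300.0])
print(weak)
print(strong)
```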

Fig. 11 Dendrogram for clustering rain storm cases by 12-hour rainfall in each station (horizontal axis: rain storm No.).

RELATION BETWEEN PEAK DISCHARGE AT KASHIWARA AND CHARACTERISTICS OF HEAVY RAINFALL

The correlation of flood peaks at Kashiwara with 12-, 24-, and 48-hour rainfalls is examined. Fig. 12 shows the flood peak discharge for every rain storm case, and Fig. 13 the areal rainfalls over the basin. These figures show that a severe flood is apt to occur when 12-hour rainfall is large, for instance in storms Nos. 1, 13, and 15. The coefficients of correlation between flood peak discharges and rainfalls are 0.93 for 12 hours, 0.75 for 24 hours, and 0.74 for 48 hours; that is, flood peak discharges have a stronger correlation with 12-hour rainfall than with 24- or 48-hour rainfall.

Fig. 12 Flood peak discharge at Kashiwara in every rain storm (horizontal axis: rain storm No.).

Next, the relation between flood peak discharges and the patterns of spatial variation of rainfall is examined. The 12-hour rainfall in storm No. 13 is almost the same as that in storm No. 11 or 12, as seen in Fig. 13. However, the flood peak discharge for storm 13 is larger than that for the other two storms. The reason is attributed to the difference in the spatial distribution patterns of rainfall: the centre of rainfall in storm 13 is around the main channel, while those of the other two storms are in the south mountainous area. Moreover, storm 19 has no large 12-hour rainfall, but its flood peak is over 1000 m³/s. The reason is almost the same as that for storm 13. As a result, it is concluded that flood discharges at Kashiwara have a strong correlation with 12-hour rainfalls, in particular when rainfall becomes heavy in newly urbanized areas along the main channel.
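The comparison of correlation coefficients across durations can be sketched as below. The Pearson coefficient is assumed (the paper says only "coefficient of correlation"), the function names are hypothetical, and the numbers are invented values for five storms, not the paper's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_durations(peaks, rainfall_by_duration):
    """Return (duration, correlation) pairs, strongest correlation first."""
    pairs = [(hours, pearson(rain, peaks))
             for hours, rain in rainfall_by_duration.items()]
    return sorted(pairs, key=lambda item: item[1], reverse=True)


# Hypothetical values for five storms (NOT the paper's data).
peaks = [600.0, 900.0, 1200.0, 1800.0, 2500.0]   # peak discharge, m3/s
areal_rain = {
    12: [50.0, 70.0, 90.0, 130.0, 180.0],         # areal rainfall, mm
    24: [90.0, 80.0, 140.0, 150.0, 210.0],
    48: [130.0, 100.0, 150.0, 220.0, 230.0],
}
for hours, r in rank_durations(peaks, areal_rain):
    print(f"{hours:2d}-hour rainfall: r = {r:.2f}")
```

With only 19 storms in the actual data set, a coefficient of this kind is sensitive to one or two extreme events, so the ranking of durations is best read together with the storm-by-storm comparison in the text.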

Fig. 13 Areal rainfall in every rain storm, for durations of 12, 24, and 48 hours (horizontal axis: rain storm No.).

CONCLUSION

In this paper, the statistical characteristics of rain storms in the Yamato River basin are discussed on the basis of the results of both principal component and cluster analyses. As a result, it is found that flood peak discharges at Kashiwara are strongly influenced by intense rainfall of a short duration such as 12 hours, especially when the rainfall is large in newly urbanized areas along the main channel.

REFERENCES

Okuno, T. et al. (1971) Methods of Multivariate Analysis, 159-258. JUSE Press, Tokyo, Japan (in Japanese).
Okuno, T. et al. (1976) Methods of Multivariate Analysis, second series, 207-237. JUSE Press, Tokyo, Japan (in Japanese).
Kadoya, M., Tanakamaru, H. & Yoneda, T. (1987) Some characteristics of heavy rainfalls in the upper basin of the and their effects on flood runoff forecasting. Proc. Annual Conference of , JSCE, 11-55-1 - 2 (in Japanese).
Kadoya, M., Tanakamaru, H. & Shibata, Y. (1989) Application of principal component analysis to examining characteristics of heavy rainfalls in the upper basin of the River Echi. Proc. Annual Conference of Kansai region, JSCE, II-29-1 - 2 (in Japanese).
Tanaka, Y. & Wakimoto, K. (1983) Methods of Multivariate Statistical Analysis, 230-244. Gendaisugakusha Press, Kyoto, Japan (in Japanese).