Identifying Isoyield Environments for Field Pea Production


Rong-Cai Yang*, Stanford F. Blade, Jose Crossa, Daniel Stanton, and Manjula S. Bandara


Rong-Cai Yang and Daniel Stanton, Policy Secretariat, Alberta Agriculture, Food and Rural Development, Room 300, 7000 – 113 Street, , AB, T6H 5T6 and Dep. of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada T6G 2P5; Stanford F. Blade, Crop Diversification Centre North, Alberta Agriculture, Food and Rural Development, RR6, 17507 Fort Road, Edmonton, AB, Canada T5B 4K3; Jose Crossa, Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600 Mexico D.F., México; Manjula S. Bandara, Crop Diversification Centre South, S.S. #4, Alberta Agriculture, Food and Rural Development, Brooks, AB, Canada T1R 1E6. Received ______. *Corresponding author ([email protected])


Abbreviations: AFPRVT, Alberta Field Pea Regional Variety Test; CV, coefficient of variation; GEI, genotype × environment interaction; UPGMA, unweighted pair-group method using arithmetic averages.


ABSTRACT

Cultivars are often recommended to producers based on their average yields across sites within a geographic region. However, this geography-based approach gives little regard to the fact that not all sites in a given region have the same production capacity. The objective of this paper was to describe a performance-based approach to identifying groups of sites that have similar yielding ability but are not necessarily contiguous (i.e., ‘isoyield’ groups), and to apply it to the yield data from field pea (Pisum sativum L.) cultivar trials conducted across the Province of Alberta, Canada, from 1997 to 2001. Of the 34 sites tested over the five years, 11 were used in 1997, 20 each in 1998 and 2000, 22 in 1999, and 21 in 2001. The consecutive use of regression analysis and cluster analysis allowed test sites in individual years to be classified into different isoyield groups: six in 1997, 10 each in 1998, 2000, and 2001, and 12 in 1999. However, the most meaningful isoyield groups were those based on the data across the five years, obtained through a normalization procedure developed for averaging the unbalanced multi-year data. The use of such averages substantially lessens the impact of random year-to-year variation on the sites, resulting in only seven isoyield groups for the 34 test sites. The identification of isoyield environments (i) facilitates choosing appropriate cultivars for specific environments and (ii) provides a basis for scaling down the cultivar testing program in Alberta.


The evaluation of registered cultivars or advanced breeding lines at different sites and in different years is essential for selecting superior cultivars for local producers. Such evaluation usually requires a large number of test sites to cover a wide range of regional climatic and edaphic characteristics. However, it has been difficult to strike a balance between the need for reasonable coverage of regional agro-geoclimatic characteristics and the necessity of economizing on the number of test sites in the face of (i) shrinking resources and (ii) a growing demand for improving the quality of cultivar testing. The difficulty arises largely from inconsistent performance of genotypes in different environments, i.e., genotype × environment interaction (GEI). One widely used approach to lessening the impact of GEI is to stratify the data into homogeneous subsets of test sites through various clustering techniques (Horner and Frey, 1957; Abou-El-Fittouh et al., 1969; Ghaderi et al., 1980; Brown et al., 1983; Collaku et al., 2002). The key outcome of such data stratification is that GEI is minimized within the identified groups but maximized among them. While these studies have effectively reduced the magnitude of GEI within clustered groups, they have one or more of the following drawbacks. First, no consideration is given to the performance of a site or group. In reality, producers need to know whether a selected cultivar would perform well in a ‘good’ or ‘bad’ environment (Helm et al., 2002). Second, dendrograms from most cluster analyses show only the topology of relative similarities among sites, and the number of clusters is usually determined from them without objective criteria. Such criteria do exist, including those based on whether sites within a cluster have similar linear responses (Lin and Butler, 1990) and those based on whether crossover GEI within a cluster is negligible (Crossa and Cornelius, 1997; Russell et al., 2003), but they have not been widely used. Third, complications arising from the analysis of multi-year data (e.g., unbalanced data and inconsistency of GEI patterns across years) have generally been ignored. Thus, results of data stratification will be more useful when these issues are resolved.

With recent interest in crop diversification aimed at enhancing the long-term sustainability of agriculture in western Canada, non-traditional crops such as field pea have been increasingly incorporated into the farming systems of the Canadian Prairies. In the Province of Alberta, field pea is the most widely cultivated non-traditional crop, accounting for about 55% of the total acreage of these crops (Olson et al., 2001). As field pea production has expanded to all possible growing areas of the Province, demand for new cultivars with high and stable yields is increasing. Since 1987, Alberta Agriculture, Food and Rural Development (AAFRD) has coordinated the Alberta Field Pea Regional Variety Test (AFPRVT) Program, which conducts multi-year and multi-site testing to recommend cultivars to pea producers across the province. These multi-environment data are routinely averaged on a regional (geographic) basis over years (Park and Lopetinsky, 1999). Clearly, this geography-based criterion for cultivar selection does not address the three issues described above and, thus, may not be reliable for choosing appropriate cultivars according to site production levels.

In this study, we propose a performance-based approach to grouping test sites for cultivar recommendation. We coin the term ‘isoyield environments’ to describe sites that are homogeneous in their yielding ability but not necessarily contiguous in their geography. The concept of isoyield environments is very similar to that of ‘mega-environments’ (Gauch and Zobel, 1997), but with a focus on site performance in terms of yielding ability. We use this approach to examine patterns of isoyield groups for the field pea trials conducted from 1997 to 2001.


MATERIALS AND METHODS

Data Sets

Yield data used for this study were taken from the field pea cultivar trials conducted by AFPRVT collaborators from 1997 to 2001. The yield data prior to 1997 were available only as cultivar means over replications and, thus, were not included in the present study. A total of 34 sites were used for the trials over the five years: 11 sites in 1997, 20 in 1998, 22 in 1999, 20 in 2000, and 21 in 2001 (Table 1). Twenty-eight to 32 registered cultivars or advanced breeding lines from public or private breeding programs were included at all test sites in a given year. Except for the check cultivars, however, different cultivars were usually used in different years because of turnover to newly registered cultivars or unavailability of pedigreed seed of older cultivars. Two types of field pea cultivars, green and yellow, were grown in the same trials in 1997 and 1998, but in separate trials at the same test sites from 1999 to 2001. The test sites were distributed over four regions, delineated by their geographical and soil characteristics: 1. , 2. East , 3. West central Alberta and 4. Peace River Region (Fig. 1). The Southern Alberta region was further divided into irrigated and non-irrigated areas. The Peace River Region included some neighboring sites in the Province of British Columbia. All trials were conducted using a randomized complete block design with three or four replications. Details of trial layout and maintenance are given by Yang et al. (2004).

Statistical Analysis

Let $y_{ij}$ be the average yield of the ith (i = 1, 2, …, g) field pea cultivar over three or four replications at the jth (j = 1, 2, …, e) test site in a given year. We first conducted a baseline analysis that partitions the value of $y_{ij}$ into the effect of the ith cultivar ($\tau_i$), the effect of the jth test site ($\delta_j$), and the interaction between these two effects ($\gamma_{ij}$) under the classic two-way fixed-effects model,

$$y_{ij} = \mu + \tau_i + \delta_j + \gamma_{ij} + \varepsilon_{ij} \qquad [1]$$

where $\mu$ is the grand mean and the residual errors, $\varepsilon_{ij}$, are assumed to be normally and independently distributed with mean zero and variance $\sigma^2/n$ (where n is the number of replicates, which in this case is n = 3 or 4). The GEI effect ($\gamma_{ij}$) could be further studied by means of different statistical analyses, including stability analysis based on regression models (Finlay and Wilkinson, 1963) or linear-bilinear models (Zobel et al., 1988; Cornelius et al., 1992; Crossa and Cornelius, 1997) and likelihood analysis based on mixed models (Piepho, 1999; Yang, 2002).
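As an illustration only, the partition of a complete cultivar × site table of means under the two-way model can be sketched as follows. The function name and the small yield table are hypothetical and are not taken from the AFPRVT data. Because the table holds one mean per cell, the interaction term computed here is confounded with residual error unless replicate data are also available.

```python
def two_way_effects(y):
    """Partition a complete cultivar-by-site table of means into additive effects.

    y[i][j] is the mean yield of cultivar i at site j.
    Returns (grand mean, cultivar effects, site effects, interaction terms).
    """
    g, e = len(y), len(y[0])
    grand_mean = sum(sum(row) for row in y) / (g * e)
    # Cultivar effect: row mean minus grand mean.
    cultivar_eff = [sum(row) / e - grand_mean for row in y]
    # Site effect: column mean minus grand mean.
    site_eff = [sum(y[i][j] for i in range(g)) / g - grand_mean for j in range(e)]
    # Interaction: what remains after removing the additive part.
    interaction = [
        [y[i][j] - grand_mean - cultivar_eff[i] - site_eff[j] for j in range(e)]
        for i in range(g)
    ]
    return grand_mean, cultivar_eff, site_eff, interaction


# Hypothetical yields (Mg/ha): 3 cultivars (rows) by 2 sites (columns).
y = [[2.0, 3.0], [2.5, 4.5], [1.5, 4.5]]
grand_mean, cultivar_eff, site_eff, interaction = two_way_effects(y)
```

By construction the cultivar effects, the site effects, and every row and column of the interaction table sum to zero, which is a quick check on any implementation.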

For our subsequent cluster analysis, we chose the regression-based stability analysis for deriving dissimilarity indexes between pairs of sites, using a modification of method 1 of Lin and Butler (1990) in which the roles of cultivars and sites are swapped. The dissimilarity index between a pair of sites is the difference between two residual sums of squares: the one obtained after fitting a single regression on the cultivar index using the data from both sites, and the one obtained after fitting two separate regressions, one for each site. Adopting the approach of Finlay and Wilkinson (1963), we used the following regression model, examining the stability of the sites rather than the stability of the cultivars,

$$y_{ij} = \mu_j + b_j w_i + d_{ij} + \varepsilon_{ij} \qquad [2]$$

where $\mu_j = \mu + \delta_j$ is the mean of the jth site (the grand mean plus the site effect), $b_j$ is the coefficient of linear regression of $y_{ij}$ on the cultivar mean $w_i$, and $d_{ij}$ is the deviation from the linear regression (the unexplained portion of the interaction). We preferred this regression-based analysis for two reasons. First, the direct connection between the cluster analysis and the regression analysis enabled us to establish an empirical cutoff point for the dendrogram based on the F-test statistic (the ratio of the smallest dissimilarity index to the estimated error mean square), so that the number of isoyield groups could be identified objectively. Second, the estimated site means and slopes (the site × cultivar interaction) for individual sites were valuable in selecting appropriate test sites from the isoyield groups identified by the cluster analysis.
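The dissimilarity index described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `fit_line` and `dissimilarity` and all numbers are hypothetical, and any scaling by degrees of freedom used in Lin and Butler (1990) is left out.

```python
def fit_line(x, y):
    """Ordinary least squares of y on x; returns (intercept, slope, residual SS)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return intercept, slope, rss


def dissimilarity(w, y_a, y_b):
    """Residual SS of one common line minus the summed residual SS of two lines."""
    _, _, rss_common = fit_line(w + w, y_a + y_b)
    _, _, rss_a = fit_line(w, y_a)
    _, _, rss_b = fit_line(w, y_b)
    return rss_common - (rss_a + rss_b)


# Hypothetical cultivar index (cultivar means) and yields at three sites.
w = [1.0, 2.0, 3.0, 4.0]
site_a = [1.1, 1.9, 3.2, 3.8]
site_b = [1.0, 2.1, 3.1, 3.9]  # responds much like site_a
site_c = [3.0, 3.1, 2.9, 3.2]  # nearly flat response to the cultivar index

d_ab = dissimilarity(w, site_a, site_b)
d_ac = dissimilarity(w, site_a, site_c)
```

Because the common regression is a restricted version of the two separate regressions, the index is never negative, and it is zero when the two sites have identical responses.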

For the hierarchical cluster analysis and dendrogram construction, we computed the dissimilarity index between pairs of sites for each year using the regression model described in equation [2]. The dissimilarity indexes derived in this manner are the numerators of the F-test statistics for a common regression between any two sites. Extending this concept to more than two sites, as shown by Lin and Butler (1990), the dissimilarity index between any two clusters (each involving one or more sites) is the numerator of the F-test for similarity of the two clusters, provided that the sites are grouped according to Sokal and Michener’s (1958) unweighted pair-group method. With this clustering method, the dissimilarity index between a pair of clusters was calculated as the average of the dissimilarity indexes between all pairs of sites within and among the clusters.

These ‘between-cluster’ dissimilarity indexes were calculated by invoking the SPSS CLUSTER procedure with the METHOD subcommand set to WAVERAGE (SPSS Inc., 2002). However, they should not be confused with those given by the method of average linkage between clusters (groups), commonly known as the unweighted pair-group method using arithmetic averages (UPGMA). A UPGMA-based dissimilarity index is an average of the dissimilarity indexes between pairs of sites from different clusters only, as calculated by SAS PROC CLUSTER with the METHOD=AVERAGE option (SAS Institute, 1999) or by the SPSS CLUSTER procedure with the METHOD subcommand set to BAVERAGE (SPSS Inc., 2002). The denominator of the F-tests was the error mean square (MSE) left unaccounted for after fitting regressions for individual sites. Thus, an empirical cutoff point for the dendrogram constructed from the cluster analysis was established based on the F-test statistic (the ratio of the smallest dissimilarity index at each cycle of grouping to the estimated MSE). In other words, the cycle at which the calculated F-ratio exceeded its critical value was considered the appropriate cutoff point.
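The grouping cycle with an F-based cutoff can be sketched as below. This is a simplified, hypothetical illustration and not a reproduction of the SPSS procedure: the cluster index is the average over all site pairs within and between the two candidate clusters, the critical value `f_crit` is passed in as a single assumed number (its degrees of freedom are not derived here), and the dissimilarity matrix and MSE are invented.

```python
from itertools import combinations


def merged_index(d, cluster_a, cluster_b):
    """Average dissimilarity over all site pairs within and between two clusters."""
    members = cluster_a + cluster_b
    pairs = list(combinations(members, 2))
    return sum(d[i][j] for i, j in pairs) / len(pairs)


def isoyield_groups(d, mse, f_crit):
    """Merge clusters until the smallest index divided by MSE exceeds f_crit."""
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest merged index.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: merged_index(d, clusters[ab[0]], clusters[ab[1]]),
        )
        if merged_index(d, clusters[a], clusters[b]) / mse > f_crit:
            break  # cutoff reached: the remaining clusters are treated as distinct
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Hypothetical symmetric dissimilarity matrix for four sites.
d = [
    [0.0, 0.2, 5.0, 6.0],
    [0.2, 0.0, 5.5, 6.5],
    [5.0, 5.5, 0.0, 0.3],
    [6.0, 6.5, 0.3, 0.0],
]
groups = isoyield_groups(d, mse=0.1, f_crit=4.0)
```

In this toy case the first two merges have ratios of 2 and 3, which are below the assumed critical value, whereas joining the two resulting pairs would give a ratio of about 39, so the procedure stops with two groups.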

The across-year analysis had a number of difficulties, including highly unbalanced data in year × site × cultivar combinations and considerable differences in site × cultivar means across years. To overcome these difficulties, we normalized the yield data at each site in each year to create the following 10 ‘cultivar classes’: $(-\infty, -2s_{ij})$, $(-2s_{ij}, -s_{ij})$, $(-s_{ij}, -0.5s_{ij})$, $(-0.5s_{ij}, -0.2s_{ij})$, $(-0.2s_{ij}, 0)$, $(0, 0.2s_{ij})$, $(0.2s_{ij}, 0.5s_{ij})$, $(0.5s_{ij}, s_{ij})$, $(s_{ij}, 2s_{ij})$, and $(2s_{ij}, \infty)$, where $s_{ij}$ is the standard deviation for the ith year and jth site. The choice of 10 classes was a compromise between the need to have sufficient data points for the regression analysis and the need to have at least one observation in each class. While the boundary values set for each cultivar class were somewhat arbitrary, the classes would have expected frequencies of 0.0228, 0.1359, 0.1499, 0.1122, 0.0793, 0.0793, 0.1122, 0.1499, 0.1359, and 0.0228 if the data were normally distributed. The site × cultivar class means over years and cultivars were then calculated for the regression analysis. The site × site matrix of dissimilarity indexes derived from the regression analysis was used in the cluster analysis to generate the dendrogram, just as was done for individual years. The number of distinct isoyield groups from the dendrogram was determined from the F-tests described above.
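The class boundaries and their expected frequencies under normality can be checked with a short sketch. It assumes that yields are expressed as deviations from the site-year mean in units of the site-year standard deviation, and that a value falling exactly on a boundary goes to the upper class; both choices, and the function names, are illustrative assumptions.

```python
from bisect import bisect_right
from math import erf, sqrt

# Class boundaries in units of the site-year standard deviation.
BOUNDS = [-2.0, -1.0, -0.5, -0.2, 0.0, 0.2, 0.5, 1.0, 2.0]


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def expected_frequencies():
    """Probability of each of the 10 classes if the data are normally distributed."""
    cdf = [0.0] + [normal_cdf(b) for b in BOUNDS] + [1.0]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


def cultivar_class(yield_value, site_mean, site_sd):
    """Return the class (0 to 9) of one cultivar yield within a site-year trial."""
    z = (yield_value - site_mean) / site_sd
    return bisect_right(BOUNDS, z)


freqs = expected_frequencies()
```

Rounded to four decimals, `freqs` agrees with the expected frequencies listed in the text, and the ten values sum to one.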


RESULTS AND DISCUSSION

Site Performance

Average yields of individual sites were calculated for each year and across years. For illustration, we show the across-year average yields (Fig. 1). A considerable amount of among-site variation existed within each of the four geographic regions. For example, in region 2, the yields ranged from 0.757 Mg ha⁻¹ in Paradise Valley to 3.283 Mg ha⁻¹ in , with a regional average of 2.966 Mg ha⁻¹. Similar patterns of site variation were observed in the individual years (maps not presented), but the ranges were generally wider. For the same example in region 2, the average yields in 2001 ranged from 0.757 Mg ha⁻¹ in Paradise Valley to 3.666 Mg ha⁻¹ in Vegreville, with a regional average (over six sites) of 2.080 Mg ha⁻¹.

Table 2 presents the combined analyses of variance for individual years and for the averages across years (using the normalization procedure explained earlier). While not entirely comparable, site variation from the analysis based on the across-year averages was much smaller than that from any individual-year analysis. Likewise, the CV from the across-year analysis was the lowest compared with those from the individual-year analyses. The effects due to cultivars, sites, and their interaction in the across-year analysis were all significant. In the individual-year analyses, on the other hand, the cultivar and site effects were always significant, but the site × cultivar interaction was significant only from 1999 to 2001 and not in 1997 and 1998. In fact, the F-ratios of the mean squares for site × cultivar to those for pooled error were less than unity (F < 1) in 1997 and 1998, but were between 2.45 and 5.72 from 1999 to 2001.

The nonsignificant interaction in 1997 and 1998 was likely due to larger error variation, as the CV was 17.9% in 1997 and 14.3% in 1998, but 9% or less in the remaining three years. As reported by Yang et al. (2004), the trials from 1997 and 1998 had much larger block sizes (28 to 32 cultivars per block) than did those from 1999 to 2001 (12 to 22 cultivars per block). Green and yellow cultivars were included in the same trials in the first two years, but separated into different trials in the latter three years. Consequently, Yang et al. (2004) found that the averaged CV of raw data was greater in 1997-1998 (15.9-17.7%) than in 1999-2001 (7.7-9.1%). Further partitioning of the interaction sum of squares showed significant differences among linear site regression lines in all five years. Deviation from the regression lines was significant from 1999 to 2001, but not in 1997 and 1998.

Isoyield Groups

The cluster analysis and subsequent F-tests based on dissimilarity indexes calculated for pairs of sites or clusters of sites led to classification of test sites into different numbers of isoyield groups in individual years: six in 1997, 10 each in 1998, 2000, and 2001, 12 in 1999, and seven across the five years (Table 3). The dendrogram from the across-year analysis, with its cutoff point (the vertical dashed line), is shown in Fig. 2. The different superscripted letters in each column of Table 3 identify different isoyield groups; the regression lines of sites within an isoyield group are not significantly different from one another at the 0.05 probability level according to the F-tests. It was evident that the sizes of isoyield groups were smaller in individual years than across years.

The number of times that a site was paired with other sites within isoyield groups (its concurrences) was small in individual years, but relatively large in the across-year analysis (Table 4). This reflects the fact that the sizes of isoyield groups were larger across years than in individual years. The maximum possible number of concurrences for a site is n − 1, with n being the number of test sites in a given year (n = 11, 20, 22, 20, 21, and 34 in 1997, 1998, 1999, 2000, 2001, and across years, respectively). In this extreme case, all n sites would be clustered together to form one isoyield group.
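Counting concurrences from a set of group labels is straightforward; the sketch below is illustrative only, and the site names and group letters in it are hypothetical rather than taken from Table 3 or Table 4.

```python
from collections import Counter


def concurrences(group_of_site):
    """Count, for each site, how many other sites share its isoyield group."""
    group_sizes = Counter(group_of_site.values())
    return {site: group_sizes[group] - 1 for site, group in group_of_site.items()}


# Hypothetical group labels for five sites in one year.
labels = {"A": "a", "B": "a", "C": "b", "D": "a", "E": "c"}
counts = concurrences(labels)
```

A site alone in its group has zero concurrences, and a site in a single group containing all n sites reaches the maximum of n − 1.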

Little relationship existed between the geographic regions and isoyield groups in a given year, as an isoyield group could consist of sites from all four regions (Table 3, Fig. 2). In individual years, average yields of individual sites within isoyield groups were similar, as expected, but the corresponding estimates of the regression coefficient (b_j values) were not necessarily similar, indicating varying levels of site stability within the groups. From year to year, however, yield performance and site stability varied considerably, and there was little consistency of site pairing across years. For example, cultivar trials were carried out at Standard for four consecutive years (1997-2000), but trial average yields ranged from 1.531 Mg ha⁻¹ in 2000 to 7.464 Mg ha⁻¹ in 1999. The site was rated as stable in 1997 (b = 0.056), but unstable in 1998 (b = 1.968). The four years at this site were effectively four distinct environments: 1999 was the best year, with the best yield performance and average stability (b ≈ 1); 1997 was the second best year, with average yield performance but above-average stability (b ≈ 0); 1998 and 2000 were not ‘good’ environments, as they were either unstable (1998) or had low yield performance (2000). The site pairing within the isoyield groups involving Standard was quite inconsistent across the four years. Standard was paired with Vegreville and Fairview in 1997, with Namao and (irrigated) in 1998, with no other sites in 1999, and with Bow Island (dryland) in 2000.

Unpredictable year-to-year weather fluctuation, typical of the Canadian Prairies, may be the cause of yield variation and site instability across years. Thus, averaging across years and cultivar classes, as we did in the across-year analysis, would have filtered out much of the year-to-year variation, so that the resultant averaged yields would be close to the true site averages. This is consistent with the result from the across-year analysis, which showed only seven isoyield groups of 34 sites compared with 10 to 12 isoyield groups of 22 or fewer sites in individual years, except for 1997 (six isoyield groups of only 11 sites). For those sites with only one year of data (i.e., , , Paradise Valley, Manning and St. Isidore), the b values calculated from the individual year and across years were somewhat different because the cultivar index used as the independent variable in the regression analysis was calculated from yields of actual cultivars in the individual year, but from average yields of cultivar classes (derived from normalization) across years. Nevertheless, such differences were not appreciable in any of these cases, suggesting that the normalization procedure is probably adequate for combining the data across years.

A question naturally arises as to whether the fit of the linear relationship described above is good. Testing for significance of b values would usually be considered. However, it should be emphasized that the estimates of stability (b values) from the Finlay-Wilkinson regression analysis are data-based indexes for descriptive purposes, not for prediction. For a prediction model, the independent variable must be measured prior to the experiment, not derived after the experiment as in the Finlay-Wilkinson regression analysis (Lin and Binns, 1994). Thus, the goodness-of-fit of the linear regression is best judged by how much of the variation is accounted for by the model (Crossa, 1990; Lin and Butler, 1990). It has been suggested that a b value, regardless of its magnitude, is a useful indicator of response characteristics if the coefficient of determination (r²) is at least 50% (Lin and Butler, 1990). In our present study, the r² values were 50% or higher in 3 of 11 sites in 1997, 5 of 20 sites in 1998, 0 of 22 sites in 1999, 0 of 20 sites in 2000, and 9 of 21 sites in 2001. Clearly, the linear regression model was generally inadequate in individual years. In contrast, the r² values were 50% or higher in 25 of 34 sites when combining the data across years, suggesting that the proposed linear response adequately described the variation due to site × cultivar class interaction at most test sites.

Practical Implications

Our study has several important implications for current cultivar testing efforts with field pea and other crops in Alberta and elsewhere. First, under the current system, yield data from cultivar trials are summarized according to geographic regions delineated for each crop. Cultivars with the highest regional averages are recommended to local producers with little regard to the fact that not all sites in a region are capable of the same level of production (Fig. 1). This geography-based approach would fail to identify the cultivars that are best adapted to ‘good’ or ‘bad’ environments because of the masking effect of taking averages over high- and low-yielding sites and/or years (Helm et al., 2002). Earlier attempts have been made to amalgamate ‘similar’ environments through cluster analysis (e.g., Horner and Frey, 1957; Abou-El-Fittouh et al., 1969; Ghaderi et al., 1980; Brown et al., 1983; Collaku et al., 2002), but they give no ‘objective’ criterion for determining the number of groups within which sites are similar in yielding ability or other agronomic and production characteristics. The criteria developed by Crossa and Cornelius (1997) and Russell et al. (2003) are based primarily on whether crossover interactions are minimized among sites within a group, with little regard to site performance within the group. While such grouping certainly helps plant breeders to identify cultivars with wide adaptability, it is of limited value to producers whose objective is to find the best possible match of cultivars with the production levels of their farm fields.

Second, most studies on genotype × environment interactions have been limited to examining cultivar × site interactions from combined analysis of cultivar trials in a single year. The clustering of sites based on the data from individual years would be of practical significance only if the clustered groups were repeatable over years (Lin and Butler, 1994; Russell et al., 2003). However, our study (Table 3) and many other studies (Lin and Binns, 1994) have shown that there is little consistency of site grouping patterns across years, suggesting that the individual-year analysis is of limited value. Therefore, we strongly recommend the use of across-year analyses such as ours. In the past, it has been very difficult to conduct the combined analysis of multi-year data because (i) such data are often unbalanced, so that many statistical analyses developed for balanced data are not readily applicable, and (ii) a site effect in the multi-year data would have two confounded components if site × year interaction is ignored: a predictable part due to ‘fixed’ soil characteristics and photoperiod at a given site and an unpredictable part due to ‘random’ year-to-year weather fluctuations. Our proposed normalization procedure has allowed for creating cultivar classes and averaging unbalanced data across years, thereby effectively overcoming these two difficulties. As a result, we were able to reveal a more meaningful grouping of isoyield sites based on the data averaged across years. It should be noted that this ad hoc procedure for the across-year analysis differs somewhat from the commonly used pattern analysis (e.g., DeLacy and Cooper, 1990; Abdalla et al., 1996; Trethowan et al., 2001). In pattern analysis, proximities between pairs of sites, as measured by squared Euclidean distance, are calculated for each year and then averaged across years. The site × site matrix of averaged proximities is used for clustering and ordination of sites in the three-way table of year × site × cultivar. For our field pea data, such averaged distances across years were substantially greater than the ones in some years, apparently because of considerable year-to-year variation in the distances between a given pair of sites (results not presented). Consequently, with this elevation in the baseline distance between the sites, each individual site became a distinct isoyield group according to the F-test.

Third, in Alberta and elsewhere, there is a consistent request for improving the quality and efficiency of cultivar testing. In any case, it is imperative to provide some basis for identifying a few representative test sites. The number of isoyield groups identified in our study suggests a minimum number of sites that would be needed for future testing. For our field pea data, such numbers were 6 in 1997, 10 each in 1998, 2000, and 2001, 12 in 1999, and 7 for the across-year data. However, because the ‘true’ site effect in individual years was confounded with random year-to-year variation and because grouping patterns varied from year to year, the number determined from averaged site effects based on the across-year analysis (seven sites) is probably more reflective of true differences among sites. To help determine which site should be selected from each isoyield group, we found it useful to examine the stability statistics (the b values). Appealing to the interpretation by Finlay and Wilkinson (1963) for cultivar stability, we offer the following considerations when selecting test sites from isoyield groups: (1) a site with a b value close to unity would have average stability, but it would be considered a ‘good’ site if it appears in a high-yielding isoyield group and a ‘bad’ site if it appears in a low-yielding group; (2) a site with a b value above unity would have below-average stability, but it would be a ‘good’ site for high-yielding cultivars and a ‘bad’ site for average- and low-yielding cultivars; and (3) a site with a b value below unity would have above-average stability, but it would be a ‘good’ site for low-yielding cultivars and a ‘bad’ site for high-yielding cultivars.


CONCLUSIONS

Through the consecutive use of Finlay and Wilkinson’s regression analysis and cluster analysis (Lin and Butler, 1990), we have been able to classify test sites in individual years and across years into different isoyield groups for field pea. It is also evident that the most meaningful isoyield groups are those based on the data averaged across years. The use of such averages substantially lessens the impact of random year-to-year variation. The procedure is currently being used to analyze the data from cultivar trials of other major crops tested in Alberta and other parts of the Canadian Prairies. However, a critical question remains about the factors that led to the formation of these isoyield groups. To address this question, we are currently investigating the relationships of yield performance with other agronomic traits and with climate and soil variation at different sites. Our ultimate goal is to develop isoyield maps for identifying the best match of cultivars with their ‘favored’ environments and climates.


ACKNOWLEDGEMENTS

We thank Dr. Terrance Ye for his assistance with data analysis and Dr. James Helm for helpful discussion during the course of this work. This research has been supported in part by AAFRD’s Industry Development Sector New Initiative Fund, the Crop Diversification Division and the Natural Sciences and Engineering Research Council of Canada grant OGP0183983.


REFERENCES

2 Abdalla, O.S., J. Crossa, E. Autrique, and I.H. DeLacy. 1996. Relationships among international

3 testing sites of spring durum wheat. Crop Sci. 36:33–40.

4 Abou-El-Fittouh, H.A., J.O. Rawlings, and P.A. Miller. 1969. Classification of environments to

5 control genotype by environment interaction with an application to cotton. Crop Sci.

6 9:135–140.

7 Brown, K.D., M.E. Sorrells, and W.R. Coffman. 1983. A method for classification and

8 evaluation of testing environments. Crop Sci. 23:889-893.

9 Collaku, A., S. A. Harrison, P. L. Finney, and D. A. Van Sanford. 2002. Clustering of

10 environments of southern soft red winter wheat region for milling and baking quality

11 attributes. Crop Sci. 42: 58-63.

12 Cornelius, P.L., M.S. Seyedsadr, and J. Crossa. 1992. Using the shifted multiplicative model to

13 search for "separability" in crop cultivar trials. Theor. Appl. Genet. 84:161–172.

14 Crossa, J. 1990. Statistical analysis of multilocation trials. Adv. Agron. 44: 55-85.

15 Crossa, J., and P.L. Cornelius. 1997. Sites regression and shifted multiplicative model clustering

16 of cultivar trial sites under heterogeneity of error variances. Crop Sci. 37:406–415.

17 DeLacy, I.H., and M. Cooper. 1990. Pattern analysis for the analysis of regional variety trials. p.

18 301–334. In M.S. Kang (ed.) Genotype-by-environment interaction in plant breeding.

19 Louisiana State Univ., Baton Rouge, LA.

20 Finlay, K.W., and G.N. Wilkinson. 1963. The analysis of adaptation in a plant breeding

21 programme. Aust. J. Agric. Res. 14:742–754.

Gauch, H.G., and R.W. Zobel. 1997. Identifying mega-environments and targeting genotypes. Crop Sci. 37:311–326.

Ghaderi, A., E.H. Everson, and C.E. Cress. 1980. Classification of environments and genotypes in wheat. Crop Sci. 20:707–710.

Helm, J., P. Juskiw, and T. Duggan. 2002. A new look at location yield data. Presentation at the North American Barley Researchers Workshop, Fargo, ND, September 22–25, 2002. [online], available at http://www1.agric.gov.ab.ca/$department/deptdocs.nsf/all/fcd5590.

Horner, T.W., and K.J. Frey. 1957. Methods for determining natural areas for oat varieties based upon known environmental variables. Agron. J. 52:396–399.

Lin, C.S., and M.R. Binns. 1994. Concepts and methods for analyzing regional trial data for cultivar and location selection. Plant Breed. Rev. 12:271–297.

Lin, C.S., and G. Butler. 1990. Cluster analysis for analyzing two-way classification data. Agron. J. 82:344–348.

Olson, M.A., R.-C. Yang, and S.F. Blade. 2001. Nutrient concentrations and nutritive values of field pea (Pisum sativum L.) straw in south central Alberta. Can. J. Plant Sci. 81:419–423.

Park, B., and K. Lopetinsky. 1999. Pulse crops in Alberta. Alberta Agriculture, Food, and Rural Development, Edmonton, AB, Canada.

Piepho, H.-P. 1999. Stability analysis using the SAS system. Agron. J. 91:154–160.

Russell, W.K., K.M. Eskridge, D.A. Travnicek, and F.R. Guillen-Portal. 2003. Clustering environments to minimize change in rank of cultivars. Crop Sci. 43:858–864.

SAS Institute. 1999. SAS/STAT User’s Guide. Version 8.0. SAS Inst., Cary, NC.

Sokal, R.R., and C.D. Michener. 1958. A statistical method for evaluating systematic relationships. Univ. of Kansas Sci. Bull. 38:1409–1438.

SPSS Inc. 2002. SPSS® 11.5 Syntax Reference Guide. Chicago, IL.

Trethowan, R.M., J. Crossa, M. van Ginkel, and S. Rajaram. 2001. Relationships among bread wheat international yield testing locations in dry areas. Crop Sci. 41:1461–1469.

Yang, R.-C. 2002. Likelihood-based analysis of genotype-environment interactions. Crop Sci. 42:1434–1440.

Yang, R.-C., T.Z. Ye, S.F. Blade, and M. Bandara. 2004. Efficiency of spatial analyses of field pea variety trials. Crop Sci. 44:49–55.

Zobel, R.W., M.J. Wright, and H.G. Gauch. 1988. Statistical analysis of a yield trial. Agron. J. 80:388–393.

LIST OF FIGURES

Fig. 1. Geographic regions and test sites for field pea variety trials in Alberta during 1997 to 2001. The numbers in italics are average yields of sites in Mg ha⁻¹.

Fig. 2. Dendrogram for clustering 34 sites used for testing field pea varieties in 1997 through 2001. The numbers in parentheses after site names identify different geographic regions (cf. Fig. 1): 1d, Southern Alberta (dryland); 1i, Southern Alberta (irrigated); 2, East Central Alberta; 3, West Central Alberta; and 4, Peace River Region. The dashed line is the “cutoff” point for identifying isoyield groups. The regression lines of sites within an isoyield group do not differ significantly at the 0.05 probability level.

Table 1. Year, number of cultivars and sites, and mean, range and standard deviation (STD) of grain yield (Mg ha⁻¹) calculated over varieties, sites and cultivar × site two-way tables for field pea variety trials tested in 1997 to 2001.

                                      Cultivar            Site                Cultivar × Site
           No.        No.
Year       varieties  sites  Mean     Range        STD    Range        STD    Range         STD
                             ------------------------------ Mg ha⁻¹ ------------------------------
1997       28         11     4.770    3.740-5.379  0.348  2.460-9.057  2.218  1.467-11.903  2.218
1998       28         20     3.351    2.727-3.678  0.238  0.496-6.418  1.743  0.220-7.431   1.759
1999       29         22     4.218    3.560-4.683  0.287  1.176-7.958  2.025  0.559-8.808   2.092
2000       21         20     3.169    2.547-3.462  0.190  0.333-5.617  1.600  0.155-6.905   1.632
2001       33         21     3.110    1.284-3.575  0.398  0.757-5.338  1.309  0.100-6.632   1.434
All years  84         34     3.574†   1.284-5.379  0.720  0.757-6.282  1.245  0.100-10.475  1.742

† The mean of all cultivar × site combinations, which was slightly different from the marginal means for cultivars (3.724) and sites (3.334). As in individual years, these three means would be identical if the cultivar × site two-way table across years were balanced (i.e., no missing cells).
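The footnote's point, that the cell mean and the two marginal means coincide only when the two-way table is balanced, can be illustrated with a minimal sketch. The yields below are hypothetical toy values, not trial data, and the names `three_means`, `unbalanced`, and `balanced` are introduced here purely for illustration.

```python
# Toy cultivar x site yield tables (hypothetical values, Mg/ha); None = missing cell.
unbalanced = {
    "cv_A": {"site_1": 4.0, "site_2": 2.0},
    "cv_B": {"site_1": 5.0, "site_2": None},  # cv_B was not tested at site_2
}
balanced = {
    "cv_A": {"site_1": 4.0, "site_2": 2.0},
    "cv_B": {"site_1": 5.0, "site_2": 3.0},
}

def mean(values):
    """Average of the non-missing values."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)

def three_means(table):
    """Return (mean of all cells, mean of cultivar means, mean of site means)."""
    sites = sorted({s for row in table.values() for s in row})
    cell = mean(v for row in table.values() for v in row.values())
    cultivar = mean(mean(row.values()) for row in table.values())
    site = mean(mean(table[c].get(s) for c in table) for s in sites)
    return cell, cultivar, site

print([round(m, 3) for m in three_means(unbalanced)])  # [3.667, 4.0, 3.25]
print([round(m, 3) for m in three_means(balanced)])    # [3.5, 3.5, 3.5]
```

With one missing cell the three means diverge, in the same way that 3.574, 3.724, and 3.334 differ in the across-year row; filling the cell makes them identical.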

Table 2. Combined analyses of variance of grain yield (Mg ha⁻¹) for regional field pea trials tested in 1997 to 2001.

                              1997              1998              1999              2000              2001              Across-Year Ave.‡
Source of variation           df    SS†         df    SS          df    SS          df    SS          df    SS          df    SS
Sites (S)                     10    1377.4**    19    1616.6**    21    2501.1**    19    1020.9**    20    1131.7**    33    464.3**
Varieties (V)                 27    36.3**      27    30.9**      28    50**        20    14.6**      32    106.8**     9     125.6**
S × V                         270   98.4        513   82.8        588   235.7**     380   78.9**      640   184.4**     278   151.1**
Partitioning of S × V
  Site regression             10    21.1**      19    11.4**      21    22.5**      19    8.9**       20    41.4**      9     16.8**
  Deviation from regression   260   77.3        494   71.3        567   213.2**     361   69.9        620   143.1**     269   134.2**
Pooled error                  864   641.4       1512  333         1736  242.8       1100  93.4        1920  96.7        829   31.2
CV (%)                              17.9              14.3              8.8               9.1               7.2               6.0

** Significant at 0.01 probability level.
† SS, sum of squares.
‡ The analysis was based on the 34 sites and 10 variety classes across years created through the normalization procedure as explained in the text.
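As a quick arithmetic check on the partitioning shown in Table 2, the sketch below verifies that the degrees of freedom and sums of squares for site regression and deviation from regression add back to the S × V line. The numbers are copied from the table; the tolerance of 0.1 is an assumption reflecting the one-decimal rounding of the reported sums of squares, and `partition_gaps` is a name introduced here for illustration.

```python
# (df, SS) copied from Table 2: S x V, site regression, deviation from regression.
table2 = {
    "1997": ((270, 98.4), (10, 21.1), (260, 77.3)),
    "1998": ((513, 82.8), (19, 11.4), (494, 71.3)),
    "1999": ((588, 235.7), (21, 22.5), (567, 213.2)),
    "2000": ((380, 78.9), (19, 8.9), (361, 69.9)),
    "2001": ((640, 184.4), (20, 41.4), (620, 143.1)),
    "Across-year": ((278, 151.1), (9, 16.8), (269, 134.2)),
}

def partition_gaps(rows):
    """Return {label: (df gap, SS gap)}, i.e. S x V minus (regression + deviation)."""
    gaps = {}
    for label, ((df_sv, ss_sv), (df_reg, ss_reg), (df_dev, ss_dev)) in rows.items():
        df_gap = df_sv - df_reg - df_dev
        ss_gap = round(ss_sv - ss_reg - ss_dev, 1)  # reported SS carry one decimal
        gaps[label] = (df_gap, ss_gap)
    return gaps

gaps = partition_gaps(table2)
for label, (df_gap, ss_gap) in gaps.items():
    print(label, df_gap, ss_gap)
```

Every degrees-of-freedom gap is zero and every sum-of-squares gap is within rounding error, so the two components account for the whole interaction term in each analysis.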

Table 3. Mean grain yield (Mg ha⁻¹) and estimated regression coefficient (b) for regional field pea trials tested at 34 sites across Alberta over 1997 to 2001. Different letters after the yields in each column identify different isoyield groups; the regression lines of sites within an isoyield group do not differ significantly at the 0.05 probability level.

                      1997            1998            1999            2000            2001            Across-Year Ave.
Site                  Yield   b       Yield   b       Yield   b       Yield   b       Yield   b       Yield   b
Southern Alberta (dryland)
Acadia Valley         -       -       1.903a  0.517   -       -       -       -       -       -       1.903a  0.494
Barons                -       -       -       -       -       -       -       -       2.791a  0.552   2.791b  1.019
Bow Island            -       -       -       -       -       -       1.629a  0.152   3.323b  0.741   2.664b  0.475
Brooks                -       -       1.366c  0.703   6.831a  1.814   -       -       -       -       4.146d  0.457
Carmangay             -       -       -       -       -       -       1.077b  0.137   -       -       1.077c  0.202
Oyen                  -       -       -       -       4.577d  0.821   1.177b  0.024   -       -       3.149b  0.283
Standard              3.149a  0.056   5.115f  1.968   7.464b  0.923   1.531a  0.835   -       -       4.528e  2.017
Three Hills           -       -       -       -       -       -       -       -       3.824c  0.536   3.824d  1.499
Southern Alberta (irrigated)
Bow Island            5.305d  1.497   4.797f  1.968   7.958c  1.356   -       -       -       -       6.043f  1.324
Brooks                7.272f  1.111   2.261a  0.271   6.114i  2.996   5.517i  0.441   4.030e  0.624   4.986e  1.654
East Central Alberta
Fort Kent             2.699b  0.211   2.876b  0.748   -       -       -       -       -       -       2.788b  0.684
Irricana              -       -       6.214j  1.965   3.769e  1.228   2.786d  0.491   -       -       4.382e  0.983
Killam                -       -       2.821b  0.823   3.884e  0.994   -       -       1.486i  0.176   2.675b  0.250
Ohaton                -       -       2.862b  0.181   2.419j  0.402   2.270e  1.127   1.591i  0.584   2.256b  0.943
Paradise Valley       -       -       -       -       -       -       -       -       0.757j  0.091   0.757c  0.431
Provost               -       -       -       -       6.826a  0.409   0.333c  -0.107  1.791i  0.405   3.181d  2.029
St. Paul              -       -       -       -       1.716k  -0.017  2.865d  2.15    3.188b  0.425   2.592b  1.156
Vegreville            3.492a  0.508   2.683b  1.103   2.553j  1.69    3.848f  0.066   3.666c  0.65    3.228b  1.153
Vermilion             -       -       0.865d  0.404   5.582g  1.54    4.356g  1.509   -       -       3.559d  0.747
West Central Alberta
Lacombe               -       -       -       -       -       -       -       -       2.732a  1.185   2.732b  1.457
Namao                 9.057e  2.695   4.946f  2.157   5.730g  1.085   5.617i  1.71    5.338g  1.538   6.132g  2.667
Neapolis              -       -       6.418j  1.958   4.060e  1.174   2.795d  0.438   1.512i  0.276   3.658d  0.788
Penhold               -       -       -       -       -       -       2.775d  0.664   -       -       2.775b  0.606
Westlock              -       -       4.023i  0.784   5.134h  1.569   3.953f  0.863   -       -       4.417e  0.958
Peace River Region
Beaverlodge           -       -       3.139b  1.136   1.176l  0.317   -       -       4.411d  1.036   2.973b  1.209
Dawson Creek          -       -       -       -       3.545f  0.985   -       -       4.366f  2.502   3.982d  2.071
Fairview              3.143a  1.197   0.496e  0.405   -       -       4.162g  1.67    3.257b  1.254   2.698b  0.419
Falher                -       -       -       -       1.426l  0.449   5.277j  2.867   -       -       3.044b  1.013
Fort St. John         7.442f  0.415   5.381h  0.872   4.259e  1.085   4.631h  1.807   5.272g  1.496   5.423f  1.108
Fort Vermilion        4.220c  1.521   1.996a  0.711   3.422f  0.61    -       -       2.271h  0.798   2.951b  0.989
Grande Prairie        4.234c  1.339   2.170a  0.65    1.774k  0.182   1.942e  0.325   -       -       2.562b  1.785
High Prairie          2.460b  0.268   4.690g  1.009   2.588j  0.513   4.830h  1.543   4.326d  1.474   3.737d  0.756
Manning               -       -       -       -       -       -       -       -       1.662i  0.689   1.662a  0.866
St. Isidore           -       -       -       -       -       -       -       -       3.710c  1.007   3.710d  1.112

Table 4. Concurrences of individual sites with other members of the same isoyield groups from dendrograms generated for 34 sites used for regional field pea trials across Alberta in 1997 to 2001.

Site                                  1997  1998  1999  2000  2001  TOC†  AOC‡
Southern Alberta (dryland)
Acadia Valley                         -     3     -     -     -     3     1
Barons                                -     -     -     -     1     1     14
Bow Island                            -     -     -     1     2     3     14
Brooks                                -     0     1     -     -     1     7
Carmangay                             -     -     -     1     -     1     1
Oyen                                  -     -     0     1     -     1     14
Standard                              2     2     0     1     -     5     3
Three Hills                           -     -     -     -     2     2     7
Southern Alberta (irrigated)
Bow Island                            0     2     0     -     -     2     1
Brooks                                1     3     0     1     0     5     3
East Central Alberta
Fort Kent                             1     4     -     -     -     5     14
Irricana                              -     1     3     3     -     7     3
Killam                                -     4     3     -     4     11    14
Ohaton                                -     4     2     1     4     11    14
Provost                               -     -     1     0     4     5     7
St. Paul                              -     -     1     3     2     6     14
Vegreville                            2     4     2     1     2     11    14
Vermilion                             -     0     1     1     -     2     7
West Central Alberta
Lacombe                               -     -     -     -     1     1     14
Namao                                 0     2     1     1     1     5     0
Neapolis                              -     1     3     3     4     11    7
Paradise Valley                       -     -     -     -     0     0     1
Penhold                               -     -     -     3     -     3     14
Westlock                              -     0     0     1     -     1     3
Peace River Region
Beaverlodge                           -     4     -     1     1     6     14
Dawson Creek                          -     -     -     1     0     1     7
Falher                                -     -     0     1     -     1     14
Fairview                              2     0     1     -     2     5     14
Fort St. John                         1     0     1     3     1     6     1
Fort Vermilion                        1     3     -     1     0     5     14
Grande Prairie                        1     3     1     1     -     6     14
High Prairie                          1     0     1     2     1     5     7
Manning                               -     -     -     -     4     4     1
St. Isidore                           -     -     -     -     2     2     7
Max. possible concurrences per site   10    19    21    19    20          33
Total concurrences across all sites   12    40    28    26    38    144   284

† Total observed concurrences from all years.
‡ Observed concurrences from the across-year analysis.
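The concurrence counts in Table 4 can be reproduced from the isoyield group letters in Table 3. The sketch below does this for 1997, assuming that a site's concurrence count is the number of other sites sharing its group letter in that year. The group letters are read from the 1997 column of Table 3; the labels "Grande Prairie" and "High Prairie" are assumptions made by matching across-year means in Table 3 with the site labels in Fig. 1, and `concurrences` is a name introduced here for illustration.

```python
from collections import Counter

# 1997 isoyield group letters as read from Table 3 (11 sites tested that year).
groups_1997 = {
    "Standard": "a", "Bow Island (irr)": "d", "Brooks (irr)": "f",
    "Fort Kent": "b", "Vegreville": "a", "Namao": "e", "Fairview": "a",
    "Fort St. John": "f", "Fort Vermilion": "c",
    "Grande Prairie": "c", "High Prairie": "b",  # row labels assumed (see above)
}

def concurrences(groups):
    """For each site, count the other sites carrying the same group letter."""
    sizes = Counter(groups.values())
    return {site: sizes[letter] - 1 for site, letter in groups.items()}

counts = concurrences(groups_1997)
max_possible = len(groups_1997) - 1  # a site can at most concur with all others

print(counts["Standard"], sum(counts.values()), max_possible)  # 2 12 10
```

Under this reading the 1997 column of Table 4 is recovered exactly: Standard has 2 concurrences, the total across sites is 12, and the maximum possible per site is 10.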

[Fig. 1. Map of Alberta (49° to 60° N, 110° to 120° W) showing the four geographic regions and the test sites, each labelled with its across-year mean yield. Region and mean yield (Mg ha⁻¹): 1. South (Dryland, 3.010; Irrigated, 5.515); 2. East-Central, 2.824; 3. West-Central, 3.943; 4. Peace River Region, 3.274.]

[Fig. 2. Dendrogram of the 34 sites. Leaf order (region code in parentheses): Acadia Valley(1d), Manning(4), Barons(1d), St. Paul(2), Fort Vermilion(4), Fort Kent(2), Penhold(3), Oyen(1d), Beaverlodge(4), Falher(4), Vegreville(2), Grande Prairie(4), Lacombe(3), Bow Island(1d), Killam(2), Fairview(4), Ohaton(2), Carmangay(1d), Paradise Valley(2), Brooks(1d), High Prairie(4), Neapolis(3), Vermilion(2), St. Isidore(4), Three Hills(1d), Dawson Creek(4), Provost(2), Brooks(1i), Standard(1d), Irricana(2), Westlock(3), Bow Island(1i), Fort St. John(4), Namao(3).]