Blackwell Science, Ltd, Oxford, UK. Biological Journal of the Linnean Society, ISSN 0024-4066. The Linnean Society of London, 2006. Original Article

Biological Journal of the Linnean Society, 2006, 87, 155–166. With 3 figures

Population structure and systematics of Opsariichthys bidens (Osteichthyes: Cyprinidae) in south-east China using a new nuclear marker: the introns (EPIC-PCR)

PATRICK BERREBI*1, XAVIER RETIF1, FANG FANG2 and CHUN-GUANG ZHANG3

1 Laboratoire Ecosystèmes Lagunaires, UMR 5119, cc093, University Montpellier 2, place E. Bataillon, 34095 Montpellier cedex 05, France
2 Department of Vertebrate Zoology, Swedish Museum of Natural History, POB 50007, SE 104 05 Stockholm, Sweden
3 Research Center of Evolution and Systematics, Institute of Zoology, Chinese Academy of Sciences, 25 Beisihuanxi Road, Haidian, Beijing, 100080, China

Received 28 February 2004; accepted for publication 1 February 2005

Chinese fish farming is the oldest aquaculture in the world. The present pressure on the wild ichthyofauna and its diversity is threatening aquaculture because potential genitors are often caught in the wild. One of the possible responses to this threat is to provide new natural fish taxa for aquaculture. The objective of this study was to analyse the genetic structure of populations of O. bidens and to describe its subdivisions, if any, using nuclear markers, to serve as a guideline for stock selection and management in the potential aquaculture of this species. In 2002 and 2003, two collecting trips were made, one in the middle Chang Jiang basin in Hunan Province, and another in the Xi Jiang basin in Guangxi Province, China. Length polymorphisms of the intron amplification (EPIC-PCR) were analysed on 24 systems, only five of which gave easily interpretable and polymorphic patterns, corresponding to 11 presumptive loci. According to the multidimensional statistics, the genetic analysis demonstrated the existence of four clearly independent geographical taxa included in O. bidens, characterized by unambiguous diagnostic intron loci. The distribution of these taxa confirms a similarity between both catchment populations: the middle Chang Jiang and the Xi Jiang samples of the Zhu Jiang basin. An additional output of the study was the choice of the population of each group to be first tested in Chinese fish farms. © 2006 The Linnean Society of London, Biological Journal of the Linnean Society, 2006, 87, 155–166.

ADDITIONAL KEYWORDS: China – fish farming – introns – Opsariichthys – phylogeography.

*Corresponding author. E-mail: [email protected]

INTRODUCTION

The Chang Jiang (Yangtze River), the longest river in China and the third largest in the world, harbours 361 fish species, of which 177 are endemic (Fu et al., 2003). The basin is usually divided into three ecological parts (upper, middle and lower basins), or into five (headwaters, upper, middle, lower and mouth of the river), but these divisions are not supported by the fish faunal compositions (Fu et al., 2003).

The ichthyofauna is threatened by hydrological alterations, mainly the Gezhou Ba Dam and the Three Gorges Dam, especially in the upper and middle reaches. Reconstruction of connectivity seems to be the most urgent necessary improvement, to maintain the lakes of the middle Chang Jiang in contact with the river network, especially for spawning migrations (Fu et al., 2003).

Xi Jiang, one of the major tributaries of Zhu Jiang (Pearl River), drains mainly through Guangxi Province, and is the longest river in southern China. Very little is known about the ichthyofauna of Xi Jiang River, except for a book on fishes from Guangxi Province, which recorded some 133 freshwater fish species, of which about 33 are endemic (Anonymous, 1981).

China has been exposed to pollution, overfishing and all kinds of environmental destruction for decades, and the overall result has been a drastic decrease both in number of fish species and in the size of populations (Li, 2001).

Chinese fish farming is the oldest aquaculture in the world. Traditional methods are applied to a few species, mostly cyprinids. The present pressure on the wild ichthyofauna and its diversity also threatens aquaculture because potential genitors are often captured in the wild. One of the possible reactions to this threat is to provide new natural fish taxa for aquaculture.

In a recent collaboration between Europe and China, new wild species were analysed with a view to proposing some of them for breeding in Chinese fish farms. The genus Opsariichthys and especially O. bidens were targeted. Information about the genetic structure of O. bidens populations is scarce. Phylogeographical data are important for evaluating the suitability of the species for fish farming, and for detecting possible conservation needs at the population level.

O. bidens is one of the most widespread cyprinid species of eastern Asia. The genus Opsariichthys is considered to be most closely related to Opsaridium and Raiamas, but also to Zacco or Candidia. However, Howes (1980) demonstrated that Opsariichthys and Zacco are more distantly related than was expected. Both genera are well established and morphologically distinct. Z. platypus was chosen as the outgroup.

In 2002 and 2003, two collecting trips were made, one in the middle Chang Jiang basin in Hunan Province, and the other in the Xi Jiang basin in Guangxi Province, and large series of O. bidens were obtained (see Table 1). The objective of this study was to analyse the genetic structure of populations of O. bidens and to describe its subdivisions, if any, using nuclear markers, to serve as a guideline for stock selection and management for the potential aquaculture of the species. The aim was to obtain population data such as genetic diversity and level of panmixia.

Because keeping samples frozen in the field was nearly impossible in southern China (where the infrastructure to provide liquid nitrogen was lacking), allozyme analysis was abandoned and the exon-primed intron-crossing (EPIC-PCR) technique was used instead. Introns can be considered as the new universal nuclear markers, applicable to all species without any prerequisite genetic knowledge.

MATERIAL AND METHODS

SAMPLING

Sampling was conducted in two regions, the middle Chang Jiang (Yangtze) basin in Hunan Province in March–April 2002 (sites A6–A32) and in the Xi Jiang (Pearl River) in Guangxi Province in March 2003 (sites B04–B54). Figure 1 and Table 1 provide geographical and biological data on these samples. A total

[Figure 1: map spanning about 100°E–120°E and 20°N–30°N, showing the Chang Jiang and Xi Jiang drainages and the sample sites A06, A11, A14, A16, A29, A32, B04, B09, B10, B18, B22, B34, B46 and B54.]

Figure 1. Map of south-eastern China showing location of the sample sites. See Table 1 for site names and descriptions.


of 198 specimens of O. bidens and nine specimens of Z. platypus (outgroup) were collected from 14 localities. Table 1 indicates the method of collecting samples. Samples collected from natural waters by our field team were considered to represent part of a natural population; samples bought at the fish markets were probably derived from a single natural population but the exact origin remains unknown, and the possibility cannot be excluded that occasional market samples could have been composites with specimens captured from different localities.

The fish were first observed in the field for rapid identification, using mainly the mouth characters to separate Opsariichthys and Zacco species (Opsariichthys has much longer premaxilla and maxilla bones, with a deep notch on the premaxilla, which meets a strong extension from the dentary. Zacco has a smaller mouth, shorter premaxilla and maxilla, with a smooth lateral edge both on the premaxilla and on the dentary). Identifications were confirmed at the Swedish Museum of Natural History (both genera are well separated by many osteological characters, with several synapomorphies supporting the monophyly of each genus, according to Howes, 1980), except for the 24 specimens of O. bidens from site EC-2003–010 (= B10). With the exception of the nine outgroup specimens of Z. platypus, all individuals analysed were identified as O. bidens.

Table 1. Sampling details of Opsariichthys bidens

River/Market          Subbasin       Basin        Locality      Station number
Gantan He             Yuan Jiang     Chang Jiang  EC-2002–006   A06
Qiangongba            Yuan Jiang     Chang Jiang  EC-2002–011   A11
Siguitan He           Yuan Jiang     Chang Jiang  EC-2002–014   A14
Die Shui              Li Shui        Chang Jiang  EC-2002–016   A16
Dongkou market        Zi Shui        Chang Jiang  EC-2002–029   A29
Shaoyang market       Zi Shui        Chang Jiang  EC-2002–032   A32
Sanjie Zhen market    Li Jiang       Xi Jiang     EC-2003–004   B04
Beijiang market       Liu Jiang      Xi Jiang     EC-2003–009   B09
Luoqing Jiang         Liu Jiang      Xi Jiang     EC-2003–010   B10
Anyang market         Hongshui       Xi Jiang     EC-2003–018   B18
You Jiang tributary   You Jiang      Xi Jiang     EC-2003–022   B22
Ming Jiang tributary  Ming Jiang     Xi Jiang     EC-2003–034   B34
Rongxian market       Beiliu Jiang   Xi Jiang     EC-2003–046   B46
Zhongshan market      Fuchuan Jiang  Xi Jiang     EC-2003–054   B54

Nine more specimens of the outgroup species Zacco platypus were obtained from locality EC-2003–022 (= B22).
[Table 1 also has columns for coordinates, sample size, collection date and sampling gear (seine, cast net, or – for market samples); their row alignment is not legible in this copy.]

MOLECULAR ANALYSES

DNA was extracted from fin tissue using the phenol : chloroform method. PCR reactions were carried out in an Eppendorf Mastercycler. The 10-µL reaction mixture consisted of 1 µL 10× reaction buffer (Promega), 2.5 mM MgCl2, 0.2 mM each dNTP (Invitrogen), 0.5 µM each primer (MWG-Biotech AG, labelled with CY5 or fluorescein), 0.3 U Taq polymerase (Sigma) and 1 µL DNA template. A first denaturation step at 94 °C for 3 min was followed by 35 cycles of denaturation for 1 min, annealing at an appropriate temperature (see Table 2) for 1 min and extension at 72 °C for 1 min 20 s. A final 10-min extension at 72 °C was carried out.

Only length polymorphisms of intron amplification were analysed. All necessary details concerning the informative intron systems are given in Table 2. In all, 24 systems were tested but only five gave easily interpretable and polymorphic patterns. Only polymorphic loci were scored (Table 3), which means that the heterozygosity parameters must be interpreted with caution; results do not represent the estimated polymorphism of the species but rather the relative polymorphism among samples. We use the term 'system' here because each pair of primers amplified several loci due to sequence duplications (e.g. tandem duplications, pseudogenes; see Hassan et al. (2002) and Atarhouch et al. (2003)). To be sure that only target loci were amplified, we first tested each system by a thermal gradient PCR in an Eppendorf Mastercycler gradient to determine the maximum annealing temperature that allowed amplification. When several loci were amplified, we were sure that the primer homologous sequences were the same or very similar to the universal primers' complements.

One µL of PCR mixture from each individual was loaded onto an 8% denaturing polyacrylamide gel (Biorad). The PCR products were visualized with a FMBIO fluorescent imaging system (Hitachi). Allele sizes were determined using a fluorescently labelled ladder of known size (Promega) with the FMBIO ANALYSIS 8.0 image analyser program.

Table 2. Characteristics of informative (polymorphic) intron systems

System name         Intron position  Abbreviation  Primer names
Actin 1             1                Act1          PmActI-F, PmActI-R
Actin 2             2                Act2          Act-2-F, Act-2-R
Calmodulin III      4                CaM           CALMex4F, CALMex5R
Creatine kinase     6                CK6           CK6F, CK7R
Myosin light chain  3                Mlc-3         Mlc-3-F, Mlc-3-R

[Table 2 also has columns for annealing temperature, primer sequence and reference; their row alignment is not legible in this copy. Values in extraction order:
Annealing temperatures (°C): 60, 56, 50, 53, 60.
Primer sequences: AGGTGCTCGTTCCACATGA; TCTGGCACCACACCTTCTACAA; AGTAATGACGTCGCAGATGTTCT; CGACAGGTTCACTCTCGAGGAG; GACCACCTCCGAGTCATCTC; C; CTGACCATGATGGCCAGAAA; GTTAGCTTCTCCCCCAGGTT; GCTATAACCCTCGTAGATGGGCAC; A; CCATACCTTCTACAATGAGCTCCG; GACCAGAGGCGTAGAGGAGAGC.
References: Atarhouch et al. (2003); Chow & Takeyama (1998); Chow (1998); Bierne et al. (2000).]

STATISTICAL ANALYSES

Most statistical analyses were performed using the program GENETIX (Belkhir et al., 1998). This analysis was structured in four stages.

First, a multidimensional analysis gave the overall structure of the samples. Because this kind of analysis can demonstrate the existence of unexpected taxa and because, if these taxa are species, several sympatric species can be included in a given sample, this analysis is a prerequisite before other calculations. The analysis chosen was a factorial correspondence analysis (FCA) (Benzecri, 1973; She et al., 1987). In this kind of analysis, individuals are first coded according to the presence of the different alleles with values 0 (allele absent), 1 (heterozygous for the allele), 2 (homozygous for the allele). The computation then aims at finding composite axes which are a combination of the variables and which optimize the differences between the analysed individuals. The relationships among individuals can be visualized on two or three axes. This computation is not completely independent from differentiation parameters, as Guinand (1996) showed that the inertia values (i.e. the proportion of the total information explained by an axis, given as a percent) along each axis are equivalent to linear combinations of the monolocus Fst values.

The clustering of the spots/individuals on the diagram represents the overall genetic structure and thus possible distinct taxa. To support the definition of clusters, PartitionML (Belkhir & Bonhomme, 2002) was also used. This program searches for the best possible partition of a sample into independent panmictic clusters and simultaneously assigns individuals to them using a maximum likelihood criterion. Following Smouse, Waples & Tworek (1990) under the null hypothesis of k source populations, the optimal number of source populations to be retained can be obtained by incrementing k until the null

Table 3. Allele frequencies of the polymorphic intron loci

Samples (columns, in order): A06, A11, A14, A16, A29, A32, B04, B04-b, B09, B10, B18, B18-b, B22-o, B34, B46, B54, B22-z

Sample sizes per locus:

Locus    A06  A11  A14  A16  A29  A32  B04  B04-b  B09  B10  B18  B18-b  B22-o  B34  B46  B54  B22-z
Act1–1   20   19   24   10   7    10   18   2      10   24   5    4      6      17   8    11   9
Act1–2   20   19   24   10   7    10   18   2      10   24   5    4      6      17   8    12   9
Act2–1   21   20   24   10   7    10   18   2      10   24   5    4      6      17   8    12   9
Act2–2   21   20   24   10   7    10   18   2      10   24   5    4      6      17   8    12   9
Act2–3   21   20   24   10   7    10   18   2      10   23   5    4      6      17   8    12   9
CaM-1    21   20   22   8    6    9    15   2      7    23   4    3      6      14   7    11   6
CaM-2    21   20   23   10   7    10   15   2      9    24   4    3      6      15   8    12   9
CK6-1    20   19   16   8    7    10   18   4      10   23   5    4      6      16   7    12   9
CK6-2    20   19   17   10   7    10   18   2      10   23   5    4      6      17   8    12   9
Mlc3–1   21   20   24   10   7    10   18   2      10   24   5    4      6      17   8    12   9
Mlc3–2   21   20   24   10   7    10   18   2      10   24   5    4      6      17   8    12   9

Significance of Fis and sampling method:

         A06    A11    A14    A16    A29    A32    B04    B04-b  B09    B10    B18    B18-b  B22-o  B34    B46    B54    B22-z
P        NS     NS     NS     NS     **     NS     **     ***    NS     ***    **     NS     *      ***    *      NS     **
Origin   River  River  River  River  Mark.  Mark.  Mark.  Mark.  Mark.  River  Mark.  Mark.  River  River  Mark.  Mark.  River

[Table 3 body: per-allele frequencies for each locus and sample, and the Hnb, Ho and Fis rows; the row and column alignment of these values is not legible in this copy.]

Sample sizes are given for each locus. In the full table, the numbers under the loci names are the various alleles found, defined by their lengths in base pairs. The bottom of the table gives unbiased expected heterozygosity (Hnb) according to Nei (1978) and the observed heterozygosity (Ho). Panmixia was checked through Fis estimation (Weir & Cockerham, 1984) and its significance is given below [P: NS, non-significant; *significant (5%); **highly significant (1%); ***very highly significant (0.1%)]. The last line concerns the sampling method (River, by scientists; Mark., at the market).

hypothesis is not rejected by the 'k + 1 vs. k' likelihood ratio test.

Second, according to the samples and to the possible subsamples (if several taxa were present in a given sample), the classical parameters were calculated, i.e. the allele frequencies, the observed (Ho) and expected non-biased (Hnb) heterozygosity of Nei (1978), and the F-statistics of Wright using the estimators of Weir & Cockerham (1984).

Third, Fst estimations were performed using Genetix software. Weir & Cockerham's (1984) estimator θ was calculated and its significance assessed by 1000 permutations. The Fst calculations were organized at three levels: firstly between groups, secondly inside the main group, Group 1, which covers two basins (in Group 1, Fst values were calculated between basins and between subbasins of the Chang Jiang basin, but not between the subbasins of the Xi Jiang basin, represented by one sample) and thirdly between samples of the same subbasin.

Fourth, a phylogenetic tree was constructed based on the Nei (1978) genetic distances between samples. This distance takes into account the effects of small sample sizes. For this, the neighbour joining algorithm of the PHYLIP 3.5c software package (Felsenstein, 1993) was used, followed by tree construction using TREEVIEW 1.40 (Page, 1996). Linkage disequilibrium (according to Black & Krafsur, 1985) was finally estimated using the same package.

RESULTS

MOLECULAR DATA

Twenty-four pairs of primers (systems) were tested and only five of them gave easily interpretable and polymorphic patterns. For each pair of primers, the PCRs were carried out using the annealing temperature specified in the original descriptive paper. This first PCR gave a large series of bands (from three to 20) generally not usable for genotype determination. When a promising polymorphism was detected, a thermal gradient PCR was run to determine the maximum annealing temperature giving any readable product. The final eleven readable loci were often accompanied by several monomorphic loci. Only polymorphic loci were scored and used for statistical analysis (Table 2).

PARTITION

According to the multidimensional analysis projection given in Figure 2, the overall sample showed four clearly independent clusters: in Figure 2A, (i) Group 1 is located from the origin to the negative part of axis 1 (horizontal), (ii) Group 2 is located at its positive end, and (iii) Group 3 is located near the origin on axis 1 but at the negative end on axis 2 (vertical). This third group was composed of individuals clearly distinct from those in Groups 1 or 2, but which were bought together with other fish of Group 1 (sample B04) and Group 2 (sample B18) at the Sanjie Zhen market in Linchuan County and Anyang market in Du'an County, respectively. Figure 2B describes well (iv) the fourth group, corresponding to the reference sample of the species Z. platypus (station B22). This species has similar coordinates to Group 2 on the first axis, but extends its range far along the positive values of axis 3.

[Figure 2: two scatter plots of individuals (panels A and B); legend: A sampling, B sampling, Zacco, B04, B18; clusters labelled Group 1 to Group 4.]

Figure 2. A, multidimensional analysis (FCA) positioning each individual according to its genetic characteristics. This diagram represents the first plan defined by axes 1 (horizontal, inertia 14.5%) and 2 (vertical, inertia 9.7%) and shows the differentiation between Groups 1, 3 and 2+4. B, second plan of the same analysis with axes 1 (horizontal, inertia 14.5%) and 3 (vertical, inertia 7.5%) showing the genetic differentiation between Groups 2 and 4 that is not discernible in the first plan.

The PartitionML program gave two kinds of results: first, using the Smouse et al. (1990) procedure, a partition into four groups was the most likely. Second, the design of the groups was identical to that deduced from the FCA, except for one individual (99.5% overlap). This result confirms that the groups obtained by correspondence analyses are also the best partitions into biological subgroups (i.e. with the best panmixia and the lowest linkage disequilibrium).

PHYLOGENETIC IMPLICATION

The neighbour joining tree in Figure 3 summarizes the samples' genetic organization: Group 1 is very homogeneous and genetically far from the other groups. Groups 2 and 3, clearly separated from the others, are, however, divided into at least two subgroups each. The Z. platypus outgroup is not particularly distant from the other groups, which does not really support dividing the samples into two genera, while the monophyly of each genus cannot be denied here.

The main purpose of the multidimensional analysis is expressed clearly for samples B04 and B18, which were each demonstrated as being composed of representatives of two groups, Groups 1 and 3 and Groups 2 and 3, respectively. The subsamples were considered as independent OTUs (operational taxonomic units) for the tree construction.

Most of the Fst estimations were significant. The mean intergroup Fst was 0.66 (0.60 < Fst < 0.70). In Group 1, the group best represented in the sample and covering most of the study area, the interbasin Fst was 0.07 while the mean intersubbasin Fst was similar: 0.08 (0.07 < Fst < 0.10) inside the Chang Jiang basin (not calculated for the Xi Jiang subbasin which was represented mostly by one sample). The mean intersample Fst was 0.08 (0.01 < Fst < 0.18) in the Chang Jiang basin but 0.30 in the Xi Jiang basin (0.09 < Fst < 0.51).

These results clearly highlight the intergroup differentiation, which is the main outcome of the data. In Group 1, most of the differentiation was concentrated between samples, especially in the Xi Jiang basin. The differentiation between basins and between subbasins (at least in the Chang Jiang basin) was similar.
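The neighbour-joining tree is built from Nei (1978) distances between samples, a distance that corrects for small sample sizes. As an illustration only, the following Python sketch computes that unbiased distance for two samples from per-locus allele frequencies. The function name, the input layout and the example frequencies are assumptions made for this sketch; they are not the PHYLIP or GENETIX implementation and not data from this study.

```python
import math


def nei_unbiased_distance(freqs_x, freqs_y, n_x, n_y):
    """Nei's (1978) unbiased genetic distance between samples X and Y.

    freqs_x, freqs_y: one dict per locus mapping allele -> frequency.
    n_x, n_y: number of individuals scored at each locus in X and in Y.
    (Hypothetical input layout chosen for this sketch.)
    """
    loci = len(freqs_x)
    g_x = g_y = g_xy = 0.0
    for p_x, p_y, nx, ny in zip(freqs_x, freqs_y, n_x, n_y):
        sum_x2 = sum(p * p for p in p_x.values())
        sum_y2 = sum(p * p for p in p_y.values())
        # Small-sample correction of the within-sample identities.
        g_x += (2 * nx * sum_x2 - 1) / (2 * nx - 1)
        g_y += (2 * ny * sum_y2 - 1) / (2 * ny - 1)
        # Between-sample identity needs no correction.
        g_xy += sum(p * p_y.get(allele, 0.0) for allele, p in p_x.items())
    g_x, g_y, g_xy = g_x / loci, g_y / loci, g_xy / loci
    if g_xy == 0.0:
        return math.inf  # no shared allele at any locus
    return -math.log(g_xy / math.sqrt(g_x * g_y))


if __name__ == "__main__":
    # Two invented samples scored at two loci (allele sizes in base pairs).
    sample_a = [{360: 0.8, 370: 0.2}, {233: 1.0}]
    sample_b = [{360: 0.1, 370: 0.9}, {233: 0.5, 236: 0.5}]
    print(nei_unbiased_distance(sample_a, sample_b, [20, 20], [10, 10]))
```

Because of the correction term, the estimate can be slightly negative for two very similar small samples; such values are usually read as zero distance.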

[Figure 3: neighbour-joining tree; scale bar 0.1 Nei D. Group 1: A06, A11, A14, A16, A29, A32, B04, B09, B10, B54. Group 2: B18, B22, B34, B46. Group 3: B04b, B18b. Group 4: B22Z.]

Figure 3. Neighbour-joining tree combining all the samples analysed including the outgroup Zacco platypus (B22Z) (see Discussion). The branch lengths are proportional to Nei (1978) distances.

POPULATION CHARACTERISTICS

Table 3 gives the allele frequencies of each sample and their heterozygosity calculated from polymorphic loci only. The B04 and B18 samples were split into two subsamples each according to the FCA/PartitionML results. This table also gives the estimation of heterozygote disequilibria (Fis estimated by the f parameter of Weir & Cockerham (1984)) which can be considered as a panmixia test. The sampling technique is also indicated because panmixia estimation is only pertinent when the samples have been obtained by scientific sampling.

The samples' heterozygosity ranged between 0.07 and 0.23. Despite the use of Nei unbiased heterozygosity taking into account sample size, it is probably impossible to calculate accurately a representative value of the polymorphism for small-sized samples. There was no link between Hnb and the river/market origin of the samples.

Population heterozygosities varied between groups. Group 1 was generally more polymorphic (0.11 < Hnb < 0.23) than was Group 2 (0.07 < Hnb < 0.14). The other groups seemed more polymorphic with Hnb = 0.26 for Group 3 (which contained only six fish) and Hnb = 0.23 for the Z. platypus sample (nine fish).
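To make the two heterozygosity measures concrete, the Python sketch below computes the unbiased expected heterozygosity of Nei (1978), Hnb = 2n(1 − Σp²)/(2n − 1), and the observed heterozygosity Ho for a single locus. The function name and the genotypes are hypothetical, and the sketch is not the GENETIX code; the values reported in this study are averages over the scored loci.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Hnb, Ho) for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per individual.
    Hnb is Nei's (1978) unbiased expected heterozygosity; Ho is the
    observed proportion of heterozygous individuals.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Count the 2n gene copies to get allele frequencies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    sum_p2 = sum((c / (2 * n)) ** 2 for c in counts.values())
    h_nb = 2 * n * (1 - sum_p2) / (2 * n - 1)
    h_o = sum(1 for a, b in genotypes if a != b) / n
    return h_nb, h_o


if __name__ == "__main__":
    # Four invented individuals typed at one locus (allele sizes in bp).
    locus = [(360, 360), (360, 370), (370, 370), (360, 370)]
    h_nb, h_o = heterozygosity(locus)
    print(f"Hnb = {h_nb:.3f}, Ho = {h_o:.3f}")
```

With only a few individuals the factor 2n/(2n − 1) is noticeably larger than 1, which is why the correction matters for samples as small as those of Group 3.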


The overall heterozygosity per group was similar between Groups 1 (Hnb = 0.21; 156 fish) and 2 (Hnb = 0.18; 36 fish).

Considering now the Fis values (Table 3) and their significance using the permutation procedure of the GENETIX program, when the sample was captured by the scientific team (by cast net or seine), the amount of disequilibria (= no panmixia) accounted for 50% of the samples (33% when the calculation was limited to samples larger than nine fish). When the sample was bought at the market, the value was 56% (25% when limited to samples larger than nine fish). The method of obtaining fish had therefore little or no influence on the Fis value. This probably means that when the sample was bought from only one fisherman (as in the majority of the cases), the fish were caught together and cannot be considered as consisting of several population assemblages.

Linkage disequilibria were observed at P < 0.05 for four pairs of loci and at P < 0.001 for one pair: Mlc3–2/CaM-2. However, no significant disequilibrium was maintained after a sequential Bonferroni test (Rice, 1989). No clear linkage was therefore demonstrated, indicating that the loci were independent.

DISCUSSION

GENERAL KNOWLEDGE ABOUT CHINESE ICHTHYOFAUNA AND CYPRINIDS

Comparing our results with the general Chinese fish distribution given by Yap (2002), we can confirm a similarity between two drainage populations: the middle Chang Jiang samples (all 'A' samples) and the Xi Jiang samples of the Zhu Jiang basin (B04, B09, B10 and B54) grouped into what we call here 'Group 1'. The western samples of Xi Jiang (B18, B22, B34 and B46) formed the genetically distinct Group 2.

This pattern, which differs totally from the present within-basin connections, could be explained by two kinds of phenomena: (i) former interbasin connections which could be deduced from geological data or recent river captures between the two basins, and (ii) present within-basin isolation, probably due to ecological factors preventing north–south migration of O. bidens, or more simply, limited migration between tributaries.

'A' samples covered the middle Yangtze (Dongtinghu Lake Basin). 'B' samples (B04, B09, B10 and B54) were from the Lijiang and Liujiang Rivers, two tributaries of Xijiang River, the longest tributary of the Pearl River basin. Xiangjiang River (A32), flowing to Dongtinghu Lake, and Lijiang River (B04, B54) are connected by the Lingqu Trench, an artificial canal built over 2000 years ago, in Xing-an County Town, Guangxi. According to some research on the fish fauna of China, many common species, including Opsariichthys and Zacco, occur in both the Xiangjiang and Lijiang rivers. Thus the Lingqu Trench may be an exchange channel for the fish fauna distributed in different basins. The Liujiang River (B09, B10 and B54) is also close to the Lijiang River and the similarity of their fish fauna can be explained by river captures.

GENETIC STRUCTURE OF OPSARIICHTHYS BIDENS IN CHINA

According to Bermingham & Moritz (1998), when compared with mtDNA sequences, nuclear genes (e.g. introns) used for population-level phylogenetics present several flaws including greater coalescent time and possible reticulate evolution among nuclear alleles due to recombination and hybridization. This is why introns were used here mainly for interpopulation relationships, intrapopulation diversity and taxonomic argumentation. Phylogenetic reconstruction was not the aim of this study.

Introns are still infrequently used for population genetics and phylogeography. This kind of marker, the subject of various recent publications (Dixon et al., 1996; Villablanca, Roderick & Palumbi, 1998; Bierne et al., 2000; Daguin, Bonhomme & Borsa, 2001), is promising for intraspecific structure description; however, basic knowledge is still missing and several technical problems must still be solved. Neutrality of introns is expected because they are non-coding DNA zones. Hitchhiking can also be suspected because introns are positioned inside selected genes. Moreover, several papers have described the possible role of introns and so their possible subjection to selection (Federova & Federov, 2003, for example) while others have described their neutrality (Jaruzelska et al., 1999). In this study, introns proved to be efficient because they gave a coherent geographical structure.

The strong similarity of all the samples belonging to Group 1 is a remarkable observation. The maximum Nei distance between two samples in this group was 0.171 while the minimum was 0.002. If selection had been shaping these loci, this homogeneity would have been destroyed by the ecological diversity of the 800-km range of this group.

In Group 2, two subgroups were observed between the downstream sample (B46) and the three samples of the upper basin. The Nei distance was relatively high: 0.345 between B46 and B22, while the three other samples were very similar (0.007 < D < 0.012). Group 2 is clearly divided into two subgroups distributed in different parts of the same basin.

Group 3 was the only one to have populations sympatric with other groups: in B04b Group 3 was sympatric with Group 1 and in B18b with Group 2. Nevertheless, this third group was clearly separated from the other two. The distance between both samples of Group 3 was not negligible (D = 0.402) but the sample sizes were very small, and this must be confirmed by further study.

Fst estimations provided an assessment of the different levels of structure included in the data. Most (27/28) Fst estimations were considered as being different from zero (1000 permutations) at each level of comparison. The intergroup structure prevailed with values over 0.60 and only intersamples Fst inside the Xi Jiang basin were similar (for example B09/B54, Fst = 0.51). On the contrary, interbasin and intersubbasin comparisons gave similar values (between 0.06 and 0.1), indicating the homogeneity of this group over subbasins and basins. Therefore, two populations living in two different basins are statistically not more differentiated than are two populations living in two subbasins of the Chang Jiang River.

The mean intergroup distances were high: D = 0.921 between Groups 1 and 2, 0.555 between Groups 1 and 3, and 0.779 between Groups 2 and 3. The absolute value of genetic distance is not a very good basis for determining the status of groups. Several publications have tried to link genetic distance and species or genus status using allozymes (Avise & Aquadro, 1982; Johns & Avise, 1998). However, each kind of marker has its own reference, which cannot be extrapolated to others. Moreover, with introns, several loci can be amplified using only one pair of primers, and we did not include numerous monomorphic loci in the calculation, mainly because they appear as only weak bands on the gels. A genetic distance using only polymorphic loci can be used not as a reference but only at the intrastudy level.

Comparing the intergroup distances (0.555 < D < 0.921) and the intergeneric distance (0.522) shows that Z. 

genera are well separated by many osteological characters, with several synapomorphies supporting the monophyly of each genus. According to another morphological study, Opsariichthys and Zacco could have a very close relationship (Chen, 1982). They always exist in very similar or even in the same ecological environment.

Our genetic analysis showed that, genetically, Zacco and Opsariichthys are rather closely related. It demonstrated that the nominal species O. bidens is, at least in China, composed of several geographical taxa, each of them characterized by unambiguous diagnostic intron loci, and that Zacco is nested within Opsariichthys, close to Group 2.

To explain the first observation, a large phylogenetic analysis of most Chinese cyprinid genera is necessary. This analysis should solve the taxonomic ambiguity observed here, as have other similar analyses on cyprinids (Kotlik et al., 2002; Tsigenopoulos, Kotlik & Berrebi, 2002).

To explain the second discrepancy, it must be accepted that the rate of molecular evolution differs from the rate of morphological evolution and that classifications based on each kind of data could partly diverge. The possible taxonomic modification (i.e. placing the O. bidens species complex and Z. platypus of the You Jiang tributary, Xi Jiang basin, in the same evolutionary group and even in the same genus) cannot be applied to part or all of the species of the same genera without a detailed analysis involving most of the species of both genera. In particular, the well-established intergeneric morphological distinction must remain the basis of practical systematics. Molecular data should instead propose evolutionary history hypotheses to link the taxa. Investigating the hypothesis of hybridization requires larger sample sizes.
platypus is not more isolated in the tree topol- The Chinese distribution of what may be called the ogy compared with any Opsariichthys group. This ‘O. bidens species complex’ has not been sampled genetic closeness was also observed by He, Chen & exhaustively. It is probable that more taxa will be dis- Tsuneo (2001) using mtDNA sequences. Moreover, covered within this species complex. the genetic closeness of Group 2 and the outgroup (Z. platypus) can be interpreted as a marginal hybridization. POPULATION DATA AND FISH FARMING IN CHINA Howes (1980) demonstrated, using morphological The analysis of the genetic structure of O. bidens in data, that Zacco is outside the bariliine monophyletic southern China from the middle Chang Jiang and lineage, which includes the Opsariichthys monophyl- middle Xi Jiang basins has been motivated by the etic group (Opsariichthys, Opsaridium and Raiamas), necessity to provide new wild species and strains for and six other genera (Megarasbora, Luciosoma, Par- Chinese fish farming. Before domestication, it is nec- luciosoma, Leptocypris, Engraulicypris and Barilius). essary to know the exact status of the population used He also demonstrated that Z. pachycephalus was to provide genitors for breeding. The knowledge of the indeed a species of Opsariichthys, a conclusion that natural taxa of the same taxonomic unit must be used was also supported by Ashiwa & Hosoya (1998). to compare their performance under artificial rearing Opsariichthys and Zacco share many similar external and to choose the most preadapted population for morphological characteristics, and they both have establishing a strain. unusually wide, almost totally congruent, geographi- This study provides very useful data because vari- cal distributions. Yet, according to Howes (1980), both ous populations of O. bidens have been analysed and

compared, and because several groups have been discovered within the same taxonomic unit. While complementary morphological analyses are necessary, the data lead us to propose the hypothesis of at least three subspecies or species within the complex O. bidens. It is clear that populations of each group should be tested for domestication and their performance should be compared.

The other output of the study is the choice of the population of each group to be tested first. It is generally accepted that the more a strain is genetically polymorphic, the greater its capacity for adaptation to a new environment, which would be good in a new species to be used for stocking fish farms. In the case of the two main groups (1 and 2) analysed here, only the river samples (i.e. sampled by nets by the scientific team) and those large enough to give a sure value of heterozygosity are considered.

The three samples from the Yuan Jiang subbasin (Chang Jiang drainage) are the best candidates for Group 1: A06, A11 and A14. Hnb was high, over 0.2. Hnb was lower in Group 2 (between 0.07 and 0.14). The sample sizes were also lower and the choice is limited. Only sample B34 combines all of the prerequisites for fish farming.

No recommendation can be given concerning Group 3, except that rivers harbouring this taxon alone are missing and are necessary for experimentation.

ACKNOWLEDGEMENTS

This study was funded by the European Commission, INCO-DEV programme, project ECOCARP (New native fish species for Asian aquaculture: conserving natural genetic reserves and increasing options for sustainable use of aquatic resources), contract number ICA4-CT-2001–10024. Stefan Andersson, Frank Johansson (UMU), Arne Malzah (Uni-Kiel), Du Jiwu, Zhao Yahui (IZB), Mao Han, Wang Xuzhen, Peng Zuogang and He Shunping (IHB) helped to collect samples in the first and second field trips.

REFERENCES

Anonymous. 1981. Freshwater fishes of Guangxi Province. Nanning: Guangxi People’s Press.
Ashiwa H, Hosoya K. 1998. Osteology of Zacco pachycephalus, sensu Jordan & Evermann (1903), with special reference to its systematic position. Environmental Biology of Fishes 52: 163–171.
Atarhouch T, Rami M, Cattaneo-Berrebi G, Ibanez C, Augros S, Boissin E, Dakkak A, Berrebi P. 2003. New primers for EPIC amplification of intron sequences for fish and other vertebrate population genetic studies. Biotechniques 35: 676–682.
Avise JC, Aquadro CF. 1982. A comparative summary of genetic distance in the Vertebrates. Patterns and correlations. In: Hecht MK, Wallace B, Prance GT, eds. Evolutionary biology. New York, London: Plenum Press, 151–185.
Belkhir K, Bonhomme F. 2002. PartitionML: a maximum likelihood estimation of the best partition of a sample into panmictic units. http://www.univ-montp2.fr/~genetix/partitionml.htm. Montpellier: Laboratoire Génome et Populations, CNRS UPR 9060, Université Montpellier II, Montpellier (France).
Belkhir K, Borsa P, Goudet J, Chikhi L, Bonhomme F. 1998. GENETIX, logiciel sous Windows™ pour la génétique des populations. http://www.univ-montp2.fr/~genetix/genetix.htm. Montpellier: Laboratoire Génome et Populations, CNRS UPR 9060, Université Montpellier II, Montpellier (France).
Benzecri JP. 1973. L’analyse des données. Paris: Dunod.
Bermingham B, Moritz C. 1998. Comparative phylogeography: concepts and applications. Molecular Ecology 7: 367–369.
Bierne N, Lehnert SA, Bédier E, Bonhomme F, Moore SS. 2000. Screening for intron-length polymorphisms in penaeid shrimps using exon-primed intron-crossing (EPIC)-PCR. Molecular Ecology 9: 233–235.
Black WC, Krafsur ES. 1985. A FORTRAN program for the calculation and analysis of two-locus linkage disequilibrium coefficients. Theoretical and Applied Genetics 70: 491–496.
Chen IY. 1982. Revision of the fishes of genera Opsariichthys, Zacco, Candidia, and Parazacco. Oceanologia et Limnologia Sinica 13: 293–298.
Chow S. 1998. Universal PCR primer for calmodulin gene intron in fish. Fisheries Science 64: 999–1000.
Chow S, Takeyama H. 1998. Intron length variation observed in the creatine kinase and ribosomal protein genes of the swordfish Xiphias gladius. Fisheries Science 64: 397–402.
Daguin C, Bonhomme F, Borsa P. 2001. The zone of sympatry and hybridization of Mytilus edulis and M. galloprovincialis, as described by intron length polymorphism at locus mac-1. Heredity 86: 342–354.
Dixon B, Nagelkerke LAJ, Sibbing FA, Egberts E, Stet RJM. 1996. Evolution of MHC class II B chain-encoding genes in the Lake Tana barbel species flock (Barbus intermedius complex). Immunogenetics 44: 419–431.
Federova L, Federov A. 2003. Introns in gene evolution. Genetica 118: 123–131.
Felsenstein J. 1993. PHYLIP 3.5 (phylogeny inference package). Seattle, Washington: University of Washington.
Fu CZ, Wu JH, Chen JK, Qu QH, Lei GC. 2003. Freshwater fish biodiversity in the Yangtze River basin of China: patterns, threats and conservation. Biodiversity and Conservation 12: 1649–1685.
Guinand B. 1996. Use of a multivariate model using allele frequency distribution to analyse patterns of genetic differentiation among populations. Biological Journal of the Linnean Society 58: 73–195.
Hassan M, Lemaire C, Fauvelot C, Bonhomme F. 2002. Seventeen new exon-primed intron-crossing polymerase chain reaction amplifiable introns in fish. Molecular Ecology Notes 2: 334–340.


He SP, Chen YY, Tsuneo N. 2001. Sequences of cytochrome b gene for primitive cyprinid fishes in East Asia and their phylogenetic concerning. Chinese Science Bulletin 46: 661–665.
Howes GJ. 1980. The anatomy and classification of bariliine cyprinid fishes. Bulletin of the British Museum of Natural History (Zoology) 37: 129–198.
Jaruzelska J, Zietkiewicz E, Batzer M, Cole DEC, Moisan J-P, Scozzari R, Tavare S, Labuda D. 1999. Spatial and temporal distribution of the neutral polymorphisms in the last ZFX Intron: analysis of the haplotype structure and genealogy. Genetics 152: 1091–1101.
Johns GC, Avise JC. 1998. A comparative summary of genetic distances in the Vertebrates from the mitochondrial cytochrome b gene. Molecular Biology and Evolution 15: 1481–1490.
Kotlik P, Tsigenopoulos CS, Rab P, Berrebi P. 2002. Two new Barbus species from the Danube River basin, with redescription of B. petenyi (Teleostei: Cyprinidae). Folia Zoologica 51: 227–240.
Li SF. 2001. A study on biodiversity and its conservation of major fishes in the Yangtze River. Shanghai: Shanghai Scientific and Technical Publishers.
Nei M. 1978. Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics 89: 583–590.
Page RDM. 1996. TREEVIEW: An application to display phylogenetic trees on personal computers. Computer Applications in the Biosciences 12: 357–358.
Rice WR. 1989. Analyzing tables of statistical tests. Evolution 43: 223–225.
She JX, Autem M, Kotoulas G, Pasteur N, Bonhomme F. 1987. Multivariate analysis of genetic exchanges between Solea aegyptiaca and Solea senegalensis (Teleosts, Soleidae). Biological Journal of the Linnean Society 32: 357–371.
Smouse PE, Waples RS, Tworek JA. 1990. A genetic mixture analysis for use with incomplete source population data. Canadian Journal of Fisheries and Aquatic Science 47: 620–634.
Tsigenopoulos C, Kotlik P, Berrebi P. 2002. Biogeography and pattern of gene flow among Barbus species (Teleostei: Cyprinidae) inhabiting the Italian Peninsula and neighbouring Adriatic drainages as revealed by allozymes and mitochondrial sequence data. Biological Journal of the Linnean Society 75: 83–99.
Villablanca F, Roderick G, Palumbi S. 1998. Invasion genetics of the mediterranean fruit fly: variation in multiple nuclear introns. Molecular Ecology 7: 547–560.
Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38: 1358–1370.
Yap SY. 2002. On the distributional patterns of Southeast-East Asian freshwater fish and their history. Journal of Biogeography 29: 1187–1199.
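
The linkage disequilibrium results were screened with a sequential Bonferroni test (Rice, 1989). As an illustration only, the step-down logic of such a test can be sketched in Python as below. The function name, the example P-values and the 0.05 threshold are assumptions made for this sketch; they are not taken from the study, and the software actually used by the authors may differ in detail.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Step-down (sequential) Bonferroni screening of a table of tests.

    Returns a list of booleans, in the input order, flagging the tests
    that remain significant at the table-wide level `alpha`.
    """
    k = len(p_values)
    # Indices of the tests, from the smallest to the largest P-value.
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, i in enumerate(order):
        # The smallest P is compared with alpha/k, the next with alpha/(k-1), ...
        if p_values[i] <= alpha / (k - rank):
            significant[i] = True
        else:
            break  # once one test fails, all larger P-values fail too
    return significant


# Hypothetical P-values for four pairs of loci (not data from the paper).
example = [0.001, 0.04, 0.03, 0.5]
flags = sequential_bonferroni(example)
```

With these hypothetical values only the first test survives the correction, although three of the four raw P-values are below 0.05; this is the kind of outcome described for the pairwise locus comparisons.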

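
The Discussion relies on Nei distances (D) between samples and on unbiased heterozygosity (Hnb). The sketch below shows how these two quantities can be computed from allele frequencies following the unbiased estimators of Nei (1978). It is a minimal illustration, not the authors' pipeline: the function names, the data layout (one dictionary of allele frequencies per locus) and the toy frequencies are assumptions introduced here.

```python
import math


def unbiased_heterozygosity(freqs, n):
    """Nei (1978) unbiased expected heterozygosity at one locus.

    freqs: allele frequencies at the locus (summing to 1).
    n: number of diploid individuals sampled.
    """
    homozygosity = sum(p * p for p in freqs)
    return (2 * n / (2 * n - 1)) * (1.0 - homozygosity)


def nei_unbiased_distance(pop_x, pop_y, n_x, n_y):
    """Nei (1978) unbiased genetic distance between two samples.

    pop_x, pop_y: lists with one {allele: frequency} dict per locus,
    the loci being in the same order in both samples.
    n_x, n_y: numbers of diploid individuals in each sample.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        sx = sum(p * p for p in fx.values())
        sy = sum(p * p for p in fy.values())
        # Small-sample corrections of the within-sample identities.
        jx += (2 * n_x * sx - 1) / (2 * n_x - 1)
        jy += (2 * n_y * sy - 1) / (2 * n_y - 1)
        # Between-sample identity needs no correction.
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    if jxy == 0.0:
        return math.inf  # no allele shared at any locus
    # Dividing each sum by the number of loci would cancel in the ratio.
    return -math.log(jxy / math.sqrt(jx * jy))


# Toy example (hypothetical): one shared fixed locus, one diagnostic locus.
x = [{"a": 1.0}, {"a": 1.0}]
y = [{"a": 1.0}, {"b": 1.0}]
d_toy = nei_unbiased_distance(x, y, n_x=20, n_y=20)
h_toy = unbiased_heterozygosity([0.5, 0.5], n=10)
```

In the toy example half of the loci are diagnostic, which gives D = ln 2 (about 0.69). Because of the small-sample correction, the estimate between two samples drawn from the same population can be slightly negative; it is then usually read as zero.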