Advance Publication by J-STAGE

Genes & Genetic Systems

Received for publication: September 6, 2014
Accepted for publication: May 8, 2015
Published online: December 18, 2015

Inconsistent diversities between nuclear and plastid genomes of AA genome species in the genus

Hao Yin1, Masahiro Akimoto2, Reunreudee Kaewcheenchai3, Masahiro Sotowa5, Takashige Ishii4 and Ryuji Ishikawa5

1 United Graduate School of Agricultural Sciences, Iwate University, Morioka, Iwate 020-8550, Japan

2 Obihiro University of Agriculture and Veterinary Medicine, Obihiro, Hokkaido 080-8555, Japan

3 Prachinburi Research Center, Bureau of Rice Research and Development, Rice Department, Prachinburi, 25150 Thailand

4 Graduate School of Agricultural Science, Kobe University, Kobe, 657-0013, Japan

5 Faculty of Agriculture and Life Science, Hirosaki University, Hirosaki, Aomori 036-8561, Japan

Running head: Genomic diversity of Oryza AA genome species

Key words: Oryza, AA genome, chloroplast genome, next-generation sequencing, maternal lineage

*Corresponding author: E-mail: [email protected]

AA genome species in the genus Oryza are valuable resources for the improvement of cultivated rice. Oryza rufipogon and O. barthii were the progenitors of the two domesticated rice species, O. sativa and O. glaberrima, respectively. We used chloroplast single-nucleotide repeats (RCt1-10) to evaluate genetic diversity among AA genome species. Higher diversity was detected in the American species O. glumaepatula and the Asian species O. rufipogon. Other chloroplast sequences indicated that O. glumaepatula shares high similarity with O. longistaminata. Insertions of retrotransposable elements, however, showed a close relationship between O. barthii and O. glumaepatula. To clarify phylogenetic relationships among AA genomes, whole-genome sequences obtained from different species were used to develop chloroplast INDEL markers. The INDEL patterns clearly showed multiple maternal origins of O. glumaepatula. These complicated origins have resulted in high genetic diversity in this species. In contrast, the Australian endemic species O. meridionalis tended to show narrower diversity than the other species. High variation in O. rufipogon, reconfirmed using the chloroplast INDELs, covered the variation in O. meridionalis and part of the variation in O. glumaepatula. Maternal lineages including O. barthii, O. longistaminata and the remainder of O. glumaepatula were phylogenetically close to each other and carried low genetic diversity. They were separated from independent lineages, suggesting that they had diverged from a single ancestral maternal lineage, but diverged later to keep gene flow within respective species, as SSR compositions suggested. Genetic relationships among AA genome species indicate how these species have evolved and become distributed across four continents.

INTRODUCTION

The evolution of the genus Oryza, especially of the AA genome species, has been one of the major focuses of rice research (Oka, 1988). AA genome species include two cultigens: O. sativa, which is cultivated worldwide, and O. glaberrima, an endemic species distributed along the River Niger. The progenitors of these cultivated species were once regarded as the O. perennis complex, consisting of four geographical races – Asian, African, American, and Oceanian – inhabiting different regions (Morishima, 1969). These were subsequently renamed and defined as different species: O. rufipogon, O. barthii, O. longistaminata, O. glumaepatula, and O. meridionalis (Oka, 1988). Oryza sativa was domesticated in East Asia from O. rufipogon (Oka, 1988; Fuller and Sato, 2008; Huang et al., 2012), and O. glaberrima was independently domesticated from O. barthii (Wang, M. et al., 2014).

The major cultivated species, O. sativa, has a deep population structure due partly to introgression from various varieties of O. rufipogon, and also from other varietal groups (Huang et al., 2012; Garris et al., 2005; Ishikawa et al., 2002a, b; McNally et al., 2009; Molina et al., 2011; Tang and Morishima, 1997). Wide variation had already arisen 86 to 440 ky ago between the O. sativa subspecies indica and japonica, and 2 my ago among AA genome wild species (Molina et al., 2011; Zhu and Ge, 2005). The deep divergence between the subspecies of O. sativa can be seen in genome divergence (Huang et al., 2012) and in various other measurements of divergence (McNally et al., 2009; Molina et al., 2011). japonica can be further classified into Temperate-japonica and Tropical-japonica (Oka, 1988; Sato, 1991). Divergence between Tropical-japonica and indica was estimated to be as deep as about 3900 y (Molina et al., 2011). However, these varietal groups share a monophyletic origin. Because they share identical alleles at several loci regulating major agronomic traits such as non-shattering, both varietal groups are thought to have acquired the same genetic components through past introgression (Ishikawa et al., 2002a, b; Tang and Morishima, 1997). Past introgression events among wild forms and cultivars have also been inferred from genomic data (Huang et al., 2012). Such introgressions can still be observed in weedy rice, and in parts of genome sequences where selective sweeps have occurred (Molina et al., 2011). In fact, strong selective sweeps have occurred at several key domestication genes such as the non-shattering gene and the white pericarp gene (Li et al., 2006; Lin et al., 2007; Sweeney et al., 2007). In contrast, the endemic species O. glaberrima was domesticated from the annual O. barthii, which is widespread in west tropical Africa (Oka, 1988; Khush, 1997; Semon et al., 2005). Recent genomic data have suggested that O. glaberrima was domesticated from a narrow gene pool in O. barthii (Wang, M. et al., 2014). Because this domestication occurred independently of O. rufipogon, O. glaberrima has diverged sufficiently from O. sativa to develop multiple reproductive barriers (Morishima, 1969). The related perennial species, O. longistaminata, shares the same habitat as the other African species, but is more widely distributed in Africa and was not involved in domestication.

Two other wild species on different continents, O. meridionalis and O. glumaepatula, have shown no domestication events. Oryza meridionalis is a species endemic to Oceania, including New Guinea and Australia, and is characterized by an annual life history and, morphologically, by a short anther (Ng et al., 1981; Vaughan, 1994; Lu, 1999). Its distribution partly overlaps that of O. rufipogon. Recently, a possible new species has been reported in the same area, which appears to have diverged from the endemic species but acquired a perennial life history (Brozynska et al., 2014; Sotowa et al., 2013). In addition, a conventional perennial type, the so-called Australian O. rufipogon, was found to share a highly similar chloroplast genome with this possible new species and also with O. meridionalis. However, the Australian O. rufipogon did not show any reproductive barrier against Asian O. rufipogon. Nuclear markers also suggested that the Australian type had not diverged from the Asian type. Divergence among AA genome species is thus still insufficiently understood. Another example is the American species, O. glumaepatula, which is distributed in the New World from Cuba to Brazil. The life form of this taxon is still uncertain; an ecotype distributed in Central America and the northern region of South America seems to have a perennial habit (Oka, 1988), while another ecotype in tropical Brazil shows an annual-perennial intermediate form (Akimoto et al., 1998). This means that there are still opportunities to detect novel diversity in natural populations of this and other species.

The availability of various molecular markers, including chloroplast sequences, chloroplast single-nucleotide repeats, and nuclear simple sequence repeats (SSRs), now allows researchers to acquire a comprehensive grasp of divergent evolution among the AA genome species described above, although in some cases these markers have not been fully applied to genomes other than that of O. sativa. Since the complete sequencing of the chloroplast genome of O. sativa cv. Nipponbare (Hiratsuka et al., 1989), the known presence of conserved sequences has made it possible to compare divergent accessions and also the mitochondrial and nuclear genomes (Brozynska et al., 2014; Sotowa et al., 2013). Re-sequencing is also available for any genome belonging to the genus Oryza, allowing alignment against the reference sequence (Waters et al., 2012). In the present study, we utilized these materials and tools to evaluate the phylogenetic relationships among Asian, African, American, and Oceanian AA genome species.

MATERIALS AND METHODS

Materials  Wild rice accessions of ranks 1 to 3, representing five AA genome species, were provided by the National BioResource Project in Japan (Nonomura et al., 2010). The accessions formed a core collection selected from typical representatives of the historical collections stored at the National Institute of Genetics, Mishima, Japan. Twenty accessions of O. barthii, 20 of O. glumaepatula, 19 of O. longistaminata, 39 of O. rufipogon and 18 of O. meridionalis were used to clarify their genetic diversity (Table 1). Seven O. rufipogon accessions are referred to as Oceania O. rufipogon, because two of them originated in Australia and the other five in Papua New Guinea. In a previous report, we characterized them as distinct from Asian O. rufipogon (Sotowa et al., 2013). In order to compare genetic diversity with that of cultivars, 20 indica and 20 japonica accessions were also genotyped (Supplementary Table S1). The japonica cultivars had already been classified into Tropical-japonica (Tr-J) and Temperate-japonica (Tm-J) using the method described by Sato (1991). The wild rice accessions were supplied as DNA samples by the National Institute of Genetics, Japan. DNA of cultivars and additional material was extracted by the standard urea method.

Molecular markers  Chloroplast single-nucleotide repeat markers, RCt1 to RCt10, were used to evaluate the genetic diversity of the maternal lineages of AA genome species (Ishii and McCouch, 2000; Ishii et al., 2001). Nuclear SSR markers were also used to evaluate genetic diversity (Table 2). These fragments were amplified with Thermopol® Taq (NEB Co., Japan), electrophoresed on 6% denaturing polyacrylamide gels, and genotyped after staining with silver nitrate (Wang, Y. P. et al., 2012). The plastid genotypes were also subjected to principal component analysis (PCA) to evaluate their relationships. Fragments of the chloroplast genes rpl16 and matK were amplified with the respective primers listed in Table 2 in order to detect SNPs and clarify the phylogenetic relationships among African and American species. All fragments were amplified with Thermopol® Taq under the following cycle conditions: 94ºC for 3 min for pre-heating; 30 cycles of 94ºC for 10 s, 55ºC for 30 s and 72ºC for 30 s; and a final 72ºC for 5 min. The PCR fragments were purified with a FastGene Gel/PCR extraction kit (Nippon Genetics Co., Japan), then sequenced with a BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Japan) on an ABI3500. Retrotransposon insertions were re-examined as reported previously by Cheng et al. (2002). Three insertions, p-SINE1-r705, p-SINE1-r801, and p-SINE1-r806, were used to clarify relationships among O. barthii, O. longistaminata, and O. glumaepatula. PCR was performed with Ex Taq® (Takara Co., Japan) using the following protocol: 94ºC for 3 min for pre-heating; 30 cycles of 94ºC for 10 s, 60ºC for 30 s and 72ºC for 30 s; and a final 72ºC for 5 min. Amplified fragments were electrophoresed on 1.5% agarose gels to determine whether each accession carried the insertions.

Novel INDEL markers based on whole-genome sequence data  W1171 (O. glumaepatula) and a Thai wild rice accession, 45-2 (O. rufipogon), grown at Prachinburi Rice Research Center (Rice Department, Thailand), were subjected to next-generation sequencing (NGS) in order to find novel INDELs that may be detectable only in non-model accessions. DNA of these accessions was extracted from mature leaves with a DNeasy Plant Mini Kit (QIAGEN Co., Japan). Whole-genome sequences were obtained with an Illumina HiSeq as 100-bp paired-end reads, with 75,832,764 reads for W1171 and 62,762,232 reads for 45-2. The raw reads were mapped using CLC Genomics Workbench (CLC Bio Japan Inc., Japan) against the complete chloroplast genome of O. sativa cv. Nipponbare (GU592207.1). Insertions and deletions (INDELs) were then screened. Variations showing over 1000X coverage, over 100 counts, and frequencies of more than 50% were carried forward to "wet experiments". To confirm these variations in comparison with the Nipponbare chloroplast genome (GU592207.1), four INDEL markers were developed from each accession (Table 2). Another complete chloroplast genome, that of O. meridionalis (GU592208), was aligned with the Nipponbare genome (GU592207.1), and four further INDEL markers were developed.
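The screening criteria described above can be sketched as a simple filter. This is a minimal illustration in Python, assuming that variant calls have been exported as records; the field names (`coverage`, `count`, `frequency`) and the example records are hypothetical and are not taken from the CLC output or from this study.

```python
from dataclasses import dataclass


@dataclass
class IndelCall:
    position: int     # reference position (hypothetical example value)
    coverage: int     # read depth at the site
    count: int        # number of reads supporting the variant
    frequency: float  # percentage of reads supporting the variant


def passes_screen(call, min_coverage=1000, min_count=100, min_frequency=50.0):
    """Apply the thresholds stated in the text: >1000X coverage,
    >100 supporting counts, and a frequency of more than 50%."""
    return (
        call.coverage > min_coverage
        and call.count > min_count
        and call.frequency > min_frequency
    )


# Invented records, for illustration only.
calls = [
    IndelCall(position=101, coverage=2500, count=1800, frequency=72.0),
    IndelCall(position=202, coverage=800, count=600, frequency=75.0),
    IndelCall(position=303, coverage=3000, count=90, frequency=3.0),
    IndelCall(position=404, coverage=1500, count=700, frequency=46.7),
]

selected = [c.position for c in calls if passes_screen(c)]
print(selected)
```

Only the first record satisfies all three thresholds; each of the others fails exactly one of them.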

Data analysis  RCt microsatellite and nuclear SSR genotypes were analyzed with GenAlEx (http://biology-assets.anu.edu.au/GenAlEx/Welcome.html) to perform principal component analysis (PCA) and to calculate genetic distances among accessions. The number of alleles (Na), the effective number of alleles (Ne), and expected heterozygosity (He) were also calculated with GenAlEx. Ne is substantially smaller than the actual number of alleles when some alleles are much rarer than others. He was calculated using the formula

$H_e = 1 - \sum_i x_i^2$,

where $x_i$ is the frequency of the i-th allele.
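The three diversity statistics can be illustrated with a short calculation. The sketch below assumes the standard definitions He = 1 − Σxᵢ² and Ne = 1/Σxᵢ², where xᵢ is the frequency of the i-th allele at a locus; the allele lists are invented and serve only to show why Ne falls below Na when allele frequencies are uneven.

```python
from collections import Counter


def diversity_stats(alleles):
    """Return (Na, Ne, He) for a list of allele calls at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    # Sum of squared allele frequencies.
    sum_sq = sum((n / total) ** 2 for n in counts.values())
    na = len(counts)    # number of observed alleles
    ne = 1.0 / sum_sq   # effective number of alleles
    he = 1.0 - sum_sq   # expected heterozygosity
    return na, ne, he


# Four alleles at equal frequency (invented data).
even = ["a", "b", "c", "d"] * 5
# Four alleles, one common and three rare (invented data).
skewed = ["a"] * 17 + ["b", "c", "d"]

for label, sample in (("even", even), ("skewed", skewed)):
    na, ne, he = diversity_stats(sample)
    print(f"{label}: Na={na} Ne={ne:.2f} He={he:.2f}")
```

Both samples have Na = 4, but the skewed sample has an Ne well below 4 and a correspondingly lower He, because the three rare alleles contribute little to the sum of squared frequencies.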

Phylogenetic trees based on RCt genotypes, INDELs and SSRs were constructed by the neighbor-joining method using Populations 1.2.31 (http://bioinformatics.org/~tryphon/populations/). Phylogenetic trees were drawn with MEGA 5.0 (Tamura et al., 2011).
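Neighbor-joining starts from a matrix of pairwise distances between accessions. The text does not state which distance measure was selected in Populations, so the sketch below uses the proportion of mismatching loci purely as an assumed stand-in; the accession names and allele sizes are also invented.

```python
from itertools import combinations


def mismatch_distance(g1, g2):
    """Proportion of loci with different alleles, ignoring missing calls (None)."""
    shared = [
        locus for locus in g1
        if locus in g2 and g1[locus] is not None and g2[locus] is not None
    ]
    if not shared:
        raise ValueError("no loci scored in both genotypes")
    diffs = sum(1 for locus in shared if g1[locus] != g2[locus])
    return diffs / len(shared)


def distance_matrix(genotypes):
    """Symmetric pairwise distance matrix as a dict keyed by name pairs."""
    names = sorted(genotypes)
    matrix = {(name, name): 0.0 for name in names}
    for a, b in combinations(names, 2):
        d = mismatch_distance(genotypes[a], genotypes[b])
        matrix[(a, b)] = matrix[(b, a)] = d
    return matrix


# Hypothetical haploid chloroplast genotypes (fragment sizes in bp).
genotypes = {
    "acc_A": {"RCt1": 120, "RCt3": 98, "RCt4": 101, "RCt8": 130},
    "acc_B": {"RCt1": 121, "RCt3": 98, "RCt4": 101, "RCt8": 130},
    "acc_C": {"RCt1": 123, "RCt3": 99, "RCt4": None, "RCt8": 130},
}

matrix = distance_matrix(genotypes)
for (a, b), d in sorted(matrix.items()):
    if a < b:
        print(a, b, round(d, 3))
```

A matrix of this form is the input that a neighbor-joining implementation would then cluster into a tree.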

RESULTS

Genetic diversity in chloroplast microsatellites

Polymorphism in chloroplast single-nucleotide repeats was found at six of the ten loci examined (Supplementary Table S2). The average expected heterozygosity (He) ranged from 0.176 to 0.542 among species (Table 3). The highest score was found for O. glumaepatula, followed by 0.528 for O. rufipogon. Our previous study showed that O. rufipogon accessions were composed of two diverged maternal lineages (Sotowa et al., 2013). Therefore, these accessions were divided into Asian and Oceania groups, which had distinct scores of 0.511 and 0.150, respectively. Oryza sativa showed relatively high genetic diversity when the two varietal groups (indica and japonica) were combined. These varietal groups have diverged at the subspecies level and were strongly influenced by different groups of O. rufipogon. When the genetic diversity of each varietal group was calculated separately, the scores were lower than that of O. rufipogon. The japonica subgroups, Tm-J and Tr-J, have also diverged to some extent, and the score calculated for each subgroup was lower than that of the japonica group as a whole. These trends suggest that the varietal groups within O. sativa tend to carry different RCt genotypes.

To obtain an overview of diversity among AA genome species, PCA was performed. Allelic combinations revealed particular groups (Fig. 1A). The subgroups of the Asian cultivated species were clearly separated from each other, but the variation of the cultivars as a whole was included within that of O. rufipogon. This indicated that O. rufipogon in the core collection covers the fundamental variation inherited by the cultivated accessions. PCA showed that the Oceania group was located in a position intermediate between O. rufipogon and O. meridionalis (Fig. 1B). After excluding O. sativa, the relationships of all wild AA genome species were confirmed by PCA (Fig. 1C). Oryza glumaepatula overlapped partly with O. longistaminata and O. rufipogon, and O. barthii was placed outside the others (Fig. 1D). The distribution obtained by PCA suggested that O. longistaminata shares a closely related chloroplast genome with O. glumaepatula.

A phylogenetic tree was constructed with RCt genotypes (Supplementary Fig. S1). Oryza meridionalis, O. barthii, and Oceania O. rufipogon formed distinctive clades. Oryza glumaepatula tended to form two major clades, with either O. rufipogon or O. longistaminata. Oryza rufipogon accessions were widely distributed. The ambiguous divergence was due to the mutable nature of single-nucleotide repeats. Except for RCt6, the polymorphic markers carried multiple alleles (Supplementary Table S2). In one case, RCt1 carried seven alleles in a single species. Although the mutable alleles of RCt markers made the phylogenetic relationships ambiguous, PCA with RCt repeats demonstrated a novel phylogenetic relationship of O. glumaepatula.

Sequence-based phylogenetic analysis with rpl16 and matK

Parts of the chloroplast sequence including rpl16 were sequenced in all species. As the rpl16 sequence included RCt8 and multiple other single-nucleotide repeats, its phylogenetic tree did not show clear relationships (data not shown). Thus, single-nucleotide repeats were excluded from the rpl16 sequence data, and SNP combinations were summarized as haplotypes to confirm relationships among O. glumaepatula, O. barthii, and O. longistaminata (Table 4). Oryza longistaminata shared the same haplotypes (Haplotypes 2 and 7) with O. rufipogon. One of these (Haplotype 2) was the major haplotype in O. glumaepatula. The other two haplotypes in O. glumaepatula (Haplotypes 3 and 4) carried unique SNPs relative to Haplotype 2. Haplotypes 3 and 4 carried an A to G substitution at 1430 nt relative to Haplotype 2, and Haplotype 4 carried a further A to T substitution at 655 nt. They were identical to Haplotype 2 except for these two substitutions. None of these haplotypes was identical to the two haplotypes in O. barthii, because Haplotypes 5 and 6 carried a T insertion between 779 and 780 nt, and Haplotype 6 carried an additional A substitution at 1180 nt, when compared to Haplotype 2. These variations resulted in a unique feature in O. barthii.
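The haplotype comparison described above can be made explicit by encoding each haplotype as the set of variants it carries relative to Haplotype 2. The sketch below uses only the differences stated in the text; Haplotypes 1 and 7 are omitted because their variants are not described here, and the variant labels are an ad hoc notation chosen for illustration.

```python
# Variants relative to Haplotype 2, as described in the text.
# The label strings are an ad hoc notation, not a standard format.
HAPLOTYPES = {
    "Hap2": frozenset(),
    "Hap3": frozenset({"1430:A>G"}),
    "Hap4": frozenset({"1430:A>G", "655:A>T"}),
    "Hap5": frozenset({"779^780:insT"}),
    "Hap6": frozenset({"779^780:insT", "1180:subst_A"}),
}

# Species assignments as stated in the text.
SPECIES = {
    "O. glumaepatula": {"Hap2", "Hap3", "Hap4"},
    "O. barthii": {"Hap5", "Hap6"},
}


def n_differences(a, b):
    """Number of variant sites at which two haplotypes differ."""
    return len(HAPLOTYPES[a] ^ HAPLOTYPES[b])


shared = SPECIES["O. glumaepatula"] & SPECIES["O. barthii"]
print("shared haplotypes:", sorted(shared))
for a in sorted(SPECIES["O. glumaepatula"]):
    for b in sorted(SPECIES["O. barthii"]):
        print(a, b, n_differences(a, b))
```

The two species share no haplotype, which is the sense in which the O. barthii haplotypes are unique, even though individual haplotypes differ by only a few sites.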

In order to confirm the relationships among O. glumaepatula, O. barthii, and O. longistaminata, an additional chloroplast gene, matK, was also sequenced. Oryza glumaepatula and O. longistaminata shared the same sequence, which was identical to that of O. sativa cv. Nipponbare, except for one accession, W1448, which carried a C to T substitution at 684 nt (Supplementary Table S3). Oryza barthii possessed a unique substitution.

Genetic distance evaluated by nuclear DNA

It has previously been reported that O. glumaepatula shares genetic similarity with O. barthii at the nuclear level (Cheng et al., 2002). However, chloroplast markers did not show any evidence for this. Nuclear SSR markers and other markers were therefore used to resolve this apparent inconsistency in the genetic relationships between American and African species. Twenty-eight SSR markers were randomly chosen, and seven of them showed sufficient polymorphism. The other markers were monomorphic across all species or failed to amplify in multiple species, probably because of sequence divergence among species or genomic rearrangement. The averaged He scores for the species overall ranged from 0.335 at AP004212 to 0.704 at RM257 (Supplementary Table S4). Oryza meridionalis showed the lowest diversity among the species. The highest He score was found in the Asian wild rice, O. rufipogon. The relatively high value for Oceania O. rufipogon was probably due to the mixture of Asian and Australian factors described in our previous paper (Sotowa et al., 2013).

These genotypes were then used to calculate genetic distances in order to construct a phylogenetic tree. Species-specific clades were clearly recognized for O. meridionalis and O. glumaepatula. Oryza rufipogon showed a scattered distribution and partly formed clades with O. longistaminata. Oceania O. rufipogon W1235 and W1239 formed a clade with the Australian endemic species O. meridionalis (Fig. 2). The earlier classification of W1235 and W1239 as Oceania O. rufipogon, based on field observations by Katayama (1968), was probably a misassignment of annual types between O. rufipogon and O. meridionalis, because O. meridionalis had not yet been described at that time. These accessions would now be classified as O. meridionalis. Detailed phenotypic observations should be performed to determine their taxonomic classification. Other Oceania O. rufipogon accessions formed three clades with some of the Asian O. rufipogon accessions. In one of these clades, two Australian O. rufipogon accessions, W2078 and W2109, and one Papua New Guinean O. rufipogon accession, W1238, grouped with Cambodian, Indian, and Malayan O. rufipogon accessions. W1230 was grouped with accessions from India, Laos, Myanmar, and Thailand. W1236 was grouped with accessions from India, Indonesia, Philippines, and Thailand. Oryza glumaepatula and O. barthii were not genetically close and tended to form distinctive clades with other species. One O. barthii accession was close to one Indian O. rufipogon accession, W0137.

The previously reported close relationship between O. glumaepatula and O. barthii was demonstrated on the basis of SINE insertions (Cheng et al., 2002). The accessions used in this experiment were re-examined using the three known SINE insertions (Supplementary Table S5). Oryza barthii and O. glumaepatula shared the p-SINE1-r806 insertion, while only O. glumaepatula carried the p-SINE1-r801 insertion. Oryza longistaminata carried the p-SINE1-r705 insertion, which was not carried by the other species. The p-SINE1 insertion data suggested that these species are independent of each other and that all accessions of each species originated from a single ancestral population. Some O. rufipogon accessions that were included in the same clades as O. longistaminata did not carry the p-SINE1-r705 insertion (data not shown). The genetic relationships inferred from SSR genotypes were therefore not precise beyond the species level. SSR markers, however, gave an overview of the relationships among species.
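The presence and absence pattern of the three SINE insertions can be summarized per species and compared pairwise. The sketch below encodes one reading of the statements above (in particular, that O. longistaminata lacks p-SINE1-r806, which is implied but not stated explicitly); the per-accession results are those in Supplementary Table S5 and are not reproduced here.

```python
from itertools import combinations

# Species-level summary as read from the text; an interpretation,
# not a reproduction of Supplementary Table S5.
INSERTIONS = {
    "O. barthii": {"p-SINE1-r806"},
    "O. glumaepatula": {"p-SINE1-r806", "p-SINE1-r801"},
    "O. longistaminata": {"p-SINE1-r705"},
}


def shared_insertions(species_a, species_b):
    """Insertions carried by both species."""
    return INSERTIONS[species_a] & INSERTIONS[species_b]


pairs = {
    (a, b): shared_insertions(a, b)
    for a, b in combinations(sorted(INSERTIONS), 2)
}
for (a, b), common in pairs.items():
    print(f"{a} / {b}: {sorted(common) or 'none'}")
```

Under this reading, O. barthii and O. glumaepatula are the only pair that shares an insertion, which is the nuclear signal that contrasts with the chloroplast data.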

NGS data

Two accessions, W1171 (O. glumaepatula) and Thai wild rice 45-2 (O. rufipogon), were newly re-sequenced against the chloroplast genome of O. sativa cv. Nipponbare. 45-2 was a wild rice accession grown at Prachinburi Rice Research Center, Rice Department, Thailand. The two accessions were subjected to next-generation sequencing (NGS) in order to find novel INDELs that may be detectable only in non-model accessions. The local accession was selected in order to assess polymorphism in a wild rice population. Based on the Basic Variant Detection data, several rearrangements, insertions, and deletions were presumed to exist. Another chloroplast genome, that of O. meridionalis, was also used to develop INDELs. Except for rufi-cpINDEL2 and meri-cpINDEL5, the INDELs were presumed to be simple four- to six-nucleotide insertions or deletions (Table 2). These INDELs were expected to be less mutable than SSRs (Table 5). One INDEL marker, rufi-cpINDEL3, was annotated as an ATAGAA deletion (Table 2). In fact, the site was flanked by incomplete inverted repeat units similar to ATAGAA. As a result, this INDEL marker generated quite a high number of alleles and high He scores (Table 5, Supplementary Table S6). Oryza glumaepatula and O. barthii carried six and seven alleles at rufi-cpINDEL3, respectively. This marker thus seemed to behave like highly mutable markers such as SSRs.

A phylogenetic tree drawn with the 11 INDELs, excluding rufi-cpINDEL3, is shown in Figure 3. Oryza meridionalis accessions tended to form species-specific clades. Four Oceania O. rufipogon accessions (W1235, W1238, W1239, and W2078) were included in a single clade with O. meridionalis. Two of them, W1235 and W1239, were also grouped with O. meridionalis in the tree constructed with nuclear SSR genotypes (Fig. 2). As mentioned above, these may belong taxonomically to O. meridionalis. In contrast, the other two accessions, W1238 (Papua New Guinea) and W2078 (Australia), were grouped in the tree constructed with nuclear genotypes into another clade, together with Asian O. rufipogon accessions originating from Cambodia (W2263), India (W1681), and Malaya (W0593) and an accession from Australia (W2109). Oryza barthii also showed narrow genetic diversity. This species formed a single clade except for W1702, which was nevertheless close to the remaining O. barthii accessions in the tree. The clade was relatively close to the clade consisting of O. longistaminata and part of O. glumaepatula. Overall, O. glumaepatula formed several subgroups that were scattered between O. rufipogon and O. longistaminata.

Phylogenetic trees obtained from the INDEL markers developed from each species are shown in Supplementary Fig. S2. The tree obtained with the glum-cpINDEL data tended to resemble the tree in Fig. 3. Oryza glumaepatula carried multiple maternal lineages shared partly with O. rufipogon and partly with O. longistaminata. The tree based on the rufi-cpINDEL data tended to be ambiguous, owing to the high mutability of rufi-cpINDEL3. Consequently, the tree based on all 12 INDELs, including rufi-cpINDEL3, was also relatively ambiguous (Supplementary Fig. S3). Overall, INDEL markers other than rufi-cpINDEL3 help to resolve maternal lineages easily and efficiently.

DISCUSSION

Diploid AA genome species are unique among the diploid and tetraploid species of Oryza because both domesticated forms, O. sativa and O. glaberrima, belong to this group. As the major cultigen, O. sativa, is composed of the distinctly diverged indica and japonica, O. rufipogon was presumed to carry high diversity. However, in terms of chloroplast single-nucleotide repeats, the genetic diversity of O. rufipogon was the second highest after O. glumaepatula. RCt markers have helped to distinguish accessions at the subspecies level (Ishii and McCouch, 2000), but they could not give precise resolution when compared at the species level. On the other hand, the species distributions detected by PCA gave an overview of species divergence. However, the phylogenetic relationships were not as precise as those based on INDELs or sequences, because of the high mutability of RCt markers compared with SNPs. Chloroplast DNA is thought to diverge more slowly than nuclear DNA, but the single-nucleotide repeats included in RCt markers have high mutation rates. The sequence of the chloroplast gene rpl16 and its flanking sequences gave a similar result. When we excluded single-nucleotide repeats from the rpl16 sequences, the data yielded a relatively clear phylogenetic tree. The matK sequence did not include such repeats, and the phylogenetic relationships estimated from it were therefore simple. Oryza glumaepatula shared high similarity with O. longistaminata in the chloroplast sequences of matK and rpl16. In contrast, O. glumaepatula shared the same SINE insertions in the nuclear genome with O. barthii, but not with O. longistaminata, as described by Cheng et al. (2002). In contrast to the relationships obtained with chloroplast data and SINE insertions, nuclear SSR genotypes indicated that O. glumaepatula was independent of both O. barthii and O. longistaminata. This was due to the nature of SSR markers, which mutate easily. These data suggest that all of these species have diverged as species, but that their phylogenetic relationships are more complicated than previously considered.

SSR markers possess higher mutability, as described above. They offer good resolution within species or even among landraces (Garris et al., 2005; Ootsuka et al., 2014). In this report, they were used to examine how species are related to each other. Some accessions belonging to different species showed close distances. In the case of O. longistaminata and O. rufipogon, some accessions were included in the same clades, although the p-SINE1-r705 insertion was detected only in O. longistaminata and not in O. rufipogon. This suggests that resolution beyond the species level with SSR markers is not reliable, owing to the mutability of the repeat number at each locus. Species-specific clades were recognized for O. glumaepatula, O. barthii, and O. meridionalis. SSR markers thus gave an overview of the relationships among species, which have diverged as independent species.

As single-nucleotide repeats tend to mutate at a higher rate than other kinds of DNA sequences, we developed INDEL markers from chloroplast genome information. NGS techniques allowed us to obtain genome information from non-model species such as O. glumaepatula and O. meridionalis. Although no complete nuclear genome is available for them, re-sequencing against the chloroplast genome is possible. The high copy number of chloroplast genomes resulted in highly reliable detection of polymorphism. Several INDELs were developed, and relatively large INDELs were selected so that differences would be obvious. Most of the INDELs resulted from a simple insertion or deletion event, except for rufi-cpINDEL3, which showed six and seven alleles in O. glumaepatula and O. barthii, respectively. Several copies of the same motif inside the region targeted by the designed primers may generate such a relatively high number of alleles.

Apart from the highly mutable rufi-cpINDEL3, the INDEL markers made it possible to evaluate polymorphism within species. In particular, INDELs identified in a single accession of O. glumaepatula helped to reveal heterogeneity within that species, and high heterogeneity was indeed detected. The resolution was sufficient to detect subgroups within the species. The groups of O. glumaepatula distinguished by the chloroplast data are thought to reflect its complex evolutionary history, in which O. glumaepatula probably carried multiple maternal origins, shared partly with O. longistaminata and partly with O. rufipogon. This complexity may explain why its He score, estimated with RCt markers, was the highest. During divergence, the proto-O. glumaepatula population, acting as the maternal donor, apparently received substantial gene flow from the proto-O. barthii population before the two diverged as separate species on different continents. Thus, part of O. glumaepatula shares highly similar maternal lineages with O. longistaminata, whereas at the nuclear level, as estimated by SINE insertions, it is highly similar to O. barthii. This inconsistency between nuclear and cytoplasmic composition must have arisen before these species had completely diverged from each other. In the chloroplast INDEL data, all O. barthii accessions were separated from the other maternal lineages. The distinct maternal origin of O. barthii suggests that a minor maternal lineage carried by the ancestral population evolved into O. barthii.
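
The nuclear side of this inconsistency can be illustrated with the p-SINE1 presence/absence states listed in Supplementary Table S5, in which every accession of a species carries the same state at each of the three loci. The sketch below counts the loci at which two species share a state. This simple matching count is used here only for illustration and is not necessarily the measure used in the analysis; the function names are hypothetical.

```python
from itertools import combinations

# Presence (True) or absence (False) of p-SINE1 at loci r705, r801 and r806,
# transcribed from Supplementary Table S5 (fixed within each species).
P_SINE1 = {
    "O. glumaepatula": (False, True, True),
    "O. barthii": (False, False, True),
    "O. longistaminata": (True, False, False),
}


def matching_loci(states_a, states_b):
    """Number of loci at which two species show the same state."""
    return sum(a == b for a, b in zip(states_a, states_b))


def pairwise_matches(states):
    """Matching-locus count for every pair of species."""
    return {
        (s1, s2): matching_loci(states[s1], states[s2])
        for s1, s2 in combinations(states, 2)
    }


if __name__ == "__main__":
    for pair, n_match in pairwise_matches(P_SINE1).items():
        print(pair, n_match)
```

With these states, O. glumaepatula and O. barthii match at two of the three loci, including the shared insertion at r806, whereas O. glumaepatula and O. longistaminata match at none.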

Compared with O. glumaepatula, the Australian O. meridionalis accessions showed the lowest heterogeneity. In the case of O. meridionalis, this may be attributable to a founder effect associated with an early migration from Asia to the Australian continent. This maternal lineage, however, has not been found in Asia; it has probably become extinct there. All accessions of O. rufipogon from Australia and Papua New Guinea shared the same or a similar maternal lineage with O. meridionalis. This suggests that O. meridionalis diverged from Asian O. rufipogon in the past, and that progeny carrying the same maternal lineage but belonging to O. rufipogon may later have spread to Oceania. Oryza rufipogon accessions were scattered among various clades and covered the maternal lineages of the other AA genome species. It is therefore possible that O. rufipogon donated those maternal lineages. Additional chloroplast genome INDELs obtained from NGS data should allow a more detailed understanding of which species was the progenitor of the AA genome species. The current data suggest that it may have been O. rufipogon.
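
The sharing of maternal lineages among species can be tabulated in a simple way from the rpl16 haplotypes in Table 4. The following sketch, with accession counts transcribed from that table, lists the haplotypes shared between groups. It covers the rpl16 SNP haplotypes only, not the cpINDEL genotypes on which the INDEL-based tree is built, and the function names are illustrative.

```python
from itertools import combinations

# Number of accessions per rpl16 haplotype, transcribed from Table 4.
RPL16 = {
    "O. glumaepatula": {2: 8, 3: 7, 4: 5},
    "O. barthii": {5: 16, 6: 4},
    "O. longistaminata": {2: 18, 7: 1},
    "O. rufipogon (Asia)": {7: 23, 2: 7, 8: 2},
    "O. rufipogon (Oceania)": {7: 2, 9: 2, 10: 3},
    "O. meridionalis": {9: 2, 10: 10, 11: 6},
}


def shared_haplotypes(group_a, group_b, table=RPL16):
    """Haplotype numbers found in both groups, in ascending order."""
    return sorted(set(table[group_a]) & set(table[group_b]))


def groups_without_shared_haplotypes(table=RPL16):
    """Groups whose haplotypes occur in no other group."""
    isolated = []
    for group, haplotypes in table.items():
        others = set()
        for other, other_haplotypes in table.items():
            if other != group:
                others.update(other_haplotypes)
        if not set(haplotypes) & others:
            isolated.append(group)
    return isolated


if __name__ == "__main__":
    for a, b in combinations(RPL16, 2):
        common = shared_haplotypes(a, b)
        if common:
            print(f"{a} / {b}: haplotypes {common}")
    print("No shared haplotype:", groups_without_shared_haplotypes())
```

With the Table 4 counts, Oceanian O. rufipogon shares haplotypes 9 and 10 with O. meridionalis, Asian O. rufipogon shares haplotypes with O. glumaepatula, O. longistaminata and Oceanian O. rufipogon, and O. barthii is the only group that shares no rpl16 haplotype with any other.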

ACKNOWLEDGEMENTS

This work was funded by a Grant-in-Aid B (Overseas project, No. 25304021). The valuable wild rice accessions used in this study were distributed by the National Institute of Genetics, supported by the National BioResource Project, MEXT, Japan. The sequencing facility of the Gene Research Center, Hirosaki University, was used.

REFERENCES

Akimoto, M., Shimamoto, Y., and Morishima, H. (1998) Population genetic structure of wild rice Oryza glumaepatula distributed in the Amazon flood area influenced by its life-history traits. Mol. Ecol. 7, 1371-1381.

Brozynska, M., Omar, E. S., Furtado, A., Crayn, D., Simon, B., Ishikawa, R., and Henry, R. J. (2014) Chloroplast genome of novel rice germplasm identified in Northern Australia. Trop. Plant Biol. 7, 111-120.

Chen, X., Temnykh, S., Xu, Y., Cho, Y. G., and McCouch, S. R. (1997) Development of a microsatellite framework map providing genome-wide coverage in rice (Oryza sativa L.). Theor. Appl. Genet. 95, 553-567.

Cheng, C., Tsuchimoto, S., Ohtsubo, H., and Ohtsubo, E. (2002) Evolutionary relationships among rice species with AA genome based on SINE insertion analysis. Genes Genet. Syst. 77, 323-334.

Fuller, D., and Sato, Y-I. (2008) Japonica rice carried to, not from, Southeast Asia. Nat. Genet. 40, 1264-1265.

Garris, A. J., Tai, T. H., Coburn, J., Kresovich, S., and McCouch, S. R. (2005) Genetic structure and diversity in Oryza sativa L. Genetics 169, 1631-1638.

Hiratsuka, J., Shimada, H., Whittier, R., Ishibashi, T., Sakamoto, M., Mori, M., Kondo, C., Honji, Y., Sun, C. R., Meng, B. Y., et al. (1989) The complete sequence of rice (Oryza sativa) chloroplast genome: Intermolecular recombination between distinct tRNA genes accounts for a major plastid DNA inversion during the evolution of the cereals. Mol. Gen. Genet. 217, 185-194.

Huang, X., Kurata, N., Wei, X., Wang, Z. X., et al. (2012) A map of rice genome variation reveals the origin of cultivated rice. Nature 490, 497-501.

Ishii, T., and McCouch, S. R. (2000) Microsatellites and microsynteny in the chloroplast genomes of Oryza and eight other Gramineae species. Theor. Appl. Genet. 100, 1257-1266.

Ishii, T., Xu, Y., and McCouch, S. R. (2001) Nuclear- and chloroplast-microsatellite variation in A-genome species of rice. Genome 44, 658-666.

Ishikawa, R., Yamanaka, S., Kanyavong, K., Fukuta, Y., Sato, Y-I., Tang, L., and Sato, T. (2002a) Genetic resources of primitive upland rice in Laos. Econ. Bot. 56, 192-197.

Ishikawa, R., Sato, Y-I., Tang, T., and Nakamura, I. (2002b) Different maternal origins of Japanese lowland and upland rice populations. Theor. Appl. Genet. 104, 976-980.

Katayama, T. C. (1968) Scientific Reports on the Rice-Collection-Trip to the Philippines, New Guinea, Borneo and Java. Mem. Fac. Agr. Kagoshima Univ. 6, 89-134.

Khush, G. S. (1997) Origin, dispersal, cultivation and variation of rice. Plant Mol. Biol. 35, 25-34.

Li, C., Zhou, A., and Sang, T. (2006) Rice domestication by reducing shattering. Science 311, 1936-1939.

Lin, Z., Griffith, M. E., Li, X., Zhu, Z., Tan, L., Fu, Y., Zhang, W., Wang, X., Xie, D., and Sun, C. (2007) Origin of seed shattering in rice (Oryza sativa L.). Planta 226, 11-20.

Lu, B. R. (1999) First record of the wild rice Oryza meridionalis in Indonesia. Intern. Rice Res. Notes 24(3), 28.

McNally, K. L., Childs, K. L., Bohnert, R., Davidson, R. M., et al. (2009) Genomewide SNP variation reveals relationships among landraces and modern varieties of rice. Proc. Natl. Acad. Sci. USA 106, 12273-12278.

McCouch, S. R., Teytelman, L., Xu, Y., Lobos, K., Clare, K., Walton, M., Fu, B., Maghirang, R., Li, Z., Xing, Y., et al. (2002) Development and mapping of 2240 new SSR markers for rice (Oryza sativa L.). DNA Res. 9, 199-207.

Molina, J., Sikora, M., Garud, N., Flowers, J. M., Rubinstein, S., Reynolds, A., Huang, P., Jackson, S., Schaal, B. A., Bustamante, C. D., Boyko, A. R., and Purugganan, M. D. (2011) Molecular evidence for a single evolutionary origin of domesticated rice. Proc. Natl. Acad. Sci. USA 108, 8351-8356.

Morishima, H. (1969) Phenetic similarity and phylogenetic relationships among strains of Oryza perennis, estimated by methods of numerical taxonomy. Evolution 23, 429-443.

Nakamura, I., Kameya, N., Kato, Y., Yamanaka, S., Jomori, H., and Sato, Y-I. (1997) A proposal for identifying the short ID sequence which addresses the plastid subtype of higher plants. Breed. Sci. 47, 385-388.

Ng, N. Q., Hawkes, J. G., William, J. T., and Chang, T. T. (1981) The recognition of a new species of rice (Oryza) from Australia. Bot. J. Linn. Soc. 82, 327-330.

Nonomura, K-I., Morishima, H., Miyabayashi, T., Yamaki, S., Eiguchi, M., Kubo, T., and Kurata, N. (2010) The wild Oryza collection in National BioResource Project (NBRP) of Japan: History, biodiversity and utility. Breed. Sci. 60, 502-508.

Oka, H-I. (1988) Origin of Cultivated Rice. Japan Scientific Societies Press, Tokyo, Elsevier, Amsterdam.

Ootsuka, K., Takahashi, I., Tanaka, K., Itani, T., Tabuchi, H., Yoshihashi, T., Tonouchi, A., and Ishikawa, R. (2014) Genetic polymorphisms in Japanese fragrant landraces and novel fragrant allele domesticated in northern Japan. Breed. Sci. 64, 115-124.

Panaud, O., Chen, X., and McCouch, S. R. (1996) Development of microsatellite markers and characterization of simple sequence length polymorphism (SSLP) in rice (Oryza sativa L.). Mol. Gen. Genet. 252, 597-607.

Sato, Y-I. (1991) Variation in spikelet shape of the indica and japonica rice cultivars in Asian origin. Japan. J. Breed. 41, 121-134.

Semon, M., Nielsen, R., Jones, M. P., and McCouch, S. R. (2005) The population structure of African cultivated rice Oryza glaberrima (Steud.): Evidence for elevated levels of linkage disequilibrium caused by admixture with O. sativa and ecological adaptation. Genetics 169, 1639-1647.

Sotowa, M., Ootsuka, K., Kobayashi, Y., Hao, Y., Tanaka, K., Ichitani, K., Flowers, J. M., Purugganan, M. D., Nakamura, I., Sato, Y-I., et al. (2013) Molecular relationships between Australian annual wild rice, Oryza meridionalis, and two related perennial forms. Rice 6, 26.

Sweeney, M. T., Thomson, M. J., Cho, Y. G., Park, Y. J., Williamson, S. H., Bustamante, C. D., and McCouch, S. R. (2007) Global dissemination of a single mutation conferring white pericarp in rice. PLoS Genet. 3, e133.

Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., and Kumar, S. (2011) MEGA5: Molecular Evolutionary Genetics Analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731-2739.

Tang, L-H., and Morishima, H. (1997) Genetic characterization of weedy rices and the inference on their origins. Breed. Sci. 47, 153-160.

Vaughan, D. (1994) The Wild Relatives of Rice: A Genetic Resources Handbook. pp. 48-49. IRRI, Manila, Philippines.

Wang, M., Yu, Y., Haberer, G., Reddy, M., et al. (2014) The genome sequence of African rice (Oryza glaberrima) and evidence for independent domestication. Nat. Genet. 46, 982-988.

Wang, Y. P., Bounphanousay, C., Kanyavong, K., Nakamura, I., Sato, Y-I., Sato, T., Zhang, H-S., Tang, L-H., and Ishikawa, R. (2012) Population structural analysis of an in-situ conservation site for wild rice in Laos. Genes Genet. Syst. 87, 311-322.

Waters, D. L. E., Nock, C. J., Ishikawa, R., Rice, N., and Henry, R. J. (2012) Chloroplast genome sequence confirms distinctness of Australian and Asian wild rice. Ecol. Evol. 2, 211-217.

Zhu, Q. H., and Ge, S. (2005) Phylogenetic relationships among A-genome species of the genus Oryza revealed by intron sequences of four nuclear genes. New Phytologist 167, 249-265.

Legends for figures

Fig. 1. PCA among O. sativa (indica, Tm-japonica, and Tr-japonica) and AA genome species. A. Varietal groups and O. rufipogon were compared. B. Oryza sativa, Asian and Oceanian O. rufipogon, and O. meridionalis, a species endemic to Australia, were compared. C. All wild rice species carrying AA genomes were compared. D. African and American species were compared.

Fig. 2. Phylogenetic tree obtained by the neighbor-joining method based on seven SSR loci.

Fig. 3. Phylogenetic tree obtained by the neighbor-joining method based on 11 chloroplast INDEL markers developed by using chloroplast genomes.

Supplementary Fig. S1. Phylogenetic tree obtained by the neighbor-joining method based on RCt genotypes.

Supplementary Fig. S2. Phylogenetic trees obtained by the neighbor-joining method based on different sets of INDEL markers created from different sources of genome sequence data. A. Four chloroplast INDEL markers developed from the O. glumaepatula (W1171) genome were applied for genotyping, and the data were used to create the tree. B. Four chloroplast INDEL markers from the Thai O. rufipogon (45-2) genome were applied for genotyping. C. Four chloroplast INDEL markers from the O. meridionalis genome (GU592208) were applied for genotyping.

Supplementary Fig. S3. Phylogenetic tree obtained by the neighbor-joining method based on 12 chloroplast INDEL markers.

Table 1. Wild rice accessions examined in this experiment

Accession   Rank (1-2-3)   Origin

O. glumaepatula
W1169       1              Cuba
W1171       3              Cuba
W1183       3              British Guiana
W1185       2              Suriname
W1187       2              Brazil
W1189       3              Manaos
W1191       3              Brazil
W1196       2              Colombia
W1477       3              Brazil
W2140       3              Brazil
W2145       1              Brazil
W2149       3              Brazil
W2160       3              Brazil
W2165       3              Brazil
W2173       3              Brazil
W2184       3              Brazil
W2192       3              Brazil
W2199       1              Brazil
W2201       3              Brazil
W2203       3              Brazil

O. barthii
W0042       3              No description
W0652       1              Sierra Leone
W0698       2              Guinea
W0720       2              Mali
W0747       2              Mali
W1050       3              Gambia
W1063       3              No description
W1410       3              Sierra Leone
W1416       3              Sierra Leone
W1443       3              Mali
W1467       3              Cameroun
W1473       3              Tchad
W1574       3              Nigeria
W1583       3              Tchad
W1588       1              Cameroun
W1605       3              Nigeria
W1642       3              Botswana
W1643       3              Botswana
W1646       2              Tanzania
W1702       3              Mali

O. longistaminata
W0643       2              Gambia
W0708       2              Guinea
W1004       3              Ghana
W1232       3              Unknown
W1413       1              Sierra Leone
W1420       3              Mali
W1423       3              Mali
W1444       3              Ivory Coast
W1448       3              Ivory Coast
W1454       3              Upper Volta (Burkina Faso)
W1460       3              Dahomey
W1465       3              Nigeria
W1504       3              Tanganyika
W1508       1              Unknown
W1540       2              Congo
W1570       3              Nigeria
W1573       3              Nigeria
W1624       2              Cameroun
W1650       3              Tanzania

O. meridionalis
W1297       2              Darwin, Australia
W1300       3              Darwin, Australia
W1625       1              Darwin, Australia
W1627       2              Australia
W1631       3              Kununurra area, Australia
W1635       1              Darwin, Australia
W1638       3              Queensland, Australia
W2069       2              Kununurra area, Australia
W2071       3              Kununurra area, Australia
W2077       3              from Darwin to Normanton, Australia
W2079       2              from Darwin to Normanton, Australia
W2080       3              from Darwin to Normanton, Australia
W2081       3              Matarauka, Australia
W2100       3              Queensland, Australia
W2103       2              Queensland, Australia
W2105       3              Queensland, Australia
W2112       3              Queensland, Australia
W2116       3              Queensland, Weipa, North Point, Australia

Asian O. rufipogon
W0106       1              Phulankara, near Cuttack, Orissa, India
W0107       3              Pahala, Orissa, India
W0108       3              Cuttack, Orissa, India
W0120       1              Cuttack, Orissa, India
W0137       3              Kadiam, Andhra, India
W0180       3              Ngao, Lamphang, Thailand
W0593       3              Binjai Rendah, Malaya
W0610       3              Rangoon, Burma (Myanmar)
W0630       2              Magwe, Burma (Myanmar)
W1294       1              Musuan, Mindanao, Philippines
W1551       3              Saraburi, Thailand
W1666       3              Siliguri, India
W1669       3              Orissa, India
W1681       3              Orissa, India
W1685       3              Orissa, India
W1690       3              Chiengrai, Thailand
W1715       3              China
W1807       2              Sri Lanka
W1852       3              Chiang Saen, Thailand
W1865       3              Saraburi, Thailand
W1866       1              Saraburi, Thailand
W1921       1              Saraburi, Thailand
W1939       3              Bangkoknoi, Thailand
W1945       2              No description
W1981       3              Palembang, Indonesia
W2003       1              from Pajani to Bombay, India
W2014       3              India
W2051       2              Hobiganji, Bangladesh
W2263       2              Cambodia
W2265       3              Laos
W2266       3              Laos
W2267       3              Laos

Australian O. rufipogon
W2078       2              from Darwin to Normanton, Australia
W2109       3              Queensland, Australia

New Guinean O. rufipogon
W1230       3              Baad, Koembe, Dutch New Guinea
W1235       3              Taram, Neth. New Guinea
W1236       2              Madang, Australian New Guinea
W1238       3              Koembe River, Neth. New Guinea
W1239       3              Sakor River, Neth. New Guinea

Table 2. List of primers used in this experiment

Type of markers: RCt
Locus   Forward                                 Reverse                             Repeat   Remark
RCt1    CAT CCT TTT CAA TCC AAA ATC A           TGC CTG ATG TAG GGA AAA GC          (A)10    Ishii and McCouch (2000)
RCt2    CTG GGG GGG ATT ATA CCT GT              ATA TCT CTC ATT TCC GAC GCA         (A)11    Ishii and McCouch (2000)
RCt3    TAG GCA TAA TTC CCA ACC CA              CTT ATC CAT TTG GAG CAT AGG G       (A)10    Ishii and McCouch (2000)
RCt4    ACG GAA TTG GAA CTT CTT TGG             AAA AGG AGC CTT GGA ATG GT          (T)12    Ishii and McCouch (2000)
RCt5    ATT TGG AAT TTG GAC ATT TTC G           ACT GAT TCG TAG GCG TGG AC          (T)10    Ishii and McCouch (2000)
RCt6    GAA TTT TAG AAC TTT GAA TTT TTT ACC C   AAG CGT ACC GAA GAC TCG AA          (A)10    Ishii and McCouch (2000)
RCt7    GTG TCA TTC TCT AGG CGA AC              AAA TAT GAC AGA AAA GAA AAA TAG G   (T)10    Ishii and McCouch (2000)
RCt8    ATA GTC AAG AAA GAG GAT CTA GAA T       ACC GCG ATT CAA TAA GAG TA          (T)17    Ishii and McCouch (2000)
RCt9    ATA AGG TTA TTC CCC GCT TAC C           AAA TTG GGG GAA TTC GTA CC          (T)10    Ishii and McCouch (2000)
RCt10   TCT TCA TTT GGA ATC TGG GC              CTA TTG ATG CAA ACG CTG TAC C       (T)10    Ishii and McCouch (2000)

Type of markers: rpl16
Primer            Sequence                              Remark
rpl16-exon1       ATG CTT AGT GTG TGA CTC GTT AG        This study
rpl16-336f        GGT CTA TGA ATT ACA TCA TAA AAA G     This study
rpl16-500f        TTT TTG GAA GCT CCA TTG CGA G         Sotowa et al. (2013)
rpl16-1kb         ATG AGA AGA AAC TCT CAT GTC C         This study
rpl16-486r        CAA TTT CTC AGT TTT ATT AAC TCG G     This study
rpl16-5Preverse   TGT TTA CGA AAT CTG GTT CTT TTG       This study
3P                ATC TGC TAC ATT TAA AAG GGT           Nakamura et al. (1997)

Type of markers: INDEL (cp genome)
Locus           Forward                     Reverse                     Chloroplast     INDEL    Type        Remark       Original data to develop
                                                                        genome region                                     chloroplast INDELs
glum-cpINDEL1   CTCGGACGAATAATCTAATACATGG   CTATGATTCTATGTTCTCCTTAGTG   46087..46091    TATAT    Deletion    This study   W1171 genome data
glum-cpINDEL2   ATATATAGTCAAGAAAGAGGATC     ATGAATTAACAAATAAGACAGG      78424..78429    TTTTTT   Deletion    This study   W1171 genome data
glum-cpINDEL3   CAAAAATTTTCTCATTGAAACAATC   CAATTTGAGTTACGAAACAAGGGAG   103827^103828   GTTTT    Insertion   This study   W1171 genome data
glum-cpINDEL4   TGGCGGCAGTCTCGAAAAAG        CAAGTTCACGAACTAATAAGG       105208^105209   ATTCA    Insertion   This study   W1171 genome data
rufi-cpINDEL1   GGATTCACCGAAACAAACAACC      GCCAAATTGAGCAGGTTGCG        12670..12673    AGGG     Deletion    This study   45-2 genome data
rufi-cpINDEL2   TTTGGGGAAGAAAACATCTTCC      TAAACGGAGAGAATCGACTAAG      14012..14013    AC       Deletion    This study   45-2 genome data
rufi-cpINDEL3   AATTGCTCTCACCGCTCTTTC       TAGTCGAATTGTTGTATCAACTC     17380..17385    ATAGAA   Deletion    This study   45-2 genome data
rufi-cpINDEL4   TAATTTGATATGGCTCGGACG       TGCTATGATTCTATGTTCTCC       46087..46091    TATAT    Deletion    This study   45-2 genome data
meri-cpINDEL2   GCCTTGTTCAGGAACTCGACAG      TTGGTTGTACCATTGCATTTCAG     5852..5856      CAATC    Deletion    This study   GU592208
meri-cpINDEL3   AATGGCGCAATGATCTTGGAGA      GAATGGCGATGGCTCGATTTC       8192^8193       AGAAA    Insertion   This study   GU592208
meri-cpINDEL5   AAGTGTGCCTTGCAACCGAG        AAGCAGCAGAACACCTGAAAC       13566..13567    T        Deletion    This study   GU592208
meri-cpINDEL8   GATATATTTGTGCTGGCATTCTC     TTCCAGTGAAAATCATATGCAC      17379..17383    ATAGAA   Deletion    This study   GU592208

Type of markers: matK, SINE-INDEL and SSR
Type         Locus            Forward                     Reverse                     Remark
matK         matK             TTGATGCAAGAATTGCCTTTCC      AAAATGCAACACCCTGTTCTGACC    This study
SINE-INDEL   p-SINE1 (r705)   TGTTGCGGAACTTGCATTGT        AGAATCAAACTTGACCTGTC        Cheng et al. (2002)
SINE-INDEL   p-SINE1 (r801)   CTTGGCTTATTATTACTGATT       ATGAAAGAATAGCGTAAACAAAT     Cheng et al. (2002)
SINE-INDEL   p-SINE1 (r806)   ATGCAGCTGTAAAGAAGAGT        CAAGATTAAGGCTCATCTGA        Cheng et al. (2002)
SSR          AP003436         GCAGCGAAGCCAACGTAGTCC       CTGCCTTCCCAAACATCTTCTC      Wang et al. (2012)
SSR          AP004212         GGAGGCTCTACTACATATGG        TGGGAAACTATGCATCAGTC        Wang et al. (2012)
SSR          +29Cat           CACGATCTAGAAGACGAGAG        CCAAATTACGCCTTCCTACC        Wang et al. (2012)
SSR          RM3204           GCAACCCTTTCTTCCTCCTC        CCAAGGAGAGCGCACTAGC         McCouch et al. (2002)
SSR          RM257            CAGTTCCGAGCAAGAGTACTC       GGATCGGACGTGGCATATG         Chen et al. (1997)
SSR          RM3577           CCGATCCCATTCACAGATTC        CAGTGCCTTGATCGATGTTG        McCouch et al. (2002)
SSR          RM17             TGCCCTGTTATTTTCTTCTCTC      GGTGATCCTTTCCCATTTCA        Panaud et al. (1996)

Table 3. Expected heterozygosity (He) evaluated with chloroplast microsatellite markers among AA genome species

Species             No.   RCt1    RCt3    RCt5    RCt6    RCt8    RCt9    Average
Cultigen
O. sativa           40    0.049   0.635   0.495   0.489   0.569   0.644   0.480
  indica            20    0.000   0.180   0.180   0.095   0.095   0.000   0.092
  japonica          20    0.095   0.480   0.000   0.320   0.530   0.480   0.318
  Tm-J              10    0.180   0.180   0.000   0.000   0.460   0.480   0.217
  Tr-J              10    0.000   0.420   0.000   0.480   0.540   0.320   0.293
Wild
O. rufipogon        39    0.650   0.537   0.264   0.375   0.570   0.670   0.528
  Asia              32    0.671   0.554   0.273   0.387   0.588   0.692   0.511
  Oceania           7     0.000   0.245   0.000   0.000   0.245   0.408   0.150
O. meridionalis     18    0.444   0.401   0.000   0.105   0.204   0.105   0.210
O. glumaepatula     20    0.810   0.585   0.515   0.255   0.445   0.640   0.542
O. barthii          20    0.345   0.335   0.000   0.000   0.000   0.375   0.176
O. longistaminata   19    0.637   0.465   0.000   0.188   0.283   0.548   0.354

Table 4. SNPs, excluding single-nucleotide repeats, detected in the rpl16 sequence (Intron 1 and Exon 2) among wild accessions of AA genome species

Species                 No.  655  676  770^771  779^780  1180  1288  1397-1398  1430  1450  Haplotype  Accessions
O. sativa               1    A    T    -        -        G     C     CG         A     G     1          Nipponbare
O. glumaepatula         8    A    T    -        -        G     C     GC         A     G     2          W1171, W1183, W1185, W1196, W1477, W2199, W2201, W2203
                        7    A    T    -        -        G     C     GC         G     G     3          W1191, W1169, W1187, W2140, W2145, W2149, W2192
                        5    T    T    -        -        G     C     GC         G     G     4          W1189, W2160, W2165, W2173, W2184
O. barthii              16   A    T    -        T        G     C     GC         A     G     5          W0042, W0652, W0698, W0720, W0747, W1050, W1063, W1410, W1416, W1443, W1467, W1473, W1583, W1588, W1605, W1702
                        4    A    T    -        T        A     C     GC         A     G     6          W1574, W1642, W1643, W1646
O. longistaminata       18   A    T    -        -        G     C     GC         A     G     2          W0643, W0708, W1004, W1232, W1413, W1420, W1423, W1444, W1448, W1454, W1460, W1465, W1504, W1508, W1540, W1570, W1624, W1650
                        1    A    T    -        -        G     C     GT         A     G     7          W1573
O. rufipogon (Asia)     23   A    T    -        -        G     C     GT         A     G     7          W0106, W0108, W0137, W0180, W1294, W1551, W1666, W1669, W1681, W1685, W1690, W1715, W1807, W1852, W1865, W1866, W1921, W1939, W1981, W2051, W2263, W2265, W2266
                        7    A    T    -        -        G     C     GC         A     G     2          W0107, W0120, W0593, W0610, W0630, W1945, W2267
                        2    A    T    -        -        G     C     GC         A     C     8          W2003, W2014
O. rufipogon (Oceania)  2    A    T    -        -        G     C     GT         A     G     7          W1230, W1236
                        2    A    T    C        -        G     C     GC         A     G     9          W2078, W2109
                        3    A    T    C        -        G     T     GC         A     G     10         W1235, W1238, W1239
O. meridionalis         2    A    T    C        -        G     C     GC         A     G     9          W1635, W2116
                        10   A    T    C        -        G     T     GC         A     G     10         W1631, W1638, W2069, W2071, W2080, W2081, W2100, W2103, W2105, W2112
                        6    A    A    C        -        G     C     GC         A     G     11         W1297, W1300, W1625, W1627, W2077, W2079

Table 5. Expected heterozygosity (He) detected among AA genome species by using cpINDELs generated from NGS data

Marker                    O. glumaepatula  O. barthii  O. longistaminata  O. rufipogon (Asia)  O. rufipogon (Oceania)  O. meridionalis
glum-cpINDEL1             0.500            0.095       0.000              0.389                0.000                   0.000
glum-cpINDEL2             0.375            0.000       0.100              0.498                0.000                   0.105
glum-cpINDEL3             0.375            0.000       0.100              0.000                0.000                   0.000
glum-cpINDEL4             0.375            0.000       0.100              0.000                0.000                   0.000
rufi-cpINDEL1             0.000            0.000       0.000              0.342                0.490                   0.000
rufi-cpINDEL2             0.000            0.000       0.000              0.170                0.000                   0.000
rufi-cpINDEL3             0.750            0.759       0.100              0.117                0.000                   0.000
rufi-cpINDEL4             0.000            0.000       0.000              0.342                0.000                   0.000
meri-cpINDEL2             0.495            0.000       0.000              0.000                0.408                   0.000
meri-cpINDEL3             0.000            0.000       0.188              0.000                0.000                   0.000
meri-cpINDEL5             0.000            0.000       0.000              0.000                0.000                   0.000
meri-cpINDEL8             0.000            0.000       0.488              0.117                0.408                   0.000
Average He over 11 loci*  0.193            0.009       0.089              0.169                0.119                   0.010
Average He over 12 loci   0.239            0.071       0.090              0.165                0.109                   0.009

* rufi-cpINDEL3 was excluded from the scores because of its high mutability.

[Fig. 1. Principal coordinates plots, panels A-D (axes: Coord. 1 and Coord. 2). Plotted groups: indica, Tm-J, Tr-J, rufipogon, oceania, meridionalis, longistaminata, glumaepatula, barthii.]

[Fig. 2 and Fig. 3. Neighbor-joining trees of the accessions listed in Table 1 (scale bar, 0.1).]

Supplementary Table S1. Oryza sativa accessions used as a control population to estimate genetic diversity, consisting of indica (I), Tropical-japonica (Tr-J), and Temperate-japonica (Tm-J)

Accession    Group   Origin
IR36         I       IRRI
1            I       Vietnam
101          I       Taiwan
108          I       Taiwan
130          I       Taiwan
414          I       India (Pattambi)
415          I       India (Aduturai)
417          I       India (Aduturai)
420          I       India (Maruter)
421          I       India (Pattambi)
435          I       Ceylon
440          I       India (Bengal)
706          I       North China
710          I       Central China
715          I       Central China
719          I       Central China
724          I       South China
729          I       South China
761          I       Hai-Nan-Tao
868          I       Mountain of Taiwan
Nipponbare   Tm-J    Japan
201          Tr-J    Philippines
206          Tr-J    Philippines
220          Tr-J    Philippines
221          Tr-J    Philippines
224          Tr-J    Philippines
504          Tm-J    Taiwan
563          Tm-J    Japan
624          Tr-J    Celebes Is.
642          Tr-J    Celebes Is.
708          Tm-J    North China
709          Tm-J    Central China
712          Tm-J    Central China
718          Tr-J    Central China
757          Tr-J    Hai-Nan-Tao
848          Tm-J    Mountain of Taiwan
871          Tr-J    Mountain of Taiwan
Up1          Tm-J    Japan
Up3          Tm-J    Japan
Up4          Tm-J    Japan

Supplementary Table S2. Number of alleles (Na) and effective number of alleles (Ne) of RCt markers showing polymorphism

Species             No.   RCt1       RCt3       RCt5       RCt6       RCt8       RCt9
                          Na Ne      Na Ne      Na Ne      Na Ne      Na Ne      Na Ne
O. sativa
  indica            20    1  1.000   2  1.220   2  1.220   2  1.105   2  1.105   2  1.105
  japonica          20    2  1.105   2  1.923   1  1.000   2  1.471   4  2.128   2  1.923
O. glumaepatula     20    7  5.263   3  2.410   3  2.062   2  1.342   3  1.802   4  2.778
O. barthii          20    4  1.527   3  1.504   1  1.000   1  1.000   1  1.000   2  1.600
O. longistaminata   19    3  2.756   2  1.870   1  1.000   2  1.232   4  1.394   3  2.215
O. rufipogon        39    7  3.376   3  1.966   3  1.865   2  1.536   3  2.381   6  3.687
  Asia              32    6  2.860   3  2.160   2  1.358   2  1.600   3  2.327   6  3.030
  Oceania           7     1  1.000   2  1.324   1  1.000   1  1.000   2  1.324   2  1.690
O. meridionalis     18    2  1.800   2  1.670   1  1.000   2  1.117   3  1.256   2  1.117

Supplementary Table S3. SNPs in matK sequence among AA genome species in Africa and America

Species             No.   516nt   684nt   Accessions
O. sativa           1     A       C       Nipponbare
O. glumaepatula     20    A       C       All accessions
O. barthii          20    C       C       All accessions
O. longistaminata   18    A       C       All accessions except for W1448
                    1     A       T       W1448

Supplementary Table S4. Expected heterozygosity (He) evaluated with seven SSR loci

Species             No.   AP003436  AP004212  RM3204    29Cat     RM257     RM3577    RM17      Average/loci
O. glumaepatula     20    0.744     0.000     0.615     0.800     0.889     0.095     0.645     0.541
O. barthii          20    0.186     0.521     0.805     0.366     0.908     0.799     0.415     0.571
O. longistaminata   19    0.873     0.314     0.360     0.658     0.698     0.711     0.839     0.636
O. rufipogon        39*   0.931     0.839     0.568     0.714     0.922     0.167     0.894     0.719
  Asia              32    0.924     0.830     0.511     0.639     0.917     0.174     0.895     0.699
  Oceania           7     0.735     0.694     0.694     0.816     0.776     0.133     0.776     0.660
O. meridionalis     18    0.000     0.000     0.105     0.278     0.105     0.105     0.105     0.100
Average/Species*          0.547     0.335     0.491     0.563     0.704     0.375     0.580     0.514

* To calculate the averages, the 39 O. rufipogon accessions (32 from Asia and 7 from Oceania) were regarded as one species.

Supplementary Table S5. Presence (+) and absence (-) of p-SINE1* retrotransposable elements at three loci among AA genome species in Africa and America

Species             No.   (r705)     (r801)     (r806)
                          +    -     +    -     +    -
O. glumaepatula     20    0    20    20   0     20   0
O. barthii          20    0    20    0    20    20   0
O. longistaminata   19    19   0     0    19    0    19

* p-SINE1 insertions were confirmed with a method similar to that described in Cheng et al. (2002).

Supplementary Table S6. Number of alleles (Na) and effective number of alleles (Ne) in chloroplast cpINDELs generated from NGS data

Marker          O. glumaepatula  O. barthii  O. longistaminata  O. rufipogon (Asia)  O. rufipogon (Oceania)  O. meridionalis
                Na Ne            Na Ne       Na Ne              Na Ne                Na Ne                   Na Ne
glum-cpINDEL1   2  2.000         2  1.105    1  1.000           3  1.636             1  1.000                1  1.000
glum-cpINDEL2   2  1.600         1  1.000    2  1.111           3  1.992             1  1.000                2  1.117
glum-cpINDEL3   2  1.600         1  1.000    2  1.111           1  1.000             1  1.000                1  1.000
glum-cpINDEL4   2  1.600         1  1.000    2  1.111           1  1.000             1  1.000                1  1.000
rufi-cpINDEL1   1  1.000         1  1.000    1  1.000           2  1.519             2  1.960                1  1.000
rufi-cpINDEL2   1  1.000         1  1.000    1  1.000           2  1.205             1  1.000                1  1.000
rufi-cpINDEL3   6  4.000         7  4.145    2  1.111           2  1.133             1  1.000                1  1.000
rufi-cpINDEL4   1  1.000         1  1.000    1  1.000           2  1.519             1  1.000                1  1.000
meri-cpINDEL2   2  1.980         1  1.000    1  1.000           1  1.000             2  1.690                1  1.000
meri-cpINDEL3   1  1.000         1  1.000    2  1.232           1  1.000             1  1.000                1  1.000
meri-cpINDEL5   1  1.000         1  1.000    1  1.000           1  1.000             1  1.000                1  1.000
meri-cpINDEL8   1  1.000         1  1.000    2  1.951           2  1.133             2  1.690                1  1.000

[Supplementary Fig. S1. Phylogenetic tree obtained by the neighbor-joining method based on RCt genotypes (scale bar, 0.1).]

[Tree figure with panels A, B and C; leaf labels are the accession names listed in Table 1.]
meridionalis (W2071) glumaepatula (W1477) meridionalis (W2116) meridionalis (W2077) glumaepatula (W1196) meridionalis (W2112) meridionalis (W2079) glumaepatula (W2203) meridionalis (W2105) meridionalis (W2080) glumaepatula (W1169) meridionalis (W2103) meridionalis (W2081) glumaepatula (W1191) meridionalis (W2100) meridionalis (W2100) glumaepatula (W2192) meridionalis (W2081) meridionalis (W2103) glumaepatula (W2149) meridionalis (W2080) meridionalis (W2105) glumaepatula (W2145) meridionalis (W2079) meridionalis (W2112) glumaepatula (W1187) meridionalis (W2077) meridionalis (W2116) glumaepatula (W2140) meridionalis (W2071) glumaepatula (W1187) rufipogon (W1294) meridionalis (W2069) glumaepatula (W1169) rufipogon (W1551) meridionalis (W1638) glumaepatula (W2203) rufipogon (W1669) meridionalis (W1635) glumaepatula (W1196) Supplementary Fig. S2. Phylogenetic tree rufipogon (W1681) meridionalis (W1631) glumaepatula (W1477) rufipogon (W1685) meridionalis (W1627) glumaepatula (W2173) rufipogon (W1690) meridionalis (W1625) glumaepatula (W2184) obtained by the neighbor-joining method rufipogon (W1715) meridionalis (W1300) glumaepatula (W1183) rufipogon (W1807) meridionalis (W1297) glumaepatula (W1185) rufipogon (W1852) oceania (W2078) barthii (W0652) based on different sets of INDEL markers rufipogon (W1865) oceania (W1239) barthii (W0747) rufipogon (W1921) oceania (W1235) barthii (W1050) rufipogon (W1939) oceania (W1238) barthii (W1410) created from different sources of genome rufipogon (W1981) glumaepatula (W2199) barthii (W1416) rufipogon (W2051) barthii (W1050) barthii (W1467) rufipogon (W2266) rufipogon (W0106) barthii (W1473) sequence data. A. Four chloroplast INDEL oceania (W1230) rufipogon (W0107) barthii (W1574) oceania (W1235) rufipogon (W0108) barthii (W1583) oceania (W1236) rufipogon (W0137) barthii (W1588) markers developed from O. 
glumaepatula oceania (W1238) rufipogon (W0180) barthii (W1642) oceania (W1239) rufipogon (W0593) barthii (W1643) oceania (W2078) rufipogon (W1294) barthii (W1646) (W1171) genome were applied for oceania (W2109) rufipogon (W1551) barthii (W1702) meridionalis (W1297) rufipogon (W1666) barthii (W0698) meridionalis (W1300) rufipogon (W1669) barthii (W0720) genotyping, and the data were used to create meridionalis (W1625) rufipogon (W1681) barthii (W1443) meridionalis (W1627) rufipogon (W1685) barthii (W1605) meridionalis (W1631) rufipogon (W1690) barthii (W0042) the tree. B. Four chloroplast INDEL markers meridionalis (W1635) rufipogon (W1715) barthii (W1063) meridionalis (W1638) rufipogon (W1807) longistaminata (W1460) from Thai O. rufipogon (45-2) genome were meridionalis (W2069) rufipogon (W1852) longistaminata (W1004) meridionalis (W2071) rufipogon (W1865) longistaminata (W1420) meridionalis (W2077) rufipogon (W1866) longistaminata (W1448) applied for genotyping. C. Four chloroplast meridionalis (W2079) rufipogon (W1921) longistaminata (W1454) meridionalis (W2080) rufipogon (W1939) longistaminata (W1465) meridionalis (W2100) rufipogon (W1981) longistaminata (W1570) INDEL markers from the O. meridionalis meridionalis (W2103) rufipogon (W2051) longistaminata (W1423) meridionalis (W2105) rufipogon (W2263) longistaminata (W1650) meridionalis (W2112) rufipogon (W2265) longistaminata (W1444) genome (GU592208) were applied for meridionalis (W2116) rufipogon (W2266) longistaminata (W1573) 0.1 0.1 0.1 genotyping. 
oceania (W1235) oceania (W1238) oceania (W1239) oceania (W2078) meridionalis (W1297) meridionalis (W1300) meridionalis (W1625) meridionalis (W1627) meridionalis (W1631) meridionalis (W1635) meridionalis (W1638) meridionalis (W2069) meridionalis (W2071) meridionalis (W2077) meridionalis (W2079) meridionalis (W2080) meridionalis (W2100) meridionalis (W2103) meridionalis (W2105) meridionalis (W2112) meridionalis (W2116) meridionalis (W2081) oceania (W2109) oceania (W1230) oceania (W1236) glumaepatula (W1187) glumaepatula (W1169) glumaepatula (W2203) glumaepatula (W1477) glumaepatula (W2192) glumaepatula (W1191) glumaepatula (W2149) glumaepatula (W2140) glumaepatula (W2145) longistaminata (W1573) glumaepatula (W1196) glumaepatula (W2160) glumaepatula (W2165) glumaepatula (W1189) glumaepatula (W2173) glumaepatula (W2184) barthii (W1702) barthii (W0747) barthii (W1467) barthii (W1473) barthii (W1574) barthii (W1642) barthii (W1643) barthii (W0042) barthii (W1646) barthii (W0698) barthii (W1063) barthii (W1583) barthii (W1588) barthii (W0720) barthii (W1605) barthii (W1443) barthii (W1416) barthii (W0652) barthii (W1410) barthii (W1050) glumaepatula (W1171) glumaepatula (W2201) glumaepatula (W2199) glumaepatula (W1183) glumaepatula (W1185) longistaminata (W0708) longistaminata (W1232) longistaminata (W1540) longistaminata (W1413) longistaminata (W0643) longistaminata (W1624) longistaminata (W1504) longistaminata (W1508) longistaminata (W1444) longistaminata (W1650) longistaminata (W1423) longistaminata (W1570) longistaminata (W1465) longistaminata (W1454) longistaminata (W1448) longistaminata (W1420) longistaminata (W1460) longistaminata (W1004) rufipogon (W0120) rufipogon (W2267) rufipogon (W2003) rufipogon (W2014) sativa (Nipponbare) rufipogon (W1945) rufipogon (W0610) rufipogon (W0630) rufipogon (W0593) rufipogon (W1666) rufipogon (W0180) rufipogon (W0137) rufipogon (W0108) rufipogon (W0106) rufipogon (W0107) rufipogon (W2265) rufipogon (W1866) rufipogon (W2263) 
rufipogon (W1294) rufipogon (W1551) rufipogon (W1669) rufipogon (W1681) rufipogon (W1685) rufipogon (W1690) rufipogon (W1715) rufipogon (W1807) rufipogon (W1852) rufipogon (W1865) rufipogon (W1921) Supplementary Fig. S3. Phylogenetic tree obtained by the neighbor-joining rufipogon (W1939) rufipogon (W1981) rufipogon (W2051) method based on 12 chloroplast INDEL markers. rufipogon (W2266) 0.1
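The neighbor-joining trees in Supplementary Figs. S1 to S3 are built from marker genotypes. As an illustration only, the following minimal Python sketch shows how such a tree can be derived from presence/absence INDEL genotypes. The captions do not state the distance measure or the software used, so the mismatch-proportion distance, the function names (`indel_distance`, `distance_table`, `neighbor_joining`) and the accession genotypes in the usage note are all assumptions made for this example, not the authors' procedure.

```python
from itertools import combinations


def indel_distance(g1, g2):
    """Fraction of INDEL markers at which two genotypes differ (assumed measure)."""
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotypes must be non-empty and of equal length")
    return sum(a != b for a, b in zip(g1, g2)) / len(g1)


def distance_table(genotypes):
    """Pairwise distances keyed by frozenset({name1, name2})."""
    return {
        frozenset((x, y)): indel_distance(genotypes[x], genotypes[y])
        for x, y in combinations(genotypes, 2)
    }


def neighbor_joining(names, dist):
    """Classic neighbor-joining (Saitou and Nei).

    names: list of taxon labels (at least two).
    dist:  dict mapping frozenset({x, y}) to the distance between x and y.
    Returns a list of (node, neighbor, branch_length) edges of the unrooted tree.
    Internal nodes are labelled node0, node1, ... (hypothetical labels).
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        total = {x: sum(d[frozenset((x, y))] for y in nodes if y != x) for x in nodes}
        # Join the pair that minimises the Q criterion (first pair wins ties).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - total[p[0]] - total[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from the joined pair to the new internal node.
        li = 0.5 * dij + (total[i] - total[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_id}"
        next_id += 1
        edges += [(i, u, li), (j, u, lj)]
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges
```

With hypothetical genotypes such as `{"acc1": (1, 0, 1, 1), "acc2": (1, 0, 0, 1), "acc3": (0, 1, 0, 0)}`, calling `distance_table` and then `neighbor_joining` on the accession names returns the edge list of the unrooted tree. With only four markers, many accessions share identical genotypes and collapse to zero-length branches.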