Accepted Article Lanzhou, China. Lanzhou, China. University, Chengdu, China. This by is protectedreserved.article Allrights copyright. doi: 10.1111/mec.14097 lead to differences between this version and the throughbeen the copyediting, pagination typesetting, which process, may and proofreading This hasarticle been accepted publicationfor andundergone peerfull reviewbutnot has [email protected]. E-mail: 18011455756. +86 Phone: China. Chengdu, University, Liu, Sichuan Correspondence:*for Jianquan Author † 4 3 2 Xin Luo ancestor unknown an and congener diploid only its from Chasing ghosts: Allopolyploid origin of Article :Original type Article 02-Mar-2017 : Date Accepted Revised: 02-Mar-2017 Date Received :30-Nov-2016 Date PROF. JIANQUAN LIU: 0000-0003-4277-9772) (Orcid ID DR. QUANJUN HU: (Orcid ID 0000-0001-6922-2144)

1 Liu Both authors contributed equally to this equally work this to authorscontributed Both Research Center for Medicine and Biology, Zunyi Medical University, Zunyi, China. Zunyi, China. Medicine University, ZunyiResearch for and Biology, Medical Center Fife KY169TH, UK. StAndrews, StAndrews, School of Biology, University University, Lanzhou Life College of Science, Agro-Ecosystem, Grassland State of Laboratory Key College LifeSichuan of andEco-environment, Science, MOE Laboratory for Bio-resources Key 1, 2* 1† , Quanjun Hu , Quanjun 1, 2† , Pingping Zhou, Pingping 1 , Dan Zhang

Version of Record. Please this as Version cite ofRecord. article sinensis sinensis Oxyria 1 , Qian Wang 1,4 , Richard J. Abbott J. Richard , () 3 , Jianquan , Jianquan Accepted Article adaptation of sinensis O. ‘ghost’ tetraploidspecies theorigin implicated of in of theunknown changes. climatic The extinction Quaternary inresponse to ranges distributional their in specieschanges both experienced that suggested modeling similarly niche Ecological oscillations. bothcontracted of species that the distributions second polyploidization event, whereas whereas event, second polyploidization indicated that indicated that present in in present sinensis million years ago (Ma), genomea second occurred duplication Maapproximately 6 giveto rise to analysis indicated that divergencefollowing from chromosomes from O from chromosomes (2 niche modeling were conducted for this purpose. GISH revealed that revealed GISH purpose. this for niche conducted were modeling situ This by is protectedreserved.article Allrights copyright. genome in the polyploid. In this study, we aimed to determine the polyploid origin of origin of Inthe study, we todetermine polyploid this aimed genome polyploid. inthe its of presence the through existenceisrecognised prior its inthat species abe as‘ghost’ treated of donor a may in extinct genome the circumstances the found polyploid becomesuch extinct. Under has an ancestor when challenging of isparticularly species a the origin polyploid Reconstructing Abstract n =40) for which only one congeneric species is known, i.e. diploid i.e. isknown, which species one only =40) for congeneric hybridization (GISH), transcriptome, phylogenetic and demographic analyses, and ecological and ecological analyses, demographic and phylogenetic transcriptome, (GISH), hybridization . Oxyria sinensis

O . inaddition digyna to a set thatclustered with those in O. sinensis O. . and 26 chromosomes from an unknown ancestor. Transcriptome Transcriptome ancestor. unknown an from chromosomes 26 and digyna was shown to contain homologous gene sequences divergent from from those divergent sequences homologousgene contain shown to was to repeated climatic changes in the region where it now occurs. where now it region in the changes climatic torepeated expanded its distribution approximately 6-7 Ma, possibly following the following approximately Ma, 6-7 distribution possibly expanded its O . digyna and re-expanded during the Pleistocene climatic climatic thePleistocene andre-expanded during expanded its range much later. It was also indicated It wasalsoindicated later. much range its expanded O . O. sinensis O. , involving genome duplication around 12 12 around duplication genome involving , digyna

could have resulted from superior superior from have resulted could O . O. sinensis O. O. digyna(2 . Coalescent simulations simulations Coalescent . digyna comprised 14 14 comprised n =14). Genomic =14). Oxyria sinensis O in .

Accepted Article This by is protectedreserved.article Allrights copyright. diversification (Grant 1981; Stebbins 1985; Otto 2007; Otto 2007; Soltis 1985; Stebbins 1981; (Grant diversification Introduction history words Key examined diploid angiosperms show genome duplication in their ancestry (Fawcett ancestry in show genome (Fawcett duplication their angiosperms examined diploid formation (Ramsey&Schemske Barker 1998; allopolyploid for that than bemuch higher to thought is formation autopolyploid of rate the Though Grant 1971;Lewis1981). (Stebbins 1980; a of via single genome doubling (autopolyploidy) lineage can Polyploids be hybridization originatefollowing congeners (Wood for reported numbers chromosome based lowest polyploidy, on the via directly originated have to considered are species angiosperm extant of 15% than more two types (Barker two types (Barker the near of amongparity indicates vascular and allopolyploids frequencies autopolyploids of originate more easily in allopolyploids than autopolyploids (Barker autopolyploids than moreoriginate inallopolyploids easily Levin Parisodmay Ramsey&which 2016), & (Fowler 1984; Broennimann progenitors 2011; its from divergence niche by instant aided is polyploid a of The establishment autopolyploids. advantage over an evolutionarymay hold Thisallopolyploids that suggests 2016). Sherman-Broyles Polyploidy, or whole-genome duplication (WGD), has played a key role in evolution and (WGD), evolution a playedhas whole-genome inplant or key role duplication Polyploidy, : Oxyria et al. et , allopolyploid speciation, ghost species, GISH, transcriptome, demographic demographic transcriptome, GISH, species, ghost speciation, allopolyploid , 2016), while in some genera allopolyploids are more frequent &(Doyle are more frequent while allopolyploids genera some in 2016), et al. tween two different lineagestween twodifferent or (allopolyploidy) 2016), a recent assessment of the relative therelative of assessment a recent 2016), et al. et

2014; Soltis & Soltis 2016). All 2016). Soltis Soltis 2014; & et al. 2016). Moreover, 2016). Moreover, et al. et al. 2009), while while 2009), 2009). 2009). Accepted Article This by is protectedreserved.article Allrights copyright. 2000; Mittermeier Mittermeier 2000; (Myers hotspot abiodiversity as recognized is and species plant endemic numerous contains Miocene and deep valleys created by uplifts of the Himalaya and Qinghai-Tibet Plateau since themiddle This withmultipleregion S1). mountains theEasthigh Himalaya (Fig. to native species herbaceous sequenceanalysis. this However, usuallyis annot due the option to ofsuchunavailability fossils. through directly ancestor an extinct identify to maybe possible it extracted DNA canbe which from are available fossilsIf it) part of in genome polyploid. (or the its of presence the recognised through existence its is prior in that may a species in polyploid asthe ‘ghost’ a be found genome) treated of apart of donor genome (or theextinct such a Under of is extinct. circumstances polyploid ancestor challenging when an becomes particularly such resolution However, their origin. evolved following (Barker progenitors their from genomesinherited more divergent or two of possession their to due readily change more may adaptto environmental environments a of range and allopolyploids wider tolerate the area (Liu the area of species ranges have occurred (Brochmann and expansions frequent contractions inparallel, during the and where, climatic Quaternary oscillation addition to reproducing sexually can form large clones on cliff faces and crevices through rhizome and crevices through faceson cliff large canform clones sexually to addition reproducing in which and female maleplants, hermaphrodite, containing It istrioecious, slopes mountains. high of 1600 from m to3800 ranging hot valleys m at altitudes a.s.l. (Meng Resolving the ancestry of polyploid species is important to an understanding of how such species to of an understanding species isimportant polyploid of Resolving ancestry the In the study reported here, we examined the origin of of origin the examined we here, reported study the In et al. 2016). 2016). et al. 2001;Liu Nie 2004; et al. 2004; Wilson 1992; Wu 1998). However, polyploids are not as widespread in as are 2004;not in widespread WilsonWu However,polyploids 1992; 1998). et al. et 2005) as might be expected for a region subjected to to subjected as 2005) a region be might for expected et al. 2004). Oxyria sinensis Oxyria sinensis

et al. 2015) and rarely on the the and on rarely 2015) mainly mainly occursindry and (Polygonaceae), a perennial perennial a (Polygonaceae), Accepted Article changes? did This by is protectedreserved.article Allrights copyright. and what effect might polyploidization have of history and hadonthedemographic might polyploidization effect what that al. tolerancedrought (Yang formation (Liu terms of its progenitors? (iii) How do the demographic histories of of histories (iii) demographic Howdothe progenitors? its terms of What allopolyploidy? (ii) or autopolyploidy istheancestry of through originate, i.e. How (i) did questions: toaddress following the (ENM) modeling niche also ecological genomic that evidence phylogenetic S1). There before (Fig. Pleistocene the is also distribution inits became circumarctic America, and North Russia Western and into andeastward into Europe westward Himalaya, migrated (Wang analysis to a phylogeographic According recent (Meng site same the at sympatrically occur never they (Fig.S1), ranges in overlapping with East the Himalaya species Although distributed are (Fig.both S1). 4500 m a.s.l. in Eurasia Americaand North rangesof the mountain and Arctic throughout the distributed species widely perennial herbaceous, which isa hermaphroditic, 2012). We began the research reported here by conducting a cytological analysis which confirmed whichconfirmed analysis here a cytological byconducting We reported 2012). research began the O. sinensis O. O. digyna O. in situ O. digyna O. was diploid was(2 diploid and and et al. et hybridization (GISH), transcriptome, phylogenetic and demographic analyses, and analyses, demographic and phylogenetic transcriptome, (GISH), hybridization O. digyna O. 2007; Zhao & Yang 2008). The three sex types were recently shown to differ in in differ to shown were recently sex The three types &2007; Zhao 2008). Yang andO.sinensis et al. n change their distributions in response to earlier and recent climatic and climatic recent toearlier response in distributions change their 2014). The only congener to to congener Theonly 2014). =14) whereas whereas =14) diverged from each other ~12 million years(Sun (Ma) millionago ~12 other from each diverged O. sinensis O. diverse from at habitats seaelevations level to was polyploid (2 et al. O. sinensis O. 2016), 2016),

O. sinensis O. digyna O. is et al. n O. digyna =40). We then conducted conducted We=40). then 2015, Wang 2015, O. digyna andO. originated in the East East the in originated O. sinensis (Mountain sorrel), O. sinensis O. O. sinensis O. et al et ? (iv) How ? (iv) compare . 2016). 2016). . in

et et Accepted Article squashes were counted for each individual using an OLYMPUS BX53 microscope. an using OLYMPUS microscope. counted for each individual BX53 were squashes tip at root of well-spread three least metaphase at in 3:1 –acetic Chromosomes mitotic ethanol acid. storing and beforeh fixing 2–3 w/v) for (0.01% colchicine with pretreated were roots and germinated Seeds were each of 1994). individual & Islam-Faridi (Jewell numbers determine chromosome Table of S1) 1,3and4, Table (populations of S1) Cytological analysis and genomicCytological analysis. transcriptome the in comparison for China, Province, Gansu This by is protectedreserved.article Allrights copyright. common garden. We also collected fresh leaves of of fresh leaves Wealsocollected garden. common a stage a mature to in grown plants to produce germinated were seeds analyses, and transcriptome For cytological insilicagel. thefield,and stored in leaves dried were fresh For molecular analysis, S1). for were each (Table (Garmin) latitude, taken usingmonitor and longitude locality anEtrex GIS O. sinensis of Nine populations speciesinthe East both Himalaya. of thethroughout distributions from analyses Plant material Material andmethods For cytological analysis, at least three individuals representing each of three populations three of populations representing each individuals three least at For cytological analysis, We collected fresh leaves and seed of O. seedof and sinensis fresh leaves Wecollected and 12 populations of of and 12populations were examined. Cytological analyses were performed using root tips to to tips root using performed were analyses Cytological examined. were O. digyna wereO. digyna sampled (Fig. S2, Table S1). Records of altitude, in situ O. sinensis O. hybridization(GISH) and three populations (populations 13, 16 and 19, 16and19, (populations 13, populations andthree grown at Lanzhou University, Lanzhou University, at grown tanguticum Rheum O. digyna andO.

for molecular and cytological

Accepted Article I (New England BioLabs). Sequencing was performed on an Illumina HiSeq 2000. After removing onan 2000. After HiSeq Illumina was Sequencing performed I BioLabs). England (New DNA and polymerase H (Invitrogen) RNase primers, hexamer random with (Invitrogen) transcriptase reverse using out carried was cDNAsynthesis and second strand First at 75°C. divalent with cations bp by ~200 fragments treatment cleaved into Thiswassubsequently mRNA. extract polyA-tagged 20 Bioanalyzer. for of oneindividual (Fig. of and S2) also leaves each of one individual analysis Transcriptome Adobe System. inPhotoshop and embellished package IPPthe software Olympus using captured were System andphotographs Image Live Cell VOX3D UItraVIEW by a PerkinElmer chromosomescounterstained DAPI were antifade solution. inVectashield with Slides examined were and Mix,Science) Applied Roche (DIG- detecting FITC-conjugated avidin with detected O. sinensis of chromosomes metaphase Well-spread Science). Applied Roche Mix, Translation UTP Nick (DIG- young leaves of describedLeitch in This by is protectedreserved.article Allrights copyright. Rh. nobile Total RNA was extracted according to the CTAB protocol (Doyle & Doyle 1987) from leaves tothe CTAB(Doyle from according protocol of & DoyleTotal RNA wasextracted 1987) To confirm that from root tip cells were used for slide denaturation and hybridization. Labeled andhybridization. DNA denaturation slide was cellstip were for root used from (Wang μ digyna O. g of total RNA from each individual was purified using polydT conjugated beads to to beads conjugated polydT using purified was individual each from RNA total g of et al. O. sinensis O. et al. Oxyria (1994) with minor modifications. Total genomic DNA was isolated from from Totalgenomic DNA with was modifications. isolated minor (1994) 2014). The integrity of RNA was assessed using an Agilent 2100 2100 using RNAan Agilent of The integrity wasassessed 2014). by the CTAB method (Doyle & Doyle 1987) and labeled with digoxigenin- with labeled and &Doyle 1987) (Doyle method CTAB the by species after germinating seeds collected from populations 1 and13 populations collected from seeds germinating after species contains the contains genome of Rh. tanguticum O. digyna O. . We also downloaded transcriptome data transcriptome We. also downloaded

, we performed GISH analyses as as analyses GISH performed we , Accepted Article This by is protectedreserved.article Allrights copyright. sequences were produced by CD-HIT V 4.6.1 (Huang by produced (Huang CD-HIT were V4.6.1 sequences representative non-redundant, of set A contigs. into bp 200 exactly of reads short paired-end (Grabherr V2.1.1 Trinity analyses. subsequent all for reads clean adapter sequences, raw reads were filtered using Trimmomatic V 0.32 (Bolger (Sonnhammer (htt 2.0.6. V the Trinityrnaseq program of by inthe openread predicted were and pepthe sequences Transdecoder frame sequences using the Mclust V.5.0.2 (Fraley V.5.0.2 Mclust using the paralogs values between Ks estimated for be could peaks corresponding values of the models and of mixture Ksvalues,normal event agenome produce is distribution normal expected duplication to (LynchAs whole & Conery 2000). parologs each by of estimated duplication between pairs Ks of relativeage of order genome in a of Paralogs sorted paralogs. were polyploid of between pairs (Jiao genome and duplications divergence interspecific Blanc of the approach employed We each species. within comparisons (blastn) divergence based on Ks. We then dated genome duplications in the polyploid thepolyploid in genome duplications dated then We Ks. on based divergence of and time divergence genetic species interspecific calculate two the of to sequence pairs orthologous We used 1:1 &the (Löytynoja 2008). byGoldman aligned Prank sequence pairs orthologous and paralogous (Ka) between substitution non-synonymous and (Ks) synonymous rates of method implementedinthe CODEML estimate to PAML V4.1 2007) the program (Yang of package Orthologous inthe sequences Orthologous et al. 2015), while paralogous sequences were identified by conducting all byvs all all blast conducting were identified sequences 2015), paralogous while p://trinityrnaseq.github.io/). et al. et Oxyria et al. 2012; Scottet species were determined using Inparanoid V 4.1. V4.1. using Inparanoid determined were species et al. et et al. 2016) as implemented in R. A mutation as rate A in R. implemented 2016) mutation 2010) with a threshold value of 0.95; acds value with 0.95; threshold 2010) of 2012), anda maximum likelihood

et al. 2011) was used to assemble assemble used wasto 2011) O. sinensis O. et al. et et al. et 2014) to generate generate to 2014) (2004) to detect detect to (2004) based onKs based Accepted Article sinensis tanguticum origin of i.e. ×10 1 lowe tothe addition in timeestimations used all for This by is protectedreserved.article Allrights copyright. of 1.2 1.2 of ×10 et al. et PhyML3.1 (Guindon gene using trees constructed we each at locus, sequences homologous group of diploid to compared O. digyna of populations aof study in and surveyed 7783, 8719, These9342 10702). previously 6500, were loci 3375, as1007, here (designated loci seven nuclear surveyed across variation of demographic analyses loci nuclear seven of analyses basedon history and demographic relationships Phylogenetic outgroup. ratio of ratio 1:1:1:2and1:1:1:3 between (Alexeyenko Multiparanoid using sequences homologous O. sinensis To obtain further evidence that would help distinguish between an autopolyploid or allopolyploid or allopolyploid distinguish between an autopolyploid help evidence would To obtain that further Fourteen individuals from nine populations of of populations nine from individuals Fourteen 2010) based on the method of maximum likelihood with sequences of sequences of with based maximum 2010) of method likelihood on the and related taxa, which included O. sinensis O. -8 (TableS1) (Wang -8 and 1.5×10 due to genome triplication and incomplete loss of replicated genes. We genes. We extracted and replicated lossof togenome incomplete due triplication and for nuclear sequences estimated from fossil calibration of of calibration fossil from nuclear estimated sequences for Rh. nobile O. O. of sequences between homologous relationships weexamined phylogenetic O. digyna (2 -8 et al. et (Holliday (2 n et al. et = 22). = n 2016). DNA extractions and PCR amplifications, and using amplifications, PCR DNA extractions 2016). =14). Thus, two to three paralogs at threeparalogstwo to =14). Thus, Rh. tanguticum Rh. Oxyria sinensis 2016; Ossowski Ossowski 2016; O. digyna O. st and highest mutation rates recorded for plants, plants, rates recorded for mutation and highest st O. sinensisO. and two diploid species of the sister genus, genus, and the sister of species twodiploid , (2 Rh. nobile n = 40) is likely to be a hexaploid when hexaploid be a to likely is 40) = et al. et al.

2010). 2010). were used in phylogenetic and phylogenetic used were in 2006) across taxa according to the the to taxaaccording across 2006) , O. digyna some locimight be retained in Oxyria and Rh. nobile Rh. (Wang al. et O. sinensis O. considered as considered . For For each . 2016) was 2016) Rh. Accepted Article This by is protectedreserved.article Allrights copyright. KX999721-KX999862). We also used materials and sequence data of Weand sequence materials also used KX999721-KX999862). Numbers GenBank (Accession in deposited have been sequences by newly eye. obtained All homologous primers, were conducted following the procedures described by Wang by theprocedures described conducted following were primers, homologous demonstrated to have originated and migrated from Asia (Wang Asia from migrated and have originated to demonstrated were Arctic the America and North Europe, in occurring those because analyses demographic and following colonization of these other regions, they might have been subject to mutation fixation havemutation regions,might to thesetheyof been and subject other colonization following this from analysis other Weexcluded nuclear loci. al. et DIYABC (ABC) (Cornuet in V.1 computation Bayesian approximate using change demographic of scenarios plausible four examined and designed (http://taxonomy.zoology.gla.ac Treeview 1.6.6 in wasfile displayed output made. The inferencesall from and 2500 excluded so the first a 25% estimatedas ‘burn-in’ were discarded“burn-in”, trees following were branch each valuesfor support indicating probabilities posterior The generations. 100 every model Four selected.chains were runatthesame 6milliongenerations. time to Treessampled were Rh. tanguticum al. Sequences for each nuclear gene were aligned and checked by 6.0 and checked aligned MEGA gene each nuclear were for Sequences (TaKaRa, Dalian, China) and 20positive for clones each locus wereper individual sequenced. witha Kit pMD19-T were pMD19-TCloning into Vector products ligated Amplified Vector (2016), but selected only populations of of only selected but populations (2016), Phylogenies were constructed for all nuclear haplotypes found in in found nuclearall haplotypes for constructed were Phylogenies To investigate the early demographic histories of both both of histories To investigate demographic early the as outgroup) using MrBayes v.3.0 (Ronquist & lsenbeck 2003) and with and & lsenbeck GTR+G with the 2003) v.3.0 using (Ronquist asMrBayes outgroup) .uk/rod/treeview.html). .uk/rod/treeview.html). O. digyna O. O. digyna O. digyna 2010) based on the sequence data of the seven seven the data of sequence the on based 2010) that occurred in Asia for phylogenetic and and phylogenetic for Asia in occurred that O. sinensis populations outside Asia because during during because Asia outside populations

et al. 2016). 2016). O. digyna O. digyna digyna andO. O. sinensis

(Tamura reported by Wang and et al. et al. in Asia, we we Asia, in O. digyna O. (2016). (2016). 2013) and (with et et Accepted Article simulated data sets that matched best the observed data. We used a mutation rate of 1.2 1.2 We×10 of rate a used theobserved data. mutation simulated datasetsthatmatched best produced of thecombination priors option scenario in DIYABC check prior combination’ which to This by is protectedreserved.article Allrights copyright. simulated data sets (on average 10 average simulated datasets(on interest. The distribution of each parameter is listed in Table S3. A reference table containing 4×10 in of table A containing Table is listed S3. reference eachparameter The distribution interest. anddata parameters the of bestfits of whichdemographic scenario identification DIYABC allows expansion-shrinkage-expansion. and (4) expansion-shrinkage (3) expansion, recent two species, (2) of the expansionorigin sincethe population continuous were (1) examined demographic scenarios (Wang populations to those Asian of different and ratesoscillations climatic (excluding redundant and unreliable records) and unreliable redundant (excluding sets (www.gbif.org) data specimens’ and investigations field our from information included ENM for current distributions Dataperform used (LIG) to periods. (LGM) Last Interglacial and Oxyria with default settings (Phillips &2008) examine thetwo divergence to Dudík ecological between nichemodelingEcological of data set. the real to setssimilar data to produce scenario a of given check model to ability the studied each under datasets 10,000 simulating by was conducted analyses. regression estimateandlogistic direct Wang Ecological niche modeling (ENM) was conducted via maximum entropy using MaxEnt 3.3.4 3.3.4 usingMaxEnt conducted viamaximumentropy (ENM) was nichemodeling Ecological et al. species and to indicate their distributions currently Last Maximum currently and during the Glacial indicatedistributions and their to species 2016) and selected the scenario with the highest posterior probability probability the by highestwith performing the posterior and selected scenario 2016) O. digyna O. 6 per scenario)wasgenerated. We used the ‘pre-evaluation scenario and

combined with data for 20 ecological variables (altitude (altitude 20ecological variables data combined with for O. sinensis O. Model checking for the best demographic scenario scenario the bestdemographic Model for checking

et al et . 2016). The. 2016). four -8 (after 6

Accepted Article This by is protectedreserved.article Allrights copyright. Precipitation of Warmest Quarter Warmest of Precipitation bio18 Seasonality, Precipitation Mean Wettest, bio15 of bio8 Isothermality, Temperature Mean Range, bio2 Mean Diurnal bio3 Annual Temperature, environmental (bio1 variables the of Seven thedatabase. WORLDCLIM from variables) downloaded and 19 bioclimatic sinensis al et The outp 20. to replicates of number 1988). and (Swets good when>0.9 0.7–0.9, of range whenanthe in AUC useful was score Amodel prediction was considered 1997). area under the curve (AUC) of the receiver operating characteristic (ROC) plot (Fielding & Bell the by measured wasmodel the of accuracy The themodel. in training for 75% and testing 25% for at set were Outputs analyses. final for selected were r<0.7 coefficients correlation Pearson pairwise while those of GISH analysis sinensis O. of Results niches). overlap) to 1(identical D both of and I 0 may niche 100 from (no range Values pseudo-replicates. with statistics, similarity I Warren’s D and by Schoener’s assessed overlapspecies betweentwo the niche estimate was used to . 2005) was used to draw the range of suitable distributions. ENMTools 1.3 (Warren ENMTools 1.3 was the distributions. suitable . 2005) todraw of range used Chromosome counts showed that all plants examined of plants examined of all that showed counts Chromosome O. digyna genomic digyna labeledtotal O. DNA of with probed O. sinensis O. were polyploid with2 were polyploid

andbio19

We set the maximum number of iterations to 5,000 to and iterations We of number the the maximum set ut format was set to be logistic. Precipitationof Coldest Quarter n = of Chromosomes 40 (Fig. 1aandFig. 1b). O. digyna O.

at mitotic metaphase that showed GISH

DIVA-GIS version 7.5 7.5 (Hijmans version DIVA-GIS were diploid with 2 diploid were ) which exhibited whichexhibited ) et al. n = 14 = 2010) 2010) O. Accepted Article between 1:1 pairs on theKsorthologous for values based divergence Interspecific sequences. orthologous the of 14520 pairs from extracted degenerate sites fourfold calculated the Kaand Ksvalues for divergence occurred around 12 Ma rate divergence ×10 around the mutation 1.2 using occurred of for paralogs and 15196 6420 and we identified sequences orthologous paralogous from at 1E-5was set toseparate approach. Thee-value blast hit V andthereciprocal best 4.1 Inparanoid using sequences orthologous 1209 and 1098 O.digyna 1209 and1098 for sinensis analysis Transcriptome Ma using the lowest, 1 ×10 1 Mathe lowest, using This by is protectedreserved.article Allrights copyright. within the between andtranslocation recombination of a lack indicating thatstainedblue, any chromosomes 26 the in of no evidenceGISHpart occurring for of origin anallopolyploid for evidence strong provided therefore sinensis digyna O. entire the contains that confirmed This Fig.green, 1c). 14 of 40 chromosomes the occurred(stained for Clean sequences comprising 1.57G and 1.74G of cDNA were obtained for cDNA for of were and1.74G obtained 1.57G comprising Clean sequences O. sinensis O. , respectively.The (stained blue, Fig. 1c), showing that these are derived from a different donor or donor donors. GISH a are from different derived these that 1c), showing Fig. blue, (stained O. digyna O. and O. digynaand . and and O. sinensis O. de novo -8 O. sinensis O. , and highest, 1.5 ×10 1.5 , andhighest, genome. GISH did not occur for the remaining 26 chromosomes of of chromosomes occur remaining not 26 the genome.for GISH did and and O. sinensis O. peaked at Ks at peaked assembly produced 36125 and 50955 contigs with an and N50 value with contigs 36125 50955 of produced assembly , respectively, by all vs all blast (blastn) comparison. We comparison. (blastn) blast all vs all by respectively, , , respectively (Table S2). We identified 14520 pairs of of pairs We14520 S2). (Table identified respectively , c. O. digyna -8 0.15 (Fig. 2) and from this we estimated this we estimated andfrom 2) (Fig. 0.15 , mutation ratesfor plants respectively). Similarly, and other ancestral genome(s) contained contained genome(s) ancestral other and

O. sinensis -8 -8 (whichlies between 11 and 15 . Interestingly, there was O. digyna O. O. sinensis O. and O. and

O. Accepted Article This by is protectedreserved.article Allrights copyright. groups, 55%) awhile 55%) third setfrom groups, sinensis O. from genesets homologous In two onetree, setsd). homologous the 1:1:1:3 3c, gene gene(Fig. for types of tree obtained were from O. sinensis from one set sinensis O. from genesets homologous two In onegene tree, 3a,b). (Fig. homologous gene datasets the for 1:1:1:2 tree were recovered gene types of with 556 and11of (1:1:1:2) detected. the first type the(1:1:1:3) of second Two kind homologous 1:1:1:2and1:1:1:3 identified sequences O. sinensis duplication (a Ks peak at Ks peak (a duplication genomeThe pairs. second of orthologous ontheanalysis species divergencebased of two the between lowest and highest substitution rates for plants, respectively). This corresponds to the estimated time Ma (using the synonymous substitution 1.2 ×10 rate substitution synonymous Ma the (using at Ks peak The first species. O. sinensis valuesin paralogouspairs for O. digyna for turn in estimated was sequences of pairs paralogous for sites degenerate fourfold of the values Ks The two The two O. sinensis O. and and clustered with the only from with one clustered We therefore in two genomeduplications one which or haveoccurred. a is polyploid (Fig. 2 and Fig. S3), indicating two possible genome duplications in the ancestry of this in thethis of two possibleS3), indicating ancestry genome Fig. duplications 2 and (Fig. comprised a separate clade. In the other tree, one homologous from gene set homologous one tree, other In the clade. O. digyna comprised a separate Rheum O. sinensis O. comprised an independent clade (Fig. 3b, 349/556 groups, 63%). two 63%). Similarly, groups, 349/556 3b, (Fig. clade an independent comprised species and species c. (Fig. 2). Though we did not find an obvious peak in the distribution peak anKs of in the not obvious we Though 2). distribution find (Fig. did 0.077) was estimated as above to have occurred 6.0 (5-7) Ma. Ma. (5-7) 6.0 as tohave occurred above was estimated 0.077) 0.15 was estimated (Ks/substitution rate) to have occurred around 12.2 12.2 tohave around rate) occurred was c.0.15 (Ks/substitution estimated O. O. digyna O. digyna O. O.sinensis occurred together in one monophyletic clade in one 6/11 monophyletic 3c. (Fig. together occurred while groups, the 37%) 207/556 (Fig. 3a. together clustered , we found two peaks in the distribution of such Ks values of such in Ks twodistribution the we peaks in , found included in the transcriptome analysis are diploids, while diploids, are analysis transcriptome the in included O. O. digyna clustered into a different monophyletic clade with the a monophyletic the clade into different with clustered -8 , 11and15Ma , between the using lies which across these four species (using MultiParanoid) MultiParanoid) species (using four these across into a clade,monophyletic theotherset while

Accepted Article This by is protectedreserved.article Allrights copyright. Phylogenetic relationships and demographic history based on analyses of seven nuclear loci loci nuclear seven of analyses basedon history and demographic relationships Phylogenetic 45%). groups, 5/11 representing set the contained also which of one clades, separate three formed representing only set 5b). The results of PCA conducted on summary statistics for the four simulated demographic thesimulated four for PCA conducted of onsummary statistics The results 5b). species (Fig. both of during histories early 4) the demographic (scenario and shrinkage re-expansion a model expansion- of genes the sevensupported from geneticnuclear population derived data 9342 agreeinFig. withthe tree well 3a. illustrated and 10702 genes for trees the of topologies the while respectively, 3c, Fig. and 3b Fig. in illustrated trees with the phylogenetic genesare consistent and 1007 for 7783 the trees obtained clade. Similarly, O. sinensis Fig. Fig. the with illustrated 4) agreed 3dwithparalogousin topology well phylogeny the of alleles of and 6500, 3375 gene (genes trees Fortwo example, analysis. in transcriptome the setsgene identified homologous for the phylogenetic constructed of trees topologies the tendedto support 4) trees (Fig. both across seven genes sinensis each in gene for The recovered haplotypes number of bp. 3768 aligned was sequences all length of The seven nucleargenes sequenced in ABC modeling and tests of the demographic dynamics of the two two the of dynamics thedemographic of ABC and tests modeling varied from 18 to 24. To assess phylogenetic relationships between haplotypes for each of each the of between haplotypes for 18 relationships To to 24. from assess phylogenetic varied comprising two separate clades and orthologs of the two species forming a monophyletic monophyletic a species forming two the of and orthologs clades separate two comprising O. digyna O. Oxyria . In the other tree, the three homologous gene sets from genesets threethe tree, . In homologous the other species we constructed haplotype trees for each gene in turn. These turn. gene eachin trees haplotype for constructed we species O.sinensis ranged from 318 774 318 Thebp in from to full length. ranged

Oxyria Oxyria O. O. digyna species conducted on conducted species O.sinensis (Fig. 3d, (Fig. 3d,

O. O.

Accepted Article predictive withAUC>0.92 and SD<0.01.ability Multiple niche overlap values identity testsand for and isothermality quarter thewettest of mean temperature an were isothermality to modelpredictions change Ecological niche divergence shif and distributional change. climate early inresponse to Pleistocene histories that suggest both therefore results Ma, t1=1.38Ma) being to thatestimated similar for of (t1) thetime occurred and2 Ma between expansion 2.5 of with second the for for (t2) expansion first the of the time revealed expansions and thewithin past severala bottleneck million years. The estimates parameter (Table S4) other scenarios (Fig. S5). Thus, digyna This by is protectedreserved.article Allrights copyright. significantly higher 4(directfor scenario P=0.63 estimate: to 0.8 for also were regression logistic and estimate direct for obtained probabilities posterior the species, both simulated datasetsgenerated forscenario 4 than to for scenario 4 (Fig. S4). Thus, the observed data sets both for species were more similar to the generated data the sets within simulated many embedded were both species sets for the data observed that and showed data the observed distribution, predictive from posterior the simulations scenarios, (3-3.2 Ma). Shrinkage in the population sizes of both species is estimated to have to estimated is species both of sizes population the in Shrinkage Ma). (3-3.2 O. digyna Ecological niche modeling (ENM) showed that the environmental variables that contributed most contributed that variables the environmental that showed (ENM) modeling niche Ecological ; logistic regression: logistic ;

P=0.60 to 0.70 for 0.70 to P=0.60 O. sinensis O. O. sinensis d precipitation of the warmest quarter for of for the warmest quarter d precipitation and and and O. sinensis O. O. sinensis O. O. digyna O. O. digyna O. those generated for the three other scenarios. For the scenarios. other thosefor three generated O. digyna ts in response to recent Quaternary climate climate ts recentQuaternary inresponse to , P=0.58 to to 0.62 for , P=0.58 are indicated to have experienced two tohaveexperienced are indicated tobe 6-7 Ma, whichis far earlier than that experienced similar demographic O. digyna O. (db=0.74 Ma, Ma). These (db=0.74 t1=1.78 latter

O. sinensis . All models exhibited high All models. exhibited (db=0.66 (db=0.66 O. sinensis O. digyna , P=0.78 to 1for , P=0.78 O. sinensis O. ) than for the , and , O. O. Accepted Article digyna diploid, being the other polyploid has one that species an allopolyploid of history Discussion years. the over 21,000 last twospecies these of distributions inthe has change occurred little LGMthat theday speciessuggests andthepresent between both for areas predicted concordance of This by is protectedreserved.article Allrights copyright. modeling have also shed light on the historical demography of of demography historical the on light shed also have modeling and ecological niche simulations coalescent combined with analyses andanalyses. These genetic genomic can through challenge, but presents a proceed suchan of major and allopolyploid evolution theancestry Reconstructing species. parent ‘ghost’ extinct anand presumably derived from unknown, of diploidcomplement the entire chromosome containing of distribution the stages three all at LIG and LGMthat 6a), at during currently (Fig. the or the than narrower the much were study present both of distributions that indicatedthe ENM further between two species. the differentiation clear indicating D=0.372) ecological (Fig. 6b), (I=0.665, values actual than higher significantly were values observed that D demonstrated and I by measured The major aim of the research reported here was to reconstruct the evolutionary and demographic and the evolutionary reconstruct to was here reported research the aim of major The . Oxyria sinensis Oxyria O. digyna O. digyna (2 n =40), which is one of only two species comprising thegenus comprising only of is one which species two =40), (2 O. digyna O. n =14). GISHestablished that was more extensive than extensive was more parent species now extinct. We focused on the Wethe on focused extinct. species now parent Oxyria O. digynainadditionto 26 chromosomes species in the parts of Asia investigated in in Asia investigated of parts the in species

O. sinensis O. sinensis O. sinensis. and its extant parent parent extant and its is an allopolyploid is an allopolyploid The high high The Oxyria O. , the Accepted Article with a third diploid species (now extinct) to produce hexaploid to extinct) produce(now hexaploid with a third species diploid 12 a Ma, around species then which hybridized form diploid tetraploid extinct with an unknown to digyna diploid with approximately Ma, the first genome12 whichthenhybridized during duplication 2 with (both species progenitors. that evidence further in addition ato genome to that similar of that indicated loci, nuclear specific seven for obtained sequences also and analysis by transcriptome detected homologousof sequences 6 Ma have (5-7) around wasestimated to occurred 2). Phylogenetic analyses (Fig. genome duplication digyna, O. diverged from therefore, that the initial genome duplication occurred when the tetraploid ancestor of ancestor occurred genomewhen tetraploid the that initial the duplication therefore, sequence variation (Sun divergence of with for that largely consistent Ma. 12(11-15) around This tohavedateis The wasestimated genomeS3). occurred first duplication duplications were detected in weredetected duplications digyna, This by is protectedreserved.article Allrights copyright. Origin of O. sinensis sinensis O. From the distribution of Ks values obtained for orthologous sequences in sequences orthologous of Ks for values obtained From thedistribution to produce hexaploid hexaploid produce to identified by transcriptome analysis, moderate signatures of two genome two independent signatures of moderate analysis, identified transcriptome by O. sinensis may have originated in several different ways. First, two unknown and extinct diploid and diploid unknown extinct ways. two First, several different mayin have originated n O. sinensis 14 = onewith or 2

et al. and indeed such duplication second this divergence. The may have triggered and indeed suchduplication O. sinensis O. 2012), which would be unaffected by genome duplication. It is likely, is It duplication. genome by be unaffected would which 2012), O. sinensis is an allopolyploid species and that and speciesthat anallopolyploid is O. sinensis O. around 6 Ma.around Second, diploid n , whereas none were detected for for nonewere detected , whereas O. digyna digyna O. =12) could have hybridized and formed an allotetraploid =12) could and an formed allotetraploid have hybridized O.sinensis contained the genome of lineage an unknown ‘ghost’ contained genome of the (Fig. 3, Fig. 4). This finding, therefore, provided provided therefore, (Fig. 4). This finding, Fig. 3, from O. digyna O.

O. sinensis O. digyna O. digyna O. based on chloroplast DNA based onchloroplast O. digyna digyna O. around 6 Ma. Both of these 6 Ma.of Both around O. sinensis O. is one of its its is oneof may have hybridized hybridized mayhave (Fig. 2and Fig. O. sinensis and and O. O. O.

Accepted Article This is because the phylogenies constructed from homologous sequences supporting one or the other one other the supporting or homologous sequences from constructed This thephylogenies is because theavailable andthird evidence. pathways from first the clearly wearebetween unabletodistinguish more or less at the same time. Though we might rule out the second pathway of origin of theorigin second pathway of out more lessattheThough rule same we or might time. have would occurred two species, between these givingto anallotetraploid rise genome duplication between divergence i.e. events, two these unlikely that 2 of number chromosome the produce to events polyploidization second or the first after lost were two chromosomes that assumed may be it species), diploid the extinct genusx the of = is 2 number 7 (with chromosome basethe of if all (Fig.these 3c).In cases, based on maternally inherited chloroplast DNA sequence variation based(Sun onmaternallyDNA sequence variation inherited chloroplast thetwospecies divergence with between of date the Thisconsistent date 2). is (Fig. sequence pairs between This by is protectedreserved.article Allrights copyright. sinensis sets in gene homologous the one of which in by phylogenies supported is pathway 6 Ma. This around with 12 Ma, hybridized later which autotetraploid to produce an genome duplication may diploid an unknown species have undergone Third, with the set present in for 1:3 ratio in a occurring sequences homologous from constructed by phylogenies those supported pathways are between 12Mawas divergence dated to based(polyploidization) on genomic genomeThe initial duplication Of the three pathways described, the second one, second the described, three pathways the Of clusters present with the set in O. O. digyna O. digyna O. vs.O. digyna sinensis O. and and the due to following. seems unlikely step, diploidspecies and asafirst an extinct O. digyna, O. O. sinensis O. while the other two sets form two independent clades (Fig. 3d). 3d). twoindependent clades(Fig. theother two sets form while according to the distribution of Ks values for 1:1 orthologous orthologous 1:1 of Ks valuesfor the to distribution according , in which one of the three sets present in in the setspresent whichoneof three , in O. digyna, O. while the other two sets form a monophyletic aclade while two setsform themonophyletic other involving the formation of involving theformation an allotetraploidof O. digyna digyna O. O. digyna O.

and another diploid species, species, and diploid and another to form n et al. et = 40 for O. sinensis O. 2012). It is highly highly is It 2012). O. sinensis O. O. sinensis O. approximately approximately O. sinensis clusters clusters n =14 for =14 . O. , Accepted Article origin. of date true its determine and anallopolyploid of history bethe evolutionary reconstruct genomicto of required will data may be incorrect as could be the case for be case for the ascould may be incorrect evolutionary history, datesof origin estimated dire depending allopolyploid specieson their of in that the point origins important regarding dating Ma, theactualorigin of 6Ma allopolyploidization between diploid from speciesa thatdiverged diploid This by is protectedreserved.article Allrights copyright. it would indicate that thoughit would divergence indicate that between was a step first theoriginin of case) other the in autotetraploid and case one in (allotetraploid a tetraploid of origin the that however, caseWe can conclude, here. aswas the sequences homologous different tobe across likely observed al. bothofevolution, which are polyploids verycommon in Alexander-Webber (Soltis & Soltis 2016; concerted by loss and DNA complicated are andits ancestors apolyploid between shared sequences of homologous relationships Phylogenetic 3 and4). recovered(Fig. theseof were pathways two 2016). As a consequence, different phylogenies supporting different evolutionary scenarios are are scenarios evolutionary different supporting phylogenies different consequence, As a 2016). If If O. sinensis O. originated either through the first or third pathway, i.e. involving involving i.e. pathway, third or first the through either originated O. sinensis O. O. sinensis O. is much younger, occurring approximately 6 Ma. This raises anMa. 6This raises approximately occurring much younger, is O. digyna O. O. sinensis O. (Fig. 7), and that the maternal parent of this tetraploid was O. digynaO. around around Ma.12 O. sinensis ctly from genetic divergence between sister species species between sister genetic from divergence ctly . In many such cases, therefore, detailed analyses analyses detailed therefore, manysuch cases, In . and an extinct tetraploid ‘ghost’ species (Fig. 7), 7), species (Fig. ‘ghost’ tetraploid an extinct and and

O. O. digyna can be dated to around 12 can be dated to around et Accepted Article current analyses cannot distinguish between these or other possible times when the tetraploid ‘ghost’ ‘ghost’ thetetraploid times when other possible or between these distinguish cannot analyses current thattriggered mark oscillations climatic Pleistocene the during stage alater at extinct mayhave become the tetraploid Ma. Alternatively, 6 approximately its origin sizefollowed that population of expansion duringrapid its extinct) parent (now this tetraploid has(or have)also becomeextinct. It is feasiblethat at 12million some duringthelast yearsstage species diploid theunknown of in the origin involved genome only with thein presence its extinct of species parent of years tetraploid the 6 million last the during stage some At LGM. the is since that years, last21,000 duringthe this region Asia,and there appears to have the beenchange little in respectiveof these distributions species in digyna these Ma) and time atthe present Glacial 6). Throughout are (Fig. periods, (0.021-0.018 Maximum Last Ma) at were theythe than (0.14-0.12 period Last the Interglacial during were narrower much both species of distributions the that nichemodeling. ENMwere indicated ecological indicatedby sinensis Ma 1.78 thenexpanded again approximately (for markedly, but and beginning of the Pleistocene population sizes of both both of sizes population Pleistocene the of and beginning Ma following its origin (Fig. 5). The simulations further indicated that towards the end of the Pliocene seven nuclear genes indicated that seven genes nuclear indicated This by is protectedreserved.article Allrights copyright. Demographic history of of history Demographic Coalescent simulations using approximate Bayesian computation on population genetic for data on population computation Bayesian approximate using simulations Coalescent maintained aconsiderably wider distribution than ). Additional and more recent contractions and expansions in the distributions of both species both of thedistributions and and in moreexpansions recentcontractions Additional ). O. sinensis O. sinensis O. O. digyna O. and expanded its population size approximately 6 rapidly population its expanded O. sinensis O. ed fluctuations in species’ population sizes. Our sizes. Our in population species’ ed fluctuations in response to past climatic oscillations oscillations climatic inresponse past to O. sinensis O. sinensis O. signifying its existence.former Similarly,

O. digyna O. sinensis O. in central, south and southwest and southwest south central, in and and O. digyna O. ) and) 1.38 Ma(for replaced its tetraploid tetraploid its replaced O. sinensis O. contracted contracted became became O. O. O. O. Accepted Article References References China. of Projects 111 Collaboration International and (2010DFA34610) China of Republic the People's of Technology and Science of Ministry (2014CB954100), Research Basic for Project Key National of ScienceChina (31590821), Natural Foundation National grants by from was This supported work Acknowledgements years. in during occurred region the the 12 million climatic, last that due to other possibly causes, extinct by by species maydiploid havetetraploid, been replaced the of species diploid ancestral the of extinction What years. the over caused thelast 6million region the occurred in massive that of climatic changes in digya in feasiblethat thistetraploid becameextinct because thecombination of itsgenome with that of This by is protectedreserved.article Allrights copyright. BarkerMS, AE, Li Arrigo N, Z,Baniaga Levin On (2016) the DA relative ofabundance autopolyploids and ChapmanRJ, MA (2016) Abbott D, Alexander-Webber Alexeyenko A, Tamas I, Liu G, Sonnhammer EL (2006) Automatic clustering of orthologs and inparalogs inparalogs and orthologs of clustering Automatic (2006) EL Sonnhammer G, Liu I, Tamas A, Alexeyenko existed and that its genome currently remains largely intact within the hexaploid the hexaploid within remains intact existed its andlargely genome currently that species a thatsuch tetraploid analyses doshow, however, Theour of becameextinct. results ancestor allopolyploids. andone ofitsparental speciescorrelates with biasedgene expression and DNA loss. shared by multiple proteomes. 445-454. 107, O. sinensis O. New Phytologist New , resulted in the origin of a species that was better adapted to meet the challenges meet the challenges adapted was in of to a species better the origin , resulted that Bioinformatics , 210, 391-398. 391-398. 210, , O. sinensis O. , 22, e9-e15. Morphological convergence an between allopolyploid also cannot be ascertained. This (or these) This (or be ascertained. cannot also or sinensis or O.

O. digyna O. sinensis O. Journal of Heredity Journal of , or became became or , . It is O. O.

Accepted Article Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O (2010) New algorithms and methods andmethods algorithms O(2010) New W,M, Gascuel Lefort Anisimova Hordijk V, Dufayard JF, S, Guindon V(1981) Grant Grabherr MG, Haas BJ, Yassour M FraleyRafteryC, TB, Murphy L Scrucca AE, mclust R: (2012) Version 4 for Normal Mixture Modeling for FowlerLevin NL, Ecological DA (1984) constraints on conservation in errors prediction of assessment the for methods of review A (1997) JF Bell AH, Fielding FawcettPeerJA, Y(2009)de Maere PlantsVanwith S, Doyle JJ, Sherman-Broyles (2016) Double trouble: and definitions of polyploidy. Double trouble: taxonomy Sherman-Broyles definitions and Doyle(2016) JJ, Hijmans RJ, Cameron SE, Parra JL, Jones PG, Jarvis A (2005) Very high resolution interpolated climate climate interpolated resolution high Very (2005) A Jarvis PG, Jones JL, Parra SE, Cameron RJ, Hijmans This by is protectedreserved.article Allrights copyright. tissue. leaf fresh of quantities small for procedure isolation DNA rapid A (1987) JL Doyle JJ, Doyle Cornuet A(2010) Ravigne´ Inference onpopulationJ, V,Estoup model using DNA history and checking et al. IG (2004) Polyploidy inarctic Journalof plants. Biological the Alsos AK, Brysting C, Brochmann Bolger AM,LohseM, UsadelB (2014) Trimmomatic: a flexible fortrimmer Illuminasequence data. BlancWolfe Widespread KH G, paleopolyploidy(2004) inmodel plant species inferred from distributions age with its diploid progenitor. Biology to estimate maximum-likelihoodphylogenies: performance assessing the of PhyML 3.0. without a reference genome. Clustering,Classification,Model-Based and Density Estimation. presence/absence models. survive the Cretaceous–Tertiary extinction event. 10.1111/nph.14276 press) DOI: 5742. 5742. Bull 19,11-15. surfacesglobal for land areas. sequence and microstaellite data with the software DIYABC(v.1). Linnean Society Bioinformatics genes. duplicate of , 59, 307-321. 307-321. 59, , Plant speciation , btu170. , 82, 521-536. 82,, 521-536. Plant Cell Plant Environmental Conservation . Columbia University Press, New York. The American Naturalist Nature Biotechnology , 16, 1667-78. 1667-78. 16, , et al. International Journal ofClimatology (2011) Full-length transcriptome assembly from RNA-Seq dataFull-length transcriptome RNA-Seq from (2011) assembly ProceedingNational of Academy of Sciences double genomes mighthave had a better chance to the establishmentof a novel polyploid in competition , 29, 644-652. 29,, 644-652. , 124, 703–711. 703–711. 124, , , 24, 38-49. 24, , 38-49.

BMC Bioinformatics , 25, 1965-1978. 1965-1978. 25, , New Phytologist New , 11. Systematic , 106, 5737- 106, , Phytochem (in , Accepted Article This by is protectedreserved.article Allrights copyright. HollidayL,Zhou JA, Bawa R,ZhangM, Oubida RW (2016) Evidence for extensiveparallelism but divergent Mittermeier R Lynch The Coneryof M, evolutionarygenes. (2000) JS fate duplicateconsequences and Löytynoja A, Goldman N (2008) Phylogeny-awareplacementgap prevents errors insequence alignment and Liu onLuLiugenus, Ho endemic the studies TN, SW, JQ, (2001) Karyological AM Sino-Himalayan in karyotypes of Uniformity (2004) JQ Liu WS Liu FH, FH, Liu Yu dicotyledons. in angiosperms: PolyploidyLewis WH (1980) Leitch AR, SchwarzacherT, JacksonLeitchInIJ D, (1994) situhybridization: a practical guide. Jiao Y,Leebens-Mack J,Ayyampalayam S Jewell DC, Islam-Faridi Atechniquefor N somatic preparation(1994) chromosome ofmaize. andC-banding In comparing and clustering for web server a Suite: CD-HIT Li L, W(2010) Fu Y, B, Gao Niu Y, Huang Meng L, Chen G, Li Z, Yang Y, Wang Z,MengL,Li WangL Y, Wang Chen G, (2015) Refugial Z, drive Yang isolation the range and expansions ecoregions genomic architectureof adaptation along altitudinal and latitudinalgradients in , 5. , 5. reports Phytologist evolutionary analysis. Cremanthodium 329-342. 144, the Qinghai-Tibet eastern Plateau highlands and adjacent areas. Annals of Botany York. Ltd Publishers the Maize Handbook biological sequences. diversificationof the .core genetic structure of of structure genetic 1155. et al. . CEMEX& AgrupacionSierra Madre, Mexico. , 209, 1240-1251. 1240-1251. , 209, Hotspots revisited: Earth’s biologically richest and most endangered terrestrial terrestrial endangered and most richest biologically Earth’s revisited: (2004)Hotspots . (Asteraceae: Senecioneae). , 100, 51-54. 51-54. 100, , et al.(2007) Large clones on faces:cliff expandingby through rhizomes crevices. Oxyria sinensis Oxyria , Inc. pp. 484-493. Springer press, New York. York. New press, Springer 484-493. pp. Inc. , Science Bioinformatics , 320, 1632-1635. 1632-1635. 320, , (Polygonaceae) in the Himalaya-Hengduan Mountains. Genome Biology Genome , 26, 680-682. , 680-682. 26, Ligularia et al. (2012) A genome (2012)A triplication with associated early Botanical Journal of the Linnean SocietyLinnean of the Botanical Journal (Asteraceae: Senecioneae), a highly diversifiedgenus of , 13, 1. , 13,1. In Polyploidy

Botanical Journal of the Linnean Society Linnean the of Journal Botanical , pp, 241-268. Springer press, New New press, Springer 241-268. pp, , Populus trichocarpa Science , 135, , 107-112. , 290, 1151- 290,, BIOS Scientific Scientific BIOS Scientific . New New , Accepted Article ScottLS, Wholegenome StenzIngvarsson Baum DA NW, P, (2016) duplication ( in coast redwood Bayesian (2003)3:Ronquist phylogenetic F, Huelsenbeck JP inferenceMrBayes nder mixed models. plants. flowering in formation polyploid of rates and mechanisms, Pathways, 1998. DW. Schemske J, Ramsey Proceedings ofthe National Academyyarrow. of wild in adaptation ecological and Polyploidy (2011) J Ramsey Phillips SJ, DudíkM Modeling (2008) ofspecies distributionsMaxent: with new extensions and a niches. ecological on polyploidy of impact the of hypotheses unified Towards (2016) O Broennimann C, Parisod Otto (2007)The evolutionary SP Stebbins GL (1985) Polyploidy, hybridization,Stebbins Polyploidy, andthe habitats.of GLnew (1985) invasion plants higher in evolution Chromosomal Stebbins GL(1971) This by is protectedreserved.article Allrights copyright. mostly analysis orthology between InParanoid 273 proteomes, EL, Sonnhammer G(2015) 8: Östlund Soltis DE, Visger CJ, Soltis PS (2014) The polyploidynow: then…and revolution Stebbins revisted. Soltis PS, Soltis DE (2016) Anci Myers N, Mittermeier RA,Mittermeier CG, FonsecaDa GA, Kent J(2000) Biodiversityhotspots for Ossowski S, Schneeberger K, Lucas-Lledó JI K, Schneeberger S, Ossowski Nie ZL, Wen J, Gu ZJ, Boufford DE, Sun H (2005) Polyploidy in the flora of the Hengduan Mountains hotspot, hotspot, Mountains Hengduan of the flora the in Polyploidy (2005) H Sun DE, Boufford ZJ, Gu J, Wen ZL, Nie Annual Reviewof Ecology and Systematics Sciences eukaryotic. eukaryotic. Botanical Garden Bioinformatics comprehensive evaluation. New Phytologist sempervirens mutations in Journal of Botany in Plant Biology 186-193. 211, Nature priorities. conservation Annals of the Botanical Missouri Garden Annals of China. southwestern , 108, 7096-7101. 7096-7101. 108, , Nucleic Acids Research Arabidopsis thaliana ) and its implications for explainingthe polyploidyrarity of in conifers. , 19, 1572-1574. , 19, 1572-1574. , 30, 159-165. , 30, 159-165. , 212, 540-542. 540-542. 212, , , 72, 72, , 824-832. , 101, 1057-1078. 1057-1078. 101,, ent WGD events as drivers of key innovations in angiosperms. Ecography consequences of consequences polyploidy. Cell , 403, 853-858. 853-858. 403, , . Science , 43, D234-D239. 43, , D234-D239. , 31, 161-175. , 161-175. 31, et al. et , 327, 92-94. 92-94. 327, , , 29,, 467–501. The(2010) rate andmolecular ofspectrum spontaneous . Addison-Wesley, London. London. Addison-Wesley, . , 131, 452-462. 131, , 452-462. , 92, 275-306. , 275-306. 92,

Annals of the Missouri Missouri the of Annals New Phytologist Current Opinion Sequoia Sequoia American American

, Accepted Article This by is protectedreserved.article Allrights copyright. manuscript. the wrote R.A. and data; J.L. the analyzed andQ.H X.L. theresearch; performed and Q.W. D.Z. X.Z., Q.H., X.L., the research; J.L. ontheorigin andevolution of designed the species China. on projects The collaborate authors in Author contributions Filipski Ku A, D, TamuraPeterson G, Stecher K, accuracyMeasuring of Science (1988) Swets the systems. diagnostic JA radiationof Liu Rapid Wang Q, (2012) JQ WanDS, Wang AL, Sun YS, Zhao (2008) YP F,Reproductive allocation Yang adioeciousin perennial YangZhuZK, Re Wang J, Meng WL, LH (2014) Wu Z (1998) LH I,Wood The Mayrose Barker (2009) MS, PB,of Rieseberg N, TE,frequency Takebayashi Greenspoon, (1992) EO Wilson Warren DL, GlorRE, TurelliM (2010) ENMTools: a toolbox for comparative studies ofenvironmentalniche Wang Liu Allen etGA Q, al. J, LiuB M, RI, (2014) Milne Wang Wang L, ZhouJ, Han H, Yang PAML Z 4:phylogenetic by (2007) maximum likelihood. analysis altitudinal gradients. case study ofthe mountain sorrel, analysis version 6.0. evolution of morphological traits. 1586-1591. 1586-1591. Bulletin, Tokyo. plants. vascular in speciation polyploid models. e110712. “Glasshouse” Plant nobileRheum (Polygonaceae) with Special Translucent Bracts. sinensis 4033-4040. 4033-4040. Ecography (Polygonaceae), a subdioeciousperennial herbnative East Himalayas. Delineation and unique features of the Sino-Japanese floristic region floristic Sino-Japanese the of features unique and Delineation The diversity of life , 33, 607-611. 33, , 607-611. Molecular biologyand evolution Journal ofJournal Systematics and Evolution (2016) Arcticplantorigins and earlyof formation circumarctic distributions: a . Harvard UniversityPress, Boston. Molecular Phylogenetics and Evolution Oxyria digyna Proceeding of National Academy of Sciences of Academy National of Proceeding mar SMEGA6: (2013) molecular evolutionarygenetics Oxyria amongsex morphs stress of drought Oxyria to sponses . New Phytologist Genome-Scale Transcriptome Analysis of Analysis Alpine the Transcriptome Genome-Scale , 30, 2725-2729. , 2725-2729. 30, , 46, , 830-835.

, 209, 343-353. 343-353. 209, , , 240, 1285–1293. 240,, 1285–1293. Molecular Biology and Evolution and Molecular Biology Rheum Oxyria sinensis , 63, , 150-158. (Polygonaceae) and parallel . University of Tokyo Tokyo of University . Ecology and Evolution and Ecology PLoS ONE , 106, 13875-13879. 13875-13879. 106, , (Polygonaceae) along , 9, , 24, , 4, Accepted Article Table S1 closest of using 1% regression datasets. O. digyna Table S2 19 Figure are and 20) shown S2. populations in (except for S5 Fig. data. observed and the distribution, predictive posterior the from simulations scenarios, demographic simulated 4 the from statistics summary on a PCA of axes 2 the first showing checking results sinensis Table S3 S4 Fig. mclust. by estimated were curve) (black models mixture secondar Evidence of Ks green, 0.153. is = colored Ks S3 value frequencyforFig. fourfold degenerate sites of info Detailed excluded). and 20 19 populations S2 Fig. respectively. (GR) Greenland and (NA) America North (EU), Europe Russia (RU), (QTP), Abbreviations Qinghai-Tibet Plateau represent 2016). digyna This by is protectedreserved.article Allrights copyright. Fig. S1 Fig. article. this supporting Additional of version information onlinemay inthe be found Supporting information doi:10.5061/dryad.bc1v8. Dryad from downloaded can be which ‘ENM_data’, the name with directory a in archived areanalysis ENM in samples used of locations The KR003454-KR003705. KX999721-KX999862, number: Accession SRX2200877. SRX2200544, 2200537, SRX SRX621187, NCBISRA: Data accessibility from the QTP into the arctic region and beyond are indicated by arrows (from Wang (from arrows by indicated are beyond and region thearctic QTP into the from Posterior probabilities forthe 4competing scenariosdemographic for (a) Pre-evaluate scenario-prior combinations for (a) of Geographical distribution Global distributions of distributions of Global and four scenarios for for scenarios and four Sampling localities of Assembly statistics forthe two Descriptions and prior settings for all parameters used DIYABCin (four scenarios for all in the QTP from both a direct estimate using the 500 closest datasets and a logistic a a the QTP 500 closestin and logistic directdatasets from both using estimate O. digyna O. O. sinensis O. sinensis O. O. sinensis O. in the QTP). Population size parameters are in units of of units in are parameters size Population QTP). the in Oxyria (in blue) and blue) (in O. digyna andO. rmation on populations isprovided inTable onpopulations rmation S1 and and species. species. y duplication is colored blue, Ks = 0.076. Normal Normal Ks blue, 0.076. is colored = y duplication O. digyna O. (in green). Migration routes of of routes Migration green). (in O. digyna O. sinensis and O. sinensis used in this study. Sampling localities localities study. Sampling usedinthis O. sinensis. sinensis. O.

samples used in this study (with (with study this in used samples Evidence of first duplication

(b) O. digyna digyna O. O. sinensis sinensis O. in QTP. Model QTP. in et al and

. (b) (b) O. O. O. O. Accepted Article that these derived from not are (2 (2 digyna This by is protectedreserved.article Allrights copyright.

Fig. 1 Fig. years). (5 time generation scaled by of wastime: years) and db) (unit (t Time parameter used. parameters was of transformation logit sets and the data of thesimulated closest 1% is based on QTP. Estimation of thedemographic of history thebest scenario for Computation Table S4 generations. effectivepopulation while size timeparameters (N), n n =14). GISH shows 14 chromosomes (green) of of (green) GISH shows14chromosomes =14). genomic of labeledtotal DNA with metaphase probed mitotic at =40) chromosomes Chromosomes of (a) . GISH is for notevident theremaining 26chromosomes of Estimations of the posterior distributions of parameters revealed by Approximate Bayesian Bayesian Approximate by revealed parameters of distributions posterior the of Estimations O. sinensis O. O. digyna O. and (b) . Bar =9 um. O. digyna containing DNA similar to that to of that DNA similar containing O. sinensis (including bottleneck areduration) (including inunits of at mitotic metaphase. mitotic at

O. sinensis O.sinensis O. sinensis sinensis O. and (stained blue) showing (stained blue)

(c) O. digyna O. digyna Oxyria sinensis Oxyria O. digyna in the the in O. O.

Accepted Article This by is protectedreserved.article Allrights copyright.

d4. of frequency density=The sites, degenerate d4= Thefourfold of Ksvalue genome and duplications. divergence interspecific two within between and and lines) green 2 Fig. Ks value distribution of fourfold degenerate sites of orthologs (yellow line) and paralogs (blue (blue and paralogs (yellow line) orthologs of sites degenerate fourfold of Ksvalue distribution Oxyria species to identify and estimate (Ks/mutation rate)

Accepted Article This by is protectedreserved.article Allrights copyright.

blue ( by are illustrated each type. Species genefor presented tree of are groups The number and percentage and d). (c and1:1:1:3 and b) (a ratios 1:1:1:2 of accordingMultiParanoid to by sequenceidentified 3 Fig. O. sinensis The homologous gene relationships from maximum likelihood analysis based on homologous onhomologous analysis likelihood maximum based from The gene relationships homologous ), green ( green ), ), yellow ( yellow ), O. digyna Rh. tanguticum ) ) and orange (

Rh. nobile ) respectively.

Accepted Article This by is protectedreserved.article Allrights copyright.

tanguticum digyna 4 Fig. Bayesian phylogenetic trees constructed for haplotypes of seven nuclear loci present in present loci nuclear seven of haplotypes for constructed trees phylogenetic Bayesian and O. sinensis (triangle) was used as outgroup. as wasused outgroup. (triangle) , with clade ,support with indicated by Bayesian probabilities. posterior

Rh. Rh. O. O.

Accepted Article This by is protectedreserved.article Allrights copyright.

years bottleneck; of ago. duration Ma, million db, present; years times from t, size; in expanded or size; population contracted N1b, population two the for scenarios demographic history 5 Fig. (a) Schematic diagram of the four demographic scenarios tested by DIYABC. Inferred(b) Oxyria species. NA, ancestral population size; N1, current current size; population N1, ancestral NA, species.

Accepted Article modeling. MaxEnt from values actual indicate arrows randomizations, sinensis area, from highest (red) to lowest (gray). (b) The results of niche identity tests (I and D) between inan a of occurring species the probability ago.represent LGM years LIG. million MYA, Colours and distributions of of distributions

This by is protectedreserved.article Allrights copyright. Fig. 6 Fig. Ecological niche modeling of differentiation Ecological between differentiation niche of modeling and and O. digyna O. sinensis . Gray bars show the simulated identity values generated from 100 100 from valuesgenerated . Grayidentity barsshow simulated the and O. digyna O. in central, south and south-west Asia at the present time, the time, at thepresent Asia andsouth-west south incentral, O. sinensis O.

and and O. digyna O. . (a) Predicted O. O.

Accepted Article This by is protectedreserved.article Allrights copyright. species. the of ploidy the to proportional are lines of Number species. extinct million years lines The indicate ago. solid indicate or extant species lines theunknown dotted while Ma, tobe assumed proportional. grams areaxis) in species. Times ‘ghost’ (horizontal unknown

Fig. 7 Fig. A diagram portraying the allopolyploid origin of of origin allopolyploid the portraying diagram A O. sinensis

between O. digyna and the the and