Physiology Preview. Published on January 15, 2016, as DOI:10.1104/pp.15.01470

1 Running title: 2 Evolutionary dynamics of the LRR-RLK gene family 3 4 Corresponding Authors: 5 Iris Fischer ([email protected]) and Nathalie Chantret ([email protected]) 6 INRA SupAgro, UMR AGAP, 2 Place Pierre Viala, Bât. 21, 34060 Montpellier, France 7 Telephone: +33 (0)4 99 61 60 83 8 9 Research Area: Genes, Development and Evolution

Downloaded from www.plantphysiol.org on February1 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. Copyright 2016 by the American Society of Plant Biologists 10 Evolutionary dynamics of the Leucine-Rich Repeats Receptor-Like Kinase 11 (LRR-RLK) subfamily in angiosperms. 12 13 14 Iris Fischer1,*, Anne Diévart2, Gaetan Droc2, Jean-François Dufayard2 and Nathalie Chantret1,* 15 16 1 INRA, UMR AGAP, F-34060 Montpellier, France 17 2 CIRAD, UMR AGAP, F-34398 Montpellier, France 18 *Corresponding Authors 19 20 Summary: Phylogenetic analysis of leucine-rich repeat containing receptor-like kinases 21 demonstrates the dynamic nature of gene duplication, loss and selection in this family.

Downloaded from www.plantphysiol.org on February2 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 22 I.F., A.D., and N.C. designed the study; G.D. performed the LRR-RLK extraction; J-F.D. 23 performed the phylogenetic clustering; I.F., A.D., and N.C. performed the data analysis and 24 statistics; I.F. drafted the manuscript with the help of A.D. and N.C. 25

26 Financial sources: IF was funded by the German Research Foundation (DFG): FI 1984/1-1 27 and the ARCAD project (Agropolis Resource Center for Crop Conservation, Adaptation and 28 Diversity) of the Agropolis Foundation. This work was partially funded by grant #ANR-08- 29 GENM-021 from Agence Nationale de la recherche (ANR, France).

30 31 Corresponding Authors: 32 Iris Fischer ([email protected]) and Nathalie Chantret ([email protected])

Downloaded from www.plantphysiol.org on February3 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 33 ABSTRACT 34 Gene duplications are an important factor in plant evolution and lineage specific expanded 35 (LSE) genes are of particular interest. Receptor-like kinases (RLK) expanded massively in 36 land and Leucine-Rich Repeat (LRR)-RLKs constitute the largest RLK family. Based 37 on the phylogeny of 7,554 LRR-RLK genes from 31 fully sequenced 38 genomes, the complex evolutionary dynamics of this family was characterized in depth. We 39 studied the involvement of selection during the expansion of this family among angiosperms. 40 LRR-RLK subgroups harbor extremely contrasted rates of duplication, retention or loss and 41 LSE copies are predominantly found in subgroups involved in environmental interactions. 42 Expansion rates also differ significantly depending on the time when rounds of expansion or

43 loss occurred on the angiosperm phylogenetic tree. Finally, using a dN/dS-based test in a 44 phylogenetic framework, we searched for selection footprints on LSE and single-copy LRR- 45 RLK genes. Selective constraint appeared to be globally relaxed at LSE genes and codons 46 under positive selection were detected in 50% of them. Moreover, the LRR domains – and 47 specifically four amino acids in them – were found to be the main targets of positive selection. 48 Here, we provide an extensive overview of the expansion and evolution of this very large 49 gene family.

Downloaded from www.plantphysiol.org on February4 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 50 INTRODUCTION 51 Receptor-like kinases (RLK) comprise one of the largest gene families in plants and expanded 52 massively in land plants (Embryophyta) (Lehti-Shiu et al., 2009; Lehti-Shiu et al., 2012). For 53 plant RLK gene families, the functions of most members are often not known (especially in 54 recently expanded families) but some described functions include innate immunity (Albert et 55 al., 2010), pathogen response (Dodds and Rathjen, 2010), abiotic stress (Yang et al., 2010), 56 development (De Smet et al., 2009), and sometimes multiple functions (Lehti-Shiu et al., 57 2012). The RLKs usually consist of three domains: an amino-terminal extracellular domain 58 (ECD), a transmembrane (TM) domain, and a carboxy-terminal kinase domain (KD). In 59 plants, the KD usually has a serine/threonine specificity (Shiu and Bleecker, 2001) but 60 tyrosine specific RLKs were also described (e.g. BRASSINOSTEROID INSENSITIVE1 (Oh 61 et al., 2009)). Interestingly, it was estimated that ~20% of RLKs contain a catalytically 62 inactive KD (e.g. STRUBBELIG, CORYNE (Chevalier et al., 2005; Castells and 63 Casacuberta, 2007; Gish and Clark, 2011)). In Arabidopsis, 44 RLK subgroups (SG) were 64 defined by inferring the phylogenetic relationships between the KDs (Shiu and Bleecker, 65 2001). Interestingly, different SGs show different duplication/retention rates (Lehti-Shiu et 66 al., 2009). Specifically, RLKs involved in stress response show a high number of tandemly 67 duplicated genes whereas those involved in development do not (Shiu et al., 2004) which 68 suggests that some RLK genes are important for responses of land plants to a changing 69 environment (Lehti-Shiu et al., 2012). There seem to be relatively few RLK pseudogenes 70 compared to other large gene families and copy retention was argued to be driven by both, 71 drift and selection (Zou et al., 2009; Lehti-Shiu et al., 2012). As most SGs are relatively old 72 and RLK subfamilies expanded independently in several plant lineages, duplicate retention 73 cannot be explained by drift alone and natural selection is expected to be an important driving 74 factor in RLK gene family retention (Lehti-Shiu et al., 2009). 75 Leucine-Rich Repeat (LRR)-RLKs, which contain up to 30 LRRs in their extracellular 76 domain, constitute the largest RLK family (Shiu and Bleecker, 2001). Based on the kinase 77 domain, 15 LRR-RLK SGs have been established in Arabidopsis (Shiu et al., 2004; Lehti- 78 Shiu et al., 2009). So far, two major functions have been attributed to them: defense against 79 pathogens and development (Tang et al., 2010b). LRR-RLKs involved in defense are 80 predominantly found in lineage specific expanded (LSE) gene clusters whereas LRR-RLKs 81 involved in development are mostly found in non-expanded groups (Tang et al., 2010b). It 82 was also discovered that the LRR domains are significantly less conserved than the remaining 83 domains of the LRR-RLK genes (Tang et al., 2010b). In addition, a study on four plant

Downloaded from www.plantphysiol.org on February5 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 84 genomes (A. thaliana, grape, poplar, rice) showed that LRR-RLK genes from LSE gene 85 clusters show significantly more indication of positive selection or relaxed constraint than 86 LRR-RLKs from non-expanded groups (Tang et al., 2010b). 87 The genomes of flowering plants (angiosperms) have been shown to be highly dynamic 88 compared to most other groups of land plants (Leitch and Leitch, 2012). This dynamic is 89 mostly caused by the frequent multiplication of genetic material, followed by a complex 90 pattern of differential losses (i.e. the fragmentation process) and chromosomal rearrangements 91 (Langham et al., 2004; Leitch and Leitch, 2012). Most angiosperm genomes sequenced so far 92 show evidence for at least one whole genome multiplication event during their evolution (see 93 e.g. Jaillon et al. (2007); D'Hont et al. (2012); The Tomato Genome Consortium (2012)). At a 94 smaller scale, tandem and segmental duplications are also very common in angiosperms (The 95 Arabidopsis Genome Initiative, 2000; International Rice Genome Sequencing Project, 2005; 96 Rizzon et al., 2006). Although the most common fate of duplicated genes is to be 97 progressively lost, in some cases they can be retained in the genome and adaptive as well as 98 non-adaptive scenarios have been discussed to play a role in this preservation process (for 99 review see Moore and Purugganan (2005); Hahn (2009); Innan (2009); Innan and Kondrashov 100 (2010)). Whole genome sequences also revealed that the same gene may undergo several 101 rounds of duplication and retention. These lineage specific expanded genes were shown to 102 evolve under positive selection more frequently than single-copy genes in angiosperms 103 (Fischer et al., 2014). The study by Fischer et al. analyzed general trends over whole 104 genomes. Here, we ask if, and to what extent, this trend is observable at LRR-RLK genes. As 105 this gene family is very dynamic and large – and in accordance with the results of Tang et al. 106 (2010b) – we expect the effect of positive selection to be even more pronounced than in the 107 whole genome average. 108 We analyzed 33 Embryophyta genomes to investigate the evolutionary history of the 109 LRR-RLK gene family in a phylogenetic framework. Twenty LRR-RLK subgroups were 110 identified and from this dataset we deciphered the evolutionary dynamics of this family within 111 angiosperms. The expansion/reduction rates were contrasted between SGs and species as well 112 as in ancestral branches of the angiosperm phylogeny. We then focused on genes which 113 number increased dramatically in a subgroup- and/or species-specific manner (i.e. LSE 114 genes). Those genes are likely to be involved in species-specific cellular processes or adaptive 115 interactions and were used as a template to infer the potential occurrence of positive selection. 116 This led to the identification of sites on which positive selection likely acted. We discuss our 117 results in the light of angiosperm genome evolution and current knowledge of LRR-RLK

Downloaded from www.plantphysiol.org on February6 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 118 functions. Positive selection footprints identified in LSE genes highlight the importance of 119 combining evolutionary analysis and functional knowledge to guide further investigations. 120 121 122

Downloaded from www.plantphysiol.org on February7 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 123 RESULTS 124 We extracted genes containing both LRRs and a KD from 33 published Embryophyte 125 genomes. Here, we mostly describe the findings for the 31 angiosperm (eight monocot and 23 126 dicot) genomes we analyzed. The 7,554 LRR-RLK genes were classified in 20 subgroups 127 (SGs). This classification was inferred using distance related methods because the high 128 number of sequences to be analyzed would imply excessive computation time for methods 129 relying on maximum likelihood. Since we decided to study the evolutionary dynamic of LRR- 130 RLK gene family using the SG classification as a starting point, we first wanted to verify that 131 each SG was monophyletic. Ten subsets of about 750 sequences were created by picking one 132 sequence out of ten to infer a PHYML tree (data not shown). Analysis of the trees shows that 133 most SGs (14) are monophyletic with strong branch support. On the other hand, for six SGs 134 (SG_I, SG_III, SG_VI, SG_Xb, SG_XI and SG_XV), the topology differs slightly between 135 trees: in at least five trees out of ten, either the SG appears to be paraphyletic or few 136 sequences are placed outside the main monophyletic clade with low branch support. As we 137 could not confirm that these SG are monophyletic, they were tagged with a "*" throughout the 138 manuscript. 139 Next, we determined the number of ancestral genes present in the last common ancestor 140 of angiosperms (LCAA) using a tree reconciliation approach (see Materials and Methods). In 141 short, tree reconciliation compares each SG-specific LRR-RLK gene tree to the species tree to 142 infer gene duplications and losses. Note that since only LRR-RLKs with at least one complete 143 LRR were considered, some of the inferred gene losses might correspond to RLKs without, or 144 with degenerated LRRs. Using this method, we predicted the number of LRR-RLK genes in 145 the LCAA to be 150. All SGs were present in the LCAA but the number of genes between 146 SGs was highly variable (Table I). SG_III* and SG_XI* show the highest number of ancestral 147 genes, with 32 and 29 genes, respectively. The lowest numbers of ancestral genes are 148 recorded for SG_VIIb, SG_Xa, SG_XIIIa, and SG_XIIIb which only possessed two genes and 149 SG_XIV which only contained one. These results show that already in the LCAA that lived 150 ~150 million years ago (Supplemental Table SI) some SGs were more prone to retain copies 151 than others. We wanted to determine if this ancestral pattern was preserved during the course 152 of angiosperm evolution and if different SGs expanded or contracted compared to the LCAA. 153 154 Expansion rates of LRR-RLK genes differ between subgroups and species 155 To gain a more comprehensive understanding of LRR-RLK evolution, we first looked at SG- 156 specific expansion rates in two complementary ways. First, we calculated the global SG

Downloaded from www.plantphysiol.org on February8 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 157 expansion rate (= ratio of contemporary LRR-RLK genes per species in one SG divided by 158 the ancestral number) for each SG (Fig. 1). Second, we inferred the branch-specific expansion 159 rate of each SG on the phylogenetic tree of the 31 angiosperm species. We did this by 160 automatically computing the ratio of descendant LRR-RLKs divided by the ancestral number 161 of LRR-RLKs at every node (see Materials and Methods) (Fig. 2). Looking at the global SG 162 expansion, we found that SG_Xa, SG_XIIa, SG_XIIb, and SG_XIV expanded more than 2- 163 fold on average, and SG_I* and SG_IX around 2-fold (Fig. 1, Supplemental Table SII). 164 Interestingly, SG_XIIa already had a moderate-high ancestral gene number (nine) and 165 therefore seems to be generally prone to high retention rates. Indeed, SG_XIIa was subject to

Downloaded from www.plantphysiol.org on February9 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 166 repeated rounds of major expansion events (i.e. expansion > 2-fold) during its evolutionary 167 history, e.g. in Poaceae, the Solanum ancestor, Malvaceae, and the Arabidopsis ancestor; but 168 also species-specific expansions, e.g. in THECC, GOSRA, ARALY, SCHPA, MALDO, 169 LOTJA, POPTR, and JATCU (Fig. 2, see Table II for five-digit species code). On the other 170 hand, SG_I* and SG_XIIb had a medium number of copies in the ancestral genome (seven 171 and four, respectively) but the pattern of expansion is quite different when analyzed in detail 172 (Fig. 2). For SG_I*, the expansion rate is mostly due to ancestral expansion events rather than 173 species-specific ones. For example, the high number of copies in ARATH and EUTSA (Fig. 174 1) is not due to expansions specific to these species but rather an expansion in .

Downloaded from www.plantphysiol.org on February10 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 175 Subsequently, copies were lost in the other species of this family analyzed here (ARALY, 176 SCHPA, BRARA) but retained in ARATH and EUTSA (Fig. 2). Species-specific expansions 177 can also be observed in SG_I*, mostly in PRUPE and POPTR. For SG_XIIb, on the other 178 hand, the high expansion rate is mostly due to recent species-specific expansions in PHODC, 179 MUSAC, VITVI, GOSRA, MALDO, POPTR, and JATCU. But one major ancestral 180 expansion can be observed in . 181 SG_IX, SG_Xa, and SG_XIV had only few copies in the LCAA (three, two, and one, 182 respectively) and all show a relatively high global expansion rate (Fig. 1). For these 183 subgroups also, a contrasted branch-specific expansion patterns can be observed (Fig. 2). 184 SG_Xa went through relatively few major expansions: one can be detected in the dicots 185 ancestor and a species-specific one in POPTR. Likewise, SG_IX shows only one ancestral 186 expansion in Malvaceae but more species-specific expansions in PHODC, MUSAC, 187 MALDO, and GLYMA. Finally, SG_XIV went through several rounds of ancestral 188 (monocots, dicots, Malvaceae, and Brassicaceae) as well as species-specific expansion 189 (PHODC, MALDO, and POPTR). The other SGs show a moderate expansion rate (1.3-1.75) 190 or no expansion at all (Fig. 1, Supplemental Table SII). SG_XV* is the only SG for which the 191 number of copies was decreasing on average compared to the LCAA genome (0.77). It is 192 important to note that the LCAA ancestral gene number could have been slightly over- 193 estimated for those SGs without a confirmed monophyletic origin (denoted by “*”), resulting 194 in an under-estimation of global expansion rate. However, we re-calculated the global 195 expansion rates for each of those SGs using the largest subset of sequence that always include 196 a stable monophyletic clade. The obtained global expansion rate differed only slightly from 197 the ones presented here (data not shown) and the conclusions drawn remain unchanged. 198 Because some species underwent whole genome duplication (WGD) or whole genome 199 triplication (WGT) relatively recently compared to others (Table III), we determined species- 200 specific patterns of LRR-RLK expansions and looked if those patterns are consistent with the 201 recent history of the species. Therefore, we computed the global species expansion rate (= 202 ratio of LRR-RLK genes per SG in one species divided by the ancestral number) for each of 203 the 31 angiosperm species. As expected, the global expansion rate differs significantly 204 between species (Fig. 3, Supplemental Table SII). Compared to the LCAA (150 genes), the 205 number of LRR-RLK genes did not decrease for most species except for LOTJA (114) and 206 CARPA (127). This indicates that, on average, LRR-RLK genes are more prone to retention 207 than loss. Some species, however, did not significantly expand their average number of LRR- 208 RLK genes compared to the common ancestor: PHODC (158), CUCME (149), CUCSA

Downloaded from www.plantphysiol.org on February11 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 209 (180), SCHPA (194), BRARA (185), MEDTR (183), and RICCO (182). LRR-RLK genes 210 expanded more than 2-fold in GLYMA (477), MALDO (441), POPTR (400), and GOSRA 211 (372), and around 2-fold in MUSAC (280), MAIZE (241), SETIT (301), ORYSJ (317), 212 ORYSI (301), SOLTU (254), PRUPE (260), MANES (238), and EUTSA (240). The 213 remaining species show a moderate expansion rate (1.4-1.75): CACJA (222), THECC (238), 214 JATCU (208), ARATH (222), SOLLC (232), SORBI (225), BRADI (225), VITVI (193), and

Downloaded from www.plantphysiol.org on February12 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 215 ARALY (195). As expected, the four species with the highest global expansion rate 216 (GLYMA, MALDO, POPTR, GOSRA) are recent polyploids in which most SGs have 217 expanded (Fig. 2). However, some SGs expanded more than 2-fold, indicating that small scale 218 duplication events have occurred in addition to polyploidy. In POPTR, for instance, the global 219 expansion rates of SG_Xa and SG_XIIb are more than 8-fold (Fig. 3) and a strong branch- 220 specific expansion rate is detected on the terminal POPTR branch (3.25 for SG_Xa and 5.4

Downloaded from www.plantphysiol.org on February13 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 221 for SG_XIIb) (Fig. 2). Surprisingly, SG_VIIa and SG_VIIb show a high branch-specific 222 expansion rate in POPTR (4.0 and 3.0, respectively), which is not reflected in the global 223 expansion rate in this species (Fig. 3). This is due to the fact that SG_VIIa and SG_VIIb went 224 through strong reduction in Malpighiales (0.33) and Fabids (0.5), respectively. Thus, the 225 cumulative effect of successive reductions and expansions is not evident in the global 226 expansion rate. These contrasted evolutionary dynamics can also be observed in MALDO. A

Downloaded from www.plantphysiol.org on February14 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 227 global expansion of SG_IX was not detected because of the strong reduction in 228 Amygdaloideae. To summarize, this data can be integrated into the species phylogeny to draw 229 an image of the complex evolutionary dynamics of the LRR-RLK gene family through time 230 (Fig. 4). 231 232 Different patterns of lineage specific expansion in LRR-RLK subgroups 233 Given the differences of LRR-RLK expansion rates between species, we wanted to identify 234 cases of LSE, i.e. cases where a high duplication/retention rate is specific to one species. 235 Using a tree reconciliation approach (see Materials and Methods), we built a dataset 236 consisting of ultraparalog clusters (UP – only related by duplication) which represents the 237 LSE events and a superortholog reference gene set (SO – only related by speciation). We only 238 considered clusters containing five or more sequences. After cleaning, our final dataset 239 comprised 75 UP and 189 SO clusters containing 796 and 1,970 sequences, respectively 240 (Table IV). The median number of sequences in the UP clusters is not significantly different 241 from the median number in SO clusters (8 in both cases) (Supplemental Fig. S1). For UP 242 clusters, however, the alignments are significantly longer (Mann-Whitney test: p<0.001) with 243 a median of 3,237 base pairs (bp) compared to 2,841 for SO clusters. One possible 244 explanation for this could be that UP clusters are more dynamic and might contain more 245 LRRs. PRANK, the alignment algorithm we used, introduces gaps instead of aligning 246 ambiguous sites and therefore produces longer alignments when sequences are divergent. 247 However, this phenomenon does not influence the outcome of further test for positive 248 selection using codeml (Yang, 2007). 249 We then wanted to determine which SGs are represented in the SO and UP dataset. 250 Unsurprisingly, all SGs were present in SO clusters (Fig. 5). This could be expected as all 251 SGs were already present in the LCAA and remained stable or expanded (except SG_XV*). 252 In general, the frequency of SO clusters (and sequences) for each SG reflects the number of 253 copies in the LCAA (Table I, Fig. 5). On the other hand, only eleven of the 20 subgroups 254 were represented in UP clusters (SG_I*, SG_III*, SG_VI*, SG_VIII-2, SG_IX, SG_Xa, 255 SG_Xb*, SG_XI*, SG_XIIa, SG_XIIb, SG_XIIIa) and these SGs harbor a total of 837 256 sequences. SG_I*, SG_VIII-2, SG_XIIa, and SG_XIIb are clearly over-represented which is 257 in accordance with their expansion pattern. Other expanded SGs, however, have only a low 258 number of UP clusters or – in the case of SG_IV – no UP clusters at all. Therefore, it seems 259 that recently duplicated genes are more prone to be retained in some SGs. 260

Downloaded from www.plantphysiol.org on February15 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 261 Differences of selective constraint between subgroups, domains, and amino acids 262 To provide further insight into the LRR-RLK gene family evolution, we wanted to determine 263 under which kind of selective pressures the LRR-RLK genes evolved. We focused on the

264 dataset described above, i.e. LSE and orthologous genes. We inferred the dN/dS -ratio (or ω, 265 i.e. the ratio of non-synonymous vs. synonymous substitutions rates) at codons of the 266 alignments and branches of the phylogeny of the UP and SO clusters. An ω=1 indicates 267 neutral evolution/relaxed constraint, an ω<1 indicates purifying selection, and an ω>1 can 268 indicate positive selection. We used mapNH (Dutheil et al., 2012; Romiguier et al., 2012) to 269 compute the ω for each branch. mapNH ran for 71 UP and 176 SO clusters containing 1,246 270 and 2,960 branches, respectively (Table IV). We first wanted to test for relaxation of selective 271 constraint in UP and SO clusters and looked for branches with ω>1. We found 6.04% of UP 272 branches but only 0.49% of SO branches to have an ω>1. The mean ω for branches with ω>1 273 is significantly larger in UP clusters (1.45) compared to SO clusters (1.13, p=0.004). The 274 same is true for branches with ω<1 where ω is significantly larger in UP clusters (0.48) 275 compared to SO clusters (0.24, p<0.001). Overall, the mean ω is significantly larger for 276 branches from UP clusters (0.54) than for SO clusters (0.24, p<0.001) (Table IV, 277 Supplemental Fig. S2). 278 We found 38 out of 75 UP clusters (= 50.67%) containing codons under positive 279 selection (see Supplemental Table SIII for more details) after manual curation but only six out 280 of 186 SO clusters (= 3.23%). Additionally, codons under positive selection found in UP

Downloaded from www.plantphysiol.org on February16 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 281 clusters are not distributed evenly over domains (Fig. 6). To account for differences in domain 282 size, a hit frequency, i.e. the number of sites under positive selection we found relative to all 283 sites possible for each domain, was calculated (see Materials and Methods). The domain 284 showing the highest hit frequency is the LRR domain, followed by the cysteine-pairs and their

Downloaded from www.plantphysiol.org on February17 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 285 flanking regions (Fig. 6A). Hits in both domains are distributed over all SGs and species 286 tested. The KD and its surrounding domains contain very few codons under positive selection. 287 Domains classified as “other” combine domains important for the function of the LRR-RLK 288 genes but vary between SGs. For example, SG_I* (Fig. 6B) contains a malectin domain. All 289 hits classified as “other” here fall in the malectin-like domain (MLD) of a POPTR SG_I* 290 cluster. 291 Finally, we wanted to investigate whether some AAs in the LRR are more frequently 292 targeted by positive selection. The LRR typically contains 24 AAs and sometimes islands 293 between them (Fig. 6C). Four AAs were predominantly subject to positive selection: 6, 8, 10, 294 and 11 which all lie in the LRR-characteristic LXXLXLXX β-sheet/β-turn structure. 295 296 297

Downloaded from www.plantphysiol.org on February18 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 298 DISCUSSION 299 We studied the SG- and species-specific expansion dynamics in LRR-RLK genes from 31 300 angiosperm genomes in a phylogenetic framework. We also analyzed the lineage-specifically 301 expanded genes in this family to determine to which extent positive selection occurred on

302 them using a dN/dS-based test. We found differences in expansion patterns depending on SGs 303 and species but only a few SGs that were subjected to LSE. A significantly higher proportion 304 of LSE LRR-RLK genes was affected by positive selection compared to single-copy genes 305 and the LRR domain (specifically four AAs within this domain) were targeted by positive 306 selection. In the following, we will discuss our findings in more detail. 307 308 Subgroup- and species-specific expansions 309 We observed significant variations in the global expansion rates between LRR-RLK SGs. 310 These are due to a complex history of expansion-retention-loss cycles that are specific to each 311 SG. The phylogenetic approach allowed us to determine the relative importance of ancestral 312 versus recent species-specific expansions for each SG and to characterize precisely the 313 loss/retention dynamics along evolutionary history of the studied species (summarized in Fig. 314 4). For example, SG_III* and SG_XI* had a high copy number of LRR-RLK in the LCAA 315 and kept a stable copy number over the last 150 million years. On the other hand, SG_I*, 316 SG_XIIa, and SGXIIb, which had a moderate copy number in the LCAA, keep expanding. 317 Some functions have been described for genes of these SGs, mainly in A.thaliana 318 (Supplemental Table SIV). For SG_III* and SG_XI*, mostly genes involved in development 319 are described. The high numbers of ancestral genes in these two SGs combined with their size 320 stability during angiosperm evolution may be interpreted as an early high level of 321 diversification/specialization of these genes which are needed to orchestrate common 322 developmental features. This hypothesis can be reinforced by the high number of 323 superorthologous genes in these subgroups. For SG_I* and SG_XIIa, on the other hand, 324 mostly genes involved in response to biotic stress are described up to now. These observations 325 confirm that different expansion/retention patterns appear to be related to gene function 326 although one has to keep in mind that functions have only been assigned to few LRR-RLK 327 genes. Three SGs (SG_IX, SG_Xa, and SG_XIV) expanded compared to their very low 328 ancestral number (1-3) leading to a high total expansion rate. As it has been postulated that 329 duplications are the raw material for adaptation (Nei and Rooney, 2005; Fischer et al., 2014), 330 the evolution of those SGs was likely driven by adaptation – to varying degrees in different 331 angiosperm species, depending on the environment they evolved in. The known functions are

Downloaded from www.plantphysiol.org on February19 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 332 both related to response to biotic or abiotic stress and development. Because so far our 333 knowledge of LRR-RLK functions is limited and mostly restricted to A. thaliana, further 334 studies are needed to make more reliable statements on the link between function and 335 expansion/retention dynamics in different SGs. 336 Next, we wanted to ascertain species-specific expansions of LRR-RLK genes and how 337 they are related to the recent history of the species in our study. Whole genome multiplication 338 has been argued to be a major force in diversification of angiosperms (Soltis et al., 2009; 339 Soltis and Burleigh, 2009; Renny-Byfield and Wendel, 2014). All angiosperms share two 340 ancient WGDs (Jiao et al., 2011). Likewise, all monocots share a WGD ~130 Mya (Tang et 341 al., 2010a) and most dicots () share a WGT around the same time (Jaillon et al., 342 2007; Wang et al., 2012), but more recent WGDs and WGTs occurred in many angiosperm 343 species (Fig. 4, Table III). The link between WGD/Ts and the number of LRR-RLK genes is 344 not straightforward. We found that in Glycine max, Gossypium raimondii, and Malus x 345 domestica, which were subject to relatively recent WGDs (15-13, 17-13 and 45-30 Mya, 346 respectively) (Pfeil et al., 2005; Velasco et al., 2010; Wang et al., 2012), the number of LRR- 347 RLK genes expanded more than 2-fold compared to the LCAA. These results are in 348 accordance with what was already described for these species. Indeed, it was found that G. 349 max contains a very large number of retained genes from this WGD (Cannon et al., 2014). 350 Additionally, recent studies on large gene families in G. raimondii indicate that their copy 351 number is either driven by retention after the last WGD (e.g. NAC transcription factors) 352 (Shang et al., 2013) or a combination of segmental (SD) and tandem duplications (TD) (e.g. 353 WRKY transcription factors) (Dou et al., 2014). For M. domestica (most recent WDG after 354 the divergence for peach according to Verde et al. (2013)), a recent study on nucleotide 355 binding site (NBS) LRR genes showed that they also stem mostly from SDs and TDs (Arya et 356 al., 2014). 357 More contrasted results are observed in Brassicaceae where two WGDs occurred (Barker 358 et al., 2009; Fawcett et al., 2009). Most SGs expand their number of genes on this ancestral 359 branch, but the species belonging to this clade mostly retain or loose genes on average (Fig. 2 360 and 4). The only exception concerns Eutrema salsugineum (an A. thaliana relative) which is 361 the only species with a more than 2-fold average expansion rate. The global expansion rate in 362 E. salsugineum is mostly due to two SGs (SG_I* and SG_XIIIa). In the original genome 363 paper (Wu et al., 2012), the authors found that genes from the category “response to stimulus” 364 (response to salt stress, osmotic stress, water deprivation, ABA stimulus, and hypoxia) are 365 significantly over-represented in E. salsugineum compared to A. thaliana. This over-

Downloaded from www.plantphysiol.org on February20 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 366 representation is described as mostly caused by SDs and TDs (Wu et al., 2012) in accordance 367 with what we observed in SG_XIIIa. This could be of functional importance to this halophile 368 plant. 369 Finally, of all species analyzed here, Zea mays and Brassica rapa (and maybe Manihot 370 esculenta) show the most recent cases of WGD/T (12-5 and 9-5 Mya, respectively) (Schnable 371 et al., 2011; Wang et al., 2011) yet their expansion rates are moderate. This is further evidence 372 for the dynamic nature of angiosperm genomes that has been discussed before (Leitch and 373 Leitch, 2012; Fischer et al., 2014). After a WGD event, genomes tend to return to the diploid 374 (or previous) state by losing redundant duplicated genes (fractionation process) – although the 375 gene loss is biased (Bowers et al., 2003; Schnable et al., 2009). Which genes are lost or 376 retained strongly depends on their function (De Smet et al., 2013). However, it has been 377 shown that genes involved in stress response are mostly created by TDs rather than WGD 378 (Hanada et al., 2008). Indeed, it was hypothesized before that RLK genes involved in stress 379 response mostly duplicate by TD (Shiu et al., 2004). Here, we provide a detailed 380 representation of expansion-retention-loss dynamics of the whole LRR-RLK gene family in 381 31 angiosperm species (Fig.4). Each new genome sequenced will improve the accuracy of the 382 expansion-retention-loss event predictions and will help identifying new elements that can be 383 useful for future functional analysis and/or linked to adaptive traits. 384 385 Studying selection pressures in a large and dynamic gene family 386 As described above, the composition of LRR-RLKs in each of the 31 studied angiosperm 387 species results from a complex dynamic of species- and SG-specific expansion/loss events. To 388 further investigate the potential role of this family in plant adaptation we analyzed to which 389 selective pressures the LRR-RLKs were submitted. Such an analysis cannot be considered for 390 the phylogeny of the entire gene family because of the high number of sequences and high 391 sequence divergence (the phylogeny on which we divided the SGs was inferred on the 392 conserved kinase domain only). We then chose to focus on two specific cases: (i) LSE as a 393 specific case of duplication/retention, and (ii) a subset of strictly orthologous genes. Indeed, 394 LSE has been shown to fuel adaptation in angiosperms (Fischer et al., 2014) and we wanted to 395 test the prevalence of this mode of duplication in our large dataset. Therefore, we evaluated to 396 which extant LRR-RLK genes were subject to LSE and how positive selection acted on those 397 genes. As a reference, we chose the strictly orthologous subset. This approach allows the 398 interpretation of LSE evolution compared to the general LRR-RLK selective background 399 (Fischer et al., 2014).

Downloaded from www.plantphysiol.org on February21 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 400 The power of this phylogenetic approach relies on the number of species analyzed and we 401 profit from an ever increasing number of sequenced plant genomes. Another important 402 requirement for this approach is the quality of sequencing and annotation – especially for a 403 large gene family – as sequencing errors and mis-annotations can lead to false positives when 404 testing for positive selection (Han et al., 2013). We profit from a recently developed pipeline 405 designed to automatically perform different steps of the analysis (Fischer et al., 2014). This 406 allowed us to quickly incorporate sequenced genomes of choice and future studies can easily 407 expand this analysis as new reliable data becomes available. Finally, we set great value on 408 manually verifying the data throughout the process – from the identification of the LRR- 409 RLKs to the inference of positive selection. Although this is tedious work for such a large 410 dataset, it is nevertheless important. As we recently showed, ~50% automatically reported 411 instances of positive selection turned out to be false positives after manual curation (Fischer et 412 al., 2014). 413 We found that all SGs are represented in the single-copy reference set with an over- 414 representation of SG_III* and SG_XI*. This is in accordance with the fact that these two SGs 415 had the highest number of copies in the genome of the LCAA and did not significantly 416 expand since (see above). In general, the frequency of clusters from the single-copy gene set 417 (and sequences) for each SG reflects the number of copies in the LCAA (Table I, Fig. 5). On 418 the other hand, only eleven of the 20 SGs were represented in the LSE dataset. This is mainly 419 because the majority of expansions are rather old in these SGs, whereas they happened 420 relatively recently in SG_I*, SG_VIII-2, SG_XIIa, and SG_XIIb (see above). Fourteen 421 species (or clades) are represented in the LSE dataset: MUSAC (2 ultraparalog clusters), 422 SETIT (1), ORYZA (10), VITVI (3), SOLAN (6), MEDTR (3), GLYMA (2), PRUPE (6), 423 MALDO (11), POPTR (8), BRASS (11), GOSRA (5), THECC (2), and PHYPA (5). Again, 424 not every species is affected to the same extent, but this does not necessarily reflect recent 425 WGD/T. Additionally, LSE can also arise from SD and TD which frequency of occurrence is 426 not uniform within or between genomes. Our results indicate that different species are more 427 prone to retain recently duplicated genes than others. This in turn might reflect on their recent 428 evolution or domestication which should be examined in more detail in future studies. 429 When focusing on the study of selective pressures, we first looked at ω at the branches of 430 the LSE and single-copy gene clusters and found that selective constraint was relaxed in the 431 LSE dataset. This outcome was expected as it was previously shown that LSE genes evolve 432 more relaxed constraint than single-copy genes in angiosperms (Fischer et al., 2014). This 433 study, however, looked at whole angiosperm genomes but a similar pattern has already been

Downloaded from www.plantphysiol.org on February22 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 434 demonstrated in other large gene families (e.g. Johnson and Thomas, 2007; Xue et al., 2012; 435 Yang et al., 2013a; Yang et al., 2013b) and in LRR-RLK genes in particular (Tang et al., 436 2010b). Previous studies on that subject only had a limited dataset (four angiosperm species; 437 Tang et al. 2010b). Here, we demonstrate that this is still true when a larger and more 438 representative sample of angiosperms is considered. 439 Next, we wanted to identify codons which evolved under positive selection in the LSE 440 and the single-copy dataset. A recent study on gene families in the whole genomes of ten 441 angiosperms found that 5.4% of LSE genes contained codons showing positive selection 442 footprints (Fischer et al., 2014). Here, we ask if and to what extend this is also true for the 443 large and dynamic LRR-RLK gene family. We discovered that for LSE LRR-RLK genes, the 444 rate of codons under selection is almost 10-fold higher (50.67%) than the genome average. In 445 addition, we found >3% of single copy genes containing codons under selection whereas 446 Fischer et al. (2014) described no case of positive selection at the single-copy gene clusters in 447 their study. Together with the high rate of branches with ω>1 in LSE gene clusters (6.04%, 448 compared to 0.49% for single-copy genes) this indicates that LRR-RLK genes are more prone 449 to evolve under positive selection than the average of angiosperm gene families. As it might 450 be expected, all ultraparalog clusters with codons under positive selection come from the four 451 over-represented SGs: SG_I* (1 ultraparalog cluster), SG_VIII-2 (3), SG_XIIa (24), and 452 SG_XIIb (10). The single-copy gene clusters with codons under selection come from six SGs: 453 SG_III, SG_VIIa, SG_Xa, SG_Xb, SG_XIIa, and SG_XIIb. Therefore, recent expansion and 454 retention only affect a few SGs but in those SGs positive selection plays an important role. 455 For SG_XIIa, positive selection has been already previously inferred for genes involved in 456 environmental interactions: Xa21, which confers resistance to the bacterial blight disease, was 457 found to have evolved under positive selection in rice (Wang et al., 1998; Tan et al., 2011); 458 and FLS2, involved also in response to biotic stress, shows a signature of rapid fixation of an 459 adaptive allele in Arabidopsis (Vetter et al., 2012). Future studies on smaller subsets of SGs 460 will surely cast further light on selection patterns in LRR-RLK genes. Only eleven species (or 461 clades) are represented in the LSE dataset with codons under positive selection: SETIT (1 462 ultraparalog cluster), ORYZA (2), SOLAN (4), MEDTR (2), GLYMA (2), PRUPE (3), 463 MALDO (8), POPTR (7), BRASS (2), GOSRA (5), and THECC (2). Not every species is 464 affected the same level by positive selection and again future studies might bring more details 465 concerning the evolutionary history of specific species and SGs to light. 466 In addition, we found that not every domain of the LRR-RLK genes was similarly 467 affected by positive selection. Most codons under selection fall in the LRR domain. This

Downloaded from www.plantphysiol.org on February23 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 468 outcome might be expected as LRRs are very dynamic and plasticity in this region provides 469 plants with a broad toolset to face environmental challenges and therefore undergoes positive 470 selection frequently (Zhang et al., 2006; Tang et al., 2010b). Only very few codons under 471 positive selection were found in the kinase domain and its surrounding regions. This result is 472 consistent with the fact that the KD is very conserved among species and SGs and evolved 473 mostly under purifying selection (Shiu et al., 2004; Tang et al., 2010b). A more surprising 474 result was the identification of a significant number of positively selected sites in the 475 malectin-like domain of a Populus trichocarpa SG_I* cluster. So far, the function of 476 extracellular malectin-like domains of RLKs is not well understood (Lindner et al., 2012). 477 However, a malectin-like domain-containing SG_I* LRR-RLK has been described to confer 478 susceptibility to a downy mildew pathogen in A. thaliana and to have similarities to 479 Symbiosis RLKs (SYMRKs) which are important for the regulation of bacterial symbiont 480 accommodation (Markmann et al., 2008; Hok et al., 2011). Therefore, our results suggest that 481 it could be interesting to further investigate the function and evolutionary history of this 482 SG_I* domain, particularly in P. trichocarpa. Another unexpected finding was the frequent 483 occurrence of positive selection at the cys-pairs and flanking regions which are involved in 484 folding and/or the binding to other proteins. To what extent the function of LRR-RLKs is 485 affected by mutations in the cys-pair regions depends on the function of the gene (Song et al., 486 2010; Sun et al., 2012) and it would be interesting to study this in more detail in the future. 487 Finally, we took a closer look at the AAs in the LRR primarily affected by positive 488 selection. Only four, out of the 24 AAs a LRR typically contains, were predominantly and 489 strongly subject to positive selection. These variable AAs lie in the un-conserved part of the 490 LRR-characteristic LXXLXLXX β-sheet/β-turn structure which is involved in protein-protein 491 interactions (Jones and Jones, 1997; Enkhbayar et al., 2004). Specifically, solvent-exposed 492 residues were targeted by positive selection (Parniske et al., 1997; Wang et al., 1998). Further 493 investigation on the functional consequences of these nucleotide variations need to be done to 494 confirm their adaptive potential but our findings align very well to the current understanding 495 of LRR ligand binding. Taken together, our results could be very useful for further functional 496 investigations of LRR-RLK genes in different species. 497 498 499 CONCLUSIONS

500 We studied LRR-RLK genes from 33 land plant species to investigate SG- and species- 501 specific expansion of these genes, to which extent they were subject to LSE, and to determine

Downloaded from www.plantphysiol.org on February24 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 502 the role positive selection played in the evolution of this large gene family. We described that 503 some SGs are more prone to expansion/retention than others and that the expansions occurred 504 at different times in the evolution of LRR-RLK genes. This fine-scale analysis of the dynamic 505 allowed us to identify branches and species for which a higher than average retention rate 506 could indicate a potential adaptive event for some SGs. We also described that only a few SGs 507 show patterns of recent LSE and that at those genes selective constraint is relaxed. More than 508 50% of the LSE genes contain codons which show evidence for positive selection which is 509 almost 10-fold the frequency previously described at gene families in angiosperms (Fischer et 510 al., 2014). Finally, we found that over the LRR-RLK genes the LRR domain and specifically 511 four AAs responsible for ligand interaction are most frequently subject to selection. 512 513 514 MATERIALS AND METHODS

515 Studied Genomes 516 We analyzed 31 angiosperm genomes (eight monocot (sub)species and 23 dicot species) (see 517 Table II): Phoenix dactylifera (Al-Dous et al., 2011), Musa acuminata (D'Hont et al., 2012), 518 Oryza sativa subsp. japonica (International Rice Genome Sequencing Project, 2005), Oryza 519 sativa subsp. indica (Yu et al., 2002), Brachypodium distachyon (The International 520 Brachypodium Initiative, 2010), Zea mays (Schnable et al., 2009), Sorghum bicolor (Paterson 521 et al., 2009), Setaria italica (Zhang et al., 2012), Solanum tuberosum (Potato Genome 522 Sequencing Consortium, 2011), Solanum lycopersicum (The Tomato Genome Consortium, 523 2012), Vitis vinifera (Jaillon et al., 2007), Lotus japonicus (Sato et al., 2008), Cajanus cajan 524 (Varshney et al., 2012), Arabidopsis thaliana (The Arabidopsis Genome Initiative, 2000), 525 Arabidopsis lyrata (Hu et al., 2011), Schrenkiella parvula (a synonym is Eutrema parvula, we 526 used the nomenclature from Oh et al. (2014)) (Dassanayake et al., 2011), Eutrema 527 salsugineum (a synonym is halophila, we chose the nomenclature according to 528 phytosome: http://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Esalsugineum) (Wu 529 et al., 2012), Brassica rapa (Wang et al., 2011), Populus trichocarpa (Tuskan et al., 2006), 530 Glycine max (Schmutz et al., 2010), Medicago truncatula (Young et al., 2011), Prunus 531 persica (Ahmad et al., 2011), Malus x domestica (Velasco et al., 2010), Ricinus communis 532 (Chan et al., 2010), Jatropha curcas (Sato et al., 2011), Manihot esculenta (Prochnik et al., 533 2012), Cucumis sativus (Huang et al., 2009), Cucumis melo (Garcia-Mas et al., 2012), Carica 534 papaya (Ming et al., 2008), Gossypium raimondii (Wang et al., 2012), and Theobroma cacao 535 (Argout et al., 2011). We also extracted LRR-RLKs from the moss Physcomitrella patens

Downloaded from www.plantphysiol.org on February25 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 536 (Rensing et al., 2008) and the spikemoss Selaginella moellendorffii (Banks et al., 2011). 537 Throughout the article we refer to the species using five-digit identifiers which can be found 538 in Table II. Altogether, we analyzed 33 genomes from 39 proteomes (we used several 539 annotation versions of the A. thaliana and O. sativa genomes). A table containing the details 540 on which genome versions we used can be found in Supplemental Table SV. The phylogeny 541 of those species is provided in Fig. 4. 542 543 LRR-RLK extraction, clustering, phylogeny, and identification of gain/loss events 544 We used the hmmsearch program (Eddy, 2009) to extract peptide sequences containing both, 545 intact (i.e. non-degenerated) LRR(s) and a KD from the proteomes as previously described by 546 (Dievart et al., 2011). We classified SGs using the KD by a global phylogenetic analysis (the 547 tree can be found at http://phylogeny.southgreen.fr/kinase/index.php - Global Analysis). First, 548 sequences were aligned using MAFFT (Katoh et al., 2005) with a progressive strategy. 549 Second, the alignments were cleaned using TrimAl (Capella-Gutiérrez et al., 2009) with 550 settings to remove every site with more than 20% of gaps or with a similarity score lower than 551 0.001. Third, a similarity matrix was computed by ProtDist (Felsenstein, 1993) using a JTT 552 model. Fourth, a global distance phylogeny was inferred using FastME (Desper and Gascuel, 553 2006) with default settings and SPR movements to optimize the tree topology. Fifth, SGs 554 were defined manually in the global phylogeny using the Arabidopsis genes as reference 555 which lead us to 20 SGs in contrast to the 15 previously described (Shiu et al., 2004; Lehti- 556 Shiu et al., 2009). 557 More accurate phylogenies were then inferred for each of the 20 SGs. The KD of the 558 sequences attributed to each SG were re-aligned using MAFFT with an iterative strategy 559 (maximum of 100 iterations). Alignments were cleaned using TrimAl with settings to only 560 remove sites with more than 80% of gaps. Then, maximum likelihood phylogenies were 561 inferred by PhyML 3.0 (Guindon et al., 2010) using a LG+gamma model and the best of NNI 562 and SPR topology optimization. Statistical branch support was computed using the aLRT/SH- 563 like strategy (Guindon et al. 2010). This left us with 20 phylogenies, one for each SG (all 564 phylogenies are available at http://phylogeny.southgreen.fr/kinase/index.php - SG_I - 565 SG_XV).

566 Each of the 20 phylogenetic trees has been reconciled with the species tree using RAP- 567 Green (Dufayard et al., 2005; https://github.com/SouthGreenPlatform/rap-green). By 568 comparing the gene tree to the species tree, this analysis allows to root phylogenetic trees and 569 to infer duplication and loss events (Dufayard et al., 2005). We tested this approach of rooting

Downloaded from www.plantphysiol.org on February26 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 570 (by minimizing the number of inferred duplications and losses) and compared it to rooting 571 with outgroups (data not shown). The two methods provided very close root locations that did 572 not change the overall conclusions. Using this RAP-Green tree reconciliation approach 573 (parameters: Maximum support for reduction 0.95), we inferred the number of duplications 574 and losses at each node of the species tree. Briefly, each duplication and loss respectively 575 increases and decreases by one the number of copies in the common ancestor of the 576 taxonomic group analyzed.

577 We determined the global SG- and species-specific expansion rate by computing the 578 number of LRR-RLK genes in one SG divided by the ancestral number and number of LRR- 579 RLK genes in one species divided by the ancestral number, respectively. An ANOVA 580 analysis showed that the expansion rate differed significantly between the SGs/species (p<2e- 581 16 in both cases). We used the TukeyHSD test of the agricolae package (http://cran.r- 582 project.org/web/packages/agricolae/index.html) in R (R Development Core Team, 2012) to 583 further explore which groups of SGs/species differ from each other. This test compares the 584 range of sample means and defines an Honest Significance Difference (HSD) value which is 585 the minimum distance between groups to be considered statistically significant. In short, 586 TukeyHSD is a post-hoc test which groups subsets by significance levels after ANOVA 587 showed significant differences between subsets. 588 589 LSE dataset and testing for positive selection 590 Testing for adaptation can be done by comparing positive (Darwinian) selection footprints in 591 lineages with recently and specifically duplicated genes to reference lineages containing only 592 single-copy genes. One way to infer positive selection is by analyzing nucleotide substitution 593 data at the codon level in a phylogenetic framework. As nucleotide substitutions can either be 594 nonsynonymous (i.e. protein changing, thereby potentially impacting the fitness) or 595 synonymous (i.e. not protein changing, thereby theoretically without consequences for the 596 fitness, but see (Lawrie et al., 2013)), the nonsynonymous/synonymous substitution rate ratio,

597 denoted as dN/dS or ω, can be used to infer the direction and strength of natural selection. An 598 ω<1 indicates purifying selection and the closer ω is to 0, the stronger purifying selection is 599 acting. Under neutral evolution, ω=1. An ω>1 indicates that positive selection is acting. 600 We identified ultraparalog clusters (UP – only related by duplication) using a tree 601 reconciliation approach (Dufayard et al., 2005). Those represent our LSE gene set. As a 602 single-copy gene reference, we chose a superortholog gene set (SO – only related by 603 speciation). We chose clusters with a minimum of five sequences. To address the question of

Downloaded from www.plantphysiol.org on February27 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 604 whether or not positive selection is more frequent after LSE events, we compared the results 605 obtained on UPs with those obtained on SO gene sets. Species which diverged <15 Mya were 606 merged for the LSE detection (see Fig. 4) in order not to overly reduce the ultraparalog (UP) 607 dataset and to not induce bias due to very recent speciation events: ANDRO (= ZEAMA and 608 SORBI), ORYZA (= ORYSJ and ORYSI), SOLAN (= SOLLC and SOLTU), CUCUM (= 609 CUCSA and CUCME), BRASS (= ARATH, ARALY, BRARA, SCHPA and EUTSA). We 610 then applied the pipeline developed by Fischer et al. (2014) to the extracted UP and SO 611 clusters. In short, the pipeline consists of following steps: (i) The clusters were aligned using

612 PRANK+F with codon option (Löytynoja and Goldman, 2005). The alignments were cleaned 613 by GUIDANCE (Penn et al., 2010) with the default sequence quality cut-off and a column 614 cut-off of 0.97 to remove problematic sequences and unreliable sites from the alignments. We 615 used PRANK and GUIDANCE here as previous benchmarks (Fletcher and Yang, 2010; 616 Jordan and Goldman, 2012) showed that these programs lead to a minimum of false positives 617 when inferring positive selection using codeml. The cleaned alignments can be retrieved here: 618 http://phylogeny.southgreen.fr/kinase/alignments.php – Alignments: Manually curated 619 alignments for positive selection analysis. (ii) We relied on the egglib package (De Mita and 620 Siol, 2012) to infer the maximum likelihood phylogeny at the nucleotide level for every 621 alignment using PhyML 3.0 (Guindon et al., 2010) under the GTR substitution model. (iii) 622 We ran the codeml site model implemented in the PAML4 software (Yang, 2007) to infer 623 positive selection on codons under several substitution models. In clusters identified to have 624 evolved under positive selection, Bayes empirical Bayes was used to calculate the posterior 625 probabilities at each codon and detect those under positive selection (i.e. those with a 626 posterior probability of ω>1 strictly above 95%). All alignments detected to be under positive 627 selection at the codon level were curated manually for potential alignment errors. A table 628 containing the details on all codons showing a signal of positive selection using codeml can 629 be found in Supplemental Table SIII. (iv) We used mapNH (Dutheil et al., 2012; Romiguier et 630 al., 2012) to infer ω at the branch level. 631 In order to analyze the distribution of positively selected sites among domains, we 632 calculated a hit frequency that computes the number of sites under positive selection found in 633 each domain relative to all sites possible. All possible sites for each domain were calculated 634 as follows: First, we extracted the size of each domain of every SG. If SGs were subdivided 635 further we took the average size of each domain. Second, we multiplied the size of each 636 domain by the number of UP clusters we found for each SG. For example: the LRR of SG_I* 637 contains an average of 77 sites. We found 8 UP clusters for SG_I*. Therefore, the total

Downloaded from www.plantphysiol.org on February28 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 638 number of possible LRR sites for SG_I* is 77*8=616 sites. Third, we added the sites for each 639 domain up for all SGs.

Downloaded from www.plantphysiol.org on February29 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 640 SUPPLEMENTAL DATA 641 Supplemental Tables SI – SV can be obtained from: 642 http://phylogeny.southgreen.fr/kinase/index.php - Tables 643 Supplemental Table SI: Estimated divergence times and corresponding references for Fig. 4. 644 Supplemental Table SII: Results of the TukeyHSD test. 645 Supplemental Table SIII: Details on all codons showing a signal of positive selection using 646 codeml. 647 Supplemental Table SIV: Arabidopsis LRR-RLK gene classification according to TAIR 648 (https://www.arabidopsis.org/) 649 Supplemental Table SV: List of genomes used here, with name, link, and version of the 650 genome fasta file. 651 Supplemental Figure S1: Summary of ultraparalog and superortholog cluster size and 652 length. 653 Each dot represents (A) an ultraparalog or (B) a superortholog alignment. The histogram 654 above the scatter plot represents the count of alignments for each cluster size (= number of 655 sequences in alignment); the histogram right to the scatter plot represents the frequency of 656 alignments for each alignment length. 657 Supplemental Figure S2: ω distribution of branches of UP and SO clusters. 658 The distribution of ω of branches in UP (purple) and SO clusters (yellow). 659 660 661 ACKNOWLEGEMENTS 662 The authors thank the reviewers for their helpful comments.

Downloaded from www.plantphysiol.org on February30 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 663 Table I: Total number of LRR-RLK in our angiosperm dataset, number of ancestral genes in 664 LCAA, and median global expansion rate for each SG among the 31 species.

Total number Number of Median global SG of genes ancestral genes expansion rate I* 482 7 2.00 II 349 9 1.22 III* 1,400 32 1.22 IV 131 3 1.33 V 263 5 1.80 VI* 324 10 1.00 VIIa 157 3 1.67 VIIb 84 2 1.50 VIII-1 216 5 1.40 VIII-2 355 8 1.25 IX 193 3 1.67 Xa 143 2 2.00 Xb* 367 9 1.11 XI* 1,177 29 1.28 XIIa 1,126 9 3.00 XIIb 423 4 2.00 XIIIa 84 2 1.50 XIIIb 77 2 1.00 XIV 84 1 3.00 XV* 119 5 0.80 Total 7,554 150 665

Downloaded from www.plantphysiol.org on February31 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 666 Table II: Five-digit code for each species.

Species name Common name Five-digit code Phoenix dactylifera Date palm PHODC Musa acuminata Banana MUSAC Brachypodium distachyon Purple false brome BRADI Oryza sativa ssp. japonica Asian rice ORYSJ Oryza sativa ssp. indica Indian rice ORYSI Setaria italica Foxtail millet SETIT Zea mays Maize MAIZE Sorghum bicolor Milo SORBI Solanum tuberosum Potato SOLTU Solanum lycopersicum Tomato SOLLC Vitis vinifera Common grape vine VITVI Theobroma cacao Cacao tree THECC Gossypium raimondii Cotton progenitor GOSRA Carica papaya Papaya CARPA Arabidopsis thaliana Thale cress ARATH Arabidopsis lyrata Out-crossing ARATH relative ARALY Brassica rapa Turnip BRARA Schrenkiella parvula A saltwater cress SCHPA Eutrema salsugineum A saltwater cress EUTSA Cucumis sativus Cucumber CUCSA Cucumis melo Melon CUCME Prunus persica Peach PRUPE Malus x domestica Apple MALDO Lotus japonicus LOTJA Medicago truncatula Barrel medic MEDTR Glycine max Soybean GLYMA Cajanus cajan Pigeon pea CAJCA Populus trichocarpa Black cottonwood POPTR Ricinus communis Castor oil plant RICCO Jatropha curcas Barbados nut JATCU Maniohot esculenta Cassava MANES Selaginella moellendorffii A spikemoss SELML Physcomitrella patens A moss PHYPA 667

Downloaded from www.plantphysiol.org on February32 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 668 Table III: Polyploidy events Estimated times of polyploidy event and corresponding 669 references for Fig. 4. Event Name Reference Age [million years] 1 Seed plant tetraploidy (Jiao et al., 2011) 350-330 2 Angiosperm tetraploidy (Jiao et al., 2011) 230-190 3 Monocot tetraploidy (Tang et al., 2010a) 130 4 Date palm WGD (D'Hont et al., 2012) 75-65 (?) 5 Banana gamma (D'Hont et al., 2012) 100 6 Banana beta (D'Hont et al., 2012) 65 7 Banana alpha (D'Hont et al., 2012) 65 8 Grass tetraploidy B (sigma) (D'Hont et al., 2012) 123-109 9 Grass tetraploidy (rho) (Paterson et al., 2004) 70 10 Maize tetraploidy (Schnable et al., 2011) 12-5 Eudicot hexaploidy (Jaillon et al., 2007; Cenci et 11 150-120 (Arabidopsis gamma) al., 2010; Wang et al., 2012) (The Tomato Genome 12 Solanum hexaploidy 91-52 Consortium, 2012) 13 Papiloniod tetraploidy (Pfeil et al., 2005) 55-54 14 Soybean tetraploidy (Pfeil et al., 2005) 15-13 (Velasco et al., 2010; Verde 15 Apple tetraploidy 45-30 et al., 2013) 16 Poplar tetraploidy (Tuskan et al., 2006) 65-60 17 Arabidopsis beta (Fawcett et al., 2009) 70-40 18 Arabidopsis alpha (Barker et al., 2009) 23 19 Brassica hexaploidy (Wang et al., 2011) 9-5 20 Cotton WGD (Wang et al., 2012) 20-13 (Mühlhausen and Kollmar, ? (after 21 Cassava WGD 2013) Crotonoideae split) 670 671

Downloaded from www.plantphysiol.org on February33 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 672 Table IV: Details of the LSE and mapNH analysis for UP and SO clusters. UP SO total number of clusters 75 189 clusters for final mapNH analysis 71 176 median cluster size (1st; 3rd Qu) 8 (6; 12) 8 (6; 14) min; max cluster size 5; 38 5; 25 median alignment length (1st; 3rd Qu) 3,237 2,841 (2,952; 3,574) (2,034; 3,192) min; max alignment length 1,749; 8,691 861; 6,216 branches analyzed/total number of 1,193/1,246 2,860/2,960 branches clusters with branches omega > 1.0 (%) 25 (35.21) 10 (5.68) branches with omega < 1.0 (%) 1,121 (93.96) 2,846 (99.51) mean omega for < 1.0 branches +/- sd 0.48 +/- 0.17 0.24 +/- 0.12 branches with omega > 1.0 (%) 72 (6.04) 14 (0.49) mean omega for > 1.0 branches +/- sd 1.45 +/- 0.51 1.13 +/- 0.14 mean omega +/- sd 0.54 +/- 0.31 0.24 +/- 0.13 673

674

Downloaded from www.plantphysiol.org on February34 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 675 FIGURE LEGENDS 676 Figure 1: Global expansion rate in each subgroup: Total number of genes in each species 677 divided by the ancestral number (see Table I). An ANOVA test showed that the expansion 678 rate differs significantly between SGs (p<2e-16). Therefore, we performed a TukeyHSD test 679 to determine which SGs exactly show a significant difference between each other and group 680 those SGs by significance level (a-e). Letters above the boxplot indicate TukeyHSD 681 significance group (see Supplemental Table SII). The significance groups are color coded 682 according to the mean expansion rate: orange: >2.25 fold expansion; red: 1.75-2.25 fold 683 expansion; purple: 1.30-1.75 fold expansion; blue: 0.75-1.30 fold expansion (i.e. no 684 expansion). The outlier species are labeled for each SG. See Table II for species IDs. 685 686 Figure 2: Branch-specific expansion/diminution of LRR-RLK genes for every SG on every 687 branch in the phylogenetic tree. The tree on the left displays all the nodes and branches, 688 polyploidy events are marked with dots. Every line gives the expansion rate where the current 689 (descendant) node is compared to the previous (ascendant) node. Red boxes indicate 690 expansion, blue boxes indicate diminution, and blank boxes indicate stagnation. For example: 691 SG_I* has the same number of copies in monocots compared to the ascendant node (= 692 angiosperms) indicated by a blank box. In PHODC, a diminution occurred compared to the 693 ascendant node (= monocots) indicated by a blue box. In MUSAC, an expansion occurred 694 compared to the ascendant node (= monocots) indicated by a red box and so on. 695 696 Figure 3: Global expansion rate in each species: Total number of genes in each species 697 divided by the ancestral number (see Table I). An ANOVA test showed that the expansion 698 rate differs significantly between species (p<2e-16). Therefore, we performed a TukeyHSD 699 test to determine which species exactly show a significant difference between each other and 700 group those species by significance level (a-e). Letters above the boxplot indicate TukeyHSD 701 significance group (see Supplemental Table SII). The significance groups are color coded 702 according to the mean expansion rate: orange: >2.25 fold expansion; red: 1.75-2.25 fold 703 expansion; purple: 1.40-1.75 fold expansion; blue: 0.80-1.40 fold expansion (i.e. no 704 expansion). The outlier SGs are labeled for each species. See Table II for species IDs. 705 706 Figure 4: Phylogenetic tree of the 33 species studied here. Five-digit species identifiers are 707 given in parenthesis next to the species name. Species which diverged <15 mya were merged 708 for the LSE analysis (see Materials and Methods): ANDRO, ORYZA, SOLAN, CUCUM,

Downloaded from www.plantphysiol.org on February35 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 709 BRASS. Polyploidy events and their estimated ages are indicated on the tree: circles on the 710 branches represent WGD, dark circles represent WGT. The numbers in the circles refer to 711 details on the polyploidization events given in Table I. Species divergence and their estimated 712 age are indicated by grey squares on the nodes. The numbers in the squares refer to details on 713 the divergence times given in Supplemental Table SI. Dots and asterisks on the branches 714 indicate SG expansions. Dots, 2-fold; small asterisks, between 2 and 4-fold; large asterisks, 715 equal or more than 4-fold. SG I* (brown), SG_IV (dark green), SG_V (grey), SG_VIIa 716 (orange), SG_VIIb (yellow), SG_VIII-1 (dark brown), SG_VIII-2 (green), SG_IX (light blue), 717 SG_Xa (dark blue), SG_XIIa (pink), SG_XIIb (purple), SG_XIIIa (red), SG_XIV (black), 718 SG_XV* (white). The asterisks and dots do not indicate the exact age. 719 720 Figure 5: Distribution of UP and SO clusters and sequences over all SGs. Frequency of all 721 extracted UP (dark blue) and SO clusters (dark orange) for each SG; Frequency of all 722 extracted UP (light blue) and SO sequences (light orange) for each SG. 723 724 Figure 6: (A) Hit frequency (i.e. frequency of codons under selection vs. total number of 725 sites) for each domain of the LRR-RLK genes. (B) Schematic structure of the LRR-RLK 726 genes, here with SG_I* gene structure as an example. The absence/presence and size of the 727 domains varies between SGs, please see text for details. N-term, N-terminal end; SP (dark 728 grey), signal peptide; Cys-pair 1 (blue), first cysteine pair; NC1, N-terminal of Cys-pair 1; 729 CC1, C-terminal of Cys-pair 1; other (green), other domains; LRR (red), leucine-rich repeat; 730 Cys-pair 2 (blue), first cysteine pair; NC2, N-terminal of Cys-pair 2; CC2, C-terminal of Cys- 731 pair 2; TM (black), trans-membrane domain; JM, juxta-membrane domain; KD (yellow), 732 kinase domain; C-term, C-terminal end; inter, other inter-domain regions. (C) Frequency of 733 amino acids in the LRR domain under positive selection. L, leucine; N, asparagine; G, 734 glycine; I, isoleucine; P, proline; x, variable; is, island between LRRs.

Downloaded from www.plantphysiol.org on February36 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. 735

Downloaded from www.plantphysiol.org on February37 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. Parsed Citations Ahmad R, Parfitt D, Fass J, Ogundiwin E, Dhingra A, Gradziel T, Lin D, Joshi N, Martinez-Garcia P, Crisosto C (2011) Whole genome sequencing of peach (Prunus persica L.) for SNP identification and selection. BMC Genomics 12: 569 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Al-Dous EK, George B, Al-Mahmoud ME, Al-Jaber MY, Wang H, Salameh YM, Al-Azwani EK, Chaluvadi S, Pontaroli AC, DeBarry J, Arondel V, Ohlrogge J, Saie IJ, Suliman-Elmeer KM, Bennetzen JL, Kruegger RR, Malek JA (2011) De novo genome sequencing and comparative genomics of date palm (Phoenix dactylifera). Nat Biotechnol 29: 521-527 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Albert M, Jehle AK, Mueller K, Eisele C, Lipschis M, Felix G (2010) Arabidopsis thaliana pattern recognition receptors for Bacterial Elongation Factor Tu and Flagellin can be combined to form functional chimeric receptors. J Biol Chem 285: 19035-19042 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Argout X, Salse J, Aury JM, Guiltinan MJ, Droc G, Gouzy J, Allegre M, Chaparro C, Legavre T, Maximova SN, Abrouk M, Murat F, Fouet O, Poulain J, Ruiz M, Roguet Y, Rodier-Goud M, Barbosa-Neto JF, Sabot F, Kudrna D, Ammiraju JS, Schuster SC, Carlson JE, Sallet E, Schiex T, Dievart A, Kramer M, Gelley L, Shi Z, Berard A, Viot C, Boccara M, Risterucci AM, Guignon V, Sabau X, Axtell MJ, Ma Z, Zhang Y, Brown S, Bourge M, Golser W, Song X, Clement D, Rivallan R, Tahi M, Akaza JM, Pitollat B, Gramacho K, D'Hont A, Brunel D, Infante D, Kebe I, Costet P, Wing R, McCombie WR, Guiderdoni E, Quetier F, Panaud O, Wincker P, Bocs S, Lanaud C (2011) The genome of Theobroma cacao. Nat Genet 43: 101-108 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Arya P, Kumar G, Acharya V, Singh AK (2014) Genome-wide identification and expression analysis of NBS-encoding genes in Malus x domestica and expansion of NBS genes family in Rosaceae. PLOS ONE 9: e107987 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Banks JA, Nishiyama T, Hasebe M, Bowman JL, Gribskov M, dePamphilis C, Albert VA, Aono N, Aoyama T, Ambrose BA, Ashton NW, Axtell MJ, Barker E, Barker MS, Bennetzen JL, Bonawitz ND, Chapple C, Cheng C, Correa LGG, Dacre M, DeBarry J, Dreyer I, Elias M, Engstrom EM, Estelle M, Feng L, Finet C, Floyd SK, Frommer WB, Fujita T, Gramzow L, Gutensohn M, Harholt J, Hattori M, Heyl A, Hirai T, Hiwatashi Y, Ishikawa M, Iwata M, Karol KG, Koehler B, Kolukisaoglu U, Kubo M, Kurata T, Lalonde S, Li K, Li Y, Litt A, Lyons E, Manning G, Maruyama T, Michael TP, Mikami K, Miyazaki S, Morinaga Si, Murata T, Mueller-Roeber B, Nelson DR, Obara M, Oguri Y, Olmstead RG, Onodera N, Petersen BL, Pils B, Prigge M, Rensing SA, Riaño-Pachón DM, Roberts AW, Sato Y, Scheller HV, Schulz B, Schulz C, Shakirov EV, Shibagaki N, Shinohara N, Shippen DE, Sørensen I, Sotooka R, Sugimoto N, Sugita M, Sumikawa N, Tanurdzic M, Theißen G, Ulvskov P, Wakazuki S, Weng JK, Willats WWGT, Wipf D, Wolf PG, Yang L, Zimmer AD, Zhu Q, Mitros T, Hellsten U, Loqué D, Otillar R, Salamov A, Schmutz J, Shapiro H, Lindquist E, Lucas S, Rokhsar D, Grigoriev IV (2011) The Selaginella genome identifies genetic changes associated with the evolution of vascular plants. Science 332: 960-963 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Barker MS, Vogel H, Schranz ME (2009) Paleopolyploidy in the : Analyses of the Cleome transcriptome elucidate the history of genome duplications in Arabidopsis and other Brassicales. Genome Biol Evol 1: 391-399 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Bowers JE, Chapman BA, Rong J, Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433-438 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Cannon SB, McKain MR, Harkess A, Nelson MN, Dash S, Deyholos MK, Peng Y, Joyce B, Stewart CN, Rolf M, Kutchan T, Tan X, Chen C, Zhang Y, Carpenter E, Wong GK-S, Doyle JJ, Leebens-Mack J (2014) Multiple polyploidy events in the early radiation of nodulating and non-nodulating legumes. Mol Biol Evol: doi:10.1093/molbev/msu1296 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T (2009) trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25: 1972-1973 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Castells E, Casacuberta JM (2007) Signalling through kinase-defective domains: the prevalence of atypical receptor-like kinases in plants. J Exp Bot 58: 3503-3511 Downloaded from www.plantphysiol.org on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Cenci A, Combes M-C, Lashermes P (2010) Comparative sequence analyses indicate that Coffea (Asterids) and Vitis (Rosids) derive from the same paleo-hexaploid ancestral genome. Mol Genet Genomics 283: 493-501 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Chan AP, Crabtree J, Zhao Q, Lorenzi H, Orvis J, Puiu D, Melake-Berhan A, Jones KM, Redman J, Chen G, Cahoon EB, Gedil M, Stanke M, Haas BJ, Wortman JR, Fraser-Liggett CM, Ravel J, Rabinowicz PD (2010) Draft genome sequence of the oilseed species Ricinus communis. Nat Biotechnol 28: 951-956 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Chevalier D, Batoux M, Fulton L, Pfister K, Yadav RK, Schellenberg M, Schneitz K (2005) STRUBBELIG defines a receptor kinase- mediated signaling pathway regulating organ development in Arabidopsis. Proc Natl Acad Sci USA 102: 9074-9079 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title D'Hont A, Denoeud F, Aury JM, Baurens FC, Carreel F, Garsmeur O, Noel B, Bocs S, Droc G, Rouard M, Da Silva C, Jabbari K, Cardi C, Poulain J, Souquet M, Labadie K, Jourda C, Lengelle J, Rodier-Goud M, Alberti A, Bernard M, Correa M, Ayyampalayam S, McKain MR, Leebens-Mack J, Burgess D, Freeling M, Mbeguie AMD, Chabannes M, Wicker T, Panaud O, Barbosa J, Hribova E, Heslop-Harrison P, Habas R, Rivallan R, Francois P, Poiron C, Kilian A, Burthia D, Jenny C, Bakry F, Brown S, Guignon V, Kema G, Dita M, Waalwijk C, Joseph S, Dievart A, Jaillon O, Leclercq J, Argout X, Lyons E, Almeida A, Jeridi M, Dolezel J, Roux N, Risterucci AM, Weissenbach J, Ruiz M, Glaszmann JC, Quetier F, Yahiaoui N, Wincker P (2012) The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 488: 213-217 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Dassanayake M, Oh DH, Haas JS, Hernandez A, Hong H, Ali S, Yun DJ, Bressan RA, Zhu JK, Bohnert HJ, Cheeseman JM (2011) The genome of the extremophile crucifer Thellungiella parvula. Nat Genet 43: 913-918 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title De Mita S, Siol M (2012) EggLib: processing, analysis and simulation tools for population genetics and genomics. BMC Genet 13: 27 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title De Smet I, Voss U, Jurgens G, Beeckman T (2009) Receptor-like kinases shape the plant. Nat Cell Biol 11: 1166-1173 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title De Smet R, Adams KL, Vandepoele K, Van Montagu MCE, Maere S, Van de Peer Y (2013) Convergent gene loss following gene and genome duplications creates single-copy families in flowering plants. Proc Natl Acad Sci USA 110: 2898-2903 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Desper R, Gascuel O (2006) Getting a Tree Fast: Neighbor Joining, FastME, and Distance-Based Methods. In I John Wiley & Sons, ed, Curr Protoc Bioinformatics, pp 6.3.1-6.3.28 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Dievart A, Gilbert N, Droc G, Attard A, Gourgues M, Guiderdoni E, Perin C (2011) Leucine-rich repeat receptor kinases are sporadically distributed in eukaryotic genomes. BMC Evol Biol 11: 367 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Dodds PN, Rathjen JP (2010) Plant immunity: towards an integrated view of plant-pathogen interactions. Nat Rev Genet 11: 539-548 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Dou L, Zhang X, Pang C, Song M, Wei H, Fan S, Yu S (2014) Genome-wide analysis of the WRKY gene family in cotton. Mol Genet Genomics 289: 1103-1121 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title

Downloaded from www.plantphysiol.org on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. Dufayard JF, Duret L, Penel S, Gouy M, Rechenmann F, Perrière G (2005) Tree pattern matching in phylogenetic trees: automatic search for orthologs or paralogs in homologous gene sequence databases. Bioinformatics 21: 2596-2603 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Dutheil JY, Galtier N, Romiguier J, Douzery EJ, Ranwez V, Boussau B (2012) Efficient selection of branch-specific models of sequence evolution. Mol Biol Evol 29: 1861-1874 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Eddy SR (2009) A new generation of homology search tools based on probabilistic inference. Genome Inform October 2009: 205- 211 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Enkhbayar P, Kamiya M, Osaki M, Matsumoto T, Matsushima N (2004) Structural principles of leucine-rich repeat (LRR) proteins. Proteins: Struct, Funct, Bioinf 54: 394-403 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Fawcett JA, Maere S, Van de Peer Y (2009) Plants with double genomes might have had a better chance to survive the Cretaceous-Tertiary extinction event. Proc Natl Acad Sci USA 106: 5737-5742 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Felsenstein J (1993) PHYLIP (Phylogeny Inference Package) version 3.5c. Distributed by the author, Department of Genetics, University of Washington, Seattle Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Fischer I, Dainat J, Ranwez V, Glémin S, Dufayard J-F, Chantret N (2014) Impact of recurrent gene duplication on adaptation of plant genomes. BMC Plant Biol 14: 151 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Fletcher W, Yang Z (2010) The effect of insertions, deletions, and alignment errors on the branch-site test of positive selection. Mol Biol Evol 27: 2257-2267 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Garcia-Mas J, Benjak A, Sanseverino W, Bourgeois M, Mir G, González VM, Hénaff E, Câmara F, Cozzuto L, Lowy E, Alioto T, Capella-Gutiérrez S, Blanca J, Cañizares J, Ziarsolo P, Gonzalez-Ibeas D, Rodríguez-Moreno L, Droege M, Du L, Alvarez-Tejado M, Lorente-Galdos B, Melé M, Yang L, Weng Y, Navarro A, Marques-Bonet T, Aranda MA, Nuez F, Picó B, Gabaldón T, Roma G, Guigó R, Casacuberta JM, Arús P, Puigdomènech P (2012) The genome of melon (Cucumis melo L.). Proc Natl Acad Sci USA 109: 11872-11877 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Gish LA, Clark SE (2011), The RLK/Pelle family of kinases. Plant J 66: 117-127 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O (2010) New algorithms and methods to estimate maximum- likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol 59: 307-321 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Hahn MW (2009) Distinguishing among evolutionary models for the maintenance of gene duplicates. J Hered 100: 605-617 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Han MV, Thomas GWC, Lugo-Martinez J, Hahn MW (2013) Estimating gene gain and loss rates in the presence of error in genome assembly and annotation using CAFE 3. Mol Biol Evol 30: 1987-1997 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Hanada K, Zou C, Lehti-Shiu MD, Shinozaki K, Shiu SH (2008) Importance of lineage-specific expansion of plant tandem duplicates Downloaded from www.plantphysiol.org on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. in the adaptive response to environmental stimuli. Plant Physiol 148: 993-1003 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Hok S, Danchin EGJ, Allasia V, Panabières F, Attard A, Keller H (2011) An Arabidopsis (malectin-like) leucine-rich repeat receptor- like kinase contributes to downy mildew disease. Plant, Cell Environ 34: 1944-1957 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Hu TT, Pattyn P, Bakker EG, Cao J, Cheng JF, Clark RM, Fahlgren N, Fawcett JA, Grimwood J, Gundlach H, Haberer G, Hollister JD, Ossowski S, Ottilar RP, Salamov AA, Schneeberger K, Spannagl M, Wang X, Yang L, Nasrallah ME, Bergelson J, Carrington JC, Gaut BS, Schmutz J, Mayer KF, Van de Peer Y, Grigoriev IV, Nordborg M, Weigel D, Guo YL (2011) The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nat Genet 43: 476-481 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Huang S, Li R, Zhang Z, Li L, Gu X, Fan W, Lucas WJ, Wang X, Xie B, Ni P, Ren Y, Zhu H, Li J, Lin K, Jin W, Fei Z, Li G, Staub J, Kilian A, van der Vossen EA, Wu Y, Guo J, He J, Jia Z, Ren Y, Tian G, Lu Y, Ruan J, Qian W, Wang M, Huang Q, Li B, Xuan Z, Cao J, Asan, Wu Z, Zhang J, Cai Q, Bai Y, Zhao B, Han Y, Li Y, Li X, Wang S, Shi Q, Liu S, Cho WK, Kim JY, Xu Y, Heller-Uszynska K, Miao H, Cheng Z, Zhang S, Wu J, Yang Y, Kang H, Li M, Liang H, Ren X, Shi Z, Wen M, Jian M, Yang H, Zhang G, Yang Z, Chen R, Liu S, Li J, Ma L, Liu H, Zhou Y, Zhao J, Fang X, Li G, Fang L, Li Y, Liu D, Zheng H, Zhang Y, Qin N, Li Z, Yang G, Yang S, Bolund L, Kristiansen K, Zheng H, Li S, Zhang X, Yang H, Wang J, Sun R, Zhang B, Jiang S, Wang J, Du Y, Li S (2009) The genome of the cucumber, Cucumis sativus L. Nat Genet 41: 1275-1281 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Innan H (2009) Population genetic models of duplicated genes. Genetica 137: 19-37 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Innan H, Kondrashov F (2010) The evolution of gene duplications: classifying and distinguishing between models. Nat Rev Genet 11: 97-108 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title International Rice Genome Sequencing Project (2005) The map-based sequence of the rice genome. Nature 436: 793-800 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N, Aubourg S, Vitulo N, Jubin C, Vezzi A, Legeai F, Hugueney P, Dasilva C, Horner D, Mica E, Jublot D, Poulain J, Bruyere C, Billault A, Segurens B, Gouyvenoux M, Ugarte E, Cattonaro F, Anthouard V, Vico V, Del Fabbro C, Alaux M, Di Gaspero G, Dumas V, Felice N, Paillard S, Juman I, Moroldo M, Scalabrin S, Canaguier A, Le Clainche I, Malacrida G, Durand E, Pesole G, Laucou V, Chatelet P, Merdinoglu D, Delledonne M, Pezzotti M, Lecharny A, Scarpelli C, Artiguenave F, Pe ME, Valle G, Morgante M, Caboche M, Adam-Blondon AF, Weissenbach J, Quetier F, Wincker P, French-Italian Public Consortium for Grapevine Genome C (2007) The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449: 463-467 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Jiao Y, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L, Ralph PE, Tomsho LP, Hu Y, Liang H, Soltis PS, Soltis DE, Clifton SW, Schlarbaum SE, Schuster SC, Ma H, Leebens-Mack J, dePamphilis CW (2011) Ancestral polyploidy in seed plants and angiosperms. Nature 473: 97-100 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Johnson DA, Thomas MA (2007) The monosaccharide transporter gene family in Arabidopsis and rice: a history of duplications, adaptive evolution, and functional divergence. Mol Biol Evol 24: 2412-2423 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Jones DA, Jones JDG (1997) The role of leucine-rich repeat proteins in plant defences. Adv Bot Res 24: 90-167 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Jordan G, Goldman N (2012) The effects of alignment error and alignment filtering on the sitewise detection of positive selection. Mol Biol Evol 29: 1125-1139 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only TitleDownloaded Only Author from and www.plantphysiol.orgTitle on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. Katoh K, Kuma K-I, Toh H, Miyata T (2005) MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 33: 511-518 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Langham RJ, Walsh J, Dunn M, Ko C, Goff SA, Freeling M (2004) Genomic duplication, fractionation and the origin of regulatory novelty. Genetics 166: 935-945 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Lawrie DS, Messer PW, Hershberg R, Petrov DA (2013) Strong purifying selection at synonymous sites in D. melanogaster. PLOS Genet 9: e1003527 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Lehti-Shiu M, Zou C, Shiu S-H (2012) Origin, diversity, expansion history, and functional evolution of the plant Receptor-Like Kinase/Pelle family. In F Tax, B Kemmerling, eds, Receptor-like Kinases in Plants, Vol 13. Springer Berlin Heidelberg, pp 1-22 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Lehti-Shiu MD, Zou C, Hanada K, Shiu S-H (2009) Evolutionary history and stress regulation of plant Receptor-Like Kinase/Pelle genes. Plant Physiol 150: 12-26 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Leitch AR, Leitch IJ (2012) Ecological and genetic factors linked to contrasting genome dynamics in seed plants. New Phytol 194: 629-646 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Lindner H, Müller LM, Boisson-Dernier A, Grossniklaus U (2012) CrRLK1L receptor-like kinases: not just another brick in the wall. Curr Opin Plant Biol 15: 659-669 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Löytynoja A, Goldman N (2005) An algorithm for progressive multiple alignment of sequences with insertions. Proc Natl Acad Sci USA 102: 10557-10562 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Markmann K, Giczey G, Parniske M (2008) Functional adaptation of a plant Receptor- Kinase paved the way for the evolution of intracellular root symbioses with bacteria. PLOS Biol 6: e68 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Ming R, Hou S, Feng Y, Yu Q, Dionne-Laporte A, Saw JH, Senin P, Wang W, Ly BV, Lewis KL, Salzberg SL, Feng L, Jones MR, Skelton RL, Murray JE, Chen C, Qian W, Shen J, Du P, Eustice M, Tong E, Tang H, Lyons E, Paull RE, Michael TP, Wall K, Rice DW, Albert H, Wang ML, Zhu YJ, Schatz M, Nagarajan N, Acob RA, Guan P, Blas A, Wai CM, Ackerman CM, Ren Y, Liu C, Wang J, Wang J, Na JK, Shakirov EV, Haas B, Thimmapuram J, Nelson D, Wang X, Bowers JE, Gschwend AR, Delcher AL, Singh R, Suzuki JY, Tripathi S, Neupane K, Wei H, Irikura B, Paidi M, Jiang N, Zhang W, Presting G, Windsor A, Navajas-Perez R, Torres MJ, Feltus FA, Porter B, Li Y, Burroughs AM, Luo MC, Liu L, Christopher DA, Mount SM, Moore PH, Sugimura T, Jiang J, Schuler MA, Friedman V, Mitchell-Olds T, Shippen DE, dePamphilis CW, Palmer JD, Freeling M, Paterson AH, Gonsalves D, Wang L, Alam M (2008) The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus). Nature 452: 991-996 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Moore RC, Purugganan MD (2005) The evolutionary dynamics of plant duplicate genes. Curr Opin Plant Biol 8: 122-128 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Mühlhausen S, Kollmar M (2013) Whole genome duplication events in plant evolution reconstructed and predicted using myosin motor proteins. BMC Evol Biol 13: 202 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Nei M, Rooney AP (2005) Concerted and birth-and-death evolution of multigene families. Ann Rev Genet 39: 121-152 Pubmed: Author and Title Downloaded from www.plantphysiol.org on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Oh D-H, Hong H, Lee SY, Yun D-J, Bohnert HJ, Dassanayake M (2014) Genome structures and transcriptomes signify niche adaptation for the multiple-ion-tolerant extremophyte Schrenkiella parvula. Plant Physiol 164: 2123-2138 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Oh M-H, Wang X, Kota U, Goshe MB, Clouse SD, Huber SC (2009) Tyrosine phosphorylation of the BRI1 receptor kinase emerges as a component of brassinosteroid signaling in Arabidopsis. Proc Natl Acad Sci USA 106: 658-663 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Parniske M, Hammond-Kosack KE, Golstein C, Thomas CM, Jones DA, Harrison K, Wulff BBH, Jones JDG (1997) Novel disease resistance specificities result from sequence exchange between tandemly repeated genes at the Cf-4/9 locus of tomato. Cell 91: 821-832 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A, Schmutz J, Spannagl M, Tang H, Wang X, Wicker T, Bharti AK, Chapman J, Feltus FA, Gowik U, Grigoriev IV, Lyons E, Maher CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov AA, Wang Y, Zhang L, Carpita NC, Freeling M, Gingle AR, Hash CT, Keller B, Klein P, Kresovich S, McCann MC, Ming R, Peterson DG, Mehboob ur R, Ware D, Westhoff P, Mayer KF, Messing J, Rokhsar DS (2009) The Sorghum bicolor genome and the diversification of grasses. Nature 457: 551-556 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Paterson AH, Bowers JE, Chapman BA (2004) Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc Natl Acad Sci USA 101: 9903-9908 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Penn O, Privman E, Landan G, Graur D, Pupko T (2010) An alignment confidence score capturing robustness to guide tree uncertainty. Mol Biol Evol 27: 1759-1767 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Pfeil BE, Schlueter JA, Shoemaker RC, Doyle JJ (2005) Placing paleopolyploidy in relation to taxon divergence: A phylogenetic analysis in legumes using 39 gene families. Syst Biol 54: 441-454 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Potato Genome Sequencing Consortium (2011) Genome sequence and analysis of the tuber crop potato. Nature 475: 189-195 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Prochnik S, Marri P, Desany B, Rabinowicz P, Kodira C, Mohiuddin M, Rodriguez F, Fauquet C, Tohme J, Harkins T, Rokhsar D, Rounsley S (2012) The cassava genome: Current progress, future directions. Tropical Plant Biology 5: 88-94 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title R Development Core Team (2012) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Renny-Byfield S, Wendel JF (2014) Doubling down on genomes: Polyploidy and crop plants. Am J Bot 101: 1711-1725 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Rensing SA, Lang D, Zimmer AD, Terry A, Salamov A, Shapiro H, Nishiyama T, Perroud P-F, Lindquist EA, Kamisugi Y, Tanahashi T, Sakakibara K, Fujita T, Oishi K, Shin-I T, Kuroki Y, Toyoda A, Suzuki Y, Hashimoto S-i, Yamaguchi K, Sugano S, Kohara Y, Fujiyama A, Anterola A, Aoki S, Ashton N, Barbazuk WB, Barker E, Bennetzen JL, Blankenship R, Cho SH, Dutcher SK, Estelle M, Fawcett JA, Gundlach H, Hanada K, Heyl A, Hicks KA, Hughes J, Lohr M, Mayer K, Melkozernov A, Murata T, Nelson DR, Pils B, Prigge M, Reiss B, Renner T, Rombauts S, Rushton PJ, Sanderfoot A, Schween G, Shiu S-H, Stueber K, Theodoulou FL, Tu H, Van de Peer Y, Verrier PJ, Waters E, Wood A, Yang L, Cove D, Cuming AC, Hasebe M, Lucas S, Mishler BD, Reski R, Grigoriev IV, Quatrano RS, Boore JL (2008) The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants. Science 319: 64-69 Pubmed: Author and Title CrossRef: Author and Title Downloaded from www.plantphysiol.org on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. Google Scholar: Author Only Title Only Author and Title Rizzon C, Ponger L, Gaut BS (2006) Striking similarities in the genomic distribution of tandemly arrayed genes in Arabidopsis and rice. PLOS Comput Biol 2: e115 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Romiguier J, Figuet E, Galtier N, Douzery EJ, Boussau B, Dutheil JY, Ranwez V (2012) Fast and robust characterization of time- heterogeneous sequence evolutionary processes using substitution mapping. PLOS ONE 7: e33852 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Sato S, Hirakawa H, Isobe S, Fukai E, Watanabe A, Kato M, Kawashima K, Minami C, Muraki A, Nakazaki N, Takahashi C, Nakayama S, Kishida Y, Kohara M, Yamada M, Tsuruoka H, Sasamoto S, Tabata S, Aizu T, Toyoda A, Shin-i T, Minakuchi Y, Kohara Y, Fujiyama A, Tsuchimoto S, Kajiyama Si, Makigano E, Ohmido N, Shibagaki N, Cartagena JA, Wada N, Kohinata T, Atefeh A, Yuasa S, Matsunaga S, Fukui K (2011) Sequence analysis of the genome of an oil-bearing tree, Jatropha curcas L. DNA Res 18: 65-76 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Sato S, Nakamura Y, Kaneko T, Asamizu E, Kato T, Nakao M, Sasamoto S, Watanabe A, Ono A, Kawashima K, Fujishiro T, Katoh M, Kohara M, Kishida Y, Minami C, Nakayama S, Nakazaki N, Shimizu Y, Shinpo S, Takahashi C, Wada T, Yamada M, Ohmido N, Hayashi M, Fukui K, Baba T, Nakamichi T, Mori H, Tabata S (2008) Genome structure of the legume, Lotus japonicus. DNA Res 15: 227-239 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA (2010) Genome sequence of the palaeopolyploid soybean. Nature 463: 178-183 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Schnable JC, Springer NM, Freeling M (2011) Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc Natl Acad Sci USA 108: 4069-4074 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B, Chen W, Yan L, Higginbotham J, Cardenas M, Waligorski J, Applebaum E, Phelps L, Falcone J, Kanchi K, Thane T, Scimone A, Thane N, Henke J, Wang T, Ruppert J, Shah N, Rotter K, Hodges J, Ingenthron E, Cordes M, Kohlberg S, Sgro J, Delgado B, Mead K, Chinwalla A, Leonard S, Crouse K, Collura K, Kudrna D, Currie J, He R, Angelova A, Rajasekar S, Mueller T, Lomeli R, Scara G, Ko A, Delaney K, Wissotski M, Lopez G, Campos D, Braidotti M, Ashley E, Golser W, Kim H, Lee S, Lin J, Dujmic Z, Kim W, Talag J, Zuccolo A, Fan C, Sebastian A, Kramer M, Spiegel L, Nascimento L, Zutavern T, Miller B, Ambroise C, Muller S, Spooner W, Narechania A, Ren L, Wei S, Kumari S, Faga B, Levy MJ, McMahan L, Van Buren P, Vaughn MW, Ying K, Yeh CT, Emrich SJ, Jia Y, Kalyanaraman A, Hsia AP, Barbazuk WB, Baucom RS, Brutnell TP, Carpita NC, Chaparro C, Chia JM, Deragon JM, Estill JC, Fu Y, Jeddeloh JA, Han Y, Lee H, Li P, Lisch DR, Liu S, Liu Z, Nagel DH, McCann MC, SanMiguel P, Myers AM, Nettleton D, Nguyen J, Penning BW, Ponnala L, Schneider KL, Schwartz DC, Sharma A, Soderlund C, Springer NM, Sun Q, Wang H, Waterman M, Westerman R, Wolfgruber TK, Yang L, Yu Y, Zhang L, Zhou S, Zhu Q, Bennetzen JL, Dawe RK, Jiang J, Jiang N, Presting GG, Wessler SR, Aluru S, Martienssen RA, Clifton SW, McCombie WR, Wing RA, Wilson RK (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Shang H, Li W, Zou C, Yuan Y (2013) Analyses of the NAC transcription factor gene family in Gossypium raimondii Ulbr.: Chromosomal location, structure, phylogeny, and expression patterns. J Integr Plant Biol 55: 663-676 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Shiu S-H, Bleecker AB (2001) Plant Receptor-Like Kinase gene family: Diversity, function, and signaling. Sci. STKE 2001: re22- Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Shiu S-H, Karlowski WM, Pan R, Tzeng Y-H, Mayer KFX, Li W-H (2004) Comparative analysis of the Receptor-Like Kinase family in Arabidopsis and rice. Plant Cell 16: 1220-1234 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Downloaded from www.plantphysiol.org on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. Soltis DE, Albert VA, Leebens-Mack J, Bell CD, Paterson AH, Zheng C, Sankoff D, dePamphilis CW, Wall PK, Soltis PS (2009) Polyploidy and angiosperm diversification. Am J Bot 96: 336-348 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Soltis DE, Burleigh JG (2009) Surviving the K-T mass extinction: New perspectives of polyploidization in angiosperms. Proc Natl Acad Sci USA 106: 5455-5456 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Song X, Guo P, Li C, Liu C-M (2010) The cysteine pairs in CLV2 are not necessary for sensing the CLV3 peptide in shoot and root meristems. J Integr Plant Biol 52: 774-781 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Sun W, Cao Y, Jansen Labby K, Bittel P, Boller T, Bent AF (2012) Probing the Arabidopsis Flagellin Receptor: FLS2-FLS2 association and the contributions of specific domains to signaling function. Plant Cell 24: 1096-1113 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Tan S, Wang D, Ding J, Tian D, Zhang X, Yang S (2011) Adaptive evolution of Xa21 homologs in Gramineae. Genetica 139: 1465- 1475 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Tang H, Bowers JE, Wang X, Paterson AH (2010a) Angiosperm genome comparisons reveal early polyploidy in the monocot lineage. Proc Natl Acad Sci USA 107: 472-477 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Tang P, Zhang Y, Sun X, Tian D, Yang S, Ding J (2010b) Disease resistance signature of the leucine-rich repeat receptor-like kinase genes in four plant species. Plant Sci 179: 399-406 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title The International Brachypodium Initiative (2010) Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463: 763-768 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title The Tomato Genome Consortium (2012) The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485: 635-641 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Tuskan GA, DiFazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, Schein J, Sterck L, Aerts A, Bhalerao RR, Bhalerao RP, Blaudez D, Boerjan W, Brun A, Brunner A, Busov V, Campbell M, Carlson J, Chalot M, Chapman J, Chen GL, Cooper D, Coutinho PM, Couturier J, Covert S, Cronk Q, Cunningham R, Davis J, Degroeve S, Dejardin A, Depamphilis C, Detter J, Dirks B, Dubchak I, Duplessis S, Ehlting J, Ellis B, Gendler K, Goodstein D, Gribskov M, Grimwood J, Groover A, Gunter L, Hamberger B, Heinze B, Helariutta Y, Henrissat B, Holligan D, Holt R, Huang W, Islam-Faridi N, Jones S, Jones-Rhoades M, Jorgensen R, Joshi C, Kangasjarvi J, Karlsson J, Kelleher C, Kirkpatrick R, Kirst M, Kohler A, Kalluri U, Larimer F, Leebens-Mack J, Leple JC, Locascio P, Lou Y, Lucas S, Martin F, Montanini B, Napoli C, Nelson DR, Nelson C, Nieminen K, Nilsson O, Pereda V, Peter G, Philippe R, Pilate G, Poliakov A, Razumovskaya J, Richardson P, Rinaldi C, Ritland K, Rouze P, Ryaboy D, Schmutz J, Schrader J, Segerman B, Shin H, Siddiqui A, Sterky F, Terry A, Tsai CJ, Uberbacher E, Unneberg P, Vahala J, Wall K, Wessler S, Yang G, Yin T, Douglas C, Marra M, Sandberg G, Van de Peer Y, Rokhsar D (2006) The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313: 1596-1604 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Varshney RK, Chen W, Li Y, Bharti AK, Saxena RK, Schlueter JA, Donoghue MT, Azam S, Fan G, Whaley AM, Farmer AD, Sheridan J, Iwata A, Tuteja R, Penmetsa RV, Wu W, Upadhyaya HD, Yang SP, Shah T, Saxena KB, Michael T, McCombie WR, Yang B, Zhang G, Yang H, Wang J, Spillane C, Cook DR, May GD, Xu X, Jackson SA (2012) Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers. Nat Biotechnol 30: 83-89 Downloaded from www.plantphysiol.org on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Velasco R, Zharkikh A, Affourtit J, Dhingra A, Cestaro A, Kalyanaraman A, Fontana P, Bhatnagar SK, Troggio M, Pruss D, Salvi S, Pindo M, Baldi P, Castelletti S, Cavaiuolo M, Coppola G, Costa F, Cova V, Dal Ri A, Goremykin V, Komjanc M, Longhi S, Magnago P, Malacarne G, Malnoy M, Micheletti D, Moretto M, Perazzolli M, Si-Ammour A, Vezzulli S, Zini E, Eldredge G, Fitzgerald LM, Gutin N, Lanchbury J, Macalma T, Mitchell JT, Reid J, Wardell B, Kodira C, Chen Z, Desany B, Niazi F, Palmer M, Koepke T, Jiwan D, Schaeffer S, Krishnan V, Wu C, Chu VT, King ST, Vick J, Tao Q, Mraz A, Stormo A, Stormo K, Bogden R, Ederle D, Stella A, Vecchietti A, Kater MM, Masiero S, Lasserre P, Lespinasse Y, Allan AC, Bus V, Chagne D, Crowhurst RN, Gleave AP, Lavezzo E, Fawcett JA, Proost S, Rouze P, Sterck L, Toppo S, Lazzari B, Hellens RP, Durel CE, Gutin A, Bumgarner RE, Gardiner SE, Skolnick M, Egholm M, Van de Peer Y, Salamini F, Viola R (2010) The genome of the domesticated apple (Malus x domestica Borkh.). Nat Genet 42: 833-839 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Verde I, Abbott AG, Scalabrin S, Jung S, Shu S, Marroni F, Zhebentyayeva T, Dettori MT, Grimwood J, Cattonaro F, Zuccolo A, Rossini L, Jenkins J, Vendramin E, Meisel LA, Decroocq V, Sosinski B, Prochnik S, Mitros T, Policriti A, Cipriani G, Dondini L, Ficklin S, Goodstein DM, Xuan P, Fabbro CD, Aramini V, Copetti D, Gonzalez S, Horner DS, Falchi R, Lucas S, Mica E, Maldonado J, Lazzari B, Bielenberg D, Pirona R, Miculan M, Barakat A, Testolin R, Stella A, Tartarini S, Tonutti P, Arus P, Orellana A, Wells C, Main D, Vizzotto G, Silva H, Salamini F, Schmutz J, Morgante M, Rokhsar DS (2013) The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nat Genet 45: 487-494 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Vetter MM, Kronholm I, He F, Haweker H, Reymond M, Bergelson J, Robatzek S, de Meaux J (2012) Flagellin perception varies quantitatively in Arabidopsis thaliana and its relatives. Mol Biol Evol 29: 1655-1667 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Wang G-L, Ruan D-L, Song W-Y, Sideris S, Chen L, Pi L-Y, Zhang S, Zhang Z, Fauquet C, Gaut BS, Whalen MC, Ronald PC (1998) Xa21D encodes a receptor-like molecule with a Leucine-Rich Repeat domain that determines race-specific recognition and is subject to adaptive evolution. Plant Cell 10: 765-779 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Wang K, Wang Z, Li F, Ye W, Wang J, Song G, Yue Z, Cong L, Shang H, Zhu S, Zou C, Li Q, Yuan Y, Lu C, Wei H, Gou C, Zheng Z, Yin Y, Zhang X, Liu K, Wang B, Song C, Shi N, Kohel RJ, Percy RG, Yu JZ, Zhu YX, Wang J, Yu S (2012) The draft genome of a diploid cotton Gossypium raimondii. Nat Genet 44: 1098-1103 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Wang X, Wang H, Wang J, Sun R, Wu J, Liu S, Bai Y, Mun JH, Bancroft I, Cheng F, Huang S, Li X, Hua W, Wang J, Wang X, Freeling M, Pires JC, Paterson AH, Chalhoub B, Wang B, Hayward A, Sharpe AG, Park BS, Weisshaar B, Liu B, Li B, Liu B, Tong C, Song C, Duran C, Peng C, Geng C, Koh C, Lin C, Edwards D, Mu D, Shen D, Soumpourou E, Li F, Fraser F, Conant G, Lassalle G, King GJ, Bonnema G, Tang H, Wang H, Belcram H, Zhou H, Hirakawa H, Abe H, Guo H, Wang H, Jin H, Parkin IA, Batley J, Kim JS, Just J, Li J, Xu J, Deng J, Kim JA, Li J, Yu J, Meng J, Wang J, Min J, Poulain J, Wang J, Hatakeyama K, Wu K, Wang L, Fang L, Trick M, Links MG, Zhao M, Jin M, Ramchiary N, Drou N, Berkman PJ, Cai Q, Huang Q, Li R, Tabata S, Cheng S, Zhang S, Zhang S, Huang S, Sato S, Sun S, Kwon SJ, Choi SR, Lee TH, Fan W, Zhao X, Tan X, Xu X, Wang Y, Qiu Y, Yin Y, Li Y, Du Y, Liao Y, Lim Y, Narusaka Y, Wang Y, Wang Z, Li Z, Wang Z, Xiong Z, Zhang Z, Brassica rapa Genome Sequencing Project C (2011) The genome of the mesopolyploid crop species Brassica rapa. Nat Genet 43: 1035-1039 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Wu H-J, Zhang Z, Wang J-Y, Oh D-H, Dassanayake M, Liu B, Huang Q, Sun H-X, Xia R, Wu Y, Wang Y-N, Yang Z, Liu Y, Zhang W, Zhang H, Chu J, Yan C, Fang S, Zhang J, Wang Y, Zhang F, Wang G, Lee SY, Cheeseman JM, Yang B, Li B, Min J, Yang L, Wang J, Chu C, Chen S-Y, Bohnert HJ, Zhu J-K, Wang X-J, Xie Q (2012) Insights into salt tolerance from the genome of Thellungiella salsuginea. Proc Natl Acad Sci USA 109: 12219-12224 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Xue Z, Duan L, Liu D, Guo J, Ge S, Dicks J, ÓMáille P, Osbourn A, Qi X (2012) Divergent evolution of oxidosqualene cyclases in plants. New Phytol 193: 1022-1038 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Yang T, Chaudhuri S, Yang L, Du L, Poovaiah BW (2010) A calcium/calmodulin-regulated member of the Receptor-like Kinase family confers cold tolerance in plants. J Biol Chem 285: 7119-7126 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Downloaded from www.plantphysiol.org on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved. Yang Z (2007) PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 24: 1586-1591 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Yang Z, Wang Y, Zhou Y, Gao Q, Zhang E, Zhu L, Hu Y, Xu C (2013a) Evolution of land plant genes encoding L-Ala-D/L-Glu epimerases (AEEs) via horizontal gene transfer and positive selection. BMC Plant Biol 13: 34 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Yang ZL, Liu HJ, Wang XR, Zeng QY (2013b) Molecular evolution and expression divergence of the Populus polygalacturonase supergene family shed light on the evolution of increasingly complex organs in plants. New Phytol 197: 1353-1365 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Young ND, Debelle F, Oldroyd GE, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KF, Gouzy J, Schoof H, Van de Peer Y, Proost S, Cook DR, Meyers BC, Spannagl M, Cheung F, De Mita S, Krishnakumar V, Gundlach H, Zhou S, Mudge J, Bharti AK, Murray JD, Naoumkina MA, Rosen B, Silverstein KA, Tang H, Rombauts S, Zhao PX, Zhou P, Barbe V, Bardou P, Bechner M, Bellec A, Berger A, Berges H, Bidwell S, Bisseling T, Choisne N, Couloux A, Denny R, Deshpande S, Dai X, Doyle JJ, Dudez AM, Farmer AD, Fouteau S, Franken C, Gibelin C, Gish J, Goldstein S, Gonzalez AJ, Green PJ, Hallab A, Hartog M, Hua A, Humphray SJ, Jeong DH, Jing Y, Jocker A, Kenton SM, Kim DJ, Klee K, Lai H, Lang C, Lin S, Macmil SL, Magdelenat G, Matthews L, McCorrison J, Monaghan EL, Mun JH, Najar FZ, Nicholson C, Noirot C, O'Bleness M, Paule CR, Poulain J, Prion F, Qin B, Qu C, Retzel EF, Riddle C, Sallet E, Samain S, Samson N, Sanders I, Saurat O, Scarpelli C, Schiex T, Segurens B, Severin AJ, Sherrier DJ, Shi R, Sims S, Singer SR, Sinharoy S, Sterck L, Viollet A, Wang BB, Wang K, Wang M, Wang X, Warfsmann J, Weissenbach J, White DD, White JD, Wiley GB, Wincker P, Xing Y, Yang L, Yao Z, Ying F, Zhai J, Zhou L, Zuber A, Denarie J, Dixon RA, May GD, Schwartz DC, Rogers J, Quetier F, Town CD, Roe BA (2011) The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature 480: 520-524 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Yu J, Hu S, Wang J, Wong GK-S, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X, Cao M, Liu J, Sun J, Tang J, Chen Y, Huang X, Lin W, Ye C, Tong W, Cong L, Geng J, Han Y, Li L, Li W, Hu G, Huang X, Li W, Li J, Liu Z, Li L, Liu J, Qi Q, Liu J, Li L, Li T, Wang X, Lu H, Wu T, Zhu M, Ni P, Han H, Dong W, Ren X, Feng X, Cui P, Li X, Wang H, Xu X, Zhai W, Xu Z, Zhang J, He S, Zhang J, Xu J, Zhang K, Zheng X, Dong J, Zeng W, Tao L, Ye J, Tan J, Ren X, Chen X, He J, Liu D, Tian W, Tian C, Xia H, Bao Q, Li G, Gao H, Cao T, Wang J, Zhao W, Li P, Chen W, Wang X, Zhang Y, Hu J, Wang J, Liu S, Yang J, Zhang G, Xiong Y, Li Z, Mao L, Zhou C, Zhu Z, Chen R, Hao B, Zheng W, Chen S, Guo W, Li G, Liu S, Tao M, Wang J, Zhu L, Yuan L, Yang H (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296: 79-92 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Zhang G, Liu X, Quan Z, Cheng S, Xu X, Pan S, Xie M, Zeng P, Yue Z, Wang W, Tao Y, Bian C, Han C, Xia Q, Peng X, Cao R, Yang X, Zhan D, Hu J, Zhang Y, Li H, Li H, Li N, Wang J, Wang C, Wang R, Guo T, Cai Y, Liu C, Xiang H, Shi Q, Huang P, Chen Q, Li Y, Wang J, Zhao Z, Wang J (2012) Genome sequence of foxtail millet (Setaria italica) provides insights into grass evolution and biofuel potential. Nat Biotechnol 30: 549-554 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Zhang X, Choi J, Heinz J, Chetty C (2006) Domain-specific positive selection contributes to the evolution of Arabidopsis Leucine- Rich Repeat Receptor-Like Kinase (LRR RLK) genes. J Mol Evol 63: 612-621 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Zou C, Lehti-Shiu MD, Thibaud-Nissen F, Prakash T, Buell CR, Shiu S-H (2009) Evolutionary and expression signatures of pseudogenes in Arabidopsis and rice. Plant Physiol 151: 3-15 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title

Downloaded from www.plantphysiol.org on February 18, 2016 - Published by www.plant.org Copyright © 2016 American Society of Plant Biologists. All rights reserved.