Articles https://doi.org/10.1038/s41477-020-0613-7

An ancestral signalling pathway is conserved in intracellular symbioses-forming lineages

Guru V. Radhakrishnan 1,12, Jean Keller 2,12, Melanie K. Rich2,12, Tatiana Vernié 2,12, Duchesse L. Mbadinga Mbadinga2, Nicolas Vigneron2, Ludovic Cottret3, Hélène San Clemente 2, Cyril Libourel 2, Jitender Cheema1, Anna-Malin Linde4, D. Magnus Eklund 4, Shifeng Cheng5, Gane K. S. Wong 6,7,8, Ulf Lagercrantz4, Fay-Wei Li 9,10, Giles E. D. Oldroyd 1,11 ✉ and Pierre-Marc Delaux 2 ✉

Plants are the foundation of terrestrial ecosystems, and their colonization of land was probably facilitated by mutualistic asso- ciations with arbuscular mycorrhizal fungi. Following this founding event, plant diversification has led to the emergence of a tremendous diversity of mutualistic symbioses with microorganisms, ranging from extracellular associations to the most intimate intracellular associations, where fungal or bacterial symbionts are hosted inside plant cells. Here, through analysis of 271 transcriptomes and 116 plant genomes spanning the entire land-plant diversity, we demonstrate that a common symbiosis signalling pathway co-evolved with intracellular endosymbioses, from the ancestral arbuscular mycorrhiza to the more recent ericoid and orchid mycorrhizae in angiosperms and ericoid-like associations of bryophytes. By contrast, species forming exclu- sively extracellular symbioses, such as ectomycorrhizae, and those forming associations with cyanobacteria, have lost this signalling pathway. This work unifies intracellular symbioses, revealing conservation in their evolution across 450 million yr of plant diversification.

ince they colonized land 450 million yr ago, have been spaces of the plant tissue, can be found in diverse species among the foundation of most terrestrial ecosystems1. Such success- the embryophytes, including hornworts, liverworts, ferns, gymno- Sful colonization occurred only once in the plant kingdom, and sperms and angiosperms9. Despite the improved nutrient acqui- it has been proposed that the symbiotic association formed with sition afforded to plants by these different types of mutualistic arbuscular mycorrhizal fungi supported that transition2,3. Following symbioses, entire plant lineages have completely lost the symbiotic this founding event, plant diversification was accompanied by the state, a phenomenon known as mutualism abandonment4. emergence of alternative or additional symbionts4. Among alterna- Our understanding of the molecular mechanisms governing the tive symbioses, the association between orchids and with establishment and function of these symbioses comes from forward both ascomycetes and basidiomycetes are two endosymbioses with and reverse genetics conducted in legumes and a few other angio- specific intracellular structures in two plant lineages that lost the sperms10, and is restricted to AMS and root-nodule symbiosis7,11. ability to form arbuscular mycorrhizal symbiosis (AMS)5. Orchid These detailed studies in selected plant species have enabled phy- mycorrhiza and ericoid mycorrhiza represent two clear symbiosis logenetic analyses to more precisely link the symbiotic genes with switches, in which intracellular associations are sustained, but the either AMS or root-nodule symbiosis. Indeed, the loss of AMS or nature of the symbionts are radically different. Similarly, within the root-nodule symbiosis correlates with the loss of many genes that liverworts, the Jungermanniales engage in ericoid-like endosym- are known to be involved in these associations7,11. The gene losses bioses but not AMS, and represent another symbiont switch that are thought to be the result of a relaxed selection pressure follow- occurred during plant evolution6. Other symbioses can occur simul- ing loss of the trait, resulting in co-elimination, which specifically taneously with AMS; for example, root-nodule symbiosis, an associ- targets only genes required for the lost trait12. Co-elimination can ation with nitrogen-fixing bacteria that evolved in the last common be tracked at the genome-wide level using comparative phyloge- ancestor of Fabales, Fagales, Cucurbitales and Rosales7. Another nomic approaches on species with contrasting retention of the trait example is ectomycorrhizae, an extracellular symbiosis found in of interest13,14. Such approaches have led, for instance, to the discov- several gymnosperm and angiosperm lineages—in some lineages ery of genes associated with small RNA biosynthesis and signalling13 both AMS and ectomycorrhizae have been retained, whereas other or cilia function15. Applied to AMS, comparative phylogenomics lineages have switched from AMS to ectomycorrhizae8. Finally, asso- in angiosperms identified a set of more than 100 genes that were ciations with cyanobacteria, which occur only in the intercellular lost in a convergent manner in lineages that lost the AMS16–18.

1John Innes Centre, Norwich, UK. 2LRSV, Université de Toulouse, CNRS, UPS, Castanet-Tolosan, France. 3LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France. 4Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden. 5Agricultural Genome Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China. 6BGI-Shenzhen, Shenzhen, China. 7Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada. 8Department of Medicine, University of Alberta, Edmonton, Alberta, Canada. 9Boyce Thompson Institute, Ithaca, New York, NY, USA. 10Plant Biology Section, Cornell University, New York, NY, USA. 11Sainsbury Laboratory, University of Cambridge, Cambridge, UK. 12These authors contributed equally: Guru V. Radhakrishnan, Jean Keller, Melanie K. Rich, Tatiana Vernié. ✉e-mail: [email protected]; [email protected]

280 Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants NaTure PlanTs Articles

Table 1 | Genome assembly statistics for Marchantia species sequenced as part of this study and comparison with the M. polymorpha ssp. ruderalis TAK1 reference genome M. polymorpha ssp. M. paleacea M. polymorpha ssp. M. polymorpha ssp. ruderalis TAK1 polymorpha montivagans Assembly size (Mb) 210.6 238.61 222.7 225.7 Scaffolds 2,957 22,669 2,741 2,710 N50 length (kb) 1,313.57 77.78 368.25 589.42 BUSCO score Complete 821 817 855 855 Single copy 793 790 832 829 Duplicated 28 27 23 26 Fragmented 48 53 38 42 Missing 571 570 547 543 GC content (%) 41.1 40.3 42.2 42.1 Reference Bowman et al.26 This study This study This study

Mb, megabases; kb, kilobases.

All classes of functions essential for AMS were detected among these demonstrated that loss of AMS in six angiosperm lineages is associ- genes, including the initial signalling pathway—essential for the ated with the convergent loss of many genes16,17. SymDB contains host plant to activate its symbiotic program—and genes involved in species from across the entire land-plant lineage that have lost AMS, the transfer of lipids from the host plant to the arbuscular mycorrhi- and thus provided us with a platform to assess co-elimination of zal fungi. Since their identification by phylogenomics, novel candi- genes associated with the abandonment of mutualism through- dates were validated for their involvement in AMS through reverse out the plant kingdom. We generated phylogenetic trees for all the genetic analyses in legumes19–21. genes previously identified as being lost in angiosperms with loss Targeted phylogenetic analyses have identified multiple symbiotic of AMS (Supplementary Figs. 1–32 and Supplementary Table 2). genes in the transcriptomes of bryophytes, but study of the overall Among these gene phylogenies, those missing from the largest molecular conservation of symbiotic mechanisms in land plants number of lineages that have abandoned mutualism were selected. are lacking22–24. Similarly, the plant molecular mechanisms behind Six genes, SymRK, CCaMK, CYCLOPS, the GRAS transcription the diverse array of mutualistic associations, either intracellular or factor RAD1 and two half-ATP-binding cassette (ABC) transport- extracellular, are poorly understood10. In this study, we demonstrate ers—STR and STR2—were consistently lost in non-mutualistic lin- through analysis of a comprehensive set of plant genomes and tran- eages in angiosperms, gymnosperms, ferns and Bryophytes (Fig. 1 scriptomes, the loss and conservation of symbiotic genes associated and Supplementary Figs. 1, 3 and 13–15). Very few exceptions with the evolution of diverse mutualistic symbioses in plants. to this trend were found (Fig. 1 and Supplementary Figs. 13 and 14), for instance, the presence of CCaMK in the aquatic angio- Results sperm Nelumbo nucifera, which was previously reported in Bravo A database covering the diversity of plant lineages and symbi- et al.16. However, further analysis of this locus revealed a deletion otic associations. Genomic and transcriptomic data are scattered in the kinase domain leading to a probably non-functional pseu- between public repositories, specialized databases and personal dogene (Supplementary Fig. 33). The same deletion was present in websites. To facilitate large-scale phylogenetic analysis, we compiled two different ecotypes and three independent genome assemblies resources for species covering the broad diversity of plants and sym- (Supplementary Fig. 33). The second consequential exception was biotic status in a centralized database, SymDB (www.polebio.lrsv. in mosses, where CCaMK and CYCLOPS were present despite the ups-tlse.fr/symdb/). This sampling of available resources covers lin- documented loss of AMS in this lineage (Fig. 1 and Supplementary eages forming most of the known mutualistic associations in plants Figs. 13 and 14). Previously, it was proposed that the selective pres- (Supplementary Table 1), including AMS, root-nodule symbiosis, sure acting on both genes was relaxed in the branch following the ectomycorrhizae, orchid mycorrhiza, cyanobacterial associations in divergence of the only mycorrhizae-forming moss (Takakia) and hornworts and ferns, and ericoid-like symbioses in liverworts. SymDB other moss species24. Using two independent approaches (RELAX also includes genomes of lineages that have abandoned mutualism and PAML) we confirmed this initial result and identified sites in the angiosperms, gymnosperms, monilophytes and bryophytes. under positive selection (Supplementary Figs. 34 and 35 and To enrich this sampling, we generated two additional datasets: an Supplementary Table 3), suggesting the neofunctionalization of in depth transcriptome of the liverwort Blasia pusilla, which asso- these two genes. From our analysis, we also detected in three spe- ciates with cyanobacteria, and—since no genome of an arbuscular cies that are thought to be non-mutualistic the presence of STR2 mycorrhizal host was available for the bryophytes—we sequenced (in a single fern species) and RAD1 (in two liverworts) (Fig. 2b,c). the genome of the complex thalloid liverwort Marchantia paleacea The presence of these genes may reflect additional cases of species- de novo, which specifically associates with arbuscular mycorrhizal specific neofunctionalization or may be the result of misassignment fungi25. The obtained assembly was of similar size and complete- of the symbiotic state27. ness to the Marchantia polymorpha TAK1 genome26 (Table 1). Among the species that have abandoned mutualism, M. poly- These new datasets augmented the SymDB database, encompass- morpha is a particularly intriguing case, since the non-mutualistic ing a total of 116 genomes and 271 transcriptomes, providing broad M. polymorpha and the mutualistic M. paleacea belong to recently coverage of mutualistic symbioses in plants. diverged lineages28. M. polymorpha represents the most recent loss of mutualism of which we are aware, and therefore this spe- Mutualism abandonment leads to gene loss, positive selection cies may enable us to witness the process of co-elimination of sym- or pseudogenization of symbiosis genes. Previous studies have biosis genes with the loss of mutualism. For this reason, we chose

Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants 281 Articles NaTure PlanTs

Symbiosis Genes type RAD1 STR STR2 RAM1 ABCB12 ABCB20 DHY HYP2 KinF HYP4 pp2a DnaJ CCaMK CYCLOPS SymRK EPP1 HYP3 CCD RFCa RFCb KinG1 KinG2 VAPYRIN LIN CASTOR EXO70 FatM NFP SEC SYN CYTB561 GRAS HEP TAU AMS RNS OM Ericoid-like Cyanobacteria ECM Data type

Angiosperms

Gymnosperms

Monilophytes

Lycophytes

Hornworts

Liverworts

Mosses

AMS loss Orthologue Absent (in genome) Symbiosis Genome Homologue (including paralogue) Not detected (in transcriptome) No symbiosis Transcriptome Not determined

Fig. 1 | Conservation of the symbiotic genes in land plants. The tree on the left depicts the theoretical plant phylogeny. The heat map indicates the phylogenetic pattern for each of the 34 investigated genes. The type of symbiosis formed by each investigated species is indicated by black boxes. Genes in bold co-evolved with AMS (RAD1, STR and STR2) or with every type of intracellular symbiosis (CCaMK, CYCLOPS and SymRK). RNS, root-nodule symbiosis; OM, orchid mycorrhiza; cyanobacteria, association with cyanobacteria; EcM, ectomycorrhizae. to sequence the genome of the symbiotic M. paleacea to allow a supports a recent abandonment of mutualism in M. polymorpha. To detailed comparison between symbiotic and non-symbiotic liver- better estimate the timing of this abandonment, we collected 35 wort species. Using microsynteny, we identified potential remnants M. polymorpha ssp. ruderalis accessions in Europe (Supplementary of symbiotic genes in M. polymorpha ssp. ruderalis TAK1, pseudo- Table 4) and sequenced CYCLOPS and CCaMK. All accessions har- genes for SymRK, CCaMK, CYCLOPS and RAD1 existed in genomic boured pseudogenes at these two loci, confirming the fixation of these blocks syntenic with M. paleacea, while STR and STR2 were com- null alleles in the subspecies ruderalis (Fig. 3 and Supplementary pletely absent (Fig. 3). These pseudogenes have accumulated point Fig. 36). Besides M. polymorpha ssp. ruderalis, two other mutations, deletions and insertions (Fig. 3), and their presence M. polymorpha subspecies have been reported—ssp. polymorpha

282 Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants NaTure PlanTs Articles

a b c

A

m Ambt Cryac

b

t

r

i

Linlin 209725 e r

r v i evm 27.m 206754

Ambtri evm 27 m 5

2 Psi

7

Tree scale: 0.1 Tree scale: 0.1 .

m Tree scale: 0.1 nud 0.2 Musacu GSM Horvul 7 Horvu 94.1 22594 0.1 o

1 WQML

d 2017

0 Psec o O Orysat LOC

e Horvul HO 9 1 Musacu GS 2 del.A . NOKI L Musacu GS l r

. Ausspi 20 5 Marp T Sorbic So 886.1 3091 0 Horvul A y 5 Zeamay Zm0 axbac 20 22 .model.AmT 7 8 44 l HO 0 s

1 m 01.1 7152 HO 026010 Sor hi 2 0 3

. 026010 Leptospo Zeam Podco a 8 Selst 0 1 9 0 T al 2014 Marpal 2015253 LFVP Mar 0 Setita Seita.6G098800.1 Sela m t 2 Orysat LOC Os09g23640 QVMR Bra 0158904 3600.1 g

r R 0 b L 9 Mar 081916 T RVU3 P 1 2 Setita Seita. 05. Anaco v

eptospor 7 78 ic Sobic.0 HO p O 3 Marpal VU3 r v1.0 4746

1 Bradis Br 1 X al 20 8 a 20 2.1 po 20076 U d r a p C G R Mar . r 205082 12.1 2672 Selwil 2 77149 8a013 11 00221.1 8 1 al 20 0 0 is Bra h Brad MU .1 y Zm000 A 5 bic.007G 90 LFV 3 VU5Hr1G g MU 4

3837 WWSS Selwi R Eu 3G0 37028m 05m 4 1619 100.1 O Os01g6 C s Selkra 2 6.1 0 11 pal 2009 r 1 Achr H 610.2 20 r v1.0 scaf 1 VU3 H 3

G angiate c S14152 1252 I Anacom 12 482.1 PON3 m 0 s

Q Anaco A Selk 15 EG010708t1 4 8 r1G088 . 01 a s scaf

YLP 5 171 ZZ A 22917 r1G088 angiate M is Brad 34.1 PON971 1691 0 1 . 11583 0008a03 505 LFV 0400 poran c

R Achr9T27200 001 2g0103 f BTTS C 013634 KJYC l 20 i 8 f 0 Aco0 Achr P Liverw d 1 1 a g06047 86 L o

X v1.0.G008704.1

t 7 X38873 v2 00 r Hr1G0662 0 .1 8T i2g5 b 01589 8751 3 SC g adi4g2981 1g04 i a 20416 l HWO Live n 1 fold00 3G27130 00897 097430.2.cf002821 1 017 d .v1.0.G0087 1 M Conifers o 9929.m00 P 6861 6 5G287400. 01000 e .t1 48 IHWO L 0 W. 490 a IHWO m 0 0 S01891S1748 095100. gfu848 g S Liverworts 2 s 7g022 7 v10031 AE Monilo 76 y Zm0000 DCAR 03 Manes.1 8a014 GDQ 4 B S 2 OL r nge1.1g0 b P009 7 060690 P 0 3496S iate 11 H 0 1.1 e e k Aco001046.1 6 Thecc1 a S174 2 i2g26 r 01000

Aco0 T2767 7 .1 X Liverworts 2 samp E c 1 R 8 a o fold00060.152 0 7g02 Luncru 2014175o i 3 KJYC Ginbi 4673.1 780. a 0 NC 02968 7 8 r 5 n C 940.1 t B 20.1 9 ZFGK 1Scf0279 onilo 4 1 rts 2 sampl 80 001 26

1212 i Lycoph CA08g 62 ZFG Mar C o b 07 6926 Mo 0 049.33 0.1 .1.m Selmoe 13 n t r Scf01696 0.2 6 1 Conife Liverw Ly lcl 87S24 Scane onifers.p1nifers.p1 Qr 4809 222 r 3.1.g0000 . a l1610 e

axi162S . r o asgl41 G 1 Marpal sc l phytes. 6 1 1 pal w Lun 5 0 e Selm 0001. 2 Solyc01g iverw cophytes.p 20.2 n l Gb 25 S l161 e 0 S iben101Scf01696g06031.1 Zeam Ly p 0077.1 0 6 not L484 016 orts myco Luncr 37 0.2 60.1 Daucar 008 1 G045100.1 up HL.SW Selm i . Marpal sc lophytes Riccom hytes.p1 .1

6 P 4 001 sc .p1 Manesc cru 20 H iben10 2 Lyc ras OBHm Ch m eg g19108 cophytes.p2 Daucar DCAR K L nd lcl JXT Calfis 201 Thecac Alng orts mycorrizLyc ytes.p2 . Daucar DCAR 003539 1 X amples combined.p Fraexc FRAEX38873 v2 0 i lcl JXT 20154 r13S005 1 G03610 Begfuc B a Citsin or Fraexc F ben101 Zizjuj r OBHm Ch 935832 r i u o oe 90494 Distri Dist u Dryd i t f es combin Citcle Cicl Alng Capann 2074553 r ycophyteophytes.p Mor r 1 88.1.g0 f8805 FGenes s.p1 lu l NC 02968 479.1 g590 a ophyt Solpen Sopen0 ts mycor Humlup HL.S 80.1 oe 838 Querob u 7.1 1 11 0293 p1 Huml Jugr l 44084.1 ff13491 FGenes 4275 Sollyc 0 reo TXVB Liverwo 3 164 a Para 0001.1 48 IRBN Live Casgla C 5P Petaxi P le T rrizal.p 8 6 Nicben N Potri.01 70m ff5479 FGs com 7353 05937 8351 RTMU Liv .p1 8 Alng 9 TX es.p1 i anes.06G004300.13 1 Dryd 2 Mar 02412 1 Nicben N Potri.015 TXVB Rubocc B AN icon 73852.1.g00001.1 p 1 Alng C VB Live Nicben N 653.m0 69m rizal.p1s.p1 Roschi Rchi al scaf ed 1 Zizjuj lc m PCP0av sc0002N al.p 9 4 bined.p1 1 Drydru Dryd P 7500. Poptr v10033 Liverworts. Roschi Rchna F 45394.1.g0 .p1 1 31901 1 Poptri e enes Potmic 1 AN icon 03 h Maldomyrco MDP00004vi dul26A009 rwort 008236580.1 1962 f3964 FGeneshrworts.prts.p13.1 P um lcl u 11 h2.1 1 Manesc M v10007 Fraa AN iscf001scf0 XP e rua r4g04 e h ana F F 2388.1 g050.1.mk rwor P s.p Riccom 2 1t1 8. Fraves FvH4a 5g290 rumdul PrPrupe.1G23 1 Fra n FAN i ts.p1 P as G072Hm Ch P STR2 1 Citcle Cicl p1 1 Pru B yrbre l Fraa na 024126.100. 1 cl NW 008 Citcle Cicl EG01291 ST NC 6 2.1 Pruper t 0.1 0002.1 p evm.model.supercontig 3.375 Fraa 1 Citsin orange1.1ga 003437m R 69 Vigang Rubocc Br 8337 988083 Thecc1 1 Pruavi Pav sc000 2 02238.1.g0 0001.1 Carp i.007G108700.1 Solp mum lcl 2 .1 171 vigan. Roschi RchiOH4 4g2482 .1 XP ora e Pru per Prupe.6G208 80 64791 Potmic 1 0093 Thecaci G i.008G290900. n So udul26A025593P P 0093 Vang1 iscf000 78592.1.g0 56016. pen09g036340. Pru ul Pr X Vigra 1g07370 Fraves Fv FAN Pyrcom PCP Gosra 1232 Sollyc S d 5.1 d Vradi 1 9140 olyc09g09 Pru 8815 Vigung 07g1473 .1 Fraanaa FAN iscf000 Gosrai Gora 1 Maldom MDP0000153 Vigun03 N iscf00155624.1.g00001. Maldom M 0099 Datglo Datgl554S11494S01078 Capann 8410.1. cl NW 0089 1 Phavul P g315800.10.1 Fraan FA 7469 DP00008479603.71 1 Pyrbre l hvul.0 Fraana Begfuc Begfu Nicben Niben101S CA03g03110 03G208600. Maldom MDP Begfu13278S46660 8922 Pyrcom PCP029502.211382 Daucar DCAR 00 000075526 Begfuc .1 XP 022137340.1 cf03036g07014. Cajcaj Ccajan 1 1 2 Petaxi Peaxi 1 Maldom MDP0000 .1 1122 Daucar DCAR 02621205g015468 Pyrcom PCP01 Momcha lcl NW 019104497 162Scf00286g00240.1 PCP025048 Glymax Glyma.17G127500.16 5737.1 Pyrcom Helann HanXRQChr 0790.1 12.1 XP 009345867.1 42031 Helann HanXRQChr16g0497891 Pyrbre lcl NW 008988114.1 XP 018503790.1 12882 Cucmax CmaCh18G00 Nicben Niben101Scf05548g02014.1 Pyrbre lcl NW 0089889 Glymax Glyma.05G045300.1 Cucpep Cp4.1LG09g11180.1 Datglo Datgl Helann HanXRQChr09g0272901 3323 f00378g00624.1 115S01120 32672 Maldom MDP000024 Cucsat Petaxi Peaxi162Sc Momcha lcl NW 019104492.1 Trisub lcl DF973985.1 GAU43614.1 Fraexc FRAEX38873 v2 0000406 .1 evm.model.Chr1.717 RNA38803 60.1 G235100 Cucmel ME Cucma XP 022132948. TGAC v2 m Fraexc FRAE Pruper Prupe.5 Nicben Niben101Scf05548g02015.1 x CmaCh07G005 1 4903 Tripra Tp57577 91 Capann CA03g24270X38873 v2 00 Citl LO3C0021 Cu 1 0003780 an Cla0 42.2.1 1 cpep 880.1 nA17 Chr4g00578067. Sollyc Solyc03g1 .1 87 Lagsic Lsi 1844 0347150. Cu Cp4.1LG19g Medtru Mtru Prudul Prudul26A015441P19.1 238 6 Fraexc FRAEX38873 v2 000347150.2 cmos Cm 103 Cicari Ca 21 10.1 Solpen 824028 .1.mk Mimp 02G0024 73 v2 00 Citl 00. Sopen03g10950.1. P 00 RAEX388 an Cl oCh07G0059 1 500 Nicben Niben101Sc 1 132.1 X 6 Chafasud Mimpu1 40. exc F 4341 Lagsi a014087 p Lj4g3v13891 A Ni 1 Fra 38873 v2 000006480.1 30. Lotja -P cben N 0301 lcl NC 024 2S0783 Nissch N 109S X hr11g032 Cucmec Ls 1 1400 Petax 40.1 sc0000103.1dr19 g860 0001.1 Cha RAE C 4740211 i01G0 608 iben10 Prumum avi Pav g0 1 Arah fa9 05408 XRQ Cucs l MELO3C0076960829 Nissch Nissc5453S10mum1 S03 24 Petaxi iP Pe f151 Pru ru Dry 001. issc3 5284 Fraexc F hr15g0 osper axi162 1 20 Dryd 0 .1 Ara yp arah S 1g0324351457 Mim at evm. 0.1 n Cerca40S044 Petax Scf043 g0101 33882.1. 1 29316 Helann Han hr1 Chaf 90872 eaxi16 6 1.g0 0001 Arai du 67S2 HanXRQC pud Mim 0 2 Zizjuj l i Scf0139028g0005 .1 30.1 r Ar y. nn 4000.1 Cerca mo Cercan pu721 Distr Pe 97730. .1 Ara pa a Tifrunn 1270 as Chafa del.Ch .2 Casaus Casta m 3254S axi162 2S FAN iscf002 2.1.g0 01 du.CH84 Hela r DCAR 02 00560 Casaus Castan pu262 .1 6S280 6 Mor cl NC 02968 cf00089 g 4.1 5g175 Cercah Ara HanXRQC pud Mi 039238 i Dist 00037. 7769 4 48 t yp er. Lotjap n Cerca9S r6.3080 Par not L484 006916 S Fraana H 6 Chaf arah ip.55F8V gnm1.a Helann Dauca 00.1 70 5 Mim pu865 Treo cf01390g FAN iscf002001 G anes.13G1333.m0 0 2 Lot 3S21 m 0S416 and l r238S1 g0002 1 a 8121 Casaus n M 8 G004800.10 m Cicar S2 sgl1772S Dryd n 7 Nissch Nissc52as Chafa4 Cerca y n 7 jap Lj Lj 6 Chafas Chafa a r1.802.1 r 3 7. AN iscf ves Fv .T n1.H0WU6 Medtr 06 137 g6040.t1 .1 Rubocc i lcl JXclT .1 XP 00007.1 1 Fraa F Lotja ifrun T 0 31707 g 0 Rosc 9353 a Fra Potmic 8 627 41m g3v0104 3 gl500 ru Dry J n Cica Manesc G000922t1 Tris rip i Ca 24302 osper Mimpud Mi Potmic 2 X Castanosper4 Riccom 2 i.007G23 390359 g3v146 Aln del.Ch T 015882 42 Medtr 4 n i Potri.00 E 0 Glym ra u Mtrun Casgla C Jugre 609 Frav B Fraa 3g0493261 p Lj4 9S24 er V Glymax Glyma.01G02000 lu 26237. Pru r T .gnm1.a .1 u 0 hi Rc C 0100 T risu r 242S Cajcaj Cc Tp575 m C 890.10.1 P Br d as G03 i 33202 Phavu b lcl DF973 8 Pru r 01000150.1 P r Cajcaj Crip Ca 15399 Poptr V 499. um2 Alng a0192485 0.1 ru es Fv as G01975295S 2 AN icon20124905.1.g000 Glymax G 7 1 V 2 P 78.1 v10026 V ax Gly avi Glymax Glyma.08 Nissch Ni igun A 0 F u Mtru 05 Thecc1 A 7 Phavul Phvul.002G3 g3v31 4 Pyrcom e igra 1.1 m h 8 Vigung Vigun03 r b nge1.1g S2097414103 ru 326.1 PO OBHm Chr7g01 Vigrad Vr 0985 .1 igan 930.1 Pyrb Vigang vi contig 755.4 9 a i 712.1 28403 1 1 3 a r r r A P p i 14060 n a . 5 1 lcl DF OBHm S F 650- Pyrco 102 t 1 4419 a a Maldom MD um lcl 5 n1.K59 Gosrai Gora P 1 77 17 Ch Mald d er M Mald 9 y 4 Tp575 7 1 h d H 0168 OBHm Ch 2 gl603S g l t ul Pr av sc000 m 870.2 d Phvul.00 l MELO3 r i 1 0443.1 152 5 RNA- 4 Fraan 2 T y u 13893 a b 15140 P . 9 6 g ma. 1G00057 S069 P 4 5g07 1 n Thecac TGAC v2 Cucsat evm.mo Citlan Cl 1923 5m re lcl N um1 V 00.1 9 Rubocc B p a A 18072 r m H Vr 1 r S n 0 c A17 Ch Aln gl2271 vigan 6 S 1 0 K 1G00059 8 r g g49004.t1 jan 122 7 e g12291.t1 igun0 N4 .1 A 77m 1 a 09916 el.supe 7 r 6 S upe.5G1561 e 0 8 97 0 PA.1 Citsin or a 1 ON85 u sgl291 o Ch . p.90JFZ 5g04 4.1LG04g1 m PC Citcle Cicl l g adi Lagsic Lsi06G01 0 o d 0 3 PCP039udul26 N .v1.0.G008675.1 jan 230 i 00.1 GA 5 7 28179.t1 r lyma. 5 S s r l Roschi Rch a p 2 5 S20855 1 S 7 Aln ssc5 Cucme m MDP00002 a 9G2 aCh 7 m MD c 3 .1 031- a 1 C 613.1 c r .v1.0.G008676.1 7 1 adi0 N 3 0 gan. l u d 4012 7g019088 P038 h . 1 l 1 m 0 6 02413 9 6 4 621S N u 3 1 M .v1.0.G008677.1 TGAC v226.1 G Alng Jugre Araip.9Z02C Ara 1g09 0 026 y 1 1 . 2 l Roschi Rch 1621 W 008 03.1 18973 ne-0.0- V 0 3 9 gfu78 12-mRNA- P m nn1.8ZQ6 . 2 r . 02400. mR G014574 W 0.1 Jugre n g088600.1 not L484 004555 . W 8 T X 2 9 P 0 A e 022135579.1gfu1092 654 gl108 a 2572 8 ang00 e 3 G143300 1g02650.1 G093900.1 n E G Va pa 6 g0393 7 A 5 Alng i 4545 P Y n 023344.1 1.1 8 i 1 9 f U4 P P00002 a 2 l 5 r NA382 v10019 0 G2471 006G2 4 0228 5 . Be e 00004 959. g002500.1 9 Y 01000362.1 PON60 Casgla C u 0 up HL.SW Mor Jugreg g g Cucpep C c B 6 G055500. e 1 0 ng10g0029 1 2.1 1 1 0 s Jugreg g43297.t AU1 n 80.1 C Mornot L484 004554 8 n 8 g15 Araipa Ara 0 1 0 0 l 0.1 1 3 cl . 9 00.1

n 7 . m 1 S Cucmax C i 9 0 28.1 2 1 nge1.1g 3 8 A e 31200.1 r n B01000130.1 PON6 P Carpap evm.mo .gnm1.a 1 4 m 8 5 XP Huml -gene-0. 91 1009 2ss02220 t 8474.1 X 2 r 4 a 0.1 g Cucmos CmoCh u Thecc1 G . 8P X otri.01 8 o R 3-abinit-g er . 8 l 00. Begfu . 494.1 X m 786.1 g 3 Querob Qrob r 1 Begfuc 5 1 008 Humlup HL.SW 4 4 5 g Datglo Datg P P o 0 i lcl JXT 1 NA1489 n Datglo Dat or 1628 e .mk r . 8655 0 1 Casgla Casgl66 9403

n 9386 i c

m 5

1 n Humlup HL.S 3 l 0 r 8

c

t 4 1 n 8 i 1 reo A . Citcle Ci 2 1844 ifrunn p 9 1 T 6 u

.

R 0.1 3 r

a

o 0 9 f and lcl JXT .T Gosrai Gorai. i Citsin X 9 n Thecac P 009336 2 y . Poptri 0 P 620. 140-abinit ffold158 T 1 P

n

. 0 .

Par 1 y 1 0

C h . C 0 1 232 N a r 9 2 l a 3 ed-sca L c ffold02 l yp arah W ha lcl NW 01910 5 j a h p 9 573.1 33581 y 55

u E j 5 h

z 2 Ara 3 i a

r . 5 Z 1 Momc

A . 1

1 2 3

masked-sc 5 augustus mask 8

Casmol

Casmol snap

STR STR2 RAD1

d e f

Osmsp. 200 Polco

m Mixspe 21 .p1 1 Phacar 20424 Faltax 201 d Racelo L Dicsco 21 Scoaq L Cerp

1 1 e

.p1 e

1 Racvar 20

Physsp. 2 u p Racelo 2 rizal.p1 1 T H Phacar P u 96 Scoaq r . p Dicsco 20 1 imau a 1 . 1

g s

e h T p p1 l s ned.p Thude p . l 21417 9 u b e 0 d i . Dipfol 20 a

im f Necdou 2 Stesub And e s s es.p1 6 SZY 20 o Atran u 20052 r 2009 Fonan s 6 UOMY c

2 57287 NWQ s 1

s sses.p1 s.p1 e s.p2 p1 2 i n es.p Osmjav A e s.p2 Buxap a s s 201575 0 sses.p1 l rizal. es.p1 s 21390 s combine o o T 0 Brya ul u s e r u 2 o o Par r 7503 1 Mosses 2 e 0 s etpel 20 Phypa 6 Dipcon 21 201587 s up 2014 M s 2012 0 8563 PL 2 5 h 0 o 007572 K l 5 es.p Ophvul 20 Plai 4 M 2006 o Not g 0

1 0 Mosses.p1 368 NGTD Ortlye 2013 e 1 07958 RD s.p1 9 RXR G 1 oss h 1 F t 20 20852 1 2 1 M 2 7 Phifon 2 t 2 1 M 20 B S r e h Moss al 2057235 2 .p Mosses.p1 Mosse osses. 7 6 g 2001 1200 NG osses.p1 0 25 FF N Moss 8 M 4 1 7 3 J Mosses.p1 rts non myco es combi n Mosses.p1 20 vin X Leugl orts.p1 3705 59 4 K 023845 O Leptospo ts mycorrizal.p 4 QO Moss L ned.p1 0148 1 Q 5 BPSG 0 7 M o Por i s 20097 L M sses.p1 r A 20 o t P A W Leualb 7 w V es combined.p T G R Moss 5 R D Q K 9164 LN 9 1 Tree scale: 0.1 54 30 E sses ga 0 4202 Por 704 2 akle 8 B Y 20 M Q Ho B

Leug 3208 Mosses.p1 Tree scale: 0.1 X ZQRI Mosses.p G 5 K Tree scale: 0.1

7 F Mosses.p1 n P Mosses.p1 Y 99 WOG p O ossessses.p1s.p1 M AD 7 Dicsc AB BMM rts 2 sampl C Li C ACW Mosse TM MJ Y Q RQ Ho 9 CMEQ Mosses.p1 P 1 EEM BP P av 201 X YX Coni 9 P 8 2 4 08677 M Nagnag 2010568 U 3c23 e Retmi a S rts non myco nav 20 K Q 8 47 WN W o AWOI Mosse 4

R C 4 A P Mosses.p1 006824 OR 9 V E o 9 E D Mo 03 JMXW M 08350 QH Cerp ZQRI Mo p 0 F Mosses.p1 s 5.1 J 984 VI 200392 o 1 p1 ZTH D AV Liver Liverw Radl r I iverwo 0673 XWH F Mosses.p1 s 54 JMXW 0 0 8 V 1 22 J M 20087 C P AV les comb h 1 S K 1 Z u Racelo 2007 OO Mosses.p1 p l N T 4 5 0 2 r IGUH osses.p1 a DHW Moss M M Ararul 2 1 6 ZACW B TM 1 p T S v 1 592 T S 7 7 L i nworts spo H M 1 6 Mar 0 H O Mosses.p1 179 JA A 616

s Racvar 2 n Ararul 2006434 XTZO Co D Mosses.p1 1404 OR o 20 20039 87 CM S 1 KQD Mo 0 Selsta 20 p1 F 8 1 E F M 668 VBMM 5 e 1 5 M cophytes.p1 p 5 D Mosses.p1 TA MixspPtipul 2 8 1 85 ZACW M 07702 meto N G M V o o 05

s u 4 . 1093 Mar 5 i R MIRS Moss orts. 2 AJB H 22 o r S 3 r Scoaq S n 20008 1 rworts.p1 0 1 11 p TC 0 46.162 r 20 M nworts sp osses.p1 s s 3 angi M V rs 2 sampl 0 B sses.p 01 Ly

p B s A J 1 TM 4 21 WSPM9 SKQD Mosses.p1 1092 sses.p1 7 LN n 20724 Liverw w al 20 WG M Mar F Mosses.p 2 GH 1 B Mosses.p1 0 48106 J TXV 2 r s 2 sampl 0 44 5 p 1 s s 1 Liverw Q Mosses.p 1Mo 770 PM BO Le i G 8 J Mosses.p1 . 5 es.p1 Marp o P l 4 r X M o 0 Baztri 2 4 Q 2 7 e e 0 p 1201 NG 3 1898 LN L B i 200701 Mar EKP e 4 s g 2042 worts.p1 e 2 s ycophytes.p1 es.p

fers 2 sampl 2 o 0 36 orts.p1t 4 00V3.1 s f S Physsp. 06 3 X s t 2090541 WNGH M al 2012 2 r p Mosses s s ycophytes.p1 Mar c RGKI Mosses.p1 1 Liverworts.p1 r 00 KRUQ L C H r e r 2010 i fe 2 E osses.p 913 L

rs.p1 9 n p 2 cane 1 0 s.p1 sses.p1 s w a u VMX s S e MIRS Mo L hyte. i VS Euspo B sses.p Liver o rts.p1 . . M d r 0 4 91 HMH P 0 35876 0 K o sses.p1 a 0 9136 L n K 5 p p al 2 9 Q Mosses.p1 e 200215 201495 e 8 006433 e te Monil 7 M 3 K pal 20 l 18405 w p 17231 a L 86 LF e a 205270 rnwo ra 2100 RGKI e 20375 041 2 RUQ Liver osses l 216 729 QMWB8 MoWS 83 FFP P Pally l al 2007 r 1 1 o S Mosses.p1 o al 20 U 200215 571 M 1 s 3 Lepto fold00 Encstr 20 s 08137 QD Mosse r s.p e 9 ra 20144 101 IHWO L ornw 1 o y 00813 e m 0 ophyte.p1 r 200215 184021 9 5 LGD ycophyt ned.p1 BNCU Liverw H 2125.1 657 sses.p l ju U P Roscap 20 . 20 i p 02893 ZZ 1426 Phypat P . ABIJ 0 28 LFV worts.p1 or 1 Brya Phifon 2 2 e e 20090 2012 L C o t v 91369 0 8 p 4 sses.p1 p bra 201 YFG i r Mar 2015 Ortlye 2000 r f6142 FGenes H Aulh tospora s.p1 200 J r verwo p Pseel ycophytes 2 sam s f 1 9 1 Z Conife sses.p1 n 90 HMH V 4 94 VGS i e I 1 L 1 Rhyse 1 8 s.p1 Physsp. 200 ophyte. L r ABCD MoT 200 4 YFG 1348 H H D ive r 1 Anocl ops scaf4 12 05 LFV1505 r Mosses a 2 I P 201897 NWQC Li Leub 206 3 ycophytes.p1

nifers.p1 Plains 2080273 BGXB Mosses.p1 O P p ts.p1 1424 1 Necdou 2023 7 T R D orts. Leu Rhyser 2 Live PXA L AJ Conifers. .p1 1 .p1 6 Mosses.p1 Anocl tesub 20 3 F al 2012 Calcor o fers.p1 1 0 1 Leub L NQF ycophytes.p1 J D Mosses. Leu Mar Calco 4 i D Mosses.p1 worts.p1 7 WZY r .p1 657 IHWO Liv sporang J Conife 990 BPS Thudel al 2001 5 PIUF LiveH i L les comb B S 99 I v worts. Calcor n 200827 Cycl 04 LFV o Dipfol 2 Phyp0 OO Mo 1 Loeb 1 JHFIJ Liver worts.p1 8912 ZZO AE r K Conife 02069 1425 aklep 2 p 2 rw e Stesub HZW ycophytes.p1 G 7124 KEFD Clide M worts. angiate phytes gam XTZO C N p pal 2068 7 ts.p1 Liverw re 2002 0 T 7 p p Stesub 200 Cyclops scaf9Y L HMH Mar 7596.1 1 T 2 H r o 20109 UJS Conifers.p1 s.p v1.0 scaf Dipfol 2 3c19 Taklep 2 al 20149 7 1 o al sc 7 RBN Liv w p 1 Necdou 2 p d onilophytes.p1 es combined hytes.p1 3 YE T B L rts 2 sam P n XP 592.1 G M Luncru al 2001 200956 Liver 33 IH Y K Live o 1 Anoatt p 20096 ycophytes.p1 p uxap at .p1 Mar 8 rwor Luncr Liverw ve Clide 4081 E 8 CB L M giate 0 O Ginbil S Tr 7 3 sses.p1 p e 0 56 UUHD L FG r p1 1 e 6 hytes.p1 tophyte.p1 Liverw X Coni AIGO Conifers.p1 Z Conife 0 P Mar 200956 6 KRUQ Live e a o t ela 0 ycophytes p e pal s QS Buxap 262 P 0 7 s Loeb G 5 Marpal sc 5 r p1 oe pred onilo P Se 83 ZQVF Coni 0 e worts.p1 ff26 rwo L sses.p1 L L m 593.1 1 h p O Mosse205 o Marp i 2006 0 8 CHJJ Liverworts.p1 r Luncr WO Live P o ts.p . S 06383 A giate C 2243 u p M 7 worts.p1ts.p1 oe pre 0 Liver Liver ia A 02067 7 Buxap 20138002244 3c21 1 sses.p1 p1 Mar p e 200082 BNCU r r 200038 orts non myco Liverw r rts.p1 Sell m ycophytes.p1 onilo L TCJ Con YE 18 FGenese worts.p 1 ran M 0 r Selsta 20m Concon 2 l o P Pally e Concon p M t Monilo Sel 13290 NVG h e r ts.p1 1 09222 GAON ilophytes.p1 e ycophytes.p1m 02067 Mosses.p18 u le worts Sel 201040 ycophytes 2 samp n ca 0 20138 Pallyele 1 r onifers.p o n JDQB Conife X A PO Mosses.p1 0V3.1 e TMU Liv 200038 L phytes gam onilo 1075 XIR del. P n verwo ned.p1 rw s combin Sel 12975 GAON TUO L angiate Gb 309 2 P WOI Mo l in 2018 YK Live r 1 Moniloph oe 13 i 18823.1 h 1506 i ycann 20 w rts 2 sam 1 9 E 6 T 0 R o nifers.p w erwor ff5907 F f 12634 201 53 1 angiate 21 p 4 orts.p1 Peln m fers.p o X etpel 20 9 A Pelne av 20142 201923 9658 LGOW Liver o L e 06924 GK o e r 02067 And s.p1 1 2 T rts my .p1 1 HZW sp es.p Phaca orts 1543 CGDN C 0 H WOI Mosses.p13 P n 0 0 X Hupsel 2 0 Leptospo rts 2 s r m 66 Y ngiate Mote Monilo a 7 tophyte.poe1 3180 1 8 380 8 R 0V3. Por 20 526 46908 IL V 01712 GTUO P ia tophyte.p1 s 213 1 IRBN Live Hupmy p 2009831 767 4 XP r H WG Mosses. n 7 ples combi TXV h 0 X Pele e h up 2004 Por 2 B ed 01713 G ilophyt Phacar 0 hytes leaf phytes. . 16780 GKC 001 0 7 R sses.p1 YBQN L ycophytes.p Livercorrrriz 7.1 Hupsqu 2 YI rang n p 0 e 2015 .p L 06700 IL non ts mycorr p w 61 0 9406 H H 1 Leptospor r 21421 Scane 1 0 WG Mosses.p1 1 Fruspp 12780 WZ O B al.p.p1 Hupsqu 20 31345 ptospora 1 Live e 152 1 RWG Radli 0 T ycophytes.p1 B izal.p1 0 KP p amp p 6 1 L Hupluc 2 NOKI Lepto phytes game Gene 1 207244 2 Q Q Liverwow E giate Mo o 1 Amearg 2025941 IAJW Conifers.p1 8 i 20151 C Schisp. 2 17671 1 n l ytes.p Cephar i 200 05 WOGB Mosses.p1 orts.p O Le 20159 e 6 319089.1 P G .p1 r Hupsel 2 5 my Cephar 2091883 WY n V m 201280 A ycophytes.p1 w B Leptospo Phaca Cunlan 20199 Mosses. Calfis 201 L A 5 m 2168 5 l s combine Taicry 20 EQU 04 19089.1T0920 BQ Mo ophyte.p2 orts.p Hupsel i2c 205538 136 M te Monil 2 e 9 Athcup 20 etsp. 20 i evm 27. 3 9 ycophytes 2 samr uspora a 3 c p Chala r p1 Baztri 2 1829 L o 1 Hupsel 2 11 E 1 JHFIB s comb T EQU 05 cane 8 Pru 20185 angi nilophytes.p1 r 28 WCZBR o sh2.1 1 S 0 m rts.p 1 Linm 79517 VI UOMYSZ r 2024783 XRQ Ho 3 Q Liver rriz izal.p1 .p Piluv u P P sses.p1 oe 82473 ycophytes um lcl N 1 0 2 PIUF 1 Disarc 2 q Achr 0.1 p1 Odoprom 2 23562 GKYHZW L Pru 1 Linlin 9 JV lophytes.p1 9 1 Neopambt u A 7 BX rnworts sp m 26 i 1 2 IRBN a A q 16.1 Sel 023213 GTUOY Ho rts.p1 um lcl C 02413 Dipcon 20 WLIC Cycadales.pY Leptospo1 Live l.p1 1 ycdeu 2 o 5 tosporangiateiate Mo Mon phytes.p RXRQ HornworHo in d Dencat lcl NW 0 2 L Pru N Osmjav 20 20077 p ng ilo rnworts w e Phae 0 820.30 Hupluc 20 010093 rts.p1 mum lc C 024 2.1 e 2 UOMO Le Liverwortr o d .p1 224826 N RXRQ p1 132.1 X XP Osmsp. 2013 2018813 62 IB spora ate Mon phytes.p1 w rts.p1.p1 Phae 87.2 Vigan Hupsel 2 0 4 BC Hornwornwo Pru l NC 00823 Equhy nilo Live orts.p1 Dencat lcl NW 021319089.1 2 1 TC rts.p1 mum lcl N 0241 7695 V MR Eu rangi Aco0 g vigan. Hupsel 2 20184458 rnwo 32.1 P 00823 8052.1 2 Dioedu QV ate Mo phytes.p1 r ts gamsp r Dencat lcl NW 021 r1G058 Hupsel r2 3 FAJB H o Prum C XP 7 VQ Euspoorangi onilo nworts spo o worts.p Dencat lcl NW 021 i1g24 H 1 V 050 rnworts sporophyte.metophyte.p1 um lcl 024132.1 00823 8054.1 Osmsp. 2012 AL M rophyte s.p d 0. .p Vigr ang065 Phaca 9 FAJB H e.p1 N 199 Osmjav 201662207 888 angiate phytes.p1 etoph Musacu GSMU VU2 ots- ad V 049 RQ Ho etophyt C 024132.1 XP 01665 8056.1 2199 1 ALVQ Eusp r Monilo 1 Anacom R 4 Vigun radi 0s00020.1 Notvinhal 201 200 X rnworts gaam 1 0 Psinudar 2001 887 rangiate 1 is Bra 12 g V 0209s00 Par al 2009 94 R ZB Ho XP 1171.1 212199 Tmep r 2001 ophytes.p1 Pyrb rophyte.pyte.p.p1 Orysatd LOC HO Os07g38002G34310 Phavu igun rnworts g etophyte.p 008238 2 a 6 NOKI LeptospoH Leptospo te Monil re lcl N eita.2G357100.1 l P 11g026900.1030.1 Parh r 20246436 WC gam .p1 Pruper P 055.1 21 988 Tmep orangia 1 Bra S 08a029 hvul.0 Phaca rnworts ilophytes rupe.5G020 Linlin 2005542904 HTF ilophytes.p ined.p1 W 0089 1 Horvul TN Core Eudic 11G1869 38 WCZB Ho Prudul Pr 989 FJN Leptosp giate Mon ples comb Pyrb 4 c0 g4 i1.p1 Glymax GlyCajcaj Cc Phacarr 200220024 7 WCZB Ho orangiate Mon Pru udul2 000.1 Onosen 200879610 U poran ilophytes 2 sam re lcl NW 0089 1 Setita ic Sobic.0 87 LR 1 a 00.1 Phaca 3 p nilophytes.p1 Pyrb avi Pav sc000 6A0322 2 VITX Leptos giate Mon 88233.1 ay Zm000 ma.08 jan 46131 20024 EEAQ Eus giate Mo re lcl NW 008 1478.1 g2 40P1 Dipwic 8 oran Sorb DN5592 026287 Glymax G G227200.1 Phacar 14964 poran 988166.1 70.1.mk Blespi 200950 7 RWYZ Leptosp lophytes.p1 X Y Trisub lcl DF974 lyma.1 0 QVMR Eus te Monilophytes.p1 XP 018505 i 200715 orangiate Moni 88155.1 X P 009371 Zeam ni 20962 hr09g 4421 5 5G222300. Scedis 2 1 rangia 018.1 18074 Blesp T Leptosp TRINIT 45.1 GAU49 1 Psinud 201774 ALVQ Euspo nilophytes.p1 Maldom MDP0000 Pittri 2010145 UJT giate Monilophytes.p1 P 365.1 2 Monu r11g032 Tripra Tp57 304.1 38362 r 2006523 ngiate Mo 172991 386 POPJ Leptosporan 009364 HanXRQCCh 577 TGAC v2 Tmepa 8 JVSZ Euspora etophyte.p1 Pyrcom PCP026193.1 Ptevit 2100 Rhoforn mRNA258 te Monilophytes gam Notmon 2002199 YCKE Leptosporangiate Monilophytes.p1 Pyrcom PC823.1 1713157 Helan 0758 Medtru MtrunA17 Chr 79 Equhye 212972 Y Leptosporangia Pyrbre lcl NW 008988166.1 XP 018505017.1 18073 8g0355131 Osmsp. 2005708 UOM Adirad 2062549 BMJR Leptosporangiate Monilophytes.p1 Mald P044647.70 Helann HanXRQ ptosporangiate Monilophytes.p1 988732.1 XP 009343190.1 39583 om MDP0 r DCAR 01 5861 10.1 Cicari Ca 20398.1 Osmjav 2078060 VIBO Le Pyrbre lcl NW 008 1 Cycmic CMI00021778 1 Dauca 0002947 Lotjap Lj3g3v1739280.1 Dipcon 2010165 MEKP Leptosporangiate Monilophytes.p1 Pyrcom PCP041484. Sciver 200584 Pyrcom 000235629 r DCAR 02 2 0000263508 Mictet 20 1 YFZK Conifers.p Pyrbre lcl NW 0089 PCP Dauca EX38873 v 0.1 Lup001774.1 Dipcon 2010166 ME Maldom MDP 73857 MHGD 1 8 025047.1 FRA 0002926 Lupang KP Leptosporangiate Monilophytes.p1 08474 Halbi Con 8912.1 XP 009345 Fraexc 38873 v2 0 Aradu.69Z0K Notmon 2063504 Drydru Drydr202S 053 t d 205488 ifers.p1 866.1 4203 RAEX Aradur 0.1 Adirad YCKE Leptosporang 9 Cephar 21 3 OWFC 0 Fraexc F 00029260.2 nn1.R5AJY 2060298 BMJ iate Monilophytes.p1 Potmic 2 80.1 Auss 13043 NV Conifers.p Maldom MD RAEX38873 v2 0 ifrunner.gnm1.a 9H4 Wooilv 20 R Leptospo H4 5g239 pi 20 GZ Conife 1 P0000322681 Fraexc F hyp arahy.T Araip.PZ Linli 73677 rangiate Monilo 13421 Psechi 2 13603 rs 2 samples iben101Scf00069g12007.1 Ara Araipa PD.1 n 209668 YQEC Lep phytes.p1 Fraves Fv r7g02 544 Taicry 2 08259 BTTS Co Prudul Prudul26A002329P1 Nicben N nn1.DT2Z Linm 6 NOKI tosporangiate Mo 81 9 nife combined.p1 .gnm1.a 33533 ic 205483 Leptosp nilophytes.p iOBHm Ch ras G15 Auschi 2064429 QYLPM Conife rs.p1 400.1 Nicben Niben101Scf01477g00020.1 ifrunner 7261S PA Ginbil G 9 orangiate Roschi Rch Rubocc B Chal Pruper Prupe.6G209 y.T sc1 862- T b 12056 YIXP Leptospo M 1 4789.1 2694603 a 06895 SNJ Conifersrs.p1 68 Petaxi Peaxi16 Arahyp arah Nissch Nis mum16 00251 etsp. 20709 onilophytes. 76.1 Micdecw 2 2069845 YY 6625.1 19 2Scf00703g001 osper S Calm rangiate p1 343.1 PON4N946 354 Ambt PE Con .p 24126.1 XP 00823 Capan 17.1 tan 11328 14649 Amearac 20546375 CGDN Conife Monilop 0 05728 5 1 mum lcl NC 0 110.1.mk n CA02g2435 pu 16 B0100 049.1 PO.v1.0.G034735.16355 Apos ri evm 27.m AIGO Conififers.p1 Pru 4578.1 g Sollyc 0 Casaus Caspud Mim a1194S 101 Cepharg 215 hytes.p1 XT SW Dencath l 3 01.1 Solyc02g0 Mim 19S 1 0 RM r nd lcl J C01000 not L484 006 e lcl KZ XQSG Co Pruavi Pav sc000 Solp 91590. Cepha 2 0463 MV s.p1 Para p HL.Mor Phae o ers.p1 85812.1.g000 10.1 en Sope 2.1 Chafas Chaf G026050. Aga 111726 NIAJW Co Conife ri lcl JXT Humlu 50.1 298493893 Musacu GSMcl NW 02451 del.Am nifers.p Poptr n02g03 4 1480.1 r 20606 rs Treo Mornot L484 00S0 qu na FAN iscf000 H4 1g190 .1 i Potri.00 6 Cercan Cerca6 W mac 2069 V nife March.p1 r526 S08206 Anac PE 932.1 Tr v1.0 scaf1 Fraa Riccom 2 210.1 moCh0G01g2 Mictetolnob 2 56 GJTIGZ Con Conifersrs.p1 0158973 2626473 Ory QU 14 1 P Fraves Fv .g00002 7G004700.1 4G024910.1.1 P Horvu om 319178 KA6 1778 t Manesc 8 p4.1L 0 Mancol 0 49 .1 X gl19262 sat LOC A U 76 1 fold00 91613.1 333.m0 Cucmos C aCh0 3727.1 5531 Retmi 200956568 8R A 2 sa Distri Dist S191 Bra co A 5 .1 134.1 Potmic 3 10021 Citsin orange1.1g0M m CWS ifers.p1 lu Aln gl3972S Setitad l HO 01 Achr X 048.167 FAN iscf000 anes.12G1200564 Cucpep C 8G00614a013661 Faltax 200 200 334 MHG mples co Aln g50324.t1 Zeam is Os066002. P 0207 6084 na r2g01 1214 Citcle C P 02213i0 n SCE Co Co l NC 029689 Alng u gl19262g B R 1T155 Fraa Cucmax C Faltax 205715720 702 QHBI Co nifer m n g50324.t11310.21 So S ra V 1 0 Hm Ch as G1 3268 Carp 019932.2.1r6.434 Ambt 6867 2 s.p1 bined.p1 Alngl Al g Zeamr e d U7 g025 20 3676.1 13 B 0 icle 1400.1 493.1 X Citlan Cl h 284915 CDFRD Co nifers.p1 Zizjuj lc lu Jugre 008865 Rho b ay Zm0000ita.4i1g505H 001 iO Gosraia Gor v10024 4 Lagsic Ls 1 Apos 3824 9 07G0 a .1 ic Sobi r1G0082 r303S 4 p evm.mo 02943 P ri evm VGSX nifers. Alng Jugre r3.100 .1 Dauc G012 0.1 Rubocc Br 1 Thecac 01910 odel.C S442 66 ha he Co 13794.2.1 1 Dauc for ay Zm00 60.3 41 099.1 715090 m l MELO3C.m 1 221537 Dencate P 0 Rhofor 7 Roschi Rch u Dryd Casmol 877m egfu54S S Dencat l q lcl K LYX C nifers. p1 a TRINI c. 000. 4 r 1 d u 27.m Con Citlan Cldel.Ch Helan r 0 8a0241 20.1 ON97 Querob Qr Thecc1ai.008G088el.supe a lcl NW Musacu GS PEQU p1 Lagsic Lsi o G012170 Helann a DCA 10G013 Dryd 885.1 3 h Cucme l1640 65 Musacu G Z45 o ifers.p 6 120.1 0 Hela r DCAR 00 0 1 98.1 714340.9 Jugre gfu9142 2 lcl NW 021 o nifers 2 sa G0126109 Helann T 0 38 0 maker- e 707.1 15 Anaco del. MELO3C 6 5 1 n HanXTRINIT 8a03 34.1 P Casgla C rcontig 129.17 Momc CucsatBegfuc evm B 9 69.1 30256 Orysat L cl NW 2014.1 P n 1 Fraexc F R 0 Y 23 18399 EG000924t 4 Bra 26 A ifers aCh1 Fraexc F nn DN48 0 S Alng g 5 m m Petaxi P Han 0 0 g490o 00.1 ON5 Horvu 6 T oCh1 G006070.1 Nicben N Han 8 39 0 Cercan b P038sca Begfuc B P 186509 Setita d m 45 r v1.0 A.p1 m Cucmel m 6 Nicben 18 .1 C010000482.1 PON Datglo Datg 93.1 76313 S M 021 ples co G006000.1 20692 1 Capa HanXR Y 4 5 07 .v1.0.G001 Mimp lu ff .v1.0.G019297.1 S is Bra A MU UA 3 K p4.1LG14g10130.6 S 5208 Solp XRQCRQ DN7214 00 c0 W 23.1 22595 0.1 old057 Zeamor co01 19817.1 A51190 Cucsat evm.m 4.1LG08g0 Sollyc 23 2 034.1 PON97 6.715 Mim Aln asgl291S07.t1 W ot L484 020 OC Os05g41 3 sc S50895 0.1 1 Popt RAEX38 XRQCh .S 9 22378.2.1 1 137.1 n Nelnuc N l p oCh0 Poptri C 9 01000 Casaus b HO A Achr19817 a mb 368.1 1615 850.0 .1 Man RA ri lcl JXT 6 0 391S Helan 565 0. Riccom 2 Cerca9 32.1 PON54 ic Sobic.Sei f Cucmax C m aCh0 5 0 5 Citsin ora n eaxi16 hr03g Nissch N gl603 880.2 r 6498 Dauca d Achr fold001 08 7 Citcle C e g3 ud Mi 4 2636.1 ined. 81S3 600.1 Carp N reo 1 Mor 47.1 360 0 Dauca l 2 3 Gosrai Gor n 01000 p 33144.13220 5 i2g2 .1 1615 m 1 Thecac n Sope i QC h Ara RV 6 1 Chafas Chafa E T Rhofor a 5 Cerc r 3-au 0249279 Casaus Casta 2 Cucmos C 41 M ben10 p HL.S C C Distri Distr2147 007076 ud Mi 0100 48.1 036010 4 5 ta.3G1 XP 24 i CA01i r03 i1.p Ara u P .1.mk7P1 Fraexc F y T21490.1 p1 Cucpep C gfu1gfu817 2ss01540.1 esc Potri.0S ben10 X 6 c0 g 1g07 3 1 a 2 4 024931 P Frae 5 W Pot odel.Chr atg .1 r17g 6 a 015876262.1 7752 e 0 . 380.1 Ara S 14102 TB X 242 Capan n Ha Zm00 U1 1 XP 02067 01.38 e 0 1 g10 olyc08g 38873 h 0 6 g T21500 K.1 A2 01000 9 nd lcl JXTB h m X Solpe P 022145367.1 161 7 873 v2 g0 08 s 1 Ara d Humlup HL Cucpep C 2 r09g 33230 ustus-ge 2 a XP 09G01789 7 4 Cast J C .1 0.1 Solly 7 7 02214 B G128 K4.1 2 570.1 Petaxi r DC NU 1 X 600. 5 Casaus Castan d yp ara S 00003 a 02067 9V 90.1 t Huml H 7 S pu994 Nicben N P a l MELO3C0 m 1 8 Nicbe r 0 c Be 2 G21370 p evm.mo M 08 1 Lupan 001.1 0 jan 1 9 r Para .1 31708 Distri Dist Nicbe G149500 7 1 0 1 i1. G000380.1 h ur P DCAR 01 Cucmos C radi u ri.001 161 00001 Carp FB Nissch Nissc5 r1G068 6 1 u cf02295 xc FR 001 n Cerca9 n g -1 Citsin or 9750 9 7.1 207NA178 U icl 6 Ara 009339 m PC 0237 0 Carp Citcle Cicl a 9 5546 0 Manesc Manes.10G063500.1 n Scf02 issc5 pu262 T TRINIT 1S G 6 8 i yp arahy -2 T ane 138 001.1 0G2606 V s 0 363 0.1 7 08g0 reori lcl JXT Ar pa nd lcl 906 t 008a0 nge1.1g 0 624.1 246 g0409 T Arahyp arahy A X Cucmax C ang00 54 827. 1 25075 8291 M 1 R 0 T 009339 7 Ar 001 2470 h Arahy a 0 A c AR S e 8520 v2 Cicari a 32S20 5 igun0 2 . n Sop .H2WL6 8976 0.1 o p 1414 Trip Medtru Mtru 0753 d 0046 Citlan Cl n XP n C 5 hecc T 1 .t1 R RQCh 6 523.1 V 3G103100. cf 0 4 Datglo D ar ri lcl JXT e 0.1 .1 n e-0.12 882.2 161 Begfuc ip.02M 7 v10000885m 1 V 0 0 a t 1 a 6 0 s CA08g 625.1 .1 S Begfu ai.008G220 1 5 a h 0.1 j Cc 0 Cucme r i P a Ar XP du.Z r s.17G003 71 m MD 1 0 c AE 0 pa osper S 426.1 g05 a n i Ca 13803. 0 75760.2. 1 0 P 523.1 X p G1308 Pyrco r 0 i dur .g0 n u 01095 du.P8y 93S10677 reo ap evm.mo olyc01 g 8 02 AE a s Lagsic Lsi p o 2 1581 34 0 g Lup02 3 2 a 6 a igr 5 2898 m MDP 8130.2 p evm 4 7 m0 11 . T 00504 n1.BJTL8 adu u NC 02968 4 n a dul26A 4 5g271 8120.2 36097 N eaxi162 1 g3v1 Cucsat evm.m l 60.3 t 4 g00022.010958 ra T -R 00.1 3 nn1. Ara 0 i r 3S2 o c N V oCh08G000 - mRN 2 -mRN Y Phvul.00 a X n L 1 335.1 250 1540.1 0 u 2463 5 mRN S 7 8085 A NC 02968 15S1 i 7g02 i 3134 l 2 Ar 6 a 3 Ar - S AU3167 410. ifrunn en l as G27 G i

0 n 9 p.C87U 8550.1 ben10 X c ange1.1g r09g r 0 vigan. Ar d p ara NC 029681.1 XP 1 ben10 DN62951576 4 n 5 i E Ca 243 T 38873 v2 0 4149 m 1 r 4 A g050 l u Drydr nn1.DK 0 S Aradu.Q4WI . ben igun 17 Chr 1 0

4.1LG17g09 8 g 16200.1 m Mald .1 g Cajca Lj s ospe 2 Tp57577 Ti 388 Cicar el.s o T o 1.a g 2 2 G016677t1 m 2598 8550.1 h ur 8 V 7 08933m a 1 ipa e 01g0 11g09090.1 A 00823 77997.1.g0 Potmic 2 A s n 31619 M . 0 g48181 i 8 s 02001.1 2ss0221 um 138 Mald r 5 00.4 0.2 p frunn ul Pr 2165.1 e d -1 i v100008 g0 1 P G a 1 Zizjuj lc a 6 .mo 9810 havu p.3F2C7 g 0 dur p C G1 . d 0 N 0 2g0885 S c

5 1 8G0 10 jap 0.1 -0.18 sgl16 TGAC v2 m p X 0 upe 0 Zizjuj lc e i 73 igan

2 258271 1 022155334. G143400 e-0.18 b P018 P e 400 1 . S 2 7 55 osper W 2 Drydr c S 96820 Ara h 7 h ob P018 7 r. Zizjuj lc N u r radi 1 1 V h 2 2 2 o 1Scf 7 .gnm r 3 1 Ara Glymax Glyma.09 r 0 m 24099 02215 2 1 4 464.1 G .1 r cf00647 r .gnm1.a m 100. gnm1.an L

5 y. Pru 0 Scf028 c 1G0201 a lcl NW 019104 Ara Cucmax CmaCh0 v1467920. 8 V d 1 Ca 0 7 nA17 Chr5g041 .T 4545S 040- 003 0 v2 00 Glymax Glyma.01 Lot r e . E 0 vi Pav sc000 del.supe Scf028 3 27.1 1 c0 g3 i6.p1 h Mtrun e rcon NW 0089 i s um1 0 7 el.supe .gnm1.a g . Fraves FvH 3 0 00. Tifrun u d r r 520.1 1 G . 009980m a lcl NW 01910 s Alngl7889 Jugre Cucmos C 0 ifrunner .gnm1.a OBHm ChRubocc B t 0

7 i igun0 6 i b Qr h m ang00 ul.0 0 er otri.00 17776 nn 4 o Cucpep Cp 0 3 mum2 u m 1 PA 6 V .2.1 03379 N TGAC v2 l

T Prua AN iscf000 59m 1 1 4 6 Begfuc Begfu848S P o tig 90. .V 0 Datglo Datgl U 0337 N iscf P Momc 5 F i 4 g G e lcl NW 0089 Tp575 4 i c ifru 9 Y 024130.1 0 hvul.00 9 515- A ifrunne A 27149 r Vigra 0 c g 5 Medtru 5 ne Casgla t 9 T n1.5W9 . i 1g01008 Momc A Pyrbre lcl C F ustus-gene 624.1 XP T 8 04624.1 XP na gustus-gen 1 rcontig 1 ra 00529.1 ifrunn Quero Alng 9 1 p 1 G y. 4 N Querob Q rcontig 54.31 R g0000 4 Cajcaj Ccajan 12295 Glyma C a y. 1 T 0 9 8 3 o g01 b lcl DF973 r. .gnm1.a n Pyrbr

n rip - 1 0.2 4 8 P t 651-P Poptr y. vigan n1.T P P . gnm1.a 1 T Vigun v 0.1 rah 4 Fraa A

isu A

0

Lotjap Lj2g3 2 Roschi Rch 00

0 mRNA382 Tr Q31.1 34-aug Phavul Phv Fraa m 8.1

8 B Phavul P mum lcl 9 . yp a 28 1 yp arah 3 A Z 1631

Glymax Glyma.09G202300 R .1 h Glymax h

Vigang 7 T n N 9 nn1.SZ 1 Pru 9 n1.5KC .1 fold05534-au Ara A P.1 F f fold055 Ara f

3

D a lcl NW 01910 Arahyp arah

8 l sca

3

c

l

8 -sca

5

7D Momcha lcl NW 0191 6

4 b 5

u 86.1

Momch 6R.1 s

i maker- r l maker

T

Casmo Casmol

SymRK CCaMK CYCLOPS

Fig. 2 | Maximum-likelihood trees of genes specific to the AMS or intracellular symbioses in land plants. a–c, AMS: STR (sequence substitution model indicated for each phylogenetic analysis (model): TVMe + R5) (a), STR2 (model: SYM + R6) (b) and RAD1 (model: TVMe + R5) (c). d–f, Intracellular symbioses: SymRK (model: GTR + F + R5) (d), CCaMK (SYM + R6) (e) and CYCLOPS (GTR + F + R5) (f). STR and STR2 trees were rooted using their closest paralogues STR2 and STR, respectively. The RAD1 tree was rooted on the bryophyte clade. SymRK, CCaMK and CYCLOPS trees were rooted on the bryophyte clade. Species names are coloured as follows: black, species with intracellular symbiosis; light red, species without intracellular infection; light grey, species with undetermined symbiotic status. Cyan dots indicate species with AMS. High-resolution phylogenetic trees are available as Supplementary Figs. 1, 3 and 13–15. and ssp. montivagans—that are sister to M. paleacea. We pheno- least 29 convergent losses of AMS, thus representing many indepen- typed these two subspecies in controlled conditions and confirmed dent replications of AMS loss in vascular plants and in bryophytes. that the loss of AMS occurred before the radiation of the three We therefore conclude that RAD1, STR and STR2 were specific to M. polymorpha subspecies approximately 5 million yr ago28 (Fig. 3). AMS in the most recent common ancestor of all land plants. The fact We sequenced high-quality genomes of M. polymorpha ssp. that all three genes are absent from non-AM-host lineages indicates montivagans and polymorpha and searched for the presence of the a particularly efficient co-elimination of these genes following the six aforementioned genes. As for M. polymorpha ssp. ruderalis, all loss of AMS, suggesting possible selection against these genes4,17,27. six genes were pseudogenized or missing in these two novel assem- We suggest that one potential driver for the loss of AMS is the adap- blies (Fig. 3). As expected for genes under relaxed selection pres- tation to nutrient-rich ecological niches, which are known to inhibit sure, the signatures of pseudogenization were different between the the formation of AMS29 and thus render the symbiosis redundant. three subspecies (Fig. 3). Alternatively, selection against these genes may be driven by the We conclude that mutualism abandonment leads to the consis- hijacking of the AMS-related pathway by pathogens that would tent loss, pseudogenization or relaxed selection pressure of at least result in positive selection acting against this pathway in the pres- six symbiosis-specific genes in all surveyed land-plant lineages. ence of a high pathogen pressure. Although this question is not settled yet, the example of RAD1, which has been demonstrated to Genes specific to AMS in land plants. We have shown consis- act as a susceptibility factor to the oomycete pathogen Phytophthora tent loss of six genes with mutualism abandonment, but with the palmivora in Medicago truncatula30, provides support for this sec- broader array of genomes present, we are now able to test whether ond hypothesis. RAD1 encodes a transcription factor in the GRAS these genes are lost with mutualism abandonment generally, or family and rad1 mutants display reduced colonization by arbuscu- specifically with the loss of AMS. Three genes, RAD1, STR and lar mycorrhizal fungi, defective arbuscules—the interface for nutri- STR2, show a phylogenetic pattern consistent with gene loss specifi- ent exchange formed by both partners inside the plant cells—and cally associated with the loss of AMS (Fig. 2). Our dataset covers at reduced expression of STR and STR219,31. These two half-ABC

Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants 283 Articles NaTure PlanTs a c 01,000 2,000 3,000 4,000 Marpal CCaMK Marpal Marpol rud Marpol mon Marpol pol Marpol pol Marpol mon Marpol rud

Marpol pol Marpol mon Marpol rud Ber-C Ber-D Bid-A Bie-A Bot-X Bul-B Cas-A Cas-C Cas-D Cas-E Caz-A Dur-A Esc-A Gen-A b Gen-B Marpal CYCLOPS Hec-A Marpol pol Hec-C Marpol mon Lac-A Luc-B Marpol rud Mns-A Mou-A Marpal RAD1 Mur-A Marpol pol Nor-A Marpol mon Nor-B Marpol rud Nor-C Nor-D Marpal SymRK Pra-A Pra-B Marpol pol Pra-C Marpol mon Tou-A Marpol rud Tou-B Tou-C Tou-D 0 1,000 2,000 3,000 4,000 5,000 6,000 7,000 Tou-E Exonic region Intronic region Deletions Insertions Voe-A

Fig. 3 | Loss of symbiotic genes following mutualism abandonment in Marchantia. a, Ink-stained transversal sections of M. paleacea (Marpal) and M. polymorpha ssp. ruderalis (Marpol rud), ssp. montivagans (Marpol mon) and ssp. polymorpha (Marpol pol). Arbuscules are present in the midrib of M. paleacea and absent from the three M. polymorpha subspecies. Similar results were observed in three independent experiments. Top images: scale bars, 1 mm; bottom images: scale bars, 0.25 mm. b, M. paleacea gene models aligned with the corresponding pseudogenized loci from the three M. polymorpha subspecies. c, Multiple sequence alignment diversity in an approximately 1-kb region of CCaMK. Pseudogenization pattern in 35 M. polymorpha accessions compared with the three M. polymorpha subspecies. Red vertical lines indicate mismatches and white boxes and red horizontal lines indicate gaps.

transporters are present on the peri-arbuscular membrane, are ectomycorrhizae, resulted in only a quantitative decrease in ectomy- essential for functional AMS, and have been proposed to be involved corrhizae, whereas AMS was completely absent36. in the transfer of lipids from the host plant to arbuscular mycorrhi- All other lineages that switched from AMS to other types of zal fungi20,21,32,33. The specialization of RAD1, STR and STR2 to AMS mutualistic symbioses retained the three signalling genes SymRK, in all plant lineages analysed supports an ancient ancestral origin in CCaMK and CYCLOPS (Fig. 2 and Supplementary Figs. 13–15). land plants for symbiotic lipid transfer to AMS. These lineages are scattered throughout the land-plant phylogeny, thus excluding the hypothesis of a lineage-specific retention of The symbiotic signalling pathway is conserved in species with these genes. The three genes were found in the genomes of three intracellular symbioses. In contrast to RAD1, STR and STR2, Orchidaceae, Apostasia shenzhenica, Dendrobium catenatum and the symbiosis signalling genes CCaMK, CYCLOPS and SymRK Phalaenopsis equestris and in the transcriptome of Bletilla striata37, are not absent from all species that have lost AMS. To understand which form orchid mycorrhizae with Basidiomycetes that develop this mixed phylogenetic pattern, we investigated their conserva- intracellular pelotons5. The three genes were also detected in tion across species with diverse symbioses. CCaMK, CYCLOPS and the transcriptomes of liverworts from the Jungermaniales order SymRK were absent from seven genomes and fourteen transcrip- Scapania nemorosa (3/3), Calypogeia fissa (1/3), Odontoschisma tomes of Pinaceae that form ectomycorrhizae but do not exhibit prostratum (1/3), Bazzania trilobata (2/3) and Schistochila sp. (1/3), AMS. None of these genes were detected in the genome of the fern which have switched from AMS to diverse Ericoid-like associations Azolla filiculoides or in the transcriptome of the liverwort B. pusilla, with Basidiomycetes and Ascomycetes that form intracellular coils5. which have independently evolved associations with nitrogen- Furthermore, these genes are also conserved in the legume genus fixing cyanobacteria but have lost AMS9 (Fig. 2 and Supplementary Lupinus, which associates with nitrogen-fixing rhizobia forming Figs. 13–15). During ectomycorrhizae, the symbiotic fungi colonize intracellular symbiosomes in root nodules, but has lost AMS38. the intercellular space between epidermal cells and the first layer of A unifying feature of these species that have preserved the sym- cortical cells5. Similarly, in both A. filiculides and B. pusilla, nitro- biosis signalling pathway but lost AMS is their ability to engage in gen-fixing cyanobacteria are hosted in specific glands, but outside alternative intracellular mutualistic symbioses; we therefore suggest plant cells34,35. Therefore, all the lineages in our sampling that host that these genes may be conserved specifically with intracellular fungal or bacterial symbionts exclusively outside their cells did not symbioses throughout the plant kingdom. To test this hypothesis, retain SymRK, CCaMK and CYCLOPS, suggesting that these genes we added to our initial analysis the only known intracellular mutu- are dispensable for extracellular symbiosis. Confirming this, knock- alistic symbiosis not covered in our sampling: ericoid mycorrhi- down analysis of CCaMK in poplar, which forms both AMS and zae, which evolved before the radiation of the angiosperm family

284 Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants NaTure PlanTs Articles

a Control Mimpud Distri Fraves Horvul Zeamay Marpal CCaMK-K

0/7 6/6 20/21 13/13 17/17 10/10 16/16 b CCaM K

0/20 5/8 14/15 12/17 4/6 7/8 18/23 c Control Abnormal infected nodules Infected nodules 100 ) a b b % ( 100 b

b ) 75 Marpal 75

50

50 e of plants (%

g a CYCLOP S

Medtru 25 Percenta 25 ercentage of infected nodules per plant

P 0 0 Control Marpal Medtru Control Marpal Medtru

Fig. 4 | Conservation of biochemical properties of CCaMK and CYCLOPS in land plants. a, M. truncatula pENOD11:GUS roots transformed with pUb:CCaMK-K from Mimosa pudica (Mimpud), Discaria trinervis (Distri), Fragaria vesca (Fraves), Hordeum vulgare (Horvul), Zea mays (Zeamay) and M. paleacea (Marpal) show strong activation of the ENOD11:GUS reporter (in blue). Control roots transformed with an empty vector show little or no GUS activity. Numbers of plants showing a strong ENOD11:GUS activation out of the total independent transformed plants are indicated. b, M. truncatula ccamk mutant roots transformed with pUb:CCaMK from M. pudica, D. trinervis, F. vesca, H. vulgare, Z. mays and M. paleacea show infected nodules 26 d post-inoculation with Sinorhizobium meliloti LacZ. Bacteria in the nodules are stained blue. A representative infected nodule is shown for each CCaMK orthologue. Number of plants showing infected nodules out of the total transformed plants are indicated. Scale bars, 200 µm. c, M. truncatula cyclops mutant roots transformed with pUb:CYCLOPS from M. truncatula, M. paleacea and an empty vector (control) show nodules with variable infection level, whereas with the control plants most of the nodules are unifected or with arrested infection (as illustrated), with M. truncatula CYCLOPS and M. paleacea CYCLOPS, fully infected nodules are observed (as illustrated). The box plots show differences in the percentage of fully infected nodules per plant (n = 19 (control), n = 29 (M. paleacea), n = 21 (M. truncatula)). ‘+’ symbols indicate mean values. In box plots, the horizontal line shows the median, the box spans the first to third quartile, the top upper whisker represents the third quartile + 1.5× interquartile range and the lower whisker represents the first quartile − 1.5× interquartile range. Letters indicate different statistical groups after a FDR correction using a 0.95 threshold (Kruskal-Wallis rank–sum test; 2 2 2 χControl Marpal 7:9343, χControl Medtru 11:976, χMarpal Medtru 2:2817). The bar plot shows the percentage of plants with fully infected nodules. Letters I � ¼ I � ¼ I � ¼2 2 2 indicate different statistical groups (χ2 test of independence; χControl Marpal 4:2338, χControl Medtru 6:5317, χMarpal Medtru 0:25144). Two independent ¼ ¼ ¼ biological replicates were conducted for each assay. I � I � I �

Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants 285 Articles NaTure PlanTs

a Casmol b c

maker- Dencat

A

r

a 4 Dencat l

.

h

4 sca y

I

p

6

I

a lcl NW 02 ffold0077 5

r

Casaus Casta

Z a

. 85

h

1

y Dencat lcl NW 0213

Tree scale: 0.1 n Tree scale: 0.1 Tree scale: 0.1 Carpap evm . cl NW 021

T n 985.1 1283 A .1 i Zizjuj lc

a f 48 5 . r p 6008.1 992.11290 1282 u 0-augu 1 Mimpud Mimpu2183S08467 5 949.1 12 Aposh o TP ids.p1 n 1 .1 1 Chafas C m Nissc 1 5

n s N Alnglu n 1 5967.1 1287 1 Gosrai G Alnglu e 1 h 9 Thecac g 5959.1 1286 Querob Qrob r 976.1 1288 3 ids.p1 . NA138 937.1 5995 02213 e

.

r Casgla Casgl36S16522 02213 5 Manesc g P 000.1 1289 4 nospermum2 R 19658.1 sterids.p1 ids.p Lupang Lup00170 09070.1 l NC 029682.1 Citsin stus-ge e P 02213 6 n r Manesc Ma A l h Nissc19 2 .model.sup 36300. 600. g Aposh n 02213 c m 0.1 X 5943.1 1284 e lcl K Y6M7 ids.p1 r 02213 3 2 Nicben n P l r

a Alngl1 . 1 ts-Aster Alngl1 g0196 3 Fraexc F Citcle C 1 1 Monuni u P 02213 18735.1 Aster .

i K hafa1912S .

p 5 r Aposhe 8 Petaxi 440.2 1 n1.P89 o Fraexc FRAEX Popt . X 02213 Thecc1E f Sarsa orai.002G a oran 0 i 440.1 a Z ste Poptri Riccom 3 P 02213 0 n c Sollyc Solyc04g0 7430.1 488.1 X Fraexc FR n g Jugreg g1960.t1 C7AWI 7 T ne-0. 02213 0 7g186 Manes.0 i

A P 4 A . Monu Solpen 4 0 Fraex n e t

488.1 XP 1 y 4488.1 Lotjap 2

P 2 Nibe 1730S14 r Z45 5 n1.RR ri Po 1 S159 6 Gosrai Gor . 2016131 1730S1 4 488.1 X lcl KZ45 . n n 2042 Peaxi16 a h 65.1 GAU16 10 icots- 08g1574 0.1 ge1.1 . 098283 8 44S0659 4 2 n Thecac udicots-A 0 X Poptr TGAC v2 m 0 P0666670.2 yma.12G174500.1 20054.1 0 4 i a 17 Chr 1 Gosrai Gorai.013G0 RAEX388 iclev10024 F Sarsan 2 Potri.0 nes.05G1467 vul.007G1 o d p Dencat lcl NW 021 icots-Aste ercontig 26. 3900.1 ni 2005151 r 0 38-mRNA- gun0 0 3

Phae 0 C 0 7 A 7 5803-P P 1 90.1 c c FRAEX3 . 4488.1488.1 XP X lcl KZ45 cots- n101Scf a g3v1697 R

tri. 7 0s00190.1 i Aradu. r

1 6 1888. 0 Mon Capann Sopen G02197 104488.1 4 9 1 9 Helann H radi g3v169 191 488.1 X 5 Phae 02067 8 re Eudi a 0170.m A 5g3v169 X 11 Zeam 440 p j 8 9 254900. 17660 4 5 Sor e AEX38873 5 Horvul HO i 5G146700. 423S 488.1 X 973 5 Sarsan 2007089 SE 002G0 g000 5 7m l c

G y

4 j .gnm1.a 8 4 G LR . P p uni 200 c r 7 4460 G 1 7 1910 c i 2Scf00 h Lj 38873 v2 000310180 A 1 A 0 043369 SE 05G240 6 X re Eu 0. gfu1098 9 SERM Co 454 u e Eud 02854 a e 0 a Mtrun Rhof 0 Dencat Ca 29590.1 P o

R TN Co . i qu PEQU 09027 b 1 73 v2 0 r 4794.1 P s Core E 1 1 4 150.1 A i Ca 09968.u 1 Tp5757 P 1g3v39 1 e Eud LR j g0905 8 02597g01 . .gnm1.a m 04g025 Datg Thecc1 . . ic Sobic.0 otri.00 744m lcl DF973 1 P NW 01910 G000970.19 G r Aradur G A 600.1 X Phae qu PEQU 00 r Setita Se l 8873 v2 0 CA04g12340 300 Cajcaj Ccajan 22455 Glymax Glyma.10G194500.1 Musacu GS 015881 anXRQC Glymax Gl C 39m 01419 9t1 Phavul Ph a lcl NW 019 K a M Co 57940.2. 2160 Vigung Vi b 9 a 3845 9359 or G001010.1 745 a Vigang vigan.Vang06 TN Co 1969.1 2 e 2 e TN Core 1 Vigrad V 5 1 Lotjap L h 3 P K ai.005G15 0 Lotjap Lj 9 y Zm0000 lcl NW 0213 Dau 066g000 a lcl NW 0 3 7 3 0 K Lotjap Lj ha lcl NW 01910 A c 99S0

1 c0 g2 i1.p1 Ambtri TRINI 00. 007s00 1 2 d re Eudic Cicar a lcl 02068 m Denca m v2 00 h . Cicar ripra K S . J m 073.1 222 0 100.2 a Cepha Medtr risu a lcl NW 019 r3.4263 99.1 10290 Zeamay Zm 5 003101 h ifrunn o RM Core re Eud 1 T y LR s 8 y Ambtri evm 3.4263.1 2 s c0 g2 i2.p1 0.2 T A660 l Bra 0 r l car D j 800.1 2 Datglo a lcl NW 01910 003747.2.1 719.1 7 P q 0 9 R 2 ha lcl NW 0 Sor a 2 Aposh aCh0 e Begfuc Be e PZ Co TN C 0 .T m 3 LR evm 27. TY h a lcl NW 01910 8 AV XVX Cor hr12g036 9 7 Momc . u PEQU 18 247.1 5521 G 319835.1 005. igun01 VU2 r 2014 G n 0310180 m y ifrunn n RM Core -Asterids.p1 0031 Momc n 7 Eudicot 7 EG0 7 SER P 6 4.1LG17g0 K t lcl NW 0 Anacom 1 Momc del.Ch .1 817.1 384 1 1.1 G123000.3 Cunlan DN109 ha lcl NW 0191 P radi0 S05308 dis Bra ots-Asterids. a 4 icots-A CAR 01 42. 4.1LG06g00 V a 8 m 0 x Momc b Phvul.001 5 074.1 1273 o 1 . W A45816 x .T 1 i l v3 2474 e lcl KZ19682.1 ore Eud Momch del.Ch 0 0 p ta.2G337400.1 m Eudicots 80.1 Zeama ic Sobic K V Anacom 0 v10016 o 0598.1 281 y v

415 a 1 Momc 850.1 ip.BY5 0 a 67.1 14 M M model.Am . 5 g vigan.Vang04 M 27.model.Am 0180. 1 Momc H 2G32 nge1.1g e 1g0347181 70.2 A569 1 e Momch 55.1 PON93 397.1 16769 d Nissc30 1565 0 6444 1 g a DN60517 2100132 s-Asterid . MELO3C 02492 1 U 8a028 1 NVGZ Con m c l 0 c m 129 3 Momc 60 r1G045 99700. 0400. 98 c0 P 5466 0 0 0.1 sterids 721 048.1 PON7 a du.J5EZU 1324t1 451886 . 1 p 8 396.1 16768 s

Cucmos CmoCh0 y 2 A Achr s hr Phaequ icots-As 387 0 X 058 y Ara Phae DN60517 00317470.3 7 XP 020848 l 9 21319 0 6919 P Cucmax C 4G01750 l a 4 6 3 di1g25 a or Setita S 000 2 gun XP e i 20088 -Asterid a 0 yp arah e 284 02068 1 Cucpep Cp y Zm0000 Aco0 2 Tr v1. Orysat LOC 9 3 igr gan p ZQVF Conif 024930 1 G n 977.1 3305 n Amearg Cucpep C 04G017560. G h 370 n 9 0.1 g1 i p1 1 19.1 Horvul HO .1 21534 Phavul 003174 0100 . Ar r 7 1 .1 PK .p1 Cucme C01000 Ambtri 0 9.1 10230 4 Vi pu4695S107250S0 Musacu GS 02855 Aco0 5 a 0 ifers Tr2 s.p B 5277 024930 V ipa yp arah a a an 20099 PEQU Cucsat evm.m 0 P 0 08a004 2 13 Vi 5 982 q 9 5 l 2014 03174 terids.p 2 m Phae 1 v1.0 0scaf scaf 1.p Cucsat evm.m X 1G149400 Nissch h ur u 9 9090.1 22 1 4.1 XP 40 531.1 1 M M s.p1 009334 Ara Ambt 20546. 3 0 93 C TRINITY 20238 A66456 Citlan Cl ot L484 0160 c2 T2485 4 Cepfol Cfo sampl 1 P e 13 Aposh P 880. 50.3 Citsin Lagsic Lsin 1 evm 27.m Ara .1 Phae .1 Citcle Cicl 02889 ers 2 samp i lcl JXTr142S Musacu GSM 0 Monu TRINITY fold 1741 Lagsic Lsir 9581 ita.9G151600 Ara NA3089 EQU 055 0 12.1 1 nd lcl JXT 7 ri evm 27.m XP q 5550.1 247 9072.1 Sars 38873 v2 0 0 1 08 IAJW fol Mor 29684.1 459150.1 X ud Mi u PEQ Cavcua 20 X 1 1 es c 00041.17.1 1 7 reo l NW 0154539 .1.mk RVU4 8a0197 Arad e lc n HanXRQC 0 d00041.174 70 T Os03g452 787 M 020 1 Ledpa DCAR 01 Psemen PME 9 Para l NC 02968276S0 Lupang Lup000 q 0 Rhosco 212 r g ombin 1221 NC 0r Anacom l UA u PEQU 42 1 AE 3 v2 les combin Conife Distri Dist l P00007 253.1 617 o Chafas Chafa g Lup001 364.1 GAU27947.1 17005 K 001 Rhofor r DCARR 012949 Zizjuj lc del.A Hr1G00 Z45 Achr167 Rhofor 00218. ed.p 364.1 g030 252.1 6176 .1 Mimp odel.Am U 34 8g Zizjuj lc u Dryd MD 025802.1 1 U 19 Anaco 4 41 Helan RAEX388733887 v2 rs.p Zizjuj lc P 64600.02P11 P 008221 064.1 6175 m Nissch Nissi Ca 04609.1 TGAC v2 mR 1939.1 P 262.1 F cf00007 5.1 0010893 1 0082218 T A 92 Dauca S ed.p 1 Drydr re lcl NW 00898800303 P r v1.0 sAchr Aco0087 . Lupan 77 0T07 3 Dauca Outgroup Maldom A 127.1 X 2 9 1 9599.1 T m 79 RAEX cf0055 1 Pyrb rupe.1G3 XP 01664 180.10.1 Cicar lcl DF973 4 r v1.0 scaf Aco0 1 Fraexc F 1.1 8 avi Pav sc000dul26 024 8T2028 b K 9 27 06 axi162 Pyrcomer PC P u C c Medtru MtrunA17Tp575 Chr6g046105G088400.161 A603 00 0011 Fraexc Scf01845g00009.17429g0000 Pru 02931 af 73. risu a 1 13 Fraexc FRAEX38873axi162S v2 200 g0500 Prup um lcl N 941 fold00 T r 0g3v00a.0 5 0577.1 Prudul Pr C 024127.16g0 3 0 1 rip 3 00.1 fold0014.1 6 Fraexc F Prummum lcl NC 024127.1 X6g0302 001 T 66 um lcl N r 004.96 Pyrbre l Petaxi Pe 1Scf01926 Vigrad Pru m hiOBHm Chr Lotjap Lj 004.233966 iben101Scf0 Vigang Pru OBHm Ch 1.1 4G0512 cl NW 008988 Petaxi Pe vigan.VVradi07g28420. Rubocc Bras G02236 90.1 Petaxi Peaxi16 2020 Vigung Vigun0 Roschi Rc 0001.1 Glymax Glym 0 Pyrbre lcl NW 00898 6100.1 ang10g01 9479 t 54331.1.g0000 1.1 1 Nicben N 02 Phavul Roschiic Rchi FAN iscf00044624.1.g00001.100 7504.1.g0 Maldo Cajcaj Ccajan 4 678.1 XP CA07g1 Glyma Phvul.002 1 Potm N iscf0 2 .1 n 07g 70.2.1 3g022400.1980. Fraana 0767.1.g0000 m MDP000 radi01g08 Nicben Niben10en Glymaxx GGlyma.0 1 AN iscf000 g00001 Phavul Phvul.00 8678.1 XP 018499764.1 38759 Sop 2 G311800.1 Fraanana FAF 0001.1 Maldom MD gun04g077700. Capanen 2 lyma. 5G229600 Fraa FAN iscf000085426.1.8 .g0 0141327 Vigrad V A mRNA- Tripra Cajca 08G03 Fraana FAN iscf00 1564.1 009342 Solp T Tp57577 j Ccajan 7000.15678.1 na N iscf001480.1 P0000160886 Vigung Vi um10080-P Pyrbre risub lcl DF973 TGAC 1 Fraa na FA g00001.1 Maldo lcl NW 00 Pyrcom PCP02249.1 38760 Sollyc Solyc07g Sopen07g026100.0524728970.2 1 v2 mRNA24264 Fraa FvH4 2g383200393.1. s.p1 m MDP0000303416 igang vigan.Vang02g12510.1 PA gustus-gene-1.10- 1 Medtru 16.1 GAU1 Fraves N iscf00 ycadale V 8988678. Solpen .10-mRNA- Mtrun 1683.1 741 na FA GNQG C m10306- 1 XP rob Qrob P0 fold00284-au A17 Chr8g0390 Fraa 2000462 dales.p1 Maldom MD Casaus Castanosperm 0093 8071. Que ustus-gene-1 3 Encbar WQ Cyca anospermu 42248. 1 Cicari Ca 157 11 taeri 2010850 KA P0000299546 1 3876 Casmol maker-scafaffold00284-aug Lotjap Lj4g3v 37.1 S b 32192 Casaus Cast Maldom M 1 3061490.2 LIN Ginbil G 2 Pyrcom Pyrbre lcl NW 00 DP0000163 Lotjap Lj4g3v3061 1 PCP0 989 Casmol maker-sc 490.1 Ginbil Gb 0250 nifers.p Pyrbre l 11426.1 Cercan Cerca37S02975 8988065.1 XP 1907.t1 Lotjap Lj4g3v3061490.3 Abilas 2054444 VSRH Co rs.p1 cl NW 008988 P0120360.2 009353260.1 6585 Jugreg g1 Aradur Abilas 2053983 VSRH Conife 174.1 XP 009366487.1 1 Querob Qrob Aradu.D4WQD 0108782 8751 Pyrcom PCP014785. Jugreg g11907.t2 Arahyp arahy.Tifrunner.gnm1.ann1.LGD3I1.3 Psemen PME0 Casgla Casgl16S00526 1 Arahyp arahy.Tifrunner.gnm1.ann1.AAED67.1 Pintae PTA00026958 Pyrcom PCP024567.1 Casgla Casgl307S01342 Pinjef 2003903 MFTM Conifers.p1 Maldom MDP0000216913 Araipa Araip.1Q1XN Pinjef Alnglu Alngl4792S30622 Alnglu 35 2003904 MFTM Conifers.p1 008499.1 1 Alngl1357S07287 Nissch Nissc181589S167 A Cephar 2021515 NVGZ Pyrcom PCP Alnglu Prudul Prudul26A010695P Chafa spermum08249-P Amearg 202 Conifers 2 samples combine 8 Alngl4792S04638 s Chafa167 Casaus Castano .3QR2KS.2 LIN-like Phyhyp 206 6279 IAJW Conifers d.p1 186.1 642 Alng 900.1 Mimpud Mimpu145269S003S07263 frunner.gnm1.ann1 Cunlan 2 6245 JRNA .p1 64.1 XP 009353 lu Alngl Pruper Prupe.3G218 Arahyp arahy.Ti Aradur Aradu.8BK6A3.3 016142 ZQV Conifers.p1 0089880 665380 17976S18 Mimpu VBA0Y Metgly 2064821 NRXLF Conife Datglo 492 d Mimpu 1353 .gnm1.ann1. 5 Cunlan 2021 rs 2 samples com Pyrbre lcl NW 400.1 Datgl2886S P 008230082.1 14421.mk5 Cercan Cerca105S12022 y.Tifrunner Araipa Araip.D8XCD Cunla 376 ZQVF ConiC fers.p1 Maldom MDP0000 Begfuc Begfu7092 4440S24224 Arahyp arah 0S0743 Stae n 20 bined.p1 1674 Casaus Nissc2649 A ri 20 16771 ESYX onifers 2 s upe.7G009 9 01073.1 g220. Nissch Ginbil 11225 KA Conife ample Citlan Prumum lcl NC 024129.1 X Casa Cast Lupangmum23258-P Lup004513.11181 Ara Gb 387 WQ Cycadalrs bra s combin Pruper Pr A020481P1 Cl 9 03279.1.g00001.1 a 815S0 W rul 20 79 nch ed.p1 dul26 Lagsic Lsia00290 S135 Pruavi Pav sc00 Lotjap usLj Casta nosper 144 olnob 261599 XT es.p1 apex wit u 27 N iscf000 H4 6g43430.10001.1 Castanospermpu 12893 Suna 0 h ne dul Pr 689.1 29062 Cucme 5 a FA Cajcaj Cc mum0 Casaus pud Mi Chafa2196S185317S 9077 Phyhypma 209254186 ZO Conif edles.p1 Pru .1.mk 03G0157 n 5437 t 0g3v00n4osper 2 Mim rca1 0S26 Manco RSC ers.p 0.1 Fraa Fraves Fv8357.1.g0 Glymax Glyma.1 715- Chafas Ce 7 20 324 E Co 1 008245 3 Cucsat evm.ml MELO3 9801 mum2 PA P029108.1 Lagfra 2l 206186146 KLGF nifers.p1 5 ajan 1372819. Cercann Cerca5 .1 5884 Manco 52 JR Co 9.1 XP 0595.1 g020 Cucmax C 90. Potmic 2 Glymax 4061- Cerca 2155 30799 Mictet 20 092614 7 NA nifers.p1 1 AN iscf0010 2g01 P 1 P Pyrcom 50PC l 206057 CDFR Conife 736274 H4 1g2840001.1 Cucmos C C0 a F havul A 018 P00002 Acmpa ZQWM Co r .g0 0978 n as G06771 V Glyma.13G3 46 000016401921470.1 Faltax 13 Con s.p1 0001.1 odel.C igun om MD 0 000.1 Faltax n 2017834 MHGD8 CDF ifers. Cucpep C m 4.2. Fraa V P 2G194800.1 d 1P11 20 nife p Fraves Fv aCh1 iOBHm Chr igan hvul.0 88062.1Mal XPom MDPm PCP 8 Daccom 8 R Coni 1 mum lcl NW 00 Pruavi Pav sc000 4169.1 0001.17 t Poptr h 1 V g 01435 P 201 04099 0P5 HI rs.p 0 r igr V W 0089 Mald upe.7G226 od 6 Co fers.p11 Pru Manesc moCh 3.3751 Rubocc.v1.0.G009507.1 Br 48.1 10821 Vi g igun05g22590 Pyrcor 0.1.mk Podcorru 966 LW Conifers.pn f0010 6554.1.g0 4G0052 2 ad vigan. 0 07600. er P 573.1 26001.1 1 Nagna b 20 LYX Co ifers 1 i Potri.01p Roschi Rch 5 V gung 5G udul262A 0 0.1 Ceph 20 09590PLYX .p Riccom 3 4.1LG03g01 8749.1 9356 igun V 1 Pyrbre lcl N Prup 0 52 201154305 1 4 ON93 25 V radi 14200.1 5371 t1 Cepha g 2 C nifers 2 sa 0314.1.g0Potmic 2 Citsi G00532 90.1 750.1 9357 igan 1 P 008240212.1 g35 3 Ame a 2 89 FM o 1 up HL.SW 8 Vi V V PrudulX Pr 4 1g05 12 r 201711603 X WZ Conifenifers 3g0496481 331 1 Citcle Cicl M 6 4853 Nissc g igun1 05g22650.1ang09g 38820.1.gH Taxbac r SCEBLG N iscf000 r 7 49.1 108221 .2.1 gr 2 T a 20102 K 2 sa mples Fraana FAFAN isc 04 Carp n or anes.083 Huml 59.1 P 2 Ara a g V 2g009 4 T axcu rg 214 6 4 C S G062000.1 0. viganigun 0.1 Potmicr 1 241722 aicry 2 0 UUJS Cono m na ras G21509 Thecac 0 8 Arai d 024133.1 ras G07730530 Metgly 20 200 4 NVGZ C n rs.p ples comcomb 072.m0 490.1 1 060.1 PON6 Aradu 0 032 C avi Pav sc000 s 20 32 GJT o ife 46 95 Gosrai Gor ang 0 hyp arh NisscV g00470 AN iscf000Fraves Fv Fokhod 8153 nifer rs 1 FAN iscf0032 r 6 ot L484 018 r2.2632450.1 Ara radi09 04 Pru F T 9497 W .p ined.p Fraa 80 Gosrai ap evm.mo n 034678 Cicari pa 50.1 a etsp. 0 0 1 a G15 Medtr . um lcl N OBHm Ch 0880141 Micde 128959 Co s.p b n OBHm Ch 0 Gosrai Gor e1.1g TC010000 0060.1 PON V g0692 0525.1 22 1 Cald 237 IAJWI CoCon ifers.p1 ined.p11 0 e X 0100 1579 T h m Rubocc B 1 1 778.1 341 Gosrai B 1 T ang08 Fraan Cupdup 64529 NR n 326 v10019 Mor 1580 .1 Lupan ripra Arai ru Drydr728S 0. 20 2 ifers 2 sa 00965 59.1 PON93 Lupan yp arahr Pru WSS C 56.1 34916 Nicben N 4 .1 Lupang risu ahy 0.1 Lup 21200 Rubocc B L 63.1 3 Auschi 2 0 Z ifers. Fraa Thecc1E 0 1578 0 D g053 3 ec 20c 1 61705 U QS Y ru Dryd 5 Nicben N 600.1 i lcl J 0100 del.Ch Aradu.DET4 4 Dis 1 1577 1 ot L484 002741 AX Co Petaxi P u Ca 00 Dryd 19.1 3 2 1650 C 983 G001630.1 5 0 B o a 6 Widced 2 nife Nicbe 0 1 5 0 Capan Gorai.013G Distri Distr149S u p.6GW8K 0.1 N p p 3700 2 0.1 . 08 Roschi Rchi 3 reor 0 Solp 004095 Piluv 5229 t arc Sollyc b g0539 1.1 POO00898.1r290S 3 0.1 J Con o 1 mp F T Mtrun T 98 r g an Calg 2070661897 nif 145 a Tp57577 Morn s.p1 Dryd MELO3C 1 0244.1 PON5 1 X nd lcl JXT 01000 lcl D ifrunn 2 n 1 01 1.3324 Calm l r a 480.1 S09010 es combi 6 l g r L ers.p1 del.supe n 0.1 ifer . 0794.1 024922 E 0 0.1 2 0 Roschi Rchi o Neopa 7 GDN Conifers.p1 ot L484 000 a Gora 4.1LG05g14 g P 200.31 5 i 0136187012 GK5 VI Con i.007G257200.1 18m 010492 4 y 28555 g Lu Lup024 0.1 Cryj 20 ifers.p 6m g 3 658 0 0158 n p Lup 0100 0 ra 214 X s.p1 e 582.1 .T 2 480 Para a Lup03 Cryj D 762. ot L484 000 19244.2 QSG Co en Sope 90S0 nd lcl JXT S 0.1 1 0 n Ni Cryj 0 F 8 ai.004G G025861t1 0 581.1 S .1 X XP ac 206020 x L 1 3 C01000 Distri DistCitlan Cla Cryjap 1 9 RPM C C n 139.1 3937r 398991 a 0 ifrunn

0398581 a del.Ch1G0063 a 1219 E n 643.1 PON34 i 0 0G00 F9731 e QNGJ C 8 7 T c n CA ben10 Athcup 2408 1 ed.p ot L484 000 u A17 Chr2 .1 1 o ife 5 0 1 Cucmel o 6 a p n 200 0 Mor g 197151 eaxi162 i r Metgly 2 t 690 0 YY nif c0 g1 i1.p1 S ben10 1 .gnm1 8 n 0 i.008G034 i G00615 p02 a i lcl JXTC p 2075 25.1 PON36 Par g 6 Chala p 1 0 G006290. Lagsic Lsi06G00851 01451 r 004 1G0062 F r 02214 U Fokhod 2 p 210 2019 ers.ps.p11 1 6 0 s 1 5 olyc10g 8 0 0 1 P 2 0 2S06543S Piluv Mor ben10 r l nd lcl JXTB l MELO3C024693.2.11G0062 nifer P 2214 0 i lcl JX 76 IFL 3 1 7 Auschi 2 AUD E Co o R contig 161.13 Cucsat evm.m 022140795.1 1 TGAC v a .1 C 1g0348641 3 0 moCh Disarc 20 200629 294 TCJ C n 3 195 reo X e 4.1LG04g1 Widced 1 5063.1 1583 Z Conife ifers.p Mor T P 0298. 1 T r13g u A 6 20 1 reor aCh 97 hr13 0 Par Cunla onifers s.p 10g Cucpep C 02214 2 r. f T Cunlan aicry 2 w 8 RMM 8 3200.2 1 T 716 27 251 P 0 Citlan Cl 69.1 66.1 GAU1 aCh c0 g2 i1.p2 1 446 .an i E Con n 3 NC 02969 5488 Psechi 2 2059 02492 0 E sic L 49.1 gnm1.a 0 8 g l 1324S20065 2 DSXO C hr1 S Psechi 2 axcus 2073 20663 1 n10g03 Cucme l 3200 Ambt 5 I Conifer ifers.p1 01000 1 Scf05841 P 4.1LG16g0 m 6 13200.3 Aposh JDQB Co 1 5 Dencat lcl NW 02 6 oCh2 g37759.t1 55m 2 0 Phaequ onife e gfu1677 9 Distri Dist X Cucmos CmoCh0 349m Musacu GSMUA 1 1 0 Anacom Setita S 1320 NR 9 GMHZ C S Scf0363 p g Sorbi 00.1 MELO3C n1.G2H5J3.1 Zeamay Z 4 egfu918 Orysat LOC Os01g12930. B 0 g0322 Cucsat evm.m e asgl55S02277 11 .1 H 11 Zizjuj lcl NC 029 1 .p DCAR 02 1 1Scf074 l .1 1 HOUF C 6 7 B G 1 503.1 8 0 3 500.1 n 58164 r r 2 1 1 . 2059 cf00575 . r Lag 1 P 02214 A s.p1 1 o ife Cucmos C 1887 DSXO Co V r DCAR 0 8250 1 XIRK 081 00.1 1 16G11 072 6723 1 a 2003 0 33 DN6027 Zizjuj lc 1 2 mR 76142 GK v 980.2 r 8 2 200246 c Cucpep Cp r r Conifers M QChr07g 503.1 X 980.1 R s.p1

d 12895 QSN rs.p v i evm 4

4 8 - Y 00.1 8 Jugre 8 e 7 s 0.1 XP u i 4 u c Sobi Cucmax Cm Alngl40820S265 912.m00 nd lcl JXTB010 3 503.1 X f 7 s n 0 1 .1g001 0 lcl KZ45 ETCJ Coni 0 March 7 l AIGO o 1 a 3

Cucmax C 2 4 u nife PEQU 03267 187 7 n1.0 es.16G 5 g 500 503.1 X l anes. EG037843t B n 8 H 03266 Begfuc B U X i lcl JXT 2010.1 eita.5G008400 Begfuc B 00433. 55 3 521.1 X DatgloCasgla Datg C 32140. 1 contig 23.66 20870.1 Aco010789. i e g03009. YY 8 4 r r 670 ZY E L Co onife fers.p Cucme NA4309 01 4 O 6 HanXRQC 0 4 a r Par Dauca g0102 70 Manes.16 27. Conifers 4 Begfuc Begfu2789 3 ZQVF Con 8 v 07g00 m00008a V fe s.p1 B Dauca 2

d

R n HanXRQCh 6 2 ES AUD n .p1 3 Alng i.009G026 AR 010478 P ange1 n c.003 n reor g0400 I Conife 2 g00 1 rs 2 Y DN6027 3 Cucpep C HanXRQC . n07g0 i a 1 CZ Conife Co Conifer i arch.p1 n

V 2 ifers.p1fers.p 8 93.1 615 09g023938 U 1.1 10g031872 E Conifers.p r T TRINIT 1VCX e m YL s lea 1 n

X Cucmos Cm g

U YL 0 g0300 Thecc1 Cucmax CmaCh2 N 14 1 o J Conife n Riccom 29 0 E samp

3 YX

A E lyc07g 1319790.1950.1 X PKA590 i 0 N del.A P Manesc M 7 Achr10T22180 001 fers.p 001.1 P Conife H 111 G008 X Co Manesc Citcle Ciclev100 Cepfol Cfol v3 2 1 0 5. 7 M Con f.p Scf06148g00002 A Sop 3 v2 0000 c .p Poptri Potri.018G083600.3 M Con Helan n Sope Scf00231g Conife Manesc Man r r s.p 1 5

u 3 Citsin or 1Scf03 1 s.p1 1 1 Helan 3 .1 010663 1 R ha lcl NW 019104 Capann CA07g06710 0 l n 1 G e 1 l .1 m 1 Helan a lcl NW 019104 a lcl NW 01910 . r 1 TRINIT Thecac n s comb NW 0154541 Rhofor F 2 e ife 2 0 s.p1 Daucar DCAR 028695 Helann HanXR 8 h 100. .1 rs.p2 l h Gosrai Gor Daucar DC T ifers.p1 r 1 5 .1 a lcl NW 01910 N anXRQChr Solpe r v1.0 ifers.p s.p1 1 c 3 Solpen ife r Sollyc So axi162 2 rs branchs 2 sampl h x iben10 7 a lcl NW 01910 e 5 1 Momc

1 r e 0 h Momc 2 in ap evm.model.supe P s.p1 Niben10

0 P a e . Momc

r 1 . 1 02067 5 sca d.p1

1 Zizjuj lc Rhofor 8.1 8273 F Momc Carp

Helann H Helann HanXRQChr f fold00 e Momc Nicben N Petaxi a Nicben s combin p 3 ex with need Fraexc FRAEX3887 672.1 24

1

09.97

e d.p1

1

6

2 l es.p1

LIN and/or LIN-like VAPYRIN SYN

Fig. 5 | Maximum-likelihood trees of infection-related genes. a, LIN and its paralogue LIN-like (model: GTR + F + R7); b, VAPYRIN (model: GTR + F + R6); c, SYN (model: TIM3 + F + R5). Due to the high duplication of the gene families, only the angiosperms clade is displayed for VAPYRIN and SYN; whereas gymnosperms were conserved for LIN and LIN-like, due to their divergence following the seed plants whole-genome duplication event. Full trees are shown in Supplementary Figs. 21, 22 and 28. LIN and/or LIN-like tree was rooted on non-seed plants; whereas VAPYRIN and SYN trees were rooted on Amborella trichopoda. Species names are coloured as follows: black, species with intracellular symbiosis; light red, species without intracellular infection; light grey, species with undetermined symbiotic status. Cyan dots indicate species forming AMS. High-resolution phylogenetic trees are available as Supplementary Figs. 9, 16, 21–23 and 28.

Ericaceae. In ericoid mycorrhiza, ascomycetes colonize epidermal Two assays were used to assess the conservation of the biochemical root cells and develop intracellular hyphal complexes5. We collected properties of these CCaMK orthologues. First, truncated versions available transcriptomic data from six species, assembled and anno- that only contain the kinase domain of CCaMK (CCaMK-K) were tated them, and specifically searched for the presence of SymRK, cloned under control of a constitutive promoter. If functional, these CCaMK, CYCLOPS, as well as STR, STR2 and RAD1. Congruent constructs are expected to induce the expression of root-nodule with the loss of AMS, RAD1, STR and STR2 were not detected symbiosis reporter genes such as ENOD1145 when overexpressed in (Supplementary Fig. 37); however, SymRK, CCaMK and CYCLOPS the Fabales M. truncatula roots in the absence of symbiotic bacte- were present in the transcriptome of fortunei roots, ria, as does the M. truncatula CCaMK-K construct45. The constructs but not in the other five transcriptomes derived from leaves or stems were introduced in a M. truncatula pENOD11:GUS background and (Fig. 2 and Supplementary Fig. 37). GUS activity was monitored in the absence of symbiotic bacteria. The receptor-like kinase SYMRK, the calcium- and calmodu- CCaMK-K from every tested species resulted in the spontaneous lin-dependent protein kinase CCaMK and the transcription factor activation of the pENOD11:GUS reporter (Fig. 5). As a second test CYCLOPS are known components of the common symbiosis signal- of the conservation of CCaMK, trans-complementation assays of a ling pathway and contribute successive steps in the signalling pro- M. truncatula ccamk (dmi3) mutant were complemented with cesses triggered by arbuscular mycorrhizal fungi and nitrogen-fixing CCaMK orthologues from the above species. In the presence of bacteria2,39. Genetic analysis of SymRK, CCaMK and CYCLOPS have symbiotic bacteria, all of the CCaMK orthologues were able to been conducted in multiple angiosperms, including dicots from restore nodule formation and intracellular infection in the ccamk the Fabaceae, Casuarinaceae, Fagaceae, Rosaceae and Solanaceae mutant (Fig. 4 and Supplementary Table 5). families as well as in monocots such as rice40–43. In all these species, In legumes, CCaMK phosphorylates CYCLOPS. Phosphorylated defects in any of these three genes resulted in aborted or strongly CYCLOPS then binds to the promoter and activates the transcrip- attenuated intracellular infection by arbuscular mycorrhizal fungi. tion of downstream genes46. To determine whether the CCaMK– In addition, knockout or knockdown of any of these genes in root- CYCLOPS module itself is biochemically conserved across land nodule symbiosis-forming species resulted in impaired intracellular plants, we conducted trans-complementation assays of a M. truncatula infection by nitrogen-fixing bacteria2,39. Conversely, CCaMK knock- cyclops (ipd3) mutant with M. paleacea CYCLOPS. Nodules can down in the Fabaceae Sesbania rostrata did not impact extracellular be formed in the M. truncatula cyclops mutant due to the presence of infection of cortical cells by nitrogen-fixing rhizobia44. Together with a functional paralogue47. In our assay, M. truncatula cyclops mutants this genetic evidence, our results demonstrate that SymRK, CCaMK transformed with the empty vector could develop root nodules, but and CYCLOPS specifically occur in species that accommodate intra- were mostly uninfected. By contrast, transformation with either cellular symbionts, defining a universal signalling pathway for intra- M. truncatula CYCLOPS or M. paleacea CYCLOPS resulted in the cellular mutualistic symbioses in plants. formation of many fully infected nodules in most of the trans- formed cyclops roots (Fig. 4). Conservation of CCaMK and CYCLOPS biochemical prop- In sum, these assays indicate that CYCLOPS and CCaMK erties in land plants. We propose that the symbiosis signalling orthologues that evolved in different symbiotic (AMS, root-nodule pathway has been co-opted for all intracellular endosymbioses in symbiosis and both) and developmental (gametophytes in M. paleacea land plants; this would imply conservation of the biochemical prop- and root sporophytes in angiosperms) contexts have conserved erties of the corresponding proteins over the 450 million years of biochemical properties. land plant evolution. To test this hypothesis, we cloned CCaMK from three dicots forming AMS or both AMS and root-nodule Infection-related genes are conserved in angiosperms with intra- symbiosis, from two monocots forming only AMS and from the cellular symbioses. For a given gene with dual biological functions, liverwort M. paleacea, which forms AMS in the absence of roots. co-elimination is not predicted to occur following the loss of a single

286 Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants NaTure PlanTs Articles a b CSP

SymRKCCaMKCYCLOPSRAD1STR STR2 LIN and/orSYN LIN-likeVAPYRIN AMS RNS M. truncatula CSP NM Arabidopsis thaliana AMS ERM OM P. equestris RNS

RNS Lupinus angustifolius

ErM R. fortunei Cyano RNS

AMS Petunia axillaris AMS AMS Gingko biloba ECM OM EcM Pinus taeda AMS Orthologue AMS Selaginella moellendorffii Pseudogene

Cyano ERM Azolla filiculoides ECM Neo-functionalized like NM Physcomitrella patens NM Absent (from genome) AMS Marchantia paleacea Not detected NM Marchantia polymorpha (from transcriptome)

Cyano Blasia sp.

Fig. 6 | Model for the conservation of symbiotic genes across symbiosis types. a, The common symbiosis pathway (CSP) genes SymRK, CCaMK and CYCLOPS in all land plants. RAD1, STR and STR2 are conserved exclusively in species with AMS. The infection-related genes (that is, VAPYRIN, SYN and LIN and/or LIN-like) in angiosperms are specific to species forming intracellular symbiosis (black background). Mutualism abandonment (NM, white) or loss of intracellular symbiosis (grey) result in the loss of all these genes. The pink symbol indicates the CSP, the cyan symbol indicates genes specifically involved in AMS and the yellow symbol indicates genes linked to infection by intracellular symbionts. b, Schematic representation of transition among symbiotic types and the conservation of the corresponding genes across land plants. Cyano, cyanobacteria association; ErM, ericoid mycorrhiza; ErM-like, ericoid-like mycorrhiza. trait because of the selection pressure exerted by the other, still pres- symbiotic and non-symbiotic species. Given the overall cellular and ent, trait12. For instance, DELLA proteins that are involved in AMS molecular conservation observed in AMS processes in land plants, and have essential roles in gibberellic-acid signalling are retained in we hypothesize that these genes most probably have an endosym- all embryophytes48. To become sensitive to co-elimination, a gene biotic function in species outside of the angiosperms, but the lack must become specific for a single trait. This may occur via succes- of gene erosion of these genes in non-angiosperms suggests that sive losses of traits or via gene duplication leading to subfunction- they have an additional function that ensures their retention. In alization between the two paralogues12. Angiosperm genomes have angiosperms, we see loss of these genes concomitant with the loss of experienced multiple rounds of whole-genome duplications49, and we intracellular symbioses, suggesting either that their essential func- hypothesized that besides the common symbiotic signalling pathway, tion is now redundant in angiosperms or is supported by gene para- other genes might be specialized for intracellular symbioses following logues resulting from the whole-genome duplications that predate subfunctionalization. We screened our phylogenies for genes that are modern angiosperms. retained in angiosperm species forming intracellular symbioses but lost in those that have undergone mutualism abandonment. Six genes Conclusions followed this pattern: KinF, EPP1, VAPYRIN, LIN and/or LIN-like, Through comprehensive phylogenomics of previously unexplored CASTOR and SYN (Fig. 5 and Supplementary Figs. 9, 16, 21–23 and plant lineages, we demonstrate that three genes are evolutionary 28). Among them, CASTOR and, to some extent, EPP150 are compo- linked to AMS in all land plants, including two directly involved nents of the common symbiotic signalling pathway, while VAPYRIN in the transfer of lipids from the host plant to arbuscular mycorrhi- and LIN and/or LIN-like are directly involved in the formation of a zal fungi. We propose that the symbiotic transfer of lipids has been structure required for the intracellular accommodation of rhizobial essential for the conservation of AMS in land plants. Surprisingly, bacteria51–53, and also function in intracellular accommodation of we found that genes associated with symbiosis signalling are invari- arbuscular mycorrhizal fungi54–56. Similarly, SYN has been character- antly conserved in all land plant species possessing intracellular ized in M. truncatula for its role in the formation of the intracellular symbionts, implying repeated recruitment of this signalling path- structures during both AMS and root-nodule symbiosis57. way, independently of the nature of the intracellular symbiont. These results demonstrate that, besides the common symbi- Furthermore, we see evidence for conservation of genes associated otic signalling pathway, genes directly involved in the intracellular with the formation of the cellular structure necessary for intracel- accommodation of symbionts are exclusively maintained in species lular accommodation of symbionts, but this correlation is restricted that form intracellular mutualistic symbioses in angiosperms, irre- to angiosperms that have duplicated many of these components. spective of the type of symbiont or the plant lineage. Pro-orthologues Our results provide compelling evidence for the early emergence of these genes are found in species outside the angiosperms in both of genes associated with accommodation of intracellular symbionts,

Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants 287 Articles NaTure PlanTs with the onset of AMS in the earliest land plants and then recruit- M. polymorpha subspecies. Plant material and DNA extraction. Sterilized gemmae ment and retention of these processes during symbiont switches from one individual each of M. polymorpha ssp. montivagans (sample id MpmSA2) and M. polymorpha ssp. polymorpha (sample id MppBR5) were grown as for that have occurred on many independent occasions in the 450 mil- M. paleacea and isolated for DNA extraction. DNA was extracted with a modifed lion yr of land-plant evolution (Fig. 6). Our work also suggests that cetyl trimethylammonium bromide protocol58. mutualistic interactions involving extracellular symbionts do not utilize the same molecular machinery as intracellular symbioses, Genome sequencing. DNA were sequenced with single-molecule real-time suggesting an alternative evolutionary trajectory for the emergence sequencing technology developed by Pacific BioSciences on a PacBio Sequel System with Sequel chemistry and sequence depth of 60× (ref. 62). of ectomycorrhizal and cyanobacterial associations. Genome assembly. The reads were assembled using HGAP 463. Assembly statistics Methods were assessed using QUAST64 v.4.5.4, BUSCO61 v.3.0.2 and CEGMA65 v.2.5. M. paleacea. Plant material. M. paleacea thalli previously collected in Mexico (Humphreys) were provided by K. J. Field and D. J. Beerling (Te University of Reporting Summary. Further information on research design is available in the Shefeld). Gemmae from these thalli were collected using a micropipette tip and Nature Research Reporting Summary linked to this article. placed into a microcentrifuge tube. A 5% solution of sodium hypochlorite was used to sterilize the gemmae for about 30 s followed by rinsing with sterile water Data availability 5 times to remove residual sodium hypochlorite solution. Te sterilized gemmae All assemblies and gene annotations generated in this project can be found were grown on Gamborg’s half-strength B5 medium on sterile tissue culture plates in SymDB (www.polebio.lrsv.ups-tlse.fr/symdb/). Raw sequencing data can under a 16:8 h day:night cycle at 22 °C under fuorescent illumination with a light be found under NCBI Bioproject PRJNA576233 (B. pusilla), PRJNA362997 −2 intensity of 100 µmol µm s. and PRJNA362995 (M. paleacea genome and transcriptome, respectively) and PRJNA576577 (M. polymorpha ssp. montivagans and ssp. polymorpha). DNA and RNA extraction. Genomic DNA from eight-week-old M. paleacea thalli were extracted as described previously58. RNA extraction was also carried out from eight-week-old M. paleacea thalli using the RNeasy mini plant kit following the Received: 14 October 2019; Accepted: 31 January 2020; manufacturer’s protocols. Fragmentation of RNA and cDNA synthesis were done Published online: 2 March 2020 using kits from New England Biolabs according to the manufacturer’s protocols with minor modifications. In brief, 1 µg of total RNA was used to purify mRNA References on Oligo dT coupled to paramagnetic beads (NEBNext Poly(A) mRNA Magnetic 1. Gensel, P. G. Te Emerald Planet: How Plants Changed Earth’s History (Oxford Isolation Module). Purified mRNA was fragmented and eluted from the beads Univ. Press, 2008). in one step by incubation in 2× first-strand buffer at 94 °C for 7 min, followed by 2. Parniske, M. Arbuscular mycorrhizae: the mother of plant root first-strand cDNA synthesis using random-primed reverse transcription (NEBNext endosymbioses. Nat. Rev. Microbiol. 6, 763–775 (2008). RNA First Strand Synthesis Module), followed by random-primed second-strand 3. Delaux, P. M., Séjalon-Delmas, N., Bécard, G. & Ané, J. M. Evolution of the synthesis using an enzyme mixture of DNA PolI, RnaseH and Escherichia coli DNA plant–microbe symbiotic ‘toolkit’. Trends Plant Sci. 18, 298–304 (2013). Ligase (NEBNext Second Strand Synthesis Module). 4. Werner, G. D. A. et al. Symbiont switching and alternative resource acquisition strategies drive mutualism breakdown. Proc. Natl Acad. Sci. USA Genome sequencing. For the M. paleacea genome sequencing, short-insert paired- 115, 5229–5234 (2018). end and long-insert mate-pair libraries were produced. For the paired-end library, 5. Smith, S. & Read, D. Mycorrhizal Symbiosis (Academic Press, 2008). the DNA fragmentation was done using Adaptive Focused Acoustic technology 6. Kottke, I. et al. Heterobasidiomycetes form symbiotic associations with hepatics: (Covaris). The NEBNext Ultra DNA Library Prep Kit (New England Biolabs) Jungermanniales have sebacinoid mycobionts while Aneura pinguis (Metzgeriales) was used for the library preparation, and the bead size selection for 300–400 base is associated with a Tulasnella species. Mycol. Res. 107, 957–968 (2003). pairs (bp) based on the manufacturer’s protocol. The library had an average insert 7. Griesmann, M. et al. Phylogenomics reveals multiple losses of nitrogen-fxing size of 336 bp. For the mate-pair library, preparation was done using the Nextera root nodule symbiosis. Science 361, eaat1743 (2018). Mate-Pair DNA library prep kit (Illumina) following the manufacturer’s protocol. 8. Wang, B. & Qiu, Y. L. Phylogenetic distribution and evolution of mycorrhizas After enzymatic fragmentation, we used gel size selection for 3–5 kb fragments. in land plants. Mycorrhiza 16, 299–363 (2006). The average size that was recovered from the gel was 4,311 bp. Sequencing was 9. Delaux, P. M., Radhakrishnan, G. & Oldroyd, G. Tracing the evolutionary carried out on an Illumina HiSeq2500 on 2 × 100 bp Rapid Run mode. The library path to nitrogen-fxing crops. Curr. Opin. Plant Biol. 26, 95–99 (2015). preparation and sequencing were carried out by Genewiz. 10. Martin, F. M., Uroz, S. & Barker, D. G. Ancestral alliances: plant mutualistic symbioses with fungi and bacteria. Science 356, eaad4501 (2017). Genome assembly. Adapter and quality trimming were performed on the paired- 11. van Velzen, R. et al. Comparative genomics of the nonlegume Parasponia end library using Trimmomatic v.0.33 using the following parameters (ILLUMINA reveals insights into evolution of nitrogen-fxing rhizobium symbioses. Proc. CLIP:TruSeq2-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 Natl Acad. Sci. USA 115, E4700–E4709 (2018). MINLEN:12). The trimmed paired-end reads were carried forward for assembling 12. Albalat, R. & Cañestro, C. Evolution by gene loss. Nat. Rev. Genet. 17, the contigs using multiple assemblers. Scaffolding of the contigs was done using the 379–391 (2016). scaffolder of SOAPdenovo259 with the mate-pair library after processing the reads 13. Tabach, Y. et al. Identifcation of small RNA pathway genes using patterns of through the NextClip60 pipeline to only retain predicted genuine long-insert mate phylogenetic conservation and divergence. Nature 493, 694–698 (2013). pairs. Assembly completeness was measured using the BUSCO61 plants dataset. 14. Delaux, P. M. Comparative phylogenomics of symbiotic associations. New Phytol. 213, 89–94 (2017). Transcriptome sequencing. cDNA was purified and concentrated on MinElute 15. Dey, G., Jaimovich, A., Collins, S. R., Seki, A. & Meyer, T. Systematic Columns (Qiagen) and used to construct an Illumina library using the Ovation discovery of human gene function and principles of modular organization Rapid DR Multiplex System 1–96 (NuGEN). The library was amplified using through phylogenetic profling. Cell Rep. 10, 993–1006 (2015). MyTaq (Bioline) and standard Illumina TruSeq amplification primers. PCR primer 16. Bravo, A., York, T., Pumplin, N., Mueller, L. A. & Harrison, M. J. Genes and small fragments were removed by Agencourt XP bead purification. The PCR conserved for arbuscular mycorrhizal symbiosis identifed through components were removed using an additional purification on Qiagen MinElute phylogenomics. Nat. Plants 2, 15208 (2016). Columns. Normalization was done using Trimmer Kit (Evrogen). The normalized 17. Delaux, P. M. et al. Comparative phylogenomics uncovers the impact of library was re-amplified using MyTaq (Bioline) and standard Illumina TruSeq symbiotic associations on host genome evolution. PLoS Genet. 10, amplification primers. The normalized library was finally size-selected on a low- e1004487 (2014). melting-point agarose gel, removing fragments smaller than 350 bp and those 18. Favre, P. et al. A novel bioinformatics pipeline to discover genes related to larger than 600 bp. Sequencing was done on an Illumina MiSeq in 2 × 300 bp mode. arbuscular mycorrhizal symbiosis based on their evolutionary conservation The RNA extractions, cDNA synthesis library preparation and sequencing were pattern among higher plants. BMC Plant Biol. 14, 333 (2014). carried out by LGC Genomics. 19. Xue, L. et al. Network of GRAS transcription factors involved in the control of arbuscule development in Lotus japonicus. Plant Physiol. 167, B. pusilla. B. pusilla was originally collected from Windham County, Connecticut, 854–871 (2015). USA, and maintained in a Duke University greenhouse. RNA was extracted from 20. Keymer, A. et al. Lipid transfer from plants to arbuscular mycorrhiza fungi. plants with symbiotic cyanobacterial colonies using Sigma Spectrum Plant Total eLife 6, e29107 (2017). RNA kit. Library preparation and sequencing were done by BGI Shenzhen. A 21. Bravo, A., Brands, M., Wewer, V., Dörmann, P. & Harrison, M. J. Arbuscular Ribo-Zero rRNA Removal Kit was used to prepare the transcriptome library. In mycorrhiza-specifc enzymes FatM and RAM2 fne-tune lipid biosynthesis total, 3 libraries were constructed, which were sequenced on the Illumina Platform to promote development of arbuscular mycorrhiza. New Phytol. 214, Hiseq2000 as 150 bp paired-ends, with insert size of 200 bp. 1631–1645 (2017).

288 Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants NaTure PlanTs Articles

22. Grosche, C., Genau, A. C. & Rensing, S. A. Evolution of the symbiosis-specifc 52. Imaizumi-Anraku, H. et al. Plastid proteins crucial for symbiotic fungal and GRAS regulatory network in bryophytes. Front. Plant Sci. 9, 1621 (2018). bacterial entry into plant roots. Nature 433, 527–531 (2005). 23. Delaux, P.-M. et al. Algal ancestor of land plants was preadapted for 53. Liu, C. W. et al. A protein complex required for polar growth of rhizobial symbiosis. Proc. Natl Acad. Sci. USA 112, 13390–13395 (2015). infection threads. Nat. Commun. 10, 2848 (2019). 24. Wang, B. et al. Presence of three mycorrhizal genes in the common ancestor 54. Pumplin, N. et al. Medicago truncatula vapyrin is a novel protein required for of land plants suggests a key role of mycorrhizas in the colonization of land arbuscular mycorrhizal symbiosis. Plant J. 61, 482–494 (2010). by plants. New Phytol. 186, 514–525 (2010). 55. Feddermann, N. et al. Te PAM1 gene of petunia, required for intracellular 25. Humphreys, C. P. et al. Mutualistic mycorrhiza-like symbiosis in the most accommodation and morphogenesis of arbuscular mycorrhizal fungi, encodes ancient group of land plants. Nat. Commun. 1, 103 (2010). a homologue of VAPYRIN. Plant J. 64, 470–481 (2010). 26. Bowman, J. L. et al. Insights into land plant evolution garnered from the 56. Takeda, N., Tsuzuki, S., Suzaki, T., Parniske, M. & Kawaguchi, M. CERBERUS Marchantia polymorpha genome. Cell 171, 287–304 (2017). and NSP1 of Lotus japonicus are common symbiosis genes that modulate 27. Brundrett, M. & Tedersoo, L. Misdiagnosis of mycorrhizas and inappropriate arbuscular mycorrhiza development. Plant Cell Physiol. 54, 1711–1723 (2013). recycling of data can lead to false conclusions. New Phytol. 221, 18–24 (2019). 57. Huisman, R. et al. A symbiosis-dedicated SYNTAXIN OF PLANTS 13II 28. Villarreal A, J. C., Crandall-Stotler, B. J., Hart, M. L., Long, D. G. & Forrest, isoform controls the formation of a stable host–microbe interface in L. L. Divergence times and the evolution of morphological complexity in an symbiosis. New Phytol. 211, 1338–1351 (2016). early land plant lineage (Marchantiopsida) with a slow molecular rate. New 58. Healey, A., Furtado, A., Cooper, T. & Henry, R. J. A simple method for Phytol. 209, 1734–1746 (2016). extracting next-generation sequencing quality genomic DNA from 29. Read, D. J. & Perez-Moreno, J. Mycorrhizas and nutrient cycling in recalcitrant plant species. Plant Methods 10, 1–8 (2014). ecosystems—a journey towards relevance? New Phytol. 157, 475–492 (2003). 59. Luo, R. et al. Erratum: SOAPdenovo2: an empirically improved memory- 30. Rey, T. et al. Te Medicago truncatula GRAS protein RAD1 supports efcient short-read de novo assembler. Gigascience 4, 30 (2015). arbuscular mycorrhiza symbiosis and Phytophthora palmivora susceptibility. 60. Leggett, R. M., Clavijo, B. J., Clissold, L., Clark, M. D. & Caccamo, M. Next J. Exp. Bot. 68, 5871–5881 (2017). clip: an analysis and read preparation tool for nextera long mate pair 31. Park, H.-J., Floss, D. S., Levesque-Tremblay, V., Bravo, A. & Harrison, M. J. libraries. Bioinformatics 30, 566–568 (2014). Hyphal branching during arbuscule development requires RAM1. Plant 61. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, Physiol. 169, 2774–2788 (2015). E. M. BUSCO: assessing genome assembly and annotation completeness with 32. Luginbuehl, L. H. et al. Fatty acids in arbuscular mycorrhizal fungi are single-copy orthologs. Bioinformatics 31, 3210–3212 (2015). synthesized by the host plant. Science 356, 1175–1178 (2017). 62. Roberts, R. J., Carneiro, M. O. & Schatz, M. C. Te advantages of SMRT 33. Jiang, Y. et al. Plants transfer lipids to sustain colonization by mutualistic sequencing. Genome Biol. 14, 405 (2013). mycorrhizal and parasitic fungi. Science 356, 1172–1175 (2017). 63. Chin, C. S. et al. Nonhybrid, fnished microbial genome assemblies from 34. Adams, D. G. Te Ecology of cyanobacteria: their diversity in time and space long-read SMRT sequencing data. Nat. Methods 10, 563–569 (2013). (Kluwer, 2000); https://doi.org/10.1016/0303-2647(92)90025-T 64. Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: Quality assessment 35. Li, F. W. et al. Fern genomes elucidate land plant evolution and tool for genome assemblies. Bioinformatics 29, 1072–1075 (2013). cyanobacterial symbioses. Nat. Plants 4, 460–472 (2018). 65. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate 36. Cope, K. R. et al. Te ectomycorrhizal fungus Laccaria bicolor produces core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007). lipochitooligosaccharides and uses the common symbiosis pathway to colonize Populus roots. Plant Cell 31, 2386–2410 (2019). Acknowledgements 37. Miura, C. et al. Te mycoheterotrophic symbiosis between orchids and This work was supported by the Agence Nationale de la Recherche (ANR) grant mycorrhizal fungi possesses major components shared with mutualistic EVOLSYM (ANR-17-CE20-0006-01) to P.-M.D., by the Bill and Melinda Gates plant–mycorrhizal symbioses. Mol. Plant Microbe Interact. 31, Foundation as Engineering the Nitrogen Symbiosis for Africa (OPP1172165), by 1032–1047 (2018). the BBSRC as OpenPlant to G.E.D.O (BB/L014130/1), by the 10KP initiative (BGI- 38. Oba, H., Tawaray, K. & Wagatsuma, T. Arbuscular mycorrhizal colonization Shenzhen), by the National Science Foundation (DEB1831428) to F.-W.L. and by the in Lupinus and related genera. Soil Sci. Plant Nutr. 47, 685–694 (2001). Swedish Research Council Vetenskapsrådet (VR) to U.L. (2011‐5609 and 2014‐522) 39. Oldroyd, G. E. D. Speak, friend, and enter: signalling systems that and to D.M.E (2016-05180). G.V.R is additionally supported by a Biotechnology and promote benefcial symbiotic associations in plants. Nat. Rev. Microbiol. 11, Biological Sciences Research Council Discovery Fellowship (BB/S011005/1). Part of 252–263 (2013). this work was conducted at the Laboratoire de Recherche en Sciences Végétales (LRSV) 40. Gutjahr, C. et al. Arbuscular mycorrhiza-specifc signaling in rice transcends laboratory, which belongs to the TULIP Laboratoire d’Excellence (ANR-10-LABX-41). the common symbiosis signaling pathway. Plant Cell 20, 2989–3005 (2008). We are grateful to the genotoul bioinformatics platform Toulouse Midi–Pyrenees for 41. Gherbi, H. et al. SymRK defnes a common genetic basis for plant root providing computing and storage resources. We thank F. Roux for helping with the endosymbioses with arbuscular mycorrhiza fungi, rhizobia, and collection of M. polymorpha accessions, A. Cooke for assistance with M. paleacea DNA Frankiabacteria. Proc. Natl Acad. Sci. USA 105, 4928–4932 (2008). extraction, P. Szoevenyi for advice on M. polymorpha DNA extraction, and D. Barker and 42. Svistoonof, S. et al. Te independent acquisition of plant root nitrogen-fxing members of the Engineering Nitrogen Symbiosis for Africa (ENSA) project for helpful symbiosis in fabids recruited the same genetic pathway for nodule comments and discussion. Figure 6b was prepared by J. Calli (www.jeremy-calli.fr). organogenesis. PLoS ONE 8, e64515 (2013). 43. Buendia, L., Wang, T., Girardin, A. & Lefebvre, B. Te LysM receptor-like Author contributions kinase SlLYK10 regulates the arbuscular mycorrhizal symbiosis in tomato. P.-M.D., G.V.R., M.K.R., J.K., G.E.D.O. and T.V. conceived the experiments; J.K., H.S.C. and New Phytol. 210, 184–195 (2016). L.C. developed symDB; G.V.R., M.K.R., J.K., T.V., D.L.M.M., N.V., C.L., J.C. and P.-M.D. 44. Capoen, W. et al. Calcium spiking patterns and the role of the calcium/ conducted the experiments; A.-M.L., D.M.E. and U.L. generated the M. polymorpha calmodulin-dependent kinase CCaMK in lateral root base nodulation of subspecies genomes; F.-W.L., S.C. and G.K.S.W. generated the B. pusilla transcriptome; Sesbania rostrata. Plant Cell 21, 1526–1540 (2009). P.-M.D., G.V.R., M.K.R., J.K. and T.V. analysed the data; J.K. compiled the supplementary 45. Gleason, C. et al. Nodulation independent of rhizobia induced by a calcium- material; G.V.R., M.K.R., G.E.D.O. and P.-M.D. wrote the manuscript. activated kinase lacking autoinhibition. Nature 441, 1149–1152 (2006). 46. Singh, S., Katzer, K., Lambert, J., Cerri, M. & Parniske, M. CYCLOPS, A DNA-binding transcriptional activator, orchestrates symbiotic root nodule Competing interests development. Cell Host Microbe 15, 139–152 (2014). The authors declare no competing interests. 47. Jin, Y. et al. IPD3 and IPD3L function redundantly in rhizobial and mycorrhizal symbioses. Front. Plant Sci. 9, 267 (2018). Additional information 48. Yasumura, Y., Crumpton-Taylor, M., Fuentes, S. & Harberd, N. P. Step-by-step Supplementary information is available for this paper at https://doi.org/10.1038/ acquisition of the gibberellin–DELLA growth-regulatory mechanism during s41477-020-0613-7. land-plant evolution. Curr. Biol. 17, 1225–1230 (2007). 49. Soltis, P. S. & Soltis, D. E. Ancient WGD events as drivers of key innovations Correspondence and requests for materials should be addressed to G.E.D.O. or P.-M.D. in angiosperms. Curr. Opin. Plant Biol. 30, 159–165 (2016). Peer review information Nature Plants thanks Andrea Genre and the other, anonymous, 50. Valdés-López, O. et al. A novel positive regulator of the early stages of root reviewer(s) for their contribution to the peer review of this work. nodule symbiosis identifed by phosphoproteomics. Plant Cell Physiol. 60, Reprints and permissions information is available at www.nature.com/reprints. 575–586 (2019). 51. Murray, J. D. et al. Vapyrin, a gene essential for intracellular progression of Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in arbuscular mycorrhizal symbiosis, is also essential for infection by rhizobia in published maps and institutional affiliations. the nodule symbiosis of Medicago truncatula. Plant J. 65, 244–252 (2011). © The Author(s), under exclusive licence to Springer Nature Limited 2020

Nature Plants | VOL 6 | March 2020 | 280–289 | www.nature.com/natureplants 289 nature research | reporting summary

Corresponding author(s): Pierre-Marc Delaux

Last updated by author(s): 23/01/2020 Reporting Summary Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist.

Statistics For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. n/a Confirmed The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly The statistical test(s) used AND whether they are one- or two-sided Only common tests should be described solely by name; describe more complex techniques in the Methods section. A description of all covariates tested A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)

For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted Give P values as exact values whenever suitable.

For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above. Software and code Policy information about availability of computer code Data collection Candidate genes homologs sequences were retrieved using the open-source program BLAST+ v2.7.1

Data analysis All programs and software used in this study are free and open-source. Sequences were aligned using MAFFT v7.407 and MUSCLE v3.8.3 (multiple sequences aligners), alignments were trimmed using trimAL v1.2 (tool for trimming spurious/poorly aligned sequences) and phylogenetic analysis were performed using IQ-TREE v1.6.7 (phylogenomic dedicated ML software), prior to ML analysis, best evolutionary model was selected using ModelFinder. Trees were annotated using the iTOLv4.4.2 platform. Statistical analysis and graph drawn in R v3.6.0; RStudio v1.2.1335 and the following packages: ggplot2, tidyverse, xlsx, multcomp, multcompview, seqinr, remotes, ape, dplyr, naniar. Selective pressure analysis were conducted using the ETE Toolkit v3.1.1 and HyPhy RELAX 2.5.0. Assembly of Sanger sequences was performed using Tracy v0.5.5. Transcriptomes from the 1KP project were annotated using the TransDecoder pipeline v5.5.0. The genome of Marchantia paleacea was assembled using SPAdes v3.11 and scaffolded using SOAPdenovo2 v2.0. The transcriptomes of Marchantia paleacea and Blasia pusilla were assembled using Trinity v2.4.0 and annotated using the TransDecoder pipeline v2.1.0. The genomes of Marchantia polymorpha subspecies montivagans and polymorpha were assembled using HGAP v4.0 Genomic alignments of syntenic regions between Marchantia species were conducted using Mauve v2.3.1. October 2018 For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

1 Data nature research | reporting summary Policy information about availability of data All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: - Accession codes, unique identifiers, or web links for publicly available datasets - A list of figures that have associated raw data - A description of any restrictions on data availability

All data used in the manuscript are being made available in the database SymDB - http://www.polebio.lrsv.ups-tlse.fr/symdb/ The draft assemblies and raw sequencing data for the genome and transcriptome of Marchantia paleacea have been deposited at NCBI under PRJNA362997 and PRJNA362995 respectively.

Field-specific reporting Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Ecological, evolutionary & environmental sciences study design All studies must disclose on these points even when the disclosure is negative. Study description The study includes the phylogenetic analysis of candidate genes on a large set of genomes and transcriptomes, as well as transcomplementation assays of angiosperm mutants

Research sample This dataset is composed of publicly available genomes and transcriptomes as well as newly sequenced ones. All data have been made available on symDB (http://vm-polebio.toulouse.inra.fr/symdb/web/) and the citing information and the Fort Lauderdale agreement information are mentioned (read me file).

For the transcomplentation assays presented in Figure 5, data were collected by visual inspection of the root systems under a dissecting scope.

Sampling strategy Genomes and transcriptomes were collected to cover the main land plant clades (hornworts, liverworts, mosses, lycophytes, ferns, gymnosperms and angiosperms).

Data collection Gene collection was conducted using BLAST (tBLASTn and BLASTp) on the above mentioned set of genomes and transcriptomes.

Timing and spatial scale Marchantia accessions used to run the PCR and sequencing presented in Figure 2 and Supplementary Figure 36 were collected in spring and summer 2018.

Data exclusions For the phylogenetic analyses, partial transcripts that are abundant in the transcriptomes from the 1KP project were removed with a threshold of 60% -length of the M. truncatula query.

Reproducibility Data presented in Figure 5 have been replicated at least twice, as indicated in the material and methods and each replicate include multiple independently transformed root system (as indicated in the legend). All attempt at replication were successful

Randomization Data presented in Figure 5b and c were obtained on plants growing in containers (25 pots / containers). Each container included plants transformed with the different test constructs. Data presented in Figure 5a were obtained on plants growing on plates, with plants transformed by a given constructs in the same plate. Multiple assays were conducted to limit for plate effect, with the negative control included in each independent experiment.

Blinding Data presented in Figure 5 were collected using a blinding system, with each construct attributed a number before scoring. Did the study involve field work? Yes No

Field work, collection and transport

Field conditions DNA samples were collected on wild-growing Marchantia plants in urban habitats. October 2018

Location GPS coordinates for each accession from wich DNA has been extracted are included in Supplementary Table 4, except for two plants where only the locations are reported.

Access and import/export The plant material used for the PCR was collected in anthropized, urban, areas where Marchantia are eliminated as "weeds". Marchantia is neither an engangered species nor a pest and no permit were required to collect or transport the material.

Disturbance Plants were collected in urban area (sidewalks, gardens) with only mosses growing at proximity. This sampling did not result in

2 Disturbance any disturbance nature research | reporting summary Reporting for specific materials, systems and methods We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. Materials & experimental systems Methods n/a Involved in the study n/a Involved in the study Antibodies ChIP-seq Eukaryotic cell lines Flow cytometry Palaeontology MRI-based neuroimaging Animals and other organisms Human research participants Clinical data October 2018

3