Supporting Information Appendix
SI Materials and Methods
Plant materials. Bread wheat (Triticum aestivum L.), its diploid ancestral species T.
urartu and Ae. tauschii, and the allotetraploid T. turgidum were used in this study. The
mutants ms1d.1 and ms1e were obtained from the Wheat Genetics Resource Center at
Kansas State University. We isolated ms1d.2 and ms1h-q from an EMS-mutagenized population of bread wheat (variety ‘Ningchun 4’). The plants used for map-based cloning were progeny segregating from heterozygous ms1e plants. Triticum turgidum L. accession Langdon (AABB), T. urartu accession G1812 (AA) and Ae. tauschii accession AL8/78 (DD) were provided by Professor Hongqing Ling (Institute of Genetics and Developmental Biology, Chinese Academy of Sciences). All plants were grown in a greenhouse under long-day conditions (16 h of light at 22–25°C/8 h of dark at
15–20°C) at a white light intensity of 250 µmol m−2 s−1.
Preparation of the EMS-mutagenized population and isolation of ms1 alleles in the
‘Ningchun 4’ variety. The wheat variety ‘Ningchun 4’ was used to prepare the
EMS-mutagenized population. In brief, about 200 kilograms of seeds were soaked in
0.5% (v/v) ethyl methanesulfonate (EMS, Sigma-Aldrich, St Louis, MO, USA) for
12 hours at room temperature (about 25°C) and planted in a field in Yunnan,
China; 990 kilograms of M2 seeds were harvested. One hundred and thirty-five male-sterile mutants were identified in a population of 50 kilograms of M2 seeds (about 1 million seeds). Allelism tests confirmed 11 of the 135 male-sterile mutants
to be ms1 alleles.
RT-PCR, qRT-PCR and in situ hybridization. Total RNA was isolated using TRI
reagent (Takara Bio Inc.); genomic DNA was removed with DNase I (Promega). Two
micrograms of RNA per sample were used to synthesize cDNA using a First-Strand
cDNA Synthesis Kit (Thermo Fisher). RT-PCR was performed with LA Taq (Takara
Bio Inc.). qRT-PCR was performed on a cycler apparatus (Bio-Rad) using SYBR
Premix Ex Taq GC (Takara Bio Inc.) according to the manufacturer’s instructions.
Amplification was conducted as follows: 95°C for 2 min, followed by 40 cycles of
95°C for 5 s and 60°C for 35 s. ACTIN was used as an internal control. Three biological replicates, each with three technical replicates, were analyzed. The
primers used for RT-PCR and qRT-PCR are provided in SI Appendix Table S9.
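The text does not state how relative expression was derived from the qRT-PCR data. As an illustration only, the sketch below assumes the widely used 2^-ΔΔCt calculation with ACTIN as the reference gene; the function name and all Ct values are hypothetical and are not taken from this study.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct, calib_target_ct, calib_reference_ct):
    """Fold change by the 2^-ddCt method (an assumed method, not stated in the text).

    Each argument is a list of Ct values from technical replicates.
    """
    # dCt: target gene normalized to the internal control (e.g. ACTIN)
    d_ct_sample = mean(target_ct) - mean(reference_ct)
    d_ct_calibrator = mean(calib_target_ct) - mean(calib_reference_ct)
    # ddCt: sample relative to the calibrator sample
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2 ** (-dd_ct)


# Hypothetical Ct values for one biological replicate (three technical replicates each)
fold = relative_expression(
    target_ct=[24.0, 24.1, 23.9],
    reference_ct=[18.0, 18.1, 17.9],
    calib_target_ct=[26.0, 26.0, 26.0],
    calib_reference_ct=[18.0, 18.0, 18.0],
)
print(round(fold, 2))
```

With three biological replicates, the calculation would be repeated per replicate and the resulting fold changes summarized.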
In situ hybridization was performed according to Shitsukawa et al. (1) with minor
modifications. Tissues were cut into 10-μm-thick sections and hybridization was
performed overnight at 50°C. The probe (a 971-bp Ms1 fragment) was amplified
using primers Ms1-ISH-F/R (SI Appendix Table S10) and inserted into pEASY-T1
Simple Cloning Vector (TransGen Biotech) in both forward and reverse orientations.
The vectors were linearized by digestion with HindIII and EcoRI and used as
templates to generate anti-sense and sense probes with T7 RNA polymerase.
Histological analysis. For paraffin sections, tissues were prepared as for RNA in situ
hybridization. Transverse sections (10 μm thick) were cut and stained with 0.25%
toluidine blue. Each section was observed under an Axio Imager M2 microscope
(Zeiss).
RNA-seq, resequencing and bioinformatics processing of the sequence data. To map Ms1, we first collected the ms1e allele (from an EMS-mutagenized line of wild-type Chris) for further analysis (2). Male-sterile and wild-type plants produced from heterozygous ms1e plants were collected for MutMap-based cloning.
To obtain RNA-seq data, RNA was extracted using plant RNA extraction kits
(Qiagen) from microspore-stage anthers of segregated wild-type plants, ms1e plants and their heterozygous progeny. Paired-end libraries were prepared from 10 µg of cDNA reverse-transcribed from the RNA samples (mean insert size: 250 bp). The libraries were sequenced using the Illumina HiSeq 2000 system to produce 101-bp paired-end DNA reads. Library preparation and sequencing were performed at the sequencing center of Peking University (BIOPIC sequencing platform). We obtained
25.0 Gb of data for wild type, 47.6 Gb of data for ms1e and 12.2 Gb of data for the heterozygous plants (SI Appendix Table S1). The reads were trimmed for quality control and to remove adapter sequences with Trimmomatic (3) and then aligned to the wheat genome (4) using TopHat2 (5) (parameter: --b2 -mp 40). After filtering for sequence repeats and reads with multiple mapping regions in the genome, the clean reads were extracted with SAMtools (6) and then further processed using perl scripts.
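The multi-mapping filter described above was carried out with SAMtools and Perl scripts that are not shown here. A minimal Python sketch of the same idea follows. It assumes SAM-format text input in which the aligner records the number of reported alignments in the standard optional `NH:i:` tag; the function names and example records are hypothetical.

```python
def is_uniquely_mapped(sam_line):
    """Return True for a mapped, primary alignment reported exactly once (NH:i:1)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:    # read is unmapped
        return False
    if flag & 0x100:  # secondary alignment
        return False
    for tag in fields[11:]:  # optional fields start at column 12
        if tag.startswith("NH:i:"):
            return int(tag[5:]) == 1
    return False  # assumption: without an NH tag, uniqueness cannot be confirmed


def filter_unique(sam_lines):
    """Yield header lines and uniquely mapped alignment records."""
    for line in sam_lines:
        if line.startswith("@") or is_uniquely_mapped(line):
            yield line


# Hypothetical SAM records
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchr4B\t100\t50\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tchr4B\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:4",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
kept = list(filter_unique(example))
print(len(kept))
```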
To obtain resequencing data, DNA was extracted using plant DNA extraction kits
(Qiagen) from 10-day-old seedlings of segregated homozygous progeny of wild-type and
ms1e plants. Paired-end libraries were prepared from 1 µg of DNA (mean insert size:
350 bp). The libraries were sequenced using the Illumina HiSeq 2500 system to
produce 150-bp paired-end reads. Library preparation and sequencing were performed
by Novogene Co. DNA resequencing generated 524.7 Gb of data for wild type and
522.1 Gb of data for ms1e (SI Appendix Table S3). The reads were trimmed for
quality control and to remove adapter sequences with Trimmomatic (3), and then
aligned to the available wheat genomes, including IWGSC (4), TGAC (7) and W7984
(8), using Bowtie 2 (9) (parameter: --mp 40). Sequence repeats and reads with
multiple mapping locations were filtered. The RNA-seq and resequencing data have
been deposited in the NCBI’s SRA database (accession no. SRP113349). All data will be publicly available after the publication of this work.
Identification of candidate SNPs between wild type and ms1e from the RNA-seq
and resequencing data by MutMap analysis. We applied the MutMap method (10)
to our RNA-seq and resequencing data to map Ms1. After filtering SNPs with low
read coverage (<6), index values for the SNPs were computed as follows: index =
Nmutant/(Nreference + Nmutant), where N represents the number of accumulated reads with
corresponding genotypes. The SNPs were mapped to the wheat chromosome (8), and
peaks with high indexMU/indexWT ratios were identified as candidate chromosomal
regions containing Ms1. To exclude bias due to index values caused by homologous
sequences outside the candidate chromosomal region, two steps were performed. First,
SNPs from homologous genes in the wheat genome were identified by comparing
sequences from the candidate region with the whole genome sequence using BLASTn,
and SNPs obtained from the BLAST analysis were filtered during index calculation.
Second, haplotypes for 200-bp regions around each candidate SNP were generated
using Haploview (11), and SNPs from the different haplotypes were removed so that
only the index ratios of the same haplotypes between wild type and ms1e were
calculated. When indexWT = 0, we set the indexMU/indexWT ratio to 15 in the
RNA-seq analysis and to 30 in the resequencing analysis; these values are therefore
the highest indexMU/indexWT ratios in the respective analyses. Because loci with
indexMU/indexWT ratios lower than 2 in the RNA-seq analysis and lower than 5 in the
resequencing analysis are unlikely to contain the candidate gene, only loci with
indexMU/indexWT ratios higher than 2 in the RNA-seq analysis and higher than 5 in the resequencing analysis are shown in Fig. 1F and G, respectively.
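The index calculation and ratio filter described above can be sketched as follows. This is an illustration under stated assumptions, not the scripts used in the study: coverage is taken to be the number of reads supporting either allele, finite ratios above the cap are also clipped to the cap, and all function names and read counts are hypothetical.

```python
MIN_COVERAGE = 6  # SNPs covered by fewer than 6 reads are discarded


def snp_index(n_reference, n_mutant, min_coverage=MIN_COVERAGE):
    """index = N_mutant / (N_reference + N_mutant); None if coverage is too low."""
    coverage = n_reference + n_mutant  # assumption: coverage = reads for either allele
    if coverage < min_coverage:
        return None
    return n_mutant / coverage


def index_ratio(index_mu, index_wt, cap):
    """indexMU/indexWT, set to `cap` when indexWT is 0 (15 for RNA-seq, 30 for resequencing)."""
    if index_mu is None or index_wt is None:
        return None
    if index_wt == 0:
        return float(cap)
    return min(index_mu / index_wt, float(cap))  # assumption: finite ratios are clipped too


def candidate_loci(snps, cap, threshold):
    """Keep loci whose ratio exceeds `threshold` (2 for RNA-seq, 5 for resequencing)."""
    kept = []
    for name, (wt_ref, wt_mut), (mu_ref, mu_mut) in snps:
        ratio = index_ratio(snp_index(mu_ref, mu_mut), snp_index(wt_ref, wt_mut), cap)
        if ratio is not None and ratio > threshold:
            kept.append((name, ratio))
    return kept


# Hypothetical counts: (locus, (WT reference reads, WT mutant reads),
#                              (ms1e reference reads, ms1e mutant reads))
example = [
    ("snp_a", (20, 0), (0, 18)),   # indexWT = 0, so the ratio is set to the cap
    ("snp_b", (10, 10), (9, 9)),   # ratio 1.0, below the threshold
    ("snp_c", (2, 1), (0, 12)),    # wild-type coverage < 6, skipped
]
print(candidate_loci(example, cap=15, threshold=2))
```

The haplotype and BLASTn filters described in the text would be applied to the SNP list before this step.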
Molecular cloning of Ms1.
A traditional map-based cloning approach was adopted to clone Ms1 using SNPs
between wild-type and ms1e as markers. Using 112 male-sterile plants segregated
from ms1e heterozygotes, we initially mapped Ms1 to interval YZ5–YZ2 with the
SNP markers derived from our RNA-seq data. High-resolution markers were
developed using DNA-seq data; Ms1 was initially mapped between DYZ18 and YZ2,
and then to a 198-kb region between DYZ23 and DYZ19.
Southern blotting. Genomic DNA was extracted from young leaves of T. aestivum L.
(Ms1/Ms1 and ms1g/ms1g), T. turgidum L. accession Langdon (AABB), T. urartu accession G1812 (AA) and Ae. tauschii accession AL8/78 (DD) by the cetyl
trimethylammonium bromide (CTAB) method. The concentration of the purified DNA
was quantified with a nucleic acid analyzer (NanoDrop 2000; Thermo Scientific).
Forty micrograms of each DNA sample was digested overnight at 37°C with HindIII
(Takara Bio Inc.), then purified and separated on a 0.8% agarose gel overnight at 4°C and 35 V. The separated genomic DNA was transferred to Amersham Hybond-N+
nylon membranes (GE Healthcare) and immobilized by UV crosslinking. The probe
DNA was labeled with digoxigenin according to the manufacturer’s guidelines (DIG
Probe Synthesis Kit; Roche); the primers for the probe are listed in SI Appendix Table
S10. A 469-bp fragment from the first intron of Ms1 was used as the probe to obtain the result presented in the manuscript. Within this 469-bp probe sequence, Ms1 shares 71% and 76% identity with Ms-A1 and Ms-D1, respectively. The membranes were probed and then analyzed using a chemiluminescence kit (RPN2106; GE Healthcare).
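The text does not say how the identity values for the probe region were calculated. One common definition, sketched below as an assumption, counts matching columns of a pairwise alignment and treats a gap opposite a base as a mismatch; the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where both sequences have a gap are ignored, and a gap
    opposite a base counts as a mismatch. Other conventions give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignment: 4 matching columns out of 6
identity = percent_identity("ACGT-A", "ACCTTA")
print(round(identity, 1))
```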
Sequence alignment and phylogenetic tree analysis. Sequences were aligned with
Clustal X 2.1 and a phylogenetic tree was constructed using Molecular Evolutionary
Genetics Analysis (MEGA) 5.2.1 software by the neighbour-joining method. A
bootstrap analysis of 1000 replicates was performed to provide confidence estimates
for the tree topologies. SI Appendix, Fig. S5 shows the amino acid sequences used for
tree construction.
DNA methylation analysis. Genomic DNA was isolated from spikes at meiosis in each sample by the CTAB method. DNA samples (30 μg) were treated with proteinase K (AMRESCO Inc.) at 45°C for 1 h. Next, 1.8 μg of purified DNA was treated with an EZ DNA Methylation-Gold Kit (ZYMO Research). Nested PCRs were then performed with Ex Taq HS DNA Polymerase (Takara Bio Inc.). The products were purified using a HiPure Gel Pure DNA Mini Kit (Magen) and cloned into a pEASY-T1 Simple Cloning Vector (TransGen Biotech). For each amplicon, more than
24 clones were sequenced; the sequence data were analyzed using Kismeth software
(12). The primers used to amplify the promoter regions are listed in SI Appendix
Table S9.
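The clone sequences were analyzed with Kismeth (12). The principle behind that analysis can be illustrated with a simplified sketch: after bisulfite conversion, a cytosine that still reads as C in a clone is scored as methylated and one that reads as T as unmethylated, and cytosines are grouped by CG, CHG and CHH context. The code assumes gap-free clones of the same length and strand as the unconverted reference; the function names and sequences are hypothetical.

```python
from collections import defaultdict


def cytosine_context(reference, i):
    """Classify the cytosine at position i as CG, CHG or CHH (None near the 3' end)."""
    downstream = reference[i + 1:i + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) < 2:
        return None
    return "CHG" if downstream[1] == "G" else "CHH"


def methylation_by_context(reference, clones):
    """Percent methylation per context across bisulfite-converted clone sequences."""
    reference = reference.upper()
    counts = defaultdict(lambda: [0, 0])  # context -> [methylated, informative]
    for clone in clones:
        clone = clone.upper()
        if len(clone) != len(reference):
            raise ValueError("clone and reference must have the same length")
        for i, base in enumerate(reference):
            if base != "C":
                continue
            context = cytosine_context(reference, i)
            if context is None or clone[i] not in "CT":
                continue  # skip unclassifiable or uninformative positions
            counts[context][1] += 1
            if clone[i] == "C":  # unconverted C is scored as methylated
                counts[context][0] += 1
    return {ctx: 100.0 * m / total for ctx, (m, total) in counts.items()}


# Hypothetical reference with one CG, one CHG and one CHH cytosine, and four clones
reference = "ACGTCAGTCTTA"
clones = ["ACGTCAGTTTTA", "ACGTTAGTTTTA", "ATGTTAGTTTTA", "ACGTTAGTTTTA"]
levels = methylation_by_context(reference, clones)
print(levels)
```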
Complementation of ms1. For functional complementation of ms1, Ms1 genomic
DNA (including 2,205 bp upstream of the ATG, the gene body region and 536 bp downstream of the TGA) was inserted into pAHC20 digested with HindIII to construct pAHC20-Ms1p::Ms1. Next, pAHC20-Ms1p::Ms1 was transformed into a callus induced from the immature embryo of a heterozygous ms1e plant via particle bombardment, and transgenic plants were selected and regenerated. Transgenic plants in an ms1e/ms1e background were identified by PCR amplification and sequencing.
To evaluate the function of Ms-A1, pAHC20-Ms1p::Ms-A1 was constructed by replacing the Ms1 gene body region with the Ms-A1 gene body region in pAHC20-Ms1p::Ms1. pAHC20-Ms1p::Ms-A1 was then transformed into a callus
induced from the immature embryo of a heterozygous ms1g plant via particle
bombardment, and transgenic plants were selected and regenerated. Transgenic plants
in the ms1g/ms1g background were identified by PCR amplification and sequencing.
The primers used to prepare the constructs are provided in SI Appendix Table S10.
Subcellular localization assay. The plasmids used in the subcellular localization
assay were constructed in p35S-GFP, a pUC-based expression vector that includes the
CaMV35S promoter, GFP reporter gene and polyA of rbcS. To create the
p35S::Ms1-GFP, p35S::Ms1SP-GFP, p35S::Ms1△TM-GFP or p35S::Ms1△SP-GFP construct, full-length or truncated Ms1 cDNA was amplified using a cDNA library prepared from wheat anthers and inserted into p35S-GFP via digestion with KpnI and
BamHI. The primer sequences used for PCR amplification are given in SI Appendix
Table S10.
The peroxisome marker PTS1-mCherry was created by adding the sequence encoding SKL to the 3' end of mCherry (13). The Golgi marker GmMan1-mCherry was created by adding the sequence encoding the first 49 amino acids of GmMan1 to the 5' end of mCherry as described previously (13). The mitochondria marker pFAγ-mCherry was created by adding the sequence encoding the first 57 amino acids of the Arabidopsis thaliana F1-ATPase γ-subunit (At2g33040) to the 5' end of mCherry (14). The plastid marker WxTP-mCherry was created by adding the sequence encoding the first 111 amino acid residues (including the transit peptide sequence of the rice waxy gene) to the 5' end of mCherry (15). The ER marker
SP-mCherry-HDEL was created by adding the sequence encoding the signal peptide
of AtWAK2 to the 5' end of mCherry and adding the sequence encoding HDEL to the
3' end of mCherry (13, 16). The primers used to amplify these marker genes are given
in SI Appendix Table S10.
To introduce plasmid DNA into onion epidermal cells, the particle bombardment
method was adopted using a helium-driven particle accelerator (PDS-1000/He;
Bio-Rad) according to the manufacturer’s recommendations. Three micrograms of
plasmid DNA (1 µg/µl) was mixed with 10 µl of a gold particle (60 µg/µl; diameter:
0.6 µm) solution, 10 µl of 2.5 mM CaCl2, and 4 µl of 0.1 M spermidine, and
incubated for 30 min at room temperature. The plasmid-coated gold particles were
rinsed sequentially with 70% ethanol and 100% ethanol, and then gently suspended
in 10 µl of 100% ethanol. The gold particles were bombarded twice into onion cells
using the particle delivery system with 1100 p.s.i. rupture discs. The bombarded onion
epidermal cells were cultured on MS medium at 25°C in darkness for 24 h.
To detect the GFP- or mCherry-tagged proteins, an LSM710 laser scanning
confocal microscope (Zeiss) was used. GFP was excited at 488 nm, and the
fluorescence emission was detected between 493 and 560 nm. mCherry was excited at 543 nm and its fluorescence emission was detected between 585 and 680 nm. For the plasmolysis assay, bombarded epidermal cells were plasmolyzed by 20 min of exposure to 30% sucrose before confocal scanning with a 10 × 0.3 numerical aperture objective. For the co-localization assay, bombarded epidermal cells were scanned with a 40 × 1.0 numerical aperture water-immersion objective, a 63 × 1.4 numerical aperture oil-immersion objective or a 100 × 1.3 numerical aperture oil-immersion objective.
Preparation of Ms1-specific antibodies and immunoblotting. The peptide
CEPVVAAVDLGGGVP from Ms1 was used to raise rabbit polyclonal antibodies against Ms1 and the antisera were affinity-purified. Total proteins were extracted with
Plant Protein Extraction Buffer (50 mM Tris-Cl, pH 7.5, 150 mM NaCl, 10 mM
MgCl2, 1% NP-40, 1 mM PMSF and protease inhibitor cocktail).
The proteins were separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) then transferred to a polyvinylidene fluoride (PVDF) membrane (IPVH00010; Millipore), which was blocked with 5% (w/v) non-fat dried milk in phosphate-buffered saline (PBS) with 0.1% (v/v) Tween-20 (PBST) at room temperature for 1 h. Primary antibodies were added to the blocking solution at various dilutions as indicated below and incubation was continued overnight at 4°C. After being washed three times with PBST, the PVDF membrane was incubated with peroxidase-conjugated secondary antibodies for 1 h at room temperature followed by three washes with PBST and detection using a chemiluminescence kit (RPN2106; GE
Healthcare). The anti-Ms1 antibodies were diluted 1:600; monoclonal anti-MBP-tag antibodies (E8032; New England Biolabs) were diluted 1:2000; anti-tubulin antibodies were diluted 1:2000; peroxidase-conjugated goat anti-rabbit IgG antibodies
(A0545; Sigma) were diluted 1:10,000 and peroxidase-conjugated goat anti-mouse
IgG antibodies (A4416; Sigma) were diluted 1:10,000.
Expression and purification of MBP-Ms1-His fusion proteins in Escherichia coli.
To prepare the construct expressing MBP-Ms1-His, primers Ms1-His-F and
Ms1-His-R were used to amplify truncated Ms1 (lacking the signal peptide and
trans-membrane domain) fused with 6×His at its C-terminus. To prepare the construct
expressing MBP-His, primers His-F and His-R were annealed. The sequences of the primers used for PCR are given in SI Appendix Table S8. Next, the PCR fragment
and annealing product were in-fused into pMal-c2x (New England Biolabs) digested with EcoRI and PstI, respectively.
All fusion proteins were expressed in E. coli strain BL21. An overnight preculture
of E. coli in LB medium (4 ml) was used to start a 200 ml culture in LB medium.
Protein expression was induced using 0.5 mM IPTG at an A600 of 0.6–0.8, and the
bacteria were allowed to grow at 16°C overnight. The cultures were then cooled to
4°C and resuspended in lysis buffer (50 mM NaH2PO4, pH 8.0, 300 mM NaCl and 10
mM imidazole). Next, the cells were lysed by sonication, followed by centrifugation
at 9,000 x g. The resulting supernatant was applied to an Ni-NTA agarose column
(30210; Qiagen). Proteins that bound nonspecifically to the column were washed off using lysis buffer containing 20 mM imidazole. The His-tagged protein was then eluted with lysis buffer containing 500 mM imidazole. The protein samples were concentrated and dialyzed into PBS buffer (10 mM NaH2PO4, 10 mM KH2PO4, 2.7
mM KCl and 137 mM NaCl, pH 7.4) using Amicon centrifugal filter devices
(UFC501024; Millipore).
Protein-lipid overlay assay. PIP lipid strips (P-6001) and membrane lipid strips
(P-6002) were purchased from Echelon Biosciences. A protein-lipid overlay assay was performed as described previously (17) with modifications. First, the lipid membranes were blocked in 3% (w/v) fatty acid-free ovalbumin (A5253; Sigma) in
PBS for 3 h at room temperature. Second, purified MBP-Ms1-His, MBP-Ms1N-His,
MBP-Ms1C-His, or MBP-His was added at a final concentration of 10 µg/ml and
incubated overnight at 4°C followed by three washes with PBST and three washes
with PBS. Third, lipid membranes were incubated with anti-MBP antibodies (E8032,
diluted 1:2,000; New England Biolabs) in blocking buffer for 3 h at room temperature
followed by three washes with PBST and three washes with PBS. Then, the lipid
membranes were incubated with peroxidase-conjugated goat anti-mouse IgG antibodies (A4416, diluted 1:10,000; Sigma)
in blocking buffer for 1 h at room temperature followed by three washes with PBST
and three washes with PBS. Finally, the membranes were processed for enhanced
chemiluminescence detection.
Fig. S1. Phenotypic characterization of ms1. (A) DAPI-stained microspores from
Ms1 and ms1e. UM: unicellular microspore; BP: bicellular pollen; MP: mature
pollen. n = 100, bar = 50 μm. (B) Paraffin sections of anthers from Ms1 and ms1e
at different stages. E: epidermis; Ar: archesporial cell; Sp: sporogenous cell; En: endothecium; ML: middle layer; T: tapetum; MMC: microspore mother cell; MC:
meiotic cell; Tds: tetrads. Bar = 50 μm.
Fig. S2. Physical map of Ms1 using the SNP markers from the MutMap analysis.
SNP markers and physical distances are shown for chromosome (Chr.) 4BS. Double
dotted lines indicate unavailable sequence (gaps) in the candidate region. Plants 18-19,
11-240, 11-262, 2-137 and 25-1 were male-sterile (ms1e/ms1e) recombinants. Plant
18-22* was a fertile (Ms1/ms1e) recombinant. B: homozygous SNP; H: heterozygous
SNP. TGACv1_scaffold_328661_4BS, TGACv1_scaffold_328309_4BS, 4BS-sc1246 and 4BS-sc963 are genomic DNA accession numbers from the IWGSC (4). The primer sequences for the mapping markers indicated above are included in SI
Appendix Table S2.
Fig. S3. Complementation of ms1e by transformation with Ms1. (A) Spikes from
Ms1, ms1e and a transgenic line containing Ms1p::Ms1 in an ms1e background. Bar =
1 cm. (B) Phenotypes of floral organs from Ms1, ms1e and a transgenic line containing Ms1p::Ms1 in an ms1e background after removal of the palea and lemma.
Bars = 1 mm. (C) Seeds of Ms1 and a transgenic line containing Ms1p::Ms1 in an ms1e background; no seed developed in ms1e. Bars = 1 mm. (D) Mature pollen grains stained with 1%
I2-KI from Ms1, ms1e and a transgenic line containing Ms1p::Ms1 in an ms1e background. Bars = 200 µm.
Fig. S4. Sequence alignment of Ms1 and its orthologues in hexaploid wheat. The sequences are based on the genomic sequence of Chinese Spring (4).
Fig. S5. Sequence alignment of Ms1 and its orthologues in the Poaceae family.
The proteins are named according to their species. These sequences were used to produce the phylogenetic tree shown in Fig. 2B.
Fig. S6. Ms1 expression in anthers of Ms1 and ms1g plants, and expression and purification of
recombinant MBP-Ms1. (A) Ms1 was expressed in developing anthers during
microspore meiosis. Meiosis: meiosis-stage anther; UM: unicellular microspore-stage anther; BP: bicellular pollen-stage anther; MP: mature pollen-stage anther. (B) No
Ms1 was detected in ms1g. Total proteins were isolated from anthers of Ms1 (A and B)
or ms1g (B) plants in microspore meiosis, followed by SDS-PAGE and
immunoblot analysis using anti-Ms1 antibodies. Tubulin was used as a loading
control. MBP-Ms1-His and MBP-His (C) or MBP-Ms1N-His and MBP-Ms1C-His (D) were purified by affinity chromatography from bacteria expressing the
corresponding constructs. M: marker.
Fig. S7. Complementation of ms1g by transformation with Ms-A1. (A) Spikes from Ms1, ms1g and a transgenic line containing Ms1p::Ms-A1 in an ms1g background.
Bar = 1 cm. (B) Phenotypes of floral organs from Ms1, ms1g and a transgenic line containing Ms1p::Ms-A1 in an ms1g background after removal of the palea and lemma. Bars = 1 mm.
(C) Seeds of Ms1 and a transgenic line containing Ms1p::Ms-A1 in an ms1g background; no seed developed in ms1g. Bars = 1 mm. (D) Mature pollen grains stained with 1% I2-KI
from Ms1, ms1g and a transgenic line containing Ms1p::Ms-A1 in an ms1g background. Bars = 200
µm.
Fig. S8. The predicted structures of Ms1. (A) The predicted signal peptide domain in Ms1 according to SignalP-4.1. C-score: raw cleavage site score; S-score: signal peptide score; Y-score: combined cleavage site score (18). (B) The predicted trans-membrane domain in Ms1. The trans-membrane domain was predicted using the transmembrane strands and topology of the β-barrel outer membrane protein prediction web server (PRED-TMBB; http://bioinformatics.biol.uoa.gr/PRED-TMBB/)
based on a hidden Markov model (19). (C) The structure of the eight-cysteine motif in
Ms1 from wheat and other representative members of the Poaceae. (D) Diagram of the structural domains in Ms1.
Fig. S9. Ms1 is not a secretory protein. Onion epidermal cells were transiently transformed with constructs encoding GFP and Ms1-GFP. Plasmolysis was induced by adding 30% sucrose prior to confocal scanning. The images were recorded using the GFP channel under a confocal microscope. Bar = 50 µm.
Table S1. Counts of paired-end reads from the RNA-seq data

| Sample | Raw reads | Raw bases (Gbp) | Clean reads | Clean bases (Gbp) | Read length (bp) | GC content |
|---|---|---|---|---|---|---|
| Ms1/ms1e_lane6 | 30,219,193 | 6.1 | 27,106,438 | 5.4 | 101 | 52.6% |
| Ms1/ms1e_lane7 | 30,245,424 | 6.1 | 27,165,790 | 5.4 | 101 | 52.7% |
| ms1e/ms1e_lane6 | 62,097,423 | 12.5 | 55,599,082 | 11.1 | 101 | 52.7% |
| ms1e/ms1e_lane7 | 62,091,525 | 12.5 | 55,669,677 | 11.1 | 101 | 52.3% |
| Ms1/Ms1_lane6 | 117,864,027 | 23.8 | 105,372,896 | 21.1 | 101 | 52.7% |
| Ms1/Ms1_lane7 | 117,852,306 | 23.8 | 105,484,538 | 21.1 | 101 | 52.4% |
Table S2. Primers used for the SNP markers

| Primer name | Position on Chr. 4B (cM) | Scaffold ID | Sequences (5'→3') |
|---|---|---|---|
| YZ5 | 6.843 | TGACv1_scaffold_329667_4BS | CAGACGAAGTCGCCATCATCAATC / TTCCTTGTATATGAGCCAGGTCTG |
| DYZ12 | 6.843 | TGACv1_scaffold_328717_4BS | CTGTTCTTAGAACTCTTCTTGGTAG / GAGATCGAGGAACCAACATATAAGC |
| DYZ11 | 11.623 | TGACv1_scaffold_330253_4BS | TACCATCGTGCGAAGAGGGGAAC / ACGCTACCGAAAGAATCCTATCCAC |
| DYZ14 | 10.41 | IWGSC 4BS-sc127 | TGGGAAATAATCGTGGAAACAGTTC / CGCACATGTCGGCGACTGAG |
| DYZ2 | 12.837 | TGACv1_scaffold_328808_4BS | TTTTTTGTTGTGTGGATTGATGACC / CGATGGTAAAATGGCTAAATTCTGG |
| DYZ10 | 12.837 | IWGSC 4BS-sc661 | TCTGGAGGCAGCCCGGTAGCGAC / TTGTGCAGGTATTGGTGATTTGCGC |
| DYZ9 | 15.111 | IWGSC 4BS-sc2875 | AAGTTTTGCGACAGCTTGAACTCTC / CTCGCGCTGTGAGTGTGCTTTCT |
| DYZ15 | 15.111 | IWGSC 4BS-sc2875 | ATGCTCTTAGTACAAGTATTGTGCG / TCAACGAATAGAGAAGCTGCCATG |
| DYZ3 | 15.111 | TGACv1_scaffold_328625_4BS | CAAACAAAAAAAGTCACGGACATTA / TAGTTCGTCAAAACCTATCAACATG |
| DYZ13 | 15.111 | TGACv1_scaffold_328576_4BS | GTTAACCGTGATTGTTGTTCCTCCT / GGAAAGAAAAGATCAGCCCCTAGTG |
| DYZ4 | 18.525 | TGACv1_scaffold_328921_4BS | ATATGAAATGTCGTGTAATGGCAC / AACTCCTTCAACAAGATGACAACG |
| DYZ18 | 19.662 | TGACv1_scaffold_329888_4BS | AAAGCAAAACGAACTCATATCCAAT / TGGAGTTCTTGATGAACTAGCGACG |
| DYZ8 | 19.662 | IWGSC 4BS-sc2277 | CGGAGAAGCAGAATGAAAAGTAAAC / CGATGGAGTTGACCTAAGGGACG |
| DYZ16 | 19.0935 | IWGSC 4BS-sc1948 | AGGCTTCTCAAGATCAAGGGAATG / TGACTTTTCAAGAAGGCTGAAATTC |
| DYZ21 | 19.662 | TGACv1_scaffold_329160_4BS | CGAGATCCACCGGTTTTGACAC / CATGGCACATCGTTTACAATCAG |
| DYZ23 | 19.662 | TGACv1_scaffold_328661_4BS | TCGGTGAGTAATAGTTAAAGAAACGG / CAAGCGTGATCACAAGGTGTTGG |
| DYZ20 | 19.662 | TGACv1_scaffold_328661_4BS | CCGATCCGGTGCACATGTTAGTAAC / TCGTGAAAAGAGGTCGGGTCAAACC |
| DYZ19 | 19.662 | IWGSC 4BS-sc963 | CTACTTCACCACCTCCTACAACTGC / CAAAAACCACATCAAGAGCAACCTT |
| YZ2 | 19.662 | TGACv1_scaffold_328309_4BS | AAGTACTCAAAGTGTACTCCCTCCC / AACCGCCGCGGTGTCCCTCT |
| DYZ7 | 19.662 | IWGSC 4BS-sc1383 | ATGATCTCAAAGCTTGGATATATTC / CCCATCCACGAAGATATATTATTGG |
| DYZ6 | 21.936 | TGACv1_scaffold_330289_4BS | CACACTGTTCTCTCTATTGGTTTCC / ACCAATAAGGGCTAAAAGTTCCTCC |
| DYZ17 | 25.351 | IWGSC 4BS-sc495 | TTGGAGATTAGATCCACGAAAATCC / CTCCCCACTGTGACCAGCCTTAG |
| DYZ5 | 19.0935 | TGACv1_scaffold_328611_4BS | TTGACTTTGTGCTATAAGTATATGC / AATCTGACTTGTATTGTGTTATTGC |
| YZ105 | 25.351 | TGACv1_scaffold_1126105_4BS | ATCTGAATTTGTGTTCGCTGCCAC / TTTCTTTGCGAATGGAAGTTAAAC |
| YZ8 | 44.689 | TGACv1_scaffold_4870643_4BS | CAGGTAACCACAAAATTGACATTCC / GGTAAATGAGGATATAAACCAAATTTTC |
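Table S3. Counts of paired-end reads from the resequencing data

| Sample | Raw reads | Raw bases (Gbp) | Read length (bp) | GC content |
|---|---|---|---|---|
| Ms1 | 71,961,639 | | 150 | 49% |
| Ms1 | 462,611,898 | | 150 | 48% |
| Ms1 | 419,280,872 | | 150 | 48% |
| Ms1 | 85,643,682 | | 150 | 48% |
| Ms1 | 396,226,584 | | 150 | 48% |
| Ms1 | 89,621,510 | | 150 | 48% |
| Ms1 | 113,668,901 | | 150 | 48% |
| Ms1 | 110,060,149 | | 150 | 48% |
| Ms1 (total) | | 524.7 | | |
| ms1e | 36,727,284 | | 150 | 50% |
| ms1e | 417,756,344 | | 150 | 50% |
| ms1e | 427,418,147 | | 150 | 50% |
| ms1e | 429,752,054 | | 150 | 50% |
| ms1e | 428,791,204 | | 150 | 50% |
| ms1e (total) | | 522.1 | | |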
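The raw-base totals in Table S3 can be checked against the read counts. The sketch below assumes that each "raw read" entry counts a read pair, so that every entry contributes two reads of 150 bp; this assumption is not stated in the table but reproduces the reported totals. The function name is hypothetical.

```python
def total_gbp(read_pair_counts, read_length_bp, reads_per_pair=2):
    """Total sequenced bases in Gbp, assuming each count is a read pair."""
    return sum(read_pair_counts) * reads_per_pair * read_length_bp / 1e9


# Read counts copied from Table S3
ms1_libraries = [
    71_961_639, 462_611_898, 419_280_872, 85_643_682,
    396_226_584, 89_621_510, 113_668_901, 110_060_149,
]
ms1e_libraries = [
    36_727_284, 417_756_344, 427_418_147, 429_752_054, 428_791_204,
]

ms1_total = round(total_gbp(ms1_libraries, 150), 1)
ms1e_total = round(total_gbp(ms1e_libraries, 150), 1)
print(ms1_total, ms1e_total)
```

The same relation holds for the RNA-seq data in Table S1, where each lane's raw bases equal its read count multiplied by 2 × 101 bp.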
Table S4. Nine candidate genes predicted within the 198-kb sequence between DYZ23 and DYZ19

All CS_NR gene locations are on Scaffolds_v3_chr4B|scaffold150014; only the coordinates are listed in the table.

| Gene ID | TGAC location | CS_NR gene location | Uniprot ID | Annotation |
|---|---|---|---|---|
| C1 | TGACv1_scaffold_328661_4BS: 20888-18679 | 1375948-1378157 | UniRef50_B7U6L5 | Globulin 3B n=1 Tax=Triticum aestivum RepID=B7U6L5_WHEAT |
| C2 | TGACv1_scaffold_328661_4BS: 30252-35330 | 1365268-1361390 | UniRef50_Q10IE2 | Retrotransposon protein, putative, Ty1-copia subclass n=4 Tax=BEP clade RepID=Q10IE2_ORYSJ |
| C3 | TGACv1_scaffold_328661_4BS: 38394-42366 | 1358138-1353609 | UniRef50_Q6ATL7 | Putative polyprotein n=6 Tax=Oryza sativa RepID=Q6ATL7_ORYSJ |
| C4 | TGACv1_scaffold_328661_4BS: 43063-45385 | 1352840-1350518 | UniRef50_B7U6L5 | Globulin 3B n=1 Tax=Triticum aestivum RepID=B7U6L5_WHEAT |
| C5 | TGACv1_scaffold_328661_4BS: 56052-66725 | 1341757-1329881 | UniRef50_M7YQ04 | U3 small nucleolar RNA-associated protein 25 n=2 Tax=Pooideae RepID=M7YQ04_TRIUA |
| C6 | TGACv1_scaffold_328661_4BS: 94254-92399 | 1303196-1305051 | UniRef50_F2D737 | Predicted protein n=2 Tax=Triticeae RepID=F2D737_HORVD |
| C7 | TGACv1_scaffold_328309_4BS: 34286-37738 | 1254660-1251208 | UniRef50_M7YSL0 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 n=1 Tax=Triticum urartu RepID=M7YSL0_TRIUA |
| C8 | TGACv1_scaffold_328309_4BS: 65849-72658 | 1223097-1216288 | UniRef50_Q2QZQ1 | Retrotransposon protein, putative, Ty3-gypsy subclass n=50 Tax=Oryza RepID=Q2QZQ1_ORYSJ |
| C9 | TGACv1_scaffold_328309_4BS: 103288-108572 | 1185658-1180374 | UniRef50_UPI0003508A36 | Uncharacterized protein LOC101783098, partial n=2 Tax=Setaria italica RepID=UPI0003508A36 |
Table S5. The fourteen ms1 alleles in bread wheat

| Allele | Accession name | Mutagenesis type | Mutation type | Amino acid or protein change |
|---|---|---|---|---|
| ms1d.1 | FS2 | EMS mutagenesis | G329A | disruption of splice donor site for 1st intron |
| ms1d.2 | NC41 | EMS mutagenesis | G329A | disruption of splice donor site for 1st intron |
| ms1e | FS3 | EMS mutagenesis | G1431A C1432 | P124R and frameshift |
| ms1g | LZ | spontaneous mutation | Deletion△ | None |
| ms1h | NC642 | EMS mutagenesis | C1762T | stop after P190 |
| ms1i | NC790 | EMS mutagenesis | G1603A | disruption of splice acceptor site for 2nd intron |
| ms1j | NC791 | EMS mutagenesis | C1775T | S195F |
| ms1k | NC28 | EMS mutagenesis | G1397A | disruption of splice acceptor site for 1st intron |
| ms1l | NC130 | EMS mutagenesis | C226T | stop after P75 |
| ms1m | NC226 | EMS mutagenesis | C1472T | stop after K134 |
| ms1n | NC904 | EMS mutagenesis | T164A | V55D |
| ms1o | NC955 | EMS mutagenesis | G281A | C94Y |
| ms1p | NC318 | EMS mutagenesis | G155A | C52Y |
| ms1q | NC110 | EMS mutagenesis | C148T | stop after A49 |

The mutant site in each allele is based on the sequence in the variety from which the allele is generated.
Table S6. Ms1 homologs in other species

| Gene ID or name | Species | Contig information |
|---|---|---|
| Ms1 | Triticum aestivum | TGACv1_scaffold_328661_4BS 94254–92399 |
| TaMs-A1 | Triticum aestivum | TGACv1_scaffold_290346_4AL 29315–27459 |
| TaMs-D1 | Triticum aestivum | TGACv1_scaffold_361174_4DS 69146–71038 |
| TtMS-A1* | Triticum turgidum | N/A |
| TtMS-B1* | Triticum turgidum | N/A |
| TuMS1 | Triticum urartu | TGAC_WGS_urartu_v1_contig_181471:4268–2767 |
| AetMS1 | Aegilops tauschii | ctg7180000362290, whole-genome shotgun sequence (113193–115077) |
| LOC_Os03g46110 | Oryza sativa | Chr. 3: 26076622–26073641 |
| ONIVA03G29620 | Oryza nivara | Chr. 3: 25338122–25340561 |
| OPUNC03G26010 | Oryza punctata | Chr. 3: 28367645–28369999 |
| LpMS1 | Leersia perrieri | scaffold_3_115 23556–20677 |
| Phyllostachys heterocycla MS1 | Phyllostachys heterocycla | PH01001922 245407–242952 |
| Zoysia japonica MS1 | Zoysia japonica | Contig: Zjn_sc00058.1, cultivar: Nagirizaki, 989555–991578 |
| Zoysia pacifica MS1 | Zoysia pacifica | Contig: Zpz_sc01372.1, cultivar: Zanpa, 22268–24309 |
| Bradi1g13030 | Brachypodium distachyon | Chr. 1: 9858880–9861102 |
| Brast02G255500 | Brachypodium stacei | Chr. 2: 18898450–18901089 |
| Pavir.Ga01748 | Panicum virgatum | Chr. 7a: 21869965–21872338 |
| Pavir.Gb01613 | Panicum virgatum | Chr. 7b: 21324041–21326309 |
| Lophatherum gracile MS1† | Lophatherum gracile | N/A |
| Phragmites australis MS1† | Phragmites australis | N/A |
| Sb06g017510 | Sorghum bicolor | Chr. 6: 46903482–46905662 |
| Hordeum vulgare MS1 | Hordeum vulgare | Contig: HVVMRXALLeA0005J15: 14967–13033 |
| Sevir.7G115900 | Setaria viridis | Chr. 7: 19782572–19784377 |
| Si012756m.g | Setaria italica | Chr. 7: 20903438–20903770 |
| GRMZM2G151021 | Zea mays | Chr. 2: 49392293–49394518 |
| GRMZM2G166484 | Zea mays | Chr. 10: 119825127–119826659 |

N/A, not applicable.
* TtMS-B1 and TtMS-A1 were cloned using primers from Ms1 and TaMs-A1 and their sequences were confirmed by sequencing.
† Lophatherum gracile and Phragmites australis MS1 were cloned by PCR using degenerate primers and genome walking. The primers used to amplify Ms1 orthologues are listed in SI Appendix, Table S9.
Table S7. Alignment of the 5' portion of Ms1 across angiosperm lineages

Ms1-5'_portion        ------ATGGAGAGATCCCGCGGGCTGCTGCTGGTGGCGGG 35
Aco005198.1           ATGGATCCCACCCTCTCCCTCTTCCTCCTCGCTGCGGCGGCGCTCGCCGGCGCCGGCGCCGCCGCAGTGGCGGAGCCAGCGCCGTCGAGTTGCGCG 96
Cucsa.239340.1        ------ATGGC 5
Potri.001G119000.3    ------ATGGCTTCTTCTCTCAAGATTTC 23
At1g05450.2           ------ATGAACTCCAATAGTTTCTTAATCTCAGCAGCCTTAATCTTCTCTCTACTATCATC 56
PGSC0003DMT400032496  ------ATGGC 5
DCAR_004911           ------ATGGTGGGTGTGGCGGTGGC 20

Ms1-5'_portion        GCTGCTGGCGGCGCTGCTGCC-GGCGGCGGCGGCGCAGCCGGGGGCGCCGTG-CGAGCCCGCGCTGCTGGCGA-CGCAG--GTGGCGCTCTTCTGC 126
Aco005198.1           GAGGAGATCGTCGGGATCTCC-GCTTGCCTCCCCCTCGTCGTGGCGGCGACG-CCGATCACCGCCGCCGCCAA-CGCCACCGCGGCGGCGGCGGCG 189
Cucsa.239340.1        GGTGGTTGCGATGTCGCCGCCCACGGGATGCACCACTAGAGAGC-TGCTTTT-GCTCTCTCCATGTCTGCCTTTCATTTCTGCTCCGCCAAACAAT 99
Potri.001G119000.3    TATTCTGGCGATGATGGTTGTAGTTTTTTTTTCGAGCGCGACAA-CCTTAAC-GAGAGCA-CAAGACCAGTCT-ACTTCTTGTGCATCTAAGTTA- 114
At1g05450.2           AAATTCTCCAACATCGATTCTTGCTCAAATCAATA-CACCATGTTCACCATCTATGCTCTCTAGCGTTACAGGTTGCACGAGTTTTCTAACGGGAG 151
PGSC0003DMT400032496  GTTGACGGCGGCGATAATTGC-GTCAGATGCGCAA--ACAACGC-CGCCGTC-GTGTGCCTCGAAATTAGTGC-CATGT--GCGCCTTACCTTAAC 93
DCAR_004911           GTTGTTGGTGGTTATGGCAGT-GATGACGGCGGAAGGACAGGATATTCCGTC-GTGTGCATCGGGACTGGTGC-CATGC--GCGGATTATTTGAAT 111

Ms1-5'_portion        GCGCCCGACATGCCGACGGCCCAGTGCTGCGAGCCCGTCGTCGCCGCCGTCGACCTCGGCGGCGGGGTGCCCTGCCTCTGCCGCGTCGCCGCCGAG 222
Aco005198.1           GCGGAGGCGGCGCCATCCGACGCGTGCTGCGACGCGTTCCTCCGTGGCCTCG---TCGGCGGTGGCGCCGCGTGCCTCTGCCACCTCTTACGGGAC 282
Cucsa.239340.1        CTTTCCGATACGGTTCCTTCTGAGTGCTGTGATGCGTTCTCCTCCGCTTACAG---TGCCGGCGGAGGGATTTGCCTTTGTTATTTTCTTCGTGAG 192
Potri.001G119000.3    GTACCATGTCAACCACCAGACAGCTGCTGCAACTCCATCAAAGAAGCGGTTG----CAA--ATGAGCTTCCTTGTCTTTGCAAACTCTATAACGAC 204
At1g05450.2           GTGGT-AGTTTTCCGACCTCAGATTGTTGTGGGGCTCTTA----AATCGTTAA--CCGGAACCGGTATGGACTGTTTGTGT----CTGATAGTAAC 236
PGSC0003DMT400032496  GCATCGAGT---CCCCCTGCGGAGTGTTGTGATCCATTGAGAGAAGCAATAA----CAA--ATGATTTAGATTGTTTGTGTAAATTGTATGAAAAT 180
DCAR_004911           GCAACCAGTAAGCCGCCGGCTTCGTGTTGTGATCCGATCAAGGAAGCTGTTA----CGA--AACAGCTTCCGTGTTTGTGTAATCTTTATAATACT 201

Ms1-5'_portion        C-CGCAGCTCGTCATGG---CGGGCCTCAACGCCACCCACCTCCTCACGCTCTACAGCTCCTGCGGCGGCCTCCGCCCCGGCGG------CGCCC- 307
Aco005198.1           C-CGCTCCTCCTAGGGT---TTCCGATTAACACCTCCCGCATCGCCTCGCTCTTCTCCTCCTGCGGAGCTCCGAACCCTAGCGA------CTCCGC 368
Cucsa.239340.1        C-CTCAGATTTTAGGCT---TTCCGTTGAATCGAACGAAGTTCATGGCTCTGTCTTCGTTTTGTCCTCTTAATGGTGAAAACGGA--ATATATTTG 282
Potri.001G119000.3    C-CCAATTTGTTTCAGAGTTTGGGTATAAATGTCACTCAGGCTGTCATGCTCAGCCAGAGATGCGGTGTCACCACTAATCTCAC------TAGTTG 293
At1g05450.2           CGCAGGTGTTCCAATCAGTATTCCTATAAACCGAACTTTAGCCATCTCTCTCCCTCGTGCATGTGGCATTCCTGGTGTCCCCGTTCAATGCAAAGC 332
PGSC0003DMT400032496  C-CAACTTTGTTGCCTTCACTTGGTATTAATATTACTCAAGCACTTGCACTTCCTAAGGCTTGTAATATTTCTGGTGATCTTAA------TGCTTG 269
DCAR_004911           C-CTGGCTTGTTGAAGTCTTTTGGGATTAATGTTACTCAAGCTGTGAGGCTTCCTACTTTGTGTGGTGTTCCTGGTGATCTCTG------TCAGGG 290

Ms1-5'_portion        -ACCTC-GCCGCCGCCTGC------ 324
Aco005198.1           GGACTC-GTCGTTCTCCGAGATGTGCAATGAATCCCAGTCGTTGCCACCGTTCCGAAGCATCACATGGAATGATACAAGCATACCAAGCCCAGCGA 463
Cucsa.239340.1        GAGAAGAATAGTTCTCTGGACTC--GGTTTGTGCTGCTTCACAAACTCTGCCTCCTCTTCAAAGCTCGAGGATTCCAAGAATCCAAGAGCCGGATA 376
Potri.001G119000.3    CAGCGC-TTCAGCTCCAACGCCAGCTGGTTCAG-CAGTTCCTGGAAACGATGGAGATAATGGTGGTAGCAGGATGTCATTGTCGACTGGACTTTCA 387
At1g05450.2           TTCTGCAGCACCTCTCCCTACTCCAGGTCCTGCGTCTTTCGGTCCGACCACTTCTCCTACAGATTCGCAAACTTCTGATCCTGAAGGGTCTGCTTC 428
PGSC0003DMT400032496  TACTACAGGTGGTGCTCCAGGTC-CAAGTTCTGAAGGCTTGCCACCCCCAGGTAACTAA------ 327
DCAR_004911           TAATTCCTCTCTCTCCCCCTCTCTCTATCTCTCTCTCCCCCCTCTCTCTATCTCTCTCTCTCCCTCTGTCTCGCTCTCTCCCTCCCTCTCCCTCCC 386
Ms1-5'_portion ------Aco005198.1 GTTCTG---AAGAAACCATTAATCCGTCCTCTGCAAATCTGACCCGACCTGCCCCCGTTCCGATCTGCCCCCCTGTAGCATGTCCGAAACCGTCGG 556 Cucsa.239340.1 GTCCTGCTGATGAGAACATAGAAACTCCCGACGTGGGTTTACCACCAAATGCAATTGTATCGCCCTCTGCACCTGCAGAAAAACCGCAGCCGCCTC 472 Potri.001G119000.3 GGCTTG---CTCGTATTATTGGTCGCGTCTCTCCTGCATTAG------426 At1g05450.2 TTTCCGTCCGCCCACTTCTCCGACAACTTCGCAAACTCCTAATGACAAGGATCTCAGCGGATCGGGCAACGGAGGAGATCCAATGGGGTTTGCTCC 524 PGSC0003DMT400032496 ------DCAR_004911 TCTCCC---CCCTCTCTCTCAATTAGGCATATTGATAAAATCTCTCCTCTCTCTCTCTCTCCCTCTGTCTCTCTCTTTCTCGCTCTCTCTCCCCCC 479
Ms1-5'_portion ------Aco005198.1 AGCCCGACCTCGCACCACAACCACGCCCGGATGCGTCAGCCCGATCGCTGCTGGCCAGTGCAATGTCGCTCATTTTTGCTGTCCTCGCATTCTTTA 652 Cucsa.239340.1 CGTCATCTGCTACAGCTGAACGTTTTTTATTGGCAAGAAAATGTATTGGTTTGTTCTTTTCAGGTCCACTCTTCCTTATTCACATTTTGTGA---- 564 Potri.001G119000.3 ------At1g05450.2 ACCTCCACCCTCGTCGTCGCCTTCCTCTTCGCACTCTCTCAAGCTTTCGTATCTTCTATTTGCTTTCGCCTTTACGATTATCAAATTCATCTAA-- 618 PGSC0003DMT400032496 ------DCAR_004911 CCTCTCTCCCAATTAG------495
Ms1-5'_portion ------Aco005198.1 TTTTCGAAGCGATGAGTCGTGCAGCCGACTAA 684 Cucsa.239340.1 ------Potri.001G119000.3 ------At1g05450.2 ------PGSC0003DMT400032496 ------DCAR_004911 ------
Aco005198.1 (JGI), Ananas comosus, Bromeliaceae, Poales
Cucsa.239340.1 (JGI), Cucumis sativus, Cucurbitaceae, Cucurbitales
Potri.003G113900.3 (JGI), Populus trichocarpa, Salicaceae, Malpighiales
At1g05450.2 (Araport11), Arabidopsis thaliana, Brassicaceae, Brassicales
PGSC0003DMT400032496 (JGI), Solanum tuberosum, Solanaceae, Solanales
DCAR_004911 (JGI), Daucus carota, Apiaceae, Apiales
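As a rough illustration of how alignments such as those in Tables S7 and S8 can be summarized, the sketch below computes percent identity between two aligned rows, counting only columns in which both rows carry a base. It assumes the rows come from the original alignment file with gap characters intact and equal lengths; the rows as printed in this appendix have collapsed gap runs and cannot be used directly. The function name and the toy sequences are hypothetical and are not part of the original analysis.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either row
        compared += 1
        if a == b:
            matches += 1
    # Return 0.0 when no column could be compared (all-gap rows)
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy rows for illustration only, not taken from the alignment
    print(round(percent_identity("ATGGC-A", "ATGACTA"), 2))
```

Ignoring gapped columns is only one of several conventions for reporting identity; other tools divide by the full alignment length or by the shorter sequence, so values are comparable only when the same convention is used.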
Table S8. Alignment of the 3' portion of Ms1 across angiosperm lineages
Ms1-3'_portion ------ERN03927.1 ATGGTTCCAAGCATTAGAGAACCCATTTGCTCGGCCATTATTTGGTGGTTGCTACTGGTTCTTACCGGGGGTTTTTGTGGGATAAGCGGCCATGG 95 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 CTTCTCTCGCTCTGAAGCTTCATCCATAGAGCTCAGTCATGGCCATGGCCATGGCTTCTCTGCAAGTTTACACGAAGCTCTTGTTGATGG-ACCG 189 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------ATGAAGAAGACGATTCAAATCCTCCT--CTTCTTCTTCTTCCTCATCAATCTCACCA 55 PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 GAGCTCGAACCAGGCTTTGGATTCAGAAGCCATGGCTTGACCTTTGCCGAGGCTTCATCCATCAAGCACAGGCAGCTCCTTTACTACATCGATAG 284 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ACGCTCTC-TCAATCTCC---TCTGACGGCGGCGTTCTCTCCGATAACGAAGTCCGTCACATTCAACGCCGTCAATTACTCGAATTCGCCGA--- 143 PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 GTATGGAGACAGAGGAGAGAGTGTATTCGTAGATCCGAGCTTCGAGTTCGAGAATTCGAGGCTCCGAGAAGCTTACATTGCTCTTCAGGCATGGA 379 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------AC-GAAGCGTCAAAATCACCGTTGATCCTTCTCTAAACTTCGAGAATCCGAGATTGCGAAATGCTTATATAGCTCTACAAGCTTGGA 229 PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 AAATGGCCATTATTTCAGATCCCATGAACATCACAGGTAACTGGATTGGCCCTATTGTCTGCAACTACACTGGAGTCTTCTGCTCAAAGAGCTTA 474 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 AACAAGCGATTCTCTCTGATCCAAACAATTTCACTTCGAATTGGATCGGATCCAATGTCTGTAACTACACCGGAGTTTTCTGTTCTCCGGCGCTT 324 PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 GATGGCTTGAACCTCACTGTAGTGGCCGGAATCGACTTGAACCACTCTGATATCGCCGGTTATTTGCCTGACGAGCTCGGAAAACTCACCGATCT 569 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 GATAATCGGAAGATTCGTACCGTCGCCGGAATCGATCTCAATCACGCAGATATCGCTGGTTATTTACCTGAAGAGCTTGGTTTGTTATCAGATCT 419 PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 AGCTCTTCTTCACCTGAACTCAAATCGATTTTGCGGCACCATTCCGCACACTTTCAAGAGATTGAAGCTCCTCTACGAGCTTGATCTCAGCAATA 664 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 TGCTTTGTTTCATGTTAATTCAAACCGGTTTTGTGGTACTGTACCACACCGGTTTAACCGGCTTAAGCTTTTATTCGAGCTTGATCTTAGTAACA 514 PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 ATCGATTTGCTGGTCGATTCCCCACTGTAGTTCTCAAACTACCTACCCTGAAATACTTAGATCTCAGGTACAATGAGTTCGAAGGCCCCGTTCCT 759 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ACCGGTTCGCTGGGAAGTTTCCGACGGTTGTCTTGCAATTACCGTCGTTGAAGTTTTTAGATCTCCGGTTTAATGAATTTGAAGGAACTGTACCG 609 PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 TCGTCTCTCTTCAATCGGCCATTAGATGCCATTTTCCTCAACCACAATCGATTCCATTTTGAAATTCCAGAGAACTTTGGGAACTCTCCTGTCTC 854 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 AAAGAGCTTTTTAGTAAAGATCTTGACGCGATTTTCATAAACCATAACCGGTTCCGGTTTGAATTACCGGAGAATTTTGGTGATTCGCCGGTTTC 704 PGSC0003DMG400010093 ------DCAR_026603 ------
31 Ms1-3'_portion ------ERN03927.1 TGTGGTGGTTCTGGCAAACAACAGATTCAGGGGCTGCATTCCTTCGAGCTTGGCAAAAATGGCTCCCACATTGAATGAAATCATTATTATGAATA 949 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 GGTTATTGTTTTGGCGAATAACCGGTTCCATGGTTGTGTACCATCGAGCTTGGTGGAGATGAAG---AATCTTAACGAGATCATCTTCATGAACA 796 PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 ATGGATTATCTGGTTGTGTACCAGAGGAGTTTGGAGCTCTGAAAAATCTAACTGTTTTGGATGTGAGCTTCAACAAATTGGTGGGGAATTTGCCT 1044 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ATGGTCTTAATTCTTGTTTACCGTCTGATATCGGACGGTTAAAGAACGTGACGGTGTTTGACGTCAGTTTTAATGAACTTGTTGGGCCGTTACCG 891 PGSC0003DMG400010093 ------DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 CTAAATTTAGGTGGACTTGTCTCTTTGGAACAGTTGAATGTTGCTCACAACATGCTTTCAGGTCAGATTCCTCCTCAAATTTGCAGTCTTCCAAA 1139 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 GAGAGTGTTGGTGAGATGGTTTCGGTGGAGCAGCTTAATGTGGCGCATAATATGTTGTCGGGGAAGATTCCGGCGAGTATTTGTCAGTTACCGAA 986 PGSC0003DMG400010093 ------ATGGCCTCTCCAGATTCTCCCCCATCTTTTTTTCCATTCCCAAC 44 DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 TTTAGACAACTTTACTTTCTCTTATAACTTCTTTGAGGGAGAGCCCCCTGTGTGCTTGAGGCTCCCAAGCTTTGATGATAGGAGG---AATTGCA 1231 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 GCTTGAGAATTTCACTTATAGTTACAATTTCTTTACCGGAGAAGCGCCTGTGTGTTTGAGGTTGCCGGAGTTTGATGATCGGAGA---AATTGTT 1078 PGSC0003DMG400010093 TTTCCCCACAGCTACACCTTCAAATTCTACTTCTGATTCACCTCCTGCTCCTCCTCCTGACTCTTCATCTCCTCCTCCTCCACCACCTGACTCTT 139 DCAR_026603 ------
Ms1-3'_portion ------GGCGCCC 7 ERN03927.1 TTAATGGAAGGCCAAAACAGAGGCCAAGGAAACAATGTAAATCTTTTCTCTCTCATAGA---GTGGATTGTGGTAGTGTCAAGTGTGGTCGAGTT 1323 Aco000634.1 ------ATGGAAACCCTACCCTCGAGCTACAAGGAGAACGACGAGGACGAGGAGCCAATGTGGAGCGACGCCC 67 XM_008456796.2 ------ATGATGGATCCTCGAGCTTTTTTGTTGTGTTTCACTTTCATCTCCATTGCTT 52 ABK95485.1 ------ATGATGCTAAAAAAAGCTGTCATCCT---TCTTTCTTTGATCTGCATTTCGA 49 AT4G13340.1 TGCCGGGAAGACCTGCTCAGAGGTCTCCAGGGCAATGTAAAGCGTTTTTGTCTCGTCCGCCGGTAAATTGTGGATCGTTTAGTTGTGGCCGTTCT 1173 PGSC0003DMG400010093 CAGCTCCACCTCCTTCTCCACCAGCTCCAGATTCTCCTCCTCCGTCAGAATCAAAATCACCTCCACCGGCAGAATCTCCTCCACCTCCACCTCCT 234 DCAR_026603 ------ATGTTTTTTCTGAATTATCAGTGGCCACCAGCTCCTCC-----TCCAAAACCA 48
Ms1-3'_portion ---ACCTCGCCGCCGCCTGCGAAGGACCCGCTCCCCCGGCCGCCGTCGTCAGCAGCCCCCCGCCCC------CGCCTCCACCGTCCGCCGCACC 92 ERN03927.1 TCGCCATCTCCTCCACCGGTATTTTCGCCCCCGCCACCTCTGGTTTTATCACCCCCATCGCCACCAGTTTCATTGCCCCCTCCGGTGCCAATTTC 1418 Aco000634.1 TCGACCCCATCCCATCTCCTCCTTCTCCTCCTCCCTCCGCCGCCGCCGCCGCCGCCGCCGCACACT------CTCCTCCGCCGCCGCCTCCGCC 155 XM_008456796.2 TCGCCATCGCCGGAGC----TCAGTCTCCCTCCAGTCCTCCCACCGCCACCCCTTCTCCTCCCACCACCTCCGCCCCTCCTCCCGCCTCTACTCC 143 ABK95485.1 TTGCTGGTGTTTCTGG----TCAAGCACCAGCAACGTCACCAACAGCAGCACCAGCACCACCCACA------CCAACTTCTTCTCC 125 AT4G13340.1 GTGTCGCCTCGTCCTCCGGTTGTAACGCCGTTACCACCGCCTTCTTTGCCATCTCC------GCCTCCACCTGCGCCAATTTT 1250 PGSC0003DMG400010093 ACAGCAGCGGCGCCACCACCTTCGGCGCCTCCTCCAAAGCCGTCAGTGTCACCACCACCACCTTCTCCTAAGGCTCCTCCACCAGCTAATTCTCC 329 DCAR_026603 GCGCCGCCACCAAAGCCTGAACCACCTCCTGCACCTCCTCCTCCTCCGCCACCTAA------GCCGCCTCCTGCGCCTGCACC 125
Ms1-3'_portion T------CGCCGCAAGCAGCCAGCGCACGAC-GCACCACCGCCGCCACCGCCGTCGAGCG------AGAAGCCGTCGTCCCCGCCGCCGTCCCAG 174 ERN03927.1 TTCG-CCCCCACCTCCGCCAATTTCATTGCCACCACCACAGCCAATTTTGTCGCCACCGCCGCCACCAATTTTGTCGCCACCACCGGCTCCGCTA 1512 Aco000634.1 GGAGGAACACCCCGAACCCCAATCCAACGGCCACGCCTACCCAGCCGGCGTCGACGATCCTCCCCCGCCGTCCGATGACGACGCCGACGACCCCG 250 XM_008456796.2 TCCC-CCTGTTTCATCTCCCCCTCCAGCAGCAA------CTCCCCCTCCAGCTGCTACTCCTCCTCCAGCATCCCCACCACCGGCGTCTCCACCT 231 ABK95485.1 ACCG-CCAGCAACCACTCCTCCACCAGTTTCAG------CCCCACCTCC---TGTTACCCAATCTCCA------CCTCCAGCTACCCCTCCTCCA 204 AT4G13340.1 CTCA-ACACCTCCTACGCTTACTTCCCCACCACCTCCGTCACCGCCTCCGCCTGTTTATTCTCCCCCTCCTCCACCGCCACCACCTCCTCCGGTA 1344 PGSC0003DMG400010093 CCCT-CCAGCTTCATCTCCACCCCCACCATCAAAAGATTCTCCTCCTCCTGCTCCTCCTCCTTCCCCACCTCCTCCCCCTCCAGCAGTTTCACCT 423 DCAR_026603 AGCT-CCGAAGCCTACTCCAAATCCAGCACCTGCACCTCCACCAACACCACCGCCCCCTCCTCCACCGAATCCA---CCACCTCCTCCTCCACCT 216
Ms1-3'_portion --GACCACGACGGCGCCGCCCCCCGCGCCAAGGCCGCGCCCGCCCAGGCGGCCACCTCCACGCTCGCGCCCGCCGCCGCCGCCACCGCCCCGCCG 267 ERN03927.1 TATTCACCACCTCCGCCATCTCTGCCACCTCCTCTATATTCACCACCACCACCTTCGCCTCCCTCGCCATCTCCACCACCATCTCCTCTATACTC 1607 Aco000634.1 ACGACGACGACGACGACGACGACTCCGCCCCGGCGAGGAAGAAGCAGAAGCCCCTCTCCGCCTTCGCCTCCGCCGCCGCCGCCGCAGCCCCTCCT 345 XM_008456796.2 CCCGCATCCCCACCACCAGCGACTCCACCTCCGGCTTCCCCACCACCGGCATCTCCTCCTCCGGCCTCCCCACCACCGG---CTTCTCCTCCTCC 323 ABK95485.1 GTTTCAGCCCCACCACCTGCCACCCCTCCTCCCGCAACCCCACCACCAGCAACTCCTCCTCCCGCCACCCCACCACCAG---CAACCCCACCACC 296 AT4G13340.1 TATTCTCCTCCACCACCACCGCCCCCACCGCCTC------CTCCGCCAGTATATTCTCCTCCACCACCACCACCGCCCCCACCGCCTCC---TCC 1430 PGSC0003DMG400010093 TCTCCTCCACCACCAGTGAAAAATCAACCACCACCACCTGATTCTCCACCTCCTGCACCTGTTGCAAATCCGCCACAAAACTCCCCTCCACCTCC 518 DCAR_026603 AGTCCACCTCCAGCACCCCCGCCTGCGCCTGCTCCCCCTCCAAAGCCAGCTCCTGCACCACCACCAGCTCCACCTCCCCCACCAAGTCCCCCTCC 311
Ms1-3'_portion CCCCAGGCGCCGCACTCCGCCGCGCCCACGGCGCCGTCCAAGGCGGCCTTCTTCTTCGTCGCCACG---GCCATGCTCGGCCTCTACATCATCCT 359 ERN03927.1 GCCGCCTCCGCCTCCGCCTCCGCCTCCGCCACCTTCCCCTCCGCCACCTCCCCCTCCCCCTCCACCTCCACCTCCACCTCCACCATCCCCCCCAC 1702 Aco000634.1 CTCCCCGCTCCTCCCTCCGCCTCCGCCTCGGCCTCGAAGAAGCCGAAGAAGAAGAGCAACAACGTGTGGACCAAGTCCACCTCCCGCAAGGGCAA 440 XM_008456796.2 GGCATCTCCTCCACCAGCATCCCCTCCCCCAGCGATTCCACCGCCTGCACCATTGGCATCACCACCAACGGCAGTGCCAGCTCCTGCACCGAGCA 418 ABK95485.1 TGCTACTCCTCCACCAGCAACCCCTCCTCCCGCTGTTCCTCCACCAGCTCCATTGGCAGCTCCACCAGCTCTTGTTCCAGCTCCAGCTCCCAGCA 391 AT4G13340.1 GCCAGTATATTCTCCGCCACCACCATCGCCGCCTCCACCGCCTCCGCCAGTCTACTCTCCCCCACCACCACCACCGCCTCCACC---TCCTCCGC 1522 PGSC0003DMG400010093 TGCATTGGCTCCCCCTCCAGCCTCACTGCCATCTGCCCCTCCACCTAACCTCTTAACATCTCCACCCCCTTCTATTTCACCTCCTGCTCCCCCAA 613 DCAR_026603 TGCACCACCTCCGGCTCCTCCACCAAGTCCACCACCTGCTCCTGCCCCTCCACCTAATCCTCCACCACCTCCTGCACCTCCACCAAGTCCTCCAC 406
32 Ms1-3'_portion CTGA------363 ERN03927.1 CACCTTCGCCACCCCCGCCATCCCCACCACCACCTTCACCATCTCCCCCACCCCCTTCGCCACCCCCACCA-TCTCCCGCACCACCTTCGCCACC 1796 Aco000634.1 GAAGAAGTCCAAGCCCTCCCCCCACGCCCCCGCCCCGGAGGACACCGTCCTCATGACCCCGATGCCCCGCGGCTTCCCTGACCGCTCCGACGACT 535 XM_008456796.2 AGAAGAAGGTGAAGGCAGCAGCTCCGGGTCCGGCTCCAGTTTCGAGCCCGCC---AGCGCCGTCAGTGGAGGCTCCAGGACCTGCAGGCCCTGAT 510 ABK95485.1 AGCCTAAGTTGAAGTCTCCAGCTCCATCTC---CCCTGGCATTGAGTCCTCC---ATCTCCACCAACTGGCGCTCCTGCTCCAAGTTTGGGTGCT 480 AT4G13340.1 CGGTATACTCTCCTCCGCCTCCGCCAGTATACTCTTCTCCACCTCCTCCGCCTTCTCCAGCACCAACTCCAGTTTATTGCACC-CGTCCACCACC 1616 PGSC0003DMG400010093 ATAATACTTCTCCAGCTGGAGCTCCCCCTCCATTACCTGTGACTCGCCTTCCTACAGAGAAGCCCACTGCTATCCCTAAACCTGCTATCACTGCA 708 DCAR_026603 CACCTCCAGCTCCTCCACCGAGTCCACCACCACCTCCTG---CCCCTCCACCGAGTCCACCACCACCTCCTGCCCCTC-CACCGAGTCCACCACC 497
Ms1-3'_portion ------ERN03927.1 CCCACCATCTCCCCCACCACCATCGCCACCCCCGCCATCCCCCCCACC--ACCATC-GCCACCACCGCCAAT--T-TTGTCCCCACCACCTCCGG 1885 Aco000634.1 CCCCCGACGCCCGCATCTGCCTCTCCCGCATCTACAAGGCCGAGAAGGTCGAGCTCAGCGACGACCGCCTCGCCGCGGGGAGCACCAAAGGCTAC 630 XM_008456796.2 CAATC---TCCTACCCCATCTCAGAACGACAATAGTGGA---GTGGAGAAAGTTTG-GAGAAAGGAGAGTAT--GGTGGGGAGCATAGTGATTG- 595 ABK95485.1 TCCTC---TCCTGGACCCGCTGGAACCGATATGAGTGGA---GTAGAGAAGATGGG-GTCCGTGCAGAAGAT--GGTCCTGAGCCTGGTCTTTG- 565 AT4G13340.1 CCCACCACCTCACTCGCCGCCACCACCACAATTTTCTCCTCCACCACCTGAACCTT-ACTACTACAGCTCAC--CACCACCACCGCATTCTTCAC 1708 PGSC0003DMG400010093 GATTCAAGTGCCAGAAATGGTGGGGGAAATAAGACAGGAAGTGTGGCGGCAATTGGTGTTGTTGCTGGGTTTTTGGCCCTTAGCTTGGTCATTGT 803 DCAR_026603 CCCTCCTGCCCCTCCACCGAGTCCACCACCACCTCCTGCCCCTCCACC--GAGTCC-ACCACCACCTCCTGC--C-CCTCCACCGAGTCCTCCA- 585
Ms1-3'_portion ------ERN03927.1 ----TATATTCGCCACCACCTCCTCCGCTAAATTCACCGCCACCTCCAATTTCACCACCTTATTGTATAAGGTCTCCCCCACCACCTCCGCCAAA 1976 Aco000634.1 CG--CATGGTCCGCGCCACCCGCGGCGTGATGGACGGCGCCTGGTTCTTCGAGATCAGGATCGTGCGGCTCGGGGAGACGGGCCACACCAGGCTT 723 XM_008456796.2 ----GAATGGGATATGTATTTTTGATGCTTTAGGGAGAAGAAAATTAAAGGGTTTCCTTTTGCTGTCTAT--GTGTTTGCCTCTTTTTTTTTCTT 684 ABK95485.1 ----GATCAGCAT---TCTGGTTGCTAACTTAG------591 AT4G13340.1 ----CGCCGCCGCATTCACCTCCACCACCACATTCACCTCCCCCACCGATTTATCCATATCTGTCTCCA---CCGCCCCCACCAACACCAGTTTC 1796 PGSC0003DMG400010093 TGCTGTCTGGTTTACACGCAGGCGAAAGAAAAGAGAGAGTGCATTTAATCTCAATTACCTGGGACCCTCT--CCATTTGCTTCCTCACCAAATTC 896 DCAR_026603 ------CCACCTCCAGCTCCTCCACCGAGTCCACCACCACCTCCTGCCCCTCCACCGAGTCCTCCA---CCACCTCCAGCTCCTCCACCAAG 668
Ms1-3'_portion ------ERN03927.1 TTCACCACCACCACCACCGCCTCACTATCCACCTCCCCCTTCT-CACCATCCACCACCGCCTCCTCACTATCCACCACCGCATCACTATCCACCA 2070 Aco000634.1 GGGTGGACCACCGACAAGGGCGATCTACAGGCGCCCGTCGGGTACGACGGCCATAGCTACGGCTACAGAGATATCGATGGGACTAAGATTCACAA 818 XM_008456796.2 CATTATTTCCTTTCTTCTTGGGATGGCTCTATATTTCATTTCA-TTTTGATTATTATTATTAA------746 ABK95485.1 ------AT4G13340.1 TTCTCCACCACCCACTCCGGTCTATTCCCCTCCTCCCCCACCT-CCTTGTATAGAACCACCACCAC---CTCCACCGTGTATAGAGTATTCACCT 1887 PGSC0003DMG400010093 AGATACATCATTCCTGAGATCAAGGTCTCAACATTCCACTTAT-CTAGCTCCAACCGGCTCACAAAGCAATTTTATGTACTCTCCAGACCATGGA 990 DCAR_026603 CCCTCCCCCAGCACCTCCACCTAGTCCACCCCCGCCTCCTGCC-CCTCCACCAAGCCCTCCTCCAG---CACCTCCACCTAGTCCACCGCCGCCT 759
Ms1-3'_portion ------ERN03927.1 CCTCCTTCTCACCATCCACCA-CCGCCTCCTCACTATCCACCACCCCCACCACATGTGCATTCTCCACCACCACCGTCACCAGTGTATAGCCCAC 2164 Aco000634.1 GGCCTTGAGGGACAAGTATGGGGAGGAGGCCTACACAGAGGGGGATGTGATTGGGTGTTATATTAGCCTCCCCGATGGGGAGGCGTATGCGCCGA 913 XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 CCTCCTCCTC------CACCA-GTCGTTCATTATAGCTCTCCGCCTCCACCGCCAGTCTACTACAGCTCTCCGCCACCTCCACC----AGTCTAT 1971 PGSC0003DMG400010093 GGTATTGGAAATTCAAGATCATGGTTCACTTACGAAGAATTATCTGAGGCAACAAATGGTTTTTCTCCTGATAGTGTTTTGGGTGAAGGAGGGTT 1085 DCAR_026603 CCTGCCCCTC------CGCCAAGCCCTCCCCCANCCACCACCACCACCAGCACCTCCACCAAGCCCTCCTCCTGCACCCCCACCA---AGTCCAC 845
Ms1-3'_portion ------ERN03927.1 CGCCGCC-CATTCACCTTTCACCACCACCACCGTATTACTATGAGTCCCCACCACCACCTCAACCTGTGTATTCTCCTCCTCCACCTTGCATAGA 2258 Aco000634.1 AGCCGCCGCACCTGATTTGGTACAAAGGGCAGAGGTACGTCTACTCGGCCGACGGCAAGGATGAACCGCCCAAGGTAGTGCCTGGGAGTGAGATA 1008 XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 TACAGCT-C-TCCACCT----CCGCCACCGCCGGTTCATTACAGCTCTCCGCCACCACCAGAA---GTCCATTACCATTCTCCGCCT------2049 PGSC0003DMG400010093 TGGATGTGTTTACAAAGGTGTTCTTAATGACGGAAGAGAAGTCGCTGTCAAACAGCTGAAAAGTGGAAGTGGACAAGGGGAGCGGGA---ATTCA 1177 DCAR_026603 CACCACCACCTGCACCT----CCACCAA-GCCCTCCTCCTGCA--CCCCCACCAAGTCCACCACCACCACCTGCACCTCCACCAAGC------925
Ms1-3'_portion ------ERN03927.1 ACCACCACCTCCTCCTCAACCTTGCATTGAACCACCGCCACCACCTACTCCTAGCTATTTGCCAACCCCATCTCCATCACCACCACCGCCACCAA 2353 Aco000634.1 TCTTTCTTCAAGAACGGGATATGCCAAGGTGTCGCCTTCACGGACCTTTTC-----GGTGGACGATACTATCCCGCGGCGTCCATGTACACACTT 1098 XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ----CCATCTCCAGTACACTACAGCTCTCCACCACCGCCACCATCAGCTCC-----ATGTGAAGA----ATCTCCTCCACCAGCACCGGTA---G 2128 PGSC0003DMG400010093 GAGCAGAAGTTGAGATTATCAGCCGTGTGCACCATCGCCATTTGGTTTCACTTGTTGGTTACTGTATCTCAGAGCAGCAAAGGTTACTTGTCTAC 1272 DCAR_026603 ----CCTCCTCCTGCACCCCCACCAAGTCCACCACCACCACCTGCACCTCC------ACCAAGCCCTCCTCCTGCACCCCCACCAA-----G 1002
Ms1-3'_portion ------ERN03927.1 TCCATTATAACTCCCCTCCTCCACC--TTCACCACCGCCACCAATCTATTATAGCCCACCTCCACCACCACATTACAGTTCACCCCCACCTCCAA 2446 Aco000634.1 CCTAACCAGCCCAACTGCGAGGTTCGGTTCAACTTCGGGCCTGATTTCGAGTTTTTTCCGCAAGATTTTGGTGGCCGTCCGACCCCCCGACCGAT 1193 XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 TTCACCACAGTCCACCACCGCCCATGGTTCACCACAGCCCACCACCTCCAGTGATCCACCAAAGCC--CAC------CACCGCCATCTCCTG 2212 PGSC0003DMG400010093 GACTATGTGCCAAATGACACGCTTGACTATCACCTTCATGGTAAAGGCATGCAAACTATGGATTGGGCTACCCGAGTAAAAGTAGCTGCTGGTGC 1367 DCAR_026603 TCCACCAC----CACCACCTGCACC--TCCACCACCTGCACCTCCACCAAGCCCACCACCCGCACCTCCACCA---AGTCCACCCCCACCACCTA 1088
Ms1-3'_portion ------ERN03927.1 TTCATCATAGTCCACCACCTCCAATTCACTATAGTTCACCCCCACCACCTCCACCAGTTCCTTGCAATTCTCCACCCCCACCACCACCGATGAGC 2541 Aco000634.1 GATCGAGGTTCCTTATCACGGCTATGATTGTAAGATTGATGGGCCTGCTGAAAATGGCGTTGCAGAGAAAACTAGTTAA------1272 XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 AATATGAAGGACCACTACCACCGGTCATCGGCGTATCATACGCATCTCCTCCACCACCGCCGTTCTATTGA------2283 PGSC0003DMG400010093 AGCACGTGGACTTGCTTATCTTCATGAAGACTGTCATCCCCGCATTATCCATAGGGATATCAAAACATCAAACATTCTCTTGGATATCAATTTTG 1462 DCAR_026603 --CACCACCTCCCAAC-CCTCCACCAAACCCCCCACCTCTCCCAT-GCTTAAACCCTTTCCGATAAATCCCATGCCCTAA------1164
33 Ms1-3'_portion ------ERN03927.1 GAATTGCCACCTCCTTATATGGGGCCATTGCCACCAGTTACAGCGATTTCTTATAGCTCGCCACCTCCACCGCCTTATTATTGA------2625 Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------PGSC0003DMG400010093 AGGCACAGGTTGCTGATTTTGGCCTTGCAAGGTTAGCAGGTGATGCCAGTAGTACACACGTGACAACTCGTGTGATGGGAACCTTTGGATACTTG 1557 DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 ------Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------PGSC0003DMG400010093 GCACCAGAGTATGCATCTAGTGGAAAATTAACAGAGAAGTCTGATGTTTATTCATATGGCGTTGTGCTTTTGGAGCTTATTACGGGACGGAAACC 1652 DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 ------Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------PGSC0003DMG400010093 TGTTGACCAGTCTCAACCCTTAGGTGATGAAAGCCTGGTTGAATGGGCTCGACCTTTGCTTGCTCAAGCACTTGAGACTGAAAATTTTGAAAATG 1747 DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 ------Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------PGSC0003DMG400010093 TAGTAGATCCTAGGCTTGGAAACAACTTTGTTGCGGGTGAGATGTTCCGGATGATTGAAGCAGCTGCAGCTTGCGTTCGTCATTCAGGCTCTAAG 1842 DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 ------Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------PGSC0003DMG400010093 AGGCCACGGATGAGTCAGGTGGTTAGAGCTCTAGATTCCATGGATGAGCTGTCGGATCTGTCCAATGGAGTGAAACCTGGACAAAGTGGAATTTT 1937 DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 ------Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------PGSC0003DMG400010093 TGAGTCAAGGGAACAATCTGCACAGATAAGAATGTTTCAAAAGATGGCATTTGGAAGTCAAGAGTACAGTTCAGATTTCTTCAATTACTCCCAAG 2032 DCAR_026603 ------
Ms1-3'_portion ------ERN03927.1 ------Aco000634.1 ------XM_008456796.2 ------ABK95485.1 ------AT4G13340.1 ------PGSC0003DMG400010093 GCAGCTATAAAAGTTGA 2049 DCAR_026603 ------
ERN03927.1 (ENA), Amborella trichopoda, Amborellaceae, Amborellales
Aco000634.1 (JGI), Ananas comosus, Bromeliaceae, Poales
XM_008456796.2 (NCBI), Cucumis melo, Cucurbitaceae, Cucurbitales
ABK95485.1 (ENA), Populus trichocarpa, Salicaceae, Malpighiales
AT4G13340.1 (Araport11), Arabidopsis thaliana, Brassicaceae, Brassicales
PGSC0003DMT400026193 (JGI), Solanum tuberosum, Solanaceae, Solanales
DCAR_026603 (JGI), Daucus carota, Apiaceae, Apiales
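Table S9. Primers used for the molecular cloning of Ms1 orthologues, RT-PCR, qRT-PCR and BSP
Primer name | Sequence (5'→3') | Application
TaACTIN-F | TCAGCCATACTGTGCCAATC | RT-PCR in AABBDD, AABB and AA
TaACTIN-R | CTTCATGCTGCTTGGTGC | RT-PCR in AABBDD, AABB and AA
AetACTIN-R | CTTCATGCTGCTTGGGGC | RT-PCR in DD with TaACTIN-F
TaACTIN-QF1 | TTCCAGCCATCTTTCATTG | Q-PCR in AABBDD, AABB and AA with TaACTIN-R; Q-PCR in DD with AetACTIN-R
Ms1-QF | ACATCATCCTCTGAGTCGCG | RT-PCR and Q-PCR for Ms1 in AABBDD and AABB
Ms1-QR | GACCACGCAAACACGTACG | RT-PCR and Q-PCR for Ms1 in AABBDD and AABB
Ms-D1-QF | GCCTCTACATCATCCTCTGAGTG | RT-PCR and Q-PCR for Ms-D1 in AABBDD and DD
Ms-D1-QR | ATACTCCTGCCAACGACAG | RT-PCR and Q-PCR for Ms-D1 in AABBDD and DD
Ms-A1-QF | ACATCATCCTCTGAGTCGCC | RT-PCR and Q-PCR for Ms-A1 in AABBDD, AABB and AA
Ms-A1-QR | CTACCAGGACGCTACGATC | RT-PCR and Q-PCR for Ms-A1 in AABBDD, AABB and AA
Ms1-BSPF1 | AAAATTYGGAAAYGGAAAAG | BSP for Ms1 in AABBDD and AABB, first round
Ms1-BSPR1 | CRRRATCTCTCCATCRTCRC | BSP for Ms1 in AABBDD and AABB, first round
Ms1-BSPF2 | YTTTYTYGYATYYYGAGGY | BSP for Ms1 in AABBDD and AABB, second round
Ms1-BSPR2 | TRRACRCTAARCCAARCCC | BSP for Ms1 in AABBDD and AABB, second round
Ms-D1-BSPF1 | AGGAGAGGCGGTTAYGYG | BSP for Ms-D1 in AABBDD and DD, first round
Ms-D1-BSPR1 | RRCRRRRCRRTCTCTCCC | BSP for Ms-D1 in AABBDD and DD, first round
Ms-D1-BSPF2 | GYATYYGGGYYGTYYGAT | BSP for Ms-D1 in AABBDD and DD, second round
Ms-D1-BSPR2 | TCTCCCTCTCTCRCTCRR | BSP for Ms-D1 in AABBDD and DD, second round
Ms-A1-BSPF1 | TYAAAAATYGAAAAYGGAAAAY | BSP for Ms-A1 in AABBDD, AABB and AA, first round
Ms-A1-BSPR1 | ATCTCTCCATCRRCRRRRTC | BSP for Ms-A1 in AABBDD, AABB and AA, first round
Ms-A1-BSPF2 | ATYYYGAGGAGAGGYGGTTAG | BSP for Ms-A1 in AABBDD and AABB, second round
Ms-A1-BSPR2 | CRRRRTRRTCTCTCTCTCC | BSP for Ms-A1 in AABBDD and AABB, second round
Ms-A1-BSPF3 | ATYYYGAGGAGAGGYGGTTA | BSP for Ms-A1 in AA, second round
Ms-A1-BSPR3 | CRRRRCRRTCTCTCTCTCC | BSP for Ms-A1 in AA, second round
Ms-A1-ProF | TTCTTGAGAACCACCTTGTTCG | To confirm the TtMs1-A promoter sequence
Ms-A1-ProR | CCATGGAACACTACGTACTAGGC | To confirm the TtMs1-A promoter sequence
Ms-A1-GBF | TCCGGCATTCCATTTCCGTC | To confirm the TtMs-A1 gene body sequence
Ms-A1-GBR | CCCACCGTCTTCTTCTCAATCG | To confirm the TtMs-A1 gene body sequence
Ms-A1-TerF | CTCTACATCATCCTCTGAGTCGC | To confirm the TtMs-A1 terminator sequence
Ms-A1-TerR | CATCACATCATTAGCAGAAAC | To confirm the TtMs-A1 terminator sequence
Ms1-B-ProF | GCACTAGTTCTTTACTATACTCAAGCC | To confirm the TtMs1 promoter sequence
Ms1-B-ProR | GAGCACTTCTAGCGAGTCAAGAAGG | To confirm the TtMs1 promoter sequence
Ms1-B-GBF | CACGCCACCTCCGGCTATATAAG | To confirm the TtMs1 gene body and terminator sequence
Ms1-C-GBR | AACGCAAAACTTGATCCATTTC | To confirm the TtMs1 gene body and terminator sequence
LgMS1-DF | CWCCCACCTCCTCGCGCTCTAC | Degenerate primer for the Lophatherum gracile MS1 genomic sequence
LgMS1-DR | GCCATTTCGTGMGGGCCAAGA | Degenerate primer for the Lophatherum gracile MS1 genomic sequence
LgMS1-GWR1 | ACGATTCGAATCCCATATG | Genome walking primer for the Lophatherum gracile MS1 genomic sequence
LgMS1-GWR2 | AGAGAAAGACGAGAACCTGC | Genome walking primer for the Lophatherum gracile MS1 genomic sequence
LgMS1-GWR3 | TGTGTGAGGTACCTTGGCAG | Genome walking primer for the Lophatherum gracile MS1 genomic sequence
LgMS1-GWF1 | AGCAGAAAGCAGATGGTTGC | Genome walking primer for the Lophatherum gracile MS1 genomic sequence
LgMS1-GWF2 | CTAGCGTGGTTGGAACGC | Genome walking primer for the Lophatherum gracile MS1 genomic sequence
LgMS1-GWF3 | TTTTGAGATTCCGGCAGAAC | Genome walking primer for the Lophatherum gracile MS1 genomic sequence
PaMS1-DF | CTCGCCCGCCTCAACGCC | Degenerate primer for the Phragmites australis MS1 genomic sequence
PaMS1-DR | GCCATTTCGTGMGGGCCAAGA | Degenerate primer for the Phragmites australis MS1 genomic sequence
PaMS1-GWR1 | AAGCACACGAACGAATGATTC | Genome walking primer for the Phragmites australis MS1 genomic sequence
PaMS1-GWR2 | TGCTGCTGGATCTTGCAAAG | Genome walking primer for the Phragmites australis MS1 genomic sequence
PaMS1-GWR3 | ACAACCACTTTGAAAACCAG | Genome walking primer for the Phragmites australis MS1 genomic sequence
PaMS1-GWF1 | GTTTCATTTCGTGTATCTTCCGG | Genome walking primer for the Phragmites australis MS1 genomic sequence
PaMS1-GWF2 | AGCCGGTCGTACGTACTTG | Genome walking primer for the Phragmites australis MS1 genomic sequence
PaMS1-GWF3 | CGGTTTGGTTCTCATCGG | Genome walking primer for the Phragmites australis MS1 genomic sequence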
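Several primers in Table S9 contain IUPAC ambiguity codes (Y, R, W, M). In the BSP primers, Y (C or T) and R (A or G) presumably cover positions whose base depends on cytosine methylation after bisulfite treatment; that reading is an assumption based on common practice, not something the table states. The sketch below expands a degenerate primer into the concrete oligonucleotides it represents. The function names are hypothetical; the lookup table is the standard IUPAC nucleotide code.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # LgMS1-DR from Table S9 has a single M (A or C) position
    for seq in expand("GCCATTTCGTGMGGGCCAAGA"):
        print(seq)
    # Ms1-BSPF1 from Table S9 has two Y positions
    print(degeneracy("AAAATTYGGAAAYGGAAAAG"))
```

Calling `degeneracy` before `expand` is advisable for primers such as Ms-D1-BSPR1, where many ambiguous positions make the expanded list large.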
Table S10. Primers used to prepare the constructs in this study
Primer | Sequence | Application
Ms1-ISH-F | aagcttCGAGCGAGGGAGAGAGAGACC | In situ hybridization probe
Ms1-ISH-R | gaattcGATCACATAGCATCAGTGGTTC | In situ hybridization probe
Ms1-SB-F1 | CGACATACACGGAGCGATCTATG | Southern blotting probe
Ms1-SB-R1 | GGCAGAGGCGACGCACTG | Southern blotting probe
Ms1-SB-F2 | TCTACAGCTCCTGCGGCGGCCTCCGC | Southern blotting probe
Ms1-SB-R2 | CCAAAAGCACGGCCAGCTCTTGCCG | Southern blotting probe
Ms1 genomic DNA-F1 | acgacggccagtgcc aagctt CACCTAGTTGCATATCTAGTGAACCC | For the pAHC20-Ms1p::Ms1 construct
Ms1 genomic DNA-R1 | gcactgcaggcatgc aagctt acgcgt GGTCTCTCTCTCCCTCGCTCGC | For the pAHC20-Ms1p::Ms1 construct
Ms1 genomic DNA-F2 | AGCGAGGGCGGCGCGCCCGGGGCTTGGCTTAGCGTCCACGC | For the pAHC20-Ms1p::Ms1 construct
Ms1 genomic DNA-R2 | caggcatgc aagctt acgcgt ATAGCAGAATGGAAGCTACAAACAGC | For the pAHC20-Ms1p::Ms1 construct
Ms1p-F | gccagtgcc aagctt cctgcagg CACCTAGTTGCATATCTAGTGAACCC | For the pAHC20-Ms1p::Ms-A1 construct
Ms1p-R | CCTCAGATCTACCAT aggcct CGTCGCGGCGGGGCGGTCTC | For the pAHC20-Ms1p::Ms-A1 construct
Ms-A1 gene body-F | acgacggccagtgcc aagctt cctgcagg aggcct ATGGAGAGATCCCGCCGCC | For the pAHC20-Ms1p::Ms-A1 construct
Ms-A1-gene body-R | GCGGGGTCGGCGCGCGACTCAGAGGATGATGTAGAGGCCG | For the pAHC20-Ms1p::Ms-A1 construct
Ms1 ter-F | CGGCCTCTACATCATCCTCTGAGTCGCGCGCCGACCCCGC | For the pAHC20-Ms1p::Ms-A1 construct
Ms1ter-R | gcactgcaggcatgc aagctt ATAGCAGAATGGAAGCTACAAACAGC | For the pAHC20-Ms1p::Ms-A1 construct
Ms1 CDS-F1 | cgggggacgagctcgggtaccATGGAGAGATCCCGCGGGC | For the p35S::Ms1-GFP, p35S::Ms1SP-GFP, p35S::Ms1△TM-GFP or p35S::Ms1△SP-GFP construct
Ms1 CDS-F2 | cgggggacgagctcgggtaccATGCAGCCGGGGGCGCCGTGC | For the p35S::Ms1-GFP, p35S::Ms1SP-GFP, p35S::Ms1△TM-GFP or p35S::Ms1△SP-GFP construct
Ms1 CDS-R1 | ggtgtcgactctagaggatccGAGGATGATGTAGAGGCCGAGCAT | For the p35S::Ms1-GFP, p35S::Ms1SP-GFP, p35S::Ms1△TM-GFP or p35S::Ms1△SP-GFP construct
Ms1 CDS-R2 | ggtgtcgactctagaggatccGGCCGCCTTGGACGGCGC | For the p35S::Ms1-GFP, p35S::Ms1SP-GFP, p35S::Ms1△TM-GFP or p35S::Ms1△SP-GFP construct
Ms1 CDS-R2 | ggtgtcgactctagaggatccCGCCGCCGCCGCCGGCAG | For the p35S::Ms1-GFP, p35S::Ms1SP-GFP, p35S::Ms1△TM-GFP or p35S::Ms1△SP-GFP construct
PTS1-mCherry-F | cggggatcctctaga gtcgac ATGGTGTCCAAGGGCGAG | For the PTS1-mCherry marker construct
PTS1-mCherry-R | atacgaacgaaagct ctgcag TCA GAGTTTTGA CTTGTACAGCTCGTCCATGC | For the PTS1-mCherry marker construct
GmMan1-mCherry-F1 | cggggatcctctaga gtcgac ATGGCGAGAGGGAGCAGATC | For the GmMan1-mCherry marker construct
GmMan1-mCherry-R1 | CTCGCCCTTGGACACCATAGTTTGACGGTCCCAGAAAAC | For the GmMan1-mCherry marker construct
GmMan1-mCherry-F2 | GTTTTCTGGGACCGTCAAACTATGGTGTCCAAGGGCGAG | For the GmMan1-mCherry marker construct
GmMan1-mCherry-R2 | atacgaacgaaagct ctgcag TCACTTGTACAGCTCGTCCATGC | For the GmMan1-mCherry marker construct
pFAγ-mCherry-F1 | cggggatcctctaga gtcgac ATGGCAATGGCTGTTTTCCG | For the pFAγ-mCherry marker construct
pFAγ-mCherry-R1 | CTCGCCCTTGGACACCATGTTCTTAACACTCTTCATGCGGTTAC | For the pFAγ-mCherry marker construct
pFAγ-mCherry-F2 | GTAACCGCATGAAGAGTGTTAAGAACATGGTGTCCAAGGGCGAG | For the pFAγ-mCherry marker construct
pFAγ-mCherry-R2 | atacgaacgaaagct ctgcag TCACTTGTACAGCTCGTCCATGC | For the pFAγ-mCherry marker construct
WxTP-mCherry-F1 | cggggatcctctaga gtcgac ATGTCGGCTCTCACCACG | For the WxTP-mCherry marker construct
WxTP-mCherry-R1 | CTCGCCCTTGGACACCATGGCAGGGGGGAGGCCACC | For the WxTP-mCherry marker construct
WxTP-mCherry-F2 | GGTGGCCTCCCCCCTGCCATGGTGTCCAAGGGCGAG | For the WxTP-mCherry marker construct
WxTP-mCherry-R2 | atacgaacgaaagct ctgcag TCACTTGTACAGCTCGTCCATGC | For the WxTP-mCherry marker construct
SP-mCherry-HDEL-F1 | cggggatcctctaga gtcgac ATGAAGGTACAGGAGGGTTTGTTC | For the SP-mCherry-HDEL marker construct
SP-mCherry-HDEL-R1 | CTCGCCCTTGGACACCATGCACTCCTTGCGAGGTTGC | For the SP-mCherry-HDEL marker construct
SP-mCherry-HDEL-F2 | GCAACCTCGCAAGGAGTGCATGGTGTCCAAGGGCGAG | For the SP-mCherry-HDEL marker construct
SP-mCherry-HDEL-R2 | atacgaacgaaagct ctgcag TCACAGCTCGTCATGCTTGTACAGCTCGTCCATGC | For the SP-mCherry-HDEL marker construct
Ms1s-His-F | gagggaaggatttca gaattc CAGCCGGGGGCGCCGTGC | For the MBP-Ms1s-His construct
Ms1s-His-R | cagtgccaagcttgc ctgcag TCAGTGATGATGATGATGATGGGCCGCCTTGGACGGCG | For the MBP-Ms1s-His construct
His-F | aattc CATCATCATCATCATCACTGA ctgca | For the MBP-His construct
His-R | g TCAGTGATGATGATGATGATG g | For the MBP-His construct
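Many primers in Table S10 carry lowercase 5' tails that appear to contain vector-homology and restriction-site sequences, although the table does not name the enzymes. The sketch below scans only the lowercase portions of a primer for a small set of well-known recognition sites. The enzyme list is an illustrative assumption rather than a statement of which enzymes were used for cloning, and the function name is hypothetical.

```python
import re

# Recognition sequences of common restriction enzymes (5'→3')
SITES = {
    "HindIII": "AAGCTT", "EcoRI": "GAATTC", "BamHI": "GGATCC",
    "SalI": "GTCGAC", "PstI": "CTGCAG", "KpnI": "GGTACC",
    "MluI": "ACGCGT", "SbfI": "CCTGCAGG", "StuI": "AGGCCT",
    "XbaI": "TCTAGA", "SacI": "GAGCTC",
}


def find_sites(primer: str) -> list[str]:
    """Return enzymes whose site occurs in a lowercase run of the primer."""
    compact = primer.replace(" ", "")  # spaces in the table are presentational
    found = set()
    for run in re.findall(r"[a-z]+", compact):
        tail = run.upper()
        for enzyme, site in SITES.items():
            if site in tail:
                found.add(enzyme)
    return sorted(found)


if __name__ == "__main__":
    # Ms1-ISH-F and Ms1 CDS-R1 as listed in Table S10
    print(find_sites("aagcttCGAGCGAGGGAGAGAGAGACC"))
    print(find_sites("ggtgtcgactctagaggatccGAGGATGATGTAGAGGCCGAGCAT"))
```

Because the SbfI site contains the PstI site, a tail carrying `cctgcagg` is reported for both enzymes. A match shows only that the sequence is present in the tail, not that the site was used in the cloning strategy.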
References
1. Shitsukawa N et al. (2007) Genetic and epigenetic alteration among three homoeologous genes of a class E MADS box gene in hexaploid wheat. Plant Cell 19:1723–1737.
2. Franckowiak JD, Maan SS, Williams ND (1976) A proposal for hybrid wheat utilizing Aegilops squarrosa L. cytoplasm. Crop Science 16:725–728.
3. Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.
4. International Wheat Genome Sequencing Consortium (2014) A chromosome-based draft sequence of the hexaploid bread wheat (Triticum aestivum) genome. Science 345:1251788.
5. Kim D et al. (2013) TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 14:R36.
6. Li H et al.; 1000 Genome Project Data Processing Subgroup (2009) The sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25:2078–2079.
7. Clavijo BJ et al. (2017) An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations. Genome Res 27:885–896.
8. Chapman JA et al. (2015) A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome. Genome Biol 16:26. doi:10.1186/s13059-015-0582-8.
9. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nature Methods 9:357–359.
10. Abe A et al. (2012) Genome sequencing reveals agronomically important loci in rice using MutMap. Nat Biotechnol 30:174–178.
11. Barrett JC, Fry B, Maller J, Daly MJ (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21:263–265.
12. Gruntman E et al. (2008) Kismeth: analyzer of plant methylation states through bisulfite sequencing. BMC Bioinformatics 9:371.
13. Nelson BK, Cai X, Nebenführ A (2007) A multicolored set of in vivo organelle markers for co-localization studies in Arabidopsis and other plants. Plant J 51:1126–1136.
14. Lee S et al. (2012) Mitochondrial targeting of the Arabidopsis F1-ATPase γ-subunit via multiple compensatory and synergistic presequence motifs. Plant Cell 24:5037–5057.
15. Kitajima A et al. (2009) The rice α-amylase glycoprotein is targeted from the Golgi apparatus through the secretory pathway to the plastids. Plant Cell 21:2844–2858.
16. Gomord V et al. (1997) The C-terminal HDEL sequence is sufficient for retention of secretory proteins in the endoplasmic reticulum (ER) but promotes vacuolar targeting of proteins that escape the ER. Plant J 11:313–325.
17. Dowler S, Kular G, Alessi DR (2002) Protein lipid overlay assay. Sci STKE 129:pl6.
18. Petersen TN, Brunak S, von Heijne G, Nielsen H (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nature Methods 8:785–786.
19. Bagos PG, Liakopoulos TD, Spyropoulos IC, Hamodrakas SJ (2004) PRED-TMBB: a web server for predicting the topology of β-barrel outer membrane proteins. Nucleic Acids Research 32 (Web Server issue):W400–W404.