Targeted Base Editing in the Plastid Genome of Arabidopsis Thaliana

LETTERS https://doi.org/10.1038/s41477-021-00954-6 Targeted base editing in the plastid genome of Arabidopsis thaliana Issei Nakazato1, Miki Okuno 2,3, Hiroshi Yamamoto 4, Yoshiko Tamura1, Takehiko Itoh 2, Toshiharu Shikanai 4, Hideki Takanashi1, Nobuhiro Tsutsumi1 and Shin-ichi Arimura1 ✉ Bacterial cytidine deaminase fused to the DNA binding We established a system to smoothly assemble the complicated tan- domains of transcription activator-like effector nucleases dem expression vectors of ptpTALECD for each target sequence on was recently reported to transiently substitute a targeted C the Ti plasmid (Extended Data Fig. 1) by replacing the FokI in the to a T in mitochondrial DNA of mammalian cultured cells1. We vectors used in a previous study11 with CD-UGI (Extended Data applied this system to targeted base editing in the Arabidopsis Fig. 2). We introduced the vectors into the nucleus of A. thaliana by thaliana plastid genome. The targeted Cs were homoplasmi- floral dipping12 and attempted to substitute C/G to T/A in 16S rRNA cally substituted to Ts in some plantlets of the T1 generation (Fig. 1c), rpoC1 (Fig. 1d) and psbA (Fig. 1e). Substitution of G5 and/ and the mutations were inherited by their offspring indepen- or G8 in 16S rRNA (highlighted in red in Fig. 1a) to A would confer dently of their nuclear-introduced vectors. spectinomycin (Spm) resistance (below)13,14 while substitutions of Plastid genomes, which encode key genes for photosynthetic C6 in rpoC1 and C10 in psbA to Ts would lead to changes in initiation processes, including both light reactions and carbon assimila- codons from ATG (methionine) to ATA (isoleucine). As a result, tion, are potential targets of plant breeding. Plastid genetic trans- accumulation of their coding proteins would decrease and mutants formation can now be used for only a limited number of species2 would grow poorly15,16 and/or be unable to grow photoautotrophi- and is difficult even in the model plant Arabidopsis3,4. In addition, cally17,18. Other neutral mutations in some C/G pairs in the target it requires insertion of a marker gene into the plastid genome5, so windows, which are the regions between the sequences that the left the created plants are regarded as genetically modified organisms and right transcription activator-like effector (TALE) domains rec- (GMOs). Recently, cytidine deaminase (CD), which converts C to ognize, would also be expected. U to change G/C pairs to A/T pairs in double-stranded DNA, was Twelve ptpTALECD expression vectors were constructed (four successfully used for in vitro targeted base editing of mitochondrial pairs of CD halves for each of three targets). Each vector was intro- DNAs in mammalian cultured cells1. Here, we applied this technol- duced into A. thaliana and, 23 days after stratification (DAS), the ogy to edit targeted bases in three genes in the plastid genome in targeted regions of the T1 plants were sequenced by Sanger sequenc- Arabidopsis plantlets, without leaving any foreign genes in either ing. Only constructs from which T1 plants were obtained are shown the plastid or nuclear genomes. Targeted single nucleotide substitu- in Fig. 1c–e and Supplementary Table 1a–c. In all three target win- tions are expected to be the best way to make desired single nucleo- dows, C/G pairs were replaced with T/A pairs in multiple T1 lines tide polymorphisms (SNPs) without disturbing any other genes or (Fig. 1c–h and Supplementary Table 1a–c). Surprisingly, in many regulatory regions in the plastid genomes of common crops or elite lines, the targeted base(s) seemed to be homoplasmically substituted lines. For the targets, we selected three genes whose modifications (homo), while in other lines, they seemed to be heteroplasmically or would be expected to lead to observable effects; 16S rRNA, whose chimaerically substituted (h/c; Fig. 1c–h and Supplementary Table modification was expected to confer resistance to an antibiotic and 1a–c). Such homoplasmic mutations might have occurred through two genes whose modifications would lead to poor growth; rpoC1, stochastic sorting processes, such as selection of mutations in a which encodes a part of the DNA-directed RNA polymerase subunit small number of copies of plastid genomes (that is, plastid sorting)19 beta′; and psbA, which encodes photosystem II (PSII) protein D1. or gene conversion20. Nevertheless, it is also conceivable that ptp- As in the previous study1, the CD domain (163 amino acids (aa)) TALECD mutated the C/G pairs in the target windows of all plas- of Burkholderia cenocepacia DddA toxin (1,427 aa) was split at the tid genomes at the early stage of embryogenesis because the RPS5A 1,333th or 1,397th amino acid. Each of the amino terminal or car- promoter used for ptpTALECD expression was reported to highly boxy terminal halves of the CD was linked to the C terminus of the drive gene expression in egg cells and early embryos21,22. Not all C/G DNA binding domain of the platinum TALEN6 (pTALECD; Fig. 1a). pairs in the target windows were substituted and the positions of the The N terminus of the pTALECD was linked to a plastid-targeting substituted C/G pairs were biased for all the three target windows signal peptide (PTP) of Arabidopsis thaliana RecA1 protein (51 aa)7,8 (Fig. 1c–e). Three homoplasmically substituted bases were C of (5′) (Fig. 1b), while the C terminus was linked to an uracil glycosyl- TC(3′) (Cp in Fig. 1c–e), which was the preferential target1 but a C ase inhibitor (UGI)1,9 to inhibit hydrolysis of the generated uracil of (5′)AC(3′) in 16S rRNA gene was also substituted (Fig. 1c). (Fig. 1b). The nucleotide sequences of CD and UGI were opti- To investigate the stability of mutation rates during plant devel- mized for A. thaliana codon usage. A pair of PTP-pTALECD-UGIs opment, total DNAs extracted from an emerging leaf of T1 plants (ptpTALECDs) were expressed in a single plant transformation at 11 and 23 DAS (or from a cotyledon of slowly growing plants at vector under control of efficient RPS5A promoters10 (Fig. 1b). 11 DAS; Supplementary Table 1a–c) were sequenced. Among the 1Laboratory of Plant Molecular Genetics, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan. 2School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan. 3Division of Microbiology, Department of Infectious Medicine, Kurume University School of Medicine, Kurume, Japan. 4Department of Botany, Graduate School of Science, Kyoto University, Kyoto, Japan. ✉e-mail: [email protected] 906 NatURE PLants | VOL 7 | JULY 2021 | 906–913 | www.nature.com/natureplants NATURE PLANTS LETTERS a N-terminal C-terminal domain domain CD half CD half combination Platinum TALE left UGI (left-right) 1. 1333C-1333N 16S rRNA 2. 1333N-1333C 3. 1397C-1397N Platinum TALE right UGI 4. 1397N-1397C : Target window CD half C-terminal N-terminal G/C: Special targets predicted to confer Spm resistance domain domain b T-DNA PTP_Fw N-terminal PTP_Fw N-terminal Oleosinpro:: Right border domain TALE left domain TALE right Oleosin-GFP PTP CD half UGI PTP CD half UGI NPT RPS5Apro Plastid C-terminal HSPter RPS5Apro Plastid C-terminal 35Ster T-DNA targeting TALF1_Rv domain targeting TALF1_Rv domain Left border peptide peptide c f Target: g Target: h Target: Target: 16S rRNA 16S rRNA rpoC1 psbA G G C G C C CD half Position in the target window 5 8 10 3 6 10 Type No. (left-right) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 G3 homo, h/c 1 1 G5, G8 h/c C10 h/c 1333C-1333N 3 C6 h/c homo h/c 1333N-1333C 2 homo h/c 3 G5 h/c G3 h/c C10 homo 1397C-1397N 25 homo 14 1 h/c 1 1397N-1397C 9 homo 3 4 G5 homo G3 homo Col-0 Target sequence 5′ 3′ Coding strand 3′ 5′ G , C 5 10 Col-0 d Target: rpoC1 homo CD half Position in the target window Type No. (left-right) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Col-0 h/c 1333C-1333N 2 homo h/c 1333N-1333C 6 homo i h/c 8 1397C-1397N 21 Target gene homo 10 Mutation type at Mutation stablity h/c 4 1 16S rRNA rpoC1 psbA Total 1397N-1397C 6 rate (%) homo 2 11 DAS 23 DAS G5 G8 C10 G3 G6 C10 Target sequence 5′ 3′ homo homo 6 2 9 2 19 37.3 Stably homo Coding strand 3′ 5′ → → homo h/c 2 1 3 5.9 h/c homo 2 1 3 6 11.8 h/c h/c 3 1 10 14 27.5 Unstable Amino acid → → h/c WT 2 1 1 3 1 8 15.7 WT h/c 1 1 2.0 e Target: psbA n = 13 2 4 27 2 3 51 100.0 CD half Position in the target window Type No. (left-right) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 h/c 1 0 100 (%) 1397C-1397N 2 homo Number of mutant plants/total number in the column (%) h/c 1397N-1397C 5 homo 2 Target sequence 5′ 3′ Coding strand 3′ 5′ → Amino acid → 0 100 Number of mutant plants/total number of transformants (%) Fig. 1 | ptpTALECD for three plastid genes. a, A pair of pTALECD proteins and its targeting window (shown in a red rectangle) in 16S rRNA gene and CD half combination.

Targeted Base Editing in the Plastid Genome of Arabidopsis Thaliana

De Novo Genomic Analyses for Non-Model Organisms: an Evaluation of Methods Across a Multi-Species Data Set

MATCH-G Program

Large Scale Genomic Rearrangements in Selected Arabidopsis Thaliana T

An Open-Sourced Bioinformatic Pipeline for the Processing of Next-Generation Sequencing Derived Nucleotide Reads

Sequence Alignment/Map Format Specification

Assembly Exercise

Supplemental Material Nanopore Sequencing of Complex Genomic

Providing Bioinformatics Services on Cloud

Syntax Highlighting for Computational Biology Artem Babaian1†, Anicet Ebou2, Alyssa Fegen3, Ho Yin (Jeffrey) Kam4, German E

Software List for Biology, Bioinformatics and Biostatistics CCT

Computational Protocol for Assembly and Analysis of SARS-Ncov-2 Genomes

Alignment-Free Sequence Comparison: Benefits, Applications, and Tools Andrzej Zielezinski1, Susana Vinga2, Jonas Almeida3 and Wojciech M