Graduate Theses and Dissertations, Iowa State University Capstones, Theses and Dissertations

2019

An analysis of the ramosa1 pathway in Zea mays utilizing CRISPR/Cas9 knockouts

Ryan James Arndorfer Iowa State University


Recommended Citation Arndorfer, Ryan James, "An analysis of the ramosa1 pathway in Zea mays utilizing CRISPR/Cas9 knockouts" (2019). Graduate Theses and Dissertations. 17391. https://lib.dr.iastate.edu/etd/17391

An analysis of the ramosa1 pathway in Zea mays utilizing CRISPR/Cas9 knockouts

by

Ryan Arndorfer

A thesis submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Major: Genetics and Genomics

Program of Study Committee: Erik Vollbrecht, Major Professor Shuizhang Fei Philip Becraft

The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this thesis. The Graduate College will ensure this thesis is globally accessible and will not permit alterations after a degree is conferred.

Iowa State University

Ames, Iowa

2019

Copyright © Ryan Arndorfer, 2019. All rights reserved.

DEDICATION

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

"On the other hand, we denounce with righteous indignation and dislike men who are so beguiled and demoralized by the charms of pleasure of the moment, so blinded by desire, that they cannot foresee the pain and trouble that are bound to ensue; and equal blame belongs to those who fail in their duty through weakness of will, which is the same as saying through shrinking from toil and pain. These cases are perfectly simple and easy to distinguish. In a free hour, when our power of choice is untrammelled and when nothing prevents our being able to do what we like best, every pleasure is to be welcomed and every pain avoided. But in certain circumstances and owing to the claims of duty or the obligations of business it will frequently occur that pleasures have to be repudiated and annoyances accepted. The wise man therefore always holds in these matters to this principle of selection: he rejects pleasures to secure other greater pleasures, or else he endures pains to avoid worse pains."

Cicero, Marcus Tullius, and H. (Harris) Rackham. De finibus bonorum et malorum. London, W. Heinemann; New York, The Macmillan Co., 1914. Internet Archive, http://archive.org/details/definibusbonoru02cicegoog.

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

ACKNOWLEDGMENTS

ABSTRACT

CHAPTER 1. GENERAL INTRODUCTION
    History of CRISPR
    CRISPR Use in Plants
    References
    Figures

CHAPTER 2. CRISPR/CAS9 KNOCKOUT EXPERIMENT OF PUTATIVE RAMOSA1 INTERACTORS
    Abstract
    Introduction
        CRISPR/Cas9 system
    Materials and Methods
        Choosing genes for Cas9-mediated knockout
        Selecting CRISPR Guides
        Guide location selection
        Description and Assembly of the Cas9 Expression System
        Array screening
        gRNA cassette assembly
        Synthetic 8x array
        Transformation into Agrobacterium tumefaciens EHA101
        Array screening in Agrobacterium tumefaciens
        Agrobacterium transformation of Maize Hi-II
    Results
        Sequence analysis uncovers novel mutations in all target genes with varied efficiency
    Discussion
        ramosa1 mutations create interesting future study possibilities
        jmjC guides produced mostly frame-shift early termination mutations
        selT guides were both highly active
        ail6 guides were largely inactive
        rho-GDI1 mutagenesis was limited by ear availability
        Multi-guide strategy is prudent and functional
    References
    Figures
    Tables

GENERAL CONCLUSIONS

APPENDIX. YEAST-2-HYBRID CANDIDATE INFORMATION

LIST OF FIGURES

Figure 1-1: Meganucleases, ZFN and TALENs diagrams
Figure 1-2: Bacterial CRISPR array immune system diagram
Figure 2-1: Original ramosa ear discovered by Dr. Walter Gernert
Figure 2-2: Zinc finger and ramosa1 folding models
Figure 2-3: ramosa1 zinc finger recognition sequence logo
Figure 2-4: ramosa1 phenotype in tassel and ear
Figure 2-5: Yeast-2-Hybrid description
Figure 2-6: CRISPR/Cas9 function
Figure 2-7: Screen-shot of University of Bergen’s chop chop V2 program depicting potential guides for ramosa1
Figure 2-8: ramosa1 gene model
Figure 2-9: rho-GDI1 and rho-GDI1 paralog gene models
Figure 2-10: jmjC gene model
Figure 2-11: selT gene model
Figure 2-12: ail6 gene model
Figure 2-13: pENTR-gRNA1 and pENTR-gRNA2
Figure 2-14: pENTR-gRNA (4x) and Synthetic (8x) Arrays
Figure 2-15: 4x and 8x Array depictions
Figure 2-16: gRNA1 cloning colony PCR, BsaI digestion and ligation step
Figure 2-17: Agrobacterium tumefaciens preparation of 4x array plasmid DNA
Figure 2-18: Ear holding tool
Figure 2-19: Sequencing quality levels
Figure 2-20: Gaps in sequencing alignments
Figure 2-21: Multi-base biallelic mutations
Figure 2-22: Cas9 assay on DNA made from callus tissue
Figure 2-23: Cas9 PCR assay from all regenerated plants
Figure 2-24: Frequency of insertions and deletions by number of nucleotides
Figure 2-25: Number of single base insertions by base
Figure 2-26: Number of targeted genes edited per recovered plant treated with the 8x array
Figure 2-27: Tassel phenotypes in parental generation of new ramosa1 alleles
Figure 2-28: Characterization of ramosa1 mutations
Figure 2-29: Characterization of jmjC mutations
Figure 2-30: Characterization of selT mutations
Figure 2-31: Characterization of ail6 mutations
Figure 2-32: Characterization of the rho-GDI1 mutation
Figure 2-33: 8x Array Guide location vs. Activity in number of unique events

LIST OF TABLES

Table 1: Putative ramosa1 gene interactors identified by yeast-2-hybrid
Table 2: Candidate gene paralogs and ear wt-vs-ra1 developmental timeline expression profiles
Table 3: Candidate genes and their annotated functions/descriptions
Table 4: CRISPR Guides and their location considerations
Table 5: Primers used in these experiments
Table 6: Ears used in these studies
Table 7: Ear data by source and array transformed
Table 8: Callus events capable of regenerating at least one plant
Table 9: Mutant classifications for all genes of interest
Table 10: All surviving plants assayed for mutations in five genes
Table 11: Collapsed mutation frequencies for all five genes of interest
Table 12: Number of indels recovered by size in nucleotides

ACKNOWLEDGMENTS

I would like to thank my committee chair, Dr. Erik Vollbrecht, and my committee members, Dr. Shuizhang Fei and Dr. Philip Becraft for their guidance and support throughout the course of this research. I would like to thank the members of the Vollbrecht lab and extend a special thank you to Dr. Erica Unger-Wallace for her continued assistance with this research. In addition, I would also like to thank my friends, colleagues, the department faculty and staff for making my time at Iowa State University a wonderful experience. Finally, I want to offer my appreciation to my wife Amy Arndorfer for her encouragement, care and support; as well as her very experienced entomologist hands spending hours extracting corn embryos with me.

Thank you all.

ABSTRACT

Inflorescence architecture in Zea mays is affected by a large collection of interrelated genes. The ramosa genes are some of the most prominent and well-studied of these genes due to their overtly branched ear and tassel mutant phenotype. ramosa1 confers determinate, short-branch identity on branch meristems during their initiation. Several new genes are proposed to work directly with ramosa1. ail6 was identified in a quantitative trait study of genes which enhance or suppress the ramosa1 branching phenotype. A Yeast-2-Hybrid study identified several genes as potential interactors with ramosa1. A CRISPR/Cas9 knockout study was performed to produce novel mutants to analyze this pathway. Two unique CRISPR expression arrays were utilized, targeting 12 sites in five genes. Agrobacterium tumefaciens-mediated transformation was used to deliver the arrays into Hi-II immature embryos. Both arrays functioned, and mutations were acquired in all five genes. Specifically, 13 unique mutations were detected in the ramosa1 gene, and eight, 16, four and one unique mutations were detected in the other four genes. A total of 112 plants were recovered and crossed into a B104, B73 or B73 ra1-63 background.


CHAPTER 1. GENERAL INTRODUCTION

Genetic research is conducted in two primary pathways, normally termed forward and reverse genetics. Forward genetics involves identifying the gene or genes responsible for an observed phenotype and generally requires a phenotype as a starting point. Reverse genetics involves identifying the function of one or more genes by analyzing the phenotype created by changing or deactivating the gene, also known as gene disruption. One common element of both strategies is the generation of genetic mutants to create a phenotype.

Many strategies have been harnessed for mutagenesis, including both insertional mutagenesis and DNA break/repair strategies. Insertional methods disrupt genes with mobile DNA segments and include naturally occurring, endogenous transposable elements (transposons) and exogenously applied Agrobacterium tumefaciens T-DNA. T-DNA insertion sites are generally random and require bacterial intervention for integration of the foreign DNA (Alonso and Stepanova, 2003). Transposons are mobile, endogenous DNA elements, many of which insert nonrandomly into the genome; Dissociation (Ds) element transposons are more likely to insert into introns and exons, while Mutator transposons show a preference for promoters and 5’-untranslated regions (Vollbrecht et al., 2010). Although transposons preferentially insert in or near genes, which gene they hit is effectively random, so individual, specific genes cannot be easily targeted; databases of Mutator and Ds lines with known insertion locations were therefore established (McCarty et al., 2005; Vollbrecht et al., 2010).

Chemical or radiological methods have long been favored for DNA damage mutagenesis. While chemical mutagens like ethyl methanesulfonate (EMS) normally produce single nucleotide substitutions, ionizing radiation produces double-stranded DNA breaks which are repaired by homologous recombination (HR) or the error-prone non-homologous end joining (NHEJ) pathway (Santivasi and Xia, 2014). However, as with insertional mutagens, these methods produce genome-wide, random mutations, with multiple potentially confounding mutations for every desirable one. This limitation led to a decades-long search for methods to create targeted double-stranded breaks. This need was met with nuclease solutions like meganucleases, zinc finger nucleases (ZFNs) and TALENs. ZFNs and TALENs both utilize a DNA binding domain tethered to a nuclease domain to create DNA breaks. They also both require one protein for each DNA strand in order to effect one double-stranded DNA break (Figure 1-1). These technologies were heralds of the future and produced predictable mutants (e.g., Gao et al., 2010; Shukla et al., 2009; Char et al., 2015). However, researchers were slow to adopt these systems due to the complexity in design and high engineering cost of the new pair of proteins required for each targeted DNA sequence.

In comparison to protein-based DNA recognition systems, CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and Cas9 (CRISPR-associated protein 9) use complementary RNA binding for DNA recognition (Marraffini and Sontheimer, 2008). The Cas nuclease proteins are enzymes that can be reprogrammed by RNA sequence. The Cas9 protein produces double-stranded DNA cuts when it is given a guide RNA matching the sequence to digest. An RNA sequence is far more easily programmed to a specific DNA sequence than is a sequence recognition protein, which has sped up genetic and biological research in general.


History of CRISPR

The story of CRISPR/Cas9 begins with the bacterial CRISPR array. In 1989 two highly conserved 29 bp DNA repeats were identified in the Escherichia coli genome. The repeated sequences were separated each time by a 32 or 33 bp spacer region of DNA and repeated 7 times for one sequence and 14 times for the other. The repeat sequences were similar but distinct, and the repeat regions were separated on the genome by 24 kilobases (Nakata et al., 1989). Later, in 1993, 30-34 bp tandem repeat sequences separated by 35-39 bp unique spacers were identified in Haloferax mediterranei (Mojica et al., 1993). The purpose of these repeats was discovered not in the repeat sequences but in the spacers. When the spacers were BLASTed against known genomes, many were found to match viral DNA. Specifically, spacers matching P1 phage sequences were found in P1-resistant E. coli. This discovery led to the understanding that CRISPR is the memory unit of a bacterial adaptive immune system (Mojica et al., 2005). This acquired immunity was demonstrated through selective experiments transferring the entire CRISPR locus from Streptococcus thermophilus to E. coli and then confirming plasmid and bacteriophage resistance (Sapranauskas et al., 2011).
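The repeat/spacer organization described above lends itself to a short illustration. The following is a minimal sketch in Python, not the method used in the cited studies: it splits a toy array on a known repeat to recover the spacers and then checks each spacer (and its reverse complement) for an exact match in a foreign sequence. The sequences, the function names and the use of exact string matching in place of BLAST are all assumptions made for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive copies of `repeat`."""
    parts = array_seq.split(repeat)
    # parts[0] precedes the first repeat and parts[-1] follows the last one,
    # so only the interior pieces are spacers.
    return [p for p in parts[1:-1] if p]


def spacers_matching(spacers: list[str], foreign_seq: str) -> list[str]:
    """Return spacers found exactly in `foreign_seq` on either strand."""
    return [
        s for s in spacers
        if s in foreign_seq or reverse_complement(s) in foreign_seq
    ]


# Hypothetical toy data; real repeats and spacers are roughly 30 bp long.
REPEAT = "GGTTCCAA"
ARRAY = "ATAT" + REPEAT + "ACGTACGTAC" + REPEAT + "TTGACCTGAT" + REPEAT + "CGCG"
PHAGE = "GGGAA" + "ACGTACGTAC" + "TTTCCC"

spacers = extract_spacers(ARRAY, REPEAT)
hits = spacers_matching(spacers, PHAGE)
```

In this toy case only the first spacer is shared with the phage sequence, which mirrors the observation that spacers record past encounters with foreign DNA.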

The CRISPR locus is a region of co-located DNA segments containing several protein-coding genes, functional RNA sequences and the CRISPR immune system memory array. The actual sequences vary between originating organisms and the type of CRISPR system. The systems are classified into two classes, five types and 16 subtypes (Makarova et al., 2015). Class 1 systems contain multi-subunit effector complexes and are seen in E. coli (Type I) and Staphylococcus epidermidis (Type III-A). Class 2 systems have been targeted for genetic engineering due to their single-protein effector and are seen in Streptococcus thermophilus and Streptococcus pyogenes (Type II, Cas9) and Francisella novicida (Type V, Cpf1) (Makarova et al., 2015).

CRISPR-Cas classifications are based on the effector proteins which perform the DNA digestion. However, nearly all endogenous CRISPR loci also contain the adaptation module genes cas1, cas2 and csn2. The encoded proteins form the complexes necessary for acquiring new immunity and assembling the CRISPR array (Nuñez et al., 2014). In contrast, a system with only effector proteins could effect immunity without the ability to acquire new resistance. It was demonstrated in 2011 that Cas9 was the only protein necessary for resistance (Sapranauskas et al., 2011). Cas9 had long been suspected to be an effector due to its HNH and RuvC nuclease domains, which cause double-stranded DNA breaks (Bolotin et al., 2005; Makarova et al., 2006). The Sapranauskas laboratory demonstrated the importance of those two domains through mutation analysis and proved that CRISPR-Cas9 from S. thermophilus could be functional in E. coli (Sapranauskas et al., 2011).

The Cas9 system became one of the primary CRISPR models due to its simplicity. In 2007 the Cas9 protein’s cutting system was dissected, which revealed a cut location three nucleotides upstream of a protospacer adjacent motif (PAM), a short sequence found next to the target sequences that correspond to spacers in the CRISPR array (Deveau et al., 2008; Horvath et al., 2008). Both plasmid and viral DNA were sequenced after digestion, and the specific cut locations within the protospacer sequences, upstream of the PAM, were verified. The cut regions were also revealed to match the spacers in the CRISPR array, demonstrating the precise targetability of the system. The repeat and spacer segments of the array are transcribed together as a pre-CRISPR RNA (pre-crRNA) string, which is then digested by RNase III into individual crRNAs that are incorporated into the Cas9 protein, as can be seen in Figure 1-2 panels 2 and 3.
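The relationship between the PAM and the cut location can be made concrete with a short sketch. The Python below scans one strand of a DNA string for candidate sites and reports the position of the cut three nucleotides upstream of each PAM. It assumes the 5'-NGG-3' PAM of S. pyogenes Cas9 and a 20-nucleotide protospacer, neither of which is stated in the text above; the function name and the example sequence are hypothetical, and only the given strand is searched.

```python
import re


def find_cas9_sites(seq: str, guide_len: int = 20) -> list[tuple[str, str, int]]:
    """List (protospacer, PAM, cut_index) for each NGG PAM on the given strand.

    cut_index is the 0-based index of the first base downstream of the cut,
    so the break falls between seq[cut_index - 1] and seq[cut_index].
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < guide_len:
            continue  # not enough sequence upstream for a full protospacer
        protospacer = seq[pam_start - guide_len:pam_start]
        cut_index = pam_start - 3  # three nucleotides upstream of the PAM
        sites.append((protospacer, match.group(1), cut_index))
    return sites


# Hypothetical example: a 20-nt protospacer followed by the PAM "TGG".
example = "AAAAA" + "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
sites = find_cas9_sites(example)
```

Sites on the opposite strand could be found by passing the reverse complement of the sequence to the same function.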

The one remaining component necessary to functionally use Cas9 was discovered with differential RNA sequencing (dRNA-seq) of S. pyogenes. A highly expressed non-coding RNA was discovered with a 25 bp region of nearly perfect complementarity (one base difference) to the crRNA repeats. These RNAs were named trans-activating CRISPR RNAs (tracrRNAs) due to their requirement in processing the pre-crRNA strands (Deltcheva et al., 2011). The tracrRNAs were later also identified as essential to Cas9 digestion activity (Jinek et al., 2012).

Discovering tracrRNA was a major step in understanding and reconstituting the CRISPR-Cas9 system. The required components of the bacterial system were the Cas9 protein, the pre-crRNA, tracrRNA and RNase III. To compress the necessary elements even further, RNase III could be eliminated if each individual crRNA was pre-cleaved from the array. The final engineering step was taken when it was shown that the crRNA and tracrRNA could be fused into a single-guide RNA (sgRNA) (Jinek et al., 2012). A fully functional, precise CRISPR-Cas9 DNA digestion system can now be created with two parts: the Cas9 protein and the sgRNA molecule. This system has been further augmented by multiplexing several sgRNAs with Cas9, which allows simultaneous targeting of multiple genes or loci.
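The two-part system and its multiplexed form can be pictured as a small data model. The sketch below is only illustrative: it represents each sgRNA as a target-specific spacer (the protospacer sequence written as RNA) followed by a fixed scaffold standing in for the fused crRNA repeat and tracrRNA, and it groups several guides by target as a multiplexed set would. The scaffold string is a placeholder rather than a real sequence, and the class, function and target names are hypothetical.

```python
from dataclasses import dataclass

# Placeholder only: stands in for the fused crRNA-repeat/tracrRNA scaffold.
SCAFFOLD_PLACEHOLDER = "[scaffold]"


@dataclass(frozen=True)
class SingleGuide:
    target: str       # label for the gene or locus being targeted
    protospacer: str  # target DNA sequence, written 5'->3'

    def spacer_rna(self) -> str:
        """Spacer portion of the sgRNA: the protospacer with U in place of T."""
        return self.protospacer.upper().replace("T", "U")

    def transcript(self, scaffold: str = SCAFFOLD_PLACEHOLDER) -> str:
        """Spacer followed by the constant scaffold."""
        return self.spacer_rna() + scaffold


def multiplex(guides: list[SingleGuide]) -> dict[str, list[str]]:
    """Group sgRNA transcripts by target for use with a single Cas9."""
    by_target: dict[str, list[str]] = {}
    for guide in guides:
        by_target.setdefault(guide.target, []).append(guide.transcript())
    return by_target


# Hypothetical guides: two sites in one gene and one site in another.
guides = [
    SingleGuide("geneA", "ACGTTGCA"),
    SingleGuide("geneA", "TTTTCCCC"),
    SingleGuide("geneB", "GGGGAAAA"),
]
array = multiplex(guides)
```

Only the spacer changes from one guide to the next, which is the property that makes retargeting simpler than redesigning a DNA-binding protein.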

While the ability to selectively edit genomes has been slowly evolving over the past 40 years, the ease and simplicity of reprogramming CRISPR-Cas has led to the widespread adoption of the system. The lower cost in both time and money has also led to an overall acceleration of genetic research.

CRISPR Use in Plants

The first CRISPR/Cas uses were directed toward mammals, but the plant biology research community quickly adopted the system as well. The primary limitations to any type of genetic engineering in plants are 1) the introduction of the editing machinery and 2) the regeneration of plants from embryogenic tissue. Commonly used solutions to the first issue fall into two categories: protein or DNA delivery. The second issue also has two solution strategies: the use of lines amenable to regeneration, or the use of embryogenic regulatory elements.

The necessary and sufficient elements for CRISPR/Cas9 to function in simple targeted mutagenesis are the Cas9 protein and the chimeric sgRNA molecule. The more commonly used method of introducing these reagents into a host organism is to incorporate DNA into the host genome and use the host transcription machinery to manufacture the protein and sgRNA. The primary drawback to this method is the constitutive expression of the Cas protein and the sgRNA(s) after they have induced the desired effect. In plant research, the two common methods of introducing DNA into the host are biolistic bombardment and Agrobacterium tumefaciens-mediated transformation. The biolistic method avoids the complications of a bacterial intermediary like Agrobacterium; however, its drawbacks are a slightly higher cost burden for equipment, the possibility of significant DNA damage or deletions, and a number of integrated copies that can exceed 50 (Liu et al., 2019). Both methods integrate DNA randomly into the genome; however, the average number of Agrobacterium T-DNA insertions is between 1.7 and 4.9, depending on bacterial background and the T-DNA source plasmid (Oltmanns et al., 2010).

As opposed to DNA-based integration, it is also possible to assemble the Cas9/sgRNA complex in vitro and deliver it via ribonucleoprotein particle bombardment (Svitashev et al., 2015). This is similar to DNA-based particle bombardment, except there is no DNA integration and thus no constitutive Cas expression. This method can be exceptionally useful in organisms with long maturation times and extended reproductive cycles. While most methods can create mutations in the first generation, this method also does not require crossing to remove any inserted DNA (Fan et al., 2015).

In order to generate heritable mutations, mutagenesis is commonly applied directly to embryo-derived embryogenic material. Most inbred plant lines do not efficiently regenerate embryos after bombardment or transformation (Chilcoat et al., 2017). Maize lines such as Hi-II and B104 were selected based on their ability to be transformed and to regenerate plants efficiently in vitro, and not on their scientific relevance or commercial importance. This requires any novel alleles to be introgressed into more relevant genetic backgrounds, possibly carrying undesired linked alleles with them.

Another option has been developed to allow the creation of embryogenic material from inbred lines of many different origins. This is accomplished through the overexpression of morphogenic regulators like Baby boom and Wuschel, or the maize cell division-promoting transcription factor ovule developmental protein 2 (ODP2) (Svitashev et al., 2016; Lowe et al., 2016). These are able to produce prolific somatic embryos; however, the overexpression must end before these embryos can germinate and regenerate plants. Systems to modulate this timing and interaction are still under development and are not ready for widespread use.

CRISPR genetic engineering techniques are being applied to traditional breeding questions such as increasing yield, drought tolerance or disease resistance, which are important to help feed a growing population worldwide (Chilcoat et al., 2017). The auxin-related gene family involved in ethylene responses (Argos) has been demonstrated to increase drought tolerance in maize. Using CRISPR/Cas9, researchers swapped the promoter of the argos8 gene for the constitutive maize promoter GOS2 and increased yields by five bushels per acre under water stress, with no effect in the well-watered control group (Shi et al., 2017). Researchers have also been experimenting with CRISPR/Cas9 as a viral defense mechanism against dsDNA viruses in plants. This use is similar to how the system evolved in bacteria and has been tested in Nicotiana benthamiana (Ali et al., 2015; 2016). CRISPR/Cas is the latest technology in this field, and it has increased the speed and precision with which genetic research is conducted; there is no indication of it slowing anytime soon.

References

Ali, Zahir, et al. “CRISPR/Cas9-Mediated Viral Interference in Plants.” Genome Biology, vol. 16, 2015. PubMed Central, doi:10.1186/s13059-015-0799-6.

Ali, Zahir, et al. “CRISPR/Cas9-Mediated Immunity to Geminiviruses: Differential Interference and Evasion.” Scientific Reports, vol. 6, May 2016. PubMed Central, doi:10.1038/srep26912.

Alonso, Jose M., and Anna N. Stepanova. “T-DNA Mutagenesis in Arabidopsis.” Methods in Molecular Biology (Clifton, N.J.), vol. 236, 2003, pp. 177–88. PubMed, doi:10.1385/1-59259-413-1:177.

Char, Si Nian, et al. “Heritable Site-Specific Mutagenesis Using TALENs in Maize.” Plant Biotechnology Journal, vol. 13, no. 7, Sept. 2015, pp. 1002–10. PubMed, doi:10.1111/pbi.12344.


Chilcoat, Doane, et al. “Use of CRISPR/Cas9 for Crop Improvement in Maize and Soybean.” Progress in Molecular Biology and Translational Science, vol. 149, 2017, pp. 27–46. PubMed, doi:10.1016/bs.pmbts.2017.04.005.

Deltcheva, Elitza, et al. “CRISPR RNA Maturation by Trans-Encoded Small RNA and Host Factor RNase III.” Nature, vol. 471, no. 7340, Mar. 2011, pp. 602–07. PubMed Central, doi:10.1038/nature09886.

Deveau, Hélène, et al. “Phage Response to CRISPR-Encoded Resistance in Streptococcus Thermophilus.” Journal of Bacteriology, vol. 190, no. 4, Feb. 2008, pp. 1390–400. PubMed Central, doi:10.1128/JB.01412-07.

Fan, Di, et al. “Efficient CRISPR/Cas9-Mediated Targeted Mutagenesis in Populus in the First Generation.” Scientific Reports, vol. 5, July 2015, p. 12217. PubMed, doi:10.1038/srep12217.

Gao, H., J. Smith, M. Yang, et al. “Heritable Targeted Mutagenesis in Maize Using a Designed Endonuclease.” The Plant Journal, vol. 61, 2010, pp. 176–87.

Horvath, Philippe, et al. “Diversity, Activity, and Evolution of CRISPR Loci in Streptococcus Thermophilus.” Journal of Bacteriology, vol. 190, no. 4, Feb. 2008, pp. 1401–12. PubMed Central, doi:10.1128/JB.01415-07.

Jinek, Martin, et al. “A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity.” Science (New York, N.Y.), vol. 337, no. 6096, Aug. 2012, pp. 816–21. PubMed, doi:10.1126/science.1225829.

Liu, Jianing, et al. “Genome-Scale Sequence Disruption Following Biolistic Transformation in Rice and Maize.” The Plant Cell, vol. 31, no. 2, Feb. 2019, pp. 368–83. www.plantcell.org, doi:10.1105/tpc.18.00613.

Lowe, Keith, et al. “Morphogenic Regulators Baby Boom and Wuschel Improve Monocot Transformation.” The Plant Cell, vol. 28, no. 9, Sept. 2016, pp. 1998–2015. www.plantcell.org, doi:10.1105/tpc.16.00124.

Makarova, Kira S., et al. “A Putative RNA-Interference-Based Immune System in Prokaryotes: Computational Analysis of the Predicted Enzymatic Machinery, Functional Analogies with Eukaryotic RNAi, and Hypothetical Mechanisms of Action.” Biology Direct, vol. 1, no. 1, Mar. 2006, p. 7. BioMed Central, doi:10.1186/1745-6150-1-7.

Makarova, Kira S., et al. “An Updated Evolutionary Classification of CRISPR-Cas Systems.” Nature Reviews. Microbiology, vol. 13, no. 11, 2015, pp. 722–36. PubMed, doi:10.1038/nrmicro3569.

Marraffini, Luciano A., and Erik J. Sontheimer. “CRISPR Interference Limits Horizontal Gene Transfer in Staphylococci by Targeting DNA.” Science (New York, N.Y.), vol. 322, no. 5909, Dec. 2008, pp. 1843–45. PubMed, doi:10.1126/science.1165771.

McCarty, Donald R., et al. “Steady-State Transposon Mutagenesis in Inbred Maize.” The Plant Journal: For Cell and Molecular Biology, vol. 44, no. 1, Oct. 2005, pp. 52–61. PubMed, doi:10.1111/j.1365-313X.2005.02509.x.

Mojica, F. J. M., G. Juez, and F. Rodriguez-Valera. “Transcription at Different Salinities of Haloferax mediterranei Sequences Adjacent to Partially Modified PstI Sites.” Molecular Microbiology, vol. 9, 1993, pp. 613–21.

Mojica, F. J. M., C. Díez-Villaseñor, J. García-Martínez, and E. Soria. “Intervening Sequences of Regularly Spaced Prokaryotic Repeats Derive from Foreign Genetic Elements.” Journal of Molecular Evolution, vol. 60, 2005, pp. 174–82.

Nakata, A., M. Amemura, and K. Makino. “Unusual Nucleotide Arrangement with Repeated Sequences in the Escherichia coli K-12 Chromosome.” Journal of Bacteriology, vol. 171, 1989, pp. 3553–56.

Nuñez, James K., et al. “Cas1–Cas2 Complex Formation Mediates Spacer Acquisition during CRISPR–Cas Adaptive Immunity.” Nature Structural & Molecular Biology, vol. 21, no. 6, June 2014, pp. 528–34. www.nature.com, doi:10.1038/nsmb.2820.

Oltmanns, Heiko, et al. “Generation of Backbone-Free, Low Transgene Copy Plants by Launching T-DNA from the Agrobacterium Chromosome.” Plant Physiology, vol. 152, no. 3, Mar. 2010, pp. 1158–66. PubMed Central, doi:10.1104/pp.109.148585.

Santivasi, Wil L., and Fen Xia. “Ionizing Radiation-Induced DNA Damage, Response, and Repair.” Antioxidants & Redox Signaling, vol. 21, no. 2, July 2014, pp. 251–59. PubMed, doi:10.1089/ars.2013.5668.

Shi, Jinrui, et al. “ARGOS8 Variants Generated by CRISPR-Cas9 Improve Maize Grain Yield under Field Drought Stress Conditions.” Plant Biotechnology Journal, vol. 15, no. 2, 2017, pp. 207–16. PubMed, doi:10.1111/pbi.12603.

Shukla, Vipula K., et al. “Precise Genome Modification in the Crop Species Zea Mays Using Zinc-Finger Nucleases.” Nature, vol. 459, no. 7245, May 2009, pp. 437–41. www.nature.com, doi:10.1038/nature07992.

Sapranauskas, Rimantas, et al. “The Streptococcus Thermophilus CRISPR/Cas System Provides Immunity in Escherichia Coli.” Nucleic Acids Research, vol. 39, no. 21, Nov. 2011, pp. 9275–82. PubMed, doi:10.1093/nar/gkr606.

Svitashev, Sergei, et al. “Targeted Mutagenesis, Precise Gene Editing, and Site-Specific Gene Insertion in Maize Using Cas9 and Guide RNA.” Plant Physiology, vol. 169, no. 2, Oct. 2015, pp. 931–45. PubMed, doi:10.1104/pp.15.00793.


Svitashev, Sergei, et al. “Genome Editing in Maize Directed by CRISPR–Cas9 Ribonucleoprotein Complexes.” Nature Communications, vol. 7, 2016, p. 13274, doi:10.1038/ncomms13274.

Vollbrecht, Erik, et al. “Genome-Wide Distribution of Transposed Dissociation Elements in Maize.” The Plant Cell, vol. 22, no. 6, June 2010, pp. 1667–85. PubMed, doi:10.1105/tpc.109.073452.


Figures

Figure 1-1: Meganucleases, ZFN and TALENs diagrams

Three nucleases that create targeted double-stranded DNA breaks. Meganucleases are single-protein effectors, while zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) utilize a separate protein for each DNA strand. All three use protein sequence motifs for DNA sequence recognition and binding. Zinc fingers recognize nucleotide triplets, while TALE proteins recognize individual nucleotides. (Farzad Jamshidi, https://en.wikipedia.org/wiki/File:Engineered_Nucleases.jpg)


Figure 1-2: Bacterial CRISPR array immune system diagram

1) The memory unit of the immune system from which the pre-crRNA is transcribed. 2) RNase III, Cas9 and tracrRNA are transcribed by genes near the CRISPR array. tracrRNA binds to short palindromic repeats from the pre-crRNA and assists RNase III in cleaving individual crRNA sequences. 3) Cas9 incorporates the tracrRNA and crRNA complex creating an active ribonucleoprotein unit. 4) Depicts the orientation of the target sequence from the crRNA next to the Protospacer Adjacent Motif (PAM). 5) Depicts the crRNA sequence binding to target DNA. 6) Displays the final double stranded DNA cleavage at the location 3bp from the PAM. Figure originates from www.addgene.org/crispr/history/.


CHAPTER 2. CRISPR/CAS9 KNOCKOUT EXPERIMENT OF PUTATIVE RAMOSA1 INTERACTORS

Abstract

Inflorescence development in maize is known to be controlled by a large gene network. In order to study it, one gene (ramosa1) known to perturb this network has been exploited. Mutants in ramosa1 (ra1), ramosa2 and ramosa3 each create an overly branched ear and tassel phenotype. In these studies, a yeast-2-hybrid screen was performed using RAMOSA1 as bait, which identified 12 possible physical interactors. The 12 putative ra1 interactors were pared down to three genes hypothesized to be involved in the ramosa pathway based on the following criteria: gene type and function, gene expression profile in the ear, and whether the gene has close paralogs which could be functionally redundant. Four genes were therefore selected for additional analysis: rho-GDI1, jmjC and selT as potential RA1 interactors, and ail6, a previously suspected actor in the ramosa pathway. For these four genes and ra1, a CRISPR/Cas9 knockout experiment was designed and conducted using Agrobacterium tumefaciens-mediated transformation of Hi-II immature embryos. Two unique CRISPR expression arrays were utilized, targeting 12 sites in five genes. Plants were regenerated, DNA was sequenced, and the generated mutants were genetically categorized. Plants with mutations in ra1, jmjC, selT, ail6 and rho-GDI1 were variously crossed into inbred lines B104, B73 or B73 ra1-63 for propagation and further analysis.

Introduction

An ear of corn is one of the most distinctive and identifying physical traits of Zea mays.

The iconic cylindrical shape dotted with straight and ordered rows of plump kernels retains a platonic form known anywhere corn is grown. So, in 1909 when Walter Gernert found a cone shaped ear with disordered rows of kernels and branches stemming from the base, it piqued his interest (Figure 2-1). The ear was found in a breeding line for enhanced protein composition and the trait bred true for several generations. Gernert called it a new subspecies of Zea mays with the name Zea ramosa (meaning branched in Latin) describing the phenotype (Gernert 1912). This turned out not to be a new species, but the locus of that recessive mutation on chromosome 7 still bears the name ramosa.

Maize has two types of inflorescences, the tassel and the ear. The tassel normally is structured with long indeterminate branches stemming from the base, while the ear is structured similarly to the upper portion of the tassel. These axes both form short determinate second order meristem branches which shape into orderly rows of spikelet pairs. The canonical ramosa1 (ra1) and ramosa2 (ra2) mutants create long indeterminate second-order meristems which create long inflorescence branches (Vollbrecht et al., 2005). Because inflorescence architecture affects grain production capability, these genes were selected upon during domestication from teosinte (Sigmon and Vollbrecht, 2010).

The ramosa1 gene was cloned and identified as locus LOC100276104 on chromosome 7 (V3ID GRMZM2G003927, V4ID Zm00001d020430) (Vollbrecht et al., 2005). The gene encodes a Cys2-His2 zinc finger transcription factor. Zinc is a biologically important transition metal with only one normal oxidation state, Zn2+. It usually binds four ligands in tetrahedral geometry and performs structural roles in protein construction. The most prominently studied zinc protein motif is the zinc finger, commonly forming Cys4 or Cys2-His2 bonds. The RAMOSA1 protein fits the form of a classical zinc finger with an α-helix and antiparallel β-sheet (Figure 2-2). Four amino acids within the α-helix bind the major groove of a DNA helix and bond with 3-4 specific nucleotides (Pace and Weerapana 2014). Multiple zinc fingers can work in concert to increase sequence specificity, not only recognizing the binding sequences but the space between them (Takatsuji, 1999). However, the ramosa1 gene encodes only one zinc finger, which is likely to bind the sequence TTG according to zf.princeton.edu using the Expanded Linear SVM prediction method (Anton et al., 2014). Figure 2-3 depicts the likelihood of each base at each position by the height of the depicted base. The α-helix sequence QALGGH is highly conserved in single zinc finger EPF genes in plants; however, in the RAMOSA1 amino acid sequence, QGLGGH, the Alanine is replaced by Glycine. This indicates possible different targets or functional roles (Vollbrecht et al., 2005).

The RAMOSA1 protein also contains two Ethylene-responsive element binding factor-associated Amphiphilic Repression (EAR) motifs. These motifs are highly conserved across many plant species and function to regulate gene transcription (Yang et al., 2018). They fall into two consensus sequence patterns, LxLxL or DLNxxP. Both of the ra1-encoded EARs are LxLxL type, which are known to actively repress genes by recruiting co-repressors like the TOPLESS family (Kagale et al., 2011). These repressors interact with histone deacetylases which repress transcription by remodeling chromatin (Kagale et al., 2010). There are two commonly used ramosa1 mutants: ra1-R is a strong allele and ra1-63 is a weak allele. The phenotypic differences can be seen in the degree of branching (Figure 2-4). ra1-R is a point mutation H64N which eliminates the first histidine in the zinc finger. ra1-63 is a frame-shift mutation near the C-terminus which prevents correct termination and extends it an additional 17 nonsense residues (Vollbrecht et al., 2005). This weak mutant is hypothesized to interfere with proper protein folding or with the EAR motif function. This ra1-63 mutant is useful for scoring the phenotype in enhancer/suppressor screens because the relatively fewer branches, as compared to the strong mutant, are countable.

There are several ways to analyze pathways like ramosa1; one approach is to use a genetic screen to look for modifiers of the ramosa1 phenotype. The severity of the ramosa1 phenotype varies among genetic backgrounds, including in two well studied inbreds, B73 (highly branched) and Mo17 (lightly branched). Rebecca Weeks used this phenotypic difference and an inter-mated B73 x Mo17 population (IBM) to map modifier loci that impact the branching phenotype of ra1. A 750 kb interval was identified on chromosome 1 comprising nine genes. One of these genes is ail6 (Weeks, 2013).

aintegumenta-like6 (ail6, also known as ereb130) was identified as one of nine genes of interest in the Quantitative Trait Locus (QTL) on chromosome 1 (Weeks, 2013). A Mutator (Mu) transposon insertion line for ail6 was obtained from the Maize Genetics Co-op stock center and the Mu insertion was verified in the first exon. When the ail6::Mu allele was crossed with ra1-63 in the W22 background, both TBN and EBN were significantly increased as compared with ra1-63 and normal W22 plants (EV unpublished). While the Mu line can indicate this gene might be involved in the modified branching phenotype conditioned by the QTL, these lines tend to contain more than one transposon. Plural Mu insertions could disrupt other genes and the single Mu line therefore lacks the specificity to confidently assert causality for this phenotype. Therefore, generating additional mutants in ail6 could support this gene’s role in modifying the branching phenotype and potentially create an allelic series for future study. For this reason, we targeted the ail6 gene for CRISPR mutagenesis.

Another approach to analyze the ramosa pathway is to identify gene products that physically interact with RAMOSA1. A Yeast-2-Hybrid experiment can inform researchers about proteins that physically interact with other proteins, including transcription factors. The screen involves a bait protein and a library of prey proteins; when bait and prey physically interact, they activate reporter genes and allow growth on deficient media (Brückner et al., 2009). This may then be followed up with a knockout study, a reverse genetics approach, as described here using CRISPR/Cas9 reagents.

In March of 2017 a yeast-2-hybrid experiment was performed by Hybrigenics Service. The experiment used a cDNA library derived from Zea mays B73 vegetative apex, immature ear and immature tassel messenger RNA and ramosa1 as bait. Hybrigenics sorted the associated interactions by score based on the survivability of the yeast on histidine deficient media (Figure 2-5; Table 1). Surviving yeast colonies were categorized by the gene regions contained in the prey interactors.

CRISPR/Cas9 system

The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) system was adapted from a bacterial adaptive immune system. This approach works to create mutants by creating a double stranded DNA break at a specific site and allowing the organism’s own stress response Non-Homologous End Joining (NHEJ) pathway to repair the DNA. This repair pathway does not possess base-checking exonucleases and is thus error prone, often causing small insertions or deletions. The Streptococcus pyogenes Cas9 enzyme recognizes the site to be cut by being paired with an 18-21bp guide and an 80bp scaffold (gRNA) complex (Figure 2-6). In the bacterial form the scaffold was composed of two separate RNA sequences, the CRISPR RNA (crRNA) and transactivating crRNA (tracrRNA) (Karvelis et al., 2013). These have been fused into a chimeric form in order to more efficiently utilize this enzyme. The guide sequence is complementary to the target region of DNA you intend to cut; the target must lie immediately next to, but does not include, a Protospacer Adjacent Motif (PAM). The PAM is the only hard requirement for guide choice; for S. pyogenes Cas9 this sequence is NGG (Anders et al., 2014). There are many types of Cas related proteins and research is currently being conducted to expand that list. The research is primarily in the areas of decreasing off target effects and developing Cas enzymes with different PAM sites in order to target genome regions less rich in traditional PAM sites. The scaffold region is also dependent on which enzyme you use. Typically, the guide + scaffold complex (gRNA) is transcribed from DNA into RNA by RNA polymerase III and the scaffold folds into a secondary loop structure that fits inside the Cas enzyme (Figure 2-6).
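The PAM requirement described above can be illustrated with a minimal Python sketch. It scans both strands of a DNA string for a 20 nt target followed by NGG and reports the expected blunt cut 3 bp from the PAM. The function names, the fixed 20 nt guide length and the coordinate convention are assumptions made for illustration; this is not the search performed by any particular guide design program.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ngg_sites(seq, guide_len=20):
    """List candidate S. pyogenes Cas9 targets on both strands.

    Each hit is (strand, target, pam, cut), where cut is the 0-based
    top-strand index of the first base to the right of the expected
    blunt cut, 3 bp from the PAM (hypothetical convention).
    """
    seq = seq.upper()
    n = len(seq)
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for m in pattern.finditer(seq):
        cut = m.start() + guide_len - 3
        sites.append(("+", m.group(1), m.group(2), cut))
    for m in pattern.finditer(revcomp(seq)):
        cut_rc = m.start() + guide_len - 3
        # Map the cut found on the reverse complement back to the top strand.
        sites.append(("-", m.group(1), m.group(2), n - cut_rc))
    return sites
```

A real design tool would additionally check each candidate against the reference genome for off-target matches, which this sketch does not attempt.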

There are two main ways to go about creating this ribonucleoprotein (RNP) complex and they largely differ by the method of delivery into the organism. One method is to introduce a section of DNA into the organism that uses endogenous transcription machinery to produce both the enzyme and gRNA. In plants this is most often accomplished by Agrobacterium tumefaciens. Agrobacterium contains DNA editing machinery which recognizes specific sites in an endogenous plasmid, and then cuts out the segment of DNA and randomly inserts it into the genome.

Particle bombardment (biolistic) is another method to introduce reagents into plant cells, whereby gold particles are coated with either the plasmid DNA encoding the cas9 and single guide RNAs (sgRNAs) or a pre-assembled RNP complex of enzyme and gRNA. The DNA free delivery of the editing reagents as an RNP complex has the benefit of transient Cas9 existence, which means it does not need to be crossed out of the line before production. However, this method is still being developed and is not ready for widespread adoption (Svitashev et al., 2016).

Because we possessed the experience and resources to perform an Agrobacterium transformation, this was the method chosen for these studies. This required that we assemble the region of DNA for Agrobacterium to transplant into the Zea mays genome. This is generally referred to as a cassette because it is a self-contained and mobile section of DNA, a “little-box” of DNA. A template plasmid was provided by Dr. Bing Yang (Iowa State University) with a maize ubiquitin promoter driving Cas9 production, an enhanced 35S promoter driving a phosphinothricin N-acetyltransferase protein for herbicide resistance selection and a ccdB gene surrounded by attR gateway cloning sites. The gateway cloning site is intended for the user to clone in another DNA segment containing their target sequences. Each gRNA needs to either be driven by its own promoter or be post-transcriptionally digested. There is research into this area and some enzymes (Cpf1 for example) natively contain the ability to digest several gRNA regions transcribed as a single unit. These methods are still experimental, and we had access to a Cas9 expression plasmid which necessitates a system for transcribing each gRNA from its own promoter.


The promoters used in Dr. Yang’s array, and those most commonly used in CRISPR experiments, are U6 Polymerase III (Pol III) promoters. Unlike Polymerase II (Pol II) promoters which produce mRNAs for protein synthesis, Polymerases I and III produce large volumes of RNA. While Pol I creates ribosomal RNA, Pol III creates short structured RNAs like tRNA (Khatter et al., 2017). Pol III promoters have been used to create siRNA and shRNA segments and are also suited to creating the roughly 100 bp gRNA needed for Cas9. The Pol III promoters require either an A (Adenine) or G (Guanine) in the +1 position (23bp downstream of the TATA box) to initiate transcription. The general rule is to use one or more Gs at the start of your RNA sequence when using U6 promoters (Ma et al., 2014).

There are currently two strategies for producing gRNAs for CRISPR. One is to transcribe each gRNA from its own RNA polymerase (usually Pol III) promoter. The second is to produce one RNA strand and post-transcriptionally process it into the individual gRNAs. This is accomplished by flanking each gRNA sequence with a self-cleavable ribozyme, endogenous tRNA or introns (Zhang et al., 2019). This has the benefit of decreasing the size of the array, which is beneficial when DNA sequences are being synthesized. This second approach also increases the flexibility of the CRISPR system by making the expression tissue specific with Pol II promoters. These promoters generally produce products which are highly processed including 5’ caps and 3’ polyadenylation. A strategy encompassing post-transcriptional processing allows the use of these more versatile promoters (Zhang et al., 2017).

Dr. Yang’s system is gateway compatible to enable the incorporation of the guide arrays into the T-DNA expression vector containing cas9. The plasmid is assembled using type IIS restriction enzymes and two custom plasmids with maize and rice promoters driving each gRNA. This system allows for the use of 4 gRNAs per Agrobacterium transformation and requires weeks of cloning to assemble. This will be called the 4x array. We also had access to a DNA sequence from Dr. David Jackson (Cold Spring Harbor Laboratory) which is intended to be designed and purchased from a gene synthesis laboratory. This sequence is designed with attL gateway sites and intended for use with Dr. Yang’s Cas9 expression plasmid. Dr. Jackson’s sequence has the benefit of compact promoter regions for the gRNAs and can accommodate eight gRNAs. This will be called the 8x array. We decided to use both avenues, giving us a total of 12 guides to work with between the 4x and 8x arrays.

Research has noted there are some regions of DNA that are not amenable to Cas activity. This is hypothesized to be the consequence of chromatin structure or histone activity, but it is currently difficult to predict gRNA activity (Doench et al., 2014, 2016; Wu et al., 2014). For this reason, we designed two targets for every gene of interest. There is also an added benefit of having two active targets. It is possible to remove the region of DNA between your sites if both are active at the same time.
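The two-target deletion can be sketched in a few lines of Python. The example assumes both cuts occur simultaneously and that the ends rejoin precisely, which is an idealization because NHEJ repair is error prone; the function names and the cut coordinates (0-based index of the first base right of each cut) are hypothetical.

```python
def excise_between_cuts(seq, cut_a, cut_b):
    """Return (edited_sequence, deleted_fragment) for two blunt cuts.

    Idealized model: the fragment between the cuts is lost and the
    flanking ends are joined without additional insertions or deletions.
    """
    left, right = sorted((cut_a, cut_b))
    if not 0 <= left <= right <= len(seq):
        raise ValueError("cut positions must lie within the sequence")
    return seq[:left] + seq[right:], seq[left:right]


def keeps_reading_frame(deletion_len):
    """True if a deletion of this length leaves the reading frame intact."""
    return deletion_len % 3 == 0


# Hypothetical toy sequence: cuts at indices 3 and 9 remove six bases.
edited, deleted = excise_between_cuts("AAACCCGGGTTT", 3, 9)
```

In this toy case the deleted fragment is a multiple of three bases, so the downstream reading frame would be retained; a frame-shifting deletion would be the more disruptive outcome for a knockout.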

The purpose of this research project is to apply a reverse genetics approach to further study the ramosa1 pathway. The goals are: (1) to select biologically relevant genes hypothesized to be involved in the ramosa pathway, (2) to design and conduct a CRISPR/Cas9 knockout experiment to generate mutagenized Zea mays plants, (3) to genetically categorize the generated mutations and (4) to cross the plants to an inbred line for future study.


Materials and Methods

Choosing Genes for Cas9-mediated knockout

The 12 yeast-2-hybrid loci rated A through C were compiled, and the loci coordinates were translated to associated gene names and gene identifiers (geneID) (Table 1). The full list of all gene information can be viewed in Appendix A. Both V3 and V4 geneIDs are included as these gene IDs are required by various programs used to interrogate these candidate genes.

A reverse genetics approach is applied in these studies with the intent of knocking out genes to shut off their function. Because many genes in maize have, somewhere in the genome, a paralog which might provide partially or completely redundant function, each of these genes was researched at ensembl.gramene.org/Zea_mays and the gene tree was analyzed to identify paralogs. The gene coordinate information and percent identity (listed as the percent of the paralogous sequence which matched the main gene sequence) of each paralog was noted. Some genes listed many ancient paralogs; these were excluded due to low target identity and low functional relevance. The remaining paralogs were then further interrogated as described below.

Based on the results of the yeast-2-hybrid assay these genes of interest are believed to encode physical interactors with RAMOSA1. In order to work together, these physical interactors should therefore be co-expressed in the same tissues as ramosa1. To explore this idea, the expression data from a developmental transcriptomics study of maize ears and tassels (termed the ramosa series- ear samples) was queried using maizeinflorescence.org/profile_display.php (Eveland et al., 2013). The list of B73v3 gene IDs used for each query comprised the specific target gene and all paralogs. This was done for each gene of interest along with ramosa1 and ail6. The 1mm and 2mm sizes correspond to developmental time points in the ear, which is relevant because ramosa1 expression increases 4-fold between 1mm and 2mm in size. This allowed the baseline expression levels to be compared with ramosa1 expression at multiple stages in development. The paralog data and expression data can be viewed in Table 2. Any gene with an identity over 60% and with an expression profile identifying similar levels of co-expression was counted as a “close paralog” on the gene tracker in Table 3. Genes with two or more close paralogs were ruled out as candidates for CRISPR/Cas9 mutagenesis due to difficulty in ensuring loss of function, as this would likely require both alleles of each gene to be rendered inactive, which may require multiple mutagenesis targets and several generations of crossing.
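The paralog rule above can be written as a small filter. The following Python sketch is illustrative only: the record type, field names and gene identifiers are hypothetical, and co-expression is passed in as a yes/no judgment because no numeric expression threshold is given in the text.

```python
from dataclasses import dataclass


@dataclass
class Paralog:
    gene_id: str             # hypothetical identifier
    percent_identity: float  # percent of paralog matching the main gene
    co_expressed: bool       # judged similar to the target's ear profile


def close_paralogs(paralogs, min_identity=60.0):
    """Keep paralogs with identity over 60% that are also co-expressed."""
    return [
        p for p in paralogs
        if p.percent_identity > min_identity and p.co_expressed
    ]


def is_candidate(paralogs, max_close=1):
    """Genes with two or more close paralogs are ruled out."""
    return len(close_paralogs(paralogs)) <= max_close


# Hypothetical example: one close paralog, so the gene remains a candidate.
example = [
    Paralog("paralog_A", 80.0, True),
    Paralog("paralog_B", 55.0, True),
    Paralog("paralog_C", 90.0, False),
]
still_candidate = is_candidate(example)
```

Treating co-expression as a boolean keeps the sketch honest about what the text specifies; any quantitative cut-off would be an additional assumption.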

Since these genes are candidates for selective mutation it was prudent to check for known mutations in these genes available through seed stocks. Each candidate gene was searched through https://www.maizegdb.org/uniformmu for known mutants, which were catalogued. Several Mutator insertion lines were ordered, grown, PCR verified and crossed to weak ramosa1 mutants (May et al., 2003). Genes that already had potential Mu insertions in early exons were removed from the list of possible CRISPR/Cas9 targets.

Comparing candidate genes for function

The pared list was then compared and analyzed for gene function. Often gene function in maize is inferred through homology with genes in Arabidopsis. Functional annotation of the maize candidate genes was collected from various sites such as www..org (Table 3).

Genes were broken down into groups based on function. Four genes are involved with transcription (jmjC, selT, SKU5 and ail6; Table 3); the others include a chaperone, SBA1 (Zm00001d049305), a nuclear transport interactor, rho-GDI1 (Zm00001d017859), and rel2 (Zm00001d024523), a TOPLESS-like co-repressor which has already been demonstrated to physically interact with RAMOSA1 (Gallavotti et al., 2010). Just as this result eliminates rel2 as a CRISPR/Cas9 candidate, it also lends credence to the yeast-2-hybrid results.

Selected genes

The resources available for this project limited the number of genes to target using the CRISPR/Cas9 approach to five. We expect many of these genes to only present a phenotype as homozygous and in the presence of a mutant ramosa1 allele. In order to decrease the number of future crossings and to discover novel ramosa1 mutants, ramosa1 itself was targeted. ail6 has demonstrable phenotypic relevance and was an obvious choice. rho-GDI1 is thought to be involved with nuclear transport, which makes it unique in the group and an interesting candidate. The remaining two come from the transcription list. jmjC has an internal zinc finger and is a transcription factor. selT is a metal binding histone modifier which is also likely to modify transcription.

Only rho-GDI1 has a close paralog. Function retaining paralogs could present problems when checking for a phenotype so genes without them were preferred. rho-GDI1 shares 80% identity with its one close paralog, which gave us the opportunity to design guides to hit both the gene and its paralog. This strategy has worked before and could be a valid strategy in designing knockout experiments in genes with paralogs (Ferreira et al., 2017).


Selecting CRISPR Guides

There are several considerations in choosing effective guides. The first immutable rule is the guide must be next to the PAM for your enzyme. Depending on your enzyme these are generally plentiful in the genome. The second is the sequence must be unique in the genome (unless you intend to cut in more than one location). There are many programs designed to find guide sites for you. Ultimately the program used, at a minimum, must have your reference genome available for off-target comparison, the PAM sequence for your enzyme and preferably guide length and leading G options. The program used in these studies was the University of Bergen’s Chop Chop V2 program, which contains all of these features (Kornel et al., 2016; Tessa et al., 2014). However, this program was chosen largely because it offered a self-complementarity calculation, which none of the other options did at the time. This feature originated from studies of guides with regions complementary to the RNA scaffold. The secondary structure of the scaffold is important for Cas integration and guides which disturb this secondary structure prevent integration and lower editing activity (Thyme et al., 2016). Sample output for ramosa1 using the chop chop program can be seen in Figure 2-7.

The option to prefer leading Gs (Guanines) originates from the necessity of a G or A (Adenine) to initiate transcription (using Pol III promoters). The guide end closest to the PAM (also the end adjacent to the scaffold) is much more selective and binds more tightly (Wong et al., 2015). The combination of these factors leads to many experimenters adding one or more Gs to the leading end of their guides. However, native Gs are still preferred and the abundance of available guide locations in these studies did not necessitate the addition of exogenous nucleotides.

Guide location selection

Location within the gene of interest and separation between guides (when multiple guides are used) is important because transcription proceeds from 5’ to 3’ and important features upstream of a mutation may still retain some or all functionality. For knockouts, a position early in exon 1 was preferred. Distance between guide cut sites was kept below 200bp to increase the likelihood of a large deletion. Several of the genes also had either interesting features (zinc finger of ramosa1) or had interaction regions from the yeast-2-hybrid. All guides were chosen to be inside or upstream of these regions. Potential guides starting with a G in these locations were then compiled and selected based on calculated efficiency, self-complementarity and the number of off target regions. See Figure 2-7 for chop chop output; guide color is based on overall score, with green, yellow and red ordered from best to worst. Again, due to the abundance of targets, neither self-complementarity nor any off targets were accepted in the final guides. The selected guides and their location considerations are listed in Table 4. Guide locations are depicted on each gene and coding sequence: ramosa1 in Figure 2-8, rho-GDI1 and the rho-GDI1 paralog in Figure 2-9, jmjC in Figure 2-10, selT in Figure 2-11, and ail6 in Figure 2-12.
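The selection rules in this section can be summarized as a filter followed by a pairing step. The Python sketch below is a hypothetical illustration: the record fields mimic the kind of values a guide design program reports, and ranking pairs by summed efficiency is an assumption made here, not a rule stated in the text.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Guide:
    seq: str           # guide sequence, 5' to 3'
    cut_pos: int       # cut site coordinate within the gene (bp)
    efficiency: float  # predicted efficiency score from the design tool
    self_comp: int     # number of self-complementary stems reported
    off_targets: int   # number of predicted off-target sites


def acceptable(guide):
    """Native leading G, no self-complementarity and no off-targets."""
    return (
        guide.seq.upper().startswith("G")
        and guide.self_comp == 0
        and guide.off_targets == 0
    )


def best_pair(guides, max_gap=200):
    """Pick two acceptable guides whose cut sites are under max_gap apart.

    Pairs are ranked by summed efficiency (an assumption for illustration).
    Returns None when no pair satisfies the constraints.
    """
    usable = [g for g in guides if acceptable(g)]
    pairs = [
        (a, b)
        for a, b in combinations(usable, 2)
        if 0 < abs(a.cut_pos - b.cut_pos) < max_gap
    ]
    if not pairs:
        return None
    return max(pairs, key=lambda pair: pair[0].efficiency + pair[1].efficiency)
```

Positional constraints such as "inside or upstream of the zinc finger" could be added as one more condition in `acceptable`, given the coordinates of the feature.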

Description and Assembly of the Cas9 Expression System

There were four plasmids used for these studies; three were provided by Dr. Bing Yang (Iowa State University) and the fourth contained a synthesized DNA segment designed by Dr. David Jackson (Cold Spring Harbor Laboratory). The three provided by Dr. Yang were pENTR-gRNA1 & 2 and pGW-Cas9 (Char et al., 2017). pENTR-gRNA1 & 2 are assembly vectors intended as incremental stepping stones for the generation of a plasmid containing four sequences of a U6 promoter (two unique promoters each used twice) followed by the researcher’s chosen guide sequence and the gRNA scaffold. This cassette is then flanked by attB sites for assembly into pGW-Cas9 plasmid (Figure 2-13). pGW-Cas9 contains the Cas9 gene, the bialaphos resistance gene and the right and left T-DNA borders necessary for Agrobacterium-mediated transformation. The final expression vector, pGW-Cas9, is used for both the synthetic gRNA array and the guide array assembled from the first two plasmids (Figure 2-14).

To assemble the pENTR-gRNA plasmids, both vectors are digested with restriction enzymes BtgZI and BsaI in sequential rounds of cloning and ligation with custom double stranded DNA oligonucleotides. The region between HindIII sites in gRNA2 is removed with HindIII digestion and ligated into gRNA1 to create an array of four promoter-guide-scaffold segments. Complementary oligonucleotides were synthesized to match the guide and to provide the complementary overhangs created by the type IIS restriction enzymes; these are listed in Table 5. Both BtgZI and BsaI cut outside their recognition site, which allowed for custom designing of the overhangs and a unidirectional ligation product (Figure 2-15).

To generate a double stranded fragment for cloning, 20 µL of each oligo (at 100 µM) was added to 20 µL of 5 M NaCl solution and 20 µL of dH2O. The mixture was heated in a thermocycler to 95 ℃ for 10 min, removed to the bench and cooled slowly to room temperature. This solution was diluted 1:200 and 2 µL was used in ligation reactions with appropriately digested gRNA1 and 2 vectors.
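As a check on the quantities involved, the annealing and dilution arithmetic can be worked through in Python. The sketch assumes that "each oligo" refers to the two complementary oligos of one pair (80 µL total volume), that annealing is complete, and that 1:200 means one part annealed mix in 200 parts final volume; the function name and returned keys are hypothetical.

```python
def annealed_oligo_amounts(
    oligo_stock_uM=100.0,
    oligo_vol_uL=20.0,
    nacl_vol_uL=20.0,
    water_vol_uL=20.0,
    dilution_factor=200.0,
    ligation_vol_uL=2.0,
):
    """Estimate duplex concentration and the amount added per ligation."""
    # Two complementary oligos plus NaCl and water make up the annealing mix.
    total_vol_uL = 2 * oligo_vol_uL + nacl_vol_uL + water_vol_uL
    # Each oligo is diluted into the total volume; full annealing is assumed.
    duplex_uM = oligo_stock_uM * oligo_vol_uL / total_vol_uL
    diluted_uM = duplex_uM / dilution_factor
    # Concentration in uM multiplied by volume in uL gives picomoles.
    pmol_per_ligation = diluted_uM * ligation_vol_uL
    return {
        "total_vol_uL": total_vol_uL,
        "duplex_uM": duplex_uM,
        "diluted_nM": diluted_uM * 1000.0,
        "pmol_per_ligation": pmol_per_ligation,
    }


amounts = annealed_oligo_amounts()
```

Under these assumptions the annealed duplex is at 25 µM, the diluted working stock is at 125 nM, and each ligation receives 0.25 pmol of insert.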

The gRNA1 and 2 plasmids were prepared in tandem doing the BtgZI and BsaI reactions in parallel pipelines. After digestion with the enzyme, the product was gel purified to prevent re-ligation of the previous insert and to allow for separation of uncut and linear plasmid DNA. The digested, linear plasmid was cut from the gel and extracted using a gel clean up spin kit (Qiagen). 100ng of the vector fragment was ligated with the annealed DNA oligo pairs with T4 DNA ligase (NEB) and transformed into One Shot Top10 chemically competent E. coli (Thermo Fisher). Transformation was performed following the manufacturer’s protocol (page 8 of https://assets.thermofisher.com/TFS-Assets/LSG/manuals/oneshottop10_man.pdf) except one 50 µL tube was split to accommodate two reactions. The plated bacteria were grown overnight at 37℃ on LB agar containing 50 mg/L kanamycin.

Array screening

Individual colonies were screened using colony PCR and the primer pairs described in Table 5. To perform the colony PCR, numbers were assigned to individual colonies while still on the agar plate, PCR reactions were prepared, and a 10 µL pipette tip was touched to the colony and swirled inside the respective tube while pipetting up and down. Standard PCR conditions were modified to extend the initial denaturation at 95 ℃ to 5 min to ensure the cells were lysed. The products were viewed following electrophoresis on a 1.5% agarose gel (Figure 2-16).

The first round produced 1 positive out of 24 colonies, or around 4% efficiency, on the ligation reaction. This was likely due to the close proximity of the two restriction enzyme sites and an inability to distinguish between single digest and double digest plasmids on an agarose gel with a 20bp size difference. To resolve this issue, the restriction time was doubled, the number of enzyme units was doubled, and an additional double dose of enzyme was added at the halfway point. With these measures over 90% efficiency was achieved on both plasmids with both enzymes (Figure 2-16).


Positive colonies were grown overnight in 4mL LB containing 50 mg/L kanamycin (in a 15 mL falcon tube) at 37℃ in a shaking incubator at 200 rpm. DNA was extracted using a Qiagen mini-prep kit and the product was quantified spectrophotometrically for concentration.

gRNA cassette assembly

ramosa1 4x guides 1 and 2 were assembled in pENTR-gRNA1 while rho-GDI1 4x guides 1 and 2 were assembled in pENTR-gRNA2. Both plasmids were DNA sequenced with one of the PCR primers to verify before proceeding. After verification both were digested with HindIII and run on 2% agarose gels. pENTR-gRNA2 was digested into two pieces; the 1000bp band was used as the insert of a ligation reaction with pENTR-gRNA1. After ligation they were transformed into E. coli as described previously. A colony PCR using a combination of primers from gRNA1 and 2 verified the plasmid. Further verification was accomplished with DNA sequencing.

Synthetic 8x array

The 8x array was ordered from GeneArt and was delivered inside the pMA-RQ ampicillin resistant plasmid. The array was designed with attL sites flanking the guides in order to be easily gateway cloned into a final expression vector (panel B in Figure 2-14).

The final cloning step into the pGW-Cas9 expression plasmid was the same for both the 4x and 8x arrays and accomplished using the gateway LR clonase. This reaction swaps the contents of each plasmid between attL and attR sites (Figure 2-15). In the expression plasmid the CcdB gene is located between the attR sites. This protein is a topoisomerase II poison and requires the presence of CcdA to counteract its gyrase trapping action (Bernard 1993, 1994). The expression vector is maintained in a CcdB resistant E. coli, but once the gateway reaction is performed the reaction products are transformed into Top10 E. coli and any bacterium containing the CcdB gene will die. Since the entry and expression vectors use different antibiotic resistance, the only colonies able to grow contain the expression vector with the correct insert (without the poison gene).
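The double selection described here can be expressed as a simple truth table. The Python sketch below models the four plasmid species that may be present after an LR reaction with the 4x entry vector and asks which would give colonies on spectinomycin in a CcdB-sensitive host. The plasmid labels and the one-plasmid-per-cell simplification are assumptions for illustration.

```python
from typing import NamedTuple


class Plasmid(NamedTuple):
    name: str
    resistance: str  # antibiotic resistance carried by the backbone
    has_ccdb: bool   # True if the CcdB poison gene is present


def survives(plasmid, selection, host_tolerates_ccdb=False):
    """A cell grows only if it resists the antibiotic and escapes CcdB."""
    if plasmid.resistance != selection:
        return False
    if plasmid.has_ccdb and not host_tolerates_ccdb:
        return False
    return True


# Species expected after the LR reaction (entry vector on kanamycin,
# expression vector backbone on spectinomycin).
LR_PRODUCTS = [
    Plasmid("unreacted entry vector", "kanamycin", False),
    Plasmid("unreacted expression vector", "spectinomycin", True),
    Plasmid("expression vector with guide array", "spectinomycin", False),
    Plasmid("entry backbone by-product", "kanamycin", True),
]

survivors = [p.name for p in LR_PRODUCTS if survives(p, "spectinomycin")]
```

Only the recombined expression vector passes both filters, which matches the reasoning in the text; for the 8x array the entry backbone carries ampicillin rather than kanamycin resistance, and the outcome is the same.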

Positive colonies for each of the 4x and 8x arrays were selected and streaked onto LB + 100 mg/L spectinomycin plates. Individual colonies were grown overnight, and plasmid DNA was extracted. Both plasmids were verified with restriction enzyme digestion (4x: NotI, 8x: PstI) and further verified by DNA sequencing. This required several sequencing reactions due to the size of these sequences (4x: 1950bp and 8x: 2470bp, Cas9 gene: 4.3 Kb).

Transformation into Agrobacterium tumefaciens EHA101

The 4x and 8x expression vectors were electroporated into Agrobacterium tumefaciens EHA101. Agrobacterium was spread on YEP plates containing 50mg/L kanamycin and grown at 28 ℃ for 2 days. Three tubes were prepared with 150 µL of sterile H2O and 3-4 loops of Agrobacterium. The tubes were then centrifuged for 1 minute and the supernatant was pipetted off. The Agrobacterium was then re-suspended in 150 µL of sterile H2O; this wash step was performed 3 times. The final resuspension was in 100 µL sterile H2O. 1 µL of plasmid DNA from each of the 4x and 8x arrays was pipetted into two of the tubes and the entire volume was transferred to an electroporation cuvette and electroporated with a Bio-Rad MicroPulser on the bacterial setting. After electroporation each cuvette was immediately flushed with 800 µL of YEP (without antibiotics) and transferred to a 1.5mL tube. These were incubated in a 200 rpm 28℃ shaker for 3-6 hours. 25 µL and 250 µL of each culture were plated onto separate kanamycin (Agrobacterium selection) plus spectinomycin (expression vector selection) YEP plates and grown for 48-72 hours at 28℃. Pinhead sized colonies were visible by 24 hours.

The 25 µL plates contained 50-100 colonies; however, these colonies needed to be verified (Array and Cas9) before they could be transformed into maize embryos. A colony PCR method similar to that used for E. coli failed to consistently produce predicted amplicons. DNA was therefore extracted from YEP (with spectinomycin and kanamycin) liquid cultures derived from single colony Agrobacterium isolates. Colonies were picked and patched to another plate for further growth and the tip was then swirled and pipetted up and down in labeled tubes of 10mL of YEP with spectinomycin and kanamycin. The patch plates were grown in an incubator at 28℃ for 48-72 hours. The YEP liquid cultures were grown at 28℃ in a shaking incubator at 200 rpm for 24-48 hours. A similar mass of plasmid DNA can be produced from 1-2 mL culture of E. coli; however, the pGW-Cas9 plasmid is relatively low-copy in Agrobacterium, thus requiring a larger (10mL) culture volume.

Array screening in Agrobacterium tumefaciens

The DNA was then extracted using a Qiagen mini-prep kit following the modifications described by the Iowa State University Plant Transformation Facility (PTF) to match the larger starting volume. Briefly, the 10 mL of colony growth was pelleted, and the supernatant was removed. The pellet was re-suspended in 500 µL of cold P1 buffer and transferred to a 2 mL tube. 500 µL of P2 buffer was added and mixed with gentle inversion of the tube 10-12 times.

The tube was placed on ice for seven minutes after which the reaction was stopped with 700 µL of N3 buffer and mixed by gentle inversion 10-12 times. A precipitate formed as it incubated at room temperature for seven minutes. The tubes were then centrifuged for 15 minutes at 15,000 rpm and the supernatant was applied (750 µL at a time) to a Qiagen mini-prep column. The column was centrifuged for 30 seconds at 15,000 rpm, the flow through was discarded and these steps were repeated until all the supernatant had been processed through the column. This process of centrifuging and discarding was repeated with 500 µL of PB buffer and 750 µL of PE buffer. The column was dried by centrifuging for one minute at 15,000 rpm and then placed in a new 1.5 mL tube. 50 µL of 60 ℃ pre-warmed sterile water was added to the column and it was left on the bench for three minutes to allow for full permeation of the filter. The tube was centrifuged for 1 minute at 15,000 rpm and the DNA was checked spectrophotometrically for concentration.
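The column-loading step and the expected yield can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that the full lysate volume (P1 + P2 + N3) is carried over as supernatant and that the full elution volume is recovered, both of which slightly overestimate the real values. The concentration range is the one reported for these preparations later in this section (10-30 ng/µL), and all variable names are hypothetical.

```python
import math

# Buffer volumes from the modified mini-prep protocol (µL)
P1_UL, P2_UL, N3_UL = 500, 500, 700
COLUMN_LOAD_UL = 750   # volume applied to the column per spin
ELUTION_UL = 50        # pre-warmed water used for elution

# Assumption: all lysate is recovered as supernatant (upper bound)
lysate_ul = P1_UL + P2_UL + N3_UL
n_loads = math.ceil(lysate_ul / COLUMN_LOAD_UL)

# Assumption: the full elution volume is recovered; concentrations in ng/µL
conc_low, conc_high = 10, 30
yield_ng = (conc_low * ELUTION_UL, conc_high * ELUTION_UL)

print(f"{lysate_ul} µL lysate -> {n_loads} column loads")
print(f"expected yield: {yield_ng[0]}-{yield_ng[1]} ng")
```

Under these assumptions the supernatant requires three passes through the column, and one preparation yields roughly 0.5-1.5 µg of plasmid DNA.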

Quality control of the plasmid in E. coli was straightforward, while screening in

Agrobacterium was problematic. This is a well-known problem, which was addressed by processing large numbers of colonies using PCR and then performing restriction digests on the positive ones. Besides the comparative difficulty of producing enough DNA, Agrobacterium also had a higher ratio of bacterial chromosomal to plasmid DNA. This can be seen in Figure 2-17, where primers that produced clean single +/- bands in E. coli produced errant or additional

DNA bands in Agrobacterium. Several combinations of primers and temperatures were attempted until usable conditions were found. It was discovered that the last 700 bp on the 3’ end of Cas9 was susceptible to being lost in Agrobacterium. Primers that amplified this region became the standard screening for the presence of Cas9 (Cas-screen-Fw/Rv) (Table 5).

The restriction digests performed on Agrobacterium-derived plasmid DNA were uninformative due to low DNA concentrations (10-30 ng/µL). Digested plasmid DNA was not visible on an agarose gel. An attempt to amplify the Cas9 and array region was performed with Q5 High

Fidelity DNA polymerase. An 11 kb fragment was amplified and digested with PstI for the 8x array and NcoI for the 4x array. The 8x array produced the expected digestion pattern and those colonies started the maize transformation process. The 4x array was started again from the electroporation step and the growth and QC process was repeated. Plasmid DNA from a few Agrobacterium colonies that tested positive with the Cas9 PCR was re-transformed into E. coli for the restriction digestion test. For this reason, the first half of the ears were transformed with the 8x array and the last half with the 4x array.

Agrobacterium transformation of Maize Hi-II

Hi-II immature zygotic embryos for transformation are dissected from a segregating F2 population. The F1 seeds are maintained from cross pollinated parent A and parent B lines

(Yadava et al., 2017). F1 seeds were donated from three sources; one line did not germinate, but plants were grown from the other two. These two sources, called W and S, were grown in the

Curtiss Farm field in summer 2018. The F1 plants were sibling pollinated and F2 embryos were dissected 10-14 days post pollination, when the average embryo size is 1.5-2.0 mm. The

Agrobacterium-mediated transformation of the Hi-II immature zygotic embryos was performed according to Frame et al. (2002) with a few modifications to the embryo extraction protocol. The paper specifies using a pair of forceps inserted into the end of the ear. These split the end of the ear and were difficult to hold for extended periods of time. A custom handle was created from a deck screw welded to the tip of a Phillips screwdriver (Figure 2-18). The screw allowed much deeper penetration of the body of the ear without splitting, and the handle was much more comfortable to hold. Care was taken to choose a screwdriver which could be sterilized easily and lacked hidden crevices in which bacteria could hide. The protocol also listed the embryo extraction tool as a sharpened spatula. The tips of these were found to be much wider than needed and were difficult to hold for long periods. A dental spatula solved both problems, as the tip was much narrower and the handle was thicker and designed to be held; the one used in this experiment was the Hu-Friedy No 03 Flexi Thin TNCIGFT3.

Embryo-derived events leading to prolific type II callus growth under bialaphos selection were assigned designations B1-B142 and plants were regenerated according to Frame et al.

(2002). Plants were transferred to soil when at least two leaves were visible and were grown according to Dr. Nick Lauter’s transgenic plant greenhouse protocol.

To identify mutations, DNA was extracted from each plant, including individuals of intertwined plants potted together. PCR for each target gene was performed on each plant and the

DNA amplicons were sequenced. The CRISPR target sites on all genes, except ail6, were close enough to allow one sequencing reaction to cover both target sites.

Results

A goal of these studies was to use CRISPR/Cas9 technology to generate novel mutations in the maize ramosa1 gene alone and in combination with mutations in genes hypothesized to interact with RAMOSA1. Four ramosa1 mutants (ra1-M7, M8, M11 and M12) show verifiable loss of function, as evidenced by the ramosa phenotype in tassels of first generation plants (Figure 2-27).


Of the 27 ears originally transformed, 18 produced embryogenic events surviving on selection. All the surviving regenerated plants came from just seven ears. One ear (E104) was so prolific that 53.6% of the surviving plants stem from it. The 27 ears transformed represented two sources: 12 W and 15 S (Table 6). All regenerated plants originated from embryos extracted from S ears (Table 7). Of the 142 callus lines selected for growth on bialaphos-containing media, 28 successfully regenerated plants (Table 8). 112 plants, representing 16 of the original callus lines, were transplanted in the greenhouse. Interestingly, only seven of the plants were transformed with the 4x array.

Sequence analysis uncovers novel mutations in all target genes with varied efficiency

All target gene amplicons were sequenced and analyzed in each plant. DNA sequence files (*.ab1) were aligned to the B73 and Hi-II parental sequences using the Geneious software package (Biomatters). One indicator of a monoallelic edit is a drop in sequence quality (confidence score) that coincides with the location of the guide RNA target site (Figure 2-19). There were also times when confidence levels did not drop but an edit had occurred. These were caused by biallelic insertions or deletions (indels) and were indicated by a gap in either the test or reference sequence (Figure 2-20). These could be the same base or different bases inserted at the cut site

(Figure 2-21).
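The quality-drop signal described above can be expressed as a simple comparison of mean quality on either side of the predicted cut site. The following is a minimal sketch, not the procedure used in this study, where alignments were inspected in Geneious. The function name, window size and threshold are hypothetical, and the per-base values are assumed to be Phred-like scores already extracted from a trace.

```python
from statistics import mean

def quality_drops_at_site(quals, site, window=20, min_drop=15.0):
    """Return True if mean quality after `site` falls well below the mean before it.

    quals    : per-base quality scores (assumed Phred-like), 5' to 3'
    site     : 0-based index of the predicted cut site in the read
    window   : number of bases compared on each side (hypothetical default)
    min_drop : minimum decrease in mean quality to flag (hypothetical default)
    """
    before = quals[max(0, site - window):site]
    after = quals[site:site + window]
    if not before or not after:
        return False  # site too close to a read end to judge
    return mean(before) - mean(after) >= min_drop

# Toy data: a clean read that degrades at index 30 (as for a monoallelic indel)
mono = [55] * 30 + [20] * 30
# Toy data: a uniformly clean read (unedited, or biallelic with identical alleles)
clean = [55] * 60

print(quality_drops_at_site(mono, 30), quality_drops_at_site(clean, 30))
```

A check of this kind would flag only monoallelic edits; biallelic edits with identical alleles leave the trace clean and must be found from gaps in the alignment, as described above.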

Mutations uncovered by DNA sequencing analysis are described in Table 9. To date, 13 unique mutations have been detected in ramosa1, 8 in jmjC, 16 in selT, 4 in ail6 and 1 in rho-GDI1. Single base indels make up the majority (80%) of all mutations recovered (Table 12). Separated by type, 98.5% of all insertions were single base while only 40.6% of all deletions were single base. This distribution can be seen visually in

Figure 2-24. There was also clearly a bias toward “A” or “T” base insertions. “A” insertions made up 41% while “T” insertions were observed in 47% of all insertions (Figure 2-25).

From the total 85 surviving plants treated with the 8x array, 2 have all four genes edited,

17 have three genes edited, 38 have two genes edited, 23 have one gene edited and 5 were not edited (Figure 2-26). Some plants were also edited on both alleles of the same gene; this is described in Table 10, where each gene of interest is separated into two columns which represent the two alleles.
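The plant counts above can be tallied to confirm that they account for all 85 plants and to summarize the multiplexing outcome. This is a small bookkeeping sketch using only the numbers given in the text; the variable names are hypothetical.

```python
# Number of 8x-array plants by number of target genes edited (from the text)
plants_by_genes_edited = {4: 2, 3: 17, 2: 38, 1: 23, 0: 5}

total_plants = sum(plants_by_genes_edited.values())
edited_plants = total_plants - plants_by_genes_edited[0]
mean_genes_edited = (
    sum(k * n for k, n in plants_by_genes_edited.items()) / total_plants
)

print(total_plants, edited_plants, round(mean_genes_edited, 2))
```

The counts sum to the 85 plants reported; 80 of them (about 94%) carry at least one edited gene, with a mean of about 1.9 edited genes per plant.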

A series of B73 ra1-63 plants were planted to cross with the parental generation of regenerated plants. This line was selected because most of the ramosa1 phenotyping was conducted in B73 and the mild phenotype of ra1-63 allows enhancement or suppression to be detected.

However, due to the very short maturation period of the regenerated plants and greenhouse stress, not enough cross plants were ready when the parental tassels began to shed. The Iowa State

University Plant Transformation Facility graciously donated B104 ears and B73 pollen for crossing.

Discussion

ramosa1 mutations create interesting future study possibilities

The experiments were designed to generate mutations in or upstream of functionally relevant regions in each gene. In ramosa1, several mutations were generated which could be used to test the importance of various functional domains, such as the Cys2-His2 zinc finger and the two 3’ EAR domains. In the Cys2-His2 zinc finger, for example, in-frame deletions were

recovered that resulted in alterations in the canonical Cys2-His2 residues: the ra1-M2 mutation results in the deletion of both histidines while the ra1-M8 and ra1-M13 mutations result in the deletion of both cysteines. These new alleles should be valuable to test the importance of a structurally sound zinc finger and the importance of DNA recognition in ramosa1 activity.

Another interesting in-frame deletion is ra1-M1, which changes Arg69 (positively charged) and Leu70

(neutral) into a single Met (sulfur containing), immediately after the second Histidine in the zinc finger. A diagram of ra1 mutations can be seen in Figure 2-28.

Most of the other ramosa1 mutants are in the “frame-shift early termination” category.

Four of note delete parts of the zinc finger and terminate early. ra1-M4, ra1-M11 and ra1-M12 all delete the second histidine and terminate 5, 25 and 7 amino acids later, respectively. ra1-M7 terminates the protein between the cysteine and histidine sites. Mutants after the zinc finger are likely to affect the EAR domains, which are known active domains (Gallavotti et al., 2010). ra1-M5, ra1-M3 and ra1-M9 are immediately after the second histidine and create an incorrect string of

20 (ra1-M5) or 24 (ra1-M3 & M9) amino acids before termination. ra1-M10 preserves the zinc finger but terminates one amino acid after the second Histidine.

These mutations were produced by both the 4x and 8x array. While both guides were active in the 4x array, activity was only detected at guide 2 in the 8x array. The 13 unique mutations generated were discovered in 31 alleles across 27 plants.

jmjC guides produced mostly frame-shift early termination mutations

The ramosa1 interaction domain for jmjC is across exons 2 and 3 between amino acids

142-318; this region also includes a putative monoamine-oxidase zinc finger domain (residues

210-276). Both CRISPR guides were designed in exon 1 with the intent of creating a nonsense protein and an early stop. jmjC-M8 is the only in-frame mutation. jmjC-M1, jmjC-M4, jmjC-M5 and jmjC-M6 are frameshift mutations which terminate before the 3’ end of exon 1. The remaining mutations, jmjC-M2, jmjC-M3 and jmjC-M7, also result in frameshifts; however, they do not terminate translation until exon 3. All of these, except perhaps jmjC-M8, should create non-functional proteins.

Both guides were active in jmjC; however, guide 2 was much more active. Mutations from guide 1 were recovered 5 times while mutations from guide 2 were recovered 84 times. The 8 unique mutations were recovered a combined 89 times, and jmjC-M1 accounts for 55 of these. This may be connected to the guide 2 cut site lying within a run of Ts, as jmjC-M1 is a T insertion. A diagram of jmjC mutations can be seen in Figure

2-29.

selT guides were both highly active

The selT guides generated 16 unique mutants, the most of any gene, likely due to the high activity of both guides. Both guides are in exon 1, 5’ of the ramosa1 interaction domain defined in the Y2H experiment as residues 32-195. One of the mutations (selT-M8) started at amino acid

(AA) 4 and terminated 50 amino acids later. Nine of the 16 mutations start at AA 5 and terminate

50-114 AAs later (selT-M2, M3, M5, M6, M7, M12, M13, M14 and M16). Six of these were created by single base insertions at both guide sites (selT-M2, M3, M12, M13, M14, and M16).

They were classified as distinct mutations due to the insertion of different combinations of bases at each site. There are 16 possible combinations of single base insertions across two guides; however, because A or T insertions are far more frequent than C or G insertions, the likelihood of seeing all combinations is diminished. The selT guides differed from the other gene guides in that all G or C insertions came from this gene.
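The 16 combinations mentioned above follow from four possible bases at each of two sites. The sketch below enumerates them and, as a rough illustration only, estimates how often both insertions would be A or T if the pooled insertion frequencies reported in the Results (A 41%, T 47%) applied independently at each site. That independence is an assumption made here for illustration and was not tested in this study.

```python
from itertools import product

BASES = "ACGT"

# Every ordered pair (base inserted at guide 1, base inserted at guide 2)
combos = ["".join(pair) for pair in product(BASES, repeat=2)]

# Pairs built only from A or T, the bases most often inserted in this data set
at_only = [c for c in combos if set(c) <= {"A", "T"}]

# Assumption: pooled insertion frequencies apply independently at each site;
# this is an illustration, not a fitted model
p_at = 0.41 + 0.47
p_both_at = p_at ** 2

print(len(combos), at_only, round(p_both_at, 2))
```

Under this simplified assumption, roughly three quarters of double insertions would fall into just 4 of the 16 classes, which illustrates why recovering every combination is unlikely.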

The remaining six mutations are one in-frame deletion of two amino acids (selT-M4) and five frame-shifts, one at L56 (selT-M15) and four at G58 (selT-M1, M9, M10 and M11), which all terminate 60-65 AA later. All selT frame-shift mutations terminate inside exon 1 and create nonsense proteins inside the interaction domain, effectively knocking out gene function. A diagram of selT mutations can be seen in Figure 2-30.

Activity at both target sites made the identification of specific edits challenging because the traces degenerate after the first cut. All annotated mutations with known double cuts were biallelic at the first guide and the traces did not degrade. This leads to the possibility that some or all of the monoallelic plants might have a second undetectable mutation.

There were also a few circumstances where both sites had different bases inserted. Using these sequencing methods, it is impossible to tell exactly which allele had which base inserted (Figure

2-21).

ail6 guides were largely inactive

Activity in ail6 targets was limited to guide 1 and four unique mutations were recovered.

Three are frame-shift early termination mutations starting at P59: ail6-M1 and ail6-M3 terminate 3 AA later and 1 AA later, respectively, in exon 1, and ail6-M2 terminates 29 AA later in exon 2.

All three are likely to be functional knockouts unless important functional regions occur before residue 59. The final mutation, ail6-M4, is an in-frame 5 AA deletion upstream of the AP2 domains and is not likely to be a functional knockout. Mutations in this gene were also more difficult to assess because of the distance between cut sites. All the other genes in these studies required only one sequencing reaction to assess activity at both guide sites. With over 1,100 bp between the ail6 sites, separate reactions were required, which added time and expense. A diagram of ail6 mutations can be seen in Figure 2-31.
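The distinction between in-frame and frame-shift mutations used throughout this discussion rests on the standard rule that an indel preserves the reading frame only when its net length change is a multiple of three. The sketch below applies that rule to three alleles whose sizes are stated in the text. The function name is hypothetical, and the rule alone does not predict function, since an in-frame change can still remove essential residues.

```python
def classify_indel(inserted_bp: int, deleted_bp: int) -> str:
    """Classify an indel by its net effect on the reading frame."""
    net = inserted_bp - deleted_bp
    if net == 0:
        return "no net length change"
    return "in-frame" if net % 3 == 0 else "frame-shift"

# Sizes taken from the descriptions in the text: (inserted bp, deleted bp)
examples = {
    "jmjC-M1 (single T insertion)": (1, 0),
    "selT-M4 (two amino acid deletion)": (0, 6),
    "ail6-M4 (five amino acid deletion)": (0, 15),
}

for name, (ins_bp, del_bp) in examples.items():
    print(name, "->", classify_indel(ins_bp, del_bp))
```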

rho-GDI1 mutagenesis was limited by ear availability

Only one rho-GDI1 mutation was recovered, in two plants. It was a one bp deletion created by guide 1. It is categorized as a V74 frame-shift which terminates 28 AA later. This is roughly one quarter of the way into the RAMOSA1 interaction domain (residues 54-147) as defined in the

Y2H experiment. The location and nature of this mutation suggest that a non-functional protein is generated. The lack of mutations recovered in this target gene is likely more a product of fewer transformable ears than the guides or the array themselves. A diagram of rho-GDI1 mutations can be seen in Figure 2-32.
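The position of the frame-shift within the interaction domain can be expressed as a fraction of the domain length. This is a minimal sketch using the residue numbers given above; the function name is hypothetical and the calculation treats the domain as a simple linear span.

```python
def fraction_into_domain(residue: int, start: int, end: int) -> float:
    """Relative position of a residue within a domain spanning start..end."""
    if not start <= residue <= end:
        raise ValueError("residue lies outside the domain")
    return (residue - start) / (end - start)

# V74 within the RAMOSA1 interaction domain of rho-GDI1 (residues 54-147)
frac = fraction_into_domain(74, 54, 147)
print(f"V74 lies {frac:.0%} of the way into residues 54-147")
```

This gives a value of about 0.22, close to the one-quarter estimate given above, so most of the interaction domain lies downstream of the frame-shift.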

Multi-guide strategy is prudent and functional

In order to analyze the productivity of each target guide each mutation event needed to be counted. This was somewhat problematic due to the difficulty in determining when any individual mutation event occurred. Any events that occurred in different plant lines were scored individually but events that occurred multiple times within a plant line were treated as suspect.

The first goal was to identify mutations that were unlikely to occur spontaneously; large deletions of exactly the same size and position, such as jmjC-M2, were all collapsed into a single event. Conversely, any mutation that occurred in multiple lines, such as jmjC-M1 or selT-M9, was immediately treated as more likely to occur spontaneously. Events were then broken down and worked through until the number of actual mutations occurring in each line was calculated and tabulated (Table 12).

The strategy of selecting multiple guides per gene may seem wasteful of resources, but these studies demonstrate its value. Nine of the 12 guides demonstrated some level of activity. Each gene had activity in at least one of its guides, and both guides in selT and both 4x ramosa1 guides were active. While guide 1 in jmjC did produce three mutants, guide 2 produced 32, so guide 1 was relatively less active. selT mutagenesis would have worked with only one guide, but for the other genes there would have been a 50% chance of choosing the active guide on the first attempt.
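The guide-level summary can be reconstructed from the per-gene descriptions in the preceding sections. The table below is read from the text and is an interpretation rather than data from Table 12: a guide for which no mutation is reported is scored here as inactive, and the dictionary layout and names are hypothetical.

```python
# (array, gene) -> activity detected at (guide 1, guide 2), as read from the text.
# Assumption: a guide with no reported mutation is scored as inactive.
guide_activity = {
    ("8x", "ramosa1"): (False, True),
    ("8x", "jmjC"): (True, True),
    ("8x", "selT"): (True, True),
    ("8x", "ail6"): (True, False),
    ("4x", "ramosa1"): (True, True),
    ("4x", "rho-GDI1"): (True, False),
}

total_guides = sum(len(g) for g in guide_activity.values())
active_guides = sum(sum(g) for g in guide_activity.values())
genes_with_activity = {gene for (_, gene), g in guide_activity.items() if any(g)}

print(f"{active_guides} of {total_guides} guides active")
print(sorted(genes_with_activity))
```

Scored this way, the tally agrees with the nine active guides out of 12 stated above, and every target gene has at least one active guide.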

To determine if activity was a result of position or promoter choice, the activity at each location in the array was compared (see Figure 2-14 for the array and Figure 2-33 for a graphic of the event frequency by guide and location in the 8x array). It is true that the two guides in positions 7 and 8 of the 8x array performed poorly; if activity were to be attributed only to location, the middle four would be the most desired positions. However, the physical position of the guide target on the gene is also a likely cause. The promoter used could also be a factor, but the two most active guides (jmjC guide 2 and selT guide 2) were driven by rice promoters instead of maize. Also, jmjC guide 2 was the second most active guide and was driven by the same promoter as ail6 guide 2, one of the least active guides. This suggests that promoter choice was not a likely factor in guide success; however, the effect of placement within the array has not been tested directly.

References

Anders, C., Niewoehner, O., Duerst, A. & Jinek, M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569–573 (2014).

Anton Persikov and Mona Singh (2014) "De Novo Prediction of DNA-binding Specificities for Cys2His2 Zinc Finger Proteins". NAR, 42(1): 97-108. Epub 2013 Oct 3.

Bernard, P., et al. “The F plasmid CcdB protein induces efficient ATP-dependent DNA cleavage by gyrase.” J Mol Biol. 1993 Dec 5;234(3):534-41. PubMed PMID: 8254658.

Bernard, P., et al. “Positive-selection vectors using the F plasmid ccdB killer gene”. Gene. 1994 Oct 11;148(1):71-4. PubMed PMID: 7926841.

Brückner, Anna, et al. “Yeast Two-Hybrid, a Powerful Tool for Systems Biology.” International Journal of Molecular Sciences, vol. 10, no. 6, June 2009, pp. 2763–88. PubMed Central, doi:10.3390/ijms10062763.

Char, Si Nian, et al. “An Agrobacterium-Delivered CRISPR/Cas9 System for High-Frequency Targeted Mutagenesis in Maize.” Plant Biotechnology Journal, vol. 15, no. 2, 2017, pp. 257–68. Wiley Online Library, doi:10.1111/pbi.12611.

Doench, John G., et al. “Rational Design of Highly Active SgRNAs for CRISPR-Cas9-Mediated Gene Inactivation.” Nature Biotechnology, vol. 32, no. 12, Dec. 2014, pp. 1262–67. PubMed, doi:10.1038/nbt.3026.

Doench, John G., et al. “Optimized SgRNA Design to Maximize Activity and Minimize Off-Target Effects of CRISPR-Cas9.” Nature Biotechnology, vol. 34, no. 2, Feb. 2016, pp. 184–91. PubMed, doi:10.1038/nbt.3437.

Eveland, A. L., et al. (2013). Regulatory modules controlling maize inflorescence architecture. Genome Research, 24(3), 431–443.

Ferreira, Raphael, et al. “Exploiting Off-Targeting in Guide-RNAs for CRISPR Systems for Simultaneous Editing of Multiple Genes.” FEBS Letters, vol. 591, no. 20, 2017, pp. 3288–95. Wiley Online Library, doi:10.1002/1873-3468.12835.

Frame, Bronwyn R., et al. “Agrobacterium Tumefaciens-Mediated Transformation of Maize Embryos Using a Standard Binary Vector System.” Plant Physiology, vol. 129, no. 1, May 2002, pp. 13–22. www.plantphysiol.org, doi:10.1104/pp.000653.


Gallavotti, A., Long, J. A., Stanfield, S., Yang, X., Jackson, D., Vollbrecht, E. and Schmidt, R. J. (2010). The control of axillary meristem fate in the maize ramosa pathway. Development 137, 2849-56.

Kagale, Sateesh, et al. “Genome-Wide Analysis of Ethylene-Responsive Element Binding Factor-Associated Amphiphilic Repression Motif-Containing Transcriptional Regulators in Arabidopsis.” Plant Physiology, vol. 152, no. 3, Mar. 2010, pp. 1109–34. www.plantphysiol.org, doi:10.1104/pp.109.151704.

Kagale, Sateesh, and Kevin Rozwadowski. “EAR Motif-Mediated Transcriptional Repression in Plants.” Epigenetics, vol. 6, no. 2, Feb. 2011, pp. 141–46. PubMed Central, doi:10.4161/epi.6.2.13627.

Karvelis, Tautvydas, et al. “CrRNA and TracrRNA Guide Cas9-Mediated DNA Interference in Streptococcus Thermophilus.” RNA Biology, vol. 10, no. 5, May 2013, pp. 841–51. Taylor and Francis+NEJM, doi:10.4161/rna.24203.

Khatter, Heena, et al. “RNA Polymerase I and III: Similar yet Unique.” Current Opinion in Structural Biology, vol. 47, Dec. 2017, pp. 88–94. ScienceDirect, doi:10.1016/j.sbi.2017.05.008

Kornel Labun; Tessa G. Montague; James A. Gagnon; Summer B. Thyme; Eivind Valen. (2016). CHOPCHOP v2: a web tool for the next generation of CRISPR genome engineering. Nucleic Acids Research; doi:10.1093/nar/gkw398

Ma, Hongming, et al. “Pol III Promoters to Express Small RNAs: Delineation of Transcription Initiation.” Molecular Therapy. Nucleic Acids, vol. 3, no. 5, May 2014, p. e161. PubMed Central, doi:10.1038/mtna.2014.12.

May, Bruce P., et al. “Maize-Targeted Mutagenesis: A Knockout Resource for Maize.” Proceedings of the National Academy of Sciences, vol. 100, no. 20, Sept. 2003, pp. 11541–46. www.pnas.org, doi:10.1073/pnas.1831119100.

Pace, Nicholas J., and Eranthie Weerapana. “Zinc-Binding Cysteines: Diverse Functions and Structural Motifs.” Biomolecules, vol. 4, no. 2, Apr. 2014, pp. 419–34. PubMed Central, doi:10.3390/biom4020419

Sigmon, Brandi, and Erik Vollbrecht. “Evidence of Selection at the Ramosa1 Locus during Maize Domestication.” Molecular Ecology, vol. 19, no. 7, Apr. 2010, pp. 1296–311. PubMed, doi:10.1111/j.1365-294X.2010.04562.x.

Svitashev, Sergei, et al. “Genome Editing in Maize Directed by CRISPR–Cas9 Ribonucleoprotein Complexes.” Nature Communications, vol. 7, Nov. 2016, p. 13274. www.nature.com, doi:10.1038/ncomms13274.

Takatsuji, H. Zinc-finger proteins: the classical zinc finger emerges in contemporary plant science. Plant Mol. Biol. 39, 1073–1078 (1999).

Tessa G. Montague; Jose M. Cruz; James A. Gagnon; George M. Church; Eivind Valen. (2014). CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing. Nucleic Acids Res. 42. W401-W407

Thyme, Summer B., et al. “Internal Guide RNA Interactions Interfere with Cas9-Mediated Cleavage.” Nature Communications, vol. 7, June 2016, p. 11750. www.nature.com, doi:10.1038/ncomms11750.

Vollbrecht, E., and B. Sigmon. “Amazing Grass: Developmental Genetics of Maize Domestication.” Biochemical Society Transactions, vol. 33, no. Pt 6, Dec. 2005, pp. 1502–06. PubMed, doi:10.1042/BST20051502.

Weeks, Rebecca. “Inflorescence Branching in Maize: A Quantitative Genetics Approach to Identifying Key Players in the Inflorescence Development Pathway.” Graduate Theses and Dissertations, Jan. 2013, doi:https://doi.org/10.31274/etd-180810-2464.

Wong, Nathan, et al. “WU-CRISPR: Characteristics of Functional Guide RNAs for the CRISPR/Cas9 System.” Genome Biology, vol. 16, 2015. PubMed Central, doi:10.1186/s13059-015-0784-0.

Wu, Xuebing, et al. “Genome-Wide Binding of the CRISPR Endonuclease Cas9 in Mammalian Cells.” Nature Biotechnology, vol. 32, no. 7, July 2014, pp. 670–76. PubMed, doi:10.1038/nbt.2889.

Yadava, Pranjal, et al. “Advances in Maize Transformation Technologies and Development of Transgenic Maize.” Frontiers in Plant Science, vol. 7, Jan. 2017. PubMed Central, doi:10.3389/fpls.2016.01949.

Yang, Jiaotong, et al. “PlantEAR: Functional Analysis Platform for Plant EAR Motif-Containing Proteins.” Frontiers in Genetics, vol. 9, 2018. Frontiers, doi:10.3389/fgene.2018.00590.

Zhang, Tao, et al. “Production of Guide RNAs in Vitro and in Vivo for CRISPR Using Ribozymes and RNA Polymerase II Promoters.” Bio-Protocol, vol. 7, no. 4, Feb. 2017. PubMed Central, doi:10.21769/BioProtoc.2148.

Zhang, Yueping, et al. “A GRNA-TRNA Array for CRISPR-Cas9 Based Rapid Multiplexed Genome Editing in Saccharomyces Cerevisiae.” Nature Communications, vol. 10, no. 1, Mar. 2019, p. 1053. www.nature.com, doi:10.1038/s41467-019-09005-3.


Figures

Figure 2-1: Original ramosa ear discovered by Dr. Walter Gernert.

This overly branched ear is a true breeding phenotype discovered in a line developed for enhanced protein composition. Image is from Gernert 1912.


Figure 2-2: Zinc finger and ramosa1 folding models.

A) Generic zinc finger model depicting the common α-helix and β-sheet (created by Thomas Splettstoesser, https://en.wikipedia.org/wiki/File:Zinc_finger_rendered.png). (B - D) I-TASSER folding models for ramosa1: B) ramosa1 rendered in PyMOL, α-helix (red) and β-sheet (blue); C) DNA binding sites; and D) Cys2-His2 zinc finger. zhanglab.ccmb.med.umich.edu/I-TASSER/

Figure 2-3: ramosa1 zinc finger recognition sequence logo.

The sequence logo depicting the predicted ramosa1 zinc finger recognition site (TTG) based on protein composition from zf.princeton.edu.


Figure 2-4: ramosa1 phenotype in tassel and ear.

Branching in maize tassels (A-C) and ears (D-F) of a standard inbred line (B73) and two ramosa1 mutant alleles. (A & D) Tassel and ear of B73 are typical of many corn belt inbreds. (B & E) The weak allele (ra1-63) (Vollbrecht et al., 2005) is a frame-shift mutation near the C-terminus, before the stop codon, which adds 17 nonsense codons and may affect protein folding. (C & F) The strong allele (ra1-R) (Vollbrecht et al., 2005) is the original Gernert allele, which encodes asparagine (Asn) instead of histidine (His) at the position of the first His in the DNA binding Cys2-His2 zinc finger.


Figure 2-5: Yeast-2-Hybrid description.

The bait in this yeast-2-hybrid experiment was RAMOSA1 (RA1) and the prey was constructed from a cDNA library derived from Zea mays B73 vegetative apex, immature ear and immature tassel messenger RNA. A) When the prey does not physically interact with RA1 the yeast cell dies and no colony is produced. B) When the prey and bait interact, the cell survives due to the activation of the HIS3 gene and a colony is produced. The number of times a clone is recovered is an indicator of the strength of interaction. 12 genes were identified as putative interactors from these results.


Figure 2-6: CRISPR/Cas9 function.

A) depicts the necessary components of the CRISPR/Cas9 system (gRNA and Cas9 protein) and how they assemble into a functional ribonucleoprotein complex. B) When the gRNA target region matches DNA, the DNA is digested which creates mutations during the DNA repair process using the non-homologous end joining (NHEJ) DNA repair pathway. This figure originates from www.addgene.org/crispr/guide/.


Figure 2-7: Screenshot of the University of Bergen’s CHOPCHOP v2 program depicting potential guides for ramosa1.

The CHOPCHOP program (chopchop.cbu.uib.no) searches a user-defined segment (blue line) of the selected genome (maize B73v4) for PAM sites, calculates predicted efficiency, detects potential off-target cut sites and calculates self-complementarity with a selected gRNA scaffold sequence. Potential guide positions are shown as short lines above the genome segment and colored according to predicted performance (green is highest, red is lowest).

Figure 2-8: ramosa1 gene model.

ramosa1 is encoded by a 528 bp single exon gene on chromosome 7. Zinc finger domain and EAR motifs are depicted as blue and grey blocks, respectively. Red blocks indicate guides used in the 8x array; pink blocks indicate guides in the 4x array.


Figure 2-9: rho-GDI1 and rho-GDI1 paralog gene models.

Full gene models of A) rho-GDI1 and B) rho-GDI1 paralog. C & D) Coding sequences of the C) rho-GDI1 gene and D) rho-GDI1 paralog gene aligned together. Pink arrowheads are guides used in the 4x array, blue blocks indicate the Y2H interaction domain.

Figure 2-10: jmjC gene model.

A) Gene model of the jmjC gene. B) Coding sequence of the jmjC gene. Red arrowheads are guides used in the 8x array, blue arrowheads indicate the Y2H interaction domain.

Figure 2-11: selT gene model.

A) Gene model of the selT gene. B) Coding sequence of the selT gene. Red blocks and arrowheads are guides used in the 8x array, blue blocks indicate the Y2H interaction domain.

Figure 2-12: ail6 gene model.

A) Gene model of the ail6 gene. B) Coding sequence of the ail6 gene. Red arrowheads are guides used in the 8x array, blue blocks represent the two AP2 domains.


Figure 2-13: pENTR-gRNA1 and pENTR-gRNA2.

A) The Yang Laboratory’s pENTR-gRNA CRISPR gRNA expression cassette system. B) Target sites are assembled by annealing specially designed single stranded oligos into double stranded DNA with engineered overhangs. The plasmids (A) are digested by BtgZI and BsaI and repaired by ligating the previously described overhanging DNA (B) into the resulting gap. These actions are performed in sequential steps and the full 4 guide array is assembled by concatenating the two arrays together by HindIII digestion and ligation of the DNA between HindIII sites of gRNA2 into gRNA1. Both plasmids (gRNA1 & 2) are identical save for the second HindIII site in gRNA2 used for the final concatenation step (Char et al., 2017).

Figure 2-14: pENTR-gRNA (4x) and Synthetic (8x) Arrays.

Both the 4x (A) and 8x (B) arrays are flanked by attL cloning sites. The expression plasmid pGW-Cas9 (C) contains attR sites to allow for easy cloning of the array module.

Figure 2-15: 4x and 8x Array depictions.

The expression vector contains the A) 4x array and B) 8x array positioned downstream of the Cas9 gene. These are flanked by left (LB) and right (RB) borders utilized in Agrobacterium tumefaciens mediated transformation.


Figure 2-16: gRNA1 cloning colony PCR, BsaI digestion and ligation step.

A & B) PCR was conducted using primers ra1-G2-BsaI-Fw & Array-Ver-Rv (TA: 58℃, Ext: 1 min 20 sec, Cycles: 35, Taq: GoTaq) A) PCR detection of ramosa1 guide 2 cloning (see text) for 24 E. coli colonies; digestion reaction used 10 units of BsaI restriction enzyme for 1 hour, resulting in around 4% efficiency. B) Drastically better results for 21 E. coli colonies; digestion reaction was 20 units of BsaI added twice over two hours of digestion, resulting in over 90% efficiency.


Figure 2-17: Agrobacterium tumefaciens preparation of 4x array plasmid DNA.

A) 790 bp Cas9 segment PCR amplification. Cas9-Screen2-Fw/Rv (TA: 61℃, Ext: 1 min, Cycles: 35, Taq: GoTaq) B) 700 bp Cas9 segment amplification. Cas9-Screen3-Fw/Rv (TA: 61℃, Ext: 1 min, Cycles: 35, Taq: GoTaq) C) 2000 bp amplification across the entire 4x array. Array-Ver-Fw/Rv (TA: 60℃, Ext: 2.5 min, Cycles: 35, Taq: GoTaq) D) 1190 bp amplification across one portion of the 4x array. ra1-G2-BsaI-Fw & Array-Ver-Rv (TA: 58℃, Ext: 1 min 20 sec, Cycles: 35, Taq: GoTaq). These amplifications were chosen because they gave clean positive/negative results during E. coli screens; this did not carry over to Agrobacterium tumefaciens screening.

Figure 2-18: Ear holding tool.

Screw tip prevents ear from splitting when inserted and holds the ear more securely than forceps.

Figure 2-19: Sequencing quality levels.

Figure is a screen shot from the Geneious bioanalysis software aligning Sanger sequencing traces to the B73-V4 jmjC gene. When sequence quality levels (blue peaks) drop precipitously (upper red arrow) at a target site, we typically detected a mutation as evidenced by the gap (lower red arrow) in aligned regions (black segments).

Figure 2-20: Gaps in sequencing alignments.

Figure is a screen shot from the Geneious bioanalysis software aligning Sanger sequencing traces to the B73-V4 selT gene. Depicted here are traces where the sequence quality levels do not drop but one or more gaps still exist (red arrows). This suggests biallelic indels, where both alleles are altered in the same way.

Figure 2-21: Multi-base biallelic mutations.

Figure is a screen shot from the Geneious bioanalysis software aligning Sanger sequencing traces to the B73-V4 selT gene. Both images are from B48.09; Guide 1 (A) and Guide 2 (B) each produced single-base insertions of a different base on each allele.

Figure 2-22: Cas9 assay on DNA made from callus tissue.

This PCR on callus DNA was conducted using Cas9-Screen1-Fw/Rv primers, which produce a 371 bp amplicon (TA: 57℃, Ext: 30 sec, Cycles: 35, Taq: GoTaq). Yellow stars indicate lines which generated plants.

Figure 2-23: Cas9 PCR assay from all regenerated plants.

A PCR on plant DNA, from all plants which germinated and survived the initial growth chamber, was conducted using Cas9-Screen1-Fw/Rv primers producing a 371 bp amplicon (TA: 57℃, Ext: 30 sec, Cycles: 35, Taq: GoTaq). All plants except B116.02, B121.01 and B121.02 contained Cas9.

Figure 2-24: Frequency of insertions and deletions by number of nucleotides.

Mutation events were counted by type and number of nucleotides inserted or deleted. Single base indels are the most frequently recovered event. While deletions can be almost any size, insertions are almost exclusively single base.

Figure 2-25: Number of single base insertions by base.

Single base insertions were counted by base. Nucleotides A and T make up the majority of all insertions.

Figure 2-26: Number of targeted genes edited per recovered plant treated with the 8x array.

Each of the counted plants was regenerated from material treated with the 8x array, which targets four genes. The number of genes with mutations per plant was counted and the data are consistent with a normal distribution (mean = median = 17, kurtosis = -0.54, skewness = 0.59).

Figure 2-27: Tassel phenotypes in parental generation of new ramosa1 alleles

A) T0 plant B48.08 B) T0 plant B48.09 C) T0 plant B115.01 D) T0 plant B115.02. Each plant contains biallelic ramosa1 mutations: A & B) ra1-M11 and ra1-M12; C & D) ra1-M7 and ra1-M8.

Figure 2-28: Characterization of ramosa1 mutations.

Figure represents the various types of mutations recovered in ramosa1. The internal C2H2 refers to the Cys2-His2 zinc finger, and red characters in this section mean one or more of these important amino acids are missing. Mutations listed together as a group (left side of figure) are distinct entities by nucleotide sequence but produce similar or identical proteins.

Figure 2-29: Characterization of jmjC mutations.

Figure characterizes eight different nucleotide specific mutations generated from two target guides in the jmjC gene. The yeast-2-hybrid interaction domain is indicated by the box labelled Y2H ID. Only one mutant allele (jmjC-M7) was produced from jmjC_Guide1; the remaining seven were produced from jmjC_Guide2. Mutant alleles jmjC-M1, M4, M5 & M6 all contained different nucleotide mutations but were grouped together (left side of figure) due to their all producing frame-shift early nonsense (stop) mutations after seven or eight amino acids. All but jmjC-M8 produce frame-shift early stop mutations.

Figure 2-30: Characterization of selT mutations.

Figure characterizes 16 different nucleotide specific mutations in the selT gene. The yeast-2-hybrid interaction domain is indicated by the box labelled Y2H ID. Mutations listed together (left side of figure) represent unique mutations at a nucleotide level which produced similar results at the protein level. Six of the eight mutations labeled as “frame-shift early stop 50-65 amino acids downstream” (fsX 50-65) were the result of both guides creating indels in the same allele. Only one mutation (selT-M4) created an in-frame indel.

Figure 2-31: Characterization of ail6 mutations.

Figure characterizes four mutations in the ail6 gene. The two AP2 domains are noted in the box labelled AP2-1&2. Mutations ail6-M1 and M3 are unique by nucleotide but encode similar proteins. Only ail6-M4 does not create a frame-shift early stop mutation; the “5” represents a five amino acid deletion.

Figure 2-32: Characterization of the rho-GDI1 mutation.

Figure represents the single mutation recovered from the two target guides for rho-GDI1. It occurred in the yeast-2-hybrid interaction domain (Y2H ID) and was produced by rho-GDI1_Guide1.

Figure 2-33: 8x Array Guide location vs. Activity in number of unique events

The location on the graph indicates the location on the 8x array. G1 & G2 are from ramosa1, G3 & G4 are from jmjC, G5 & G6 are from selT, and G7 & G8 are from ail6. The same four promoters were used twice, once in the first and once in the second half of the array; bars of the same color indicate that the same promoter was used. Green and blue are driven by rice promoters; red and orange are driven by maize promoters.

Tables

Table 1: Putative ramosa1 gene interactors identified by yeast-2-hybrid.

| Gene name | HG Rating (a) | V4 Gene ID (b) | Close Paralogs (c) | Known mutants (d) |
|---|---|---|---|---|
| jmjC | A | Zm00001d033158 | 0 | |
| ran-binding protein 1 | A | Zm00001d010504 | 2 | |
| SBA1 | A | Zm00001d049305 | 1 | |
| RELK2 | A | Zm00001d028481 | 3 | mu1013659 |
| rel2 | A | Zm00001d024523 | 0 | yes |
| Ras association and pleckstrin y domain 1 | B | Zm00001d012336 | 0 | mu1049664*, MuI_713575* |
| NUP50A | B | Zm00001d043757 | 1 | mu1059542* |
| selT | B | Zm00001d023692 | 0 | MuI_581173.6* |
| SKU5 | B | Zm00001d045599 | 0 | |
| GAPC2 | B | Zm00001d035156 | 1 | mu1022147 |
| cdj3 | C | Zm00001d013669 | 2 | mu1015057 |
| rho-GDI1 | C | Zm00001d017859 | 1 | |

(a) HG Rating: Graded scale for strength of interaction as provided by Hybrigenics Inc; A = “Very high confidence in the interaction”, B = “High confidence” and C = “Good confidence”. Anything D rated or lower has barely detectable signals and high risk of false positives, therefore these were excluded.
(b) V4 Gene ID: Maize B73 ID reference number, version 4.0 https://www.maizegdb.org/genome/genome_assembly/Zm-B73-REFERENCE-GRAMENE-4.0
(c) Close Paralogs: Number of genes which are hypothesized to be potentially functionally redundant with the corresponding Y2H interactor. These are identified in Table 2.
(d) Known mutants: Mutations available from the Maize Genetics Co-op stock center are identified and ones marked with an * were ordered and PCR verified.

Table 2: Candidate gene paralogs and ear wt-vs-ra1 developmental timeline expression profiles.

ear.wt. ear.ra1. ear.wt. ear.ra1. a b c d e f g h Name % identity 1mm 1mm 2mm 2mm V3 Gene ID Chr ramosa1 11.23 14.49 41.99 92.69 GRMZM2G003927 7 25.13% 0 0.04 0.51 0.08 GRMZM2G137736 10 25.29% 0 0 0 0 GRMZM2G006282 5 13.90% 0.38 0.27 1.33 0.39 GRMZM2G000126 7 20.10% 0 0 0 0 GRMZM2G090332 2 14.60% 0.56 0.43 2.89 1.27 GRMZM2G058868 4 20.10% 0 0 0 0 GRMZM2G086530 10

jmjC 24.38 27.31 26.99 100.77 GRMZM2G417089 1 48.35% 11.36 8.81 9.74 21.81 GRMZM2G383210 4 39.94% 7.91 7.94 7.04 30.56 GRMZM2G054162 4 36.03% 3.53 5.23 4.98 14.94 GRMZM2G070885 7 27.12% 3.03 2.95 3.23 6.56 GRMZM2G428933 9

ran-binding protein 1 45.16 35.24 78.7 107.53 GRMZM2G111411 8 Ran-binding protein 1 homolog a 68.48% 34.84 27.15 64.27 99.12 AC213884.3_FG001 6 Ran-binding protein 1 homolog a 66.37% 96.28 56.51 104.81 92.3 GRMZM2G094388 1 Ran-binding protein 1 68.00% 16.16 13.36 26.54 35.04 GRMZM2G078933 9

SBA1 130.38 143.24 153.81 447.14 GRMZM2G154312 4 81.04% 250.96 250.28 318.2 745.62 GRMZM2G078022 1 40.34% NA NA NA NA GRMZM6G735128 9 30.98% 45.06 49.25 50.02 188.92 GRMZM2G169432 1

RELK2 12.55 21.73 13.57 67.63 GRMZM2G030422 1 Topless 97.17% 19.65 30.4 19.93 111.08 GRMZM2G550865 9 topless- 3.24 2.95 3.68 9.3 related1 61.87% GRMZM2G316967 3 Rel2 66.78% 75.36 89.05 108.8 230.49 GRMZM2G042992 10 trehalose-6- 4.37 1.42 2.65 4.27 phosphate synthase14 40.62% GRMZM2G416836 8

Table 2: (continued) % identityb ear.wt. ear.ra1. ear.wt. ear.ra1. Name a 1mmc 1mmd 2mme 2mmf V3 Gene IDg Chrh Rel2 75.36 89.05 108.8 230.49 GRMZM2G042992 10 topless-related 3.24 2.95 3.68 9.3 1 64% GRMZM2G316967 3 trehalose-6- 4.37 1.42 2.65 4.27 phosphate synthase14 43.42% GRMZM2G416836 8 Topless 68.44% 19.65 30.4 19.93 111.08 GRMZM2G550865 9 topless-related2 68.08% 12.55 21.73 13.57 67.63 GRMZM2G030422 1

Ras association and pleckstrin y domain 1 58.6 32.87 47.8 31.54 GRMZM2G057075 8 31.72 % NA NA NA NA AC210204.3_FGT002 2 31.79 % 6.72 3.38 5.45 1.11 GRMZM2G032766 7 38.30 % 0.87 0.36 1.28 0.07 GRMZM2G090177 1

NUP50A 181.9 131.44 170.92 322.58 GRMZM2G157317 3 78.74% 43.74 36.07 49.18 93.94 GRMZM5G823017 8

selT 341.97 105.86 285.97 268.38 GRMZM2G040389 10

SKU5 56.08 69.33 72.95 197.07 GRMZM2G049693 9 66.01% 10.65 5.42 10.12 10.77 GRMZM2G438386 1 53.84% 3.51 4.86 6.08 11.39 GRMZM2G402584 1 64.78% 8.18 6.96 13.32 15.91 GRMZM2G172642 10 65.22% 4.17 3.99 6.63 6.18 GRMZM2G077317 10 45.08% 0.11 0.08 0 0 GRMZM2G129064 6

GAPC2 242.24 314.7 443.3 1163.48 GRMZM2G180625 6 GAPC1 97.33% 325.16 412.46 574.3 1213.81 GRMZM2G046804 4 GAPC3 87.24% 139.91 187.14 194.18 956.35 GRMZM2G071630 4 GAPC4 57.65% 138.33 234.31 190.02 1328.07 GRMZM2G176307 5

cdj3 192.34 218.05 260.4 830.25 GRMZM2G134980 5 cdj2 98.09% 430.98 307.89 565.44 1111.08 GRMZM2G364069 1 85.89% 107.15 62.35 89.5 124.97 GRMZM2G134917 1 84.53% 0.02 0.08 0.08 0 GRMZM2G346863 2 85.13% 6.58 4 6 10.6 GRMZM2G028218 5

64.43% 0 0.08 0.01 0.33 GRMZM2G029079 2 72.76% 0.17 0.02 0.16 0.07 GRMZM2G433854 2 78.15% 0.07 0.08 0.06 0.08 GRMZM2G354746 9 74.44% 0.07 0.1 0.25 0.26 GRMZM2G434839 9 70.96% 0.07 0.1 0.12 0.05 GRMZM2G063238 2 73.22% 17.79 22.66 24.54 81.94 GRMZM2G118731 5 36.29% 18.71 22.44 20.89 65.6 GRMZM2G086964 8 28.08% 0 0.1 0 0.15 GRMZM2G108259 8 78.16% No V3 ID 2 76.09% No V3 ID 1 66.20% No V3 ID 10

rho-GDI1 58.82 31.28 59.28 40.73 GRMZM2G085049 5 80.60 % 22.78 12.76 24.87 12.82 GRMZM2G012814 4 58.96 % 0 0 0 0 GRMZM2G072089 8 36.09 % 0 0 0.02 0 GRMZM2G150724 8 56.28 % 0 0.07 0.18 0.07 GRMZM2G136236 6

Gene of interest (Y2H interactor) is in green and the bold genes, below each, are hypothesized to be potentially functionally redundant with the corresponding Y2H interactor (green line). These bold genes are called “close paralogs” in the text.

(a) Name: Gene name
(b) % identity: Percent nucleotide identity of putative paralog based on gramene gene trees ensembl.gramene.org/Zea_mays
(c-f) Reads per kilobase million (RPKM) expression values from maizeinflorescence.org/profile_display.php
(c) ear.wt.1mm: wild type ear expression at 1mm size
(d) ear.ra1.1mm: ramosa1 ear expression at 1mm size
(e) ear.wt.2mm: wild type ear expression at 2mm size
(f) ear.ra1.2mm: ramosa1 ear expression at 2mm size
(g) V3 Gene ID: Maize B73 ID reference number, version 3 www.maizegdb.org/genome/genome_assembly/B73%20RefGen_v3
(h) Chr: Chromosome number of indicated gene

Table 3: Candidate genes and their annotated functions/descriptions

| Gene name | HG Rating (a) | Annotated Function | V3 Gene ID (b) | Close Paralogs (c) |
|---|---|---|---|---|
| jmjC | A | Transcription factor jumonji (jmjC) domain-containing protein | GRMZM2G417089 | 0 |
| SBA1 | A | HSP20-like chaperones superfamily protein | GRMZM2G154312 | 1 |
| rel2 | A | ramosa1 enhancer locus 2 | GRMZM2G042992 | 0 |
| selT | B | selenoprotein domain. Histone modification | GRMZM2G040389 | 0 |
| SKU5 | B | Monocopper oxidase-like. Histone modification | GRMZM2G049693 | 0 |
| rho-GDI1 | C | rho GDP-dissociation inhibitor 1. Regulates the GDP/GTP exchange. Nuclear transport | GRMZM2G085049 | 1 |
| ail6 | --- | AP2-EREBP-type transcription factor | GRMZM2G399072 | 0 |

(a) HG Rating: Graded scale for strength of interaction as provided by Hybrigenics Inc; A = “Very high confidence in the interaction”, B = “High confidence” and C = “Good confidence”.
(b) V3 Gene ID: Maize B73 ID reference number, version 3 www.maizegdb.org/genome/genome_assembly/B73%20RefGen_v3
(c) Close Paralogs: Number of genes which are hypothesized to be potentially functionally redundant with the corresponding Y2H interactor. These are identified in Table 2.

Table 4: CRISPR Guides and their location considerations.

| Gene (V4 Gene ID) | Domain of interest | Guide name | Target sequence | Cut site distance from starting ATG | Distance between cut sites | Cut site position (AA) |
|---|---|---|---|---|---|---|
| ramosa1 (Zm00001d020430) | Zinc Finger Domain: 42-72 AA | 4xArray_Ramosa_G1_BtgZI | GACGACGACCTTACCTGTGG | 105 bp | 69 bp | 35 AA |
| | | 4xArray_Ramosa_G2_BtgZI | GGAGTTCAGATCAGCACAA | 174 bp | | 58 AA |
| | | 8xArray_Ramosa_G1 | GTTGTTGCTGCAGTTTCATT | 29 bp | 188 bp | 10 AA |
| | | 8xArray_Ramosa_G2 | GCCACATGAACATCCACAGGC | 205 bp | | 68 AA |
| rho-GDP Dissociation Inhibitor 1 (rho-GDI1) (Zm00001d017859) | Y2H Interaction Domain: 54-147 AA | 4xArray_rho_G1_BtgZI | GCTGCTCCTTGATGCTGACCA | 220 bp | 140 bp | 73 AA |
| | | 4xArray_rho_G2_BsaI | GCTGCTCCTTCCATCGCCGC | 357 bp | | 119 AA |
| jmjC (Zm00001d033158) | Y2H Interaction Domain: 112-319 AA | 8xArray_jmjC_G1 | GATGGCGGGGTGACGAATCG | 224 bp | 149 bp | 75 AA |
| | | 8xArray_jmjC_G2 | GTCGCGCTCGCAGAAAACGT | 373 bp | | 124 AA |
| selT (Zm00001d023692) | Y2H Interaction Domain: 32-195 AA | 8xArray_sel_G1 | GTCGCCGGGGACTTGCGCTT | 12 bp | 171 bp | 4 AA |
| | | 8xArray_sel_G2 | GCGGTTGCCGAGTTGAAAGGG | 162 bp | | 54 AA |
| ail6 (Zm00001d027878) | AP2 Domains: 270-337 AA | 8xArray_ail6_G1 | GCAGAGAGTCATCGAGCGGCA | 175 bp | 1,188 bp | 58 AA |
| | | 8xArray_ail6_G2 | GCAATGCCGATGTACAACGC | 1,363 bp | | 454 AA |
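The relationship between the base-pair and amino-acid columns of Table 4 can be sketched in a few lines of Python. This is only an illustration: the rounding rule (nearest whole codon) is an assumption inferred from the values in the table, and the function names are hypothetical, not part of any tool used in this work.

```python
# Illustrative sketch: convert a cut-site offset (bp from the starting ATG)
# into an approximate amino-acid position and test it against a domain range.

def cut_site_aa(bp_from_atg: int) -> int:
    """Approximate codon position of a cut site.

    Assumption: the AA column of Table 4 is bp / 3 rounded to the nearest
    whole number (this reproduces every value listed in the table).
    """
    return round(bp_from_atg / 3)


def in_domain(aa: int, start: int, end: int) -> bool:
    """True when an amino-acid position lies inside an inclusive domain range."""
    return start <= aa <= end


# ramosa1 guides from Table 4; the zinc finger spans amino acids 42-72.
RA1_ZINC_FINGER = (42, 72)
RA1_GUIDES = {
    "4xArray_Ramosa_G1_BtgZI": 105,
    "4xArray_Ramosa_G2_BtgZI": 174,
    "8xArray_Ramosa_G1": 29,
    "8xArray_Ramosa_G2": 205,
}

for name, bp in RA1_GUIDES.items():
    aa = cut_site_aa(bp)
    print(f"{name}: {bp} bp -> {aa} AA, in zinc finger: {in_domain(aa, *RA1_ZINC_FINGER)}")
```

Under these assumptions, the second guide of each ramosa1 pair falls inside the zinc finger and the first guide falls upstream of it.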

Table 5: Primers used in these experiments.

| Name | Sequence | Purpose |
|---|---|---|
| ra1-G1-BtgZi-Fw | TGTTGACGACGACCTTACCTGTGG | DS guide, PCR |
| ra1-G1-BtgZi-Rv | AAACCCACAGGTAAGGTCGTCGTC | DS guide, PCR |
| ra1-G2-BsaI-Fw | GTGTGGAGTTCAGATCAGCACAA | DS guide, PCR |
| ra1-G2-BsaI-Rv | AAACTTGTGCTGATCTGAACTCC | DS guide, PCR |
| rho-G1-BtgZi-Fw | TGTTGCTGCTCCTTGATGCTGACCA | DS guide, PCR |
| rho-G1-BtgZi-Rv | AAACTGGTCAGCATCAAGGAGCAGC | DS guide, PCR |
| rho-G2-BsaI-Fw | GTGTGCTGCTCCTTCCATCGCCGC | DS guide, PCR |
| rho-G2-BsaI-Rv | AAACGCGGCGATGGAAGGAGCAGC | DS guide, PCR |
| ra1-Ver-Fw | TAGCTAGGTTAGGCACACGC | PCR |
| ra1-Ver-Rv | GACTGCACGCTCCTATCCTC | PCR, Sequencing |
| rho-GDI1-Ver-Fw | TGCGCTCATCTCGGTATCTG | PCR, Sequencing |
| rho-GDI1-Ver-Rv | ATGCAGATGGATGGCTCACG | PCR |
| rho-GDI1-Paralog-Ver-Fw | CTCCTGCTCCAAGGTCGAC | PCR, Sequencing |
| rho-GDI1-Paralog-Ver-Rv | CAAGTGTTCAACGACCGAGC | PCR |
| jmjC-Ver-Fw | CTAGGATCTCACCGAGACGC | PCR, Sequencing |
| jmjC-Ver-Rv | CATGCGAGAACGATAGGAAGAC | PCR |
| selT-Ver-Fw | GTCTCCACTCGCACTTCCG | PCR, Sequencing |
| selT-Ver-Rv | AATCACAGACGACACCTCGC | PCR |
| selT-Ver-Rv2 | CAACCCCATCCTCCTCCTCTGC | Sequencing |
| ail6-G1-Ver-Fw | CTCCTTCGCTTGCTTGCAAG | PCR, Sequencing |
| ail6-G1-Ver-Rv | GCAACAACCGATGGCCATAG | PCR |
| ail6-G2-Ver-Fw | TTCCTGTGCAGATTGGAGGC | PCR, Sequencing |
| ail6-G2-Ver-Rv | GGTGACGCCACGGAACTG | PCR |
| Array-Ver-Fw | AACATGTCGAGGCTCAGCAGGA | PCR, QC |
| Array-Ver-Rv | CTGCAATGGCAATTACCTTATCCGCA | PCR, QC |
| Cas9-Screen1-Fw | GGGTAATGAACTCGCTCTGC | PCR, QC |
| Cas9-Screen1-Rv | TGGCGTCAAGAACTTCCTTTG | PCR, QC |
| Cas9-Screen2-Fw | ACAGTTGCGTACTCCGTGCTTG | PCR, QC |
| Cas9-Screen2-Rv | CATGCGATCATAGGCGTCTCGC | PCR, QC |
| Cas9-Screen3-Fw | GAAGACATACAGAAGGCTCAGGTCTCCG | PCR, QC |
| Cas9-Screen3-Rv | ATCCAGGATCTGGGCAACGTGT | PCR, QC |
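The double-stranded (DS) guide oligos in Table 5 follow a visible pattern relative to the target sequences in Table 4: the forward oligo is the target with a four-base prefix, and the reverse oligo is AAAC followed by the reverse complement of the target. The Python sketch below reproduces that pattern. The prefixes (TGTT for BtgZI sites, GTGT for BsaI sites) are inferred from the primer sequences listed here and are an assumption, not a statement of the published cloning protocol; the function names are hypothetical.

```python
# Illustrative sketch: derive the annealing oligo pair for a guide target.
# Overhang prefixes are inferred from the Table 5 sequences (assumption).

COMPLEMENT = str.maketrans("ACGT", "TGCA")
FORWARD_OVERHANG = {"BtgZI": "TGTT", "BsaI": "GTGT"}
REVERSE_OVERHANG = "AAAC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligos(target: str, enzyme: str) -> tuple[str, str]:
    """Return (forward, reverse) oligos for a target and cloning enzyme."""
    target = target.upper()
    forward = FORWARD_OVERHANG[enzyme] + target
    reverse = REVERSE_OVERHANG + reverse_complement(target)
    return forward, reverse


# ramosa1 guide 1 (BtgZI site) and guide 2 (BsaI site) of the 4x array.
fw1, rv1 = design_oligos("GACGACGACCTTACCTGTGG", "BtgZI")
fw2, rv2 = design_oligos("GGAGTTCAGATCAGCACAA", "BsaI")
print(fw1, rv1)
print(fw2, rv2)
```

For these two targets the sketch returns the same sequences as ra1-G1-BtgZi-Fw/Rv and ra1-G2-BsaI-Fw/Rv in Table 5, which is a useful sanity check when transcribing oligo orders.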

Table 6: Ears used in these studies.

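| Ear # | # Starting Embryos | Ear Source | Construct Used | Origin of # plant lines | # Plants Regenerated | # Plants Survived | % Total Plants |
|---|---|---|---|---|---|---|---|
| E100 | 235 | S | 8x | 1 | 0 | 0 | 0.00% |
| E101 | 124 | W | 8x | 0 | 0 | 0 | 0.00% |
| E102 | 261 | S | 8x | 0 | 0 | 0 | 0.00% |
| E103 | 279 | S | 8x | 2 | 0 | 0 | 0.00% |
| E104 | 367 | S | 8x | 28 | 133 | 60 | 53.57% |
| E105 | 324 | S | 8x | 1 | 7 | 6 | 5.36% |
| E106 | 281 | S | 8x | 4 | 0 | 0 | 0.00% |
| E107 | 315 | S | 8x | 9 | 3 | 3 | 2.68% |
| E108 | 260 | S | 8x | 6 | 14 | 8 | 7.14% |
| E109 | 149 | W | 8x | 1 | 0 | 0 | 0.00% |
| E110 | 324 | S | 8x | 15 | 30 | 17 | 15.18% |
| E111 | 219 | S | 8x | 10 | 0 | 0 | 0.00% |
| E112 | 36 | W | 8x | 0 | 0 | 0 | 0.00% |
| E113 | 132 | S | 8x | 0 | 0 | 0 | 0.00% |
| E114 | 197 | S | 8x | 16 | 23 | 11 | 9.82% |
| E116 | 189 | W | 4x | 4 | 0 | 0 | 0.00% |
| E117 | 173 | W | 4x | 14 | 0 | 0 | 0.00% |
| E118 | 240 | S | 4x | 1 | 0 | 0 | 0.00% |
| E119 | 104 | W | 4x | 0 | 0 | 0 | 0.00% |
| E120 | 277 | S | 4x | 3 | 0 | 0 | 0.00% |
| E121 | 289 | S | 4x | 15 | 18 | 7 | 6.25% |
| E122 | 43 | W | 4x | 4 | 0 | 0 | 0.00% |
| E123 | 118 | W | 4x | 0 | 0 | 0 | 0.00% |
| E124 | 34 | W | 4x | 0 | 0 | 0 | 0.00% |
| E125 | 19 | W | 4x | 0 | 0 | 0 | 0.00% |
| E126 | 50 | W | 4x | 0 | 0 | 0 | 0.00% |
| E127 | 42 | W | 4x | 8 | 0 | 0 | 0.00% |
| Totals | | | | 142 | 228 | 112 | |

Ears in green shaded rows gave rise to mutated plants.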

Table 7: Ear data by source and array transformed.

| | S Source | W Source | Totals |
|---|---|---|---|
| 4x Array Ears (a) | 3 | 9 | 12 |
| 8x Array Ears (b) | 12 | 3 | 15 |
| Total Ears | 15 | 12 | 27 |
| 4x Array Embryos (c) | 806 | 772 | 1578 |
| 8x Array Embryos (d) | 3194 | 309 | 3503 |
| Total Embryos | 4000 | 1081 | 5081 |
| 4x Plants regenerated (e) | 18 | 0 | 18 |
| 4x Plants survived (f) | 7 | 0 | 7 |
| 8x Plants regenerated (g) | 210 | 0 | 210 |
| 8x Plants survived (h) | 105 | 0 | 105 |
| Total Plants regenerated (i) | 228 | 0 | 228 |
| Total Plants survived (j) | 112 | 0 | 112 |

(a) 4x Array Ears: Number of ears treated with the 4x array from S & W sources
(b) 8x Array Ears: Number of ears treated with the 8x array from S & W sources
(c) 4x Array Embryos: Number of embryos extracted from S & W ears treated with 4x array
(d) 8x Array Embryos: Number of embryos extracted from S & W ears treated with 8x array
(e) 4x Plants regenerated: Number of plants regenerated from S & W sources treated with 4x array
(f) 4x Plants survived: Number of plants regenerated from S & W sources treated with 4x array that survived to crossing
(g) 8x Plants regenerated: Number of plants regenerated from S & W sources treated with 8x array
(h) 8x Plants survived: Number of plants regenerated from S & W sources treated with 8x array that survived to crossing
(i) Total Plants regenerated: Total number of plants regenerated from S & W sources
(j) Total Plants survived: Total number of plants regenerated from S & W sources that survived to crossing.
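The ear and embryo totals in Table 7 are sums over the per-ear rows of Table 6, grouped by array and ear source. A minimal Python sketch of that aggregation follows; the ear records are transcribed from Table 6, while the data layout and function name are hypothetical choices made for illustration.

```python
# Illustrative sketch: rebuild the ear and embryo counts of Table 7 from the
# per-ear rows of Table 6. Records: (ear, starting embryos, source, construct).
from collections import defaultdict

EARS = [
    ("E100", 235, "S", "8x"), ("E101", 124, "W", "8x"), ("E102", 261, "S", "8x"),
    ("E103", 279, "S", "8x"), ("E104", 367, "S", "8x"), ("E105", 324, "S", "8x"),
    ("E106", 281, "S", "8x"), ("E107", 315, "S", "8x"), ("E108", 260, "S", "8x"),
    ("E109", 149, "W", "8x"), ("E110", 324, "S", "8x"), ("E111", 219, "S", "8x"),
    ("E112", 36, "W", "8x"), ("E113", 132, "S", "8x"), ("E114", 197, "S", "8x"),
    ("E116", 189, "W", "4x"), ("E117", 173, "W", "4x"), ("E118", 240, "S", "4x"),
    ("E119", 104, "W", "4x"), ("E120", 277, "S", "4x"), ("E121", 289, "S", "4x"),
    ("E122", 43, "W", "4x"), ("E123", 118, "W", "4x"), ("E124", 34, "W", "4x"),
    ("E125", 19, "W", "4x"), ("E126", 50, "W", "4x"), ("E127", 42, "W", "4x"),
]


def summarize(ears):
    """Return ({(construct, source): ear count}, {(construct, source): embryos})."""
    ear_counts = defaultdict(int)
    embryo_totals = defaultdict(int)
    for _ear, embryos, source, construct in ears:
        ear_counts[(construct, source)] += 1
        embryo_totals[(construct, source)] += embryos
    return dict(ear_counts), dict(embryo_totals)


ear_counts, embryo_totals = summarize(EARS)
for key in sorted(embryo_totals):
    print(key, ear_counts[key], "ears,", embryo_totals[key], "embryos")
```

Grouping on the (construct, source) pair keeps the sketch close to the layout of Table 7, where each row is one array and each column one ear source.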

Table 8: Callus events capable of regenerating at least one plant.

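| Event # | Ear # | Ear source | Array | # Plants Regenerated | # Plants survived to cross | % Plants Survived |
|---|---|---|---|---|---|---|
| B1 | E104 | S | 8x | 40 | 8 | 20.00% |
| B2 | E104 | S | 8x | 13 | 4 | 30.77% |
| B7 | E114 | S | 8x | 2 | 1 | 50.00% |
| B11 | E114 | S | 8x | 21 | 10 | 47.62% |
| B12 | E104 | S | 8x | 22 | 21 | 95.45% |
| B16 | E105 | S | 8x | 7 | 6 | 85.71% |
| B31 | E104 | S | 8x | 5 | 0 | 0.00% |
| B37 | E107 | S | 8x | 3 | 3 | 100.00% |
| B39 | E108 | S | 8x | 11 | 5 | 45.45% |
| B46 | E110 | S | 8x | 1 | 0 | 0.00% |
| B48 | E110 | S | 8x | 23 | 17 | 73.91% |
| B52 | E110 | S | 8x | 6 | 0 | 0.00% |
| B107 | E104 | S | 8x | 2 | 0 | 0.00% |
| B108 | E104 | S | 8x | 35 | 23 | 65.71% |
| B115 | E121 | S | 4x | 9 | 3 | 33.33% |
| B116 | E121 | S | 4x | 6 | 2 | 33.33% |
| B121 | E121 | S | 4x | 3 | 2 | 66.67% |
| B124 | E104 | S | 8x | 7 | 1 | 14.29% |
| B125 | E104 | S | 8x | 1 | 0 | 0.00% |
| B127 | E104 | S | 8x | 8 | 3 | 37.50% |
| B129 | E108 | S | 8x | 3 | 3 | 100.00% |
| Total Plants | | | | 228 | 112 | 49.12% |
| Number of events | | | | 21 | 16 | |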

Events in green shaded rows regenerated plants which survived to crossing.

Table 9: Mutant classifications for all genes of interest.

ramosa1 Mutants

| Name | NT Classification | AA Classification | Mutation Type | Active Guide | # times recovered | Description |
|---|---|---|---|---|---|---|
| ra1-M1 | 206_208del | R69_L70delinsM | In-frame indel | 8xGuide 2 | 3 | In Zn finger immediately after second His |
| ra1-M2 | 187_204del | H64_R69del | In-frame indel | 8xGuide 2 | 1 | Deletes both His in Zn finger C2H2 |
| ra1-M3 | 204_205 insA | R69 fsX24 | FS early term | 8xGuide 2 | 8 | Mutate in Zn finger immediately after second His, terminate 24 aa later |
| ra1-M4 | 198_210del | N66 fsX5 | FS early term | 8xGuide 2 | 1 | Deletes last His in Zn finger C2H2 and terminates 5 aa later |
| ra1-M5 | 206_216del | R69 fsX20 | FS early term | 8xGuide 2 | 1 | Mutate in Zn finger immediately after second His, terminate 20 aa later |
| ra1-M6 | 114_116del | S39del | In-frame indel | 8xGuide 2 | 1 | Outside of Zn finger, cut is inside 4X guide site |
| ra1-M7 | 173_179del | A58 fsX5 | FS early term | 4xGuide 2 | 2 | 7bp deletion at 3' end of Zn finger, terminates before both His |
| ra1-M8 | 110_178del | V37_Q59 del | In-frame indel | 4xGuides 1&2 | 2 | 69bp deletion; first 2/3 of Zn finger removed including both Cys |
| ra1-M9 | 205_206 insT | R69 fsX24 | FS early term | 8xGuide 2 | 4 | Mutate in Zn finger immediately after second His, terminate 24 aa later |
| ra1-M10 | 208_223del | L70X | Early term | 8xGuide 2 | 3 | Terminates 1 aa after second His |
| ra1-M11 | 203_204del | H68 fsX25 | FS early term | 8xGuide 2 | 2 | Deletes second His and terminates 25 aa later |
| ra1-M12 | 204 delC | H68 fsX7 | FS early term | 8xGuide 2 | 2 | Deletes second His and terminates 7 aa later |
| ra1-M13 | 106_174del | Q36_A58 del | In-frame indel | 4xGuides 1&2 | 1 | 69bp deletion; first 2/3 of Zn finger removed including both Cys |

jmjC Mutants

| Name | NT Classification | AA Classification | Mutation Type | Active Guide | # times recovered | Description |
|---|---|---|---|---|---|---|
| jmjC-M1 | 373_374 insT | C127 fsX7 | FS early term | Guide 2 | 55 | Terminate in exon 1 |
| jmjC-M2 | 313_379del | A105_F126 del fsX80 | FS early term | Guide 2 | 4 | 67bp/22 AA deletion, termination in exon 3 |
| jmjC-M3 | 374 delT | F126 fsX80 | FS early term | Guide 2 | 10 | Termination in exon 3 |
| jmjC-M4 | 373_374 insA | C127 fsX7 | FS early term | Guide 2 | 6 | Terminate in exon 1 |
| jmjC-M5 | 374_375del | V125 fsX8 | FS early term | Guide 2 | 5 | Terminate in exon 1 |
| jmjC-M6 | 364_380del | T122 fsX7 | FS early term | Guide 2 | 3 | Terminate in exon 1 |
| jmjC-M7 | 223_224del | N75 fsX132 | FS early term | Guide 1 | 5 | Termination in exon 3 |
| jmjC-M8 | 371_376del | N174_F176 delins I | In-frame indel | Guide 2 | 1 | 6bp deletion, in-frame delete @174 NVF into I |

selT Mutants

| Name | NT Classification | AA Classification | Mutation Type | Active Guide | # times recovered | Description |
|---|---|---|---|---|---|---|
| selT-M1 | 171_172 insA | G58 fsX60 | FS early term | Guide 2 | 14 | Terminates inside exon 1 |
| selT-M2 | 12_13 insA; 171_172 insA | R5 fsX65 | FS early term | Guides 1 & 2 | 2 | Terminates inside exon 1 |
| selT-M3 | 12_13 insT; 171_172 insA | R5 fsX65 | FS early term | Guides 1 & 2 | 7 | Terminates inside exon 1 |
| selT-M4 | 13_18del | R5_6K del | In-frame indel | Guide 1 | 2 | |
| selT-M5 | 13_16del | R5 fsX50 | FS early term | Guide 1 | 5 | Terminates inside exon 1 |
| selT-M6 | 12_13 insT | R5 fsX114 | FS early term | Guide 1 | 2 | Terminates inside exon 1 |
| selT-M7 | 12_13 insG | R5 fsX114 | FS early term | Guide 1 | 1 | Terminates inside exon 1 |
| selT-M8 | 11_17del | K4 fsX 50 | FS early term | Guide 1 | 1 | Terminates inside exon 1 |
| selT-M9 | 171_172 insT | G58 fsX60 | FS early term | Guide 2 | 8 | Terminates inside exon 1 |
| selT-M10 | 171_172 ins 7A | G58 fsX63 | FS early term | Guide 2 | 2 | Terminates inside exon 1 |
| selT-M11 | 171_172 insG | G58 fsX60 | FS early term | Guide 2 | 5 | Terminates inside exon 1 |
| selT-M12 | 12_13 insC; 171_172 insA | R5 fsX65 | FS early term | Guides 1 & 2 | 10 | Terminates inside exon 1 |
| selT-M13 | 12_13 insG; 171_172 insT | R5 fsX65 | FS early term | Guides 1 & 2 | 1 | Terminates inside exon 1 |
| selT-M14 | 12_13 insA; 171_172 insT | R5 fsX65 | FS early term | Guides 1 & 2 | 18 | Terminates inside exon 1 |
| selT-M15 | 163_167del | L56 fsX62 | FS early term | Guide 2 | 3 | Terminates inside exon 1 |
| selT-M16 | 12_13 insC; 171_172 insT | R5 fsX65 | FS early term | Guides 1 & 2 | 7 | Terminates inside exon 1 |

ail6 Mutants

| Name | NT Classification | AA Classification | Mutation Type | Active Guide | # times recovered | Description |
|---|---|---|---|---|---|---|
| ail6-M1 | 176_177 insT | P59 fsX3 | FS early term | Guide 1 | 2 | Terminates inside exon 1 |
| ail6-M2 | 176 delC | P59 fsX29 | FS early term | Guide 1 | 1 | Terminates inside exon 2 |
| ail6-M3 | 176_177del | R60 fsX1 | FS early term | Guide 1 | 1 | Terminates inside exon 1 |
| ail6-M4 | 163_177del | L55_P59del | In-frame indel | Guide 1 | 1 | |

rho-GDI1 Mutants

| Name | NT Classification | AA Classification | Mutation Type | Active Guide | # times recovered | Description |
|---|---|---|---|---|---|---|
| rho-GDI1-M1 | 221_222del | V74 fsX28 | FS early term | Guide 1 | 2 | Terminates inside exon 2 |

Mutation names are in left column and each row is the information related to that mutation. # times mutation recovered is the raw number of times that mutation was seen in any plant. All coordinates are from start of translation ATG.
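The split between in-frame indels and frame-shift mutations in Table 9 follows from the net number of nucleotides gained or lost: a net change divisible by three preserves the reading frame. The Python sketch below illustrates that rule using indel sizes from Table 11. It is a simplification under stated assumptions: it looks only at length, so it cannot tell whether an indel also creates a stop codon directly (as with ra1-M10, labelled “Early term”), and the function name is hypothetical.

```python
# Illustrative sketch: classify an indel by its net length change only.
# Negative values are deletions, positive values are insertions.

def classify_indel(net_nt_change: int) -> str:
    """Return 'in-frame' when the net change is a multiple of 3, else 'frame-shift'."""
    if net_nt_change == 0:
        raise ValueError("a net change of 0 nt is not an indel")
    return "in-frame" if abs(net_nt_change) % 3 == 0 else "frame-shift"


# Net changes taken from the indel sizes reported in Table 11.
EXAMPLES = {
    "ra1-M1": -3,       # 3 nt deletion
    "ra1-M3": +1,       # single-base insertion
    "ra1-M8": -69,      # 69 nt deletion spanning both 4x guide sites
    "jmjC-M2": -67,     # 67 nt deletion
    "selT-M2": +1 + 1,  # one single-base insertion at each guide site
    "ail6-M4": -15,     # 15 nt deletion
}

for name, change in EXAMPLES.items():
    print(name, change, classify_indel(change))
```

Note that two single-base insertions on the same allele (the selT alleles produced by both guides) sum to a net change of two nucleotides and therefore still shift the frame.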

Table 10: All surviving plants assayed for mutations in five genes.

Plant Name RA1 jmjC selT ail6 rho-GDI1

B1 B1.01 jmjC-M1 WT selT-M2 selT-M3 B1.02 jmjC-M1 WT B1.03 jmjC-M1 WT selT-M1 selT-M11 B1.04 jmjC-M1 WT selT-M12 selT-M13 B1.05 jmjC-M8 WT selT-M1 selT-M11 B1.06 ra1-M9 WT jmjC-M1 WT selT-M9 selT-M10 B1.07 ra1-M9 WT jmjC-M1 WT selT-M9 selT-M10

B2 B2.01 jmjC-M1 jmjC-M6 selT-M9 WT B2.02 jmjC-M1 jmjC-M6 selT-M1 selT-M1 B2.03 jmjC-M1 jmjC-M6 selT-M1 selT-M9 B2.04 jmjC-M7 WT selT-M9 WT ail6-M1 WT

B7 B7.01 selT-M1 WT

B11 B11.01 jmjC-M3 WT selT-M6 WT B11.02 jmjC-M1 jmjC-M4 B11.03 ra1-M4 WT jmjC-M1 jmjC-M4 B11.04 jmjC-M3 WT selT-M6 selT-M7 B11.05 jmjC-M1 WT B11.06 jmjC-M4 WT selT-M1 WT B11.08 jmjC-M1 WT selT-M1 selT-M3 B11.09 jmjC-M1 WT selT-M1 selT-M3 B11.10 jmjC-M1 WT selT-M1 selT-M3

B12 B12.02 jmjC-M1 WT B12.03 jmjC-M1 WT B12.04 ra1-M3 WT B12.07 ra1-M3 WT selT-M5 WT B12.08 ra1-M3 WT selT-M5 WT B12.09 ra1-M3 WT selT-M5 WT B12.10 jmjC-M1 jmjC-M1 B12.11 jmjC-M1 WT B12.12 jmjC-M1 WT B12.13 jmjC-M1 WT B12.14 jmjC-M1 jmjC-M1 B12.15 ra1-M3 WT jmjC-M1 WT selT-M5 WT B12.16 jmjC-M1 WT selT-M8 WT B12.17 ra1-M3 WT B12.18 ra1-M3 WT selT-M5 WT B12.19 jmjC-M1 WT B12.20 jmjC-M1 jmjC-M1 B12.21 jmjC-M1 WT

B16 B16.01 ra1-M1 WT jmjC-M3 WT selT-M1 WT B16.03 ra1-M1 WT jmjC-M3 WT B16.04 jmjC-M3 WT B16.05 ra1-M1 WT jmjC-M1 WT selT-M1 WT

B37 B37.02 ail6-M3 WT

B39 B39.01 ra1-M6 WT jmjC-M1 WT selT-M1 selT-M9

B48 B48.01 jmjC-M1 jmjC-M1 selT-M2 selT-M12 ail6-M2 WT

B48.02 ra1-M10 WT jmjC-M7 WT selT-M3 selT-M14

B48.03 ra1-M10 WT jmjC-M7 WT selT-M3 selT-M14

B48.04 ra1-M10 WT jmjC-M7 WT selT-M3 selT-M14 B48.05 ra1-M9 WT jmjC-M1 jmjC-M1 selT-M11 selT-M15 B48.06 ra1-M9 WT jmjC-M1 jmjC-M1 selT-M11 selT-M15 B48.07 jmjC-M1 jmjC-M1 selT-M11 selT-M15

B48.08 ra1-M11 ra1-M12 jmjC-M1 jmjC-M1 selT-M12 selT-M14

B48.09 ra1-M11 ra1-M12 jmjC-M1 jmjC-M1 selT-M12 selT-M14

B48.10 ra1-M13 WT B48.16 ail6-M4

B108 B108.01 ra1-M2 WT jmjC-M1 WT selT-M4 WT B108.02 jmjC-M1 WT selT-M4 WT ail6-M1 WT B108.03 ra1-M5 WT jmjC-M1 WT selT-M9 WT B108.04 ra1-M3 WT jmjC-M1 jmjC-M4 selT-M1 WT B108.05 jmjC-M2 WT selT-M9 WT B108.06 jmjC-M2 WT B108.07 jmjC-M2 WT B108.08 jmjC-M2 WT selT-M12 selT-M14 B108.09 jmjC-M3 jmjC-M5 selT-M14 selT-M16 B108.10 jmjC-M3 jmjC-M5 selT-M14 selT-M16

B108.11 jmjC-M3 jmjC-M5 selT-M14 selT-M16 B108.12 jmjC-M3 jmjC-M5 selT-M14 selT-M16 B108.13 jmjC-M3 jmjC-M5 selT-M14 selT-M16 B108.14 jmjC-M4 WT selT-M12 selT-M14 B108.15 jmjC-M1 WT B108.16 jmjC-M1 WT selT-M12 selT-M14 B108.17 jmjC-M1 WT selT-M12 selT-M14 B108.18 jmjC-M7 WT selT-M12 selT-M14 B108.19 jmjC-M4 WT selT-M12 selT-M14 B108.20 jmjC-M1 WT B108.21 jmjC-M1 WT B108.22 jmjC-M1 WT selT-M14 selT-M16 B108.23 jmjC-M1 WT selT-M14 selT-M16

B115 B115.01 ra1-M7 ra1-M8 rho-GDI1-M1 B115.02 ra1-M7 ra1-M8 rho-GDI1-M1

B124 B124.01 ail6-M2 WT

Plant names are listed in the first column of each row, target genes are listed in subsequent columns. There are two columns per each target gene, one for each allele.

Table 11: Collapsed mutation frequencies for all five genes of interest.

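| Mutation | # unique events | ins/del | # nt |
|---|---|---|---|
| ra1-M1 | 1 | del | 3 |
| ra1-M2 | 1 | del | 18 |
| ra1-M3 | 3 | ins | 1 |
| ra1-M4 | 1 | del | 13 |
| ra1-M5 | 1 | del | 11 |
| ra1-M6 | 1 | del | 3 |
| ra1-M7 | 1 | del | 7 |
| ra1-M8 | 1 | del | 69 |
| ra1-M9 | 2 | ins | 1 |
| ra1-M10 | 1 | del | 16 |
| ra1-M11 | 1 | del | 2 |
| ra1-M12 | 1 | del | 1 |
| ra1-M13 | 1 | del | 69 |
| jmjC-M1 | 20 | ins | 1 |
| jmjC-M2 | 1 | del | 67 |
| jmjC-M3 | 4 | del | 1 |
| jmjC-M4 | 4 | ins | 1 |
| jmjC-M5 | 1 | del | 1 |
| jmjC-M6 | 1 | del | 17 |
| jmjC-M7 | 3 | del | 1 |
| jmjC-M8 | 1 | del | 6 |
| selT-M1 | 10 | ins | 1 |
| selT-M2 | 2 | ins | 1 |
| selT-M3 | 3 | ins | 1 |
| selT-M4 | 1 | del | 6 |
| selT-M5 | 1 | del | 4 |
| selT-M6 | 1 | ins | 1 |
| selT-M7 | 1 | ins | 1 |
| selT-M8 | 1 | del | 7 |
| selT-M9 | 6 | ins | 1 |
| selT-M10 | 1 | ins | 7 |
| selT-M11 | 2 | ins | 1 |
| selT-M12 | 5 | ins | 1 |
| selT-M13 | 1 | ins | 1 |
| selT-M14 | 4 | ins | 1 |
| selT-M15 | 1 | del | 5 |
| selT-M16 | 1 | ins | 1 |
| ail6-M1 | 2 | ins | 1 |
| ail6-M2 | 3 | del | 1 |
| ail6-M3 | 1 | del | 2 |
| ail6-M4 | 1 | del | 15 |
| rho-M1 | 1 | del | 1 |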

These were collapsed into the number of verifiable unique events from the raw counts of actual events (see text).

Table 12: Number of indels recovered by size in nucleotides.

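| # nt | # Insertions | # Deletions |
|---|---|---|
| 1 | 67 | 13 |
| 2 | 0 | 2 |
| 3 | 0 | 2 |
| 4 | 0 | 1 |
| 5 | 0 | 1 |
| 6 | 0 | 2 |
| 7 | 1 | 2 |
| 8 | 0 | 0 |
| 9 | 0 | 0 |
| 10 | 0 | 0 |
| 11 | 0 | 1 |
| 12 | 0 | 0 |
| 13 | 0 | 1 |
| 14 | 0 | 0 |
| 15 | 0 | 1 |
| 16 | 0 | 1 |
| 17 | 0 | 1 |
| 18 | 0 | 1 |
| 67 | 0 | 1 |
| 68 | 0 | 0 |
| 69 | 0 | 2 |
| Total Recovered | 68 | 32 |
| Percent single nt | 98.53% | 40.63% |

Overall Total: 100

Overall % single nt: 80.00%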

The counts from this table are based on the collapsed unique event numbers (Table 11).
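The summary rows of Table 12 are simple proportions of the per-size counts. The Python sketch below recomputes them from the non-zero rows of the table; the dictionary layout and function name are hypothetical and chosen only for illustration.

```python
# Illustrative sketch: recompute the Table 12 summary rows.
# Keys are indel sizes in nucleotides, values are collapsed unique-event counts.

INSERTIONS = {1: 67, 7: 1}
DELETIONS = {
    1: 13, 2: 2, 3: 2, 4: 1, 5: 1, 6: 2, 7: 2, 11: 1,
    13: 1, 15: 1, 16: 1, 17: 1, 18: 1, 67: 1, 69: 2,
}


def percent_single_nt(counts: dict[int, int]) -> float:
    """Percentage of events that involve exactly one nucleotide."""
    return 100 * counts.get(1, 0) / sum(counts.values())


total_ins = sum(INSERTIONS.values())  # 68
total_del = sum(DELETIONS.values())   # 32
overall_single = 100 * (INSERTIONS[1] + DELETIONS[1]) / (total_ins + total_del)

print(total_ins, total_del)
print(percent_single_nt(INSERTIONS))  # about 98.53
print(percent_single_nt(DELETIONS))   # 40.625, reported as 40.63% in Table 12
print(overall_single)                 # 80.0
```

The contrast between the two percentages is the quantitative basis for the observation that insertions are almost exclusively single base while deletions span a wide range of sizes.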

GENERAL CONCLUSIONS

The ease and simplicity of reprogramming CRISPR-Cas has led to the widespread adoption of the system for genome editing and related technologies. The lower cost in both time and money has also led to a general acceleration of genetic research. These improvements have allowed research like these studies to become commonplace.

In these studies, we described the basic functionality of the CRISPR/Cas9 system and how we utilized its double-stranded break capability to generate novel mutants in five genes.

Three of these genes (rho-GDI1, jmjC and selT) were identified from a yeast-2-hybrid experiment as potential interactors with RAMOSA1. The remaining two are ramosa1 itself and ail6, which was identified previously as a suspected actor in the ramosa pathway. Two unique CRISPR expression arrays were designed to cut 12 targets across all five genes. Agrobacterium tumefaciens mediated transformation was performed on immature embryos extracted from 27 Hi-II ears. Plants were regenerated, their DNA was sequenced, and the resulting mutants were genetically categorized. Mutations were recovered in all five targeted genes: 13 unique mutations were detected in ramosa1, 8 in jmjC, 16 in selT, 4 in ail6 and 1 in rho-GDI1. Plants with mutations were variously crossed into inbred lines B104, B73 or B73 ra1-63 for propagation and further analysis.

APPENDIX. YEAST-2-HYBRID GENE CANDIDATE INFORMATION

| HG Rating | Locus | Clone name | Gene name | Chr | V3 Tag | V4 Tag | Function | Close Paralogs | Known mutants | NCBI link |
|---|---|---|---|---|---|---|---|---|---|---|
| A | LOC103643664 | pB66_A-110 | jmjC | 1 | GRMZM2G417089 | Zm00001d033158 | Transcription factor jumonji (jmjC) domain-containing protein | 0 | | https://www.ncbi.nlm.nih.gov/gene/?term=LOC103643664 |
| A | LOC100282282 | pB66_A-14 | ran-binding protein 1 | 8 | GRMZM2G111411 | Zm00001d010504 | | 2 | | https://www.ncbi.nlm.nih.gov/gene/?term=LOC100282282 |
| A | LOC100856929 | pB66_A-278 | SBA1 | 4 | GRMZM2G154312 | Zm00001d049305 | HSP20-like chaperones superfamily protein | 1 | | https://www.ncbi.nlm.nih.gov/gene/?term=LOC100856929 |
| A | LOC100276114 | pB66_A-160 | RELK2 | 1 | GRMZM2G030422 | Zm00001d028481 | Uncharacterized | 3 | mu1013659 | https://www.ncbi.nlm.nih.gov/nuccore/NM_001319213.1 |
| A | rel2 | pB66_A-199 | rel2 | 10 | GRMZM2G042992 | Zm00001d024523 | ramosa 1 enhancer locus 2 | 0 | yes | https://www.ncbi.nlm.nih.gov/gene/100381576 |
| B | LOC100193897 | pB66_A-268 | Ras association and pleckstrin y domain 1 | 8 | GRMZM2G057075 | Zm00001d012336 | Calcium-dependent lipid-binding (CaLB domain) family protein | 0 | mu1049664*, MuI_713575* | https://www.ncbi.nlm.nih.gov/gene/?term=LOC100193897 |
| B | LOC100273971 | pB66_A-73 | NUP50A | 3 | GRMZM2G157317 | Zm00001d043757 | Nuclear pore complex protein NUP50A | 1 | mu1059542* | https://www.ncbi.nlm.nih.gov/gene/?term=LOC100273971 |
| B | LOC100283075 | pB66_A-371 | selT | 10 | GRMZM2G040389 | Zm00001d023692 | selenium binding, selenoprotein domain containing protein | 0 | MuI_581173.6* | https://www.ncbi.nlm.nih.gov/gene/?term=LOC100283075 |
| B | LOC100383470 | pB66_A-370 | SKU5 | 9 | GRMZM2G049693 | Zm00001d045599 | Monocopper oxidase-like protein SKU5 | 0 | | https://www.ncbi.nlm.nih.gov/gene/?term=LOC100383470 |
| B | LOC542718 | pB66_A-151 | GAPC2 | 6 | GRMZM2G180625 | Zm00001d035156 | Zea mays cytosolic glyceraldehyde-3-phosphate dehydrogenase GAPC2 | 1 | mu1022147 | https://www.ncbi.nlm.nih.gov/nuccore/NM_001112230.2 |
| C | LOC100280317 | pB66_A-232 | cdj3 | 5 | GRMZM2G134980 | Zm00001d013669 | Chaperone protein dnaJ 3 | 2 | mu1015057 | https://www.ncbi.nlm.nih.gov/nuccore/NM_001153243 |
| C | LOC100285962 | pB66_A-179 | rho-GDI1 | 5 | GRMZM2G085049 | Zm00001d017859 | rho GDP-dissociation inhibitor 1. Regulates the GDP/GTP exchange. Nuclear transport | 1 | | https://www.ncbi.nlm.nih.gov/nuccore/NM_001158851.1 |
| -- | LOC103631785 | ----- | ail6 | 1 | GRMZM2G399072 | Zm00001d027878 | AP2-EREBP-type transcription factor | 0 | | https://www.ncbi.nlm.nih.gov/gene/103631785 |