
2019 Project-write and 8th Annual Sc2.0 Meeting

New York, New York, United States November 11-14, 2019

ABSTRACT BOOKLET

Hosted by the Institute for Systems Genetics at NYU Langone Health

Abstracts Selected for Posters (P1 – P28):

P1. A “Marionette” S. cerevisiae strain to control metabolic pathways
Marcelo Bassalo, Chen Ye, Joep Schmitz, Hans Roubos, Christopher Voigt
Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Optimizing metabolic networks often requires fine-tuning of expression levels to minimize buildup of toxic intermediates while maximizing productivity. Inducible promoters are a straightforward strategy to systematically test different expression levels, providing levers to independently control targeted genes. However, the limited availability of orthogonal transcriptional sensors in the yeast Saccharomyces cerevisiae hinders their use to optimize an engineered biosynthetic pathway. In this work, we aim to expand the set of inducible promoters and develop a “Marionette” yeast strain containing a genome-integrated array of optimized sensors. We have taken steps towards this “Marionette” strain by constructing and testing an initial set of 4 orthogonal sensors, engineered by placing bacterial operator elements into yeast core promoters. We then demonstrate “Marionette” in yeast by tuning a toxic metabolic pathway to produce the monoterpene linalool. Initially, a two-level factorial experiment was performed to uncover expression rules of the targeted genes. By incorporating these rules, we performed a second optimization round. Overall, this pilot test of expression profiles allowed us to explore the equivalent of ~300 kb of pathway variant constructs with a single genetic design. Finally, we also demonstrate staging order of operations on the controlled genes. The ability to establish synthetic metabolic pathway control to independently tune component genes will accelerate metabolic engineering cycles in yeast, enabling rapid testing of multiple expression levels that ultimately can be used to train learning algorithms and uncover rules for optimal pathway flux.

P2. Dissecting the α-globin super-enhancer with synthetic regulatory genomics
Brendan R. Camellato, Leslie A. Mitchell, Helena Francis, Mira T. Kassouf, Matthew T. Maurano, Douglas R. Higgs, Jef D. Boeke
Institute for Systems Genetics, NYU Langone Health, New York, New York, USA
The α-globin locus has been a pioneering model to study gene regulation by distal enhancer elements, locus control regions, and super-enhancers. In the mouse, α-globin expression is regulated by a cluster of five enhancers located 50 kb away from the coding genes. It is not clear, however, whether these enhancers function as a cluster of independent elements, or synergistically as a super-enhancer. Current techniques to study enhancer function are not suitable for simultaneously assessing the roles of multiple elements in their native genomic context. We are thus applying our approach of synthetic regulatory genomics to dissect the α-globin super-enhancer and characterize the individual elements. Using our “Big DNA” synthesis technologies, we built a synthetic version of the mouse α-globin locus, as well as 20 different variants which feature combinatorial perturbations of specific enhancers and addition of various features to facilitate phenotypic readouts. These synthetic loci are assembled directly into BAC vectors, allowing delivery into suitably engineered mouse embryonic stem cells (mESCs) and integration into the genome using a recombinase-mediated cassette exchange strategy, overwriting the endogenous locus. Following in vitro differentiation of the mESCs into erythrocytes, or establishment of mouse strains, phenotypes can be assessed relating to gene expression, transcription factor binding, and chromatin state. So far, three synthetic loci have been integrated, establishing three mESC cell lines and one mouse strain. Characterization of these locus variants has produced results that were unexpected based on previous work and support a model in which the α-globin enhancer cluster functions synergistically as a super-enhancer.

P3. Development of platform technologies for metabolic engineering in Saccharomyces cerevisiae
Alexander C. Carpenter1,2, Thomas C. Williams1,2, Isak S. Pretorius1 and Ian T. Paulsen1
1 Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
2 CSIRO Synthetic Biology Future Platform, Canberra, ACT 2601, Australia
Directed evolution is an attractive method for metabolic engineering that allows for the identification of non-obvious changes that increase production of a compound of interest. Two major limitations in directed evolution experiments are the emergence of cheater cells within the population, and the difficulties involved in generating new biosensors. Two methods are currently being developed to mitigate both of these limitations in S. cerevisiae. Two cell directed evolution (2CDE) creates a synthetic co-dependency in which production of a compound of interest by one strain triggers the activation of a biosensor in a second strain. Genetic separation of the biosensor from compound of interest production would greatly reduce the formation of cheating sub-populations. Underlying assumptions involved in 2CDE have successfully been tested using doxycycline-induced amino acid cross-feeding, and PHBA biosensor activation from adjacent overproducing strains. Iterative simultaneous yeast display (ISYD) is a technique to create binding peptides for compounds of interest which can be modularly used in biosensor designs. The technique uses simultaneous screening of yeast display libraries in a non-competitive binding assay for a compound of interest. Furthermore, the construction uses intron-mediated genetic assembly to allow iterative addition of new peptides to identified binding peptides, increasing binding affinity and specificity. Intron-mediated genetic assembly is being tested in model systems using fluorescence-activated cell sorting based outputs.

P4. Identification of optimal orientation and insertion loci for contextual gene expression in bacterial by in vitro transposition
JungHwan Cho, Seungwoo Baek, In-Geol Choi
Department of Biotechnology, Graduate School, College of Life Sciences and Biotechnology, Korea University, Seoul, Republic of Korea
Bacterial gene expression from a plasmid vector is mainly affected by non-contextual factors such as ribosome binding site, codon usage, and promoter strength. However, the contextual effect of gene expression (e.g. orientation and location of genes) is also a critical factor, and its contribution to gene expression has not been assessed thoroughly. Here, we employed the Tn5 transposase system, which inserts a reporter gene cassette at random into plasmid vectors, to examine the contextual effect of gene expression. The main objective is to find the ‘optimal orientation and insertion loci’ that are correlated with various expression levels. To do this, we performed in vitro transposition experiments to build a library of expression vectors having various orientations and insertion loci of the gene cassette. Random, single insertion of a GFPuv gene cassette into pUC19 was examined by NGS of a pooled library (1,000 colonies). The insertion loci were screened by fluorescence intensity in E. coli. By relating the insertion position data to the expression level, we categorized contextual effects by GFPuv expression level.

P5. Methanol assimilation in native and synthetic strains of Saccharomyces cerevisiae
Monica I. Espinosa1,2, Kaspar Valgepea3,4, Ricardo A. Gonzalez-Garcia3, Colin Scott2,5, Isak S. Pretorius1, Esteban Marcellin3, Ian T. Paulsen1* and Thomas C. Williams1,2*
1 Department of Molecular Sciences, Macquarie University, Sydney, Australia
2 Synthetic Biology Future Science Platform, CSIRO, Sydney, Australia
3 Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Australia
4 ERA Chair in Gas Fermentation Technologies, Institute of Technology, University of Tartu, Tartu, Estonia
5 Biocatalysis and Synthetic Biology Team, CSIRO, Canberra, Australia
Microbial fermentation for chemical production is becoming more broadly adopted as an alternative to petrochemical refining. Fermentation typically relies on sugar as a feedstock. However, one-carbon compounds like methanol are a more sustainable alternative as they do not compete for arable land. This study focused on engineering the capacity for methylotrophy in the yeast Saccharomyces cerevisiae through a yeast xylulose monophosphate (XuMP) pathway, a ‘hybrid’ XuMP pathway, and a bacterial ribulose monophosphate (RuMP) pathway. Through methanol toxicity assays and 13C-methanol growth phenotypic characterization, the bacterial RuMP pathway was identified as the most promising synthetic pathway for methanol assimilation. When testing higher methanol concentrations, methanol assimilation was also observed in the wild-type strain, as 13C-ethanol was produced from 13C-methanol. These results demonstrate that S. cerevisiae has a previously undiscovered native capacity for methanol assimilation and pave the way for further development of both native and synthetic one-carbon assimilation pathways in S. cerevisiae.

P6. Rewriting the firmware of RNA-guided nucleases via directed evolution
Gregory W. Goldberg, Brendan Camellato, Jeffrey Spencer, Neta Agmon, David Ichikawa, David Giganti, Jef D. Boeke, Marcus B. Noyes
Institute for Systems Genetics, NYU Langone Health, New York, New York, USA
Programmable RNA-guided nucleases are facilitating top-down genome editing endeavors throughout the biomedical research community and show great potential for use in therapeutic applications. The sequence specificity of the most widely used RNA-guided nucleases, including Streptococcus pyogenes Cas9 (SpCas9), is determined by essential interactions with a target-abutting DNA sequence known as a protospacer adjacent motif (PAM), as well as base pairing between guide RNA and target DNA. PAMs are recognized through hardwired protein-DNA interactions that mechanistically precede the formation of guide-RNA-dependent R-loops, and thereby limit the sequence space available for targeting. Wild-type SpCas9 is known to have sub-optimal interactions with ‘NAG’ and ‘NGA’ PAMs relative to ‘NGG’, and no detectable interactions with ‘NCC’ PAMs. In this study, we exploited custom directed evolution systems for SpCas9 that incorporate positive and negative selection to select variants with improved activity on ‘NAG’ PAMs and reduced activity on ‘NGG’ PAMs. A pooled selection strategy for identifying epistatic mutations within evolving populations is also presented. This work has implications for the systematic engineering of orthogonal RNA-guided nucleases and the strategic implementation of co-selection parameters in directed evolution.
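The way a PAM restricts the targetable sequence space, as described in P6, can be illustrated with a short sketch. This is only an illustration and not the authors' selection system: it scans the forward strand of a DNA string for protospacers that are immediately followed by a chosen PAM such as 'NGG' or 'NAG'. The function names `pam_to_regex` and `find_targets`, the 20-nt protospacer default, and the example sequence are assumptions made for this example.

```python
import re


def pam_to_regex(pam):
    """Translate a PAM such as 'NGG' into a regex; 'N' matches any base.

    Only A, C, G, T and N are handled here (an assumption of this sketch).
    """
    return "".join("[ACGT]" if base == "N" else base for base in pam.upper())


def find_targets(seq, pam="NGG", protospacer_len=20):
    """Return (start, protospacer, pam_sequence) for each forward-strand site
    where a protospacer is immediately followed by the requested PAM."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(
        r"(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_to_regex(pam))
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    example = "ACGT" * 5 + "TGG"  # hypothetical 23-nt sequence
    print(find_targets(example, "NGG"))
    print(find_targets(example, "NAG"))
```

With the same input, switching the `pam` argument changes which sites are reported, which is the sense in which PAM recognition limits the sequence space available for targeting. A fuller treatment would also scan the reverse complement.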
P7. Redesigning the yeast genome to reveal its molecular intricacy and introduce extensibility
Jianhui Gong1, Feng Gao1, Yun Wang1, Yang Deng2, Yuerong Wang1, Jintao Zhang1, Xian Fu1, Jef D. Boeke3, Yizhi Cai4, Huanming Yang1, Yue Shen1
1 BGI-Shenzhen, Shenzhen, P.R. China
2 BGI-Qingdao, Qingdao, P.R. China
3 Institute for Systems Genetics, NYU Langone Medical Center, New York, New York, United States
4 Manchester Institute of Biotechnology, University of Manchester, Manchester M1 7DN, United Kingdom
Here we report on the design, construction and characterization of a 1028-kilobase pair synthetic chromosome VII, synVII, based on native chromosome VII in Saccharomyces cerevisiae and the Sc2.0 uniform design principles. A bug in megachunk S was observed through growth curves and spotting assays, and was associated with abnormal mRNA folding and translation repression due to two loxPsym sites residing in one transcript. Additionally, we obtained two aneuploid strains carrying an extra copy of a single chromosome VII (N+1) or synVII (N+S1) by transient nondisjunction of one specific chromosome. By applying the SCRaMbLE (Synthetic Chromosome Rearrangement and Modification by LoxPsym-mediated Evolution) system, we generated aneuploid-tolerating variants with massive genome rearrangements. Furthermore, we gained new insights into the phenotype-genotype relationship in aneuploid cells through characterizing these strains via trans- analysis.

P8. Identifying the drivers of diversity induced by genome SCRaMbLEing
Amanda L Hughes, Aaron N Brooks, Sandra Clauder-Muenster, Lars M Steinmetz
European Molecular Biology Laboratory, Heidelberg, Germany
The Synthetic Chromosome Recombination and Modification by LoxP-mediated Evolution (SCRaMbLE) system has created more than 60 strains in which the genes on the synIXR chromosome are uniquely reorganised. The resultant genomic reorganisation generates novel junctions between genes, non-coding RNAs, and UTRs, presenting the opportunity to observe gene regulation across multiple genomic contexts. Using Oxford Nanopore direct RNA sequencing, we observed that genomic rearrangements produce variation in transcription start and termination sites of genes that have undergone SCRaMbLEing. To identify candidate drivers of these transcriptional changes, we are mapping chromatin accessibility in these strains. We are implementing long-read single molecule footprinting with DNA methyltransferases to map chromatin accessibility, as this technique allows for cross-junctional reads that uniquely map to the highly duplicated . Using the accessibility data from these maps, we will infer transcription factors and chromatin states that drive transcriptional changes induced by genomic reorganisation.

P9. Genome writing and the application using human/mouse artificial chromosomes
Yasuhiro Kazuki
Chromosome Engineering Research Center, Tottori University, Tottori, Japan
Conventional vector systems for expressing desired genes in mammalian cells have problems such as insertion of the vector into the host and limits on the size of the introduced DNA. To solve these problems, we constructed a human artificial chromosome (HAC) vector and a mouse artificial chromosome (MAC) vector that do not contain any endogenous genes, using our unique chromosome engineering technology. As gene delivery vectors, the HAC/MAC can deliver Mb-sized gene clusters and multiple genes, and are stably and independently maintained with defined copy numbers in host cells, as well as being transferable to any other cell line via microcell-mediated chromosome transfer (MMCT). Using the HAC/MAC vectors, we demonstrated the following applications: 1) humanized mice and rats for predicting human drug metabolism, 2) novel Down syndrome model animals and cells for identifying the responsible genes and for preclinical studies, 3) basic studies for gene and cell therapy of Duchenne muscular dystrophy, and 4) fully human antibody-producing animals for antibody drugs. Thus, HAC/MAC technology can be expected to be used not only for basic research but also for applied research such as drug discovery and medical applications. We are planning to develop fundamental technologies to introduce Mb-sized synthetic DNA into target cells efficiently using the HAC/MAC technology.

P10. Tracking meiotic chromosomes structure, pairing and recombination with optimized Hi-C
Charlotte Cockram, Agnès Thierry, Romain Koszul
Institut Pasteur, Paris, France
We recently redesigned and reassembled in yeast a 145 kb region (Syn-HiC) with regularly spaced restriction sites to improve the resolution and sensitivity of the Hi-C assay. In the Syn-HiC region, the signal-to-noise ratio is enhanced, and the redesigned sequence is now distinguishable from its native homologous counterpart in an otherwise isogenic diploid strain. As a proof of principle, we tracked the establishment of chromatin loops and homolog pairing during meiotic prophase in a synchronized population, providing insights into the individualization and pairing of homologs, as well as their internal restructuring into arrays of loops during meiotic prophase. Following up on this work, we have now assembled a second Syn-HiC region homologous to the first one. We can now track the 3D folding of, and interactions between, the two Syn-HiC regions at high resolution during meiotic pairing. In addition, the restriction site polymorphisms introduced along both sequences allow us to track recombination at multiple hotspots along the region. Overall, this experimental system enables us to follow the evolution of the structure of two large homologous regions representing multiple loops through the entire meiotic prophase, which includes compaction, pairing and recombination. Here we present the system and our first results using it.

P11. Digital Genome Engineering - Unlocking the Unlimited potential of Biology
Bryan Leland, Eric Abbate, Krishna Yerramsetty, Steve Federowicz, Katherine Krouse, Michael Clay, Dan Held, Richard Fox, Nandini Krishnamurthy
Inscripta, Inc., Boulder, Colorado, USA
To fuel the pace of discovery through CRISPR editing, the research community requires tools to efficiently deliver precise edits at the gene, pathway, and whole-genome level. Unfortunately, current editing techniques suffer from limitations in scalability, efficiency, diversity of edit types, and accessibility. The Onyx platform developed by Inscripta, Inc., dramatically increases the scale of Digital Genome Engineering, making it possible to easily perform CRISPR-based forward engineering experiments through high-throughput diversity generation in E. coli and yeast. The platform simplifies the complex editing workflow for biologists by offering an end-to-end solution from design to diversity generation, including software, reagents, benchtop instrument and analytics. Using the Onyx platform, we have performed high-throughput diversity generation with up to 200,000 edits across the genome in both coding and non-coding regions to improve lysine production in E. coli, and 130,000 edits in yeast to improve tyrosine production. Inscripta’s technology has also fueled large-scale discovery of genotype-phenotype relationships in antibiotic resistance and abiotic stress responses. Ultimately, this approach will give scientists a more comprehensive view of the phenotypes they study by allowing them to test far more genomic changes and to rapidly select the most effective ones. This should have far-reaching benefits for agricultural and bio-industrial science, healthcare, and alternative energy.

P12. Production and analysis of fully human antibody producing mice using mouse artificial chromosome
Yamazaki Kyotaro1, Hiroyuki Satofuka2, Yasuhiro Kazuki1,2
1 Department of Biomedical Science, Institute of Regenerative Medicine and Biofunction, Graduate School of Medical Science, Tottori University, Tottori, Japan
2 Chromosome Engineering Research Center, Tottori University, Tottori, Japan
Mouse artificial chromosomes (MACs) are unique gene delivery vectors with several characteristics suitable for synthetic biology: 1) stable maintenance in mouse cells and tissues, 2) germline transmission capability in rodents, 3) capacity to carry Mb-sized DNA, and 4) stringently regulated gene expression, as on native chromosomes. In this study, we generated fully human antibody-producing mice (Tc-mAb mice) in which the mouse immunoglobulin (Ig) genes were knocked out and human Ig genes were introduced using a MAC that carries the IgH (1.6 Mb) and Igκ (1.7 Mb) loci (IgHK-NAC). In the Tc-mAb mice, IgHK-NAC was confirmed to be stable in various tissues and hematopoietic cells, and the human Ig genes were specifically expressed in tissues related to the immune system. High-level expression of human Igs (IgG1, 2, 3, 4 and IgM) and undetectable expression of mouse Igs were demonstrated in the sera of Tc-mAb mice. Analysis of overall immune-repertoire diversity indicated that the human Ig genes were fully functional in mice, since the usage frequency of V(D)J segments was almost the same as that in human peripheral blood mononuclear cells (hPBMCs). In addition, Tc-mAb mice have a CDR3H length of about 16.4 aa, equal to the human CDR3H length. These data indicate that production of human mAbs in TC-mAb mice was successful. Therefore, MACs can carry large, functional Mb-sized domains into mice, and MACs will be important tools with unique features for genome writing and synthetic biology.

P13. The design and assembly of synthetic yeast chromosome VIII
Stephanie Lauer, Jingchuan Luo, Weimin Zhang, Jef Boeke
Institute for Systems Genetics, NYU Langone Health, New York, New York, United States
The synthetic yeast genome project (Sc2.0) aims to build the first synthetic eukaryotic genome, marking a groundbreaking achievement in the field of synthetic biology. Assembly of all 16 designer chromosomes nears completion, and consolidation of these chromosomes into a single synthetic strain is already underway. Chromosome VIII was redesigned to remove 11 tRNA genes and 18 introns, resulting in a decreased chromosome length of 506,705 base pairs (10% shorter than the wild-type counterpart). All TAG stop codons were recoded to TAA, 183 loxPsym sites were added, and telomere regions were replaced with universal telomere caps. Construction of synthetic chromosome VIII (synVIII) was performed using a combination of SwAP-In (Switching Auxotrophies Progressively for Integration) and MRA (Meiotic Recombination-mediated Assembly). This strategy systematically replaces wild-type sequences with various “chunks” of synthetic DNA across several different yeast strains in parallel. Mating the semi-synthetic strains together results in full-length synVIII. During assembly, we observed a lower integration success rate near the centromere compared to other regions, and we are currently testing how the centromere affects mitotic recombination efficiency. After finishing the initial assembly, we performed and generated a draft of synVIII, which is ~95% complete. To finalize the strain, we are re-integrating a ~25 kilobase pair synthetic sequence, repairing a ~30 kilobase pair duplication, and removing a tRNA gene. Replacing the remaining wild-type sequences with synthetic DNA will result in a final synVIII strain with near wild-type fitness, which can then be used as a platform for biotechnology applications.

P14. Karyotype engineering: Novel 3D contacts and DNA replication organization revealed in megachromosome strains
Luciana Lazar-Stefanita1, Jingchuan Luo1, Remi Montagne2, Xiaoji Sun1, Agnes Thierry2, Guillaume Mercy2, Julien Mozziconacci2, Romain Koszul3, Jef D. Boeke1
1 Institute for Systems Genetics, NYU Langone Health, New York, New York, United States
2 Institut Pasteur, Paris, France
3 Structure and Genome Instability, Muséum National d'Histoire Naturelle, Paris, France
Eukaryotic genomes vary greatly in terms of size, chromosome number and genetic complexity. Their nuclear organization is among the most complex cellular processes and requires fine coordination between DNA structure and function. Although the functional aspect of chromatin organization has been clear for many years, identifying the principles that govern it has been extremely challenging. Here we used engineered karyotypes of Saccharomyces cerevisiae, whose 16 native chromosomes were fused into gigantic DNA molecules, to characterize by Hi-C the effect of DNA size and spatial localization on nuclear architecture. We observed that mega-chromosomes lead to an increase in nuclear size and display marked changes in their 3D structure, mostly corresponding to locations of deleted centromeres and telomeres. Moreover, de-clustering of the inactive centromeres resulted in their loss of early replication firing, while re-positioning of the former telomere-proximal regions revealed a new set of gene contacts. Finally, the expected increase of intrachromosomal interactions did not reflect any variation in the status of the chromatin fiber. On the contrary, the folding of the DNA inside mega-chromosomes remains conserved throughout the entire cell cycle. Altogether these results highlight the importance of synthetic genome editing in the field of nuclear organization and function.

P15. Design and assembly of the yeast synthetic chromosome synIXL
Laura H. McCulloch1*, Vijayan Sambasivam2*, Sivaprakash Ramalingam2*, Boeke, Bader, and Chandrasegaran Labs, Build-a-Genome Class, Joel S. Bader2, Jef D. Boeke1†, Srinivasan Chandrasegaran2†
1 Institute for Systems Genetics, NYU Langone Health, New York, New York, United States
2 Johns Hopkins University, Baltimore, Maryland, United States
*Co-first authors †Co-senior authors
The synthetic yeast genome project (Sc2.0) marks a key milestone in the development of “designer” eukaryotic genomes. This global effort seeks to produce a modified version of the ~12 Mb Saccharomyces cerevisiae genome. Synthetic chromosomes designed for Sc2.0 contain a variety of customized elements, including loxPsym sites needed for an inducible evolution system called SCRaMbLE, a watermarking system called PCRTagging, tRNA gene relocation from individual chromosomes to a separate tRNA neo-chromosome, repeat deletion, stop codon recoding from TAG to TAA, and site recoding. So far, researchers have finished assembly of six synthetic yeast chromosomes, with the remaining chromosomes nearing completion. Following Dymond et al.’s earlier construction of the shorter right arm of synthetic chromosome IX, synIXR, we have designed and assembled a draft of the longer left arm, synIXL. In conjunction with synIXR, this arm reduces the total size of chromosome IX from a wild-type length of 439,885 bp to a synthetic length of 405,513 bp. SynIXL was assembled via the SwAP-In (Switching Auxotrophies Progressively by Integration) method. Using this strategy, we initially assembled 30-60 kb synthetic megachunks in yeast from 2-4 kb minichunks. We then replaced the underlying wild-type chromosome sequence with our megachunks in a stepwise fashion. At present, we have completed our initial synIXL assembly draft and are correcting errors. Once finalized, we hope that synIXL will not only represent a central piece of Sc2.0, but also serve as a platform for exploring additional biological questions.
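Two of the Sc2.0 design features mentioned in P13 and P15, TAG-to-TAA stop codon recoding and the resulting change in chromosome length, lend themselves to a small worked sketch. The code below is a minimal illustration under stated assumptions and is not the consortium's design software: it treats an open reading frame as a plain string, ignores overlapping genes, and uses hypothetical helper names (`recode_stop_tag_to_taa`, `percent_shorter`). The only figures taken from the abstract are the chromosome IX lengths.

```python
def recode_stop_tag_to_taa(orf):
    """Swap a terminal TAG stop codon for TAA; any other ORF is returned as is.

    Assumes `orf` is a complete, in-frame coding sequence (length divisible by 3).
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if orf.endswith("TAG"):
        return orf[:-3] + "TAA"
    return orf


def percent_shorter(native_bp, synthetic_bp):
    """Percentage by which the synthetic sequence is shorter than the native one."""
    return 100.0 * (native_bp - synthetic_bp) / native_bp


if __name__ == "__main__":
    # Hypothetical toy ORF ending in a TAG stop codon.
    print(recode_stop_tag_to_taa("ATGAAATAG"))
    # Chromosome IX lengths quoted in P15: 439,885 bp native, 405,513 bp synthetic.
    print(round(percent_shorter(439_885, 405_513), 1))
```

Because TAG and TAA have the same length, stop codon recoding alone leaves chromosome length unchanged; the reduction quoted in the abstract comes from the other design changes, such as tRNA gene relocation and repeat deletion.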

P16. Development of a CRISPR-mediated methodology to re-write microalgal genomes
Alexandra Mystikou, Weiqi Fu, David R. Nelson, Kourosh Salehi-Ashtiani
Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
Recent advances in DNA synthesis technology and genome editing have enabled the re-writing and re-coding of genomes, adding new dimensions to biological research. However, the technology to replace native genomic DNA with synthetic DNA has not been implemented in green microalgae, in part due to the low efficiency of homologous DNA recombination in this group of organisms. In this study, we are developing the methodology to replace targeted chromatin segments of Chlamydomonas reinhardtii with synthetically generated DNA. Our goal is to improve targeted recombination in C. reinhardtii by introducing CRISPR-Cas mediated cuts at the intended homologous recombination sites to enhance the replacement of the endogenous target region with the introduced synthetic DNA. Accordingly, the synthesized DNA is transformed into C. reinhardtii through electroporation along with an in vitro assembled CRISPR-Cas12 (Cpf1) complex. We chose Cpf1/Cas12 because this nuclease provides higher efficiency and specificity in Chlamydomonas compared to Cas9. As a set of pilot studies to allow quantitative assessments of our approach, synthetic sequences (ranging from 100 bases to 3 kb) that include a selectable marker have been synthesized to replace the native sequences through CRISPR-Cpf1. The selection markers that have been used (hygromycin and spectinomycin) give antibiotic resistance to the cells and enable positive selection for the introduction of the segment. Additionally, the target genes (FKB12 and AMI1) can provide counter-selection when mutated. The combination of positive and negative selectable markers in the experiments provides the ability to quantify on-target and off-target integration of the synthetic DNA.

P17. Targeted genome editing at repetitive genetic elements using filtered MAGE and CRISPR
Felix Radford, Farren Isaacs
Yale University, New Haven, Connecticut, United States
Genome engineering has been an enabling technology for evolution of enzyme function, biosynthetic pathways, and whole genomes, with applications to basic science, biotechnology, and medicine. A major limitation of all genome editing approaches is the inability to selectively modify a genetic element that possesses high sequence homology to other genetic loci. Here we present filtered CRISPR and MAGE, which allow deep mutagenesis of only one specified locus in the genome among many possible repeats. This method can be used to evolve noncoding RNAs continuously in vivo, without the need for in vitro cloning and transformation steps.

11 P18. The Sc2.0 tRNA neochromosome Daniel Schindler1, Roy Walker, Aaron Brooks, Yue Shen, Lars Steinmetz, Jef Boeke2, Yizhi Cai1

1Manchester Institute of Biotechnology, Manchester, United Kingdom 2Institute for Systems Genetics, NYU Langone Health, New York, New York, United States The Sc2.0 consortium aims to generate the first synthetic eukaryotic genome. Almost all synthetic chromosomes have now been built and are currently being characterized. Merging multiple chromosomes in a single cell is in progress to ultimately construct the first synthetic eukaryotic cell. One major change to the Sc2.0 genome is the relocation of all tRNA genes to a new, de novo designed chromosome: the tRNA neochromosome. The 275 tRNA genes of S. cerevisiae are distributed throughout the whole genome. tRNA genes are highly transcribed and are preferred insertion sites for transposable elements. By this means, tRNA genes cause genomic instability. Removing the tRNA genes ensures that risk of transposon insertion into synthetic chromosomes is minimized during their construction. Here we present the design, construction and characterisation of the first de novo designed and constructed synthetic eukaryotic chromosome. The tRNA neochromosome has been constructed and is currently being extensively characterized, including growth phenotyping, expression analysis and localization studies are only some highlights. Interestingly the tRNA neochromosome seems to be unstable in haploid cells but can provoke a whole genome duplication. In the genome-duplicated cells the neochromosome can be maintained stably in circular or linear form. In the future, the tRNA neochromosome will supplement the final Sc2.0 strain with all necessary tRNA genes and will give new insights into tRNA biology. P19. The art of fusion: a synthetic approach to creating cross-kingdom hybrids Robert Smith1, Elise Cachat1, Erika Szymanski2, Tarsh Bates3,Lenny Nelson1, Daniel Sachs1, Lucia Bandiera1, Filippo Melonascina1, Ionat Zurr3, Jane Calvert1, Susan Rosser1, Oron Catts3

1The University of Edinburgh, Edinburgh, Scotland, United Kingdom 2 Colorado State University, Fort Collins, Colorado, United States 3SymbioticA, University of Western Australia, Australia Cell fusion has been used historically to produce hybrid cell lines. More recently, in the context of Synthetic Biology, fusion between mammalian and yeast cells has been used to transfer genetic material from one cell type to another (Brown et al. 2017. Nucleic Acids Res. 45(7):e50). This type of cross-kingdom fusion, beyond its scientific merits, raises broader questions: how do multi-kingdom cell fusions challenge our categories and understandings of life? Where do these novel entities belong, both biologically and

12 culturally? In this interdisciplinary collaboration between artists (SymbioticA, University of Western Australia), scientists and social scientists (SynthSys, University of Edinburgh), we attempted to create yeast-mammalian cell fusions using a synthetic biology approach, while exploring such questions through art. In the lab, human cells expressing viral proteins, previously shown to induce fusion between mammalian cell membranes (Cachat et al. 2014. J Biol Eng. 8(1):26), were mixed with yeast spheroplasts on a microfluidic imaging platform, in an attempt to record and publicly present fusion events between Saccharomyces cerevisiae and HEK293 cells. The potential creation of new life forms raises questions of responsible innovation and ethics. In this project, rather than commenting on these issues from a distance, they are incorporated in – and explored alongside – the actual manipulation of life forms. This is a direct way of examining, questioning and critiquing the new engineered constructions that are being created through synthetic biology. P20. Genetic code expansion in Bacillus subtilis Devon A. Stork, Erkin Kuru, Aditya Jog, Aditya M. Kunjapur, Ethan Garner, George M. Church Harvard University, Cambridge, Massachusetts, United States Encoding nonstandard amino acids (nsAAs) into proteins allows for expansion of the genetic code beyond the standard 20 amino acids for probing, labelling, or controlling proteins in a minimally disruptive manner. However, these tools have been mostly unavailable in many bacterial model systems, such as the primary gram-positive model , Bacillus subtilis. Here we describe the use of several classes of genome- integrated synthetases to incorporate many different nsAAs into proteins in B. subtilis, including nsAAs used for biorthogonal labelling, fluorescence imaging and photo- crosslinking. We also demonstrate a nsAA-dialable protein expression system in this bacterium. 
The expression of a target gene can be enhanced >50-fold when nsAAs are added and by up to thousands-fold when combined with a transcriptional inducer. Unlike E. coli nsAA systems, where nsAAs are not incorporated at native UAG codons even before recoding efforts, B. subtilis nsAA systems incorporate nsAAs into many genomic proteins at native UAG codons. This feature presents both challenges and opportunities for follow-up work in B. subtilis nsAA research and genome modification. The general and effective expansion of nsAA technology to B. subtilis can facilitate our understanding of cell biology in this bacterium and industrial production of nsAA-containing proteins.

P21. In vitro genome manipulation

Kazuhito V. Tabata1, Mana Ono1, Masayuki Suetsugu2, John I. Glass3, Nacyra Assad-Garcia3, Hiroyuki Noji1

1The University of Tokyo, Tokyo, Japan 2Rikkyo University, Tokyo, Japan 3The J. Craig Venter Institute, Rockville, Maryland, United States

We are developing technology to manipulate and transplant the mycoplasma genome in vitro.

P22. Long chain DNA foundry at Kobe University

Kenji Tsuge, Shunsuke Takahashi, Akihiko Kondo

Graduate School of Science, Technology, and Innovation, Kobe University, Kobe, Japan

In Genome Project-write, technology for synthesizing long-chain DNA of arbitrary sequence at low cost is desired, but developing such technology requires many improvements across the whole process. For this purpose, we have begun to develop a long DNA synthesis foundry at Kobe University. Our core long DNA synthesis technology is the OGAB method, which uses the Bacillus subtilis plasmid transformation system. Owing to the nature of B. subtilis transformation, the OGAB method requires long tandem-repeat ligation products of the material DNA blocks and the assembly vector, prepared in vitro. However, such long tandem-repeat ligation products are obtained only after precise equimolar adjustment of the blocks. To achieve this, we developed a unique automation system for equimolar mixing in collaboration with Precision System Science, Co. Moreover, we developed a dedicated chemical DNA synthesizer for long DNA synthesis in collaboration with Nihon Techno Service, Co. This synthesizer can synthesize single-stranded DNA of up to 200 nt at low cost. Together with specially developed PCR-based single-stranded DNA assembly conditions, we can synthesize almost all OGAB material DNA fragments, including those refused by commercial synthesis services. Using this total system, we can now construct long DNAs of up to 100 kb from as many as 50 DNA blocks.

P23. of Human Cells for Radiotolerance

Craig Westover, Sherry Yang, Sonia Iosim, Deena Najjar, Daniel Butler, Daniela Bezdan, Christopher E. Mason

Weill Cornell, New York, New York, United States

Space flight has been documented to produce a number of detrimental physiological effects as a result of cosmic radiation.
The radiation dose in space is about 100 times higher than the average effective dose per year from natural radiation on Earth and can produce DNA double-strand breaks, leading to increased chromosomal aberrations. The harsh environmental effects of space on organisms have also been studied at the molecular level, shedding light on some of the underlying mechanisms that give rise to space-induced alterations of cellular functions such as proliferation, differentiation, maturation, and cell survival. Our lab was recently involved in the NASA Twin Project, where we analyzed Scott Kelly’s genome, transcriptome, and corresponding epigenetic modifications in response to 1 year of space flight. With this information in mind, we are now genetically engineering HEK293 cells to survive ionizing cosmic radiation. It was discovered that tardigrades can survive about 4000x more radiation than other organisms by utilizing a gene known as Damage Suppressor Protein (DSUP). DSUP works by binding its C-terminal domain to the genome during times of stress to protect against reactive oxygen species formed by radiation. Here we introduce a DSUP marker via lentiviral transfection into HEK293 cells and test a range of radiation doses from a Cs-137 source to generate kill curves and assess cell proliferation. We demonstrate that HEK293 cells transfected with DSUP are able to withstand higher doses of radiation than WT cells.

P24. Identification of optimal orientation and insertion loci for contextual gene expression in bacterial plasmid by in vitro transposition

JungHwan Cho, Seungwoo Baek, In-Geol Choi

Department of Biotechnology, Graduate School, College of Life Sciences and Biotechnology, Korea University, Seoul, Republic of Korea

Bacterial gene expression from a plasmid vector is mainly affected by various non-contextual factors such as the ribosome binding site, codon usage, and promoter strength. However, the contextual effect (e.g. orientation and location of genes) is also a critical factor for gene expression, and its contribution has not been assessed thoroughly. Here, we employed the Tn5 transposase system, which randomly inserts a reporter gene cassette into plasmid vectors, to examine the contextual effect on gene expression. The main objective is to find the ‘optimal orientation and insertion loci’ that are correlated with various expression levels. To do this, we performed in vitro transposition experiments to build a library of expression vectors with various orientations and insertion loci of the gene cassette. Random, single insertion of a GFPuv gene cassette into pUC19 was examined by NGS of a pooled library (1,000 colonies). The insertion loci were screened by fluorescence intensity in E. coli. Relating the insertion position data to the expression level, we categorized contextual effects by GFPuv expression level.

P25. Circular Human Artificial Chromosome is Heritable Across Multiple Generations

Leslie Mitchell, Sang Kim, Aleksandra Wudzinska, Nicholas Lee, Jef Boeke

Institute for Systems Genetics, NYU Langone Health, New York, New York, United States

Human Artificial Chromosomes (HACs) are a unique source for modern gene therapy development.
HACs offer advantages over alternative gene delivery systems in that they replicate autonomously without integrating into the host genome, an event that historically can lead to insertional mutagenesis, immunogenicity and oncogene activation. Additionally, HACs are able to carry larger genes, are maintained in cycling as well as quiescent cells, are heritable across generations and can be engineered to include selective genes that allow efficient loss of the HAC if required. Here we show, for the first time, germline transmission of a circular HAC. Murine ES cells containing an alphoidtetO-HAC were used to generate transgenic mice through tetraploid complementation. We observe that the HAC is stably transmitted episomally across multiple generations.

P26. Development of a methodology to evaluate the essentiality of a megabase-scale genome region in haploid human cells (HAP1)

Hitoyoshi Yamashita, Hikaru Kurasawa, Tomoyuki Ohno, Shinya Kaneko, Yasunori Aizawa

Tokyo Institute of Technology, Tokyo, Japan

Human and other mammalian genomes have undergone paralog gene expansion and transposon propagation, resulting in redundant cellular networks and vast noncoding genomic regions. To understand the fundamental genome architecture underlying the core and essential networks of mammalian cells, we developed a method for identifying the essential DNA elements required for cell viability, called “LARge-scale Genome Evaluation for cell Survival”, or “LARGES”. LARGES employs HPRT1 as a negative selection marker; cell survival rates increase under 6-thioguanine selection when CRISPR-mediated deletion of the HPRT1-containing locus is non-deleterious. As a pilot, we applied LARGES to the endogenous HPRT1 locus on the X chromosome in haploid human cells (HAP1). Gradual expansion of the deleted regions allowed elimination of 5.5 Mbp containing 69 genes (3 of which are known to be conditionally essential). Microarray analysis indicated that ~250 genes changed transcription levels by >2-fold, while a growth competition assay revealed no fitness defect. Additionally, the RBMX2 and MMGT1 loci were found to be indispensable for HAP1 survival. To assess whether the genes themselves were vital or whether these regions include crucial cis-regulatory elements for nearby essential gene(s), the entire genic units were transplanted to chromosome 12 and LARGES was performed, demonstrating that the rescue was successful in both cases. In summary, LARGES is a useful screening strategy for essential elements in gigabase-scale genomes with low gene densities.

P27. Total synthesis of a 1.4 Mb yeast chromosome, synIV

Weimin Zhang1, Hitoyoshi Yamashita3, Michael J. Shen1, Luciana Lazar-Stefanita1, Hikaru Kurasawa3, Leslie A. Mitchell1, Brendan R. Camellato1, Jiaming Lin1, Nicole Easo1, Zhuwei Xu1, Junbiao Dai2, Yasunori Aizawa3, Jef D. Boeke1

1Institute for Systems Genetics, NYU Langone Health, New York, New York, United States 2Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, P.R. China 3Tokyo Institute of Technology, Tokyo, Japan

Emergence of next-generation sequencing (NGS) and advanced DNA synthesis technology enables us to explore the genetic blueprint of life in more depth. From the synthetic biology perspective, “build-to-understand” represents one way to probe the nature of genomes. However, the limited capacity for building large synthetic DNA limits our understanding of genomes across species, especially for eukaryotes. In this study, we report the total synthesis in yeast of the largest eukaryotic chromosome to date. Following the Sc2.0 design principles, we not only simplify the chromosome by deleting introns, transposons, and sub-telomeres, but also endow the chromosome with new biological functions by adding 479 loxPsym sites downstream of nonessential genes and by swapping all TAG stop codons to TAA. We successively used marker SwAP-In and Meiotic Recombination-mediated Assembly strategies to assemble the synthetic chromosome. During chromosome building, the parallelized synthesis workflow allowed us to identify causative bugs and fix them rapidly. In summary: repeatsmash-recoded NOP1 and PCF11 impaired protein expression (potentially due to rare codon incorporations), a single substitution in the promoter region of MAK21 caused severe growth defects, and a PCRTag (a watermarking system) in RTT103 impaired fitness in a “synthetic sick” manner. The knowledge acquired during debugging gives us a better understanding of genome design principles. The completion of synIV provides an excellent platform for answering fundamental questions in evolutionary biology, chromosome stability and chromosome structure in the future.

P28. Detecting evidence of genetic engineering

Yuchen Ge, Jitong Cai, Joel S. Bader

Johns Hopkins University, Baltimore, Maryland, United States

Detecting evidence of genetic engineering is important for biosecurity, provenance, and intellectual property rights. The need for monitoring and detection is growing with the contemplated release of gene drive systems. We describe results of a computational system designed to detect engineering from DNA sequencing of biological samples, including automated identification of host strains, detection of foreign gene content, and detection of watermarks. Our results demonstrate near-perfect identification of foreign gene content in blinded samples, but less ability to detect the more subtle engineering associated with watermarks that blend in with natural variation. We describe plans for future improvements.
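To illustrate why synonymous watermarks are harder to spot than foreign genes, the sketch below compares a sample coding sequence with a reference, codon by codon, and reports the codons that differ while encoding the same amino acid. It is a toy illustration only, not the authors' system: the function name, the example sequences, and the deliberately partial codon table are all assumptions made for this example.

```python
# Toy sketch (not the authors' pipeline): list codons that differ from a
# reference yet encode the same amino acid, the pattern a synonymous
# "watermark" would leave. The codon table is intentionally partial;
# codons missing from it are ignored.
CODON_TABLE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "ATG": "M", "AAA": "K", "AAG": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def synonymous_differences(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (codon_index, reference_codon, sample_codon) for synonymous changes."""
    if len(reference) != len(sample) or len(reference) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    hits = []
    for i in range(0, len(reference), 3):
        ref_codon, smp_codon = reference[i:i + 3], sample[i:i + 3]
        if ref_codon == smp_codon:
            continue
        ref_aa = CODON_TABLE.get(ref_codon)
        smp_aa = CODON_TABLE.get(smp_codon)
        # Same amino acid, different codon: a candidate watermark position.
        if ref_aa is not None and ref_aa == smp_aa:
            hits.append((i // 3, ref_codon, smp_codon))
    return hits


if __name__ == "__main__":
    reference = "ATGGCTTCTCTTAAATAA"  # M A S L K stop
    sample = "ATGGCCAGCTTGAAATAA"     # same protein, three recoded codons
    print(synonymous_differences(reference, sample))
```

In real data, natural synonymous polymorphism produces the same kind of hit, which is consistent with the abstract's remark that watermarks blend in with natural variation; a practical detector would need a statistical model of expected variation on top of a comparison like this one.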

Abstracts Selected for Short Talks During the Meeting (T1 – T21):

T1. Measuring transcriptomic diversity induced by genome SCRaMbLEing with direct RNA sequencing

Aaron N. Brooks, Amanda L. Hughes, Sandra Clauder-Münster, Andreas Johansson, Lars M. Steinmetz

European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany

Eukaryotic genomes are organized non-randomly. We are studying how abrupt changes to genome organization shape the transcriptional landscape using Sc2.0, a yeast strain composed entirely of designer, synthetic DNA. Synthetic chromosome rearrangement and modification by loxP-mediated evolution (SCRaMbLE) is a key design feature of the synthetic yeast genome project. SCRaMbLE generates stochastic and complex rearrangements at engineered recombinase sites throughout the genome on demand using Cre-mediated recombination. We have characterized extensive transcriptional diversity in >60 independent SCRaMbLE strains containing a circular synthetic chromosome, synIXR. These transcriptional alterations appear to be isolated to the SCRaMbLEd chromosome and dependent on rearrangements specific to each SCRaMbLE strain. Using direct RNA sequencing, we have observed novel transcriptional events in SCRaMbLEd genomes, including alterations to transcript start and termination sites. Our results suggest an inextricable link between physical organization of the genome and transcript isoform expression.

T2. Detecting evidence of genetic engineering

Jitong Cai, Yuchen Peter Ge, Joel S. Bader

Johns Hopkins University, Baltimore, Maryland, United States

Detecting evidence of genetic engineering is important for organism provenance, intellectual property, and biosecurity. While it is straightforward to describe the main types of engineering (foreign gene content in host chromosomes, presence of or vectors, and watermarking of host genes by synonymous recoding), detecting these features from whole-genome, short-read sequencing remains challenging.
We describe an analysis strategy combining methods originally developed for and genome application and present its application to known synthetic genomes and to unknown test strains.

T3. Transcription factor binding sites are more common than you think

Carl G. de Boer, Eeshit Dhaval Vaishnav, Ronen Sadeh, Esteban Luis Abeyta, Nir Friedman, Aviv Regev

Broad Institute, Cambridge, Massachusetts, United States

Deciphering cis-regulation, the code by which transcription factors (TFs) interpret regulatory DNA sequence to control gene expression levels, is a long-standing challenge and is central to our ability to interpret regulatory DNA. Previous studies of native or engineered sequences have remained too limited in scale to learn the complex rules of cis-regulation. Here, we use random sequences as an alternative, allowing us to measure the expression output of over 100 million synthetic yeast promoters. We create new interpretable computational models to make use of these data, allowing us to identify how TFs interpret the DNA. This revealed the importance of abundant, weak TF binding sites. Random sequences yield a broad range of reproducible expression levels, indicating that the fortuitous binding sites in random DNA are functional. Models trained on these data predict over 94% of the expression driven from independent test data and nearly 89% from yeast promoter sequence, on the basis of TF binding sites. We validated predictions of TF function in numerous ways, giving us confidence in this model. Finally, because we explicitly model each TF, we can reconstruct gene regulatory networks, and find that they are unexpectedly interconnected. Such massive-throughput regulatory assays of random DNA produce the ideal training data for learning complex models of cis-regulatory logic because they contain abundant functional TF binding sites of diverse affinities that are randomly assorted, and random sequences can be synthesized at nearly limitless scale.

T4. Constraint Satisfaction in Synthetic Biology

Sara Kalvala

University of Warwick, Coventry, United Kingdom

Designing synthetic genetic circuits that actually work is far too complex a task to have a straightforward algorithmic solution: what is needed is an approach that navigates through the many, sometimes conflicting, factors that inform the feasibility of a design.
It is not realistic to hardwire such factors into CAD tools, as many of these factors are the result of a fluid state of knowledge, and others are very context-specific. Computer scientists have developed a sophisticated framework to capture this situation, called Constraint Satisfaction. Here, mathematical functions are associated with each rule or heuristic, and the system navigates the design space to optimize the outcome using these functions. Rules can be added, removed, changed or associated with new calculations easily and in a modular fashion. I will present several experiments in using Constraint Satisfaction to capture different aspects of biological design, from sequence rules that disallow certain DNA sequences or optimize codon use, to rules that capture desired dynamic biochemical properties.
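The idea of attaching a function to each rule can be sketched in a few lines. In the example below, each rule is a plain Python function that scores a candidate DNA design; a hard rule returns negative infinity when violated, and a soft rule returns a preference score. Everything here is an illustrative assumption rather than the speaker's system: the codon table and its weights are made up, the function names are hypothetical, and a brute-force enumeration stands in for a real constraint solver.

```python
from itertools import product

# Hypothetical synonymous-codon table with made-up integer preference weights.
CODONS = {
    "M": {"ATG": 10},
    "G": {"GGA": 9, "GGT": 6},
    "S": {"TCC": 8, "TCT": 7, "AGC": 5},
    "K": {"AAG": 9, "AAA": 6},
}


def codon_preference(dna, choices):
    """Soft rule: sum of the preference weights of the chosen codons."""
    return sum(CODONS[aa][codon] for aa, codon in choices)


def no_forbidden_motif(motif):
    """Hard rule factory: designs containing `motif` score negative infinity."""
    def rule(dna, choices):
        return float("-inf") if motif in dna else 0
    return rule


def design(protein, rules):
    """Enumerate all codon choices and return the best (dna, score) pair.

    Brute force is only feasible for very short proteins; it is used here
    to keep the sketch self-contained.
    """
    best_dna, best_score = None, float("-inf")
    options = [[(aa, codon) for codon in CODONS[aa]] for aa in protein]
    for choices in product(*options):
        dna = "".join(codon for _, codon in choices)
        score = sum(rule(dna, choices) for rule in rules)
        if score > best_score:
            best_dna, best_score = dna, score
    return best_dna, best_score


if __name__ == "__main__":
    # With codon preference alone, the best design contains GGATCC (a BamHI site).
    print(design("MGSK", [codon_preference]))
    # Adding one more rule removes the site without touching the other rule.
    print(design("MGSK", [codon_preference, no_forbidden_motif("GGATCC")]))
```

Because the rules are just entries in a list, adding or removing one does not require changing the search routine, which is the modularity the abstract describes.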

T5. A approach to dissecting an age-related macular degeneration-associated haplotype Jon M Laurent1, Ran Brosh1,2, Sergei German1, Matthew T Maurano1,2, Jef D Boeke1

1Institute for Systems Genetics, Department of Biochemistry and Molecular Pathology, NYU Langone Health, New York, New York, United States 2Department of Pathology, NYU Langone Health, New York, New York, United States Age-related Macular Degeneration (AMD) is a leading cause of blindness in the developed world, especially in aging populations. Recent studies have demonstrated strong associations between AMD risk and heritable sequence variation at multiple genomic loci, including a highly significant risk association at 10q26. The 10q26 risk region contains two genes, HTRA1 and ARMS2, both of which have been separately implicated as targets for sites of regulatory variation in the region, and therefore as potential contributors to AMD pathology. To date, no studies have successfully pinpointed which of the variant sites are functional in AMD, nor definitively identified which genes are targets of such regulatory variation. As the retina is one of the few tissues where gene therapy is currently practical, identification of specific functional alleles will enable design of effective gene therapies for AMD. In order to efficiently decipher which sites are functional in AMD phenotypes, we describe a general framework for combinatorial assembly of large ‘synthetic haplotypes’ along with their delivery to relevant cell types for downstream functional analysis. We demonstrate highly efficient parallelized assembly of a library of synthetic haplotypes of the HTRA1/ARMS2 risk region, representing dozens of combinations of risk alleles, delivery and analysis of which will identify functional sites and their effects. We anticipate that the methodology described here is highly generalizable towards the difficult problem of identifying truly functional variants from those discovered via GWAS or other genetic association studies. T6. 
Mapping Genotype-Phenotype Relationships with Rapid Genome-Scale Engineering

Bryan Leland, Eric Abbate, Krishna Yerramsetty, Steve Federowicz, Katherine Krouse, Michael Clay, Dan Held, Richard Fox, Nandini Krishnamurthy

Inscripta, Inc., Boulder, Colorado, USA

To fuel the pace of discovery through CRISPR editing, the research community needs to deliver genome-scale edits at the gene, pathway, and genome level. Unfortunately, current editing techniques suffer from limitations in scalability, efficiency, diversity of edit types, and accessibility. The Onyx platform developed by Inscripta, Inc., dramatically increases the scale of Digital Genome Engineering, making it possible to easily perform CRISPR-based forward engineering experiments through high-throughput diversity generation in E. coli and yeast. The platform simplifies the complex editing workflow for biologists by offering an end-to-end solution from design to diversity generation, including software, reagents, benchtop instrument and analytics. Using the Onyx platform, we have performed high-throughput diversity generation with up to 200,000 edits across the genome in both coding and non-coding regions to improve lysine production in E. coli and 130,000 edits in yeast to improve tyrosine production. Inscripta’s technology has also fueled large-scale discovery of gene-phenotype relationships in antibiotic resistance and abiotic stress responses. Ultimately, this approach will give scientists a more comprehensive view of the phenotypes they study by allowing them to test far more genomic changes and to rapidly select the most effective ones. This should have far-reaching benefits for agricultural and bio-industrial science, healthcare, and alternative energy.

T7. Improved eMAGE Platform for Multiplex Gene Editing in DNA Repair Proficient Cells

Zhuobin Liang, Eli Metzner, and Farren J. Isaacs

Department of Molecular, Cellular, & Developmental Biology, Systems Biology Institute, Yale University, New Haven, Connecticut, United States

We recently developed a new multiplex genome engineering technology (eMAGE) in the eukaryote Saccharomyces cerevisiae based on annealing synthetic oligonucleotides (ssODNs) at the lagging strand of DNA replication. The underlying mechanism of eMAGE is independent of Rad51-directed homologous recombination and avoids the creation of DNA breaks using engineered nucleases, enabling precise chromosome modifications at single base-pair resolution. To further enhance eMAGE efficiency, we first modulated the co-selection stringency by optimizing the molecular ratio of ssODNs targeting a URA3 auxotrophic marker and a red fluorescent reporter, resulting in a ~2-fold gene editing enhancement with 50% allelic replacement frequency (ARF). We then coupled the use of two pre-selection markers (URA3 and ADE2) flanking the eMAGE locus to augment multiplex gene editing performance to 50-70% ARF across a 5-kb genomic region.
Because ablation of DNA mismatch repair (MMR) pathways can lead to a ~100-fold gene editing enhancement, deletion of MMR genes (e.g. MSH2, MSH6, MLH1) is often required for robust eMAGE but comes at the cost of undesirable background mutations and requires laborious modifications of the genome. To circumvent this problem and enable eMAGE applications in DNA repair proficient cells, we screened a panel of 9 dominant negative MMR (MMR-DN) mutants expressed from an episomal plasmid under titratable induction by β-estradiol and identified highly active mutants whose inducible expression can transiently boost eMAGE for subsequent optimizations. Together, we provide an improved eMAGE platform for more robust multiplex gene editing with reduced off-target mutations.

T8. A engineered protein-phosphorylation toggle network and implications for endogenous network discovery

Deepak Mishra, Tristan Bepler, Bonnie Berger, Jim Broach, and Ron Weiss

Synthetic Biology Center and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States

Synthetic regulatory circuits comprising fast, reversible interactions may allow engineering of new cellular behaviors that are not possible with slower regulation. Here we demonstrate a mechanism to create protein phosphorylation devices that implement OR and NOT logical gates based on phosphorylation-required interaction and mediated effect (PRIME). We used reversible protein-protein interactions exclusively to construct in Saccharomyces cerevisiae a fast, ultrasensitive, bistable toggle switch network comprising fourteen proteins. Motivated by our synthetic design, we computationally searched endogenous yeast protein interactions for network topologies not previously considered but similar to our experimental network. We identified and experimentally verified six formerly unknown endogenous toggle networks with similar regulatory topologies. Creation of synthetic protein networks will allow sophisticated regulation of cellular processes, shed light on how existing natural motifs operate, and aid discovery of currently unknown endogenous functions.

T9. Deep Learning for synthetic genomes

Etienne Routhier, Edgar Pierre, Julien Mozziconacci1 1Structure and Genome Instability, Muséum National d'Histoire Naturelle, Paris, France

When synthesizing a genome for a designed purpose, there is always the possibility of changing unwanted features such as nucleosome positioning, histone or DNA modifications and gene expression. Controls are often performed a posteriori, but it would be beneficial to get that information at the sequence design step. Deep neural networks trained on wild-type datasets can be used for this purpose. I will first present an overview of the application of deep learning in genomics and report our results on determining the effect of single mutations over an entire genome. I will then explain how these tools, coupled with methods from statistical physics, can be used to design genome sequences with specific characteristics. I will take as an example our work on budding yeast nucleosome positioning.

T10. Synthetic Hox clusters to understand the regulation of chromatin domains and gene expression

Sudarshan Pinglay1, Milicia Bulajić2, Leslie Mitchell1, Sergei German1, Matthew T. Maurano1, Liam J. Holt1, Esteban O. Mazzoni2 and Jef D. Boeke1

1Institute for Systems Genetics, NYU Langone Health, New York, New York, United States 2NYU Department of Biology, New York, New York, United States

The logic behind regulation of genes by distal enhancer elements has remained largely vague, especially in complex mammalian genomes. Each gene is often controlled by many enhancers, some of which can be located over 100 kb away. A synergistic understanding of how gene expression is controlled will require the ability to manipulate these elements in concert.

Despite the emergence of CRISPR/Cas9-based genome editing, there is a gap in our ability to make multiple, precise edits on the same haplotype or generate large-scale complex rearrangements to interrogate these multiway functional connections between regulatory elements. Bottom-up synthesis of large DNA segments (>100 kb) allows for the arbitrary modification, removal and inclusion of elements on the scale that is required to probe gene regulation. We have coupled the construction of large sections of mammalian genomes with a strategy to deliver these highly variant, locus-scale constructs to defined locations in mammalian cells. We have applied this technology to dissect a classical mammalian gene regulatory model: the HoxA cluster. A synergistic model of Hox gene regulation remains elusive despite the implication of many players including transcription factors, enhancers and dynamic chromatin domains. Building synthetic Hox clusters from the ground up allows us to identify networks of elements that are necessary in the endogenous context and sufficient in the ectopic context to establish characteristic dynamic chromatin domains and gene expression patterns. This is a proof-of-principle for ‘Synthetic Regulatory Genomics’, a fully generalizable strategy to systematically dissect regulation of gene expression at other genomic loci.

T11. Refactoring yeast mating for synthetic morphology

William Shaw, Tom Ellis

Imperial College London, London, United Kingdom

Synthetic morphology is a growing sub-field of synthetic biology that aims to produce self-organising, pattern-forming cells by combining modular genetic parts. The mating response in the yeast Saccharomyces cerevisiae represents the ideal system for developing multicellularity as many of the mechanisms that are required, including cell-to-cell communication and adhesion, are present and well understood.
Building on our recent work, where we used genome engineering and synthetic tools to refactor yeast signalling for rationally tuning the input-output response, we have now extended this approach to control cell-to-cell signalling and adhesion. These highly characterised biological modules now provide a platform to begin building minimal multicellular systems from the bottom up. Taking this approach, we hope to gain insights into natural processes such as morphogenesis and the evolution of multicellularity, as well as enabling the development of downstream applications such as complex tissues and novel biomaterials.

T12. Characterization of a New Saccharomyces cerevisiae Isolated from Hibiscus Flower and Its Mutant with l-Leucine Accumulation for Awamori Brewing

Hiroshi Takagi1, Takayuki Abe2, Yoichi Toyokawa1, Masatoshi Tsukahara2

1Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan 2BioJet Co., Ltd., Okinawa, Japan

To produce awamori, a traditional distilled alcoholic beverage with unique flavors made from steamed rice in Okinawa, Japan, it is necessary to optimize yeast strains for a diversity of tastes and flavors with established qualities. Here we isolated a novel strain of Saccharomyces cerevisiae, HC02-5-2, from hibiscus flowers in Okinawa. Whole-genome information revealed that strain HC02-5-2 is close to wine yeast strains in a phylogenetic tree. This strain also exhibited high productivity of 4-vinyl guaiacol (4-VG), a precursor of vanillin, which is known as a key flavor of aged awamori. Although the conventional awamori yeast strain 101-18, which possesses the FDC1 pseudogene, does not produce 4-VG, strain HC02-5-2, which has intact PAD1 and FDC1 genes, has an advantage for use in a novel kind of awamori. To increase the content of initial scented fruity flavors, such as isoamyl alcohol and isoamyl acetate, we attempted to breed strain HC02-5-2 targeting the L-leucine synthetic pathway by conventional mutagenesis. In mutant strain T25 with L-leucine accumulation, we found a heteroallelic mutation in the LEU4 gene encoding the Gly516Ser variant of α-isopropylmalate synthase (IPMS). IPMS activity of the Gly516Ser variant was less sensitive to feedback inhibition by L-leucine, leading to intracellular L-leucine accumulation. In a laboratory-scale test, awamori brewed with strain T25 showed higher concentrations of isoamyl alcohol and isoamyl acetate than that brewed with strain HC02-5-2. Such a combinatorial approach to yeast isolation, with whole-genome analysis and metabolism-focused breeding, has the potential to vary the quality of alcoholic beverages.

T13. Engineering Prototrophy In Mammalian Cells

Julie Trolle1, Ross McBee2, Liyuan Liu2, Andrew Kaufman2, Xinyi Guo1, Sudarshan Pinglay1, Sergei German1, Harris H. Wang2#, Jef D. Boeke1#

1Institute for Systems Genetics, NYU Langone Health, New York, New York, United States 2Department of Systems Biology, Columbia University, New York, New York, United States *co-first authors #co-senior authors

lack the biosynthetic machinery to produce 9 of the 20 proteinogenic amino acids. Phylogenetic studies indicate that the last eukaryotic common ancestor was able to synthesize all 20 amino acids and that the loss of ability to synthesize subsets of these pathways has been a result of multiple multi-gene deletion events throughout evolution. The reasons for these polyphyletic deletion events are not well understood. In our work, we take non-native amino acid biosynthetic pathways sourced from bacterial and fungal species, codon optimize their coding sequences for use in mammalian cells and deliver them to Chinese Hamster Ovary (CHO) cells. Here, we successfully demonstrate the yeast to assembly and delivery of constructs, which encode the missing steps towards the mammalian biosynthesis of isoleucine and valine. We show that CHO cells engineered to express E. coli-derived isoleucine/valine biosynthetic genes are able to proliferate in media lacking valine, and that they display a survival advantage in media lacking isoleucine provided that pathway intermediate 2-oxobutanoate is spiked into the culture media. This is the first example of essential amino acid biosynthesis in cells.

T14. Genome Fission Ignited Fusion
Kaihang Wang, Daniel de la Torre, Wesley E. Robertson, Jason W. Chin
California Institute of Technology, Pasadena, California, United States

The design and creation of synthetic genomes provide a powerful approach to understanding and engineering biology. However, this approach is often limited by the paucity of methods for precise genome manipulation. Here, we demonstrate the programmed fission of the genome into diverse pairs of synthetic chromosomes and the programmed fusion of synthetic chromosomes to generate genomes with user-defined inversions and translocations. We further combine genome fission, chromosome transplant, and chromosome fusion to assemble genomic regions from different strains into a single genome. Thus, we program the scarless assembly of new genomes with precision, a key step in the convergent synthesis of genomes from diverse progenitors. This work provides a set of precise, rapid, large-scale (megabase) genome-engineering operations for creating diverse synthetic genomes.

T15. Total synthesis of a 1.4 Mb yeast chromosome, synIV
Weimin Zhang1, Hitoyoshi Yamashita3, Michael J. Shen1, Luciana Lazar-Stefanita1, Hikaru Kurasawa3, Leslie A. Mitchell1, Brendan R. Camellato1, Jiaming Lin1, Nicole Easo1, Zhuwei Xu1, Junbiao Dai2, Yasunori Aizawa3, Jef D. Boeke1

1Institute for Systems Genetics, NYU Langone Health, New York, New York, United States
2Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, P.R. China
3Tokyo Institute of Technology, Tokyo, Japan

The emergence of next-generation sequencing (NGS) and advanced DNA synthesis technology enables us to explore the genetic blueprint of life in more depth. From the synthetic biology perspective, “build-to-understand” represents one way to probe the nature of genomes. However, the limited capacity to build large synthetic DNA limits our understanding of genomes across species, especially for eukaryotes. In this study, we report the total synthesis in yeast of the largest eukaryotic chromosome to date. Following the Sc2.0 design principles, we not only simplify the chromosome by deleting introns, transposons, and sub-telomeres, but also endow the chromosome with new biological functions by adding 479 loxPsym sites downstream of nonessential genes and by swapping all TAG stop codons to TAA. We successively used marker SwAP-In and Meiotic Recombination-mediated Assembly strategies to assemble the synthetic chromosome. During chromosome building, the parallelized synthesis workflow allowed us to identify causative bugs and fix them rapidly. In summary: repeatsmash-recoded NOP1 and PCF11 impaired protein expression (potentially due to rare codon incorporations), a single base pair substitution in the promoter region of MAK21 caused severe growth defects, and a PCRTag (a watermarking system) in RTT103 impaired fitness in a “synthetic sick” manner. The knowledge acquired during debugging gives us a better understanding of genome design principles. The completion of synIV provides an excellent platform for answering fundamental questions in evolutionary biology, chromosome stability and chromosome structure in the future.

T16. Yeast Synthetic Chromosome Consolidation - Building the ‘Complete’ Sc2.0 Strain with Designer Genome
Yu Zhao, Leslie Mitchell, Jef D. Boeke
Institute for Systems Genetics, NYU Langone Health, New York, New York, United States

Saccharomyces cerevisiae is a unicellular eukaryote widely used in basic research, bioengineering and industrial fermentation. In 1996, S. cerevisiae became the first eukaryote whose whole genome was sequenced and released, 7 years prior to the completion of . In 2006, the synthetic S. cerevisiae (Sc2.0) project was launched, in which yeast chromosomes were redesigned. The Sc2.0 consortium plans to have all chromosomes assembled in the near future. One issue with the current approach is that synthetic chromosomes are built individually, each in a strain whose genome is still mostly wild-type. How to consolidate these chromosomes into one ‘complete’ Sc2.0 strain remains a challenge, and whether a eukaryotic cell can survive well with its nuclear genome completely redesigned and synthesized remains unknown. To overcome these challenges, we describe a strategy to incorporate all synthetic chromosomes into one strain using mating and GAL promoter-mediated chromosome destabilization. In a heterozygous diploid strain carrying a synthetic chromosome and its wild-type counterpart, the wild-type chromosome was destabilized. After sporulation and screening of the haploid spores, we were able to obtain haploid strains with more than one synthetic chromosome. Using this strategy, we have obtained a yeast strain with 6.5 synthetic chromosomes, accounting for around 40% of the yeast genome. We expect to build the final Sc2.0 strain with all 16 synthetic chromosomes, as the first eukaryote with its genome completely redesigned and synthesized. The new strain containing the designer genome will serve as a platform for systematic eukaryotic genome study and industrial applications.
