Spell Checking Nature: Development of a CRISPR-Mediated Gene Editing Approach for the Treatment of Pathogenic Duplications

by

Daria Wojtal

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy, Department of Molecular Genetics, University of Toronto

© Copyright by Daria Wojtal 2020


Spell Checking Nature: Development of a CRISPR-mediated Gene Editing Approach for the Treatment of Pathogenic Duplications

Daria Wojtal

Doctor of Philosophy

Department of Molecular Genetics University of Toronto

2020

Abstract

Duchenne muscular dystrophy (DMD) is a neuromuscular disorder that leads to progressive muscle deterioration, loss of ambulation, and respiratory complications. It is caused by genetic mutations that result in the absence of dystrophin protein expression needed for muscle function.

Despite significant advances in our understanding of the pathogenesis of DMD, no curative treatment has been identified to date and the disorder has a life-limiting disease trajectory.

Recently, we have pioneered an approach to successfully remove large duplications in patient cells.

We first tested this approach in vitro by removing a multi-exon (18-30) duplication of 139 kb in the DMD gene using Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated Nuclease (Cas9) with a single guide. To test our treatment approach in vivo, I first generated a mouse model harboring a multi-exon duplication of 136.8 kb in Dmd using CRISPR/Cas9. This first multi-exon duplication model of DMD specifically mimics a patient duplication of exons 18-30. Molecular and functional characterization of this model reveals dystrophin deficiency with characteristic markers of dystrophic muscle. Furthermore, using our previously described CRISPR/Cas9 single guide strategy, we have for the first time treated a large genomic duplication in vivo and shown successful removal of the duplicated fragment, leading to restoration of full-length dystrophin in skeletal and cardiac muscles.

Additionally, histopathological analysis shows that treated mice have fewer indications of dystrophy, including fewer centrally localized nuclei, as well as significantly improved muscle function. Our findings establish the far-reaching therapeutic utility of CRISPR/Cas9, which can be tailored to target numerous inherited disorders caused by duplications.


Acknowledgments

I would like to thank my supervisor, Dr. Ronald Cohn, for the tremendous opportunity to work in his lab. Your well-rounded mentorship, both in my academic work and beyond the lab bench, challenged me to think globally and meet the individuals and their families driving this work. Your strong, fair and compassionate leadership inspires me every day.

Also, I would like to thank Dr. Zhenya Ivakine for his mentorship, scientific expertise, and critical analysis of my work. Your positive-thinking mantra and unique lens on every failed experiment have helped me get over the biggest, seemingly impossible hurdles in this project. Thank you to my excellent supervisory committee members, Drs. Lucy Osborne and Sean Egan, who have challenged me along the way to strengthen my research and critical thinking. Furthermore, this work could not have been completed without the support of past and present members of the Cohn Lab, including Dr. Dwi Kemaladewi, Dr. Zeenat Malam, Ella Hyatt, Dr. Zahra Baghestani, Kyle Lindsay, Matthew Rok and my summer and thesis trainees over the years: Amanda Chiodo, Vanessa Gomes, Gina Desatnik, and Georgia Besant. A special thank you to Eleonora Maino, Sonia Evagelou, and Aiman Farheen, who contributed to finalizing my project and who will now be moving forward with the next steps of the research. I would also like to thank Dr. Juliet Daniel and Dr. Doug Boreham, my undergraduate research supervisors, and my lab mates from McMaster University, who gave me my first opportunities to work in their research laboratories. I especially thank Dr. Victor Kreft for his continued mentorship. Without these early hands-on research opportunities and training, I certainly would not be in the position I am today… and most definitely would not be able to pipette and open an epi tube in one hand!

Also, I would like to wholeheartedly thank all of the individuals personally impacted by neuromuscular disorders whom I have met over the years, the individuals who have donated tissue samples, and those who have supported research funding and advocacy. I also thank the organizations that have funded this work, including the SickKids Research Training Center Restracomp, Duchenne Children’s Fund, Duchenne UK, McArther Fund, Jesse’s Journey, and Solid Biosciences, as well as my new team at Muscular Dystrophy Canada, for their support during my last sprint towards the finish line. Last, but certainly not least, I would like to thank my family for their endless support of my education.


Table of Contents

Acknowledgments
Table of Contents
List of Abbreviations
List of Tables
List of Figures

Introduction
1.1 Duchenne Muscular Dystrophy
1.1.1 Duchenne Muscular Dystrophy
1.1.2 DMD Mutation Spectrum
1.2 Current Therapeutic Approaches for DMD
1.2.1 Current Interventions
1.2.2 Approved Medications for DMD
1.2.3 Gene Therapies In Clinical trials
1.2.4 Functional Benefits of Low Level Dystrophin Expression
1.2.5 Current Dystrophin Deficient Animal Models
1.3 Structural Genomic Variants and Human Disease
1.3.1 Copy Number Variants
1.3.2 Complex Genomic Rearrangements
1.3.3 Mechanism Underlying Duplications and Complex Rearrangements
1.4 CRISPR/Cas9 System
1.4.1 Overview CRISPR/Cas9
1.4.2 Mechanism of Action: Gene Editing
1.4.3 CRISPR-Cas9 in Clinical Trials
1.4.4 Use of CRISPR/Cas9 in Animal Model Generation for Large Structural Rearrangements
1.4.5 Adeno-Associated Virus as Delivery System
1.5 Thesis Objectives

Materials and Methods
2.1 Duplication Mapping
2.1.1 Whole Genome Sequencing
2.1.2 Variant Calling and Annotation of Whole Genome Sequencing DMD Dup18-30 Patient
2.1.3 Variant Calling and Annotation for Dmd Dup18-30 inv1A10 Mouse Model
2.2 Human DMD Duplication Treatment in vitro
2.2.1 CRISPR/Cas9 and sgRNA Design and Cloning
2.2.2 sgRNA Evaluation
2.2.3 In vitro Transfection
2.2.4 Western blot From Cells
2.2.5 Off-target Editing Analysis
2.3 Mouse Model Generation
2.3.1 Animal Husbandry
2.3.2 sgRNA Design and Cloning for Mouse Model Generation
2.3.3 Dup18-30 Dmd-1A10 Mouse Model Generation
2.3.4 Dup18-30 Dmd-inv1A10 Mouse Model Generation
2.3.5 sgRNA Design and Cloning for Mouse Model Generation and Treatment
2.3.6 RNA isolation and RT-PCR
2.4 AAV9 Viral Vector Production
2.5 Molecular Assessment
2.5.1 Collection of Animal Tissues
2.5.2 Assessment of Muscle Pathology
2.5.3 Immunofluorescence
2.5.4 Western blot
2.6 Functional Analysis
2.6.1 Grip Strength
2.6.2 Open field test
2.7 Statistical Analysis

Exploring the Versatility of the CRISPR/Cas9 System
3.1 Overview
3.2 Results
3.2.1 Duplication Mapping and Characterization
3.2.2 CRISPR-Based Genetic Removal of Duplicated Fragment in Patient Cells
3.2.3 Off-target Analysis
3.3 Discussion and Conclusion
3.4 Acknowledgements

Generation of First Multi-Exon Duplication Mouse Model of Duchenne Muscular Dystrophy
4.1 Overview
4.2 Results
4.2.1 Genome Engineering to Generate 137 Kb Genomic Duplication in Murine Model
4.2.2 Dmd Dup18-30 1A10 and 3A7 Mice have No Dystrophin Expression and Show Characteristic Dystrophic Pathophysiology
4.2.3 Molecular Characterization of Dmd Dup18-30 1A10 and 3A7 Mouse mRNA Suggests Occurrence of Complex Rearrangement in Dmd gene
4.2.4 WGS of DMD Dup18-30 1A10 and 3A7 Mouse DNA
4.2.5 Generation of Second Version of DMD Dup18-30-inv1A10 Mouse
4.2.6 Second Version of DMD Dup18-30-reinv1A10 Mouse Shows Full Duplication With No Additional Dmd mRNA Altering Rearrangements
4.2.7 Dmd Dup 18-30 inv1A10 Mouse Model is Dystrophin Deficient and Shows Dystrophic Muscle Pathophysiology as well as Impaired Skeletal Muscle Function
4.3 Discussion and Conclusions
4.4 Acknowledgements

Single sgRNA CRISPR-mediated Genome editing Restores Full-Length Dystrophin and Improves Muscle Function in first 137 kb Duplication Mouse Model of Duchenne Muscular Dystrophy
5.1 Overview
5.2 Results
5.2.1 Design of sgRNA for Duplication Removal
5.2.2 Experimental Design of CRISPR-Mediated Treatment of Dmd Dup18-30 (inv1A10) Mouse Model with Single sgRNA Approach
5.2.3 CRISPR single sgRNA Treatment Restores Full-Length Dystrophin Expression and Improved Pathophysiology in Skeletal and Cardiac Muscles
5.2.4 CRISPR Treated Mice Show Functional Improvement
5.3 Discussion and Conclusions
5.4 Acknowledgements

Future Directions
6.1 Overview
6.2 Improving CRISPR efficacy through Combinatorial Therapy: Gene Editing and Gene Modulation
6.2.1 Simultaneous editing and upregulation of newly synthesized gene
6.2.2 Combinatorial Gene Modulation of Disease Modifiers
6.3 Improving CRISPR Delivery
6.4 Increasing specificity and efficiency of editing
6.5 Exploring Applicability to Other Duplications in DMD
6.6 Conclusion

References

List of Abbreviations

AAV9 Adeno-Associated Virus, Serotype 9

BAM Binary form of sequence Alignment Map

bGHpA Bovine Growth Hormone Polyadenylation

BMD Becker Muscular Dystrophy

bp base pair

BSA Bovine Serum Albumin

Cas9 CRISPR-Associated Protein 9

cDNA complementary Deoxyribonucleic Acid

Chr Chromosome

CGR Complex Genomic Rearrangement

CMV Cytomegalovirus

CNV Copy Number Variant

CNX Calnexin

COSMID CRISPR Off-Target Sites with Mismatches, Insertions, and Deletions

CRISPR Clustered Regularly Interspaced Short Palindromic Repeats

Cre Cre recombinase

cRNA complementary Ribonucleic Acid

crRNA CRISPR RNA

dCas9 dead Cas9

DEL Deletion

DC Dystroglycan

DGC Dystrophin Glycoprotein Complex

DMD Duchenne Muscular Dystrophy

DMEM Dulbecco’s modified Eagle’s medium

DNA Deoxyribonucleic Acid

DSB Double Stranded Break

DUP Duplication

ECL Enhanced Chemiluminescence

ESC Embryonic Stem Cell

EX Exon

F Forward

FBS Fetal Bovine Serum

FDA Food and Drug Administration

GFP Green Fluorescent Protein

GLT Germline Transmission

GRMD Golden Retriever Muscular Dystrophy

GS Grip Strength

H&E Haematoxylin and Eosin

HA Human influenza hemagglutinin


HDR Homology Directed Repair

HRP Horseradish peroxidase

I Intron

ICE Inference of CRISPR Edits

Indel Insertion or deletion

IF Immunofluorescence

IGV Integrative Genomics Viewer

INV Inversion

ITR Inverted Terminal Repeat

Lenti Lentivirus

LINE Long Interspersed Element

loxP locus of X-over P1

Mb Megabase

MECP2 Methyl CpG binding protein 2

mESC murine Embryonic Stem Cell

MLPA Multiplex Ligation-dependent Probe Amplification

mRNA messenger RNA

N any Nucleotide

n number

NAHR Non-Allelic Homologous Recombination


NGS Next Generation Sequencing

NHEJ Non-Homologous End Joining

NLS nuclear localization signal

n.s. not significant

kb kilobase

kDa kiloDalton

R Reverse

RNA Ribonucleic Acid

P1/2 Postnatal Day 1/2

PAM Protospacer Adjacent Motif

PCR Polymerase Chain Reaction

SaCas9 Staphylococcus aureus Cas9

s.d. standard deviation

SEM Standard Error of Mean

SG Sarcoglycan

sgRNA single guide RNA

SpCas9 Streptococcus pyogenes Cas9

ssODN single stranded oligonucleotide

SV Structural Variant

TAL/TAR Tibialis Anterior Left/Right

TALENs Transcription Activator-Like Effector Nucleases

TBST Tris-Buffered Saline with Tween

TCAG The Centre for Applied Genomics

TCP The Centre for Phenogenomics

tracrRNA trans-activating CRISPR RNA

TSS Transcriptional Start Site

WGS Whole Genome Sequencing

VGC Viral Genome Copies

VP16 Herpes Simplex Viral Protein 16

WB Western Blot

WT Wild Type

ZFNs Zinc Finger Nucleases


List of Tables

TABLE 1 | SUMMARY OF REPORTED LARGE GENOMIC EDITING EFFICIENCIES IN MURINE MODEL GENERATION

TABLE 2 | PRIMERS FOR MOUSE MRNA MAPPING

TABLE 3 | OFF-TARGET HITS FROM DMD SGRNA 1

TABLE 4 | SUMMARY OF EDITING OUTCOMES FROM MURINE MODEL GENERATION

TABLE 5 | SUMMARY OF CHIMERISM AND GERM LINE TRANSMISSION

TABLE 6 | COPY NUMBER VARIANTS CALLED BY CNVNATOR

TABLE 7 | STRUCTURAL VARIANTS CALLED BY MANTA

TABLE 8 | ICE ANALYSIS OF CANDIDATE SGRNAS TARGETING MURINE DMD

List of Figures

FIGURE 1 | PHD THESIS OVERVIEW

FIGURE 2 | WGS ANALYSIS OF DMD IN PATIENT HARBORING PATHOGENIC DUPLICATION

FIGURE 3 | IN VITRO CRISPR-MEDIATED TREATMENT OF A TANDEM 139 KB INTRAGENIC DUPLICATION

FIGURE 4 | SGRNA DESIGN AND SCREENING FOR DMD DUPLICATION MOUSE MODEL

FIGURE 5 | HISTOLOGICAL ANALYSIS OF NOVEL DMD MODELS 1A10 AND 3A7

FIGURE 6 | MOLECULAR CHARACTERIZATION OF 1A10 AND 3A7 MRNA SHOWS ABERRANT TRANSCRIPT SUGGESTING A COMPLEX REARRANGEMENT

FIGURE 7 | WGS OF 1A10 MODEL IDENTIFIES DUPLICATION OF EXONS 18-30 FOLLOWED BY INVERSION OF EXONS 31-34 IN DMD

FIGURE 8 | LONG RANGE PCR AND SANGER SEQUENCING VERIFICATION OF WGS OF 1A10 MOUSE MODEL

FIGURE 9 | NEXT GENERATION SEQUENCING ANALYSIS OF 3A7 MODEL IDENTIFIES COMPLEX GENOMIC REARRANGEMENT

FIGURE 10 | CRISPR-MEDIATED RE-INVERSION OF COMPLEX DNA REARRANGEMENT IN ZYGOTES OF DUP18-30 1A10 MOUSE MODEL

FIGURE 11 | HISTOLOGICAL CHARACTERIZATION OF DMD DUP18-30 INV1A10 MICE

FIGURE 12 | FUNCTIONAL CHARACTERIZATION OF DMD DUP18-30 INV1A10 MICE

FIGURE 13 | SGRNA DESIGN AND EXPERIMENTAL DESIGN

FIGURE 14 | DMD DUP18-30 MICE TREATED WITH CRISPR SHOW RESTORATION OF FULL-LENGTH DYSTROPHIN PROTEIN

FIGURE 15 | WESTERN ANALYSIS OF PROTEIN EXPRESSION IN CRISPR TREATED DMD DUP18-30 MICE

FIGURE 16 | DUP18-30 MICE TREATED WITH CRISPR SHOW HISTOLOGICAL AND FUNCTIONAL IMPROVEMENT

FIGURE 17 | DUP18-30 MICE TREATED WITH CRISPR SHOW FUNCTIONAL IMPROVEMENT

FIGURE 18 | EXPLORING THE FEASIBILITY OF ONE SGRNA TREATMENT APPROACH FOR MULTIPLE PATIENT CELLS HARBORING DUPLICATION IN DMD

Introduction

1.1 Duchenne Muscular Dystrophy

1.1.1 Duchenne Muscular Dystrophy

Duchenne muscular dystrophy (DMD; OMIM #310200) is the most common form of pediatric muscular dystrophy, affecting approximately 1 in 5,000 males worldwide 1. This X-linked neuromuscular disorder is caused by mutations in the DMD gene (OMIM *300377) 2,3. DMD has 79 exons and encompasses approximately 2.5 megabases of the genome, encoding a 427 kilodalton (kDa) protein, dystrophin 4-6. Clinically, the loss of dystrophin in DMD patients manifests as progressive muscle deterioration, loss of ambulation, respiratory complications, and a life-limiting disease trajectory.

Dystrophin provides structural integrity to skeletal and cardiac muscles by linking the subsarcolemmal cytoskeleton to the extracellular matrix through the Dystrophin Glycoprotein Complex (DGC). In DMD, the absence of dystrophin leads to the loss of the DGC at the sarcolemma which ultimately causes an increased susceptibility to muscle damage in response to physical activity or injury 7,8.

1.1.2 DMD Mutation Spectrum

Pathogenic mutations in the DMD gene are mostly large deletions, followed in frequency by single point mutations and, to a lesser extent, duplications and translocations, all of which disrupt the coding sequence and result in the absence of dystrophin expression 5,9. In-frame mutations in DMD generally give rise to a milder phenotype, as seen in patients with Becker muscular dystrophy (BMD) 9,10. Duplications comprise approximately 10-12% of DMD cases, as reported by a recent analysis of the TREAT-NMD Global database 11. There is some reported variation between ethnic populations: the rate is 6.5% in the Indian 12, 16.7% in the Korean 13, 17.8% in the Saudi 14, and 20.5% in the Taiwanese 15 populations. Although most duplications are nonrecurrent, hotspots for duplications spanning exons 2-22 have been identified 10,11,16,17.

1.2 Current Therapeutic Approaches for DMD

There is an unmet medical need for treatment of DMD, as no curative therapies are available. Recent advances in multidisciplinary care have improved the quality of life for individuals with DMD, many of whom are living longer. Standards of care include rehabilitation as well as respiratory and cardiac management 18. Emerging treatments include gene therapy, such as micro-dystrophin gene transfer, as well as DMD deletion-specific antisense oligonucleotide exon skipping, and promotion of nonsense mutation stop codon readthrough with ataluren. Patients with pathogenic duplications in DMD would only be amenable to micro-dystrophin gene transfer therapy, which delivers a micro-dystrophin that is not full-length. Thus, there is currently neither an approved therapy nor an emerging therapeutic strategy for DMD caused by duplications that would restore full-length dystrophin.

1.2.1 Current Interventions

Physiotherapy, assistive devices and treatment with glucocorticoids are the main components of DMD management 18. Physiotherapy has been shown to prolong function through increased range of motion, which helps prevent contractures and scoliosis. Glucocorticoid therapy slows progression by maintaining ambulation longer and preserving upper limb and respiratory function. Respiratory complications are the major cause of mortality in individuals with DMD. As diaphragm muscles become more affected, individuals with DMD may require noninvasive devices like a cough assist. With disease progression, additional ventilatory support may be required 18. Cardiovascular complications are another major cause of mortality. The cardiac muscle, like skeletal muscle, loses function when dystrophin is absent. This can lead to cardiomyopathy. Glucocorticoids have shown a beneficial effect on heart function and, additionally, angiotensin-converting enzyme inhibitors and beta blockers are used to manage this symptom 19.

1.2.2 Approved Medications for DMD

Corticosteroids like prednisone and Emflaza (deflazacort) are part of the standard of care for DMD treatment. Although the exact mechanism of action of corticosteroids is not entirely clear, the benefits include slowing progression of the disorder, likely through their anti-inflammatory effect. Chronic use of corticosteroids has also been associated with detrimental side effects like bone weakness as well as weight gain 20,21. Because this treatment does not target the underlying cause of DMD, it can be prescribed to DMD and BMD patients irrespective of their pathogenic mutations. Additionally, mutation-specific therapies have been developed: Translarna (ataluren), approved in 2014 by the European Commission; Exondys 51 (eteplirsen), approved in 2016 by the Food and Drug Administration (FDA); and Vyondys 53 (golodirsen), approved in 2019 by the FDA. To date, none have been approved by Health Canada. Of note, none of these three therapies can be used to treat patients with a duplication in DMD. Ataluren is beneficial for the subset of patients harboring a stop mutation. Eteplirsen and golodirsen are beneficial for patients with deletion mutations, specifically those whose reading frame would be restored by skipping exon 51 or exon 53, respectively. Antisense oligonucleotides like eteplirsen and golodirsen are complementary to regions of the dystrophin pre-mRNA; these agents induce skipping of one or more exons to restore the reading frame and produce a shorter dystrophin protein 22,23. The long-term efficacy of the shorter dystrophin has yet to be determined, as these therapies are fairly new.
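The reading-frame logic behind exon skipping can be illustrated with a minimal sketch. It assumes that a deletion preserves the reading frame only when the total number of removed coding bases is a multiple of three. The exon lengths, the `EXON_LENGTHS` table and both function names are illustrative assumptions, not values or methods taken from this thesis, and the sketch does not check that the skipped exon is adjacent to the deletion.

```python
# Illustrative exon lengths in base pairs (assumed values for this sketch,
# not taken from the thesis; check them against a reference annotation).
EXON_LENGTHS = {50: 109, 51: 233, 52: 118, 53: 212}


def is_in_frame(removed_exons, lengths=EXON_LENGTHS):
    """Return True if removing these exons keeps the reading frame intact."""
    total = sum(lengths[exon] for exon in removed_exons)
    return total % 3 == 0


def skipping_restores_frame(deleted_exons, skipped_exon, lengths=EXON_LENGTHS):
    """Return True if skipping one more exon makes an out-of-frame deletion in-frame."""
    if is_in_frame(deleted_exons, lengths):
        return False  # the deletion is already in-frame, nothing to restore
    return is_in_frame(list(deleted_exons) + [skipped_exon], lengths)


if __name__ == "__main__":
    print(is_in_frame([52]))                  # deletion of exon 52 alone
    print(skipping_restores_frame([52], 51))  # exon 52 deletion plus exon 51 skipping
    print(skipping_restores_frame([52], 53))  # exon 52 deletion plus exon 53 skipping
```

Under these assumed lengths, a deletion of exon 52 alone is out of frame, while additionally skipping exon 51 or exon 53 brings the total removed length to a multiple of three.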

1.2.3 Gene Therapies in Clinical trials

Another therapeutic approach aimed at restoring a shorter but functional dystrophin is currently in clinical trials sponsored by different companies: Sarepta Therapeutics, Solid Biosciences, and Pfizer Inc. These gene replacement therapy trials aim to deliver a truncated but functional DMD gene to cells 24,25. This approach is not without its own set of challenges. The short dystrophin being produced is not fully functional. Also, episomal AAV vectors that deliver micro-dystrophin are lost over time 26, and administration of a second dose of AAV can lead to an immune response. Further strategies aimed at compensating for the loss of dystrophin include ATP modulation as well as anti-inflammatory and fibrosis-reducing compounds. The long-term efficacy of these approaches has yet to be determined.

1.2.4 Functional Benefits of Low Level Dystrophin Expression

An important question in assessing dystrophin-restorative therapies is how much dystrophin is enough to ameliorate disease. It is estimated that in humans ≥20% mini-dystrophin protein expression would be sufficient, based on clinical observations that patients with BMD are often affected less severely and maintain ambulation 27,28. Furthermore, studies in dystrophin-negative female mice with skewed X-inactivation suggest that approximately 3-14% of full-length dystrophin can improve muscle function and that >20% is needed to fully protect muscle fibers from exercise-induced damage 29-31.

1.2.5 Current Dystrophin Deficient Animal Models

The classic model for Duchenne muscular dystrophy is the C57Bl/10ScSn mdx (mdx) mouse 32, which harbors a sporadic point mutation in exon 23 leading to dystrophin loss. While this model is genetically comparable to the subset of Duchenne muscular dystrophy caused by point mutations in humans, it shows a species-specific response to dystrophin loss, with relatively mild clinical symptoms that are not obvious but can be detected through highly sensitive behavioral testing of balance, locomotion and strength (reviewed in 33). Additionally, this mouse shows a moderately affected muscle histology, with less uniform muscle fibers, presence of inflammation and centrally localized nuclei. Other strains, each carrying different point mutations or small deletions, have also been generated using chemical mutagenesis (reviewed in 33). These dystrophin-deficient models have a similarly mild phenotype.

Additionally, sporadic Dmd gene mutations, either point mutations or small deletions, have been identified in at least nine dog breeds (reviewed in 33). Experimental colonies, however, are not available for most breeds, and the most commonly studied is the Golden Retriever Muscular Dystrophy (GRMD) model, which carries a sporadic splice mutation in intron 6 34. Although the dog phenotype more closely resembles the disease phenotype in patients, several factors, like availability and the resource-intensive maintenance of a dog colony, have limited the use of this model.

With the mainstream use of genome engineering technology, new animal models are being developed. A pseudo-duplication model for DMD was developed in the mouse by targeting a second copy of exon 2 into intron 1 35. This model does not recapitulate the structure of pathogenic duplications in DMD seen in patients, as it is not a tandem duplication. Most recently, a mouse model has been made through exon 50 deletion (DEx50), which is amenable to exon 51 skipping therapeutic approaches 36. Both of these models have been described to show a pathophysiology similar to that of the standard mdx mouse 35,36. As of today, no mouse or canine model with a tandem duplication in DMD exists.

1.3 Structural Genomic Variants and Human Disease

Since the sequences of the first two human genomes were published in 2007 37 and 2008 38, it has become increasingly clear that genomic variation is not restricted to single nucleotide polymorphisms, as previously thought, but that structural variants are quite common 39,40. Structural variants are generally defined as altered regions of Deoxyribonucleic Acid (DNA) sequence affecting at least 1 kilobase (kb) in length, which can be balanced (i.e., inversions and translocations) or imbalanced (i.e., deletions or duplications), the latter known as copy number variants (CNVs) 39.

1.3.1 Copy Number Variants

Approximately 12% of the human genome is subject to copy number variation like deletions, duplications or other multiplications of DNA 40. CNVs may be intragenic or encompass multiple contiguous genes. The full functional significance of such variation is not fully understood; however, it has been suggested that CNVs give rise to high genetic variability and drive a great deal of genetic diversity and evolution 41. Additionally, pathogenic CNVs have been linked to both monogenic and complex diseases. Such pathogenic CNVs have been identified in neuromuscular disorders like Charcot-Marie-Tooth 1A disease 42,43 and Duchenne/Becker muscular dystrophy 11,16,44; in multisystem disorders like Williams syndrome (7q11.23 deletion) 45-47 and Prader-Willi and Angelman syndromes 48,49; and in neurodegenerative and neuropsychiatric conditions like MECP2 (Xq28) duplication syndrome 50-52, Pelizaeus-Merzbacher disease 53,54, Alzheimer’s disease 55, 22q11.2 deletion and duplication syndromes 56,57, autism spectrum disorders and schizophrenia 58-60. Deletions and duplications of large genomic regions are therefore an important class of disease-causing variants in humans; however, difficulty in modeling these structural variants in animals has limited the availability of such models.

1.3.2 Complex Genomic Rearrangements

Complex genomic rearrangements (CGRs) are mutations that have two or more breakpoint junctions and consist of more than one simple structural variant. CGRs account for approximately 1% of pathogenic mutations in Duchenne muscular dystrophy 61,62. This frequency is believed to be an underestimate, as these types of mutations, especially those encompassing balanced rearrangements like inversions or translocations, are relatively difficult to identify through multiplex ligation-dependent probe amplification (MLPA), the standard mutation diagnostic assay.

Although infrequent, CGRs with various structures have been identified in DMD through more comprehensive mutation analysis, including DMD gene-targeted sequencing. These CGR structures are varied and can include a duplication and a deletion on the same chromosome 14,63, double non-contiguous duplications 14,63, a duplication inserted in direct orientation into a deleted region 64, duplications combined with inversions 65, and an inversion flanked by a deletion 64.

1.3.3 Mechanism Underlying Duplications and Complex Rearrangements

Two major mechanisms by which copy number aberrations arise have been elucidated from mapping the breakpoint junctions of pathogenic duplications: nonallelic homologous recombination (NAHR) and replication-based mechanisms (reviewed in 66). Although both can happen at any locus, one or the other mechanism may predominate, as determined by local genome architecture. NAHR occurs when there is an exchange of DNA fragments; this rearrangement can usually happen if the interval is flanked by a common genomic structure or architecture, like low-copy repeats. This recombination can result in deletions and duplications, but deletions are favored. This mechanism results in recurrent duplications, which include the same genomic interval in unrelated individuals affected by the same disorder, as observed in the neuromuscular disorder Charcot-Marie-Tooth 1A (reviewed in 66,67).

On the other hand, replication-based mechanisms involving Fork Stalling and Template Switching (FoSTeS)/Microhomology-Mediated Break-Induced Replication (MMBIR) and/or Non-Homologous End Joining are commonly thought to play a significant role in the formation of nonrecurrent genomic rearrangements, where the size and breakpoint position can differ amongst unrelated individuals with the same disorder. These models suggest that, during replication, switching to an upstream fork can result in a duplication, switching to a downstream fork can result in a deletion, and repeated switches back and forth result in complex rearrangements such as triplications and inversions. In these cases, a simple cut and paste of two ends is not observed at the breakpoint junction; instead, there are inserted segments of genome, suggesting a template-driven mechanism with microhomology at the ends (reviewed in 66).

Furthermore, an increasing number of studies providing a deeper resolution of the breakpoint junctions of pathogenic duplication and CGR mutations in DMD offer growing evidence that replicative mechanisms drive most rearrangements (reviewed in 66). It has been suggested that certain features of the DMD locus may contribute to the high number of structural variants observed in the gene; these include regional genomic instability within DMD, the presence of repetitive elements, and stem-loop structures 64,68.

1.4 CRISPR/Cas9 System

1.4.1 Overview CRISPR/Cas9

CRISPRs and CRISPR-associated (Cas) genes function in the adaptive immune system of prokaryotes 69-74. This system has been adapted to facilitate a wide variety of targeted genome engineering applications, including generation of specific model organisms and large-scale unbiased genome perturbation experiments to probe gene function or elucidate causal genetic variants. In addition to facilitating genome modifications, the wild-type Cas9 nuclease can also be converted into a generic RNA-guided homing tool. For this application dead Cas9 (dCas9) is utilized where its catalytic domains are inactivated. The use of effector fusion proteins greatly expands the repertoire of genome engineering modalities achievable using Cas9. This provides opportunities to alter transcription states of specific genomic loci, monitor chromatin states, or even rearrange the three-dimensional organization of the genome 6.

The CRISPR/Cas9 system uses guides or single guide RNAs (sgRNAs) to direct Cas9 to sequences upstream from a protospacer-adjacent motif (PAM). This motif is specific to each species of Cas9 and is necessary for its recognition of the target. Once the sgRNA-Cas9 complex binds to the target region, a double stranded cut is made which is repaired through the cell’s own repair machinery.

1.4.2 Mechanism of Action: Gene Editing

Since the potential power of a programmable nuclease for editing mammalian genomes was realized, the CRISPR/Cas9 system has been developed as a technology applicable to probing or altering many biological processes 75,76. Regardless of the platform, this system requires a mammalian codon-optimized Cas9 protein and a chimeric sgRNA, which is made up of a CRISPR RNA (crRNA) and a trans-activating CRISPR RNA (tracrRNA) 72,75-77. Guide sequences are generally 17-20 bp long 78. For Streptococcus pyogenes Cas9 (SpCas9), target sequences must be adjacent to a protospacer adjacent motif (PAM) sequence of the form 5′-NGG 79. Cas9 target recognition is dictated by Watson-Crick base-pairing of an RNA guide with its DNA target 80,81. Once expressed in cells, the Cas9 nuclease and the sgRNA form a complex that binds to the target sequence and makes a double-stranded break in the target. The break is repaired via non-homologous end joining (NHEJ), an error-prone process that introduces insertions and deletions (indels) into the target sequence. Targeted mutations can also be introduced by co-transfecting single- or double-stranded DNA templates to promote homology directed repair (HDR). To date, SpCas9 has been used broadly to achieve efficient genome editing in a variety of species and cell types, including human cell lines, bacteria, zebrafish, yeast, mouse, fruit fly, roundworm, rat, common crops, pig, and monkey (reviewed in 6). In recent years, the discovery and characterization of Cas9 from other species, like Staphylococcus aureus Cas9 (SaCas9), has expanded the use of CRISPR/Cas9 in preclinical studies, since SaCas9 is smaller than SpCas9 82.
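The targeting rule described above (a guide-length protospacer immediately followed by a 5′-NGG PAM) can be sketched as a simple sequence scan. This is a minimal illustration under stated assumptions, not the sgRNA design procedure used in this thesis: the function names are hypothetical, only the letters A, C, G and T are handled, and no scoring of on-target or off-target activity is attempted.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_spcas9_sites(seq, guide_len=20):
    """List candidate SpCas9 sites: a protospacer followed by an NGG PAM.

    Returns (strand, start, protospacer, pam) tuples. For the "-" strand,
    `start` is the offset within the reverse-complemented sequence.
    """
    seq = seq.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


if __name__ == "__main__":
    example = "A" * 20 + "TGG"  # toy sequence with one plus-strand site
    for site in find_spcas9_sites(example):
        print(site)
```

The `guide_len` parameter can be lowered toward 17 to reflect the shorter guides mentioned in the text; candidate sites found this way would still need experimental or computational evaluation.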

1.4.3 CRISPR-Cas9 in Clinical Trials

The first approved clinical trials using Cas9-edited cells are underway, with 13 studies currently recruiting and 1 completed (see clinicaltrials.gov: NCT03164135, NCT03545815, NCT03655678, NCT03399448, NCT04037566, NCT03745287, NCT04035434, NCT03747965, NCT03166878, NCT03398967, NCT02793856, NCT03044743, NCT03872479, NCT03081715). All but one utilize ex vivo edited cells, which are transplanted back into the patient to treat cancers such as leukemia or blood disorders such as β-thalassemia or sickle cell anemia. The first, and to date only, in vivo editing therapy is for the treatment of Leber congenital amaurosis 10, delivered through subretinal injection (NCT03872479). Currently, there are no systemic (whole-body) CRISPR therapies, and no clinical trials assessing CRISPR gene editing for neuromuscular disorders, including Duchenne muscular dystrophy, have been described.


1.4.4 Use of CRISPR/Cas9 in Animal Model Generation for Large Structural Rearrangements

Modelling large genomic rearrangements, such as the duplications observed in Duchenne muscular dystrophy, has proven laborious and time consuming because of their inherent size, which has hindered the generation of animal models for disease-associated copy number variations (CNVs). One technique that has been used is the Cre recombinase-locus of X-over P1 (Cre-loxP) system 83. In this approach, loxP sites are inserted at the desired locations in a chromosome, followed by recombination within embryonic stem (ES) cells. A murine model for Pelizaeus-Merzbacher disease harboring a duplication was generated using this method 84. With the recent discovery and mainstream use of CRISPR/Cas9, multiple protocols have been published demonstrating that large structural rearrangements can be made in autosomal genes in ES cells 85 or zygotes 86-88 using the CRISPR/Cas9 system (Table 1). Also in zygotes, the generation of megabase-scale deletions, inversions and duplications was shown using CRISPR/Cas9 technology targeting the Contactin-6 gene 87. The most frequent genetic rearrangement in the above-mentioned studies was deletion. Duplications were detected in some but not all mouse-generating experiments, and it is still unknown which features of the targeting design favour duplication outcomes. Of note, animals harboring duplications were not the intended models in these studies; deletion models were. As such, duplications were reported from the initial PCR-based junction screening and not from whole genome sequencing (WGS). Duplication-harboring animals were not kept to establish mouse lines and so were not fully characterized. As a result, the integrity and complexity of the duplications identified in the initial screens are unknown.


Table 1 | Summary of Reported Large Genomic Editing Efficiencies in Murine Model Generation

| Locus targeted | Size of fragment | Targeting approach | Frequency of born: Del | Frequency of born: Inv | Frequency of born: Dup | Frequency of born: 2 different alleles | Verification method | Citation |
|---|---|---|---|---|---|---|---|---|
| CNTN6, Chromosome (Chr) 6 | 1.1 Mb | 2 sgRNA + mRNA + single stranded oligonucleotide (ssODN) in zygotes | 5(41), 17% | 4(41), 10% | 0(41), 0% | 2(41), 5% (Del+Dup) | PCR analysis specific junction, sequencing of PCR products | 87 |
| Dip2a, Chr 10 | 65 kb | 2 sgRNA in zygotes | 3(14), 21% | Not reported | Not reported | Not reported | PCR analysis specific junction, sequencing of PCR products | 88 |
| Nox 4, Chr 11 | 155 kb | 4 sgRNAs + mRNA + ssODN in zygotes | 11(46), 24% | 14(46), 30% | 1(41), 2% | | PCR analysis for specific junction, sequencing of PCR products | 86 |
| Grm5, Chr 11 | 545 kb | as above | 12(68), 18% | 12(68), 18% | 1(68), 1% | | as above | as above |
| Nox4+Grm5 | 1.15 Mb | as above | 14(48), 29% | 10(48), 21% | None detected | | as above | as above |
| H2afy, Chr 5 | 1.189 kb | 2 sgRNA + Cas9 plasmid in ES cells | 11(288), 3.8% | 2(288), 0.7% | 0(288), 0% | 1(288), 0.3% (Del/Inv) | PCR analysis for specific junction, sequencing of PCR products | 85,86 |
| Bmp2, Chr 20 | 3.7 kb | as above | 12(192), 6.3% | 3(192), 1.6% | 0(192), 0% | 2(192), 1% (Del/Inv) | as above | as above |
| Ihh, Chr 2 | 12.6 kb | as above | 121(288), 42% | 17(288), 5.5% | 0(288), 0% | 6(288), 1% | as above | as above |
| Pitx1, Chr 5 | 32 kb | as above | 9(288), 3.1% | 11(288), 3.8% | 2(288), 0.7% | 7(288), 2.4% (Del/Inv, Del/Dup, Inv/Dup) | as above | as above |
| Laf4, Chr 2 | 353 kb | as above | 38(288), 13.2% | 32(288), 11.1% | 81(288), 28% | 71(288), 24.6% (all combinations) | as above | as above |
| Epha4, Chr 2 | 1.672 Mb | as above | 4(192), 2.1% | 3(192), 1.6% | 0(192), 0% | 0(192), 0% | as above | as above |


1.4.5 Adeno-Associated Virus as Delivery System

Adeno-associated virus (AAV) vectors are attractive vehicles for the delivery of CRISPR/Cas9 components in vivo. This is mainly because of their low immunogenic potential, reduced oncogenic risk from host-genome integration 89, and broad range of serotype specificity 90,91. They have a cargo capacity of ~4.5 kb. In 2017, Luxturna (Spark Therapeutics), a gene therapy for RPE65-associated retinal dystrophy, became the first FDA-approved AAV-based drug. Additionally, clinical trials of systemically delivered AAV-based gene therapies are ongoing for Spinal Muscular Atrophy Type I (Novartis) and DMD (Sarepta, Solid Biosciences, Pfizer).

Various parts of this Introduction have been adapted from Wojtal, D*, Kemaladewi D.U.,* et al. Spell Checking Nature: Versatility of CRISPR/Cas9 for Developing Treatments for Inherited Disorders. Am J Hum Genet 98, 90-101, doi:10.1016/j.ajhg.2015.11.012 (2016).

1.5 Thesis Objectives

Currently there is no curative treatment available for DMD caused by duplications. The objective of this thesis is to develop a gene editing strategy using the CRISPR/Cas9 system for the treatment of genetic duplications. As a proof of concept, the strategy will first be tested in patient cells harboring a duplication in the DMD gene, where successful editing will be measured by the restoration of full-length dystrophin protein. The second objective is to test this strategy in an animal model. For this objective, the first multi-exon duplication mouse model of Duchenne muscular dystrophy will be generated to mimic a pathogenic mutation observed in a DMD patient. The novel model will then be treated with CRISPR/Cas9 targeting the duplication, and successful removal of the duplicated DNA fragment will be assessed by measuring the level of restored full-length dystrophin protein as well as the degree of improvement in pathophysiology and muscle function. This will be the first reported therapeutic strategy targeting a CNV that treats the root cause of DMD and restores full-length protein.


Figure 1 | PhD Thesis overview

A) Panel describing the effects that intragenic duplication in DMD has on mRNA, protein and muscle histology in human and murine muscles. B) Outline of theoretical approach being tested in this thesis which is the use of CRISPR/Cas9 Gene Editing technology to treat intragenic duplications as the primary cause of a subset of Duchenne muscular dystrophy cases.


[Figure 1 image. Panel A, Intragenic Duplication: DNA and mRNA carry exons 18-30 twice (exons 1-17, 18-30, 18-30, 31-79); protein: no dystrophin; histology. Panel B, Restoration to WT after CRISPR: DNA and mRNA carry exons 1-79 once; protein: dystrophin restored; histology.]


Materials and Methods

2.1 Duplication Mapping

2.1.1 Whole Genome Sequencing

Whole genome sequencing of DNA extracted from human cells (Chapter 3) and murine tail tissue (Chapter 4) was performed by the Center for Applied Genomics (TCAG) at the Hospital for Sick Children using the Illumina (San Diego, CA) HiSeq X system as previously described 92. DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen) and quantified using the Qubit Fluorometer High Sensitivity Assay (Thermo Fisher Scientific, Waltham, MA). Sample purity was verified using a Nanodrop (Thermo Fisher Scientific, Waltham, MA). For library preparation with the Illumina TruSeq PCR-free DNA Library Prep Kit, 400 ng of DNA was used as input material and was first fragmented to approximately 350 base pairs by sonication. The DNA was then end-repaired and A-tailed, and indexed adapters (TruSeq Illumina adapters) were added by ligation. Libraries were validated using Bioanalyzer DNA High Sensitivity chips (Agilent Technologies, Santa Clara, CA), quantified by PCR, pooled in equimolar quantities and paired-end sequenced on an Illumina HiSeq X platform to generate reads of 150 bp in length.

2.1.2 Variant Calling and Annotation of Whole Genome Sequencing for DMD Dup18-30 Patient

The sequencing dataset in BAM (binary form of Sequence Alignment Map) format was visualized using the Integrative Genomics Viewer (IGV) tool 93. CNV breakpoints were mapped based on the location of discordant split reads. Breakpoints in the human Dup18-30 subject were confirmed by long-range PCR followed by Sanger sequencing using primers 5’-CAGCATCATGACCTGTTTCAATC-3’ (forward) and 5’-TTGTTAGAGGGCAGCAAGTTTGT-3’ (reverse).


2.1.3 Variant Calling and Annotation for Dmd Dup18-30 inv1A10 Mouse Model

Variant calling and annotation were performed by TCAG at the Hospital for Sick Children. Briefly, Illumina software bcl2fastq (HAS v2-2.5.55.1311) was used to convert per-cycle BCL basecall files to standard sequencing output in FASTQ format. FastQC was used to assess the quality of the experiment and FastQ Screen was used to check the composition of the library. Reads were then aligned to the reference mouse genome GRCm38_68 (mm10) using BWA mem (v0.7.12). Duplicate reads were marked using Picard Tools (v2.5.0). Indel realignment, base quality score recalibration and variant detection using HaplotypeCaller were performed using GATK 3.7 following the best practices recommendation. Filters recommended by GATK were applied to flag variants. Variants were annotated using an Annovar-based pipeline. CNVnator (0.3.3) was used to call CNVs and Manta (v0.29.6) to call structural variants (SVs). CNVs and SVs were annotated using VEP (v93).

2.2 Human DMD Duplication Treatment in vitro

2.2.1 CRISPR/Cas9 and sgRNA Design and Cloning

All sgRNAs described in the study were chosen based on highest activity level and ranked according to the least possible number of off targets by utilizing the Benchling Tool (http://benchling.com) 94. The best predicted sgRNAs were then subcloned into the lentiCRISPR v2 vector (Addgene #52961, a gift from Feng Zhang) 95 in Chapter 3. All oligonucleotides were synthesized by Integrated DNA Technologies.

2.2.2 sgRNA Evaluation

Human sgRNAs were evaluated in human embryonic kidney 293 cells (HEK-293) (ATCC CRL-1573). All cells were maintained in DMEM (Dulbecco’s modified Eagle’s medium) supplemented with 10% FBS (fetal bovine serum), 1% penicillin/streptomycin and 1% L-glutamine (all from Gibco). Cells were maintained at 37°C with 5% CO2. Cells were transfected using Lipofectamine 2000 (Invitrogen) as per the manufacturer’s instructions and incubated for 72 hours. Genomic DNA was isolated using the DNeasy Blood and Tissue Kit (Qiagen) and regions of interest were PCR amplified. Indels were detected using the GeneArt Genomic Cleavage Detection Kit (Invitrogen) according to the manufacturer’s protocol.

2.2.3 In vitro Transfection

Primary muscle biopsy cells from an individual harbouring a DMD exon 18-30 duplication were obtained and maintained in High Glucose DMEM supplemented with 10% FBS, L-glutamine and 1X penicillin/streptomycin (all from Life Technologies) at 37°C with 5% CO2. Production of lentiCRISPR and transduction into target cells were performed as described by Sanjana et al. 95 with a slight modification. To produce the lentiCRISPR, a 10 cm petri dish of 293T cells (ATCC) at 80% confluency was transfected with 10 µg of the transfer plasmid lentiCRISPR v2 (Addgene #52961, a gift from Feng Zhang) 95, 5 µg of the envelope plasmid pCMV-VSV-G (Addgene #8454, a gift from Bob Weinberg), and 7.5 µg of the packaging plasmid psPAX2 (Addgene #12260, a gift from Didier Trono) using the calcium phosphate transfection method. At 60 hours post-transfection, the supernatant was collected, centrifuged at 3000 rpm for 10 minutes and filtered through a 0.45 µm low-binding filter (Whatman) 95. DMD duplication cells were co-transduced with Ad-MyoD (Vector Biolabs) at 100 MOI in DMEM with 1% FBS, to induce differentiation of fibroblasts into myoblasts, and with a lentiCRISPR vector containing Dup18-30: sgRNA 1, as described above. Three days post-transduction, 2 µg/ml puromycin was added to enrich for cells containing the lentiCRISPR-sgRNA constructs. DNA was collected 7 days post-puromycin selection and proteins were collected 7 days post-differentiation.

2.2.4 Western blot From Cells

Cells were lysed using RIPA buffer (50 mM Tris HCl pH 7.4, 150 mM NaCl, 1 mM EDTA, 1% deoxycholate, 1% NP40 and 1% Triton X-100) supplemented with phosphatase and protease inhibitor cocktails (Roche). Protein concentration was measured using the BCA assay. For Western blot (WB), 25 µg of protein lysate was resolved on 3-8% Tris-acetate gels and transferred to nitrocellulose membranes overnight using the BioRad system. Blots were probed with antibodies against dystrophin (MAB1692, Millipore), β-dystroglycan (MANDAG clone 7D11, DSHB), α-dystroglycan (kindly provided by Kevin Campbell) and β-tubulin (SantaCruz).


2.2.5 Off-target Editing Analysis

Off-target analysis was performed for all lentivirus-based in vitro gene editing experiments. Specifically, primers targeting the loci corresponding to each sgRNA’s top 20 off-target hits, as computed by the CRISPR Design Tool 96, were designed and used to amplify DNA from each gene editing experiment. Amplicons of ~200 bp were purified using magnetic beads, and library preparation was conducted at TCAG at The Hospital for Sick Children with sample-specific barcodes using the Ion Torrent Library preparation kit (Life Technologies). Sequencing was performed using the Ion Torrent Proton. Each potential off-target site was evaluated after aligning the corresponding sequencing reads to the human reference genome (hg19). A custom Python script was utilized as described above, and the proportion of reads that matched the reference genome versus those with insertions, deletions and substitutions near predicted cleavage sites was used to estimate the off-target activity of the corresponding sgRNA. The top 12 off-target hits computed by the COSMID (CRISPR Off-Target Sites with Mismatches, Insertions, and Deletions) Tool 97 that were not predicted by the CRISPR Design Tool 96 were assayed using the GeneArt Genomic Cleavage Detection Kit (Life Technologies), according to the manufacturer’s protocol.
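The read tally described above can be sketched as follows. This is a hypothetical illustration only and not the custom script used in this work: it assumes each read has already been aligned and is supplied as a pair of equal-length gapped strings (reference, read), and the function names, the window size and the toy alignments are all invented for the example.

```python
from collections import Counter


def classify_read(aligned_ref: str, aligned_read: str, cut_index: int, window: int = 5) -> str:
    """Label an alignment by the first edit found within `window` bases of the cut site.

    `cut_index` is given in ungapped reference coordinates; '-' marks a gap.
    """
    ref_pos = 0
    for ref_base, read_base in zip(aligned_ref, aligned_read):
        near_cut = abs(ref_pos - cut_index) <= window
        if ref_base == "-":
            if near_cut:
                return "insertion"
            continue  # an insertion consumes no reference position
        if near_cut:
            if read_base == "-":
                return "deletion"
            if read_base != ref_base:
                return "substitution"
        ref_pos += 1
    return "reference"


def edited_fraction(alignments, cut_index: int, window: int = 5) -> float:
    """Fraction of reads carrying any edit near the predicted cleavage site."""
    counts = Counter(classify_read(ref, read, cut_index, window) for ref, read in alignments)
    total = sum(counts.values())
    return (total - counts["reference"]) / total if total else 0.0


# Hypothetical toy alignments around a cut site at reference position 5.
toy_alignments = [
    ("ACGTACGTAC", "ACGTACGTAC"),    # matches the reference
    ("ACGTACGTAC", "ACGT-CGTAC"),    # deletion next to the cut
    ("ACGTA-CGTAC", "ACGTATCGTAC"),  # insertion at the cut
    ("ACGTACGTAC", "TCGTACGTAC"),    # substitution far from the cut, ignored
]
fraction = edited_fraction(toy_alignments, cut_index=5, window=2)
print(fraction)
```

Restricting the count to a window around the cut site is one way to keep sequencing errors elsewhere in the amplicon from inflating the estimate; comparison against an untreated control would still be needed in practice.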

2.3 Mouse Model Generation

2.3.1 Animal Husbandry

All mice were housed in the specific-pathogen free facility at The Center for Phenogenomics (TCP) in Toronto, Ontario. Animals were maintained under proper environmental regulations: 12 hour light/dark cycle and provided with food and water ad libitum in individually ventilated units (Techniplast). All procedures involving animals were performed in compliance with the Animals for Research Act of Ontario and the Guidelines of the Canadian Council on Animal Care. The local Animal Care Committee reviewed and approved all procedures and treatments conducted on animals at The Center for Phenogenomics.


2.3.2 sgRNA Design and Cloning for Mouse Model Generation

The Benchling platform (https://benchling.com) was used to identify candidate sgRNA sequences with little off-target homology. These were used to generate the DMD Dup18-30 mouse models. The DMD Dup18-30-1A10 and 3A7 mouse models were generated by designing two sgRNAs targeting intron (I) 17 (sgRNA i17a: 5’-GCATGGCGCAAAGGTCAAGA-3’, PAM AGG and sgRNA i17b: 5’-AATACTACTAGCTCACCATC-3’, PAM TGG) and two sgRNAs targeting intron 30 (sgRNA i30a: 5’-ACTGGTGAAATCGTGCCCGG-3’, PAM AGG and sgRNA 17c: 5’-CCGGGCACGATTTCACCAGT-3’, PAM AGG). sgRNA oligonucleotides were annealed and cloned into the BbsI site of pSpCas9(BB)-2A-Puro (PX459) v2.0 (Addgene plasmid #62988, a gift from Feng Zhang). sgRNAs to generate the Dmd Dup18-30-inv1A10 mouse model were designed to flank the inverted region of exons 31-34. Cas9 mRNA and sgRNAs were synthesized as previously described 98. sgRNA activity was tested using a GeneArt Genomic Cleavage Detection Kit (Thermo Fisher).
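As a small consistency check on the guide sequences listed above, the sketch below verifies that each protospacer is 20 nt long and that each stated PAM fits the SpCas9 NGG pattern. The helper names are hypothetical, the sequences are copied from the text, and the check says nothing about activity or specificity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_valid_spcas9_guide(protospacer: str, pam: str) -> bool:
    """Check length, alphabet and the NGG PAM pattern."""
    return (
        len(protospacer) == 20
        and set(protospacer) <= set("ACGT")
        and len(pam) == 3
        and pam[1:] == "GG"
    )


# Guide names, protospacers and PAMs as printed in this section.
GUIDES = {
    "i17a": ("GCATGGCGCAAAGGTCAAGA", "AGG"),
    "i17b": ("AATACTACTAGCTCACCATC", "TGG"),
    "i30a": ("ACTGGTGAAATCGTGCCCGG", "AGG"),
    "17c": ("CCGGGCACGATTTCACCAGT", "AGG"),
}

all_valid = all(is_valid_spcas9_guide(seq, pam) for seq, pam in GUIDES.values())
same_site = reverse_complement(GUIDES["i30a"][0]) == GUIDES["17c"][0]
print(all_valid, same_site)
```

The second check shows that the two intron 30 protospacers, as printed, are reverse complements of each other, so they would occupy the same 20-bp site on opposite strands.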

2.3.3 Dup18-30 Dmd-1A10 Mouse Model Generation

Plasmids containing SpCas9 and the four sgRNAs were co-electroporated into C57BL/6NTac-C2-derived mouse embryonic stem cells (mESCs) at The Center for Phenogenomics 99-101. Clones were expanded from 96-well plates and replicate plates were made. DNA was isolated from each well as previously described by Nagy et al. 101 and screened for the duplication junction, which joins intron 30 to intron 17, with flanking primers i30 #1929: 5’-GCC TGA GAA GCA TCA TAC CAC AAC G-3’ and i17 #1930: 5’-GAG CAT GAA ACG AAG CCA GAG ATT AGA C-3’.

2.3.4 Dup18-30 Dmd-inv1A10 Mouse Model Generation

sgRNAs were designed flanking exons 31 to 34, the region that was inverted in the first version of the Dup18-30-1A10 mouse model. The Benchling platform was used as previously described to obtain candidate sgRNAs: a 5’ sgRNA (5’-CTACTAGCTGAATCAAAAGA-3’, PAM: TGG), which targets the unique inversion junction identified in 1A10 mice, and a 3’ junction sgRNA (5’-AAAGAATCGACCCAAGCCTC-3’, PAM: TGG), which targets the normal region of intron 34 just outside the inversion junction. Homozygous 1A10 female mice were induced to superovulate and mated with 1A10 hemizygous males. Embryos were then pronuclear injected with Cas9 mRNA and the two sgRNAs at TCP. PCR-based screening was utilized to genotype F1 pups. As a positive control, screening was targeted to the duplication junction using the primers described in 2.3.3. Additionally, screening was targeted at exons 31-34 to determine whether this inverted region was indeed flipped back to the WT sequence, utilizing the 5’ inversion junction primers i34 #1830: 5’-CAA GAT GGC GAC CGC CAA ATA TGA GAT-3’ and dup18-30 #1372: 5’-AAT CTT TCA CTT CCT CGG ACT CAA GG-3’ (expected size 366 bp). The 3’ inversion junction was screened using primers i34 #981: 5’-ACC GAA TTG GTT TTA TTT TCC CTT TAC TAC-3’ and i34 #2411: 5’-GAC AAA GCA AGA AGA GCC AAG CAA AGT G-3’ (expected size 2.5 kb). Other editing outcomes were screened as well, including deletion of exons 31-34, using primers dup18-30 com #1372: 5’-AAT CTT TCA CTT CCT CGG ACT CAA GG-3’ and i34 #1829: 5’-TCT CCT AAG GAA GTC TTG ACA ATT GAT TGA AC-3’, and duplication of exons 31-34, using primers i34 #981: 5’-ACC GAA TTG GTT TTA TTT TCC CTT TAC TAC-3’ and i34 #1830: 5’-CAA GAT GGC GAC CGC CAA ATA TGA GAT-3’.

2.3.5 sgRNA Design and Cloning for Mouse Model Generation and Treatment

sgRNAs tested for in vivo duplication removal treatment were selected using the Benchling Tool to screen all intronic regions within the duplicated exon 18-30 region. A panel of top-scoring sgRNAs (Table 7), ranked by off-target score, was cloned into a vector in which SaCas9 is expressed under a cytomegalovirus (CMV) promoter with nuclear localization signals (NLS), human influenza hemagglutinin (HA) tags and a bovine growth hormone polyadenylation (bGHpA) signal, and the sgRNA with scaffold is expressed under a U6 promoter: pX601-AAV-CMV::NLS-SaCas9-NLS-3xHA-bGHpA;U6::BsaI-sgRNA (Addgene plasmid #61591, a gift from Feng Zhang). sgRNA activity was additionally tested using the Inference of CRISPR Edits (ICE) software (Synthego; https://ice.synthego.com). Briefly, the activity of all murine sgRNAs was evaluated in murine myoblast C2C12 cells (ATCC, CRL-1772™). C2C12 cells were transfected with each sgRNA and incubated for 72 hours, after which DNA was collected. The region spanning each guide target was PCR amplified and Sanger sequenced. The resulting chromatograms were then decomposed using the ICE software and aligned to control sequencing from an untreated sample.


2.3.6 RNA isolation and RT-PCR

RNA from cultured cells and mouse tissue sections was isolated utilizing the RNeasy Mini Kit (Qiagen). Complementary deoxyribonucleic acid (cDNA) was synthesized utilizing Superscript III Reverse Transcriptase (Invitrogen) per the manufacturer's protocol. cDNA was then amplified utilizing a series of primers (Table 2) in a PCR walking manner. Forward (F) or reverse (R) primers were selected so that each amplicon had one arm anchored in a previously amplified region. Primers did not span exon-exon junctions but were placed within a single exon, so that amplification would not fail if an assumed exon-exon junction was absent. All amplicons were sequenced and aligned to the murine reference transcript to determine the exact sequence of the mRNA.

Table 2 | Primers for mouse mRNA mapping

Name/Location Primer Sequence 5’-3’

Exon 1 mDMD F GTGGGAAGAAGTAGAGGACTGTTA

Exon 7 mDMD F AATAGTGTGGTTTCACAGCACTCAGC

Exon 14 mDMD F CAACTTAAGGTACTGGGAGATCG

Exon 14 mDMD R AGCCATGTACTAAAAAGGCACTG

Exon 17 mDMD F ATTTCACAGGCTGTCACCAC

Exon 19 mDMD F CATTAACACCCTCATTTGCCATC

Exon 22 mDMD F GCTATCAGGAGACAATGAGTAGCATCAG

Exon 22 mDMD R TGTAATTTCCCGAGTCTCTCCTC

Exon 31 mDMD F GAAACATAACCAGGGGAAGGATGCC

Exon 31 mDMD R GGCATCCTTCCCCTGGTTATGTTTC

Exon 36 mDMD F CACAAAGTGGATCATTCATGCAGATG

Exon 37 mDMD R TTCTGTGAGAAATAGCTGCAAATC

Exon 53 mDMD F CAGCTGCAGAACAGGAGACAACAG

Exon 56 mDMD R TGGCCATTTTCATCAAGATTGTGATAG

Exon 65 mDMD R TCTCAATGTTTATGATACGGGACGAAC

Exon 79 mDMD R AAAGTAATGCAAAACAATGTGCTGCCTC

2.4 AAV9 Viral Vector Production

AAV9 vectors were generated by Vigene Biosciences from PX601 vectors containing the duplication-targeting sgRNA and SaCas9, and stored in aliquots at -80°C. Systemic delivery of the AAV9-based CRISPR treatment was achieved through temporal vein injections as previously described by our lab 102,103. Briefly, P1 male pups were anaesthetized by brief placement on ice, and 50 µl of AAV solution (3 × 10^12 genome copies (GC)/ml) was injected into the temporal vein using a 35-gauge needle. After 7 weeks, functional tests were performed, mice were euthanized by CO2 inhalation and tissue was collected for DNA, RNA, and protein analysis.
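The total vector dose per pup follows from the injected volume and the titre. The sketch below works through that arithmetic, assuming the titre is read as 3 × 10^12 GC/ml; the function name is hypothetical.

```python
def genome_copies_per_injection(titre_gc_per_ml: float, volume_ul: float) -> float:
    """Total genome copies delivered: titre (GC/ml) times volume (µl converted to ml)."""
    return titre_gc_per_ml * volume_ul / 1000.0


# 50 µl at an assumed titre of 3e12 GC/ml.
dose_gc = genome_copies_per_injection(3e12, 50)
print(f"{dose_gc:.1e} GC per pup")
```

Under this reading, each pup receives 1.5 × 10^11 genome copies.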

2.5 Molecular Assessment

2.5.1 Collection of Animal Tissues

All animals described in Chapters 3 and 4 were sacrificed by CO2 inhalation according to the TCP Standard Operating Procedure PR011, and whole skeletal and cardiac muscles were dissected. Tissues were frozen by coating in VWR Clear Frozen Section Compound and submerging for 20 seconds in a beaker of isopentane, which was itself kept cold by partial submergence in liquid nitrogen. After freezing, tissues were placed into tubes, transported on dry ice, and stored at -80°C.

2.5.2 Assessment of Muscle Pathology

For histological analysis, flash-frozen tissues were cut into 8 µm cross-sections of skeletal muscle and 10 µm cross-sections of cardiac muscle using the Microm HM5252 Cryostat (Thermo Fisher Scientific) at -25°C to -28°C. Tissues were positioned and embedded in OCT, and sections from 3 positions in the muscle were mounted on Superfrost Plus slides (VWR). Sections not used for slides were collected into tubes containing zirconium beads (OPS Diagnostics) for RNA and protein extractions. Staining was performed on sections retrieved from three independent subsections of each muscle that spanned its length from tendon to tendon. Hematoxylin and eosin (H&E) staining of skeletal and cardiac muscle was performed using a standard protocol. Digitized images of all muscle sections were acquired with a 3DHISTECH Pannoramic 250 Flash digital scanner, and 350 fibers were scored from three different fields of view per animal. All counts were performed by two individuals in a double-blinded procedure using ImageJ software. The percentage of centrally localized nuclei was determined by counting the total number of fibers and the portion of fibers with centrally localized nuclei within each field of view. Feret’s diameter was calculated using a mouse-driven cursor, with the dimensions of highlighted fibers calculated within the software package based on a calibrated screen pixel-to-actual size ratio.
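The central nucleation measure described above reduces to a simple ratio. The sketch below assumes that counts from the three fields of view are pooled per animal before the percentage is taken, which is one possible reading of the procedure; the function name and the counts are hypothetical.

```python
def percent_central_nuclei(fields: list[tuple[int, int]]) -> float:
    """Pooled percentage of fibers with central nuclei.

    Each entry is (total fibers, fibers with centrally localized nuclei) for one field of view.
    """
    total_fibers = sum(total for total, _ in fields)
    central_fibers = sum(central for _, central in fields)
    if total_fibers == 0:
        raise ValueError("no fibers counted")
    return 100.0 * central_fibers / total_fibers


# Hypothetical counts for one animal: three fields of view, 350 fibers in total.
example_fields = [(120, 30), (110, 22), (120, 35)]
pct_central = percent_central_nuclei(example_fields)
print(f"{pct_central:.1f}% of fibers with central nuclei")
```

Averaging the per-field percentages instead of pooling would weight small and large fields equally and can give a slightly different value.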

2.5.3 Immunofluorescence

Slides were equilibrated at room temperature for 30 minutes, fixed in ice-cold methanol for 10 min at 4ºC and blocked for 30 min at room temperature in blocking buffer (0.2% bovine serum albumin (BSA) and 3% normal goat serum in phosphate-buffered saline). The primary antibody, anti-dystrophin (#15277, Abcam; 1:250), was incubated at 4°C overnight. The secondary antibody, goat polyclonal anti-rabbit Alexa Fluor 594 (Thermo Fisher, 1:200), was applied for 1 h at room temperature the following day. Slides were washed and mounted with ProLong Gold antifade reagent (Life Technologies). Slides were digitally scanned using the 3DHISTECH Pannoramic 250 Flash digital scanner.

2.5.4 Western blot

Tissues (10-30 mg) were lysed in a magnalyzer using RIPA buffer (50 mM Tris HCl pH 7.4, 150 mM NaCl, 1 mM EDTA, 1% deoxycholate, 1% NP40 and 1% Triton X-100) supplemented with phosphatase and protease inhibitor cocktails (Roche). The protein concentration was measured using the BCA assay. For Western blot, 10 µg of protein lysate was resolved on 3-8% Tris-acetate pre-cast gels (Thermo Fisher Scientific) and transferred to nitrocellulose membranes with a 12 min transfer (20 V for 2 min, 23 V for 6 min, 25 V for 4 min; iBlot2 Dry Blotting System, Thermo Fisher Scientific). Membranes were probed for dystrophin (mouse monoclonal anti-dystrophin, Mandys8, Sigma (1:1000)), HA tag (mouse anti-HA tag, #ab130275, Abcam (1:1000)) and calnexin (CNX) (rabbit anti-calnexin, ab22595, Abcam (1:10,000)). Horseradish peroxidase (HRP)-conjugated secondary antibody was applied at a 1:10,000 dilution and blots were developed using Femto enhanced chemiluminescence (ECL) substrate (Thermo Fisher Scientific) for dystrophin and Pico ECL for CNX.

2.6 Functional Analysis

All functional tests were performed blindly by a trained operator at the Clinical Phenogenomics Core Facility at the Centre for Phenogenomics, Toronto. The operator was unaware of the nature of the projects and treatments.

2.6.1 Grip Strength

All tests were performed blindly and in accordance with TREAT-NMD SOP #DMD_M.2.2.001 104. The grip strength of fore and hind limbs was assessed using the Bio-GS3 Grip Strength Meter (Bioseb), which measures the peak force (in grams) an animal applies in grasping a specially designed pull grid. The force was normalized to bodyweight (grams). Each mouse underwent 3 trials testing forelimbs.

2.6.2 Open field test

All tests were performed blindly and in accordance with TREAT-NMD SOP #DMD_M.2.2.002 105. Animals were placed in a plexiglas open field (41.25 cm × 41.25 cm × 31.25 cm) illuminated at 200 lx. The VersaMax Animal Activity Monitoring System recorded vertical activities and total distance traveled for 20 min per animal.


2.7 Statistical Analysis

All statistical analyses were performed using GraphPad Prism (GraphPad Software). Statistical significance was evaluated using a two-tailed Student’s t-test, and P < 0.05 was considered significant.

Fragments from Chapter 2 have been adapted from Wojtal, D*, Kemaladewi D.U.,* et al. Spell Checking Nature: Versatility of CRISPR/Cas9 for Developing Treatments for Inherited Disorders. Am J Hum Genet 98, 90-101, doi:10.1016/j.ajhg.2015.11.012 (2016).


Exploring the Versatility of the CRISPR/Cas9 System

3.1 Overview

With the development of powerful genome analysis platforms, there is growing evidence for the prevalence of copy number variations (CNVs) that are associated with numerous genetic conditions 40. However, to date no therapeutic strategies have been developed to directly target these large genomic rearrangements. Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) is a widely used platform for genome engineering. However, the potentially broad therapeutic implications for CRISPR-mediated genome repair are largely unexplored especially for disorders caused by copy number variation. Here, we develop a previously undescribed CRISPR mediated single RNA guide approach to successfully remove a large genome rearrangement.

Theoretically, duplication removal may also be achieved using two sgRNAs targeted to cut at either end of the duplicated region. However, this depends on the availability of a functional sgRNA that spans the duplication junction, since only there is the target sequence unique. A guide designed anywhere else within the duplicated region will also match the second copy, leading to a total of three cuts with two sgRNAs. Additionally, locus specificity further restricts sgRNA design, as the junction sequence may not be an ideal sgRNA target: there may be no or few PAM sequences available in that region, low cutting efficiency, or a high number of predicted off-targets. To overcome these limitations, we developed the one-sgRNA approach to remove genomic duplications. This approach makes it possible to evaluate the entire duplicated sequence in order to design guides with the highest duplication-removal efficiency and the fewest possible off-targets, thereby alleviating the sequence restrictions imposed by the two-guide approach. Moreover, using one instead of two sgRNAs is therapeutically appealing given the limited loading capacity of potential in vivo delivery vehicles such as AAV9.
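The logic of the single-guide strategy can be illustrated with a toy string model. The sketch below assumes a perfect tandem duplication, exact guide matching and clean rejoining of the outer ends; it ignores PAM requirements, NHEJ indels and junction insertions. All sequences, names and the cut offset are hypothetical.

```python
def find_sites(genome: str, guide: str) -> list[int]:
    """Return the start index of every exact match of the guide in the genome."""
    sites = []
    start = genome.find(guide)
    while start != -1:
        sites.append(start)
        start = genome.find(guide, start + 1)
    return sites


def remove_tandem_copy(genome: str, guide: str, cut_offset: int) -> str:
    """Cut at the same offset inside both guide matches and rejoin the outer ends."""
    sites = find_sites(genome, guide)
    if len(sites) != 2:
        raise ValueError(f"expected 2 target sites, found {len(sites)}")
    first_cut, second_cut = (site + cut_offset for site in sites)
    return genome[:first_cut] + genome[second_cut:]


# Hypothetical toy sequences, not real DMD sequence.
UPSTREAM = "AAAACCCC"
DUPLICATED_UNIT = "GATTACAGGCTTAGC"
DOWNSTREAM = "TTTTGGGG"
wild_type = UPSTREAM + DUPLICATED_UNIT + DOWNSTREAM
duplicated = UPSTREAM + DUPLICATED_UNIT * 2 + DOWNSTREAM

guide = "TACAGG"  # lies inside the duplicated unit, so it matches both copies
repaired = remove_tandem_copy(duplicated, guide, cut_offset=3)
print(repaired == wild_type)
```

Because the two cuts fall at equivalent positions in each copy, the excised fragment is exactly one unit long, and the rejoined sequence equals the wild type wherever in the unit the guide sits. This is why the whole duplicated region, rather than only the junction, is available for guide design.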

As a proof of concept, we test this approach in primary cells of an individual affected by DMD with an X-chromosomal intragenic duplication in the DMD gene which prevents the production of the dystrophin protein. Removal of the pathogenic exon 18-30 duplication in DMD leads to restoration of full-length dystrophin expression. Our findings establish the far-reaching therapeutic utility of CRISPR/Cas9 that can be tailored to target numerous inherited disorders caused by duplications.

3.2 Results

3.2.1 Duplication Mapping and Characterization

Muscle biopsy cells were obtained from a patient diagnosed with Duchenne muscular dystrophy and carrying a duplication of exons 18-30. In order to explore whether CRISPR/Cas9 technology could be utilized to target the pathogenic intragenic duplication suspected in patient DMD Dup ex18-30, we first needed to fine-map this CNV to resolve the breakpoints and determine its location, orientation and exact size. The patient had previously been identified by standard diagnostic testing, Multiplex Ligation-dependent Probe Amplification (MLPA), as having a pathogenic intragenic duplication of DMD exons 18-30. This semi-quantitative diagnostic tool probes exon dosage and can determine the presence of a CNV; however, neither the extent of the duplication within the flanking introns (i.e. the breakpoint junctions) nor the orientation of the duplication can be determined with this technology.

First, to confirm the MLPA diagnosis of a duplication of exons 18-30 and to further map the breakpoints of this CNV, WGS was performed by TCAG. The sequencing data in BAM format, aligned to the human reference genome (hg19), were then visualized using the Integrative Genomics Viewer (IGV) software 93 (Figure 2). Paired-end reads are displayed below each locus, with read depth shown as a bar chart. Normal reads, which align back to the reference genome as expected, are coloured gray and have a left and right orientation. Figure 2A shows an increase in read coverage (the height of the gray bars) spanning from a portion of intron 17 through to intron 30, confirming a duplication of exons 18-30. In addition, discordant reads, coloured green, aligned to the breakpoints where the read depth changes. These reads are distinguished in IGV by their green colour because, unlike the gray reads, their pair orientation (right-left) does not agree with the reference genome. Orientation is defined in terms of read strand (left versus right) and first read versus second read of a pair. Figure 2B shows a schematic adapted from the IGV software developers, Robinson et al. 93, which illustrates how a discordant read pair, coloured green, with a right-left orientation is indicative of a tandem duplication or translocation. In the reference genome, the loci at the breakpoints of a CNV fragment are labeled “A” and “B”. Paired-end reads generated on the Illumina HiSeq X platform span about 150 bp, whereas CNVs are over 1 kb in length, here 139 kb. As such, a large section of DNA that is duplicated and inserted into the genome next to the original sequence, as seen in the lower subject genome, would not be captured by a single read. The only instance in which regions “A” and “B” in that orientation would be in close enough proximity to be captured in the same read pair would be if the two loci were physically attached in a tandem orientation, as seen at a duplication junction.
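The orientation reasoning above can be sketched as a small classifier for read pairs. This is a simplified illustration and not IGV's implementation: it assumes a standard Illumina paired-end library in which concordant pairs face inward, and the class name, labels, insert-size threshold and coordinates are hypothetical. A pair spanning a tandem duplication junction maps with the reverse-strand read to the left of the forward-strand read, i.e. facing outward, which corresponds to the right-left signature described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    pos1: int      # leftmost mapped coordinate of read 1
    strand1: str   # "+" or "-"
    pos2: int      # leftmost mapped coordinate of read 2
    strand2: str


def classify_pair(pair: ReadPair, max_insert: int = 1000) -> str:
    """Classify a mapped read pair by strand arrangement and distance."""
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "same_strand"   # inversion-like signature
    if left_strand == "-" and right_strand == "+":
        return "outward"       # tandem-duplication-like signature
    if right_pos - left_pos > max_insert:
        return "long_insert"   # deletion-like signature
    return "concordant"


# Hypothetical pairs: one ordinary pair and one spanning a duplication junction.
normal_pair = ReadPair(1000, "+", 1300, "-")
junction_pair = ReadPair(32_552_100, "+", 32_413_200, "-")
print(classify_pair(normal_pair), classify_pair(junction_pair))
```

In the junction pair, the forward read lies near the end of the duplicated interval and its mate near the start, so the two reads map roughly 139 kb apart and point away from each other on the reference.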

To determine the exact locations in introns 17 and 30 where the two copies are joined (the junction breakpoint), I examined the loci immediately adjacent to the change in read coverage. Deeper resolution of the alignment track at the intron 30 locus showed that the discordant paired-end and split reads in intron 30, indicated in green, mapped to intron 17, suggesting that these two fragments of DNA were physically linked in a tandem orientation in this patient’s DNA (Figure 2C). To confirm the duplication breakpoint, long range PCR followed by Sanger sequencing was used (Figure 2D). Taken together, a tandem duplication was mapped to chrX:32,552,206-32,413,149 (hg19) with an AAAT insertion at the breakpoint junction. No additional large copy number variations were observed in this gene.
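The read-pair reasoning above can be sketched in a few lines of Python. This is a minimal illustration, not IGV's implementation: the function name, its labels and the reduction of a read pair to two strand symbols are assumptions made for the example. The final lines simply check the size of the mapped duplication from the hg19 coordinates reported above.

```python
def classify_pair(left_strand: str, right_strand: str) -> str:
    """Interpret a read pair from the strands of its leftmost and rightmost mates.

    Hypothetical helper following the pair-orientation scheme described in the text:
      '+' then '-'  (LR, mates face inward)  -> expected orientation
      '-' then '+'  (RL, mates face outward) -> tandem duplication or translocation
      same strand   (LL or RR)               -> inversion
    """
    valid = ("+", "-")
    if left_strand not in valid or right_strand not in valid:
        raise ValueError("strands must be '+' or '-'")
    if left_strand == "+" and right_strand == "-":
        return "LR: concordant"
    if left_strand == "-" and right_strand == "+":
        return "RL: tandem duplication or translocation"
    return "LL/RR: inversion"


# Size of the mapped duplication, from the coordinates reported in the text (hg19).
dup_size_bp = abs(32_552_206 - 32_413_149)

print(classify_pair("-", "+"))
print(dup_size_bp)  # 139057, i.e. approximately 139 kb
```

In this scheme, the green read pairs at the intron 17 and intron 30 breakpoints correspond to the RL case.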

Figure 2 | WGS Analysis of DMD in patient harboring Pathogenic Duplication

A) Screenshot of the DMD gene from IGV software, where the depth of reads at each locus is represented as a gray bar chart. There is an increase in read coverage spanning from intron 30 to intron 17, seen as a peak in the number of reads mapping to this region. Reads highlighted in green are visible at the breakpoints in intron 17 and intron 30, suggesting that the two genomic fragments are in close proximity to each other in a head-to-tail orientation. The duplication mapped to chrX:32,552,206-32,413,149 (hg19). B) Schematic adapted from the IGV software developers 93 showing the program’s interpretation of structural events. In the reference genome, the loci at the breakpoints of this fragment are labeled “A” and “B”. In a duplicated genome, loci “A” and “B” are brought into close proximity, and paired-end sequencing will identify reads in which these two loci are physically joined but flag them as discordant with a right-left orientation, which IGV highlights in green. C) Screenshot of a deeper resolution of the alignment track at the duplication breakpoints. Here, paired-end alignment of the coloured reads in intron 30 maps to intron 17, suggesting that these two fragments of DNA are physically linked in a tandem orientation in this patient’s DNA. D) Electropherogram of the junction of the duplication of DMD exons 18–30; highlighted in blue is the insertion of AAAT at the junction.

3.2.2 CRISPR-Based Genetic Removal of Duplicated Fragment in Patient Cells

Next, we sought to explore the potential for removing duplicated fragments of DNA using the CRISPR/Cas9 system in conjunction with only one sgRNA. We hypothesized that a single sgRNA would bind in two places within a duplication, leading to the formation of two DSBs and hence removal of the intervening sequence upon repair (Figure 3A). To test the one-sgRNA duplication removal approach, we designed an sgRNA, named sgRNA1, within the 139 kb duplication, excluding any known coding regions or regulatory elements. We co-transduced the affected fibroblasts with adeno-MyoD, to induce transdifferentiation of fibroblasts into myoblasts, and with either a lentiviral (Lenti-) vector containing CRISPR/Cas9 with DMD sgRNA1 (LentiCRISPR) or a lentiviral vector containing Green Fluorescent Protein (GFP) (LentiGFP). To assess for evidence of duplication removal at the molecular level, we employed a three-primer PCR strategy as described and illustrated in Figure 3B. In the WT control, we detected a higher band, corresponding to the amplification product of P1+P3, whereas the duplication control showed two bands at a ratio of 1:1, corresponding to the P1+P3 product and the duplication junction-specific product of P1+P2. After LentiCRISPR treatment with sgRNA1, but not with LentiGFP, the ratio became skewed towards the top band (Figure 3C; p<0.05), indicating a conversion of the duplicated allele toward a WT single copy. We next explored whether the molecular transition toward the WT allele leads to functional restoration of protein expression. We detected expression of full-length dystrophin at 4.4% of wildtype levels in transdifferentiated myotubes treated with sgRNA1, which was accompanied by restoration of α-dystroglycan, a critical component of the DGC (Figure 3D; p<0.01).
Taken together, our data demonstrate that CRISPR/Cas9-mediated removal of duplications leads to restoration of full-length protein expression in myotubes, which for the first time opens up entirely new treatment strategies for individuals affected with DMD duplications.
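The logic of the single-guide strategy can be illustrated with a toy string model. This is a sketch under strong simplifying assumptions: the sequences are invented, the function name is hypothetical, and repair is modelled as perfect rejoining of the outermost cut ends, ignoring the indels that end joining can introduce.

```python
def cut_and_rejoin(genome: str, target: str, cut_offset: int) -> str:
    """Cut at every occurrence of `target` and rejoin the outermost cut ends.

    `cut_offset` is the position of the cut within the target sequence.
    Toy model only: a single cut is treated as resealed without change.
    """
    cut_sites = []
    start = genome.find(target)
    while start != -1:
        cut_sites.append(start + cut_offset)
        start = genome.find(target, start + 1)
    if len(cut_sites) < 2:
        return genome
    return genome[:cut_sites[0]] + genome[cut_sites[-1]:]


unit = "AAGGTTCCGATTACAGG"   # invented duplicated segment
target = "GATTACA"           # invented sgRNA target inside the segment
wild_type = "CCCCC" + unit + "TTTTT"
duplicated = "CCCCC" + unit + unit + "TTTTT"   # tandem (head-to-tail) duplication

repaired = cut_and_rejoin(duplicated, target, cut_offset=4)
print(repaired == wild_type)                # the duplicated allele collapses to one copy
print(len(duplicated) - len(repaired))      # excised length equals one copy of the unit
```

Because the two cuts fall at the same position in each copy, the excised fragment is exactly one duplication unit long wherever the target sits within the duplication, while the single-copy wild-type sequence carries only one cut site and is left unchanged in this model.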

3.2.3 Off-target Analysis

Off-target analysis showed no significant hits in the top 20 sites predicted by the MIT CRISPR Design tool 96 or the top 12 hits predicted by COSMID 97; these hits were validated by next-generation sequencing and the GeneArt Genomic Cleavage Detection Kit. The results are summarized in Table 3.

Figure 3 | In vitro CRISPR-mediated treatment of a tandem 139 kb intragenic duplication

A) Schematic of the position of DMD sgRNA 1 and the duplication-removal strategy. Each numbered square represents the indicated exon. B) Schematic of the three-primer duplication-removal strategy. C) Targeted deletion of a 139 kb duplication in DMD. PCR was performed on DNA from three replicate experiments in which affected myoblasts were transduced with lentiGFP or lentiCRISPR Cas9 nuclease with DMD sgRNA 1. The top band was amplified with universal primers (P1+P3), which amplify both an allele with the duplication and a control allele. The bottom band is specific to alleles harboring the duplication (P1+P2). A decrease in the bottom band, indicating removal of the duplicated region, was only observed when Cas9 and sgRNA 1 were present. D) Western blot with antibodies against dystrophin, α-dystroglycan and tubulin as a loading control. The amount of dystrophin was normalized to that of tubulin by densitometric analysis. Treated patient cells showed 4% of WT full-length dystrophin expression compared to 0% in untreated cells. *p<0.05, **p<0.01 (Student’s t-test from three independent experiments).

Table 3 | Off-target hits from DMD sgRNA 1

Targets  Sequence                 MIT Score  COSMID Score    Locus (Gene ID)*              Indels (%)
ON       ATATCTTCTTAAATACCCGAAGG  100        0               chrX:+32461612                N/D
OFF 1    AGTGTCTTCTTAAATACCTGCAG  1.1        5.3             chr2:+225344609               N/D
OFF 2    AAAACCTTCACAAATACCCGGAG  0.7        Not identified  chr6:+19965267                N/D
OFF 3    AATCTCTTCTTCAATACCCTTGG  0.7        6.97            chr18:+23119458               N/D
OFF 4    AAGAGCTGCTTAAATACCCTGAG  0.7        Not identified  chr11:+46836502               2.53
OFF 5    CATATCTTCTTAAATAGCCTTGG  0.6        9.12            chr8:+67600685                N/D
OFF 6    ATTAGCATCTTTAATACCCGAAG  0.5        Not identified  chr7:+130907163               N/D
OFF 7    GATATATCCTGAAATACCCGTAG  0.5        Not identified  chrX:-7222056                 N/D
OFF 8    ATTTTCTTATGAAATACCCGAAG  0.5        Not identified  chr14:-91038783               N/D
OFF 9    TAAATCCTCTTAAATACCCTAAG  0.5        6.89            chr1:-19924480                N/D
OFF 10   GAAATCTTCATAAATACCAGGAG  0.5        6.16            chr9:-34930580                N/D
OFF 11   AATTACTTCATAAATACCTGAGG  0.5        Not identified  chr5:-170343578 (NM_022897)   N/D
OFF 12   AATTTCAACTTAAATACCCTTGG  0.5        Not identified  chr13:-109558634              1.26
OFF 13   AATTCCATCTTAAATACCCTAAG  0.5        Not identified  chr7:-10351725                N/D
OFF 14   CATCTTTTCTTAAATACCCAAGG  0.4        Not identified  chr2:+208081568               N/D
OFF 15   AGTTTCTTGTTAAATACCCAAGG  0.4        Not identified  chr6:+8264834                 N/D
OFF 16   ATTCTCTTTTTAAATACCCACAG  0.4        Not identified  chr5:+146530145               N/D
OFF 17   AATTTCTCTTTAAATACCCAAAG  0.4        Not identified  chr19:-22631201               N/D
OFF 18   AATATTTTCTTCAATACCCCTGG  0.4        7.01            chr5:-146140359               N/D
OFF 19   AATATTTTCTTCAATACCCTAAG  0.4        7.01            chr1:-185743949               N/D
OFF 20   AATACCTTCATAAGTACCCGAAG  0.4        1.99            chr13:+23073576               N/D

3.3 Discussion and Conclusion

Recent development of genome editing technologies based on the RNA-guided CRISPR-associated endonuclease Cas9 has generated enormous excitement across many fields, including biological research, biotechnology, and clinical medicine. Despite its recent discovery, Cas9 has already been used to generate a number of cellular and animal models for a variety of basic research as well as applications in biotechnology. Furthermore, the CRISPR/Cas9 system can be exploited for the development of genome engineering therapies, which carries the potential to revolutionize medical management in the future. This study defines a pipeline for genome engineering strategies using easily accessible cells from affected individuals and provides evidence for the versatility of the CRISPR/Cas9 system that can be employed for various genetic conditions.

An increasing number of genetic disorders are being found to be caused by chromosomal rearrangements and CNVs. However, treatments targeting the underlying cause of these disorders are currently not available. Although deletion of single-copy genomic DNA has been shown using zinc finger nucleases (Lee et al. 106) and two guides in the CRISPR/Cas9 system 107,108, removal of duplications has not yet been demonstrated. Furthermore, it was unknown whether this type of genetic correction would lead to restoration of a fully functional gene. Here, we developed a new strategy employing the CRISPR/Cas9 system to remove duplicated regions within the genome. Our strategy uses only one sgRNA, which, due to the nature of a tandem (head-to-tail) duplication, creates two double-stranded breaks. Since we are targeting a sequence within a duplication, the sgRNA target is present twice, leading to the formation of two DSBs and hence removal of the intervening sequence, which equates to the total size of the duplication. There are several advantages to this strategy. First, the design of guide RNAs is not limited to specific sequences near the breakpoints. This allows for a larger selection of guide RNAs that can target any portion of the duplicated sequence while minimizing off-target sites. Second, given the limited loading capacity of in vivo delivery vehicles such as AAV9, strategies using the fewest CRISPR components will be critical for successful therapeutic application. In this Chapter, I demonstrate successful removal of a 139 kb duplication in DMD; furthermore, in our 2016 publication we showed that the same strategy was able to remove another large, 278 kb X-chromosomal rearrangement containing the MECP2 gene 109, indicating that this approach can be targeted toward several chromosomal duplication syndromes.

Importantly, off-target analysis showed no significant hits in the top 20 sites predicted by the MIT Design Tool 96 or the top 12 hits predicted by the COSMID Tool 97, which were validated using NGS and the GeneArt Cleavage Detection kit, suggesting that the accuracy and safety of our system lends itself to future therapeutic applications. It is important to note that there were discrepancies between the off-target sites identified by the MIT CRISPR Design Tool and those identified by the COSMID Tool, further emphasizing the need for unbiased off-target analysis such as GUIDE-Seq 110 and/or high-throughput, genome-wide translocation sequencing methods 111,112.

To determine if our duplication removal strategy has broader applicability, we performed proof-of-concept studies in the context of DMD. To date, treatments that specifically target duplications in DMD have not been extensively studied, even though duplications of one or more exons comprise approximately 10% of the DMD mutation spectrum 10. Recent therapeutic strategies for DMD undertaken by other groups include gene replacement therapies, which deliver truncated but functional microdystrophin genes 24,25. One type of mutation-specific therapy involves exon skipping, where antisense oligonucleotides complementary to regions of the dystrophin pre-mRNA are used to induce skipping of one 22,113 or more exons 114, hence restoring the open reading frame to produce a shorter dystrophin protein. Similarly, previous studies from other laboratories using CRISPR/Cas9 have demonstrated that this system can be utilized to restore the reading frame of large deletions in dystrophin 107. However, one potential shortcoming of these approaches is that the shorter dystrophin product ameliorates the disease phenotype only to the extent of making treated individuals similar to individuals affected with BMD, who express a truncated, yet functional dystrophin protein 115. Thus, our data are of particular importance, as removal of a duplication leads to restoration of full-length dystrophin, which represents a new therapeutic opportunity for individuals with DMD caused by duplications.

An important consideration in establishing a treatment for DMD is determining how much dystrophin is necessary to ameliorate the disease phenotype. It is estimated that in humans, about 20% of truncated dystrophin protein expression is sufficient to have a less severe phenotype and maintain ambulation 27,28. Furthermore, studies in mdx mice suggest that approximately 5% of full-length dystrophin can improve disease pathology and >20% is needed to fully protect muscle fibres from exercise-induced damage 29-31. One potential challenge for this treatment strategy is the delivery vehicle for Cas9 and sgRNAs. In this study, we used lentiviral vectors due to the ease of infecting primary cell cultures. However, future in vivo studies will include more clinically useful vehicles such as AAVs. Nonetheless, while it is difficult to extrapolate our in vitro data to potential in vivo situations, our data demonstrating 4.4% expression of full-length dystrophin, accompanied by restoration of components of the DGC, are promising as we continue to explore the in vivo therapeutic feasibility of this approach.

Recent estimates suggest that about 400 million people worldwide are affected by orphan diseases and most of these are caused by primary genetic abnormalities 116. While orphan-drug development has made some progress over the last few years, most genetic disorders lack efficient treatments and are often associated with a life-threatening or life-limiting disease trajectory. The CRISPR/Cas9 system provides a rare and unique opportunity to target not only the underlying primary disease-causing genetic abnormality, but also to alter genetic modifiers that play a critical role in the pathogenesis of a given disease. Here, we demonstrate that individually tailored single guide RNAs are able to remove large duplicated genomic rearrangements. Proof-of-concept studies such as those outlined here, utilizing affected individuals’ cells, are critical in laying the foundation for further research into the application of these therapeutic strategies for safe and efficient postnatal, in vivo treatments for numerous inherited disorders.

3.4 Acknowledgements

Portions of Chapter 3 have been published in: Wojtal, D.*, Kemaladewi, D.U.*, et al. Spell Checking Nature: Versatility of CRISPR/Cas9 for Developing Treatments for Inherited Disorders. Am J Hum Genet 98, 90-101, doi:10.1016/j.ajhg.2015.11.012 (2016).

I would like to acknowledge Drs. Jennifer Doudna, Steve Lin, Aravinda Chakravarti, Hal Dietz and Janet Rossant for critical insights into our studies, as well as Minggao Liang and Wilson Sung for their bioinformatics input. I also thank Drs. Feng Zhang and Rudolf Jaenisch and their laboratories for the CRISPR/Cas9 backbone constructs used in this study. The manuscript in which this work was published was co-authored by myself and Dr. Dwi Kemaladewi; analysis and experiments were shared. Drs. Ronald Cohn and Zhenya Ivakine were co-senior authors and contributed to conceptualization and data analysis. The following Cohn lab members contributed to data collection: Dr. Zeenat Malam, Sarah Abdullah, Tatianna Wong, Elżbieta Hyatt and Zahra Baghestani.

Generation of the First Multi-Exon Duplication Mouse Model of Duchenne Muscular Dystrophy

4.1 Overview

Deletions, duplications and inversions of large genomic fragments constitute an important class of disease-causing variants. These have been linked to a spectrum of conditions, from monogenic disorders to complex diseases. However, limitations of genetic engineering technology have historically made these structural variants difficult to model and have limited the availability of such models. This limitation impacts basic and translational studies aimed at characterizing gene function, determining the pathogenicity of unknown genetic variants, elucidating pathophysiology and performing proof-of-concept in vivo testing of potential therapies. The versatility of the CRISPR/Cas9 system has led many scientists to use this approach for genome targeting and/or modification. In Chapter 3, I discussed the application of this technology as a treatment for genetic disorders; however, perhaps the more common application of CRISPR is in cell and animal model generation. The availability of genetically relevant animal models is especially important for testing gene-targeting therapies such as oligonucleotide and CRISPR-based therapies, because these are mutation specific and must be tested in models that harbour the relevant mutation.

The most commonly used model for Duchenne muscular dystrophy is the mdx mouse, which harbours a spontaneous nonsense mutation. Its wide commercial availability and established characterization contribute to the popularity of this model in the Duchenne field. To date, all published studies involving CRISPR/Cas9-mediated editing have focused on existing point mutation models such as the mdx mouse 117-121, apart from one study on a novel exon deletion mouse model, the ΔEx50 mouse 36. Neither of these models would be amenable to testing a gene editing therapeutic approach aimed at pathogenic duplications. As such, I have generated and characterized the first tandem multiexon duplication mouse, harbouring a 136.8 kb duplication, as a genetically relevant model in which to test our previously described CRISPR-Cas9 single guide duplication removal strategy. The model mimics the duplication of DMD exons 18-30 which we have previously treated with CRISPR-mediated gene editing in patient cells 109. In addition, by generating this novel model harboring a large copy number variation, this study adds to the growing body of literature describing the generation of complex rearrangements with CRISPR-Cas9 and the importance of exploring Cas9-induced genetic alterations at distal off-target sequences.

4.2 Results

4.2.1 Genome Engineering to Generate 137 kb Genomic Duplication in Murine Model

To generate a duplication of exons 18-30 in DMD, targeted breaks in the DNA before exon 18 and after exon 30 needed to be made without disrupting the exons themselves. Figure 4A illustrates potential editing outcomes based on the cell’s own repair mechanisms. I screened the mouse DMD sequence in intron 17 and intron 30 to identify sgRNAs with the lowest possible off-target scores. Two sgRNAs were selected for each targeted intron: the cut sites for sgRNAs i17A and i17B were 17,695 bp and 17,406 bp from exon 18, respectively, and the cut sites for sgRNAs i30A and i30B were 18,846 bp and 18,999 bp from exon 30, respectively (Figure 4B). Each sgRNA was cloned into the px459 vector encoding SpCas9 and a puromycin selection cassette. All four vectors were then co-electroporated into male C57BL/6NTac-C2-derived mESCs at the Center for Phenogenomics in accordance with a previously described protocol 99-101. Two sgRNAs per breakpoint were used because we reasoned that targeting additional double strand breaks at each end could elevate the frequency of generating this complex and infrequent mutation type, an approach that was also taken by Boroviak et al. 86. Figure 4C illustrates the predicted duplication. As summarized in Table 4, 243 ESC clones were PCR screened for deletion and duplication junctions with primers flanking the predicted junctions. Four clones (1.65%) were positive for an exon 18-30 deletion junction and 3 clones (1.23%) contained a duplication junction. Next, clones containing a duplication junction were expanded, aggregated, and injected into blastocysts, which were then implanted into pseudo-pregnant females. Chimeras from the 3 separate clones were born and named Dmd dup18-30 1A10, Dmd dup18-30 3A7 and Dmd dup18-30 2G1 (hereafter referred to as the 1A10, 3A7 and 2G1 mouse lines). PCR screening of the F1 generation showed germline transmission (GLT) in two of the three lines, 1A10 and 3A7 (Table 5).
To confirm the structure at the duplication junction, Sanger sequencing was performed. The sequence of the predicted junction, based on a DSB 3 bp from the PAM and seamless joining of intron 30 to intron 17, is shown in Figure 4D, with the actual sequence shown below it. In both clones, guide i17B generated the DSB in intron 17 and guide i30A generated the DSB in intron 30. The junction sequence contained a 96 bp deletion in line 1A10 and an 8 bp deletion in line 3A7. To map the exact mutation generated and confirm its effect on dystrophin protein expression, further molecular characterization was carried out.

Figure 4 | sgRNA Design and screening for DMD Duplication Mouse Model

A) Schematic depicting potential editing outcomes produced by intrinsic cellular repair machinery for a fragment flanked by two DSBs. B) Schematic depicting the positions of the four sgRNAs (green arrows) used to generate the Dup18-30 mouse model. C) Schematic depicting the predicted duplication of exons 18-30. D) Sanger sequencing of the duplication junction identified in the 2 germline-transmitting founder lines, named Dup18-30 1A10 and Dup18-30 3A7.

Table 4 | Summary of Editing Outcomes From Murine Model Generation

Frequency of editing outcome (Deletion, Inversion, Duplication) is based on PCR breakpoint screening.

Model generation  Targeting method                                                      Clones/founders screened  Exons targeted and size   Deletion   Inversion  Duplication
1st               2 sgRNAs/target, electroporation of Cas9 vector into mESCs            243                       Exons 18-30, 136,837 bp   4 (1.65%)  N/A        Dup junction 3 (1.23%); full dup 1 (0.41%)
2nd               1 sgRNA/target site, microinjection of Cas9 + sgRNA RNP into zygotes  32                        Exons 31-34, 12,083 bp    2 (6.25%)  2 (6.25%)  0 (0%)
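The percentages in Table 4 follow directly from the clone and founder counts. A minimal arithmetic check, using only the counts reported in the table (the helper name is illustrative):

```python
def frequency_percent(positive: int, screened: int) -> str:
    """Editing frequency as a percentage of clones or founders screened."""
    return f"{100 * positive / screened:.2f}%"


# 1st model generation: 243 mESC clones screened
print(frequency_percent(4, 243))   # deletion junction
print(frequency_percent(3, 243))   # duplication junction
print(frequency_percent(1, 243))   # full duplication

# 2nd model generation: 32 founders screened
print(frequency_percent(2, 32))    # deletion (same count for inversion)
```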

Table 5 | Summary of Chimerism and Germ Line Transmission

Chimera Line ID  # of Chimeras  % of chimerism, Sex                           GLT
1A10             4              2 M x 100%; 1 M x 80%; 1 M x 60%              Yes
3A7              5              2 M x 100%; 1 M x 95%; 1 M x 80%; 1 M x 60%   Yes
2G1              3              1 M x 60%; 1 F x 40%; 1 F x 5%                No

4.2.2 Dmd Dup18-30 1A10 and 3A7 Mice have No Dystrophin Expression and Show Characteristic Dystrophic Pathophysiology

To determine whether dystrophin expression was affected by the Dmd mutations described above for the 1A10 and 3A7 mouse lines, I examined muscle tissue by immunofluorescence as well as Haematoxylin and Eosin (H&E) staining (Figure 5A). Immunofluorescence of Tibialis Anterior (TA) muscle cross-sections shows dystrophin loss in 1A10 and 3A7 mice. Furthermore, H&E staining of 4-, 12- and 52-week-old TA muscle cross-sections indicates a muscular dystrophy pathophysiology in 1A10 and 3A7 mice, which includes increased fibrosis, the presence of centrally localized nuclei, and variably sized fibers. Quantification of myofiber sizes, as measured by Minimum Feret’s Diameter, shows a wider distribution of fiber sizes in 1A10 and 3A7 mice compared to age-matched WT mice (Figure 5B-D). Furthermore, histological quantification showed a trend towards an increased number of centrally localized nuclei in 1A10 and 3A7 mice compared to WT, and this number increased further with age. At 4 weeks, the percentage of centrally localized nuclei was 73.3% (± 11.1, n=3) in 1A10 mice, 65.3% (± 15.1, n=3) in 3A7 mice and 3.7% (± 0.5, n=3) in WT mice (WT vs 1A10 **p= 0.0012, WT vs 3A7 * p=0.0508, Standard Error of Mean (SEM)) (Figure 5E). Taken together, these data confirmed that the generated mutations in DMD disrupted dystrophin expression and that loss of this protein led to a muscular dystrophy phenotype which worsened with age.

Figure 5 | Histological analysis of novel DMD models 1A10 and 3A7

A) Immunofluorescence with an antibody against dystrophin (top) and H&E staining (bottom) of TA muscle cross-sections from 4-, 12- and 52-week-old 1A10 (n=3,2,2), 3A7 (n=3,2,2) and WT (n=3,2,2) mice, respectively. A representative sample from each genotype is shown. Scale bar = 100 µm. B-D) Quantification of myofiber sizes as measured by Minimum Feret’s Diameter in mice at B) 4 weeks, C) 12 weeks and D) 52 weeks. E) Quantification of centrally localized nuclei in myofibers over time. At 4 weeks, WT vs 1A10 **p= 0.0012, WT vs 3A7 *p=0.0508. Data are presented as means ± SEM. Statistical analyses were performed using a Student’s t-test.

4.2.3 Molecular Characterization of Dmd Dup18-30 1A10 and 3A7 Mouse mRNA Suggests Occurrence of Complex Rearrangement in Dmd gene

Lack of dystrophin production suggested that the genetic mutation in Dmd must be generating an aberrant mRNA that could not be translated. To determine which exons were affected, I performed long range PCR followed by verification with Sanger sequencing of the dystrophin complementary Ribonucleic Acid (cRNA), as well as Illumina next-generation sequencing of Dmd DNA. Interestingly, I identified abnormal mRNA transcripts suggesting the presence of a complex rearrangement in both lines, involving areas of the Dmd gene not predicted to be targeted in the genome engineering experiment. In addition, the complex rearrangement was not the same in both mouse lines, despite both being derived using the same sgRNAs.

As shown in Figure 6A-B, PCR of exons (Ex) 14-17 and Ex 17-25 shows an unaffected transcript 5’ of the targeted locus in intron 17 in 1A10; however, the Ex 17-25 amplicon is absent in 3A7, suggesting a change at the DNA level that affects RNA transcription. Amplification with a forward primer in exon 25 and a reverse primer in exon 18, spanning a duplication junction, yields a product only in 1A10, suggesting that 1A10 contains the full duplication but 3A7 does not. The region 3’ of the targeted intron 30 was assessed by amplification of Ex 29-37, which is absent in 3A7 and yields a shorter than WT transcript in 1A10. Taken together, these results were used to build a model of the 1A10 mRNA (Figure 6B), which was verified with Sanger sequencing. 1A10 contained the expected duplication of exons 18-30; however, the duplication was followed by two pseudo-exons joined to exon 35, after which the remainder of the transcript was correctly present. The pseudo-exons consisted of inverted fragments of introns 32 and 33. The second line, 3A7 (Figure 6C), had a completely different structural variant despite being generated with the exact same sgRNAs. Although a duplication junction was detected at the DNA level, as in the 1A10 line, mRNA from the 3A7 mouse did not contain the full duplication in the DMD transcript. Instead, the mRNA showed a deletion of exons 18-32 with inclusion of two pseudo-exons joined to exon 33. The first pseudo-exon consisted of an inverted fragment of intron 18 joined to exon 19 and the second consisted of an inverted fragment of intron 30. To determine the exact nature of the genome editing that occurred in DMD, I next performed whole genome sequencing.

Figure 6 | Molecular Characterization of 1A10 and 3A7 mRNA shows aberrant transcript suggesting a complex rearrangement

A) PCR walking to map the mRNA sequence. Amplification of exons (Ex) 14-17 and Ex 17-25 tested the sequence 5’ of the targeted fragment of DNA in both lines. Amplification with a forward primer in exon 25 and a reverse primer in exon 18 spans the predicted duplication junction. Amplification of Ex 29-37 tested the 3’ end of the duplication. B) Schematic depicting the dystrophin mRNA model built based on PCR and Sanger sequencing of 1A10 cRNA and C) 3A7 cRNA. Single copy exons are depicted in solid black and double copy exons in solid red. Hashed pattern exons are not present in the mRNA. Green exons indicate new pseudo-exons.

4.2.4 WGS of DMD Dup18-30 1A10 and 3A7 Mouse DNA

Whole genome sequencing of 1A10 and 3A7 mice was performed as previously described in Section 3.2.1 for human CNV analysis, except that sequencing reads were aligned to the Mus musculus C57BL/6J reference genome (GRCm38/mm10). WGS on the Illumina HiSeq X platform produces 150 bp reads. This is considerably shorter than the SVs of interest, including the expected full duplication of ~140,000 bp and potentially the suspected complex rearrangement(s). As such, three different methods were used to analyze the paired-end next-generation sequencing data in order to map exactly where the structural variants were: read-depth, read-pair and split-read mapping. Figure 7A and B illustrate schematics of split-end reads in inversion and deletion variants, respectively (adapted from 93). In the 1A10 mouse, split-read mapping revealed a 1,439 bp deletion (X:83896501-83899000) in intron 17 spanning the cut site of sgRNA i17B, which could have been expected as deletions at cut sites are regularly reported (Figure 7C, 8A). Furthermore, increased read coverage spanning X:83737872-83874709 (136,837 bp), containing exons 18-30, was observed, as well as discordant paired-end reads in a tandem configuration physically linking intron 17 to intron 30 (Figure 7C-D), which was verified by PCR and Sanger sequencing (Figure 8C). Taken together, these findings indicate a duplication of exons 18-30 at the DNA level. Furthermore, as suspected from mRNA mapping, a 12,083 bp inversion was identified (Figure 7C, D and 8B) in an untargeted region 3’ of the second copy of the duplication. The inverted region harbours exons 31-34 and carries smaller intronic deletions at the breakpoints and a 90 bp insertion of intron 17 sequence.
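Of the three methods, read-depth analysis is the simplest to sketch. The example below is a minimal illustration rather than the pipeline used here: the per-window depths are invented (chosen to resemble a ~15X baseline), and the 1.5x threshold and function name are assumptions. It also checks the duplication length from the mm10 coordinates reported above.

```python
from statistics import median


def call_gains(window_depths, min_ratio=1.5):
    """Return indices of windows whose depth is at least min_ratio x the median depth.

    Assumes gained windows are a minority, so the median reflects single-copy depth.
    """
    baseline = median(window_depths)
    return [i for i, depth in enumerate(window_depths) if depth / baseline >= min_ratio]


# Invented mean depths for consecutive windows across the locus (hemizygous male X).
window_depths = [15, 14, 16, 31, 29, 30, 15, 16]
gained_windows = call_gains(window_depths)
print(gained_windows)  # windows with roughly doubled coverage

# Length of the region with increased coverage, from the reported coordinates.
dup_length_bp = 83_874_709 - 83_737_872
print(dup_length_bp)   # 136837 bp, i.e. approximately 136.8 kb
```

Read depth alone shows only that extra copies exist; the read-pair and split-read evidence described above is what places the copies in tandem and resolves the junction.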

Figure 7 | WGS of 1A10 model identifies duplication of exons 18-30 followed by inversion of exons 31-34 in DMD

A) Schematic indicating split paired-end read interpretation in IGV for an inversion variant. A normal read pair has a left-right orientation and aligns back to the genome. Discordant reads with different orientations are coloured. Teal reads at one breakpoint indicate a left-left pair and blue reads indicate a right-right pair, together implying an inversion in the sequenced DNA. Arrows indicate end reads; a dash indicates paired-end reads, i.e. sequenced fragments are at a 150 bp maximum distance from each other. B) Schematic showing deletion of a fragment. Loci A and B are more than 150 bp apart and so would not be captured in the same paired-end read. If the DNA between these loci is deleted, the two loci are physically brought closer together. A read covering these two loci is coloured blue by the software to indicate that it is discordant. C) DNA from a male hemizygous 1A10 mouse was analyzed by Illumina HiSeq X at 15X coverage and aligned to the mouse mm10 reference genome. Screenshot of the WGS dataset visualized in IGV software shows increased read coverage of exons 18-30 and discordant reads (coloured green and teal) at 3’ breakpoints in intron 17, intron 30 and intron 34. D) Deeper resolution of the 3 identified breakpoints in DMD, suggesting a duplication followed by an inversion.

Figure 8 | Long range PCR and Sanger sequencing verification of WGS of 1A10 mouse model

PCR and Sanger sequencing at each breakpoint with primers indicated by black arrows and labeled based on location and intron (I) number. A) Analysis of intron 17 shows a 1,439 bp deletion (X : 83737128-83738567). B) A 90 bp insertion from intron 17 (X : 83874711 - 83895347) was detected, flanked by intron 30 and an inverted intron 34. C) The breakpoint junction of the duplication, localized to X:83737872-83874709 (136,837 bp), was confirmed to join intron 30 to intron 17.

In the 3A7 mouse line, next generation sequencing showed a short fragment joining intron 17 and intron 30, as expected in a duplication, but a full duplication was not present. This was confirmed by paired-end reads joining intron 17 to intron 30 without any change in read depth across exons 18-30 (Figure 9 A, D). Additionally, multiple discordant reads suggest a complex rearrangement comprising inversions and deletions involving introns 17, 30 and 32 (Figure 9 B-D). Further functional characterization was not performed on these lines as they did not contain the desired duplication mutation. Instead, efforts were focused on generating a new model with a tandem duplication only.


Figure 9 | Next generation sequencing analysis of 3A7 model identifies complex genomic rearrangement

A) DNA from a male hemizygous 3A7 mouse was analyzed by Illumina HiSeq X at 15X coverage and aligned to the mouse mm10 reference genome. Screenshot of the WGS dataset visualized by IGV software shows no increased read coverage of exons 18-30 and discordant reads at multiple loci. B) Deeper resolution of paired-end reads suggesting complex rearrangements involving inversions and deletions in intron 17, intron 30 and intron 32. C) Three discordant paired-end reads identified from WGS showing the fragment of DNA involved as well as its orientation. D-E) Schematic illustrating the complex rearrangement and various breakpoints in the DMD gene of this mouse model. Not to scale. D) PCR and Sanger sequencing show that at the 5’ breakpoint of the rearrangement, intron 30 is joined to intron 17; however, the fragments are not further joined to a full duplication but rather to an inversion of exons 19 to 29. E) The 3’ breakpoint contains a fragment of inverted intron 30 spanned between two inverted regions of intron 17 before joining to normal intron 32 and the rest of the DMD sequence.


4.2.5 Generation of Second Version of DMD Dup18-30-inv1A10 Mouse

A mouse line with a clean tandem duplication in the DMD gene was not identified in the first CRISPR-mediated model generation experiment, but such a line was essential to test duplication editing strategies in vivo, and so a second experiment was undertaken. Published editing frequency outcomes suggest a relatively lower occurrence of duplications as an editing event compared to inversions and, unlike duplications, inversions have been observed in zygote injection experiments. Taking this into consideration, I hypothesized that I could use zygotes from the 1A10 model, which already harboured the duplication of DMD exons 18-30, and revert the inverted fragment of DNA containing exons 31-34 to the wildtype orientation in a second round of CRISPR gene editing (Figure 10A).

To test this hypothesis, I designed sgRNAs that spanned the inverted fragment in the 1A10 mouse line. The 5’ sgRNA resided on the breakpoint junction of WT intron 30 joined to inverted intron 34 and took advantage of a unique insertion that was added to this breakpoint. The 3’ sgRNA was designed 6,502 bp into intron 34 after the inversion breakpoint, as the junction itself contains a long stretch of highly repetitive elements which affected the off-target score of a sgRNA targeted to this region. sgRNAs were designed using Benchling with the default “repetitive sequence masking” feature turned off in order to unmask repetitive elements. Candidate sgRNAs were further verified by screening their intronic target sites for proximity to repetitive regions of the intron.

With the TCP Animal Modeling Core, 1A10 embryos were injected with 2 sgRNAs spanning the inversion and SpCas9 mRNA. 32 founders were born and PCR screened for the duplication junction which, as expected, was present in all 1A10-derived founders. PCR screening was also performed for the predicted inversion breakpoints, and 2/32 (6.25%) founders were identified and named DMD Dup18-30 inv1A10 (inversion 1A10) #723 and #732. Of these, only 1 founder (#723) exhibited GLT of the re-inverted allele and maintenance of the duplication. Another 2/32 (6.25%) founders contained a deletion junction of the targeted fragment. None were found with a second duplication junction (Table 4). All subsequent characterization was conducted on DMD Dup18-30 inv1A10 #723, which displayed GLT and from which a colony could be established, herein called inv1A10.


Figure 10 | CRISPR-mediated Re-inversion of Complex DNA Rearrangement in zygotes of Dup18-30 1A10 mouse model

A) Schematic of the DMD sequence identified in the 1A10 mouse line. The duplication of exons 18-30 is indicated in green and the complex rearrangement including an inversion of exons 31-34 in beige. The relative locations of Cas9 and the sgRNAs used to treat 1A10 mouse zygotes are depicted here as scissors. An editing outcome predicted from this treatment is a re-inversion of the fragment intervening the two sgRNAs. Schematic not to scale. B) Sequencing of the re-inverted 5’ junction in the two identified founders, 723 and 732. C) Sequencing of the re-inverted 3’ junction in the GLT founder, 723. D) mRNA mapping of male F1 offspring from the 723 founder shows that exons 25-19 are joined, indicating that the duplication junction is present. The 723 line also has restored exons 31-34, as seen by the higher molecular weight band compared to 1A10. Introns are indicated by “I” followed by the corresponding number.


4.2.6 Second Version of DMD Dup18-30-reinv1A10 Mouse Shows Full Duplication With No Additional Dmd mRNA Altering Rearrangements

To determine if the inversion detected by breakpoint PCR was in fact a full re-inversion, mRNA from this locus was mapped. Figure 10B shows long range dystrophin mRNA mapping. The top panel confirms that the duplication junction is still retained in inv1A10, and the bottom panel shows amplification of exons spanning the previously identified inversion of exons 31-34, which manifested as a shorter mRNA transcript in 1A10. In inv1A10, the transcript length is the same as in WT mice. This suggested that the targeted fragment was re-inverted to the correct WT orientation. To confirm that this was in fact the case at the DNA level, WGS was performed followed by mapping of structural and copy number variants.

Based on mRNA sequencing, we expected that other than the tandem duplication of exons 18-30, any other structural variants that may have been generated in the second CRISPR targeting experiment were not affecting splicing of DMD mRNA. Nonetheless, to map these genomic changes we performed WGS with extensive SV and CNV calling and verification of breakpoints by long range PCR. As previously described for WGS of the first generation mouse models, these variants are 10-100 kb in length and therefore longer than the average read length of 150 bp, so three different methods based on paired-end next-generation sequencing were used to map exactly where they occur: read-depth, read-pair and split-read mapping.

Read-depth was analyzed using CNVnator (a CNV caller), which measures the deviation of sequencing depth as compared to the local genomic average. It can be used to detect duplications and deletions but not balanced SVs like inversions. Tables 6 and 7 show the identified tandem duplication and the small deletion in intron 17, which were intrinsic to the 1A10 line from which the inv1A10 line was derived. Figure 10C shows a screenshot from the IGV software-based display of sequencing reads where each gray line represents a sequencing paired-end read. An increase in gray bars indicates a duplication and an absence of reads indicates a deletion.

A second caller, Manta, was used to identify SVs based on read-pair and split-read assembly from the paired-end sequences generated in NGS. Read-pairs have consistent sizes and orientations of the two ends that map to the reference genome. They are fairly short fragments of a few hundred bp and therefore do not typically span large CNVs or SVs; however, an assembly of reads can give good coverage. If one or both of the paired ends in a read align to two separate locations in the reference genome, a split-read is called and, depending on the orientation, coloured differently. Duplication reads are green, inversion reads are teal on one end and blue on the other, and deletion reads are blue. Split-reads need to be assembled in order to piece together and resolve the exact breakpoint junctions.

All identified SVs called by Manta in DMD are listed in Table 7. Both callers identified 2 SVs distinct to the 1A10 mouse. The first is the tandem duplication at X:83737872-83874709 with a total length of 136,837 bp, which CNVnator further confirmed with a normalized read depth of 1.99607. Both also identified the 1,439 bp deletion (X:83896501-83899000) in intron 17, which was present in 1A10 from the first targeting. All three inversions called by Manta map to the same complex rearrangement and are therefore linked by the same Manta ID, MantaINV:22353; they are unique to inv1A10. All three are found at the 3’ end of the inversion junction. They were further characterized by long range PCR. The Sanger sequencing and assembly of mapping is summarized in Figure 10D. They did not affect splicing in this intron as mRNA sequencing showed inclusion of both exons 33 and 34.
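The read-pair logic described above can be illustrated with a minimal sketch. It assumes the standard Illumina paired-end convention (a concordant pair has the left mate on the forward strand and the right mate on the reverse strand) and is not Manta's actual algorithm. The `ReadPair` class, the `classify_pair` function and the `max_insert` threshold are hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """One paired-end read: chromosome, position and strand of each mate."""
    chrom1: str
    pos1: int
    strand1: str  # '+' or '-'
    chrom2: str
    pos2: int
    strand2: str  # '+' or '-'


def classify_pair(pair: ReadPair, max_insert: int = 1000) -> str:
    """Return the SV signature suggested by a single read pair.

    max_insert is an assumed upper bound on the normal fragment size;
    a real caller would estimate it from the library's insert-size distribution.
    """
    if pair.chrom1 != pair.chrom2:
        return "translocation"
    # Order the mates by genomic position so that 'left' is upstream.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        # Both mates on the same strand (left-left or right-right).
        return "inversion"
    if left_strand == "-" and right_strand == "+":
        # Mates point away from each other (everted pair).
        return "tandem_duplication"
    if right_pos - left_pos > max_insert:
        # Normal orientation but mates map too far apart.
        return "deletion"
    return "concordant"


if __name__ == "__main__":
    # Toy pair straddling the reported duplication junction (positions illustrative).
    junction_pair = ReadPair("X", 83874600, "+", "X", 83737950, "-")
    print(classify_pair(junction_pair))
```

In this sketch a single pair only suggests a signature; as noted in the text, many pairs and split reads need to be assembled before a breakpoint can be resolved.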

Table 6 | Copy Number variants called by CNVnator

CNV Type     Coordinates           Size     Normalized Read Depth   P-val
duplication  X:83738501-83874500   136000   1.99607                 0.00191515
deletion     X:83896501-83899000   2500     0.184023                0.0306122
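As a rough illustration of how the normalized read depth values in Table 6 can be interpreted, the sketch below classifies a depth ratio relative to a baseline. It is a simplified stand-in and not CNVnator's actual method; the function names and the 1.5 and 0.5 cut-offs are assumptions made for this example.

```python
from statistics import median


def classify_ratio(ratio: float, dup_ratio: float = 1.5, del_ratio: float = 0.5) -> str:
    """Classify a normalized read depth (1.0 = baseline copy number).

    The cut-offs are illustrative assumptions, not CNVnator parameters.
    """
    if ratio >= dup_ratio:
        return "duplication"
    if ratio <= del_ratio:
        return "deletion"
    return "normal"


def call_bins(depths: list[float]) -> list[tuple[float, str]]:
    """Normalize per-bin read depths to their median and classify each bin."""
    baseline = median(depths)
    if baseline == 0:
        raise ValueError("median depth is zero; cannot normalize")
    return [(round(d / baseline, 2), classify_ratio(d / baseline)) for d in depths]


if __name__ == "__main__":
    # Normalized read depths reported in Table 6.
    print(classify_ratio(1.99607))   # duplication
    print(classify_ratio(0.184023))  # deletion
    # Toy per-bin depths at roughly 15X baseline coverage.
    print(call_bins([15, 14, 16, 30, 31, 29, 15, 3, 15]))
```

A normalized depth close to 2 is what would be expected when a region present once in the reference is present twice in the sample, which is consistent with the tandem duplication described above.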

Table 7 | Structural Variants called by Manta

#CHROM  POS Start  ID                        REF first N  ALTeration  Pos END   SV TYPE  SV LEN          SV INS LEN
X       83737128   MantaDEL:22353:0:1:0:0:0  A                        83738567  DEL      -1439
X                  MantaDUP:22353:1:3:0:0:0  T
X       83737964   MantaINV:22353:1:2:0:0:0  G                        83895347  INV      157383
X       83874711   MantaINV:22353:2:3:0:0:0  C                        83895347  INV      20636           90
X       83895465   MantaINV:22353:2:4:0:0:0  T                        83898707  INV      3242
X       83888795   MantaINV:22356:0:1:0:0:0  C                        83896752  INV      7957;IMPRECISE

SV INS SEQ:
MantaINV:22353:2:3:0:0:0  TAAAAATTTTTCATTTCTTGGAGTTCATTGCTTATAGTCCTGTCTTTCTTAAGTA

Annotation based on long range PCR break point mapping:
MantaDEL:22353:0:1:0:0:0  Deletion within intron 17
MantaDUP:22353:1:3:0:0:0  7 ending in intron 30
MantaINV:22353:2:3:0:0:0  Portion of intron 17 inserted in 5’ of previously inverted breakpoint (1A10) which joined intron 30 with an inverted intron 34. In inv1A10 it is on 3’
MantaINV:22353:2:4:0:0:0  Inversion downstream of 5’ sgRNA in WT intron 30 unique to inv1A10
MantaINV:22356:0:1:0:0:0  Inversion downstream of 5’ sgRNA + targeted i34 3’ of sgRNA in WT intron 30 unique to inv1A10


4.2.7 Dmd Dup 18-30 inv1A10 Mouse Model is Dystrophin Deficient and Shows Dystrophic Muscle Pathophysiology as well as Impaired Skeletal Muscle Function

To confirm that inv1A10, herein referred to as Dmd Dup18-30, was in fact a dystrophin deficient model, and to assess the effect of the pathogenic duplication in the Dmd gene, characterization was performed on male inv1A10 mice and their WT littermates at 15 weeks of age. The Tibialis Anterior (TA) and Triceps (Tri) muscles from each mouse were probed for dystrophin expression by immunofluorescence (Figure 11A,B,E). As expected, in Dup18-30 mice only 4.74% (n=5, p<0.0001) of TA myofibers and 4.34% (n=5, p<0.0001) of triceps myofibers expressed dystrophin, likely revertant fibers, as compared to 100% expression in both tissues of WT mice (n=4). Furthermore, histological analysis with H&E staining showed characteristics of dystrophic pathophysiology. Centrally localized nuclei were observed in 65.8% of Dup18-30 TA myofibers (p<0.0001) and 80.19% of triceps myofibers (p<0.0001) as compared to <1% in WT mice (Figure 11C,F). Also, a wider distribution of fiber sizes as measured by Feret’s diameter was observed in the TA and triceps of inv1A10 mice. Both findings suggest muscle undergoing cycles of regeneration.

To determine if loss of dystrophin also impacted muscle function, the mobility and strength of this cohort were tested. All functional tests were performed by blinded trained operators at the Lunenfeld-Tanenbaum Research Institute’s Centre for Modeling Human Disease Mouse Phenotyping Facility. Dup18-30 (n=12) and WT (n=9) mice were recorded in an open field chamber for 20 min. Using the VersaMax Animal Activity Monitoring system, their mobility patterns were analyzed (Figure 12). Overall, Dup18-30 inv1A10 mice showed significantly reduced mobility compared to WT. Average speed in inv1A10 mice was 8.85 cm/s (± 0.82) compared to 13.65 cm/s (± 0.48) in WT (***p=0.0002, SEM, Figure 12A). Dup18-30 inv1A10 mice also had less vertical activity as measured by the number of rearing events, 99.92 (± 30.07) compared to 292.8 (± 26.04) in WT (Figure 12B, ***p=0.0002, SEM), and travelled less, with an average distance of 3998 cm (± 1066) compared to 10372 cm (± 751) in WT (Figure 12C, ***p=0.0002, SEM). Grip strength of forelimbs was also weaker in Dup18-30 mice, with an average of 3.007 g/g (± 0.1612) compared to 3.948 g/g (± 0.09487) in WT (Figure 12D, p=0.0004, SEM). Taken together, Dup18-30 mice did in fact have impaired hindlimb and forelimb function, which affected mobility and strength.


Figure 11| Histological characterization of Dmd Dup18-30 inv1A10 mice

A) Immunofluorescence with antibodies against dystrophin (left panel) and H&E staining (right panel) on TA and Tri muscle cross-sections of 15 week old Dup18-30 inv1A10 mice and their WT littermates. A representative sample from each genotype is shown. Scale bar, 130 µm. B) IF quantification of the percent of dystrophin positive myofibers in TA from Dup18-30 inv1A10 (red, n=6) and WT (black, n=5) mice. C) Histological quantification of the percent of myofibers with centrally localized nuclei in TA muscles. D) Mean minimum Feret diameter of TA myofibers. E) Percent of dystrophin positive fibers in Tri. F) Percent of myofibers with centrally localized nuclei in Tri muscles. G) Mean minimum Feret diameter of Tri myofibers. Data are presented as means ± s.d. (B,C,E,F). Statistical analyses were performed using a Student’s t-test. ****p<0.0001, ***p<0.001, *p=0.05.


Figure 12| Functional characterization of Dmd Dup18-30 inv1A10 mice

A) Functional assessment in open field chamber of average speed, B) number of rearings on hindlimbs and C) total distance travelled. D) Grip strength (GS) measured as grams of force of forelimbs normalized to weight. Data are presented as means ± S.E.M. Statistical analyses were performed using a Student’s t-test. ****p<0.0001, ***p<0.001, *p=0.05.


4.3 Discussion and Conclusions

Generating targeted genome alterations, like those required to make animal models for SVs, requires two steps: first a targeted DSB, and second a repair of the cut via cell-intrinsic repair pathways. The initial DSB can be made by targeted nucleases like the Cas9 and sgRNA system. The DSBs made by Cas9 are then repaired by the cell’s own repair machinery and the process is typically seamless. When the repair mechanism makes an error and the DNA sequence is changed, we can detect a “gene editing outcome”. Generating an animal model for an editing outcome like a duplication therefore relies on the intrinsic cellular machinery making an error and generating a duplication instead of seamlessly fixing the break. The frequency at which a desired editing outcome is made relies on temporal and/or cellular constraints of the repair mechanism required to make it. The most common repair error is an indel. The efficiency of generating animal models harboring indels using a single cutting sgRNA is in the range of 78%-100% 122-127. Indeed, this approach generates knock-out models. Comparatively, the frequency of generating structural variants like a duplication is in the range of <2% for autosomal genes, and some studies failed to detect any duplications at all 85-88. To our knowledge, a duplication in an X-chromosome gene has not yet been generated. In this study, duplications were detected at 0.4% in an X-linked gene, which is within the previously reported range. In fact, while duplications are the rarest structural variant, their occurrence has been described in multiple murine model generating studies including this one (Table 1 and 4). This rarity is likely due to the more complicated repair mechanisms required to generate duplications as compared to indels and other structural variants like deletions and inversions.

In order to generate an animal model harboring a rare editing mutation, high throughput screening is required. In the case of a duplication, where a best case scenario of a 1 in 50 chance of a duplication occurring is expected, we decided to generate our duplication model by editing murine embryonic stem cells instead of embryos directly, as this would allow for screening of hundreds of clones prior to implantation; with direct embryo targeting we would only be able to screen after birth. And so, even though direct genome modification in embryos is favorable in that it alleviates the need to work with embryonic stem cells, which is laborious and time-consuming 124, it limits the number of animals that can be screened. Unfortunately, the embryonic stem cell line used by the TCP Animal Modeling Core, C57BL/6NTac-C2-derived mouse embryonic stem cells, is male 100. Without a homologous chromosomal copy of the targeted X-linked Dmd gene, I suspected that there could be a temporal constraint that could further lower the frequency of duplication generation. There would be a limited time frame in which the extra copy of the genetic fragment required to make a duplication would be available, namely only during mitosis when a sister chromatid was present.
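The screening argument above can be made concrete with a short calculation. The sketch below assumes each screened clone is an independent trial with the same probability of carrying the duplication, which is a simplification; the function names and the 95% confidence target are illustrative choices and not part of the original study design.

```python
import math


def prob_at_least_one(n_clones: int, p_event: float) -> float:
    """Probability of recovering at least one positive clone among n_clones."""
    return 1.0 - (1.0 - p_event) ** n_clones


def clones_needed(p_event: float, confidence: float = 0.95) -> int:
    """Smallest number of clones giving at least `confidence` chance of one positive."""
    if not 0.0 < p_event < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("p_event and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_event))


if __name__ == "__main__":
    # Best-case frequency from the literature (1 in 50) and the 0.4% seen here.
    print(clones_needed(0.02))   # 149
    print(clones_needed(0.004))  # 748
```

Under these assumptions, hundreds of clones must be screened for a reasonable chance of recovering a single duplication, which is the practical motivation for editing embryonic stem cells rather than embryos.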

Furthermore, the application of CRISPR to generate models harboring duplications is new and there is currently no accepted consensus on what factors and parameters influence the frequency of duplication-associated repair. Factors that may affect the frequency of generating a duplication include: size of targeted fragments, genomic architecture surrounding the targeted loci, sgRNA selection, as well as whether mESC or embryos are being targeted directly. However, the influence of these factors has yet to be delineated, and it will become clearer as more systematic studies testing these parameters on duplication editing frequencies are published.

Additionally, thus far exploration of CRISPR/Cas9 mediated generation of models harboring structural variants has been limited to the immediate vicinity of the target site and distal predicted off-target sequences 36,85-88,128-132. Although this is a quick and cost-effective screening approach, the caveat to PCR screens is that they are limited to predicted breakpoint junction screening. This approach may not identify all mice with rearrangements as PCR amplification requires that the primer sites are retained at the junctions. Alternatively, it may also identify false positives as efficient PCR amplification is limited to a few thousand base pairs and so the full length of a structural variant (especially a +100 kb duplication) cannot be captured by this approach. In our study, although we did identify and establish mouse lines harboring duplication junctions, further characterization revealed unexpected complex rearrangements with breakpoints in untargeted loci.

Furthermore, the location, nature and size of the unexpected rearrangements differed between the two clones, which were both generated using the same sgRNAs. And so, without mRNA and NGS characterization it would have been challenging to predict where all of the breakpoints could be. Since our discovery, other groups have also reported similar off-target effects in adjacent untargeted sites 133-135. It is unclear exactly how these complex rearrangements could have occurred in the 1A10 and 3A7 mouse lines. Furthermore, it will be important to determine if a clean tandem duplication can be made in a second experiment with different sgRNAs. Alternatively, elements in the DMD gene could induce formation of complex rearrangements if double stranded breaks are made in these introns. Interestingly, both of the duplication positive clones generated with the same sgRNAs in the same experiment resulted in complex rearrangements but with a distinctively different nature and size. Additionally, introns 30 and 34, which were part of the unintended complex rearrangement in 1A10, do have a high level of repetitive elements and this feature has been linked to SV formation with multiple breakpoints in a rare DMD case. Our study revealed hidden complexities associated with generating large structural rearrangements and contributed to the hypothesis that the DMD gene is prone to structural rearrangements, as seen by the mutation spectrum where almost 80% of pathogenic mutations are large SVs including duplications, deletions and inversions and only 20% are point mutations 10.

Despite the initial complication in generating a tandem duplication, I was able to generate the Dmd Dup18-30 inv1A10 murine model. This is the first multi-exon duplication model of DMD. Although transgenic models with extra copies of genes have been developed through random integration into the genome, there is a lack of models which contain actual genomic rearrangements that resemble those found in patients, i.e. a tandem duplication. This is likely due to the inherent difficulties in engineering these types of SVs. This limited number includes autosomal duplication models described by other groups 136-140 and an X-chromosome model of Pelizaeus-Merzbacher disease 84, all made using insertional targeting vectors. In the DMD field, one pseudo duplication model has been described, a murine model where a second copy of murine exon 2 was knocked into intron 1 35. This model does not replicate the structure of pathogenic duplications seen in DMD patients as it is not a tandem duplication. As such, mutation specific therapeutic approaches could not be tested in this model, including our previously described CRISPR single sgRNA editing strategy, which can generate the necessary two cuts only if a duplication is tandem. Lastly, the mean size of duplications in the DMD gene as reported by the Leiden database is 9.26 exons, approximately 98,000 bp, whereas the exon 2 knock-in mouse has a single exon duplication totaling only ~500 bp. In order to test the clinical translatability of our gene editing approach, we required a multi-exon tandem duplication model that could more accurately reflect the patient population. As such, I generated a model which harbors a more clinically relevant tandem multi-exon duplication, with a 13 exon duplication totaling 136,837 bp.

The Dmd Dup18-30 inv1A10 model is the first multi-exon duplication model to be generated, and this pathogenic mutation does in fact lead to a lack of dystrophin expression, dystrophic pathophysiology and impaired skeletal muscle function. The dystrophic pathophysiology is apparent but mild compared to the genetically equivalent human disorder, Duchenne muscular dystrophy. This species-specific phenomenon was also observed in other dystrophin negative murine models such as the mdx model, the recently described exon 2 duplication model and a ΔEx50 model 35,36. Future experiments should be aimed at performing time course analysis on this novel mouse model and utilizing proof-of-concept DMD treatments.

In the end, a mouse model harboring the intended duplication was obtained. As a result, this study has far-reaching implications for the application of CRISPR-based gene editing to generate large structural variants and chromosomal rearrangements in mice.

4.4 Acknowledgements

Thank you to the following individuals and core facilities for data and/or analysis contributions in this chapter: Drs. Cohn and Ivakine for input on conceptualization, data analysis including WGS dataset CNV analysis. The TCP Model Production Core for mouse production and all open field and grip strength functional testing. Ella Hyatt and Kyle Lindsay for mouse colony husbandry, genotyping, sample collections. My summer and undergraduate thesis students Amanda Chiodo, Georgia Besant, Gina Desatnik for help with vector cloning, genotyping, tissue sample preparation and scoring. For the three generated models, conceptualization, sgRNA design, founder screening, WGS analysis, CNV genetic mapping, mRNA mapping was performed by me. Additionally, I performed histological characterizations, functional testing analysis for 1A10 and 3A7 cohorts, as well as initial inv1A10 cohorts. Additional expansion of the number of mice analyzed for molecular and histology testing in the inv1A10 cohort were performed by Eleonora Maino and Sonia Evangelou.


Single sgRNA CRISPR-mediated Genome editing Restores Full-Length Dystrophin and Improves Muscle Function in first 137 kb Duplication Mouse Model of Duchenne Muscular Dystrophy

5.1 Overview

There is accumulating proof-of-concept evidence for the effectiveness of CRISPR/Cas9 gene editing as a treatment for disorders caused by pathogenic point mutations and larger deletion CNVs. However, the feasibility of and best approach for utilizing CRISPR to repair large duplication CNVs (>1 kb) has yet to be ascertained. Duchenne muscular dystrophy is an ideal disorder for developing a gene editing treatment as it is one of the more common neuromuscular disorders and duplications comprise approximately 10% of the DMD mutation spectrum 10. Additionally, no therapeutic approach has been described that would be amenable to patients with duplications in DMD and that restores full-length dystrophin protein.

Exon skipping is a promising disease-modifying strategy for DMD caused by deletions; however, this approach does not restore full-length dystrophin but rather aims to convert DMD into a milder allelic variant, BMD. This approach was explored by other groups using the CRISPR/Cas9 gene editing system in patient cells 107 and canines 141 harboring deletions, as well as in the mdx mouse harboring a point mutation 118,120,142,143. In all cases, certain exons were deleted or reframed to create a shorter dystrophin product.

As for duplications, our group was the first to describe a duplication removal approach in patient cells harboring an exon 18-30 duplication (described here in Chapter 3 and published in 109). Another group later confirmed removal of duplicated sequences in vitro using the same single sgRNA approach targeting a different duplication, of exon 2 only 144. More recently, a third group verified the strategy by removing a duplication of exons 55-59 in a DMD iPSC line 145. This approach had yet to be tested in vivo.


Furthermore, all previous studies on CRISPR/Cas9-mediated editing in mouse models used either the mdx mouse 117-121, which harbours a nonsense mutation, or the ΔEx50 mouse, which harbours a single exon deletion 36. Neither model is amenable to a duplication correction strategy. As such, I set out to generate the first tandem multiexon duplication mouse model for DMD. This model would in turn be used to test our previously described CRISPR single guide duplication removal strategy. Our new model mimics a patient duplication of DMD exons 18-30 which we successfully repaired in cultured cells using CRISPR-mediated gene editing 109.

Given our successful removal of a large exonic duplication in DMD patient cells as described in Chapter 3, I next wanted to test the same single sgRNA approach in vivo in the newly generated Dmd Dup 18-30 inv1A10 mouse model described in Chapter 4, herein called Dup 18-30. I hypothesized that using our previously published CRISPR-mediated one cut genome editing approach we could remove the pathogenic mutation in vivo and restore full-length dystrophin expression. Depending on the efficiency of editing, this approach could also improve pathophysiology.

5.2 Results

5.2.1 Design of sgRNA for Duplication Removal

To determine if our previously described (Chapter 3 and 109) single sgRNA duplication removal approach provides therapeutic benefit, I set out to test its efficacy in vivo. The duplication model to be studied was generated and characterized in Chapter 4. Engineering of CRISPR/Cas9 targeting required a sgRNA targeting theoretically anywhere within our exon 18-30 duplication. I designed sgRNAs targeting duplicated introns instead of exons to lower the chances of disrupting the reading frame. Briefly, all introns contained within the duplication (introns 18-29) were scanned using Benchling and the 8 sgRNAs with the lowest off-target score were selected. I then transfected the candidates with a SaCas9-expressing plasmid into C2C12 cells without selection. In Chapter 3 we utilized the SpCas9 species of Cas9 as it was at that time the only one available. Since then, other Cas9 species that are smaller in size, like SaCas9, have been described. The SaCas9 coding sequence is 1 kb shorter than that of SpCas9 and has equally efficient nuclease activity. As such, it is a better candidate for clinical translation as it can fit in an AAV delivery vehicle 82. Next, I assessed the editing efficiency, based on detection of indels consistent with CRISPR/Cas9 cutting and error-prone DNA repair, using the ICE algorithm 146. ICE enables batch analysis of CRISPR editing using Sanger sequencing data. Table 8 summarizes the detected editing efficiency of the candidate sgRNAs. sgRNA #299 i21 g7 (herein called sgRNA i21), targeting intron 21, had the highest average efficiency at 17% indels (Figure 13A), and it was selected to be subcloned into a vector containing SaCas9 and packaged into AAV9 particles by Vigene Biosciences Inc.
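To illustrate what scanning an intron for candidate SaCas9 target sites involves, the sketch below searches the forward strand of a sequence for protospacers followed by the canonical SaCas9 PAM, 5’-NNGRRT-3’ (R = A or G). This is a minimal illustration only and does not reproduce Benchling's design or off-target scoring. The function name is hypothetical, the 20 nt protospacer length is an assumption chosen to match the guide lengths listed in Table 8, and the reverse strand is ignored for brevity.

```python
import re


def find_sacas9_sites(seq: str, protospacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for each forward-strand SaCas9 site.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"(?=([ACGT]{%d})([ACGT]{2}G[AG][AG]T))" % protospacer_len
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Synthetic toy sequence (not from Dmd): 20 nt protospacer + NNGRRT PAM.
    toy = "TT" + "A" * 20 + "CCGAGT"
    for start, protospacer, pam in find_sacas9_sites(toy):
        print(start, protospacer, pam)
```

In practice, each candidate found this way would still need to be ranked by predicted off-target activity and then tested empirically, as was done here with ICE analysis in C2C12 cells.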

5.2.2 Experimental Design of CRISPR-Mediated Treatment of Dmd Dup18-30 (inv1A10) Mouse Model with Single sgRNA Approach

In Chapter 3, I described a single sgRNA approach to removing large genomic duplications, which was tested in cells from a patient harboring an intragenic duplication in DMD. Next, I tested the efficacy of this approach in a genetically relevant animal model. This experiment would test whether the cells’ intrinsic repair mechanisms could repair the generated DSB in a manner that would remove the targeted fragment and generate a functional Dmd gene, and at what efficiency. Previous CRISPR-based Dmd editing studies showed expression of dystrophin protein after 4 weeks and noted increased improvement over time 118,119,143. Furthermore, systemic delivery through injection into the temporal vein of neonatal mice with a 7 week treatment window was previously shown by our group to promote phenotype restoration in a model of congenital muscular dystrophy 102. As such, we used AAV9 91, which has tropism for cardiac and skeletal muscle, to deliver SaCas9 and sgRNA i21 to postnatal day 1 (P1)-P2 Dup18-30 neonates. 50 µl of AAV solution (3x10^12 genome copies (GC)/mouse) was injected through the temporal vein. 7 weeks later, histological and functional analyses were carried out (Figure 13B).


Table 8 | ICE analysis of candidate sgRNAs targeting murine Dmd

Guide ID       Sequence               Editing Efficiency (ICE)
#154 mi20      GATGGCCTTAAGGCCCCACT   0%
#155 mi21-1    AAGAGGATCGCGACTACTAG   0%
#156 mi21-2    AAGTACACTTAAATGCCGGT   9.5%
#158 mi29      CTAGGTCGTGTAGACATCGC   0%
#299 i21 g7    GCTGTAATGCCACCTAGGAA   17%
#302 i27 g10   TGGAGCATAAGTAACCTCAC   0%
#303 i29 g11   GGAGACTTGCCCCAATGTAG   7%
#304 i29 g12   GTAGGTAGAGAGTCCTACAT   1%


Figure 13 | sgRNA Design and Experimental Design

A) Sanger sequencing from cells treated with the CRISPR editing components, Cas9 and sgRNA i21, was analyzed for the number of reads with indels using the ICE algorithm. The nature of each identified indel is listed in order of frequency. Vertical dotted lines denote the expected cut site. The type of indel identified is denoted by + for an insertion or – for a deletion of the indicated number of nucleotides (N). B) Schematic illustrating the experimental timeline wherein Dmd Dup18-30 neonatal mice (P1-2) were injected through the temporal vein with AAV9 virus containing Cas9 with sgRNA i21. Cohorts were functionally tested and tissues collected 7 weeks post treatment.


5.2.3 CRISPR single sgRNA Treatment Restores Full-Length Dystrophin Expression and Improved Pathophysiology in Skeletal and Cardiac Muscles

To evaluate the efficacy of our gene editing treatment, we first determined if full-length dystrophin protein could be detected in treated Dup18-30 mice. Heart, TA, and triceps muscles from mice injected with AAV9-sgRNAi21 were evaluated for dystrophin expression by IF (Figure 14) and Western analysis (Figure 15). Dystrophin positive fibers were seen in both cardiac and skeletal muscle. Furthermore, Western blot analysis with antibodies against the N-terminus of dystrophin confirmed that it was in fact full-length protein that was being detected. Densitometric analysis of the Western blot showed that dystrophin in TA reached 23.13% (±7.13, n=2) of wildtype expression levels after CRISPR treatment, as compared to 2.55% (±0.49, n=2) in untreated mice (Figure 15A,C). In heart tissue, 29.95% (±11.65, n=4) of wildtype dystrophin levels was detected in CRISPR-treated mice and 2.35% (±0.60, n=2) in untreated mice (Figure 15B,D). This demonstrated successful genetic removal of the 137 kb fragment, leading to translation from a restored WT transcript from the edited gene. To determine if restored dystrophin expression impacted pathophysiology, H&E staining was performed on CRISPR-treated mouse muscle tissue. It showed improved pathophysiology, as demonstrated by fewer regenerated fibers with central nuclei (Figure 16). Quantification showed that in TA muscles, 21.59% (±3.534) of myofibers had centrally located nuclei compared to 68.61% (±2.879) in untreated animals (Figure 16B, p<0.0001). In triceps, 30.17% (±6.133) of myofibers had centrally located nuclei, which was significantly less than the 76.09% (±2.683) in untreated animals (Figure 16C, p=0.0005).


Figure 14 | Dmd Dup18-30 Mice Treated with CRISPR show restoration of full-length dystrophin protein

Representative IF images with antibodies against dystrophin show edited fibers expressing the protein in heart, TA, and triceps. Scale bar, 100 µm.


Figure 15 | Western analysis of protein expression in CRISPR Treated Dmd Dup18-30 Mice

Western blot analysis to assess restoration of full-length dystrophin with antibodies against dystrophin (Dys). Western analysis was also used to assess expression of HA-tagged Cas9 and the calnexin (CNX) loading control in A) TA tissues and B) heart tissues for a titrated WT sample, 2 untreated mice and 4 treated mice. Densitometric analysis of dystrophin levels normalized to the CNX loading control in C) TA and D) heart.


Figure 16 | Dup18-30 Mice Treated with CRISPR show histological and functional improvement

A) H&E staining of TA and triceps tissues from WT, untreated Dup18-30, and CRISPR-treated Dup18-30 mice at 7 weeks post treatment. Scale bar, 50 µm. Quantification of centrally located nuclei in B) TA and C) triceps. Data presented as mean ±SEM. Statistical analyses were performed using Student’s t-test: ****p<0.0001, ***p<0.001, **p<0.01, *p<0.05.


5.2.4 CRISPR Treated Mice Show Functional Improvement

To determine whether the level of dystrophin restoration after gene editing treatment was sufficient to confer a functional benefit, we performed mobility and strength assessments using open-field testing and grip strength as previously described in Chapter 4.6. Treated mice performed significantly better than untreated mice in all functional tests and were more comparable to WT mice.

To ascertain any changes in behaviour and mobility, mice were individually placed in a plexiglass chamber where their motion was recorded for 20 min and analyzed by software. Measures obtained from open-field testing included average speed, total distance travelled, vertical activity, and total resting time. The average speed of Dup18-30 treated mice was 10.51 cm/s (±0.67 SEM, n=5), which was not significantly different from the WT average speed of 10.39 cm/s (±0.50, n=9) but significantly faster than that of untreated mice, which had a speed of 6.84 cm/s (±1.26, n=5, p=0.0329) (Figure 17A). Treated mice also travelled farther, covering an average distance of 4603 cm (±562, n=5) compared to 1278 cm (±420, n=5) for untreated mice. This distance was not significantly different from that of WT mice, which travelled 5732 cm (±544, n=9, p=not significant (n.s.)) (Figure 17B). The same trend was observed in vertical activity. Dup18-30 treated mice had more vertical activity, measured as the number of rearing events, with 78.6 events (±18.9, n=5) compared to 3.8 events (±1.8, n=5) in untreated mice (p=0.0035). There was no significant difference in rearing between Dup18-30 treated mice and WT mice, which had 135 events (±18, n=5, p=n.s.) (Figure 17C). To test fatigue, the total resting time was measured. Dup18-30 treated mice spent on average 680.61 s (±2.88, n=5) of the 20 min (1,200 s) test not moving, whereas untreated mice rested for a significantly longer period, on average 1026 s (±50.85, n=5) (Figure 17D, p<0.0001). Dup18-30 treated mice were more similar to WT mice, which rested for 656.2 s (±28.2, n=9, p=0.0285). Taken together, Dup18-30 treated mice outperformed untreated animals, suggesting that the efficiency of our CRISPR treatment was sufficient to confer a functional benefit.

The mean forelimb grip strength normalized to body weight was higher in Dup18-30 treated mice, with an average of 2.568 g/g (±0.2068, n=5), compared to 1.638 g/g (±0.3221, n=5) in the untreated model (Figure 17E, p=0.0412). Although treated mice were significantly stronger than untreated mice, their strength was not completely restored to the level of WT mice, whose average score was 4.876 g/g (±0.0947, n=9, p<0.0001).
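The treated versus untreated comparisons in this section can be checked from the reported summary statistics alone. The sketch below computes an unpaired Student's t statistic from mean, SEM and n, assuming the ± values are SEM and that an equal-variance test was used (as stated in the figure legends). The function name is hypothetical, and the code is an illustrative check rather than the analysis pipeline used for the thesis.

```python
import math


def students_t_from_summary(mean1, sem1, n1, mean2, sem2, n2):
    """Unpaired, equal-variance Student's t statistic from mean, SEM and n."""
    var1 = (sem1 ** 2) * n1          # SEM = SD / sqrt(n), so variance = SEM^2 * n
    var2 = (sem2 ** 2) * n2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se_diff, df


# Two-tailed critical value of t for alpha = 0.05 with 8 degrees of freedom.
T_CRIT_DF8 = 2.306

# Average speed (cm/s): treated vs untreated, values as reported in the text.
speed_t, speed_df = students_t_from_summary(10.51, 0.67, 5, 6.84, 1.26, 5)
# Normalized grip strength (g/g): treated vs untreated.
grip_t, grip_df = students_t_from_summary(2.568, 0.2068, 5, 1.638, 0.3221, 5)

print(round(speed_t, 2), speed_df)   # 2.57 8
print(round(grip_t, 2), grip_df)     # 2.43 8
```

Both statistics exceed the critical value for 8 degrees of freedom, which is consistent with the reported p=0.0329 for average speed and p=0.0412 for grip strength.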


Figure 17 | Dup18-30 Mice Treated with CRISPR show functional improvement

Open-field testing analysis of A) average speed, B) total distance travelled, C) vertical activity, and D) total resting time. E) Grip strength testing of forelimbs normalized to body weight. Data presented as mean ±SEM. Statistical analyses were performed using Student’s t-test: ****p<0.0001, ***p<0.001, **p<0.01, *p<0.05.


5.3 Discussion and Conclusions

Since the discovery of the dystrophin gene in 1987 4, numerous pharmacologic approaches have been developed and tested in clinical trials aimed at delaying disease progression by targeting the affected gene or its products, including oligonucleotide-based exon skipping, nonsense mutation readthrough, and gene transfer therapy. These are mutation-specific therapeutic approaches and, of these, only gene transfer would be amenable to patients who harbour pathogenic duplications in DMD. However, this approach generates a truncated dystrophin, essentially converting a DMD phenotype to the milder BMD. More recently, CRISPR-mediated gene editing strategies have been described 36,117-121, but all aimed at generating a shortened dystrophin protein, either to treat deletions or to treat point mutations through exon skipping; this strategy is neither amenable to patients with duplications nor capable of restoring full-length dystrophin. Here, I describe the first treatment approach that enables restoration of full-length dystrophin in patients with a pathogenic duplication; this approach is applicable not only to DMD but also to BMD, which is caused by mutations in the same gene.

Our gene editing approach is based on the hypothesis that one sgRNA could be used to generate two cuts in a tandem duplication allele, and that cell-intrinsic repair mechanisms would be capable of repairing the cuts in a manner that removes the intervening sequence. In Chapter 2, and in our 2016 publication 109, we tested this single sgRNA approach in cells from two different patients, both harboring a tandem duplication: one a 139 kb duplication in DMD and the other a 278 kb duplication in MECP2. In both cases we were able to show successful repair of the duplication allele and, for DMD, full-length dystrophin expression from the repaired gene. Based on this work, I tested for removal of a duplication in an in vivo model. The in vivo model used for testing was the Dmd Dup 18-30 inv1A10 mouse model described in Chapter 4.
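The logic of the single sgRNA hypothesis can be illustrated with a toy string model: in a tandem duplication the guide's target site occurs twice, and rejoining the outer ends after two cuts leaves one copy of the duplicated unit. The sketch below is a minimal illustration under strong simplifying assumptions. The sequences are arbitrary placeholders rather than Dmd sequence, the function name is hypothetical, and the model ignores indels at the junction and other repair outcomes such as inversions.

```python
def remove_tandem_duplication(allele: str, target: str, cut_offset: int) -> str:
    """Cut at both copies of `target` and rejoin the outer ends.

    cut_offset is the position of the cut within the target site.
    """
    first = allele.find(target)
    second = allele.find(target, first + 1) if first != -1 else -1
    if first == -1 or second == -1:
        # Fewer than two sites: there is no intervening fragment to excise.
        return allele
    # Keep everything left of the first cut and right of the second cut.
    return allele[: first + cut_offset] + allele[second + cut_offset:]


# Placeholder sequences: two flanks and a duplicated unit containing the target.
flank_5, flank_3 = "AAAA", "TTTT"
unit = "CCTAGGCATTGACG"
target = "GGCATTGA"

duplicated = flank_5 + unit + unit + flank_3
repaired = remove_tandem_duplication(duplicated, target, cut_offset=5)
print(repaired == flank_5 + unit + flank_3)
```

Because both cuts fall at the same position within each copy, the rejoined allele in this model is identical to the single-copy sequence, which is why one guide is sufficient for a tandem duplication.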

Full-length dystrophin expression was indeed observed in all treated mice, albeit at variable levels. In TA muscles, levels of dystrophin expression ranged from 9-47% (average 23%) and in heart from 12.0-61.8% (average 30%) after 7 weeks of systemic treatment initiated 1-2 days after birth. These levels are comparable to previously published data on systemic CRISPR treatments, although those treatments restored a shorter, less functional dystrophin protein. As ascertained by Western blot, Tabebordbar et al. showed 3-18% protein expression after 3 weeks of systemic treatment of mdx mice 119, Long et al. reported 1.8-3.2% micro-dystrophin expression after 8 weeks of systemic administration in mdx mice 143, and Nelson et al. showed 8% expression in response to intramuscular injection of mdx mice 118. Finally, Amoasii et al. reported up to 90% of micro-dystrophin after 4 weeks of systemic treatment of the D50 mouse model 36. In each case, variable expression was achieved. Correlating the varied dystrophin levels in these studies with functional improvements is difficult, as the restored dystrophin is not the same: these studies restored a shorter, less functional dystrophin protein, whereas here I restored full-length dystrophin. From studies of human carriers and animal models, we know that 50% of dystrophin is sufficient to prevent muscle and heart pathology. Importantly, as little as 3-15% full-length dystrophin expression is sufficient to improve muscle function in mouse models of DMD with skewed X-inactivation 29. To further inform the therapeutic window and long-term benefits of the gene editing treatment described in this chapter, future experiments should establish the optimal treatment time and dose for older symptomatic mice.

In this study, I describe robust editing at the protein level leading to phenotypic improvement. Additional experiments should be carried out to quantify the level of editing at both the DNA and mRNA levels to determine Cas9 DNA editing efficiency. In addition, quantification of protein expression in other muscle tissues and further functional testing, such as muscle fiber contractile testing, cardiac functional testing, analysis of serum creatine kinase levels, and probing of DGC expression, will further establish the correlation between functional benefit and the level of dystrophin expression following treatment. It is unclear why variability is observed between different treated mice. It would be interesting to see whether the same level of variability would be seen with a different method of injection and/or with treatment of older mice. Temporal vein injections, although targeted to the blood system, are technically challenging because of the fine motor skill required to inject this vein, which is very small in neonates.

Furthermore, continued optimization of CRISPR design will be important for potential clinical translation. This includes systematic evaluation of sgRNAs to determine which design characteristics may influence editing efficiency. It is important to note that off-targets are intrinsic to the sgRNA used in the treatment. As such, any CRISPR treatments tested in animal models will be limited by species-specific differences in genomic sequence. Future design of human-targeting sgRNAs should also include testing for specificity by analyzing off-target genomic sites through unbiased techniques like next generation sequencing. sgRNAs can be tested in patient cell cultures ex vivo; however, cellular repair mechanisms will differ in vitro compared to a living organism, and the long-term effects of persistent Cas9 expression are not comparable in these two contexts. These issues will need to be addressed through careful design and execution of clinical trials.

In conclusion, our results show for the first time that CRISPR/Cas9-mediated gene editing offers a novel treatment strategy for genetic disorders caused by large duplications. Our treatment approach was first tested in the DMD context; however, there is a wide array of other monogenic diseases caused by pathogenic duplications for which this single sgRNA approach provides a new, highly efficient therapeutic strategy to be explored.

5.4 Acknowledgements

I would like to acknowledge the following individuals for their contributions to this chapter: Drs. Cohn and Ivakine for input on conceptualization and analysis; Ella Hyatt for help with mouse colony husbandry, sample collection and temporal vein injections; Kyle Lindsay for help with cloning and sample collection; Tatianna Wong for help with temporal vein injections; and TCP for all grip strength and open-field testing. Experimental conceptualization, sgRNA design, cloning, and testing, as well as post-treatment tissue collection and IF and H&E staining of the initial mice in the treated cohorts, were performed by me. Data collection for subsequent mice added to the cohort was performed by Eleonora Maino, Sonia Evangelou, and Aiman Farheen. Quantifications and final data analysis of all treated mice were performed by me with input from Dr. Cohn, Dr. Ivakine and Eleonora Maino.


Future Directions

6.1 Overview

Genome editing via CRISPR/Cas9 is a rapidly evolving technology with enormous translational potential. The technology is fairly new: its applicability as a genome engineering tool was initially described in 2012 72,75, and in the 7 years since it has spurred pre-clinical applications for a myriad of genetic disorders. Clinical translation of novel and effective therapies, however, is a long process that involves iterative cycles of experimentation and optimization to address issues encountered in pre-clinical studies. Specifically, important limitations in efficacy, delivery and safety still need to be addressed before this treatment can be considered for use in patients. Here, I will discuss considerations that will further improve the clinical applicability of the CRISPR-mediated treatment approach developed in this thesis.

6.2 Improving CRISPR efficacy through Combinatorial Therapy: Gene Editing and Gene Modulation

One potential opportunity to utilize CRISPR/Cas9 technology therapeutically would involve the use of catalytically inactive or ‘‘dead’’ Cas9 (dCas9), which when bound to DNA elements can repress transcription 147 or, when fused to multiple copies of the VP16 activator, act as a synthetic transcriptional activator 148-151. In a combinatorial approach one could: 1) upregulate expression of the newly edited gene, in this case dystrophin; and/or 2) modulate the expression of a disease modifier. For this approach to work, two Cas9 proteins from two different species of bacteria would have to be used to ensure that the active cutting Cas9 is not brought to the promoter by sgRNAs designed to upregulate expression, and vice versa.


6.2.1 Simultaneous editing and upregulation of newly synthesized gene

One approach to increasing the level of dystrophin expression would be to treat DMD animal models simultaneously with a CRISPR gene editing treatment and a CRISPR dystrophin upregulation treatment. In theory, this combinatorial treatment could amplify the effect of edited cells. To my knowledge this approach has not yet been described, but it is an interesting concept to explore, as it could serve as a valuable technique for increasing the efficacy of any CRISPR-mediated gene editing therapeutic approach. Indeed, as muscles are formed through the fusion of cells, upregulation of repaired gene copies within a syncytium could help restore dystrophin to near WT levels in multinucleated cells, which are likely to contain unrepaired copies of the gene.

6.2.2 Combinatorial Gene Modulation of Disease Modifiers

An alternative strategy to treat DMD could involve a combinatorial approach focused on modulating expression of disease modifying proteins in conjunction with gene editing. The gene expression modulation approach could serve as an alternative to current pharmacological drug development strategies, which modulate pathways associated with disease pathogenesis. An example of a gene that ameliorates disease progression is UTRN, which is a DMD homologue and has been known to partly compensate for the loss of dystrophin in DMD 4,5,10.

Utrophin’s ability to modify disease progression has been established in multiple experiments using the dystrophin-negative mdx mouse, and it has been suggested that 2-fold upregulation of utrophin mRNA is sufficient to improve muscle function 152,153. In addition, small molecules that increase utrophin protein expression are currently in clinical trials 154. We have previously shown a 1.7-6.9 fold upregulation of protein expression in response to targeting UTRN 109. It will now be important to test a combinatorial approach involving DMD mutation repair together with UTRN upregulation.

6.3 Improving CRISPR Delivery

In Chapter 5, adeno-associated virus serotype 9, which has specific tissue tropism for muscle, was used for in vivo delivery of the CRISPR treatment. In recent years, the number of clinical trials using this virus for various gene therapies has increased. As such, there is a growing body of evidence about its safety as a vehicle and its high efficiency of transduction. However, some drawbacks of these vehicles remain around immune recognition, as they can only be administered once and a proportion of the population already has antibodies against them. Additionally, prior immunization against viral proteins might also be conferred if patients have enrolled in clinical trials for gene therapy involving micro-dystrophin treatment, which is also delivered using an AAV. Such patients would not be eligible for CRISPR-mediated gene therapy if it were packaged in the same viral vector. To address these limitations, synthetic AAV vectors are under development to improve transduction efficiency and specificity and to reduce immune recognition. Alternative non-viral delivery mechanisms are also being explored, including exosomes and nanoparticles, which could circumvent immune recognition and enable multiple dosing. The transduction efficiency and scalable production of such alternatives have yet to be determined.

6.4 Increasing specificity and efficiency of editing

Efficiency could also be altered by improving vector design. For example, a muscle-specific creatine kinase regulatory cassette has been used to drive expression of Cas9 in mice 36 and canines 141. Another concern with the therapeutic use of nucleases like CRISPR-Cas9 is the potential for unexpected cuts and mutations outside of the targeted sequence (off-target effects). Cas9 only requires a 20 bp sequence match with a 3 bp PAM site to make a DSB, and some mismatches can be tolerated 96. Although multiple techniques are being developed to evaluate possible Cas9-mediated off-target effects, ultimately the sequence and design of the sgRNA will determine what these effects are and, furthermore, how the human body’s intrinsic cellular mechanisms deal with any unintended DSB. Off-targets in murine models can only be evaluated for murine sgRNAs. Ultimately, the sgRNA used for human clinical trials cannot be tested in mice, as the DNA sequences are not the same. Tests for off-target cleavage can be performed in human cells growing in vitro; however, this approach has its own limitations, including limited long-term observation of Cas9 cleavage activity and differences in repair mechanisms between cells growing in a petri dish and cells in a human body. As such, off-target effects cannot be fully ascertained until clinical trials are performed. As the CRISPR system is improved over time, newer versions of the nuclease will become available. Indeed, Cas9 nickases have been engineered that make single-stranded nicks instead of double-stranded breaks. Cas9 variants with longer PAM sequences may also be engineered in the future, which could further improve specificity.
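The matching rule described above (a 20 bp protospacer followed by a 3 bp PAM, with some mismatches tolerated) can be sketched as a simple sequence scan. The example below assumes the NGG PAM of S. pyogenes Cas9; the text specifies only a 3 bp PAM, so this is an assumption. It scans the forward strand only and counts mismatches without weighting them by position. The function name and all sequences are hypothetical, and dedicated off-target prediction tools do considerably more than this.

```python
from typing import List, Tuple


def find_candidate_sites(genome: str, guide: str,
                         max_mismatches: int = 3) -> List[Tuple[int, int]]:
    """Return (position, mismatches) for guide-like sites followed by an NGG PAM."""
    n = len(guide)
    hits = []
    # The last window must leave room for the 3 bp PAM after the protospacer.
    for i in range(len(genome) - n - 2):
        pam = genome[i + n: i + n + 3]
        if pam[1:] != "GG":          # assumed NGG PAM; N can be any base
            continue
        mismatches = sum(a != b for a, b in zip(genome[i: i + n], guide))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# Placeholder 20-nt guide and a synthetic "genome" built around it.
guide = "GACTTCATCAGTACATCTAC"
near_match = guide[:3] + "A" + guide[4:10] + "C" + guide[11:]   # 2 mismatches
genome = "TT" + guide + "TGG" + "CCCC" + near_match + "AGG" + "AA" + guide + "TTT"

print(find_candidate_sites(genome, guide))
```

In this synthetic sequence the perfect match and the two-mismatch site are both reported because each is followed by a PAM, while the third copy of the guide is skipped because it lacks one.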


6.5 Exploring Applicability to Other Duplications in DMD

Duchenne muscular dystrophy is caused by a spectrum of mutations including point mutations, deletions, and duplications. Duplications are heterogeneous, and although their exact location and size vary among patients, approximately 50% of duplications start in the hotspot region of exons 2-22 11,63,155 (Figure 18A). This presents an interesting opportunity, as there are clearly duplicated fragments that overlap in multiple patients. In this thesis I describe CRISPR treatment of a duplication of exons 18-30 utilizing an sgRNA targeting intron 27 (Chapter 3); however, the same sgRNA could also be used to treat other duplications in DMD as long as the sgRNA target site falls within the duplicated region. It would be interesting to compare the editing efficiencies of different duplication removals by the same sgRNA, as well as to test other sgRNAs. I have since collected cells from other patients affected by duplications in DMD (Figure 18B). Future steps should be aimed at designing sgRNAs within introns 2-22 and testing our single sgRNA CRISPR treatment on duplications with varied breakpoints and sizes to determine the broader applicability of this treatment.
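The condition that one sgRNA can serve several patients only if its target site lies inside every duplication is an interval-intersection problem. The sketch below illustrates it with made-up coordinates. The function names and positions are hypothetical and do not correspond to the breakpoints mapped in Figure 18B.

```python
from typing import List, Optional, Tuple

# Half-open [start, end) intervals in bp on a shared coordinate system.
Interval = Tuple[int, int]


def shared_region(duplications: List[Interval]) -> Optional[Interval]:
    """Return the interval common to all duplications, or None if there is none."""
    start = max(s for s, _ in duplications)
    end = min(e for _, e in duplications)
    return (start, end) if start < end else None


def guide_is_applicable(duplication: Interval, cut_site: int) -> bool:
    """A single guide can only act on a duplication that contains its cut site."""
    start, end = duplication
    return start <= cut_site < end


# Hypothetical duplicated segments for three patients.
patients = [(1000, 5000), (3000, 9000), (2500, 4200)]
common = shared_region(patients)
print(common)
print([guide_is_applicable(p, 3500) for p in patients])
```

A guide placed anywhere inside the shared interval is applicable to all three hypothetical duplications, whereas a guide outside it covers only a subset.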

6.6 Conclusion

In conclusion, I describe a novel CRISPR-mediated gene therapy approach that was successful in removing pathogenic duplications >100 kb in patient cells, as well as in a newly generated mouse model. The treatment approach was able to restore full-length dystrophin protein expression and, in the mouse model, to confer functional benefit. This proof-of-concept work provides critical evidence for the effectiveness of the approach and initiates conversations about a pathway to translating this work into a clinical trial.


Figure 18 | Exploring the feasibility of a one-sgRNA treatment approach for multiple patient cells harboring a duplication in DMD.

A) Graph adapted from Ma et al., 2018 showing the cumulative number of subjects with a duplication among 1400 registered Chinese patients. A duplication hotspot is visible between exon 2 and exon 22 63. B) Schematic depicting the duplication location and size of 3 patients harboring a duplication of exons 6-7 (86 kb), exons 2-6 (289 kb), and exons (529 kb). A region spanning 7,485 bp, which encompasses a portion of intron 5, exon 6 and intron 6 (highlighted in yellow), is duplicated in all 3 patients. The 5’ and 3’ breakpoints of this region were mapped using PCR intron walking and Sanger sequencing.


References

1 Mendell, J. R. et al. Evidence-based path to newborn screening for Duchenne muscular dystrophy. Annals of neurology 71, 304-313, doi:10.1002/ana.23528 (2012).

2 Hoffman, E. P. et al. Characterization of dystrophin in muscle-biopsy specimens from patients with Duchenne's or Becker's muscular dystrophy. N Engl J Med 318, 1363-1368, doi:10.1056/NEJM198805263182104 (1988).

3 Koenig, M. & Kunkel, L. M. Detailed analysis of the repeat domain of dystrophin reveals four potential hinge segments that may confer flexibility. The Journal of biological chemistry 265, 4560-4566 (1990).

4 Hoffman, E. P., Brown, R. H., Jr. & Kunkel, L. M. Dystrophin: the protein product of the Duchenne muscular dystrophy locus. Cell 51, 919-928 (1987).

5 Nowak, K. J. & Davies, K. E. Duchenne muscular dystrophy and dystrophin: pathogenesis and opportunities for treatment. EMBO reports 5, 872-876, doi:10.1038/sj.embor.7400221 (2004).

6 Hsu, P. D., Lander, E. S. & Zhang, F. Development and applications of CRISPR-Cas9 for genome engineering. Cell 157, 1262-1278, doi:10.1016/j.cell.2014.05.010 (2014).

7 Campbell, K. P. & Kahl, S. D. Association of dystrophin and an integral membrane glycoprotein. Nature 338, 259-262, doi:10.1038/338259a0 (1989).

8 Ervasti, J. M., Ohlendieck, K., Kahl, S. D., Gaver, M. G. & Campbell, K. P. Deficiency of a glycoprotein component of the dystrophin complex in dystrophic muscle. Nature 345, 315-319, doi:10.1038/345315a0 (1990).

9 Moat, S. J., Bradley, D. M., Salmon, R., Clarke, A. & Hartley, L. Newborn bloodspot screening for Duchenne muscular dystrophy: 21 years experience in Wales (UK). European journal of human genetics : EJHG 21, 1049-1053, doi:10.1038/ejhg.2012.301 (2013).

10 Tuffery-Giraud, S. et al. Genotype-phenotype analysis in 2,405 patients with a dystrophinopathy using the UMD-DMD database: a model of nationwide knowledgebase. Human mutation 30, 934-945, doi:10.1002/humu.20976 (2009).

11 Bladen, C. L. et al. The TREAT-NMD DMD Global Database: analysis of more than 7,000 Duchenne muscular dystrophy mutations. Human mutation 36, 395-402, doi:10.1002/humu.22758 (2015).

12 Manjunath, M. et al. A comparative study of mPCR, MLPA, and muscle biopsy results in a cohort of children with Duchenne muscular dystrophy: a first study. Neurol India 63, 58-62, doi:10.4103/0028-3886.152635 (2015).

13 Suh, M. R. et al. Multiplex Ligation-Dependent Probe Amplification in X-linked Recessive Muscular Dystrophy in Korean Subjects. Yonsei Med J 58, 613-618, doi:10.3349/ymj.2017.58.3.613 (2017).

14 Elhawary, N. A. et al. Molecular characterization of exonic rearrangements and frame shifts in the dystrophin gene in Duchenne muscular dystrophy patients in a Saudi community. Hum Genomics 12, 18, doi:10.1186/s40246-018-0152-8 (2018).

15 Liang, W. C., Wang, C. H., Chou, P. C., Chen, W. Z. & Jong, Y. J. The natural history of the patients with Duchenne muscular dystrophy in Taiwan: A medical center experience. Pediatr Neonatol 59, 176-183, doi:10.1016/j.pedneo.2017.02.004 (2018).

16 Aartsma-Rus, A., Van Deutekom, J. C., Fokkema, I. F., Van Ommen, G. J. & Den Dunnen, J. T. Entries in the Leiden Duchenne muscular dystrophy mutation database: an overview of mutation types and paradoxical cases that confirm the reading-frame rule. Muscle & nerve 34, 135-144, doi:10.1002/mus.20586 (2006).


17 Okubo, M. et al. Comprehensive analysis for genetic diagnosis of Dystrophinopathies in Japan. Orphanet J Rare Dis 12, 149, doi:10.1186/s13023-017-0703-4 (2017).

18 Birnkrant, D. J. et al. Diagnosis and management of Duchenne muscular dystrophy, part 1: diagnosis, and neuromuscular, rehabilitation, endocrine, and gastrointestinal and nutritional management. Lancet Neurol 17, 251-267, doi:10.1016/S1474-4422(18)30024-3 (2018).

19 Birnkrant, D. J. et al. Diagnosis and management of Duchenne muscular dystrophy, part 2: respiratory, cardiac, bone health, and orthopaedic management. Lancet Neurol 17, 347-361, doi:10.1016/S1474-4422(18)30025-5 (2018).

20 Angelini, C. The role of corticosteroids in muscular dystrophy: a critical appraisal. Muscle & nerve 36, 424-435, doi:10.1002/mus.20812 (2007).

21 Moxley, R. T., 3rd et al. Practice parameter: corticosteroid treatment of Duchenne dystrophy: report of the Quality Standards Subcommittee of the American Academy of Neurology and the Practice Committee of the Child Neurology Society. Neurology 64, 13-20, doi:10.1212/01.WNL.0000148485.00049.B7 (2005).

22 Cirak, S. et al. Restoration of the dystrophin-associated glycoprotein complex after exon skipping therapy in Duchenne muscular dystrophy. Molecular therapy : the journal of the American Society of Gene Therapy 20, 462-467, doi:10.1038/mt.2011.248 (2012).

23 Mann, C. J. et al. Antisense-induced exon skipping and synthesis of dystrophin in the mdx mouse. Proceedings of the National Academy of Sciences of the United States of America 98, 42-47, doi:10.1073/pnas.011408598 (2001).

24 Kimura, E., Li, S., Gregorevic, P., Fall, B. M. & Chamberlain, J. S. Dystrophin delivery to muscles of mdx mice using lentiviral vectors leads to myogenic progenitor targeting and stable gene expression. Molecular therapy : the journal of the American Society of Gene Therapy 18, 206-213, doi:10.1038/mt.2009.253 (2010).

25 Zhang, Y. & Duan, D. Novel mini-dystrophin gene dual adeno-associated virus vectors restore neuronal nitric oxide synthase expression at the sarcolemma. Human gene therapy 23, 98-103, doi:10.1089/hum.2011.131 (2012).

26 Le Hir, M. et al. AAV genome loss from dystrophic mouse muscles during AAV-U7 snRNA-mediated exon-skipping therapy. Molecular therapy : the journal of the American Society of Gene Therapy 21, 1551-1558, doi:10.1038/mt.2013.121 (2013).

27 Nicholson, L. V. et al. Integrated study of 100 patients with Xp21 linked muscular dystrophy using clinical, genetic, immunochemical, and histopathological data. Part 1. Trends across the clinical groups. Journal of medical genetics 30, 728-736 (1993).

28 Nicholson, L. V., Bushby, K. M., Johnson, M. A., Gardner-Medwin, D. & Ginjaar, I. B. Dystrophin expression in Duchenne patients with "in-frame" gene deletions. Neuropediatrics 24, 93-97, doi:10.1055/s-2008-1071521 (1993).

29 van Putten, M. et al. Low dystrophin levels in heart can delay heart failure in mdx mice. Journal of molecular and cellular cardiology 69, 17-23, doi:10.1016/j.yjmcc.2014.01.009 (2014).

30 van Putten, M. et al. Low dystrophin levels increase survival and improve muscle pathology and function in dystrophin/utrophin double-knockout mice. FASEB journal : official publication of the Federation of American Societies for Experimental Biology 27, 2484-2495, doi:10.1096/fj.12-224170 (2013).

31 van Putten, M. et al. The effects of low levels of dystrophin on mouse muscle function and pathology. PloS one 7, e31937, doi:10.1371/journal.pone.0031937 (2012).


32 Bulfield, G., Siller, W. G., Wight, P. A. & Moore, K. J. X chromosome-linked muscular dystrophy (mdx) in the mouse. Proceedings of the National Academy of Sciences of the United States of America 81, 1189-1192, doi:10.1073/pnas.81.4.1189 (1984).

33 McGreevy, J. W., Hakim, C. H., McIntosh, M. A. & Duan, D. Animal models of Duchenne muscular dystrophy: from basic mechanisms to gene therapy. Dis Model Mech 8, 195-213, doi:10.1242/dmm.018424 (2015).

34 Sharp, N. J. et al. An error in dystrophin mRNA processing in golden retriever muscular dystrophy, an animal homologue of Duchenne muscular dystrophy. Genomics 13, 115-121, doi:10.1016/0888-7543(92)90210-j (1992).

35 Vulin, A. et al. The first exon duplication mouse model of Duchenne muscular dystrophy: A tool for therapeutic development. Neuromuscular disorders : NMD 25, 827-834, doi:10.1016/j.nmd.2015.08.005 (2015).

36 Amoasii, L. et al. Single-cut genome editing restores dystrophin expression in a new mouse model of muscular dystrophy. Sci Transl Med 9, doi:10.1126/scitranslmed.aan8081 (2017).

37 Levy, S. et al. The diploid genome sequence of an individual human. PLoS Biol 5, e254, doi:10.1371/journal.pbio.0050254 (2007).

38 Wheeler, D. A. et al. The complete genome of an individual by massively parallel DNA sequencing. Nature 452, 872-876, doi:10.1038/nature06884 (2008).

39 Feuk, L., Carson, A. R. & Scherer, S. W. Structural variation in the human genome. Nature reviews. Genetics 7, 85-97, doi:10.1038/nrg1767 (2006).

40 Redon, R. et al. Global variation in copy number in the human genome. Nature 444, 444-454, doi:10.1038/nature05329 (2006).


41 Lee, J. & Lupski, J. Genomic rearrangements and gene copy-number alterations as a cause of nervous system disorders. Neuron 52, 103-121, doi:10.1016/j.neuron.2006.09.027 (2006).

42 Lupski, J. R. et al. DNA duplication associated with Charcot-Marie-Tooth disease type 1A. Cell 66, 219-232, doi:10.1016/0092-8674(91)90613-4 (1991).

43 Weterman, M. A. et al. Copy number variation upstream of PMP22 in Charcot-Marie-Tooth disease. European journal of human genetics : EJHG 18, 421-428, doi:10.1038/ejhg.2009.186 (2010).

44 Den Dunnen, J. T. et al. Topography of the Duchenne muscular dystrophy (DMD) gene: FIGE and cDNA analysis of 194 cases reveals 115 deletions and 13 duplications. Am J Hum Genet 45, 835-847 (1989).

45 Osborne, L. R. et al. Identification of genes from a 500-kb region at 7q11.23 that is commonly deleted in Williams syndrome patients. Genomics 36, 328-336, doi:10.1006/geno.1996.0469 (1996).

46 Morris, C. A., Loker, J., Ensing, G. & Stock, A. D. Supravalvular aortic stenosis cosegregates with a familial 6; 7 translocation which disrupts the elastin gene. Am J Med Genet 46, 737-744, doi:10.1002/ajmg.1320460634 (1993).

47 Curran, M. E. et al. The elastin gene is disrupted by a translocation associated with supravalvular aortic stenosis. Cell 73, 159-168, doi:10.1016/0092-8674(93)90168-p (1993).

48 Kishino, T., Lalande, M. & Wagstaff, J. UBE3A/E6-AP mutations cause Angelman syndrome. Nature genetics 15, 70-73, doi:10.1038/ng0197-70 (1997).

49 Butler, M. G., Meaney, F. J. & Palmer, C. G. Clinical and cytogenetic survey of 39 individuals with Prader-Labhart-Willi syndrome. Am J Med Genet 23, 793-809, doi:10.1002/ajmg.1320230307 (1986).


50 Van Esch, H. et al. Duplication of the MECP2 region is a frequent cause of severe mental retardation and progressive neurological symptoms in males. Am J Hum Genet 77, 442-453, doi:10.1086/444549 (2005).

51 del Gaudio, D. et al. Increased MECP2 gene copy number as the result of genomic duplication in neurodevelopmentally delayed males. Genet Med 8, 784-792, doi:10.1097/01.gim.0000250502.28516.3c (2006).

52 Friez, M. J. et al. Recurrent infections, hypotonia, and mental retardation caused by duplication of MECP2 and adjacent region in Xq28. Pediatrics 118, e1687-1695, doi:10.1542/peds.2006-0395 (2006).

53 Inoue, K. et al. A duplicated PLP gene causing Pelizaeus-Merzbacher disease detected by comparative multiplex PCR. Am J Hum Genet 59, 32-39 (1996).

54 Raskind, W. H., Williams, C. A., Hudson, L. D. & Bird, T. D. Complete deletion of the proteolipid protein gene (PLP) in a family with X-linked Pelizaeus-Merzbacher disease. Am J Hum Genet 49, 1355-1360 (1991).

55 Rovelet-Lecrux, A. et al. APP locus duplication causes autosomal dominant early-onset Alzheimer disease with cerebral amyloid angiopathy. Nature genetics 38, 24-26, doi:10.1038/ng1718 (2006).

56 Kaminsky, E. B. et al. An evidence-based approach to establish the functional and clinical significance of copy number variants in intellectual and developmental disabilities. Genet Med 13, 777-784, doi:10.1097/GIM.0b013e31822c79f9 (2011).

57 Ensenauer, R. E. et al. Microduplication 22q11.2, an emerging syndrome: clinical, cytogenetic, and molecular analysis of thirteen patients. Am J Hum Genet 73, 1027-1040, doi:10.1086/378818 (2003).

58 Prasad, A. et al. A discovery resource of rare copy number variations in individuals with autism spectrum disorder. G3 (Bethesda) 2, 1665-1685, doi:10.1534/g3.112.004689 (2012).

59 Cook, E. H., Jr. & Scherer, S. W. Copy-number variations associated with neuropsychiatric conditions. Nature 455, 919-923, doi:10.1038/nature07458 (2008).

60 Feuk, L. Copy number variation in the autism genome. Expert Opin Med Diagn 2, 417-428, doi:10.1517/17530059.2.4.417 (2008).

61 Lalic, T. et al. Deletion and duplication screening in the DMD gene using MLPA. European journal of human genetics : EJHG 13, 1231-1234, doi:10.1038/sj.ejhg.5201465 (2005).

62 White, S. J. et al. Duplications in the DMD gene. Human mutation 27, 938-945, doi:10.1002/humu.20367 (2006).

63 Ma, P. et al. Comprehensive genetic characteristics of dystrophinopathies in China. Orphanet J Rare Dis 13, 109, doi:10.1186/s13023-018-0853-z (2018).

64 Oshima, J. et al. Regional genomic instability predisposes to complex dystrophin gene rearrangements. Hum Genet 126, 411-423, doi:10.1007/s00439-009-0679-9 (2009).

65 Xu et al. Novel noncontiguous duplications identified with a comprehensive mutation analysis in the DMD gene by DMD gene-targeted sequencing. Gene 645, 113-118, doi:10.1016/j.gene.2017.12.037 (2018).

66 Lupski, J. R. Structural variation mutagenesis of the human genome: Impact on disease and evolution. Environ Mol Mutagen 56, 419-436, doi:10.1002/em.21943 (2015).

67 Stankiewicz, P. & Lupski, J. R. Genome architecture, rearrangements and genomic disorders. Trends Genet 18, 74-82 (2002).

68 Baskin, B. et al. Complex genomic rearrangements in the dystrophin gene due to replication-based mechanisms. Mol Genet Genomic Med 2, 539-547, doi:10.1002/mgg3.108 (2014).

69 Wiedenheft, B., Sternberg, S. H. & Doudna, J. A. RNA-guided genetic silencing systems in bacteria and archaea. Nature 482, 331-338 (2012).

70 Garneau, J. E. et al. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468, 67-71, doi:10.1038/nature09523 (2010).

71 Gasiunas, G., Barrangou, R., Horvath, P. & Siksnys, V. Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proceedings of the National Academy of Sciences of the United States of America 109, E2579-2586, doi:10.1073/pnas.1208507109 (2012).

72 Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821 (2012).

73 Barrangou, R. et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709-1712, doi:10.1126/science.1138140 (2007).

74 Sapranauskas, R. et al. The Streptococcus thermophilus CRISPR/Cas system provides immunity in Escherichia coli. Nucleic acids research 39, 9275-9282, doi:10.1093/nar/gkr606 (2011).

75 Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823, doi:10.1126/science.1231143 (2013).

76 Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823-826 (2013).

77 Deltcheva, E. et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471, 602-607, doi:10.1038/nature09886 (2011).

78 Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M. & Joung, J. K. Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature biotechnology 32, 279-284, doi:10.1038/nbt.2808 (2014).

79 Mojica, F. J., Diez-Villasenor, C., Garcia-Martinez, J. & Almendros, C. Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 155, 733-740, doi:10.1099/mic.0.023960-0 (2009).

80 Mojica, F. J., Diez-Villasenor, C., Garcia-Martinez, J. & Soria, E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J Mol Evol 60, 174-182, doi:10.1007/s00239-004-0046-3 (2005).

81 Pourcel, C., Salvignol, G. & Vergnaud, G. CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology 151, 653-663, doi:10.1099/mic.0.27437-0 (2005).

82 Ran, F. A. et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186-191, doi:10.1038/nature14299 (2015).

83 Zheng, B., Sage, M., Sheppeard, E. A., Jurecic, V. & Bradley, A. Engineering mouse chromosomes with Cre-loxP: range, efficiency, and somatic applications. Molecular and cellular biology 20, 648-655, doi:10.1128/mcb.20.2.648-655.2000 (2000).

84 Clark, K. et al. Gait abnormalities and progressive myelin degeneration in a new murine model of Pelizaeus-Merzbacher disease with tandem genomic duplication. J Neurosci 33, 11788-11799, doi:10.1523/JNEUROSCI.1336-13.2013 (2013).

85 Kraft, K. et al. Deletions, Inversions, Duplications: Engineering of Structural Variants using CRISPR/Cas in Mice. Cell reports, doi:10.1016/j.celrep.2015.01.016 (2015).

86 Boroviak, K., Doe, B., Banerjee, R., Yang, F. & Bradley, A. Chromosome engineering in zygotes with CRISPR/Cas9. Genesis 54, 78-85, doi:10.1002/dvg.22915 (2016).

87 Korablev, A. N., Serova, I. A. & Serov, O. L. Generation of megabase-scale deletions, inversions and duplications involving the Contactin-6 gene in mice by CRISPR/Cas9 technology. BMC Genet 18, 112, doi:10.1186/s12863-017-0582-7 (2017).

88 Zhang et al. Large genomic fragment deletions and insertions in mouse using CRISPR/Cas9. PloS one 10, e0120396, doi:10.1371/journal.pone.0120396 (2015).

89 Vasileva, A. & Jessberger, R. Precise hit: adeno-associated virus in gene targeting. Nature reviews. Microbiology 3, 837-847, doi:10.1038/nrmicro1266 (2005).

90 Kay, M. A. State-of-the-art gene-based therapies: the road ahead. Nature reviews. Genetics 12, 316-328, doi:10.1038/nrg2971 (2011).

91 Zincarelli, C., Soltys, S., Rengo, G. & Rabinowitz, J. E. Analysis of AAV serotypes 1-9 mediated gene expression and tropism in mice after systemic injection. Molecular therapy : the journal of the American Society of Gene Therapy 16, 1073-1080, doi:10.1038/mt.2008.76 (2008).

92 Lionel, A. C. et al. Improved diagnostic yield compared with targeted gene sequencing panels suggests a role for whole-genome sequencing as a first-tier genetic test. Genet Med 20, 435-443, doi:10.1038/gim.2017.119 (2018).

93 Robinson, J. T. et al. Integrative genomics viewer. Nature biotechnology 29, 24-26, doi:10.1038/nbt.1754 (2011).

94 Doench, J. G. et al. Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation. Nature biotechnology 32, 1262-1267, doi:10.1038/nbt.3026 (2014).

95 Sanjana, N. E., Shalem, O. & Zhang, F. Improved vectors and genome-wide libraries for CRISPR screening. Nature methods 11, 783-784, doi:10.1038/nmeth.3047 (2014).

96 Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nature biotechnology 31, 827-832, doi:10.1038/nbt.2647 (2013).

97 Cradick, T. J., Qiu, P., Lee, C. M., Fine, E. J. & Bao, G. COSMID: A Web-based Tool for Identifying and Validating CRISPR/Cas Off-target Sites. Mol Ther Nucleic Acids 3, e214, doi:10.1038/mtna.2014.64 (2014).

98 Gertsenstein, M. & Nutter, L. M. J. Engineering Point Mutant and Epitope-Tagged Alleles in Mice Using Cas9 RNA-Guided Nuclease. Curr Protoc Mouse Biol 8, 28-53, doi:10.1002/cpmo.40 (2018).

99 Gertsenstein, M., Mianne, J., Teboul, L. & Nutter, L. M. J. Targeted Mutations in the Mouse via Embryonic Stem Cells. Methods Mol Biol 2066, 59-82, doi:10.1007/978-1-4939-9837-1_5 (2020).

100 Gertsenstein, M. et al. Efficient generation of germ line transmitting chimeras from C57BL/6N ES cells by aggregation with outbred host embryos. PloS one 5, e11260, doi:10.1371/journal.pone.0011260 (2010).

101 Nagy, A., Gertsenstein, M., Vintersten, K. & Behringer, R. Thawing embryonic stem (ES) cells from a 96-well plate. Cold Spring Harb Protoc 2010, pdb.prot4412, doi:10.1101/pdb.prot4412 (2010).

102 Kemaladewi, D. U. et al. A mutation-independent approach for muscular dystrophy via upregulation of a modifier gene. Nature 572, 125-130, doi:10.1038/s41586-019-1430-x (2019).

103 Kemaladewi, D. U. et al. Correction of a splicing defect in a mouse model of congenital muscular dystrophy type 1A using a homology-directed-repair-independent mechanism. Nature medicine, doi:10.1038/nm.4367 (2017).

104 De Luca, A., Tinsley, J., Aartsma-Rus, A., van Putten, M., Nagaraju, K., de La Porte, S., Dubach-Powell, J. & Carlson, G. Use of grip strength meter to assess the limb strength of mdx mice. SOP DMD_M 2, no. 001 (2008).

105 Nagaraju, K., Carlson, G. & De Luca, A. Behavioral and locomotor measurements using open field animal activity monitoring system. TREAT-NMD SOP Number M2.1.002. 2 (2010).

106 Lee, H. J., Kim, E. & Kim, J. S. Targeted chromosomal deletions in human cells using zinc finger nucleases. Genome research 20, 81-89, doi:10.1101/gr.099747.109 (2010).

107 Ousterout, D. G. et al. Multiplex CRISPR/Cas9-based genome editing for correction of dystrophin mutations that cause Duchenne muscular dystrophy. Nature communications 6, 6244, doi:10.1038/ncomms7244 (2015).

108 Canver, M. C. et al. Characterization of genomic deletion efficiency mediated by clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 nuclease system in mammalian cells. The Journal of biological chemistry 289, 21312-21324, doi:10.1074/jbc.M114.564625 (2014).

109 Wojtal, D. et al. Spell Checking Nature: Versatility of CRISPR/Cas9 for Developing Treatments for Inherited Disorders. Am J Hum Genet 98, 90-101, doi:10.1016/j.ajhg.2015.11.012 (2016).

110 Tsai, S. Q. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nature biotechnology 33, 187-197, doi:10.1038/nbt.3117 (2015).

111 Chiarle, R. et al. Genome-wide translocation sequencing reveals mechanisms of chromosome breaks and rearrangements in B cells. Cell 147, 107-119, doi:10.1016/j.cell.2011.07.049 (2011).

112 Frock, R. L. et al. Genome-wide detection of DNA double-stranded breaks induced by engineered nucleases. Nature biotechnology 33, 179-186, doi:10.1038/nbt.3101 (2015).

113 Goemans, N. M. et al. Systemic administration of PRO051 in Duchenne's muscular dystrophy. N Engl J Med 364, 1513-1522, doi:10.1056/NEJMoa1011367 (2011).

114 Aoki, Y. et al. Bodywide skipping of exons 45-55 in dystrophic mdx52 mice by systemic antisense delivery. Proceedings of the National Academy of Sciences of the United States of America 109, 13763-13768, doi:10.1073/pnas.1204638109 (2012).

115 Jarmin, S., Kymalainen, H., Popplewell, L. & Dickson, G. New developments in the use of gene therapy to treat Duchenne muscular dystrophy. Expert opinion on biological therapy 14, 209-230, doi:10.1517/14712598.2014.866087 (2014).

116 de Vrueh, R., Baekelandt, E. R. & de Hann, J. M. Update on 2004 background paper: BP 6.19 rare diseases. Geneva: World Health Organization (2013).

117 Long, C., Amoasii, L., Bassel-Duby, R. & Olson, E. N. Genome Editing of Monogenic Neuromuscular Diseases: A Systematic Review. JAMA Neurol 73, 1349-1355, doi:10.1001/jamaneurol.2016.3388 (2016).

118 Nelson, C. E. et al. In vivo genome editing improves muscle function in a mouse model of Duchenne muscular dystrophy. Science 351, 403-407, doi:10.1126/science.aad5143 (2016).

119 Tabebordbar, M. et al. In vivo gene editing in dystrophic mouse muscle and muscle stem cells. Science 351, 407-411, doi:10.1126/science.aad5177 (2016).

120 Xu, L. et al. CRISPR-mediated Genome Editing Restores Dystrophin Expression and Function in mdx Mice. Molecular therapy : the journal of the American Society of Gene Therapy 24, 564-569, doi:10.1038/mt.2015.192 (2016).

121 Yin, H. et al. Genome editing with Cas9 in adult mice corrects a disease mutation and phenotype. Nature biotechnology 32, 551-553, doi:10.1038/nbt.2884 (2014).

122 Yang, H. et al. One-step generation of mice carrying reporter and conditional alleles by CRISPR/Cas-mediated genome engineering. Cell 154, 1370-1379, doi:10.1016/j.cell.2013.08.022 (2013).

123 Wang, H. et al. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell 153, 910-918, doi:10.1016/j.cell.2013.04.025 (2013).

124 Yang, H., Wang, H. & Jaenisch, R. Generating genetically modified mice using CRISPR/Cas-mediated genome engineering. Nature protocols 9, 1956-1968, doi:10.1038/nprot.2014.134 (2014).

125 Modzelewski, A. J. et al. Efficient mouse genome engineering by CRISPR-EZ technology. Nature protocols 13, 1253-1274, doi:10.1038/nprot.2018.012 (2018).

126 Qin, W. et al. Efficient CRISPR/Cas9-Mediated Genome Editing in Mice by Zygote Electroporation of Nuclease. Genetics 200, 423-430, doi:10.1534/genetics.115.176594 (2015).

127 Zhou, J. et al. Dual sgRNAs facilitate CRISPR/Cas9-mediated mouse genome targeting. FEBS J 281, 1717-1725, doi:10.1111/febs.12735 (2014).

128 Hara, S. et al. Microinjection-based generation of mutant mice with a double mutation and a 0.5 Mb deletion in their genome by the CRISPR/Cas9 system. J Reprod Dev 62, 531-536, doi:10.1262/jrd.2016-058 (2016).

129 Yen, S. T. et al. Somatic mosaicism and allele complexity induced by CRISPR/Cas9 RNA injections in mouse zygotes. Dev Biol 393, 3-9, doi:10.1016/j.ydbio.2014.06.017 (2014).

130 Birling, M. C. et al. Efficient and rapid generation of large genomic variants in rats and mice using CRISMERE. Scientific reports 7, 43331, doi:10.1038/srep43331 (2017).

131 Li, J. et al. Efficient inversions and duplications of mammalian regulatory DNA elements and gene clusters by CRISPR/Cas9. J Mol Cell Biol 7, 284-298, doi:10.1093/jmcb/mjv016 (2015).

132 Kato, T. et al. Creation of mutant mice with megabase-sized deletions containing custom-designed breakpoints by means of the CRISPR/Cas9 system. Scientific reports 7, 59, doi:10.1038/s41598-017-00140-9 (2017).

133 Kosicki, M., Tomberg, K. & Bradley, A. Repair of double-strand breaks induced by CRISPR-Cas9 leads to large deletions and complex rearrangements. Nature biotechnology 36, 765-771, doi:10.1038/nbt.4192 (2018).

134 Shin, H. Y. et al. CRISPR/Cas9 targeting events cause complex deletions and insertions at 17 sites in the mouse genome. Nature communications 8, 15464, doi:10.1038/ncomms15464 (2017).

135 Schaefer, K. A. et al. Unexpected mutations after CRISPR-Cas9 editing in vivo. Nature methods 14, 547-548, doi:10.1038/nmeth.4293 (2017).

136 Walz, K., Paylor, R., Yan, J., Bi, W. & Lupski, J. R. Rai1 duplication causes physical and behavioral phenotypes in a mouse model of dup(17)(p11.2p11.2). J Clin Invest 116, 3035-3041, doi:10.1172/JCI28953 (2006).

137 Li, Z. et al. Duplication of the entire 22.9 Mb human chromosome 21 syntenic region on mouse chromosome 16 causes cardiovascular and gastrointestinal abnormalities. Human molecular genetics 16, 1359-1366, doi:10.1093/hmg/ddm086 (2007).

138 Horev, G. et al. Dosage-dependent phenotypes in models of 16p11.2 lesions found in autism. Proceedings of the National Academy of Sciences of the United States of America 108, 17076-17081, doi:10.1073/pnas.1114042108 (2011).

139 Nakatani, J. et al. Abnormal behavior in a chromosome-engineered mouse model for human 15q11-13 duplication seen in autism. Cell 137, 1235-1246, doi:10.1016/j.cell.2009.04.024 (2009).

140 Yu, T. et al. Effects of individual segmental trisomies of human chromosome 21 syntenic regions on hippocampal long-term potentiation and cognitive behaviors in mice. Brain Res 1366, 162-171, doi:10.1016/j.brainres.2010.09.107 (2010).

141 Amoasii, L. et al. Gene editing restores dystrophin expression in a canine model of Duchenne muscular dystrophy. Science 362, 86-91, doi:10.1126/science.aau1549 (2018).

142 Li, H. L. et al. Precise correction of the dystrophin gene in Duchenne muscular dystrophy patient induced pluripotent stem cells by TALEN and CRISPR-Cas9. Stem cell reports 4, 143-154, doi:10.1016/j.stemcr.2014.10.013 (2015).

143 Long, C. et al. Postnatal genome editing partially restores dystrophin expression in a mouse model of muscular dystrophy. Science 351, 400-403, doi:10.1126/science.aad5725 (2016).

144 Lattanzi, A. et al. Correction of the Exon 2 Duplication in DMD Myoblasts by a Single CRISPR/Cas9 System. Mol Ther Nucleic Acids 7, 11-19, doi:10.1016/j.omtn.2017.02.004 (2017).

145 Long, C. et al. Correction of diverse muscular dystrophy mutations in human engineered heart muscle by single-site genome editing. Sci Adv 4, eaap9004, doi:10.1126/sciadv.aap9004 (2018).

146 Hsiau, T. et al. Inference of CRISPR Edits from Sanger Trace Data. bioRxiv 251082, doi:10.1101/251082 (2018).

147 Qi, L. S. et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173-1183, doi:10.1016/j.cell.2013.02.022 (2013).

148 Gilbert, L. A. et al. Genome-Scale CRISPR-Mediated Control of Gene Repression and Activation. Cell 159, 647-661, doi:10.1016/j.cell.2014.09.029 (2014).

149 Gilbert, L. A. et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442-451, doi:10.1016/j.cell.2013.06.044 (2013).

150 Mali, P. et al. CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nature biotechnology 31, 833-838, doi:10.1038/nbt.2675 (2013).

151 Perez-Pinera, P. et al. Synergistic and tunable human gene activation by combinations of synthetic transcription factors. Nature methods 10, 239-242, doi:10.1038/nmeth.2361 (2013).

152 Fairclough, R. J., Wood, M. J. & Davies, K. E. Therapy for Duchenne muscular dystrophy: renewed optimism from genetic approaches. Nature reviews. Genetics 14, 373-378, doi:10.1038/nrg3460 (2013).

153 Tinsley, J. et al. Expression of full-length utrophin prevents muscular dystrophy in mdx mice. Nature medicine 4, 1441-1444, doi:10.1038/4033 (1998).

154 Tinsley, J. et al. Daily treatment with SMTC1100, a novel small molecule utrophin upregulator, dramatically reduces the dystrophic symptoms in the mdx mouse. PloS one 6, e19189, doi:10.1371/journal.pone.0019189 (2011).

155 Juan-Mateu, J. et al. DMD Mutations in 576 Dystrophinopathy Families: A Step Forward in Genotype-Phenotype Correlations. PloS one 10, e0135189, doi:10.1371/journal.pone.0135189 (2015).