
Linköping University - Department of Physics, Chemistry and Biology - Bachelor thesis, 16 hp - Educational program: Physics, Chemistry and Biology - Spring term 2020 - LITH-IFM-G-EX--20/3902--SE

The CRISPR-Cas system

Can CRISPR bring about a brighter future? An extensive and up-to-date CRISPR overview, focusing on the CRISPR technology and including a suggestion for a CRISPR-Cas9 laboratory assignment for university students.

Sara Berggren, Isabella Enoksson and Cassandra Stens

Examiner: Assoc. Prof. Lars-Göran Mårtensson


Abstract

Derived from and inspired by the adaptive immune system of bacteria, CRISPR has gone from basic biology knowledge to a revolutionizing biotechnological tool, applicable in many research areas such as medicine, industry and agriculture. The full mechanism of CRISPR-Cas9 was first published in 2012 and various CRISPR-Cas systems have already passed the first stages of clinical trials as new therapies. The immense research effort has resulted in continuously growing knowledge of CRISPR systems, and the technique seems to have the potential to greatly impact all life on our planet. Therefore, this literature study aims to thoroughly describe the CRISPR-Cas system, and further to suggest an undergraduate laboratory exercise involving gene editing with the CRISPR-Cas9 tool.

In this paper, we describe the fundamental technical background of the CRISPR-Cas system, with particular emphasis on the most studied CRISPR-Cas9 system, its development and application areas, and we highlight its current limitations and ethical concerns. The history of genetic engineering and the discovery of the CRISPR system are also described, along with a comparison with other established gene editing techniques.

This study concludes that a deeper knowledge about CRISPR is important and required since the technique is applicable in many research areas. A laboratory exercise will not only inspire but also provide extended theoretical and practical knowledge for undergraduate students.

Acknowledgements

We are forever grateful for all of the guidance our examiner, Assoc. Prof. Lars-Göran Mårtensson at Linköping University, has given us during these ten weeks of intense CRISPR-Cas exploration. You have shared your valuable knowledge and experience from years of research and publications, giving us a great starting point and potentiality through the paper. Thank you for your critical reading, thoughtful recommendations and many laughs - you have made our bachelor fun and memorable!

Further thanks go to the BioRender website, which enabled quick and artistic illustrations for our report.

While writing this paper, we have tried to cite original articles as much as possible, but we apologize in advance if there are mistakes unintentionally citing other articles and researchers than the originals.


Terminology, abbreviations and acronyms

Abbreviations and acronyms

AdV: adenovirus
AAV: adeno-associated virus
bp: base pairs
Cas: CRISPR associated
CRISPR: clustered regularly interspaced short palindromic repeats
CRISPRa: CRISPR activation
CRISPRi: CRISPR interference
crRNA: CRISPR RNA
cr:tracrRNA: a crRNA-tracrRNA hybrid
dCas9: dead Cas9 (at times called deactivated Cas9)
dsDNA: double stranded DNA
DSB: double stranded break
HDR: homology directed repair
indels: insertions or deletions
NHEJ: nonhomologous end joining
nt: nucleotides
PAMs: protospacer adjacent motifs
pre-crRNA: precursor CRISPR RNA
repeats: direct repeats
RNP: ribonucleoprotein
ssDNA: single stranded DNA
SpCas9: Streptococcus pyogenes CRISPR associated (Cas) protein
tracrRNA: trans-activating crRNA
WT: wild type

Terminology

Anti-CRISPR protein: inhibitors of the CRISPR-Cas system which can be applied to both mammalian and prokaryotic cells for more precise and controllable editing.

Cas9: the most common and most widely used type of Cas protein. Discovered in Streptococcus pyogenes.

Cas9 complex: the Cas9 protein attached to gRNA (cr:tracrRNA or sgRNA).

Cas protein, CRISPR associated protein: a non-specific endonuclease, i.e. capable of cleaving DNA. Requires a guide RNA to find where in the genome to execute the cut. There are multiple types of Cas proteins, isolated from various bacteria and archaea.


CRISPR array: a series of palindromic, repeated sequences called direct repeats, interspaced by different target-specific sequences called spacers.

CRISPR locus: consists of a clustered set of tracrRNA genes, Cas genes and a CRISPR array at fixed regions in the genome. A single genome can incorporate multiple CRISPR loci.

CRISPR-Cas9 system: relies on two components, a Cas9 protein and a guide RNA. The guide RNA can be in the form of a two-component crRNA and tracrRNA hybrid (cr:tracrRNA), or a one-component sgRNA (crRNA and tracrRNA linked via a linker loop).

CRISPR RNA, crRNA: a sequence consisting of a spacer and a repeat sequence. The spacer is the customizable part of the gRNA and can be programmed to recognize particular target DNA sequences. crRNAs are transcribed from the CRISPR array.

cr:tracrRNA: a gRNA configuration where crRNA and tracrRNA are bound to each other. Pronunciation: tracr RNA.

dCas9: a mutated (dead or deactivated) Cas9 protein without cleavage activity.

Direct repeats: identical, repetitive and palindromic sequences interspaced between spacer fragments. Called repeats for short. The direct repeats are complementary to sequences in tracrRNAs.

Electroporation: a microbiology technique in which an electric field is applied to a sample in order to increase the permeability of cell membranes, thereby allowing drugs, chemicals and genetic material to be introduced into target cells.

Endonucleases: enzymes that catalyse the cleavage of DNA by cutting the phosphodiester bonds in nucleic acid strands.

Guide RNA, gRNA: can take two configurations; a two-component cr:tracrRNA, or a one-component sgRNA. A gRNA in the form of a cr:tracrRNA consists of a crRNA and a tracrRNA bound to each other. A gRNA in the form of an sgRNA consists of crRNA and tracrRNA linked together by a sequence called the linker loop. Regardless of configuration, gRNAs are capable of binding to Cas proteins and are used to recognize DNA regions of interest, thereby guiding the Cas protein to a certain target site.

HNH: an endonuclease domain in the Cas9 protein responsible for cutting the target strand.

Leader sequence: a sequence ahead of the CRISPR array, which includes the promoter. The leader sequence has an important role in the process of spacer acquisition.

Linker loop: a sequence connecting the crRNA and tracrRNA.


Motif: a sequence motif is a short, recurring sequence pattern in DNA presumed to have a biochemical function, for example to indicate binding sites for proteins such as nucleases (e.g. Cas proteins). A sequence motif is distinguished from a structural motif; the latter describes a specific part of a protein's 3D structure.

Multiplex: a quality of CRISPR-Cas9 systems, where several genome sites in a cell are targeted and cut simultaneously, with the aid of multiple gRNAs.

Palindromic repeats: sequences that read the same forwards as backwards on two complementary strands.

PAM: a short, recurring sequence (motif) required for target site recognition by Cas proteins. Different Cas proteins are specific for different PAMs; for instance, the Cas9 PAM motif is NGG. PAMs are located immediately adjacent to the target protospacer in various organisms' DNA.

pre-crRNA: a single spacer sequence flanked by repeat fragments. The CRISPR array is transcribed into a long precursor CRISPR RNA (pre-crRNA), prior to being cleaved into mature crRNAs. In the case of CRISPR Type II, assistance of tracrRNA, Cas9 and RNase III is required to process pre-crRNA into mature crRNA.

Protospacer: a short DNA sequence derived and collected from foreign pathogens. Protospacers are directly connected to PAM motifs in foreign genomes. When the protospacer is inserted into the CRISPR array in the bacterial genome it is termed a spacer.

RNP complex: the Cas9 protein in complex with gRNA (cr:tracrRNA or sgRNA).

RuvC: a domain, common in multiple types of Cas proteins, capable of cutting the non-target strand.

Seed sequence: an 8-12 bp sequence of the guide RNA that must be strictly complementary to the target DNA.

Single guide RNA, sgRNA: a single RNA sequence composed of a crRNA linked to a tracrRNA via a linker loop.

Spacers: ~20 bp sequences of various foreign DNA that are inserted and found within the CRISPR array. Spacers are small, unique fragments of DNA gathered from viruses that have previously tried to infect the bacterial cell. The incorporated spacers are used to express short guide RNAs, called crRNAs. By comparing the spacer with the complementary foreign DNA (the protospacers), CRISPR-Cas systems can identify and eliminate foreign DNA.

Spacer acquisition: the insertion process of new foreign DNA fragments into the CRISPR array.

tracrRNA, trans-activating CRISPR RNA: serves as a binding scaffold for the Cas protein and the crRNA. The tracrRNA is fixed and not programmable (in contrast to crRNAs).


Contents

1. Introduction
1.1 Aim
1.2 Background

2. History
2.1 Genetic engineering in historical perspective

3. Theory
3.1 The CRISPR-Cas system
3.1.1 Basic technical background of CRISPR-Cas9
3.1.2 The protein structure of CRISPR-Cas9
3.1.3 The DNA repair mechanisms of eukaryotic cells
3.1.4 Cas classes and types
3.1.5 of the CRISPR locus
3.1.6 Spacer acquisition into the CRISPR array
3.1.7 Introduction of CRISPR-Cas9 into cells
3.1.7.1 DNA delivery
3.1.7.2 delivery
3.1.7.3 RNA delivery
3.1.7.4 Protein delivery (RNP delivery)
3.1.7.5 Physical and chemical delivery methods (non-viral methods)
3.1.7.6 Cytosolic vs nucleic insertion
3.1.7.7 In vivo and ex vivo delivery
3.1.8 Modifications of the CRISPR-Cas9 system
3.1.8.1 dCas9, CRISPRa and CRISPRi
3.1.8.2 Nickases
3.1.8.3 Base editing and

4. The breakthrough
4.1 The discovery of the CRISPR system
4.2 Comparison with other methods

5. Applications and prospects
5.1 CRISPR multiplexing
5.2 Diagnostics and drug development
5.3 Food industry and agriculture
5.4 Regulations and GMO
5.5
5.6 Anti CRISPR proteins

6. Limitations
6.1 Technical limitations and knowledge constraints

7. Ethics
7.1 Diseases, diagnostic tools and patents
7.2 Designer babies
7.3 Ecosystems and gene drive
7.4 Moratorium

8. Concluding remarks

9. References


1. Introduction

1.1 Aim

The aim of this paper is to present a broad yet thorough and up-to-date description of the CRISPR-Cas system. Particular emphasis is placed on the CRISPR-Cas9 system, which is currently the best-known CRISPR system and has been found applicable in many research areas due to its relative simplicity and favourable modifications. The theoretical knowledge of its design, multiple functions and subsequent outcomes discussed in this paper will provide a valuable understanding of its merits and demerits, and serve as a foundation for deeper ethical discussions concerning its use on our planet. To provide further insight into and experience of the technology, a suggestion for a CRISPR laboratory exercise suitable for undergraduate students is given in the paper's appendix. The CRISPR tool has enabled rapid and intriguing advancements in genetic engineering in recent years, and is believed to hold great promise for the future development of medicine, agriculture and industry. Hence, as CRISPR continues to break new ground among scientists worldwide, it is time to gain or widen our knowledge and understanding of this flourishing, fascinating, but not flawless technique within genome editing.

1.2 Background

We are currently experiencing a biotechnological revolution, spearheaded by the CRISPR-Cas technology. Even though the CRISPR technique is derived from, and characterized as, an adaptive immune response in bacteria towards viruses, it has proven to be applicable across a wide range of organisms including plants, animals and humans. Hence, it has proven to be a valuable and versatile tool for combating genetic diseases, improving agricultural and industrial processes, and expanding knowledge within many research areas such as microbiology and epigenetics.

The CRISPR-Cas system became popular due to its ability to perform double stranded cuts at specific target sites in various genomes, thereby opening up possibilities to edit genes and influence traits in a more controlled and simplified manner compared to prior genome editing techniques. Due to CRISPR's ability to execute cuts in genomes, it is sometimes referred to as a gene scissor or molecular scissor. Subsequent modifications of CRISPR-Cas9 have enabled the system to also perform single stranded DNA cuts, base substitutions, epigenetic modulations, and even control of gene expression without causing any damage to the organism's genome. These qualities are extremely valuable in genetic studies, but as exploration of the CRISPR system has proceeded during recent years, it has proven to be limited for many reasons. Today, CRISPR has proven not to be fully safe, and it can have devastating effects and severe consequences for both individuals and nature unless caution and careful consideration are exercised now and in the future. The detected hurdles must be overcome before we can use CRISPR as a safe and reliable tool in therapeutics, or in any other application area. If we do not find a way to cope with these obstacles, CRISPR will not fulfill the high hopes many scientists have for the technology.

Current knowledge about CRISPR technology in society is limited, which is a major concern since the technique seems to have the potential to affect all beings on Earth in one way or another in the near future. Not only may CRISPR be utilized to treat or cure genetic diseases, it might also be used to select or enhance desirable traits such as a specific eye color, height, muscle mass, creativity, intelligence or any other quality linked to specific genes. It is of immense importance that the CRISPR discussion is not a question of perceptions, attitudes or narrow thinking, but of deep background knowledge, empirical research and widened thinking. How can we use CRISPR responsibly now and in the future? How should we make a decision when opportunity meets contingency - will it be worth the risk? Who makes the decision?

The purpose of this report is to provide an independent and objective description of the CRISPR-Cas system, with the intention of enabling readers to form their own knowledge-based view of its future prospects and concerns. The content focuses on CRISPR's technical design, beneficial modifications, current and future application areas and limitations, and further gives a brief history of genetic engineering and a comparison of CRISPR with prior genome editing techniques.


2. History

2.1 Genetic engineering in historical perspective

Humans have, since prehistoric times, found ways to manipulate nature in their favor. A long process of selective breeding has led to new, improved breeds, such as cattle, that serve human interests in different ways, but the general law of inheritance has taken many generations to understand. Starting from 1856 [1], an Austrian monk named Gregor Mendel conducted experiments crossing peas and studying their phenotype through generations. His findings spawned a new take on the law of genetic inheritance. He was the first scientist to describe heredity as traits inherited in pairs, later termed alleles, with dominant or recessive characteristics [2][3].

Mendel's work would come to inspire scientists throughout history. Although his work was not recognized within his lifetime, the study of genetics was set in motion in the early 1900s when William Bateson, an English biologist, discovered the work of Mendel and other early researchers [3]. William Bateson proceeded to investigate Mendel's work and broadened the understanding of heredity and genes. Bateson became known as the founder of the field of genetics and laid much of the groundwork for the genetic studies ahead [3].

Fast forward to 1940-1960, when scientists aimed to understand the fundamental principles of the genome. The double helix structure of DNA was determined by James Watson and Francis Crick in 1953 [4], along with other prominent breakthroughs such as the understanding of the central dogma within genetics [5]. The rapid advancement within genetics gave rise to a completely novel field called genetic engineering in the early 1970s [6]. At this point in time, Hamilton Smith and colleagues found the first type II restriction enzyme, termed HindII, an enzyme able to cut DNA at fixed positions. They determined its recognition sequence and produced a comprehensive explanation of the mechanism [7].

A few years later the first recombinant DNA molecule was created by a biochemist named Paul Berg. The recombinant molecule consisted of two different viral genomes, created with restriction enzymes and DNA ligase. Berg later went on to create a recombinant plasmid using the same technique [8][9]. Due to uncertainties, Berg did not follow through with his original plan to insert the recombinant plasmid into E. coli, but Berg's new method involving the creation of a recombinant DNA molecule was later awarded the Nobel Prize, and opened up new prospects within agriculture and medical development.


Roughly a year later, Herbert Boyer and Stanley Cohen followed up on Berg's profound technique and inserted the recombinant molecule into bacteria, presenting the first recombinant DNA organism. Cohen was a biochemist specialized in the isolation of plasmids and their transformation into bacteria. Boyer, on the other hand, was skilled at using restriction enzymes to isolate specific predetermined DNA sequences. By sharing their previous knowledge, Boyer and Cohen were able to create a new method called gene cloning: the creation and transformation of a recombinant DNA plasmid into bacterial cells, where the plasmid would remain functional and expressible through generations [10].

Genetic engineering research expanded at a fast rate and the first transgenic animals were created in 1973 [11]. In truth, a great deal of the fundamentals in genetics were still poorly understood. The lack of knowledge led numerous scientists to voice concerns regarding the novel transgenic organisms. In 1974 [12], the National Academy of Sciences in the U.S. established a moratorium. One year later, a meeting involving over a hundred biologists took place at California's Asilomar conference center to discuss the risks and dangers of the newly developed methods. Principles for risk assessment were set in place, and a refined version of these guidelines is still used today.

Nevertheless, scientists all over the world continued to uncover more about gene editing, and in 1975 [13] a new technology published by Frederick Sanger enabled previously impossible experiments to be performed. The technology was created to determine the sequence of bases in nucleic acid molecules, in order to verify Francis Crick's "sequence hypothesis"; a hypothesis suggesting that the bases in a DNA sequence encode specific amino acids [14]. The method brought expanded opportunities to study genome sequences in detail.

In the years 1982-1996 [15], genetic engineering began to be implemented in industrial processes. Synthetic insulin produced by bacteria became available, and the first GMO food could be bought in stores. During this time, a revolutionary technology called PCR was invented and refined by Kary Mullis. The PCR technique enabled amplification of small DNA pieces, leading to advancements within fields working with isolating and identifying genetic material [16].

Many years have passed since Gregor Mendel first published his research. From that moment on, the genetic editing field has kept expanding and is presently a well established field of study. With different genetic tools we can treat illnesses, bring forth new medications, produce from algae, and much more.


3. Theory

3.1 The CRISPR-Cas system

When humans are infected by viruses, bacteria and other foreign pathogens, the body responds through the human immune system. The human immune system consists of an innate, non-specific immune response, as well as an adaptive, specific immune response. The adaptive defense mechanism is made up of a substantial collection of antibodies and lymphocytes that are produced by the body in order to recognize and eliminate reappearing invaders [17]. Like humans, bacteria can also be infected by viruses and other organisms. Viruses that infect bacteria are called bacteriophages (phages for short), and the adaptive, bacterial immune response following a phage invasion is the CRISPR-Cas system. Bacteria use the CRISPR-Cas system to degrade the DNA or RNA of incoming viruses, depending on their type of CRISPR-Cas system (see section 3.1.4 Cas classes and types) [18].

CRISPR-Cas is the bacteria's natural, heritable and adaptive defense mechanism against viruses. When phages inject their viral DNA into bacterial cells, the bacterial immune system rapidly responds by cutting the viral DNA into shorter sequences, thereby preventing proliferation and propagation of foreign genetic elements. The bacterium stores bits of the viral DNA at a specific site in its own genome called the CRISPR locus. These snipped bits of viral DNA sequence are termed protospacers, but once inserted into the CRISPR locus they are called spacers. The spacers are interspaced by identical repeats called direct repeats, or just repeats for short. The collection and insertion of foreign DNA into the CRISPR array is a way of keeping a record of the various viral infections the bacterium is exposed to during its lifetime. Hence, the CRISPR locus can be viewed as a genomic memory of phage infections, or as a continuously updated vaccination card in bacteria (figure 1).


Figure 1: The bacterial CRISPR. Here, a bacteriophage injects its DNA into the bacterium, and the bacterium cuts out a piece of it, called a protospacer, and inserts it into the CRISPR array in a process called spacer acquisition. An inserted protospacer is called a spacer.
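The repeat-spacer layout of the CRISPR array described above can be illustrated with a minimal Python sketch. It assumes an idealized array in which every direct repeat is perfectly identical; the function name and the toy sequences are hypothetical and much shorter than real repeats and spacers.

```python
def extract_spacers(crispr_array: str, repeat: str) -> list[str]:
    """Return the spacers that sit between consecutive direct repeats."""
    # Splitting on the repeat leaves the sequences located between repeats.
    pieces = crispr_array.split(repeat)
    # The first and last pieces lie outside the array (leader and trailing DNA),
    # so only the inner pieces are treated as spacers.
    return [piece for piece in pieces[1:-1] if piece]


# Hypothetical toy sequences, chosen only for illustration.
REPEAT = "GTTTTAGA"
toy_array = "AATT" + REPEAT + "ACGTACGT" + REPEAT + "TTGACCAA" + REPEAT + "CC"

print(extract_spacers(toy_array, REPEAT))
```

Each string in the returned list corresponds to one remembered infection in the "vaccination card" picture; real CRISPR arrays may contain slightly degenerate repeats, which this sketch does not handle.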

CRISPR is an acronym for clustered regularly interspaced short palindromic repeats, and it refers to the previously described pattern of interspaced, repeated DNA sequences that occur numerous times in a section of the bacterial genome. The full name for CRISPR is a bit of a mouthful, and the trend of complex terminology for the CRISPR-Cas system will continue as we dive deeper into this interesting topic of genetic engineering. In order to keep track in this grand vocabulary landscape, the acronyms, abbreviations and short descriptions of the frequently used words within CRISPR terminology are summarized at the beginning of the report.

3.1.1 Basic technical background of CRISPR-Cas9

The CRISPR-Cas9 system consists of two major components: a Cas9 protein and a guide RNA (gRNA). The guide RNA can come in two different forms: a two-component RNA molecule consisting of separate CRISPR RNA (crRNA) and trans-activating CRISPR RNA (tracrRNA), or a fused, one-component RNA molecule composed of crRNA and tracrRNA joined by a linker loop (figure 2). The two-component gRNA is occasionally referred to as cr:tracrRNA, and the one-component version of gRNA is generally termed a single guide RNA (sgRNA), but there are some linguistic ambiguities when it comes to the naming and division of the gRNA variants. Sometimes gRNA refers to the two-component molecule, whilst sgRNA is used for the fused molecule [19]. In order to avoid confusion, gRNA will henceforth consistently refer to either the cr:tracrRNA complex or the sgRNA, and the specific terms for each configuration will only be used when the difference between the two variants needs to be emphasized.

Figure 2: The CRISPR-Cas9 complex with two different gRNA configurations. The grey figure represents the Cas9 protein. To the left is a Cas9 protein attached to gRNA consisting of the two components crRNA in red and tracrRNA in blue, called a cr:tracrRNA complex. To the right is a Cas9 protein in complex with sgRNA in green.

The cr:tracrRNA complex is a naturally occurring configuration, whilst the sgRNA variant does not exist in nature, but was rather invented by Jennifer Doudna and Emmanuelle Charpentier in 2012 [20]. The sgRNA configuration is popular within the scientific community due to the simplicity of working with two molecules (a Cas protein and sgRNA) instead of three (a Cas protein, crRNA and tracrRNA).

The second essential component of CRISPR-Cas9 systems is the Cas9 protein. The Cas9 protein has nuclease activity and is often described as a molecular scissor due to its ability to cleave DNA strands. In order for Cas9 to navigate to a desired site in the genome it requires a guide RNA. The gRNA provides crucial guidance and DNA anchor sites for the Cas9 protein; hence, Cas9 can neither find nor attach to DNA by itself, but must associate with a gRNA first. Only then can Cas9 execute cuts in DNA sequences.

Regardless of the gRNA configuration - cr:tracrRNA or sgRNA - the main function of the tracrRNA part is to attach to both the crRNA molecule and the Cas9 protein in order to link them together, enabling the Cas protein to cut the target DNA strand. The crRNA, on the other hand, targets viral DNA by providing a sequence, the spacer, which is complementary to the viral sequence, the protospacer. In other words, the crRNA sequence provides a target site at which the Cas9 complex attaches to a foreign genome. The variability and specificity of the CRISPR system is, among other things, enabled by the programmable crRNAs. The crRNA is the customizable part of the gRNA. By modifying the spacer sequence within the crRNA it is feasible to recognize and target any foreign nucleic acid. In this way, researchers can design a sequence that matches whichever gene they want to study or edit.

As previously mentioned, the crRNA strand includes a sequence called the spacer. The crRNA strand can be 45 nt long (compared with a 75 nt tracrRNA), in which an ~20 nt sequence at the 5'-end of the crRNA is the spacer [20]. The spacer originates from a unique DNA fragment gathered from foreign pathogens that have previously tried to infect the bacterium. Once the foreign protospacers are inserted into the CRISPR locus, they are termed spacers. In this way, an RNA spacer sequence can be complementary to a DNA strand of a future invader. The most crucial part of the spacer region is an 8-12 bp long sequence at the 3'-end called the seed sequence. The nucleotides of the seed sequence must match the protospacer sequence completely in order for the CRISPR-Cas9 system to work. Without a precisely matching seed sequence, no target DNA binding or cleavage will occur. Hence, the seed sequence is one of the most critical elements for achieving the desired function of CRISPR-Cas complexes. Figure 3 illustrates the crRNA, spacer and seed sequence. At times, the gRNA is depicted in a very simplified manner as a spacer and an sgRNA scaffold that binds to the Cas9 protein.

Figure 3: The left picture emphasizes where the spacer and seed sequence are located on the crRNA strand in a Cas9 complex. Right picture illustrates a Cas9 complex in a simplified manner as an sgRNA scaffold and a spacer.
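The seed requirement described above can be sketched in a few lines of Python. This is a simplified illustration, not a model of real Cas9 binding: both sequences are written 5' to 3' in DNA letters so that a perfect match is plain string equality, the seed length of 12 is taken from the upper end of the 8-12 bp range given in the text, and the function names and example sequences are hypothetical.

```python
def count_mismatches(spacer: str, protospacer: str) -> int:
    """Count the positions at which two equally long sequences differ."""
    return sum(a != b for a, b in zip(spacer, protospacer))


def seed_is_perfect(spacer: str, protospacer: str, seed_len: int = 12) -> bool:
    """True if the 3'-end seed region matches the protospacer exactly."""
    if len(spacer) != len(protospacer):
        raise ValueError("spacer and protospacer must have the same length")
    # The seed is the last seed_len positions, i.e. the 3' end of the spacer.
    return spacer[-seed_len:] == protospacer[-seed_len:]


# Hypothetical 20 nt spacer and two single-mismatch targets.
spacer = "GACGTTACCGGATTCAGCTA"
distal_mismatch = "TACGTTACCGGATTCAGCTA"    # differs at the 5' end, outside the seed
proximal_mismatch = "GACGTTACCGGATTCAGCTT"  # differs at the 3' end, inside the seed

print(seed_is_perfect(spacer, distal_mismatch))    # seed intact
print(seed_is_perfect(spacer, proximal_mismatch))  # seed broken
```

Both targets differ from the spacer at exactly one position, but only the mismatch inside the seed region fails the check, which mirrors the statement that the seed must correspond completely to the protospacer.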

Yet there are several other essential components needed in order for the Cas9 complex to bind to foreign DNA. The PAM sequence is vitally important as well and must not be forgotten. PAM is an acronym for protospacer adjacent motif. As the full name implies, a PAM is a motif located directly next to the protospacer in the genomes of various viruses and organisms. Different Cas proteins recognize specific PAM sequences. When the Cas protein binds to a PAM motif, the adjacent dsDNA helix starts to unwind and the strands are separated, enabling the crRNA spacer to bind to the target DNA [21] (figure 6). In the presence of a PAM and a matching spacer sequence, the Cas protein will cut the target DNA approximately 3 bp upstream of the PAM sequence [20]. A powerful advantage of the CRISPR-Cas system is the ability to target any sequence about 20 bp in length on either DNA strand, as long as it is adjacent to a PAM motif.

PAM is a short, conserved and commonly occurring sequence required for target site recognition. PAMs are located on the non-target strand (figure 2), are only a few base pairs in length, and the specific sequence differs depending on which type of Cas protein is present in the cell. For instance, NGG is the most common PAM sequence recognized by Cas9 proteins, where N stands for any nucleotide and G denotes the nucleotide guanine. The possibility of using various Cas proteins to target a diverse set of PAM sequences further increases the versatility of CRISPR-Cas systems. As time goes by, scientists continuously discover and identify new Cas proteins from other species that target novel PAM motifs, providing even further flexibility in target sequence selection.
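The targeting rule described above (a roughly 20 bp protospacer directly followed by an NGG PAM, with the cut about 3 bp upstream of the PAM, on either strand) can be sketched as a simple sequence scan. The function names and the example sequence are hypothetical, the input is assumed to be uppercase DNA, and no off-target or efficiency scoring is attempted.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(strand: str, protospacer_len: int = 20):
    """Yield (protospacer, pam, cut_index) for every NGG PAM on this strand.

    cut_index is the index of the first base after the blunt cut, so the
    cut falls between strand[cut_index - 1] and strand[cut_index].
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    for match in pattern.finditer(strand):
        pam_start = match.start() + protospacer_len
        # The cut lies 3 bp upstream of the PAM, inside the protospacer.
        yield match.group(1), match.group(2), pam_start - 3


# Hypothetical target: 2 nt flank + 20 nt protospacer + TGG PAM + 2 nt flank.
sequence = "TT" + "GACGTTACCGGATTCAGCTA" + "TGG" + "AA"

for site in find_sites(sequence):
    print("forward strand:", site)
for site in find_sites(reverse_complement(sequence)):
    # Coordinates here refer to the reverse-complemented string.
    print("reverse strand:", site)
```

Scanning the reverse complement as well reflects the point that a site on either DNA strand can be targeted; swapping the `[ACGT]GG` part of the pattern would adapt the sketch to a Cas protein with a different PAM.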

The frequent presence of PAMs in many viruses and eukaryotes, including humans and plants, is the main reason why CRISPR-Cas systems are immensely useful and applicable in multiple areas within biology. Notably, the spacers stored in the bacterium's own CRISPR array are not flanked by PAMs, which prevents the bacterium's own CRISPR locus from being cleaved by its Cas proteins.

To summarize, the effective complex called CRISPR-Cas9 is formed when a Cas9 protein combines with a gRNA derived from the CRISPR array. The Cas protein recognizes and binds to a PAM sequence, which is followed by base pairing of the spacer to the foreign protospacer sequence. In the absence of either a guide RNA or a PAM, Cas9 will neither bind nor cut the target DNA. If the Cas complex pairs correctly with the PAM sequence and the protospacer (especially within the seed sequence), the Cas protein activates its endonuclease activity and cuts the invading DNA. The resulting double stranded break (DSB) causes the viral DNA to be degraded. This is the process by which bacteria fight viral infections. If the CRISPR technology is applied to eukaryotic cells, a DSB will cause the natural DNA repair systems of the cell to be activated in order to reconnect the DNA strands. The problems and potential benefits of these eukaryotic repair capabilities are further discussed in section 3.1.3.


3.1.2 The protein structure of CRISPR-Cas9

A pivotal figure in the bacterial fight against viruses is the CRISPR associated protein, or Cas protein for short. The endonuclease Cas9 is the most commonly employed today and was discovered in Streptococcus pyogenes [20]. To get an improved understanding of the CRISPR-Cas9 system, the protein structure is studied with respect to the Cas9 protein of Streptococcus pyogenes (SpCas9 for short).

SpCas9 is a large, monomeric, multidomain protein consisting of 1368 amino acids, corresponding to 4.10 kbp and 160 kDa [23][24]. Cas9 is a multifunctional protein and, in addition to their role in cleavage, Cas9 domains are also responsible for PAM recognition and the maturation of crRNA. Cas9 has a bilobed structure; a design composed of two lobes with dissimilar characteristics. The first lobe is the nuclease (NUC) lobe (residues 1-60 and 718-1368), possessing nuclease activity, and the second is the recognition (REC) lobe (residues 94-717), an alpha-helical lobe with the ability to recognize the gRNA. The two lobes are connected by a critical bridge helix (residues 60-94), along with a disordered linker (residues 712-718) [25] (figure 4). The bridge helix is also critical for the creation of the essential R-loop [22], explained further down.

Figure 4: The bilobed Cas9 structure consisting of the NUC and REC lobe is connected by a bridge helix. The NUC lobe (grey) has the nuclease activity of the Cas9 protein, and contains two endonuclease domains called HNH (orange) and RuvC (yellow), and a PI domain (pink) responsible for PAM recognition. The REC lobe (green) recognizes the gRNA. To the right is a three dimensional structure of SpCas9 integrated with sgRNA and target sequence (pdb code 4OO8).
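The size figures quoted for SpCas9 can be checked with a short back-of-the-envelope calculation. The sketch below assumes three nucleotides per amino acid (ignoring the stop codon) and an average residue mass of about 110 Da, a common rule of thumb that is not taken from the thesis; the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110  # rule-of-thumb average, an assumption


def coding_length_bp(n_residues: int) -> int:
    """Length of the coding sequence, three base pairs per amino acid."""
    return 3 * n_residues


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass estimate from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000


SPCAS9_RESIDUES = 1368

print(coding_length_bp(SPCAS9_RESIDUES), "bp")
print(approx_mass_kda(SPCAS9_RESIDUES), "kDa (rough estimate)")
```

The coding length of 4104 bp agrees with the stated 4.10 kbp, and the rough mass estimate of about 150 kDa is of the same order as the stated 160 kDa; the difference reflects the crude average residue mass used here.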

The REC lobe includes three α-helical domains which are necessary for the conformational change that enables Cas9 to bind viral DNA and initiate the creation of the R-loop. The first major alteration of the Cas9 structure occurs in the REC lobe when it binds to gRNA. This structural change activates the DNA-binding properties of Cas9, and eventually the cleavage activity. The conformational change creates a central tunnel between the NUC and REC lobes where target DNA can bind, which means that the spacer region is located in the tunnel of Cas9.

The binding of a viral DNA strand also relies on recognition of a PAM region, which is performed by the PAM interacting (PI) domain. The PI domain is positioned close to the C-terminus of Cas9 (residues 1099-1368). The PAM interaction occurs through base-specific interactions with the target DNA and contributes to the dynamic change of the protein [25].

The NUC lobe and its associated domains - RuvC, HNH and PI - are involved in the large structural changes that accompany cleavage of the targeted dsDNA. The NUC lobe consists of two catalytic domains, HNH (residues 775–908) and RuvC (residues 1–59, 718–769 and 909–1098) [155][44], as well as a third domain located at the C-terminus, the PI domain. The RuvC domain has three subdomains located adjacent to the N-terminus and the HNH domain. The RuvC domain is responsible for the cleavage of the DNA strand that is not complementary to the spacer, the so-called non-target DNA strand (figure 5). During the cutting stage, this non-target strand is directed through a separate, more compact tunnel generated in the NUC lobe [26]. The HNH domain, however, is responsible for the cleavage of the target DNA strand, which is complementary to the spacer. The HNH domain is located close to the center of Cas9 and has been shown to be highly mobile. The domains cut the unwound DNA strands 3-4 nucleotides upstream of the PAM sequence, making blunt ends [27].
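The residue ranges quoted in this section can be collected into a small lookup table. The following Python sketch is only an illustration: the dictionary `SPCAS9_REGIONS` and the function `regions_at` are hypothetical names, and the ranges are copied as stated in the text, including the boundaries where two annotations overlap.

```python
# Residue ranges for SpCas9 as quoted in this section (1-based, inclusive).
SPCAS9_REGIONS = {
    "RuvC": [(1, 59), (718, 769), (909, 1098)],
    "bridge helix": [(60, 94)],
    "REC lobe": [(94, 717)],
    "linker": [(712, 718)],
    "HNH": [(775, 908)],
    "PI": [(1099, 1368)],
}


def regions_at(residue):
    """Return every annotated region that contains the given residue."""
    return [
        name
        for name, spans in SPCAS9_REGIONS.items()
        if any(start <= residue <= end for start, end in spans)
    ]


for position in (10, 94, 840, 1200):
    print(position, regions_at(position))
```

Returning a list rather than a single name keeps the overlapping boundaries visible, for example residue 94, which the text assigns to both the bridge helix and the REC lobe.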

Figure 5: Binding of gRNA to the REC lobe initiates a series of conformational changes, resulting in a fully activated protein. The target DNA can be attached when the complex of Cas9 and gRNA has connected. The PI domain recognizes the PAM region and is followed by pairing of the gRNA seed sequence. The NUC-lobe’s HNH and RuvC domains are responsible for the cleavage of target dsDNA.

When the REC lobe has associated with the gRNA, the Cas9 protein begins probing for matching target DNA sequences through PAM recognition. The PAM sequence is recognized by the PI domain, located in the NUC lobe. Upon PAM recognition, the target DNA strands start to unwind, the spacer and the protospacer start to pair (especially the seed sequence), and a new structure called the R-loop is created between the lobes. An R-loop is a three-stranded structure composed of a gRNA strand and two unwound DNA strands. Although the R-loop formation is not fully understood, it is crucial for the cleavage of the target DNA [26]. The R-loop formation starts when gRNA binds to DNA and develops into an RNA:DNA heteroduplex located between the NUC and REC lobes (figure 6). The loop formation will eventually fully activate Cas9 [28].

Figure 6: When the Cas9 recognizes the PAM sequence, the target DNA starts to unwind, forming an R-loop structure. The R-loop expands when bases between gRNA and target DNA match. When the entire spacer region of the gRNA is bound to the target DNA, Cas9 makes a double stranded cut.

The HNH domain is initially positioned away from the cleavage site and is not active in the early stages of R-loop formation, since the loop area is not yet spacious enough. If the sequence in the target region matches, the R-loop continues to grow until a final total of 18 matching base pairs has been reached. The expanded R-loop creates enough room for the HNH domain to move into the target region, guided by the REC lobe, and adopt its active form. When the HNH domain becomes active, it rotates 180 degrees and the Cas9 complex undergoes its second major conformational alteration. The new conformation contributes to the cleavage activity of both RuvC and HNH [29] (figure 7).
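The stepwise R-loop expansion can be pictured with a toy model in Python. This is a strong simplification and not a description of the real kinetics: it assumes that pairing proceeds base by base from the PAM-proximal end, that the first mismatch stops the expansion, and that 18 matched positions are required before the HNH domain can move in. The function names are hypothetical.

```python
def rloop_extent(spacer, protospacer):
    """Count consecutive matches starting from the PAM-proximal (3') end.

    Both sequences are written 5'->3' as DNA on the non-target strand, so a
    perfect target is simply identical to the spacer (a simplification).
    """
    matched = 0
    for s, p in zip(reversed(spacer.upper()), reversed(protospacer.upper())):
        if s != p:
            break  # in this toy model a mismatch stalls the expansion
        matched += 1
    return matched


def hnh_can_dock(spacer, protospacer, required=18):
    """Toy criterion: the R-loop must reach `required` matched positions."""
    return rloop_extent(spacer, protospacer) >= required


spacer = "GATTACAGATTACAGATTAC"                  # hypothetical spacer
distal_mismatch = "C" + spacer[1:]               # mismatch far from the PAM
seed_mismatch = spacer[:-3] + "G" + spacer[-2:]  # mismatch close to the PAM

for target in (spacer, distal_mismatch, seed_mismatch):
    print(rloop_extent(spacer, target), hnh_can_dock(spacer, target))
```

In this model a mismatch near the PAM blocks activation while a mismatch at the far end is tolerated, which is one way to read the emphasis on the seed sequence.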

Figure 7: The critical dynamic changes of the Cas9 protein structure upon target binding and R-loop formation. HNH (orange) moves into the space created during the R-loop expansion, fully activating the nuclease activity of Cas9.

Many features of the Cas9 domains, their conformational changes and how they differ between various CRISPR-Cas types are still unknown. Extensive research is currently under way to resolve the mechanisms of the CRISPR-Cas components, for instance the role of the PAM sequence in R-loop formation in SpCas9 [28].

3.1.3 The DNA repair mechanisms of eukaryotic cells

When a bacterium uses its CRISPR-Cas system to eliminate harmful invaders, the DNA break causes complete degradation of the viral DNA, which benefits the bacterium. But if the CRISPR-Cas9 system is introduced into a eukaryotic cell in order to genetically modify a targeted gene, the DNA break initiates a natural repair response inside the cell. There are two major mechanisms for repairing a double stranded break (DSB) in eukaryotic cells [30].

The first repair system is called nonhomologous end joining (NHEJ). As soon as CRISPR has cleaved the DNA, the cell will rapidly try to reconnect the blunt ends via NHEJ. Two things can happen during the NHEJ repair phase: the original strands are perfectly restored, or mutations are incorporated before the strands are reconnected (figure 9). In the first case, the restored original DNA sequence will once again be recognized and cut by CRISPR-Cas. This will occur over and over again until a mutated, unrecognizable sequence is formed. The NHEJ repair pathway is described as error-prone, meaning that there is a high chance of insertions or deletions of nucleotides at the site of the DSB. The result is a DNA sequence different from the original, which can have various effects on the expression of the gene [30].
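The repeated cut-and-repair cycle described above can be sketched as a small simulation in Python. All numbers are assumptions chosen for illustration: the probability of error-free repair (`p_perfect`), the restriction to single-nucleotide insertions or deletions, and the example sequence are hypothetical and not taken from the cited sources.

```python
import random


def nhej_repair(site, cut_index, rng, p_perfect=0.7):
    """Rejoin the two ends; sometimes insert or delete one base at the break."""
    left, right = site[:cut_index], site[cut_index:]
    if rng.random() < p_perfect:
        return left + right                       # error-free rejoining
    if rng.random() < 0.5:
        return left + rng.choice("ACGT") + right  # 1 nt insertion
    return left[:-1] + right                      # 1 nt deletion


def cut_until_mutated(site, cut_index, p_perfect=0.7, seed=0, max_cycles=1000):
    """Repeat cutting and repair for as long as the original site persists."""
    if not 0 < cut_index < len(site):
        raise ValueError("cut_index must lie inside the site")
    rng = random.Random(seed)
    current, cycles = site, 0
    while current == site and cycles < max_cycles:
        current = nhej_repair(current, cut_index, rng, p_perfect)
        cycles += 1
    return current, cycles


edited, cycles = cut_until_mutated("ACGTACGTAC", cut_index=5)
print(edited, cycles)
```

The loop ends only when the sequence differs from the original target, which corresponds to the point where CRISPR-Cas can no longer recognize the site.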

The second eukaryotic repair pathway is called homology directed repair (HDR). The HDR process requires a dsDNA or ssDNA template complementary to the region of the break. HDR is commonly applied within gene editing to introduce point mutations or gene insertions at the site of the DSB. This is accomplished by designing a repair template with ends matching the regions upstream and downstream of the DSB, called the left and right homologous arms (figure 8). The homologous arms are necessary for the template to anneal to each flank of the break. Besides the homologous sequences, the repair template should also contain the desired insert or mutation [30]. With this technique, it is possible to select the nucleotides that will be incorporated at the site of the break, rather than letting them be formed randomly as in NHEJ.

Figure 8: A designed repair template with left and right homologous arms. The homologous arms are complementary to the region downstream and upstream of the DSB. The illustrated repair template contains an insert positioned in the center.

The HDR method of gene editing is commonly employed when a specific mutation is necessary in a gene. Another application using HDR is the knockout technique; here the repair template is designed to contain a mutation leading to an early stop codon in the gene, causing the gene to lose its function.
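The layout of a repair template, with a left arm, an insert and a right arm, can be sketched in Python as follows. The example is deliberately simplified: the 10 nt arm length is far shorter than arms used in practice, the gene sequence is hypothetical, and `apply_hdr` merely copies the insert into the break instead of modelling the real repair machinery.

```python
def build_repair_template(genome, cut_index, insert, arm_length=10):
    """Return left arm + insert + right arm around the planned DSB position."""
    if cut_index < arm_length or cut_index + arm_length > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    left_arm = genome[cut_index - arm_length:cut_index]
    right_arm = genome[cut_index:cut_index + arm_length]
    return left_arm + insert + right_arm


def apply_hdr(genome, cut_index, template, arm_length=10):
    """Toy HDR: copy whatever lies between the two arms into the break."""
    insert = template[arm_length:len(template) - arm_length]
    return genome[:cut_index] + insert + genome[cut_index:]


gene = "ATGGCTGCTAAAGGTCCTGAAGTTCTGCGT"  # hypothetical 30 nt reading frame
template = build_repair_template(gene, cut_index=15, insert="TAA")
edited = apply_hdr(gene, 15, template)

print(template)
print(edited)
```

Here the insert is the stop codon TAA placed in frame after the fifth codon, which corresponds to the knockout strategy mentioned above.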

Figure 9: Comparison of the two alternative repair pathways - HDR and NHEJ - in eukaryotic cells and the possible outcomes for each repair pathway.

Even when a repair template is present and HDR is desired, NHEJ can still occur. The DSB itself is considered toxic and can induce a p53 (tumor suppressor protein) damage response in human cells, favoring the NHEJ pathway [31]. The efficiency of HDR has been shown to be generally low, but different techniques can be utilized to favor HDR over NHEJ [32]. Although both repair systems can lead to a loss-of-function mutation, NHEJ is less predictable and offers no way to control how the sequence will be altered by the repair. NHEJ is therefore an undesirable repair pathway when the aim is to make specific mutations, gene modifications or insertions. Even so, NHEJ is the most common repair pathway used in nature by eukaryotic cells [33].

3.1.4 CRISPR-Cas classes and types

Due to the ongoing discovery of new CRISPR-Cas systems, different classes and types have been established in order to keep track of the pieces in this CRISPR puzzle. The most renowned CRISPR-Cas system is SpCas9, a Cas9 protein found in and isolated from Streptococcus pyogenes, but several systems have been discovered in other organisms as well. To date, CRISPR-Cas systems are divided into two classes, six types and around thirty subtypes, but the numbers are continuously changing [34]. For example, SpCas9 is classified as Class 2, type II, subtype II-A.

The two classes are separated based on the so-called signature Cas genes and the number of Cas proteins used in the CRISPR-Cas process. Class 1 systems involve multiple Cas proteins, whilst Class 2 systems rely on a single Cas protein encoded by a single gene [35]. Class 2 is much less common in nature than Class 1, but has been studied more extensively since the discovery of Cas9 [34].

The subdivisions of the classes are called types, which are separated based on various characteristics concerning processing and interference, such as domain properties, PAM sequences, target sequences (DNA or RNA) and cleaving (blunt ends, staggered ends or single stranded cuts). Class 1 contains three different types termed Type I, Type III and Type IV, and Class 2 consists of Type II, Type V and Type VI [34] (figure 10). The following text describes certain qualities for each type.

Class 2 type II: A well-known system that normally goes by the name of CRISPR-Cas9 and is one of the most studied and applied types. Cas9 is the signature protein of the type II group. Type II is unique among the different types, since the protein has the capacity to perform both targeting and cutting without recruiting additional molecules. The type depends on RNase III, along with the tracrRNA, for the maturation of pre-crRNA. Type II systems contain a RuvC domain and an HNH domain which together cleave dsDNA, resulting in blunt ends. Due to the RNase III dependency, type II systems do not exist in archaea, since RNase III is not found in archaea [34].

Class 2 type V: Encodes the signature protein Cas12, which can be divided into separate groups such as Cas12a and Cas12b. The Cas12 protein attacks dsDNA with its RuvC domain and makes a staggered cut in the targeted genome. Type V CRISPR-Cas systems do not require tracrRNA for Cas guidance; instead they use a single crRNA to pinpoint target DNA [35].

Class 2 type VI: Encodes a single Cas13 protein, which has the ability to target RNA and does not require a PAM or a tracrRNA when targeting. The cutting activity of Cas13 is unique since it is followed by a course of events called collateral cleavage. Type VI has been applied in new diagnostic methods, since its two HEPN nuclease domains cleave ssRNA [35].

Class 1 type I: Involves the Cascade complex [36]. Cascade comprises five Cas proteins and, unlike many other systems, requires a large target sequence. The Cascade complex recognizes the target sequence but does not possess nuclease activity. In order to perform the cut, Cascade needs to recruit a Cas3 protein for cleavage [34].

Class 1 type III: Has the ability to cleave DNA without recruiting additional nuclease enzymes. A signature Cas protein of type III is Cas10, which can target both RNA and DNA, depending on the subtype. Cas10 recognizes and cuts the target sequence by forming a Cascade-like complex [37].

Class 1 type IV: Has multiple active Cas proteins, such as Cas5, Cas7 and Cas8. Type IV is a broad type which has not been well explored. Recently, it has been shown that the different subtypes are highly diverse [38].
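The classification described above lends itself to a simple lookup structure. The Python sketch below only restates what the preceding paragraphs say about each type. The names `CrisprType`, `CRISPR_TYPES` and `types_in_class` are hypothetical, and the short notes are paraphrases rather than a complete description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrisprType:
    cas_class: int
    key_proteins: str
    note: str


# One entry per type, summarizing the descriptions in this section.
CRISPR_TYPES = {
    "I": CrisprType(1, "Cascade, Cas3", "Cascade recognizes, Cas3 cleaves"),
    "III": CrisprType(1, "Cas10", "targets RNA and DNA depending on subtype"),
    "IV": CrisprType(1, "Cas5, Cas7, Cas8", "diverse and little explored"),
    "II": CrisprType(2, "Cas9", "blunt dsDNA cut, needs tracrRNA and RNase III"),
    "V": CrisprType(2, "Cas12", "staggered dsDNA cut via RuvC"),
    "VI": CrisprType(2, "Cas13", "targets RNA, collateral cleavage"),
}


def types_in_class(cas_class):
    """List the type names that belong to the given class (1 or 2)."""
    return [name for name, t in CRISPR_TYPES.items() if t.cas_class == cas_class]


print(types_in_class(1))
print(types_in_class(2))
print(CRISPR_TYPES["II"])
```

Keeping the class as a field of each type, rather than nesting the types under their class, makes it easy to add further attributes such as subtypes later on.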

Figure 10: Classification of the CRISPR-Cas systems, showing simplified processing and interference reactions.

3.1.5 Transcription of the CRISPR locus

The CRISPR locus is composed of tracrRNA genes, a Cas operon consisting of a collection of Cas genes, and a CRISPR array made up of a promoter, a leader sequence, unique spacers and identical repeats. The spacers are truncated pieces of viral DNA stored as memories of prior infections. The spacer sequences are about 20 bp in length and separated by repeats. The transcription of the CRISPR array yields a precursor CRISPR RNA (pre-crRNA), which is further processed and cut into smaller units called CRISPR RNAs (crRNAs), each made up of one spacer and one repeat connected to each other (figure 11).
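The processing of a pre-crRNA into individual crRNA units can be illustrated with a minimal Python sketch. It rests on several simplifications that are assumptions of this example: transcription is modelled as replacing T with U, every cut falls exactly at a repeat boundary, and no further trimming takes place. The repeat and spacer sequences are hypothetical.

```python
def transcribe(dna):
    """Simplified transcription: same sequence as the DNA, with U for T."""
    return dna.upper().replace("T", "U")


def process_pre_crrna(spacers, repeat):
    """Transcribe a repeat-spacer array and split it into crRNA units."""
    array_dna = repeat + "".join(spacer + repeat for spacer in spacers)
    pre_crrna = transcribe(array_dna)
    repeat_rna = transcribe(repeat)
    # Splitting at every repeat leaves the spacers; each crRNA unit is then
    # written as one spacer followed by one repeat.
    pieces = [piece for piece in pre_crrna.split(repeat_rna) if piece]
    crrnas = [piece + repeat_rna for piece in pieces]
    return pre_crrna, crrnas


repeat = "GTTTTAGAGCTA"                    # hypothetical short repeat
spacers = ["ACGTACGTACGTACGTACGT",         # hypothetical 20 nt spacers
           "TTGACCGGTAACCGGTTAAC"]

pre_crrna, crrnas = process_pre_crrna(spacers, repeat)
print(pre_crrna)
for crrna in crrnas:
    print(crrna)
```

Each printed crRNA carries a different spacer but the same repeat portion, which is what later allows one protein to be guided to many different targets.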

Figure 11: An overall description of the CRISPR-Cas9 locus, its components, its products after transcription/translation and the final product of a Cas9 complex in the left corner. The tracrRNA attaches to pre-crRNA by its anti-repeat sequence. In order to achieve mature cr:tracrRNA, RNase III and Cas9 are required.

At the beginning of the CRISPR array lies a region called the leader sequence. It is an AT-rich region, about 200-500 bp in length, which includes the promoter necessary for the transcription of the CRISPR array [39]. The leader sequence has a major role in the insertion of new foreign DNA fragments into the CRISPR array, a process called spacer acquisition (further described in section 3.1.6).

The transcription of a locus typically relies on promoters, which are regulatory sequences that define the start site of transcription. Promoters are located towards the 5’-end, upstream of a gene, and consist of a short sequence of base pairs where proteins called transcription factors can bind to initiate transcription. Many components of the transcription machinery can be used for CRISPR-Cas editing, but the most generally used in eukaryotic cells are RNA polymerase II and III (Pol II and III) [40]. The polymerases need to attach to a promoter site on the DNA to start the transcription of specific genes. There is a diverse set of promoters with various qualities and functions: they can stimulate high or low expression levels, give transient or long-lasting expression, be suitable for sequences of certain lengths, or target a particular tissue, e.g. muscle cells. For instance, Pol II and Pol III promoters, such as the U6 or U3 promoters, are suitable for expressing short RNAs such as gRNAs in eukaryotic cells [42][41]. Additionally, the H1 and 7SK promoters are frequently used in human cells [40], and the T7 promoter is commonly used for in vitro gRNA expression [43][44]. The Cas9 gene is usually expressed from the strong constitutive promoter TEF1, or CMV-driven promoters (e.g. CaMV35S, the 35S promoter of the Cauliflower mosaic virus), but other constitutive promoters such as RNR2 can also be used [44]. Target efficiency varies depending on the choice of promoter, so scientists have to choose a strategy suited to their particular research objective.

The above is a general description of the transcription of a CRISPR locus, but the molecular mechanisms for expression, crRNA processing and DNA interference differ drastically depending on which type of Cas protein is present in the bacterium or archaeon. On top of that, the gene positions, promoters, terminators and the direction of gene transcription vary among organisms and CRISPR types, and some of them remain entirely unknown. Therefore, only the most studied systems, Type I, II and III, are described further in this section.

The tracrRNA genes are in the vicinity of the Cas genes and the CRISPR array, but their exact position is partially unknown or differs depending on which bacterium, archaeon or type of CRISPR-Cas system is examined. For type II systems, the tracrRNA can be positioned upstream of, downstream of or within the Cas operon, or even downstream of the CRISPR array [45]. In figure 11, the tracrRNA is depicted at the beginning of the CRISPR locus. The tracrRNAs also differ in length, sequence and structure among species, but in type II systems they all share the common characteristic of containing an anti-repeat sequence capable of base pairing with each of the repeats in pre-crRNAs.

As described in section 3.1.4, the Cas9 protein belongs to the Type II CRISPR-Cas systems. This type requires tracrRNAs and ribonuclease III (RNase III) in order to achieve mature crRNAs. In Type II systems, tracrRNAs have a sequence that is partially complementary to the repeats within pre-crRNAs, called an anti-repeat sequence, and base pair with them to form RNA duplexes. In other words, the tracrRNA only attaches to the repeat region within the pre-crRNA, not to the spacer. The pre-cr:tracrRNA duplex formed is cleaved by RNase III in the presence of the Cas9 protein. The matured cr:tracrRNA complex associates with a Cas protein to form the ternary effector structure that is capable of performing double stranded cuts in invading genomes.

In contrast to the Type II maturation phase, Type I and Type III systems do not require tracrRNA molecules or RNase III for the pre-crRNA to be matured. Here, only a Cas6 protein is necessary for the cleavage of pre-crRNAs into a shorter, processed spacer-repeat crRNA. In order for the Cas6 protein to recognize a pre-crRNA, the repeat region needs to have formed a stem-loop structure. This is enabled by a highly palindromic pattern in the repeats [45]. This is not a distinct characteristic for the Type II repeats.

Type I and III also share common features regarding DNA interference. Once the crRNA has matured, an assembly of several Cas proteins is necessary for DNA degradation. This is in contrast to Type II, which requires only one Cas protein to perform DNA cleavage.

3.1.6 Spacer acquisition into the CRISPR array

The ability to modify the host genome after an attack by an unfamiliar phage, and to exploit the toxic cargo for future defense, is unique to bacteria and archaea. When a previously unencountered sequence is introduced into a bacterium, it is cleaved into smaller fragments called protospacers. The protospacers are inserted into the host’s CRISPR array, and only then are they termed spacers. These spacers are needed for transcription and for the later assembly of crRNAs. The selection of new spacer material is still a poorly understood process, and it is not known how a specific protospacer sequence is selected for insertion into the CRISPR array.

Nevertheless, scientists have uncovered several parts of the spacer acquisition process. It has been found that PAM sequences are essential when selecting new protospacers for Type I and II systems, whereas other types of CRISPR systems require more actively transcribed sequences [46]. Further, regardless of which type of CRISPR-Cas system is examined, they all contain the Cas1 and Cas2 genes in their operon, giving rise to proteins that are of high importance for carrying out spacer acquisition.

Although a detailed mechanism of spacer acquisition remains obscure, some proteins in the CRISPR-Cas system have been shown to be more significant than others. The Cas1 protein in complex with Cas2 interacts with foreign protospacers. When linked with a protospacer, the Cas1-Cas2 complex undergoes a structural rearrangement that has a stabilizing effect. By studying the free energy upon binding, it has been shown that types requiring a PAM sequence are more stable. The efficiency of the nuclease activity of Cas1 has also been shown to increase when a PAM sequence is present [47].

Adjacent to the CRISPR array is an AT-rich sequence called the leader. New genetic information, in the form of spacers, is mainly placed directly after the leader, and the spacers are separated by direct repeats. The incorporation of the protospacer into the CRISPR array involves multiple reactions. Firstly, a single stranded cut occurs at the end of the leader. Secondly, a nucleophilic attack connects the protospacer’s 3´-end to the cleavage site in front of the first repeat. Thirdly, the protospacer’s unconnected 3´-end changes position toward the first repeat, where a cleavage-ligation reaction occurs [46]. The enlarged array is now in an intermediate state and has to be repaired and ligated completely. The protospacer’s dsDNA has to be linked to the ssDNA of the repeats, which is accomplished by DNA polymerase. Once fully inserted, the protospacer is termed a spacer (figure 12).
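The net result of spacer acquisition, a new spacer placed directly after the leader together with one additional repeat, can be modelled with a small Python class. The sketch ignores the individual cleavage and ligation reactions and only tracks the order of the elements. The class `CrisprArray` and all sequences are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CrisprArray:
    leader: str
    repeat: str
    spacers: list = field(default_factory=list)  # leader-proximal spacer first

    def acquire(self, protospacer):
        """Insert the new spacer directly after the leader."""
        self.spacers.insert(0, protospacer)

    def sequence(self):
        """Leader, then repeat-spacer units, closed by a final repeat."""
        body = "".join(self.repeat + spacer for spacer in self.spacers)
        return self.leader + body + self.repeat


array = CrisprArray(leader="ATATAT", repeat="GGCC", spacers=["AAAA"])
before = array.sequence()
array.acquire("CCTT")
after = array.sequence()

print(before)
print(after)
print(len(after) - len(before))  # one spacer plus one repeat were added
```

Because new spacers always enter next to the leader in this model, the order of the list reflects the order of infections, with the most recent one first.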

Figure 12: The spacer acquisition starts at the leader sequence which is followed by repeats and spacers. The Cas1 and Cas2 proteins find, cut and incorporate a new piece of viral DNA (protospacer) into the CRISPR array. First, the downstream 3´-end of the protospacer connects to the 5´-end of the first repeat by a nucleophilic attack. In the second integration, the upstream 3’-end of the protospacer attaches to the end of the first repeat. Repeat 1 is hereby separated into ssDNA at each side of the newly incorporated protospacer (now termed a spacer). Ligation and DNA repair mechanisms fill in the remaining gaps and the final extended CRISPR array is acquired.

The description of Cas1 as the “warrior protein” originates from these reactions. Depending on which type of CRISPR-Cas system is examined, the Cas1 and Cas2 proteins are needed for the recognition of the cleavage site, the positioning of the new spacer, the ligation process, as well as the extraction of viral DNA. The vital tasks performed by these Cas proteins in the initial step of the entire CRISPR system also explain why they are found in all types of CRISPR systems [47].

3.1.7 Introduction of CRISPR-Cas9 into cells

At present, the delivery of the CRISPR-Cas9 system into cells has proven to be the toughest obstacle to overcome before extensive in vivo usage of the system can become a reality. The development of safe cargos and vehicles is necessary for the versatile use of the CRISPR-Cas system. Earlier efforts to correct disease-causing mutations in humans in 1999-2003 [48] were tainted by several tragedies due to unforeseen, deleterious effects of the delivery method itself. In one trial, an 18-year-old male suffering from ornithine transcarbamylase (OTC) deficiency died after developing a massive inflammatory response to the delivery vehicle, an adenovirus vector, four days after receiving the treatment [49]. Two other trials, which used MLV vectors for gene transfer and comprised 20 children with X-linked severe combined immunodeficiency (SCID-X1), were partially successful, but within a few years five of the children developed leukemia-like disorders, and one of them passed away. Genetic analysis of the afflicted cells showed that the viral vector had integrated at sites within or near tumor-promoting genes, which caused a harmful activation of those genes [50].

To this day, the issues regarding non-specific genome insertions and severe immune responses due to viral vectors remain, also when it comes to introducing the CRISPR-Cas9 system into cells. An additional aspect to consider is the large size of the Cas9 gene, approximately 4.10 kbp (1368 aa) [51], which can be troublesome when working with viral vectors due to their limited packaging capacity. Furthermore, even the expression of Cas9 may elicit an immunogenic response, which might make it less practical for therapeutic purposes. It is critical that both the genome editing tool and the delivery vehicles are specific, safe and efficient in order to reduce the risk of unwanted or even fatal outcomes. As the delivery methods improve, along with refined targeting and a decrease in off-target effects, CRISPR can reach its full therapeutic potential.

Indeed, there are several challenges to overcome before the introduction of CRISPR components into cells can be made in a safe and efficient way. Numerous delivery methods have been exploited, but many factors have to be considered when choosing which one to proceed with, since no single method is suitable and applicable for all cell types. Issues arise since different cell types, both within and between species, have dissimilar capabilities in terms of transformability, sensitivity and DNA repair pathways. Certain cell , e.g. from mouse, fruit fly, pigs and , are strong enough to withstand microinjection, while others burst when the tiny needle penetrates the cell surface. While bacterial cells can take up genetic material from their surroundings, plant cells are harder to transfect due to their rigid cell wall. Therefore, cell-specific methods have had to be developed, although some methods are more extensively used and established than others.

In order to perform a successful delivery into eukaryotic cells, several steps need to be considered and mastered: protection of the CRISPR components, cell membrane crossing, cytosolic release, and penetration of the nucleus. Both the Cas9 protein and the gRNA are required to obtain a fully functioning nuclease system. These components can be introduced into cells in different forms: Cas9 can be provided as DNA, as an in vitro transcribed mRNA or in its protein form, whilst the gRNA can be provided in a DNA plasmid or as an in vitro transcribed RNA molecule (figure 13).

Figure 13: Transcriptional and translational pathways of the CRISPR components (Cas9 and sgRNA) inserted as DNA, RNAs or RNP complexes into cells.

In the following text, all delivery methods that are mentioned will refer to sgRNA instead of the generic name gRNA, since sgRNA has become the most popular form in practice. Both gRNA forms can be chemically modified to make them less vulnerable to immunological degradation within the cell, and both have been shown to give the same gene editing efficiency (with a slight difference depending on which genes are targeted [52]).

3.1.7.1 DNA plasmid delivery

The CRISPR components can be delivered as DNA into cells via an expression plasmid. Working with plasmids is relatively easy and inexpensive. Plasmids are stable and can stay within the cytoplasm for several weeks as free units [53]. Due to this long persistence and the resulting continuous transcription of the Cas9 and sgRNA sequences, CRISPR-Cas9 plasmids have proven to cause high levels of unintended cuts. Even so, stable Cas9 expression can be desirable and advantageous for other experiments, such as labelling of genomic loci in living cells by fusion of a fluorescent protein to a dCas9 [54] (further discussed in section 3.1.8.1).

Figure 14: Pictures a simplified Cas9 plasmid containing promoters, insertion site for 20 bp guide RNA (spacer), an sgRNA backbone, the Cas9 sequence and a potential selection marker gene.

There is a variety of plasmids on the market that already contain a sequence for the Cas9 protein, an insertion site for a target sequence, an sgRNA backbone, promoters and other necessary elements for a functioning CRISPR-Cas9 system - all within one and the same plasmid. Even optional sequences such as selection markers or tags are already inserted, or can be ordered if required. It is also possible to use two plasmids for the experiment: one with Cas9 and one with sgRNA. This allows the flexibility to combine Cas9 with multiple, different sgRNAs (enabling multiplexing, section 5.1). Additionally, this brings about the opportunity to have different markers on the plasmids in order to select cells that have taken up both plasmids during the phase. For the scientist, plasmids are easy to use, since it is only necessary to provide the ~20 bp guide RNA (spacer), clone this target sequence into a plasmid and then introduce the plasmid into the cells of interest. When delivering DNA into cells, toxicity, persistence, mutagenesis and regulation issues have to be considered. One must bear in mind that integration of plasmids might have a stressful, or even toxic, effect on the cell.

3.1.7.2 Viral vector delivery

Another option for DNA introduction into cells is via viral vectors, a method called transduction, in which genes are transferred from a virus to a eukaryotic cell. Depending on which viral vector is used, the gene may or may not be inserted into the host genome. As illustrated earlier, viral vectors are associated with safety issues due to the immunogenic responses they can induce and the potentially detrimental effects of random integrations.

Viruses usually infect only a particular cell type or tissue, a feature called viral tropism, which has resulted in the need to use several different viruses and in order to target specific cell types. The most popular and useful viral vectors are developed from adeno-associated viruses (AAV), adenoviruses (AdV) and lentiviruses. AAV vectors have become a favorable tool in gene editing since they are less immunogenic than other viral vectors and can target a range of cells and tissues due to their multiple serotypes [55]. AAV gene delivery has also shown the ability to target a variety of human cells with a high transfer rate. The drawback of AAVs is their low packaging capacity, only about 4.5 kbp [55], which is problematically small compared to the size of the Cas9 sequence and a chimeric sgRNA together (~4.2 kb) [56]. It is feasible to pack the Cas9 sequence into the AAV, but only a tiny space is left for all other important elements such as the spacer, a promoter that works well in the target cell, and other regulatory sequences necessary for transcription and translation. To free up capacity, the Cas9 sequence of Staphylococcus aureus (SaCas9) can be used instead. SaCas9 is only approximately 3.15 kb in size [56], displays cleavage activity in mammalian cells and produces indels and off-targets at the same rate as SpCas9. AAV-SaCas9, AAV-SpCas9 and other AAVs can be used without a substantial immune response or toxicity, making them suitable for in vivo and ex vivo studies.

The adenovirus is a dsDNA virus with a nucleocapsid shaped as an icosahedron. The first AdV vectors used in research studies caused acute and chronic immune responses due to the viral capsid and viral gene expression, respectively. Later versions of the AdV vector have been stripped of several viral genes, but the capsid remains and consequently still induces a substantial immune response. Nevertheless, this particular feature makes AdVs very suitable for oncolytic viral therapy and vaccines. The main advantage of AdV vectors is their packaging capacity. The second generation of AdV vectors has a packaging capacity of up to 12 kb, and the third generation, called gutless AdVs, can incorporate up to 36 kb of genes, making them very suitable for delivery of an entire CRISPR-Cas system. The gutless AdVs have been stripped of all viral coding sequences, thereby reducing the toxicity to a negligible level, but they do on the other hand require a helper adenovirus to function properly, which leaves scientists with the initial problem of a complicated immune response [57].

Lentiviruses belong to the retrovirus family and consequently have the ability to permanently integrate into the host’s genome. They carry single stranded RNA (ssRNA), have a spherical shape and hold a greater packaging capacity than AAVs (~8 kb compared to ~4.5 kb [57]). They are particularly useful in research involving brain studies since they can be used to mark , and accordingly enable visualization of their migration pattern. Furthermore, lentiviruses work in primary cells and other cells that are generally difficult to transfect. As with AAV vectors, they only induce a low immune response. Due to their genome integration capability they can produce long-term gene expression; this may be an advantage in some cases but not in others. Finally, all three vectors mentioned share the ability to transduce both non-dividing and dividing cell types [19] [58].
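The packaging constraints discussed for the three vector families can be compared with a short Python sketch. The capacities and cargo sizes are the approximate figures quoted in this section, converted to bp. The 1000 bp allowance for a promoter and other regulatory elements is a hypothetical value chosen only to illustrate the comparison.

```python
# Approximate sizes in bp, taken from the figures quoted in this section.
VECTOR_CAPACITY_BP = {
    "AAV": 4500,
    "lentivirus": 8000,
    "AdV (2nd generation)": 12000,
    "AdV (gutless)": 36000,
}
CARGO_BP = {"SpCas9 + sgRNA": 4200, "SaCas9": 3150}


def spare_capacity(vector, cargo, extras_bp=0):
    """Room left (bp) after the cargo and any extra elements are packed."""
    return VECTOR_CAPACITY_BP[vector] - CARGO_BP[cargo] - extras_bp


def vectors_that_fit(cargo, extras_bp=0):
    """Vectors whose capacity covers the cargo plus the extra elements."""
    return [
        vector
        for vector in VECTOR_CAPACITY_BP
        if spare_capacity(vector, cargo, extras_bp) >= 0
    ]


print(spare_capacity("AAV", "SpCas9 + sgRNA"))
# Hypothetical 1000 bp allowance for promoter and regulatory sequences.
print(vectors_that_fit("SpCas9 + sgRNA", extras_bp=1000))
print(vectors_that_fit("SaCas9", extras_bp=1000))
```

With the assumed allowance, SpCas9 together with its sgRNA no longer fits into an AAV, whereas the smaller SaCas9 does, which is the motivation given above for using SaCas9.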

Table 1: Summary of AAV, AdV and lentiviral delivery vectors

The illustration in figure 15 shows how to produce lentiviral particles containing Cas9 and gRNA genes. First, the viruses must be produced using a packaging cell line. The Cas9 and gRNA genes can be placed on one plasmid, and other necessary viral sequences such as envelope genes and packaging genes on separate plasmids. The plasmids are co-transfected into packaging cells (e.g. 293T cells). The viral particles are thereafter collected and transduced into the desired target cells. The procedure is laborious and takes about two weeks [60].


Figure 15: Production of lentiviral vectors containing Cas9 and gRNA genes. Several plasmids containing Cas9, gRNA, packaging genes and envelope genes are necessary for the production of viral particles.

3.1.7.3 RNA delivery

When delivering RNA into cells, co-transfection of in vitro transcribed Cas9 mRNA and sgRNA is necessary. The RNAs are introduced into the cell simultaneously by electroporation, microinjection or lipofection (described in section 3.1.7.5). As soon as the mRNA/sgRNA mix has entered the cytoplasm, the mRNA is translated into Cas9 protein, which combines with the sgRNA to form a ribonucleoprotein (RNP) complex that can enter the nucleus.

RNA sequences can be quite unstable and difficult to work with. Both synthetic crRNA:tracrRNA and sgRNA can be chemically modified to improve RNA stability, but not all gRNA modifications are beneficial for CRISPR gene editing. Some alterations are necessary for co-electroporation steps, others can be toxic for certain cells, while some only generate a slight increase in efficiency [61]. Delivering CRISPR-Cas9 as RNA is more expensive than using plasmids, but it has been shown to cause fewer unintended cuts since the RNA is degraded faster in the cell and thus only gives temporary expression [62]. A drawback of RNA delivery is the absence of selection markers on the RNA sequences, which could otherwise be used to verify whether the cell has taken up both CRISPR components.


3.1.7.4 Protein delivery (RNP delivery)

RNP is an acronym for ribonucleoprotein and denotes the complex of a Cas protein and an sgRNA. The delivery of CRISPR components as RNPs involves a self-produced or purchased Cas9 protein and an in vitro transcribed sgRNA. The Cas proteins are mixed with the transcribed sgRNAs and incubated at room temperature for 10-15 min [63][64] to form an RNP complex, which can then be introduced into cells by several methods such as microinjection, electroporation and lipofection (described in section 3.1.7.5). The RNP particles can be inserted into the cytoplasm or directly into the nucleus. RNP complexes have a short lifetime in cells (up to ~48 h [62]) and will thus not be inherited by daughter cells. Accordingly, they show the most transient activity of all delivery methods mentioned, with a peak at 4 h followed by a rapid decrease. Due to the short expression time, RNPs show the lowest off-target effects compared to plasmid and RNA delivery [62]. They have shown high on-target specificity and efficiency in many cell types, even in sensitive cells such as stem cells, human embryos and primary T-cells that are generally difficult to transfect or transduce [53].

The disadvantages of RNPs are the high cost (especially of commercially produced Cas9 protein) and delivery difficulties. It is feasible for scientists to express and purify Cas9 protein themselves as long as the proper bacteria, plasmids, reagents and equipment are available at the laboratory and there is time to spare. Additionally, as in the case of RNA delivery, no selection marker is available on RNPs to confirm cell uptake of the complex. Even so, it is easier to regulate and estimate the concentration of RNP in the cell due to its transient presence and quick degradation. This is in contrast to plasmid delivery, which persists longer in the cell and therefore gives long-standing expression of the components, making it difficult to regulate the concentration quickly.

As a last reminder, there is a distinct correlation between long-standing expression and off-target effects. The longer sgRNA and Cas9 are produced and present in the cell, the higher the risk of unintended cleavages in the cell's genome [2].

3.1.7.5 Physical and chemical delivery methods (non-viral methods)

The physical delivery methods mentioned in this report include microinjection and electroporation, whilst the chemical-based techniques include the use of liposomes. These approaches do not entail any viral components, thereby not triggering the cellular defense mechanism, but they are instead limited by multiple extracellular and intracellular hurdles. Microinjection, electroporation and lipofection are some of the most regularly reported delivery methods for CRISPR-Cas systems, but the extensive search for new and improved CRISPR delivery will most likely result in several new and refined methods in the future.

Microinjection is a microsurgical procedure where DNA, RNAs or RNPs are injected into the cytoplasm or nucleus of cells using a tiny glass needle. Conceptually, it is one of the simplest gene delivery methods and has shown excellent efficiency (100%) and target specificity. However, it is time-consuming and requires skill, and not all cells can be treated by microinjection since they might burst when the needle touches the surface. Since only one cell can be injected at a time, the method is obviously not applicable to a whole organism or human, but it is well established for fertilized eggs, oocytes, zygotes and in vitro applications [19].

Electroporation is a technique that uses a short electric pulse to create temporary pores in the target cell's membrane, which facilitate the uptake of DNA, RNAs and RNPs into the cell. The electric pulse is about 300-400 mV and lasts for less than a millisecond [60]. However, the electrical intensity and duration have to be tested and optimized specifically for each studied cell type, since unsuitable settings can lead to cell death. Evidently, electroporation is not suitable for in vivo studies, but rather applicable solely to in vitro experiments. An electroporation-based method called nucleofection enables direct nuclear entry of RNPs, which is especially preferable for experiments involving HDR since it allows the DNA repair template to enter the nucleus.

Known by many names, lipofection is also termed lipid-based transfection, cationic lipid-mediated transfection, lipid nanoparticle delivery, or LNP delivery for short. The lipofection procedure involves liposomes. Here, lipids are used to enclose the CRISPR components and act as a transport vehicle, which subsequently fuses with the target cell membrane and releases the CRISPR contents into the cell's cytoplasm (figure 16). The method has turned out to have a low efficiency due to endosomal entrapment and liposomal degradation, and is especially unsuitable for stem cells and primary cells [19]. Even so, the technique remains very common due to its simplicity (no special equipment is necessary) and minimal toxicity. Lipofection can be used to transport CRISPR plasmids, RNAs and RNPs [65].


Figure 16: By mixing a Cas9:sgRNA plasmid with liposomes, CRISPR components can cross the lipid bilayer and be introduced into the cell. The lipid coat acts as a vehicle and protector of the components across the cell membrane.

Lipofection is the most commonly used chemical method for CRISPR delivery across membranes, but other techniques including cell-penetrating peptides (CPPs), gold nanoparticles (AuNPs or CRISPR-gold), polymeric nanocarriers and DNA “nanoclews” have also been tested as facilitators for membrane penetration.

3.1.7.6 Cytosolic vs nucleic insertion

There are two major barriers to penetrate before the CRISPR components can cut the target DNA: the cell membrane and the nuclear membrane. It is possible to insert the CRISPR molecules directly into the nucleus with the aid of methods like microinjection and nucleofection. For cytoplasmic delivery, methods such as microinjection, electroporation and lipofection are suitable.

For CRISPR to function properly, the Cas sequence needs to be both transcribed and properly translated into a protein before any DNA cutting can be achieved. The sgRNA, on the other hand, only requires transcription before it can be used for CRISPR-Cas gene editing. As is well known, transcription takes place in the nucleus and translation takes place in the cytosol. Hence, if Cas9 is delivered as a DNA plasmid into the cytosol, the plasmid first needs to cross the nuclear envelope into the nucleus in order to be transcribed into an mRNA molecule; second, the mRNA must traverse back into the cytosol for translation; and third, the protein must return into the nucleus in order to finally perform the desired cuts in the genome. The attachment of the sgRNA can occur in the cytoplasm or in the nucleus, depending on where it is located. This drawn-out mechanism explains why plasmid delivery has proven to be less efficient than RNA or RNP delivery. RNPs do not require any transcription or translation and are thus functional as soon as they are inserted. The only time-determining factor for RNP activity is whether the complex is introduced into the cytoplasm or directly into the nucleus, the latter being faster. For RNA delivery, the Cas9 mRNA is translated into Cas9 protein, which attaches to the co-transfected in vitro transcribed sgRNA (figure 13).
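The difference between the three cargo types can be summarized by listing the steps each one has to complete before Cas9 can cut. The following is a minimal sketch of the route described above; the step labels and function names are illustrative choices, not established terminology, and sgRNA binding is left out because it can occur in either compartment.

```python
# Steps the Cas9 cargo must complete after delivery into the cytoplasm,
# following the description in the text. Step names are illustrative labels.
ROUTES = {
    "plasmid": [
        "nuclear_import",   # plasmid enters the nucleus
        "transcription",    # Cas9 mRNA is produced
        "nuclear_export",   # mRNA returns to the cytosol
        "translation",      # Cas9 protein is produced
        "nuclear_import",   # Cas9 enters the nucleus again
    ],
    "rna": ["translation", "nuclear_import"],
    "rnp": ["nuclear_import"],
}

ENVELOPE_STEPS = {"nuclear_import", "nuclear_export"}


def steps_before_cutting(cargo, delivered_to="cytoplasm"):
    """Ordered steps a cargo needs before Cas9 can cut the genome."""
    steps = list(ROUTES[cargo])
    if delivered_to == "nucleus":
        # Direct nuclear delivery (e.g. microinjection, nucleofection)
        # only makes sense if the first step is a nuclear import.
        if steps[0] != "nuclear_import":
            raise ValueError(f"{cargo} must start in the cytoplasm")
        steps = steps[1:]
    elif delivered_to != "cytoplasm":
        raise ValueError("delivered_to must be 'cytoplasm' or 'nucleus'")
    return steps


def envelope_crossings(cargo, delivered_to="cytoplasm"):
    """Number of times the nuclear envelope has to be crossed."""
    return sum(s in ENVELOPE_STEPS for s in steps_before_cutting(cargo, delivered_to))


if __name__ == "__main__":
    for cargo in ROUTES:
        print(cargo, envelope_crossings(cargo), steps_before_cutting(cargo))
```

In this simplified view a plasmid delivered to the cytoplasm requires three crossings of the nuclear envelope, RNA and RNP require one, and an RNP delivered directly into the nucleus requires none.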

Added to this complexity of membrane crossing is the fact that import into the nucleus can only occur under limited conditions: when the cell undergoes mitosis, or through nuclear pore complexes. Nuclear transport can be further encouraged if the Cas9 complex is attached to a nuclear uptake facilitator, such as a nuclear localization sequence (NLS). The nuclear envelope is a major barrier especially in studies of non-dividing cells, since the breakdown of their nuclear membrane (mitosis) does not occur at all, or only occurs to a limited degree. Transfection and delivery of CRISPR components into the nucleus of dividing cells have been shown to be easier [66]. Viral vectors are naturally good at exploiting the nuclear pore complexes; AAV vectors are therefore advantageous for delivering sgRNA and HDR fragments into the nucleus [67].

3.1.7.7 In vivo and ex vivo delivery

The CRISPR-Cas genome editing tool is very popular and useful for in vitro and germline research, but its therapeutic potential in in vivo applications remains challenging for researchers due to its high degree of off-target effects and the immunogenicity induced by the delivery method. Off-target effects can occur in both target and non-target cells if the delivery method is not sufficiently tissue-specific. Furthermore, the editing efficiency of in vivo Cas9 studies has proven to be quite low, even when the best in vitro conditions and techniques are applied.

Viral vectors are the most common delivery vehicles for CRISPR in in vivo work. Lentiviruses and AAVs are already used today in multiple gene therapies, although not in combination with CRISPR systems (with one exception: the phase 1/2 clinical trial of Edit-101 [105], further discussed in section 5.2). Nevertheless, this shows that safe viral vectors are already available for in vivo and ex vivo gene therapy purposes and may in the future also be applicable to CRISPR systems. The popularity of viral vectors for in vivo work is also due to their multiple serotypes, which enable a wider range of tissue-specific approaches.


There is also an ongoing trend of trying to deliver CRISPR in vivo using different nanoparticles instead, such as lipid nanoparticles (LNPs), gold particles, polymeric nanocarriers, CPPs and DNA nanoclews, to name some of the main emerging methods. These methods are under development and have shown diverse results. Some of them show great promise, but many of them have also proven to be less tissue-specific and tend to accumulate in the [68].

The clinical benefits of gene editing rely on a sufficient number of cells being edited. This number can vary depending on the disease or tissue involved. A commonly used technique within therapeutic applications is ex vivo editing, which means that cells are edited outside their native in vivo conditions. This approach aims to extract the patient cells of interest, modify them in the laboratory using CRISPR-Cas gene editing (with the aid of, for instance, viruses, electroporation or nanoparticles), and later insert them back into the patient by viral vectors or through intravenous or tissue-injectable procedures [69]. Ex vivo studies enable scientists to check the gene-edited cells before they are returned to the patient, usually by whole-genome sequencing in search of off-target effects. The disadvantages of ex vivo therapy are that only certain cells and tissues are amenable to culture and manipulation, and that ex vivo therapeutics are extremely expensive, costing between 400,000 and 1 million US dollars [69].

Even though intense research has progressed in the field of CRISPR genome therapies, no single method can surmount the many challenges that in vivo CRISPR delivery brings, and it might not even be one of the processes examined here that will bring the unprecedented result we have longed for. Further elaborate and innovative work will be required to achieve an efficient, safe and tissue-specific delivery method for future in vivo or ex vivo purposes.

3.1.8 Modifications of the CRISPR-Cas9 system

When the CRISPR-Cas9 system was first introduced as a genome editing tool, it was limited to endonuclease activity, and the DSB was later shown to have unwanted toxic effects. In eukaryotic cell editing, the DSB induced by a CRISPR-Cas9 system has proven to cause a p53-mediated DNA damage response, which consequently disfavors the desired HDR pathway. The p53 function needs to be monitored in order to perform therapeutics efficiently in human cells, and by adopting single-strand break strategies or silencing effects, these damage responses can be avoided [31]. Today, researchers have invented diverse modified CRISPR-Cas9 systems to avoid the toxic effect that can follow from a DSB, which can broaden the potential use of CRISPR in genome editing.

3.1.8.1 dCas9, CRISPRa and CRISPRi

Since the discovery of the CRISPR-Cas9 system, further development of the technology has expanded the CRISPR-Cas9 toolbox not only to enable gene editing, but also gene activation and repression. CRISPR activation (CRISPRa) and CRISPR interference (CRISPRi) are technologies in which upregulation and silencing of gene expressions can be performed, respectively. Simplified, CRISPRa turns genes on, and CRISPRi turns genes off.

As stated, Cas9 works as a DNA binding and cutting protein and the gRNA works as a guide for the Cas9 to find the correct sequence. The cleaving capacity of Cas9 is enabled by the two active endonuclease domains RuvC and HNH. These domains can be rendered inactive by point mutations (D10A and H840A in SpCas9) in order to solely exploit the DNA binding property of the Cas9 protein, shown in figure 17. The result is a dead Cas9 protein (dCas9) which lacks cleavage activity but can still recognize and bind to specific DNA sequences when coupled to a guide RNA sequence. Both the CRISPRa and CRISPRi technologies rely on the dCas9 protein. The dCas9:gRNA complex works at the transcriptional level (in contrast to e.g. RNAi, which targets transcribed mRNA in the cytoplasm) and binds to the target sequence around the promoter region or in the proximity of a transcriptional start site [70]. The dCas9 can be fused with various transcriptional regulators called effectors (e.g. TET1, DNMT3A, KRAB, SAM, P300, LSD1, VP64, GFP) in order to achieve various desirable traits [71]. For instance, TET1 is used to demethylate DNA, DNMT3A is used to methylate CpG sites, LSD1 works by demethylating histones, and P300 can be used to acetylate histones [72]. All in all, the effectors activate or repress certain genes through different points of attack. Additionally, dCas9 can be used for DNA labeling. It is, for instance, possible to fuse a green fluorescent protein (GFP) to the dCas9 to use it as a tool for visualizing structures and dynamics.


Figure 17: Schematic picture of dCas9 protein with nonfunctional HNH and RuvC domains and an sgRNA strand capable of localizing the target sequence. Different effectors can be attached to the dCas9 in order to obtain a specific modification in the nearby sequence, often leading to activation or repression of the gene.

In contrast to ordinary Cas9 gene editing, CRISPRa and CRISPRi technologies do not induce a double-strand break in the genome. This enables researchers to find out more about the function of specific genes without causing irreversible changes or potential damage in the DNA. Hence, CRISPRi can be said to generate a knockdown rather than a knockout. The dCas9 tool has also been shown to be able to alter gene expression at multiple sites simultaneously, and to target both coding and non-coding regions in the genome [73]. Lastly, since the dCas9 allows chemical modifications to be made in the genome, such as introducing or erasing methyl groups, it has become an important tool within epigenetic research [74].

3.1.8.2 Cas9 nickases

One commonly applied modified variant of CRISPR systems is called nickase Cas9 (nCas9) and was created by introducing a point mutation in one of Cas9's DNA cleavage domains. The mutation disrupts the cutting ability of either RuvC or HNH, leading to a genomic nick on one strand of the targeted DNA. It is possible to create nicks on either one of the DNA strands of the target DNA depending on which domain is altered (figure 18). The most common mutations leading to a deficient cleavage domain in RuvC and HNH are D10A and H840A, respectively [75].
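The relationship between the two point mutations, the catalytic domains and the resulting cleavage pattern can be written as a small lookup. This is a minimal sketch assuming the standard SpCas9 assignment (D10A inactivates RuvC, H840A inactivates HNH; HNH cuts the target strand and RuvC the non-target strand); the function names are illustrative.

```python
# Assumed SpCas9 assignment: mutation -> inactivated domain,
# and domain -> DNA strand that the domain cuts.
MUTATION_TO_DOMAIN = {"D10A": "RuvC", "H840A": "HNH"}
DOMAIN_TO_STRAND = {"HNH": "target", "RuvC": "non-target"}


def cut_strands(mutations=()):
    """Return the set of strands still cut by a Cas9 carrying `mutations`."""
    dead_domains = {MUTATION_TO_DOMAIN[m] for m in mutations}
    return {
        strand
        for domain, strand in DOMAIN_TO_STRAND.items()
        if domain not in dead_domains
    }


def classify(mutations=()):
    """Name the Cas9 variant that results from the given mutations."""
    strands = cut_strands(mutations)
    if len(strands) == 2:
        return "Cas9 (double-strand break)"
    if len(strands) == 1:
        return f"nCas9 (nick on {next(iter(strands))} strand)"
    return "dCas9 (binds, no cleavage)"


if __name__ == "__main__":
    for variant in [(), ("D10A",), ("H840A",), ("D10A", "H840A")]:
        print(variant, "->", classify(variant))
```

A single mutation gives a nickase, and both mutations together give the dCas9 used for CRISPRa and CRISPRi.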


The nCas9 systems can be applied to create nicks on both strands simultaneously. By utilizing two nCas9 systems, nicks can be induced on opposite DNA strands at target sequences in proximity to each other, creating a staggered DSB, shown in figure 18. Several studies have shown that DSBs created by nCas9 pairs give fewer off-target mutations, most likely because a single nick has a lower chance of being repaired by NHEJ and because both nCas9 complexes are required to obtain a DSB. Using nCas9 pairs to create the DSB when utilizing the HDR repair pathway to incorporate an insert or modification shows, in many cases, a higher efficiency and yield than using the wt Cas9, since nicks have a higher chance of being repaired by HDR [27][75].


Figure 18: A: An nCas9 system with an inactivated RuvC domain and a functioning HNH will induce a nick on the target strand. B: An nCas9 system with an inactivated HNH and a functioning RuvC produces a nick on the non-targeted strand. C: A pair of nCas9 is exploited to create a staggered DSB. The alternative repair pathway HDR can be utilized to incorporate an insertion by supplying a predesigned repair template.

3.1.8.3 Base editing and prime editing

It can be tricky to perform precise base editing in mammalian cells by exploiting the HDR pathway and predesigned repair templates. Problems arise because the cell favors the NHEJ repair pathway and because of the potential toxic effects of double-strand breaks (DSBs).


In recent years, a novel modified Cas9 variant has shown promising results for high-efficiency base editing without the use of DSBs or HDR. These new base editors are fusion complexes between DNA deaminases and dCas9 or nCas9 (henceforth termed d/nCas9). As in the case of CRISPRa and CRISPRi, the dCas9 protein is utilized for its precise targeting rather than its catalytic activity.

DNA deaminases are catalytic enzymes capable of inducing base transitions, and can be utilized to perform substitution in target DNA. There are two types of deaminase enzymes usually fused with d/nCas9. The first one is called cytidine deaminase and promotes C to T substitution. The second one is adenosine deaminase which induces A to G substitution. In order to activate deaminase catalytic properties, ssDNA needs to be provided. The ssDNA is supplied when d/nCas9 binds the target sequence, more specifically in the formation of the R-loop. The d/nCas9 works as a guide, navigating the complex to the target DNA sequence [76].

Cytidine deaminase and adenosine deaminase are naturally found in various cells, acting in metabolic processes. Base editors operate on ssDNA, which created problems when developing the second class of base editors, since adenosine deaminase naturally does not work on ssDNA. The development of a modified adenosine deaminase, called deoxyadenosine deaminase, enabled scientists to successfully use base editors for A to G substitutions [77].

There are some differences between using nCas9 and dCas9 in the design of base editors. Studies have shown that introducing a nick on the non-edited strand stimulates quicker repair and favors the right base pair transition. The drawback of inducing a nick is the potential outcome of NHEJ-derived indel mutations. Although this repair pathway is rather uncommon for nicks, it is still a noticeable effect compared to the use of dCas9, which does not cleave the target DNA and therefore has a lower chance of causing indel mutations [77].

Another common method to favor the right base pair transition is the use of uracil glycosylase inhibitors (UGIs) anchored at the end of d/nCas9. UGIs favor the desired base pair substitution by inhibiting an unwanted repair pathway mediated by uracil DNA glycosylase [78]. The full structure of a cytidine base editor is shown below (figure 19).


Figure 19: A base editor fusion complex consists of UGI, cytidine deaminase and the nCas9 system.

When a cytidine base editor is used, the cytidine deaminase binds to the ssDNA and converts cytosine (C) into uracil (U) in a process called deamination, while nCas9 induces a nick on the non-edited strand. The resulting U:G mismatch is then resolved by repair polymerases into U:A, using uracil as the template. The presence of a nick and UGI assists the base substitution by steering the repair polymerases to recognize uracil as the repair template; finally, the U:A pair is converted into T:A. If there is no nick or UGI, the repair polymerases have a higher chance of recognizing G as the repair template after deamination and changing U back into C; in other words, the wild type is regained.

When the modified adenosine deaminase is used as a base editor, the enzyme changes adenine (A) into the intermediate nucleoside inosine (I). Inosine is treated as G by the repair polymerases in the repair process, ultimately converting the T:I mismatch into C:G. The adenosine and cytidine base editing mechanisms are shown in the picture below (figure 20).


Figure 20: A: Cytidine deaminase substitutes C into U by deamination and nCas9 produces a nick on the non-edited strand. The G:U mismatch is corrected by repair polymerases switching G into A, which results in A:U base pairing. A:U is then fixed by switching U into T, finally obtaining the A:T base pair. B: Adenosine deaminase substitutes A into I by deamination while nCas9 induces a nick on the non-edited strand. In the mismatch repair of T:I, polymerases treat I as G, so that T is changed into C, leading to C:I base pairing. Finally, the C:I pair is changed into C:G.
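The net outcome of the two base editor classes can be illustrated on a plain DNA string. The sketch below only models the end result (C to T for a cytidine base editor, A to G for an adenosine base editor) and records the intermediate base pairs described above; the editing window passed to the function is a hypothetical parameter chosen by the caller, not a value taken from the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Net substitution made on the edited strand by each editor class.
EDITS = {"CBE": ("C", "T"), "ABE": ("A", "G")}

# Intermediate base pairs (edited strand : opposite strand) from the text.
PATHWAY = {
    "CBE": ["C:G", "U:G", "U:A", "T:A"],
    "ABE": ["A:T", "I:T", "I:C", "G:C"],
}


def base_edit(strand, editor, window):
    """Apply a base editor to `strand` inside `window`.

    strand: edited strand as a string of A/C/G/T.
    editor: "CBE" (C to T) or "ABE" (A to G).
    window: (start, end), 0-based half-open interval; hypothetical value.
    """
    source, product = EDITS[editor]
    start, end = window
    return "".join(
        product if start <= i < end and base == source else base
        for i, base in enumerate(strand)
    )


def complement(strand):
    """Position-by-position complement (not reversed)."""
    return "".join(COMPLEMENT[base] for base in strand)


if __name__ == "__main__":
    original = "GACCTAG"
    edited = base_edit(original, "CBE", (2, 4))
    print(original, "->", edited, "paired with", complement(edited))
```

Bases outside the window, and bases that are not substrates of the chosen deaminase, are left unchanged, which mirrors the fact that the editor only acts on the ssDNA exposed in the R-loop.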

Base editing is limited by the fact that the target must lie close to a suitable PAM sequence. Some problems can be solved by adopting various Cas variants, but the off-target effects are still a problem. This has led scientists to uncover and adapt Cas variants that accept a broader range of PAM sequences for base editing in human cells, such as xCas9 [79]. Base editing therefore has great potential, and in February 2020, a group of scientists was able to repair stem cells from a cystic fibrosis organoid biobank with adenosine base editors [80].

Base editors may seem like a sensational method for gene editing purposes, but they have considerable limitations. In late 2019, these limitations were addressed by a promising new gene editing technique, demonstrated by David Liu, which could in principle correct 89 percent of all disease-causing DNA mutations [81]. The technique, called prime editing, can insert a template with only a single-strand break, and the distance to the PAM region is no longer a hindrance. The off-target rate has been shown to be much lower than with classic CRISPR, since prime editors require three lock-and-key matches. Prime editing can be performed efficiently with all twelve combinations of base substitutions (C to T, A to G, etc). In prime editing, a novel modified version of the sgRNA is used, termed pegRNA. The pegRNA can contribute an insertion or base edit, but in contrast to the sgRNA, the pegRNA also binds to the non-target strand (figure 21). The cell's repair mechanism is manipulated to follow the new design, since the old DNA strand is nicked and will not be used as a template.


Figure 21: Comparison of base editors (left) and prime editor (right). Base editing utilizes adenosine or cytidine deaminase. Prime-editing relies on an nCas9 fused with reverse transcriptase which performs a single strand cut.
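The "twelve combinations of base substitutions" can be enumerated directly, which also makes clear how many of them deaminase-based editors can reach. The sketch below assumes that cytidine and adenosine base editors cover the four transitions (C to T and A to G, plus G to A and T to C when the opposite strand is edited); this assumption follows from the description of deaminases as inducing base transitions, and the names used are illustrative.

```python
from itertools import permutations

PURINES = {"A", "G"}

# Assumed reach of deaminase base editors: the four transitions.
# G>A and T>C correspond to C>T and A>G made on the opposite strand.
BASE_EDITOR_REACH = {"C>T", "G>A", "A>G", "T>C"}


def all_substitutions():
    """All ordered pairs of distinct bases, written as 'X>Y'."""
    return [f"{a}>{b}" for a, b in permutations("ACGT", 2)]


def is_transition(substitution):
    """True for purine-to-purine or pyrimidine-to-pyrimidine changes."""
    a, b = substitution.split(">")
    return (a in PURINES) == (b in PURINES)


if __name__ == "__main__":
    subs = all_substitutions()
    transversions = [s for s in subs if not is_transition(s)]
    print(len(subs), "substitutions in total")
    print("base editors:", sorted(BASE_EDITOR_REACH))
    print("transversions, prime editing only:", transversions)
```

Under this assumption, four of the twelve substitutions are transitions accessible to base editors, while the remaining eight transversions are among the changes for which prime editing was reported to work.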

Prime editing has only recently been introduced as a new approach to solve certain difficulties within gene editing. Some of its processes, such as the joining of the strands, are still unclear, and the technique requires whole-genome sequencing before final conclusions can be drawn. In summary, many strategies and modifications have been applied to the CRISPR system to solve problems and make improvements in diverse fields, from agriculture to therapeutics. CRISPR-Cas9 is not a bulletproof technique and has limitations, which explains why multiple techniques are in use and are applied to varying extents, since they are efficient in different ways.


4. The breakthrough

4.1 The discovery of the CRISPR system

The palindromic sequence of the CRISPR array, constructed of spacers and repeats, was first discovered in the iap gene (isozyme of alkaline phosphatase) in E. coli at Osaka University in 1987 [82]. The Japanese scientist Ishino was unsuccessful in uncovering the significance of the sequence.

In 1990 [83], a Spanish scientist was able to solve the mystery. At the University of Alicante in Spain, Francisco Mojica discovered an equivalent palindromic sequence in his work with archaea (e.g. Haloferax mediterranei). Building on the accomplishments of the Japanese scientist, Mojica concluded that the sequence had a greater importance than previously thought. The continuing studies focused on using new, revolutionizing bioinformatic tools like BLAST, which was launched in 1990 [84], and by 2003 the method had developed further and he was able to identify similarities in the nucleotide arrangement between spacers and different viruses. The question of how the viral sequence had ended up in the bacterial genome gave rise to the definition of CRISPR as an adaptive immune system. Mojica, together with Ruud Jansen, coined the final name CRISPR [83].

At the beginning of the new millennium, scientists discovered the Cas genes, and around 2006, a yoghurt factory could verify Mojica's conclusions about the adaptive immune system properties of CRISPR. A scientist at the yoghurt factory, Philippe Horvath, had noticed that bacterial cultures were responding differently to viral attacks. By looking at the CRISPR array in S. thermophilus, Horvath found similar repeats and incorporations of viral genome in the bacteria [85]. The sequence of Cas9 was also determined by Horvath.

The following year, John van der Oost in the Netherlands studied in greater depth how phages actually interfered with the CRISPR system, and the significance of the crRNA. His work resulted in the creation of the first artificial CRISPR array, meaning scientists could program bacteria to resist a certain viral attack [86].

Between 2006 and 2011, several scientists worldwide tried to describe the mechanism of CRISPR-Cas, from DNA targeting to Cas9 cleavage, but crucial discoveries were still lacking and thus no significant number of papers concerning CRISPR-Cas was published.


The breakthrough for CRISPR-Cas started in 2011 when Emmanuelle Charpentier at Umeå University in Sweden presented the full mechanism. Her work was primarily focused on the human pathogen S. pyogenes in order to solve toxic diseases. The last fundamental pillar was the description of tracrRNA and the formation of a duplex with crRNA. Along with Jennifer Doudna, Charpentier published the breakthrough in the journal Science in 2012, “A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity” [20]. The publication of the paper created a CRISPR wave, and in the following years the scientific work regarding CRISPR grew explosively.

The same year in Lithuania, Virginijus Siksnys studied the CRISPR system in other species, since all the fundamental mechanisms were known. The greatest accomplishment was to successfully transfer the CRISPR-Cas system to a test tube. The ability to work with CRISPR in vitro made it possible to move the type II CRISPR-Cas9 locus to E. coli [86].

During the same time, Charpentier, in collaboration with Jennifer Doudna, focused on the type II system. They were able to simplify the three-component structure - a Cas9, crRNA and tracrRNA - to a two-element system - a Cas9 protein and an sgRNA. The ability to use Cas9 and an engineered target-specific RNA to perform precise cuts was thereby established, and was also confirmed globally by other scientists [86].

In 2013, Feng Zhang at the Broad Institute of MIT and Harvard University laid the last essential milestone for the development of CRISPR-Cas as a gene editing tool. Zhang successfully described the effective use of CRISPR in mammalian cells and furthermore showed the capability of Cas9 to multiplex. Additionally, Zhang showed a way to greatly favour HDR repair in eukaryotic cells, which was an important step towards safer gene editing in humans [87]. Since Feng Zhang's work was confirmed by independent scientific papers, the CRISPR system has been introduced into a wide variety of organisms, and the number of articles concerning CRISPR has grown exponentially. In 2015, the first research on CRISPR-Cas being used in non-viable zygotes was published by Junjiu Huang. This was later intensely debated by scientists worldwide and will be further discussed in section 7.2.

4.2 Comparison with other genome editing methods

The editing of an organism's genome has flourished through the last century, and today, the genetic tool kit is composed of three major nuclease instruments: zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALENs) and CRISPR-Cas. In eukaryotic cells, the cleavage activities in all these systems are followed by repair via either NHEJ or HDR.

Zinc finger (ZF) motifs were first described in 1996, and the ZFN design can be described as a fusion of several ZF proteins with specific binding domains and a cleaving enzyme. Depending on the design, each ZF has a unique recognition site of around 3 bp, enabling a characteristic design where an array of 3 ZFs can recognize a 9 bp target sequence. The ZF array is connected to an endonuclease domain from FokI [88]. The full mechanism was first described in 2000 and was used in human cells three years later.

Since the design of ZFN is known to be cumbersome and time-consuming, only well-trained specialists can carry out effective gene editing with ZFN. The design of the zinc fingers is specific for each target site and experiment, meaning that for each new cleavage site, a whole new molecule has to be produced, compared to CRISPR where only the gRNA has to be exchanged. The toxic effects that follow the DSB can be controlled by using less error-prone target sites and by using ZFNs with specialized FokI domains [88].

The excitement for gene editing was profound in the scientific world when ZFN was introduced in human cells in 2003, and by 2007 a novel and enhanced technique was just around the corner. By 2011, the use of transcription activator-like effectors (TALEs) was a well-established technique [89]. The overall process resembles that of ZFN: directed nucleases with the ability to introduce a DSB at a specific target location, but now with a simpler and more time-efficient design strategy.

The repeat array of a TALEN is also more adaptable and flexible than that of a ZFN, and its length can vary depending on the target. TALENs are in some aspects easier to apply in genome editing, since a large number of target sites are accessible: each bp can be covered by different numbers of TALEN pairs [89]. However, the larger size of a TALEN, three times that of a ZFN, has proven problematic for delivery and expression.

Since the beginning of the 2000s, two new methods had been implemented, and by 2014 the next generation of knock-out techniques had created a new path. Even though TALENs had only been in use for three years, CRISPR-Cas broke new boundaries. New experiments could be carried out, enabling previously time-consuming editing to be planned and executed within days. CRISPR-Cas made it possible to conduct multiple experiments at a significantly lower cost [90]. The technique led to rapid and simple genome editing, which in practice could be performed even by an undergraduate student.

Like the former techniques, the CRISPR-Cas system introduces a DSB in the target genome. To target a certain sequence, CRISPR depends only on a relatively short gRNA sequence for specificity. If a new target is desired, scientists only have to change the design of the gRNA sequence, which is a clear advantage of CRISPR.

One key difference with CRISPR is the natural ability to target multiple genes at the same time by using several gRNAs, a technique called multiplexing [91]. The same can be accomplished with ZFNs and TALENs, but the design is considerably more complicated and time-consuming, and is accompanied by more toxic responses.

The size of CRISPR-Cas9 is a clear drawback; it is even larger than a TALEN [24]. Cas9 variants of a size similar to a TALEN have been identified and could potentially solve the size problem.

In summary, even though ZFN was the first of these techniques to be used for successful knock-out in organisms, the development of TALEN and CRISPR has overtaken it. Both TALEN and CRISPR have already been used effectively to genetically modify monkeys [92]. TALEN has also been applied to larger animals, for example in livestock gene editing to create hornless cows [93]. Dehorning is common in the dairy industries of both Europe and the U.S., and with gene editing tools the procedure is more efficient and will reduce deadly infections and usage.

5. Applications and prospects

5.1 CRISPR multiplexing

The characteristic distinguishing CRISPR from ZFN and TALEN is the ease with which multiplexing can be accomplished. Multiplexing makes it possible to introduce DSBs simultaneously at several sites in the genome, which is useful when studying many human diseases [94]. The cuts can be made at sites close to each other, producing a large deletion within the gene. Cuts can also be made in many genes at the same time, enabling several gene knockouts at once. From here, the repair mechanisms NHEJ or HDR can create indels or insert a repair template, respectively. Multiplexing can also be used to repress or activate multiple loci in order to understand connections between loci and for gene mapping [94].

Multiplexing requires one reprogrammable Cas protein (typically Cas9 or Cas12) and multiple gRNAs [95]. There are several strategies for expressing multiple gRNAs in cells, but the two major approaches are multiple gRNA expression cassettes and polycistronic expression of gRNAs (figure 22). When multiple gRNA expression cassettes are used, one or more plasmids containing individual gRNAs are delivered into the cell and transcribed there. The polycistronic expression techniques are inspired by the native structure of the CRISPR array and, unlike expression cassettes, yield a transcript that is matured into individual gRNAs after transcription inside the cell, see figure 22 [96]. The polycistronic expression technique thereby generates non-identical gRNAs with separate target sites.

Figure 22: The expression of multiple gRNAs can be achieved by two main mechanisms, which differ significantly in when the gRNA matures. With a polycistronic cassette (A), the native-like CRISPR array in the plasmid is transcribed into an intermediate containing numerous gRNAs; the array is then cut and the gRNAs pair with Cas9, creating different CRISPR-Cas9 complexes. With expression cassettes, however, each gRNA is mature and ready after transcription of the plasmid. Expression cassettes can be designed with multiple gRNAs in one plasmid (B) or as several plasmids containing individual gRNAs (C).

The use of multiplexing as a new tool in gene editing will remain a delicate matter for a long time because of the difficulties in designing CRISPR arrays for multiplexing, the lack of efficiency and, foremost, the limited knowledge of gene connections in organisms [94].

5.2 Diagnostics and drug development

CRISPR has great potential to cure and treat monogenic diseases and has already been applied in animal models for gene studies. The applications of different therapeutic methods to create a genetic cure are limited by the fact that disease-causing mutations are significantly easier to correct in early embryos (germline editing) than in somatic cells, partly because in somatic editing the modification has to be delivered to trillions of cells. CRISPR has successfully been used to edit genes in non-viable human embryos, with mixed results concerning safety and mosaicism [97].

Diseases induced by parasites or caused by infectious agents are estimated to cause one third of all deaths worldwide each year [98]. These diseases often lack effective vaccines or treatments. CRISPR enables crucial studies on pathogenic parasite vectors and has the potential to combat parasites before an outbreak [99]. Similar tactics, in which pathogenic sequences are rendered dysfunctional, have been carried out successfully, bringing a cure for infectious diseases like HIV a step closer.

The multiplexing capability of CRISPR has great potential in the treatment of cancer, a disease killing one out of six people worldwide [100]. The focus in beating cancer today lies in understanding carcinogenesis, since it is caused by the interplay of many genes. One limitation is, for instance, finding the mutations that cause tumor growth in a large library of different mutations and tangled connections. In this respect, the CRISPR-Cas system can be retooled to find important target sites and describe gene functions at greater speed by utilizing multiplexing. Within immunotherapy, CRISPR can be used to reprogram T-cells so that they locate tumor cells and differentiate them from healthy cells, much like being given GPS coordinates. Hopefully, one treatment with CRISPR could be enough to obtain a long-term effect, in contrast to the repetitive and prolonged cancer treatments of today. These strategies are currently used in clinical trials in the US to treat refractory cancer [101], β-thalassemia [102] and sickle cell disease [103].

In 2019, Vertex presented an ex vivo strategy to treat β-thalassemia (CTX001). Preclinical disease models for β-thalassemia showed a 93.0% indel frequency with RNP electroporation delivery, indicating that the treatment can be developed further [104]. The first time a CRISPR-Cas system was used inside the body was in March 2020 [105]. The drug EDIT-101 was injected to treat the inherited eye disease Leber congenital amaurosis by editing a point mutation. The Staphylococcus aureus Cas9 (SaCas9) was injected subretinally into the eye with AAV5 [106] to cleave the mutated gene, followed by NHEJ [104].

Modifications of the CRISPR systems make it possible to apply, in therapeutics for heterologous transplantation, what could earlier have been described as sci-fi applications. Xenotransplantation enables tissue to be moved from an animal, for example a modified, humanized pig, to a human. By applying CRISPR to mute the immune response triggered by the animal genes, xenotransplantation could save millions of lives [107].

Further, CRISPR-Cas systems have shown great potential to be a groundbreaking tool within diagnostics. Controlling the ongoing SARS-CoV-2 pandemic depends on fast, cheap and accurate testing to limit the spread of the coronavirus and the mortality rate. CRISPR diagnostic methods may help stop the rapid spread of the virus earlier, since a test based on CRISPR technology gives results within one hour [108], compared with other, more time-consuming methods. RT-PCR is a frequently applied test for identifying the virus today, but it has considerable limitations: it is time-consuming, requires laborious procedures and is also sensitive to deamination. Several labs have used CRISPR to develop tests that detect coronavirus infection, such as the SHERLOCK test from the Broad Institute (Feng Zhang) and DETECTR from Mammoth Biosciences [109]. The SHERLOCK test utilizes the Cas13 protein to find target sequences and cut RNAs into smaller pieces in a process termed collateral cleavage. These RNA cleavages can be marked so that they can be read out in a simple setting as distinct lines on a paper strip, similar to the lines on a pregnancy test. On 8 May 2020, the SHERLOCK-based test produced by Sherlock Biosciences was approved by the FDA for detection of coronavirus cases [110].

5.3 Food industry and agriculture

Conventional plant breeding has historically relied on radiation or chemicals to achieve genetic variation within crops. These techniques are only capable of generating random mutations along the plant's entire genome, generally at many sites simultaneously. Scientists and farmers have thus only been able to hope for a mutation that brings forth a specifically desirable trait, and not one that causes plant degradation. Although these approaches have been applied with passable success since the 1950s, they are quite unreliable, cumbersome and expensive. With CRISPR, it is possible to select in advance one gene, or multiple genes at one time (multiplexing), to be edited, and thereby make direct, precise and more efficient changes in the genetic material. The quick process of genetically engineering plants with CRISPR may save years or even decades in bringing new, needed varieties to farmers [111]. The two most widely used CRISPR tools for plant breeding are CRISPR-Cas9 and CRISPR-Cas12a (formerly named Cpf1) [112].

The advances in , along with the feasibility to sequence entire genomes, will presumably improve the breeding of various crops and hopefully enable us to meet the global need for an increased food supply as the world's population continues to grow. Many studies within the food industry and agriculture direct their research towards major staple and fruit crops that are consumed worldwide. CRISPR gene editing has been tested in rice [113], wheat [114], maize [115], soybeans [116], potatoes [117], tomatoes, apples and citrus species [118], to name the main ones, with differing and in some cases promising results. The crops are modified to withstand harsh weather conditions such as wind, drought, heat, cold or exceedingly wet circumstances in order to cope with ongoing climate change, and thereby help to increase crop yield at the end of the season. A warmer climate may also benefit tropical pests, amplifying the already existing problem of pests and of chemical pesticides causing environmental pollution. Drought and heat can also affect livestock health and potentially lead to mass slaughter and huge economic losses. Consequently, CRISPR has been implemented in livestock as well, and in microbe production and industry, which will be discussed later in this chapter.

Crops are generally harder to transfect due to their rigid cell wall, and more difficult to edit due to their genetic complexity, which often involves more than two copies of each chromosome (polyploidy), all of which need to be mutated in order to achieve a specific trait [119]. Delivery into crops can, for instance, be accomplished by biolistic bombardment using specialized equipment called a gene gun, or via polyethylene glycol (PEG) mediated transfection of protoplasts [30]. The issue of off-target effects with CRISPR technology remains when working with plants, and there is a concern that the undesired effects might even increase due to the repetitive sequence patterns in plant genomes.

According to the World Health Organization (WHO), 90 million preschool-aged children in developing countries suffer from vitamin A deficiency, leading to increased mortality and blindness [120]. CRISPR-Cas9 has been used to introduce a gene mutation in rice that enables it to produce beta-carotene, a substance that is converted into vitamin A in the body [121]. Here, the enhancement of the nutritional content of rice has been demonstrated, and it is believed that genome editing holds great promise for biofortification of several staple foods in order to combat malnutrition in the world.

A devastating disease in citrus trees is Huanglongbing (HLB), also known as citrus greening or yellow dragon disease, caused by the Candidatus bacterium. This highly severe disease has gained a foothold in Asia, Africa and, since 2005, the U.S. [122]. The citrus industry has been severely affected by the disease, which has led to substantial losses and major economic impacts. CRISPR has been used in attempts to make citrus cultivars less susceptible to the disease, but at present it has not been possible to develop full resistance to the bacteria because the virulence factors are poorly understood [118].

With CRISPR-edited tomatoes, one research group managed to increase the crop yield (tenfold), attain larger fruits (threefold) and enhance the content of the antioxidant lycopene (fivefold), thereby making the tomatoes more nutritious [123]. Another study made tomatoes resistant to a species, a fungus causing a prevalent disease in tomato plants [124].

Porcine reproductive and respiratory syndrome (PRRS) is one of the most severe viral diseases afflicting swine populations. The virus is widespread in Asia, North America and many countries in Europe. PRRS is the world's most expensive swine disease, and the annual cost for North America alone amounts to 600 million US dollars [125]. The virus's ability to infect swine cells depends on a receptor called CD163 on the swine's macrophages, which works as an entry point for the PRRS virus into the cell. Scientists have managed to knock out the CD163-coding gene with CRISPR-Cas9. Afterwards, the gene-edited pigs were exposed to the virus and showed no clinical signs of respiratory disease, viremia, lung pathology or response during the 35 days of study after infection [125].

Using CRISPR-Cas9 editing, researchers have also managed to disrupt a gene in cashmere goats and achieved enhanced fiber length and an increased number of hair follicles. The project yielded an average increase of 92.75 g of cashmere per goat compared with the wild type [126]. The average cashmere weight of the non-edited goats in this study was about 290 grams, which corresponds to an increase in cashmere yield of approximately 31.9%. Out of 19 goats, only six were successfully modified in the target gene.

CRISPR may also offer useful advantages in microbial industries. Scientists have successfully used CRISPR to improve the lipid production of the industrial microalga N. gaditana, thereby taking a step closer to replacing petroleum-based transport fuels with algae-derived biofuels [127]. Additionally, the variable and adaptive nature of CRISPR makes it a good candidate for strain typing and comparative analysis in various industrial processes. CRISPR-encoded phage resistance can also be used to decrease food waste in manufacturing procedures, to screen for and kill microbes in mixed bacterial populations and, hopefully, to enable food production that consumes fewer resources, such as water [128].

Finally, genome editing in plants seeks to control and enhance beneficial traits such as longer shelf life, improved salt tolerance, resistance against pests and diseases, improved fecundity and growth patterns and herbicide tolerance, and even to eliminate allergens such as egg white [129] and gluten [130]. CRISPR gene editing may also generate healthier, more aromatic and flavourful, and maybe even more affordable food in the future, since fewer crops would be lost.

5.4 Regulations and GMO

The European Union has stated that all gene-edited crops, including those engineered with CRISPR, are to be treated as genetically modified organisms (GMOs) and will hence be under very strict regulations in the EU [131]. By contrast, instead of looking at the process by which the crops are produced, the United States Department of Agriculture (USDA) tends to focus on the end product. The USDA stated in March 2018 that if the end product of CRISPR genome editing could have been developed by nature or by traditional breeding techniques, it should not be considered a GMO as long as it is not a plant pest or derived from plant pests [111]. Under this ruling, products without exogenous DNA (transgenes) and with small changes in the genetic code, such as deletions and insertions that could have come about by crossbreeding or conventional techniques, are outside the scope of the USDA [132]. Such CRISPR-edited crops will therefore not be controlled by the USDA, paving the way for them to be cultivated and sold without regulatory monitoring. In this way, the USDA wishes to foster biotechnological innovation.

The Food and Drug Administration (FDA) overseeing food and feed safety in the U.S. encourages farmers to consult the FDA before marketing new products. FDA helps developers to ensure that their genetically engineered food products are safe and lawful before commercial distribution, but the consultation program is only voluntary [133].

These laissez-faire approaches among the U.S. agencies will save many years and huge amounts of money in bringing a genetically modified plant to the market, but as long as there are no consistent international agreements, some countries are gridlocked while others can exploit and benefit greatly from the situation.

5.5 Gene drive

The discovery of CRISPR-Cas9 has contributed to countless advancements within diverse fields of study. The support of CRISPR-Cas9 technology enabled certain research areas to overcome obstacles that previously stood in the way of their progression. Gene drive is a prime example of a technique formerly limited by technical constraints, until the deployment of CRISPR-Cas9 came to solve critical issues regarding the creation of synthetic gene drives [134].

Gene drive is a technique inspired by naturally occurring "selfish" genetic elements. These elements allow genes to be passed on to the next generations at a higher rate than the wild type (WT) gene, even if the element carries a potential fitness cost. Similar to selfish genetic elements, gene drive systems increase the rate of inheritance of the genes they are incorporated into, see figure 24. Normally, an allele has a 50% chance of being passed on to the offspring, but with a gene drive system specific alleles can be inherited to a greater extent across multiple generations, giving a gene the capacity to spread rapidly throughout a population [134]. There are numerous types of gene drives, each increasing inheritance through a different kind of mechanism, but the two major types are gamete reduction and homing-based gene drives [134].

Gamete reduction operates by reducing the number of viable gametes containing the wild type allele in the early stages of meiosis. These types of biased inheritance mechanisms can, for instance, alter the offspring sex ratio or be used for population suppression [134].

Homing-based gene drives operate by copying themselves onto chromosomes lacking the gene drive element, a process called homing (figure 23). Homing-based mechanisms can be utilized to spread desirable traits, termed cargo, in populations when these are inserted in the gene drive system, or alternatively to disrupt genes, causing sterility or recessive lethality. Homing-based gene drives depend on endonuclease activity to bring about changes in the genome and achieve increased inheritance. In synthetic gene drive design, the naturally occurring nucleases are generally replaced by a CRISPR endonuclease, with the purpose of providing the gene drive with a more manageable and easy-to-engineer endonuclease [134].

The first step of the homing gene drive mechanism is the expression of the endonuclease (in this example CRISPR-Cas9). CRISPR-Cas9 then recognizes and cuts the wild type allele on the opposite chromosome. The DSB induced by CRISPR-Cas9 triggers the cell's repair mechanisms. Via the HDR repair procedure, the cell utilizes the gene drive element as a repair template, causing the gene drive to be incorporated into the opposite chromosome. The result is a homozygous offspring containing the gene drive element on both chromosomes of the pair, ensuring that the gene drive is passed on to the next generation, where the process is repeated.

Figure 23: Homing-based gene drive mechanism. Two mosquitos are crossed, one homozygous parent with a gene drive element and one parent with wild type alleles (shown in green). Normally, the cross between the parental mosquitos would result in heterozygous offspring containing one wild type and one gene drive allele. With a homing-based gene drive, however, this is not the case. Instead, the following takes place: first, Cas9 and an sgRNA are expressed and the complex cuts the wild type allele. After the cut, homing takes place through the repair pathway HDR, which identifies the gene drive system as a repair template due to the disrupted gene fragments upstream and downstream of the drive. In the HDR process the gene drive is copied to the opposite chromosome, and the end result is a homozygous offspring carrying two copies of the gene drive system.

Figure 24: Inheritance patterns; normal inheritance compared with gene drive inheritance. The normal inheritance pattern shows how an ordinary modified gene (pink mosquito) has a 50% chance of transmission to future generations, whilst the gene drive inheritance pattern shows a 100% chance of transmitting the gene drive system to future generations, which explains the potential of gene drives to spread exponentially in a population.
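The difference between the two inheritance patterns can be made concrete with a small numerical sketch. The Python code below is only an illustration and is not taken from the cited studies. It assumes an idealized, randomly mating population without fitness costs or resistance alleles, in which a heterozygote transmits the drive allele with probability (1 + c)/2, where c is the homing efficiency (c = 0 corresponds to normal inheritance, c = 1 to the 100% transmission in figure 24). The names next_frequency, simulate and homing_efficiency are hypothetical.

```python
def next_frequency(q, homing_efficiency):
    """Drive-allele frequency after one generation of random mating.

    q is the current frequency of the drive allele. Homozygous carriers
    (fraction q*q) always transmit the drive; heterozygotes (fraction
    2*q*(1-q)) transmit it with probability (1 + c) / 2. Fitness costs
    and resistance are ignored (simplifying assumption).
    """
    heterozygotes = 2 * q * (1 - q)
    transmission = (1 + homing_efficiency) / 2
    return q * q + heterozygotes * transmission


def simulate(q0, homing_efficiency, generations):
    """Return the list of drive-allele frequencies, starting from q0."""
    frequencies = [q0]
    for _ in range(generations):
        frequencies.append(next_frequency(frequencies[-1], homing_efficiency))
    return frequencies


if __name__ == "__main__":
    # Start with 1% drive alleles and follow them for ten generations.
    for c in (0.0, 1.0):
        trajectory = simulate(0.01, c, 10)
        print(f"homing efficiency {c}:", [round(q, 3) for q in trajectory])
```

Under these assumptions the frequency stays constant with normal inheritance, while with perfect homing it approaches fixation within about ten generations. Real populations deviate from this picture, since fitness costs and resistance against the drive slow down or stop the spread.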

Future hopes for gene drives include eliminating insect-borne diseases, controlling crop pests and controlling the spread of invasive species. Another benefit of implementing gene drives would be that they are cost-efficient and environmentally friendly compared with current methods, for example the use of various pesticides to control crop pests, invasive species and insect-borne diseases [135].

Invasive species that destroy the natural habitats of native animals are a widespread problem that could possibly be kept in check using gene drives. An invasive species such as the Asian carp in the Mississippi has invaded the natural environment in flood areas and caused a radical decrease in common carp [136]. The Asian carp is a fast-growing fish that can increase rapidly in numbers, putting the natural ecosystem out of balance and making it a serious problem in the widespread river system of the Mississippi.

In 2017, the first approach to use gene drives against vector-borne diseases was tested. The study focused on a "natural gene drive" that was implemented in a part of Australia for two years. The mosquito was modified by transferring Wolbachia, a bacterial symbiont, from Drosophila. The advantage of using bacterial symbionts is that they can reduce the mosquito's ability to transmit disease and thereby hinder the spread of Zika and dengue. The natural gene drive showed successful results, spreading through the populations in the areas where it was implemented. The results indicated that the natural gene drive could be used on a large scale [137].

Yet, significant challenges remain to be solved before gene drive systems can be applied more widely. Some of the obstacles are making reliable estimates of the impact on the ecosystem, investigating the risk of gene drives being transmitted between species, and preventing the development of gene drive resistance [135].

5.6 Anti-CRISPR proteins

The arms race between bacteriophages and bacteria has resulted in the development of proteins that help the phages withstand the CRISPR immune system. These anti-CRISPR (Acr) proteins were found shortly after CRISPR-Cas and are today studied by scientists in order to expand the CRISPR toolbox and to be prepared to inhibit unethical applications of CRISPR-Cas. Today, these naturally occurring proteins may be used as a last resort to inhibit undesired results from CRISPR systems. Acrs have even been described as a practicable steering wheel for performing safe editing of a genome. Bioinformatic tools can be used to find these proteins and exploit their inhibiting effects within medicine or against an unwanted gene drive [138].

Acrs work in different ways, and since they were first discovered in phage DNA in 2013, mechanisms have only been proposed for a few of the proteins [139]. Anti-CRISPR proteins that inhibit Type II and Type V systems are the most studied. The studies have shown that some Acrs affect the stability of parts of the complex, or the nuclease domains, resulting in blocked DNA binding and inhibition of cleavage. Bacteria can overcome the anti-CRISPR proteins to a certain degree, but phage cooperation will eventually defeat the defense against viral DNA [140]. Doudna and coworkers have shown that these "switch-off" molecules can be transported efficiently into human cells [141].

6. Limitations

6.1 Technical limitations and knowledge constraints

The two major hurdles with CRISPR-Cas are how to deliver the components safely and efficiently, and how to decrease the high off-target effects it has been shown to cause in many species, in both target and non-target tissues. The various delivery methods already examined have shown diverse merits and demerits when it comes to degree of immunogenicity, sequence specificity, unintended cleavages and efficiency.

Let us take a journey along the molecular pathway of the CRISPR components, from the extracellular environment to the final cut within the nucleus and its subsequent gene editing outcomes. The CRISPR-Cas9 system can be delivered as DNA, RNA or RNP. These constructs can be introduced into the cell via various methods; some deliver the CRISPR components directly to the nucleus, while others insert them via the cytoplasm. The methods themselves can be harmful or inefficient for specific cell types or tissues, obliging the researcher to find out which of them is preferable for the cell of interest. Some processes require skill and expensive equipment, while others only require particular reagents or laborious manual work. If the CRISPR components are delivered into the cytoplasm, they need to be protected from endosomal entrapment and degradation. In addition, the CRISPR molecules have to cross the nuclear membrane. This is usually not a major problem in dividing cells, since the nucleus becomes permeable during mitosis, but getting the CRISPR components into the nucleus of non-dividing cells is a considerable issue. Viral vectors are generally used for non-dividing cells since they have a natural mechanism for penetrating the nucleus with their viral genome.

Later on, when the Cas protein and sgRNA have crossed the nuclear membrane, the complex needs to search for the right target sequence within the genome. Even though the targeting sequence of the sgRNA is about 20 nt in length, one or several mismatches are permitted, making it even shorter in terms of the number of nucleotides that are required for sequence specificity [142]. The fewer nucleotides that are required to match, the higher the risk of encountering a similar sequence elsewhere in the genome. Further, only the 8-12 bp seed sequence adjacent to the PAM site needs to agree exactly with the corresponding genome sequence in order for the cut to be carried out, leaving the sequence specificity even more modest [143]. As previously stated, mismatches in the PAM-distal part of the sgRNA sequence (towards its 5'-end) do not influence the pairing with the target sequence as much as mismatches in the 3'-seed sequence immediately upstream of the PAM site. This leads to an undesirably high degree of unintended cuts when editing genes in humans and animals. The number of unintended cuts can sometimes even be higher than the number of on-target cleavages [144]. The capacity to target less accurate sequences may have its origin in the adaptive defense mechanism of bacteria, since it is most likely a benefit to be able to recognize and destroy viruses with both exact and similar genomic sequences.
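The role of the seed sequence can be illustrated with a short Python sketch. It is a deliberately simplified rule of thumb and not a validated off-target predictor: a candidate site is flagged as potentially cleavable if the PAM-proximal seed matches exactly and the number of PAM-distal mismatches stays below a threshold. The seed length of 12 nt follows the 8-12 bp range given above, while the threshold max_distal = 3 and the function names mismatch_profile and may_be_cleaved are assumptions made for illustration only. Both sequences are written 5' to 3' as DNA, with the PAM assumed to follow directly after the last position.

```python
def mismatch_profile(guide, site, seed_len=12):
    """Count mismatches in the PAM-proximal seed and in the distal region.

    guide and site are equally long sequences written 5' -> 3'; the last
    seed_len positions (the 3'-end) are treated as the seed next to the PAM.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    mismatched = [
        i for i, (g, s) in enumerate(zip(guide.upper(), site.upper())) if g != s
    ]
    seed_start = len(guide) - seed_len
    in_seed = sum(1 for i in mismatched if i >= seed_start)
    return {"seed": in_seed, "distal": len(mismatched) - in_seed}


def may_be_cleaved(guide, site, seed_len=12, max_distal=3):
    """Simplified rule: exact seed match and few distal mismatches.

    max_distal is an illustrative threshold, not a measured value.
    """
    profile = mismatch_profile(guide, site, seed_len)
    return profile["seed"] == 0 and profile["distal"] <= max_distal


if __name__ == "__main__":
    guide = "ACGTACGTACGTACGTACGT"
    distal_variant = "T" + guide[1:]   # mismatch at the 5'-end (PAM-distal)
    seed_variant = guide[:-1] + "A"    # mismatch next to the PAM (seed)
    print(may_be_cleaved(guide, distal_variant))
    print(may_be_cleaved(guide, seed_variant))
```

In this sketch, a single mismatch far from the PAM is tolerated, whereas a single mismatch in the seed prevents the site from being flagged, which mirrors the qualitative behaviour described in the text.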

As mentioned, the Cas9 protein also requires a PAM recognition site in order to induce a double-stranded break in the DNA. These PAM sites consist of specifically ordered nucleotides and are limited to certain genomic sites. For the Cas9 protein, the PAM sequence is NGG, where N stands for any nucleotide and G represents the nucleotide guanine. Other Cas proteins have different PAMs, broadening the possible cutting sites within genomes, and further exploration has even enabled changes to and novel varieties of the PAM sequence for the Cas9 protein alone. Still, the number of known PAM sequences is too small, as the existing ones do not always sit adjacent to the gene that the scientist wants to alter. On the other hand, there are too many PAM sites. This might sound confusing, but if we look at Cas9's PAM sequence NGG, it is clear that the Cas9 protein only requires one arbitrary nucleotide followed by two Gs in order to bind to the DNA sequence. Looking at the frequency of GG in the human genome gives the enormous number of 161 284 793 PAM sites, or on average one in every 42 bases [145]. This makes PAM recognition rather unspecific, and might be a reason for the high off-target effects (along with the permitted mismatches in the spacer sequence). Therefore, longer PAM motifs or less frequently occurring PAM sequences would be preferable in human experiments.
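How frequently NGG motifs occur is easy to explore on any DNA sequence. The Python sketch below scans both strands of a sequence for NGG PAMs and returns the 20 nt immediately upstream of each PAM as a candidate target (protospacer). It is a minimal illustration under the assumption of a plain A/C/G/T sequence; the function names reverse_complement and find_targets are hypothetical, and the sketch performs no off-target or efficiency scoring.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(seq, spacer_len=20):
    """List (strand, protospacer, pam) for every NGG PAM on both strands.

    Only PAMs with a full-length protospacer upstream are reported. Each
    strand is read 5' -> 3', so the protospacer always precedes its PAM.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(spacer_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":  # N can be any nucleotide
                hits.append((strand, s[i - spacer_len:i], pam))
    return hits


if __name__ == "__main__":
    example = "A" * 20 + "TGG"
    for strand, protospacer, pam in find_targets(example):
        print(strand, protospacer, pam)
```

Retargeting Cas9 then amounts to choosing a different protospacer from such a list and designing the gRNA accordingly, while the abundance of hits in longer sequences illustrates why the short NGG motif contributes little to specificity.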

If we allow ourselves to dream of the scenario of cuts only happening at on-target sites, the problem with the eukaryotic cell’s mechanism to repair broken DNA still remains. Eukaryotic cells mend cleavages in DNA by the mechanisms of NHEJ or HDR (detailed description of the repair mechanisms can be found in section 2.1.3). If the DNA gets repaired by the NHEJ mechanism it will cause random mutations in the sequence, typically insertions or deletions (indels collectively), which can bring harmful consequences for the organism. Due to NHEJ’s haphazardness, scientists are trying to shift the cell’s repair mechanism from NHEJ towards HDR repair in order to guarantee safe, non-random mutations instead.

A last significant limitation to keep in mind is our limited understanding of all the processes in various cells, tissues and organisms, and of the origin of the large number of diseases that exist in the world. It is a difficult task to find out which gene or genes are involved in a certain disease (or in a particular feature such as eye color). Even if a scientist discovers a disease-causing gene in humans, that gene might also have essential functions in other mechanisms in our bodies, making it impossible to knock out that specific gene. Furthermore, humans differ considerably in their genomic makeup, which makes the task even harder since individualized cures have to be developed.

The ethical concerns about gene editing with CRISPR can also be seen as a limitation for CRISPR’s future progress, but these are discussed further in the next chapter.

7. Ethics

7.1 Diseases, diagnostic tools and patents

The CRISPR-Cas9 system can be applied in many areas, and its use is accompanied by risks such as misapplication, inequality and eugenics. The lack of knowledge about the complicated connections within the human genome, together with major concerns about mosaicism, where multiple genotypes exist in one patient [144], has a cumulative effect on the ethical questions concerning the use of CRISPR to treat diseases in humans.

CRISPR was used for the first time in a patient with sickle-cell anemia in 2019 [146], editing somatic cells. If the treatment turns out to be efficient and safe, it would seem like an easy choice for patients. However, there is another point of view to consider. Individuals heterozygous for the sickle-cell anemia gene can have partial protection against malaria [147], which in some cases can be valued higher than a treatment for the disease. A disease can also be seen as a part of a person, shaping personality and qualities. By performing CRISPR modifications in the DNA, these characteristics would be lost. The same arguments can be heard from people with deafness or blindness, since these conditions give a new “insight” into life. Another aspect to consider is which diseases should be studied and treated with CRISPR. If safe editing can be performed, wealthy families with members suffering from a genetically inherited disease can finally get treatment, but people in developing countries will probably see the gap to the more privileged grow even larger. This can also be taken further, into discussions about how to define a disease.

There are concerns that these approaches will cross crucial borders: where should the line be drawn between diseases and undesirable human traits? CRISPR can be considered a democratising technology, and therefore worldwide regulation is called for. Regulations and laws have to meet many demands, since they have to be equal between nations and must not limit progress or discoveries in particular countries, given that the technique can contribute to new innovations that promote economic growth. The fast-growing technology and the concerns that followed led to a meeting on genome editing, since the debate on ethics and safety was falling behind. The International Summit on Human Gene Editing concluded that it would be irresponsible to apply germline editing due to the lack of safety and efficacy [148].

The possibilities to treat human diseases and develop diagnostic tools also rely on approved patents. Since the discovery of CRISPR systems is the result of collective progress, it is hard to say who really discovered CRISPR. Patents give power, which will affect the prices of all biotechnology products designed with CRISPR. The US patent dispute over the use of CRISPR in eukaryotic cells has been going on for some years, since the Broad Institute of MIT and Harvard was the first to have a patent approved [149], even though Berkeley filed first, when Jennifer Doudna and Emmanuelle Charpentier published their revolutionary paper in 2012. Both the US and the European patent debates are still heated, and will affect the market for CRISPR.

The development of new diagnostic tools also paves the way for patent disputes, which can potentially raise prices. In developing countries, diagnostic tool kits have been available for some time, but the price and the equipment required for testing have been a huge limitation. Cheap and fast CRISPR diagnostics, which can be performed without a high-tech lab, give new hope, but still depend on the outcome of the patent battles.

7.2 Designer babies

Research on human germline cells - good or bad - will have inescapable consequences that can affect future generations, since the edits are passed on to the offspring. For a long time, germline editing has been deemed illegal and unethical in several countries.

In 2018, He Jiankui, a Chinese scientist, presented his research in which CRISPR was performed on viable human twin embryos. The gene-edited twins were born in late 2018 [150] and carried a CCR5 gene mutation intended to confer HIV resistance. The criticism was multi-faceted and the ethical missteps were profound, since it is still not known how the disrupted gene functions in all the different cell types in the body. His work has also been criticized for the lack of peer review and the fact that the results were never published in a scientific paper. It has not yet been verified whether both twins are HIV-resistant, or whether there is a possibility of mosaicism in their cells. Subsequent studies in mice, where the CCR5 gene was disrupted, indicated effects on cognitive behavior [151].

In 2019, a third gene-edited baby was born, and the discussion about non-medical applications of CRISPR has grown ever since [150]. The question of where to draw the line for embryo editing has to be answered. Should only inherited diseases be edited, or should parents be able to select characteristics, for example by editing the myostatin gene for muscle growth, or even to affect intellectual traits? In this respect, the CRISPR babies go against many moral and religious beliefs, since they imply unnatural selection in humans and, as mentioned earlier, some people consider their disease a part of their personality, which would vanish if the DNA were manipulated before birth.

But if germline editing can be performed safely, it can greatly improve human health. By changing the human genetic code, healthier babies would lower healthcare costs, which would potentially be favoured by governments but could also create a whole new inequality in society. Eugenic questions thereby also arise, concerning whether CRISPR applications will create a distinction between edited and “natural” humans, and whether the two groups will be treated differently.

7.3 Ecosystems and gene drive

Gene drives can be used to alter the sex ratio of offspring, to spread a modified gene, and to induce infertility traits or recessive lethality traits. So far, gene drive research has focused on solving issues regarding insect-borne diseases, as well as on ways to control crop pests and invasive species. Although gene drives could be the answer to several profound problems, the technology is still relatively new and has to be examined carefully before it can be used more broadly. Given that gene drives can affect natural ecosystems, ethical issues arise concerning whether humans have the right to eliminate other species for our own benefit. Notably, mosquitoes are, compared with other insect species, responsible for the greatest spread of deadly diseases in the world [98]. Arguably, not acting could lead to a worse outcome.

Bioethical questions arise with the use of gene drives, which could affect ecosystems negatively given that various species depend on each other. Out of concern for disrupting the balance in ecosystems, synthetic gene drives have so far not left the lab environment. Eradicating species or spreading modified genes throughout them carries a risk of unforeseen outcomes. If gene drives were released into the wild, they would not respect borders or continents. The eradication of a species might be beneficial in one part of the world, but devastating in another. Even the ability to steer extinction, by saving endangered species or bringing back extinct animals (de-extinction), may be tested in the near future. De-extinction might send huge shock waves, and the decision to play God and implement unnatural selection between species may provoke religious groups or society at large.

Throughout history, society has shown repulsion towards gene editing. One example is the AquAdvantage salmon (derided as Frankenfish), which would yield twice as much meat in considerably less time and at a lower price. The environmental impact of the fish was clearly lower, which could make a difference for the oceans. Nevertheless, the artificial Frankenfish was slammed with criticism on bioethical grounds, and anti-GMO movements influenced supermarkets not to sell the fish when it was first promoted in the U.S. [152].

To be able to implement gene editing and gene drives to save and improve ecosystems, public engagement and education are needed so that the public can discuss the possibilities.

7.4 Moratorium

The CRISPR baby discussions made it clear that regulation of genome editing in the human germline is absent. Many scientists, including Emmanuelle Charpentier and Feng Zhang, call for a global moratorium, since they believe that nations must discuss the ethical and scientific questions, a discussion which is already lagging behind. The public also needs to be engaged in discussing the opportunities and consequences of heritable germline editing.

In March 2019, biologists suggested a temporary moratorium on human embryo editing, to prevent experiments that would result in more CRISPR babies [153]. The proposed moratorium does not aim to stop all science on germline cells, but rather targets research resulting in the birth of new CRISPR babies. The paper was criticized by one of the pioneers of CRISPR-Cas9, Jennifer Doudna, who in 2015 called for a similar moratorium (before He Jiankui presented the first CRISPR babies in 2018). Doudna declined to sign the moratorium, since she believed that a moratorium is no longer an adequate countermeasure. Regulation of heritable gene editing is required, along with transparency and open discussion [154].

8. Concluding remarks

After completing two months of full-time research on CRISPR-Cas systems, we perceive the gene editing tool widely known as ‘revolutionary’ differently than at the start of the project. In the early stages of the project, the technique was preconceived as a gene editing tool able to modify genomes with ease, a notion later found not to be entirely true. Speculation about why this was the case led to the theory that unfair mainstream media portrayal and limited knowledge were the cause. In contrast to early assumptions, CRISPR-Cas9 has a fair number of limitations and is virtually impossible to use for revolutionary research at your kitchen table, as some people might claim. It may be cheap and relatively easy to use compared to other gene editing techniques, but understanding the mechanism of CRISPR-Cas9 in detail takes a considerable amount of time. Nevertheless, performing simple CRISPR experiments does not require immense knowledge, time or effort, which makes it a good laboratory exercise for undergraduate students. The suggested lab exercise will require some understanding of the CRISPR system beforehand, and will also provide knowledge about CRISPR in practice - its simplicity and difficulties - so that students can draw their own conclusions about its efficiency and limitations.

To summarize, CRISPR-Cas has evolved from a natural, adaptive immune system in bacteria into the greatest biotechnological tool of modern times. The CRISPR-Cas system still has to overcome delivery problems and minimize off-target effects to be safe in its most ambitious applications. Nevertheless, the versatility of the systems is remarkable, and their simple programmability has already vastly extended studies within gene therapy, agriculture, epigenetics and many other application areas. Research endeavoring to understand diseases has reached a new phase, with novel, fast methods to tackle previously unsolvable problems, but the current market for gene therapy and biotechnological innovations may result in extreme costs for the first CRISPR drug, and lead to amplified inequalities among countries and people around the world.

The conclusion can be drawn that the CRISPR technology needs to be established early in education, both theoretically and in the laboratory, since it will most likely grow into a technique applied globally - if ethical discussions are held and wise regulations are established. The final questions need a ‘little’ savoir-faire to ensure that the new technique can eventually make a positive contribution to humans, animals and ecosystems on our planet.

9. References

1. Jayaswal S, (2020). Gregor Mendel. Salem Press Biographical Encyclopedia. Available at: https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&db=ers&AN=88807104&lang=sv&site=eds-live&scope=site.
2. De Castro M, (2016). Johann Gregor Mendel: paragon of experimental science. Molecular Genetics & Genomic Medicine, 4(1).
3. Hagan M, (2019). William Bateson. Salem Press Biographical Encyclopedia. Available at: https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&db=ers&AN=89129931&lang=sv&site=eds-live&scope=site.
4. Driscoll S, (2019). Francis Crick. Salem Press Biographical Encyclopedia. Available at: https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&db=ers&AN=89129720&lang=sv&site=eds-live&scope=site.
5. Kuhl S. A. K, (2019). Central dogma of . Salem Press Encyclopedia of Health. Available at: https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&db=ers&AN=94416402&lang=sv&site=eds-live&scope=site.
6. Synthego, (2018). History of Genetic engineering. Available at: https://www.synthego.com/learn/genome-engineering-history (Accessed: 19 May 2020).
7. Kresge N, Simoni RD, Hill RL, (2010). The characterization of restriction : the work of Hamilton Smith. J Biol Chem, 285(3). Available at: https://www-ncbi-nlm-nih-gov.e.bibl.liu.se/pmc/articles/PMC2804378/#?
8. U.S. National library of science. Available at: https://profiles.nlm.nih.gov/spotlight/cd/feature/biographical. (Accessed: 18 May 2020).
9. David A. Jackson, Robert H. Symons and Paul Berg, (1972). Biochemical Method for Inserting New Genetic Information into DNA of Simian Virus 40: Circular SV40 DNA Molecules Containing Lambda Phage Genes and the Galactose Operon of . Proc Natl Acad Sci USA, 69(10), 2904.
10. Science History Institute, (2017). Recombinant DNA history. Available at: https://www.sciencehistory.org/historical-profile/herbert-w-boyer-and-stanley-n-cohen.
11. Jaenisch, Rudolf, Ushma S Neill, (2015). A conversation with Rudolf Jaenisch. The Journal of clinical investigation, 125(9), 3305-3307.
12. Institute of Medicine (US) Committee to Study Decision Making; Hanna KE, editor, (1991). Asilomar and Recombinant DNA: The End of the Beginning. Biomedical Politics. Washington (DC): National Academies Press (US). ISBN-10: 0-309-04486-3.
13. F. Sanger, A. R. Coulson, (1975). A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase. Journal of Molecular Biology, 94(3), 441-448.
14. Stableford B, (2018). Frederick Sanger. Salem Press Biographical Encyclopedia. Available at: https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&db=ers&AN=90669622&lang=sv&site=eds-live&scope=site.
15. Synthego. History of Genetic Engineering and the Rise of Genome Editing Tools. Available at: https://www.synthego.com/learn/genome-engineering-history. (Accessed: 19 May 2020).

16. Deepak S.A, et al, (2007). Real-Time PCR: Revolutionizing Detection and Expression Analysis of Genes. Current genomics, 8(4), 234-51.
17. InformedHealth.org, (2016). The innate and adaptive immune systems. Institute for Quality and Efficiency in Health Care. Available at: https://www.ncbi.nlm.nih.gov/books/NBK279396/
18. David P. Clark, Nanette J. Pazdernik, Michelle R. McGehee, (2018). Molecular biology third edition, Andre Wolff, Elsevier Science. 3, 625-630. ISBN: 0128132892.
19. Synthego. How to use CRISPR. Available at: https://www.synthego.com/guide/how-to-use-crispr. (Accessed: 19 May 2020).
20. Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E, (2012). A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science, 337(6096), 816-821.
21. Gleditzsch D, Pausch P, Müller-Esparza H, et al, (2019). PAM identification by CRISPR-Cas effector complexes: diversified mechanisms and structures. RNA Biol, 16(4), 504-517.
22. Bratovič M, Fonfara I, Chylinski K, et al, (2020). Bridge helix arginines play a critical role in Cas9 sensitivity to mismatches. Nat Chem Biol, 16, 587-595.
23. Kim E, Koo T, Park S, et al, (2017). In vivo genome editing with a small Cas9 orthologue derived from Campylobacter jejuni. Nat Commun, 8, 14500.
24. Josephs EA, Kocak DD, Fitzgibbon CJ, McMenemy J, Gersbach CA, Marszalek PE, (2015). Structure and specificity of the RNA-guided endonuclease Cas9 during DNA interrogation, target binding and cleavage. Nucleic Acids Res, 43, 8924-8941.
25. Palermo G, Chen JS, Ricci CG, et al, (2018). Key role of the REC lobe during CRISPR-Cas9 activation by 'sensing', 'regulating', and 'locking' the catalytic HNH domain. Q Rev Biophys, 51(91).
26. Jiang F, Taylor DW, Chen JS, et al, (2016). Structures of a CRISPR-Cas9 R-loop complex primed for DNA cleavage. Science, 351(6275), 867-871.
27. Wang H, La Russa M and Qi L.S, (2016). CRISPR/Cas9 in Genome Editing and Beyond. Annual review of , 85, 227-264.
28. Szczelkun MD, Tikhomirova MS, Sinkunas T, et al, (2014). Direct observation of R-loop formation by single RNA-guided Cas9 and Cascade effector complexes. Proc Natl Acad Sci U S A, 111(27), 9798-9803.
29. Zeng Y, Cui Y, Zhang Y, et al, (2018). The initiation, propagation and dynamics of CRISPR-SpyCas9 R-loop complex. Nucleic Acids Res, 46(1), 350-361.
30. Addgene, (2018). CRISPR 101, A desktop resource. 2nd edition. Available at: https://www.addgene.org/guides/crispr/
31. Haapaniemi E, Botla S, Persson J, et al, (2018). CRISPR–Cas9 genome editing induces a p53-mediated DNA damage response. Nat Med, 24, 927–930.
32. Mingjie Liu, Saad Rehman, Xidian Tang, Kui Gu, Qinlei Fan, Dekun Chen, & Wentao Ma, (2019). Methodologies for Improving HDR Efficiency. Available at: https://doi-org.e.bibl.liu.se/10.3389/fgene.2018.00691
33. Patrick D. Hsu, Eric S. Lander, Feng Zhang, (2014). Development and Applications of CRISPR-Cas9 for Genome Engineering. Cell. Elsevier, 157.
34. Westra E. R, Dowling A. J, Broniewski J. M & van Houte S, (2016). Evolution and of CRISPR. Westra. 47, 307-331.

35. Tang Y, Fu Y, (2018). Class 2 CRISPR/Cas: an expanding biotechnology toolbox for and beyond genome editing. Cell Biosci, 8(59).
36. David Goodsell, (2015). Cascade and CRISPR. PDB. Available at: https://pdb101.rcsb.org/motm/181
37. Christopher A. Lino, et al, (2018). Delivering CRISPR: a review of the challenges and approaches. Drug delivery, 25(1), 1234-1257.
38. Pinilla-Redondo R, Mayo-Muñoz D, Russel J, et al, (2020). Type IV CRISPR-Cas systems are highly diverse and involved in competition between plasmids. Nucleic Acids Res, 48(4), 2000-2012.
39. Richter, Corinna, et al, (2012). Function and regulation of clustered regularly interspaced short palindromic repeats (CRISPR) / CRISPR associated (Cas) systems. Viruses, 4(10), 2291-311.
40. Gao, Zongliang, et al, (2018). RNA Polymerase II Activity of Type 3 Pol III Promoters. Molecular therapy. Nucleic acids, 12, 135-145.
41. Xu Tang, Xuelian Zheng, et al, (2016). A Single Transcript CRISPR-Cas9 System for Efficient Genome Editing in Plants. Molecular Plant, 9, 1088–1091.
42. Luisa Bortesi, Rainer Fischer, (2015). The CRISPR/Cas9 system for plant genome editing and beyond. Biotechnology Advances. Elsevier Inc, 33(1), 41-42.
43. Thermofisher, (2014). GeneArt CRISPR Nuclease mRNA System. Thermo Fisher. Available at: https://www.thermofisher.com/content/dam/LifeTech/global/life-sciences/synthetic-biology/pdfs/GeneArt%20CRISPR%20Nuclease%20mRNA%20System.pdf.
44. Vijai Singh, Pawan Dhar, (2020). Genome Engineering via CRISPR-Cas9 System. Academic Press, 1. 45-72. ISBN: 9780128181409
45. Chylinski K, Le Rhun A, Charpentier E, (2013). The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems. RNA Biol, 10(5), 726-737.
46. McGinn J, Marraffini L.A, (2019). Molecular mechanisms of CRISPR–Cas spacer acquisition. Nat Rev Microbiol, 17, 7–12.
47. Wan H, Li J, Chang S, et al, (2019). Probing the Behaviour of Cas1-Cas2 upon Protospacer Binding in CRISPR-Cas Systems using Molecular Dynamics Simulations. Sci Rep 9, 3188.
48. Clare E. Thomas, Anja Ehrhardt, Mark A. Kay, (2003). Progress and problems with the use of viral vectors for gene therapy. Nature Publishing Group, 4, 346-358.
49. Eliot Marshall, (1999). Gene Therapy Death Prompts Review of Adenovirus Vector. The American Association for the Advancement of Science, 286, 2244-224.
50. Cavazzana Marina, et al, (2016). Gene Therapy for X-Linked Severe Combined Immunodeficiency: Where Do We Stand?. Human gene therapy, 27(2), 108-16.
51. Kim E, Koo T, Park S, et al, (2017). In vivo genome editing with a small Cas9 orthologue derived from Campylobacter jejuni. Nat Commun 8. doi.org/10.1038/ncomms14500
52. Rolf Turk, Netanya Y Spencer, (2019). A high degree of similarity in CRISPR-Cas9 editing efficiency is found between 2-part guide RNAs and single guide RNAs. Integrated DNA Technologies. Available at: https://eu.idtdna.com/pages/education/decoded/article/a-high-degree-of-similarity-in-crispr-cas9-editing-efficiency-is-found-between-2-part-guide-rnas-and-single-guide-rnas
53. Enzmann, Brittany, (2018). The Problems of Using Plasmids for CRISPR Genome Editing. Synthego. Available at: https://www.synthego.com/blog/crispr-plasmid-pitfalls, (Accessed: 20 May 2020).
54. Chen, Baohui, et al, (2013). Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell, 155(7), 1479-91.

55. Addgene, (2017). CRISPR 101, A desktop resource. 2nd edition. Chapter 06. Available at: https://www.addgene.org/guides/crispr/
56. Xu C. L, Ruan M, Mahajan V. B & Tsang S. H, (2019). Viral Delivery Systems for CRISPR. Viruses, 11(1), 28.
57. Leila Haery, Marcey Patrick, et al, (2018). Viral Vectors 101 A Desktop Resource. Addgene, 1. Available at: https://cdn2.hubspot.net/hubfs/306096/viralVectorEbook_TJF_2018_7_24/2020_04_27_viralVectors101v1_TJF_2018_6_11_EDITS%20INCORPORATED.pdf?_hsenc=p2ANqtz-_XADvpR0RDp2oBskst77G43QVpgWkOKJl7WdIJv_iIi9ycPah59QzGKqRQq-049CCP_tZWajBOR2JHlTseQS3eoghMNYcdk7PVHJNeyeuIB5qQx7U&_hsmi=65258393&hsCtaTracking=90e1e508-4c55-4aa4-a61d-0b1cda09b5b2%7C56bece4f-5659-454f-a6fe-695f4dabeeb9
58. Xu, Christine L, et al, (2019). Viral Delivery Systems for CRISPR. Viruses, 11(1), 28.
59. Hanlon K.S, Kleinstiver B.P, Garcia S.P, et al, (2019). High levels of AAV vector integration into CRISPR-induced DNA breaks. Nat Commun 10, 4439.
60. Pellegrini, Roger, (2016). How to express CRISPR in your target cells. Benchtalk. Available at: https://www.benchling.com/2016/03/24/how-to-express-crispr-in-your-target-cells/. (Accessed: May 20).
61. Basila M, Kelley ML, Smith AVB, (2017). Minimal 2'-O-methyl phosphorothioate linkage modification pattern of synthetic guide RNAs for increased stability and efficient CRISPR-Cas9 gene editing avoiding cellular toxicity. PLoS One, 12(11).
62. Xiquan Liang, Jason Potter, et al, (2015). Rapid and highly efficient mammalian cell engineering via Cas9 protein transfection. J Biotechnol, 208, 44-53.
63. Genscript, (2018). GenScript-CRISPR-RNP-User-Manual. Genscript. Available at: https://www.genscript.com/gsfiles/techfiles/GenScript-CRISPR-RNP-User-Manual-2018.pdf
64. Scott T, Urak R, Soemardy C, et al, (2019). Improved Cas9 activity by specific modifications of the tracrRNA. Sci Rep 9, 4439.
65. Pellegrini, Roger, (2016). How to Synthesize your gRNAs for CRISPR. Benchtalk. Available at: https://www.benchling.com/2016/02/23/how-to-synthesize-your-grnas-for-crispr/#table1
66. Dean DA, Strong DD, Zimmer WE, (2015). Nuclear entry of nonviral vectors. Gene Ther, 12(11), 881-890.
67. Wilbie D, Walther J, Mastrobattista E, (2019). Delivery Aspects of CRISPR/Cas for in Vivo Genome Editing. Acc Chem Res, 52(6), 1555-1564.
68. Wilson RC, Gilbert LA, (2018). The Promise and Challenge of In Vivo Delivery for Genome Therapeutics. ACS Chem Biol. 13(2), 376-382.
69. Alain Vertes, Nathan J. Dowden, et al, (2020). Second Generation Cell and Gene-Based Therapies: Biological Advances, Clinical Outcomes and Strategies for Capitalisation. Academic Press. ISBN: 9780128120347
70. Origene. Available at: https://www.origene.com/products/gene-expression/crispr-cas9/crispra-crispri, (Accessed: 20 May 2020).
71. Konermann S, Brigham M, Trevino A, et al, (2015). Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature, 517, 583–588.
72. Baumann V, Wiesbeck M, Breunig C.T, et al, (2019). Targeted removal of epigenetic barriers during transcriptional reprogramming. Nat Commun 10, 2119.

73. Lucy Thorne, (2019). CRISPRi and CRISPRa: Moving beyond Gene Editing. Biocompare. Available at: https://www.biocompare.com/Editorial-Articles/361999-CRISPRi-and-CRISPRa-Moving-beyond-Gene-Editing/.
74. Hilton, Isaac B, et al, (2015). by a CRISPR-Cas9-based acetyltransferase activates genes from promoters and enhancers. Nature biotechnology, 33(5), 510-570.
75. Hitoshi Mitsunobu, Jun Teramoto, et al, (2017). Beyond Native Cas9: Manipulating Genomic Information and Function. Trends in Biotechnology. Elsevier, 35(10), 983-996.
76. Ayman Eid, et al, (2018). CRISPR base editors: genome editing without double-stranded breaks. The Biochemical journal, 475(11), 1955-1964.
77. Rees, Holly A, and David R Liu, (2018). Base editing: precision chemistry on the genome and transcriptome of living cells. Nature reviews. Genetics, 19(12), 770-788.
78. Kutubuddin Molla, (2019). CRISPR-mediated Plant Base Editors. Addgene blog. Available at: https://blog.addgene.org/crispr-101-crispr-mediated-plant-base-editors
79. Hu JH, Miller SM, Geurts MH, et al, (2018). Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature, 556, 57-63.
80. Maarten H. Geurts, Eyleen de Poel, Gimano D. Amatngalim, Rurika Oka, Fleur M. Meijers, Evelien Kruisselbrink, et al, (2020). CRISPR-Based Adenine Editors Correct Nonsense Mutations in a Cystic Fibrosis Organoid Biobank. Cell Stem Cell. Elsevier, 26(4), 503-510.
81. Anzalone A.V, Randolph P.B, Davis J.R, et al, (2019). Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149–157.
82. Ishino Y, Shinagawa H, Makino K, Amemura M, Nakata A, (1987). Nucleotide sequence of the iap gene, responsible for alkaline phosphatase isozyme conversion in Escherichia coli, and identification of the gene product. J Bacteriol, 169, 5429-5433.
83. Ishino Y, Krupovic M, Forterre P, (2018). History of CRISPR-Cas from Encounter with a Mysterious Repeated Sequence to Genome Editing Technology. J Bacteriol, 200(7).
84. Madden T, (2013). The BLAST Sequence Analysis Tool. The NCBI Handbook. Bethesda (MD): National Center for Biotechnology Information (US), 2nd edition. Available at: https://www.ncbi.nlm.nih.gov/books/NBK153387/
85. , Dennis A. Romero, Anne-Claire Coûté-Monvoisin, et al, (2008). Diversity, Activity, and Evolution of CRISPR Loci in Streptococcus thermophilus. Journal of Bacteriology. 190 (4), 1401-1412.
86. Broad Institute. CRISPR Timeline. Available at: https://www.broadinstitute.org/what-broad/areas-focus/project-spotlight/crispr-timeline
87. Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, Zhang F, (2013). Genome engineering using the CRISPR-Cas9 system. Nat Protoc, 8(11), 2281-2308.
88. Carroll D, (2011). Genome engineering with zinc-finger nucleases. Genetics, 188(4), 773-782.
89. Gaj T, Sirk SJ, Shui SL, Liu J, (2016). Genome-Editing Technologies: Principles and Applications. Cold Spring Harb Perspect Biol, 8(12):a023754
90. Mark Shwartz, (2018). Target, delete, repair. Stanford University. Available at: https://stanmed.stanford.edu/2018winter/CRISPR-for-gene-editing-is-revolutionary-but-it-comes-with-risks.html. (Accessed: 20 May 2020).

91. Bao Z, Cobb RE, Zhao H, (2016). Accelerated genome engineering through multiplexing. Wiley Interdiscip Rev Syst Biol Med, 8(1), 5-21.
92. Y. Chen, Y. Niu & W. Ji, (2016). Genome editing in nonhuman primates: approach to generating human disease models. Journal of internal Medicine, 280(3), 246–251.
93. Carlson D, Lancto C, Zang B, et al, (2016). Production of hornless dairy cattle from genome-edited cell lines. Nat Biotechnol, 34, 479–48.
94. McCarty N.S, Graham A.E, Studená L, et al, (2020). Multiplexed CRISPR technologies for gene editing and transcriptional regulation. Nat Commun 11, 1281.
95. Belén Adiego-Pérez, Paola Randazzo, Jean Marc Daran, René Verwaal, Johannes A Roubos, Pascale Daran-Lapujade, John van der Oost, (2019). Multiplex genome editing of using CRISPR-Cas. FEMS Microbiology Letters, 366(8).
96. Le Cong, F. Ann Ran, David Cox, Shuailiang Lin, Robert Barretto, Naomi Habib, Patrick D. Hsu, Xuebing Wu, Wenyan Jiang, Luciano A. Marraffini, Feng Zhang, (2013). Multiplex Genome Engineering Using CRISPR/Cas Systems. The American Association for the Advancement of Science, 339, 819-823.
97. Schenkwein D, Ylä-Herttuala S, (2018). Gene Editing of Human Embryos with CRISPR/Cas9: Great Promise Coupled with Important Caveats. Mol Ther, 26(3), 659-660.
98. WHO, (1998). Global health situation and trends 1955-2025. Available at: https://www.who.int/whr/1998/media_centre/50facts/en/, (Accessed: 20 May 2020).
99. Amy Maxmen, (2019). Faster, better, cheaper: the rise of CRISPR in disease detection. Nature, 566, 437.
100. World health organisation, (2017). 10 facts about cancer. Available at: https://www.who.int/features/factfiles/cancer/en/.
101. Stadtmauer EA, Fraietta JA, Davis MM, et al, (2020). CRISPR-engineered T cells in patients with refractory cancer. Science, 367 (6481).
102. ClinicalTrials.gov, (2018). A Safety and Efficacy Study Evaluating CTX001 in Subjects With Transfusion-Dependent β-Thalassemia. U.S. National Library of Medicine. Available at: https://clinicaltrials.gov/ct2/show/study/NCT03655678. (Accessed: May 20 2020)
103. ClinicalTrials.gov, (2018). A Safety and Efficacy Study Evaluating CTX001 in Subjects With Severe . U.S. National Library of Medicine. Available at: https://clinicaltrials.gov/ct2/show/study/NCT03745287 (Accessed: May 20 2020)
104. Wu SS, Li QC, Yin CQ, Xue W, Song CQ, (2020). Advances in CRISPR/Cas-based Gene Therapy in Human Genetic Diseases. Theranostics, 10(10), 4374-4382.
105. Editas Medicine, (2020). Allergan And Editas Medicine Announce Dosing Of First Patient In Landmark Phase 1/2 Clinical Trial Of CRISPR Medicine AGN-151587 (EDIT-101) For The Treatment Of LCA10. Editas Medicine, Allergan plc. Available at: https://ir.editasmedicine.com/news-releases/news-release-details/allergan-and-editas-medicine-announce-dosing-first-patient
106. Maeder M.L, Stefanidakis M, Wilson C.J, et al, (2019). Development of a gene-editing approach to restore vision loss in Leber congenital amaurosis type 10. Nat Med, 25, 229–233.
107. WHO, (2005). Xenotransplantation. World Health Organization. Available at: https://www.who.int/transplantation/xeno/en/, (Accessed: 20 May 2020).
108. Sherlock Biosciences, (2020). COVID-19 Test is First FDA-Authorized Use of CRISPR Technology. Available at: https://sherlock.bio/sherlock-biosciences-receives-fda-emergency-use-authorization-for-crispr-sars-cov-2-rapid-diagnostic/.
109. Subbaraman N, (2020). Coronavirus tests: researchers chase new diagnostics to fight the pandemic. Nature, published online ahead of print. Available at: https://www.nature.com/articles/d41586-020-00827-6 (Accessed: 20 May 2020).
110. Guglielmi G, (2020). First CRISPR test for the coronavirus approved in the United States. Nature, published online ahead of print. Available at: https://www.nature.com/articles/d41586-020-01402-9, (Accessed: 20 May 2020).
111. U.S. Department of Agriculture, (2018). Secretary Perdue Issues USDA Statement on Plant Breeding Innovation. Available at: https://www.usda.gov/media/press-releases/2018/03/28/secretary-perdue-issues-usda-statement-plant-breeding-innovation
112. Safari, F, Zare K, Negahdaripour M, et al, (2019). CRISPR Cpf1 proteins: structure, function and implications for genome editing. Cell Biosci 9, 36.
113. Kashaf Zafar, Muhammad Z. Khan, Imran Amin, Zahid Mukhtar, Sumera Yasmin, Muhammad Arif, Khansa Ejaz, Shahid Mansoor, (2020). Precise CRISPR-Cas9 mediated genome editing in Super Basmati rice for resistance against bacterial blight by targeting the major susceptibility gene. Front. Plant Sci, 11, 575.
114. Yanpeng Wang, Xi Cheng, et al, (2014). Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat Biotechnol 32, 947–951.
115. Shi J, Gao H, Wang H, et al, (2017). ARGOS8 variants generated by CRISPR-Cas9 improve maize grain yield under field drought stress conditions. Plant Biotechnol J, 15(2), 207-216.
116. Liu J, Gunapati S, Mihelich NT, Stec AO, Michno JM, Stupar RM, (2019). Genome Editing in Soybean with CRISPR/Cas9. Methods Mol Biol, 1917, 217-234.
117. Andersson M, Turesson H, Nicolia A, et al, (2017). Efficient targeted multiallelic mutagenesis in tetraploid potato (Solanum tuberosum) by transient CRISPR-Cas9 expression in protoplasts. Plant Cell Rep, 36, 117–128.
118. Sun L, Nasrullah, Ke F, Nie Z, Wang P, Xu J, (2019). Citrus Genetic Engineering for Disease Resistance: Past, Present and Future. Int J Mol Sci, 20(21), 5256.
119. DeeAnn Visk, (2017). CRISPR Applications in Plants. Available at: https://www.genengnews.com/insights/crispr-applications-in-plants/.
120. World Health Organisation, (2019). Vitamin A supplementation in children 6–59 months of age with severe acute malnutrition. Available at: https://www.who.int/elena/titles/vitamina_sam/en/
121. Endo, A., Saika, H., Takemura, M. et al, (2019). A novel approach to carotenoid accumulation in rice callus by mimicking the cauliflower Orange mutation via genome editing. Rice 12, 81.
122. Heidi Ledford, (2017). Geneticists enlist engineered virus and CRISPR to battle citrus disease. Springer Nature. 545 (7654), 277-278.
123. Zsögön A, Čermák T, Naves ER, et al, (2018). De novo domestication of wild tomato using genome editing. Nat Biotechnol, 36, 1211-1216.
124. Nekrasov V, Wang C, Win J, Lanz C, Weigel D, Kamoun S, (2017). Rapid generation of a transgene-free powdery mildew resistant tomato by genome deletion. Sci Rep, 7(1), 482.
125. Whitworth, K., Rowland, R., Ewen, C. et al, (2016). Gene-edited pigs are protected from porcine reproductive and respiratory syndrome virus. Nat Biotechnol 34, 20-22.

126. Wang X, Cai B, Zhou J, et al, (2016). Disruption of FGF5 in Cashmere Goats Using CRISPR/Cas9 Results in More Secondary Hair Follicles and Longer Fibers. PLoS One, 11(10).
127. Ajjawi I, Verruto J, Aqui M, et al, (2017). Lipid production in Nannochloropsis gaditana is doubled by decreasing expression of a single transcriptional regulator. Nat Biotechnol, 35, 647-652.
128. Katelyn Brandt, Rodolphe Barrangou, (2019). Applications of CRISPR Technologies Across the Food Supply Chain. Annual Review of Food Science and Technology, 10, 133-150.
129. Oishi I, Yoshii K, Miyahara D, Kagami H, Tagami T, (2016). Targeted mutagenesis in chicken using CRISPR/Cas9 system. Sci Rep, 6(23980).
130. García-Molina MD, Giménez MJ, Sánchez-León S, Barro F, (2019). Gluten Free Wheat: Are We There? Nutrients, 11(3), 487.
131. Callaway E, (2018). CRISPR plants now subject to tough GM laws in European Union. Nature, 560(7716), 16.
132. Nature, (2018). A CRISPR definition of genetic modification. Nature Plants, 4, 233.
133. U.S. Food and Drug Administration, (2020). Understanding New Plant Varieties. FDA. Available at: https://www.fda.gov/food/food-new-plant-varieties/understanding-new-plant-varieties
134. Champer J, Buchman A, Akbari O, (2016). Cheating evolution: engineering gene drives to manipulate the fate of wild populations. Nat Rev Genet, 17, 146-159.
135. Wedell N, Price TAR, Lindholm AK, (2019). Gene drive: progress and prospects. Proc Biol Sci, 286(1917).
136. U.S. Department of Agriculture. Invasive species, Asian carp. Available at: https://www.invasivespeciesinfo.gov/aquatic-invasives/fish-and-other-aquatic-vertebrates/asian-carp
137. Jiggins FM, (2017). The spread of Wolbachia through mosquito populations. PLoS Biol, 15(6).
138. Elie Dolgin, (2020). The kill-switch for CRISPR that could make gene-editing safer. Nature, 577, 308-310.
139. Bondy-Denomy J, Pawluk A, Maxwell KL, Davidson AR, (2013). Bacteriophage genes that inactivate the CRISPR/Cas bacterial immune system. Nature, 493, 429-432.
140. Landsberger M, Gandon S, Meaden S, et al, (2018). Anti-CRISPR Phages Cooperate to Overcome CRISPR-Cas Immunity. Cell, 174(4), 908-916.
141. Shin J, Jiang F, Liu JJ, et al, (2017). Disabling Cas9 by an anti-CRISPR DNA mimic. Sci Adv, 3(7).
142. Zhou H, Zhou M, Li D, et al, (2017). Whole genome analysis of CRISPR Cas9 sgRNA off-target homologies via an efficient computational algorithm. BMC Genomics, 18(9).
143. Wu X, Kriz AJ, Sharp PA, (2014). Target specificity of the CRISPR-Cas9 system. Quant Biol, 2, 59-70.
144. Young-Il Jo, Bharathi Suresh, Hyongbum Kim, Suresh Ramakrishna, (2015). CRISPR/Cas9 system as an innovative genetic engineering tool: Enhancements in sequence specificity and delivery methods. Elsevier, 1856(2), 234-243.
145. Scherer S, (2008). A short guide to the human genome. Cold Spring Harbor Laboratory Press. Available at: https://eu.idtdna.com/pages/support/faqs/how-often-are-the-pam-sequences-presented-in-the-mammalian-genome-in-average
146. Sickle Cell Gene Therapy with CRISPR. Synthego. Available at: https://www.synthego.com/crispr-sickle-cell-disease

147. Luzzatto L, (2012). Sickle cell anaemia and malaria. Mediterr J Hematol Infect Dis, 4(1).
148. Innovative Genomics Institute, (2015). International Summit on Human Gene Editing. Available at: https://innovativegenomics.org/events/international-summit-on-human-gene-editing/
149. Heidi Ledford, (2017). Broad Institute wins bitter battle over CRISPR patents. Springer Nature, 542(7642), 401.
150. David Cyranoski, (2020). What CRISPR-baby prison sentences mean for research. Nature, 577, 154-155.
151. Zhou M, Greenhill S, Huang S, et al, (2016). CCR5 is a suppressor for cortical plasticity and hippocampal learning and memory. Elife, 5.
152. Michael McCarthy, (2000). GM fish fail to hook Scottish salmon farmers. The Independent. Available at: https://www.independent.co.uk/environment/gm-fish-fail-to-hook-scottish-salmon-farmers-279443.html
153. Eric Lander, Françoise Baylis, Feng Zhang, Emmanuelle Charpentier, Paul Berg, et al, (2019). Adopt a moratorium on heritable genome editing. Nature, 567, 165-168.
154. Jennifer Doudna, et al, (2019). CRISPR's unwanted anniversary. Science, 366(6467), 777.
155. Malin E.V. Johansson, Gunnar C. Hansson, (2014). Is the Intestinal Goblet Cell a Major Immune Cell? Cell Host & Microbe, 15, 251-252.
