The CRISPR-Cas System
Linköping University - Department of Physics, Chemistry and Biology
Bachelor thesis, 16 hp - Educational program: Physics, Chemistry and Biology
Spring term 2020 - LITH - IFM - G - EX -- 20/3902 -- SE

The CRISPR-Cas system
Can CRISPR bring about a brighter future?

An extensive and up-to-date CRISPR overview, focusing on the CRISPR-Cas9 technology, including a suggestion for a CRISPR-Cas9 laboratory assignment for university students.

Sara Berggren, Isabella Enoksson and Cassandra Stens
Examiner: Assoc. Prof. Lars-Göran Mårtensson

Abstract

Derived from and inspired by the adaptive immune system of bacteria, CRISPR has gone from basic biology knowledge to a revolutionary biotechnological tool, applicable in many research areas such as medicine, industry and agriculture. The full mechanism of CRISPR-Cas9 was first published in 2012, and various CRISPR-Cas systems have already passed the first stages of clinical trials as new gene therapies. Intensive research has resulted in continuously growing knowledge of CRISPR systems, and the technique seems to have the potential to greatly impact all life on our planet. Therefore, this literature study aims to thoroughly describe the CRISPR-Cas system, and further to suggest an undergraduate laboratory exercise involving gene editing with the CRISPR-Cas9 tool. In this paper, we describe the fundamental technical background of the CRISPR-Cas system, with particular emphasis on the most studied system, CRISPR-Cas9, its development and application areas, and we highlight its current limitations and ethical concerns. The history of genetic engineering and the discovery of the CRISPR system are also described, along with a comparison with other established gene editing techniques. This study concludes that deeper knowledge about CRISPR is important and required, since the technique is applicable in many research areas.
A laboratory exercise will not only inspire undergraduate students but also provide them with extended theoretical and practical knowledge.

Acknowledgements

We are forever grateful for all of the guidance our examiner, Assoc. Prof. Lars-Göran Mårtensson at Linköping University, has given us during these ten weeks of intense CRISPR-Cas exploration. You have shared your valuable knowledge and experience from years of research and publications, giving us a great starting point and great potential for the paper. Thank you for your critical reading, thoughtful recommendations and many laughs - you have made our bachelor thesis fun and memorable! Further appreciation and thanks go to the BioRender website, which enabled quick and artistic illustrations for our report. While writing this paper, we have tried to cite original articles as much as possible, but we apologize in advance if we have unintentionally cited articles and researchers other than the originals.

Terminology, abbreviations and acronyms

Abbreviations and acronyms

AdV          adenovirus
AAV          adeno-associated virus
bp           base pairs
Cas          CRISPR-associated protein
CRISPR       clustered regularly interspaced short palindromic repeats
CRISPRa      CRISPR activation
CRISPRi      CRISPR interference
crRNA        CRISPR RNA
cr:tracrRNA  a crRNA-tracrRNA hybrid
dCas9        dead Cas9 (at times called deactivated Cas9)
dsDNA        double-stranded DNA
DSB          double-strand break
HDR          homology directed repair
Indels       insertions or deletions
NHEJ         nonhomologous end joining
nt           nucleotides
PAMs         protospacer adjacent motifs
pre-crRNA    precursor CRISPR RNA
repeats      direct repeats
RNP          ribonucleoprotein
ssDNA        single-stranded DNA
SpCas9       Streptococcus pyogenes CRISPR-associated (Cas) protein 9
tracrRNA     trans-activating crRNA
WT           wild type

Terminology

Anti-CRISPR proteins: inhibitors of the CRISPR-Cas system which can be applied in both mammalian and prokaryotic cells for more precise and controllable editing.

Cas9: the most commonly used type of Cas protein.
Discovered in Streptococcus pyogenes.

Cas9 complex: the Cas9 protein attached to gRNA (cr:tracrRNA or sgRNA).

Cas protein, CRISPR-associated protein: a non-specific endonuclease, i.e. an enzyme capable of cleaving DNA. Requires a guide RNA to find where in the genome to make the cut. There are multiple types of Cas proteins, isolated from various bacteria and archaea.

CRISPR array: a series of palindromic, repeated sequences called direct repeats, interspaced by different target-specific sequences called spacers.

CRISPR locus: a clustered set of tracrRNA genes, Cas genes and a CRISPR array at fixed regions in chromosomes. A single genome can incorporate multiple CRISPR loci.

CRISPR-Cas9 system: relies on two components, a Cas9 protein and a guide RNA. The guide RNA can take the form of a two-component crRNA and tracrRNA hybrid (cr:tracrRNA), or of a one-component sgRNA (crRNA and tracrRNA linked via a linker loop).

CRISPR RNA, crRNA: a nucleotide sequence consisting of a spacer and a repeat sequence. The spacer is the customizable part of the gRNA and can be programmed to recognize particular target DNA sequences. crRNAs are transcribed from the CRISPR array.

cr:tracrRNA: a gRNA configuration where crRNA and tracrRNA are bound to each other. Pronunciation: crispr tracr RNA.

dCas9: a mutated (dead or deactivated) Cas9 protein without cleavage activity.

Direct repeats: identical, repetitive and palindromic sequences interspaced between spacer fragments. Called repeats for short. The direct repeats are complementary to sequences in tracrRNAs.

Electroporation: a microbiology technique in which an electric field is applied to a sample in order to increase the permeability of cell membranes, thereby allowing drugs, chemicals and genetic material to be introduced into target cells.

Endonucleases: enzymes that catalyse the cleavage of DNA by cutting the phosphodiester bonds within nucleic acid strands.
Guide RNA, gRNA: can have two configurations: a two-component cr:tracrRNA, or a one-component sgRNA. A gRNA in the form of a cr:tracrRNA consists of a crRNA and a tracrRNA bound to each other. A gRNA in the form of an sgRNA consists of crRNA and tracrRNA linked together by a sequence called the linker loop. Regardless of configuration, gRNAs are capable of binding to Cas proteins and are used to recognize DNA regions of interest, thereby guiding the Cas protein to a certain target site.

HNH: an endonuclease domain in the Cas9 protein, responsible for cutting the target strand.

Leader sequence: a sequence upstream of the CRISPR array that includes the promoter. The leader sequence has an important role in the process of spacer acquisition.

Linker loop: a sequence connecting the crRNA and the tracrRNA.

Motif: a sequence motif is a short, recurring sequence pattern in DNA presumed to have a biochemical function, for example to indicate binding sites for proteins such as nucleases (e.g. Cas proteins). A sequence motif is distinguished from a structural motif; the latter describes a specific part of a protein's 3D structure.

Multiplex: a property of CRISPR-Cas9 systems, where several genome sites in a cell are targeted and cut simultaneously with the aid of multiple gRNAs.

Palindromic repeats: sequences that read the same forwards and backwards on two complementary strands.

PAM: a short, recurring sequence (motif) required for target site recognition by Cas proteins. Different Cas proteins are specific for different PAMs; for instance, the Cas9 PAM is NGG. PAMs are located immediately adjacent to the target protospacer in various organisms' DNA.

pre-crRNA: a single spacer sequence flanked by repeat fragments. The CRISPR array is transcribed into a long precursor CRISPR RNA (pre-crRNA), prior to being cleaved into mature crRNAs. In the case of CRISPR Type II, the assistance of tracrRNA, Cas9 and RNase III is required to process pre-crRNA into mature crRNA.
Protospacer: a short DNA sequence derived and collected from foreign genomes. Protospacers are located directly adjacent to PAMs in foreign genomes. When the protospacer is inserted into the CRISPR array in the bacterial genome, it is termed a spacer.

RNP complex: the Cas9 protein in complex with gRNA (cr:tracrRNA or sgRNA).

RuvC: a nuclease domain, common in multiple types of Cas proteins, capable of cutting the non-target strand.

Seed sequence: an 8-12 bp sequence of the guide RNA that must be strictly complementary to the target DNA.

Single guide RNA, sgRNA: a single RNA sequence composed of a crRNA linked to a tracrRNA via a linker loop.

Spacers: ~20 bp sequences of foreign DNA that are inserted into, and found within, the CRISPR array. Spacers are small, unique fragments of DNA gathered from viruses that have previously tried to infect the bacterial cell. The incorporated spacers are used to express short guide RNAs, called crRNAs. By comparing the spacer with the complementary foreign DNA (the protospacers), CRISPR-Cas systems can identify and eliminate foreign DNA.

Spacer acquisition: the process of inserting new foreign DNA fragments into the CRISPR array.

tracrRNA, trans-activating CRISPR RNA: serves as a binding scaffold for the Cas protein and the crRNA. The tracrRNA is fixed and not programmable (in contrast to crRNAs).

Contents

1. Introduction
  1.1 Aim
  1.2 Background
2. History
  2.1 Genetic engineering in historical perspective
3. Theory
  3.1 The CRISPR-Cas system
    3.1.1 Basic technical background of CRISPR-Cas9
    3.1.2 The protein structure of CRISPR-Cas9
    3.1.3 The DNA repair mechanisms of eukaryotic cells
    3.1.4 Cas classes and types
    3.1.5 Transcription of the CRISPR locus
    3.1.6 Spacer acquisition into the CRISPR array
    3.1.7 Introduction of CRISPR-Cas9 into cells
      3.1.7.1 DNA plasmid delivery
      3.1.7.2 Viral vector delivery
      3.1.7.3 RNA delivery
      3.1.7.4 Protein delivery (RNP delivery)
      3.1.7.5 Physical and chemical delivery methods (non-viral methods)
      3.1.7.6 Cytosolic vs nucleic insertion
      3.1.7.7 In vivo and ex vivo delivery
    3.1.8 Modifications of the CRISPR-Cas9 system
      3.1.8.1 dCas9, CRISPRa and CRISPRi
      3.1.8.2 Nickases
      3.1.8.3 Base editing and prime editing
4.