Amplicon-sequencing of BCL11a-edited stem cells used to treat beta thalassemia and sickle cell disease suffices to demonstrate that CRISPR is a shredder

Sandeep Chakraborty

Abstract

The extent of ‘genome vandalism’ caused by CRISPR is easily observed by amplicon sequencing of the target. The algorithm selects reads that match the targeted gene (GENE) but also another gene (OFFGENE), meaning there has been a translocation/integration. This translocation/integration is possible if and only if OFFGENE has also been edited, that is, if it is an off-target. The larger the number of GENE-OFFGENE reads, the higher the confidence. For example, the stem-cell study, which edits BCL11a to increase fetal hemoglobin, has 111 BCL11a-RNA45SN5 reads [1]. Analysis of the edit location in RNA45SN5 shows 8 mismatches with the gRNA, which falls within the permissible limits [2]. There are 20 such genes that have integrated with the BCL11a gene with high confidence. This is a very conservative estimate for two reasons. First, amplicon sequencing only looks at GENE: if two OFFGENEs were integrating with each other, the data would not show that. Second, I have only looked at the gene space, which is only 2% of the genome. Since we find 20 genes integrating with BCL11a under such constraints, it is fair to assume that CRISPR is literally shredding the chromosome, a possible reason for significant fatalities in clinical trials [3].

Table 1: Studies analyzed, with the edited gene and the number of reads showing integration (NReadInt).

Year  Title                                                                            Gene-edited  NReadInt
2019  Highly efficient therapeutic gene editing of human hematopoietic stem cells [1]  BCL11a       853

Introduction

‘Genomic vandalism’ [4] by CRISPR-Cas9 is seriously underestimated, a possible reason for fatalities in clinical trials [3]. Non-specificity is a critical requirement in enzymes that have evolved as a defense against hypervariable viruses [5]. A seminal 2014 paper shows that “mismatches at dCas9 binding sites can be as high as 10”, “as many as 9 of the mismatches can be consecutive in the PAM-distal region” and “a perfect match of 10 bases in the PAM-proximal region of the sgRNA guiding sequence is sufficient to mediate Cas9 binding to DNA” [2]. In line with those expectations, off-targets with 7 mismatches were shown in a study using Oxford Nanopore sequencing [6]. With 7 mismatches allowed, there are 100k possible off-targets for any gRNA in the genome. In addition to off-target edits [7–9], plasmid integration [10–12], translocations [13], on-target mRNA misregulation [14] and p53 activation [15, 16] are also of concern. There are several techniques for the unbiased detection of CRISPR-Cas9 off-targets [17–19]. Here, I show that amplicon sequencing is probably the easiest way to find off-targets: translocations of an edited gene with different parts of the genome reveal off-targets, especially when such reads are present in large numbers. Using data from a study that knocks out BCL11a to increase fetal hemoglobin [1], a proposed gene therapy for beta thalassemia and sickle cell disease [20], I show significant translocation of different genes at the exact location where BCL11a is edited.
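As a rough plausibility check on the 100k figure, the expected number of candidate sites within a given mismatch budget can be estimated combinatorially. The sketch below is not taken from any of the cited studies. It assumes a 20-nt guide, substitutions only (no bulges), a uniformly random genome of 3.1e9 bp searched on both strands, and an NGG PAM (probability 1/16); all of these are simplifying assumptions, and the function names are hypothetical.

```python
from math import comb


def sequences_within(guide_len: int, max_mismatches: int) -> int:
    """Count DNA sequences of length guide_len that differ from a fixed guide
    at no more than max_mismatches positions (substitutions only)."""
    # Choose k mismatched positions; each can take one of 3 other bases.
    return sum(comb(guide_len, k) * 3**k for k in range(max_mismatches + 1))


def expected_sites(guide_len: int = 20,
                   max_mismatches: int = 7,
                   genome_bp: float = 3.1e9,
                   pam_prob: float = 1 / 16) -> float:
    """Expected number of matching sites in a random genome.

    Assumptions (not from the source): uniform base composition,
    both strands searched, and an NGG PAM with probability 1/16.
    """
    p_match = sequences_within(guide_len, max_mismatches) / 4**guide_len
    positions = 2 * genome_bp  # both strands
    return positions * p_match * pam_prob


if __name__ == "__main__":
    print(f"expected sites with <=7 mismatches: {expected_sites():.0f}")
```

Under these assumptions the estimate comes out in the range of 10^4 to 10^5 sites. Real genomes are not random, so this is only an order-of-magnitude guide.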

Materials and methods

The 21 gRNAs used (SIstemcellBCL11a:GRNA) map to Accid:NG_011968.1 (BAF chromatin remodeling complex subunit BCL11A on chromosome 2) around positions 63100–63400. Reads from the 42 CRISPR-edited samples (Accid:PRJNA517275, SIstemcellBCL11a:list.sra) were mapped to this slice of BCL11A (SIstemcellBCL11a:SLICEBCL11a.fa) using BLAST (cutoff=E-10). The matching reads were then mapped to the human genes, with the BCL11A gene removed (SIgenes.minusBCL11a.fa), so that translocations can be detected. Only reads with >98% identity and a BLAST score >100 were retained, to have high confidence of integration. This gave 853 reads in all (SIstemcellBCL11a:ALLintegrations.fa).
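The filtering step can be sketched as follows. This is a minimal illustration, not the actual pipeline used here. It assumes BLAST was run with tabular output (`-outfmt 6`), whose default columns place the query ID in column 1, the subject ID in column 2, percent identity in column 3 and bit score in column 12, and it assumes that the "BLAST score" above refers to the bit score. The function name and the example records are hypothetical.

```python
from collections import Counter


def count_integrations(blast_lines, min_identity=98.0, min_score=100.0):
    """Count reads per non-target gene among high-confidence BLAST hits.

    blast_lines: iterable of tab-separated lines in BLAST -outfmt 6 layout
    (qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore). Each (read, gene) pair is counted once, even if BLAST
    reports several alignments for it.
    """
    counts = Counter()
    seen = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        read_id, gene_id = fields[0], fields[1]
        identity, bitscore = float(fields[2]), float(fields[11])
        if identity > min_identity and bitscore > min_score:
            if (read_id, gene_id) not in seen:
                seen.add((read_id, gene_id))
                counts[gene_id] += 1
    return counts


# Hypothetical example records (not real data from the study).
EXAMPLE_HITS = [
    "read1\tNR_046235.3\t99.2\t120\t1\t0\t1\t120\t13000\t13119\t1e-50\t210",
    "read1\tNR_046235.3\t99.0\t110\t1\t0\t5\t114\t13004\t13113\t1e-45\t190",  # second alignment, same pair
    "read2\tNM_032134.2\t97.5\t120\t3\t0\t1\t120\t500\t619\t1e-40\t180",      # fails identity
    "read3\tNM_032134.2\t100.0\t50\t0\t0\t1\t50\t500\t549\t1e-20\t95",        # fails score
    "read4\tNM_032134.2\t99.5\t110\t1\t0\t1\t110\t500\t609\t1e-48\t195",
]

if __name__ == "__main__":
    for gene, n in count_integrations(EXAMPLE_HITS).most_common():
        print(n, gene)
```

Counting each read-gene pair once keeps a read with several reported alignments against the same gene from inflating the per-gene totals.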

Results

This list will be continually updated for multiple studies. For each study, I will be providing the gRNAs, targeted gene and integrated reads.

1 Stem cell edited for BCL11a [1]

In this study, BCL11a was knocked out using various gRNAs (SIstemcellBCL11a:GRNA) [1]. Table 2 shows genes that have integrated with BCL11a, which is possible if and only if the other gene has also been edited. These sequences are available at SIstemcellBCL11a:ALLintegrations.fa. A specific off-target at coordinate 13040, with 8 mismatches and one bulge [21] (a T removed), is shown in Fig 1. These mismatches are within permissible limits [2]: fewer than 10 mismatches, very strict on the PAM-proximal side (only 1 MM there), and with the cut 2 bp after the PAM.
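Once a guide and a candidate site are aligned, the mismatch criteria can be checked mechanically. The sketch below is illustrative only. The two sequences are made-up placeholders, not the actual gRNA or the off-target site of Fig 1. A bulge is written as '-' in the alignment, and the 10-column PAM-proximal window follows the description quoted from [2]. The code also assumes the PAM lies 3' of the protospacer, so the PAM-proximal bases are the last columns of the alignment.

```python
def mismatch_profile(guide: str, site: str, proximal_len: int = 10) -> dict:
    """Count mismatches and bulges between two pre-aligned sequences.

    guide, site: equal-length aligned strings; '-' marks a bulge (gap).
    proximal_len: number of alignment columns at the 3' end treated as
    PAM-proximal (assumption: PAM is 3' of the protospacer).
    """
    if len(guide) != len(site):
        raise ValueError("sequences must be aligned to the same length")
    split = len(guide) - proximal_len  # first PAM-proximal column
    distal = proximal = bulges = 0
    for i, (g, s) in enumerate(zip(guide.upper(), site.upper())):
        if g == "-" or s == "-":
            bulges += 1
        elif g != s:
            if i < split:
                distal += 1
            else:
                proximal += 1
    return {
        "distal": distal,
        "proximal": proximal,
        "bulges": bulges,
        "total": distal + proximal,
    }


# Hypothetical placeholder sequences, for illustration only.
GUIDE = "ACGTACGTACGTACGTACGT"
SITE = "TCGAAC-TAGGTACGTACCT"

if __name__ == "__main__":
    profile = mismatch_profile(GUIDE, SITE)
    print(profile)
    # Criterion taken from the text: fewer than 10 mismatches in total.
    print("fewer than 10 mismatches:", profile["total"] < 10)
```

Reporting distal and proximal counts separately makes it easy to see whether a site is permissive in the PAM-distal region while remaining strict near the PAM.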

Table 2: Genes that integrate with BCL11a after the CRISPR edit. Such integrations can happen if and only if the other gene has also been edited. These sequences are available at SIstemcellBCL11a:ALLintegrations.fa.

Nreads  Accid           Description
225     NM_032134.2     glutamine rich 2 (QRICH2)
165     NM_001286462.1  chromosome 21 open reading frame 58 (C21orf58)
111     NR_046235.3     RNA, 45S pre-ribosomal N5 (RNA45SN5), ribosomal RNA
98      NM_194449.4     PH domain and leucine rich repeat protein phosphatase 1 (PHLPP1), mRNA
58      XM_011544683.1  PREDICTED: fibroblast growth factor 17 (FGF17)
45      NM_015124.5     GRAM domain containing 4 (GRAMD4)
42      XM_024452143.1  PREDICTED: basic proline-rich protein-like (LOC112268284), mRNA
32      NM_006887.5     ZFP36 ring finger protein like 2 (ZFP36L2), mRNA
15      NM_152240.2     zinc finger matrin-type 3 (ZMAT3)
12      NM_001170544.1  mitochondrial serine/threonine protein phosphatase (PGAM5)
9       NM_003661.3     apolipoprotein L1 (APOL1)
8       NM_172177.4     mitochondrial ribosomal protein L42 (MRPL42)
6       NM_001079880.1  protein kinase D2 (PRKD2)
5       NM_033540.3     mitofusin 1 (MFN1), mRNA
5       NM_144597.3     chromosome 15 open reading frame 40 (C15orf40)
5       NM_001164281.2  TRAF3 interacting protein 2 (TRAF3IP2)
4       NM_005154.5     ubiquitin specific peptidase 8 (USP8)
2       NM_001142643.2  CASK interacting protein 2 (CASKIN2)
2       NM_001193609.1  tubulin delta 1 (TUBD1)
2       NM_001322415.1  PNN interacting serine and arginine rich protein (PNISR)

Discussion and Conclusion

Gene-editing based therapies provide revolutionary hope for curing many diseases. However, ensuring the safety of any such endeavor must be paramount to avoid doing more harm [22,23]. Recent CRISPR trials have had a very high rate of fatalities [3]. One possible reason for this is off-targets [2,6–8], which are highly underestimated. Here, using data from a stem-cell study that edits BCL11a to increase fetal hemoglobin [1], a possible therapy for beta thalassemia and sickle cell disease, I show extensive translocations/integrations, which indirectly provides a method for off-target detection. The extent of ‘genomic vandalism’ [4] triggers a p53 (the Guardian of the Genome) response [15, 16], which selects for p53-deficient cells and further exposes patients to oncogenic risks.

Competing interests

No competing interests were disclosed.

References

1. Wu Y, Zeng J, Roscoe BP, Liu P, Yao Q, et al. (2019) Highly efficient therapeutic gene editing of human hematopoietic stem cells. Nature medicine 25: 776.

2. Kuscu C, Arslan S, Singh R, Thorpe J, Adli M (2014) Genome-wide analysis reveals characteristics of off-target sites bound by the Cas9 endonuclease. Nature biotechnology 32: 677.

3. Jing Z, Zhang N, Ding L, Wang X, Hua Y, et al. (2018) Safety and activity of programmed cell death-1 gene knockout engineered T cells in patients with previously treated advanced esophageal squamous cell carcinoma: An open-label, single-arm phase I study.

4. Wilson JM (2018) University flunk-out to genomics pioneer: An interview with George Church, PhD. Human Gene Therapy Clinical Development 29: 118–120.

5. Chakraborty S (2018) Inconclusive studies on possible CRISPR-Cas off-targets should moderate expectations about enzymes that have evolved to be non-specific. Journal of Biosciences 43: 225–228.

6. Chakraborty S (2019) CRISPR-Cas off-target detection using Oxford Nanopore sequencing-is the mitochondrial genome more vulnerable to off-targets? bioRxiv: 741322.

7. Fu Y, Foden JA, Khayter C, Maeder ML, Reyon D, et al. (2013) High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nature biotechnology 31: 822–826.

8. Pattanayak V, Lin S, Guilinger JP, Ma E, Doudna JA, et al. (2013) High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nature biotechnology 31: 839.

9. Chakraborty S (2019) Unreported off-target integration of beta-lactamase from plasmid in gene-edited hornless cows.

10. Chakraborty S (2019) Sequencing data from Massachusetts General Hospital shows Cas9 integration into the genome, highlighting a serious hazard in gene-editing therapeutics [v1; ref status: not peer reviewed, https://doi.org/10.12688/f1000research.20744.1]. F1000Research 8: 1846.

11. Chakraborty S (2019) Long-term evaluation of AAV-CRISPR genome editing for Duchenne muscular dystrophy shows its not safe due to AAV - and more worryingly Cas9 - integration.

12. Chakraborty S (2019) Prime-editors (nickases) and dCas9 have the same problem as conventional CRISPR-Cas9 of plasmid/Cas9 integration after making a double stranded break. URL osf.io/jf6pe.

13. Kosicki M, Tomberg K, Bradley A (2018) Repair of double-strand breaks induced by CRISPR–Cas9 leads to large deletions and complex rearrangements. Nature biotechnology 36: 765.

14. Tuladhar R, Yeu Y, Piazza JT, Tan Z, Clemenceau JR, et al. (2019) CRISPR/Cas9-based mutagenesis frequently provokes on-target mRNA misregulation. bioRxiv: 583138.

15. Ihry RJ, Worringer KA, Salick MR, Frias E, Ho D, et al. (2018) p53 inhibits CRISPR–Cas9 engineering in human pluripotent stem cells. Nature medicine 24: 939.

16. Sinha S, Guerra KB, Cheng K, Leiserson MD, Wilson DM, et al. (2019) Integrated computational and experimental identification of p53, KRAS and VHL mutant selection associated with CRISPR-Cas9 editing. bioRxiv: 407767.

17. Wienert B, Wyman SK, Richardson CD, Yeh CD, Akcakaya P, et al. (2019) Unbiased detection of CRISPR off-targets in vivo using DISCOVER-Seq. Science 364: 286–289.

18. Tsai SQ, Nguyen NT, Malagon-Lopez J, Topkar VV, Aryee MJ, et al. (2017) CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR-Cas9 nuclease off-targets. Nature Methods 14: 607.

19. Tsai SQ, Zheng Z, Nguyen NT, Liebers M, Topkar VV, et al. (2015) GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nature biotechnology 33: 187.

20. Lettre G, Sankaran VG, Bezerra MAC, Araújo AS, Uda M, et al. (2008) DNA polymorphisms at the BCL11A, HBS1L-MYB, and β-globin loci associate with fetal hemoglobin levels and pain crises in sickle cell disease. Proceedings of the National Academy of Sciences 105: 11869–11874.

21. Lin Y, Cradick TJ, Brown MT, Deshmukh H, Ranjan P, et al. (2014) CRISPR/Cas9 systems have off-target activity with insertions or deletions between target DNA and guide RNA sequences. Nucleic acids research 42: 7473–7485.

22. Couzin J, Kaiser J (2005) As Gelsinger case ends, gene therapy suffers another blow.

23. Saâda-Bouzid E, Defaucheux C, Karabajakian A, Coloma VP, Servois V, et al. (2017) Hyperprogression during anti-PD-1/PD-L1 therapy in patients with recurrent and/or metastatic head and neck squamous cell carcinoma. Annals of Oncology 28: 1605–1611.

Figure 1: A specific off-target at location 13040 with 8 mismatches and one bulge (a T removed). Note that this matches the constraints in the seminal 2014 paper [2]: fewer than 10 mismatches, and very strict on the PAM-proximal side (only 1 MM there). The cut happens just 2 bp after the PAM.