https://www.alphaknockout.com

Mouse Spg11 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Spg11 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Spg11 gene (NCBI Reference Sequence: NM_145531; Ensembl: ENSMUSG00000033396) is located on mouse chromosome 2. Forty exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 40 (Transcript: ENSMUST00000036450). Exons 2~3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Spg11 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-25C18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele develop a progressive spastic and ataxic gait disorder and show loss of cortical motoneurons and Purkinje cells, a reduced number of lysosomes available for fusion with autophagosomes in degenerating neurons, and accumulation of autolysosome-derived material.

Exon 2 starts at about 3.42% of the coding region. Knockout of exons 2~3 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 3331 bp, and the size of intron 3 for 3'-loxP site insertion is 1106 bp. The size of the effective cKO region is ~2275 bp. The cKO region does not contain any other known gene.
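The frameshift statement above follows a simple rule: removing coding sequence whose total length is not a multiple of 3 shifts the reading frame of everything downstream. The Python sketch below illustrates that check only. The function name is an assumption, and the exon lengths are hypothetical placeholders, because this report does not list the coding lengths of exons 2 and 3.

```python
def deletion_causes_frameshift(coding_lengths_bp):
    """Return True if deleting the given coding segments shifts the reading frame.

    coding_lengths_bp: coding length (bp) of each deleted exon.
    """
    total = sum(coding_lengths_bp)
    # A deletion keeps the frame only when its coding length is divisible by 3.
    return total % 3 != 0


# Hypothetical coding lengths for two deleted exons; NOT taken from this report.
example_lengths = [185, 143]
print(deletion_causes_frameshift(example_lengths))
```

With real exon coordinates from the transcript annotation, the same check could be run on the coding portions of the exons inside the cKO region.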


Overview of the Targeting Strategy

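[Figure: schematic of the targeting strategy showing four rows: the wildtype allele (5' to 3', exons 1, 2, 3, 4 ... 40, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Spg11; homology arm; cKO region; loxP site.]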


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
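The self-alignment described in the note can be sketched as a word-match dot plot: every window of 10 bp is compared with every other window, and off-diagonal matches indicate repeats. The Python below is a minimal illustration under stated assumptions, not the tool used for this report. It uses exact matching only, and the function names and the toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: window at i is identical to the window at j (i != j).
    reverse: window at i equals the reverse complement of the window at j.
    """
    # Index every window by its sequence so matches are looked up directly.
    index = defaultdict(list)
    last_start = len(seq) - window + 1
    for j in range(last_start):
        index[seq[j:j + window]].append(j)

    forward, reverse = [], []
    for i in range(last_start):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index.get(word, []) if j != i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence with one 10 bp direct repeat; purely illustrative.
toy = "ACGTTGCAAG" + "TTTT" + "ACGTTGCAAG"
fwd, rev = self_dot_plot(toy, window=10)
print(fwd, rev)
```

Runs of consecutive off-diagonal forward matches would appear as diagonal lines in the plot, which is the pattern a tandem repeat produces.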

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (8775 bp) | A (24.33%, 2135) | C (21.64%, 1899) | T (31.99%, 2807) | G (22.04%, 1934)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
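The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed in the same way. The Python below is a sketch under assumptions: the function names are hypothetical, and the sliding-window scheme (window of 300 bp, configurable step) is one plausible reading of "Window size: 300 bp", not a description of the tool that produced the plot.

```python
# Base counts copied from the summary line of this report.
BASE_COUNTS = {"A": 2135, "C": 1899, "T": 2807, "G": 1934}


def gc_percent_from_counts(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300, step=1):
    """GC percentage of each full window of `window` bp, advancing by `step` bp."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(sum(BASE_COUNTS.values()))                      # prints 8775
print(f"{gc_percent_from_counts(BASE_COUNTS):.2f}%")  # prints 43.68%
```

The overall value is an average, so it can be moderate even when individual 300 bp windows are GC-rich; the windowed profile is what identifies such local regions.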


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       122115040  122118039  3000
YourSeq  164    1425   1840  3000   88.0%     chr4   -       86803647   86804053   407
YourSeq  161    1419   1631  3000   91.2%     chr11  +       61625969   61626461   493
YourSeq  158    1421   1762  3000   88.0%     chr4   -       108280021  108321637  41617
YourSeq  157    1420   1615  3000   91.7%     chr5   -       24666453   24667021   569
YourSeq  157    1421   1610  3000   92.5%     chr4   -       83481424   83481620   197
YourSeq  157    1421   1606  3000   91.3%     chr13  -       58240782   58240965   184
YourSeq  156    1419   1908  3000   81.8%     chr9   -       14399376   14399755   380
YourSeq  156    1316   1606  3000   86.0%     chr14  -       20779599   20779882   284
YourSeq  155    1421   1619  3000   90.2%     chr10  -       81772904   81773466   563
YourSeq  155    1420   1908  3000   83.4%     chr1   -       36222022   36222376   355
YourSeq  154    1315   1602  3000   92.3%     chr16  -       90269941   90270338   398
YourSeq  153    1414   1601  3000   90.6%     chr7   +       45780348   45780533   186
YourSeq  153    1418   1602  3000   92.8%     chr4   +       139328791  139328975  185
YourSeq  152    1421   1606  3000   91.9%     chr12  -       3569697    3569886    190
YourSeq  152    1422   1611  3000   90.5%     chr10  +       69790130   69790322   193
YourSeq  151    1315   1605  3000   89.2%     chr11  -       87390450   87390827   378
YourSeq  150    1407   1602  3000   87.4%     chr4   -       109223062  109223251  190
YourSeq  150    1418   1606  3000   90.3%     chr11  -       61114892   61115080   189
YourSeq  149    1420   1602  3000   91.7%     chr2   -       84692981   84693161   181

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       122109765  122112764  3000
YourSeq  172    1      182   3000   97.3%     chr16  +       81531799   81531980   182
YourSeq  171    1      182   3000   97.3%     chr12  -       118742916  118743099  184
YourSeq  170    1      182   3000   96.8%     chr8   -       73073658   73073839   182
YourSeq  170    1      182   3000   96.8%     chr16  -       52512402   52512583   182
YourSeq  170    1      182   3000   96.8%     chr1   -       52861861   52862042   182
YourSeq  170    1      182   3000   96.8%     chr18  +       30378988   30379169   182
YourSeq  170    1      182   3000   96.8%     chr15  +       11092379   11092560   182
YourSeq  170    1      182   3000   96.8%     chr11  +       47071470   47071651   182
YourSeq  169    1      182   3000   96.7%     chrX   -       104554273  104554456  184
YourSeq  169    1      182   3000   96.7%     chrX   +       37987740   37987922   183
YourSeq  169    1      182   3000   96.7%     chr19  +       61267745   61267927   183
YourSeq  169    1      182   3000   96.8%     chr18  +       47587483   47587665   183
YourSeq  169    1      182   3000   96.8%     chr1   +       33950533   33950749   217
YourSeq  168    1      182   3000   96.2%     chr8   -       30944730   30944911   182
YourSeq  168    1      182   3000   96.2%     chr7   -       82893118   82893299   182
YourSeq  168    1      182   3000   96.2%     chr6   -       84831968   84832149   182
YourSeq  168    1      182   3000   96.2%     chr5   -       85164068   85164249   182
YourSeq  168    1      182   3000   96.2%     chr11  -       99910341   99910522   182
YourSeq  168    1      182   3000   95.1%     chr8   +       102547680  102547860  181

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
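To review hits like those in the BLAT tables, each row can be parsed into a record and the fraction of the 3000 bp query covered by each secondary hit computed. The Python below is a sketch assuming every row holds the eleven columns named in the table header, starting with the query name. The class and function names are hypothetical, and the report does not state the criterion used to call a hit "significant", so none is applied here.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self):
        """Fraction of the query sequence covered by this hit."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_blat_row(row):
    """Parse one whitespace-separated row with the eleven header columns."""
    f = row.split()
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


# First three rows of the upstream BLAT table in this report.
ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr2 - 122115040 122118039 3000",
    "YourSeq 164 1425 1840 3000 88.0% chr4 - 86803647 86804053 407",
    "YourSeq 161 1419 1631 3000 91.2% chr11 + 61625969 61626461 493",
]
hits = [parse_blat_row(r) for r in ROWS]
for hit in hits[1:]:  # skip the self-hit on chr2
    print(hit.chrom, hit.score, f"{hit.query_coverage:.1%}")
```

In both tables the top row is the query matching its own locus on chr2 at 100.0% identity; the remaining rows are short partial matches elsewhere in the genome.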


Gene and information:

Spg11: SPG11, spatacsin vesicle trafficking associated [ Mus musculus (house mouse) ]

Gene ID: 214585, updated on 12-Aug-2019

Gene summary

Official Symbol: Spg11 (provided by MGI)
Official Full Name: SPG11, spatacsin vesicle trafficking associated (provided by MGI)
Primary source: MGI:MGI:2444989
See related: Ensembl:ENSMUSG00000033396
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: A330015I11; 6030465E24Rik; C530005A01Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 7.0), limb E14.5 (RPKM 6.7) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 E5

Exon count: 42

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (122053526..122119434, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (121879262..121944122, complement)

[Figure: genomic context of Spg11 on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 3 transcripts

Gene: Spg11 ENSMUSG00000033396

Description: SPG11, spatacsin vesicle trafficking associated [Source:MGI Symbol;Acc:MGI:2444989]
Gene Synonyms: 6030465E24Rik, C530005A01Rik, spastic paraplegia 11
Location: Chromosome 2: 122,053,520-122,118,386 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 3 transcripts (splice variants), 201 orthologues, is a member of 1 Ensembl protein family and is associated with 18 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Spg11-201  ENSMUST00000036450.7  7671  2430aa      ENSMUSP00000037543.7  Protein coding  CCDS50687  Q3UHA3   TSL:1 GENCODE basic APPRIS P1
Spg11-202  ENSMUST00000133145.7  2555  No protein  -                     lncRNA          -          -        TSL:5
Spg11-203  ENSMUST00000135847.1  1813  No protein  -                     lncRNA          -          -        TSL:1

[Figure: Ensembl genome browser view of an 84.87 kb region of chromosome 2 (122.06 Mb to 122.12 Mb). Forward strand: Eif3j1-201 (protein coding). Contig: AL845457.16. Reverse strand: Spg11-201 (protein coding), Spg11-202 (lncRNA), Spg11-203 (lncRNA), Patl2-201 (protein coding). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (RNA gene).]


Transcript: ENSMUST00000036450

[Figure: protein view of Spg11-201 (protein coding; reverse strand, 64.87 kb) for ENSMUSP00000037... Tracks: low complexity (Seg); coiled-coils (Ncoils); Pfam (Spatacsin, C-terminal domain); PANTHER (Spatacsin); sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 2430.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
