https://www.alphaknockout.com

Mouse Spg11 Knockout Project (CRISPR/Cas9)

Objective: To create a Spg11 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Spg11 gene (NCBI Reference Sequence: NM_145531; Ensembl: ENSMUSG00000033396) is located on mouse chromosome 2. 40 exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 40 (Transcript: ENSMUST00000036450). Exons 2~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele develop a progressive spastic and ataxic gait disorder and show loss of cortical motoneurons and Purkinje cells, a reduced number of lysosomes available for fusion with autophagosomes in degenerating neurons, and accumulation of autolysosome-derived material.

Exon 2 starts from about 3.42% of the coding region. Exons 2~6 cover 16.28% of the coding region. The size of the effective KO region is ~6753 bp. The KO region does not contain any other known gene.
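As a rough illustration, the percentages above can be converted into approximate coding-sequence lengths. This is only a sketch: it assumes the coding region is (2430 codons + 1 stop codon) x 3 bp, based on the 2430 aa protein length listed for transcript Spg11-201 in this report, and the names `CDS_BP` and `pct_to_bp` are hypothetical helpers. Because the percentages are rounded, the results are estimates, not exact exon boundaries.

```python
# Assumption: coding region length = (protein length + stop codon) * 3 bp.
PROTEIN_AA = 2430                 # protein length reported for Spg11-201
CDS_BP = (PROTEIN_AA + 1) * 3     # 7293 bp under the assumption above


def pct_to_bp(pct, total_bp=CDS_BP):
    """Convert a percentage of the coding region into an approximate bp count."""
    return round(total_bp * pct / 100)


exon2_offset_bp = pct_to_bp(3.42)    # coding bp preceding exon 2 (approximate)
targeted_cds_bp = pct_to_bp(16.28)   # coding bp within exons 2~6 (approximate)

print(f"Assumed coding length: {CDS_BP} bp")
print(f"Exon 2 starts after about {exon2_offset_bp} coding bp")
print(f"Exons 2~6 cover about {targeted_cds_bp} coding bp")
```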


Overview of the Targeting Strategy

[Figure: schematic of the wildtype Spg11 allele, drawn 5' to 3', showing exons 1, 2, 3, 4, 5, 6 and 40 with two gRNA regions marked. Legend: exon of mouse Spg11; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
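The idea behind a self dot plot can be sketched in a few lines. This is a minimal illustration, not the tool used to produce the report: it only considers exact forward matches of a fixed window (15 bp, as in the plot), ignores reverse-complement matches, and the function names `self_dot_plot` and `repeat_offsets` are hypothetical. Off-diagonal hits that cluster at a small constant offset suggest a tandem repeat with that period.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j where seq[i:i+window] == seq[j:j+window]."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    hits = []
    for starts in positions.values():
        # Every pair of occurrences of the same window is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def repeat_offsets(hits):
    """Count dots per diagonal offset (j - i); tandem repeats show up as small offsets."""
    counts = defaultdict(int)
    for i, j in hits:
        counts[j - i] += 1
    return dict(counts)


# Toy input: an 8 bp unit repeated four times (a made-up sequence, not Spg11 data).
toy = "ACGTTGCA" * 4
print(repeat_offsets(self_dot_plot(toy, window=15)))  # {8: 10, 16: 2}
```

A flank with no repeated 15-mers returns an empty hit list, which corresponds to a dot plot showing only the main diagonal.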

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length (2000 bp) | A: 25.5% (510) | C: 20.65% (413) | T: 31.5% (630) | G: 22.35% (447)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length (2000 bp) | A: 21.75% (435) | C: 20.8% (416) | T: 33.0% (660) | G: 24.45% (489)

Note: The 2000 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
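The overall GC content of each flank is not stated directly, but it follows from the base counts in the two summaries. The sketch below recomputes the per-base percentages and the combined G+C fraction; `base_stats` is a hypothetical helper name, and the counts are copied from the summaries in this report. It gives whole-sequence figures only and does not reproduce the 300 bp sliding-window profile.

```python
def base_stats(counts):
    """Return (total bases, per-base percentages, overall GC percentage)."""
    total = sum(counts.values())
    pct = {base: round(100 * n / total, 2) for base, n in counts.items()}
    gc = round(100 * (counts["G"] + counts["C"]) / total, 2)
    return total, pct, gc


# Base counts copied from the two summary lines.
upstream = {"A": 510, "C": 413, "T": 630, "G": 447}    # 2000 bp upstream of Exon 2
downstream = {"A": 435, "C": 416, "T": 660, "G": 489}  # 2000 bp downstream of Exon 6

for name, counts in (("upstream", upstream), ("downstream", downstream)):
    total, pct, gc = base_stats(counts)
    print(f"{name}: {total} bp, GC = {gc}%")
# upstream: 2000 bp, GC = 43.0%
# downstream: 2000 bp, GC = 45.25%
```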


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr2   -       122114790  122116789  2000
YourSeq  164    175     590   2000   88.0%     chr4   -       86803647   86804053   407
YourSeq  161    169     381   2000   91.2%     chr11  +       61625969   61626461   493
YourSeq  158    171     512   2000   88.0%     chr4   -       108280021  108321637  41617
YourSeq  157    170     365   2000   91.7%     chr5   -       24666453   24667021   569
YourSeq  157    171     360   2000   92.5%     chr4   -       83481424   83481620   197
YourSeq  157    171     356   2000   91.3%     chr13  -       58240782   58240965   184
YourSeq  156    169     658   2000   81.8%     chr9   -       14399376   14399755   380
YourSeq  156    66      356   2000   86.0%     chr14  -       20779599   20779882   284
YourSeq  155    171     369   2000   90.2%     chr10  -       81772904   81773466   563
YourSeq  155    170     658   2000   83.4%     chr1   -       36222022   36222376   355
YourSeq  154    65      352   2000   92.3%     chr16  -       90269941   90270338   398
YourSeq  153    164     351   2000   90.6%     chr7   +       45780348   45780533   186
YourSeq  153    168     352   2000   92.8%     chr4   +       139328791  139328975  185
YourSeq  152    171     356   2000   91.9%     chr12  -       3569697    3569886    190
YourSeq  152    172     361   2000   90.5%     chr10  +       69790130   69790322   193
YourSeq  151    65      355   2000   89.2%     chr11  -       87390450   87390827   378
YourSeq  150    157     352   2000   87.4%     chr4   -       109223062  109223251  190
YourSeq  150    168     356   2000   90.3%     chr11  -       61114892   61115080   189
YourSeq  149    170     352   2000   91.7%     chr2   -       84692981   84693161   181

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr2   -       122106037  122108036  2000
YourSeq  177    447     811   2000   87.1%     chr10  -       128479408  128479758  351
YourSeq  167    447     794   2000   81.7%     chr11  -       68535271   69305874   770604
YourSeq  159    657     1024  2000   91.8%     chr15  +       62030702   62031208   507
YourSeq  155    545     811   2000   91.5%     chr10  -       40319106   40319403   298
YourSeq  145    447     799   2000   86.9%     chr11  +       107522871  107523121  251
YourSeq  143    374     813   2000   82.1%     chr9   -       64308985   64309217   233
YourSeq  137    650     811   2000   92.6%     chr15  +       53257295   53257461   167
YourSeq  137    658     821   2000   93.2%     chr10  +       88682178   88682348   171
YourSeq  136    650     813   2000   92.5%     chr14  +       64095472   64095641   170
YourSeq  135    656     812   2000   94.2%     chr1   +       58186896   58187063   168
YourSeq  134    656     809   2000   94.8%     chr11  -       58874277   58874741   465
YourSeq  134    656     813   2000   89.7%     chr5   +       124847867  124848020  154
YourSeq  134    649     805   2000   91.0%     chr16  +       3942464    3942618    155
YourSeq  133    656     811   2000   93.0%     chr2   -       90979235   90979401   167
YourSeq  133    649     806   2000   90.4%     chr19  -       40462072   40462227   156
YourSeq  133    656     809   2000   91.5%     chr11  +       84163621   84163772   152
YourSeq  132    300     795   2000   93.0%     chr1   +       130505802  130506330  529
YourSeq  131    656     811   2000   92.3%     chr11  -       104488678  104488836  159
YourSeq  131    656     811   2000   92.3%     chr10  -       20315105   20315385   281

Note: The 2000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
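As a consistency check, the top hit in each BLAT result (the 100.0% self-match on chr2) gives the genomic position of each 2000 bp flank, and the gap between the two flanks can be compared with the stated KO region size. The sketch below assumes that the flanks directly abut the deleted region and that the BLAT coordinates are 1-based and inclusive; both are assumptions, and `span` and `region_between` are hypothetical helper names.

```python
# Self-match coordinates copied from the first row of each BLAT result.
UP_FLANK = ("chr2", "-", 122114790, 122116789)    # 2000 bp upstream of Exon 2
DOWN_FLANK = ("chr2", "-", 122106037, 122108036)  # 2000 bp downstream of Exon 6


def span(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def region_between(flank_a, flank_b):
    """Return (start, end) of the gap between two non-overlapping flanks."""
    low, high = sorted([flank_a, flank_b], key=lambda f: f[2])
    if low[0] != high[0]:
        raise ValueError("flanks are on different chromosomes")
    return low[3] + 1, high[2] - 1


ko_start, ko_end = region_between(UP_FLANK, DOWN_FLANK)
print(f"chr2:{ko_start}-{ko_end} ({span(ko_start, ko_end)} bp)")
# chr2:122108037-122114789 (6753 bp)
```

Under these assumptions the gap is 6753 bp, which agrees with the effective KO region size given in the strategy summary. Because Spg11 lies on the reverse strand, the upstream flank has the higher genomic coordinates.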


Gene and information:

Spg11 SPG11, spatacsin vesicle trafficking associated [Mus musculus (house mouse)]
Gene ID: 214585, updated on 12-Aug-2019

Gene summary

Official Symbol: Spg11 (provided by MGI)
Official Full Name: SPG11, spatacsin vesicle trafficking associated (provided by MGI)
Primary source: MGI:MGI:2444989
See related: Ensembl:ENSMUSG00000033396
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: A330015I11; 6030465E24Rik; C530005A01Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 7.0), limb E14.5 (RPKM 6.7) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 E5
Exon count: 42

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 2 | NC_000068.7 (122053526..122119434, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 2 | NC_000068.6 (121879262..121944122, complement)

[Figure: genomic context on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 3 transcripts

Gene: Spg11 ENSMUSG00000033396

Description: SPG11, spatacsin vesicle trafficking associated [Source:MGI Symbol;Acc:MGI:2444989]
Gene Synonyms: 6030465E24Rik, C530005A01Rik, spastic paraplegia 11
Location: Chromosome 2: 122,053,520-122,118,386 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 3 transcripts (splice variants), 201 orthologues, is a member of 1 Ensembl protein family and is associated with 18 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Spg11-201 | ENSMUST00000036450.7 | 7671 | 2430aa | ENSMUSP00000037543.7 | Protein coding | CCDS50687 | Q3UHA3 | TSL:1 GENCODE basic APPRIS P1
Spg11-202 | ENSMUST00000133145.7 | 2555 | No protein | - | lncRNA | - | - | TSL:5
Spg11-203 | ENSMUST00000135847.1 | 1813 | No protein | - | lncRNA | - | - | TSL:1

[Figure: Ensembl genome browser view of an 84.87 kb region (122.06Mb to 122.12Mb). Forward strand: Eif3j1-201 (protein coding). Contig: AL845457.16. Reverse strand: Spg11-201 (protein coding), Spg11-202 (lncRNA), Spg11-203 (lncRNA), Patl2-201 (protein coding). Additional track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (RNA gene).]


Transcript: ENSMUST00000036450

[Figure: transcript view of Spg11-201 (protein coding; reverse strand, 64.87 kb) with protein ENSMUSP00000037... Tracks: Low complexity (Seg); Coiled-coils (Ncoils); Pfam (Spatacsin, C-terminal domain); PANTHER (Spatacsin); sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 2430.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
