https://www.alphaknockout.com

Mouse Snn Knockout Project (CRISPR/Cas9)

Objective: To create a Snn knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Snn gene (NCBI Reference Sequence: NM_009223; Ensembl: ENSMUSG00000037972) is located on mouse chromosome 16. Two exons are identified, with both the ATG start codon and the TGA stop codon in exon 2 (Transcript: ENSMUST00000089011). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 0.38% of the coding region. Exon 2 covers 100.0% of the coding region. The size of the effective KO region: ~264 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 and 2 and two gRNA regions. Legend: exon of mouse Snn; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
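The self-alignment idea behind a dot plot can be sketched in a few lines. This is a minimal illustration, not the tool used to produce the plots in this report: it only compares forward words (no reverse complement), requires exact matches over the 15 bp window, and the function names, the `max_offset` cutoff and the toy sequences are all assumptions made for the example.

```python
from collections import defaultdict


def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs with i < j whose window-length words are identical."""
    starts_by_word = defaultdict(list)
    for i in range(len(seq) - window + 1):
        starts_by_word[seq[i:i + window]].append(i)
    matches = []
    for starts in starts_by_word.values():
        # Every pair of positions sharing the same word is one off-diagonal dot.
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                matches.append((i, j))
    return sorted(matches)


def has_nearby_repeat(seq, window=15, max_offset=100):
    """True if two identical words lie within max_offset bp (a hypothetical cutoff)."""
    return any(j - i <= max_offset for i, j in dot_plot_matches(seq, window))


repeated = "ACGTACCTGA" * 4                # toy sequence built from a 10 bp tandem unit
unique = "ACGTTGCAAGGCTTAACCGGATCGATTACG"  # toy sequence with no repeated 15-mer
print(has_nearby_repeat(repeated), has_nearby_repeat(unique))
```

Dots close to the main diagonal correspond to tandem repeats; an empty match list corresponds to a clean plot of the kind described in the note above.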

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(2000bp) | A(21.8% 436) | C(25.4% 508) | T(26.45% 529) | G(26.35% 527)

Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(2000bp) | A(22.5% 450) | C(24.4% 488) | T(26.3% 526) | G(26.8% 536)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
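As a quick check on the two summaries, the overall GC fraction can be recomputed from the reported base counts, and a windowed profile can be sketched in the same way. The base counts below are copied from the summaries above; the function names, the step size and the toy sequence are assumptions for illustration, and this is not the program that generated the plots.

```python
def gc_percent(counts):
    """Overall GC percentage from a dict of base counts."""
    return 100.0 * (counts["G"] + counts["C"]) / sum(counts.values())


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each full window along seq (window size as in the report)."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts as reported in the two summary lines (2000 bp each).
upstream = {"A": 436, "C": 508, "T": 529, "G": 527}
downstream = {"A": 450, "C": 488, "T": 526, "G": 536}

print(sum(upstream.values()), round(gc_percent(upstream), 2))
print(sum(downstream.values()), round(gc_percent(downstream), 2))

# Toy sequence (hypothetical) to show the windowed profile with a small window.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

Both regions come out close to 50% GC overall, which is consistent with the notes that no significant high GC-content region is found.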


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr16  +       11070320   11072319   2000
YourSeq  194    336    547   2000   96.2%     chr6   -       29956166   30165120   208955
YourSeq  191    334    545   2000   95.8%     chr8   +       27146551   27147043   493
YourSeq  189    342    545   2000   96.6%     chr8   -       94197537   94197938   402
YourSeq  188    341    544   2000   97.0%     chr15  -       100050089  100050314  226
YourSeq  186    345    545   2000   96.5%     chr12  -       75616644   75616847   204
YourSeq  184    349    545   2000   97.0%     chr9   -       44561262   44561459   198
YourSeq  184    341    559   2000   92.0%     chr3   -       54622302   54622518   217
YourSeq  184    351    544   2000   96.4%     chr12  -       76777501   76777693   193
YourSeq  182    351    545   2000   97.0%     chr3   -       145973794  145973992  199
YourSeq  182    349    546   2000   96.0%     chr2   -       165979894  165980091  198
YourSeq  182    343    544   2000   95.5%     chr2   -       32466863   32467090   228
YourSeq  182    356    545   2000   96.9%     chrX   +       159358048  159358236  189
YourSeq  182    356    562   2000   94.0%     chr2   +       34472010   34472213   204
YourSeq  181    351    545   2000   96.5%     chr7   -       116327736  116327930  195
YourSeq  181    351    545   2000   96.5%     chr9   +       120622000  120622194  195
YourSeq  181    341    545   2000   92.5%     chr7   +       116354257  116354454  198
YourSeq  181    353    544   2000   97.4%     chr2   +       157281925  157282120  196
YourSeq  181    340    544   2000   95.1%     chr11  +       107064335  107064542  208
YourSeq  181    351    545   2000   96.5%     chr1   +       170929613  170929807  195

Note: The 2000 bp section upstream of the start codon is searched against the genome with BLAT. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr16  +       11072584   11074583   2000
YourSeq  41     942    1090  2000   95.7%     chr2   +       142613796  142614235  440
YourSeq  28     940    971   2000   87.1%     chr14  -       109108811  109108841  31
YourSeq  27     1686   1718  2000   93.4%     chr11  +       56634045   56634077   33
YourSeq  25     947    974   2000   96.3%     chr1   -       136697578  136697608  31
YourSeq  24     939    962   2000   100.0%    chr18  +       12143968   12143991   24
YourSeq  24     940    963   2000   100.0%    chr16  +       36065349   36065372   24
YourSeq  23     941    963   2000   100.0%    chrX   +       117313770  117313792  23
YourSeq  22     941    962   2000   100.0%    chr4   +       68684000   68684021   22
YourSeq  22     942    963   2000   100.0%    chr18  +       46621421   46621442   22
YourSeq  21     939    959   2000   100.0%    chr6   +       113167335  113167355  21
YourSeq  21     939    959   2000   100.0%    chr11  +       114651500  114651520  21
YourSeq  21     943    964   2000   100.0%    chr11  +       104745169  104745191  23
YourSeq  20     941    960   2000   100.0%    chr7   -       4168203    4168222    20
YourSeq  20     941    960   2000   100.0%    chr3   -       84138933   84138952   20
YourSeq  20     937    956   2000   100.0%    chr1   -       73167667   73167686   20
YourSeq  20     941    960   2000   100.0%    chr1   -       37353193   37353212   20
YourSeq  20     941    960   2000   100.0%    chr5   +       81365680   81365699   20
YourSeq  20     940    959   2000   100.0%    chr16  +       35389261   35389280   20

Note: The 2000 bp section downstream of the stop codon is searched against the genome with BLAT. No significant similarity is found.
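The top hit in each BLAT table is the query matching its own locus on chr16, so the two flanking sections can be placed on the genome and the interval between them measured. The sketch below assumes the reported coordinates are 1-based and inclusive (each 2000 bp hit satisfies end - start + 1 = 2000); the `Interval` type and helper names are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumed from SPAN = end - start + 1)
    end: int


def length(iv):
    """Number of bases covered by an inclusive interval."""
    return iv.end - iv.start + 1


def gap_between(left, right):
    """Interval strictly between two non-overlapping intervals on one chromosome."""
    if left.chrom != right.chrom or right.start <= left.end:
        raise ValueError("intervals must be ordered and on the same chromosome")
    return Interval(left.chrom, left.end + 1, right.start - 1)


# Self-matches taken from the first row of each BLAT table.
upstream_flank = Interval("chr16", 11070320, 11072319)
downstream_flank = Interval("chr16", 11072584, 11074583)

between = gap_between(upstream_flank, downstream_flank)
print(between, length(between))
```

Under these assumptions the interval between the flanks is chr16:11072320-11072583, 264 bp long, which agrees with the ~264 bp effective KO region given in the strategy summary.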


Gene and information: Snn stannin [ Mus musculus (house mouse) ] Gene ID: 20621, updated on 14-Aug-2019

Gene summary

Official Symbol: Snn (provided by MGI)
Official Full Name: stannin (provided by MGI)
Primary source: MGI:MGI:1276549
See related: Ensembl:ENSMUSG00000037972
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI848521; AW547286; 2810407J07Rik
Expression: Broad expression in CNS E18 (RPKM 67.8), CNS E14 (RPKM 61.7) and 21 other tissues
Orthologs: human

Genomic context

Location: 16; 16 A1
Exon count: 2

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 16 | NC_000082.6 (11066298..11074985)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 16 | NC_000082.5 (11066391..11075078)

[Figure: genomic context of Snn on Chromosome 16 - NC_000082.6.]


Transcript information: This gene has 2 transcripts

Gene: Snn ENSMUSG00000037972

Description: stannin [Source:MGI Symbol;Acc:MGI:1276549]
Gene Synonyms: 2810407J07Rik
Location: 11,060,945-11,074,985 forward strand. GRCm38:CM001009.2
About this gene: This gene has 2 transcripts (splice variants), 142 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Snn-201 | ENSMUST00000089011.5 | 2872 | 88aa | ENSMUSP00000086405.4 | Protein coding | CCDS37255 | P61807 Q5M8P0 | TSL:1, GENCODE basic, APPRIS P1
Snn-202 | ENSMUST00000228962.1 | 847 | 88aa | ENSMUSP00000154936.1 | Protein coding | CCDS37255 | P61807 Q5M8P0 | GENCODE basic, APPRIS P1

[Figure: Ensembl region view, 34.04 kb, 11.06 Mb to 11.08 Mb, forward and reverse strands. Transcripts shown with ">": Snn-201 and Snn-202 (protein coding). Contig: AC164093.2. Genes shown with "<": Litaf-204 (protein coding), Litaf-205 (lncRNA), Txndc11-201 (protein coding), Txndc11-206 (lncRNA), Txndc11-208 (retained intron). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000089011

[Figure: transcript Snn-201 (protein coding), 8.66 kb, forward strand. Protein feature tracks for ENSMUSP00000086...: Transmembrane heli...; Low complexity (Seg); Pfam (Stannin transmembrane, Stannin unstructured linker, Stannin cytoplasmic); PANTHER (Stannin); Gene3D (Stannin superfamily). Variant track: sequence variants (dbSNP and all other sources); variant legend: synonymous variant. Scale bar: 0 to 88.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
