https://www.alphaknockout.com

Mouse Sncaip Knockout Project (CRISPR/Cas9)

Objective: To create a Sncaip knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Sncaip gene (NCBI Reference Sequence: NM_001199151; Ensembl: ENSMUSG00000024534) is located on mouse chromosome 18. Twelve exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 12 (Transcript: ENSMUST00000178678). Exons 4 and 5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 4 starts at about 4.53% of the coding region. Exons 4 and 5 cover 36.23% of the coding region. The size of the effective KO region is ~2943 bp. The KO region does not contain any other known gene.
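As a rough cross-check, the percentages in the note can be converted to base pairs. The sketch below assumes a coding region of 2898 bp, that is, the 965-aa protein listed for ENSMUST00000178678 plus one stop codon. That length is an inference, not a value stated in this report, and the results are approximate because the percentages are rounded.

```python
# Assumption: CDS length = (965 aa + 1 stop codon) * 3 bp. The report lists the
# protein length (965 aa) but does not state the CDS length directly.
PROTEIN_AA = 965
CDS_BP = (PROTEIN_AA + 1) * 3


def percent_to_bp(percent: float, total_bp: int = CDS_BP) -> int:
    """Convert a percentage of the coding region to an approximate bp count."""
    return round(percent / 100 * total_bp)


exon4_start_bp = percent_to_bp(4.53)   # coding bp that lie upstream of exon 4
target_cds_bp = percent_to_bp(36.23)   # coding bp covered by exons 4 and 5

print(f"CDS length (assumed): {CDS_BP} bp")
print(f"Exon 4 begins about {exon4_start_bp} bp into the CDS")
print(f"Exons 4 and 5 cover about {target_cds_bp} coding bp")
```

The coding bp covered by the target exons is smaller than the ~2943 bp KO region because the KO region is a genomic interval and so also includes intronic sequence.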


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 4, 5 and 12 labeled and two gRNA regions marked. Legend: exon of mouse Sncaip; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence aligned with itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
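The self-alignment described in the note can be sketched as a simple exact-match dot plot. This is a minimal illustration, not the tool used for the report: the function names and the toy sequence are hypothetical, only the 15 bp window size comes from the report, and only the forward strand is compared (the report's plot also shows reverse-complement matches).

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 15):
    """Return sorted (i, j) pairs with i < j whose window-mers are identical."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def offset_counts(dots):
    """Count dots per off-diagonal offset (j - i)."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


# Toy input (hypothetical, not the Sncaip flank): a 20 bp unit repeated 3 times.
unit = "ACGTTGCAAGGCTTACCGAT"
toy = "CCCCC" + unit * 3 + "GGGGG"

dots = self_dot_plot(toy, window=15)
offsets = offset_counts(dots)
print(offsets)
```

In a dot plot, a tandem repeat shows up as runs of dots parallel to the main diagonal at offsets that are multiples of the repeat unit length, which is what the offset counts summarize here.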

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence aligned with itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full length (2000 bp) | A: 26.2% (524) | C: 21.45% (429) | T: 29.8% (596) | G: 22.55% (451)

Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full length (2000 bp) | A: 31.95% (639) | C: 18.2% (364) | T: 30.15% (603) | G: 19.7% (394)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
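The overall GC content of each flank follows directly from the base counts in the two summary lines, and a windowed profile can be computed in the same way. The sketch below is illustrative only: the function names are hypothetical, the base counts are copied from the summaries, and the flank sequences themselves are not included in this report, so the sliding-window function is shown on a toy string.

```python
def gc_from_counts(counts: dict) -> float:
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each window of the given size along seq."""
    seq = seq.upper()
    profile = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        profile.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


# Base counts copied from the two summary lines of the report.
upstream = {"A": 524, "C": 429, "T": 596, "G": 451}
downstream = {"A": 639, "C": 364, "T": 603, "G": 394}

print(f"Upstream GC:   {gc_from_counts(upstream):.1f}%")
print(f"Downstream GC: {gc_from_counts(downstream):.1f}%")

# Toy sequence (hypothetical): one GC-only window followed by one AT-only window.
toy_profile = sliding_gc("GC" * 150 + "AT" * 150, window=300, step=300)
print(toy_profile)
```

With these counts the overall GC content is 44.0% upstream and 37.9% downstream.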


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr18  +       52866540   52868539   2000
YourSeq  274    504    1265  2000   85.0%     chr3   -       50884148   50884973   826
YourSeq  225    501    1237  2000   79.5%     chr3   +       51626432   51627294   863
YourSeq  219    848    1243  2000   86.0%     chr3   +       102246399  102247100  702
YourSeq  215    842    1265  2000   87.4%     chr2   +       58963668   58964090   423
YourSeq  207    844    1270  2000   88.5%     chr8   +       24159952   24160402   451
YourSeq  203    842    1269  2000   81.4%     chr10  +       94025706   94026086   381
YourSeq  202    881    1264  2000   87.7%     chr9   -       66690878   66691268   391
YourSeq  201    842    1267  2000   83.4%     chr3   +       100408882  100409314  433
YourSeq  200    503    1220  2000   82.2%     chr1   +       170978503  170979068  566
YourSeq  197    849    1265  2000   87.3%     chr17  -       24026725   24027160   436
YourSeq  195    842    1263  2000   86.6%     chr9   -       43981981   43982562   582
YourSeq  194    842    1220  2000   81.3%     chrX   +       105274873  105275213  341
YourSeq  192    724    1264  2000   85.1%     chr7   -       111269667  111270561  895
YourSeq  192    842    1265  2000   85.3%     chr8   +       80687461   80687889   429
YourSeq  190    881    1267  2000   84.1%     chr17  +       83869577   83869944   368
YourSeq  189    852    1268  2000   83.7%     chrX   +       85450374   85450797   424
YourSeq  187    664    1061  2000   85.6%     chr13  -       58483352   58483988   637
YourSeq  186    842    1265  2000   83.6%     chr5   -       21172926   21173341   416
YourSeq  185    754    1247  2000   85.5%     chr19  -       7650480    7651247    768

Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
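For readers who want to inspect the hits numerically, a BLAT result row can be parsed and the fraction of the query it covers computed. This is a minimal sketch under stated assumptions: the field names and helper functions are hypothetical, the three rows are copied from the upstream table, and the report does not state what criterion it uses for "significant" similarity, so no cutoff is applied here.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row; field 0 is the query name."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit: BlatHit) -> float:
    """Percentage of the query covered by the aligned query interval."""
    return 100.0 * (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream table.
rows = """\
YourSeq 2000 1 2000 2000 100.0% chr18 + 52866540 52868539 2000
YourSeq 274 504 1265 2000 85.0% chr3 - 50884148 50884973 826
YourSeq 225 501 1237 2000 79.5% chr3 + 51626432 51627294 863
""".splitlines()

hits = [parse_hit(r) for r in rows]
self_hit = max(hits, key=lambda h: h.score)      # the full-length chr18 match
others = [h for h in hits if h is not self_hit]  # partial matches elsewhere

for h in others:
    print(f"{h.chrom}{h.strand} score={h.score} identity={h.identity}% "
          f"query coverage={query_coverage(h):.1f}%")
```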

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr18  +       52871483   52873482   2000
YourSeq  67     1865   1963  2000   87.7%     chr8   -       120698134  120698233  100
YourSeq  66     1869   1970  2000   85.9%     chr6   -       43190290   43190392   103
YourSeq  66     1898   2000  2000   84.4%     chr2   -       101669476  101669677  202
YourSeq  61     1870   1963  2000   76.9%     chr16  +       15809910   15809991   82
YourSeq  60     1903   2000  2000   80.7%     chr1   -       157162188  157162285  98
YourSeq  60     1871   1956  2000   89.7%     chr10  +       63042662   63042750   89
YourSeq  59     1873   1963  2000   82.5%     chr6   -       38657455   38657545   91
YourSeq  58     1868   1964  2000   79.6%     chr11  -       6361508    6361599    92
YourSeq  58     1870   1967  2000   79.6%     chr6   +       127121231  127121328  98
YourSeq  55     1850   2000  2000   75.6%     chr8   +       113503571  113503719  149
YourSeq  55     1899   1969  2000   91.1%     chr7   +       80152315   80152604   290
YourSeq  55     1871   1958  2000   84.0%     chr7   +       36552284   36552370   87
YourSeq  55     1865   1943  2000   84.9%     chr18  +       36252599   36252677   79
YourSeq  54     1869   1938  2000   88.6%     chr3   -       149525898  149525967  70
YourSeq  53     1881   1962  2000   82.7%     chr4   +       121029276  121029356  81
YourSeq  53     1928   2000  2000   86.4%     chr2   +       34338379   34338451   73
YourSeq  53     1868   1938  2000   87.4%     chr11  +       96918749   96918819   71
YourSeq  52     1866   1943  2000   83.4%     chr14  -       121600125  121600202  78
YourSeq  51     1865   1964  2000   79.2%     chr8   +       13345477   13345573   97

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
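The chr18 coordinates of the two full-length self-hits can be used to cross-check the ~2943 bp KO region size quoted in the strategy summary. The sketch below assumes that the upstream flank ends immediately before the targeted interval and that the downstream flank begins immediately after it; the report does not state this explicitly, and the constant and function names are hypothetical.

```python
# Coordinates copied from the 100.0% identity chr18 rows of the two BLAT tables.
UP_FLANK_END = 52_868_539      # last base of the 2000 bp upstream flank
DOWN_FLANK_START = 52_871_483  # first base of the 2000 bp downstream flank


def interval_between(up_end: int, down_start: int):
    """Return (start, end, length) of the 1-based inclusive interval between flanks."""
    start = up_end + 1
    end = down_start - 1
    return start, end, end - start + 1


ko_start, ko_end, ko_len = interval_between(UP_FLANK_END, DOWN_FLANK_START)
print(f"chr18:{ko_start}-{ko_end} ({ko_len} bp)")
```

Under that assumption the interval between the flanks is 2943 bp long, which agrees with the stated size of the effective KO region.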


Gene and information: Sncaip synuclein, alpha interacting protein (synphilin) [Mus musculus (house mouse)]
Gene ID: 67847, updated on 21-Oct-2019

Gene summary

Official Symbol: Sncaip (provided by MGI)
Official Full Name: synuclein, alpha interacting protein (synphilin) (provided by MGI)
Primary source: MGI:MGI:1915097
See related: Ensembl:ENSMUSG00000024534
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: SYPH1; BB104594; 2810407O15Rik; 4933427B05Rik
Expression: Broad expression in CNS E14 (RPKM 5.1), whole brain E14.5 (RPKM 5.0) and 15 other tissues
Orthologs: human

Genomic context

Location: 18; 18 D1
Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (52767811..52915935)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (52927363..53075584)

[Figure: genomic context on Chromosome 18 - NC_000084.6]


Transcript information: This gene has 11 transcripts

Gene: Sncaip ENSMUSG00000024534

Description: synuclein, alpha interacting protein (synphilin) [Source:MGI Symbol;Acc:MGI:1915097]
Gene Synonyms: 4933427B05Rik, SYPH1, synphilin-1
Location: Chromosome 18: 52,767,709-52,915,935 forward strand. GRCm38:CM001011.2
About this gene: This gene has 11 transcripts (splice variants), 185 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID          bp    Protein  Translation ID        Biotype                  CCDS       UniProt     Flags
Sncaip-206  ENSMUST00000178678.7   3724  965aa    ENSMUSP00000137367.1  Protein coding           CCDS57124  G5E848      TSL:1, GENCODE basic, APPRIS ALT2
Sncaip-201  ENSMUST00000025413.13  3632  965aa    ENSMUSP00000025413.6  Protein coding           CCDS57124  G5E848      TSL:1, GENCODE basic, APPRIS ALT2
Sncaip-205  ENSMUST00000178011.7   3433  915aa    ENSMUSP00000137549.1  Protein coding           CCDS37820  E9Q4E2      TSL:1, GENCODE basic, APPRIS P3
Sncaip-203  ENSMUST00000163742.8   3272  915aa    ENSMUSP00000127189.2  Protein coding           CCDS37820  E9Q4E2      TSL:5, GENCODE basic, APPRIS P3
Sncaip-202  ENSMUST00000115410.8   3585  914aa    ENSMUSP00000111069.2  Protein coding           -          Q3V1N2      TSL:1, GENCODE basic, APPRIS ALT2
Sncaip-209  ENSMUST00000179625.7   2568  855aa    ENSMUSP00000136838.1  Protein coding           -          J3QNK1      TSL:5, GENCODE basic
Sncaip-210  ENSMUST00000179689.1   1512  503aa    ENSMUSP00000137107.1  Protein coding           -          J3QP58      TSL:5, GENCODE basic
Sncaip-208  ENSMUST00000179038.1   404   10aa     ENSMUSP00000137201.1  Protein coding           -          A0A0G2JDC8  CDS 3' incomplete, TSL:5
Sncaip-207  ENSMUST00000178883.7   2383  55aa     ENSMUSP00000136200.1  Nonsense mediated decay  -          J3QPG2      TSL:5
Sncaip-211  ENSMUST00000180259.7   2312  55aa     ENSMUSP00000137282.1  Nonsense mediated decay  -          J3QPG2      TSL:5
Sncaip-204  ENSMUST00000177861.7   1905  62aa     ENSMUSP00000136021.1  Nonsense mediated decay  -          J3KMU0      TSL:5


[Figure: region view, 168.23 kb, forward and reverse strands, axis from 52.80 Mb to 52.90 Mb. Comprehensive gene set track: protein coding transcripts Sncaip-201, Sncaip-202, Sncaip-203, Sncaip-205, Sncaip-206, Sncaip-208, Sncaip-209, Sncaip-210 and Ancv1r-201; nonsense mediated decay transcripts Sncaip-204, Sncaip-207 and Sncaip-211. Contigs: AC121943.3, AC124178.3. Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (processed transcript).]


Transcript: ENSMUST00000178678

[Figure: Sncaip-206 (protein coding), 148.22 kb, forward strand, with the protein feature view for ENSMUSP00000137... Annotation tracks: MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils), Superfamily, SMART, Pfam, PROSITE profiles, PANTHER, Gene3D. Features named in the view: Ankyrin repeat; Ankyrin repeat-containing domain; Ankyrin repeat-containing domain superfamily; Synphilin-1; Synphilin-1, alpha-Synuclein-binding domain. Variant track: sequence variants (dbSNP and all other sources). Variant legend: inframe deletion, missense variant, synonymous variant. Scale bar: 0 to 965.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
