https://www.alphaknockout.com

Mouse Scin Knockout Project (CRISPR/Cas9)

Objective: To create a Scin knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Scin gene (NCBI Reference Sequence: NM_001146196; Ensembl: ENSMUSG00000002565) is located on mouse chromosome 12. Sixteen exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 16 (Transcript: ENSMUST00000002640). Exons 2-3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a conditional allele knocked out in osteoclasts exhibit impaired osteoclast differentiation and reduced periodontal disease-mediated bone loss.

Exon 2 starts at about 9.32% of the coding region. Exons 2-3 cover 14.78% of the coding region. The size of the effective KO region is ~5922 bp. The KO region does not contain any other known gene.
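As a rough check on the percentages above, they can be converted back to base pairs. The sketch below assumes the coding region is 715 codons (the 715 aa listed for Scin-201), i.e. 2145 bp excluding the stop codon; the report does not state which denominator it used, and the helper name coding_bp is hypothetical. Under that assumption the targeted exons would carry a number of coding bases that is not a multiple of three, which would be consistent with a frameshift, but this is an estimate and not a figure given in the report.

```python
# Assumption: coding region = 715 codons (715 aa for Scin-201) = 2145 bp,
# stop codon excluded. The report does not state its denominator.
CDS_BP = 715 * 3

def coding_bp(percent: float, cds_bp: int = CDS_BP) -> int:
    """Convert a percentage of the coding region to the nearest whole bp."""
    return round(cds_bp * percent / 100)

upstream_bp = coding_bp(9.32)    # estimated coding bases before exon 2
deleted_bp = coding_bp(14.78)    # estimated coding bases within exons 2-3
frameshift = deleted_bp % 3 != 0 # True if the deletion is not a whole number of codons

print(upstream_bp, deleted_bp, frameshift)
```

Using 2148 bp (stop codon included) as the denominator rounds to the same base-pair estimates, so the conclusion does not depend on that choice.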


Overview of the Targeting Strategy

[Figure: wildtype Scin allele drawn 5' to 3' with exons 1, 2, 3 and 16 labeled; a gRNA region lies on each side of the knockout region. Legend: exon of mouse Scin; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(31.35% 627) | C(18.95% 379) | T(30.45% 609) | G(19.25% 385)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(30.8% 616) | C(19.2% 384) | T(30.9% 618) | G(19.1% 382)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
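The two base-composition summaries can be reduced to a single overall GC percentage per flank. The sketch below uses only the base counts listed in the summaries; the function name gc_percent is hypothetical, and it gives the overall value for each 2000 bp section, not the 300 bp sliding-window profile shown in the plots.

```python
def gc_percent(counts: dict) -> float:
    """Overall GC content (%) from per-base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total

# Base counts copied from the two summaries in this report.
upstream = {"A": 627, "C": 379, "T": 609, "G": 385}    # upstream of exon 2
downstream = {"A": 616, "C": 384, "T": 618, "G": 382}  # downstream of exon 3

for name, counts in (("upstream", upstream), ("downstream", downstream)):
    print(name, sum(counts.values()), "bp,", round(gc_percent(counts), 2), "% GC")
```

Both sections total 2000 bp and come out at roughly 38% GC overall.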


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  -       40128036   40130035   2000
YourSeq  83     1666   1766  2000   95.7%     chr1   +       148512216  148512737  522
YourSeq  74     1539   1732  2000   72.8%     chr13  -       118290748  118290888  141
YourSeq  73     332    476   2000   82.5%     chr17  +       11934291   11934802   512
YourSeq  68     336    442   2000   82.3%     chr6   -       82299551   82299658   108
YourSeq  66     342    446   2000   79.7%     chr4   -       134085203  134085306  104
YourSeq  66     331    433   2000   85.2%     chr17  -       72517029   72517371   343
YourSeq  66     336    440   2000   87.7%     chr11  -       82642348   82642630   283
YourSeq  65     1715   1826  2000   91.1%     chr10  -       66100632   66100745   114
YourSeq  65     344    445   2000   84.3%     chr9   +       115225299  115225401  103
YourSeq  63     337    427   2000   84.7%     chr7   -       98666539   98666629   91
YourSeq  62     339    442   2000   79.8%     chr1   +       131983414  131983516  103
YourSeq  61     335    442   2000   78.8%     chr1   +       85041454   85041562   109
YourSeq  60     363    442   2000   89.8%     chr11  +       21973197   22048139   74943
YourSeq  60     335    427   2000   82.7%     chr1   +       125574120  125574212  93
YourSeq  59     355    472   2000   91.7%     chr4   -       105487370  105487488  119
YourSeq  59     339    442   2000   83.6%     chr1   -       40288354   40288455   102
YourSeq  58     1711   1821  2000   76.2%     chr10  -       91771338   91771414   77
YourSeq  57     337    446   2000   78.3%     chr7   +       36599582   36599694   113
YourSeq  56     335    443   2000   83.6%     chr5   -       127040641  127040748  108

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  -       40122567   40124566   2000
YourSeq  145    1593   1989  2000   87.2%     chr1   -       173245943  173246347  405
YourSeq  140    1531   1983  2000   86.9%     chr6   +       40590066   40590530   465
YourSeq  136    1530   1981  2000   86.3%     chrX   +       106045457  106045911  455
YourSeq  128    1536   1983  2000   81.7%     chr17  -       43540754   43541191   438
YourSeq  128    1536   1972  2000   89.1%     chr12  +       40717597   40718057   461
YourSeq  128    1535   1971  2000   86.4%     chr1   +       47033808   47034261   454
YourSeq  127    1570   1983  2000   83.6%     chrX   +       6898984    6899376    393
YourSeq  127    1592   1988  2000   84.5%     chr7   +       76974241   76974604   364
YourSeq  116    1530   1983  2000   85.4%     chr11  -       121750984  121751449  466
YourSeq  113    1500   1720  2000   85.6%     chr3   +       56895213   56895570   358
YourSeq  113    1671   1988  2000   87.5%     chr10  +       25810542   25810875   334
YourSeq  109    1534   1970  2000   87.8%     chr10  -       46568516   46568982   467
YourSeq  108    1537   1988  2000   72.2%     chr3   -       93088182   93088734   553
YourSeq  106    1530   1966  2000   87.4%     chrX   -       116680021  116680478  458
YourSeq  105    1530   1718  2000   87.3%     chr3   -       60228084   60228279   196
YourSeq  105    1538   1779  2000   86.7%     chr8   +       74230310   74230563   254
YourSeq  105    1551   1988  2000   86.4%     chr2   +       55456808   55457257   450
YourSeq  103    1534   1983  2000   81.6%     chrX   +       135856540  135856981  442
YourSeq  103    1680   1981  2000   83.4%     chr7   +       92831185   92831477   293

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
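The top hit in each BLAT table places the corresponding flank on chr12, and the two placements can be compared directly. The sketch below takes the target coordinates of the two 100% identity hits and computes the length of each flank and the number of bases lying between them. It assumes, following the notes, that the flanks directly abut exon 2 and exon 3; the helper names are hypothetical. Because Scin is on the reverse strand, the upstream flank has the higher coordinates.

```python
# Target coordinates of the two 100%-identity chr12 hits (minus strand),
# copied from the BLAT tables in this report.
UPSTREAM_FLANK = (40128036, 40130035)    # 2000 bp section upstream of exon 2
DOWNSTREAM_FLANK = (40122567, 40124566)  # 2000 bp section downstream of exon 3

def length(interval):
    """Length in bp of a closed, 1-based interval."""
    start, end = interval
    return end - start + 1

def gap_between(lower, higher):
    """Bases lying strictly between two non-overlapping intervals."""
    return higher[0] - lower[1] - 1

flank_lengths = (length(UPSTREAM_FLANK), length(DOWNSTREAM_FLANK))
intervening_bp = gap_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)

print(flank_lengths, intervening_bp)  # (2000, 2000) 3469
```

Under the stated assumption, the 3469 bp between the flanks would correspond to the span from exon 2 through exon 3. It is smaller than the ~5922 bp effective KO region quoted in the strategy summary, which would be consistent with gRNA sites placed within the flanking intronic sequence, although the report does not give the gRNA positions.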


Gene information: Scin scinderin [Mus musculus (house mouse)]

Gene ID: 20259, updated on 12-Aug-2019

Gene summary

Official Symbol: Scin (provided by MGI)
Official Full Name: scinderin (provided by MGI)
Primary source: MGI:MGI:1306794
See related: Ensembl:ENSMUSG00000002565
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AW545522; adseverin
Expression: Biased expression in colon adult (RPKM 56.3), large intestine adult (RPKM 22.6) and 4 other tissues
Orthologs: human; all

Genomic context

Location: 12; 12 B1
Exon count: 16

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (40059769..40134228, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (40786356..40860815, complement)
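The genomic span of Scin follows from the location coordinates in either assembly. The sketch below is a simple consistency check on those coordinates, treating them as closed 1-based intervals (an assumption about the coordinate convention); the function name span_bp is hypothetical.

```python
def span_bp(start: int, end: int) -> int:
    """Length in bp of a closed, 1-based genomic interval."""
    return end - start + 1

grcm38 = span_bp(40059769, 40134228)   # NC_000078.6, current assembly
mgscv37 = span_bp(40786356, 40860815)  # NC_000078.5, previous assembly

print(grcm38, mgscv37, round(grcm38 / 1000, 2))  # 74460 74460 74.46
```

The two assemblies give the same span, and the value in kilobases agrees with the 74.46 kb shown for the Scin-201 transcript view later in this report.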

[Figure: genomic context of Scin on Chromosome 12 - NC_000078.6]


Transcript information: This gene has 2 transcripts

Gene: Scin ENSMUSG00000002565

Description: scinderin [Source:MGI Symbol;Acc:MGI:1306794]
Gene Synonyms: adseverin
Location: Chromosome 12: 40,059,769-40,134,228 reverse strand. GRCm38:CM001005.2
About this gene: This gene has 2 transcripts (splice variants), 171 orthologues, 7 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Scin-201  ENSMUST00000002640.5   2995  715aa    ENSMUSP00000002640.5  Protein coding  CCDS49055  Q60604   TSL:1 GENCODE basic APPRIS P1
Scin-202  ENSMUST00000078481.13  2654  615aa    ENSMUSP00000077573.7  Protein coding  CCDS25891  Q60604   TSL:1 GENCODE basic

[Figure: Ensembl region view, 94.46 kb, 40.06 Mb to 40.14 Mb, forward and reverse strands. Tracks: Contigs (AC131921.9, AC174598.2, both reverse strand); Genes (Scin-201 and Scin-202, both protein coding, reverse strand); Regulatory Build. Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site. Gene legend: Protein Coding, merged Ensembl/Havana.]


Transcript: ENSMUST00000002640

[Figure: protein view of Scin-201 (protein coding, reverse strand, 74.46 kb), ENSMUSP00000002..., scale bar 0 to 715. Domain and feature labels in source order: Low complexity (Seg); Superfamily SSF55753; Gelsolin-like domain superfamily; SMART Villin/; Prints Villin/Gelsolin; Pfam Gelsolin-like domain; PANTHER Villin/Gelsolin; Adseverin; Gene3D ADF-H/Gelsolin-like domain superfamily; CDD cd11290, cd11289, cd11292, cd11293, cd11288, cd11291. Sequence variants (dbSNP and all other sources). Variant legend: frameshift variant; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
