https://www.alphaknockout.com

Mouse Stab1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Stab1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Stab1 gene (NCBI Reference Sequence: NM_138672; Ensembl: ENSMUSG00000042286) is located on mouse chromosome 14. It has 69 exons, with the ATG start codon in exon 1 and the TGA stop codon in exon 69 (Transcript: ENSMUST00000036618). Exons 19 to 22 are selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Stab1 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-121F22 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit no physical or behavioral abnormalities.

Exon 19 starts at about 25.84% of the coding region. Knockout of exons 19 to 22 will result in a frameshift of the gene. Intron 18, used for 5'-loxP site insertion, is 1094 bp; intron 22, used for 3'-loxP site insertion, is 713 bp. The effective cKO region is ~2519 bp. The cKO region does not contain any other known gene.
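The frameshift statement rests on a simple rule: removing a stretch of coding sequence whose length is not a multiple of 3 shifts the reading frame downstream of the deletion. The Python sketch below illustrates that rule and estimates where 25.84% falls in a CDS of 2571 codons (the protein length this report lists for ENSMUST00000036618). The exon lengths in the example are hypothetical placeholders for illustration only; this report does not list the actual coding lengths of exons 19 to 22. The function names are likewise illustrative.

```python
def approx_cds_offset(fraction, protein_length_aa):
    """Approximate 1-based nucleotide position in the CDS for a fractional position.

    Assumes the CDS is protein_length_aa codons long (stop codon excluded).
    """
    return round(fraction * protein_length_aa * 3)


def causes_frameshift(deleted_coding_nt):
    """A deletion shifts the frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# 25.84% and 2571 aa are taken from this report.
exon19_offset = approx_cds_offset(0.2584, 2571)
exon19_codon = (exon19_offset - 1) // 3 + 1  # codon containing that nucleotide

# HYPOTHETICAL coding lengths for four deleted exons (not the real Stab1 values).
hypothetical_exon_lengths = [120, 95, 143, 88]
deleted_nt = sum(hypothetical_exon_lengths)

print("Approximate CDS offset of exon 19:", exon19_offset)
print("Approximate codon:", exon19_codon)
print("Hypothetical deletion causes frameshift:", causes_frameshift(deleted_nt))
```

With the real exon coordinates from the transcript annotation substituted for the placeholder list, the same check would confirm whether the deleted coding length leaves a remainder when divided by 3.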


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1 to 69, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Stab1; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(9019bp) | A(24.03% 2167) | C(26.29% 2371) | T(23.08% 2082) | G(26.6% 2399)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
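The overall GC content can be recomputed from the base counts in the summary line. The sketch below does that, and also shows one way a windowed GC profile could be computed. The sliding step of 1 bp is an assumption, since the report states only the window size (300 bp); the function name `gc_windows` is illustrative.

```python
# Base counts copied from the summary line of this report.
counts = {"A": 2167, "C": 2371, "T": 2082, "G": 2399}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def gc_windows(seq, window=300, step=1):
    """GC percentage for each window of the sequence.

    The step size is an assumption; the report gives only the window size.
    """
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append(100 * gc / window)
    return result


print("Total length:", total)
print("Overall GC content: %.2f%%" % gc_percent)
print(gc_windows("GGCCAATT", window=4))
```

The counts sum to the stated full length of 9019 bp, and the overall GC content comes out at roughly 52.9%.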


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr14  -       31157998   31160997   3000
YourSeq  383    2266    2854  3000   90.0%     chr11  +       93691613   93692220   608
YourSeq  340    2329    2840  3000   90.6%     chr9   -       70821756   71268732   446977
YourSeq  246    2485    2855  3000   93.2%     chr7   -       130136808  130137196  389
YourSeq  244    2361    2855  3000   91.0%     chr5   -       142343298  142344046  749
YourSeq  241    2269    2862  3000   86.1%     chr15  -       24829698   24830122   425
YourSeq  233    2266    2853  3000   86.0%     chr17  +       31043492   31043854   363
YourSeq  232    2265    2855  3000   81.8%     chr17  -       64552276   64552661   386
YourSeq  232    2265    2865  3000   85.3%     chr8   +       47322850   47323255   406
YourSeq  231    2290    2863  3000   90.6%     chr7   +       126628543  126629185  643
YourSeq  231    2266    2852  3000   85.6%     chr7   +       119634116  119634517  402
YourSeq  228    2269    2863  3000   84.6%     chrX   -       103756734  103757176  443
YourSeq  227    2263    2863  3000   85.9%     chrX   +       142175473  142175866  394
YourSeq  220    2485    2883  3000   89.1%     chr13  +       93959849   93960264   416
YourSeq  216    2277    2862  3000   84.6%     chr6   +       125707395  125707794  400
YourSeq  216    2270    2863  3000   84.9%     chr17  +       87901909   87902321   413
YourSeq  215    2490    2864  3000   89.8%     chr3   +       69461778   69462164   387
YourSeq  211    2277    2855  3000   85.3%     chr6   +       52110341   52110733   393
YourSeq  209    2271    2865  3000   85.8%     chrX   -       85948523   85948931   409
YourSeq  209    2491    2855  3000   93.1%     chr15  +       84071673   84072063   391

Note: The 3000 bp section upstream of Exon 19 is BLAT searched against the genome. No significant similarity is found.
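One way to read the table is to compare each secondary hit with the full-length self-hit on chr14. The sketch below parses three rows copied from the upstream table and reports how much of the 3000 bp query the best secondary hit covers. The report does not state what threshold it uses for "significant" similarity, so no cutoff is applied here; the `BlatHit` type and helper names are illustrative, not part of any BLAT tooling.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str


def parse_hit(row: str) -> BlatHit:
    """Parse 'SCORE QSTART QEND QSIZE IDENTITY CHROM ...' fields from one row."""
    f = row.split()
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5])


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query sequence spanned by the hit."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream BLAT table in this report.
rows = [
    "3000 1 3000 3000 100.0% chr14 - 31157998 31160997 3000",
    "383 2266 2854 3000 90.0% chr11 + 93691613 93692220 608",
    "340 2329 2840 3000 90.6% chr9 - 70821756 71268732 446977",
]
hits = [parse_hit(r) for r in rows]

self_hit = hits[0]
best_secondary = max(hits[1:], key=lambda h: h.score)

print("Self-hit coverage:", query_coverage(self_hit))
print("Best secondary hit:", best_secondary.chrom,
      "coverage %.1f%%" % (100 * query_coverage(best_secondary)),
      "identity %.1f%%" % best_secondary.identity)
```

The best secondary hit spans only about a fifth of the query, at 90.0% identity, which is in line with the note that no significant similarity is found.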

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr14  -       31152479   31155478   3000
YourSeq  163    22      236   3000   90.0%     chr16  -       13713149   13713354   206
YourSeq  154    33      250   3000   87.1%     chr7   +       63337188   63337374   187
YourSeq  154    41      220   3000   93.3%     chr3   +       95654454   95655016   563
YourSeq  148    36      1997  3000   91.6%     chr18  +       64012268   64290191   277924
YourSeq  143    36      215   3000   92.9%     chr13  -       44783422   44783603   182
YourSeq  143    44      210   3000   93.5%     chr16  +       35944294   35944463   170
YourSeq  142    10      196   3000   89.5%     chr17  +       58433972   58434155   184
YourSeq  139    32      209   3000   87.1%     chr15  -       34623769   34623939   171
YourSeq  137    32      212   3000   86.3%     chr15  -       49117167   49117328   162
YourSeq  136    42      209   3000   92.6%     chr19  -       6237473    6628919    391447
YourSeq  136    32      216   3000   87.7%     chr10  -       107821513  107821686  174
YourSeq  135    32      188   3000   94.2%     chr9   +       56511276   56511883   608
YourSeq  134    42      212   3000   88.5%     chr11  +       75352980   75353139   160
YourSeq  133    42      224   3000   86.0%     chrX   +       116770337  116770494  158
YourSeq  132    33      257   3000   82.3%     chr8   -       69763804   69763977   174
YourSeq  131    41      197   3000   92.9%     chr17  +       87319576   87319732   157
YourSeq  130    36      250   3000   84.0%     chr2   +       180158365  180158534  170
YourSeq  130    36      193   3000   92.3%     chr11  +       115714268  115714425  158
YourSeq  129    33      197   3000   86.5%     chr12  -       80485576   80485737   162

Note: The 3000 bp section downstream of Exon 22 is BLAT searched against the genome. No significant similarity is found.
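As a consistency check, the genomic interval lying between the two 3000 bp self-hits on chr14 can be computed from the coordinates in the BLAT tables. This is a minimal sketch that treats the reported start and end values as inclusive coordinates, which is an assumption; the helper name `gap_length` is illustrative.

```python
def gap_length(interval_a, interval_b):
    """Number of bases strictly between two non-overlapping inclusive intervals."""
    (a_start, a_end), (b_start, b_end) = sorted([interval_a, interval_b])
    if b_start <= a_end:
        raise ValueError("intervals overlap")
    return b_start - a_end - 1


# chr14 self-hits from the two BLAT tables (gene is on the reverse strand,
# so the "upstream" query has the higher genomic coordinates).
upstream_hit = (31157998, 31160997)
downstream_hit = (31152479, 31155478)

between = gap_length(upstream_hit, downstream_hit)
print("Bases between the two flanking queries:", between)
```

Under that assumption the gap is 2519 bp, which agrees with the effective cKO region size stated in the strategy summary.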


Gene and information:

Stab1 stabilin 1 [Mus musculus (house mouse)]
Gene ID: 192187, updated on 8-Oct-2019

Gene summary

Official Symbol: Stab1 (provided by MGI)
Official Full Name: stabilin 1 (provided by MGI)
Primary source: MGI:MGI:2178742
See related: Ensembl:ENSMUSG00000042286
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: MS-1; FEEL-1; FELE-1; STAB-1; MFEEL-1; mKIAA0246
Expression: Ubiquitous expression in ovary adult (RPKM 36.3), subcutaneous fat pad adult (RPKM 27.9) and 27 other tissues
Orthologs: human

Genomic context

Location: 14; 14 B

Exon count: 69

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (31139017..31168651, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (31952203..31981827, complement)

Chromosome 14 - NC_000080.6


Transcript information: This gene has 13 transcripts

Gene: Stab1 ENSMUSG00000042286

Description: stabilin 1 [Source:MGI Symbol;Acc:MGI:2178742]
Gene Synonyms: MS-1
Location: Chromosome 14: 31,139,013-31,168,641 reverse strand. GRCm38:CM001007.2
About this gene: This gene has 13 transcripts (splice variants), 199 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Stab1-201  ENSMUST00000036618.13  7999  2571aa      ENSMUSP00000046199.7  Protein coding   CCDS26906  G3X973   TSL:1, GENCODE basic, APPRIS P1
Stab1-203  ENSMUST00000159249.1   591   197aa       ENSMUSP00000125542.1  Protein coding   -          F7BK35   CDS 5' and 3' incomplete, TSL:5
Stab1-207  ENSMUST00000160024.7   413   92aa        ENSMUSP00000125239.1  Protein coding   -          F7CT68   CDS 5' incomplete, TSL:3
Stab1-208  ENSMUST00000160720.1   3453  No protein  -                     Retained intron  -          -        TSL:2
Stab1-209  ENSMUST00000161129.7   3001  No protein  -                     Retained intron  -          -        TSL:1
Stab1-205  ENSMUST00000159532.1   2427  No protein  -                     Retained intron  -          -        TSL:1
Stab1-212  ENSMUST00000162169.7   2344  No protein  -                     Retained intron  -          -        TSL:2
Stab1-206  ENSMUST00000159757.7   2116  No protein  -                     Retained intron  -          -        TSL:1
Stab1-204  ENSMUST00000159480.7   1236  No protein  -                     Retained intron  -          -        TSL:2
Stab1-211  ENSMUST00000161631.1   887   No protein  -                     Retained intron  -          -        TSL:3
Stab1-213  ENSMUST00000162763.1   524   No protein  -                     Retained intron  -          -        TSL:3
Stab1-210  ENSMUST00000161464.1   444   No protein  -                     Retained intron  -          -        TSL:3
Stab1-202  ENSMUST00000159208.7   242   No protein  -                     lncRNA           -          -        TSL:5


[Figure: Ensembl genome browser view of a 49.63 kb region (31.13 Mb to 31.17 Mb), contig AC154446.2. Forward strand: Nt5dc2 transcripts (Nt5dc2-201, -202, -203, -204 and -206, protein coding; Nt5dc2-205, retained intron). Reverse strand: Stab1 transcripts (Stab1-201 to Stab1-213) and Nisch transcripts (Nisch-201, -202, -203, -206, -210, -213, -214, -216, -217, -218 and -220). Regulatory Build legend: CTCF; enhancer; open chromatin; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000036618

[Figure: protein domain view of Stab1-201 (protein coding; reverse strand, 29.63 kb; protein ENSMUSP00000046..., scale bar 0 to 2571). Annotation tracks, in the order shown: transmembrane helices; low complexity (Seg); cleavage site (Sign...); Superfamily (SSF57196, C-type lectin fold, FAS1 domain superfamily); SMART (EGF-like domain, Link domain, Laminin EGF domain, EGF-like calcium-binding domain, FAS1 domain); Pfam (EGF domain, Link domain, FAS1 domain); PROSITE profiles (EGF-like domain, Link domain, FAS1 domain); PROSITE patterns (EGF-like, conserved site; Link domain); PANTHER (PTHR24038, PTHR24038:SF8); Gene3D (2.10.25.10, C-type lectin-like/link domain superfamily, FAS1 domain superfamily, 2.170.300.10); CDD (cd00055); sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
