https://www.alphaknockout.com

Mouse Stab2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Stab2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Stab2 gene (NCBI Reference Sequence: NM_138673; Ensembl: ENSMUSG00000035459) is located on mouse chromosome 10. 69 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 69 (Transcript: ENSMUST00000035288). Exon 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Stab2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-233D10 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for knock-out alleles exhibit no gross abnormalities. Mice homozygous for one null allele display elevated serum hyaluronic acid levels and decreased metastasis.

Exon 4 starts from about 4.64% of the coding region. Knockout of exon 4 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion: 15628 bp, and the size of intron 4 for 3'-loxP site insertion: 1090 bp. The size of the effective cKO region: ~586 bp. The cKO region does not contain any other known gene.
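The frameshift statement rests on simple arithmetic: deleting a stretch of coding sequence whose length is not a multiple of 3 shifts the reading frame of everything downstream. The Python sketch below illustrates that check only. The exon lengths in it are hypothetical placeholders, because this report does not state the coding length of exon 4 (the ~586 bp figure is the region between the loxP sites, which sit in introns 3 and 4).

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if deleting this many coding bases shifts the reading frame."""
    if deleted_coding_bp <= 0:
        raise ValueError("deleted_coding_bp must be a positive number of bases")
    # Codons are 3 bases long, so only multiples of 3 keep the frame intact.
    return deleted_coding_bp % 3 != 0


# Hypothetical exon lengths for illustration only; not taken from the report.
for length in (148, 150):
    print(length, causes_frameshift(length))
```

An in-frame deletion (a multiple of 3) would remove amino acids without disturbing the downstream frame, which is why an out-of-frame exon is preferred as a cKO target.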


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exon labels 1, 4, 5 and 69, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Stab2; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
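The self-alignment described in the note can be sketched as a word-match dot plot: every window of fixed length is compared against every other window of the same sequence, on both the forward strand and the reverse complement. The sketch below assumes exact matching of 10 bp windows; the tool that produced the report's plot is not named, so the function names and the exact-match rule here are illustrative assumptions.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward: seq[i:i+window] == seq[j:j+window] with i < j
             (the trivial main diagonal i == j is excluded).
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window].
    """
    seq = seq.upper()
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for j in range(n_windows):
        word = seq[j:j + window]
        forward.extend((i, j) for i in index.get(word, []) if i < j)
        reverse.extend((i, j) for i in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence with one direct repeat of 7 bp separated by a spacer.
toy = "ACGGTCA" + "TTT" + "ACGGTCA"
fwd, rev = self_dot_plot(toy, window=7)
print(fwd)
```

Off-diagonal runs of forward matches would indicate direct or tandem repeats, and runs of reverse matches would indicate inverted repeats. An empty or sparse result corresponds to the "no significant tandem repeat" reading.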

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (7086 bp) | A (25.26%, 1790) | C (23.47%, 1663) | T (29.93%, 2121) | G (21.34%, 1512)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
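The base counts in the summary can be cross-checked, and the 300 bp windowed profile reproduced on any sequence, with a few lines of Python. This is a minimal sketch: the function names are illustrative, the step size is an assumption (the report gives only the window size), and the homology arm sequence itself is not included in this report, so only the reported counts are used here.

```python
def gc_percent(counts: dict) -> float:
    """Overall GC percentage from a dict of base counts (keys A, C, G, T)."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """Yield (start, GC percent) for each full window along seq."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, 100.0 * (chunk.count("G") + chunk.count("C")) / window


# Base counts as reported in the summary line.
reported = {"A": 1790, "C": 1663, "T": 2121, "G": 1512}
print(sum(reported.values()))          # 7086
print(round(gc_percent(reported), 2))  # 44.81
```

The four counts sum to the stated full length of 7086 bp, and the overall GC content works out to about 44.81%, which is consistent with the note that no high-GC region dominates the sequence.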


BLAT Search Results (up)

QUERY     SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000   1      3000  3000   100.0%    chr10  -       86981427   86984426   3000
YourSeq   161    1718   2088  3000   82.0%     chr9   -       78840612   78840993   382
YourSeq   154    1833   2162  3000   89.3%     chr16  -       22233080   22233417   338
YourSeq   141    1764   2083  3000   85.1%     chr12  +       91864981   91865315   335
YourSeq   131    1782   2082  3000   87.1%     chr11  +       69870512   69870818   307
YourSeq   129    1715   2110  3000   91.7%     chr4   -       125806836  125807254  419
YourSeq   115    2705   2912  3000   82.4%     chr17  -       28995293   28995561   269
YourSeq   115    1706   1958  3000   89.2%     chr3   +       28575555   28575830   276
YourSeq   114    1782   2066  3000   87.2%     chr12  -       98892460   98892743   284
YourSeq   113    1673   2119  3000   90.0%     chr4   +       116410592  116411052  461
YourSeq   111    1925   2073  3000   87.9%     chr10  -       83169327   83169487   161
YourSeq   107    2703   2885  3000   83.9%     chr17  -       46766382   47037528   271147
YourSeq   106    2675   2888  3000   78.1%     chr17  -       87839060   87839272   213
YourSeq   106    2714   2880  3000   82.1%     chr15  +       76641625   76641792   168
YourSeq   103    2703   2882  3000   76.6%     chr4   +       142142569  142142744  176
YourSeq   101    1783   2061  3000   85.5%     chr10  -       31346943   31347223   281
YourSeq   100    2705   2898  3000   78.8%     chr11  -       61968231   61968420   190
YourSeq   97     2727   2901  3000   80.3%     chr15  -       5386674    5386845    172
YourSeq   97     2675   2857  3000   76.6%     chr6   +       31378315   31378472   158
YourSeq   95     2680   2827  3000   82.5%     chr14  +       69384616   69602987   218372

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY     SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000   1      3000  3000   100.0%    chr10  -       86977841   86980840   3000
YourSeq   635    1437   2675  3000   86.4%     chrX   +       95294300   95295586   1287
YourSeq   595    1452   2491  3000   87.0%     chr11  -       118774409  118775467  1059
YourSeq   575    1494   2684  3000   82.1%     chr10  +       119299028  119300175  1148
YourSeq   574    1442   2673  3000   85.1%     chr1   +       171984122  171985385  1264
YourSeq   568    1452   2681  3000   81.8%     chr7   +       140279885  140281101  1217
YourSeq   562    1573   2681  3000   86.7%     chrX   -       150433596  150455701  22106
YourSeq   561    1477   2681  3000   85.9%     chr10  +       115700965  115702139  1175
YourSeq   560    1588   2679  3000   82.7%     chr13  +       64981973   64983058   1086
YourSeq   558    1484   2681  3000   83.0%     chr17  +       10883757   10884942   1186
YourSeq   556    1647   2681  3000   86.0%     chr14  -       118523208  118524243  1036
YourSeq   553    1541   2681  3000   83.6%     chr9   -       31752511   31753635   1125
YourSeq   551    1447   2607  3000   82.4%     chr4   -       106258654  106259790  1137
YourSeq   550    1588   2681  3000   86.0%     chrX   +       153934459  153935547  1089
YourSeq   549    1452   2679  3000   84.5%     chr2   +       62119044   62120238   1195
YourSeq   544    1577   2679  3000   85.9%     chr16  +       23364836   23365931   1096
YourSeq   538    1517   2681  3000   83.8%     chr10  -       121181449  121182613  1165
YourSeq   534    1580   2689  3000   83.7%     chr7   -       84199396   84200483   1088
YourSeq   523    1451   2500  3000   85.6%     chr5   -       44469757   44470819   1063
YourSeq   515    1453   2679  3000   85.2%     chr1   +       35443733   35444912   1180

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
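When reading BLAT tables like the two above, it can help to parse each row and look at how much of the 3000 bp query a hit actually covers. The sketch below is one way to do that in Python. The field names and the coverage measure are illustrative choices, not part of the BLAT output or of the report's own significance criterion, which is not stated.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    fields = line.split()
    # Rows copied from the browser may start with the link text "browser details".
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {line!r}")
    q, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(q, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def query_coverage(hit: BlatHit) -> float:
    """Percent of the query covered by the aligned query interval."""
    return 100.0 * (hit.q_end - hit.q_start + 1) / hit.q_size


# First two rows of the upstream table.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr10 - 86981427 86984426 3000",
    "YourSeq 161 1718 2088 3000 82.0% chr9 - 78840612 78840993 382",
]
hits = [parse_hit(r) for r in rows]
for h in hits:
    print(h.chrom, h.score, round(query_coverage(h), 1))
```

In both tables the top row is the target locus itself on chr10 (full length, 100.0% identity). The remaining rows align only part of the query at lower identity.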


Gene and information:

Stab2 stabilin 2 [Mus musculus (house mouse)]
Gene ID: 192188, updated on 12-Aug-2019

Gene summary

Official Symbol: Stab2 (provided by MGI)
Official Full Name: stabilin 2 (provided by MGI)
Primary source: MGI:MGI:2178743
See related: Ensembl:ENSMUSG00000035459
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: FELL; FEEL-2; STAB-2; MFEEL-2
Expression: Biased expression in spleen adult (RPKM 16.0), liver E18 (RPKM 6.2) and 8 other tissues
Orthologs: human, all

Genomic context

Location: 10; 10 C1

Exon count: 74

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (86841194..87008038, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (86303955..86470687, complement)

[Figure: genomic context of Stab2, Chromosome 10 - NC_000076.6.]


Transcript information: This gene has 7 transcripts

Gene: Stab2 ENSMUSG00000035459

Description: stabilin 2 [Source:MGI Symbol;Acc:MGI:2178743]
Gene Synonyms: FEEL-2, STAB-2
Location: Chromosome 10: 86,841,198-87,008,025 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 7 transcripts (splice variants), 226 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt         Flags
Stab2-201  ENSMUST00000035288.16  8228  2559aa      ENSMUSP00000048309.8  Protein coding           CCDS36021  E5RKF9, Q8R4U0  TSL:1, GENCODE basic, APPRIS P1
Stab2-205  ENSMUST00000219341.2   3502  870aa       ENSMUSP00000151465.2  Nonsense mediated decay  -          A0A1W2P6Y4      CDS 5' incomplete, TSL:5
Stab2-206  ENSMUST00000219612.1   502   No protein  -                     Retained intron          -          -               TSL:3
Stab2-207  ENSMUST00000219659.1   442   No protein  -                     Retained intron          -          -               TSL:5
Stab2-203  ENSMUST00000218408.1   1237  No protein  -                     lncRNA                   -          -               TSL:1
Stab2-202  ENSMUST00000218366.1   720   No protein  -                     lncRNA                   -          -               TSL:1
Stab2-204  ENSMUST00000219280.1   523   No protein  -                     lncRNA                   -          -               TSL:3
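The NCBI and Ensembl coordinates quoted for Stab2 differ by a few bases at each end, so it is worth checking what locus length each implies. The sketch below assumes both sets of coordinates are 1-based and inclusive, which is how they are conventionally displayed; the function name is illustrative.

```python
def locus_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates as quoted in this report (GRCm38, chromosome 10, reverse strand).
ensembl_bp = locus_length(86_841_198, 87_008_025)
ncbi_bp = locus_length(86_841_194, 87_008_038)

print(ensembl_bp, round(ensembl_bp / 1000, 2))  # 166828 166.83
print(ncbi_bp)                                  # 166845
```

Under that assumption the Ensembl interval is 166,828 bp, which agrees with the 166.83 kb label on the Stab2-201 transcript view, and the NCBI interval is 17 bp longer.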


[Figure: Ensembl genome browser view of a 186.83 kb region of chromosome 10 (axis 86.85Mb to 87.00Mb). Forward strand, genes (Comprehensive set...): Nt5dc3-201 (protein coding), Nt5dc3-202 (retained intron), Gm16280-201 (lncRNA), Gm16271-201 (lncRNA), Gm16270-201 (transcribed processed pseudogene), Gm16269-201 (processed pseudogene), Gm49358-201 (protein coding), Gm16268-201 (lncRNA). Contigs: AC112790.5, AC025501.19. Reverse strand, genes (Comprehensive set...): Stab2-201 (protein coding), Stab2-205 (nonsense mediated decay), Stab2-203 (lncRNA), Stab2-204 (lncRNA), Stab2-206 (retained intron), Stab2-202 (lncRNA), Stab2-207 (retained intron). A Regulatory Build track is shown. Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site. Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (pseudogene; processed transcript; RNA gene).]


Transcript: ENSMUST00000035288

[Figure: protein domain and variant view for Stab2-201 (protein coding; reverse strand, 166.83 kb), protein ENSMUSP00000048..., scale bar 0 to 2559. Feature tracks: Transmembrane heli..., MobiDB lite, Low complexity (Seg), Cleavage site (Sign.... Domain annotation tracks from Superfamily, SMART, Pfam, PROSITE profiles, PROSITE patterns, PANTHER, Gene3D and CDD include: SSF57196; C-type lectin fold; FAS1 domain superfamily; EGF-like domain; Link domain; EGF-like calcium-binding domain; FAS1 domain; Laminin EGF domain; EGF domain; EGF-like, conserved site; PTHR24038; PTHR24038:SF0; 2.170.300.10; C-type lectin-like/link domain superfamily; 2.10.25.10; cd00054; cd00055. Variant track: All sequence SNPs/i..., sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
