https://www.alphaknockout.com

Mouse Stox1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Stox1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Stox1 gene (NCBI Reference Sequence: NM_001033260; Ensembl: ENSMUSG00000036923) is located on mouse chromosome 10. Four exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 4 (Transcript: ENSMUST00000133371). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Stox1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-222J16 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 15.32% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 1495 bp, and the size of intron 3 for 3'-loxP site insertion is 4287 bp. The size of the effective cKO region is ~2871 bp. The cKO region does not contain any other known gene.
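The frameshift reasoning in the note can be illustrated with a short sketch. It rests on two assumptions that the report does not state: the coding region is taken as 990 codons (the 990 aa protein listed for ENSMUST00000133371, stop codon excluded), and positions are counted from 1. The exon lengths passed to causes_frameshift are hypothetical placeholders, since the report does not give the length of exon 3.

```python
# Sketch only: names and placeholder lengths below are illustrative assumptions.
CDS_LENGTH_BP = 990 * 3          # 990 aa protein; stop codon excluded (assumption)
EXON3_START_FRACTION = 0.1532    # "about 15.32% of the coding region" (from the note)


def approx_cds_position(fraction: float, cds_length_bp: int) -> int:
    """Approximate nucleotide position in the CDS for a fractional offset."""
    return round(fraction * cds_length_bp)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Deleting coding sequence shifts the frame unless its length is a multiple of 3."""
    return deleted_coding_bp % 3 != 0


exon3_start_nt = approx_cds_position(EXON3_START_FRACTION, CDS_LENGTH_BP)
exon3_start_codon = (exon3_start_nt - 1) // 3 + 1  # 1-based codon index (assumption)

print("approximate exon 3 start (nt):", exon3_start_nt)
print("approximate exon 3 start (codon):", exon3_start_codon)

# Hypothetical exon lengths, NOT taken from the report:
print(causes_frameshift(100))  # length not divisible by 3
print(causes_frameshift(99))   # length divisible by 3
```

Under these assumptions, a deleted exon whose coding length is not divisible by 3 leaves every downstream codon out of frame, which is the basis for the loss-of-function expectation stated above.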


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: wildtype allele (5' to 3', exons 1 to 4, with the two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Stox1, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: self dot plot of the sequence (Sequence 12 on both axes), showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
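The self-alignment described in the note can be sketched as an exact-match dot plot: every window of 10 bp is compared with every other window, on the forward strand and against the reverse complement. This is a minimal illustration under the assumption of exact matching; the function names are hypothetical and the report does not say which tool or scoring produced its plot.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: pairs i < j whose windows are identical (direct repeats).
    reverse: pairs where the window at j equals the reverse complement
             of the window at i (inverted repeats).
    """
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward = []
    for positions in index.values():
        for a, i in enumerate(positions):
            for j in positions[a + 1:]:
                forward.append((i, j))

    reverse = []
    for i in range(n_windows):
        for j in index.get(revcomp(seq[i:i + window]), []):
            reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy sequence (not from the report): a 10 bp unit repeated three times in tandem.
toy = "ACGGTCATTG" * 3
fwd, rev = self_dot_plot(toy, window=10)
print(len(fwd), "forward matches;", len(rev), "reverse-complement matches")
```

In such a plot a tandem repeat appears as a run of forward matches at a constant offset j - i equal to the repeat unit length, which is the kind of off-diagonal pattern the note refers to.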

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (Sequence 12).]

Summary: Full length (9371 bp) | A (25.33%, 2374) | C (23.43%, 2196) | T (28.0%, 2624) | G (23.23%, 2177)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
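The base counts in the summary are enough to recompute the overall GC content, and the windowed analysis can be sketched the same way. The code below is a minimal illustration: the counts come from the summary line, while the function names and the toy sequence are assumptions. The report does not state what threshold it uses for "high GC content", so none is applied here.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each full window of the given size along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the summary line of the report.
counts = {"A": 2374, "C": 2196, "T": 2624, "G": 2177}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total

print("total bases:", total)
print("overall GC content: {:.2f}%".format(overall_gc * 100))

# Toy sequence (not from the report) with a small window, to show the output shape.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

The counts sum to the stated full length of 9371 bp, and the overall GC content works out to roughly 46.7%.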


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       62666576   62669575   3000
YourSeq  157    485    1026  3000   82.7%     chr11  +       106492241  106492505  265
YourSeq  149    489    728   3000   89.1%     chr2   +       14536585   14536875   291
YourSeq  145    483    718   3000   86.1%     chr15  -       73181014   73181215   202
YourSeq  145    451    641   3000   86.1%     chr19  +       3354829    3355011    183
YourSeq  143    461    646   3000   91.9%     chr19  -       60418600   60418788   189
YourSeq  143    483    722   3000   86.2%     chr11  -       100912086  100912318  233
YourSeq  140    309    642   3000   93.3%     chr11  +       78315925   78316355   431
YourSeq  139    457    642   3000   91.7%     chr17  +       7612389    7612629    241
YourSeq  138    462    643   3000   91.6%     chr16  +       11239130   11239317   188
YourSeq  137    486    716   3000   83.4%     chr4   -       57446933   57447133   201
YourSeq  135    483    645   3000   93.0%     chr3   -       127609541  127609703  163
YourSeq  134    483    661   3000   87.3%     chr5   -       139316806  139316982  177
YourSeq  134    461    644   3000   86.5%     chr16  -       8727836    8728014    179
YourSeq  134    486    1025  3000   78.9%     chr15  -       89316528   89316969   442
YourSeq  134    461    647   3000   87.0%     chr11  -       121213489  121213674  186
YourSeq  133    459    642   3000   89.2%     chr15  +       76553422   76553600   179
YourSeq  133    461    641   3000   89.0%     chr1   +       4866054    4866238    185
YourSeq  132    483    662   3000   85.0%     chr2   -       21199485   21199651   167
YourSeq  132    486    641   3000   92.4%     chr9   +       115026580  115026735  156

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       62660705   62663704   3000
YourSeq  142    1850   2510  3000   78.9%     chr12  -       21851816   21852097   282
YourSeq  132    2335   2526  3000   88.0%     chr4   +       53625999   53626954   956
YourSeq  126    1850   2510  3000   77.5%     chr12  +       20351810   20352091   282
YourSeq  124    2285   2510  3000   83.5%     chr2   +       125022362  125022579  218
YourSeq  119    2133   2510  3000   82.7%     chrX   -       104223034  104223358  325
YourSeq  119    2288   2510  3000   83.3%     chr2   +       5858089    5858296    208
YourSeq  118    2315   2510  3000   85.8%     chr5   +       34522780   34523392   613
YourSeq  117    2310   2522  3000   82.9%     chr9   -       75154782   75154983   202
YourSeq  115    2324   2510  3000   79.2%     chr19  +       42713463   42713622   160
YourSeq  115    2311   2507  3000   85.7%     chr12  +       16498826   16499020   195
YourSeq  112    2354   2508  3000   89.6%     chrX   -       168680411  168680569  159
YourSeq  112    2209   2510  3000   89.9%     chr3   -       87524193   87524738   546
YourSeq  112    2115   2510  3000   84.1%     chr15  -       83515547   83515922   376
YourSeq  110    2351   2510  3000   89.9%     chr2   -       127132890  127133049  160
YourSeq  110    2379   2962  3000   80.5%     chr15  +       78016460   78016953   494
YourSeq  109    2351   2518  3000   86.7%     chr18  -       63326770   63326938   169
YourSeq  108    2178   2510  3000   89.2%     chr8   -       95192434   95193062   629
YourSeq  108    2371   2518  3000   88.8%     chr5   +       38165113   38165258   146
YourSeq  107    2349   2502  3000   90.3%     chr15  -       86804108   86804263   156

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
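The top hit of each BLAT search gives the genomic position of the 3000 bp flank on chr10, and the interval between the two flanks can be checked against the stated cKO region size. The sketch below assumes the displayed coordinates are 1-based and inclusive, which is consistent with each flank spanning exactly 3000 bp; the variable and function names are illustrative.

```python
# Top BLAT hits (chr10, minus strand), copied from the two result tables.
UPSTREAM_FLANK = (62666576, 62669575)    # 3000 bp upstream of exon 3
DOWNSTREAM_FLANK = (62660705, 62663704)  # 3000 bp downstream of exon 3


def span(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval (assumed coordinate convention)."""
    return end - start + 1


# The gene is on the reverse strand, so the upstream flank has the higher coordinates.
region_start = DOWNSTREAM_FLANK[1] + 1
region_end = UPSTREAM_FLANK[0] - 1
region_length = span(region_start, region_end)

print("upstream flank length:", span(*UPSTREAM_FLANK))
print("downstream flank length:", span(*DOWNSTREAM_FLANK))
print("interval between flanks: chr10:{}-{} ({} bp)".format(
    region_start, region_end, region_length))
```

Under this convention the interval between the flanks is 2871 bp, which agrees with the effective cKO region size given in the strategy notes.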


Gene and protein information:

Stox1 storkhead box 1 [Mus musculus (house mouse)]
Gene ID: 216021, updated on 3-Sep-2019

Gene summary

Official Symbol: Stox1 (provided by MGI)
Official Full Name: storkhead box 1 (provided by MGI)
Primary source: MGI:MGI:2684909
See related: Ensembl:ENSMUSG00000036923
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm63; 4732470K04Rik
Expression: Biased expression in testis adult (RPKM 3.2), CNS E18 (RPKM 0.8) and 13 other tissues
Orthologs: human, all

Genomic context

Location: 10; 10 B4

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (62659043..62726094, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (62122170..62188847, complement)

Chromosome 10 - NC_000076.6


Transcript information: This gene has 3 transcripts

Gene: Stox1 ENSMUSG00000036923

Description: storkhead box 1 [Source:MGI Symbol;Acc:MGI:2684909]
Gene Synonyms: 4732470K04Rik
Location: 62,659,043-62,726,128 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 3 transcripts (splice variants), 189 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype                  CCDS       UniProt  Flags
Stox1-202  ENSMUST00000133371.7  3510  990aa    ENSMUSP00000114652.1  Protein coding           CCDS48580  B2RQL2   TSL:1, GENCODE basic, APPRIS P1
Stox1-203  ENSMUST00000148720.7  1012  136aa    ENSMUSP00000116180.1  Protein coding           -          F6QZ35   CDS 5' incomplete, TSL:3
Stox1-201  ENSMUST00000126979.1  507   39aa     ENSMUSP00000114348.1  Nonsense mediated decay  -          F6YZ72   CDS 5' incomplete, TSL:3

[Figure: Ensembl genome browser view of an 87.09 kb region (62.66 Mb to 62.72 Mb). Forward strand: Gm18514-201 (processed pseudogene), Gm47261-201 (processed pseudogene), Mir7215-201 (miRNA). Contig: AC122539.4. Reverse strand: Ddx50-201 (protein coding), Ddx50-206 (lncRNA), Ddx50-203 (retained intron), Ddx50-204 (retained intron), Stox1-201 (nonsense mediated decay), Stox1-202 (protein coding), Stox1-203 (protein coding). Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript, pseudogene).]


Transcript: ENSMUST00000133371

[Figure: Ensembl protein view of Stox1-202 (protein coding; reverse strand, 67.09 kb) for ENSMUSP00000114... Tracks: MobiDB lite; low complexity (Seg); Pfam: Storkhead-box protein, winged-helix domain; PANTHER: Storkhead-box protein 1/2 (PTHR22437:SF1); sequence variants (dbSNP and all other sources), with missense and synonymous variants marked. Scale bar: 0 to 990.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
