https://www.alphaknockout.com

Mouse Steap4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Steap4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Steap4 gene (NCBI Reference Sequence: NM_054098; Ensembl: ENSMUSG00000012428) is located on mouse chromosome 5. Five exons have been identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 5 (Transcript: ENSMUST00000115421). Exons 2~3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Steap4 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-212F6 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit adipose accumulation, oxidative stress, increased liver weight, lower metabolic rate, hypoactivity, insulin resistance, glucose intolerance, mild hyperglycemia and dyslipidemia.

Exon 2 contains the ATG start codon, so the cKO region begins at the start of the coding region. The knockout of exons 2~3 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 14897 bp, and the size of intron 3 for 3'-loxP site insertion is 1385 bp. The size of the effective cKO region is ~2082 bp. The cKO region does not contain any other known gene.
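To illustrate how the conditional design works, the following minimal Python sketch models Cre-mediated excision of a loxP-flanked segment. It is a toy model only: the flanking and cKO strings are hypothetical placeholders, not Steap4 sequence, and the function name `cre_excise` is made up for this example. It assumes both loxP sites are in the same orientation, so recombination deletes the intervening segment and leaves a single loxP site behind.

```python
# Canonical 34 bp loxP site: 13 bp repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Delete the segment between the first two same-orientation loxP sites.

    One copy of the site is retained, as in a Cre-recombined allele.
    """
    first = allele.find(site)
    if first == -1:
        return allele  # no loxP site present
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele  # only one site: nothing to recombine
    return allele[:first] + allele[second:]


# Hypothetical toy sequences standing in for the real genomic segments.
intron1_upstream = "aaaa"      # intron 1, 5' of the first loxP
intron1_downstream = "cccc"    # intron 1, 3' of the first loxP
cko_region = "GGGGGGGG"        # stands in for exons 2~3
intron3_upstream = "tttt"      # intron 3, 5' of the second loxP
intron3_downstream = "acgt"    # intron 3, 3' of the second loxP

targeted_allele = (
    intron1_upstream + LOXP + intron1_downstream
    + cko_region
    + intron3_upstream + LOXP + intron3_downstream
)
ko_allele = cre_excise(targeted_allele)

print(len(targeted_allele), len(ko_allele))
print(cko_region in ko_allele)  # False: the floxed segment is gone
```

In this sketch the targeted allele still carries the cKO region, while the recombined allele keeps only the outer flanks and one loxP site.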


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels, top to bottom: wildtype allele (5' to 3', exons 1 to 5, with the two gRNA regions marked); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Steap4; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
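The self-alignment described in the note can be sketched as follows. This is a minimal illustration of a self dot plot with a 10 bp window, not the tool used to produce the figure; the function names and the toy sequence are assumptions for the example. A forward dot at (i, j) means the windows starting at i and j are identical; a reverse-complement dot means one window matches the reverse complement of the other.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward: pairs i < j whose windows are identical (off-diagonal repeats).
    reverse: pairs (i, j) where window j equals the reverse complement of
             window i (both orders are reported, plus i == j palindromes).
    """
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward = [
        (i, j)
        for positions in index.values()
        for i in positions
        for j in positions
        if i < j
    ]
    reverse = [
        (i, j)
        for i in range(last_start + 1)
        for j in index.get(revcomp(seq[i:i + window]), [])
    ]
    return sorted(forward), sorted(reverse)


# Hypothetical toy sequence: a 12 bp unit repeated after a 5 bp spacer.
unit = "ACGTTGCAAGGC"
toy = unit + "TTTTT" + unit
forward, reverse = self_dot_plot(toy, window=10)
print(forward)
```

A run of forward dots along one off-diagonal, as produced by the repeated unit here, is the pattern that indicates a direct repeat; when the repeated copies are adjacent, it is a tandem repeat.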

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(8584bp) | A(29.4% 2524) | C(20.51% 1761) | T(29.49% 2531) | G(20.6% 1768)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
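The summary figures above can be checked with a few lines of Python, and the windowed analysis can be sketched in the same way. The base counts are taken from the summary line; the function `gc_profile`, its `step` parameter and the toy sequence are assumptions for illustration, since the report does not state how the windows are advanced.

```python
# Base counts from the summary line of this report.
counts = {"A": 2524, "C": 1761, "T": 2531, "G": 1768}
total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total

print(total)                 # 8584, matching the reported full length
print(round(gc_percent, 2))  # 41.11


def gc_profile(seq: str, window: int = 300, step: int = 100):
    """Return (start, gc_fraction) for each full window along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, gc / window))
    return profile


# Hypothetical toy sequence: a GC-only block followed by an AT-only block.
toy = "GC" * 150 + "AT" * 150
print(gc_profile(toy, window=300, step=300))
```

The overall GC content of about 41% is consistent with the note that no high-GC region was found, although the per-window values depend on the actual sequence, which is not reproduced here.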


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   +       7972190    7975189    3000
YourSeq  92     1629   2149  3000   94.3%     chr9   +       100969370  101014316  44947
YourSeq  86     2023   2157  3000   80.0%     chr5   +       130193520  130193650  131
YourSeq  85     2037   2154  3000   84.3%     chr13  +       61922050   61922163   114
YourSeq  82     2022   2157  3000   84.5%     chr10  -       112955902  112956032  131
YourSeq  79     2027   2154  3000   86.3%     chr9   +       75213971   75214096   126
YourSeq  79     2022   2157  3000   84.7%     chr10  +       126989776  126989907  132
YourSeq  76     2042   2158  3000   85.9%     chr5   -       122664164  122664276  113
YourSeq  76     2022   2157  3000   91.2%     chr2   -       144237549  144237864  316
YourSeq  76     2036   2157  3000   91.4%     chr1   +       36528827   36639507   110681
YourSeq  74     1626   2153  3000   73.2%     chr9   -       109936499  109936898  400
YourSeq  73     2037   2161  3000   84.5%     chr17  -       38461908   38462025   118
YourSeq  72     1583   1691  3000   93.0%     chr19  -       3960036    3960431    396
YourSeq  72     2040   2157  3000   85.1%     chr1   -       150683174  150683287  114
YourSeq  72     2060   2161  3000   93.9%     chr9   +       20753332   20753435   104
YourSeq  72     2028   2157  3000   92.9%     chr2   +       25535309   25535533   225
YourSeq  71     2042   2157  3000   88.3%     chrX   +       32697470   32697583   114
YourSeq  69     2042   2157  3000   85.9%     chr8   -       123382159  123382267  109
YourSeq  69     2036   2171  3000   93.8%     chr15  +       87045581   87045732   152
YourSeq  64     2042   2157  3000   87.2%     chrX   +       32022075   32022188   114

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   +       7977274    7980273    3000
YourSeq  298    2330   2632  3000   99.4%     chr5   +       7979443    7979762    320
YourSeq  286    2145   2435  3000   99.4%     chr5   +       7979561    7979868    308
YourSeq  138    2490   2632  3000   98.7%     chr5   +       7979443    7979602    160
YourSeq  129    2145   2579  3000   92.8%     chr4   +       100313692  100314286  595
YourSeq  117    2145   2579  3000   93.4%     chr4   +       100313129  100313877  749
YourSeq  117    2206   2579  3000   93.4%     chr4   +       100313366  100314197  832
YourSeq  115    2145   2338  3000   96.8%     chr5   +       7979521    7979811    291
YourSeq  105    2145   2579  3000   93.4%     chr4   +       100313866  100314452  587
YourSeq  103    2182   2439  3000   83.8%     chr7   +       132654816  132655049  234
YourSeq  99     2164   2618  3000   92.4%     chr14  -       16459203   16459659   457
YourSeq  91     2164   2618  3000   77.6%     chr14  -       16459175   16459407   233
YourSeq  90     2444   2632  3000   97.9%     chr5   +       7979517    7979722    206
YourSeq  84     2226   2618  3000   82.4%     chr5   -       148920276  148920645  370
YourSeq  81     2183   2592  3000   81.8%     chr13  -       96513634   96514003   370
YourSeq  80     2164   2475  3000   77.7%     chr14  -       16459175   16459379   205
YourSeq  79     2443   2620  3000   92.3%     chr7   +       132654808  132654993  186
YourSeq  76     2222   2395  3000   93.0%     chr7   +       132654870  132655045  176
YourSeq  73     2206   2579  3000   83.2%     chr4   +       100313690  100314044  355
YourSeq  67     2260   2613  3000   71.5%     chr11  -       52625216   52625321   106

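Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found. The additional high-identity chr5 hits (7979443 to 7979868) lie within the query interval itself, which is consistent with the tandem repeats seen in the dot plot.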
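The coordinates of the two 100% hits can be used as a rough cross-check of the reported cKO region size. The sketch below assumes the BLAT coordinates are 1-based and inclusive, and that the two 3000 bp queries directly abut Exon 2 and Exon 3 as the notes describe; the variable names are made up for the example. The report does not say how the "effective" cKO size is measured, so a small difference is expected.

```python
# chr5 coordinates of the 100% identity hit in each BLAT table
# (assumed 1-based, inclusive).
UP_FLANK = (7972190, 7975189)    # 3000 bp upstream of Exon 2
DOWN_FLANK = (7977274, 7980273)  # 3000 bp downstream of Exon 3
REPORTED_CKO_BP = 2082           # "~2082 bp" from the strategy section


def length(interval):
    """Length in bp of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


# Interval lying between the two flanks, i.e. Exon 2 through Exon 3.
between = (UP_FLANK[1] + 1, DOWN_FLANK[0] - 1)

print(length(UP_FLANK), length(DOWN_FLANK))  # 3000 3000
print(between, length(between))              # (7975190, 7977273) 2084
print(length(between) - REPORTED_CKO_BP)     # 2
```

Under these assumptions the Exon 2 to Exon 3 interval is 2084 bp, within 2 bp of the reported effective cKO region size.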


Gene and information:

Steap4 STEAP family member 4 [ Mus musculus (house mouse) ]
Gene ID: 117167, updated on 10-Oct-2019

Gene summary

Official Symbol: Steap4 (provided by MGI)
Official Full Name: STEAP family member 4 (provided by MGI)
Primary source: MGI:MGI:1923560
See related: Ensembl:ENSMUSG00000012428
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Tiarp; Tnfaip9; AI481214; 1110021O17Rik
Expression: Biased expression in subcutaneous fat pad adult (RPKM 70.3), mammary gland adult (RPKM 51.7) and 9 other tissues

Genomic context

Location: 5; 5 A1
Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (7960472..7982213)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (7960472..7982213)

Chromosome 5 - NC_000071.6


Transcript information: This gene has 1 transcript

Gene: Steap4 ENSMUSG00000012428

Description: STEAP family member 4 [Source:MGI Symbol;Acc:MGI:1923560]
Gene Synonyms: Tiarp, Tnfaip9
Location: Chromosome 5: 7,960,457-7,982,213 forward strand. GRCm38:CM000998.2
About this gene: This gene has 1 transcript (splice variant), 242 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 19 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Steap4-201  ENSMUST00000115421.2  3155  470aa    ENSMUSP00000111081.1  Protein coding  CCDS39008  Q923B6   TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl genome browser view of a 41.76 kb region of chromosome 5 (7.96 Mb to 7.99 Mb). Forward strand: Steap4-201 (protein coding). Contigs: AC133514.4. Reverse strand: Gm30835-201 (lncRNA). Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (RNA gene).]


Transcript: ENSMUST00000115421

[Figure: Ensembl transcript view of Steap4-201 (protein coding), 21.76 kb, forward strand, with protein domain tracks for ENSMUSP00000111... on a scale of 0 to 470 amino acids.]

Protein features shown in the figure:
Transmembrane heli...
Low complexity (Seg)
Superfamily: NAD(P)-binding domain superfamily
Pfam: Pyrroline-5-carboxylate reductase, catalytic, N-terminal; Ferric reductase transmembrane component-like domain
PANTHER: PTHR14239:SF5; PTHR14239
Gene3D: 3.40.50.720
Sequence variants (dbSNP and all other sources): missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
