https://www.alphaknockout.com

Mouse Hsd17b3 Knockout Project (CRISPR/Cas9)

Objective: To create an Hsd17b3 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hsd17b3 gene (NCBI Reference Sequence: NM_008291; Ensembl: ENSMUSG00000033122) is located on mouse chromosome 13. Eleven exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 11 (Transcript: ENSMUST00000166224). Exons 3 and 4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 20.77% of the coding region. Exons 3 and 4 cover 20.11% of the coding region. The size of the effective KO region is ~2587 bp. The KO region does not contain any other known gene.
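As a rough cross-check of the percentages in the note, the following Python sketch back-calculates nucleotide counts. It assumes the coding region is 305 codons x 3 = 915 nt, taken from the 305 aa protein length listed for ENSMUST00000166224 with the stop codon excluded. That assumption and the function names are illustrative and do not come from the report.

```python
# Assumption (not stated in the report): CDS length = 305 codons * 3 nt,
# stop codon excluded.
CDS_LENGTH_NT = 305 * 3


def percent_to_nt(percent: float, cds_length: int = CDS_LENGTH_NT) -> int:
    """Convert a percentage of the coding region to the nearest whole nt count."""
    return round(percent / 100 * cds_length)


def causes_frameshift(deleted_nt: int) -> bool:
    """A deletion shifts the reading frame when its length is not a multiple of 3."""
    return deleted_nt % 3 != 0


exon3_offset_nt = percent_to_nt(20.77)  # coding nt preceding exon 3
deleted_nt = percent_to_nt(20.11)       # coding nt carried by exons 3 and 4
print(exon3_offset_nt, deleted_nt, causes_frameshift(deleted_nt))
```

Under this assumption, exons 3 and 4 carry about 184 nt of coding sequence, which is not a multiple of three, so their deletion would be expected to shift the reading frame. This is an estimate from rounded percentages, not a figure given in the report.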


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 3, 4 and 11 labelled and two gRNA regions marked. Legend: exon of mouse Hsd17b3; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot showing forward and reverse-complement matches; axis label: Sequence 12]

Note: The 2000 bp section upstream of exon 3 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
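The self-alignment behind a dot plot can be sketched as exact matching of fixed-size windows. The Python example below is a minimal illustration, not the tool used for this report: the function names are hypothetical, and only the 15 bp default window is taken from the plot settings. A dot is recorded wherever two windows are identical (forward) or one window equals the reverse complement of another.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def _index_windows(seq: str, window: int) -> dict[str, list[int]]:
    """Map every window-sized substring to its start positions."""
    index: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)
    return index


def forward_matches(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Pairs (i, j), i < j, whose windows are identical (off-diagonal dots)."""
    matches = []
    for starts in _index_windows(seq, window).values():
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                matches.append((i, j))
    return sorted(matches)


def revcomp_matches(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Pairs (i, j) where the window at i is the reverse complement of the one at j."""
    index = _index_windows(seq, window)
    matches = []
    for j in range(len(seq) - window + 1):
        rc = reverse_complement(seq[j:j + window])
        matches.extend((i, j) for i in index.get(rc, []))
    return sorted(matches)


# Toy input: three tandem copies of a 7 bp unit, scanned with a 7 bp window.
tandem = "GATTACA" * 3
print(forward_matches(tandem, window=7))
```

A tandem repeat shows up as runs of off-diagonal matches parallel to the main diagonal. A sequence without repeats at the chosen window size returns an empty list.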

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot showing forward and reverse-complement matches; axis label: Sequence 12]

Note: The 1785 bp section downstream of exon 4 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12]

Summary: Full length (2000 bp) | A (25.25%, 505) | C (21.45%, 429) | T (31.45%, 629) | G (21.85%, 437)

Note: The 2000 bp section upstream of exon 3 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
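The GC analysis can be illustrated with a short Python sketch. The base counts are copied from the two summary lines of this report. The function names, the step parameter and the toy sequence are illustrative assumptions, since the report does not say which tool produced the plots.

```python
def overall_gc_percent(counts: dict[str, int]) -> float:
    """GC percentage from per-base counts, rounded to two decimals."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage of each window along the sequence (sliding-window profile)."""
    seq = seq.upper()
    profile = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        profile.append((chunk.count("G") + chunk.count("C")) / window * 100)
    return profile


# Base counts taken from the summary lines (upstream 2000 bp, downstream 1785 bp).
upstream_counts = {"A": 505, "C": 429, "T": 629, "G": 437}
downstream_counts = {"A": 458, "C": 388, "T": 559, "G": 380}

print(overall_gc_percent(upstream_counts))    # 43.3
print(overall_gc_percent(downstream_counts))  # 43.03

# Toy sequence to show the window profile; the real input would be the flank sequence.
print(gc_windows("GGCCAATT", window=4, step=2))  # [100.0, 50.0, 0.0]
```

Both flanks come out at roughly 43% GC overall, which agrees with the notes that no high-GC region was found. The windowed profile is what the plots would display at a 300 bp window.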

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12]

Summary: Full length (1785 bp) | A (25.66%, 458) | C (21.74%, 388) | T (31.32%, 559) | G (21.29%, 380)

Note: The 1785 bp section downstream of exon 4 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr13  -       64076424   64078423   2000
YourSeq  168    42     703   2000   87.4%     chr1   -       4757905    5000202    242298
YourSeq  121    48     680   2000   89.6%     chr11  +       101693611  101711697  18087
YourSeq  107    37     167   2000   93.6%     chrX   -       36679685   36679819   135
YourSeq  107    4      179   2000   84.4%     chr15  +       89930246   89930403   158
YourSeq  95     37     185   2000   90.6%     chr13  +       89351446   89351604   159
YourSeq  91     47     178   2000   90.4%     chr2   -       154762604  154762733  130
YourSeq  88     48     178   2000   88.5%     chr11  +       30400321   30400448   128
YourSeq  86     192    708   2000   74.6%     chr14  +       30193569   30193977   409
YourSeq  85     42     172   2000   89.5%     chr12  +       3564117    3564246    130
YourSeq  83     42     177   2000   90.4%     chr1   -       63385842   63385978   137
YourSeq  81     38     252   2000   93.7%     chr7   +       143889552  143889977  426
YourSeq  79     54     353   2000   71.9%     chr2   +       157156454  157156565  112
YourSeq  79     27     139   2000   91.9%     chr1   +       180543639  180543754  116
YourSeq  78     195    673   2000   72.9%     chr11  +       73163533   73163886   354
YourSeq  75     510    695   2000   78.5%     chr5   -       63997465   63997651   187
YourSeq  74     71     183   2000   96.4%     chr11  +       120702943  120703068  126
YourSeq  74     54     176   2000   91.7%     chr11  +       64380634   64380754   121
YourSeq  74     42     172   2000   92.1%     chr10  +       55387568   55387698   131
YourSeq  72     510    698   2000   85.8%     chr6   -       30894336   30894523   188

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1785   1      1785  1785   100.0%    chr13  -       64072052   64073836   1785
YourSeq  267    832    1180  1785   90.9%     chr3   +       103162497  103162868  372
YourSeq  266    836    1179  1785   90.7%     chr11  -       76068637   76069061   425
YourSeq  265    833    1180  1785   88.9%     chr12  +       28584626   28584980   355
YourSeq  263    829    1182  1785   90.3%     chr9   +       60850898   60851258   361
YourSeq  261    832    1180  1785   89.5%     chr19  +       46115282   46115640   359
YourSeq  260    831    1178  1785   91.2%     chr9   -       44557069   44557421   353
YourSeq  258    836    1259  1785   90.6%     chr8   +       86907867   86908445   579
YourSeq  258    860    1503  1785   84.5%     chr6   +       52710611   52711189   579
YourSeq  257    844    1528  1785   86.3%     chr5   -       146820061  146820719  659
YourSeq  257    830    1180  1785   88.5%     chr11  +       87138504   87138875   372
YourSeq  254    839    1528  1785   83.1%     chrX   -       157544272  157544622  351
YourSeq  254    836    1200  1785   86.5%     chr8   -       84849070   84849426   357
YourSeq  252    840    1180  1785   88.7%     chr9   -       25632920   25633405   486
YourSeq  250    841    1177  1785   88.2%     chr4   -       11304906   11305266   361
YourSeq  249    735    1468  1785   83.4%     chr18  -       34573201   34573926   726
YourSeq  248    836    1157  1785   89.5%     chr4   -       137066186  137066522  337
YourSeq  246    836    1177  1785   91.6%     chr5   +       110899324  110899859  536
YourSeq  246    838    1261  1785   91.0%     chr13  +       24577902   24578497   596
YourSeq  245    830    1157  1785   92.4%     chr17  -       35604542   35604887   346

Note: The 1785 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
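The top hit in each BLAT table places the corresponding flank on chr13, so the interval between the two flanks can be compared with the stated KO region size. The Python sketch below assumes the coordinates are 1-based and inclusive; the Hit type and function names are illustrative. Because the gene lies on the reverse strand, the "upstream" flank has the higher genomic coordinates.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    chrom: str
    strand: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


# Top (100% identity) hits copied from the two BLAT tables.
upstream_flank = Hit("chr13", "-", 64076424, 64078423)
downstream_flank = Hit("chr13", "-", 64072052, 64073836)


def span(hit: Hit) -> int:
    """Length of an inclusive interval."""
    return hit.end - hit.start + 1


def gap_between(a: Hit, b: Hit) -> int:
    """Number of bases strictly between two non-overlapping hits on one chromosome."""
    if a.chrom != b.chrom:
        raise ValueError("hits are on different chromosomes")
    left, right = sorted((a, b), key=lambda h: h.start)
    if right.start <= left.end:
        raise ValueError("hits overlap")
    return right.start - left.end - 1


print(span(upstream_flank), span(downstream_flank))
print(gap_between(upstream_flank, downstream_flank))
```

With these coordinates the flank lengths are 2000 bp and 1785 bp, and the interval between them is 2587 bp, which matches the effective KO region size given in the strategy summary.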


Gene and protein information:
Hsd17b3 hydroxysteroid (17-beta) dehydrogenase 3 [Mus musculus (house mouse)]
Gene ID: 15487, updated on 15-Oct-2019

Gene summary

Official Symbol: Hsd17b3 (provided by MGI)
Official Full Name: hydroxysteroid (17-beta) dehydrogenase 3 (provided by MGI)
Primary source: MGI:MGI:107177
See related: Ensembl:ENSMUSG00000033122
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Expression: Restricted expression toward testis adult (RPKM 6.2)
Orthologs: human, all

Genomic context

Location: 13; 13 B3

Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  13   NC_000079.6 (64058274..64089262, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (64159582..64190509, complement)

Chromosome 13 - NC_000079.6


Transcript information: This gene has 5 transcripts

Gene: Hsd17b3 ENSMUSG00000033122

Description: hydroxysteroid (17-beta) dehydrogenase 3 [Source:MGI Symbol;Acc:MGI:107177]
Gene Synonyms: 17(beta)HSD type 3
Location: Chromosome 13: 64,058,266-64,089,230 reverse strand. GRCm38:CM001006.2
About this gene: This gene has 5 transcripts (splice variants), 178 orthologues, 31 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt     Flags
Hsd17b3-202  ENSMUST00000166224.7  1286  305aa    ENSMUSP00000132011.1  Protein coding  CCDS26594  P70385      TSL:2 GENCODE basic APPRIS P1
Hsd17b3-204  ENSMUST00000222783.1  1184  305aa    ENSMUSP00000152848.1  Protein coding  CCDS26594  P70385      TSL:5 GENCODE basic APPRIS P1
Hsd17b3-201  ENSMUST00000039832.6  1131  305aa    ENSMUSP00000044217.6  Protein coding  CCDS26594  P70385      TSL:1 GENCODE basic APPRIS P1
Hsd17b3-203  ENSMUST00000221513.1  655   123aa    ENSMUSP00000152478.1  Protein coding  -          A0A1Y7VJL6  CDS 5' incomplete TSL:5
Hsd17b3-205  ENSMUST00000222810.1  653   193aa    ENSMUSP00000152274.1  Protein coding  -          A0A1Y7VJ36  CDS 3' incomplete TSL:3

[Figure: Ensembl genome browser view of a 50.97 kb region (64.05 Mb to 64.09 Mb), forward and reverse strands. Tracks: contig CT009717.9; Hsd17b3-201 to Hsd17b3-205 (protein coding); neighbouring Slc35d2-201 and Slc35d2-202 (protein coding), Slc35d2-203 (retained intron), Slc35d2-206 and Slc35d2-208 (nonsense mediated decay); Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000166224

[Figure: protein domain and variant view of Hsd17b3-202 (protein coding), reverse strand, 30.93 kb. Protein: ENSMUSP00000132... Scale bar: 0 to 305.]

Domain annotations shown in the figure:
Superfamily: NAD(P)-binding domain superfamily
Prints: Short-chain dehydrogenase/reductase SDR (two entries)
Pfam: Short-chain dehydrogenase/reductase SDR
PROSITE patterns: Short-chain dehydrogenase/reductase, conserved site
PIRSF: PIRSF000126
PANTHER: PTHR43899; Testosterone 17-beta-dehydrogenase 3
Gene3D: 3.40.50.720
CDD: cd05356

Variant track: sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
