https://www.alphaknockout.com

Mouse Tmprss3 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tmprss3 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tmprss3 gene (NCBI Reference Sequence: NM_001163776.1; Ensembl: ENSMUSG00000024034) is located on mouse chromosome 17. Thirteen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000114549). Exons 3 to 5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tmprss3 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP24-374L6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for an ENU-induced allele exhibit early onset deafness and disrupted vestibular function associated with hair cell degeneration.

Exon 3 starts at about 11.3% of the coding region. Knockout of exons 3 to 5 will result in a frameshift of the gene. Intron 2, used for 5'-loxP site insertion, is 2592 bp; intron 5, used for 3'-loxP site insertion, is 2327 bp. The effective cKO region is ~2055 bp. The cKO region does not contain any other known gene.
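The frameshift statement above rests on simple arithmetic: removing exons shifts the reading frame when the total coding length removed is not a multiple of 3. A minimal sketch of that check follows. The exon lengths used are hypothetical placeholders, since this report does not list the individual exon sizes, and `causes_frameshift` is an illustrative name only.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting exons with these coding lengths (in bp)
    shifts the downstream reading frame."""
    return sum(deleted_coding_lengths) % 3 != 0


# Hypothetical coding lengths for three deleted exons (NOT taken from the report).
example_lengths = [100, 120, 90]          # total 310 bp, remainder 1
in_frame_lengths = [99, 120, 90]          # total 309 bp, remainder 0

print(causes_frameshift(example_lengths))   # frame is shifted
print(causes_frameshift(in_frame_lengths))  # frame is preserved
```

With the real coding lengths of exons 3 to 5 from the transcript annotation, the same check would confirm or refute the frameshift claim.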


Overview of the Targeting Strategy

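[Figure: schematic of the targeting strategy, showing four rows from 5' to 3': the wildtype allele (exons 1, 3, 4, 5 and 13 labeled, with the two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Tmprss3; homology arm; cKO region; loxP site.]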


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
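The self-alignment described in this note can be sketched as a search for identical windows at two different positions in the same sequence. The example below is a minimal illustration only: it uses the 10 bp window size stated above, considers forward matches but not reverse-complement matches, and runs on a made-up toy sequence rather than the actual 8555 bp region. The function name `self_matches` is hypothetical and is not the tool used to produce the report.

```python
from collections import defaultdict


def self_matches(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where the window-length substrings
    starting at i and j are identical (the off-diagonal dots of a dot plot)."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy sequence (an assumption for illustration): a 10 bp unit repeated in tandem.
unit = "ACGTTGCAAG"
toy = "TTT" + unit * 2 + "CCC"
print(self_matches(toy))
```

Pairs whose offset `j - i` equals the repeat unit length indicate a tandem repeat; a sequence with no repeated 10-mers returns an empty list.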

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (8555 bp) | A (23.05%, 1972) | C (23.4%, 2002) | T (26.71%, 2285) | G (26.84%, 2296)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
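The overall GC content follows directly from the base counts in the summary line. The sketch below recomputes it from those counts and also shows one simple way a windowed GC profile could be calculated. The helper `windowed_gc` and its step of 1 bp are assumptions for illustration; the report states only the 300 bp window size, not how the plot was generated.

```python
# Base counts copied from the summary line of this report.
counts = {"A": 1972, "C": 2002, "T": 2285, "G": 2296}

total = sum(counts.values())
gc_fraction = (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, GC = {gc_fraction:.2%}")


def windowed_gc(seq, window=300):
    """GC fraction of every window of the given size, sliding 1 bp at a time
    (hypothetical helper; step size is an assumption)."""
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Toy usage with a short made-up sequence and a 4 bp window.
print(windowed_gc("GGCCAATT", window=4))
```

The counts sum to the stated full length of 8555 bp, and the overall GC fraction comes out close to 50%, consistent with the note that no high-GC region is present.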


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  -       31195259   31198258   3000
YourSeq  98     2028   2201  3000   75.8%     chr4   +       123791225  123791393  169
YourSeq  96     2028   2391  3000   88.8%     chr4   +       105913806  105914381  576
YourSeq  88     1908   2176  3000   75.9%     chr13  -       69114074   69114257   184
YourSeq  86     1940   2182  3000   90.2%     chr8   +       25134817   25135058   242
YourSeq  85     1940   2190  3000   89.0%     chrX   -       101763688  101763961  274
YourSeq  85     2028   2182  3000   77.5%     chr1   -       132672264  132672418  155
YourSeq  83     2028   2190  3000   75.5%     chr12  -       24970746   24970908   163
YourSeq  81     2028   2176  3000   77.2%     chr12  -       70426881   70427029   149
YourSeq  79     2028   2192  3000   90.8%     chr3   -       100349376  100349673  298
YourSeq  77     1940   2125  3000   89.6%     chr1   -       35994298   35994678   381
YourSeq  76     1941   2185  3000   87.8%     chrX   +       99775503   99775749   247
YourSeq  76     1935   2096  3000   89.7%     chr6   +       51276829   51277001   173
YourSeq  75     2028   2191  3000   75.5%     chr4   -       100002660  100002825  166
YourSeq  75     2028   2176  3000   87.2%     chr3   -       152098717  152098879  163
YourSeq  75     2028   2190  3000   88.7%     chr8   +       117106478  117106654  177
YourSeq  73     1970   2190  3000   92.0%     chr17  -       27722525   27722757   233
YourSeq  73     2038   2173  3000   86.6%     chr1   -       71369621   71369756   136
YourSeq  73     2028   2176  3000   91.1%     chr1   -       35342685   35342833   149
YourSeq  73     1940   2096  3000   77.5%     chr1   +       135775585  135775750  166

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  -       31190204   31193203   3000
YourSeq  216    2491   2946  3000   88.6%     chr10  +       32046755   32047192   438
YourSeq  215    2491   2960  3000   93.8%     chr1   +       59092696   59406712   314017
YourSeq  196    2553   2987  3000   92.5%     chr10  +       60916684   61233756   317073
YourSeq  191    2535   2984  3000   90.5%     chr1   -       185078458  185079058  601
YourSeq  174    2520   2993  3000   84.7%     chr10  +       59177643   59177982   340
YourSeq  169    2548   2933  3000   93.4%     chr10  -       44142691   44405504   262814
YourSeq  165    2604   2928  3000   92.2%     chr1   +       190182907  190549238  366332
YourSeq  163    2563   2944  3000   90.6%     chr4   -       46354378   46354984   607
YourSeq  159    2625   2931  3000   93.2%     chr13  -       113692643  113692974  332
YourSeq  152    2526   2911  3000   85.8%     chr8   +       34128650   34129016   367
YourSeq  152    2624   2987  3000   94.4%     chr1   +       38830886   38840823   9938
YourSeq  149    2592   2987  3000   92.3%     chr5   +       119975849  119976266  418
YourSeq  147    2468   2950  3000   86.6%     chr13  -       52848893   52849310   418
YourSeq  145    2499   2946  3000   91.1%     chr10  +       17560741   17561209   469
YourSeq  142    2494   3000  3000   82.5%     chr1   -       16342691   16343039   349
YourSeq  142    2580   2965  3000   84.1%     chr12  +       5490358    5490607    250
YourSeq  140    2491   3000  3000   82.2%     chr10  +       19281853   19282179   327
YourSeq  138    2506   3000  3000   83.3%     chr12  +       31393592   31393865   274
YourSeq  137    2485   2901  3000   83.0%     chr10  -       115697214  115697484  271

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
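One way to read the BLAT tables is to compare the best secondary hit with the full-length self match. The sketch below parses a few rows copied from the downstream table and reports that ratio. The field names, the helper names `parse_blat_row` and `best_secondary_ratio`, and the idea of using a score ratio are assumptions made for illustration; the report does not state the criterion behind "no significant similarity".

```python
def parse_blat_row(line):
    """Split one whitespace-separated BLAT summary row into named fields."""
    f = line.split()
    return {
        "query": f[0],
        "score": int(f[1]),
        "q_start": int(f[2]),
        "q_end": int(f[3]),
        "q_size": int(f[4]),
        "identity": float(f[5].rstrip("%")),
        "chrom": f[6],
        "strand": f[7],
        "t_start": int(f[8]),
        "t_end": int(f[9]),
        "span": int(f[10]),
    }


def best_secondary_ratio(hits):
    """Score of the best hit other than the full-length self match,
    as a fraction of the query size."""
    secondary = [h for h in hits if h["score"] < h["q_size"]]
    return max(h["score"] for h in secondary) / hits[0]["q_size"]


# First three rows of the downstream table in this report.
ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr17 - 31190204 31193203 3000",
    "YourSeq 216 2491 2946 3000 88.6% chr10 + 32046755 32047192 438",
    "YourSeq 215 2491 2960 3000 93.8% chr1 + 59092696 59406712 314017",
]
hits = [parse_blat_row(r) for r in ROWS]
print(f"best secondary hit covers {best_secondary_ratio(hits):.1%} of the query score")
```

For these rows the best secondary score is 216 against a query size of 3000, so the strongest off-target match is a small fraction of the self match.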


Gene and protein information:

Tmprss3 transmembrane protease, serine 3 [Mus musculus (house mouse)]
Gene ID: 140765, updated on 26-Jun-2020

Gene summary

Official Symbol: Tmprss3 (provided by MGI)
Official Full Name: transmembrane protease, serine 3 (provided by MGI)
Primary source: MGI:MGI:2155445
See related: Ensembl:ENSMUSG00000024034
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Expression: Low expression observed in reference dataset
Orthologs: human; all

Genomic context

Location: 17; 17 A3.3
Exon count: 12

Annotation release  Status             Assembly                       Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)   17   NC_000083.6 (31179263..31200504, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     17   NC_000083.5 (31316210..31335919, complement)

[Figure: genomic context of Tmprss3 on chromosome 17 (NC_000083.6).]


Transcript information: This gene has 4 transcripts

Gene: Tmprss3 ENSMUSG00000024034

Description: transmembrane protease, serine 3 [Source:MGI Symbol;Acc:MGI:2155445]
Location: Chromosome 17: 31,179,265-31,198,977 reverse strand. GRCm38:CM001010.2
About this gene: This gene has 4 transcripts (splice variants), 287 orthologues, 20 paralogues, is a member of 1 Ensembl protein family and is associated with 23 phenotypes.

Transcripts

Name         Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt        Flags
Tmprss3-202  ENSMUST00000114549.3   2881  475aa    ENSMUSP00000110196.2  Protein coding  CCDS50053  Q3TZ06 Q8K1T0  TSL:1 GENCODE basic
Tmprss3-201  ENSMUST00000024833.12  2738  453aa    ENSMUSP00000024833.5  Protein coding  CCDS37547  Q8K1T0         TSL:1 GENCODE basic APPRIS P2
Tmprss3-203  ENSMUST00000236793.1   2674  453aa    ENSMUSP00000158048.1  Protein coding  CCDS37547  Q8K1T0         GENCODE basic APPRIS P2
Tmprss3-204  ENSMUST00000237740.1   2403  451aa    ENSMUSP00000158320.1  Protein coding  -          A0A494BB54     GENCODE basic APPRIS ALT1

[Figure: Ensembl genome browser view of a 39.71 kb region of chromosome 17 (31.17 Mb to 31.20 Mb). Forward strand: Ubash3a-202 (nonsense mediated decay) and Ubash3a-208, Ubash3a-205 and Ubash3a-201 (protein coding). Contig: AC167247.2. Reverse strand: Tmprss3-202, Tmprss3-201, Tmprss3-203 and Tmprss3-204 (all protein coding). A Regulatory Build track is shown. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000114549

[Figure: protein domain and feature map for Tmprss3-202 (protein coding; reverse strand, 19.71 kb; protein ENSMUSP00000110...; scale 0 to 475 aa). Tracks shown:
- Transmembrane helices; low complexity (Seg)
- Superfamily: SRCR-like domain superfamily; LDL receptor-like superfamily; Peptidase S1, PA clan
- SMART: Low-density lipoprotein (LDL) receptor class A repeat; SRCR-like domain; Serine proteases, trypsin domain
- Prints: Peptidase S1A, chymotrypsin family
- Pfam: Low-density lipoprotein (LDL) receptor class A repeat; SRCR-like domain; Serine proteases, trypsin domain
- PROSITE profiles: Low-density lipoprotein (LDL) receptor class A repeat; SRCR domain; Serine proteases, trypsin domain
- PROSITE patterns: Low-density lipoprotein (LDL) receptor class A, conserved site; Serine proteases, trypsin family, serine active site; Serine proteases, trypsin family, histidine active site
- PANTHER: PTHR24253; PTHR24253:SF86
- Gene3D: LDL receptor-like superfamily; 2.40.10.10; SRCR-like domain superfamily
- CDD: Low-density lipoprotein (LDL) receptor class A repeat; Serine proteases, trypsin domain
- Sequence variants (dbSNP and all other sources). Variant legend: frameshift variant; missense variant; splice region variant; synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
