https://www.alphaknockout.com

Mouse Tmprss11d Knockout Project (CRISPR/Cas9)

Objective: To create a Tmprss11d knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tmprss11d gene (NCBI Reference Sequence: NM_145561; Ensembl: ENSMUSG00000061259) is located on mouse chromosome 5. Ten exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 10 (Transcript: ENSMUST00000031175). Exons 4~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Aged female mice homozygous for a knock-in allele exhibit increased lymphoma incidence.

Exon 4 starts at about 19.98% of the coding region. Exons 4~6 cover 20.94% of the coding region. The size of the effective KO region is ~9677 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 10; a gRNA region lies on each side of exons 4, 5 and 6, which form the knockout region. Legend: exon of mouse Tmprss11d; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Series: Forward; Reverse Complement.]

Note: The 1486 bp section upstream of Exon 4 is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
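The self-alignment described in the note can be sketched as a word-match dot plot. This is a minimal illustration only, assuming exact matches of 15 bp words; the report does not name the tool it used or its scoring, and the function names below are hypothetical.

```python
from collections import defaultdict

# Translation table for complementing plain A/C/G/T sequence.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact matches of `window`-bp words.

    Returns two sorted lists of (i, j) word start positions (0-based):
      forward: seq[i:i+window] equals seq[j:j+window], with i < j
               (the trivial main diagonal i == j is left out)
      reverse: seq[i:i+window] equals the reverse complement of
               seq[j:j+window], with i <= j
    Runs of off-diagonal dots would indicate repeats.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for word, starts in positions.items():
        rc_starts = positions.get(reverse_complement(word), ())
        for a in starts:
            forward.extend((a, b) for b in starts if a < b)
            reverse.extend((a, b) for b in rc_starts if a <= b)
    return sorted(forward), sorted(reverse)


# Toy input (not from the report): a 15 bp unit repeated after a 5 bp spacer.
unit = "ACGGTCATTGCAGTA"
forward_hits, reverse_hits = self_dot_plot(unit + "TTTTT" + unit, window=15)
```

For a flank without repeats, both lists would be empty or contain only isolated points rather than diagonal runs.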

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Series: Forward; Reverse Complement.]

Note: The 1032 bp section downstream of Exon 6 is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(1486bp) | A(32.64% 485) | C(20.46% 304) | T(32.71% 486) | G(14.2% 211)

Note: The 1486 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(1032bp) | A(31.3% 323) | C(15.12% 156) | T(33.62% 347) | G(19.96% 206)

Note: The 1032 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
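The overall GC content of each flank follows directly from the base counts in the two summaries. The sketch below recomputes it and also shows one way a 300 bp sliding-window profile could be derived from a raw sequence; the function names are hypothetical, and the windowing scheme (step of 1 bp, exact window length) is an assumption, since the report does not describe how its plot was produced.

```python
def gc_percent(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300):
    """GC percentage of every full window of `window` bp, stepping by 1 bp."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Base counts copied from the two summary lines of the report.
upstream_counts = {"A": 485, "C": 304, "T": 486, "G": 211}    # 1486 bp
downstream_counts = {"A": 323, "C": 156, "T": 347, "G": 206}  # 1032 bp

upstream_gc = gc_percent(upstream_counts)      # (304 + 211) / 1486
downstream_gc = gc_percent(downstream_counts)  # (156 + 206) / 1032
```

Both flanks come out at roughly 35% GC, which is consistent with the notes that no high GC-content region is present.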


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1486   1      1486  1486   100.0%    chr5   -       86337295   86338780   1486
YourSeq  57     191    273   1486   82.1%     chr13  +       94700015   94700088   74
YourSeq  57     198    442   1486   92.6%     chr11  +       114817108  114817444  337
YourSeq  57     193    273   1486   89.2%     chr10  +       68703295   68703546   252
YourSeq  54     197    277   1486   95.2%     chr2   -       31450076   31450322   247
YourSeq  54     199    277   1486   95.0%     chr17  +       26046219   26046422   204
YourSeq  53     193    273   1486   85.3%     chr4   -       24062706   24062781   76
YourSeq  53     193    256   1486   95.0%     chr11  -       105501963  105502044  82
YourSeq  51     199    271   1486   91.9%     chr15  -       55314575   55314674   100
YourSeq  50     193    255   1486   92.9%     chr16  -       90641033   90641118   86
YourSeq  50     68     182   1486   75.6%     chr10  -       118529315  118529434  120
YourSeq  50     195    273   1486   93.0%     chr12  +       104925945  104926030  86
YourSeq  48     190    259   1486   94.7%     chr17  +       68904947   68905179   233
YourSeq  48     199    269   1486   87.5%     chr15  +       83962103   83962174   72
YourSeq  48     25     182   1486   94.6%     chr14  +       66080123   66331137   251015
YourSeq  48     191    252   1486   96.3%     chr1   +       178518480  178518780  301
YourSeq  47     193    256   1486   86.3%     chr10  +       25028186   25028245   60
YourSeq  46     61     184   1486   87.1%     chr10  -       109476939  109477061  123
YourSeq  46     195    257   1486   94.3%     chr1   -       125658840  125658924  85
YourSeq  46     193    259   1486   94.4%     chr7   +       34804006   34804095   90

Note: The 1486 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
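One way to read the result table is to separate the full-length self hit on chr5 from the remaining hits and compare their scores. The sketch below parses a few rows copied from the table above; the `BlatHit` class, the `parse_hit` helper and the idea of expressing the best secondary score as a fraction of the query size are illustrative assumptions, not a criterion stated in the report.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated result row (query name in column 0)."""
    f = line.split()
    return BlatHit(
        score=int(f[1]), q_start=int(f[2]), q_end=int(f[3]), q_size=int(f[4]),
        identity=float(f[5].rstrip("%")), chrom=f[6], strand=f[7],
        t_start=int(f[8]), t_end=int(f[9]), span=int(f[10]),
    )


# First four rows of the upstream BLAT table, copied from the report.
ROWS = """
YourSeq 1486 1 1486 1486 100.0% chr5 - 86337295 86338780 1486
YourSeq 57 191 273 1486 82.1% chr13 + 94700015 94700088 74
YourSeq 57 198 442 1486 92.6% chr11 + 114817108 114817444 337
YourSeq 57 193 273 1486 89.2% chr10 + 68703295 68703546 252
"""

hits = [parse_hit(line) for line in ROWS.strip().splitlines()]
self_hit = max(hits, key=lambda h: h.score)          # the query's own locus
secondary = [h for h in hits if h is not self_hit]   # everything else
best_secondary_fraction = max(h.score for h in secondary) / self_hit.q_size
```

With these rows the best secondary score is 57 against a query size of 1486, i.e. under 4% of the query length, which is in line with the note that no significant similarity is found elsewhere.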

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1032   1      1032  1032   100.0%    chr5   -       86326586   86327617   1032
YourSeq  32     997    1028  1032   100.0%    chr4   -       76313339   76313370   32
YourSeq  32     997    1028  1032   100.0%    chr19  -       12153368   12153399   32
YourSeq  32     997    1028  1032   100.0%    chr16  -       71439654   71439685   32
YourSeq  32     997    1028  1032   100.0%    chr13  -       39880059   39880090   32
YourSeq  32     996    1028  1032   100.0%    chr12  +       34359370   34359424   55
YourSeq  31     997    1027  1032   100.0%    chr8   -       31990728   31990758   31
YourSeq  31     993    1025  1032   97.0%     chr13  -       37811926   37811958   33
YourSeq  31     997    1028  1032   100.0%    chr1   +       70488560   70488594   35
YourSeq  30     997    1028  1032   96.9%     chr6   -       73545629   73545660   32
YourSeq  30     997    1028  1032   96.9%     chr16  -       83936528   83936559   32
YourSeq  30     997    1026  1032   100.0%    chr1   -       114747815  114747844  30
YourSeq  30     971    1021  1032   94.2%     chr1   -       25058375   25058475   101
YourSeq  30     999    1028  1032   100.0%    chr8   +       98592426   98592455   30
YourSeq  30     997    1032  1032   91.7%     chr6   +       91734114   91734149   36
YourSeq  29     997    1025  1032   100.0%    chr9   -       45462133   45462161   29
YourSeq  29     992    1022  1032   90.0%     chr5   -       17633640   17633669   30
YourSeq  29     997    1027  1032   96.8%     chr15  -       62391432   62391462   31
YourSeq  29     997    1025  1032   100.0%    chr12  -       45459942   45459970   29
YourSeq  29     998    1028  1032   96.8%     chr10  -       64154387   64154417   31

Note: The 1032 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
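The chr5 coordinates of the two full-length BLAT hits can be cross-checked against the stated size of the effective KO region. The sketch below assumes that the KO region is exactly the interval lying between the downstream and upstream flanking sections; the report does not say this explicitly, and the helper names are hypothetical. Because the gene is on the reverse strand, the upstream flank has the higher genomic coordinates.

```python
# Full-length hits on chr5 (1-based, inclusive), copied from the BLAT tables.
UPSTREAM_FLANK = (86337295, 86338780)    # 1486 bp section upstream of Exon 4
DOWNSTREAM_FLANK = (86326586, 86327617)  # 1032 bp section downstream of Exon 6


def interval_length(interval):
    """Length in bp of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(lower, higher):
    """Number of bases strictly between two non-overlapping intervals."""
    return higher[0] - lower[1] - 1


# Assumption: the KO region is everything between the two flanks.
ko_region_bp = gap_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)
```

Under that assumption the interval is 9677 bp long, matching the ~9677 bp given for the effective KO region, and the two flank lengths reproduce the 1486 bp and 1032 bp section sizes.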


Gene and protein information:
Tmprss11d transmembrane protease, serine 11d [Mus musculus (house mouse)]
Gene ID: 231382, updated on 10-Oct-2019

Gene summary

Official Symbol: Tmprss11d (provided by MGI)
Official Full Name: transmembrane protease, serine 11d (provided by MGI)
Primary source: MGI:MGI:2385221
See related: Ensembl:ENSMUSG00000061259
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AST; AsP; BC020151
Expression: Low expression observed in reference dataset
Orthologs: human; all

Genomic context

Location: 5; 5 E1
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (86302854..86373387, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (86731879..86802412, complement)

[Figure: genomic context of Tmprss11d on Chromosome 5 - NC_000071.6.]


Transcript information: This gene has 2 transcripts

Gene: Tmprss11d ENSMUSG00000061259

Description: transmembrane protease, serine 11d [Source:MGI Symbol;Acc:MGI:2385221]
Gene Synonyms: AsP
Location: Chromosome 5: 86,302,217-86,373,420 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 2 transcripts (splice variants), 491 orthologues, 20 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name           Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Tmprss11d-201  ENSMUST00000031175.11  2716  417aa    ENSMUSP00000031175.5  Protein coding  CCDS39124  Q8VHK8   TSL:1 GENCODE basic APPRIS P1
Tmprss11d-202  ENSMUST00000122377.1   2256  279aa    ENSMUSP00000113079.1  Protein coding  -          Q8VHK8   TSL:1 GENCODE basic

[Figure: Ensembl region view of 91.20 kb on chromosome 5 (86.30Mb to 86.38Mb), showing contigs AC100746.4 and AC098885.3, the protein-coding transcripts Tmprss11d-202 and Tmprss11d-201 on the reverse strand, and the Regulatory Build track. Regulation legend: CTCF; Open Chromatin; Promoter Flank; Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana).]


Transcript: ENSMUST00000031175

[Figure: protein domains and features of Tmprss11d-201 (protein coding, reverse strand, 71.20 kb), drawn along the protein ENSMUSP00000031... with a scale bar from 0 to 417.]

Annotation tracks in the figure:
Transmembrane heli...
PDB-ENSP mappings
Superfamily: SEA domain superfamily; Peptidase S1, PA clan
SMART: Serine proteases, domain
Prints: Peptidase S1A, chymotrypsin family
Pfam: SEA domain; Serine proteases, trypsin domain
PROSITE profiles: SEA domain; Serine proteases, trypsin domain
PROSITE patterns: Serine proteases, trypsin family, serine active site; Serine proteases, trypsin family, histidine active site
PIRSF: Peptidase S1A, HAT/DESC1
PANTHER: PTHR24253:SF45; PTHR24253
Gene3D: SEA domain superfamily; 2.40.10.10
CDD: Serine proteases, trypsin domain
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant Legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
