https://www.alphaknockout.com

Mouse Tprn Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tprn conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tprn gene (NCBI Reference Sequence: NM_175286; Ensembl: ENSMUSG00000048707) is located on mouse chromosome 2. Four exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000114336). Exon 1 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tprn gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-132N23 as the template. Cas9, gRNA, and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit hearing loss and degeneration of hair cell stereocilia.

Exon 1 covers 83.04% of the coding region. The start codon is in exon 1, and the stop codon is in exon 4. The size of intron 1, available for 3'-loxP site insertion: 4278 bp. The size of the effective cKO region: ~2126 bp. The cKO region does not contain any other known gene.
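The effect of Cre recombination on the targeted allele can be illustrated with a minimal sketch. It assumes that the two loxP sites are in the same orientation (so the segment between them is excised and one loxP site remains) and that the 5' loxP site lies upstream of exon 1; the source states only that the 3' loxP site is inserted in intron 1. The feature labels and the function name `cre_excise` are hypothetical and serve the illustration only.

```python
from typing import List


def cre_excise(allele: List[str], site: str = "loxP") -> List[str]:
    """Return the allele after Cre recombination between the first two loxP sites.

    Assumption: both sites share the same orientation, so the intervening
    segment is excised and a single loxP site is left behind.
    """
    positions = [i for i, feature in enumerate(allele) if feature == site]
    if len(positions) < 2:
        return list(allele)  # fewer than two sites: nothing to recombine
    first, second = positions[0], positions[1]
    # Keep everything up to and including the first site, then resume after the second.
    return allele[:first + 1] + allele[second + 1:]


# Simplified, hypothetical layout of the targeted (floxed) allele.
targeted = [
    "upstream sequence", "loxP", "exon 1", "loxP",
    "intron 1 (remainder)", "exon 2", "exon 3", "exon 4",
]

constitutive_ko = cre_excise(targeted)
print(constitutive_ko)
```

In this toy model the constitutive KO allele retains one loxP site and exons 2 to 4, while exon 1, which carries the start codon, is lost.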

Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', exon labels 1 and 4, ATG start codon, gRNA regions on either side of the cKO region), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Tprn; cKO region; loxP site.]

Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot of Sequence 12, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether it contains tandem repeats. Tandem repeats are found in the dot plot matrix, so this targeting vector may be difficult to construct.
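A self-alignment dot plot of this kind can be sketched as follows. This is a minimal illustration that uses exact matches of 10 bp words; the tool and scoring scheme actually used for the plot are not stated in the source, and the function names and the toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=10):
    """Compare a sequence with itself using exact matches of length `window`.

    Returns two lists of (i, j) start positions:
    forward dots, where the word at i equals the word at j (i < j), and
    reverse dots, where the word at i equals the reverse complement of the word at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for j in range(len(seq) - window + 1):
        word = seq[j:j + window]
        forward.extend((i, j) for i in index[word] if i < j)
        reverse.extend((i, j) for i in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence: a 10 bp unit repeated three times in tandem.
unit = "ACGGTCATTG"
forward, reverse = self_dot_plot(unit * 3, window=10)
offsets = sorted({j - i for i, j in forward})
print(len(forward), offsets)
```

Tandem repeats show up as forward dots lying on diagonals parallel to the main diagonal, with the offset `j - i` equal to a multiple of the repeat unit length.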

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot along Sequence 12.]

Summary: Full Length(8366bp) | A(23.33% 1952) | C(26.13% 2186) | T(23.33% 1952) | G(27.21% 2276)

Note: The sequence of the homology arms and cKO region is analyzed to determine its GC content. Regions of significantly high GC content are found, so this targeting vector may be difficult to construct.
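The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed in the same way. The sketch below assumes a simple sliding window; the step size and the function names are assumptions, since the source gives only the 300 bp window size.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window=300, step=1):
    """GC fraction of each window of length `window`, advancing by `step` bases."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the summary line of this report.
counts = {"A": 1952, "C": 2186, "T": 1952, "G": 2276}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(total, round(100 * overall_gc, 2))

# Toy sequence to show the windowed profile.
profile = windowed_gc("GGCCAATT", window=4, step=2)
print(profile)
```

The counts sum to the reported full length of 8366 bp, which gives an overall GC content of about 53.33%. The windowed profile is what reveals local GC-rich stretches that the overall figure hides.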

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       25259689   25262688   3000
YourSeq  855    427    1447  3000   92.5%     chr14  -       103566376  103567419  1044
YourSeq  847    433    1444  3000   93.6%     chr11  +       52465390   52466422   1033
YourSeq  835    468    1441  3000   94.1%     chr16  +       17533485   17534469   985
YourSeq  832    433    1441  3000   92.1%     chr17  -       55168718   55169735   1018
YourSeq  829    468    1441  3000   93.3%     chr5   +       28804015   28805012   998
YourSeq  827    468    1441  3000   93.7%     chr19  -       60635768   60636758   991
YourSeq  827    427    1422  3000   92.1%     chr14  -       47522712   47523723   1012
YourSeq  824    427    1440  3000   91.6%     chr1   +       118225362  118226401  1040
YourSeq  823    427    1433  3000   92.4%     chr10  +       21249407   21250597   1191
YourSeq  822    432    1431  3000   92.0%     chr5   -       130486434  130487443  1010
YourSeq  820    433    1427  3000   92.1%     chr19  -       25885773   25886777   1005
YourSeq  820    468    1427  3000   93.9%     chr11  +       80721949   80722923   975
YourSeq  818    427    1448  3000   92.2%     chr10  +       78532352   78533396   1045
YourSeq  817    433    1433  3000   91.4%     chr2   -       94644560   94645580   1021
YourSeq  817    427    1422  3000   92.8%     chr10  +       74786824   74787838   1015
YourSeq  816    468    1421  3000   93.4%     chr13  -       15174930   15217171   42242
YourSeq  816    427    1426  3000   91.4%     chr10  -       88840220   88841239   1020
YourSeq  815    427    1427  3000   93.7%     chr12  -       98665496   98666500   1005
YourSeq  815    468    1422  3000   93.8%     chr14  +       13987382   13988355   974

Note: The 3000 bp section upstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.
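One way to read the upstream BLAT table is to express each hit as a fraction of the 3000 bp query. The sketch below does this for the first three rows of the table. The helper name `query_coverage` is hypothetical, and the source does not state which criterion was used to judge a similarity as significant.

```python
# (score, query start, query end, query size, identity in percent, chromosome)
# Values copied from the first three rows of the upstream BLAT table.
hits = [
    (3000, 1, 3000, 3000, 100.0, "chr2"),
    (855, 427, 1447, 3000, 92.5, "chr14"),
    (847, 433, 1444, 3000, 93.6, "chr11"),
]


def query_coverage(q_start, q_end, q_size):
    """Fraction of the query spanned by a hit (1-based, inclusive coordinates)."""
    return (q_end - q_start + 1) / q_size


coverages = [
    round(100 * query_coverage(q_start, q_end, q_size), 2)
    for _, q_start, q_end, q_size, _, _ in hits
]

for (score, _, _, _, identity, chrom), cov in zip(hits, coverages):
    print(f"{chrom}: score {score}, identity {identity}%, query coverage {cov}%")
```

The first row is the query matching its own locus on chr2 over its full length. Every other hit in the table falls within query positions 427 to 1448, so each spans only about a third of the query.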

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       25264805   25267804   3000
YourSeq  23     1878   1904  3000   92.6%     chr1   +       143675125  143675151  27

Note: The 3000 bp section downstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.

Gene and information: Tprn taperin [ Mus musculus (house mouse) ] Gene ID: 97031, updated on 10-Oct-2019

Gene summary

Official Symbol: Tprn (provided by MGI)
Official Full Name: taperin (provided by MGI)
Primary source: MGI:MGI:2139535
See related: Ensembl:ENSMUSG00000048707
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C87750; C430004E15Rik
Expression: Biased expression in testis adult (RPKM 57.5), small intestine adult (RPKM 53.4) and 14 other tissues

Genomic context

Location: 2; 2 A3

Exon count: 4

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (25262598..25269886)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (25118118..25125406)

Chromosome 2 - NC_000068.7

Transcript information: This gene has 4 transcripts

Gene: Tprn ENSMUSG00000048707

Description: taperin [Source:MGI Symbol;Acc:MGI:2139535]
Gene Synonyms: C430004E15Rik
Location: Chromosome 2: 25,262,618-25,269,885 forward strand. GRCm38:CM000995.2
About this gene: This gene has 4 transcripts (splice variants), 167 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 13 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Tprn-201  ENSMUST00000114336.3  2787  749aa       ENSMUSP00000109975.3  Protein coding  CCDS15759  A2AI08   TSL:1 GENCODE basic APPRIS P1
Tprn-204  ENSMUST00000155738.1  464   No protein  -                     lncRNA          -          -        TSL:3
Tprn-202  ENSMUST00000137361.1  441   No protein  -                     lncRNA          -          -        TSL:5
Tprn-203  ENSMUST00000141509.1  311   No protein  -                     lncRNA          -          -        TSL:3

[Figure: Ensembl genome browser view of a 27.27 kb region (25.255 Mb to 25.275 Mb). Forward-strand transcripts: Tmem203-201, Tprn-201, Anapc2-201 and Anapc2-204 (protein coding); Tprn-202, Tprn-203, Tprn-204, Anapc2-202 and Anapc2-203 (lncRNA). Contig: AL732309.9. Reverse-strand transcripts: Ndor1-201, Ndor1-202, Ndor1-208, Ndor1-209, Ndor1-211 and Ssna1-201 (protein coding); Ndor1-206, Ssna1-202, Ssna1-203 and Ssna1-204 (lncRNA); Ndor1-205 and Ndor1-210 (nonsense mediated decay); Ndor1-207 (retained intron). Regulatory Build legend: CTCF; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000114336

[Figure: Ensembl protein view of Tprn-201 (protein coding; 7.27 kb, forward strand; protein ENSMUSP00000109...). Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Pfam (Phostensin/Taperin N-terminal domain, Phostensin/Taperin PP1-binding domain); PANTHER (Phostensin/Taperin, Taperin); sequence variants (dbSNP and all other sources). Variant legend: frameshift variant; inframe deletion; missense variant; synonymous variant. Scale bar: 0 to 749.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
