https://www.alphaknockout.com

Mouse Tep1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tep1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tep1 gene (NCBI Reference Sequence: NM_009351; Ensembl: ENSMUSG00000006281) is located on mouse chromosome 14. 55 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 55 (Transcript: ENSMUST00000006444). Exon 10 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tep1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-321A24 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a disruption in this gene show no obvious phenotype. No changes are seen in telomerase activity or telomere length.

Exon 10 starts from about 19.88% of the coding region. Knockout of exon 10 will result in a frameshift of the gene. The size of intron 9 for 5'-loxP site insertion is 4878 bp, and the size of intron 10 for 3'-loxP site insertion is 990 bp. The size of the effective cKO region is ~610 bp. The cKO region does not contain any other known gene.
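The two claims above (where exon 10 begins in the coding region, and why its removal shifts the frame) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the protein length of 2629 aa is taken from the transcript table of this report for ENSMUST00000006444, the coding length is assumed to be 3 nt per residue with the stop codon excluded, and the exon lengths passed to `causes_frameshift` are hypothetical because the report does not state the coding length of exon 10.

```python
def causes_frameshift(deleted_coding_nt: int) -> bool:
    """Deleting coding sequence shifts the reading frame
    unless the deleted length is a multiple of 3."""
    return deleted_coding_nt % 3 != 0


def approx_codon(fraction_of_cds: float, protein_aa: int) -> int:
    """Approximate 1-based codon index at a given fraction of the CDS.

    Assumption: CDS length = 3 nt per residue, stop codon excluded.
    """
    cds_nt = protein_aa * 3
    nt_pos = round(fraction_of_cds * cds_nt)  # 1-based nucleotide position
    return (nt_pos - 1) // 3 + 1


# 19.88% of the coding region of a 2629 aa protein
print(approx_codon(0.1988, 2629))  # 523

# Hypothetical exon lengths, for illustration only
print(causes_frameshift(100))  # True: 100 is not a multiple of 3
print(causes_frameshift(99))   # False: 99 is a multiple of 3
```

Under these assumptions, exon 10 begins near codon 523 of 2629, so a frameshift at that point would disrupt roughly the last 80% of the protein.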

Overview of the Targeting Strategy

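[Figure: schematic of the targeting strategy, showing the wildtype allele with the 5' and 3' gRNA regions marked (exons 1, 10, 11, 12, 13 and 55 labeled), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Tep1; homology arm; cKO region; loxP site.]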

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned with itself. Legend: forward; reverse complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
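The self-alignment described in this note can be sketched in Python as follows. This is a minimal illustration of an exact-match dot plot with a 10 bp window, not the tool used to produce the figure; the function names and the toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: window at i equals window at j, with i < j.
    reverse: window at i equals the reverse complement of window at j.
    Only exact matches are reported (an assumption of this sketch).
    """
    seq = seq.upper()
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for j in range(last_start + 1):
        word = seq[j:j + window]
        for i in index[word]:
            if i < j:  # skip the trivial main diagonal and mirrored dots
                forward.append((i, j))
        for i in index.get(reverse_complement(word), ()):
            reverse.append((i, j))
    return forward, reverse


# Toy sequence: a 10 bp unit repeated twice in tandem
fwd, rev = self_dot_plot("ACGGTCATTG" * 2, window=10)
print(fwd)  # [(0, 10)]
print(rev)  # []
```

A tandem repeat shows up as forward dots lying on a diagonal parallel to the main diagonal, offset by the repeat unit length (10 in the toy example).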

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7110bp) | A(25.7% 1827) | C(23.21% 1650) | T(28.78% 2046) | G(22.32% 1587)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
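As a worked check, the overall GC content follows directly from the base counts in the summary line, and a sliding-window profile like the one plotted can be computed as sketched below. The function name `gc_windows` and the toy sequence are hypothetical; the 300 bp default mirrors the stated window size.

```python
# Base counts from the summary line of this report
counts = {"A": 1827, "C": 1650, "T": 2046, "G": 1587}

total = sum(counts.values())
gc_fraction = (counts["G"] + counts["C"]) / total
print(total)                        # 7110
print(round(gc_fraction * 100, 2))  # 45.53


def gc_windows(seq: str, window: int = 300):
    """GC fraction for every full window, stepping by 1 bp."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in "GC" else 0 for base in seq]
    count = sum(is_gc[:window])
    fractions = [count / window]
    for i in range(window, len(seq)):
        # slide right by one base: add the new base, drop the oldest
        count += is_gc[i] - is_gc[i - window]
        fractions.append(count / window)
    return fractions


# Toy example with a 4 bp window
print(gc_windows("GGCCAATT", window=4))  # [1.0, 0.75, 0.5, 0.25, 0.0]
```

The counts sum to the stated full length of 7110 bp and give an overall GC content of about 45.5%.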

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  -       50855950   50858949   3000
YourSeq  352    2600   2996  3000   95.0%     chr4   +       8461815    8462217    403
YourSeq  351    2413   2996  3000   87.1%     chr1   -       171137648  171138093  446
YourSeq  348    2603   2996  3000   94.7%     chr19  -       39666144   39821847   155704
YourSeq  343    2599   2996  3000   93.5%     chr11  -       118308673  118309075  403
YourSeq  342    2600   2996  3000   93.7%     chr16  -       83579120   83579528   409
YourSeq  342    2600   2996  3000   94.6%     chr12  +       55704024   55704426   403
YourSeq  340    2602   2996  3000   93.7%     chr13  -       50460565   50460964   400
YourSeq  340    2600   2996  3000   93.7%     chr16  +       11742647   11743049   403
YourSeq  339    2597   2997  3000   93.0%     chr5   +       76103740   76104146   407
YourSeq  339    2600   2996  3000   93.2%     chr16  +       85813770   85814170   401
YourSeq  339    2600   2995  3000   93.4%     chr16  +       11360461   11360861   401
YourSeq  338    2599   2997  3000   93.4%     chr12  +       17464550   17464953   404
YourSeq  336    2599   2996  3000   92.7%     chr6   -       37470475   37470868   394
YourSeq  336    2602   2996  3000   92.9%     chr18  -       3847618    3848017    400
YourSeq  336    2598   2979  3000   94.7%     chr1   -       60047547   60047938   392
YourSeq  336    2600   2996  3000   93.3%     chr12  +       17459240   17459640   401
YourSeq  335    2599   2996  3000   93.8%     chr8   -       79546022   79546430   409
YourSeq  335    2603   2999  3000   92.7%     chr19  -       39897184   39897589   406
YourSeq  334    2600   2996  3000   92.7%     chr16  +       14557125   14557526   402

Note: The 3000 bp section upstream of Exon 10 is BLAT searched against the genome. No significant similarity is found.
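For readers who want to screen such a result table programmatically, the Python sketch below parses BLAT result rows and lists partial hits above an identity cutoff. It is an illustration only: the class and function names are hypothetical, the 90% cutoff is an arbitrary example value, and the sketch does not reproduce whatever criterion the report uses to judge similarity as significant.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    fields = row.split()
    # Tolerate the "browser details" link text that precedes each web row
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def partial_hits(rows, min_identity: float):
    """Hits other than the full-length self match, at or above a cutoff."""
    hits = [parse_blat_row(r) for r in rows]
    return [h for h in hits
            if h.score < h.q_size and h.identity >= min_identity]


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr14 - 50855950 50858949 3000",
    "browser details YourSeq 352 2600 2996 3000 95.0% chr4 + 8461815 8462217 403",
    "YourSeq 351 2413 2996 3000 87.1% chr1 - 171137648 171138093 446",
]
for hit in partial_hits(rows, min_identity=90.0):
    print(hit.chrom, hit.q_start, hit.q_end, hit.identity)
```

With the three sample rows, only the chr4 hit is listed: the chr14 row is the full-length self match and the chr1 row falls below the example cutoff.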

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  -       50852340   50855339   3000
YourSeq  124    454    703   3000   90.5%     chr6   -       100467333  100468419  1087
YourSeq  121    451    703   3000   92.2%     chr6   -       100467027  100468209  1183
YourSeq  117    451    703   3000   82.1%     chr6   -       100467316  100467512  197
YourSeq  115    451    717   3000   90.4%     chr6   -       100467551  100468405  855
YourSeq  101    451    703   3000   91.6%     chr6   -       100467197  100467478  282
YourSeq  89     451    625   3000   90.8%     chr6   -       100467024  100468175  1152
YourSeq  87     454    703   3000   81.0%     chr6   -       100467299  100467526  228
YourSeq  75     457    681   3000   75.0%     chr6   -       100468093  100468186  94
YourSeq  74     451    681   3000   92.8%     chr6   -       100468127  100468609  483
YourSeq  74     118    252   3000   84.6%     chr2   +       162535742  162535874  133
YourSeq  68     139    260   3000   87.1%     chr7   +       109848040  109848164  125
YourSeq  66     56     253   3000   89.6%     chr7   -       79564479   79564676   198
YourSeq  65     112    249   3000   79.7%     chr12  +       58901059   58901199   141
YourSeq  63     114    251   3000   84.0%     chrX   +       151369070  151369206  137
YourSeq  61     2649   2787  3000   90.6%     chr1   +       4675651    4675996    346
YourSeq  60     177    260   3000   87.5%     chr1   +       106502576  106502660  85
YourSeq  59     2647   2856  3000   91.6%     chr2   +       129982081  129982621  541
YourSeq  58     451    654   3000   92.5%     chr6   -       100467056  100467444  389
YourSeq  57     114    251   3000   85.0%     chr1   -       93708527   93708662   136

Note: The 3000 bp section downstream of Exon 10 is BLAT searched against the genome. No significant similarity is found.
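The full-length chr14 matches of the two 3000 bp flanks also give a quick consistency check on the size of the region between them. The Python sketch below assumes the BLAT coordinates are 1-based and inclusive (each full-length match then spans exactly 3000 bp); the helper name `gap_between` is hypothetical. Because Tep1 lies on the reverse strand, the upstream flank has the higher chr14 coordinates.

```python
# Full-length self matches from the two BLAT tables (chrom, start, end)
upstream_flank = ("chr14", 50855950, 50858949)
downstream_flank = ("chr14", 50852340, 50855339)


def gap_between(a, b) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    if a[0] != b[0]:
        raise ValueError("intervals are on different chromosomes")
    low, high = sorted([a, b], key=lambda interval: interval[1])
    gap = high[1] - low[2] - 1
    if gap < 0:
        raise ValueError("intervals overlap")
    return gap


for _chrom, start, end in (upstream_flank, downstream_flank):
    print(end - start + 1)  # 3000 for each flank

print(gap_between(upstream_flank, downstream_flank))  # 610
```

The 610 bp gap agrees with the effective cKO region size of ~610 bp stated in the strategy summary.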

Gene and protein information:

Tep1 telomerase associated protein 1 [Mus musculus (house mouse)]
Gene ID: 21745, updated on 12-Aug-2019

Gene summary

Official Symbol: Tep1 (provided by MGI)
Official Full Name: telomerase associated protein 1 (provided by MGI)
Primary source: MGI:MGI:109573
See related: Ensembl:ENSMUSG00000006281
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Tp1
Expression: Broad expression in large intestine adult (RPKM 17.7), colon adult (RPKM 17.6) and 21 other tissues
Orthologs: human; all

Genomic context

Location: 14 C1; 14 26.26 cM

Exon count: 56

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (50824059..50870579, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (51443736..51490229, complement)

Chromosome 14 - NC_000080.6

Transcript information: This gene has 8 transcripts

Gene: Tep1 ENSMUSG00000006281

Description: telomerase associated protein 1 [Source:MGI Symbol;Acc:MGI:109573]
Gene Synonyms: Tp1
Location: Chromosome 14: 50,824,059-50,870,560 reverse strand. GRCm38:CM001007.2
About this gene: This gene has 8 transcripts (splice variants), 136 orthologues, 24 paralogues, is a member of 2 Ensembl protein families and is associated with 3 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Tep1-201  ENSMUST00000006444.8  8171  2629aa      ENSMUSP00000006444.7  Protein coding           CCDS36907  P97499      TSL:1, GENCODE basic, APPRIS P1
Tep1-202  ENSMUST00000226222.1  434   79aa        ENSMUSP00000154505.1  Protein coding           -          A0A2I3BRB1  CDS 5' incomplete
Tep1-203  ENSMUST00000226430.1  5690  489aa       ENSMUSP00000154492.1  Nonsense mediated decay  -          A0A2I3BR80  CDS 5' incomplete
Tep1-207  ENSMUST00000227526.1  737   51aa        ENSMUSP00000153878.1  Nonsense mediated decay  -          A0A2I3BQ06  CDS 5' incomplete
Tep1-206  ENSMUST00000227228.1  1274  No protein  -                     Retained intron          -          -           -
Tep1-208  ENSMUST00000228562.1  702   No protein  -                     Retained intron          -          -           -
Tep1-204  ENSMUST00000226789.1  551   No protein  -                     Retained intron          -          -           -
Tep1-205  ENSMUST00000227193.1  380   No protein  -                     lncRNA                   -          -           -

[Figure: Ensembl genome browser view of a 66.50 kb region containing Tep1.
Forward strand genes (Comprehensive set): Parp2-201 (protein coding); Parp2-204 (nonsense mediated decay); Parp2-202, Parp2-205, Parp2-206, Parp2-207 and Parp2-208 (retained intron); Gm26782-201 to Gm26782-207 (lncRNA).
Contig: AC027184.15.
Reverse strand genes (Comprehensive set): Tep1-201 and Tep1-202 (protein coding); Tep1-203 and Tep1-207 (nonsense mediated decay); Tep1-204, Tep1-206 and Tep1-208 (retained intron); Tep1-205 (lncRNA).
Regulatory Build legend: CTCF; Open Chromatin; Promoter; Promoter Flank.
Gene legend: Protein Coding (merged Ensembl/Havana; Ensembl protein coding); Non-Protein Coding (processed transcript; RNA gene).]

Transcript: ENSMUST00000006444

[Figure: protein domain and variant view for Tep1-201 (protein coding; reverse strand, 46.50 kb), translation ENSMUSP00000006... Annotation tracks:
MobiDB lite; Low complexity (Seg)
Superfamily: TROVE domain superfamily; P-loop containing nucleoside triphosphate hydrolase; WD40-repeat-containing domain superfamily
SMART: WD40 repeat
Pfam: TROVE domain; Domain of unknown function DUF4062; WD40 repeat; TEP1, N-terminal; NACHT nucleoside triphosphatase
PROSITE profiles: TROVE domain; NACHT nucleoside triphosphatase; TEP1, N-terminal; WD40 repeat; WD40-repeat-containing domain
PROSITE patterns: WD40 repeat, conserved site
PANTHER: PTHR44791
Gene3D: 3.40.50.300; 1.25.40.370; WD40/YVTN repeat-like-containing domain superfamily
CDD: cd00200
Sequence variants (dbSNP and all other sources). Variant legend: inframe deletion; missense variant; splice region variant; synonymous variant
Scale bar: 0 to 2629 (amino acids)]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
