https://www.alphaknockout.com

Mouse Ttc39a Knockout Project (CRISPR/Cas9)

Objective: To create a Ttc39a knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ttc39a gene (NCBI Reference Sequence: NM_153392; Ensembl: ENSMUSG00000028555) is located on mouse chromosome 4. Eighteen exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 18 (Transcript: ENSMUST00000064129). Exons 2~5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 2.26% of the coding region. Exons 2~5 cover 22.11% of the coding region. The size of the effective KO region is ~6956 bp. The KO region does not contain any other known gene.
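The percentages in the note can be turned into approximate base-pair figures. The sketch below is a back-calculation, not the vendor's method: it assumes the coding region is that of the 576 aa protein listed for ENSMUST00000064129, and that the percentages were taken against that length either without or with the stop codon (both are tried). The function name `covered_bp` is hypothetical.

```python
def covered_bp(protein_aa: int, percent: float, include_stop: bool = False) -> int:
    """Approximate number of coding bases that a percentage of the CDS represents."""
    cds_bp = protein_aa * 3 + (3 if include_stop else 0)
    return round(cds_bp * percent / 100)

for include_stop in (False, True):
    offset = covered_bp(576, 2.26, include_stop)    # coding bases before exon 2
    deleted = covered_bp(576, 22.11, include_stop)  # coding bases in exons 2~5
    # A deleted coding length that is not a multiple of 3 shifts the reading frame.
    print(include_stop, offset, deleted, deleted % 3 != 0)
```

Under either assumption the estimated deleted coding length is not a multiple of three, which would be consistent with a frameshift after the deletion. The exact exon boundaries would be needed to confirm this.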


Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1, 2, 3, 4, 5 and 18 and two gRNA regions. Legend: exon of mouse Ttc39a; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self dot plot of the upstream sequence. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
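As an illustration of how such a self dot plot can be computed, here is a minimal sketch. It is not the tool used for this report: it assumes exact matching of fixed-size windows, and the names `revcomp` and `self_dot_plot` as well as the toy sequence are hypothetical. Tandem repeats appear as forward dots on diagonals parallel to the main diagonal.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) sets of (i, j) exact window matches of seq against itself."""
    seq = seq.upper()
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)
    forward, reverse = set(), set()
    for word, positions in index.items():
        # Forward dots: the same word at two different positions (off the main diagonal).
        forward.update((i, j) for i in positions for j in positions if i != j)
        # Reverse-complement dots: the window at j is the reverse complement of the window at i.
        for j in index.get(revcomp(word), []):
            reverse.update((i, j) for i in positions)
    return forward, reverse

toy = "ACGTTGCA" * 4 + "GGGGCCCCTTTTAAAA"  # toy sequence, not the Ttc39a flank
fwd, rev = self_dot_plot(toy, window=8)
print(len(fwd), len(rev))
```

A gRNA site would then be chosen in a stretch with no off-diagonal dots.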

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self dot plot of the downstream sequence. Legend: forward; reverse complement.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(26.3% 526) | C(22.5% 450) | T(25.6% 512) | G(25.6% 512)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
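A sliding-window GC profile of this kind can be sketched as follows. This is an illustrative implementation only: the report does not state a cutoff for "high GC", so none is applied here, and the function name `gc_windows`, the `step` parameter, and the toy sequence are assumptions.

```python
def gc_windows(seq: str, window: int = 300, step: int = 1):
    """GC fraction of each window of the given size along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append((chunk.count("G") + chunk.count("C")) / window)
    return profile

toy = "AT" * 200 + "GC" * 200  # toy 800 bp sequence, not the Ttc39a flank
for start, gc in zip(range(0, 501, 100), gc_windows(toy, window=300, step=100)):
    print(start, f"{gc:.2f}")
```

Regions where the profile stays high over long stretches are the ones that tend to complicate PCR and sequencing.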

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(23.25% 465) | C(22.85% 457) | T(32.4% 648) | G(21.5% 430)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
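The overall GC content of each flank follows directly from the base counts in the two summaries. The short sketch below does that arithmetic; the function name `gc_percent` is hypothetical.

```python
def gc_percent(a: int, c: int, t: int, g: int) -> float:
    """Overall GC percentage from per-base counts."""
    total = a + c + t + g
    return 100 * (g + c) / total

# Counts taken from the upstream and downstream summaries in this report.
up = gc_percent(a=526, c=450, t=512, g=512)
down = gc_percent(a=465, c=457, t=648, g=430)
print(f"upstream GC: {up:.2f}%  downstream GC: {down:.2f}%")
```

This gives about 48.1% GC for the upstream section and 44.35% for the downstream section.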


BLAT Search Results (up)

QUERY   SCORE START   END QSIZE IDENTITY CHROM STRAND     START       END   SPAN
-----------------------------------------------------------------------------------
YourSeq  2000     1  2000  2000   100.0% chr4  +      109413988 109415987   2000
YourSeq   211   919  1415  2000    96.2% chr9  -       64964113  65414846 450734
YourSeq   189   941  1415  2000    96.1% chr3  +       89166277  89576991 410715
YourSeq   152  1234  1415  2000    94.2% chr9  -      119306762 119306950    189
YourSeq   146  1245  1415  2000    94.0% chr7  -      130110141 130110333    193
YourSeq   144  1233  1416  2000    95.6% chr7  -      123075037 123075248    212
YourSeq   144  1241  1415  2000    95.2% chr15 -        8313865   8314068    204
YourSeq   142   927  1389  2000    85.2% chr5  +       74522850  74523216    367
YourSeq   142  1235  1389  2000    96.8% chr14 +       27078020  27078489    470
YourSeq   139  1214  1398  2000    94.4% chr9  +      121877608 121878128    521
YourSeq   138  1235  1415  2000    93.3% chr7  -        4016040   4016256    217
YourSeq   137  1254  1415  2000    92.6% chr5  -       34775525  34775708    184
YourSeq   137  1224  1398  2000    91.8% chr11 -       69430078  69430249    172
YourSeq   136  1228  1414  2000    93.0% chr11 -       61035935  61036149    215
YourSeq   136  1234  1388  2000    94.9% chr18 +       75406162  75406343    182
YourSeq   136  1246  1415  2000    92.5% chr14 +       64760873  64761067    195
YourSeq   136  1234  1398  2000    92.5% chr11 +       81252566  81252844    279
YourSeq   134  1228  1398  2000    94.7% chr8  -       18555397  18555893    497
YourSeq   134  1251  1414  2000    94.7% chr15 -       76365021  76365214    194
YourSeq   132  1257  1415  2000    94.0% chrX  -        8883152   8883335    184

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
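To inspect the hits programmatically, the rows can be parsed into records. The sketch below assumes whitespace-separated rows in the column order of the results table (query name, score, query start and end, query size, identity, chromosome, strand, target start and end, span); `BlatHit` and `parse_hit` are hypothetical names, and only the first three upstream rows are used as sample input.

```python
from typing import NamedTuple

class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

def parse_hit(row: str) -> BlatHit:
    """Parse one result row; a leading 'browser details' label and the query name are dropped."""
    f = row.replace("browser details", "").split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))

rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr4 + 109413988 109415987 2000",
    "YourSeq 211 919 1415 2000 96.2% chr9 - 64964113 65414846 450734",
    "YourSeq 189 941 1415 2000 96.1% chr3 + 89166277 89576991 410715",
]
for hit in map(parse_hit, rows):
    covered = (hit.q_end - hit.q_start + 1) / hit.q_size
    print(hit.chrom, hit.score, f"{covered:.1%} of query")
```

Comparing each secondary hit's score and query coverage against the full-length self hit (score 2000) is one way to judge how specific the flank is.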

BLAT Search Results (down)

QUERY   SCORE START   END QSIZE IDENTITY CHROM STRAND     START       END   SPAN
-----------------------------------------------------------------------------------
YourSeq  2000     1  2000  2000   100.0% chr4  +      109422944 109424943   2000
YourSeq   273  1620  2000  2000    91.3% chr4  -      109511780 109512146    367
YourSeq    93  1217  1517  2000    87.2% chr8  +      111809997 111810400    404
YourSeq    91   546  1566  2000    95.1% chr18 +        7130083   7473434 343352
YourSeq    87  1217  1447  2000    80.0% chr6  -      148878897 148879106    210
YourSeq    84  1221  1503  2000    92.1% chr11 -        3199001   3199708    708
YourSeq    79  1631  2000  2000    80.0% chr11 +       40387439  40387807    369
YourSeq    76  1629  1828  2000    80.5% chr9  +       85289040  85289239    200
YourSeq    75  1235  1453  2000    87.8% chr17 -       12141059  12141429    371
YourSeq    69   452   714  2000    92.5% chr1  -      134220291 134220960    670
YourSeq    67  1433  1632  2000    73.4% chr3  -       56950092  56950216    125
YourSeq    66  1324  1462  2000    80.3% chr4  -      109302566 109302700    135
YourSeq    65  1175  1290  2000    84.8% chr13 +      100465668 100465820    153
YourSeq    64  1338  1459  2000    89.2% chr11 +       59945360  59945485    126
YourSeq    62  1232  1464  2000    86.1% chr2  -       36165632  36165985    354
YourSeq    62  1388  1509  2000    77.2% chr19 -        6163434   6163985    552
YourSeq    62  1323  1468  2000    90.8% chr1  +       91807893  91808191    299
YourSeq    61  1217  1305  2000    87.5% chr11 -       86281904  86281992     89
YourSeq    60  1317  1453  2000    88.4% chr10 -       39843694  39843833    140
YourSeq    60   433   589  2000    95.6% chr2  +       38891857  38892073    217

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
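The chr4 coordinates of the two full-length self hits can be used to cross-check the stated KO region size. The sketch below assumes that the effective KO region is the interval lying strictly between the upstream and downstream 2000 bp sections and that the reported coordinates are 1-based and inclusive; the constant and function names are hypothetical.

```python
UP_FLANK_END = 109415987      # last base of the upstream 2000 bp section (chr4, + strand)
DOWN_FLANK_START = 109422944  # first base of the downstream 2000 bp section (chr4, + strand)

def interval_between(left_end: int, right_start: int) -> int:
    """Number of bases strictly between two 1-based inclusive coordinates."""
    return right_start - left_end - 1

print(interval_between(UP_FLANK_END, DOWN_FLANK_START))
```

Under these assumptions the interval is 6956 bp, which agrees with the effective KO region size given in the strategy summary.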


Gene and information: Ttc39a tetratricopeptide repeat domain 39A [ Mus musculus (house mouse) ] Gene ID: 230603, updated on 12-Aug-2019

Gene summary

Official Symbol: Ttc39a (provided by MGI)
Official Full Name: tetratricopeptide repeat domain 39A (provided by MGI)
Primary source: MGI:MGI:2444350
See related: Ensembl:ENSMUSG00000028555
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 4922503N01Rik
Expression: Broad expression in testis adult (RPKM 47.8), colon adult (RPKM 35.7) and 15 other tissues
Orthologs: human

Genomic context

Location: 4; 4 C7
Exon count: 21

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (109394147..109444749)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (109079705..109117350)

[Figure: genomic context, Chromosome 4 - NC_000070.6]


Transcript information: This gene has 8 transcripts

Gene: Ttc39a ENSMUSG00000028555

Description: tetratricopeptide repeat domain 39A [Source:MGI Symbol;Acc:MGI:2444350]
Gene Synonyms: 4922503N01Rik
Location: Chromosome 4: 109,406,623-109,444,745 forward strand. GRCm38:CM000997.2
About this gene: This gene has 8 transcripts (splice variants), 196 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 22 phenotypes.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt     Flags
Ttc39a-201  ENSMUST00000064129.13  2356  576aa       ENSMUSP00000066334.7  Protein coding  CCDS18463  A2ACP1      TSL:1 GENCODE basic APPRIS P3
Ttc39a-202  ENSMUST00000106618.7   1970  578aa       ENSMUSP00000102229.1  Protein coding  CCDS51260  A2ACP1      TSL:1 GENCODE basic APPRIS ALT2
Ttc39a-203  ENSMUST00000106619.7   1778  189aa       ENSMUSP00000102230.1  Protein coding  -          A0A0A0MQC5  TSL:1 GENCODE basic
Ttc39a-204  ENSMUST00000124209.7   672   137aa       ENSMUSP00000118672.1  Protein coding  -          A0A0A0MQI5  CDS 3' incomplete TSL:3
Ttc39a-206  ENSMUST00000139237.2   614   109aa       ENSMUSP00000121779.1  Protein coding  -          A0A0A0MQL1  CDS 3' incomplete TSL:3
Ttc39a-208  ENSMUST00000153315.7   613   159aa       ENSMUSP00000117621.1  Protein coding  -          X1WI14      CDS 3' incomplete TSL:3
Ttc39a-205  ENSMUST00000126797.1   815   No protein  -                     lncRNA          -          -           TSL:2
Ttc39a-207  ENSMUST00000150909.1   403   No protein  -                     lncRNA          -          -           TSL:3


[Figure: Ensembl genome browser view, 58.12 kb, chromosome 4 from 109.40 Mb to 109.45 Mb. Forward strand: Ttc39a-201, -202, -203, -204, -206 and -208 (protein coding); Ttc39a-205 and -207 (lncRNA). Contig: AL669905.14. Reverse strand: Ttc39aos1-201 and -202 (lncRNA); Rnf11-201, -202 and -203 (protein coding). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene).]


Transcript: ENSMUST00000064129

[Figure: Ttc39a-201 (protein coding), 37.66 kb, forward strand, with protein feature tracks for ENSMUSP00000066...: low complexity (Seg); coiled-coils (Ncoils); Superfamily SSF81901; SMART tetratricopeptide repeat; Pfam and PANTHER inclusion body clearance protein Iml2/tetratricopeptide repeat protein 39; tetratricopeptide repeat protein 39A. Sequence variants (dbSNP and all other sources): missense variant, splice region variant, synonymous variant. Scale bar: 0 to 576.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
