https://www.alphaknockout.com

Mouse Ttc39a Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Ttc39a conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ttc39a gene (NCBI Reference Sequence: NM_153392; Ensembl: ENSMUSG00000028555) is located on mouse chromosome 4. Eighteen exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 18 (Transcript: ENSMUST00000064129). Exons 3~5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Ttc39a gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-110F24 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 3 starts at about 8.33% of the coding region. Knockout of exons 3~5 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 5201 bp, and the size of intron 5 for 3'-loxP site insertion is 3358 bp. The size of the effective cKO region is ~2297 bp. The cKO region does not contain any other known gene.
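The frameshift reasoning in the note can be sketched in a few lines of Python. This is only an illustration: the report does not list the coding lengths of exons 3, 4 and 5, so the lengths passed to `causes_frameshift` below are hypothetical placeholders, and both function names are invented for this sketch. The only values taken from the report are the 8.33% figure and the 576 aa protein length of Ttc39a-201.

```python
def approx_cds_offset(fraction, protein_len_aa):
    """Approximate number of coding nucleotides upstream of a position
    given as a fraction of the coding region (stop codon ignored,
    which is an assumption of this sketch)."""
    return round(fraction * protein_len_aa * 3)


def causes_frameshift(deleted_coding_lengths):
    """A deletion shifts the reading frame when the total number of
    deleted coding nucleotides is not a multiple of 3."""
    return sum(deleted_coding_lengths) % 3 != 0


# 8.33% of the coding region of the 576 aa isoform (values from the report).
offset_nt = approx_cds_offset(0.0833, 576)
print("coding nt before exon 3:", offset_nt)

# HYPOTHETICAL exon coding lengths, for illustration only.
print(causes_frameshift([100, 120, 81]))  # total 301, not divisible by 3
print(causes_frameshift([99, 120, 81]))   # total 300, in frame
```

With the real exon lengths from the transcript annotation substituted in, the same check would confirm whether the deletion is out of frame.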


Overview of the Targeting Strategy

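[Figure: schematic of the targeting strategy, drawn 5' to 3'. Rows: wildtype allele (exons 1, 3, 4, 5 and 18 labeled, with a gRNA region on each side of exons 3~5), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Ttc39a, homology arm, cKO region, loxP site.]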


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches. Axis label: Sequence 12.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
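A self dot plot of this kind can be approximated as follows. This is a minimal sketch, not the tool used for the report: it indexes every 10 bp word (matching the stated window size), reports positions where the same word, or its reverse complement, occurs again, and flags nearby forward matches as tandem-repeat candidates. The function names and the `max_offset` cut-off are assumptions introduced here.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def dot_matches(seq, k=10, reverse_complement=False):
    """Return (i, j) pairs where the k-mer at i matches the k-mer at j.

    Forward mode reports only j > i (one triangle of the matrix);
    reverse-complement mode reports every j whose k-mer equals the
    reverse complement of the k-mer at i.
    """
    index = defaultdict(list)
    for j in range(len(seq) - k + 1):
        index[seq[j:j + k]].append(j)

    hits = []
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if reverse_complement:
            word = revcomp(word)
        for j in index.get(word, []):
            if reverse_complement or j > i:
                hits.append((i, j))
    return hits


def tandem_candidates(hits, max_offset=50):
    """Forward matches close to the diagonal (hypothetical cut-off)."""
    return [(i, j) for i, j in hits if 0 < j - i <= max_offset]


# Toy sequence: a 10 bp unit repeated twice in tandem.
toy = "TTTT" + "ACGGTCATGC" * 2 + "GGGG"
print(tandem_candidates(dot_matches(toy)))
```

Runs of such near-diagonal matches correspond to the short parallel lines that indicate tandem repeats in a dot plot.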

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(8650bp) | A(22.81% 1973) | C(24.34% 2105) | T(28.75% 2487) | G(24.1% 2085)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
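The base counts in the summary line are enough to recompute the overall GC content, and the same idea extends to a windowed scan like the one plotted. The sketch below uses the counts from the summary. The 300 bp window matches the stated window size, but the step between windows is not given in the report, so the non-overlapping default here is an assumption, as are the function names.

```python
# Base counts copied from the summary line (full length 8650 bp).
COUNTS = {"A": 1973, "C": 2105, "T": 2487, "G": 2085}


def gc_fraction_from_counts(counts):
    """Overall GC fraction from per-base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300, step=None):
    """GC fraction of each window along seq.

    step defaults to the window size (non-overlapping windows);
    this default is an assumption, not taken from the report.
    """
    step = step or window
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


print(sum(COUNTS.values()))                            # 8650
print(f"{gc_fraction_from_counts(COUNTS) * 100:.2f}%")  # 48.44%
```

An overall GC content just under 50% is consistent with the note that no significant high-GC region was found, although only the windowed plot can show local peaks.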


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr4   +       109418044  109421043    3000
YourSeq    178     46  1917   3000     92.1%  chr4   -        21938970   22088761  149792
YourSeq    104     46  1915   3000     81.6%  chr7   -        79183193   79230933   47741
YourSeq    100    457   915   3000     77.5%  chr14  +        32144817   32145014     198
YourSeq     88    438   571   3000     92.5%  chr9   -        47434207   47434342     136
YourSeq     80    442   556   3000     90.8%  chr17  -         5874377    5874509     133
YourSeq     80   1808  1917   3000     88.4%  chr14  -        30787885   30787994     110
YourSeq     79    444   546   3000     96.7%  chr9   +       121972727  121972841     115
YourSeq     78    443   591   3000     80.7%  chr9   -        61492299   61492409     111
YourSeq     75   1793  1914   3000     95.3%  chr15  +        82352443   82352727     285
YourSeq     74    440   538   3000     92.0%  chr5   -       142294609  142294801     193
YourSeq     74   1808  1914   3000     91.3%  chr2   -        76418888   76419105     218
YourSeq     74   1813  1915   3000     93.0%  chr5   +        66061008   66061109     102
YourSeq     73   1808  1915   3000     88.6%  chr4   -       123720377  123720482     106
YourSeq     73   1805  1915   3000     87.7%  chr15  -        43860051   43860159     109
YourSeq     73   1814  1949   3000     90.3%  chr12  -         4496274    4496449     176
YourSeq     72   1805  1918   3000     89.3%  chr11  -        83568574   83568685     112
YourSeq     72   1793  1907   3000     78.4%  chr11  -        77099904   77100015     112
YourSeq     71    460   549   3000     92.6%  chr19  -        23773917   23774028     112
YourSeq     70   1808  1915   3000     87.3%  chr2   -       119084278  119084383     106

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr4   +       109423194  109426193    3000
YourSeq    459   1370  1944   3000     93.9%  chr4   -       109511586  109512146     561
YourSeq    224   1562  2232   3000     78.4%  chr4   -        84749807   84750444     638
YourSeq    200   1491  2122   3000     84.2%  chr4   +       104895039  104895698     660
YourSeq    196   1566  2156   3000     85.0%  chrX   +       100128137  100128782     646
YourSeq    187   1479  2156   3000     87.0%  chr8   -        94133724   94134416     693
YourSeq    177   1754  2274   3000     83.8%  chr14  +        20058744   20059271     528
YourSeq    169   1728  2160   3000     83.1%  chr11  +        72695179   72695628     450
YourSeq    167   1503  2072   3000     78.2%  chrX   +       141400672  141401223     552
YourSeq    167   1810  2188   3000     83.7%  chr1   +       163127891  163128277     387
YourSeq    163   1534  2209   3000     83.7%  chr19  -        36667810   36668497     688
YourSeq    161   1286  1983   3000     79.2%  chr9   -        86364161   86364592     432
YourSeq    160   1771  2036   3000     89.3%  chr19  -        38083461   38083759     299
YourSeq    159   1369  1986   3000     86.5%  chr4   -        46571776   46572398     623
YourSeq    156   1829  2272   3000     83.2%  chr12  +        85574262   85574731     470
YourSeq    156   1448  1988   3000     87.2%  chr1   +       157642212  157642773     562
YourSeq    153   1379  2188   3000     90.2%  chr5   -        68261850   68262666     817
YourSeq    145   1564  1988   3000     86.9%  chr3   -       131555210  131555650     441
YourSeq    142   1379  1987   3000     82.7%  chr9   +        85289040   85289649     610
YourSeq    138   1504  2251   3000     88.4%  chr2   -       143774951  143775702     752

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
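For readers who want to screen BLAT results like these programmatically, the rows can be parsed and the self-hit separated from the secondary hits. The sketch below is an illustration only: the `BlatHit` field names, the helper names, and the rule used to recognise the self-hit (full-length score with 100% identity) are assumptions made here, not part of BLAT or of the report. The three sample rows are copied from the upstream table.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(line):
    """Parse one whitespace-separated BLAT result row."""
    f = line.split()
    if f[:2] == ["browser", "details"]:  # tolerate web-page link text
        f = f[2:]
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def secondary_hits(hits, min_score=0):
    """Hits other than the full-length, 100% identity self-match."""
    return [h for h in hits
            if not (h.score == h.q_size and h.identity == 100.0)
            and h.score >= min_score]


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr4 + 109418044 109421043 3000",
    "YourSeq 178 46 1917 3000 92.1% chr4 - 21938970 22088761 149792",
    "YourSeq 104 46 1915 3000 81.6% chr7 - 79183193 79230933 47741",
]
hits = [parse_blat_row(r) for r in rows]
secondary = secondary_hits(hits)
for h in secondary:
    print(h.chrom, h.score, f"{h.score / h.q_size:.1%} of query length")
```

Comparing each secondary score with the 3000 bp query size gives a quick sense of how small the off-target matches are relative to the intended locus.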


Gene and information: Ttc39a tetratricopeptide repeat domain 39A [ Mus musculus (house mouse) ] Gene ID: 230603, updated on 12-Aug-2019

Gene summary

Official Symbol: Ttc39a (provided by MGI)
Official Full Name: tetratricopeptide repeat domain 39A (provided by MGI)
Primary source: MGI:MGI:2444350
See related: Ensembl:ENSMUSG00000028555
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 4922503N01Rik
Expression: Broad expression in testis adult (RPKM 47.8), colon adult (RPKM 35.7) and 15 other tissues
Orthologs: human

Genomic context

Location: 4; 4 C7

Exon count: 21

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (109394147..109444749)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (109079705..109117350)

[Figure: genomic context of Ttc39a on Chromosome 4 - NC_000070.6.]


Transcript information: This gene has 8 transcripts

Gene: Ttc39a ENSMUSG00000028555

Description: tetratricopeptide repeat domain 39A [Source:MGI Symbol;Acc:MGI:2444350]
Gene Synonyms: 4922503N01Rik
Location: Chromosome 4: 109,406,623-109,444,745 forward strand. GRCm38:CM000997.2
About this gene: This gene has 8 transcripts (splice variants), 196 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 22 phenotypes.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Ttc39a-201 ENSMUST00000064129.13 2356 576aa ENSMUSP00000066334.7 Protein coding CCDS18463 A2ACP1 TSL:1 GENCODE basic APPRIS P3

Ttc39a-202 ENSMUST00000106618.7 1970 578aa ENSMUSP00000102229.1 Protein coding CCDS51260 A2ACP1 TSL:1 GENCODE basic APPRIS ALT2

Ttc39a-203 ENSMUST00000106619.7 1778 189aa ENSMUSP00000102230.1 Protein coding - A0A0A0MQC5 TSL:1 GENCODE basic

Ttc39a-204 ENSMUST00000124209.7 672 137aa ENSMUSP00000118672.1 Protein coding - A0A0A0MQI5 CDS 3' incomplete TSL:3

Ttc39a-206 ENSMUST00000139237.2 614 109aa ENSMUSP00000121779.1 Protein coding - A0A0A0MQL1 CDS 3' incomplete TSL:3

Ttc39a-208 ENSMUST00000153315.7 613 159aa ENSMUSP00000117621.1 Protein coding - X1WI14 CDS 3' incomplete TSL:3

Ttc39a-205 ENSMUST00000126797.1 815 No protein - lncRNA - - TSL:2

Ttc39a-207 ENSMUST00000150909.1 403 No protein - lncRNA - - TSL:3


[Figure: Ensembl genome browser view of the Ttc39a locus, 58.12 kb, 109.40 Mb to 109.45 Mb. Forward strand (comprehensive gene set): Ttc39a-201, Ttc39a-202, Ttc39a-203, Ttc39a-204, Ttc39a-206 and Ttc39a-208 (protein coding); Ttc39a-205 and Ttc39a-207 (lncRNA). Contig: AL669905.14. Reverse strand: Ttc39aos1-201 and Ttc39aos1-202 (lncRNA); Rnf11-201, Rnf11-202 and Rnf11-203 (protein coding). A Regulatory Build track is also shown. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana), non-protein coding (RNA gene).]


Transcript: ENSMUST00000064129

[Figure: Ensembl protein summary for Ttc39a-201 (protein coding; 37.66 kb, forward strand), protein ENSMUSP00000066... Tracks: Low complexity (Seg); Coiled-coils (Ncoils); Superfamily SSF81901; SMART Tetratricopeptide repeat; Pfam Inclusion body clearance protein Iml2/Tetratricopeptide repeat protein 39; PANTHER Inclusion body clearance protein Iml2/Tetratricopeptide repeat protein 39 and Tetratricopeptide repeat protein 39A; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 576.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
