https://www.alphaknockout.com

Mouse Tbcd Knockout Project (CRISPR/Cas9)

Objective: To create a Tbcd knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Tbcd gene (NCBI Reference Sequence: NM_029878; Ensembl: ENSMUSG00000039230) is located on mouse chromosome 11. 39 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 39 (Transcript: ENSMUST00000103013). Exons 2~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 5.16% of the coding region. Exons 2~6 cover 12.65% of the coding region. The size of the effective KO region is ~8177 bp. The KO region does not contain any other known gene.
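As a rough cross-check of the percentages in the note, the sketch below converts them to approximate base-pair counts. It assumes a coding sequence of 1196 codons (the protein length listed for ENSMUST00000103013 in the transcript table of this report) and that the percentages are relative to the coding sequence without the stop codon. Both assumptions and the helper name `pct_to_bp` are illustrative and not part of the original design.

```python
# Assumption: CDS length = 1196 codons x 3 nt, stop codon excluded.
PROTEIN_LENGTH_AA = 1196
CDS_LENGTH_BP = PROTEIN_LENGTH_AA * 3


def pct_to_bp(percent, cds_length=CDS_LENGTH_BP):
    """Convert a percentage of the coding region to an approximate bp count."""
    return round(percent / 100 * cds_length)


# Coding bases upstream of exon 2, and coding bases within exons 2~6.
coding_bp_before_exon2 = pct_to_bp(5.16)
coding_bp_in_exons_2_to_6 = pct_to_bp(12.65)
print(coding_bp_before_exon2, coding_bp_in_exons_2_to_6)
```

Under these assumptions the targeted exons carry roughly 454 bp of coding sequence. The exact figure should be confirmed against the annotated exon coordinates.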

Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1 to 6 and exon 39 of mouse Tbcd, with one gRNA region on each side of the knockout region. Legend: exon of mouse Tbcd; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream sequence. Legend: Forward; Reverse Complement.]

Note: The 1402 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
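The self-alignment described in the note can be sketched as an exact-match dot plot: every pair of positions whose 15 bp windows are identical (forward) or reverse-complementary becomes a dot, and runs of off-diagonal forward dots indicate repeats. This is a minimal illustration, not the tool used to generate the report; the function names and the toy sequence are hypothetical.

```python
from collections import defaultdict


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) dots for exact window matches.

    forward: windows at i and j are identical (i != j, main diagonal omitted).
    reverse: window at j equals the reverse complement of the window at i.
    """
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j != i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy example: "GATTACA" repeated in tandem, scanned with a 4 bp window.
toy_forward, toy_reverse = self_dot_plot("GATTACAGATTACACCCGT", window=4)
print(sorted({j - i for i, j in toy_forward}))  # diagonal offsets of repeats
```

In the toy sequence the forward dots fall on diagonals offset by the repeat period; on the real flanking sequences the same pattern would mark the tandem repeats that the gRNA sites avoid.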

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream sequence. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(1402bp) | A(25.04% 351) | C(20.26% 284) | T(27.32% 383) | G(27.39% 384)

Note: The 1402 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
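A sliding-window GC profile of the kind summarized above can be sketched as follows. The 300 bp default mirrors the window size stated in this report; the step size, function names, and toy input are assumptions for illustration only.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window of length `window`, advancing by `step`."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy example with a 4 bp window so the result is easy to check by hand.
profile = sliding_gc("GGCCAATT", window=4, step=2)
print(profile)
```

Windows with a GC fraction well above the rest of the profile would be the regions to avoid when placing PCR primers.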

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(22.05% 441) | C(22.2% 444) | T(32.95% 659) | G(22.8% 456)

Note: The 2000 bp section downstream of Exon 6 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
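The overall GC content of each flank follows directly from the base counts in the two Summary lines. The short check below uses only those counts; the function name is hypothetical.

```python
def gc_percent_from_counts(a, c, t, g):
    """Return (total length, GC percentage) from per-base counts."""
    total = a + c + t + g
    return total, 100.0 * (g + c) / total


# Counts copied from the upstream and downstream Summary lines.
up_total, up_gc = gc_percent_from_counts(a=351, c=284, t=383, g=384)
down_total, down_gc = gc_percent_from_counts(a=441, c=444, t=659, g=456)

print(f"upstream:   {up_total} bp, GC {up_gc:.2f}%")
print(f"downstream: {down_total} bp, GC {down_gc:.2f}%")
```

The counts sum to the stated lengths (1402 bp and 2000 bp), and both flanks come out below 50% GC overall.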

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1402   1      1402  1402   100.0%    chr11  +       121452230  121453631  1402
YourSeq  100    472    753   1402   88.2%     chr4   +       138372842  138373125  284
YourSeq  94     528    881   1402   86.2%     chr11  +       20862074   20862584   511
YourSeq  75     399    730   1402   73.8%     chr1   -       64278814   64279124   311
YourSeq  66     458    759   1402   90.3%     chr10  +       94660900   94661206   307
YourSeq  60     528    751   1402   87.5%     chrX   -       73984780   73985023   244
YourSeq  60     396    710   1402   92.9%     chr17  +       10851831   10852145   315
YourSeq  57     401    543   1402   81.7%     chr1   -       138574246  138574381  136
YourSeq  56     558    755   1402   93.8%     chr17  -       50081125   50081330   206
YourSeq  56     510    611   1402   79.2%     chr7   +       74526171   74526273   103
YourSeq  53     578    748   1402   82.2%     chr10  +       111565840  111566071  232
YourSeq  52     692    768   1402   96.6%     chr10  -       61148577   61148944   368
YourSeq  51     687    759   1402   90.5%     chr10  -       21105458   21105532   75
YourSeq  50     675    748   1402   94.7%     chr9   -       63177840   63177926   87
YourSeq  49     399    587   1402   67.2%     chr11  -       71649051   71649241   191
YourSeq  49     692    748   1402   96.3%     chrX   +       106511371  106511430  60
YourSeq  48     692    764   1402   92.9%     chr5   +       25203025   25203100   76
YourSeq  47     692    748   1402   96.1%     chr6   -       146480236  146480295  60
YourSeq  47     403    558   1402   88.4%     chr1   +       193448799  193448955  157
YourSeq  46     692    767   1402   96.1%     chr6   -       99744981   99745063   83

Note: The 1402 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  +       121461809  121463808  2000
YourSeq  101    1451   1880  2000   91.7%     chr5   -       145278787  145279229  443
YourSeq  95     1516   1880  2000   78.9%     chr6   -       38304173   38304461   289
YourSeq  93     1765   1918  2000   88.5%     chr15  -       98304401   98304731   331
YourSeq  90     1765   1881  2000   93.4%     chr11  -       4389769    4389888    120
YourSeq  87     1768   1880  2000   90.6%     chr5   +       143081447  143081564  118
YourSeq  86     1766   1880  2000   90.6%     chr11  -       120740250  120740368  119
YourSeq  85     1786   1880  2000   95.8%     chr11  +       98521974   98522072   99
YourSeq  85     1764   1879  2000   90.9%     chr1   +       136742885  136743014  130
YourSeq  84     1774   1880  2000   91.3%     chr1   +       5976561    5976671    111
YourSeq  82     1766   1878  2000   89.6%     chr10  +       86840966   86841112   147
YourSeq  81     1768   1880  2000   93.7%     chr1   -       86751225   86751343   119
YourSeq  81     1772   1880  2000   89.5%     chr1   +       72715641   72715756   116
YourSeq  79     1774   1880  2000   90.7%     chr11  +       105629963  105630077  115
YourSeq  78     1772   1880  2000   88.4%     chr7   -       44751834   44751949   116
YourSeq  78     1786   1880  2000   91.6%     chr9   +       57476569   57476667   99
YourSeq  78     1765   1880  2000   86.8%     chr6   +       148866876  148866997  122
YourSeq  78     1701   1880  2000   94.4%     chr10  +       7256971    7257339    369
YourSeq  77     1767   1880  2000   86.6%     chr8   -       111453094  111453213  120
YourSeq  76     1786   1880  2000   92.3%     chr9   -       23876626   23876724   99

Note: The 2000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
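The top hit in each BLAT table is the query sequence mapped back to its own locus on chr11, so the two sets of coordinates can be used to cross-check the flank lengths and the size of the interval between them. The sketch below uses only coordinates from the tables above and treats them as 1-based inclusive; the variable and function names are illustrative.

```python
# Self-hits (top row of each BLAT table): chromosome, start, end.
UPSTREAM = ("chr11", 121452230, 121453631)
DOWNSTREAM = ("chr11", 121461809, 121463808)


def span(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


upstream_len = span(UPSTREAM[1], UPSTREAM[2])
downstream_len = span(DOWNSTREAM[1], DOWNSTREAM[2])

# Bases strictly between the end of the upstream flank and the
# start of the downstream flank.
interval_between_flanks = DOWNSTREAM[1] - UPSTREAM[2] - 1

print(upstream_len, downstream_len, interval_between_flanks)
```

The interval between the two flanks comes out at 8177 bp, which matches the effective KO region size given in the strategy summary.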

Gene information: Tbcd, tubulin-specific chaperone d [Mus musculus (house mouse)]
Gene ID: 108903, updated on 12-Aug-2019

Gene summary

Official Symbol: Tbcd (provided by MGI)
Official Full Name: tubulin-specific chaperone d (provided by MGI)
Primary source: MGI:MGI:1919686
See related: Ensembl:ENSMUSG00000039230
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: mKIAA0988; 2310057L06Rik; A030005L14Rik
Expression: Ubiquitous expression in CNS E18 (RPKM 8.3), whole brain E14.5 (RPKM 7.8) and 28 other tissues
Orthologs: human

Genomic context

Location: 11; 11 E2
Exon count: 43

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (121451991..121617170)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (121313263..121478484)

[Figure: genomic context of Tbcd on Chromosome 11 - NC_000077.6.]

Transcript information: This gene has 8 transcripts

Gene: Tbcd ENSMUSG00000039230

Description: tubulin-specific chaperone d [Source:MGI Symbol;Acc:MGI:1919686]
Gene Synonyms: 2310057L06Rik, A030005L14Rik
Location: Chromosome 11: 121,451,949-121,617,164 forward strand. GRCm38:CM001004.2
About this gene: This gene has 8 transcripts (splice variants), 197 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Tbcd-201  ENSMUST00000103013.9  3906  1196aa      ENSMUSP00000099302.3  Protein coding           CCDS25778  Q8BYA0   TSL:1, GENCODE basic, APPRIS P1
Tbcd-202  ENSMUST00000106093.1  368   84aa        ENSMUSP00000101699.1  Protein coding           -          B1ATU0   CDS 3' incomplete, TSL:3
Tbcd-203  ENSMUST00000125167.7  944   72aa        ENSMUSP00000124735.1  Nonsense mediated decay  -          F6UI15   CDS 5' incomplete, TSL:5
Tbcd-208  ENSMUST00000155666.7  4832  No protein  -                     Retained intron          -          -        TSL:1
Tbcd-205  ENSMUST00000147470.1  716   No protein  -                     Retained intron          -          -        TSL:5
Tbcd-207  ENSMUST00000151666.1  1324  No protein  -                     lncRNA                   -          -        TSL:1
Tbcd-204  ENSMUST00000139414.7  1292  No protein  -                     lncRNA                   -          -        TSL:1
Tbcd-206  ENSMUST00000147560.7  699   No protein  -                     lncRNA                   -          -        TSL:5

[Figure: Ensembl genome browser view of the Tbcd locus, 185.22 kb, 121.45Mb to 121.60Mb. Forward strand genes (Comprehensive set): Fn3k-201, Fn3k-202, Fn3k-203 (protein coding); Tbcd-201, Tbcd-202 (protein coding); Tbcd-203 (nonsense mediated decay); Tbcd-204, Tbcd-206, Tbcd-207 (lncRNA); Tbcd-205, Tbcd-208 (retained intron). Contigs: AL663088.10, AL645972.10. Reverse strand genes: Zfp750-201, B3gntl1-202 (protein coding). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000103013

[Figure: Tbcd-201 (protein coding), 165.22 kb, forward strand, with protein annotations for ENSMUSP00000099... Tracks: Low complexity (Seg); Superfamily: Armadillo-type fold; Pfam: Tubulin-specific chaperone D, C-terminal; PANTHER: Tubulin-folding cofactor D; Gene3D: Armadillo-like helical; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 1196.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
