http://www.alphaknockout.com/ Mouse Tub Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tub conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tub gene (NCBI Reference Sequence: NM_021885; Ensembl: ENSMUSG00000031028) is located on mouse chromosome 7. Twelve exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 12 (Transcript: ENSMUST00000033341). Exons 2~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tub gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-425C14 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygous mutants exhibit late-developing obesity with hyperinsulinemia, retinal degeneration, and hearing loss associated with death of both outer and inner hair cells.

The knockout of exons 2~4 will result in a frameshift of the gene and covers 23.7% of the coding region. The size of intron 1 for 5'-loxP site insertion is 9378 bp, and the size of intron 4 for 3'-loxP site insertion is 1736 bp. The size of the effective cKO region is ~2736 bp. This strategy is designed based on genetic information in existing databases. Due to the complexity of biological processes, the risks that loxP insertion poses to gene transcription, RNA splicing and translation cannot all be predicted at the existing technological level.
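The frameshift reasoning above is simple arithmetic: removing coding sequence shifts the reading frame whenever the number of deleted coding bases is not a multiple of three. The sketch below illustrates that check. The exon lengths and CDS length are toy placeholders, not values taken from the Ensembl annotation of Tub, and the function names are hypothetical.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame unless its coding length is a multiple of 3."""
    return deleted_coding_bp % 3 != 0


def coding_fraction(deleted_coding_bp: int, cds_bp: int) -> float:
    """Percentage of the coding sequence removed by the deletion."""
    return 100.0 * deleted_coding_bp / cds_bp


# Toy values for illustration only (NOT the real Tub exon lengths).
toy_exon_coding_bp = {2: 120, 3: 95, 4: 89}
toy_cds_bp = 1500

deleted = sum(toy_exon_coding_bp.values())
print(f"deleted coding bases: {deleted}")
print(f"frameshift: {causes_frameshift(deleted)}")
print(f"coding fraction: {coding_fraction(deleted, toy_cds_bp):.1f}%")
```

With the real exon coordinates of ENSMUST00000033341 substituted for the toy values, the same two functions would reproduce the frameshift call and the coding-region percentage quoted above.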


Overview of the Targeting Strategy

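[Figure: schematic of the targeting strategy, drawn 5' to 3'. Panels: wildtype allele (exons 1, 2, 3, 4, 5 ... 12, with the two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Tub; homology arm; cKO region; loxP site.]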
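The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model: Cre recombination between two loxP sites in the same orientation removes the intervening segment and leaves a single loxP site. The sketch below assumes the canonical 34 bp loxP sequence; the flanking and floxed sequences are placeholders, and the function name is hypothetical.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Model Cre excision between the first two same-orientation loxP sites.

    Returns the allele unchanged if fewer than two sites are present.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to and including the first site, then resume after the second.
    return allele[: first + len(site)] + allele[second + len(site):]


# Placeholder sequences standing in for the homology arms and the cKO region.
five_prime_arm = "AAAA"
cko_region = "CCCC"
three_prime_arm = "GGGG"

targeted = five_prime_arm + LOXP + cko_region + LOXP + three_prime_arm
ko = cre_excise(targeted)
print(ko == five_prime_arm + LOXP + three_prime_arm)
```

The model ignores site orientation and recombination efficiency; it only shows which segment is lost.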


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
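A self dot plot of this kind can be approximated by indexing every window of the sequence and reporting pairs of positions that share the same window, on the forward strand or as reverse complements. The sketch below is a minimal exact-match version, assuming a plain A/C/G/T string; the function names are hypothetical and it is not the tool used to produce the figure.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: seq[i:i+window] == seq[j:j+window] with i < j (diagonal excluded).
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window].
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward = sorted(
        (i, j) for positions in index.values()
        for i in positions for j in positions if i < j
    )

    rc = reverse_complement(seq)
    reverse = []
    for k in range(len(rc) - window + 1):
        # Window k of the reverse complement covers forward-strand start j.
        j = len(seq) - window - k
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, j))
    return forward, sorted(reverse)


# Toy sequence with one 10 bp direct repeat separated by a short spacer.
toy = "ACGTTGCAAG" + "TTTT" + "ACGTTGCAAG"
fwd, rev = self_dot_plot(toy, window=10)
print(fwd)
```

Runs of off-diagonal forward hits at a constant offset would indicate a tandem or direct repeat; an empty list, as the note reports for the real sequence, indicates none at this window size.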

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(9237bp) | A(23.01% 2125) | C(24.63% 2275) | G(26.86% 2481) | T(25.51% 2356)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  Q-START  Q-END  QSIZE  IDENTITY  CHROM  STRAND  T-START    T-END      SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1        3000   3000   100.0%    chr7   +       109017268  109020267  3000
YourSeq  205    1654     1973   3000   83.4%     chr16  -       84722230   84722532   303
YourSeq  200    1656     1936   3000   88.1%     chr18  -       78490992   78491294   303
YourSeq  199    1654     1979   3000   85.1%     chr2   -       25861070   25861392   323
YourSeq  198    1656     1973   3000   84.9%     chr13  -       92544490   92544793   304
YourSeq  198    1631     1934   3000   90.7%     chr1   +       157615106  157615430  325
YourSeq  196    1656     1978   3000   88.6%     chr9   +       52569946   52570269   324
YourSeq  194    1656     1985   3000   85.8%     chr17  +       69360010   69360339   330
YourSeq  193    1655     1954   3000   86.8%     chr14  +       14092296   14092593   298
YourSeq  193    1656     1936   3000   85.4%     chr10  +       58841555   58841856   302
YourSeq  192    1655     1975   3000   81.9%     chr11  +       62052267   62052569   303
YourSeq  191    1656     1979   3000   84.6%     chr10  -       63180567   63180892   326
YourSeq  191    1656     1975   3000   86.5%     chr9   +       118650166  118650488  323
YourSeq  189    1656     1975   3000   84.0%     chr11  -       113606390  113606697  308
YourSeq  189    1655     1948   3000   88.3%     chr14  +       73012704   73013005   302
YourSeq  187    1660     1936   3000   88.5%     chr8   -       26930649   26930949   301
YourSeq  187    1664     1936   3000   88.2%     chr12  -       30550079   30550373   295
YourSeq  187    1656     1927   3000   86.7%     chr17  +       31354603   31354915   313
YourSeq  184    1664     1925   3000   84.4%     chr18  -       62406588   62406846   259
YourSeq  183    1656     1936   3000   87.9%     chr11  -       82013220   82013490   271

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  Q-START  Q-END  QSIZE  IDENTITY  CHROM  STRAND  T-START    T-END      SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1        3000   3000   100.0%    chr7   +       109023005  109026004  3000
YourSeq  378    619      1019   3000   96.8%     chr7   +       109023796  109024195  400
YourSeq  378    792      1191   3000   97.5%     chr7   +       109023623  109024023  401
YourSeq  284    903      1191   3000   99.4%     chr7   +       109023619  109023908  290
YourSeq  213    619      840    3000   97.3%     chr7   +       109023968  109024188  221
YourSeq  167    1022     1191   3000   99.5%     chr7   +       109023623  109023793  171
YourSeq  103    1088     1191   3000   100.0%    chr7   +       109023632  109023736  105
YourSeq  61     2805     2929   3000   89.5%     chr6   -       89296106   89296265   160
YourSeq  55     637      996    3000   66.7%     chr8   -       46482153   46482233   81
YourSeq  54     697      1113   3000   68.4%     chr12  -       104644841  104645057  217
YourSeq  54     639      998    3000   70.0%     chr12  -       104644841  104645057  217
YourSeq  54     619      674    3000   94.6%     chr7   +       109024141  109024195  55
YourSeq  49     2811     2891   3000   88.9%     chr14  -       66858854   66858938   85
YourSeq  48     2841     2968   3000   92.9%     chr15  -       97624065   97624209   145
YourSeq  47     737      1155   3000   63.5%     chr8   +       19559612   19559811   200
YourSeq  47     852      1040   3000   94.3%     chr8   +       19559612   19559811   200
YourSeq  40     1025     1164   3000   64.6%     chr4   +       131747939  131748005  67
YourSeq  39     2844     2895   3000   91.5%     chr8   -       115172881  115172933  53
YourSeq  38     2805     2894   3000   89.8%     chr19  +       54117206   54117297   92
YourSeq  36     2839     2902   3000   95.0%     chr1   -       8133135    8133213    79

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
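BLAT result rows of this shape are easy to parse for further filtering. The sketch below is a minimal parser, assuming whitespace-separated rows as they appear in the raw BLAT listing (with the leading "browser details" link text); the three sample rows are copied from the upstream results, while the `BlatHit` class and the self-hit heuristic (score equal to query size at 100% identity) are assumptions for illustration.

```python
from dataclasses import dataclass

RAW_ROWS = """\
browser details YourSeq 3000 1 3000 3000 100.0% chr7 + 109017268 109020267 3000
browser details YourSeq 205 1654 1973 3000 83.4% chr16 - 84722230 84722532 303
browser details YourSeq 200 1656 1936 3000 88.1% chr18 - 78490992 78491294 303
"""


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def is_self_hit(self) -> bool:
        # Heuristic (assumption): a full-length, perfect match is the query's own locus.
        return self.score == self.q_size and self.identity == 100.0


def parse_blat_row(line: str) -> BlatHit:
    fields = line.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]  # drop the link text left over from the web page
    q, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(q, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


hits = [parse_blat_row(line) for line in RAW_ROWS.splitlines() if line.strip()]
secondary = [h for h in hits if not h.is_self_hit]
for h in secondary:
    print(h.chrom, h.score, f"{h.identity}%", f"query {h.q_start}-{h.q_end}")
```

Sorting or thresholding `secondary` by score, identity, or query interval is then a one-line list comprehension; any cutoff for "significant" similarity would be a choice of the analyst, since the report does not state one.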

Gene and protein information:

Tub tubby bipartite transcription factor [ Mus musculus (house mouse) ]

Gene ID: 22141, updated on 3-Jan-2021

Gene summary

Official Symbol: Tub (provided by MGI)
Official Full Name: tubby bipartite transcription factor (provided by MGI)
Primary source: MGI:MGI:2651573
See related: Ensembl:ENSMUSG00000031028
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: rd; rd5; tub
Expression: Biased expression in CNS E18 (RPKM 40.9), whole brain E14.5 (RPKM 30.4) and 8 other tissues
Orthologs: human; all

Genomic context

Location: 7 E3; 7 57.21 cM

Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
109                 current            GRCm39 (GCF_000001635.27)     7    NC_000073.7 (108610087..108633666)
108.20200622        previous assembly  GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (109010880..109034459)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (116154394..116177973)
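The three location strings describe the same gene on different assemblies, so their spans should agree. The sketch below checks this, assuming the NCBI convention of 1-based inclusive coordinates written as `start..end`; the helper name is hypothetical.

```python
import re

# Location strings as listed in the annotation table.
LOCATIONS = {
    "GRCm39": "NC_000073.7 (108610087..108633666)",
    "GRCm38.p6": "NC_000073.6 (109010880..109034459)",
    "MGSCv37": "NC_000073.5 (116154394..116177973)",
}


def span_bp(location: str) -> int:
    """Length in bp of a 'ACCESSION (start..end)' interval, both ends inclusive."""
    match = re.search(r"\((\d+)\.\.(\d+)\)", location)
    if match is None:
        raise ValueError(f"no start..end interval in {location!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1


for assembly, location in LOCATIONS.items():
    print(f"{assembly}: {span_bp(location)} bp")
```

All three intervals span 23580 bp, so the gene length is unchanged between assemblies and only its offset on chromosome 7 differs.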

Chromosome 7 - NC_000073.7


Transcript information: This gene has 5 transcripts

Gene: Tub ENSMUSG00000031028

Description: tubby bipartite transcription factor [Source:MGI Symbol;Acc:MGI:2651573]
Gene Synonyms: rd5, tub
Location: Chromosome 7: 108,950,338-109,034,460 forward strand. GRCm38:CM001000.2
About this gene: This gene has 5 transcripts (splice variants), 297 orthologues, 5 paralogues and is associated with 41 phenotypes.

Transcripts

Name     Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt Match   Flags
Tub-201  ENSMUST00000033341.11  5997  505aa       ENSMUSP00000033341.5  Protein coding   CCDS21732  P50586 Q4VA41   TSL:1, GENCODE basic, APPRIS P1
Tub-202  ENSMUST00000119474.1   2097  459aa       ENSMUSP00000113580.1  Protein coding   -          A0A0R4J1P1      TSL:1, GENCODE basic
Tub-204  ENSMUST00000207583.1   772   167aa       ENSMUSP00000146894.1  Protein coding   -          Q3UEX1          CDS 3' incomplete, TSL:2
Tub-205  ENSMUST00000208442.1   1153  No protein  -                     Retained intron  -          -               TSL:NA
Tub-203  ENSMUST00000147943.1   747   No protein  -                     Retained intron  -          -               TSL:5

[Figure: Ensembl genome browser view of a 104.12 kb region of chromosome 7 (108.96 Mb to 109.04 Mb). Forward strand: Eif3f-201 (protein coding), Eif3f-202 and Eif3f-203 (retained intron), Tub-201, Tub-202 and Tub-204 (protein coding), Tub-203 and Tub-205 (retained intron). Contig: AC167362.2. Reverse strand: Ric3-201 and Ric3-202 (protein coding), Ric3-206 (nonsense mediated decay). Regulatory Build track legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding) and non-protein coding (processed transcript).]


Transcript: ENSMUST00000033341

[Figure: Ensembl protein summary for Tub-201 (protein coding; 23.64 kb, forward strand), protein ENSMUSP00000033..., with a scale bar from 0 to 505 aa. Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily (Tubby-like, C-terminal); Prints (Tubby, N-terminal; Tubby, C-terminal); Pfam (Tubby, N-terminal; Tubby, C-terminal); PROSITE patterns (Tubby, C-terminal, conserved site, two features); PANTHER (PTHR16517:SF20; PTHR16517); Gene3D (Tubby-like, C-terminal); sequence variants from dbSNP and all other sources (legend: missense variant, synonymous variant).]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
