https://www.alphaknockout.com

Mouse Tube1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tube1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tube1 gene (NCBI Reference Sequence: NM_028006.2; Ensembl: ENSMUSG00000019845) is located on mouse chromosome 10. Twelve exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000019991). Exon 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tube1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-92H12 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 4 starts from about 10.74% of the coding region. The knockout of exon 4 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion is 2093 bp, and the size of intron 4 for 3'-loxP site insertion is 3631 bp. The size of the effective cKO region is ~558 bp. The cKO region does not contain any other known gene.
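The ~558 bp figure can be cross-checked against the chr10 coordinates reported in the BLAT search results of this report. The sketch below assumes that the two 3000 bp BLAT-searched sections directly abut the cKO region and that the coordinates are 1-based and inclusive; the function name gap_length is illustrative only.

```python
# Coordinates copied from the chr10 self-hits in the BLAT results of this report.
# Assumption: 1-based, inclusive coordinates on the + strand.
UPSTREAM_END = 39136884      # last base of the 3000 bp upstream section
DOWNSTREAM_START = 39137443  # first base of the 3000 bp downstream section


def gap_length(left_end: int, right_start: int) -> int:
    """Number of bases lying strictly between two inclusive coordinates."""
    return right_start - left_end - 1


cko_size = gap_length(UPSTREAM_END, DOWNSTREAM_START)
print(f"Effective cKO region: {cko_size} bp")
```

Under these assumptions the gap between the two sections matches the stated size of the effective cKO region.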


Overview of the Targeting Strategy

(Figure: schematic of four alleles, with exons 1, 3, 4 and 12 labeled.)

Wildtype allele (with 5' and 3' gRNA regions)
Targeting vector
Targeted allele
Constitutive KO allele (after Cre recombination)

Legend: exon of mouse Tube1; homology arm; cKO region; loxP site
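The step from the targeted allele to the constitutive KO allele can be illustrated on a toy sequence. This is a minimal sketch, not the project's actual sequence: the flanking and floxed bases are made up, the function name cre_excise is hypothetical, and only two loxP sites in the same orientation are modeled. The 34 bp string is the canonical loxP site.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Remove the segment between two same-orientation loxP sites.

    One loxP site remains in the product, as in the constitutive KO allele.
    The allele is returned unchanged if fewer than two sites are found.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to the end of the first site, then resume after the second.
    return allele[: first + len(site)] + allele[second + len(site):]


# Toy targeted allele: made-up flanks around a made-up floxed segment.
targeted = "AAAA" + LOXP + "GGGGCCCC" + LOXP + "TTTT"
ko_allele = cre_excise(targeted)
print(len(targeted), len(ko_allele))
```

The product is shorter than the targeted allele by the floxed segment plus one loxP site.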


Overview of the Dot Plot

Window size: 10 bp

(Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.)

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
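A self-alignment of this kind can be sketched as exact word matching with a fixed window. The code below is an illustrative approximation, not the tool used to produce the dot plot: it reports pairs of positions whose 10 bp windows are identical (forward) or reverse-complementary (reverse). The function names and the toy sequence are assumptions.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_matches(seq: str, window: int = 10):
    """Return (forward, reverse) lists of position pairs (i, j).

    forward: windows at i and j are identical, with i < j.
    reverse: window at i equals the reverse complement of window at j, with i <= j.
    """
    index = defaultdict(list)
    last = len(seq) - window + 1
    for i in range(last):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for j in range(last):
        word = seq[j:j + window]
        forward.extend((i, j) for i in index.get(word, []) if i < j)
        reverse.extend((i, j) for i in index.get(revcomp(word), []) if i <= j)
    return forward, reverse


# Toy sequence: a made-up 10 bp unit repeated once, separated by a spacer.
unit = "ACGGTCATTG"
forward, reverse = self_matches(unit + "TTTTT" + unit, window=10)
print(forward, reverse)
```

Many closely spaced forward pairs lying along a diagonal would indicate a tandem repeat; the report states that none is significant for this sequence.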

Overview of the GC Content Distribution

Window size: 300 bp

(Figure: GC content distribution along Sequence 12.)

Summary: Full Length(7058bp) | A(26.49% 1870) | C(21.28% 1502) | T(27.98% 1975) | G(24.24% 1711)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Significant high GC-content regions are found. It may be difficult to construct this targeting vector.
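The overall GC content follows directly from the base counts in the summary line. The sketch below recomputes it and adds a hypothetical sliding-window helper (windowed_gc, with a 300 bp default window as in the plot) that could be applied to the actual sequence, which is not included in this report.

```python
# Base counts copied from the summary line of this report.
counts = {"A": 1870, "C": 1502, "T": 1975, "G": 1711}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(f"Length: {total} bp, overall GC: {gc_percent:.2f}%")


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each full window along an uppercase DNA string."""
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]
```

The overall value is an average, so it does not rule out the locally GC-rich windows mentioned in the note.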


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  +       39133885   39136884   3000
YourSeq  46     479    546   3000   94.3%     chr18  +       38454951   38455021   71
YourSeq  32     480    552   3000   94.5%     chr16  +       13788064   13788137   74
YourSeq  30     479    512   3000   94.2%     chr19  -       17591948   17591981   34
YourSeq  29     2854   2890  3000   93.8%     chr3   -       154436544  154436582  39
YourSeq  28     479    508   3000   96.7%     chr3   +       28788773   28788802   30
YourSeq  28     479    508   3000   96.7%     chr10  +       36963902   36963931   30
YourSeq  27     481    509   3000   96.6%     chr6   +       137712150  137712178  29
YourSeq  27     498    547   3000   89.7%     chr4   +       29303570   29303618   49
YourSeq  27     479    509   3000   93.6%     chr13  +       98689331   98689361   31
YourSeq  26     478    509   3000   86.7%     chr5   -       31081166   31081196   31
YourSeq  26     1683   1717  3000   96.5%     chr2   -       51413869   51413904   36
YourSeq  24     481    514   3000   85.3%     chr7   -       9182606    9182639    34
YourSeq  24     481    514   3000   85.3%     chr7   -       9379231    9379264    34
YourSeq  24     481    514   3000   85.3%     chr7   -       9491121    9491154    34
YourSeq  23     482    514   3000   84.9%     chr7   -       8309936    8309968    33
YourSeq  22     532    559   3000   89.3%     chr12  -       16813884   16813911   28
YourSeq  21     2055   2077  3000   95.7%     chr4   -       73864475   73864497   23
YourSeq  21     483    503   3000   100.0%    chr10  +       61072906   61072926   21
YourSeq  21     1882   1908  3000   88.9%     chr1   +       149480001  149480027  27

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  +       39137443   39140442   3000
YourSeq  199    2370   2770  3000   89.7%     chr11  +       6071762    6072149    388
YourSeq  193    2399   2769  3000   93.7%     chr5   +       25015305   25015722   418
YourSeq  167    2455   2769  3000   93.3%     chr5   +       25015305   25015638   334
YourSeq  155    2399   2767  3000   93.3%     chr15  -       36202192   36202793   602
YourSeq  154    2421   2723  3000   87.4%     chr5   +       25015453   25015701   249
YourSeq  141    2427   2770  3000   86.1%     chr11  +       6071785    6072069    285
YourSeq  136    2404   2767  3000   91.0%     chr11  +       120742926  120743878  953
YourSeq  134    2523   2769  3000   87.7%     chr5   +       25015389   25015617   229
YourSeq  129    2404   2768  3000   91.7%     chr11  +       120743012  120743849  838
YourSeq  108    2406   2717  3000   92.8%     chr1   -       16709777   16710138   362
YourSeq  98     2339   2725  3000   89.3%     chr11  +       6710339    6710917    579
YourSeq  94     2460   2768  3000   81.3%     chr11  +       120743143  120743401  259
YourSeq  85     2455   2607  3000   93.9%     chr5   +       25015453   25015721   269
YourSeq  85     2637   2770  3000   86.8%     chr11  +       6070995    6071118    124
YourSeq  78     2533   2769  3000   78.5%     chr11  +       6071431    6071544    114
YourSeq  75     1476   1697  3000   90.4%     chrX   +       42474584   42474807   224
YourSeq  72     2425   2607  3000   92.9%     chr11  +       120743477  120743864  388
YourSeq  72     2405   2770  3000   74.7%     chr11  +       6071785    6071951    167
YourSeq  71     2546   2769  3000   78.5%     chr5   +       25015558   25015701   144

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
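One way to inspect secondary BLAT hits is to rank them by score and express the aligned query span as a fraction of the 3000 bp query. The sketch below does this for the first three downstream rows. The Hit type and query_coverage helper are hypothetical, and no significance threshold is applied because the report does not state one.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str


def query_coverage(hit: Hit) -> float:
    """Percentage of the query spanned by the hit (inclusive coordinates)."""
    return 100 * (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the downstream BLAT results in this report.
hits = [
    Hit(3000, 1, 3000, 3000, 100.0, "chr10"),
    Hit(199, 2370, 2770, 3000, 89.7, "chr11"),
    Hit(193, 2399, 2769, 3000, 93.7, "chr5"),
]

# The top-scoring hit is the query's own locus; the rest are secondary hits.
secondary = sorted(hits, key=lambda h: h.score, reverse=True)[1:]
for hit in secondary:
    print(hit.chrom, hit.score, f"{query_coverage(hit):.1f}% of query")
```

In these rows the secondary hits span only a small fraction of the query, near its 3' end.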


Gene information: Tube1 tubulin, epsilon 1 [Mus musculus (house mouse)]
Gene ID: 71924, updated on 26-Jun-2020

Gene summary

Official Symbol: Tube1 (provided by MGI)
Official Full Name: tubulin, epsilon 1 (provided by MGI)
Primary source: MGI:MGI:1919174
See related: Ensembl:ENSMUSG00000019845
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Tube; AI551343; 2310061K05Rik
Expression: Biased expression in thymus adult (RPKM 18.3), CNS E11.5 (RPKM 7.0) and 10 other tissues
Orthologs: human; all

Genomic context

Location: 10; 10 B1

Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (39133949..39151058)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (38853829..38870864)

Chromosome 10 - NC_000076.6


Transcript information: This gene has 6 transcripts

Gene: Tube1 ENSMUSG00000019845

Description: tubulin, epsilon 1 [Source:MGI Symbol;Acc:MGI:1919174]
Gene Synonyms: 2310061K05Rik
Location: Chromosome 10: 39,133,976-39,152,542 forward strand. GRCm38:CM001003.2
About this gene: This gene has 6 transcripts (splice variants), 286 orthologues, 20 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Tube1-201  ENSMUST00000019991.7  2929  475aa       ENSMUSP00000019991.7  Protein coding           CCDS23787  Q9D6T1      TSL:1, GENCODE basic, APPRIS P1
Tube1-203  ENSMUST00000213459.1  4192  216aa       ENSMUSP00000150602.1  Nonsense mediated decay  -          A0A1L1SU34  TSL:1
Tube1-204  ENSMUST00000213898.1  1562  No protein  -                     Processed transcript     -          -           TSL:1
Tube1-205  ENSMUST00000214493.1  3525  No protein  -                     Retained intron          -          -           TSL:1
Tube1-206  ENSMUST00000217214.1  1430  No protein  -                     Retained intron          -          -           TSL:1
Tube1-202  ENSMUST00000213237.1  695   No protein  -                     Retained intron          -          -           TSL:3


(Figure: Ensembl genomic region view, 38.57 kb, chromosome 10 from 39.13 Mb to 39.16 Mb.)

Forward strand: Tube1-201 (protein coding); Tube1-202, Tube1-205, Tube1-206 (retained intron); Tube1-203 (nonsense mediated decay); Tube1-204 (processed transcript)
Contigs: AC153958.2
Reverse strand: Ccn6-201 (protein coding); Fam229b-201, -202, -203, -204, -205, -208, -209 (protein coding); Fam229b-210 (retained intron)
Regulatory Build legend: CTCF; Promoter; Promoter Flank
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript)


Transcript: ENSMUST00000019991

(Figure: Ensembl protein domain view of Tube1-201, protein coding, 17.04 kb on the forward strand; protein ENSMUSP00000019..., scale bar 0 to 475 aa.)

Low complexity (Seg)
Superfamily: Tubulin/FtsZ, GTPase domain superfamily; Tubulin/FtsZ, C-terminal
SMART: Tubulin/FtsZ, GTPase domain; Tubulin/FtsZ, 2-layer sandwich domain
Prints: Tubulin; Epsilon tubulin
Pfam: Tubulin/FtsZ, GTPase domain; Tubulin/FtsZ, 2-layer sandwich domain
PROSITE patterns: Tubulin, conserved site
PANTHER: Epsilon tubulin; Tubulin
Gene3D: Tubulin/FtsZ, GTPase domain superfamily; Tubulin, C-terminal
CDD: cd02190
Sequence variants (dbSNP and all other sources); variant legend: missense variant, synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
