https://www.alphaknockout.com

Mouse Tube1 Knockout Project (CRISPR/Cas9)

Objective: To create a Tube1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tube1 gene (NCBI Reference Sequence: NM_028006; Ensembl: ENSMUSG00000019845) is located on mouse chromosome 10. Twelve exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000019991). Exons 5~9 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 14.81% of the coding region. Exons 5~9 cover 52.14% of the coding region. The size of the effective KO region is ~4891 bp. The KO region does not contain any other known gene.
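The percentages in the note can be converted into approximate base-pair counts. The sketch below assumes a coding length of 1425 bp (475 codons x 3, derived from the 475 aa protein listed for ENSMUST00000019991, stop codon excluded). That length and the helper name `percent_to_bp` are illustrative assumptions, not values stated in this report.

```python
# Assumed coding length: 475 aa x 3 bp, stop codon excluded.
# This figure is an assumption for illustration; the report does not state it.
CDS_LENGTH_BP = 475 * 3


def percent_to_bp(percent, total_bp=CDS_LENGTH_BP):
    """Convert a percentage of the coding region into a rounded bp count."""
    return round(percent / 100 * total_bp)


upstream_coding_bp = percent_to_bp(14.81)  # coding bases before exon 5
deleted_coding_bp = percent_to_bp(52.14)   # coding bases within exons 5~9

# Remainder of the deleted coding length modulo the codon size.
frame_remainder = deleted_coding_bp % 3

print(upstream_coding_bp, deleted_coding_bp, frame_remainder)
```

Under these assumptions, a non-zero remainder would suggest that splicing across the deletion shifts the reading frame. This should be confirmed against the annotated exon coordinates rather than taken from the rounded percentages.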


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 5, 6, 7, 8, 9 and 12, with a gRNA region on each side of the knockout region. Legend: exon of mouse Tube1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot showing forward and reverse-complement matches; axis label: Sequence 12]

Note: The 2000 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot showing forward and reverse-complement matches; axis label: Sequence 12]

Note: The 1598 bp section downstream of Exon 9 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
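The self-alignment idea behind the dot plots can be sketched as follows. This is a minimal illustration that reports exact forward-strand matches only, using a word length equal to the window size. The scoring used by the actual dot plot tool is not described in this report, and the function name `self_dot_plot` and the toy sequence are hypothetical.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j where the window-length
    words starting at positions i and j are identical (forward strand)."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for starts in positions.values():
        # Every pair of occurrences of the same word is an off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy sequence with a 4 bp unit repeated in tandem, followed by unique bases.
toy = "ACGTACGTACGT" + "TTGACCA"
hits = self_dot_plot(toy, window=4)
print(hits)
```

Off-diagonal pairs that line up at a constant offset indicate a tandem repeat. An empty result corresponds to a plot with only the main diagonal.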


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12]

Summary: Full Length(2000bp) | A(28.15% 563) | C(19.0% 380) | T(28.2% 564) | G(24.65% 493)

Note: The 2000 bp section upstream of Exon 5 is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12]

Summary: Full Length(1598bp) | A(28.1% 449) | C(18.15% 290) | T(32.23% 515) | G(21.53% 344)

Note: The 1598 bp section downstream of Exon 9 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
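The overall GC content of each flank follows directly from the base counts in the two summaries. The sketch below computes it, together with a simple sliding-window profile of the kind plotted above. The function names and the default step size are assumptions for illustration, and the windowing used by the original tool may differ.

```python
def gc_percent(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """GC percentage for each window of the given size along seq."""
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(100 * gc / window)
    return profile


# Base counts copied from the two summaries above.
upstream = {"A": 563, "C": 380, "T": 564, "G": 493}    # 2000 bp
downstream = {"A": 449, "C": 290, "T": 515, "G": 344}  # 1598 bp

print(round(gc_percent(upstream), 2), round(gc_percent(downstream), 2))
```

With the counts above, the upstream section comes out at roughly 43.7% GC overall and the downstream section at roughly 39.7%. The overall figure can hide local peaks, which is why the windowed profile is the basis for the notes.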


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr10  +        39138824   39140823    2000
YourSeq    199    989  1389   2000     89.7%  chr11  +         6071762    6072149     388
YourSeq    193   1018  1388   2000     93.7%  chr5   +        25015305   25015722     418
YourSeq    167   1074  1388   2000     93.3%  chr5   +        25015305   25015638     334
YourSeq    154   1040  1342   2000     87.4%  chr5   +        25015453   25015701     249
YourSeq    145   1040  1386   2000     94.0%  chr15  -        36202276   36202751     476
YourSeq    141   1046  1389   2000     86.1%  chr11  +         6071785    6072069     285
YourSeq    136   1023  1386   2000     91.0%  chr11  +       120742926  120743878     953
YourSeq    134   1142  1388   2000     87.7%  chr5   +        25015389   25015617     229
YourSeq    129   1023  1387   2000     91.7%  chr11  +       120743012  120743849     838
YourSeq    112   1016  1336   2000     92.5%  chr1   -        16709777   16710188     412
YourSeq    106   1142  1386   2000     94.2%  chr15  -        36202234   36202542     309
YourSeq    106   1096  1294   2000     94.2%  chr15  -        36202192   36202500     309
YourSeq     98    958  1344   2000     89.3%  chr11  +         6710339    6710917     579
YourSeq     94   1079  1387   2000     81.3%  chr11  +       120743143  120743401     259
YourSeq     88   1163  1381   2000     83.4%  chr15  +        97617751   97617954     204
YourSeq     85   1074  1226   2000     93.9%  chr5   +        25015453   25015721     269
YourSeq     85   1256  1389   2000     86.8%  chr11  +         6070995    6071118     124
YourSeq     78   1152  1388   2000     78.5%  chr11  +         6071431    6071544     114
YourSeq     75     95   316   2000     90.4%  chrX   +        42474584   42474807     224

Note: The 2000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   1598      1  1598   1598    100.0%  chr10  +        39145715   39147312    1598
YourSeq    137    599  1346   1598     79.7%  chr1   +        93631870   93632385     516
YourSeq    124    599  1161   1598     89.3%  chr12  +        75605647   75606212     566
YourSeq    120    993  1159   1598     92.2%  chr12  +         8548647    8548820     174
YourSeq    119   1008  1165   1598     87.6%  chr6   -       115846541  115846695     155
YourSeq    119    993  1161   1598     90.5%  chr9   +       114082885  114083053     169
YourSeq    118    993  1166   1598     84.7%  chr1   -        75295350   75295505     156
YourSeq    114    599  1165   1598     79.8%  chrX   +       103365505  103365839     335
YourSeq    114   1006  1165   1598     88.7%  chr13  +        86976773   86976928     156
YourSeq    111   1033  1165   1598     93.1%  chr16  +        18792045   18792184     140
YourSeq    109   1018  1157   1598     92.4%  chr13  -       113493627  113493774     148
YourSeq    108    599  1144   1598     76.9%  chr5   -        53606180   53606426     247
YourSeq    108   1018  1162   1598     90.3%  chr1   -       159828360  159828509     150
YourSeq    108   1019  1162   1598     91.6%  chr13  +        62740242   62740387     146
YourSeq    107   1006  1161   1598     90.3%  chr11  +        96987326   97105683  118358
YourSeq    104    993  1124   1598     91.5%  chr4   -       108091356  108091490     135
YourSeq    104   1018  1161   1598     89.3%  chr15  +        83610636   83610783     148
YourSeq    102   1023  1144   1598     94.8%  chr3   -        87532264   87532386     123
YourSeq    102   1018  1144   1598     91.9%  chr11  +        59246822   59246955     134
YourSeq    101   1039  1167   1598     91.1%  chr6   -        95554346   95554475     130

Note: The 1598 bp section downstream of Exon 9 is BLAT searched against the genome. No significant similarity is found.
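The chr10 coordinates of the two 100% identity hits can be cross-checked against the KO region size given in the strategy summary. The sketch below assumes that the coordinates are 1-based and inclusive and that the two flanking sections directly abut the KO region. The second assumption is not stated explicitly in this report, and the helper names are hypothetical.

```python
# Full-length (100% identity) BLAT hits on chr10, copied from the results above.
upstream_flank = (39138824, 39140823)    # 2000 bp section upstream of exon 5
downstream_flank = (39145715, 39147312)  # 1598 bp section downstream of exon 9


def span(interval):
    """Length of a 1-based, inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(left, right):
    """Number of bases lying strictly between two non-overlapping intervals."""
    return right[0] - left[1] - 1


ko_region_bp = gap_between(upstream_flank, downstream_flank)
print(span(upstream_flank), span(downstream_flank), ko_region_bp)
```

The gap between the two flanks is 4891 bases, which agrees with the effective KO region size of ~4891 bp quoted in the strategy summary.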


Gene information: Tube1 tubulin, epsilon 1 [Mus musculus (house mouse)] Gene ID: 71924, updated on 10-Oct-2019

Gene summary

Official Symbol: Tube1 (provided by MGI)
Official Full Name: tubulin, epsilon 1 (provided by MGI)
Primary source: MGI:MGI:1919174
See related: Ensembl:ENSMUSG00000019845
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Tube; AI551343; 2310061K05Rik
Expression: Biased expression in thymus adult (RPKM 18.3), CNS E11.5 (RPKM 7.0) and 10 other tissues
Orthologs: human, all

Genomic context

Location: 10; 10 B1
Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (39133949..39151058)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (38853829..38870864)

Chromosome 10 - NC_000076.6


Transcript information: This gene has 6 transcripts

Gene: Tube1 ENSMUSG00000019845

Description: epsilon-tubulin 1 [Source:MGI Symbol;Acc:MGI:1919174]
Gene Synonyms: 2310061K05Rik
Location: Chromosome 10: 39,133,976-39,152,542 forward strand. GRCm38:CM001003.2
About this gene: This gene has 6 transcripts (splice variants), 215 orthologues, 20 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Tube1-201  ENSMUST00000019991.7  2929  475aa       ENSMUSP00000019991.7  Protein coding           CCDS23787  Q9D6T1      TSL:1, GENCODE basic, APPRIS P1
Tube1-203  ENSMUST00000213459.1  4192  216aa       ENSMUSP00000150602.1  Nonsense mediated decay  -          A0A1L1SU34  TSL:1
Tube1-205  ENSMUST00000214493.1  3525  No protein  -                     Retained intron          -          -           TSL:1
Tube1-206  ENSMUST00000217214.1  1430  No protein  -                     Retained intron          -          -           TSL:1
Tube1-202  ENSMUST00000213237.1  695   No protein  -                     Retained intron          -          -           TSL:3
Tube1-204  ENSMUST00000213898.1  1562  No protein  -                     lncRNA                   -          -           TSL:1


[Figure: Ensembl genome browser view of a 38.57 kb region of chromosome 10 (39.13 Mb to 39.16 Mb).
Forward strand (comprehensive gene set): Tube1-201 (protein coding); Tube1-202, Tube1-205 and Tube1-206 (retained intron); Tube1-203 (nonsense mediated decay); Tube1-204 (lncRNA).
Contigs: AC153958.2.
Reverse strand (comprehensive gene set): Fam229b-201, Fam229b-202, Fam229b-203, Fam229b-204, Fam229b-205, Fam229b-208 and Fam229b-209 (protein coding); Fam229b-210 (retained intron); Ccn6-201 (protein coding).
Regulatory Build legend: CTCF; Promoter; Promoter Flank.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000019991

[Figure: Tube1-201 (protein coding), 17.04 kb, forward strand, with protein annotation tracks for ENSMUSP00000019...
Low complexity (Seg)
Superfamily: Tubulin/FtsZ, GTPase domain superfamily; Tubulin/FtsZ, C-terminal
SMART: Tubulin/FtsZ, GTPase domain; Tubulin/FtsZ, 2-layer sandwich domain
Prints: Epsilon tubulin; Tubulin
Pfam: Tubulin/FtsZ, GTPase domain; Tubulin/FtsZ, 2-layer sandwich domain
PROSITE patterns: Tubulin, conserved site
PANTHER: Epsilon tubulin; Tubulin
Gene3D: Tubulin/FtsZ, GTPase domain superfamily; Tubulin, C-terminal
CDD: cd02190
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.
Scale bar: 0 to 475 (amino acid positions).]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
