https://www.alphaknockout.com

Mouse Mtus1 Knockout Project (CRISPR/Cas9)

Objective: To create an Mtus1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mtus1 gene (NCBI Reference Sequence: NM_001005863; Ensembl: ENSMUSG00000045636) is located on mouse chromosome 8. Fourteen exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 14 (Transcript: ENSMUST00000059115). Exons 2~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a gene trap allele exhibit spontaneous heart hypertrophy and SLE-like lymphoproliferative disease.

The coding region starts in exon 2. Exons 2~3 cover 62.26% of the coding region. The size of the effective KO region is ~8609 bp. The KO region does not contain any other known gene.
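For a rough sense of scale, the 62.26% figure can be turned into an approximate number of coding nucleotides. The sketch below is only an estimate under stated assumptions: it takes the 1210 aa protein length listed for ENSMUST00000059115 in the transcript table of this report, assumes the percentage is relative to the full CDS of that transcript, and assumes the stop codon is counted. The function names are hypothetical and are not part of the provider's pipeline.

```python
def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """CDS length in nucleotides for a protein of the given length.

    Assumption: three nucleotides per residue, plus an optional stop codon.
    """
    return 3 * protein_aa + (3 if include_stop else 0)


def covered_coding_nt(protein_aa: int, covered_fraction: float,
                      include_stop: bool = True) -> int:
    """Approximate number of coding nucleotides inside the targeted exons."""
    return round(cds_length_nt(protein_aa, include_stop) * covered_fraction)


if __name__ == "__main__":
    # 1210 aa and 62.26% are taken from this report; the rest is an estimate.
    print(cds_length_nt(1210))
    print(covered_coding_nt(1210, 0.6226))
```

Because the start codon lies in exon 2, removing exons 2~3 deletes the translation start of this transcript regardless of the exact nucleotide count.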


Overview of the Targeting Strategy

[Figure: wild-type allele drawn 5' to 3' with exons 1, 2, 3 and 14 labeled and two gRNA regions marked. Legend: exon of mouse Mtus1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
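The self-alignment described in the note can be sketched as an exact-match dot plot. This is a minimal illustration, not the provider's tool: it assumes a "dot" is an exact match between two windows of the stated size (15 bp by default), and the function names are hypothetical. Runs of forward dots along a diagonal away from the main diagonal would indicate a tandem or dispersed repeat.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (non-ACGT characters are kept)."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15
                  ) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Return (forward, reverse) dots for a sequence compared with itself.

    A forward dot (a, b), a < b, means the windows starting at a and b are
    identical. A reverse dot (a, b), a <= b, means the window at b is the
    reverse complement of the window at a. The trivial main diagonal
    (a == b, forward) is omitted.
    """
    seq = seq.upper()
    positions: Dict[str, List[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    forward: List[Tuple[int, int]] = []
    reverse: List[Tuple[int, int]] = []
    for kmer, starts in positions.items():
        # Same window seen at two different positions.
        forward.extend((a, b) for a in starts for b in starts if a < b)
        # Window whose reverse complement occurs elsewhere in the sequence.
        rc_starts = positions.get(reverse_complement(kmer), [])
        reverse.extend((a, b) for a in starts for b in rc_starts if a <= b)
    return sorted(forward), sorted(reverse)


if __name__ == "__main__":
    # Toy sequences with a small window, for illustration only.
    print(self_dot_plot("ACGGACGG", window=4))
    print(self_dot_plot("ACGGTTCCGT", window=4))
```

For the 2000 bp flanking sections the same call would be made with the default window of 15; an empty or near-empty forward list away from the main diagonal corresponds to the "no significant tandem repeat" conclusion.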

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot of the upstream sequence.]

Summary: Full Length(2000bp) | A(24.55% 491) | C(20.15% 403) | T(33.3% 666) | G(22.0% 440)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
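The GC analysis can be illustrated with a short sliding-window sketch. It assumes GC content is computed as (G + C) / window length over consecutive windows of the stated size (300 bp by default); the step size and function names are hypothetical choices, not taken from the report. The helper at the end recomputes overall GC content from the base counts in the summary lines, which gives 42.15% for the upstream section and 43.35% for the downstream section.

```python
from typing import List, Tuple


def gc_content_windows(seq: str, window: int = 300,
                       step: int = 1) -> List[Tuple[int, float]]:
    """GC percentage for each window, returned as (start, percent) pairs."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100.0 * gc / window))
    return result


def gc_percent_from_counts(a: int, c: int, t: int, g: int) -> float:
    """Overall GC percentage from per-base counts."""
    return 100.0 * (g + c) / (a + c + t + g)


if __name__ == "__main__":
    # Toy sequence with a small window, for illustration only.
    print(gc_content_windows("GGCCAATT", window=4, step=2))
    # Base counts copied from the upstream and downstream summary lines.
    print(gc_percent_from_counts(a=491, c=403, t=666, g=440))
    print(gc_percent_from_counts(a=538, c=395, t=595, g=472))
```

A window whose percentage is far above the overall value would be a candidate high-GC region that could complicate PCR or sequencing.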

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot of the downstream sequence.]

Summary: Full Length(2000bp) | A(26.9% 538) | C(19.75% 395) | T(29.75% 595) | G(23.6% 472)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   -       41084678   41086677   2000
YourSeq  186    428    1282  2000   90.8%     chr16  -       22962776   23309850   347075
YourSeq  157    428    791   2000   92.0%     chr2   -       155813189  155813680  492
YourSeq  146    429    792   2000   90.6%     chr1   +       181164694  181398452  233759
YourSeq  144    428    792   2000   83.6%     chr2   -       58493290   58493538   249
YourSeq  141    438    790   2000   82.9%     chr9   +       73114179   73114381   203
YourSeq  138    407    580   2000   92.7%     chr9   +       117153260  117153519  260
YourSeq  134    434    788   2000   82.0%     chrX   -       69264125   69264294   170
YourSeq  132    428    586   2000   91.9%     chr1   +       165231474  165231635  162
YourSeq  130    430    586   2000   91.9%     chr7   -       118147888  118148042  155
YourSeq  128    428    580   2000   92.1%     chr2   -       156332079  156332231  153
YourSeq  128    428    580   2000   92.1%     chr11  +       70901150   70901319   170
YourSeq  127    434    580   2000   93.2%     chr17  -       24452532   24452678   147
YourSeq  127    428    583   2000   91.1%     chr11  -       59174004   59174177   174
YourSeq  127    430    580   2000   90.7%     chr5   +       80574641   80574790   150
YourSeq  126    430    580   2000   92.1%     chr12  -       109714368  109714537  170
YourSeq  126    428    579   2000   90.7%     chr16  +       32891998   32892148   151
YourSeq  126    428    579   2000   90.1%     chr12  +       85270944   85271094   151
YourSeq  125    428    579   2000   91.5%     chr4   -       37011874   37012026   153
YourSeq  125    428    580   2000   89.5%     chr4   +       128174009  128174160  152

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   -       41074232   41076231   2000
YourSeq  271    598    954   2000   93.1%     chr7   +       81271575   81732233   460659
YourSeq  262    589    933   2000   92.3%     chr11  +       82796650   82797185   536
YourSeq  247    575    935   2000   91.6%     chr16  -       18105999   18106433   435
YourSeq  224    576    918   2000   88.9%     chr9   +       22252869   22253205   337
YourSeq  220    577    905   2000   90.5%     chr9   +       21708732   21709420   689
YourSeq  220    581    916   2000   88.4%     chr8   +       110735576  110736048  473
YourSeq  220    607    918   2000   89.4%     chr15  +       67340075   67340399   325
YourSeq  219    583    906   2000   86.5%     chr16  -       44187660   44187958   299
YourSeq  219    556    820   2000   95.1%     chr19  +       5063542    5064030    489
YourSeq  213    589    907   2000   86.5%     chr11  +       20143829   20144304   476
YourSeq  203    589    866   2000   91.4%     chr9   -       59545601   59546221   621
YourSeq  202    561    907   2000   90.2%     chr11  +       103521003  103521563  561
YourSeq  196    608    938   2000   91.9%     chr1   -       132213862  132214342  481
YourSeq  187    602    924   2000   89.2%     chrX   +       42056205   42056703   499
YourSeq  184    561    926   2000   91.4%     chr10  +       80750211   80750567   357
YourSeq  174    582    781   2000   95.9%     chr19  -       36464591   36464791   201
YourSeq  173    560    767   2000   93.5%     chr13  +       58164966   58165620   655
YourSeq  166    567    767   2000   92.1%     chr9   -       110718669  110718866  198
YourSeq  166    575    875   2000   85.9%     chr5   -       121389799  121390047  249

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
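One way to check BLAT output of this kind programmatically is to parse the rows and flag any secondary hit whose score is a large fraction of the query size. The sketch below is illustrative only: the report does not state what threshold counts as "significant", so the 50% cutoff is an assumption, and the class and function names are hypothetical. The sample rows are the first three hits from the upstream search.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_rows(text: str) -> List[BlatHit]:
    """Parse whitespace-separated BLAT result rows, skipping non-data lines."""
    hits = []
    for line in text.splitlines():
        f = line.split()
        if len(f) != 11 or not f[1].isdigit():
            continue  # header, separator, or blank line
        hits.append(BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                            float(f[5].rstrip("%")), f[6], f[7],
                            int(f[8]), int(f[9]), int(f[10])))
    return hits


def secondary_hits(hits: List[BlatHit], min_fraction: float = 0.5) -> List[BlatHit]:
    """Hits other than the best one scoring at least min_fraction of the query size.

    Assumption: min_fraction = 0.5 is an arbitrary illustrative cutoff.
    """
    if not hits:
        return []
    best = max(hits, key=lambda h: h.score)
    return [h for h in hits
            if h is not best and h.score >= min_fraction * h.q_size]


SAMPLE = """\
QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN
YourSeq 2000 1 2000 2000 100.0% chr8 - 41084678 41086677 2000
YourSeq 186 428 1282 2000 90.8% chr16 - 22962776 23309850 347075
YourSeq 157 428 791 2000 92.0% chr2 - 155813189 155813680 492
"""

if __name__ == "__main__":
    hits = parse_blat_rows(SAMPLE)
    print(len(hits), "hits parsed;", len(secondary_hits(hits)), "flagged")
```

In both tables the best secondary score (186 upstream, 271 downstream) is well below the 2000 bp query size, which is consistent with the notes that no significant similarity is found.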


Gene and information:

Mtus1 mitochondrial tumor suppressor 1 [Mus musculus (house mouse)]
Gene ID: 102103, updated on 12-Aug-2019

Gene summary

Official Symbol: Mtus1 (provided by MGI)
Official Full Name: mitochondrial tumor suppressor 1 (provided by MGI)
Primary source: MGI:MGI:2142572
See related: Ensembl:ENSMUSG00000045636
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: MD44; Atip1; MTSG1; C85752; ATBP135; AI481402; mKIAA1288; Cctsg1-440; B430010I23Rik; B430305I03Rik
Expression: Ubiquitous expression in heart adult (RPKM 15.8), cerebellum adult (RPKM 12.8) and 26 other tissues
Orthologs: human

Genomic context

Location: 8; 8 A4
Exon count: 21

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 8 | NC_000074.6 (40990912..41133983, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 8 | NC_000074.5 (42076266..42219080, complement)

Chromosome 8 - NC_000074.6


Transcript information: This gene has 14 transcripts

Gene: Mtus1 ENSMUSG00000045636

Description: mitochondrial tumor suppressor 1 [Source:MGI Symbol;Acc:MGI:2142572]
Gene Synonyms: Atip1, B430305I03Rik, MD44, MTSG1
Location: 40,990,914-41,133,726 reverse strand. GRCm38:CM001001.2
About this gene: This gene has 14 transcripts (splice variants), 250 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 19 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Mtus1-202 | ENSMUST00000059115.12 | 6548 | 1210aa | ENSMUSP00000059503.6 | Protein coding | CCDS40328 | A0A0R4J1L9 | TSL:5, GENCODE basic
Mtus1-205 | ENSMUST00000118835.7 | 6426 | 1210aa | ENSMUSP00000112626.1 | Protein coding | CCDS40328 | A0A0R4J1L9 | TSL:1, GENCODE basic
Mtus1-203 | ENSMUST00000093534.10 | 4148 | 520aa | ENSMUSP00000091252.4 | Protein coding | CCDS40329 | A0A0R4J147 | TSL:1, GENCODE basic
Mtus1-201 | ENSMUST00000051379.13 | 3964 | 440aa | ENSMUSP00000053554.7 | Protein coding | CCDS40330 | A0A0R4J0N9 | TSL:1, GENCODE basic, APPRIS P1
Mtus1-211 | ENSMUST00000145860.1 | 2686 | 759aa | ENSMUSP00000119440.1 | Protein coding | - | E9Q8N4 | CDS 3' incomplete, TSL:1
Mtus1-210 | ENSMUST00000143853.7 | 641 | 214aa | ENSMUSP00000116339.1 | Protein coding | - | F6Q593 | CDS 5' and 3' incomplete, TSL:2
Mtus1-204 | ENSMUST00000117735.7 | 556 | 123aa | ENSMUSP00000113082.1 | Protein coding | - | D3Z7B3 | CDS 3' incomplete, TSL:2
Mtus1-207 | ENSMUST00000131965.1 | 364 | 92aa | ENSMUSP00000121605.1 | Protein coding | - | D3Z2H5 | CDS 3' incomplete, TSL:2
Mtus1-212 | ENSMUST00000155055.1 | 336 | 38aa | ENSMUSP00000119163.1 | Protein coding | - | D3Z2Y9 | CDS 3' incomplete, TSL:3
Mtus1-208 | ENSMUST00000135194.1 | 1325 | No protein | - | Retained intron | - | - | TSL:1
Mtus1-206 | ENSMUST00000127665.1 | 617 | No protein | - | Retained intron | - | - | TSL:3
Mtus1-209 | ENSMUST00000142936.1 | 780 | No protein | - | lncRNA | - | - | TSL:1
Mtus1-214 | ENSMUST00000155626.7 | 669 | No protein | - | lncRNA | - | - | TSL:3
Mtus1-213 | ENSMUST00000155174.7 | 608 | No protein | - | lncRNA | - | - | TSL:3


[Figure: genome browser view of a 162.81 kb region (41.00 Mb to 41.10 Mb). Forward strand: Pdgfrl-201 (protein coding); B430010I23Rik-201, B430010I23Rik-202, Gm16193-201 and Gm16192-201 (lncRNA). Contigs: AC116511.15, AC156554.13. Reverse strand: Mtus1-201 to Mtus1-214, with biotypes as listed in the transcript table above. Tracks: Genes (Comprehensive set...); Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000059115

[Figure: protein view of Mtus1-202 (protein coding; reverse strand, 142.81 kb). Tracks: ENSMUSP00000059...; MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); PANTHER (PTHR24200, Microtubule-associated tumour suppressor 1); All sequence SNPs/i... Sequence variants (dbSNP and all other sources). Variant legend: inframe insertion, inframe deletion, missense variant, synonymous variant. Scale bar: 0 to 1210.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
