https://www.alphaknockout.com

Mouse Tomm20 Knockout Project (CRISPR/Cas9)

Objective: To create a Tomm20 knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Tomm20 gene (NCBI Reference Sequence: NM_024214; Ensembl: ENSMUSG00000093904) is located on mouse chromosome 8. Five exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 5 (Transcript: ENSMUST00000179857). Exons 2 to 4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 28.05% of the coding region. Exons 2 to 4 cover 62.53% of the coding region. The size of the effective KO region is ~4186 bp. The KO region does not contain any other known gene.
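As a rough cross-check of the percentages in the note, the sketch below converts them back to nucleotide counts. It assumes the percentages are taken relative to a 435-nt coding sequence (145 codons for the 145 aa protein of Tomm20-201, stop codon excluded); that denominator is an assumption, not something the report states, and the function name is hypothetical.

```python
# Assumption: percentages are relative to a 435-nt CDS
# (145 codons for the 145 aa protein, stop codon excluded).
CDS_NT = 145 * 3


def implied_nt(percent, cds_nt=CDS_NT):
    """Convert a percentage of the coding region to a nucleotide count."""
    return round(percent / 100 * cds_nt)


upstream_nt = implied_nt(28.05)  # coding nt that precede exon 2
deleted_nt = implied_nt(62.53)   # coding nt that lie within exons 2 to 4
remainder = deleted_nt % 3       # non-zero means the deletion is not a multiple of 3

print(upstream_nt, deleted_nt, remainder)
```

Under these assumptions, a non-zero remainder would suggest that removing exons 2 to 4 also shifts the reading frame of any exon 1 to exon 5 splice product. This is an inference from the stated percentages, not a claim made in the report.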


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 5 of mouse Tomm20 and two gRNA regions marked. Legend: exon of mouse Tomm20; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 1980 bp section downstream of Exon 4 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
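The self-alignment behind the two dot plots can be sketched as follows. This is a minimal illustration that assumes exact matching of 15 bp words; the report does not say which tool or scoring scheme produced the plots, and the function names here are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window-long words.

    Returns two lists of (i, j) start positions: forward matches
    (main diagonal excluded) and reverse-complement matches.
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1
    # Index every word by its start positions.
    index = {}
    for i in range(n_words):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for i in range(n_words):
        word = seq[i:i + window]
        for j in index.get(word, []):
            if j != i:  # the main diagonal is trivially a match
                forward.append((i, j))
        for j in index.get(reverse_complement(word), []):
            reverse.append((i, j))
    return forward, reverse


# Toy input: a 15 bp unit repeated twice in tandem, then a short tail.
unit = "ACGGTCATTGCAGTC"
forward, reverse = self_dot_plot(unit + unit + "TTTT")
```

In a plot of these points, a tandem repeat shows up as a run of forward matches parallel to the main diagonal. The absence of such runs is what the notes above describe as "no significant tandem repeat".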


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length (2000 bp) | A (24.95%, 499) | C (18.8%, 376) | T (33.8%, 676) | G (22.45%, 449)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length (1980 bp) | A (27.22%, 539) | C (19.6%, 388) | T (30.3%, 600) | G (22.88%, 453)

Note: The 1980 bp section downstream of Exon 4 is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
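The base counts in the two summaries give the overall GC content directly, and a sliding window gives the kind of profile the figures show. The sketch below assumes a simple windowed G+C fraction with a 300 bp window; the report does not state how its plots were computed, and the function names are hypothetical.

```python
# Base counts copied from the two Summary lines above.
upstream = {"A": 499, "C": 376, "T": 676, "G": 449}
downstream = {"A": 539, "C": 388, "T": 600, "G": 453}


def gc_percent(counts):
    """Overall GC content (%) from a dict of base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """GC content (%) of each window-long stretch of seq."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(round(gc_percent(upstream), 2))    # 41.25
print(round(gc_percent(downstream), 2))  # 42.47
```

Both flanks come out close to 41-42% GC overall, which is consistent with the notes that no significantly GC-rich stretch is present, although only the windowed profile can rule out local peaks.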


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   -       126941226  126943225  2000
YourSeq  142    1532   1706  2000   95.0%     chr1   -       144060322  144060785  464
YourSeq  138    1524   1674  2000   96.6%     chr10  +       108270464  108270619  156
YourSeq  137    1524   1685  2000   95.4%     chr13  +       104873451  104873617  167
YourSeq  135    1527   1675  2000   96.7%     chr5   +       150226388  150226548  161
YourSeq  133    1518   1665  2000   95.3%     chr10  -       60097635   60097784   150
YourSeq  133    1522   1667  2000   96.6%     chr10  +       67942246   67942402   157
YourSeq  132    1528   1667  2000   97.9%     chr16  -       32357542   32357689   148
YourSeq  132    1527   1667  2000   97.2%     chr13  +       36046287   36046433   147
YourSeq  131    1524   1667  2000   96.5%     chr12  -       60690003   60690147   145
YourSeq  131    1524   1667  2000   95.9%     chr14  +       20697506   20697654   149
YourSeq  131    1524   1663  2000   97.2%     chr1   +       56499492   56499635   144
YourSeq  130    1508   1667  2000   89.7%     chr12  -       73099348   73099498   151
YourSeq  130    1524   1665  2000   96.5%     chr1   -       181722605  181722753  149
YourSeq  130    1527   1702  2000   93.5%     chr19  +       36934007   36934405   399
YourSeq  130    1532   1668  2000   97.9%     chr12  +       113091879  113092021  143
YourSeq  130    1524   1667  2000   95.9%     chr10  +       83355554   83355704   151
YourSeq  130    1528   1667  2000   97.2%     chr10  +       80555750   80556268   519
YourSeq  129    1517   1667  2000   93.3%     chr6   -       6178312    6178468    157
YourSeq  129    1528   1667  2000   96.5%     chr13  -       21642954   21643098   145

Note: The 2000 bp section upstream of Exon 2 is searched against the genome with BLAT. No significant similarity is found outside the target locus on chr8.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1980   1      1980  1980   100.0%    chr8   -       126935060  126937039  1980
YourSeq  194    897    1507  1980   93.1%     chr4   +       143363379  143374969  11591
YourSeq  191    897    1510  1980   81.5%     chr10  -       85676508   85676953   446
YourSeq  186    911    1510  1980   83.7%     chr7   -       116310801  116311305  505
YourSeq  180    1260   1507  1980   88.9%     chr17  -       48295826   48296141   316
YourSeq  180    1255   1510  1980   90.3%     chr13  +       104866346  104866669  324
YourSeq  178    1262   1510  1980   91.8%     chr7   +       82588110   82588426   317
YourSeq  176    1258   1510  1980   91.2%     chr6   +       147191064  147191380  317
YourSeq  174    1258   1506  1980   92.2%     chr8   -       91965912   91966228   317
YourSeq  172    1262   1510  1980   87.2%     chr5   +       116352489  116666724  314236
YourSeq  171    1262   1510  1980   89.5%     chr6   -       143424559  143424875  317
YourSeq  170    1266   1510  1980   91.1%     chr18  +       66483446   66709268   225823
YourSeq  168    1262   1510  1980   89.0%     chr8   +       35069050   35069366   317
YourSeq  165    1262   1510  1980   89.9%     chr6   -       117069266  117069582  317
YourSeq  164    1262   1510  1980   90.0%     chr1   -       125860546  125860865  320
YourSeq  161    1262   1509  1980   92.2%     chr5   -       76496084   76496394   311
YourSeq  161    1262   1510  1980   88.6%     chr19  -       42664466   42664774   309
YourSeq  161    1262   1509  1980   93.1%     chr10  -       68573635   68573945   311
YourSeq  160    914    1491  1980   80.5%     chr9   +       49806973   49807349   377
YourSeq  159    1258   1850  1980   92.5%     chr7   -       118510660  118511449  790

Note: The 1980 bp section downstream of Exon 4 is searched against the genome with BLAT. No significant similarity is found outside the target locus on chr8.
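The top hit in each BLAT table is the query matching its own locus on chr8, so the two sets of coordinates can be used to cross-check the stated size of the KO region. The sketch below assumes that the upstream and downstream sections directly abut the KO region; the report does not state this explicitly, and the variable names are hypothetical.

```python
# Self-hit coordinates (chr8, minus strand) copied from the two BLAT tables.
UP_START, UP_END = 126941226, 126943225      # section upstream of exon 2
DOWN_START, DOWN_END = 126935060, 126937039  # section downstream of exon 4


def span(start, end):
    """Length of a closed, 1-based genomic interval."""
    return end - start + 1


# The gene is on the reverse strand, so the upstream flank has the higher
# coordinates. The KO region is the gap between the two flanks.
ko_size = UP_START - DOWN_END - 1

print(span(UP_START, UP_END), span(DOWN_START, DOWN_END), ko_size)
# 2000 1980 4186
```

The gap of 4186 bp matches the "~4186 bp" effective KO region given in the strategy summary, which supports the assumption that the analyzed flanks sit immediately next to the deleted interval.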


Gene and information: Tomm20 translocase of outer mitochondrial membrane 20 [ Mus musculus (house mouse) ] Gene ID: 67952, updated on 17-Sep-2019

Gene summary

Official Symbol: Tomm20 (provided by MGI)
Official Full Name: translocase of outer mitochondrial membrane 20 (provided by MGI)
Primary source: MGI:MGI:1915202
See related: Ensembl:ENSMUSG00000093904
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: MAS20; MOM19; TOM20; Gm19268; BB284719; mKIAA0016; 1810060K07Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 54.6), placenta adult (RPKM 33.7) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 8; 8 E2
Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (126930664..126945921, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (129458450..129469730, complement)

Chromosome 8 - NC_000074.6


Transcript information: This gene has 2 transcripts

Gene: Tomm20 ENSMUSG00000093904

Description: translocase of outer mitochondrial membrane 20 [Source:MGI Symbol;Acc:MGI:1915202]
Gene Synonyms: 1810060K07Rik, TOM20
Location: Chromosome 8: 126,930,667-126,945,844 reverse strand. GRCm38:CM001001.2
About this gene: This gene has 2 transcripts (splice variants), 228 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt         Flags
Tomm20-201  ENSMUST00000179857.2  4902  145aa    ENSMUSP00000136493.1  Protein coding  CCDS22786  Q4KL41, Q9DCC8  TSL:1, GENCODE basic, APPRIS P1
Tomm20-202  ENSMUST00000212771.1  458   102aa    ENSMUSP00000148566.1  Protein coding  -          A0A1D5RLZ6      TSL:2, GENCODE basic

[Figure: Ensembl region view, 35.18 kb, chromosome 8 from 126.93 Mb to 126.95 Mb, with forward and reverse strands. Contig: AC119874.12. Genes shown (Comprehensive set): Tomm20-201 and Tomm20-202 (protein coding), Rbm34-201 and Rbm34-203 (protein coding), Gm26397-201 (snoRNA). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene).]


Transcript: ENSMUST00000179857

[Figure: Tomm20-201 (protein coding), reverse strand, 15.17 kb, with protein feature tracks for ENSMUSP00000136... on a scale of 0 to 145 aa. Tracks as labeled in the figure:]

Transmembrane heli...
Low complexity (Seg)
TIGRFAM: Protein import receptor MAS20
Superfamily: Mitochondrial outer membrane translocase complex, Tom20 domain superfamily
Prints: Protein import receptor MAS20; Protein import receptor MAS20, metazoan
Pfam: Protein import receptor MAS20
PIRSF: Protein import receptor MAS20
PANTHER: Protein import receptor MAS20; PTHR12430:SF2
Gene3D: Mitochondrial outer membrane translocase complex, Tom20 domain superfamily
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
