https://www.alphaknockout.com

Mouse Tmem120a Knockout Project (CRISPR/Cas9)

Objective: To create a Tmem120a knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tmem120a gene (NCBI Reference Sequence: NM_172541; Ensembl: ENSMUSG00000039886) is located on mouse chromosome 5. Twelve exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000043378). Exons 2-3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 2 starts at about 7.97% of the coding region. Exons 2-3 cover 22.93% of the coding region. The size of the effective KO region is ~434 bp. The KO region does not contain any other known gene.
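The percentages in the note can be cross-checked with a little arithmetic. The sketch below assumes that they are expressed relative to a coding sequence of 343 codons (the protein length listed for Tmem120a-201), stop codon excluded. That assumption is not stated in the report, and the function and variable names are illustrative only.

```python
# Assumption (not stated in the report): percentages are relative to a
# coding sequence of 343 codons, with the stop codon excluded.
PROTEIN_LENGTH_AA = 343               # Tmem120a-201 protein length
CDS_LENGTH_NT = PROTEIN_LENGTH_AA * 3


def percent_to_nt(percent: float, cds_length: int = CDS_LENGTH_NT) -> int:
    """Convert a percentage of the coding region to whole nucleotides."""
    return round(percent / 100 * cds_length)


coding_nt_before_exon2 = percent_to_nt(7.97)
coding_nt_in_exons_2_3 = percent_to_nt(22.93)

# A deleted coding length that is not a multiple of 3 shifts the reading frame.
causes_frameshift = coding_nt_in_exons_2_3 % 3 != 0

print(coding_nt_before_exon2, coding_nt_in_exons_2_3, causes_frameshift)
```

Under that assumption, the coding length removed with exons 2-3 is not a multiple of three, so the deletion would be expected to shift the reading frame of the remaining exons.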


Overview of the Targeting Strategy

[Figure: wild-type allele of mouse Tmem120a, drawn 5' to 3', showing exons 1, 2, 3 and 12 with two gRNA regions marked. Legend: exon of mouse Tmem120a; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: forward; reverse complement. Axis label: Sequence 12.]

Note: The 1648 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
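The self-alignment behind such a dot plot can be sketched as follows. This is a minimal illustration that uses exact 15-bp window matches; the tool and scoring actually used for the report are not specified, and the function names are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: the windows starting at i and j (i < j) are identical.
    reverse: the window at i equals the reverse complement of the window at j.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, starts in positions.items():
        # Every pair of identical windows is one off-diagonal forward dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                forward.append((starts[a], starts[b]))
        # Windows matching the reverse complement form the reverse dots.
        rc = reverse_complement(kmer)
        for i in starts:
            for j in positions.get(rc, []):
                if i <= j:  # report each pair once
                    reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy example: a 15-bp unit repeated twice in tandem gives one forward dot.
forward_dots, reverse_dots = self_dot_plot("ACGGTCATTGCAGTC" * 2 + "T")
print(forward_dots, reverse_dots)
```

Runs of forward dots parallel to the main diagonal indicate tandem repeats; a plot with no such off-diagonal runs is consistent with the note above.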

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: forward; reverse complement. Axis label: Sequence 12.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(1648bp) | A(22.75% 375) | C(26.76% 441) | T(21.6% 356) | G(28.88% 476)

Note: The 1648 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
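A sliding-window GC scan of the kind described can be sketched as below. The 300-bp window comes from the report; the step size and the 70% cut-off for a "high GC-content" window are assumptions chosen for illustration, since the report does not state them.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 300, step: int = 1):
    """Return (start, gc_fraction) for every full window along the sequence."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq: str, window: int = 300, step: int = 1,
                    threshold: float = 0.70):
    """Windows whose GC fraction meets a hypothetical threshold."""
    return [(s, gc) for s, gc in windowed_gc(seq, window, step) if gc >= threshold]


# Toy example with a short window: an AT-rich half followed by a GC-rich half.
toy = "AT" * 10 + "GC" * 10
print(windowed_gc(toy, window=20, step=20))
```

An empty result from `high_gc_windows` for a flank would correspond to the report's statement that no significant high GC-content region is found.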

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(29.45% 589) | C(21.95% 439) | T(23.05% 461) | G(25.55% 511)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
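The overall GC content of each flank follows directly from the base counts in the two summaries. The short check below uses only the counts given in the report; the dictionary and function names are illustrative.

```python
# Base counts copied from the two "Summary" lines of the report.
upstream_counts = {"A": 375, "C": 441, "T": 356, "G": 476}
downstream_counts = {"A": 589, "C": 439, "T": 461, "G": 511}


def summarize(counts: dict) -> tuple:
    """Return (total length in bp, overall GC percentage rounded to 2 places)."""
    total = sum(counts.values())
    gc_percent = 100 * (counts["G"] + counts["C"]) / total
    return total, round(gc_percent, 2)


for label, counts in (("upstream", upstream_counts),
                      ("downstream", downstream_counts)):
    total, gc_percent = summarize(counts)
    print(f"{label}: {total} bp, GC {gc_percent}%")
```

The totals reproduce the stated section lengths (1648 bp and 2000 bp), which is a quick sanity check that the counts were transcribed consistently.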


BLAT Search Results (up)

QUERY   SCORE START   END QSIZE IDENTITY CHROM STRAND     START       END  SPAN
YourSeq  1648     1  1648  1648   100.0% chr5  -      135742400 135744047  1648
YourSeq   103  1222  1417  1648    86.7% chr10 -      117404408 117404648   241
YourSeq   100  1229  1409  1648    90.2% chr15 +       57797208  57797427   220
YourSeq    93  1231  1409  1648    84.5% chr19 +       34354584  34354801   218
YourSeq    90  1229  1409  1648    91.0% chr14 -       31217136  31217356   221
YourSeq    89  1224  1409  1648    90.9% chr1  -       72375778  72375998   221
YourSeq    87  1296  1417  1648    83.8% chr4  -       80765178  80765296   119
YourSeq    87  1230  1415  1648    89.2% chr11 -       94379421  94379622   202
YourSeq    86  1220  1406  1648    92.1% chr1  +       36624570  36624792   223
YourSeq    85  1228  1409  1648    89.9% chr4  +       55967615  55967832   218
YourSeq    85  1231  1406  1648    87.1% chr12 +      103121512 103121688   177
YourSeq    84  1230  1410  1648    87.5% chr1  +      169793797 169810122 16326
YourSeq    83  1230  1409  1648    88.8% chr1  -      156775701 156775919   219
YourSeq    83  1297  1409  1648    84.6% chr10 +       94193561  94193671   111
YourSeq    82  1296  1409  1648    84.6% chr12 +       88891040  88891151   112
YourSeq    80  1296  1410  1648    85.1% chr6  -       92271696  92271810   115
YourSeq    80  1296  1409  1648    88.2% chr14 -       36747367  36747479   113
YourSeq    77  1303  1411  1648    85.0% chr9  -       99300834  99300940   107
YourSeq    77  1296  1409  1648    85.0% chr6  -      127594076 127594188   113
YourSeq    77  1224  1416  1648    86.8% chr12 +       72404297  72404536   240

Note: The 1648 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
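One way to read the hit list is to compare how much of the query each hit covers. The sketch below parses rows in the layout of the BLAT results above and computes query coverage; the parser, the `BlatHit` type and the coverage criterion are illustrative assumptions, not the method used in the report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one result row: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN."""
    fields = row.replace("browser details", "").split()
    (_query, score, q_start, q_end, q_size, identity,
     chrom, strand, t_start, t_end, span) = fields
    return BlatHit(int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the hit (1-based inclusive coordinates)."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream result list.
rows = [
    "YourSeq 1648 1 1648 1648 100.0% chr5 - 135742400 135744047 1648",
    "YourSeq 103 1222 1417 1648 86.7% chr10 - 117404408 117404648 241",
    "YourSeq 100 1229 1409 1648 90.2% chr15 + 57797208 57797427 220",
]
hits = [parse_hit(r) for r in rows]
partial_hits = [h for h in hits if query_coverage(h) < 1.0]
best_partial = max(partial_hits, key=lambda h: h.score)
print(best_partial.chrom, round(query_coverage(best_partial), 4))
```

In these rows only the chr5 self-hit spans the full query; the best-scoring other hit covers a small fraction of it.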

BLAT Search Results (down)

QUERY   SCORE START   END QSIZE IDENTITY CHROM STRAND     START       END  SPAN
YourSeq  2000     1  2000  2000   100.0% chr5  -      135739966 135741965  2000
YourSeq   288  1590  1925  2000    95.1% chr10 +       91090333  91090764   432
YourSeq   280  1565  1925  2000    92.0% chr5  +      135941837 135942243   407
YourSeq   278  1590  1925  2000    95.2% chr7  -      132619124 132619477   354
YourSeq   260  1594  1925  2000    94.0% chr9  +       59742389  59742760   372
YourSeq   259   806  1429  2000    86.9% chr17 -       36722104  36722861   758
YourSeq   256  1589  1916  2000    92.8% chr5  -       21752813  21753166   354
YourSeq   250   812  1427  2000    86.6% chr2  -      181862349 181863061   713
YourSeq   244   891  1430  2000    88.7% chr6  -       35703760  35704310   551
YourSeq   238   812  1430  2000    85.7% chr5  -        3749973   3750601   629
YourSeq   237   812  1430  2000    90.3% chr4  -      116373889 116374517   629
YourSeq   237   812  1430  2000    91.4% chr10 -      112014902 112015526   625
YourSeq   227   812  1430  2000    89.9% chr16 +       15041806  15042428   623
YourSeq   227   883  1430  2000    90.3% chr11 +       23598970  23599528   559
YourSeq   226   899  1427  2000    89.3% chrX  -      107648079 107648707   629
YourSeq   225   811  1430  2000    85.3% chr6  +       81490180  81490796   617
YourSeq   224   812  1430  2000    87.4% chr14 -       16629880  16630507   628
YourSeq   223   812  1427  2000    88.9% chr10 -       38900710  38901417   708
YourSeq   220   812  1427  2000    86.8% chr16 -       49149498  49150212   715
YourSeq   216   812  1430  2000    88.6% chr5  -       33992716  33993342   627

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
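The chr5 self-hits of the two flanking sections also allow a consistency check on the stated KO region size. The sketch below assumes that the BLAT coordinates are 1-based and inclusive and that the two flanks directly abut the KO region; neither point is stated in the report. Because the gene lies on the reverse strand, the upstream flank has the higher coordinates.

```python
# chr5 self-hit coordinates copied from the two BLAT result lists.
upstream_flank = (135742400, 135744047)    # 1648 bp section upstream of exon 2
downstream_flank = (135739966, 135741965)  # 2000 bp section downstream of exon 3


def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(lower: tuple, upper: tuple) -> tuple:
    """1-based inclusive interval lying strictly between two flanks."""
    return lower[1] + 1, upper[0] - 1


ko_region = gap_between(downstream_flank, upstream_flank)
print(ko_region, interval_length(*ko_region))
```

Under these assumptions the interval between the flanks is 434 bp long, which agrees with the ~434 bp effective KO region given in the strategy notes.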


Gene and information: Tmem120a transmembrane protein 120A [ Mus musculus (house mouse) ] Gene ID: 215210, updated on 12-Aug-2019

Gene summary

Official Symbol: Tmem120a (provided by MGI)
Official Full Name: transmembrane protein 120A (provided by MGI)
Primary source: MGI:MGI:2686991
See related: Ensembl:ENSMUSG00000039886
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Tmpit; 2010310D06Rik
Expression: Broad expression in duodenum adult (RPKM 93.9), subcutaneous fat pad adult (RPKM 83.5) and 23 other tissues

Genomic context

Location: 5; 5 G2

Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (135735490..135744172, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (136211360..136220042, complement)

Chromosome 5 - NC_000071.6


Transcript information: This gene has 5 transcripts

Gene: Tmem120a ENSMUSG00000039886

Description: transmembrane protein 120A [Source:MGI Symbol;Acc:MGI:2686991]
Gene Synonyms: 2010310D06Rik, Tmpit
Location: Chromosome 5: 135,735,485-135,744,448 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 5 transcripts (splice variants), 245 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name          Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Tmem120a-201  ENSMUST00000043378.8  1559  343aa       ENSMUSP00000045252.2  Protein coding   CCDS19744  Q8C1E7   TSL:1, GENCODE basic, APPRIS P1
Tmem120a-204  ENSMUST00000153399.1  358   92aa        ENSMUSP00000120834.1  Protein coding   -          D3Z0U3   CDS 3' incomplete, TSL:5
Tmem120a-202  ENSMUST00000127156.1  785   No protein  -                     Retained intron  -          -        TSL:3
Tmem120a-203  ENSMUST00000141779.5  679   No protein  -                     Retained intron  -          -        TSL:2
Tmem120a-205  ENSMUST00000199952.4  480   No protein  -                     lncRNA           -          -        TSL:5


[Figure: Ensembl genomic region view, chromosome 5, 135.73-135.75 Mb (28.96 kb). Forward strand: Por transcripts (Por-201 to Por-206, Por-208 and Por-209; protein coding and retained intron). Contig: AC083948.3. Reverse strand: Tmem120a transcripts (Tmem120a-201 to Tmem120a-205), Mir7034-201 (miRNA) and Styxl1 transcripts (Styxl1-201 to Styxl1-208). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000043378

[Figure: protein feature view of Tmem120a-201 (protein coding; reverse strand, 8.96 kb). Tracks: ENSMUSP00000045...; Transmembrane heli...; Low complexity (Seg); Pfam TMPIT-like; PANTHER PTHR21433:SF1 (TMPIT-like); All sequence SNPs/i...; Sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, stop retained variant, synonymous variant. Scale bar: 0 to 343.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
