https://www.alphaknockout.com

Mouse Tsfm Knockout Project (CRISPR/Cas9)

Objective: To create a Tsfm knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Tsfm gene (NCBI Reference Sequence: NM_025537; Ensembl: ENSMUSG00000040521) is located on mouse chromosome 10. Six exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 6 (Transcript: ENSMUST00000040560). Exons 3~5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts from about 23.56% of the coding region. Exons 3~5 cover 34.98% of the coding region. The size of the effective KO region is ~4747 bp. The KO region does not contain any other known gene.
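The percentages in the note can be turned into approximate nucleotide counts. The sketch below assumes the coding region is 324 codons (972 nt, stop codon excluded), taken from the 324 aa protein listed for ENSMUST00000040560 in the transcript table; that length is an assumption of this example, not a figure stated in the strategy summary, and the helper name is hypothetical.

```python
# Assumption: coding length = 324 codons x 3 nt, stop codon excluded.
CODING_NT = 324 * 3  # 972

def pct_to_nt(percent, total=CODING_NT):
    """Convert a percentage of the coding region to a rounded nucleotide count."""
    return round(percent / 100 * total)

exon3_offset = pct_to_nt(23.56)  # coding nt that precede exon 3
deleted_nt = pct_to_nt(34.98)    # coding nt removed with exons 3~5

# The remainder modulo 3 shows whether the deletion keeps the reading frame.
print(exon3_offset, deleted_nt, deleted_nt % 3)
```

Under this assumption the deleted coding length is not a multiple of 3, so removing exons 3~5 would be expected to shift the reading frame of the remaining downstream coding sequence.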


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with two gRNA regions marked; exon labels shown: 1, 3, 4, 5, 6. Legend: exon of mouse Tsfm; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot showing forward and reverse-complement matches.]

Note: The 780 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
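A self-comparison dot plot of this kind can be sketched as follows. The function marks every pair of positions whose 15 bp windows match exactly; real dot plot tools may tolerate mismatches and also score the reverse complement, so this illustrates the idea rather than the tool used for the figure. The function name and the toy sequence are hypothetical.

```python
from collections import defaultdict

def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose windows of length `window` are identical."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)

# Toy example: a 20 bp unit repeated in tandem produces off-diagonal hits.
unit = "ACGTTGCAAGCTTAGGCTCA"
toy = unit * 2
hits = self_dot_plot(toy, window=15)
print(hits)
```

A sequence without tandem repeats yields no off-diagonal hits, which is the situation the note describes for this flank.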

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(780bp) | A(26.28% 205) | C(20.77% 162) | T(29.23% 228) | G(23.72% 185)

Note: The 780 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
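The GC-content profile is computed over a sliding window (300 bp here). A minimal sketch of such a scan follows; the step size, function name and toy input are assumptions for illustration, and the plotting tool behind the figure may differ in detail.

```python
def gc_windows(seq, window=300, step=1):
    """Yield (start, gc_fraction) for each window of length `window`."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        yield start, gc / window

# Toy input: 300 bp of AT followed by 300 bp of GC.
toy = "AT" * 150 + "GC" * 150
profile = list(gc_windows(toy, window=300, step=150))
print(profile)
```

A region would be flagged as high in GC when the fraction stays well above the rest of the profile for several consecutive windows; the threshold for "significant" is not given in the report.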

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(23.8% 476) | C(21.7% 434) | T(27.9% 558) | G(26.6% 532)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
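The overall GC content of each flank follows directly from the base counts in the two summaries. The short calculation below uses only those counts; the helper name is hypothetical.

```python
def gc_percent(a, c, t, g):
    """Overall GC content in percent from per-base counts."""
    total = a + c + t + g
    return 100 * (g + c) / total

upstream = gc_percent(a=205, c=162, t=228, g=185)    # 780 bp flank
downstream = gc_percent(a=476, c=434, t=558, g=532)  # 2000 bp flank

print(round(upstream, 2), round(downstream, 2))
```

This gives roughly 44.5% GC for the upstream flank and 48.3% GC for the downstream flank.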


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  780    1      780   780    100.0%    chr10  -       127029683  127030462  780
YourSeq  167    480    692   780    88.7%     chr11  +       97266770   97266967   198
YourSeq  162    493    679   780    93.6%     chr8   -       33754934   33755122   189
YourSeq  160    480    683   780    91.9%     chr16  -       17696539   17696740   202
YourSeq  160    495    679   780    93.6%     chr12  +       3569692    3569878    187
YourSeq  158    494    680   780    93.1%     chr12  +       70926461   70926665   205
YourSeq  158    496    678   780    91.2%     chr11  +       75529088   75529267   180
YourSeq  157    487    679   780    92.0%     chr11  -       115773179  115773372  194
YourSeq  156    496    666   780    96.0%     chr16  +       21970222   21970394   173
YourSeq  156    495    686   780    91.1%     chr13  +       47053172   47053368   197
YourSeq  155    457    678   780    91.5%     chrX   +       7719611    7720025    415
YourSeq  155    498    679   780    92.6%     chr11  +       113706884  113707063  180
YourSeq  154    499    683   780    90.1%     chr11  -       115349621  115349803  183
YourSeq  154    499    678   780    97.6%     chr8   +       114409593  114409773  181
YourSeq  154    496    671   780    94.9%     chr11  +       113711359  113711542  184
YourSeq  153    495    679   780    91.9%     chr4   -       39569645   39569835   191
YourSeq  153    495    666   780    94.8%     chr9   +       99231994   99232167   174
YourSeq  152    504    695   780    91.9%     chr5   -       121337798  121337999  202
YourSeq  152    496    684   780    90.5%     chr5   -       118301919  118302109  191
YourSeq  152    500    684   780    93.3%     chr19  -       32875777   32875966   190

Note: The 780 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr10  -       127022936  127024935  2000
YourSeq  94     739    897   2000   81.0%     chr18  -       37788435   37788577   143
YourSeq  93     737    898   2000   84.1%     chr11  +       31829225   31829373   149
YourSeq  91     752    897   2000   82.1%     chr9   +       61106858   61106987   130
YourSeq  88     744    884   2000   81.7%     chr17  +       45915386   45915519   134
YourSeq  87     739    884   2000   80.5%     chr4   +       155798141  155798279  139
YourSeq  86     739    879   2000   80.4%     chr6   -       117822107  117822241  135
YourSeq  86     744    879   2000   86.5%     chr3   -       28465062   28465195   134
YourSeq  85     739    872   2000   83.1%     chr14  -       88154101   88154228   128
YourSeq  85     751    884   2000   82.9%     chr8   +       35711471   35711598   128
YourSeq  84     737    862   2000   82.7%     chr9   -       70194486   70194607   122
YourSeq  84     717    862   2000   90.5%     chr4   +       117221525  117221671  147
YourSeq  83     737    870   2000   83.7%     chr7   +       88547735   88547864   130
YourSeq  80     723    842   2000   90.2%     chr8   -       5092012    5092403    392
YourSeq  80     758    884   2000   81.8%     chr12  -       79013844   79013961   118
YourSeq  80     739    870   2000   82.3%     chrX   +       99881395   99881520   126
YourSeq  80     741    862   2000   91.0%     chr16  +       20329234   20329363   130
YourSeq  78     743    866   2000   90.0%     chr5   -       129384831  129384954  124
YourSeq  77     739    862   2000   83.9%     chr9   +       53642859   53642977   119
YourSeq  76     752    870   2000   84.6%     chr9   -       64136382   64136495   114

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
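The top BLAT hit in each table gives the genomic position of the corresponding flank on chr10. As a consistency check, the sketch below computes the number of bases lying strictly between the two flanks, assuming the coordinates are 1-based and inclusive as displayed; the variable and function names are illustrative.

```python
# Top hits (chr10, minus strand) from the two BLAT tables.
upstream_flank = (127029683, 127030462)    # 780 bp section upstream of Exon 3
downstream_flank = (127022936, 127024935)  # 2000 bp section downstream of Exon 5

def gap_between(left, right):
    """Bases strictly between two inclusive intervals, with `left` before `right`."""
    return right[0] - left[1] - 1

# The gene is on the minus strand, so the downstream flank has the lower coordinates.
ko_span = gap_between(downstream_flank, upstream_flank)
print(ko_span)
```

The result, 4747 bp, agrees with the effective KO region size quoted in the strategy summary.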


Gene and information:

Tsfm Ts translation elongation factor, mitochondrial [ Mus musculus (house mouse) ]
Gene ID: 66399, updated on 10-Oct-2019

Gene summary

Official Symbol: Tsfm (provided by MGI)
Official Full Name: Ts translation elongation factor, mitochondrial (provided by MGI)
Primary source: MGI:MGI:1913649
See related: Ensembl:ENSMUSG00000040521
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: EF-TS; EF-Tsmt; 2310050B20Rik; 9430024O13Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 24.4), ovary adult (RPKM 16.4) and 28 other tissues

Genomic context

Location: 10; 10 D3
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (127022332..127030814, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (126459388..126467870, complement)

[Figure: genomic context of Tsfm on Chromosome 10 - NC_000076.6.]


Transcript information: This gene has 7 transcripts

Gene: Tsfm ENSMUSG00000040521

Description: Ts translation elongation factor, mitochondrial [Source:MGI Symbol;Acc:MGI:1913649]
Gene Synonyms: 2310050B20Rik, 9430024O13Rik, EF-TS, EF-Tsmt
Location: Chromosome 10: 127,011,572-127,030,840 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 7 transcripts (splice variants), 181 orthologues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Tsfm-201  ENSMUST00000040560.10  1206  324aa       ENSMUSP00000042134.4  Protein coding   CCDS24222  Q9CZR8   TSL:1 GENCODE basic APPRIS P1
Tsfm-202  ENSMUST00000120547.1   1271  193aa       ENSMUSP00000113446.1  Protein coding   -          Q9CX33   TSL:1 GENCODE basic
Tsfm-207  ENSMUST00000152054.7   655   206aa       ENSMUSP00000122669.1  Protein coding   -          D3Z4M7   TSL:3 GENCODE basic
Tsfm-203  ENSMUST00000134917.1   695   No protein  -                     Retained intron  -          -        TSL:1
Tsfm-206  ENSMUST00000145476.1   656   No protein  -                     Retained intron  -          -        TSL:1
Tsfm-204  ENSMUST00000138556.1   641   No protein  -                     Retained intron  -          -        TSL:1
Tsfm-205  ENSMUST00000140564.7   833   No protein  -                     lncRNA           -          -        TSL:3


[Figure: Ensembl genome browser view of a 39.27 kb region, 127.01 Mb to 127.04 Mb. Forward strand: Avil-201, Avil-202, Avil-203 and Avil-204 (protein coding). Contig: AC134329.3. Reverse strand: Tsfm-201, Tsfm-202 and Tsfm-207 (protein coding); Tsfm-203, Tsfm-204 and Tsfm-206 (retained intron); Tsfm-205 (lncRNA); Eef1akmt3-201 (protein coding). Tracks: Genes (Comprehensive set), Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000040560

[Figure: protein domain view of Tsfm-201 (protein coding, reverse strand, 8.51 kb), protein ENSMUSP00000042..., scale 0 to 324 aa. Annotated tracks: Low complexity (Seg); Superfamily: UBA-like superfamily; Elongation factor Ts, dimerisation domain superfamily; Pfam: Translation elongation factor EFTs/EF1B, dimerisation; PROSITE patterns: Translation elongation factor Ts, conserved site; PANTHER: Translation elongation factor EFTs/EF1B, PTHR11741:SF0; HAMAP: Translation elongation factor EFTs/EF1B; Gene3D: 1.10.8.10, Elongation factor Ts, dimerisation domain superfamily; CDD: cd14275; Sequence variants (dbSNP and all other sources). Variant legend: stop gained, missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
