Mouse Tsfm Knockout Project (CRISPR/Cas9)

https://www.alphaknockout.com

Objective:
To create a Tsfm knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Tsfm gene (NCBI Reference Sequence: NM_025537; Ensembl: ENSMUSG00000040521) is located on mouse chromosome 10. Six exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 6 (Transcript: ENSMUST00000040560).
Exons 3~5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 3 starts at about 23.56% of the coding region.
Exons 3~5 cover 34.98% of the coding region.
The size of the effective KO region is ~4747 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing the exons of mouse Tsfm, the two gRNA regions, and the knockout region.]

Overview of the Dot Plot (up)

[Figure: dot plot of the sequence against itself. Window size: 15 bp. Forward and reverse-complement matches are shown.]

Note: The 780 bp section upstream of Exon 3 is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

[Figure: dot plot of the sequence against itself. Window size: 15 bp. Forward and reverse-complement matches are shown.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up)

[Figure: GC content along the sequence. Window size: 300 bp.]

Summary: Full Length(780bp) | A(26.28% 205) | C(20.77% 162) | T(29.23% 228) | G(23.72% 185)

Note: The 780 bp section upstream of Exon 3 is analyzed to determine the GC content.
No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

[Figure: GC content along the sequence. Window size: 300 bp.]

Summary: Full Length(2000bp) | A(23.8% 476) | C(21.7% 434) | T(27.9% 558) | G(26.6% 532)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE START   END QSIZE IDENTITY CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------
YourSeq    780     1   780   780   100.0% chr10    -     127029683  127030462   780
YourSeq    167   480   692   780    88.7% chr11    +      97266770   97266967   198
YourSeq    162   493   679   780    93.6% chr8     -      33754934   33755122   189
YourSeq    160   480   683   780    91.9% chr16    -      17696539   17696740   202
YourSeq    160   495   679   780    93.6% chr12    +       3569692    3569878   187
YourSeq    158   494   680   780    93.1% chr12    +      70926461   70926665   205
YourSeq    158   496   678   780    91.2% chr11    +      75529088   75529267   180
YourSeq    157   487   679   780    92.0% chr11    -     115773179  115773372   194
YourSeq    156   496   666   780    96.0% chr16    +      21970222   21970394   173
YourSeq    156   495   686   780    91.1% chr13    +      47053172   47053368   197
YourSeq    155   457   678   780    91.5% chrX     +       7719611    7720025   415
YourSeq    155   498   679   780    92.6% chr11    +     113706884  113707063   180
YourSeq    154   499   683   780    90.1% chr11    -     115349621  115349803   183
YourSeq    154   499   678   780    97.6% chr8     +     114409593  114409773   181
YourSeq    154   496   671   780    94.9% chr11    +     113711359  113711542   184
YourSeq    153   495   679   780    91.9% chr4     -      39569645   39569835   191
YourSeq    153   495   666   780    94.8% chr9     +      99231994   99232167   174
YourSeq    152   504   695   780    91.9% chr5     -     121337798  121337999   202
YourSeq    152   496   684   780    90.5% chr5     -     118301919  118302109   191
YourSeq    152   500   684   780    93.3% chr19    -      32875777   32875966   190

Note: The 780 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE START   END QSIZE IDENTITY CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------
YourSeq   2000     1  2000  2000   100.0% chr10    -     127022936  127024935  2000
YourSeq     94   739   897  2000    81.0% chr18    -      37788435   37788577   143
YourSeq     93   737   898  2000    84.1% chr11    +      31829225   31829373   149
YourSeq     91   752   897  2000    82.1% chr9     +      61106858   61106987   130
YourSeq     88   744   884  2000    81.7% chr17    +      45915386   45915519   134
YourSeq     87   739   884  2000    80.5% chr4     +     155798141  155798279   139
YourSeq     86   739   879  2000    80.4% chr6     -     117822107  117822241   135
YourSeq     86   744   879  2000    86.5% chr3     -      28465062   28465195   134
YourSeq     85   739   872  2000    83.1% chr14    -      88154101   88154228   128
YourSeq     85   751   884  2000    82.9% chr8     +      35711471   35711598   128
YourSeq     84   737   862  2000    82.7% chr9     -      70194486   70194607   122
YourSeq     84   717   862  2000    90.5% chr4     +     117221525  117221671   147
YourSeq     83   737   870  2000    83.7% chr7     +      88547735   88547864   130
YourSeq     80   723   842  2000    90.2% chr8     -       5092012    5092403   392
YourSeq     80   758   884  2000    81.8% chr12    -      79013844   79013961   118
YourSeq     80   739   870  2000    82.3% chrX     +      99881395   99881520   126
YourSeq     80   741   862  2000    91.0% chr16    +      20329234   20329363   130
YourSeq     78   743   866  2000    90.0% chr5     -     129384831  129384954   124
YourSeq     77   739   862  2000    83.9% chr9     +      53642859   53642977   119
YourSeq     76   752   870  2000    84.6% chr9     -      64136382   64136495   114

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:

Tsfm Ts translation elongation factor, mitochondrial [ Mus musculus (house mouse) ]
Gene ID: 66399, updated on 10-Oct-2019

Gene summary
Official Symbol: Tsfm (provided by MGI)
Official Full Name: Ts translation elongation factor, mitochondrial (provided by MGI)
Primary source: MGI:MGI:1913649
See related: Ensembl:ENSMUSG00000040521
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: EF-TS; EF-Tsmt; 2310050B20Rik; 9430024O13Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 24.4), ovary adult (RPKM 16.4) and 28 other tissues
Orthologs: human, all

Genomic context
Location: 10; 10 D3
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (127022332..127030814, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (126459388..126467870, complement)

[Figure: genomic context on Chromosome 10 - NC_000076.6.]

Transcript information:

This gene has 7 transcripts.

Gene: Tsfm ENSMUSG00000040521
Description: Ts translation elongation factor, mitochondrial [Source:MGI Symbol;Acc:MGI:1913649]
Gene Synonyms: 2310050B20Rik, 9430024O13Rik, EF-TS, EF-Tsmt
Location: Chromosome 10: 127,011,572-127,030,840 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 7 transcripts (splice variants), 181 orthologues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.
Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Tsfm-201  ENSMUST00000040560.10  1206  324aa       ENSMUSP00000042134.4  Protein coding   CCDS24222  Q9CZR8   TSL:1 GENCODE basic APPRIS P1
Tsfm-202  ENSMUST00000120547.1   1271  193aa       ENSMUSP00000113446.1  Protein coding   -          Q9CX33   TSL:1 GENCODE basic
Tsfm-207  ENSMUST00000152054.7   655   206aa       ENSMUSP00000122669.1  Protein coding   -          D3Z4M7   TSL:3 GENCODE basic
Tsfm-203  ENSMUST00000134917.1   695   No protein  -                     Retained intron  -          -        TSL:1
Tsfm-206  ENSMUST00000145476.1   656   No protein  -                     Retained intron  -          -        TSL:1
Tsfm-204  ENSMUST00000138556.1   641   No protein  -                     Retained intron  -          -        TSL:1
Tsfm-205  ENSMUST00000140564.7   833   No protein  -                     lncRNA           -          -        TSL:3

[Figure: Ensembl region view, 39.27 kb, chromosome 10 from about 127.01 Mb to 127.04 Mb.
Forward strand genes: Avil-201, Avil-202, Avil-203, Avil-204 (protein coding).
Contig: AC134329.3.
Reverse strand genes: Tsfm-201, Tsfm-202, Tsfm-207 (protein coding); Tsfm-203, Tsfm-204, Tsfm-206 (retained intron); Tsfm-205 (lncRNA); Eef1akmt3-201 (protein coding).
Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank.
Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]

Transcript: ENSMUST00000040560 (Tsfm-201, protein coding; reverse strand, 8.51 kb)

[Figure: protein domain and sequence variant map for ENSMUSP00000042... Scale bar: 0 to 324. Tracks as labeled in the figure:]
Low complexity (Seg)
Superfamily: UBA-like superfamily; Elongation factor Ts, dimerisation domain superfamily
Pfam: Translation elongation factor EFTs/EF1B, dimerisation
PROSITE patterns: Translation elongation factor Ts, conserved site (two entries)
PANTHER: Translation elongation factor EFTs/EF1B; PTHR11741:SF0
HAMAP: Translation elongation factor EFTs/EF1B
Gene3D: 1.10.8.10; Elongation factor Ts, dimerisation domain superfamily
CDD: cd14275
Sequence variants (dbSNP and all other sources)
Variant legend: stop gained, missense variant, splice region variant, synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
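
The figures quoted in this report can be cross-checked with a few lines of arithmetic. The sketch below is not part of the original design workflow. It assumes that the effective KO region is exactly the interval between the two BLAT self-matches on chr10 (the 2000 bp downstream section and the 780 bp upstream section), and that the BLAT coordinates are 1-based and inclusive. All variable and function names are hypothetical.

```python
# Cross-check of numbers quoted in the report (illustrative sketch only).
# Assumption: the KO region is the gap between the two flanking sections
# that were BLAT-searched, using 1-based inclusive chr10 coordinates.

UP_FLANK = (127029683, 127030462)    # self-match of the 780 bp upstream section
DOWN_FLANK = (127022936, 127024935)  # self-match of the 2000 bp downstream section


def span(start, end):
    """Length in bp of a 1-based inclusive interval."""
    return end - start + 1


def region_between(lower, upper):
    """Interval strictly between two non-overlapping intervals (lower first)."""
    return lower[1] + 1, upper[0] - 1


def gc_percent(counts):
    """GC content in percent from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


# On the genome's forward axis the downstream flank has the lower coordinates,
# because Tsfm lies on the reverse strand.
ko_start, ko_end = region_between(DOWN_FLANK, UP_FLANK)
ko_size = span(ko_start, ko_end)

# Base counts from the two GC content summaries.
up_counts = {"A": 205, "C": 162, "T": 228, "G": 185}
down_counts = {"A": 476, "C": 434, "T": 558, "G": 532}

print("KO region:", ko_start, ko_end, ko_size, "bp")
print("GC upstream: %.2f%%" % gc_percent(up_counts))
print("GC downstream: %.2f%%" % gc_percent(down_counts))
```

Under these assumptions the gap between the flanks is 4747 bp, which agrees with the stated size of the effective KO region. The base counts give an overall GC content of about 44.5% for the upstream section and 48.3% for the downstream section.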
