https://www.alphaknockout.com

Mouse Guf1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Guf1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:

The Guf1 gene (NCBI Reference Sequence: NM_172711; Ensembl: ENSMUSG00000029208) is located on mouse chromosome 5. 17 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 17 (Transcript: ENSMUST00000087228). Exons 5~8 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Guf1 gene.

To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-265J22 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 5 starts at about 23.25% of the coding region. Knockout of exons 5~8 will result in a frameshift of the gene. Intron 4, used for 5'-loxP site insertion, is 622 bp; intron 8, used for 3'-loxP site insertion, is 1225 bp. The effective cKO region is ~2808 bp. The cKO region does not contain any other known gene.
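The two numeric claims in the note can be made concrete with a little arithmetic. The sketch below is illustrative only: it combines the 23.25% figure with the 651 aa protein length listed for ENSMUST00000087228 in the transcript table of this report, and the helper names are hypothetical. The exon lengths needed for the frameshift check are not given in this report, so `causes_frameshift` takes the deleted coding length as an input rather than assuming one.

```python
# Values quoted in this report.
PROTEIN_LENGTH_AA = 651        # protein length of ENSMUST00000087228
EXON5_START_FRACTION = 0.2325  # "about 23.25% of the coding region"


def approx_codon_at_fraction(fraction, protein_length_aa):
    """Return the approximate 1-based codon at a fractional position in the CDS."""
    return max(1, round(fraction * protein_length_aa))


def causes_frameshift(deleted_coding_bp):
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


codon = approx_codon_at_fraction(EXON5_START_FRACTION, PROTEIN_LENGTH_AA)
print(f"Exon 5 begins near codon {codon} of {PROTEIN_LENGTH_AA}")
```

Under these assumptions, roughly the first quarter of the protein is encoded upstream of the cKO region. To confirm the frameshift, pass the summed coding length of exons 5~8 (from the Ensembl exon table) to `causes_frameshift`.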


Overview of the Targeting Strategy

[Figure: schematic, drawn 5' to 3', of the wildtype allele (exons 1-9 and 17 labelled, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Guf1; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself; series: Forward, Reverse Complement.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
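A self dot plot of this kind can be sketched as follows. This is a minimal illustration assuming exact matching of 10 bp windows; the tool used to produce the figure and its scoring rules are not stated in this report, and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def window_matches(seq, other, window=10):
    """Return (i, j) pairs where seq[i:i+window] equals other[j:j+window]."""
    # Index every window of `other` so each window of `seq` is a dict lookup.
    index = {}
    for j in range(len(other) - window + 1):
        index.setdefault(other[j:j + window], []).append(j)
    hits = []
    for i in range(len(seq) - window + 1):
        for j in index.get(seq[i:i + window], []):
            hits.append((i, j))
    return hits


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) dot lists for a sequence compared with itself."""
    seq = seq.upper()
    # Drop the trivial main diagonal; off-diagonal dots indicate repeats.
    forward = [(i, j) for i, j in window_matches(seq, seq, window) if i != j]
    reverse = window_matches(seq, reverse_complement(seq), window)
    return forward, reverse


# Toy sequence with one 10 bp unit repeated after a 5 bp spacer.
unit = "ACGGTCATTG"
forward_dots, reverse_dots = self_dot_plot(unit + "TTTTT" + unit, window=10)
```

Runs of off-diagonal forward dots lying parallel to the main diagonal would indicate direct or tandem repeats; dots in the reverse list would indicate inverted repeats.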

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (9308 bp) | A (27.57%, 2566) | C (18.26%, 1700) | T (33.15%, 3086) | G (21.01%, 1956)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
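The base counts in the summary line are enough to recompute the overall GC content, and a sliding window reproduces the kind of per-region profile the figure shows. The sketch below uses the counts from this report; the `sliding_gc` helper is a hypothetical illustration of a 300 bp window scan, not the tool that generated the plot.

```python
# Base counts copied from the summary line of this report.
COUNTS = {"A": 2566, "C": 1700, "T": 3086, "G": 1956}


def gc_percent_from_counts(counts):
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """Yield (start, GC%) for every full window along the sequence."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, 100.0 * (chunk.count("G") + chunk.count("C")) / window


total_bp = sum(COUNTS.values())
overall_gc = gc_percent_from_counts(COUNTS)
print(total_bp, round(overall_gc, 2))
```

The overall value from these counts is about 39.3%, so the note about high GC content presumably refers to individual windows rather than to the sequence as a whole.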


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr5   +        69556301   69559300  3000
YourSeq     38   2924  3000   3000     89.6%  chr3   -        76338651   76338730    80
YourSeq     37   1472  1728   3000     63.7%  chr10  +       125403276  125403481   206
YourSeq     35   2931  2983   3000     87.9%  chr6   -        50144812   50144863    52
YourSeq     34   1460  1497   3000     94.8%  chr9   -        75158159   75158196    38
YourSeq     34   1460  1497   3000     94.8%  chr7   -       124468114  124468151    38
YourSeq     34   1460  1497   3000     94.8%  chr6   -        49977154   49977191    38
YourSeq     34   1460  1497   3000     94.8%  chr5   -        37047279   37047316    38
YourSeq     34   1460  1497   3000     94.8%  chr1   -        72175964   72176001    38
YourSeq     34   1460  1497   3000     94.8%  chr7   +        73926346   73926383    38
YourSeq     34   1460  1497   3000     94.8%  chr7   +        35259456   35259493    38
YourSeq     33   1463  1495   3000    100.0%  chr1   -        55736221   55736253    33
YourSeq     33   1460  1496   3000     94.6%  chr18  +        69341761   69341797    37
YourSeq     32   1462  1494   3000    100.0%  chr1   -         3490404    3490438    35
YourSeq     31   1460  1497   3000     81.9%  chr2   +       172547206  172547239    34
YourSeq     30   1462  1492   3000    100.0%  chr4   +        57074884   57074916    33
YourSeq     29   1460  1490   3000     96.8%  chr3   +        87852966   87852996    31
YourSeq     28   1465  1495   3000     86.3%  chr15  -        53998520   53998548    29
YourSeq     25   1459  1483   3000    100.0%  chr11  -       101788848  101788872    25
YourSeq     25   2927  2952   3000    100.0%  chr11  -        48601770   48601796    27

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
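Rows of this BLAT listing can be parsed into structured records to compare off-target hits with the self-hit. The sketch below is illustrative: the field names are hypothetical labels for the eleven columns in the header, the sample rows are copied from the listing, and no significance threshold is assumed since the report does not state one.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row):
    """Parse one whitespace-separated BLAT result row into a BlatHit."""
    fields = row.split()
    # Drop the "browser details" link text if the row still carries it.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


# First three rows of the upstream listing in this report.
ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr5 + 69556301 69559300 3000",
    "browser details YourSeq 38 2924 3000 3000 89.6% chr3 - 76338651 76338730 80",
    "YourSeq 37 1472 1728 3000 63.7% chr10 + 125403276 125403481 206",
]

hits = [parse_blat_row(r) for r in ROWS]
self_hit = hits[0]
best_off_target = max(hits[1:], key=lambda h: h.score)
# Off-target score expressed as a fraction of the query length.
off_target_fraction = best_off_target.score / self_hit.q_size
```

For the upstream flank, the best off-target hit scores 38 against a 3000 bp query, which is consistent with the note that no significant similarity is found.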

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr5   +        69562109   69565108  3000
YourSeq    136   1826  2015   3000     86.5%  chr12  +        55938431   55938622   192
YourSeq    135   1826  2014   3000     87.0%  chr7   +         3743380    3743564   185
YourSeq    135   1837  2015   3000     89.1%  chr5   +        31501475   31501658   184
YourSeq    134   1838  2014   3000     90.4%  chr10  +        25329012   25329191   180
YourSeq    129   1826  2016   3000     87.3%  chr13  -        46609251   46609447   197
YourSeq    128   1841  2012   3000     89.6%  chr1   -        74080459   74080638   180
YourSeq    128   1831  2015   3000     86.3%  chr9   +        75430206   75430392   187
YourSeq    127   1826  2014   3000     88.5%  chr1   +       175911526  175911866   341
YourSeq    126   1836  2015   3000     91.0%  chr1   +       131331072  131331260   189
YourSeq    125   1837  2015   3000     88.5%  chrX   +       102203983  102204167   185
YourSeq    125   1836  2014   3000     90.9%  chr2   +        49067596   49067785   190
YourSeq    124   1836  2014   3000     84.8%  chr9   -        62297516   62297676   161
YourSeq    123   1836  2015   3000     88.2%  chr2   -       144618609  144618902   294
YourSeq    122   1865  2014   3000     91.3%  chr10  -         5763866    5764019   154
YourSeq    122   1866  2016   3000     90.7%  chr10  +        60187630   60187783   154
YourSeq    121   1867  2015   3000     91.8%  chr4   -       126004427  126004578   152
YourSeq    119   1831  2016   3000     91.7%  chr13  -        59137323   59137510   188
YourSeq    119   1863  2012   3000     90.5%  chr19  +        32730940   32731091   152
YourSeq    118   1858  2015   3000     88.9%  chr1   +       111611993  111612153   161

Note: The 3000 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
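The chr5 self-hits of the two BLAT queries give the genomic coordinates of the upstream and downstream flanks, which allows a quick cross-check of the stated cKO region size. The sketch below assumes that the two 3000 bp flanks directly abut the cKO region, which the report does not state explicitly; the function names are hypothetical.

```python
# chr5 self-hit coordinates copied from the two BLAT listings (1-based, inclusive).
UP_FLANK = ("chr5", 69556301, 69559300)
DOWN_FLANK = ("chr5", 69562109, 69565108)


def interval_length(interval):
    """Length of a 1-based inclusive interval (chrom, start, end)."""
    _chrom, start, end = interval
    return end - start + 1


def gap_between(upstream, downstream):
    """Number of bases strictly between two intervals on the same chromosome."""
    if upstream[0] != downstream[0]:
        raise ValueError("intervals are on different chromosomes")
    return downstream[1] - upstream[2] - 1


cko_size = gap_between(UP_FLANK, DOWN_FLANK)
print(f"Region between flanks: {cko_size} bp")
```

Under that assumption the gap between the flanks is 2808 bp, matching the effective cKO region size of ~2808 bp given in the strategy notes.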


Gene information:

Guf1 GUF1 homolog, GTPase [Mus musculus (house mouse)]
Gene ID: 231279, updated on 12-Aug-2019

Gene summary

Official Symbol: Guf1 (provided by MGI)
Official Full Name: GUF1 homolog, GTPase (provided by MGI)
Primary source: MGI:MGI:2140726
See related: Ensembl:ENSMUSG00000029208
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: EF-4; AA407526; 4631409J12
Expression: Broad expression in CNS E18 (RPKM 9.9), CNS E14 (RPKM 9.2) and 27 other tissues
Orthologs: human; all

Genomic context

Location: 5; 5 C3.1
Exon count: 18

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (69556910..69574652)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (69948181..69964869)

[Figure: genomic context of Guf1 on Chromosome 5 - NC_000071.6.]


Transcript information: This gene has 9 transcripts

Gene: Guf1 ENSMUSG00000029208

Description: GUF1 homolog, GTPase [Source:MGI Symbol;Acc:MGI:2140726]
Location: Chromosome 5: 69,556,923-69,575,973 forward strand. GRCm38:CM000998.2
About this gene: This gene has 9 transcripts (splice variants), 201 orthologues, 18 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Guf1-202  ENSMUST00000087228.10  3433  651aa       ENSMUSP00000084480.4  Protein coding           CCDS19322  Q8C3X4      TSL:1, GENCODE basic, APPRIS P1
Guf1-201  ENSMUST00000031113.12  1769  563aa       ENSMUSP00000031113.6  Protein coding           CCDS80296  Q8C3X4      TSL:1, GENCODE basic
Guf1-208  ENSMUST00000173205.2   1827  590aa       ENSMUSP00000133467.2  Protein coding           -          G3UWY0      CDS 5' incomplete, TSL:5
Guf1-207  ENSMUST00000154728.7   3091  302aa       ENSMUSP00000144246.1  Nonsense mediated decay  -          A0A0J9YUM0  TSL:1
Guf1-206  ENSMUST00000144363.7   1776  296aa       ENSMUSP00000114707.2  Nonsense mediated decay  -          F6ZM03      CDS 5' incomplete, TSL:2
Guf1-205  ENSMUST00000132169.7   1752  302aa       ENSMUSP00000144290.1  Nonsense mediated decay  -          A0A0J9YUM0  TSL:1
Guf1-204  ENSMUST00000125660.2   6321  No protein  -                     Retained intron          -          -           TSL:1
Guf1-203  ENSMUST00000125543.7   2611  No protein  -                     Retained intron          -          -           TSL:1
Guf1-209  ENSMUST00000202180.1   1293  No protein  -                     Retained intron          -          -           TSL:NA


[Figure: Ensembl region view of chromosome 5, 69.55-69.58 Mb (39.05 kb), showing forward and reverse strands.
Forward strand: Guf1-201, Guf1-202 and Guf1-208 (protein coding); Guf1-205, Guf1-206 and Guf1-207 (nonsense mediated decay); Guf1-203, Guf1-204 and Guf1-209 (retained intron).
Contig: AC109186.7.
Reverse strand: 3110031N09Rik-201 (TEC); Gnpda2-201, Gnpda2-202, Gnpda2-206 and Gnpda2-207 (protein coding); Gnpda2-203 and Gnpda2-204 (nonsense mediated decay); Gnpda2-205 (retained intron).
Regulatory Build legend: promoter; promoter flank.
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000087228

[Figure: Ensembl protein summary for Guf1-202 (protein coding; 16.14 kb, forward strand), protein ENSMUSP00000084..., scale bar 0-651 aa. Annotation tracks as labelled in the figure:
Low complexity (Seg)
TIGRFAM: Elongation factor 4; Small GTP-binding protein domain
Superfamily: Translation protein, beta-barrel domain superfamily; P-loop containing nucleoside triphosphate hydrolase; EF-G domain III/V-like
Prints: Transcription factor, GTP-binding domain
Pfam: Translation elongation factor EFTu-like, domain 2; GTP-binding protein LepA, C-terminal; Transcription factor, GTP-binding domain; Elongation factor EFG, domain V-like
PROSITE profiles: Transcription factor, GTP-binding domain
PROSITE patterns: Tr-type G domain, conserved site
PANTHER: PTHR43512; PTHR43512:SF3
HAMAP: Elongation factor 4
Gene3D: 3.40.50.300; 2.40.30.10; 3.30.70.870; 3.30.70.3380; LepA, C-terminal domain superfamily
CDD: cd01890; cd03699; cd16260; Elongation factor 4, domain IV
Sequence variants (dbSNP and all other sources): missense variant; synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
