https://www.alphaknockout.com

Mouse Gfpt1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gfpt1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gfpt1 gene (NCBI Reference Sequence: NM_013528.3; Ensembl: ENSMUSG00000029992) is located on mouse chromosome 6. Nineteen exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 19 (Transcript: ENSMUST00000113658). Exons 3~5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gfpt1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-236I17 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 3 starts at about 5.68% of the coding region. Knockout of exons 3~5 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 2722 bp, and the size of intron 5 for 3'-loxP site insertion is 1427 bp. The size of the effective cKO region is ~3801 bp. This strategy is designed based on genetic information in existing databases. Due to the complexity of biological processes, the risks of loxP insertion to gene transcription, RNA splicing and protein translation cannot all be predicted at the existing technological level.
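The frameshift statement rests on simple arithmetic: a deletion shifts the downstream reading frame whenever the number of coding bases removed is not a multiple of 3. The Python sketch below illustrates that check and converts the 5.68% position into an approximate nucleotide offset. The coding length is an assumption derived from the 681 aa protein listed for ENSMUST00000113658 in this report (681 codons plus a stop codon); the deleted length used in the example is hypothetical, because the report does not list per-exon coding lengths.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True when removing this many coding bases shifts the reading frame."""
    if deleted_coding_bp <= 0:
        raise ValueError("deleted_coding_bp must be positive")
    return deleted_coding_bp % 3 != 0


def approx_cds_offset(fraction: float, cds_length_bp: int) -> int:
    """Approximate nucleotide offset into the CDS for a fractional position."""
    return round(fraction * cds_length_bp)


# ASSUMPTION: CDS length inferred from the 681 aa protein plus one stop codon.
CDS_LENGTH_BP = 681 * 3 + 3

# Exon 3 starts at about 5.68% of the coding region (value from the report).
exon3_offset = approx_cds_offset(0.0568, CDS_LENGTH_BP)

# HYPOTHETICAL combined coding length of exons 3-5, for illustration only.
hypothetical_deleted_bp = 400

if __name__ == "__main__":
    print("Approximate CDS offset of exon 3 start:", exon3_offset)
    print("Frameshift expected:", causes_frameshift(hypothetical_deleted_bp))
```

To apply the check to the real design, replace the hypothetical value with the summed coding lengths of exons 3 to 5 taken from the transcript annotation.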

Overview of the Targeting Strategy

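[Figure: schematic of the targeting strategy showing four alleles drawn 5' to 3': the wildtype allele (exons 1, 3, 4, 5, 6 and 19 labeled, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Gfpt1; homology arm; cKO region; loxP site.]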

Overview of the Dot Plot

[Figure: self-alignment dot plot of the sequence, window size 10 bp, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
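A self dot plot of this kind can be sketched in Python as follows. This is a minimal illustration that assumes exact matching of 10 bp windows; the tool actually used for the report is not named and may tolerate mismatches or score windows differently. The function names are illustrative, not part of any library.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) sets of (i, j) window start positions.

    forward: windows at i and j (i != j) are identical.
    reverse: window at i equals the reverse complement of the window at j.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every window of the sequence by its content.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = set(), set()
    for kmer, starts in positions.items():
        # Repeated identical windows appear as off-diagonal forward dots.
        for i in starts:
            for j in starts:
                if i != j:
                    forward.add((i, j))
        # Windows whose reverse complement occurs elsewhere give reverse dots.
        for j in positions.get(reverse_complement(kmer), ()):
            for i in starts:
                reverse.add((i, j))
    return forward, reverse


if __name__ == "__main__":
    # Toy sequence with one 10 bp direct repeat (not taken from the report).
    toy = "AAAC" + "GATTACAGGC" + "TTT" + "GATTACAGGC"
    fwd, rev = self_dot_plot(toy, window=10)
    print(sorted(fwd))
```

Runs of off-diagonal forward dots that lie close to the main diagonal are the signature of tandem repeats, which is what the note above reports for this sequence.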

Overview of the GC Content Distribution

[Figure: GC content distribution along the sequence, window size 300 bp.]

Summary: Full Length(9452bp) | A(26.95% 2547) | C(19.53% 1846) | T(31.2% 2949) | G(22.32% 2110)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
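The overall GC content implied by the base counts in the summary line, and a windowed GC profile like the one plotted, can be computed with a short Python sketch. The base counts are taken from the summary above; the helper names and the toy input are illustrative assumptions, and the report does not state what threshold it treats as "significantly high".

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA string."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction for each full window along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the summary line of the report.
counts = {"A": 2547, "C": 1846, "T": 2949, "G": 2110}
total_bp = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total_bp

if __name__ == "__main__":
    print(f"Total length: {total_bp} bp")
    print(f"Overall GC content: {overall_gc:.2%}")
```

The counts sum to the stated full length of 9452 bp and give an overall GC content of roughly 41.9%.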

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr6   +       87050502   87053501   3000
YourSeq  168    877    1172  3000   92.9%     chr11  +       112799738  112800057  320
YourSeq  162    915    1153  3000   93.6%     chr10  -       81276791   81277058   268
YourSeq  158    152    340   3000   94.4%     chr4   -       126356532  126356725  194
YourSeq  158    159    330   3000   96.5%     chr10  +       81515086   81515257   172
YourSeq  158    822    1142  3000   92.5%     chr10  +       60149464   60150206   743
YourSeq  156    152    332   3000   94.3%     chr7   -       127295044  127295225  182
YourSeq  156    161    331   3000   96.0%     chr15  +       53257294   53257465   172
YourSeq  153    161    325   3000   97.0%     chr1   -       87346622   87346788   167
YourSeq  153    991    1175  3000   91.8%     chr1   +       18480695   18480881   187
YourSeq  152    1001   1175  3000   94.7%     chr3   -       57577694   57577870   177
YourSeq  152    997    1179  3000   93.7%     chr13  -       73638139   73638321   183
YourSeq  151    160    320   3000   95.6%     chr11  -       110418177  110418335  159
YourSeq  150    161    332   3000   93.7%     chr8   -       64050362   64050533   172
YourSeq  150    160    330   3000   94.2%     chr5   -       16797533   16797704   172
YourSeq  150    1000   1173  3000   93.7%     chr13  +       93594260   93594435   176
YourSeq  149    984    1287  3000   90.3%     chr19  -       10505467   10505766   300
YourSeq  148    1008   1176  3000   94.1%     chr5   -       3508001    3508221    221
YourSeq  147    957    1156  3000   93.0%     chr9   -       35186753   35186995   243
YourSeq  147    993    1165  3000   93.0%     chr2   -       90908331   90908504   174

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr6   +       87056454   87059453   3000
YourSeq  77     1393   1687  3000   89.8%     chr6   +       105839180  105839513  334
YourSeq  76     879    1121  3000   70.8%     chr8   +       122878466  122878633  168
YourSeq  74     880    1049  3000   76.7%     chr18  -       38129391   38129556   166
YourSeq  72     1      1001  3000   86.0%     chr2   -       168780686  168810958  30273
YourSeq  70     1      90    3000   88.9%     chrX   -       93477517   93477606   90
YourSeq  70     1      91    3000   89.1%     chr4   -       10910006   10910195   190
YourSeq  68     1      98    3000   88.1%     chr1   -       59452654   59452754   101
YourSeq  65     8      90    3000   89.2%     chr12  +       71313623   71313705   83
YourSeq  64     1      90    3000   85.6%     chr6   -       87740824   87740913   90
YourSeq  63     8      92    3000   87.1%     chr4   -       133228321  133228405  85
YourSeq  63     1      91    3000   87.1%     chr11  -       107235110  107235688  579
YourSeq  62     8      92    3000   87.0%     chr1   -       71595986   71596070   85
YourSeq  62     928    1027  3000   83.9%     chr10  +       121869143  121869250  108
YourSeq  58     8      91    3000   84.6%     chrX   +       12190868   12190951   84
YourSeq  58     20     98    3000   87.2%     chr14  +       33229773   33229851   79
YourSeq  58     864    977   3000   76.2%     chr11  +       78174305   78174412   108
YourSeq  57     879    993   3000   80.0%     chr18  +       31684016   31684132   117
YourSeq  56     20     93    3000   87.9%     chr11  +       79597958   79598031   74
YourSeq  55     1      63    3000   93.7%     chr19  -       5709474    5709536    63

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
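One way to screen BLAT rows like those in the two result tables is to set aside the full-length self match and flag any remaining hit that covers a large share of the query. The Python sketch below parses rows in the column order used by the tables (QUERY, SCORE, START, END, QSIZE, IDENTITY, CHROM, STRAND, START, END, SPAN). The 50% coverage cutoff is an assumption chosen for illustration; the report does not state what criterion it uses for "significant similarity".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row (query name ignored)."""
    f = line.split()
    return BlatHit(
        int(f[1]), int(f[2]), int(f[3]), int(f[4]),
        float(f[5].rstrip("%")), f[6], f[7],
        int(f[8]), int(f[9]), int(f[10]),
    )


def off_target_hits(hits, min_coverage: float = 0.5):
    """Hits other than the perfect self match that cover >= min_coverage
    of the query. The default cutoff is an ASSUMED value, not the report's."""
    flagged = []
    for h in hits:
        coverage = (h.q_end - h.q_start + 1) / h.q_size
        is_self = coverage == 1.0 and h.identity == 100.0
        if not is_self and coverage >= min_coverage:
            flagged.append(h)
    return flagged


# Three rows copied from the result tables in this report.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr6 + 87050502 87053501 3000",
    "YourSeq 168 877 1172 3000 92.9% chr11 + 112799738 112800057 320",
    "YourSeq 77 1393 1687 3000 89.8% chr6 + 105839180 105839513 334",
]
hits = [parse_hit(r) for r in rows]

if __name__ == "__main__":
    print("Flagged hits:", off_target_hits(hits))
```

With these sample rows nothing is flagged at the assumed cutoff, since the secondary hits each cover only about 10% of the 3000 bp query.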

Gene and protein information: Gfpt1 glutamine fructose-6-phosphate transaminase 1 [ Mus musculus (house mouse) ] Gene ID: 14583, updated on 8-Nov-2020

Gene summary

Official Symbol: Gfpt1 (provided by MGI)
Official Full Name: glutamine fructose-6-phosphate transaminase 1 (provided by MGI)
Primary source: MGI:MGI:95698
See related: Ensembl:ENSMUSG00000029992
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: GF; GFA; GFAT; Gfpt; GFAT1; GFAT1m; AI324119; AI449986; 2810423A18Rik
Expression: Ubiquitous expression in colon adult (RPKM 40.5), large intestine adult (RPKM 20.6) and 27 other tissues
Orthologs: human, all

Genomic context

Location: 6 D1; 6 37.81 cM

Exon count: 20

Annotation release  Status             Assembly                      Chr  Location
109                 current            GRCm39 (GCF_000001635.27)     6    NC_000072.7 (87019828..87069189)
108.20200622        previous assembly  GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (87042846..87092207)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (86992840..87042201)

[Figure: genomic context of Gfpt1 on Chromosome 6 - NC_000072.7.]

Transcript information: This gene has 7 transcripts

Gene: Gfpt1 ENSMUSG00000029992

Description: glutamine fructose-6-phosphate transaminase 1 [Source:MGI Symbol;Acc:MGI:95698]
Gene Synonyms: 2810423A18Rik, GFA, GFAT, GFAT1
Location: Chromosome 6: 87,042,846-87,092,197 forward strand. GRCm38:CM000999.2
About this gene: This gene has 7 transcripts (splice variants), 297 orthologues, 1 paralogue and is associated with 8 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype               CCDS       UniProt Match  Flags
Gfpt1-204  ENSMUST00000113658.7  6238  681aa       ENSMUSP00000109288.1  Protein coding        CCDS39545  P47856         TSL:1, GENCODE basic, APPRIS P1
Gfpt1-201  ENSMUST00000032057.7  2286  697aa       ENSMUSP00000032057.7  Protein coding        -          P47856         TSL:1, GENCODE basic
Gfpt1-202  ENSMUST00000113655.7  1255  102aa       ENSMUSP00000109285.1  Protein coding        -          D3YYE0         TSL:1, GENCODE basic
Gfpt1-203  ENSMUST00000113657.7  1200  124aa       ENSMUSP00000109287.1  Protein coding        -          D3YYD9         TSL:1, GENCODE basic
Gfpt1-206  ENSMUST00000150410.1  473   No protein  -                     Processed transcript  -          -              TSL:2
Gfpt1-207  ENSMUST00000204872.1  2809  No protein  -                     Retained intron       -          -              TSL:NA
Gfpt1-205  ENSMUST00000146410.1  729   No protein  -                     Retained intron       -          -              TSL:3

[Figure: Ensembl genomic view of Gfpt1 (comprehensive gene set), 69.35 kb, forward strand, spanning approximately 87.04 Mb to 87.10 Mb. Transcripts shown: Gfpt1-204 (protein coding), Gfpt1-203 (protein coding), Gfpt1-205 (retained intron), Gfpt1-202 (protein coding), Gfpt1-206 (processed transcript), Gfpt1-207 (retained intron), Gfpt1-201 (protein coding). Contigs: AC159712.7, AC162465.3. A Regulatory Build track is included. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript). Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site.]

Transcript: ENSMUST00000113658

[Figure: protein domain view of Gfpt1-204 (protein coding; 49.35 kb, forward strand) for translation ENSMUSP00000109..., with a scale bar from 0 to 681 aa. Annotation tracks: Low complexity (Seg); TIGRFAM (Glucosamine-fructose-6-phosphate aminotransferase, isomerising); Superfamily (Nucleophile aminohydrolases, N-terminal; SSF53697); Pfam (PF13522; Sugar isomerase (SIS)); PROSITE profiles (Glutamine amidotransferase type 2 domain; Sugar isomerase (SIS)); PANTHER (PTHR10937; PTHR10937:SF12); Gene3D (3.40.50.10490); CDD (cd00714; GlmS/AgaS, SIS domain 1; GlmS/FrlB, SIS domain 2). Sequence variants (dbSNP and all other sources) are shown, with a variant legend for missense and synonymous variants.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
