https://www.alphaknockout.com

Mouse Gnmt Knockout Project (CRISPR/Cas9)

Objective: To create a Gnmt knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gnmt gene (NCBI Reference Sequence: NM_010321; Ensembl: ENSMUSG00000002769) is located on mouse chromosome 17. Six exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000002846). Exon 2 is selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a null mutation display elevated levels of methionine and S-adenosylmethionine in the liver. Mice homozygous for another null allele exhibit hepatitis, increased hepatic glycogen storage, and hepatocellular carcinoma.

Exon 2 starts at about 23.55% of the way into the coding region and covers 14.56% of it. The effective KO region is ~128 bp and does not contain any other known gene.
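The percentages above can be checked with a short sketch. It assumes a coding length of 879 bp (293 codons, taken from the 293 aa protein length listed for Gnmt-201 in this report, stop codon excluded) and an exon 2 offset of 207 bp, which is back-calculated from the stated 23.55%. Both values and the function name are illustrative assumptions, not part of the original design.

```python
def exon_deletion_summary(cds_bp: int, exon_start_bp: int, exon_len_bp: int) -> dict:
    """Summarise where an exon sits in the CDS and whether deleting it shifts the frame."""
    return {
        "start_pct": round(100 * exon_start_bp / cds_bp, 2),
        "covers_pct": round(100 * exon_len_bp / cds_bp, 2),
        # A deletion whose length is not a multiple of 3 shifts the reading frame.
        "frameshift": exon_len_bp % 3 != 0,
    }

# Assumed inputs: 293 aa x 3 = 879 bp; exon 2 begins 207 bp into the CDS; exon 2 is 128 bp.
summary = exon_deletion_summary(cds_bp=879, exon_start_bp=207, exon_len_bp=128)
print(summary)
```

Because 128 is not a multiple of 3, removing the whole exon would be expected to shift the downstream reading frame, which is consistent with using exon 2 as a knockout target.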


Overview of the Targeting Strategy

[Figure: wildtype Gnmt allele drawn 5' to 3', with exons labelled 1, 2 and 6 and two gRNA regions marked. Legend: exon of mouse Gnmt; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Dot plot figure. Legend: Forward, Reverse Complement.]

Note: The 1553 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Dot plot figure. Legend: Forward, Reverse Complement.]

Note: The 548 bp section downstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
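The self-alignment behind a dot plot can be sketched as follows. This is a minimal illustration that assumes exact word matching on the forward strand only (the reverse-complement comparison is omitted), and it is not the tool used to produce the plots in this report. The function name and the toy sequence are hypothetical.

```python
def self_dot_plot(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return (i, j) pairs with i < j where the window-length words at i and j are identical."""
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)

# Toy sequence (hypothetical, not the Gnmt flank): an 8 bp unit repeated three times.
toy = "ACGTTGCA" * 3 + "GGGATCCCTA"
hits = self_dot_plot(toy, window=8)
offsets = sorted({j - i for i, j in hits})
print(len(hits), offsets)
```

Off-diagonal hits that line up at a constant offset (here multiples of the 8 bp repeat unit) are the signature of a tandem repeat; a region with no such hits gives an empty list.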


Overview of the GC Content Distribution (up)

Window size: 300 bp

[GC content plot.]

Summary: Full Length(1553bp) | A(27.5% 427) | C(22.99% 357) | T(24.21% 376) | G(25.31% 393)

Note: The 1553 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[GC content plot.]

Summary: Full Length(548bp) | A(18.61% 102) | C(28.1% 154) | T(29.01% 159) | G(24.27% 133)

Note: The 548 bp section downstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
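The overall GC content of each flank follows directly from the base counts in the two summaries, and the windowed profile can be sketched in the same way. The snippet below is an illustrative sketch: the function names are hypothetical, the base counts are copied from the summaries above, and the short sequence passed to the sliding-window function is a toy input rather than the Gnmt flank.

```python
def gc_percent(counts: dict[str, int]) -> float:
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)

def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage of each window of the given size along seq."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]

upstream = {"A": 427, "C": 357, "T": 376, "G": 393}    # 1553 bp flank
downstream = {"A": 102, "C": 154, "T": 159, "G": 133}  # 548 bp flank
print(gc_percent(upstream), gc_percent(downstream))

# Toy example with a 4 bp window and a 2 bp step.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

With these counts the upstream flank is about 48.29% GC and the downstream flank about 52.37% GC.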


BLAT Search Results (up)

QUERY    SCORE  Q-START  Q-END  QSIZE  IDENTITY  CHROM  STRAND  T-START    T-END      SPAN
-------------------------------------------------------------------------------------------
YourSeq  1553   1        1553   1553   100.0%    chr17  -       46727393   46728945   1553
YourSeq  235    1064     1393   1553   94.5%     chr2   +       130670414  130779345  108932
YourSeq  220    1066     1390   1553   97.9%     chr17  +       53619522   53620018   497
YourSeq  212    1088     1390   1553   95.8%     chr2   -       32101876   32102201   326
YourSeq  210    1097     1390   1553   97.4%     chr17  -       32261386   32261709   324
YourSeq  208    830      1390   1553   90.7%     chr17  -       56243546   56244017   472
YourSeq  208    1082     1391   1553   94.1%     chr16  +       14322552   14323132   581
YourSeq  207    1065     1390   1553   91.3%     chr2   -       70997981   70998216   236
YourSeq  206    1083     1392   1553   93.7%     chr4   +       129296501  129297126  626
YourSeq  204    1089     1390   1553   95.2%     chr4   +       116680871  116681231  361
YourSeq  203    830      1389   1553   93.4%     chr17  -       24098897   24099407   511
YourSeq  200    1169     1391   1553   96.2%     chr1   +       78415614   78415831   218
YourSeq  199    832      1392   1553   93.4%     chr6   -       100164861  100165395  535
YourSeq  198    1194     1405   1553   98.1%     chr11  +       23259838   23260050   213
YourSeq  196    1197     1406   1553   98.6%     chr6   -       54651256   54651470   215
YourSeq  195    1134     1390   1553   97.2%     chr11  -       101428001  101428334  334
YourSeq  195    1193     1404   1553   97.6%     chr2   +       103818241  103818456  216
YourSeq  195    1106     1390   1553   96.3%     chr12  +       8768094    8768536    443
YourSeq  194    1195     1404   1553   98.1%     chr7   -       54320853   54321068   216
YourSeq  193    1201     1406   1553   97.1%     chr7   -       101254271  101254478  208

Note: The 1553 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  Q-START  Q-END  QSIZE  IDENTITY  CHROM  STRAND  T-START    T-END      SPAN
-------------------------------------------------------------------------------------------
YourSeq  548    1        548    548    100.0%    chr17  -       46726717   46727264   548
YourSeq  26     177      206    548    81.5%     chr16  -       4492908    4492934    27
YourSeq  25     518      543    548    100.0%    chr11  -       81079904   81079936   33
YourSeq  23     525      548    548    100.0%    chr11  +       19647766   19647790   25
YourSeq  22     190      211    548    100.0%    chr14  +       122169697  122169718  22
YourSeq  21     114      134    548    100.0%    chr17  +       79422978   79422998   21
YourSeq  20     520      539    548    100.0%    chr13  -       42238235   42238254   20
YourSeq  20     7        26     548    100.0%    chr11  +       102375177  102375196  20

Note: The 548 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
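The top hit in each BLAT table is the query region itself, so the two rows give the genomic coordinates of the upstream and downstream flanks. The sketch below parses those two rows and computes the interval lying between the flanks. It assumes that this interval corresponds to the targeted exon 2 region, which is an inference from the coordinates rather than something stated in the report; the class and function names are hypothetical.

```python
from typing import NamedTuple

class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row; a leading 'browser details' is ignored."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))

up = parse_blat_row("YourSeq 1553 1 1553 1553 100.0% chr17 - 46727393 46728945 1553")
down = parse_blat_row("YourSeq 548 1 548 548 100.0% chr17 - 46726717 46727264 548")

# Gnmt is on the reverse strand, so the downstream flank has the lower coordinates.
gap_start, gap_end = down.t_end + 1, up.t_start - 1
gap_len = gap_end - gap_start + 1
print(f"{up.chrom}:{gap_start}-{gap_end} ({gap_len} bp)")
```

The interval between the flanks is 128 bp long, which agrees with the ~128 bp effective KO region given in the strategy summary.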


Gene and protein information: Gnmt glycine N-methyltransferase [ Mus musculus (house mouse) ] Gene ID: 14711, updated on 5-Oct-2019

Gene summary

Official Symbol: Gnmt (provided by MGI)
Official Full Name: glycine N-methyltransferase (provided by MGI)
Primary source: MGI:MGI:1202304
See related: Ensembl:ENSMUSG00000002769
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Expression: Restricted expression toward liver adult (RPKM 1724.9)
Orthologs: human; all

Genomic context

Location: 17 22.9 cM; 17 C

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (46725664..46729211, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (46862613..46866114, complement)

Chromosome 17 - NC_000083.6


Transcript information: This gene has 3 transcripts

Gene: Gnmt ENSMUSG00000002769

Description: glycine N-methyltransferase [Source:MGI Symbol;Acc:MGI:1202304]
Gene Synonyms: glycine N methyl transferase
Location: Chromosome 17: 46,725,664-46,729,168 reverse strand. GRCm38:CM001010.2
About this gene: This gene has 3 transcripts (splice variants), 193 orthologues, is a member of 1 Ensembl protein family and is associated with 18 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Gnmt-201  ENSMUST00000002846.8  1035  293aa       ENSMUSP00000002846.8  Protein coding   CCDS28838  Q9QXF8   TSL:1 GENCODE basic APPRIS P1
Gnmt-202  ENSMUST00000147112.1  855   No protein  -                     Retained intron  -          -        TSL:2
Gnmt-203  ENSMUST00000233086.1  554   No protein  -                     lncRNA           -          -        -

[Figure: Ensembl genomic context of Gnmt, 23.50 kb window, 46.720 Mb to 46.735 Mb. Forward strand: Pex6-201 (protein coding), Pex6-202 (lncRNA), Pex6-203 and Pex6-204 (retained intron). Contig: CT030702.11. Reverse strand: Gnmt-201 (protein coding), Gnmt-202 (retained intron), Gnmt-203 (lncRNA), Cnpy3-201 (protein coding), Cnpy3-203 (nonsense mediated decay), Cnpy3-204 (retained intron). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000002846

[Figure: Gnmt-201 (protein coding), reverse strand, 3.50 kb, with protein ENSMUSP00000002... domain and variant tracks on a 0 to 293 aa scale. Tracks shown include PDB-ENSP mappings and sequence variants (dbSNP and all other sources). Variant legend: stop gained, missense variant, synonymous variant.]

Protein domain annotations:
Superfamily: S-adenosyl-L-methionine-dependent methyltransferase
Pfam: Methyltransferase domain
PROSITE profiles: Glycine/Sarcosine N-methyltransferase
PIRSF: Glycine/Sarcosine N-methyltransferase
PANTHER: Glycine/Sarcosine N-methyltransferase
Gene3D: 3.30.46.10; 3.40.50.150
CDD: cd02440

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
