https://www.alphaknockout.com

Mouse Mgst3 Knockout Project (CRISPR/Cas9)

Objective: To create a Mgst3 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mgst3 gene (NCBI Reference Sequence: NM_025569; Ensembl: ENSMUSG00000026688) is located on mouse chromosome 1. Six exons have been identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000028005). Exons 2~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 0.22% of the coding region. Exons 2~6 cover 100.0% of the coding region. The size of the effective KO region is ~5901 bp. The KO region does not contain any other known gene.
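The 0.22% figure can be sanity-checked with a little arithmetic. The sketch below is one possible reading, not the report's stated method: it assumes the coding sequence length is the protein length from the transcript table (153 aa) plus one stop codon, and that the percentage is a 1-based CDS position divided by that length. The function names are illustrative.

```python
# Assumption: CDS length = (protein length + 1 stop codon) * 3 nucleotides.
# Assumption: "starts from about X%" means position / CDS length.
PROTEIN_LENGTH_AA = 153  # Mgst3-201, from the transcript table in this report


def cds_length_nt(protein_aa: int) -> int:
    """Coding sequence length in nucleotides, counting the stop codon."""
    return (protein_aa + 1) * 3


def percent_of_cds(position_nt: int, cds_nt: int) -> float:
    """Express a 1-based CDS position as a percentage of the CDS length."""
    return 100.0 * position_nt / cds_nt


cds = cds_length_nt(PROTEIN_LENGTH_AA)
first_base_pct = percent_of_cds(1, cds)
print(cds, round(first_base_pct, 2))
```

Under these assumptions the first coding base sits at about 0.22% of the CDS, which would be consistent with the start codon lying in exon 2 and exons 2~6 covering the whole coding region.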

Overview of the Targeting Strategy

[Figure: Wildtype allele drawn 5' to 3' with exons 1 to 6 and two gRNA regions marked. Legend: exon of mouse Mgst3; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Dot plot: Sequence 12 plotted against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Dot plot: Sequence 12 plotted against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
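The self-alignment behind a dot plot can be sketched as follows. This is a minimal illustration using exact window matches only; the tool and scoring actually used for the report are not stated, and the function names and toy sequence are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Compare a sequence with itself using exact window matches.

    Returns two lists of (i, j) start positions (0-based):
    forward - window at i equals window at j (the trivial i == j diagonal is skipped)
    reverse - reverse complement of the window at i equals the window at j
    """
    seq = seq.upper()
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        kmer = seq[i:i + window]
        forward.extend((i, j) for j in index[kmer] if j != i)
        reverse.extend((i, j) for j in index.get(reverse_complement(kmer), []))
    return forward, reverse


# Toy input (not from the report): an 8 bp unit repeated three times.
unit = "ACGGTCAT"
toy = "GGGGG" + unit * 3 + "CCCCC"
forward, reverse = self_dot_plot(toy, window=5)
```

In such a plot a tandem repeat shows up as forward matches at offsets that are multiples of the repeat unit, which is the kind of signal the note above describes for the downstream section.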

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Plot: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(25.55% 511) | C(23.4% 468) | T(27.0% 540) | G(24.05% 481)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Plot: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(27.2% 544) | C(23.3% 466) | T(26.5% 530) | G(23.0% 460)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
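The base counts in the two summaries can be turned into overall GC percentages, and the windowed profile can be sketched in the same way. The code below is an illustrative sketch: the counts are copied from the summaries above, while the function names, the step size, and the toy sequence are assumptions, since the report does not say how its plot was computed.

```python
# Base counts copied from the two summary lines of this report.
UP_COUNTS = {"A": 511, "C": 468, "T": 540, "G": 481}
DOWN_COUNTS = {"A": 544, "C": 466, "T": 530, "G": 460}


def gc_percent_from_counts(counts: dict) -> float:
    """Overall GC content (%) from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC content (%) of each full window along the sequence."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values


up_gc = gc_percent_from_counts(UP_COUNTS)
down_gc = gc_percent_from_counts(DOWN_COUNTS)
print(round(up_gc, 2), round(down_gc, 2))
```

From the listed counts, the upstream section comes to 47.45% GC and the downstream section to 46.3% GC overall, in line with the notes that no high GC-content region was found.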

BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr1   -       167378409  167380408    2000
YourSeq    194    672   892   2000     93.7%  chr1   -       167380870  167381074     205
YourSeq     71   1110  1594   2000     73.1%  chr2   -         4830101    4830422     322
YourSeq     70   1118  1585   2000     77.4%  chr1   +        56851978   56852406     429
YourSeq     55   1113  1215   2000     76.7%  chr15  -        78733425   78733527     103
YourSeq     47   1119  1536   2000     65.6%  chr13  +        45844657   45844975     319
YourSeq     43   1554  1862   2000     80.4%  chr2   +       131684266  131684571     306
YourSeq     40   1110  1156   2000     93.7%  chr16  -        93998886   93998933      48
YourSeq     40   1115  1174   2000     83.4%  chr1   -       153746540  153746599      60
YourSeq     40   1078  1157   2000     91.7%  chr11  +        68618640   68619011     372
YourSeq     39   1114  1156   2000     95.4%  chr12  +        24927763   24927805      43
YourSeq     38   1551  1612   2000     93.1%  chr14  +       109995605  109995667      63
YourSeq     38   1319  1370   2000     78.6%  chr1   +       180625071  180625116      46
YourSeq     37   1110  1154   2000     91.2%  chr3   -       129567119  129567163      45
YourSeq     36   1729  1864   2000     92.9%  chr1   -        79148664   79148799     136
YourSeq     36   1110  1154   2000     91.2%  chr11  +        55817901   55817946      46
YourSeq     35   1077  1220   2000     88.9%  chr14  -        67119961   67120330     370
YourSeq     34   1335  1381   2000     94.9%  chr2   -       136067784  136067836      53
YourSeq     34   1115  1156   2000     90.5%  chr2   +       168840793  168840834      42
YourSeq     33   1340  1378   2000     94.9%  chr2   -       174313968  174314016      49

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr1   -       167370506  167372505    2000
YourSeq    177    630  1840   2000     86.7%  chr13  +        47008517   47439198  430682
YourSeq    115    630   811   2000     81.9%  chr3   -        97979892   97980071     180
YourSeq    115    630   823   2000     79.9%  chr16  -        21262291   21262476     186
YourSeq    113    642   825   2000     83.1%  chr10  +        76869505   76869691     187
YourSeq    110    630   789   2000     83.6%  chr2   +       148024837  148024994     158
YourSeq    107    631   781   2000     85.5%  chr5   -        77121389   77121539     151
YourSeq    102    633   804   2000     80.0%  chr2   -        18310138   18310307     170
YourSeq    101    630   781   2000     79.9%  chr15  -        58149756   58149899     144
YourSeq    100    632   781   2000     83.4%  chr15  -        74648530   74648679     150
YourSeq     99   1672  1853   2000     86.7%  chr13  -        95915683   95915881     199
YourSeq     99    630   781   2000     87.8%  chr2   +        92149479   92149631     153
YourSeq     97    630   772   2000     83.5%  chr7   -        27226820   27226961     142
YourSeq     97    607   776   2000     84.1%  chr11  +        70853966   70854157     192
YourSeq     93    631   780   2000     84.0%  chr8   -        75083335   75083483     149
YourSeq     93    630   780   2000     80.8%  chr10  +       119443049  119443199     151
YourSeq     92   1671  1841   2000     84.9%  chr3   -       101885370  101885542     173
YourSeq     92    656   789   2000     80.8%  chr3   -        32008102   32008231     130
YourSeq     92   1673  1895   2000     83.1%  chr17  -        46692081   46692285     205
YourSeq     92   1671  1875   2000     89.7%  chr1   +       130830947  130831153     207

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
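The top hit in each BLAT table places the corresponding 2000 bp flank on chr1, so the interval lying between the two flanks can be derived from those coordinates. The sketch below assumes the coordinates are 1-based and inclusive and that both hits are on the same strand of chr1; the helper name is hypothetical.

```python
# Top (100.0% identity) hits copied from the two BLAT tables above.
# Assumption: coordinates are 1-based and inclusive.
UP_FLANK = ("chr1", 167378409, 167380408)    # 2000 bp upstream of the start codon
DOWN_FLANK = ("chr1", 167370506, 167372505)  # 2000 bp downstream of the stop codon


def gap_between(a: tuple, b: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    if a[0] != b[0]:
        raise ValueError("intervals are on different chromosomes")
    lower, upper = sorted([a, b], key=lambda interval: interval[1])
    gap = upper[1] - lower[2] - 1
    if gap < 0:
        raise ValueError("intervals overlap")
    return gap


between_flanks = gap_between(UP_FLANK, DOWN_FLANK)
print(between_flanks)
```

Under these assumptions the two flanks are separated by 5903 bp, which is close to the ~5901 bp effective KO region quoted in the strategy summary. The report does not state how that figure was derived, so the small difference is left unexplained here.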

Gene and protein information:

Mgst3 microsomal glutathione S-transferase 3 [ Mus musculus (house mouse) ]

Gene ID: 66447, updated on 12-Aug-2019

Gene summary

Official Symbol: Mgst3 provided by MGI
Official Full Name: microsomal glutathione S-transferase 3 provided by MGI
Primary source: MGI:MGI:1913697
See related: Ensembl:ENSMUSG00000026688
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: GST-III; AA516734; 2010012L10Rik; 2010306B17Rik; 2700004G04Rik
Expression: Broad expression in stomach adult (RPKM 1980.6), duodenum adult (RPKM 1260.0) and 18 other tissues
Orthologs: human, all

Genomic context

Location: 1; 1 H2.3
Exon count: 6

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (167371966..167393841, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (169302515..169323928, complement)

[Figure: genomic context, Chromosome 1 - NC_000067.6]

Transcript information: This gene has 1 transcript

Gene: Mgst3 ENSMUSG00000026688

Description: microsomal glutathione S-transferase 3 [Source:MGI Symbol;Acc:MGI:1913697]
Gene Synonyms: 2010012L10Rik, 2010306B17Rik, 2700004G04Rik, GST-III
Location: 167,371,966-167,393,841 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 1 transcript (splice variant), 200 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Mgst3-201  ENSMUST00000028005.2  1097  153aa    ENSMUSP00000028005.2  Protein coding  CCDS15457  Q9CPU4   TSL:1 GENCODE basic APPRIS P1

[Figure: Ensembl region view, 41.88 kb, 167.37Mb to 167.40Mb. Forward strand: Aldh9a1-201 (protein coding), Aldh9a1-202 (retained intron). Contig: AC113970.8. Reverse strand: Mgst3-201 (protein coding). Tracks: Genes (Comprehensive set...), Regulatory Build. Regulation legend: CTCF, Enhancer, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana), Non-Protein Coding (processed transcript).]

Transcript: ENSMUST00000028005

[Figure: Mgst3-201 (protein coding), reverse strand, 21.88 kb, with protein feature tracks for ENSMUSP00000028... on a scale bar from 0 to 153.]

Tracks shown:
Transmembrane heli...
Superfamily: Membrane associated eicosanoid/glutathione metabolism-like domain superfamily
Pfam: Membrane-associated, eicosanoid/glutathione metabolism (MAPEG) protein
PANTHER: PTHR10250; PTHR10250:SF21
Gene3D: Membrane associated eicosanoid/glutathione metabolism-like domain superfamily
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: missense variant, synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
