https://www.alphaknockout.com

Mouse H6pd Knockout Project (CRISPR/Cas9)

Objective: To create an H6pd knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The H6pd gene (NCBI Reference Sequence: NM_173371; Ensembl: ENSMUSG00000028980) is located on mouse chromosome 4. Five exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 5 (Transcript: ENSMUST00000030830). Exons 2 to 3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele show enlarged adrenal glands, reduced plasma corticosterone levels and altered 11 beta-hydroxysteroid dehydrogenase type 1 enzyme activity. Treatment with 11-dehydrocorticosterone fails to inhibit glucose-stimulated insulin secretion in pancreatic islets.

Exon 2 starts at about 0.25% of the coding region. Exons 2 to 3 cover 31.58% of the coding region. The size of the effective KO region is ~1913 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons labeled 1, 2, 3 and 5 and two gRNA regions marked. Legend: exon of mouse H6pd; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
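The self-alignment described in the note can be illustrated with a minimal sketch. It assumes exact matching of 15 bp windows; the report does not state how its dot plot tool scores matches, and the function name `self_dot_plot` is hypothetical. Each returned dot marks two positions whose windows are identical on the forward strand ("+") or as reverse complements ("-"). A tandem repeat would show up as a run of "+" dots parallel to the main diagonal.

```python
def self_dot_plot(seq, window=15):
    """Return (i, j, strand) for each pair of positions whose windows match exactly.

    Assumption: exact-match windows; real dot plot tools may allow mismatches.
    """
    seq = seq.upper()
    complement = str.maketrans("ACGT", "TGCA")
    n_windows = len(seq) - window + 1

    # Index every window by its sequence so matches are found by lookup.
    positions = {}
    for i in range(n_windows):
        positions.setdefault(seq[i:i + window], []).append(i)

    dots = []
    for i in range(n_windows):
        word = seq[i:i + window]
        # Forward matches; j > i skips the main diagonal and mirrored copies.
        for j in positions[word]:
            if j > i:
                dots.append((i, j, "+"))
        # Reverse-complement matches; j >= i reports each pair once.
        rc_word = word.translate(complement)[::-1]
        for j in positions.get(rc_word, []):
            if j >= i:
                dots.append((i, j, "-"))
    return dots


# Toy input (not the H6pd sequence): an 8 bp unit repeated twice in tandem.
print(self_dot_plot("ACGGTCATACGGTCAT", window=8))
```

A region with no dots away from the main diagonal corresponds to the "no significant tandem repeat" outcome reported above.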

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution of Sequence 12.]

Summary: Full length (2000 bp) | A (20.9%, 418) | C (24.35%, 487) | T (29.55%, 591) | G (25.2%, 504)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
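As a worked check, the overall GC content follows directly from the base counts in the two summary lines of this report, and the windowed profile can be sketched with a 300 bp sliding window. The function names and the step size are assumptions for illustration; the report does not say how its plot was computed.

```python
def gc_percent_from_counts(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window along seq (step size is an assumption)."""
    seq = seq.upper()
    fractions = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


# Base counts copied from the two summary lines in this report.
UPSTREAM = {"A": 418, "C": 487, "T": 591, "G": 504}
DOWNSTREAM = {"A": 392, "C": 494, "T": 581, "G": 533}

print(round(gc_percent_from_counts(UPSTREAM), 2))    # 49.55
print(round(gc_percent_from_counts(DOWNSTREAM), 2))  # 51.35
```

Both flanks come out close to 50% GC overall, which is consistent with the notes; the per-window values would still need the actual sequences, which are not included in this report.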

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution of Sequence 12.]

Summary: Full length (2000 bp) | A (19.6%, 392) | C (24.7%, 494) | T (29.05%, 581) | G (26.65%, 533)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1   2000   2000    100.0%  chr4   -       149996382  149998381    2000
YourSeq    105    734   1007   2000     82.8%  chr15  -        81293059   81573773  280715
YourSeq    103    674    968   2000     86.7%  chr14  -        57859986   57860280     295
YourSeq    103    734   1048   2000     86.9%  chr11  -        84419029   84419511     483
YourSeq    101    733   1043   2000     84.0%  chr4   +       139358191  139358678     488
YourSeq     95    734   1038   2000     78.8%  chr2   -       142888743  142889023     281
YourSeq     87    734   1130   2000     80.2%  chr5   -        23453255   23453632     378
YourSeq     87    734   1056   2000     89.9%  chr1   +       136388020  136388514     495
YourSeq     85    734   1043   2000     87.7%  chr4   +       118447653  118447962     310
YourSeq     85    678    830   2000     85.6%  chr3   +        10348611   10348761     151
YourSeq     79    734    967   2000     89.9%  chr17  -        35034376   35034697     322
YourSeq     79    734    864   2000     86.8%  chr11  -        85148971   85149101     131
YourSeq     79    678    830   2000     78.8%  chr17  +        17648673   17648809     137
YourSeq     77    734    842   2000     83.2%  chr13  -        55489539   55489643     105
YourSeq     76    743   1039   2000     86.6%  chr7   +        99301342   99301640     299
YourSeq     76    734    838   2000     89.7%  chr12  +        40297515   40298024     510
YourSeq     75    734   1010   2000     88.7%  chr16  -         4531650    4531924     275
YourSeq     75    734    830   2000     94.2%  chr15  -       103224618  103224735     118
YourSeq     75    734    842   2000     82.2%  chr2   +       178767129  178767233     105
YourSeq     74    734    838   2000     82.5%  chr3   -        90148058   90148157     100

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
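To make the comparison behind this note concrete, the sketch below parses a few rows of the upstream BLAT table and reports the strongest hit other than the query's own locus. The `BlatHit` type and `parse_hit` helper are hypothetical names, and the report does not define a threshold for "significant", so the code only reports the score ratio rather than applying a cutoff.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated BLAT result row."""
    fields = line.split()
    # Tolerate the "browser details" link text that precedes rows in raw output.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    return BlatHit(
        fields[0], int(fields[1]), int(fields[2]), int(fields[3]), int(fields[4]),
        float(fields[5].rstrip("%")), fields[6], fields[7],
        int(fields[8]), int(fields[9]), int(fields[10]),
    )


# First five rows of the upstream table, copied from this report.
ROWS = """
YourSeq 2000 1 2000 2000 100.0% chr4 - 149996382 149998381 2000
YourSeq 105 734 1007 2000 82.8% chr15 - 81293059 81573773 280715
YourSeq 103 674 968 2000 86.7% chr14 - 57859986 57860280 295
YourSeq 103 734 1048 2000 86.9% chr11 - 84419029 84419511 483
YourSeq 101 733 1043 2000 84.0% chr4 + 139358191 139358678 488
"""

hits = [parse_hit(row) for row in ROWS.strip().splitlines()]
self_hit = max(hits, key=lambda h: h.score)            # the query's own locus
off_target = [h for h in hits if h is not self_hit]
best_off_target = max(off_target, key=lambda h: h.score)

ratio = best_off_target.score / self_hit.score
print(best_off_target.chrom, best_off_target.score, f"{ratio:.2%}")
```

For these rows the best secondary hit scores 105 against 2000 for the self hit, which matches the note's reading that no other locus resembles the full flank.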

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1   2000   2000    100.0%  chr4   -       149992469  149994468    2000
YourSeq     87   1119   1267   2000     91.6%  chr11  -        77229863   77652091  422229
YourSeq     72   1145   1270   2000     82.0%  chr12  -        76179281   76179565     285
YourSeq     62   1115   1260   2000     86.1%  chr2   -        70260957   70261109     153
YourSeq     62   1079   1260   2000     85.4%  chr11  -        69950565   69950804     240
YourSeq     56   1118   1255   2000     86.2%  chr11  +       119333063  119333199     137
YourSeq     55   1117   1274   2000     80.3%  chr10  -       127606611  127606768     158
YourSeq     53   1121   1239   2000     92.1%  chr6   +       108080594  108080713     120
YourSeq     52    320    410   2000     95.0%  chr11  -        87813187   87813373     187
YourSeq     50   1199   1260   2000     90.4%  chr2   -       144320509  144320570      62
YourSeq     49   1216   1325   2000     77.7%  chr17  -        87032651   87032753     103
YourSeq     49    936   1157   2000     96.3%  chr12  +        49289010   49289421     412
YourSeq     48   1127   1323   2000     74.1%  chr10  -       115841285  115841441     157
YourSeq     48   1221   1323   2000     90.0%  chr13  +         9015235    9015344     110
YourSeq     47   1201   1259   2000     92.8%  chr11  -       120192016  120192075      60
YourSeq     47   1139   1255   2000     73.1%  chr11  -        77910768   77910851      84
YourSeq     47   1188   1268   2000     81.4%  chr11  +         6438201    6438282      82
YourSeq     47   1226   1323   2000     91.3%  chr1   +        74315708   74315811     104
YourSeq     46   1201   1270   2000     79.8%  chr11  +        55025345   55025413      69
YourSeq     46   1221   1323   2000     87.1%  chr1   +        57051890   57052002     113

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
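The self hits of the two BLAT searches also give a way to cross-check the stated KO region size. The sketch below assumes that the two 2000 bp flanks directly abut the knockout region and that the reported chr4 coordinates are 1-based and inclusive (each self hit spans end - start + 1 = 2000). The helper name `gap_between` is hypothetical.

```python
def gap_between(flank_a, flank_b):
    """Inclusive (start, end, size) of the interval strictly between two flanks.

    Each flank is a (start, end) pair in 1-based inclusive coordinates.
    Assumption: the flanks do not overlap.
    """
    (_, left_end), (right_start, _) = sorted([flank_a, flank_b])
    start, end = left_end + 1, right_start - 1
    return start, end, end - start + 1


# chr4 coordinates of the self hits in the two BLAT tables of this report.
UPSTREAM_FLANK = (149996382, 149998381)    # upstream of Exon 2
DOWNSTREAM_FLANK = (149992469, 149994468)  # downstream of Exon 3

ko_start, ko_end, ko_size = gap_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)
print(f"chr4:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under these assumptions the interval between the flanks is 1913 bp, in agreement with the "~1913 bp" effective KO region given in the strategy summary. The upstream flank has the higher coordinates because the gene lies on the reverse strand.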


Gene and information: H6pd hexose-6-phosphate dehydrogenase (glucose 1-dehydrogenase) [ Mus musculus (house mouse) ] Gene ID: 100198, updated on 24-Sep-2019

Gene summary

Official Symbol: H6pd (provided by MGI)
Official Full Name: hexose-6-phosphate dehydrogenase (glucose 1-dehydrogenase) (provided by MGI)
Primary source: MGI:MGI:2140356
See related: Ensembl:ENSMUSG00000028980
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gpd1; G6pd1; Gpd-1; AI785303
Expression: Broad expression in liver adult (RPKM 57.6), lung adult (RPKM 52.4) and 20 other tissues
Orthologs: human

Genomic context

Location: 4 E2; 4 80.65 cM
Exon count: 9

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (149979474..150009023, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (149353590..149383132, complement)

[Figure: genomic context on Chromosome 4 - NC_000070.6]
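As a small consistency check, the gene span can be computed from the NCBI locations listed above, reading `start..end` as a 1-based inclusive interval. The helper name `interval_length` is hypothetical, and the regular expression is only a sketch for location strings of this simple form.

```python
import re


def interval_length(location):
    """Length in bp of the first 'start..end' interval in a location string."""
    match = re.search(r"(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"no start..end interval in {location!r}")
    start, end = map(int, match.groups())
    return end - start + 1


current = interval_length("NC_000070.6 (149979474..150009023, complement)")
previous = interval_length("NC_000070.5 (149353590..149383132, complement)")
print(current, previous)  # 29550 29543
```

The 29,550 bp span on GRCm38.p6 agrees with the 29.55 kb reverse-strand track length shown for transcript ENSMUST00000030830 later in this report.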


Transcript information: This gene has 4 transcripts

Gene: H6pd ENSMUSG00000028980

Description: hexose-6-phosphate dehydrogenase (glucose 1-dehydrogenase) [Source:MGI Symbol;Acc:MGI:2140356]
Gene Synonyms: G6pd1, Gpd-1, Gpd1
Location: Chromosome 4: 149,979,475-150,009,023 reverse strand. GRCm38:CM000997.2
About this gene: This gene has 4 transcripts (splice variants), 191 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 15 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
H6pd-201  ENSMUST00000030830.3   4745  797aa       ENSMUSP00000030830.3  Protein coding  CCDS18967  A2A7A7   TSL:1, GENCODE basic
H6pd-202  ENSMUST00000084117.12  4571  789aa       ENSMUSP00000081134.6  Protein coding  CCDS71524  Q8CFX1   TSL:1, GENCODE basic, APPRIS P1
H6pd-204  ENSMUST00000153394.1   650   171aa       ENSMUSP00000115647.1  Protein coding  -          A2A7A8   CDS 3' incomplete, TSL:3
H6pd-203  ENSMUST00000152907.1   392   No protein  -                     lncRNA          -          -        TSL:2

[Figure: Ensembl genome browser view of a 49.55 kb region of chromosome 4 (149.97 Mb to 150.01 Mb; contig AL606914.9). Tracks: Genes (Comprehensive set...) showing H6pd-201, H6pd-202 and H6pd-204 (protein coding) and H6pd-203 (lncRNA), all on the reverse strand; Regulatory Build. Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene).]


Transcript: ENSMUST00000030830

[Figure: protein domain view of H6pd-201 (protein coding; reverse strand, 29.55 kb) for ENSMUSP00000030..., with a scale bar from 0 to 797. Tracks:
Low complexity (Seg)
Cleavage site (Sign...)
TIGRFAM: 6-phosphogluconolactonase, DevB-type
Superfamily: SSF55347; NagB/RpiA transferase-like; NAD(P)-binding domain superfamily
Prints: Glucose-6-phosphate dehydrogenase
Pfam: Glucose-6-phosphate dehydrogenase, NAD-binding; Glucosamine/galactosamine-6-phosphate isomerase; Glucose-6-phosphate dehydrogenase, C-terminal
PROSITE profiles: PS51257
PROSITE patterns: Glucose-6-phosphate dehydrogenase, active site
PANTHER: Glucose-6-phosphate dehydrogenase; PTHR23429:SF7
Gene3D: 3.30.360.10; 3.40.50.1360; 3.40.50.720
CDD: 6-phosphogluconolactonase, DevB-type
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
