https://www.alphaknockout.com

Mouse Idh3g Knockout Project (CRISPR/Cas9)

Objective: To create an Idh3g knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Idh3g gene (NCBI Reference Sequence: NM_008323; Ensembl: ENSMUSG00000002010) is located on mouse chromosome X. Thirteen exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 13 (Transcript: ENSMUST00000052761). Exons 2~5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 6.96% of the coding region. Exons 2~5 cover 22.48% of the coding region. The size of the effective KO region is ~1740 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: Wildtype allele, drawn 5' to 3', showing exons 1, 2, 3, 4, 5 and 13, with two gRNA regions marked. Legend: exon of mouse Idh3g; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: Dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
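The self-alignment described in the note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report does not name the tool behind its dot plots, so the function names below are hypothetical, and the sketch reports exact matches of fixed-length words, whereas the original tool may tolerate mismatches. A tandem or dispersed repeat shows up as off-diagonal forward matches; an inverted repeat shows up as reverse-complement matches.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact words of length `window`.

    Returns two lists of (i, j) word start positions:
      forward: the word at i also occurs at a later position j
      reverse: the word at i equals the reverse complement of the word at j
    Only pairs at or above the main diagonal are reported.
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1
    index = {}
    for i in range(n_words):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for i in range(n_words):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        rc = reverse_complement(word)
        reverse.extend((i, j) for j in index.get(rc, []) if j >= i)
    return forward, reverse


# Toy example with a short window; the report itself uses window=15.
fwd, rev = self_dot_plot("GATTACAGATTACA", window=4)
print(fwd)  # [(0, 7), (1, 8), (2, 9), (3, 10)]
print(rev)  # []
```

In the toy sequence the 7-base unit GATTACA occurs twice in a row, which produces the run of forward matches parallel to the diagonal. A region with no such runs would return two empty (or nearly empty) lists.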

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: Dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 464 bp section downstream of Exon 5 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(2000bp) | A(25.95% 519) | C(20.15% 403) | T(30.55% 611) | G(23.35% 467)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(464bp) | A(19.83% 92) | C(23.06% 107) | T(36.85% 171) | G(20.26% 94)

Note: The 464 bp section downstream of Exon 5 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
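The overall GC content implied by the two Summary lines can be recomputed from the reported base counts, and a windowed profile like the one plotted can be sketched with a simple sliding window. The function names below are hypothetical, and the fixed-width, exact-count window is an assumption: the report states only the window size (300 bp), not how the plot was generated.

```python
def gc_percent_from_counts(a, c, t, g):
    """Overall GC percentage from per-base counts."""
    return 100.0 * (c + g) / (a + c + t + g)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each full window; returns (start, gc_percent) pairs."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile


# Base counts copied from the two Summary lines of this report.
upstream = gc_percent_from_counts(a=519, c=403, t=611, g=467)   # 2000 bp
downstream = gc_percent_from_counts(a=92, c=107, t=171, g=94)   # 464 bp
print(round(upstream, 2), round(downstream, 2))  # 43.5 43.32
```

Both flanking sections come out at roughly 43% GC overall, which is consistent with the notes that no high-GC region was found. The overall figure alone cannot rule out a local GC-rich stretch; that is what the windowed profile is for.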


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chrX   -       73782717   73784716   2000
YourSeq  48     221    354   2000   66.3%     chr13  +       23990004   23990090   87
YourSeq  45     178    255   2000   84.3%     chr3   -       66372141   66372216   76
YourSeq  45     1682   1747  2000   92.5%     chr12  +       85494914   85494979   66
YourSeq  44     1645   1730  2000   96.0%     chr18  -       62117154   62117240   87
YourSeq  42     212    275   2000   82.9%     chrX   -       144100725  144100788  64
YourSeq  40     202    241   2000   100.0%    chr5   -       67368328   67368367   40
YourSeq  40     202    435   2000   57.8%     chr1   +       58954656   58954700   45
YourSeq  39     202    242   2000   97.6%     chr7   +       101890816  101890856  41
YourSeq  39     167    245   2000   88.9%     chr2   +       155384579  155384656  78
YourSeq  39     205    246   2000   97.7%     chr12  +       30107022   30107064   43
YourSeq  38     1643   1747  2000   95.3%     chr6   -       87388334   87388440   107
YourSeq  38     204    243   2000   97.5%     chr10  -       85031474   85031513   40
YourSeq  37     212    279   2000   83.7%     chr15  -       38217289   38217355   67
YourSeq  37     204    268   2000   86.7%     chr12  +       17784313   17784376   64
YourSeq  36     206    248   2000   93.1%     chr2   -       5192841    5192884    44
YourSeq  36     198    243   2000   95.0%     chr16  +       32347611   32347658   48
YourSeq  35     213    251   2000   94.9%     chr3   +       34708922   34708960   39
YourSeq  35     203    243   2000   92.7%     chr1   +       119108068  119108108  41
YourSeq  34     192    239   2000   82.7%     chr3   -       107060669  107060715  47

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the expected 100% match at its own locus on chrX, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  464    1      464   464    100.0%    chrX   -       73781213   73781676   464
YourSeq  25     406    435   464    77.0%     chr15  -       41908926   41908951   26
YourSeq  22     256    277   464    100.0%    chr8   -       60083091   60083112   22
YourSeq  21     290    310   464    100.0%    chr18  +       50830304   50830324   21
YourSeq  20     60     79    464    100.0%    chr1   -       98219491   98219510   20
YourSeq  20     52     71    464    100.0%    chr2   +       35358865   35358884   20
YourSeq  20     66     85    464    100.0%    chr19  +       25887784   25887803   20
YourSeq  20     178    197   464    100.0%    chr15  +       54744463   54744482   20
YourSeq  20     48     69    464    95.5%     chr1   +       40135470   40135491   22

Note: The 464 bp section downstream of Exon 5 is BLAT searched against the genome. Apart from the expected 100% match at its own locus on chrX, no significant similarity is found.
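One way to read the BLAT tables above is to compare the self-hit with the best off-target hit. The sketch below parses a few rows copied from the upstream table and reports that gap. It is an illustration only: the `BlatHit` type and `parse_hit` helper are hypothetical names, the field order is taken from the table header in this report, and the code makes no claim about what score the report's authors treated as "significant".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one whitespace-separated result row into a BlatHit."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:   # link text left over from the web page
        fields = fields[2:]
    fields = fields[1:]                        # drop the query name (YourSeq)
    return BlatHit(
        int(fields[0]), int(fields[1]), int(fields[2]), int(fields[3]),
        float(fields[4].rstrip("%")), fields[5], fields[6],
        int(fields[7]), int(fields[8]), int(fields[9]),
    )


# First three rows of the upstream table, copied from this report.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chrX - 73782717 73784716 2000",
    "YourSeq 48 221 354 2000 66.3% chr13 + 23990004 23990090 87",
    "YourSeq 45 178 255 2000 84.3% chr3 - 66372141 66372216 76",
]
hits = [parse_hit(r) for r in rows]

# SPAN is the inclusive length of the genomic interval.
assert all(h.span == h.t_end - h.t_start + 1 for h in hits)

self_hit, off_target = hits[0], hits[1:]
best_off_target = max(h.score for h in off_target)
print(self_hit.chrom, self_hit.score, best_off_target)  # chrX 2000 48
```

For the upstream section the self-hit scores 2000 while the best off-target hit scores 48, which is the pattern the notes summarize as no significant similarity.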


Gene and information: Idh3g isocitrate dehydrogenase 3 (NAD+), gamma [Mus musculus] Gene ID: 15929, updated on 10-Oct-2019

Gene summary

Official Symbol: Idh3g (provided by MGI)
Official Full Name: isocitrate dehydrogenase 3 (NAD+), gamma (provided by MGI)
Primary source: MGI:MGI:1099463
See related: Ensembl:ENSMUSG00000002010
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Expression: Ubiquitous expression in heart adult (RPKM 350.1), kidney adult (RPKM 236.5) and 27 other tissues
Orthologs: human, all

Genomic context

Location: X A7.3; X 37.41 cM

Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  X    NC_000086.7 (73778963..73786897, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    X    NC_000086.6 (71024302..71032236, complement)

Chromosome X - NC_000086.7


Transcript information: This gene has 9 transcripts

Gene: Idh3g ENSMUSG00000002010

Description: isocitrate dehydrogenase 3 (NAD+), gamma [Source:MGI Symbol;Acc:MGI:1099463]
Location: Chromosome X: 73,778,963-73,786,897 reverse strand. GRCm38:CM001013.2
About this gene: This gene has 9 transcripts (splice variants), 208 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt        Flags
Idh3g-201  ENSMUST00000052761.8  1331  393aa       ENSMUSP00000056502.8  Protein coding   CCDS30212  P70404 Q3TGZ3  TSL:1 GENCODE basic APPRIS P1
Idh3g-209  ENSMUST00000156299.7  2080  No protein  -                     Retained intron  -          -              TSL:2
Idh3g-202  ENSMUST00000129070.7  859   No protein  -                     Retained intron  -          -              TSL:2
Idh3g-206  ENSMUST00000142707.7  702   No protein  -                     Retained intron  -          -              TSL:3
Idh3g-204  ENSMUST00000130119.1  688   No protein  -                     Retained intron  -          -              TSL:2
Idh3g-208  ENSMUST00000148419.7  680   No protein  -                     Retained intron  -          -              TSL:2
Idh3g-203  ENSMUST00000129367.7  641   No protein  -                     Retained intron  -          -              TSL:2
Idh3g-205  ENSMUST00000140944.7  829   No protein  -                     lncRNA           -          -              TSL:5
Idh3g-207  ENSMUST00000145341.1  395   No protein  -                     lncRNA           -          -              TSL:3


[Figure: Ensembl region view of chromosome X, 73.77 Mb to 73.79 Mb (27.93 kb). Forward strand: Plxnb3 (201 protein coding; 202, 203, 204, 205 retained intron), Srpk3 (201 protein coding; 202, 203, 204 retained intron), Ssr4 (201, 208 protein coding; 202 to 207 retained intron), Gm14817-201 (lncRNA). Contig: AL672094.11. Reverse strand: Idh3g (201 protein coding; 202, 203, 204, 206, 208, 209 retained intron; 205, 207 lncRNA), Pdzd4 (201, 202 protein coding). Regulatory Build track legend: CTCF; Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000052761

[Figure: Idh3g-201 (protein coding), reverse strand, 7.93 kb, with protein annotation tracks for ENSMUSP00000056... Tracks: Low complexity (Seg); TIGRFAM: Isocitrate dehydrogenase NAD-dependent; Superfamily: SSF53659; SMART: Isopropylmalate dehydrogenase-like domain; Pfam: Isopropylmalate dehydrogenase-like domain; PROSITE patterns: Isocitrate/isopropylmalate dehydrogenase, conserved site; PANTHER: PTHR11835:SF58, PTHR11835; Gene3D: 3.40.718.10; All sequence SNPs/i...: Sequence variants (dbSNP and all other sources). Variant legend: synonymous variant. Scale bar: 0 to 393.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
