Mouse Gstm4 Knockout Project (CRISPR/Cas9)

https://www.alphaknockout.com

Objective: To create a Gstm4 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Gstm4 gene (NCBI Reference Sequence: NM_026764; Ensembl: ENSMUSG00000027890) is located on mouse chromosome 3. Eight exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 8 (Transcript: ENSMUST00000029489).
Exons 1~8 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 1 starts from about 0.15% of the coding region.
Exons 1~8 cover 100.0% of the coding region.
The size of the effective KO region: ~3432 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 8 of mouse Gstm4 and two gRNA regions. Legend: exon of mouse Gstm4; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp
[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp
[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
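The self-alignment described in the dot plot notes can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it assumes exact matching of 15 bp windows, and the function names (`revcomp`, `self_dot_plot`) are hypothetical. Off-diagonal forward matches would indicate direct repeats, and reverse-complement matches would indicate inverted repeats.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (assumes an A/C/G/T alphabet).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact matches of `window` bp.

    Returns two lists of (i, j) window start positions:
      forward: i < j and seq[i:i+window] == seq[j:j+window]
      reverse: seq[i:i+window] == reverse complement of seq[j:j+window]
    The trivial main diagonal (i == j) is excluded from `forward`.
    """
    seq = seq.upper()
    last_start = len(seq) - window + 1

    # Index every window by its sequence so lookups are constant time.
    positions = defaultdict(list)
    for i in range(last_start):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for j in range(last_start):
        word = seq[j:j + window]
        forward.extend((i, j) for i in positions[word] if i < j)
        reverse.extend((i, j) for i in positions.get(revcomp(word), []))
    return forward, reverse


if __name__ == "__main__":
    # Toy input: a 15 bp unit repeated in tandem, followed by a short tail.
    toy = "ACGTTGCAAGGCTTA" * 2 + "GGG"
    fwd, rev = self_dot_plot(toy, window=15)
    print("forward matches:", fwd)
    print("reverse-complement matches:", rev)
```

An empty or near-empty `forward` list for a 2000 bp flank would be consistent with the report's statement that no significant tandem repeat is present, although real dot plot tools usually tolerate mismatches, which this sketch does not.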
Overview of the GC Content Distribution (up)

Window size: 300 bp
[Figure: GC content distribution along the sequence.]

Summary: Full Length (2000 bp) | A (31.05%, 621) | C (22.05%, 441) | T (24.2%, 484) | G (22.7%, 454)

Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp
[Figure: GC content distribution along the sequence.]

Summary: Full Length (2000 bp) | A (24.45%, 489) | C (26.0%, 520) | T (25.95%, 519) | G (23.6%, 472)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY   SCORE QSTART  QEND QSIZE IDENTITY  CHROM STRAND    TSTART      TEND   SPAN
----------------------------------------------------------------------------------
YourSeq  2000      1  2000  2000   100.0%  chr3    -    108044673 108046672   2000
YourSeq   151    475   927  2000    86.5%  chr11   +     23507509  23671916 164408
YourSeq    98    197   672  2000    92.4%  chr6    -     87596290  87597114    825
YourSeq    89    494   685  2000    91.6%  chr10   +     78346855  78347373    519
YourSeq    80    462   643  2000    86.0%  chr3    +     86095391  86095716    326
YourSeq    80    225   560  2000    73.4%  chr2    +     76642654  76642827    174
YourSeq    78    469   689  2000    80.3%  chr4    -     40878962  40879137    176
YourSeq    77    771   929  2000    88.2%  chr17   -     45443253  45443411    159
YourSeq    75    468   647  2000    90.4%  chr4    +     32946200  32946582    383
YourSeq    73    471   560  2000    92.0%  chr13   -     53432856  53432946     91
YourSeq    73    471   578  2000    86.4%  chr12   +     17536632  17536738    107
YourSeq    70    469   560  2000    87.4%  chr17   -     29177446  29177536     91
YourSeq    70    468   569  2000    84.4%  chr6    +    124483614 124483715    102
YourSeq    70    471   560  2000    88.3%  chr16   +     38678472  38678560     89
YourSeq    69    462   558  2000    88.1%  chr16   -     23102034  23102129     96
YourSeq    68    475   578  2000    86.4%  chr11   -     87096187  87096289    103
YourSeq    66    730   864  2000    91.3%  chr6    -     47582957  47583096    140
YourSeq    66    831   927  2000    87.5%  chr12   -     31776816  31776913     98
YourSeq    66    469   652  2000    87.5%  chr6    +     18437065  18437246    182
YourSeq    66    469   560  2000    85.1%  chr13   +    100658763 100658853     91

QSTART and QEND are positions in the 2000 bp query; TSTART and TEND are positions on the matched chromosome.

Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY   SCORE QSTART  QEND QSIZE IDENTITY  CHROM STRAND    TSTART      TEND   SPAN
----------------------------------------------------------------------------------
YourSeq  2000      1  2000  2000   100.0%  chr3    -    108039239 108041238   2000
YourSeq   361     77   565  2000    89.4%  chr1    +    141255913 141256398    486
YourSeq   138   1492  1986  2000    74.9%  chr13   +    111378139 111378631    493
YourSeq   136   1285  1910  2000    87.1%  chr6    +    119169457 119170146    690
YourSeq   134   1671  2000  2000    82.4%  chrX    -    152118942 152119248    307
YourSeq   131   1543  1998  2000    78.3%  chr9    -     79425540  79426044    505
YourSeq   131   1436  1913  2000    88.8%  chr2    -    126438349 126438863    515
YourSeq   110   1197  1893  2000    82.4%  chr6    +     32390292  32391057    766
YourSeq   102   1676  1918  2000    88.8%  chr11   -      3965916   3966162    247
YourSeq   101   1653  1993  2000    83.1%  chr1    -     73646273  73646807    535
YourSeq   101   1681  1986  2000    89.8%  chr11   +     25956847  25957152    306
YourSeq    97   1287  1982  2000    73.5%  chr15   +      5702781   5703544    764
YourSeq    93   1201  1749  2000    91.1%  chr12   +     56220253  56220855    603
YourSeq    89   1675  2000  2000    88.6%  chr1    +    172036519 172036857    339
YourSeq    87   1197  1986  2000    86.6%  chr10   +    104336592 104337500    909
YourSeq    85   1200  1843  2000    84.5%  chr9    +    107041601 107042330    730
YourSeq    84   1761  2000  2000    88.2%  chr5    -     88472682  88472924    243
YourSeq    83   1681  1986  2000    91.1%  chr10   -     75069642  75069958    317
YourSeq    83   1530  1986  2000    91.9%  chr2    +    129975164 129975647    484
YourSeq    82   1762  1932  2000    90.2%  chr6    -    100957321 100957804    484

Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:

Gstm4 glutathione S-transferase, mu 4 [ Mus musculus (house mouse) ]
Gene ID: 14865, updated on 10-Oct-2019

Gene summary
Official Symbol: Gstm4 (provided by MGI)
Official Full Name: glutathione S-transferase, mu 4 (provided by MGI)
Primary source: MGI:MGI:95862
See related: Ensembl:ENSMUSG00000027890
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gstb4; Gstb-4; GSTM7-7; 1110004G14Rik
Expression: Broad expression in bladder adult (RPKM 39.1), liver adult (RPKM 27.9) and 27 other tissues

Genomic context
Location: 3; 3 F2.3
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (108030571..108045179, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (107843326..107847777, complement)

[Figure: genomic context, Chromosome 3 - NC_000069.6.]

Transcript information:

This gene has 3 transcripts

Gene: Gstm4 ENSMUSG00000027890
Description: glutathione S-transferase, mu 4 [Source:MGI Symbol;Acc:MGI:95862]
Gene Synonyms: 1110004G14Rik, Gstb-4, Gstb4
Location: Chromosome 3: 108,040,408-108,044,894 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 3 transcripts (splice variants), 215 orthologues, 16 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts
Name       Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Gstm4-201  ENSMUST00000029489.14  1707  218aa    ENSMUSP00000029489.8  Protein coding  CCDS17748  Q8R5I6   TSL:1, GENCODE basic, APPRIS P1
Gstm4-203  ENSMUST00000178808.7   1462  184aa    ENSMUSP00000136643.1  Protein coding  CCDS51045  A2AE91   TSL:1, GENCODE basic
Gstm4-202  ENSMUST00000106670.1   777   184aa    ENSMUSP00000102281.1  Protein coding  CCDS51045  A2AE91   TSL:3, GENCODE basic

[Figure: genome browser view of a 24.49 kb region (108.035 Mb to 108.050 Mb). Forward strand: Gm25592-201 (snoRNA). Contig: AL671877.15. Reverse strand: Gm12499-201 (unprocessed pseudogene), Gstm4-201, Gstm4-203 and Gstm4-202 (protein coding). Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (pseudogene, RNA gene).]

Transcript: ENSMUST00000029489

[Figure: Gstm4-201 (protein coding), reverse strand, 4.49 kb, with protein domain and variant tracks for ENSMUSP00000029... Scale bar: 0 to 218.]

Superfamily: Thioredoxin-like superfamily; Glutathione S-transferase, C-terminal domain superfamily
SFLD: SFLDG00363; SFLDG01205
Prints: Glutathione S-transferase, Mu class
Pfam: Glutathione S-transferase, N-terminal; Glutathione S-transferase, C-terminal
PROSITE profiles: Glutathione S-transferase, N-terminal; Glutathione S-transferase, C-terminal-like
PANTHER: PTHR11571; PTHR11571:SF229
Gene3D: 1.20.1050.10; 3.40.30.10
CDD: cd03075; cd03209
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant Legend: missense variant; splice region variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
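As a cross-check on the GC content summaries reported for the two 2000 bp flanks, the overall GC fraction can be recomputed from the base counts given in the report. The sketch below also shows one way a 300 bp sliding-window profile could be computed. The function names and the `step` parameter are assumptions made for illustration; the report states only the window size.

```python
def gc_percent(counts):
    """GC content (%) from a dict of base counts with keys A, C, G, T."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300, step=1):
    """GC content (%) of each `window`-bp window, advancing by `step` bp.

    `step` is an assumed parameter; the report gives only the window size.
    """
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the report's summary lines.
upstream = {"A": 621, "C": 441, "T": 484, "G": 454}
downstream = {"A": 489, "C": 520, "T": 519, "G": 472}

print(f"upstream GC:   {gc_percent(upstream):.2f}%")    # 44.75%
print(f"downstream GC: {gc_percent(downstream):.2f}%")  # 49.60%
```

Both flanks come out below 50% GC overall, which is in line with the report's notes that no significant high GC-content region is found. A windowed profile would need the actual flank sequences, which the report does not include.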

Systems and Chemical Biology Approaches to Study Cell Function and Response to Toxins

GSTM4 Is a Microsatellite-Containing EWS/FLI Target Involved in Ewing's Sarcoma Oncogenesis and Therapeutic

Whole Exome Sequencing in Families at High Risk for Hodgkin Lymphoma: Identification of a Predisposing Mutation in the KDR Gene

GSTM4 (1-218, His-Tag) Human Protein – AR09587PU-L | Origene

Snps in Genes Coding for ROS Metabolism and Signalling in Association with Docetaxel Clearance

Protein Sequence Comparison and Protein Evolution Tutorial

HHS Public Access Author Manuscript

SUPPORTING INFORMATION for Regulation of Gene Expression By

Transcriptome Profiling Reveals the Complexity of Pirfenidone Effects in IPF

A Personalized Genomics Approach of the Prostate Cancer

SUPPLEMENTARY DATA Supplementary Table 1. Characteristics of the Organ Donors and Human Islet Preparations Used for RNA-Seq

Acquisition of Inverted GSTM Exons by an Intron of Primate GSTM5 Gene