https://www.alphaknockout.com

Mouse Gpatch8 Knockout Project (CRISPR/Cas9)

Objective: To create a Gpatch8 knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Gpatch8 gene (NCBI Reference Sequence: NM_001159492; Ensembl: ENSMUSG00000034621) is located on mouse chromosome 11. Eight exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000143842). Exons 4~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 4 starts at about 4.3% of the coding region. Exons 4~6 cover 6.62% of the coding region. The size of the effective KO region is ~7409 bp. The KO region does not contain any other known gene.
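The percentages in the note can be turned into approximate base-pair figures. The sketch below is illustrative only: it assumes the coding sequence length can be derived from the 1505 aa protein listed for Gpatch8-206 plus one stop codon, which the report does not state explicitly, and the function name is hypothetical.

```python
# Assumption: CDS length = 1505 codons (protein length of Gpatch8-206) + 1 stop codon.
# The report does not give the CDS length directly.
PROTEIN_AA = 1505
cds_bp = PROTEIN_AA * 3 + 3  # 4518 bp including the stop codon


def percent_to_bp(percent: float, total_bp: int) -> int:
    """Convert a percentage of the coding region to the nearest whole base pair."""
    return round(total_bp * percent / 100)


# Approximate position in the CDS at which exon 4 begins (4.3%).
exon4_offset_bp = percent_to_bp(4.3, cds_bp)

# Approximate number of coding bases carried by exons 4~6 (6.62%).
deleted_coding_bp = percent_to_bp(6.62, cds_bp)

# A deletion whose coding length is not a multiple of 3 shifts the reading frame.
causes_frameshift = deleted_coding_bp % 3 != 0

print(f"CDS length (assumed): {cds_bp} bp")
print(f"Exon 4 starts near CDS position: {exon4_offset_bp}")
print(f"Coding bases in exons 4~6: {deleted_coding_bp}")
print(f"Frameshift expected: {causes_frameshift}")
```

Under these assumptions the deleted coding length is not a multiple of three, so the deletion would be expected to disrupt the downstream reading frame. The exact exon boundaries should be taken from the Ensembl transcript record rather than from these rounded percentages.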


Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', with exons 1, 4, 5, 6 and 8 labeled and a gRNA region on each side of exons 4~6. Legend: exon of mouse Gpatch8; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the sequence. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
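The self-alignment behind a dot plot can be sketched as an exact word-match search. This is a minimal illustration, not the tool used for this report (which is not named): it assumes exact matching of 15 bp words on the forward strand only, and the function name is hypothetical.

```python
def dot_plot_matches(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs with i < j where the window-length words
    starting at positions i and j of seq are identical.

    The main diagonal (i == j) is omitted because every word matches itself.
    Off-diagonal points that line up into runs indicate repeated segments.
    """
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy example: a 15 bp unit repeated twice in tandem gives one off-diagonal point.
unit = "ACGTTGCAAGGCTTA"
print(dot_plot_matches(unit * 2, window=15))
```

An empty result for a 2000 bp flank would correspond to a dot plot with no off-diagonal signal at this window size. A fuller check would also compare the sequence against its reverse complement.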

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the sequence. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(29.85% 597) | C(18.15% 363) | T(34.1% 682) | G(17.9% 358)

Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
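The overall GC content follows directly from the base counts in the summary lines, and the windowed distribution can be computed with a sliding window. The sketch below uses the counts given in this report; the function names and the step size are illustrative assumptions, and the toy sequence is not taken from the Gpatch8 locus.

```python
def gc_percent_from_counts(a: int, c: int, t: int, g: int) -> float:
    """Overall GC content (%) from per-base counts."""
    return 100 * (g + c) / (a + c + t + g)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC content (%) of each window of the given size along seq."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts from the two summary lines in this report.
upstream_gc = gc_percent_from_counts(a=597, c=363, t=682, g=358)
downstream_gc = gc_percent_from_counts(a=611, c=385, t=613, g=391)
print(f"Upstream GC: {upstream_gc:.2f}%")
print(f"Downstream GC: {downstream_gc:.2f}%")

# Toy sequence to show the windowed calculation.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

Both flanks come out below 40% GC overall, which is consistent with the notes that no high GC-content region is found.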

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(30.55% 611) | C(19.25% 385) | T(30.65% 613) | G(19.55% 391)

Note: The 2000 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  -       102508212  102510211  2000
YourSeq  48     1623   1696  2000   88.2%     chr8   -       111265613  111265685  73
YourSeq  47     1623   1695  2000   92.8%     chr1   -       177367765  177367838  74
YourSeq  47     1623   1699  2000   80.6%     chr1   +       135023296  135023372  77
YourSeq  46     1620   1695  2000   80.3%     chr11  -       96970756   96970831   76
YourSeq  44     1626   1713  2000   96.0%     chr2   -       159045997  159046086  90
YourSeq  42     1623   1687  2000   82.9%     chr4   +       134408611  134408678  68
YourSeq  38     1641   1696  2000   84.0%     chr4   -       136201406  136201461  56
YourSeq  37     1670   1722  2000   73.4%     chr10  +       88374610   88374654   45
YourSeq  35     1661   1719  2000   92.7%     chr19  +       7071438    7071497    60
YourSeq  35     1623   1671  2000   85.8%     chr16  +       5686721    5686769    49
YourSeq  34     1669   1723  2000   75.0%     chr19  +       11253799   11253837   39
YourSeq  34     1658   1696  2000   94.8%     chr14  +       55051146   55051187   42
YourSeq  32     275    740   2000   44.5%     chr12  -       54564368   54564413   46
YourSeq  30     1659   1694  2000   91.7%     chr17  -       30041461   30041496   36
YourSeq  29     1671   1699  2000   100.0%    chr8   -       62015114   62015142   29
YourSeq  28     1661   1696  2000   88.9%     chr11  +       117401381  117401416  36
YourSeq  27     1671   1699  2000   96.6%     chr7   -       130714670  130714698  29
YourSeq  26     1661   1698  2000   79.5%     chr6   -       135168586  135168621  36
YourSeq  26     1669   1696  2000   96.5%     chr2   -       172465256  172465283  28

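Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. Apart from the expected full-length match at the source locus on chr11, no significant similarity is found (all other hits score 48 or lower).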
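A BLAT result row can be parsed and screened programmatically. The sketch below is illustrative: the three sample rows are copied from the upstream table, while the type name, function names and the coverage and identity thresholds are hypothetical choices, not criteria stated in this report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT row into a BlatHit."""
    fields = row.split()
    # Drop the "browser details" link labels if they survived extraction.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    fields = fields[1:]  # drop the query name (YourSeq)
    score, q_start, q_end, q_size = map(int, fields[:4])
    identity = float(fields[4].rstrip("%"))
    chrom, strand = fields[5], fields[6]
    t_start, t_end, span = map(int, fields[7:10])
    return BlatHit(score, q_start, q_end, q_size, identity,
                   chrom, strand, t_start, t_end, span)


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the aligned interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr11 - 102508212 102510211 2000",
    "YourSeq 48 1623 1696 2000 88.2% chr8 - 111265613 111265685 73",
    "browser details YourSeq 47 1623 1695 2000 92.8% chr1 - 177367765 177367838 74",
]
hits = [parse_blat_row(r) for r in rows]

# Hypothetical thresholds: at least half the query covered at >= 90% identity.
significant = [h for h in hits if query_coverage(h) >= 0.5 and h.identity >= 90.0]
for h in significant:
    print(h.chrom, h.strand, h.t_start, h.t_end, h.score)
```

With these sample rows only the full-length self-match on chr11 passes the filter; the short secondary hits each cover under 5% of the query.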

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  -       102498803  102500802  2000
YourSeq  71     546    697   2000   82.5%     chr12  -       108872341  108872516  176
YourSeq  67     548    696   2000   83.2%     chr13  +       3257324    3257481    158
YourSeq  61     558    692   2000   90.7%     chr9   -       62037426   62037572   147
YourSeq  59     557    686   2000   90.5%     chr5   -       101853589  101853724  136
YourSeq  59     555    687   2000   91.6%     chr7   +       35850850   35850984   135
YourSeq  55     546    687   2000   89.9%     chr14  -       101143160  101143337  178
YourSeq  55     559    719   2000   91.1%     chr1   -       121927452  121927633  182
YourSeq  54     623    702   2000   84.8%     chr1   -       16182064   16182142   79
YourSeq  54     548    691   2000   87.5%     chrX   +       56946092   56946261   170
YourSeq  54     575    702   2000   91.0%     chr14  +       100316289  100316442  154
YourSeq  53     550    676   2000   89.6%     chr3   +       83724746   83724873   128
YourSeq  52     602    687   2000   89.4%     chr10  +       77213953   77214054   102
YourSeq  51     556    680   2000   91.9%     chr10  +       20936509   20936634   126
YourSeq  50     603    702   2000   81.7%     chr1   +       187722850  187722948  99
YourSeq  49     663    790   2000   93.2%     chr14  -       14992789   14993051   263
YourSeq  47     558    645   2000   91.3%     chr2   +       130493126  130493213  88
YourSeq  46     618    702   2000   84.5%     chr6   -       147625561  147625643  83
YourSeq  43     623    702   2000   86.6%     chr8   -       22990925   22991003   79
YourSeq  43     551    633   2000   89.1%     chr11  -       65402117   65402204   88

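Note: The 2000 bp section downstream of Exon 6 is BLAT searched against the genome. Apart from the expected full-length match at the source locus on chr11, no significant similarity is found (all other hits score 71 or lower).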
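The full-length BLAT hits for the two flanks give their genomic coordinates, from which the size of the region between them can be checked. The sketch below assumes that the two 2000 bp flanks directly abut the knockout region, which the report implies but does not state; the function names are hypothetical.

```python
# Full-length (100.0% identity) BLAT hits on chr11, minus strand, as 1-based
# inclusive (start, end) coordinates taken from the two BLAT tables.
UPSTREAM_FLANK = (102508212, 102510211)    # 2000 bp upstream of exon 4
DOWNSTREAM_FLANK = (102498803, 102500802)  # 2000 bp downstream of exon 6


def interval_length(interval: tuple[int, int]) -> int:
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(lower: tuple[int, int], upper: tuple[int, int]) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


# The gene is on the reverse strand, so the upstream flank has the higher
# genomic coordinates and the downstream flank the lower ones.
ko_region_bp = gap_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)

print(f"Upstream flank length: {interval_length(UPSTREAM_FLANK)} bp")
print(f"Downstream flank length: {interval_length(DOWNSTREAM_FLANK)} bp")
print(f"Region between flanks: {ko_region_bp} bp")
```

The region between the flanks comes to 7409 bp, which agrees with the effective KO region size quoted in the strategy summary.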


Gene and information: Gpatch8, G patch domain containing 8 [Mus musculus (house mouse)]

Gene ID: 237943, updated on 12-Aug-2019

Gene summary

Official Symbol: Gpatch8 (provided by MGI)
Official Full Name: G patch domain containing 8 (provided by MGI)
Primary source: MGI:MGI:1918667
See related: Ensembl:ENSMUSG00000034621
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Fbm1; Gpatc8; AU018890; mKIAA0553; 5430405G24Rik
Expression: Ubiquitous expression in cerebellum adult (RPKM 5.8), CNS E14 (RPKM 5.4) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 11; 11 D-E1
Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (102475911..102556391, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (102337229..102417472, complement)

[Figure: genomic context, Chromosome 11 - NC_000077.6]


Transcript information: This gene has 6 transcripts

Gene: Gpatch8 ENSMUSG00000034621

Description: G patch domain containing 8 [Source:MGI Symbol;Acc:MGI:1918667]
Gene Synonyms: 5430405G24Rik, ENSMUSG00000075516, Fbm1, Gpatc8
Location: Chromosome 11: 102,475,915-102,556,392 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 6 transcripts (splice variants), 195 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Gpatch8-206  ENSMUST00000143842.1  6871  1505aa      ENSMUSP00000120649.1  Protein coding  CCDS48944  A2A6A1   TSL:5 GENCODE basic APPRIS P1
Gpatch8-201  ENSMUST00000069673.3  5597  No protein  -                     lncRNA          -          -        TSL:1
Gpatch8-202  ENSMUST00000125754.1  893   No protein  -                     lncRNA          -          -        TSL:1
Gpatch8-204  ENSMUST00000127018.1  432   No protein  -                     lncRNA          -          -        TSL:5
Gpatch8-205  ENSMUST00000131573.7  390   No protein  -                     lncRNA          -          -        TSL:2
Gpatch8-203  ENSMUST00000126804.7  387   No protein  -                     lncRNA          -          -        TSL:3

[Figure: Ensembl genome browser view of chromosome 11, 102.48Mb to 102.56Mb (100.48 kb), contig AL596258.15. Forward strand: Mdk-ps1-201 (processed pseudogene). Reverse strand: Itga2b-201 (protein coding); Itga2b-205, Itga2b-208 and Itga2b-209 (lncRNA); Gm11628-201 (processed pseudogene); Gpatch8-201, Gpatch8-202, Gpatch8-203, Gpatch8-204 and Gpatch8-205 (lncRNA); Gpatch8-206 (protein coding); Gm25337-201 (snRNA). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (pseudogene, RNA gene).]


Transcript: ENSMUST00000143842

[Figure: protein domain view of Gpatch8-206 (protein coding, reverse strand, 80.28 kb), ENSMUSP00000120..., scale bar 0 to 1505. Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: Zinc finger C2H2 superfamily; SMART: G-patch domain; Pfam: Zinc finger, double-stranded RNA binding, and G-patch domain; PROSITE profiles: G-patch domain and Zinc finger C2H2-type; PROSITE patterns: Zinc finger C2H2-type; PANTHER: PTHR17614:SF11 and PTHR17614; Sequence variants (dbSNP and all other sources). Variant legend: inframe insertion, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
