https://www.alphaknockout.com

Mouse Plekho1 Knockout Project (CRISPR/Cas9)

Objective: To create a Plekho1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Plekho1 gene (NCBI Reference Sequence: NM_023320; Ensembl: ENSMUSG00000015745) is located on mouse chromosome 3. Six exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000015889). Exons 1~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null allele exhibit an age-dependent increase in bone volume and increased osteoblast activity.

Exon 1 starts at about 0.08% of the coding region, and exons 1~6 cover 100.0% of the coding region. The effective KO region is ~6864 bp in size and does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: Wildtype allele drawn 5' to 3' with exons 1 to 6; a gRNA region is marked on each side of the knockout region. Legend: exon of mouse Plekho1; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot; axis label: Sequence 12; legend: Forward, Reverse Complement]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
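The self-alignment described in this note can be sketched in a few lines. This is a minimal illustration and not the tool that produced the plot: it assumes exact matching of 15 bp windows (real dot plot programs may tolerate mismatches), and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two sorted lists of (i, j) window start positions:
    forward matches (i < j, so the main diagonal is excluded) and
    reverse-complement matches.
    """
    seq = seq.upper()
    n = len(seq)

    # Index every window by its start position.
    index = {}
    for i in range(n - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    # Forward dots: the same window occurring at two different positions.
    forward = [(i, j) for pos in index.values() for i in pos for j in pos if i < j]

    # Reverse-complement dots: a window whose reverse complement also occurs.
    rc = reverse_complement(seq)
    reverse = []
    for k in range(n - window + 1):
        j = n - window - k  # forward-strand start of the window rc[k:k + window]
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, j))

    return sorted(forward), sorted(reverse)
```

An empty forward list (apart from the excluded main diagonal) corresponds to the "no significant tandem repeat" reading of the plot; a direct repeat would show up as off-diagonal pairs such as (0, 15).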

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot; axis label: Sequence 12; legend: Forward, Reverse Complement]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12]

Summary: Full Length(2000bp) | A(24.5% 490) | C(27.1% 542) | T(21.2% 424) | G(27.2% 544)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
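A sliding-window GC scan of the kind summarized here can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline behind the plot: the 300 bp window comes from the report, while the step size, the 0.65 cutoff for "high GC", and the function names are hypothetical choices.

```python
def gc_windows(seq, window=300, step=1):
    """Return (start, gc_fraction) for each window along the sequence."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, gc / window))
    return result


def high_gc_regions(seq, window=300, threshold=0.65, step=1):
    """Merge overlapping or adjacent high-GC windows into half-open intervals.

    The threshold is an assumed cutoff; the report does not state one.
    """
    regions = []
    for start, fraction in gc_windows(seq, window, step):
        if fraction >= threshold:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Extend the previous region instead of opening a new one.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Overall GC fraction of the upstream section, from the summary counts
# C = 542 and G = 544 out of 2000 bp.
upstream_gc = (542 + 544) / 2000
```

From the summary counts, the upstream section as a whole is 54.3% GC; the windowed scan is what localizes the high-GC stretches that a gRNA site should avoid.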

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12]

Summary: Full Length(2000bp) | A(24.7% 494) | C(26.6% 532) | T(22.85% 457) | G(25.85% 517)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr3     -      95995798   95997797  2000
YourSeq     92    568   690   2000     85.0%  chr11    +     116119204  116119310   107
YourSeq     91    567   668   2000     96.0%  chr4     -     126299445  126299715   271
YourSeq     78    570   662   2000     92.2%  chr4     +      88568896   88568987    92
YourSeq     76    559   667   2000     87.3%  chr12    +      84128423   84128540   118
YourSeq     74    567   657   2000     92.2%  chr11    +      84135848   84135938    91
YourSeq     74    567   663   2000     85.4%  chr1     +     170838904  170838995    92
YourSeq     73    567   661   2000     85.6%  chr5     -      64130733   64130817    85
YourSeq     73    316   639   2000     81.4%  chr5     -      32843388   32843693   306
YourSeq     73    567   662   2000     84.6%  chr15    -      34515197   34515281    85
YourSeq     73    567   653   2000     94.0%  chr5     +      66432243   66432329    87
YourSeq     71    567   664   2000     90.0%  chr5     +       6365790    6365893   104
YourSeq     70    567   663   2000     84.5%  chr15    -      12038188   12038268    81
YourSeq     69    567   661   2000     88.3%  chr7     +      80272621   80272714    94
YourSeq     69    567   654   2000     93.8%  chr17    +      29517235   29517342   108
YourSeq     68    567   666   2000     82.8%  chr10    +      59480435   59480519    85
YourSeq     66    567   649   2000     84.7%  chr12    -     113201725  113201803    79
YourSeq     66    569   661   2000     82.9%  chr10    -      99365620   99365696    77
YourSeq     65    567   645   2000     90.6%  chr5     -      36801034   36801111    78
YourSeq     65    568   646   2000     86.9%  chr12    -      57204786   57204861    76

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.
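To review a BLAT result list like this one programmatically, each row can be parsed into typed fields and the full-length self match set aside. The sketch below is an illustration only: the field names are hypothetical labels for the eleven columns shown in the table header, and treating "score equals query size" as the self match is an assumption that happens to hold for these tables.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(line: str) -> BlatHit:
    """Parse one whitespace-separated result row into a BlatHit."""
    fields = line.split()
    # Some exports keep the "browser details" link text in front of each row.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {line!r}")
    q, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(q, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def secondary_hits(hits: List[BlatHit]) -> List[BlatHit]:
    """Drop the full-length self match (assumed: score equals query size)."""
    return [h for h in hits if h.score != h.q_size]


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the aligned query interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream table.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr3 - 95995798 95997797 2000",
    "YourSeq 92 568 690 2000 85.0% chr11 + 116119204 116119310 107",
    "YourSeq 91 567 668 2000 96.0% chr4 - 126299445 126299715 271",
]
hits = [parse_blat_row(r) for r in rows]
best = max(secondary_hits(hits), key=lambda h: h.score)
```

For the rows shown, the strongest secondary hit scores 92 and covers about 6% of the 2000 bp query, which is the kind of figure that supports the "no significant similarity" conclusion.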

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr3     -      95986932   95988931  2000
YourSeq     34   1552  1590   2000     83.4%  chr7     -      34835386   34835421    36
YourSeq     32    619   789   2000     57.7%  chr9     +      32269310   32269421   112
YourSeq     31   1555  1588   2000     97.0%  chr4     +      45277426   45277459    34
YourSeq     30     21   148   2000     87.5%  chr1     +     190162651  190162779   129
YourSeq     29   1554  1588   2000     83.9%  chrX     +     168101868  168101899    32
YourSeq     28   1561  1588   2000    100.0%  chr8     +      72307389   72307416    28
YourSeq     27   1565  1591   2000    100.0%  chr2     -     129103300  129103326    27
YourSeq     27   1528  1556   2000     89.3%  chr14    -       6472604    6472631    28
YourSeq     27   1560  1590   2000     82.8%  chr8     +       9123967    9123995    29
YourSeq     27   1564  1590   2000    100.0%  chr7     +     102261996  102262022    27
YourSeq     26   1565  1590   2000    100.0%  chr5     -     115461256  115461281    26
YourSeq     26   1563  1590   2000     96.5%  chr4     -     138316970  138316997    28
YourSeq     26   1565  1590   2000    100.0%  chr4     +      33134241   33134266    26
YourSeq     25   1565  1589   2000    100.0%  chr9     -      59181827   59181851    25
YourSeq     25   1563  1587   2000    100.0%  chr5     -     122064939  122064963    25
YourSeq     25   1565  1589   2000    100.0%  chr5     -      33295732   33295756    25
YourSeq     25   1565  1589   2000    100.0%  chr17    -      23651988   23652012    25
YourSeq     25     97   121   2000    100.0%  chr1     -     104480379  104480403    25
YourSeq     25   1564  1590   2000     96.3%  chr9     +      67647760   67647786    27

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
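The full-length chr3 hits in the two BLAT tables give genomic coordinates for both 2000 bp flanks, so the interval between them can be computed as a rough cross-check on the stated size of the KO region. This is a small sketch with a hypothetical helper name; it assumes the coordinates are 1-based and inclusive, which is consistent with each flank's SPAN of 2000.

```python
def gap_between(flank_a, flank_b):
    """Return (start, end, length) of the interval between two flanks.

    Each flank is an inclusive (start, end) coordinate pair on the same
    chromosome. Raises ValueError if the flanks overlap or touch.
    """
    (a_start, a_end), (b_start, b_end) = sorted([flank_a, flank_b])
    if b_start <= a_end + 1:
        raise ValueError("flanks overlap or are adjacent; no gap between them")
    return a_end + 1, b_start - 1, b_start - a_end - 1


# chr3 self hits from the upstream and downstream BLAT tables.
upstream_flank = (95995798, 95997797)
downstream_flank = (95986932, 95988931)
start, end, length = gap_between(upstream_flank, downstream_flank)
```

With these coordinates the interval between the flanks is 6866 bp, close to the ~6864 bp effective KO region quoted earlier in the report; the two numbers need not be defined in exactly the same way.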

Gene and information:

Plekho1 pleckstrin homology domain containing, family O member 1 [Mus musculus (house mouse)]
Gene ID: 67220, updated on 12-Aug-2019

Gene summary

Official Symbol: Plekho1 (provided by MGI)
Official Full Name: pleckstrin homology domain containing, family O member 1 (provided by MGI)
Primary source: MGI:MGI:1914470
See related: Ensembl:ENSMUSG00000015745
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Jza2; Ckip1; CKIP-1; JZA-20; 2810052M02Rik
Expression: Ubiquitous expression in bladder adult (RPKM 51.0), CNS E11.5 (RPKM 47.2) and 27 other tissues
Orthologs: human

Genomic context

Location: 3; 3 F2.1
Exon count: 6

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 3 | NC_000069.6 (95988809..95999355, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 3 | NC_000069.5 (95792751..95799762, complement)

Chromosome 3 - NC_000069.6

Transcript information: This gene has 5 transcripts

Gene: Plekho1 ENSMUSG00000015745

Description: pleckstrin homology domain containing, family O member 1 [Source:MGI Symbol;Acc:MGI:1914470]
Gene Synonyms: 2810052M02Rik, CKIP-1, JZA-20, Jza2
Location: Chromosome 3: 95,988,429-95,996,001 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 5 transcripts (splice variants), 242 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Plekho1-201 | ENSMUST00000015889.9 | 1931 | 408aa | ENSMUSP00000015889.3 | Protein coding | CCDS17627 | Q9JIY0 | TSL:1, GENCODE basic, APPRIS P1
Plekho1-202 | ENSMUST00000123006.7 | 1208 | 365aa | ENSMUSP00000118665.1 | Protein coding | - | F6XQM2 | CDS 5' incomplete, TSL:5
Plekho1-203 | ENSMUST00000130043.7 | 788 | 262aa | ENSMUSP00000115035.1 | Protein coding | - | F6VV25 | CDS 5' and 3' incomplete, TSL:2
Plekho1-204 | ENSMUST00000143485.1 | 441 | 124aa | ENSMUSP00000114505.1 | Protein coding | - | D3YVD1 | CDS 3' incomplete, TSL:3
Plekho1-205 | ENSMUST00000157043.1 | 362 | No protein | - | Retained intron | - | - | TSL:2

[Figure: Genome browser view of a 27.57 kb region (95.98Mb to 96.00Mb) with forward and reverse strands. Tracks: contig AC092855.39; Plekho1-201, Plekho1-202, Plekho1-203 and Plekho1-204 (protein coding), Plekho1-205 (retained intron) and Vps45-201 (protein coding), all on the reverse strand; Regulatory Build. Regulation legend: CTCF, Enhancer, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript).]

Transcript: ENSMUST00000015889

[Figure: Plekho1-201 (protein coding) on the reverse strand, 7.57 kb, with protein feature tracks for ENSMUSP00000015...: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily SSF50729; SMART, Pfam and PROSITE profiles: Pleckstrin homology domain; PANTHER: Pleckstrin homology domain-containing family O member 1 (PTHR15871); Gene3D: PH-like domain superfamily; CDD: cd13317. Sequence variants (dbSNP and all other sources), variant legend: missense variant, synonymous variant. Scale bar: 0 to 408.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
