https://www.alphaknockout.com

Mouse Plekhg2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Plekhg2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Plekhg2 gene (NCBI Reference Sequence: NM_138752.2; Ensembl: ENSMUSG00000037552) is located on mouse chromosome 7. Nineteen exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 19 (Transcript: ENSMUST00000094644). Exons 5~9 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Plekhg2 gene.

To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-452G7 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 5 starts at about 11.71% of the coding region. Knockout of exons 5~9 will result in a frameshift of the gene.
The size of intron 4 for 5'-loxP site insertion is 1173 bp, and the size of intron 9 for 3'-loxP site insertion is 2089 bp.
The size of the effective cKO region is ~1753 bp.
The cKO region does not contain any other known gene.
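
The arithmetic behind the first note can be sketched roughly as follows. This is only an illustration: the protein length (1341 aa) is taken from the transcript table for ENSMUST00000094644, the function names are hypothetical, and it is an assumption that the percentage is measured over the coding sequence without the stop codon. The frameshift helper states the general rule only; the report does not list the individual exon lengths, so none are filled in here.

```python
# Sketch: convert "exon 5 starts at about 11.71% of the coding region"
# into an approximate nucleotide offset within the CDS.

PROTEIN_LENGTH_AA = 1341       # ENSMUST00000094644, from the transcript table
EXON5_START_FRACTION = 0.1171  # from the note above


def approx_cds_offset(protein_length_aa: int, fraction: float) -> tuple[int, int]:
    """Return (nucleotides, complete codons) lying upstream of the given fraction.

    Assumption: the fraction refers to the CDS without the stop codon.
    """
    cds_nt = protein_length_aa * 3
    offset_nt = round(cds_nt * fraction)
    return offset_nt, offset_nt // 3


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


offset_nt, codons_upstream = approx_cds_offset(PROTEIN_LENGTH_AA, EXON5_START_FRACTION)
print(offset_nt, codons_upstream)
```

Under these assumptions, roughly 470 coding nucleotides lie upstream of exon 5, so only a short N-terminal portion of the protein would be encoded before the deleted region.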


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Rows, top to bottom: wildtype allele (5' to 3', exons 1-10 and 19 shown, with the two gRNA regions marked); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Plekhg2; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: self dot plot of the sequence (Sequence 12 on both axes), showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
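
A self dot plot of this kind can be approximated with a simple exact-match window scan. The sketch below is not the tool used to produce the figure (the report does not name it); `revcomp` and `self_dot_plot` are hypothetical helper names, and only exact matches of the 10 bp window are considered, which is an assumption.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (non-ACGT characters are left unchanged)."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of off-diagonal dots (i, j) with i < j.

    A forward dot means the windows starting at i and j are identical.
    A reverse dot means the window at i equals the reverse complement of
    the window at j.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, starts in positions.items():
        # Same window seen at more than one position: direct repeat.
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                forward.append((i, j))
        # Window whose reverse complement occurs further along: inverted repeat.
        for j in positions.get(revcomp(kmer), []):
            for i in starts:
                if i < j:
                    reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy input: "GATTACA" repeated in tandem, scanned with a 7 bp window.
fwd, rev = self_dot_plot("GATTACAGATTACA", window=7)
print(fwd, rev)
```

Runs of forward dots lying parallel to the main diagonal at a short offset are what indicate tandem repeats in such a plot.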

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (Sequence 12).]

Summary: Full Length(8253bp) | A(22.15% 1828) | C(27.4% 2261) | T(22.0% 1816) | G(28.45% 2348)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Significant high GC-content regions are found. It may be difficult to construct this targeting vector.
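
The summary figures can be checked, and a windowed GC profile computed, with a few lines of code. This is a minimal sketch: `base_composition` and `windowed_gc` are hypothetical helper names, the step size is an assumption (the report gives only the 300 bp window size), and the base counts are copied from the summary line above.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return {base: (count, percent)} for A, C, G and T."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: (counts[b], 100.0 * counts[b] / total) for b in "ACGT"}


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each full window of the given size along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


# Base counts as reported in the summary line.
reported = {"A": 1828, "C": 2261, "T": 1816, "G": 2348}
total = sum(reported.values())
overall_gc = 100.0 * (reported["G"] + reported["C"]) / total
print(total, round(overall_gc, 2))
```

The reported counts add up to the stated full length of 8253 bp and give an overall GC content of about 55.8%.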


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr7   -        28369138   28372137    3000
YourSeq    102   2447   2576   3000     92.6%  chr10  -        89988590   90372517  383928
YourSeq     99   2447   2583   3000     84.8%  chr18  -        38673185   38673317     133
YourSeq     99   2450   2605   3000     89.1%  chr11  +        79313544   79313697     154
YourSeq     94   2405   2576   3000     92.1%  chr10  +        93474671   93475184     514
YourSeq     93   2447   2576   3000     92.0%  chr7   -        44935345   45113497  178153
YourSeq     93   2460   2579   3000     89.4%  chr1   -        87810715   87810833     119
YourSeq     93   2447   2576   3000     88.3%  chr1   -        39524333   39524459     127
YourSeq     92   2462   2576   3000     89.8%  chr3   -        40911068   40911180     113
YourSeq     92   2447   2576   3000     87.3%  chr11  -        97649805   97649933     129
YourSeq     92   2443   2583   3000     85.3%  chr16  +        22267767   22267903     137
YourSeq     92   2462   2585   3000     88.6%  chr13  +        46766241   46766363     123
YourSeq     91   2451   2583   3000     84.8%  chr10  -        44938985   44939116     132
YourSeq     90   2462   2576   3000     89.6%  chr13  -       100563420  100563532     113
YourSeq     90   2482   2602   3000     87.7%  chr3   +       104841545  104841661     117
YourSeq     90   2447   2576   3000     88.2%  chr13  +       112521845  112521972     128
YourSeq     90   2462   2583   3000     89.5%  chr13  +       101160531  101160649     119
YourSeq     89   2462   2576   3000     88.7%  chr16  -         4917345    4917457     113
YourSeq     89   2447   2576   3000     85.2%  chr1   +       171172401  171172529     129
YourSeq     88   2447   2576   3000     88.7%  chrX   -       105918094  105918221     128

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr7   -        28364385   28367384    3000
YourSeq    145    948   1247   3000     92.9%  chr10  +        63144135   63144677     543
YourSeq    138    960   1545   3000     93.2%  chr11  +       102538494  102917036  378543
YourSeq    137    948   1105   3000     93.7%  chr9   -        21542604   21542767     164
YourSeq    137    948   1445   3000     83.3%  chr10  +        76412911   76413245     335
YourSeq    135    524   1079   3000     82.5%  chr15  -        81484463   81484885     423
YourSeq    134    948   1103   3000     94.7%  chr12  +        15441039   15441201     163
YourSeq    132    948   1103   3000     93.6%  chr19  -        28864274   28864441     168
YourSeq    131    945   1101   3000     93.0%  chr18  -        67809632   67809793     162
YourSeq    131    948   1095   3000     94.6%  chr14  -        58814086   58814248     163
YourSeq    131    948   1096   3000     92.6%  chr11  -         4856562    4856709     148
YourSeq    129    948   1099   3000     92.8%  chr12  +        31912242   31912394     153
YourSeq    128    950   1102   3000     90.6%  chr12  +        70092514   70092663     150
YourSeq    128    948   1102   3000     91.6%  chr10  +        26750122   26750280     159
YourSeq    126    941   1101   3000     89.9%  chr13  -       117082458  117082618     161
YourSeq    126    948   1095   3000     91.8%  chr15  +         8486700    8486846     147
YourSeq    126    948   1099   3000     88.6%  chr10  +        78751261   78751409     149
YourSeq    125    965   1099   3000     94.8%  chr14  -        18260216   18260349     134
YourSeq    124    965   1144   3000     92.4%  chr8   -       119843040  119843252     213
YourSeq    123    940   1101   3000     86.8%  chr13  -        55772913   55773072     160

Note: The 3000 bp section downstream of Exon 9 is BLAT searched against the genome. No significant similarity is found.
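
The BLAT rows can be read programmatically to cross-check the coordinates. The sketch below parses three rows copied from the two result tables; `BlatHit`, `parse_hit` and `gap_between` are hypothetical names introduced for illustration, and the whitespace-separated row layout is assumed from the tables as printed here.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row (first field is the query name)."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def gap_between(a: BlatHit, b: BlatHit) -> int:
    """Number of bases lying strictly between two non-overlapping hits on one chromosome."""
    lo, hi = sorted([a, b], key=lambda h: h.t_start)
    return hi.t_start - lo.t_end - 1


# Self-hits of the upstream and downstream 3000 bp sections, plus the
# best-scoring non-self hit from the downstream table.
up_self = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr7 - 28369138 28372137 3000")
down_self = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr7 - 28364385 28367384 3000")
best_other = parse_hit("YourSeq 145 948 1247 3000 92.9% chr10 + 63144135 63144677 543")

gap = gap_between(up_self, down_self)
coverage = (best_other.q_end - best_other.q_start + 1) / best_other.q_size
print(gap, round(coverage, 2))
```

The interval between the two self-hits comes to 1753 bp, which appears consistent with the effective cKO region size given in the strategy notes. The best non-self hit covers only about a tenth of the query, in line with the note that no significant similarity is found.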


Gene information: Plekhg2
pleckstrin homology domain containing, family G (with RhoGef domain) member 2 [Mus musculus (house mouse)]
Gene ID: 101497, updated on 26-Jun-2020

Gene summary

Official Symbol: Plekhg2 (provided by MGI)
Official Full Name: pleckstrin homology domain containing, family G (with RhoGef domain) member 2 (provided by MGI)
Primary source: MGI:MGI:2141874
See related: Ensembl:ENSMUSG00000037552
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Clg; Cslg; AI194308
Expression: Broad expression in thymus adult (RPKM 36.3), spleen adult (RPKM 18.2) and 21 other tissues
Orthologs: human

Genomic context

Location: 7 A3; 7 16.71 cM

Exon count: 20

Annotation release  Status             Assembly                      Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (28359603..28372681, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (29144623..29157681, complement)

Chromosome 7 - NC_000073.6


Transcript information: This gene has 10 transcripts

Gene: Plekhg2 ENSMUSG00000037552

Description: pleckstrin homology domain containing, family G (with RhoGef domain) member 2 [Source:MGI Symbol;Acc:MGI:2141874]
Gene Synonyms: Clg
Location: Chromosome 7: 28,359,604-28,372,599 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 10 transcripts (splice variants), 99 orthologues, 20 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Plekhg2-202  ENSMUST00000119990.7   4880  1340aa      ENSMUSP00000112881.1  Protein coding           CCDS52162  G5E8T4      TSL:1, GENCODE basic, APPRIS ALT2
Plekhg2-201  ENSMUST00000094644.10  4623  1341aa      ENSMUSP00000092228.4  Protein coding           CCDS21040  E9QKB6      TSL:1, GENCODE basic, APPRIS P3
Plekhg2-203  ENSMUST00000121085.7   4649  1365aa      ENSMUSP00000113449.1  Protein coding           -          D3Z5N8      TSL:2, GENCODE basic, APPRIS ALT2
Plekhg2-206  ENSMUST00000144700.7   3721  912aa       ENSMUSP00000115651.1  Protein coding           -          A0A0R4J1S3  CDS 3' incomplete, TSL:1
Plekhg2-207  ENSMUST00000147362.7   729   205aa       ENSMUSP00000118217.1  Protein coding           -          D3YY99      CDS 3' incomplete, TSL:3
Plekhg2-209  ENSMUST00000147887.1   476   63aa        ENSMUSP00000122050.1  Protein coding           -          D3Z1B1      CDS 3' incomplete, TSL:3
Plekhg2-210  ENSMUST00000152281.1   858   56aa        ENSMUSP00000117062.1  Nonsense mediated decay  -          D6RFC7      TSL:5
Plekhg2-204  ENSMUST00000128015.7   5511  No protein  -                     Retained intron          -          -           TSL:2
Plekhg2-205  ENSMUST00000129145.7   3074  No protein  -                     Retained intron          -          -           TSL:2
Plekhg2-208  ENSMUST00000147767.1   661   No protein  -                     Retained intron          -          -           TSL:3


[Figure: Ensembl genome browser view of chromosome 7, 28.35-28.38 Mb (33.00 kb). Forward strand: Rps16-201 (protein coding), Rps16-202 and Rps16-203 (retained intron), AF357399-201 (snoRNA), Gm44710-201 (antisense). Contig: AC149606.6. Reverse strand: Plekhg2-201, -202, -203, -206, -207 and -209 (protein coding), Plekhg2-204, -205 and -208 (retained intron), Plekhg2-210 (nonsense mediated decay), Zfp36-201 and Zfp36-202 (protein coding). Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana), non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000094644

[Figure: protein domain and variant view of Plekhg2-201 (protein coding, reverse strand, 12.63 kb), protein ENSMUSP00000092..., scale bar 0-1341. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily (SSF50729, Dbl homology (DH) domain superfamily); SMART (Dbl homology (DH) domain, Pleckstrin homology domain); Pfam (Dbl homology (DH) domain, Pleckstrin homology domain); PROSITE profiles (Pleckstrin homology domain, Dbl homology (DH) domain); PANTHER (PTHR45924, PTHR45924:SF3); Gene3D (PH-like domain superfamily, Dbl homology (DH) domain superfamily); CDD (cd13243, Dbl homology (DH) domain). Sequence variants (dbSNP and all other sources), variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
