https://www.alphaknockout.com

Mouse Plekha1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Plekha1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Plekha1 gene (NCBI Reference Sequence: NM_133942; Ensembl: ENSMUSG00000040268) is located on mouse chromosome 7. Ten exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 10 (Transcript: ENSMUST00000048180). Exon 7 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Plekha1 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-211P16 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a gene-trapped allele exhibit postnatal lethality and increased body weight.

Exon 7 starts at about 50.47% of the coding region. Knockout of exon 7 will result in a frameshift of the gene. Intron 6, used for 5'-loxP site insertion, is 2624 bp; intron 7, used for 3'-loxP site insertion, is 456 bp. The effective cKO region is ~543 bp. The cKO region does not contain any other known gene.
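The position and frameshift statements above can be checked with simple arithmetic. The sketch below is illustrative only. The CDS length is an assumption derived from the 356 aa protein listed for ENSMUST00000048180 (356 codons plus one stop codon), and the exon length passed to `causes_frameshift` is a hypothetical value, because this report does not state the length of exon 7.

```python
# Assumption: CDS length = 356 codons (protein length in this report) + 1 stop codon.
CDS_LENGTH_NT = (356 + 1) * 3


def approx_cds_position(fraction: float, cds_length: int = CDS_LENGTH_NT) -> int:
    """Return the approximate CDS nucleotide at which a given fraction falls."""
    return round(fraction * cds_length)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deleted coding length that is not a multiple of 3 shifts the reading frame."""
    return deleted_coding_nt % 3 != 0


print(CDS_LENGTH_NT)                # assumed CDS length in nucleotides
print(approx_cds_position(0.5047))  # approximate CDS position where exon 7 starts
print(causes_frameshift(100))       # 100 is a hypothetical exon length
```

Under these assumptions, exon 7 begins roughly halfway through the coding sequence, so a frameshift there would be expected to disrupt the downstream half of the protein.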


Overview of the Targeting Strategy

[Figure: schematic of, from top to bottom, the wildtype allele (5' and 3' gRNA regions marked; exon labels 1, 7, 8 and 10 shown), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Plekha1; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: self dot plot of Sequence 12, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether tandem repeats are present. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
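To illustrate what a self dot plot with a 10 bp window detects, here is a minimal sketch. The report does not say which tool produced the plot, so the function `self_dot_plot` and the exact-match rule are assumptions made for illustration. An off-diagonal forward match indicates a direct repeat, and a reverse-complement match indicates an inverted repeat.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward: same word at two different offsets, i < j (direct repeats).
    reverse: word at i whose reverse complement occurs at j
             (may include i == j for palindromic windows).
    """
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for word, starts in positions.items():
        forward += [(i, j) for i in starts for j in starts if i < j]
        for j in positions.get(reverse_complement(word), []):
            reverse += [(i, j) for i in starts]
    return sorted(forward), sorted(reverse)


# Toy sequence (not from the report): a 10 bp word repeated after a 3 bp spacer.
toy = "ACGTTGCAAG" + "TTT" + "ACGTTGCAAG"
fwd, rev = self_dot_plot(toy, window=10)
print(fwd, rev)
```

A sequence with no significant repeats yields few or no points away from the main diagonal, which is the pattern the note describes.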

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7021bp) | A(28.17% 1978) | C(19.07% 1339) | T(29.6% 2078) | G(23.16% 1626)

Note: The sequence of the homology arms and cKO region is analyzed to determine its GC content. No significant high-GC region is found, so this region is suitable for PCR screening or sequencing analysis.
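The overall GC content follows directly from the base counts in the Summary line, and the plot corresponds to the same calculation applied over a sliding window. The sketch below uses the counts from this report. The function names and the 1 bp step are assumptions, since the report gives only the window size.

```python
# Base counts copied from the Summary line of this report.
COUNTS = {"A": 1978, "C": 1339, "T": 2078, "G": 1626}


def gc_fraction_from_counts(counts: dict) -> float:
    """Overall GC fraction from per-base counts."""
    return (counts["G"] + counts["C"]) / sum(counts.values())


def sliding_gc(seq: str, window: int = 300) -> list:
    """GC fraction of every full window, stepping 1 bp (step size is an assumption)."""
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(len(seq) - window + 1)
    ]


print(sum(COUNTS.values()))                              # 7021
print(round(100 * gc_fraction_from_counts(COUNTS), 2))   # 42.23
```

The counts sum to the stated full length of 7021 bp and give an overall GC content of about 42.23%.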


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       130901621  130904620  3000
YourSeq  43     1      54    3000   93.4%     chr15  +       76384745   76384797   53
YourSeq  40     33     132   3000   66.7%     chr8   +       16182076   16182120   45
YourSeq  31     1      43    3000   91.0%     chr14  -       57803261   57803302   42
YourSeq  27     32     58    3000   100.0%    chr11  -       56569296   56569322   27
YourSeq  26     33     58    3000   100.0%    chr3   -       123511891  123511916  26
YourSeq  26     33     58    3000   100.0%    chr17  -       24286361   24286386   26
YourSeq  26     33     58    3000   100.0%    chr11  -       53313556   53313581   26
YourSeq  24     33     60    3000   92.9%     chr5   -       122812814  122812841  28
YourSeq  23     1590   1615  3000   83.4%     chr8   +       40438556   40438579   24

Note: The 3000 bp section upstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       130905164  130908163  3000
YourSeq  209    1931   2189  3000   94.9%     chr2   +       170272728  170273018  291
YourSeq  199    1964   2190  3000   95.0%     chr1   +       24128942   24129173   232
YourSeq  196    1959   2188  3000   95.6%     chr14  +       73508407   73508655   249
YourSeq  187    2036   2941  3000   94.4%     chr10  -       40131771   40274633   142863
YourSeq  186    2024   2709  3000   97.0%     chr13  +       111336489  111541642  205154
YourSeq  155    2020   2189  3000   95.9%     chr17  -       88072919   88073091   173
YourSeq  154    2020   2191  3000   95.9%     chr17  +       28390856   28391409   554
YourSeq  153    2002   2189  3000   91.4%     chr8   +       25948377   25948574   198
YourSeq  152    2020   2189  3000   95.3%     chr11  -       88884142   88884319   178
YourSeq  152    2024   2189  3000   97.0%     chr14  +       47963321   47963490   170
YourSeq  152    2008   2188  3000   94.8%     chr12  +       85333183   85333376   194
YourSeq  151    2023   2189  3000   95.8%     chr17  -       30590147   30590321   175
YourSeq  150    2012   2186  3000   93.2%     chr1   +       131857986  131858164  179
YourSeq  149    1997   2194  3000   90.1%     chr11  -       29832708   29832897   190
YourSeq  149    2020   2188  3000   95.2%     chr10  +       80626938   80627111   174
YourSeq  148    2027   2188  3000   96.3%     chr6   -       38246975   38247144   170
YourSeq  148    2024   2192  3000   94.1%     chr5   -       92420909   92421079   171
YourSeq  147    2024   2188  3000   95.2%     chr7   +       126232344  126232513  170
YourSeq  146    1999   2197  3000   88.7%     chr18  -       73687863   73688045   183

Note: The 3000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
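The chr7 coordinates of the two full-length (100% identity) hits can be used as a consistency check on the stated cKO region size. The sketch below assumes the coordinates are 1-based and inclusive, which matches the reported SPAN of 3000, and assumes the two 3000 bp windows directly abut the cKO region. The helper names are illustrative.

```python
# Chr7 coordinates of the 100% identity self-hits in the two BLAT tables.
UP_START, UP_END = 130901621, 130904620
DOWN_START, DOWN_END = 130905164, 130908163


def span(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def gap_between(up_end: int, down_start: int) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return down_start - up_end - 1


print(span(UP_START, UP_END))           # 3000
print(span(DOWN_START, DOWN_END))       # 3000
print(gap_between(UP_END, DOWN_START))  # 543
```

The 543 bp gap between the two windows agrees with the effective cKO region size of ~543 bp given in the strategy summary.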


Gene information:

Plekha1 pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 1 [Mus musculus (house mouse)]
Gene ID: 101476, updated on 10-Oct-2019

Gene summary

Official Symbol: Plekha1 (provided by MGI)
Official Full Name: pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 1 (provided by MGI)
Primary source: MGI:MGI:2442213
See related: Ensembl:ENSMUSG00000040268
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: TAPP1; AA960558; C920009D07Rik
Expression: Broad expression in CNS E18 (RPKM 32.7), whole brain E14.5 (RPKM 25.8) and 26 other tissues

Genomic context

Location: 7; 7 F3

Exon count: 17

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26) 7    NC_000073.6 (130860551..130913494)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)   7    NC_000073.5 (138009424..138056816)

Chromosome 7 - NC_000073.6


Transcript information: This gene has 12 transcripts

Gene: Plekha1 ENSMUSG00000040268

Description: pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 1 [Source:MGI Symbol;Acc:MGI:2442213]
Gene Synonyms: C920009D07Rik, TAPP1
Location: Chromosome 7: 130,865,756-130,913,312 forward strand. GRCm38:CM001000.2
About this gene: This gene has 12 transcripts (splice variants), 267 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 12 phenotypes.

Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Plekha1-201  ENSMUST00000048180.11  3321  356aa       ENSMUSP00000035375.5  Protein coding           CCDS21907  Q8BUL6      TSL:1 GENCODE basic
Plekha1-202  ENSMUST00000075181.10  2301  383aa       ENSMUSP00000074675.4  Protein coding           CCDS85434  Q8BUL6      TSL:1 GENCODE basic APPRIS P1
Plekha1-203  ENSMUST00000120441.7   3777  334aa       ENSMUSP00000112777.1  Protein coding           -          D3YU01      TSL:5 GENCODE basic
Plekha1-211  ENSMUST00000151119.8   2397  316aa       ENSMUSP00000123600.2  Protein coding           -          D6RCU3      TSL:1 GENCODE basic
Plekha1-204  ENSMUST00000126355.1   588   155aa       ENSMUSP00000114411.1  Protein coding           -          F6YLP9      CDS 5' incomplete TSL:3
Plekha1-206  ENSMUST00000136963.7   743   71aa        ENSMUSP00000146948.1  Nonsense mediated decay  -          A0A140LIT3  CDS 5' incomplete TSL:3
Plekha1-209  ENSMUST00000148513.1   2941  No protein  -                     Retained intron          -          -           TSL:1
Plekha1-210  ENSMUST00000149029.7   774   No protein  -                     Retained intron          -          -           TSL:3
Plekha1-208  ENSMUST00000146111.1   726   No protein  -                     Retained intron          -          -           TSL:3
Plekha1-207  ENSMUST00000140153.1   517   No protein  -                     Retained intron          -          -           TSL:2
Plekha1-205  ENSMUST00000135359.1   352   No protein  -                     Retained intron          -          -           TSL:3
Plekha1-212  ENSMUST00000154282.1   153   No protein  -                     Retained intron          -          -           TSL:3


[Figure: Ensembl region view, 67.56 kb, chromosome 7 from 130.86 Mb to 130.92 Mb. Forward strand (comprehensive gene set): Plekha1-201, -202, -203, -204 and -211 (protein coding); Plekha1-205, -207, -208, -209, -210 and -212 (retained intron); Plekha1-206 (nonsense mediated decay); Mir7061-201 (miRNA). Contig: AC115697.19. Reverse strand: Fgfr2-217 (protein coding), Gm5602-201 (lncRNA). Regulatory Build track with legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000048180

[Figure: protein view of Plekha1-201 (protein coding; 47.39 kb, forward strand), protein ENSMUSP00000035... Tracks: MobiDB lite; low complexity (Seg); Superfamily SSF50729; SMART, Pfam and PROSITE profiles (Pleckstrin homology domain); PANTHER PTHR14336:SF5 and PTHR14336; Gene3D (PH-like domain superfamily); CDD cd13271; sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 356 aa.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
