https://www.alphaknockout.com

Mouse Pphln1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Pphln1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pphln1 gene (NCBI Reference Sequence: NM_146062; Ensembl: ENSMUSG00000036167) is located on mouse chromosome 15. 10 exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 10 (Transcript: ENSMUST00000049122). Exon 5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Pphln1 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-10A16 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a gene trap allele die prior to E7.5.

Exon 5 starts at about 30.01% of the coding region. Knockout of exon 5 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 17425 bp, and the size of intron 5 for 3'-loxP site insertion is 10404 bp. The size of the effective cKO region is ~712 bp. The cKO region does not contain any other known gene.
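As a rough check of the statements above, the sketch below converts the 30.01% figure into an approximate coding position and states the rule behind the frameshift claim. It assumes the percentage is measured over the 1143-nt coding sequence implied by the 381-aa protein of transcript ENSMUST00000049122 (stop codon excluded), which this report does not state. The lengths passed to `causes_frameshift` are hypothetical, because the coding length of exon 5 is not given here.

```python
# Assumption: CDS length is derived from the 381-aa protein of
# transcript ENSMUST00000049122, stop codon excluded.
PROTEIN_LENGTH_AA = 381
CDS_LENGTH_NT = PROTEIN_LENGTH_AA * 3  # 1143 nt


def approx_coding_position(fraction: float, cds_length: int = CDS_LENGTH_NT) -> int:
    """Approximate coding nucleotide located at a given fraction of the CDS."""
    return round(fraction * cds_length)


def codon_number(coding_position: int) -> int:
    """1-based codon containing a 1-based coding nucleotide position."""
    return (coding_position - 1) // 3 + 1


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length
    is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


if __name__ == "__main__":
    pos = approx_coding_position(0.3001)
    print("approximate coding nucleotide:", pos)
    print("codon:", codon_number(pos), "of", PROTEIN_LENGTH_AA)
    # Hypothetical deletion lengths, for illustration only.
    for length in (99, 100):
        print(length, "nt deleted -> frameshift:", causes_frameshift(length))
```

Under that assumption, exon 5 begins near coding nucleotide 343, that is, in codon 115 of 381, so a frameshift there would disrupt roughly the last 70% of the protein.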


Overview of the Targeting Strategy

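[Figure: schematic of the targeting strategy, showing four rows: the wildtype allele (exons 1, 5 and 10 labeled, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Pphln1; homology arm; cKO region; loxP site.]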


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, with forward and reverse-complement matches plotted.]

Note: The sequence of the homology arms and cKO region was aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
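The self-alignment described in this note can be sketched as an exact-match dot plot: every window of the sequence is compared with every other window, both on the forward strand and as a reverse complement. This is a minimal illustration, not the tool used for this report; the function names and the exact-match criterion are assumptions.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) dots.

    forward: windows at i < j with identical sequence (direct repeats).
    reverse: window at i equals the reverse complement of the window at j.
    """
    # Index every window by its sequence so lookups are O(1).
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward = sorted(
        (i, j)
        for positions in index.values()
        for i in positions
        for j in positions
        if i < j
    )
    reverse = sorted(
        (i, j)
        for i in range(last_start + 1)
        for j in index.get(reverse_complement(seq[i:i + window]), ())
    )
    return forward, reverse


if __name__ == "__main__":
    # Toy sequence containing one direct repeat of a 10-mer.
    toy = "ACGTTGCAAG" + "TTT" + "ACGTTGCAAG"
    fwd, rev = self_dot_plot(toy, window=10)
    print("forward dots:", fwd)
    print("reverse-complement dots:", rev)
```

Off-diagonal runs of forward dots indicate direct repeats; a sequence without tandem repeats leaves the forward list nearly empty at this window size.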

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length (7212 bp) | A (24.38%, 1758) | C (21.35%, 1540) | T (32.83%, 2368) | G (21.44%, 1546)

Note: The sequence of the homology arms and cKO region was analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
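The base counts in the summary line can be checked with a few lines of arithmetic. The sketch below recomputes the total length and per-base percentages from the reported counts and derives the overall GC content, which the report does not state explicitly; the variable names are illustrative.

```python
# Base counts as reported in the summary line.
counts = {"A": 1758, "C": 1540, "T": 2368, "G": 1546}

total = sum(counts.values())

# Per-base percentages, rounded to two decimals as in the report.
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}

# Overall GC content of the analyzed sequence.
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

if __name__ == "__main__":
    print("total length:", total)
    print("per-base %:", percent)
    print("GC %:", gc_percent)
```

The counts sum to the stated 7212 bp and reproduce the listed percentages, giving an overall GC content of about 42.79%.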


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr15  +       93438252   93441251   3000
YourSeq  241    1746    2191  3000   91.1%     chr11  -       106171737  106172351  615
YourSeq  241    1746    2198  3000   91.4%     chr11  -       95917786   95918287   502
YourSeq  205    1744    2198  3000   89.0%     chr1   +       133150441  133151041  601
YourSeq  200    1792    2198  3000   85.6%     chr10  +       61403571   61403887   317
YourSeq  196    1745    2140  3000   89.2%     chr7   +       127962615  127963110  496
YourSeq  177    1742    2191  3000   86.9%     chr6   +       52279149   52279525   377
YourSeq  177    2003    2416  3000   85.8%     chr15  +       79142978   79143211   234
YourSeq  175    1754    2191  3000   91.9%     chr9   +       59791639   59792207   569
YourSeq  173    2002    2416  3000   90.6%     chr2   -       91237633   91238072   440
YourSeq  170    2002    2198  3000   91.1%     chr15  -       51821385   51821574   190
YourSeq  170    1956    2191  3000   93.4%     chr19  +       34981696   34982149   454
YourSeq  169    2005    2411  3000   85.1%     chr11  +       80149204   80149425   222
YourSeq  166    2005    2198  3000   93.7%     chr4   -       53240746   53240939   194
YourSeq  163    2002    2198  3000   89.8%     chr12  -       3711112    3711306    195
YourSeq  162    2008    2198  3000   92.6%     chr4   -       120471739  120471948  210
YourSeq  161    2019    2208  3000   93.6%     chr2   +       140127107  140127305  199
YourSeq  161    2005    2198  3000   91.8%     chr19  +       44052790   44052985   196
YourSeq  160    2008    2418  3000   82.9%     chr18  -       7835434    7835682    249
YourSeq  159    2004    2191  3000   93.5%     chr2   +       116901191  116901380  190

Note: The 3000 bp section upstream of Exon 5 was searched against the genome with BLAT. Apart from the expected 100% self-match on chr15, no significant similarity is found.
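To make "no significant similarity" more concrete, the rows of a BLAT result table can be parsed and each hit expressed as a fraction of the 3000 bp query it covers. The sketch below is illustrative only: the `BlatHit` type and helper names are hypothetical, the column order is assumed to follow the table above (query coordinates first, then target coordinates), and the two sample rows are copied from the upstream table.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row; field 0 is the query name."""
    f = line.split()
    return BlatHit(
        score=int(f[1]),
        q_start=int(f[2]),
        q_end=int(f[3]),
        q_size=int(f[4]),
        identity=float(f[5].rstrip("%")),
        chrom=f[6],
        strand=f[7],
        t_start=int(f[8]),
        t_end=int(f[9]),
        span=int(f[10]),
    )


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query spanned by the hit (inclusive coordinates)."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


if __name__ == "__main__":
    rows = [
        "YourSeq 3000 1 3000 3000 100.0% chr15 + 93438252 93441251 3000",
        "YourSeq 241 1746 2191 3000 91.1% chr11 - 106171737 106172351 615",
    ]
    for row in rows:
        hit = parse_hit(row)
        print(hit.chrom, hit.score, f"{query_coverage(hit):.1%}")
```

The self-match covers the whole query, while the best off-target hit spans only about 15% of it at 91.1% identity, which is consistent with the note.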

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr15  +       93441964   93444963   3000
YourSeq  224    2759    3000  3000   98.3%     chr13  -       51035947   51097424   61478
YourSeq  215    2684    2999  3000   92.0%     chr18  +       53336483   53336777   295
YourSeq  209    2773    3000  3000   97.8%     chr18  -       63192376   63192619   244
YourSeq  209    2773    3000  3000   96.9%     chr14  +       95331929   95332158   230
YourSeq  208    2773    2995  3000   96.9%     chr4   -       51041873   51042115   243
YourSeq  207    2773    2995  3000   96.9%     chr4   -       51038423   51038697   275
YourSeq  207    2760    3000  3000   94.3%     chr15  +       98329093   98329331   239
YourSeq  206    2773    3000  3000   96.1%     chr19  -       43047254   43047509   256
YourSeq  205    2773    2999  3000   95.6%     chr17  -       49917518   49917759   242
YourSeq  205    2773    2999  3000   95.6%     chr12  +       64156128   64156366   239
YourSeq  204    2773    3000  3000   96.9%     chr7   -       72946860   72947097   238
YourSeq  204    2772    2999  3000   95.2%     chr4   -       80623763   80624000   238
YourSeq  204    2773    3000  3000   96.0%     chr8   +       34660648   34660876   229
YourSeq  204    2772    3000  3000   95.2%     chr3   +       60900140   60900377   238
YourSeq  203    2773    3000  3000   96.4%     chr18  -       63195708   63195944   237
YourSeq  203    2773    3000  3000   96.0%     chrX   +       22479206   22479446   241
YourSeq  202    2773    2999  3000   96.0%     chr13  -       52050656   52050895   240
YourSeq  202    2773    2999  3000   96.0%     chr13  -       52053911   52054150   240
YourSeq  202    2773    3000  3000   96.7%     chr7   +       111006030  111006255  226

Note: The 3000 bp section downstream of Exon 5 was searched against the genome with BLAT. Apart from the expected 100% self-match on chr15, no significant similarity is found.
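The ~712 bp size quoted for the effective cKO region can be cross-checked against the two 100% self-matches in the BLAT results. The sketch below assumes, which the report does not state, that the upstream and downstream 3000 bp query sequences directly flank the cKO region; the coordinates are the chr15 self-match positions from the two result tables.

```python
# chr15 coordinates of the 100% self-matches (inclusive, 1-based as listed).
UPSTREAM = (93438252, 93441251)     # 3000 bp section upstream
DOWNSTREAM = (93441964, 93444963)   # 3000 bp section downstream


def interval_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate interval."""
    return end - start + 1


def gap_between(left: tuple, right: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return right[0] - left[1] - 1


if __name__ == "__main__":
    print("upstream length:", interval_length(*UPSTREAM))
    print("downstream length:", interval_length(*DOWNSTREAM))
    print("bases between flanks:", gap_between(UPSTREAM, DOWNSTREAM))
```

Both flanks are 3000 bp long and the interval between them is 712 bp, matching the stated size of the effective cKO region.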


Gene and information:

Pphln1 periphilin 1 [Mus musculus (house mouse)]
Gene ID: 223828, updated on 12-Aug-2019

Gene summary

Official Symbol: Pphln1 (provided by MGI)
Official Full Name: periphilin 1 (provided by MGI)
Primary source: MGI:MGI:1917029
See related: Ensembl:ENSMUSG00000036167
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CR; HSPC206; HSPC232
Expression: Ubiquitous expression in CNS E11.5 (RPKM 8.0), limb E14.5 (RPKM 5.7) and 28 other tissues
Orthologs: human

Genomic context

Location: 15; 15 E3

Exon count: 15

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (93398345..93491913)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (93228781..93322344)

[Figure: genomic context of Pphln1 on Chromosome 15 - NC_000081.6.]


Transcript information: This gene has 7 transcripts

Gene: Pphln1 ENSMUSG00000036167

Description: periphilin 1 [Source:MGI Symbol;Acc:MGI:1917029]
Gene Synonyms: 1110063K05Rik, 1600022A19Rik, CR
Location: Chromosome 15: 93,398,350-93,491,510 forward strand. GRCm38:CM001008.2
About this gene: This gene has 7 transcripts (splice variants), 298 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt     Flags
Pphln1-201  ENSMUST00000049122.15  3573  381aa       ENSMUSP00000042762.8  Protein coding  CCDS27767  G3X959      TSL:1 GENCODE basic
Pphln1-204  ENSMUST00000165935.1   3473  293aa       ENSMUSP00000131121.1  Protein coding  CCDS37181  G3UWD4      TSL:1 GENCODE basic APPRIS ALT2
Pphln1-202  ENSMUST00000068457.14  3351  312aa       ENSMUSP00000068165.7  Protein coding  CCDS27768  Q3UBL8      TSL:1 GENCODE basic APPRIS P3
Pphln1-205  ENSMUST00000229071.1   1198  293aa       ENSMUSP00000154876.1  Protein coding  CCDS37181  G3UWD4      GENCODE basic APPRIS ALT2
Pphln1-203  ENSMUST00000109256.10  1154  293aa       ENSMUSP00000104879.3  Protein coding  CCDS37181  G3UWD4      TSL:1 GENCODE basic APPRIS ALT2
Pphln1-207  ENSMUST00000230385.1   592   125aa       ENSMUSP00000155236.1  Protein coding  -          A0A2R8W6Q1  CDS 3' incomplete
Pphln1-206  ENSMUST00000229721.1   583   No protein  -                     lncRNA          -          -           -


[Figure: Ensembl genome browser view of the Pphln1 locus, 113.16 kb, chromosome 15, 93.40 Mb to 93.50 Mb. Forward strand (comprehensive gene set): Pphln1-201, -202, -203, -204, -205 and -207 (protein coding) and Pphln1-206 (lncRNA). Contig: AC126556.4. Reverse strand (comprehensive gene set): Zcrb1-201 and -208 (protein coding), Zcrb1-203, -204, -205 and -207 (retained intron), Zcrb1-206 (nonsense mediated decay), and Prickle1-201 and -202 (protein coding). Regulatory Build track legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000049122

[Figure: Pphln1-201 (protein coding), 93.11 kb, forward strand. Protein tracks for ENSMUSP00000042...: MobiDB lite, low complexity (Seg), Pfam PF11488, PANTHER Periphilin-1. Sequence variants (dbSNP and all other sources) shown as missense and synonymous variants. Scale bar: 0 to 381.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
