https://www.alphaknockout.com

Mouse Ppfia4 Knockout Project (CRISPR/Cas9)

Objective: To create a Ppfia4 knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ppfia4 gene (NCBI Reference Sequence: NM_001144855; Ensembl: ENSMUSG00000026458) is located on Mouse chromosome 1. 29 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 28 (Transcript: ENSMUST00000168515). Exons 2~11 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 6.6% of the coding region. Exons 2~11 cover 34.09% of the coding region. The size of the effective KO region is ~6246 bp. The KO region does not contain any other known gene.
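As a rough check on these percentages, the following Python sketch converts them into approximate base-pair figures. It assumes the coding sequence length can be derived from the 1187 aa protein listed for ENSMUST00000168515 (three bases per codon, optionally plus a stop codon). That assumption and the function name `coding_bp` are illustrative and not part of the original report.

```python
def coding_bp(protein_aa: int, percent: float, include_stop: bool = True) -> float:
    """Estimate how many coding bases correspond to a percentage of the CDS.

    Assumption: CDS length = 3 bases per amino acid, plus 3 for the stop
    codon when include_stop is True.
    """
    cds_len = protein_aa * 3 + (3 if include_stop else 0)
    return cds_len * percent / 100.0


PROTEIN_AA = 1187  # protein length reported for Ppfia4-201

# Approximate coding offset at which exon 2 begins (about 6.6% of the CDS)
print(round(coding_bp(PROTEIN_AA, 6.6)))

# Approximate coding bases covered by exons 2~11 (34.09% of the CDS)
print(round(coding_bp(PROTEIN_AA, 34.09)))
```

The results are estimates only, since the reported percentages are rounded and the report does not state whether the stop codon is counted.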


Overview of the Targeting Strategy

[Figure: Wildtype allele drawn 5' to 3', showing exons 1 to 11 and exon 29, with two gRNA regions marked. Legend: Exon of mouse Ppfia4; Knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 1832 bp section downstream of Exon 11 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
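The self-alignment idea behind the dot plots can be sketched in Python as follows. This is a minimal illustration, assuming exact matching of 15 bp words on the forward strand only. The function names, the `max_offset` threshold, and the toy sequence are hypothetical, and the tool used to produce the report's dot plots is not specified.

```python
def self_dot_plot(seq: str, window: int = 15) -> set[tuple[int, int]]:
    """Return (i, j) pairs with i < j where the words starting at i and j
    (each `window` bases long) are identical. These are the off-diagonal
    dots of a forward-strand self dot plot."""
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    dots = set()
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.add((starts[a], starts[b]))
    return dots


def has_tandem_repeat(seq: str, window: int = 15, max_offset: int = 100) -> bool:
    """Flag sequences with an off-diagonal dot close to the main diagonal.
    max_offset is an arbitrary illustrative threshold."""
    return any(j - i <= max_offset for i, j in self_dot_plot(seq, window))


# Toy example: a 20 bp unit repeated three times in tandem
unit = "GATTACAGGCTTAACCGGTT"
print(has_tandem_repeat(unit * 3))
```

A full dot plot would also compare the sequence against its reverse complement, which this sketch omits.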


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(22.95% 459) | C(20.8% 416) | T(25.75% 515) | G(30.5% 610)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(1832bp) | A(22.22% 407) | C(25.22% 462) | T(25.16% 461) | G(27.4% 502)

Note: The 1832 bp section downstream of Exon 11 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
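The GC analysis can be illustrated with a short Python sketch. It computes the overall GC percentage from the base counts in the two summaries and shows one way to obtain a 300 bp sliding-window GC profile. The function names are illustrative, and the step size of 1 bp is an assumption, since the report gives only the window size.

```python
def gc_percent_from_counts(counts: dict[str, int]) -> float:
    """Overall GC percentage from per-base counts, rounded to 2 decimals."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


def gc_windows(seq: str, window: int = 300) -> list[float]:
    """GC fraction of every window of length `window`, sliding by 1 bp.
    A running prefix sum keeps the computation linear in sequence length."""
    seq = seq.upper()
    prefix = [0]
    for base in seq:
        prefix.append(prefix[-1] + (base in "GC"))
    return [(prefix[i + window] - prefix[i]) / window
            for i in range(len(seq) - window + 1)]


# Base counts copied from the two summaries in this report
upstream = {"A": 459, "C": 416, "T": 515, "G": 610}
downstream = {"A": 407, "C": 462, "T": 461, "G": 502}
print(gc_percent_from_counts(upstream), gc_percent_from_counts(downstream))
```

From the reported counts, the overall GC content works out to 51.3% for the upstream section and 52.62% for the downstream section.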


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   -       134329248  134331247  2000
YourSeq  210    57     300   2000   95.8%     chr5   -       147173262  147173510  249
YourSeq  205    81     316   2000   96.4%     chr7   -       99431617   99431856   240
YourSeq  200    80     297   2000   96.3%     chr8   -       119872732  119872957  226
YourSeq  199    83     298   2000   97.7%     chr13  +       56845518   56845754   237
YourSeq  196    81     295   2000   95.9%     chr2   -       166028704  166028931  228
YourSeq  194    81     299   2000   92.6%     chr11  +       106982842  106983055  214
YourSeq  192    65     300   2000   91.8%     chr15  +       38073039   38073251   213
YourSeq  191    80     292   2000   96.6%     chr2   +       30855245   30855461   217
YourSeq  190    80     297   2000   93.7%     chr1   +       133022370  133022582  213
YourSeq  189    67     298   2000   96.6%     chr4   -       129474970  129475512  543
YourSeq  189    81     279   2000   96.5%     chr17  -       36029979   36030176   198
YourSeq  188    1      299   2000   91.7%     chr7   -       44883646   44884432   787
YourSeq  188    80     295   2000   91.7%     chr2   -       30311933   30312136   204
YourSeq  186    80     299   2000   92.5%     chr17  -       33273163   33273367   205
YourSeq  186    81     298   2000   94.3%     chr2   +       120665701  120665917  217
YourSeq  185    66     300   2000   92.1%     chr13  -       62444417   62444635   219
YourSeq  185    81     279   2000   96.5%     chr1   -       55069273   55069471   199
YourSeq  185    66     301   2000   91.9%     chr2   +       79676944   79677155   212
YourSeq  185    81     369   2000   90.1%     chr11  +       102744501  102744741  241

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1832   1      1832  1832   100.0%    chr1   -       134321170  134323001  1832
YourSeq  26     728    765   1832   96.5%     chr1   +       11330060   11330113   54
YourSeq  24     136    166   1832   96.2%     chr8   -       4183625    4183656    32
YourSeq  22     873    894   1832   100.0%    chr8   +       109059071  109059092  22
YourSeq  21     776    796   1832   100.0%    chr6   -       94638356   94638376   21
YourSeq  21     730    750   1832   100.0%    chr10  -       95817186   95817206   21
YourSeq  21     239    259   1832   100.0%    chr1   +       168207775  168207795  21
YourSeq  21     1802   1822  1832   100.0%    chr1   +       75436091   75436111   21
YourSeq  20     731    750   1832   100.0%    chr1   -       59917109   59917128   20
YourSeq  20     1035   1054  1832   100.0%    chr1   +       152831299  152831318  20
YourSeq  20     314    335   1832   95.5%     chr1   +       107187370  107187391  22

Note: The 1832 bp section downstream of Exon 11 is BLAT searched against the genome. No significant similarity is found.
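The BLAT rows can be read programmatically. The Python sketch below parses the two top-scoring rows (the upstream and downstream flanking sections themselves) and computes how many bases lie between them on chr1. It assumes the whitespace-separated column order shown in the table headers and assumes the two flanking sections directly abut the knockout region, which the report does not state explicitly. The names `Hit`, `parse_hit`, and `bases_between` are illustrative.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> Hit:
    """Parse one whitespace-separated BLAT result row (query name first)."""
    f = row.split()
    return Hit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
               float(f[5].rstrip("%")), f[6], f[7],
               int(f[8]), int(f[9]), int(f[10]))


def bases_between(a: Hit, b: Hit) -> int:
    """Number of bases strictly between two non-overlapping hits
    on the same chromosome (coordinates are 1-based, inclusive)."""
    if a.chrom != b.chrom:
        raise ValueError("hits are on different chromosomes")
    lo, hi = sorted([a, b], key=lambda h: h.t_start)
    return hi.t_start - lo.t_end - 1


# Top hits copied from the two BLAT tables in this report
upstream = parse_hit("YourSeq 2000 1 2000 2000 100.0% chr1 - 134329248 134331247 2000")
downstream = parse_hit("YourSeq 1832 1 1832 1832 100.0% chr1 - 134321170 134323001 1832")

print(bases_between(upstream, downstream))
```

With the coordinates above, the interval between the two flanking sections is 6246 bp, which agrees with the effective KO region size given in the strategy summary.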


Gene and information: Ppfia4 protein tyrosine phosphatase, receptor type, f polypeptide (PTPRF), interacting protein (liprin), alpha 4 [ Mus musculus (house mouse) ] Gene ID: 68507, updated on 24-Oct-2019

Gene summary

Official Symbol: Ppfia4 (provided by MGI)
Official Full Name: protein tyrosine phosphatase, receptor type, f polypeptide (PTPRF), interacting protein (liprin), alpha 4 (provided by MGI)
Primary source: MGI:MGI:1915757
See related: Ensembl:ENSMUSG00000026458
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C81506; Gm3812; AI448359; AI852265; 100042382; 1110008G13Rik
Expression: Biased expression in cerebellum adult (RPKM 32.2), CNS E18 (RPKM 10.5) and 7 other tissues
Orthologs: human

Genomic context

Location: 1; 1 E4
Exon count: 31

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (134296783..134344756, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (136193360..136229505, complement)

Chromosome 1 - NC_000067.6


Transcript information: This gene has 6 transcripts

Gene: Ppfia4 ENSMUSG00000026458

Description: protein tyrosine phosphatase, receptor type, f polypeptide (PTPRF), interacting protein (liprin), alpha 4 [Source:MGI Symbol;Acc:MGI:1915757]
Gene Synonyms: 1110008G13Rik, Gm3812, LOC100042382, Liprin-alpha4
Location: 134,296,783-134,332,928 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 6 transcripts (splice variants), 199 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Ppfia4-201  ENSMUST00000168515.7  6067  1187aa      ENSMUSP00000128314.1  Protein coding   CCDS48369  B8QI36      TSL:1 GENCODE basic APPRIS P2
Ppfia4-205  ENSMUST00000189361.1  5238  1184aa      ENSMUSP00000139833.1  Protein coding   -          A0A087WPM2  TSL:5 GENCODE basic APPRIS ALT2
Ppfia4-203  ENSMUST00000186730.6  4674  892aa       ENSMUSP00000139800.1  Protein coding   -          A0A087WPJ3  CDS 5' incomplete TSL:5
Ppfia4-206  ENSMUST00000189862.6  2985  No protein  -                     Retained intron  -          -           TSL:1
Ppfia4-202  ENSMUST00000186553.1  1215  No protein  -                     Retained intron  -          -           TSL:1
Ppfia4-204  ENSMUST00000186964.6  2955  No protein  -                     lncRNA           -          -           TSL:1


[Figure: Ensembl genome browser view of a 56.15 kb region (134.29 Mb to 134.34 Mb). Forward strand: Myog-201 (protein coding). Contig: AC124110.8. Reverse strand: Ppfia4-201 (protein coding), Ppfia4-203 (protein coding), Ppfia4-204 (lncRNA), Ppfia4-206 (retained intron), Ppfia4-205 (protein coding), Ppfia4-202 (retained intron). Tracks: Genes (Comprehensive set); Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000168515

[Figure: Protein domain view of Ppfia4-201 (protein coding), reverse strand, 36.15 kb; scale bar 0 to 1187 aa. Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: Sterile alpha motif/pointed domain superfamily; SMART: Sterile alpha motif domain; Pfam: Sterile alpha motif domain; PROSITE profiles: Sterile alpha motif domain; PANTHER: Liprin-alpha-4, LAR-interacting protein, Liprin; Gene3D: Sterile alpha motif/pointed domain superfamily; CDD: Liprin-alpha, SAM domain repeats 1, 2 and 3; Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
