https://www.alphaknockout.com

Mouse Ptpn5 Knockout Project (CRISPR/Cas9)

Objective: To create a Ptpn5 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ptpn5 gene (NCBI Reference Sequence: NM_013643; Ensembl: ENSMUSG00000030854) is located on mouse chromosome 7. 14 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 14 (Transcript: ENSMUST00000102626). Exons 3~11 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null allele exhibit normal brain development.

Exon 3 starts at about 1.6% of the coding region. Exons 3~11 cover 75.91% of the coding region. The size of the effective KO region is ~9708 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with one gRNA region on each side of the knockout region. Exons labeled in the figure: 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 14. Legend: Exon of mouse Ptpn5; Knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
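The self-alignment described in this note can be sketched roughly as follows. This is a minimal illustration, not the tool used for the report: it assumes exact matching of 15 bp words on the forward strand only (the figure also shows reverse-complement matches), and the function names and the toy sequence are hypothetical.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the words of length
    `window` starting at i and at j are identical (forward strand only)."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def repeat_offsets(hits):
    """Count off-diagonal hits per offset (j - i). Many hits at one
    offset form a line parallel to the main diagonal, the usual
    signature of a tandem repeat with that period."""
    counts = {}
    for i, j in hits:
        counts[j - i] = counts.get(j - i, 0) + 1
    return counts


# Toy input (hypothetical): an 8 bp unit repeated five times.
toy = "ACGTTGCA" * 5
hits = self_dot_plot(toy, window=15)
offsets = repeat_offsets(hits)
print(offsets)  # {8: 18, 16: 10, 24: 2}
```

On a real 2000 bp flank, a sequence without tandem repeats would be expected to yield few or no off-diagonal hits at this word size, whereas a repeat shows up as a cluster of hits at offsets that are multiples of the repeat period.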

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 11 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(28.75% 575) | C(23.05% 461) | T(24.4% 488) | G(23.8% 476)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
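As a rough illustration of this analysis, the sketch below derives the overall GC content of the upstream section from the base counts in its summary line and shows one way a 300 bp sliding-window profile could be computed. The function names and the step size are assumptions, and the report does not state what threshold it treats as a high GC-content region.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of every window of length `window`, moving by `step`."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts copied from the upstream summary line.
upstream_counts = {"A": 575, "C": 461, "T": 488, "G": 476}
total = sum(upstream_counts.values())
overall_gc = (upstream_counts["G"] + upstream_counts["C"]) / total
print(total, f"{overall_gc:.2%}")  # 2000 46.85%

# Tiny demonstration of the windowed profile on a toy sequence.
print(sliding_gc("GGCCAATT", window=4, step=2))  # [1.0, 0.5, 0.0]
```

The four counts sum to the stated full length of 2000 bp, and the overall GC content of the upstream section comes out at 46.85%.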

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(26.0% 520) | C(22.3% 446) | T(22.55% 451) | G(29.15% 583)

Note: The 2000 bp section downstream of Exon 11 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       47091571   47093570   2000
YourSeq  182    555    780   2000   93.4%     chr18  -       65541074   65541486   413
YourSeq  176    601    1203  2000   80.7%     chr12  -       101984144  101984626  483
YourSeq  163    576    1040  2000   88.7%     chr4   +       150215168  150215673  506
YourSeq  148    672    1268  2000   82.7%     chr6   +       54581333   54581882   550
YourSeq  147    597    782   2000   89.8%     chr13  +       41717506   41717692   187
YourSeq  146    587    1077  2000   86.1%     chr1   +       136591448  136592020  573
YourSeq  142    582    780   2000   88.3%     chr6   -       52063333   52063531   199
YourSeq  139    959    1291  2000   84.6%     chr11  +       69254135   69254319   185
YourSeq  137    664    1209  2000   81.3%     chr16  +       11292473   11292769   297
YourSeq  132    583    779   2000   82.9%     chr5   -       73049524   73049702   179
YourSeq  130    601    780   2000   86.7%     chr7   +       101045369  101045550  182
YourSeq  129    597    779   2000   85.8%     chr2   -       3466668    3466849    182
YourSeq  127    604    768   2000   87.5%     chr7   -       126576130  126576291  162
YourSeq  127    962    1283  2000   81.9%     chr5   +       105665057  105665223  167
YourSeq  127    1144   1480  2000   82.6%     chr4   +       154129830  154130084  255
YourSeq  127    603    779   2000   88.7%     chr4   +       131875628  131875806  179
YourSeq  126    578    780   2000   83.9%     chr6   -       49759964   49760178   215
YourSeq  126    584    747   2000   89.5%     chr12  -       108540760  108541075  316
YourSeq  125    959    1280  2000   83.1%     chr1   +       160921867  160922037  171

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   -       47079863   47081862   2000
YourSeq  41     1547   1645  2000   95.6%     chr6   -       30295102   30295210   109
YourSeq  39     1546   1600  2000   95.6%     chr4   +       126535253  126535311  59
YourSeq  38     1546   1643  2000   97.7%     chr11  -       88926416   88926514   99
YourSeq  37     1551   1600  2000   88.0%     chr15  +       73163881   73163932   52
YourSeq  36     1582   1681  2000   71.0%     chr1   +       185460933  185461213  281
YourSeq  34     1555   1596  2000   90.5%     chr4   -       43743011   43743052   42
YourSeq  33     1547   1596  2000   84.0%     chr4   +       19582382   19582433   52
YourSeq  33     1547   1600  2000   97.2%     chr13  +       11355010   11355065   56
YourSeq  33     1549   1600  2000   82.7%     chr1   +       143311689  143311742  54
YourSeq  32     1551   1598  2000   83.4%     chr3   +       96550830   96550877   48
YourSeq  32     1547   1599  2000   88.9%     chr3   +       87800987   87801038   52
YourSeq  32     1547   1597  2000   97.1%     chr11  +       60765895   60765947   53
YourSeq  31     1547   1600  2000   88.6%     chr3   -       88828589   88828641   53
YourSeq  31     1627   1668  2000   94.3%     chr19  -       4515385    4515427    43
YourSeq  31     1547   1592  2000   97.2%     chr15  -       100412766  100412811  46
YourSeq  31     1556   1597  2000   94.2%     chr11  -       55482414   55482455   42
YourSeq  31     1546   1580  2000   97.1%     chr1   -       63532821   63532877   57
YourSeq  31     1547   1597  2000   97.0%     chr2   +       155288816  155288868  53
YourSeq  31     1547   1597  2000   97.1%     chr10  +       128648214  128648297  84

Note: The 2000 bp section downstream of Exon 11 is BLAT searched against the genome. No significant similarity is found.
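The top hit in each BLAT table locates the corresponding 2000 bp flank on chr7, so the distance between the two flanks can be checked against the stated size of the effective KO region. The sketch below is only an illustration: the `BlatHit` type and `parse_hit` helper are hypothetical names, and it assumes that the two flanks directly abut the knockout region, which the report does not state explicitly.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    f = row.split()
    # Skip leading label tokens such as "browser details YourSeq".
    while f and not f[0].isdigit():
        f = f[1:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


# Top hits copied from the two BLAT tables (up, then down).
up = parse_hit("YourSeq 2000 1 2000 2000 100.0% chr7 - 47091571 47093570 2000")
down = parse_hit("YourSeq 2000 1 2000 2000 100.0% chr7 - 47079863 47081862 2000")

# The gene lies on the complement strand, so the flank upstream of
# exon 3 has the higher chr7 coordinates. Count the bases strictly
# between the two flanks.
gap = up.t_start - down.t_end - 1
print(gap)  # 9708
```

Under that assumption, the interval between the flanks is 9708 bp, which agrees with the ~9708 bp effective KO region given in the strategy summary.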


Gene and protein information:

Ptpn5 protein tyrosine phosphatase, non-receptor type 5 [Mus musculus (house mouse)]
Gene ID: 19259, updated on 17-Sep-2019

Gene summary

Official Symbol: Ptpn5 (provided by MGI)
Official Full Name: protein tyrosine phosphatase, non-receptor type 5 (provided by MGI)
Primary source: MGI:MGI:97807
See related: Ensembl:ENSMUSG00000030854
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Step
Expression: Biased expression in CNS E18 (RPKM 44.4), whole brain E14.5 (RPKM 40.6) and 9 other tissues
Orthologs: human, all

Genomic context

Location: 7 B3; 7 30.7 cM
Exon count: 18

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 7 | NC_000073.6 (47077795..47134026, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 7 | NC_000073.5 (54333168..54389054, complement)

Chromosome 7 - NC_000073.6


Transcript information: This gene has 11 transcripts

Gene: Ptpn5 ENSMUSG00000030854

Description: protein tyrosine phosphatase, non-receptor type 5 [Source:MGI Symbol;Acc:MGI:97807]
Gene Synonyms: Step
Location: Chromosome 7: 47,077,795-47,133,684 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 11 transcripts (splice variants), 197 orthologues, 36 paralogues, is a member of 1 Ensembl protein family and is associated with 12 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Ptpn5-201 | ENSMUST00000033142.12 | 3145 | 541aa | ENSMUSP00000033142.5 | Protein coding | CCDS21294 | P54830 | TSL:1 GENCODE basic APPRIS P1
Ptpn5-202 | ENSMUST00000102626.9 | 3137 | 541aa | ENSMUSP00000099686.1 | Protein coding | CCDS21294 | P54830 | TSL:1 GENCODE basic APPRIS P1
Ptpn5-209 | ENSMUST00000209161.1 | 3296 | No protein | - | Retained intron | - | - | TSL:2
Ptpn5-205 | ENSMUST00000208324.1 | 2165 | No protein | - | Retained intron | - | - | TSL:1
Ptpn5-203 | ENSMUST00000207172.1 | 1874 | No protein | - | Retained intron | - | - | TSL:2
Ptpn5-207 | ENSMUST00000208531.1 | 694 | No protein | - | Retained intron | - | - | TSL:3
Ptpn5-204 | ENSMUST00000207344.1 | 393 | No protein | - | Retained intron | - | - | TSL:3
Ptpn5-211 | ENSMUST00000209184.1 | 765 | No protein | - | lncRNA | - | - | TSL:3
Ptpn5-210 | ENSMUST00000209179.1 | 673 | No protein | - | lncRNA | - | - | TSL:3
Ptpn5-208 | ENSMUST00000209057.1 | 589 | No protein | - | lncRNA | - | - | TSL:3
Ptpn5-206 | ENSMUST00000208437.1 | 348 | No protein | - | lncRNA | - | - | TSL:3


[Figure: genome browser view of a 75.89 kb region (47.08Mb to 47.14Mb), forward and reverse strands. Forward strand: Gm14377-201 (lncRNA). Contigs: AC113001.9. Reverse strand: LOC102637012-201 (unitary pse...), Ptpn5-202 (protein coding), Ptpn5-201 (protein coding), Ptpn5-209 (retained intron), Ptpn5-208 (lncRNA), Ptpn5-207 (retained intron), Ptpn5-211 (lncRNA), Ptpn5-203 (retained intron), Mir7056-201 (miRNA), Ptpn5-204 (retained intron), Ptpn5-206 (lncRNA). Tracks: Genes (Comprehensive set...), Regulatory Build.]

Regulation Legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site

Gene Legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (RNA gene; pseudogene; processed transcript)


Transcript: ENSMUST00000102626

[Figure: Ptpn5-202 (protein coding), reverse strand, 55.89 kb, with protein domain and feature tracks for ENSMUSP00000099... on a scale bar from 0 to 541.]

Tracks shown in the figure:
PDB-ENSP mappings
MobiDB lite
Low complexity (Seg)
Superfamily: Protein-tyrosine phosphatase-like
SMART: Protein-tyrosine phosphatase, catalytic; PTP type protein phosphatase
Prints: Protein-tyrosine phosphatase, KIM-containing; PTP type protein phosphatase
Pfam: PTP type protein phosphatase
PROSITE profiles: Tyrosine specific protein phosphatases domain; PTP type protein phosphatase
PROSITE patterns: Protein-tyrosine phosphatase, active site
PIRSF: Protein-tyrosine phosphatase, receptor type R/non-receptor type 5
PANTHER: Protein-tyrosine phosphatase, KIM-containing; PTHR46198:SF1
Gene3D: Protein-tyrosine phosphatase-like
CDD: cd14613
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)

Variant Legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
