https://www.alphaknockout.com

Mouse Pxdn Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Pxdn conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pxdn gene (NCBI Reference Sequence: NM_181395; Ensembl: ENSMUSG00000020674) is located on mouse chromosome 12. 23 exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 23 (Transcript: ENSMUST00000122328). Exons 11 to 13 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Pxdn gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP24-125H6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for an ENU-induced allele exhibit abnormal eye development with early-onset glaucoma and progressive retinal dysgenesis.

Exon 11 starts at about 28.99% of the coding region. Knockout of exons 11 to 13 will result in a frameshift of the gene. Intron 10, which receives the 5'-loxP site, is 2365 bp; intron 13, which receives the 3'-loxP site, is 1048 bp. The effective cKO region is ~2511 bp and contains no other known gene.
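The two statements above (where exon 11 begins in the coding sequence, and that the deletion shifts the reading frame) can be illustrated with a short sketch. It assumes the 28.99% figure is measured along the coding nucleotides of the 1475 aa Pxdn-202 protein, excluding the stop codon; the report does not state this. The exon lengths in the frameshift check are hypothetical placeholders, not the real Pxdn exon sizes, which the report does not list.

```python
from typing import List, Tuple

# Values quoted in the report.
PROTEIN_LENGTH_AA = 1475        # Pxdn-202 (ENSMUST00000122328)
EXON11_START_FRACTION = 0.2899  # "about 28.99% of the coding region"


def approx_cds_position(protein_length_aa: int, fraction: float) -> Tuple[int, int]:
    """Return (nucleotide offset, codon number) at a fractional CDS position.

    Assumption: the fraction is measured along the coding nucleotides,
    excluding the stop codon.
    """
    cds_length_nt = protein_length_aa * 3
    nt_offset = round(cds_length_nt * fraction)
    codon_number = nt_offset // 3 + 1
    return nt_offset, codon_number


def deletion_shifts_frame(coding_lengths_nt: List[int]) -> bool:
    """True if removing these coding segments changes the reading frame."""
    return sum(coding_lengths_nt) % 3 != 0


nt_offset, codon_number = approx_cds_position(PROTEIN_LENGTH_AA, EXON11_START_FRACTION)
print(f"Exon 11 begins near CDS nucleotide {nt_offset} (codon ~{codon_number})")

# HYPOTHETICAL exon lengths, for illustration only; not the real Pxdn values.
example_lengths = [100, 151, 120]
print("Frameshift expected:", deletion_shifts_frame(example_lengths))
```

A deletion whose total coding length is not a multiple of 3 shifts the downstream reading frame, which is the property the strategy relies on.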


Overview of the Targeting Strategy

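[Figure: schematic of the targeting strategy, drawn 5' to 3'. Panels: wildtype allele (exons 1, 11, 12, 13, 14 and 23 labeled, with the two gRNA regions marked); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Pxdn; homology arm; cKO region; loxP site.]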


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, forward and reverse complement. Axis label: Sequence 12.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
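The self-alignment described in the note can be sketched as an exact-match dot plot. This is a minimal illustration, not the tool used for the report: it records every pair of positions whose 10 bp windows are identical, for the forward strand and for the reverse complement. Off-diagonal forward hits indicate repeats. The function names and toy sequences are assumptions made for the example.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def window_matches(seq_x: str, seq_y: str, window: int = 10) -> List[Tuple[int, int]]:
    """Return (i, j) pairs where seq_x[i:i+window] == seq_y[j:j+window]."""
    index: Dict[str, List[int]] = {}
    for j in range(len(seq_y) - window + 1):
        index.setdefault(seq_y[j:j + window], []).append(j)
    hits = []
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], []):
            hits.append((i, j))
    return hits


def off_diagonal_forward(seq: str, window: int = 10) -> List[Tuple[int, int]]:
    """Forward self-matches excluding the trivial main diagonal (i == j)."""
    return [(i, j) for i, j in window_matches(seq, seq, window) if i != j]


# Toy sequences (hypothetical): an 11 bp unit repeated in tandem, and a
# short sequence without any repeated 10-mer.
tandem = "GATTACAGATC" * 2
unique = "ACGTACCGGTTAGC"

print("tandem repeat hits:", off_diagonal_forward(tandem))
print("unique sequence hits:", off_diagonal_forward(unique))
print("reverse-complement hits:", window_matches(unique, reverse_complement(unique)))
```

An empty off-diagonal list corresponds to the "no significant tandem repeat" outcome reported above. Real dot plot tools usually tolerate mismatches within the window, which this exact-match sketch does not.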

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence. Axis label: Sequence 12.]

Summary: Full length (9011 bp) | A (24.06%, 2168) | C (22.96%, 2069) | T (29.44%, 2653) | G (23.54%, 2121)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
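As a worked check of the summary line, the overall GC content follows directly from the reported base counts. The sketch below recomputes the per-base percentages and the GC fraction, and shows one way a windowed GC profile could be computed. The report gives the 300 bp window size but not the step, so the non-overlapping step and the function name `windowed_gc` are assumptions.

```python
from typing import List

# Base counts quoted in the summary line of the report.
base_counts = {"A": 2168, "C": 2069, "T": 2653, "G": 2121}

total = sum(base_counts.values())
percentages = {base: 100.0 * n / total for base, n in base_counts.items()}
gc_percent = 100.0 * (base_counts["G"] + base_counts["C"]) / total

print("total length:", total)
for base, pct in percentages.items():
    print(f"{base}: {pct:.2f}%")
print(f"GC content: {gc_percent:.2f}%")


def windowed_gc(seq: str, window: int = 300, step: int = 300) -> List[float]:
    """GC percentage of each full window along seq.

    Assumption: non-overlapping windows (step == window); the report
    specifies only the window size.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values
```

The counts sum to the stated full length of 9011 bp, and the resulting GC content is about 46.5%, consistent with the note that no high-GC region stands out.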


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr12  +        29990125   29993124    3000
YourSeq     81   2585   2881   3000     70.7%  chr10  +        40377451   40377674     224
YourSeq     71   2607   2990   3000     72.2%  chr18  -        75485569   75485866     298
YourSeq     68   2539   2688   3000     81.0%  chr19  +        53560188   53560323     136
YourSeq     59   2605   2912   3000     75.4%  chr10  +        94937399   94937654     256
YourSeq     58   2605   2696   3000     81.6%  chr15  -        68295774   68295865      92
YourSeq     52   2608   2888   3000     66.3%  chr1   +        39480637   39480834     198
YourSeq     49    992   1091   3000     94.7%  chr1   +        38841968   38842472     505
YourSeq     47   2605   2688   3000     77.3%  chr5   -       100759471  100759553      83
YourSeq     47   2605   2688   3000     86.0%  chr6   +        88746721   88746802      82
YourSeq     47    218    338   3000     87.1%  chr1   +        72385416   72385534     119
YourSeq     46   1249   1312   3000     92.8%  chr18  -        55473993   55474057      65
YourSeq     45    215    344   3000     92.6%  chrX   -        16463167   16463297     131
YourSeq     45   2592   2667   3000     80.3%  chr16  +        21287336   21287427      92
YourSeq     43   2516   2671   3000     91.5%  chr6   -        82360468   82360622     155
YourSeq     43   2598   2671   3000     79.8%  chr2   -       156804001  156804088      88
YourSeq     43   1010   1095   3000     93.9%  chr10  -       114975306  114975442     137
YourSeq     43   2592   2671   3000     77.5%  chr10  +         9983167    9983262      96
YourSeq     43   2609   2688   3000     77.5%  chr1   +       138178644  138178722      79
YourSeq     42    202    340   3000     88.9%  chr6   -        38407530   38407668     139

Note: The 3000 bp section upstream of Exon 11 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr12  +        29995636   29998635    3000
YourSeq     98      3    124   3000     87.4%  chr10  +        62933411   62933529     119
YourSeq     95     14   2492   3000     88.0%  chr10  -       121478380  121799252  320873
YourSeq     95      1    124   3000     91.4%  chr8   +       120099254  120099377     124
YourSeq     95      1    111   3000     92.8%  chr10  +        26474572   26474682     111
YourSeq     91      1    112   3000     91.1%  chr10  -       128162178  128162291     114
YourSeq     89      3    133   3000     87.4%  chr19  +        45610334   45610782     449
YourSeq     88     10    112   3000     93.3%  chr6   -       117895192  117895296     105
YourSeq     88      1    112   3000     87.4%  chr1   +        72305051   72305161     111
YourSeq     87     10    125   3000     86.4%  chr1   +       179645538  179645650     113
YourSeq     86      1    107   3000     90.7%  chr10  -        18008972   18398790  389819
YourSeq     85      1    112   3000     88.4%  chr8   -       120390102  120390215     114
YourSeq     84      1    108   3000     88.9%  chr11  -         4589774    4589881     108
YourSeq     84      1    124   3000     84.3%  chr10  +        39715469   39715592     124
YourSeq     84      1    112   3000     87.5%  chr1   +       106411480  106411591     112
YourSeq     83      1    107   3000     88.8%  chr1   -        60608633   60608739     107
YourSeq     82     13    122   3000     89.6%  chr18  +        42156397   42156506     110
YourSeq     81      1    112   3000     83.7%  chr10  -       111568126  111568235     110
YourSeq     81      1    111   3000     86.5%  chr1   -        63209056   63209166     111
YourSeq     80      7    107   3000     90.1%  chr10  +        81241420   81241532     113

Note: The 3000 bp section downstream of Exon 13 is BLAT searched against the genome. No significant similarity is found.
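The top hit in each BLAT table is the query matching its own locus on chr12, so the two hits bracket the cKO region. The following sketch, using only the coordinates listed in the two tables, checks that each flank spans 3000 bp and that the interval between them equals the stated effective cKO size of ~2511 bp. The helper names are illustrative.

```python
# Self-hits copied from the first row of each BLAT table (1-based, inclusive).
upstream_hit = ("chr12", 29990125, 29993124)    # 3000 bp upstream of exon 11
downstream_hit = ("chr12", 29995636, 29998635)  # 3000 bp downstream of exon 13


def span(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def gap_between(left_end: int, right_start: int) -> int:
    """Number of bases strictly between two inclusive intervals."""
    return right_start - left_end - 1


upstream_len = span(upstream_hit[1], upstream_hit[2])
downstream_len = span(downstream_hit[1], downstream_hit[2])
cko_len = gap_between(upstream_hit[2], downstream_hit[1])

print("upstream flank:", upstream_len, "bp")
print("downstream flank:", downstream_len, "bp")
print("interval between flanks:", cko_len, "bp")
```

The interval between the flanks (chr12:29993125-29995635) works out to 2511 bp, in agreement with the effective cKO region size given in the strategy summary.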


Gene and information: Pxdn peroxidasin [ Mus musculus (house mouse) ] Gene ID: 69675, updated on 12-Aug-2019

Gene summary

Official Symbol: Pxdn (provided by MGI)
Official Full Name: peroxidasin (provided by MGI)
Primary source: MGI:MGI:1916925
See related: Ensembl:ENSMUSG00000020674
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: VPO1; C85409; mKIAA0230; E330004E07; 2310075M15Rik
Expression: Broad expression in subcutaneous fat pad adult (RPKM 54.0), limb E14.5 (RPKM 52.0) and 21 other tissues
Orthologs: human; all

Genomic context

Location: 12; 12 A2

Exon count: 24

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (29936642..30017658)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (30622901..30702523)

Chromosome 12 - NC_000078.6


Transcript information: This gene has 8 transcripts

Gene: Pxdn ENSMUSG00000020674

Description: peroxidasin [Source:MGI Symbol;Acc:MGI:1916925]
Gene Synonyms: 2310075M15Rik, VPO1
Location: Chromosome 12: 29,937,608-30,017,658 forward strand. GRCm38:CM001005.2
About this gene: This gene has 8 transcripts (splice variants), 213 orthologues, 4 paralogues, is a member of 2 Ensembl protein families and is associated with 37 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Pxdn-202  ENSMUST00000122328.7  6614  1475aa      ENSMUSP00000113703.1  Protein coding   CCDS25856  Q3UQ28      TSL:1 GENCODE basic APPRIS P1
Pxdn-208  ENSMUST00000220271.1  6377  1295aa      ENSMUSP00000151320.1  Protein coding   -          A0A1W2P6L9  TSL:5 GENCODE basic
Pxdn-201  ENSMUST00000118321.2  2698  307aa       ENSMUSP00000113477.2  Protein coding   -          D3Z5M7      TSL:1 GENCODE basic
Pxdn-207  ENSMUST00000218620.1  4578  No protein  -                     Retained intron  -          -           TSL:5
Pxdn-206  ENSMUST00000155318.1  3406  No protein  -                     Retained intron  -          -           TSL:1
Pxdn-204  ENSMUST00000137316.1  787   No protein  -                     Retained intron  -          -           TSL:2
Pxdn-203  ENSMUST00000126233.1  768   No protein  -                     Retained intron  -          -           TSL:2
Pxdn-205  ENSMUST00000155190.1  1024  No protein  -                     lncRNA           -          -           TSL:3

[Figure: Ensembl genome browser view of the Pxdn locus, 100.05 kb, chromosome 12, 29.94 Mb to 30.02 Mb, forward and reverse strands. Transcript tracks (comprehensive set): Pxdn-208, Pxdn-202 and Pxdn-201 (protein coding); Pxdn-206, Pxdn-203, Pxdn-204 and Pxdn-207 (retained intron); Pxdn-205 (lncRNA). Other tracks: contigs AC159626.2 and AC165078.2; Regulatory Build. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000122328

[Figure: Ensembl protein summary for Pxdn-202 (protein coding; 79.62 kb, forward strand), with annotations drawn along the protein on a scale bar from 0 to 1475 residues. Feature tracks: transmembrane helices; low complexity (Seg); coiled-coils (Ncoils); cleavage site. Superfamily: immunoglobulin-like domain superfamily; haem peroxidase superfamily; SSF57603; SSF52058. SMART: immunoglobulin subtype 2; VWFC domain; immunoglobulin subtype; leucine-rich repeat, typical subtype; cysteine-rich flanking region, C-terminal; immunoglobulin V-set domain. Prints: haem peroxidase, animal-type. Pfam: leucine-rich repeat; haem peroxidase, animal-type; VWFC domain; immunoglobulin I-set. PROSITE profiles: leucine-rich repeat; haem peroxidase, animal-type; VWFC domain; immunoglobulin-like domain. PROSITE patterns: VWFC domain. PANTHER: peroxidasin; PTHR11475. Gene3D: leucine-rich repeat domain superfamily; haem peroxidase domain superfamily, animal type; 2.10.70.10; immunoglobulin-like fold. CDD: cd05745; peroxidasin, peroxidase domain; cd05746. Sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
