https://www.alphaknockout.com

Mouse Pwp2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Pwp2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pwp2 gene (NCBI Reference Sequence: NM_029546; Ensembl: ENSMUSG00000032834) is located on mouse chromosome 10. 21 exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 21 (Transcript: ENSMUST00000042556). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Pwp2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-77O11 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 0.69% of the coding region. The knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion: 900 bp, and the size of intron 2 for 3'-loxP site insertion: 1017 bp. The size of the effective cKO region: ~613 bp. The cKO region does not contain any other known gene.
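The two numeric statements in the note can be illustrated with a short sketch. It assumes that the coding region length is the 919 aa protein plus one stop codon, and that an exon deletion causes a frameshift whenever the exon length is not a multiple of 3. The function names are hypothetical, and the exon lengths passed to the frameshift check are placeholder values, since the report does not state the length of exon 2.

```python
def coding_offset_nt(fraction: float, protein_length_aa: int) -> int:
    """Approximate nucleotide offset into the coding region for a given fraction."""
    # Assumption: coding region = all codons of the protein plus one stop codon.
    cds_length_nt = (protein_length_aa + 1) * 3
    return round(fraction * cds_length_nt)


def causes_frameshift(exon_length_bp: int) -> bool:
    """Deleting an exon shifts the reading frame if its length is not divisible by 3."""
    return exon_length_bp % 3 != 0


# 0.69% and 919 aa are taken from the report; the result is only approximate.
print(coding_offset_nt(0.0069, 919))

# Placeholder exon lengths (not from the report), to show both outcomes.
print(causes_frameshift(100), causes_frameshift(99))
```

Under these assumptions, exon 2 would begin roughly 19 nucleotides into the coding region, so nearly the whole protein lies downstream of the deleted exon.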


Overview of the Targeting Strategy

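[Figure: schematic of the targeting strategy, showing the wildtype allele (exons 1, 2, 3, 4, 5 ... 21, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Pwp2; cKO region; loxP site.]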


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12. Legend: Forward; Reverse Complement.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
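A self-alignment dot plot of this kind can be sketched as follows. This is a minimal illustration that assumes exact word matching with the stated 10 bp window. The report does not name the tool or the matching rule used, so the function names and the exact-match criterion are assumptions.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) dots for a sequence against itself.

    A forward dot means the word at i also occurs at j (i != j, so the main
    diagonal is omitted). A reverse dot means the reverse complement of the
    word at i occurs at j.
    """
    seq = seq.upper()
    last_start = len(seq) - window + 1

    # Index every word of length `window` by its start positions.
    positions = {}
    for i in range(last_start):
        positions.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for i in range(last_start):
        word = seq[i:i + window]
        forward.extend((i, j) for j in positions[word] if j != i)
        reverse.extend((i, j) for j in positions.get(reverse_complement(word), []))
    return forward, reverse


# Toy input (not from the report): the word AAAA occurs at positions 0 and 8.
fwd, rev = self_dot_plot("AAAACCCCAAAA", window=4)
print(fwd, rev)
```

In such a plot, tandem repeats appear as runs of forward dots parallel to the main diagonal, so an empty or sparse forward list away from the diagonal corresponds to the "no significant tandem repeat" reading above.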

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (7113 bp) | A (23.04%, 1639) | C (23.79%, 1692) | T (26.28%, 1869) | G (26.89%, 1913)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
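The summary percentages can be checked from the base counts, and a windowed GC profile computed, with a sketch like the one below. The function names are hypothetical, the counts are copied from the summary line, and the 300 bp default mirrors the stated window size. The tool used to draw the report's plot is not named in the source, so the sliding-window scheme here is an assumption.

```python
def base_composition(counts: dict) -> tuple:
    """Return (total length, percentage per base rounded to 2 decimals)."""
    total = sum(counts.values())
    return total, {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each full window of length `window` along `seq`."""
    seq = seq.upper()
    return [
        100.0 * (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts as given in the report's summary line.
counts = {"A": 1639, "C": 1692, "T": 1869, "G": 1913}
total, pct = base_composition(counts)
overall_gc = round(100 * (counts["G"] + counts["C"]) / total, 2)
print(total, pct, overall_gc)

# Toy sequence (not from the report) to show the windowed profile.
print(gc_windows("GGCCAATT", window=4, step=4))
```

The counts sum to 7113 and give an overall GC content of about 50.68%, so the difficulty flagged in the note comes from local high-GC windows rather than from the average composition.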


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       78184421   78187420   3000
YourSeq  299    610    1393  3000   83.8%     chr10  +       78186008   78186523   516
YourSeq  292    906    1388  3000   90.8%     chr18  +       46534049   46534549   501
YourSeq  288    903    1398  3000   92.4%     chr2   -       104516957  104517456  500
YourSeq  281    903    1388  3000   90.8%     chr11  -       101734138  101779464  45327
YourSeq  258    906    1339  3000   87.5%     chr8   +       119391871  119392215  345
YourSeq  253    903    1334  3000   91.3%     chr1   -       171264271  171264917  647
YourSeq  240    903    1394  3000   84.9%     chr1   -       75323475   75323821   347
YourSeq  237    903    1347  3000   90.5%     chr2   -       152014630  152015241  612
YourSeq  237    903    1300  3000   93.5%     chr10  -       128332701  128364280  31580
YourSeq  205    903    1287  3000   88.8%     chr10  -       5815926    5816574    649
YourSeq  201    926    1296  3000   89.8%     chr11  -       88874121   88874652   532
YourSeq  176    895    1094  3000   94.5%     chr9   +       7851468    7851667    200
YourSeq  173    902    1093  3000   95.3%     chr5   -       100839446  100839637  192
YourSeq  171    903    1106  3000   93.5%     chr10  -       58581043   58581252   210
YourSeq  171    903    1093  3000   94.8%     chr12  +       20306364   20306554   191
YourSeq  170    898    1107  3000   88.5%     chr8   +       25745505   25745705   201
YourSeq  169    852    1093  3000   93.4%     chr11  -       53847685   53847977   293
YourSeq  168    889    1092  3000   89.9%     chr4   -       72230245   72230435   191
YourSeq  168    897    1094  3000   93.0%     chr8   +       26038380   26038592   213

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       78180808   78183807   3000
YourSeq  35     2637   2678  3000   92.9%     chr17  +       62895208   62895268   61
YourSeq  34     2636   2678  3000   94.8%     chr3   -       126903035  126903078  44
YourSeq  34     2641   2677  3000   97.3%     chr6   +       93546616   93546654   39
YourSeq  33     2644   2678  3000   97.2%     chr17  +       44721722   44721756   35
YourSeq  32     2640   2674  3000   97.2%     chr11  -       73991441   73991483   43
YourSeq  32     2645   2678  3000   97.1%     chr16  +       61490445   61490478   34
YourSeq  31     2636   2678  3000   86.5%     chr3   -       52020230   52020271   42
YourSeq  31     2640   2678  3000   92.2%     chr12  -       33926131   33926187   57
YourSeq  31     2640   2672  3000   97.0%     chr10  -       116112820  116112852  33
YourSeq  30     2645   2676  3000   96.9%     chr9   +       65406320   65406351   32
YourSeq  30     2640   2674  3000   94.2%     chr4   +       125246047  125246084  38
YourSeq  30     2637   2674  3000   78.8%     chr10  +       14751218   14751251   34
YourSeq  29     2641   2678  3000   96.8%     chr12  -       71424769   71424807   39
YourSeq  29     2646   2675  3000   100.0%    chr11  -       21272055   21272087   33
YourSeq  29     2640   2670  3000   96.8%     chr17  +       29190502   29190532   31
YourSeq  28     2647   2678  3000   93.8%     chrX   -       140877466  140877497  32
YourSeq  28     2637   2670  3000   93.8%     chr5   -       125397602  125397638  37
YourSeq  28     2645   2678  3000   91.2%     chr2   -       16396760   16396793   34
YourSeq  28     2645   2676  3000   87.1%     chr18  -       80493834   80493864   31

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
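The coordinates of the two 100% self-hits (the first row of each BLAT table) can be used to cross-check the stated cKO region size. The sketch below assumes the genomic START and END columns are 1-based and inclusive, which is consistent with the SPAN values in the tables. The `Hit` type and function names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Genomic placement of one BLAT hit (1-based, inclusive coordinates assumed)."""
    chrom: str
    strand: str
    start: int
    end: int


def span(hit: Hit) -> int:
    """Number of genomic bases covered by the hit."""
    return hit.end - hit.start + 1


def gap_between(a: Hit, b: Hit) -> tuple:
    """Return (start, end, length) of the region lying between two non-overlapping hits."""
    low, high = sorted([a, b], key=lambda h: h.start)
    return low.end + 1, high.start - 1, high.start - low.end - 1


# Self-hits copied from the first row of each BLAT table in the report.
upstream = Hit("chr10", "-", 78184421, 78187420)
downstream = Hit("chr10", "-", 78180808, 78183807)

print(span(upstream), span(downstream))
print(gap_between(upstream, downstream))
```

Under that coordinate assumption, the two 3000 bp flanks are separated by chr10:78183808-78184420, a 613 bp interval that agrees with the ~613 bp effective cKO region given in the strategy summary. Because the gene lies on the reverse strand, the upstream flank has the higher genomic coordinates.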


Gene and information:

Pwp2 PWP2 periodic tryptophan protein homolog (yeast) [Mus musculus (house mouse)]
Gene ID: 110816, updated on 12-Aug-2019

Gene summary

Official Symbol: Pwp2 (provided by MGI)
Official Full Name: PWP2 periodic tryptophan protein homolog (yeast) (provided by MGI)
Primary source: MGI:MGI:1341200
See related: Ensembl:ENSMUSG00000032834
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Pwp2h; wdp103; 6530411D08Rik
Expression: Ubiquitous expression in ovary adult (RPKM 6.6), CNS E18 (RPKM 6.0) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 10 C1; 10 39.72 cM

Exon count: 21

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (78170909..78185171, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (77633655..77647894, complement)

[Figure: genomic context of Pwp2 on chromosome 10 (NC_000076.6).]


Transcript information: This gene has 1 transcript

Gene: Pwp2 ENSMUSG00000032834

Description: PWP2 periodic tryptophan protein homolog (yeast) [Source:MGI Symbol;Acc:MGI:1341200]
Gene Synonyms: 6530411D08Rik, Pwp2, Pwp2h
Location: Chromosome 10: 78,170,909-78,185,149 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 1 transcript (splice variant), 197 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt        Flags
Pwp2-201  ENSMUST00000042556.10  3872  919aa    ENSMUSP00000045812.9  Protein coding  CCDS35958  Q2M1K2 Q8BU03  TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl genome browser view of a 34.24 kb region of chromosome 10 (78.17 Mb to 78.19 Mb; contig AC164573.5). Reverse-strand transcripts shown: Gatd3a-201, Gatd3a-203 and Gatd3a-205 (protein coding); Gatd3a-202 and Gatd3a-204 (retained intron); Pwp2-201 (protein coding); Trappc10-201 (protein coding); Trappc10-205 (retained intron). Regulatory Build legend: CTCF; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000042556

[Figure: Ensembl protein summary for Pwp2-201 (protein coding; reverse strand, 14.24 kb; ENSMUSP00000045...; scale 0 to 919 aa). Annotation tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: Quinoprotein alcohol dehydrogenase-like superfamily; SMART: WD40 repeat; Prints: G-protein beta WD-40 repeat; Pfam: WD40 repeat; Small-subunit processome, Utp12; PROSITE profiles: WD40 repeat; WD40-repeat-containing domain; PROSITE patterns: WD40 repeat, conserved site; PANTHER: Periodic tryptophan protein 2; Gene3D: WD40/YVTN repeat-like-containing domain superfamily; CDD: cd00200. Sequence variants (dbSNP and all other sources) legend: missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
