https://www.alphaknockout.com

Mouse Pdap1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Pdap1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:

The Pdap1 gene (NCBI Reference Sequence: NM_001033313; Ensembl: ENSMUSG00000029623) is located on Mouse chromosome 5. 6 exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 6 (Transcript: ENSMUST00000031627). Exon 2~3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Pdap1 gene.

To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-327C12 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 2 starts from about 2.58% of the coding region. The knockout of Exon 2~3 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion: 2973 bp, and the size of intron 3 for 3'-loxP site insertion: 2084 bp. The size of the effective cKO region: ~2429 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (numbered exons, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Pdap1; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
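The self-alignment described in the note above can be sketched as follows. This is a minimal illustration that assumes exact matching of fixed-length windows; the function names `revcomp` and `self_dot_plot` are hypothetical, and the tool used to produce the report's dot plot is not stated in the source.

```python
def revcomp(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) sets of (i, j) window start positions.

    forward: the window at i equals the window at j, with i != j
             (the trivial main diagonal is left out).
    reverse: the window at i equals the reverse complement of the window at j.
    """
    # Index every window by its sequence so matches are found by lookup.
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = set(), set()
    for j in range(len(seq) - window + 1):
        word = seq[j:j + window]
        for i in index.get(word, []):
            if i != j:
                forward.add((i, j))
        for i in index.get(revcomp(word), []):
            reverse.add((i, j))
    return forward, reverse


# Toy input (not the Pdap1 sequence): two tandem copies of a 7 bp unit.
toy_forward, toy_reverse = self_dot_plot("GATTACAGATTACA", window=7)
```

Forward matches lying on a line parallel to the main diagonal, at an offset equal to the repeat unit length, are the signature of a tandem repeat.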

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length: 8929 bp | A: 2073 (23.22%) | C: 2166 (24.26%) | T: 2402 (26.90%) | G: 2288 (25.62%)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
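As a worked check, the overall GC fraction follows directly from the base counts in the summary, and a windowed profile can be computed with a running count. This is a minimal sketch: the function names are hypothetical, and stepping the 300 bp window one base at a time is an assumption, since the report does not state the step size.

```python
# Base counts taken from the summary line of this report.
base_counts = {"A": 2073, "C": 2166, "T": 2402, "G": 2288}


def gc_fraction_from_counts(counts):
    """Overall GC fraction: (G + C) / total number of bases."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300):
    """GC fraction of every full window, moving one base at a time."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    gc = sum(1 for base in seq[:window] if base in "GC")
    profile = [gc / window]
    for i in range(window, len(seq)):
        # Add the base entering the window, drop the base leaving it.
        gc += (seq[i] in "GC") - (seq[i - window] in "GC")
        profile.append(gc / window)
    return profile


overall_gc = gc_fraction_from_counts(base_counts)  # 4454 / 8929, about 49.88%
```

The overall value is close to 50%, so the high-GC regions mentioned in the note are a local feature that only a windowed profile would show.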


BLAT Search Results (up)

QUERY     SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq    3000      1   3000   3000    100.0%  chr5   -       145137244  145140243    3000
YourSeq      56   2896   2973   3000     85.8%  chr6   -        22440582   22440652      71
YourSeq      55   2895   2976   3000     86.7%  chr5   +       140729360  140729434      75
YourSeq      52   2243   2429   3000     88.3%  chr11  -       102077427  102077614     188
YourSeq      51   2900   2973   3000     83.7%  chr14  -        76000237   76000302      66
YourSeq      51   2344   2453   3000     91.9%  chr13  -        37656212   37864311  208100
YourSeq      50   1714   1767   3000     98.2%  chr8   -        35259170   35259244      75
YourSeq      50   1711   1766   3000     96.4%  chr1   +       134068804  134068867      64
YourSeq      48   2347   2431   3000     85.3%  chr12  -       105641582  105641667      86
YourSeq      47   2357   2431   3000     88.6%  chr16  -        96212509   96212584      76
YourSeq      45   2896   2983   3000     96.0%  chr2   +       117109805  117109897      93
YourSeq      45   2339   2431   3000     84.0%  chr14  +        17953662   17953752      91
YourSeq      45   2347   2433   3000     94.2%  chr11  +        88134139   88134226      88
YourSeq      45   1714   1766   3000     96.1%  chr1   +        30419739   30419801      63
YourSeq      42   2896   2950   3000     80.9%  chr12  -        91701503   91701551      49
YourSeq      42   2896   2951   3000     79.6%  chr13  +        25004786   25004834      49
YourSeq      41   2896   2949   3000     80.9%  chr10  -        41957090   41957137      48
YourSeq      41   2896   2941   3000     90.7%  chr11  +        95921015   95921058      44
YourSeq      40   2896   2940   3000     90.5%  chr11  +       103914651  103914693      43
YourSeq      40   2357   2419   3000     91.7%  chr10  +        83724880   83724943      64

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY     SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq    3000      1   3000   3000    100.0%  chr5   -       145131815  145134814    3000
YourSeq     163   1278   1485   3000     90.2%  chr15  -        83335785   83335989     205
YourSeq     163   1277   1570   3000     91.9%  chr7   +        64009529   64010147     619
YourSeq     160   1271   1463   3000     92.1%  chr8   +        95773667   95773861     195
YourSeq     158   1277   1462   3000     90.2%  chr18  +        38173398   38173579     182
YourSeq     158   1277   1472   3000     93.1%  chr10  +        18399385   18399894     510
YourSeq     156   1277   1463   3000     90.5%  chr5   -        27347300   27347479     180
YourSeq     156   1276   1463   3000     89.2%  chr11  -       101715075  101715258     184
YourSeq     155   1276   1463   3000     89.7%  chr18  -        23978155   23978339     185
YourSeq     155   1279   1463   3000     91.9%  chr12  -        85989400   85989584     185
YourSeq     155   1279   1463   3000     93.3%  chr1   -       191948241  191948433     193
YourSeq     154   1278   1463   3000     92.8%  chr19  +        30240868   30241054     187
YourSeq     154   1278   1463   3000     91.9%  chr15  +        99836212   99836397     186
YourSeq     154   1277   1463   3000     91.4%  chr12  +        59263902   59264091     190
YourSeq     154   1277   1462   3000     89.6%  chr10  +       117713896  117714077     182
YourSeq     154   1259   1462   3000     86.7%  chr1   +        58243337   58243525     189
YourSeq     153   1277   1462   3000     91.2%  chr8   -        91179849   91180033     185
YourSeq     153   1287   1463   3000     93.3%  chr17  -        28656180   28656356     177
YourSeq     153   1277   1463   3000     90.0%  chr16  -        35419879   35420061     183
YourSeq     153   1277   1463   3000     92.3%  chr15  -        81350202   81350396     195

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
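The top hit in each BLAT table places a 3000 bp flank on chr5, and the interval between the two flanks can be compared with the stated size of the effective cKO region. This is a minimal sketch: it assumes the listed coordinates are 1-based and inclusive (consistent with SPAN = END - START + 1 in both top rows), and the helper names are hypothetical.

```python
# Top BLAT hits from the two tables (chr5, minus strand).
# Pdap1 lies on the reverse strand, so the flank upstream of Exon 2
# has the higher genomic coordinates.
up_flank = (145137244, 145140243)    # 3000 bp upstream of Exon 2
down_flank = (145131815, 145134814)  # 3000 bp downstream of Exon 3


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(lower, upper):
    """Inclusive interval strictly between two non-overlapping intervals."""
    return lower[1] + 1, upper[0] - 1


cko_interval = gap_between(down_flank, up_flank)  # (145134815, 145137243)
cko_size = span(*cko_interval)                    # 2429
```

Under these assumptions the interval between the flanks is 2429 bp, which agrees with the effective cKO region size given in the strategy summary.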


Gene and information:

Pdap1 PDGFA associated protein 1 [Mus musculus (house mouse)]
Gene ID: 231887, updated on 12-Aug-2019

Gene summary

Official Symbol: Pdap1 (provided by MGI)
Official Full Name: PDGFA associated protein 1 (provided by MGI)
Primary source: MGI:MGI:2448536
See related: Ensembl:ENSMUSG00000029623
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: PAP; PAP1; HASPP28
Expression: Ubiquitous expression in CNS E11.5 (RPKM 51.8), CNS E14 (RPKM 32.2) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 5; 5 G2

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (145128770..145140025, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (145889639..145900958, complement)

Chromosome 5 - NC_000071.6


Transcript information: This gene has 5 transcripts

Gene: Pdap1 ENSMUSG00000029623

Description: PDGFA associated protein 1 [Source:MGI Symbol;Acc:MGI:2448536]
Gene Synonyms: HASPP28, PAP1
Location: Chromosome 5: 145,128,769-145,140,238 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 5 transcripts (splice variants), 207 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt        Flags
Pdap1-201  ENSMUST00000031627.8  2353  181aa       ENSMUSP00000031627.8  Protein coding   CCDS39384  B2RTB0 Q3UHX2  TSL:1, GENCODE basic, APPRIS P1
Pdap1-205  ENSMUST00000199218.1  2452  No protein  -                     Retained intron  -          -              TSL:NA
Pdap1-203  ENSMUST00000143439.1  676   No protein  -                     Retained intron  -          -              TSL:2
Pdap1-204  ENSMUST00000148781.1  448   No protein  -                     Retained intron  -          -              TSL:2
Pdap1-202  ENSMUST00000123111.7  1017  No protein  -                     lncRNA           -          -              TSL:1


[Figure: Ensembl region view of chromosome 5, 145.12 Mb to 145.15 Mb (31.47 kb). Forward strand: Arpc1b transcripts (Arpc1b-201 to Arpc1b-211) and Bud31 transcripts (Bud31-201 to Bud31-207). Contigs: AC110556.15, AC127411.3. Reverse strand: Pdap1-201 (protein coding), Pdap1-202 (lncRNA), Pdap1-203, Pdap1-204 and Pdap1-205 (retained intron), and Ptcd1-201 (protein coding). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000031627

[Figure: Pdap1-201 (protein coding), reverse strand, 11.47 kb. Protein ENSMUSP00000031... shown on a scale of 0 to 181 residues with the following tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Pfam: Casein kinase substrate, phosphoprotein PP28; PANTHER: 28kDa heat- and acid-stable phosphoprotein (PTHR22055:SF5); sequence variants (dbSNP and all other sources). Variant legend: synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
