https://www.alphaknockout.com

Mouse Dpp3 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Dpp3 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Dpp3 gene (NCBI Reference Sequence: NM_133803; Ensembl: ENSMUSG00000063904) is located on mouse chromosome 19. Eighteen exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 18 (Transcript: ENSMUST00000025851). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Dpp3 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-237E11 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 2 starts from about 100% of the coding region. The knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 1160 bp, and the size of intron 2 for 3'-loxP site insertion is 3011 bp. The size of the effective cKO region is ~770 bp. The cKO region does not contain any other known gene.

Overview of the Targeting Strategy

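[Figure: schematic of the targeting strategy. It shows the wildtype allele (exons 1, 2 and 18 labeled, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Dpp3; cKO region; loxP site.]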
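The step from the targeted allele to the constitutive KO allele can be illustrated with a minimal sketch. It assumes both loxP sites sit in the same orientation, so that Cre excises the floxed segment and leaves a single loxP site behind. The names `LOXP` and `cre_recombine` and the toy flanking sequences are hypothetical and are not part of this project's design files.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_recombine(allele: str, loxp: str = LOXP) -> str:
    """Return the allele after Cre-mediated excision between two direct loxP sites.

    Assumption: the two sites are in the same orientation. If fewer than two
    sites are present, the allele is returned unchanged.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything up to the first site, then resume at the second site,
    # so exactly one loxP copy remains and the floxed segment is removed.
    return allele[:first] + allele[second:]


# Toy example: the upper-case bases stand in for intron sequence,
# the lower-case block stands in for the floxed exon.
targeted = "AAAA" + LOXP + "ggggcccc" + LOXP + "TTTT"
ko_allele = cre_recombine(targeted)
```

In this toy case `ko_allele` retains both flanks and one loxP site, which mirrors the constitutive KO allele drawn in the schematic.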

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself (axes labeled "Sequence 12"), showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
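The idea behind a self-alignment dot plot can be sketched as follows. This is a naive exact-match version that uses the 10 bp window size stated above and the forward strand only. The tool actually used for the report is not named, and the function names here are hypothetical.

```python
from collections import defaultdict


def dot_plot(seq: str, window: int = 10) -> set[tuple[int, int]]:
    """Return (i, j) pairs where the windows starting at i and j are identical."""
    seq = seq.upper()
    starts_by_word = defaultdict(list)
    for i in range(len(seq) - window + 1):
        starts_by_word[seq[i:i + window]].append(i)
    # Every pair of positions sharing the same word is one dot in the matrix.
    return {
        (i, j)
        for starts in starts_by_word.values()
        for i in starts
        for j in starts
    }


def repeat_dots(dots: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """Drop the main diagonal (i == j); what remains indicates repeated sequence."""
    return {(i, j) for (i, j) in dots if i != j}


# Toy input: a 10 bp unit repeated three times in tandem.
unit = "ACGTTGCAAG"
tandem = repeat_dots(dot_plot(unit * 3))
```

Tandem repeats show up as lines parallel to the main diagonal, offset by the repeat length. A sequence without repeats leaves `repeat_dots` empty.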

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis labeled "Sequence 12").]

Summary: Full Length(7270bp) | A(25.02% 1819) | C(21.87% 1590) | T(26.84% 1951) | G(26.27% 1910)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
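The summary figures can be cross-checked with a few lines of arithmetic, and the windowed analysis can be sketched in the same way. The base counts below are copied from the summary line. The function names are hypothetical, and the sliding window (step of 1 bp, 300 bp wide) is an assumption about how the distribution was computed.

```python
# Base counts as reported in the summary line of this report.
REPORTED_COUNTS = {"A": 1819, "C": 1590, "T": 1951, "G": 1910}


def composition_percent(counts: dict[str, int]) -> dict[str, float]:
    """Convert raw base counts to percentages rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_by_window(seq: str, window: int = 300) -> list[float]:
    """GC fraction of every window of the given size, sliding by 1 bp."""
    seq = seq.upper()
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(len(seq) - window + 1)
    ]


total_bp = sum(REPORTED_COUNTS.values())
percentages = composition_percent(REPORTED_COUNTS)
overall_gc = round(
    100 * (REPORTED_COUNTS["G"] + REPORTED_COUNTS["C"]) / total_bp, 2
)
```

The counts add up to the stated full length of 7270 bp, and the overall GC content derived from them is about 48%.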

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  -       4927305    4930304    3000
YourSeq  75     210    432   3000   75.0%     chr11  -       29615693   29615796   104
YourSeq  72     308    405   3000   93.0%     chr6   +       125800046  125800168  123
YourSeq  70     305    457   3000   79.5%     chr11  -       67494066   67494141   76
YourSeq  69     210    436   3000   80.0%     chr12  -       24105575   24105776   202
YourSeq  60     327    436   3000   84.6%     chr3   +       26635979   26636239   261
YourSeq  50     114    236   3000   89.1%     chr14  +       69962213   69962335   123
YourSeq  46     306    394   3000   94.4%     chr11  -       24252034   24252545   512
YourSeq  46     365    432   3000   88.3%     chr13  +       78415927   78415992   66
YourSeq  43     190    238   3000   95.9%     chr14  -       51770280   51770332   53
YourSeq  42     190    238   3000   93.7%     chr14  +       51099575   51099624   50
YourSeq  41     187    236   3000   92.0%     chr16  -       32995027   32995100   74
YourSeq  41     114    222   3000   95.6%     chr16  +       23643826   23643956   131
YourSeq  41     385    440   3000   93.7%     chr11  +       101877149  101877211  63
YourSeq  40     190    238   3000   91.9%     chr1   -       93806379   93806428   50
YourSeq  40     190    237   3000   83.8%     chr8   +       24620420   24620463   44
YourSeq  39     192    237   3000   93.5%     chr6   -       53297722   53297768   47
YourSeq  38     187    238   3000   88.0%     chr1   -       139447827  139447880  54
YourSeq  38     190    236   3000   90.7%     chr11  +       99227379   99227424   46
YourSeq  36     190    240   3000   79.6%     chr7   -       34369826   34369874   49

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  -       4923535    4926534    3000
YourSeq  261    1186   1559  3000   95.9%     chr7   +       19189106   19189700   595
YourSeq  253    1157   1541  3000   88.7%     chr17  -       35169401   35169781   381
YourSeq  249    1149   1555  3000   97.8%     chr16  -       90984789   90985288   500
YourSeq  240    1170   1542  3000   88.3%     chr17  +       35541905   35542267   363
YourSeq  236    1340   1783  3000   91.4%     chr1   +       180799653  180800008  356
YourSeq  230    1338   1762  3000   99.2%     chr5   +       115149115  115149745  631
YourSeq  222    1238   1541  3000   97.5%     chr8   -       70355098   70355422   325
YourSeq  213    1325   1541  3000   99.1%     chr10  +       59994040   59994256   217
YourSeq  212    1157   1541  3000   93.5%     chr12  -       76855266   76855811   546
YourSeq  211    1325   1541  3000   98.7%     chr16  +       33103036   33103252   217
YourSeq  210    1332   1541  3000   100.0%    chr4   -       127099497  127099706  210
YourSeq  209    1333   1541  3000   100.0%    chr15  -       58091529   58091737   209
YourSeq  209    1192   1784  3000   92.4%     chr15  +       81446230   81446930   701
YourSeq  208    1334   1541  3000   100.0%    chr2   -       63047259   63047466   208
YourSeq  208    1331   1542  3000   99.1%     chr2   -       28605250   28605461   212
YourSeq  208    1341   1549  3000   100.0%    chr1   -       52691392   52691607   216
YourSeq  207    1273   1541  3000   98.2%     chr5   -       108119030  108119433  404
YourSeq  205    1335   1541  3000   99.6%     chr4   -       104806954  104807160  207
YourSeq  205    1333   1541  3000   99.1%     chr13  -       92644116   92644324   209

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
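A short sketch of how the BLAT rows might be parsed and screened. The field order follows the column header of the results (QUERY, SCORE, query START/END, QSIZE, IDENTITY, CHROM, STRAND, genomic START/END, SPAN), and the sample rows are copied from this report. The names `BlatHit`, `parse_hit` and `is_self_hit` are hypothetical. Treating a hit as the query's own locus when the score equals the query size at 100% identity is an assumption of this sketch.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:  # link text left over from the web page
        fields = fields[2:]
    f = fields[1:]  # drop the query name ("YourSeq")
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def is_self_hit(hit: BlatHit) -> bool:
    """Assumption: a full-length, 100% identity hit is the query's own locus."""
    return hit.score == hit.q_size and hit.identity == 100.0


UP_SELF = "YourSeq 3000 1 3000 3000 100.0% chr19 - 4927305 4930304 3000"
DOWN_ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr19 - 4923535 4926534 3000",
    "YourSeq 261 1186 1559 3000 95.9% chr7 + 19189106 19189700 595",
    "YourSeq 253 1157 1541 3000 88.7% chr17 - 35169401 35169781 381",
]

up = parse_hit(UP_SELF)
down_hits = [parse_hit(row) for row in DOWN_ROWS]
down = next(h for h in down_hits if is_self_hit(h))
secondary = [h for h in down_hits if not is_self_hit(h)]

# Bases lying strictly between the two 3000 bp query windows on chr19.
gap_bp = up.t_start - down.t_end - 1
```

With the coordinates above, `gap_bp` evaluates to 770, which agrees with the ~770 bp effective cKO region stated in the strategy summary, assuming the two query windows directly flank that region.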

Gene and information: Dpp3 dipeptidylpeptidase 3 [ Mus musculus (house mouse) ] Gene ID: 75221, updated on 10-Oct-2019

Gene summary

Official Symbol: Dpp3 (provided by MGI)
Official Full Name: dipeptidylpeptidase 3 (provided by MGI)
Primary source: MGI:MGI:1922471
See related: Ensembl:ENSMUSG00000063904
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C86324; DPP III; 4930533O14Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 86.3), colon adult (RPKM 70.1) and 28 other tissues
Orthologs: human

Genomic context

Location: 19; 19 A

Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  19   NC_000085.6 (4907229..4928287, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    19   NC_000085.5 (4907229..4928287, complement)

Chromosome 19 - NC_000085.6

Transcript information: This gene has 10 transcripts

Gene: Dpp3 ENSMUSG00000063904

Description: dipeptidylpeptidase 3 [Source:MGI Symbol;Acc:MGI:1922471]
Gene Synonyms: 4930533O14Rik
Location: Chromosome 19: 4,907,229-4,928,287 reverse strand. GRCm38:CM001012.2
About this gene: This gene has 10 transcripts (splice variants), 173 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Dpp3-201  ENSMUST00000025851.3  2683  738aa       ENSMUSP00000025851.3  Protein coding           CCDS29443  Q99KK7      TSL:1, GENCODE basic, APPRIS P1
Dpp3-209  ENSMUST00000237394.1  726   191aa       ENSMUSP00000158386.1  Protein coding           -          A0A494BBB3  CDS 3' incomplete
Dpp3-205  ENSMUST00000236496.1  556   185aa       ENSMUSP00000157895.1  Protein coding           -          A0A494BA16  CDS 5' and 3' incomplete
Dpp3-206  ENSMUST00000236758.1  505   153aa       ENSMUSP00000157448.1  Protein coding           -          A0A494B918  CDS 3' incomplete
Dpp3-204  ENSMUST00000235781.1  2556  455aa       ENSMUSP00000158453.1  Nonsense mediated decay  -          A0A494BBC1  -
Dpp3-202  ENSMUST00000235191.1  4508  No protein  -                     Retained intron          -          -           -
Dpp3-203  ENSMUST00000235777.1  2746  No protein  -                     Retained intron          -          -           -
Dpp3-207  ENSMUST00000236782.1  917   No protein  -                     Retained intron          -          -           -
Dpp3-210  ENSMUST00000237930.1  825   No protein  -                     Retained intron          -          -           -
Dpp3-208  ENSMUST00000236955.1  806   No protein  -                     lncRNA                   -          -           -

[Figure: Ensembl genome browser view of a 41.06 kb region of chromosome 19 (4.90 Mb to 4.93 Mb). Forward strand: Gm25334-201 (snRNA). Contigs: AC141437.4, AC124502.4. Genes (Comprehensive set), all drawn on the reverse strand: Bbs1 (transcripts 201 to 208), Dpp3 (transcripts 201 to 210) and Peli3 (transcripts 201 to 205 and 207). A Regulatory Build track is included. Regulation legend: CTCF; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000025851

[Figure: structure of transcript Dpp3-201 (protein coding) on the reverse strand, spanning 21.06 kb, with the accompanying protein picture.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
