https://www.alphaknockout.com

Mouse Dpp10 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Dpp10 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Dpp10 gene (NCBI Reference Sequence: NM_199021; Ensembl: ENSMUSG00000036815) is located on mouse chromosome 1. 26 exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 26 (Transcript: ENSMUST00000112606). Exon 8 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Dpp10 gene. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 8 starts at about 24.54% of the coding region. Knockout of exon 8 will result in a frameshift of the gene. The size of intron 7 for 5'-loxP site insertion is 40572 bp, and the size of intron 8 for 3'-loxP site insertion is 12115 bp. The size of the effective cKO region is ~621 bp. The cKO region does not contain any other known gene.
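The frameshift statement in the note rests on simple arithmetic: deleting an exon shifts the reading frame when its coding length is not a multiple of 3. The Python sketch below illustrates that check. The exon length it uses is a hypothetical placeholder, because this report does not state the length of exon 8. The position estimate assumes that the coding region is the 800 aa protein of transcript ENSMUST00000112606 plus one stop codon.

```python
def causes_frameshift(exon_coding_length: int) -> bool:
    """True if deleting an exon of this coding length shifts the reading frame."""
    return exon_coding_length % 3 != 0


def cds_position(fraction: float, protein_length_aa: int) -> int:
    """Approximate CDS nucleotide position at a given fraction of the coding region.

    Assumption: CDS length = 3 * protein length + 3 (one stop codon).
    """
    cds_length = protein_length_aa * 3 + 3
    return round(fraction * cds_length)


# Hypothetical exon length, for illustration only (not taken from the report).
HYPOTHETICAL_EXON_LENGTH = 100

print(causes_frameshift(HYPOTHETICAL_EXON_LENGTH))
# Approximate start of exon 8 within the CDS, from the 24.54% figure in the note.
print(cds_position(0.2454, 800))
```

To confirm the note for the real locus, replace the placeholder with the exon 8 length taken from the transcript annotation.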


Overview of the Targeting Strategy

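[Figure: schematic of the targeting strategy, showing four alleles from 5' to 3': the wildtype allele (exons 1, 8 and 26 labeled, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Dpp10; homology arm; cKO region; loxP site.]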


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself. Legend: Forward; Reverse Complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
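A self-alignment of this kind can be sketched in a few lines of Python. The sketch below is not the tool used for this report. It only shows the idea of comparing every 10 bp window of a sequence with every other window, on the forward strand and as a reverse complement. The function names and the toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start pairs.

    forward: windows at i < j with identical sequence (direct repeats).
    reverse: the window at i equals the reverse complement of the window at j.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in positions[word] if j > i)
        reverse.extend((i, j) for j in positions.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence with one planted 10 bp direct repeat at positions 0 and 14.
toy = "ACGTTGCAAG" + "TTTT" + "ACGTTGCAAG"
forward_hits, reverse_hits = self_dot_plot(toy, window=10)
print(forward_hits)
```

Long diagonal runs of consecutive pairs in `forward_hits` would correspond to tandem or dispersed repeats in the plot. An empty or sparse result is consistent with the note above.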

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(7121bp) | A(28.97% 2063) | C(16.6% 1182) | T(37.21% 2650) | G(17.22% 1226)
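The summary figures can be cross-checked directly from the base counts. The short Python sketch below recomputes the percentages and derives the overall GC content, a value the report does not state explicitly.

```python
# Base counts as given in the summary line.
counts = {"A": 2063, "C": 1182, "T": 2650, "G": 1226}

total = sum(counts.values())  # should equal the stated full length, 7121 bp
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

print(total)
print(percent)
print(gc_percent)
```

The counts sum to the stated full length, and the derived overall GC content is roughly 34%.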

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
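A windowed GC scan like the one summarized here can be sketched as follows. This is an illustration only, not the tool used for the report. The 300 bp window matches the stated window size, while the step size and the 70% threshold for a "high GC" window are assumptions chosen for the example.

```python
def gc_percent(seq: str) -> float:
    """GC content of a sequence, in percent."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 50):
    """Return (start, GC%) for each full window along the sequence."""
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq: str, window: int = 300, step: int = 50,
                    threshold: float = 70.0):
    """Windows whose GC% is at or above the (assumed) threshold."""
    return [(s, gc) for s, gc in sliding_gc(seq, window, step) if gc >= threshold]


# Toy sequence: AT-rich flanks around a 300 bp GC-only block starting at position 600.
toy = "AT" * 300 + "GC" * 150 + "AT" * 300
print(high_gc_windows(toy))
```

An empty result from `high_gc_windows` on the real 7121 bp sequence would match the conclusion in the note, under the assumed threshold.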


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       123445572  123448571  3000
YourSeq  351    639    2188  3000   83.5%     chr2   +       4694554    4695997    1444
YourSeq  344    39     2150  3000   88.7%     chr17  +       58805429   59195463   390035
YourSeq  316    1157   2188  3000   82.4%     chr1   -       65705849   65706966   1118
YourSeq  251    1089   2339  3000   77.5%     chr8   +       78875768   78876948   1181
YourSeq  227    19     2151  3000   79.0%     chr3   +       8346157    8371117    24961
YourSeq  223    1616   2122  3000   82.4%     chr13  +       113073171  113073682  512
YourSeq  219    1527   2438  3000   88.4%     chr10  +       51751982   51752942   961
YourSeq  199    1232   2009  3000   82.3%     chr17  -       33856283   33856986   704
YourSeq  195    1      667   3000   82.5%     chr15  +       4192026    4192771    746
YourSeq  194    1488   2151  3000   84.2%     chr13  +       13737053   13737736   684
YourSeq  193    691    1632  3000   83.8%     chr16  +       21585134   21586021   888
YourSeq  192    691    1249  3000   79.8%     chrX   -       159052464  159052954  491
YourSeq  191    51     707   3000   79.2%     chr9   -       88174191   88174805   615
YourSeq  187    1678   2189  3000   81.9%     chr13  -       4045509    4046013    505
YourSeq  185    1780   2536  3000   89.4%     chr4   +       10021512   10022376   865
YourSeq  184    1351   2192  3000   81.2%     chr17  +       88682260   88683062   803
YourSeq  177    19     449   3000   85.8%     chr8   -       106002313  106002774  462
YourSeq  177    1773   2194  3000   85.1%     chr1   +       4684250    4684678    429
YourSeq  174    1650   2151  3000   86.4%     chr10  -       26398326   26398845   520

Note: The 3000 bp section upstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       123441951  123444950  3000
YourSeq  99     2150   2325  3000   91.6%     chr12  -       38554636   38554815   180
YourSeq  95     2217   2333  3000   95.5%     chr10  +       116985948  116986198  251
YourSeq  93     2150   2328  3000   93.7%     chr13  +       58625270   58953291   328022
YourSeq  80     2150   2293  3000   96.5%     chr10  -       109328691  109328898  208
YourSeq  78     332    461   3000   86.2%     chr15  -       76473270   76473721   452
YourSeq  75     2151   2336  3000   96.5%     chr12  +       111403786  111404109  324
YourSeq  72     2190   2326  3000   94.0%     chr10  +       86311335   86311480   146
YourSeq  54     2152   2300  3000   72.6%     chr13  -       114605520  114605608  89
YourSeq  50     1584   1746  3000   96.3%     chr6   -       137613071  137613245  175
YourSeq  49     352    464   3000   86.8%     chr4   -       34909010   34909320   311
YourSeq  46     341    402   3000   83.7%     chr1   -       100242232  100242292  61
YourSeq  46     1683   1740  3000   86.0%     chr2   +       5822698    5822754    57
YourSeq  46     352    464   3000   83.9%     chr1   +       17230759   17230879   121
YourSeq  42     369    428   3000   85.0%     chr10  -       59348696   59348755   60
YourSeq  41     2149   2304  3000   67.4%     chr1   -       151701956  151702045  90
YourSeq  41     1584   1739  3000   95.6%     chr1   +       152919207  152919369  163
YourSeq  40     1584   1636  3000   93.5%     chr12  -       24773471   24773525   55
YourSeq  40     2158   2333  3000   63.7%     chr11  +       91869844   91869909   66
YourSeq  40     2182   2326  3000   84.5%     chr1   +       172777374  172777514  141

Note: The 3000 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
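BLAT result rows like those above can be screened programmatically. The Python sketch below parses whitespace-separated rows in the column order shown (QUERY, SCORE, query START and END, QSIZE, IDENTITY, CHROM, STRAND, target START and END, SPAN), separates the expected self-match from the remaining hits, and reports how much of the query each hit covers. The `BlatHit` type and the helper names are hypothetical, and the report does not state which criteria were used to judge significance.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one result row; the first field is the query name and is skipped."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_match(hit: BlatHit) -> bool:
    """Full-length, 100% identity hit: the query matching its own locus."""
    return hit.identity == 100.0 and hit.q_start == 1 and hit.q_end == hit.q_size


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the hit."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the downstream search, copied from the table above.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr1 - 123441951 123444950 3000",
    "YourSeq 99 2150 2325 3000 91.6% chr12 - 38554636 38554815 180",
    "YourSeq 95 2217 2333 3000 95.5% chr10 + 116985948 116986198 251",
]
hits = [parse_hit(r) for r in rows]
other_hits = [h for h in hits if not is_self_match(h)]
for h in other_hits:
    print(h.chrom, h.score, f"{query_coverage(h):.1%}")
```

Hits that cover only a small fraction of the 3000 bp query, as in the rows used here, are the kind of partial match that the note treats as not significant.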


Gene and information:

Dpp10 dipeptidylpeptidase 10 [Mus musculus (house mouse)]
Gene ID: 269109, updated on 7-Oct-2019

Gene summary

Official Symbol: Dpp10 (provided by MGI)
Official Full Name: dipeptidylpeptidase 10 (provided by MGI)
Primary source: MGI:MGI:2442409
See related: Ensembl:ENSMUSG00000036815
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: DPP X; Dprp3; 6430601K09Rik
Expression: Biased expression in frontal lobe adult (RPKM 14.7), cortex adult (RPKM 13.3) and 4 other tissues
Orthologs: human, all

Genomic context

Location: 1; 1 E2.3

Exon count: 28

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (123332138..124845815, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (125228715..125942136, complement)

[Figure: genomic context of Dpp10 on Chromosome 1 - NC_000067.6.]


Transcript information: This gene has 6 transcripts

Gene: Dpp10 ENSMUSG00000036815

Description: dipeptidylpeptidase 10 [Source:MGI Symbol;Acc:MGI:2442409]
Gene Synonyms: 6430601K09Rik, DPRP3
Location: Chromosome 1: 123,321,471-124,846,039 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 6 transcripts (splice variants), 244 orthologues, 6 paralogues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name       Transcript ID         bp     Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Dpp10-202  ENSMUST00000112606.7  15297  800aa       ENSMUSP00000108225.1  Protein coding  CCDS15241  E9QN98   TSL:1 GENCODE basic APPRIS P2
Dpp10-206  ENSMUST00000239072.1  4759   796aa       ENSMUSP00000159007.1  Protein coding  -          -        GENCODE basic APPRIS ALT1
Dpp10-201  ENSMUST00000112603.3  4477   789aa       ENSMUSP00000108222.2  Protein coding  -          D3Z5I7   TSL:5 GENCODE basic APPRIS ALT2
Dpp10-205  ENSMUST00000187286.1  1443   No protein  -                     lncRNA          -          -        TSL:1
Dpp10-203  ENSMUST00000140361.2  656    No protein  -                     lncRNA          -          -        TSL:3
Dpp10-204  ENSMUST00000187202.1  285    No protein  -                     lncRNA          -          -        TSL:3


[Figure: Ensembl region view of the Dpp10 locus, spanning 1.54 Mb (about 123.4 Mb to 124.8 Mb) with forward and reverse strand tracks. Tracks: contigs (AC101848.8, AC101931.8); genes (comprehensive set), including the six Dpp10 transcripts (Dpp10-201 to Dpp10-206) on the reverse strand together with overlapping TEC, processed pseudogene, snRNA and snoRNA annotations; Regulatory Build. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; processed transcript; RNA gene).]


Transcript: ENSMUST00000112606

[Figure: protein view of Dpp10-202 (protein coding; reverse strand, 724.09 kb; scale bar 0 to 800). Tracks: transmembrane helix; MobiDB lite; low complexity (Seg); Superfamily SSF82171 and Alpha/Beta hydrolase fold; Pfam Dipeptidylpeptidase IV, N-terminal domain and Peptidase S9, prolyl oligopeptidase, catalytic domain; PANTHER PTHR11731 and PTHR11731:SF21; Gene3D Dipeptidylpeptidase IV, N-terminal domain superfamily and Alpha/Beta hydrolase fold; sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
