https://www.alphaknockout.com

Mouse Mpdu1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Mpdu1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mpdu1 gene (NCBI Reference Sequence: NM_011900; Ensembl: ENSMUSG00000018761) is located on mouse chromosome 11. Seven exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 7 (Transcript: ENSMUST00000018905). Exons 2~3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Mpdu1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-51O13 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 14.04% of the coding region. Knockout of exons 2~3 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 3592 bp, and the size of intron 3 for 3'-loxP site insertion is 544 bp. The size of the effective cKO region is ~811 bp. The cKO region does not contain any other known gene.
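The frameshift statement in the note rests on a simple rule: removing coding sequence whose total length is not a multiple of 3 shifts the downstream reading frame. The sketch below illustrates only that rule. The function name and the exon lengths are hypothetical placeholders and are not taken from the Mpdu1 annotation.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting exons with these coding lengths (nt) shifts the frame.

    A deletion keeps the reading frame only when the total number of
    removed coding nucleotides is divisible by 3.
    """
    return sum(deleted_coding_lengths) % 3 != 0


# Hypothetical coding lengths for two deleted exons (NOT the real Mpdu1 values).
example_lengths = [100, 150]
print(causes_frameshift(example_lengths))
```

To apply this to the actual project, the coding lengths of exons 2 and 3 would need to be read from the ENSMUST00000018905 annotation.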


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: wildtype allele (exons 1-7, with the 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Mpdu1; homology arm; cKO region; exon of mouse Sox15; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
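A self-alignment dot plot of this kind can be approximated by indexing every window of the sequence and reporting positions that share the same window, both on the forward strand and against the reverse complement. The following is a minimal sketch under that assumption, not the tool used to produce the figure. The function names and the toy sequence are illustrative only.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of matching window start positions.

    forward: pairs (i, j), i < j, where seq[i:i+window] == seq[j:j+window].
    reverse: pairs (i, j) where seq[i:i+window] equals the window starting
             at position j of the reverse complement of seq.
    """
    seq = seq.upper()
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    # Off-diagonal forward matches indicate direct (including tandem) repeats.
    forward = sorted(
        (i, j) for positions in index.values()
        for i in positions for j in positions if i < j
    )

    rc = reverse_complement(seq)
    reverse = sorted(
        (i, j)
        for j in range(len(rc) - window + 1)
        for i in index.get(rc[j:j + window], [])
    )
    return forward, reverse


# Toy sequence: one 10 bp unit repeated twice in tandem.
toy = "ACGGTCATGC" * 2
forward_hits, reverse_hits = self_dot_plot(toy, window=10)
print(forward_hits)
```

Forward hits that line up on diagonals parallel and close to the main diagonal correspond to the tandem repeats mentioned in the note.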

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7311bp) | A(23.53% 1720) | C(24.28% 1775) | T(25.59% 1871) | G(26.6% 1945)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
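The percentages in the summary follow directly from the base counts, and a windowed GC profile can be computed the same way. The sketch below recomputes the composition from the counts reported above and shows one possible sliding-window calculation. The function names and the `step` parameter are assumptions, and the actual sequence is not included in this report.

```python
def base_composition(counts):
    """Return (percent per base rounded to 2 decimals, total length)."""
    total = sum(counts.values())
    percents = {base: round(100 * n / total, 2) for base, n in counts.items()}
    return percents, total


def sliding_gc(seq, window=300, step=1):
    """Return the GC percentage of each full window along seq."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the summary line of this report.
counts = {"A": 1720, "C": 1775, "T": 1871, "G": 1945}
percents, total = base_composition(counts)
overall_gc = round(100 * (counts["G"] + counts["C"]) / total, 2)
print(total, percents, overall_gc)
```

With these counts the overall GC content is about 50.88%, consistent with the note that no significant high-GC region is present.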


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  -       69659127   69662126   3000
YourSeq  256    768    1639  3000   85.9%     chr7   +       27461327   27462003   677
YourSeq  253    780    1762  3000   83.7%     chr12  -       102477508  102478006  499
YourSeq  252    809    1424  3000   91.2%     chr11  +       77638946   77639606   661
YourSeq  238    767    1545  3000   81.2%     chr11  +       113670836  113671265  430
YourSeq  231    767    1406  3000   89.2%     chr2   +       80457690   80458319   630
YourSeq  191    1261   1891  3000   83.3%     chr19  +       44936181   44936790   610
YourSeq  185    768    1427  3000   81.1%     chr17  +       78129062   78129430   369
YourSeq  181    2359   2625  3000   91.4%     chr4   -       141039826  141040489  664
YourSeq  178    1231   1427  3000   96.0%     chrX   +       162710748  162711286  539
YourSeq  177    1237   1920  3000   90.4%     chr13  -       62464641   62758696   294056
YourSeq  169    920    1427  3000   92.5%     chrX   +       93935383   94067108   131726
YourSeq  169    1216   1427  3000   92.9%     chr3   +       19956005   19956639   635
YourSeq  167    1236   1460  3000   88.7%     chr11  -       116880143  116880347  205
YourSeq  165    1248   1712  3000   92.0%     chr7   +       101359005  101359472  468
YourSeq  164    1236   1427  3000   94.2%     chr4   -       137066165  137066503  339
YourSeq  163    1248   1457  3000   88.4%     chr3   +       33891072   33891263   192
YourSeq  162    1257   1491  3000   94.0%     chr5   -       72406127   72406750   624
YourSeq  160    1257   1460  3000   93.1%     chr8   +       31339479   31339943   465
YourSeq  159    1227   1427  3000   87.9%     chr9   -       77572958   77573149   192

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  -       69655316   69658315   3000
YourSeq  123    2588   2802  3000   78.7%     chrX   +       60892784   60892998   215
YourSeq  119    2594   2802  3000   78.5%     chr14  +       118235403  118235611  209
YourSeq  85     2576   2802  3000   68.8%     chr2   +       152397352  152397578  227
YourSeq  82     2594   2799  3000   70.0%     chr12  +       27342053   27342258   206
YourSeq  81     2594   2802  3000   69.4%     chr13  +       28952636   28952844   209
YourSeq  77     2686   2802  3000   83.0%     chr8   -       12396514   12396630   117
YourSeq  73     2594   2800  3000   84.8%     chr2   +       181670891  181671282  392
YourSeq  53     2720   2798  3000   83.6%     chr3   -       34650549   34650627   79
YourSeq  38     1      50    3000   95.3%     chr1   +       174749855  174749904  50
YourSeq  36     13     98    3000   71.0%     chr9   -       71712116   71712201   86
YourSeq  36     1      49    3000   76.8%     chr7   +       102438575  102438617  43
YourSeq  36     1      48    3000   87.5%     chr2   +       154404851  154404898  48
YourSeq  36     10     55    3000   89.2%     chr11  +       55170331   55170376   46
YourSeq  33     2      48    3000   85.4%     chr3   -       102073501  102073546  46
YourSeq  32     15     94    3000   86.2%     chr15  -       19282054   19282131   78
YourSeq  32     1      47    3000   81.9%     chr1   -       114485349  114485394  46
YourSeq  31     11     99    3000   67.5%     chr4   +       46769203   46769291   89
YourSeq  31     2773   2805  3000   97.0%     chr17  +       25570196   25570228   33
YourSeq  30     10     47    3000   89.5%     chr5   +       139700161  139700198  38

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
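The top hit in each BLAT table is the query itself on chr11, so the two coordinate ranges can be cross-checked against the rest of the report. The sketch below assumes inclusive 1-based coordinates (SPAN = END - START + 1, which matches the tables) and assumes that the two 3000 bp flanks directly abut the cKO region. The variable and function names are illustrative.

```python
def span(start, end):
    """Length of an inclusive, 1-based genomic interval."""
    return end - start + 1


# chr11 coordinates of the 100% self-hits, copied from the two BLAT tables.
upstream_flank = (69659127, 69662126)    # 3000 bp upstream of exon 2 (minus strand)
downstream_flank = (69655316, 69658315)  # 3000 bp downstream of exon 3 (minus strand)

# Number of bases lying strictly between the two flanks.
gap_between_flanks = upstream_flank[0] - downstream_flank[1] - 1

print(span(*upstream_flank), span(*downstream_flank), gap_between_flanks)
```

Under these assumptions the gap is 811 bp, which agrees with the effective cKO region size of ~811 bp stated in the strategy note.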


Gene and information: Mpdu1 mannose-P-dolichol utilization defect 1 [ Mus musculus (house mouse) ] Gene ID: 24070, updated on 12-Aug-2019

Gene summary

Official Symbol: Mpdu1 (provided by MGI)
Official Full Name: mannose-P-dolichol utilization defect 1 (provided by MGI)
Primary source: MGI:MGI:1346040
See related: Ensembl:ENSMUSG00000018761
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: SL15; LEC35; Supl15h
Summary: This gene encodes a member of the PQ-loop superfamily. A similar gene in human encodes a protein that is required for monosaccharide-P-dolichol-dependent glycosyltransferase reactions, and disruption of this gene is the cause of congenital disorder of glycosylation (CDG) type 1F, a disease linked to defects in protein N-glycosylation. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Sep 2014]
Expression: Ubiquitous expression in duodenum adult (RPKM 64.5), adrenal adult (RPKM 54.1) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 11 B3; 11 42.86 cM

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (69656697..69662649, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (69470206..69476144, complement)

Chromosome 11 - NC_000077.6


Transcript information: This gene has 10 transcripts

Gene: Mpdu1 ENSMUSG00000018761

Description: mannose-P-dolichol utilization defect 1 [Source:MGI Symbol;Acc:MGI:1346040]
Gene Synonyms: LEC35, SL15, Supl15h
Location: Chromosome 11: 69,656,697-69,662,642 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 10 transcripts (splice variants), 217 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Mpdu1-201  ENSMUST00000018905.11  1321  247aa       ENSMUSP00000018905.5  Protein coding           CCDS24903  Q8R0J2   TSL:1, GENCODE basic, APPRIS P1
Mpdu1-204  ENSMUST00000129224.7   774   197aa       ENSMUSP00000120001.1  Protein coding           -          F6XX36   CDS 5' incomplete, TSL:3
Mpdu1-210  ENSMUST00000155200.7   462   148aa       ENSMUSP00000117715.1  Protein coding           -          Q5F2B1   CDS 3' incomplete, TSL:3
Mpdu1-207  ENSMUST00000148242.7   971   120aa       ENSMUSP00000133074.1  Nonsense mediated decay  -          E9PVP3   TSL:1
Mpdu1-202  ENSMUST00000125389.1   646   129aa       ENSMUSP00000129025.1  Nonsense mediated decay  -          F6ZGG4   CDS 5' incomplete, TSL:2
Mpdu1-208  ENSMUST00000149764.7   867   No protein  -                     Retained intron          -          -        TSL:2
Mpdu1-209  ENSMUST00000153217.2   818   No protein  -                     Retained intron          -          -        TSL:2
Mpdu1-203  ENSMUST00000127118.1   664   No protein  -                     Retained intron          -          -        TSL:2
Mpdu1-205  ENSMUST00000137288.7   524   No protein  -                     Retained intron          -          -        TSL:2
Mpdu1-206  ENSMUST00000139155.1   510   No protein  -                     Retained intron          -          -        TSL:2


[Figure: Ensembl genome browser view of a 25.95 kb region of chromosome 11 (69.65 Mb to 69.67 Mb), contig AL603707.5.
Forward strand: Fxr2 (201 protein coding; 202-205 lncRNA), Sox15 (201 protein coding), Mir1934 (201 miRNA).
Reverse strand: Mpdu1 (201, 204, 210 protein coding; 202, 207 nonsense mediated decay; 203, 205, 206, 208, 209 retained intron), Cd68 (201, 202 protein coding; 203 lncRNA), Eif4a1 (213 protein coding; 201 nonsense mediated decay; 202, 204, 205, 212 retained intron; 203, 206-211 lncRNA), and snoRNAs Gm24029-201, Gm25835-201, Gm22988-201.
Regulation legend: CTCF, enhancer, promoter, promoter flank.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000018905

[Figure: Ensembl protein summary for Mpdu1-201 (protein coding, reverse strand, 5.95 kb), protein ENSMUSP00000018... Tracks: Transmembrane heli...; Low complexity (Seg); SMART PQ-loop repeat; Pfam PQ-loop repeat; PIRSF Mannose-P-dolichol utilization defect 1 protein; PANTHER PTHR12226:SF2 and Mannose-P-dolichol utilization defect 1 protein; All sequence SNPs/i... (sequence variants from dbSNP and all other sources). Variant legend: missense variant. Scale bar: 0, 40, 80, 120, 160, 200, 247.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
