https://www.alphaknockout.com

Mouse Dcun1d1 Knockout Project (CRISPR/Cas9)

Objective: To create a Dcun1d1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Dcun1d1 gene (NCBI Reference Sequence: NM_001205361; Ensembl: ENSMUSG00000027708) is located on mouse chromosome 3. Seven exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 7 (Transcript: ENSMUST00000108182). Exons 2~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele are runted, with spleen and lymphoid hypoplasia and decreased mouse embryonic fibroblast proliferation. Males are infertile and exhibit abnormal spermiogenesis.

Exon 2 starts from about 0.51% of the coding region. Exons 2~4 cover 66.54% of the coding region. The size of the effective KO region is ~4968 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 2, 3, 4 and 7 labeled and a gRNA region marked on each side of the knockout region. Legend: exon of mouse Dcun1d1; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
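The self-alignment described in the two dot plot notes can be approximated with a short script. The following is a minimal sketch, not the tool used for this report: it assumes exact 15 bp word matches on the forward strand only (no reverse-complement pass and no mismatch tolerance), and the function name self_dot_plot is hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the words of length
    `window` starting at positions i and j (0-based) are identical.

    Assumption: exact forward-strand matches only. A real dot plot
    would usually also scan the reverse complement.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    hits = []
    for starts in positions.values():
        # Every pair of occurrences of the same word is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


if __name__ == "__main__":
    # Toy input: a 4 bp unit repeated three times, scanned with a 4 bp window.
    for i, j in self_dot_plot("ACGT" * 3, window=4):
        print(i, j, j - i)
```

In such a plot a tandem repeat shows up as runs of hits sharing a constant offset j - i, so an empty or sparse hit list for a 2000 bp flank is consistent with the notes above.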


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution plot for the upstream sequence.]

Summary: Full Length(2000bp) | A(26.85% 537) | C(18.45% 369) | T(30.15% 603) | G(24.55% 491)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution plot for the downstream sequence.]

Summary: Full Length(2000bp) | A(27.3% 546) | C(16.7% 334) | T(33.5% 670) | G(22.5% 450)

Note: The 2000 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
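The overall GC content of each flank follows directly from the base counts in the two summaries. The sketch below recomputes it and also shows one way a 300 bp windowed profile could be produced; the function names and the step size are assumptions for illustration, not the report's own tooling.

```python
def gc_percent(a, c, t, g):
    """Overall GC content (%) from per-base counts."""
    total = a + c + t + g
    return 100.0 * (g + c) / total


def windowed_gc(seq, window=300, step=1):
    """GC content (%) of each window of length `window` along `seq`.

    Assumption: windows advance by `step` bases; the report states
    only the window size.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values


if __name__ == "__main__":
    # Base counts taken from the two summaries above.
    print("upstream GC %:", round(gc_percent(a=537, c=369, t=603, g=491), 2))
    print("downstream GC %:", round(gc_percent(a=546, c=334, t=670, g=450), 2))
```

With the counts given, the upstream flank comes out at 43.0% GC and the downstream flank at 39.2% GC.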


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr3   -       35921188   35923187   2000
YourSeq  96     245     378   2000   92.2%     chr11  -       57262765   57263207   443
YourSeq  88     264     379   2000   90.8%     chr19  +       37509492   37509624   133
YourSeq  87     266     393   2000   86.6%     chr5   +       120966630  120966772  143
YourSeq  86     264     559   2000   75.7%     chr10  +       128772875  128773049  175
YourSeq  84     264     392   2000   83.7%     chr5   +       124937516  124937636  121
YourSeq  82     258     378   2000   91.0%     chr18  -       69701761   69701887   127
YourSeq  81     264     377   2000   88.4%     chr10  +       96065703   96065828   126
YourSeq  78     260     379   2000   87.4%     chr7   -       131516037  131516169  133
YourSeq  78     258     386   2000   85.5%     chr10  +       62007292   62007448   157
YourSeq  76     258     380   2000   87.3%     chr10  -       44620559   44620697   139
YourSeq  76     264     378   2000   83.4%     chr11  +       35776527   35776640   114
YourSeq  75     264     365   2000   91.3%     chr1   -       64280854   64280971   118
YourSeq  75     264     365   2000   90.2%     chr5   +       20215807   20215921   115
YourSeq  74     270     379   2000   87.8%     chr5   +       67087054   67087181   128
YourSeq  74     266     384   2000   90.3%     chr1   +       153924319  153924623  305
YourSeq  73     271     379   2000   91.9%     chr3   -       122193941  122194064  124
YourSeq  72     264     378   2000   89.2%     chrX   -       142650963  142651078  116
YourSeq  72     264     379   2000   88.2%     chr11  -       46470885   46471014   130
YourSeq  72     259     377   2000   86.6%     chrX   +       141375670  141375799  130

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr3   -       35914220   35916219   2000
YourSeq  756    1029    2000  2000   90.3%     chr5   -       25731085   25732121   1037
YourSeq  746    1025    2000  2000   89.8%     chr1   -       131391175  131898889  507715
YourSeq  742    1029    2000  2000   91.3%     chr17  +       87803769   87804857   1089
YourSeq  739    1029    2000  2000   89.8%     chr7   -       29607208   29608254   1047
YourSeq  736    1032    2000  2000   90.1%     chr7   -       64927791   64928821   1031
YourSeq  729    1029    2000  2000   89.3%     chr3   -       133038174  133194805  156632
YourSeq  711    1042    1996  2000   88.7%     chr15  +       68017502   68018584   1083
YourSeq  708    1029    2000  2000   89.4%     chr1   +       78753237   78754275   1039
YourSeq  703    1025    2000  2000   90.9%     chr10  -       41160825   41161857   1033
YourSeq  700    1029    2000  2000   87.8%     chr14  +       36836225   36837252   1028
YourSeq  698    1032    2000  2000   89.8%     chr1   +       161146877  161147893  1017
YourSeq  697    1029    1987  2000   89.6%     chr12  +       30666819   30667859   1041
YourSeq  694    1032    1943  2000   90.4%     chr3   -       68391462   68392513   1052
YourSeq  694    1032    2000  2000   89.6%     chr3   +       34340980   34342059   1080
YourSeq  692    1033    1936  2000   89.7%     chr6   -       3984643    3985619    977
YourSeq  692    1033    2000  2000   87.9%     chr14  +       16222040   16223065   1026
YourSeq  687    1025    2000  2000   89.0%     chr1   +       66711036   66712095   1060
YourSeq  686    1029    2000  2000   90.8%     chr18  +       57071247   57072385   1139
YourSeq  684    1032    2000  2000   89.4%     chr13  -       63054333   63055354   1022

Note: The 2000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
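The two 100% chr3 hits in the BLAT tables place the upstream and downstream 2000 bp flanks on the genome, so the interval between them can be checked against the stated size of the effective KO region (~4968 bp). This is a minimal sketch, assuming the BLAT coordinates are 1-based and inclusive (consistent with SPAN = END - START + 1 = 2000 for both hits); the function name is hypothetical.

```python
def region_between_flanks(flank_a, flank_b):
    """Return (start, end, size) of the interval strictly between two
    non-overlapping flanks on the same chromosome.

    Each flank is a (start, end) tuple in 1-based inclusive coordinates
    (an assumption about the BLAT output shown in this report).
    """
    lower, upper = sorted([flank_a, flank_b])
    start = lower[1] + 1   # first base after the lower flank
    end = upper[0] - 1     # last base before the upper flank
    return start, end, end - start + 1


if __name__ == "__main__":
    upstream_flank = (35921188, 35923187)    # chr3, minus strand, upstream of exon 2
    downstream_flank = (35914220, 35916219)  # chr3, minus strand, downstream of exon 4
    start, end, size = region_between_flanks(upstream_flank, downstream_flank)
    print(f"chr3:{start}-{end} ({size} bp)")
```

With the coordinates above, the interval is chr3:35916220-35921187, which is 4968 bp and agrees with the size given in the strategy summary.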


Gene and information: Dcun1d1 DCN1, defective in cullin neddylation 1, domain containing 1 (S. cerevisiae) [ Mus musculus (house mouse) ] Gene ID: 114893, updated on 10-Oct-2019

Gene summary

Official Symbol: Dcun1d1 (provided by MGI)
Official Full Name: DCN1, defective in cullin neddylation 1, domain containing 1 (S. cerevisiae) (provided by MGI)
Primary source: MGI:MGI:2150386
See related: Ensembl:ENSMUSG00000027708
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Rp42; Tes3; SCCRO; pTes3
Expression: Broad expression in testis adult (RPKM 11.9), liver E14 (RPKM 6.9) and 21 other tissues
Orthologs: human, all

Genomic context

Location: 3; 3 B
Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (35892105..35937472, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (35791903..35828996, complement)

[Figure: genomic context of Dcun1d1 on Chromosome 3 - NC_000069.6.]


Transcript information: This gene has 14 transcripts

Gene: Dcun1d1 ENSMUSG00000027708

Description: DCN1, defective in cullin neddylation 1, domain containing 1 (S. cerevisiae) [Source:MGI Symbol;Acc:MGI:2150386]
Gene Synonyms: Rp42, SCCRO, Tes3, pTes3
Location: 35,892,105-35,937,445 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 14 transcripts (splice variants), 249 orthologues, 5 paralogues, is a member of 1 Ensembl protein family and is associated with 16 phenotypes.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Dcun1d1-201  ENSMUST00000108182.9  4980  259aa       ENSMUSP00000103817.5  Protein coding           CCDS57207  Q9QZ73      TSL:1, GENCODE basic, APPRIS ALT1
Dcun1d1-203  ENSMUST00000178098.7  4434  244aa       ENSMUSP00000137324.1  Protein coding           CCDS17306  Q3UT23      TSL:5, GENCODE basic, APPRIS P3
Dcun1d1-212  ENSMUST00000198389.4  3020  244aa       ENSMUSP00000143243.1  Protein coding           CCDS17306  Q3UT23      TSL:1, GENCODE basic, APPRIS P3
Dcun1d1-214  ENSMUST00000200661.4  845   220aa       ENSMUSP00000143716.1  Protein coding           -          A0A0G2JGV7  CDS 3' incomplete, TSL:3
Dcun1d1-202  ENSMUST00000148465.7  784   214aa       ENSMUSP00000115420.3  Protein coding           -          A0A140T8R8  CDS 3' incomplete, TSL:2
Dcun1d1-208  ENSMUST00000196270.4  588   115aa       ENSMUSP00000142384.1  Protein coding           -          A0A1B0GXL7  CDS 3' incomplete, TSL:2
Dcun1d1-213  ENSMUST00000199173.2  546   114aa       ENSMUSP00000142443.1  Protein coding           -          A0A0G2JDN9  CDS 3' incomplete, TSL:3
Dcun1d1-211  ENSMUST00000198362.1  282   20aa        ENSMUSP00000143252.1  Protein coding           -          A0A0G2JFP2  CDS 3' incomplete, TSL:3
Dcun1d1-209  ENSMUST00000197489.4  1018  93aa        ENSMUSP00000142690.1  Nonsense mediated decay  -          A0A0G2JEA2  TSL:5
Dcun1d1-210  ENSMUST00000197546.4  621   142aa       ENSMUSP00000142421.1  Nonsense mediated decay  -          A0A0G2JDM0  CDS 5' incomplete, TSL:3
Dcun1d1-205  ENSMUST00000196174.1  2949  No protein  -                     Retained intron          -          -           TSL:NA
Dcun1d1-204  ENSMUST00000196115.1  631   No protein  -                     Retained intron          -          -           TSL:2
Dcun1d1-206  ENSMUST00000196261.1  776   No protein  -                     lncRNA                   -          -           TSL:2
Dcun1d1-207  ENSMUST00000196263.4  684   No protein  -                     lncRNA                   -          -           TSL:3


[Figure: Ensembl genome browser view of the Dcun1d1 locus, 65.34 kb, 35.89Mb to 35.94Mb. Forward strand: lncRNA transcripts Gm15952-201, Gm15952-202 and Gm15952-203. Contig: AC116677.11. Reverse strand: Dcun1d1-201, -202, -203, -208, -211, -212, -213 and -214 (protein coding), Dcun1d1-209 and -210 (nonsense mediated decay), Dcun1d1-204 and -205 (retained intron), Dcun1d1-206 and -207 (lncRNA). Regulatory Build track legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000108182

[Figure: protein domain view of Dcun1d1-201 (protein coding; reverse strand, 41.37 kb), translation ENSMUSP00000103..., with a scale bar from 0 to 259. Annotation tracks as labeled in the figure:]

Superfamily: UBA-like superfamily
Pfam: PF14555; Potentiating neddylation domain
PROSITE profiles: Potentiating neddylation domain
PANTHER: Defective-in-cullin neddylation protein; PTHR12281:SF10
Gene3D: 1.10.8.10; 1.10.238.10; DCN1-like, PONY binding domain
CDD: cd14411
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant Legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
