https://www.alphaknockout.com

Mouse Dnajc1 Knockout Project (CRISPR/Cas9)

Objective: To create a Dnajc1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Dnajc1 gene (NCBI Reference Sequence: NM_007869; Ensembl: ENSMUSG00000026740) is located on mouse chromosome 2. Twelve exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000166495). Exons 2~5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 12.74% of the coding region. Exons 2~5 cover 24.94% of the coding region. The size of the effective KO region is ~9086 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 12 of mouse Dnajc1 and two gRNA regions marked. Legend: exon of mouse Dnajc1; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The 848 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
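The self-alignment described in the two dot plot notes can be sketched in a few lines of Python. This is only an illustration of the general technique, assuming exact matching of 15 bp windows; the function name `self_dot_plot` is hypothetical and the tool actually used to produce the plots in this report is not stated.

```python
def self_dot_plot(seq, window=15):
    """Return (i, j, kind) for every pair of identical windows in seq.

    Assumption: a dot is drawn only for an exact match of `window` bases.
    Real dot plot tools may allow mismatches or use other scoring.
    """
    seq = seq.upper()
    complement = str.maketrans("ACGT", "TGCA")

    # Index every window by its sequence so matches can be looked up directly.
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    hits = []
    for i in range(len(seq) - window + 1):
        word = seq[i:i + window]
        # Forward matches off the main diagonal (j > i skips the self match).
        for j in positions[word]:
            if j > i:
                hits.append((i, j, "forward"))
        # Reverse-complement matches; j >= i reports each pair only once.
        rc_word = word.translate(complement)[::-1]
        for j in positions.get(rc_word, []):
            if j >= i:
                hits.append((i, j, "reverse_complement"))
    return hits
```

Runs of forward hits that share the same offset `j - i` form a diagonal line in the plot and point to a repeated segment. An empty or nearly empty result corresponds to the "no significant tandem repeat" conclusion in the notes.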


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(29.85% 597) | C(16.65% 333) | T(33.75% 675) | G(19.75% 395)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(848bp) | A(27.48% 233) | C(18.51% 157) | T(35.97% 305) | G(18.04% 153)

Note: The 848 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
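The base counts in the two summaries can be checked, and the windowed GC profile reproduced, with a short Python sketch. The counts are taken from the summaries above; the function names are hypothetical, and the one-base step between windows is an assumption, since the report gives only the window size (300 bp).

```python
def base_percentages(counts):
    """Convert base counts into percentages of the full length."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts):
    """Overall GC content (%) from base counts."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


def gc_windows(seq, window=300):
    """GC content (%) of every window, sliding one base at a time (assumed step)."""
    seq = seq.upper()
    profile = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        profile.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


# Base counts copied from the two summary lines in this report.
upstream = {"A": 597, "C": 333, "T": 675, "G": 395}     # 2000 bp upstream of exon 2
downstream = {"A": 233, "C": 157, "T": 305, "G": 153}   # 848 bp downstream of exon 5

print(gc_percent(upstream))    # overall GC content of the upstream flank
print(gc_percent(downstream))  # overall GC content of the downstream flank
```

From these counts, both flanks have an overall GC content of roughly 36 to 37%, which is consistent with the notes that no high GC-content region was found.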


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr2   -       18317001   18319000   2000
YourSeq  25     1667   1692  2000   100.0%    chr2   +       126450186  126450212  27
YourSeq  24     1352   1375  2000   100.0%    chr4   +       76800601   76800624   24
YourSeq  22     623    644   2000   100.0%    chr12  +       81380036   81380057   22

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  848    1      848   848    100.0%    chr2   -       18307067   18307914   848
YourSeq  34     703    817   848    92.5%     chr19  -       53573812   53573930   119
YourSeq  27     799    830   848    96.7%     chr1   +       69145490   69145528   39
YourSeq  25     645    676   848    93.4%     chr11  +       9689053    9689091    39
YourSeq  24     821    847   848    96.2%     chr10  -       54462282   54462313   32
YourSeq  23     476    498   848    100.0%    chr12  -       36001119   36001141   23
YourSeq  22     401    426   848    87.5%     chr1   -       27897787   27897811   25
YourSeq  21     50     70    848    100.0%    chr10  -       12820861   12820881   21
YourSeq  21     808    828   848    100.0%    chr16  +       60625981   60626001   21
YourSeq  20     646    667   848    95.5%     chr1   -       52681051   52681072   22
YourSeq  20     473    492   848    100.0%    chr10  +       67399540   67399559   20
YourSeq  20     154    173   848    100.0%    chr10  +       45512935   45512954   20
YourSeq  20     482    501   848    100.0%    chr1   +       177440507  177440526  20

Note: The 848 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
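The top BLAT hit of each flank gives its genomic position on chr2, and the interval between the two flanks can be compared with the stated KO region size. The Python sketch below does that arithmetic. It assumes that the coordinates are 1-based and inclusive, and that the two flanking sections directly abut the KO region; neither is stated explicitly in this report. The helper names are hypothetical.

```python
# Top BLAT hits from the two tables above (chr2, minus strand).
upstream_flank = (18317001, 18319000)    # 2000 bp section upstream of exon 2
downstream_flank = (18307067, 18307914)  # 848 bp section downstream of exon 5


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def region_between(lower, upper):
    """Interval lying strictly between two non-overlapping intervals."""
    return lower[1] + 1, upper[0] - 1


# The gene is on the reverse strand, so the upstream flank has the
# higher genomic coordinates and the downstream flank the lower ones.
ko_start, ko_end = region_between(downstream_flank, upstream_flank)
ko_size = span(ko_start, ko_end)

print(f"chr2:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under these assumptions the interval is chr2:18307915-18317000, which is 9086 bp long and agrees with the effective KO region size given in the strategy note.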


Gene and information:

Dnajc1: DnaJ family (Hsp40) member C1 [Mus musculus (house mouse)]
Gene ID: 13418, updated on 12-Aug-2019

Gene summary

Official Symbol: Dnajc1 (provided by MGI)
Official Full Name: DnaJ heat shock protein family (Hsp40) member C1 (provided by MGI)
Primary source: MGI:MGI:103268
See related: Ensembl:ENSMUSG00000026740
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: MTJ1; ERdj1; ERj1p; Dnajl1; AA960110; 4733401K02Rik; D230036H06Rik
Expression: Ubiquitous expression in testis adult (RPKM 10.4), thymus adult (RPKM 9.4) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 A3
Exon count: 13

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (18205634..18394099, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (18138762..18314134, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 11 transcripts

Gene: Dnajc1 ENSMUSG00000026740

Description: DnaJ heat shock protein family (Hsp40) member C1 [Source:MGI Symbol;Acc:MGI:103268]
Gene Synonyms: 4733401K02Rik, D230036H06Rik, Dnajl1, ERdj1, MTJ1
Location: Chromosome 2: 18,195,654-18,392,830 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 11 transcripts (splice variants), 202 orthologues, 21 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Dnajc1-209  ENSMUST00000166495.7   5497  552aa       ENSMUSP00000126321.1  Protein coding           CCDS15709  Q61712   TSL:1, GENCODE basic, APPRIS P1
Dnajc1-202  ENSMUST00000091418.11  5070  552aa       ENSMUSP00000088980.5  Protein coding           CCDS15709  Q61712   TSL:1, GENCODE basic, APPRIS P1
Dnajc1-210  ENSMUST00000168723.1   1175  357aa       ENSMUSP00000126716.1  Protein coding           -          F6ZL86   CDS 5' incomplete, TSL:5
Dnajc1-203  ENSMUST00000148401.7   456   58aa        ENSMUSP00000132289.1  Protein coding           -          F6WEH1   CDS 5' incomplete, TSL:3
Dnajc1-201  ENSMUST00000028072.12  1876  105aa       ENSMUSP00000028072.6  Nonsense mediated decay  -          F8WH68   TSL:5
Dnajc1-205  ENSMUST00000163130.7   1490  79aa        ENSMUSP00000129176.1  Nonsense mediated decay  -          F6YK70   CDS 5' incomplete, TSL:5
Dnajc1-207  ENSMUST00000164835.1   746   34aa        ENSMUSP00000128687.1  Nonsense mediated decay  -          F7C8U2   CDS 5' incomplete, TSL:3
Dnajc1-206  ENSMUST00000164606.1   2850  No protein  -                     Retained intron          -          -        TSL:1
Dnajc1-208  ENSMUST00000165793.1   749   No protein  -                     Retained intron          -          -        TSL:2
Dnajc1-204  ENSMUST00000153055.2   1001  No protein  -                     lncRNA                   -          -        TSL:1
Dnajc1-211  ENSMUST00000172210.1   720   No protein  -                     lncRNA                   -          -        TSL:3


[Figure: Ensembl genomic view of a 217.18 kb region of chromosome 2 (18.2 Mb to 18.4 Mb).
Forward strand: Mllt10-201, Mllt10-202, Mllt10-203 and Mllt10-204 (protein coding); Mllt10-207 (lncRNA).
Contigs: AL928557.6, AL928620.5, AL845265.20.
Reverse strand: Dnajc1-202, Dnajc1-203, Dnajc1-209 and Dnajc1-210 (protein coding); Dnajc1-201, Dnajc1-205 and Dnajc1-207 (nonsense mediated decay); Dnajc1-206 and Dnajc1-208 (retained intron); Dnajc1-204 and Dnajc1-211 (lncRNA); Gm13350-201 and Gm13349-201 (processed pseudogenes).
Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (processed transcript; pseudogene; RNA gene).]


Transcript: ENSMUST00000166495

[Figure: protein view of Dnajc1-209 (protein coding), reverse strand, 179.08 kb; scale bar 0 to 552. Tracks and annotated features:
ENSMUSP00000126...: Transmembrane heli...; MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Cleavage site (Sign...
Superfamily: J-domain superfamily; Homeobox-like domain superfamily
SMART: DnaJ domain; SANT/Myb domain
Prints: DnaJ domain
Pfam: DnaJ domain; SANT/Myb domain
PROSITE profiles: DnaJ domain; Myb-like domain; SANT domain
PROSITE patterns: DnaJ domain, conserved site
PANTHER: PTHR44653
Gene3D: Chaperone J-domain superfamily; 1.10.10.60
CDD: DnaJ domain; SANT/Myb domain
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: inframe deletion; missense variant; splice region variant; synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
