https://www.alphaknockout.com

Mouse Dip2b Knockout Project (CRISPR/Cas9)

Objective: To create a Dip2b knockout model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Dip2b gene (NCBI Reference Sequence: NM_001159361; Ensembl: ENSMUSG00000023026) is located on mouse chromosome 15. 38 exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 38 (Transcript: ENSMUST00000100203). Exons 5~10 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts from about 9.0% of the coding region. Exons 5~10 cover 18.78% of the coding region. The size of the effective KO region is ~9402 bp. The KO region does not contain any other known gene.
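As a rough cross-check of the percentages in the note, the number of coding bases removed can be estimated from the 1574 aa protein length that Ensembl lists for ENSMUST00000100203. The sketch below assumes the coding region is 1574 x 3 nt (stop codon not counted) and that the 18.78% figure is rounded. The function name is hypothetical, and the result is an estimate, not a value read from exon coordinates.

```python
def estimate_deleted_coding_nt(protein_aa: int, covered_fraction: float) -> int:
    """Estimate how many coding nucleotides a deletion removes."""
    cds_nt = protein_aa * 3  # assumption: stop codon is not counted
    return round(cds_nt * covered_fraction)


# 1574 aa and 18.78% are taken from the report; everything else is illustrative.
deleted_nt = estimate_deleted_coding_nt(1574, 0.1878)

# Remainder modulo 3: 0 would keep the reading frame, 1 or 2 would shift it.
frame_remainder = deleted_nt % 3
print(deleted_nt, frame_remainder)
```

A non-zero remainder would be consistent with a frameshift downstream of the deletion, but this should be confirmed against the actual exon coordinates before relying on it.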


Overview of the Targeting Strategy

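[Figure: wildtype allele, 5' to 3', showing exons 1, 5, 6, 7, 8, 9, 10 and 38 of mouse Dip2b, with gRNA regions on either side of the knockout region (exons 5~10). Legend: exon of mouse Dip2b; knockout region.]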


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
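The self-alignment described in the note can be sketched as a word-match dot plot. The example below is a minimal illustration, assuming exact 15 bp word matches; the tool actually used for the report is not named, so the function names and the matching rule here are assumptions.

```python
from collections import defaultdict

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) lists of matching window start positions.

    forward: pairs (i, j) with i < j whose windows are identical.
    reverse: pairs (i, j) where window j equals the reverse complement of window i.
    """
    seq = seq.upper()
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index.get(word, []) if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(word), []))
    return forward, reverse


# Hypothetical test sequence: one 15 bp unit repeated twice in tandem.
unit = "ACGGTCATTGCAGTA"
forward, reverse = self_dot_plot(unit * 2, window=15)
print(forward)
```

Off-diagonal forward matches that line up parallel to the main diagonal are what a tandem repeat looks like in such a plot; an empty list corresponds to the "no significant tandem repeat" case.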

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward, Reverse Complement.]

Note: The 1425 bp section downstream of Exon 10 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(25.25% 505) | C(23.9% 478) | T(30.05% 601) | G(20.8% 416)

Note: The 2000 bp section upstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(1425bp) | A(25.05% 357) | C(18.74% 267) | T(32.35% 461) | G(23.86% 340)

Note: The 1425 bp section downstream of Exon 10 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
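The overall GC content of both flanking sections follows directly from the base counts in the two summaries, and a windowed profile can be computed the same way. The sketch below uses the counts reported above; the `sliding_gc` helper is a hypothetical illustration of a 300 bp window scan, and the report does not state which threshold counts as a high GC-content region.

```python
def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """Return (start, gc_fraction) for each full window along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, gc / window))
    return profile


# Base counts copied from the two summary lines in the report.
upstream_gc = (478 + 416) / 2000      # C + G over the 2000 bp upstream section
downstream_gc = (267 + 340) / 1425    # C + G over the 1425 bp downstream section
print(f"upstream GC: {upstream_gc:.1%}, downstream GC: {downstream_gc:.1%}")

# Toy input to show the window scan; real use would pass the flanking sequence.
print(sliding_gc("GGCCAATT", window=4, step=2))
```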


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr15    +     100149152  100151151  2000
YourSeq     31    690   740   2000     91.2%  chr19    -      28555396   28555445    50
YourSeq     28    536   571   2000     93.6%  chr9     +      50999681   50999718    38
YourSeq     25    706   730   2000    100.0%  chr4     +     128724124  128724148    25
YourSeq     23    752   774   2000    100.0%  chr6     +     148460551  148460573    23
YourSeq     22    422   443   2000    100.0%  chr13    -      22782590   22782611    22

Note: The 2000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
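To make the "no significant similarity" reading concrete, the BLAT rows can be parsed and the self-hit separated from the remaining hits. The sketch below is an illustration only: the rows are copied from the upstream results above, the field names are hypothetical, and treating a full-length 100% identity hit as the query locus itself is an assumption.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


ROWS = """\
YourSeq 2000 1 2000 2000 100.0% chr15 + 100149152 100151151 2000
YourSeq 31 690 740 2000 91.2% chr19 - 28555396 28555445 50
YourSeq 28 536 571 2000 93.6% chr9 + 50999681 50999718 38
YourSeq 25 706 730 2000 100.0% chr4 + 128724124 128724148 25
YourSeq 23 752 774 2000 100.0% chr6 + 148460551 148460573 23
YourSeq 22 422 443 2000 100.0% chr13 - 22782590 22782611 22
""".splitlines()


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated result row (first field is the query name)."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


hits = [parse_hit(row) for row in ROWS]

# Assumption: a full-length, 100% identity hit is the query locus itself.
off_target = [h for h in hits if not (h.score == h.q_size and h.identity == 100.0)]

best = max(off_target, key=lambda h: h.score)
coverage = (best.q_end - best.q_start + 1) / best.q_size
print(best.chrom, best.score, f"{coverage:.2%}")
```

The best remaining hit covers only a small fraction of the 2000 bp query, which is the kind of evidence the note summarizes.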

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
----------------------------------------------------------------------------------------
YourSeq   1425      1  1425   1425    100.0%  chr15    +     100160554  100161978  1425
YourSeq    164    633   832   1425     89.1%  chr4     -     135264690  135264880   191
YourSeq    160    639   835   1425     90.2%  chr1     +     193206572  193206760   189
YourSeq    159    634   817   1425     93.5%  chr8     +     123196714  123196898   185
YourSeq    156    644   818   1425     94.9%  chr9     +      72844112   72844286   175
YourSeq    156    633   898   1425     91.5%  chr14    +      32480747   32481152   406
YourSeq    156    633   814   1425     92.9%  chr11    +      68478236   68478417   182
YourSeq    155    633   816   1425     92.4%  chr13    +      62119137   62119321   185
YourSeq    154    635   819   1425     92.4%  chr18    -      67507415   67507614   200
YourSeq    154    633   816   1425     90.8%  chr12    -      87206577   87206759   183
YourSeq    153    633   817   1425     94.3%  chr14    -      73799211   73799396   186
YourSeq    153    633   820   1425     92.8%  chr10    -      39703629   39703816   188
YourSeq    153    633   816   1425     92.7%  chr6     +     108681890  108682072   183
YourSeq    152    632   816   1425     92.8%  chr19    -      41782982   41783172   191
YourSeq    152    633   817   1425     91.4%  chr19    +      55235828   55236013   186
YourSeq    152    633   819   1425     93.3%  chr17    +      94862311   94862497   187
YourSeq    151    633   814   1425     89.9%  chr7     -      79823599   79823777   179
YourSeq    151    639   817   1425     93.0%  chr13    -      35919023   35919200   178
YourSeq    151    633   818   1425     91.9%  chr13    +      69544667   69544859   193
YourSeq    151    640   815   1425     93.2%  chr13    +      55572999   55573179   181

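Note: The 1425 bp section downstream of Exon 10 is BLAT searched against the genome. Apart from the query locus itself on chr15, the listed hits score between 151 and 164 and all fall within query positions 632-898. No significant similarity to the section as a whole is found.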
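The chr15 coordinates of the two full-length BLAT hits can be used to cross-check the stated size of the effective KO region. The sketch below assumes that the upstream section ends immediately before the KO region and the downstream section begins immediately after it, which the report does not state explicitly; the helper name is hypothetical.

```python
UP_END = 100151151      # last base of the 2000 bp upstream section (chr15)
DOWN_START = 100160554  # first base of the 1425 bp downstream section (chr15)


def bases_between(left_end: int, right_start: int) -> int:
    """Count bases strictly between two 1-based inclusive coordinates."""
    return right_start - left_end - 1


ko_region_bp = bases_between(UP_END, DOWN_START)
print(ko_region_bp)
```

Under that assumption the gap between the two sections matches the ~9402 bp given in the strategy summary.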


Gene and information: Dip2b disco interacting protein 2 homolog B [ Mus musculus (house mouse) ] Gene ID: 239667, updated on 12-Aug-2019

Gene summary

Official Symbol: Dip2b (provided by MGI)
Official Full Name: disco interacting protein 2 homolog B (provided by MGI)
Primary source: MGI:MGI:2145977
See related: Ensembl:ENSMUSG00000023026
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI317237; AI854602; mKIAA1463; 4932422C22
Expression: Ubiquitous expression in cerebellum adult (RPKM 10.4), whole brain E14.5 (RPKM 8.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 15; 15 F1
Exon count: 40

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (100037626..100219474)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (99869095..100049904)

Chromosome 15 - NC_000081.6


Transcript information: This gene has 5 transcripts

Gene: Dip2b ENSMUSG00000023026

Description: disco interacting protein 2 homolog B [Source:MGI Symbol;Acc:MGI:2145977]
Location: Chromosome 15: 100,038,664-100,219,473 forward strand. GRCm38:CM001008.2
About this gene: This gene has 5 transcripts (splice variants), 262 orthologues, 2 paralogues, is a member of 1 Ensembl and is associated with 3 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Dip2b-202  ENSMUST00000100203.9   8914  1574aa      ENSMUSP00000097777.3  Protein coding   CCDS49731  Q3UH60   TSL:1 GENCODE basic APPRIS P1
Dip2b-201  ENSMUST00000023768.13  4320  1340aa      ENSMUSP00000023768.7  Protein coding   CCDS27831  B2RQC7   TSL:5 GENCODE basic
Dip2b-204  ENSMUST00000230619.1   743   No protein  -                     Retained intron  -          -        -
Dip2b-203  ENSMUST00000135658.2   714   No protein  -                     Retained intron  -          -        TSL:5
Dip2b-205  ENSMUST00000230733.1   371   No protein  -                     Retained intron  -          -        -


[Figure: Ensembl genome browser view of a 200.81 kb region of chromosome 15 (about 100.05 Mb to 100.20 Mb). Forward-strand transcripts (Comprehensive set): Dip2b-202 and Dip2b-201 (protein coding); Dip2b-205, Dip2b-204 and Dip2b-203 (retained intron); Atf1-201, Atf1-202 and Atf1-208 (protein coding). Contigs: AC133868.10, AC125526.4, AC124479.4. Reverse-strand genes (Comprehensive set): Gm49474-201 (TEC), Gm49475-201 (lncRNA), 4930478M13Rik-201 (lncRNA). Regulatory Build track legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000100203

[Figure: Ensembl protein summary for Dip2b-202 (protein coding; 180.81 kb, forward strand), drawn on a scale of 0 to 1574 amino acids. Tracks and labels as shown: MobiDB lite; Low complexity (Seg); Superfamily SSF56801; SMART DMAP1-binding domain; Pfam DMAP1-binding domain, AMP-dependent synthetase/ligase; PANTHER PTHR22754:SF38, PTHR22754; Gene3D AMP-dependent synthetase-like superfamily, 3.30.300.30; CDD Dip2-like domain; sequence variants (dbSNP and all other sources). Variant legend: frameshift variant, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
