https://www.alphaknockout.com

Mouse Dhx38 Knockout Project (CRISPR/Cas9)

Objective: To create a Dhx38 knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Dhx38 gene (NCBI Reference Sequence: NM_178380; Ensembl: ENSMUSG00000037993) is located on mouse chromosome 8. 27 exons have been identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 27 (Transcript: ENSMUST00000042601). Exons 2~19 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from the coding region. Exons 2~19 cover 70.66% of the coding region. The size of the effective KO region is ~8636 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

Figure: Wildtype allele (5' to 3') with two gRNA regions marked. Exons 1 to 19 and exon 27 are labeled.

Legend: Exon of mouse Dhx38; Knockout region


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: Dot plot. Legend: Forward; Reverse Complement. Axis label: Sequence 12.

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
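The self-alignment described in the note can be sketched in Python as follows. This is a minimal illustration, not the tool used to produce the report: it assumes that a dot is drawn wherever two windows of 15 bp match exactly, either directly (forward) or as a reverse complement. The function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot_matches(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two sorted lists of (i, j) window start positions:
    forward: seq[i:i+window] == seq[j:j+window], with i < j
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window]
    """
    seq = seq.upper()
    n = len(seq)

    # Index every window by its start positions.
    index = {}
    for i in range(n - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    # Off-diagonal forward matches indicate direct (tandem) repeats.
    forward = []
    for starts in index.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                forward.append((starts[a], starts[b]))

    # Reverse-complement matches indicate inverted repeats.
    reverse = []
    for j in range(n - window + 1):
        word = reverse_complement(seq[j:j + window])
        for i in index.get(word, []):
            reverse.append((i, j))

    return sorted(forward), sorted(reverse)


if __name__ == "__main__":
    # Toy sequence: one 16 bp unit repeated twice in tandem.
    unit = "ACGGTCATTGCAAGCT"
    fwd, rev = dot_plot_matches(unit * 2, window=15)
    print("forward matches:", fwd)
    print("reverse-complement matches:", rev)
```

Runs of forward matches parallel to the main diagonal correspond to tandem repeats, which is why a gRNA site would be placed outside them.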

Overview of the Dot Plot (down) Window size: 15 bp

Forward Reverse Complement

Sequence 12

Note: The 594 bp section downstream of Exon 19 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

Figure: GC content distribution plot. Axis label: Sequence 12.

Summary: Full Length(2000bp) | A(25.25% 505) | C(20.75% 415) | T(31.05% 621) | G(22.95% 459)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
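A windowed GC scan of this kind can be sketched in Python as below. The report gives only the window size (300 bp); the step size and the 70% cutoff used to call a window "high GC" are assumptions made for illustration, and the function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of every window of the given size, moving by `step`."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq, window=300, threshold=0.70, step=1):
    """Return (start, gc) for windows whose GC fraction meets the threshold.

    The 0.70 default is an assumed cutoff, not a value from the report.
    """
    hits = []
    for k, gc in enumerate(sliding_gc(seq, window, step)):
        if gc >= threshold:
            hits.append((k * step, gc))
    return hits


if __name__ == "__main__":
    toy = "GGCCAATT"
    print(sliding_gc(toy, window=4, step=2))
    print(high_gc_windows(toy, window=4, threshold=0.70, step=2))
```

An empty result from `high_gc_windows` on a flanking sequence would correspond to the "no significant high GC-content region" conclusion, under whatever cutoff is chosen.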

Overview of the GC Content Distribution (down)

Window size: 300 bp

Figure: GC content distribution plot. Axis label: Sequence 12.

Summary: Full Length(594bp) | A(21.38% 127) | C(22.56% 134) | T(23.91% 142) | G(32.15% 191)

Note: The 594 bp section downstream of Exon 19 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
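The two Summary lines can be checked arithmetically from their base counts. The short Python sketch below recomputes the per-base percentages and derives the overall GC percentage of each section; the base counts are taken from the Summary lines, while the function name and dictionary layout are illustrative.

```python
def composition(counts):
    """Return (total length, per-base percentages, GC percentage)."""
    total = sum(counts.values())
    percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
    gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
    return total, percent, gc_percent


# Base counts copied from the two Summary lines.
upstream = {"A": 505, "C": 415, "T": 621, "G": 459}
downstream = {"A": 127, "C": 134, "T": 142, "G": 191}

if __name__ == "__main__":
    for name, counts in (("upstream", upstream), ("downstream", downstream)):
        total, percent, gc_percent = composition(counts)
        print(name, total, percent, "GC%:", gc_percent)
```

The counts sum to 2000 bp and 594 bp respectively, and the overall GC content works out to 43.70% upstream and 54.71% downstream.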


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr8   -       109562767  109564766  2000
YourSeq  127    810     1266  2000   84.1%     chr10  +       80600006   80600286   281
YourSeq  107    755     925   2000   98.3%     chr1   +       83656956   83657134   179
YourSeq  57     748     904   2000   98.4%     chr4   -       108206875  108207228  354
YourSeq  39     825     863   2000   100.0%    chr1   +       14935335   14935373   39
YourSeq  35     1242    1286  2000   94.9%     chr12  +       86304815   86304864   50
YourSeq  32     1234    1270  2000   91.2%     chr13  +       38213979   38214014   36
YourSeq  32     1201    1275  2000   97.1%     chr10  +       4322385    4322927    543
YourSeq  31     885     918   2000   97.1%     chr2   -       128549249  128549283  35
YourSeq  31     1584    1643  2000   91.9%     chr19  +       6039921    6039982    62
YourSeq  30     1242    1272  2000   100.0%    chrX   -       105491021  105491053  33
YourSeq  30     1623    1660  2000   89.5%     chr13  -       51837355   51837392   38
YourSeq  30     1242    1287  2000   94.2%     chr10  -       41093397   41093447   51
YourSeq  30     1242    1272  2000   100.0%    chr11  +       82655670   82655701   32
YourSeq  29     1242    1271  2000   100.0%    chr7   -       6209175    6209354    180
YourSeq  29     1231    1270  2000   81.9%     chr13  -       9227662    9227698    37
YourSeq  28     1442    1500  2000   93.8%     chr18  -       13921849   13921935   87
YourSeq  28     1242    1272  2000   96.8%     chr13  -       80501100   80501132   33
YourSeq  28     1242    1272  2000   96.8%     chr2   +       36195876   36195908   33
YourSeq  28     1242    1272  2000   96.8%     chr18  +       10634645   10634677   33

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
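Rows in the BLAT results can be read programmatically to compare hits. The Python sketch below parses one whitespace-separated result row (optionally prefixed by the UCSC link text `browser details`) and computes what fraction of the query each hit covers. The field names and the `query_coverage` helper are illustrative assumptions, and the report does not state which criterion was used to judge similarity as "significant".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row):
    """Parse one BLAT result row into a BlatHit."""
    fields = row.split()
    # Drop the link text that the UCSC results page puts before each row.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {row!r}")
    return BlatHit(
        fields[0], int(fields[1]), int(fields[2]), int(fields[3]),
        int(fields[4]), float(fields[5].rstrip("%")), fields[6], fields[7],
        int(fields[8]), int(fields[9]), int(fields[10]),
    )


def query_coverage(hit):
    """Fraction of the query sequence spanned by the hit."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


if __name__ == "__main__":
    rows = [
        "YourSeq 2000 1 2000 2000 100.0% chr8 - 109562767 109564766 2000",
        "YourSeq 127 810 1266 2000 84.1% chr10 + 80600006 80600286 281",
    ]
    for hit in map(parse_blat_row, rows):
        print(hit.chrom, hit.score, hit.identity, round(query_coverage(hit), 4))
```

For the two rows used here, the self-match on chr8 covers the whole query, while the best secondary hit (chr10, score 127) spans 457 of 2000 query bases.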

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  594    1       594   594    100.0%    chr8   -       109553554  109554147  594
YourSeq  28     538     570   594    83.9%     chr11  -       100402867  100402897  31
YourSeq  28     108     138   594    96.8%     chr15  +       27537603   27537640   38
YourSeq  26     53      79    594    100.0%    chr15  -       97806420   97806522   103
YourSeq  26     533     568   594    72.5%     chr11  -       3135607    3135636    30
YourSeq  20     108     127   594    100.0%    chr13  -       108006075  108006094   20

Note: The 594 bp section downstream of Exon 19 is BLAT searched against the genome. No significant similarity is found.
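As a rough consistency check, the chr8 self-matches in the two BLAT searches give the genomic positions of the flanking sections, and the distance between them can be compared with the stated KO region size. The Python sketch below assumes that the 2000 bp section ends immediately before Exon 2 and that the 594 bp section begins immediately after Exon 19, which the report implies but does not state; the function name is hypothetical.

```python
def gap_between(interval_a, interval_b):
    """Return (start, end, length) of the gap between two 1-based,
    inclusive, non-overlapping intervals on the same chromosome."""
    (a_start, a_end), (b_start, b_end) = sorted([interval_a, interval_b])
    if b_start <= a_end:
        raise ValueError("intervals overlap")
    return a_end + 1, b_start - 1, b_start - a_end - 1


# chr8 coordinates of the top (100.0% identity) hit in each BLAT search.
# The gene is on the reverse strand, so the upstream section has the
# higher coordinates.
upstream_section = (109562767, 109564766)
downstream_section = (109553554, 109554147)

if __name__ == "__main__":
    start, end, length = gap_between(upstream_section, downstream_section)
    print(f"chr8:{start}-{end} ({length} bp)")
```

Under these assumptions the interval between the two sections is chr8:109554148-109562766, or 8619 bp, which is close to the ~8636 bp effective KO region. The report does not say how the ~8636 bp figure was derived, so the two numbers need not match exactly.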


Gene and information: Dhx38 DEAH (Asp-Glu-Ala-His) box polypeptide 38 [ Mus musculus (house mouse) ] Gene ID: 64340, updated on 10-Oct-2019

Gene summary

Official Symbol: Dhx38 (provided by MGI)
Official Full Name: DEAH (Asp-Glu-Ala-His) box polypeptide 38 (provided by MGI)
Primary source: MGI:MGI:1927617
See related: Ensembl:ENSMUSG00000037993
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ddx38; Prp16; AI325984; AW540902; KIAA0224; 5730550P09Rik
Expression: Ubiquitous expression in thymus adult (RPKM 12.7), CNS E11.5 (RPKM 11.5) and 28 other tissues

Genomic context

Location: 8; 8 D3
Exon count: 27

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (109547999..109565861, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (112071924..112089501, complement)

Chromosome 8 - NC_000074.6


Transcript information: This gene has 2 transcripts

Gene: Dhx38 ENSMUSG00000037993

Description: DEAH (Asp-Glu-Ala-His) box polypeptide 38 [Source:MGI Symbol;Acc:MGI:1927617]
Gene Synonyms: 5730550P09Rik, Ddx38, Prp16
Location: Chromosome 8: 109,548,011-109,565,861 reverse strand. GRCm38:CM001001.2
About this gene: This gene has 2 transcripts (splice variants), 196 orthologues, 18 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Dhx38-201  ENSMUST00000042601.8  4463  1228aa      ENSMUSP00000047865.7  Protein coding   CCDS22652  Q80X98   TSL:1 GENCODE basic APPRIS P1
Dhx38-202  ENSMUST00000212667.1  657   No protein  -                     Retained intron  -          -        TSL:2

Figure: Genome browser view of a 37.85 kb region (109.54 Mb to 109.57 Mb).

Forward strand, Genes (Comprehensive set...): Pmfbp1-201 (protein coding), Pmfbp1-202 (lncRNA), Txnl4b-201 (protein coding), Txnl4b-202 (protein coding)
Contigs: AC125162.4
Reverse strand, Genes (Comprehensive set...): Dhx38-201 (protein coding), Dhx38-202 (retained intron), Hp-201 (protein coding), Hp-202 (lncRNA), Hp-203 (lncRNA)
Regulatory Build: track shown
Regulation Legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank
Gene Legend: Protein Coding (merged Ensembl/Havana; Ensembl protein coding); Non-Protein Coding (RNA gene; processed transcript)


Transcript: ENSMUST00000042601

Figure: Protein domain view of Dhx38-201 (protein coding; reverse strand, 17.85 kb) for ENSMUSP00000047..., with a scale bar from 0 to 1228.

Tracks without listed features: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: P-loop containing nucleoside triphosphate hydrolase
SMART: Helicase-associated domain; Helicase superfamily 1/2, ATP-binding domain; Helicase, C-terminal
Pfam: DEAD/DEAH box helicase domain; Domain of unknown function DUF1605; Helicase, C-terminal; Helicase-associated domain
PROSITE profiles: Helicase superfamily 1/2, ATP-binding domain; Helicase, C-terminal
PROSITE patterns: DNA/RNA helicase, ATP-dependent, DEAH-box type, conserved site
PANTHER: PTHR18934:SF85; PTHR18934
Gene3D: 3.40.50.300; 1.20.120.1080
CDD: cd17983; cd18791
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant Legend: missense variant; splice region variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
