https://www.alphaknockout.com

Mouse Adh4 Knockout Project (CRISPR/Cas9)

Objective: To create an Adh4 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Adh4 gene (NCBI Reference Sequence: NM_011996; Ensembl: ENSMUSG00000037797) is located on mouse chromosome 3. Nine exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 9 (Transcript: ENSMUST00000013458). Exons 2 to 7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 1.68% of the coding region. Exons 2 to 7 cover 84.17% of the coding region. The size of the effective KO region is ~9976 bp. The KO region does not contain any other known gene.
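The ~9976 bp figure is consistent with the chr3 coordinates listed in the BLAT search results of this report, if one assumes that the 2000 bp upstream section and the 936 bp downstream section lie immediately next to the knockout region. A minimal Python sketch of that arithmetic follows; the constant and function names are illustrative and not part of the original report.

```python
# Coordinates copied from the 100.0% identity BLAT hits in this report
# (chr3, + strand). Treating the two flanking sections as directly
# adjacent to the knockout region is an assumption of this sketch.
UPSTREAM_FLANK_END = 138418125      # last base of the 2000 bp upstream section
DOWNSTREAM_FLANK_START = 138428102  # first base of the 936 bp downstream section


def region_between(left_end: int, right_start: int) -> int:
    """Number of bases strictly between two 1-based inclusive coordinates."""
    return right_start - left_end - 1


ko_size = region_between(UPSTREAM_FLANK_END, DOWNSTREAM_FLANK_START)
print(f"Region between the flanking sections: {ko_size} bp")
```

Under that assumption the computed size equals the stated effective KO region size.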


Overview of the Targeting Strategy

[Figure: Wildtype allele, drawn 5' to 3', with numbered exons and two gRNA regions flanking the knockout region. Legend: exon of mouse Adh4; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot. Legend: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot. Legend: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The 936 bp section downstream of Exon 7 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
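As a rough illustration of the self-alignment idea behind these dot plots, the sketch below marks every pair of positions whose 15 bp windows are identical and reports the longest run of dots along a single off-diagonal; long runs are the signature of a tandem or dispersed repeat. This is a simplified stand-in, not the tool used for the report: it uses exact matching only, it ignores the reverse-complement strand, and all function names are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs, i < j, whose windows are identical."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        # Every pair of occurrences of the same window is one dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def longest_offdiagonal_run(dots: list[tuple[int, int]]) -> int:
    """Length of the longest run of consecutive dots on one off-diagonal."""
    dot_set = set(dots)
    best = 0
    for i, j in dots:
        if (i - 1, j - 1) in dot_set:
            continue  # this dot continues a run that started earlier
        length = 1
        while (i + length, j + length) in dot_set:
            length += 1
        best = max(best, length)
    return best


# Toy input: a 4 bp unit repeated ten times is an obvious tandem repeat.
toy = "ACGT" * 10
print(longest_offdiagonal_run(self_dot_plot(toy, window=15)))
```

A sequence without repeated 15 bp windows yields no off-diagonal dots at all, which corresponds to the "no significant tandem repeat" reading of a plot.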


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(24.35% 487) | C(22.7% 454) | T(34.3% 686) | G(18.65% 373)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(936bp) | A(37.93% 355) | C(15.71% 147) | T(29.59% 277) | G(16.77% 157)

Note: The 936 bp section downstream of Exon 7 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
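The base counts in the two summaries can be turned into an overall GC percentage, and the same calculation can be applied per 300 bp window when the sequence itself is available. The following Python sketch shows both steps. The base counts are copied from the summaries above; the function names and the toy sequence are illustrative assumptions, and the actual flanking sequences are not reproduced here.

```python
def base_percentages(counts: dict[str, int]) -> dict[str, float]:
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts: dict[str, int]) -> float:
    """Overall GC content in percent, rounded to two decimals."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage of each window of the given size along the sequence."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return values


# Base counts copied from the two summary lines in this report.
upstream = {"A": 487, "C": 454, "T": 686, "G": 373}    # 2000 bp
downstream = {"A": 355, "C": 147, "T": 277, "G": 157}  # 936 bp

print("upstream GC %:", gc_percent(upstream))
print("downstream GC %:", gc_percent(downstream))
print("toy windows:", windowed_gc("GGCCAATT", window=4, step=4))
```

The per-base percentages recomputed this way agree with the values printed in the summaries.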


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  2000   1       2000  2000   100.0%    chr3   +       138416126  138418125  2000
YourSeq  1189   37      1374  2000   94.9%     chr16  -       8025405    8026798    1394
YourSeq  1184   55      1374  2000   95.0%     chrX   +       77384079   77385418   1340
YourSeq  1179   55      1374  2000   94.9%     chr12  -       53002816   53004163   1348
YourSeq  1178   54      1374  2000   95.2%     chr1   -       106674232  106675579  1348
YourSeq  1176   56      1374  2000   95.1%     chr13  +       83704247   83705594   1348
YourSeq  1174   56      1374  2000   95.2%     chr2   +       129350439  129351771  1333
YourSeq  1173   53      1374  2000   94.8%     chr18  +       76269324   76270653   1330
YourSeq  1170   56      1374  2000   94.8%     chr15  +       42209247   42210574   1328
YourSeq  1169   55      1373  2000   95.2%     chr3   -       132604085  132605409  1325
YourSeq  1168   55      1374  2000   94.6%     chr19  +       11715426   11716747   1322
YourSeq  1167   50      1374  2000   94.8%     chrX   -       101196197  101197541  1345
YourSeq  1167   56      1374  2000   95.2%     chr9   -       115899402  115900726  1325
YourSeq  1166   55      1374  2000   94.3%     chr9   +       56282876   56284198   1323
YourSeq  1165   49      1374  2000   94.2%     chr3   -       100329706  100331034  1329
YourSeq  1165   57      1374  2000   94.8%     chrX   +       20440497   20441830   1334
YourSeq  1163   57      1374  2000   94.9%     chr3   -       115237162  115238484  1323
YourSeq  1161   55      1374  2000   94.2%     chrX   -       72859617   72860952   1336
YourSeq  1161   54      1374  2000   94.1%     chr1   -       73068018   73069356   1339
YourSeq  1160   54      1374  2000   94.4%     chr1   -       43924153   43925492   1340

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  936    1       936   936    100.0%    chr3   +       138428102  138429037  936
YourSeq  116    26      589   936    78.5%     chr16  -       18866404   18866816   413
YourSeq  107    455     666   936    79.5%     chr6   -       20522288   20522442   155
YourSeq  101    474     625   936    88.0%     chr10  +       8946076    8946435    360
YourSeq  100    252     589   936    76.6%     chr11  -       20759461   20759589   129
YourSeq  97     458     598   936    83.5%     chr17  +       47401442   47401581   140
YourSeq  97     455     589   936    86.0%     chr11  +       96224357   96224491   135
YourSeq  96     455     590   936    85.3%     chrX   +       137686810  137686945  136
YourSeq  95     2       590   936    72.6%     chr1   -       59891821   59891978   158
YourSeq  95     481     601   936    89.3%     chr3   +       120294081  120294201  121
YourSeq  93     458     590   936    85.0%     chr10  -       47000066   47000198   133
YourSeq  92     474     591   936    89.0%     chr12  +       12247299   12247416   118
YourSeq  90     455     590   936    83.1%     chr10  +       120517435  120517570  136
YourSeq  90     480     589   936    91.0%     chr10  +       50474148   50474257   110
YourSeq  90     474     598   936    85.3%     chr1   +       48596801   48596924   124
YourSeq  89     479     597   936    86.4%     chr5   +       130194736  130194853  118
YourSeq  89     483     601   936    87.4%     chr10  +       115180124  115180242  119
YourSeq  88     252     587   936    91.6%     chrX   -       101609316  101609839  524
YourSeq  88     474     589   936    88.0%     chr1   -       7087341    7087456    116
YourSeq  87     456     578   936    88.4%     chr10  -       59888629   59888751   123

Note: The 936 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
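For readers who want to inspect the BLAT hits programmatically, the sketch below parses whitespace-separated hit rows and reports, for each secondary hit, the identity and the fraction of the query it spans. The four sample rows are copied from the tables in this report; the class, the function names and the idea of summarising hits by query span are illustrative assumptions rather than the criteria used in the report. Note that the query span can overstate the aligned length when an alignment contains gaps.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one row: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def query_span_percent(hit: BlatHit) -> float:
    """Share of the query covered by the hit's start-to-end span, in percent."""
    return round(100 * (hit.q_end - hit.q_start + 1) / hit.q_size, 2)


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr3 + 138416126 138418125 2000",
    "YourSeq 1189 37 1374 2000 94.9% chr16 - 8025405 8026798 1394",
    "YourSeq 936 1 936 936 100.0% chr3 + 138428102 138429037 936",
    "YourSeq 116 26 589 936 78.5% chr16 - 18866404 18866816 413",
]
hits = [parse_hit(r) for r in rows]

# Treat a full-length 100% identity hit as the query matching itself.
secondary = [h for h in hits
             if not (h.identity == 100.0 and h.q_end - h.q_start + 1 == h.q_size)]
for h in secondary:
    print(h.chrom, h.strand, h.score, f"{h.identity}%", f"{query_span_percent(h)}% of query")
```

Which score, identity or span values count as significant is left to the reader; the sketch only tabulates them.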


Gene and protein information: Adh4 alcohol dehydrogenase 4 (class II), pi polypeptide [Mus musculus (house mouse)]
Gene ID: 26876, updated on 10-Oct-2019

Gene summary

Official Symbol: Adh4 (provided by MGI)
Official Full Name: alcohol dehydrogenase 4 (class II), pi polypeptide (provided by MGI)
Primary source: MGI:MGI:1349472
See related: Ensembl:ENSMUSG00000037797
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Adh2
Expression: Biased expression in liver adult (RPKM 28.8), testis adult (RPKM 3.3) and 2 other tissues
Orthologs: human, all

Genomic context

Location: 3; 3 G3
Exon count: 9

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 3 | NC_000069.6 (138415456..138432683)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 3 | NC_000069.5 (138078461..138093856)

Chromosome 3 - NC_000069.6


Transcript information: This gene has 3 transcripts

Gene: Adh4 ENSMUSG00000037797

Description: alcohol dehydrogenase 4 (class II), pi polypeptide [Source:MGI Symbol;Acc:MGI:1349472]
Gene Synonyms: Adh2, mouse class II type ADH
Location: Chromosome 3: 138,415,466-138,430,892 forward strand. GRCm38:CM000996.2
About this gene: This gene has 3 transcripts (splice variants), 331 orthologues, 16 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Adh4-201 | ENSMUST00000013458.8 | 1314 | 377aa | ENSMUSP00000013458.8 | Protein coding | CCDS38653 | Q9QYY9 | TSL:1, GENCODE basic, APPRIS P1
Adh4-202 | ENSMUST00000161312.7 | 695 | 216aa | ENSMUSP00000124163.1 | Protein coding | - | E0CYI4 | CDS 3' incomplete, TSL:5
Adh4-203 | ENSMUST00000162260.1 | 257 | No protein | - | lncRNA | - | - | TSL:3

[Figure: Ensembl genome browser view of a 35.43 kb region (138.41Mb to 138.44Mb), forward and reverse strands. Tracks: genes (Comprehensive set...) with Adh4-203 (lncRNA), Adh4-202 (protein coding) and Adh4-201 (protein coding); contigs (AC079845.31); Regulatory Build. Regulation legend: CTCF, Open Chromatin, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene).]


Transcript: ENSMUST00000013458

[Figure: Ensembl protein summary for Adh4-201 (protein coding), 15.40 kb, forward strand; protein ENSMUSP00000013..., scale bar 0 to 377. Tracks: PDB-ENSP mappings; sequence variants (dbSNP and all other sources). Variant legend: stop gained, missense variant, synonymous variant.]

Domain annotations shown in the figure:
Superfamily: NAD(P)-binding domain superfamily; GroES-like superfamily
Pfam: Alcohol dehydrogenase, N-terminal; Alcohol dehydrogenase, C-terminal
PROSITE patterns: Alcohol dehydrogenase, zinc-type, conserved site
PANTHER: Alcohol dehydrogenase family, zinc-type, class-II subfamily; PTHR43880
Gene3D: 3.90.180.10; 3.40.50.720
CDD: cd08299

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
