https://www.alphaknockout.com
Mouse Aldh1a1 Knockout Project (CRISPR/Cas9)
Objective: To create an Aldh1a1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.
Strategy summary: The Aldh1a1 gene (NCBI Reference Sequence: NM_013467; Ensembl: ENSMUSG00000053279) is located on mouse chromosome 19. Thirteen exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 13 (Transcript: ENSMUST00000087638). Exons 2~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.
Note: Mice homozygous for a disruption in this gene show a significantly reduced ability to convert retinol to retinoic acid in the liver. Retinal morphology is normal even though the gene is normally highly expressed in the dorsal retina.
Exon 2 starts at about 4.46% of the coding region. Exons 2~4 cover 25.02% of the coding region. The size of the effective KO region is ~9186 bp. The KO region does not contain any other known gene.
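The percentages above can be converted back to approximate base-pair counts. The sketch below is only a rough cross-check: it assumes the percentages are relative to a 1503-nt coding sequence (501 aa x 3, stop codon not counted), which the report does not state. The helper name `percent_to_bp` is hypothetical.

```python
# Rough conversion of the reported coding-region percentages to base pairs.
# Assumption (not stated in the report): the coding region is 501 codons,
# i.e. 1503 nt, with the stop codon excluded.
PROTEIN_LENGTH_AA = 501                 # protein length listed for ENSMUST00000087638
CDS_LENGTH_NT = PROTEIN_LENGTH_AA * 3   # 1503 nt under the assumption above


def percent_to_bp(percent, total=CDS_LENGTH_NT):
    """Convert a percentage of the coding region to a rounded nt count."""
    return round(percent / 100 * total)


cds_before_exon2 = percent_to_bp(4.46)    # coding nt upstream of exon 2
cds_in_exons_2_to_4 = percent_to_bp(25.02)  # coding nt removed by the deletion

# A deleted coding length that is not a multiple of 3 shifts the reading frame.
causes_frameshift = cds_in_exons_2_to_4 % 3 != 0

print(cds_before_exon2, cds_in_exons_2_to_4, causes_frameshift)
```

Under that assumption, the deleted coding length is not a multiple of three, so exons downstream of the deletion would be read out of frame. The exact exon boundaries should be taken from the Ensembl transcript record rather than from this estimate.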
Overview of the Targeting Strategy
[Figure: wildtype allele drawn 5' to 3' with exons 1, 2, 3, 4 and 13 labeled; one gRNA region on each side of the knockout region. Legend: exon of mouse Aldh1a1; knockout region.]
Overview of the Dot Plot (up)
Window size: 15 bp
[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]
Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
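A self-alignment dot plot of this kind can be sketched in a few lines. The version below is a minimal illustration, not the tool used for the report: it marks every pair of positions whose 15-bp windows match exactly on the forward strand only, and the function names `dot_plot` and `off_diagonal_hits` are hypothetical. Any match off the main diagonal indicates a repeated segment.

```python
def dot_plot(seq_a, seq_b, window=15):
    """Return the set of (i, j) where seq_a[i:i+window] == seq_b[j:j+window]."""
    # Index every window of seq_b by its sequence for fast lookup.
    index = {}
    for j in range(len(seq_b) - window + 1):
        index.setdefault(seq_b[j:j + window], []).append(j)
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in index.get(seq_a[i:i + window], ()):
            hits.add((i, j))
    return hits


def off_diagonal_hits(seq, window=15):
    """Self-comparison hits excluding the trivial main diagonal (i == j)."""
    return {(i, j) for (i, j) in dot_plot(seq, seq, window) if i != j}


# Toy inputs (made up for illustration, not taken from the Aldh1a1 locus).
unit = "ACGTGCATTGACCTGA"                      # 16-nt unit
tandem = unit * 3                              # three tandem copies
non_repetitive = "ACGTGCATTGACCTGAGGATCCA"     # no repeated 15-mer

print(len(off_diagonal_hits(tandem)) > 0)      # repeats produce off-diagonal dots
print(len(off_diagonal_hits(non_repetitive)))  # no off-diagonal dots
```

A fuller version would also compare the sequence against its reverse complement, as the figure legend indicates the original plot does.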
Overview of the Dot Plot (down)
Window size: 15 bp
[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]
Note: The 1570 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
Overview of the GC Content Distribution (up)
Window size: 300 bp
[Figure: GC content along the sequence.]
Summary: Full Length(2000bp) | A(31.55% 631) | C(16.4% 328) | T(31.1% 622) | G(20.95% 419)
Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
Overview of the GC Content Distribution (down)
Window size: 300 bp
[Figure: GC content along the sequence.]
Summary: Full Length(1570bp) | A(32.68% 513) | C(20.45% 321) | T(28.92% 454) | G(17.96% 282)
Note: The 1570 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
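The overall GC content of each flank follows directly from the base counts in the two Summary lines, and the windowed profile can be computed with a simple sliding window. The sketch below assumes a step of 1 bp between windows, which the report does not specify; the function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window; step=1 is an assumption, not from the report."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


def gc_percent_from_counts(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


# Base counts copied from the two Summary lines.
upstream_counts = {"A": 631, "C": 328, "T": 622, "G": 419}    # 2000 bp
downstream_counts = {"A": 513, "C": 321, "T": 454, "G": 282}  # 1570 bp

print(gc_percent_from_counts(upstream_counts))    # overall GC %, upstream flank
print(gc_percent_from_counts(downstream_counts))  # overall GC %, downstream flank
```

Both flanks come out below 40% GC overall, which is consistent with the notes that no high GC-content region was found.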
BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START     END       SPAN
-------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr19  +       20608914  20610913  2000
YourSeq  35     611    672   2000   84.4%     chr2   -       85359506  85359569  64
YourSeq  26     566    592   2000   100.0%    chr12  -       87382174  87382208  35
YourSeq  25     562    588   2000   96.3%     chr12  -       13525990  13526016  27
YourSeq  22     514    585   2000   65.3%     chr7   -       91628520  91628591  72
YourSeq  20     704    723   2000   100.0%    chr1   -       33810795  33810814  20
YourSeq  20     1724   1743  2000   100.0%    chr1   -       3722190   3722209   20
Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1570   1      1570  1570   100.0%    chr19  +       20620100   20621669   1570
YourSeq  320    467    1543  1570   88.8%     chr19  -       20717787   20718792   1006
YourSeq  59     1073   1150  1570   93.0%     chr2   -       25762002   25762091   90
YourSeq  44     1107   1150  1570   100.0%    chr15  -       17023458   17023501   44
YourSeq  41     1067   1109  1570   97.7%     chr6   +       53301093   53301135   43
YourSeq  39     1031   1111  1570   97.6%     chr6   -       104776129  104776384  256
YourSeq  29     1047   1078  1570   96.8%     chr13  -       119269442  119269478  37
YourSeq  27     1043   1071  1570   96.6%     chrX   +       68848139   68848167   29
YourSeq  26     1039   1065  1570   100.0%    chrX   -       84289546   84289574   29
YourSeq  26     848    876   1570   96.5%     chr1   +       70135686   70135714   29
YourSeq  25     1047   1071  1570   100.0%    chrX   -       16265299   16265323   25
YourSeq  25     1047   1071  1570   100.0%    chr7   -       119622902  119622926  25
YourSeq  25     1045   1071  1570   96.3%     chr17  -       13999047   13999073   27
YourSeq  25     973    999   1570   88.5%     chr1   -       5712654    5712679    26
YourSeq  25     1047   1071  1570   100.0%    chr7   +       133164202  133164226  25
YourSeq  23     1183   1205  1570   100.0%    chr6   -       25873896   25873918   23
YourSeq  23     1049   1071  1570   100.0%    chr18  -       64385770   64385792   23
YourSeq  23     1049   1071  1570   100.0%    chr16  +       33185944   33185966   23
YourSeq  22     1047   1068  1570   100.0%    chr18  -       51275690   51275711   22
YourSeq  22     1374   1395  1570   100.0%    chr15  -       78212566   78212587   22
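Note: The 1570 bp section downstream of Exon 4 is BLAT searched against the genome. Apart from the query locus itself, the strongest hit is a partial match on the reverse strand of chr19 (score 320, 88.8% identity, 20717787-20718792); no other significant similarity is found.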
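The top hit in each BLAT table gives the genomic position of the corresponding flank, so the stated KO region size can be cross-checked from those coordinates. The sketch below assumes the coordinates are 1-based and inclusive and that the two flanks directly abut the KO region; neither is stated explicitly in the report, and the helper names are hypothetical.

```python
# Top (100% identity) BLAT hit for each flank, copied from the two tables.
# Assumption: coordinates are 1-based, inclusive, on the same assembly.
upstream_flank = ("chr19", 20608914, 20610913)    # 2000 bp upstream of exon 2
downstream_flank = ("chr19", 20620100, 20621669)  # 1570 bp downstream of exon 4


def span(region):
    """Length in bp of an inclusive (chrom, start, end) region."""
    _, start, end = region
    return end - start + 1


def gap_between(left, right):
    """Number of bases strictly between two regions on the same chromosome."""
    if left[0] != right[0]:
        raise ValueError("regions are on different chromosomes")
    return right[1] - left[2] - 1


ko_region_bp = gap_between(upstream_flank, downstream_flank)
print(span(upstream_flank), span(downstream_flank), ko_region_bp)
```

Under these assumptions the gap between the flanks equals the ~9186 bp quoted in the strategy summary, and the flank lengths match the 2000 bp and 1570 bp query sizes.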
Gene and protein information: Aldh1a1 aldehyde dehydrogenase family 1, subfamily A1 [ Mus musculus (house mouse) ] Gene ID: 11668, updated on 24-Oct-2019
Gene summary
Official Symbol: Aldh1a1 (provided by MGI)
Official Full Name: aldehyde dehydrogenase family 1, subfamily A1 (provided by MGI)
Primary source: MGI:MGI:1353450
See related: Ensembl:ENSMUSG00000053279
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: E1; Ahd2; Ahd-2; Aldh1; ALHDII; Raldh1; ALDH-E1; Aldh1a2
Expression: Biased expression in liver adult (RPKM 141.6), genital fat pad adult (RPKM 125.0) and 12 other tissues
Genomic context
Location: 19 B; 19 13.91 cM
Exon count: 17

Annotation release  Status             Assembly                    Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  19   NC_000085.6 (20492583..20643463)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    19   NC_000085.5 (20676472..20717952)

[Figure: genomic context, Chromosome 19 - NC_000085.6]
Transcript information: This gene has 6 transcripts
Gene: Aldh1a1 ENSMUSG00000053279
Description: aldehyde dehydrogenase family 1, subfamily A1 [Source:MGI Symbol;Acc:MGI:1353450]
Gene Synonyms: ALDH1, Ahd-2, Ahd2, E1, Raldh1
Location: Chromosome 19: 20,492,715-20,643,465 forward strand. GRCm38:CM001012.2
About this gene: This gene has 6 transcripts (splice variants), 211 orthologues, 19 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.
Transcripts
Name         Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt             Flags
Aldh1a1-206  ENSMUST00000225337.2  2358  501aa       ENSMUSP00000153410.2  Protein coding  CCDS29695  A0A286YDG6, P24549  GENCODE basic, APPRIS P1
Aldh1a1-201  ENSMUST00000087638.3  2053  501aa       ENSMUSP00000084918.3  Protein coding  CCDS29695  P24549              TSL:1, GENCODE basic, APPRIS P1
Aldh1a1-205  ENSMUST00000225313.1  825   138aa       ENSMUSP00000153011.1  Protein coding  -          A0A286YCZ0          CDS 3' incomplete
Aldh1a1-202  ENSMUST00000224358.1  1800  No protein  -                     lncRNA          -          -                   -
Aldh1a1-203  ENSMUST00000224807.1  1085  No protein  -                     lncRNA          -          -                   -
Aldh1a1-204  ENSMUST00000225249.1  510   No protein  -                     lncRNA          -          -                   -
[Figure: Ensembl genomic view of the Aldh1a1 locus, 170.75 kb, with axis ticks at 20.50 Mb, 20.55 Mb, 20.60 Mb and 20.65 Mb. Forward strand (comprehensive gene set): Aldh1a1-206, Aldh1a1-201 and Aldh1a1-205 (protein coding); Aldh1a1-202, Aldh1a1-203 and Aldh1a1-204 (lncRNA). Contigs: AC152162.9, AC167167.3. Reverse strand (comprehensive gene set): C730002L08Rik-201, C730002L08Rik-202 and C730002L08Rik-203 (lncRNA); Gm6684-201 (processed pseudogene). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; RNA gene).]
Transcript: ENSMUST00000087638
[Figure: Ensembl view of Aldh1a1-201 (protein coding), 41.50 kb, forward strand, with the protein ENSMUSP00000084... on a scale of 0 to 501 aa. Feature tracks: Low complexity (Seg); Superfamily: Aldehyde/histidinol dehydrogenase; Pfam: Aldehyde dehydrogenase domain; PROSITE patterns: Aldehyde dehydrogenase, cysteine active site; Aldehyde dehydrogenase, glutamic acid active site; PANTHER: PTHR11699, PTHR11699:SF221; Gene3D: Aldehyde dehydrogenase, N-terminal; Aldehyde dehydrogenase, C-terminal; CDD: cd07141. Variant track: sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]
We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.