https://www.alphaknockout.com

Mouse Fa2h Knockout Project (CRISPR/Cas9)

Objective: To create a Fa2h knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Fa2h gene (NCBI Reference Sequence: NM_178086; Ensembl: ENSMUSG00000033579) is located on mouse chromosome 8. Seven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000038475). Exons 3~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygotes for a null allele show demyelination, axonal loss, and cerebellar dysfunction. Homozygotes for a different null allele show late-onset axon and myelin sheath degeneration, delayed fur emergence, altered sebum composition, sebocyte hyperproliferation, and cyclic alopecia.

Exon 3 starts at about 32.62% of the coding region. Exons 3~6 cover 60.57% of the coding region. The size of the effective KO region is ~8290 bp. The KO region does not contain any other known gene.
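The percentages above can be turned into approximate base-pair counts. The following is a minimal sketch, assuming the percentages are taken relative to the Fa2h-201 coding sequence counted as 372 codons (the 372 aa protein listed in the transcript table, stop codon excluded). That reference length is an assumption, and because the percentages are rounded, the results are estimates only. The names `CDS_BP` and `pct_to_bp` are illustrative.

```python
# Assumption: the coding region is 372 codons (372 aa, stop codon excluded).
CDS_CODONS = 372
CDS_BP = CDS_CODONS * 3


def pct_to_bp(pct: float, total_bp: int = CDS_BP) -> int:
    """Convert a rounded percentage of the coding region to an approximate bp count."""
    return round(pct / 100 * total_bp)


# Coding bases that precede exon 3 (exon 3 starts at about 32.62%).
bp_before_exon3 = pct_to_bp(32.62)

# Coding bases contained in exons 3~6 (60.57% of the coding region).
bp_in_exons_3_to_6 = pct_to_bp(60.57)

# A remainder other than 0 suggests that removing these exons shifts the reading frame.
frame_remainder = bp_in_exons_3_to_6 % 3

print(bp_before_exon3, bp_in_exons_3_to_6, frame_remainder)
```

A non-zero remainder would be consistent with a frameshift in the remaining exon 7 sequence, but this should be confirmed against the actual exon coordinates rather than the rounded percentages.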


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with two gRNA regions marked; exons labeled 1, 3, 4, 5, 6, 7. Legend: exon of mouse Fa2h; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot matrix; legend: Forward, Reverse Complement; axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot matrix; legend: Forward, Reverse Complement; axis label: Sequence 12.]

Note: The 1409 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
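The self-alignment described in the two dot plot notes can be illustrated with a small word-match dot plot. This is a minimal sketch, assuming exact matching of fixed-length windows; the tool actually used for this report and its scoring are not stated, and the function names here are illustrative. Off-diagonal matches in a self-comparison indicate repeated sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot(seq_a: str, seq_b: str, window: int = 15) -> list:
    """Return (i, j) pairs where the window starting at i in seq_a
    exactly matches the window starting at j in seq_b."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    # Index every window of seq_b by its sequence for fast lookup.
    index = {}
    for j in range(len(seq_b) - window + 1):
        index.setdefault(seq_b[j:j + window], []).append(j)
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in index.get(seq_a[i:i + window], ()):
            hits.append((i, j))
    return hits


def off_diagonal_hits(seq: str, window: int = 15) -> list:
    """Self-comparison hits excluding the trivial main diagonal (i == j)."""
    return [(i, j) for i, j in dot_plot(seq, seq, window) if i != j]


def reverse_complement_hits(seq: str, window: int = 15) -> list:
    """Hits between a sequence and its own reverse complement (inverted repeats)."""
    return dot_plot(seq, reverse_complement(seq), window)
```

For example, `off_diagonal_hits(flank, window=15)` on a flanking sequence held in a hypothetical variable `flank` returns an empty list when no 15 bp word occurs twice, which corresponds to a dot plot with only the main diagonal.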


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12.]

Summary: Full Length (2000 bp) | A (24.75%, 495) | C (24.1%, 482) | T (26.8%, 536) | G (24.35%, 487)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12.]

Summary: Full Length (1409 bp) | A (32.01%, 451) | C (21.36%, 301) | T (24.91%, 351) | G (21.72%, 306)

Note: The 1409 bp section downstream of Exon 6 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
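The base counts in the two Summary lines and the 300 bp sliding window can be reproduced along the following lines. This is a minimal sketch: the count dictionaries are copied from the Summary lines of this report, while the function names, the step size and the percent rounding are assumptions made for illustration.

```python
from collections import Counter

# Base counts copied from the two Summary lines in this report.
UP_COUNTS = {"A": 495, "C": 482, "T": 536, "G": 487}    # 2000 bp upstream of exon 3
DOWN_COUNTS = {"A": 451, "C": 301, "T": 351, "G": 306}  # 1409 bp downstream of exon 6


def gc_percent_from_counts(counts: dict) -> float:
    """Overall GC content (percent) from a dict of base counts."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


def base_composition(seq: str) -> dict:
    """Map each base to (count, percent of sequence length)."""
    counts = Counter(seq.upper())
    return {b: (counts[b], round(100 * counts[b] / len(seq), 2)) for b in "ACTG"}


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list:
    """GC content (percent) of each window along the sequence."""
    seq = seq.upper()
    values = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        values.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return values


print(gc_percent_from_counts(UP_COUNTS), gc_percent_from_counts(DOWN_COUNTS))
```

Plotting the output of `gc_windows` against window position would give a curve comparable to the GC content distribution figures; what counts as a "significantly high" GC region is not defined in the report, so no threshold is applied here.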


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   -       111356221  111358220  2000
YourSeq  83     676    764   2000   98.9%     chr12  -       62580980   62581106   127
YourSeq  72     676    747   2000   100.0%    chr15  +       91674471   91674542   72
YourSeq  63     676    755   2000   89.1%     chr3   +       123997594  123997769  176
YourSeq  62     678    761   2000   75.7%     chr1   -       113149509  113149582  74
YourSeq  47     726    786   2000   94.6%     chr17  +       30021391   30021475   85
YourSeq  40     718    764   2000   95.6%     chr8   -       60064910   60064960   51
YourSeq  40     680    724   2000   97.8%     chr4   -       143017428  143017492  65
YourSeq  40     707    746   2000   100.0%    chr16  -       54325786   54325825   40
YourSeq  40     713    755   2000   97.7%     chr12  +       105761960  105762003  44
YourSeq  38     723    762   2000   100.0%    chr3   -       99538134   99538264   131
YourSeq  38     713    754   2000   97.7%     chr2   +       39910871   39910919   49
YourSeq  38     712    759   2000   77.3%     chr18  +       78084460   78084503   44
YourSeq  36     708    748   2000   95.0%     chrX   -       157647593  157647635  43
YourSeq  35     714    754   2000   81.6%     chr4   -       10907605   10907642   38
YourSeq  34     745    780   2000   97.3%     chr2   +       149542386  149542421  36
YourSeq  33     708    748   2000   89.2%     chr6   -       92640644   92640683   40
YourSeq  33     340    779   2000   48.6%     chr17  +       70504752   70504943   192
YourSeq  30     716    748   2000   96.9%     chr6   +       95239662   95239694   33
YourSeq  30     716    748   2000   96.9%     chr18  +       52829011   52829046   36

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. Apart from the query locus itself on chr8, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  1409   1      1409  1409   100.0%    chr8   -       111346522  111347930  1409
YourSeq  126    655    826   1409   89.4%     chr2   +       38904612   38904784   173
YourSeq  123    358    795   1409   78.5%     chr5   +       110369386  110369771  386
YourSeq  116    259    821   1409   85.1%     chr5   -       123327508  123328103  596
YourSeq  116    617    824   1409   81.0%     chr9   +       56444269   56444437   169
YourSeq  114    678    834   1409   85.3%     chr11  -       34894039   34894176   138
YourSeq  106    681    828   1409   89.7%     chr10  +       61592840   61593000   161
YourSeq  105    678    821   1409   89.0%     chr7   -       27226804   27226959   156
YourSeq  104    682    828   1409   88.9%     chr8   +       87941540   87941699   160
YourSeq  102    674    824   1409   88.7%     chr5   -       149703646  149703810  165
YourSeq  102    678    805   1409   91.2%     chr5   +       125737764  125737892  129
YourSeq  101    349    794   1409   86.9%     chr16  +       94477569   94478082   514
YourSeq  100    256    745   1409   75.8%     chr11  -       61901802   61902012   211
YourSeq  100    678    821   1409   88.0%     chr8   +       46438913   46439054   142
YourSeq  100    672    792   1409   91.8%     chr15  +       12301083   12301204   122
YourSeq  97     672    828   1409   84.9%     chr12  -       102327991  102328149  159
YourSeq  97     674    795   1409   90.2%     chr1   -       180720882  180721004  123
YourSeq  96     678    816   1409   86.7%     chr17  -       34184868   34185003   136
YourSeq  96     672    795   1409   85.8%     chr10  -       82721467   82721586   120
YourSeq  94     678    795   1409   89.9%     chr7   +       30508956   30509073   118

Note: The 1409 bp section downstream of Exon 6 is BLAT searched against the genome. Apart from the query locus itself on chr8, no significant similarity is found.
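The top hit in each BLAT table gives the genomic position of the corresponding flanking section, which allows a cross-check of the stated KO region size. This is a minimal sketch, assuming the knockout region is exactly the interval lying between the downstream and upstream sections (the report does not state this explicitly). The coordinates are copied from the first row of each BLAT table; the function names are illustrative.

```python
# Genomic coordinates on chr8 (minus strand), from the first row of each BLAT table.
UPSTREAM = (111356221, 111358220)    # 2000 bp section upstream of exon 3
DOWNSTREAM = (111346522, 111347930)  # 1409 bp section downstream of exon 6


def span(start: int, end: int) -> int:
    """Length of a closed interval [start, end] in bp."""
    return end - start + 1


def gap_between(lower: tuple, upper: tuple) -> int:
    """Number of bases strictly between two non-overlapping closed intervals,
    where `lower` lies at smaller coordinates than `upper`."""
    return upper[0] - lower[1] - 1


# The gene is on the reverse strand, so the upstream section has the
# higher genomic coordinates and the downstream section the lower ones.
ko_region_bp = gap_between(DOWNSTREAM, UPSTREAM)
print(span(*UPSTREAM), span(*DOWNSTREAM), ko_region_bp)
```

Under that assumption the gap comes to 8290 bp, which agrees with the ~8290 bp effective KO region given in the strategy summary.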


Gene and information:

Fa2h fatty acid 2-hydroxylase [Mus musculus (house mouse)]
Gene ID: 338521, updated on 31-Aug-2019

Gene summary

Official Symbol: Fa2h (provided by MGI)
Official Full Name: fatty acid 2-hydroxylase (provided by MGI)
Primary source: MGI:MGI:2443327
See related: Ensembl:ENSMUSG00000033579
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: FAAH; Faxdc1; G630055L08Rik
Expression: Biased expression in stomach adult (RPKM 77.1), colon adult (RPKM 59.1) and 8 other tissues
Orthologs: human

Genomic context

Location: 8; 8 E1
Exon count: 7

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 8 | NC_000074.6 (111345138..111393821, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 8 | NC_000074.5 (113869038..113917721, complement)

[Figure: genomic context, Chromosome 8 - NC_000074.6]


Transcript information: This gene has 4 transcripts

Gene: Fa2h ENSMUSG00000033579

Description: fatty acid 2-hydroxylase [Source:MGI Symbol;Acc:MGI:2443327]
Gene Synonyms: Faxdc1, G630055L08Rik
Location: Chromosome 8: 111,345,135-111,393,824 reverse strand. GRCm38:CM001001.2
About this gene: This gene has 4 transcripts (splice variants), 203 orthologues, is a member of 1 Ensembl protein family and is associated with 34 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Fa2h-201 | ENSMUST00000038475.8 | 2492 | 372aa | ENSMUSP00000043597.8 | Protein coding | CCDS22674 | Q5MPP0 | TSL:1 GENCODE basic APPRIS P1
Fa2h-204 | ENSMUST00000162463.1 | 1566 | No protein | - | Retained intron | - | - | TSL:1
Fa2h-202 | ENSMUST00000159336.7 | 1971 | No protein | - | lncRNA | - | - | TSL:5
Fa2h-203 | ENSMUST00000162216.1 | 933 | No protein | - | lncRNA | - | - | TSL:3

[Figure: Ensembl genome browser view of a 68.69 kb region (111.34Mb to 111.40Mb), forward and reverse strands, contig AC132311.2. Transcripts shown: Mlkl-201, Mlkl-202 and Mlkl-204 (protein coding); Fa2h-201 (protein coding); Fa2h-202 and Fa2h-203 (lncRNA); Fa2h-204 (retained intron). Track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000038475

[Figure: Ensembl protein summary for Fa2h-201 (protein coding, reverse strand, 48.68 kb), ENSMUSP00000043..., scale bar 0 to 372. Annotation tracks:
Transmembrane heli...
Low complexity (Seg)
Superfamily: Cytochrome b5-like heme/steroid binding domain superfamily
SMART: Cytochrome b5-like heme/steroid binding domain
Prints: Cytochrome b5-like heme/steroid binding domain
Pfam: Cytochrome b5-like heme/steroid binding domain; Fatty acid hydroxylase
PROSITE profiles: Cytochrome b5-like heme/steroid binding domain
PROSITE patterns: Cytochrome b5, heme-binding site
PIRSF: Sterol desaturase Scs7
PANTHER: PTHR12863
Gene3D: Cytochrome b5-like heme/steroid binding domain superfamily
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: splice acceptor variant, missense variant, splice region variant, synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
