https://www.alphaknockout.com

Mouse Soat2 Knockout Project (CRISPR/Cas9)

Objective: To create a Soat2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Soat2 gene (NCBI Reference Sequence: NM_146064; Ensembl: ENSMUSG00000023045) is located on mouse chromosome 15. Fifteen exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 15 (Transcript: ENSMUST00000023806). Exons 3~13 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygous mutant animals exhibit elevated serum triglyceride levels and are resistant to fatty liver, hyperlipidemia, and gallstone development when fed a high-fat, high-cholesterol diet. When fed a Western diet, homozygous mutant animals exhibit elevated HDL levels.

Exon 3 starts at about 8.63% of the coding region. Exons 3~13 cover 79.11% of the coding region. The size of the effective KO region is ~8997 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', with the exons of mouse Soat2 numbered and the two gRNA regions marked. Legend: exon of mouse Soat2; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 1986 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
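The self-alignment described in the note can be sketched as follows. This is a minimal illustration only, assuming exact matching of 15 bp words on both strands; the tool actually used to draw the dot plot is not named in this report, and the function names `reverse_complement` and `self_dot_plot` are hypothetical.

```python
from typing import Dict, List, Tuple

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def self_dot_plot(
    seq: str, window: int = 15
) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Compare a sequence with itself using exact word matches.

    Returns two lists of (i, j) dots (0-based window starts):
    - forward: the word at i equals the word at j, with i != j
      (the trivial main diagonal is left out);
    - reverse: the word at i equals the reverse complement of the word at j.
    Runs of forward dots parallel to the diagonal suggest tandem repeats.
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1

    # Index every word of length `window` by its start positions.
    index: Dict[str, List[int]] = {}
    for i in range(n_words):
        index.setdefault(seq[i:i + window], []).append(i)

    forward: List[Tuple[int, int]] = []
    reverse: List[Tuple[int, int]] = []
    for j in range(n_words):
        word = seq[j:j + window]
        forward.extend((i, j) for i in index[word] if i != j)
        reverse.extend((i, j) for i in index.get(reverse_complement(word), []))
    return forward, reverse
```

With a sequence made of a repeated 20 bp unit, the forward list contains off-diagonal dots such as (0, 20), which is the kind of signal that would lead a gRNA site to be placed outside the repeat.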

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 409 bp section downstream of Exon 13 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(1986bp) | A(28.1% 558) | C(22.0% 437) | T(27.24% 541) | G(22.66% 450)

Note: The 1986 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
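A sliding-window GC profile of the kind plotted here can be sketched as below. This is an illustration under assumptions: the report gives the window size (300 bp) but not the step or the plotting tool, so a step of 1 bp is assumed and the function name `gc_windows` is hypothetical.

```python
from typing import List


def gc_windows(seq: str, window: int = 300) -> List[float]:
    """Return the GC fraction of every window of `window` bp (step of 1 bp).

    The running count is updated in constant time per step by subtracting
    the base that leaves the window and adding the base that enters it.
    An empty list is returned when the sequence is shorter than the window.
    """
    seq = seq.upper()
    if window <= 0:
        raise ValueError("window must be positive")
    if len(seq) < window:
        return []

    gc_count = sum(base in "GC" for base in seq[:window])
    fractions = [gc_count / window]
    for start in range(1, len(seq) - window + 1):
        gc_count -= seq[start - 1] in "GC"           # base leaving the window
        gc_count += seq[start + window - 1] in "GC"  # base entering the window
        fractions.append(gc_count / window)
    return fractions
```

A region would typically be flagged when the profile stays high over a long stretch; the threshold that counts as "significant" is not stated in this report, so none is hard-coded here.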

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(409bp) | A(25.18% 103) | C(20.29% 83) | T(21.03% 86) | G(33.5% 137)

Note: The 409 bp section downstream of Exon 13 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
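The two composition summaries can be cross-checked from their base counts alone. The sketch below uses only the counts printed in the summaries; the function name `composition` is hypothetical, and rounding to two decimals is an assumption made to match the printed figures.

```python
from typing import Dict, Tuple


def composition(counts: Dict[str, int]) -> Tuple[int, Dict[str, float], float]:
    """Return (total length, per-base percentages, overall GC percentage)."""
    total = sum(counts.values())
    percents = {base: round(100 * n / total, 2) for base, n in counts.items()}
    gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
    return total, percents, gc_percent


# Base counts copied from the two summary lines of this report.
UPSTREAM_COUNTS = {"A": 558, "C": 437, "T": 541, "G": 450}
DOWNSTREAM_COUNTS = {"A": 103, "C": 83, "T": 86, "G": 137}

up_total, up_percents, up_gc = composition(UPSTREAM_COUNTS)
down_total, down_percents, down_gc = composition(DOWNSTREAM_COUNTS)
```

The counts sum to 1986 bp and 409 bp respectively, and the overall GC content comes out at about 44.66% for the upstream section and 53.79% for the downstream section.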


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1986   1      1986  1986   100.0%    chr15  +       102151163  102153148  1986
YourSeq  150    668    1192  1986   90.7%     chr5   -       127198838  127199389  552
YourSeq  141    576    1192  1986   83.3%     chr3   +       63770637   63771019   383
YourSeq  137    1076   1342  1986   90.6%     chr12  +       93347507   93348086   580
YourSeq  134    451    613   1986   94.2%     chr11  +       69011511   69011674   164
YourSeq  126    724    1185  1986   93.1%     chr1   -       61575817   61576276   460
YourSeq  126    672    864   1986   90.1%     chr8   +       44364741   44365111   371
YourSeq  121    1065   1192  1986   97.7%     chr2   -       133822768  133822896  129
YourSeq  121    1060   1192  1986   96.3%     chr2   +       136599490  136599624  135
YourSeq  120    1045   1192  1986   89.8%     chr1   +       135620678  135620816  139
YourSeq  119    1044   1192  1986   89.0%     chrY   +       1566088    1566228    141
YourSeq  119    1044   1192  1986   89.0%     chrY   +       1646592    1646732    141
YourSeq  118    1066   1192  1986   96.9%     chr18  -       13945851   13945978   128
YourSeq  118    1066   1192  1986   96.9%     chr13  -       93917276   93917403   128
YourSeq  118    1064   1192  1986   96.9%     chr17  +       7594199    7594327    129
YourSeq  117    1065   1192  1986   96.1%     chr4   -       19213727   19213855   129
YourSeq  117    1068   1192  1986   97.6%     chr3   -       56832503   56832630   128
YourSeq  117    1067   1192  1986   96.9%     chr1   -       128291240  128291366  127
YourSeq  117    1045   1192  1986   90.1%     chrX   +       38393799   38393939   141
YourSeq  116    1064   1192  1986   95.4%     chr3   -       118820128  118820257  130

Note: The 1986 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  409    1      409   409    100.0%    chr15  +       102162146  102162554  409
YourSeq  27     361    399   409    96.7%     chr12  -       29235535   29235577   43
YourSeq  26     167    195   409    85.2%     chr13  -       111671312  111671338  27
YourSeq  24     167    190   409    100.0%    chr17  -       8004660    8004683    24
YourSeq  24     169    195   409    84.0%     chr1   -       106228606  106228630  25
YourSeq  23     167    192   409    96.0%     chr10  +       67630773   67630802   30
YourSeq  22     167    188   409    100.0%    chrX   -       113470540  113470561  22
YourSeq  22     167    188   409    100.0%    chr5   -       116481430  116481451  22
YourSeq  22     60     81    409    100.0%    chr2   -       81972950   81972971   22
YourSeq  21     167    188   409    100.0%    chr6   -       117976889  117976911  23
YourSeq  21     313    334   409    100.0%    chr13  +       29288161   29288183   23
YourSeq  21     175    195   409    100.0%    chr12  +       111330327  111330347  21
YourSeq  20     171    190   409    100.0%    chr13  +       7295565    7295584    20
YourSeq  20     173    194   409    95.5%     chr12  +       10691710   10691731   22

Note: The 409 bp section downstream of Exon 13 is BLAT searched against the genome. No significant similarity is found.
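The top hit of each BLAT search places the two flanking sections on chr15, which allows a quick consistency check against the stated KO region size. The sketch below assumes the BLAT coordinates are 1-based and inclusive (each span equals end - start + 1) and that the KO region is the interval lying between the two flanks; the second point is an inference, not something this report states. The names `Interval`, `length` and `gap_between` are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def length(interval: Interval) -> int:
    """Number of bases covered by an inclusive interval."""
    return interval.end - interval.start + 1


def gap_between(left: Interval, right: Interval) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    if left.chrom != right.chrom:
        raise ValueError("intervals are on different chromosomes")
    if right.start <= left.end:
        raise ValueError("right interval must start after left interval ends")
    return right.start - left.end - 1


# Coordinates copied from the 100.0% identity rows of the two BLAT tables.
UPSTREAM = Interval("chr15", 102151163, 102153148)
DOWNSTREAM = Interval("chr15", 102162146, 102162554)

ko_region_size = gap_between(UPSTREAM, DOWNSTREAM)
```

Under these assumptions the flanks measure 1986 bp and 409 bp, and the interval between them is 8997 bp, which agrees with the effective KO region size given in the strategy summary.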


Gene and protein information:

Soat2 sterol O-acyltransferase 2 [Mus musculus (house mouse)]

Gene ID: 223920, updated on 21-Aug-2019

Gene summary

Official Symbol: Soat2 (provided by MGI)
Official Full Name: sterol O-acyltransferase 2 (provided by MGI)
Primary source: MGI:MGI:1332226
See related: Ensembl:ENSMUSG00000023045
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: ACAT2; D15Wsu97e
Expression: Biased expression in duodenum adult (RPKM 116.4), small intestine adult (RPKM 111.2) and 6 other tissues
Orthologs: human

Genomic context

Location: 15 F2; 15 57.33 cM
Exon count: 15

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (102150415..102163469)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (101981006..101993867)

Chromosome 15 - NC_000081.6


Transcript information: This gene has 2 transcripts

Gene: Soat2 ENSMUSG00000023045

Description: sterol O-acyltransferase 2 [Source:MGI Symbol;Acc:MGI:1332226]
Gene Synonyms: ACAT2, D15Wsu97e
Location: Chromosome 15: 102,150,526-102,163,469 forward strand. GRCm38:CM001008.2
About this gene: This gene has 2 transcripts (splice variants), 170 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 11 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein  Translation ID        Biotype                  CCDS       UniProt  Flags
Soat2-201  ENSMUST00000023806.13  2208  525aa    ENSMUSP00000023806.6  Protein coding           CCDS27872  O88908   TSL:1, GENCODE basic, APPRIS P1
Soat2-202  ENSMUST00000160465.1   1628  284aa    ENSMUSP00000124628.1  Nonsense mediated decay  -          F6SMH4   CDS 5' incomplete, TSL:5

[Figure: Ensembl region view, 32.94 kb, 102.15Mb to 102.17Mb, forward and reverse strands. Gene tracks: Igfbp6-201 (protein coding), Soat2-201 (protein coding), Soat2-202 (nonsense mediated decay). Contigs: AC110512.13, AC123791.4. Regulatory Build legend: CTCF, Enhancer, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000023806

[Figure: Ensembl transcript and protein view of Soat2-201 (protein coding), 12.94 kb, forward strand, scale bar 0 to 525. Feature tracks as labeled: ENSMUSP00000023...; Transmembrane heli...; MobiDB lite; Low complexity (Seg); Pfam: Membrane bound O-acyl transferase, MBOAT; PIRSF: Sterol O-acyltransferase, metazoa; Sterol O-acyltransferase, ACAT/DAG/ARE types; PANTHER: PTHR10408:SF10; Sterol O-acyltransferase, ACAT/DAG/ARE types; All sequence SNPs/i...: Sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
