https://www.alphaknockout.com

Mouse Apoa2 Knockout Project (CRISPR/Cas9)

Objective: To create an Apoa2 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Apoa2 gene (NCBI Reference Sequence: NM_013474; Ensembl: ENSMUSG00000005681) is located on mouse chromosome 1. Four exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000005824). Exons 2~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygous null mutation of this gene results in a reduction of total cholesterol, HDL cholesterol, free fatty acids, insulin, and glucose levels in both the fasted and unfasted states. Strain-specific alleles have been associated with varying degrees of amyloidosis.

Exon 2 starts from about 0.33% of the coding region. Exons 2~4 cover 100.0% of the coding region. The size of the effective KO region is ~2717 bp. The KO region does not contain any other known gene.
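The genotyping step (PCR followed by sequencing) relies on the deletion changing the size of a PCR product that spans the KO region. The following is a minimal sketch of that arithmetic, assuming a clean deletion of the full ~2717 bp region and a primer pair that flanks it. The function name and the flank lengths (250 bp and 300 bp) are hypothetical and are not taken from this project; actual junctions after Cas9 cutting can vary, which is why sequencing follows the PCR.

```python
KO_REGION_BP = 2717  # approximate size of the effective KO region stated above


def expected_amplicons(upstream_flank_bp, downstream_flank_bp, deletion_bp=KO_REGION_BP):
    """Expected PCR product sizes for primers flanking the deleted region.

    upstream_flank_bp / downstream_flank_bp: distance from each primer's 5' end
    to the nearest edge of the deletion (hypothetical values, chosen by the user).
    """
    wildtype = upstream_flank_bp + deletion_bp + downstream_flank_bp
    knockout = upstream_flank_bp + downstream_flank_bp
    return {"wildtype": wildtype, "knockout": knockout}


# Hypothetical primer placement: 250 bp upstream and 300 bp downstream of the deletion.
sizes = expected_amplicons(250, 300)
print(sizes)
```

With these assumed flanks the wildtype product would be 3267 bp and the deletion product 550 bp, so the two alleles separate clearly on a gel.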


Overview of the Targeting Strategy

[Figure: Wildtype allele drawn 5' to 3' with exons 1 to 4 and two gRNA regions marked. Legend: exon of mouse Apoa2; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the upstream sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
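A self-alignment dot plot of this kind can be sketched as follows. This is an illustrative reconstruction, not the tool used to produce the report: it marks a dot wherever two windows of the same length match exactly on the forward strand (reverse-complement matches are not handled), and the function names and toy sequences are hypothetical.

```python
def dot_plot(seq_a, seq_b, window=15):
    """Return the set of (i, j) where the windows starting at i and j match exactly."""
    # Index every window of seq_b by its sequence for fast lookup.
    index = {}
    for j in range(len(seq_b) - window + 1):
        index.setdefault(seq_b[j:j + window], []).append(j)
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in index.get(seq_a[i:i + window], ()):
            hits.add((i, j))
    return hits


def off_diagonal_hits(seq, window=15):
    """Self-comparison hits above the main diagonal (candidate repeats)."""
    return sorted((i, j) for i, j in dot_plot(seq, seq, window) if i < j)


# Toy sequences (hypothetical, not taken from the Apoa2 locus).
repeat_unit = "ACGTTGCA"
with_repeats = "TTT" + repeat_unit * 3 + "GGG"
without_repeats = "ACGTTGCAAGGCTTAC"

print(off_diagonal_hits(with_repeats, window=8)[:3])
print(off_diagonal_hits(without_repeats, window=8))
```

In a self-comparison the main diagonal always matches, so only off-diagonal hits are informative; runs of them parallel to the diagonal are what a tandem repeat looks like in the matrix.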

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the downstream sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution plot for the upstream sequence.]

Summary: Full Length(2000bp) | A(24.4% 488) | C(22.25% 445) | T(29.35% 587) | G(24.0% 480)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution plot for the downstream sequence.]

Summary: Full Length(2000bp) | A(28.65% 573) | C(22.45% 449) | T(24.85% 497) | G(24.05% 481)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
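The base counts in the two summaries can be turned into overall GC fractions, and the windowed analysis can be sketched in a few lines. The counts below are copied from the summaries above; the function names and the `step` parameter are assumptions made for illustration, and the report does not state how its own plot was computed.

```python
def gc_fraction(counts):
    """Overall GC fraction from a dict of base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window of length `window`, moving by `step` bases."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append((chunk.count("G") + chunk.count("C")) / window)
    return values


# Base counts reported in the two summaries (2000 bp each).
upstream = {"A": 488, "C": 445, "T": 587, "G": 480}
downstream = {"A": 573, "C": 449, "T": 497, "G": 481}

print(f"upstream GC:   {gc_fraction(upstream):.2%}")
print(f"downstream GC: {gc_fraction(downstream):.2%}")
```

Both regions come out near 46% GC overall, which is consistent with the notes that no high GC-content region was found, although a per-window profile is needed to rule out local peaks.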


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   +       171223316  171225315  2000
YourSeq  127    1135   1349  2000   93.9%     chr10  -       24723126   24723354   229
YourSeq  82     259    340   2000   100.0%    chr16  -       52843736   52843817   82
YourSeq  71     38     205   2000   75.9%     chr16  +       21683445   21683555   111
YourSeq  65     119    214   2000   92.3%     chr3   -       121164923  121165034  112
YourSeq  65     334    445   2000   82.2%     chr12  -       55119412   55119528   117
YourSeq  61     334    445   2000   80.2%     chr12  -       55268717   55268833   117
YourSeq  61     156    224   2000   94.3%     chr15  +       64633115   64633183   69
YourSeq  57     104    213   2000   78.3%     chr2   -       132721996  132722083  88
YourSeq  55     158    224   2000   91.1%     chr6   -       103556366  103556432  67
YourSeq  55     158    228   2000   88.8%     chr12  -       47585765   47585835   71
YourSeq  53     158    214   2000   96.5%     chr13  -       99347589   99347645   57
YourSeq  53     158    214   2000   96.5%     chr5   +       150512622  150512678  57
YourSeq  53     158    214   2000   96.5%     chr19  +       36787701   36787757   57
YourSeq  52     165    223   2000   91.0%     chr11  +       102786435  102786491  57
YourSeq  51     158    214   2000   94.8%     chr14  -       48639398   48639454   57
YourSeq  51     160    214   2000   92.6%     chr6   +       146546179  146546232  54
YourSeq  51     158    214   2000   94.8%     chr15  +       73203881   73203937   57
YourSeq  51     158    214   2000   94.8%     chr10  +       78556608   78556664   57
YourSeq  50     1128   1180  2000   98.1%     chr9   -       67427649   67427702   54

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   +       171226265  171228264  2000
YourSeq  259    574    1039  2000   89.9%     chr2   +       22879350   22879857   508
YourSeq  230    565    874   2000   91.1%     chr8   +       83304921   83383152   78232
YourSeq  222    568    866   2000   90.1%     chr2   -       170689823  170690412  590
YourSeq  221    568    875   2000   91.2%     chr12  +       74374250   74374591   342
YourSeq  220    568    846   2000   92.7%     chr1   -       182159708  182647492  487785
YourSeq  217    568    849   2000   92.3%     chr13  +       58954667   58955030   364
YourSeq  210    585    866   2000   91.5%     chr10  -       60517626   60517957   332
YourSeq  210    585    874   2000   90.9%     chr7   +       133226656  133226973  318
YourSeq  208    568    849   2000   91.1%     chr18  +       49970447   49970760   314
YourSeq  207    590    866   2000   89.4%     chrX   -       40286934   40287262   329
YourSeq  207    585    861   2000   90.7%     chr6   -       49362948   49363256   309
YourSeq  206    568    849   2000   89.4%     chr1   -       88755493   88755832   340
YourSeq  205    585    866   2000   93.0%     chr9   -       80058186   80058495   310
YourSeq  205    590    865   2000   88.0%     chr14  +       106073899  106074196  298
YourSeq  201    590    861   2000   91.5%     chr13  -       57070722   57071020   299
YourSeq  200    595    849   2000   91.8%     chr16  +       89293976   89294257   282
YourSeq  199    585    843   2000   91.8%     chr16  -       35016606   35016927   322
YourSeq  199    590    866   2000   90.5%     chr18  +       35002922   35003225   304
YourSeq  198    585    861   2000   89.6%     chr1   -       168279794  168280092  299

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
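To work with the BLAT results programmatically, each hit row can be parsed into a small record. The sketch below is illustrative only: `BlatHit`, `parse_hit` and the helper functions are hypothetical names, and the parser assumes the whitespace-separated column order shown in the tables above (score, query start and end, query size, identity, chromosome, strand, target start and end, span). The two rows are copied from the upstream results.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one hit row; any leading labels (e.g. the query name) are ignored."""
    f = row.split()[-10:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def span_is_consistent(hit: BlatHit) -> bool:
    """SPAN should equal the inclusive length of the target interval."""
    return hit.t_end - hit.t_start + 1 == hit.span


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query sequence covered by the hit."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


top = parse_hit("YourSeq 2000 1 2000 2000 100.0% chr1 + 171223316 171225315 2000")
second = parse_hit("YourSeq 127 1135 1349 2000 93.9% chr10 - 24723126 24723354 229")

for hit in (top, second):
    print(hit.chrom, hit.score, span_is_consistent(hit), f"{query_coverage(hit):.1%}")
```

Apart from the self-hit on chr1, the second-best upstream hit covers only about a tenth of the query, which is the kind of figure behind the note that no significant similarity was found.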


Gene and information: Apoa2 apolipoprotein A-II [ Mus musculus (house mouse) ] Gene ID: 11807, updated on 10-Oct-2019

Gene summary

Official Symbol: Apoa2 (provided by MGI)
Official Full Name: apolipoprotein A-II (provided by MGI)
Primary source: MGI:MGI:88050
See related: Ensembl:ENSMUSG00000005681
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Alp-2; Hdl-1; ApoAII; Apoa-2; Apo-AII; ApoA-II
Summary: This gene encodes a component of high density lipoproteins (HDL). Mice lacking the encoded protein have low HDL-cholesterol levels, smaller HDL particles, increased clearance of triglyceride-rich lipoproteins and insulin hypersensitivity. Transgenic mice overexpressing the encoded protein have elevated levels of HDL-cholesterol and show increased susceptibility to atherosclerosis. Alternative splicing of this gene results in multiple variants. [provided by RefSeq, Mar 2015]
Expression: Biased expression in liver adult (RPKM 4143.8), liver E18 (RPKM 1768.2) and 3 other tissues
Orthologs: human, all

Genomic context

Location: 1 H3; 1 79.22 cM
Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (171221564..171226379)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (173155185..173156510)

Chromosome 1 - NC_000067.6


Transcript information: This gene has 4 transcripts

Gene: Apoa2 ENSMUSG00000005681

Description: apolipoprotein A-II [Source:MGI Symbol;Acc:MGI:88050]
Gene Synonyms: Alp-2, ApoA-II, Apoa-2, Hdl-1
Location: Chromosome 1: 171,225,054-171,226,379 forward strand. GRCm38:CM000994.2
About this gene: This gene has 4 transcripts (splice variants), 83 orthologues, is a member of 1 Ensembl protein family and is associated with 8 phenotypes.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Apoa2-204 ENSMUST00000111321.7 518 102aa ENSMUSP00000106953.1 Protein coding CCDS35773 P09813 TSL:2 GENCODE basic APPRIS P1

Apoa2-201 ENSMUST00000005824.11 492 102aa ENSMUSP00000005824.5 Protein coding CCDS35773 P09813 TSL:1 GENCODE basic APPRIS P1

Apoa2-203 ENSMUST00000111320.7 470 102aa ENSMUSP00000106952.1 Protein coding CCDS35773 P09813 TSL:2 GENCODE basic APPRIS P1

Apoa2-202 ENSMUST00000111319.1 458 102aa ENSMUSP00000106951.1 Protein coding CCDS35773 P09813 TSL:3 GENCODE basic APPRIS P1


[Figure: Ensembl genome browser view of a 21.33 kb region of chromosome 1 (171.220 Mb to 171.235 Mb), comprehensive gene set.
Forward strand: Nr1i3 (Nr1i3-201, -202, -203 and -208 protein coding; -205, -206 and -207 retained intron; -204 nonsense mediated decay) and Apoa2 (Apoa2-201, -202, -203 and -204, all protein coding).
Contigs: AC163497.14, AC084821.26.
Reverse strand: Tomm40l (Tomm40l-201, -202, -203, -205 and -207 protein coding; -204 retained intron; -206 nonsense mediated decay; -208 lncRNA), Fcer1g (Fcer1g-201 and -203 protein coding; -202 retained intron) and Ndufs2 (Ndufs2-201 and -202 protein coding; -204 and -205 retained intron; -206 lncRNA).
Regulatory Build legend: CTCF; Enhancer; Promoter; Promoter Flank.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000005824

[Figure: Apoa2-201 (protein coding), 1.30 kb, forward strand.
Protein ENSMUSP00000005... with domain tracks: Cleavage site (Sign...; Superfamily: Apolipoprotein A-II (ApoA-II) superfamily; Pfam: Apolipoprotein A-II (ApoA-II); PANTHER: Apolipoprotein A-II (ApoA-II).
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.
Scale bar: 0 to 102.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
