https://www.alphaknockout.com

Mouse Hdlbp Knockout Project (CRISPR/Cas9)

Objective: To create a Hdlbp knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hdlbp gene (NCBI Reference Sequence: NM_133808; Ensembl: ENSMUSG00000034088) is located on mouse chromosome 1. A total of 28 exons are identified, with the ATG start codon in exon 3 and the TAA stop codon in exon 28 (Transcript: ENSMUST00000170883). Exons 3~7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: The coding region starts in exon 3. Exons 3~7 cover 22.95% of the coding region. The size of the effective KO region is ~9838 bp. The KO region does not contain any other known gene.
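As a rough illustration of the PCR genotyping step, the sketch below estimates amplicon sizes for a primer pair that flanks the knockout region. Only the ~9838 bp KO size comes from this report; the primer offsets (UP_FLANK_BP, DOWN_FLANK_BP) and the function name are hypothetical placeholders, and real alleles may carry small indels at the junction that shift the knockout band slightly.

```python
# Approximate effective KO region size, taken from the strategy note.
KO_REGION_BP = 9838

# Hypothetical primer distances from the two cut sites (not from this report).
UP_FLANK_BP = 300
DOWN_FLANK_BP = 350


def expected_amplicons(ko_bp, up_bp, down_bp):
    """Return expected PCR product sizes for primers flanking the deletion."""
    return {
        "wildtype": up_bp + ko_bp + down_bp,  # deletion absent: full span amplified
        "knockout": up_bp + down_bp,          # deletion present: flanks joined
    }


sizes = expected_amplicons(KO_REGION_BP, UP_FLANK_BP, DOWN_FLANK_BP)
print(sizes)
```

With flanking primers of this kind the wildtype product is usually too long for routine PCR, so a second primer placed inside the deleted region is commonly added to detect the wildtype allele; that design choice is an assumption here, not a statement about this project's protocol.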


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 3, 4, 5, 6, 7 and 28 of mouse Hdlbp, with a gRNA region on each side of exons 3~7. Legend: exon of mouse Hdlbp; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 1393 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 650 bp section downstream of Exon 7 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
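The self-alignment behind a dot plot can be sketched as follows: every 15 bp word is compared with every other 15 bp word in the same sequence, and identical words at different positions become off-diagonal dots. This is a minimal sketch assuming exact word matches on the forward strand only; the function names, the max_offset threshold and the toy sequences are illustrative and are not taken from the tool used for this report.

```python
def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


def has_tandem_repeat(seq, window=15, max_offset=50):
    """Flag repeats whose copies lie close together (dots near the main diagonal)."""
    return any(j - i <= max_offset for i, j in dot_plot_matches(seq, window))


# Toy inputs (hypothetical sequences, not the Hdlbp flanks).
unit = "ACGTTGCAAGGCTTAC"                      # 16 nt repeat unit
tandem = unit * 3                               # three tandem copies
unique = "ACGTTGCAAGGCTTACGGATCCATTGACCA"       # no repeated 15-mer

print(has_tandem_repeat(tandem), has_tandem_repeat(unique))
```

A gRNA or primer site would then be chosen outside any positions reported by dot_plot_matches, which mirrors the reasoning in the notes above.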


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(1393bp) | A(25.99% 362) | C(19.6% 273) | T(33.17% 462) | G(21.25% 296)

Note: The 1393 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(650bp) | A(24.92% 162) | C(20.92% 136) | T(30.92% 201) | G(23.23% 151)

Note: The 650 bp section downstream of Exon 7 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
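The overall GC content implied by the two summaries can be checked directly from the reported base counts, and the windowed profile can be approximated with a simple sliding window. The sketch below assumes a 300 bp window as stated above; the step size, function names and the toy sequence are illustrative assumptions.

```python
def gc_percent_from_counts(a, c, t, g):
    """Overall GC percentage from per-base counts."""
    return 100.0 * (g + c) / (a + c + t + g)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window along seq (upper-case A/C/G/T expected)."""
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values


# Base counts copied from the two summary lines in this section.
up_gc = gc_percent_from_counts(a=362, c=273, t=462, g=296)    # 1393 bp upstream
down_gc = gc_percent_from_counts(a=162, c=136, t=201, g=151)  # 650 bp downstream

print(f"upstream GC: {up_gc:.2f}%  downstream GC: {down_gc:.2f}%")
```

Both regions come out at roughly 41% and 44% GC, which is consistent with the notes that no high GC-content region is present.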


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  1393   1       1393  1393   100.0%    chr1   -       93440756   93442148   1393
YourSeq  72     265     376   1393   85.5%     chr9   -       114546202  114546315  114
YourSeq  71     262     422   1393   84.3%     chr2   +       164934115  164934270  156
YourSeq  66     308     421   1393   87.5%     chr17  -       23753914   23754030   117
YourSeq  64     57      421   1393   71.7%     chr10  +       94742773   94742923   151
YourSeq  61     329     429   1393   93.2%     chr4   -       118089547  118089665  119
YourSeq  61     328     424   1393   95.6%     chr16  +       88558976   88559076   101
YourSeq  59     308     422   1393   90.3%     chr7   +       79886100   79886215   116
YourSeq  58     264     360   1393   76.3%     chr18  -       20503440   20503521   82
YourSeq  57     329     424   1393   90.2%     chr15  +       15854977   15855077   101
YourSeq  55     309     422   1393   88.6%     chr16  +       18293641   18293757   117
YourSeq  55     319     424   1393   89.9%     chr14  +       89147011   89147121   111
YourSeq  50     308     376   1393   87.1%     chr15  +       83014589   83014653   65
YourSeq  50     341     421   1393   96.3%     chr11  +       117679339  117679426  88
YourSeq  49     323     376   1393   96.3%     chr10  -       72034968   72035414   447
YourSeq  46     309     364   1393   91.1%     chr10  -       3883182    3883237    56
YourSeq  45     308     364   1393   89.5%     chr1   -       133927771  133927827  57
YourSeq  45     308     364   1393   89.5%     chrX   +       52391978   52392034   57
YourSeq  45     308     364   1393   89.5%     chr8   +       40830185   40830241   57
YourSeq  43     341     424   1393   95.8%     chr11  -       106607580  106607836  257

Note: The 1393 bp section upstream of Exon 3 is BLAT-searched against the genome. Apart from the 100% self-match on chr1, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  650    1       650   650    100.0%    chr1   -       93430305   93430954   650
YourSeq  25     362     390   650    96.3%     chr11  -       70870079   70870108   30
YourSeq  25     434     463   650    85.2%     chr2   +       37982456   37982483   28
YourSeq  24     359     390   650    87.5%     chrX   +       162608251  162608282  32
YourSeq  22     372     397   650    92.4%     chr11  +       116187246  116187271  26
YourSeq  21     225     247   650    95.7%     chr14  +       111761644  111761666  23
YourSeq  20     261     280   650    100.0%    chr16  -       66041194   66041213   20

Note: The 650 bp section downstream of Exon 7 is BLAT-searched against the genome. Apart from the 100% self-match on chr1, no significant similarity is found.
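One way to read the BLAT tables is to set aside the full-length self-match and look at the strongest remaining hit. The sketch below parses a few rows copied from the upstream table and reports the best off-target score as a fraction of the query length. The row layout (query name, score, query start/end, query size, identity, chromosome, strand, target start/end, span) follows the tables above; the class, the function names and the self-match rule are illustrative assumptions rather than a fixed significance threshold.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated BLAT summary row."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_match(hit):
    """Assumed rule: the query aligned end to end at 100% identity."""
    return hit.identity == 100.0 and hit.q_start == 1 and hit.q_end == hit.q_size


# A subset of rows from the upstream (1393 bp) table.
ROWS = """\
YourSeq 1393 1 1393 1393 100.0% chr1 - 93440756 93442148 1393
YourSeq 72 265 376 1393 85.5% chr9 - 114546202 114546315 114
YourSeq 71 262 422 1393 84.3% chr2 + 164934115 164934270 156
YourSeq 43 341 424 1393 95.8% chr11 - 106607580 106607836 257
"""

hits = [parse_hit(row) for row in ROWS.splitlines()]
off_target = [h for h in hits if not is_self_match(h)]
best = max(off_target, key=lambda h: h.score)
print(best.chrom, best.score, f"{100 * best.score / best.q_size:.1f}% of query")
```

In this subset the best off-target hit scores 72 against a 1393 bp query, which is in line with the note that no significant similarity is found.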


Gene and information: Hdlbp high density lipoprotein (HDL) binding protein [ Mus musculus (house mouse) ] Gene ID: 110611, updated on 24-Oct-2019

Gene summary

Official Symbol: Hdlbp (provided by MGI)
Official Full Name: high density lipoprotein (HDL) binding protein (provided by MGI)
Primary source: MGI:MGI:99256
See related: Ensembl:ENSMUSG00000034088
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AA960365; AI118566; D1Ertd101e; 1110005P14Rik
Expression: Ubiquitous expression in placenta adult (RPKM 70.8), limb E14.5 (RPKM 50.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 1 D; 1 47.24 cM
Exon count: 32

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (93405940..93478917, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (95302517..95375385, complement)

Chromosome 1 - NC_000067.6
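The coordinates in this report can be cross-checked with simple interval arithmetic. The sketch below uses the GRCm38 gene interval from the table above and the chr1 self-match coordinates of the two flanking sections from the BLAT results; it confirms that both flanks fall inside the annotated gene and that, because the gene lies on the complement strand, the section upstream of exon 3 has the higher genomic coordinates. The tuple layout and the helper names are illustrative assumptions.

```python
# (chromosome, start, end), 1-based inclusive coordinates on GRCm38.
GENE = ("chr1", 93405940, 93478917)        # NC_000067.6, complement strand
UP_FLANK = ("chr1", 93440756, 93442148)    # 1393 bp section upstream of exon 3
DOWN_FLANK = ("chr1", 93430305, 93430954)  # 650 bp section downstream of exon 7


def span(interval):
    """Length of an inclusive interval in bp."""
    _, start, end = interval
    return end - start + 1


def contains(outer, inner):
    """True if inner lies entirely within outer on the same chromosome."""
    return (outer[0] == inner[0]
            and outer[1] <= inner[1]
            and inner[2] <= outer[2])


print("gene span (bp):", span(GENE))
print("flanks inside gene:", contains(GENE, UP_FLANK), contains(GENE, DOWN_FLANK))
# On the complement strand, "upstream" means larger genomic coordinates.
print("upstream flank has higher coordinates:", UP_FLANK[1] > DOWN_FLANK[2])
```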


Transcript information: This gene has 11 transcripts

Gene: Hdlbp ENSMUSG00000034088

Description: high density lipoprotein (HDL) binding protein [Source:MGI Symbol;Acc:MGI:99256]
Gene Synonyms: 1110005P14Rik, D1Ertd101e
Location: Chromosome 1: 93,405,940-93,478,815 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 11 transcripts (splice variants), 202 orthologues, 4 paralogues and is a member of 1 Ensembl protein family.

Transcripts:

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt         Flags
Hdlbp-202  ENSMUST00000170883.7   6231  1268aa      ENSMUSP00000127903.1  Protein coding   CCDS15189  Q3U4Z7 Q8VDJ3   TSL:1, GENCODE basic, APPRIS P1
Hdlbp-201  ENSMUST00000042498.13  4427  1268aa      ENSMUSP00000043047.7  Protein coding   CCDS15189  Q3U4Z7 Q8VDJ3   TSL:1, GENCODE basic, APPRIS P1
Hdlbp-204  ENSMUST00000186164.6   3808  1199aa      ENSMUSP00000139671.1  Protein coding   -          A0A087WP83      TSL:5, GENCODE basic
Hdlbp-209  ENSMUST00000189025.6   850   129aa       ENSMUSP00000140399.1  Protein coding   -          A0A087WQY9      CDS 3' incomplete, TSL:5
Hdlbp-208  ENSMUST00000188988.6   740   178aa       ENSMUSP00000140946.1  Protein coding   -          A0A087WS92      CDS 3' incomplete, TSL:2
Hdlbp-205  ENSMUST00000186787.6   671   141aa       ENSMUSP00000139719.1  Protein coding   -          A0A087WPC5      CDS 3' incomplete, TSL:5
Hdlbp-211  ENSMUST00000190321.6   528   22aa        ENSMUSP00000139448.1  Protein coding   -          A0A087WNQ4      CDS 3' incomplete, TSL:5
Hdlbp-206  ENSMUST00000188165.1   341   22aa        ENSMUSP00000139777.1  Protein coding   -          A0A087WPH2      CDS 3' incomplete, TSL:3
Hdlbp-207  ENSMUST00000188844.1   3208  No protein  -                     Retained intron  -          -               TSL:NA
Hdlbp-210  ENSMUST00000189951.6   2172  No protein  -                     Retained intron  -          -               TSL:1
Hdlbp-203  ENSMUST00000185349.6   630   No protein  -                     Retained intron  -          -               TSL:2
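The transcript table can be filtered programmatically to see which isoform a design would most plausibly be based on. The sketch below keeps protein-coding transcripts flagged APPRIS P1 and prefers the longest protein, then the longest transcript. This selection rule is an assumption made for illustration, not a description of how the reference transcript was chosen for this project; the rows are a subset copied from the table, and the class and function names are hypothetical.

```python
from typing import NamedTuple, Tuple


class Transcript(NamedTuple):
    name: str
    transcript_id: str
    length_bp: int
    protein_aa: int          # 0 when the transcript encodes no protein
    biotype: str
    flags: Tuple[str, ...]


# Subset of the transcript table in this report.
TRANSCRIPTS = [
    Transcript("Hdlbp-202", "ENSMUST00000170883.7", 6231, 1268,
               "Protein coding", ("TSL:1", "GENCODE basic", "APPRIS P1")),
    Transcript("Hdlbp-201", "ENSMUST00000042498.13", 4427, 1268,
               "Protein coding", ("TSL:1", "GENCODE basic", "APPRIS P1")),
    Transcript("Hdlbp-204", "ENSMUST00000186164.6", 3808, 1199,
               "Protein coding", ("TSL:5", "GENCODE basic")),
    Transcript("Hdlbp-207", "ENSMUST00000188844.1", 3208, 0,
               "Retained intron", ("TSL:NA",)),
]


def pick_reference(transcripts):
    """Assumed rule: APPRIS P1 protein-coding, longest protein, then longest transcript."""
    candidates = [t for t in transcripts
                  if t.biotype == "Protein coding" and "APPRIS P1" in t.flags]
    return max(candidates, key=lambda t: (t.protein_aa, t.length_bp))


reference = pick_reference(TRANSCRIPTS)
print(reference.name, reference.transcript_id)
```

Under this rule the result is Hdlbp-202 (ENSMUST00000170883), the same transcript named in the strategy summary.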

[Figure: Ensembl genome browser view of chromosome 1, 93.40Mb to 93.48Mb (92.88 kb).
Forward strand, genes (Comprehensive set...): Ano7-201, Ano7-202, Ano7-203 (protein coding); Gm17415-201 (processed pseudogene); Sept2-210, Sept2-213 (retained intron); Sept2-201, -202, -203, -204, -205, -206, -207, -208, -211, -214, -215, -216 (protein coding).
Contigs: AC108412.8.
Reverse strand, genes (Comprehensive set...): Hdlbp-201, -202, -204, -205, -206, -208, -209, -211 (protein coding); Hdlbp-203, -207, -210 (retained intron).
Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (processed transcript; pseudogene).]


Transcript: ENSMUST00000170883

Hdlbp-202, protein coding; reverse strand, 72.86 kb.

[Figure: protein domain and variant view for ENSMUSP00000127... (scale bar 0 to 1268). Tracks:
MobiDB lite
Low complexity (Seg)
Coiled-coils (Ncoils)
Superfamily: K Homology domain, type 1 superfamily
SMART: K Homology domain
Pfam: K Homology domain, type 1
PROSITE profiles: PS50084
PANTHER: PTHR10627; PTHR10627:SF34
Gene3D: K Homology domain, type 1 superfamily
CDD: cd02394
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: missense variant; synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
