https://www.alphaknockout.com

Mouse Phyhipl Knockout Project (CRISPR/Cas9)

Objective: To create a Phyhipl knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Phyhipl gene (NCBI Reference Sequence: NM_178621; Ensembl: ENSMUSG00000037747) is located on mouse chromosome 10. Five exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 5 (Transcript: ENSMUST00000046513). Exons 2 to 4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 9.24% of the coding region. Exons 2 to 4 cover 43.56% of the coding region. The size of the effective KO region is ~5846 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

Figure: Wildtype allele (5' to 3') showing exons 1 to 5 of mouse Phyhipl, with two gRNA regions and the knockout region marked. Legend: exon of mouse Phyhipl; knockout region.


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: dot plot of the sequence against itself, showing forward and reverse complement matches.

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
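The self-alignment described in this note can be illustrated with a minimal Python sketch. It assumes that a dot is drawn wherever a 15 bp window matches exactly, on either the forward or the reverse complement strand. The report does not name the tool or its scoring, so the function names (`revcomp`, `self_dot_plot`) and the exact-match rule are assumptions made for illustration only.

```python
# Hypothetical helper names; exact-match windows are an assumption, not the report's method.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact matches of `window` bp.

    Returns two lists of (i, j) start positions:
    forward dots (identical windows, main diagonal excluded) and
    reverse dots (window at j equals the reverse complement of the window at i).
    """
    seq = seq.upper()
    last = len(seq) - window + 1
    positions = {}
    for i in range(last):
        positions.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for j in range(last):
        word = seq[j:j + window]
        forward.extend((i, j) for i in positions[word] if i != j)
        reverse.extend((i, j) for i in positions.get(revcomp(word), []))
    return forward, reverse


# Toy input: a 7 bp unit repeated three times produces off-diagonal forward dots,
# which is the pattern a tandem repeat would leave in the plot.
toy_forward, toy_reverse = self_dot_plot("ACGGTCA" * 3, window=7)
print(len(toy_forward), "forward dots off the main diagonal")
```

In a plot of these points, a tandem repeat appears as lines parallel to the main diagonal. A region with no such lines corresponds to the "no significant tandem repeat" conclusion stated in the note.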

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: dot plot of the sequence against itself, showing forward and reverse complement matches.

Note: The 2000 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

Figure: GC content distribution along the sequence.

Summary: Full Length(2000bp) | A(28.35% 567) | C(18.3% 366) | T(30.75% 615) | G(22.6% 452)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
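As a rough illustration of the GC analysis, the sketch below computes the overall GC fraction from the base counts in the upstream summary line and defines a sliding-window GC profile with the 300 bp window stated above. The function names (`gc_fraction`, `sliding_gc`) and the `step` parameter are assumptions for illustration. The report does not say how its plot was generated.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each `window`-bp slice, moving `step` bp at a time."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the upstream summary line (2000 bp in total).
upstream_counts = {"A": 567, "C": 366, "T": 615, "G": 452}
total = sum(upstream_counts.values())
overall_gc = (upstream_counts["G"] + upstream_counts["C"]) / total
print(f"{total} bp, overall GC = {overall_gc:.2%}")  # 2000 bp, overall GC = 40.90%
```

The counts sum to the stated 2000 bp and give an overall GC content of 40.9%. A per-window profile would need the sequence itself, which the report does not include.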

Overview of the GC Content Distribution (down)

Window size: 300 bp

Figure: GC content distribution along the sequence.

Summary: Full Length(2000bp) | A(29.6% 592) | C(19.4% 388) | T(29.75% 595) | G(21.25% 425)

Note: The 2000 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY   SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
--------------------------------------------------------------------------------------
YourSeq 2000   1      2000  2000   100.0%    chr10  -       70571020   70573019   2000
YourSeq 49     523    595   2000   93.2%     chr2   +       153043135  153043226  92
YourSeq 42     167    406   2000   95.7%     chr3   -       39761746   39761991   246
YourSeq 31     1871   1908  2000   94.5%     chr13  -       42386349   42386396   48
YourSeq 31     576    617   2000   97.1%     chr12  +       81295845   81295892   48
YourSeq 31     562    595   2000   97.0%     chr11  +       46026949   46026983   35
YourSeq 30     1877   1909  2000   97.0%     chr2   -       169219328  169219366  39
YourSeq 30     532    580   2000   96.9%     chr11  -       21116089   21116139   51
YourSeq 29     168    204   2000   96.8%     chr19  -       10893942   10893987   46
YourSeq 29     167    197   2000   96.8%     chr7   +       55019301   55019331   31
YourSeq 29     168    206   2000   77.5%     chr6   +       68931433   68931465   33
YourSeq 29     564    595   2000   96.8%     chr14  +       79093311   79093343   33
YourSeq 29     976    1090  2000   54.6%     chr1   +       3995726    3995774    49
YourSeq 28     167    196   2000   96.7%     chr18  -       34934325   34934354   30
YourSeq 27     533    579   2000   86.3%     chr5   +       99628559   99628603   45
YourSeq 26     562    587   2000   100.0%    chrX   -       134283182  134283207  26
YourSeq 26     167    192   2000   100.0%    chr19  -       21553416   21553441   26
YourSeq 26     167    192   2000   100.0%    chr18  -       85560441   85560466   26
YourSeq 26     167    192   2000   100.0%    chr16  -       4210966    4210991    26
YourSeq 26     167    192   2000   100.0%    chr10  -       26589841   26589866   26

Note: The 2000 bp section upstream of Exon 2 is searched against the genome with BLAT. Apart from the query's own locus on chr10, no significant similarity is found.
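To make the reading of the BLAT table concrete, the following Python sketch parses rows in the column order shown above and separates the self hit from the secondary hits. The `BlatHit` type, the function names, and the rule used to recognise the self hit (score equal to query size at 100% identity) are assumptions for illustration. The report does not define a numeric threshold for "significant", so none is applied here.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    """One row of the BLAT summary table (hypothetical field names)."""
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(line):
    """Parse one whitespace-separated row into a BlatHit."""
    f = line.split()
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def secondary_hits(hits):
    """Drop the self hit: full-length score at 100% identity (assumed rule)."""
    return [h for h in hits
            if not (h.score == h.q_size and h.identity == 100.0)]


# First three rows of the upstream table, copied from the report.
rows = """\
YourSeq 2000 1 2000 2000 100.0% chr10 - 70571020 70573019 2000
YourSeq 49 523 595 2000 93.2% chr2 + 153043135 153043226 92
YourSeq 42 167 406 2000 95.7% chr3 - 39761746 39761991 246
"""
hits = [parse_blat_row(r) for r in rows.splitlines() if r.strip()]
best = max(secondary_hits(hits), key=lambda h: h.score)
print(best.chrom, best.score)  # chr2 49
```

In this excerpt the best secondary hit scores 49 against a query of 2000 bp, consistent with the note's conclusion that only the self hit is significant.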

BLAT Search Results (down)

QUERY   SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
--------------------------------------------------------------------------------------
YourSeq 2000   1      2000  2000   100.0%    chr10  -       70563174   70565173   2000
YourSeq 66     82     208   2000   87.4%     chr17  -       87864887   87865015   129
YourSeq 53     65     205   2000   84.3%     chr1   +       75296120   75296266   147
YourSeq 52     70     208   2000   76.4%     chr11  +       30229911   30230039   129
YourSeq 47     96     213   2000   94.4%     chr6   -       110236839  110236957  119
YourSeq 44     97     207   2000   85.2%     chr13  -       118678769  118678877  109
YourSeq 42     96     220   2000   91.4%     chr4   +       119402316  119402439  124
YourSeq 41     96     201   2000   80.6%     chr12  +       69713373   69713479   107
YourSeq 40     76     206   2000   87.1%     chr17  -       29547230   29547361   132
YourSeq 38     88     210   2000   90.5%     chr18  +       31899730   31899851   122
YourSeq 37     97     210   2000   95.2%     chr3   -       121400795  121400909  115
YourSeq 37     96     211   2000   97.5%     chr11  -       75981130   75981247   118
YourSeq 37     82     208   2000   91.2%     chr13  +       59782854   59782980   127
YourSeq 36     96     211   2000   95.0%     chr13  -       92267611   92267727   117
YourSeq 36     322    376   2000   89.4%     chr1   -       153400550  153400605  56
YourSeq 36     98     209   2000   92.9%     chr13  +       75219978   75220091   114
YourSeq 36     98     210   2000   95.0%     chr12  +       77029707   77029821   115
YourSeq 36     75     118   2000   88.1%     chr11  +       44525914   44525956   43
YourSeq 35     96     211   2000   94.9%     chr10  +       117157265  117157381  117
YourSeq 35     96     210   2000   94.9%     chr1   +       136585892  136586007  116

Note: The 2000 bp section downstream of Exon 4 is searched against the genome with BLAT. Apart from the query's own locus on chr10, no significant similarity is found.


Gene and protein information: Phyhipl phytanoyl-CoA hydroxylase interacting protein-like [ Mus musculus (house mouse) ] Gene ID: 70911, updated on 12-Aug-2019

Gene summary

Official Symbol: Phyhipl (provided by MGI)
Official Full Name: phytanoyl-CoA hydroxylase interacting protein-like (provided by MGI)
Primary source: MGI:MGI:1918161
See related: Ensembl:ENSMUSG00000037747
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI267048; 4921522K17Rik
Expression: Biased expression in testis adult (RPKM 75.5), cerebellum adult (RPKM 31.7) and 6 other tissues
Orthologs: human; all

Genomic context

Location: 10; 10 B5.3
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (70557686..70599291, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (70020434..70062039, complement)

Chromosome 10 - NC_000076.6


Transcript information: This gene has 10 transcripts

Gene: Phyhipl ENSMUSG00000037747

Description: phytanoyl-CoA hydroxylase interacting protein-like [Source:MGI Symbol;Acc:MGI:1918161]
Gene Synonyms: 4921522K17Rik
Location: 70,557,682-70,655,965 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 10 transcripts (splice variants), 249 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Phyhipl-201  ENSMUST00000046513.9  2877  375aa       ENSMUSP00000045807.3  Protein coding   CCDS23914  Q8BGT8   TSL:1, GENCODE basic, APPRIS P3
Phyhipl-206  ENSMUST00000162251.7  1540  330aa       ENSMUSP00000125179.1  Protein coding   CCDS48593  F7D3N3   TSL:3, GENCODE basic, APPRIS ALT1
Phyhipl-205  ENSMUST00000162144.1  635   212aa       ENSMUSP00000124828.1  Protein coding   -          F6U6Z2   CDS 5' and 3' incomplete, TSL:3
Phyhipl-209  ENSMUST00000162793.7  598   41aa        ENSMUSP00000124655.1  Protein coding   -          E0CXW9   CDS 3' incomplete, TSL:3
Phyhipl-210  ENSMUST00000163054.1  3485  No protein  -                     Retained intron  -          -        TSL:1
Phyhipl-207  ENSMUST00000162470.1  1877  No protein  -                     Retained intron  -          -        TSL:1
Phyhipl-208  ENSMUST00000162571.7  2011  No protein  -                     lncRNA           -          -        TSL:1
Phyhipl-204  ENSMUST00000161687.7  1665  No protein  -                     lncRNA           -          -        TSL:1
Phyhipl-203  ENSMUST00000160127.1  359   No protein  -                     lncRNA           -          -        TSL:5
Phyhipl-202  ENSMUST00000159025.1  326   No protein  -                     lncRNA           -          -        TSL:3


Figure: Ensembl genomic view of a 118.28 kb region (70.56 Mb to 70.66 Mb). Forward strand: Fam13c-201, Fam13c-202 and Fam13c-203 (protein coding) and Fam13c-205 (retained intron). Contig: AC122896.4. Reverse strand: Phyhipl-201, Phyhipl-205, Phyhipl-206 and Phyhipl-209 (protein coding); Phyhipl-207 and Phyhipl-210 (retained intron); Phyhipl-202, Phyhipl-203, Phyhipl-204 and Phyhipl-208 (lncRNA). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding) and non-protein coding (processed transcript; RNA gene).


Transcript: ENSMUST00000046513

Figure: Phyhipl-201 (protein coding), reverse strand, 41.61 kb, with protein features annotated on ENSMUSP00000045... Tracks: Low complexity (Seg); Superfamily: Fibronectin type III superfamily; Pfam: Fibronectin type III; PROSITE profiles: Fibronectin type III; PANTHER: PTHR15698, PTHR15698:SF8; Gene3D: Immunoglobulin-like fold; CDD: Fibronectin type III. Sequence variants (dbSNP and all other sources); variant legend: synonymous variant. Scale bar: 0 to 375 residues.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
