https://www.alphaknockout.com

Mouse Hhipl1 Knockout Project (CRISPR/Cas9)

Objective: To create a Hhipl1 knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hhipl1 gene (NCBI Reference Sequence: NM_001044380; Ensembl: ENSMUSG00000021260) is located on Mouse chromosome 12. Nine exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 9 (Transcript: ENSMUST00000021685). Exons 2~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 11.55% of the coding region. Exons 2~6 cover 58.7% of the coding region. The size of the effective KO region: ~8414 bp. The KO region does not contain any other known gene.
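The percentages in the note can be turned into approximate base counts. The sketch below assumes they are fractions of the 791-codon coding sequence of Hhipl1-201 (2373 bp, stop codon excluded); the report does not state how the percentages were derived, so the results are rough estimates only. The helper name `percent_to_bp` is hypothetical.

```python
# Assumption: percentages refer to the coding sequence of Hhipl1-201,
# taken here as 791 codons (stop codon excluded). Not stated in the report.
PROTEIN_LENGTH_AA = 791            # protein length from the transcript table
CDS_BP = PROTEIN_LENGTH_AA * 3     # coding bases under the assumption above


def percent_to_bp(percent: float, total_bp: int = CDS_BP) -> int:
    """Convert a percentage of the coding region to a rounded base count."""
    return round(total_bp * percent / 100)


# Coding bases lying upstream of exon 2 (i.e. in exon 1).
upstream_of_exon2_bp = percent_to_bp(11.55)
# Coding bases contained in exons 2~6.
exons_2_to_6_bp = percent_to_bp(58.7)

print(CDS_BP, upstream_of_exon2_bp, exons_2_to_6_bp)
```

Under this assumption, roughly 274 coding bases precede exon 2 and roughly 1393 coding bases fall inside the targeted exons.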


Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1 to 6 and 9 of mouse Hhipl1 with two gRNA regions marked. Legend: exon of mouse Hhipl1; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of Sequence 12 aligned with itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
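A self-alignment dot plot of this kind can be sketched as follows. This is a minimal illustration, not the tool used for the report: it marks only exact matches between 15 bp windows (real dot-plot tools often tolerate mismatches), and the function names are hypothetical. Off-diagonal forward hits suggest direct repeats; reverse hits suggest inverted repeats.

```python
from collections import defaultdict


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 15):
    """Find exact window matches of a sequence against itself.

    Returns (forward, reverse), each a sorted list of (i, j) window starts:
      forward: seq[i:i+window] == seq[j:j+window] with i < j
      reverse: seq[i:i+window] == reverse complement of seq[j:j+window], i <= j
    The main diagonal (i == j in the forward direction) is omitted.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Same window content at two different positions: direct repeat.
        forward.extend((a, b) for a in positions for b in positions if a < b)
        # Window whose reverse complement also occurs: inverted repeat.
        partners = index.get(reverse_complement(kmer), [])
        reverse.extend((a, b) for a in positions for b in partners if a <= b)
    return sorted(forward), sorted(reverse)


# Toy example: a 15 bp unit repeated after a 5 bp spacer.
unit = "ACGGTCATTGCAGTA"
fwd, rev = self_dot_plot(unit + "TTTTT" + unit, window=15)
print((0, 20) in fwd)
```

For a 2000 bp section the k-mer index keeps this fast; a region free of tandem repeats would give few or no off-diagonal forward hits.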

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of Sequence 12 aligned with itself. Legend: Forward; Reverse Complement.]

Note: The 1701 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(2000bp) | A(23.7% 474) | C(22.2% 444) | T(22.7% 454) | G(31.4% 628)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
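A GC content distribution with a 300 bp window can be computed along these lines. This is a sketch under assumptions: the step size and the 0.7 cutoff for a "high GC" window are illustrative choices, not values given in the report, and the function names are hypothetical.

```python
from itertools import accumulate


def gc_profile(seq: str, window: int = 300, step: int = 1):
    """Return [(start, gc_fraction), ...] for every full window along seq."""
    seq = seq.upper()
    # prefix[k] = number of G/C bases in seq[:k]
    prefix = [0] + list(accumulate(1 if base in "GC" else 0 for base in seq))
    return [
        (start, (prefix[start + window] - prefix[start]) / window)
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(profile, threshold: float = 0.7):
    """Keep windows at or above the threshold (0.7 is an assumed cutoff)."""
    return [(start, frac) for start, frac in profile if frac >= threshold]


# Toy example with a 4 bp window so the result is easy to check by eye.
demo = gc_profile("GGCCAATT", window=4, step=2)
print(demo)
print(high_gc_windows(demo))
```

Prefix sums keep the per-window cost constant, so the same function works unchanged on the 2000 bp and 1701 bp sections.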

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(1701bp) | A(23.34% 397) | C(25.22% 429) | T(21.99% 374) | G(29.45% 501)

Note: The 1701 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
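The two base-composition summaries can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts printed in the summaries; the function name `summarize` is hypothetical.

```python
def summarize(counts: dict):
    """Return (total length, per-base percentages, overall GC percentage)."""
    total = sum(counts.values())
    percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
    gc_percent = 100 * (counts["G"] + counts["C"]) / total
    return total, percent, gc_percent


# Base counts copied from the two summary lines of this report.
upstream = {"A": 474, "C": 444, "T": 454, "G": 628}
downstream = {"A": 397, "C": 429, "T": 374, "G": 501}

for name, counts in (("upstream", upstream), ("downstream", downstream)):
    total, percent, gc_percent = summarize(counts)
    print(name, total, percent, f"GC = {gc_percent:.1f}%")
```

The counts add up to the stated lengths (2000 bp and 1701 bp), and the overall GC content works out to about 53.6% upstream and 54.7% downstream.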


BLAT Search Results (up)

QUERY    SCORE  Q-START  Q-END  QSIZE  IDENTITY  CHROM                 STRAND  T-START    T-END      SPAN
---------------------------------------------------------------------------------------------------------
YourSeq  2000   1        2000   2000   100.0%    chr12                 +       108309689  108311688  2000
YourSeq  54     70       154    2000   77.3%     chr10                 -       97411704   97411773   70
YourSeq  53     81       154    2000   92.0%     chr17                 -       48976545   48976623   79
YourSeq  49     86       153    2000   83.1%     chr11                 +       93508812   93508874   63
YourSeq  48     70       149    2000   74.2%     chr10                 +       72845982   72846041   60
YourSeq  47     97       151    2000   96.2%     chr14                 +       75163126   75163188   63
YourSeq  46     70       153    2000   72.8%     chr15                 +       101303158  101303218  61
YourSeq  45     83       147    2000   80.0%     chr11                 -       76225405   76225459   55
YourSeq  44     81       154    2000   76.4%     chr16                 +       47658044   47658107   64
YourSeq  44     81       144    2000   87.5%     chr13                 +       109077134  109077196  63
YourSeq  42     81       154    2000   93.8%     chr13                 -       58479484   58479557   74
YourSeq  38     70       123    2000   84.5%     chr18                 -       5406518    5406569    52
YourSeq  38     71       143    2000   80.5%     chr13                 -       43697444   43697512   69
YourSeq  37     86       144    2000   75.7%     chr17                 +       42649958   42650000   43
YourSeq  37     101      145    2000   95.3%     chr1                  +       190804958  190805002  45
YourSeq  36     86       134    2000   83.0%     chr14                 +       88290509   88290554   46
YourSeq  35     82       126    2000   87.5%     chr17                 -       88297904   88297946   43
YourSeq  35     82       139    2000   73.2%     chr19                 +       22273464   22273508   45
YourSeq  33     70       124    2000   89.5%     chr5                  -       148302109  148302162  54
YourSeq  33     102      145    2000   94.8%     chr1_GL456211_random  -       34736      34791      56

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  Q-START  Q-END  QSIZE  IDENTITY  CHROM  STRAND  T-START    T-END      SPAN
------------------------------------------------------------------------------------------
YourSeq  1701   1        1701   1701   100.0%    chr12  +       108320103  108321803  1701
YourSeq  154    1127     1312   1701   90.4%     chr19  -       46054075   46054253   179
YourSeq  151    1142     1311   1701   95.3%     chr13  -       74159821   74160046   226
YourSeq  150    1141     1309   1701   91.6%     chr15  -       100497617  100497781  165
YourSeq  150    1143     1308   1701   95.8%     chr1   -       114246495  114246663  169
YourSeq  149    1141     1305   1701   95.8%     chr15  +       90901651   90901819   169
YourSeq  149    1124     1310   1701   92.2%     chr1   +       165568799  165569037  239
YourSeq  148    1144     1310   1701   94.6%     chr7   -       126988092  126988261  170
YourSeq  148    1141     1310   1701   94.1%     chr13  -       14076682   14076864   183
YourSeq  147    1140     1309   1701   92.0%     chr2   -       121923576  121923739  164
YourSeq  147    1141     1310   1701   94.7%     chr11  -       97736555   97736731   177
YourSeq  146    1144     1303   1701   96.3%     chr2   +       154467481  154467648  168
YourSeq  145    1142     1308   1701   94.6%     chr4   -       141652816  141652986  171
YourSeq  145    1140     1313   1701   91.6%     chr13  -       107215036  107215207  172
YourSeq  145    1140     1396   1701   94.0%     chr1   -       84924932   84925228   297
YourSeq  145    1142     1310   1701   92.5%     chr11  +       80069055   80069218   164
YourSeq  144    1142     1303   1701   93.1%     chr14  -       26130167   26130326   160
YourSeq  144    1141     1312   1701   93.4%     chr13  -       102801732  102801916  185
YourSeq  144    1141     1315   1701   93.3%     chr4   +       133807949  133808127  179
YourSeq  144    1142     1308   1701   94.0%     chr17  +       15844755   15844922   168

Note: The 1701 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
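The top hits of the two BLAT searches give the genomic positions of the flanking sections, and the gap between them can be compared with the stated size of the effective KO region (~8414 bp). The sketch below assumes that the KO region is exactly the interval between the upstream and downstream sections; the report does not say this explicitly, and the helper names are hypothetical.

```python
# Top (100.0% identity) hits from the two BLAT result tables, 1-based inclusive.
UP_FLANK = ("chr12", 108309689, 108311688)    # 2000 bp upstream of Exon 2
DOWN_FLANK = ("chr12", 108320103, 108321803)  # 1701 bp downstream of Exon 6


def span(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def interval_between(up, down):
    """Return (chrom, start, end) of the gap between two flanking intervals."""
    if up[0] != down[0]:
        raise ValueError("flanks are on different chromosomes")
    if up[2] >= down[1]:
        raise ValueError("upstream flank must end before downstream flank")
    return up[0], up[2] + 1, down[1] - 1


chrom, ko_start, ko_end = interval_between(UP_FLANK, DOWN_FLANK)
print(f"{chrom}:{ko_start}-{ko_end}", span(ko_start, ko_end), "bp")
```

Under that assumption the interval is chr12:108311689-108320102, which is 8414 bp long and matches the KO region size given in the strategy note.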


Gene information:

Hhipl1 hedgehog interacting protein-like 1 [ Mus musculus (house mouse) ]
Gene ID: 214305, updated on 12-Aug-2019

Gene summary

Official Symbol: Hhipl1 (provided by MGI)
Official Full Name: hedgehog interacting protein-like 1 (provided by MGI)
Primary source: MGI:MGI:1919265
See related: Ensembl:ENSMUSG00000021260
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AK129450; mKIAA1822; 1600002O04Rik
Expression: Ubiquitous expression in subcutaneous fat pad adult (RPKM 2.7), ovary adult (RPKM 2.3) and 24 other tissues
Orthologs: human, all

Genomic context

Location: 12; 12 F1
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (108305968..108328300)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (109544480..109566510)

[Figure: genomic context of Hhipl1, Chromosome 12 - NC_000078.6.]


Transcript information: This gene has 2 transcripts

Gene: Hhipl1 ENSMUSG00000021260

Description: hedgehog interacting protein-like 1 [Source:MGI Symbol;Acc:MGI:1919265]
Gene Synonyms: 1600002O04Rik
Location: Chromosome 12: 108,306,270-108,330,869 forward strand. GRCm38:CM001005.2
About this gene: This gene has 2 transcripts (splice variants), 151 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Hhipl1-201  ENSMUST00000021685.7  5174  791aa       ENSMUSP00000021685.6  Protein coding  CCDS36554  Q14DK5   TSL:1 GENCODE basic APPRIS P1
Hhipl1-202  ENSMUST00000223395.1  2038  No protein  -                     lncRNA          -          -        TSL:1

[Figure: Ensembl region view, 44.60 kb, 108.30Mb to 108.34Mb. Forward strand genes (Comprehensive set): Hhipl1-201 (protein coding), Hhipl1-202 (lncRNA), Cyp46a1-201 (protein coding), Cyp46a1-202 (nonsense mediated decay). Contigs: AC122407.4, AC154910.3. Reverse strand genes (Comprehensive set): 4930478K11Rik-201 (lncRNA). Regulatory Build track. Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000021685

[Figure: Ensembl protein summary for Hhipl1-201 (protein coding), 24.60 kb, forward strand; scale bar 0 to 791. Tracks for ENSMUSP00000021...: MobiDB lite; Low complexity (Seg); Cleavage site (Sign...); Superfamily (Soluble quinoprotein glucose/sorbosone dehydrogenase; SRCR-like domain superfamily); SMART (SRCR-like domain); Prints (SRCR domain); Pfam (Folate receptor-like; Glucose/Sorbosone dehydrogenase; SRCR domain); PROSITE profiles (SRCR domain); PROSITE patterns (SRCR domain); PANTHER (PTHR19328:SF32; PTHR19328); Gene3D (Six-bladed beta-propeller, TolB-like; SRCR-like domain superfamily); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
