https://www.alphaknockout.com

Mouse Hook2 Knockout Project (CRISPR/Cas9)

Objective: To create a Hook2 knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Hook2 gene (NCBI Reference Sequence: NM_133255; Ensembl: ENSMUSG00000052566) is located on mouse chromosome 8. 22 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 22 (Transcript: ENSMUST00000064495). Exons 2~17 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 2.14% of the coding region. Exons 2~17 cover 73.98% of the coding region. The size of the effective KO region is ~7306 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

Figure: Wildtype allele (5' to 3') showing exons 1 to 22 of mouse Hook2, with the two gRNA regions and the knockout region marked.

Legends: Exon of mouse Hook2; Knockout region


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: Self-alignment dot plot of the upstream section, showing forward and reverse-complement matches.

Note: The 466 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: Self-alignment dot plot of the downstream section, showing forward and reverse-complement matches.

Note: The 2000 bp section downstream of Exon 17 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
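The self-alignment behind the two dot plots can be illustrated with a minimal sketch. It assumes exact matching of fixed-length windows (15 bp by default, as in the plots) on the forward and reverse-complement strands; the function name `self_dot_plot` is hypothetical, and the tool that actually produced the plots may score matches differently.

```python
def self_dot_plot(seq, window=15):
    """Return (i, j, strand) for every pair of identical windows in seq.

    i and j are 0-based window start positions. The main diagonal
    (i == j on the forward strand) is skipped because every window
    trivially matches itself.
    """
    seq = seq.upper()
    complement = str.maketrans("ACGT", "TGCA")

    # Index every window by its sequence so lookups are fast.
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    hits = []
    for i in range(len(seq) - window + 1):
        kmer = seq[i:i + window]
        # Forward-strand matches away from the main diagonal.
        for j in positions.get(kmer, []):
            if j != i:
                hits.append((i, j, "+"))
        # Reverse-complement matches.
        rc = kmer.translate(complement)[::-1]
        for j in positions.get(rc, []):
            hits.append((i, j, "-"))
    return hits


# Toy example: a 7 bp unit repeated twice in tandem.
toy_hits = self_dot_plot("GATTACAGATTACA", window=7)
```

Off-diagonal forward hits that line up parallel to the main diagonal indicate tandem repeats, which is the pattern the notes above look for.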


Overview of the GC Content Distribution (up)

Window size: 300 bp

Summary: Full Length(466bp) | A(17.6% 82) | C(26.61% 124) | T(25.11% 117) | G(30.69% 143)

Note: The 466 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

Summary: Full Length(2000bp) | A(29.65% 593) | C(20.5% 410) | T(27.1% 542) | G(22.75% 455)

Note: The 2000 bp section downstream of Exon 17 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
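As a cross-check on the two summaries, the overall GC content can be recomputed from the reported base counts. This is a minimal sketch: the function names `gc_percent` and `windowed_gc` are illustrative and not part of any tool named in this report, and the sliding-window function assumes a plain fraction of G and C bases per 300 bp window, which may differ from how the plotted distribution was produced.

```python
def gc_percent(counts):
    """GC percentage from a dict of base counts with keys A, C, G, T."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300):
    """GC percentage of every full window of the given size along seq."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Base counts copied from the two Summary lines above.
upstream = {"A": 82, "C": 124, "T": 117, "G": 143}     # 466 bp section
downstream = {"A": 593, "C": 410, "T": 542, "G": 455}  # 2000 bp section

upstream_gc = gc_percent(upstream)
downstream_gc = gc_percent(downstream)
```

With these counts the overall GC content comes to about 57.3% for the upstream section and 43.25% for the downstream section.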


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  466    1      466   466    100.0%    chr8   +       84990754   84991219   466
YourSeq  24     9      80    466    66.7%     chr1   -       119876110  119876181  72

Note: The 466 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   +       84998526   85000525   2000
YourSeq  353    338    1256  2000   91.8%     chr1   +       167338353  167339109  757
YourSeq  351    324    1198  2000   90.5%     chr16  -       14322526   14323277   752
YourSeq  334    336    1179  2000   91.7%     chr1   -       133043385  133044182  798
YourSeq  289    329    989   2000   90.7%     chr9   -       104135560  104136151  592
YourSeq  288    819    1407  2000   95.1%     chr1   -       156516125  156516740  616
YourSeq  288    338    988   2000   91.8%     chr9   +       22308075   22308402   328
YourSeq  287    342    989   2000   90.5%     chr19  -       4360963    4361387    425
YourSeq  284    338    989   2000   92.3%     chr4   +       99793350   99793967   618
YourSeq  274    339    989   2000   89.3%     chr11  +       4894604    4894906    303
YourSeq  271    341    989   2000   90.0%     chr13  -       108760399  108760911  513
YourSeq  269    338    965   2000   90.4%     chr12  +       55773252   55773559   308
YourSeq  267    339    965   2000   91.8%     chr8   +       95793770   95794197   428
YourSeq  265    356    989   2000   90.0%     chr15  -       60082928   60083305   378
YourSeq  264    354    1013  2000   88.3%     chr7   +       110199256  110199776  521
YourSeq  261    811    1199  2000   96.8%     chr8   +       25756146   26181691   425546
YourSeq  259    818    1453  2000   92.1%     chr8   -       33284985   33285601   617
YourSeq  256    810    1198  2000   93.8%     chr8   -       70619961   70620348   388
YourSeq  256    810    1198  2000   92.2%     chr5   -       146269928  146270293  366
YourSeq  256    818    1198  2000   92.5%     chr17  -       5276320    5276666    347

Note: The 2000 bp section downstream of Exon 17 is BLAT searched against the genome. No significant similarity is found.
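The top hit of each BLAT search gives the chr8 coordinates of the two analyzed flanking sections, so the size of the region between them can be worked out directly. The sketch below assumes 1-based inclusive coordinates and assumes that the effective KO region is the interval lying between the upstream and downstream sections; the report does not state this explicitly, and the helper names `span` and `gap_between` are illustrative.

```python
def span(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def gap_between(upstream_end, downstream_start):
    """Number of bases strictly between two non-overlapping intervals."""
    return downstream_start - upstream_end - 1


# chr8 coordinates of the 100.0% identity hits in the two BLAT tables above.
UP_START, UP_END = 84990754, 84991219
DOWN_START, DOWN_END = 84998526, 85000525

up_len = span(UP_START, UP_END)
down_len = span(DOWN_START, DOWN_END)
region_between = gap_between(UP_END, DOWN_START)
```

Under these assumptions the interval between the two sections is 7306 bp, which agrees with the ~7306 bp effective KO region given in the strategy summary.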


Gene and information: Hook2 hook tethering protein 2 [ Mus musculus (house mouse) ] Gene ID: 170833, updated on 24-Oct-2019

Gene summary

Official Symbol: Hook2 (provided by MGI)
Official Full Name: hook microtubule tethering protein 2 (provided by MGI)
Primary source: MGI:MGI:2181664
See related: Ensembl:ENSMUSG00000052566
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: mHK2; A630054I03Rik
Expression: Ubiquitous expression in duodenum adult (RPKM 21.0), testis adult (RPKM 20.6) and 27 other tissues
Orthologs: human

Genomic context

Location: 8; 8 C3
Exon count: 26

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (84988776..85003352)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (87514494..87527263)

Chromosome 8 - NC_000074.6


Transcript information: This gene has 4 transcripts

Gene: Hook2 ENSMUSG00000052566

Description: hook microtubule tethering protein 2 [Source:MGI Symbol;Acc:MGI:2181664]
Gene Synonyms: A630054I03Rik
Location: Chromosome 8: 84,990,603-85,003,349 forward strand. GRCm38:CM001001.2
About this gene: This gene has 4 transcripts (splice variants), 169 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt     Flags
Hook2-201  ENSMUST00000064495.7  2580  716aa       ENSMUSP00000067752.6  Protein coding  CCDS22489  Q7TMK6      TSL:1 GENCODE basic APPRIS P1
Hook2-204  ENSMUST00000210326.1  3080  611aa       ENSMUSP00000148078.1  Protein coding  -          A0A1B0GSU8  TSL:1 GENCODE basic
Hook2-203  ENSMUST00000209764.1  2516  692aa       ENSMUSP00000148237.1  Protein coding  -          A0A1B0GT78  TSL:5 GENCODE basic
Hook2-202  ENSMUST00000209652.1  639   No protein  -                     lncRNA          -          -           TSL:2

Figure: Ensembl genome browser view of a 32.75 kb region of chromosome 8 (84.99Mb to 85.01Mb). Forward-strand genes (comprehensive set): Hook2-203, Hook2-201 and Hook2-204 (protein coding), Hook2-202 (lncRNA). Contig: AC163703.4. Reverse-strand genes (comprehensive set): Best2-203, Best2-201 and Best2-202 (protein coding). Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank. Gene legend: Protein Coding (Ensembl protein coding, merged Ensembl/Havana); Non-Protein Coding (RNA gene).


Transcript: ENSMUST00000064495

Figure: Hook2-201 (protein coding), 12.73 kb, forward strand, with protein annotation of ENSMUSP00000067... on a 0 to 716 scale bar. Domain tracks: Low complexity (Seg); Coiled-coils (Ncoils); Superfamily SSF116907; Pfam Hook-like protein family; PROSITE profiles Calponin homology domain; PANTHER PTHR18947:SF37 and PTHR18947; Gene3D CH domain superfamily. Sequence variants (dbSNP and all other sources): splice donor variant, missense variant, splice region variant, synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
