https://www.alphaknockout.com

Mouse Haus1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Haus1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Haus1 gene (NCBI Reference Sequence: NM_146089; Ensembl: ENSMUSG00000041840) is located on mouse chromosome 18. Nine exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 9 (Transcript: ENSMUST00000048192). Exons 3-4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Haus1 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-196O5 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 24.7% of the coding region. Knockout of exons 3-4 will result in a frameshift of the gene. Intron 2, used for 5'-loxP site insertion, is 2655 bp; intron 4, used for 3'-loxP site insertion, is 951 bp. The effective cKO region is ~2727 bp and does not contain any other known gene.
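The two numeric claims in the note can be sanity-checked with simple arithmetic. The sketch below assumes the coding sequence is that of the 278 aa Haus1-201 protein plus one stop codon (837 nt), which the report does not state explicitly. The function names and the example deletion sizes are illustrative only, because the report does not give the coding lengths of exons 3 and 4.

```python
def cds_offset(cds_length_nt: int, fraction: float) -> int:
    """Approximate CDS position (nt) corresponding to a fractional offset."""
    return round(cds_length_nt * fraction)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length
    is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# Assumption: CDS = 278 codons (Haus1-201 protein length) + 1 stop codon.
CDS_LENGTH_NT = 278 * 3 + 3
# "Exon 3 starts at about 24.7% of the coding region."
EXON3_START_NT = cds_offset(CDS_LENGTH_NT, 0.247)

if __name__ == "__main__":
    print("CDS length (nt):", CDS_LENGTH_NT)
    print("Approximate CDS position of exon 3 start (nt):", EXON3_START_NT)
    # Hypothetical deletion sizes, NOT the real exon 3-4 coding length.
    for deleted in (200, 201):
        print(deleted, "nt deleted -> frameshift:", causes_frameshift(deleted))
```

Under these assumptions, roughly the first quarter of the coding sequence lies upstream of the deleted exons, and any deleted coding length that is not a multiple of 3 disrupts the downstream reading frame.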


Overview of the Targeting Strategy

Figure: Schematic of the wildtype allele (5' to 3', with the two gRNA regions marked and exons 1, 3, 4, 5, 6 and 9 labeled), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination).

Legend: exon of mouse Haus1; homology arm; cKO region; loxP site.


Overview of the Dot Plot

Window size: 10 bp

Figure: Dot plot of the sequence against itself, showing forward and reverse-complement matches.

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
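A self-comparison dot plot of this kind can be approximated by indexing every 10 bp window and reporting pairs of positions that share the same window. The sketch below is a minimal exact-match version, assuming no mismatch tolerance. The report does not say which tool or scoring was used, and all function names are illustrative.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dot_matches(seq_x: str, seq_y: str, window: int = 10):
    """Return (i, j) pairs where seq_x[i:i+window] == seq_y[j:j+window]."""
    index = defaultdict(list)
    for j in range(len(seq_y) - window + 1):
        index[seq_y[j:j + window]].append(j)
    return [
        (i, j)
        for i in range(len(seq_x) - window + 1)
        for j in index.get(seq_x[i:i + window], ())
    ]


def self_dot_plot(seq: str, window: int = 10):
    """Forward dots off the main diagonal and reverse-complement dots."""
    forward = [(i, j) for i, j in dot_matches(seq, seq, window) if i != j]
    reverse = dot_matches(seq, revcomp(seq), window)
    return forward, reverse


if __name__ == "__main__":
    # Toy sequences, not the Haus1 locus.
    repeat_fwd, _ = self_dot_plot("ACGT" * 5)
    unique_fwd, _ = self_dot_plot("ACGTTGCAAGCTTAGGCATC")
    print("repetitive toy sequence, off-diagonal dots:", len(repeat_fwd))
    print("non-repetitive toy sequence, off-diagonal dots:", len(unique_fwd))
```

Tandem repeats show up as runs of off-diagonal forward dots parallel to the main diagonal; a sequence with none of these is the situation the note describes.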

Overview of the GC Content Distribution

Window size: 300 bp

Figure: GC content distribution along the sequence.

Summary: Full length (9227 bp) | A: 2427 (26.30%) | C: 2008 (21.76%) | T: 2643 (28.64%) | G: 2149 (23.29%)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
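The base counts in the summary are enough to recompute the overall GC fraction, and a windowed GC profile like the one plotted can be sketched in a few lines. The example below is a minimal sketch: the `gc_profile` function, its `step` parameter and the toy sequence are assumptions for illustration, since the report gives only the 300 bp window size and the base counts.

```python
def gc_percent(seq: str) -> float:
    """GC content of a DNA string, in percent."""
    if not seq:
        return 0.0
    gc = sum(1 for base in seq.upper() if base in "GC")
    return 100.0 * gc / len(seq)


def gc_profile(seq: str, window: int = 300, step: int = 1):
    """GC percent for each window of the given size, advancing by `step`."""
    return [
        gc_percent(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the summary line of the report.
BASE_COUNTS = {"A": 2427, "C": 2008, "T": 2643, "G": 2149}
TOTAL_BP = sum(BASE_COUNTS.values())
OVERALL_GC = 100.0 * (BASE_COUNTS["G"] + BASE_COUNTS["C"]) / TOTAL_BP

if __name__ == "__main__":
    print("total bp:", TOTAL_BP)
    print("overall GC %:", round(OVERALL_GC, 2))
    # Toy sequence, not the Haus1 locus.
    print(gc_profile("GGCCAATT", window=4, step=4))
```

The counts sum to the stated full length of 9227 bp and give an overall GC content of about 45%, consistent with the note that no high-GC region is present.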


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  3000   1       3000  3000   100.0%    chr18  -       77764459   77767458   3000
YourSeq  213    1826    2569  3000   90.2%     chr1   +       78213654   78530734   317081
YourSeq  180    2220    2566  3000   83.2%     chr2   -       54193617   54193868   252
YourSeq  173    2228    2569  3000   82.1%     chr3   +       59046504   59046765   262
YourSeq  171    2263    2762  3000   91.0%     chr16  +       94494447   94495237   791
YourSeq  166    2264    2569  3000   92.4%     chr3   -       94743832   94744253   422
YourSeq  165    2228    2571  3000   90.3%     chr4   +       137822537  137823182  646
YourSeq  163    2362    2739  3000   84.5%     chrX   -       56603899   56604244   346
YourSeq  162    2366    2569  3000   91.4%     chr12  -       20210422   20210629   208
YourSeq  162    2220    2568  3000   84.6%     chr11  -       70673058   70673364   307
YourSeq  160    2363    2569  3000   91.0%     chr15  -       78818854   78819381   528
YourSeq  160    2366    2569  3000   90.9%     chr12  +       23954914   23955120   207
YourSeq  159    2257    2569  3000   83.3%     chr16  -       31911629   31911868   240
YourSeq  159    2366    2569  3000   90.8%     chr12  -       18833213   18833418   206
YourSeq  159    2366    2569  3000   90.8%     chr12  +       22893762   22893967   206
YourSeq  158    2387    2591  3000   88.5%     chr19  +       11746878   11747076   199
YourSeq  158    2363    2571  3000   89.2%     chr1   +       52635599   52635816   218
YourSeq  156    2379    2571  3000   93.5%     chr8   +       84007066   84007261   196
YourSeq  156    1807    2570  3000   78.9%     chr17  +       26252235   26252646   412
YourSeq  155    2387    2569  3000   93.9%     chr17  -       28530362   28530548   187

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. Apart from the expected 100% match at the Haus1 locus on chr18, no significant similarity is found.
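One way to read the upstream table is to compare the best off-target hit with the on-target hit. The sketch below parses three rows copied from the table and reports the score ratio and query coverage of the runner-up. The `BlatHit` type, the `parse_hit` helper and the choice of these two metrics are assumptions for illustration; the report does not state what threshold it uses for "significant".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT row (without the query name)."""
    f = row.split()
    return BlatHit(
        int(f[0]), int(f[1]), int(f[2]), int(f[3]),
        float(f[4].rstrip("%")), f[5], f[6],
        int(f[7]), int(f[8]), int(f[9]),
    )


# First three rows of the upstream BLAT table in this report.
ROWS = [
    "3000 1 3000 3000 100.0% chr18 - 77764459 77767458 3000",
    "213 1826 2569 3000 90.2% chr1 + 78213654 78530734 317081",
    "180 2220 2566 3000 83.2% chr2 - 54193617 54193868 252",
]
HITS = sorted((parse_hit(r) for r in ROWS), key=lambda h: h.score, reverse=True)
BEST, RUNNER_UP = HITS[0], HITS[1]
SCORE_RATIO = RUNNER_UP.score / BEST.score
RUNNER_UP_COVERAGE = (RUNNER_UP.q_end - RUNNER_UP.q_start + 1) / RUNNER_UP.q_size

if __name__ == "__main__":
    print("best hit:", BEST.chrom, BEST.identity)
    print("runner-up score ratio:", round(SCORE_RATIO, 3))
    print("runner-up query coverage:", round(RUNNER_UP_COVERAGE, 3))
```

Here the best off-target hit scores about 7% of the on-target hit and spans under a quarter of the query, which is in line with the note's conclusion.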

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  3000   1       3000  3000   100.0%    chr18  -       77758732   77761731   3000
YourSeq  520    310     2872  3000   91.7%     chr4   -       116048930  116529546  480617
YourSeq  147    310     482   3000   94.1%     chr10  -       128436791  128436980  190
YourSeq  147    305     482   3000   93.1%     chr1   +       60035479   60035675   197
YourSeq  143    310     486   3000   93.4%     chr3   -       145849796  145849986  191
YourSeq  140    295     478   3000   91.3%     chr17  +       42824127   42824314   188
YourSeq  139    295     469   3000   93.3%     chr5   -       24643413   24643610   198
YourSeq  137    2774    3000  3000   85.9%     chr5   +       149393340  149393583  244
YourSeq  136    310     483   3000   93.6%     chr11  -       33165433   33165620   188
YourSeq  136    310     469   3000   94.9%     chr17  +       74934284   74934465   182
YourSeq  134    311     482   3000   92.0%     chr7   +       107707328  107707513  186
YourSeq  132    1439    1579  3000   97.2%     chr4   -       98872451   98872592   142
YourSeq  131    310     463   3000   94.0%     chr17  +       69313279   69313448   170
YourSeq  130    1439    1582  3000   95.2%     chr17  -       44590969   44591112   144
YourSeq  129    310     463   3000   93.4%     chr12  -       65024616   65024785   170
YourSeq  126    310     469   3000   92.2%     chr6   +       85972011   85972193   183
YourSeq  124    1768    1980  3000   87.5%     chr5   -       127581667  127582184  518
YourSeq  123    2768    2991  3000   86.0%     chr13  -       95198740   95198981   242
YourSeq  123    310     463   3000   91.4%     chr11  +       60725674   60725843   170
YourSeq  117    2759    2962  3000   84.5%     chr5   +       64928030   64928235   206

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. Apart from the expected 100% match at the Haus1 locus on chr18, no significant similarity is found.
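The on-target chr18 coordinates of the two BLAT queries can be used to cross-check the size of the region between them. The sketch below assumes the displayed coordinates are 1-based and inclusive, and takes the gene span from the GRCm38.p6 annotation listed in this report; the variable and function names are illustrative.

```python
# On-target (100% identity) chr18 hits from the two BLAT tables.
UPSTREAM_ARM = (77764459, 77767458)    # 3000 bp upstream of exon 3
DOWNSTREAM_ARM = (77758732, 77761731)  # 3000 bp downstream of exon 4
# Haus1 span on NC_000084.6 (complement strand), from the NCBI annotation.
GENE_SPAN = (77757277, 77773885)


def length(interval) -> int:
    """Length of a 1-based inclusive interval (assumed convention)."""
    start, end = interval
    return end - start + 1


def gap_between(lower, upper) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


def within(inner, outer) -> bool:
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# The gene is on the reverse strand, so "upstream" has the higher coordinates.
REGION_BETWEEN_ARMS = gap_between(DOWNSTREAM_ARM, UPSTREAM_ARM)

if __name__ == "__main__":
    print("arm lengths:", length(UPSTREAM_ARM), length(DOWNSTREAM_ARM))
    print("region between the arms (bp):", REGION_BETWEEN_ARMS)
    print("both arms inside the annotated gene span:",
          within(UPSTREAM_ARM, GENE_SPAN) and within(DOWNSTREAM_ARM, GENE_SPAN))
```

Under these assumptions the interval between the two queried sections is 2727 bp, which coincides with the effective cKO region size of ~2727 bp given in the strategy note.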


Gene information: Haus1

HAUS augmin-like complex, subunit 1 [Mus musculus (house mouse)]
Gene ID: 225745, updated on 24-Oct-2019

Gene summary

Official Symbol: Haus1 (provided by MGI)
Official Full Name: HAUS augmin-like complex, subunit 1 (provided by MGI)
Primary source: MGI:MGI:2385076
See related: Ensembl:ENSMUSG00000041840
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ccdc5; HEI-C; BC024400
Expression: Broad expression in CNS E11.5 (RPKM 6.2), testis adult (RPKM 5.4) and 19 other tissues
Orthologs: human, all

Genomic context

Location: 18; 18 E3

Exon count: 15

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (77757277..77773885, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (77996306..78006519, complement)

Figure: Genomic context of Haus1 on chromosome 18 (NC_000084.6).


Transcript information: This gene has 3 transcripts

Gene: Haus1 ENSMUSG00000041840

Description: HAUS augmin-like complex, subunit 1 [Source:MGI Symbol;Acc:MGI:2385076]
Gene Synonyms: Ccdc5, HEI-C, spindle associated
Location: Chromosome 18: 77,757,567-77,773,886 reverse strand. GRCm38:CM001011.2
About this gene: This gene has 3 transcripts (splice variants), 1 gene allele, 208 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Haus1-201  ENSMUST00000048192.8  1003  278aa       ENSMUSP00000035826.7  Protein coding   CCDS29357  Q8BHX1      TSL:1, GENCODE basic, APPRIS P1
Haus1-202  ENSMUST00000236234.1  1226  293aa       ENSMUSP00000158336.1  Protein coding   -          A0A494BB26  GENCODE basic
Haus1-203  ENSMUST00000236575.1  6882  No protein  -                     Retained intron  -          -           -

Figure: Ensembl genome browser view of a 36.32 kb region (77.75Mb to 77.78Mb). Forward strand: Atp5a1-201 and Atp5a1-202 (protein coding), Atp5a1-204 (nonsense mediated decay), Atp5a1-203 and Atp5a1-205 (retained intron). Contigs: AC102195.14, AC162291.13. Reverse strand: Haus1-202 and Haus1-201 (protein coding), Haus1-203 (retained intron). The Regulatory Build track is also shown.

Regulation legend: CTCF; enhancer; promoter; promoter flank; transcription factor binding site.

Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).


Transcript: ENSMUST00000048192

Figure: Ensembl protein view of Haus1-201 (protein coding; reverse strand, 10.21 kb). Tracks: low complexity (Seg); coiled-coils (Ncoils); Prints (HAUS augmin-like complex subunit 1); PANTHER (HAUS augmin-like complex subunit 1); sequence variants (dbSNP and all other sources). Scale bar: 0 to 278 aa.

Variant legend: missense variant; splice region variant; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
