https://www.alphaknockout.com

Mouse Hspbp1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Hspbp1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hspbp1 gene (NCBI Reference Sequence: NM_024172; Ensembl: ENSMUSG00000063802) is located on mouse chromosome 7. Eight exons have been identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000079970). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Hspbp1 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-179K11 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null mutation show male infertility with an arrest of male meiosis, increased male germ cell apoptosis and azoospermia.

Exon 3 starts at about 19.14% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 2223 bp, and the size of intron 3 for 3'-loxP site insertion is 3912 bp. The size of the effective cKO region is ~705 bp. The cKO region does not contain any other known gene.
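The frameshift statement above rests on a simple rule: removing a number of coding bases that is not a multiple of 3 shifts the downstream reading frame. The following is a minimal sketch of that check. The function name and the example lengths are hypothetical, since this report does not state the length of exon 3.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True when deleting this many coding bases shifts the reading frame."""
    if deleted_coding_bp < 0:
        raise ValueError("deleted length must be non-negative")
    # A deletion keeps the frame only if it removes whole codons (multiples of 3).
    return deleted_coding_bp % 3 != 0


if __name__ == "__main__":
    # Hypothetical exon lengths for illustration only; not taken from the report.
    for length in (99, 100, 101):
        print(length, "bp deleted -> frameshift:", causes_frameshift(length))
```

In practice the length passed in would be the coding portion of the deleted exon taken from the transcript annotation.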


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1, 2, 3 and 8 labeled, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Hspbp1; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
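The self-comparison described in the note can be sketched as follows. This is an illustrative exact-match dot plot with a 10 bp word size, not the tool used to produce the report. The function names and the toy sequences are assumptions.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) positions where the window
    starting at i matches the window starting at j exactly (forward) or
    matches its reverse complement (reverse). The trivial main diagonal
    (i == j) is left out of the forward list."""
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    # Index every window by its sequence so matches are found by lookup.
    index = {}
    for i in range(n_windows):
        index.setdefault(seq[i:i + window], []).append(i)
    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j != i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


if __name__ == "__main__":
    # Toy sequence with one 10 bp unit repeated in tandem (illustration only).
    toy = "ACGTTGCAAG" * 2 + "CCCCC"
    fwd, rev = self_dot_plot(toy, window=10)
    print("forward off-diagonal matches:", fwd)
    print("reverse-complement matches:", rev)
```

Off-diagonal forward matches that line up parallel to the main diagonal are what a tandem repeat looks like in such a plot; an empty or sparse result corresponds to the "no significant tandem repeat" conclusion.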

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (7205 bp) | A (24.05%, 1733) | C (22.46%, 1618) | T (29.23%, 2106) | G (24.26%, 1748)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
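The base percentages in the summary follow directly from the reported counts, and the overall GC content can be derived from them. The sketch below recomputes those figures and shows one way to compute a windowed GC profile. The step size and the `windowed_gc` helper are assumptions, as the report only states the 300 bp window size.

```python
# Base counts as reported in the summary line above.
counts = {"A": 1733, "C": 1618, "T": 2106, "G": 1748}
total = sum(counts.values())

# Per-base percentages, rounded to two decimals as in the report.
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}

# Overall GC content derived from the same counts.
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)


def windowed_gc(seq: str, window: int = 300, step: int = 1):
    """Return the GC fraction of each window of the given size.
    The step size is an assumption; the report only gives the window size."""
    seq = seq.upper()
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    print("total length:", total)
    print("percentages:", percent)
    print("GC content (%):", gc_percent)
```

Applied to the actual 7205 bp sequence, `windowed_gc` would give the kind of profile summarized in the figure; windows with a very high GC fraction are the ones that tend to complicate PCR.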


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  3000   1       3000  3000   100.0%    chr7   -       4682299    4685298    3000
YourSeq  45     1983    2391  3000   61.8%     chr4   +       101371818  101371933  116
YourSeq  36     1809    1852  3000   93.2%     chr4   -       137113086  137113131  46
YourSeq  35     2356    2391  3000   100.0%    chr2   +       155403076  155403121  46
YourSeq  31     2345    2391  3000   88.6%     chr5   -       131006071  131006116  46
YourSeq  31     2348    2390  3000   71.5%     chr18  +       82530137   82530171   35
YourSeq  29     2334    2378  3000   94.0%     chr9   -       115942141  115942189  49
YourSeq  28     2365    2392  3000   100.0%    chr8   +       119534093  119534120  28
YourSeq  26     1495    1523  3000   96.5%     chr17  -       40256751   40256781   31
YourSeq  26     2365    2390  3000   100.0%    chr7   +       110240777  110240802  26
YourSeq  26     2366    2391  3000   100.0%    chr13  +       108216341  108216366  26
YourSeq  25     2366    2390  3000   100.0%    chr7   +       137350803  137350827  25
YourSeq  25     2367    2391  3000   100.0%    chr4   +       135994018  135994042  25
YourSeq  25     2365    2391  3000   96.3%     chr1   +       151391350  151391376  27
YourSeq  24     2852    2876  3000   100.0%    chr15  -       87105186   87105211   26
YourSeq  24     2365    2388  3000   100.0%    chr16  +       87433809   87433832   24
YourSeq  23     2361    2391  3000   87.1%     chr8   +       25706296   25706326   31
YourSeq  22     2365    2392  3000   89.3%     chr7   -       128438611  128438638  28
YourSeq  22     2369    2390  3000   100.0%    chr10  +       40157806   40157827   22
YourSeq  21     2239    2259  3000   100.0%    chr7   +       124485936  124485956  21

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  3000   1       3000  3000   100.0%    chr7   -       4678594    4681593    3000
YourSeq  381    30      428   3000   97.8%     chr7   -       4681042    4681440    399
YourSeq  306    47      364   3000   98.2%     chr7   -       4681044    4681361    318
YourSeq  212    1420    2437  3000   84.1%     chr9   +       110126972  110127699  728
YourSeq  204    1798    2021  3000   98.2%     chr17  -       73115785   73116299   515
YourSeq  193    1840    2434  3000   87.8%     chr4   -       115947362  115947876  515
YourSeq  188    1840    2436  3000   84.5%     chr4   +       126992995  126993250  256
YourSeq  188    1821    2021  3000   98.0%     chr2   +       37471414   37471617   204
YourSeq  187    1426    2053  3000   89.9%     chr15  +       93434559   93435172   614
YourSeq  182    1859    2491  3000   83.9%     chrX   +       160096413  160096721  309
YourSeq  172    1819    2036  3000   90.0%     chr1   -       4840535    4840744    210
YourSeq  171    1835    2238  3000   88.1%     chr5   -       64152501   64152826   326
YourSeq  169    1822    2022  3000   93.4%     chr9   +       120621988  120622194  207
YourSeq  169    1828    2024  3000   92.8%     chr14  +       14214285   14214480   196
YourSeq  169    1833    2256  3000   92.5%     chr11  +       121209360  121209830  471
YourSeq  168    1826    2022  3000   93.9%     chr5   -       143413484  143413683  200
YourSeq  167    1825    2025  3000   91.5%     chr3   +       122649322  122649521  200
YourSeq  166    1837    2024  3000   94.7%     chr7   +       49591290   49591485   196
YourSeq  166    1834    2023  3000   93.6%     chr15  +       100516905  100517093  189
YourSeq  165    1822    2017  3000   94.2%     chr16  -       20352195   20352390   196

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
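The top hit in each BLAT table is the query matching its own locus on chr7, so the two sets of coordinates can be used for a quick consistency check. The sketch below assumes the displayed coordinates are 1-based and inclusive (consistent with each top hit having a SPAN of 3000) and that the two 3000 bp sections directly flank the cKO region. The `Interval` type and `gap_between` helper are illustrative names.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumed from the SPAN column)
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def gap_between(a: Interval, b: Interval) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    if a.chrom != b.chrom:
        raise ValueError("intervals are on different chromosomes")
    lo, hi = sorted((a, b), key=lambda iv: iv.start)
    if hi.start <= lo.end:
        raise ValueError("intervals overlap")
    return hi.start - lo.end - 1


# Self-hits taken from the first row of each BLAT table (reverse strand gene,
# so the upstream section has the higher genomic coordinates).
upstream = Interval("chr7", 4682299, 4685298)
downstream = Interval("chr7", 4678594, 4681593)

if __name__ == "__main__":
    print("upstream length:", upstream.length)
    print("downstream length:", downstream.length)
    print("bases between the two sections:", gap_between(upstream, downstream))
```

Under these assumptions the gap between the two sections is 705 bp, which agrees with the ~705 bp effective cKO region stated in the strategy summary.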


Gene and information: Hspbp1

HSPA (heat shock 70kDa) binding protein, cytoplasmic cochaperone 1 [Mus musculus (house mouse)]
Gene ID: 66245, updated on 24-Oct-2019

Gene summary

Official Symbol: Hspbp1 (provided by MGI)
Official Full Name: HSPA (heat shock 70kDa) binding protein, cytoplasmic cochaperone 1 (provided by MGI)
Primary source: MGI:MGI:1913495
See related: Ensembl:ENSMUSG00000063802
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 1500019G21Rik
Expression: Ubiquitous expression in ovary adult (RPKM 24.7), adrenal adult (RPKM 23.2) and 28 other tissues
Orthologs: human

Genomic context

Location: 7; 7 A1
Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (4660515..4685188, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (4612123..4636565, complement)

[Figure: genomic context of Hspbp1 on chromosome 7 - NC_000073.6]


Transcript information: This gene has 6 transcripts

Gene: Hspbp1 ENSMUSG00000063802

Description: HSPA (heat shock 70kDa) binding protein, cytoplasmic cochaperone 1 [Source:MGI Symbol;Acc:MGI:1913495]
Gene Synonyms: 1500019G21Rik
Location: Chromosome 7: 4,660,521-4,685,068 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 6 transcripts (splice variants), 169 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 10 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt     Flags
Hspbp1-201  ENSMUST00000079970.5  1593  357aa       ENSMUSP00000078886.4  Protein coding  CCDS20740  Q99P31      TSL:1, GENCODE basic, APPRIS P1
Hspbp1-203  ENSMUST00000205952.1  933   275aa       ENSMUSP00000145960.1  Protein coding  -          A0A0U1RPF2  CDS 3' incomplete, TSL:5
Hspbp1-206  ENSMUST00000206946.1  820   256aa       ENSMUSP00000146248.1  Protein coding  -          A0A0U1RQ49  CDS 3' incomplete, TSL:2
Hspbp1-204  ENSMUST00000206306.1  813   211aa       ENSMUSP00000145954.1  Protein coding  -          A0A0U1RPE7  CDS 3' incomplete, TSL:2
Hspbp1-202  ENSMUST00000205474.1  306   97aa        ENSMUSP00000145614.1  Protein coding  -          A0A0U1RNL5  CDS 5' incomplete, TSL:5
Hspbp1-205  ENSMUST00000206708.1  262   No protein  -                     lncRNA          -          -           TSL:5


[Figure: Ensembl region view, 44.55 kb, spanning 4.66 Mb to 4.69 Mb. Forward strand: Gm44878-201 (TEC); Brsk1-201, Brsk1-202, Brsk1-204 and Brsk1-205 (protein coding). Contig: AC161197.9. Reverse strand: Ppp6r1-201 and Ppp6r1-202 (protein coding), Ppp6r1-203 (retained intron); Hspbp1-201, Hspbp1-202, Hspbp1-203, Hspbp1-204 and Hspbp1-206 (protein coding), Hspbp1-205 (lncRNA). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000079970

[Figure: protein domain view of Hspbp1-201 (protein coding; reverse strand, 24.55 kb) for ENSMUSP00000078... Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily: Armadillo-type fold; Pfam: Nucleotide exchange factor Fes1; PROSITE profiles: PS51257; PANTHER: PTHR19316, PTHR19316:SF18; Gene3D: Armadillo-like helical; sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 357.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
