https://www.alphaknockout.com

Mouse Hspb7 Knockout Project (CRISPR/Cas9)

Objective: To create an Hspb7 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hspb7 gene (NCBI Reference Sequence: NM_013868; Ensembl: ENSMUSG00000006221) is located on mouse chromosome 4. Three exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 3 (Transcript: ENSMUST00000102486). Exons 1~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele show embryonic lethality during organogenesis and defects in heart development associated with increased thin filament length and formation of atypical actin filament bundles in cardiomyocytes.

Exon 1 starts from about 0.2% of the coding region. Exons 1~3 cover 100.0% of the coding region. The size of the effective KO region is ~2271 bp. The KO region does not contain any other known gene.
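As a cross-check on the stated KO region size, the span can be recomputed from the chr4 coordinates listed in the BLAT search results of this report. The sketch below assumes that the upstream 2000 bp flank ends immediately before the start codon, that the downstream 2000 bp flank begins immediately after the stop codon, and that the effective KO region is the span between the two flanks. The constant and function names are illustrative only and are not part of the original report.

```python
# Coordinates copied from the BLAT results in this report (chr4, forward strand).
# Assumption: the flanks abut the start and stop codons exactly.
UPSTREAM_FLANK_END = 141_421_785      # last base of the 2000 bp upstream flank
DOWNSTREAM_FLANK_START = 141_424_057  # first base of the 2000 bp downstream flank


def span_between_flanks(upstream_end: int, downstream_start: int) -> int:
    """Count the bases strictly between two flanks (1-based, inclusive coordinates)."""
    if downstream_start <= upstream_end:
        raise ValueError("downstream flank must start after the upstream flank ends")
    return downstream_start - upstream_end - 1


ko_size = span_between_flanks(UPSTREAM_FLANK_END, DOWNSTREAM_FLANK_START)
print(f"Span from start codon to stop codon: {ko_size} bp")
```

Under these assumptions the span comes out at 2271 bp, which agrees with the ~2271 bp figure quoted for the effective KO region.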

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2 and 3 and two gRNA regions. Legend: exon of mouse Hspb7; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
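The self-alignment described in this note can be sketched as an exact-match dot plot: every window of the sequence is compared with every other window, on the forward strand and as a reverse complement, and off-diagonal matches indicate repeats. The code below is a minimal illustration under that assumption. The report does not say how its plotting tool scores windows, so exact matching and all function names here are hypothetical choices.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_index(seq, window):
    """Map each window-sized word to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)
    return index


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) dots for a self comparison.

    forward: the word at i equals the word at j, with i != j (main diagonal dropped).
    reverse: the word at i equals the reverse complement of the word at j.
    """
    seq = seq.upper()
    index = kmer_index(seq, window)
    forward, reverse = [], []
    for i in range(len(seq) - window + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index.get(word, []) if j != i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy input: a 10 bp unit repeated three times produces off-diagonal forward dots.
toy_forward, toy_reverse = self_dot_plot("ACGGTCATTG" * 3, window=10)
print(len(toy_forward), "forward dots off the main diagonal")
```

A sequence with no repeats yields an empty forward list, which corresponds to the "no significant tandem repeat" reading of the plot; parallel off-diagonal lines of dots correspond to tandem repeats.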

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(22.25% 445) | C(28.35% 567) | T(20.4% 408) | G(29.0% 580)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
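The windowed GC analysis described here can be sketched as a sliding-window calculation over the flank sequence. The example below is only an illustration: the 300 bp default mirrors the window size stated above, while the step size and the 0.70 cutoff for a "high GC" window are hypothetical values, since the report does not state which threshold it applies.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """Return (start, gc_fraction) for each full window along the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq, window=300, step=1, threshold=0.70):
    """Windows whose GC fraction reaches the (hypothetical) threshold."""
    return [(s, gc) for s, gc in sliding_gc(seq, window, step) if gc >= threshold]


# Toy input with a short window so the result is easy to follow by eye.
toy_profile = sliding_gc("GGCCAATT", window=4, step=2)
print(toy_profile)
```

A region would be treated as suitable for PCR screening when `high_gc_windows` returns an empty list for the chosen threshold.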

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(22.9% 458) | C(30.25% 605) | T(20.05% 401) | G(26.8% 536)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
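The two Summary lines give per-base counts, from which an overall GC percentage for each 2000 bp flank follows directly. The sketch below parses the Summary lines exactly as they are printed in this report and sums the G and C counts. The regular expression and function names are illustrative assumptions about the line format, not a documented interface.

```python
import re

# Matches fields such as "A(22.25% 445)": base letter, percentage, count.
BASE_FIELD = re.compile(r"([ACGT])\(([\d.]+)% (\d+)\)")

UP = "Summary: Full Length(2000bp) | A(22.25% 445) | C(28.35% 567) | T(20.4% 408) | G(29.0% 580)"
DOWN = "Summary: Full Length(2000bp) | A(22.9% 458) | C(30.25% 605) | T(20.05% 401) | G(26.8% 536)"


def parse_summary(line):
    """Return a dict of base -> count taken from one Summary line."""
    return {base: int(count) for base, _pct, count in BASE_FIELD.findall(line)}


def gc_percent(counts):
    """Overall GC percentage from a base -> count dict."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


for label, line in (("up", UP), ("down", DOWN)):
    counts = parse_summary(line)
    print(f"{label}: total={sum(counts.values())} bp, GC={gc_percent(counts):.2f}%")
```

Both flanks sum to 2000 bp, with an overall GC content of 57.35% upstream and 57.05% downstream. The overall values are close, so the high GC-content regions reported for the downstream flank reflect local windows rather than the flank average.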

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr4   +       141419786  141421785  2000
YourSeq  34     18     55    2000   94.8%     chr11  +       69506694   69506731   38
YourSeq  29     11     41    2000   96.8%     chr9   -       20641770   20641800   31
YourSeq  29     4      44    2000   85.4%     chr10  +       71280876   71280916   41
YourSeq  27     24     52    2000   96.6%     chr2   -       31529605   31529633   29
YourSeq  27     28     54    2000   100.0%    chr15  -       84467884   84467910   27
YourSeq  27     7      43    2000   86.5%     chr9   +       70182725   70182761   37
YourSeq  26     28     55    2000   96.5%     chr6   -       52669426   52669453   28
YourSeq  26     28     53    2000   100.0%    chr1   -       36470605   36470630   26
YourSeq  25     33     57    2000   100.0%    chr8   -       71593373   71593397   25
YourSeq  25     28     54    2000   96.3%     chr7   -       80783851   80783877   27
YourSeq  25     28     54    2000   96.3%     chr1   -       55669944   55669970   27
YourSeq  25     29     53    2000   100.0%    chr5   +       134598072  134598096  25
YourSeq  24     24     57    2000   85.3%     chr17  -       6573723    6573756    34
YourSeq  24     29     52    2000   100.0%    chr11  +       115388921  115388944  24
YourSeq  23     33     55    2000   100.0%    chr6   -       87450119   87450141   23
YourSeq  23     29     52    2000   100.0%    chr1   +       75242881   75242907   27
YourSeq  23     29     53    2000   96.0%     chr1   +       60943056   60943080   25
YourSeq  23     33     57    2000   96.0%     chr1   +       16767499   16767523   25
YourSeq  22     33     54    2000   100.0%    chr5   -       134201570  134201591  22

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr4   +       141424057  141426056  2000
YourSeq  85     662    749   2000   98.9%     chr1   -       78031957   78032050   94
YourSeq  69     680    749   2000   100.0%    chr1   +       55657736   55657821   86
YourSeq  58     690    747   2000   100.0%    chr18  -       42319352   42319409   58
YourSeq  57     657    739   2000   91.1%     chr7   -       80769431   80769512   82
YourSeq  52     690    741   2000   100.0%    chr1   -       24576110   24576161   52
YourSeq  48     696    743   2000   100.0%    chr1   -       26355758   26355805   48
YourSeq  47     694    740   2000   100.0%    chr1   -       46110630   46110676   47
YourSeq  34     1288   1323  2000   100.0%    chr10  -       76115696   76115881   186
YourSeq  32     712    743   2000   100.0%    chr1   +       12827959   12827990   32
YourSeq  25     443    467   2000   100.0%    chr8   +       74620333   74620357   25
YourSeq  24     1291   1319  2000   81.5%     chr17  +       71119941   71119967   27
YourSeq  24     1896   1919  2000   100.0%    chr14  +       76764874   76764897   24
YourSeq  23     1717   1740  2000   100.0%    chr1   +       12688952   12688982   31
YourSeq  21     1895   1915  2000   100.0%    chr9   -       119174576  119174596  21
YourSeq  21     1573   1593  2000   100.0%    chr7   -       40004416   40004436   21

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
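Screening BLAT rows like those listed for the two flanks can be sketched as a small parser plus a filter that sets aside the full-length self match and keeps any remaining hit above a score cutoff. The sample rows below are copied from the downstream results in this report. The `min_score` default of 100 is a hypothetical cutoff chosen for illustration, since the report does not state what it counts as significant similarity, and the class and function names are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one whitespace-separated BLAT row; a leading 'browser details' is tolerated."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {row!r}")
    query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(query, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand, int(ts), int(te), int(span))


def off_target_hits(hits, min_score=100):
    """Hits other than the full-length self match that reach min_score (hypothetical cutoff)."""
    return [
        h for h in hits
        if h.score >= min_score and not (h.q_start == 1 and h.q_end == h.q_size)
    ]


ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr4 + 141424057 141426056 2000",
    "YourSeq 85 662 749 2000 98.9% chr1 - 78031957 78032050 94",
    "YourSeq 25 443 467 2000 100.0% chr8 + 74620333 74620357 25",
]
hits = [parse_hit(row) for row in ROWS]
print(off_target_hits(hits))
```

With the illustrative cutoff of 100, none of the three sample rows is flagged, because the only high-scoring row is the query matching its own locus on chr4. Lowering the cutoff to 50 would flag the 85-point chr1 hit, which shows how strongly the outcome depends on the chosen threshold.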

Gene and information: Hspb7 heat shock protein family, member 7 (cardiovascular) [Mus musculus (house mouse)]
Gene ID: 29818, updated on 3-Sep-2019

Gene summary

Official Symbol: Hspb7 (provided by MGI)
Official Full Name: heat shock protein family, member 7 (cardiovascular) (provided by MGI)
Primary source: MGI:MGI:1352494
See related: Ensembl:ENSMUSG00000006221
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 27kDa; cvHsp; Hsp25-2
Expression: Biased expression in heart adult (RPKM 366.5) and stomach adult (RPKM 18.0)
Orthologs: human

Genomic context

Location: 4; 4 D3
Exon count: 3

Annotation release  Status             Assembly                    Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (141420779..141425310)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (140976694..140981225)

[Figure: genomic context, Chromosome 4 - NC_000070.6.]

Transcript information: This gene has 1 transcript

Gene: Hspb7 ENSMUSG00000006221

Description: heat shock protein family, member 7 (cardiovascular) [Source:MGI Symbol;Acc:MGI:1352494]
Gene Synonyms: Hsp25-2, cvHsp
Location: Chromosome 4: 141,420,779-141,425,311 forward strand. GRCm38:CM000997.2
About this gene: This gene has 1 transcript (splice variant), 238 orthologues, 8 paralogues, is a member of 1 Ensembl protein family and is associated with 12 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Hspb7-201  ENSMUST00000102486.4  2769  169aa    ENSMUSP00000099544.4  Protein coding  CCDS18873  P35385   TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl region view, 24.53 kb, 141.415Mb to 141.435Mb. Forward strand: Gm13075-202 (lncRNA), Hspb7-201 (protein coding). Contig: AL670285.10. Reverse strand: Clcnkb-202 (protein coding), Srarp-201 (protein coding), Clcnkb-201 (protein coding), Clcnkb-204 (lncRNA). Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana), Non-Protein Coding (RNA gene).]

Transcript: ENSMUST00000102486

[Figure: Ensembl transcript view of Hspb7-201 (protein coding), 4.53 kb, forward strand, with a scale bar from 0 to 169. Domain and feature tracks for ENSMUSP00000099...: MobiDB lite; Low complexity (Seg); Superfamily: HSP20-like chaperone; Prints: Alpha crystallin/Heat shock protein; Pfam: Alpha crystallin/Hsp20 domain; PROSITE profiles: Alpha crystallin/Hsp20 domain; PANTHER: PTHR46907:SF2, PTHR46907; Gene3D: HSP20-like chaperone; CDD: Heat shock protein beta-7, ACD domain. Sequence variants (dbSNP and all other sources); variant legend: inframe deletion, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
