Mouse Hsph1 Conditional Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create an Hsph1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Hsph1 gene (NCBI Reference Sequence: NM_013559; Ensembl: ENSMUSG00000029657) is located on mouse chromosome 5. Eighteen exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 18 (Transcript: ENSMUST00000202361).
Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Hsph1 gene.
To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-256B18 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Homozygous inactivation of this gene leads to decreased susceptibility to ischemic brain injury.
Exon 2 starts at about 4.2% of the coding region. Knockout of exon 2 will result in a frameshift of the gene.
Intron 1, used for 5'-loxP site insertion, is 2082 bp; intron 2, used for 3'-loxP site insertion, is 1655 bp.
The effective cKO region is ~558 bp.
No other known gene lies within the cKO region.

Overview of the Targeting Strategy
[Figure: wildtype allele (exons 1, 2, 3 and 18 shown, with the 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Hsph1; homology arm; cKO region; loxP site.]

Overview of the Dot Plot
[Figure: dot plot, window size 10 bp; forward and reverse-complement matches; axes labelled "Sequence 12".]
Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
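The self-comparison behind a dot plot of this kind can be sketched in a few lines of Python. This is an illustration only, not the tool used to produce the figure: the function names `reverse_complement` and `self_dot_matches` are hypothetical, and the sketch assumes exact matching of 10 bp words, which the report does not specify.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (assumes an A/C/G/T alphabet).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_matches(seq, window=10):
    """Compare a sequence with itself using exact window-length words.

    Returns two lists of (i, j) positions:
      forward: word at i equals word at j, with j > i (off the main diagonal)
      reverse: reverse complement of the word at i equals the word at j
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1

    # Index every word by its start positions so each lookup is a dict access.
    index = defaultdict(list)
    for i in range(n_words):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_words):
        word = seq[i:i + window]
        # Keep only the upper triangle; the main diagonal (j == i) is trivial.
        forward.extend((i, j) for j in index[word] if j > i)
        # Reverse-complement hits indicate inverted repeats.
        reverse.extend((i, j) for j in index.get(reverse_complement(word), ()))
    return forward, reverse
```

Runs of forward dots parallel to the main diagonal are what a tandem repeat looks like in such a plot; an empty or sparse `forward` list corresponds to the "no significant tandem repeat" reading above. Indexing the words in a dictionary avoids comparing every position against every other position.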
Overview of the GC Content Distribution
[Figure: GC content distribution, window size 300 bp; x-axis labelled "Sequence 12".]
Summary: Full length 7058 bp
  A: 1823 (25.83%)
  C: 1542 (21.85%)
  T: 1961 (27.78%)
  G: 1732 (24.54%)
Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. Significant high GC-content regions are found. It may be difficult to construct this targeting vector.

BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       149634138  149637137  3000
YourSeq  54     1062   1174  3000   90.8%     chr3   +       40745678   40745790   113
YourSeq  22     1893   1917  3000   96.0%     chr12  -       74088105   74088131   27
Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       149630580  149633579  3000
YourSeq  60     213    318   3000   81.3%     chr3   -       43096731   43096859   129
YourSeq  53     215    292   3000   78.2%     chr10  +       95926408   95926471   64
YourSeq  50     213    299   3000   90.4%     chr2   +       30550436   30550826   391
YourSeq  49     232    300   3000   88.9%     chr10  -       18595911   18595979   69
YourSeq  44     220    290   3000   94.0%     chr2   +       35251855   35251948   94
YourSeq  42     248    299   3000   90.4%     chr18  +       67714586   67714637   52
YourSeq  41     248    292   3000   95.6%     chr16  +       58684206   58684250   45
YourSeq  40     232    278   3000   93.7%     chr4   -       149573710  149573757  48
YourSeq  39     248    292   3000   93.4%     chr11  -       84150701   84150745   45
YourSeq  39     250    312   3000   89.8%     chr1   -       179403729  179403795  67
YourSeq  38     168    229   3000   97.5%     chr17  -       53635242   53635605   364
YourSeq  38     215    280   3000   88.7%     chr11  -       3267405    3267469    65
YourSeq  37     248    284   3000   100.0%    chr14  +       67960320   67960356   37
YourSeq  36     248    301   3000   83.4%     chr2   -       127063781  127063834  54
YourSeq  36     249    300   3000   84.7%     chr5   +       128911206  128911257  52
YourSeq  35     217    270   3000   89.2%     chr2   -       52699786   52699837   52
YourSeq  34     243    282   3000   92.5%     chr9   -       55236146   55236185   40
YourSeq  34     214    279   3000   75.8%     chr19  -       44402658   44402723   66
YourSeq  33     426    464   3000   86.9%     chr2   +       158557838  158557875  38
Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Hsph1 heat shock 105kDa/110kDa protein 1 [ Mus musculus (house mouse) ]
Gene ID: 15505, updated on 12-Aug-2019

Gene summary
Official Symbol: Hsph1 (provided by MGI)
Official Full Name: heat shock 105kDa/110kDa protein 1 (provided by MGI)
Primary source: MGI:MGI:105053
See related: Ensembl:ENSMUSG00000029657
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 105kDa; Hsp105; Hsp110; hsp-E7I; AI790491; hsp110/105
Expression: Broad expression in cortex adult (RPKM 30.2), CNS E11.5 (RPKM 29.9) and 25 other tissues
Orthologs: human; all

Genomic context
Location: 5 G3; 5 89.18 cM
Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (149616843..149636498, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (150419420..150438890, complement)

[Figure: genomic context, Chromosome 5 - NC_000071.6.]

Transcript information:
This gene has 11 transcripts

Gene: Hsph1 ENSMUSG00000029657
Description: heat shock 105kDa/110kDa protein 1 [Source:MGI Symbol;Acc:MGI:105053]
Gene Synonyms: HSP110, Hsp105, hsp-E7I, hsp110/105
Location: Chromosome 5: 149,614,287-149,636,376 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 11 transcripts (splice variants), 163 orthologues, 12 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts
Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Hsph1-211  ENSMUST00000202361.3   3802  858aa       ENSMUSP00000144413.1  Protein coding   CCDS19885  Q61699      TSL:1 GENCODE basic APPRIS P3
Hsph1-201  ENSMUST00000074846.13  3240  814aa       ENSMUSP00000074392.8  Protein coding   CCDS85010  Q61699      TSL:1 GENCODE basic APPRIS ALT1
Hsph1-205  ENSMUST00000201452.3   3140  858aa       ENSMUSP00000144654.1  Protein coding   CCDS19885  Q61699      TSL:1 GENCODE basic APPRIS P3
Hsph1-209  ENSMUST00000202089.3   3054  817aa       ENSMUSP00000144297.1  Protein coding   -          E9Q0U7      TSL:5 GENCODE basic
Hsph1-206  ENSMUST00000201559.3   661   144aa       ENSMUSP00000144043.1  Protein coding   -          D3Z3I9      CDS 3' incomplete TSL:5
Hsph1-202  ENSMUST00000200805.3   587   94aa        ENSMUSP00000143925.1  Protein coding   -          A0A0J9YTZ7  CDS 3' incomplete TSL:3
Hsph1-203  ENSMUST00000200825.1   416   100aa       ENSMUSP00000143913.1  Protein coding   -          D3Z027      CDS 3' incomplete TSL:2
Hsph1-204  ENSMUST00000201431.3   4764  No protein  -                     Retained intron  -          -           TSL:1
Hsph1-210  ENSMUST00000202137.1   752   No protein  -                     Retained intron  -          -           TSL:2
Hsph1-208  ENSMUST00000201877.1   751   No protein  -                     Retained intron  -          -           TSL:2
Hsph1-207  ENSMUST00000201666.1   254   No protein  -                     lncRNA           -          -           TSL:5

[Figure: Ensembl region view, 42.09 kb, chromosome 5 from 149.61 Mb to 149.64 Mb. Forward strand genes: Wdr95-201 (protein coding), Gm20005-201 (lncRNA), Wdr95-207 (protein coding). Contig: AC119856.13. Reverse strand genes: Hsph1-205, Hsph1-211, Hsph1-201, Hsph1-209, Hsph1-206, Hsph1-202, Hsph1-203 (protein coding); Hsph1-208, Hsph1-210, Hsph1-204 (retained intron); Hsph1-207 (lncRNA). Regulatory Build track. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]

Transcript: ENSMUST00000202361
[Figure: protein domain view of Hsph1-211 (protein coding), reverse strand, 19.80 kb, protein ENSMUSP00000144..., scale 0 to 858 aa. Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily (SSF53067; Heat shock protein 70kD, peptide-binding domain superfamily; Heat shock protein 70kD, C-terminal domain superfamily); Prints (Heat shock protein 70 family); Pfam (Heat shock protein 70 family); PROSITE patterns (Heat shock protein 70, conserved site); PANTHER (PTHR45639:SF2; PTHR45639); Gene3D (3.30.30.30; 3.90.640.10; Heat shock protein 70kD, peptide-binding domain superfamily; 3.30.420.40; Heat shock protein 70kD, C-terminal domain superfamily); CDD (HSPH1, nucleotide-binding domain); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.