Mouse Hmbox1 Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective: To create a Hmbox1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Hmbox1 gene (NCBI Reference Sequence: NM_177338; Ensembl: ENSMUSG00000021972) is located on mouse chromosome 14. 10 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 10 (Transcript: ENSMUST00000067843).
Exons 3~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Mice homozygous for a gene-trapped allele exhibit absence of TERT binding to chromatin, as shown by subcellular fractionation analysis of mouse embryonic fibroblasts.
Exon 3 starts at about 1.91% of the coding region.
Exons 3~4 cover 44.79% of the coding region.
The size of the effective KO region: ~9552 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: wildtype allele drawn 5' to 3', showing exons 1, 3, 4 and 10 with a gRNA region on each side of exons 3~4. Legend: exon of mouse Hmbox1; knockout region.]

Overview of the Dot Plot (up)
Window size: 15 bp
[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]
Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)
Window size: 15 bp
[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]
Note: The 2000 bp section downstream of Exon 4 is aligned with itself to determine whether it contains tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
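The self-alignment described in the dot plot notes can be sketched as follows. This is a minimal illustration, not the tool used for the report (which the report does not name): it assumes exact matching of 15 bp windows, whereas real dot-plot programs may tolerate mismatches. The function names `revcomp`, `self_dot_plot` and `repeat_offsets` and the example sequence are hypothetical.

```python
from collections import Counter, defaultdict

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq):
    """Reverse complement of a DNA string (A/C/G/T/N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two sorted lists of (i, j) window start positions:
    forward -- windows at i and j are identical (i < j, so the trivial
               main diagonal is left out);
    reverse -- the window at j is the reverse complement of the one at i.
    """
    seq = seq.upper()
    # Index every window of the sequence by its start position.
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        forward.extend((i, j) for i in positions for j in positions if i < j)
        partners = index.get(revcomp(kmer), [])
        reverse.extend((i, j) for i in positions for j in partners)
    return sorted(forward), sorted(reverse)


def repeat_offsets(forward):
    """Count forward matches per diagonal offset (j - i).

    A tandem repeat shows up as many matches sharing one offset,
    which equals the repeat unit length or a multiple of it.
    """
    return Counter(j - i for i, j in forward)


if __name__ == "__main__":
    # Hypothetical 20 bp unit repeated three times in tandem.
    unit = "ACGTTGCAAGCTTAGGCTAC"
    forward, reverse = self_dot_plot(unit * 3, window=15)
    print(repeat_offsets(forward).most_common(1))
```

In such a sketch, a flank without tandem repeats would give an empty or near-empty `forward` list, while a flank with repeats would concentrate matches on a few offsets, which is the pattern the downstream note reports.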
Overview of the GC Content Distribution (up)
Window size: 300 bp
[Figure: GC content distribution along Sequence 12.]
Summary: Full Length(2000bp) | A(27.8% 556) | C(19.75% 395) | T(32.05% 641) | G(20.4% 408)
Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)
Window size: 300 bp
[Figure: GC content distribution along Sequence 12.]
Summary: Full Length(2000bp) | A(25.6% 512) | C(20.1% 402) | T(34.55% 691) | G(19.75% 395)
Note: The 2000 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr14  -       64897127   64899126   2000
YourSeq  125    669     876   2000   88.4%     chr9   +       75571854   75572062   209
YourSeq  123    670     836   2000   87.4%     chr4   +       106071736  106072229  494
YourSeq  120    677     879   2000   84.5%     chr6   -       107509285  107509495  211
YourSeq  120    670     880   2000   89.0%     chr2   -       128646737  128646959  223
YourSeq  117    710     890   2000   87.7%     chr11  -       62733645   62733834   190
YourSeq  117    669     885   2000   78.3%     chr2   +       116942820  116943044  225
YourSeq  117    668     831   2000   92.1%     chr11  +       32105104   32310599   205496
YourSeq  115    670     822   2000   89.2%     chr10  -       75972264   75972583   320
YourSeq  113    678     876   2000   86.0%     chr10  -       71368919   71369136   218
YourSeq  112    697     876   2000   81.3%     chr10  -       94671494   94671672   179
YourSeq  111    670     825   2000   86.8%     chrX   -       136768234  136768398  165
YourSeq  111    669     841   2000   88.3%     chr6   -       31507301   31507482   182
YourSeq  111    698     885   2000   86.3%     chr1   -       177331476  177331663  188
YourSeq  109    647     839   2000   79.1%     chr19  -       59341464   59341612   149
YourSeq  107    680     857   2000   88.1%     chr18  -       38901045   38901231   187
YourSeq  107    669     825   2000   89.7%     chr15  -       35460727   35460887   161
YourSeq  106    713     885   2000   84.8%     chr6   -       108671510  108671682  173
YourSeq  105    697     861   2000   89.4%     chr8   -       104545689  104545864  176
YourSeq  105    667     881   2000   84.4%     chr15  -       8247438    8247660    223
Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. Apart from the query's own locus on chr14 (first row), no significant similarity is found.

BLAT Search Results (down)
QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr14  -       64885575   64887574   2000
YourSeq  68     464     619   2000   85.8%     chr19  -       45277643   45277791   149
YourSeq  56     1802    1882  2000   82.7%     chr14  +       21768496   21768574   79
YourSeq  52     1802    1881  2000   80.8%     chr9   +       14375123   14375201   79
YourSeq  47     1813    1879  2000   85.1%     chr1   -       180821849  180821915  67
YourSeq  43     1820    1880  2000   85.3%     chr18  +       34560746   34560806   61
YourSeq  43     1813    1871  2000   82.8%     chr11  +       97623030   97623087   58
YourSeq  42     1837    1884  2000   93.8%     chr8   -       72716680   72716727   48
YourSeq  42     1810    1871  2000   81.7%     chr11  +       79925475   79925535   61
YourSeq  41     1064    1123  2000   80.0%     chr14  +       121397237  121397288  52
YourSeq  40     457     502   2000   93.5%     chr6   -       9414825    9414870    46
YourSeq  40     1821    1876  2000   85.8%     chr5   -       147897567  147897622  56
YourSeq  40     1837    1880  2000   95.5%     chr4   +       40480448   40480491   44
YourSeq  40     1080    1122  2000   97.7%     chr1   +       78667808   78667992   185
YourSeq  38     1815    1880  2000   75.4%     chr3   -       95546081   95546145   65
YourSeq  38     1077    1123  2000   80.5%     chr12  -       54221765   54221805   41
YourSeq  38     458     520   2000   88.4%     chr11  -       78766835   78766895   61
YourSeq  38     1082    1122  2000   97.6%     chr10  +       102360440  102360481  42
YourSeq  37     1085    1123  2000   97.5%     chr9   +       57767410   57767448   39
YourSeq  37     1820    1866  2000   89.4%     chr7   +       135382481  135382527  47
Note: The 2000 bp section downstream of Exon 4 is BLAT searched against the genome. Apart from the query's own locus on chr14 (first row), no significant similarity is found.

Gene and protein information:
Hmbox1 homeobox containing 1 [ Mus musculus (house mouse) ]
Gene ID: 219150, updated on 10-Oct-2019

Gene summary
Official Symbol: Hmbox1 (provided by MGI)
Official Full Name: homeobox containing 1 (provided by MGI)
Primary source: MGI:MGI:2445066
See related: Ensembl:ENSMUSG00000021972
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI451877; AI604847; F830020C16Rik
Expression: Ubiquitous expression in lung adult (RPKM 16.3), thymus adult (RPKM 10.2) and 28 other tissues
Orthologs: human, all

Genomic context
Location: 14; 14 D1
Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (64811600..64949899, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (65441055..65568684, complement)

[Figure: genomic context of Hmbox1 on Chromosome 14 - NC_000080.6.]

Transcript information:
This gene has 10 transcripts.

Gene: Hmbox1 ENSMUSG00000021972
Description: homeobox containing 1 [Source:MGI Symbol;Acc:MGI:2445066]
Gene Synonyms: F830020C16Rik
Location: Chromosome 14: 64,811,600-64,949,871 reverse strand. GRCm38:CM001007.2
About this gene: This gene has 10 transcripts (splice variants), 262 orthologues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts
Name        Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Hmbox1-201  ENSMUST00000022544.13  3396  404aa       ENSMUSP00000022544.7  Protein coding   CCDS84149  Q8BJA3   TSL:1, GENCODE basic
Hmbox1-205  ENSMUST00000176128.7   2958  420aa       ENSMUSP00000135448.1  Protein coding   CCDS84150  H3BKM3   TSL:5, GENCODE basic, APPRIS ALT1
Hmbox1-202  ENSMUST00000067843.9   2931  419aa       ENSMUSP00000066905.3  Protein coding   CCDS36955  Q8BJA3   TSL:1, GENCODE basic, APPRIS P3
Hmbox1-203  ENSMUST00000175744.7   1771  405aa       ENSMUSP00000135272.1  Protein coding   -          H3BK67   TSL:1, GENCODE basic, APPRIS ALT1
Hmbox1-210  ENSMUST00000177326.7   1729  445aa       ENSMUSP00000135372.2  Protein coding   -          H3BKF8   CDS 5' incomplete, TSL:5
Hmbox1-204  ENSMUST00000175905.7   1715  416aa       ENSMUSP00000135657.2  Protein coding   -          H3BL55   TSL:5, GENCODE basic
Hmbox1-209  ENSMUST00000176832.7   1586  408aa       ENSMUSP00000135211.1  Protein coding   -          H3BK13   TSL:5, GENCODE basic
Hmbox1-207  ENSMUST00000176489.7   1422  364aa       ENSMUSP00000134824.1  Protein coding   -          H3BJ31   CDS 3' incomplete, TSL:5
Hmbox1-208  ENSMUST00000176657.1   4740  No protein  -                     Retained intron  -          -        TSL:1
Hmbox1-206  ENSMUST00000176386.1   614   No protein  -                     lncRNA           -          -        TSL:5

[Figure: genome browser view of a 158.27 kb region, scale 64.82Mb to 64.94Mb. Forward-strand gene tracks: Kif13b-201, Kif13b-203 (protein coding), Kif13b-205 (retained intron), Ints9-201 (protein coding), Ints9-202, Ints9-203 (retained intron), Gm20111-201, Gm4573-201 (processed pseudogenes). Contigs: AC132602.2, AC141425.3. Reverse-strand gene tracks begin with Hmbox1-201 (protein coding).]