Mouse Heatr1 Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create a Heatr1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Heatr1 gene (NCBI Reference Sequence: NM_144835; Ensembl: ENSMUSG00000050244) is located on mouse chromosome 13. 45 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 45 (Transcript: ENSMUST00000059270). Exons 3~10 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 3 starts at about 2.22% of the coding region. Exons 3~10 cover 18.07% of the coding region. The size of the effective KO region is ~9712 bp. The KO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: wildtype allele drawn 5' to 3', with exon labels 1, 3, 4, 5, 6, 7, 8, 9, 10 and 45 and a gRNA region on each side of the knockout region. Legend: exon of mouse Heatr1; knockout region.]

Overview of the Dot Plot (up)
[Figure: dot plot, window size 15 bp; forward and reverse-complement matches.]
Note: The 508 bp section upstream of Exon 3 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)
[Figure: dot plot, window size 15 bp; forward and reverse-complement matches.]
Note: The 391 bp section downstream of Exon 10 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
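The self-alignment behind a dot plot can be sketched in a few lines of Python. This is only an illustration, not the tool used to produce the report: the function names `revcomp` and `self_matches` and the toy sequence are hypothetical, and only the 15 bp window size is taken from the text above. The sketch lists pairs of window positions that match each other in the forward or reverse-complement orientation; such off-diagonal hits are what a tandem or inverted repeat would look like in the dot plot matrix.

```python
from collections import defaultdict

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def self_matches(seq, window=15):
    """Return (i, j, strand) for window starts i < j (0-based) whose
    windows are identical ('+') or reverse complements ('-')."""
    n_windows = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    hits = []
    for i in range(n_windows):
        word = seq[i:i + window]
        for j in index[word]:
            if j > i:
                hits.append((i, j, "+"))
        for j in index.get(revcomp(word), []):
            if j > i:
                hits.append((i, j, "-"))
    return hits


# Hypothetical toy sequence: a 15 bp unit repeated twice between two flanks.
unit = "ACGTTGCAAGGCTTA"
toy = "GGATCCTTAGC" + unit * 2 + "TTGACCA"
print(self_matches(toy))
```

A sequence without repeats of at least the window length yields an empty list, which corresponds to a dot plot showing only the main diagonal.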
Overview of the GC Content Distribution (up)
[Figure: GC content distribution, window size 300 bp.]
Summary: Full Length (508 bp) | A (28.15%, 143) | C (17.13%, 87) | T (27.95%, 142) | G (26.77%, 136)
Note: The 508 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)
[Figure: GC content distribution, window size 300 bp.]
Summary: Full Length (391 bp) | A (29.16%, 114) | C (11.76%, 46) | T (38.11%, 149) | G (20.97%, 82)
Note: The 391 bp section downstream of Exon 10 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND      START        END   SPAN
---------------------------------------------------------------------------------------
YourSeq    508      1  508    508    100.0%  chr13  +        12395915   12396422    508
YourSeq     33     43   76    508    100.0%  chr10  +        21473856   21497396  23541
YourSeq     24    205  234    508     92.9%  chr10  -        22320125   22320156     32
YourSeq     22     43   64    508    100.0%  chr14  -        70799391   70799412     22
YourSeq     22     43   64    508    100.0%  chr13  -        78343516   78343537     22
YourSeq     22     43   64    508    100.0%  chr14  +        96457890   96457911     22
YourSeq     22     51   72    508    100.0%  chr1   +       170909542  170909563     22
YourSeq     22     47   69    508    100.0%  chr1   +       123829978  123830001     24
YourSeq     20     41   62    508     95.5%  chr1   -       161231721  161231742     22
Note: The 508 bp section upstream of Exon 3 is BLAT searched against the genome. Apart from the expected full-length match to the query locus on chr13, no significant similarity is found.
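The base counts in the two GC content summaries can be cross-checked with a short calculation. The sketch below is illustrative only: the names `gc_percent`, `UPSTREAM` and `DOWNSTREAM` are hypothetical, the counts are copied from the summaries, and it computes the overall GC fraction of each section rather than the 300 bp sliding-window profile shown in the figures.

```python
def gc_percent(counts):
    """Overall GC content (percent) from a dict of per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


# Base counts copied from the report's summary lines.
UPSTREAM = {"A": 143, "C": 87, "T": 142, "G": 136}    # 508 bp section
DOWNSTREAM = {"A": 114, "C": 46, "T": 149, "G": 82}   # 391 bp section

for name, counts in (("upstream", UPSTREAM), ("downstream", DOWNSTREAM)):
    length = sum(counts.values())
    print(f"{name}: {length} bp, GC = {gc_percent(counts):.2f}%")
```

The counts sum to the stated section lengths, and the resulting overall GC fractions (roughly 44% upstream and 33% downstream) are consistent with the report's conclusion that neither flank is GC-rich.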
BLAT Search Results (down)
QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND      START        END   SPAN
---------------------------------------------------------------------------------------
YourSeq    391      1  391    391    100.0%  chr13  +        12406135   12406525    391
YourSeq     27    200  230    391     93.6%  chr1   -       142330730  142330760     31
YourSeq     26    288  320    391     96.5%  chr12  -       119390481  119390521     41
YourSeq     23    205  229    391     96.0%  chr16  -        75895755   75895779     25
YourSeq     23     23   47    391     87.5%  chr13  -        64350601   64350624     24
YourSeq     22     10   35    391     92.4%  chr12  +        67544370   67544395     26
YourSeq     21    199  220    391    100.0%  chr10  -        89549524   89549546     23
YourSeq     21    260  282    391     95.7%  chr13  +         8495951    8495973     23
YourSeq     21    101  121    391    100.0%  chr10  +       122560292  122560312     21
YourSeq     20    267  286    391    100.0%  chr10  -        50627231   50627250     20
YourSeq     20     50   69    391    100.0%  chr12  +        90301703   90301722     20
YourSeq     20    148  167    391    100.0%  chr1   +        64283145   64283164     20
Note: The 391 bp section downstream of Exon 10 is BLAT searched against the genome. Apart from the expected full-length match to the query locus on chr13, no significant similarity is found.
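The top BLAT hits for the two flanking sections can be used to sanity-check the stated KO region size. The sketch below rests on two assumptions that the report does not state explicitly: that the BLAT coordinates are 1-based and inclusive, and that the two flanks directly abut the knockout region. The names `UP_FLANK`, `DOWN_FLANK`, `span` and `gap_between` are hypothetical.

```python
# chr13 coordinates of the full-length BLAT hits (first row of each table).
UP_FLANK = (12395915, 12396422)    # 508 bp section upstream of Exon 3
DOWN_FLANK = (12406135, 12406525)  # 391 bp section downstream of Exon 10


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(upstream, downstream):
    """Number of bases strictly between two non-overlapping intervals."""
    return downstream[0] - upstream[1] - 1


print("upstream flank:", span(*UP_FLANK), "bp")
print("downstream flank:", span(*DOWN_FLANK), "bp")
print("region between flanks:", gap_between(UP_FLANK, DOWN_FLANK), "bp")
```

Under these assumptions the region between the flanks comes to 9712 bp, which agrees with the effective KO region size of ~9712 bp given in the strategy note.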
Gene and protein information:
Heatr1 HEAT repeat containing 1 [ Mus musculus (house mouse) ]
Gene ID: 217995, updated on 12-Aug-2019

Gene summary
Official Symbol: Heatr1 (provided by MGI)
Official Full Name: HEAT repeat containing 1 (provided by MGI)
Primary source: MGI:MGI:2442524
See related: Ensembl:ENSMUSG00000050244
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AA517551; BC019693; B130016L12Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 6.1), liver E14 (RPKM 3.9) and 28 other tissues
Orthologs: human, all

Genomic context
Location: 13 A1; 13 4.56 cM
Exon count: 45

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  13   NC_000079.6 (12395375..12438893)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (12487642..12531160)

[Figure: genomic context, Chromosome 13 - NC_000079.6]

Transcript information:
This gene has 8 transcripts

Gene: Heatr1 ENSMUSG00000050244
Description: HEAT repeat containing 1 [Source:MGI Symbol;Acc:MGI:2442524]
Gene Synonyms: B130016L12Rik
Location: Chromosome 13: 12,395,027-12,440,289 forward strand. GRCm38:CM001006.2
About this gene: This gene has 8 transcripts (splice variants), 237 orthologues and is a member of 1 Ensembl protein family.
Transcripts
Name        Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Heatr1-201  ENSMUST00000059270.9  6780  2143aa      ENSMUSP00000054084.8  Protein coding           CCDS26240  G3X9B1      TSL:5, GENCODE basic, APPRIS P1
Heatr1-208  ENSMUST00000223324.1   193  4aa         ENSMUSP00000152797.1  Protein coding           -          -           CDS 3' incomplete, TSL:3
Heatr1-206  ENSMUST00000222091.1  2482  744aa       ENSMUSP00000152435.1  Nonsense mediated decay  -          A0A1Y7VNI1  CDS 5' incomplete, TSL:2
Heatr1-202  ENSMUST00000221046.1  1325  344aa       ENSMUSP00000152410.1  Nonsense mediated decay  -          A0A1Y7VNG8  TSL:1
Heatr1-204  ENSMUST00000221616.1   681  No protein  -                     Retained intron          -          -           TSL:2
Heatr1-203  ENSMUST00000221051.1   556  No protein  -                     Retained intron          -          -           TSL:3
Heatr1-205  ENSMUST00000221746.1   446  No protein  -                     Retained intron          -          -           TSL:3
Heatr1-207  ENSMUST00000222817.1   373  No protein  -                     Retained intron          -          -           TSL:2

[Figure: Ensembl genome browser view, 65.26 kb, chromosome 13 from 12.40 Mb to 12.44 Mb. Forward strand, Genes (Comprehensive set...): Heatr1-208 (protein coding), Heatr1-205 (retained intron), Heatr1-206 (nonsense mediated decay), Heatr1-201 (protein coding), Heatr1-207 (retained intron), Heatr1-203 (retained intron), Heatr1-202 (nonsense mediated decay), Heatr1-204 (retained intron). Contigs: AC154221.3. Reverse strand, Genes (Comprehensive set...): Gm5928-201 (processed pseudogene), Lgals8-203 (protein coding), Lgals8-202 (protein coding), Lgals8-204 (retained intron), Lgals8-206 (protein coding), Lgals8-207 (protein coding), Lgals8-201 (protein coding), Lgals8-208 (protein coding). Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana, Ensembl protein coding); Non-Protein Coding (pseudogene, processed transcript).]

Transcript: ENSMUST00000059270
[Figure: Heatr1-201 (protein coding), 43.52 kb, forward strand; protein ENSMUSP00000054...]
[Figure: protein feature tracks. Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: Armadillo-type fold; SMART: BP28, C-terminal domain; Pfam: U3 small nucleolar RNA-associated protein 10, N-terminal, and BP28, C-terminal domain; PANTHER: U3 small nucleolar RNA-associated protein 10; Gene3D: Armadillo-like helical; All sequence SNPs/i...: Sequence variants (dbSNP and all other sources). Variant legend: stop gained, missense variant, splice region variant, synonymous variant. Scale bar: 0 to 2143.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.