Mouse Hdgf Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create an Hdgf knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Hdgf gene (NCBI Reference Sequence: NM_008231; Ensembl: ENSMUSG00000004897) is located on mouse chromosome 3. Six exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 6 (Transcript: ENSMUST00000005017).
Exons 1~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Mice homozygous for a targeted disruption of this gene are viable and fertile and display no major morphological, biochemical or behavioral phenotypes except for a significant reduction in rearing activity.
Exon 1 starts from about 0.14% of the coding region.
Exons 1~6 cover 100.0% of the coding region.
The size of the effective KO region: ~8278 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: wildtype allele drawn 5' to 3', showing exons 1 to 6 of mouse Hdgf and two gRNA regions. Legend: exon of mouse Hdgf; knockout region.]

Overview of the Dot Plot (up)
[Figure: dot plot, window size 15 bp. Legend: forward; reverse complement. Axis label: Sequence 12.]
Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)
[Figure: dot plot, window size 15 bp. Legend: forward; reverse complement. Axis label: Sequence 12.]
Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
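The self-alignment described in the dot plot notes can be sketched as follows. This is only a minimal illustration, not the tool that produced the report's plots: it assumes a dot is drawn wherever two 15 bp windows match exactly, either directly or against the reverse complement. The function names (`reverse_complement`, `self_dot_plot`) and the toy sequence are hypothetical and do not come from the report.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (other symbols are left unchanged).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two sorted lists of (i, j) dots:
      forward: windows starting at i and j (i < j) are identical;
      reverse: the window at i equals the window at j of the reverse complement.
    """
    seq = seq.upper()

    # Index every window by its start positions.
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    # Forward dots: every pair of positions sharing the same window,
    # excluding the trivial main diagonal (i == j).
    forward = []
    for positions in index.values():
        for a in positions:
            for b in positions:
                if a < b:
                    forward.append((a, b))

    # Reverse-complement dots: look up each window of the reverse complement.
    rc = reverse_complement(seq)
    reverse = []
    for j in range(len(rc) - window + 1):
        for i in index.get(rc[j:j + window], ()):
            reverse.append((i, j))

    return sorted(forward), sorted(reverse)


if __name__ == "__main__":
    # Toy sequence (hypothetical): a 20 bp unit repeated three times between flanks.
    left = "GATTACAGGCTTAACCGGTATCGA"
    unit = "ACGTGCTAGCTAGGATCCAT"
    right = "TTGACCGGTAAGCTTCCGAATG"
    forward, reverse = self_dot_plot(left + unit * 3 + right, window=15)
    offsets = sorted({j - i for i, j in forward})
    print("forward dots:", len(forward), "offsets:", offsets)
```

In such a plot a tandem repeat appears as runs of forward dots at a constant offset `j - i`, typically a multiple of the repeat unit length, which is the kind of pattern the notes say the gRNA sites were chosen to avoid.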
Overview of the GC Content Distribution (up)
[Figure: GC content distribution, window size 300 bp. Axis label: Sequence 12.]
Summary: Full Length (2000 bp) | A (25.55%, 511) | C (25.8%, 516) | T (21.45%, 429) | G (27.2%, 544)
Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.

Overview of the GC Content Distribution (down)
[Figure: GC content distribution, window size 300 bp. Axis label: Sequence 12.]
Summary: Full Length (2000 bp) | A (25.75%, 515) | C (25.0%, 500) | T (27.3%, 546) | G (21.95%, 439)
Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1   2000   2000    100.0%  chr3   +        87904629   87906628    2000
YourSeq    306     57   1023   2000     88.2%  chr11  +        72554951   72555777     827
YourSeq    299    381    949   2000     89.6%  chr9   -        64278066   64278567     502
YourSeq    280     25    934   2000     87.0%  chr19  -         3536737    3537385     649
YourSeq    248     75    736   2000     90.0%  chr16  -        32318739   32319476     738
YourSeq    241    381    972   2000     86.5%  chr1   +        21334986   21335354     369
YourSeq    233     28    972   2000     85.1%  chr9   +        64181686   64182071     386
YourSeq    232    381    906   2000     91.2%  chr16  -        17221241   17222174     934
YourSeq    232    383    708   2000     90.9%  chr7   +         3269861    3270181     321
YourSeq    223     74    585   2000     87.9%  chr17  -        29272343   29272831     489
YourSeq    222    381    959   2000     83.2%  chr8   +        36672417   36672765     349
YourSeq    222    383    970   2000     84.3%  chr12  +       112990100  112990474     375
YourSeq    217    381    990   2000     86.0%  chr4   +       155977777  155978106     330
YourSeq    213     92    871   2000     90.9%  chr11  +        70243176   70243966     791
YourSeq    211    381    910   2000     92.4%  chr11  +        21233422   21234101     680
YourSeq    210    381    950   2000     82.4%  chr11  +        94606962   94607345     384
YourSeq    208     14    584   2000     79.8%  chr12  -        59263902   59264305     404
YourSeq    207    381    708   2000     92.7%  chr6   -       148824716  148825348     633
YourSeq    200    381    949   2000     85.6%  chr8   +        13970338   13970682     345
YourSeq    199    474    968   2000     85.0%  chr15  -        76665486   76665794     309

Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1   2000   2000    100.0%  chr3   +        87914907   87916906    2000
YourSeq   1004      1   1219   2000     92.8%  chrX   -        79629989   79631163    1175
YourSeq     81   1447   1787   2000     87.8%  chr5   +       129100325  129100828     504
YourSeq     75   1471   1838   2000     91.3%  chr3   +       108281951  108584251  302301
YourSeq     57   1392   1514   2000     78.0%  chr1   -       189216157  189216269     113
YourSeq     53   1425   1993   2000     71.5%  chr17  +        26930154   26930636     483
YourSeq     51   1442   1514   2000     85.0%  chr3   -        53292674   53292746      73
YourSeq     49   1439   1510   2000     91.7%  chr2   +       106131020  106131091      72
YourSeq     48   1440   1517   2000     83.7%  chr18  +        36094076   36094149      74
YourSeq     48   1432   1514   2000     79.3%  chr10  +        97392595   97392680      86
YourSeq     47   1441   1513   2000     82.2%  chr3   -        40507493   40507565      73
YourSeq     47   1441   1514   2000     82.5%  chr2   +        52443075   52443152      78
YourSeq     47   1445   1520   2000     77.1%  chr10  +        76618191   76618264      74
YourSeq     46   1443   1514   2000     86.0%  chr1   -        24287946   24288018      73
YourSeq     46   1471   1572   2000     89.7%  chr8   +        70050803   70050907     105
YourSeq     46   1444   1510   2000     92.8%  chr10  +       121520294  121520365      72
YourSeq     45   1447   1514   2000     90.0%  chr11  +       103680859  103680924      66
YourSeq     45   1447   1514   2000     79.0%  chr11  +        83528007   83528067      61
YourSeq     44   1465   1514   2000     94.0%  chr4   -       150979366  150979415      50
YourSeq     44   1445   1514   2000     84.5%  chr12  +        84033257   84033325      69

Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Hdgf heparin binding growth factor [ Mus musculus (house mouse) ]
Gene ID: 15191, updated on 10-Oct-2019

Gene summary
Official Symbol: Hdgf (provided by MGI)
Official Full Name: heparin binding growth factor (provided by MGI)
Primary source: MGI:MGI:1194494
See related: Ensembl:ENSMUSG00000004897
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI118077; D3Ertd299e
Expression: Ubiquitous expression in liver E14 (RPKM 92.7), CNS E11.5 (RPKM 90.6) and 28 other tissues
Orthologs: human, all

Genomic context
Location: 3 F1; 3 38.78 cM
Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (87906090..87916132)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (87710243..87720054)

[Figure: genomic context, Chromosome 3 - NC_000069.6.]

Transcript information:
This gene has 6 transcripts
Gene: Hdgf ENSMUSG00000004897
Description: heparin binding growth factor [Source:MGI Symbol;Acc:MGI:1194494]
Gene Synonyms: D3Ertd299e
Location: Chromosome 3: 87,906,321-87,916,132 forward strand.
GRCm38:CM000996.2
About this gene: This gene has 6 transcripts (splice variants), 185 orthologues, 4 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Hdgf-201  ENSMUST00000005017.14  2245  237aa       ENSMUSP00000005017.8  Protein coding           CCDS17457  P51859   TSL:1, GENCODE basic, APPRIS P1
Hdgf-202  ENSMUST00000159492.7   933   202aa       ENSMUSP00000124803.1  Protein coding           -          E0CXA0   CDS 3' incomplete, TSL:3
Hdgf-206  ENSMUST00000162631.1   763   84aa        ENSMUSP00000123832.1  Nonsense mediated decay  -          E0CYW7   TSL:3
Hdgf-204  ENSMUST00000160312.1   582   No protein  -                     Retained intron          -          -        TSL:2
Hdgf-203  ENSMUST00000160198.1   377   No protein  -                     lncRNA                   -          -        TSL:2
Hdgf-205  ENSMUST00000161616.7   365   No protein  -                     lncRNA                   -          -        TSL:3

[Figure: Ensembl region view, 29.81 kb, 87.90 Mb to 87.92 Mb. Forward strand: Hdgf transcripts (Hdgf-201 to Hdgf-206) and Mrpl24 transcripts (Mrpl24-201 to Mrpl24-206); contig AC158233.2. Reverse strand: Rrnad1 transcripts (Rrnad1-201 to Rrnad1-212). Regulatory Build track. Regulation legend: CTCF; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000005017
[Figure: protein view of Hdgf-201 (protein coding), 9.81 kb, forward strand; protein ENSMUSP00000005...; scale bar 0 to 237.]
Annotation tracks shown:
MobiDB lite
Low complexity (Seg)
Superfamily: SSF63748
SMART: PWWP domain
Pfam: PWWP domain
PROSITE profiles: PWWP domain
PANTHER: PTHR12550:SF41; PTHR12550
Gene3D: 2.30.30.140
CDD: HDGF-related, PWWP domain
Sequence variants (dbSNP and all other sources). Variant legend: inframe insertion; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.