Mouse Tox4 Conditional Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create a Tox4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Tox4 gene (NCBI Reference Sequence: NM_023434; Ensembl: ENSMUSG00000016831) is located on mouse chromosome 14. Ten exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 10 (Transcript: ENSMUST00000022766).
Exons 8~9 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tox4 gene.
To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-137N7 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 8 starts from about 48.03% of the coding region. The knockout of exons 8~9 will result in a frameshift of the gene.
The size of intron 7 for 5'-loxP site insertion: 633 bp, and the size of intron 9 for 3'-loxP site insertion: 517 bp.
The size of the effective cKO region: ~1709 bp.
The cKO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: schematic, drawn 5' to 3', of the wildtype allele (with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Tox4; homology arm; cKO region; exon of mouse Mettl3; loxP site.]

Overview of the Dot Plot
[Figure: dot plot of the sequence against itself. Window size: 10 bp. Forward and reverse complement matches are shown.]
Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
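The self-alignment described in the dot plot note can be illustrated with a minimal sketch. The report does not state which tool or matching rule produced its dot plot, so exact matching of 10 bp windows is an assumption here, and the function names (`reverse_complement`, `self_dot_plot`, `tandem_offsets`) and the toy sequence are hypothetical. Forward dots close to the main diagonal that share the same offset are the usual signature of a tandem repeat.

```python
from collections import Counter


def reverse_complement(seq: str) -> str:
    """Return the reverse complement; unknown bases map to N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 10):
    """Compare a sequence with itself using exact window matches.

    Returns two sets of (i, j) start positions:
    forward: the window at i equals the window at j (i != j)
    reverse: the reverse complement of the window at i equals the window at j
    """
    seq = seq.upper()
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = set(), set()
    for word, positions in index.items():
        # Forward matches: every pair of distinct positions sharing this word.
        for i in positions:
            for j in positions:
                if i != j:
                    forward.add((i, j))
        # Reverse-complement matches against the same index.
        for j in index.get(reverse_complement(word), []):
            for i in positions:
                reverse.add((i, j))
    return forward, reverse


def tandem_offsets(forward, max_offset: int = 50) -> Counter:
    """Count forward dots by their distance from the main diagonal.

    Many dots at one small offset suggest a tandem repeat with that period
    (or a multiple of it). max_offset is an arbitrary cutoff for this sketch.
    """
    return Counter(j - i for i, j in forward if 0 < j - i <= max_offset)


if __name__ == "__main__":
    # Toy sequence (not from the report): a CAG repeat between unique flanks.
    toy = "GATTACAGGC" + "CAG" * 6 + "TTGACCGTAA"
    fwd, rev = self_dot_plot(toy, window=10)
    print(tandem_offsets(fwd).most_common(3))
```

The word index keeps the comparison close to linear for sequences with few repeats; on a highly repetitive input the pairwise loop grows quadratically in the number of copies, which is acceptable for a sequence of a few kilobases.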
Overview of the GC Content Distribution
[Figure: GC content distribution along the sequence. Window size: 300 bp.]
Summary: Full Length (8209 bp) | A (27.99%, 2298) | C (21.15%, 1736) | T (29.15%, 2393) | G (21.71%, 1782)
Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  +       52288226   52291225   3000
YourSeq  309    1      525   3000   87.8%     chr1   +       84880484   84945487   65004
YourSeq  270    122    989   3000   88.8%     chr5   -       122710598  122935871  225274
YourSeq  262    1      497   3000   88.5%     chr15  -       58993141   58994236   1096
YourSeq  229    1      300   3000   91.1%     chr4   -       138022148  138022447  300
YourSeq  226    1      301   3000   94.9%     chr11  -       88333211   88333731   521
YourSeq  222    1      299   3000   91.7%     chr12  -       76834324   76834630   307
YourSeq  221    1      299   3000   90.6%     chrX   +       134618602  134619076  475
YourSeq  216    1      1055  3000   92.5%     chr3   +       94617272   95103049   485778
YourSeq  215    112    526   3000   87.9%     chr17  -       46489044   46691819   202776
YourSeq  212    1      302   3000   86.6%     chr4   +       99164030   99164321   292
YourSeq  210    1      303   3000   90.5%     chr4   -       116535723  116969807  434085
YourSeq  208    122    498   3000   91.4%     chr17  -       27594015   27731757   137743
YourSeq  207    1      302   3000   89.5%     chr19  +       5021787    5022078    292
YourSeq  203    31     523   3000   89.8%     chr2   +       120666895  120667427  533
YourSeq  200    1      301   3000   89.7%     chr17  +       50539825   50540180   356
YourSeq  195    5      299   3000   89.5%     chr11  -       3178374    3178684    311
YourSeq  190    141    516   3000   89.4%     chr15  +       76665317   76665791   475
YourSeq  168    145    521   3000   80.1%     chr13  +       65849782   65850088   307
YourSeq  163    2      291   3000   87.7%     chr2   +       164379891  164380198  308
Note: The 3000 bp section upstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  +       52292935   52295934   3000
YourSeq  1195   265    1576  3000   96.6%     chr1   -       16767013   16768329   1317
YourSeq  335    385    2523  3000   91.8%     chr10  -       128158489  128380431  221943
YourSeq  288    383    2526  3000   91.0%     chr1   -       130104096  130464735  360640
YourSeq  253    408    2471  3000   92.4%     chr10  +       80598703   80965560   366858
YourSeq  204    386    2344  3000   91.5%     chr1   -       152760384  152911594  151211
YourSeq  171    1075   1559  3000   85.0%     chr15  +       68764862   68765385   524
YourSeq  168    2249   2526  3000   87.4%     chr16  -       30393230   30393504   275
YourSeq  149    384    1126  3000   91.8%     chr8   +       105583094  105921817  338724
YourSeq  148    383    552   3000   94.1%     chr11  -       31080687   31080856   170
YourSeq  147    384    545   3000   94.4%     chr11  -       118360070  118360230  161
YourSeq  146    387    546   3000   96.3%     chr10  -       19885380   19885541   162
YourSeq  146    385    548   3000   93.8%     chr12  +       69612033   69612194   162
YourSeq  144    383    550   3000   94.6%     chr8   -       126983444  126983627  184
YourSeq  144    383    542   3000   95.6%     chr13  -       99404343   99404504   162
YourSeq  144    381    548   3000   94.1%     chr4   +       108194409  108194584  176
YourSeq  144    385    545   3000   95.6%     chr1   +       140242926  140243087  162
YourSeq  142    388    564   3000   91.4%     chr16  +       64892304   64892476   173
YourSeq  141    393    566   3000   92.8%     chr9   -       62781125   62781299   175
YourSeq  140    385    544   3000   94.4%     chr10  -       11111563   11111723   161
Note: The 3000 bp section downstream of Exon 9 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Tox4 TOX high mobility group box family member 4 [Mus musculus (house mouse)]
Gene ID: 268741, updated on 12-Aug-2019

Gene summary
Official Symbol: Tox4 (provided by MGI)
Official Full Name: TOX high mobility group box family member 4 (provided by MGI)
Primary source: MGI:MGI:1915389
See related: Ensembl:ENSMUSG00000016831
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: LCP1; AA410149; A630040M18; 5730589K01Rik
Expression: Ubiquitous expression in thymus adult (RPKM 22.1), ovary adult (RPKM 18.1) and 28 other tissues
Orthologs: human; all

Genomic context
Location: 14; 14 C2
Exon count: 10
Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (52279146..52295509)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (52898821..52915184)
[Figure: genomic context, Chromosome 14 - NC_000080.6.]

Transcript information:
This gene has 5 transcripts
Gene: Tox4 ENSMUSG00000016831
Description: TOX high mobility group box family member 4 [Source:MGI Symbol;Acc:MGI:1915389]
Gene Synonyms: 5730589K01Rik
Location: Chromosome 14: 52,279,146-52,296,401 forward strand. GRCm38:CM001007.2
About this gene: This gene has 5 transcripts (splice variants), 259 orthologues, 33 paralogues, is a member of 1 Ensembl protein family and is associated with 5 phenotypes.
Transcripts
Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Tox4-201  ENSMUST00000022766.7  5350  619aa       ENSMUSP00000022766.6  Protein coding   CCDS36920  Q8BU11   TSL:5, GENCODE basic, APPRIS P1
Tox4-203  ENSMUST00000152493.1  2997  No protein  -                     Retained intron  -          -        TSL:1
Tox4-202  ENSMUST00000137753.1  709   No protein  -                     Retained intron  -          -        TSL:2
Tox4-204  ENSMUST00000172655.1  344   No protein  -                     Retained intron  -          -        TSL:3
Tox4-205  ENSMUST00000173361.1  675   No protein  -                     lncRNA           -          -        TSL:3

[Figure: Ensembl region view, 37.26 kb, 52.27 Mb to 52.30 Mb. Forward strand genes: Gm23758-201 (snoRNA), Tox4-205 (lncRNA), Tox4-201 (protein coding), Tox4-203, Tox4-204 and Tox4-202 (retained intron). Contig: AC126037.4. Reverse strand genes: Rab2b-201 to Rab2b-207 and Mettl3-201 to Mettl3-215 (protein coding, nonsense mediated decay, retained intron and lncRNA transcripts). Regulatory Build track; regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana), non-protein coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000022766
[Figure: transcript view, 17.26 kb, forward strand. Tox4-201, protein coding, ENSMUSP00000022...]
[Figure: protein domain and feature tracks along the 619 aa protein (scale bar 0 to 619):
MobiDB lite
Low complexity (Seg)
Coiled-coils (Ncoils)
Superfamily: High mobility group box domain superfamily
SMART: High mobility group box domain
Prints: PR00886
Pfam: High mobility group box domain
PROSITE profiles: High mobility group box domain
PANTHER: PTHR45781, TOX high mobility group box family member 4
Gene3D: High mobility group box domain superfamily
CDD: cd00084
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: missense variant; synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.