Mouse Smoc2 Conditional Knockout Project (CRISPR/Cas9)
Total Page:16
File Type:pdf, Size:1020Kb
https://www.alphaknockout.com Mouse Smoc2 Conditional Knockout Project (CRISPR/Cas9) Objective: To create a Smoc2 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering. Strategy summary: The Smoc2 gene (NCBI Reference Sequence: NM_022315 ; Ensembl: ENSMUSG00000023886 ) is located on Mouse chromosome 17. 13 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000024660). Exon 3 will be selected as conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Smoc2 gene. To engineer the targeting vector, homologous arms and cKO region will be generated by PCR using BAC clone RP23-94H11 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for one KO allele exhibit protection from induced kidney fibrosis and reduced interstitial myofibroblast accumulation. Another KO allele leads to shortening and widening of the skull. Exon 3 starts from about 19.16% of the coding region. The knockout of Exon 3 will result in frameshift of the gene. The size of intron 2 for 5'-loxP site insertion: 10840 bp, and the size of intron 3 for 3'-loxP site insertion: 1008 bp. The size of effective cKO region: ~607 bp. The cKO region does not have any other known gene. Page 1 of 8 https://www.alphaknockout.com Overview of the Targeting Strategy Wildtype allele gRNA region 5' gRNA region 3' 1 3 4 13 Targeting vector Targeted allele Constitutive KO allele (After Cre recombination) Legends Exon of mouse Smoc2 Homology arm cKO region loxP site Page 2 of 8 https://www.alphaknockout.com Overview of the Dot Plot Window size: 10 bp Forward Reverse Complement Sequence 12 Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector. Overview of the GC Content Distribution Window size: 300 bp Sequence 12 Summary: Full Length(7107bp) | A(26.21% 1863) | C(20.88% 1484) | T(28.1% 1997) | G(24.81% 1763) Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis. Page 3 of 8 https://www.alphaknockout.com BLAT Search Results (up) QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN ----------------------------------------------------------------------------------------------- browser details YourSeq 3000 1 3000 3000 100.0% chr17 + 14333298 14336297 3000 browser details YourSeq 144 2177 2384 3000 94.0% chr3 + 149745486 149745985 500 browser details YourSeq 132 2178 2382 3000 85.1% chrX + 152390492 152390682 191 browser details YourSeq 127 2178 2384 3000 81.5% chr5 - 59803566 59803742 177 browser details YourSeq 120 2190 2500 3000 85.7% chr17 + 7574138 7574432 295 browser details YourSeq 120 2178 2384 3000 81.1% chr13 + 115247573 115247737 165 browser details YourSeq 119 2167 2388 3000 89.8% chr14 - 35902263 35902645 383 browser details YourSeq 119 2198 2384 3000 82.3% chr13 - 82858849 82859014 166 browser details YourSeq 119 2177 2384 3000 89.4% chr19 + 34498078 34498286 209 browser details YourSeq 113 2177 2376 3000 81.9% chr9 + 31594499 31594660 162 browser details YourSeq 110 2189 2391 3000 91.0% chr14 + 71583722 71583946 225 browser details YourSeq 110 2184 2384 3000 85.2% chr11 + 91201266 91201424 159 browser details YourSeq 109 2214 2384 3000 90.3% chr19 + 46129035 46129305 271 browser details YourSeq 107 2178 2334 3000 85.3% chr10 - 45007530 45007676 147 browser details YourSeq 105 2179 2383 3000 89.2% chr11 + 32779604 32779804 201 browser details YourSeq 102 2167 2334 3000 83.5% chr1 - 162384446 162384600 155 browser details YourSeq 102 2178 2384 3000 78.6% chr13 + 115247597 115247725 129 browser details YourSeq 101 2178 2334 3000 82.7% chr18 - 29947440 29947570 131 browser details YourSeq 98 2196 2376 3000 80.5% chr16 - 78074773 78074923 151 browser details YourSeq 97 2200 2382 3000 81.9% chr6 - 97711215 97711359 145 Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found. BLAT Search Results (down) QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN ----------------------------------------------------------------------------------------------- browser details YourSeq 3000 1 3000 3000 100.0% chr17 + 14336905 14339904 3000 browser details YourSeq 112 2360 2672 3000 91.8% chr9 + 113482567 113483303 737 browser details YourSeq 111 2370 2669 3000 83.4% chr1 - 34672870 34673163 294 browser details YourSeq 108 2359 2681 3000 88.3% chr10 + 67387121 67387633 513 browser details YourSeq 105 2357 2672 3000 79.2% chr11 - 8955268 8955554 287 browser details YourSeq 97 2362 2649 3000 81.5% chr10 + 6981484 6981755 272 browser details YourSeq 94 2435 2673 3000 80.0% chr17 + 69258310 69258489 180 browser details YourSeq 90 2400 2668 3000 80.0% chr15 - 101982413 101982632 220 browser details YourSeq 90 2438 2685 3000 91.6% chr10 - 59257132 59257389 258 browser details YourSeq 90 2400 2681 3000 79.9% chr8 + 121238655 121238879 225 browser details YourSeq 88 2203 2656 3000 75.8% chr12 - 19203156 19203387 232 browser details YourSeq 85 2450 2681 3000 94.0% chr1 - 130563666 130564091 426 browser details YourSeq 85 2360 2656 3000 89.0% chr11 + 98655715 98656014 300 browser details YourSeq 80 2406 2690 3000 79.4% chr11 - 111531567 111531791 225 browser details YourSeq 80 2466 2690 3000 88.4% chr4 + 153190046 153190301 256 browser details YourSeq 79 2436 2690 3000 91.7% chr8 - 120032191 120032562 372 browser details YourSeq 78 2438 2669 3000 90.6% chr8 + 121238657 121238891 235 browser details YourSeq 77 2394 2690 3000 76.4% chr11 - 38909971 38910155 185 browser details YourSeq 77 2440 2672 3000 80.0% chr1 - 192514567 192514741 175 browser details YourSeq 77 2374 2670 3000 89.5% chr11 + 6783461 6783806 346 Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found. Page 4 of 8 https://www.alphaknockout.com Gene and protein information: Smoc2 SPARC related modular calcium binding 2 [ Mus musculus (house mouse) ] Gene ID: 64074, updated on 14-Oct-2019 Gene summary Official Symbol Smoc2 provided by MGI Official Full Name SPARC related modular calcium binding 2 provided by MGI Primary source MGI:MGI:1929881 See related Ensembl:ENSMUSG00000023886 Gene type protein coding RefSeq status VALIDATED Organism Mus musculus Lineage Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus Also known as Smoc2l; 1700056C05Rik; 5430426J21Rik Expression Broad expression in ovary adult (RPKM 125.2), subcutaneous fat pad adult (RPKM 55.4) and 16 other tissues See more Orthologs human all Genomic context Location: 17 A2; 17 8.95 cM See Smoc2 in Genome Data Viewer Exon count: 13 Annotation release Status Assembly Chr Location 108 current GRCm38.p6 (GCF_000001635.26) 17 NC_000083.6 (14279490..14404790) Build 37.2 previous assembly MGSCv37 (GCF_000001635.18) 17 NC_000083.5 (14416513..14541797) Chromosome 17 - NC_000083.6 Page 5 of 8 https://www.alphaknockout.com Transcript information: This gene has 7 transcripts Gene: Smoc2 ENSMUSG00000023886 Description SPARC related modular calcium binding 2 [Source:MGI Symbol;Acc:MGI:1929881] Gene Synonyms 1700056C05Rik, 5430426J21Rik, Smoc2l Location Chromosome 17: 14,279,506-14,404,790 forward strand. GRCm38:CM001010.2 About this gene This gene has 7 transcripts (splice variants), 200 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 6 phenotypes. Transcripts Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags Smoc2-201 ENSMUST00000024660.8 2832 447aa ENSMUSP00000024660.8 Protein coding CCDS37442 Q8CD91 TSL:1 GENCODE basic APPRIS P1 Smoc2-204 ENSMUST00000233344.1 2817 458aa ENSMUSP00000156572.1 Protein coding - A0A3B2WCK7 GENCODE basic Smoc2-202 ENSMUST00000232923.1 588 89aa ENSMUSP00000156632.1 Protein coding - A0A3B2WCQ8 GENCODE basic Smoc2-206 ENSMUST00000233540.1 3972 No protein - Retained intron - - - Smoc2-205 ENSMUST00000233526.1 572 No protein - lncRNA - - - Smoc2-203 ENSMUST00000232936.1 521 No protein - lncRNA - - - Smoc2-207 ENSMUST00000233686.1 337 No protein - lncRNA - - - Page 6 of 8 https://www.alphaknockout.com 145.28 kb Forward strand 14.30Mb 14.35Mb 14.40Mb Genes Gm34567-201 >lncRNA Gm49720-201 >processed pseudogene (Comprehensive set... Smoc2-201 >protein coding Smoc2-204 >protein coding Smoc2-206 >retained intron Smoc2-205 >lncRNA Smoc2-202 >protein coding Smoc2-203 >lncRNA Smoc2-207 >lncRNA Contigs < AC171111.2 < AC169675.2 < CT010437.15 Genes < 4930474M22Rik-201lncRNA (Comprehensive set... Regulatory Build 14.30Mb 14.35Mb 14.40Mb Reverse strand 145.28 kb Regulation Legend CTCF Enhancer Open Chromatin Promoter Promoter Flank Transcription Factor Binding Site Gene Legend Protein Coding Ensembl protein coding merged Ensembl/Havana Non-Protein Coding processed transcript RNA gene pseudogene Page 7 of 8 https://www.alphaknockout.com Transcript: ENSMUST00000024660 125.28 kb Forward strand Smoc2-201 >protein coding ENSMUSP00000024... MobiDB lite Low complexity (Seg) Cleavage site (Sign... Superfamily Thyroglobulin type-1 superfamily EF-hand