Mouse Smarca1 Conditional Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create a Smarca1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Smarca1 gene (NCBI Reference Sequence: NM_053123; Ensembl: ENSMUSG00000031099) is located on mouse chromosome X. 24 exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 23 (Transcript: ENSMUST00000077569). Exon 10 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Smarca1 gene.

To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP24-341O9 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Female mice homozygous and male mice hemizygous for a targeted disruption of this gene exhibit abnormalities in neuron differentiation and neuronal precursor proliferation, increased brain and heart weight, and forebrain hypercellularity.
Exon 10 starts at about 37.6% of the coding region.
The knockout of exon 10 will result in a frameshift of the gene.
The size of intron 9 for 5'-loxP site insertion: 12766 bp, and the size of intron 10 for 3'-loxP site insertion: 631 bp.
The size of the effective cKO region: ~610 bp.
The cKO region does not contain any other known gene.
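The positional and frameshift statements in the note above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the 1046 aa protein length listed for ENSMUST00000077569 (Smarca1-201) in the transcript table of this report, and the exon lengths passed to `causes_frameshift` are hypothetical values, since the report does not state the length of exon 10. The function names are made up for this example.

```python
# Rough arithmetic behind the exon 10 notes. Function names are illustrative.

PROTEIN_LENGTH_AA = 1046        # Smarca1-201 (ENSMUST00000077569), from the transcript table
EXON10_START_FRACTION = 0.376   # "about 37.6% of the coding region"


def approx_start_codon(fraction: float, n_codons: int) -> int:
    """Approximate 1-based codon index at which a region begins."""
    return int(fraction * n_codons) + 1


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A coding deletion shifts the reading frame unless its length is a multiple of 3."""
    return deleted_coding_bp % 3 != 0


start_codon = approx_start_codon(EXON10_START_FRACTION, PROTEIN_LENGTH_AA)
print(f"Exon 10 begins near codon {start_codon} of {PROTEIN_LENGTH_AA}")

# Hypothetical exon lengths, for illustration only (not taken from this report).
for length in (150, 151):
    print(length, "bp deleted -> frameshift:", causes_frameshift(length))
```

Because the 37.6% figure is itself approximate, the codon index is only an estimate of where the reading frame would be disrupted.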
Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1, 10, 11 and 24 labelled, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legends: exon of mouse Smarca1; homology arm; cKO region; loxP site.]

Overview of the Dot Plot

[Figure: dot plot of Sequence 12 against itself, forward and reverse complement. Window size: 10 bp.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12. Window size: 300 bp.]

Summary: Full Length (7110 bp) | A (27.68%, 1968) | C (20.04%, 1425) | T (33.99%, 2417) | G (18.28%, 1300)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
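The base composition in the GC content summary can be reproduced from the reported counts, and a windowed GC profile can be computed in the same spirit as the 300 bp plot. This is a minimal sketch, not the vendor's tool: `gc_windows` is a hypothetical helper, the non-overlapping step is an assumption (the report gives only the window size), and the toy sequence stands in for the real 7110 bp sequence, which is not included in the report.

```python
from typing import List

# Base counts taken from the Summary line of the GC content section.
counts = {"A": 1968, "C": 1425, "T": 2417, "G": 1300}
total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

print(total)        # full length in bp
print(percent)      # per-base percentages
print(gc_percent)   # overall GC content in percent


def gc_windows(seq: str, window: int = 300, step: int = 300) -> List[float]:
    """GC fraction of each full window along seq (step size is an assumption)."""
    seq = seq.upper()
    fractions = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


# Toy sequence for illustration only.
toy_profile = gc_windows("GGCCAATT", window=4, step=4)
print(toy_profile)
```

The counts sum to the stated full length of 7110 bp, and the overall GC content works out to roughly 38%, consistent with the note that no high GC-content region is present.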
BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chrX   -        47861639   47864638    3000
YourSeq    824   1682  2690   3000     97.1%  chr14  -       123130767  123131747     981
YourSeq    811   1621  2516   3000     97.8%  chrX   +       159758813  159759790     978
YourSeq    808   1505  2516   3000     95.9%  chr6   +       139221455  139222308     854
YourSeq    805   1516  2516   3000     97.8%  chr9   -       109398088  109399211    1124
YourSeq    805   1680  2516   3000     97.9%  chr8   +        71871764   71872595     832
YourSeq    802   1683  2516   3000     98.4%  chrX   +       118536421  118537255     835
YourSeq    801   1670  2516   3000     97.9%  chr4   +        76110793   76111645     853
YourSeq    800   1681  2516   3000     98.1%  chr7   +        41033927   41034763     837
YourSeq    800   1682  2516   3000     98.4%  chr3   +       135156715  135157550     836
YourSeq    800   1680  2516   3000     97.5%  chr17  +        37260035   37260865     831
YourSeq    800   1680  2516   3000     97.5%  chr1   +       139369402  139370228     827
YourSeq    799   1680  2516   3000     98.1%  chrX   -        67569772   67570623     852
YourSeq    799   1668  2516   3000     97.7%  chr4   -       121742258  121743109     852
YourSeq    799   1693  2516   3000     98.2%  chr2   +        26710991   26711810     820
YourSeq    799   1682  2516   3000     98.0%  chr17  +        17680670   17681503     834
YourSeq    798   1680  2516   3000     98.1%  chr9   -        31740298   31741134     837
YourSeq    798   1682  2516   3000     97.5%  chrX   +        79610300   79611128     829
YourSeq    798   1683  2516   3000     97.6%  chr2   +       162160503  162161329     827
YourSeq    797   1677  2516   3000     97.8%  chr12  -       119670413  119671251     839

Note: The 3000 bp section upstream of Exon 10 is BLAT searched against the genome. No significant similarity is found.
BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chrX   -        47858029   47861028    3000
YourSeq    371   1397  2041   3000     84.0%  chr7   -       123305137  123305898     762
YourSeq    334   1363  2170   3000     84.3%  chr4   -        40532372   40533259     888
YourSeq    326   1361  2095   3000     84.0%  chr9   +        36686303   36687171     869
YourSeq    320   1476  2252   3000     82.6%  chr10  +        90451124   90451960     837
YourSeq    305   1367  2253   3000     82.2%  chr15  +        42864924   42865929    1006
YourSeq    294   1359  2261   3000     87.4%  chr7   +       116744422  116745371     950
YourSeq    291   1500  2022   3000     86.8%  chrX   +        94622174   95038392  416219
YourSeq    279   1475  2077   3000     82.2%  chr9   -       111101615  111102353     739
YourSeq    266   1475  1900   3000     86.0%  chr5   -        89991481   89991925     445
YourSeq    260   1491  2197   3000     86.4%  chr5   -        38727956   38728788     833
YourSeq    247   1367  1898   3000     83.1%  chr14  -        50986849   50987375     527
YourSeq    241   1478  1867   3000     81.4%  chrX   -        60086388   60086774     387
YourSeq    239   1330  1845   3000     86.0%  chr18  +        39977658   39978195     538
YourSeq    238   1370  1897   3000     87.6%  chr7   -       140731656  140732230     575
YourSeq    236   1367  1872   3000     82.7%  chr12  -        14787456   14787982     527
YourSeq    234   1362  1900   3000     84.6%  chrX   -       161405561  161406106     546
YourSeq    232   1530  1891   3000     84.5%  chr13  -        73526371   73526731     361
YourSeq    227   1571  2234   3000     87.3%  chr6   +        43195383   43196171     789
YourSeq    225   1492  1900   3000     86.7%  chr13  -        98143748   98144160     413

Note: The 3000 bp section downstream of Exon 10 is BLAT searched against the genome. No significant similarity is found.
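The genomic coordinates of the two 100% identity hits (the first row of each BLAT table) can be used as a quick consistency check on the design. The sketch below is a reading of the reported numbers, not part of the vendor's pipeline: it assumes the coordinates are 1-based and inclusive (which the reported SPAN of 3000 supports), and it treats the stretch between the two queried sections as the cKO region, which is an interpretation rather than something the report states. `span` and `gap_between` are hypothetical helper names.

```python
# chrX coordinates (minus strand) of the 100% identity hits in the two BLAT tables.
upstream = (47861639, 47864638)     # 3000 bp section upstream of exon 10
downstream = (47858029, 47861028)   # 3000 bp section downstream of exon 10


def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(lower: tuple, upper: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


up_span = span(*upstream)
down_span = span(*downstream)
gap = gap_between(downstream, upstream)

print(up_span, down_span)   # lengths of the two queried sections
print(gap)                  # bases lying between them
```

Under these assumptions the gap comes to 610 bp, in line with the effective cKO region size of about 610 bp given in the strategy notes. Because Smarca1 lies on the reverse strand, the upstream section has the higher chromosome coordinates.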
Gene and protein information:

Smarca1 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 1 [ Mus musculus (house mouse) ]
Gene ID: 93761, updated on 24-Oct-2019

Gene summary
Official Symbol: Smarca1 (provided by MGI)
Official Full Name: SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 1 (provided by MGI)
Primary source: MGI:MGI:1935127
See related: Ensembl:ENSMUSG00000031099
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Snf2l; 5730494M04Rik
Expression: Broad expression in placenta adult (RPKM 7.5), CNS E18 (RPKM 6.6) and 15 other tissues
Orthologs: human; all

Genomic context
Location: X; X A4-A5
Exon count: 24

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  X    NC_000086.7 (47809370..47892613, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    X    NC_000086.6 (45162547..45245729, complement)

[Figure: genomic context, Chromosome X - NC_000086.7]

Transcript information: This gene has 7 transcripts

Gene: Smarca1 ENSMUSG00000031099
Description: SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 1 [Source:MGI Symbol;Acc:MGI:1935127]
Gene Synonyms: 5730494M04Rik, Snf2l
Location: Chromosome X: 47,809,368-47,892,974 reverse strand. GRCm38:CM001013.2
About this gene: This gene has 7 transcripts (splice variants), 202 orthologues, 32 paralogues, is a member of 1 Ensembl protein family and is associated with 10 phenotypes.
Transcripts

Name         Transcript ID            bp  Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Smarca1-202  ENSMUST00000088973.10  4798  1046aa      ENSMUSP00000086366.4  Protein coding  CCDS30104  Q6PGB8   TSL:1 GENCODE basic APPRIS P3
Smarca1-201  ENSMUST00000077569.10  4024  1046aa      ENSMUSP00000076769.4  Protein coding  CCDS30104  Q6PGB8   TSL:1 GENCODE basic APPRIS P3
Smarca1-203  ENSMUST00000101616.8   3991  1062aa      ENSMUSP00000099138.2  Protein coding  CCDS72374  Q6PGB8   TSL:1 GENCODE basic APPRIS ALT2
Smarca1-206  ENSMUST00000153548.8   2304  768aa       ENSMUSP00000114296.2  Protein coding  -          F6Z6F4   CDS 3' incomplete TSL:5
Smarca1-205  ENSMUST00000141084.2   1965  381aa       ENSMUSP00000135570.1  Protein coding  -          Q8BS67   TSL:2 GENCODE basic
Smarca1-207  ENSMUST00000153587.1    360  117aa       ENSMUSP00000116209.1  Protein coding  -          F6QS43   CDS 3' incomplete TSL:3
Smarca1-204  ENSMUST00000140756.1    409  No protein  -                     lncRNA          -          -        TSL:3

[Figure: Ensembl region view, 103.61 kb, chromosome X 47.80 Mb to 47.90 Mb. Forward strand: Gm14648-201 (processed pseudogene); contig AL671903.19. Reverse strand: Smarca1-201, -203, -202, -206, -205 and -207 (protein coding) and Smarca1-204 (lncRNA). Regulatory Build track. Regulation legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding.]