Mouse Slc35c2 Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create a Slc35c2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Slc35c2 gene (NCBI Reference Sequence: NM_144893; Ensembl: ENSMUSG00000017664) is located on mouse chromosome 2.
Ten exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 10 (Transcript: ENSMUST00000109300).
Exons 2~10 will be selected as the target site.
Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production.
The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 2 starts from about 0.09% of the coding region.
Exons 2~10 cover 100.0% of the coding region.
The size of the effective KO region: ~6218 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: wildtype allele, 5' to 3', exons 1 to 10, with two gRNA regions marked. Legends: Exon of mouse Slc35c2; Knockout region.]

Overview of the Dot Plot (up)
[Figure: dot plot. Window size: 15 bp. Legend: Forward; Reverse Complement.]
Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)
[Figure: dot plot. Window size: 15 bp. Legend: Forward; Reverse Complement.]
Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
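The self-alignment behind the dot plots can be illustrated with a short Python sketch. It assumes exact matching of 15 bp windows, which may differ from the scoring used by the tool that produced the plots; the function names `reverse_complement` and `self_dot_plot` are hypothetical and introduced only for illustration.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Find exact window-length matches of a sequence against itself.

    Returns two sorted lists of 0-based (i, j) window start positions:
    forward matches off the main diagonal (i < j), and matches between
    the window at i and the reverse complement of the window at j.
    Exact matching is an assumption of this sketch.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Forward repeats: the same window occurring at two positions.
        for a in positions:
            for b in positions:
                if a < b:
                    forward.append((a, b))
        # Inverted repeats: the reverse complement occurring elsewhere.
        # .get() is used so the defaultdict is not modified while iterating.
        for a in positions:
            for b in index.get(reverse_complement(kmer), ()):
                reverse.append((a, b))
    return sorted(forward), sorted(reverse)


# Toy example: a 15 bp unit repeated after a 4 bp spacer.
unit = "ACGTTGCAAGGCTTA"
forward_hits, reverse_hits = self_dot_plot(unit + "GGGG" + unit, window=15)
```

A flank with no significant tandem repeats would be expected to yield few or no off-diagonal forward hits; what counts as "significant" is not defined in the report and is left to the reader.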
Overview of the GC Content Distribution (up)
[Figure: GC content distribution. Window size: 300 bp.]
Summary: Full Length (2000 bp) | A (23.3%, 466) | C (26.2%, 524) | T (21.65%, 433) | G (28.85%, 577)
Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)
[Figure: GC content distribution. Window size: 300 bp.]
Summary: Full Length (2000 bp) | A (21.1%, 422) | C (28.4%, 568) | T (23.05%, 461) | G (27.45%, 549)
Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr2   -       165283369  165285368  2000
YourSeq  24     1295   1323  2000   84.7%     chr6   +       13062080   13062106   27
YourSeq  22     696    720   2000   96.0%     chr6   -       98563485   98563511   27
YourSeq  22     1054   1075  2000   100.0%    chr19  -       18597160   18597181   22
YourSeq  22     379    400   2000   100.0%    chr18  -       11043575   11043596   22
YourSeq  22     1372   1396  2000   95.9%     chr10  +       53878111   53878137   27
YourSeq  21     444    465   2000   100.0%    chr1   -       31750839   31750861   23
YourSeq  21     1298   1328  2000   83.9%     chrX   +       73829401   73829431   31
YourSeq  20     1333   1352  2000   100.0%    chr10  -       36172756   36172775   20
YourSeq  20     1747   1766  2000   100.0%    chr1   -       74594402   74594421   20
YourSeq  20     524    543   2000   100.0%    chr1   +       32833791   32833810   20
Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. No significant similarity is found.
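The base counts and percentages in the GC content summaries can be reproduced with a few lines of Python. This is a minimal sketch: the helper names `base_composition` and `windowed_gc` are hypothetical, the `step` parameter is an assumption (the report states only the 300 bp window size), and the test sequence below is synthetic, built solely to match the reported upstream counts rather than the real flank sequence.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return {base: (count, percent)} for A, C, T and G."""
    seq = seq.upper()
    counts = Counter(seq)
    total = len(seq)
    return {
        base: (counts.get(base, 0), 100.0 * counts.get(base, 0) / total)
        for base in "ACTG"
    }


def windowed_gc(seq: str, window: int = 300, step: int = 100) -> list:
    """Return (start, GC percent) for each full window along the sequence.

    `start` is 0-based; `step` is an assumed value, not taken from the report.
    """
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100.0 * gc / window))
    return result


# Synthetic sequence with the upstream counts reported in the summary
# (A 466, C 524, T 433, G 577); the base order here is arbitrary.
synthetic = "A" * 466 + "C" * 524 + "T" * 433 + "G" * 577
composition = base_composition(synthetic)
overall_gc = composition["G"][1] + composition["C"][1]  # about 55.05 %
```

With the reported counts, the overall GC fraction of the upstream flank is (524 + 577) / 2000, about 55%; the downstream flank gives (568 + 549) / 2000, about 56%.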
BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr2   -       165275149  165277148  2000
YourSeq  30     1218   1266  2000   96.9%     chr1   +       119667099  119667149  51
YourSeq  22     1006   1027  2000   100.0%    chr13  -       33894390   33894411   22
YourSeq  22     1487   1508  2000   100.0%    chr9   +       65277840   65277861   22
YourSeq  22     567    588   2000   100.0%    chr4   +       62837535   62837556   22
YourSeq  20     1420   1439  2000   100.0%    chr1   -       57330574   57330593   20
YourSeq  20     631    650   2000   100.0%    chr1   -       36740189   36740208   20
YourSeq  20     1030   1049  2000   100.0%    chr1   -       34764422   34764441   20
YourSeq  20     1826   1845  2000   100.0%    chr1   +       42603015   42603034   20
YourSeq  20     728    747   2000   100.0%    chr1   +       17864906   17864925   20
Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. No significant similarity is found.
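The chr2 coordinates of the two full-length BLAT hits can be used to estimate the genomic interval lying between the upstream and downstream flanks. The sketch below assumes that the BLAT coordinates are 1-based and inclusive (consistent with each flank's reported span of 2000) and that each flank directly abuts the start or stop codon; the `Hit` class and `interval_between` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A genomic hit with 1-based, inclusive coordinates (assumed)."""
    chrom: str
    strand: str
    start: int
    end: int

    @property
    def span(self) -> int:
        return self.end - self.start + 1


def interval_between(first: Hit, second: Hit) -> tuple:
    """Return the (start, end) of the gap between two non-overlapping hits."""
    if first.chrom != second.chrom or first.strand != second.strand:
        raise ValueError("hits must lie on the same chromosome and strand")
    low, high = sorted((first, second), key=lambda hit: hit.start)
    if low.end >= high.start:
        raise ValueError("hits overlap or touch; no interval between them")
    return low.end + 1, high.start - 1


# Full-length hits copied from the BLAT tables. The gene is on the reverse
# strand, so the upstream flank has the higher chr2 coordinates.
upstream = Hit("chr2", "-", 165283369, 165285368)
downstream = Hit("chr2", "-", 165275149, 165277148)

gap_start, gap_end = interval_between(upstream, downstream)
gap_length = gap_end - gap_start + 1
print(gap_start, gap_end, gap_length)  # 165277149 165283368 6220
```

Under these assumptions the interval between the flanks is 6220 bp, close to the reported effective KO region of ~6218 bp. The report does not give the exact gRNA cut sites, so the small difference is not resolved here.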
Gene and protein information:
Slc35c2 solute carrier family 35, member C2 [ Mus musculus (house mouse) ]
Gene ID: 228875, updated on 14-Aug-2019

Gene summary
Official Symbol: Slc35c2 (provided by MGI)
Official Full Name: solute carrier family 35, member C2 (provided by MGI)
Primary source: MGI:MGI:2385166
See related: Ensembl:ENSMUSG00000017664
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C85957; CGI-15; Ovcov1; D2Wsu58e
Expression: Ubiquitous expression in duodenum adult (RPKM 63.1), small intestine adult (RPKM 33.3) and 28 other tissues
Orthologs: human, all

Genomic context
Location: 2 H3; 2 85.53 cM
Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (165276522..165287888, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (165102056..165113327, complement)

[Figure: genomic context, Chromosome 2 - NC_000068.7]

Transcript information:
This gene has 16 transcripts

Gene: Slc35c2 ENSMUSG00000017664
Description: solute carrier family 35, member C2 [Source:MGI Symbol;Acc:MGI:2385166]
Gene Synonyms: CGI-15, D2Wsu58e, Ovcov1
Location: Chromosome 2: 165,276,554-165,287,869 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 16 transcripts (splice variants), 168 orthologues, 9 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.
Transcripts
Name         Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt         Flags
Slc35c2-204  ENSMUST00000109300.8   2008  364aa       ENSMUSP00000104923.2  Protein coding           CCDS17074  Q5GMH2, Q8VCX2  TSL:1, GENCODE basic, APPRIS P1
Slc35c2-202  ENSMUST00000109298.7   1991  364aa       ENSMUSP00000104921.1  Protein coding           CCDS17074  Q5GMH2, Q8VCX2  TSL:1, GENCODE basic, APPRIS P1
Slc35c2-203  ENSMUST00000109299.7   1926  364aa       ENSMUSP00000104922.1  Protein coding           CCDS17074  Q5GMH2, Q8VCX2  TSL:1, GENCODE basic, APPRIS P1
Slc35c2-201  ENSMUST00000017808.13  1843  364aa       ENSMUSP00000017808.7  Protein coding           CCDS17074  Q5GMH2, Q8VCX2  TSL:1, GENCODE basic, APPRIS P1
Slc35c2-211  ENSMUST00000133961.7   862   192aa       ENSMUSP00000118227.1  Protein coding           -          Q5GMG8          CDS 3' incomplete, TSL:3
Slc35c2-215  ENSMUST00000155289.7   830   199aa       ENSMUSP00000119071.1  Protein coding           -          Q5GMH1          CDS 3' incomplete, TSL:5
Slc35c2-216  ENSMUST00000156134.7   796   192aa       ENSMUSP00000116288.1  Protein coding           -          Q5GMG8          CDS 3' incomplete, TSL:2
Slc35c2-206  ENSMUST00000129210.7   728   163aa       ENSMUSP00000118605.1  Protein coding           -          Q5GMG7          CDS 3' incomplete, TSL:5
Slc35c2-209  ENSMUST00000131409.1   690   137aa       ENSMUSP00000120036.1  Protein coding           -          A2A5A3          CDS 3' incomplete, TSL:3
Slc35c2-207  ENSMUST00000129336.7   588   196aa       ENSMUSP00000123299.1  Protein coding           -          Q5GMG9          CDS 3' incomplete, TSL:5
Slc35c2-208  ENSMUST00000130393.1   376   58aa        ENSMUSP00000123450.1  Protein coding           -          Q5GMG6          CDS 3' incomplete, TSL:3
Slc35c2-210  ENSMUST00000132270.7   1892  72aa        ENSMUSP00000125708.1  Nonsense mediated decay  -          E0CYZ1          TSL:1
Slc35c2-212  ENSMUST00000145301.7   786   72aa        ENSMUSP00000123757.1  Nonsense mediated decay  -          E0CYZ1          TSL:5
Slc35c2-213  ENSMUST00000147247.1   2042  No protein  -                     Retained intron          -          -               TSL:2
Slc35c2-205  ENSMUST00000125550.1   1110  No protein  -                     Retained intron          -          -               TSL:1
Slc35c2-214  ENSMUST00000154608.7   775   No protein  -                     Retained intron          -          -               TSL:5

[Figure: Ensembl region view, 31.32 kb, chromosome 2 from 165.27 Mb to 165.29 Mb, contig AL591430.8. Reverse-strand gene tracks: Gm25569-201 (snoRNA); Elmo2-201, -202, -203, -204, -205 and -211 (protein coding); Slc35c2-201 to -204, -206 to -209, -211, -215 and -216 (protein coding); Slc35c2-210 and -212 (nonsense mediated decay); Slc35c2-205, -213 and -214 (retained intron). Regulatory Build track. Regulation Legend: CTCF; Enhancer; Promoter; Promoter Flank. Gene Legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000109300
[Figure: Slc35c2-204 (protein coding), reverse strand, 11.32 kb. Protein ENSMUSP00000104... with tracks: Transmembrane heli...; Low complexity (Seg); Pfam: Sugar phosphate transporter domain; PANTHER: PTHR11132:SF238, PTHR11132; All sequence SNPs/i...: Sequence variants (dbSNP and all other sources). Variant Legend: missense variant; splice region variant; synonymous variant. Scale bar: 0 to 364.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.