Mouse Srsf7 Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create a Srsf7 knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Srsf7 gene (NCBI Reference Sequence: NM_146083.2; Ensembl: ENSMUSG00000024097) is located on Mouse chromosome 17. 8 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000063417).
Exon 3~8 will be selected as target site.
Cas9 and gRNA will be co-injected into fertilized eggs for KO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 3 starts from about 29.41% of the coding region.
Exon 3~8 covers 70.73% of the coding region.
The size of effective KO region: ~3889 bp.
The KO region does not have any other known gene.

Overview of the Targeting Strategy
[Figure: wildtype allele (5' to 3') showing exons 1 to 8, two gRNA regions, and the knockout region. Legends: Exon of mouse Srsf7; Knockout region.]

Overview of the Dot Plot (up)
Window size: 15 bp
[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]
Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)
Window size: 15 bp
[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]
Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
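The self-alignment described in the dot plot notes above can be sketched in Python. This is an illustrative sketch only, not the tool used to produce the report: the function names (reverse_complement, self_dot_plot) are hypothetical, and exact word matching with a 15 bp window is an assumption about how the dots are computed. Forward matches away from the main diagonal indicate direct (possibly tandem) repeats; reverse complement matches indicate inverted repeats.

```python
from collections import defaultdict

# Translation table for complementing A/C/G/T (assumes an unambiguous DNA alphabet).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact words of length `window`.

    Returns two lists of (i, j) word start positions:
      forward: identical words with j > i (off the main diagonal)
      reverse: word at i equals the reverse complement of the word at j
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1

    # Index every word by its start positions.
    positions = defaultdict(list)
    for i in range(n_words):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_words):
        word = seq[i:i + window]
        forward.extend((i, j) for j in positions[word] if j > i)
        rc = reverse_complement(word)
        reverse.extend((i, j) for j in positions.get(rc, []))
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical 20 bp unit repeated twice, to show what a tandem repeat looks like.
    unit = "ACGTTGCAAGCTTAGGCTAC"
    fwd, rev = self_dot_plot(unit * 2, window=15)
    print(len(fwd), "forward off-diagonal matches,", len(rev), "reverse complement matches")
```

For a 2000 bp flanking region without tandem repeats, the forward list would be expected to be empty or nearly so; what counts as "significant" is not defined in the report, so no threshold is applied here.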
Overview of the GC Content Distribution (up)
Window size: 300 bp
[Figure: GC content distribution of Sequence 12.]
Summary: Full Length(2000bp) | A(19.8% 396) | C(25.5% 510) | T(24.45% 489) | G(30.25% 605)
Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.

Overview of the GC Content Distribution (down)
Window size: 300 bp
[Figure: GC content distribution of Sequence 12.]
Summary: Full Length(2000bp) | A(29.15% 583) | C(15.35% 307) | T(35.15% 703) | G(20.35% 407)
Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr17  -        80205439   80207438  2000
YourSeq    227    135  1591   2000     84.2%  chr9   +        71908741   71909051   311
YourSeq     45   1882  1930   2000     96.0%  chr2   -         7451454    7451502    49
YourSeq     44   1884  1930   2000     97.9%  chr13  +        32622570   32622618    49
YourSeq     43   1885  1950   2000     74.5%  chr5   -        79686950   79686996    47
YourSeq     42   1885  1930   2000     95.7%  chrX   +        80588843   80588888    46
YourSeq     42   1885  1930   2000     88.4%  chr3   +       141143321  141143363    43
YourSeq     42   1884  1930   2000     86.1%  chr10  +        52487025   52487067    43
YourSeq     41   1886  1930   2000     95.6%  chr6   -       125808362  125808406    45
YourSeq     41   1885  1930   2000     95.6%  chr2   -        20890646   20890691    46
YourSeq     41   1883  1930   2000     93.8%  chr2   -        19835827   19835875    49
YourSeq     41   1885  1930   2000     95.7%  chr18  -        82298462   82298509    48
YourSeq     41   1885  1930   2000     88.7%  chr8   +        50943137   50943180    44
YourSeq     41   1885  1930   2000     93.1%  chr4   +        91465488   91465532    45
YourSeq     41   1888  1930   2000     97.7%  chr1   +        44236084   44236126    43
YourSeq     40   1884  1930   2000     92.9%  chr6   -         4294351    4294396    46
YourSeq     40   1887  1930   2000     97.7%  chr13  -        61549422   61549470    49
YourSeq     40   1885  1930   2000     97.7%  chr1   -         9515995    9516047    53
YourSeq     40   1884  1930   2000     97.7%  chr7   +       138736103  138736161    59
YourSeq     39   1887  1930   2000     88.1%  chr6   -       102628932  102628973    42
Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)
QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr17  -        80199551   80201550  2000
YourSeq    206      1   241   2000     92.7%  chr9   +        71909551   71909787   237
YourSeq    114     90   239   2000     89.2%  chr4   +        55163698   55163850   153
YourSeq     51    868   918   2000    100.0%  chr17  -        80200141   80200191    51
YourSeq     24   1881  1904   2000    100.0%  chr5   -       116256414  116256437    24
YourSeq     24   1881  1904   2000    100.0%  chr6   +        56919595   56919618    24
YourSeq     23   1879  1903   2000     96.0%  chr1   -        20852525   20852549    25
YourSeq     22    768   789   2000    100.0%  chr3   +       134457055  134457076    22
YourSeq     22   1883  1904   2000    100.0%  chr16  +        92150765   92150786    22
YourSeq     22   1883  1904   2000    100.0%  chr11  +        78769805   78769826    22
YourSeq     21    796   816   2000    100.0%  chr16  -         9986121    9986141    21
Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
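The GC content summaries reported earlier in this document can be cross-checked with a short Python sketch. The function names below are hypothetical and the code is not the vendor's analysis tool. Only the base counts and the 300 bp window size come from the report; a step of 1 bp between windows is an assumption, and the report does not state what threshold defines a "high GC-content region", so none is applied. From the reported counts, the upstream section works out to (510 + 605) / 2000 = 55.75% GC and the downstream section to (307 + 407) / 2000 = 35.7% GC.

```python
def gc_percent_from_counts(c_count, g_count, total):
    """GC percentage computed from base counts, as listed in the Summary lines."""
    return 100 * (c_count + g_count) / total


def gc_percent(seq):
    """GC percentage of a DNA string; returns 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each full window of length `window`, moved by `step`.

    The 300 bp default mirrors the window size stated in the report;
    the step size is an assumption made for this sketch.
    """
    seq = seq.upper()
    return [
        gc_percent(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Base counts taken from the two Summary lines of the report.
    print("upstream GC %:", gc_percent_from_counts(510, 605, 2000))
    print("downstream GC %:", gc_percent_from_counts(307, 407, 2000))
```

With the actual 2000 bp sequences (not included in this report), sliding_gc would give one value per window position, which is the kind of profile the GC content distribution figures plot.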
Gene and protein information:
Srsf7 serine and arginine-rich splicing factor 7 [ Mus musculus (house mouse) ]
Gene ID: 225027, updated on 26-Jun-2020

Gene summary
Official Symbol: Srsf7 provided by MGI
Official Full Name: serine and arginine-rich splicing factor 7 provided by MGI
Primary source: MGI:MGI:1926232
See related: Ensembl:ENSMUSG00000024097
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 9G8; 35kDa; NX-96; Sfrs7; 9430065L19Rik
Summary: The protein encoded by this gene is a member of the serine/arginine (SR)-rich family of pre-mRNA splicing factors, which constitute part of the spliceosome. Each of these factors contains an RNA recognition motif (RRM) for binding RNA and an RS domain for binding other proteins. The RS domain is rich in serine and arginine residues and facilitates interaction between different SR splicing factors. In addition to being critical for mRNA splicing, the SR proteins have also been shown to be involved in mRNA export from the nucleus and in translation. Five transcript variants, four of them protein-coding and the other not protein-coding, have been found for this gene. [provided by RefSeq, Sep 2010]
Expression: Broad expression in CNS E11.5 (RPKM 98.8), CNS E14 (RPKM 77.4) and 22 other tissues
Orthologs: human, all

Genomic context
Location: 17; 17 E3
Exon count: 9

Annotation release  Status             Assembly                      Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (80200080..80207369, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (80599420..80606625, complement)

[Figure: genomic context, Chromosome 17 - NC_000083.6]

Transcript information:
This gene has 8 transcripts

Gene: Srsf7 ENSMUSG00000024097
Description: serine and arginine-rich splicing factor 7 [Source:MGI Symbol;Acc:MGI:1926232]
Gene Synonyms: 9430065L19Rik, 9G8, NX-96, Sfrs7
Location: Chromosome 17: 80,200,080-80,207,307 reverse strand. GRCm38:CM001010.2
About this gene: This gene has 8 transcripts (splice variants), 305 orthologues, 8 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.
Transcripts
Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt         Flags
Srsf7-201  ENSMUST00000063417.10  2288  238aa       ENSMUSP00000070983.9  Protein coding           CCDS28988  Q3THA6 Q8BL97   TSL:1 GENCODE basic APPRIS P2
Srsf7-202  ENSMUST00000234095.1   2403  154aa       ENSMUSP00000157284.1  Protein coding           -          A0A3Q4EGP0      GENCODE basic
Srsf7-208  ENSMUST00000235069.1   1040  235aa       ENSMUSP00000157383.1  Protein coding           -          A0A3Q4L393      GENCODE basic APPRIS ALT1
Srsf7-205  ENSMUST00000234696.1   987   215aa       ENSMUSP00000157263.1  Protein coding           -          A0A3Q4L335      GENCODE basic
Srsf7-203  ENSMUST00000234503.1   1991  137aa       ENSMUSP00000157265.1  Nonsense mediated decay  -          A0A3Q4EH04      -
Srsf7-207  ENSMUST00000235036.1   1921  No protein  -                     Retained intron          -          -               -
Srsf7-206  ENSMUST00000234889.1   1898  No protein  -                     Retained intron          -          -               -
Srsf7-204  ENSMUST00000234577.1   839   No protein  -                     Retained intron          -          -               -

[Figure: Ensembl region view, 27.23 kb, 80.195Mb to 80.215Mb. Forward strand genes (Comprehensive set): Ttc39d-202 (protein coding), Ttc39d-201 (protein coding). Contigs: AC140361.2, AC132910.8. Reverse strand genes (Comprehensive set): Gm25706-201 (misc RNA), Srsf7-201 (protein coding), Srsf7-202 (protein coding), Srsf7-203 (nonsense mediated decay), Srsf7-206 (retained intron), Srsf7-205 (protein coding), Srsf7-207 (retained intron), Srsf7-208 (protein coding), Srsf7-204 (retained intron). Track: Regulatory Build. Gene Legend: Protein Coding (merged Ensembl/Havana; Ensembl protein coding); Non-Protein Coding (processed transcript; RNA gene). Regulation Legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank.]

Transcript: ENSMUST00000063417
[Figure: Srsf7-201 (protein coding), reverse strand, 7.23 kb; protein ENSMUSP00000070... with the following annotation tracks.]
MobiDB lite
Low complexity (Seg)
Superfamily: Zinc finger, CCHC-type superfamily; RNA-binding domain superfamily
SMART: RNA recognition motif domain
Pfam: RNA recognition motif domain
PROSITE profiles: RNA recognition motif domain; Zinc finger, CCHC-type
PANTHER: PTHR23147; PTHR23147:SF81
Gene3D: Nucleotide-binding alpha-beta plait domain superfamily; 4.10.60.10
CDD: SRSF7, RNA recognition motif
All sequence SNPs/i..