Mouse Sema4b Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective:
To create a Sema4b knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Sema4b gene (NCBI Reference Sequence: NM_013659; Ensembl: ENSMUSG00000030539) is located on mouse chromosome 7. 15 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 15 (Transcript: ENSMUST00000032754).
Exons 3~13 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Mice homozygous for a gene trap allele exhibit normal cerebellar morphology. Mice homozygous for a knock-out allele exhibit an enhanced memory response, with increased IgE and IgG1 serum levels.
Exon 3 starts at about 4.82% of the coding region.
Exons 3~13 cover 62.01% of the coding region.
The size of the effective KO region: ~8159 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: wildtype allele, 5' to 3', showing the exons of mouse Sema4b, the two gRNA regions, and the knockout region.]

Overview of the Dot Plot (up)
Window size: 15 bp
[Figure: dot plot of the sequence against itself, forward and reverse complement.]
Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)
Window size: 15 bp
[Figure: dot plot of the sequence against itself, forward and reverse complement.]
Note: The 2000 bp section downstream of Exon 13 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
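The self-alignment behind the dot plots can be illustrated with a short sketch. It is not the tool used to produce the report's figures, which is not named in this document; it only shows the general idea of comparing every 15 bp window of a sequence with every other window, in forward and reverse-complement orientation. The function names `reverse_complement` and `self_dot_plot` are hypothetical, and exact window matching is an assumption (a real dot-plot tool may tolerate mismatches).

```python
from collections import defaultdict

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two sorted lists of (i, j) window start positions:
      forward: seq[i:i+window] == seq[j:j+window], with i < j
               (the trivial main diagonal i == j is left out)
      reverse: seq[i:i+window] equals the reverse complement
               of seq[j:j+window]
    Runs of forward dots parallel to the main diagonal suggest
    tandem or direct repeats.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, starts in positions.items():
        # Every pair of distinct occurrences is an off-diagonal dot.
        for a in starts:
            for b in starts:
                if a < b:
                    forward.append((a, b))
        # Occurrences of the reverse complement give inverted-repeat dots.
        for j in positions.get(reverse_complement(kmer), []):
            for i in starts:
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)
```

For a 2000 bp flanking sequence, `self_dot_plot(flank)` with the default 15 bp window would return an empty forward list when no exact repeat of that length exists; many forward dots at a constant offset `j - i` would point to a tandem repeat with that period.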
Overview of the GC Content Distribution (up)
Window size: 300 bp
[Figure: GC content distribution along the sequence.]
Summary: Full Length(2000bp) | A(23.2% 464) | C(25.85% 517) | T(29.95% 599) | G(21.0% 420)
Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)
Window size: 300 bp
[Figure: GC content distribution along the sequence.]
Summary: Full Length(2000bp) | A(29.65% 593) | C(21.4% 428) | T(26.75% 535) | G(22.2% 444)
Note: The 2000 bp section downstream of Exon 13 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   +       80210797   80212796   2000
YourSeq  178    1407   1768  2000   93.6%     chr3   -       94743830   94744253   424
YourSeq  177    1578   1771  2000   96.4%     chr12  -       111277752  111278121  370
YourSeq  173    1581   1781  2000   95.3%     chr11  -       97266771   97266971   201
YourSeq  169    1384   1763  2000   84.9%     chr11  -       107499220  107499461  242
YourSeq  169    1582   1781  2000   91.6%     chr2   +       34466910   34467104   195
YourSeq  167    1580   1766  2000   96.2%     chr7   -       130110141  130110328  188
YourSeq  167    1577   1766  2000   92.5%     chr14  -       66932731   66932916   186
YourSeq  167    1594   1844  2000   89.5%     chr4   +       108336674  108336876  203
YourSeq  166    1578   1773  2000   92.9%     chr5   -       117278107  117278304  198
YourSeq  166    1578   1781  2000   91.1%     chr2   -       121831076  121831275  200
YourSeq  166    1576   1768  2000   93.7%     chr6   +       137324372  137324568  197
YourSeq  165    1578   1771  2000   94.7%     chr7   -       29177573   29178154   582
YourSeq  165    1576   1766  2000   92.0%     chr19  -       57074827   57075014   188
YourSeq  165    1580   1766  2000   94.2%     chr17  -       37135601   37135787   187
YourSeq  165    1578   1766  2000   93.0%     chr10  -       76109287   76109473   187
YourSeq  165    1578   1766  2000   93.7%     chr10  +       75190898   75191086   189
YourSeq  164    1578   1770  2000   92.0%     chr11  -       119511240  119511429  190
YourSeq  163    1578   1766  2000   94.1%     chr8   -       105576768  105576959  192
YourSeq  163    1577   1767  2000   96.1%     chr4   -       153938366  153938556  191
Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr7   +       80220956   80222955   2000
YourSeq  157    1336   1867  2000   90.4%     chr1   -       164024147  164024702  556
YourSeq  105    1516   1824  2000   94.2%     chr15  -       85739340   85739904   565
YourSeq  89     1754   1879  2000   91.1%     chr11  +       73259085   73262165   3081
YourSeq  85     1734   1867  2000   89.8%     chr1   -       191642574  191642716  143
YourSeq  85     1337   1812  2000   86.6%     chr8   +       90807417   90807972   556
YourSeq  82     1336   1836  2000   74.8%     chr12  -       83845196   83845633   438
YourSeq  80     1358   1826  2000   75.9%     chr11  +       89191685   89191939   255
YourSeq  78     1733   1854  2000   88.7%     chr11  +       97031716   97031833   118
YourSeq  76     1749   1862  2000   90.6%     chr15  +       12275549   12275667   119
YourSeq  75     1747   1862  2000   95.2%     chr19  -       22084844   22084964   121
YourSeq  75     1745   1883  2000   81.7%     chr12  +       41057748   41057891   144
YourSeq  75     1514   1836  2000   73.8%     chr11  +       79797574   79797790   217
YourSeq  74     1747   1854  2000   87.8%     chr2   +       29478300   29478410   111
YourSeq  74     1750   1862  2000   91.2%     chr1   +       87702049   87702166   118
YourSeq  73     1750   1862  2000   91.1%     chr11  +       84173878   84173993   116
YourSeq  71     1735   1836  2000   85.4%     chr2   -       117096302  117096398  97
YourSeq  71     1733   1867  2000   89.1%     chr4   +       123980916  123981207  292
YourSeq  70     1750   1854  2000   87.3%     chr11  -       118320642  118320764  123
YourSeq  69     1739   1854  2000   89.7%     chr1   +       151459809  151460293  485
Note: The 2000 bp section downstream of Exon 13 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Sema4b sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4B [ Mus musculus (house mouse) ]
Gene ID: 20352, updated on 24-Oct-2019

Gene summary
Official Symbol: Sema4b provided by MGI
Official Full Name: sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4B provided by MGI
Primary source: MGI:MGI:107559
See related: Ensembl:ENSMUSG00000030539
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: SemC; Semac; mKIAA1745
Expression: Broad expression in spleen adult (RPKM 31.2), duodenum adult (RPKM 28.9) and 27 other tissues
Orthologs: human all

Genomic context
Location: 7; 7 D2
Exon count: 16

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (80186841..80226646)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (87331727..87371410)

[Figure: genomic context, Chromosome 7 - NC_000073.6.]

Transcript information:
This gene has 4 transcripts
Gene:
Sema4b ENSMUSG00000030539
Description: sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4B [Source:MGI Symbol;Acc:MGI:107559]
Gene Synonyms: SemC, Semac
Location: Chromosome 7: 80,186,841-80,226,527 forward strand. GRCm38:CM001000.2
About this gene: This gene has 4 transcripts (splice variants), 254 orthologues, 19 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts
Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Sema4b-201  ENSMUST00000032754.8  3949  823aa       ENSMUSP00000032754.7  Protein coding   CCDS21391  Q62179   TSL:1 GENCODE basic APPRIS P1
Sema4b-204  ENSMUST00000205822.1  3760  823aa       ENSMUSP00000145622.1  Protein coding   CCDS21391  Q62179   TSL:1 GENCODE basic APPRIS P1
Sema4b-202  ENSMUST00000107383.7  2775  No protein  -                     Retained intron  -          -        TSL:1
Sema4b-203  ENSMUST00000123023.2  1202  No protein  -                     Retained intron  -          -        TSL:1

[Figure: genome browser view, 59.69 kb, 80.18Mb to 80.23Mb. Forward strand genes (Comprehensive set...): Sema4b-201 (protein coding), Sema4b-204 (protein coding), Sema4b-202 (retained intron), Sema4b-203 (retained intron), Gdpgp1-201 (protein coding), Gm45206-201 (TEC). Contig: AC109232.17. Reverse strand genes (Comprehensive set...): Cib1-201, Cib1-202, Cib1-203, Cib1-206, Cib1-207 (protein coding); Cib1-204, Cib1-205, Cib1-208, Cib1-209 (retained intron); Cib1-210 (nonsense mediated decay). Regulatory Build track.]
Regulation Legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site
Gene Legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (processed transcript)

Transcript: ENSMUST00000032754
[Figure: protein domain view of Sema4b-201 (protein coding), 39.69 kb, forward strand. Tracks: ENSMUSP00000032...; Transmembrane heli...; Low complexity (Seg); Cleavage site (Sign...]
Superfamily: Sema domain superfamily; SSF103575
SMART: Sema domain; PSI domain
Pfam: Sema domain; Plexin repeat
PROSITE profiles: Sema domain
PANTHER: Semaphorin; PTHR11036:SF14
Gene3D: WD40/YVTN repeat-like-containing domain superfamily; 3.30.1680.10
CDD: cd05872
All sequence SNPs/i..