https://www.alphaknockout.com
Mouse Sema4a Knockout Project (CRISPR/Cas9)
Objective: To create a Sema4a knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.
Strategy summary: The Sema4a gene (NCBI Reference Sequence: NM_013658; Ensembl: ENSMUSG00000028064) is located on mouse chromosome 3. Fifteen exons have been identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 15 (Transcript: ENSMUST00000029700). Exons 2~10 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.
Note: Homozygotes for a knock-out allele show no obvious brain defects but exhibit impaired T cell priming and defective Th1 responses. Homozygotes for a gene trap allele show severe retinal degeneration with reduced retinal vessels, depigmentation and dysfunction of both rod and cone photoreceptors.
The coding region starts in exon 2. Exons 2~10 cover 49.74% of the coding region. The effective KO region is ~6856 bp and does not contain any other known gene.
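The coverage figure can be turned into an approximate number of coding bases. The sketch below is a back-calculation only: it assumes the coding region is counted as 760 codons (the protein length listed for ENSMUST00000029700, stop codon excluded), which the report does not state explicitly. The variable names are illustrative.

```python
# Back-calculate how many coding bases 49.74% represents.
# Assumption (not stated in the report): coding region = 760 codons,
# i.e. the 760 aa protein of ENSMUST00000029700 without the stop codon.
protein_length_aa = 760
cds_nt = protein_length_aa * 3

reported_percent = 49.74
covered_nt = round(cds_nt * reported_percent / 100)

# Recompute the percentage from the rounded base count as a sanity check.
recomputed_percent = round(100 * covered_nt / cds_nt, 2)

print(f"CDS length: {cds_nt} nt")
print(f"Coding bases in exons 2~10 (approx.): {covered_nt} nt")
print(f"Recomputed coverage: {recomputed_percent}%")
```

Under this assumption the targeted exons carry roughly 1134 of 2280 coding bases, which reproduces the reported 49.74%.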
Overview of the Targeting Strategy
[Figure: wildtype allele drawn 5' to 3' with the numbered exons of mouse Sema4a; the two gRNA regions flank the knockout region. Legend: exon of mouse Sema4a; knockout region.]
Overview of the Dot Plot (up)
Window size: 15 bp
[Figure: self dot plot, Sequence 1 against Sequence 2. Legend: Forward; Reverse Complement.]
Note: The 701 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
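A self dot plot of this kind can be sketched in a few lines. The example below is an illustration, not the tool used for the report: it indexes every 15 bp word, reports pairs of positions that carry the same word, and counts the hits per diagonal offset. A tandem repeat shows up as many hits at an offset equal to the repeat period. Only forward matches are considered (the report's plot also shows reverse-complement matches), and the toy sequence is made up.

```python
from collections import defaultdict

def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)

def repeat_offsets(hits):
    """Count hits per diagonal offset (j - i); repeat periods appear as frequent offsets."""
    counts = defaultdict(int)
    for i, j in hits:
        counts[j - i] += 1
    return dict(counts)

# Toy sequence (hypothetical, not the Sema4a flank):
# a 20 bp unit repeated three times between unique flanks.
unit = "ACGTTGCAAGGCTTACGGAT"
toy = "TTTTTGGGGGCCCCC" + unit * 3 + "AAAAACCCCCGGGGG"

hits = self_dot_plot(toy, window=15)
offsets = repeat_offsets(hits)
print(sorted(offsets.items()))  # offsets 20 and 40 indicate the tandem repeat
```

A sequence without repeats produces no off-diagonal hits, which corresponds to the "no significant tandem repeat" outcome described above.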
Overview of the Dot Plot (down)
Window size: 15 bp
[Figure: self dot plot, Sequence 1 against Sequence 2. Legend: Forward; Reverse Complement.]
Note: The 2000 bp section downstream of Exon 10 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
Overview of the GC Content Distribution (up)
Window size: 300 bp
[Figure: GC content distribution along the sequence.]
Summary: Full Length(701bp) | A(23.97% 168) | C(23.25% 163) | T(18.69% 131) | G(34.09% 239)
Note: The 701 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
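A sliding-window GC profile like the one summarized above can be sketched as follows. This is a minimal illustration, not the report's tool: the 300 bp window matches the stated window size, but the step size and the 70% cutoff for a "high GC" window are assumptions, since the report does not state its criterion. The toy sequence is made up.

```python
def gc_windows(seq, window=300, step=1):
    """Return (start, gc_percent) for each full window along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile

def high_gc_regions(profile, threshold=70.0):
    """Keep windows at or above the threshold (threshold is a hypothetical cutoff)."""
    return [(start, pct) for start, pct in profile if pct >= threshold]

# Toy sequence (hypothetical): a 400 bp GC-only block between two AT-only blocks.
toy = "AT" * 200 + "GC" * 200 + "AT" * 200

profile = gc_windows(toy, window=300, step=100)
high = high_gc_regions(profile, threshold=70.0)
print(high)  # windows lying entirely inside the GC block
```

A region with no windows above the chosen cutoff would be reported as having no significant high GC-content region.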
Overview of the GC Content Distribution (down)
Window size: 300 bp
[Figure: GC content distribution along the sequence.]
Summary: Full Length(2000bp) | A(22.3% 446) | C(26.55% 531) | T(30.4% 608) | G(20.75% 415)
Note: The 2000 bp section downstream of Exon 10 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
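The two summary lines can be cross-checked from the base counts they report. The short sketch below recomputes the per-base percentages and derives the overall GC content of each flanking section. The counts are taken from the summaries; the function and variable names are illustrative.

```python
def composition(counts):
    """Return (total length, per-base percentages, overall GC percent)."""
    total = sum(counts.values())
    percent = {base: round(100.0 * n / total, 2) for base, n in counts.items()}
    gc_percent = round(100.0 * (counts["G"] + counts["C"]) / total, 2)
    return total, percent, gc_percent

# Base counts as reported in the two summary lines.
upstream = {"A": 168, "C": 163, "T": 131, "G": 239}    # 701 bp upstream of exon 2
downstream = {"A": 446, "C": 531, "T": 608, "G": 415}  # 2000 bp downstream of exon 10

up_total, up_percent, up_gc = composition(upstream)
down_total, down_percent, down_gc = composition(downstream)

print(up_total, up_percent, up_gc)
print(down_total, down_percent, down_gc)
```

The recomputed percentages agree with the summaries, and the overall GC content comes out at about 57% upstream and about 47% downstream.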
BLAT Search Results (up)
QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  701    1      701  701    100.0%    chr3   -       88454818   88455518   701
YourSeq  40     56     155  701    95.5%     chr11  -       53916566   53940005   23440
YourSeq  31     292    351  701    73.6%     chr10  +       9119365    9119416    52
YourSeq  25     60     87   701    96.5%     chr11  -       65047254   65047284   31
YourSeq  22     564    587  701    87.0%     chr12  -       104793264  104793286  23
YourSeq  22     392    414  701    100.0%    chr1   -       128874851  128874874  24
YourSeq  21     541    561  701    100.0%    chr10  -       20544605   20544625   21
YourSeq  20     46     65   701    100.0%    chr13  -       29840438   29840457   20
YourSeq  20     653    672  701    100.0%    chr14  +       16488031   16488050   20
Note: The 701 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
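The BLAT rows can be screened programmatically. The sketch below parses whitespace-separated rows in the column order shown in the table header and compares the best secondary hit with the full-length self match. The parser, the `BlatHit` class and the idea of expressing the secondary score as a fraction of the query size are illustrative assumptions; the report does not state its threshold for "significant similarity". The three sample rows are copied from the upstream table.

```python
from dataclasses import dataclass

@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

def parse_blat_rows(text):
    """Parse rows of the form '... YourSeq SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN'."""
    hits = []
    for line in text.strip().splitlines():
        fields = line.split()
        if "YourSeq" not in fields:
            continue  # skip header or unrelated lines
        f = fields[fields.index("YourSeq") + 1:]
        hits.append(BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                            float(f[4].rstrip("%")), f[5], f[6],
                            int(f[7]), int(f[8]), int(f[9])))
    return hits

# First three rows of the upstream BLAT table.
rows = """
browser details YourSeq 701 1 701 701 100.0% chr3 - 88454818 88455518 701
browser details YourSeq 40 56 155 701 95.5% chr11 - 53916566 53940005 23440
browser details YourSeq 31 292 351 701 73.6% chr10 + 9119365 9119416 52
"""

hits = parse_blat_rows(rows)
# Treat a hit whose score equals the query size as the self match.
secondary = [h for h in hits if h.score != h.q_size]
best = max(secondary, key=lambda h: h.score)
print(best.chrom, best.score, f"{best.score / best.q_size:.1%} of query size")
```

Here the strongest secondary hit scores 40 against a query of 701 bp, which is in line with the note that no significant similarity is found.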
BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  2000   1      2000  2000   100.0%    chr3   -       88445991   88447990   2000
YourSeq  73     255    582   2000   92.0%     chr4   -       155038870  155039336  467
YourSeq  64     263    811   2000   93.3%     chr7   +       143227188  143356115  128928
YourSeq  53     291    597   2000   76.2%     chr1   +       84924932   84925200   269
YourSeq  52     291    806   2000   63.3%     chr1   +       36055899   36056163   265
YourSeq  51     262    343   2000   90.5%     chr10  -       69364491   69364770   280
YourSeq  50     242    335   2000   91.7%     chr7   -       24496126   24496323   198
YourSeq  49     262    333   2000   91.6%     chr11  -       77596717   77596795   79
YourSeq  49     262    335   2000   90.0%     chr5   +       79715938   79716016   79
YourSeq  48     262    345   2000   81.7%     chr2   -       144474210  144474289  80
YourSeq  48     296    570   2000   92.9%     chr16  +       87468723   87469105   383
YourSeq  47     789    886   2000   92.8%     chr1   -       153367843  153368251  409
YourSeq  46     262    337   2000   92.6%     chr7   -       19927444   19927520   77
YourSeq  46     262    334   2000   92.6%     chr14  +       32614938   32615011   74
YourSeq  46     262    334   2000   92.6%     chr10  +       63044707   63044780   74
YourSeq  45     268    333   2000   84.9%     chr4   -       103165358  103165424  67
YourSeq  45     262    333   2000   92.5%     chr1   -       171336552  171336624  73
YourSeq  45     262    334   2000   91.0%     chr16  +       20081264   20081337   74
YourSeq  45     262    333   2000   92.5%     chr11  +       84717834   84717906   73
YourSeq  44     262    333   2000   89.1%     chr11  +       101384482  101384556  75
Note: The 2000 bp section downstream of Exon 10 is BLAT searched against the genome. No significant similarity is found.
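The self matches in the two BLAT tables give genomic coordinates for both flanking sections, so the interval between them can be estimated. The sketch below assumes that the 701 bp section ends immediately upstream of exon 2 and that the 2000 bp section starts immediately downstream of exon 10, which the notes suggest but do not state exactly. Because Sema4a lies on the reverse strand, the upstream section has the higher chromosome coordinates.

```python
# Self-match coordinates on chr3 taken from the BLAT tables (start, end).
upstream_flank = (88454818, 88455518)    # 701 bp section upstream of exon 2
downstream_flank = (88445991, 88447990)  # 2000 bp section downstream of exon 10

def interval_between(upstream, downstream):
    """Return (start, end, length) of the region between the two flanks.

    Assumes a reverse-strand gene, so the downstream flank has the
    lower coordinates and the upstream flank the higher ones.
    """
    start = downstream[1] + 1
    end = upstream[0] - 1
    return start, end, end - start + 1

start, end, length = interval_between(upstream_flank, downstream_flank)
print(f"chr3:{start}-{end} ({length} bp)")
```

Under these assumptions the interval from exon 2 through exon 10 spans 6827 bp, close to the ~6856 bp effective KO region. The small difference would be consistent with gRNA cut sites lying slightly inside the flanking sections, although the report does not give the exact cut positions.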
Gene and protein information:
Sema4a sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4A [Mus musculus (house mouse)]
Gene ID: 20351, updated on 10-Oct-2019
Gene summary
Official Symbol: Sema4a provided by MGI
Official Full Name: sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4A provided by MGI
Primary source: MGI:MGI:107560
See related: Ensembl:ENSMUSG00000028064
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: SemB; Semab; AI132332
Expression: Broad expression in colon adult (RPKM 43.5), duodenum adult (RPKM 41.8) and 25 other tissues
Orthologs: human all
Genomic context
Location: 3; 3 F1
Exon count: 21
Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 3 | NC_000069.6 (88435959..88461240, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 3 | NC_000069.5 (88239884..88265104, complement)
[Figure: genomic context, Chromosome 3 - NC_000069.6]
Transcript information: This gene has 19 transcripts
Gene: Sema4a ENSMUSG00000028064
Description: sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4A [Source:MGI Symbol;Acc:MGI:107560]
Gene Synonyms: SemB, Semab
Location: Chromosome 3: 88,435,959-88,461,182 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 19 transcripts (splice variants), 410 orthologues, 19 paralogues, is a member of 1 Ensembl protein family and is associated with 20 phenotypes.
Transcripts
Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Sema4a-201 | ENSMUST00000029700.11 | 3205 | 760aa | ENSMUSP00000029700.5 | Protein coding | CCDS17475 | Q62178 | TSL:1; GENCODE basic; APPRIS P1
Sema4a-213 | ENSMUST00000165898.7 | 3125 | 760aa | ENSMUSP00000128510.1 | Protein coding | CCDS17475 | Q62178 | TSL:5; GENCODE basic; APPRIS P1
Sema4a-215 | ENSMUST00000169222.7 | 3084 | 760aa | ENSMUSP00000128887.1 | Protein coding | CCDS17475 | Q62178 | TSL:5; GENCODE basic; APPRIS P1
Sema4a-214 | ENSMUST00000166237.7 | 3046 | 760aa | ENSMUSP00000125909.1 | Protein coding | CCDS17475 | Q62178 | TSL:1; GENCODE basic; APPRIS P1
Sema4a-202 | ENSMUST00000107531.7 | 2897 | 628aa | ENSMUSP00000103155.1 | Protein coding | - | D3YWV5 | TSL:5; GENCODE basic
Sema4a-205 | ENSMUST00000127436.7 | 845 | 233aa | ENSMUSP00000118706.1 | Protein coding | - | D3YZ30 | CDS 3' incomplete; TSL:3
Sema4a-210 | ENSMUST00000147200.7 | 706 | 203aa | ENSMUSP00000123061.1 | Protein coding | - | D3YUM4 | CDS 3' incomplete; TSL:5
Sema4a-208 | ENSMUST00000141471.1 | 630 | 60aa | ENSMUSP00000114330.1 | Protein coding | - | D3YVM6 | CDS 3' incomplete; TSL:5
Sema4a-204 | ENSMUST00000125526.7 | 506 | 113aa | ENSMUSP00000119028.1 | Protein coding | - | D3Z336 | CDS 3' incomplete; TSL:3
Sema4a-203 | ENSMUST00000123753.7 | 385 | 17aa | ENSMUSP00000120084.1 | Protein coding | - | D3YWK5 | CDS 3' incomplete; TSL:2
Sema4a-216 | ENSMUST00000184487.7 | 866 | 170aa | ENSMUSP00000139126.1 | Nonsense mediated decay | - | V9GXF5 | TSL:5
Sema4a-217 | ENSMUST00000184876.7 | 748 | 180aa | ENSMUSP00000139159.1 | Nonsense mediated decay | - | V9GXH9 | TSL:5
Sema4a-219 | ENSMUST00000185137.7 | 699 | 47aa | ENSMUSP00000138858.1 | Nonsense mediated decay | - | V9GWW2 | CDS 5' incomplete; TSL:3
Sema4a-206 | ENSMUST00000135539.7 | 2487 | No protein | - | Retained intron | - | - | TSL:2
Sema4a-211 | ENSMUST00000149145.1 | 2357 | No protein | - | Retained intron | - | - | TSL:5
Sema4a-212 | ENSMUST00000156108.7 | 2108 | No protein | - | Retained intron | - | - | TSL:2
Sema4a-207 | ENSMUST00000135732.7 | 713 | No protein | - | Retained intron | - | - | TSL:3
Sema4a-218 | ENSMUST00000184972.1 | 465 | No protein | - | Retained intron | - | - | TSL:2
Sema4a-209 | ENSMUST00000146921.1 | 371 | No protein | - | lncRNA | - | - | TSL:3
[Figure: genome browser view of a 45.22 kb region (88.43Mb to 88.47Mb). Forward strand: Gm42814-201 (processed pseudogene). Contig: AC102388.7. Reverse strand: transcripts Sema4a-201 to Sema4a-219 (protein coding, nonsense mediated decay, retained intron, lncRNA) and Mir7011-201 (miRNA). Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene; pseudogene).]
Transcript: ENSMUST00000029700
[Figure: protein feature map for Sema4a-201 (protein coding; reverse strand, 19.78 kb), protein ENSMUSP00000029..., scale 0 to 760. Annotation tracks: Transmembrane heli...; MobiDB lite; Low complexity (Seg); Cleavage site (Sign...; Superfamily (Sema domain superfamily, SSF103575); SMART (Sema domain, PSI domain); Pfam (Sema domain, Plexin repeat); PROSITE profiles (Sema domain); PANTHER (PTHR11036:SF15, Semaphorin); Gene3D (WD40/YVTN repeat-like-containing domain superfamily, 3.30.1680.10, Immunoglobulin-like fold); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]
We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.