https://www.alphaknockout.com
Mouse Msmb Conditional Knockout Project (CRISPR/Cas9)
Objective: To create a Msmb conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.
Strategy summary:
The Msmb gene (NCBI Reference Sequence: NM_020597; Ensembl: ENSMUSG00000021907) is located on mouse chromosome 14. Four exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 4 (Transcript: ENSMUST00000022464).
Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Msmb gene.
To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-95M10 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.
Note: Mice homozygous for a knock-in allele exhibit early prostate cancer development progressing from intraepithelial neoplasia with microinvasion to well-differentiated prostate gland adenocarcinoma. They also show enlargement of the prostate gland and the lymph nodes, and increased metastatic potential.
Exon 2 starts from about 1.18% of the coding region. The knockout of Exon 2 will result in frameshift of the gene. The size of intron 1 for 5'-loxP site insertion: 6001 bp, and the size of intron 2 for 3'-loxP site insertion: 1981 bp. The size of effective cKO region: ~606 bp. The cKO region does not have any other known gene.
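The position and frameshift statements above can be checked with a little arithmetic. The sketch below assumes the coding region length is 113 aa x 3 = 339 bp (the protein length listed for ENSMUST00000022464, stop codon excluded) and that "1.18%" is measured against that length. The exon lengths passed to `causes_frameshift` are hypothetical, since the report does not state the coding length of exon 2.

```python
# Assumption: CDS length derived from the 113 aa protein, stop codon excluded.
AA_COUNT = 113
CDS_LEN = AA_COUNT * 3            # 339 bp
EXON2_START_FRACTION = 0.0118     # "about 1.18% of the coding region"


def coding_offset(fraction: float, cds_len: int) -> int:
    """Approximate number of coding bases that precede the exon."""
    return round(fraction * cds_len)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length
    is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


bases_before_exon2 = coding_offset(EXON2_START_FRACTION, CDS_LEN)
print("coding bases upstream of exon 2:", bases_before_exon2)

# Hypothetical exon lengths, for illustration only.
for length in (99, 100):
    print(length, "bp deleted -> frameshift:", causes_frameshift(length))
```

Under these assumptions only about 4 coding bases lie upstream of exon 2, which is consistent with the expectation that removing exon 2 disrupts nearly the whole protein.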
Overview of the Targeting Strategy
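[Figure: schematic of four alleles/constructs. (1) Wildtype allele, exons 1 to 4, with two gRNA regions marked and 5'/3' orientation labels; (2) targeting vector; (3) targeted allele; (4) constitutive KO allele (after Cre recombination). Legend: exon of mouse Msmb; homology arm; cKO region; loxP site.]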
Overview of the Dot Plot
[Figure: self dot plot of the sequence. Window size: 10 bp. Forward and reverse-complement matches are plotted.]
Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
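A self dot plot of this kind can be sketched as follows. This is a minimal illustration of the general technique (exact matching of fixed-size windows against the same sequence, in forward and reverse-complement orientation), not the tool used to produce the report. The function names and the toy sequences are assumptions made for the example.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) pairs, i < j, where the
    window starting at i matches the window at j exactly (forward) or
    matches its reverse complement (reverse)."""
    index = defaultdict(list)
    last = len(seq) - window + 1
    for i in range(last):
        index[seq[i:i + window]].append(i)

    forward = [(i, j)
               for positions in index.values()
               for k, i in enumerate(positions)
               for j in positions[k + 1:]]

    reverse = []
    for i in range(last):
        for j in index.get(revcomp(seq[i:i + window]), []):
            if i < j:          # report each pair once, skip self-palindromes
                reverse.append((i, j))
    return forward, reverse


# Toy sequences (hypothetical): a direct repeat and an inverted repeat.
direct = "ACGTTGCAAG" + "TTTTT" + "ACGTTGCAAG"
inverted = "AAAACCCCGG" + "CCGGGGTTTT"
fwd, _ = self_dot_plot(direct)
_, rev = self_dot_plot(inverted)
print("forward repeats:", fwd)
print("inverted repeats:", rev)
```

Runs of consecutive forward hits along a diagonal offset from the main diagonal correspond to the tandem or direct repeats that the note flags as a possible obstacle to vector construction.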
Overview of the GC Content Distribution
[Figure: GC content distribution along the sequence. Window size: 300 bp.]
Summary: Full Length(7106bp) | A(25.81% 1834) | C(21.45% 1524) | T(30.48% 2166) | G(22.26% 1582)
Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
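The overall GC content follows directly from the base counts in the summary line. The sketch below recomputes it and adds a simple sliding-window function of the kind that could produce a 300 bp distribution; the function name `windowed_gc` and its `step` parameter are illustrative assumptions, not taken from the report.

```python
# Base counts copied from the summary line (full length 7106 bp).
counts = {"A": 1834, "C": 1524, "T": 2166, "G": 1582}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, GC = {gc_percent:.2f}%")


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage of each full window of `window` bases,
    advancing `step` bases at a time."""
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]
```

The counts sum to the stated 7106 bp, and the overall GC fraction comes out a little under 44%, in line with the note that no high-GC region is present.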
BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  +       32144827   32147826   3000
YourSeq  361    2469   3000  3000   92.3%     chr6   -       37680866   37681565   700
YourSeq  324    2469   2866  3000   92.3%     chr17  -       26317589   26318007   419
YourSeq  321    2469   2866  3000   92.3%     chr14  -       27539126   27539522   397
YourSeq  321    2471   2866  3000   92.6%     chr13  +       85523334   85523749   416
YourSeq  320    2469   2852  3000   93.8%     chr4   -       144812059  144812472  414
YourSeq  319    2469   2866  3000   91.5%     chr17  -       26309078   26309497   420
YourSeq  315    2469   2866  3000   90.4%     chr9   +       5867811    5868233    423
YourSeq  314    2469   2878  3000   91.0%     chr7   -       142234125  142238413  4289
YourSeq  311    2469   2855  3000   93.6%     chr9   -       37976267   37976666   400
YourSeq  311    2469   2852  3000   92.9%     chr10  -       40787098   40787516   419
YourSeq  311    2471   2862  3000   91.1%     chrX   +       120224471  120224887  417
YourSeq  311    2469   2867  3000   92.5%     chr7   +       28530085   28530519   435
YourSeq  309    2469   2855  3000   92.6%     chr18  -       6593107    6593507    401
YourSeq  308    2469   2863  3000   92.4%     chr15  -       33328347   33328759   413
YourSeq  308    2469   2866  3000   90.3%     chr15  +       76961415   76961842   428
YourSeq  306    2469   2852  3000   92.5%     chr9   -       104727262  104727673  412
YourSeq  303    2322   2826  3000   92.7%     chr6   -       129106636  129107177  542
YourSeq  303    2469   2867  3000   91.1%     chr19  -       61217104   61217513   410
YourSeq  302    2469   2852  3000   91.4%     chr7   -       142234603  142235009  407
Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  +       32148433   32151432   3000
YourSeq  345    2308   2805  3000   87.0%     chr1   -       44082687   44083195   509
YourSeq  331    2308   2804  3000   85.0%     chr14  +       100290034  100290533  500
YourSeq  330    2308   2794  3000   89.9%     chr12  -       58298387   58298892   506
YourSeq  325    2308   2794  3000   87.4%     chrX   +       163685380  163685882  503
YourSeq  324    2314   2805  3000   87.8%     chr1   -       53964996   53965506   511
YourSeq  323    2327   2803  3000   87.9%     chr10  -       32262398   32675790   413393
YourSeq  323    2308   2803  3000   88.4%     chr6   +       105647016  105647529  514
YourSeq  321    2310   2787  3000   89.3%     chr2   +       69967626   69968120   495
YourSeq  319    2357   2803  3000   88.0%     chr11  -       4412976    4413419    444
YourSeq  315    2081   2803  3000   86.3%     chr11  +       70111340   70112013   674
YourSeq  314    2308   2794  3000   86.9%     chr2   +       3902101    3902600    500
YourSeq  312    2315   2805  3000   88.5%     chr3   +       50246113   50246616   504
YourSeq  312    2382   2794  3000   88.7%     chr19  +       20022477   20022905   429
YourSeq  309    2385   2804  3000   87.5%     chr16  -       16449405   16449828   424
YourSeq  309    2308   2794  3000   86.4%     chr11  +       10036462   10036963   502
YourSeq  305    2340   2794  3000   87.5%     chr1   +       84080912   84081391   480
YourSeq  304    2396   2803  3000   89.4%     chr11  +       110382224  110382638  415
YourSeq  303    2396   2803  3000   87.4%     chr9   -       53218938   53219337   400
YourSeq  299    2396   2791  3000   88.1%     chr17  -       3751136    3751539    404
Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
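The two 100% identity hits on chr14 pin down the genomic interval between the upstream and downstream query sections. As a quick consistency check, the sketch below derives that interval and compares its length with the stated effective cKO region size of about 606 bp. It assumes the BLAT coordinates are 1-based and inclusive, as displayed in the results above; the variable names are illustrative.

```python
# Genomic coordinates of the perfect (100.0%) chr14 hits, copied from
# the two BLAT result tables. Assumed 1-based, inclusive.
UP_START, UP_END = 32144827, 32147826        # 3000 bp upstream section
DOWN_START, DOWN_END = 32148433, 32151432    # 3000 bp downstream section

# The interval between the two flanking sections.
between_start = UP_END + 1
between_end = DOWN_START - 1
between_len = between_end - between_start + 1

# Total span covered by upstream section + interval + downstream section.
total_span = DOWN_END - UP_START + 1

print(f"interval: chr14:{between_start}-{between_end} ({between_len} bp)")
print(f"total span searched plus interval: {total_span} bp")
```

The interval length agrees with the ~606 bp effective cKO region quoted in the strategy summary, which suggests the BLAT queries were taken immediately adjacent to that region.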
Gene and protein information:
Msmb beta-microseminoprotein [Mus musculus (house mouse)]
Gene ID: 17695, updated on 12-Aug-2019
Gene summary
Official Symbol: Msmb (provided by MGI)
Official Full Name: beta-microseminoprotein (provided by MGI)
Primary source: MGI:MGI:97166
See related: Ensembl:ENSMUSG00000021907
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: PIP; PSP94; beta-MSP
Expression: Low expression observed in reference dataset
Genomic context
Location: 14; 14 B
Exon count: 6
Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (32142023..32158327)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (32955209..32971513)
[Figure: genomic context of Msmb on Chromosome 14 - NC_000080.6.]
Transcript information: This gene has 2 transcripts
Gene: Msmb ENSMUSG00000021907
Description: beta-microseminoprotein [Source:MGI Symbol;Acc:MGI:97166]
Gene Synonyms: PIP, PSP94, beta-MSP, beta-inhibin, prostatic inhibin protein
Location: Chromosome 14: 32,142,023-32,158,370 forward strand. GRCm38:CM001007.2
About this gene: This gene has 2 transcripts (splice variants), 125 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 6 phenotypes.
Transcripts
Name      Transcript ID          bp   Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Msmb-201  ENSMUST00000022464.13  562  113aa       ENSMUSP00000022464.6  Protein coding  CCDS36862  O08540   TSL:1 GENCODE basic APPRIS P1
Msmb-202  ENSMUST00000130397.1   537  No protein  -                     lncRNA          -          -        TSL:3
[Figure: Ensembl region view, 36.35 kb, chromosome 14, 32.14 Mb to 32.16 Mb.
Forward strand, Genes (Comprehensive set...): Msmb-201 (protein coding); Msmb-202 (lncRNA); Ncoa4-201, Ncoa4-202, Ncoa4-203, Ncoa4-204, Ncoa4-205, Ncoa4-207, Ncoa4-208 (protein coding); Ncoa4-206 (nonsense mediated decay); Ncoa4-212 (retained intron).
Contigs: AC154532.3.
Reverse strand, Genes (Comprehensive set...): Gm18909-201 (processed pseudogene).
Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank.
Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript; pseudogene).]
Transcript: ENSMUST00000022464
[Figure: Ensembl transcript view of Msmb-201 (protein coding), 16.34 kb, forward strand.
Protein tracks for ENSMUSP00000022...: Cleavage site (Sign...; Pfam: Beta-microseminoprotein; PANTHER: Beta-microseminoprotein, PTHR10500:SF0; Gene3D: 2.10.70.10, 2.20.25.590.
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.
Scale bar: amino acid positions 0 to 113.]
We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.