https://www.alphaknockout.com
Mouse Osgep Conditional Knockout Project (CRISPR/Cas9)
Objective: To create an Osgep conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.
Strategy summary: The Osgep gene (NCBI Reference Sequence: NM_133676; Ensembl: ENSMUSG00000006289) is located on mouse chromosome 14. Eleven exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 11 (Transcript: ENSMUST00000159292). Exons 4~11 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Osgep gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-393L19 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.
Note:
Exons 4~11 cover 59.1% of the coding region.
The start codon is in exon 1, and the stop codon is in exon 11.
The size of intron 3 for 5'-loxP site insertion: 1684 bp.
The size of the effective cKO region: ~2343 bp.
The cKO region does not contain any other known gene.
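The effect of Cre recombination on the targeted allele can be illustrated with a minimal sketch. It assumes two loxP sites in the same orientation and uses the commonly cited 34 bp loxP sequence; the function name `cre_excise` and the toy flanking sequences are hypothetical and are not taken from the actual Osgep locus.

```python
# Commonly cited 34 bp loxP site (13 bp arm + 8 bp spacer + 13 bp arm).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Model Cre recombination between the first two same-orientation loxP sites.

    The floxed segment and one loxP copy are removed; a single loxP remains.
    An allele with fewer than two loxP sites is returned unchanged.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything before the first site, then resume at the second site.
    return allele[:first] + allele[second:]


# Toy sequences (hypothetical placeholders, not Osgep sequence).
upstream = "GGGG"        # stands in for sequence 5' of the cKO region
floxed = "ACGTACGT"      # stands in for the cKO region (exons 4~11)
downstream = "CCCC"      # stands in for sequence 3' of the cKO region

targeted = upstream + LOXP + floxed + LOXP + downstream
ko = cre_excise(targeted)
print(ko == upstream + LOXP + downstream)
```

In this toy model the constitutive KO allele keeps one loxP site where the cKO region used to be, which is the configuration the strategy describes after Cre recombination.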
Overview of the Targeting Strategy
[Figure: schematic of the wildtype allele (exons 1 to 11, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Osgep, homology arm, cKO region, loxP site.]
Overview of the Dot Plot
Window size: 10 bp
[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]
Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
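A self-comparison of this kind can be sketched as follows. This is a simplified illustration, assuming exact matches of 10 bp words on the forward strand only; the tool used for the report may score matches differently and also covers the reverse complement. The function name `self_dot_plot` and the toy sequence are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs with i < j whose window-length words are identical.

    Each pair is one off-diagonal dot in a forward-strand self dot plot.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Toy sequence: an 8 bp unit repeated three times between unique flanks.
toy = "TTGCA" + "ACGTTAGC" * 3 + "GGATC"
dots = self_dot_plot(toy, window=10)

# Tandem repeats appear as runs of dots at a constant offset (j - i)
# that equals a multiple of the repeat unit length.
offsets = {j - i for i, j in dots}
print(dots[0], offsets)
```

A run of dots parallel to the main diagonal at a fixed offset is the signature of a tandem repeat; in the toy sequence the offset equals the 8 bp unit length.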
Overview of the GC Content Distribution
Window size: 300 bp
[Figure: GC content distribution along Sequence 12.]
Summary: Full length 8570 bp | A 25.81% (2212) | C 22.61% (1938) | T 29.07% (2491) | G 22.51% (1929)
Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
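The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed in the same spirit. The sketch below uses the reported counts; the helper names `gc_percent` and `windowed_gc`, the non-overlapping step, and the toy sequence are assumptions for illustration, since the actual 8570 bp sequence is not reproduced here.

```python
def gc_percent(seq: str) -> float:
    """GC content of a sequence, in percent (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 300, step: int = 300):
    """GC percent for each full window; step is an assumed parameter."""
    return [
        gc_percent(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts as reported in the summary line.
counts = {"A": 2212, "C": 1938, "T": 2491, "G": 1929}
total = sum(counts.values())
overall_gc = 100.0 * (counts["G"] + counts["C"]) / total
print(total, round(overall_gc, 2))

# Toy sequence (hypothetical): one GC-only window followed by one AT-only window.
toy = "GC" * 150 + "AT" * 150
profile = windowed_gc(toy, window=300, step=300)
print(profile)
```

The counts sum to the stated full length of 8570 bp, and the resulting overall GC content of roughly 45% is consistent with the note that no high-GC region stands out.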
BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  -       50918182   50921181   3000
YourSeq  286    1132   1568  3000   92.7%     chr18  -       33384543   33384841   299
YourSeq  157    2318   2540  3000   85.0%     chr14  +       92455979   92456187   209
YourSeq  154    2344   2555  3000   91.1%     chr2   +       83753455   83753672   218
YourSeq  152    2319   2522  3000   87.7%     chr13  +       95203374   95203563   190
YourSeq  151    2315   2523  3000   87.0%     chr14  +       72875382   72875566   185
YourSeq  149    2353   2543  3000   91.9%     chr1   -       81846433   81846621   189
YourSeq  147    2346   2551  3000   90.4%     chr9   -       71370693   71370907   215
YourSeq  141    2367   2549  3000   91.2%     chr9   +       66998449   66998633   185
YourSeq  139    2318   2544  3000   82.6%     chr1   -       105526957  105527128  172
YourSeq  137    2374   2543  3000   88.2%     chr7   +       131006213  131006372  160
YourSeq  132    2398   2553  3000   93.0%     chr8   +       69723770   69723967   198
YourSeq  130    2378   2540  3000   89.9%     chr11  -       3607208    3607363    156
YourSeq  130    2342   2546  3000   92.8%     chr14  +       76164179   76164587   409
YourSeq  129    2391   2544  3000   92.3%     chr14  +       47308114   47308276   163
YourSeq  128    2315   2524  3000   85.2%     chr3   -       95102922   95103071   150
YourSeq  126    2388   2543  3000   91.1%     chr10  +       81108050   81108208   159
YourSeq  125    2374   2524  3000   89.4%     chrX   -       134978487  134978629  143
YourSeq  124    2342   2540  3000   85.8%     chrX   +       71363885   71364032   148
YourSeq  124    2374   2523  3000   89.3%     chr2   +       90908344   90908485   142
Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  -       50912862   50915861   3000
YourSeq  376    1650   2498  3000   94.6%     chr11  -       14715277   14715979   703
YourSeq  366    1650   2495  3000   94.8%     chr9   +       110806327  110807078  752
YourSeq  354    1650   2020  3000   97.5%     chr4   -       94950397   94950758   362
YourSeq  350    1646   1997  3000   100.0%    chr5   +       96672143   96672499   357
YourSeq  348    1650   1998  3000   100.0%    chr6   -       49971163   49971515   353
YourSeq  348    1649   1997  3000   100.0%    chr14  -       54613737   54614087   351
YourSeq  348    1650   1998  3000   100.0%    chr2   +       117206877  117207229  353
YourSeq  347    1650   1999  3000   99.8%     chrX   -       106719700  106720055  356
YourSeq  347    1650   1997  3000   100.0%    chr5   -       37600086   37600437   352
YourSeq  347    1650   1997  3000   100.0%    chr14  -       11191559   11191910   352
YourSeq  346    1650   1998  3000   99.8%     chr7   -       29930097   29930449   353
YourSeq  346    1650   2003  3000   99.2%     chr5   -       148148219  148148621  403
YourSeq  346    1649   1997  3000   99.8%     chr4   +       135802141  135802493  353
YourSeq  346    1649   1997  3000   99.8%     chr16  +       72814009   72814361   353
YourSeq  346    1650   1998  3000   99.8%     chr15  +       10498310   10498662   353
YourSeq  345    1650   1997  3000   99.8%     chr7   -       73485803   73486154   352
YourSeq  345    1650   1997  3000   99.8%     chr6   -       95779275   95779626   352
YourSeq  345    1650   1997  3000   99.8%     chr10  -       114920935  114921286  352
YourSeq  345    1650   1997  3000   99.8%     chr10  -       107657584  107657935  352
Note: The 3000 bp section downstream of Exon 11 is BLAT searched against the genome. No significant similarity is found.
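One way to screen BLAT rows like those above is to separate the full-length hit at the target locus from secondary hits and check how much of the 3000 bp query each secondary hit covers. The sketch below parses the first three downstream rows; the `BlatHit` class, the `parse_hit` helper, and both thresholds are hypothetical choices for illustration, not criteria stated in this report.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by this hit (inclusive coordinates)."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_hit(line: str) -> BlatHit:
    f = line.split()  # f[0] is the query name (YourSeq)
    return BlatHit(
        int(f[1]), int(f[2]), int(f[3]), int(f[4]),
        float(f[5].rstrip("%")), f[6], f[7],
        int(f[8]), int(f[9]), int(f[10]),
    )


# First three rows of the downstream search results.
rows = """
YourSeq 3000 1 3000 3000 100.0% chr14 - 50912862 50915861 3000
YourSeq 376 1650 2498 3000 94.6% chr11 - 14715277 14715979 703
YourSeq 366 1650 2495 3000 94.8% chr9 + 110806327 110807078 752
"""
hits = [parse_hit(line) for line in rows.strip().splitlines()]
target, others = hits[0], hits[1:]

# Assumed thresholds: flag a secondary hit only if it is both
# highly similar and covers at least half of the query.
MIN_IDENTITY = 90.0
MIN_COVERAGE = 0.5
flagged = [
    h for h in others
    if h.identity >= MIN_IDENTITY and h.query_coverage >= MIN_COVERAGE
]
print(target.query_coverage, [round(h.query_coverage, 3) for h in others], flagged)
```

Under these assumed thresholds the secondary hits in the sample rows are not flagged, because each covers well under half of the query even though its identity is high.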
Gene and protein information:
Osgep O-sialoglycoprotein endopeptidase [Mus musculus (house mouse)]
Gene ID: 66246, updated on 12-Aug-2019
Gene summary
Official Symbol: Osgep (provided by MGI)
Official Full Name: O-sialoglycoprotein endopeptidase (provided by MGI)
Primary source: MGI:MGI:1913496
See related: Ensembl:ENSMUSG00000006289
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: GCPL-1; PRSMG1; 1500019L24Rik
Expression: Ubiquitous expression in ovary adult (RPKM 44.1), adrenal adult (RPKM 34.5) and 28 other tissues
Genomic context
Location: 14; 14 C1
Exon count: 11
Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (50915374..50924893, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (51535049..51544568, complement)
[Figure: genomic context of Osgep on chromosome 14 (NC_000080.6).]
Transcript information: This gene has 8 transcripts
Gene: Osgep ENSMUSG00000006289
Description: O-sialoglycoprotein endopeptidase [Source:MGI Symbol;Acc:MGI:1913496]
Gene Synonyms: 1500019L24Rik, GCPL-1, PRSMG1
Location: Chromosome 14: 50,906,478-50,924,893 reverse strand. GRCm38:CM001007.2
About this gene: This gene has 8 transcripts (splice variants), 188 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.
Transcripts
Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Osgep-202  ENSMUST00000159292.7   3388  335aa       ENSMUSP00000124039.1  Protein coding           CCDS27026  A0A0R4J1Y3  TSL:1, GENCODE basic, APPRIS P1
Osgep-207  ENSMUST00000162177.7   1219  254aa       ENSMUSP00000124016.1  Protein coding           -          E0CYN9      TSL:1, GENCODE basic
Osgep-203  ENSMUST00000160375.7   713   156aa       ENSMUSP00000124099.1  Protein coding           -          E0CYK9      CDS 3' incomplete, TSL:5
Osgep-204  ENSMUST00000160393.7   2004  335aa       ENSMUSP00000125155.1  Nonsense mediated decay  -          A0A0R4J1Y3  TSL:1
Osgep-206  ENSMUST00000160890.7   1812  80aa        ENSMUSP00000124659.1  Nonsense mediated decay  -          E0CXW7      TSL:2
Osgep-201  ENSMUST00000006452.12  1308  186aa       ENSMUSP00000006452.6  Nonsense mediated decay  -          E9QMF4      TSL:5
Osgep-205  ENSMUST00000160464.1   839   No protein  -                     Retained intron          -          -           TSL:2
Osgep-208  ENSMUST00000162850.1   586   No protein  -                     Retained intron          -          -           TSL:2
[Figure: Ensembl genomic region view, 38.42 kb, chromosome 14 from 50.90 Mb to 50.93 Mb. Forward-strand genes: Gm24689 (snRNA), Apex1, Pnp. Reverse-strand genes: Klhl33, Osgep (transcripts 201 to 208), Pip4p1. Contigs: AC027184.15, AC136376.3. Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana) and non-protein coding (RNA gene, processed transcript).]
Transcript: ENSMUST00000159292
[Figure: protein view of Osgep-202 (protein coding; reverse strand, 11.30 kb; scale bar 0 to 335 aa). Annotated features: TIGRFAM Kae1/TsaD family; Superfamily SSF53067; Prints Kae1/TsaD family; Pfam Gcp-like domain; PROSITE patterns Peptidase M22, conserved site; PANTHER tRNA N6-adenosine threonylcarbamoyltransferase Kae1/OSGEP and Gcp-like domain; HAMAP tRNA N6-adenosine threonylcarbamoyltransferase Kae1/OSGEP; Gene3D 3.30.420.40. Variant track (dbSNP and all other sources): missense, splice region and synonymous variants.]
We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.