https://www.alphaknockout.com
Mouse Tpbg Conditional Knockout Project (CRISPR/Cas9)
Objective: To create a Tpbg conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.
Strategy summary:
The Tpbg gene (NCBI Reference Sequence: NM_011627; Ensembl: ENSMUSG00000035274) is located on Mouse chromosome 9. Two exons are identified, with both the ATG start codon and the TGA stop codon in exon 2 (Transcript: ENSMUST00000006559). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the Mouse Tpbg gene.
To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP24-304F8 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.
Note: Mice homozygous for a knock-out allele exhibit low-penetrance hydrocephaly and premature death. Embryonic stem cells isolated from these mice exhibit impaired mesenchyme differentiation and reduced chemotaxis following differentiation.
Exon 2 covers 100.0% of the coding region; both the start codon and the stop codon are in exon 2. Size of the effective cKO region: ~1880 bp. The cKO region contains no other known gene.
Overview of the Targeting Strategy
[Figure: schematic of four alleles. Wildtype allele (5' to 3', exons 1 and 2, with the gRNA regions marked); Targeting vector; Targeted allele; Constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Tpbg; cKO region; loxP site.]
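The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model. This is only a sketch: the arm and exon sequences below are short placeholders rather than real Tpbg sequence, the function name cre_excise is hypothetical, and the loxP string is the commonly cited 34 bp site, not a sequence taken from this report. It assumes two loxP sites in the same orientation, in which case Cre removes the intervening segment and leaves a single loxP site behind.

```python
# Commonly cited 34 bp loxP site (assumption: not taken from this report).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Delete the segment between the first two same-orientation sites.

    One copy of the site remains after excision. Alleles with fewer
    than two sites are returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything before the first site, then resume at the second site.
    return allele[:first] + allele[second:]


# Placeholder sequences, NOT the real homology arms or exon 2.
five_arm = "AAAACCCC"
exon2 = "ATGGGGTTTTGA"
three_arm = "CCCCAAAA"

wildtype = five_arm + exon2 + three_arm
targeted = five_arm + LOXP + exon2 + LOXP + three_arm
knockout = cre_excise(targeted)

print(len(targeted), len(knockout))
```

In this model the wildtype allele is unaffected by Cre, while the targeted (floxed) allele loses the placeholder exon and keeps one loxP site between the arms.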
Overview of the Dot Plot
[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches. Window size: 10 bp.]
Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
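A self-alignment of this kind can be approximated with exact k-mer matching. The sketch below is an assumption about the general technique, not the tool used for this report: it records forward-strand matches only (the plot above also shows reverse-complement matches), and the function name self_dot_plot and the demo sequence are made up for illustration. A tandem repeat shows up as a run of dots at a constant offset from the main diagonal.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, where seq[i:i+window] == seq[j:j+window]."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    dots = []
    for starts in positions.values():
        # Every pair of occurrences of the same window is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Made-up sequence: unique flanks around five copies of CAG.
demo = "ACGTTGCA" + "CAGCAGCAGCAGCAG" + "TTGACCGA"
dots = self_dot_plot(demo, window=10)
offsets = {j - i for i, j in dots}

print(dots)
print(offsets)
```

Here every dot sits at offset 3, the length of the repeat unit, which is the pattern one would look for in a real dot plot when judging whether repeats could complicate vector construction.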
Overview of the GC Content Distribution
[Figure: GC content along the sequence. Window size: 300 bp.]
Summary: Full length 7278 bp | A 24.33% (1771) | C 23.10% (1681) | T 27.70% (2016) | G 24.87% (1810)
Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
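The base counts in the summary can be checked arithmetically, and a windowed scan shows why local GC-rich regions can exist even when the overall composition is balanced. The sketch below uses only the counts from the summary line; the function gc_windows, its step parameter and the toy sequence are assumptions for illustration, since the actual sequence is not reproduced in this report.

```python
# Base counts copied from the summary line.
counts = {"A": 1771, "C": 1681, "T": 2016, "G": 1810}
total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def gc_windows(seq: str, window: int = 300, step: int = 50):
    """Return (start, GC%) for each window of the given size along seq."""
    seq = seq.upper()
    out = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        out.append((start, 100 * gc / window))
    return out


# Toy sequence (NOT the real locus): a GC-only block followed by an AT-only block.
toy = "GC" * 150 + "AT" * 150
profile = gc_windows(toy, window=300, step=300)

print(total, percent, round(gc_percent, 2))
print(profile)
```

The counts sum to the stated full length, and the overall GC content works out to roughly 48%. The toy profile has 50% GC overall but windows at 100% and 0%, which is the kind of local variation a 300 bp window is meant to expose.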
BLAT Search Results (up)
QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  3000   1       3000  3000   100.0%    chr9   +       85840981   85843980   3000
YourSeq  37     143     268   3000   79.5%     chr19  +       60166443   60166559   117
YourSeq  36     98      165   3000   97.4%     chr2   -       63505261   63505599   339
YourSeq  34     131     166   3000   100.0%    chr19  +       17510471   17510516   46
YourSeq  32     133     165   3000   100.0%    chr10  -       36202251   36202286   36
YourSeq  31     136     166   3000   100.0%    chr12  +       50447412   50447442   31
YourSeq  31     136     166   3000   100.0%    chr12  +       36847566   36847596   31
YourSeq  30     136     166   3000   100.0%    chr16  +       73028123   73028157   35
YourSeq  30     137     166   3000   100.0%    chr15  +       96921933   96921962   30
YourSeq  29     133     161   3000   100.0%    chr14  +       71741656   71741684   29
YourSeq  29     136     165   3000   100.0%    chr14  +       39429831   39429861   31
YourSeq  29     133     161   3000   100.0%    chr10  +       51254095   51254123   29
YourSeq  28     139     166   3000   100.0%    chr18  -       63220576   63220603   28
YourSeq  28     139     166   3000   100.0%    chr14  -       116493620  116493647  28
YourSeq  28     139     166   3000   100.0%    chr11  -       55706075   55706102   28
YourSeq  28     531     568   3000   86.9%     chr1   -       145710580  145710617  38
YourSeq  28     134     165   3000   90.0%     chr17  +       84162791   84162821   31
YourSeq  28     139     166   3000   100.0%    chr15  +       22960480   22960507   28
YourSeq  27     139     165   3000   100.0%    chr18  -       68138476   68138502   27
YourSeq  27     139     165   3000   100.0%    chr16  -       58513286   58513312   27
Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
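One way to read a hit list like this is to set aside the self hit (the full-length match back to chr9) and look at how much of the query the best remaining hit covers by score. The sketch below does that for the first four upstream rows. The Hit type, the screen function and the 10% threshold are all hypothetical; this report does not state what criterion was used to call similarity "significant".

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int


# First four rows of the upstream BLAT results.
hits = [
    Hit(3000, 1, 3000, 3000, 100.0, "chr9", "+", 85840981, 85843980),
    Hit(37, 143, 268, 3000, 79.5, "chr19", "+", 60166443, 60166559),
    Hit(36, 98, 165, 3000, 97.4, "chr2", "-", 63505261, 63505599),
    Hit(34, 131, 166, 3000, 100.0, "chr19", "+", 17510471, 17510516),
]


def screen(hits, min_fraction=0.1):
    """Split hits into the self hit, other hits, and hits above a score threshold.

    min_fraction is an illustrative cutoff (score / query size), not a
    value taken from this report.
    """
    self_hit = max(hits, key=lambda h: h.score)
    others = [h for h in hits if h is not self_hit]
    flagged = [h for h in others if h.score / h.q_size >= min_fraction]
    return self_hit, others, flagged


self_hit, others, flagged = screen(hits)
best_other = max(others, key=lambda h: h.score)

print(self_hit.chrom, best_other.score, round(100 * best_other.score / best_other.q_size, 2))
print(flagged)
```

With these rows the strongest non-self hit scores 37 against a 3000 bp query, a little over 1% of the query length, so nothing is flagged under the illustrative threshold.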
BLAT Search Results (down)
QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  3000   1       3000  3000   100.0%    chr9   +       85845259   85848258   3000
YourSeq  39     638     873   3000   62.8%     chr13  -       21951160   21951256   97
YourSeq  31     635     666   3000   100.0%    chr5   +       131289164  131289196  33
YourSeq  23     704     732   3000   77.0%     chr11  -       25469245   25469270   26
YourSeq  21     2279    2299  3000   100.0%    chr5   -       74057268   74057288   21
YourSeq  20     322     341   3000   100.0%    chr3   -       17691943   17691962   20
Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
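The chr9 coordinates of the two self hits allow a quick consistency check on the lengths quoted elsewhere in this report. The sketch below assumes the BLAT coordinates are 1-based and inclusive (which is what makes each 3000 bp query span exactly 3000 bases); the variable names are made up for illustration.

```python
# chr9 coordinates of the self hits in the two BLAT tables.
up_start, up_end = 85840981, 85843980        # 3000 bp upstream section
down_start, down_end = 85845259, 85848258    # 3000 bp downstream section


def length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


up_len = length(up_start, up_end)
down_len = length(down_start, down_end)
between = down_start - up_end - 1            # bases strictly between the two sections
total = length(up_start, down_end)           # upstream + between + downstream

print(up_len, between, down_len, total)
```

The total of 7278 bp matches the full length given in the GC content summary. The 1278 bp between the two sections is not the same as the ~1880 bp effective cKO region quoted in the strategy summary; this report does not say how that figure is derived, so the two numbers should not be expected to agree.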
Gene and protein information:
Tpbg trophoblast glycoprotein [Mus musculus (house mouse)]
Gene ID: 21983, updated on 12-Aug-2019
Gene summary
Official Symbol: Tpbg (provided by MGI)
Official Full Name: trophoblast glycoprotein (provided by MGI)
Primary source: MGI:MGI:1341264
See related: Ensembl:ENSMUSG00000035274
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 5T4; WAIF1; AW495680
Expression: Broad expression in CNS E11.5 (RPKM 8.0), limb E14.5 (RPKM 6.2) and 21 other tissues
Orthologs: human, all
Genomic context
Location: 9; 9 E3.1
Exon count: 5
Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  9    NC_000075.6 (85842114..85847057)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    9    NC_000075.5 (85735987..85740662)
[Figure: genomic context of Tpbg on Chromosome 9 - NC_000075.6.]
Transcript information: This gene has 3 transcripts
Gene: Tpbg ENSMUSG00000035274
Description: trophoblast glycoprotein [Source:MGI Symbol;Acc:MGI:1341264]
Gene Synonyms: 5T4, 5T4 oncofetal antigen
Location: Chromosome 9: 85,842,380-85,847,040 forward strand. GRCm38:CM001002.2
About this gene: This gene has 3 transcripts (splice variants), 168 orthologues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.
Transcripts
Name      Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Tpbg-201  ENSMUST00000006559.13  3603  426aa       ENSMUSP00000006559.7  Protein coding  CCDS23380  Q9Z0L0   TSL:1 GENCODE basic APPRIS P1
Tpbg-202  ENSMUST00000098500.4   3481  426aa       ENSMUSP00000096101.3  Protein coding  CCDS23380  Q9Z0L0   TSL:1 GENCODE basic APPRIS P1
Tpbg-203  ENSMUST00000185559.1   409   No protein  -                     lncRNA          -          -        TSL:2
[Figure: Ensembl region view, 24.66 kb, chromosome 9 from 85.835 Mb to 85.855 Mb. Forward strand genes (comprehensive set): Tpbg-201 (protein coding), Tpbg-202 (protein coding), Tpbg-203 (lncRNA). Contig: AC158516.2. Reverse strand genes (comprehensive set): 9330154J02Rik-201, 9330154J02Rik-204 and 9330154J02Rik-202 (lncRNA), 9330154J02Rik-203 (retained intron). Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]
Transcript: ENSMUST00000006559
[Figure: protein feature view of Tpbg-201 (protein coding; 4.66 kb, forward strand), scale bar 0 to 426. Tracks as labeled in the source: ENSMUSP00000006...; Transmembrane heli...; Low complexity (Seg); Cleavage site (Sign...; Superfamily SSF52058; SMART (Leucine-rich repeat N-terminal domain; Cysteine-rich flanking region, C-terminal; Leucine-rich repeat, typical subtype; SM00364); Prints PR00019; Pfam (Leucine-rich repeat; Leucine-rich repeat N-terminal domain); PROSITE profiles (Leucine-rich repeat); PANTHER (PTHR24364:SF17; PTHR24364); Gene3D (Leucine-rich repeat domain superfamily); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]
We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.