https://www.alphaknockout.com

Mouse Tpbg Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tpbg conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tpbg gene (NCBI Reference Sequence: NM_011627; Ensembl: ENSMUSG00000035274) is located on mouse chromosome 9. Two exons are identified, with both the ATG start codon and the TGA stop codon in exon 2 (Transcript: ENSMUST00000006559). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tpbg gene.

To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-304F8 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit low-penetrance hydrocephaly and premature death. Embryonic stem cells isolated from these mice exhibit impaired mesenchyme differentiation and reduced chemotaxis following differentiation.

Exon 2 covers 100.0% of the coding region; both the start codon and the stop codon are in exon 2. The size of the effective cKO region: ~1880 bp. The cKO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1 and 2, with the gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Tpbg; cKO region; loxP site.]
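The last step of the schematic, in which Cre recombination converts the targeted allele into the constitutive KO allele, can be illustrated with a minimal string-level sketch. It assumes the canonical 34-bp loxP sequence and two sites in the same orientation; the function name `cre_excise` and the toy allele are hypothetical and are not taken from the project's actual vector design.

```python
# Canonical 34-bp loxP site (13-bp arm, 8-bp spacer, 13-bp arm).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Return the allele after excision between the first two sites.

    Only the simplest case is modeled: two direct-repeat sites.
    The sequence between them is removed and a single site remains.
    """
    first = allele.find(site)
    second = allele.find(site, first + len(site)) if first != -1 else -1
    if first == -1 or second == -1:
        return allele  # fewer than two sites: nothing to excise
    return allele[:first] + allele[second:]


# Hypothetical toy allele: 5' arm, loxP, stand-in "exon", loxP, 3' arm.
targeted = "ccgg" + LOXP + "ATGGAGTGA" + LOXP + "ttaa"
ko = cre_excise(targeted)
print(ko == "ccgg" + LOXP + "ttaa")
```

The sketch ignores site orientation and strand, so it only mirrors the floxed-exon case drawn in the schematic.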

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
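As a rough illustration of the self-alignment described in the note, the sketch below records every pair of positions whose 10-bp windows are identical. This is an assumed, simplified version of a dot plot (exact forward-strand matches only, no reverse complement); the function name `self_matches` and the toy sequence are hypothetical. A tandem repeat shows up as matches whose offsets are multiples of the repeat unit length.

```python
from collections import defaultdict


def self_matches(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return (i, j) pairs, i < j, where the windows at i and j are identical."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    pairs = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                pairs.append((starts[a], starts[b]))
    return sorted(pairs)


# Hypothetical toy sequence: a 4-bp unit repeated five times between flanks.
toy = "GATTACCAGT" + "ACGT" * 5 + "TTGCAAGGCT"
offsets = {j - i for i, j in self_matches(toy, window=10)}
print(sorted(offsets))
```

In a plotted matrix these pairs would appear as short diagonals parallel to the main diagonal.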

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(7278bp) | A(24.33% 1771) | C(23.1% 1681) | T(27.7% 2016) | G(24.87% 1810)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
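The overall GC content follows directly from the base counts in the summary line, and a sliding-window profile like the one plotted can be sketched in a few lines. The helper names `gc_fraction` and `windowed_gc` are hypothetical, and the step size of 1 is an assumption; only the 300 bp window size comes from the report.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC fraction for each full window along seq."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Overall GC content from the base counts in the summary line.
counts = {"A": 1771, "C": 1681, "T": 2016, "G": 1810}
total = sum(counts.values())                      # 7278, the stated full length
overall_gc = (counts["G"] + counts["C"]) / total  # about 0.48
print(total, round(overall_gc * 100, 2))
```

An overall value near 48% is unremarkable, so the difficulty flagged in the note concerns local windows rather than the sequence as a whole.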

BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr9   +       85840981   85843980   3000
YourSeq  37     143     268   3000   79.5%     chr19  +       60166443   60166559   117
YourSeq  36     98      165   3000   97.4%     chr2   -       63505261   63505599   339
YourSeq  34     131     166   3000   100.0%    chr19  +       17510471   17510516   46
YourSeq  32     133     165   3000   100.0%    chr10  -       36202251   36202286   36
YourSeq  31     136     166   3000   100.0%    chr12  +       50447412   50447442   31
YourSeq  31     136     166   3000   100.0%    chr12  +       36847566   36847596   31
YourSeq  30     136     166   3000   100.0%    chr16  +       73028123   73028157   35
YourSeq  30     137     166   3000   100.0%    chr15  +       96921933   96921962   30
YourSeq  29     133     161   3000   100.0%    chr14  +       71741656   71741684   29
YourSeq  29     136     165   3000   100.0%    chr14  +       39429831   39429861   31
YourSeq  29     133     161   3000   100.0%    chr10  +       51254095   51254123   29
YourSeq  28     139     166   3000   100.0%    chr18  -       63220576   63220603   28
YourSeq  28     139     166   3000   100.0%    chr14  -       116493620  116493647  28
YourSeq  28     139     166   3000   100.0%    chr11  -       55706075   55706102   28
YourSeq  28     531     568   3000   86.9%     chr1   -       145710580  145710617  38
YourSeq  28     134     165   3000   90.0%     chr17  +       84162791   84162821   31
YourSeq  28     139     166   3000   100.0%    chr15  +       22960480   22960507   28
YourSeq  27     139     165   3000   100.0%    chr18  -       68138476   68138502   27
YourSeq  27     139     165   3000   100.0%    chr16  -       58513286   58513312   27

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr9   +       85845259   85848258   3000
YourSeq  39     638     873   3000   62.8%     chr13  -       21951160   21951256   97
YourSeq  31     635     666   3000   100.0%    chr5   +       131289164  131289196  33
YourSeq  23     704     732   3000   77.0%     chr11  -       25469245   25469270   26
YourSeq  21     2279    2299  3000   100.0%    chr5   -       74057268   74057288   21
YourSeq  20     322     341   3000   100.0%    chr3   -       17691943   17691962   20

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
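The chr9 coordinates of the two self-matches can be cross-checked with simple interval arithmetic. The sketch below treats the coordinates as 1-based and inclusive, which is consistent with the SPAN column (3000) in both tables. Reading the total as the 7278 bp full length reported in the GC content summary is an inference from the numbers, not something the report states.

```python
def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval."""
    return end - start + 1


# chr9 coordinates of the top hit in each BLAT table above.
up_start, up_end = 85840981, 85843980
down_start, down_end = 85845259, 85848258

up_len = span(up_start, up_end)        # upstream section
down_len = span(down_start, down_end)  # downstream section
between = down_start - up_end - 1      # bases between the two sections
total = span(up_start, down_end)       # upstream start to downstream end
print(up_len, down_len, between, total)
```

Under these assumptions the two 3000 bp sections enclose 1278 bp, and the three parts together give 7278 bp.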

Gene and protein information:

Tpbg trophoblast glycoprotein [ Mus musculus (house mouse) ]
Gene ID: 21983, updated on 12-Aug-2019

Gene summary

Official Symbol: Tpbg (provided by MGI)
Official Full Name: trophoblast glycoprotein (provided by MGI)
Primary source: MGI:MGI:1341264
See related: Ensembl:ENSMUSG00000035274
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 5T4; WAIF1; AW495680
Expression: Broad expression in CNS E11.5 (RPKM 8.0), limb E14.5 (RPKM 6.2) and 21 other tissues
Orthologs: human, all

Genomic context

Location: 9; 9 E3.1

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  9    NC_000075.6 (85842114..85847057)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    9    NC_000075.5 (85735987..85740662)

Chromosome 9 - NC_000075.6

Transcript information: This gene has 3 transcripts

Gene: Tpbg ENSMUSG00000035274

Description: trophoblast glycoprotein [Source:MGI Symbol;Acc:MGI:1341264]
Gene Synonyms: 5T4, 5T4 oncofetal
Location: Chromosome 9: 85,842,380-85,847,040 forward strand. GRCm38:CM001002.2
About this gene: This gene has 3 transcripts (splice variants), 168 orthologues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Tpbg-201  ENSMUST00000006559.13  3603  426aa       ENSMUSP00000006559.7  Protein coding  CCDS23380  Q9Z0L0   TSL:1 GENCODE basic APPRIS P1
Tpbg-202  ENSMUST00000098500.4   3481  426aa       ENSMUSP00000096101.3  Protein coding  CCDS23380  Q9Z0L0   TSL:1 GENCODE basic APPRIS P1
Tpbg-203  ENSMUST00000185559.1   409   No protein  -                     lncRNA          -          -        TSL:2

[Figure: Ensembl genome browser view of a 24.66 kb region (85.835 Mb to 85.855 Mb). Forward strand: Tpbg-201 (protein coding), Tpbg-202 (protein coding), Tpbg-203 (lncRNA). Contig: AC158516.2. Reverse strand: 9330154J02Rik-201, 9330154J02Rik-204 and 9330154J02Rik-202 (lncRNA), 9330154J02Rik-203 (retained intron). Regulation legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]

Transcript: ENSMUST00000006559

[Figure: Ensembl protein domain view of Tpbg-201 (protein coding; 4.66 kb, forward strand), with a scale bar from 0 to 426. Tracks: ENSMUSP00000006...; Transmembrane heli...; Low complexity (Seg); Cleavage site (Sign...; Superfamily (SSF52058); SMART (Leucine-rich repeat N-terminal domain; Cysteine-rich flanking region, C-terminal; Leucine-rich repeat, typical subtype; SM00364); Prints (PR00019); Pfam (Leucine-rich repeat; Leucine-rich repeat N-terminal domain); PROSITE profiles (Leucine-rich repeat); PANTHER (PTHR24364:SF17; PTHR24364); Gene3D (Leucine-rich repeat domain superfamily); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
