https://www.alphaknockout.com

Mouse Tfpt Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tfpt conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tfpt gene (NCBI Reference Sequence: NM_001290381; Ensembl: ENSMUSG00000006335) is located on mouse chromosome 7. Six exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 6 (Transcript: ENSMUST00000108641). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tfpt gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-146B24 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 36.42% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 4158 bp, and the size of intron 3 for 3'-loxP site insertion is 3466 bp. The size of the effective cKO region is ~571 bp. The cKO region does not contain any other known gene.
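The frameshift statement rests on a simple rule: removing an exon shifts the downstream reading frame whenever its coding length is not a multiple of three. A minimal Python sketch of that check, together with a rough conversion of the 36.42% figure into a nucleotide offset, might look as follows. The function names are hypothetical, the exon length passed in is a placeholder (this report does not state the length of exon 3), and the 777 nt coding length assumes the 259 aa product of ENSMUST00000108641 with the stop codon excluded.

```python
def causes_frameshift(exon_coding_len: int) -> bool:
    """True when deleting this many coding nucleotides shifts the reading frame."""
    return exon_coding_len % 3 != 0


def cds_offset(fraction: float, cds_len: int) -> int:
    """Approximate number of coding nucleotides that precede a given CDS fraction."""
    return round(fraction * cds_len)


# Assumption: 259 codons (259 aa protein), stop codon not counted.
CDS_LEN = 259 * 3

# Roughly how far into the coding sequence exon 3 begins.
print(cds_offset(0.3642, CDS_LEN))

# Placeholder exon length, NOT taken from the report.
HYPOTHETICAL_EXON_LEN = 100
print(causes_frameshift(HYPOTHETICAL_EXON_LEN))
```

Under these assumptions the offset comes out at about 283 nt, so roughly the first third of the coding sequence would still be translated before the frame is disrupted.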


Overview of the Targeting Strategy

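[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (5' to 3', with two gRNA regions marked and exons 1, 3 and 6 labelled), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Tfpt, homology arm, cKO region, loxP site.]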
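The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model: Cre recombination between two loxP sites in the same orientation excises the intervening segment and leaves a single loxP site behind. The sketch below is illustrative only. The arm and exon strings are placeholders rather than Tfpt sequence, `cre_excise` is a hypothetical helper, and the 34 bp site is the commonly cited wild-type loxP sequence.

```python
# Commonly cited 34 bp wild-type loxP site (13 bp arm, 8 bp spacer, 13 bp arm).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Drop the segment between the first two direct-repeat loxP sites, keeping one site."""
    first = allele.find(loxp)
    if first == -1:
        return allele  # no loxP site: nothing to recombine
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele  # a single site cannot mediate excision
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Placeholder sequences, not real Tfpt sequence.
five_prime_flank = "GGGGCCCC"
floxed_exon = "ACGTACGTAC"
three_prime_flank = "CCCCGGGG"

targeted = five_prime_flank + LOXP + floxed_exon + LOXP + three_prime_flank
knockout = cre_excise(targeted)

# The deleted length is the floxed segment plus one loxP site.
print(len(targeted) - len(knockout))
```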


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, with forward and reverse-complement matches shown; axis label: Sequence 12.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
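A self-alignment dot plot of this kind can be approximated by indexing every fixed-length word in the sequence and recording pairs of positions that share a word, once for the forward strand and once for the reverse complement. The following Python sketch shows the idea under simplifying assumptions: exact word matches only, and a window of 10 taken from the figure header. `dot_plot_hits` and `revcomp` are hypothetical helper names, and this is not necessarily how the plot in this report was produced.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dot_plot_hits(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: seq[i:i+window] == seq[j:j+window] with i < j
             (the trivial main diagonal i == j is skipped).
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window].
    """
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward = [(i, j) for pos in index.values() for i in pos for j in pos if i < j]

    reverse = []
    rc = revcomp(seq)
    n = len(seq)
    for k in range(len(rc) - window + 1):
        j = n - window - k  # forward-strand start of the window that rc[k:k+window] mirrors
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, j))

    return sorted(forward), sorted(reverse)


# Toy input with one repeated word, not real Tfpt sequence.
fwd, rev = dot_plot_hits("AAAACCCCAAAA", window=4)
print(fwd, rev)
```

Runs of consecutive off-diagonal hits would show up as the diagonal lines that indicate repeats; an empty or sparse hit list corresponds to the "no significant tandem repeat" reading.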

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12.]

Summary: Full length (7071 bp) | A (26.49%, 1873) | C (22.34%, 1580) | T (26.90%, 1902) | G (24.27%, 1716)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
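The summary figures and the windowed analysis can be reproduced in a few lines. The Python sketch below recomputes the overall GC fraction from the base counts reported in the summary and shows one way a sliding-window GC profile could be calculated with the 300 bp window named in the figure header. The function names and the `step` parameter are assumptions for illustration; the actual sequence is not included in this report, so the profile function is demonstrated on a toy string.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each full-length window along the sequence."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts as reported in the summary line.
counts = {"A": 1873, "C": 1580, "T": 1902, "G": 1716}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(f"{total} bp, overall GC = {overall_gc:.2%}")

# Toy demonstration of the windowed profile (not real Tfpt sequence).
print(sliding_gc("GGCCAATT", window=4, step=2))
```

The counts sum to the stated 7071 bp, and the overall GC content works out to about 46.6%.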


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr7   -       3624920    3627919    3000
YourSeq  303    2321    2782  3000   83.8%     chr13  +       8753268    8753727    460
YourSeq  292    2313    2758  3000   84.4%     chr1   +       131705875  131706424  550
YourSeq  282    2298    2786  3000   86.2%     chr19  -       42154938   42155411   474
YourSeq  281    2298    2752  3000   85.2%     chr12  -       39283481   39283939   459
YourSeq  280    2315    2782  3000   82.2%     chr19  -       32468269   32468728   460
YourSeq  276    2316    2782  3000   82.6%     chr7   +       120405043  120405512  470
YourSeq  272    2314    2782  3000   85.6%     chr7   +       13151779   13152245   467
YourSeq  263    2387    2782  3000   86.0%     chr9   +       114669101  114669545  445
YourSeq  253    2297    2739  3000   83.5%     chr3   +       136040643  136041131  489
YourSeq  250    2321    2779  3000   83.2%     chrX   -       77698094   77698577   484
YourSeq  249    2329    2782  3000   87.3%     chr7   +       24334522   24334976   455
YourSeq  247    2352    2782  3000   87.1%     chr1   +       16717337   16717758   422
YourSeq  246    2298    2786  3000   87.6%     chr17  -       34272393   34273151   759
YourSeq  246    2317    2788  3000   83.1%     chr10  -       24370495   24371215   721
YourSeq  245    2335    2782  3000   84.7%     chr17  -       50724306   50724757   452
YourSeq  245    2387    2785  3000   83.9%     chr1   -       152394961  152395400  440
YourSeq  245    2329    2783  3000   84.0%     chr8   +       91056896   91057290   395
YourSeq  241    2295    2779  3000   85.7%     chr3   -       8319890    8320367    478
YourSeq  235    2314    2705  3000   86.4%     chr13  -       18080241   18080634   394

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. Apart from the query locus itself, no significant similarity is found.
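One way to make the "no significant similarity" reading concrete is to separate the self hit (the query locus) from the remaining hits and look at the score and query coverage of the best secondary hit. The Python sketch below does this for the first three rows of the upstream table. The `BlatHit` type, the helper names and the self-hit criterion (100% identity over the full query length) are assumptions for illustration, and the report does not state what threshold it treats as significant.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT summary row (query name in column 1)."""
    f = line.split()
    if f[:2] == ["browser", "details"]:  # tolerate the link text left by web output
        f = f[2:]
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_hit(hit: BlatHit) -> bool:
    """Assumed criterion: perfect identity across the whole query."""
    return hit.identity == 100.0 and hit.q_end - hit.q_start + 1 == hit.q_size


def query_coverage(hit: BlatHit) -> float:
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream BLAT table.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr7 - 3624920 3627919 3000",
    "YourSeq 303 2321 2782 3000 83.8% chr13 + 8753268 8753727 460",
    "YourSeq 292 2313 2758 3000 84.4% chr1 + 131705875 131706424 550",
]
hits = [parse_hit(r) for r in rows]
secondary = [h for h in hits if not is_self_hit(h)]
best = max(secondary, key=lambda h: h.score)
print(best.chrom, best.score, f"{query_coverage(best):.1%}", f"{best.identity}%")
```

For these rows the best secondary hit scores 303 against 3000 for the self hit and spans only about 15% of the query.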

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr7   -       3621349    3624348    3000
YourSeq  180    2503    2900  3000   84.9%     chr6   +       94519417   94519782   366
YourSeq  177    2519    2924  3000   89.4%     chr1   +       167260286  167260717  432
YourSeq  176    2520    2919  3000   84.1%     chr8   +       95147268   95147614   347
YourSeq  166    2432    2918  3000   85.5%     chr19  +       45525709   45526159   451
YourSeq  161    2499    2919  3000   88.3%     chr4   +       108275248  108275691  444
YourSeq  158    2732    2919  3000   92.6%     chr1   +       181129057  181129246  190
YourSeq  157    2737    2919  3000   94.5%     chr1   +       35954952   35955136   185
YourSeq  155    2735    2919  3000   92.9%     chr11  -       106227730  106227916  187
YourSeq  154    2739    2928  3000   91.5%     chr2   +       157196451  157196640  190
YourSeq  153    2389    2916  3000   82.9%     chr10  +       75816732   75817123   392
YourSeq  151    2738    2917  3000   92.3%     chr3   -       89623263   89623443   181
YourSeq  151    2737    2919  3000   91.9%     chr5   +       66222173   66222357   185
YourSeq  151    2736    2919  3000   92.3%     chr12  +       78830245   78830431   187
YourSeq  150    2738    2919  3000   94.7%     chr6   +       134811548  134811731  184
YourSeq  150    2372    2920  3000   80.4%     chr18  +       20761950   20762178   229
YourSeq  149    2738    2919  3000   91.3%     chr11  -       6629141    6629323    183
YourSeq  149    2428    2919  3000   80.6%     chr1   -       58437806   58438212   407
YourSeq  148    2746    2919  3000   93.2%     chr9   -       32151350   32151525   176
YourSeq  148    1523    1999  3000   94.1%     chr10  -       78971530   78972043   514

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. Apart from the query locus itself, no significant similarity is found.
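The self hits of the two BLAT queries give the chr7 coordinates of the 3000 bp sections on either side of exon 3, which allows a quick arithmetic cross-check against the stated cKO region size. The Python sketch below computes the interval between the two sections and the overall gene span from the coordinates given in this report. The helper names are hypothetical, and reading the gap as the cKO region is an interpretation rather than something the report states explicitly.

```python
def span(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def gap_between(lower: tuple, upper: tuple) -> int:
    """Number of bases strictly between two non-overlapping inclusive intervals."""
    return upper[0] - lower[1] - 1


# chr7 self-hit coordinates from the BLAT tables (the gene is on the reverse
# strand, so the "upstream" section has the higher coordinates).
upstream_section = (3624920, 3627919)
downstream_section = (3621349, 3624348)

# Gene coordinates from the genomic context (GRCm38.p6, complement strand).
gene = (3620324, 3629929)

print(span(*upstream_section), span(*downstream_section))
print(gap_between(downstream_section, upstream_section))
print(span(*gene))
```

The gap comes out at 571 bp, which appears consistent with the ~571 bp effective cKO region, and the gene span of 9606 bp agrees with the 9.61 kb shown for the transcript view.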


Gene and information: Tfpt TCF3 (E2A) fusion partner [ Mus musculus (house mouse) ] Gene ID: 69714, updated on 12-Aug-2019

Gene summary

Official Symbol: Tfpt (provided by MGI)
Official Full Name: TCF3 (E2A) fusion partner (provided by MGI)
Primary source: MGI:MGI:1916964
See related: Ensembl:ENSMUSG00000006335
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: FB1; Amida; AI450389; 2400004F01Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 12.3), CNS E14 (RPKM 9.8) and 28 other tissues
Orthologs: human

Genomic context

Location: 7; 7 A1

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (3620324..3629929, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (3571926..3581486, complement)

Chromosome 7 - NC_000073.6


Transcript information: This gene has 7 transcripts

Gene: Tfpt ENSMUSG00000006335

Description: TCF3 (E2A) fusion partner [Source:MGI Symbol;Acc:MGI:1916964]
Gene Synonyms: 2400004F01Rik, Amida, FB1
Location: Chromosome 7: 3,620,324-3,629,929 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 7 transcripts (splice variants), 105 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Tfpt-202  ENSMUST00000108641.9  1166  259aa       ENSMUSP00000104281.3  Protein coding           CCDS71873  Q3U1J1      TSL:1, GENCODE basic, APPRIS ALT2
Tfpt-204  ENSMUST00000155592.7  1091  249aa       ENSMUSP00000123636.1  Protein coding           CCDS20720  Q3U1J1      TSL:1, GENCODE basic, APPRIS P3
Tfpt-206  ENSMUST00000205596.1  315   84aa        ENSMUSP00000145936.1  Protein coding           -          A0A0U1RPD5  CDS 3' incomplete, TSL:5
Tfpt-207  ENSMUST00000206370.1  722   97aa        ENSMUSP00000146255.1  Nonsense mediated decay  -          A0A0U1RQ56  TSL:5
Tfpt-201  ENSMUST00000058880.8  710   156aa       ENSMUSP00000053108.8  Nonsense mediated decay  -          F8WIZ9      CDS 5' incomplete, TSL:5
Tfpt-203  ENSMUST00000153143.1  389   45aa        ENSMUSP00000120930.1  Nonsense mediated decay  -          D6RHM4      TSL:3
Tfpt-205  ENSMUST00000156194.2  562   No protein  -                     lncRNA                   -          -           TSL:3


[Figure: Ensembl genome browser view of a 29.61 kb region of chromosome 7 (3.615 Mb to 3.635 Mb). Forward strand: Ndufa3 transcripts (Ndufa3-201 to Ndufa3-206) and Prpf31 transcripts (Prpf31-201 to Prpf31-206). Contig: AC171680.5. Reverse strand: Oscar transcripts (Oscar-201 to Oscar-204), Tfpt transcripts (Tfpt-201 to Tfpt-207) and Gm15927-203 (lncRNA). Tracks: Genes (Comprehensive set), Regulatory Build. Regulation legend: CTCF, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000108641

[Figure: Ensembl protein view of Tfpt-202 (protein coding; reverse strand, 9.61 kb). Tracks for ENSMUSP00000104...: MobiDB lite, low complexity (Seg), coiled-coils (Ncoils), PANTHER (TCF3 fusion partner), and sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 259.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
