https://www.alphaknockout.com

Mouse Tfdp2 Knockout Project (CRISPR/Cas9)

Objective: To create a Tfdp2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tfdp2 gene (NCBI Reference Sequence: NM_001184706; Ensembl: ENSMUSG00000032411) is located on mouse chromosome 9. Thirteen exons are identified, with the ATG start codon in exon 4 and the TAA stop codon in exon 13 (Transcript: ENSMUST00000034982). Exons 6~9 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 6 starts from about 10.91% of the coding region. Exons 6~9 cover 36.71% of the coding region. The size of the effective KO region is ~9861 bp. The KO region does not contain any other known gene.
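The percentages in the note can be converted into approximate base-pair counts. The sketch below is illustrative only: it assumes the percentages are relative to the coding sequence of the 385 aa protein listed for ENSMUST00000034982, excluding the stop codon, which the report does not state. The function name `percent_to_bp` is hypothetical.

```python
# Assumption (not stated in the report): the coding region is the CDS of the
# 385 aa protein of ENSMUST00000034982, without the stop codon.
PROTEIN_LENGTH_AA = 385
CDS_LENGTH_BP = PROTEIN_LENGTH_AA * 3  # 1155 bp


def percent_to_bp(percent, cds_length=CDS_LENGTH_BP):
    """Convert a percentage of the coding region into a rounded bp count."""
    return round(percent / 100 * cds_length)


# Coding bases that precede exon 6 (10.91% of the coding region).
bp_before_exon6 = percent_to_bp(10.91)
# Coding bases contained in exons 6~9 (36.71% of the coding region).
bp_in_exons_6_to_9 = percent_to_bp(36.71)

print(bp_before_exon6, bp_in_exons_6_to_9)
```

Under this assumption the targeted exons would remove roughly 424 coding bases, starting after about 126 coding bases. These values are rounded back from the reported percentages and should be confirmed against the transcript annotation.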


Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1, 6, 7, 8, 9 and 13 of mouse Tfdp2, with two gRNA regions marked. Legend: exon of mouse Tfdp2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 6 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
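A self-comparison of this kind can be sketched with exact window matching. This is a minimal illustration, not the tool used to produce the report: the function names are hypothetical, and the toy sequence is made up because the 2000 bp sequence itself is not included here.

```python
from collections import defaultdict

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=15):
    """Find exact window matches of a sequence against itself.

    Returns (forward, reverse): lists of (i, j) start positions where the
    window at i equals the window at j (forward, i < j, so the trivial main
    diagonal is excluded) or equals the reverse complement of the window
    at j (reverse).
    """
    seq = seq.upper()
    n = len(seq)
    index = defaultdict(list)
    for i in range(n - window + 1):
        index[seq[i:i + window]].append(i)

    forward = sorted(
        (i, j) for pos in index.values() for i in pos for j in pos if i < j
    )

    rc = reverse_complement(seq)
    # Window j of the reverse complement corresponds to the original
    # window starting at n - window - j.
    reverse = sorted(
        (i, n - window - j)
        for j in range(n - window + 1)
        for i in index.get(rc[j:j + window], [])
    )
    return forward, reverse


# Toy example (made-up sequence): a 7 bp unit repeated after a spacer.
toy = "ACGGTCA" + "TTTTT" + "ACGGTCA"
forward, reverse = self_dot_plot(toy, window=7)
print(forward, reverse)
```

Off-diagonal forward matches that line up into long runs would indicate direct or tandem repeats. An empty or sparse result, as the note reports for this region, suggests primers can be placed without repeat-driven mispriming.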

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 9 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(28.8% 576) | C(18.7% 374) | T(33.05% 661) | G(19.45% 389)

Note: The 2000 bp section upstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
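The summary counts above are enough to check the overall GC content, and a sliding window gives the per-position profile. The sketch below is an illustration with hypothetical function names. The 300 bp window follows the figure heading, but the step size is an assumption, and the sequence itself is not part of this report.

```python
from collections import Counter


def base_composition(seq):
    """Return {base: (count, percent)} for A, C, T and G."""
    counts = Counter(seq.upper())
    total = len(seq)
    return {b: (counts[b], round(100 * counts[b] / total, 2)) for b in "ACTG"}


def gc_windows(seq, window=300, step=1):
    """Return the GC percentage of each window along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(100 * gc / window)
    return profile


# Overall GC content recomputed from the reported upstream counts.
reported_counts = {"A": 576, "C": 374, "T": 661, "G": 389}
total_bases = sum(reported_counts.values())
overall_gc = 100 * (reported_counts["G"] + reported_counts["C"]) / total_bases
print(total_bases, overall_gc)
```

The four counts sum to 2000 and give an overall GC content of about 38%, consistent with the note that no high-GC region is present. Applying `gc_windows` to the actual sequence would reproduce the plotted profile up to the choice of step.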

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(22.75% 455) | C(22.7% 454) | T(29.5% 590) | G(25.05% 501)

Note: The 2000 bp section downstream of Exon 9 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr9   +       96288581   96290580   2000
YourSeq  50     42     378   2000   60.9%     chr8   +       111330408  111330493  86
YourSeq  47     1345   1423  2000   94.5%     chr1   +       133628207  133628287  81
YourSeq  46     1271   1395  2000   90.8%     chr11  +       109405900  109406267  368
YourSeq  45     1445   1532  2000   92.5%     chr15  -       73650876   73650964   89
YourSeq  43     1346   1539  2000   90.4%     chr14  -       106237953  106238152  200
YourSeq  42     1374   1459  2000   90.4%     chr2   -       130456205  130456304  100
YourSeq  42     1361   1423  2000   88.0%     chr1   +       189925395  189925458  64
YourSeq  41     1372   1423  2000   90.4%     chr12  -       83994493   83994546   54
YourSeq  37     1382   1471  2000   93.1%     chr5   -       130094477  130094566  90
YourSeq  36     1381   1421  2000   95.2%     chr9   -       80080405   80080447   43
YourSeq  35     7      61    2000   83.1%     chr11  -       112713616  112713670  55
YourSeq  34     1346   1397  2000   97.3%     chrX   -       20646353   20646406   54
YourSeq  34     1350   1464  2000   57.5%     chr1   +       66417854   66417904   51
YourSeq  33     24     65    2000   90.3%     chr12  +       13670454   13670501   48
YourSeq  32     5      45    2000   97.1%     chr10  -       74416921   74416963   43
YourSeq  32     1352   1397  2000   84.8%     chr13  +       98615119   98615164   46
YourSeq  31     467    564   2000   97.0%     chr1   -       160790600  160790697  98
YourSeq  31     1382   1421  2000   97.1%     chr13  +       33995082   33995123   42
YourSeq  30     1445   1513  2000   88.3%     chr11  -       53254547   53254614   68

Note: The 2000 bp section upstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
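One way to read a hit list like this is to separate the full-length self-hit from the secondary hits and look at the best secondary score. The sketch below parses the first three rows of the upstream table. The `BlatHit` field names and the parsing function are hypothetical, and the report does not state what score cutoff was used to call a hit significant.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row):
    """Parse one whitespace-separated row of the BLAT result table."""
    fields = row.split()
    # Drop the "browser details" link text if it is still present.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    # fields[0] is the query name (YourSeq).
    return BlatHit(
        int(fields[1]), int(fields[2]), int(fields[3]), int(fields[4]),
        float(fields[5].rstrip("%")), fields[6], fields[7],
        int(fields[8]), int(fields[9]), int(fields[10]),
    )


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr9 + 96288581 96290580 2000",
    "YourSeq 50 42 378 2000 60.9% chr8 + 111330408 111330493 86",
    "YourSeq 47 1345 1423 2000 94.5% chr1 + 133628207 133628287 81",
]
hits = [parse_blat_row(r) for r in rows]
self_hit = max(hits, key=lambda h: h.score)
secondary = [h for h in hits if h is not self_hit]
best_secondary = max(h.score for h in secondary)
print(self_hit.chrom, best_secondary)
```

For the upstream section the best secondary score is 50 against a self-hit score of 2000, which agrees with the note that no significant similarity is found elsewhere in the genome.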

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr9   +       96300442   96302441   2000
YourSeq  39     1760   1856  2000   70.2%     chr7   -       105494355  105494451  97
YourSeq  29     1758   1790  2000   94.0%     chr1   -       36122951   36122983   33
YourSeq  21     1836   1856  2000   100.0%    chrX   -       75231821   75231841   21

Note: The 2000 bp section downstream of Exon 9 is BLAT searched against the genome. No significant similarity is found.
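The full-length chr9 hits of the two flanking sections can be used to cross-check the KO region size given in the strategy summary. The sketch below assumes the coordinates are 1-based and inclusive, which is consistent with each hit spanning exactly 2000 bp, and assumes the two 2000 bp sections directly abut the KO region. Both assumptions are inferred rather than stated in the report.

```python
# Full-length hits on chr9, taken from the two BLAT tables.
UP_FLANK_START, UP_FLANK_END = 96288581, 96290580      # upstream of Exon 6
DOWN_FLANK_START, DOWN_FLANK_END = 96300442, 96302441  # downstream of Exon 9

# Gene extent on chromosome 9 as listed in the Ensembl gene summary.
GENE_START, GENE_END = 96196275, 96323646

# Inclusive coordinates: each flank should span exactly 2000 bp.
up_span = UP_FLANK_END - UP_FLANK_START + 1
down_span = DOWN_FLANK_END - DOWN_FLANK_START + 1

# Bases strictly between the two flanks.
ko_size = DOWN_FLANK_START - UP_FLANK_END - 1

# The whole targeted interval should sit inside the gene.
inside_gene = GENE_START <= UP_FLANK_START and DOWN_FLANK_END <= GENE_END

print(up_span, down_span, ko_size, inside_gene)
```

The interval between the flanks comes to 9861 bp, matching the ~9861 bp effective KO region, and both flanks fall within the annotated Tfdp2 locus.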


Gene and information: Tfdp2 transcription factor Dp 2 [Mus musculus (house mouse)]
Gene ID: 211586, updated on 10-Oct-2019

Gene summary

Official Symbol: Tfdp2 (provided by MGI)
Official Full Name: transcription factor Dp 2 (provided by MGI)
Primary source: MGI:MGI:107167
See related: Ensembl:ENSMUSG00000032411
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: DP3; DP-3; 1110029I05Rik; A330080J22Rik
Expression: Ubiquitous expression in liver E14 (RPKM 17.0), liver E14.5 (RPKM 15.8) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 9; 9 E3.3
Exon count: 18

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  9    NC_000075.6 (96196246..96323646)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    9    NC_000075.5 (96096694..96224065)

Chromosome 9 - NC_000075.6


Transcript information: This gene has 16 transcripts

Gene: Tfdp2 ENSMUSG00000032411

Description: transcription factor Dp 2 [Source:MGI Symbol;Acc:MGI:107167]
Gene Synonyms: 1110029I05Rik, A330080J22Rik, DP-3, DP3
Location: Chromosome 9: 96,196,275-96,323,646 forward strand. GRCm38:CM001002.2
About this gene: This gene has 16 transcripts (splice variants), 157 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Tfdp2-201 ENSMUST00000034982.15 7354 385aa ENSMUSP00000034982.9 Protein coding CCDS23413 Q8BHD2 TSL:1 GENCODE basic APPRIS P3

Tfdp2-206 ENSMUST00000185644.6 2646 385aa ENSMUSP00000140061.1 Protein coding CCDS23413 Q8BHD2 TSL:1 GENCODE basic APPRIS P3

Tfdp2-205 ENSMUST00000179416.7 2408 385aa ENSMUSP00000137176.1 Protein coding CCDS23413 Q8BHD2 TSL:1 GENCODE basic APPRIS P3

Tfdp2-202 ENSMUST00000165120.8 2341 310aa ENSMUSP00000132934.2 Protein coding CCDS52893 E9PWL5 TSL:1 GENCODE basic

Tfdp2-204 ENSMUST00000179065.7 2213 359aa ENSMUSP00000136817.1 Protein coding CCDS52892 J3QK26 TSL:1 GENCODE basic

Tfdp2-203 ENSMUST00000165768.3 2212 386aa ENSMUSP00000128260.2 Protein coding CCDS57690 F6QG91 TSL:1 GENCODE basic APPRIS ALT1

Tfdp2-209 ENSMUST00000188750.6 1713 446aa ENSMUSP00000139926.1 Protein coding - A0A087WPU8 TSL:5 GENCODE basic APPRIS ALT1

Tfdp2-208 ENSMUST00000188008.6 1576 368aa ENSMUSP00000139848.1 Protein coding - A0A1B0GXK0 CDS 3' incomplete TSL:5

Tfdp2-211 ENSMUST00000189606.6 1340 369aa ENSMUSP00000141084.1 Protein coding - A0A087WSK4 CDS 3' incomplete TSL:5

Tfdp2-207 ENSMUST00000186609.6 769 203aa ENSMUSP00000139891.1 Protein coding - A0A087WPS0 CDS 3' incomplete TSL:5

Tfdp2-210 ENSMUST00000188829.6 615 112aa ENSMUSP00000140359.1 Protein coding - A0A087WQV5 CDS 3' incomplete TSL:3

Tfdp2-215 ENSMUST00000191133.1 550 117aa ENSMUSP00000139395.1 Protein coding - A0A087WNL4 CDS 5' incomplete TSL:2

Tfdp2-212 ENSMUST00000190104.6 363 39aa ENSMUSP00000140797.1 Protein coding - A0A087WRW3 CDS 3' incomplete TSL:3

Tfdp2-216 ENSMUST00000191390.1 3151 No protein - Retained intron - - TSL:NA

Tfdp2-213 ENSMUST00000190689.1 3148 No protein - Retained intron - - TSL:NA

Tfdp2-214 ENSMUST00000190709.6 449 No protein - lncRNA - - TSL:2


[Figure: Ensembl genomic view of the Tfdp2 locus, 147.37 kb, 96.20 Mb to 96.30 Mb. Forward strand: the 16 Tfdp2 transcripts (Tfdp2-201 to Tfdp2-216; protein coding, retained intron and lncRNA) and Gm37195-201 (TEC). Contigs: AC166175.2 and AC117248.3. Reverse strand: Gm28924-201 (processed pseudogene) and Atp1b3-201 (protein coding). Regulatory Build track with legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript, pseudogene, RNA gene).]


Transcript: ENSMUST00000034982

[Figure: Tfdp2-201 (protein coding), 127.34 kb, forward strand, with protein feature tracks for ENSMUSP00000034...: MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils), Superfamily, SMART, Pfam, PIRSF, PANTHER, Gene3D and CDD. Annotated features include the winged helix DNA-binding domain superfamily, the E2F-DP heterodimerization region, the E2F/DP family winged-helix DNA-binding domain, Transcription factor DP, Transcription factor DP2 and the Transcription factor DP C-terminal domain. Sequence variants track (dbSNP and all other sources) with legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 385.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
