https://www.alphaknockout.com

Mouse Taf3 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Taf3 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Taf3 gene (NCBI Reference Sequence: NM_027748; Ensembl: ENSMUSG00000025782) is located on mouse chromosome 2. Seven exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000026888). Exon 3 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Taf3 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-136M6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 14.66% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 89481 bp, and the size of intron 3 for 3'-loxP site insertion is 10104 bp. The size of the effective cKO region is ~2332 bp. The cKO region does not contain any other known gene.
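The position and frameshift statements in the note can be checked with simple arithmetic. The sketch below is illustrative only. It assumes the coding region is that of the 932 aa Taf3-201 protein listed in the transcript table, plus one stop codon. The function names are hypothetical, and the exon lengths passed to causes_frameshift are placeholder values, since the report does not state the length of exon 3.

```python
def cds_length_nt(protein_length_aa: int) -> int:
    """Coding sequence length in nucleotides, counting the stop codon."""
    return (protein_length_aa + 1) * 3


def cds_offset_nt(fraction: float, protein_length_aa: int) -> int:
    """Approximate nucleotide offset in the CDS for a fractional position."""
    return round(fraction * cds_length_nt(protein_length_aa))


def causes_frameshift(deleted_cds_nt: int) -> bool:
    """A deletion shifts the reading frame unless its coding length is a multiple of 3."""
    return deleted_cds_nt % 3 != 0


# 14.66% of the coding region of a 932 aa protein (assumption: Taf3-201).
exon3_offset = cds_offset_nt(0.1466, 932)
print("Approximate CDS offset of exon 3 (nt):", exon3_offset)

# Hypothetical exon lengths, for illustration only.
print(causes_frameshift(100))  # not a multiple of 3
print(causes_frameshift(99))   # multiple of 3, reading frame preserved
```

Under these assumptions exon 3 begins roughly 410 nt into the coding sequence, so a frameshift at that point would disrupt most of the downstream protein.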


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', with the two gRNA regions and exons 1, 3 and 7 marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Taf3; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
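A self-comparison of this kind can be sketched as follows. This is a minimal illustration, not the tool used to produce the plot in this report. It assumes exact word matching with a 10 bp window, and the function names and toy sequence are hypothetical.

```python
from collections import defaultdict


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: identical windows at two different positions (i before j).
    reverse: window at i equals the reverse complement of the window at j.
    """
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence (hypothetical): one 10 bp unit repeated after a 4 bp spacer.
unit = "ACGGTCATTG"
toy = unit + "TTTT" + unit
forward_hits, reverse_hits = self_dot_plot(toy, window=10)
print(forward_hits)
print(reverse_hits)
```

Off-diagonal forward hits indicate direct repeats, and reverse hits indicate inverted repeats. Runs of consecutive hits would appear as diagonal lines in a plotted matrix.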

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full length (8832 bp) | A (27.79%, 2454) | C (22.61%, 1997) | T (27.46%, 2425) | G (22.15%, 1956)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
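The overall GC content implied by the base counts in the summary, and a windowed GC profile of the kind plotted above, can be computed as in the sketch below. The base counts are taken from the summary line. The function name, the step size, and the toy sequence are assumptions made for illustration, since the report does not include the sequence or state how the window is advanced.

```python
# Base counts from the summary line of this report.
counts = {"A": 2454, "C": 1997, "T": 2425, "G": 1956}
total = sum(counts.values())
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
print("Total length:", total, "bp; overall GC:", gc_percent, "%")


def windowed_gc(seq: str, window: int = 300, step: int = 1):
    """GC percentage for each full window of `window` bases along `seq`."""
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(100 * gc / window)
    return profile


# Toy sequence (hypothetical) with a small window so the result is easy to follow.
toy_profile = windowed_gc("GGCCAATT", window=4, step=2)
print(toy_profile)
```

The counts sum to the stated full length of 8832 bp and give an overall GC content of about 44.8%, consistent with the note that no high-GC region is present.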


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       9953196    9956195    3000
YourSeq  361    1010   1402  3000   95.9%     chr8   +       115759164  115759554  391
YourSeq  357    825    1392  3000   94.3%     chrX   +       93879185   93879800   616
YourSeq  354    1011   1402  3000   94.4%     chr1   -       86635842   86636229   388
YourSeq  354    1009   1406  3000   95.0%     chrX   +       53646889   53647509   621
YourSeq  350    1009   1406  3000   93.4%     chr9   -       84929809   84930199   391
YourSeq  350    1012   1406  3000   94.6%     chr5   -       127869059  127869452  394
YourSeq  350    1003   1406  3000   92.5%     chr8   +       61570931   61571329   399
YourSeq  349    1020   1406  3000   94.9%     chr2   -       150962449  150962834  386
YourSeq  349    1009   1402  3000   94.7%     chr1   -       153190387  153190784  398
YourSeq  348    1014   1402  3000   95.4%     chr13  -       105368067  105368460  394
YourSeq  347    1009   1402  3000   94.6%     chr4   -       88082901   88083304   404
YourSeq  347    1011   1402  3000   93.5%     chr16  -       65589709   65590094   386
YourSeq  347    1010   1402  3000   94.6%     chrX   +       64029989   64030379   391
YourSeq  346    1015   1407  3000   94.4%     chr7   +       47029218   47029629   412
YourSeq  346    1014   1406  3000   93.8%     chr15  +       51935786   51936174   389
YourSeq  346    1013   1406  3000   94.1%     chr10  +       98556463   98556853   391
YourSeq  344    1007   1402  3000   93.9%     chr13  -       67222653   67223052   400
YourSeq  344    1007   1400  3000   94.4%     chr4   +       5013633    5014038    406
YourSeq  344    1018   1406  3000   94.3%     chr3   +       8280034    8280419    386

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       9947864    9950863    3000
YourSeq  72     881    1013  3000   92.0%     chr9   -       106105196  106105607  412
YourSeq  72     877    1029  3000   94.0%     chr12  -       84237835   84238003   169
YourSeq  69     2355   2797  3000   72.4%     chr1   +       181070406  181070597  192
YourSeq  65     2355   2731  3000   72.7%     chr1   +       181070422  181070597  176
YourSeq  58     880    1013  3000   94.0%     chr1   -       72139571   72139741   171
YourSeq  58     979    1053  3000   82.7%     chr12  +       72197328   72197396   69
YourSeq  55     882    1046  3000   92.2%     chr4   +       107444913  107445102  190
YourSeq  52     929    1008  3000   94.9%     chr5   -       148438960  148439053  94
YourSeq  50     886    1042  3000   96.3%     chr5   -       106435461  106435669  209
YourSeq  50     949    1007  3000   93.3%     chr18  -       67191935   67192000   66
YourSeq  50     877    1003  3000   94.6%     chr16  -       23220788   23220944   157
YourSeq  50     929    1018  3000   96.3%     chr15  +       78650353   78650631   279
YourSeq  49     948    1013  3000   96.3%     chr2   -       30715242   30715324   83
YourSeq  48     883    1018  3000   92.9%     chr16  -       4872411    4872547    137
YourSeq  46     880    1018  3000   92.6%     chr18  +       34081930   34082074   145
YourSeq  46     979    1053  3000   80.4%     chr17  +       43781506   43781574   69
YourSeq  45     990    1053  3000   77.6%     chr1   +       185936109  185936166  58
YourSeq  44     974    1029  3000   85.8%     chr2   -       101853025  101853076  52
YourSeq  42     883    1008  3000   97.8%     chr8   -       13084041   13084216   176

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
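The top hit in each BLAT table is the query matching its own genomic location on chr2, and those two coordinates can be cross-checked against the stated cKO region size. The sketch below assumes that the two 3000 bp queries directly flank the effective cKO region and that the coordinates are 1-based and inclusive. Both are inferences from the tables, not statements in the report. The Interval type and the function name are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumption)
    end: int    # 1-based, inclusive (assumption)

    @property
    def span(self) -> int:
        return self.end - self.start + 1


def gap_between(a: Interval, b: Interval) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    if a.chrom != b.chrom:
        raise ValueError("intervals are on different chromosomes")
    left, right = sorted((a, b), key=lambda iv: iv.start)
    return right.start - left.end - 1


# Self-hits from the two BLAT tables. Taf3 is on the reverse strand,
# so the "upstream" query has the higher genomic coordinates.
upstream = Interval("chr2", 9953196, 9956195)
downstream = Interval("chr2", 9947864, 9950863)

cko_size = gap_between(upstream, downstream)
print(upstream.span, downstream.span, cko_size)
```

Under these assumptions the gap between the two flanking sections is 2332 bp, which agrees with the effective cKO region size of ~2332 bp given in the strategy note.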


Gene and information:

Taf3 TATA-box binding protein associated factor 3 [Mus musculus (house mouse)]
Gene ID: 209361, updated on 12-Aug-2019

Gene summary

Official Symbol: Taf3 (provided by MGI)
Official Full Name: TATA-box binding protein associated factor 3 (provided by MGI)
Primary source: MGI:MGI:2388097
See related: Ensembl:ENSMUSG00000025782
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 140kDa; TAF140; AW539625; TAFII140; TAFII-140; mTAFII140; 4933439M23Rik
Expression: Ubiquitous expression in bladder adult (RPKM 3.3), testis adult (RPKM 3.1) and 28 other tissues
Orthologs: human

Genomic context

Location: 2; 2 A1

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (9914552..10048609, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (9836179..9970236, complement)

[Figure: genomic context of Taf3 on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 5 transcripts

Gene: Taf3 ENSMUSG00000025782

Description: TATA-box binding protein associated factor 3 [Source:MGI Symbol;Acc:MGI:2388097]
Gene Synonyms: 4933439M23Rik, mTAFII140
Location: Chromosome 2: 9,914,552-10,048,596 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 5 transcripts (splice variants), 202 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt        Flags
Taf3-201  ENSMUST00000026888.10  4792  932aa       ENSMUSP00000026888.4  Protein coding  CCDS15675  A2ASY1 Q5HZG4  TSL:1 GENCODE basic APPRIS P1
Taf3-204  ENSMUST00000114909.1   4897  779aa       ENSMUSP00000110559.1  Protein coding  -          A2ASY0         TSL:1 GENCODE basic
Taf3-203  ENSMUST00000114907.1   695   108aa       ENSMUSP00000110557.1  Protein coding  -          A2ASX9         TSL:1 GENCODE basic
Taf3-202  ENSMUST00000114906.1   583   51aa        ENSMUSP00000110556.1  Protein coding  -          A2ASX8         TSL:2 GENCODE basic
Taf3-205  ENSMUST00000129720.1   2217  No protein  -                     lncRNA          -          -              TSL:5

[Figure: Ensembl genome browser view of the Taf3 locus (154.04 kb, 9.92 Mb to 10.04 Mb). Forward strand: Gm13262-201 (lncRNA), C630004M23Rik-201 (TEC). Contig: AL928704.10. Reverse strand: Taf3-201, Taf3-204, Taf3-202 and Taf3-203 (protein coding), Taf3-205 (lncRNA), Atp5c1-201, Atp5c1-203 and Atp5c1-202 (protein coding). Regulatory Build legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene).]


Transcript: ENSMUST00000026888

[Figure: protein domain and variant view of Taf3-201 (protein coding, reverse strand, 134.04 kb; ENSMUSP00000026...; scale bar 0 to 932). Annotation tracks: PDB-ENSP mappings; MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily (Zinc finger, FYVE/PHD-type); SMART (Bromodomain associated domain; Zinc finger, PHD-type); Pfam (Bromodomain associated domain; Zinc finger, PHD-finger); PROSITE profiles (Zinc finger, PHD-finger); PROSITE patterns (Zinc finger, PHD-type, conserved site); PANTHER (PTHR46452); Gene3D (Histone-fold; Zinc finger, RING/FYVE/PHD-type); CDD (cd15522); sequence variants (dbSNP and all other sources). Variant legend: frameshift variant; inframe deletion; missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
