https://www.alphaknockout.com

Mouse Tada3 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tada3 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tada3 gene (NCBI Reference Sequence: NM_133932; Ensembl: ENSMUSG00000048930) is located on Mouse chromosome 6. Nine exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 9 (Transcript: ENSMUST00000032410). Exons 5~8 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Tada3 gene.

To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-229B13 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit embryonic lethality between E3.5 and E8.5, associated with impaired proliferation of trophoblast cells and absence of the inner cell mass.

Exon 5 starts at about 43.67% of the coding region. Knockout of exons 5~8 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 2172 bp, and the size of intron 8 for 3'-loxP site insertion is 3041 bp. The size of the effective cKO region is ~2885 bp. The cKO region does not contain any other known gene.
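As a rough cross-check of these figures, the position of exon 5 within the coding sequence can be estimated from the 432 aa length of Tada3-201 given in the transcript table of this report. The sketch below is an illustration only: it assumes the quoted coding region includes the stop codon, and all function names are hypothetical. The report does not list individual exon lengths, so the frameshift test is shown as the general rule (a deleted coding length that is not a multiple of 3 shifts the reading frame) rather than applied to real exon sizes.

```python
# Values quoted in this report.
PROTEIN_LENGTH_AA = 432        # Tada3-201 (ENSMUST00000032410)
EXON5_START_FRACTION = 0.4367  # "about 43.67% of the coding region"


def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding length in nucleotides. Counting the stop codon is an assumption."""
    return (protein_aa + (1 if include_stop else 0)) * 3


def approx_cds_offset(fraction: float, cds_nt: int) -> int:
    """Approximate coding nucleotide at which a given fraction of the CDS is reached."""
    return round(fraction * cds_nt)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


cds_nt = cds_length_nt(PROTEIN_LENGTH_AA)
exon5_offset = approx_cds_offset(EXON5_START_FRACTION, cds_nt)
codons_before_exon5 = exon5_offset // 3
print(cds_nt, exon5_offset, codons_before_exon5)
```

Under these assumptions exon 5 begins near coding nucleotide 567, so roughly the first 189 of 432 codons lie upstream of the cKO region.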


Overview of the Targeting Strategy

[Figure: schematic showing, from 5' to 3', the wildtype allele (exon labels 1, 4, 5, 6, 7, 8, 9, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Tada3, homology arm, cKO region, loxP site.]
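The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model of Cre recombination. This is a sketch under stated assumptions, not the vendor's pipeline: the loxP constant is the canonical 34 bp site, the flanking and floxed sequences are made-up placeholders (not Tada3 sequence), and the function name `cre_excise` is hypothetical. It only handles two loxP sites in the same orientation on the given strand, which is the arrangement that leads to excision.

```python
# Canonical loxP: 13 bp inverted repeat, 8 bp asymmetric spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Return the allele after excision of the segment between two direct-repeat sites.

    One copy of the site remains at the junction, as in the constitutive KO allele.
    """
    first = allele.find(site)
    if first == -1:
        raise ValueError("no loxP site found")
    second = allele.find(site, first + len(site))
    if second == -1:
        raise ValueError("only one loxP site found; nothing to excise")
    # Keep everything through the first site, then everything after the second site.
    return allele[: first + len(site)] + allele[second + len(site):]


# Placeholder sequences for illustration only.
upstream = "AAAA"            # stands in for intron 4 / 5' flank
floxed = "GGGGCCCCGGGG"      # stands in for the cKO region (exons 5~8)
downstream = "TTTT"          # stands in for intron 8 / 3' flank

targeted_allele = upstream + LOXP + floxed + LOXP + downstream
ko_allele = cre_excise(targeted_allele)
print(len(targeted_allele), len(ko_allele))
```

Because the loxP spacer is asymmetric, the orientation of the two sites matters; sites in opposite orientation would invert rather than delete the intervening segment, which this sketch does not model.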


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence (labelled "Sequence 12") against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
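A self dot plot of this kind can be approximated by indexing every 10 bp window of the sequence and reporting pairs of positions whose windows are identical, either directly or as reverse complements. The sketch below is a minimal illustration assuming exact window matches; the tool actually used for the report is not stated, and the function names and toy sequences are hypothetical.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dotplot(seq: str, window: int = 10) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Return (forward, reverse) lists of (i, j) window start positions that match.

    Forward hits exclude the main diagonal and are reported once (i < j).
    Reverse hits pair window i with a window j equal to its reverse complement.
    """
    seq = seq.upper()
    index: Dict[str, List[int]] = {}
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        w = seq[i:i + window]
        forward.extend((i, j) for j in index.get(w, []) if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(w), []))
    return forward, reverse


# Toy input: a 12 bp unit repeated after a 5 bp spacer (not Tada3 sequence).
unit = "ACGGTCATTGCA"
fwd, rev = self_dotplot(unit + "TTTTT" + unit, window=10)
print(fwd)
```

Off-diagonal runs of forward hits correspond to direct repeats such as the tandem repeats mentioned in the note; runs of reverse hits correspond to inverted repeats.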

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (labelled "Sequence 12").]

Summary: Full Length (9385 bp) | A (24.57%, 2306) | C (24.67%, 2315) | T (25.29%, 2373) | G (25.48%, 2391)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
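The base counts in the summary imply an overall GC content of about 50%, and the per-window profile can be computed with a simple sliding window. The sketch below is illustrative: the counts are copied from the summary line, while the function name, the step size and the toy sequence are assumptions, and the report does not state what threshold it uses for a "significant" high-GC region.

```python
from typing import List, Tuple

# Base counts as reported in the summary line (9385 bp in total).
REPORTED_COUNTS = {"A": 2306, "C": 2315, "T": 2373, "G": 2391}

total_bp = sum(REPORTED_COUNTS.values())
overall_gc_percent = 100 * (REPORTED_COUNTS["G"] + REPORTED_COUNTS["C"]) / total_bp


def gc_windows(seq: str, window: int = 300, step: int = 1) -> List[Tuple[int, float]]:
    """Return (start, GC percent) for each full window along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100 * gc / window))
    return profile


# Toy sequence (not Tada3): 300 bp of pure GC followed by 300 bp of pure AT.
toy_profile = gc_windows("GC" * 150 + "AT" * 150, window=300, step=300)
print(total_bp, round(overall_gc_percent, 2), toy_profile)
```

On the real 9385 bp sequence, a step of 1 gives the densest profile; a larger step trades resolution for speed.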


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr6   -       113372857  113375856  3000
YourSeq  317    595     1078  3000   90.2%     chrX   +       102655971  102656329  359
YourSeq  270    584     1074  3000   93.3%     chr2   +       162907084  162907880  797
YourSeq  174    1323    1802  3000   92.2%     chr16  -       32088870   32380789   291920
YourSeq  167    671     1040  3000   82.6%     chr2   -       179020586  179020919  334
YourSeq  163    1320    1802  3000   84.2%     chr1   -       195089831  195090205  375
YourSeq  158    1307    1536  3000   91.6%     chr8   -       96200911   96284196   83286
YourSeq  146    1323    1699  3000   83.2%     chr8   +       120055552  120055782  231
YourSeq  146    1320    1801  3000   89.2%     chr5   +       125345999  125346436  438
YourSeq  141    1304    1470  3000   89.6%     chrX   -       131423404  131423566  163
YourSeq  141    1293    1471  3000   89.2%     chr2   +       66151868   66152026   159
YourSeq  140    1320    1477  3000   94.9%     chr9   -       109189172  109189342  171
YourSeq  140    1323    1490  3000   94.4%     chr11  -       5767858    5768318    461
YourSeq  139    1320    1470  3000   94.7%     chr4   -       74276333   74276482   150
YourSeq  139    1320    1471  3000   96.0%     chr4   +       123504166  123504317  152
YourSeq  138    1320    1482  3000   93.6%     chr3   +       97468476   97468652   177
YourSeq  137    1304    1476  3000   87.0%     chrX   +       52106033   52106201   169
YourSeq  136    1307    1469  3000   90.0%     chr8   -       72773019   72773172   154
YourSeq  136    1320    1468  3000   96.0%     chr3   -       22020252   22020400   149
YourSeq  136    1323    1471  3000   96.0%     chrX   +       84371013   84371161   149

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr6   -       113366972  113369971  3000
YourSeq  189    2790    3000  3000   94.8%     chrX   +       102656869  102657079  211
YourSeq  158    2089    2482  3000   87.0%     chr7   +       28670536   28671033   498
YourSeq  154    2794    3000  3000   92.4%     chr2   +       162908418  162908629  212
YourSeq  127    2061    2304  3000   87.7%     chrX   -       89443924   89444165   242
YourSeq  117    2094    2322  3000   90.9%     chr1   -       133125484  133125714  231
YourSeq  117    2096    2305  3000   78.7%     chr8   +       32454424   32454601   178
YourSeq  113    2086    2296  3000   81.3%     chr14  +       41919373   41919562   190
YourSeq  108    2094    2296  3000   81.0%     chr1   -       169509925  169510102  178
YourSeq  105    2164    2322  3000   87.9%     chr3   +       36955930   36956097   168
YourSeq  104    2162    2304  3000   88.3%     chr11  +       29452807   29452956   150
YourSeq  100    2110    2290  3000   90.5%     chr5   -       38415726   38416259   534
YourSeq  97     2088    2296  3000   75.8%     chr7   -       132860879  132861033  155
YourSeq  97     2088    2230  3000   91.7%     chr1   -       100077153  100077421  269
YourSeq  97     2088    2284  3000   78.3%     chr1   -       60237562   60237706   145
YourSeq  96     2096    2234  3000   92.4%     chr9   +       63908908   63909055   148
YourSeq  95     2146    2305  3000   90.6%     chr1   -       36703941   37023682   319742
YourSeq  94     1848    2296  3000   71.9%     chr4   +       22979825   22979977   153
YourSeq  94     8       2303  3000   87.9%     chr1   +       153514619  153559774  45156
YourSeq  92     2086    2232  3000   84.9%     chr2   +       159473670  159473807  138

Note: The 3000 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
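One way to read BLAT tables like these is to separate the query's own locus (the full-length, 100% identity hit on chr6) from all other hits and compare the remaining scores with the query size. The sketch below parses three rows copied from the upstream table. The cutoff `MIN_SCORE_FRACTION` is a hypothetical value chosen for illustration; the report does not state its criterion for "significant similarity", and the class and function names are assumptions.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one result row; any leading link text or query name is ignored."""
    f = line.split()[-10:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def is_self_hit(hit: BlatHit) -> bool:
    """Full-length, 100% identity alignment: the query's own genomic locus."""
    return hit.q_start == 1 and hit.q_end == hit.q_size and hit.identity == 100.0


MIN_SCORE_FRACTION = 0.5  # hypothetical cutoff, not taken from the report

rows = [
    "browser details YourSeq 3000 1 3000 3000 100.0% chr6 - 113372857 113375856 3000",
    "browser details YourSeq 317 595 1078 3000 90.2% chrX + 102655971 102656329 359",
    "browser details YourSeq 270 584 1074 3000 93.3% chr2 + 162907084 162907880 797",
]
hits: List[BlatHit] = [parse_hit(r) for r in rows]
off_target = [h for h in hits if not is_self_hit(h)]
flagged = [h for h in off_target if h.score / h.q_size >= MIN_SCORE_FRACTION]
best_off_target = max(off_target, key=lambda h: h.score)
print(best_off_target.chrom, best_off_target.score, len(flagged))
```

The score is used rather than the query start and end because a gapped alignment can span most of the query while matching only a small part of it, as in the chr1 row of the downstream table (query positions 8 to 2303, score 94).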


Gene and information:

Tada3 transcriptional adaptor 3 [Mus musculus (house mouse)]
Gene ID: 101206, updated on 1-Sep-2019

Gene summary

Official Symbol: Tada3 (provided by MGI)
Official Full Name: transcriptional adaptor 3 (provided by MGI)
Primary source: MGI:MGI:1915724
See related: Ensembl:ENSMUSG00000048930
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: ADA3; Tada3l; AI987856; 1110004B19Rik
Expression: Ubiquitous expression in testis adult (RPKM 16.2), CNS E18 (RPKM 14.6) and 28 other tissues

Genomic context

Location: 6; 6 E3

Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (113366640..113378005, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (113316649..113327514, complement)

[Figure: genomic context of Tada3 on Chromosome 6 - NC_000072.6.]


Transcript information: This gene has 7 transcripts

Gene: Tada3 ENSMUSG00000048930

Description: transcriptional adaptor 3 [Source:MGI Symbol;Acc:MGI:1915724]
Gene Synonyms: 1110004B19Rik, ADA3, Tada3l
Location: Chromosome 6: 113,366,025-113,377,883 reverse strand. GRCm38:CM000999.2
About this gene: This gene has 7 transcripts (splice variants), 209 orthologues, is a member of 1 Ensembl protein family and is associated with 7 phenotypes.

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Tada3-201  ENSMUST00000032410.13  2848  432aa       ENSMUSP00000032410.7  Protein coding   CCDS20416  Q8R0L9      TSL:1, GENCODE basic, APPRIS P1
Tada3-202  ENSMUST00000043333.8   1708  413aa       ENSMUSP00000043363.2  Protein coding   -          Q8R0L9      TSL:1, GENCODE basic
Tada3-203  ENSMUST00000099118.7   1282  232aa       ENSMUSP00000108736.1  Protein coding   -          A0A0R4J1I2  TSL:1, GENCODE basic
Tada3-207  ENSMUST00000193384.1   440   22aa        ENSMUSP00000141289.1  Protein coding   -          A0A0A6YVW0  CDS 3' incomplete, TSL:2
Tada3-204  ENSMUST00000113106.2   1850  No protein  -                     Retained intron  -          -           TSL:1
Tada3-205  ENSMUST00000113107.7   1577  No protein  -                     Retained intron  -          -           TSL:1
Tada3-206  ENSMUST00000125414.1   655   No protein  -                     Retained intron  -          -           TSL:1


[Figure: Ensembl genome browser view of a 31.86 kb region (113.36 Mb to 113.38 Mb). Forward strand: Arpc4-201, Arpc4-202, Arpc4-203 and Arpc4-204 (protein coding); contig AC155287.6. Reverse strand: Tada3-201, Tada3-202, Tada3-203 and Tada3-207 (protein coding); Tada3-204, Tada3-205 and Tada3-206 (retained intron). Tracks: Genes (Comprehensive set), Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000032410

[Figure: Ensembl protein summary for Tada3-201 (protein coding; reverse strand, 11.86 kb), protein ENSMUSP00000032... Tracks: MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils), Pfam (Histone acetyltransferases subunit 3), PANTHER (Histone acetyltransferases subunit 3), and sequence variants (dbSNP and all other sources). Variant legend: synonymous variant. Scale bar: 0 to 432 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
