https://www.alphaknockout.com

Mouse Gtf3c4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gtf3c4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Gtf3c4 gene (NCBI Reference Sequence: NM_001166033; Ensembl: ENSMUSG00000035666) is located on mouse chromosome 2. Six exons are identified, with the ATG start codon in exon 3 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000171404). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gtf3c4 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-297A20 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 is not a frameshift exon, but it covers 86.09% of the coding region. The size of intron 2 for 5'-loxP site insertion is 1037 bp, and the size of intron 3 for 3'-loxP site insertion is 2917 bp. The size of the effective cKO region is ~2324 bp. The cKO region does not contain any other known gene.
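The two properties quoted in the note (whether a deleted exon shifts the reading frame, and what share of the coding region it removes) can be sketched as below. This is a minimal illustration, not the vendor's analysis: the function names are hypothetical, and the base counts in the usage lines are made-up round numbers, since the report does not state the coding length of exon 3.

```python
def is_frameshift_deletion(deleted_coding_bp: int) -> bool:
    """Return True if deleting this many coding bases shifts the reading frame.

    A deletion keeps the downstream frame only when its coding length
    is a multiple of 3 (one codon = 3 bases).
    """
    if deleted_coding_bp < 0:
        raise ValueError("deleted_coding_bp must be non-negative")
    return deleted_coding_bp % 3 != 0


def coding_coverage_percent(deleted_coding_bp: int, total_coding_bp: int) -> float:
    """Percentage of the coding region removed by the deletion."""
    if total_coding_bp <= 0:
        raise ValueError("total_coding_bp must be positive")
    return 100.0 * deleted_coding_bp / total_coding_bp


# Hypothetical numbers for illustration only (not taken from the report).
print(is_frameshift_deletion(300))        # False: 300 is a multiple of 3
print(is_frameshift_deletion(301))        # True: frame is shifted
print(coding_coverage_percent(300, 1200)) # 25.0
```

An in-frame exon can still be a reasonable cKO target when, as here, it removes most of the coding region.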


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', with two gRNA regions marked and exons 1, 2, 3 and 6 labeled), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Gtf3c4, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot with forward and reverse-complement matches; axis label: Sequence 12.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
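The self-alignment described in the note can be sketched as a word-match dot plot. The sketch below is an assumption about the general technique, not the tool used for this report: it records every pair of positions whose 10 bp words match exactly, either directly or as reverse complements. The function names are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) word matches of seq with itself.

    forward: positions i != j whose windows are identical (the trivial
             main diagonal i == j is skipped).
    reverse: positions whose window at i equals the reverse complement
             of the window at j.
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1
    index = defaultdict(list)  # word -> start positions
    for i in range(n_words):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_words):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j != i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy example with a short window: "AAAA" and "AAAC" each occur twice.
fwd, rev = self_dot_plot("AAAACCCCAAAAC", window=4)
print(sorted(fwd))  # [(0, 8), (1, 9), (8, 0), (9, 1)]
print(rev)          # []
```

Long off-diagonal runs of consecutive matches would indicate tandem or inverted repeats; isolated points are usually chance matches.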

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence; axis label: Sequence 12.]

Summary: Full Length(8746bp) | A(27.8% 2431) | C(20.26% 1772) | T(29.3% 2563) | G(22.64% 1980)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
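The overall GC content implied by the base counts in the summary line, and a sliding-window profile like the one plotted, can be sketched as follows. The base counts are taken from the summary above; the function names, the step size, and the toy sequence are assumptions for illustration, since the 8746 bp sequence itself is not reproduced in this report.

```python
def gc_percent_from_counts(a: int, c: int, g: int, t: int) -> float:
    """Overall GC percentage from per-base counts."""
    total = a + c + g + t
    if total == 0:
        raise ValueError("no bases counted")
    return 100.0 * (g + c) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each full window of the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Counts from the report's summary line (full length 8746 bp).
overall = gc_percent_from_counts(a=2431, c=1772, g=1980, t=2563)
print(round(overall, 2))  # 42.9

# Toy sequence, non-overlapping 4 bp windows (illustration only).
print(windowed_gc("GGCCAATT", window=4, step=4))  # [100.0, 0.0]
```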


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START     END       SPAN
-------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       28835546  28838545  3000
YourSeq  43     2394   2505  3000   92.4%     chr14  -       25150416  25150527  112
YourSeq  39     2466   2524  3000   95.5%     chr8   -       36788421  36788481  61
YourSeq  36     1751   1820  3000   78.1%     chr17  -       65230698  65230760  63
YourSeq  28     2958   2987  3000   96.7%     chr6   +       47875036  47875065  30
YourSeq  28     2959   2994  3000   88.9%     chr5   +       52291913  52291948  36
YourSeq  27     1751   1819  3000   69.6%     chr13  -       56666673  56666741  69
YourSeq  27     2472   2509  3000   96.7%     chr10  -       69542119  69542156  38
YourSeq  26     2492   2525  3000   89.3%     chr19  -       4424526   4424558   33
YourSeq  25     2959   2987  3000   93.2%     chr17  +       37119174  37119202  29
YourSeq  22     2492   2525  3000   82.4%     chr17  -       42959125  42959158  34

Note: The 3000 bp section upstream of Exon 3 was searched against the genome with BLAT. Apart from the expected 100% self-match on chr2, no significant similarity is found (best other score: 43).

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       28830300   28833299   3000
YourSeq  54     1390   1454  3000   95.2%     chr15  -       84162507   84162802   296
YourSeq  51     2134   2264  3000   90.7%     chr8   -       95261337   95261467   131
YourSeq  48     1380   1451  3000   76.2%     chr12  -       86835653   86835719   67
YourSeq  48     1404   1464  3000   94.6%     chr4   +       136784846  136784911  66
YourSeq  46     1381   1446  3000   85.8%     chr11  -       54825166   54825225   60
YourSeq  45     1381   1454  3000   84.7%     chr1   -       14203902   14204004   103
YourSeq  45     2161   2295  3000   92.2%     chr14  +       121820029  121820162  134
YourSeq  45     1380   1440  3000   92.5%     chr11  +       85282005   85282067   63
YourSeq  44     1381   1453  3000   96.0%     chr11  +       32088651   32088794   144
YourSeq  42     1381   1427  3000   95.7%     chr11  -       89008145   89008220   76
YourSeq  42     1381   1427  3000   95.7%     chr10  -       80366420   80366495   76
YourSeq  42     1380   1427  3000   95.8%     chr12  +       75846776   75847047   272
YourSeq  41     1385   1452  3000   95.7%     chr2   +       145311544  145311614  71
YourSeq  40     2145   2261  3000   83.4%     chr17  -       9088183    9088296    114
YourSeq  39     1380   1419  3000   100.0%    chr5   -       67187869   67187910   42
YourSeq  38     1379   1421  3000   84.7%     chr6   -       97364103   97364141   39
YourSeq  38     1381   1419  3000   100.0%    chr17  +       46103976   46104016   41
YourSeq  38     1411   1454  3000   95.4%     chr11  +       95780318   95780365   48
YourSeq  37     1381   1419  3000   97.5%     chrX   +       19356650   19356688   39

Note: The 3000 bp section downstream of Exon 3 was searched against the genome with BLAT. Apart from the expected 100% self-match on chr2, no significant similarity is found (best other score: 54).
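One way to check the two BLAT notes is to parse the result rows, set aside the full-length 100% self-match, and look at the strongest remaining hit. The sketch below does this for the first three rows of each table as they appear in this report. The class and function names are hypothetical, and the self-match rule (identity 100% over the whole query) is an assumption, not a criterion stated by the report.

```python
from typing import List, NamedTuple, Optional


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    f = line.split()
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_match(hit: BlatHit) -> bool:
    """Assumed rule: the query aligned end to end at 100% identity."""
    return hit.q_start == 1 and hit.q_end == hit.q_size and hit.identity == 100.0


def best_off_target(lines: List[str]) -> Optional[BlatHit]:
    """Highest-scoring hit that is not the self-match, or None."""
    others = [h for h in map(parse_hit, lines) if not is_self_match(h)]
    return max(others, key=lambda h: h.score) if others else None


# First three rows of each table, copied from the report.
UPSTREAM = [
    "YourSeq 3000 1 3000 3000 100.0% chr2 - 28835546 28838545 3000",
    "YourSeq 43 2394 2505 3000 92.4% chr14 - 25150416 25150527 112",
    "YourSeq 39 2466 2524 3000 95.5% chr8 - 36788421 36788481 61",
]
DOWNSTREAM = [
    "YourSeq 3000 1 3000 3000 100.0% chr2 - 28830300 28833299 3000",
    "YourSeq 54 1390 1454 3000 95.2% chr15 - 84162507 84162802 296",
    "YourSeq 51 2134 2264 3000 90.7% chr8 - 95261337 95261467 131",
]

for name, rows in (("upstream", UPSTREAM), ("downstream", DOWNSTREAM)):
    hit = best_off_target(rows)
    print(name, hit.chrom, hit.score, hit.identity)
```

The best off-target scores (43 and 54) are small next to the self-match score of 3000, which is consistent with the notes.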


Gene and information: Gtf3c4 general transcription factor IIIC, polypeptide 4 [ Mus musculus (house mouse) ] Gene ID: 269252, updated on 10-Oct-2019

Gene summary

Official Symbol: Gtf3c4 (provided by MGI)
Official Full Name: general transcription factor IIIC, polypeptide 4 (provided by MGI)
Primary source: MGI:MGI:2138937
See related: Ensembl:ENSMUSG00000035666
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: KAT12; AI426938; AU014771; AU017413; TFIIIC90; 5330400C03
Expression: Ubiquitous expression in CNS E11.5 (RPKM 8.3), whole brain E14.5 (RPKM 7.1) and 28 other tissues
Orthologs: human

Genomic context

Location: 2; 2 A3

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (28822299..28840360, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (28677819..28695880, complement)
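As a quick consistency check on the coordinates above, the gene span can be computed from each location, treating the start..end ranges as inclusive 1-based intervals (the usual convention for this notation, stated here as an assumption). The function name is hypothetical.

```python
def span_bp(start: int, end: int) -> int:
    """Length of an inclusive, 1-based genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates copied from the two assembly rows above.
current = span_bp(28822299, 28840360)   # GRCm38.p6
previous = span_bp(28677819, 28695880)  # MGSCv37

print(current, previous)  # 18062 18062
```

Both assemblies give the same 18,062 bp span, which agrees with the 18.06 kb transcript view shown later in the report.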

[Figure: genomic context of Gtf3c4 on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 4 transcripts

Gene: Gtf3c4 ENSMUSG00000035666

Description: general transcription factor IIIC, polypeptide 4 [Source:MGI Symbol;Acc:MGI:2138937]
Gene Synonyms: KAT12
Location: Chromosome 2: 28,822,299-28,840,360 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 4 transcripts (splice variants), 193 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Gtf3c4-204  ENSMUST00000171404.7  7127  676aa       ENSMUSP00000132171.1  Protein coding  CCDS50550  E9Q2B0   TSL:1 GENCODE basic
Gtf3c4-201  ENSMUST00000037117.5  6941  817aa       ENSMUSP00000042265.5  Protein coding  CCDS15847  Q8BMQ2   TSL:1 GENCODE basic APPRIS P1
Gtf3c4-203  ENSMUST00000156468.1  677   No protein  -                     lncRNA          -          -        TSL:3
Gtf3c4-202  ENSMUST00000124390.1  646   No protein  -                     lncRNA          -          -        TSL:2

[Figure: Ensembl genomic view, 38.06 kb, 28.82Mb to 28.85Mb. Forward strand: Ak8-201 (protein coding), Ddx31-203 (lncRNA), Ddx31-201 (protein coding), Ddx31-202 (lncRNA). Contig: AL732526.8. Reverse strand: Gtf3c4-204 (protein coding), Gtf3c4-201 (protein coding), Gtf3c4-203 (lncRNA), Gtf3c4-202 (lncRNA). Regulatory Build legend: enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene).]


Transcript: ENSMUST00000171404

[Figure: Ensembl protein view of Gtf3c4-204 (protein coding; reverse strand, 18.06 kb), translation ENSMUSP00000132..., scale bar 0 to 676. Tracks: MobiDB lite; Low complexity (Seg); Superfamily: WD40-repeat-containing domain superfamily; Pfam: Transcription factor IIIC, 90kDa subunit, N-terminal, and Transcription factor IIIC, putative zinc-finger; PANTHER: PTHR15496, PTHR15496:SF2; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
