https://www.alphaknockout.com

Mouse Tfec Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tfec conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tfec gene (NCBI Reference Sequence: NM_031198; Ensembl: ENSMUSG00000029553) is located on mouse chromosome 6. Seven exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000031533). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tfec gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-361N22 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele are viable and fertile, normally pigmented, have normal eyes and mast cells, and show no evidence of osteopetrosis.

Exon 3 starts at about 19.03% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 21981 bp, and the size of intron 3 for 3'-loxP site insertion is 1112 bp. The size of the effective cKO region is ~615 bp. The cKO region does not contain any other known gene.
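The position and frameshift statements above can be illustrated with a small calculation. This is a minimal sketch, not part of the original design workflow. It assumes the percentage is measured over the 951 nt coding sequence implied by the 317 aa protein of ENSMUST00000031533 (stop codon not counted). The length of exon 3 is not stated in this report, so the exon lengths passed to the frameshift check are hypothetical.

```python
def approx_cds_offset(percent, protein_length_aa):
    """Approximate number of coding nucleotides lying before a given
    percentage of the coding region.
    Assumption: the coding region is protein_length_aa * 3 nt (no stop codon)."""
    cds_nt = protein_length_aa * 3
    return round(cds_nt * percent / 100)


def deletion_causes_frameshift(deleted_coding_nt):
    """A deletion shifts the reading frame when its coding length
    is not a multiple of three."""
    return deleted_coding_nt % 3 != 0


# 19.03% and 317 aa are taken from this report.
offset = approx_cds_offset(19.03, 317)
print("coding nt before exon 3 (approx.):", offset)

# Hypothetical exon lengths, for illustration only.
for length in (99, 100):
    print(length, "nt deleted -> frameshift:", deletion_causes_frameshift(length))
```

Under these assumptions, roughly 181 coding nucleotides (about 60 codons) precede exon 3, so a frameshift starting there would alter most of the protein.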


Overview of the Targeting Strategy

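[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Labels shown on the wildtype allele: 1, 3, 4, 7. Legend: exon of mouse Tfec; homology arm; cKO region; loxP site.]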
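The relationship between the targeted allele and the constitutive KO allele can be modelled as a list operation. The following is a simplified sketch only: the allele is represented as a list of named segments (names chosen here for illustration), and the two loxP sites are assumed to be in the same orientation, so that Cre excises the sequence between them and leaves a single loxP site behind.

```python
def cre_recombine(allele):
    """Excise everything between the first two loxP sites.

    Assumption: both loxP sites are direct repeats (same orientation),
    so recombination deletes the flanked segment and leaves one loxP.
    An allele with fewer than two loxP sites is returned unchanged.
    """
    sites = [i for i, segment in enumerate(allele) if segment == "loxP"]
    if len(sites) < 2:
        return list(allele)
    first, second = sites[0], sites[1]
    return allele[:first + 1] + allele[second + 1:]


# Segment names are illustrative, not taken from the vector map.
targeted_allele = [
    "5' homology arm",
    "loxP",
    "exon 3 (cKO region)",
    "loxP",
    "3' homology arm",
]

constitutive_ko_allele = cre_recombine(targeted_allele)
print(constitutive_ko_allele)
```

In this model the wildtype allele contains no loxP site, so `cre_recombine` leaves it unchanged, which mirrors why only the targeted allele is converted in Cre-expressing cells.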


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement. Axis label: Sequence 12.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
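A self-comparison of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the tool used to produce the plot: it reports every pair of positions whose 10 bp windows match exactly, either in the forward direction or as a reverse complement. The function names and the toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: windows at i and j are identical, with i != j
             (the main diagonal is skipped).
    reverse: the window at i equals the reverse complement of the window at j.
    Only exact matches are reported; real dot-plot tools may allow mismatches.
    """
    n_windows = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for j in range(n_windows):
        word = seq[j:j + window]
        for i in index.get(word, []):
            if i != j:
                forward.append((i, j))
        for i in index.get(reverse_complement(word), []):
            reverse.append((i, j))
    return forward, reverse


# Toy sequence: a 12 nt unit repeated twice in tandem, with short flanks.
unit = "ACGGATTCTAGA"
toy = "CCCCC" + unit * 2 + "TTTTT"
forward_hits, reverse_hits = self_dot_plot(toy, window=10)
print(forward_hits)
```

Forward hits that line up on a diagonal close to the main diagonal, as in the toy sequence, are the signature of a tandem repeat.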

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length (7115 bp) | A (29.52%, 2100) | C (18.82%, 1339) | T (33.72%, 2399) | G (17.95%, 1277)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
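The overall GC content follows directly from the base counts in the summary, and the windowed profile can be computed with a simple sliding window. The sketch below uses the counts reported above. The `gc_windows` helper and its `step` parameter are assumptions made for illustration, since the report states only the 300 bp window size.

```python
# Base counts taken from the summary line of this report.
counts = {"A": 2100, "C": 1339, "T": 2399, "G": 1277}

total = sum(counts.values())
overall_gc = 100 * (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, overall GC = {overall_gc:.2f}%")


def gc_windows(seq, window=300, step=1):
    """GC percentage of each window of the given size along seq.

    Assumption: windows slide by `step` bases; the step used for the
    original plot is not stated in the report.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100 * gc / window)
    return values


# Tiny illustrative input with a 2 bp window.
print(gc_windows("GGAA", window=2, step=1))
```

With the reported counts, the total is 7115 bp and the overall GC content is about 36.8%, which is consistent with the note that no high-GC region is present.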


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr6   -       16845729   16848728   3000
YourSeq  70     2452   2546  3000   85.0%     chr7   +       56908644   56908727   84
YourSeq  67     2440   2531  3000   83.2%     chr16  -       39465723   39465808   86
YourSeq  67     2436   2527  3000   84.4%     chr8   +       93799156   93799243   88
YourSeq  65     2462   2601  3000   77.1%     chr11  +       15157665   15157751   87
YourSeq  64     2462   2546  3000   85.6%     chr12  -       39469771   39469845   75
YourSeq  64     589    755   3000   88.9%     chr4   +       25035487   25035665   179
YourSeq  63     2515   2597  3000   95.8%     chr13  -       112055569  112056072  504
YourSeq  61     590    799   3000   79.5%     chrX   -       16934598   16934793   196
YourSeq  59     595    816   3000   72.5%     chr1   +       176525832  176526030  199
YourSeq  58     455    617   3000   91.5%     chrX   -       167230019  167230181  163
YourSeq  56     2436   2509  3000   87.9%     chr9   +       94573392   94573465   74
YourSeq  55     2462   2549  3000   93.7%     chrX   +       65802194   65802304   111
YourSeq  54     2434   2503  3000   88.6%     chr1   -       121686418  121686487  70
YourSeq  49     259    763   3000   62.8%     chrX   +       151263431  151263640  210
YourSeq  49     2436   2500  3000   87.7%     chr6   +       57125274   57125338   65
YourSeq  49     2460   2526  3000   85.5%     chr4   +       12673771   12673833   63
YourSeq  48     2465   2535  3000   80.4%     chr15  -       53909121   53909183   63
YourSeq  47     2436   2500  3000   86.2%     chr8   +       79313024   79313088   65
YourSeq  46     2462   2530  3000   80.8%     chr9   -       101645144  101645201  58

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr6   -       16842114   16845113   3000
YourSeq  48     860    919   3000   90.0%     chr15  -       17896245   17896304   60
YourSeq  34     642    697   3000   94.9%     chr10  -       24839465   24839521   57
YourSeq  27     2771   2808  3000   93.4%     chr1   +       49364440   49364478   39
YourSeq  24     1852   1877  3000   88.0%     chr11  +       86176330   86176354   25

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
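As a consistency check, the top hits of the two BLAT searches can be parsed and compared: the gap between the two 3000 bp flanks on chr6 should match the stated cKO region size. The parser below is a minimal sketch written for the row layout shown in this report; the field names are chosen here and are not an official BLAT schema.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated result row.

    A leading 'browser details' pair (link text from the results page)
    is dropped if present.
    """
    tokens = line.split()
    if tokens[:2] == ["browser", "details"]:
        tokens = tokens[2:]
    query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = tokens
    return BlatHit(query, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


# Top hits copied from the two result tables in this report.
upstream = parse_hit(
    "YourSeq 3000 1 3000 3000 100.0% chr6 - 16845729 16848728 3000")
downstream = parse_hit(
    "browser details YourSeq 3000 1 3000 3000 100.0% chr6 - 16842114 16845113 3000")

# Tfec lies on the reverse strand, so the upstream flank has the higher
# genomic coordinates. The bases strictly between the flanks form the gap.
gap_bp = upstream.t_start - downstream.t_end - 1
print("bases between the two flanks:", gap_bp)
```

The gap comes out at 615 bp, in agreement with the effective cKO region size of ~615 bp given in the strategy summary.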


Gene and information: Tfec transcription factor EC [ Mus musculus (house mouse) ] Gene ID: 21426, updated on 12-Aug-2019

Gene summary

Official Symbol: Tfec (provided by MGI)
Official Full Name: transcription factor EC (provided by MGI)
Primary source: MGI:MGI:1333760
See related: Ensembl:ENSMUSG00000029553
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Tcfec; bHLHe34; BB107417
Expression: Biased expression in placenta adult (RPKM 1.9), liver E18 (RPKM 1.4) and 12 other tissues
Orthologs: human; all

Genomic context

Location: 6; 6 A2

Exon count: 9

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (16830927..16898464, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (16783381..16848441, complement)

Chromosome 6 - NC_000072.6


Transcript information: This gene has 5 transcripts

Gene: Tfec ENSMUSG00000029553

Description: transcription factor EC [Source:MGI Symbol;Acc:MGI:1333760]
Gene Synonyms: TFEC, Tcfec, bHLHe34
Location: Chromosome 6: 16,833,373-16,898,441 reverse strand. GRCm38:CM000999.2
About this gene: This gene has 5 transcripts (splice variants), 202 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 7 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Tfec-201  ENSMUST00000031533.10  1787  317aa       ENSMUSP00000031533.7  Protein coding           CCDS19920  Q9WTW4      TSL:1, GENCODE basic, APPRIS P1
Tfec-204  ENSMUST00000201406.1   572   28aa        ENSMUSP00000144445.1  Protein coding           -          A0A0J9YV19  CDS 5' incomplete, TSL:NA
Tfec-205  ENSMUST00000202997.1   1654  127aa       ENSMUSP00000143880.1  Nonsense mediated decay  -          A0A0J9YTW5  TSL:1
Tfec-203  ENSMUST00000201104.1   3384  No protein  -                     Retained intron          -          -           TSL:1
Tfec-202  ENSMUST00000200984.1   688   No protein  -                     Retained intron          -          -           TSL:5

[Figure: Ensembl genomic view of Tfec, 85.07 kb, with scale marks at 16.84 Mb, 16.86 Mb, 16.88 Mb and 16.90 Mb. Contig: AC102663.6. Tracks under "Genes (Comprehensive set...", all on the reverse strand: Tfec-201 (protein coding), Tfec-205 (nonsense mediated decay), Tfec-204 (protein coding), Tfec-203 (retained intron), Tfec-202 (retained intron). Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000031533

[Figure: Tfec-201 (protein coding), reverse strand, 65.07 kb, with protein domain annotations for ENSMUSP00000031...:
Low complexity (Seg)
Coiled-coils (Ncoils)
Superfamily: Helix-loop-helix DNA-binding domain superfamily
SMART: Myc-type, basic helix-loop-helix (bHLH) domain
Pfam: MiT/TFE transcription factors, C-terminal; Myc-type, basic helix-loop-helix (bHLH) domain
PROSITE profiles: Myc-type, basic helix-loop-helix (bHLH) domain
PANTHER: PTHR45776; Transcription factor EC
Gene3D: Helix-loop-helix DNA-binding domain superfamily
CDD: Myc-type, basic helix-loop-helix (bHLH) domain
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: frameshift variant; missense variant; synonymous variant
Scale bar: 0 to 317 amino acids]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
