https://www.alphaknockout.com

Mouse Tle4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tle4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Tle4 gene (NCBI Reference Sequence: NM_011600; Ensembl: ENSMUSG00000024642) is located on mouse chromosome 19. Twenty exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 20 (Transcript: ENSMUST00000052011). Exon 7 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tle4 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-68P6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele are runted and die around 4 weeks of age with leukocytopenia, B cell lymphopenia, reduced bone mineralization, and reduced hematopoietic stem cell number and function.
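To illustrate why flanking the cKO region with loxP sites yields a constitutive knockout after Cre recombination, the following is a minimal Python sketch. It assumes the commonly cited 34 bp loxP sequence and two sites in the same orientation; the function name `cre_excise` and the toy sequences `upstream`, `exon7` and `downstream` are hypothetical placeholders, not Tle4 sequence.

```python
# Commonly cited 34 bp loxP site (13 bp repeat, 8 bp spacer, 13 bp repeat).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Model Cre excision between the first two same-orientation sites.

    The sequence between the two sites is removed and a single site
    remains, which is the expected outcome for directly repeated loxP sites.
    """
    first = allele.find(site)
    if first == -1:
        return allele  # no site: nothing to excise
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele  # only one site: nothing to excise
    # Keep everything before the first site, then resume at the second site.
    return allele[:first] + allele[second:]


# Toy placeholder sequences (hypothetical, for illustration only).
upstream = "GGGAAA"
exon7 = "CCCTTTCCC"
downstream = "TTTGGG"

targeted_allele = upstream + LOXP + exon7 + LOXP + downstream
ko_allele = cre_excise(targeted_allele)
```

In this toy model the targeted allele keeps the floxed region intact until `cre_excise` is applied, mirroring the difference between the targeted allele and the constitutive KO allele.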

Exon 7 starts at about 16.86% of the coding region. Knocking out exon 7 will result in a frameshift of the gene. The size of intron 6 for 5'-loxP site insertion is 26814 bp, and the size of intron 7 for 3'-loxP site insertion is 1396 bp. The size of the effective cKO region is ~702 bp. The cKO region does not contain any other known gene.
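The arithmetic behind these statements can be sketched in Python. This is an illustration under stated assumptions: the 773 aa protein length is taken from the Ensembl entry for Tle4-201 in this report, the percentage is assumed to be relative to the coding sequence including the stop codon, and the exon lengths passed to `causes_frameshift` are hypothetical because the report does not state the length of exon 7.

```python
PROTEIN_LENGTH_AA = 773      # Tle4-201 protein length listed in this report
FRACTION_AT_EXON7 = 0.1686   # "about 16.86% of the coding region"


def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding sequence length in nucleotides (3 nt per codon)."""
    return 3 * (protein_aa + (1 if include_stop else 0))


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


cds_nt = cds_length_nt(PROTEIN_LENGTH_AA)              # 2322
approx_start_nt = round(FRACTION_AT_EXON7 * cds_nt)    # 391
approx_start_codon = (approx_start_nt - 1) // 3 + 1    # 131

# Hypothetical exon lengths, only to show the frame rule.
print(causes_frameshift(100))  # True: 100 is not divisible by 3
print(causes_frameshift(99))   # False: 99 is divisible by 3
```

Under these assumptions the deletion begins early in the coding sequence, so a frameshift would disrupt most of the protein.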


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: wildtype allele (exons 1, 7, 8 and 20 labeled, with the 5' and 3' gRNA regions marked); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Tle4; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeats are found in the dot plot matrix. Therefore, this region is suitable for PCR screening or sequencing analysis.
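A self-alignment of this kind can be approximated with exact word matching. The following Python sketch is not the tool used for the plot; it assumes exact matches of a 10 bp window, and the names `revcomp` and `self_dot_plot` are illustrative.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) dots for exact window matches.

    forward: the window at i equals the window at j (i != j).
    reverse: the window at i equals the reverse complement of the window at j.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    # Index every window by its start positions.
    index = {}
    for i in range(n_windows):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j != i)
        reverse.extend((i, j) for j in index.get(revcomp(word), []))
    return forward, reverse


# Toy example: a 10 bp unit repeated after a 3 bp gap.
unit = "ACGGTCATGC"
forward_dots, _ = self_dot_plot(unit + "TTT" + unit, window=10)
print(forward_dots)  # [(0, 13), (13, 0)]
```

Long diagonals of off-diagonal dots would indicate repeats; an empty or sparse result corresponds to the "no significant tandem repeats" conclusion.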

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length 7202 bp

Base  Count  Percent
A     2141   29.73%
C     1260   17.5%
T     2416   33.55%
G     1385   19.23%

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high-GC-content region is found. Therefore, this region is suitable for PCR screening or sequencing analysis.
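The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed in the same way. The Python sketch below uses the reported counts; the `windowed_gc` helper is an illustrative assumption about how a 300 bp sliding window could be computed, not the tool used for the plot.

```python
# Base counts reported in the summary (full length 7202 bp).
COUNTS = {"A": 2141, "C": 1260, "T": 2416, "G": 1385}

total_bp = sum(COUNTS.values())                                # 7202
gc_percent = 100.0 * (COUNTS["G"] + COUNTS["C"]) / total_bp    # about 36.73


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage for each full window of the given size along seq."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(total_bp, round(gc_percent, 2))  # 7202 36.73
```

An overall GC content of roughly 37% is consistent with the note that no high-GC region was found, although only the windowed profile can confirm that locally.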


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  -       14518224   14521223   3000
YourSeq  35     1451   1494  3000   97.4%     chr13  +       56661443   56661488   46
YourSeq  34     1345   1395  3000   87.0%     chr1   -       61761698   61761762   65
YourSeq  34     1355   1393  3000   94.9%     chr13  +       12108893   12108934   42
YourSeq  28     1496   1526  3000   96.7%     chr11  -       39226539   39226572   34
YourSeq  27     53     97    3000   64.6%     chr2   +       53255241   53255271   31
YourSeq  27     1427   1461  3000   91.0%     chr13  +       71646321   71646357   37
YourSeq  25     1496   1521  3000   100.0%    chr13  +       7425736    7425763    28
YourSeq  22     1496   1520  3000   95.9%     chr5   -       130379993  130380021  29
YourSeq  21     1496   1516  3000   100.0%    chr2   +       86697883   86697903   21
YourSeq  20     1496   1515  3000   100.0%    chr2   -       12387536   12387555   20

Note: The 3000 bp section upstream of Exon 7 is BLAT searched against the genome. Apart from the query's own locus on chr19 (score 3000), no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  -       14514522   14517521   3000
YourSeq  103    1836   2044  3000   85.6%     chr7   -       129822469  129822715  247
YourSeq  93     1916   2075  3000   87.7%     chr10  +       128772871  128773223  353
YourSeq  92     1926   2075  3000   87.0%     chr2   +       116942831  116942984  154
YourSeq  89     1911   2036  3000   85.6%     chr14  +       78201274   78201399   126
YourSeq  88     1914   2023  3000   90.8%     chr10  -       118220211  118762776  542566
YourSeq  87     1916   2036  3000   84.9%     chr1   -       90387794   90387913   120
YourSeq  86     1916   2036  3000   85.9%     chr6   -       54412186   54412307   122
YourSeq  86     1916   2074  3000   86.4%     chr14  -       45150089   45150250   162
YourSeq  86     1931   2083  3000   88.3%     chr6   +       95178476   95178631   156
YourSeq  85     1911   2036  3000   84.0%     chr19  -       26650308   26650433   126
YourSeq  84     1915   2036  3000   86.8%     chr13  -       15477576   15477698   123
YourSeq  83     1926   2036  3000   86.3%     chr2   -       105449822  105449931  110
YourSeq  83     1926   2075  3000   83.1%     chr13  +       69355476   69355629   154
YourSeq  82     1931   2060  3000   82.7%     chr14  +       48071671   48071800   130
YourSeq  81     1918   2035  3000   87.2%     chr14  -       20470967   20471084   118
YourSeq  81     1916   2036  3000   83.0%     chr1   -       194326766  194326885  120
YourSeq  80     1916   2036  3000   79.5%     chr14  -       61829281   61829397   117
YourSeq  80     1921   2023  3000   89.3%     chr11  -       46236889   46236991   103
YourSeq  80     1919   2036  3000   82.1%     chr1   +       164769423  164769539  117

Note: The 3000 bp section downstream of Exon 7 is BLAT searched against the genome. Apart from the query's own locus on chr19 (score 3000), no significant similarity is found.
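The reasoning behind these notes (one full-length self hit, all other hits short and low-scoring) can be checked programmatically. The Python sketch below parses whitespace-separated BLAT result rows in the column order shown above; the names `BlatHit`, `parse_hit` and `off_target_hits` are illustrative, and treating a hit with score equal to the query size and 100% identity as the self hit is an assumption.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one result row: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def off_target_hits(hits):
    """Drop the self hit (assumed: score == query size and 100% identity)."""
    return [h for h in hits if not (h.score == h.q_size and h.identity == 100.0)]


# Three rows copied from the downstream search results.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr19 - 14514522 14517521 3000",
    "YourSeq 103 1836 2044 3000 85.6% chr7 - 129822469 129822715 247",
    "YourSeq 93 1916 2075 3000 87.7% chr10 + 128772871 128773223 353",
]
hits = [parse_hit(r) for r in rows]
best_off_target = max(off_target_hits(hits), key=lambda h: h.score)
# Fraction of the 3000 bp query covered by the best off-target hit.
coverage = (best_off_target.q_end - best_off_target.q_start + 1) / best_off_target.q_size
```

Here the best off-target hit scores 103 and covers about 7% of the query, which supports reading the secondary hits as short repeat-like matches rather than a duplicated locus.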


Gene and information: Tle4, transducin-like enhancer of split 4 [Mus musculus (house mouse)]. Gene ID: 21888, updated on 10-Oct-2019

Gene summary

Official Symbol: Tle4 (provided by MGI)
Official Full Name: transducin-like enhancer of split 4 (provided by MGI)
Primary source: MGI:MGI:104633
Related: Ensembl:ENSMUSG00000024642
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Bce1; Grg4; Bce-1; grg-4; ESTM13; ESTM14; X83332; X83333; AA792082; 5730411M05Rik
Expression: Ubiquitous expression in whole brain E14.5 (RPKM 15.3), CNS E18 (RPKM 14.5) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 19 A; 19 9.11 cM

Exon count: 22

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  19   NC_000085.6 (14448072..14598183, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    19   NC_000085.5 (14522562..14672473, complement)

[Figure: genomic context of Tle4 on chromosome 19 (NC_000085.6).]


Transcript information: This gene has 3 transcripts

Gene: Tle4 ENSMUSG00000024642

Description: transducin-like enhancer of split 4 [Source:MGI Symbol;Acc:MGI:104633]
Gene Synonyms: 5730411M05Rik, Bce1, ESTM13, ESTM14, Grg4
Location: Chromosome 19: 14,448,150-14,598,051 reverse strand. GRCm38:CM001012.2
About this gene: This gene has 3 transcripts (splice variants), 242 orthologues, 6 paralogues, is a member of 1 Ensembl protein family and is associated with 30 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Tle4-201  ENSMUST00000052011.14  4545  773aa       ENSMUSP00000057527.7  Protein coding   CCDS50399  Q62441   TSL:1, GENCODE basic, APPRIS P2
Tle4-202  ENSMUST00000167776.2   4545  773aa       ENSMUSP00000126249.2  Protein coding   -          F6ZBY9   TSL:5, GENCODE basic, APPRIS ALT1
Tle4-203  ENSMUST00000235975.1   1323  No protein  -                     Retained intron  -          -        -

[Figure: Ensembl genomic view of the Tle4 locus (169.90 kb, approximately 14.45 Mb to 14.60 Mb). Contigs: AC122202.4, AC125315.4. Transcripts shown on the reverse strand: Tle4-202 (protein coding), Tle4-201 (protein coding), Tle4-203 (retained intron). Regulatory Build legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000052011

[Figure: protein domain and variant view for Tle4-201 (protein coding; reverse strand, 149.90 kb), protein ENSMUSP00000057..., scale 0 to 773 aa. Annotation tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily: WD40-repeat-containing domain superfamily; SMART: WD40 repeat; Prints: Groucho/transducin-like enhancer; Pfam: Groucho/TLE, N-terminal Q-rich domain; WD40 repeat; PROSITE profiles: WD40-repeat-containing domain; WD40 repeat; PROSITE patterns: WD40 repeat, conserved site; PANTHER: Groucho/transducin-like enhancer; PTHR10814:SF29; Gene3D: WD40/YVTN repeat-like-containing domain superfamily; CDD: cd00200; sequence variants (dbSNP and all other sources). Variant legend: stop gained; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
