https://www.alphaknockout.com

Mouse Eaf2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Eaf2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Eaf2 gene (NCBI Reference Sequence: NM_001113401; Ensembl: ENSMUSG00000022838) is located on mouse chromosome 16. Six exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000114829). Exon 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Eaf2 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-400K10 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null allele exhibit premature death, enlarged heart and prostate associated with hypertrophy, and increased incidence of tumors.

Exon 4 starts at about 43.13% of the coding region. Knocking out exon 4 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion is 2311 bp, and the size of intron 4 for 3'-loxP site insertion is 7170 bp. The size of the effective cKO region is ~646 bp. The cKO region does not contain any other known gene.
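The frameshift statement rests on simple arithmetic: deleting a number of coding bases that is not a multiple of 3 shifts the downstream reading frame. The Python sketch below illustrates this under stated assumptions. The CDS length of 789 bp is inferred from the 262 aa protein listed for ENSMUST00000114829 (262 codons plus one stop codon), and the exon lengths passed in the example calls are purely hypothetical, because this report does not state the coding length of exon 4.

```python
def approx_cds_offset(fraction: float, cds_length_bp: int) -> int:
    """Convert a fractional position in the coding region to an approximate base offset."""
    return round(fraction * cds_length_bp)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """True if removing this many coding bases shifts the reading frame."""
    return deleted_coding_bp % 3 != 0


if __name__ == "__main__":
    # Assumption: 262 codons + 1 stop codon = 789 bp of coding sequence.
    cds_length = (262 + 1) * 3
    print("Approximate CDS offset of exon 4 start:", approx_cds_offset(0.4313, cds_length))

    # Hypothetical exon lengths, for illustration only.
    for length in (99, 100):
        print(length, "bp deleted -> frameshift:", causes_frameshift(length))
```

With the real coding length of exon 4 taken from the Ensembl exon table, the same check would confirm or refute the frameshift for the actual design.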


Overview of the Targeting Strategy

Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Exons 1, 4 and 6 are labelled.

Legend: exon of mouse Eaf2; homology arm; cKO region; loxP site.


Overview of the Dot Plot

Window size: 10 bp

Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
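The self-alignment described in this note can be sketched in Python as a word-match dot plot. This is an illustrative implementation only, not the tool used to produce the figure; the function names and the toy sequences are assumptions. It reports every pair of positions whose 10 bp windows are identical, in the forward direction or against the reverse complement.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def dot_plot_matches(seq: str, window: int = 10, reverse: bool = False):
    """List (i, j) pairs where seq[i:i+window] equals the window at j in the target.

    The target is seq itself, or its reverse complement when reverse=True.
    For the forward comparison the trivial main diagonal (i == j) is skipped.
    """
    seq = seq.upper()
    target = reverse_complement(seq) if reverse else seq

    # Index every window of the target by its sequence.
    index = {}
    for j in range(len(target) - window + 1):
        index.setdefault(target[j:j + window], []).append(j)

    hits = []
    for i in range(len(seq) - window + 1):
        for j in index.get(seq[i:i + window], []):
            if reverse or i != j:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    unit = "ACGGTCATTG"                # hypothetical 10 bp repeat unit
    repeated = unit + unit + "CCCCC"   # toy sequence containing a tandem repeat
    print(dot_plot_matches(repeated, window=10))
```

Off-diagonal hits that line up parallel to the main diagonal indicate tandem repeats; an empty result for a given window size corresponds to a clean dot plot.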

Overview of the GC Content Distribution

Window size: 300 bp

Figure: GC content along the sequence.

Summary: Full Length(7146bp) | A(34.17% 2442) | C(17.35% 1240) | T(28.56% 2041) | G(19.91% 1423)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
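From the base counts in the summary, the overall GC content is (1240 + 1423) / 7146, or about 37.27%. The Python sketch below shows how such a summary and a sliding-window GC profile could be computed. The function names and the `step` parameter are illustrative assumptions; only the 300 bp window size and the base counts come from this report.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return {base: (count, percent)} for A, C, G and T."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: (counts[b], 100.0 * counts[b] / total) for b in "ACGT"}


def gc_percent_from_counts(a: int, c: int, g: int, t: int) -> float:
    """Overall GC percentage from per-base counts."""
    return 100.0 * (g + c) / (a + c + g + t)


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list:
    """Return (start, GC percent) for each window of the given size."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile


if __name__ == "__main__":
    # Base counts as reported in the summary line of this report.
    print(round(gc_percent_from_counts(a=2442, c=1240, g=1423, t=2041), 2))
```

Windows whose GC percentage is far above the overall value would be the regions to inspect before designing PCR primers.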


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr16  -       36808411   36811410   3000
YourSeq  107    1901   2045  3000   86.9%     chr3   +       123687844  123687988  145
YourSeq  96     1905   2134  3000   77.9%     chr3   -       88067811   88068024   214
YourSeq  94     1926   2176  3000   83.3%     chr1   +       133682271  133682676  406
YourSeq  93     1912   2141  3000   83.5%     chr4   -       130187711  130188202  492
YourSeq  93     1909   2110  3000   78.8%     chr14  +       62938827   62939001   175
YourSeq  91     1901   2099  3000   89.8%     chr2   -       32670393   32670592   200
YourSeq  91     1908   2040  3000   84.3%     chr5   +       100216009  100216141  133
YourSeq  90     1900   2043  3000   81.3%     chr8   -       25493704   25493847   144
YourSeq  90     1905   2040  3000   83.1%     chr13  -       41613892   41614027   136
YourSeq  90     1905   2039  3000   82.5%     chrX   +       101613874  101614002  129
YourSeq  90     1901   2140  3000   74.9%     chr3   +       82007856   82008037   182
YourSeq  87     1917   2043  3000   84.3%     chr19  -       7515009    7515135    127
YourSeq  86     1910   2043  3000   82.1%     chr9   +       40619469   40619602   134
YourSeq  85     1922   2043  3000   82.5%     chr15  -       100473145  100473264  120
YourSeq  85     1900   2040  3000   89.8%     chrX   +       10508945   10509087   143
YourSeq  85     1910   2042  3000   82.0%     chr11  +       67403356   67403488   133
YourSeq  83     1900   2038  3000   79.9%     chr8   -       118370949  118371087  139
YourSeq  83     1905   2029  3000   83.2%     chr19  -       9925813    9925937    125
YourSeq  82     1878   2014  3000   83.7%     chr11  +       96589111   96589248   138

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr16  -       36804765   36807764   3000
YourSeq  171    2782   2963  3000   97.3%     chr2   -       20817017   20817209   193
YourSeq  170    2764   2958  3000   94.4%     chr17  +       7137590    7137811    222
YourSeq  169    2782   2961  3000   97.3%     chr18  -       36393382   36393934   553
YourSeq  169    2778   2962  3000   96.2%     chr17  +       6226698    6226884    187
YourSeq  166    2774   2964  3000   93.4%     chr7   +       130448420  130448607  188
YourSeq  163    2778   2994  3000   95.1%     chr14  +       54326183   54326419   237
YourSeq  162    2751   2936  3000   92.1%     chr4   -       129411750  129411928  179
YourSeq  162    2778   2962  3000   95.0%     chr1   -       153494358  153494553  196
YourSeq  162    2779   2955  3000   96.1%     chr8   +       61640454   61640650   197
YourSeq  162    2778   2945  3000   98.3%     chr11  +       75664949   75665116   168
YourSeq  161    2767   2945  3000   93.6%     chr12  -       44271606   44271777   172
YourSeq  161    2779   2951  3000   95.4%     chr9   +       13527005   13527176   172
YourSeq  160    2768   2942  3000   97.1%     chr18  +       35368506   35368684   179
YourSeq  159    2779   2948  3000   97.1%     chrX   -       41461665   41461847   183
YourSeq  159    2778   2945  3000   97.7%     chr4   -       32071338   32071506   169
YourSeq  159    2782   2950  3000   95.9%     chr1   -       51504532   51504699   168
YourSeq  159    2778   2945  3000   97.7%     chr4   +       88741228   88741397   170
YourSeq  159    2777   2945  3000   97.1%     chr16  +       5054727    5054895    169
YourSeq  158    2782   2949  3000   97.1%     chr2   -       31070372   31070539   168

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
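One way to read the BLAT tables is to check how much of the 3000 bp query each secondary hit covers. The Python sketch below parses rows in the column order shown above and flags secondary hits that exceed coverage and identity thresholds. The class name, function names and threshold values are hypothetical choices for illustration; this report does not state the criteria used to judge significance.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by this hit."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row; the first field is the query name."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def significant_secondary_hits(hits, min_coverage=0.5, min_identity=90.0):
    """Return hits other than the top-scoring one that pass both thresholds.

    The default thresholds are illustrative assumptions, not the report's criteria.
    """
    primary = max(hits, key=lambda h: h.score)
    return [h for h in hits
            if h is not primary
            and h.query_coverage >= min_coverage
            and h.identity >= min_identity]


if __name__ == "__main__":
    rows = [
        "YourSeq 3000 1 3000 3000 100.0% chr16 - 36804765 36807764 3000",
        "YourSeq 171 2782 2963 3000 97.3% chr2 - 20817017 20817209 193",
        "YourSeq 170 2764 2958 3000 94.4% chr17 + 7137590 7137811 222",
    ]
    hits = [parse_hit(r) for r in rows]
    print(significant_secondary_hits(hits))
```

In the downstream table the best secondary hit covers query positions 2782 to 2963, that is 182 of 3000 bases or about 6% of the query.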


Gene and information:

Eaf2 ELL associated factor 2 [Mus musculus (house mouse)]
Gene ID: 106389, updated on 24-Oct-2019

Gene summary

Official Symbol: Eaf2 (provided by MGI)
Official Full Name: ELL associated factor 2 (provided by MGI)
Primary source: MGI:MGI:2146616
See related: Ensembl:ENSMUSG00000022838
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: U19; Festa; Traits; FESTA-L; FESTA-S; AW048865
Expression: Broad expression in bladder adult (RPKM 1.1), kidney adult (RPKM 1.0) and 24 other tissues
Orthologs: human, all

Genomic context

Location: 16; 16 B3

Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  16   NC_000082.6 (36792884..36875068, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    16   NC_000082.5 (36792970..36874912, complement)

Figure: genomic context of Eaf2 on chromosome 16 (NC_000082.6).


Transcript information: This gene has 9 transcripts

Gene: Eaf2 ENSMUSG00000022838

Description: ELL associated factor 2 [Source:MGI Symbol;Acc:MGI:2146616]
Gene Synonyms: FESTA-L, FESTA-S, Festa, Traits, U19
Location: Chromosome 16: 36,792,884-36,875,003 reverse strand. GRCm38:CM001009.2
About this gene: This gene has 9 transcripts (splice variants), 198 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 12 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Eaf2-202  ENSMUST00000075946.11  1878  132aa       ENSMUSP00000075331.5  Protein coding   CCDS37335  Q91ZD6   TSL:1 GENCODE basic
Eaf2-203  ENSMUST00000114825.2   1530  132aa       ENSMUSP00000110473.1  Protein coding   CCDS37335  Q91ZD6   TSL:1 GENCODE basic
Eaf2-204  ENSMUST00000114829.8   1030  262aa       ENSMUSP00000110477.2  Protein coding   CCDS49843  Q91ZD6   TSL:1 GENCODE basic APPRIS P2
Eaf2-201  ENSMUST00000023537.5   849   195aa       ENSMUSP00000023537.5  Protein coding   -          K4DI60   TSL:5 GENCODE basic APPRIS ALT2
Eaf2-205  ENSMUST00000134556.1   2377  No protein  -                     Retained intron  -          -        TSL:1
Eaf2-208  ENSMUST00000157072.7   1106  No protein  -                     lncRNA           -          -        TSL:5
Eaf2-206  ENSMUST00000138660.7   963   No protein  -                     lncRNA           -          -        TSL:1
Eaf2-209  ENSMUST00000231782.1   668   No protein  -                     lncRNA           -          -        -
Eaf2-207  ENSMUST00000147053.1   667   No protein  -                     lncRNA           -          -        TSL:3


Figure: Ensembl genome browser view of the Eaf2 locus on chromosome 16 (102.12 kb, 36.80 Mb to 36.88 Mb). Forward-strand features (Comprehensive gene set): Gm22617-201 (miRNA), Iqcb1-201 to Iqcb1-208, 2600002D14Rik-201 (lncRNA), Golgb1-201, Golgb1-203, Golgb1-204 and Golgb1-205, and Gm49599-201 (TEC). Contigs: AC117662.13 and AC154232.1. Reverse-strand features: Slc15a2-201, Slc15a2-205, Slc15a2-207, Slc15a2-208 and Slc15a2-212, 4930565N06Rik-201 (lncRNA), and Eaf2-201 to Eaf2-209. A Regulatory Build track is shown below the genes.

Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site.

Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).


Transcript: ENSMUST00000114829

Figure: Eaf2-204 (protein coding) on the reverse strand, 34.52 kb, with protein annotation tracks for ENSMUSP00000110...: MobiDB lite; low complexity (Seg); Pfam: Transcription elongation factor Eaf, N-terminal; PANTHER: EAF family (PTHR15970:SF7); sequence variants (dbSNP and all other sources). Scale bar: 0 to 262 amino acids.

Variant legend: missense variant; splice region variant; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
