https://www.alphaknockout.com

Mouse Esf1 Knockout Project (CRISPR/Cas9)

Objective: To create an Esf1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Esf1 gene (NCBI Reference Sequence: NM_001081090; Ensembl: ENSMUSG00000045624) is located on mouse chromosome 2. Fourteen exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 14 (Transcript: ENSMUST00000046030). Exons 2~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 contains the start of the coding region. Exons 2~6 cover 54.64% of the coding region. The size of the effective KO region is ~9976 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2, 3, 4, 5, 6 and 14, with two gRNA regions marked. Legend: exon of mouse Esf1; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement. Axis label: Sequence 12.]

Note: The 1310 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
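The self-alignment described in the dot plot notes can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it assumes exact matching of 15 bp words (the report only states the window size, not the scoring rule), and the function names `reverse_complement` and `self_dot_plot` are hypothetical. Off-diagonal forward matches indicate direct repeats; reverse-complement matches indicate inverted repeats.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Compare a sequence with itself using exact matches of `window` bp.

    Returns two sorted lists of (i, j) start positions (0-based):
    forward  - seq[i:i+window] == seq[j:j+window], with i < j
    reverse  - seq[i:i+window] == reverse complement of seq[j:j+window]
    Exact matching is an assumption made for this sketch.
    """
    seq = seq.upper()
    n = len(seq)

    # Index every window-sized word by its start positions.
    index = defaultdict(list)
    for i in range(n - window + 1):
        index[seq[i:i + window]].append(i)

    # Off-diagonal forward matches: the same word at two different positions.
    forward = sorted(
        (i, j) for pos in index.values() for i in pos for j in pos if i < j
    )

    # rc[k:k+window] is the reverse complement of seq[j:j+window]
    # with j = n - window - k.
    rc = reverse_complement(seq)
    reverse = []
    for k in range(n - window + 1):
        j = n - window - k
        for i in index.get(rc[k:k + window], ()):
            reverse.append((i, j))
    return forward, sorted(reverse)


# Toy input (not from the report): "ACGG" occurs at positions 0 and 8.
toy_forward, toy_reverse = self_dot_plot("ACGGTTTTACGG", window=4)
```

A gRNA site would then be chosen so that it does not overlap any position reported in either list.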


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(27.55% 551) | C(18.1% 362) | T(32.4% 648) | G(21.95% 439)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(1310bp) | A(28.47% 373) | C(18.17% 238) | T(38.02% 498) | G(15.34% 201)

Note: The 1310 bp section downstream of Exon 6 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
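The GC analysis can be illustrated with a short sketch. The base counts below are copied from the two Summary lines; the overall GC fractions derived from them are about 40.05% (upstream) and 33.51% (downstream). The sliding-window function uses the 300 bp window stated in the section headings, but the step size and the 0.65 threshold for "high GC" are assumptions made for illustration, since the report does not state them, and the function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 50):
    """GC fraction of each full window; `step` is an assumed parameter."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq: str, window: int = 300, step: int = 50,
                    threshold: float = 0.65):
    """Windows at or above `threshold` (0.65 is an assumed cutoff)."""
    return [(s, gc) for s, gc in sliding_gc(seq, window, step) if gc >= threshold]


def overall_gc(counts: dict) -> float:
    """Overall GC fraction from per-base counts."""
    return (counts["G"] + counts["C"]) / sum(counts.values())


# Base counts as reported in the Summary lines.
upstream_counts = {"A": 551, "C": 362, "T": 648, "G": 439}    # 2000 bp
downstream_counts = {"A": 373, "C": 238, "T": 498, "G": 201}  # 1310 bp

upstream_gc = overall_gc(upstream_counts)      # 801 / 2000
downstream_gc = overall_gc(downstream_counts)  # 439 / 1310
```

With the actual flanking sequences as input, an empty result from `high_gc_windows` would correspond to the report's statement that no high GC-content region is found, under the assumed threshold.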


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  2000   1       2000  2000   100.0%    chr2   -       140168415  140170414  2000
YourSeq  64     662     848   2000   69.0%     chr11  -       82754645   82754821   177
YourSeq  51     649     842   2000   70.8%     chr18  -       21461005   21461145   141
YourSeq  50     642     711   2000   86.7%     chr1   -       184848764  184848831  68
YourSeq  49     648     841   2000   63.8%     chr1   -       135774503  135774688  186
YourSeq  46     647     719   2000   80.8%     chr17  +       81345420   81345488   69
YourSeq  46     1153    1284  2000   82.2%     chr12  +       32695390   32695516   127
YourSeq  42     743     849   2000   70.2%     chr15  +       102774297  102774408  112
YourSeq  40     807     1253  2000   95.5%     chr16  -       30035130   30035636   507
YourSeq  39     743     847   2000   67.0%     chr12  +       108828615  108828718  104
YourSeq  39     630     673   2000   95.5%     chr11  +       93512247   93512291   45
YourSeq  36     616     665   2000   86.0%     chr11  +       39633027   39633076   50
YourSeq  34     461     673   2000   85.8%     chr4   -       111359150  111359361  212
YourSeq  34     652     758   2000   89.5%     chr2   -       155980671  155980776  106
YourSeq  34     655     719   2000   79.5%     chr1   +       185843349  185843408  60
YourSeq  33     1234    1346  2000   97.2%     chr18  -       35086533   35086646   114
YourSeq  33     636     672   2000   94.6%     chr11  +       109446870  109446906  37
YourSeq  32     636     673   2000   92.2%     chr13  +       99998004   99998041   38
YourSeq  31     646     714   2000   91.2%     chr8   -       47232074   47232141   68
YourSeq  29     1199    1237  2000   94.0%     chr18  -       13560065   13560105   41

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  1310   1       1310  1310   100.0%    chr2   -       140157170  140158479  1310
YourSeq  263    27      343   1310   94.1%     chr11  -       62429777   62430369   593
YourSeq  263    16      340   1310   92.7%     chr15  +       34148593   34148995   403
YourSeq  261    18      341   1310   92.6%     chr2   -       26523992   26524354   363
YourSeq  246    32      341   1310   92.1%     chr9   +       62788896   62789250   355
YourSeq  245    32      342   1310   91.0%     chr15  +       102317968  102318311  344
YourSeq  241    15      342   1310   90.4%     chr17  -       34924990   34925314   325
YourSeq  224    16      325   1310   93.5%     chr5   +       139577757  139578089  333
YourSeq  222    33      342   1310   90.0%     chr7   +       133138593  133138911  319
YourSeq  209    28      294   1310   91.6%     chrX   +       38568032   38568304   273
YourSeq  205    32      316   1310   90.8%     chr11  -       77545873   77546364   492
YourSeq  198    147     558   1310   93.4%     chr12  -       108370276  108499027  128752
YourSeq  189    156     558   1310   88.0%     chr5   -       115501443  115501708  266
YourSeq  185    76      337   1310   93.5%     chr13  -       93593723   93594326   604
YourSeq  180    151     344   1310   95.9%     chr2   +       84016245   84016437   193
YourSeq  178    23      323   1310   90.5%     chr4   -       32620147   32620676   530
YourSeq  177    153     341   1310   96.3%     chr12  +       100081403  100081590  188
YourSeq  176    149     337   1310   95.8%     chr5   +       65992647   65992834   188
YourSeq  176    151     340   1310   96.9%     chr4   +       59821236   59821428   193
YourSeq  174    153     340   1310   95.7%     chr16  -       17602377   17602563   187

Note: The 1310 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
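For readers who want to inspect the BLAT rows programmatically, the sketch below parses a result row into named fields, separates the query's own locus from secondary hits, and reports how much of the query a secondary hit covers. The three sample rows are copied from the downstream table; the class and function names are hypothetical, and the sketch deliberately sets no cutoff for "significant" similarity, since the report does not state the criterion it applied.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by this hit."""
        return (self.q_end - self.q_start + 1) / self.q_size

    @property
    def is_self_hit(self) -> bool:
        """True when the hit is the full-length, 100% identical source locus."""
        return (self.score == self.q_size and self.identity == 100.0
                and self.q_start == 1 and self.q_end == self.q_size)


def parse_blat_row(line: str) -> BlatHit:
    """Parse one whitespace-separated row; leading link text is ignored."""
    f = line.split()[-11:]
    if len(f) != 11:
        raise ValueError(f"expected 11 fields, got {len(f)}: {line!r}")
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


# First three rows of the downstream search, as listed in the table.
rows = [
    "YourSeq 1310 1 1310 1310 100.0% chr2 - 140157170 140158479 1310",
    "YourSeq 263 27 343 1310 94.1% chr11 - 62429777 62430369 593",
    "YourSeq 263 16 340 1310 92.7% chr15 + 34148593 34148995 403",
]
hits = [parse_blat_row(r) for r in rows]
secondary = [h for h in hits if not h.is_self_hit]
best_secondary = max(secondary, key=lambda h: h.score)
```

Here `best_secondary.query_coverage` gives the share of the 1310 bp query matched by the top secondary hit, which is one of the quantities a reader might weigh when judging off-target similarity.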


Gene and information:
Esf1, ESF1 nucleolar pre-rRNA processing protein homolog [Mus musculus (house mouse)]
Gene ID: 66580, updated on 12-Aug-2019

Gene summary

Official Symbol: Esf1 (provided by MGI)
Official Full Name: ESF1 nucleolar pre-rRNA processing protein homolog (provided by MGI)
Primary source: MGI:MGI:1913830
See related: Ensembl:ENSMUSG00000045624
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: ABTAP; C79684; AW545818; 2610101J03Rik
Expression: Biased expression in CNS E11.5 (RPKM 9.1), placenta adult (RPKM 4.4) and 11 other tissues
Orthologs: human

Genomic context

Location: 2; 2 F3
Exon count: 15

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (140119881..140170558, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (139945617..139996294, complement)
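The coordinates quoted in this report can be cross-checked with simple interval arithmetic. The sketch below uses the GRCm38 gene location from the table above and the chr2 positions of the two flanking query sequences from the BLAT results; the helper names are hypothetical. It assumes all coordinates are 1-based and inclusive, and it makes no assumption about where the gRNA cut sites lie, because the report does not give them.

```python
def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def contains(outer: tuple, inner: tuple) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def gap_between(lower: tuple, upper: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


# GRCm38.p6, chromosome 2 (Esf1 is on the complement strand, so the region
# "upstream" of exon 2 has the higher coordinates).
gene_locus = (140119881, 140170558)        # NCBI annotation release 108
upstream_flank = (140168415, 140170414)    # 2000 bp BLAT query, upstream of exon 2
downstream_flank = (140157170, 140158479)  # 1310 bp BLAT query, downstream of exon 6

gene_length = interval_length(*gene_locus)
between_flanks = gap_between(downstream_flank, upstream_flank)
flanks_inside_gene = (contains(gene_locus, upstream_flank)
                      and contains(gene_locus, downstream_flank))
```

Under these assumptions the gene locus is 50678 bp, both flanks fall inside it, and 9935 bp lie between the two flanks. That figure is close to, but not the same as, the stated effective KO region of ~9976 bp; the difference is not interpreted here.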

Chromosome 2 - NC_000068.7


Transcript information: This gene has 5 transcripts

Gene: Esf1 ENSMUSG00000045624

Description: ESF1 nucleolar pre-rRNA processing protein homolog [Source:MGI Symbol;Acc:MGI:1913830]
Gene Synonyms: 2610101J03Rik
Location: Chromosome 2: 140,119,883-140,170,564 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 5 transcripts (splice variants), 240 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt        Flags
Esf1-201  ENSMUST00000046030.7  3396  845aa       ENSMUSP00000036523.7  Protein coding   CCDS38250  A2APY6 Q3V1V3  TSL:1 GENCODE basic APPRIS P1
Esf1-202  ENSMUST00000137005.7  639   No protein  -                     Retained intron  -          -              TSL:2
Esf1-205  ENSMUST00000153769.1  617   No protein  -                     Retained intron  -          -              TSL:2
Esf1-204  ENSMUST00000151317.1  613   No protein  -                     Retained intron  -          -              TSL:2
Esf1-203  ENSMUST00000141086.1  533   No protein  -                     Retained intron  -          -              TSL:3

[Figure: Ensembl view of a 70.68 kb region around Esf1.
Forward strand genes: Gm17374-201 (protein coding), Ndufaf5-201 (protein coding), Ndufaf5-202 (retained intron), Ndufaf5-203 (lncRNA).
Contigs: AL929001.7, AL844528.13.
Reverse strand genes: Gm14073-201 (processed pseudogene), Esf1-205 (retained intron), Esf1-203 (retained intron), Esf1-201 (protein coding), Esf1-202 (retained intron), Esf1-204 (retained intron).
Regulation legend: CTCF, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript, pseudogene, RNA gene).]


Transcript: ENSMUST00000046030

[Figure: Esf1-201 (protein coding) on the reverse strand, 50.68 kb, with the protein ENSMUSP00000036... annotated by the following tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Pfam (NUC153); PANTHER (Pre-rRNA-processing protein Esf1); sequence variants (dbSNP and all other sources).
Variant legend: inframe insertion, inframe deletion, missense variant, synonymous variant.
Scale bar: 0 to 845.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
