https://www.alphaknockout.com

Mouse Eml1 Knockout Project (CRISPR/Cas9)

Objective: To create an Eml1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Eml1 gene (NCBI Reference Sequence: NM_001043335; Ensembl: ENSMUSG00000058070) is located on mouse chromosome 12. 22 exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 22 (Transcript: ENSMUST00000109860). Exons 2~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a spontaneous mutation exhibit subcortical band heterotopia associated with seizures, developmental delay and behavioral deficits.

Exon 2 starts at about 2.78% of the coding region. Exons 2~3 cover 12.94% of the coding region. The size of the effective KO region is ~8459 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 2, 3 and 22 labeled and two gRNA regions marked. Legend: exon of mouse Eml1; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the upstream section against itself. Legend: Forward; Reverse Complement. Axis: Sequence 12.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the downstream section against itself. Legend: Forward; Reverse Complement. Axis: Sequence 12.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
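The self-alignment described in the two notes above can be illustrated with a minimal sketch. The report does not name the dot-plot tool or its scoring, so the exact-match comparison of 15 bp windows below is an assumption, as are the function names and the toy sequence. Only forward-strand matches are considered; the reverse-complement comparison shown in the figure legend is left out.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where seq[i:i+window] == seq[j:j+window].

    Assumption: exact window matches only; real dot-plot tools may tolerate mismatches.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def hits_per_offset(hits):
    """Count hits on each off-diagonal (offset j - i).

    A tandem repeat appears as a well-populated diagonal whose offset
    is a multiple of the repeat unit length.
    """
    counts = defaultdict(int)
    for i, j in hits:
        counts[j - i] += 1
    return dict(counts)


# Hypothetical toy sequence: an 8 bp unit repeated four times, then unrelated bases.
toy = "ACGTTGCA" * 4 + "GATTACAGATTACA"
offsets = hits_per_offset(self_dot_plot(toy, window=15))
print(offsets)
```

With a real 2000 bp flank, positions covered by populated off-diagonals would be the regions to avoid when placing a gRNA site.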


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution plot for the upstream section. Axis: Sequence 12.]

Summary: Full Length(2000bp) | A(26.55% 531) | C(23.3% 466) | T(25.65% 513) | G(24.5% 490)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution plot for the downstream section. Axis: Sequence 12.]

Summary: Full Length(2000bp) | A(32.5% 650) | C(29.55% 591) | T(20.8% 416) | G(17.15% 343)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
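As a quick check on the two base-composition summaries, the overall GC fraction of each 2000 bp section can be derived from the reported counts. This is a minimal sketch: the function and variable names are illustrative, and it gives only the whole-section value, not the 300 bp sliding-window profile in the plots, which would require the sequences themselves.

```python
def gc_percent(counts):
    """Overall GC content in percent from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


# Base counts copied from the two Summary lines of the report.
upstream = {"A": 531, "C": 466, "T": 513, "G": 490}
downstream = {"A": 650, "C": 591, "T": 416, "G": 343}

for name, counts in (("upstream", upstream), ("downstream", downstream)):
    print(name, sum(counts.values()), round(gc_percent(counts), 2))
```

Both sections sum to 2000 bases, with an overall GC content of about 47.8% upstream and 46.7% downstream.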


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  +       108461528  108463527  2000
YourSeq  275    1      286   2000   98.3%     chr1   -       181322248  181322534  287
YourSeq  271    1      304   2000   96.3%     chr6   +       89682801   89683107   307
YourSeq  269    2      486   2000   90.7%     chr1   +       194608355  194608734  380
YourSeq  268    1      285   2000   97.2%     chr1   +       52747659   52747944   286
YourSeq  267    1      291   2000   95.2%     chr11  -       4443081    4443370    290
YourSeq  267    1      286   2000   96.9%     chr4   +       124890293  124890579  287
YourSeq  267    1      286   2000   96.9%     chr1   +       189905116  189905403  288
YourSeq  266    1      284   2000   97.2%     chr4   -       131852506  131852791  286
YourSeq  266    1      287   2000   96.6%     chr3   -       19027960   19028246   287
YourSeq  266    1      286   2000   96.9%     chr5   +       38462854   38463141   288
YourSeq  266    6      308   2000   95.9%     chr3   +       36596726   36597165   440
YourSeq  266    1      286   2000   96.6%     chr16  +       31673799   31674084   286
YourSeq  265    1      286   2000   96.6%     chr9   -       102968865  102969151  287
YourSeq  265    1      287   2000   96.6%     chr16  -       20092539   20092828   290
YourSeq  264    1      286   2000   96.2%     chr18  -       36213048   36213333   286
YourSeq  264    1      297   2000   93.9%     chr15  -       17182421   17182713   293
YourSeq  264    1      286   2000   96.6%     chr14  -       49268882   49269169   288
YourSeq  264    1      286   2000   96.2%     chr13  -       108030812  108031097  286
YourSeq  264    8      294   2000   95.5%     chr1   -       180557917  180558201  285

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  +       108471987  108473986  2000
YourSeq  275    838    1391  2000   90.7%     chr15  +       97566651   97921044   354394
YourSeq  237    840    1391  2000   92.5%     chr7   -       141652568  141653320  753
YourSeq  215    841    1365  2000   91.8%     chr15  +       97566401   97567094   694
YourSeq  209    841    1390  2000   86.0%     chr16  -       37865283   37865607   325
YourSeq  198    960    1391  2000   92.8%     chr13  -       12331467   12561499   230033
YourSeq  196    943    1382  2000   93.6%     chr4   -       133560303  133560829  527
YourSeq  196    853    1391  2000   93.5%     chr4   -       131347398  131347992  595
YourSeq  195    1052   1383  2000   92.9%     chr3   -       152482892  152483597  706
YourSeq  193    867    1391  2000   93.0%     chr8   -       91843855   91844864   1010
YourSeq  193    943    1393  2000   84.9%     chr5   +       141964863  141965280  418
YourSeq  193    1020   1387  2000   91.9%     chr2   +       26261458   26261853   396
YourSeq  190    1008   1391  2000   91.3%     chr4   +       45593904   45594398   495
YourSeq  189    840    1209  2000   92.9%     chr12  +       55538382   55538848   467
YourSeq  181    696    1386  2000   89.7%     chr10  -       69896776   70115087   218312
YourSeq  180    984    1393  2000   92.9%     chr11  -       6884327    6884836    510
YourSeq  178    978    1391  2000   92.1%     chr5   +       106734431  106950033  215603
YourSeq  167    1011   1286  2000   92.0%     chr1   +       190462994  190463321  328
YourSeq  165    840    1380  2000   85.7%     chr11  -       6884369    6884771    403
YourSeq  164    983    1349  2000   93.7%     chr19  -       60171624   60172186   563

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
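The top hit in each BLAT table gives the chr12 coordinates of the corresponding 2000 bp flank, and the interval between the two flanks can be computed from them. The sketch below assumes the coordinates are inclusive as displayed (the 100% hits span exactly 2000 bp under that reading) and that the flanks directly abut the deleted interval. The report does not state how the KO size was derived, so this is an inference, and the variable names are illustrative.

```python
# chr12 coordinates of the 100% identity hits, copied from the two BLAT tables.
UP_START, UP_END = 108_461_528, 108_463_527      # 2000 bp upstream of Exon 2
DOWN_START, DOWN_END = 108_471_987, 108_473_986  # 2000 bp downstream of Exon 3

# Inclusive span of each flank; both should be 2000 bp.
up_len = UP_END - UP_START + 1
down_len = DOWN_END - DOWN_START + 1

# Assumption: the KO region is everything strictly between the two flanks.
ko_start = UP_END + 1
ko_end = DOWN_START - 1
ko_size = ko_end - ko_start + 1

print(f"flanks: {up_len} bp, {down_len} bp")
print(f"region between flanks: chr12:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under these assumptions the interval between the flanks is 8459 bp, which agrees with the effective KO region size of ~8459 bp given in the strategy summary.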


Gene and information: Eml1 echinoderm microtubule associated protein like 1 [ Mus musculus (house mouse) ] Gene ID: 68519, updated on 24-Oct-2019

Gene summary

Official Symbol: Eml1 (provided by MGI)
Official Full Name: echinoderm microtubule associated protein like 1 (provided by MGI)
Primary source: MGI:MGI:1915769
See related: Ensembl:ENSMUSG00000058070
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: EMAP; heco; ELP79; EMAPL; EMAP-1; AA171013; AI847476; AI853955; 1110008N23Rik; A930030P13Rik
Expression: Broad expression in bladder adult (RPKM 32.9), subcutaneous fat pad adult (RPKM 15.9) and 22 other tissues
Orthologs: human, all

Genomic context

Location: 12 F1; 12 59.46 cM
Exon count: 30

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (108371002..108539576)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (109648865..109777774)

Chromosome 12 - NC_000078.6


Transcript information: This gene has 8 transcripts

Gene: Eml1 ENSMUSG00000058070

Description: echinoderm microtubule associated protein like 1 [Source:MGI Symbol;Acc:MGI:1915769]
Gene Synonyms: 1110008N23Rik, A930030P13Rik, ELP79, heco
Location: Chromosome 12: 108,370,957-108,539,617 forward strand. GRCm38:CM001005.2
About this gene: This gene has 8 transcripts (splice variants), 211 orthologues, 9 paralogues, is a member of 1 Ensembl protein family and is associated with 41 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Eml1-203  ENSMUST00000109860.7   4160  814aa       ENSMUSP00000105486.1  Protein coding           CCDS36555  Q05BC3   TSL:1 GENCODE basic APPRIS P4
Eml1-201  ENSMUST00000054955.13  3879  783aa       ENSMUSP00000057209.7  Protein coding           CCDS36556  Q05BC3   TSL:1 GENCODE basic APPRIS ALT2
Eml1-202  ENSMUST00000109857.7   2627  800aa       ENSMUSP00000105483.1  Protein coding           CCDS70420  D3Z4J9   TSL:1 GENCODE basic APPRIS ALT2
Eml1-205  ENSMUST00000130999.1   2493  699aa       ENSMUSP00000118325.1  Nonsense mediated decay  -          D6RII3   TSL:2
Eml1-206  ENSMUST00000138456.7   2984  No protein  -                     Retained intron          -          -        TSL:1
Eml1-208  ENSMUST00000155544.7   4169  No protein  -                     lncRNA                   -          -        TSL:5
Eml1-204  ENSMUST00000123035.1   1730  No protein  -                     lncRNA                   -          -        TSL:1
Eml1-207  ENSMUST00000148186.1   332   No protein  -                     lncRNA                   -          -        TSL:3


[Figure: Ensembl genome browser view of a 188.66 kb region of chromosome 12 (about 108.40 Mb to 108.50 Mb), forward and reverse strands.
Forward strand genes (Comprehensive set): Cyp46a1-201 (protein coding); Eml1-201, Eml1-202, Eml1-203 (protein coding); Eml1-205 (nonsense mediated decay); Eml1-206 (retained intron); Eml1-204, Eml1-207, Eml1-208 (lncRNA).
Contigs: AC154910.3.
Reverse strand genes (Comprehensive set): Gm15636-201 (processed pseudogene); Gm16596-201, Gm16596-202, Gm16596-203 (lncRNA).
Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank.
Gene legend: Protein Coding (Ensembl protein coding, merged Ensembl/Havana); Non-Protein Coding (RNA gene, processed transcript, pseudogene).]


Transcript: ENSMUST00000109860

[Figure: protein domain view of Eml1-203 (protein coding; 116.75 kb, forward strand; protein ENSMUSP00000105...), with a scale bar from 0 to 814.
Annotation tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils).
Superfamily: Quinoprotein alcohol dehydrogenase-like superfamily; SSF50960.
SMART: WD40 repeat.
Pfam: HELP; WD40 repeat.
PROSITE profiles: WD40-repeat-containing domain; WD40 repeat.
PROSITE patterns: WD40 repeat, conserved site.
PANTHER: PTHR13720:SF22; PTHR13720.
Gene3D: WD40/YVTN repeat-like-containing domain superfamily.
Sequence variants (dbSNP and all other sources). Variant legend: stop lost, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
