https://www.alphaknockout.com

Mouse Elmo2 Knockout Project (CRISPR/Cas9)

Objective: To create an Elmo2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Elmo2 gene (NCBI Reference Sequence: NM_001302752; Ensembl: ENSMUSG00000017670) is located on mouse chromosome 2. 21 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 21 (Transcript: ENSMUST00000103091). Exons 2~8 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: The coding region starts in exon 2. Exons 2~8 cover 31.34% of the coding region. The size of the effective KO region is ~7439 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele of mouse Elmo2 drawn 5' to 3', showing exons 1 to 8 and exon 21, with a gRNA region on each side of the knockout region (exons 2~8). Legend: exon of mouse Elmo2; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the 2000 bp sequence upstream of Exon 2 against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeats are found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the 2000 bp sequence downstream of Exon 8 against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 8 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
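The self-alignment behind both dot plots can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses exact 15-mer matching, whereas the report does not name the tool or the matching rule it used, and `self_dotplot` and `diagonal_counts` are hypothetical helper names. Many dots sharing one small offset from the main diagonal would suggest a tandem repeat with that period.

```python
from collections import Counter, defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def self_dotplot(seq, k=15):
    """Return (forward, reverse) dots for exact k-mer matches of seq against itself.

    A forward dot (i, j) with i < j means seq[i:i+k] == seq[j:j+k].
    A reverse dot (i, j) means seq[i:i+k] equals the reverse complement
    of seq[j:j+k].
    """
    # Index every k-mer by its start positions.
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)

    forward = sorted((i, j) for hits in index.values()
                     for i in hits for j in hits if i < j)
    reverse = sorted((i, j)
                     for j in range(len(seq) - k + 1)
                     for i in index.get(revcomp(seq[j:j + k]), ()))
    return forward, reverse

def diagonal_counts(forward):
    """Count forward dots per diagonal offset (j - i)."""
    return Counter(j - i for i, j in forward)

# Toy input (not from the report): a 5 bp unit repeated four times.
toy = "ACGGT" * 4
forward, reverse = self_dotplot(toy, k=5)
print(diagonal_counts(forward).most_common(1))  # [(5, 11)]
```

On a real 2000 bp flank the call would be `self_dotplot(flank_sequence)` with the default `k=15`, matching the window size stated above; the sequence itself is not included in this report.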


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution along the 2000 bp sequence upstream of Exon 2.]

Summary: Full Length(2000bp) | A(22.75% 455) | C(27.8% 556) | T(22.55% 451) | G(26.9% 538)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution along the 2000 bp sequence downstream of Exon 8.]

Summary: Full Length(2000bp) | A(22.35% 447) | C(27.4% 548) | T(28.05% 561) | G(22.2% 444)

Note: The 2000 bp section downstream of Exon 8 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
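As a rough cross-check of the two summaries, the overall GC fraction follows directly from the reported base counts, and a windowed profile can be computed the same way. The sketch below assumes a simple sliding window moved one base at a time; the report states only the 300 bp window size, not the step, and the function names are hypothetical.

```python
def gc_fraction_from_counts(counts):
    """Overall GC fraction from a dict of base counts."""
    return (counts["G"] + counts["C"]) / sum(counts.values())

def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window of the given size along seq."""
    seq = seq.upper()
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]

# Base counts copied from the two Summary lines in this report.
upstream = {"A": 455, "C": 556, "T": 451, "G": 538}
downstream = {"A": 447, "C": 548, "T": 561, "G": 444}

print(gc_fraction_from_counts(upstream))    # 0.547
print(gc_fraction_from_counts(downstream))  # 0.496
```

Both flanks therefore sit near 50% GC overall (54.7% upstream, 49.6% downstream). A window-level profile would need the actual flank sequences, which are not reproduced here.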


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  2000   1      2000  2000   100.0%    chr2   -       165316317  165318316  2000
YourSeq  31     8      43    2000   91.0%     chr6   +       101331319  101331353  35
YourSeq  29     949    980   2000   96.8%     chr2   -       153253785  153254162  378
YourSeq  28     6      38    2000   96.7%     chr1   +       134902066  134902098  33
YourSeq  23     1126   1148  2000   100.0%    chr6   -       88350079   88350101   23
YourSeq  22     109    130   2000   100.0%    chr19  -       11790335   11790356   22

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  2000   1      2000  2000   100.0%    chr2   -       165306913  165308912  2000
YourSeq  225    990    1838  2000   89.9%     chr10  -       70704481   70705391   911
YourSeq  181    1570   2000  2000   89.3%     chr8   +       46729250   46729825   576
YourSeq  179    1511   2000  2000   84.1%     chr18  +       49313867   49314411   545
YourSeq  174    994    1634  2000   80.6%     chr3   +       30970950   30971623   674
YourSeq  173    1159   1534  2000   87.2%     chr5   +       65205050   65205460   411
YourSeq  169    1449   2000  2000   88.7%     chr11  +       3618247    4003809    385563
YourSeq  168    1060   1599  2000   87.7%     chr19  +       23012561   23013129   569
YourSeq  165    1445   2000  2000   86.5%     chr10  +       77028676   77029263   588
YourSeq  165    1567   1995  2000   88.9%     chr10  +       66025047   66025620   574
YourSeq  163    1397   1944  2000   84.2%     chr11  -       116176938  116177557  620
YourSeq  159    1570   1932  2000   87.8%     chr10  +       126460161  126460562  402
YourSeq  158    1395   1932  2000   88.9%     chr2   -       83899871   83900460   590
YourSeq  153    1395   1885  2000   89.8%     chr4   +       34290926   34291481   556
YourSeq  146    1712   2000  2000   87.9%     chr10  +       10564916   10565338   423
YourSeq  145    1712   2000  2000   87.8%     chrX   +       18386817   18387245   429
YourSeq  145    1618   2000  2000   88.9%     chr1   +       15212023   15212447   425
YourSeq  143    1653   1932  2000   88.4%     chr14  -       88113566   88113957   392
YourSeq  140    1712   2000  2000   87.4%     chr3   +       123242590  123243019  430
YourSeq  137    1466   1932  2000   87.1%     chr2   -       134236546  134237112  567

Note: The 2000 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
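The top hit in each BLAT table is the flank itself, so the two self-hits give the genomic positions of the flanks on chr2. The sketch below parses those two rows and computes the interval lying between them. It assumes that each 2000 bp flank directly abuts exon 2 or exon 8, which the notes imply but do not state outright; `BlatHit`, `parse_hit` and `interval_between` are hypothetical names.

```python
from typing import NamedTuple

class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int

def parse_hit(line):
    """Parse one whitespace-separated row of the BLAT tables above.

    Field order: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN
    """
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), float(f[5].rstrip("%")),
                   f[6], f[7], int(f[8]), int(f[9]))

def interval_between(a, b):
    """Genomic interval strictly between two non-overlapping hits (1-based, inclusive)."""
    lo, hi = sorted((a, b), key=lambda h: h.t_start)
    return lo.t_end + 1, hi.t_start - 1

# Self-hits copied from the first row of each table.
up = parse_hit("YourSeq 2000 1 2000 2000 100.0% chr2 - 165316317 165318316 2000")
down = parse_hit("YourSeq 2000 1 2000 2000 100.0% chr2 - 165306913 165308912 2000")

start, end = interval_between(up, down)
print(start, end, end - start + 1)  # 165308913 165316316 7404
```

Under that assumption, exons 2 through 8 span about 7404 bp on the reverse strand of chr2. The ~7439 bp effective KO region stated in the strategy summary is slightly larger, which would be consistent with gRNA sites lying just outside these exons, although the report does not give the cut positions.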


Gene and information: Elmo2 engulfment and cell motility 2 [ Mus musculus (house mouse) ] Gene ID: 140579, updated on 12-Aug-2019

Gene summary

Official Symbol: Elmo2 (provided by MGI)
Official Full Name: engulfment and cell motility 2 (provided by MGI)
Primary source: MGI:MGI:2153045
See related: Ensembl:ENSMUSG00000017670
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CED-12; 1190002F24Rik
Expression: Ubiquitous expression in cerebellum adult (RPKM 32.9), frontal lobe adult (RPKM 28.9) and 28 other tissues
Orthologs: human

Genomic context

Location: 2; 2 H3
Exon count: 25

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (165288031..165326479, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (165113531..165151979, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 12 transcripts

Gene: Elmo2 ENSMUSG00000017670

Description: engulfment and cell motility 2 [Source:MGI Symbol;Acc:MGI:2153045]
Gene Synonyms: 1190002F24Rik, CED-12
Location: Chromosome 2: 165,288,031-165,326,479 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 12 transcripts (splice variants), 197 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Elmo2-205  ENSMUST00000103091.8   4695  720aa       ENSMUSP00000099380.2  Protein coding           CCDS17075  Q8BHL5   TSL:1, GENCODE basic, APPRIS P2
Elmo2-202  ENSMUST00000074046.12  3502  732aa       ENSMUSP00000073691.6  Protein coding           CCDS17076  Q8BHL5   TSL:1, GENCODE basic
Elmo2-203  ENSMUST00000094329.10  3466  720aa       ENSMUSP00000091887.4  Protein coding           CCDS17075  Q8BHL5   TSL:1, GENCODE basic, APPRIS P2
Elmo2-201  ENSMUST00000071699.10  3247  798aa       ENSMUSP00000071619.4  Protein coding           -          Q8BHL5   TSL:1, GENCODE basic, APPRIS ALT2
Elmo2-204  ENSMUST00000103088.9   3052  798aa       ENSMUSP00000099377.3  Protein coding           -          Q8BHL5   TSL:5, GENCODE basic, APPRIS ALT2
Elmo2-211  ENSMUST00000148643.7   2078  290aa       ENSMUSP00000117124.1  Protein coding           -          F6YXR3   CDS 5' incomplete, TSL:1
Elmo2-209  ENSMUST00000133205.7   627   122aa       ENSMUSP00000119682.1  Protein coding           -          Q5GMG0   CDS 3' incomplete, TSL:2
Elmo2-210  ENSMUST00000137188.1   581   194aa       ENSMUSP00000123232.1  Protein coding           -          F6XD63   CDS 5' and 3' incomplete, TSL:3
Elmo2-208  ENSMUST00000128690.1   352   41aa        ENSMUSP00000114303.1  Protein coding           -          A2A5A7   CDS 3' incomplete, TSL:5
Elmo2-206  ENSMUST00000126318.7   639   63aa        ENSMUSP00000116124.1  Nonsense mediated decay  -          D6RHT3   TSL:5
Elmo2-212  ENSMUST00000149844.7   557   No protein  -                     Retained intron          -          -        TSL:5
Elmo2-207  ENSMUST00000127496.7   795   No protein  -                     lncRNA                   -          -        TSL:3


[Figure: Ensembl genome browser view of a 58.45 kb region of chromosome 2 (165.28 Mb to 165.33 Mb, contig AL591430.8). The Genes (Comprehensive set) track shows the twelve Elmo2 transcripts (Elmo2-201 to Elmo2-212) and fifteen transcripts of the neighboring gene Slc35c2 (Slc35c2-201 to Slc35c2-204 and Slc35c2-206 to Slc35c2-216), all on the reverse strand, above the Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000103091

[Figure: protein domain and variation view for Elmo2-205 (protein coding, reverse strand, 32.80 kb), protein ENSMUSP00000099..., on a scale of 0 to 720 residues. Tracks shown:
Low complexity (Seg)
Superfamily: Armadillo-type fold; SSF50729
Pfam: ELMO domain; Pleckstrin homology domain; Domain of unknown function DUF3361
PROSITE profiles: ELMO domain
PANTHER: PTHR12771; Engulfment and cell motility protein 2
Gene3D: Armadillo-like helical; PH-like domain superfamily; 1.10.8.920
CDD: cd13359
Sequence variants (dbSNP and all other sources): missense variant, synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
