https://www.alphaknockout.com

Mouse Eml4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Eml4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Eml4 gene (NCBI Reference Sequence: NM_001114361; Ensembl: ENSMUSG00000032624) is located on mouse chromosome 17. 24 exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 24 (Transcript: ENSMUST00000096766). Exons 5~7 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Eml4 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-456J9 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 17.31% of the coding region. Knockout of exons 5~7 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 1791 bp, and the size of intron 7 for 3'-loxP site insertion is 11782 bp. The size of the effective cKO region is ~1526 bp. The cKO region does not contain any other known gene.
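The frameshift reasoning in the note can be illustrated with a minimal Python sketch. A deletion of coding sequence shifts the reading frame when the number of deleted coding bases is not a multiple of 3. The exon lengths used below are hypothetical placeholders, since this report does not list the lengths of exons 5~7. The 988 aa protein length is taken from the transcript table for ENSMUST00000096766, and the function names are illustrative only.

```python
def causes_frameshift(exon_coding_lengths):
    """Return True if deleting exons with these coding lengths (bp) shifts the frame."""
    deleted_bp = sum(exon_coding_lengths)
    # A deletion keeps the reading frame only if it removes whole codons.
    return deleted_bp % 3 != 0


def approx_residue_at_fraction(fraction, protein_length_aa):
    """Approximate residue index reached at a given fraction of the coding region."""
    return round(fraction * protein_length_aa)


# HYPOTHETICAL coding lengths for three deleted exons (not taken from the report).
example_lengths = [100, 131, 91]
print("Frameshift:", causes_frameshift(example_lengths))

# Exon 5 is reported to start at about 17.31% of the coding region (988 aa protein).
print("Approximate first affected residue:", approx_residue_at_fraction(0.1731, 988))
```

With real exon lengths from the Ensembl transcript record substituted for the placeholders, the same check would show whether the stated frameshift follows from the deletion size.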

Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: wildtype allele (exons 1, 4, 5, 6, 7 and 24 shown, with the 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Eml4, homology arm, cKO region, loxP site.]

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
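The self-alignment behind a dot plot can be sketched as follows. This is a simplified illustration rather than the tool used for this report: it records forward-strand matches only, using exact 10 bp words (matching the stated window size), and the test sequence is made up for the example.

```python
def dot_plot_matches(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where the words of length `window`
    starting at positions i and j of `seq` are identical (forward strand only)."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for starts in positions.values():
        # Every pair of occurrences of the same word is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Made-up 30 bp sequence in which the first 10 bp word recurs at position 20.
example_seq = "ACGTACGTAC" + "TTTTGGGGCC" + "ACGTACGTAC"
print(dot_plot_matches(example_seq, window=10))
```

Runs of dots parallel to the main diagonal would indicate repeats; an empty or sparse result corresponds to the "no significant tandem repeat" conclusion.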

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(8026bp) | A(28.32% 2273) | C(19.1% 1533) | T(34.08% 2735) | G(18.5% 1485)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
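The base composition in the summary line can be re-derived from the reported counts, and a windowed GC profile can be computed in the same way. The sketch below assumes non-overlapping 300 bp windows, which the report does not specify, and the function names are illustrative.

```python
def percent_composition(counts):
    """Convert base counts to percentages of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {base: round(100.0 * n / total, 2) for base, n in counts.items()}


def gc_content_windows(seq, window=300, step=300):
    """Return (start, GC percent) for each full window along `seq`.
    Non-overlapping windows (step == window) are an ASSUMPTION."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile


# Base counts as reported in the summary line (full length 8026 bp).
reported_counts = {"A": 2273, "C": 1533, "T": 2735, "G": 1485}
composition = percent_composition(reported_counts)
overall_gc = round(
    100.0 * (reported_counts["G"] + reported_counts["C"]) / sum(reported_counts.values()), 2
)
print(composition, overall_gc)
```

From the reported counts, the overall GC content works out to about 37.6%, which is consistent with the note that no high GC-content region is present.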

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  +       83423974   83426973   3000
YourSeq  87     2      565   3000   74.0%     chr2   -       168968704  168969077  374
YourSeq  85     2      377   3000   72.5%     chr6   -       38380308   38380497   190
YourSeq  82     1      105   3000   86.5%     chr6   +       10827886   10827988   103
YourSeq  81     8      105   3000   88.6%     chr5   -       34329261   34329356   96
YourSeq  81     2      105   3000   86.3%     chr12  -       99976476   99976577   102
YourSeq  81     2      116   3000   86.8%     chr5   +       111638223  111638338  116
YourSeq  81     2      105   3000   86.3%     chr11  +       84021051   84021152   102
YourSeq  80     2      112   3000   83.5%     chr3   -       74157416   74157524   109
YourSeq  79     2      105   3000   87.0%     chrX   -       40250616   40250718   103
YourSeq  79     2      107   3000   84.7%     chr3   -       88275745   88275848   104
YourSeq  79     2      105   3000   85.3%     chrX   +       152790182  152790283  102
YourSeq  78     2      105   3000   82.9%     chr8   +       107340215  107340313  99
YourSeq  77     2      105   3000   84.4%     chr19  -       4905385    4905486    102
YourSeq  77     2      105   3000   84.4%     chr6   +       125199633  125199734  102
YourSeq  77     2      104   3000   83.4%     chr10  +       93133542   93133638   97
YourSeq  76     2      105   3000   89.1%     chr15  -       34337025   34337126   102
YourSeq  76     2      105   3000   86.5%     chr4   +       143496678  143496779  102
YourSeq  76     2      105   3000   86.5%     chr16  +       75395743   75395844   102
YourSeq  75     2      114   3000   84.5%     chrX   +       115999457  115999568  112

Note: The 3000 bp section upstream of exon 5 was searched against the genome with BLAT. Apart from the expected full-length match to the Eml4 locus on chr17, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  +       83428500   83431499   3000
YourSeq  148    848    1466  3000   84.8%     chrX   +       165019970  165020255  286
YourSeq  139    847    1048  3000   94.3%     chr8   -       35026040   35026336   297
YourSeq  139    2261   2719  3000   78.3%     chr3   -       65202800   65203150   351
YourSeq  133    853    1048  3000   94.6%     chr11  +       79000375   79000824   450
YourSeq  132    843    1315  3000   81.7%     chr4   +       38263603   38263804   202
YourSeq  131    429    980   3000   83.3%     chr12  +       28287206   28287374   169
YourSeq  130    847    998   3000   89.6%     chr15  -       99962501   99962644   144
YourSeq  129    845    981   3000   94.9%     chr11  +       19239412   19239546   135
YourSeq  127    847    1314  3000   81.6%     chr10  +       51755376   51755528   153
YourSeq  127    840    1315  3000   79.4%     chr10  +       49859289   49859506   218
YourSeq  125    847    1048  3000   90.3%     chr11  -       52410323   52410637   315
YourSeq  124    2238   2490  3000   84.4%     chr8   -       18192254   18192566   313
YourSeq  124    841    981   3000   94.3%     chr1   -       33759871   33760017   147
YourSeq  124    847    980   3000   94.8%     chr17  +       29060569   29060701   133
YourSeq  124    852    1315  3000   81.2%     chr15  +       39916639   39916803   165
YourSeq  123    847    979   3000   94.7%     chr7   -       55978280   55978411   132
YourSeq  123    848    978   3000   95.4%     chr18  -       32649000   32649129   130
YourSeq  123    848    980   3000   94.7%     chr17  -       33837280   33837411   132
YourSeq  123    847    979   3000   94.7%     chr15  -       85124520   85124651   132

Note: The 3000 bp section downstream of exon 7 was searched against the genome with BLAT. Apart from the expected full-length match to the Eml4 locus on chr17, no significant similarity is found.
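The BLAT hit rows can be read programmatically to compare the self-match with the best off-target hit, and to check the spacing between the two flanking queries. The sketch below is illustrative: the parser and helper names are hypothetical, and only the first two rows of each table are included. Interpreting the gap between the two chr17 self-matches as the effective cKO region is an assumption, although the resulting 1526 bp agrees with the ~1526 bp stated in the strategy notes.

```python
def parse_blat_row(row):
    """Parse one whitespace-separated BLAT hit row into a dict."""
    tokens = row.split()
    # Drop the "browser details" link text if it is present.
    if tokens[:2] == ["browser", "details"]:
        tokens = tokens[2:]
    (query, score, q_start, q_end, q_size,
     identity, chrom, strand, t_start, t_end, span) = tokens
    return {
        "query": query, "score": int(score),
        "q_start": int(q_start), "q_end": int(q_end), "q_size": int(q_size),
        "identity": float(identity.rstrip("%")),
        "chrom": chrom, "strand": strand,
        "t_start": int(t_start), "t_end": int(t_end), "span": int(span),
    }


def best_off_target(hits):
    """Highest-scoring hit that is not a full-length, 100% identity match."""
    others = [h for h in hits
              if not (h["identity"] == 100.0 and h["span"] == h["q_size"])]
    return max(others, key=lambda h: h["score"]) if others else None


# First two rows of the upstream and downstream tables, copied from the report.
up_hits = [parse_blat_row(r) for r in [
    "browser details YourSeq 3000 1 3000 3000 100.0% chr17 + 83423974 83426973 3000",
    "browser details YourSeq 87 2 565 3000 74.0% chr2 - 168968704 168969077 374",
]]
down_hits = [parse_blat_row(r) for r in [
    "browser details YourSeq 3000 1 3000 3000 100.0% chr17 + 83428500 83431499 3000",
    "browser details YourSeq 148 848 1466 3000 84.8% chrX + 165019970 165020255 286",
]]

# Bases lying strictly between the upstream and downstream self-matches.
gap_bp = down_hits[0]["t_start"] - up_hits[0]["t_end"] - 1
print(best_off_target(up_hits)["score"], best_off_target(down_hits)["score"], gap_bp)
```

The best off-target scores (87 and 148 against a self-match score of 3000) give a rough sense of why the report treats the remaining hits as not significant; the report itself does not state a numerical threshold.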

Gene and information:

Eml4 echinoderm microtubule associated protein like 4 [Mus musculus (house mouse)]

Gene ID: 78798, updated on 24-Oct-2019

Gene summary

Official Symbol: Eml4 (provided by MGI)
Official Full Name: echinoderm microtubule associated protein like 4 (provided by MGI)
Primary source: MGI:MGI:1926048
See related: Ensembl:ENSMUSG00000032624
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI644019; 4930443C24Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 20.0), CNS E14 (RPKM 14.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 17; 17 E4

Exon count: 27

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (83350891..83480361)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (83750271..83879701)

[Figure: genomic context of Eml4 on chromosome 17 (NC_000083.6).]

Transcript information: This gene has 6 transcripts

Gene: Eml4 ENSMUSG00000032624

Description: echinoderm microtubule associated protein like 4 [Source:MGI Symbol;Acc:MGI:1926048]
Gene Synonyms: 4930443C24Rik
Location: Chromosome 17: 83,350,931-83,480,361 forward strand. GRCm38:CM001010.2
About this gene: This gene has 6 transcripts (splice variants), 207 orthologues, 9 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Eml4-202  ENSMUST00000096766.11  5441  988aa       ENSMUSP00000094528.4  Protein coding   CCDS50192  F8WJ93      TSL:1 GENCODE basic APPRIS ALT2
Eml4-203  ENSMUST00000112363.9   5081  919aa       ENSMUSP00000107982.2  Protein coding   CCDS50193  A0A0R4J1G7  TSL:1 GENCODE basic APPRIS ALT2
Eml4-201  ENSMUST00000049503.9   5052  876aa       ENSMUSP00000041880.8  Protein coding   CCDS37708  A0A0R4J0H0  TSL:1 GENCODE basic APPRIS P3
Eml4-204  ENSMUST00000234460.1   5286  941aa       ENSMUSP00000157143.1  Protein coding   -          A0A3Q4EHV9  GENCODE basic APPRIS ALT2
Eml4-205  ENSMUST00000234584.1   2934  977aa       ENSMUSP00000157195.1  Protein coding   -          A7ISP9      GENCODE basic APPRIS ALT2
Eml4-206  ENSMUST00000235121.1   3814  No protein  -                     Retained intron  -          -           -

[Figure: Ensembl genome browser view of the Eml4 locus (149.43 kb, forward strand, approximately 83.36 Mb to 83.48 Mb). Tracks: transcripts Eml4-201 to Eml4-205 (protein coding) and Eml4-206 (retained intron); contigs AC167969.2 and AC164634.3; Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (processed transcript).]

Transcript: ENSMUST00000096766

[Figure: protein domain view of Eml4-202 (protein coding; 129.43 kb, forward strand; scale bar 0 to 988 aa). Annotated features: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily (Quinoprotein alcohol dehydrogenase-like superfamily); SMART (WD40 repeat); Pfam (HELP, WD40 repeat); PROSITE profiles (WD40 repeat, WD40-repeat-containing domain); PROSITE patterns (WD40 repeat, conserved site); PANTHER (PTHR13720, PTHR13720:SF11); Gene3D (WD40/YVTN repeat-like-containing domain superfamily). Sequence variants (dbSNP and all other sources): missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
