https://www.alphaknockout.com

Mouse Efhc2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Efhc2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Efhc2 gene (NCBI Reference Sequence: NM_028916; Ensembl: ENSMUSG00000025038) is located on mouse chromosome X. Fifteen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 15 (Transcript: ENSMUST00000026014). Exon 3 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Efhc2 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP24-299C2 as template. Cas9, gRNA, and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 10.31% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2, for 5'-loxP site insertion, is 37900 bp, and the size of intron 3, for 3'-loxP site insertion, is 12702 bp. The size of the effective cKO region is ~651 bp. The cKO region does not contain any other known gene.
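As a rough illustration of the note above, the sketch below converts the stated 10.31% offset into an approximate position within the coding sequence and shows the divisibility-by-three test that underlies a frameshift call. The 750 aa protein length is taken from the transcript information in this report. The function names are illustrative, and the exon lengths passed to `causes_frameshift` are hypothetical placeholders, because the report does not state the exact size of exon 3.

```python
def cds_length_bp(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in bp for a protein of the given size."""
    codons = protein_aa + (1 if include_stop else 0)
    return codons * 3


def approx_cds_offset(fraction: float, cds_bp: int) -> int:
    """Approximate CDS position (bp) at a given fraction of the coding region."""
    return round(fraction * cds_bp)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


cds_bp = cds_length_bp(750)                  # 750 aa plus the stop codon
offset = approx_cds_offset(0.1031, cds_bp)   # where exon 3 roughly begins in the CDS
print(f"CDS length: {cds_bp} bp; exon 3 starts near CDS position {offset}")

# Hypothetical exon lengths, for illustration only (not from the report).
for length in (99, 100):
    print(length, "bp deleted -> frameshift:", causes_frameshift(length))
```

An early frameshift of this kind is the usual rationale for choosing an exon near the 5' end of the coding region as the cKO region.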


Overview of the Targeting Strategy

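[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons 1, 3, and 15 labeled, with gRNA regions marked and 5' and 3' ends indicated), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Efhc2; homology arm; cKO region; loxP site.]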


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, with forward and reverse-complement matches plotted.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
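The self-alignment described in the note can be sketched as a word-match dot plot. The code below is a minimal illustration, not the tool used to produce the report's figure: it indexes every 10 bp word in a sequence and reports coordinate pairs where a word recurs on the forward strand or matches the reverse complement. The function names and the toy sequence are assumptions made for the example.

```python
from collections import defaultdict

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) start positions of matching words.

    forward: the word at i recurs at j on the same strand (i < j).
    reverse: the word at i equals the reverse complement of the word at j.
    """
    seq = seq.upper()
    n = len(seq)
    index = defaultdict(list)
    for i in range(n - window + 1):
        index[seq[i:i + window]].append(i)

    forward = [
        (i, j)
        for positions in index.values()
        for i in positions
        for j in positions
        if i < j
    ]

    reverse = []
    rc = reverse_complement(seq)
    for k in range(n - window + 1):
        # Position k on the reverse complement corresponds to the
        # forward-strand word starting at n - k - window.
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, n - k - window))

    return sorted(forward), sorted(reverse)


# Toy sequence with one direct 10 bp repeat (positions 0 and 14).
toy = "ACGTTGCAAC" + "GGGG" + "ACGTTGCAAC"
fwd, rev = self_dot_plot(toy, window=10)
print("forward matches:", fwd)
print("reverse-complement matches:", rev)
```

Long diagonal runs of forward matches away from the main diagonal would indicate tandem or dispersed repeats within the region.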

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length (7151 bp) | A (30.23%, 2162) | C (18.51%, 1324) | T (32.08%, 2294) | G (19.17%, 1371)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
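A GC content scan of this kind can be sketched as follows. This is an illustrative calculation rather than the report's own tool: `percentages_from_counts` recomputes the percentages in the summary line from the reported base counts, and `gc_windows` computes GC content over a sliding window (300 bp by default, as stated above). The function names and the step size are assumptions.

```python
def percentages_from_counts(counts: dict) -> dict:
    """Convert base counts to percentages of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Return (start, GC percent) for each full window along the sequence."""
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, 100 * gc / window))
    return results


# Base counts as reported in the summary line (total 7151 bp).
reported = {"A": 2162, "C": 1324, "T": 2294, "G": 1371}
print(sum(reported.values()), percentages_from_counts(reported))

# Toy sequence, for illustration only.
print(gc_windows("GGGGAAAA", window=4, step=4))
```

Windows with very high GC content tend to amplify and sequence poorly, which is why the scan is used as a suitability check for PCR screening.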


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chrX   -       17243828   17246827   3000
YourSeq  172    1      196   3000   96.3%     chr7   -       12906868   12907464   597
YourSeq  73     2527   2654  3000   91.9%     chr4   +       59089247   59089378   132
YourSeq  60     2571   2654  3000   85.8%     chr2   +       163645746  163645829  84
YourSeq  58     2566   2656  3000   82.3%     chr10  -       79606826   79606926   101
YourSeq  51     2562   2643  3000   81.8%     chr9   +       121558951  121559051  101
YourSeq  51     2544   2627  3000   84.0%     chr11  +       16157363   16157447   85
YourSeq  49     1695   1809  3000   83.6%     chr18  -       64609869   64609984   116
YourSeq  48     2546   2643  3000   94.6%     chr1   +       33198759   33198857   99
YourSeq  47     2601   2664  3000   96.1%     chr2   +       118509522  118509587  66
YourSeq  46     2614   2738  3000   96.1%     chr14  +       10343662   10344137   476
YourSeq  44     2608   2660  3000   92.4%     chr9   -       95816580   95816632   53
YourSeq  42     1664   1739  3000   77.7%     chr12  -       71279448   71279523   76
YourSeq  41     2571   2652  3000   93.7%     chr10  +       105892988  105893073  86
YourSeq  39     1697   1806  3000   83.1%     chr11  -       58069582   58069978   397
YourSeq  37     1640   1687  3000   95.2%     chr1   +       59223300   59223348   49
YourSeq  36     2177   2220  3000   91.0%     chr14  -       31981979   31982022   44
YourSeq  35     1668   1739  3000   79.3%     chr6   -       135625641  135625711  71
YourSeq  35     657    704   3000   87.9%     chr11  -       86895875   86895921   47
YourSeq  34     2571   2660  3000   94.8%     chr17  -       54665213   54665302   90

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
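Rows of a BLAT result table such as the one above can be read into records for filtering or sorting. The sketch below is one possible parser, not part of the original workflow: it takes the last eleven whitespace-separated fields of a row (so any leading link text is ignored) and derives the fraction of the query covered by each hit. The class and function names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity, e.g. 96.3
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by the aligned interval."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_blat_row(row: str) -> BlatHit:
    """Parse one result row; only the last 11 fields are used."""
    fields = row.split()[-11:]
    if len(fields) != 11:
        raise ValueError(f"expected at least 11 fields, got {len(fields)}")
    (query, score, q_start, q_end, q_size, identity,
     chrom, strand, t_start, t_end, span) = fields
    return BlatHit(
        query, int(score), int(q_start), int(q_end), int(q_size),
        float(identity.rstrip("%")), chrom, strand,
        int(t_start), int(t_end), int(span),
    )


# Second row of the upstream table in this report.
hit = parse_blat_row("YourSeq 172 1 196 3000 96.3% chr7 - 12906868 12907464 597")
print(hit.chrom, hit.score, f"{hit.query_coverage:.1%}")
```

Sorting parsed hits by score and query coverage makes it easier to separate the expected self-hit on chrX from partial matches elsewhere in the genome.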

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chrX   -       17240177   17243176   3000
YourSeq  1726   828    3000  3000   91.5%     chr13  +       103692037  103694209  2173
YourSeq  1691   837    3000  3000   91.7%     chr19  +       47758331   47760478   2148
YourSeq  1647   876    3000  3000   90.3%     chr13  -       63825259   63827726   2468
YourSeq  1640   875    3000  3000   90.8%     chr13  -       93778006   93780141   2136
YourSeq  1629   828    3000  3000   90.2%     chr11  +       108576630  108578796  2167
YourSeq  1625   903    2961  3000   91.3%     chr14  -       65517801   65519881   2081
YourSeq  1624   893    3000  3000   90.8%     chr5   +       107672651  107674769  2119
YourSeq  1618   890    2999  3000   90.3%     chr1   +       23861416   23863535   2120
YourSeq  1615   828    3000  3000   89.2%     chrX   +       105300085  105302249  2165
YourSeq  1610   828    3000  3000   89.0%     chr18  -       68486271   68488440   2170
YourSeq  1605   828    3000  3000   89.8%     chrX   +       6147208    6149370    2163
YourSeq  1601   901    3000  3000   91.1%     chr1   -       24147718   24149796   2079
YourSeq  1601   828    2953  3000   90.0%     chr11  +       115575296  115577424  2129
YourSeq  1600   828    3000  3000   89.3%     chr15  -       35260012   35262227   2216
YourSeq  1598   868    3000  3000   89.2%     chr4   +       33818867   33820994   2128
YourSeq  1597   834    3000  3000   90.9%     chr15  -       93361871   93364230   2360
YourSeq  1593   828    3000  3000   89.2%     chr2   -       139153656  139155824  2169
YourSeq  1588   920    3000  3000   90.8%     chr11  +       5657461    5659578    2118
YourSeq  1587   834    3000  3000   89.9%     chr13  -       100247632  100249792  2161

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.


Gene and information:

Efhc2 EF-hand domain (C-terminal) containing 2 [Mus musculus (house mouse)]
Gene ID: 74405, updated on 14-Aug-2019

Gene summary

Official Symbol: Efhc2 (provided by MGI)
Official Full Name: EF-hand domain (C-terminal) containing 2 (provided by MGI)
Primary source: MGI:MGI:1921655
See related: Ensembl:ENSMUSG00000025038
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: mRib72-2; 4933407D04Rik
Expression: Biased expression in testis adult (RPKM 2.7), placenta adult (RPKM 0.8) and 6 other tissues
Orthologs: human

Genomic context

Location: X; X A1.2

Exon count: 17

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  X    NC_000086.7 (17132049..17320049, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    X    NC_000086.6 (16709175..16896494, complement)
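The locations above are 1-based, inclusive coordinate ranges, so the span of each interval is end minus start plus one. The short sketch below applies that rule to the two NCBI locations and to the Ensembl location reported later in this document (17,132,049-17,319,368). The helper name is an assumption.

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive coordinate interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


intervals = {
    "NCBI GRCm38.p6 (NC_000086.7)": (17132049, 17320049),
    "NCBI MGSCv37 (NC_000086.6)": (16709175, 16896494),
    "Ensembl GRCm38": (17132049, 17319368),
}
for label, (start, end) in intervals.items():
    print(f"{label}: {span_length(start, end)} bp")
```

The NCBI and Ensembl annotations share the same start coordinate on GRCm38 but differ at the other end, so the two reported gene lengths are not identical.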

[Figure: genomic context of Efhc2 on Chromosome X - NC_000086.7.]


Transcript information: This gene has 1 transcript

Gene: Efhc2 ENSMUSG00000025038

Description: EF-hand domain (C-terminal) containing 2 [Source:MGI Symbol;Acc:MGI:1921655]
Gene Synonyms: 4933407D04Rik, mRib72-2
Location: Chromosome X: 17,132,049-17,319,368 reverse strand. GRCm38:CM001013.2
About this gene: This gene has 1 transcript (splice variant), 189 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name: Efhc2-201
Transcript ID: ENSMUST00000026014.7
bp: 2554
Protein: 750aa
Translation ID: ENSMUSP00000026014.7
Biotype: Protein coding
CCDS: CCDS40883
UniProt: Q059K2, Q9D485
Flags: TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl genome browser view of the Efhc2 locus (207.32 kb; axis from 17.15 Mb to 17.30 Mb). Forward strand: Gm14514-201 (processed pseudogene) and Gm22294-201 (snRNA). Contigs: AL807734.7, BX004977.8. Reverse strand: Efhc2-201 (protein coding). Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (pseudogene, RNA gene).]


Transcript: ENSMUST00000026014

[Figure: transcript and protein domain view of Efhc2-201 (protein coding; reverse strand, 187.32 kb) for ENSMUSP00000026... Tracks: Low complexity (Seg); Superfamily: EF-hand domain pair; SMART: Uncharacterised domain DM10; Pfam: Domain of unknown function DUF1126; PROSITE profiles: Uncharacterised domain DM10; PANTHER: PTHR12086:SF11, EF-hand domain-containing protein EFHC1/EFHC2/EFHB; Gene3D: 2.30.29.170, 1.10.238.10; sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 750.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
