https://www.alphaknockout.com

Mouse Emc10 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Emc10 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Emc10 gene (NCBI Reference Sequence: NM_197991; Ensembl: ENSMUSG00000008140) is located on mouse chromosome 7. Seven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000118808). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Emc10 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-406H21 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Male mice homozygous for a gene trapped allele display improved glucose tolerance and reduced fertility, while female homozygotes exhibit an increased anxiety-related response.

Exon 2 starts at about 18.03% of the way into the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 1375 bp, and the size of intron 2 for 3'-loxP site insertion is 1508 bp. The size of the effective cKO region is ~573 bp. The cKO region does not contain any other known gene.
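The frameshift statement rests on reading-frame arithmetic: removing a stretch of coding sequence whose length is not a multiple of three shifts every downstream codon. A minimal Python sketch of that check is given here. The function name and the example lengths are hypothetical, since the coding length of exon 2 is not stated in this report.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True when deleting this many coding bases shifts the downstream frame."""
    if deleted_coding_bp < 0:
        raise ValueError("length must be non-negative")
    # Codons are read in triplets, so only multiples of 3 keep the frame intact.
    return deleted_coding_bp % 3 != 0


# Hypothetical lengths for illustration only (NOT the real size of Emc10 exon 2).
for length in (120, 121, 122):
    print(length, causes_frameshift(length))
```

An in-frame deletion would remove amino acids without disturbing the rest of the protein, which is why the exon chosen for a cKO region is usually one whose removal breaks the frame.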


Overview of the Targeting Strategy

[Figure: schematic of four alleles. Wildtype allele, with the 5' and 3' gRNA regions marked; Targeting vector; Targeted allele; Constitutive KO allele (after Cre recombination).]

Legends: Homology arm; Exon of mouse Fam71e1; Exon of mouse Emc10; cKO region; loxP site
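The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model: Cre recombination between two loxP sites in the same orientation removes the intervening segment and leaves a single loxP site behind. The sketch below is a simplification under stated assumptions. The allele is a made-up string with placeholder labels (EXON1, EXON2, EXON3), not the real Emc10 sequence, and the helper name `cre_excise` is hypothetical.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat + 8 bp spacer + 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Remove the segment between the first two same-orientation loxP sites.

    One loxP site is kept, as after Cre-mediated excision. Alleles with fewer
    than two sites are returned unchanged.
    """
    first = allele.find(loxp)
    second = allele.find(loxp, first + len(loxp)) if first != -1 else -1
    if second == -1:
        return allele
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Toy targeted allele (placeholder labels, assumption for illustration only).
targeted = "EXON1" + "aaa" + LOXP + "ccEXON2gg" + LOXP + "tttEXON3"
knockout = cre_excise(targeted)
print("EXON2" in targeted, "EXON2" in knockout)
```

The model ignores site orientation and any residual vector sequence; it only shows why the floxed exon is absent from the allele after recombination.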


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
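A self dot plot of this kind can be sketched in a few lines of Python: every window-length word in the sequence is indexed, and any word that occurs at two positions (or whose reverse complement occurs elsewhere) becomes a dot. This is a minimal exact-match sketch, not the tool used to draw the figure. The function names are hypothetical and the input is a toy sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) start positions.

    forward: identical words at i < j (off-diagonal dots; repeats show up here).
    reverse: word at i equals the reverse complement of the word at j, i <= j.
    """
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for word, starts in index.items():
        forward.extend((i, j) for i in starts for j in starts if i < j)
        rc_starts = index.get(reverse_complement(word), [])
        reverse.extend((i, j) for i in starts for j in rc_starts if i <= j)
    return sorted(forward), sorted(reverse)


# Toy input: an 8 bp unit repeated in tandem, scanned with an 8 bp window.
forward_hits, reverse_hits = self_dot_plot("ACGTTGCAACGTTGCA", window=8)
print(forward_hits)
```

A tandem repeat appears as a run of forward dots parallel to the main diagonal; an empty or sparse forward list corresponds to the "no significant tandem repeat" reading of the plot.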

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length 7073 bp | A 22.54% (1594) | C 24.43% (1728) | T 23.58% (1668) | G 29.45% (2083)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so this targeting vector may be difficult to construct.
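The base counts in the summary line are enough to recover the overall GC content, and the windowed profile can be sketched with a simple sliding window. The Python below is a minimal sketch: the counts come from the summary above, while the function names, the `step` parameter and the toy sequence in the test are assumptions made for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_windows(seq: str, window: int = 300, step: int = 1):
    """GC fraction for each full window, as (start, fraction) pairs."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the summary line of this report.
counts = {"A": 1594, "C": 1728, "T": 1668, "G": 2083}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(f"{total} bp, overall GC = {overall_gc:.2%}")
```

The overall figure hides local variation, which is why the report looks at 300 bp windows: a vector can have a moderate average GC content and still contain short stretches that are hard to amplify or sequence.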


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   -       44495131   44498130   3000
YourSeq  72     2097   2207  3000   86.2%     chr4   +       115135186  115135299  114
YourSeq  67     2128   2207  3000   93.6%     chr12  -       84648113   84648194   82
YourSeq  66     2126   2202  3000   94.6%     chr11  -       87096195   87096273   79
YourSeq  65     2098   2217  3000   94.6%     chr11  -       55014356   55250361   236006
YourSeq  65     2093   2207  3000   85.8%     chr1   -       10753140   10753256   117
YourSeq  65     2097   2207  3000   87.4%     chr1   +       126672816  126672928  113
YourSeq  63     2128   2207  3000   89.9%     chr8   -       120986717  120986798  82
YourSeq  63     2101   2207  3000   88.0%     chr11  -       24554255   24554363   109
YourSeq  63     2065   2207  3000   91.1%     chr2   +       32931380   32931818   439
YourSeq  63     2128   2207  3000   91.0%     chr12  +       72919644   72919725   82
YourSeq  62     2128   2447  3000   98.5%     chr11  +       59217564   59217966   403
YourSeq  61     2129   2217  3000   91.8%     chrX   -       22433918   22555672   121755
YourSeq  61     2124   2201  3000   90.7%     chr10  -       89915384   89915462   79
YourSeq  59     2101   2206  3000   84.8%     chrX   +       92758617   92758724   108
YourSeq  59     2129   2222  3000   87.5%     chr10  +       59719639   59719731   93
YourSeq  58     2126   2203  3000   92.7%     chr10  -       105372836  105372915  80
YourSeq  56     2133   2203  3000   91.2%     chr7   -       45427447   45427519   73
YourSeq  56     2101   2207  3000   84.2%     chr11  -       54912375   54912483   109
YourSeq  55     2128   2207  3000   96.7%     chr7   -       25425817   25425898   82

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the query's own locus on chr7, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   -       44491558   44494557   3000
YourSeq  108    84     286   3000   87.9%     chrX   +       37924551   37924757   207
YourSeq  106    42     299   3000   84.0%     chr13  +       23528536   23528775   240
YourSeq  103    110    284   3000   91.2%     chr11  -       103432996  103433216  221
YourSeq  102    52     294   3000   92.6%     chr6   -       29293714   29293972   259
YourSeq  97     81     331   3000   81.2%     chr2   -       163527639  163527846  208
YourSeq  92     112    292   3000   89.8%     chr10  -       23879579   23879769   191
YourSeq  85     215    335   3000   85.8%     chr5   -       74131793   74131921   129
YourSeq  84     499    835   3000   79.2%     chr7   +       117535095  117535318  224
YourSeq  75     217    335   3000   94.2%     chrX   -       103156032  103156156  125
YourSeq  75     87     291   3000   90.5%     chr1   +       96579589   96579797   209
YourSeq  71     499    840   3000   91.9%     chr19  -       57285428   57285927   500
YourSeq  71     83     324   3000   90.2%     chr3   +       88429700   88429939   240
YourSeq  70     149    295   3000   88.9%     chr10  -       40272068   40272220   153
YourSeq  66     217    287   3000   97.2%     chr4   +       14617622   14617700   79
YourSeq  64     219    293   3000   93.4%     chr4   -       3640452    3640534    83
YourSeq  62     217    287   3000   94.4%     chr12  -       105577278  105577356  79
YourSeq  62     217    285   3000   95.7%     chrX   +       74461461   74461537   77
YourSeq  58     796    887   3000   79.7%     chr10  +       64583763   64583836   74
YourSeq  57     503    886   3000   95.4%     chr9   +       62703676   62704113   438

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. Apart from the query's own locus on chr7, no significant similarity is found.
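The two self-hits on chr7 also give a quick consistency check on the design. Treating the reported START and END as 1-based inclusive coordinates (an assumption, though it matches the SPAN of 3000 on both rows), the gap between the downstream and upstream query sections should equal the size of the cKO region. The variable and function names in this Python sketch are hypothetical.

```python
# chr7 coordinates of the 100% identity hits from the two BLAT searches (minus strand).
UPSTREAM = (44495131, 44498130)    # 3000 bp section upstream of the cKO region
DOWNSTREAM = (44491558, 44494557)  # 3000 bp section downstream of the cKO region


def length(interval):
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(lower, upper):
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


# The gene is on the reverse strand, so "upstream" has the higher coordinates.
print(length(UPSTREAM), length(DOWNSTREAM), gap_between(DOWNSTREAM, UPSTREAM))
```

The gap works out to 573 bp, in line with the ~573 bp effective cKO region quoted in the strategy summary.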


Gene and information:

Emc10 ER membrane protein complex subunit 10 [ Mus musculus (house mouse) ]
Gene ID: 69683, updated on 24-Oct-2019

Gene summary

Official Symbol: Emc10 (provided by MGI)
Official Full Name: ER membrane protein complex subunit 10 (provided by MGI)
Primary source: MGI:MGI:1916933
See related: Ensembl:ENSMUSG00000008140
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Inm02; Mirta22; 2310044H10Rik; 5430410O10Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 166.5), ovary adult (RPKM 122.4) and 28 other tissues

Genomic context

Location: 7; 7 B3

Exon count: 8

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   7    NC_000073.6 (44489937..44496524, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     7    NC_000073.5 (51745308..51751883, complement)

Chromosome 7 - NC_000073.6


Transcript information: This gene has 7 transcripts

Gene: Emc10 ENSMUSG00000008140

Description: ER membrane protein complex subunit 10 [Source:MGI Symbol;Acc:MGI:1916933]
Gene Synonyms: 2310044H10Rik, 5430410O10Rik, Mirta22
Location: Chromosome 7: 44,489,937-44,496,529 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 7 transcripts (splice variants), 167 orthologues, is a member of 1 Ensembl protein family and is associated with 23 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt             Flags
Emc10-207  ENSMUST00000239015.1  1855  268aa       ENSMUSP00000159045.1  Protein coding   CCDS52231  -                   GENCODE basic, APPRIS P2
Emc10-202  ENSMUST00000118808.8  1872  258aa       ENSMUSP00000113509.3  Protein coding   -          A0A0X1KG67, Q3TAS6  TSL:1, GENCODE basic, APPRIS ALT2
Emc10-201  ENSMUST00000118515.8  1662  254aa       ENSMUSP00000113141.3  Protein coding   -          A0A0X1KG66          TSL:1, GENCODE basic, APPRIS ALT2
Emc10-204  ENSMUST00000138328.2  701   214aa       ENSMUSP00000116293.3  Protein coding   -          D3Z665              CDS 3' incomplete, TSL:5
Emc10-206  ENSMUST00000150342.1  992   No protein  -                     Retained intron  -          -                   TSL:2
Emc10-203  ENSMUST00000123928.1  749   No protein  -                     Retained intron  -          -                   TSL:2
Emc10-205  ENSMUST00000147102.1  429   No protein  -                     Retained intron  -          -                   TSL:3


[Figure: Ensembl genome browser view of a 26.59 kb region spanning 44.48-44.50 Mb. Forward strand genes (Comprehensive set): 5430431A17Rik (lncRNA transcripts 201 and 202), Fam71e1 (transcripts 201 to 209, including protein coding, retained intron, nonsense mediated decay and lncRNA) and Gm44646 (201 lncRNA, 202 TEC). Contig: AC149607.6. Reverse strand genes (Comprehensive set): Emc10 (201, 202, 204 and 207 protein coding; 203, 205 and 206 retained intron) and Mybpc2 (201 and 202 protein coding). A Regulatory Build track is shown below the genes.]

Regulation Legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site

Gene Legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript)


Transcript: ENSMUST00000118808

[Figure: Emc10-202 (protein coding), reverse strand, 6.59 kb. Protein tracks for ENSMUSP00000113...: Transmembrane heli...; Low complexity (Seg); Cleavage site (Sign...; PANTHER PTHR21397 (ER membrane protein complex subunit 10); sequence variants (dbSNP and all other sources). Scale bar: 0 to 258.]

Variant Legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
