https://www.alphaknockout.com

Mouse Erp29 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Erp29 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Erp29 gene (NCBI Reference Sequence: NM_026129; Ensembl: ENSMUSG00000029616) is located on Mouse chromosome 5. Three exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 3 (Transcript: ENSMUST00000130451). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Erp29 gene.

To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-112C22 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 19.21% of the coding region. The knockout of Exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 4911 bp, and the size of intron 2 for 3'-loxP site insertion is 1568 bp. The size of the effective cKO region is ~639 bp. The cKO region does not contain any other known gene.
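The frameshift statement in the note rests on codon arithmetic: deleting a coding segment whose length is not a multiple of 3 shifts the downstream reading frame. The following is a minimal Python sketch of that check, not the vendor's method. The function names are illustrative, and the exon 2 length used in the usage comment (100 bp) is a hypothetical placeholder because this report does not state the exon length. The 262 aa protein length is taken from the transcript table for ENSMUST00000130451.

```python
import math


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if deleting this many coding bases shifts the reading frame.

    A deletion keeps the frame only when its coding length is a multiple of 3.
    """
    if deleted_coding_bp <= 0:
        raise ValueError("deleted_coding_bp must be positive")
    return deleted_coding_bp % 3 != 0


def residues_before(fraction_of_cds: float, protein_length_aa: int) -> int:
    """Approximate count of whole residues encoded upstream of a CDS position.

    fraction_of_cds is the relative position in the coding region (0..1).
    """
    if not 0.0 <= fraction_of_cds <= 1.0:
        raise ValueError("fraction_of_cds must be between 0 and 1")
    return math.floor(fraction_of_cds * protein_length_aa)


# Usage (the exon length is a HYPOTHETICAL value, not from the report):
#   causes_frameshift(100)        -> frame is shifted, since 100 % 3 != 0
#   residues_before(0.1921, 262)  -> residues encoded before exon 2 begins
```

With exon 2 starting at about 19.21% of the coding region, roughly the first 50 of 262 residues would be encoded before the deleted exon, so most of the protein lies downstream of the deletion.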


Overview of the Targeting Strategy

[Figure: schematic with four panels: Wildtype allele (exons 1, 2, 3; 5' gRNA region; 3' gRNA region), Targeting vector, Targeted allele, and Constitutive KO allele (after Cre recombination). Legend: exon of mouse Erp29, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
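The self-alignment described in the note can be sketched as an exact-match word comparison. This is a minimal Python illustration under stated assumptions, not the tool used to produce the figure: it reports exact matches of a fixed-length window (10 bp by default, matching the window size above) on the forward and reverse-complement strands. The function names are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Find exact self-matches of length `window` within one sequence.

    Returns (forward, reverse): lists of 0-based (i, j) start positions.
    forward holds pairs with i < j whose windows are identical;
    reverse holds pairs with i <= j where the window at j is the
    reverse complement of the window at i.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(word), ()) if j >= i)
    return forward, reverse
```

Long diagonal runs of forward pairs away from the main diagonal would indicate direct or tandem repeats. Runs of reverse pairs would indicate inverted repeats.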

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7139bp) | A(25.58% 1826) | C(23.87% 1704) | T(25.82% 1843) | G(24.74% 1766)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
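The percentages in the summary line can be recomputed from the base counts, and the windowed analysis can be sketched in the same way. The Python below is a minimal illustration and not the tool that produced the plot. The function names and the step size of the sliding window are assumptions, and only the base counts and the 300 bp window come from this report.

```python
def base_composition(counts: dict) -> tuple:
    """Return (total length, percentage per base rounded to 2 decimals)."""
    total = sum(counts.values())
    return total, {base: round(100 * n / total, 2) for base, n in counts.items()}


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each sliding window of `window` bases along seq."""
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts as reported in the summary line above.
REPORTED_COUNTS = {"A": 1826, "C": 1704, "T": 1843, "G": 1766}
total, percent = base_composition(REPORTED_COUNTS)
gc_percent = round(100 * (REPORTED_COUNTS["G"] + REPORTED_COUNTS["C"]) / total, 2)
```

The recomputed values agree with the summary (7139 bp in total; A 25.58%, C 23.87%, T 25.82%, G 24.74%) and give an overall GC content of about 48.61%.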


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       121447530  121450529  3000
YourSeq  243    97     1858  3000   89.9%     chr14  +       76275500   76305678   30179
YourSeq  184    91     1659  3000   92.3%     chr11  -       101633986  101662701  28716
YourSeq  180    1505   2653  3000   90.5%     chr11  +       115884879  116253920  369042
YourSeq  174    1504   1894  3000   88.6%     chr11  +       97972218   97972649   432
YourSeq  148    1509   1779  3000   90.2%     chr11  -       116118957  116119296  340
YourSeq  144    1505   1683  3000   95.6%     chr17  +       27079385   27079583   199
YourSeq  142    1505   1682  3000   92.5%     chr12  +       72904130   72904305   176
YourSeq  141    1510   1693  3000   89.9%     chr11  +       98895761   98895941   181
YourSeq  139    1505   1864  3000   85.0%     chr1   -       69324046   69324528   483
YourSeq  138    1505   1668  3000   91.7%     chr4   -       37011871   37012029   159
YourSeq  138    1505   1679  3000   93.7%     chr11  +       87475315   87475492   178
YourSeq  137    1505   1665  3000   93.2%     chr14  -       57531186   57531365   180
YourSeq  136    1504   1661  3000   95.4%     chr12  +       55863822   55863995   174
YourSeq  135    1510   1661  3000   91.7%     chr7   -       19653097   19653240   144
YourSeq  135    1509   1665  3000   93.7%     chr13  -       9391116    9391275    160
YourSeq  135    1510   1665  3000   95.4%     chr12  -       109714369  109714541  173
YourSeq  135    1505   1668  3000   89.2%     chr10  -       128006672  128006818  147
YourSeq  135    1510   1666  3000   95.3%     chr10  -       86467768   86467923   156
YourSeq  135    1505   1668  3000   92.0%     chr10  +       88755919   88756077   159

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the expected 100% match to the target locus on chr5, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       121443891  121446890  3000
YourSeq  202    707    3000  3000   90.7%     chr1   -       59507556   59759890   252335
YourSeq  202    528    873   3000   90.5%     chr6   +       83224054   83224530   477
YourSeq  169    546    859   3000   91.4%     chr2   +       119840476  119841096  621
YourSeq  164    701    960   3000   90.0%     chr5   +       124295651  124324411  28761
YourSeq  151    523    960   3000   88.8%     chrX   -       56315609   56316163   555
YourSeq  150    694    873   3000   92.7%     chr7   +       16359510   16359697   188
YourSeq  148    689    875   3000   90.3%     chr1   +       25054256   25054455   200
YourSeq  146    528    859   3000   86.6%     chr11  -       5853549    5853915    367
YourSeq  145    730    960   3000   91.6%     chr9   -       114444235  114444707  473
YourSeq  144    597    859   3000   91.5%     chr10  -       88238276   88291173   52898
YourSeq  141    560    859   3000   92.3%     chr11  -       88984083   88984390   308
YourSeq  141    698    870   3000   91.4%     chr15  +       19413469   19413653   185
YourSeq  138    708    873   3000   92.6%     chr1   -       152602591  152602947  357
YourSeq  137    569    859   3000   93.7%     chr1   -       175886992  176009742  122751
YourSeq  135    700    880   3000   91.1%     chr15  -       98495389   98495586   198
YourSeq  135    700    859   3000   92.5%     chr11  +       32429791   32429957   167
YourSeq  134    699    867   3000   90.4%     chr6   -       149019736  149420091  400356
YourSeq  134    700    871   3000   91.9%     chr15  +       53740410   53740589   180
YourSeq  133    699    873   3000   91.9%     chr2   -       29433363   29433553   191

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. Apart from the expected 100% match to the target locus on chr5, no significant similarity is found.
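The top hits of the two BLAT searches give the genomic positions of the upstream and downstream 3000 bp sections, so the interval between them can be computed directly. The sketch below is a hedged Python illustration. It assumes that the two sections directly flank the cKO region and that the reported coordinates are 1-based and inclusive, which is consistent with SPAN = END - START + 1 in both tables. The function name is hypothetical.

```python
def flanked_gap(first: tuple, second: tuple) -> tuple:
    """Return (start, end, length) of the interval between two flanks.

    Each flank is a (start, end) pair in 1-based inclusive coordinates on
    the same chromosome. The flanks may be given in either order.
    """
    (low_start, low_end), (high_start, high_end) = sorted([first, second])
    if high_start <= low_end:
        raise ValueError("flanking intervals overlap")
    gap_start = low_end + 1
    gap_end = high_start - 1
    return gap_start, gap_end, gap_end - gap_start + 1


# chr5 coordinates of the 100% identity hits in the two BLAT tables.
# Erp29 is on the reverse strand, so the upstream section has the higher
# genomic coordinates.
UPSTREAM = (121447530, 121450529)
DOWNSTREAM = (121443891, 121446890)
gap_start, gap_end, gap_length = flanked_gap(UPSTREAM, DOWNSTREAM)
```

Under these assumptions the interval is chr5:121446891-121447529, a length of 639 bp, which agrees with the effective cKO region size of ~639 bp given in the strategy note.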


Gene and protein information:

Erp29 endoplasmic reticulum protein 29 [Mus musculus (house mouse)]
Gene ID: 67397, updated on 20-Aug-2019

Gene summary

Official Symbol: Erp29 (provided by MGI)
Official Full Name: endoplasmic reticulum protein 29 (provided by MGI)
Primary source: MGI:MGI:1914647
See related: Ensembl:ENSMUSG00000029616
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Erp28; Erp31; PDI-Db; AW209030; 1200015M03Rik; 2810446M09Rik
Expression: Ubiquitous expression in placenta adult (RPKM 91.8), ovary adult (RPKM 52.5) and 28 other tissues

Genomic context

Location: 5; 5 F

Exon count: 3

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (121444753..121452474, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (121894762..121902483, complement)

[Figure: genomic context of Erp29 on Chromosome 5 - NC_000071.6.]


Transcript information: This gene has 4 transcripts

Gene: Erp29 ENSMUSG00000029616

Description: endoplasmic reticulum protein 29 [Source:MGI Symbol;Acc:MGI:1914647]
Gene Synonyms: 1200015M03Rik, 2810446M09Rik, Erp28, Erp31, PDI-Db
Location: Chromosome 5: 121,428,590-121,452,506 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 4 transcripts (splice variants), 155 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype                  CCDS       UniProt  Flags
Erp29-203  ENSMUST00000130451.1  5204  262aa    ENSMUSP00000117347.1  Protein coding           CCDS19634  P57759   TSL:1 GENCODE basic APPRIS P1
Erp29-201  ENSMUST00000052590.7  1123  55aa     ENSMUSP00000059275.7  Protein coding           -          F8WJI4   TSL:1 GENCODE basic
Erp29-204  ENSMUST00000153758.1  4034  95aa     ENSMUSP00000122522.1  Nonsense mediated decay  -          D6RG87   TSL:1
Erp29-202  ENSMUST00000111802.3  748   57aa     ENSMUSP00000107433.3  Nonsense mediated decay  -          F8WIM7   TSL:2


[Figure: Ensembl genome browser view of a 43.92 kb region (121.42 Mb to 121.46 Mb). Forward strand: Naa25 transcripts (Naa25-201 protein coding; Naa25-203, -204, -205 nonsense mediated decay; Naa25-202 retained intron; Naa25-206 lncRNA) and Tmem116 transcripts (Tmem116-202, -204, -208, -209 protein coding; Tmem116-205, -206 nonsense mediated decay; Tmem116-210 lncRNA). Contig: AC129215.3. Reverse strand: Erp29-204 (nonsense mediated decay), Erp29-203 (protein coding), Erp29-201 (protein coding), Erp29-202 (nonsense mediated decay). Additional track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding) and non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000130451

[Figure: protein domain view of Erp29-203 (protein coding; reverse strand, 11.68 kb), with a scale bar from 0 to 262. Tracks: ENSMUSP00000117..., Transmembrane heli..., Low complexity (Seg), Cleavage site (Sign...), Superfamily, Pfam, PIRSF, PANTHER (PTHR12211:SF1), Gene3D (3.40.30.10), CDD, and sequence variants (dbSNP and all other sources). Annotated domains: ERp29, N-terminal; Endoplasmic reticulum resident protein 29, C-terminal. Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
