https://www.alphaknockout.com

Mouse Ergic3 Knockout Project (CRISPR/Cas9)

Objective: To create an Ergic3 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ergic3 gene (NCBI Reference Sequence: NM_025516; Ensembl: ENSMUSG00000005881) is located on mouse chromosome 2. Thirteen exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 13 (Transcript: ENSMUST00000006035). Exons 5 to 10 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 32.03% of the coding region. Exons 5 to 10 cover 44.56% of the coding region. The size of the effective KO region is ~6514 bp. The KO region does not contain any other known gene.
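The percentages in the note can be turned into approximate nucleotide counts. The sketch below is illustrative only. It assumes the coding region length is 383 codons x 3 nt (the 383 aa protein length listed for ENSMUST00000006035 in the transcript table, stop codon excluded). That length is an assumption and is not stated in this report, and the helper name `percent_to_bp` is hypothetical.

```python
# Assumption: coding length = protein length (383 aa) x 3 nt, stop codon excluded.
PROTEIN_LENGTH_AA = 383
CDS_LENGTH_BP = PROTEIN_LENGTH_AA * 3


def percent_to_bp(percent, total_bp):
    """Convert a percentage of the coding region into a rounded bp count."""
    return round(percent / 100 * total_bp)


# Percentages quoted in the strategy note.
exon5_offset_bp = percent_to_bp(32.03, CDS_LENGTH_BP)
ko_coding_bp = percent_to_bp(44.56, CDS_LENGTH_BP)

print(f"Assumed coding length: {CDS_LENGTH_BP} bp")
print(f"Coding sequence upstream of exon 5: about {exon5_offset_bp} bp")
print(f"Coding sequence within exons 5 to 10: about {ko_coding_bp} bp")
```

The results are estimates derived from rounded percentages, so they should be checked against the annotated exon coordinates before use.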


Overview of the Targeting Strategy

Figure: Wildtype allele with exons 1, 5, 6, 7, 8, 9, 10 and 13 labeled, and the 5' and 3' gRNA regions marked. Legend: exon of mouse Ergic3; knockout region.


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: dot plot. Legend: Forward, Reverse Complement. Axis label: Sequence 12.

Note: The 1460 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: dot plot. Legend: Forward, Reverse Complement. Axis label: Sequence 12.

Note: The 626 bp section downstream of Exon 10 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
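The self-alignment idea behind the dot plots can be sketched in a few lines. This is a minimal illustration and not the tool used to generate the plots in this report. It checks exact forward-strand matches only (the report's plots also show reverse-complement matches), and the function name `self_dot_matches` and the toy sequence are hypothetical.

```python
from collections import defaultdict


def self_dot_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the windows starting at
    i and j are identical. Each pair is an off-diagonal dot in a
    self-comparison dot plot; runs of pairs with a constant offset
    (j - i) suggest a repeated segment."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy sequence (not from the Ergic3 locus): "AAAACCCC" recurs 12 nt downstream.
toy = "AAAACCCCGGGGAAAACCCC"
for i, j in self_dot_matches(toy, window=4):
    print(f"window at {i} repeats at {j} (offset {j - i})")
```

A sequence without repeats returns an empty list, which corresponds to a dot plot showing only the main diagonal.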


Overview of the GC Content Distribution (up)

Window size: 300 bp

Figure: GC content distribution plot. Axis label: Sequence 12.

Summary: Full Length(1460bp) | A(23.36% 341) | C(20.82% 304) | T(31.16% 455) | G(24.66% 360)

Note: The 1460 bp section upstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

Figure: GC content distribution plot. Axis label: Sequence 12.

Summary: Full Length(626bp) | A(20.61% 129) | C(27.48% 172) | T(24.44% 153) | G(27.48% 172)

Note: The 626 bp section downstream of Exon 10 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
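The overall GC content of each flanking region follows directly from the base counts in the two summaries. The sketch below computes it and also shows one way a sliding-window profile with a 300 bp window could be calculated. The function names are hypothetical, and the windowing scheme is an assumption rather than the method used for the plots in this report.

```python
# Base counts copied from the two summary lines.
UPSTREAM_COUNTS = {"A": 341, "C": 304, "T": 455, "G": 360}    # 1460 bp
DOWNSTREAM_COUNTS = {"A": 129, "C": 172, "T": 153, "G": 172}  # 626 bp


def gc_percent(counts):
    """Overall GC content (%) from a dict of base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300, step=1):
    """GC content (%) of every full window of length `window` along seq."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(f"Upstream GC content: {gc_percent(UPSTREAM_COUNTS):.2f}%")
print(f"Downstream GC content: {gc_percent(DOWNSTREAM_COUNTS):.2f}%")
```

With these counts the upstream region is about 45.5% GC and the downstream region about 55.0% GC. `windowed_gc` needs the actual sequence, which is not included in this report.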


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1460   1      1460  1460   100.0%    chr2   +       156008981  156010440  1460
YourSeq  220    794    1067  1460   92.9%     chr10  +       61403327   61403890   564
YourSeq  215    806    1068  1460   93.2%     chr1   -       39523142   39523508   367
YourSeq  211    825    1067  1460   95.7%     chr15  +       96066260   96596320   530061
YourSeq  205    791    1052  1460   92.8%     chr5   -       126977348  126977675  328
YourSeq  205    792    1056  1460   93.6%     chr11  +       85262094   85262448   355
YourSeq  205    815    1067  1460   93.7%     chr1   +       180748188  180748709  522
YourSeq  204    815    1068  1460   93.3%     chr16  -       17194907   17195253   347
YourSeq  202    825    1068  1460   94.3%     chr13  +       100736556  100820996  84441
YourSeq  198    825    1067  1460   93.5%     chr11  -       60824663   60824957   295
YourSeq  198    825    1068  1460   93.1%     chr15  +       102173166  102173671  506
YourSeq  197    799    1067  1460   93.5%     chr13  +       17811134   17811679   546
YourSeq  196    825    1067  1460   92.7%     chr11  -       101588254  101588589  336
YourSeq  194    825    1066  1460   92.3%     chr14  +       49205100   49205574   475
YourSeq  193    825    1067  1460   93.3%     chr16  -       22056591   22056927   337
YourSeq  189    792    1067  1460   92.0%     chr11  -       95917783   95918095   313
YourSeq  188    769    1068  1460   94.0%     chr17  -       29037871   29529425   491555
YourSeq  181    803    1189  1460   91.8%     chr9   -       110513487  110513909  423
YourSeq  176    825    1046  1460   94.0%     chr11  -       52288513   52289043   531
YourSeq  171    825    1061  1460   90.6%     chr11  -       80380265   80380576   312

Note: The 1460 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
--------------------------------------------------------------------------------------
YourSeq  626    1      626  626    100.0%    chr2   +       156016955  156017580  626
YourSeq  24     470    499  626    76.0%     chr16  -       31719625   31719649   25
YourSeq  23     584    606  626    100.0%    chr2   -       171881849  171881871  23
YourSeq  22     159    181  626    100.0%    chr14  -       58206009   58206034   26
YourSeq  21     467    488  626    100.0%    chr14  +       24096559   24096581   23
YourSeq  21     200    220  626    100.0%    chr11  +       58335065   58335085   21
YourSeq  20     368    387  626    100.0%    chr12  -       71353446   71353465   20
YourSeq  20     46     65   626    100.0%    chr11  -       71414610   71414629   20

Note: The 626 bp section downstream of Exon 10 is BLAT searched against the genome. No significant similarity is found.
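The top hit in each BLAT table places the two flanking sections on chr2, which allows a quick consistency check against the stated KO region size of ~6514 bp. The sketch below assumes the reported coordinates are 1-based and inclusive, and that the effective KO region is the interval strictly between the end of the upstream section and the start of the downstream section. Both are assumptions and are not stated in this report.

```python
# chr2 coordinates of the 100% identity hits in the two BLAT tables.
UPSTREAM_START, UPSTREAM_END = 156008981, 156010440
DOWNSTREAM_START, DOWNSTREAM_END = 156016955, 156017580


def inclusive_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


upstream_len = inclusive_length(UPSTREAM_START, UPSTREAM_END)
downstream_len = inclusive_length(DOWNSTREAM_START, DOWNSTREAM_END)

# Assumed KO interval: everything between the two flanking sections.
ko_region_len = inclusive_length(UPSTREAM_END + 1, DOWNSTREAM_START - 1)

print(f"Upstream section: {upstream_len} bp")
print(f"Downstream section: {downstream_len} bp")
print(f"Interval between the sections: {ko_region_len} bp")
```

Under these assumptions the flank lengths match the 1460 bp and 626 bp sections described in the notes, and the interval between them matches the ~6514 bp effective KO region given in the strategy summary.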


Gene and information:

Ergic3 ERGIC and golgi 3 [Mus musculus (house mouse)]
Gene ID: 66366, updated on 24-Oct-2019

Gene summary

Official Symbol: Ergic3 (provided by MGI)
Official Full Name: ERGIC and golgi 3 (provided by MGI)
Primary source: MGI:MGI:1913616
See related: Ensembl:ENSMUSG00000005881
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CGI-54; D2Ucla1; AV318804; NY-BR-84; Sdbcag84; 2310015B14Rik
Expression: Ubiquitous expression in testis adult (RPKM 128.8), ovary adult (RPKM 102.0) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 2 H1; 2 77.32 cM
Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (156008028..156018279)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (155833861..155844015)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 10 transcripts

Gene: Ergic3 ENSMUSG00000005881

Description: ERGIC and golgi 3 [Source:MGI Symbol;Acc:MGI:1913616]
Gene Synonyms: 2310015B14Rik, CGI-54, D2Ucla1, NY-BR-84, Sdbcag84
Location: Chromosome 2: 156,008,045-156,018,279 forward strand. GRCm38:CM000995.2
About this gene: This gene has 10 transcripts (splice variants), 204 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Ergic3-201  ENSMUST00000006035.12  1377  383aa       ENSMUSP00000006035.6  Protein coding   CCDS16959  Q9CQE7   TSL:1, GENCODE basic, APPRIS P1
Ergic3-202  ENSMUST00000088650.10  1349  394aa       ENSMUSP00000086025.4  Protein coding   -          Q9CQE7   TSL:1, GENCODE basic
Ergic3-210  ENSMUST00000155370.1   751   235aa       ENSMUSP00000119051.1  Protein coding   -          F6RK81   CDS 5' incomplete, TSL:3
Ergic3-205  ENSMUST00000142859.7   739   246aa       ENSMUSP00000115912.1  Protein coding   -          F6UIS1   CDS 5' and 3' incomplete, TSL:5
Ergic3-204  ENSMUST00000137545.1   654   No protein  -                     Retained intron  -          -        TSL:3
Ergic3-206  ENSMUST00000144707.1   649   No protein  -                     Retained intron  -          -        TSL:2
Ergic3-209  ENSMUST00000152568.1   488   No protein  -                     Retained intron  -          -        TSL:2
Ergic3-207  ENSMUST00000149381.1   412   No protein  -                     Retained intron  -          -        TSL:2
Ergic3-208  ENSMUST00000150970.7   463   No protein  -                     lncRNA           -          -        TSL:3
Ergic3-203  ENSMUST00000130793.7   382   No protein  -                     lncRNA           -          -        TSL:5


Figure: Ensembl genomic view of a 30.23 kb region (156.00 Mb to 156.02 Mb), genes from the comprehensive set.
Forward strand: Cep250-201, Cep250-202 and Cep250-204 (protein coding), Cep250-206 (retained intron); Ergic3-201, Ergic3-202, Ergic3-205 and Ergic3-210 (protein coding), Ergic3-204, Ergic3-206, Ergic3-207 and Ergic3-209 (retained intron), Ergic3-203 and Ergic3-208 (lncRNA).
Contigs: AL833786.8
Reverse strand: 6430550D23Rik-201, -202, -205, -206, -207, -208 and -209 (protein coding), 6430550D23Rik-203 (nonsense mediated decay), 6430550D23Rik-204 (retained intron); Fer1l4-201 (protein coding), Fer1l4-204 and Fer1l4-206 (retained intron).
Regulatory Build legend: CTCF; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (processed transcript; RNA gene).


Transcript: ENSMUST00000006035

Figure: Ensembl view of Ergic3-201 (protein coding; 10.23 kb, forward strand) with the features of its protein product (ENSMUSP00000006...).
Tracks: Transmembrane helices; Low complexity (Seg); Pfam: Endoplasmic reticulum vesicle transporter, N-terminal and Endoplasmic reticulum vesicle transporter, C-terminal; PANTHER: PTHR10984 and PTHR10984:SF25; Sequence variants (dbSNP and all other sources).
Variant legend: missense variant; synonymous variant.
Scale bar: 0 to 383.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
