https://www.alphaknockout.com

Mouse Cog5 Knockout Project (CRISPR/Cas9)

Objective: To create a Cog5 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cog5 gene (NCBI Reference Sequence: NM_001163126; Ensembl: ENSMUSG00000035933) is located on mouse chromosome 12. Twenty-three exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 22 (Transcript: ENSMUST00000036862). Exons 2~5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 3.82% of the coding region. Exons 2~5 cover 12.99% of the coding region. The effective KO region is ~9669 bp and contains no other known gene.
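The percentages in the note can be turned into an approximate number of deleted coding bases. The sketch below is only a rough estimate: it assumes the coding length is 3 x 829 aa (the Cog5-201 protein length listed in the transcript table, stop codon excluded) and that the reported percentages are rounded. The function names are illustrative, and the exact exon coordinates should be checked against the Ensembl transcript before drawing conclusions about the reading frame.

```python
# Rough estimate only; every input is a rounded figure quoted in this report.
PROTEIN_LENGTH_AA = 829        # Cog5-201 protein length (transcript table)
EXON2_START_FRACTION = 0.0382  # exon 2 starts at about 3.82% of the coding region
TARGET_FRACTION = 0.1299       # exons 2~5 cover 12.99% of the coding region


def estimate_coding_bp(protein_length_aa: int, fraction: float) -> int:
    """Estimate how many coding bases a fraction of the CDS corresponds to.

    Assumption: CDS length = 3 * protein length (stop codon excluded).
    """
    cds_length_bp = protein_length_aa * 3
    return round(cds_length_bp * fraction)


def is_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion whose coding length is not a multiple of 3 shifts the frame."""
    return deleted_coding_bp % 3 != 0


if __name__ == "__main__":
    upstream_bp = estimate_coding_bp(PROTEIN_LENGTH_AA, EXON2_START_FRACTION)
    deleted_bp = estimate_coding_bp(PROTEIN_LENGTH_AA, TARGET_FRACTION)
    print(f"Coding bases before exon 2 (estimate): {upstream_bp}")
    print(f"Coding bases in exons 2~5 (estimate): {deleted_bp}")
    print(f"Estimated deletion shifts the frame: {is_frameshift(deleted_bp)}")
```

Under these assumptions the estimated deletion is not a multiple of 3, which would be consistent with a frameshift after exon 1.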


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2, 3, 4, 5 and 23, with two gRNA regions marked. Legend: exon of mouse Cog5; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
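The self-alignment described in the note can be sketched as an exact-match dot plot with a 15 bp window. This is a minimal illustration, not the tool used to produce the report: the function names are hypothetical, only exact word matches are reported, and a real dot-plot program may tolerate mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def word_index(seq: str, window: int) -> dict:
    """Map every window-length word to the list of positions where it starts."""
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)
    return index


def self_dot_plot(seq: str, window: int = 15) -> list:
    """Forward hits: pairs (i, j), i < j, whose window-length words are identical.

    Off-diagonal hits running parallel to the main diagonal suggest tandem repeats.
    """
    hits = []
    for starts in word_index(seq.upper(), window).values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def reverse_complement_hits(seq: str, window: int = 15) -> list:
    """Hits (i, j): word at i in seq equals word at j in its reverse complement."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    index = word_index(seq, window)
    hits = []
    for j in range(len(rc) - window + 1):
        for i in index.get(rc[j:j + window], []):
            hits.append((i, j))
    return sorted(hits)
```

Calling `self_dot_plot(flank, window=15)` on a 2000 bp flanking sequence (here `flank` is a placeholder for sequence that is not included in this report) returns an empty list when no 15 bp word occurs twice.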

Overview of the Dot Plot (down)

Window size: 15 bp

[Dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[GC content plot of Sequence 12.]

Summary: Full length (2000 bp) | A: 28.25% (565) | C: 19.4% (388) | T: 33.9% (678) | G: 18.45% (369)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[GC content plot of Sequence 12.]

Summary: Full length (2000 bp) | A: 20.05% (401) | C: 25.5% (510) | T: 34.8% (696) | G: 19.65% (393)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
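The overall GC content follows directly from the base counts in the two summaries, and the windowed profile can be computed the same way. The sketch below is illustrative: the function names are assumptions, and the 300 bp window is taken from the figure headings.

```python
def gc_percent_from_counts(a: int, c: int, g: int, t: int) -> float:
    """Overall GC percentage from per-base counts."""
    total = a + c + g + t
    return 100.0 * (g + c) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each window of the given size along the sequence."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values


if __name__ == "__main__":
    # Base counts copied from the two summaries in this report.
    upstream = gc_percent_from_counts(a=565, c=388, g=369, t=678)
    downstream = gc_percent_from_counts(a=401, c=510, g=393, t=696)
    print(f"Upstream GC: {upstream:.2f}%")
    print(f"Downstream GC: {downstream:.2f}%")
```

With the reported counts this gives 37.85% GC for the upstream section and 45.15% GC for the downstream section.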


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  2000   1       2000  2000   100.0%    chr12  +       31658718   31660717   2000
YourSeq  97     1374    1517  2000   93.8%     chr2   -       128994437  128994611  175
YourSeq  94     1412    1518  2000   95.3%     chr17  -       21529744   21529865   122
YourSeq  90     1412    1516  2000   94.3%     chr19  +       47335979   47336275   297
YourSeq  87     1412    1518  2000   94.9%     chr8   -       108979636  108979757  122
YourSeq  86     1412    1516  2000   92.4%     chr12  -       118723100  118723219  120
YourSeq  86     1412    1518  2000   91.6%     chr14  +       50769801   50769922   122
YourSeq  85     1412    1517  2000   93.0%     chr11  -       95949822   95949942   121
YourSeq  84     1412    1518  2000   92.1%     chr12  -       34293154   34293275   122
YourSeq  84     1412    1517  2000   95.7%     chr15  +       94157907   94158027   121
YourSeq  82     1411    1516  2000   92.8%     chr2   -       37836109   37836223   115
YourSeq  81     1191    1517  2000   95.6%     chr4   -       34770630   34771020   391
YourSeq  81     1412    1517  2000   94.7%     chr4   +       134176611  134176722  112
YourSeq  81     1412    1517  2000   96.7%     chr11  +       51798430   51798550   121
YourSeq  80     1412    1517  2000   96.6%     chr9   -       116978226  116978347  122
YourSeq  80     1412    1518  2000   91.8%     chr19  -       5614655    5614776    122
YourSeq  80     1412    1516  2000   91.0%     chr18  -       57987210   57987329   120
YourSeq  80     1412    1516  2000   90.9%     chr13  -       46901613   46901731   119
YourSeq  80     1414    1517  2000   91.7%     chr10  -       120808099  120808219  121
YourSeq  80     1412    1516  2000   90.0%     chr18  +       20016498   20016616   119

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
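BLAT result rows like those above can be screened programmatically. The sketch below parses one whitespace-separated row into a record and lists hits other than the query's own locus that reach a chosen score. The record type, the function names and the `min_score` cutoff are assumptions made for illustration; the report does not state which threshold it treats as significant.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity, e.g. 93.8
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(line: str) -> BlatHit:
    """Parse one result row; a leading 'browser details' link label is ignored."""
    fields = line.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {line!r}")
    query, score, q_start, q_end, q_size, identity, chrom, strand, t_start, t_end, span = fields
    return BlatHit(query, int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


def is_self_hit(hit: BlatHit) -> bool:
    """The query's own locus: full-length score at 100% identity."""
    return hit.score == hit.q_size and hit.identity == 100.0


def off_target_hits(hits: list, min_score: int) -> list:
    """Hits other than the self hit whose score reaches min_score (hypothetical cutoff)."""
    return [h for h in hits if not is_self_hit(h) and h.score >= min_score]


if __name__ == "__main__":
    rows = [
        "YourSeq 2000 1 2000 2000 100.0% chr12 + 31658718 31660717 2000",
        "YourSeq 97 1374 1517 2000 93.8% chr2 - 128994437 128994611 175",
        "YourSeq 94 1412 1518 2000 95.3% chr17 - 21529744 21529865 122",
    ]
    hits = [parse_blat_row(row) for row in rows]
    for hit in off_target_hits(hits, min_score=95):
        print(hit.chrom, hit.strand, hit.t_start, hit.t_end, hit.score)
```

The parser accepts rows with or without the leading "browser details" label, so it works on either the raw export or a cleaned table.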

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  2000   1       2000  2000   100.0%    chr12  +       31670387   31672386   2000
YourSeq  893    467     1998  2000   87.6%     chr5   +       108221667  108223152  1486
YourSeq  443    149     1345  2000   83.3%     chr9   -       49480812   49481935   1124
YourSeq  396    1239    1949  2000   83.2%     chr14  +       31674651   31675342   692
YourSeq  390    57      879   2000   85.9%     chr15  -       98236819   98237675   857
YourSeq  368    770     1932  2000   86.8%     chr12  -       32265541   32266737   1197
YourSeq  362    912     1950  2000   83.9%     chr17  +       24825776   24826772   997
YourSeq  347    165     1221  2000   84.0%     chr3   -       99674784   99675799   1016
YourSeq  337    83      996   2000   84.8%     chr10  +       97853613   97854546   934
YourSeq  319    1093    1922  2000   83.8%     chr18  -       13555074   13555861   788
YourSeq  314    83      918   2000   81.4%     chr9   +       54791579   54792427   849
YourSeq  309    1202    1962  2000   80.1%     chr13  -       29139720   29140252   533
YourSeq  308    1288    1963  2000   81.5%     chr8   -       65200694   65201353   660
YourSeq  298    1317    1952  2000   81.4%     chr8   -       91266268   91266902   635
YourSeq  294    976     1954  2000   83.9%     chr1   -       194843616  194844486  871
YourSeq  294    959     1951  2000   80.5%     chr18  +       7699576    7700529    954
YourSeq  291    183     895   2000   82.5%     chr6   -       147782129  147782815  687
YourSeq  286    316     992   2000   85.9%     chr4   -       14059436   14060500   1065
YourSeq  284    132     658   2000   83.4%     chr4   -       98251927   98252465   539
YourSeq  278    45      724   2000   82.5%     chr1   +       65305672   65306321   650

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.


Gene and information: Cog5 component of oligomeric golgi complex 5 [ Mus musculus (house mouse) ] Gene ID: 238123, updated on 12-Aug-2019

Gene summary

Official Symbol: Cog5 (provided by MGI)
Official Full Name: component of oligomeric golgi complex 5 (provided by MGI)
Primary source: MGI:MGI:2145130
See related: Ensembl:ENSMUSG00000035933
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: GTC90; C87247; D18362; GOLTC1; 5430405C01Rik
Expression: Ubiquitous expression in ovary adult (RPKM 11.7), mammary gland adult (RPKM 11.4) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 12; 12 A2-A3
Exon count: 25

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (31654819..31937630)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (32339734..32622495)

[Figure: genomic context on Chromosome 12 - NC_000078.6]


Transcript information: This gene has 4 transcripts

Gene: Cog5 ENSMUSG00000035933

Description: component of oligomeric golgi complex 5 [Source:MGI Symbol;Acc:MGI:2145130]
Gene Synonyms: 5430405C01Rik, GOLTC1, GTC90
Location: Chromosome 12: 31,654,869-31,937,630 forward strand. GRCm38:CM001005.2
About this gene: This gene has 4 transcripts (splice variants), 215 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Cog5-201  ENSMUST00000036862.4  2997  829aa       ENSMUSP00000044797.4  Protein coding           CCDS49046  Q8C0L8      TSL:1, GENCODE basic, APPRIS P1
Cog5-202  ENSMUST00000218428.1  3610  99aa        ENSMUSP00000151730.1  Protein coding           -          A0A1W2P7M6  CDS 5' incomplete, TSL:3
Cog5-204  ENSMUST00000219837.1  372   21aa        ENSMUSP00000151902.1  Nonsense mediated decay  -          A0A1W2P822  CDS 5' incomplete, TSL:5
Cog5-203  ENSMUST00000219672.1  750   No protein  -                     Retained intron          -          -           TSL:2


[Figure: Ensembl genomic region view, 302.76 kb, 31.7 Mb to 31.9 Mb. Forward strand: Cog5-201 (protein coding), Cog5-202 (protein coding), Cog5-203 (retained intron), Cog5-204 (nonsense mediated decay), Gm48808-201 (processed pseudogene), Gm18029-201 (processed pseudogene), Gm25913-201 (scaRNA). Contigs: AC119950.14, AC125020.7, CT010463.9. Reverse strand: Dus4l-201 to Dus4l-203, Gpr22-201 to Gpr22-204, Gm48809-201 (lncRNA), Hbp1-201 to Hbp1-209. Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, pseudogene, processed transcript).]


Transcript: ENSMUST00000036862

[Figure: Ensembl view of Cog5-201 (protein coding; 282.76 kb, forward strand) and its protein product, with a scale bar from 0 to 829 aa. Tracks: low complexity (Seg); Pfam and PANTHER domain "Conserved oligomeric Golgi complex subunit 5"; sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
