https://www.alphaknockout.com

Mouse Cox6a1 Knockout Project (CRISPR/Cas9)

Objective: To create a Cox6a1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cox6a1 gene (NCBI Reference Sequence: NM_007748; Ensembl: ENSMUSG00000041697) is located on mouse chromosome 5. Three exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 3 (Transcript: ENSMUST00000040154). Exons 1~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygous null mice exhibit impaired coordination, thinned sciatic nerves, neurogenic muscular changes and delayed motor nerve conduction velocity.

Exon 1 starts from about 0.3% of the coding region. Exons 1~3 cover 100.0% of the coding region. The size of the effective KO region is ~3103 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2 and 3 of mouse Cox6a1 and two gRNA regions. Legend: exon of mouse Cox6a1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self dot plot of the upstream sequence. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
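The self-alignment described in this note can be sketched as follows. This is a minimal illustration, not the tool used to produce the plot: the function name `self_dot_plot` and the toy sequence are hypothetical, and only forward-strand matches are reported (the reverse-complement comparison shown in the figure legend is omitted). A tandem repeat shows up as matches at a constant offset equal to a multiple of the repeat unit length.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j where the window-length
    words starting at positions i and j are identical (forward strand only)."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        # Every pair of start positions sharing the same word is one dot
        # off the main diagonal of the plot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Hypothetical toy sequence: a 20 bp unit repeated three times in tandem.
unit = "ACGTTGCAAGCTTAGGCTAA"
toy = "TTTTCCCC" + unit * 3 + "GGGGAAAA"

hits = self_dot_plot(toy, window=15)
offsets = sorted({j - i for i, j in hits})
print(offsets)  # offsets of 20 and 40 correspond to the tandem copies
```

A gRNA site would then be chosen in a stretch of sequence that takes part in no such off-diagonal matches.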

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self dot plot of the downstream sequence. Legend: forward; reverse complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(25.65% 513) | C(24.35% 487) | T(26.4% 528) | G(23.6% 472)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
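A windowed GC profile of the kind plotted here can be sketched as below. The function name `gc_windows`, the `step` parameter and the toy sequence are assumptions made for illustration; the report states only the 300 bp window size, not how the window is advanced.

```python
def gc_windows(seq, window=300, step=1):
    """Return the GC fraction of each window of the given size along seq."""
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        fractions.append(gc / window)
    return fractions


# Hypothetical 1200 bp toy sequence: an AT-only half followed by a GC-only half.
toy = "AT" * 300 + "GC" * 300
profile = gc_windows(toy, window=300, step=300)
print(profile)  # [0.0, 0.0, 1.0, 1.0]
```

A region would be flagged as GC-rich when a run of windows exceeds some chosen threshold; the report does not state the threshold it uses.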

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(26.85% 537) | C(23.3% 466) | T(25.1% 502) | G(24.75% 495)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
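As a quick check on the two summaries, the overall GC content of each 2000 bp section follows directly from the base counts given above. The helper name `gc_percent` is hypothetical; the counts are copied from the summary lines.

```python
def gc_percent(a, c, t, g):
    """Overall GC content in percent from the four base counts."""
    total = a + c + t + g
    return 100.0 * (g + c) / total


# Base counts from the upstream and downstream summary lines.
upstream = gc_percent(a=513, c=487, t=528, g=472)
downstream = gc_percent(a=537, c=466, t=502, g=495)

print(f"upstream GC:   {upstream:.2f}%")    # 47.95%
print(f"downstream GC: {downstream:.2f}%")  # 48.05%
```

Both sections sit close to 48% GC overall, which agrees with the notes that no significant high GC-content region is present.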


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   -       115348951  115350950  2000
YourSeq  481    159    1038  2000   93.3%     chr16  +       56059841   56060687   847
YourSeq  437    197    1047  2000   93.5%     chrX   +       154415912  154416799  888
YourSeq  419    173    1047  2000   92.6%     chr10  +       63311988   63312721   734
YourSeq  403    173    1047  2000   91.4%     chr14  -       58188932   58189633   702
YourSeq  400    159    1045  2000   92.1%     chr4   -       62334393   62335118   726
YourSeq  390    241    1047  2000   95.8%     chr7   +       5417066    5417941    876
YourSeq  383    155    1047  2000   93.3%     chr14  -       11233957   11234769   813
YourSeq  380    655    1049  2000   98.3%     chr14  -       32416645   32417041   397
YourSeq  380    655    1047  2000   98.5%     chr9   +       91792716   91793110   395
YourSeq  378    366    1047  2000   96.4%     chr2   -       101050983  101051826  844
YourSeq  378    655    1047  2000   98.3%     chr15  -       11632181   11632575   395
YourSeq  375    652    1047  2000   97.5%     chr6   +       143014339  143014736  398
YourSeq  374    655    1047  2000   97.8%     chr12  -       48374756   48375150   395
YourSeq  374    655    1047  2000   97.8%     chr12  -       48376244   48376638   395
YourSeq  373    653    1048  2000   97.3%     chr6   +       143015831  143016228  398
YourSeq  373    655    1057  2000   96.8%     chr4   +       111103736  111104145  410
YourSeq  373    655    1048  2000   97.5%     chr17  +       19830487   19830882   396
YourSeq  372    655    1047  2000   97.5%     chr1   -       78301235   78301629   395
YourSeq  372    655    1047  2000   97.5%     chr17  +       19828995   19829389   395

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.
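One way to read the BLAT rows is to compare how much of the 2000 bp query each hit spans. The sketch below uses the first three upstream rows from the table; the `BlatHit` type and `query_coverage` helper are hypothetical names, and the query span (END - START + 1) is only an upper bound on the aligned bases because BLAT hits can contain gaps.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity as reported by BLAT
    chrom: str


def query_coverage(hit):
    """Percent of the query spanned by the hit (gaps are not subtracted)."""
    return 100.0 * (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream BLAT table.
hits = [
    BlatHit(2000, 1, 2000, 2000, 100.0, "chr5"),
    BlatHit(481, 159, 1038, 2000, 93.3, "chr16"),
    BlatHit(437, 197, 1047, 2000, 93.5, "chrX"),
]

for hit in hits:
    print(f"{hit.chrom}: score {hit.score}, identity {hit.identity}%, "
          f"query span {query_coverage(hit):.2f}%")
```

The chr5 row is the query matching its own locus; the remaining rows span part of the query only.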

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   -       115343846  115345845  2000
YourSeq  176    1      192   2000   93.7%     chr19  +       3206071    3206259    189
YourSeq  89     599    752   2000   89.4%     chr7   -       87478348   87478501   154
YourSeq  81     634    755   2000   86.0%     chr10  -       69333001   69333126   126
YourSeq  79     1273   1584  2000   91.6%     chr2   +       128038855  128039197  343
YourSeq  76     634    752   2000   84.2%     chr6   +       120026817  120026934  118
YourSeq  74     634    749   2000   86.9%     chr1   +       86149353   86149468   116
YourSeq  73     1435   1581  2000   89.6%     chr2   -       131323316  131323478  163
YourSeq  69     596    749   2000   91.7%     chr13  -       48242316   48242569   254
YourSeq  63     625    741   2000   94.6%     chr13  +       7617919    7618035    117
YourSeq  60     1270   1577  2000   85.8%     chr16  +       54798191   54798495   305
YourSeq  59     634    757   2000   94.1%     chr16  -       21715392   21715530   139
YourSeq  58     1342   1587  2000   91.5%     chr8   -       89451810   89452077   268
YourSeq  58     634    724   2000   91.6%     chr11  -       105304246  105304661  416
YourSeq  58     633    718   2000   91.7%     chr11  -       43331345   43331441   97
YourSeq  57     1433   1580  2000   89.1%     chr9   +       64954412   64954577   166
YourSeq  57     690    757   2000   94.0%     chr5   +       52661056   52661127   72
YourSeq  57     634    723   2000   95.4%     chr1   +       74955126   74955222   97
YourSeq  56     637    741   2000   91.2%     chr11  +       116144925  116145158  234
YourSeq  54     651    741   2000   88.8%     chr13  -       43382326   43382423   98

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
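The two 100% chr5 self-hits also locate the flanking sections on the genome, so the interval between them can be worked out by simple coordinate arithmetic. The sketch below assumes 1-based inclusive coordinates as printed in the BLAT tables; the variable and function names are hypothetical. Because the gene lies on the reverse strand, the upstream section has the higher coordinates.

```python
def span(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


# chr5 self-hits from the two BLAT tables (1-based, inclusive).
upstream_flank = (115348951, 115350950)    # 2000 bp upstream of the start codon
downstream_flank = (115343846, 115345845)  # 2000 bp downstream of the stop codon

# Interval lying between the two flanks, from start codon through stop codon.
between = (downstream_flank[1] + 1, upstream_flank[0] - 1)

print(span(*upstream_flank))    # 2000
print(span(*downstream_flank))  # 2000
print(between, span(*between))  # (115345846, 115348950) 3105
```

The 3105 bp between the flanks is on the same scale as the ~3103 bp effective KO region quoted in the strategy summary; the report does not state exactly how that figure is delimited, so the two numbers are not expected to match base for base.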


Gene and information: Cox6a1 cytochrome c oxidase subunit 6A1 [ Mus musculus (house mouse) ] Gene ID: 12861, updated on 12-Aug-2019

Gene summary

Official Symbol: Cox6a1 (provided by MGI)
Official Full Name: cytochrome c oxidase subunit 6A1 (provided by MGI)
Primary source: MGI:MGI:103099
See related: Ensembl:ENSMUSG00000041697
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: VIaL
Expression: Ubiquitous expression in duodenum adult (RPKM 2589.4), colon adult (RPKM 2186.7) and 28 other tissues
Orthologs: human

Genomic context

Location: 5 F; 5 56.06 cM
Exon count: 3

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (115345623..115348958, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (115795663..115798964, complement)

[Figure: genomic context of Cox6a1 on chromosome 5 (NC_000071.6).]


Transcript information: This gene has 2 transcripts

Gene: Cox6a1 ENSMUSG00000041697

Description: cytochrome c oxidase subunit 6A1 [Source:MGI Symbol;Acc:MGI:103099]
Gene Synonyms: VIaL, subunit VIaL (liver-type)
Location: Chromosome 5: 115,345,642-115,348,981 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 2 transcripts (splice variants), 225 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 7 phenotypes.

Transcripts

Name        Transcript ID         bp   Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Cox6a1-201  ENSMUST00000040154.8  571  112aa       ENSMUSP00000047661.8  Protein coding   CCDS19590  Q9DCW5   TSL:1 GENCODE basic APPRIS P1
Cox6a1-202  ENSMUST00000137766.1  620  No protein  -                     Retained intron  -          -        TSL:2

[Figure: Ensembl region view of chromosome 5, 23.34 kb, axis ticks from 115.340 Mb to 115.355 Mb. Forward strand: Triap1-201 (protein coding), Gm42903-201 (lncRNA). Contig: AC117735.8. Reverse strand: Gatc-201 (protein coding), Cox6a1-201 (protein coding), 4930401G09Rik-201 (lncRNA), Cox6a1-202 (retained intron). Regulatory Build legend: CTCF, enhancer, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000040154

[Figure: Cox6a1-201 (protein coding), reverse strand, 3.34 kb, with protein annotation tracks for ENSMUSP00000047...: Transmembrane heli..., Low complexity (Seg), Superfamily (Cytochrome c oxidase, subunit VIa superfamily), Pfam (Cytochrome c oxidase, subunit VIa), PROSITE patterns (Cytochrome c oxidase, subunit VIa, conserved site), PIRSF (Cytochrome c oxidase, subunit VIa), PANTHER (Cytochrome c oxidase, subunit VIa; PTHR11504:SF4), Gene3D (Cytochrome c oxidase, subunit VIa superfamily), CDD (Cytochrome c oxidase, subunit VIa). Sequence variants (dbSNP and all other sources); variant legend: missense variant, synonymous variant. Scale bar: 0 to 112.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
