https://www.alphaknockout.com

Mouse Cox8a Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cox8a conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cox8a gene (NCBI Reference Sequence: NM_007750; Ensembl: ENSMUSG00000035885) is located on Mouse chromosome 19. Two exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 2 (Transcript: ENSMUST00000039758). Exon 1 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the Mouse Cox8a gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-95L16 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 covers 55.07% of the coding region. The start codon is in exon 1, and the stop codon is in exon 2. The size of intron 1 for 3'-loxP site insertion is 1914 bp. The size of the effective cKO region is ~374 bp. The cKO region does not contain any other known gene.
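As a rough consistency check, the 55.07% share can be reproduced from the 69 aa protein length listed for ENSMUST00000039758 in the transcript information of this report. The sketch below assumes the coding region is counted as 69 codons (207 bp, stop codon excluded). That assumption, and the resulting coding length for exon 1, are inferences and not values stated in this report.

```python
# Assumption: the coding region is counted as 69 codons (the 69 aa protein),
# with the stop codon excluded. This is an inference, not a reported value.
PROTEIN_LENGTH_AA = 69
CDS_LENGTH_BP = PROTEIN_LENGTH_AA * 3  # 207 bp under the assumption above

REPORTED_EXON1_SHARE = 55.07  # percent of the coding region, from the note


def coding_bp_from_share(share_percent: float, cds_length_bp: int) -> int:
    """Return the whole number of coding bp closest to the given share."""
    return round(share_percent / 100 * cds_length_bp)


exon1_coding_bp = coding_bp_from_share(REPORTED_EXON1_SHARE, CDS_LENGTH_BP)
recomputed_share = round(100 * exon1_coding_bp / CDS_LENGTH_BP, 2)

print(f"Exon 1 coding length (inferred): {exon1_coding_bp} bp")
print(f"Share of coding region: {recomputed_share}%")
```

Under this assumption the reported percentage corresponds to a whole number of base pairs, which suggests the share was computed without the stop codon.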


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1 and 2, ATG start codon, gRNA regions flanking the cKO region), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Cox8a; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned with itself. Legend: Forward; Reverse Complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
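The self-alignment described in the note can be sketched as a word-match dot plot: every 10 bp window is indexed, and a dot is recorded wherever the same window (forward) or its reverse complement occurs elsewhere in the sequence. The function names and the toy sequence are hypothetical, and this is only an illustration of the general idea, not the tool used to produce the plot in this report.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, revcomp) lists of (i, j) window-start pairs.

    forward: the window at i is identical to the window at j (i < j).
    revcomp: the window at j is the reverse complement of the window at i
             (i <= j; i == j marks a palindromic window).
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, revcomp = [], []
    for kmer, starts in positions.items():
        # Every pair of occurrences of the same window is a forward dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                forward.append((starts[a], starts[b]))
        # Occurrences of the reverse complement are reverse-complement dots.
        for j in positions.get(reverse_complement(kmer), []):
            for i in starts:
                if i <= j:
                    revcomp.append((i, j))
    return sorted(forward), sorted(revcomp)


# Toy sequence (hypothetical): a 10 bp unit repeated after a 2 bp spacer.
toy = "AAACCCGGGT" + "TT" + "AAACCCGGGT"
forward_dots, revcomp_dots = self_dot_plot(toy, window=10)
print(forward_dots)
```

Runs of forward dots along a diagonal parallel to the main diagonal indicate direct repeats, which is the pattern the note flags as a potential difficulty for vector construction.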

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(6614bp) | A(27.15% 1796) | C(23.34% 1544) | T(24.8% 1640) | G(24.71% 1634)
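The summary figures can be cross-checked with a few lines of arithmetic. The sketch below takes the four base counts from the summary line and recomputes the total length, the per-base percentages and the overall GC content; the overall GC percentage is derived here and is not stated in the report.

```python
# Base counts copied from the summary line above.
counts = {"A": 1796, "C": 1544, "T": 1640, "G": 1634}

total_length = sum(counts.values())

# Per-base percentage, rounded to two decimals as in the summary.
percent = {base: round(100 * n / total_length, 2) for base, n in counts.items()}

# Overall GC content of the analyzed sequence (derived, not reported).
gc_percent = round(100 * (counts["G"] + counts["C"]) / total_length, 2)

print(total_length)
print(percent)
print(gc_percent)
```

The counts sum to the stated full length, and the recomputed percentages agree with the summary. The overall GC content is close to 48%, so the high-GC regions mentioned in the note are local features of the 300 bp windows and not a property of the whole sequence.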

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
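A windowed GC analysis of this kind can be sketched as follows. The function names and the 0.65 threshold for calling a window "high GC" are hypothetical choices for illustration; the report does not state which cutoff or tool was used.

```python
def gc_windows(seq: str, window: int = 300):
    """Return (start, gc_fraction) for every full window, sliding by 1 bp."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    gc = sum(1 for base in seq[:window] if base in "GC")
    result = [(0, gc / window)]
    for start in range(1, len(seq) - window + 1):
        # Update the running count: drop the base leaving the window,
        # add the base entering it.
        gc -= seq[start - 1] in "GC"
        gc += seq[start + window - 1] in "GC"
        result.append((start, gc / window))
    return result


def high_gc_starts(seq: str, window: int = 300, threshold: float = 0.65):
    """Return window starts whose GC fraction meets the (assumed) threshold."""
    return [s for s, frac in gc_windows(seq, window) if frac >= threshold]


# Toy example with a short window so the result is easy to follow.
toy = "A" * 6 + "G" * 6
print(gc_windows(toy, window=4))
print(high_gc_starts(toy, window=4, threshold=0.75))
```

The running count keeps the scan linear in sequence length, which matters little for a 6.6 kb sequence but keeps the sketch usable on longer inputs.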


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  -       7217513    7220512    3000
YourSeq  402    849    1688  3000   90.6%     chr9   -       114974955  114975498  544
YourSeq  369    846    1248  3000   98.0%     chrX   -       140484355  140970669  486315
YourSeq  357    815    1267  3000   95.0%     chr8   +       33941693   33942763   1071
YourSeq  348    849    1241  3000   95.6%     chr4   +       99793434   99793985   552
YourSeq  346    323    1036  3000   96.5%     chr4   -       141644766  141999146  354381
YourSeq  345    849    1251  3000   93.0%     chr19  -       37058007   37058403   397
YourSeq  339    851    1262  3000   93.5%     chr9   +       110705113  110705493  381
YourSeq  338    850    1246  3000   94.3%     chr12  -       111742814  111743224  411
YourSeq  329    1053   1425  3000   98.0%     chr7   +       28853376   29191998   338623
YourSeq  324    850    1239  3000   93.6%     chr18  +       35852622   35853014   393
YourSeq  320    852    1252  3000   92.1%     chr17  -       29124984   29125382   399
YourSeq  313    1053   1449  3000   95.4%     chr13  +       55671001   55671462   462
YourSeq  305    848    1234  3000   90.9%     chr12  +       113079665  113080044  380
YourSeq  304    850    1189  3000   95.3%     chr11  -       97880858   97881213   356
YourSeq  302    850    1242  3000   92.2%     chr18  +       23970192   23970728   537
YourSeq  298    1069   1422  3000   97.5%     chr5   -       113835492  113835937  446
YourSeq  282    849    1423  3000   91.7%     chr6   -       48683691   48684211   521
YourSeq  280    959    1415  3000   97.0%     chr4   -       107430888  107431926  1039
YourSeq  280    349    1003  3000   88.5%     chr11  +       6530321    6530654    334

Note: The 3000 bp section upstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  -       7214149    7217148    3000
YourSeq  307    1664   1992  3000   96.4%     chr1   -       181186904  181187231  328
YourSeq  291    1664   1996  3000   94.5%     chr13  -       9272677    9273008    332
YourSeq  164    1682   1989  3000   88.3%     chr8   -       121635481  121635792  312
YourSeq  140    550    914   3000   86.3%     chr19  +       44950061   44950417   357
YourSeq  125    765    1174  3000   82.6%     chr6   +       71258199   71258367   169
YourSeq  122    603    886   3000   84.2%     chr11  -       72630544   72630775   232
YourSeq  121    751    918   3000   95.6%     chr10  +       66846514   66846688   175
YourSeq  117    806    1157  3000   82.2%     chr14  -       84269847   84270013   167
YourSeq  115    765    927   3000   89.4%     chr11  +       86678271   86678426   156
YourSeq  110    765    916   3000   91.5%     chr1   -       74674252   74674402   151
YourSeq  110    765    1151  3000   81.9%     chr18  +       36583838   36583994   157
YourSeq  109    521    885   3000   81.8%     chr10  +       128162076  128162297  222
YourSeq  109    765    918   3000   93.6%     chr1   +       153426290  153426447  158
YourSeq  106    551    885   3000   94.3%     chr11  +       94748338   94748672   335
YourSeq  103    758    918   3000   92.0%     chr11  -       99025646   99025810   165
YourSeq  103    748    904   3000   85.4%     chr11  +       3321559    3321701    143
YourSeq  103    765    1156  3000   80.2%     chr1   +       8432993    8433146    154
YourSeq  102    765    918   3000   92.6%     chr1   +       182039066  182039219  154
YourSeq  101    765    918   3000   91.2%     chr12  +       110971545  110971695  151

Note: The 3000 bp section downstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.
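For readers who want to work with the BLAT hits programmatically, the rows can be parsed into records and sanity-checked. The sketch below is a hypothetical helper, not part of BLAT or the UCSC tools: it reads the last eleven whitespace-separated fields of a row, checks that SPAN equals the genomic end minus start plus one, and picks out the full-length 100% self-match. Only three rows from the upstream table are included as sample input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(line: str) -> BlatHit:
    """Parse one result row; leading link text, if any, is ignored."""
    f = line.split()[-11:]
    if len(f) != 11:
        raise ValueError(f"expected 11 fields, got {len(f)}: {line!r}")
    return BlatHit(
        f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
        float(f[5].rstrip("%")), f[6], f[7], int(f[8]), int(f[9]), int(f[10]),
    )


# Sample rows copied from the upstream BLAT table.
ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr19 - 7217513 7220512 3000",
    "YourSeq 402 849 1688 3000 90.6% chr9 - 114974955 114975498 544",
    "YourSeq 369 846 1248 3000 98.0% chrX - 140484355 140970669 486315",
]

hits = [parse_blat_row(row) for row in ROWS]

# SPAN should equal the inclusive length of the genomic interval.
span_ok = all(h.t_end - h.t_start + 1 == h.span for h in hits)

# The query's own locus: 100% identity over the full query length.
self_hits = [
    h for h in hits
    if h.identity == 100.0 and h.q_end - h.q_start + 1 == h.q_size
]
print(span_ok, [h.chrom for h in self_hits])
```

Separating the self-match from the remaining rows makes it easier to review the partial hits, whose aligned query ranges cover only part of the 3000 bp query.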


Gene information: Cox8a cytochrome c oxidase subunit 8A [Mus musculus (house mouse)] Gene ID: 12868, updated on 12-Aug-2019

Gene summary

Official Symbol: Cox8a (provided by MGI)
Official Full Name: cytochrome c oxidase subunit 8A (provided by MGI)
Primary source: MGI:MGI:105959
See related: Ensembl:ENSMUSG00000035885
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: COX8L
Expression: Ubiquitous expression in duodenum adult (RPKM 1660.4), colon adult (RPKM 1505.4) and 27 other tissues
Orthologs: human; all

Genomic context

Location: 19; 19 A

Exon count: 2

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  19   NC_000085.6 (7215158..7217616, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    19   NC_000085.5 (7289648..7292106, complement)
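The coordinates in the two assemblies can be compared with a short calculation. The sketch below computes the inclusive length of each interval and the offset between assemblies; the Ensembl interval is taken from the location given in the transcript information of this report. The variable and function names are illustrative.

```python
def interval_length(start: int, end: int) -> int:
    """Inclusive length of a 1-based genomic interval."""
    return end - start + 1


# Coordinates copied from this report.
GRCM38_NCBI = (7215158, 7217616)     # NC_000085.6, complement
MGSCV37_NCBI = (7289648, 7292106)    # NC_000085.5, complement
GRCM38_ENSEMBL = (7215153, 7217616)  # Ensembl gene location, reverse strand

ncbi_len = interval_length(*GRCM38_NCBI)
old_len = interval_length(*MGSCV37_NCBI)
ensembl_len = interval_length(*GRCM38_ENSEMBL)

# Shift of the gene start between the previous and current assembly.
assembly_offset = MGSCV37_NCBI[0] - GRCM38_NCBI[0]

print(ncbi_len, old_len, ensembl_len, assembly_offset)
```

The NCBI interval has the same length in both assemblies, so the gene is shifted but not resized between builds. The Ensembl annotation starts 5 bp earlier on the same assembly.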

[Figure: genomic context of Cox8a on chromosome 19 (NC_000085.6).]


Transcript information: This gene has 2 transcripts

Gene: Cox8a ENSMUSG00000035885

Description: cytochrome c oxidase subunit 8A [Source:MGI Symbol;Acc:MGI:105959]
Gene Synonyms: COX VIII-L, COX8L
Location: Chromosome 19: 7,215,153-7,217,616 reverse strand. GRCm38:CM001012.2
About this gene: This gene has 2 transcripts (splice variants), 95 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name       Transcript ID         bp   Protein  Translation ID        Biotype                  CCDS       UniProt     Flags
Cox8a-201  ENSMUST00000039758.5  550  69aa     ENSMUSP00000040717.4  Protein coding           CCDS29521  Q64445      TSL:1, GENCODE basic, APPRIS P1
Cox8a-202  ENSMUST00000237087.1  504  43aa     ENSMUSP00000157524.1  Nonsense mediated decay  -          A0A494B971  -

[Figure: Ensembl genome browser view of a 22.46 kb region of chromosome 19 (7.210 Mb to 7.225 Mb, contig AC140307.3). Reverse-strand transcripts shown: Otub1-201, Otub1-202 and Otub1-205 (protein coding); Cox8a-201 (protein coding) and Cox8a-202 (nonsense mediated decay); Naa40-201 and Naa40-204 (protein coding), Naa40-203 (nonsense mediated decay) and Naa40-205 (retained intron). Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000039758

[Figure: protein domain view of Cox8a-201 (protein coding, reverse strand, 2.46 kb) for ENSMUSP00000040... Tracks: transmembrane helices; Superfamily (Cytochrome c oxidase, subunit 8 superfamily); Pfam (Cytochrome c oxidase, subunit 8); PANTHER (PTHR16717:SF1; Cytochrome c oxidase, subunit 8); Gene3D (Cytochrome c oxidase, subunit 8 superfamily); CDD (cd00930); sequence variants from dbSNP and all other sources. Variant legend: missense variant; synonymous variant. Scale bar: 0 to 69 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
