US 20070203083A1

(19) United States
(12) Patent Application Publication
     Mootha et al.
(10) Pub. No.: US 2007/0203083 A1
(43) Pub. Date: Aug. 30, 2007

(54) METHODS OF REGULATING METABOLISM AND MITOCHONDRIAL FUNCTION

(76) Inventors: Vamsi Krishna Mootha, Brookline, MA (US); David Altshuler, Brookline, MA (US)

Correspondence Address:
FISH & NEAVE IP GROUP
ROPES & GRAY LLP
ONE INTERNATIONAL PLACE
BOSTON, MA 02110-2624 (US)

(21) Appl. No.: 10/560,501
(22) PCT Filed: Jun. 14, 2004
(86) PCT No.: PCT/US04/19017
     § 371(c)(1), (2), (4) Date: Jun. 15, 2006

Related U.S. Application Data
(60) Provisional application No. 60/478,238, filed on Jun. 13, 2003. Provisional application No. 60/525,548, filed on Nov. 26, 2003. Provisional application No. 60/559,141, filed on Apr. 2, 2004.

Publication Classification
(51) Int. Cl.
     [class symbol illegible] 48/00 (2006.01)
     C40B 30/06 (2006.01)
     C40B 40/08 (2006.01)
(52) U.S. Cl.: 514/44; 435/6

(57) ABSTRACT
The invention relates to novel methods of regulating metabolism and mitochondrial biogenesis. Some aspects of the invention relate to methods of treating or preventing diseases in a patient associated with reduced mitochondrial function, to methods of identifying agents to treat such diseases, and to methods of diagnosing such diseases. Other aspects of the invention relate to a set of coordinately regulated genes which regulate oxidative phosphorylation.

Patent Application Publication   Aug. 30, 2007   Sheet 1 of 12   US 2007/0203083 A1

FIGURE 1
[Schematic of gene set enrichment analysis. Steps shown: (1) collect gene sets S (pathways, GO terms, gene clusters); (2) order genes by expression difference (NGT vs. DM2); (3) measure ES for each gene set; (4) record MES for actual data; (5) permute class labels (1,000 times); (6) evaluate significance of actual MES against 1,000 permuted MES. An inset table of gene set descriptions is only partly legible; entries include "Mitochondria keyword", "GenMAPP OXPHOS", and "GenMAPP retinol metabolism".]

FIGURE 2 (Sheet 2 of 12)
[Plots. Recoverable axis labels: mean expression; fraction of genes. Data points not recoverable.]

Sheet 3 of 12
[Figure label and graphic content not recoverable.]

FIGURE 4 (Sheet 4 of 12)
[Histogram panels: All genes (10,983); OXPHOS genes (106); OXPHOS genes with reliable orthologs (47); OXPHOS-CR (34); OXPHOS not CR (13). x-axis: NGT versus DM2 SNR, ticks from -0.8 to 0.8.]

FIGURE 5 (Sheet 5 of 12)
[Table of regression models. Entries marked "illegible" could not be read.]

Predictor(s)                    R2 (adjusted)    P-value
Diabetes status                 0.28             0.0006
OXPHOS-CR                       0.22             0.002
Diabetes status, OXPHOS-CR      0.33             illegible
UQCRB                           illegible        illegible
Diabetes status, UQCRB          0.38             illegible

Addition of OXPHOS-CR improves the model with P = 0.05. Addition of UQCRB improves the model with P = 0.03.

[Scatter plot. x-axis: OXPHOS-CR mean centroid, ticks from -0.2 to 0.3; annotation: adjusted R2 = 0.22, P-value truncated in the source; y-axis label not recoverable.]
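The models summarized for FIG. 5 regress VO2max on a per-subject summary of OXPHOS-CR expression. The following Python sketch shows one way a single-predictor fit and its adjusted R2 could be computed. It rests on stated assumptions: the "mean centroid" is taken here to be the per-subject average of z-scored gene expression (this definition is not given in the text above), all function names are hypothetical, and the numbers are invented toy values, not study data.

```python
from statistics import mean, pstdev


def mean_centroid(expr):
    """Per-subject average of z-scored expression across genes.

    expr maps gene name -> list of expression values, one per subject.
    This definition of the centroid is an assumption for illustration.
    """
    n_subjects = len(next(iter(expr.values())))
    totals = [0.0] * n_subjects
    for values in expr.values():
        mu, sd = mean(values), pstdev(values)
        for i, v in enumerate(values):
            totals[i] += (v - mu) / sd if sd else 0.0
    return [t / len(expr) for t in totals]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def adjusted_r2(x, y, n_predictors=1):
    """Adjusted R2 of the single-predictor least-squares fit."""
    a, b = fit_line(x, y)
    my = mean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    n = len(y)
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


# Toy values (hypothetical): two genes measured in four subjects.
expr = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [10.0, 20.0, 30.0, 40.0]}
vo2max = [20.0, 26.0, 29.0, 35.0]
centroid = mean_centroid(expr)
print(adjusted_r2(centroid, vo2max))
```

Comparing the adjusted R2 of a model with and without the centroid term is the kind of comparison the table reports; a multi-predictor model (for example diabetes status plus centroid) would need a general least-squares solver, which this sketch omits.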

FIGURE 6 (Sheet 6 of 12)
[Panel A: diagram labeled "591 distinct (mito-A)" with groups "Annotated" and "Proteomics"; the numbers 428 and 399 appear beside these labels. Panel B: distribution by molecular weight (kDa), x-axis 0 to 140, y-axis 0 to 0.25. Panel D: distribution by mitochondrial compartment. Panel E: x-axis mRNA expression (logarithmic, 10 to 100,000); one annotation reads "mRNA = 93.5". Remaining panel content not recoverable.]

FIGURE 7 (Sheet 7 of 12)
[Graphic content not recoverable.]

Sheet 8 of 12
[Figure label and graphic content not recoverable.]

FIGURE 9 (Sheet 9 of 12)
[Histogram. x-axis: Neighborhood index (N), bins 0-10, 11-20, 21-30, 31-40, 41-50, 51-60, >60.]

FIGURE 10 (Sheet 10 of 12)
[Panel A. Recoverable axis label: Expression Difference.]

FIGURE 11 (Sheet 11 of 12)
[Model diagram. Recoverable labels: Exercise; Diabetes; PGC-1α; Gabpa; Errα; Early target genes; Late target genes.]

FIGURE 12 (Sheet 12 of 12)
[Plot. Recoverable axis label: Cumulative fraction of rank ordered genes.]
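The curves of FIG. 12 and the motif annotation of FIG. 10 can be illustrated with a short Python sketch. The two motif strings are the ones given in the description of FIG. 10; everything else is an assumption made for illustration: matching on both strands, the hypothetical function names, the invented promoter sequences, and the use of a Mann-Whitney U count as the nonparametric statistic (the text here only says "a nonparametric statistic").

```python
ERRA_MOTIF = "TGACCTTG"   # Err-alpha motif from the FIG. 10 description
GABPA_MOTIF = "CTTCCG"    # Gabpa/b motif from the FIG. 10 description

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_motif(promoter, motif):
    """True if the promoter has an exact match on either strand.

    Scanning both strands is an assumption of this sketch.
    """
    promoter = promoter.upper()
    return motif in promoter or reverse_complement(motif) in promoter


def cumulative_fraction(flags):
    """Points (fractional rank, cumulative fraction of motif genes).

    flags holds one bool per gene, ordered from top rank to bottom rank.
    """
    total = sum(flags)
    if total == 0:
        raise ValueError("no gene carries the motif")
    n = len(flags)
    curve, seen = [], 0
    for rank, flag in enumerate(flags, start=1):
        seen += flag
        curve.append((rank / n, seen / total))
    return curve


def rank_sum_u(flags):
    """Mann-Whitney U count: (motif, non-motif) pairs in which the
    motif gene ranks higher. Assumes no tied ranks."""
    u = motif_seen = 0
    for flag in flags:
        if flag:
            motif_seen += 1
        else:
            u += motif_seen
    return u


# Hypothetical promoters, listed from most to least induced.
ranked_promoters = [
    ("gene1", "AATGACCTTGCA"),
    ("gene2", "GGCAAGGTCATT"),  # carries the motif on the opposite strand
    ("gene3", "ACGTACGTACGT"),
    ("gene4", "TTTGACCTTGAA"),
    ("gene5", "CCCCAAAATTTT"),
    ("gene6", "GATTACAGATTA"),
]
flags = [has_motif(seq, ERRA_MOTIF) for _, seq in ranked_promoters]
print(cumulative_fraction(flags))
print(rank_sum_u(flags), "of", sum(flags) * (len(flags) - sum(flags)))
```

A curve that rises well above the diagonal, or a U count close to its maximum, corresponds to motif-bearing genes concentrating near the top of the ranking.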

METHODS OF REGULATING METABOLISM AND MITOCHONDRIAL FUNCTION

BACKGROUND OF THE INVENTION

[0001] Type 2 diabetes (DM2) affects an estimated 110 million people worldwide and is a major contributor to atherosclerotic vascular disease, blindness, amputation, and kidney failure. Defects in insulin secretion are observed early in patients with MODY, a monogenic form of type 2 diabetes; insulin resistance at tissues such as skeletal muscle is a cardinal feature of patients with fully developed DM2. Many molecular pathways have been implicated in the disease process: beta-cell development, insulin receptor signaling, carbohydrate production and utilization, mitochondrial metabolism, fatty acid oxidation, cytokine signaling, adipogenesis, adrenergic signaling, and others. It remains unclear, however, which of these or other pathways are disturbed in, and might be responsible for, DM2 in its common form.

[0002] Therefore, a need remains to identify the molecular pathways implicated in the disease process and to develop new tools and assays to identify therapeutics for the treatment of diabetes.

SUMMARY OF THE INVENTION

[0003] One aspect of the invention provides a method of modulating a biological response in a cell, the method comprising contacting the cell with at least one agent that modulates the expression or activity of Errα or Gabp, wherein the biological response is (a) expression of at least one OXPHOS gene; (b) mitochondrial biogenesis; (c) expression of Nuclear Respiratory Factor 1 (NRF-1); (d) β-oxidation of fatty acids; (e) total mitochondrial respiration; (f) uncoupled respiration; (g) mitochondrial DNA replication; (h) expression of mitochondrial enzymes; or (i) skeletal muscle fiber-type switching.

[0004] Another aspect of the invention provides a method of determining if an agent is a potential agent for the treatment of a disorder that is characterized by glucose intolerance, insulin resistance or reduced mitochondrial function, the method comprising determining if the agent increases: (i) the expression or activity of Errα or Gabp in a cell; or (ii) the formation of a complex between a PGC-1 polypeptide and (1) an Errα polypeptide; or (2) a Gabp polypeptide; wherein an agent that increases (i) or (ii) is a potential target for the treatment of the disorder.

[0005] The invention also provides a method of identifying an agent that modulates a biological response, the method comprising (a) contacting, in the presence of the agent, a PGC-1 polypeptide and an (i) Errα polypeptide, or (ii) a Gabp polypeptide, under conditions which allow the formation of a complex between the PGC-1 polypeptide and (i) the Errα polypeptide, or (ii) the Gabp polypeptide; and (b) detecting the presence of the complex; wherein an agent that modulates the biological response is identified if the agent increases or decreases the formation of the complex, and wherein the biological response is (a) expression of at least one OXPHOS gene; (b) mitochondrial biogenesis; (c) expression of Nuclear Respiratory Factor 1 (NRF-1); (d) β-oxidation of fatty acids; (e) total mitochondrial respiration; (f) uncoupled respiration; (g) mitochondrial DNA replication; (h) expression of mitochondrial enzymes; or (i) skeletal muscle fiber-type switching.

[0006] Additionally, the invention provides a method of treating or preventing a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising administering to the subject a therapeutically effective amount of an agent which (i) increases the expression or activity of Errα, or Gabp or both; or (ii) increases the formation of a complex between a PGC-1 polypeptide and (a) an Errα polypeptide; (b) a Gabp polypeptide; or both; or (iii) binds to an (a) Errα binding site, or to a (b) Gabpa binding site, and which increases transcription of at least one gene in the subject, said gene having an Errα binding site, a Gabpa binding site, or both.

[0007] Yet another aspect of the invention provides a method of treating or preventing a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising administering to the subject a therapeutically effective amount of an agent which increases the expression or activity of a gene, wherein the gene has an Errα binding site or a Gabpa binding site.

[0008] The invention also provides a method of reducing the metabolic rate of a subject in need thereof, the method comprising administering to the subject a therapeutically effective amount of an agent which decreases the expression or activity of at least one of the following: (i) Errα; (ii) Gabpa; (iii) a gene having an Errα binding site, a Gabpa binding site, or both; or (iv) a transcriptional activator which binds to an Errα binding site or to a Gabpa binding site; thereby reducing the metabolic rate of the patient.

[0009] The invention further provides a method of identifying a susceptibility locus for a disorder that is characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising (i) identifying at least one polymorphism in a gene, or linked to a gene, wherein the gene (a) has an Errα binding site, a Gabpa binding site, or both; or (b) is Errα, Gabpa, or Gabpb; (ii) determining if at least one polymorphism is associated with the incidence of the disorder, wherein if a polymorphism is associated with the incidence of the disorder then the gene having the polymorphism, or the gene to which the polymorphism is linked, is a susceptibility locus.

[0010] A related aspect of the invention provides a method of determining if a subject is at risk of developing a disorder which is characterized by reduced mitochondrial function, the method comprising determining if a gene from the subject contains a mutation which reduces the function of the gene, wherein the gene has an Errα binding site, a Gabpa binding site, or both, wherein if a gene from the subject contains a mutation then the subject is at risk of developing the disorder.

[0011] Yet another aspect of the invention provides a method of identifying a transcriptional regulator having differential activity between an experimental cell and a control cell, the method comprising (i) determining the level of gene expression of at least two genes in the experimental cell and in the control cell; (ii) ranking genes according to a difference metric of their expression level in the experimental cell compared to the control cell; (iii) identifying a subset of genes, wherein each gene in the subset contains the same DNA sequence motif; (iv) testing using a nonparametric statistic if the subset of genes are enriched at either the top or the bottom of the ranking; (v) optionally reiterating steps (ii)-(iii) for additional motifs; (vi) for a subset of genes that is enriched, identifying a transcriptional regulator which binds to a DNA sequence motif that is contained in the subset of genes; thereby identifying a transcriptional regulator having differential activity between two cells.

[0012] An additional aspect of the invention provides a method of treating impaired glucose tolerance in an individual in need thereof, the method comprising administering to the individual a therapeutically effective amount of an agent which increases the expression level of at least two OXPHOS-CR genes, thereby treating impaired glucose tolerance in the individual. A related aspect provides a method of treating obesity in an individual, comprising administering to the individual a therapeutically effective amount of an agent which increases the expression level of at least two OXPHOS-CR genes, thereby treating obesity in the individual.

[0013] One aspect of the invention provides a method of detecting statistically-significant differences in the expression level of at least one biomarker belonging to a biomarker set, between the members of a first and of a second experimental group, comprising: (a) obtaining a biomarker sample from members of the first and the second experimental groups; (b) determining, for each biomarker sample, the expression levels of at least one biomarker belonging to the biomarker set and of at least one biomarker not belonging to the set; (c) generating a rank order of each biomarker according to a difference metric of its expression level in the first experimental group compared to the second experimental group; (d) calculating an experimental enrichment score for the biomarker set by applying a non-parametric statistic; and (e) comparing the experimental enrichment score with a distribution of randomized enrichment scores to calculate the fraction of randomized enrichment scores greater than the experimental enrichment score, wherein a low fraction indicates a statistically-significant difference in the expression level of the biomarker set, between the members of a first and of a second experimental group. In one embodiment, the distribution of randomized enrichment scores is generated by (i) randomly permutating the assignment of each biomarker sample to the first or to the second experimental group; (ii) generating a rank order of each biomarker according to the absolute value of a difference metric of its expression level in the first experimental group compared to the second experimental group; (iii) calculating an experimental enrichment score for the biomarker set by applying a non-parametric statistic to the rank order; and (iv) repeating steps (i), (ii) and (iii) a number of times sufficient to generate the distribution of randomized enrichment scores.

[0014] In addition, the invention provides a method of identifying an agent that regulates expression of OXPHOS-CR genes, the method comprising (a) contacting (i) an agent to be assessed for its ability to regulate expression of OXPHOS-CR genes with (ii) a test cell; and (b) determining whether the expression of at least two OXPHOS-CR gene products show a coordinate change in the test cell compared to an appropriate control, wherein a coordinate change in the expression of the OXPHOS-CR gene products indicates that the agent regulates the expression levels of OXPHOS-CR genes. In one embodiment, the OXPHOS-CR genes are selected from the group consisting of NDUFB3, SDHA, NDUFA8, COX7A1, UQCRC1, NDUFC1, NDUFS2, ATP5O, NDUFS3, SDHB, NDUFS5, NDUFB6, COX5B, CYC1, NDUFA7, UQCRB, COX7B, ATP5L, COX7C, NDUFA5, GRIM19, ATP5J, COX6A2, NDUFB5, CYCS, NDUFA2 and HSPC051.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] FIG. 1 shows a schematic overview of an embodiment of gene set enrichment analysis (GSEA). The goal of GSEA is to determine whether any a priori defined gene sets (step 1) are enriched at the top of a list of genes ordered on the basis of expression difference between two classes (e.g., high in NGT vs. DM2). Genes R1, ..., RN are rank ordered on the basis of expression difference (step 2) using an appropriate difference measure (e.g., signal to noise ratio (SNR), see Methods). To determine whether the G members of a gene set S are enriched at the top of this list (step 3), a Kolmogorov-Smirnov (K-S) running sum statistic is computed: beginning with the top ranking gene, the running sum increases when a gene annotated to be a member of gene set S is encountered, and decreases otherwise. The enrichment score (ES) for a single gene set is defined as the greatest positive deviation of the running sum across all N genes. When many members of S appear at the top of the list, ES is high. The enrichment score is computed for every gene set using actual data, and the maximum ES (MES) achieved is recorded (step 4). To determine whether one or more of the gene sets are enriched in one diagnostic class relative to the other (step 5), the entire procedure (steps 2-4) is repeated 1000 times, using permuted diagnostic assignments, and building a histogram of the maximum ES achieved by any pathway in a given permutation. The MES achieved using the actual data is then compared to this histogram (step 6, red arrow), providing us with a global P-value for assessing whether any gene set is associated with the diagnostic categorization.

[0016] FIG. 2 shows that OXPHOS gene expression is reduced in diabetic muscle. (a) The mean expression of all genes (gray) and for OXPHOS genes (red) is plotted for DM2 vs. NGT individuals. (b) Histogram of mean gene expression level differences between NGT and DM2, using the data from (a), for all genes (black) and for OXPHOS genes (red).

[0017] FIG. 3 shows that OXPHOS-CR represents a co-regulated subset of OXPHOS genes responsive to the transcriptional co-activator PGC-1α. (a) Normalized expression profile of 52 mouse homologs of the human OXPHOS genes across the mouse expression atlas (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70 (2002)). These 52 genes were hierarchically clustered (Eisen et al. Proc Natl Acad Sci USA 95, 14863-8 (1998)). The purple tree corresponds to a sub-cluster with a correlation coefficient of 0.65. Applicants call the human homologs of these mouse genes the OXPHOS-CR set. The human homologs of this tightly coregulated cluster, marked with an * and delimited with a yellow box, are: ATP5J, ATP5L, ATP5O, COX5B, COX6A2, COX7A1, COX7B, COX7C, CYC1, CYCS, GRIM19, HSPC051, NDUFA2, NDUFA5, NDUFA7, NDUFA8, NDUFB3, NDUFB5, NDUFB6, NDUFC1, NDUFS2, NDUFS3, NDUFS5, SDHA, SDHB, UQCRB,

UQCRC1. (b) Normalized expression profile of OXPHOS mouse homologs in a mouse skeletal muscle cell line during a three-day time course in response to PGC-1α. The expression profile includes infection with control (GFP) or with PGC-1α, at day 0 (prior to infection) as well as on days 1, 2, and 3 following adenoviral infection, all performed in duplicate.

[0018] FIG. 4 shows that OXPHOS-CR accounts for the bulk of OXPHOS signal seen in NGT vs. DM2. Histogram of signal:noise ratio for (a) all 10,983 human genes meeting the clipping and filtering criteria in the GSEA enrichment screen between NGT and DM2, (b) 106 OXPHOS genes meeting these clipping and filtering criteria, (c) 47 OXPHOS genes for which reliable mouse homologs are available in the mouse microarray, (d) OXPHOS-CR genes, and (e) OXPHOS genes but not in the OXPHOS-CR set.

[0019] FIG. 5 shows that OXPHOS-CR predicts total body aerobic capacity (VO2max). (a) Linear regression was used to model VO2max with diabetes status, the mean centroid of OXPHOS-CR gene expression, ubiquinol-cytochrome c reductase binding protein (UQCRB) expression, or in combination as explanatory (predictor) variables. The explanatory power and significance of the model are shown in the table. (b) Linear regression of VO2max against the mean centroid of OXPHOS-CR gene expression.

[0020] FIG. 6 shows previously known and newly identified mitochondrial proteins (mito-P). (A) Proteomic survey of mitochondria from mouse brain, heart, kidney, and liver resulted in the identification of 422 proteins, 262 of which were previously annotated as being mitochondrial. The distributions for (B) molecular weight, (C) isoelectric point, (D) mitochondrial compartments are plotted for proteins detected (pink) or not detected (blue) by our proteomic survey. Isoelectric point, molecular weight, and subcellular distribution data came from the MITOchondria Project (MITOP (Scharfe et al., 2000)). (E) Cumulative distribution of mRNA abundance for those genes whose protein product was detected (pink) or not detected (blue) by proteomics. The median expression levels for both groups are indicated. The cumulative distribution function for the proteins detected in proteomics is significantly greater than the cumulative distribution function for proteins not detected (Kolmogorov-Smirnov statistic, D=0.3618, P=9.4×10⁻¹⁸).

[0021] FIG. 7 shows modules of tightly co-regulated mito-P genes. Pairwise correlation matrix for the 388 mitochondrial genes present in the GNF mouse tissue compendium. Red represents strong positive correlation, blue represents strong negative correlation. Dominant gene modules are labeled 1-7 with functional annotations.

[0022] FIG. 8 shows the mRNA expression profile for 388 mitochondrial genes (rows) across 47 different mouse tissues (columns) in the GNF mouse expression atlas (Su et al., 2002). These genes and tissues were hierarchically clustered and visualized using DCHIP (Schadt et al., 2001). Key tissues showing high expression levels are labeled at the top of the panel. Evidence for being in mito-P is indicated by the white (previously known but not found in proteomics), gray (previously known and found in proteomics), and black (not previously known but found in proteomics) bars placed to the right of the correlogram.

[0023] FIG. 9 shows mitochondria neighborhood analysis. The mitochondria neighborhood index (N100) is defined as the number of mito-P genes that occur within the nearest 100 expression neighbors of a given gene. The distribution of N100 is plotted for all genes (white), mito-P genes (gray), and for the ancestral mito-P genes (black).

[0024] FIG. 10 shows a schematic overview of motifADE and application to the PGC-1α timecourse. (A) motifADE identifies motifs associated with differential expression. It begins with a list of genes ordered on the basis of differential expression across two conditions. Each gene is then annotated for the presence of a given motif in the promoter region. A nonparametric statistic is used to assess whether genes with the motif tend to rank high on this list (see Methods). In this example, genes with Motif 1 are randomly distributed on the list, while genes with Motif 2 tend to rank high, suggesting an association between Motif 2 and the differential expression. (B) C2C12 cells were infected with an adenovirus expressing either GFP (control) or with PGC-1α and profiled over a three day period. Experiments were performed in duplicate and relative gene expression measures are shown. Genes are ranked according to the difference in expression between PGC-1α and GFP on day 3. Mouse genes having a perfect Errα motif (5'-TGACCTTG-3'), a perfect Gabpa/b motif (5'-CTTCCG-3'), or both motifs are labeled with a black bar on the right side of the correlogram.

[0025] FIG. 11 shows a proposed model of mechanism of action of PGC-1α. PGC-1α is a highly regulated gene that responds to external stimuli, e.g., reduced in diabetes and increased following exercise. When PGC-1α levels rise, the expression of Errα and Gabpa are immediately induced via a double positive feedback loop. This results in the strong induction of Errα as well as Gabpa. These levels rise and over the course of 2-3 days, these factors couple with PGC-1α to induce the expression of NRF-1 as well as hundreds of downstream targets, such as OXPHOS and other mitochondrial genes.

[0026] FIG. 12 shows cooperativity between the Errα and Gabpa binding sites. All 5034 genes from motifADE analysis are rank ordered on the basis of expression difference (signal to noise ratio) on day 3 between cells treated with PGC-1α vs. GFP. The cumulative fraction of genes with a specified motif (Errα, blue; Gabpa, pink; both, black) is plotted as a function of fractional rank ordering of all 5034 genes.

DETAILED DESCRIPTION OF THE INVENTION

I. Overview

[0027] The invention broadly relates to novel therapeutics for regulating metabolism, mitochondrial function, and for treating disorders, including obesity and type 2 diabetes, and to related methods. The invention stems, in part, from the discovery by applicants of a new group of coordinately regulated genes, termed OXPHOS-CR, which are involved in oxidative phosphorylation. OXPHOS-CR genes have the following key characteristics: (a) they are members of oxidative phosphorylation; (b) they are transcriptionally co-regulated and highly expressed at the major sites of insulin-mediated glucose uptake (brown fat, heart, skeletal muscle); (c) they are targets of the transcriptional co-activator PPARGC1 (PGC-1α); (d) they show a subtle but extremely consistent expression decrease in diabetic and pre-diabetic muscle; and (e) their expression predicts total body aerobic capacity in humans.

[0028] Applicants have discovered that OXPHOS genes are downregulated in subjects afflicted with type 2 diabetes or with glucose intolerance and that Peroxisome Proliferator-Activated Receptor γ Coactivator-1α (PGC-1α) transcriptionally regulates the OXPHOS genes. Applicants have also discovered that PGC-1α acts through Errα and Gabp to regulate OXPHOS gene expression. Such discoveries provide the basis for novel assays and methods of treatment relating to the genes and disorders.

[0029] The invention provides, in part, methods of modulating mitochondrial function, expression of the OXPHOS genes, mitochondrial biogenesis, expression of Nuclear Respiratory Factor 1 (NRF-1), β-oxidation of fatty acids, total mitochondrial respiration, uncoupled respiration, mitochondrial DNA replication, or expression of mitochondrial enzymes, by modulating the expression or activity of Errα, Gabpa, Gabpb or of genes containing Errα binding sites, Gabpa binding sites, or both. Modulation of these biological activities may be carried out in a cell, such as by contacting a cell with an agent, or in a subject in need thereof. The invention further provides agents for treating these disorders and for modulating Errα, Gabp and PGC-1 function.

[0030] A related aspect of the invention provides a method of identifying agents useful for treating disorders related to altered glucose homeostasis, insulin resistance or reduced mitochondrial function. Furthermore, the invention provides methods of diagnosing such disorders or of identifying subjects at risk of developing the disorders.

[0031] The invention also provides cell-based methods of identifying agents which modulate the expression of OXPHOS genes. Since applicants have discovered that PGC-1α, Errα and Gabp regulate the expression level of OXPHOS genes, such methods are useful in identifying agents which regulate the expression or activity of PGC-1α, Errα and Gabp. Furthermore, expression of OXPHOS genes may be used to predict total body aerobic capacity in humans and other mammals.

[0032] Another aspect of the invention provides a method of detecting statistically-significant differences in the expression level of at least one biomarker belonging to a biomarker set, between the members of a first and of a second experimental group. Such a method may be applied, for example, to identify biomarker sets which are differentially expressed in an experimental group afflicted with a disorder, even when the changes in expression between the two groups are very subtle. Biomarker sets identified using the methods described herein may be used in the development of diagnostic tools and treatments for the disorder with which they are associated. A related aspect of the invention provides methods of identifying transcriptional regulators which display differential activity between two sets of conditions. Such methods may be applied to the biomarkers identified using the related methods provided herein, and may be useful in identifying disease genes and targets for novel therapeutics to treat or prevent disease.

II. Definitions

[0033] For convenience, certain terms employed in the specification, examples, and appended claims, are collected here. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

[0034] The term “expression vector” and equivalent terms are used herein to mean a vector which is capable of inducing the expression of DNA that has been cloned into it after transformation into a host cell. The cloned DNA is usually placed under the control of (i.e., operably linked to) certain regulatory sequences such as promoters or enhancers. Promoter sequences may be constitutive, inducible or repressible.

[0035] The term “operably linked” is used herein to mean molecular elements that are positioned in such a manner that enables them to carry out their normal functions. For example, a gene is operably linked to a promoter when its transcription is under the control of the promoter and, if the gene encodes a protein, such transcription produces the protein normally encoded by the gene. For example, a binding site for a transcriptional regulator is said to be operably linked to a promoter when transcription from the promoter is regulated by protein(s) binding to the binding site. Likewise, two protein domains are said to be operably linked in a protein when both domains are able to perform their normal functions.

[0036] The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.

[0037] The term “including” is used herein to mean, and is used interchangeably with, the phrase “including but not limited to.”

[0038] The term “or” is used herein to mean, and is used interchangeably with, the term “and/or,” unless context clearly indicates otherwise.

[0039] The term “such as” is used herein to mean, and is used interchangeably with, the phrase “such as but not limited to.”

[0040] A “patient” or “subject” to be treated by the method of the invention can mean either a human or non-human animal, preferably a mammal.

[0041] The term “encoding” comprises an RNA product resulting from transcription of a DNA molecule, a protein resulting from the translation of an RNA molecule, or a protein resulting from the transcription of a DNA molecule and the subsequent translation of the RNA product.

[0042] The term “promoter” is used herein to mean a DNA sequence that initiates the transcription of a gene. Promoters are typically found 5' to the gene and located proximal to the start codon. If a promoter is of the inducible type, then the rate of transcription increases in response to an inducer. Promoters may be operably linked to DNA binding elements that serve as binding sites for transcriptional regulators. The term “mammalian promoter” is used herein to mean promoters that are active in mammalian cells. Similarly, “prokaryotic promoter” refers to promoters active in prokaryotic cells.

[0043] The term “expression” is used herein to mean the process by which a polypeptide is produced from DNA. The process involves the transcription of the gene into mRNA

and the translation of this mRNA into a polypeptide. Depending on the context in which used, “expression” may refer to the production of RNA, protein or both.

[0044] The term “recombinant” is used herein to mean any nucleic acid comprising sequences which are not adjacent in nature. A recombinant nucleic acid may be generated in vitro, for example by using the methods of molecular biology, or in vivo, for example by insertion of a nucleic acid at a novel chromosomal location by homologous or non-homologous recombination.

[0045] The term “transcriptional regulator” refers to a biochemical element that acts to prevent or inhibit the transcription of a promoter-driven DNA sequence under certain environmental conditions (e.g., a repressor or nuclear inhibitory protein), or to permit or stimulate the transcription of the promoter-driven DNA sequence under certain environmental conditions (e.g., an inducer or an enhancer).

[0046] The term “microarray” refers to an array of distinct polynucleotides or oligonucleotides synthesized on a substrate, such as paper, nylon or other type of membrane, filter, chip, glass slide, or any other suitable solid support.

[0047] The terms “disorders” and “diseases” are used inclusively and refer to any deviation from the normal structure or function of any part, organ or system of the body (or any combination thereof). A specific disease is manifested by characteristic symptoms and signs, including biological, chemical and physical changes, and is often associated with a variety of other factors including, but not limited to, demographic, environmental, employment, genetic and medically historical factors. Certain characteristic signs, symptoms, and related factors can be quantitated through a variety of methods to yield important diagnostic information.

[0048] The terms “level of expression of a gene in a cell” or “gene expression level” refer to the level of mRNA, as well as pre-mRNA nascent transcript(s), transcript processing intermediates, mature mRNA(s) and degradation products, encoded by the gene in the cell.

[0049] The term “modulation” refers to upregulation (i.e., activation or stimulation), downregulation (i.e., inhibition or suppression) of a response, or the two in combination or apart. A “modulator” is a compound or molecule that modulates, and may be, e.g., an agonist, antagonist, activator, stimulator, suppressor, or inhibitor.

[0050] The term “prophylactic” or “therapeutic” treatment refers to administration to the subject of one or more of the subject compositions. If it is administered prior to clinical manifestation of the unwanted condition (e.g., disease or other unwanted state of the host animal) then the treatment is prophylactic, i.e., it protects the host against developing the unwanted condition, whereas if administered after manifestation of the unwanted condition, the treatment is therapeutic (i.e., it is intended to diminish, ameliorate or maintain the existing unwanted condition or side effects therefrom).

[0051] The term “therapeutic effect” refers to a local or systemic effect in animals, particularly mammals, and more particularly humans caused by a pharmacologically active substance. The term thus means any substance intended for use in the diagnosis, cure, mitigation, treatment or prevention of disease or in the enhancement of desirable physical or mental development and conditions in an animal or human. The phrase “therapeutically-effective amount” means that amount of such a substance that produces some desired local or systemic effect at a reasonable benefit/risk ratio applicable to any treatment. In certain embodiments, a therapeutically-effective amount of a compound will depend on its therapeutic index, solubility, and the like. For example, certain compounds discovered by the methods of the present invention may be administered in a sufficient amount to produce a reasonable benefit/risk ratio applicable to such treatment.

[0052] The term “improving mitochondrial function” may refer to (a) substantially (e.g., in a statistically significant manner, and preferably in a manner that promotes a statistically significant improvement of a clinical parameter such as prognosis, clinical score or outcome) restoring to a normal level at least one indicator of glucose responsiveness in cells having reduced glucose responsiveness and reduced mitochondrial mass and/or impaired mitochondrial function; or (b) substantially (e.g., in a statistically significant manner, and preferably in a manner that promotes a statistically significant improvement of a clinical parameter such as prognosis, clinical score or outcome) restoring to a normal level, or increasing to a level above and beyond normal levels, at least one indicator of mitochondrial function in cells having impaired mitochondrial function or in cells having normal mitochondrial function, respectively. Improved or altered mitochondrial function may result from changes in extra-mitochondrial structures or events, as well as from mitochondrial structures or events, in direct interactions between mitochondrial and extra-mitochondrial genes and/or their gene products, or in structural or functional changes that occur as the result of interactions between intermediates that may be formed as the result of such interactions, including metabolites, catabolites, substrates, precursors, cofactors and the like.

[0053] The term “effective amount” refers to the amount of a therapeutic reagent that when administered to a subject by an appropriate dose and regime produces the desired result.

[0054] The term “subject in need of treatment for a disorder” is a subject diagnosed with that disorder or suspected of having that disorder.

[0055] The term “metabolic disorder” refers to a disorder, disease or condition which is caused or characterized by an abnormal metabolism (i.e., the chemical changes in living cells by which energy is provided for vital processes and activities) in a subject. Metabolic disorders include diseases, disorders, or conditions associated with aberrant thermogenesis or aberrant adipose cell (e.g., brown or white adipose cell) content or function. Metabolic disorders can detrimentally affect cellular functions such as cellular proliferation, growth, differentiation, or migration, cellular regulation of homeostasis, inter- or intra-cellular communication, tissue function, such as liver function, muscle function, or adipocyte function; systemic responses in an organism, such as hormonal responses (e.g., insulin response). Examples of metabolic disorders include obesity, diabetes, hyperphagia, hypophagia, endocrine abnormalities, triglyceride storage disease, Bardet-Biedl syndrome, Lawrence-Moon syndrome, Prader-Labhart-Willi syndrome, Kearns-Sayre syndrome, anorexia, medium chain acyl-CoA dehydrogenase
deficiency, and cachexia. Obesity is defined as a body mass index (BMI) of 30 kg/m² or more (National Institute of Health, Clinical Guidelines on the Identification, Evaluation, and Treatment of Overweight and Obesity in Adults (1998)). However, the present invention is also intended to include a disease, disorder, or condition that is characterized by a body mass index (BMI) of 25 kg/m² or more, 26 kg/m² or more, 27 kg/m² or more, 28 kg/m² or more, 29 kg/m² or more, 29.5 kg/m² or more, or 29.9 kg/m² or more, all of which are typically referred to as overweight (National Institute of Health, Clinical Guidelines on the Identification, Evaluation, and Treatment of Overweight and Obesity in Adults (1998)).

[0056] A “susceptibility locus” for a particular disease is a sequence or gene locus implicated in the initiation or progression of the disease. The susceptibility locus can be, for example, a gene or a microsatellite repeat, as identified by a microsatellite marker, or can be identified by a defined single nucleotide polymorphism. Generally, susceptibility genes implicated in specific diseases and their loci can be found in scientific publications, but may also be determined experimentally.

[0057] The term “Gabp polypeptide” comprises Gabpa and Gabpb polypeptides. In preferred embodiments of the methods described herein, the Gabpa and Gabpb polypeptides are mammalian polypeptides, preferably human. The amino acid sequences of human Gabpa and Gabpb are deposited as Genbank Accession Nos. NP_002031 and NP_852092, respectively. Gabpa is also known as E4TF1-53 in the art, while Gabpb is also known as E4TF1-60. Additional assays to those described herein for assaying the transcriptional activity of Gabpa and Gabpb, and additional isoforms of these subunits, may be found in the art (Sawa, C., et al., Nucleic Acids Res. 24 (24), 4954-4961 (1996); Watanabe et al., Mol. Cell. Biol. 13 (3), 1385-1391 (1993); Sawada, J., et al., J. Biol. Chem. 274 (50), 35475-35482 (1999); Suzuki, F., et al., J. Biol. Chem. 273 (45), 29302-29308 (1998); Gugneja, S., et al., Mol. Cell. Biol. 15 (1), 102-111 (1995); de la Brousse, F. C., et al., Genes Dev. 8 (15), 1853-1865 (1994); Virbasius, J. V., et al., Genes Dev. 7 (3), 380-392 (1993)), the teachings of which are incorporated by reference herein.

[0058] The term “PGC-1 polypeptide” comprises PGC-1a and PGC-1b polypeptides. In preferred embodiments of the methods described herein, the PGC-1a and PGC-1b polypeptides are mammalian polypeptides, preferably human. The amino acid sequences of human PGC-1a and PGC-1b are deposited as Genbank Accession Nos. NP_573570 and AF453324, respectively. Additional assays to those described herein for assaying the transcriptional activity of PGC-1a and PGC-1b, and additional isoforms thereof, may be found in the art (Huss, J. M., et al., J. Biol. Chem. 277 (43), 40265-40274 (2002); Kressler, D., et al., J. Biol. Chem. 277 (16), 13918-13925 (2002); Lin, J., et al., J. Biol. Chem. 277 (3), 1645-1648 (2002)), the teachings of which are incorporated by reference herein.

[0059] The term “Errα polypeptide” includes Errα polypeptides from any species. In some preferred embodiments of the methods described herein, an Errα polypeptide is a mammalian polypeptide, preferably a human polypeptide. The sequence of human Errα corresponds to Genbank Accession No. NP_004442. Additional isoforms of Errα and methods for assaying Errα activity are known in the art, e.g., Schreiber, S. N., et al., J. Biol. Chem. 278 (11), 9013-9018 (2003); Igarashi, M., et al., J. Gen. Virol. 84 (Pt 2), 319-327 (2003); Kraus, R. J., et al., J. Biol. Chem. 277 (27), 24826-24834 (2002); Vanacker, J. M., Oncogene 17 (19), 2429-2435 (1998); Sladek, R., et al., Genomics 45 (2), 320-326 (1997); Sladek, R., et al., Mol. Cell. Biol. 17 (9), 5400-5409 (1997); Shi, H., et al., Genomics 44 (1), 52-60 (1997); Yang, N., et al., J. Biol. Chem. 271 (10), 5795-5804 (1996); Giguere, V., et al., Nature 331 (6151), 91-94 (1988); Eiler, S., et al., Protein Expr. Purif. 22 (2), 165-173 (2001), the teachings of which are incorporated by reference herein.

[0060] The term “nuclear hormone receptors” refers to a large, well-defined family of ligand-activated transcription factors which modify the expression of target genes by binding to specific cis-acting sequences (Laudet et al., 1992, EMBO J, Vol. 1003-1013; Lopes da Silva et al., 1995, TINS 18, 542-548; Mangelsdorf et al., 1995, Cell 83, 835-839; Mangelsdorf et al., 1995, Cell 83, 841-850). Family members include both orphan receptors and receptors for a wide variety of clinically significant ligands including steroids, vitamin D, thyroid hormones, retinoic acid, etc. Additional receptors may be found in the literature (see, for example, The Nuclear Receptor FactsBook; Vincent Laudet (Editor); Elsevier Science & Technology, 2001).

[0061] The term “antibody” as used herein is intended to include whole antibodies, e.g., of any isotype (IgG, IgA, IgM, IgE, etc.), and includes fragments thereof which are also specifically reactive with a vertebrate, e.g., mammalian, protein. Antibodies can be fragmented using conventional techniques and the fragments screened for utility and/or interaction with a specific epitope of interest. Thus, the term includes segments of proteolytically-cleaved or recombinantly-prepared portions of an antibody molecule that are capable of selectively reacting with a certain protein. Non-limiting examples of such proteolytic and/or recombinant fragments include Fab, F(ab')2, Fab', Fv, and single chain antibodies (scFv) containing a VL and/or VH domain joined by a peptide linker. The scFv's may be covalently or non-covalently linked to form antibodies having two or more binding sites. The term antibody also includes polyclonal, monoclonal, or other purified preparations of antibodies and recombinant antibodies.

[0062] The term “recombinant” as used in reference to a nucleic acid indicates any nucleic acid that is positioned adjacent to one or more nucleic acid sequences that it is not found adjacent to in nature. A recombinant nucleic acid may be generated in vitro, for example by using the methods of molecular biology, or in vivo, for example by insertion of a nucleic acid at a novel chromosomal location by homologous or non-homologous recombination. The term “recombinant” as used in reference to a polypeptide indicates any polypeptide that is produced by expression and translation of a recombinant nucleic acid.

[0063] The following terms are used to describe the sequence relationships between two or more polynucleotides: “reference sequence,” “comparison window,” “sequence identity,” “percentage of sequence identity,” and “substantial identity.” A reference sequence is a defined sequence used as a basis for a sequence comparison; a reference sequence can be a subset of a larger sequence, for example, as a segment of a full-length cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA or gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length. Since two polynucleotides can each (1) comprise a sequence (for example a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a “comparison window” to identify and compare local regions of sequence similarity. A comparison window, as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window can comprise additions and deletions (for example, gaps) of 20 percent or less as compared to the reference sequence (which would not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window can be conducted by the local identity algorithm (Smith and Waterman, Adv. Appl. Math. 2:482 (1981)), by the identity alignment algorithm (Needleman and Wunsch, J. Mol. Biol. 48:443 (1970)), by the search for similarity method (Pearson and Lipman, Proc. Natl. Acad. Sci. U.S.A. 85:2444 (1988)), by the computerized implementations of these algorithms such as GAP, BESTFIT, FASTA and TFASTA (Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, Madison, Wis.), or by inspection. Preferably, the best alignment (for example, the result having the highest percentage of identity over the comparison window) generated by the various methods is selected.

[0064] The term “diagnostic” refers to assays that provide results which can be used by one skilled in the art, typically in combination with results from other assays, to determine if an individual is suffering from a disease or disorder of interest such as diabetes, including type I and type II, whereas the term “prognostic” refers to the use of such assays to evaluate the response of an individual having such a disease or disorder to therapeutic or prophylactic treatment. The term “pharmacogenetic” refers to the use of assays to predict which individual patients in a group will best respond to a particular therapeutic or prophylactic composition or treatment.

[0065] Other technical terms used herein have their ordinary meaning in the art in which they are used, as exemplified by a variety of technical dictionaries, such as the McGraw-Hill Dictionary of Chemical Terms and the Stedman’s Medical Dictionary.

III. Methods of Modulating Biological Responses in a Cell

[0066] In one aspect, the invention provides methods of modulating biological responses in a cell. One specific aspect of the invention provides a method of modulating a biological response in a cell, the method comprising contacting the cell with at least one agent that modulates the expression or activity of Errα or Gabp, wherein the biological response is (a) expression of at least one OXPHOS gene; (b) mitochondrial biogenesis; (c) expression of Nuclear Respiratory Factor 1 (NRF-1); (d) β-oxidation of fatty acids; (e) total mitochondrial respiration; (f) uncoupled respiration; (g) mitochondrial DNA replication; (h) expression of mitochondrial enzymes; or (i) skeletal muscle fiber-type switching.

[0067] In one embodiment of the methods described herein, the biological response that is modulated is the expression of at least one OXPHOS gene. OXPHOS genes have been described in Mootha et al., Nat. Genet. 2003; 34(3):267-73, hereby incorporated by reference in its entirety. In one embodiment, the OXPHOS gene is NDUFB3, SDHA, NDUFA8, COX7A1, UQCRC1, NDUFC1, NDUFS2, ATP5O, NDUFS3, SDHB, NDUFS5, NDUFB6, COX5B, CYC1, NDUFA7, UQCRB, COX7B, ATP5L, COX7C, NDUFA5, GRIM19, ATP5J, COX6A2, NDUFB5, CYCS, NDUFA2 or HSPC051.

[0068] In another embodiment of the methods described herein, the biological response that is modulated is mitochondrial biogenesis. U.S. Patent Publication No. 2002/0049176 describes assays for determining mitochondrial mass, volume or number, and is hereby incorporated by reference in its entirety.

[0069] In another embodiment of the methods described herein, the biological response that is modulated is expression of Nuclear Respiratory Factor 1 (NRF-1). NRF-1 is a transcription factor occurring as a homodimer of a 54 kDa polypeptide encoded by the nuclear gene nrf-1 (Evans and Scarpulla, Genes & Development 4:1023-1034 (1990); Scarpulla, J. Bioenergetics and Biomembranes 29:109-119 (1997); Moyes et al., J. Exper. Biol. 201:299-307 (1998)). NRF-1 binds to the upstream promoters of nuclear genes that encode respiratory components associated with mitochondrial transcription and replication. NRF-1 can be any NRF-1, such as rat, mouse or human. NRF-1 nucleotide and polypeptide sequences are described in U.S. Patent Publication No. 2002/0049176, hereby incorporated by reference in its entirety.

[0070] In another embodiment of the methods described herein, the biological response that is modulated is β-oxidation of fatty acids. In another embodiment of the methods described herein, the biological response that is modulated is total mitochondrial respiration. In another embodiment of the methods described herein, the biological response that is modulated is uncoupled respiration. Uncoupled respiration occurs when electron transport is uncoupled from ATP synthesis.

[0071] In another embodiment of the methods described herein, the biological response that is modulated is mitochondrial DNA replication. Quantification of mitochondrial DNA (mtDNA) content may be accomplished by one with routine skill in the art using any of a variety of established techniques that are useful for this purpose, including but not limited to, oligonucleotide probe hybridization or polymerase chain reaction (PCR) using oligonucleotide primers specific for mitochondrial DNA sequences (see, e.g., Miller et al., 1996 J. Neurochem. 67:1897; Fahy et al., 1997 Nucl. Acids Res. 25:3102; U.S. patent application Ser. No. 09/098,079; Lee et al., 1998 Diabetes Res. Clin. Practice 42:161; Lee et al., 1997 Diabetes 46(suppl. 1):175A). A particularly useful method is the primer extension assay disclosed by Fahy et al. (Nucl. Acids Res. 25:3102, 1997) and by Ghosh
et al. (Am. J. Hum. Genet. 58:325, 1996). Suitable hybridization conditions may be found in the cited references or may be varied according to the particular nucleic acid target and oligonucleotide probe selected, using methodologies well known to those having ordinary skill in the art (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987; Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, 1989).

[0072] In another embodiment of the methods described herein, the biological response that is modulated is expression of mitochondrial enzymes. In one embodiment, mitochondrial enzymes are Electron Transport Chain (ETC) enzymes. An ETC enzyme refers to any mitochondrial molecular component that is a mitochondrial enzyme component of the mitochondrial electron transport chain (ETC) complex associated with the inner mitochondrial membrane and mitochondrial matrix. An ETC enzyme may include any of the multiple ETC subunit polypeptides encoded by mitochondrial and nuclear genes. The ETC is typically described as comprising complex I (NADH:ubiquinone reductase), complex II (succinate dehydrogenase), complex III (ubiquinone:cytochrome c oxidoreductase), complex IV (cytochrome c oxidase) and complex V (mitochondrial ATP synthetase), where each complex includes multiple polypeptides and cofactors (for review see, e.g., Walker et al., 1995 Meths. Enzymol. 260:14; Ernster et al., 1981 J. Cell Biol. 91:227s-255s, and references cited therein). A mitochondrial enzyme of the present invention may also comprise a Krebs cycle enzyme, which includes mitochondrial molecular components that mediate the series of biochemical/bioenergetic reactions also known as the citric acid cycle or the tricarboxylic acid cycle (see, e.g., Lehninger, Biochemistry, 1975 Worth Publishers, NY; Voet and Voet, Biochemistry, 1990 John Wiley & Sons, NY; Mathews and van Holde, Biochemistry, 1990 Benjamin Cummings, Menlo Park, Calif.). Krebs cycle enzymes include subunits and cofactors of citrate synthase, aconitase, isocitrate dehydrogenase, the α-ketoglutarate dehydrogenase complex, succinyl CoA synthetase, succinate dehydrogenase, fumarase and malate dehydrogenase. Krebs cycle enzymes further include enzymes and cofactors that are functionally linked to the reactions of the Krebs cycle, such as, for example, nicotinamide adenine dinucleotide, coenzyme A, thiamine pyrophosphate, lipoamide, guanosine diphosphate, flavin adenine dinucleotide and nucleoside diphosphokinase.

[0073] In another embodiment of the methods described herein, the biological response that is modulated is skeletal muscle fiber-type switching, that is, a shift towards type I oxidative skeletal muscle fibers. International PCT Application WO 03/068944 describes skeletal muscle fiber-type switching. In some embodiments, the agent increases at least one of the biological responses. In alternate embodiments, the agent decreases at least one of the biological responses.

[0074] The methods described herein for modulating a biological activity in a cell may be applied to any type of cell. In specific embodiments, the cell is a skeletal muscle cell, a smooth muscle cell, a cardiac muscle cell, a hepatocyte, an adipocyte, a neuronal cell, or a pancreatic cell. The cell may be a primary cell, a cell derived from a cell line, or a cell which has differentiated in vitro, such as a differentiated cell obtained through manipulation of a stem cell. In some embodiments, the cell is in an organism, while in other embodiments the cell is manipulated ex vivo, such as in cell or tissue culture. The methods described herein also apply to groups of cells, such as to whole tissues or organs. In some embodiments, the organism is a mammal, such as a mouse, rat, an ungulate, a horse, a dog or a human.

[0075] In some embodiments, the human is afflicted with, at risk of developing, or suspected of being afflicted with a disorder. In some embodiments, the disorder comprises a metabolic disorder, a disorder characterized by altered mitochondrial activity, a disorder characterized by sugar intolerance, or a combination thereof. In specific embodiments of the methods described herein, the disorder is diabetes, obesity, cardiac myopathy, aging, coronary atherosclerotic heart disease, diabetes mellitus, Alzheimer's Disease, Parkinson's Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy (LHON), schizophrenia, myodegenerative disorders such as “mitochondrial encephalopathy, lactic acidosis, and stroke” (MELAS) and “myoclonic epilepsy ragged red fiber syndrome” (MERRF), NARP (Neuropathy; Ataxia; Retinitis Pigmentosa), MNGIE (Myopathy and external ophthalmoplegia; Neuropathy; Gastro-Intestinal; Encephalopathy), Kearns-Sayre disease, Pearson's Syndrome, PEO (Progressive External Ophthalmoplegia), congenital muscular dystrophy with mitochondrial structural abnormalities, Wolfram syndrome, Diabetes Insipidus, Diabetes Mellitus, Optic Atrophy Deafness, Leigh's Syndrome, fatal infantile myopathy with severe mitochondrial DNA (mtDNA) depletion, benign “later-onset” myopathy with moderate reduction in mtDNA, dystonia, medium chain acyl-CoA dehydrogenase deficiency, arthritis, mitochondrial diabetes and deafness (MIDD), and mitochondrial DNA depletion syndrome.

[0076] In one embodiment of the methods for modulating biological responses in a cell described herein, the agent modulates the formation of a complex between a PGC-1 polypeptide and (i) an Errα polypeptide; or (ii) a Gabp polypeptide. The agent may be an agent which increases formation of the complex in the cell, or it may be an agent that reduces formation of the complex in the cell. In embodiments where the agent increases a biological activity of the cell, the agent increases complex formation, whereas in embodiments where a biological activity is to be decreased, complex formation is decreased. One skilled in the art would recognize that complex formation, as used herein, refers to the normal association between the polypeptides which results in the transcriptional activation of target genes by the complex. Therefore, an agent which resulted in an aberrant aggregation of PGC-1α and Errα polypeptides, wherein the resulting complex has reduced transcriptional activating activity, would not result in increased biological activity but instead in less. Likewise, an agent which increased complex formation, but where the resulting complex was degraded in the cell, would result in less biological activity in the cell. Accordingly, in some specific embodiments for reducing biological activity, the agent results in increased complex formation, wherein the complex has reduced transcriptional activity or stability.

[0077] In one embodiment of the methods for modulating biological responses in a cell described herein, the agent modulates the expression level or the transcriptional activity of an Errα polypeptide, a Gabp polypeptide, or of both. The agent may comprise a polypeptide, a nucleic acid, or a chemical compound. In one embodiment of the methods for modulating biological responses in a cell described herein,
the agent is itself an Errα polypeptide or a fragment thereof, or a Gabp polypeptide or a fragment thereof, or a nucleic acid encoding such polypeptides or fragments thereof.

[0078] In some embodiments of the methods for increasing biological responses in a cell described herein, the agent increases complex formation between a PGC-1 polypeptide and an Errα polypeptide. In preferred embodiments, the agent is specific for the complex formation between a PGC-1 polypeptide and an Errα polypeptide. In a preferred embodiment, the agent increases Errα activity by preferentially promoting complex formation between a PGC-1 polypeptide and an Errα polypeptide over complex formation between a PGC-1 polypeptide and at least one other polypeptide to which PGC-1 normally binds in an organism. Polypeptides to which PGC-1 normally binds in an organism include the following: nearly all nuclear receptors (e.g., PPAR-gamma, PPAR-alpha, thyroid hormone receptor, HNF4α, etc.) as well as other transcription factors, such as NRF1, NFAT, etc. (see Puigserver and Spiegelman, Endocr Rev. 2003; 24(1):78-90).

[0079] In another preferred embodiment, the agent increases Errα activity by preferentially promoting complex formation between a PGC-1 polypeptide and an Errα polypeptide over complex formation between a PGC-1 polypeptide and another nuclear receptor. In some embodiments, an agent which increases complex formation between a PGC-1 polypeptide and Errα does so at least 2, 5, 10, 20, 40, 50, 100, 200, 500, 1000, 5000, 10,000, 50,000 or 100,000-fold more potently than it increases complex formation between the same PGC-1 polypeptide and (i) at least one other polypeptide to which PGC-1 normally binds in an organism; or (ii) a nuclear receptor; or (iii) both. The fold-level of potency may be determined by measuring the association constant, the dissociation constant, or more preferably the K of the agent for the various complexes.

[0080] In parallel embodiments of the methods for inhibiting a biological response in a cell described herein, the agent preferentially inhibits complex formation between a PGC-1 polypeptide and an Errα polypeptide over complex formation between a PGC-1 polypeptide and another nuclear receptor. In some embodiments, an agent which decreases complex formation between a PGC-1 polypeptide and an Errα polypeptide does so at least 2, 5, 10, 20, 40, 50, 100, 200, 500, 1000, 5000, 10,000, 50,000 or 100,000-fold more potently than it decreases complex formation between the same PGC-1 polypeptide and (i) at least one other polypeptide to which PGC-1 normally binds in an organism; or (ii) a nuclear receptor; or (iii) both. In other embodiments, the IC50 for disrupting the interaction between a PGC-1 polypeptide and an Errα polypeptide is 2, 5, 10, 20, 40, 50, 100, 200, 500, 1000, 5000, 10,000, 50,000 or 100,000-fold lower than that for disrupting the interaction between a PGC-1 polypeptide and (i) at least one other polypeptide to which PGC-1 normally binds in an organism; or (ii) a nuclear hormone receptor.

[0081] In other embodiments of the methods described herein for modulating biological responses in a cell, a Gabp polypeptide may replace the Errα polypeptide. For example, instead of using an agent that modulates the interaction between a PGC-1 polypeptide and an Errα polypeptide, an agent is used that modulates the interaction between a PGC-1 polypeptide and a Gabp polypeptide. Thus all variations of the methods described herein for modulating biological responses in a cell using an Errα polypeptide may be applied to a Gabp polypeptide, such as a Gabpa polypeptide.

[0082] In another embodiment of the methods described herein for modulating biological responses in a cell, the cell is contacted with two agents, wherein one agent modulates the expression or activity of Errα and the other agent modulates the expression or activity of a Gabp polypeptide, such as a Gabpa polypeptide. In another embodiment, the cell is contacted with one agent which modulates the expression or activity of both Errα and of a Gabp polypeptide.

IV. Methods of Preventing/Treating Disease

[0083] Some aspects of the invention provide methods of treating or preventing a disorder. Some aspects provide methods of preventing disorders which are associated with glucose intolerance, excess glucose production, insulin resistance, aberrant metabolism or abnormal mitochondrial function.

[0084] The invention further provides agents for the manufacture of medicaments to treat any of the disorders described herein. Any methods disclosed herein for treating or preventing a disorder by administering an agent to a subject may be applied to the use of the agent in the manufacture of a medicament to treat that disorder. For example, in one specific embodiment, an Errα agonist may be used in the manufacture of a medicament for the treatment of a disorder characterized by low mitochondrial function or by sugar intolerance, such as diabetes.

[0085] One aspect of the invention provides a method of treating or preventing a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising administering to the subject a therapeutically effective amount of an agent which (i) increases the expression or activity of Errα or Gabp or both; or (ii) increases the formation of a complex between a PGC-1 polypeptide and (a) an Errα polypeptide; (b) a Gabp polypeptide; or both; or (iii) binds to (a) an Errα binding site, or to (b) a Gabpa binding site, and which increases transcription of at least one gene in the subject, said gene having an Errα binding site, a Gabpa binding site, or both.

[0086] In one embodiment, the agent which binds to (a) an Errα binding site, or to (b) a Gabp binding site, comprises at least one DNA binding domain. In a further embodiment, the DNA binding domain comprises at least one zinc-finger. In some embodiments, such agents comprise a DNA binding domain and a transactivation domain. Methods are known in the art for designing transcriptional activators or repressors which bind to specific DNA sequences, including those disclosed in U.S. Pat. Nos. 6,607,882, 6,453,242 and 6,511,808.

[0087] In one embodiment, the disorder is type 2 diabetes mellitus. In one embodiment of any of the methods described herein, a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance is diabetes, obesity, cardiac myopathy, aging, coronary atherosclerotic heart disease, diabetes mellitus, Alzheimer's Disease, Parkinson's Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy (LHON), schizophrenia, myodegenerative disorders such as “mitochondrial
encephalopathy, lactic acidosis, and stroke" (MELAS) and "myoclonic epilepsy ragged red fiber syndrome" (MERRF), NARP (Neuropathy; Ataxia; Retinitis Pigmentosa), MNGIE (Myopathy and external ophthalmoplegia, neuropathy; gastro-intestinal encephalopathy), Kearns-Sayre disease, Pearson's Syndrome, PEO (Progressive External Ophthalmoplegia), congenital muscular dystrophy with mitochondrial structural abnormalities, Wolfram syndrome, Diabetes Insipidus, Diabetes Mellitus, Optic Atrophy Deafness, Leigh's Syndrome, fatal infantile myopathy with severe mitochondrial DNA (mtDNA) depletion, benign "later-onset" myopathy with moderate reduction in mtDNA, dystonia, medium chain acyl-CoA dehydrogenase deficiency, arthritis, and mitochondrial diabetes and deafness (MIDD), mitochondrial DNA depletion syndrome.

[0088] The invention further provides a method of treating or preventing a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising administering to the subject a therapeutically effective amount of an agent which increases the expression or activity of a gene, wherein the gene has an Errα binding site or a Gabpa binding site.

[0089] In one preferred embodiment of this method, the gene has both an Errα binding site and a Gabpa binding site. In one embodiment, the Errα binding site comprises the sequence 5'-TGACCTTG-3' or the sequence 5'-CAAGGTCA-3'. In one embodiment, the Gabpa binding site comprises the sequence 5'-CTTCCG-3' or 5'-CGGAAG-3'. It is well known by one of routine skill in the art that transcriptional factors may have non-optimal binding sites to which they may bind in vivo or in vitro with substantially the same binding affinity as their optimal binding sites. Accordingly, in some embodiments, an Errα binding site comprises any sequence that, when operably bound to a promoter, allows transcriptional control of the promoter by Errα. In another embodiment, an Errα binding site comprises any sequence that may be bound by an Errα polypeptide with high affinity, such as with a Kd that is less than at least about 10⁻⁵ M, about 10⁻⁶ M, about 10⁻⁷ M, about 10⁻⁸ M, about 10⁻⁹ M, about 10⁻¹⁰ M, about 10⁻¹¹ M, or about 10⁻¹² M. Likewise, in some embodiments, a Gabpa binding site comprises any sequence that, when operably bound to a promoter, allows transcriptional control of the promoter by Gabpa. In another embodiment, a Gabpa binding site comprises any sequence that may be bound by a Gabpa polypeptide with high affinity, such as with a Kd that is less than at least about 10⁻⁵ M, about 10⁻⁶ M, about 10⁻⁷ M, about 10⁻⁸ M, about 10⁻⁹ M, about 10⁻¹⁰ M, about 10⁻¹¹ M, or about 10⁻¹² M. In some embodiments, an Errα binding site comprises a sequence which is about 50%, 62.5%, 75%, or 87.5% identical to either 5'-TGACCTTG-3' or to 5'-CAAGGTCA-3'. In some embodiments, a Gabpa binding site comprises a sequence which is about 50%, 66.6%, or 83.3% identical to either 5'-CTTCCG-3' or 5'-CGGAAG-3'.

[0090] In another embodiment of any of the methods described herein, a gene which has an Errα binding site is any one of the genes listed on Table 10, a gene which has a Gabpa binding site is any one of the genes on Table 11, and a gene having both an Errα and a Gabpa binding site is any one of the genes listed on Table 12.

[0091] In yet another embodiment of this method, the binding sites are located within the promoter region of the gene. In one embodiment, the promoter region comprises from at least 0.5, 1, 1.5, 2, 2.5, 3, 4, 5 or 10 kb upstream of the transcriptional start site of the gene to at least either (i) 0.5, 1, 1.5, 2, 2.5, 3, 4, 5 or 10 kb downstream of the transcriptional start site of the gene; or (ii) 0.5, 1, 1.5, 2, 2.5, 3, 4, 5 or 10 kb downstream of the stop codon of the gene. In yet another embodiment of this method, the promoter region comprises a masked promoter region. A masked promoter region comprises the regions of promoters that are conserved between two organisms. For example, a masked promoter region may comprise the promoter sequences which are conserved between human and another mammal, such as a mouse. By sequences that are conserved, it is meant sequences which share at least 70% sequence identity between the two species across a window size of at least 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, or 50 nucleotides, or more preferably a window of 10 nucleotides.

[0092] In another embodiment, the binding sites are located within the promoter region, the coding region, the exons, the introns, or the untranslated region of the gene, or a combination thereof.

[0093] In yet another specific embodiment of the method, the gene having an Errα binding site or a Gabpa binding site is not Errα, while in another embodiment, the gene is not Gabpa. The agent which increases the activity or expression of a specific gene may be selected by one skilled in the art according to the type of protein that is encoded. For example, if the gene encodes an enzyme, then enzyme activators are expected to increase the activity of the enzyme. Likewise, if the gene encodes a receptor, a receptor agonist may be administered. Such an agonist may comprise small organic molecules, such as those having less than 1 kDa in mass, or may comprise an antibody that binds to the gene product and increases its activity. For any gene, an agent which increases the activity of the gene may comprise a polypeptide encoded by the gene itself, or a nucleic acid containing the gene or an active fragment thereof.

[0094] In one embodiment of the methods described herein, reduced mitochondrial function comprises reduced total mitochondrial respiration, reduced uncoupled respiration, reduced expression of mitochondrial enzymes, reduced mitochondrial biogenesis or a combination thereof. In some embodiments of the methods for preventing or treating a disorder in a subject, at least one of the agents increases the expression or activity of Errα, of a Gabp polypeptide, or of both. In another embodiment, the agent promotes the expression or activity of a binding partner of PGC-1α or of PGC-1β. In yet another embodiment, the agent promotes the binding of PGC-1α to a transcriptional regulator. In some embodiments, the transcriptional regulator is Errα or Gabpa. In one preferred embodiment, the agent induces mitochondrial activity in skeletal muscle.

[0095] Another aspect of the invention provides a method of treating impaired glucose tolerance in an individual, comprising administering to the individual a therapeutically effective amount of an agent which increases the expression level of at least two OXPHOS-CR genes, thereby treating impaired glucose tolerance in the individual. Another aspect of the invention provides a method of treating obesity in an individual, comprising administering to the individual a therapeutically effective amount of an agent which increases the expression level of at least two OXPHOS-CR genes,
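The binding-site definitions in paragraph [0089] (each core sequence, its reverse complement, and the stated percent-identity levels) can be illustrated with a minimal Python sketch. The function names and the example promoter string are hypothetical and are not part of the disclosure; the sketch only checks ungapped windows on both strands and makes no claim about actual binding affinity.

```python
# Illustrative sketch only: names and the example promoter are hypothetical.
ERRA_SITE = "TGACCTTG"   # Errα binding site given in the text (5'->3')
GABPA_SITE = "CTTCCG"    # Gabpa binding site given in the text (5'->3')

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def percent_identity(candidate: str, motif: str) -> float:
    """Ungapped percent identity between two sequences of equal length."""
    if len(candidate) != len(motif):
        raise ValueError("candidate and motif must have the same length")
    matches = sum(a == b for a, b in zip(candidate.upper(), motif.upper()))
    return 100.0 * matches / len(motif)


def find_sites(promoter: str, motif: str, min_identity: float = 100.0):
    """List (position, strand, identity) for every window that meets the
    identity threshold against the motif ('+') or its reverse complement ('-')."""
    promoter = promoter.upper()
    targets = (("+", motif), ("-", reverse_complement(motif)))
    hits = []
    for i in range(len(promoter) - len(motif) + 1):
        window = promoter[i:i + len(motif)]
        for strand, target in targets:
            identity = percent_identity(window, target)
            if identity >= min_identity:
                hits.append((i, strand, identity))
    return hits
```

For an 8-nucleotide site, the 50%, 62.5%, 75% and 87.5% levels correspond to 4, 5, 6 and 7 matching positions; for a 6-nucleotide site, 50%, 66.6% and 83.3% correspond to 3, 4 and 5 matching positions. Lowering `min_identity` to one of these values relaxes the scan accordingly.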

thereby treating obesity in the individual. In preferred embodiments, the expression level of the OXPHOS-CR genes is increased in the skeletal muscle cells of the subject by at least 10%, 20%, 30%, 40%, 50% or 75%.

[0096] Another aspect of the invention provides methods of treating or preventing disorders characterized by an elevated metabolic rate in a subject and methods of lowering a metabolic rate in a subject. The invention provides a method of reducing the metabolic rate of a subject in need thereof, the method comprising administering to the subject a therapeutically effective amount of an agent which decreases the expression or activity of at least one of the following: (i) Errα; (ii) Gabpa; (iii) a gene having an Errα binding site, a Gabpa binding site, or both; or (iv) a transcriptional activator which binds to an Errα binding site or to a Gabpa binding site; thereby reducing the metabolic rate of the patient.

[0097] In some embodiments of the methods provided for reducing the metabolic rate of a subject in need thereof, the subject is afflicted with an infection, such as a viral infection. In one specific embodiment, the viral infection is a human immunodeficiency virus infection.

[0098] In another embodiment of methods for reducing metabolic rates, the subject is afflicted with cancer or with cachexia. Cachexia is a metabolic condition characterized by weight loss and muscle wasting. It is associated with a wide range of conditions including inflammation, heart failure and malignancies, and is well known and described in the clinical literature [e.g., J. Natl. Cancer Inst. 89(23): 1763-1773 (1997)]. The mechanistic derangements underlying cachexia are not known, but it is clear that a negative energy balance obtains in the face of severe weight loss. In specific embodiments, the subject is afflicted with cancer cachexia, pulmonary cachexia, Russell's Diencephalic Cachexia, cardiac cachexia or chronic renal insufficiency.

[0099] In some embodiments of the methods provided for reducing the metabolic rate of a subject in need thereof, the agent decreases the formation of a complex between a PGC-1 polypeptide and (i) an Errα polypeptide; or (ii) a Gabp polypeptide. In preferred embodiments, the PGC-1 polypeptide is a PGC-1α polypeptide. In another embodiment, the agent decreases the expression level or the transcriptional activity of an Errα polypeptide, a Gabp polypeptide, or of both, while in additional embodiments the agent inhibits the expression or activity of a gene which has an Errα binding site, a Gabpa binding site, or both. In some embodiments, the agents comprise double stranded RNA reagents, dominant negative polypeptides or nucleic acids encoding them, or antibodies directed to Errα, Gabpa, Gabpb, or to genes (or their gene products) which have an Errα binding site, a Gabpa binding site, or both, such as binding sites in their promoter regions.

[0100] U.S. Pat. No. 5,602,009 describes a method of generating inhibitory nuclear hormone receptors. Such methods may be applied to Errα or to Gabp to generate polypeptides, or nucleic acids which encode them, which may be used as agents in the methods described herein for reducing the metabolic rate of a subject.

V. Methods of Diagnosing/Identifying Disease Genes

[0101] One aspect of the invention provides methods of identifying susceptibility loci for a disorder characterized by reduced mitochondrial function or reduced metabolism. The identification of these loci allows for the diagnosis of the disorders and for the design or screening of agents for the treatment of these disorders.

[0102] The invention provides a method of identifying a susceptibility locus for a disorder that is characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising (i) identifying at least one polymorphism in a gene, or linked to a gene, wherein the gene (a) has an Errα binding site, a Gabpa binding site, or both; or (b) is Errα, Gabpa, or Gabpb; (ii) determining if at least one polymorphism is associated with the incidence of the disorder, wherein if a polymorphism is associated with the incidence of the disorder then the gene having the polymorphism, or the gene to which the polymorphism is linked, is a susceptibility locus.

[0103] In one embodiment of the methods described herein for identifying a susceptibility locus for a disorder, the gene is any one of the genes listed on Tables 10-12.

[0104] As used herein, the term "polymorphism" refers to the co-existence, within a population, of more than one form of a gene or portion thereof (e.g. allelic variant), at a frequency too high to be explained by recurrent mutation alone. A portion of a gene of which there are at least two different forms, i.e. two different nucleotide sequences, is referred to as a "polymorphic region of a gene". A specific genetic sequence at a polymorphic region of a gene is an allele.

[0105] A polymorphic region can be a single nucleotide or more than one nucleotide, the identity of which differs in different alleles. A polymorphic region can be a restriction fragment length polymorphism (RFLP). An RFLP refers to a variation in DNA sequence that alters the length of a restriction fragment as described in Botstein et al., Am. J. Hum. Genet. 32, 314-331 (1980). The RFLP may create or delete a restriction site, thus changing the length of the restriction fragment. RFLPs have been widely used in human and animal genetic analyses (see WO 90/13668; WO 90/11369; Donis-Keller, Cell 51, 319-337 (1987); Lander et al., Genetics 121, 85-99 (1989)). When a heritable trait can be linked to a particular RFLP, the presence of the RFLP in an individual can be used to predict the likelihood that the individual will also exhibit the trait.

[0106] Other polymorphisms take the form of short tandem repeats (STRs) that include tandem di-, tri- and tetra-nucleotide repeated motifs. These tandem repeats are also referred to as variable number tandem repeat (VNTR) polymorphisms. VNTRs have been used in identity and paternity analysis (U.S. Pat. No. 5,075,217; Armour et al., FEBS Lett. 307, 113-115 (1992); Horn et al., WO 91/14003; Jeffreys, EP 370,719), and in a large number of genetic mapping studies.

[0107] Other polymorphisms take the form of single nucleotide variations between individuals of the same species. Such single nucleotide variations may arise due to substitution of one nucleotide for another at the polymorphic site or from a deletion of a nucleotide or an insertion of a nucleotide relative to a referenced allele. These single nucleotide variations are referred to herein as single nucleotide polymorphisms (SNPs). Such SNPs are far more frequent than RFLPs, STRs and VNTRs. Some SNPs may occur in
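Step (ii) of paragraph [0102], determining whether a polymorphism is associated with the incidence of the disorder, can be illustrated with a minimal Python sketch based on a 2x2 table of allele counts in affected and unaffected individuals. The choice of a Pearson chi-square test, the function names, and the 0.05 significance level are assumptions made for illustration only; the text does not prescribe a particular statistic, and a real analysis would also need to address multiple testing and population structure.

```python
# Illustrative sketch only: the test choice, names and threshold are assumptions.

def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = cases/controls, columns = allele 1/allele 2."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("each row and column must have a nonzero total")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def is_associated(case_alt: int, case_ref: int,
                  ctrl_alt: int, ctrl_ref: int,
                  critical_value: float = 3.841) -> bool:
    """True if allele counts differ between cases and controls at p < 0.05.
    3.841 is the chi-square critical value for 1 degree of freedom at 0.05."""
    return chi_square_2x2(case_alt, case_ref, ctrl_alt, ctrl_ref) > critical_value
```

With hypothetical counts of 60 variant and 40 reference alleles in cases against 40 and 60 in controls, the statistic is 8.0 and the polymorphism would be flagged; identical counts in both groups give a statistic of 0.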

protein-coding sequences, in which case one of the polymorphic forms may give rise to the expression of a defective protein and, potentially, a genetic disease. Other SNPs may occur in noncoding regions. Some of these polymorphisms may also result in defective protein expression (e.g. as a result of defective splicing). Other SNPs may have no phenotypic effects.

[0108] Techniques for determining the presence of particular alleles would be those known to persons skilled in the art and include, but are not limited to, nucleic acid techniques based on size or sequence, such as restriction fragment length polymorphism (RFLP), nucleic acid sequencing, or nucleic acid hybridization. The nucleic acid tested may be RNA or DNA. These techniques may also comprise the step of amplifying the nucleic acid before analysis. Amplification techniques are known to those of skill in the art and include, but are not limited to, cloning, polymerase chain reaction (PCR), polymerase chain reaction of specific alleles (PASA), polymerase chain ligation, nested polymerase chain reaction, and the like. Amplification products may be assayed in a variety of ways, including size analysis, restriction digestion followed by size analysis, detecting specific tagged oligonucleotide primers in the reaction products, allele-specific oligonucleotide (ASO) hybridization, allele specific exonuclease detection, sequencing, hybridization and the like. Polymorphic variations leading to altered protein sequences or structures may also be detected by analysis of the protein itself. Additional methods for the detection of polymorphisms are described in U.S. Pat. No. 6,453,244 and in International PCT publications No. WO 04/011668, WO 03/048384, WO 01/20031 and WO 03/038125, the teachings of which are hereby incorporated by reference.

[0109] General methods are available to one skilled in the art for determining if a particular allele is associated with the incidence of the disorder, such as those described in Analysis of Human Genetic Linkage, by Jurg Ott, Johns Hopkins University Press, 1999; and Statistical Genomics: Linkage, Mapping, and QTL Analysis, by Ben Hui Liu, CRC Press, 1997.

[0110] The invention also provides a related method for determining if a subject is at risk of developing a disorder which is characterized by reduced mitochondrial function, the method comprising determining if a gene from the subject contains a mutation which reduces the function of the gene, wherein the gene has an Errα binding site, a Gabpa binding site, or both, wherein if a gene from the subject contains a mutation then the subject is at risk of developing the disorder.

[0111] In one embodiment of this method, the mutation reduces the function of the gene. In another embodiment, the disorder is diabetes, obesity, premature aging, cardiomyopathy, a neurodegenerative disease, or retinal degeneration. In further embodiments, the gene is any one of the genes on Tables 10-12.

[0112] The proposed role of the candidate genes/proteins can be validated by traditional overexpression or knockout approaches to ascertain the effects of such manipulations on mitochondrial biogenesis in the engineered cell lines. This approach ultimately identifies additional molecules whose expression or activity can be modulated to enhance mitochondrial function. For example, cultured skeletal muscle cells may be used with electrical stimulation or thyroid hormone as the stimulus for mitochondrial biogenesis. Alternatively, a fat cell culture such as 3T3-L1 cells may be used, with norepinephrine providing the stimulus for mitochondrial biogenesis. Alternatively, cultured cells such as HeLa or HEK293 that express PGC-1 and/or NRF-1 under a tetracycline inducible system may be used, wherein induced expression of PGC-1 and/or NRF-1 stimulates mitochondrial biogenesis. After sufficient time with the appropriate stimulus to allow induction (1-2 days), the cells are incubated with ³²P orthophosphate for 4 hrs. Cells are then harvested and subjected to SDS-PAGE to resolve the labeled proteins. Using these systems, the function of a candidate disease gene may be altered, such as through overexpression, expression of dominant negative forms of the proteins, inhibitory RNAi reagents, antibodies, and the like, and the effects on mitochondrial biogenesis or function determined.

VI. Methods of Identifying Therapeutic Agents

[0113] One aspect of the invention provides methods of identifying agents which modulate biological responses in a cell, which modulate expression of the OXPHOS-CR genes or which prevent or treat a disorder.

[0114] One aspect of the invention provides a method of determining if an agent is a potential agent for the treatment of a disorder that is characterized by glucose intolerance, insulin resistance or reduced mitochondrial function, the method comprising determining if the agent increases: (i) the expression or activity of Errα or Gabp in a cell; or (ii) the formation of a complex between a PGC-1 polypeptide and (a) an Errα polypeptide; or (b) a Gabp polypeptide; wherein an agent that increases (i) or (ii) is a potential agent for the treatment of the disorder.

[0115] In some embodiments of the methods described herein for determining if an agent is a potential agent for the treatment of a disorder, the disorder is diabetes, obesity, cardiac myopathy, aging, coronary atherosclerotic heart disease, diabetes mellitus, Alzheimer's Disease, Parkinson's Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy (LHON), schizophrenia, myodegenerative disorders such as "mitochondrial encephalopathy, lactic acidosis, and stroke" (MELAS) and "myoclonic epilepsy ragged red fiber syndrome" (MERRF), NARP (Neuropathy; Ataxia; Retinitis Pigmentosa), MNGIE (Myopathy and external ophthalmoplegia, neuropathy; gastrointestinal encephalopathy), Kearns-Sayre disease, Pearson's Syndrome, PEO (Progressive External Ophthalmoplegia), congenital muscular dystrophy with mitochondrial structural abnormalities, Wolfram syndrome, Diabetes Insipidus, Diabetes Mellitus, Optic Atrophy Deafness, Leigh's Syndrome, fatal infantile myopathy with severe mitochondrial DNA (mtDNA) depletion, benign "later-onset" myopathy with moderate reduction in mtDNA, medium chain acyl-CoA dehydrogenase deficiency, dystonia, arthritis, and mitochondrial diabetes and deafness (MIDD) or mitochondrial DNA depletion.

[0116] Any general method known to one skilled in the art may be applied to determine if an agent increases the expression or activity of Errα or Gabp. In one specific embodiment for determining if an agent increases the expression of Errα or Gabp, a cell is contacted with an agent, and an indicator of gene expression, such as mRNA level or protein level, is determined. Levels of mRNA may be determined, for example, using such techniques as Northern blots, reverse-transcriptase polymerase chain reaction (RT-PCR), RNA protection assays or a DNA microarray comprising probes capable of detecting Errα or Gabp mRNA or cDNA molecules. Likewise, protein levels may be quantitated using techniques well-known in the art, such as western blotting, immuno-sandwich assays, ELISA assays, or any other immunological technique. Techniques for quantitating nucleic acids and proteins may be found, for example, in Molecular Cloning: A Laboratory Manual, 3rd Ed., ed. by Sambrook and Russell (Cold Spring Harbor Laboratory Press: 2001); and in Current Protocols in Cell Biology, ed. by Bonifacino, Dasso, Lippincott-Schwartz, Harford, and Yamada, John Wiley and Sons, Inc., New York, 1999, hereby incorporated by reference in their entirety.

[0117] In one example, an RC cell culture system can be used to identify compounds which activate production of ERRα or, once ERRα production has been activated in the cells, can be used to identify compounds which lead to suppression or switching off of ERRα production. Alternatively, such a cell culture system can be used to identify compounds or binding partners of ERRα which increase its expression. Compounds thus identified are useful as therapeutics in conditions where ERRα production is deficient or excessive. Similar experiments may be carried out with Gabpa or Gabpb or both.

[0118] Likewise, any general method known to one skilled in the art may be applied to determining if an agent increases the activity of Errα or Gabp. Activities of Errα or Gabp include their ability to bind to DNA, their ability to bind to other transcriptional regulators or their ability to promote transcription of target genes. In one embodiment, candidate agents are tested for their ability to modulate ERRα activity by (a) providing a system for measuring a biological activity of ERRα; and (b) measuring the biological activity of ERRα in the presence or absence of the candidate compound, wherein a change in ERRα activity in the presence of the compound relative to ERRα activity in the absence of the compound indicates an ability to modulate ERRα activity. In specific embodiments, the biological activity is the ability of Errα to bind the promoter of a target gene, such as the promoter of medium chain acyl-CoA dehydrogenase (MCAD), which may be determined using chromatin immunoprecipitation and analysis of the DNA bound to the Errα polypeptide. In another embodiment, the biological activity is the ability of Errα to complex with PGC-1α or PGC-1β, which may be measured by immunoprecipitation of either Errα or a PGC-1 polypeptide and determining the presence of the other protein by western blotting. In another embodiment, the biological activity is promoting transcription of a target gene. An indicator of gene expression for a target gene whose transcription is regulated by Errα or by Gabp can be compared between cells which have or have not been contacted with the agent. In specific embodiments, PGC-1α or PGC-1β is also present when testing whether an agent modulates the transcriptional activating activity of Errα or Gabp polypeptides. Target genes which may be used include those which contain either an Errα or a Gabp binding site, such as OXPHOS genes or those provided by the invention. Because Gabpa and Gabpb form a complex, in some preferred embodiments both proteins, or nucleic acids encoding them, are present in the assay systems described herein.

[0119] One particular embodiment for identifying agents which modulate activity of Errα employs two genetic constructs. One is typically a plasmid that continuously expresses the transcriptional regulator of interest when transfected into an appropriate cell line. The second is a plasmid which expresses a reporter, e.g., luciferase, under control of the transcriptional regulator. For example, if a compound which acts as a ligand for Errα is to be evaluated, one of the plasmids would be a construct that results in expression of Errα in the cell line. The second would possess a promoter linked to the luciferase gene in which an Errα response element is inserted. If the compound to be tested is an agonist for the Errα receptor, the ligand will complex with the receptor and the resulting complex binds the response element and initiates transcription of the luciferase gene. In time the cells are lysed and a substrate for luciferase added. The resulting chemiluminescence is measured photometrically. Dose response curves are obtained and can be compared to the activity of known ligands. Reporters other than luciferase can be used, including CAT and other enzymes. In one specific embodiment of this approach, the cells further express PGC-1α or PGC-1β, either endogenously or by introduction of a third plasmid encoding said polypeptides. The presence of PGC-1 polypeptides in the cell further allows for the identification of agents which increase or decrease the binding interaction between a PGC-1 polypeptide and Errα. This approach may also be modified to express both Gabpa and Gabpb to identify agents which modulate their transcriptional activity. Alternatively, a cell may be used which endogenously expresses any combination of polypeptides, such that only a plasmid encoding a reporter gene is introduced into the cell.

[0120] Viral constructs can be used to introduce the gene for Errα, Gabp or PGC-1 and the reporter into a cell. A usual viral vector is an adenovirus. For further details concerning this preferred assay, see U.S. Pat. No. 4,981,784 issued Jan. 1, 1991, hereby incorporated by reference, and Evans et al., WO 88/03168 published on 5 May 1988, also incorporated by reference.

[0121] Errα antagonists can be identified using this same basic "agonist" assay. A fixed amount of an agonist is added to the cells with varying amounts of test compound to generate a dose response curve. If the compound is an antagonist, expression of luciferase is suppressed.

[0122] Additional methods for the isolation of agonists and antagonists of transcriptional regulators are described in U.S. Pat. Nos. 6,187,533, 5,620,887, 5,804,374, and 5,298,429, and U.S. Patent Publication Nos. 2004/003394, 2003/0077664, 2003/0215829 and 2003/0039980. Any of the methods described herein may be easily adapted to identify agonists or antagonists of any one of the Errα or Gabp polypeptides.

[0123] U.S. Pat. No. 6,555,326 (PCT Pub No. WO 99/27365) describes a fluorescent polarization assay for identifying agents which regulate the activity of nuclear hormone receptors, by using a nuclear hormone receptor, a peptide sensor and a candidate agent. Table 1 of this patent also lists exemplary nuclear hormone receptors. Such a method may easily be modified by one skilled in the art to identify agents which regulate the activity of Errα or Gabp.

[0124] The invention also provides a method for screening a candidate compound for its ability to modulate Errα activity in a suitable system, in the presence or absence of the candidate compound. A change in Errα activity in the presence of the compound relative to ERRα activity in the absence of the compound indicates that the compound modulates ERRα activity. If ERRα activity is increased relative to the control in the presence of the compound, the compound is an ERRα agonist. Conversely, if ERRα activity is decreased in the presence of the compound, the compound is an ERRα antagonist.

[0125] Another way of determining if an agent increases the activity of Errα or Gabp may also be based on binding of the agent to an ERRα or to a Gabp polypeptide or fragment thereof. Such competitive binding assays are well known to those skilled in the art.

[0126] For example, the invention provides screening methods for compounds able to bind to ERRα which are therefore candidates for modifying the activity of ERRα. Various suitable screening methods are known to those in the art, including immobilization of ERRα on a substrate and exposure of the bound ERRα to candidate compounds, followed by elution of compounds which have bound to the ERRα. Additional methods and assays for identifying agents which modulate Errα activity, for generating Errα knockout animals and cells, and for generating Errα reagents, such as anti-Errα antibodies, are described in International PCT publication No. WO 00/122988, hereby incorporated by reference in its entirety.

[0127] Another aspect of the invention provides a method of identifying an agent that modulates a biological response, the method comprising (a) contacting, in the presence of the agent, a PGC-1 polypeptide and (i) an Errα polypeptide, or (ii) a Gabp polypeptide, under conditions which allow the formation of a complex between the PGC-1 polypeptide and (i) the Errα polypeptide, or (ii) the Gabp polypeptide; and (b) detecting the presence of the complex; wherein an agent that modulates the biological response is identified if the agent increases or decreases the formation of the complex, and wherein the biological response is (a) expression of the OXPHOS genes; (b) mitochondrial biogenesis; (c) expression of Nuclear Respiratory Factor 1 (NRF-1); (d) β-oxidation of fatty acids; (e) total mitochondrial respiration; (f) uncoupled respiration; (g) mitochondrial DNA replication; or (h) expression of mitochondrial enzymes.

[0128] In some embodiments of the methods for identifying an agent that modulates a biological response, the method comprises an agent that increases the formation of the complex and that increases the biological response. In alternate embodiments, the agent decreases the formation of the complex and decreases the biological response. In some embodiments, the conditions which allow the formation of a complex between the PGC-1 polypeptide and an Errα polypeptide or a Gabpa polypeptide comprise in vitro conditions, while in other embodiments they comprise in vivo conditions such as expression in a cell or in an organism.

[0129] The following embodiments of methods for identifying a compound that modulates a biological response, although directed at Errα and PGC-1α, are equally applicable to Gabp polypeptides, such as Gabpa polypeptides, or to PGC-1β polypeptides.

[0130] One embodiment of the methods for identifying a compound that modulates a biological response comprises: 1) combining an Errα polypeptide or fragment thereof, a PGC-1α polypeptide or fragment thereof, and an agent, under conditions wherein the Errα and PGC-1α polypeptides physically interact in the absence of the agent, 2) determining if the agent interferes with the interaction, and 3) for an agent that interferes with the interaction, further assessing its ability to promote any of the biological responses of the cell, such as (a) expression of the OXPHOS genes, mitochondrial biogenesis, expression of Nuclear Respiratory Factor 1 (NRF-1), β-oxidation of fatty acids, total mitochondrial respiration, uncoupled respiration, mitochondrial DNA replication or expression of mitochondrial enzymes.

[0131] A variety of assay formats will suffice and, in light of the present disclosure, those not expressly described herein will nevertheless be comprehended by one of ordinary skill in the art. Assay formats which approximate such conditions as formation of protein complexes and enzymatic activity may be generated in many different forms, and include assays based on cell-free systems, e.g. purified proteins or cell lysates, as well as cell-based assays which utilize intact cells. Simple binding assays can also be used to detect agents which bind to Errα or PGC-1α. Such binding assays may also identify agents that act by disrupting the interaction between an Errα polypeptide and PGC-1α. Agents to be tested can be produced, for example, by bacteria, yeast or other organisms (e.g. natural products), produced chemically (e.g. small molecules, including peptidomimetics), or produced recombinantly. Because Errα and PGC-1α polypeptides contain multiple domains, specific embodiments of the assays and methods described to identify agents which modulate complex formation between Errα and PGC-1α employ fragments of Errα rather than full-length polypeptides, such as those lacking the DNA binding domains. Fragments of PGC-1α may also be used in some embodiments, in particular fragments which retain the ability to complex with Errα.

[0132] In many drug screening programs which test libraries of compounds and natural extracts, high throughput assays are desirable in order to maximize the number of compounds surveyed in a given period of time. Assays of the present invention which are performed in cell-free systems, which may be developed with purified or semi-purified proteins or with lysates, are often preferred as "primary" screens in that they can be generated to permit rapid development and relatively easy detection of an alteration in a molecular target which is mediated by a test compound. Moreover, the effects of cellular toxicity and/or bioavailability of the test agent can be generally ignored in the in vitro system, the assay instead being focused primarily on the effect of the drug on the molecular target as may be manifest in an alteration of binding affinity with other proteins or changes in enzymatic properties of the molecular target.

[0133] In preferred in vitro embodiments of the present assay, a reconstituted Errα/PGC-1α complex comprises a reconstituted mixture of at least semi-purified proteins. By semi-purified, it is meant that the proteins utilized in the reconstituted mixture have been previously separated from other cellular or viral proteins. For instance, in contrast to cell lysates, the proteins involved in Errα/PGC-1α complex formation are present in the mixture to at least 50% purity relative to all other proteins in the mixture, and more preferably are present at 90-95% purity. In certain embodiments of the subject method, the reconstituted protein mixture is derived by mixing highly purified proteins such that the reconstituted mixture substantially lacks other proteins (such as of cellular or viral origin) which might interfere with or otherwise alter the ability to measure Errα/PGC-1α complex assembly and/or disassembly.

[0134] Assaying Errα/PGC-1α complexes, in the presence and absence of a candidate agent, can be accomplished in any vessel suitable for containing the reactants. Examples include microtiter plates, test tubes, and micro-centrifuge tubes. In a screening assay, the effect of a test agent may be assessed by, for example, determining the effect of the test agent on kinetics, steady-state and/or endpoint of the reaction.

[0135] In one embodiment of the present invention, drug

cive to complex formation. Following incubation, the beads are washed to remove any unbound interacting protein, and the matrix bead-bound radiolabel determined directly (e.g. beads placed in scintillant), or in the supernatant after the complexes are dissociated, e.g. when a microtitre plate is used. Alternatively, after washing away unbound protein, the complexes can be dissociated from the matrix, separated by SDS-PAGE gel, and the level of interacting polypeptide found in the matrix-bound fraction quantitated from the gel using standard electrophoretic techniques.

[0139] In yet another embodiment, the Errα and PGC-1α polypeptides can be used to generate an interaction trap assay (see also, U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J Biol Chem 268:12046-12054; Bartel et al.
(1993) Biotechniques 14: screening assays can be generated which detect inhibitory 920-924; and Iwabuchi et al. (1993) Oncogene 8:1693 agents on the basis of their ability to interfere with assembly 1696), for Subsequently detecting agents which disrupt or stability of the Erro/PGC-1a complex. In an exemplary binding of the proteins to one and other. binding assay, the compound of interest is contacted with a 0140. In still further embodiments of the present assay, mixture comprising a Erro/PGC-1a complex. Detection and the Erro/PGC-1 C. complex is generated in whole cells, quantification of Erro/PGC-1C. complexes provides a means taking advantage of cell culture techniques to Support the for determining the compound's efficacy at inhibiting (or subject assay. For example, as described below, the Errol/ potentiating) interaction between the two polypeptides. The PGC-1 C. complex can be constituted in a eukaryotic cell efficacy of the compound can be assessed by generating dose culture system, Such as a mammalian cell and a yeast cell. response curves from data obtained using various concen Other cells known to one skilled in the art may be used. trations of the test compound. Moreover, a control assay can Advantages to generating the Subject assay in a whole cell also be performed to provide a baseline for comparison. In include the ability to detect inhibitors which are functional the control assay, the formation of complexes is quantitated in an environment more closely approximating that which in the absence of the test compound. therapeutic use of the inhibitor would require, including the 0136 Complex formation may be detected by a variety of ability of the agent to gain entry into the cell. Furthermore, techniques. For instance, modulation in the formation of certain of the in vivo embodiments of the assay, Such as complexes can be quantitated using, for example, detectably examples given below, are amenable to high through-put labeled proteins (e.g. 
radiolabeled, fluorescently labeled, or analysis of candidate agents. enzymatically labeled), by immunoassay, or by chromato graphic detection. Surface plasmon resonance systems. Such 0.141. The components of the Erro/PGC-1a complex can as those available from Biacore (C) International AB (Upp be endogenous to the cell selected to Support the assay. sala, Sweden), may also be used to detect protein-protein Alternatively, some or all of the components can be derived interaction. from exogenous sources. For instance, fusion proteins can be introduced into the cell by recombinant techniques (such 0137 The proteins and peptides described herein may be as through the use of an expression vector), as well as by immobilized. Often, it will be desirable to immobilize the microinjecting the fusion protein itself or mRNA encoding peptides and polypeptides to facilitate separation of com the fusion protein. plexes from uncomplexed forms of one of the proteins, as well as to accommodate automation of the assay. The 0142. In still further embodiments of the present assay, peptides and polypeptides can be immobilized on any solid the Erro/PGC-1a complex is generated in whole cells and matrix. Such as a plate, a bead or a filter. The peptide or the level of interaction is determined by measuring the level polypeptide can be immobilized on a matrix which contains of gene expression of an (i) endogenous gene or of a reactive groups that bind to the polypeptide. Alternatively or transgene, whose expression is dependent on the formation in combination, reactive groups such as cysteines in the of a complex. Genes which are responsive to Erro/PGC-1a protein can react and bind to the matrix. In another embodi complex are provided by the invention and some may be ment, the polypeptide may be expressed as a fusion protein found in the literature. with another polypeptide which has a high binding affinity 0.143. 
In specific embodiments, the cells used in the to the matrix, Such as a fusion protein to streptavidin which methods described herein for identifying agents are cells in binds biotin with high affinity. culture or from a subject, Such as a tissue, fluid or organ or 0138. In an illustrative embodiment, a fusion protein can a portion of any of the foregoing. For example, cells can be provided which adds a domain that permits the protein to preferably be from tissues that are involved in glucose be bound to an insoluble matrix. For example, a GST-ERRC. metabolism, Such as pancreatic cells, islates of Langerhans, fusion protein can be adsorbed onto glutathione Sepharose pancreatic beta cells, muscle cells, liver cells or other beads (Sigma Chemical, St. Louis, Mo.) or glutathione appropriate cells. Preferably, cells are provided in culture derivatized microtitre plates, which are then combined with and can be a primary cell line or a continuous cell line and a PGC-1a polypeptide, e.g. an S-labeled polypeptide, and can be provided as a clonal population of cells or a mixed the test compound and incubated under conditions condu population of cells. US 2007/0203083 A1 Aug. 30, 2007
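
The dose response assessment described above for the binding assay (complex formation measured at several concentrations of the test compound and compared against a control run without the compound) can be illustrated with a short sketch. This is only one plausible way to summarize such data, not the applicants' procedure: the function names, the example numbers, and the choice of log-linear interpolation to estimate a half-maximal concentration are all assumptions made for illustration.

```python
import math

def percent_of_control(signal, control_signal):
    """Complex formation as a percentage of the no-compound control assay."""
    return 100.0 * signal / control_signal

def estimate_half_maximal(concentrations, signals, control_signal):
    """Estimate the concentration at which complex formation is 50% of control.

    Hypothetical helper: interpolates linearly in log10(concentration) between
    the two tested doses that bracket 50%. Concentrations must be positive.
    Returns None if the measured curve never crosses 50%.
    """
    points = sorted(zip(concentrations, signals))
    curve = [(c, percent_of_control(s, control_signal)) for c, s in points]
    for (c_lo, r_lo), (c_hi, r_hi) in zip(curve, curve[1:]):
        crosses = (r_lo - 50.0) * (r_hi - 50.0) <= 0 and r_lo != r_hi
        if crosses:
            frac = (r_lo - 50.0) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Illustrative (made-up) readout, e.g. bead-bound radiolabel counts.
doses = [0.1, 1.0, 10.0, 100.0]      # arbitrary concentration units
counts = [950.0, 800.0, 200.0, 50.0]  # signal with test compound
control = 1000.0                      # signal without test compound
half_max = estimate_half_maximal(doses, counts, control)
```

A compound that potentiates rather than inhibits the interaction would give percentages above 100, in which case this particular helper returns None and a different summary statistic would be needed.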

VII. Methods of Identifying Agents which Modulate OXPHOS-CR Expression

[0144] Applicants have identified a core set of genes (OXPHOS-CR) that help unify previous observations from clinical investigation, exercise physiology, pharmacology, and genetics. Drugs that modulate OXPHOS-CR activity may be promising candidates for the prevention and/or treatment of type 2 diabetes. Applicants' discovery of OXPHOS-CR properties and previous observations support the hypothesis that drugs that increase OXPHOS-CR activity in muscle and fat will improve insulin resistance, while agents that reduce it will worsen insulin resistance. These drugs may have benefit in other processes characterized by aberrant oxidative capacity in these tissues, including obesity and aging.

[0145] The methods described in this section for identifying agents which regulate the expression level of one or more OXPHOS-CR genes may also identify agents which modulate PGC-1α, Gabp or Errα expression or activity, or agents which mimic or functionally substitute for these genes, since applicants have demonstrated that these three transcriptional regulators regulate the expression of OXPHOS-CR genes. Likewise, these methods also identify therapeutic agents which modulate metabolism or mitochondrial function in a subject in need thereof, such as a subject afflicted with diabetes.

[0146] Accordingly, the invention further provides cell-based methods for identifying agents which regulate the expression of OXPHOS-CR genes. One aspect provides a method of identifying an agent that regulates expression of OXPHOS-CR genes, the method comprising (a) contacting (i) an agent to be assessed for its ability to regulate expression of OXPHOS-CR genes with (ii) a test cell; and (b) determining whether the expression level of at least two OXPHOS-CR gene products show a coordinate change in the test cell compared to an appropriate control, wherein a coordinate change in the expression of the OXPHOS-CR gene products relative to the appropriate control indicates that the agent regulates the expression of OXPHOS-CR genes.

[0147] A related aspect of the invention provides a method of identifying an agent that regulates expression of a gene, wherein the gene is an OXPHOS-CR gene, the method comprising (a) contacting (i) an agent to be assessed for its ability to regulate expression of the gene with (ii) a test cell; and (b) determining whether the expression level of two or more OXPHOS-CR gene products show a coordinate change in the test cell compared to an appropriate control, wherein the gene does not encode the two or more OXPHOS-CR gene products, and wherein a coordinate change in the expression of the OXPHOS-CR gene products relative to the appropriate control indicates that the agent regulates the expression level of the gene.

[0148] In some embodiments, the OXPHOS-CR gene products comprise an mRNA or a polypeptide. The gene products of the two genes need not be of the same type. For instance, in one specific embodiment, the mRNA levels of a first OXPHOS-CR gene, the polypeptide levels of a second OXPHOS-CR gene, and the enzymatic activity of a third OXPHOS-CR gene are determined. In a preferred embodiment, all the gene products comprise mRNAs.

[0149] In additional embodiments, determining whether the expression of at least two OXPHOS-CR gene products show a coordinate change in the test cell comprises detecting, either qualitatively, semiquantitatively, or more preferably quantitatively, the levels of the OXPHOS-CR gene products. In one embodiment, the coordinate change comprises an increase or a decrease in expression in all the genes tested. In another embodiment, a coordinate change comprises an increase or a decrease in at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% of the genes tested.

[0150] In a variation of this method, more than one cell is contacted with the agent. In yet another variation, multiple cells or cell populations are contacted with the agent, such that each cell or cell population provides a measure of expression for each of the OXPHOS-CR gene products. For example, if the expression level of four OXPHOS-CR genes is to be determined, then four cell populations, such as one on each well of a 96-well plate, are contacted with the agent, and from each well the expression level of one of the OXPHOS genes is determined. Alternatively, two cell populations could be used and the expression level of two gene products could be determined from each of the two cell populations. In another embodiment, the cell or cell population is contacted with more than one agent.

[0151] The expression level of the OXPHOS-CR gene products may be determined using techniques known in the art. Gene products which comprise an mRNA may be detected, for example, using reverse transcriptase mediated polymerase chain reaction (RT-PCR), Northern blot analysis, in situ hybridization, microarray analysis, etc. (Schena et al., Science 270:467-470 (1995); Lockhart et al., Nature Biotech. 14:1675-1680 (1996), and U.S. Pat. Nos. 5,770,151, 5,807,522, 5,837,832, 5,952,180, 6,040,138 and 6,045,996). Polypeptide products may be detected using, for example, standard immunoassay methods known in the art. Such immunoassays include, but are not limited to, competitive and non-competitive assay systems using techniques such as radioimmunoassays, ELISA (enzyme-linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzymatic, or radioisotope labels, for example), Western blots, 2-dimensional gel analysis, precipitation reactions, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays.

[0152] When the gene product comprises an enzyme, the level of gene product may be determined using a measure of enzymatic activity. Products of enzyme catalytic activity may be detected by suitable methods that will depend on the quantity and physicochemical properties of the particular product. Thus, detection may be, for example by way of illustration and not limitation, by radiometric, colorimetric, spectrophotometric, fluorimetric, immunometric or mass spectrometric procedures, or by other suitable means that will be readily apparent to a person having ordinary skill in the art. In certain embodiments of the invention, detection of a product of enzyme catalytic activity may be accomplished directly, and in certain other embodiments detection of a product may be accomplished by introduction of a detectable reporter moiety or label into a substrate or reactant such as a marker enzyme, dye, radionuclide, luminescent group, fluorescent group or biotin, or the like. The amount of such a label that is present as unreacted substrate and/or as reaction product, following a reaction to assay enzyme catalytic activity, is then determined using a method appropriate for the specific detectable reporter moiety or label. For radioactive groups, radionuclide decay monitoring, scintillation counting, scintillation proximity assays (SPA) or autoradiographic methods are generally appropriate. For immunometric measurements, suitably labeled antibodies may be prepared including, for example, those labeled with radionuclides, with fluorophores, with affinity tags, with biotin or biotin mimetic sequences or those prepared as antibody-enzyme conjugates (see, e.g., Weir, D. M., Handbook of Experimental Immunology, 1986, Blackwell Scientific, Boston; Scouten, W. H., Methods in Enzymology 135:30-65, 1987; Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988; Haugland, 1996 Handbook of Fluorescent Probes and Research Chemicals, Sixth Ed., Molecular Probes, Eugene, Oreg.; Scopes, R. K., Protein Purification: Principles and Practice, 1987, Springer-Verlag, NY; Hermanson, G. T. et al., Immobilized Affinity Ligand Techniques, 1992, Academic Press, Inc., NY; Luo et al., 1998 J. Biotechnol. 65:225 and references cited therein). Spectroscopic methods may be used to detect dyes (including, for example, colorimetric products of enzyme reactions), luminescent groups and fluorescent groups. Biotin may be detected using avidin or streptavidin, coupled to a different reporter group (commonly a radioactive or fluorescent group or an enzyme). Enzyme reporter groups may generally be detected by the addition of substrate (generally for a specific period of time), followed by spectroscopic, spectrophotometric or other analysis of the reaction products. Standards and standard additions may be used to determine the level of enzyme catalytic activity in a sample, using well known techniques.

[0153] In one embodiment, the promoter regions for two or more OXPHOS-CR genes (or larger portions of such genes) may be operatively linked to a reporter gene and used in a reporter gene-based assay to detect agents that enhance or diminish OXPHOS-CR gene expression. In such embodiments, the OXPHOS gene product is the mRNA or polypeptide encoded by the reporter gene. In a specific embodiment, the recombinant fluorescent polypeptide comprises a polypeptide selected from the group consisting of the green fluorescent protein (GFP), DsRed, zFP538, mRFP1, BFP, CFP, YFP, mutants thereof, or functionally-active fragments thereof. GFP is described in U.S. Pat. No. 5,491,084, while zFP538 is described in Zagranichny et al. Biochemistry. 2004; 43(16):4764-72.

[0154] In another specific embodiment, the appropriate control comprises the expression level of the two or more OXPHOS-CR gene products in cells that (a) have not been contacted with the agent; (b) have been contacted with a different dosage of the agent; (c) have been contacted with a second agent; or (d) a combination thereof. Alternatively, an appropriate control may be a measure of the gene product in the cell prior to contacting with the agent. In another embodiment, the level of gene expression of the OXPHOS-CR gene product in the cell can be compared with a standard (e.g., presence or absence of an OXPHOS-CR gene product) or numerical value determined (e.g. from analysis of other samples) to correlate with a normal or expected level of expression.

[0155] In some embodiments, the identification of agents which regulate the expression of OXPHOS-CR genes is carried out in a high-throughput fashion. When screening agents in a high-throughput manner, such as when test compounds are screened for their effects on the cellular phenotype, arrays of cells may be prepared for parallel handling of cells and reagents. Standard 96 well microtiter plates which are 86 mm by 129 mm, with 6 mm diameter wells on a 9 mm pitch, may be used for compatibility with current automated loading and robotic handling systems. The microplate is typically 20 mm by 30 mm, with cell locations that are 100-200 microns in dimension on a pitch of about 500 microns. Methods for making microplates are described in U.S. Pat. No. 6,103,479, incorporated by reference herein in its entirety. Microplates may consist of coplanar layers of materials to which cells adhere, patterned with materials to which cells will not adhere, or etched 3-dimensional surfaces of similarly patterned materials. For the purpose of the following discussion, the terms "well" and "microwell" refer to a location in an array of any construction to which cells adhere and within which the cells are imaged. Microplates may also include fluid delivery channels in the spaces between the wells. The smaller format of a microplate increases the overall efficiency of the system by minimizing the quantities of the reagents, storage and handling during preparation and the overall movement required for the scanning operation. In addition, the whole area of the microplate can be imaged more efficiently.

[0156] In specific embodiments, the test cell that is contacted with the agent may be a primary cell, a cell within a tissue, or a cell line. In a preferred embodiment, the test cell is a liver cell, a skeletal muscle cell, such as a C2C12 myoblast, or a fat cell, such as a 3T3-L1 preadipocyte.

[0157] In one embodiment, the method for identifying an agent that regulates expression of OXPHOS-CR genes comprises determining whether the expression of at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26 or 27 OXPHOS-CR gene products show a coordinate change. In a preferred embodiment, the expression level of five or fewer OXPHOS-CR gene products is determined. In a specific embodiment, the OXPHOS-CR gene products are selected from the group consisting of NDUFB3, SDHA, NDUFA8, COX7A1, UQCRC1, NDUFC1, NDUFS2, ATP5O, NDUFS3, SDHB, NDUFS5, NDUFB6, COX5B, CYC1, NDUFA7, UQCRB, COX7B, ATP5L, COX7C, NDUFA5, GRIM19, ATP5J, COX6A2, NDUFB5, CYCS, NDUFA2 and HSPC051. In a specific embodiment, one of the OXPHOS-CR genes is ubiquinol cytochrome c reductase binding protein (UQCRB). In a preferred embodiment, the OXPHOS-CR gene products are human OXPHOS-CR products. The OXPHOS-CR genes whose expression level is determined may be encoded by (i) mitochondrial DNA (mtDNA); (ii) nuclear DNA; or (iii) a combination thereof.

[0158] In one embodiment of the methods described herein for identifying agents which regulate the expression of OXPHOS-CR genes, the method further comprises determining if the agent regulates the expression of at least one gene which is not an OXPHOS-CR gene. In some embodiments, the method further comprises determining if the agent regulates the expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or 50 genes which are not OXPHOS-CR genes. Such genes may be mitochondrial genes or, in preferred embodiments, not mitochondrial genes, such as actin genes. The expression level of another gene which is not an OXPHOS-CR gene may serve as an internal control, such that agents which specifically modulate the expression of an OXPHOS-CR gene may be identified.
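
The test for a coordinate change described in this section (a same-direction change in at least a stated percentage of the OXPHOS-CR gene products tested, judged against a control and, optionally, normalized to a non-OXPHOS-CR gene serving as an internal control) can be sketched in a few lines. The sketch below is illustrative only: the expression values are made up, and the function names, the use of an actin gene (ACTB) as the reference, the 80% cutoff, and the minimum log2 fold change of 0.5 are assumptions rather than values taken from the source.

```python
import math

def log2_fold_changes(treated, control, reference_gene):
    """Log2 fold change per gene, each sample first normalized to a reference gene.

    `treated` and `control` map gene symbol to an expression measurement
    (e.g. an mRNA level). The reference gene itself is left out of the result.
    """
    changes = {}
    for gene in treated:
        if gene == reference_gene:
            continue
        t = treated[gene] / treated[reference_gene]
        c = control[gene] / control[reference_gene]
        changes[gene] = math.log2(t / c)
    return changes

def coordinate_change(changes, min_fraction=0.8, min_abs_log2=0.5):
    """Return 'increase', 'decrease', or None.

    A change counts as coordinate when at least `min_fraction` of the genes
    tested move in the same direction by at least `min_abs_log2`.
    Both thresholds are hypothetical defaults chosen for this sketch.
    """
    n = len(changes)
    if n == 0:
        return None
    up = sum(1 for v in changes.values() if v >= min_abs_log2)
    down = sum(1 for v in changes.values() if v <= -min_abs_log2)
    if up / n >= min_fraction:
        return "increase"
    if down / n >= min_fraction:
        return "decrease"
    return None

# Made-up measurements for five OXPHOS-CR genes plus an actin reference.
control_cells = {"NDUFB3": 100, "SDHA": 100, "COX7A1": 100,
                 "UQCRB": 100, "CYC1": 100, "ACTB": 1000}
treated_cells = {"NDUFB3": 220, "SDHA": 180, "COX7A1": 300,
                 "UQCRB": 150, "CYC1": 105, "ACTB": 1000}
changes = log2_fold_changes(treated_cells, control_cells, "ACTB")
call = coordinate_change(changes)
```

Setting `min_fraction=1.0` corresponds to the embodiment in which all genes tested must change; lower values correspond to the percentage-based embodiments.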

[0159] In other embodiments, a secondary screening step is performed on the agent. In a specific embodiment, the agent is tested in additional assays for its effects on mitochondrial cell number or a mitochondrial function, such as coupled oxygen consumption. Such assays may comprise contacting a cell with the agent, measuring mitochondrial cell number or function, and comparing it to an appropriate control. U.S. Patent Publication No. 20020049176 describes assays for determining mitochondrial mass, volume or number, and U.S. Patent Publication No. 2002/0127536 describes assays for determining coupled oxygen consumption. Accordingly, in one embodiment, the agent being tested in the assays described herein additionally (a) increases the number of mitochondria in the test cell; (b) increases coupled oxygen consumption in the cell; (c) increases mtDNA copy number in the test cell; or (d) a combination thereof.

[0160] Agents identified using the methods of the present invention may also be tested in model systems for their efficacy in inducing the desired biological response or in treating disorders. One example is high-fat diet induced obesity and insulin resistance. In another example, agents may also be tested for their efficacy in treating diabetes by using a non-obese diabetic (NOD) mouse. The successful use of this animal model in diabetic drug discovery is reported in the literature (Yang et al., J. Autoimmun. 10:257-260 (1997), Akashi et al., Int. Immunol. 9:1159-1164 (1997), Suri and Katz, Immunol. Rev. 169:55-65 (1999), Pak et al., Autoimmunity 20:19-24 (1995), Toyoda and Formby, Bioessays 20:750-757 (1998), Cohen, Res. Immunol. 148:286-291 (1997), Baxter and Cooke, Diabetes Metab. Rev. 11:315-335 (1995), McDuffie, Curr. Opin. Immunol. 10:704-709 (1998), Shieh et al., Autoimmunity 15:123-135 (1993), Anderson et al., Autoimmunity 15:113-122 (1993)).

[0161] It is well understood by one skilled in the art that many of the methods described herein may be carried out using variants of the polypeptides described. Variants include truncated polypeptides, mutant polypeptides, such as those carrying point mutations, and fusions between domains of the subject polypeptides and other polypeptides. In some embodiments, the subject polypeptides, or their domains, may be fused to reporter proteins, such as to GFP or to enzymes. In some embodiments of any of the methods described herein, the polypeptides used are 50, 60, 70, 80, 90, 95, 98 or 99% identical to the sequences referenced in the various GenBank Accession numbers.

[0162] In the methods described herein for identifying an agent, the agent may comprise a recombinant polypeptide, a synthetic molecule, or a purified or partially purified naturally occurring molecule. In a specific embodiment, the agent comprises a virus or a phage. In another embodiment, the agent is a nuclear hormone, such as estrogen, thyroid hormone, cortisol, testosterone, and others. Additional agents include nucleic acids encoding nuclear hormone receptors.

[0163] In another embodiment, the agent comprises a set of environmental conditions. The condition may be a physical condition of the environment in which the cell resides, a chemical condition of the environment, and/or a biological condition of the site. Exposure may be for any suitable time. The exposure may be continuous, transient, periodic, sporadic, etc. Physical conditions include any physical state of the examination site. The physical state may be the temperature or pressure of the sample, or an amount or quality of light (electromagnetic radiation) at the site. Alternatively, or in addition, the physical state may relate to an electric field, magnetic field, and/or particle radiation at the site, among others. Chemical conditions include any chemical aspect of the fluid in which the sample populations are disposed. The chemical aspect may relate to presence or concentration of a test compound or material, pH, ionic strength, and/or fluid composition, among others.

[0164] Biological conditions include any biological aspect of the shared fluid volume in which cell populations are disposed. The biological aspects may include the presence, absence, concentration, activity, or type of cells, viruses, vesicles, organelles, biological extracts, and/or biological mixtures, among others. The assays described herein may screen a library of conditions to test the activity of each library member on a set of cell populations. A library generally comprises a collection of two or more different members. These members may be chemical modulators (or candidate modulators) in the form of molecules, ligands, compounds, transfection materials, receptors, antibodies, and/or cells (phages, viruses, whole cells, tissues, and/or cell extracts), among others, related by any suitable or desired common characteristic. This common characteristic may be "type." Thus, the library may comprise a collection of two or more compounds, two or more different cells, two or more different antibodies, two or more different nucleic acids, two or more different ligands, two or more different receptors, or two or more different phages or whole cell populations distinguished by expressing different proteins, among others. This common characteristic also may be "function." Thus, the library may comprise a collection of two or more binding partners (e.g., ligands and/or receptors), agonists, or antagonists, among others, independent of type.

[0165] Library members may be produced and/or otherwise generated or collected by any suitable mechanism, including chemical synthesis in vitro, enzymatic synthesis in vitro, and/or biosynthesis in a cell or organism. Chemically and/or enzymatically synthesized libraries may include libraries of compounds, such as synthetic oligonucleotides (DNA, RNA, peptide nucleic acids, and/or mixtures or modified derivatives thereof), small molecules (about 100 Da to 10 kDa), peptides, carbohydrates, lipids, and/or so on. Such chemically and/or enzymatically synthesized libraries may be formed by directed synthesis of individual library members, combinatorial synthesis of sets of library members, and/or random synthetic approaches. Library members produced by biosynthesis may include libraries of plasmids, complementary DNAs, genomic DNAs, RNAs, viruses, phages, cells, proteins, peptides, carbohydrates, lipids, extracellular matrices, cell lysates, cell mixtures, and/or materials secreted from cells, among others. Library members may contact arrays of cell populations singly or as groups/pools of two or more members.

VIII. Methods of Identifying Transcriptional Regulators

[0166] Another aspect of the invention provides methods of identifying transcriptional regulators. In some aspects, the invention provides methods of identifying transcriptional regulators which display differential activity between two cells.

[0167] The invention provides a method of identifying a transcriptional regulator having differential activity between an experimental cell and a control cell, the method comprising (i) determining the level of gene expression of at least two genes in the experimental cell and in the control cell; (ii) ranking genes according to a difference metric of their expression level in the experimental cell compared to the control cell; (iii) identifying a subset of genes, wherein each gene in the subset contains the same DNA sequence motif; (iv) testing via a nonparametric statistic if the subset of genes are enriched at either the top or the bottom of the ranking; (v) optionally reiterating steps (ii)-(iii) for additional motifs; (vi) for a subset of genes that is enriched, identifying a transcriptional regulator which binds to a DNA sequence motif that is contained in the subset of genes; thereby identifying a transcriptional regulator having differential activity between two cells.

[0168] The methods provided by the invention for identifying transcriptional regulators with differential activity are not limited to any type of cell or to any type of difference between the two cells. The cells may be eukaryotic, prokaryotic, yeast, nematode, insect, mammalian or human cells. The cells may be primary cells, or cell lines. The cells may be in an organism. In one specific embodiment, the cells are isolated from a subject.

[0169] The control and the experimental cell may be the same type of cell or they may be different types of cells. In one embodiment, the experimental cell and the control cell are both cells derived from the same cell line or from the same tissue types. In some embodiments, the experimental cell and the control cell are from different organisms, such as from two different subjects. In some specific embodiments in which the cells are derived from the same organism, one cell is a normal cell and another cell is a diseased cell. For instance, one cell may be a cancer cell and one may be a non-cancer cell, or one cell may be a virus infected cell and one may be a non-infected cell. In some embodiments, both cells may be diseased cells, but differ in their disease states. For instance, the two cells may be hyperplastic cells but at different stages of cancer progression, e.g. one cell may be a tumor cell and the other a metastatic cell derived from that tumor. Furthermore, the two cells may differ genetically or they may be clonal cells with essentially identical genotypes. One or both of the cells may be experimentally manipulated, such as by contacting one of the cells with an agent, or contacting both cells with an agent but at different concentrations.

[0170] In some embodiments of the method, the subject from which one or both of the cells are derived is afflicted with a disorder. The method is not limited by any particular disorder. In some specific embodiments, the disorder is a metabolic disorder or a hyperplastic condition. Hyperplastic conditions include renal cell cancer, Kaposi's sarcoma, chronic leukemia, prostate cancer, breast cancer, sarcoma, pancreatic cancer, leukemia, ovarian carcinoma, rectal cancer, throat cancer, melanoma, colon cancer, bladder cancer, lymphoma, mastocytoma, lung cancer, mammary adenocarcinoma, pharyngeal squamous cell carcinoma, testicular

[0171] In some embodiments, a transgene is introduced into the experimental cell. The transgene may encode any protein, such as transcriptional regulators or proteins that regulate the activity of transcriptional regulators, such as kinases and phosphatases. The transgene may also encode an inhibitory RNA, such as a hairpin RNA, so that the function of the gene to which the hairpin RNA is directed may be knocked down, allowing a comparison of gene expression between the two cells. In some embodiments, the transgene is a transgene associated with a disease state. For example, a gene whose overexpression leads to cancer may be overexpressed to identify transcriptional regulators expressing differential activity between the two cells. These transcriptional regulators may then be used as therapeutic targets for the treatment of cancer. In some embodiments, the transgene is a mutant transgene, such as a mutant transgene associated with a disease state.

[0172] In some embodiments, the DNA sequence motif comprises at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 25 nucleotides in length, preferably at least 5. The DNA sequence motif may be any combination of nucleotides, and it may represent a known binding site or a novel binding site. In some embodiments, the DNA sequence motif comprises undefined nucleotide positions which may contain more than one base. For instance, a DNA sequence motif may comprise the sequence GATNNATC, wherein the 4th and 5th positions would include any of the four bases. Similarly, a DNA sequence motif comprising the sequence GAT(G/T)ATC would have a G or a T in the fourth position. In some embodiments, the DNA sequence motif comprises at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 defined positions.

[0173] The method can be applied to any number of motifs. In one embodiment, all permutations of DNA sequence motifs of at least 6, 7, 8 and 9 bases in length are tested. The number selected may depend on the number of genes in the subset, the computational capabilities available, and the size of the window in each gene in which the DNA sequence motif is searched.

[0174] The method is not limited to any particular method of measuring gene expression. In some embodiments, determining the level of expression of a gene in a cell comprises determining the levels of mRNA for the gene in the cell. Any method known in the art may be used to determine mRNA levels. In one embodiment, mRNA is isolated from the cell, and the levels of mRNA for each gene in the subset are determined by hybridizing the mRNA, or cDNA derived from the mRNA, to a DNA microarray.

[0175] In some embodiments of the methods described herein, identifying the transcriptional regulator which binds to a DNA sequence motif comprises searching a database comprising transcriptional regulators and DNA sequence motifs to which they bind. For example, the TRANSFAC transcription factor database, maintained at the GBF Braunschweig, Germany, defines sequence specific binding site patterns, or motifs, for transcription factors.
In another cancer, gastrointestinal cancer, or stomach cancer, or a embodiment, the transcriptional regulator is identified by combination thereof. Additional disorders to which this comparing the sequences identified to those found in the method may be applied may be found, for example, in literature. It is understood by one skilled in the art that more Braunwald, E. et al. eds. Harrison's Principles of Internal than one transcriptional regulator may bind to a given DNA Medicine, 15" Edition (McGraw-Hill Book Company, New sequence motif, and therefore multiple transcriptional regu York, 2001). lators may be identified. US 2007/0203083 A1 Aug. 30, 2007 20
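As an illustration only, steps (ii) to (iv) of the method of paragraph [0167] can be sketched in a few lines of Python. Everything named here is an assumption made for the sketch and not part of the disclosure: the helper names (`motif_to_regex`, `rank_by_difference`, `genes_with_motif`, `rank_sum_test`), the toy expression values and promoter strings, the use of a simple difference as the difference metric, and the choice of a Mann-Whitney rank sum statistic with a normal approximation (one of several nonparametric statistics that could be used, and only rough for small gene sets).

```python
import math
import re
from statistics import NormalDist


def motif_to_regex(motif):
    """Translate a motif such as 'GATNNATC' or 'GAT(G/T)ATC' into a regex.

    'N' is taken to mean any of the four bases and '(G/T)' a choice of bases.
    """
    pattern = re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif.upper(),
    )
    return re.compile(pattern.replace("N", "[ACGT]"))


def rank_by_difference(exp_levels, ctl_levels):
    """Step (ii): order genes by (experimental - control), largest first."""
    return sorted(exp_levels, key=lambda g: exp_levels[g] - ctl_levels[g],
                  reverse=True)


def genes_with_motif(promoters, motif):
    """Step (iii): genes whose (hypothetical) promoter window has the motif."""
    rx = motif_to_regex(motif)
    return {g for g, seq in promoters.items() if rx.search(seq.upper())}


def rank_sum_test(ranking, subset):
    """Step (iv): Mann-Whitney rank sum test with a normal approximation.

    Returns (z, two_sided_p). A negative z means the subset sits near the
    top of the ranking, a positive z near the bottom.
    """
    n1 = sum(1 for g in ranking if g in subset)
    n2 = len(ranking) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("subset must contain some, but not all, ranked genes")
    rank_sum = sum(i for i, g in enumerate(ranking, start=1) if g in subset)
    u = rank_sum - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Toy data (invented for illustration): ten genes, four of which carry the motif.
exp_levels = {f"g{i}": 10.0 - i for i in range(10)}
ctl_levels = {g: 1.0 for g in exp_levels}
promoters = {g: "ACGTACGTAC" for g in exp_levels}
for g in ("g0", "g1", "g2", "g3"):
    promoters[g] = "CCGATTCATCGG"  # contains GATTCATC, which matches GATNNATC

ranking = rank_by_difference(exp_levels, ctl_levels)
subset = genes_with_motif(promoters, "GATNNATC")
z, p = rank_sum_test(ranking, subset)
```

In this toy case the four motif-bearing genes occupy the top four ranks, so the statistic is negative and the two-sided probability falls below 0.05. Repeating the last three lines for each candidate motif corresponds to the optional iteration of step (v).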

[0176] In some embodiments of the method described herein, identifying a transcriptional regulator which binds to a DNA sequence motif comprises experimentally identifying a transcriptional regulator which binds to the DNA sequence motif. In one embodiment, this may be achieved by identifying, from a library of genes, a transcriptional regulator capable of driving the expression of a selectable marker, wherein the expression of the selectable marker is dependent on binding of the transcriptional regulator to the DNA sequence motif. In a specific embodiment, a reporter gene is introduced into a cell, such as a mammalian cell or a yeast cell, wherein the promoter of the reporter gene is operably linked to the DNA sequence motif. A plasmid library which comprises candidate transcriptional regulator genes is introduced into the cells such that the transcriptional regulators are expressed in the cell. If a transcriptional regulator is able to bind to the DNA sequence motif, it will increase or decrease expression of the reporter gene, allowing identification of the cell expressing said regulator and thus allowing its identification. In a specific embodiment, a yeast one-hybrid approach, or other approaches well known to one skilled in the art, is used to identify a transcriptional regulator which binds to the DNA sequence motif (Vidal M et al. Nucleic Acids Res. 1999; 27(4):919-29; Kadonaga et al. (1986) Proc. Natl. Acad. Sci. USA 83: 5889-5893; Singh et al. (1988) Cell 52: 415-423; Chong, J. A. et al. (1997) In Bartel, P. L. and Fields, S. (eds), The Yeast Two-Hybrid System. Oxford University Press, New York, N.Y., pp. 289-297). Transcriptional regulators may also be identified based on their binding affinity for the DNA sequence motif, such as by standard affinity chromatography.

[0177] In some embodiments, the non-parametric statistic is a nonparametric, rank sum statistic. In specific embodiments, the non-parametric statistic is selected from the group consisting of a Kolmogorov-Smirnov, Mann-Whitney or Wald-Wolfowitz. Non-parametric statistics are well known in the art (David J. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, CRC Press, 2003; Myles Hollander, Douglas A. Wolfe, Nonparametric Statistical Methods, Wiley, John & Sons, Inc., 1998; Larry Wasserman, All of Statistics, Springer-Verlag New York, Incorporated, 2003). In some embodiments, the difference metric is a difference in arithmetic means, t-test scores, or signal to noise ratios. In some embodiments, a gene set is said to be enriched if the probability that the gene set would be enriched by chance, or when compared to an appropriate null hypothesis, is less than 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.0001, 0.00005 or 0.00001.

[0178] In some embodiments where the experimental cell expresses a recombinant transgene, such as a recombinant transcriptional regulator, the recombinant transcriptional regulator may itself be found to have differential activity. In other embodiments where the experimental cell expresses a recombinant transgene, the method may yield transcriptional regulators whose activity or expression is itself regulated by the recombinant transcriptional regulator, and if a recombinant transcriptional regulator is used whose activity is related to a disease state, identification of transcriptional regulators having differential activity between the two cells may yield therapeutic targets to treat the disorder.

IX. Biomarker Set Enrichment Analysis (BSEA)

[0179] One aspect of the invention provides methods of detecting statistically-significant differences in the expression level of at least one biomarker belonging to a biomarker set, between the members of a first and of a second experimental group. Applicants have named this new analytical technique Biomarker Set Enrichment Analysis (BSEA), or Gene Set Enrichment Analysis (GSEA) when the biomarker is a gene or a gene product.

[0180] GSEA may be valuable in efforts to relate genomic variation to disease and measures of total body physiology. Single-gene methods are powerful only where the individual gene effect is dramatic and the variance small, which may not be the case in many disease states. Methods like GSEA are complementary, and provide a framework with which to examine changes operating at a higher level of biological organization. This may be needed if common, complex disorders typically result from modest variation in the expression or activity of multiple members of a pathway, e.g. gene (biomarker) sets. As gene sets are systematically assembled using functional and genomic approaches, methods such as GSEA will likely be valuable in detecting coordinated but subtle variation in gene function that contributes to common human diseases. Accordingly, in a preferred embodiment, the methods detect statistically-significant differences in the expression level in more than one biomarker.

[0181] One aspect of the invention provides a method of detecting statistically-significant differences in the expression level of at least one biomarker belonging to a biomarker set, between the members of a first and of a second experimental group, comprising: (a) obtaining a biomarker sample from members of the first and the second experimental groups; (b) determining, for each biomarker sample, the expression levels of at least one biomarker belonging to the biomarker set and of at least one biomarker not belonging to the set; (c) generating a rank order of each biomarker according to a difference metric of its expression level in the first experimental group compared to the second experimental group; (d) calculating an experimental enrichment score for the biomarker set by applying a non parametric statistic; and (e) comparing the experimental enrichment score with a distribution of randomized enrichment scores to calculate the fraction of randomized enrichment scores greater than the experimental enrichment score, wherein a low fraction indicates a statistically-significant difference in the expression level of the biomarker set between the members of the first and of the second experimental group.

[0182] In one embodiment of the foregoing methods, the distribution of randomized enrichment scores is generated by (i) randomly permuting the assignment of each biomarker sample to the first or to the second experimental group; (ii) generating a rank order of each biomarker according to the absolute value of a difference metric of its expression level in the first experimental group compared to the second experimental group; (iii) calculating an experimental enrichment score for the biomarker set by applying a non parametric statistic to the rank order; and (iv) repeating steps (i), (ii) and (iii) a number of times sufficient to generate the distribution of randomized enrichment scores. In a specific embodiment, the number of times sufficient to generate a distribution is at least 20, 30, 40, 50, 60, 70, 80, 90, 100, 150,
200 or 500 times. In another specific embodiment, the low fraction is less than 0.05, while in other embodiments it is less than 0.04, 0.03, 0.02, 0.01, 0.005 or 0.001.

[0183] In one embodiment of the foregoing methods, the distribution of randomized enrichment scores is a normal distribution. The difference metric may be any difference metric, such as a difference in arithmetic means, a difference in t-test scores, or a difference in signal-to-noise ratio. Similarly, the non-parametric statistic may be any non-parametric statistic, such as Mann-Whitney, Wald-Wolfowitz or more preferably Kolmogorov-Smirnov.

[0184] The biomarker set typically comprises elements of a pathway, such as a metabolic pathway, a biochemical pathway, a signaling pathway, or any set of genes which share a common biological function or which are coordinately regulated. In a preferred embodiment, the biomarker is selected from the group consisting of a nucleic acid, a polypeptide, a metabolite and a genotype. For example, when the biomarker set comprises genes encoding enzymes of a metabolic pathway, such as glycolytic enzymes, the biomarkers may comprise the genotype of the glycolytic genes. In the embodiment where the biomarker is a genotype, the genotype of all or a subset of the glycolytic genes may be determined by DNA sequencing, and the expression level of the genotype would correspond to the amount of polymorphic DNA, i.e. 0, 1 or 2 copies of a wild-type copy of the gene for a diploid cell or organism. Alternatively, the number of mutant copies, or of a specific mutation, can be used in determining the expression level of the genotype.

[0185] In other embodiments where the biomarker is the mRNA of each of, or of a subset of the glycolytic enzymes, the expression level of the mRNA may be determined, or the expression level of a particular splice isoform, using methods well known in the art, such as by northern blots or microarray analysis. In other embodiments where the biomarker is the protein of each of, or of a subset of the glycolytic enzymes, the level of expression may comprise total protein levels or levels of a particular modified form of the protein, such as the level of phosphorylated or glycosylated protein, both of which may be determined using immunological techniques. Finally, when the biomarker is a metabolite, such as the product whose formation is catalyzed by the glycolytic enzyme, the expression level of the metabolite is its concentration in the biomarker sample, such as its cellular concentration. Metabolite levels may be determined using chromatographic means or other means well known in the art. The reference to the glycolytic pathway in the examples above is meant to be illustrative and non-limiting, and the same principles may apply to any other pathway or biomarker set.

[0186] In one embodiment, experimental groups comprise organisms, such as mammals, or more preferably humans. In such embodiments, the biomarker sample comprises a sample of cells from the organism, or a sample of bodily fluid, such as serum, saliva, tears, sweat or semen. The difference between the first and second experimental groups may be a disease state. For example, the first experimental group may be afflicted with a disease or disorder, while the second group is not. In a specific embodiment, the disorder is characterized by defective glucose metabolism, such as type II diabetes. In another embodiment where the experimental groups comprise organisms, the first and second experimental groups may differ by any measurable characteristic. For example, the groups may differ by a physical characteristic, such as weight, age, sex, sexual preference, eyesight, percent body fat, percent lean muscle mass, height, right vs. left handedness or race. The groups may also differ by a psychological characteristic, such as intelligence, verbal skills, emotional intelligence and even personality types, such as those determined by the Myers-Briggs Type Indicator. The groups may also differ by emotional state, such as relaxed vs. emotionally stressed subjects, or cheerful vs. gloomy subjects. The subjects may also differ by the presence or absence of one or more mutations, such as subjects having mutations in an oncogene. In another embodiment, the two experimental groups differ in that one group has been treated with at least one agent, such as a drug.

[0187] In another embodiment, experimental groups comprise cells. The cells may comprise primary cells, cell lines, or come in the form of tissue samples. As described above for organisms, the cells in the two experimental groups may differ by a physical characteristic or differ genetically. In a preferred embodiment, the two experimental groups differ in that the cells in one of the experimental groups have been treated with an agent, such as with a compound or drug. In such embodiments, the methods described herein may be used to detect subtle changes that the agent may have on the biomarker set, such as a biochemical or signaling pathway.

X. Nucleic Acid and Polypeptide Agents

[0188] In some embodiments of methods described herein, an agent which reduces the expression of Errα, Gabpa, Gabpb, or any other gene, or an agent used in any of the methods of screening agents described herein, comprises a double stranded RNAi molecule, a ribozyme, or an antisense nucleic acid directed at said gene.

[0189] Certain embodiments of the invention make use of materials and methods for effecting knockdown of one form of a gene, by means of RNA interference (RNAi). RNAi is a process of sequence-specific post-transcriptional gene repression which can occur in eukaryotic cells. In general, this process involves degradation of an mRNA of a particular sequence induced by double-stranded RNA (dsRNA) that is homologous to that sequence. For example, the expression of a long dsRNA corresponding to the sequence of a particular single-stranded mRNA (ss mRNA) will labilize that message, thereby "interfering" with expression of the corresponding gene. Accordingly, any selected gene may be repressed by introducing a dsRNA which corresponds to all or a substantial part of the mRNA for that gene. It appears that when a long dsRNA is expressed, it is initially processed by a ribonuclease III into shorter dsRNA oligonucleotides of in some instances as few as 21 to 22 base pairs in length. Furthermore, RNAi may be effected by introduction or expression of relatively short homologous dsRNAs. Indeed the use of relatively short homologous dsRNAs may have certain advantages as discussed below.

[0190] Mammalian cells have at least two pathways that are affected by double-stranded RNA (dsRNA). In the RNAi (sequence-specific) pathway, the initiating dsRNA is first broken into short interfering (si) RNAs, as described above. The siRNAs have sense and antisense strands of about 21 nucleotides that form approximately 19 nucleotide siRNAs with overhangs of two nucleotides at each 3' end. Short interfering RNAs are thought to provide the sequence information that allows a specific messenger RNA to be targeted for degradation. In contrast, the nonspecific pathway is triggered by dsRNA of any sequence, as long as it is at least about 30 base pairs in length. The nonspecific effects occur because dsRNA activates two enzymes: PKR, which in its active form phosphorylates the translation initiation factor eIF2 to shut down all protein synthesis, and 2',5' oligoadenylate synthetase (2',5'-AS), which synthesizes a molecule that activates RNase L, a nonspecific enzyme that targets all mRNAs. The nonspecific pathway may represent a host response to stress or viral infection, and, in general, the effects of the nonspecific pathway are preferably minimized under preferred methods of the present invention. Significantly, longer dsRNAs appear to be required to induce the nonspecific pathway and, accordingly, dsRNAs shorter than about 30 base pairs are preferred to effect gene repression by RNAi (see Hunter et al. (1975) J Biol Chem 250: 409-17; Manche et al. (1992) Mol Cell Biol 12: 5239-48; Minks et al. (1979) J Biol Chem 254: 10180-3; and Elbashir et al. (2001) Nature 411: 494-8).

[0191] RNAi has been shown to be effective in reducing or eliminating the expression of a gene in a number of different organisms including Caenorhabditis elegans (see e.g. Fire et al. (1998) Nature 391: 806-11), mouse eggs and embryos (Wianny et al. (2000) Nature Cell Biol 2: 70-5; Svoboda et al. (2000) Development 127: 4147-56), and cultured RAT-1 fibroblasts (Bahramian et al. (1999) Mol Cell Biol 19: 274-83), and appears to be an anciently evolved pathway available in eukaryotic plants and animals (Sharp (2001) Genes Dev. 15: 485-90). RNAi has proven to be an effective means of decreasing gene expression in a variety of cell types including HeLa cells, NIH/3T3 cells, COS cells, 293 cells and BHK-21 cells, and typically decreases expression of a gene to lower levels than that achieved using antisense techniques and, indeed, frequently eliminates expression entirely (see Bass (2001) Nature 411: 428-9). In mammalian cells, siRNAs are effective at concentrations that are several orders of magnitude below the concentrations typically used in antisense experiments (Elbashir et al. (2001) Nature 411: 494-8).

[0192] The double stranded oligonucleotides used to effect RNAi are preferably less than 30 base pairs in length and, more preferably, comprise about 25, 24, 23, 22, 21, 20, 19, 18 or 17 base pairs of ribonucleic acid. Optionally the dsRNA oligonucleotides of the invention may include 3' overhang ends. Exemplary 2-nucleotide 3' overhangs may be composed of ribonucleotide residues of any type and may even be composed of 2'-deoxythymidine residues, which lowers the cost of RNA synthesis and may enhance nuclease resistance of siRNAs in the cell culture medium and within transfected cells (see Elbashir et al. (2001) Nature 411: 494-8). Longer dsRNAs of 50, 75, 100 or even 500 base pairs or more may also be utilized in certain embodiments of the invention. Exemplary concentrations of dsRNAs for effecting RNAi are about 0.05 nM, 0.1 nM, 0.5 nM, 1.0 nM, 1.5 nM, 25 nM or 100 nM, although other concentrations may be utilized depending upon the nature of the cells treated, the gene target and other factors readily discernable to the skilled artisan. Exemplary dsRNAs may be synthesized chemically or produced in vitro or in vivo using appropriate expression vectors. Exemplary synthetic RNAs include 21 nucleotide RNAs chemically synthesized using methods known in the art (e.g. Expedite RNA phosphoramidites and thymidine phosphoramidite (Proligo, Germany)). Synthetic oligonucleotides are preferably deprotected and gel-purified using methods known in the art (see e.g. Elbashir et al. (2001) Genes Dev. 15: 188-200). Longer RNAs may be transcribed from promoters, such as T7 RNA polymerase promoters, known in the art. A single RNA target, placed in both possible orientations downstream of an in vitro promoter, will transcribe both strands of the target to create a dsRNA oligonucleotide of the desired target sequence. For example, if Errα is the target of the double stranded RNA, any of the above RNA species will be designed to include a portion of nucleic acid sequence of the Errα gene.

[0193] The specific sequence utilized in design of the oligonucleotides may be any contiguous sequence of nucleotides contained within the expressed gene message of the target. Programs and algorithms, known in the art, may be used to select appropriate target sequences. In addition, optimal sequences may be selected utilizing programs designed to predict the secondary structure of a specified single stranded nucleic acid sequence and allowing selection of those sequences likely to occur in exposed single stranded regions of a folded mRNA. Methods and compositions for designing appropriate oligonucleotides may be found, for example, in U.S. Pat. No. 6,251,588, the contents of which are incorporated herein by reference. Messenger RNA (mRNA) is generally thought of as a linear molecule which contains the information for directing protein synthesis within the sequence of ribonucleotides; however, studies have revealed a number of secondary and tertiary structures that exist in most mRNAs. Secondary structure elements in RNA are formed largely by Watson-Crick type interactions between different regions of the same RNA molecule. Important secondary structural elements include intramolecular double stranded regions, hairpin loops, bulges in duplex RNA and internal loops. Tertiary structural elements are formed when secondary structural elements come in contact with each other or with single stranded regions to produce a more complex three dimensional structure. A number of researchers have measured the binding energies of a large number of RNA duplex structures and have derived a set of rules which can be used to predict the secondary structure of RNA (see e.g. Jaeger et al. (1989) Proc. Natl. Acad. Sci. USA 86: 7706; and Turner et al. (1988) Annu. Rev. Biophys. Biophys. Chem. 17: 167). The rules are useful in identification of RNA structural elements and, in particular, for identifying single stranded RNA regions which may represent preferred segments of the mRNA to target for silencing by RNAi, ribozyme or antisense technologies. Accordingly, preferred segments of the mRNA target can be identified for design of the RNAi mediating dsRNA oligonucleotides as well as for design of appropriate ribozyme and hammerhead ribozyme compositions of the invention.

[0194] The dsRNA oligonucleotides may be introduced into the cell by transfection with a heterologous target gene using carrier compositions such as liposomes, which are known in the art, e.g. Lipofectamine 2000 (Life Technologies) as described by the manufacturer for adherent cell lines. Transfection of dsRNA oligonucleotides for targeting endogenous genes may be carried out using Oligofectamine (Life Technologies). Transfection efficiency may be checked using fluorescence microscopy for mammalian cell lines after co-transfection of hGFP-encoding pAD3 (Kehlenbach
et al. (1998) J Cell Biol 141: 863-74). The effectiveness of the RNAi may be assessed by any of a number of assays following introduction of the dsRNAs. Further compositions, methods and applications of RNAi technology are provided in U.S. Pat. Nos. 6,278,039, 5,723,750 and 5,244,805, which are incorporated herein by reference.

[0195] Ribozyme molecules designed to catalytically cleave Errα or Gabpa mRNA transcripts can also be used to prevent translation of Errα or Gabpa (see, e.g., PCT International Publication WO90/11364, published Oct. 4, 1990; Sarver et al. (1990) Science 247: 1222-1225 and U.S. Pat. No. 5,093,246). Ribozymes are enzymatic RNA molecules capable of catalyzing the specific cleavage of RNA. (For a review, see Rossi (1994) Current Biology 4: 469-471). The mechanism of ribozyme action involves sequence specific hybridization of the ribozyme molecule to complementary target RNA, followed by an endonucleolytic cleavage event. The composition of ribozyme molecules preferably includes one or more sequences complementary to the gene whose activity is to be reduced.

[0196] While ribozymes that cleave mRNA at site specific recognition sequences can be used to destroy target mRNAs, the use of hammerhead ribozymes is preferred. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. Preferably, the target mRNA has the following sequence of two bases: 5'-UG-3'. The construction and production of hammerhead ribozymes is well known in the art and is described more fully in Haseloff and Gerlach (1988) Nature 334: 585-591 (and see PCT Appln. No. WO89/05852, the contents of which are incorporated herein by reference). Hammerhead ribozyme sequences can be embedded in a stable RNA such as a transfer RNA (tRNA) to increase cleavage efficiency in vivo (Perriman et al. (1995) Proc. Natl. Acad. Sci. USA 92: 6175-79; de Feyter and Gaudron, Methods in Molecular Biology, Vol. 74, Chapter 43, "Expressing Ribozymes in Plants", Edited by Turner, P. C., Humana Press Inc., Totowa, N.J.). In particular, RNA polymerase III-mediated expression of tRNA fusion ribozymes is well known in the art (see Kawasaki et al. (1998) Nature 393: 284-9; Kuwabara et al. (1998) Nature Biotechnol. 16: 961-5; Kuwabara et al. (1998) Mol. Cell 2: 617-27; Koseki et al. (1999) J Virol 73: 1868-77; Kuwabara et al. (1999) Proc Natl Acad Sci USA 96: 1886-91; Tanabe et al. (2000) Nature 406: 473-4). There are typically a number of potential hammerhead ribozyme cleavage sites within a given target cDNA sequence. Preferably the ribozyme is engineered so that the cleavage recognition site is located near the 5' end of the target mRNA, to increase efficiency and minimize the intracellular accumulation of non-functional mRNA transcripts. Furthermore, the use of any cleavage recognition site located in the target sequence encoding different portions of the C-terminal amino acid domains of, for example, long and short forms of the target would allow the selective targeting of one or the other form of the target, and thus have a selective effect on one form of the target gene product.

[0197] In addition, ribozymes possess highly specific endoribonuclease activity, which autocatalytically cleaves the target sense mRNA. The present invention extends to ribozymes which hybridize to a sense mRNA encoding Errα or Gabpa or any other genes of interest described herein, thereby hybridizing to the sense mRNA and cleaving it, such that it is no longer capable of being translated to synthesize a functional polypeptide product.

[0198] The ribozymes of the present invention also include RNA endoribonucleases (hereinafter "Cech-type ribozymes") such as the one which occurs naturally in Tetrahymena thermophila (known as the IVS, or L-19 IVS RNA) and which has been extensively described by Thomas Cech and collaborators (Zaug, et al. (1984) Science 224: 574-578; Zaug, et al. (1986) Science 231: 470-475; Zaug, et al. (1986) Nature 324: 429-433; published International patent application No. WO88/04300 by University Patents Inc.; Been, et al. (1986) Cell 47: 207-216). The Cech-type ribozymes have an eight base pair active site which hybridizes to a target RNA sequence whereafter cleavage of the target RNA takes place. The invention encompasses those Cech-type ribozymes which target eight base-pair active site sequences that are present in a target gene or nucleic acid sequence.

[0199] Ribozymes can be composed of modified oligonucleotides (e.g., for improved stability, targeting, etc.) and should be delivered to cells which express the target gene in vivo. A preferred method of delivery involves using a DNA construct "encoding" the ribozyme under the control of a strong constitutive pol III or pol II promoter, so that transfected cells will produce sufficient quantities of the ribozyme to destroy endogenous target messages and inhibit translation. Because ribozymes, unlike antisense molecules, are catalytic, a lower intracellular concentration is required for efficiency.

[0200] In a long target RNA chain, significant numbers of target sites are not accessible to the ribozyme because they are hidden within secondary or tertiary structures (Birikh et al. (1997) Eur J Biochem 245: 1-16). To overcome the problem of target RNA accessibility, computer generated predictions of secondary structure are typically used to identify targets that are most likely to be single-stranded or have an "open" configuration (see Jaeger et al. (1989) Methods Enzymol 183: 281-306). Other approaches utilize a systematic approach to predicting secondary structure which involves assessing a huge number of candidate hybridizing oligonucleotide molecules (see Milner et al. (1997) Nat Biotechnol 15: 537-41; and Patzel and Sczakiel (1998) Nat Biotechnol 16: 64-8). Additionally, U.S. Pat. No. 6,251,588, the contents of which are hereby incorporated herein, describes methods for evaluating oligonucleotide probe sequences so as to predict the potential for hybridization to a target nucleic acid sequence. The method of the invention provides for the use of such methods to select preferred segments of a target mRNA sequence that are predicted to be single-stranded and, further, for the opportunistic utilization of the same or substantially identical target mRNA sequence, preferably comprising about 10-20 consecutive nucleotides of the target mRNA, in the design of both the RNAi oligonucleotides and ribozymes of the invention.

[0201] In other embodiments of methods described herein, an agent which modulates the activity of Errα, Gabpa, Gabpb, or any other gene, comprises an antibody or fragment thereof. An antibody may increase or decrease the activity of any of the subject polypeptides, and it may increase or decrease the binding of two proteins into a complex, such as an Errα/PGC-1α complex.

[0202] Chickens, mammals, such as a mouse, a hamster, a goat, a guinea pig or a rabbit, can be immunized with an immunogenic form of the Errα, Gabpa, Gabpb, or any polypeptide provided by the invention, or with peptide variants thereof (e.g., an antigenic fragment which is capable of eliciting an antibody response). Techniques for conferring immunogenicity on a protein or peptide include conjugation to carriers or other techniques well known in the art. For instance, a peptidyl portion of one of the subject proteins can be administered in the presence of adjuvant. The progress of immunization can be monitored by detection of antibody titers in plasma or serum. Standard ELISA or other immunoassays can be used with the immunogen as antigen to assess the levels of antibodies.

[0203] Following immunization, antisera can be obtained and, if desired, polyclonal antibodies against the target protein can be further isolated from the serum. To produce monoclonal antibodies, antibody producing cells (lymphocytes) can be harvested from an immunized animal and fused by standard somatic cell fusion procedures with immortalizing cells such as myeloma cells to yield hybridoma cells. Such techniques are well known in the art, and include, for example, the hybridoma technique (originally developed by Kohler and Milstein, Nature, 256: 495-497, 1975), as well as the human B cell hybridoma technique (Kozbor et al., Immunology Today, 4: 72, 1983), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. pp. 77-96, 1985). Hybridoma cells can be screened immunochemically for production of antibodies specifically reactive to the peptide immunogen and the monoclonal antibodies isolated. Accordingly, another aspect of the invention provides hybridoma cell lines which produce the antibodies described herein. The antibodies can then be tested for their effects on the activity and expression of the protein to which they are directed.

[0204] The term antibody as used herein is intended to include fragments which are also specifically reactive with a protein described herein or a complex comprising such protein. Antibodies can be fragmented using conventional techniques and the fragments screened in the same manner as described above for whole antibodies. For example, F(ab')2 fragments can be generated by treating antibody with pepsin. The resulting F(ab')2 fragment can be treated to reduce disulfide bridges to produce Fab' fragments. The antibody of the present invention is further intended to include bispecific and chimeric molecules, as well as single chain (scFv) antibodies.

[0205] The subject antibodies include trimeric antibodies and humanized antibodies, which can be prepared as described, e.g., in U.S. Pat. No. 5,585,089. Also within the scope of the invention are single chain antibodies. All of these modified forms of antibodies as well as fragments of antibodies are intended to be included in the term "antibody".

[0206] In yet another embodiment of the methods described herein, the agent is a polypeptide, such as an Errα polypeptide or a Gabp polypeptide, or a fragment thereof which retains a biological activity or which antagonizes a biological activity of the wild-type polypeptide. For example, an Errα stimulatory agent comprises an active Errα protein, or a nucleic acid molecule encoding Errα that has been introduced into the cell. In another embodiment, the agent is a mutant polypeptide which inhibits Errα protein activity. Examples of such inhibitory agents include a nucleic acid molecule encoding a dominant negative Errα protein, such as a fragment of Errα, which may compete with wild-type Errα protein for DNA binding or complex formation with PGC-1.

XI. Therapeutics

[0207] In one aspect, the invention provides methods of treating disorders in a subject comprising the administration of an agent or of a composition comprising an agent, such as a therapeutic agent. "Therapeutic agent" or "therapeutic" refers to an agent capable of having a desired biological effect on a host. Chemotherapeutic and genotoxic agents are examples of therapeutic agents that are generally known to be chemical in origin, as opposed to biological, or cause a therapeutic effect by a particular mechanism of action, respectively. Examples of therapeutic agents of biological origin include growth factors, hormones, and cytokines. A variety of therapeutic agents are known in the art and may be identified by their effects. Certain therapeutic agents are capable of regulating cell proliferation and differentiation. Examples include chemotherapeutic nucleotides, drugs, hormones, non-specific (non-antibody) proteins, oligonucleotides (e.g., antisense oligonucleotides that bind to a target nucleic acid sequence (e.g., mRNA sequence)), peptides, and peptidomimetics.

[0208] In one embodiment, the compositions are pharmaceutical compositions. Pharmaceutical compositions for use in accordance with the present invention may be formulated in conventional manner using one or more physiologically acceptable carriers or excipients. Thus, the compounds and their physiologically acceptable salts and solvates may be formulated for administration by, for example, aerosol, intravenous, oral or topical route. The administration may comprise intralesional, intraperitoneal, subcutaneous, intramuscular or intravenous injection; infusion; liposome-mediated delivery; topical, intrathecal, gingival pocket, per rectum, intrabronchial, nasal, transmucosal, intestinal, oral, ocular or otic delivery.

[0209] An exemplary composition of the invention comprises a compound capable of modulating the expression or activity of a transcriptional regulator, such as a PGC-1, Gabp or Errα polypeptide, with a delivery system, such as a liposome system, and optionally including an acceptable excipient. In a preferred embodiment, the composition is formulated for injection.

[0210] Techniques and formulations generally may be found in Remington's Pharmaceutical Sciences, Mack Publishing Co., Easton, Pa. For systemic administration, injection is preferred, including intramuscular, intravenous, intraperitoneal, and subcutaneous. For injection, the compounds of the invention can be formulated in liquid solutions, preferably in physiologically compatible buffers such as Hank's solution or Ringer's solution. In addition, the compounds may be formulated in solid form and redissolved or suspended immediately prior to use. Lyophilized forms are also included.

[0211] For oral administration, the pharmaceutical compositions may take the form of, for example, tablets or capsules prepared by conventional means with pharmaceutically acceptable excipients such as binding agents (e.g., pregelatinised maize starch, polyvinylpyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents (e.g., sodium lauryl sulphate). The tablets may be coated by methods well known in the art. Liquid preparations for oral administration may take the form of, for example, solutions, syrups or suspensions, or they may be presented as a dry product for constitution with water or other suitable vehicle before use. Such liquid preparations may be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily esters, ethyl alcohol or fractionated vegetable oils); and preservatives (e.g., methyl or propyl-p-hydroxybenzoates or sorbic acid). The preparations may also contain buffer salts, flavoring, coloring and sweetening agents as appropriate.

[0212] Preparations for oral administration may be suitably formulated to give controlled release of the active compound. For buccal administration the compositions may take the form of tablets or lozenges formulated in conventional manner. For administration by inhalation, the compounds for use according to the present invention are conveniently delivered in the form of an aerosol spray presentation from pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch.

[0213] The compounds may be formulated for parenteral administration by injection, e.g., by bolus injection or continuous infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampoules or in multi-dose containers, with an added preservative. The compositions may take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents. Alternatively, the active ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use.

[0214] The compounds may also be formulated in rectal compositions such as suppositories or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other glycerides.

[0215] In addition to the formulations described previously, the compounds may also be formulated as a depot preparation. Such long acting formulations may be administered by implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.

[0216] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration bile salts and fusidic acid derivatives. In addition, detergents may be used to facilitate permeation. Transmucosal administration may be through nasal sprays or using suppositories. For topical administration, the oligomers of the invention are formulated into ointments, salves, gels, or creams as generally known in the art. A wash solution can be used locally to treat an injury or inflammation to accelerate healing.

[0217] The compositions may, if desired, be presented in a pack or dispenser device which may contain one or more unit dosage forms containing the active ingredient. The pack may for example comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may be accompanied by instructions for administration.

[0218] For therapies involving the administration of nucleic acids, the oligomers of the invention can be formulated for a variety of modes of administration, including systemic and topical or localized administration. Techniques and formulations generally may be found in Remington's Pharmaceutical Sciences, Mack Publishing Co., Easton, Pa. For systemic administration, injection is preferred, including intramuscular, intravenous, intraperitoneal, intranodal, and subcutaneous. For injection, the oligomers of the invention can be formulated in liquid solutions, preferably in physiologically compatible buffers such as Hank's solution or Ringer's solution. In addition, the oligomers may be formulated in solid form and redissolved or suspended immediately prior to use. Lyophilized forms are also included.

[0219] Systemic administration can also be by transmucosal or transdermal means, or the compounds can be administered orally. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration bile salts and fusidic acid derivatives. In addition, detergents may be used to facilitate permeation. Transmucosal administration may be through nasal sprays or using suppositories. For oral administration, the oligomers are formulated into conventional oral administration forms such as capsules, tablets, and tonics. For topical administration, oligomers may be formulated into ointments, salves, gels, or creams as generally known in the art.

[0220] Toxicity and therapeutic efficacy of the agents and compositions of the present invention can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit large therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
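The therapeutic index described above (the ratio LD50/ED50) can be illustrated with a minimal sketch. The function names, the log-linear interpolation used to estimate a 50% dose from dose-response points, and all numeric values are assumptions made for illustration; the text does not prescribe a particular estimation procedure.

```python
import math
from typing import Sequence


def dose_at_half_response(doses: Sequence[float], fractions: Sequence[float]) -> float:
    """Estimate the dose at which 50% of the population responds.

    Assumes `doses` are positive and ascending, and `fractions` gives the
    fraction of subjects responding at each dose. Uses linear interpolation
    on log10(dose) between the two points that bracket 0.5 (an assumed,
    simplified stand-in for a fitted dose-response curve).
    """
    if len(doses) != len(fractions) or len(doses) < 2:
        raise ValueError("need at least two matched dose/response points")
    for i in range(1, len(doses)):
        low, high = fractions[i - 1], fractions[i]
        if low < 0.5 <= high:
            t = (0.5 - low) / (high - low)
            log_low = math.log10(doses[i - 1])
            log_high = math.log10(doses[i])
            return 10 ** (log_low + t * (log_high - log_low))
    raise ValueError("responses do not bracket 50%")


def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return LD50 / ED50; larger values indicate a wider safety margin."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical dose-response data (dose in mg/kg, fraction responding).
    ed50 = dose_at_half_response([1, 10, 100], [0.0, 0.5, 1.0])
    ld50 = dose_at_half_response([100, 1000], [0.1, 0.9])
    print(therapeutic_index(ld50, ed50))
```

Both doses must be expressed in the same units for the ratio to be meaningful.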

[0221] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[0222] In one embodiment of the methods described herein, the effective amount of the agent is between about 1 mg and about 50 mg per kg body weight of the subject. In one embodiment, the effective amount of the agent is between about 2 mg and about 40 mg per kg body weight of the subject. In one embodiment, the effective amount of the agent is between about 3 mg and about 30 mg per kg body weight of the subject. In one embodiment, the effective amount of the agent is between about 4 mg and about 20 mg per kg body weight of the subject. In one embodiment, the effective amount of the agent is between about 5 mg and about 10 mg per kg body weight of the subject.

[0223] In one embodiment of the methods described herein, the agent is administered at least once per day. In one embodiment, the agent is administered daily. In one embodiment, the agent is administered every other day. In one embodiment, the agent is administered every 6 to 8 days. In one embodiment, the agent is administered weekly.

[0224] As for the amount of the compound and/or agent for administration to the subject, one skilled in the art would know how to determine the appropriate amount. As used herein, a dose or amount would be one in sufficient quantities to either inhibit the disorder, treat the disorder, treat the subject or prevent the subject from becoming afflicted with the disorder. This amount may be considered an effective amount. A person of ordinary skill in the art can perform simple titration experiments to determine what amount is required to treat the subject. The dose of the composition of the invention will vary depending on the subject and upon the particular route of administration used. In one embodiment, the dosage can range from about 0.1 to about 100,000 µg/kg body weight of the subject. Based upon the composition, the dose can be delivered continuously, such as by continuous pump, or at periodic intervals, for example, on one or more separate occasions. Desired time intervals of multiple doses of a particular composition can be determined without undue experimentation by one skilled in the art.

[0225] The effective amount may be based upon, among other things, the size of the compound, the biodegradability of the compound, the bioactivity of the compound and the bioavailability of the compound. If the compound does not degrade quickly, is bioavailable and highly active, a smaller amount will be required to be effective. The effective amount will be known to one of skill in the art; it will also be dependent upon the form of the compound, the size of the compound and the bioactivity of the compound. One of skill in the art could routinely perform empirical activity tests for a compound to determine the bioactivity in bioassays and thus determine the effective amount. In one embodiment of the above methods, the effective amount of the compound comprises from about 1.0 ng/kg to about 100 mg/kg body weight of the subject. In another embodiment of the above methods, the effective amount of the compound comprises from about 100 ng/kg to about 50 mg/kg body weight of the subject. In another embodiment of the above methods, the effective amount of the compound comprises from about 1 µg/kg to about 10 mg/kg body weight of the subject. In another embodiment of the above methods, the effective amount of the compound comprises from about 100 µg/kg to about 1 mg/kg body weight of the subject.

[0226] As for when the compound, composition and/or agent is to be administered, one skilled in the art can determine when to administer such compound and/or agent. The administration may be constant for a certain period of time or periodic and at specific intervals. The compound may be delivered hourly, daily, weekly, monthly, yearly (e.g., in a time release form) or as a one time delivery. The delivery may be continuous delivery for a period of time, e.g., intravenous delivery. In one embodiment of the methods described herein, the agent is administered at least once per day. In one embodiment of the methods described herein, the agent is administered daily. In one embodiment of the methods described herein, the agent is administered every other day. In one embodiment of the methods described herein, the agent is administered every 6 to 8 days. In one embodiment of the methods described herein, the agent is administered weekly.

EXEMPLIFICATION

[0227] The invention now being generally described, it will be more readily understood by reference to the following examples, which are included merely for purposes of illustration of certain aspects and embodiments of the present invention, and are not intended to limit the invention, as one skilled in the art would recognize from the teachings hereinabove and the following examples, that other DNA microarrays, cell types, agents, constructs, or data analysis methods, all without limitation, can be employed, without departing from the scope of the invention as claimed.

[0228] The contents of any patents, patent applications, patent publications, or scientific articles referenced anywhere in this application are herein incorporated in their entirety.

[0229] The practice of the present invention will employ, where appropriate and unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, virology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are described in the literature. See, for example, Molecular Cloning: A Laboratory Manual, 3rd Ed., ed. by Sambrook and Russell (Cold Spring Harbor Laboratory Press: 2001); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Using Antibodies, Second Edition by Harlow and Lane, Cold Spring Harbor Press, New York, 1999; Current Protocols in Cell Biology, ed. by

Bonifacino, Dasso, Lippincott-Schwartz, Harford, and Yamada, John Wiley and Sons, Inc., New York, 1999; and PCR Protocols, ed. by Bartlett et al., Humana Press, 2003.

[0230] The tables for all the Experimental genes are listed at the end of the third experimental series.

First Experimental Series

[0231] Described herein are results of RNA expression profiling of 43 individuals with varying levels of insulin resistance, carried out to systematically identify pathways and processes operative in diabetes. The 43 individuals were: 17 with normal glucose tolerance (NGT), 8 with impaired glucose tolerance (IGT), and 18 with type 2 diabetes (DM2). No single gene showed statistically significant expression differences between the diagnostic classes. Therefore, Applicants developed a new analytical technique, called Gene Set Enrichment Analysis (GSEA), that seeks to determine whether members of gene sets (e.g., pathways) are consistently different, even though modestly or slightly, in one diagnostic class versus another. Application of GSEA to the microarray data demonstrated that the oxidative phosphorylation pathway (OXPHOS) was significantly different. Of the approximately 106 members in this pathway, 94 are diminished in DM2 versus NGT. The effect is subtle, with each gene only showing a 15-20% decrease.

[0232] Also described herein are results of work carried out to define mechanisms underlying this coordinated decrease in expression of OXPHOS genes. Analysis of the expression of these OXPHOS genes in a public atlas of mouse gene expression showed that 2/3 of all OXPHOS genes are tightly co-regulated across all 47 tissues examined, and that they are highly expressed at the major sites of insulin mediated glucose uptake (brown fat, heart, and skeletal muscle). This group of genes is referred to herein as "OXPHOS-CR," for "OXPHOS Co-Regulated." Applicants hypothesized that the transcriptional co-activator PPARGC1 (also known as PGC-1α) was responsible for this transcriptional co-regulation. To prove this, Applicants infected mouse muscle cell lines with PPARGC1 and demonstrated that the OXPHOS-CR genes are specifically induced in a time-dependent manner over a three day period. As described in detail below, GSEA was re-applied to the diabetes data, this time testing whether OXPHOS-CR is specifically differentially expressed between the patient classes. Results showed that this accounts for the bulk of the signal detected in the comparison between NGT and DM2, and moreover, appears to be very different between NGT and IGT as well, suggesting that derangements in this group of genes are an early event. Previous studies have suggested that total body aerobic capacity (VO2max) is predictive of future insulin resistance and diabetes. Interestingly, Applicants found a striking relationship between the mean expression of the OXPHOS-CR genes and total body oxygen consumption.

The following experimental procedures were followed in the first experimental series:

Methods

[0233] Human Subjects and Clinical Measurements. Applicants selected 54 men of similar age but with varying degree of glucose tolerance who had been participating in The Malmö Prevention Study in southern Sweden for more than 12 years (Eriksson et al. Diabetologia 33, 526-31. (1990)). The investigation was approved by the Ethics Committee at Lund University, and informed consent was obtained from each of the volunteers. All subjects were Northern Europeans, and their glucose tolerance status was assessed using standardized 75-gram OGTT and by applying WHO85 criteria (Eriksson et al. Diabetologia 33, 526-31. (1990)). At the initial OGTT performed 10 years earlier, none of the men had DM2 (Eriksson et al. Diabetologia 33, 526-31. (1990)). An OGTT performed at the time of the biopsy showed that 20 of the subjects had developed manifest type 2 diabetes (DM2), 8 fulfilled the criteria for IGT and 26 had normal glucose tolerance (NGT). As diabetes was diagnosed at the time of the repeat OGTT, none of the subjects were on medication for hyperglycemia or diabetes-related conditions.

[0234] Anthropometric and insulin sensitivity measures were performed as previously described (Groop, L. et al. Diabetes 45, 1585-93. (1996)). Height, weight, waist to hip ratio (WHR) and fat free mass were measured on the day of the euglycemic clamp. Maximal oxygen uptake (VO2max) was measured using an incremental work-conducted upright exercise test with a bicycle ergometer (Monark, Varberg, Sweden) combined with continuous analysis of expiratory gases and minute ventilation. Exercise was started at a workload varying between 30-100 W depending on the previous history of endurance training or exercise habits and then increased by 20-50 W every 3 min, until a perceived exhaustion or a respiratory quotient of 1.0 was reached. Maximal aerobic capacity was defined as the VO2 during the last 30 s of exercise and is expressed per lean body mass. Insulin sensitivity was determined with a standard 2 hour euglycemic hyperinsulinemic clamp combined with infusion of tritiated glucose to estimate endogenous glucose production and indirect calorimetry (Deltatrac, Datex Instrumentarium, Finland) to estimate substrate oxidation (Groop, L. et al. Diabetes 45, 1585-93. (1996)). The rate of glucose uptake (also referred to as the M-value) was calculated from the infusion rate of glucose and the residual rate of endogenous glucose production measured by the tritiated glucose tracer during the clamp.

[0235] Percutaneous muscle biopsies (20-50 mg) were taken from the vastus lateralis muscle under local anesthesia (1% lidocaine) after the 2-h euglycemic hyperinsulinemic clamp using a Bergström needle (Eriksson et al. Diabetes 43, 805-8. (1994)). Fiber-type composition and glycogen concentration were determined as previously described (Schalin et al. Eur J Clin Invest 25, 693-8. (1995)). Quantification and calculation of the fibers was performed using the COMFAS image analysis system (Scan Beam, Hadsund, Denmark).

[0236] Cell Culture and Adenoviral Infection. Mouse myoblasts (C2C12 cells) were cultured and differentiated into myotubes as previously described (Wu, Z. et al. Cell 98, 115-24. (1999)). After 3 days of differentiation, they were infected with an adenovirus containing either green fluorescent protein (GFP) or PGC-1α as previously described (Lin, J. et al. Nature 418, 797-801. (2002)).

[0237] mRNA Isolation, Target Preparation, and Hybridization. Targets were prepared from human biopsy or mouse cell lines as previously described (Golub, T. R. et al. Science 286, 531-7. (1999)) and hybridized to the Affymetrix HG-U133A or MG-U74Av2 chip, respectively. Only scans with 10% Present calls and a GAPDH 3'/GAPDH 5' expression ratio < 1.33 were selected. Applicants obtained gene expression data for 54 human samples, but only 43 met these selection criteria; the analysis in this paper is limited to these 43 individuals.

[0238] Data Scaling and Filtering. Human microarray data were subjected to global scaling to correct for intensity related biases. For each scan applicants binned all genes according to their expression intensity in a designated reference scan, and recorded the median intensity of that bin to serve as a calibration curve for that scan. Applicants then scaled the expression to the calibration curve of one NGT scan (patient mm.12), which applicants visually inspected and deemed high quality, using a linear interpolation between the calibration points. Applicants then filtered the 22,283 genes on the HG-U133A chip to eliminate genes that had extremely low expression. A previous study suggested that an Affymetrix average difference level of 100 corresponds to an extremely low level ("not expressed") (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70. (2002)). Therefore, applicants only considered genes for which there was at least a single measure (average difference) greater than 100. Of the 22,283 genes on the HG-U133A chip, 10,983 genes met this filtering criterion.

[0239] Single Gene Microarray Analysis. Microarray analysis to identify individual genes that are significantly different between diagnostic classes was performed using two software packages. First, marker analysis was performed as previously described using GeneCluster. Significance of individual genes was tested by permutation of class labels (5000 iterations), as previously described (Golub, T. R. et al. Science 286, 531-7. (1999)). Applicants used both the t-test and signal to noise difference metrics in these analyses, both yielding comparable results. Second, applicants used the software package SAM, using a Δ=0.5, to search for gene expression values significantly different between classes (Tusher et al. Proc Natl Acad Sci USA 98, 5116-21. (2001)).

[0240] Compilation of Gene Sets. Applicants analyzed 149 gene sets consisting of manually curated pathways and clusters defined by public expression compendia. First, applicants used two different sets of metabolic pathway annotations. Applicants manually curated genes belonging to the following pathways: free fatty acid metabolism, gluconeogenesis, glycolysis, glycogen metabolism, insulin signaling, ketogenesis, pyruvate metabolism, reactive oxygen species (ROS) homeostasis, Krebs cycle, oxidative phosphorylation (OXPHOS), and mitochondria, using standard textbooks, literature reviews, and LocusLink. Applicants also downloaded NetAFFX (Liu, G. et al. Nucleic Acids Res 31, 82-6. (2003)) annotations (October 2002) corresponding to GenMAPP metabolic pathways. To identify sets of co-regulated genes, applicants used self-organizing maps to group the GNF mouse expression atlas into 36 clusters (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70. (2002), Tamayo et al. Proc Natl Acad Sci USA 96, 2907-12. (1999)). Genes in these 36 groups were converted to Affymetrix HG-U133A probe sets using the ortholog tables available at the NetAFFX website (October 2002).

Rationale for Grouped Gene Analysis. Consider a microarray dataset with the samples in two categories, A and B. For the sake of simplicity, let the size of A and B each be $n$. Consider a gene set $S$ for which the expression levels differ between samples of A and B. Model the dataset so that the entry $D_{ij}$ for gene $i$ and sample $j$ is normally distributed with mean $\mu_{ij}$ and standard deviation $\sigma$, where

$$\mu_{ij} = \begin{cases} 0, & i \notin S \\ +a, & i \in S,\ j \in A \\ -a, & i \in S,\ j \in B \end{cases}$$

[0241] Then the signal to noise for an individual gene in $S$ is proportionate to

$$\frac{a\sqrt{n}}{\sigma}$$

[0242] Suppose on the other hand applicants know $S$ and add the expression levels for all genes in $S$. Then the signal to noise is proportionate to

$$\frac{a\sqrt{nM}}{\sigma}$$

where $M$ is the number of genes in $S$. This increases the mean of our statistic (which is standard normal for the null hypothesis of no gene set association) by a factor of $\sqrt{M}$. If the noise is in fact correlated for genes of $S$, this reduces the benefit, but applicants can still expect a large gain. In practice applicants will not be able to select a gene set containing fully concordant expression levels, but as long as an appreciable fraction of our gene set exhibits this property, applicants can expect a benefit from the grouped gene approach.

Gene Set Enrichment Analysis (GSEA). GSEA determines if the members of a given gene set are enriched amongst the most differentially expressed genes between two classes. First, the genes are rank ordered on the basis of a difference metric. The results presented in the current experimental series use the signal to noise (SNR) difference metric, which is simply the difference in means of the two classes divided by the sum of the standard deviations of the two diagnostic classes. In general other difference metrics can also be used.

[0243] For each gene set, applicants then make an enrichment measure, called the enrichment score (ES), which is a normalized Kolmogorov-Smirnov statistic. Consider the genes $R_1, \ldots, R_N$ that are rank ordered on the basis of the difference metric between the two classes, and a gene set $S$ containing $G$ members. Applicants define

$$X_i = -\sqrt{\frac{G}{N-G}}$$

if $R_i$ is not a member of $S$, and

$$X_i = \sqrt{\frac{N-G}{G}}$$

if $R_i$ is a member of $S$. Applicants then compute a running sum across all $N$ genes. The enrichment score (ES) is defined as

$$ES = \max_{1 \le j \le N} \sum_{i=1}^{j} X_i$$

or the maximum observed positive deviation of the running sum. ES is measured for every gene set considered. To determine whether any of the given gene sets shows association with the class phenotype distinction, applicants permute the class labels 1000 times, each time recording the maximum ES over all gene sets. Note that in this regard, applicants are testing a single hypothesis. The null hypothesis is that no gene set is associated with the class distinction.

[0244] In this experimental series, after identifying OXPHOS-CR as a subset of co-regulated OXPHOS genes, applicants tested it (a single gene set) for association with clinical status using GSEA. Because OXPHOS-CR is not independent of the OXPHOS set interrogated in the initial analysis, this cannot be viewed as an independent hypothesis. For this reason, these P-values are explicitly marked as nominal P-values.

[0245] Gene set enrichment analysis (GSEA) has been implemented as a software tool for use with microarray data and will be presented in fuller detail, including a discussion of different varieties of multiple hypothesis testing and applications to other biomedical problems, in a companion paper (Subramanian et al., in preparation).

[0246] Evaluating OXPHOS Coregulation in Mouse

groups, representing ambiguous cases (i.e., these human probe-sets that map to two mouse probe-sets, one of which is co-regulated and the other of which is not co-regulated). Applicants discarded these five ambiguous human probe sets from our analysis. This left a total of 35 HG-U133A probe-sets which applicants call OXPHOS-CR genes, and a total of 14 HG-U133A probe-sets which applicants call OXPHOS not CR. Note that 34 and 13 of these genes, respectively, passed our filtering criteria, and these were the genes used in FIG. 9 as well as in the OXPHOS-CR analysis described in the paper.

[0248] Linear Regression Analysis. Applicants generated linear regression models using SAS (SAS Institute, USA). Clinical variables were used as dependent variables, and OXPHOS-CR gene expression levels or other clinical/biochemical measures were used as the independent (explanatory or predictor) variables. To compute the mean centroid of OXPHOS-CR, the expression levels of the 34 OXPHOS-CR genes were normalized to a mean of 0 and a variance of 1 across all 43 patients. The OXPHOS-CR mean centroid vector is simply the mean of these 34 expression vectors. In some regression analyses, applicants introduced dummy variables to represent diabetes status. For the regressions applicants have performed, applicants have reported the adjusted squared correlation coefficient ($R^2_{adj}$), which corrects for the degrees of freedom.

Example 1

Comparison of Gene Expression between Experimental Groups

[0249] DNA microarrays were used to profile expression of over 22,000 genes in skeletal muscle biopsies from 43 age-matched males (Table 1): 17 with Normal Glucose Tolerance (NGT), 8 with Impaired Glucose Tolerance (IGT), and 18 with Type 2 Diabetes Mellitus (DM2). Biopsies were obtained at the time of diagnosis (before treatment with hypoglycemic medication) and under the controlled conditions of a hyperinsulinemic euglycemic clamp (see Methods). When assessed with either of two different analytical techniques (Golub, T. R. et al. Science 286, 531-7. (1999), Tusher et al. Proc Natl Acad Sci USA 98, 5116-21.
(2001)) Expression Datasets. Applicants used the NetAFFX to iden that take into account the multiple comparisons implicit in tify probe sets on the mouse expression chips corresponding microarray analysis, no single gene exhibited a significant to human OXPHOS probe sets. Applicants identified a total difference in expression between the diagnostic categories. of 114 (106 of which passed our filtering criterion) probe This result is consistent with smaller studies (Sreekumar et sets corresponding to the human oxidative phosphorylation al. Diabetes 51, 1913-20. (2002), Yang et al. Diabetologia genes. Using the October 2002 ortholog tables at NetAFFX, 45, 1584-93. (2002)) which failed to identify any individual applicants were able to identify 61 mouse orthologs on the gene whose expression difference was significant when Affymetrix MG-U74AV2 chip. Of these 61 probe-sets, 52 corrected for the large number of hypotheses tested (Kropf were represented in the GNF mouse expression atlas (Su, A. et al. Biometrical J. 44, 789-800 (2002), Storey et al. J. R. I. et al. Proc Natl AcadSci USA99, 4465-70. (2002)). These Statist. Soc. B 64, 479–498 (2002)). expression data were normalized to a mean of 0 and a Example 2 variance of 1. Data were hierarchically clustered and visu alized using the Cluster and TreeView Software packages Gene Set Enrichment Analysis (Eisen et al. Proc Natl Acad Sci USA95, 14863-8. (1998)). 0250) To test for sets of related genes that might be 0247 Applicants parsed these 52 genes into 32 co-regu systematically altered in diabetic muscle, Applicants devised lated probe-sets and 20 probe-sets that are not co-regulated, a simple approach called Gene Set Enrichment Analysis based on the dendrogram in FIGS. 7 and 8. 40 distinct (GSEA), which is introduced here (see FIG. 1 and Methods). 
HG-HG-U133A probe-sets mapped to the 32 co-regulated The method combines information from the members of mouse probe-sets, and 19 distinct HG-U133A probe-sets previously defined sets of genes (e.g., biological pathways) mapped to the 20 mouse probe-sets that are not co-regulated. to increase signal relative to noise (see Methods) and Five HG-U133A probe-sets are shared between these two improve statistical power. US 2007/0203083 A1 Aug. 30, 2007 30
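The statement that grouping genes increases signal relative to noise can be made explicit with a short worked derivation. This is a sketch under stated assumptions, not the applicants' own derivation: α (the mean expression difference per gene between the two classes) and σ (the per-gene noise standard deviation) are notation assumed here, noise is taken to be independent across samples, and ρ is a hypothetical common pairwise correlation between the noise of genes in S.

```latex
% Single gene, n samples per class: difference of class means
D = \bar{x}_A - \bar{x}_B, \qquad
\mathrm{E}[D] = \alpha, \qquad
\mathrm{Var}[D] = \frac{2\sigma^{2}}{n}
\;\Longrightarrow\;
z_{1} = \frac{\alpha}{\sigma}\sqrt{\frac{n}{2}} \;\propto\; \frac{\alpha\sqrt{n}}{\sigma}

% Sum over the M genes of S, independent noise between genes
\mathrm{E}\Big[\sum_{g \in S} D_g\Big] = M\alpha, \qquad
\mathrm{Var}\Big[\sum_{g \in S} D_g\Big] = \frac{2M\sigma^{2}}{n}
\;\Longrightarrow\;
z_{M} = \sqrt{M}\, z_{1}

% Same sum when every pair of genes has noise correlation \rho (assumed)
\mathrm{Var}\Big[\sum_{g \in S} D_g\Big] = \frac{2\sigma^{2}}{n}\, M\,\bigl[1 + (M-1)\rho\bigr]
\;\Longrightarrow\;
z_{M} = z_{1}\sqrt{\frac{M}{1 + (M-1)\rho}}
```

As an illustration, with M = 106 and ρ = 0 the gain is about 10.3-fold, whereas ρ = 0.1 reduces it to about 3.0-fold, which is still a substantial improvement over a single gene.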

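The enrichment score and the permutation test on the maximum enrichment score (MES) described in the Methods can be sketched in a few lines. This is a minimal illustration, not the applicants' software: the function names, the dictionary-based data layout, and the use of labels 0 and 1 for the two classes are assumptions made here, and the step sizes for members and non-members are taken as +sqrt((N-G)/G) and -sqrt(G/(N-G)) so that the running sum returns to zero at the end of the list.

```python
import math
import random
from statistics import mean, stdev


def signal_to_noise(values, labels):
    """SNR metric: difference of class means over the sum of class standard deviations."""
    class0 = [v for v, lab in zip(values, labels) if lab == 0]
    class1 = [v for v, lab in zip(values, labels) if lab == 1]
    return (mean(class0) - mean(class1)) / (stdev(class0) + stdev(class1))


def enrichment_score(ranked_genes, gene_set):
    """Maximum positive deviation of the running sum over the ranked gene list.

    Assumes the set has at least one member and one non-member in the list.
    """
    n = len(ranked_genes)
    g = sum(gene in gene_set for gene in ranked_genes)
    hit = math.sqrt((n - g) / g)      # step for a member of the set
    miss = -math.sqrt(g / (n - g))    # step for a non-member
    running, best = 0.0, float("-inf")
    for gene in ranked_genes:
        running += hit if gene in gene_set else miss
        best = max(best, running)
    return best


def max_enrichment_score(expression, labels, gene_sets):
    """Rank genes by SNR (highest in class 0 first); return the largest ES over all sets."""
    ranked = sorted(expression,
                    key=lambda gene: signal_to_noise(expression[gene], labels),
                    reverse=True)
    return max(enrichment_score(ranked, members) for members in gene_sets.values())


def mes_p_value(expression, labels, gene_sets, n_perm=1000, seed=0):
    """Fraction of label permutations whose MES is stronger than the observed MES."""
    rng = random.Random(seed)
    observed = max_enrichment_score(expression, labels, gene_sets)
    shuffled = list(labels)
    stronger = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # shuffle class labels among the samples
        if max_enrichment_score(expression, shuffled, gene_sets) > observed:
            stronger += 1
    return observed, stronger / n_perm
```

Here `expression` maps each gene name to its list of per-sample values and `gene_sets` maps each set name to a Python set of gene names. Because each permutation records the maximum over all gene sets, the resulting fraction already accounts for the number of sets tested.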
[0251] For a given pairwise comparison (e.g., high in NGT vs DM2), all genes are ranked based on the difference in expression (using an appropriate metric such as signal to noise). The null hypothesis of GSEA is that the rank ordering of the genes in a given comparison is random with regard to the diagnostic categorization of the samples. The alternative hypothesis is that the rank ordering of the pathway members is associated with the specific diagnostic criteria used to categorize the patient groups.

[0252] The extent of association is then measured by a non-parametric, running sum statistic termed the enrichment score (ES), and applicants record the maximum ES (MES) over all gene sets in the actual patient data (FIG. 1). To assess the statistical significance of the MES, applicants use permutation testing of the patient diagnostic labels (for example, whether a patient is NGT or DM2, see FIG. 1). Specifically, applicants compare the MES achieved in the actual data to that seen in each of 1,000 permutations that shuffled the diagnostic labels among the samples. The significance of the MES score is calculated as the fraction of the 1,000 random permutations in which the top pathway gave a stronger result than that observed in the actual data. Because the permutation test involves randomization of the patient labels, it is a test for the dependence on the actual diagnostic status of the patients. Moreover, because the actual MES is compared to the distribution of maximal ES values over all pathways examined in each of the randomized datasets, it accounts for multiple pathways tested, and no further correction is required (Kropf et al. Biometrical J. 44, 789-800 (2002), Storey et al. J. R. Statist. Soc. B 64, 479-498 (2002)).

Example 3

Decreased Expression of Genes Involved in Oxidative Phosphorylation

[0253] Applicants applied GSEA to the microarray data described above, using 149 gene sets that applicants compiled (Table 2). Of these gene sets, 113 are based on involvement in metabolic pathways (based on public or local curation (Liu, G. et al. Nucleic Acids Res 31, 82-6. (2003))) and 36 consist of gene clusters that exhibit co-regulation in a mouse expression atlas of 46 tissues (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70. (2002)) (see Methods). The gene sets were selected without regard to the results of the microarray data from our patients. The top gene set in GSEA analysis yielded a Maximal Enrichment Score (MES=346) that was significant at P=0.029 over the 1,000 permutations of the 149 pathways. That is, in only 29 of 1,000 permutations did the top pathway (of the 149) exceed the score achieved by the top pathway using the actual diagnostic labels.

[0254] The maximal ES score was obtained for an internally curated set consisting of genes involved in oxidative phosphorylation (applicants refer to this gene set as OXPHOS). Interestingly, the four gene sets with the next highest ES scores overlap with this OXPHOS gene set, and their enrichment is almost entirely explained by the overlap: a locally curated set of genes involved in mitochondrial function, a set of genes identified with the keyword mitochondria, a cluster (referred to here as c20) of co-regulated genes derived from the comparison of publicly available mouse data, and a set of genes related to oxidative phosphorylation defined at the Affymetrix website (Liu, G. et al. Nucleic Acids Res 31, 82-6. (2003)).

[0255] Examination of the individual expression values for the 106 OXPHOS genes reveals the source of this signal (FIG. 2). Although the typical decrease in expression for individual OXPHOS genes is very modest (~20%), the decrease is remarkably consistent across the set: 89% (94 of 106) of the genes show decreased expression in DM2 relative to NGT (FIG. 2). As controls, applicants confirmed that the result is independent of specific aspects of data processing (such as scaling, thresholding, filtering) or of selection of difference metrics. Moreover, the result identified by GSEA is supported by previous observations: others have shown that oxidative capacities are altered in insulin resistant muscle (Bjorntorp et al. Diabetologia 3, 346-52. (1967), Simoneau et al. Faseb J 9, 273-8. (1995)), and recent microarray analyses of human diabetic muscle have identified genes in oxidative phosphorylation among their top ranked genes (Sreekumar et al. Diabetes 51, 1913-20. (2002)).

Example 4

OXPHOS-CR: A Coregulated Subset of OXPHOS Genes

[0256] One of the overlapping gene sets identified by GSEA is cluster c20, defined as a set of genes that are tightly co-regulated across many tissues (see Methods). The partial overlap of OXPHOS with the coregulated cluster led us to ask whether all OXPHOS genes are coordinately regulated, or just a subset. Applicants examined transcriptional co-regulation of mouse homologs of OXPHOS genes across a mouse tissue expression atlas (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70. (2002)). This revealed a previously unrecognized subset of the OXPHOS biochemical pathway, corresponding to about two-thirds of the OXPHOS genes, that exhibit strong correlation across mouse tissues (r=0.67) (FIG. 3a). Applicants term this subset OXPHOS-CR (OXidative PHOSphorylation Co-Regulated). The remaining OXPHOS genes show little co-regulation with OXPHOS-CR or each other (FIG. 3a). The OXPHOS-CR subset is strongly expressed in three of 46 tissues: skeletal muscle, heart, and brown fat. Applicants note that these are the major sites of insulin-mediated glucose disposal in mice.

[0257] Applicants next asked whether the downregulation of OXPHOS observed in DM2 was a general property of all OXPHOS genes or was specific to OXPHOS-CR. Interestingly, the bulk of the statistical signal applicants observe in GSEA is accounted for by OXPHOS-CR (FIG. 4). Namely, the OXPHOS-CR subset showed a stronger mean deviation than the remainder of the OXPHOS gene set (FIG. 4), and was itself significant in the GSEA analysis (nominal P-value 0.001, as compared to nominal P=0.226 for the remainder of the OXPHOS set). To see if these changes were secondary to hyperglycemia per se, or preceded the onset of frank diabetes, applicants compared expression of OXPHOS-CR in NGT patients to those with the pre-diabetic state, IGT. Applicants found that expression of OXPHOS-CR is also downregulated in IGT (nominal P-10). This suggests that downregulation of OXPHOS-CR precedes onset of hyperglycemia. Thus, GSEA allowed us to detect a subset of OXPHOS genes, called OXPHOS-CR, with three key properties: (1) they are members of the oxidative phosphorylation pathway, (2) they are tightly co-regulated across many tissues and are highly expressed in the major sites of insulin-mediated glucose disposal, and (3) they exhibit a subtle but consistent decreased expression in muscle from patients with both the pre-diabetic state IGT and type 2 diabetes.

Example 5

PGC-1α can Induce Expression of OXPHOS-CR

[0258] The strong correlation in expression of the OXPHOS-CR genes and their coordinated downregulation in diabetic muscle led us to explore mechanisms that might mediate this tight control. Applicants reasoned that peroxisome proliferator-activated receptor γ coactivator 1α (PGC-1α), a cold-inducible regulator of mitochondrial biogenesis, thermogenesis, and skeletal muscle fiber type switching (Puigserver, P. et al. Cell 92, 829-39. (1998), Wu, Z. et al. Cell 98, 115-24. (1999), Lin, J. et al. Nature 418, 797-801. (2002)), was a prime candidate for mediating these effects. Consistent with this hypothesis, applicants observed that mean levels of PGC-1α transcript were similarly decreased (~20%) in the diabetic muscle, and noted that the promoters of several of the OXPHOS-CR genes have been reported to contain binding sites for nuclear respiratory factor 1, a transcription factor co-activated by PGC-1α (Scarpulla, R. C. Biochim Biophys Acta 1576, 1-14. (2002)).

[0259] To test directly whether OXPHOS-CR genes might be transcriptional targets of PGC-1α, applicants expressed PGC-1α in a mouse skeletal muscle cell line using an adenoviral expression vector (Lin, J. et al. Nature 418, 797-801. (2002)) and used DNA microarrays to profile expression of the OXPHOS genes over a 3 day period (see Methods). Applicants found that a subset of OXPHOS genes were strongly upregulated in a time-dependent manner in response to PGC-1α, and that this subset corresponds almost precisely to OXPHOS-CR (FIG. 3b). These in vitro results support the hypothesis that PGC-1α plays a role in the regulation of OXPHOS-CR, both across the mouse tissue compendium as well as in the observed downregulation in diabetes.

Example 6

Expression of OXPHOS-CR and Measures of Whole Body Physiology

[0260] Metabolic control theory suggests that small increases in many sequential steps of a metabolic pathway can lead to a dramatic change in the total flux through the pathway, whereas large changes in a single enzyme might have no measurable effects (Brown et al. Biochem J. 284, 1-13. (1992)). To test the hypothesis that subtle differences in OXPHOS-CR gene expression in diabetic patients might be related to changes in total body metabolism, applicants examined the relationships between diabetes status, expression of OXPHOS-CR genes, and VO2max as measured in our patients (FIG. 5). Consistent with previous reports (Eriksson et al. Diabetologia 33, 526-31. (1990)), diabetes and VO2max are correlated in our patients (R²=0.28, P=0.0005). Strikingly, applicants found that the expression of OXPHOS-CR genes in muscle is strongly correlated with VO2max (R²=0.22, P=0.0012) (FIG. 5), a measure of total-body physiology. The top ranking OXPHOS-CR gene, ubiquinol cytochrome c reductase binding protein (UQCRB), is an even stronger predictor (R²=0.31, P-0.0001). OXPHOS-CR appears to be not solely a proxy for diabetes status, however, because a two-variable regression of VO2max on diabetes status and OXPHOS-CR expression level shows that both variables contribute significantly to the correlation (P=0.05 for the model with both variables as compared to the model with only diabetes status).

[0261] It is important to note that these results do not seem secondary to other known predictors of oxidative capacity. Applicants found no relationship between BMI or WHR and OXPHOS-CR gene expression (R-0.01 in both cases). In addition, there was no significant relationship between quantitative measures of fiber types and OXPHOS-CR expression. Thus, a subtle decrease in expression of OXPHOS-CR genes in muscle appears to be associated with changes in total body aerobic capacity, even beyond their correlation to diabetes status, body habitus, or muscle fiber type.

Second Experimental Series

The following experimental procedures were followed in the second experimental series:

[0262] Organelle Purification and Sample Preparation. 6-8 week old male mice were subjected to an 8 hour fast and then euthanized. Brain, heart, kidney, and livers were harvested immediately and placed in ice cold saline. Mitochondria were isolated using differential centrifugation as previously described and purified with a Percoll gradient (Mootha et al. (2003). Proc Natl Acad Sci USA 100, 605-10). The proteins were then solubilized, size separated, and digested as previously described (Mootha et al. (2003). Proc Natl Acad Sci USA 100, 605-10).

[0263] Tandem Mass Spectrometry. Liquid chromatography tandem mass spectrometry (LC-MS/MS) was performed on QSTAR pulsar quadrupole time of flight mass spectrometers (AB/MDS Sciex, Toronto) as described previously (Mootha et al. (2003). Proc Natl Acad Sci USA 100, 605-10). Tandem mass spectra were searched against the NCBInr database (February 2002) with tryptic constraints and initial mass tolerances<0.13 Da in the search software Mascot (Matrix Sciences, London). Only peptides achieving a Mascot score above 25 and containing a sequence tag of at least three consecutive amino acids were accepted.

[0264] Curation of Previously Annotated Mitochondrial Proteins. Two key sources were used to identify previously annotated proteins. First, Applicant downloaded the 308 human and 117 mouse protein sequences at MITOchondria Project (Scharfe et al. (2000). Nucleic Acids Res 28, 155-8). Applicant also downloaded the 199 human and 290 mouse protein sequences annotated at LocusLink (http://www.ncbi.nlm.nih.gov/LocusLink) as having a mitochondrial subcellular localization based on Gene Ontology terminology (GO:0005739) (Lewis et al. (2000). Curr Opin Struct Biol 10, 349-54) (January 2003). Also included in the master list are 13 mtDNA encoded proteins, based on LocusLink annotation.

[0265] A Nonredundant List of Mitochondrial Proteins. FASTA sequences corresponding to the previously annotated mitochondrial proteins, newly identified mitochondrial proteins, and the mouse Reference Sequences (Maglott et al. (2000). Nucleic Acids Res 28, 126-8) were merged. These were then collapsed into distinct protein clusters using a downloaded version of blastclust (http://www.ncbi.nlm.nih.gov/BLAST/). Applicants required that members of a cluster demonstrate 70% sequence identity over 50% of the total length, not requiring a reciprocal relationship to exist. Clusters containing multiple Reference Sequences were then broken using a higher stringency blastclust, in which applicants required 90% identity over 50% of the length. Clusters containing hemoglobin, trypsin, and albumin were eliminated as obvious contaminants. When possible the Reference Sequence was selected as the exemplar from the cluster, otherwise another sequence was manually selected. Hence, each cluster is annotated by an exemplar sequence, the protein accessions (and tissues) in which the proteins were found in the proteomics experiments, and the protein accessions corresponding to annotation sources. Applicant obtained a total of 612 distinct protein clusters (Table 2). The GenPept descriptions of 37 of these exemplars suggested that they are mitochondrial, but simply missed by the automated annotation procedure using the MITOP and LocusLink databases. These exemplars were therefore manually annotated as previously known mitochondrial proteins, to provide a more conservative estimate of our sensitivity measure and newly discovered proteins.

[0266] Statistical Analysis. Cluster enrichment was determined using a cumulative hypergeometric distribution. To determine whether two empirical cumulative distributions arise from the same underlying distribution, Applicant used the Kolmogorov-Smirnov test statistic, D. Tail values were obtained using Matlab (Mathworks).

RNA/Protein Concordance Test. The RNA/protein concordance test was developed to determine whether there is significant concordance between protein detection in a proteomics experiment and mRNA abundance in a microarray experiment. Consider the pair of tissues, i, j, where i, j ∈ {brain, heart, kidney, liver}. For a given gene, G, let M(G,k) represent the gene expression level of gene G in tissue k. Let P(G,k) be an indicator variable that is 0 if the protein product of gene G is not found in tissue k, and 1 if the protein product is found in tissue k. The mRNA and protein expression levels of gene G are concordant in tissues i and j if M(G,i)>M(G,j) when P(G,i)>P(G,j). For a given gene, G, compute the total number of observed concordances ($c_G$) between all pairs of tissues as well as the expected variance in concordance ($V_G$) for that gene. The test statistic is simply

$$C = \frac{\sum_G c_G}{\sqrt{\sum_G V_G}}$$

which has mean 0 and variance 1 and is approximately normal in the null case where there is no concordance between RNA abundance and protein detection.

Compositional Diversity Across Tissues. Mitochondrial gene products show distinct patterns of expression based on protein and RNA expression (Table 5). These patterns of distribution can be used to develop a simple model that describes core mitochondrial proteins versus those that are specialized to any set of cell types.

[0267] Consider a set of i+1 tissues, $S_{i+1}$, as well as a distinct subset $S_i$, i.e., $S_i \subset S_{i+1}$, where i>0. Applicants are interested in the probability that a given gene product is found in $S_{i+1}$ conditional that it is found in $S_i$, or simply $T(S_{i+1}, S_i) = P(\text{gene product is found in } S_{i+1} \mid \text{gene product is found in } S_i)$. Define $P_i$ as the average $T(S_{i+1}, S_i)$ over all selections of $S_i \subset S_{i+1}$. When applicant assessed compositional diversity using RNA expression levels, Applicant interpreted an RNA expression level greater than 200 as present (Su et al. (2002). Proc Natl Acad Sci USA 99, 4465-70), and an expression below this level as not present. These average conditional probabilities $P_i$ can also be modeled. Imagine that a fraction f of all mitochondrial proteins are ubiquitous (i.e., expressed in all cell types with probability 1) and that a fraction 1-f are not ubiquitous, but rather, appear in a given tissue with probability p. Then

[0268] DNA Microarray Analysis. To identify Affymetrix probe-sets corresponding to each protein cluster, Applicant mapped the exemplar sequence to the Unigene cluster, and then identified the corresponding Affymetrix MG-U74AV2 probe set. The NetAffx website (http://www.affymetrix.com) and its tables were used to perform these mappings (January 2003). The GNF mouse expression atlas (Su et al. (2002). Proc Natl Acad Sci USA 99, 4465-70) was downloaded from its website (http://www.gnf.org). In comparisons of protein detection and mRNA abundance, Applicant used the mRNA expression level for a given tissue averaged over the replicates, since the GNF mouse expression atlas includes duplicates for each tissue. Because the proteomic survey was performed on whole brain, applicants simply compared to the average expression of all brain samples in the GNF mouse atlas. Hierarchical clustering was performed using DCHIP (Schadt et al. (2001). J Cell Biochem Suppl Suppl. 120-5).

[0269] Identification of Ancestral Mitochondrial Genes. The consensus FASTA sequences for the genes represented on the Affymetrix MG-U74AV2 oligonucleotide array were downloaded from the NetAFFX (Liu et al. (2003). Nucleic Acids Res 31, 82-6) website (http://www.affymetrix.com). A blastx comparison of these sequences was performed against the Rickettsia prowazekii protein sequences, downloaded from the NCBI, and then a tblastn comparison of the bacterial protein sequences was performed against the consensus FASTA sequences. An ancestral gene was defined as one achieving a BLASTX E-0.01 and having a reciprocal best match in the BLAST analysis.

Example 7

Proteomic Survey of Mitochondria

[0270] Applicants carried out a systematic survey of mitochondrial proteins from brain, heart, kidney, and liver of C57BL6/J mice (see Methods). Each of these tissues provides a rich source of mitochondria. The isolation consisted of density centrifugation followed by Percoll purification. Preparations were tested for purity and for contamination using immunoblotting directed against organelle markers, enzymatic assays to ensure that the mitochondria were intact, and electron microscopy. The liver, heart, and kidney mitochondria were extremely pure. The brain mitochondria tended to show persistent contamination by synaptosomes, which themselves are a rich source of neuronal mitochondria (see Fernandez-Vizarra (2002). Methods 26, 292-7).

[0271] Mitochondrial proteins from each tissue were solubilized and size separated by gel filtration chromatography into approximately 20 fractions (see Methods). These proteins were then digested and analyzed by liquid chromatography mass spectrometry/mass spectrometry (LC-MS/MS). More than 100 LC-MS/MS experiments were performed (see Methods).

[0272] The acquired tandem mass spectra were then searched against the NCBI nonredundant database consisting of mammalian proteins using a probability-based method (Perkins et al. (1999). Electrophoresis 20, 3551-67). Stringent criteria were used for accepting a database hit. Specifically, only peptides corresponding to complete tryptic cleavage specificity with scores greater than 25 were considered (see Methods). Furthermore, only fragmentation spectra which also exhibited a correct, corresponding peptide sequence tag (Mann et al. (1994). Anal Chem 66, 4390-9) consisting of at least three amino acids were considered.

[0273] Using these criteria, ~2100 database hits were identified. This list contains a high degree of redundancy, because a protein may have been found in adjacent fractions of the gel and in different tissues. The ~2100 hits collapse to a distinct set of 422 mouse proteins (see Table 4, FIG. 6, and Methods).

Example 8

Previously Annotated Mitochondrial Proteins

[0274] A list of previously annotated mouse and human mitochondrial proteins was created by pooling all the mouse and human proteins from MITOchondria Project (MITOP, http://mips.gsf.de/proj/medgen/mitop/), a public database of curated mitochondrial proteins, as well as all proteins annotated as mitochondrial in NCBI's LocusLink database (http://www.ncbi.nlm.nih.gov/LocusLink/) (see Methods). After elimination of redundancy, the list contains 452 distinct mouse proteins that are either directly annotated as mitochondrial or whose human homolog is annotated as mitochondrial (FIG. 6A). The human proteins recently reported to be mitochondrial by Taylor et al. 2003 (in a study published after the construction of Applicant's list of previously annotated proteins) were not included in Applicant's list. These proteins instead serve as a control against which to compare the proteins identified in our proteomic analysis. The list of 452 previously annotated mitochondrial proteins is by no means comprehensive: there are likely many mitochondrial proteins that are simply not annotated by these public databases. However, it does provide a reasonable, high confidence list of previously annotated proteins against which to benchmark Applicant's proteomic survey.

Example 9

Newly Identified Mitochondrial Proteins

[0275] The set of 422 proteins identified in Applicant's proteomic survey includes 262 of the 452 proteins previously annotated to be mitochondrial (58%) and 160 proteins not previously annotated as associated with the mitochondria (FIG. 6A). The previous and new sets were combined to produce a list of 612 genes whose protein product is physically associated with mitochondria. This set of genes is referred to as mito-P (Table 4).

[0276] The 422 proteins identified in the proteomic survey span a wide range of isoelectric points and molecular weights (FIG. 6B, 6C), although proteins from the inner mitochondrial membrane are underrepresented (FIG. 6D). The incomplete sensitivity (58%) is most likely due to a bias against proteins of low abundance, which is a known feature of the mass spectrometry methodology. This explanation is supported by analysis of RNA expression of the genes encoding the detected and undetected proteins. Considering the subset of the 452 previously annotated genes for which RNA expression was reported in a recent atlas of mRNA expression in mouse, the distribution of RNA expression level was about 5-fold higher for the genes whose products were detected in our proteomic survey as compared to those that were not (P=1x10') (FIG. 6E). This suggests that the proteomics strategy preferentially detected the higher abundance proteins.

[0277] The 160 proteins not previously annotated as mitochondrial potentially represent new mitochondrial proteins, either in the conventional sense of being present within the organelle or in a broader sense of being tethered to the mitochondrial outer membrane (e.g., tubulin (Heggeness et al. (1978). Proc Natl Acad Sci USA 75, 3863-6)).

[0278] To test this notion, Applicants sought independent evidence that these 160 proteins are actually mitochondrial. First, the list was compared to proteins identified in a recent survey of human heart mitochondria (Taylor et al. (2003). Nat Biotechnol 18, 18). Human homologs of 64 of the 160 proteins were identified in this recently published study. Of the remaining 96 proteins, 24 have strong mitochondrial targeting sequences based on bioinformatic analysis of protein targeting sequences (Table 4 and Methods) (Nakai et al. (1999). Trends Biochem Sci 24, 34-6), a proportion similar to the known mitochondrial proteins. For example, polymerase delta interacting protein 38 (encoded by Pdip38-pending), which was detected only in liver mitochondria, and the gene product of Rnaseh1, which was found only in the kidney, have strong mitochondrial targeting scores. A recent study confirmed that Rnaseh1 can be localized to the mitochondrion, where it plays a critical role in mtDNA homeostasis (Cerritelli et al. (2003). Mol Cell 11, 807-15).

Example 10

Modules of Coregulated Mitochondrial Genes

[0279] Applicant also investigated co-regulation of the 612 mito-P genes across different tissues. For 388 of the 612 mito-P genes, mRNA expression levels were available in a mouse gene expression compendium containing data across 47 tissues (Su et al. (2002). Proc Natl Acad Sci USA 99, 4465-70).

[0280] Applicant calculated pairwise correlation and performed hierarchical clustering of these 388 gene expression profiles (FIGS. 6 and 7). There are several striking mitochondrial gene modules (FIG. 6), which are defined here as clusters of genes showing strong expression correlation across the 47 tissues (Table 6). These modules include genes with strong annotation support as well as genes identified in this study as being mitochondrial (see bar labeling in FIG. 7). These clusters appear to have properties of scale-free networks, in which a few central nodes are highly correlated with each other (module 6), while most are correlated with only a few genes or none at all (Barabasi, (2003). Scale-free networks, Sci Am 288, 60-9). As shown in FIG. 7, mitochondrial gene expression profiles vary tremendously from tissue to tissue, consistent with the compositional diversity of mitochondria noted above.

[0281] Some of these gene modules have no obvious functional relationships, though two appear to be enriched in certain tissues (modules 1, 2). Each of these gene modules is characterized by tightly correlated gene expression across the tissue compendium. Members of these modules likely share transcriptional regulatory mechanisms as well as cellular functions. Many of the newly identified mitochondrial genes (black bar in annotation bar of FIG. 7) lie within these modules, providing a functional context for their cellular role.

[0282] The mitochondria gene modules provide an initial step towards the characterization of some of the newly identified mitochondrial genes, since functionally related genes tend to have correlated gene expression. Of the 104 newly identified mitochondrial proteins that are represented in this microarray dataset, 38 fall within these 7 modules, providing them with a preliminary functional context.

Example 11

Modules Enriched in Genes of Oxidative Phosphorylation

[0283] A striking gene module (module 6) consists of genes related to oxidative phosphorylation (OXPHOS) and β-oxidation and expressed at high levels in brown fat, skeletal muscle, and heart (FIGS. 6 and 7). The related module 5, enriched in OXPHOS genes but not the β-oxidation genes, is expressed not only in brown fat, heart, and skeletal muscle, but also in colon. Colon is not traditionally considered to be a highly metabolic tissue, but it has high expression of peroxisome proliferative activated receptor-γ, a partner of PGC-1α, a master regulator of mitochondrial biogenesis (Puigserver et al. (2003). Endocr Rev 24, 78-90).

[0285] The 10,043 genes in the mouse expression atlas include 388 of the 612 mito-P genes. If these 388 genes were a random subset, an Noo value greater than 10 would be expected to occur by chance 1 in 1000 times, and an Noo greater than 50 would be exceedingly rare (P=1.5x10').

[0286] A total of 806 genes have No-10. This is defined herein as the expression neighborhood of the mito-P set, and Applicant interprets these genes as being co-regulated with mitochondrial genes (see the entire rank ordered list, Table 7). This group corresponds to only 8% of all the genes studied, but it contains 52% of the mito-P genes (6.5-fold enrichment, P=1.49x10'). The list includes 59 that are newly mitochondrial, based on the proteomic survey described herein, and 25 that were previously known to be mitochondrial but not detected by that proteomic survey.

[0287] Importantly, the expression neighborhood includes 605 genes not present in the mito-P set itself. These genes may encode proteins that are physically present in mitochondria but were missed in the proteomic survey or that are functionally related to mitochondria but not physically associated. They provide a catalog of genes that are likely functionally relevant to mitochondrial biology, and are complementary to the proteomic approach that identified proteins resident in this organelle.

Example 13

Transcription Factors and Nutrient Sensors within the Mitochondrial Neighborhood

[0288] Applicant found several genes involved in DNA replication within the mitochondria neighborhood (Table 1). Esrra, Pparg, and Ppara encode nuclear receptors that are tightly co-regulated with the mitochondrial genes. This is intriguing since previous studies have suggested that these nuclear receptors are important partners of the coactivator PGC-1α, a key molecule in mitochondrial biogenesis (Puigserver et al. (2003). Endocr Rev 24, 78-90). While nuclear
receptors are critical to mitochondrial biogenesis (Scarpulla, In a recent study of human diabetic muscle, Applicant and R. C. (2002). Biochim Biophys Acta 1576, 1-14), to our co-workers demonstrated that the OXPHOS genes in mod knowledge, none has previously been reported to be co ules 5 and 6 (termed OXPHOS-CR for OXidative PHOSho regulated with the mitochondrial genes themselves. Inter rylation CoRegulated) show diminished expression in type 2 estingly, a recent report demonstrated that PGC-1C. co diabetes, and that these genes are targets of PGC-1C. The activates Essra gene expression (Schreiber et al. (2003). J current study identifies two modules (modules 5, 6) that Biol Chem 278, 9013-8). Applicant's results raise the contain OXPHOS-CR as well as other mitochondrial genes, hypothesis that this may be a general phenomenon, in which including 4 newly identified genes in module 5 and 12 newly PGC-1C. is co-activating a number of its own transcriptional mitochondrial genes in module 6. It will be interesting to partners. determine how this expanded set contributes to type 2 diabetes and other measures of whole-body metabolism. 0289. A number of other transcriptional regulators also have expression patterns very tightly regulated with the Example 12 mitochondrial genes, including Mdfi, Nfix. Thx6, and Crsp2. These are excellent candidate transcription factors that may Mitochondrial Gene Expression Neighborhood be targets of PGC-1C, or perhaps are involved in other 0284 Applicant also sought to systematically identify all mechanisms leading to the biogenesis of this organelle. genes that exhibit correlated expression with the mito-P 0290 Surprisingly, the nutrient sensor Sir2 is also found genes. This was done using the neighborhood index (Noo), within the mitochondrial expression neighborhood. 
Sir2 a previously described Statistic that measures a given gene's encodes an NAD(+)-dependent histone deacetylase which is expression similarity to a target gene set (Mootha et al. homologous to the yeast silent information regulator 2 (2003). Proc Natl Acad Sci USA 100, 605-10). For a given (ySir2). Sir2 is involved in gene silencing, chromosomal gene, the mitochondria neighborhood index is defined as the stability, and aging. Chromatin remodeling enzymes rely on number of mito-P genes among its nearest 100 expression coenzymes derived from metabolic pathways, including neighbors. Applicant computed the Noo statistic for all those generated by the mitochondrion. These observations genes in the mouse expression atlas (FIG. 9). Suggest that Sir2 and mitochondrial gene expression are US 2007/0203083 A1 Aug. 30, 2007 cooperatively regulated, perhaps linking the mitochondrion largely unsolved problem (17). A promising approach relates to the nutrient sensing activities of Sir2. genome-wide expression profiles to promoter sequences to discover influential cis-motifs (18-21). Such methods have Third Experimental Series yielded impressive results in simple organisms such as yeast, The following experimental procedures were followed in the but it has been challenging to extend these algorithms to third experimental series: mammalian genomes, where intergenic regions are large, annotation of gene structure is imperfect, and DNA 0291 Data Scaling, Visualization, and Annotation sequence can be highly repetitive. Most of these methods Enrichment. Microarray data were acquired and Subjected to seek motifs by comparison to a fixed background model of linear Scaling using the median scan as a reference. 
Data nucleotide composition (which fails to represent the fluc were visualized using the dChip Software package (10) and tuations seen in large genomes) or by comparison between enrichment by ontology terms determined with the GoSurfer two sets of genes (which is likely to capture only very sharp tool, using a P-value of 0.01 (11). Mitochondrial genes were differences). Further, many of these methods assume that the defined based on a recent proteomic Survey of organelle in expression data are normally distributed, which may not mouse (12). always be true. 0292 Promoter Databases. Applicants used the Refer 0296) To overcome some of these obstacles, applicants ence Sequence annotations of mm3 build of the mouse devised a simple, nonparametric strategy for identifying genome (http://genome.ucsc.edu) and the annotation tables motifs associated with differential expression (motifADE) for the Affymetrix MG-U74AV2 chip (http://www.affyme (FIG. 10a). The algorithm involves three steps: (i) ranking trix.com) to compile a list of 5034 mouse genes for which genes based on differential expression between two condi there is a 1:1 mapping between Affymetrix probe-set and tions; (ii) given a candidate motif, identifying the Subset of Reference Sequences. The mouse promoter database con genes whose promoter regions contains the motif, and (iii) sists of 2000 bp of genomic sequence centered on the testing via a nonparametric, rank Sum statistic (see Methods) annotated transcription start site of these genes. if these genes tend to appear toward the top or bottom of the 0293 Applicants also performed analyses on a masked ranked list (indicating association) or are randomly distrib promoter database, consisting of the regions within these uted on the list. motifADE may be applied to a specific 2000 bp that are aligned and conserved between mouse and candidate motif of interest or to the list of all possible motifs human. 
Applicants used the mouse/human BLASTZ align of a given size (in which case the significance level should ments (mouse mm3 vs. human hg15) (13) and only consid be adjusted to reflect multiple hypothesis testing). By using ered the 5008 promoters for which the alignment contained a nonparametric scoring procedure (see Methods), appli at least 100 bp. Applicants masked the aligned promoters to cants do not make assumptions about the distribution of the retain mouse sequence exhibiting at least 70% identity to expression data. Furthermore, by considering the entire rank human across windows of size 10. The median promoter ordered list, the promoters without the motif implicitly length in the masked database is ~1200 bp. provide a background of DNA composition for comparison, 0294 Motif discovery. For a given day, genes from the and there is no need to group the genes into clusters. The microarray are ordered on the basis of expression difference method can operate on a traditional promoter database or between GFP and PGC-1C. (applicants use the signal to noise even a database of promoters that have been masked based ratio as our difference metric). Each gene is annotated for the on evolutionary conservation (see Methods). presence of a motif in the promoter by searching for exact Example 15 k-mers (where k=6, 7, 8 or 9) or for selected motifs of interest. Applicants use the Mann-Whitney rank sum statis tic U to determine whether the distribution of differential Binding Sites for Erro. and Gabpa are the Top expression for those genes with a given motif differs from Scoring Motifs Associated with the PGC-1C. those genes lacking the motif. When working with promot Transcriptional Program ers of unequal length (e.g., the masked promoter database), 0297 To identify motifs related to PGC-1C. 
action, appli a more appropriate null hypothesis for the Mann-Whitney cants infected mouse C2C12 muscle cells with an adenovi statistic is that the probability of detecting a motif in a rus expressing PGC-1C. and obtained gene expression pro promoter is proportional to its length. To assess the signifi files for 12,488 genes at 0, 1, 2, and 3 days following cance of a motif with rank Sum U that appears in C infection. Applicants found 649 genes that were induced at promoters, applicants use Monte Carlo simulation (with least 1.5-fold (nominal P-0.05) at day 3. As expected, these 1000 samples) to estimate the null distribution of U for a were enriched for genes involved in carbohydrate metabo sample of C ranks drawn randomly, without replacement, lism and the mitochondrion (see (1)). Interestingly, many given relative weights proportional to the promoter lengths. genes involved with protein synthesis (GO terms: protein For large C (C>10) and a reasonable distribution of promoter biosynthesis, mitochondrial ribosome and ribosome) are lengths, U is approximately normally distributed. also induced. Promoter databases and motifADE source code are available 0298. Applicants then applied motifADE to study the at http://www-genome.wi.mit.edu/mpg/PGC motifs/. 5034 mouse genes for which applicants have measures of Example 14 gene expression as well as reliable annotations of the transcriptional start site (TSS) (see Methods). For each gene, Discovering Motifs Associated with Differential the target region was defined to be a 21 kb region centered Expression on the TSS. Applicants then tested all possible k-mers 0295 Systematic identification of transcription factors ranging in size from k=6 to k=9 nucleotides for association involved in biological processes in mammals remains a with differential expression on each of the three days of the US 2007/0203083 A1 Aug. 30, 2007 36 timecourse. 
A total of 20 motifs achieved high statistical Example 17 significance (p<0.001, following Bonferroni correction for multiple hypothesis testing) and these were almost exclu Erro. and Gabpa are Themselves Induced by sively related to two distinct motifs (see Table 8 and Table PGC-1 O. 9). The first motif, 5'-TGACCTTG-3' was significant on 0302) The above results suggest that Erro. and Gabpa may days 1, 2, and 3 (adjusted P=2.1 x 10, 2.9x10, and be the key transcriptional factors mediating PGC-1C. action 7.7x10", respectively). It corresponds to the published in muscle. In this connection, it is notable that based on the binding site for the orphan nuclear receptor Errol. (22), which microarray data, both Erro. and Gabpa are themselves is known to be capable of being co-activated by induced 2-fold (P<0.01) on day 1 following expression PGC-1 c. and -3 (23-25). The Errol, gene is known to be PGC-1 c. consistent with previous studies (2, 23). More involved in metabolic processes, based on Studies showing over, careful analysis of the Errol, and Gabpa genes Suggest that knockout mice have reduced body weight and periph that each contain potential binding sites for both transcrip eral fat tissue, as well as altered expression of genes tion factors within the vicinity of their promoters. The Erro. involved in metabolic pathways (26). The second motif is gene has the Erro. motif as well as a conserved variant of the 5'-CTTCCG-3 (adjusted p=8.9x10), which is the top Gabpa binding site (27) upstream of the TSS, while the scoring motif on day 3. It corresponds to the published Gabpa gene has an Errol site upstream of the TSS and a binding site for Gabpa (27), which complexes with Gabpb conserved variant of the Gabpa binding site in its first intron (15) to form the heterodimer, nuclear respiratory factor-2 These results raise the possibility that Errol, and Gabpa may (NRF-2), a factor known to regulate the expression of some regulate their own and each other's expression. 
OXPHOS genes (28). 0303 Taken together, the systematic analysis of the tran 0299 Interestingly, the reverse complements of these scriptional program driven by PGC-1C. in skeletal muscle suggests a model (FIG. 11) in which increases in PGC-1C. motifs did not score as well, Suggesting a preference for the protein levels (induced, for example, by exercise, e.g. see orientation of these motifs, and some occurrences of the (29)) results in increased transcriptional activity of Gabpa motifs occurred downstream of the TSS. While each of these and Errol on their own promoters, leading to a stable increase motifs is individually associated with PGC-1A, our analyses in the expression of these two factors via a double positive Suggest that a gene having both motifs typically ranks higher feedback loop. These two factors, perhaps in combination on the list of differentially expressed genes and genes with with PGC-1C. are then crucial in the induction of down only one of the motifs (FIG. 12) suggesting that the two stream target genes, many of which have binding sites for motifs might have an additive or synergistic effect. these motifs (FIG. 11). Such a circuit may serve as a regulatory Switch, analogous to a feed-forward loop that Example 16 plays a key role in the early stages of endomesodermal development in sea urchin (30). Erro. and Gabpa Motifs are Evolutionarily Conserved and Enriched Upstream of OXPHOS Experiment 18: MotifADE Results Applied to Human Dia Genes betic Versus Normal Expression 0304) Applicants applied the Motif ADE method to ana 0300. Applicants next repeated motif ADE analysis using lyze the transcription factor binding sites that are differen a “masked promoter database (Table 3). Applicants still tially expressed in diabetic vs. normal human skeletal considered the 2000 bp centered on the TSS, but only muscle (previously published data, Mootha et al Nature considered those nucleotides aligned and conserved between Genetics 2003). 
The program identified exactly three motifs mouse and human (see Methods). Still, the top ranking achieving an adjusted P-value-0.05. These are AAATCG motifs on days 1 and 3 were related to Erro. (day 1, (adjusted P-value 0.003), CCGGAAG (adjusted P-value P=4.8x10; day 3 P-1.2x10') and to Gabpa (day 3 0.039), and AGCGTTT (adjusted P-value 0.011). Applicants P=3.1x10'), providing additional support these motifs are note that the second motif is a published binding site for biologically relevant. Gabpa (reverse complement of CTTCCG). This results Suggest that Gabpa function is altered in diabetic muscle, or 0301 The Erro. and Gabpa motifs are particularly that perhaps another transcription factor that binds to this enriched upstream of the OXPHOS-CR genes, which exhibit element. reduced expression in human diabetes (5, 6). Whereas the top scoring Errol motif (5'-TGACCTTG-3' or its reverse Experiment 6: Identification of Human Genes Having Bind complement) only occurs in 12% of the promoters in the ing Sites for Errol, Gabpa or Both database, in 29% of the PGC-responsive genes (i.e., those 0305 Applicants searched for the binding sites motifs genes induced at least 1.5 fold on day 3), and in 27% of the (forward or reverse complement) 3. Kb upstream and 1 Kb mitochondrial genes, they are found in 52% of the downstream of the annotated transcription start site. In the OXPHOS-CR genes (significance of enrichment, P-1x10). accompanying files are the genes with either one motif About one-half of these sites are perfectly conserved in the (forward or reverse complement) or both motifs conserved Syntenic region in human. The top scoring Gabpa binding between human and mouse. The following genes were sites (5'-CTTCCG-3' or its reverse complement) are much identified: Table 10: 678 genes with Erro. motif conserved more common (62% of all promoters of the database and in between mouse and human. 
Table 11: 2799 genes with 79% of the PGC-responsive genes), but they, too, show Gabpa motif conserved between mouse and human. Table significant enrichment in the OXPHOS-CR genes (89%, 12: 354 genes with both motifs conserved between mouse P=0.02). and human. US 2007/0203083 A1 Aug. 30, 2007 37
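The three motifADE steps described in Example 14 (rank genes, mark promoters containing the motif, apply a rank sum test) can be illustrated with a minimal Python sketch. This is not the applicants' source code: the function name `motif_ade`, the toy data, and the simplifications are assumptions made for illustration. In particular, the sketch assumes tie-free scores, promoters of equal length (so the normal approximation to U stands in for the length-weighted Monte Carlo procedure described in the Methods), and an exact forward-strand substring match.

```python
import math

def motif_ade(diff_expr, promoters, motif):
    """Hypothetical sketch of a motifADE-style rank sum test.

    diff_expr: dict gene -> differential expression score (higher = more induced)
    promoters: dict gene -> promoter sequence (assumed equal length here)
    motif:     exact k-mer searched on the forward strand only
    Returns (U, z, two-sided p) or None if the motif splits no genes.
    """
    # Step (i): rank genes by differential expression (rank 1 = most induced).
    ranked = sorted(diff_expr, key=diff_expr.get, reverse=True)
    n = len(ranked)

    # Step (ii): ranks of the genes whose promoter contains the motif.
    motif_ranks = [i + 1 for i, g in enumerate(ranked) if motif in promoters[g]]
    c = len(motif_ranks)
    m = n - c
    if c == 0 or m == 0:
        return None

    # Step (iii): Mann-Whitney U from the rank sum. U counts the
    # (motif gene, non-motif gene) pairs in which the non-motif gene
    # ranks ahead of the motif gene; U = 0 means all motif genes are on top.
    u = sum(motif_ranks) - c * (c + 1) / 2

    # Normal approximation to the null distribution of U (no tie correction).
    mean_u = c * m / 2
    sd_u = math.sqrt(c * m * (n + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided P-value
    return u, z, p

if __name__ == "__main__":
    # Toy data: ten genes, the three most induced carry the motif.
    scores = {f"g{i}": 10 - i for i in range(10)}
    seqs = {g: ("AATGACCTTGAA" if i < 3 else "AAAAAAAAAAAA")
            for i, g in enumerate(scores)}
    print(motif_ade(scores, seqs, "TGACCTTG"))
```

When scanning every k-mer of a given size, the resulting P-values would still need a multiple-testing adjustment such as the Bonferroni correction mentioned in Example 15.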

Discussion of First Experimental Series

[0306] In this study, applicants have used a combined genomic and computational strategy to systematically dissect a mammalian transcriptional circuit central to cellular energetics. The results above have computational, biological and medical implications.

[0307] First, the motifADE algorithm provides a simple, nonparametric approach for discovering cis-elements by considering differential gene expression. It makes very few assumptions about the statistical properties of DNA composition or about the distribution of gene expression. The method is flexible and, as applicants have shown, can easily incorporate "masked" or "phylogenetically footprinted" promoters. With additional cross-species comparisons, it should be possible to interrogate conserved segments of larger upstream regions (34). Moreover, the method operates on any ordered set of genes and is particularly convenient for discovering motifs associated with human disease states, e.g., "healthy versus sick" or "treated versus control." Clearly, the method has some limitations. For example, in the current study, applicants were confident in the identity of the transcription factors binding the motifs discovered; in general this may not be the case, and experimental strategies will be needed to systematically determine the occupancy of newly identified motifs. Moreover, a motif may be missed if it lies outside the target promoter region, or if a functional binding site is too degenerate for our motif search strategy.

[0308] Second, the analyses above indicate that the immediate effects of PGC-1α on OXPHOS genes in muscle are largely mediated through Errα and Gabpa. Recent studies have shown that PGC-1β can also co-activate Errα (25). Together, the data imply a model of gene regulation in which PGC-1α (and likely PGC-1β) initially induces the expression of Errα and Gabpa, via a double positive feedback mechanism (FIG. 11). These transcription factors are then expressed at higher levels and are themselves co-activated by PGC-1 to induce downstream genes such as NRF-1 and members of OXPHOS. Certainly, other transcription factors and regulators, not identified in the current study, are involved in the mitochondrial biogenesis program. Whereas previous studies have shown that PGC-1 interacts with and/or induces 15-20 transcription factors in various physiological settings (including Errα and Gabpa (2, 23-25)), the present study points to Errα and Gabpa as being especially important early in the timecourse in muscle and provides a model of how these factors interact in executing the transcriptional program.

[0309] Finally, the results suggest a potential approach to the treatment of type 2 diabetes. Recent studies in diabetic and pre-diabetic humans have demonstrated that there is a consistent decrease in the expression of genes of oxidative phosphorylation that are responsive to PGC-1α and PGC-1β and that treatments that induce PGC-1α (such as exercise) lead to increased expression of OXPHOS genes and improved insulin sensitivity (5, 6, 8, 9). On its face, this might argue for developing therapeutic approaches that raise the transcriptional activity of PGC-1. However, PGC-1 activates many different pathways in many tissues and such approaches may suffer from lack of specificity. For example, global transgenic overexpression of PGC-1β in mice results in resistance to obesity induced by a high-fat diet or by a genetic abnormality, though the contribution of PGC-1β expression in muscle has not been explored (25). On the other hand, a global knockout of Errα also causes a leaner phenotype and resistance to high-fat diet-induced obesity (26). The identification of the critical roles of Errα and Gabpa in mediating the transcriptional program altered in human diabetic muscle may offer a more specific target. Because Errα is an orphan nuclear receptor, it may be an attractive, "druggable" target for diabetes and for other human metabolic disorders.

REFERENCES OF THIRD EXPERIMENTAL SECTION

[0310] 1. Puigserver, P. & Spiegelman, B. M. (2003) Endocr Rev 24, 78-90.

[0311] 2. Wu, Z., Puigserver, P., Andersson, U., Zhang, C., Adelmant, G., Mootha, V., Troy, A., Cinti, S., Lowell, B., Scarpulla, R. C., et al. (1999) Cell 98, 115-24.

[0312] 3. Michael, L. F., Wu, Z., Cheatham, R. B., Puigserver, P., Adelmant, G., Lehman, J. J., Kelly, D. P. & Spiegelman, B. M. (2001) Proc Natl Acad Sci USA 98, 3820-5.

[0313] 4. Lin, J., Wu, H., Tarr, P. T., Zhang, C. Y., Wu, Z., Boss, O., Michael, L. F., Puigserver, P., Isotani, E., Olson, E. N., et al. (2002) Nature 418, 797-801.

[0314] 5. Patti, M. E., Butte, A. J., Crunkhorn, S., Cusi, K., Berria, R., Kashyap, S., Miyazaki, Y., Kohane, I., Costello, M., Saccone, R., et al. (2003) Proc Natl Acad Sci USA 100, 8466-71.

[0315] 6. Mootha, V. K., Lindgren, C. M., Eriksson, K. F., Subramanian, A., Sihag, S., Lehar, J., Puigserver, P., Carlsson, E., Ridderstrale, M., Laurila, E., et al. (2003) Nat Genet 34, 267-73.

[0316] 7. Kelley, D. E., He, J., Menshikova, E. V. & Ritov, V. B. (2002) Diabetes 51, 2944-50.

[0317] 8. Petersen, K. F., Befroy, D., Dufour, S., Dziura, J., Ariyan, C., Rothman, D. L., DiPietro, L., Cline, G. W. & Shulman, G. I. (2003) Science 300, 1140-2.

[0318] 9. Sreekumar, R., Halvatsiotis, P., Schimke, J. C. & Nair, K. S. (2002) Diabetes 51, 1913-20.

[0319] 10. Schadt, E. E., Li, C., Ellis, B. & Wong, W. H. (2001) J Cell Biochem Suppl 37, 120-5.

[0320] 11. Zhong, S., Li, C. & Wong, W. H. (2003) Nucleic Acids Res 31, 3483-6.

[0321] 12. Mootha, V. K., Bunkenborg, J., Olsen, J. V., Hjerrild, M., Wisniewski, J. R., Stahl, E., Bolouri, M. S., Ray, H. N., Sihag, S., Kamal, M., et al. (2003) Cell 115, 629-40.

[0322] 13. Schwartz, S., Kent, W. J., Smit, A., Zhang, Z., Baertsch, R., Hardison, R. C., Haussler, D. & Miller, W. (2003) Genome Res 13, 103-7.

[0323] 14. Handschin, C., Rhee, J., Lin, J., Tarr, P. T. & Spiegelman, B. M. (2003) Proc Natl Acad Sci USA 100, 7111-6.

[0324] 15. Batchelor, A. H., Piper, D. E., de la Brousse, F. C., McKnight, S. L. & Wolberger, C. (1998) Science 279, 1037-41.

[0325] 16. St-Pierre, J., Lin, J., Krauss, S., Tarr, P. T., Yang, R., Newgard, C. B. & Spiegelman, B. M. (2003) J Biol Chem 278, 26597-603.

[0326] 17. Qiu, P. (2003) Biochem Biophys Res Commun 309, 495-501.

[0327] 18. Tavazoie, S., Hughes, J. D., Campbell, M. J., Cho, R. J. & Church, G. M. (1999) Nat Genet 22, 281-5.

[0328] 19. Liu, X. S., Brutlag, D. L. & Liu, J. S. (2002) Nat Biotechnol 20, 835-9.

[0329] 20. Conlon, E. M., Liu, X. S., Lieb, J. D. & Liu, J. S. (2003) Proc Natl Acad Sci USA 100, 3339-44.

[0330] 21. Bussemaker, H. J., Li, H. & Siggia, E. D. (2001) Nat Genet 27, 167-71.

[0331] 22. Johnston, S. D., Liu, X., Zuo, F., Eisenbraun, T. L., Wiley, S. R., Kraus, R. J. & Mertz, J. E. (1997) Mol Endocrinol 11, 342-52.

[0332] 23. Schreiber, S. N., Knutti, D., Brogli, K., Uhlmann, T. & Kralli, A. (2003) J Biol Chem 278, 9013-8.

[0333] 24. Huss, J. M., Kopp, R. P. & Kelly, D. P. (2002) J Biol Chem 277, 40265-74.

[0334] 25. Kamei, Y., Ohizumi, H., Fujitani, Y., Nemoto, T., Tanaka, T., Takahashi, N., Kawada, T., Miyoshi, M., Ezaki, O. & Kakizuka, A. (2003) Proc Natl Acad Sci USA 100, 12378-83.

[0335] 26. Luo, J., Sladek, R., Carrier, J., Bader, J. A., Richard, D. & Giguere, V. (2003) Mol Cell Biol 23, 7947-56.

[0336] 27. Chinenov, Y., Coombs, C. & Martin, M. E. (2000) Gene 261, 311-20.

[0337] 28. Virbasius, J. V. & Scarpulla, R. C. (1994) Proc Natl Acad Sci USA 91, 1309-13.

[0338] 29. Russell, A. P., Feilchenfeldt, J., Schreiber, S., Praz, M., Crettenand, A., Gobelet, C., Meier, C. A., Bell, D. R., Kralli, A., Giacobino, J. P., et al. (2003) Diabetes 52, 2874-81.

[0339] 30. Davidson, E. H., Rast, J. P., Oliveri, P., Ransick, A., Calestani, C., Yuh, C. H., Minokawa, T., Amore, G., Hinman, V., Arenas-Mena, C., et al. (2002) Science 295, 1669-78.

[0340] 31. Sladek, R., Bader, J. A. & Giguere, V. (1997) Mol Cell Biol 17, 5400-9.

[0341] 32. Scarpulla, R. C. (2002) Gene 286, 81-9.

[0342] 33. Baar, K., Song, Z., Semenkovich, C. F., Jones, T. E., Han, D. H., Nolte, L. A., Ojuka, E. O., Chen, M. & Holloszy, J. O. (2003) FASEB J 17, 1666-73.

[0343] 34. Kellis, M., Patterson, N., Endrizzi, M., Birren, B. & Lander, E. S. (2003) Nature 423, 241-54.

[0344] 35. Stunnenberg, H. G. (1993) Bioessays 15, 309-15.

[0345] 36. Chinenov, Y., Henzl, M. & Martin, M. E. (2000) J Biol Chem 275, 7749-56.

[0346] Tables:

TABLE 1. Clinical and biochemical characteristics of male subjects with normal glucose tolerance (NGT), impaired glucose tolerance (IGT), and type 2 diabetes mellitus (DM2).

Columns: Class (NGT, IGT, DM2); P-Value (NGT vs. IGT, IGT vs. DM2, NGT vs. DM2)

n    17    18
Age (yrs)    66.1 (1.0)    66.4 (1.6)    65.5 (1.8)
BMI (kg/m2)    23.6 (3.4)    27.1 (4.8)    27.3 (4.0)    5.70 x 10^-3
WHR    0.91 (0.09)    0.97 (0.04)    0.99 (0.03)    3.00 x 10^-2    3.83 x 10
Trigs (mmol/L)    1.03 (0.40)    1.83 (1.60)    2.04 (1.13)    2.63 x 10
Chol (mmol/L)    5.39 (0.09)    4.60 (148)    5.77 (0.97)
OGTT
  Glucose 0 (mmol/L)    4.67 (0.50)    5.05 (0.46)    7.83 (2.3)    9.22 x 10    2.01 x 10
  Insulin 0 (μU/ml)    5.41 (3.3)    13.38 (8.9)    12.0 (6.0)    4.05 x 10^-2    4.10 x 10
  Glucose 120 (mmol/L)    6.58 (0.94)    9.15 (0.8)    14.9 (4.0)    2.51 x 10^-6    8.91 x 10    4.90 x 10^-8
  Insulin 120 (μU/ml)    33.5 (19.3)    125.1 (66.1)    43.5 (25.6)    5.47 x 10^-3    9.73 x 10^-3
M-value (mg/kg/min)    8.74 (3.15)    6.32 (3.08)    4.22 (1.72)    2.30 x 10
VO2max (ml O2/kg/min)    32.1 (5.46)    26.5 (4.6)    24.3 (5.6)    1.72 x 10^-2    3.09 x 10
Glycogen (mmol/kg)    371.1 (77.0)    326.5 (88.0)    350.6 (97.8)
Type I Fibers
  Number (%)    37.2 (13.5)    33.5 (3.6)    36.4 (9.3)
  Area (%)    39.1 (14.4)    32.7 (0.91)    40.1 (10.7)    2.35 x 10^-2
  Capillaries/Fiber    3.91 (0.72)    4.05 (1.04)    4.14 (0.75)
Type IIb Fibers
  Number (%)    73.8 (42.1)    60.2 (51.4)    72.2 (36.7)
  Area (%)    31.3 (18.0)    24.7 (18.3)    36.2 (15.4)
  Capillaries/Fiber    2.97 (0.71)    3.05 (0.87)    3.02 (0.65)

Values are mean (S.D.). M-value is the total body glucose uptake. VO2max is the total body aerobic capacity. Only P-values < 0.05 are shown for pairwise comparisons, using a two-sided t-test.

[0347] TABLE 2. 149 gene sets considered in the current analysis.

Pathways Curated at WICGR
  FFA Oxidation
  Gluconeogenesis
  Glycolysis
  Glycogen metabolism
  GO:
  Insulin signaling
  Ketone body metabolism
  Pyruvate metabolism
  Reactive oxygen species
  Krebs cycle
  Oxidative phosphorylation (OXPHOS)
  human mitoDB 6 2002
  mitochondria keyword

36 GNF Mouse Expression Clusters
  cluster c0, . . . , cluster c35

Pathways from NetAFFX (October 2002)
  MAP00010 Glycolysis Gluconeogenesis
  MAP00020 Citrate cycle TCA cycle
  MAP00030 Pentose phosphate pathway
  MAP00031 Inositol metabolism
  MAP00040 Pentose and glucuronate interconversions
  MAP00051 Fructose and mannose metabolism
  MAP00052 Galactose metabolism
  MAP00053 Ascorbate and aldarate metabolism
  MAP00061 Fatty acid biosynthesis path 1
  MAP00062 Fatty acid biosynthesis path 2
  MAP00071 Fatty acid metabolism
  MAP00072 Synthesis and degradation of ketone bodies
  MAP00100 Sterol biosynthesis
  MAP00120 Bile acid biosynthesis
  MAP00130 Ubiquinone biosynthesis
  MAP00140 C21 Steroid hormone metabolism
  MAP00150 Androgen and estrogen metabolism
  MAP00190 Oxidative phosphorylation
  MAP00193 ATP synthesis
  MAP00195 Photosynthesis
  MAP00220 Urea cycle and metabolism of amino groups
  MAP00230 Purine metabolism
  MAP00240 Pyrimidine metabolism
  MAP00251 Glutamate metabolism
  MAP00252 Alanine and aspartate metabolism
  MAP00253 Tetracycline biosynthesis
  MAP00260 Glycine serine and threonine metabolism
  MAP00271 Methionine metabolism
  MAP00272 Cysteine metabolism
  MAP00280 Valine leucine and isoleucine degradation
  MAP00290 Valine leucine and isoleucine biosynthesis
  MAP00300 Lysine biosynthesis
  MAP00310 Lysine degradation
  MAP00330 Arginine and proline metabolism
  MAP00340 Histidine metabolism
  MAP00350 Tyrosine metabolism
  MAP00360 Phenylalanine metabolism
  MAP00361 gamma Hexachlorocyclohexane degradation
  MAP00380 Tryptophan metabolism
  MAP00400 Phenylalanine tyrosine and tryptophan biosynthesis
  MAP00410 beta Alanine metabolism
  MAP00430 Taurine and hypotaurine metabolism
  MAP00440 Aminophosphonate metabolism
  MAP00450 Selenoamino acid metabolism
  MAP00460 Cyanoamino acid metabolism
  MAP00471 D Glutamine and D glutamate metabolism
  MAP00472 D Arginine and D ornithine metabolism
  MAP00480 Glutathione metabolism
  MAP00500 Starch and sucrose metabolism
  MAP00510 N Glycans biosynthesis
  MAP00511 N Glycan degradation
  MAP00512 O Glycans biosynthesis
  MAP00520 Nucleotide sugars metabolism
  MAP00521 Streptomycin biosynthesis
  MAP00522 Erythromycin biosynthesis
  MAP00530 Aminosugars metabolism
  MAP00531 Glycosaminoglycan degradation
  MAP00532 Chondroitin Heparan sulfate biosynthesis
  MAP00533 Keratan sulfate biosynthesis
  MAP00550 Peptidoglycan biosynthesis
  MAP00561 Glycerolipid metabolism
  MAP00562 Inositol phosphate metabolism
  MAP00570 Sphingophospholipid biosynthesis
  MAP00580 Phospholipid degradation
  MAP00590 Prostaglandin and leukotriene metabolism
  MAP00600 Sphingoglycolipid metabolism
  MAP00601 Blood group glycolipid biosynthesis lact series
  MAP00602 Blood group glycolipid biosynthesis neolact series
  MAP00603 Globoside metabolism
  MAP00620 Pyruvate metabolism
  MAP00625 Tetrachloroethene degradation
  MAP00630 Glyoxylate and dicarboxylate metabolism
  MAP00631 1 2 Dichloroethane degradation
  MAP00632 Benzoate degradation
  MAP00640 Propanoate metabolism
  MAP00643 Styrene degradation
  MAP00650 Butanoate metabolism
  MAP00670 One carbon pool by folate
  MAP00680 Methane metabolism
  MAP00710 Carbon fixation
  MAP00720 Reductive carboxylate cycle CO2 fixation
  MAP00740 Riboflavin metabolism
  MAP00750 Vitamin B6 metabolism
  MAP00760 Nicotinate and nicotinamide metabolism
  MAP00770 Pantothenate and CoA biosynthesis
  MAP00780 Biotin metabolism
  MAP00790 Folate biosynthesis
  MAP00830 Retinol metabolism
  MAP00860 Porphyrin and chlorophyll metabolism
  MAP00900 Terpenoid biosynthesis
  MAP00910 Nitrogen metabolism
  MAP00920 Sulfur metabolism
  MAP00940 Flavonoids stilbene and lignin biosynthesis
  MAP00950 Alkaloid biosynthesis I
  MAP00960 Alkaloid biosynthesis II
  MAP00970 Aminoacyl tRNA biosynthesis
  MAP03020 RNA polymerase
  MAP03030 DNA polymerase
  MAP03070 Type III secretion system
  MAP03090 Type II secretion system

[0348]

TABLE 3
Genes in the mitochondria expression neighborhood with putative roles in DNA maintenance and repair based on Gene Ontology annotations. The gene name, symbol, and neighborhood index (N100) are provided for each gene.

Gene name                                                         Gene symbol     N100

Transcriptional regulators
MyoD family inhibitor                                             Mdfi            63
nuclear factor I/X                                                Nfix            60
Zinc finger protein 288                                           Zfp288          56
T-box 6                                                           Tbx6            49
Cofactor required for Sp1 transcriptional activation subunit 2    Crsp2           47
RIKEN cDNA 9130025P16 gene                                        9130025P16Rik   46
Kruppel-like factor 9                                             Klf9            43
EGL nine homolog 1                                                Egln1           39
Estrogen related receptor, alpha                                  Esrra           36
nuclease sensitive element binding protein 1                      Nsep1           34
sirtuin 1 (silent mating type information regulation 2, homolog)  Sirt1           31
peroxisome proliferator activated receptor alpha                  Ppara           29
metastasis associated 1-like 1                                    Mta1l1          28
NK2 transcription factor related, locus 5                         Nkx2-5          27
cardiac responsive adriamycin protein                             Crap            24
homeo box D8                                                      Hoxd8           21
nuclear receptor subfamily 1, group I, member 2                   Nr1i2           21
nuclear receptor subfamily 1, group H, member 3                   Nr1h3           20
cellular nucleic acid binding protein                             Cnbp            19
transcription factor 2                                            Tcf2            19
Ets2 repressor factor                                             Erf             19
nuclear receptor subfamily 5, group A, member 1                   Nr5a1           18
nuclear factor, erythroid derived 2,-like 1                       Nfe2l1          18
Zinc finger protein 30                                            Zfp30           17
peroxisome proliferator activated receptor gamma                  Pparg           17
cAMP responsive element binding protein 1                         Creb1           15
SRY-box containing gene 6                                         Sox6            15
CCAAT/enhancer binding protein (C/EBP), alpha                     Cebpa           15

DNA repair
mutL homolog 1                                                    Mlh1            29
mutS homolog 5                                                    Msh5            24
excision repair cross-complementing rodent repair                 Ercc1           15
  deficiency, complementation group 1
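The neighborhood index reported for each gene in Table 3 can be illustrated with a short sketch. It assumes that the index counts how many genes of a reference set (here, the mitochondrial gene list) fall among the k nearest expression neighbors of a query gene, and that "nearest" means smallest Euclidean distance between expression profiles; both the distance measure and the default k = 100 are assumptions made for illustration, and the gene names and profiles below are hypothetical toy data, not values from the tables.

```python
import math


def neighborhood_index(target, profiles, reference_set, k=100):
    """Count reference-set genes among the k nearest expression neighbors.

    profiles maps gene -> sequence of expression values (one per tissue).
    Euclidean distance is an assumption made for this sketch.
    """
    target_profile = profiles[target]
    distances = []
    for gene, profile in profiles.items():
        if gene == target:
            continue  # a gene is not its own neighbor
        distances.append((math.dist(target_profile, profile), gene))
    distances.sort()  # closest neighbors first
    nearest = [gene for _, gene in distances[:k]]
    return sum(1 for gene in nearest if gene in reference_set)


# Hypothetical toy data: five genes measured in two tissues.
toy_profiles = {
    "g1": (1.0, 2.0),
    "g2": (1.1, 2.1),
    "g3": (0.9, 1.9),
    "g4": (5.0, 5.0),
    "g5": (6.0, 6.0),
}
toy_reference = {"g2", "g4"}  # stand-in for the mitochondrial gene list

index_k2 = neighborhood_index("g1", toy_profiles, toy_reference, k=2)
index_k3 = neighborhood_index("g1", toy_profiles, toy_reference, k=3)
```

With k = 2 only g2 and g3 are neighbors of g1, so one reference gene is counted; widening to k = 3 also brings in g4.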

[0349]

TABLE 4
Annotation and experimental support for the mito-A proteins. The mito-A list of protein clusters consists of proteins that are physically associated with mitochondria, based on previous annotations or based on organelle proteomics. The list is produced by pooling all the individual proteins identified in the organelle proteomics survey with proteins previously annotated as being mitochondrial. These proteins were then clustered into 601 groups using a BLAST procedure (see Methods). Each cluster may be supported by previous annotations, organelle proteomics, or by both (protein accessions are indicated in the appropriate columns). Of the 601 clusters, 10 correspond to expected contaminants and have been flagged. The remaining 591 constitute the mito-A list that is used in the analysis. For each mito-A cluster, an exemplar protein (typically corresponding to a Reference Sequence) accession and description are provided. GenPept or Swissprot accession numbers of the cluster members are provided in the appropriate columns. Of the 591 mito-A clusters, 37 appeared to be obviously mitochondrial based on the description, so these have been flagged as mitochondrial in a dedicated column called "by name."

Exemplar Protein for the Cluster                                      Previous Mitochondrial Annotations
Accession  Description                                                LocusLink Mouse  MITOP Mouse  LocusLink Human  MITOP Human
19354491   1110020P15Rik protein Mus musculus                         -                -            -                -
13385680   2,4-dienoyl CoA reductase 1, mitochondrial Mus musculus    -                -            4503301          S53352
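The Table 4 caption describes pooling proteins from two sources, clustering them, and then setting aside flagged contaminant clusters (601 clusters, 10 flagged, 591 retained). The sketch below illustrates that bookkeeping under stated assumptions: the actual BLAST procedure is given in the Methods and is not reproduced here, so the grouping rule (any two proteins linked by a reported similarity hit end up in the same cluster) is an assumption, and all accession strings are hypothetical.

```python
def cluster_proteins(accessions, similar_pairs):
    """Group accessions into clusters with a union-find structure.

    similar_pairs lists (a, b) tuples treated as significant similarity
    hits; every accession in a pair must also appear in accessions.
    Single-linkage grouping is an assumption of this sketch.
    """
    parent = {acc: acc for acc in accessions}

    def find(acc):
        # Walk to the cluster root, shortening the path as we go.
        while parent[acc] != acc:
            parent[acc] = parent[parent[acc]]
            acc = parent[acc]
        return acc

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for acc in accessions:
        clusters.setdefault(find(acc), set()).add(acc)
    return list(clusters.values())


def build_cluster_list(proteomics, annotated, similar_pairs, contaminants):
    """Pool both sources, cluster them, and drop flagged contaminant clusters."""
    pooled = sorted(set(proteomics) | set(annotated))
    clusters = cluster_proteins(pooled, similar_pairs)
    kept = [c for c in clusters if not (c & contaminants)]
    return clusters, kept


# Hypothetical accessions, for illustration only.
all_clusters, kept_clusters = build_cluster_list(
    proteomics=["p1", "p2", "k1"],
    annotated=["p2", "p3", "p4"],
    similar_pairs=[("p1", "p3")],
    contaminants={"k1"},
)
```

In this toy run the five pooled proteins form four clusters, and removing the one contaminant cluster leaves three, mirroring the 601 - 10 = 591 arithmetic in the caption.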

TABLE 4-continued
Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 20071710 2010.002H18Rik protein Mus musculus 21 630283 2'-5' oligoadenylate synthetase 1A Mus musculus P29080 P11928 P1 A22842 B24359 A91013 21644597 2'-5' oligoadenylate synthetase 2; 2'-5' oligoadenylate B42665 A42665 synthetase-like 11 Mus 3-hydroxy-3-methylglutaryl-Coenzyme A lyase Mus 2SO22682 HMGL MOUSE A45470 musculus 2SO49209 668O233 31S60689 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 2 27734729 B55729 5031751 SS.1103 Mus musculus 2096.5433 2O874930 3.19821.69 3-hydroxybutyrate dehydrogenase (heart, 17738292 A42845 mitochondrial): 3-hydroxybutyrate 21704140 3-hydroxyisobutyrate dehydrogenase, mitochondrial D3HI HUMAN precursor: EST AI265272: 20149758 3-mercaptopyruvate sulfurtransferase; e Mus ROHU musculus 48.1864 3-methyl-2-oxobutanoate dehydrogenase (lipoamide) 4557353 A37157 (EC 1.2.4.4) - mouse 1826668O 3-oxoacid CoA transferase Mus musculus SCOT HUMAN 119681 60 3-oxoacid CoA transferase 2A; haploid germ cell specific Succinyl CoA 6679066 4-nitrophenylphosphatase domain and non-neuronal SNAP25-like protein homolog 1 2O127399 5',3'-deoxyribonucleotidase, mitochondrial Mus 2O127399 991O372 musculus 1892.1208 8-oxoguanine DNA-glycosylase 1 Mus musculus 1892.1208 991 O174 A kinase (PRKA) anchor protein 10; protein kinase A 991 O174 anchoring protein Mus 3072584.5 AAA-ATPase TOB3 Mus musculus 1167982 ABC transporter-7 ABC7 HUMAN 2145O129 acetyl-Coenzyme A acetyltransferase 1 precursor Mus 2145O129 4557237 JHO255 musculus 29.1262OS acetyl-Coenzyme A acyltransferase 2 (mitochondrial 3 S174429 S43440 oxoacyl-Coenzyme A 2O841184 acetyl-Coenzyme A carboxylase beta Mus musculus 3.1982S2O acetyl-Coenzyme A dehydrogenase, long-chain Mus ACDL MOUSE 45O1857 A40559 musculus 6680618 acetyl-Coenzyme A dehydrogenase, medium chain ASS724 4557231 IS2240 Mus musculus 97.90059 acid phosphatase 6, lysophosphatidic; acid 21359911 
phosphatase like 1 Mus musculus 18079339 aconitase 2, mitochondrial Mus musculus 18079339 45O1867 Q99798 88SO209 actin-like Mus musculus 3.1982S22 acyl-Coenzyme A dehydrogenase, short chain; acetyl 668062O 496OS 4557233 A306OS Coenzyme A dehydrogenase, 17647119 acyl-Coenzyme A dehydrogenase, short branched 45O1859 ASS68O chain Mus musculus 23956084 acyl-Coenzyme A dehydrogenase, very long chain Mus 23956084 ACDV MOUSE 4557235 ACDB. HUMAN musculus 2SOS 6160 2O881925 7656855 acyl-Coenzyme A oxidase 1, palmitoyl; acyl-Coenzyme A oxidase; Acyl-CoA oxidase 12331400 acyl-Coenzyme A thioesterase 3, mitochondrial; MT 12331400 6912S18 ACT48, p48 Mus musculus 9790025 6753074 adaptor protein complex AP-2, mul; adaptor-related protein complex AP-2, mul; US 2007/0203083 A1 Aug. 30, 2007 42

TABLE 4-continued
Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 10946936 adenylate kinase 1: cytosolic adenylate kinase Mus musculus 34328.230 adenylate kinase 2 Mus musculus 8392883 KAD2 HUMAN 23956104 adenylate kinase 3 alpha-like; adenylate kinase 3 alpha like Mus musculus adenylate kinase 4Mus musculus 6753022 KIHUA3 1690SO99 AFG3(ATPase family gene 3)-like 1 Mus musculus 16905099 58O2970 A-kinase anchor protein 1: A kinase anchor protein 6753O3O I39173 Mus musculus 77099.78 alanine-glyoxylate aminotransferase: alanine-glyoxylate 77099.78 P21549 XNHUSP aminotransferase 1 Mus aldehyde dehydrogenase 2, mitochondrial Mus 25777732 A40872 DEHUE2 musculus 19527258 aldehyde dehydrogenase family 6, Subfamily A1 Mus MMSA HUMAN musculus 20070418 aldehyde dehydrogenase family 7, member A1; aldehyde dehydrogenase 7 family, 276,59728 aldo-keto reductase family 7, member A5 (aflatoxin aldehyde reductase); 13435924 aldolase 3, C isoform Mus musculus 6678766 alpha-methylacyl-CoA racemase; alpha-methylacyl 6678766 23618869 Coenzyme A racemase; 3.1980703 aminoadipate-semialdehyde synthase; lysine oxoglutarate reductase, Saccharopine 33859502 aminolevulinic acid synthase 2, erythroid; erythroid 20985872 SYMSAL SYHUAL SYHUAE specific ALAS: 1350762O ankycorbin; NORPEG-like protein Mus musculus 6753058 annexin A10 Mus musculus 21541818 AP endonuclease 2 Mus musculus 21541818 6753110 arginase type II Mus musculus 6753110 4502215 ARG2 HUMAN 25089776 ATP synthase D chain, mitochondrial PNOO46 5834959 ATP synthase FO subunit 6 Mus musculus 5834959 PWMS6 27754208 PWHU6 5834958 ATP synthase FO subunit 8 Mus musculus 5834958 PWMS8 PWHU8 21263432 ATP synthase gamma chain, mitochondrial precursor PTO095 3.1980648 ATP synthase, H+ transporting mitochondrial F1 2SOS2136 PS648O 4502295 A33370 complex, beta subunit; ATP 79490O3 33859512 ATP synthase, H+ transporting, mitochondrial FO 2O875157 21361565 JQ1144 
complex, Subunit b, isoform 1 25O2O5O2 3.1982497 ATP synthase, H+ transporting, mitochondrial FO 6680750 AT91 MOUSE 38612 S34067 complex, Subunit c (subunit 9), S34066 10181184 ATP synthase, H+ transporting, mitochondrial FO 10181184 PS613S complex, Subunit f, isoform 2: 79490OS ATP synthase, H+ transporting, mitochondrial FO 79490OS PDO444 18644883 JTOS63 complex, S l bl illit t F: 3.1980744 ATP synthase, H+ transporting, mitochondrial FO complex, subunit g; F1F0-ATP 668.0748 ATP synthase, H+ transporting, mitochondrial F1 668O748 JC1473 4757810 PWHUA complex, alpha Subunit, isoform 13385484 ATP synthase, H+ transporting, mitochondrial F1 S901896 complex, epsilon subunit; ATP 116O2916 ATP synthase, H+ transporting, mitochondrial F1 116O2916 4.885079 A49108 complex, gamma polypeptide 1: F 20070412 ATP synthase, H+ transporting, mitochondrial F1 ATPO HUMAN complex, O subunit Mus 6671592 ATP synthase, H+ transporting, mitochondrial F1FO 6671592 JC1412 complex, Subunit e Mus US 2007/0203083 A1 Aug. 30, 2007 43

TABLE 4-continued
Previous Mitochondrial Annotations Exemplar Protein for the Cluster LocusLink LocusLink

Accession Description Mouse MITOP Mouse Human MITOP Human 3.198.2864 ATPase inhibitor Mus musculus 6671594 7705927 JC7175 6680758 ATPase, Cu++ transporting, beta polypeptide; Wilson S4OS25 protein; toxic milk Mus 31560731 ATPase, H+ transporting, V1 subunit A, isoform 1: ATPase, H+ transporting, ATPase, H+ transporting, V1 subunit E isoform 1: ATPase, H+ transporting 6680612 ATP-binding cassette, Sub-family D, member 3: peroxisomal membrane protein, 70 ATP-specific Succinyl-CoA synthetase beta subunit 2O876884 Mus musculus 7709988 AU RNA-binding enoyl-coenzyme A hydratase; AU 7709988 18426971 RNA-binding protein enoyl-coenzyme 25052987 67S31 68 B-cell leukemia/lymphoma 2 Mus musculus 67S3168 TVMSA1 B2S960 TVHUA1D37332 6671622 B-cell receptor-associated protein 37; repressor of estrogen receptor activity 67531.98 BCL2/adenovirus E1B 19 kDa-interacting protein 1, 67531.98 NIP3; BCL2/adenovirus E1 B 19 6753200 BCL2/adenovirus E1B 19 kDa-interacting protein 3-like: 6753200 4757860 NIPL HUMAN BCL2 adenovirus E1B 19 6680770 Bcl2-associated X protein Mus musculus BAXA MOUSE BAXA HUMAN 3.1981887 Bcl2-like Mus musculus 6753170 20336335 BCLX HUMAN 4SO2381 3.198.1875 benzodiazepine receptor, peripheral Mus musculus 67S3216 AS3405 I381OS 31542228 BH3 interacting domain death agonist Mus musculus 4557361 BID HUMAN 9055178 brain protein 44-like: apoptosis-regulating basic protein Mus musculus 338S9514 branched chain aminotransferase 2, mitochondrial Mus 23597235 45O2375 BCAM HUMAN musculus 3.1982494 branched chain ketoacid dehydrogenase E1, alpha 6671624 S71881 11386.13S DEHUXA polypeptide; BCKAD E1a Mus 67S3164 branched chain ketoacid dehydrogenase kinase; 67S3164 SO31.609 branched chain keto acid 16905127 butyryl Coenzyme A synthetase 1; acetyl-Coenzyme A synthetase 3 Mus musculus 6753290 calsequestrin 1 Mus musculus A60424 738.1085 carbamoylphosphate Synthetase I Mus musculus 21361331 JQ1348 6671 680 carbonic anhydrase 5a, mitochondrial; carbonic 6671 680 S12579 4SO2S21 CRHUS 
anhydrase 5, mitochondrial; 95.06463 carbonic anhydrase 5b, mitochondrial; carbonic 95.06463 6005723 anhydrase VB; carbonic anhydrase 6671 688 carbonyl reductase 2: lung carbonyl reductase Mus 6671 688 A280S3 musculus 6.681009 carnitine acetyltransferase Mus musculus 6681009 CACP MOUSE 21.618331 ASS720 21.618334 21.618336 27804309 carnitine palmitoyltransferase 1, liver; L-CPT IMus 2O884997 4SO3O21 IS9351 musculus 27804309 6753512 carnitine palmitoyltransferase 1, muscle: M-CPT I Mus 23238.254 S70579 musculus 23238.256 4758.050 23238.258 6753514 carnitine palmitoyltransferase 2: CPT II Mus musculus 6753514 A49362 A390 18 67S3454 caseinolytic protease X Mus musculus 67S3454 7242140 CLPX HUMAN 83931S6 caseinolytic protease, ATP-dependent, proteolytic 83931.56 S68421 Subunit homolog: caseinolytic 2O847456 caspase 8 Mus musculus 15718704 157187O6 US 2007/0203083 A1 Aug 30, 2007 44

TABLE 4-continued
Previous Mitochondrial Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 15718708 15718710 15718712 6753272 catalase; catalase 1 Mus musculus 6681079 cathepsin B preproprotein Mus musculus 6753556 cathepsin D Mus musculus 11968166 cathepsin Z preproprotein: cathepsin Z precursor: cathepsin X Mus musculus 31560609 ceroid lipofuscinosis, neuronal 3, juvenile (Batten, 4SO2889 Spielmeyer-Vogt disease) 6753448 ceroid-lipofuscinosis, neuronal 2 Mus musculus 7304963 chloride intracellular channel 4 (mitochondrial) Mus 73O4963 musculus 13385942 citrate synthase Mus musculus 4758.076 6680816 complement component 1. 
q Subcomponent binding 668O816 protein Mus musculus 6681007 coproporphyrinogen oxidase; clone 560 Mus musculus 6.681007 A48049 2O1274.06 IS2444 10946574 creatine kinase, brain Mus musculus 6753428 creatine kinase, mitochondrial 1, ubiquitous Mus 67S3428 S24612 4SO28SS A35756 A30789 musculus 10334859 6681.031 cryptochrome 1 (photolyase-like) Mus musculus 4758072 201006 Cu/Zn-Superoxide dismutase 5834966 cytochrome b Mus musculus 5834966 CBMS CBHU 22094.077 cytochrome b-245, alpha polypeptide; cytochrome beta- 4557505 558; p.22 phox Mus 31542440 cytochrome b-245, beta polypeptide Mus musculus 6996021 13385268 cytochrome b-5 Mus musculus 4503183 CBHUS CBEHUSE 5834956 cytochrome c oxidase subunit I Mus musculus 5834956 ODMS1 27754204 ODHU1 5834957 cytochrome c oxidase subunit II Mus musculus 5834957 OBMS2 27754206 OBHU2 5834960 cytochrome c oxidase subunit III Mus musculus 5834960 OTMS3 OTHU3 16716379 cytochrome c oxidase subunit IV isoform 2 precursor: 16716379 Cox IV-2 Mus musculus 6677977 cytochrome c oxidase subunit VIIa polypeptide 2-like: 6677977 O14548 silica-induced gene 81 13384754 cytochrome c oxidase subunit VIIb Mus musculus 13384754 4SO2991. 
OSHUfB 6753498 cytochrome c oxidase, subunit IVa; cytochrome c 67S3498 S12142 OLEHU4 oxidase, subunit IV Mus 6680986 cytochrome c oxidase, subunit Va Mus musculus 6680986 SOS495 4758O38 OTHUSA 6753500 cytochrome c oxidase, subunit Vb Mus musculus 67S3500 A39425 OTHUSB 668.0988 cytochrome c oxidase, subunit VI a, polypeptide 1: 6680988 SS2088 OGHU6L subunit VIaL (liver-type) 6753502 cytochrome c oxidase, subunit VI a, polypeptide 2: 6753502 COXD MOUSE OGHU6A subunit VIaH (heart-type) 13385090 cytochrome c oxidase, subunit VIb Mus musculus OGHU6B 16716343 cytochrome c oxidase, subunit VIc Mus musculus S16083 OGHU6C 6753504 cytochrome c oxidase, subunit VIIa 1; cytochrome c 6753504 oxidase subunit VIIa 1 Mus 31981830 cytochrome c oxidase, subunit VIIa 2; cytochrome c 67S3506 48286 oxidase subunit VIIa 3: 6680991 cytochrome c oxidase, subunit VIIc: cytochrome c 25025041 COXO MOUSE OSHUFC oxidase subunit VIIc Mus 2SOS3109 S1 O303 25057077 668O991 6680993 cytochrome c oxidase, subunit VIIIa: COX VIII-L Mus 668O993 COXR MOUSE 475804.4 OSHU8 musculus 6680995 cytochrome c oxidase, subunit VIIIb: COX VIII-H Mus 6680995 COXQ MOUSE musculus US 2007/0203083 A1 Aug. 30, 2007 45

TABLE 4-continued
Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 167583O8 cytochrome c oxidase, subunit XVII assembly protein Q14061 homolog Rattus norvegicus 6.681095 cytochrome c, somatic Mus musculus 6753560 CCMS CCMST 11128O19 CCHU 1338SOO6 cytochrome c-1 Mus musculus 21359867 SOO680 23 1896 Cytochrome P450 11B2, mitochondrial precursor 13904853 B34181 S11338 (CYPXIB2) (P450C11) (Steroid 2O867579 cytochrome P450, 40 (25-hydroxyvitamin D31 alpha 2O867579 hydroxylase) Mus musculus 97899.21 cytochrome P450, family 11, subfamily a, polypeptide 1: 97899.21 4SO3189 A25922 S14367 cytochrome P450, 11a, 7106287 cytochrome P450, family 11, subfamily b, polypeptide 2: 7106287 A41552 6.681097 cytochrome P450, family 17, subfamily a, polypeptide 1: cytochrome P450, 17: 6753572 cytochrome P450, family 24, subfamily a, polypeptide 1: 6753572 S60033 A47436 cytochrome P450, 24: 305784O1 cytochrome P450, family 27, subfamily a, polypeptide 1: 4SO3211 A39740 cytochrome P450, 27: 1887S324 DAZ associated protein 1 Mus musculus 1887S324 17505907 DEAD (Asp-Glu-Ala-Asp) box polypeptide 31 isoform 1: DEADDEXH helicase DDX31 2O587962 emethyl-Q 7 Mus musculus 2S453484 73O4999 eoxyguanosine kinase Mus musculus 73O4999 18426967 JC6142 18426963 18426969 1842696S 21281 687 eoxyuridine triphosphatase Mus musculus 4503423 DUT HUMAN 1974.5150 iaphorase 1 (NADH) Mus musculus RDHUBS 6681137 iazepam binding inhibitor; acyl-CoA binding protein; iazepam-binding inhibitor 67S3610 ihydrolipoamide branched chain transacylase E2; 6753610 S65760 4SO326S A32422 BCKAD E2 Mus musculus 3.198.2856 ihydrolipoamide dehydrogenase Mus musculus 6681189 107450 4557525 DEHULP 31542.559 ihydrolipoamide S-acetyltransferase (E2 component of 2163O2SS S2566S XXHU pyruvate dehydrogenase 21313536 ihydrolipoamide S-Succinyltransferase (E2 component 21313536 PNO673 of 2-oxo-glutarate complex) 99.101.94 ihydroorotate dehydrogenase Mus musculus 991 
O194 16753223 PC1219 6753676 ihydropyrimidinase-like 2; collapsin response mediator protein 2 Mus musculus 21311901 imethylglycine dehydrogenase precursor Mus 24797151 M2GD HUMAN musculus 34328271 irect IAP binding protein with low PIMus musculus 1296.3593 9845297 21070978 21070976 34.328379 D-lactate dehydrogenase Mus musculus 19527228 DNA segment, Chr 10, ERATO Doi 214, expressed Mus musculus 20070420 DNA segment, Chr 10, Johns Hopkins University 81 JC4913 JC4914 expressed Mus musculus 2SO92662 DNA segment, Chr 11, Wayne State University 68, expressed Mus musculus 27552760 DNA segment, Chr 16, Indiana University Medical 22, 27552760 expressed Mus musculus 1486.1848 DNA segment, Chr 7, Roswell Park 2 complex, expressed; androgen regulated gene 31560O85 DnaJ (Hsp40) homolog, Subfamily A, member 3 Mus 139941SS musculus 2505.3902 US 2007/0203083 A1 Aug. 30, 2007 46

TABLE 4-continued
Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 3.198.1810 dodecenoyl-Coenzyme A delta isomerase (3.2 trans 6753612 S38770 4503267 A55723 enoyl-Coenyme A isomerase) Mus 3.1981826 electron transferring flavoprotein, alpha polypeptide; 4SO3607 A31998 Alpha-ETF Mus musculus 21313290 electron transferring flavoprotein, dehydrogenase Mus Q16134 musculus 6679647 endonuclease G Mus musculus 6679647 475827O NUCG HUMAN 1992.3857 endothelial cell growth factor 1; thymidine P19971 phosphorylase; gliostatin; platelet 7949037 enoyl coenzyme A hydratase 1, peroxisomal; 7949037 peroxisomal mitochondrial dienoyl-CoA 29789289 enoyl Coenzyme A hydratase, short chain, 1, 12707570 ECHM HUMAN mitochondrial Mus musculus 73051.25 estradiol 17 beta-dehydrogenase 8; 17-beta hydroxysteroid dehydrogenase 8; 18079334 ethanol induced 6 Mus musculus 6679078 expressed in non-metastatic cells 2, protein; expressed in non-metastatic cells 9790123 expressed in non-metastatic cells 4, protein; nucleoside 9790123 4826862. NDKM HUMAN diphosphate kinase 21618729 Facl5 protein Mus musculus 3156O705 atty acid Coenzyme Aligase, long chain 2; acetyl LCFA HUMAN Coenzyme A synthetase; JXO2O2 6679765 ferredoxin 1: ADRENODOXIN Mus musculus 6679765 S53524. 
4758352 AXEHU 6679767 ferredoxin reductase Mus musculus 6679767 S60028 4758354 A40487 13435350 13385780 ferritin heavy chain 3; mitochondrial ferritin Mus 1338.578O musculus 20452466 ferrochelatase Mus musculus 2O4S2466 A37972 A36403 10946808 fibroblast growth factor (acidic) intracellular binding 7262378 protein; aFGF 33469107 folylpolyglutamyl synthetase Mus musculus 2O824150 S65755 22O24385 A46281 9507187 fractured callus expressed transcript 1: Fracture Callus 9507187 ; Small Zinc 6679863 frataxin Mus musculus 6679863 4503785 33859554 fumarate hydratase 1 Mus musculus 2O831568 19743875 UFHUM 200704O2 G elongation factor; mitochondrial Mus musculus 12963633 genes associated with retinoid-IFN-induced mortality 19 Mus musculus 6679957 glioblastoma amplified Mus musculus 31982798 glucokinase; hexokinase 4 Mus musculus A46157 CA6157 6680027 glutamate dehydrogenase Mus musculus 668OO27 S16239 2748S958 AS3719 DEHUE 4.885281 691.2392 6754036 glutamate oxaloacetate transaminase 2, mitochondrial; 67S4O36 SO1174 4504O69 XNHUDM mitochondrial aspartate 31982332 glutamate-ammonia ligase (glutamine synthase); glutamine synthetase Mus 31982847 glutamic acid decarboxylase 1 Mus musculus 6679959 glutaryl-Coenzyme A dehydrogenase Mus musculus 6679959 GCDH MOUSE 4503943 GCDH HUMAN 76694.94 668.0075 glutathione peroxidase 1: cellular GPx Mus musculus 668OO75 1354.0480 glutathione peroxidase 4: sperm nuclei glutathione 13S4O480 4SO4107 peroxidase; phospholipid 34328489 glutathione reductase 1 Mus musculus 13775154 21313138 glutathione S-transferase class kappa Mus musculus 6754092 glutathione transferase Zeta 1 (maleylacetoacetate isomerase); US 2007/0203083 A1 Aug. 30, 2007 47

TABLE 4-continued
Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 6679937 glyceraldehyde-3-phosphate dehydrogenase Mus musculus 6680139 glycerol kinase Mus musculus GKP2 HUMAN GLPK HUMAN 34536827 glycerol-3-phosphate acyltransferase, mitochondrial 668OOS7 Mus musculus 31981769 glycerol-3-phosphate dehydrogenase 2: glycerol 6753970 450408.5 GPDM HUMAN phosphate dehydrogenase 1, 13385454 glycine amidinotransferase (L-arginine:glycine 13385454 4SO3933 S41734 amidinotransferase) Mus 31560488 glycine C-acetyltransferase (2-amino-3-ketobutyrate- 7305083 coenzyme Aligase); 20070408 glycine decarboxylase Mus musculus B39521 6806917 GM2 ganglioside activator protein Mus musculus 6680107 granulin; acrogranulin; progranulin; PC cell-derived growth factor Mus 12746414 growth factor, erv1 (S. cerevisiae)-like (augmenter of iver regeneration); 13277394 GrpE-like 1, mitochondrial Mus musculus 13277394 29789124 GrpE-like 2, mitochondrial Mus musculus 2O878923 3766203 GTP-specific succinyl-CoA synthetase beta subunit 2O8288.15 TO8812 Mus musculus 2137368 H+-transporting two-sector ATPase (EC 3.6.3.14) chain SS8660 c - mouse (fragments) 6680309 heat shock protein 1 (chaperonin 10); heat shock 10 kDa 6680309 ASSO75 S47532 protein 1 (chaperonin CH10 MOUSE 31981679 heat shock protein 1 (chaperonin); heat shock protein, HHMS60 A32800 60 kDa: heat shock 60 kDa 6680305 heat shock protein 1, beta; heat shock protein, 84 kDa ; heat shock 90 kDa 31560686 heat shock protein 2: heat shock protein, 70 kDa 2: heat B45871 shock 70 kDa protein 2 675.4256 heat shock protein, A.; heat shock protein cognate 74; 67S4256 A481.27 24234688 B481.27 heat shock protein, 74 2SO24532 6680277 heat-responsive protein 12 Mus musculus 7305137 heme binding protein 1: heme-binding protein; p22 HBP: heme-binding protein 1 6680175 hemoglobin alpha, adult chain 1: alpha 1 globin Mus musculus 122513 Hemoglobin beta-1 chain (B1) (Major) 
31982300 hemoglobin, beta adult major chain; beta major globin; beta maj Mus musculus 6754206 hexokinase 1: downeast anemia Mus musculus A35244 A31869 JC2O2S 2098.2837 holocarboxylase synthetase; biotin-propriony- BPL1 HUMAN Coenzyme A-carboxylase 31542950 holocytochrome c synthetase Mus musculus 668O181 CCHL MOUSE GO2133 6754160 HS1 binding protein Mus musculus 6754160 13435356 12963539 HSCO protein Mus musculus 7949047 hydroxyacyl-Coenzyme A dehydrogenase type II; hydroxyacyl-Coenzyme A 21704100 hydroxyacyl-Coenzyme A dehydrogenase/3-ketoacyl- 4SO4327 JC2109 Coenzyme A 33.859811 hydroxyacyl-Coenzyme A dehydrogenase/3-ketoacyl- 2O1274.08 JC2108 Coenzyme A 31982273 hydroxysteroid (17-beta) dehydrogenase 4: hydroxysteroid 17-beta dehydrogenase 6680291 hydroxysteroid dehydrogenase-4, delta-3-beta; 3-beta- 2O874991 49762 DEHUHS DEHUH2 hydroxysteroid 23397415 3BH3 MOUSE US 2007/0203083 A1 Aug 30, 2007 48

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 23621517 3BH4 MOUSE 668O289 3BH5 MOUSE 668O291 3BEH6 MOUSE 668O293 3BH2 MOUSE 7305167 2SO46137 27754O71 hypothetical protein 4833421E05Rik Mus musculus 21311867 hypothetical protein D11 Erta 99e Mus musculus 21312O20 hypothetical protein D4Ertd765e Mus musculus 22122743 hypothetical protein MGC37245 Mus musculus 21313262 inner membrane protein, mitochondrial Mus musculus 222O3753 inorganic pyrophosphatase 2 Mus musculus 14916467 inositol polyphosphate-5-phosphatase E; inositol 14916467 polyphosphate-5-phosphatase, 72 27370516 isocitrate dehydrogenase 2 (NADP+), mitochondrial 668O343 IDHP MOUSE 4504575 S57499 Mus musculus 182SO284 isocitrate dehydrogenase 3 (NAD+) alpha Mus 5031777 S55282 musculus 668O345 isocitrate dehydrogenase 3 (NAD+), gamma Mus IDHG HUMAN musculus 187OOO24 isocitrate dehydrogenase 3, beta Subunit; isocitrate S901982 IDHB HUMAN dehydrogenase 3 beta; N14A 9789985 isovaleryl coenzyme A dehydrogenase; isovaleryl 4504799 A37033 dehydrogenase precursor Mus 6754482 keratin complex 1, acidic, gene 18; keratin 18 Mus musculus 6754488 keratin complex 2, basic, gene 6b Mus musculus 1948.2166 kidney expressed gene 1 Mus musculus 2SO31694 kinesin family member 1B Mus musculus 20850523 2SO31694 19527030 kynurenine 3-monooxygenase (kynurenine 3 hydroxylase) Mus musculus 67S44.08 kynurenine aminotransferase II Mus musculus 668O163 L-3-hydroxyacyl-Coenzyme A dehydrogenase, short 668O163 JC4210 4885387 JC4879 chain; hydroxylacyl-Coenzyme A 217O3764 actamase, beta 2 Mus musculus 13SO7666 actamase, beta; serine beta lactamase-like protein; 13SO7666 mitochondrial ribosomal 3.1981147 eucine aminopeptidase 3: leucine aminopeptidase Mus musculus 9789997 eucine Zipper-EF-hand containing transmembrane protein 1; leucine 2138932O eucine-rich PPR motif-containing protein; leucine rich protein LRP130 Mus 233466.17 eucyl-tRNA 
synthetase Mus musculus SYLM HUMAN 13277380 lipoic acid synthetase Mus musculus 13277380 6678716 low density lipoprotein receptor-related protein 5; low density 21539585 low molecular mass ubiquinone-binding protein; 21539585 ubiquinol-cytochrome c reductase 31541815 L-specific multifunctional beta-oxidation protein Mus musculus 667876O lysophospholipase 1: phospholipase 1a; lysophospholipase 1 Mus musculus 839.3739 lysozyme Mus musculus 13654245 major urinary protein 1 Mus musculus 3.19821.86 malate dehydrogenase, mitochondrial Mus musculus 6678916 DEMSMM MDHM HUMAN

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 21703972 malic enzyme 2, NAD(+)-dependent, mitochondrial 4SOS145 A39SO3 Mus musculus 31542169 malic enzyme 3, NADP(+)-dependent, mitochondrial S53351 Mus musculus 991 0434 malonyl-CoA decarboxylase Mus musculus 6912498 DCMC HUMAN 6754760 mature T-cell proliferation 1 Mus musculus 6754760 7305291 metaxin 1; metaxin Mus musculus 7305291 MTXN HUMAN 31543274 metaxin 2 Mus musculus 7949084 3.1981013 methionine Sulfoxide reductase A Mus musculus 3.19807O6 methylcrotonoyl-Coenzyme A carboxylase 1 (alpha) 1296.5187 Mus musculus 6678952 methylenetetrahydrofolate dehydrogenase (NAD+ 6678952 A33267 5729.935 DEHUMT dependent), 2O270275 methylenetetrahydrofolate dehydrogenase 1: C1 13699868 A31903 tetrahydrofolate synthase Mus 667897O methylmalonyl-Coenzyme A mutase Mus musculus 667897O SO868O 45.57767 3.1981068 microsomal glutathione S-transferase 1 Mus musculus 3O794474 mitchondrial ribosomal protein S7; ribosomal protein, mitochondrial, S7 Mus 195274O2 mitochondrial acyl-CoA thioesterase 1 Mus musculus 195274O2 13386040 mitochondrial ATP synthase regulatory component 13386040 factor BMus musculus 1SO11842 mitochondrial capsule selenoprotein; sperm 1SO11842 A371.99 MCS HUMAN mitochondria associated cysteine-rich 97.90055 mitochondrial carrier homolog 2 Mus musculus 28.076953 mitochondrial intermediate peptidase Mus musculus 5174567 27SO2349 mitochondrial matrix processing protease, alpha subunit Q10713 Mus musculus 315598.91 mitochondrial Rho. 
1 Mus musculus 22164792 mitochondrial ribosomal protein L12 Mus musculus RM12 HUMAN phosphorylase; gliostatin; platelet 7949037 enoyl coenzyme A hydratase 1, peroxisomal; 7949037 peroxisomal mitochondrial dienoyl-CoA 29789289 enoyl Coenzyme A hydratase, short chain, 1, 12707570 ECHM HUMAN mitochondrial Mus musculus 73051.25 estradiol 17 beta-dehydrogenase 8; 17-beta hydroxysteroid dehydrogenase 8: 18079334 ethanol induced 6 Mus musculus 6679078 expressed in non-metastatic cells 2, protein; expressed in non-metastatic cells 9790123 expressed in non-metastatic cells 4, protein; nucleoside 9790123 4826862 NDKM HUMAN diphosphate kinase 21618729 Facl5 protein Mus musculus 3156O705 fatty acid Coenzyme A ligase, long chain 2; acetyl LCFA HUMAN Coenzyme A synthetase; JXO2O2 6679765 ferredoxin 1: ADRENODOXIN Mus musculus 6679765 S53524. 4758352 AXEHU 6679767 ferredoxin reductase Mus musculus 6679767 S60028 47583S4 A40487 13435350 1338.578O ferritin heavy chain 3; mitochondrial ferritin Mus 1338.578O musculus 20452466 ferrochelatase Mus musculus 2O4S2466 A37972 A36403 10946808 fibroblast growth factor (acidic) intracellular binding 7262378 protein; aFGF 33469107 folylpolyglutamyl synthetase Mus musculus 2O824150 S65755 22O24385 A46281 9507187 fractured callus expressed transcript 1: Fracture Callus 9507187 ; Small Zinc 6679863 frataxin Mus musculus 6679863 4503785 338595.54 fumarate hydratase 1 Mus musculus 2O831568 19743875 UFHUM

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 20070402 G elongation factor; mitochondrial Mus musculus 12963633 genes associated with retinoid-IFN-induced mortality 19 Mus musculus 6679957 glioblastoma amplified Mus musculus 31982798 glucokinase; hexokinase 4 Mus musculus A46157 CA6157 6680027 glutamate dehydrogenase Mus musculus 668OO27 S16239 2748S958 AS3719 DEHUE 4.885281 691.2392 6754036 glutamate oxaloacetate transaminase 2, mitochondrial; 67S4O36 SO1174 4504O69 XNHUDM mitochondrial aspartate 31982332 glutamate-ammonia ligase (glutamine synthase); glutamine synthetase Mus 31982847 glutamic acid decarboxylase 1 Mus musculus 6679959 glutaryl-Coenzyme A dehydrogenase Mus musculus 6679959 GCDH MOUSE 4503943 GCDH HUMAN 76694.94 668.0075 glutathione peroxidase 1: cellular GPx Mus musculus 668OO75 1354.0480 glutathione peroxidase 4: sperm nuclei glutathione 13S4O480 4SO4107 peroxidase; phospholipid 34328489 glutathione reductase 1 Mus musculus 13775154 21313138 glutathione S-transferase class kappa Mus musculus 6754092 glutathione transferase Zeta 1 (maleylacetoacetate isomerase); 6679937 glyceraldehyde-3-phosphate dehydrogenase Mus musculus 6680139 glycerol kinase Mus musculus GKP2 HUMAN GLPK HUMAN 34536827 glycerol-3-phosphate acyltransferase, mitochondrial 668OOS7 Mus musculus 31981769 glycerol-3-phosphate dehydrogenase 2: glycerol 6753970 450408.5 GPDM HUMAN phosphate dehydrogenase 1, 13385454 glycine amidinotransferase (L-arginine:glycine 13385454 4SO3933 S41734 amidinotransferase) Mus 31560488 glycine C-acetyltransferase (2-amino-3-ketobutyrate- 7305083 coenzyme Aligase); 20070408 glycine decarboxylase Mus musculus B39521 6806917 GM2 ganglioside activator protein Mus musculus 6680107 granulin; acrogranulin; progranulin; PC cell-derived growth factor Mus 12746414 growth factor, erv1 (S. 
cerevisiae)-like (augmenter of liver regeneration); 13277394 GrpE-like 1, mitochondrial Mus musculus 13277394 29789124 GrpE-like 2, mitochondrial Mus musculus 2O878923 3766203 GTP-specific succinyl-CoA synthetase beta subunit 2O8288.15 TO8812 Mus musculus 2137368 H+-transporting two-sector ATPase (EC 3.6.3.14) chain SS8660 c - mouse (fragments) 6680309 heat shock protein 1 (chaperonin 10); heat shock 10 kDa 6680309 ASSO75 S47532 protein 1 (chaperonin CH10 MOUSE 31981679 heat shock protein 1 (chaperonin); heat shock protein, HHMS60 A32800 60 kDa: heat shock 60 kDa 6680305 heat shock protein 1, beta; heat shock protein, 84 kDa 1: heat shock 90 kDa 31560686 heat shock protein 2: heat shock protein, 70 kDa 2: heat B45871 shock 70 kDa protein 2 675.4256 heat shock protein, A.; heat shock protein cognate 74; 67S4256 A481.27 24234688 B481.27 heat shock protein, 74 2SO24532 6680277 heat-responsive protein 12 Mus musculus

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 7305137 heme binding protein 1: heme-binding protein; p22 HBP: heme-binding protein 1 6680175 hemoglobin alpha, adult chain 1: alpha 1 globin Mus musculus 122513 Hemoglobin beta-1 chain (B1) (Major) 31982300 hemoglobin, beta adult major chain; beta major globin; beta maj Mus musculus 6754206 hexokinase 1: downeast anemia Mus musculus A35244 A31869 JC2O2S 2098.2837 holocarboxylase synthetase; biotin-propriony- BPL1 HUMAN Coenzyme A-carboxylase 31542950 holocytochrome c synthetase Mus musculus 668O181 CCHL MOUSE GO2133 6754160 HS1 binding protein Mus musculus 6754160 13435356 12963539 HSCO protein Mus musculus 7949047 hydroxyacyl-Coenzyme A dehydrogenase type II; hydroxyacyl-Coenzyme A 21704100 hydroxyacyl-Coenzyme A dehydrogenase/3-ketoacyl- 4SO4327 JC2109 Coenzyme A 33.859811 hydroxyacyl-Coenzyme A dehydrogenase/3-ketoacyl- 2O1274.08 JC2108 Coenzyme A 31982273 hydroxysteroid (17-beta) dehydrogenase 4: hydroxysteroid 17-beta dehydrogenase 6680291 hydroxysteroid dehydrogenase-4, delta-3-beta; 3-beta- 2O874991 49762 DEHUHS DEHUH2 hydroxysteroid 23397415 3BH3 MOUSE 23621517 3BH4 MOUSE 668O289 3BH5 MOUSE 668O291 3BEH6 MOUSE 668O293 3BH2 MOUSE 7305167 2SO46137 27754071 hypothetical protein 4833421E05Rik Mus musculus 21311867 hypothetical protein D11 Erta 99e Mus musculus 21312020 hypothetical protein D4Ertd765e Mus musculus 22122743 hypothetical protein MGC37245 Mus musculus 21313262 inner membrane protein, mitochondrial Mus musculus 22203753 inorganic pyrophosphatase 2 Mus musculus 14916467 inositol polyphosphate-5-phosphatase E: inositol 14916467 polyphosphate-5-phosphatase, 72 27370516 isocitrate dehydrogenase 2 (NADP+), mitochondrial 668O343 IDHP MOUSE 4504575 S57499 Mus musculus 18250284 isocitrate dehydrogenase 3 (NAD+) alpha Mus 5031777 S55282 musculus 6680345 isocitrate dehydrogenase 3 (NAD+), gamma Mus 668O345 IDHG 
HUMAN musculus 18700024 isocitrate dehydrogenase 3, beta subunit; isocitrate 590.1982 IDHB HUMAN dehydrogenase 3 beta; N14A 9789985 isovaleryl coenzyme A dehydrogenase; isovaleryl 4SO4799 A37033 dehydrogenase precursor Mus 6754482 keratin complex 1, acidic, gene 18; keratin 18 Mus musculus 6754488 keratin complex 2, basic, gene 6b Mus musculus 1948.2166 kidney expressed gene 1 Mus musculus 25031694 kinesin family member 1B Mus musculus 20850523 2SO31694 19527030 kynurenine 3-monooxygenase (kynurenine 3 hydroxylase) Mus musculus 6754408 kynurenine aminotransferase II Mus musculus 6680163 L-3-hydroxyacyl-Coenzyme A dehydrogenase, short 668O163 JC4210 4.885387 JC4879 chain; hydroxylacyl-Coenzyme A

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink

Accession Description Mouse MITOP Mouse Human MITOP Human 217O3764 actamase, beta 2 Mus musculus 13SO7666 actamase, beta; serine beta lactamase-like protein; 13SO7666 mitochondrial ribosomal 3.1981147 eucine aminopeptidase 3: leucine aminopeptidase Mus musculus 9789997 eucine Zipper-EF-hand containing transmembrane protein 1; leucine 2138932O eucine-rich PPR motif-containing protein; leucine rich protein LRP130 Mus 233466.17 eucyl-tRNA synthetase Mus musculus SYLM HUMAN 13277380 ipoic acid synthetase Mus musculus 13277380 6678716 ow density lipoprotein receptor-related protein 5; low density 21539585 ow molecular mass ubiquinone-binding protein; 21539585 ubiquinol-cytochrome c reductase 31541815 L-specific multifunctional beta-oxdiation protein Mus musculus 667876O ySophospholipase 1: phospholipase 1a; ySophopholipase 1 Mus musculus 839.3739 ysozyme Mus musculus 13654245 major urinary protein 1 Mus musculus 3.19821.86 malate dehydrogenase, mitochondrial Mus musculus 6678916 DEMSMM MDHM HUMAN 21703972 malic enzyme 2, NAD(+)-dependent, mitochondrial 4SOS145 A39SO3 Mus musculus 31542169 malic enzyme 3, NADP(+)-dependent, mitochondrial S53351 Mus musculus 991 0434 malonyl-CoA decarboxylase Mus musculus 6912498 DCMC HUMAN 6754760 mature T-cell proliferation 1 Mus musculus 6754760 7305291 metaxin 1; metaxin Mus musculus 7305291 MTXN HUMAN 31543274 metaxin 2 Mus musculus 7949084 3.1981013 methionine Sulfoxide reductase A Mus musculus 3.19807O6 methylcrotonoyl-Coenzyme A carboxylase 1 (alpha) 1296.5187 Mus musculus 6678952 methylenetetrahydrofolate dehydrogenase (NAD+ 6678952 A33267 5729.935 DEHUMT dependent), 2O270275 methylenetetrahydrofolate dehydrogenase 1: C1 13699868 A31903 tetrahydrofolate synthase Mus 667897O methylmalonyl-Coenzyme A mutase Mus musculus 667897O 45.57767 S4O622 3.1981068 microsomal glutathione S-transferase 1 Mus musculus B28O83 3O794474 mitchondrial ribosomal protein S7; ribosomal protein, JCT 165 mitochondrial, S7 Mus 195274O2 mitochondrial 
acyl-CoA thioesterase 1 Mus musculus 195274O2 13386040 mitochondrial ATP synthase regulatory component 13386040 factor B Mus musculus 1SO11842 mitochondrial capsule selenoprotein; sperm 1SO11842 A371.99 MCS HUMAN mitochondria associated cysteine-rich 97.90055 mitochondrial carrier homolog 2 Mus musculus 28.076953 mitochondrial intermediate peptidase Mus musculus 5174567 27SO2349 mitochondrial matrix processing protease, alpha subunit Q10713 Mus musculus 315598.91 mitochondrial Rho. 1 Mus musculus 22164792 mitochondrial ribosomal protein L12 Mus musculus RM12 HUMAN 1671 6447 mitochondrial ribosomal protein L27 Mus musculus 1671 6447 3.1981470 mitochondrial ribosomal protein L3 Mus musculus RSHUL3 RSHUL3 13385266 mitochondrial ribosomal protein L33 Mus musculus 1671 64.49 mitochondrial ribosomal protein L34 Mus musculus 1671 64.49 31S60438 mitochondrial ribosomal protein L39; ribosomal protein, 8393O21 mitochondrial, L5 Mus

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 3385752 mitochondrial ribosomal protein L49; neighbor of fau 1 13385752 Mus musculus 3051992.1 mitochondrial ribosomal protein L50 Mus musculus 29789253 mitochondrial ribosomal protein L9 Mus musculus 2O874698 7157979 mitochondrial ribosomal protein S11 Mus musculus 17157979 6755360 mitochondrial ribosomal protein S12; ribosomal protein, 6755360 RT12 HUMAN mitochondrial, S12: 3384894 mitochondrial ribosomal protein S14 Mus musculus 3384968 mitochondrial ribosomal protein S15 Mus musculus 13384968 338484.4 mitochondrial ribosomal protein S16 Mus musculus 13384844 338485.4 mitochondrial ribosomal protein S17 Mus musculus 13384854 31543265 mitochondrial ribosomal protein S2 Mus musculus 7505220 mitochondrial ribosomal protein S21 Mus musculus 17505220 31981257 mitochondrial ribosomal protein S25 Mus musculus 1338.5024 O181116 mitochondrial ribosomal protein S31; islet mitochondrial 10181116 5031787 antigen, 38 kD Mus 7157985 mitochondrial ribosomal protein S5 Mus musculus 23956244 mitochondrial ribosomal protein S6 Mus musculus 23956244 9526984 mitochondrial translational initiation factor 2 Mus 4505277 A55628 musculus 31981857 mitochondrial translational release factor 1 Mus 4758744 RF1M HUMAN musculus 27804325 monoamine oxidase A Mus musculus 20983270 48342 A36175 278043.25 19073795 MTO1 Mus musculus 19073795 6754732 myeloperoxidase Mus musculus OPHUM 22003874 N-acetylglutamate synthase; amino-acid N- 220O3874 acetyltransferase Mus musculus 9055.168 N-acylsphingosine amidohydrolase 2; neutral/alkaline; 9845267 neutral alkaline 13195624 NADH dehydrogenase (ubiquinone) 1 alpha O95299 Subcomplex 10 Mus musculus 9506911 NADH dehydrogenase (ubiquinone) 1 alpha 9506911 O15239 subcomplex, 1 (7.5 kD, MWFE); NADH 31981600 NADH dehydrogenase (ubiquinone) 1 alpha O43678 Subcomplex, 2: NADH dehydrogenase 33563266 NADH dehydrogenase 
(ubiquinone) 1 alpha NUML MOUSE NUML HUMAN Subcomplex, 4: NADH dehydrogenase 3386100 NADH dehydrogenase (ubiquinone) 1 alpha NUFM. Human Subcomplex, 5 Mus musculus 3385492 NADH dehydrogenase (ubiquinone) 1 alpha P56.556 Subcomplex, 6 (B14); NADH dehydrogenase 2963571 NADH dehydrogenase (ubiquinone) 1 alpha AADO5427 Subcomplex, 7 (B14.5a); NADH 21312012 NADH dehydrogenase (ubiquinone) 1 alpha 7657369 NUPM HUMAN Subcomplex, 8 Mus musculus 3384720 NADH dehydrogenase (ubiquinone) 1 alpha NUEM HUMAN Subcomplex, 9 Mus musculus 31980802 NADH dehydrogenase (ubiquinone) 1 alpha 27229088 Subcomplex, assembly factor 1: NADH 3385054 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex 450S361 O43676 3 Mus musculus 3385558 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex JEO382 8 Mus musculus 3386096 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex, 2 Mus musculus 27754144 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex, O43674 5: NADH dehydrogenase

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink

Accession Description Mouse MITOP Mouse Human MITOP Human 13385322 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex, NB8M HUMAN 7 Mus musculus 29789148 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex, 9 Mus musculus 27754007 NADH dehydrogenase (ubiquinone) 1, alpha/beta TOOf 41 Subcomplex, 1 Mus musculus 13384946 NADH dehydrogenase (ubiquinone) 1, Subcomplex O43677 unknown, 1 Mus musculus 21704020 NADH dehydrogenase (ubiquinone) Fe—S protein 1 S17854 Mus musculus 23346461 NADH dehydrogenase (ubiquinone) Fe—S protein 2: JEO193 NADH-coenzyme Q reductase Mus 6754814 NADH dehydrogenase (ubiquinone) Fe—S protein 4; NUYM HUMAN NADH dehydrogenase (ubiquinone) 19527334 NADH dehydrogenase (ubiquinone) Fe—S protein 5: O4392O NADH dehydrogenase Fe—S protein 21312950 NADH dehydrogenase (ubiquinone) Fe—S protein 7 O75251 Mus musculus 21450.107 NADH dehydrogenase (ubiquinone) Fe—S protein 8 NUIM HUMAN Mus musculus 19526814 NADH dehydrogenase (ubiquinone) flavoprotein 1: A44362 NADH dehydrogenase flavoprotein 20900762 NADH dehydrogenase (ubiquinone) flavoprotein 2 Mus 209.00762 A3O113 musculus 5834954 NADH dehydrogenase subunit 1 Mus musculus 5834954 QXMS1M DNHUN1 5834955 NADH dehydrogenase subunit 2 Mus musculus 5834955 QXMS2M DNHUN2 5834961 NADH dehydrogenase subunit 3 Mus musculus 5834961 QXMS3M DNHUN3 5834963 NADH dehydrogenase subunit 4 Mus musculus 5834963 QXMS4M DNHUN4 5834962 NADH dehydrogenase subunit 4L Mus musculus 5834962 QXMS4L DNHUNL 7770.109 NADH dehydrogenase subunit 5 Mus musculus DNHUNS domesticus 5834964 NADH dehydrogenase subunit 5 Mus musculus 5834964 QXMSSM 5834965 NADH dehydrogenase subunit 6 Mus musculus 5834965 DEMSN6 27754188 DEHUN6 21314826 NADH:ubiquinone oxidoreductase B15 subunit Mus O95168 musculus 21539587 NADH-ubiquinone oxidoreductase B9 subunit; Complex 21539587 O95167 I-B9; C1-B9 Mus musculus 13507612 NADPH-dependent retinol dehydrogenase/reductase Mus musculus 6754870 neighbor of Cox4 Mus musculus 5174615 200022 neurofilament protein 
9506933 neuronal protein 15.6 Mus musculus 31543330 nicotinamide nucleotide transhydrogenase Mus 6679088 SS4876 GO2257 musculus 13385084 NIPSNAP-related protein Mus musculus 12963555 Nit protein 2 Mus musculus 21313484 nitrogen fixation cluster-like Mus musculus 6754846 nitrogen fixation gene, yeast homolog 1; nifS-like (sic) 25058437 26OO6849 Mus musculus 6754846 6679.146 nth (endonuclease III)-like 1; thymine glycol DNA 6679146 glycosylase/AP lyase Mus 31543343 nuclear respiratory factor 1 Mus musculus A548.68 27753998 nudix (nucleoside diphosphate linked moiety X)-type 27753998 motif 9 Mus musculus 19526960 optic atrophy 1 homolog Mus musculus 19526960 TOO336 8393866 ornithine aminotransferase Mus musculus 8393866 XNMSO 4SS7809 XNHUO 66791.84 ornithine transcarbamylase; sparse fur Mus musculus 66791.84 OWMS 92.57234. OWHU

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 33563270 oxoglutarate dehydrogenase (lipoamide); alpha 2O853413 48.884 A38234 ketoglutarate dehydrogenase Mus 25O25547 ODO1 MOUSE 1152852O p53 apoptosis effector related to Pmp22; p53 apoptosis 11528S2O associated target Mus 19527310 peptidylprolyl isomerase F (cyclophilin F); peptidyl-prolyl 19527310 A41581 cis-trans isomerase: 6680690 peroxiredoxin 3; anti-oxidant protein 1; mitochondrial 6680690 JQ0064 TDXM HUMAN Trx dependent peroxide 7948999 peroxiredoxin 4; antioxidant enzyme AOE372; Prx IV Mus musculus 6755114 peroxiredoxin 5 precursor; peroxiredoxin 6: peroxisomal 6755114 6912.238 membrane protein 20; 18875408 peroxisomal acyl-CoA thioesterase 1 Mus musculus 3.198O804 peroxisomal trans 2-enoyl CoA reductase; perosisomal 2-enoyl-CoA reductase Mus 214SO279 PET112-like Mus musculus 4758894 GATB. HUMAN 109468.32 phorbol-12-myristate-13-acetate-induced protein 1: 109468.32 Noxa protein Mus musculus 33667036 phosphatidylethanolamine N-methyltransferase Mus 7110685 PEMT HUMAN musculus 6755090 phospholipase A2, group IB, pancreas Mus musculus PSHU 72421.75 phospholipase A2, group IIA (platelets, synovial fluid); I48342 modifier of Min1; 6679369 phospholipase A2, group IVA (cytosolic, calcium A39329 dependent); phospholipase A2, 7657467 polymerase (DNA directed), gamma 2, accessory 7657467 Subunit; mitochondrial polymerase 856,7392 polymerase (DNA directed), gamma; polymerase, 856,7392 DPOG MOUSE 4505937 GO2750 gamma; Pol gamma; polymerase 1478O884 polymerase delta interacting protein 38 Mus musculus 6755004 programmed cell death 8; programmed cell death 8 6755004 4757732 (apoptosis inducing factor); 222O2629 222O2631 6679299 prohibitin Mus musculus 6755178 proline dehydrogenase Mus musculus 2SOS3948 6755178 13.385310 propionyl Coenzyme A carboxylase, beta polypeptide 4S57044 AS3020 Mus musculus 2145O241 propionyl-Coenzyme 
A carboxylase, alpha polypeptide; 4SS7833 A27883 propionyl CoA-carboxylase 34328.185 prosaposin Mus musculus 3.1980991 protease, serine, 25; serine protease OMIMus 979O135 musculus 6679437 protective protein for beta-galactosidase Mus musculus 6679445 protoporphyrinogen oxidase Mus musculus 6679445 S68367 4506001 PPOX HUMAN 21553.115 putative mitochondrial solute carrier Mus musculus 21553115 3154328O putative prostate cancer tumor Suppressor; cDNA sequence BC003311 Mus musculus 214SO149 pyrroline-5-carboxylate reductase 1: hypothetical A41770 protein MGC11688 Mus pyrroline-5-carboxylate synthetase; glutamate gamma 979 OO61 semialdehyde synthetase Mus 24O25659 6679237 pyruvate carboxylase; pyruvate decarboxylase Mus 6679237 A472SS 11761615 JC2460 musculus 4505627 18152793 pyruvate dehydrogenase (lipoamide) beta Mus 450S687 DEHUPB musculus US 2007/0203083 A1 Aug. 30, 2007 56

TABLE 4-continued Annotation and experimental support for the mito-A proteins. The mito-A list of protein clusters consists of proteins that are physically associated with mitochondria, based on previous annotations or on organelle proteomics. The list is produced by pooling all the individual proteins identified in the organelle proteomics survey with proteins previously annotated as being mitochondrial. These proteins were then clustered into 601 groups using a BLAST procedure (see Methods). Each cluster may be supported by previous annotations, by organelle proteomics, or by both (protein accessions are indicated in the appropriate columns). Of the 601 clusters, 10 correspond to expected contaminants and have been flagged. The remaining 591 constitute the mito-A list that is used in the analysis. For each mito-A cluster, an exemplar protein (typically corresponding to a Reference Sequence) accession and description are provided. GenPept or Swiss-Prot accession numbers of the cluster members are provided in the appropriate columns. Of the 591 mito-A clusters, 37 appeared to be obviously mitochondrial based on the description, so these have been flagged as mitochondrial in a dedicated column called "by name."
Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 282O1978 pyruvate dehy rogenase complex, component X; 4SOS 699 dihydrolipoamide 66792.61 pyruvate dehy rogenase E1 alpha 1: pyruvate 6679263 S23S07 S23S06 450S68S DEHUPADEHUPT dehydrogenase E1alpha subunit Mus 66792.61 19526816 pyruvate dehy rogenase kinase, isoenzyme 2: pyruvate 19526816 I7O159 dehydrogenase 2 Mus 21704122 pyruvate dehy rogenase kinase, isoenzyme 3 Mus 488SS45 IFO160 musculus 7305375 pyruvate dehy rogenase kinase, isoenzyme 4: pyruvate 7305.375 4505693 Q16654 dehydrogenase kinase 4Mus 3.1981562 pyruvate kinase 3 Mus musculus 31543608 reticulon 4 interacting protein 1: NOGO-interacting 187OOO36 mitochondrial protein; 22267464 retinoic acid inducible protein 3 Mus musculus 6755334 ribonuclease H1 Mus musculus 1258.4986 ribosomal protein L23 Mus musculus RL23 HUMAN 13384904 ribosomal protein, mitochondrial, S22 Mus musculus 13384904 21311883 EN cDNA 06 O07 Mus musculus 21311967 EN cDNA 06 C08 Mus musculus 2153622O EN cDNA 06 F14 Mus musculus S22348 21313679 EN cDNA 06 D10 Mus musculus 21312004 EN cDNA 06 16 Mus musculus S32482 33.85656 EN cDNA 06 D20 Mus musculus 21311853 EN cDNA 06 H03 Mus musculus 21313618 EN cDNA 06 L09 Mus musculus 33.85662 EN cDNA 06 E07 Mus musculus EN cDNA O7 P09 Mus musculus EN cDNA 11 11 Mus musculus 3384742 EN cDNA 11 B13 Mus musculus 3384766 EN cDNA 11 D01 Mus musculus 2963697 EN cDNA 11 H10 Mus musculus 3385298 EN cDNA 1300002 A08 Mus musculus 13385298 21311845 EN cDNA 13OOOO6 L01 Mus musculus 21311845 338S9744 EN cDNA 1 SOOO32 D16 Mus musculus NUOM HUMAN 8859597 EN cDNA 18 06 Mus musculus O95298 2O876O12 EN cDNA 18 002OMO2 Mus musculus I38079 2089.7872 EN cDNA 18 OOS8 14 Mus musculus 21624.609 EN cDNA 20 OO12 D11 Mus musculus 33854.36 EN cDNA 20 2 Mus musculus 21312554 EN cDNA 20 O107 E04 Mus musculus 21312554 P56379 68MP HUMAN 338SO42 EN cDNA 20 O309 E21 Mus musculus 273.70092 
EN cDNA 2300002G02 Mus musculus PDO441 21359837 IS3499 S62767 3.1980955 EN cDNA 23 OOOS D12 Mus musculus 338S9690 EN cDNA 23 OOOSO 4Mus musculus 21312348 EN cDNA 23 OO20 P08 Mus musculus 21312348 13384950 EN cDNA 23 OO39 H17 Mus musculus 21313468 EN cDNA 23 OOSO B20 Mus musculus 21313468 21361280 I84606 13385998 EN cDNA 24 OOO2 K23 Mus musculus 13385998 3156O255 EN cDNA 24 OOOSO 6 Mus musculus 272.28985 EN cDNA 24 O011G03 Mus musculus 3O794396 EN cDNA 24 P16 Mus musculus 21312594 EN cDNA 26 H19; EST AA108335 Mus usculus 13195670 EN cDNA 26 0207 16 Mus musculus 21313O8O EN cDNA 27OOO85 E05 Mus musculus 22267456 EN cDNA 28 B21 Mus musculus 572982O 21312204 EN cDNA 28 D12 Mus musculus 1952684.8 EN cDNA 28 0484M10 Mus musculus US 2007/0203083 A1 Aug. 30, 2007 57

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 31541932 RIKEN cDNA 2900026G05 Mus musculus 1792.1985 17921987 21312153 RIKEN cDNA 290007OE19 Mus musculus 13386046 RIKEN cDNA 3010027G13 Mus musculus 13386046 27229021 RIKEN cDNA 3110001 M13 Mus musculus 4506865 DHSD HUMAN 20822904 RIKEN cDNA 3110004O18 Mus musculus 2O822904 4758734 O75439 25031957 30424808 RIKEN cDNA 3110021 G18 (Mus musculus 15O11910 A40141 25072051 RIKEN cDNA 3110065L21 Mus musculus 21312006 RIKEN cDNA 3632410G24 Mus musculus 2131.2006 4759286 UCP4 HUMAN 21311988 RIKEN cDNA 4121402D02 Mus musculus 13385168 RIKEN cDNA 4430402G14 Mus musculus UCRI HUMAN 31981207 RIKEN cDNA 4432405K22 Mus musculus 19527276 RIKEN cDNA 4921526O06 Mus musculus 21312894 RIKEN cDNA 4930483N21 Mus musculus 30424611 RIKEN cDNA 4932416F07 Mus musculus 13386066 RIKEN cDNA 5730591C18 Mus musculus 13386066 4758424 GCHUH 27370158 RIKEN cDNA 6430520CO2 Mus musculus 5454070 Q92581 28077029 RIKEN cDNA 913.0022B02 Mus musculus 4758886 S69S46 13386062 RIKEN cDNA 943 0083G14 Mus musculus 27369922 RIKEN cDNA 963002OE24 Mus musculus 27370474 RIKEN cDNA 9630038CO2 Mus musculus GABT HUMAN 22122359 RIKEN cDNA A330009E03 Mus musculus 5031709 21450203 RIKEN cDNA A330035HO4; long-chain acyl-CoA Synthetase Mus musculus 21704204 RIKEN cDNAA930O31 O08 Mus musculus 47S9068 34328415 RIKEN cDNAA93.0035F14 gene Mus musculus PUT2 HUMAN 21311919 RIKEN cDNA B430104H02 Mus musculus 27369966 RIKEN cDNA D530020C15 Mus musculus 450S689 ISS.465 27369748 RIKEN cDNA D630032B01 Mus musculus 19527384 RIKEN cDNA D93001OJO1 Mus musculus 28893421 RIKEN cDNA E430012MO5 gene Mus musculus 22267442 RIKubiquinol cytochrome c reductase core protein 2 22267442 A32629 Mus musculus 31982720 SA rat hypertension-associated homolog Mus musculus 20149748 sarcosine dehydrogenase Mus musculus 1503.0102 Sdha protein Mus musculus 47.5908O 984837 secretory group II phospholipase A2 
PSHUYF 6677943 serine hydroxymethyl transferase 1 (soluble) Mus musculus 21312298 serine hydroxymethyl transferase 2 (mitochondrial) 1992,3315 B46746 Mus musculus 15147224 sideroflexin 1; flexed tail Mus musculus 15147224 1671 6499 31981486 sideroflexin 2 Mus musculus 1671 6497 1671 6501 sideroflexin 4 Mus musculus 1671 6501 208.95140 similar to aminomethyltransferase Mus musculus 4502083 IS4.192 2SOS2664 similar to Cytochrome c oxidase assembly protein 4758034 COXZ HUMAN COX11, mitochondrial precursor 28478945 similar to Glutaminase, kidney isoform, mitochondrial 2O336214 precursor (GLS) 28526374 similar to NADH2 dehydrogenase (ubiquinone) (EC NUMM MOUSE O7538O 1.6.5.3) complex I 13K-A chain 2O825073 similar to NADH-ubiquinone oxidoreductase B17 O95139 subunit (Complex I-B17) (Cl-B17) 2091.63S1 single-stranded DNA binding protein 1 Mus musculus 4SO7231 JNOS68 27229283 Small fragment nuclease Mus musculus T14770 US 2007/0203083 A1 Aug. 30, 2007 58

Previous Mitochondral Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 13540709 Sodium channel, voltage-gated, type 1, alpha polypeptide; sodium channel, 6678001 solute carrier family 1, member 1 Mus musculus EAT3 MOUSE 7106409 solute carrier family 1, member 2: glial high affinity EAT2 HUMAN glutamate transporter 24233554 solute carrier family 1, member 3; glial high affinity JC2O84 glutamate transporter 9790129 solute carrier family 22 member 4: solute carrier family (organic cation 2854.4699 solute carrier family 25 (mitochondrial carrier), member 2O3422O2 18 Mus musculus 2O831383 2SO22813 6755544 solute carrier family 25 (mitochondrial carrier, brain), 675.5544 4507009 O95258 member 14: solute 13385736 13259543 7657583 solute carrier family 25 (mitochondrial carrier; adenine 7657583 21361.103 Y14494 nucleotide 7657581 7305501 solute carrier family 25 (mitochondrial carrier; 73055O1 dicarboxylate transporter), 6754952 solute carrier family 25 (mitochondrial carrier; ornithine 6754952 transporter), member 21312994 solute carrier family 25 (mitochondrial carrier; 21312994 AS.66SO oxoglutarate carrier), member 29789024 solute carrier family 25 (mitochondrial carrier; 20902883 peroxisomal membrane protein), 19526818 solute carrier family 25 (mitochondrial carrier; 19526818 6031192 A53737 B53737 phosphate carrier), member 3: 4505,775 21313024 solute carrier family 25 (mitochondrial deoxynucleotide 21313024 carrier), member 19 Mus 23943838 solute carrier family 25, member 1; DiGeorge syndrome 2O3461.64 TXTP HUMAN gene j; Solute carrier 2O891945 23943838 25O25453 22094.075 solute carrier family 25, member 5; adenine nucleotide 2O863388 S31814 S37210 4502097 A291.32 A44778 translocator 2, 22094O75 SO3894 6755548 solute carrier family 27 (fatty acid transporter), member 2; very long-chain 31981977 spastic paraplegia 7 homolog: paraplegin; spastic 4507173 paraplegia 7 Mus musculus 13507712 
sphingosine-1-phosphate phosphatase 1: Sphingosine- 13507712 1-phosphate phosphatase Mus 10946984 START domain containing 3; esé4 protein; S60682 steroidogenic acute regulatory protein 31543776 steroidogenic acute regulatory protein Mus musculus 1992O319 ASS4S5 45072S1 I38896 28545662 sterol carrier protein 2, liver Mus musculus 2O841062 JUO157 A4001S B404O7 12963591 stomatin-like protein 2 Mus musculus 13384690 Succinate dehydrogenase complex, Subunit C, integral 13384690 4506863 membrane protein Mus 20908717 Succinate dehydrogenase Fp subunit Mus musculus JXO336 34328286 Succinate dehydrogenase Ip Subunit Mus musculus PTOO94 92S7242 A34045 9845299 Succinate-CoA ligase, GDP-forming, alpha subunit: 9845299 11321581 P53597 Succinyl-CoA synthetase Mus 31981549 Sulfide quinone reductase-like; flavo-binding protein; sulfide 30424565 sulfite oxidase Mus musculus S55874 31980762 Superoxide dismutase 2, mitochondrial; manganese 7305511. I57023 10835.187 DSHUN SOD; manganese Superoxide 31088872 Suppressor of varl, 3-like 1 Mus musculus 4507315 US 2007/0203083 A1 Aug. 30, 2007 59

Previous Mitochondrial Annotations Exemplar Protein for the Cluster LocusLink LocusLink Accession Description Mouse MITOP Mouse Human MITOP Human 7363455 Surfeit gene 1 Mus musculus 7363455 B2S394 S57749 6678,179 syntaxin binding protein 1; unc18 homolog (C. elegans): UNC-18 homolog (C. 
15809030 Synuclein, beta Mus musculus 31442416 tafaZZin Mus musculus TFZ HUMAN 13384998 tetratricopeptide repeat domain 11 Mus musculus 13385260 thioesterase Superfamily member 2 Mus musculus 6755911 thioredoxin 1; thioredoxin Mus musculus 9903609 thioredoxin 2; thioredoxin nuclear gene encoding 990.3609 THI2 HUMAN mitochondrial protein; 7305603 thioredoxin reductase 2; human EST 573010; EST 7305603 22O35672 AA118373; TR beta Mus musculus 22O35670 22O356.68 6678449 thiosulfate Sulfurtransferase, mitochondrial Mus 6678449 THTR MOUSE musculus 6678357 thymidine kinase 1 Mus musculus KIHUT 10835111 thymidine kinase 2, mitochondrial; thymidine kinase 2 10835111 10281330 Mus musculus 6678417 thyroid peroxidase Mus musculus OPHUIT 6678303 transcription factor A, mitochondrial Mus musculus 6678303 JC1496 26.006865 transcription termination factor, mitochondrial-like Mus S902O10 musculus 7305573 translocase of inner mitochondrial membrane 10 homolog Mus musculus 7305575 translocase of inner mitochondrial membrane 13 homolog a Mus musculus 12025536 translocase of inner mitochondrial membrane 23 12O25536 homolog Mus musculus 7305577 translocase of inner mitochondrial membrane 8 7305577 U66O3S homolog a Mus musculus 7305579 translocase of inner mitochondrial membrane 8 homolog b Mus musculus 7305581 translocase of inner mitochondrial membrane 9 homolog Mus musculus 13324.686 translocase of outer mitochondrial membrane 20 S66619 homolog Mus musculus 8394480 translocase of outer mitochondrial membrane 40 8394480 homolog; mitochondrial outer 19705563 translocator of inner mitochondrial membrane 44 Mus 19705563 IM44 HUMAN musculus 25O24735 25070554 33468943 translocator of inner mitochondrial membranea; 2SO3O423 IM17 HUMAN translocator of inner 2091O363 20270297 trimethyllysine hydroxylase, epsilon; epsilon- 2O27O297 trimethyllysine 2-oxoglutarate 33859692 tRNA nucleotidyl transferase, CCA-adding, 1; tRNA 2O8292S4 adenylyltransferase, 16716569 trypsinogen 16 Mus musculus 31543952 
tryptophanyl tRNA synthetase 2 (mitochondrial) Mus 2136.2271 7710154 musculus 6678469 tubulin, alpha 6; tubulin alpha 6 Mus musculus 12963615 tubulin, beta 3 Mus musculus 31981925 tyrosine 3-monooxygenase/tryptophan 5- 143E HUMAN monooxygenase activation protein, epsilon 6756041 tyrosine 3-monooxygenase/tryptophan 5- JCS384 PSHUAM monooxygenase activation protein, zeta 22122769 tyrosine aminotransferase Mus musculus S10887

Previous Mitochondrial Annotations

Exemplar Protein for the Cluster LocusLink LocusLink

Accession Description Mouse MITOP Mouse Human MITOP Human 21539599 ubiquinol cytochrome c reductase hinge protein; 21539599 SOO2.19 mitochondrial hinge protein; 13385726 ubiquinol-cytochrome c reductase binding protein Mus 13385726 A32450 musculus 13384794 ubiquinol-cytochrome c reductase core protein 1 Mus 13384794 A48043 musculus 2SO3O421 13385112 ubiquinol-cytochrome c reductase subunit Mus musculus 21070950 ubiquitin C; polyubiquitin C Mus musculus 6678497 uncoupling protein 1, mitochondrial; uncoupling protein, 6678497 A31106 1122S256 A60793 mitochondrial Mus 31543920 uncoupling protein 2, mitochondrial Mus musculus 6755933 UCP2 HUMAN 6678495 uncoupling protein 3, mitochondrial Mus musculus 66784.95 2836291 unnamed protein product Mus musculus 21396489 S42366 2832533 unnamed protein product Mus musculus O75489 2832556 unnamed protein product Mus musculus O96OOO 26343407 unnamed protein product Mus musculus 14790 138 26346947 unnamed protein product Mus musculus S63453 2834221 unnamed protein product Mus musculus 2834781 unnamed protein product Mus musculus 2835668 unnamed protein product Mus musculus 2835711 unnamed protein product Mus musculus 283.6533 unnamed protein product Mus musculus 2836798 unnamed protein product Mus musculus 284.1269 unnamed protein product Mus musculus 2842244 unnamed protein product Mus musculus 2845262 unnamed protein product Mus musculus 2846164 unnamed protein product Mus musculus 2855263 unnamed protein product Mus musculus 2855887 unnamed protein product Mus musculus 2860092 unnamed protein product Mus musculus 2861374 unnamed protein product Mus musculus 26363071 unnamed protein product Mus musculus 3128954 upregulated during skeletal muscle growth 5 Mus musculus 6755941 uracil-DNA glycosylase Mus musculus 6755941 UNG MOUSE A60472 6678509 urate oxidase; uricase Mus musculus 6678519 uroporphyrinogen III synthase; URO-synthase: A40483 uroporphyrinogen-III synthase; 34328204 valyl-tRNA synthetase 2 Mus musculus 31559883 
very-long-chain acyl-CoA dehydrogenase VLCAD homolog Mus musculus 6755963 voltage-dependent anion channel 1 Mus musculus 6755963 4507879 MMHUP3 6755965 voltage-dependent anion channel 2 Mus musculus 6755965 B44422 6755967 voltage-dependent anion channel 3 Mus musculus S59547 31980962 WW-domain oxidoreductase Mus musculus 9625O12
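
The bookkeeping described in the Table 4 caption (pool the proteomics hits with the previously annotated proteins, group them into clusters, flag the contaminants, and keep the rest) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the accession labels are invented, the pairwise hits stand in for BLAST output, and single-linkage grouping is only one plausible reading of "a BLAST procedure (see Methods)". The source does not state the actual clustering rule.

```python
from collections import defaultdict


def cluster_by_hits(accessions, hit_pairs):
    """Single-linkage grouping: accessions joined by any hit share a cluster."""
    parent = {acc: acc for acc in accessions}

    def find(acc):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[acc] != acc:
            parent[acc] = parent[parent[acc]]
            acc = parent[acc]
        return acc

    for a, b in hit_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for acc in accessions:
        groups[find(acc)].add(acc)
    return list(groups.values())


def build_mito_a(previous, proteomics, hit_pairs, contaminants):
    """Pool both sources, cluster, record support, and drop flagged clusters."""
    pooled = sorted(set(previous) | set(proteomics))
    all_clusters = []
    for members in cluster_by_hits(pooled, hit_pairs):
        all_clusters.append({
            "members": members,
            "previous": bool(members & set(previous)),
            "proteomics": bool(members & set(proteomics)),
            "contaminant": bool(members & set(contaminants)),
        })
    mito_a = [c for c in all_clusters if not c["contaminant"]]
    return all_clusters, mito_a


# Hypothetical toy input; none of these labels are real accessions.
previous = {"prev1", "prev2", "both1"}
proteomics = {"both1", "ms1", "ms2", "contam1"}
hit_pairs = [("prev1", "ms1"), ("prev2", "both1")]  # stand-in for BLAST hits
contaminants = {"contam1"}

all_clusters, mito_a = build_mito_a(previous, proteomics, hit_pairs, contaminants)
print(len(all_clusters), len(mito_a))  # prints "4 3"
```

With the counts given in the caption, the same subtraction takes 601 clusters, removes the 10 flagged as expected contaminants, and leaves the 591 clusters of the mito-A list.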

[0350]

TABLE 5 Tiers of evidence supporting the 163 newly identified mito-A proteins. The protein accession and description of each of the newly identified mito-A proteins are shown along with each of the GenPept accessions of the proteins identified in the tissue proteomics experiments. For each mito-A protein cluster, the top scoring human homologue from the study, the PSORT targeting prediction, the mitochondrial neighborhood index, and the results of epitope tagging experiments, when available, are shown. For the BLASTP analyses, only the top scoring match from the study by MitoKor is provided, using a threshold of E < 1 x 10. The PSORT targeting prediction and probability were obtained for the exemplar protein sequence. The neighborhood indices (N50, N100, and N250) are provided, when available. Due to probe-set duplicity, some proteins have more than one corresponding probe-set, and others have no probe-set. An N50 ≥ 6, N100 ≥ 10, and N250 ≥ 19 each correspond to a nominal P = 0.001, assuming that mito-A genes are randomly distributed in expression space. In the final column, the subcellular localization based on immunofluorescence microscopy is indicated for the five proteins shown in FIG. 2

Exemplar Protein for the Cluster Proteomics BLASTP against MitoKor Accession Description Liver Brain Heart Kidney Match Score Expect 21313679 RIKEN cDNA 0610009D10 Mus musculus 12832313 12832313 12832313 12832313 S4S3559 283 1.00E-78 220904 220904 21312594 RIKEN cDNA 2610205H19; EST AA108335 1284.8292 1284.8292 1284.8292 73O248 76616O2 249 2.00E-68 Mus musculus 73O248 73O248 13128954 upregulated during skeletal muscle growth 5 Mus 12842476 13128954 12842476 12842476 
142493.76 1 OS 2.OOE-2S musculus 6851054 12842476 6671622 B-cell receptor-associated protein 37; repressor of 6OOS854 600S854 600S854 600S854 600S854 568 e-164 estrogen receptor activity 6671622 272.28985 RIKEN cDNA 2410011G03 Mus musculus 10092657 13384978 13384978 13384978 10092657 297 6.OOE-83 13384978 133847.66 RIKEN cDNA 1110021 D01 Mus musculus 1338.4766 133847.66 12842709 13384,766 NO 12842709 MATCH 19354491 111002OP15Rik protein Mus musculus 1367O1 9297078 136701 136701 9297078 116 S.OOE-29 136701 3891857 6094,658 9789997 leucine zipper-EF-hand containing transmembrane 9789997 9789997 9789997 9789997 6912482 1209 O protein 1; leucine 13385260 thioesterase Superfamily member 2 Mus musculus 338526O 1338526O 1338526O 1338526O 4210351 209 2.OOE-56 19527228 DNA segment, Chr 10, ERATO Doi 214, expressed 892393O 89.2393O 892393O 89.2393O 8923930 206 1.OOE-SS Mus musculus 12842244 unnamed protein product Mus musculus 2842244 12842244 12842244 12842244 17455445 210 1.OOE-56 12963633 genes associated with retinoid-IFN-induced 2963633 12963633 12963633 12963633 12005918 26O 1.OOE-71 mortality 19 Mus musculus 2833.386 12833406 1283.3386 12833.386 2833406 12833406 77.05734 2833406 6679066 4-nitrophenylphosphatase domain and non- 6679066 4SOS399 6679066 128O313S 4503937 429 e-122 neuronal SNAP25-like protein homolog 1 28SO319 6679066 4SOS399 128SO319 6679066 28SO319 7949047 hydroxyacyl-Coenzyme A dehydrogenase type II; 7949047 7949047 7949047 7949047 147642O2 421 e-120 hydroxyacyl-Coenzyme A 28SO643 28SO643 13182962 3182962 3182962 3183O2S 23956104 adenylate kinase 3 alpha-like; adenylate kinase 3 2837588 2837588 12836.369 1273.5226 428 e-122 alpha like Mus musculus 69784.79 2837588 6707707 67.07707 20149748 sarcosine dehydrogenase Mus musculus 3097441 30974.41 13097441 1377S 158 185 3.OOE-48 3283373 3283373 4928113 31980804 peroxisomal trans 2-enoyl CoA reductase: 2963715 296371S 1284S570 4503301 143 S.OOE-36 perosisomal 2-enoyl-CoA reductase Mus 3506791 2963715 3506791 
21624609 RIKEN cDNA 2010012D11 Mus musculus 2833236 28S7234 12833236 NO 28S7234 47S7862 MATCH 21389320 leucine-rich PPR motif-containing protein; leucine 28S 1540 173OO78 12851540 173OO78 1938 O rich protein LRP130 Mus 28S 1540 21313618 RIKEN cDNA 0610041L09 Mus musculus 12839842 1283.2121 12832.121. 8923390 411 e-117 8923390 30424611 RIKEN cDNA 4932416F07 Mus musculus 75.13021 75.13021 75.13021 NO MATCH 27369748 RIKEN cDNA D630032B01 Mus musculus 1711535 1711535 1711.53S 1363O862 608 e-176 34328379 D-lactate dehydrogenase Mus musculus 12852638 12852638 12852638 NO MATCH 19526848 RIKEN cDNA 2810484M10 Mus musculus 374,7107 374,7107 374,7107 NO MATCH 1948.2166 kidney expressed gene 1 Mus musculus 12832283 12832283 12832283 NO MATCH 6754092 glutathione transferase Zeta 1 (maleylacetoacetate 67S4O92 67S4O92 6754092 NO isomerase); MATCH US 2007/0203083 A1 Aug. 30, 2007 62

Exemplar Protein for the Cluster Proteomics BLASTP against MitoKor Accession Description Liver Brain Heart Kidney Match Score Expect 21312153 RIKEN cDNA 290007OE19 Mus musculus 12851249 12851249 12851249 1273S430 101 6.00E-24 13384742 RIKEN cDNA 1110018B13 Mus musculus 13384742 13384742 13384742 15150811 175 2.00E-46 12835711 unnamed protein product Mus musculus 12835711 12835711 12835711 14211923 290 1.00E-80 13507612 NADPH-dependent retinol 13097510 13507612 11559414 12804319 51 1.00E-08 dehydrogenase/reductase Mus musculus 12832859 34328.185 prosaposin Mus musculus 72421.91 6981424 91281 NO 91281 881390 881390 MATCH 557,967 6981424 881390 94388OS 1360694 11386.147 13540709 Sodium channel, voltage-gated, type 1, alpha 13540709 13540709 NO polypeptide; sodium channel, MATCH 21070950 ubiquitin C; polyubiquitin C Mus musculus 9790277 9790277 11024714 449 e-128 1OSO930 136670 31980703 aminoadipate-semialdehyde synthase; lysine 13529344 1302764O NO oxoglutarate reductase, Saccharopine 8393730 13529344 MATCH 49.383O4 8393730 6753272 catalase; catalase 1 Mus musculus 6753272 115704 NO 6753272 MATCH 115698 229299 31541815 L-specific multifunctional beta-oxidation protein 12836375 1706569 14730775 293 9.00E-81 Mus musculus 11434714 12836375 7656855 acyl-Coenzyme A oxidase 1, palmitoyl; acyl- 6429156 6429156 13653O49 55 3.00E-09 Coenzyme A oxidase; Acyl-CoA oxidase 7656855 7656855 9790129 solute carrier family 22 member 4; solute carrier 9790129 9790129 NO family (organic cation MATCH 6680756 ATPase, H+ transporting, V1 subunit E isoform 1; 6680756 6680756 NO ATPase, H+ transporting 313014 MATCH 201006 Cu/Zn-superoxide dismutase 201006 134614 12374O6 266 2.00E-73 13S1080 226471 7433299 9055178 brain protein 44-like; apoptosis-regulating basic 12852262 
12852262 14755.192 216 1.OOE-58 protein Mus musculus 77O6369 12852283 9055178 7305125 estradiol 17 beta-dehydrogenase 8; 17-beta- 73051.25 1103,844 14041699 418 e-119 hydroxysteroid dehydrogenase 8: 1103844 12963539 HSCO protein Mus musculus 12832819 2963539 4885389 70 3.00E-14 2832819 21312020 hypothetical protein D4Ertd765e Mus musculus 128366.67 2836,667 45O2327 300 2.OOE-83 28.47441 12963697 RIKEN cDNA 1110025H10 Mus musculus 12963697 2963697 NO 2834.868 MATCH 6681137 diazepam binding inhibitor; acyl-CoA binding 13937.379 3937.379 12052810 76 1.OOE-16 protein; diazepam-binding inhibitor 6681137 13507620 ankycorbin; NORPEG-like protein Mus musculus 1350762O 1350762O 14771689 100 2.OOE-22 16905127 butyryl Coenzyme A synthetase 1; acetyl- 5O19275 S4873OO 6996429 137 6.OOE-34 Coenzyme A synthetase 3 Mus musculus 22122743 hypothetical protein MGC37245 Mus musculus 31271.93 31271.93 6996429 123 7.OOE-30 22203753 inorganic pyrophosphatase 2 Mus musculus 12834464 2834464. 11526789 525 e-151 1338.5656 RIKEN cDNA 061001OD20 Mus musculus 1338.5656 33.85656 NO 1284.6589 MATCH 33859690 RIKEN cDNA 2310005O14 Mus musculus 3252827 3252827 3252827 578 e-167 21311919 RIKEN cDNA B430104H02 Mus musculus 7705608 283.6847 NO MATCH US 2007/0203083 A1 Aug. 30, 2007 63

Exemplar Protein for the Cluster Proteomics BLASTP against MitoKor Accession Description Liver Brain Heart Kidney Match Score Expect 21703764 lactamase, beta 2 Mus musculus 1327849S 132784.95 NO MATCH 133.85662 RIKEN cDNA 0610042E07 Mus musculus 13376007 13376007 NO MATCH 10946936 adenylate kinase 1; cytosolic adenylate kinase Mus 729865 125152 4502O11 347 6.00E-98 musculus 6680277 heat-responsive protein 12 Mus musculus 6680277 6680277 5032215 226 3.00E-61 21312028 RIKEN cDNA 1110006I11 Mus musculus 12834206 12834206 NO MATCH 13385436 RIKEN cDNA 2010100O12 Mus musculus 13385436 13385436 NO MATCH 12836533 unnamed protein product Mus musculus 12836533 12836533 NO MATCH 6677943 serine hydroxymethyl transferase 1 (soluble) Mus 2321.78 2321.78 NO musculus MATCH 12834221 unnamed protein product Mus musculus 12834221 12834221 14211939 283 1.00E-78 6681097 cytochrome P450, family 17, subfamily a, 2148066 2506241 NO polypeptide 1; cytochrome P450, 17; MATCH 6753676 dihydropyrimidinase-like 2; collapsin response 13S1260 13645618 825 0 mediator protein 2 Mus musculus 3122O18 79937 glyceraldehyde-3-phosphate dehydrogenase Mus 667993.7 76694.92 637 0 musculus 229279 65987 98.38358 13435924 aldolase 3, C isoform Mus musculus 11231095 312137 716 0 12836758 31982332 glutamate-ammonia ligase (glutamine synthase); 2144562 NO glutamine synthetase Mus 4504O27 MATCH 6680O23 2144563 6681079 cathepsin B preproprotein Mus musculus 227293 NO 6681079 MATCH 12832453 39298.17 13654245 major urinary protein 1 Mus musculus 13276755 NO 127531 MATCH 27369922 RIKEN cDNA 963002OE24 Mus musculus 12052944 75.13022 108 4.00E-25 6680305 heat shock protein 1, beta; heat shock protein, 84 kDa 1170383 72222 1415 0 1; heat shock 90 kDa 3642691 31982847 glutamic acid decarboxylase 1 Mus musculus 416884 NO 1082397 MATCH 1352214 31981147 leucine aminopeptidase 3; leucine aminopeptidase 12845995 NO Mus musculus 770.5688 MATCH 12833O83 6753556 cathepsin D Mus musculus 6753556 4503143 697 0 11572O 8886526 
31560731 | ATPase, H+ transporting, V1 subunit A, isoform 1; ATPase, H+ transporting, | 108733, 6680752 | 114549 | 116 | 1.00E-27
6680107 | granulin; acrogranulin; progranulin; PC cell-derived growth factor Mus | 191767, 6680107 | 1335064 | 57 | 8.00E-10
31982720 | SA rat hypertension-associated homolog Mus musculus | 2135243, 5032065 | 6996429 | 161 | 2.00E-41
6753448 | ceroid-lipofuscinosis, neuronal 2 Mus musculus | 13786206, 6753448 | NO MATCH
6754408 | kynurenine aminotransferase II Mus musculus | 6754408, 8393641 | NO MATCH
14780884 | polymerase delta interacting protein 38 Mus musculus | 7661672, 12834531 | NO MATCH
31543280 | putative prostate cancer tumor suppressor; cDNA sequence BC003311 Mus musculus | 1353701 | NO MATCH

TABLE 5-continued Tiers of evidence supporting the 163 newly identified mito-A proteins. The protein accession and description of each of the newly identified mito-A proteins is shown along with each of the GenPept accessions of the proteins identified in the tissue proteomics experiments. For each mito-A protein cluster, the top scoring human homologue from the study, the PSORT targeting prediction, the mitochondrial neighborhood index, and the results of epitope tagging experiments, when available, are shown. For the BLASTP analyses, only the top scoring match from the study by Mito Kor is provided, using a threshold of E < 1 x 10. The PSORT targeting prediction and probability were obtained for the exemplar protein sequence. The neighborhood indices (NSo, Noo, and N-so) are provided, when available. Due to probe-set duplicity, Some proteins have more than one corresponding probe-set, and others have no probe-set. An Nso 2 6, Noo 10, and Nso 2 19 each correspond to a nominal P = 0.001, assuming that mito-A genes are randomly distributed in expression space. In the final column, the Subcellular localization based on immunofluorescence microscopy is indicated for the five proteins shown in FIG. 2 Exemplar Protein for the Cluster Proteomics BLASTP against MitoKor Accession Description Liver Brain Heart Kidney Match Score Expect 12963555 Nit protein 2 Mus musculus 1296.3555 NO 1283576S MATCH 27754146 RIKEN cDNA 0710001 P09 Mus musculus 128536O4 1415O134 3O1 8.OOE-84 12839157 27754071 hypothetical protein 4833421E05 Rik Mus 12837739 NO musculus 12847330 MATCH 31981013 methionine sulfoxide reductase A Mus musculus 12844.852 NO 12857997 MATCH 13384998 tetratricopeptide repeat domain 11 Mus musculus 13384998. 
14747249 288 4.OOE-8O 77.05632 9506933 neuronal protein 15.6 Mus musculus 9506933 13938.442 22O 8.OOE-60 21311867 hypothetical protein D11 Erta 99e Mus musculus 12859025 7661732 174 4.OOE-46 7661732 6678716 low density lipoprotein receptor-related protein 5: 7513560 1335064 S3 3.OOE-08 ow density 34328204 valyl-tRNA synthetase 2 Mus musculus 6755953 7678804 191 S.OOE-SO 30794396 RIKEN cDNA 2410021P16 Mus musculus 12846107 13653O49 141 6.OOE-35 31982273 hydroxysteroid (17-beta) dehydrogenase 4: 12836373 14041699 1OO 1.OOE-22 hydroxysteroid 17-beta dehydrogenase 21450203 RIKEN cDNA A330035HO4; long-chain acyl-CoA 43366O4 11276,083 981 O Synthetase Mus musculus 31981207 RIKEN cDNA 4432405K22 Mus musculus 12232451 NO MATCH 6680612 ATP-binding cassette, Sub-family D, member 3: 105.161 NO peroxisomal membrane protein, 70 MATCH 31559883 very-long-chain acyl-CoA dehydrogenase VLCAD 12849737 10436258 1056 O homolog Mus musculus 6755548 solute carrier family 27 (fatty acid transporter), 3O8782O 15559516 61 4.OOE-11 member 2: very long-chain 21311988 RIKEN cDNA 4121402D02 Mus musculus 12853862 NO MATCH 6678,179 syntaxin binding protein 1; unc18 homolog (C. elegans); 69816O2 NO UNC-18 homolog (C. 
MATCH 30725845 AAA-ATPase TOB3 Mus musculus 137524-13 11095436 57 8.00E-10 31981562 pyruvate kinase 3 Mus musculus 6755074 107554 1032 O 11968160 3-oxoacid CoA transferase 2A; haploid germ cell 1196816O 4557817 709 O specific Succinyl CoA 20070418 aldehyde dehydrogenase family 7, member A1; 1283.6597 128O3387 953 O aldehyde dehydrogenase 7 family, 13195670 RIKEN cDNA 2610207I16 Mus musculus 13195670 141SOO62 374 e-105 19527030 kynurenine 3-monooxygenase (kynurenine 3- 11024672 NO hydroxylase) Mus musculus MATCH 6679437 protective protein for beta-galactosidase Mus 1286O234 NO musculus MATCH 31981549 Sulfide quinone reductase-like; flavo-binding 12842384 10864011 812 O protein; sulfide 6753074 adaptor protein complex AP-2, mul; adaptor- 6753074 NO related protein complex AP-2, mul; MATCH 28893421 RIKEN cDNA E430012MO5 gene Mus musculus 12654733 NO MATCH 19527276 RIKEN cDNA 4921526O06 Mus musculus 7705586 NO MATCH 276,59728 aldo-keto reductase family 7, member A5 (aflatoxin 13384,704 NO aldehyde reductase); MATCH 14861848 DNA segment, Chr 7, Roswell Park 2 complex, 14861848 NO expressed; androgen regulated gene MATCH 12963591 stomatin-like protein 2 Mus musculus 1296.3591 7513076 603 e-174 6753058 annexin A10 Mus musculus 6274497 4826643 271 1.OOE-74 12834781 unnamed protein product Mus musculus 12834781 NO 1285 6019 MATCH 18875408 peroxisomal acyl-CoA thioesterase 1 Mus 488SS6S NO musculus MATCH US 2007/0203083 A1 Aug. 30, 2007 65

TABLE 5-continued Tiers of evidence supporting the 163 newly identified mito-A proteins. The protein accession and description of each of the newly identified mito-A proteins is shown along with each of the GenPept accessions of the proteins identified in the tissue proteomics experiments. For each mito-A protein cluster, the top scoring human homologue from the study, the PSORT targeting prediction, the mitochondrial neighborhood index, and the results of epitope tagging experiments, when available, are shown. For the BLASTP analyses, only the top scoring match from the study by Mito Kor is provided, using a threshold of E < 1 x 10. The PSORT targeting prediction and probability were obtained for the exemplar protein sequence. The neighborhood indices (NSo, Noo, and N-so) are provided, when available. Due to probe-set duplicity, Some proteins have more than one corresponding probe-set, and others have no probe-set. An Nso 2 6, Noo 10, and Nso 2 19 each correspond to a nominal P = 0.001, assuming that mito-A genes are randomly distributed in expression space. In the final column, the Subcellular localization based on immunofluorescence microscopy is indicated for the five proteins shown in FIG. 2 Exemplar Protein for the Cluster Proteomics BLASTP against MitoKor Accession Description Liver Brain Heart Kidney Match Score Expect 11968166 cathepsin Z. preproprotein; cathepsin Z. 
precursor: 283,5144 NO cathepsin X Mus musculus MATCH 3156O255 RIKEN cDNA 2410005O16 Mus musculus 3384896 16307164 511 e-147 6678509 urate oxidase; uricase Mus musculus 6678SO9 NO MATCH 3.1980955 RIKEN cDNA 2310005D12 Mus musculus 3195640 12654521 474 e-136 21313O8O RIKEN cDNA 2700085E05 Mus musculus 284O992 NO MATCH 6755334 ribonuclease H1 Mus musculus NO MATCH 6679957 glioblastoma amplified Mus musculus 6679957 4503937 S4O e-156 7948999 peroxiredoxin 4; antioxidant enzyme AOE372; Prx 24O7849 14768743 464 e-133 V Mus musculus 13386062 RIKEN cDNA 943 0083G14 Mus musculus 13386062 17461670 414 e-118 21311883 RIKEN cDNA 0610007O07 Mus musculus 2858578 NO MATCH 21312204 RIKEN cDNA 2810435D12 Mus musculus 28SO490 13654294 400 e-113 13385084 NIPSNAP-related protein Mus musculus 3385084 14743031 416 e-118 19527384 RIKEN cDNA D93001OJO1 Mus musculus 126S3017 12653017 458 e-131 667876O ySophospholipase 1: phospholipase 1a; 667876O 1474.7375 249 ySophopholipase 1 Mus musculus 21312894 RIKEN cDNA 4930483N21 Mus musculus 12854111 8922629 56 213131.38 glutathione S-transferase class kappa Mus 12832811 77.05704 350 musculus 21311853 RIKEN cDNA 0610012H03 Mus musculus 12832709 NO MATCH 21311967 RIKEN cDNA 0610008C08 Mus musculus 12832215 12OO1992 287 1.OOE-79 13384950 RIKEN cDNA 2310039H17 Mus musculus 13384950 NO MATCH 1274.6414 growth factor, erv1 (S. 
cerevisiae)-like (augmenter 7670387 NO of liver regeneration); MATCH 6806917 GM2 ganglioside activator protein Mus musculus 479912 NO MATCH 7305137 heme binding protein 1: heme-binding protein; p22 4886904 NO HBP: heme-binding protein 1 MATCH 2SO92662 DNA segment, Chr 11, Wayne State University 68, 13386160 NO expressed Mus musculus MATCH 21313484 nitrogen fixation cluster-like Mus musculus 12843.563 NO MATCH 18079334 ethanol induced 6 Mus musculus 12834O45 NO MATCH 6679078 expressed in non-metastatic cells 2, protein; 13929.192 1421.609 311 expressed in non-metastatic cells 1338SO42 RIKEN cDNA 2010309E21 Mus musculus 1338SO42 NO MATCH 15809030 Synuclein, beta Mus musculus 464424 NO MATCH 6755911 thioredoxin 1; thioredoxin Mus musculus 12841S60 1474O403 196 1.OOE-52 2O841184 acetyl-Coenzyme A carboxylase beta Mus 3O8OS46 NO musculus MATCH 2507,2051 RIKEN cDNA 3110065L21 Mus musculus 4758O12 NO MATCH 20071710 2010.002H18Rik protein Mus musculus 7678804 7678804 954 O 200022 neurofilament protein 205686 14742600 301 4.OOE-83 21618729 Facl5 protein Mus musculus 10800O88 77O6449 1193 O 17505907 DEAD (Asp-Glu-Ala-Asp) box polypeptide 31 122324.67 NO isoform 1: DEAD/DEXH helicase DDX31 MATCH 12855263 unnamed protein product Mus musculus 12855263 NO MATCH 12855887 unnamed protein product Mus musculus 12855887 NO MATCH 26363071 unnamed protein product Mus musculus 1284.3537 14771689 107 4.OOE-25 US 2007/0203083 A1 Aug. 30, 2007 66

TABLE 5-continued
Accession | Description | Proteomics accessions (Liver, Brain, Heart, Kidney) | MitoKor match | Score | Expect
12836798 | unnamed protein product Mus musculus | 12836798 | NO MATCH
12846164 | unnamed protein product Mus musculus | 12846164 | NO MATCH
12845262 | unnamed protein product Mus musculus | 12845262 | 14770968 | 326 | 3.00E-91
12860092 | unnamed protein product Mus musculus | 12860092 | 11545863 | 316 | 2.00E-88
20897872 | RIKEN cDNA 1810058I14 Mus musculus | 12841742 | NO MATCH
12841269 | unnamed protein product Mus musculus | 12841269 | NO MATCH
12835668 | unnamed protein product Mus musculus | 12835668 | NO MATCH
12861374 | unnamed protein product Mus musculus | 12861374 | NO MATCH
22267464 | retinoic acid inducible protein 3 Mus musculus | 13436248 | NO MATCH
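The Table 5 caption ties each neighborhood index threshold to a nominal P value under the assumption that mito-A genes are randomly distributed in expression space. The following is a minimal sketch of how such an index and its nominal P value could be computed. It assumes, as an illustration only, that the index N_R counts the annotated genes among the R nearest expression neighbors of a gene, that distance is Euclidean, and that the random-distribution null is modeled as a hypergeometric draw. The function names `neighborhood_index` and `nominal_p` and the toy expression profiles are hypothetical and do not come from the specification.

```python
from math import comb, dist


def neighborhood_index(profiles, annotated, gene, r):
    """Count annotated genes among the r nearest expression neighbors of `gene`.

    profiles:  dict mapping gene name -> tuple of expression values
    annotated: set of gene names already annotated (e.g. mito-A)
    Euclidean distance is an assumption made for this sketch.
    """
    others = [g for g in profiles if g != gene]
    others.sort(key=lambda g: dist(profiles[gene], profiles[g]))
    return sum(1 for g in others[:r] if g in annotated)


def nominal_p(n_obs, r, n_annotated, n_genes):
    """Upper-tail probability P(N_r >= n_obs) under a hypergeometric null.

    n_genes is the number of candidate neighbors, of which n_annotated
    are annotated; r of them are drawn as the neighborhood.
    """
    total = comb(n_genes, r)
    upper = min(r, n_annotated)
    tail = sum(
        comb(n_annotated, k) * comb(n_genes - n_annotated, r - k)
        for k in range(n_obs, upper + 1)
    )
    return tail / total


# Toy data (hypothetical): five genes measured in two conditions.
toy_profiles = {
    "a": (0.0, 0.0),
    "b": (0.0, 1.0),
    "c": (1.0, 0.0),
    "d": (5.0, 5.0),
    "e": (6.0, 5.0),
}
toy_annotated = {"b", "d"}

n2 = neighborhood_index(toy_profiles, toy_annotated, "a", 2)
p2 = nominal_p(n2, 2, len(toy_annotated), len(toy_profiles) - 1)
```

With real data, `n_genes` would be the number of genes on the array other than the query gene, and a threshold such as N50 ≥ 6 would be the smallest count whose tail probability falls at or below the chosen nominal P.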

[0351]

TABLE 6 The ordered gene list for FIGS. 7 and 8. The list is ordered based on FIGS. 7 and 8, and each row includes the corresponding Affymetrix probe-set ID, protein accession, the gene symbol, evidence (white, previously annotated; gray, detected in proteomics; black, previously annotated and detected in proteomics), the module annotation, and the description.
Row | Probe Set | Protein Exemplar | Description | Symbol
1 | 104560_at | 21553115 | putative mitochondrial solute carrier Mus musculus | Mrs3/4-pending
2 | 97868_at | 31560085 | DnaJ (Hsp40) homolog, subfamily A, member 3 Mus musculus | Dnaja3
3 | 95608_at | 6681079 | cathepsin B preproprotein Mus musculus | Ctsb
4 | 95359_at | 6680305 | heat shock protein 1, beta; heat shock protein, 84 kDa 1; heat shock 90 kDa | Hspcb
5 | 104103_at | 30725845 | AAA-ATPase TOB3 Mus musculus | TOB3
6 | 96861_at | 30519921 | mitochondrial ribosomal protein L50 Mus musculus | D4Wsu125e
7 | 95438_at | 31559891 | mitochondrial Rho 1 Mus musculus | 2210403N23Rik
8 | 95431_at | 27552760 | DNA segment, Chr 16, Indiana University Medical 22, expressed Mus musculus | D16Ium22e
9 | 93808_at | 6671688 | carbonyl reductase 2; lung carbonyl reductase Mus musculus |
10 | 103044_g_at | 6754760 | mature T-cell proliferation 1 Mus musculus |
11 | 104747_at | 6678001 | solute carrier family 1, member 1 Mus musculus |
12 | 104748_s_at | 6678001 | solute carrier family 1, member 1 Mus musculus |
13 | 104700_at | 6677943 | serine hydroxymethyl transferase 1 (soluble) Mus musculus |
14 | 98470_at | 6755544 | solute carrier family 25 (mitochondrial carrier, brain), member 14; solute |
15 | 97935_at | 21311988 | RIKEN cDNA 4121402D02 Mus musculus |
16 | 103061_at | 31982847 | glutamic acid decarboxylase 1 Mus musculus |
17 | 95432_f_at | 27552760 | DNA segment, Chr 16, Indiana University Medical 22, expressed Mus musculus |
18 | 95746_at | 31560731 | ATPase, H+ transporting, V1 subunit A, isoform 1; ATPase, H+ transporting, |
19 | 93126_at | 10946574 | creatine kinase, brain Mus musculus |
20 | 97983_s_at | 6678179 | syntaxin binding protein 1; unc18 homolog (C. elegans); UNC-18 homolog (C. |
21 | 100510_at | 15809030 | synuclein, beta Mus musculus |
22 | 93362_at | 6753074 | adaptor protein complex AP-2, mu1; adaptor-related protein complex AP-2, mu1; |
23 | 97544_at | 6756041 | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta | Ywhaz
24 | AFFX-GapdhMur/M32599_3_st | 6679937 | glyceraldehyde-3-phosphate dehydrogenase Mus musculus |

TABLE 6-continued The ordered gene list for FIGS. 7 and 8. The list is ordered based on FIGS. 7 and 8, and each row includes the corresponding Affymetrix probe-set ID, protein accession, the gene symbol, evidence (white, previously annotated; gray, detected in proteomics: black, previously annotated and detected in proteomics), the module annotation, and the description Protein Row Probe Se Exemplar Description Symbol 25 100551 r at 1671 6343 cytochrome c oxidase, subunit VIc Mus musculus 26 99124 a 9507187 ractured callus expressed transcript 1: Fracture Callus 1; Small Zinc 27 92876 a 6754814 NADH dehydrogenase (ubiquinone) Fe—S protein 4: NADH dehydrogenase (ubiquinone) 28 96760 a 7305573 translocase of inner mitochondrial membrane 10 homolog Mus musculus mm10 29 94421 r at 6681.031 cryptochrome 1 (photolyase-like) Mus musculus Cry1 30 93359 a 18859597 RIKEN cDNA 1810004I06 Mus musculus 1810004IO6Rik 31 98832 a. 6678417 hyroid peroxidase Mus musculus Tpo 32 96.857 a 668O816 complement component 1, q Subcomponent binding protein Mus musculus 33 98117 a 9506911 NADH dehydrogenase (ubiquinone) 1 alpha Subcomplex, 1 (7.5 kD, MWFE); Ndufa1 NADH 34 100046 at 6678952 methylenetetrahydrofolate dehydrogenase (NAD+ dependent), Mthf2 35 103806 at 6678716 ow density lipoprotein receptor-related protein 5; low density Lrp5 36 97372 a. 
1887S324 DAZ associated protein 1 Mus musculus Dazap1 37 102416 at 6.681097 cytochrome P450, family 17, subfamily a, polypeptide 1: cytochrome P450, Cyp17 7; 38 94850 a 12331400 acyl-Coenzyme A thioesterase 3, mitochondrial; MT-ACT48, p48 Mus Acate3-pending musculus 39 103471 at 31.9812O7 RIKEN cDNA 4432405K22 Mus musculus 443240SK22Rik 40 928.10 a 21704122 pyruvate dehydrogenase kinase, isoenzyme 3 Mus musculus Pok3 41 93062 a 3156O438 mitochondrial ribosomal protein L39; ribosomal protein, mitochondrial, L5 Mrpl39 Mus 42 97884 a mitochondrial ribosomal protein S11 Mus musculus Mrps 11 43 94420 f at cryptochrome 1 (photolyase-like) Mus musculus Cry1 44 99.027 a 3 981887 Bcl2-like Mus musculus Bcl2 45 100619 rat 22094O75 solute carrier family 25, member 5; adenine nucleotide translocator 2, SI 46 102007 at 315429SO holocytochrome c synthetase Mus musculus Hccs 47 95.354 a Solute carrier family 25 (mitochondrial carrier; adenine nucleotide SI c25a13 48 995.43 s at deoxyguanosine kinase Mus musculus Dguok 49 98.903 a 21312O28 RIKEN cDNA 1110006I11 Mus musculus 1110006I11Rik 50 96.032 a 3.1982497 ATP synthase, H+ transporting, mitochondrial FO complex, Subunit c (subunit Atp5g1 9), 51 95.734 a 3.1981470 mitochondrial ribosomal protein L3 Mus musculus Mrpl3 52 102128 at 31.981257 mitochondrial ribosomal protein S25 Mus musculus Mrps25 53 94210 a 7305581 translocase of inner mitochondrial membrane 9 homolog Mus musculus S4 103622 at 99.10434 malonyl-CoA decarboxylase Mus musculus Mlycod 55 96.289 a 1296.3591 stomatin-like protein 2 Mus musculus StomI2 56 AFFX 6679237 pyruvate carboxylase; pyruvate decarboxylase Mus musculus PyruCarbMurf L09192 5 at 57 95.645 a. 
21313484 nitrogen fixation cluster-like Mus musculus 58 96916 a 13385266 mitochondrial ribosomal protein L33 Mus musculus 59 94.012 a 7305575 translocase of inner mitochondrial membrane 13 homolog a Mus musculus 60 93859 a 19526984 mitochondrial translational initiation factor 2 Mus musculus 2410112O06Rik 61 96.202 a 71064.09 Solute carrier family 1, member 2; glial high affinity glutamate transporter SI 62 96650 a 7709988 AU RNA-binding enoyl-coenzyme A hydratase; AU RNA-binding Auh protein enoyl-coenzyme 63 98.120 a 1671 6447 mitochondrial ribosomal protein L27 Mus musculus Mrpl27 64 93048 a 83931S6 caseinolytic protease, ATP-dependent, proteolytic subunit homolog: pp caseinolytic 65 94.852 a. 3.1982332 glutamate-ammonia ligase (glutamine Synthase); glutamine synthetase Mus 66 98.909 a 13277380 ipoic acid synthetase Mus musculus 67 103646 at 6.681009 carnitine acetyltransferase Mus musculus 68 98984 f at 3.198.1769 glycerol-3-phosphate dehydrogenase 2; glycerol phosphate dehydrogenase 69 98099 a 27753998 nudix (nucleoside diphosphate linked moiety X)-type motif 9 Mus musculus 70 94.897 a 13S4O480 glutathione peroxidase 4: Sperm nuclei glutathione peroxidase; phospholipid Gl 71 97369 g at 6753O3O A-kinase anchor protein 1: A kinase anchor protein Mus musculus Akap1 72 99636 a 1478O884 polymerase delta interacting protein 38 Mus musculus 73 95.215 f at 21070950 ubiquitin C; polyubiquitin C Mus musculus 74 96.095 at 13195670 RIKEN cDNA 2610207I16 Mus musculus 75 93114 a 10181184 ATP synthase, H+ transporting, mitochondrial FO complex, Subunit f isoform 2: 76 100.527 at 21311867 hypothetical protein D11 Erta 99e Mus musculus D 77 92625 a. 6679078 expressed in non-metastatic cells 2, protein; expressed in non-metastatic Nme2 cells 78 96653 a 21311883 RIKEN cDNA 0610007O07 Mus musculus 79 96856 a 668O816 complement component 1, q Subcomponent binding protein Mus musculus 8O 98.545 a. 
6671622 B-cell receptor-associated protein 37; repressor of estrogen receptor activity Bcap37

TABLE 6-continued The ordered gene list for FIGS. 7 and 8. The list is ordered based on FIGS. 7 and 8, and each row includes the corresponding Affymetrix probe-set ID, protein accession, the gene symbol, evidence (white, previously annotated; gray, detected in proteomics: black, previously annotated and detected in proteomics), the module annotation, and the description Protein Row Probe Set Exemplar Description Symbol 81 96858 at 6755004 programmed cell death 8; programmed cell death 8 (apoptosis inducing Polcd8 factor); 82 94.855 at 6679299 prohibitin Mus musculus 83 99148 at 33859554 fumarate hydratase 1 Mus musculus 84 96.898 at 33859512 ATP synthase, H+ transporting, mitochondrial FO complex, Subunit b, isoform 1 85 AFFX-GapdhMurf 6679.937 glyceraldehyde-3-phosphate dehydrogenase Mus musculus M32599 5 st 86 AFFX-GapdhMurf 6679.937 glyceraldehyde-3-phosphate dehydrogenase Mus musculus M32599 M st 87 93392 at 66784.95 uncoupling protein 3, mitochondrial Mus musculus Ucp3 88 94379 at 2SO31694 kinesin family member 1B Mus musculus Kiflb 89 102426 at 6753290 calsequestrin 1 Mus musculus Casq1 90 96.801 at 10946936 adenylate kinase 1: cytosolic adenylate kinase Mus musculus Ak1 91 96.066 s at 3.1981562 pyruvate kinase 3 Mus musculus Pkm2 92 101214 f at 6679.937 glyceraldehyde-3-phosphate dehydrogenase Mus musculus Gapd 93 AFFX-GapdhMurf 6679.937 glyceraldehyde-3-phosphate dehydrogenase Mus musculus M32599 3 at 94 AFFX-GapdhMurf 6679.937 glyceraldehyde-3-phosphate dehydrogenase Mus musculus M32599 5 at 95 AFFX-GapdhMurf 6679.937 glyceraldehyde-3-phosphate dehydrogenase Mus musculus M32599 M at 96 94279 at 2153622O RIKEN cDNA 0610008F 14 Mus musculus O610008F14Rik 97 95.498 at 13384968 mitochondrial ribosomal protein S15 Mus musculus Mrps 15 98 98.130 at 99.03609 thioredoxin 2; thioredoxin nuclear gene encoding mitochondrial protein; Tx2 99 96626 at 273.70092 RIKEN cDNA 2300002G02 Mus musculus 23OOOO2GO2Rik OO 99658 f a 12963697 RIKEN cDNA 1110025H10 Mus musculus 
1110O2SH1 ORik O1 97342 at 13384894 mitochondrial ribosomal protein S14 Mus musculus Mrps 14 O2 95472 f a 13385726 ubiquinol-cytochrome c reductase binding protein Mus musculus 2210415M14Rik O3 94.062 at 209.00762 NADH dehydrogenase (ubiquinone) flavoprotein 2 Mus musculus Ndufv2 O4 99661 r a 6680991 cytochrome c oxidase, subunit VIIc: cytochrome c oxidase subunit VIIc Mus 05 95718 f a 13128954 upregulated during skeletal muscle growth 5 Mus musculus USmg5 O6 101580 at 13384,754 cytochrome c oxidase subunit VIIb Mus musculus O7 96.887 at 9506933 neuronal protein 15.6 Mus musculus Np 15 O8 96.280 at 3.1981 600 NADH dehydrogenase (ubiquinone) 1 alpha Subcomplex, 2: NADH Ndufa2 dehydrogenase 09 95131 f a 13386096 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex, 2 Mus musculus 1810011 OO1Rik 10 95132 r a 13386096 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex, 2 Mus musculus 1810011 OO1Rik 11 99660 f a 6680991 cytochrome c oxidase, subunit VIIc: cytochrome c oxidase subunit VIIc Mus Cox7c 12 93.014 a 3.1980.744 ATP synthase, H+ transporting, mitochondrial FO complex, subunit g; F1F0 Atp51 ATP 13 99678 f a 3.1980.744 ATP synthase, H+ transporting, mitochondrial FO complex, subunit g; F1F0 Atp51 ATP 14 97512 a 21312554 RIKEN cDNA 20101.07E04 (Mus musculus 20101.07E04 Rilk 15 100550 f at 1671 6343 cytochrome c oxidase, subunit VIc Mus musculus Cox6c 16 93.820 a 3.1981830 cytochrome c oxidase, subunit VIIa 2; cytochrome c oxidase subunit VIIa 3: Cox7a2 17 99115 a. 
21539599 ubiquinol cytochrome c reductase hinge protein; mitochondrial hinge protein; 2610041P16Rik 18 94909 a 13384854 mitochondrial ribosomal protein S17 Mus musculus Mrps 17 19 96.686 i at 13385436 RIKEN cDNA 2010100O12 Mus musculus 2010100012Rik 2O 96687 f at 13385436 RIKEN cDNA 2010100O12 Mus musculus 2010100012Rik 21 94526 a 19527228 DNA segment, Chr 10, ERATO Doi 214, expressed Mus musculus D10Ertd214e 22 97880 a 21313536 dihydrolipoamide S-Succinyltransferase (E2 component of 2-oxo-glutarate 493 OS29OO8Rik complex) 23 96.096 f at 13195670 RIKEN cDNA 2610207I16 Mus musculus 26102O7I16Rik 24 94.866 a 13384844 mitochondrial ribosomal protein S16 Mus musculus Mrps 16 25 93582 a 2O587962 demethyl-Q 7 Mus musculus 26 94860 a 33468943 translocator of inner mitochondrial membranea; translocator of inner Timm17a 27 10O892 at 3.198O802 NADH dehydrogenase (ubiquinone) 1 alpha Subcomplex, assembly factor 1; Ndufaf1 NADH 28 102097 f at 21539587 NADH-ubiquinone oxidoreductase B9 subunit; Complex I-B9; CI-B9 Mus 1010001M12Rik musculus 29 97874 a 338S9744 RIKEN cDNA 1500032D16 Mus musculus 1SOOO32D16Rik 30 93562 a 1338SOS4 NADH dehydrogenase (ubiquinone) 1 beta subcomplex 3 Mus musculus 27OOO331.6Rik 31 94534 a 182SO284 isocitrate dehydrogenase 3 (NAD+) alpha Mus musculus Idh3a 32 98.929 a 13384,742 RIKEN cDNA 1110018B13 Mus musculus 1110018B13Rik 33 95058 f at 21312594 RIKEN cDNA 2610205H19; EST AA108335 Mus musculus 261 O2OSH19Rik 34 99.666 a 1338S942 citrate synthase Mus musculus Cs 35 94080 a 20908717 Succinate dehydrogenase Fp subunit Mus musculus Scha 36 93029 a isocitrate dehydrogenase 3 (NAD+), gamma Mus musculus US 2007/0203083 A1 Aug. 30, 2007 69

TABLE 6-continued The ordered gene list for FIGS. 7 and 8. The list is ordered based on FIGS. 7 and 8, and each row includes the corresponding Affymetrix probe-set ID, protein accession, the gene symbol, evidence (white, previously annotated; gray, detected in proteomics: black, previously annotated and detected in proteomics), the module annotation, and the description Protein Row Probe Se Exemplar Description Symbol 37 94.912 a 17505220 mitochondrial ribosomal protein S21 Mus musculus Mrps21 38 93531 a 2131.2012 NADH dehydrogenase (ubiquinone) 1 alpha Subcomplex, 8 Mus musculus O610033LO3Rik 39 93754 a 7949037 enoyl coenzyme A hydratase 1, peroxisomal; peroxisomalmitochondrial Ech1 dienoyl-CoA 40 92.581 a 6680618 acetyl-Coenzyme A dehydrogenase, medium chain Mus musculus Acadm 41 96.112 a 3.1981826 electron transferring flavoprotein, alpha polypeptide; Alpha-ETF Mus Etfa musculus 42 97869 a 21313290 electron transferring flavoprotein, dehydrogenase Mus musculus 43 95072 a 1338SOO6 cytochrome c-1 Mus musculus 44 96267 a 19526814 NADH dehydrogenase (ubiquinone) flavoprotein 1: NADH dehydrogenase flavoprotein 45 101989 at 13384,794 ubiquinol-cytochrome c reductase core protein 1 Mus musculus Udcirc1 46 94.806 a 18152793 pyruvate dehydrogenase (lipoamide) beta Mus musculus Plb 47 93815 a. 21313618 RIKEN cDNA 0610041L09 Mus musculus O610041LO9Rik 48 96268 a 9845299 Succinate-CoA ligase, GDP-forming, alpha Subunit; succinyl-CoA synthetase Suclg1 Mus 49 102749 at 6753504 cytochrome c oxidase, Subunit VIIa. 
1; cytochrome c oxidase subunit VIIa 1 Cox7a.1 Mus 50 95698 a 13.385322 NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 7 Mus musculus 51 93119 a 6753500 cytochrome c oxidase, subunit Vb Mus musculus 52 96909 a 27754007 NADH dehydrogenase (ubiquinone)1, alpha/beta Subcomplex, 1 Mus musculus 53 99128 a 20070412 ATP synthase, H+ transporting, mitochondrial F1 complex, O subunit Mus At So S4 100753 at 668.0748 ATP synthase, H+ transporting, mitochondrial F1 complex, alpha Subunit, Atp5a1 isoform 55 93596 i at 13385484 ATP synthase, H+ transporting, mitochondrial F1 complex, epsilon Subunit; 2410043G19Rik ATP 56 93844 a 21539585 low molecular mass ubiquinone-binding protein; ubiquinol-cytochrome c Uacrb reductase 57 96915 f at 21539587 NADH-ubiquinone oxidoreductase B9 subunit; Complex I-B9; CI-B9 Mus 1010001M12Rik musculus 58 99618 a 13385112 ubiquinol-cytochrome c reductase subunit Mus musculus O71 OOO8D09Rik 59 100079 at 297891.48 NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 9 Mus musculus Ndufb9 60 93581 a 1338.5558 NADH dehydrogenase (ubiquinone) 1 beta subcomplex 8 Mus musculus 29OOO1 OIOSRik 61 96.870 a 18079339 aconitase 2, mitochondrial Mus musculus Aco2 62 98102 a 6679.261 pyruvate dehydrogenase E1 alpha 1: pyruvate dehydrogenase E1alpha Poha1 subunit Mus 63 95.425 a. 3.1982S2O acetyl-Coenzyme A dehydrogenase, long-chain Mus musculus 64 96913 a 21704100 hydroxyacyl-Coenzyme A dehydrogenase 3-ketoacyl-Coenzyme A 65 93972 a 23346461 NADH dehydrogenase (ubiquinone) Fe—S protein 2: NADH-coenzyme Q reductase Mus 66 94216 a 13384690 Succinate dehydrogenase complex, Subunit C, integral membrane protein Mus 67 97502 a 3.198.2856 dihydrolipoamide dehydrogenase Mus musculus Dld 68 92574 a 27229021 RIKEN cDNA 3110001M13 Mus musculus 3110001M13Rik 69 102000 f at 22267442 RIKubiquinol cytochrome c reductase core protein 2 Mus musculus 1SOOOO4OO6Rik 70 96321 a. 3384,720 NADH dehydrogenase (ubiquinone) 1 alpha Subcomplex, 9 Mus musculus 1010001N11Rik 71 97.201 S. 
at 33861OO NADH dehydrogenase (ubiquinone) 1 alpha Subcomplex, 5 Mus musculus 29OOOO219Rik 72 93764 a 2963633 genes associated with retinoid-IFN-induced mortality 19 Mus musculus Grim19-pending 73 97.307 f at 27754,144 NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 5; NADH NdlfbS dehydrogenase 74 92.798 a 16O2916 ATP synthase, H+ transporting, mitochondrial F1 complex, gamma polypeptide 1: F1 75 92799 g 16O2916 ATP synthase, H+ transporting, mitochondrial F1 complex, gamma polypeptide 1: F1 76 93572 a. 21704O20 NADH dehydrogenase (ubiquinone) Fe—S protein 1 Mus musculus 77 93780 a 3385260 hioesterase Superfamily member 2 Mus musculus 78 99593 a 9527334 NADH dehydrogenase (ubiquinone) Fe—S protein 5; NADH dehydrogenase Fe—S protein 79 96746 a 31542.559 dihydrolipoamide S-acetyltransferase (E2 component of pyruvate dehydrogenase 95441 a 2025536 translocase of inner mitochondrial membrane 23 homolog Mus musculus Timm23 102049 at 7305.375 pyruvate dehydrogenase kinase, isoenzyme 4: pyruvate dehydrogenase Pok4. kinase 4Mus 82 95.485 a 668O163 L-3-hydroxyacyl-Coenzyme A dehydrogenase, short chain; hydroxylacyl Hadhsc Coenzyme A 83 95.426 a 29789289 enoyl Coenzyme A hydratase, short chain, 1, mitochondrial Mus musculus Echs1 84 95064 a 29.1262OS acetyl-Coenzyme A acyltransferase 2 (mitochondrial 3-oxoacyl-Coenzyme A D18Ertd240e 85 96947 a 21312004 RIKEN cDNA 0610009I16 Mus musculus O61000916Rik US 2007/0203083 A1 Aug. 30, 2007 70

TABLE 6-continued The ordered gene list for FIGS. 7 and 8. The list is ordered based on FIGS. 7 and 8, and each row includes the corresponding Affymetrix probe-set ID, protein accession, the gene symbol, evidence (white, previously annotated; gray, detected in proteomics; black, previously annotated and detected in proteomics), the module annotation, and the description.
Row | Probe Set | Protein Exemplar | Description | Symbol
186 | 96757_at | 20070420 | DNA segment, Chr 10, Johns Hopkins University 81 expressed Mus musculus | D10Jhu81e
187 | 98128_at | 7949005 | ATP synthase, H+ transporting, mitochondrial F0 complex, subunit F; | Atp5
188 | 94531_at | 33859690 | RIKEN cDNA 2310005O14 Mus musculus | 2310005O14Rik
189 | 99667_at | 6753502 | cytochrome c oxidase, subunit VI a, polypeptide 2; subunit VIaH (heart-type) | Cox6a2
190 | 102402_at | 6679957 | glioblastoma amplified Mus musculus | Gbas
191 | 99631_f_at | 6680988 | cytochrome c oxidase, subunit VI a, polypeptide 1; subunit VIaL (liver-type) | Cox6a1
192 | 96670_at | 21313138 | glutathione S-transferase class kappa Mus musculus | 0610025I19Rik
193 | 94940_at | 31980706 | methylcrotonoyl-Coenzyme A carboxylase 1 (alpha) Mus musculus | Mccc1
194 | AFFX-PyruCarbMur/L09192_MB_at | 6679237 | pyruvate carboxylase; pyruvate decarboxylase Mus musculus |
195 | AFFX-PyruCarbMur/L09192_3_at | 6679237 | pyruvate carboxylase; pyruvate decarboxylase Mus musculus |
196 | 93308_s_at | 6679237 | pyruvate carboxylase; pyruvate decarboxylase Mus musculus | Pcx
197 | 103401_at | 31982522 | acyl-Coenzyme A dehydrogenase, short chain; acetyl-Coenzyme A dehydrogenase, | Acads
198 | 94807_at | 23943838 | solute carrier family 25, member 1; DiGeorge syndrome gene j; solute carrier | Slc25a1
199 | 97248_at | 6681137 | diazepam binding inhibitor; acyl-CoA binding protein; diazepam-binding inhibitor | Dbi
200 | 94507_at | 31560705 | fatty acid Coenzyme A ligase, long chain 2; acetyl-Coenzyme A synthetase; | Facl2
201 | 104057_at | 13277394 | GrpE-like 1, mitochondrial Mus musculus | Grpel1
202 | 97279_at | 21704140 | 3-hydroxyisobutyrate dehydrogenase, mitochondrial precursor; EST AI265272; | AI265272
203 | 99613_at | 6678970 | methylmalonyl-coenzyme A mutase Mus musculus | Mut
204 | 96035_at | 31982494 | branched chain ketoacid dehydrogenase E1, alpha polypeptide; BCKAD E1a Mus | Bckdha
205 | 101045_at | 7949047 | hydroxyacyl-Coenzyme A dehydrogenase type II; hydroxyacyl-Coenzyme A | Hadh2
206 | 98966_at | 6753610 | dihydrolipoamide branched chain transacylase E2; BCKAD E2 Mus musculus | Dbt
207 | 104212_at | 21389320 | leucine-rich PPR motif-containing protein; leucine rich protein LRP130 Mus | 3110001K13Rik
208 | 92845_at | 18266680 | 3-oxoacid CoA transferase Mus musculus | Oxct
209 | 99009_at | 31543330 | nicotinamide nucleotide transhydrogenase Mus musculus | Nnt
210 | 97367_at | 6753030 | A-kinase anchor protein 1; A kinase anchor protein Mus musculus | Akap1
211 | 93042_at | 31981875 | benzodiazepine receptor, peripheral Mus musculus | Bzrp
212 | 92754_at | 6679767 | ferredoxin reductase Mus musculus | Fdxr
213 | 92587_at | 6679765 | ferredoxin 1; ADRENODOXIN Mus musculus | Fdx1
214 | 92213_at | 31543776 | steroidogenic acute regulatory protein Mus musculus | Star
215 | 96256_at | 6680690 | peroxiredoxin 3; anti-oxidant protein 1; mitochondrial Trx dependent peroxide | Prdx3
216 | 92829_at | 6680309 | heat shock protein 1 (chaperonin 10); heat shock 10 kDa protein 1 (chaperonin | Hspe1
217 | 93277_at | 31981679 | heat shock protein 1 (chaperonin); heat shock protein, 60 kDa; heat shock 60 kDa | Hspd1
218 | 100977_at | 27369966 | RIKEN cDNA D530020C15 Mus musculus | D530020C15Rik
219 | 101096_s_at | 6754160 | HS1 binding protein Mus musculus | Hs1bp1
220 | 95065_at | 6754846 | nitrogen fixation gene, yeast homolog 1; nifs-like (sic) Mus musculus | Nfs1
221 | AFFX-PyruCarbMur/L09192_MA_at | 6679237 | pyruvate carboxylase; pyruvate decarboxylase Mus musculus |
222 | 98137_at | 6671680 | carbonic anhydrase 5a, mitochondrial; carbonic anhydrase 5, mitochondrial; | Car5a
223 | 98459_at | 6677943 | serine hydroxymethyl transferase 1 (soluble) Mus musculus | Shmt1
224 | 92586_at | 6680027 | glutamate dehydrogenase Mus musculus | Glud
225 | 97450_s_at | 20070418 | aldehyde dehydrogenase family 7, member A1; aldehyde dehydrogenase 7 family, | Aldh7a1
226 | 97515_at | 31982273 | hydroxysteroid (17-beta) dehydrogenase 4; hydroxysteroid 17-beta dehydrogenase | Hsd17b4
227 | 103085_at | 7305137 | heme binding protein 1; heme-binding protein; p22 HBP; heme-binding protein 1 | Hebp1
228 | 98533_at | 13385268 | cytochrome b-5 Mus musculus | Cyb5
229 | 104086_at | 21311901 | dimethylglycine dehydrogenase precursor Mus musculus | 1200014D15Rik
230 | 96890_at | 13385298 | RIKEN cDNA 1300002A08 Mus musculus | 1300002A08Rik
231 | 93026_at | 31981068 | microsomal glutathione S-transferase 1 Mus musculus | Mgst1
232 | 96763_at | 20149748 | sarcosine dehydrogenase Mus musculus | Sardh
233 | 93278_at | 28545662 | sterol carrier protein 2, liver Mus musculus | Scp2
234 | 101515_at | 7656855 | acyl-Coenzyme A oxidase 1, palmitoyl; acyl-Coenzyme A oxidase; Acyl-CoA oxidase | Acox1
235 | 93625_at | 7709978 | alanine-glyoxylate aminotransferase; alanine-glyoxylate aminotransferase 1 Mus | Agxt

Row | Probe Set | Protein Exemplar | Description | Symbol
236 | 96326_at | 22122769 | tyrosine aminotransferase Mus musculus | Tat
237 | 101910_f_at | 13654245 | major urinary protein 1 Mus musculus | Mup1
238 | 92606_at | 6678509 | urate oxidase; uricase Mus musculus | Uox
239 | 102096_f_at | 13654245 | major urinary protein 1 Mus musculus | Mup1
240 | 93320_at | 27804309 | carnitine palmitoyltransferase 1, liver; L-CPT I Mus musculus | Cpt1a
241 | 96057_at | 6753036 | aldehyde dehydrogenase 2, mitochondrial Mus musculus | Aldh2
242 | 96058_s_at | 6753036 | aldehyde dehydrogenase 2, mitochondrial Mus musculus | Aldh2
243 | 92800_i_at | 11602916 | ATP synthase, H+ transporting, mitochondrial F1 complex, gamma polypeptide 1; F1 | Atp5c1
244 | 100617_at | 22094075 | solute carrier family 25, member 5; adenine nucleotide translocator 2, | Slc25a5
245 | 100618_f_at | 22094075 | solute carrier family 25, member 5; adenine nucleotide translocator 2, | Slc25a5
246 | 97207_f_at | 6678760 | lysophospholipase 1; phospholipase 1a; lysophopholipase 1 Mus musculus | Lypla1
247 | 98473_at | 6753110 | arginase type II Mus musculus | Arg2
248 | 98112_r_at | 31981147 | leucine aminopeptidase 3; leucine aminopeptidase Mus musculus | Lap3
249 | 100633_at | 19526848 | RIKEN cDNA 2810484M10 Mus musculus | 2810484M10Rik
250 | 92848_at | 8393866 | ornithine aminotransferase Mus musculus | Oat
251 | 104007_at | 6754952 | solute carrier family 25 (mitochondrial carrier; ornithine transporter), member | Slc25a15
252 | 96336_at | 13385454 | glycine amidinotransferase (L-arginine:glycine amidinotransferase) Mus | Gatm
253 | 93595_at | 6753448 | ceroid-lipofuscinosis neuronal 2 Mus musculus | Cln2
254 | 104153_at | 9789985 | isovaleryl coenzyme A dehydrogenase; isovaleryl dehydrogenase precursor Mus | Ivd
255 | 94005_at | 20822904 | RIKEN cDNA 3110004O18 Mus musculus | 3110004O18Rik
256 | 103881_at | 22203753 | inorganic pyrophosphatase 2 Mus musculus | 1110013G13Rik
257 | 101944_at | 6678760 | lysophospholipase 1; phospholipase 1a; lysophopholipase 1 Mus musculus | Lypla1
258 | 101945_g_at | 6678760 | lysophospholipase 1; phospholipase 1a; lysophopholipase 1 Mus musculus | Lypla1
259 | 101946_at | 6678760 | lysophospholipase 1; phospholipase 1a; lysophopholipase 1 Mus musculus | Lypla1
260 | 99112_at | 7305501 | solute carrier family 25 (mitochondrial carrier; dicarboxylate transporter), | Slc25a10
261 | 99521_at | 6753022 | adenylate kinase 4 Mus musculus | Ak4
262 | 96069_at | 27659728 | aldo-keto reductase family 7, member A5 (aflatoxin aldehyde reductase); | Afar
263 | 96231_at | 21624609 | RIKEN cDNA 2010012D11 Mus musculus | 2010012D11Rik
264 | 97525_at | 6680139 | glycerol kinase Mus musculus | Gyk
265 | 102192_r_at | 31982720 | SA rat hypertension-associated homolog Mus musculus | Sah
266 | 93435_at | 6753572 | cytochrome P450, family 24, subfamily a, polypeptide 1; cytochrome P450, 24; | Cyp24
267 | 99959_at | 6753022 | adenylate kinase 4 Mus musculus | Ak4
268 | 98123_at | 6754408 | kynurenine aminotransferase II Mus musculus | Kat2
269 | 96629_at | 14861848 | DNA segment, Chr 7, Roswell Park 2 complex, expressed; androgen regulated gene | D7Rp2e
270 | 92869_at | 6680291 | hydroxysteroid dehydrogenase-4, delta-3-beta; 3-beta-hydroxysteroid | Hsd3b4
271 | 96910_at | 22122743 | hypothetical protein MGC37245 Mus musculus | MGC37245
272 | 96938_at | 19482166 | kidney expressed gene 1 Mus musculus | Keg1
273 | 95588_at | 6678766 | alpha-methylacyl-CoA racemase; alpha-methylacyl-Coenzyme A racemase; | Amacr
274 | 97316_at | 31541815 | L-specific multifunctional beta-oxidation protein Mus musculus | 1300002P22Rik
275 | 97258_at | 21703764 | lactamase, beta 2 Mus musculus | Cgi-83-pending
276 | 97257_at | 21703764 | lactamase, beta 2 Mus musculus | Cgi-83-pending
277 | 96048_at | 6680277 | heat-responsive protein 12 Mus musculus | Hrsp12
278 | 103389_at | 31980703 | aminoadipate-semialdehyde synthase; lysine oxoglutarate reductase, saccharopine | Aass
279 | 100967_at | 6755548 | solute carrier family 27 (fatty acid transporter), member 2; very long-chain | Slc27a2
280 | 96678_at | 13507612 | NADPH-dependent retinol dehydrogenase/reductase Mus musculus | D14Ucla2
281 | 92492_at | 23956104 | adenylate kinase 3 alpha-like; adenylate kinase 3 alpha like Mus musculus | Ak3
282 | 99659_r_at | 12963697 | RIKEN cDNA 1110025H10 Mus musculus | 1110025H10Rik
283 | 102761_at | 29789124 | GrpE-like 2, mitochondrial Mus musculus | Grpel2
284 | 94961_at | 6753454 | caseinolytic protease X Mus musculus | Clpx
285 | 103354_at | 10181116 | mitochondrial ribosomal protein S31; islet mitochondrial antigen, 38 kD Mus | Mrps31
286 | 93506_at | 19526818 | solute carrier family 25 (mitochondrial carrier; phosphate carrier), member 3; | Slc25a3
287 | 93495_at | 7948999 | peroxiredoxin 4; antioxidant enzyme AOE372; Prx IV Mus musculus | Prdx4
288 | 97477_at | 7305579 | translocase of inner mitochondrial membrane 8 homolog b Mus musculus | Timm8b
289 | 94515_at | 31981549 | sulfide quinone reductase-like; flavo-binding protein; sulfide | Sqrdl
290 | 95660_at | 12963539 | HSCO protein Mus musculus | 0610025L15Rik
291 | 92807_at | 6755911 | thioredoxin 1; thioredoxin Mus musculus | Txn1
292 | 93749_at | 27804325 | monoamine oxidase A Mus musculus | Maoa
293 | 93984_at | 31982864 | ATPase inhibitor Mus musculus | Atpi
294 | 96849_at | 7305577 | translocase of inner mitochondrial membrane 8 homolog a Mus musculus | Timm8a
295 | 104283_at | 31981207 | RIKEN cDNA 4432405K22 Mus musculus | 4432405K22Rik
296 | 99513_at | 6679369 | phospholipase A2, group IVA (cytosolic, calcium-dependent); phospholipase A2, | Pla2g4a
297 | 100957_at | 20916351 | single-stranded DNA binding protein 1 Mus musculus |
298 | 99335_at | 6754206 | hexokinase 1; downeast anemia Mus musculus | Hk1

Row | Probe Set | Protein Exemplar | Description | Symbol
299 | 92735_at | 7242175 | phospholipase A2, group IIA (platelets, synovial fluid); modifier of Min1; | Pla2g2a
300 | 98902_at | 21312028 | RIKEN cDNA 1110006I11 Mus musculus | 1110006I11Rik
301 | 94284_at | 19745150 | diaphorase 1 (NADH) Mus musculus | Dia1
302 | 92792_at | 31543920 | uncoupling protein 2, mitochondrial Mus musculus | Ucp2
303 | 95676_at | 18700024 | isocitrate dehydrogenase 3, beta subunit; isocitrate dehydrogenase 3 beta; | N14A
304 | 99176_at | 10946808 | fibroblast growth factor (acidic) intracellular binding protein; aFGF | Fibp
305 | 98613_at | 21313080 | RIKEN cDNA 2700085E05 Mus musculus | 2700085E05Rik
306 | 96641_at | 6754870 | neighbor of Cox4 Mus musculus | Noc4
307 | 97825_at | 11528520 | p53 apoptosis effector related to Pmp22; p53 apoptosis-associated target Mus | Perp-pending
308 | 92860_at | 6680993 | cytochrome c oxidase, subunit VIIIa; COX VIII-L Mus musculus | Cox8a
309 | 99172_at | 6678303 | transcription factor A, mitochondrial Mus musculus | Tfam
310 | 99836_at | 20867579 | cytochrome P450, 40 (25-hydroxyvitamin D3 1 alpha-hydroxylase) Mus musculus | Cyp40
311 | 103043_at | 6754760 | mature T-cell proliferation 1 Mus musculus | Mtcp1
312 | 104102_at | 31980991 | protease, serine, 25; serine protease OMI Mus musculus | Prss25
313 | | 28077029 | RIKEN cDNA 9130022B02 Mus musculus | 9130022B02Rik
314 | | 13384766 | RIKEN cDNA 1110021D01 Mus musculus | 1110021D01Rik
315 | 100300_at | 31542440 | cytochrome b-245, beta polypeptide Mus musculus | Cybb
316 | 99114_r_at | 13385090 | cytochrome c oxidase, subunit VIb Mus musculus | 2010000G05Rik
317 | | 6753200 | BCL2/adenovirus E1B 19 kDa-interacting protein 3-like; BCL2/adenovirus E1B 19 | Bnip3l
318 | 92768_s_at | 33859502 | aminolevulinic acid synthase 2, erythroid; erythroid-specific ALAS; | Alas2
319 | 100414_s_at | 6754732 | myeloperoxidase Mus musculus | Mpo
320 | 92595_r_at | 20452466 | ferrochelatase Mus musculus | Fech
321 | 98505_at | 6681007 | coproporphyrinogen oxidase; clone 560 Mus musculus | Cpo
322 | 98506_r_at | 6681007 | coproporphyrinogen oxidase; clone 560 Mus musculus | Cpo
323 | 104234_at | 31981257 | mitochondrial ribosomal protein S25 Mus musculus | Mrps25
324 | | 21313024 | solute carrier family 25 (mitochondrial deoxynucleotide carrier), member 19 Mus | Slc25a19
325 | | 13507712 | sphingosine-1-phosphate phosphatase 1; sphingosine-1-phosphate phosphatase Mus |
326 | 101557_at | 6753164 | branched chain ketoacid dehydrogenase kinase; branched chain keto acid | Bckdk
327 | 100443_at | 33859514 | branched chain aminotransferase 2, mitochondrial Mus musculus | Bcat2
328 | | 27229283 | small fragment nuclease Mus musculus | Smfn
329 | 102058_at | 29789253 | mitochondrial ribosomal protein L9 Mus musculus | Mrpl9
330 | 103045_at | 6754760 | mature T-cell proliferation 1 Mus musculus | Mtcp1
331 | | 6753198 | BCL2/adenovirus E1B 19 kDa-interacting protein 1, NIP3; BCL2/adenovirus E1B 19 | Bnip3
332 | | 7304999 | deoxyguanosine kinase Mus musculus | Dguok
333 | | 14916467 | inositol polyphosphate-5-phosphatase E; inositol polyphosphate-5-phosphatase, 72 | Inpp5e
334 | 102659_at | 31560609 | ceroid lipofuscinosis, neuronal 3, juvenile (Batten, Spielmeyer-Vogt disease) | Cln3
335 | 94541_at | 21314826 | NADH:ubiquinone oxidoreductase B15 subunit Mus musculus | 0610006N12Rik
336 | 97368_at | 6753030 | A-kinase anchor protein 1; A kinase anchor protein Mus musculus | Akap1
337 | 96745_at | 31542559 | dihydrolipoamide S-acetyltransferase (E2 component of pyruvate dehydrogenase | Dlat
338 | 95607_at | 10946984 | START domain containing 3; es64 protein; steroidogenic acute regulatory protein | Stard3
339 | 101407_at | 6679863 | frataxin Mus musculus | Frda
340 | 95896_at | 6680991 | cytochrome c oxidase, subunit VIIc; cytochrome c oxidase subunit VIIc Mus |
341 | 101356_at | 10835111 | thymidine kinase 2, mitochondrial; thymidine kinase 2 Mus musculus |
342 | 100059_at | 22094077 | cytochrome b-245, alpha polypeptide; cytochrome beta-558; p22 phox Mus | Cyba
343 | 93536_at | 6680770 | Bcl2-associated X protein Mus musculus | Bax
344 | 101036_at | 13324686 | translocase of outer mitochondrial membrane 20 homolog Mus musculus | 1810060K07Rik
345 | 97472_at | 29789024 | solute carrier family 25 (mitochondrial carrier; peroxisomal membrane protein), | Slc25a17
346 | 92494_at | 6753058 | annexin A10 Mus musculus | Anxa10
347 | 96028_at | 9055178 | brain protein 44-like; apoptosis-regulating basic protein Mus musculus | Brp44l
348 | 94254_at | 7304963 | chloride intracellular channel 4 (mitochondrial) Mus musculus | Clic4
349 | 94255_g_at | 7304963 | chloride intracellular channel 4 (mitochondrial) Mus musculus | Clic4
350 | 94256_at | 7304963 | chloride intracellular channel 4 (mitochondrial) Mus musculus | Clic4
351 | 99141_at | 6806917 | GM2 ganglioside activator protein Mus musculus | Gm2a
352 | 101055_at | 6679437 | protective protein for beta-galactosidase Mus musculus | Ppgb
353 | 92633_at | 11968166 | cathepsin Z preproprotein; cathepsin Z precursor; cathepsin X Mus musculus | Ctsz
354 | 102328_at | 20847456 | caspase 8 Mus musculus | Casp8
355 | 103608_at | 22267456 | RIKEN cDNA 2810431B21 Mus musculus | 2810431B21Rik

Row | Probe Set | Protein Exemplar | Description | Symbol
356 | 93699_at | 7657467 | polymerase (DNA directed), gamma 2, accessory subunit; mitochondrial polymerase | Polg2
357 | 96287_at | 21281687 | deoxyuridine triphosphatase Mus musculus | Dutp
358 | 94283_at | 13385752 | mitochondrial ribosomal protein L49; neighbor of fau 1 Mus musculus | Mrpl49
359 | 98547_at | 6755360 | mitochondrial ribosomal protein S12; ribosomal protein, mitochondrial, S12; | Mrps12
360 | 93251_at | 6679066 | 4-nitrophenylphosphatase domain and non-neuronal SNAP25-like protein homolog 1 | Nipsnap1
361 | 100589_at | 21313262 | inner membrane protein, mitochondrial Mus musculus | Immt
362 | 104132_at | 6754870 | neighbor of Cox4 Mus musculus | Noc4
363 | 94368_at | 31088872 | suppressor of var1, 3-like 1 Mus musculus |
364 | 96036_at | 13384998 | tetratricopeptide repeat domain 11 Mus musculus | 2010003O14Rik
365 | 100335_at | 6680758 | ATPase, Cu++ transporting, beta polypeptide; Wilson protein; toxic milk Mus | Atp7b
366 | 103683_at | 9910194 | dihydroorotate dehydrogenase Mus musculus | Dhodh
367 | 97256_at | 27228985 | RIKEN cDNA 2410011G03 Mus musculus | 2410011G03Rik
368 | 102031_at | 6755334 | ribonuclease H1 Mus musculus | Rnaseh1
369 | 96906_at | 18079334 | ethanol induced 6 Mus musculus | Etohi6
370 | 93561_at | 27754146 | RIKEN cDNA 0710001P09 Mus musculus | 0710001P09Rik
371 | 94962_g_at | 6753454 | caseinolytic protease X Mus musculus | Clpx
372 | 98433_at | 31542228 | BH3 interacting domain death agonist Mus musculus | Bid
373 | 96904_at | 30794474 | mitochondrial ribosomal protein S7; ribosomal protein, mitochondrial, S7 Mus | Mrps7
374 | 103386_at | 18875408 | peroxisomal acyl-CoA thioesterase 1 Mus musculus | Pte1
375 | 93355_at | 6754036 | glutamate oxaloacetate transaminase 2, mitochondrial; mitochondrial aspartate | Got2
376 | 98139_at | 6755963 | voltage-dependent anion channel 1 Mus musculus | Vdac1
377 | 95738_at | 24025659 | pyrroline-5-carboxylate synthetase; glutamate gamma-semialdehyde synthetase Mus | Pycs
378 | 98298_at | 6753676 | dihydropyrimidinase-like 2; collapsin response mediator protein 2 Mus musculus | Dpysl2
379 | 95603_at | 20070408 | glycine decarboxylase Mus musculus | D19Wsu57e
380 | 97993_at | 6678519 | uroporphyrinogen III synthase; URO-synthase; uroporphyrinogen-III synthase; | Uros
381 | 99159_at | 19527310 | peptidylprolyl isomerase F (cyclophilin F); peptidyl-prolyl cis-trans isomerase; | AW457192
382 | 98118_at | 9506911 | NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 1 (7.5 kD, MWFE); NADH | Ndufa1
383 | 98106_at | 19705563 | translocator of inner mitochondrial membrane 44 Mus musculus | Timm44
384 | 103625_at | 16905099 | AFG3 (ATPase family gene 3)-like 1 Mus musculus | Afg3l1
385 | 92497_at | 9790129 | solute carrier family 22 member 4; solute carrier family (organic cation | Slc22a4
386 | 93385_at | 6679146 | nth (endonuclease III)-like 1; thymine glycol DNA glycosylase/AP lyase Mus | Nthl1

[0352]

TABLE 7 The 643 genes in the mitochondria expression neighborhood. For each gene, the Affymetrix probe-set ID, neighborhood index (Noo), protein exemplar (if the gene was in mito-A), gene Symbol, description, and electronic annotations are provided. Electronic mito-A Gene Annotations Probe Set Noo Exemplar Symbol Title INTERPRO PFAM 972O1 s at 69 13386100 2900002J19Rik RIKEN cDNA 2900002J19 gene – 102561 at 68 92.574 at 68 27229021 3110001M13Rik RIKEN cDNA 3110001 M13 gene – 96321 at 68 13384720 1010001N11 Rik RIKEN cDNA 1010001N11 gene – 99128 at 68 20070412 Atp5o ATP synthase, H+ transporting, IPROOO711 H--- OSCP ATP mitochondrial F1 complex, O transporting two- synthase delta Subunit sector ATPase, delta (OSCP) subunit; 5.6e-64 (OSCP) subunit 10O892 at 67 3198O802 Ndlfalf1 NADH dehydrogenase (ubiquinone) 1 alpha Subcomplex, assembly factor 1 102000 f at 67 22267442 1500004O06Rik RIKEN cDNA 1500004O06 gene IPROO1431 // Peptidase M16 || Insulinase-like Insulinase (Peptidase peptidase, family family M16); 1.5e-40 US 2007/0203083 A1 Aug. 30, 2007 74


Probe Set Exemplar Symbol Title NTERPRO PFAM M16 if IPROO1478 PDZ.DHRf GLGF domain 93764 at 67 12963633 Grim 19-pending genes associated with retinoid IFN-induced mortality 19 96112 at 67 3.1981826 Etfa electron transferring flavoprotein, PROO1308 ETF alpha i? Electron alpha polypeptide Electron transfer transfer flavoprotein flavoprotein, alpha alpha subuni: 3.5e-149 Subunit 96611 at 67 2010012C24Rik RIKEN cDNA 2010012C24 gene 97502 at 67 3.198.2856 Dd dihydrolipoamide dehydrogenase PROO1327 FAD pyr redox dim if dependent pyridine Pyridine nucleotide nucleotide-disulphide disulphide oxidoreductase oxidored; 2.5e-61 ff. PROO4O99 pyr redox i? Pyridine Pyridine nucleotide nucleotide-disulphide disulphide oxidored; 1.2e-92 oxidoreductase dimerisation domain if IPROOO815 Mercuric reductase PROO1100 Pyridine nucleotide disulphide oxidoreductase, class I 991.06 at 67 Copso COP9 (constitutive PROO3640 Mov34 photomorphogenic) homolog, amily, subtype 2 / subunit 6 (Arabidopsis thaliana) PROOOSSS if Mov34 amily 99618 at 67 13385112 0710008D09Rik RIKEN cDNA 0710008D09 gene 100753 at 66 668.0748 Atp5a1 ATP synthase, H+ transporting, PROOS294 if ATP ATP-synt ab N / ATP mitochondrial F1 complex, alpha synthase F1, alpha synthase alphabeta Subunit, isoform 1 Subunit IPROOO793 family, beta-ba; 8.4e-19 H+-transporting fif ATP-synt ab if o-sector ATPase, ATP synthase phabeta subunit, alphabeta family, -terminal nucleot: 3e-162 fif PROO41OO H--- ATP-synt ab C / ATP transporting two synthase alphabeta sector ATPase, chain, C termin; 4e-37 alpha/beta subunit, N erminal PROOO790 H--- transporting two sector ATPase, alpha subunit, C-terminal f/f PROOO194 H--- transporting two sector ATPase, alphabeta subunit, central region 102228 at 66 Lat linker for activation of T cells 92.581 at 66 6680618 Acadm acetyl-Coenzyme A IPROO6089 / Acyl Acyl-CoA dh M / dehydrogenase, medium chain CoA dehydrogenase Acyl-CoA // IPROO6092 / Acyl dehydrogenase, CoA dehydrogenase, middle domain; 3.1e-66 
N-terminal fi, Acyl-CoA dh i? IPRO06091 || Acyl Acyl-CoA CoA dehydrogenase, dehydrogenase, C middle domain terminal doma; 4.5e-68 IPROO6090 / Acyl fi, Acyl-CoA dh N if CoA dehydrogenase, Acyl-CoA C-terminal dehydrogenase, N terminal doma: 2.1e–53 94.912 at 66 17505220 Mrps21 mitochondrial ribosomal protein IPROO1911 if S21 Ribosomal protein S21 US 2007/0203083 A1 Aug. 30, 2007 75


Probe Set Exemplar Symbol Title NTERPRO PFAM 97.307 f at 27754,144 Ndlfb5 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex, 5 97914 at heat shock protein, A PROO2O48 HSP70 // Hsp70 Calcium-binding EF protein: 0 hand IPRO01023 Heat shock protein Hsp70 99666 at 66 1338S942 Cs citrate synthase PROO2020 Citrate citrate synt Citrate synthase synthase; 4.4e-233 100079 at 65 297891.48 Ndufb9 NADH dehydrogenase (ubiquinone) 1 beta Subcomplex, 9 93991 at 65 Mor1 malate dehydrogenase, PROO1236 ldh C lactatemalate mitochondrial Lactate/malate dehydrogenase, dehydrogenase alpha/beta C-t; 2e-72 PROO1252 Malate . . Idh lactatemalate dehydrogenase dehydrogenase, NAD binding do; 3.1e–73 94461 at 65 Pbef-pending pre-B-cell colony-enhancing factor PROO2O88 if Protein prenyltransferase, alpha Subunit 94907 f at 65 RIKEN cDNA 1110001.JO3 gene 95053 S at 65 RIKEN cDNA 0710008N11 gene PROO6058 2Fe- 2S fer2 2Fe-2S iron erredoxin Sulfur cluster binding PROO14SO 4Fe 4S domain; 0.057 erredoxin, iron-sulfur binding domain fif PROO1041 Ferredoxin PROO4489 Succinate dehydrogenase? 
fumarate reductase iron Sulfur protein 95072 at 65 1338SOO6 Cycl cytochrome c-1 PROO2326 Cytochrome C1 if Cytochrome c1 (if Cytochrome C1 PROOO345 family; 6.4e-165 Cytochrome cheme binding site 98132 at 65 Cycs cytochrome c, somatic cytochrome c if Cytochrome c; 3.9e-38 99140 at 65 Mrpl16 mitochondrial ribosomal protein PROOO114 if Ribosomal L16 L16 Ribosomal protein Ribosomal protein L16 L16; 1.9e-07 92799 g at 64 116O2916 ATP synthase, H+ transporting, PROOO131 H--- ATP-synt / ATP mitochondrial F1 complex, gamma transporting two synthase; 6.9e-132 polypeptide 1 sector ATPase, gamma Subunit 93119 at 64 6753500 CoxSb cytochrome c oxidase, subunit Vb PROO2124 if COX5B // Cytochrome Cytochrome c c oxidase subunit oxidase, subunit Vb Vb; 2.4e–58 93562 at 64 1338SOS4 27OOO331.6Rik RIKEN cDNA 2700033I16 gene 94080 at 64 20908717 Scha Succinate dehydrogenase PROO3952 complex, Subunit A, flavoprotein Fumarate (Fp) reductase, succinate dehydrogenase, FAD-binding site fif PROO1327 FAD dependent pyridine nucleotide-disulphide oxidoreductase PROO4112 Fumarate reductase, succinate dehydrogenase flavoprotein, C erminal PROO11OO . Pyridine nucleotide US 2007/0203083 A1 Aug. 30, 2007 76

TABLE 7-continued The 643 genes in the mitochondria expression neighborhood. For each gene, the Affymetrix probe-set ID, neighborhood index (Noo), protein exemplar (if the gene was in mito-A), gene Symbol, description, and electronic annotations are provided. Electronic mito-A Gene Annotations Probe Set Noo Exemplar Symbol Title INTERPRO PFAM disulphide oxidoreductase, class I IPROO3953 Fumarate reductase, succinate dehydrogenase flavoprotein, N terminal 95058 f at 64 21312594 26102OSH19Rik RIKEN cDNA 2610205H19 gene IPROOS336 Protein UPFOO41 if of unknown function Uncharacterised UPFOO41 protein family (UPF0041); 1.5e-33 95132 r at 64 13386096 1810011 OO1Rik RIKEN cDNA 1810011OO1 gene 96291 f at 64 ESTs, Highly similar to NUMM MOUSE NADH ubiquinone oxidoreductase 13 kDa A subunit (Complex I-13 KD A) (CI-13 KD-A) M. musculus 96.899 at 64 Ndufs3 NADH dehydrogenase PROO1268 NADH (ubiquinone) Fe—S protein 3 dehydrogenase (ubiquinone), 30 kDa Subunit 96909 at 64 277S4007 261OOO3B19Rik RIKEN cDNA 2610003B19 gene PROO3231 || Acyl pp-binding if carrier protein (ACP) Phosphopantetheine if IPROO2O48 attachment site: 1.6e-17 Calcium-binding EF hand IPROO6162 Phosphopantetheine attachment site PROO6163 Phosphopantetheine binding domain 97869 at 64 21313290 061OO1 OI2ORik RIKEN cDNA 061001OI20 gene PROOO 103 Pyridine nucleotide disulphide oxidoreductase, class-II 100432 f at 63 Mdf MyoD family inhibitor 100628 at 63 Ndufc1 NADH dehydrogenase (ubiquinone) 1, Subcomplex unknown, 1 101525 at 63 O610011B04Rik RIKEN cDNA 0610011 B04 gene 101989 at 63 13384794 Uqerc1 ubiquinol-cytochrome c reductase IPROO1431 Peptidase M16 || core protein 1 Insulinase-like Insulinase (Peptidase peptidase, family family M16); 2e-71 M16 93581 at 63 1338SSS8 29OOO1 OIOSRik RIKEN cDNA 290001OIO5 gene 93582 at 63 20587962 Coq7 demethyl-Q 7 IPROO4916 . 
COQ7 Ubiquinone Ubiquinone biosynthesis protein biosyntheis protein COQ7; 2.9e-107 COQ7 93815 at 63 21313618 0610041LO9Rik RIKEN cDNA 0610041L09 gene 93972 at 63 23346461 Ndufs2 NADH dehydrogenase IPROO1135 if NADH complex 1 49 Kod / (ubiquinone) Fe—S protein 2 ubiquinone Respiratory-chain oxidoreductase, NADH chain 49 kDa dehydrogenase, 4; 3.2e-205 94.078 at 63 RIKEN cDNA 111002OP15 gene 94216 at 63 RIKEN cDNA 0610010E03 gene IPROOO701 Sdh cyt Succinate Succinate dehydrogenase dehydrogenase, cytochrome b cytochrome b subunit subunit; 1.6e-44 94526 at 63 19527228 D10Etc214e DNA segment, Chr 10, ERATO Doi 214, expressed 94566 at 63 G2a-pending G protein-coupled receptor G2A IPROOS388 G2A 7tm 1 7 lysophosphatidylcholine transmembrane receptor fif receptor (rhodopsin IPROOO276 family); 6.7e-38 US 2007/0203083 A1 Aug. 30, 2007 77


Probe Set Exemplar Symbol Title INTERPRO PFAM Rhodopsin-like GPCR Superfamily 95517 at 63 BCOO4004 cDNA sequence BC004004 95652 at 63 Ndufat NADH dehydrogenase (ubiquinone) 1 alpha Subcomplex, 7 (B14.5a) 96.042 at 63 Sod2 Superoxide dismutase 2, IPROO1189 sodife C mitochondrial Manganese and iron Ironfmanganese Superoxide Superoxide dismutase dismutases, C term; 1.8e-77 sodife if Ironfmanganese Superoxide dismutases, alpha-; 1.5e-47 96082 at 63 Mrpl30 mitochondrial ribosomal protein IPROOO517 L30 Ribosomal protein L30 96267 at 63 19526814 Ndufv1 NADH dehydrogenase IPROO1949 Complex1 51K (ubiquinone) flavoprotein 1 Respiratory-chain Respiratory-chain NADH NADH dehydrogenase dehydrogenase, 51 kDa 51; 5.4e-183 Subunit 96292 r at 63 ESTs, Highly similar to NUMMMOUSE NADH ubiquinone oxidoreductase 13 kDa A subunit (Complex I-13 KD A) (CI-13 KD-A) M. musculus 96900 at 63 AI26.7078 expressed sequence AI26.7078 96913 at 63 21704100 493O479F1SRik RIKEN cDNA 4930479F15 gene IPROO2155 thiolase C Thiolase, Thiolase C-terminal IPROOO408 . 
domain; 1.1e–78 Regulator of thiolase Thiolase, N terminal domain; 1.4e-131 condensation, RCC1 96915 f at 63 21539587 101OOO1M12Rik RIKEN cDNA 1010001 M12 gene 97.874 at 63 338S9744 1SOOO32D16Rik RIKEN cDNA 1500032D16 gene 99150 at 63 Ict1 immature colon carcinoma PROOO352 Class I RF-1 / Peptidyl-tRNA transcript 1 peptide chain release hydrolase domain; 7e-30 actor domain 93029 at 62 6680345 Idh3g isocitrate dehydrogenase 3 PROO1804 if isodh (NAD+), gamma Socitratefisopropylmalate Isocitratefisopropylmalate dehydrogenase dehydrogenase; 4.7e-85 if IPROO4434 Socitrate dehydrogenase NAD dependent, mitochondrial 93844 at 62 21539585 Uqcrb ubiquinol-cytochrome c reductase PROO4205 || Ucro UcrCR / Ucro binding protein amily family; 1.9e-45 94.005 at 62 20822904 31.10004O18Rik RIKEN cDNA 3110004O18 gene PROO1431 if insulinase-like peptidase, family M16 95472 f at 62 13385726 221041SM14Rik RIKEN cDNA 2210415M14 gene PROO3197 UCR 14 kD Cytochrome bd Ubiquinol-cytochrome ubiquinol oxidase, 14 kDa C reductase complex Subunit 14k; 4.3e-58 96268 at 62 9845299 Suclg1 Succinate-CoA ligase, GDP PROOS811 ATP ligase-CoA CoA forming, alpha Subunit citrate lyase, succinyl ligase; 3.9e-65 ft/ CoA ligase if CoA binding CoA PROOS810 if binding domain; 5e-72 Succinyl-CoA ligase, alpha Subunit PROO3781 CoA Binding Domain 96626 at 62 2737OO92 23 OOOO2GO2Rik RIKEN cDNA 2300002G02 gene GTP EFTU D2 Elongation factor Tu US 2007/0203083 A1 Aug. 30, 2007 78

TABLE 7-continued The 643 genes in the mitochondria expression neighborhood. For each gene, the Affymetrix probe-set ID, neighborhood index (Noo), protein exemplar (if the gene was in mito-A), gene Symbol, description, and electronic annotations are provided. Electronic mito-A Gene Annotations Probe Set Noo Exemplar Symbol Title INTERPRO PFAM domain 2: 3.2e-24 f/f GTP EFTU D3 Elongation factor Tu C-terminal domain; 6.1e-41 ff GTP EFTU Elongation factor Tu GTP binding domain; 1.4e–89 96652 at 62 Mrpl28 mitochondrial ribosomal protein L28 98102 at 62 6679.261 Pha1 pyruvate dehydrogenase E1 alpha 1 PROO1017 if E1 dehydrog if Dehydrogenase, E1 Dehydrogenase E1 component component; 3.6e-183 102749 at 61 6753504 Cox7a.1 cytochrome c oxidase, PROO3177 COX7a / Cytochrome subunit VIIa 1 Cytochrome c c oxidase subunit oxidase, subunit VIIa VIIa; 7.4e-56 103001 at 61 Vegfb vascular endothelial growth PROO2400 Growth PDGF Patelet factor B actor, cystine knot derived growth factor PROOOO72 (PDGF); 4.3e-20 Platelet-derived growth factor (PDGF) 93455 s at 61 Bmp4 bone morphogenetic protein 4 PROO1111 TGF-beta Transforming growth Transforming growth actor beta (TGFb), factor beta like: 1.8e-62 N-terminal PROO1839 TGFb propeptide if Transforming growth TGF-beta actor beta (TGFb) propeptide; 2.4e–95 935O1 f at 61 Sucla2 Succinate-Coenzyme Aligase, PROOS811 ATP ADP-forming, beta subunit citrate lyase, succinyl CoA ligase if PROOS809 if Succinyl-CoA synthetase, beta Subunit IPROO3135 // ATP-dependent carboxylate-amine ligase-like, ATP grasp 94.062 at 61 209.00762 Ndlify2 NADH dehydrogenase IPROO2O23 NADH (ubiquinone) flavoprotein 2 dehydrogenase (ubiquinone), 24 kDa Subunit 94.806 at 61 18152793 Plb pyruvate dehydrogenase IPROOS476 transketolase C (lipoamide) beta Transketolase, C Transketolase, C terminal terminal domain; 4.1e-55 IPROO5475 if transket pyr Transketolase, Transketolase, central region pyridine binding domai; 1.5e-73 95698 at 61 13.385322 111OOO2H1SRik RIKEN cDNA 
11100O2H15 gene 99593 at 61 19527334 NdlfSS NADH dehydrogenase (ubiquinone) Fe—S protein 5 100307 at 60 Mits musculus 4 days neonate male adipose cDNA, RIKEN full length enriched library, clone: B430214H24 product: nuclear factor IX, full insert sequence. 102097 f at 60 21539587 101OOO1M12Rik RIKEN cDNA 1010001 M12 gene 103406 at 60 2410004JO2Rik RIKEN cDNA 2410004JO2 gene IPROO3593 . AAA ATP-bind Conserved ATPase if hypothetical ATP IPROO4130 binding protein; 6.4e-114 Conserved hypothetical ATP binding protein US 2007/0203083 A1 Aug. 30, 2007 79

TABLE 7 -continued The 643 genes in the mitochondria expression neighborhood. For each gene, the Affymetrix probe-set ID, neighborhood index (Noo), protein exemplar (if the gene was in mito-A), gene Symbol, description, and electronic annotations are provided. Electronic mito-A Gene Annotations Probe Set Noo Exemplar Symbol Title INTERPRO PFAM 92798 at 60 11602916 Atp5c1 ATP synthase, H+ transporting, IPROOO131 H--- ATP-synt / ATP mitochondrial F1 complex, gamma transporting two synthase; 6.9e-132 polypeptide 1 sector ATPase, gamma Subunit 935O2 r at 60 Sucla2 Succinate-Coenzyme Aligase, IPROOS811 if ATP ADP-forming, beta subunit citrate lyase, succinyl CoA ligase if IPROOS809 . Succinyl-CoA synthetase, beta Subunit IPROO3135 // ATP-dependent carboxylate-amine igase-like, ATP grasp 93572 at 60 Mits musculus, clone PROO1467 fer2 2Fe-2S iron IMAGE: 1380460, mRNA Prokaryotic Sulfur cluster binding molybdopterin domain; 1.7e-11 oxidoreductase PROO1041 Ferredoxin PROOO283 if Respiratory-chain NADH dehydrogenase 75 Kd Subuni 94537 at 60 RIKEN cDNA 1500001 MO2 gene PROO2O48 efhand EF Calcium-binding EF hand: 1.3e-13 hand 94860 at 60 33468943 Timm17a translocator of inner mitochondrial PROO5678 membrane 17 kDa, a Mitochondrial import inner membrane translocase, Subunit Tim17 if IPROO3397 if Mitochondrial import inner membrane translocase, subunit Tim17,22 95483 at 60 PSmd1 proteasome (prosome, macropain) 26S subunit, non-ATPase, 1 96686 i at 60 13385436 20101OOO12Rik RIKEN cDNA 20 O100O12 gene 99658 f at 60 12963697 1110O2SH1 ORik RIKEN cDNA 11 OO25H10 gene IPROO2S29 FAA hydrolase if Fumarylacetoacetate Fumarylacetoacetate (FAA) hydrolase (FAA) hydrolase fam; 5.8e-79 996.60 f at 60 668O991 Cox7c cytochrome c oxi ase, subunit VIIc IPROO42O2 . 
COX7C / Cytochrome Cytochrome c c oxidase subunit oxidase subunit VIIc VIIc; 4e–33 101023 f at 59 O61001OE21Rik RIKEN cDNA O6 O010E21 gene 101094 at 59 Hig1-pending hypoxia induced gene 1 102022 at 59 1110007A04Rik RIKEN cDNA 11 O007A04 gene PROO4360 Glyoxalase/Bleomycin resistance protein dioxygenase domain 92615 at 59 201OOO3OO2Rik RIKEN cDNA 20 O003O02 gene 93596 at 59 13385484 2410043G19Rik RIKEN cDNA 24 0043G19 gene 95.485 at 59 668O163 Hadhsc L-3-hydroxyacyl-Coenzyme A PROO6180 3 3HCDH N 3 dehydrogenase, short chain hydroxyacyl-CoA hydroxyacyl-CoA dehydrogenase dehydrogenase, NAD PROOO2OS if NAD binding; 8.9e-105 ft/ binding site if 3HCDH 3 PROO6108 3 hydroxyacyl-CoA hydroxyacyl-CoA dehydrogenase, C dehydrogenase, C terminal: 2e-45 erminal domain if PROO6176 3 US 2007/0203083 A1 Aug. 30, 2007 80

hydroxyacyl-CoA dehydrogenase, NAD binding domain
96879_at 59 Ogdh oxoglutarate dehydrogenase IPR001017 (lipoamide) Dehydrogenase, E1 component // IPR005475 Transketolase, central region
103331_at 58 C030006K11Rik RIKEN cDNA C030006K11 gene IPR000834 Zinc carboxypeptidase A metalloprotease (M14) // IPR002086 // Aldehyde dehydrogenase
92568_at 58 Hkp1 house-keeping protein 1 IPR001737 RrnaAD Ribosomal Ribosomal RNA RNA adenine adenine dimethylase dimethylase; 8.2e-06
93531_at 58 21312012 0610033L03Rik RIKEN cDNA 0610033L03 gene
93787_f_at 58 1010001C05Rik RIKEN cDNA 1010001C05 gene
95736_at 58 Mrpl4 mitochondrial ribosomal protein L4 IPR002136 // Ribosomal L4 Ribosomal protein Ribosomal protein L4-L1e L4/L1 family; 5.1e-07
96687_f_at 58 13385436 2010100O12Rik RIKEN cDNA 2010100O12 gene
96757_at 58 20070420 D10Jhu81e DNA segment, Chr 10, Johns IPR002818 / Family DJ-1 PfpI / DJ-1/PfpI Hopkins University 81 expressed of unknown function family; 2.3e-28 ThiJ/PfpI
99166_at 58 0610012G03Rik RIKEN cDNA 0610012G03 gene
102124_f_at 57 Cox4a cytochrome c oxidase, subunit IVa IPR004203 . COX4 i? Cytochrome c Cytochrome c oxidase subunit oxidase subunit IV
95105_at 57 2010110M21Rik RIKEN cDNA 2010110M21 gene
95131_f_at 57 13386096 1810011O01Rik RIKEN cDNA 1810011O01 gene
95425_at 57 31982520 Acadl acetyl-Coenzyme A IPR006089 / Acyl Acyl-CoA dh N // dehydrogenase, long-chain CoA dehydrogenase Acyl-CoA // IPR006092 / Acyl dehydrogenase, N CoA dehydrogenase, terminal doma; 9.6e-47 N-terminal fi, Acyl-CoA dh i? IPR006091 / Acyl Acyl-CoA CoA dehydrogenase, dehydrogenase, C middle domain terminal doma: 1.2e-62 IPR006090 / Acyl fi, Acyl-CoA dh M // CoA dehydrogenase, Acyl-CoA C-terminal dehydrogenase, middle domain; 5.4e-61
96870_at 57 18079339 Aco2 aconitase 2, mitochondrial IPR000573 . aconitase // Aconitase Aconitate hydratase, family (aconitate C-terminal hydratase); 2.1e-272 IPR002155 // Aconitase C Thiolase Aconitase C-terminal IPR001030 domain; 1.8e-86 Aconitate hydratase, N-terminal
97880_at 57 21313536 4930529O08Rik RIKEN cDNA 4930529O08 gene IPR001078 biotin lipoyl // Biotin Catalytic domain of requiring components of enzyme; 1.7e-27 // 2 various oxoacid dh i? 2-oxo dehydrogenase acid dehydrogenases complexes acyltransfera; 1.8e-132 IPR003016 2-oxo acid dehydrogenase, acyltransferase component, lipoyl binding / IPR000089 // Biotin/lipoyl attachment
99471_at 57 AL022671 expressed sequence AL022671 IPR002913 / Lipid START START binding START domain; 1.5e-07

104212_at 56 21389320 3110001K13Rik RIKEN cDNA 3110001K13 gene IPR002885 PPR PPR / PPR repeat: 3e-30 repeat
92763_at 56 Abcb7 ATP-binding cassette, sub-family IPR003439 ABC B (MDR/TAP), member 7 transporter /// IPR003593 AAA ATPase // IPR001140 ABC transporter, transmembrane region
94534_at 56 18250284 isocitrate dehydrogenase 3 IPR001804 // isodh (NAD+) alpha Isocitrate/isopropylmalate Isocitrate/isopropylmalate dehydrogenase dehydrogenase; 2.5e-173 // IPR004434 Isocitrate dehydrogenase NAD dependent, mitochondrial
94780_at 56 zinc finger protein 288 IPR000210 BTB/POZ domain IPR000822 Zn finger, C2H2 type
95441_at 56 Timm23 translocase of inner mitochondrial IPR005681 membrane 23 homolog (yeast) Mitochondrial import inner membrane translocase, subunit Tim23
95690_at 56 RIKEN cDNA 1110030L07 gene
96280_at 56 31981600 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 2
96746_at 56 31542559 Dlat dihydrolipoamide S IPR004167 E3 2-oxoacid dh i? 2-oxo acetyltransferase (E2 component binding domain // acid dehydrogenases of pyruvate dehydrogenase IPR001078 . acyltransfera; 3.8e-127 complex) Catalytic domain of // e3 binding // e3 components of binding domain; 2.9e-19 various f / biotin lipoyl // dehydrogenase Biotin-requiring complexes enzyme: 3.8e-29 IPR001412 Aminoacyl-tRNA synthetase, class I IPR003016 2-oxo acid dehydrogenase, acyltransferase component, lipoyl binding / IPR000089 // Biotin/lipoyl attachment
96945_at 56 Snap23 synaptosomal-associated protein, IPR000928 SNAP SNAP-25 SNAP-25 23 kD 25 family f// family; 1.3e-24 IPR000727 / Target SNARE coiled-coil domain
101472_s_at 55 Pklr pyruvate kinase liver and red blood IPR001697 PK C / Pyruvate cell Pyruvate kinase kinase, alpha/beta domain; 5.9e-71 II PK // Pyruvate kinase, barrel domain: 1e-252
103261_at 55 G1 to S phase transition 2 IPR004160 // GTP EFTU Elongation factor Tu, Elongation factor Tu C-terminal GTP binding IPR004161 ; domain; 8.1e-93 ff Elongation factor Tu, GTP EFTU D3 domain 2 Elongation factor Tu IPR000795 C-terminal Elongation factor, domain; 1.4e-30 / GTP-binding GTP EFTU D2

Elongation factor Tu domain 2: 7.5e-11
103849_at 55 Crkl v-crk sarcoma virus CT10 IPR001452 SH3 SH2 // SH2 oncogene homolog (avian)-like domain IPR000980 domain; 5.7e-31 // SH2 motif SH3 // SH3 domain; 1.2e-20
93014_at 55 31980744 Atp5l ATP synthase, H+ transporting, mitochondrial F0 complex, subunit g
93780_at 55 13385260 0610006O17Rik RIKEN cDNA 0610006O17 gene IPR003736 DUF157 Phenylacetic acid Uncharacterized degradation-related protein PaaI, protein COG2050; 2.9e-10
94562_at 55 Gnpat glyceronephosphate O IPR002123 // Acyltransferase // acyltransferase Phospholipid glycerol Acyltransferase; 6.2e-33 acyltransferase
95611_at 55 Lpl lipoprotein lipase IPR002330 // lipase // Lipase; 1.1e-173 Lipoprotein lipase // // PLAT // IPR001024 // PLAT/LH2 Lipoxygenase, LH2 domain; 5.8e-37 domain IPR000734 fi Lipase /// IPR000379 Esterase/lipase/thioesterase, active site
95658_at 55 Murr1 U2af1-rs1 region 1
97422_at 55 2010002H18Rik RIKEN cDNA 2010002H18 gene IPR002300 . Aminoacyl-tRNA synthetase, class Ia
94279_at 54 21536220 0610008F14Rik RIKEN cDNA 0610008F14 gene IPR001469 H--- ATP-synt DE / ATP transporting two synthase, sector ATPase, Delta Epsilon chain, delta epsilon subunit long; 0.011 /// ATP synt DE N / ATP synthase, Delta Epsilon chain, beta; 4.5e-31
94908_r_at 54 1110001J03Rik RIKEN cDNA 1110001J03 gene
98130_at 54 9903609 Txn2 thioredoxin 2 IPR000063 thiored Thioredoxin type Thioredoxin; 3.4e-28 domain IPR005746 . Thioredoxin
98539_at 54 Cops2 COP9 (constitutive IPR000717 Domain PCI PCI photomorphogenic) homolog, in components of the domain; 3.4e-25 subunit 2 (Arabidopsis thaliana) proteasome, COP9 complex and eIF3 (PCI)
98929_at 54 13384742 1110018B13Rik RIKEN cDNA 1110018B13 gene
99237_at 54 U55872 cDNA sequence U55872 IPR001288 . IF3 Translation initiation factor 3 initiation factor IF 3: 2.5e-34
101045_at 53 7949047 Hadh2 hydroxyacyl-Coenzyme A IPR002198 // Short adh short short dehydrogenase type II chain chain dehydrogenase/reductase dehydrogenase; 7.4e-49 SDR IPR002347 Glucose/ribitol dehydrogenase
92625_at 53 6679078 Nme2 expressed in non-metastatic cells IPR000834 Zinc NDK Nucleoside 2, protein (NM23B) (nucleoside carboxypeptidase A diphosphate diphosphate kinase) metalloprotease kinase; 1.9e-116 (M14) // IPR001564 // Nucleoside diphosphate kinase // IPR003599 Immunoglobulin subtype /// IPR003598

Immunoglobulin C-2 type /// IPR003006 // Immunoglobulin major histocompatibility complex // IPR003596 . Immunoglobulin V type
93754_at 53 7949037 Ech1 enoyl coenzyme A hydratase 1, IPR001753 / Enoyl ECH / Enoyl-CoA peroxisomal CoA hydratase isomerase hydratase isomerase family; 1.4e-43
94829_at 53 1110020A09Rik RIKEN cDNA 1110020A09 gene
95646_at 53 carnitine palmitoyltransferase 2 IPR000542 Carn acyltransf // Acyltransferase Choline Carnitine o ChoActase/COT/CPT acyltransferase: 4.4e-289
99594_at 53 Mrpl51 mitochondrial ribosomal protein L51
100886_at 52 Mrpl45 mitochondrial ribosomal protein L45
94866_at 52 13384844 Mrps16 mitochondrial ribosomal protein IPR000307 Ribosomal S16 S16 Ribosomal protein Ribosomal protein S16 S16; 5.4e-17
94909_at 52 13384854 Mrps17 mitochondrial ribosomal protein IPR000266 , Ribosomal S17 S17 Ribosomal protein Ribosomal protein S17 S17; 0.0021
95941_at 52 AI853514 expressed sequence AI853514 IPR000569 HECT domain (Ubiquitin protein ligase)
99613_at 52 6678970 Mut methylmalonyl-Coenzyme A IPR006100 . MM CoA mutase mutase Methylmalonyl-CoA Methylmalonyl-CoA mutase subfamily /// mutase: 0 B12 IPR006159 binding B12 binding Methylmalonyl-CoA domain; 1.7e-20 mutase, C-terminal IPR006158 . Coenzyme B12 binding / IPR006099 // Methylmalonyl-CoA mutase IPR006098 // Methylmalonyl-CoA mutase, N-terminal domain
102624_at 51 Stc2 Stanniocalcin 2 IPR004978 . Stanniocalcin Stanniocalcin Stanniocalcin family; 5.7e-193
94327_at 51 Mrps18a mitochondrial ribosomal protein IPR001648 Ribosomal S18 S18A Ribosomal protein Ribosomal protein S18 S18; 0.0013
94667_at 51 ESTs
94940_at 51 31980706 Mccc1 methylcrotonoyl-Coenzyme A IPR005482 Biotin CPSase L chain carboxylase 1 (alpha) carboxylase, C Carbamoyl-phosphate terminal synthase L. IPR005481 // chain,; 2.9e-53 Carbamoyl biotin lipoyl // Biotin phosphate requiring synthetase large enzyme: 3.5e-14 /// chain, N-terminal Biotin carb C Biotin IPR001882 Biotin carboxylase C requiring enzyme, terminal domain; 1e-43 attachment site CPSase L D2 IPR000089 Carbamoyl-phosphate Biotin/lipoyl synthase L. attachment chain,; 2.2e-100 IPR005479 Carbamoyl phosphate synthase L chain, ATP-binding

96756_at 51 1110007M04Rik RIKEN cDNA 1110007M04 gene
96871_at 51 2310042G06Rik RIKEN cDNA 2310042G06 gene
98892_at 51 Lpin1 lipin 1
101867_at 50 Gpam glycerol-3-phosphate IPR002123 Acyltransferase // acyltransferase, mitochondrial Phospholipid glycerol Acyltransferase: 5.3e-36 acyltransferase
94855_at 50 6679299 Phb prohibitin IPR001107 // Band 7 Band 7 / SPFH protein / IPR000163 domain Band 7 Prohibitin family; 3.7e-61
96744_at 50 Acp6 acid phosphatase 6, IPR000560 // acid phosphat lysophosphatidic Histidine acid Histidine acid phosphatase phosphatase; 2.4e-07
96858_at 50 6755004 Pdcd8 programmed cell death 8 IPR001327 FAD pyr redox i? Pyridine dependent pyridine nucleotide-disulphide nucleotide-disulphide oxidoreducta: 2.6e-52 oxidoreductase IPR001100 // Pyridine nucleotide disulphide oxidoreductase, class I
96898_at 50 33859512 ATP synthase, H+ transporting, mitochondrial F0 complex, subunit b, isoform 1
100550_f_at 49 16716343 Cox6c cytochrome c oxidase, subunit VIc IPR004204 // COX6C / Cytochrome Cytochrome c c oxidase subunit oxidase subunit VIc VIc: 2.5e-50
103780_at 49 1700021F05Rik RIKEN cDNA 1700021F05 gene
104153_at 49 9789985 Ivd isovaleryl coenzyme A IPR006089 / Acyl Acyl-CoA dh N // dehydrogenase CoA dehydrogenase Acyl-CoA // IPR006092 / Acyl dehydrogenase, N CoA dehydrogenase, terminal doma; 4.7e-58 N-terminal fi, Acyl-CoA dh i? IPR006091 / Acyl Acyl-CoA CoA dehydrogenase, dehydrogenase, C middle domain terminal doma; 3e-55 IPR006090 / Acyl fi, Acyl-CoA dh M // CoA dehydrogenase, Acyl-CoA C-terminal dehydrogenase, middle domain: 9.3e-71
92364_at 49 Celsr2 cadherin EGF LAG seven-pass G IPR002126 laminin G // Laminin type receptor 2 Cadherin G domain; 1.2e-18 // IPR001881 EGF EGF EGF-like like calcium-binding domain; 1.6e-21 // IPR001368 GPS / Latrophilin/CL TNFR/CD27/30/40/95 1-like GPS cysteine-rich region domain; 1.3e-26 // IPR000561 EGF cadherin Cadherin like domain domain; 2.9e-209 /// IPR000742 EGF 7tm 2 7 like domain, subtype transmembrane 2 IPR000203 // receptor (Secretin GPS domain family); 1.8e-58 /// IPR000152 HRM Hormone Aspartic acid and receptor domain; 6.2e-17 asparagine hydroxylation site /// IPR002049 Laminin-type EGF like domain IPR000832 G protein coupled receptors family 2 (secretin-like) // IPR001791 Laminin G // IPR001879 Hormone receptor, extracellular

93399_at Rai2 retinoic acid induced 2
93611_at Tbx6 T-box 6 IPR001699 T-box T-box; 1.1e-125 Transcription factor, T-box IPR002070 // Transcription factor, Brachyury
94531_at 49 33859690 2310005O14Rik RIKEN cDNA 2310005O14 gene
96096_f_at 49 13195670 2610207I16Rik RIKEN cDNA 2610207I16 gene IPR002198 Short adh short short chain chain dehydrogenase/reductase dehydrogenase; 1.2e-29 SDR SCP2 SCP-2 IPR003033 Sterol sterol transfer binding / IPR002347 family; 1.5e-27 // Glucose/ribitol dehydrogenase
96261_at 49 2310028O11Rik RIKEN cDNA 2310028O11 gene
99148_at 49 33859554 Fh1 fumarate hydratase 1 IPR000362 Fumarate lyase
104710_at 48 Bak1 BCL2-antagonist/killer 1 IPR000712 // Bcl-2 fi Apoptosis Apoptosis regulator regulator proteins, Bcl Bcl-2 protein, BH fit 2 family; 2.3e-39 IPR002475 BCL2 like apoptosis inhibitor
96095_i_at 48 13195670 2610207I16Rik RIKEN cDNA 2610207I16 gene IPR002198 // Short adh short // short chain chain dehydrogenase/reductase dehydrogenase; 1.2e-29 SDR SCP2 SCP-2 IPR003033 Sterol sterol transfer binding / IPR002347 family; 1.5e-27 // Glucose/ribitol dehydrogenase
97397_at 48 D5Ertd33e DNA segment, Chr 5, ERATO Doi IPR004033 Ubie methyltran // 33, expressed UbiE/COQ5 ubiE/COQ5 methyltransferase /// methyltransferase IPR000051 // SAM family; 1.4e-116 (and some other nucleotide) binding motif IPR004034 Ubiquinone menaquinone biosynthesis methyltransferase /// IPR001601 Generic methyltransferase
103294_at 47 Rgs5 regulator of G-protein signaling 5 IPR000342 . Regulator of G protein
103646_at 47 6681009 Crat carnitine acetyltransferase IPR000542 Carn acyltransf // Acyltransferase Choline Carnitine o ChoActase/COT/CPT acyltransferase: 0
94508_at 47 1810020E01Rik RIKEN cDNA 1810020E01 gene
95939_at 47 9830126M18 hypothetical protein 9830126M18
96035_at 47 31982494 Bckdha branched chain ketoacid IPR001017 E1 dehydrog // dehydrogenase E1, alpha Dehydrogenase, E1 Dehydrogenase E1 polypeptide component component; 1.8e-162
96296_at 47 Mrpl15 mitochondrial ribosomal protein IPR001196 L15 Ribosomal protein L15
96670_at 47 21313138 0610025I19Rik RIKEN cDNA 0610025I19 gene IPR004287 2 HCCA isomerase 2 hydroxychromene-2- hydroxychromene-2- carboxylate carboxylate isomerase isomer: 1.8e-110
97796_at 47 Crsp2 cofactor required for Sp1 transcriptional activation subunit 2
98128_at 47 7949005 Atp5i ATP synthase, H+ transporting, mitochondrial F0 complex, subunit F

100527_at 46 21311867 D11Ertd99e DNA segment, Chr 11, ERATO Doi 99, expressed
101027_s_at 46 Pttg1 pituitary tumor-transforming 1
104215_at 46 RIKEN cDNA 9130025P16 gene
104767_f_at 46 mitochondrial ribosomal protein IPR001648 Ribosomal S18 S18A Ribosomal protein Ribosomal protein S18 S18; 0.0013
93346_at 46 phosphoglycerate kinase 1 IPR001576 PGK Phosphoglycerate Phosphoglycerate kinase kinase; 8.4e-296
93539_at 46 1810004D07Rik RIKEN cDNA 1810004D07 gene
95498_at 46 13384968 Mrps15 mitochondrial ribosomal protein IPR005290 . Ribosomal S15 S15 Ribosomal protein Ribosomal protein S15, bacterial S15; 1.1e-08 chloroplast and mitochondrial type /// IPR000589 Ribosomal protein S15
96947_at 46 21312004 0610009I16Rik RIKEN cDNA 0610009I16 gene IPR000049 ETF beta Electron Electron transfer transfer flavoprotein flavoprotein beta beta subunit; 3.3e-124 subunit IPR006162 Phosphopantetheine attachment site
103401_at 45 31982522 Acads acetyl-Coenzyme A IPR006089 / Acyl Acyl-CoA dh M / dehydrogenase, short chain CoA dehydrogenase Acyl-CoA // IPR006092 / Acyl dehydrogenase, CoA dehydrogenase, middle domain; 9e-64 N-terminal fi, Acyl-CoA dh N // IPR006091 / Acyl Acyl-CoA CoA dehydrogenase, dehydrogenase, N middle domain terminal doma: 1.9e-60 IPR006090 / Acyl fi, Acyl-CoA dh i? CoA dehydrogenase, Acyl-CoA C-terminal dehydrogenase, C terminal doma; 3.9e-77
104057_at 45 13277394 Grpel1 GrpE-like 1, mitochondrial IPR000740 // GrpE GrpE // GrpE; 3.8e-76 protein
95064_at 45 29126205 D18Ertd240e DNA segment, Chr 18, ERATO Doi 240, expressed
96180_at 45 Rgs5 regulator of G-protein signaling 5 IPR000342 . Regulator of G protein
96887_at 45 9506933 Np15 nuclear protein 15.6
97706_at 45 ESTs
96322_at 44 Edf1 endothelial differentiation-related IPR001387 Helix HTH 3 Helix-turn factor 1 turn-helix motif helix; 1.2e-10
98527_at 44
102193_at 43 Sah SA rat hypertension-associated IPR000873 . AMP AMP-binding / AMP homolog dependent binding enzyme; 1.2e-102 synthetase and ligase
93332_at 43 Cd36 CD36 antigen IPR002159 CD36 CD36 // CD36 antigen // IPR005428 family: 1e-208 fi Adhesion molecule CD36
93528_s_at 43 Klf9 Kruppel-like factor 9 IPR000822 Zn finger, C2H2 type
93994_at 43 Sycp3 synaptonemal complex protein 3
95730_at 43 Tce2 T-complex expressed gene 2
96676_at 43 1810049H20Rik RIKEN cDNA 1810049H20 gene
97512_at 43 21312554 2010107E04Rik RIKEN cDNA 2010107E04 gene
101078_at 42 basigin IPR003599 Immunoglobulin subtype /// IPR003006 Immunoglobulin major

histocompatibility complex
94365_at 42 1190005L05Rik RIKEN cDNA 1190005L05 gene IPR001310 Histidine triad (HIT) protein
94485_at 42 Peci peroxisomal delta3, delta2-enoyl IPR001753 / Enoyl ECH / Enoyl-CoA Coenzyme A isomerase CoA hydratase isomerase hydratase isomerase family; 3.2e-22 // // IPR000582 // Acyl ACBP / Acyl CoA CoA-binding protein, binding protein; 9.2e-41 ACBP
95056_r_at 42 Tcte1 t-complex-associated-testis IPR005334 Tctex-1 Tctex-1 Tctex-1 expressed 1-like family family; 5.5e-55
98966_at 42 6753610 Dbt dihydrolipoamide branched chain IPR004167 E3 e3 binding / e3 transacylase E2 binding domain /// binding domain; 6.3e-18 IPR001078 // 2-oxoacid dh // Catalytic domain of 2-oxoacid components of dehydrogenases various acyltransfera; 5.4e-108 dehydrogenase f / biotin lipoyl // complexes Biotin-requiring IPR003016 2-oxo enzyme; 2e-25 acid dehydrogenase, acyltransferase component, lipoyl binding / IPR000089 // Biotin/lipoyl attachment
100963_at 41 2810403H05Rik RIKEN cDNA 2810403H05 gene
102049_at 41 7305375 Pdk4 pyruvate dehydrogenase kinase, IPR005467 HATPase c // isoenzyme 4 Histidine kinase Histidine kinase-, DNA IPR004358 . gyrase B-, and Bacterial sensor HSP90; 5e-19 protein C-terminal IPR003594 ATP binding protein, ATPase-like
103319_at 41 Psmd10 proteasome (prosome, macropain) IPR002110 / Ankyrin ank // Ankyrin 26S subunit, non-ATPase, 10 repeat; 8.1e-49
93040_at 41 FXYD domain-containing ion IPR000272 ATP1 G1 PLM MAT8 transport regulator 1 ATP1 G1 PLMMAT8 // ATP1 G1 PLMMAT8 family family; 4e-35
93948_at 41 Nck2 non-catalytic region of tyrosine IPR001452 SH3 SH3 // SH3 kinase adaptor protein 2 domain IPR000980 domain; 1.4e-57 // SH2 motif SH2 // SH2 domain; 6e-29
96388_at 41 EST
98924_at 41 4930569O04Rik RIKEN cDNA 4930569O04 gene IPR000768 . NAD:arginine ADP ribosyltransferase, ART
100099_at 40 Smpd1 sphingomyelin phosphodiesterase IPR000004 // Metallophos // 1, acid lysosomal Saposin type B /// Calcineurin-like IPR004843 Metallo phosphoesterase; 6.9e-17 phosphoesterase
100756_r_at 40 Tyms-ps thymidylate synthase, pseudogene
95149_at 40 Copz1 coatomer protein complex, subunit IPR000804 Clathrin Clat adaptor S Zeta 1 adaptor complex, Clathrin adaptor Small chain complex Small chain; 3.8e-76
95695_at 40
95721_at 40 Mapkapk2 MAP kinase-activated protein IPR002290 . kinase 2 Serine Threonine protein kinase // IPR000719 Eukaryotic protein kinase