
GENETIC, EVOLUTIONARY, AND GENOMIC ANALYSIS OF

HOMOCYSTEINE AND PATHWAY REGULATION

By

TOSHIMORI KITAMI

Submitted in partial fulfillment of the requirements

For the degree of Doctor of Philosophy

Thesis Adviser: Dr. Joseph H. Nadeau

Department of Genetics

CASE WESTERN RESERVE UNIVERSITY

January, 2006

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the dissertation of

______

candidate for the Ph.D. degree *.

(signed)______(chair of the committee)

______

______

______

______

______

(date) ______

*We also certify that written approval has been obtained for any proprietary material contained therein.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS
ABSTRACT
CHAPTER 1: INTRODUCTION AND OBJECTIVES
  Homocysteine, Folate, and Human Disease Risks
    Introduction
    Box 1: Different types of clinical studies
    Non-genetic factors that affect homocysteine level
    Effect of folate on homocysteine level
    Cardiovascular disease
    Neural tube defect
    Cancer
    Alzheimer’s disease and dementia
  Genetic Mutations in Homocysteine and Folate Metabolism
    Rare mutations
    Common mutation (Mthfr C677T mutation)
    Mthfr C677T mutation and disease risks
  Mouse Models and Disease Mechanisms
  Basic Biochemistry of Homocysteine and Folate Pathways
  Regulation of Homocysteine and Folate Pathways
    Positive and negative feedback
    Polyglutamation of folate
    specific regulation
    Folate transport
    Transcriptional regulation
    Global perspective on homocysteine and folate pathway regulation
  Research Objective
CHAPTER 2: GENETIC AND PHENOTYPIC ANALYSIS OF MTHFR ACTIVITY, TAIL KINKS, AND SEIZURE SUSCEPTIBILITY IN PL/J MICE
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
CHAPTER 3: DUPLICATION, METABOLIC NETWORKING, AND GENETIC BUFFERING IN PHYSIOLOGICAL PATHWAYS OF HUMAN AND MICE
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
CHAPTER 4: GLOBAL GENE EXPRESSION AND METABOLITE PROFILING OF RESPONSES TO DIETARY FOLATE PERTURBATIONS IN TWO GENETICALLY DISTINCT INBRED MOUSE STRAINS
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
CHAPTER 5: SUMMARY AND FUTURE DIRECTIONS
  Summary
  Future Directions
APPENDIX
  Appendix 1: Nomenclature of Homocysteine and Folate
  Appendix 2: Composition of Folate Deficient and Control Diets
BIBLIOGRAPHY

LIST OF TABLES

CHAPTER 1
1.1 Summary of cohort studies on colon cancer risk and homocysteine and folate levels
1.2 Summary of association studies between Mthfr C677T and colon cancer risk
1.3 Summary of association studies between Mthfr C677T and neural tube defect risk

CHAPTER 3
3.1 Average Ka/Ks values for orthologous in various metabolic pathways in humans and mice

CHAPTER 4
4.1 Significant over-representation of terms
4.2 Effect of ApoE knockout and dietary supplements on metabolite and gene expression profiles

LIST OF FIGURES

CHAPTER 1
1.1 Homocysteine and folate pathways

CHAPTER 2
2.1 sequence alignment of Mthfr in five model organisms
2.2 X-ray photograph of a kinky tail in a PL/J mouse
2.3 Number of tests with seizures in parental, F1, and F2 mice

CHAPTER 3
3.1 Schematic for defining networks based on the structure of metabolic pathways
3.2 Distribution of Ka/Ks values for 241 genes in various metabolic pathways
3.3 Scatter plot of the Ka/Ks values and the number of tissues with gene expression from mouse and human gene expression data

CHAPTER 4
4.1 Dietary folate perturbation protocols
4.2 Metabolite profiles of serum homocysteine and folate, and total in serum and
4.3 2-dimensional plots of metabolite profiles for A/J and C57BL/6J
4.4 Average linkage hierarchical clustering of liver gene expression profile
4.5 Gene expression profile of choline kinase

ACKNOWLEDGEMENTS

My thesis is a collaborative work of many people who have influenced me on both an educational and a personal level throughout my life. I would like to thank my adviser, Dr. Joseph Nadeau, for teaching me to think independently, to write and speak effectively, and to approach science with curiosity and passion. I would also like to thank my committee members for providing a variety of views on my thesis and expanding the breadth of my scientific thinking. I also thank the members of the Nadeau Lab for their input and support throughout my graduate career.

I would like to thank my family for providing generous financial support throughout my life, for always providing the best environment for my education and personal growth, and for giving me the freedom to pursue any career of my own choosing. I would also like to thank my Cleveland family, the Tomciks and the Linnevers, for helping me get situated in this city and for inviting me to their holiday dinners. I would also like to thank my tutor from my days in Hawaii, Mrs. Murata, who taught me English as well as many valuable lessons in life. Without her work, none of my academic success would have been possible. I would also like to thank her family for their kindness and generosity, delicious meals, and fun memories.

I would also like to thank my friends in graduate school, Josephine and Ricky (a.k.a. Mr. & Mrs. Chan), Kirsten, Sheila, Lesil, Martha, Matt, Keith, Karen and Mike, and Lisa and Brian, for making my long six years enjoyable, providing me with lots of fun memories, and motivating each other to survive the pains of graduate school.

Genetic, Evolutionary, and Genomic Analysis of

Homocysteine and Folate Pathway Regulation

Abstract

By

TOSHIMORI KITAMI

Abnormalities in homocysteine and folate metabolism are associated with increased risk for several common human diseases. Both elevated serum homocysteine and low dietary folate intake increase the risk for cardiovascular diseases, neural tube defects, cancers, and neurodegeneration. Although common mutations in these pathways have been identified, they do not fully account for the variety of disease types that are associated with increased homocysteine or low folate levels. Mouse models allow us to control many of the genetic and environmental variables inherent in human studies, allowing us to dissect genes, pathways, and important for disease pathogenesis. Previous mouse studies showed that mutations in other pathways can significantly modulate functions of homocysteine and folate metabolism and modify disease phenotypes, suggesting that these pathway interactions and their regulation are crucial to understanding the role of homocysteine and folate metabolism in complex diseases. To study these pathway interactions and their regulation, I used genetic, evolutionary, and genomic approaches. I first used a unique mouse strain, PL/J, to dissect the genetic control of the key enzyme methylenetetrahydrofolate reductase (MTHFR) and its potential role in disease phenotypes. I found that the seizure and kinky tail phenotypes in these mice were polygenic and showed strong environmental contributions. Next, I used evolutionary approaches to address whether biochemical interactions in metabolic pathways buffer deleterious mutations. I found that redundant metabolic paths do not provide genetic buffering as inferred from estimates of variation in gene evolution rates. However, higher level interactions explained some of the evolutionary rate variability. Lastly, I used genomic approaches to identify pathways that modulate homeostatic responses to dietary folate perturbations in two genetically distinct inbred strains. I found striking strain differences in homeostatic responses on folate retention and global gene expression profiles. I also found that cholesterol and choline are involved in the folate perturbation response, which may play a more general role in complex disease mechanisms. Overall, these analyses revealed pathway interactions that are important for functionality of homocysteine and folate metabolism and highlighted some future strategies for dissecting the role of these pathways in complex diseases.


CHAPTER 1

Introduction and Objectives

Homocysteine, Folate, and Human Disease Risks

Introduction

Abnormalities in homocysteine and folate metabolism have been associated with a striking variety of complex human diseases. An elevated level of serum homocysteine is associated with increased risk for cardiovascular disease and Alzheimer’s disease. Low serum folate level or low dietary folate intake has also been associated with increased risk for mothers having children with neural tube defects and for colon and breast cancer. Genetic mutations in these metabolic pathways also elevate risks for these complex diseases (see Genetic Mutations in Homocysteine and Folate Metabolism).

Serum homocysteine level in this thesis refers to the total serum homocysteine, which includes protein-bound homocysteine and soluble homocysteine. Protein-bound homocysteine accounts for nearly 80% and soluble homocysteine accounts for 20% of total homocysteine (Ueland 1995). The normal range of homocysteine is 5 to 15µM (Ueland et al. 1993, Graham et al. 1997). Elevated serum homocysteine beyond the normal range (>15µM) is traditionally referred to as hyperhomocysteinemia. Hyperhomocysteinemia is further subcategorized into moderate (15-30µM), intermediate (30-100µM), and severe (>100µM) hyperhomocysteinemia (Kang et al. 1992). Many of the earlier clinical studies used these definitions in associating homocysteine level with risks for human diseases. However, recent studies have shown that elevation of homocysteine level at any range confers increased risk for human diseases, including increases within the normal range of variation in homocysteine levels.
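The traditional categories above can be written as a simple lookup. The following is a minimal sketch in Python, not part of the cited definitions: the function name `classify_homocysteine`, the label used for values below 5µM, and the assignment of the boundary values 15, 30, and 100µM to the lower category are all assumptions made for illustration.

```python
def classify_homocysteine(total_um: float) -> str:
    """Label a total serum homocysteine concentration given in micromolar."""
    if total_um < 0:
        raise ValueError("concentration cannot be negative")
    if total_um < 5:
        return "below normal range"  # assumed label, not defined in the text
    if total_um <= 15:
        return "normal"
    if total_um <= 30:               # boundary handling is an assumption
        return "moderate hyperhomocysteinemia"
    if total_um <= 100:
        return "intermediate hyperhomocysteinemia"
    return "severe hyperhomocysteinemia"
```

Because later studies found graded risk with no threshold, such labels are best read as descriptive bins rather than as cut-offs for disease risk.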

Box 1: Different types of clinical studies

Hundreds of publications identify disorders of homocysteine and folate metabolism as risk factors for complex human diseases, making these pathways among the most intriguing and medically relevant known. However, it is important to understand the different clinical research designs in order to judge the strength of the evidence before pursuing further research. Three types of research design are used to assess the effect of homocysteine and folate levels on disease risk: case-control studies, cohort studies, and randomized controlled trials (Peipert et al. 1997).

Case-control studies, also called retrospective studies, begin after patients present with a specific disease. Various measurements and medical records are collected from patients with the disease (cases) as well as from patients without the disease (controls). These studies are cost-efficient but are prone to uncontrolled biases. Unknown or immeasurable predictors of disease may be present in cases but not in controls (also called unequal baseline). More importantly, these studies cannot distinguish whether an observed parameter such as elevated homocysteine level is the cause of the disease or occurs as a result of the disease (disease marker).

Cohort studies, also called prospective studies, first measure specific parameters such as serum homocysteine level or average daily folate intake and then follow patients to monitor the onset of disease. Unlike case-control studies, this design removes some of the ambiguity about causality because the parameter precedes disease onset. It does not, however, guarantee causality. These studies cannot eliminate all biases from unknown or immeasurable predictors of disease. For example, patients with renal impairment have hypertension and also elevated serum homocysteine level. It is possible that hypertension causes cardiovascular disease, but one may associate only homocysteine with cardiovascular disease if data on renal impairment are ignored.

Randomized controlled trials hold the highest standard in clinical studies. Patient populations are first measured for baseline characteristics, and then subjects are randomly assigned to different treatment groups and followed for disease outcome. These studies are the most expensive and time-consuming of all clinical studies. However, because patients are assigned to treatment groups completely at random, the studies are not subject to uncontrolled bias. We need to make sure that these studies have adequate sample size to avoid type II errors, that is, concluding that there is no significant difference between treatment and control groups when there actually is a difference. We also need to keep in mind that the criteria used in a trial restrict how far its results can be generalized. For example, if a study only examined the recurrence of disease (repeated risk), one cannot generalize to the occurrence (first onset) of the disease.
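The sample size needed to limit type II errors can be estimated before a trial begins. The sketch below uses the standard normal-approximation formula for comparing two proportions; it is an illustration only, the function name `sample_size_per_group` is hypothetical, and the event rates are invented numbers rather than values from any trial discussed here.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate subjects needed per arm to detect p_control vs p_treated."""
    if p_control == p_treated:
        raise ValueError("the two event rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error rate
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Hypothetical event rates: 10% in controls versus 5% with treatment.
n_large_effect = sample_size_per_group(0.10, 0.05)
# A smaller hypothetical effect (10% versus 8%) needs far more subjects.
n_small_effect = sample_size_per_group(0.10, 0.08)
```

The required number of subjects grows quickly as the expected difference shrinks, which is one reason trials for rare or late-onset outcomes are expensive.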

Randomized controlled trials are often difficult to conduct, especially for late-onset diseases such as cardiovascular disease, cancer, or Alzheimer’s disease. If many case-control or cohort studies exist for a specific parameter and disease, one can perform a meta-analysis, which summarizes all of the available disease association studies conducted to date. Meta-analyses are especially useful when individual studies are too small and lack statistical power. However, the quality of a meta-analysis is still bound by the design and execution of the individual studies. A test for heterogeneity is used to assess whether study populations are similar to each other before combining data.
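One common way to combine studies and test for heterogeneity is inverse-variance (fixed-effect) pooling together with Cochran's Q statistic. The sketch below is an illustration under stated assumptions and does not reproduce the method of any meta-analysis cited in this chapter; the function name `pool_fixed_effect` is hypothetical and the three odds ratios and standard errors are invented.

```python
from math import exp, log, sqrt


def pool_fixed_effect(log_effects, std_errors):
    """Inverse-variance pooled log effect, its SE, Cochran's Q, and I^2."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    # Q sums the weighted squared deviations of each study from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    # I^2 is the share of variation beyond what chance alone would produce.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    pooled_se = sqrt(1 / sum(weights))
    return pooled, pooled_se, q, i_squared


# Hypothetical odds ratios and standard errors of their logarithms.
odds_ratios = [0.85, 0.92, 0.78]
std_errors = [0.10, 0.08, 0.15]
pooled, pooled_se, q, i_squared = pool_fixed_effect(
    [log(o) for o in odds_ratios], std_errors
)
pooled_or = exp(pooled)
```

Q is compared against a chi-square distribution with one fewer degrees of freedom than the number of studies; a large Q suggests that the studies differ by more than chance and should not be pooled without further scrutiny.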

Non-genetic factors that affect homocysteine level

Several environmental and physiological factors affect the level of homocysteine. A large cross-sectional study involving 7591 men and 8585 women between 40 and 67 years old in Norway (Nygard et al. 1995) examined the correlation between plasma homocysteine and age, sex, blood pressure, smoking, physical activity, cholesterol, triglyceride, heart rate, and vitamin intake. Among these factors, age, sex, and smoking showed strong positive correlations with homocysteine level. Males showed a 1.82µM higher homocysteine level than females, and older persons (65-67 years old) showed a 2.2µM higher homocysteine level than younger persons (40-42 years old). Heavy smokers (>20 cigarettes per day) showed a 1.91µM higher homocysteine level than non-smokers. The homocysteine level in smokers increased incrementally with the number of cigarettes smoked. These effects were still significant even after accounting for vitamin intake. The combined effect of age, sex, and smoking reached a 4.8µM difference in homocysteine. It is important to note that smoking is one of the major risk factors for cardiovascular disease and cancer.

In smaller studies, homocysteine levels were higher in postmenopausal females than in premenopausal females (Wouters et al. 1995). Homocysteine levels were also elevated two-fold in high alcohol consumers (>1.5g/kg/day) compared to non-consumers (Cravo et al. 1996). The effect of low folate, vitamin B6, and B12 intake on homocysteine levels varied depending on the individual, but generally resulted in elevated homocysteine (see below). Patients with end-stage renal disease also had high homocysteine levels, and the most prevalent cause of death in these patients was cardiovascular disease (Bostom and Lathrop 1997). Together, these findings suggest that one needs to adjust for these confounding variables when assessing the impact of homocysteine level on human disease risks.

Effect of folate on homocysteine level

Given the variety of common human diseases associated with elevated homocysteine level, decreasing homocysteine level may serve as an important disease intervention. Several studies have examined the effect of folic acid, vitamin B6, and B12 in lowering serum homocysteine level. Randomized controlled trials using folate supplements are ongoing and are discussed in each disease section.

A meta-analysis of 12 randomized controlled trials involving 1114 patients showed that folate supplements in the range of 0.5mg/day to 5mg/day resulted in a 25% reduction in serum homocysteine level (Homocysteine Lowering Trialists’ Collaboration, 1998). The percent reduction did not differ within this range of supplement. A vitamin B12 supplement of 0.5mg/day showed an additional 7% reduction in serum homocysteine. Vitamin B6 supplement did not yield significant reductions in homocysteine level.

Despite the significant reduction in homocysteine level with folate supplements, many people do not consume enough folate. However, with the introduction of folic acid fortification of grain products (which began between 1996 and 1998), women of childbearing age receive on average 80-100µg of extra folate, and middle-aged and older adults receive 70-120µg of extra folate (Jacques et al. 1999). A cohort study of patients before folate fortification (756 people) and after fortification (350 people) showed that the average plasma folate level increased by 117% (4.6ng/mL to 10ng/mL) and the average plasma homocysteine level decreased by 7% (10.1µM to 9.4µM) (Jacques et al. 1999). However, these effects were much smaller than the effect of the folate supplements (>0.5mg/day) used in the Homocysteine Lowering Trialists’ Collaboration study (1998). These results suggest that while folic acid fortification of grain products slightly reduced serum homocysteine level in the general population, greater reduction can be achieved with additional folic acid supplements.
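The percentages reported for the fortification study follow directly from the before and after values. As a short worked check in Python (the helper name `percent_change` is introduced here for illustration only):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


folate_change = percent_change(4.6, 10.0)        # plasma folate, ng/mL
homocysteine_change = percent_change(10.1, 9.4)  # plasma homocysteine, micromolar

print(f"folate: {folate_change:+.0f}%")              # prints "folate: +117%"
print(f"homocysteine: {homocysteine_change:+.0f}%")  # prints "homocysteine: -7%"
```

The 7% fall in homocysteine after fortification is small next to the 25% reduction reported for supplements of 0.5mg/day or more, which is the comparison the paragraph above draws.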

Cardiovascular disease

Cardiovascular disease is a general term that encompasses abnormalities of the heart and blood vessels. The most common forms include coronary heart disease, in which blood supply to the heart muscle is reduced by narrowing or blockage of the coronary arteries. Another common cardiovascular disease is stroke, in which blood supply to part of the brain is insufficient due to blockage or narrowing of the arteries or a hemorrhage. The association between elevated plasma homocysteine level and cardiovascular disease was first established in patients with homocystinuria, an abnormally high level of homocysteine in the urine. These patients were deficient in cystathionine beta-synthase (CBS), an enzyme required for eliminating homocysteine from the . To test the effect of homocysteine on the vascular system, McCully injected homocysteine thiolactone into rabbits daily for 8 weeks and found a significant increase in aortic plaques and lesions (McCully and Wilson 1975). He hypothesized that elevated homocysteine level causes cardiovascular disease. Despite a positive finding in an early study (Wilcken and Wilcken 1976), homocysteine as a risk factor for cardiovascular disease did not receive much notice until 25 years later.

Clarke et al. (1991) published a case-control study involving 123 cardiovascular disease patients and showed that hyperhomocysteinemia (defined as >24µM after patients were administered methionine following an overnight fast) is an independent risk factor for cardiovascular disease, separate from traditional risk factors such as hypertension, hypercholesterolemia, and smoking. Since this report of homocysteine as an independent risk factor, there has been a dramatic increase in the number of studies linking homocysteine to cardiovascular disease risk (Refsum et al. 1998). Although earlier studies focused on hyperhomocysteinemia (>15µM), many later studies have found no threshold effect of homocysteine on cardiovascular disease risk (Boushey et al. 1995). Instead, later studies showed a graded increase in cardiovascular disease risk with increasing homocysteine level.

Currently there are both case-control and cohort studies of homocysteine and cardiovascular disease risk. Because of the large number of published studies and some conflicting results between them, I focused on the most recent meta-analysis, which summarizes all of the currently available published data.

The most recent meta-analysis that examined the association between homocysteine level and cardiovascular disease risk is from the Homocysteine Studies Collaboration (2002), which included 12 cohort studies and 18 case-control studies. Although all but one of the case-control studies showed a significant association between increased homocysteine level and cardiovascular disease risk, only 7 of the 12 cohort studies showed a significant association. Because almost all of the case-control studies showed a significant positive association, meta-analysis of these data does not provide additional information. However, meta-analysis of the cohort studies was much more useful, not only because of the conflicting results but also because there was extensive medical information on each patient, including other traditional cardiovascular disease risk factors.

After combining all of the cohort studies, there were 1968 coronary heart disease cases and 463 stroke cases within the total patient population of 9025. The individual studies ranged between 212 and 1778 total patients. The authors found that a 25% decrease in homocysteine level was associated with an 11% lower risk of coronary heart disease (odds ratio (OR) 0.89 [95% CI: 0.83-0.96]) and a 19% lower risk of stroke (OR 0.81 [95% CI: 0.69-0.95]) after adjusting for age, sex, smoking, systolic blood pressure, and total cholesterol level. They also found that the effect of homocysteine was much larger when the data were not adjusted for smoking and blood pressure, highlighting that homocysteine can also increase as a result of other traditional cardiovascular disease risk factors. This may partly explain why case-control studies have always shown a strong association between homocysteine and cardiovascular disease risk. Another meta-analysis of cohort studies (Bautista et al. 2002) confirmed the significant association between homocysteine and cardiovascular disease.

Randomized controlled trials using folate supplements to lower cardiovascular disease risk are still required before recommending patient intervention (Bostom and Garber 2000). While randomized controlled trials are in progress, a cohort study of daily folate intake and coronary heart disease risk in 80,082 women from the Nurses’ Health Study showed an incremental decrease in cardiovascular disease risk with increasing folate intake (Rimm et al. 1998). Each additional 200µg/day of folic acid led to an 11% decrease (relative risk (RR) 0.89 [95% CI: 0.82-0.96]) in coronary heart disease risk after adjusting for other cardiovascular disease risk factors. The most significant effect was observed for intakes higher than 400µg/day of folic acid.

Neural tube defect

Neural tube defects are congenital malformations of the nervous system due to failure of the neural tube to close during development. Throughout history, sudden increases in the incidence of neural tube defects have been documented in regions that recently experienced poor crop yield (Michie 1991) or a natural disaster (Duff et al. 1991), suggesting that poor and are risk factors of neural tube defect. There are many and nutrition factors that may account for such an increase in neural tube defect cases. In the early 1960s, Elizabeth Hibbard developed an assay to detect folate deficiency using urinary formiminoglutamate secretion, and the assay was used in a case-control study of neural tube defect (Hibbard and Smithells 1965). Their positive finding linking defective folate metabolism and neural tube defect led to a later case-control study of several blood vitamin levels and neural tube defects (Smithells et al. 1976). The most significant difference in blood vitamin level between affected and non-affected mothers was red cell folate. Based on this finding, clinical trials using folate or supplements were conducted in the early 1980s, albeit unsuccessfully: one study used a non-randomized design (Smithells et al. 1980) and another showed no beneficial effect of folate supplements (Laurence et al. 1981). The controversy over the role of folate supplementation in neural tube defect prevention was finally settled in 1991 by the MRC Vitamin Study Research Group (1991) in a trial involving 1817 women from seven countries who had previously given birth to infants with a neural tube defect. To separate the effect of folate from that of other vitamins, women were given daily supplements of folate alone, multivitamin without folate, multivitamin with folate, or placebo, starting prior to attempting another pregnancy and continuing up until three months into pregnancy. The folate supplement (4mg/day) alone showed a 72% reduction (RR 0.28 [95% CI: 0.12-0.71]) in neural tube defects compared to placebo and multivitamin without folate. It is important to note that this 72% is specific to the recurrent risk of neural tube defect and does not apply to first occurrence. Vitamin supplements without folate showed no change in risk compared to placebo.

Despite the significance of the folate supplements in the MRC study, the women in that randomized controlled trial represented recurrent cases of neural tube defects, which were only 5% of the total neural tube defect cases, with the remainder being first occurrences. To examine the protective effect of folate in the first occurrence of neural tube defect, a randomized controlled study was conducted in Hungary (Czeizel and Dudas 1992) involving 2104 women receiving a daily vitamin supplement that included 0.8mg of folate and 2052 women receiving placebo. The multivitamin group showed 1.3% cases, which was significantly lower than the 2.3% cases in the placebo group. These two randomized controlled trials firmly established the protective role of folate against neural tube defects, such that folate supplements for women attempting pregnancy are now recommended in many countries.

Cancer

Among the many types of cancer, colon cancer and breast cancer have by far the highest number of clinical studies involving homocysteine and folate metabolism and will be the focus of this section. Wide variation in cancer rates between countries and changes in cancer rates among migrants suggest a strong environmental contribution to the incidence of many cancer types (Doll and Peto 1981). For example, Japanese immigrants to Hawaii have incidence rates for at least nine different cancer types, including colon and breast, that are similar to those of Caucasians living in Hawaii and significantly different from those of Japanese living in Japan (Doll and Peto 1981). Crude estimates of the proportion of cancer caused by environmental factors are 90% for males and 92% for females for colon cancer and 83% for breast cancer in females (Doll and Peto 1981). However, a more recent study (Lichtenstein et al. 2000) involving 44,800 pairs of twins from Europe has shown higher heritability, at 35% for colon cancer and 27% for breast cancer. While cancer heritability is difficult to estimate accurately (Risch 2001), these studies suggest that environmental factors contribute significantly to cancer risks.

One of the environmental contributors to colon cancer risk is diet. Increased red meat consumption is associated with increased risk and increased fiber intake with decreased risk for colon cancer (Willett 1988). Interestingly, only fiber intake from fruits and vegetables shows protective effects, whereas fiber from grains and cereals does not show any significant effect. The high folate content of fruits and vegetables compared to grains and cereals, and the occurrence of DNA hypomethylation in neoplastic growth in colon prior to (Goelz et al. 1985) and during malignancy (Feinberg and Vogelstein 1983), led to clinical studies on folate intake and colon cancer risk. Among case-control studies, one study involving 286 cases and 295 controls showed a protective effect of folate (Benito et al. 1991), whereas another study involving 428 cases and matched controls (Freudenheim et al. 1991) showed no significant effect of folate. The discrepancy has been attributed to recall bias, in which patients suffering from colon cancer report their diet differently than healthy control subjects do (Freudenheim 1999).

Currently, there are four cohort studies on folate intake and colon cancer risk and no published meta-analyses or randomized controlled trials. Therefore, I summarized the results from the four cohort studies below (Table 1.1). Three studies (Giovannucci et al. 1998, Kato et al. 1999, Su and Arab 2001) showed around a 45% decrease in colon cancer risk with high folate intake or high serum folate level. The study that examined both serum folate and homocysteine level (Kato et al. 1999) found that while high serum folate showed a significant protective effect, high serum homocysteine did not change the risk for colon cancer. Two studies (Giovannucci et al. 1995, Su and Arab 2001) also examined the effect of alcohol and folate intake on colon cancer risk. Both found a significant increase in colon cancer risk in the high alcohol, low methionine, and low folate intake group compared to the low alcohol, high methionine, and high folate intake group. In one study (Su and Arab 2001), the protective effect of folate against colon cancer was abolished with alcohol intake. Together, these studies suggest that high folate intake protects against colon cancer but that the effect needs to be assessed in combination with alcohol intake.

Dietary risk factors for breast cancer include high intake and high alcohol consumption (Willett 1989). Two case-control studies, one with postmenopausal women and the other with premenopausal women, have examined the effect of different food and nutrient intakes on breast cancer risk (Graham et al. 1991, Freudenheim et al. 1996) and showed weak to no effect of folate intake. The only cohort study (Zhang et al. 1999), involving 88,818 women from the Nurses’ Health Study of whom 3483 developed breast cancer, showed no significant protective effect of folate against breast cancer. However, when only women who consumed at least 15g/day of alcohol (>1 drink/day) were considered, high folate intake (>600µg/day) showed a significant decrease in risk (RR 0.55 [95% CI: 0.39-0.76]) compared to low folate intake (150-300µg/day). As with colon cancer, there is a strong interaction between folate intake and alcohol consumption, suggesting complex interactions between environmental risk factors.

Table 1.1 Summary of cohort studies on colon cancer risk and homocysteine and folate levels.

Giovannucci et al. 1995
  Study population: 47931 males
  Colon cancer cases: 205
  Years of follow-up: 6
  Key observations:
    High alcohol (>2 drinks/day vs <0.25 drinks/day): increased risk, RR 2.07 [CI: 1.29-3.32]
    Low methionine (<1.74g/day vs >2.44g/day) & low folate (<270µg/day vs 6464µg/day): no effect
    High alcohol, low methionine & folate: higher risk than alcohol alone, RR 3.30 [CI: 1.58-6.88]

Giovannucci et al. 1998
  Study population: 88756 females
  Colon cancer cases: 442
  Years of follow-up: 14
  Key observations:
    High folate intake (>400µg/day vs <200µg/day): decreased risk, RR 0.69 [CI: 0.52-0.93]

Kato et al. 1999
  Study population: 15785 females
  Colon cancer cases: 105
  Years of follow-up: 4-10
  Key observations:
    High serum folate (>31.1nM vs <12.2nM): decreased risk, RR 0.52 [CI: 0.27-0.97]
    High serum homocysteine (>12.2µM vs <7.9µM): no effect

Su & Arab 2001
  Study population: 3913 males, 6098 females
  Colon cancer cases: 99 males, 120 females
  Years of follow-up: 20
  Key observations:
    High folate intake (>250µg/day vs <103µg/day): decreased risk in males only, RR 0.40 [CI: 0.18-0.88]
    Alcohol intake (>1/yr) abolished folate’s protective effect

All four studies were adjusted for confounding factors such as age, family history, smoking, red meat, fat, fiber, alcohol, other vitamins and methionine consumption, aspirin use, body mass index, physical activity, and calorie intake. Relative risk (RR) is the proportion of people with disease among those exposed to the risk factor divided by the proportion of people with disease among those not exposed to the risk factor. Confidence interval (CI) is 95%.
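The relative risk defined in the note to Table 1.1 can be computed directly from cohort counts. The sketch below is a minimal, unadjusted illustration: the function name `relative_risk` and the counts are hypothetical, and the confidence interval uses the common log-normal approximation. The studies in the table report estimates adjusted for confounders with regression models, which this crude calculation does not reproduce.

```python
from math import exp, log, sqrt


def relative_risk(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """Crude relative risk with a 95% CI from the log-normal approximation."""
    risk_exposed = cases_exposed / total_exposed
    risk_unexposed = cases_unexposed / total_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR) for cumulative-incidence data.
    se_log_rr = sqrt(1 / cases_exposed - 1 / total_exposed
                     + 1 / cases_unexposed - 1 / total_unexposed)
    lower = exp(log(rr) - 1.96 * se_log_rr)
    upper = exp(log(rr) + 1.96 * se_log_rr)
    return rr, lower, upper


# Hypothetical cohort: 30 cases among 1000 people with high folate intake
# (exposed) and 60 cases among 1000 people with low intake (unexposed).
rr, lower, upper = relative_risk(30, 1000, 60, 1000)
percent_lower_risk = (1 - rr) * 100  # an RR of 0.5 corresponds to 50% lower risk
```

An RR below 1 with a confidence interval that excludes 1, as in this invented example, is what the table reports as a significantly decreased risk.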

Alzheimer’s disease and dementia

Among several neurodegenerative disorders, dementia, in particular Alzheimer’s disease, has been studied extensively in relation to homocysteine and folate metabolism. Although Mendelian disorders of homocysteine and folate metabolism have shown developmental abnormalities in neurological functions such as mental retardation and seizure, the link to neurodegenerative disorders, which are late-onset neurological diseases, has only been made recently. The observation that patients with atherosclerosis are more likely to suffer from dementia or Alzheimer’s disease than non-atherosclerotic patients suggested that cardiovascular risk factors may also pose a risk for Alzheimer’s disease (Hofman et al. 1997). Currently, there are several case-control studies and two cohort studies examining the relationship between Alzheimer’s disease and homocysteine and folate metabolism. I have highlighted some of the shared conclusions and differences between these studies.

Three of the four case-control studies (Clarke et al. 1998, McCaddon et al. 1998, McIlroy et al. 2002, Miller et al. 2002) showed a two- to four-fold increase in risk for Alzheimer’s disease and dementia in patients with hyperhomocysteinemia compared to patients with normal homocysteine levels. These risks remained statistically significant after adjusting for known confounding factors such as age, blood pressure, smoking, and ApoE genotype. Two of the case-control studies also compared the magnitude of risk conferred by the ApoE E4 allele with that conferred by hyperhomocysteinemia (Clarke et al. 1998, McIlroy et al. 2002). Risks from the ApoE4 allele were higher than risks from hyperhomocysteinemia in both studies, suggesting that ApoE4 remains one of the strongest risk factors for non-familial Alzheimer’s disease.

Among cohort studies, a Swedish study of 370 patients with a three-year follow-up showed no significant effect of serum folate and vitamin B12 levels on Alzheimer’s disease risk (Wang et al. 2001). However, a more recent study of 1092 patients with an eight-year follow-up (Seshadri et al. 2002) showed that hyperhomocysteinemia (>14uM) confers about a two-fold increased risk of Alzheimer’s disease (RR 1.9 [95% CI: 1.2-3.0]) and of dementia (RR 1.9 [95% CI: 1.3-2.8]). This comprehensive study also took into account 13 confounding factors, more than any previous study of homocysteine and Alzheimer’s disease risk.

Genetic Mutations in Homocysteine and Folate Metabolism

Rare mutations

Rare Mendelian mutations in the homocysteine and folate pathways have been reviewed in detail (Mudd et al. 2001, Rosenblatt and Fenton 2001). The clinical manifestations of these mutations, which often result in enzyme deficiency, vary from patient to patient and can in some cases be ameliorated with dietary intervention. The common outcomes of these deficiencies are briefly described here. Cystathionine beta-synthase (CBS) enzyme deficiency mainly affects the eyes and the skeletal, vascular, and central nervous systems. The most common disorders in these systems include dislocation of the ocular lens, osteoporosis, thromboembolism, and mental retardation. Methylenetetrahydrofolate reductase (MTHFR) enzyme deficiency is characterized by developmental delay accompanied by motor and gait abnormalities. Patients also exhibit seizures and psychiatric manifestations. Pathologies in the central nervous system include demyelination, astrogliosis, and macrophage infiltration. MTHFR enzyme deficiency also leads to vascular changes similar to those of homocystinuria (CBS enzyme deficiency). Defects in the metabolism of the cofactors for homocysteine pathway enzymes, vitamin B6 and B12, also lead to megaloblastic anemia and homocystinuria. These disorders are extremely rare, and only a few patients are described in the clinical literature. The most frequent enzyme deficiency is that of CBS, with a frequency of 1 in 300,000 in the general population (Mudd et al. 1995), compared to 30 total cases reported for MTHFR enzyme deficiency and even fewer for other enzymes in the homocysteine and folate pathways.

Common mutation (Mthfr C677T mutation)

The most prevalent mutation in the homocysteine and folate pathways is a cytosine to thymine substitution at base pair 677 of Mthfr, resulting in an alanine to valine substitution. The heterozygous mutation leads to 65% of wildtype MTHFR enzyme activity and the homozygous mutation leads to 35% of wildtype activity (Frosst et al. 1995). The mutation destabilizes only the binding of the FAD cofactor and does not affect the kinetic parameters of the enzyme (Guenther et al. 1999). An increase in folate concentration was shown to stabilize the binding of the FAD cofactor in the mutant MTHFR (Guenther et al. 1999). The allele frequency of 677T varies among populations, ranging from a low of 0.073 in sub-Saharan Africans to highs of 0.545 in Spanish and 0.448 in Italians (Pepe et al. 1998). Asians show variability, from a low of 0.020 in Indonesians to a high of 0.375 in Chinese. Patients with the homozygous 677TT mutation show on average a 2.7uM higher serum homocysteine level than 677CC wildtype patients (Wald et al. 2002). Patients with 677TT also show lower red cell folate, which is indicative of tissue folate status, than 677CC patients (Molloy et al. 1997). Additionally, under low serum folate conditions (<12nM), patients with 677TT show a lower genomic DNA methylation level than patients with 677CC (Friso et al. 2002).
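To give a feel for what these allele frequencies imply at the genotype level, the following Python sketch converts a 677T allele frequency into expected genotype frequencies. It assumes Hardy-Weinberg equilibrium, which the cited studies do not necessarily establish, and the function names and the population-average activity figure are illustrative constructs of this sketch, not quantities reported in the literature. The allele frequencies and the relative enzyme activities (100%, 65%, 35%) are the values quoted above.

```python
def hardy_weinberg(t_allele_freq):
    """Expected genotype frequencies, assuming Hardy-Weinberg equilibrium."""
    q = t_allele_freq        # frequency of the 677T allele
    p = 1.0 - q              # frequency of the 677C allele
    return {"CC": p * p, "CT": 2 * p * q, "TT": q * q}


# Relative MTHFR activity per genotype, as quoted from Frosst et al. 1995.
RELATIVE_ACTIVITY = {"CC": 1.00, "CT": 0.65, "TT": 0.35}


def mean_relative_activity(t_allele_freq):
    """Genotype-weighted average activity (an illustrative summary only)."""
    genotypes = hardy_weinberg(t_allele_freq)
    return sum(freq * RELATIVE_ACTIVITY[g] for g, freq in genotypes.items())


# 677T allele frequencies quoted in the text.
populations = {
    "Indonesians": 0.020,
    "sub-Saharan Africans": 0.073,
    "Chinese": 0.375,
    "Italians": 0.448,
    "Spanish": 0.545,
}

for name, q in populations.items():
    tt = hardy_weinberg(q)["TT"]
    print(f"{name}: expected TT = {tt:.3f}, "
          f"mean relative activity = {mean_relative_activity(q):.2f}")
```

Because the expected TT frequency scales with the square of the allele frequency, modest differences in allele frequency translate into large differences in the share of homozygous individuals.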

Mthfr C677T mutation and disease risks

The association between the MTHFR 677TT mutation and cardiovascular disease has been summarized in two recent meta-analyses involving over 35 case-control and five cohort studies with a total of 20,000 patients (Wald et al. 2002, Klerk et al. 2002). The risk for cardiovascular disease was about 20% higher in patients with the TT homozygous mutation than in CC wildtype patients. However, the risk estimates among the 40 studies varied more than expected by chance. Interestingly, Klerk et al. (2002) found that only data from European populations showed a significant increase in risk (OR 1.14 [95% CI: 1.01-1.28]) for the TT mutation, whereas North American populations showed a non-significant risk (OR 0.87 [95% CI: 0.73-1.05]). Also, when serum folate levels were available and taken into account, the increase in cardiovascular disease risk associated with TT was significant only in patients with lower than median folate levels. The authors suggest that fortification of grains with folic acid, which started much earlier in North America, may account for the difference in disease risk. These findings highlight an important interaction between the MTHFR 677TT mutation and dietary folate intake in cardiovascular disease risk.
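As a rough check on how different the two regional estimates are, the Python sketch below recovers the standard error of each log odds ratio from its reported 95% confidence interval and compares the two with a z-test. This is an illustrative back-of-the-envelope calculation under a normal approximation, not the heterogeneity analysis performed by Klerk et al. (2002). The function names are assumptions of this sketch, and the inputs are the odds ratios quoted above.

```python
import math


def log_or_and_se(odds_ratio, ci_lower, ci_upper, z=1.96):
    """Recover ln(OR) and its standard error from a reported 95% CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_or, se


def compare_subgroups(group_a, group_b):
    """Two-sided z-test for a difference between two independent log ORs."""
    log_a, se_a = log_or_and_se(*group_a)
    log_b, se_b = log_or_and_se(*group_b)
    z_score = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z_score) / math.sqrt(2))  # two-sided normal p-value
    return z_score, p_value


# (OR, lower 95% limit, upper 95% limit) as quoted from Klerk et al. 2002.
europe = (1.14, 1.01, 1.28)
north_america = (0.87, 0.73, 1.05)

z_score, p_value = compare_subgroups(europe, north_america)
print(f"z = {z_score:.2f}, two-sided p = {p_value:.3f}")
```

A small p-value here would only indicate that the regional estimates differ by more than sampling error; it would not by itself identify folate fortification as the cause.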

The association between the MTHFR 677TT mutation and colon cancer risk has been examined in several case-control studies, many of which are summarized in Table 1.2. Of these seven studies, five showed non-significant differences in disease risk between the TT homozygous mutant and the CC wildtype and CT heterozygote. The two remaining studies showed an over 50% decrease in colon cancer risk. Interestingly, the mutation has not been associated with a significant increase in colon cancer risk but instead with decreased risk. Another mutation in Mthfr, A1298C, was examined in three of the studies. The homozygous 1298CC mutation had 61% of the wildtype MTHFR enzyme activity (van der Put et al. 1998), similar to the enzyme activity for heterozygous 677CT. Two studies showed a 40 to 50% decrease in colon cancer risk when comparing the 1298CC mutation to the 1298AA wildtype. When environmental factors such as high serum folate level or low alcohol intake were taken into account, the risk for colon cancer associated with the 677TT or 1298CC mutation decreased further, or a non-significant risk became a significantly decreased risk. Based on these studies, the MTHFR 677TT mutation poses a mostly non-significant risk for colon cancer and shows decreased risk given high serum folate or low alcohol intake.

Table 1.2 Summary of association studies between Mthfr C677T and colon cancer risk

Study | Patient # | Mutation | Odds ratio | Odds ratio when environmental factors are taken into account
Chen et al. 1996 | 144 cases, 627 controls | 677 CC/CT vs TT | Not significant | 0.11 [CI: 0.01-0.85], significant only for <1 drink/week. No interaction with methionine or folate.
Ma et al. 1997 | 202 cases, 326 controls | 677 CC/CT vs TT | 0.46 [CI: 0.25-0.84] | 0.32 [CI: 0.15-0.68], significant only for >3ng/mL plasma folate. 0.12 [CI: 0.03-0.57], significant only for <0.14 drink/day.
Chen & Giovannucci et al. 1998 | 257 cases, 713 controls | 677 CC/CT vs TT | Not significant | No interaction with folate, methionine, or alcohol.
Ulrich et al. 2000 | 200 cases, 645 controls | 677 CC vs TT | Not significant | No interaction with folate, B6, B12, or methionine.
Keku et al. 2002 | 555 cases, 875 controls | 677 CC/CT vs TT | Not significant | No interaction with folate or alcohol.
Keku et al. 2002 | (same study) | 1298 AA vs CC | 0.5 [CI: 0.3-0.9], whites only, not for African Americans | 0.3 [CI: 0.2-0.7], significant only for no alcohol intake. No interaction with folate.
Toffoli et al. 2003 | 134 proximal cases, 142 distal cases, 279 controls | 677 CC/CT vs TT | 0.36 [CI: 0.14-0.91], proximal only | N/A
Toffoli et al. 2003 | (same study) | 1298 AA vs CC | Not significant | N/A
Curtin et al. 2004 | 1608 cases, 1972 controls | 677 CC vs TT | Not significant | Unclear
Curtin et al. 2004 | (same study) | 1298 AA vs CC | 0.6 [CI: 0.5-0.9], women only | Unclear

Odds ratio (OR) is the number of diseased people who were exposed to a risk factor (DE) over the number of diseased people who were not exposed (DN), divided by the number of non-diseased people who were exposed (NE) over the number of non-diseased people who were not exposed (NN): OR = (DE/DN) / (NE/NN). Confidence interval (CI) is 95%.
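The odds ratio definition in the table note can be made concrete with a short Python sketch. The counts below are hypothetical and do not come from any study in Table 1.2. The function name `odds_ratio` and the use of the Woolf (log-based) approximation for the 95% confidence interval are assumptions of this sketch, and the cited studies typically report adjusted estimates instead.

```python
import math


def odds_ratio(de, dn, ne, nn, z=1.96):
    """Return (OR, lower, upper) with OR = (DE/DN) / (NE/NN).

    de: diseased and exposed        dn: diseased, not exposed
    ne: non-diseased and exposed    nn: non-diseased, not exposed
    The interval uses the Woolf approximation for the SE of ln(OR).
    """
    estimate = (de / dn) / (ne / nn)
    se_log_or = math.sqrt(1 / de + 1 / dn + 1 / ne + 1 / nn)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Hypothetical case-control counts, treating the TT genotype as the exposure:
# 10 TT and 190 CC/CT among cases, 40 TT and 360 CC/CT among controls.
estimate, lower, upper = odds_ratio(de=10, dn=190, ne=40, nn=360)
print(f"OR = {estimate:.2f} [95% CI: {lower:.2f}-{upper:.2f}]")
```

An odds ratio below 1 whose interval excludes 1, as in this hypothetical example, corresponds to the pattern the table reports as a significantly decreased risk for the TT genotype.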

The association between the MTHFR mutation and risk for neural tube defect has been examined both in children with neural tube defect and in the mothers of these children across several case-control studies (Table 1.3). The maternal 677TT genotype conferred increased risk for having children with a neural tube defect in fewer than half of the studies. The children’s 677TT genotype conferred increased risk for neural tube defect in fewer than a third of the studies. The increase in risk was two- to three-fold for both the children and the mothers. Geographically, the frequency of spina bifida is lower in Italy (3.6/10,000) than in the Netherlands (5.8/10,000) (de Franchis et al. 1995). However, the frequency of the 677TT mutation is higher in Italy (16.3%) than in the Netherlands (4.8%). This discrepancy is reflected in Table 1.3, in which two studies from Italy showed no increase in risk while two studies from the Netherlands showed a significant increase in risk for neural tube defect with the 677TT mutation. This suggests that mutations in other genes or environmental factors largely account for the difference. Another mutation in Mthfr, A1298C, showed a significant increase in risk for neural tube defect in only one out of three studies. Overall, the increase in risk for neural tube defect associated with the 677TT mutation is inconsistent and is weaker than the effect of folic acid. A meta-analysis of neural tube defect and the 677TT mutation may provide increased statistical power to detect the effect of the TT mutation.

Table 1.3 Summary of association studies between Mthfr C677T and neural tube defect risk

Study | Country and patient # | Mutation | Odds ratio
van der Put et al. 1995 | Netherlands: 55 spina bifida patients, 70 mothers, 60 fathers, 207 controls | 677 CC/CT vs TT | Patient: 2.9 [CI: 1.0-7.9]. Mother: 3.7 [CI: 1.5-9.1]. Father: not significant.
de Franchis et al. 1995 | Italy: 28 mothers of spina bifida patients, 289 controls | 677 CC/CT vs TT | Mother: not significant.
Papapetrou et al. 1996 | Great Britain: 41 NTD patients, 36 mothers, 26 fathers, 199 controls | 677 CC/CT vs TT | Patient: not significant. Mother: not significant. Father: not significant.
Ou et al. 1996 | USA: 41 NTD patients, 109 controls | 677 CC/CT vs TT | Patient: 7.2 [CI: 1.8-30.3].
van der Put et al. 1998 | Netherlands: 109 NTD patients, 122 mothers, 103 fathers, 403 controls | 677CC/1298AA vs 677TT/1298AA (the only significant combination) | Patient: not significant. Mother: 2.38 [CI: 1.05-5.53]. Father: not significant.
Stegmann et al. 1999 | Germany: 148 NTD patients, 174 controls | C677T & A1298C combinations | Patient: not significant for all genotypes.
Christensen et al. 1999 | Canada: 56 spina bifida patients, 62 mothers, 97 control patients, 90 mothers of controls | 677CC/CT mom & child vs 677TT mom & child | Child/mom: 6.00 [CI: 1.26-28.53].
Christensen et al. 1999 | (same study) | 677CC/CT vs 677TT | Mother alone: not significant. Child alone: not significant.
De Marco et al. 2002 | Italy: 203 NTD patients, 98 mothers, 67 fathers, 210 controls | 1298AA vs CC | Patient: 3.67 [CI: 1.67-8.18]. Mother: 6.23 [CI: 2.58-15.35]. Father: 3.28 [CI: 1.17-9.16].
Gueant-Rodriguez et al. 2003 | Italy: 40 spina bifida patients, 58 controls | 677CC/CT vs TT | Patient: not significant.

Odds ratio (OR) is the number of diseased people who were exposed to a risk factor (DE) over the number of diseased people who were not exposed (DN), divided by the number of non-diseased people who were exposed (NE) over the number of non-diseased people who were not exposed (NN): OR = (DE/DN) / (NE/NN). Confidence interval (CI) is 95%.

Mouse Models and Disease Mechanisms

Human studies mentioned in previous sections have established associations between elevated homocysteine level or reduced folate level and complex diseases. However, it remains unclear whether elevated homocysteine or low folate is a cause or consequence of these diseases. Single gene mouse mutants for homocysteine and folate pathways allow us to establish the causal link between disorder in the homocysteine and folate pathway (cause) and the disease (consequence) while minimizing the contribution of environmental variation to the phenotype. Studies of these mouse knockout models revealed important information regarding the role of homocysteine and folate pathways in mechanisms of complex diseases.

A decline in endothelium-dependent vasodilation is one of the early signs of atherosclerosis (McLenachan et al. 1991), is associated with several risk factors for cardiovascular disease (Vita et al. 1990), and is a predictor of cardiovascular disease events (Suwaidi et al. 2000, Schachinger et al. 2000). This vascular impairment is assessed as a decline in the vasodilator response to mechanical or pharmacological stimuli, which is often caused by a decline in nitric oxide synthesis. Folate enhances nitric oxide synthesis in vitro (Stroes et al. 2000) and improves endothelial function in humans (Woo et al. 1999, Bellamy et al. 1999). Auto-oxidation of homocysteine in vitro also leads to hydrogen peroxide and superoxide anion synthesis, causing degradation of nitric oxide (Starkebaum and Harlan 1986). This evidence suggests that an insufficient folate level or an elevated homocysteine level causes impaired endothelial function.

To study the role of homocysteine metabolism in vascular function, a knockout for cystathionine beta-synthase (Cbs), a gene involved in eliminating homocysteine from the pathway, has been generated. The mutation was transferred to the C57BL/6J genetic background. Homozygous knockout mice showed reduced survival and reduced weight gain as well as a 40-fold elevated level of homocysteine compared to the wildtype (Watanabe et al. 1995). Heterozygous mice showed 50% of enzyme activity and a two-fold higher homocysteine level compared to the wildtype. The heterozygous knockout mice showed reduced endothelium-dependent vasodilation in several studies (Eberhardt et al. 2000, Weiss et al. 2002). These mice also showed an increased level of superoxide anion in aortic tissue, which could increase oxidative stress and degrade nitric oxide (Eberhardt et al. 2000). Increase in vascular tissue level and glutathione peroxidase 1 enzyme activity, which are involved in eliminating reactive oxygen species, improved endothelial function in the heterozygous knockout mice (Weiss et al. 2002). A knockout model for methylenetetrahydrofolate reductase (Mthfr), which is involved in taking a methyl group from the folate pathway to the homocysteine pathway, showed no difference in endothelium-dependent vasodilation between wildtype and heterozygous mice (Devlin et al. 2004). Heterozygous knockout mice had 60% to 70% of wildtype enzyme activity and a 1.6-fold higher serum homocysteine level (Chen et al. 2001). However, homozygous knockout mice, which had a ten-fold higher homocysteine level than the wildtype, showed significant lipid deposition in the aorta (Chen et al. 2001). These mice were on a mixed 129/Sv and BALB background. These knockout models showed that mutations in the homocysteine pathway lead to endothelial dysfunction or atherosclerosis, but the specific defect varies between the mutants, suggesting either different cardiovascular roles for each gene or confounding effects from the different genetic backgrounds.

The role of folate deficiency in cancer has been studied extensively. Folate deficiency in rats leads to uracil misincorporation during DNA replication and to DNA strand breaks, causing mutations and genomic instability (Pogribny et al. 1997). Low folate intake can also lead to genomic hypomethylation in humans (Jacob et al. 1998). Genomic hypomethylation is an almost universal observation in tumorigenesis (Feinberg and Vogelstein 1983) and can lead to activation of genes normally suppressed by methylation, such as proto-oncogenes and parasitic sequences. Active parasitic sequences can activate neighboring genes through the promoters of the parasitic element, and a high copy number of active parasitic sequences can also induce somatic recombination leading to genomic instability. A knockout mouse model for DNA methyltransferase 1 (Dnmt1) was created to study the role of defective DNA methylation (homocysteine pathway) in cancer. The Dnmt1 knockout is embryonic lethal, and embryonic stem cells carrying homozygous knockout alleles showed an elevated mutation rate, with deletions being the most prevalent type (Chen and Pettersson et al. 1998). To circumvent the embryonic lethality, a hypomorphic allele of Dnmt1 was used in conjunction with the null knockout allele to create a compound heterozygote, which carried 10% of the wildtype DNMT1 activity (Gaudet et al. 2003). These mice showed DNA hypomethylation, developed aggressive T cell lymphoma, and showed a high frequency of trisomy 15. These observations confirmed that loss or reduction of DNA methylation leads to genomic instability and an increased cancer rate.

Despite the success of folate supplementation in reducing the risk of neural tube defects, the mechanistic role of folate in proper neural tube formation is still unclear. It is speculated that a low folate level inhibits cellular proliferation in the neural tube or that high homocysteine serves as an NMDA receptor antagonist during development (Rosenquist and Finnell 2001). The NMDA receptor is a principal regulator of neural cell migration and other functions essential for proper neural tube formation and closure. In fact, addition of an NMDA agonist decreased the incidence of homocysteine thiolactone-induced neural tube defects in chicken (Rosenquist et al. 1999). Currently, the folate binding protein 1 (Folbp1) knockout is the only model for neural tube defect in the homocysteine and folate pathways. These mice are embryonic lethal by E10 and exhibit neural tube defects (Piedrahita et al. 1999) and delayed development compared to wildtype mice (Spiegelstein et al. 2004). Folbp2 homozygous knockout mice are normal and show serum homocysteine levels similar to those of wildtype littermates (Piedrahita et al. 1999). As in humans, folate supplementation of the mother prevented neural tube defects in the majority of homozygous Folbp1 knockout mice (Piedrahita et al. 1999). However, the specific role of folate in neural tube formation and closure remains unanswered.

Currently, no mouse knockout models for the homocysteine and folate pathways have been used to study Alzheimer’s disease. Some still speculate that vascular changes due to elevated homocysteine levels lead to increased risk for Alzheimer’s disease. In vitro studies have also shown neurotoxicity of homocysteine (Kruman et al. 2000, 2002), but the significance of this observation in vivo remains to be seen. In addition, two recent mouse knockout models for homocysteine pathway genes, methionine synthase (Mtr) (Swanson et al. 2001) and methionine adenosyltransferase 1 (Mat1a) (Lu et al. 2001), have not yet been studied in the context of disease.

Basic Biochemistry of Homocysteine and Folate Pathways

The homocysteine and folate pathways perform basic housekeeping functions including methionine synthesis, methylation of DNA, RNA, lipids, and proteins, synthesis of purines and pyrimidines, and degradation of histidine. As the cellular demands for certain metabolites change in response to environmental stress, developmental processes, and tissue-specific needs, the homocysteine and folate pathways must respond to these changes. It is possible that defects in these responses, or in pathway regulation, lead to specific disease types.

To describe the functions of the pathway, it is first divided into two parts, homocysteine metabolism and folate metabolism. A diagram of the pathway is shown in Figure 1.1. The homocysteine pathway is involved in methionine biosynthesis through the addition of a methyl group to homocysteine. When methionine is not utilized for protein synthesis, it can be converted into S-adenosylmethionine to serve as a substrate for over 100 methylation reactions, including methylation of DNA, RNA, lipids, and proteins (Kisliuk, 1999). The by-product of methylation, S-adenosylhomocysteine (SAH), is metabolized to homocysteine, a compound that is often elevated when there is a defect in the pathway. Homocysteine can be recycled back into the pathway by remethylation involving methionine synthase (MTR) or eliminated from the pathway by transsulfuration via cystathionine beta-synthase (CBS) and cystathionine gamma-lyase (CTH). During transsulfuration, homocysteine is converted into cysteine, which is later used for glutathione synthesis. Glutathione is involved in removing reactive oxygen species to minimize cellular oxidative stress.

The folate pathway is involved in the metabolism of a methyl group, which is used for purine and pyrimidine synthesis. The major supply of the methyl group comes from serine, a metabolite of glycolysis. Serine and tetrahydrofolate (THF) are converted into glycine and 5,10-methylene-THF, and the methyl group from 5,10-methylene-THF can either be delivered to the homocysteine pathway for methionine synthesis or be retained in the folate pathway. In the folate pathway, 5,10-methylene-THF is used by thymidylate synthase for the formation of dTMP (pyrimidine) or converted into 10-formyl-THF by the trifunctional enzyme tetrahydrofolate synthase for purine synthesis.

Three enzymes in the homocysteine pathway require vitamin cofactors, either vitamin B6 or cobalamin (B12), and dietary deficiency of these vitamins often results in an increased homocysteine level. Vitamin B6 is a cofactor of cystathionine beta-synthase (CBS) and cystathionine gamma-lyase (CTH), which are involved in the elimination of homocysteine from the pathway, and vitamin B12 is a cofactor of methionine synthase (MTR), which methylates homocysteine to recycle it back into the pathway.

It is important to note that these pathways are dynamic, responding to specific demands of tissues, such as increased purine and pyrimidine synthesis for rapidly dividing cells, and also to specific environmental stress such as methionine load after a meal rich in protein.

[Figure 1.1 diagram. Enzyme labels (left panel): Atic, Gart, Ams, Fthfd, Mthfd1, Mthfd2, Ftcd, Dhfr, Bhmt, Bhmt2, Shmt, methyltransferases, Mtr, Tyms, Mthfs, Ahcy, Mthfr, Cbs, Cth. Substrate/product labels (right panel): THF, 10-formyl-THF, 5,10-methenyl-THF, 5,10-methylene-THF, 5-methyl-THF, 5-formyl-THF, dihydrofolate, S-adenosylmethionine, methionine, S-adenosylhomocysteine, homocysteine, cystathionine, cysteine.]

Figure 1.1. Homocysteine and folate pathways (after Rosenblatt, 1995). The circles represent substrates/products and arrows represent the enzymes and the direction of reaction. The left panel shows names of enzymes in italics and the right panel shows names of substrates/products. The names of enzymes, enzyme commission numbers (E.C.), and abbreviations are listed in Appendix 1. THF stands for tetrahydrofolate.

Regulation of Homocysteine and Folate Pathways

Several levels of regulation exist in the homocysteine and folate pathways. These include internal positive and negative feedbacks, polyglutamation of folate, and tissue specific regulation.

Positive and negative feedback

One important regulatory decision within the homocysteine pathway is whether to continue using homocysteine in the methionine cycle or to eliminate homocysteine from the pathway. This decision is regulated by one of the reaction intermediates, S-adenosylmethionine (Finkelstein 1998). In the liver, when there is a large import of methionine into the pathway, particularly after consuming a diet rich in protein, methionine is converted into S-adenosylmethionine (S-AdoMet) by S-adenosylmethionine synthase (AMS). S-AdoMet inhibits MTHFR and thereby lowers the level of 5-methyl-THF required for recycling of homocysteine by methionine synthase. S-AdoMet also activates cystathionine beta-synthase (CBS), an enzyme that eliminates homocysteine from the pathway. Therefore, in the presence of high S-AdoMet, homocysteine is eliminated instead of being recycled.

However, this mechanism operates only in the liver because it is the only tissue known to carry the isozyme of AMS that is activated by an increased level of its product, S-AdoMet (Finkelstein 1990). In all other tissues, S-AdoMet inhibits AMS, so S-AdoMet does not accumulate enough to act as an effective regulatory switch between homocysteine recycling and elimination. Therefore, the liver is thought to play an important role in handling a large load of methionine entering the body.

The decision between homocysteine recycling and elimination is also made at the level of each reaction intermediate. Enzymes involved in homocysteine elimination, CBS and CTH, require a high concentration of substrate before the reaction can proceed (high Km). In contrast, enzymes involved in homocysteine recycling, such as methionine synthase and S-AdoMet synthase, can function at low substrate levels (low Km). Therefore, the pathway tends to preserve its metabolites when their concentrations are low (Finkelstein 1998).
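The effect of high versus low Km on the balance between recycling and elimination can be illustrated with the Michaelis-Menten rate law, v = Vmax [S] / (Km + [S]). The Python sketch below is a toy model only: the Km and Vmax values are hypothetical numbers chosen to contrast a low-Km "recycling" enzyme with a high-Km "elimination" enzyme, not measured constants for MTR, CBS, or CTH, and it assumes simple single-substrate kinetics with no allosteric regulation.

```python
def michaelis_menten(substrate, vmax, km):
    """Reaction rate under simple Michaelis-Menten kinetics."""
    return vmax * substrate / (km + substrate)


# Hypothetical kinetic constants in arbitrary units (not measured values).
RECYCLING_KM = 0.05     # low Km: near-maximal rate even at low substrate
ELIMINATION_KM = 2.0    # high Km: needs high substrate to run appreciably
VMAX = 1.0              # same maximal rate for both, to isolate the Km effect


def elimination_share(substrate):
    """Fraction of total flux going through the high-Km (elimination) branch."""
    recycling = michaelis_menten(substrate, VMAX, RECYCLING_KM)
    elimination = michaelis_menten(substrate, VMAX, ELIMINATION_KM)
    return elimination / (recycling + elimination)


for substrate in (0.01, 0.1, 1.0, 10.0):
    print(f"[S] = {substrate:>5}: elimination share = {elimination_share(substrate):.2f}")
```

In this toy model the elimination branch carries only a small share of the flux at low substrate and approaches half of it at high substrate, which mirrors the argument that metabolites are conserved when scarce and disposed of when abundant.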

Polyglutamation of folate

Although extensive feedback and feed-forward regulation exists in the homocysteine and folate pathways, the rate of reactions can also be regulated at another level. When dietary folates in the intestine are absorbed into the blood, they are delivered mainly to the liver, where folylpolyglutamate synthetase adds glutamyl chains to folate. This glutamyl chain enhances the retention of folate by cells (Rosenblatt and Fenton 2001) and also allows enzymes in the folate pathway to proceed with lower concentrations of their folate derivative substrates (a lower Km value). Therefore, when folate in the system is insufficient, folylpolyglutamate synthetase can be upregulated to compensate for the low folate concentration in the pathway.

Tissue specific regulation

The homocysteine and folate pathways behave differently in various tissues in order to meet the specific metabolic demands of the cells. As previously mentioned, the liver has a special isozyme of AMS that allows high production of S-AdoMet. The liver also has high activity of a second enzyme that can recycle homocysteine back into methionine, betaine homocysteine methyltransferase (BHMT). This enzyme, unlike MTR, obtains its methyl group from betaine and can therefore function in the absence of folate. Additionally, a second BHMT gene was recently identified and shows high levels of expression in human liver and kidney (Chadwick et al. 2000).

Folate transport

Dietary folate supplementation lowers the level of homocysteine and may reduce the risk for some diseases associated with elevated homocysteine level. However, most of the circulating folate is not in its dietary form but exists as 5-methyl-THF. 5-methyl-THF derives from a reaction catalyzed by MTHFR using 5,10-methylene-THF as a substrate. 5,10-methylene-THF can be derived through several different enzymes, but the conversion to 5-methyl-THF is mediated only by MTHFR. Therefore, MTHFR is thought to be crucial for the distribution of folate into various tissues.

There are at least two different mechanisms of folate uptake (Rosenblatt and Fenton 2001). The first is a “high capacity/low affinity” system that is driven by an anion gradient. The transporter is encoded by the reduced folate carrier (RFC) gene. The second is a “low capacity/high affinity” system involving receptor-mediated endocytosis of folate by folate binding proteins (FOLBP).

In the brain, the major form of folate transported across the blood-brain barrier is methyl-THF. The transport is mediated by folate receptors (low capacity/high affinity). Methyl-THF is also transported across the choroid plexus to assure rapid distribution of folate in cerebrospinal fluid (CSF) (Wu and Pardridge 1999). A low level of folate in CSF is often found in patients suffering from demyelination (Surtees 1998).

Transcriptional regulation

Most of the work on homocysteine and folate pathway regulation and diseases has focused on the internal regulation of the pathway through feedback mechanisms. These feedback loops are thought to be essential for maintaining homeostasis of the pathway under sudden changes in metabolite levels. However, there is also transcriptional control of the pathway, such that cells can respond to long-term changes in environmental stress, the specific needs of particular tissues, and developmental stages. Recent advances in molecular biology have allowed cloning of all of the genes in the pathway in human and mouse, so that we can now monitor the gene expression profiles of the pathways in response to disease, environmental stress, and tissue type. Data from previous studies suggest complex transcriptional regulation of the homocysteine and folate pathways. For example, dihydrofolate reductase (Dhfr), a gene involved in pyrimidine synthesis, responds to changes in the cell cycle at the transcriptional level (Slansky and Farnham 1996). The mRNA level of Dhfr increases 7- to 12-fold at the G1/S boundary, and this increase is not attributed to changes in RNA stability. The promoter of Dhfr contains binding sites for cell cycle regulated transcription factors such as E2F and Sp1 (Slansky et al. 1993, Noe et al. 1997). Also, in vitro experiments using nasopharyngeal carcinoma cells have shown that the folate binding protein (Folbp) mRNA level increases three- to five-fold in response to folate depletion. The mechanism of this change is partly attributed to RNA stability (Hsueh and Dolnick 1993) and the level of DNA methylation (Hsueh and Dolnick 1994). Other housekeeping pathways, such as glycolysis, also undergo transcriptional regulation under different environmental stresses and tissue demands (Kletzien et al. 1994).

Global perspective on homocysteine and folate pathway regulation

Aside from the variety of transcription factors regulating the homocysteine and folate pathways, many other genes of diverse function regulate these pathways. Several single gene mouse mutants for neural tube defect, such as mutants of the transcription factors Cart1 (Zhao et al. 1996), Cited2 (Barbera et al. 2002), and Pax3 (Fleming and Copp 1998), show a reduction in the frequency of neural tube defects with maternal folic acid supplementation. The rescue of defects suggests impairment of the homocysteine and folate pathways in these mice. In fact, Pax3 mutants show a defect in the uracil to thymidine conversion of the folate pathway (Fleming and Copp 1998). Similarly, the increase in colon cancer incidence in mutants of Apc, a signal transduction gene, can be suppressed with folic acid supplements (Song et al. 2000). Interestingly, folic acid supplementation reduces cancer incidence only at 3 months of age and increases cancer incidence at 6 months of age. This suggests that defects in homocysteine and folate metabolism resulting from mutations in other pathways change depending on the age of the animal and the disease state.

A survey of single gene mouse mutants for neural tube defect (Ernest et al. 2002) also showed changes in serum homocysteine level as well as in the transcript levels of homocysteine and folate pathway genes. The mutated genes included regulators of transcription and signal transduction (Pax3, Gli3, Apc) as well as a gene of cholesterol metabolism (Apob). These results clearly showed that genes of a wide variety of functions can modulate not only the transcript levels of homocysteine and folate pathway genes but also the level of serum homocysteine. Pharmacologically induced rodent disease models, such as the streptozotocin-induced diabetic rat, showed a decrease in homocysteine level as well as upregulation of transsulfuration enzyme (CBS and CTH) activity in the liver (Jacobs et al. 1998). Together, this evidence suggests that regulation of homocysteine and folate metabolism encompasses a wide variety of pathways and that understanding homocysteine and folate metabolism in the context of complex diseases requires an understanding of these complex pathway interactions.

Research Objective

Abnormalities in homocysteine and folate metabolism have been associated with increased risk for a variety of common human diseases. Elevated serum homocysteine and low dietary folate intake increase risks for cardiovascular disease, neural tube defect, cancer, and neurodegenerative disorders. Genetic mutations in homocysteine and folate pathways also increase risk for these diseases. Although the biochemical functions of these pathways are known, these functions do not explain how a specific defect in these pathways leads to each specific disease. Mouse models allow us to control many of the genetic and environmental variables inherent in human studies, and thus to dissect the genes, pathways, and nutrients important for disease mechanism. Thus far, mouse knockout models have shown involvement of some of the known functions of homocysteine and folate metabolism in diseases. However, mutations in other pathways can also modulate disease outcome and the functions of homocysteine and folate metabolism. Study of homocysteine and folate pathway regulation suggests that the functions of these pathways vary depending on their interaction with other pathways under different conditions.

Therefore, to understand the role of homocysteine and folate pathways in diseases, we need to analyze the regulation, interaction, and dynamics of the pathways. To pursue these goals, I used the following three approaches.

In Chapter 2, we first identified a unique strain of mice called PL/J that showed reduced activity of MTHFR, the key enzyme that regulates the flow of methyl-group donors between the folate and homocysteine pathways. PL/J also showed the seizure phenotype observed in MTHFR-deficient patients and the kinky tail phenotype observed in the Mthfr mouse knockout model. The goal was to study the genetic control of MTHFR activity and the proportion of the seizure and tail kink phenotypes explained by low MTHFR activity. We found no coding or 5’ UTR mutations that explain the reduced MTHFR activity, suggesting that an unidentified cis-acting mutation or genes from other pathways (trans-acting) are responsible for the reduced activity. The seizure and tail kink traits are multigenic and showed a strong environmental contribution, such that the low number of affected mice made further segregation analysis inefficient. Given the absence of advanced genetic tools involving PL/J, other approaches will be needed to study the regulation of homocysteine and folate pathways and diseases in this strain.

In Chapter 3, we used an evolutionary approach to address whether biochemical interactions in metabolic pathways buffer deleterious mutations. This offers clues to why only some genes in the homocysteine and folate pathways show a disease phenotype when mutated. We assumed that certain genes can evolve faster, thus undergoing rapid sequence changes, because some mechanism buffers these genetic changes from leading to a lethal phenotype. We found that, in metabolic pathways compared between humans and mice, gene duplication and metabolic networking (connectivity of biochemical reactions), previously shown to be important for genetic buffering in yeast, do not show a significant effect on gene evolution rate. Instead, tertiary protein structure, tissue requirement, and biological function are important in shaping gene evolution rate. These results suggested that higher levels of networks, such as interaction across multiple tissues and function of pathways, determine gene evolution rate more strongly than the genetic buffering effect of gene duplication or metabolic networking.

In Chapter 4, we took advantage of genomic tools, namely microarrays and genome databases, to examine how two genetically distinct inbred mouse strains respond to the same dietary folate perturbation: folate depletion followed by repletion. We found that A/J was able to restore serum folate and homocysteine levels as well as overall gene expression close to control levels after folate was replaced in the diet. However, in C57BL/6J, folate and homocysteine levels did not return to control levels and overall gene expression continued to diverge from the control. Gene expression analysis and literature searches indicated that cholesterol metabolism, DNA methylation status, and choline utilization may be altered in the C57BL/6J strain compared to A/J. We measured the metabolic output of these parameters to test whether changes in gene expression lead to changes in metabolic phenotype. We then examined whether alterations in these newly identified pathways help C57BL/6J recover from folate perturbation, using single-gene mutants and dietary supplements. Results indicated that global DNA methylation is unaffected by folate depletion, that an increase in serum cholesterol level may help increase retention of serum folate, and that choline metabolism participates in homocysteine and folate regulation in an unidentified manner. These pathway relationships, or network modules, serve as building blocks in analyzing mechanisms of complex diseases.


CHAPTER 2

Genetic and phenotypic analysis of MTHFR enzyme activity, tail kinks, and seizure susceptibility in PL/J mice

Authors: Toshimori Kitami, Sheila Ernest, Laura Gallaugher, Lee Friedman, Wayne N Frankel, Joseph H Nadeau

Reference: Mammalian Genome 15: 698-703. 2004.

Abstract

Homocysteine and folate pathways are involved in a wide variety of complex human diseases including cardiovascular disease, neural tube defect, cancer, and neurodegenerative disorder. The most common mutation in the homocysteine and folate pathways is the C677T mutation in Mthfr, which leads to reduced enzyme activity and has been associated with increased risk for several complex diseases. To identify mouse models for low MTHFR activity and its role in diseases, we analyzed MTHFR activity in 21 inbred strains. The PL/J strain had the lowest enzyme activity and additionally showed seizures and tail kinks, which are found in the Mthfr mouse knockout model and have comparable phenotypes in humans. Unlike the human mutation, there was no coding mutation that explained the reduced enzyme activity. Our genetic cross of PL/J to C57BL/6J indicated that tail kink is inherited in a highly polygenic manner and that the rhythmic tossing seizure trait shows low heritability and low penetrance. This indicated that segregation analysis of MTHFR activity and disease phenotype will be difficult in this conventional cross. Interestingly, PL/J was susceptible to a variety of seizure phenotypes and showed genetic interaction with the audiogenic seizure susceptible strain DBA/2J, making PL/J a unique mouse model of epilepsy.

Introduction

Disorders of homocysteine and folate metabolism are associated with several complex human diseases. Elevated serum homocysteine level increases risk for cardiovascular disease (Homocysteine Studies Collaboration 2002) as well as Alzheimer’s disease (Seshadri et al. 2002). Insufficient serum folate level or low dietary folate intake has been associated with increased risk for mothers having offspring with neural tube defects (MRC Vitamin Study Research Group 1991, Czeizel and Dudas 1992) and increased risk for colon cancer (Giovannucci et al. 1998, Kato et al. 1999, Su and Arab 2001). Environmental factors such as dietary folic acid intake (Homocysteine Lowering Trialists’ Collaboration 1998), alcohol consumption (Cravo et al. 1996), and smoking (Nygard et al. 1995) influence serum homocysteine and folate levels. Because many of these environmental factors are also risk factors for cardiovascular disease, cancer, Alzheimer’s disease, and birth defects, it is difficult to study the causative role of homocysteine and folate in these complex diseases.

Genetic mutations within homocysteine and folate pathways help establish the causative role of these pathways in human diseases by directly linking the pathway to disease. The most common mutation in these pathways resides in methylenetetrahydrofolate reductase (MTHFR), a key enzyme that regulates the methyl-group donor supply for remethylation of homocysteine. The allele frequency for the common mutation C677T varies from a low of 0.07 in African populations to as high as 0.50 in European populations (Pepe et al. 1998). The homozygous 677TT mutation results in 35% of the wildtype enzyme activity (Frosst et al. 1995) and is associated with increased risk for cardiovascular disease (Wald et al. 2002, Klerk et al. 2002) as well as neural tube defects for both children and mothers (van der Put et al. 1995, Ou et al. 1996, De Marco et al. 2002), and with reduced risk for colon cancer (Chen et al. 1996, Ma et al. 1997). However, many of these disease associations show a strong environmental contribution. For example, the risk for colon cancer is reduced only in patients who consumed little to no alcohol, suggesting gene-environment interactions (Chen et al. 1996, Ma et al. 1997).

Mouse models enable control of genetic and environmental factors that complicate studies of homocysteine and folate pathways in humans. To identify mouse models for reduced MTHFR activity and its potential implication in diseases, we analyzed MTHFR enzyme activity data from 21 inbred strains (B. Christensen and S. Ernest, unpublished data). We identified a unique inbred strain, PL/J, that showed the lowest enzyme activity. Interestingly, PL/J showed seizures during routine handling of cages, a phenotype also observed in patients with MTHFR deficiency (Rosenblatt and Fenton 2001) as well as in the Mthfr mouse knockout model (Chen et al. 2001). PL/J also showed the kinky tail that is observed in the Mthfr mouse knockout model (Chen et al. 2001).

We sought to identify genetic mutations responsible for low MTHFR enzyme activity and establish whether low MTHFR activity is responsible for the seizure and kinky tail phenotypes in PL/J mice. Sequencing of the coding and 5’ UTR regions revealed no causal mutation, suggesting involvement of an unidentified cis- or trans-acting mutation. We then tested heritability of the seizure and kinky tail phenotypes to determine whether linkage analysis of these phenotypes with MTHFR activity is feasible using PL/J and C57BL/6J crosses. Co-segregation of the seizure and tail kink phenotypes with low MTHFR activity would strongly indicate shared genetic control for these traits. Our genetic analysis of the kinky tail phenotype showed no affected mice in a large backcross, suggesting that a large number of genes are involved in this trait. The seizure trait showed low heritability and low penetrance, suggesting that linkage analysis is impractical. However, PL/J showed susceptibility to a variety of seizure phenotypes and showed genetic interaction with the audiogenic seizure susceptible strain DBA/2J, making this strain a unique resource as a mouse model of epilepsy.

Materials and Methods

Sequencing. Liver total RNA was extracted from C57BL/6J, PL/J, and DBA/2J using Trizol (Invitrogen), and Mthfr mRNA was reverse transcribed using a primer from the 3’ end of the MTHFR mRNA sequence (NM_010840). MTHFR cDNA from each strain was PCR amplified with high fidelity Taq polymerase, subcloned into pGEM-T Easy Vector (Promega), and sequenced in both directions. The first and the last coding exons were PCR amplified from the genomic DNA and sequenced. The 5’ UTR sequence was obtained from 5’ RACE (Roche).

Mice. PL/J, C57BL/6J, and DBA/2J were purchased from the Jackson Laboratory. Breeding colonies were established for each inbred strain. Mice were raised on the PMI Nutrition Laboratory Autoclavable Rodent Diet #5010 from conception. All mice shared the same animal room with controlled temperature, humidity, and a 12 hour light-dark cycle. Mice were provided food and water ad libitum.

Rhythmic tossing seizure tests. At 21 days of age, mice were weaned into separate cages with four to five males or females per cage. Starting at six weeks of age, mice were subjected to rhythmic tossing similar to the protocol described by Legare et al. (2000). Cages were vertically displaced about 3 cm by hand, using a ruler as a guide, at 3 cycles per second for 30 seconds. Seizure types were recorded using two levels of severity: the first level was partial seizure, in which clonus affected a limited portion of the body such as the face, head, or forelimb, and the second level was tonic-clonic seizure, in which muscle contraction followed by clonus affected the entire body including all four limbs and tail. Tests were administered once every three days over 60 days per mouse for a total of 20 tests. Both the frequency of seizure over 20 tests and the number of trials before seizure onset were analyzed. The difference in variance for seizure onset was analyzed with Levene’s test for homogeneity of variance using Statistica (StatSoft).
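As an illustration only, the statistic behind Levene’s test can be sketched in a few lines of Python. The analysis in this study was run in Statistica; the function name levene_w and the onset values below are hypothetical and are not data from the study. The sketch assumes the classical form of the test, which uses absolute deviations from each group mean.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W statistic, using absolute deviations from each group mean.

    Assumption: classical (mean-centred) form of the test. A p-value would be
    obtained by comparing W with an F distribution on (k - 1, N - k) degrees
    of freedom, which is not computed here.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group mean.
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_group_means = [mean(zi) for zi in z]
    z_grand_mean = sum(sum(zi) for zi in z) / n_total
    # Spread of the group-level mean deviations around the grand mean.
    between = sum(
        len(zi) * (zm - z_grand_mean) ** 2 for zi, zm in zip(z, z_group_means)
    )
    # Spread of the individual deviations within each group.
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_group_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical seizure-onset values (test number of the first seizure).
tight_onsets = [5, 6, 6, 7, 5, 6, 7]
wide_onsets = [2, 20, 8, 15, 4, 11, 18, 6]
w = levene_w(tight_onsets, wide_onsets)
print(f"Levene W = {w:.2f}")
```

A larger W indicates a larger difference in spread between the groups; two groups with identical spread give W = 0 regardless of where their means lie.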

Results

MTHFR enzyme activity survey and sequence analysis

To identify a new mouse model for disorders of MTHFR regulation, we analyzed MTHFR activity data from 21 inbred strains and identified PL/J as having the lowest MTHFR enzyme activity, followed by DBA/2J (B. Christensen and S. Ernest, unpublished data). Enzyme activity across the 21 strains followed a Gaussian distribution with the median at the C57BL/6J strain, and PL/J showed 38% of the C57BL/6J activity. Sequencing of the coding region in PL/J indicated one silent change (A1819G) and one amino acid change (G919C, arginine to serine) compared to the published Mthfr sequence (NM_010840) from C57BL/6J. DBA/2J showed the same sequence as PL/J. Multiple sequence comparisons indicated that this amino acid varies among diverse strains and species (Figure 2.1). More importantly, the human sequence has the same amino acid as PL/J at this position, suggesting that this variant is not the cause of reduced enzyme activity. Sequencing of the 5’ UTR also indicated no sequence differences between C57BL/6J, DBA/2J, and PL/J. Given the vast length, uncertainty, and complexity of the remaining regulatory sequence for MTHFR, we need to use MTHFR enzyme activity as a quantitative trait and determine whether this trait is controlled by cis- or trans-acting genetic variants.

Tail kink

Aside from the low MTHFR activity, we found two phenotypes in PL/J mice, tail kinks and seizures, that were previously observed in Mthfr knockout mice (Chen et al. 2001). Kinky tails usually result from abnormalities in somites, notochord, or primary neural tube (Klootwijk et al. 2000, Matsuura et al. 1998). X-ray analysis of PL/J mice (Figure 2.2) showed that the kink is caused by an asymmetrical vertebra in which one side is shorter than the other. There was at most one kink per tail, and the tails were otherwise normal in length. There were no other striking defects observed in these mice. PL/J showed a kinky tail with about 40% penetrance in both males and females. Reciprocal crosses with C57BL/6J, which does not have a kinky tail, resulted in unaffected F1 progeny (N = 40). We observed no affected mice among more than 50 progeny from the backcross of F1 to PL/J, suggesting that the trait is polygenic. In addition, F1 progeny between PL/J and DBA/2J also did not show the kinky tail phenotype (N = 44).

Seizure phenotype

Another phenotype that we discovered in PL/J mice was seizure, which occurred spontaneously during routine handling of cages. Seizures started with limb and tail extension and were immediately followed by convulsion lasting less than one minute and resembled tonic-clonic seizures in humans. Mice often fell on their sides during tonic-clonic seizures and were inactive for several minutes following each seizure episode. In general, PL/J mice had a lower physical activity level compared to strains such as C57BL/6J (data not shown) and many mice lived more than one year despite recurrent seizure episodes. Analysis of brain histology (data not shown) as well as histology in The Mouse Brain Library (http://www.mbl.org) did not reveal any gross neurostructural changes in PL/J mice that might explain the seizure activity.

                                              *
H. sapiens        283 VIEP---IKDNDAAIRNYGIELAVSLCQEL
M. musculus (B6)  282 VIEP---IKDNDAAIRNYGIELAVRLCREL
M. musculus (PL)  282 VIEP---IKDNDAAIRNYGIELAVSLCREL
C. elegans        296 DLEP---IKHDDDAVQKYGTERCIEMCRRL
S. cerevisiae     245 RFPP-E-IQSDDNAVKSIGVDILIELIQEI
S. cerevisiae     243 RLDP---IKDDDELVRDIGTNLIVEMCQKL
E. coli           238 MFDG---LDDDAETRKLVGANIAMDMVKIL

Figure 2.1 Amino acid sequence alignment of Mthfr in five model organisms. Star denotes position 303 in human sequence where PL/J sequence differs from C57BL/6J (B6) sequence.

Figure 2.2 X-ray photograph of a kinky tail in a PL/J mouse. The vertebra at the kink is short and round with one side longer than the other.

Genetic analysis of seizure in crosses with C57BL/6J

Although spontaneous seizures can be studied with constant video monitoring, we induced seizures in PL/J mice using a rhythmic tossing protocol similar to that of Legare et al. (2000). Seizure behavior was similar between rhythmic tossing induced seizures and spontaneous seizures observed during routine handling of cages. We crossed PL/J to the rhythmic tossing seizure resistant strain C57BL/6J to study the inheritance of seizure phenotypes. To study the onset and frequency of seizure, we performed rhythmic tossing tests every three days beginning when mice were six weeks old for a total of 20 tests. All seven PL/J mice were susceptible to tonic-clonic seizure, with a high frequency of 9 to 14 seizure episodes per mouse per 20 tests (Figure 2.3). As expected, C57BL/6J did not show seizures. Thirty percent (12/40) of the F1 progeny between PL/J and C57BL/6J showed seizures, mostly tonic-clonic seizures (10/40) and some partial seizures (2/40). F1 mice seized at most once over 20 tests. Significant differences in seizure frequency were not observed between the reciprocal crosses, suggesting that parental effects did not contribute significantly to the phenotype. Twenty-five percent (9/36) of F2 progeny experienced seizures, with a frequency of one to four seizure episodes over 20 tests. About half of the seizures were tonic-clonic seizures and the remainder were partial seizures. Significant differences were not found between males and females for the seizure frequencies of PL/J, F1, and F2 mice.

The first seizure typically occurred between the fifth and seventh tests for PL/J but was more variable (second to twentieth tests) for F1 mice (Levene’s test for variance, p < 0.003), with no significant difference between reciprocal crosses. Onset in the F2 was between the second and fifteenth tests, which is smaller in range compared to the onset in F1 mice (Levene’s test for variance, p < 0.002). When only tonic-clonic seizures were analyzed, the onset of seizures in F1 and F2 mice was always later than PL/J onset, with onset in F1 mice occurring between the eighth and twentieth tests and F2 onset between the seventh and thirteenth tests. Significant differences in seizure onset were not evident between males and females.

Genetic analysis of seizure with DBA/2J

C57BL/6J is resistant to seizures across a wide variety of stimuli. By contrast, DBA/2J is resistant to rhythmic tossing induced seizures (Rise et al. 1991) but is susceptible to audiogenic seizures (Seyfried et al. 1980), electroconvulsant (Frankel et al. 2001), and chemoconvulsant (Ferraro et al. 1999). We crossed PL/J with DBA/2J to test whether alleles from DBA/2J modulate the seizure phenotype in PL/J mice. None of the 16 DBA/2J mice showed seizures in our assays. Similar to the cross between PL/J and C57BL/6J, 32% of F1 mice with DBA/2J showed seizures. However, unlike seizures in (PL/J x C57BL/6J)F1 mice, (PL/J x DBA/2J)F1 mice showed only tonic-clonic seizures, and the seizure frequency ranged from 0 to as high as 7 seizure episodes over 20 tests. The seizure onset occurred between the eleventh and nineteenth tests, which is significantly smaller in range compared to (PL/J x C57BL/6J)F1 mice (Levene’s test, p < 0.003). Significant differences were not found between the reciprocal crosses or between males and females for frequency and onset of seizure.


Figure 2.3 Number of tests with seizures in parental (top panel), F1 (middle panel), and F2 (bottom panel) mice out of 20 rhythmic tossing seizure tests. Numbers in parenthesis indicate the number of mice tested.

Discussion

Disorders of homocysteine and folate metabolism have been associated with cardiovascular disease (Homocysteine Studies Collaboration 2002), neural tube defects (MRC Vitamin Study Research Group 1991, Czeizel and Dudas 1992), cancer (Giovannucci et al. 1998, Kato et al. 1999, Su and Arab 2001), and neurodegenerative disorders (Seshadri et al. 2002). The most common genetic mutation in these pathways resides in methylenetetrahydrofolate reductase (Mthfr). The mutation, 677TT, occurs at high frequency in the European population (Pepe et al. 1998) and results in 35% of the wildtype enzyme activity (Frosst et al. 1995). The 677TT mutation is associated with increased risk for cardiovascular disease (Wald et al. 2002, Klerk et al. 2002) as well as neural tube defects (van der Put et al. 1995, Ou et al. 1996, De Marco et al. 2002) and decreased risk for colon cancer (Chen et al. 1996, Ma et al. 1997). To identify a mouse model for reduced MTHFR activity, we surveyed 21 mouse strains and found the lowest enzyme activity in the PL/J strain. PL/J showed 38% of the C57BL/6J activity, which was similar to the 35% observed with the human 677TT mutation (Frosst et al. 1995). PL/J also showed seizures and tail kinks, neurological and developmental defects observed in the mouse knockout model (Chen et al. 2001), with comparable phenotypes in humans (Rosenblatt and Fenton 2001).

To identify the genetic mutation responsible for low MTHFR activity, we sequenced the coding region as well as the 5’ UTR in PL/J, DBA/2J, and C57BL/6J. There was no mutation in these regions that accounted for the low MTHFR activity, suggesting cis- or trans-acting mutations in PL/J. Our preliminary analysis of Mthfr transcript levels in liver and brain of six inbred strains showed a significantly lower transcript level in PL/J compared to other strains, which had higher enzyme activity. However, there was no correlation between enzyme activity and transcript level across the six strains, suggesting that control of enzyme activity cannot be accounted for solely by the transcript level. QTL analysis of glycolytic enzyme activity in Arabidopsis showed that QTLs responsible for each enzyme activity sometimes mapped to the locus of each gene but also mapped to other loci, suggesting both cis- and trans-acting mutations for many enzymes (Mitchell-Olds and Pedersen 1998). Also, some of the trans-acting QTLs for several distinct enzymes mapped to the same trans-acting control locus (Mitchell-Olds and Pedersen 1998), suggesting that a single mutation may have pleiotropic effects on various enzymes in the same pathway. QTL studies between PL/J and C57BL/6J using MTHFR activity as a trait should reveal the genetic control of this key enzyme.

Before studying the association between low MTHFR enzyme activity and the kinky tail and seizure phenotypes, we first characterized the phenotypes and their inheritance. Kinky tail was the only bone defect in PL/J and showed recessive inheritance. However, we did not observe any kinky tail phenotype in a large number of backcross progeny, suggesting that this polygenic trait is not suitable for a simple QTL study using traditional F2 crosses. Seizures in PL/J were induced by systematic rhythmic tossing and showed characteristics of tonic-clonic seizures. PL/J showed no overt neurostructural abnormalities. PL/J seizures are accompanied by abnormal EEG, and PL/J shows a low threshold to a variety of stimuli, including electroconvulsant and chemoconvulsant in addition to rhythmic tossing (Kitami et al. 2004).

Genetic analysis of PL/J crossed to seizure resistant C57BL/6J indicated that the rhythmic tossing-induced seizure trait follows a semi-dominant inheritance with low penetrance. Taking 30% penetrance in F1 mice and 100% penetrance in PL/J, we expected 40% (= 0.25 x 100% + 0.5 x 30%) of F2 mice to show a seizure phenotype under a single-gene Mendelian segregation model. However, we found a lower penetrance of 25% in F2 mice, suggesting that the trait is multigenic. The wider distribution in the number of trials before seizure onset in F1 mice, which are all genetically identical, compared to the F2, which are genetically variable, indicates that genetics plays a minor role in determining seizure onset, which instead depends more on stochastic or uncontrolled environmental factors.
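The expected F2 penetrance above follows from weighting the penetrance of each genotype class by the 1:2:1 Mendelian F2 ratio. A minimal sketch of that arithmetic is given below; the function and argument names are illustrative, and the assumption that mice homozygous for the C57BL/6J allele never seize is taken from the absence of seizures in C57BL/6J.

```python
def expected_f2_penetrance(pen_homozygous, pen_heterozygous, pen_resistant=0.0):
    """Expected fraction of affected F2 mice under a single-gene model.

    F2 genotypes segregate 1:2:1, so the three penetrance values are
    weighted by 0.25, 0.5, and 0.25 respectively.
    """
    return 0.25 * pen_homozygous + 0.5 * pen_heterozygous + 0.25 * pen_resistant


# Values reported in the text: 100% in PL/J, 30% (12/40) in F1, 9/36 in F2.
expected = expected_f2_penetrance(pen_homozygous=1.0, pen_heterozygous=0.30)
observed = 9 / 36
print(f"expected {expected:.0%}, observed {observed:.0%}")
# prints: expected 40%, observed 25%
```

The gap between the expected 40% and the observed 25% is the basis for rejecting the single-gene model in favor of a multigenic one.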

The seizure frequency over 20 tests in F2 mice was never as high as the frequency in PL/J. However, F2 progeny showed a wider variation in seizure frequency compared to F1 mice, suggesting a genetic contribution to the frequency phenotype. Although QTL studies for seizure frequency are possible (Rise et al. 1991, Frankel et al. 1994, Seyfried et al. 1980), recent QTL analysis of epilepsy in EL mice (Legare et al. 2000) indicated that significant QTL peaks are caused in some cases by multiple closely linked genes, each with individually small effects. Given the low penetrance, genetic complexity, and non-genetic factors that contribute to the seizure frequency trait, co-segregation analysis of MTHFR enzyme activity and seizure frequency using (PL/J x C57BL/6J)F2 mice would be difficult. An F2 cross is not ideal for dissecting traits with complex genetics or traits with a strong environmental component. Traits with complex genetics can be studied using chromosome substitution strains (CSS) (Nadeau et al. 2000, Singer et al. 2004) or recombinant congenic strains (RCS) (Demant and Hart 1986, Fijneman et al. 1996). Traits with a strong environmental component can also be studied with recombinant inbred strains (Bailey 1981). Unfortunately, PL/J is not available as part of these specialized resources.

When we crossed PL/J to the rhythmic tossing seizure resistant strain DBA/2J, we found that the penetrance of seizure in the resulting F1 did not differ significantly from the penetrance in (PL/J x C57BL/6J)F1 mice. However, the seizure frequency as well as the seizure severity was increased in (PL/J x DBA/2J)F1 mice compared to (PL/J x C57BL/6J)F1 mice, although not nearly to the level of parental PL/J. Only one copy of the DBA/2J allele is necessary for the enhanced seizure phenotype, suggesting that the modifier trait is inherited in a dominant manner. Although DBA/2J is resistant to rhythmic tossing seizures (Rise et al. 1991, and present study), it is susceptible to audiogenic seizures (Seyfried et al. 1980), electroconvulsant (Frankel et al. 2001), and chemoconvulsant (Ferraro et al. 1999). Genes involved in seizure susceptibility in DBA/2J are likely to be responsible for this increased seizure frequency in F1. PL/J could be crossed to congenic strains for DBA/2J audiogenic seizure QTL on a C57BL/6J background (Neumann and Seyfried 1990) or to C57BL/6J x DBA/2J (BXD) recombinant inbred (RI) strains to generate F1 animals for QTL analysis of dominant modifiers of PL/J seizure susceptibility. This approach is more practical than conventional QTL mapping studies of (PL/J x C57BL/6J)F2 because F1 mice are genetically identical to each other, such that crosses between PL/J and BXD RI strains will preserve the advantages of RI strains in mapping seizure QTL.

Other rhythmic tossing seizure susceptible strains, such as EL (Rise et al. 1991) and SWXL-4 (Frankel et al. 1994), have been used to study the genetics and pathology of seizures. PL/J mice have a seizure onset similar to that of these two strains (50 to 60 days). In addition, the average frequency of seizure in PL/J (12.4%) falls between SWXL-4 (9.0%) and EL (15.6%). Although studies of PL/J seizure are limited by complex genetics and strong environmental contributions to the trait, the PL/J mouse strain represents an important new model that should be considered in determining the efficacy of anticonvulsants (Southam et al. 2002) as well as in studying the molecular pathology of idiopathic epilepsy (Drage et al. 2002, Fueta et al. 2003).

Acknowledgments

We thank Dr. Karl Herrup for helpful suggestions and review of this manuscript, Dr. Kingman Strohl for the EEG equipment and suggestions, and Barbara Beyer for technical assistance. This work was supported by NIH grants HL58982 to J.H. Nadeau and NS31348 to W.N. Frankel. L. Gallaugher and L. Friedman were supported by HL64278.


CHAPTER 3

Gene duplication, metabolic networking, and genetic buffering in physiological pathways of human and mice

Authors: Toshimori Kitami and Joseph H Nadeau

References: Nature Genetics 32:191-194. 2002. Erratum in: Nature Genetics 32:681. 2004.

Abstract

During vertebrate evolution, the size of gene families and the complexity of gene

interactions have increased dramatically. Gene duplications and redundant networks

buffer developmental and physiological systems from genetic perturbations and enable

variation in the patterns of DNA sequence changes and functional diversification. We

assessed the relative importance of gene families and redundant networks to patterns of

genetic variation by comparing nonsynonymous versus synonymous substitution rates of

241 orthologous genes in various metabolic pathways in humans and mice. We found

that neither gene families nor redundant networks contribute significantly to variation in

gene evolution rate. Instead, we found that protein interaction, tissue expression, and

metabolic function of the enzyme contribute significantly to the gene evolution rate.

These results suggest that higher order structure of across tissues and

their biological function contribute to diversification of metabolic pathways rather than

the topology and gene copy number of biochemical networks.

Introduction

The biological complexity in vertebrate lineages has been attributed to an increase in the number of new genes (Ohno 1970) and more recently to an increase in complexity of biological networks (Bhalla and Iyengar 1999). In many cases, new genes arise by gene duplication in which one copy acquires a new beneficial function and becomes preserved by natural selection while the other copy maintains the original function (Ohno 1970, Walsh 1995, Force 1999). In other cases, both copies undergo complementary degenerative mutation in which the total functional capacity is reduced to the level of the single-copy ancestral gene (Force 1999). However, the unexpectedly low number of genes discovered in the human genome (Lander et al. 2001, Venter et al. 2001) suggests that complexity of biological networks may also play an important role in vertebrate evolution.

A functional consequence of complex biological networks is robustness against genetic mutation (or genetic buffering) (Dipple et al. 2001, Lenski et al. 1999) in which interactions between genes of unrelated biochemical functions minimize the deleterious effects of a genetic mutation. For example, metabolic networks connect different metabolites using enzymes of distinct biochemical function and create network connectivity that is resistant to random attacks (Jeong et al. 2000). Protein interaction networks (Jeong et al. 2001, Giot et al. 2003, Li et al. 2004, Han et al. 2004), transcriptional networks (Guelzim et al. 2002, Luscombe et al. 2004), and genetic interaction networks (Tong et al. 2004) also show functional resilience. Gene duplication also provides genetic buffering as observed in model organisms (Thomas 1993, Pickett and Meeks-Wagner 1995). Overlapping functions among gene duplicates provide opportunities to complement spontaneous or engineered mutations in gene family members. Therefore, the two mechanisms implicated in increased biological complexity may also provide mechanisms for genetic buffering.

Recently, Wagner (2000) investigated the role of gene duplication in genetic buffering and showed that, on a genome-wide scale, duplicated genes in yeast contribute no more to genetic buffering than single-copy genes. These results suggest that functional interactions among non-paralogous genes (or biological networks) are the major cause of buffering against mutations. This implies that genes with networks tolerate more mutation, as a result of genetic buffering, and therefore evolve at a faster (more neutral) rate than genes without networks, which are subject to purifying selection.

The rate of gene evolution provides clues to genetic buffering because genes that are buffered from genetic mutation are likely to evolve faster. Hirsh and Fraser (2001) showed that the rate of gene evolution is correlated with the fitness effect of a mutation. Genes that evolve faster have a smaller effect on fitness when mutated than genes that evolve more slowly. Therefore, if gene duplication buffers against genetic mutation, genes with duplicate copies should evolve faster than genes with one copy (Wilson et al. 1977). By contrast, if networks provide genetic buffering, we would expect genes in networks to evolve faster than genes that do not belong to a network.

To assess the relative importance of gene duplication and redundant networks in metabolic pathways, we compared the DNA sequences of 241 human and mouse orthologous genes for nonsynonymous substitutions per nonsynonymous site (Ka) and synonymous substitutions per synonymous site (Ks). Because a synonymous change does not alter the amino acid sequence, Ks is assumed to reflect the neutral mutation rate between two sequences. A smaller Ka/Ks value implies a slower rate of evolution, whereas a larger value implies a faster rate of evolution.

The recent increase in our understanding of different pathways and the availability of genomic sequences allow us to study evolution rates in the context of biological pathways or networks. Many known pathways involved in signal transduction, metabolism, and transcription could be used for such a study. We used metabolic pathways for the analysis of networks because their structure and composition are well defined (Michal 1999) and many of the gene sequences that constitute these pathways are known in both human and mouse. We defined networks based only on the topology of biochemical reactions (Reder 1988). A redundant network provides more than one route to convert one metabolite to another and therefore can compensate for a genetic defect in one route.

We found that neither redundant networks nor gene duplication significantly influences gene evolution rate. However, quaternary protein structure, tissue requirement for the enzyme, and metabolic function strongly influence gene evolution rate. These results suggest that higher order structure of metabolic networks across tissues and their biological function contribute to diversification of metabolic pathways more than the topology and gene copy number of biochemical networks.

Materials and Methods

Pathway Structure We obtained metabolic pathways and the direction of reactions from Michal (1999). We obtained gene symbols for each enzyme using the Enzyme Commission (EC) number on the pathway display of the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al. 2002).

Human and mouse orthologous sequences We obtained GenBank accession numbers for human and mouse sequences from GeneCards (Rebhan et al. 1998) using gene names from KEGG as the query. We established the orthology of human and mouse sequences using the mouse-to-human conserved syntenic relations whenever mapping information or genome assembly positions were available for both species (Lander et al. 2001, Venter et al. 2001, Waterston et al. 2002). For genes with unknown conserved syntenic relations, we established orthology by reciprocal best hits (Rivera et al. 1998); that is, when the best mouse BLAST hit for a human query sequence is used as a query against the human nr database, its best hit must be the original human sequence. All of the human sequences had corresponding publications or citations that verify the functionality of the gene sequence, which excluded pseudogenes. Moreover, Ka/Ks values of less than one (see below) suggest that pseudogenes were not included in the dataset because pseudogenes are expected to evolve neutrally (Ka/Ks = 1) (Li 1999).
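The reciprocal-best-hit criterion described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used in the study: the two dictionaries stand in for BLAST results parsed elsewhere, and the function name and all gene identifiers are hypothetical.

```python
def reciprocal_best_hits(best_mouse_hit, best_human_hit):
    """Return (human, mouse) pairs that are each other's best BLAST hit.

    best_mouse_hit: maps each human query to its top-scoring mouse hit.
    best_human_hit: maps each mouse query to its top-scoring human hit.
    Both mappings are assumed to have been parsed from BLAST output elsewhere.
    """
    pairs = []
    for human_id, mouse_id in best_mouse_hit.items():
        # Keep the pair only if the mouse hit points back to the same human query.
        if best_human_hit.get(mouse_id) == human_id:
            pairs.append((human_id, mouse_id))
    return sorted(pairs)


# Hypothetical identifiers, for illustration only.
human_to_mouse = {"hsGeneA": "mmGeneA", "hsGeneB": "mmGeneC"}
mouse_to_human = {"mmGeneA": "hsGeneA", "mmGeneC": "hsGeneD"}
orthologs = reciprocal_best_hits(human_to_mouse, mouse_to_human)
```

In this toy input only hsGeneA and mmGeneA point to each other, so hsGeneB would be left without an assigned ortholog.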

Sequence analysis We aligned coding sequences of human and mouse orthologs using CLUSTALW (Thompson 1994). We estimated the synonymous substitutions per synonymous site (Ks) and the nonsynonymous substitutions per nonsynonymous site (Ka) using the modified Nei-Gojobori method (Nei and Gojobori 1986). The ratio of transitions to transversions (R) was estimated according to Kimura (1980). Ka, Ks, and R values were computed using Mega2.2 software (Kumar et al. 2001).
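As a rough illustration of the final step of a Nei-Gojobori style estimate, the sketch below converts observed proportions of synonymous and nonsynonymous differences into Ks and Ka with the Jukes-Cantor correction and then forms the ratio. It assumes that the numbers of sites and differences have already been counted from a codon alignment (the step in which the modified method uses R), and the function names and counts are hypothetical; it is not a substitute for the software used in the study.

```python
import math


def jukes_cantor(p):
    """Correct an observed proportion of differences p for multiple hits.

    Uses d = -(3/4) * ln(1 - 4p/3), which is defined only for p < 0.75.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (Ka, Ks, Ka/Ks) from pre-counted differences and sites."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    ratio = ka / ks if ks > 0 else float("nan")
    return ka, ks, ratio


# Hypothetical counts: 30 nonsynonymous differences over 600 sites and
# 60 synonymous differences over 200 sites.
ka, ks, ratio = ka_ks(30, 600, 60, 200)
```

With these toy counts the ratio is well below one, the pattern expected for a gene under purifying selection.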

Codon bias estimate We estimated codon bias using the ‘effective number of codons used in a gene’ (Nc) (Wright 1990) because the algorithm is reliable regardless of coding region size (Comeron and Aguade 1998). Nc ranges from 20 (the minimum number of codons needed for all 20 amino acids) to 61 (the maximum number of codons). An Nc closer to 20 indicates unequal codon usage, or codon bias. The procedure was performed for both human and mouse sequences. If there is a codon bias, we would expect a difference in Nc distribution between the groups compared.
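The calculation behind Nc can be sketched as follows. This is a simplified reading of Wright's estimator under the standard genetic code, written for illustration only: the function name is hypothetical, degeneracy classes with no usable amino acids raise an error instead of being interpolated, and a non-positive class average is treated as no detectable bias. For each amino acid with n observed codons the homozygosity is F = (n * sum(p_i^2) - 1) / (n - 1), F is averaged within each degeneracy class, and Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, capped at 61.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Number of amino acids in each synonymous degeneracy class.
CLASS_WEIGHTS = {2: 9, 3: 1, 4: 5, 6: 3}


def effective_number_of_codons(cds):
    """Estimate Nc for an in-frame coding sequence (simplified sketch)."""
    usable = len(cds) - len(cds) % 3
    counts = defaultdict(Counter)
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        aa = CODON_TABLE.get(codon)
        if aa is not None and aa != "*":
            counts[aa][codon] += 1

    degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
    homozygosity = defaultdict(list)
    for aa, codon_counts in counts.items():
        k = degeneracy[aa]
        n = sum(codon_counts.values())
        if k == 1 or n < 2:
            continue  # Met and Trp are covered by the constant 2; n < 2 is uninformative
        sum_p2 = sum((c / n) ** 2 for c in codon_counts.values())
        homozygosity[k].append((n * sum_p2 - 1) / (n - 1))

    nc = 2.0
    for k, weight in CLASS_WEIGHTS.items():
        if not homozygosity[k]:
            raise ValueError(f"no usable amino acids in degeneracy class {k}")
        mean_f = sum(homozygosity[k]) / len(homozygosity[k])
        if mean_f <= 0:
            return 61.0  # simplification: treat as no detectable bias
        nc += weight / mean_f
    return min(nc, 61.0)


# Toy check: a sequence that uses a single codon per amino acid is maximally biased.
one_codon_per_aa = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        one_codon_per_aa.setdefault(aa, codon)
biased_cds = "".join(codon * 3 for codon in one_codon_per_aa.values())
nc_biased = effective_number_of_codons(biased_cds)
```

The toy sequence sits at the lower bound of the range, while a sequence that uses all 61 sense codons equally reaches the cap of 61.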

Duplicated genes We searched for homologs within each species in the nr protein database (NCBI) using BLASTP. Homologous sequences with an alignment score (E-value) below 10^-10 for the entire sequence were identified as potential gene duplicates (Lynch and Conery 2000). To avoid the inclusion of genes with shared motifs but different biochemical function, we excluded genes that did not have the same EC number annotation as the query sequence. This allowed us to include only genes with significant sequence identity and shared biochemical function as duplicated genes.
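The two filters described above (a score cutoff and a shared EC number) can be expressed as a short sketch. It assumes that BLASTP hits have already been parsed into tuples and that the score in question is the BLAST E-value; the function name, tuple layout, and identifiers are hypothetical.

```python
def candidate_duplicates(query_id, query_ec, hits, max_evalue=1e-10):
    """Return subject IDs that pass both the score and the EC-number filter.

    hits: iterable of (subject_id, evalue, ec_number) tuples, assumed to have
    been parsed from within-species BLASTP output elsewhere.
    """
    return [
        subject_id
        for subject_id, evalue, ec_number in hits
        if subject_id != query_id       # ignore the self-hit
        and evalue < max_evalue         # significant similarity
        and ec_number == query_ec       # same annotated biochemical function
    ]


# Hypothetical hits for a query annotated with EC 1.1.1.27.
toy_hits = [
    ("queryGene", 0.0, "1.1.1.27"),      # the query itself
    ("paralogA", 1e-50, "1.1.1.27"),     # similar sequence, same EC number
    ("sharedMotif", 1e-20, "1.1.1.37"),  # similar sequence, different EC number
    ("weakHit", 1e-3, "1.1.1.27"),       # same EC number, weak similarity
]
duplicates = candidate_duplicates("queryGene", "1.1.1.27", toy_hits)
```

Only the hit that is both strongly similar and identically annotated is retained, which mirrors the intent of excluding genes that merely share a motif.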

Redundant network and reversibility We used a simple definition of redundant network based on the structure of metabolic reactions (Reder 1988), taking into account the reversibility of reactions defined by Michal (1999). A gene has a non-redundant network (gene X, Fig. 3.1a) if it is the only enzyme, regardless of its copy number, that carries out the metabolic conversion. For example, pyruvate is converted to lactate only by lactate dehydrogenase. Although there are at least three copies of lactate dehydrogenase in humans and mice, they are duplicate copies and are not networked. A gene has a redundant network (gene X, Fig. 3.1b, c) if other sets of genes (such as genes Y and Z converting a to c, then c to b) can carry out the same metabolic conversion. For example, glucose phosphate isomerase converts glucose 6-phosphate to fructose 6-phosphate. However, glucose 6-phosphate is also converted to glucose, then to sorbitol, fructose, and finally fructose 6-phosphate by four enzymatic steps. Therefore, glucose phosphate isomerase has a redundant network. We did not include common cofactors such as ATP, ADP, NAD, or NADH in our network definition because they are used ubiquitously.

The conversion of glucose 6-phosphate to fructose 6-phosphate by glucose phosphate isomerase is reversible, as is the conversion by the four enzymatic steps (hexokinase/glucokinase, aldehyde reductase, iditol dehydrogenase, and hexokinase). This reversibility (Fig. 3.1c) makes glucose phosphate isomerase a gene with a bidirectional redundant network. However, if the redundant network is not reversible, then it is a unidirectional redundant network (Fig. 3.1b). If a gene has multiple enzymatic functions and belongs to two network categories, we assigned both categories to the gene by counting the gene twice. Only 10 of the 241 genes in our dataset were shared between categories in this way.
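One way to read the three network categories is as a reachability question on a directed graph of metabolites. The sketch below is an interpretation under stated assumptions rather than the procedure used here: reactions are given as hypothetical (gene, substrate, product, reversible) tuples, all reactions catalyzed by the gene of interest are removed (so that duplicate copies do not count as a network), and the remaining graph is searched for an alternative route in each direction.

```python
from collections import defaultdict, deque


def build_graph(reactions, skip_gene):
    """Directed metabolite graph built from every reaction not catalyzed by skip_gene."""
    graph = defaultdict(set)
    for gene, substrate, product, reversible in reactions:
        if gene == skip_gene:
            continue
        graph[substrate].add(product)
        if reversible:
            graph[product].add(substrate)
    return graph


def reachable(graph, start, goal):
    """Breadth-first search for a route from start to goal."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def classify_network(reactions, gene):
    """Classify the first reaction listed for `gene` by its alternative routes."""
    substrate, product = next((s, p) for g, s, p, _ in reactions if g == gene)
    graph = build_graph(reactions, gene)
    forward = reachable(graph, substrate, product)
    backward = reachable(graph, product, substrate)
    if forward and backward:
        return "bidirectional redundant"
    if forward or backward:
        return "unidirectional redundant"
    return "non-redundant"


# Toy pathways using the gene (X, Y, Z) and metabolite (a, b, c) labels from the text.
only_x = [("X", "a", "b", True)]
one_way = [("X", "a", "b", True), ("Y", "a", "c", False), ("Z", "c", "b", False)]
two_way = [("X", "a", "b", True), ("Y", "a", "c", True), ("Z", "c", "b", True)]
labels = [classify_network(r, "X") for r in (only_x, one_way, two_way)]
```

A gene with several enzymatic functions would need one call per reaction; the sketch examines only the first reaction listed for the gene.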

Flux analysis We used published data from extreme pathway analysis of human red blood cell metabolism (Wiback and Palsson 2002) and counted the number of times each enzyme appeared in the different steady-state flux maps.

Tissue gene expression We obtained published gene expression data for 79 human and 61 mouse tissues, each assayed in duplicate, from http://symatlas.gnf.org (Su et al. 2004). We used an average difference value of 200 from the Affymetrix GENECHIP software package as the threshold for presence or absence of gene expression. This conservative threshold was used previously and corresponds to three to five copies per cell (Su et al. 2002).
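The presence call described above amounts to counting, for each gene, the samples whose average difference value exceeds the threshold. A minimal sketch, assuming the values for one gene have been loaded into a list (the function name and numbers are hypothetical, and treating a value of exactly 200 as absent is an assumption):

```python
def count_expressing_samples(average_difference, threshold=200):
    """Count samples in which a gene is called present (value above threshold)."""
    return sum(1 for value in average_difference if value > threshold)


# Hypothetical average difference values for one gene across five samples.
toy_values = [50, 250, 199, 200, 1200]
n_present = count_expressing_samples(toy_values)
```

Because each tissue was assayed in duplicate, a count made this way is a number of samples rather than a number of distinct tissues.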

Structural constraints and OMIM data We retrieved tertiary structure and cellular location of proteins from the SWISS-PROT database (Bairoch and Apweiler 2000).

Statistical methods We first used the Kolmogorov-Smirnov test to determine whether data from each analysis group were normally distributed. For comparison of two groups, we used the unpaired t-test for normally distributed data and the Mann-Whitney U test for non-normal data. For comparison of more than two groups, we used ANOVA for normally distributed data and the Kruskal-Wallis test for non-normal data to establish statistical difference between groups. To identify the group pairs that show significant difference, we used Tukey’s multiple comparison test for normally distributed data and Dunn’s multiple comparison test for non-normal data. For correlation analysis, we performed both Pearson’s correlation (normal data) and Spearman’s rank correlation (non-normal data) because we could not establish whether the data followed a bivariate normal distribution. All of the statistical analyses were performed using Prism 3.0 for Windows (GraphPad Software Inc.).
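The choice of test described above follows a small decision rule, which can be written down explicitly. The helper below is a hypothetical summary of that rule and performs no statistics itself; the normality decision is assumed to come from a separate Kolmogorov-Smirnov test.

```python
def choose_tests(n_groups, all_normal):
    """Name the comparison and post hoc tests implied by the rule in the text.

    n_groups: number of groups being compared (2 or more).
    all_normal: True if every group passed the normality test.
    """
    if n_groups < 2:
        raise ValueError("at least two groups are needed for a comparison")
    if n_groups == 2:
        comparison = "unpaired t-test" if all_normal else "Mann-Whitney U test"
        return {"comparison": comparison, "post_hoc": None}
    if all_normal:
        return {"comparison": "ANOVA", "post_hoc": "Tukey's multiple comparison test"}
    return {"comparison": "Kruskal-Wallis test", "post_hoc": "Dunn's multiple comparison test"}


# Example: three groups, at least one of which is not normally distributed.
plan = choose_tests(3, all_normal=False)
```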


Figure 3.1 Schematic for defining networks based on the structure of metabolic pathways. Arrows (edges) represent the direction of an enzymatic reaction; italicized upper case letters represent genes catalyzing each reaction; and the circles (nodes) and lower case letters represent intermediate metabolites.

Results

Pattern of Ka/Ks variation

Based on published metabolic pathways (Michal 1999), we identified genes that code for enzymes involved in general metabolism of small molecules. We identified human and mouse orthologous sequence pairs for 74% (241/325) of the genes surveyed. Depending on the pathway, 63 to 90 percent of orthologous gene pairs were identified (Table 3.1). The Ka and Ks values ranged from 0 to 0.237 and from 0.203 to 0.728, respectively. These Ka and Ks ranges fell within the ranges reported by Makalowski and Boguski (1998) in their comparison of 1138 human-mouse orthologs. The average Ka/Ks value for the 241 genes was 0.146, which is slightly lower than the Ka/Ks value of 0.196 from Makalowski and Boguski (1998). The median Ka/Ks value for the 241 genes was 0.128, which is slightly higher than the median Ka/Ks value of 0.115 (mean value not provided) for 12,845 orthologous gene pairs from the mouse genome sequence (Waterston et al. 2002). The overall distribution of Ka/Ks values deviated significantly from the normal distribution based on the Kolmogorov-Smirnov test (0.111, p < 0.006) and was skewed towards lower Ka/Ks values (Figure 3.2).

Gene duplicates

To test whether gene families evolve faster than single copy genes, we compared the Ka/Ks distributions of the two groups. The Ka/Ks distribution of each group, considered separately, did not deviate significantly from normality (Kolmogorov-Smirnov test; gene families 0.130, p > 0.085, n = 93; single copy genes 0.111, p > 0.052, n = 148). We found that gene families (mean Ka/Ks = 0.135; n = 93) had Ka/Ks values similar to those of single copy genes (mean Ka/Ks = 0.152; n = 148) (one tailed t-test; 1.49, p > 0.068). We conclude that genes with multiple copies do not evolve faster than single copy genes.

Networks

We categorized genes into non-redundant network, unidirectional redundant network, and bidirectional redundant network (Fig. 3.1) to test whether genes with alternative metabolic routes evolve faster than those without alternative routes. Because of incomplete information, we did not take into account the tissue specificity, cellular location, and compartmentation of these enzymes. Although we know some of the gene expression patterns and enzyme locations for given pathways in normal cells, little is known about how these patterns change after genetic and environmental perturbations.

We found that genes with a non-redundant network (mean Ka/Ks = 0.150; n = 174) evolved at a similar rate (Kruskal-Wallis test; 3.51, p > 0.173) as genes from the two categories of redundant network (unidirectional network: mean Ka/Ks = 0.124; n = 33, bidirectional network: mean Ka/Ks = 0.138; n = 43). We conclude that genes with a redundant network evolve at a similar rate as genes without a redundant network.

Table 3.1 Average Ka/Ks values for orthologous genes in various metabolic pathways in humans and mice

Pathway                                                  Average    # bidirectional      # analyzed /
                                                         Ka/Ks      redundant network    # screened (%)c
metabolism of cholesterol & derivatives                  0.263      1                    16 / 19 (84.2)
fatty acid & triacylglycerol metabolism                  0.192      3                    26 / 41 (63.4)
hemoglobin biosynthesis                                  0.169      0                    8 / 9 (88.9)
pyrimidine metabolism                                    0.168a     1                    14 / 20 (70.0)
phenylalanine, tyrosine, tryptophan metabolism           0.159a     0                    17 / 19 (89.5)
branched chain amino acid metabolism                     0.151a     0                    11 / 14 (78.6)
folate, homocysteine, betaine, glutathione metabolism    0.138a     3                    28 / 37 (75.7)
urea, aspartate, glutamate metabolism                    0.128a     5                    27 / 37 (73.0)
purine metabolism                                        0.120a, b  11                   36 / 51 (70.6)
glycolysis & gluconeogenesis                             0.108a, b  19                   41 / 55 (74.5)
Krebs cycle                                              0.104a, b  0                    17 / 23 (73.9)

The Ka/Ks values between the pathways differed significantly from each other (ANOVA F=6.610, p<0.0001).
a The Ka/Ks values for genes in the metabolism of cholesterol and derivatives differed significantly from the Ka/Ks values for genes in the indicated pathways at p<0.05 (pyrimidine, phenylalanine, branched chain amino acid) and p<0.001 (the rest of the pathways).
b The Ka/Ks values for fatty acid and triacylglycerol metabolism differed significantly from the Ka/Ks values for genes in the indicated pathways at p<0.05 (purine, Krebs) and p<0.01 (glycolysis).
c The number of genes screened refers to the number of genes with available human sequence for each pathway. The number analyzed refers to the number of screened genes with mouse orthologous sequences available.


Figure 3.2 Distribution of Ka/Ks values for 241 genes in various metabolic pathways. Numbers in the lower table refer to the number of genes in each pathway belonging to each Ka/Ks bin.

Metabolic flux

While the topology of enzyme reactions allows us to predict whether an alternative metabolic route can be found in the absence of a particular enzyme, extreme pathway analysis can predict the number of times each enzyme participates in distinct sets of steady-state metabolic flux within a given cell type (Papin et al. 2004). This allows us to identify enzymes that are more frequently used across various conditions and therefore are more likely to be important for the organism. We used data from the metabolic pathway in human red blood cell (Wiback and Palsson 2002), the only available data, to identify the number of times a given enzyme participates in all possible sets of metabolic paths. These paths include glycolysis, the pentose phosphate pathway, and adenosine nucleotide metabolism. For the set of 34 enzymes for which data were available, the number of reactions for a given enzyme did not correlate with Ka/Ks value (one tailed Pearson’s correlation: r = -0.2736, p = 0.0617; one tailed Spearman’s rank correlation: r = -0.1475, p > 0.20). We conclude that the number of metabolic flux reactions also does not significantly affect gene evolution rate.

Rate of evolution by metabolic function

We examined the rate of gene evolution based on pathway function (Michal 1999). The number of networks present in each metabolic pathway varies depending on the function of the pathway. We found that genes in cholesterol, fatty acid, and triacylglycerol metabolism evolved at a faster rate than genes from the other pathways (Table 3.1). This was most pronounced when these pathways were compared to the two slowest evolving pathways: glycolysis / gluconeogenesis and Krebs cycle. There was no significant difference in codon bias (Nc) between functional groups for either human (ANOVA F=1.191, p>0.29) or mouse sequences (ANOVA F=1.775, p>0.06) that could explain the difference in gene evolution rate. Surprisingly, pathways with a high occurrence of redundant networks (glycolysis and gluconeogenesis, and purine metabolism) were among the slower evolving pathways.

Structural constraints on gene evolution rate

We assessed the impact of structural constraints on Ka/Ks variation. We found that genes for transmembrane proteins (mean Ka/Ks = 0.145, n = 16), which are hydrophobic, showed a rate of evolution similar to that of genes for non-transmembrane proteins, which are hydrophilic (mean Ka/Ks = 0.148, n = 126) (one tailed Mann-Whitney U test; 994.0, p > 0.46). However, genes for monomers (mean Ka/Ks = 0.156, n = 23) evolved slightly faster than genes for proteins of higher order structure, such as dimers and tetramers, which interact with multiple subunits (mean Ka/Ks = 0.124, n = 125) (Mann-Whitney U test; 1112, p < 0.045). There was no difference in codon bias (Nc) between monomers and higher order structures for human (one tailed t-test; t = 0.1227, p > 0.45) or mouse (one tailed t-test; t = 0.3714, p > 0.35) sequences.

Tissue requirement for enzyme and gene evolution rate

We assessed whether genes that are expressed in a greater number of tissues are more important for the survival of the organism and therefore evolve more slowly than genes that are expressed in fewer tissues, which may be more dispensable. We used Affymetrix microarray data from Su et al. (2004), which contain 79 human tissues and 61 mouse tissues surveyed in duplicate. We used the same threshold for calling presence (average difference over 200) as used previously (Su et al. 2002). This conservative threshold represents three to five transcript copies per cell (Su et al. 2002).

We found a significant negative correlation between the number of tissues in which a gene is expressed and Ka/Ks values (see Figure 3.3 legend for statistics) for both human and mouse tissue expression data. There was no trend towards greater codon bias (Nc) in genes that are expressed in a greater number of tissues compared to genes expressed in fewer tissues (Figure 3.3). There was no significant difference in the number of tissues with expression between different pathway functions or between monomers and higher order protein structures, suggesting that protein structure and function affect gene evolution rate independently of tissue expression.

Disease

We also analyzed the impact of genetic buffering on the number of Mendelian disorders in the OMIM database. We found that the number of Mendelian disorders was actually 1.6-fold higher than expected in redundant network genes (χ2 = 7.27, p < 0.05) compared to non-redundant network genes. The number of Mendelian disorders was 1.6-fold higher in single copy genes compared to gene families (χ2 = 8.42, p < 0.05). OMIM data seem to support genetic buffering by gene duplication but not by redundant networks. However, ascertainment bias (the lack of lethal mutations in the OMIM database) complicates interpretation. There was no difference in evolution rate between genes with an OMIM entry (mean Ka/Ks = 0.135, n = 65) and genes with no entry (mean Ka/Ks = 0.149, n = 176) (Mann-Whitney U test; 5464, p > 0.29).

[Figure 3.3: two scatter plot panels, Mouse and Human; x-axis: Ka/Ks (0 to 0.5); y-axis: # tissue samples]

Figure 3.3 Scatter plot of the Ka/Ks values and the number of tissues with gene expression from mouse (out of duplicate 61 tissue samples = 122 samples) and human (out of duplicate 79 tissue samples = 158 samples) gene expression data (Su et al. 2004). Correlations with mouse tissue expression were r = -0.355, p<0.0001, r2 = 0.126 (one tailed Pearson’s correlation) and r = -0.344, p<0.0001 (one tailed Spearman’s rank correlation). Correlations with human tissue expression were r = -0.199, p<0.0001, r2 = 0.0395 (one tailed Pearson’s correlation) and r = -0.216, p<0.0001 (one tailed Spearman’s rank correlation). Correlations between codon bias (Nc) in human and the number of tissues with gene expression were r = 0.0173, p<0.018, r2 = 0.0109 (one tailed Pearson’s correlation) and r = 0.090, p<0.035 (one tailed Spearman’s rank correlation). Because we expected codon bias (smaller Nc values) for genes expressed in a greater number of tissues (a one-tailed hypothesis of negative correlation), the positive correlation signifies the absence of codon bias. Correlations between codon bias (Nc) in mouse and the number of tissues with gene expression were r = -0.0107, p>0.072 (Pearson’s correlation) and r = -0.0986, p>0.089 (Spearman’s rank correlation).

Discussion

During evolution, different genes evolve at unequal rates, reflecting the varying functional constraints on phenotypes. Most genes undergo negative (or purifying) selection because most mutations are deleterious and are eliminated from the population. However, the rate of evolution varies significantly even among negatively selected genes. An important contributing force to this variation is genetic buffering, a mechanism that reduces the potential detrimental effects of mutations. Gene duplication and gene networks are two mechanisms proposed for genetic buffering. If either of these mechanisms provides genetic buffering, we would expect a faster (more neutral) rate of evolution in genes that have duplicate copies, networks, or both.

We investigated the rate of evolution of 241 metabolic genes in human and mouse by comparing the rate of nonsynonymous substitution per nonsynonymous site (Ka) and synonymous substitution per synonymous site (Ks). Genes were classified according to the presence or absence of gene copies as well as redundant networks. A redundant network is an alternative metabolic route that can in principle compensate for loss of a gene based on the structural layout of the metabolic pathways. We found that neither redundant networks nor gene families are important for determining the rate of evolution.

Our data disagree with Wagner (2000) on the role of networks in mutational robustness. Wagner showed that yeast genes with duplicate copies, when mutated, did not perform better on the fitness scale than single copy genes, suggesting that networks contribute significantly more to mutational robustness than gene duplicates. However, a recent study of 1147 yeast deletion strains (Gu et al. 2003), compared to 45 in Wagner (2000), showed that a lower proportion of duplicate genes than of single copy genes are lethal when deleted. This also contradicts our data, which showed similar gene evolution rates for single copy genes and gene families.

In contrast to Wagner (2000) and Gu et al. (2003), we relied on gene evolution rate rather than fitness to study genetic buffering. Although fitness effects are reflected in evolutionary rates (Hirsh and Fraser 2001), other factors such as functional constraints on the protein also contribute to evolutionary rates. Our analysis of quaternary structure and transmembrane protein information showed that hydrophobicity of the protein is not a significant determinant of gene evolution rate in our dataset. However, monomers evolved slightly faster than proteins of higher order structure, suggesting that protein interaction may restrict gene evolution. This result is consistent with Fraser et al. (2002), who showed that the number of interacting proteins is negatively correlated with gene evolution rate.

We also examined functional constraints on gene evolution rate based on the number of tissues that require each enzyme and on the pathway function. We found that genes expressed in a smaller number of tissues evolved faster than genes expressed across a wide variety of tissues and developmental stages. This is consistent with a recent study (Zhang and Li 2004), which showed that tissue specific genes evolve faster than housekeeping genes. It is possible that genes expressed in a larger number of tissues are more likely to be involved in a wide variety of biological processes and are likely to show a larger effect on fitness when mutated. We also found that genes from cholesterol and fatty acid metabolism evolve faster than genes from most other pathways, especially when compared to glycolysis and Krebs cycle genes. However, this pattern could not be explained by the number of tissues expressing each gene in the pathway. Given the large physiological differences and evolutionary distance between humans and mice, it is difficult to speculate how these pathways contributed to the speciation of these two organisms. More closely related species are required to identify plausible explanations for the faster evolution of specific genes and pathways (Johnson et al. 2001).

Metabolic networks can be represented at different levels of complexity. We defined metabolic networks in a simple way, similar to a previous graph theory representation of metabolic networks (Reder 1988). Currently, there is not enough information to model the biochemical consequence of a mutation of an enzyme in most pathways. However, metabolic networks can be represented at higher complexity using elementary flux mode analysis (Schuster et al. 1999) and extreme pathway analysis (Schilling et al. 2000), which use stoichiometric equations and reaction kinetics. We used data from the extreme pathway analysis of red blood cell metabolism to evaluate the number of different metabolic routes (modes) in which each enzyme can participate. While the number of modes did not show any contribution to the gene evolution rate in our analysis, a larger study involving all metabolic pathways in E. coli showed that metabolic flux exhibits properties of a robust network (Almaas et al. 2004). Most enzymes are infrequently utilized, showing low metabolic flux, and loss of such an enzyme does not lead to a significant reduction in overall metabolic flux. However, a few enzymes participate in the high-flux backbone and are essential for the survival of the organism. Larger applications of these analyses in humans and mice may reveal how gene evolution rate may be constrained by metabolic flux.

We showed that neither redundant metabolic networks (defined as alternative metabolic routes) nor gene families contribute to a faster rate of gene evolution. Based on the assumption that faster evolving genes show less effect on fitness (Hirsh and Fraser 2001), our data suggest that neither gene families nor redundant metabolic networks contribute to genetic buffering. However, structural constraints, tissue requirement for the enzyme, and metabolic function contributed significantly to the gene evolution rate. Further understanding of the molecular and metabolic consequences of these genetic changes will reveal the relationship between genetic and phenotypic variation and how these relationships constrain gene evolution.

Acknowledgments

We thank Evan Eichler and Susan Rutherford for comments on a draft of this manuscript and reviewers for helpful suggestions to improve the work and this paper. We also thank Jeff Bailey and Matthew Johnson for assistance with the computational analysis. This work was supported by NIH grant HL48492 and a gift from the Charles B. Wang Foundation.


CHAPTER 4

Global gene expression and metabolite profiling of responses to dietary folate perturbations in two genetically distinct inbred mouse strains

Authors: Toshimori Kitami, Renee Rubio, William O’Brien, John Quackenbush, Joseph H Nadeau

Reference: Manuscript in preparation.

Abstract

Defects in homocysteine and folate metabolism are associated with increased risks for cardiovascular diseases, neural tube defects, cancers, and neurodegeneration. Some of these risks could be lowered by increased dietary folate intake. Although common mutations have been identified for these pathways, these mutations do not fully account for all of the various diseases associated with perturbations in homocysteine and folate metabolism, suggesting that interactions with other pathways are involved in disease processes. To identify pathways that interact with homocysteine and folate metabolism, we applied dietary folate perturbations to two genetically distinct mouse strains and analyzed global and pathway specific changes in gene expression. We found that, compared to the C57BL/6J strain, A/J shows an improved response to folate perturbation, retaining a higher serum folate level and minimizing global gene expression changes. We found strain-specific differences in the gene expression profile of the cholesterol biosynthesis pathway and in total serum cholesterol levels. Increasing serum total cholesterol by using ApoE knockout mice slightly improved folate retention during folate depletion but also induced additional gene expression changes. We also identified choline as a mediator of several folate perturbation gene expression responses. These pathways are components of a complex homeostatic regulatory network involved in maintenance of homocysteine and folate pathways and may play a more general role in complex disease mechanisms.

Introduction

Homocysteine and folate metabolic pathways have received significant attention recently for their association with a remarkable variety of common and complex diseases. An increase in serum homocysteine level is associated with increased risk for cardiovascular diseases (Homocysteine Studies Collaboration 2002) and Alzheimer’s disease (Seshadri et al. 2002), and increased dietary folate intake is associated with reduced risk for cardiovascular diseases (Rimm et al. 1998), neural tube defects (MRC Vitamin Study Research Group 1991, Czeizel and Dudas 1992), and colon cancer (Giovannucci et al. 1998, Su and Arab 2001). Although common mutations in these pathways have been identified (Frosst et al. 1995, van der Put et al. 1998), they do not fully account for the variety of disease types that are associated with increased homocysteine or low folate intake. Moreover, given the significant genetic and environmental variation in human studies, it remains unclear whether abnormalities in homocysteine and folate metabolism are cause or consequence of these diseases.

Mouse models enable control of many genetic and environmental variables that complicate the analysis of complex traits and diseases in humans. Mouse knockouts of homocysteine and folate pathway genes alone are insufficient for full disease onset despite changes in key physiological parameters for diseases (Watanabe et al. 1995, Chen et al. 2001). For example, Mthfr knockout mice show lipid deposition in the aorta without any atherosclerotic lesions (Chen et al. 2001). A recent genetic interaction study indicated that mutations in other pathways in conjunction with homocysteine and folate pathway mutants better recapitulate disease phenotypes (Wang et al. 2003). Additionally, mouse knockout models that share similar disease phenotypes with homocysteine and folate pathway defects show altered regulation of homocysteine and folate metabolism (Fleming and Copp 1998, Ernest et al. 2002). These results suggest that identifying these interacting pathways is crucial to understanding the role of homocysteine and folate pathways in diseases.

Although several key pathways that interact with homocysteine and folate pathways have been identified, the dynamics of these interactions have not been studied. To analyze the dynamics of pathway interactions, we performed dietary folate depletion followed by repletion across several time points in two genetically distinct inbred mouse strains. We used microarrays to systematically survey pathways involved in the perturbation response and measured the metabolic output of those pathways. We found that the A/J strain showed an improved response to folate perturbations compared to C57BL/6J in both serum folate level and global gene expression pattern. The cholesterol synthesis pathway showed the most dynamic strain-specific gene expression responses, and the serum total cholesterol level was increased in A/J during early phases of the perturbation response. To test the role of cholesterol metabolism in homocysteine and folate pathway regulation, we elevated the serum cholesterol level in C57BL/6J using ApoE knockout mice. Increases in serum cholesterol slightly improved folate retention during dietary folate depletion but also induced additional gene expression changes in other pathways. We also found that increases in choline level reduced expression of folate perturbation response genes, indicating that choline served as a mediator of perturbation responses. We identified key interacting pathways of homocysteine and folate metabolism involved in responses to dietary folate perturbation. Understanding the regulation of these interactions will improve our understanding of homocysteine and folate pathway homeostasis and its role in complex diseases.

Materials and Methods

Animals & Diets. Six-week-old female A/J, C57BL/6J, and B6.129P2-Apoetm1Unc/J (Piedrahita et al. 1992) mice were purchased from the Jackson Laboratory. All mice were raised on a control diet containing 4 ppm folic acid (Basal Diet 5755, TestDiet) for one week before the start of studies. Selected mice were then placed on a folic acid deficient diet (58C3, TestDiet) containing 1% succinylsulfathiazole, an antibiotic commonly used to suppress folate production by bacteria in the intestine. All diets were irradiated by the manufacturer. Treatment plans are outlined in Figure 4.1a and b, with eight replicate mice per treatment plan per strain. A new batch of diet was manufactured for the second phase of studies (Figure 4.1b). C57BL/6J mice on choline treatment (Figure 4.1b) were given water containing 25 mM choline and 50 mM saccharin. Saccharin was used to reduce the bitter taste of choline. C57BL/6J folate depleted mice (Figure 4.1b) were placed on water containing 50 mM saccharin to monitor the effect of choline on water consumption. From each mouse, blood samples were obtained from the retro-orbital sinus and centrifuged. Serum samples were stored at -80°C. Mice were then sacrificed and tissues were collected, frozen on dry ice, and immediately stored at -80°C. All mice shared the same animal room with controlled temperature, humidity, and a 12-hour light-dark cycle. Mice were provided food and water ad libitum.

[Figure 4.1: schematic of treatment plans. Panel (a): experimental and control groups. Panel (b): Control, ApoE KO, Saccharin, and Choline + saccharin groups. Legend: control diet, folate depleted diet. Axis: days of treatment, starting at 6 weeks of age.]

Figure 4.1 (a) Folate perturbation protocol. Six-week old female A/J and C57BL/6J from the Jackson Laboratory were placed on each of the nine treatment plans with eight replicate mice per treatment plan per strain. (b) Folate perturbation protocol with additional perturbations in C57BL/6J. Eight replicate mice were used for each of the eight treatment plans. All mice were on C57BL/6J genetic background.

Expression profiling. For mice from Figure 4.1a, equal amounts (by weight) of liver tissue from the eight replicate mice per treatment plan were pooled as one sample, and total RNA was extracted using TRIzol (Invitrogen). For mice from Figure 4.1b, equal amounts (by weight) of liver tissue from the eight replicate mice were separated into two pools of four replicate tissues each to provide biological replicates. Extracted RNA was treated with DNase (Ambion) and cleaned using RNeasy (Qiagen) according to the manufacturers’ protocols. For mice from Figure 4.1a, 15 µg of pooled total RNA and 15 µg of Universal Mouse Reference RNA (Stratagene) were aminoallyl labeled with Cy3 and Cy5 in duplicate with dye reversal. For mice from Figure 4.1b, 15 µg of pooled RNA from treated mice and 15 µg of pooled RNA from control mice for each time point were aminoallyl labeled with Cy3 and Cy5 in duplicate with dye reversal. Sample and reference probes, or treated and control samples, were co-hybridized to a mouse cDNA array representing over 25,000 unique genes and ESTs from the NIA mouse 15K set (Tanaka et al. 2000) and the BMAP mouse cDNA clone set. Detailed protocols for array fabrication, RNA labeling, hybridization, and image processing are described in Hegde et al. (2000). Arrays were scanned using the GenePix 4000A scanner (Molecular Devices); spot intensities were acquired using TIGR Spotfinder as outlined in Yang et al. (2002) and normalized by global Lowess regression in MIDAS (Saeed et al. 2003) with a smoothing parameter of 0.3.

Hierarchical clustering. All microarray data were log2 transformed and filtered to minimize the number of gene expression patterns resulting from noise. Only genes whose expression variance across time points was greater than the assay noise (the average of the variances from duplicate measurements for each gene) were kept. Data were mean centered and scaled (normalized) across both genes and experiments. We performed average linkage hierarchical clustering using Euclidean distance and estimated the statistical robustness of each branch by bootstrap resampling of genes (1000x) using TIGR Multiexperiment Viewer (MeV) (Saeed et al. 2003).
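The noise filter and normalization described above can be sketched as follows. This is a minimal illustration in Python, not the procedure as run in TIGR MeV. The function names (`passes_noise_filter`, `center_and_scale`), the data layout (one pair of duplicate log2 ratios per time point), and all values are assumptions made for the example.

```python
from statistics import mean, pstdev, variance

def passes_noise_filter(duplicates):
    """duplicates: list of (rep1, rep2) log2 ratios, one pair per time point.

    Keep a gene when the variance of its time-point means exceeds the
    assay noise, taken here as the average variance of the duplicates.
    """
    time_point_means = [mean(pair) for pair in duplicates]
    signal = variance(time_point_means)                   # variance across time points
    noise = mean([variance(pair) for pair in duplicates])  # average duplicate variance
    return signal > noise

def center_and_scale(profile):
    """Mean-center a profile and scale it to unit standard deviation."""
    mu = mean(profile)
    sd = pstdev(profile)
    if sd == 0:
        return [0.0 for _ in profile]
    return [(x - mu) / sd for x in profile]

# Hypothetical genes: one changes over time, one is flat apart from noise.
responsive = [(0.0, 0.1), (1.0, 1.1), (2.0, 1.9), (0.5, 0.4)]
flat = [(0.0, 0.4), (0.3, -0.1), (0.1, 0.5), (0.2, -0.2)]

kept = [name for name, gene in (("responsive", responsive), ("flat", flat))
        if passes_noise_filter(gene)]
```

In this toy case only the responsive gene passes the filter. The same `center_and_scale` helper could be applied to each gene (row) and then to each experiment (column) before clustering.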

Significant gene expression changes. To test for significant gene expression changes between the two strains across time points (Figure 4.1a), we fitted our data using a previously described fixed model ANOVA method (Kerr et al. 2000) with the MAANOVA R package v.0.98-3. After fitting the effects of array and dye on gene expression variance, our null model consisted of the independent effects of strain and folate treatment without an interaction between the two. Our alternative model contained the interaction term. The calculated F statistics were compared to the tabulated F distribution on a per-gene basis and adjusted for multiple testing using Bonferroni correction. For ApoE knockout and choline treatments (Figure 4.1b), we performed ANOVA for each time point with only treatment as the variable of interest. To identify gene expression changes specific to each treatment, we performed a post-analysis t-test using tabulated t-statistics with Bonferroni correction for multiple testing and false discovery rate estimation.

Pathway analysis. Each microarray clone, referenced by GenBank accession number, was converted to a Mouse Genome Informatics (MGI) accession number using the TIGR Resourcerer database v.12.0 (Tsai et al. 2001). MGI accession numbers were then used to link each clone to Gene Ontology (GO) functional annotation (July 2005 monthly repository) (Ashburner et al. 2000). We tested for over-representation of each GO term among differentially expressed genes using a one-tailed Fisher’s exact test. We made certain that a gene represented by multiple clones was counted only once, both when counting the significant genes and when calculating the null distribution from all genes on the array. We tested all terms within the hierarchy of GO functions and used Bonferroni correction for the number of terms examined. We excluded GO terms for which fewer than three genes were observed, to avoid significance arising from a few false positive genes.
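The over-representation test described above can be sketched as an upper-tail hypergeometric probability. This is an illustration under stated assumptions rather than the analysis code used in this study: the function names and all counts below are hypothetical, and the counts were chosen only to keep the arithmetic easy to follow.

```python
from math import comb

def fisher_upper_tail(k, K, n, N):
    """One-tailed Fisher's exact test, P(X >= k).

    N: genes on the array, K: genes annotated with the GO term,
    n: differentially expressed genes, k: annotated genes among those n.
    """
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1)) / total

def bonferroni(p, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p * n_tests)

# Hypothetical counts (not taken from the arrays in this study).
N, K, n, k = 20, 5, 6, 3
expected = K * n / N                       # annotated genes expected by chance
p_nominal = fisher_upper_tail(k, K, n, N)
p_adjusted = bonferroni(p_nominal, n_tests=4)
```

The expected count is the term size multiplied by the fraction of genes on the array that are differentially expressed, and the adjusted p value is the nominal p value multiplied by the number of GO terms tested.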

Annotation of methylated genes. For genomic repeats, each clone sequence was analyzed against RepBase (Jurka 2000) using RepeatMasker (http://www.repeatmasker.org). Only clones in which repeat sequence spanned the entire clone were counted as repeat clones; clones with partial repeat sequence (hybrid sequence) were excluded. We obtained a list of imprinted genes from the Mouse Imprinting Database at the Mammalian Genetics Unit at MRC (http://www.mgu.har.mrc.ac.uk/research/imprinting/). For X-inactivated genes, we assumed that all genes on the X chromosome undergo inactivation. The presence of repeat elements in 5’ and 3’ UTRs was obtained from the UCSC mouse genome assembly mm4 database (http://genome.ucsc.edu/index.html) (Karolchik et al. 2003).

Gene set enrichment analysis. To identify annotation terms associated with differentially expressed genes for data from Figure 4.1b, we performed gene set enrichment analysis (Mootha et al. 2003) using GSEA software v1.0 (Subramanian et al. 2005). We used the signal-to-noise ratio to rank genes for differential expression. Given the limited number of replicates, we calculated significance scores by permuting the gene annotation assignments rather than the sample identities. We used the C2 functional gene set annotation, which consists of pathway annotations from eight manually curated databases along with regulated gene expression responses reported in the literature. Because these annotations were referenced by human GenBank RefSeq accession number, we converted the MGI accession numbers of our array clones to human nucleotide and protein RefSeq accession numbers using the Mouse Genome Database (data as of July 30, 2005) (Blake et al. 2003). Human protein RefSeq accession numbers were converted to human nucleotide RefSeq accession numbers using the UCSC Genome Database human genome assembly hg17 annotation (http://genome.ucsc.edu/index.html) (Karolchik et al. 2003). We also included, as part of the annotation, gene sets that correlated with cholesterol synthesis and choline kinase gene expression patterns from the data in Figure 4.1a at a Pearson correlation threshold greater than 90% (equivalent to nominal p=0.005). To avoid enrichment of specific terms arising from genes represented by more than one clone, we selected one clone per gene based on the maximum difference between gene expression variance across treatment conditions and average variance from replicates.
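As an illustration of the ranking step, the signal-to-noise ratio for a gene can be written as the difference between group means divided by the sum of the group standard deviations. The sketch below is a simplified version under stated assumptions: the two-group layout, gene names, and values are hypothetical, and the GSEA software may apply further adjustments that are omitted here.

```python
from statistics import mean, stdev

def signal_to_noise(treated, control):
    """(mean difference) / (sum of standard deviations) for one gene."""
    spread = stdev(treated) + stdev(control)
    if spread == 0:
        raise ValueError("no variation within either group")
    return (mean(treated) - mean(control)) / spread

# Hypothetical log2 expression values: two pools per condition.
expression = {
    "gene_a": ([2.0, 2.2], [1.0, 1.2]),
    "gene_b": ([1.0, 1.4], [1.1, 1.3]),
    "gene_c": ([0.2, 0.4], [1.2, 1.4]),
}

scores = {gene: signal_to_noise(t, c) for gene, (t, c) in expression.items()}
# Rank from most up-regulated to most down-regulated in treated samples.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Genes at the top of the ranked list are the most consistently higher in treated samples, and genes at the bottom are the most consistently lower.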

Serum homocysteine & folate. Total serum homocysteine level was measured using a liquid chromatographic-tandem mass spectrometric method (McCann et al. 2003). Serum folate level was measured in duplicate using a microbiological method (Horne 1997) after deproteination.

DNA methylation measurement. Genomic DNA was extracted from each replicate liver tissue using the DNeasy Tissue kit (Qiagen). DNA was treated with the methyl-sensitive HpaII and methyl-insensitive MspI restriction enzymes (New England Biolabs), and a cytosine extension assay was performed to quantify the amount of digested DNA (Pogribny et al. 1999). The ratio of HpaII to MspI digest indicates the fraction of unmethylated DNA. The DNA digests and cytosine extension assays were performed in duplicate.

Cholesterol measurement. Serum total cholesterol was quantified by GC/MS (Hewlett Packard 6890 GC with a 5973 mass spectrometer) using selected ion monitoring mode (Singer et al. 2004) for the A/J and C57BL/6J folate perturbation experiments (Figure 4.1a). For experiments involving additional treatments on C57BL/6J (Figure 4.1b), serum total cholesterol was measured in duplicate by enzymatic assay (Allain et al. 1974) using Infinity Cholesterol Reagent (ThermoElectron). Liver cholesterol was extracted with chloroform and methanol as described in Folch et al. (1957). Extracted cholesterol from the chloroform layer was air-dried at 50°C, resuspended in 5% Triton X-100, and quantified in duplicate by enzymatic assay as above. The total cholesterol level from liver was standardized against the amount of protein from the aqueous extract, measured using the BCA Protein Assay Kit (Pierce).

Results

Metabolite changes

To study the pathways involved in the folate perturbation response, we performed dietary folate depletion followed by repletion in two genetically distinct inbred mouse strains. We selected time points on the order of days to capture immediate changes and weeks to capture longer term effects (Figure 4.1a), based on previous observations that serum folate loss occurs within days after folate depletion (Raghunathan et al. 1997). We first monitored the impact of folate perturbation on the main metabolite markers, serum homocysteine and serum folate levels. In both A/J and C57BL/6J, serum homocysteine level decreased, by 41% and 35% respectively, over two weeks of folate depletion (Figure 4.2). Serum folate showed a more dramatic decline and a greater strain difference over two weeks. In A/J, serum folate declined rapidly, by 54%, after one day of folate depletion, whereas in C57BL/6J serum folate remained relatively unchanged, with a decline of only 10%. However, A/J serum folate loss was mitigated by the second day of folate depletion and reached a total decline of 72% after two weeks, whereas C57BL/6J serum folate showed a steeper decline of 89% over two weeks (Figure 4.3a). Once folate was restored in the diet, both serum homocysteine and folate levels returned close to control levels in A/J, but C57BL/6J still showed 21% and 41% decreases from control for homocysteine and folate, respectively (Figure 4.3a). We conclude that, compared to C57BL/6J, A/J minimized folate loss during folate depletion and regained folate homeostasis faster once folate was resupplied in the diet.


Figure 4.2 Metabolite profiles of serum homocysteine and folate, and total cholesterol in serum and liver. Metabolite levels at each treatment time point were compared to the extrapolated control values (0, 9, 22 days). These differences were normalized to the extrapolated control values and represented as percent change from the control. Results of ANOVA and multiple comparison post-tests are represented as closed and open squares. Closed squares indicate statistical difference from the nearest control time point (0, 9, or 22 days) at p < 0.05, whereas open squares indicate p > 0.05. ANOVA analysis and Bonferroni’s multiple comparison post-test were performed for metabolite data with normal distribution and equal variance. For non-normal or unequal variance data, we performed Kruskal-Wallis test with Dunn’s multiple comparison post-test. Metabolite levels were measured in each of the eight replicate mice per strain per time point.
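The percent change calculation in this caption can be sketched as below. The caption does not say how control values were extrapolated, so linear interpolation between the measured control days is assumed here purely for illustration; the function names and all measurements are hypothetical.

```python
def interpolated_control(day, control_days, control_values):
    """Linearly interpolate the control value for a treatment day."""
    points = list(zip(control_days, control_values))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        if d0 <= day <= d1:
            return v0 + (v1 - v0) * (day - d0) / (d1 - d0)
    raise ValueError("day lies outside the measured control range")

def percent_change(treated_value, control_value):
    """Difference from control, expressed as a percentage of control."""
    return 100.0 * (treated_value - control_value) / control_value

# Hypothetical control measurements at days 0, 9, and 22.
control_days = [0, 9, 22]
control_values = [20.0, 18.0, 18.0]

control_at_day_1 = interpolated_control(1, control_days, control_values)
change = percent_change(9.0, control_at_day_1)   # negative value means a decline
```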

[Figure 4.3: plots for A/J and C57BL/6J with data points labeled by days of treatment. Panel (a) axes: % serum folate difference and % serum homocysteine difference. Panel (b) axes: % serum folate difference and % serum cholesterol difference.]

Figure 4.3 (a) Percent change in serum homocysteine and folate for A/J and C57BL/6J plotted on the same graph using data from Figure 4.2. Numbers on each data point indicate days of treatment in which 14 represents maximum days of depletion. Day 15 and 21 are from repletion period. (b) Percent change in serum folate and total cholesterol for A/J and C57BL/6J plotted on the same graph using data from Figure 4.2.

Global gene expression changes

To survey global gene expression responses to dietary folate perturbations, we performed microarray analysis on pooled liver total RNA using cDNA arrays consisting of over 25,000 unique EST sequences. We selected liver because it is the primary organ of homocysteine and folate metabolism and it expresses all genes in the pathways. To capture the relationships among treatment time points, we performed hierarchical clustering (Figure 4.4). First, all of the time points separated into two distinct clusters corresponding to the two inbred strains, indicating that genetic variation had a stronger effect on gene expression than environmental variation due to folate perturbation. Close examination of each time point revealed that A/J gene expression levels diverged from initial control levels after one day of folate depletion but returned closer to control by the second day of depletion. This was similar to the serum folate profile, in which folate declined rapidly after one day but stabilized by the second day. In C57BL/6J, however, the gene expression changes were gradual, and time points clustered together based on similarity in duration of depletion. Once folate was restored in the diet, A/J gene expression clustered with the final control time point, whereas C57BL/6J time points belonged to a different cluster from the final control, a result similar to the outcome for serum folate and homocysteine levels. We conclude that, compared to C57BL/6J, A/J gene expression profiles were closer to controls during folate depletion and especially after folate repletion, a result similar to the serum folate profiles.


Figure 4.4 Average linkage hierarchical clustering of liver gene expression profiles from folate perturbation experiments in A/J and C57BL/6J. Gene expression profiles from the 3950 genes out of 25,000 that showed greater gene expression variance across treatments than average variance in replicates were used. Bootstrap resampling of genes was performed (1000x) to estimate the confidence level of each branch. Black color above 100 indicates high confidence and red color below 0 indicates low confidence. Samples are numbered based on the length of treatment in days and are color coded to denote different phases of perturbation.

Table 4.1 Significant over-representation of Gene Ontology terms

Pathway                    GO accession   All genes   Observed   Expected   Bonferroni corrected p value
steroid biosynthesis       GO:0006694     22          8          0.76       0.000161
lipid biosynthesis         GO:0008610     65          12         2.25       0.000679
sterol biosynthesis        GO:0016126     14          6          0.48       0.00168
lipid metabolism           GO:0006629     164         18         5.67       0.00429
sterol metabolism          GO:0016125     25          7          0.86       0.00662
cholesterol biosynthesis   GO:0006695     11          5          0.38       0.00825
steroid metabolism         GO:0008202     38          8          1.31       0.0155
isoprenoid biosynthesis    GO:0008299     3           3          0.10       0.0187
cholesterol metabolism     GO:0008203     21          6          0.73       0.0248
cellular lipid metabolism  GO:0044255     140         15         4.84       0.0339
alcohol metabolism         GO:0006066     81          11         2.80       0.0391

Column ‘all genes’ refers to all genes on the array annotated with the particular GO term.

Pathway responses to folate perturbation

We identified striking strain differences in folate perturbation response based on metabolite and global gene expression patterns. To identify pathways involved in these responses, we first tested for significant gene expression changes specific to the interaction between strain and folate treatment effects using a fixed model ANOVA (Kerr et al. 2000) at a p = 0.05 threshold after Bonferroni correction. Then, using Gene Ontology (GO) annotation, we queried for pathway annotations that were over-represented among these significant genes. The only terms that showed significant over-representation belonged to the cholesterol biosynthesis pathway and its parent terms (Table 4.1). Many of the cholesterol synthesis genes showed strong correlation (70% to 99%) with each other within each strain. Although the GO annotation set does not contain the homocysteine and folate pathways, when we used the genes most commonly included in these pathways (Rosenblatt 2001) as a new annotation, the pathways did not show any significant over-representation (Fisher’s exact test: observed 2, expected 0.442, nominal p=0.070).

To examine whether changes in cholesterol gene expression translate to changes in cholesterol metabolites, we measured serum and liver total cholesterol levels in A/J and C57BL/6J across all folate perturbation time points and controls. Significant changes in cholesterol levels occurred after one day of folate depletion in A/J. Serum total cholesterol increased significantly (17%) compared to control mice, whereas liver total cholesterol declined (8%) (Figure 4.2). This coincided with the time points at which rapid folate loss and global gene expression change took place (Figure 4.3b). At later time points, serum total cholesterol declined (20%) whereas liver total cholesterol increased (57%) compared to A/J on the control diet. In contrast, C57BL/6J showed gradual changes, with an increase in liver total cholesterol (32%) and a decrease in serum total cholesterol (26%) throughout most of the folate perturbation period (Figure 4.2). Surprisingly, none of the gene expression profiles for cholesterol synthesis genes correlated strongly with the cholesterol metabolite profiles. We conclude that the A/J strain showed dynamic changes in both serum and liver cholesterol levels after one day of folate depletion, coinciding with dynamic changes in serum folate and global gene expression profiles.

Effect of increased serum cholesterol on folate perturbation response

The increase in serum cholesterol level in the A/J strain coincided with the time point at which folate and global gene expression both underwent rapid changes followed by stabilization. This change was not observed in C57BL/6J at these time points. To test the effect of increased serum cholesterol level on the C57BL/6J strain during folate perturbation, we used Apolipoprotein E knockout mice on the C57BL/6J genetic background (B6.129P2-Apoetm1Unc/J), in which a defect in cholesterol transport significantly elevates serum total cholesterol level. We used the same perturbation protocol (Figure 4.1b) and diet as before, except that the folate level in the control diet was significantly lower than in the previous control diet because of greater destruction of folic acid by the higher doses of radiation used to sterilize the diet. Serum total cholesterol in ApoE knockout mice was significantly higher than in C57BL/6J on the control and folate depleted diets (598% and 532% respectively) (Table 4.2). ApoE knockout mice showed a slightly higher serum folate level (52%) both during folate depletion and after repletion compared to C57BL/6J on folate perturbation treatments. However, the differences between the two strains were modest and did not reach statistical significance. Gene expression analysis using the same cDNA array as before showed a higher number of genes with significant expression changes in ApoE knockout mice than in C57BL/6J both during depletion and repletion (Table 4.2), and ApoE knockout mice even showed upregulation of the fatty acid metabolism gene set during depletion (Enrichment Score = 0.74; Bonferroni corrected p = 0.002). Liver histological sections showed no gross changes in either C57BL/6J or ApoE knockout mice during folate depletion and repletion (not shown). We conclude that increased total serum cholesterol in ApoE knockout mice modestly improved folate retention during folate depletion but induced greater gene expression changes than in C57BL/6J alone.

Table 4.2 Effect of ApoE knockout and dietary choline supplements on metabolite and gene expression profiles.

Time  Diet     Additional    Serum     Serum        Liver        FWER    FDR
               perturbation  folate    cholesterol  cholesterol  p=0.05  q=0.01
14    Control  None          19.1±4.3  122±7        234±20       --      --
14    Folate   None          5.4±1.6*  137±13       257±28       2       1
14    Folate   ApoE KO       8.3±3.7   729±107*     289±30       8       27
14    Folate   Choline       3.2±1.1*  152±20       299±55*      7       38
21    Control  None          18.4±2.8  127±19       235±20       --      --
21    Folate   None          16.4±4.5  142±16       282±51       0       0
21    Folate   ApoE KO       22.5±3.9  872±61*      335±38*      3       4
21    Folate   Choline       16.6±1.8  143±23       295±42*      2       3

Units for metabolites are pmol/mL serum folate, mg/dL serum total cholesterol, mg total liver cholesterol/g liver protein. Significant gene expression changes were computed at family-wise error rate (FWER) p=0.05 and false discovery rate (FDR) q=0.01 using fixed model ANOVA using gene specific tabulated F statistic. Each treatment was compared to mice on control diet for post-analysis using t-test corrected for multiple testing with the same corresponding thresholds as ANOVA. Asterisks (*) indicate significant changes in metabolite level from mice on control diet after ANOVA and Bonferroni’s multiple comparison post-test for data with normal distribution and equal variance. For non- normal or unequal variance data, we performed Kruskal-Wallis test with Dunn’s multiple comparison post-test. All mice were on the C57BL/6J background. Folate treated mice without additional perturbation were on 50mM saccharin water and folate treated choline supplemented mice were on 50mM saccharin and 25mM choline water.

Gene expression changes relative to folate metabolite profile

Although gene expression analysis for the strain and treatment interaction effect revealed changes in the cholesterol metabolism pathway, it was the only pathway that collectively showed significant changes based on Gene Ontology annotations. It is possible that other pathways undergo more subtle changes that do not reach statistical significance at the pathway level. To identify genes that respond proportionately to changes in serum folate levels, we tested for genes that showed strong positive or negative correlation with the folate metabolite profile. Surprisingly, we identified two EST clones belonging to the repetitive element intracisternal A particle (IAP), from the LTR class, that correlated negatively with the folate profile (82% and 86%) in C57BL/6J but not in A/J. Although our arrays contained a total of 182 repeat sequences from the LINE, SINE, and LTR classes, only selected IAPs in C57BL/6J showed negative correlations at a nominal (uncorrected for multiple testing) p value threshold of 0.05. This was more than expected from random correlation (Fisher’s exact test: observed 2, expected 0.353, nominal p=0.0471). We conclude that IAPs were selectively activated during folate depletion in C57BL/6J.
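The correlation screen described above can be sketched with a plain Pearson correlation between each clone's expression profile and the serum folate profile. All profiles, clone names, and the 0.8 cutoff below are hypothetical and serve only to show the calculation; they are not values from this study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)

# Hypothetical serum folate profile (percent change) across six time points.
folate = [0.0, -10.0, -30.0, -60.0, -90.0, -40.0]
# Hypothetical log2 expression profiles for two clones.
clones = {
    "clone_1": [0.0, 0.2, 0.5, 1.1, 1.6, 0.7],    # rises as folate falls
    "clone_2": [0.1, -0.1, 0.1, -0.1, 0.1, -0.1],  # unrelated to folate
}

threshold = 0.8  # hypothetical cutoff on the magnitude of the correlation
negatively_correlated = [name for name, profile in clones.items()
                         if pearson(profile, folate) <= -threshold]
```

A clone whose expression rises as serum folate falls, as reported here for the IAP clones, gives a coefficient close to -1.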

IAPs are over-expressed during hypomethylation of DNA (Walsh et al. 1998), and DNA methylation is an important biochemical component of homocysteine metabolism. IAP expression can also be suppressed by increased dietary folate level (Wolff et al. 1998, Waterland and Jirtle 2003). Therefore, we tested global DNA methylation levels in both strains at all time points. However, we did not find significant changes in global DNA methylation levels in either strain during folate perturbation (A/J: ANOVA df = 8,62; F = 0.998, p = 0.447; C57BL/6J: ANOVA df = 8,63; F = 0.456, p = 0.882). We also did not find significant over-representation of genes with IAPs in their 5’ or 3’ UTR, imprinted genes, or X-inactivated genes among genes that showed negative correlation with the folate profile at nominal p=0.05 (not shown). We conclude that there was no significant change in global DNA methylation level, or in the expression profile of genes strongly influenced by DNA methylation level, during our folate perturbation.

Choline as a mediator of folate perturbation response

Although changes in dietary folate levels induced strain-specific changes in cholesterol metabolism, it is unclear how mice detect changes in folate status to initiate the cascade of gene expression responses. To identify genes involved in mediating these responses, we tested for early gene expression changes during folate depletion that differed between the two strains. The purpose was to increase the probability of identifying genes that reflected strain differences in the initial folate depletion response as opposed to later physiological consequences of folate depletion. We focused on genes whose expression profiles showed an initial strain-specific difference and whose names co-occurred with the terms homocysteine or folate in PubMed abstracts. We identified choline kinase as showing early upregulation in A/J during both folate depletion and repletion (Figure 4.5). In contrast, C57BL/6J showed gradual and continued down-regulation of choline kinase. The product of choline kinase, phosphocholine, is involved in phospholipid synthesis, transmembrane signaling, and lipoprotein secretion (Zeisel and Blusztajn 1994). The substrate of choline kinase, choline, is shared with a pathway leading to synthesis of the alternative methyl donor betaine, which is important for sustaining methylation during folate depletion.

To test the role of choline in the folate perturbation response, we supplemented drinking water with choline and performed a dietary folate perturbation on C57BL/6J similar to that in ApoE knockout mice (Figure 4.1b). We used saccharin to reduce the bitter taste of choline and found that mice in the choline treated group drank 16% less water than mice on saccharin water, but the difference was not significant (Mann-Whitney U test 3.50, p=0.10). Choline supplementation during folate depletion further decreased serum folate level by 40% compared to C57BL/6J on the folate depleted diet alone, although the difference was not significant (Table 4.2). Choline treated mice also showed a significant decrease in choline kinase expression level compared to C57BL/6J on the control diet (Day 14: 1.59 fold decrease, nominal p = 0.00023; Day 21: 1.52 fold decrease, nominal p = 0.00023), which was not observed in C57BL/6J on the folate treatment diet. This trend was contrary to the expression pattern originally found in A/J. When we examined the genes that originally shared a similar expression profile with C57BL/6J choline kinase (90% correlation threshold), we found lower expression of this gene set both during depletion (Enrichment Score = 0.59, Bonferroni corrected p = 0.026) and repletion (Enrichment Score = 0.63, Bonferroni corrected p = 0.027) compared to C57BL/6J on the folate treatment diet. We conclude that choline supplementation worsened folate retention during dietary folate depletion and induced down-regulation of choline kinase and its cluster members, contrary to the trends observed in the early depletion period in A/J.

[Figure 4.5: line plot for A/J and C57BL/6J. Y-axis: log2 expression (-1.5 to 1.5). X-axis: days of treatment (0, 1, 2, 7, 14, 15, 21), divided into depletion and repletion phases.]

Figure 4.5 Gene expression profile of choline kinase during folate perturbation in A/J and C57BL/6J. Gene expression change is shown in log2 scale in which expression change of 1 unit equals two-fold change. The log2 value is gene expression change relative to universal reference standard.

Discussion

Defects in homocysteine and folate metabolism have been associated with increased risk for cardiovascular diseases, neural tube defects, cancers, and neurodegenerative disorders (Homocysteine Studies Collaboration 2002, Seshadri et al. 2002). Dietary folate supplementation often reduces the risk of many of these diseases (Rimm et al. 1998, MRC Vitamin Study Research Group 1991, Giovannucci et al. 1998). Although common mutations in these pathways have been discovered (Frosst et al. 1995, van der Put et al. 1998), they do not account for the variety of diseases associated with the pathways, suggesting that other interacting pathways are involved in disease processes. Therefore, identification of interacting pathways and their regulation is important for understanding disease mechanisms. To identify the key interacting pathways, we performed dietary folate perturbation in two genetically distinct mouse strains and monitored pathway responses at the gene expression level using microarrays. We found that, compared to C57BL/6J, the A/J strain responded better to folate depletion and recovered faster during folate repletion in both serum folate and global gene expression levels.

We identified significant strain differences in the gene expression response to folate perturbation in the cholesterol metabolism pathway. The A/J strain showed a significant increase in serum total cholesterol and a decrease in liver total cholesterol during the initial days of folate depletion, coinciding with the significant serum folate loss and global gene expression changes. The magnitude of the serum cholesterol increase in A/J was comparable to the variation in serum cholesterol due to genetic variation between A/J and C57BL/6J (Singer et al. 2004). We investigated the role of increased serum cholesterol level in the folate perturbation response using ApoE knockout mice on the C57BL/6J background and showed that this increase modestly improved folate retention during folate depletion, but at the cost of inducing greater changes in other pathways such as fatty acid metabolism. In fact, the increase in serum cholesterol was only a transient effect in A/J and was not sustained throughout the depletion period. It is possible that the increase in serum cholesterol is an initial response to folate perturbation, or a response to a modest decrease in serum folate, because a continued increase in serum cholesterol would probably place additional stresses on the system.

Although changes in dietary folate level lead to changes in cholesterol metabolism, and changes in cholesterol metabolism lead to changes in homocysteine and folate metabolism, we do not know the mechanisms responsible for these relationships. Ex vivo studies indicate that cholesterol regulates cellular import of folate by mediating the clustering of folate receptors on the cell membrane (Smart et al. 1996, Mayor et al. 1998). In fact, a low cholesterol level ex vivo leads to decreased folate import (Chang et al. 1992). Additional evidence for pathway interactions includes increased serum homocysteine levels and altered homocysteine and folate gene expression profiles in heterozygous ApoB knockout mice (Ernest et al. 2002). Also, Cbs knockout mice, which carry a mutation in the homocysteine pathway, show altered lipid and cholesterol metabolism without a change in total serum cholesterol level (Werstuck et al. 2001, Namekata et al. 2004). However, the mechanisms leading to changes in gene expression and metabolite levels of the cholesterol pathway remain to be discovered.

We examined the strain-specific gene expression profiles for genes and pathways that were likely to participate in early differential responses to folate perturbation and identified choline kinase as a candidate, based on our evidence and on information from the literature. Increased choline intake in C57BL/6J decreased folate retention and reduced gene expression of choline kinase and its cluster members, in the direction opposite to that of the A/J strain. These changes suggest that choline can induce or enhance expression of folate perturbation response genes, especially those that clustered with choline kinase, and may be involved in mediating part of the folate perturbation response cascade. Inhibition of choline kinase or choline depletion may reveal the specific role of choline in the folate perturbation response.

Surprisingly, we found a decrease in serum homocysteine level during folate depletion, albeit smaller in magnitude than the serum folate loss. A decrease in dietary folate intake is often associated with an increase in serum homocysteine level in humans (Homocysteine Lowering Trialists’ Collaboration 1998). It is possible that folate depletion alone does not provide sufficient stress to increase serum homocysteine level, as other animal studies often use a high methionine and low folate diet (Werstuck et al. 2001). It is also possible that folate depletion was too severe and did not recapitulate the low dietary folate condition in humans. However, folate loss itself, independent of serum homocysteine level, has been associated with disease risks (Kato et al. 1999). In addition, a decrease in serum homocysteine level is observed in type I diabetics without renal complications (Robillon et al. 1994, Jacobs et al. 1998) and in ApoE knockout mice (Moghadasian et al. 2001), suggesting that such a decrease represents an important marker of physiological state under certain conditions.

We used pooled RNA samples from eight replicate liver tissues at each time point to detect global and gene-specific expression changes in A/J and C57BL/6J. Given the high cost of microarrays, pooling allowed us to obtain the average gene expression effect among eight replicate mice. However, we cannot detect outlier mice from a pooled RNA sample, so some of the gene expression changes may result from biological noise rather than from dietary folate perturbation. To minimize the effect of outlier mice on the gene expression profile, we first used eight replicate mice for each time point so that any outlier effect is only a fraction of the gene expression level in a pooled sample. In hierarchical clustering of A/J and C57BL/6J time points, we observed that the C57BL/6J global gene expression profile was farther from the control than the A/J profile during both folate depletion and repletion. If the effect of outlier mice were large enough to prevent C57BL/6J from clustering with the nearest control time points, the depletion (1, 2, 7 days) and repletion (15, 21 days) time points would not cluster amongst themselves but would rather cluster randomly, away from each other. For individual genes and pathways, we performed metabolite assays to detect changes in pathway activity at the biochemical level and performed additional experiments using a gene knockout and a chemical supplement to validate the role of these pathways in homocysteine and folate pathway maintenance. In future experiments, we can perform microarray analysis on each biological replicate, with each biological replicate assayed in technical replicates, to detect and eliminate biological noise from the dataset.
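The dilution of an outlier in a pooled sample can be illustrated with a small calculation. The following is a minimal sketch and not part of the original analysis: it assumes that each mouse contributes an equal amount of RNA, so that the pooled signal equals the arithmetic mean of the individual signals, and the expression values and function names are invented for illustration.

```python
from statistics import mean


def pooled_signal(levels):
    """Signal of a pooled sample, assuming equal RNA input from each animal."""
    return mean(levels)


def outlier_shift(n_mice, baseline, outlier_level):
    """Shift in the pooled signal caused by a single outlier among n_mice."""
    typical = [baseline] * n_mice
    with_outlier = [baseline] * (n_mice - 1) + [outlier_level]
    return pooled_signal(with_outlier) - pooled_signal(typical)


# Hypothetical values: seven mice at 100 units and one outlier at 180 units.
shift = outlier_shift(n_mice=8, baseline=100.0, outlier_level=180.0)
print(shift)  # the 80-unit deviation is diluted to 80 / 8 = 10 units
```

Under this assumption, a single outlier moves the pooled value by one eighth of its own deviation, which is why a large outlier can still contribute a detectable change for an individual gene.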

We identified strain-specific changes in cholesterol metabolism and a role for choline in mediating the folate perturbation response. However, the effect of these parameters on serum folate level and gene expression response was modest. Without knowledge of gene expression regulation, it is difficult to identify optimal targets for additional perturbation in studying pathway responses to folate depletion or in predicting its consequences (Ideker et al. 2001). Ongoing studies in identifying gene expression circuitry (Harbison et al. 2004), reverse engineering and probabilistic modeling of expression data (Liao et al. 2003, Basso et al. 2005), and QTL analysis of gene expression responses (Schadt et al. 2005) will significantly improve our ability to identify optimal targets for studying the regulation of folate perturbation responses. In the meantime, these pathways serve as a component list for regulatory networks involved in the maintenance of homocysteine and folate pathways, which may play a more general role in complex diseases.


CHAPTER 5

SUMMARY AND FUTURE DIRECTIONS


Summary

Defects in homocysteine and folate metabolism are associated with increased risk for a variety of common and complex diseases. Although the biochemical functions of these pathways have been identified, these biological processes do not explain the role of these pathways in complex diseases. In addition, common mutations in homocysteine and folate pathways do not account for all of the disease types associated with the pathways. Mouse models allow us to control much of the genetic and environmental variation that complicates the study of complex traits and diseases in humans. Recent genetic studies in mice suggested that homocysteine and folate pathways interact with different pathways in diseases and that mutations in other pathways regulate metabolism and gene expression of homocysteine and folate pathways. Therefore, understanding the role of homocysteine and folate metabolism in diseases requires identification of interacting pathways and an understanding of the molecular mechanisms of those interactions and of their regulation. To accomplish these goals, I used quantitative genetic, evolutionary, and genomic approaches.

In chapter 2, I examined the regulation of a key enzyme, methylenetetrahydrofolate reductase (MTHFR), which connects the folate pathway to the homocysteine pathway. A unique inbred mouse strain, PL/J, showed reduced MTHFR activity as well as seizure and tail-kink phenotypes that were previously observed in MTHFR knockout mice and that resemble phenotypes in human patients. I found that there was no coding mutation in Mthfr in PL/J that explained the reduced enzyme activity, suggesting the existence of cis- or trans-acting mutations. I used genetic crosses between PL/J and C57BL/6J to examine the inheritance of the seizure and tail-kink traits. If the traits showed strong genetic control, these crosses could be used to test for shared genetic control between reduced MTHFR activity and the seizure and tail-kink phenotypes. I found that seizure onset and frequency were strongly influenced by uncontrolled environmental variation and that the tail-kink trait is highly polygenic, suggesting that these traits are too complex to be analyzed using conventional crosses. However, for the seizure phenotype, I identified a genetic interaction between PL/J and DBA/2J in F1 mice that is not found in PL/JxC57BL/6J F1 mice. Given the availability of C57BL/6JxDBA/2J recombinant inbred (RI) strains, these RI strains can be crossed to PL/J to generate genetically identical F1 mice for analysis of dominant modifiers of the PL/J seizure phenotype. These results suggested that the use of advanced genetic mouse strains will be crucial for dissecting these complex traits because there is significant uncontrolled environmental variation that strongly influences the phenotype of interest.

In chapter 3, I investigated the role of biochemical interactions in providing genetic robustness to mutations. I inferred robustness from faster gene evolution rates between human and mouse, with the hypothesis that a faster rate implies improved tolerance to mutations. I found that redundant metabolic paths (networks) do not contribute to variation in gene evolution rates. Similarly, redundancy in gene families did not show a significant effect. However, I found that protein interactions, tissue expression patterns, and gene function contribute significantly to variation in gene evolution rate. This suggested that the topology of the metabolic pathway alone is not enough to capture the complexity of biological networks. Rather, a representation of interactions at the organism level, taking into account the tissue requirements for each enzyme and the function of those enzymes within the organism, better accounts for variation in gene evolution rates.

In chapter 4, I searched for pathways that regulate homeostasis of homocysteine and folate metabolism after dietary folate perturbation. I found that the A/J strain responded better to folate perturbation than C57BL/6J both in minimizing the effects of depletion and in the rate of recovery as assessed by serum folate levels and global gene expression profiles. I found that during folate perturbations, cholesterol metabolism was significantly altered both in gene expression profiles and in serum and liver cholesterol levels. I also found that increases in serum cholesterol improved folate retention at modest levels in C57BL/6J but also induced additional gene expression changes. There was no change in the global DNA methylation status, an important marker of stress in homocysteine and folate metabolism. I also identified choline as a mediator of folate perturbation response. These pathways are part of a network involved in maintenance of homocysteine and folate metabolism and can be studied together in dissecting the role of homocysteine and folate in complex diseases.

Future Directions

Homeostatic response to dietary folate perturbation

In chapter 4, I found higher serum folate retention and fewer gene expression changes in A/J compared to C57BL/6J during dietary folate depletion. I also found faster recovery towards control serum folate and gene expression profiles in A/J once dietary folate was restored. However, many of the conclusions were based only on serum metabolite changes and liver gene expression profiles. Although the liver is the primary organ of homocysteine and folate metabolism, and serum metabolite changes indicate the amount of metabolites transported across tissues, some tissues may function independently of metabolite changes in serum and liver. One such tissue is the brain.

The brain is insulated from changes in serum metabolite levels by the blood-brain barrier. Folate is transported across the blood-brain barrier by folate receptors, as described in the introduction. However, cholesterol, a metabolite we identified as important for folate retention, is not actively transported across the blood-brain barrier. The cholesterol flux across the blood-brain barrier is only one percent of the flux of cholesterol between serum and other tissues (Dietschy and Turley 2001). Therefore, brain cholesterol is synthesized de novo as well as degraded and recycled within the brain. This suggests that any changes in serum cholesterol we observed during folate depletion may not affect the cholesterol level in the brain.

Aside from the liver, I collected four different regions of the brain: cortex, mid- and hindbrain, cerebellum, and pons and brain stem. I can measure cholesterol levels in these four brain regions and ask whether brain tissues show changes in cholesterol level similar to those observed in serum. Given that the brain synthesizes and degrades its own cholesterol, changes in brain cholesterol may take longer than serum cholesterol changes. This could lead to faster loss of folate in the brain compared to other tissues. I could examine other mechanisms of folate transport and folate retention in the brain that can counteract the low cholesterol flux across the blood-brain barrier. For example, I could measure the changes in folate transporter mRNA level in the brain to test whether folate receptor numbers are increased to import more folate across the blood-brain barrier. I could also measure the levels of polyglutamate attached to folate, which improves cellular retention of folate (as mentioned in the introduction).

I could also measure the global gene expression profile in the brain and determine whether the brain shows a strain-specific homeostatic response similar to that of the liver. Given the significant decline in the cost of microarrays in recent years, I could perform gene expression analysis on replicate tissues rather than pooling all replicate tissues into one and assaying the pool. This will reduce the likelihood of identifying gene expression changes that are due to biological noise rather than to dietary folate perturbation. An increase in the number of replicates will also increase the statistical power and allow me to identify the genes that differ from the control at each time point, rather than determining the relationship between samples based on a correlation distance as I did in chapter 4. I could also ask whether the same sets of genes and pathways undergo gene expression changes in the brain as in the liver. I may identify pathways important for maintenance of homocysteine and folate metabolism that are specific to the brain.
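The contrast between a profile-level distance and a per-gene comparison can be sketched as follows. This is an illustration under stated assumptions, not the analysis performed in the thesis: the correlation distance is taken here to be one minus the Pearson correlation, the per-gene comparison is a Welch t statistic chosen only as an example of what replicate data would permit, and all values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    """Profile-level distance: near 0 for the same shape, near 2 for opposite."""
    return 1.0 - pearson(x, y)


def welch_t(group_a, group_b):
    """Welch t statistic for one gene measured in two groups of replicates."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical log expression values for five genes in two pooled samples.
control_pool = [1.0, 2.0, 3.0, 4.0, 5.0]
depleted_pool = [5.0, 4.0, 3.0, 2.0, 1.0]
distance = correlation_distance(control_pool, depleted_pool)

# Hypothetical values for a single gene in three replicate mice per group.
t_stat = welch_t([10.0, 11.0, 12.0], [13.0, 14.0, 15.0])
print(round(distance, 3), round(t_stat, 3))
```

A single pooled sample per time point supports only the first kind of summary, whereas replicate arrays provide the within-group variance needed for the second.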

The change in brain cholesterol level in response to dietary folate perturbation is important for understanding the relationship between Alzheimer’s disease risk and homocysteine and folate pathways. Elevated serum cholesterol and serum homocysteine levels have independently been associated with increased risk for Alzheimer’s disease. Also, the increase in cholesterol flux across the blood-brain barrier is proportional to the increase in severity of dementia in Alzheimer’s disease (Lutjohann et al. 2000). Measuring the changes in metabolite and gene expression profiles in the brain will offer new insights into the relationship between homocysteine and folate pathway maintenance and cholesterol metabolism, which may play a role in the pathogenesis of Alzheimer’s disease.

Building a network of homocysteine and folate pathway homeostasis

In chapter 4, I was able to identify and experimentally validate the involvement of cholesterol and choline pathways in the dietary folate perturbation response. I used a knockout model of a cholesterol transport gene and chemical supplementation with choline to specifically perturb these pathways. However, many other genes showed gene expression changes that were not examined in chapter 4. These genes may be part of a larger network involved in homocysteine and folate pathway maintenance. Although there are mouse knockout models for some of these genes, there is currently no high-throughput method to specifically perturb individual genes and pathways in the mouse. However, in the model organism C. elegans, RNAi is routinely used to downregulate specific genes in a high-throughput fashion.

Unlike in mouse models, I cannot measure the changes in serum homocysteine and folate levels in C. elegans to test the role of each gene in homocysteine and folate pathway maintenance. However, I can measure the changes in expression level of homocysteine and folate pathway genes to build a gene expression network of homocysteine and folate pathway maintenance. In other words, I can use gene expression levels of homocysteine and folate pathways as reporters in an RNAi screen.

There were many genes that showed expression changes in response to dietary folate perturbation in chapter 4. To prioritize genes for the RNAi screen, I will first look for transcription factors that showed gene expression changes in my mouse experiments. I will identify orthologs of these transcription factors in C. elegans using the same method as in chapter 3, perform RNAi, and measure the changes in expression of homocysteine and folate pathway genes. This will identify some of the transcription factors that directly or indirectly regulate expression of homocysteine and folate pathway genes.
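The prioritization steps described above can be sketched as a simple filter. This is only an illustration: the function name, the gene lists, and the ortholog table are hypothetical placeholders, and the ortholog lookup stands in for the ortholog identification method of chapter 3. The only pairing taken from the text is Foxo1 with its C. elegans ortholog daf-16.

```python
def prioritize_rnai_targets(changed_genes, transcription_factors, worm_orthologs):
    """Return (mouse gene, worm gene) pairs for changed transcription factors.

    A gene is kept only if it changed expression, is annotated as a
    transcription factor, and has a known C. elegans ortholog.
    """
    targets = []
    for gene in sorted(changed_genes):
        if gene in transcription_factors and gene in worm_orthologs:
            targets.append((gene, worm_orthologs[gene]))
    return targets


# Hypothetical inputs for illustration only.
changed = {"Foxo1", "Chka", "Mthfd1"}   # genes responding to folate perturbation
tfs = {"Foxo1", "Hnf4a"}                # genes annotated as transcription factors
orthologs = {"Foxo1": "daf-16"}         # mouse gene -> C. elegans ortholog

candidates = prioritize_rnai_targets(changed, tfs, orthologs)
print(candidates)
```

Each returned pair would then be a candidate for RNAi knockdown, with expression of homocysteine and folate pathway genes as the readout.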

For example, the forkhead transcription factor Foxo1 showed a gene expression pattern opposite to that of choline kinase in both A/J and C57BL/6J. That is, at the time points at which choline kinase expression was upregulated, Foxo1 was downregulated. FOXO1 is a transcription factor that acts downstream of the insulin/IGF receptor in regulating some of the genes involved in ageing. In C. elegans, genes that were induced in daf-2 (IGF/insulin receptor) mutants but repressed in daf-2 / daf-16 (Foxo1 ortholog in C. elegans) double mutants were identified through microarray studies (Murphy et al. 2003). Because daf-2 mutants show increased lifespan, the genes identified in this paper were genes involved in longevity that were regulated by DAF-16, the FOXO1 ortholog. These genes included Mthfd (folate pathway gene) and glutathione transferase (homocysteine pathway gene). RNAi-mediated downregulation of these genes led to a decrease in lifespan in daf-2 mutants. Given the role of homocysteine and folate pathways in DNA synthesis, DNA methylation, and clearance of reactive oxygen species, it is no surprise that these pathways are part of a gene expression network involved in the ageing process. In our mouse experiment, both Mthfd and glutathione transferase showed gene expression changes during folate perturbation. This suggests that FOXO1 may be involved in regulating expression of homocysteine and folate pathway genes during dietary folate perturbation.

To test whether transcription factors identified in my RNAi screen directly regulate homocysteine and folate genes, I can perform a ChIP-chip experiment (Odom et al. 2004) using liver tissue from mice under dietary folate perturbation. In this experiment, transcription factors are cross-linked to their target genes, the DNA is digested, and an antibody specific to a transcription factor of interest is used to pull down target gene sequences. For example, I could ask whether FOXO1 directly regulates Mthfd and glutathione transferase or whether there are other transcription factors downstream of FOXO1 that mediate the gene expression change. Because a ChIP-chip experiment is a global survey of all genes regulated by a particular transcription factor, I can build a regulatory network of homocysteine and folate pathway genes as well as genes from other pathways. This provides a global view of how different pathways are coordinately regulated during dietary folate perturbation.

For other genes that respond to dietary folate perturbation, I can prioritize genes for the RNAi screen by looking for genes that showed conserved co-expression with homocysteine and folate pathway genes. Previously, Stuart et al. (2003) analyzed microarray data from human, fly, worm, and yeast and identified genes that are co-expressed across these species. These evolutionarily conserved co-expressions are likely to have functional significance. For example, in the data from Stuart et al. (2003), the homocysteine gene Ahcy was co-expressed with the cholesterol synthesis gene Hmgcs, and the folate pathway genes Mthfr and Mtr were co-expressed with glycolysis and Krebs cycle genes. I could ask whether disruption of these co-regulated genes by RNAi leads to changes in gene expression of homocysteine and folate pathways. Although the mechanism of regulation will not be direct physical interaction, these data will expand the network of gene expression cascades involved in homocysteine and folate pathway maintenance.


APPENDIX

Appendix 1. Gene symbol, gene name, and enzyme commission (E.C.) number of enzymes in homocysteine and folate metabolic pathways. Names of methyltransferases are not listed due to the large number of enzymes (over 100 methyltransferases) involved in this reaction.

Abbreviation  Gene Name                                               E.C. Number
Ahcy          S-adenosylhomocysteine hydrolase                        3.3.1.1
Ams           S-adenosylmethionine synthase                           2.5.1.6
Atic          5-aminoimidazole-4-carboxamide ribonucleotide           2.1.2.3
              formyltransferase/IMP cyclohydrolase (bifunctional)     3.5.4.10
Bhmt          betaine homocysteine methyltransferase                  2.1.1.5
Bhmt 2        betaine homocysteine methyltransferase 2                2.1.1.5
Cbs           cystathionine beta-synthase                             4.2.1.22
Cth           cystathionine gamma-lyase                               4.4.1.1
Dhfr          dihydrofolate reductase                                 1.5.1.3
Ftcd          formiminotransferase cyclodeaminase                     2.1.2.5
                                                                      4.3.1.4
Fthfd         10-formyltetrahydrofolate dehydrogenase                 1.5.1.6
Gart          phosphoribosylglycinamide formyltransferase             2.1.2.2
              (trifunctional)                                         6.3.3.1
                                                                      6.3.4.13
Mthfd 1       methylenetetrahydrofolate dehydrogenase (NADP+          3.5.4.9
              dependent) (trifunctional, cytoplasmic)                 1.5.1.5
                                                                      6.3.4.3
Mthfd 2       methylenetetrahydrofolate dehydrogenase (NAD+           3.5.4.9
              dependent) (bifunctional, mitochondrial)                1.5.1.15
Mthfr         5,10-methylenetetrahydrofolate reductase                1.5.1.20
Mthfs         5,10-methenyltetrahydrofolate synthetase                6.3.3.2
Mtr           methionine synthase                                     2.1.1.13
Shmt 1        serine hydroxymethyltransferase 1 (cytoplasmic)         2.1.2.1
Shmt 2        serine hydroxymethyltransferase 2 (mitochondrial)       2.1.2.1
Tyms          thymidylate synthetase                                  2.1.1.45

APPENDIX 2

Chemical Composition of Basal Diet 5755 (TestDiet)

Nutrients
Component                            Unit     Amount
Protein                              %        19.0
Arginine                             %        0.73
Cysteine                             %        0.08
Glycine                              %        0.41
Histidine                            %        0.54
Isoleucine                           %        1.00
Leucine                              %        1.82
Lysine                               %        1.53
Methionine                           %        0.69
Phenylalanine                        %        1.00
Tyrosine                             %        1.06
Threonine                            %        0.81
Tryptophan                           %        0.23
Valine                               %        1.20
Fat                                  %        10.0
Cholesterol                          ppm      0
Fiber (crude)                        %        4.3
Carbohydrates                        %        60.6
Energy (Physiological fuel value)    kcal/g   4.09

Minerals
Component                            Unit     Amount
                                     %        0.60
Phosphorus                           %        0.57
Potassium                            %        0.40
Magnesium                            %        0.07
Sodium                               %        0.21
Chlorine                             %        0.24
                                     ppm      5.0
Iron                                 ppm      60
Zinc                                 ppm      21
Manganese                            ppm      65
Copper                               ppm      15.0
Cobalt                               ppm      3.2
Iodine                               ppm      0.57
Chromium                             ppm      3.0
                                     ppm      0.82
Selenium                             ppm      0.23

Vitamins
Component                            Unit     Amount
Vitamin K (as )                      ppm      10.4
Thiamin Hydrochloride                ppm      20.6
                                     ppm      20.0
                                     ppm      90
                                     ppm      55
Choline Chloride                     ppm      1400
Folic Acid                           ppm      4.0
Pyridoxine                           ppm      16.5
                                     ppm      0.4
Vitamin K-12                         mcg/kg   20
                                     IU/g     22.1
-3 (added)                           IU/g     2.2
                                     IU/kg    50
Ascorbic Acid                        ppm      0.0

Folic Acid Deficient Purified Diet with 1% Succinylsulfathiazole 58C3 (TestDiet)

Same composition as the basal diet, except that the diet contains no folic acid and contains 1% succinylsulfathiazole (antibiotic).


BIBLIOGRAPHY

Allain, C.C., L.S. Poon, C.S. Chan, W. Richmond, and P.C. Fu. 1974. Enzymatic determination of total serum cholesterol. Clin Chem 20: 470-475.

Almaas, E., B. Kovacs, T. Vicsek, Z.N. Oltvai, and A.L. Barabasi. 2004. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature 427: 839-843.

Ashburner, M., C.A. Ball, J.A. Blake, D. Botstein, H. Butler, J.M. Cherry, A.P. Davis, K. Dolinski, S.S. Dwight, J.T. Eppig et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25-29.

Bailey, D.W. 1981. Recombinant inbred strains and bilineal congenic strains. In The mouse in biomedical research (eds. H.L. Foster, J.D. Small, and J.G. Fox), pp. 223-239. Academic Press, New York.

Bairoch, A. and R. Apweiler. 2000. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res 28: 45-48.

Barbera, J.P., T.A. Rodriguez, N.D. Greene, W.J. Weninger, A. Simeone, A.J. Copp, R.S. Beddington, and S. Dunwoodie. 2002. Folic acid prevents exencephaly in Cited2 deficient mice. Hum Mol Genet 11: 283-293.

Basso, K., A.A. Margolin, G. Stolovitzky, U. Klein, R. Dalla-Favera, and A. Califano. 2005. Reverse engineering of regulatory networks in human B cells. Nat Genet 37: 382-390.

Bautista, L.E., I.A. Arenas, A. Penuela, and L.X. Martinez. 2002. Total plasma homocysteine level and risk of cardiovascular disease: a meta-analysis of prospective cohort studies. J Clin Epidemiol 55: 882-887.

Bellamy, M.F., I.F. McDowell, M.W. Ramsey, M. Brownlee, R.G. Newcombe, and M.J. Lewis. 1999. Oral folate enhances endothelial function in hyperhomocysteinaemic subjects. Eur J Clin Invest 29: 659-662.

Benito, E., A. Stiggelbout, F.X. Bosch, A. Obrador, J. Kaldor, M. Mulet, and N. Munoz. 1991. Nutritional factors in colorectal cancer risk: a case-control study in Majorca. Int J Cancer 49: 161-167.

Bhalla, U.S. and R. Iyengar. 1999. Emergent properties of networks of biological signaling pathways. Science 283: 381-387.

Blake, J.A., J.E. Richardson, C.J. Bult, J.A. Kadin, and J.T. Eppig. 2003. MGD: the Mouse Genome Database. Nucleic Acids Res 31: 193-195.

Bostom, A.G. and L. Lathrop. 1997. Hyperhomocysteinemia in end-stage renal disease: prevalence, etiology, and potential relationship to arteriosclerotic outcomes. Kidney Int 52: 10-20.

Bostom, A.G. and C. Garber. 2000. Endpoints for homocysteine-lowering trials. Lancet 355: 511-512.

Boushey, C.J., S.A. Beresford, G.S. Omenn, and A.G. Motulsky. 1995. A quantitative assessment of plasma homocysteine as a risk factor for vascular disease. Probable benefits of increasing folic acid intakes. JAMA 274: 1049-1057.

Chadwick, L.H., S.E. McCandless, G.L. Silverman, S. Schwartz, D. Westaway, and J.H. Nadeau. 2000. Betaine-homocysteine methyltransferase-2: cDNA cloning, gene sequence, physical mapping, and expression of the human and mouse genes. Genomics 70: 66-73.

Chang, W.J., K.G. Rothberg, B.A. Kamen, and R.G. Anderson. 1992. Lowering the cholesterol content of MA104 cells inhibits receptor-mediated transport of folate. J Cell Biol 118: 63-69.

Chen, J., E. Giovannucci, K. Kelsey, E.B. Rimm, M.J. Stampfer, G.A. Colditz, D. Spiegelman, W.C. Willett, and D.J. Hunter. 1996. A methylenetetrahydrofolate reductase polymorphism and the risk of colorectal cancer. Cancer Res 56: 4862-4864.

Chen, J., E. Giovannucci, S.E. Hankinson, J. Ma, W.C. Willett, D. Spiegelman, K.T. Kelsey, and D.J. Hunter. 1998. A prospective study of methylenetetrahydrofolate reductase and methionine synthase gene polymorphisms, and risk of colorectal adenoma. Carcinogenesis 19: 2129-2132.

Chen, R.Z., U. Pettersson, C. Beard, L. Jackson-Grusby, and R. Jaenisch. 1998. DNA hypomethylation leads to elevated mutation rates. Nature 395: 89-93.

Chen, Z., A.C. Karaplis, S.L. Ackerman, I.P. Pogribny, S. Melnyk, S. Lussier-Cacan, M.F. Chen, A. Pai, S.W. John, R.S. Smith et al. 2001. Mice deficient in methylenetetrahydrofolate reductase exhibit hyperhomocysteinemia and decreased methylation capacity, with neuropathology and aortic lipid deposition. Hum Mol Genet 10: 433-443.

Christensen, B., L. Arbour, P. Tran, D. Leclerc, N. Sabbaghian, R. Platt, B.M. Gilfix, D.S. Rosenblatt, R.A. Gravel, P. Forbes et al. 1999. Genetic polymorphisms in methylenetetrahydrofolate reductase and methionine synthase, folate levels in red blood cells, and risk of neural tube defects. Am J Med Genet 84: 151-157.

Clarke, R., L. Daly, K. Robinson, E. Naughten, S. Cahalane, B. Fowler, and I. Graham. 1991. Hyperhomocysteinemia: an independent risk factor for vascular disease. N Engl J Med 324: 1149-1155.

Clarke, R., A.D. Smith, K.A. Jobst, H. Refsum, L. Sutton, and P.M. Ueland. 1998. Folate, vitamin B12, and serum total homocysteine levels in confirmed Alzheimer disease. Arch Neurol 55: 1449-1455.

Homocysteine Lowering Trialists’ Collaboration. 1998. Lowering blood homocysteine with folic acid based supplements: meta-analysis of randomised trials. BMJ 316: 894-898.

Homocysteine Studies Collaboration. 2002. Homocysteine and risk of ischemic heart disease and stroke: a meta-analysis. JAMA 288: 2015-2022.

Comeron, J.M. and M. Aguade. 1998. An evaluation of measures of synonymous codon usage bias. J Mol Evol 47: 268-274.

Cravo, M.L., L.M. Gloria, J. Selhub, M.R. Nadeau, M.E. Camilo, M.P. Resende, J.N. Cardoso, C.N. Leitao, and F.C. Mira. 1996. Hyperhomocysteinemia in chronic alcoholism: correlation with folate, vitamin B-12, and vitamin B-6 status. Am J Clin Nutr 63: 220-224.

Curtin, K., J. Bigler, M.L. Slattery, B. Caan, J.D. Potter, and C.M. Ulrich. 2004. MTHFR C677T and A1298C polymorphisms: diet, estrogen, and risk of colon cancer. Cancer Epidemiol Biomarkers Prev 13: 285-292.

Czeizel, A.E. and I. Dudas. 1992. Prevention of the first occurrence of neural-tube defects by periconceptional vitamin supplementation. N Engl J Med 327: 1832-1835.

de Franchis, R., G. Sebastio, C. Mandato, G. Andria, and P. Mastroiacovo. 1995. Spina bifida, 677T-->C mutation, and role of folate. Lancet 346: 1703.

De Marco, P., M.G. Calevo, A. Moroni, L. Arata, E. Merello, R.H. Finnell, H. Zhu, L. Andreussi, A. Cama, and V. Capra. 2002. Study of MTHFR and MS polymorphisms as risk factors for NTD in the Italian population. J Hum Genet 47: 319-324.

Demant, P. and A.A. Hart. 1986. Recombinant congenic strains--a new tool for analyzing genetic traits determined by more than one gene. Immunogenetics 24: 416-422.

Devlin, A.M., E. Arning, T. Bottiglieri, F.M. Faraci, R. Rozen, and S.R. Lentz. 2004. Effect of Mthfr genotype on diet-induced hyperhomocysteinemia and vascular function in mice. Blood 103: 2624-2629.

Dietschy, J.M. and S.D. Turley. 2001. Cholesterol metabolism in the brain. Curr Opin Lipidol 12: 105-112.

Dipple, K.M., J.K. Phelan, and E.R. McCabe. 2001. Consequences of complexity within biological networks: robustness and health, or vulnerability and disease. Mol Genet Metab 74: 45-50.

Doll, R. and R. Peto. 1981. The causes of cancer: quantitative estimates of avoidable risks of cancer in the United States today. J Natl Cancer Inst 66: 1191-1308.

Drage, M.G., G.L. Holmes, and T.N. Seyfried. 2002. Hippocampal neurons and glia in epileptic EL mice. J Neurocytol 31: 681-692.

Duff, E.M., E.S. Cooper, C.M. Danbury, B.E. Johnson, and G.R. Serjeant. 1991. Neural tube defects in hurricane aftermath. Lancet 337: 120-121.

Eberhardt, R.T., M.A. Forgione, A. Cap, J.A. Leopold, M.A. Rudd, M. Trolliet, S. Heydrick, R. Stark, E.S. Klings, N.I. Moldovan et al. 2000. Endothelial dysfunction in a murine model of mild hyperhomocyst(e)inemia. J Clin Invest 106: 483-491.

Ernest, S., B. Christensen, B.M. Gilfix, O.A. Mamer, A. Hosack, M. Rodier, C. Colmenares, J. McGrath, A. Bale, R. Balling et al. 2002. Genetic and molecular control of folate-homocysteine metabolism in mutant mice. Mamm Genome 13: 259-267.

Feinberg, A.P. and B. Vogelstein. 1983. Hypomethylation distinguishes genes of some human cancers from their normal counterparts. Nature 301: 89-92.

Ferraro, T.N., G.T. Golden, G.G. Smith, P. St Jean, N.J. Schork, N. Mulholland, C. Ballas, J. Schill, R.J. Buono, and W.H. Berrettini. 1999. Mapping loci for pentylenetetrazol-induced seizure susceptibility in mice. J Neurosci 19: 6733-6739.

Fijneman, R.J., S.S. de Vries, R.C. Jansen, and P. Demant. 1996. Complex interactions of new quantitative trait loci, Sluc1, Sluc2, Sluc3, and Sluc4, that influence the susceptibility to lung cancer in the mouse. Nat Genet 14: 465-467.

Finkelstein, J.D. 1990. Methionine metabolism in mammals. J Nutr Biochem 1: 228-237.

Finkelstein, J.D. 1998. The metabolism of homocysteine: pathways and regulation. Eur J Pediatr 157 Suppl 2: S40-44.

Fleming, A. and A.J. Copp. 1998. Embryonic folate metabolism and mouse neural tube defects. Science 280: 2107-2109.

Folch, J., M. Lees, and G.H. Sloane Stanley. 1957. A simple method for the isolation and purification of total lipides from animal tissues. J Biol Chem 226: 497-509.

Force, A., M. Lynch, F.B. Pickett, A. Amores, Y.L. Yan, and J. Postlethwait. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151: 1531-1545.

Frankel, W.N., B.A. Taylor, J.L. Noebels, and C.M. Lutz. 1994. Genetic epilepsy model derived from common inbred mouse strains. Genetics 138: 481-489.

Frankel, W.N., L. Taylor, B. Beyer, B.L. Tempel, and H.S. White. 2001. Electroconvulsive thresholds of inbred mouse strains. Genomics 74: 306-312.

Fraser, H.B., A.E. Hirsh, L.M. Steinmetz, C. Scharfe, and M.W. Feldman. 2002. Evolutionary rate in the protein interaction network. Science 296: 750-752.

Freudenheim, J.L., S. Graham, J.R. Marshall, B.P. Haughey, S. Cholewinski, and G. Wilkinson. 1991. Folate intake and carcinogenesis of the colon and rectum. Int J Epidemiol 20: 368-374.

Freudenheim, J.L., J.R. Marshall, J.E. Vena, R. Laughlin, J.R. Brasure, M.K. Swanson, T. Nemoto, and S. Graham. 1996. Premenopausal breast cancer risk and intake of vegetables, fruits, and related nutrients. J Natl Cancer Inst 88: 340-348.

Freudenheim, J.L. 1999. Study design and hypothesis testing: issues in the evaluation of evidence from research in nutritional epidemiology. Am J Clin Nutr 69: 1315S-1321S.

Friso, S., S.W. Choi, D. Girelli, J.B. Mason, G.G. Dolnikowski, P.J. Bagley, O. Olivieri, P.F. Jacques, I.H. Rosenberg, R. Corrocher et al. 2002. A common mutation in the 5,10-methylenetetrahydrofolate reductase gene affects genomic DNA methylation through an interaction with folate status. Proc Natl Acad Sci U S A 99: 5606-5611.

Frosst, P., H.J. Blom, R. Milos, P. Goyette, C.A. Sheppard, R.G. Matthews, G.J. Boers, M. den Heijer, L.A. Kluijtmans, L.P. van den Heuvel et al. 1995. A candidate genetic risk factor for vascular disease: a common mutation in methylenetetrahydrofolate reductase. Nat Genet 10: 111-113.

Fueta, Y., L.A. Vasilets, K. Takeda, M. Kawamura, and W. Schwarz. 2003. Down-regulation of GABA-transporter function by hippocampal translation products: its possible role in epilepsy. Neuroscience 118: 371-378.

Gaudet, F., J.G. Hodgson, A. Eden, L. Jackson-Grusby, J. Dausman, J.W. Gray, H. Leonhardt, and R. Jaenisch. 2003. Induction of tumors in mice by genomic hypomethylation. Science 300: 489-492.

Giot, L., J.S. Bader, C. Brouwer, A. Chaudhuri, B. Kuang, Y. Li, Y.L. Hao, C.E. Ooi, B. Godwin, E. Vitols et al. 2003. A protein interaction map of Drosophila melanogaster. Science 302: 1727-1736.

Giovannucci, E., E.B. Rimm, A. Ascherio, M.J. Stampfer, G.A. Colditz, and W.C. Willett. 1995. Alcohol, low-methionine--low-folate diets, and risk of colon cancer in men. J Natl Cancer Inst 87: 265-273.

Giovannucci, E., M.J. Stampfer, G.A. Colditz, D.J. Hunter, C. Fuchs, B.A. Rosner, F.E. Speizer, and W.C. Willett. 1998. Multivitamin use, folate, and colon cancer in women in the Nurses' Health Study. Ann Intern Med 129: 517-524.

Goelz, S.E., B. Vogelstein, S.R. Hamilton, and A.P. Feinberg. 1985. Hypomethylation of DNA from benign and malignant human colon neoplasms. Science 228: 187-190.

Graham, S., R. Hellmann, J. Marshall, J. Freudenheim, J. Vena, M. Swanson, M. Zielezny, T. Nemoto, N. Stubbe, and T. Raimondo. 1991. Nutritional epidemiology of postmenopausal breast cancer in western New York. Am J Epidemiol 134: 552-566.

Graham, I.M., L.E. Daly, H.M. Refsum, K. Robinson, L.E. Brattstrom, P.M. Ueland, R.J. Palma-Reis, G.H. Boers, R.G. Sheahan, B. Israelsson et al. 1997. Plasma homocysteine as a risk factor for vascular disease. The European Concerted Action Project. Jama 277: 1775-1781.

Group, M.V.S.R. 1991. Prevention of neural tube defects: results of the Medical Research Council Vitamin Study. Lancet 338: 131-137.

Gu, Z., L.M. Steinmetz, X. Gu, C. Scharfe, R.W. Davis, and W.H. Li. 2003. Role of duplicate genes in genetic robustness against null mutations. Nature 421: 63-66.

Gueant-Rodriguez, R.M., C. Rendeli, B. Namour, L. Venuti, A. Romano, G. Anello, P. Bosco, R. Debard, P. Gerard, M. Viola et al. 2003. Transcobalamin and methionine synthase reductase mutated polymorphisms aggravate the risk of neural tube defects in humans. Neurosci Lett 344: 189-192.

Guelzim, N., S. Bottani, P. Bourgine, and F. Kepes. 2002. Topological and causal structure of the yeast transcriptional regulatory network. Nat Genet 31: 60-63.

Guenther, B.D., C.A. Sheppard, P. Tran, R. Rozen, R.G. Matthews, and M.L. Ludwig. 1999. The structure and properties of methylenetetrahydrofolate reductase from Escherichia coli suggest how folate ameliorates human hyperhomocysteinemia. Nat Struct Biol 6: 359-365.

Han, J.D., N. Bertin, T. Hao, D.S. Goldberg, G.F. Berriz, L.V. Zhang, D. Dupuy, A.J. Walhout, M.E. Cusick, F.P. Roth et al. 2004. Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature 430: 88-93.

Harbison, C.T., D.B. Gordon, T.I. Lee, N.J. Rinaldi, K.D. Macisaac, T.W. Danford, N.M. Hannett, J.B. Tagne, D.B. Reynolds, J. Yoo et al. 2004. Transcriptional regulatory code of a eukaryotic genome. Nature 431: 99-104.

Hegde, P., R. Qi, K. Abernathy, C. Gay, S. Dharap, R. Gaspard, J.E. Hughes, E. Snesrud, N. Lee, and J. Quackenbush. 2000. A concise guide to cDNA microarray analysis. Biotechniques 29: 548-550, 552-554, 556 passim.

Hibbard, E.D. and R.W. Smithells. 1965. Folic acid metabolism and human embryopathy. Lancet i: 1254.

Hirsh, A.E. and H.B. Fraser. 2001. Protein dispensability and rate of evolution. Nature 411: 1046-1049.

Hofman, A., A. Ott, M.M. Breteler, M.L. Bots, A.J. Slooter, F. van Harskamp, C.N. van Duijn, C. Van Broeckhoven, and D.E. Grobbee. 1997. Atherosclerosis, apolipoprotein E, and prevalence of dementia and Alzheimer's disease in the Rotterdam Study. Lancet 349: 151-154.

Horne, D.W. 1997. Microbiological assay of folates in 96-well microtiter plates. Methods Enzymol 281: 38-43.

Hsueh, C.T. and B.J. Dolnick. 1993. Altered folate-binding protein mRNA stability in KB cells grown in folate-deficient medium. Biochem Pharmacol 45: 2537-2545.

Hsueh, C.T. and B.J. Dolnick. 1994. Regulation of folate-binding protein gene expression by DNA methylation in methotrexate-resistant KB cells. Biochem Pharmacol 47: 1019-1027.

Ideker, T., T. Galitski, and L. Hood. 2001. A new approach to decoding life: systems biology. Annu Rev Genomics Hum Genet 2: 343-372.

Jacob, R.A., D.M. Gretz, P.C. Taylor, S.J. James, I.P. Pogribny, B.J. Miller, S.M. Henning, and M.E. Swendseid. 1998. Moderate folate depletion increases plasma homocysteine and decreases lymphocyte DNA methylation in postmenopausal women. J Nutr 128: 1204-1212.

Jacobs, R.L., J.D. House, M.E. Brosnan, and J.T. Brosnan. 1998. Effects of streptozotocin-induced diabetes and of insulin treatment on homocysteine metabolism in the rat. Diabetes 47: 1967-1970.

Jacques, P.F., J. Selhub, A.G. Bostom, P.W. Wilson, and I.H. Rosenberg. 1999. The effect of folic acid fortification on plasma folate and total homocysteine concentrations. N Engl J Med 340: 1449-1454.

Jeong, H., B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi. 2000. The large-scale organization of metabolic networks. Nature 407: 651-654.

Jeong, H., S.P. Mason, A.L. Barabasi, and Z.N. Oltvai. 2001. Lethality and centrality in protein networks. Nature 411: 41-42.

Johnson, M.E., L. Viggiano, J.A. Bailey, M. Abdul-Rauf, G. Goodwin, M. Rocchi, and E.E. Eichler. 2001. Positive selection of a gene family during the emergence of humans and African apes. Nature 413: 514-519.

Jurka, J. 2000. Repbase update: a database and an electronic journal of repetitive elements. Trends Genet 16: 418-420.

Kanehisa, M., S. Goto, S. Kawashima, and A. Nakaya. 2002. The KEGG databases at GenomeNet. Nucleic Acids Res 30: 42-46.

Kang, S.S., P.W. Wong, and M.R. Malinow. 1992. Hyperhomocyst(e)inemia as a risk factor for occlusive vascular disease. Annu Rev Nutr 12: 279-298.

Karolchik, D., R. Baertsch, M. Diekhans, T.S. Furey, A. Hinrichs, Y.T. Lu, K.M. Roskin, M. Schwartz, C.W. Sugnet, D.J. Thomas et al. 2003. The UCSC Genome Browser Database. Nucleic Acids Res 31: 51-54.

Kato, I., A.M. Dnistrian, M. Schwartz, P. Toniolo, K. Koenig, R.E. Shore, A. Akhmedkhanov, A. Zeleniuch-Jacquotte, and E. Riboli. 1999. Serum folate, homocysteine and colorectal cancer risk in women: a nested case-control study. Br J Cancer 79: 1917-1922.

Keku, T., R. Millikan, K. Worley, S. Winkel, A. Eaton, L. Biscocho, C. Martin, and R. Sandler. 2002. 5,10-Methylenetetrahydrofolate reductase codon 677 and 1298 polymorphisms and colon cancer in African Americans and whites. Cancer Epidemiol Biomarkers Prev 11: 1611-1621.

Kerr, M.K., M. Martin, and G.A. Churchill. 2000. Analysis of variance for gene expression microarray data. J Comput Biol 7: 819-837.

Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16: 111-120.

Kisliuk, R.L. 1999. Folate biochemistry in relation to antifolate selectivity. In Antifolate drugs in cancer therapy (ed. A.L. Jackman), pp. 13-36. Humana Press, Totowa.

Kitami, T., S. Ernest, L. Gallaugher, L. Friedman, W.N. Frankel, and J.H. Nadeau. 2004. Genetic and phenotypic analysis of seizure susceptibility in PL/J mice. Mamm Genome 15: 698-703.

Klerk, M., P. Verhoef, R. Clarke, H.J. Blom, F.J. Kok, and E.G. Schouten. 2002. MTHFR 677C-->T polymorphism and risk of coronary heart disease: a meta-analysis. Jama 288: 2023-2031.

Kletzien, R.F., P.K. Harris, and L.A. Foellmi. 1994. Glucose-6-phosphate dehydrogenase: a "housekeeping" enzyme subject to tissue-specific regulation by hormones, nutrients, and oxidant stress. Faseb J 8: 174-181.

Klootwijk, R., B. Franke, C.E. van der Zee, R.T. de Boer, W. Wilms, F.A. Hol, and E.C. Mariman. 2000. A deletion encompassing Zic3 in bent tail, a mouse model for X-linked neural tube defects. Hum Mol Genet 9: 1615-1622.

Kruman, I.I., C. Culmsee, S.L. Chan, Y. Kruman, Z. Guo, L. Penix, and M.P. Mattson. 2000. Homocysteine elicits a DNA damage response in neurons that promotes apoptosis and hypersensitivity to excitotoxicity. J Neurosci 20: 6920-6926.

Kruman, I.I., T.S. Kumaravel, A. Lohani, W.A. Pedersen, R.G. Cutler, Y. Kruman, N. Haughey, J. Lee, M. Evans, and M.P. Mattson. 2002. Folic acid deficiency and homocysteine impair DNA repair in hippocampal neurons and sensitize them to amyloid toxicity in experimental models of Alzheimer's disease. J Neurosci 22: 1752-1762.

Kumar, S., K. Tamura, I.B. Jakobsen, and M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17: 1244-1245.

Lander, E.S., L.M. Linton, B. Birren, C. Nusbaum, M.C. Zody, J. Baldwin, K. Devon, K. Dewar, M. Doyle, W. FitzHugh et al. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.

Laurence, K.M., N. James, M.H. Miller, G.B. Tennant, and H. Campbell. 1981. Double-blind randomised controlled trial of folate treatment before conception to prevent recurrence of neural-tube defects. Br Med J (Clin Res Ed) 282: 1509-1511.

Legare, M.E., F.S. Bartlett, 2nd, and W.N. Frankel. 2000. A major effect QTL determined by multiple genes in epileptic EL mice. Genome Res 10: 42-48.

Lenski, R.E., C. Ofria, T.C. Collier, and C. Adami. 1999. Genome complexity, robustness and genetic interactions in digital organisms. Nature 400: 661-664.

Li, W.-H. 1999. Molecular Evolution. Sinauer, Sunderland.

Li, S., C.M. Armstrong, N. Bertin, H. Ge, S. Milstein, M. Boxem, P.O. Vidalain, J.D. Han, A. Chesneau, T. Hao et al. 2004. A map of the interactome network of the metazoan C. elegans. Science 303: 540-543.

Liao, J.C., R. Boscolo, Y.L. Yang, L.M. Tran, C. Sabatti, and V.P. Roychowdhury. 2003. Network component analysis: reconstruction of regulatory signals in biological systems. Proc Natl Acad Sci U S A 100: 15522-15527.

Lichtenstein, P., N.V. Holm, P.K. Verkasalo, A. Iliadou, J. Kaprio, M. Koskenvuo, E. Pukkala, A. Skytthe, and K. Hemminki. 2000. Environmental and heritable factors in the causation of cancer--analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med 343: 78-85.

Lu, S.C., L. Alvarez, Z.Z. Huang, L. Chen, W. An, F.J. Corrales, M.A. Avila, G. Kanel, and J.M. Mato. 2001. Methionine adenosyltransferase 1A knockout mice are predisposed to liver injury and exhibit increased expression of genes involved in proliferation. Proc Natl Acad Sci U S A 98: 5560-5565.

Luscombe, N.M., M.M. Babu, H. Yu, M. Snyder, S.A. Teichmann, and M. Gerstein. 2004. Genomic analysis of regulatory network dynamics reveals large topological changes. Nature 431: 308-312.

Lutjohann, D., A. Papassotiropoulos, I. Bjorkhem, S. Locatelli, M. Bagli, R.D. Oehring, U. Schlegel, F. Jessen, M.L. Rao, K. von Bergmann et al. 2000. Plasma 24S-hydroxycholesterol (cerebrosterol) is increased in Alzheimer and vascular demented patients. J Lipid Res 41: 195-198.

Lynch, M. and J.S. Conery. 2000. The evolutionary fate and consequences of duplicate genes. Science 290: 1151-1155.

Ma, J., M.J. Stampfer, E. Giovannucci, C. Artigas, D.J. Hunter, C. Fuchs, W.C. Willett, J. Selhub, C.H. Hennekens, and R. Rozen. 1997. Methylenetetrahydrofolate reductase polymorphism, dietary interactions, and risk of colorectal cancer. Cancer Res 57: 1098-1102.

Makalowski, W. and M.S. Boguski. 1998. Evolutionary parameters of the transcribed mammalian genome: an analysis of 2,820 orthologous rodent and human sequences. Proc Natl Acad Sci U S A 95: 9407-9412.

Matsuura, T., I. Narama, K. Ozaki, M. Nishimura, T. Imagawa, H. Kitagawa, and M. Uehara. 1998. Developmental study on reduction and kinks of the tail in a new mutant knotty-tail mouse. Anat Embryol (Berl) 198: 91-99.

Mayor, S., S. Sabharanjak, and F.R. Maxfield. 1998. Cholesterol-dependent retention of GPI-anchored proteins in endosomes. Embo J 17: 4626-4638.

McCaddon, A., G. Davies, P. Hudson, S. Tandy, and H. Cattell. 1998. Total serum homocysteine in senile dementia of Alzheimer type. Int J Geriatr Psychiatry 13: 235-239.

McCann, S.J., S. Gillingwater, B.G. Keevil, D.P. Cooper, and M.R. Morris. 2003. Measurement of total homocysteine in plasma and blood spots using liquid chromatography-tandem mass spectrometry: comparison with the plasma Abbott IMx method. Ann Clin Biochem 40: 161-165.

McCully, K.S. and R.B. Wilson. 1975. Homocysteine theory of arteriosclerosis. Atherosclerosis 22: 215-227.

McIlroy, S.P., K.B. Dynan, J.T. Lawson, C.C. Patterson, and A.P. Passmore. 2002. Moderately elevated plasma homocysteine, methylenetetrahydrofolate reductase genotype, and risk for stroke, vascular dementia, and Alzheimer disease in Northern Ireland. Stroke 33: 2351-2356.

McLenachan, J.M., J.K. Williams, R.D. Fish, P. Ganz, and A.P. Selwyn. 1991. Loss of flow-mediated endothelium-dependent dilation occurs early in the development of atherosclerosis. Circulation 84: 1273-1278.

Michal, G. 1999. Biochemical pathways: an atlas of biochemistry and molecular biology. John Wiley and Sons, New York.

Michie, C.A. 1991. Neural tube defects in 18th century. Lancet 337: 504.

Miller, J.W., R. Green, D.M. Mungas, B.R. Reed, and W.J. Jagust. 2002. Homocysteine, vitamin B6, and vascular disease in AD patients. Neurology 58: 1471-1475.

Mitchell-Olds, T. and D. Pedersen. 1998. The molecular basis of quantitative genetic variation in central and secondary metabolism in Arabidopsis. Genetics 149: 739-747.

Moghadasian, M.H., B.M. McManus, L.B. Nguyen, S. Shefer, M. Nadji, D.V. Godin, T.J. Green, J. Hill, Y. Yang, C.H. Scudamore et al. 2001. Pathophysiology of apolipoprotein E deficiency in mice: relevance to apo E-related disorders in humans. Faseb J 15: 2623-2630.

Molloy, A.M., S. Daly, J.L. Mills, P.N. Kirke, A.S. Whitehead, D. Ramsbottom, M.R. Conley, D.G. Weir, and J.M. Scott. 1997. Thermolabile variant of 5,10-methylenetetrahydrofolate reductase associated with low red-cell folates: implications for folate intake recommendations. Lancet 349: 1591-1593.

Mootha, V.K., C.M. Lindgren, K.F. Eriksson, A. Subramanian, S. Sihag, J. Lehar, P. Puigserver, E. Carlsson, M. Ridderstrale, E. Laurila et al. 2003. PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet 34: 267-273.

Mudd, S.H., H.L. Levy, and J.P. Kraus. 2001. Disorders of transsulfuration. In The metabolic & molecular bases of inherited disease (eds. C.R. Scriver, A.L. Beaudet, W.S. Sly, and D. Valle), pp. 2007-2056. McGraw-Hill, New York.

Murphy, C.T., S.A. McCarroll, C.I. Bargmann, A. Fraser, R.S. Kamath, J. Ahringer, H. Li, and C. Kenyon. 2003. Genes that act downstream of DAF-16 to influence the lifespan of Caenorhabditis elegans. Nature 424: 277-283.

Nadeau, J.H., J.B. Singer, A. Matin, and E.S. Lander. 2000. Analysing complex genetic traits with chromosome substitution strains. Nat Genet 24: 221-225.

Namekata, K., Y. Enokido, I. Ishii, Y. Nagai, T. Harada, and H. Kimura. 2004. Abnormal lipid metabolism in cystathionine beta-synthase-deficient mice, an animal model for hyperhomocysteinemia. J Biol Chem 279: 52961-52969.

Nei, M. and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3: 418-426.

Neumann, P.E. and T.N. Seyfried. 1990. Mapping of two genes that influence susceptibility to audiogenic seizures in crosses of C57BL/6J and DBA/2J mice. Behav Genet 20: 307-323.

Noe, V., C. Chen, C. Alemany, M. Nicolas, I. Caragol, L.A. Chasin, and C.J. Ciudad. 1997. Cell-growth regulation of the hamster dihydrofolate reductase gene promoter by transcription factor Sp1. Eur J Biochem 249: 13-20.

Nygard, O., S.E. Vollset, H. Refsum, I. Stensvold, A. Tverdal, J.E. Nordrehaug, M. Ueland, and G. Kvale. 1995. Total plasma homocysteine and cardiovascular risk profile. The Hordaland Homocysteine Study. Jama 274: 1526-1533.

Odom, D.T., N. Zizlsperger, D.B. Gordon, G.W. Bell, N.J. Rinaldi, H.L. Murray, T.L. Volkert, J. Schreiber, P.A. Rolfe, D.K. Gifford et al. 2004. Control of pancreas and liver gene expression by HNF transcription factors. Science 303: 1378-1381.

Ohno, S. 1970. Evolution by gene duplication. Springer-Verlag, Heidelberg.

Ou, C.Y., R.E. Stevenson, V.K. Brown, C.E. Schwartz, W.P. Allen, M.J. Khoury, R. Rozen, G.P. Oakley, Jr., and M.J. Adams, Jr. 1996. 5,10 Methylenetetrahydrofolate reductase genetic polymorphism as a risk factor for neural tube defects. Am J Med Genet 63: 610-614.

Papapetrou, C., S.A. Lynch, J. Burn, and Y.H. Edwards. 1996. Methylenetetrahydrofolate reductase and neural tube defects. Lancet 348: 58.

Papin, J.A., J. Stelling, N.D. Price, S. Klamt, S. Schuster, and B.O. Palsson. 2004. Comparison of network-based pathway analysis methods. Trends Biotechnol 22: 400-405.

Peipert, J.F., D.S. Gifford, and L.A. Boardman. 1997. Research design and methods of quantitative synthesis of medical evidence. Obstet Gynecol 90: 473-478.

Pepe, G., O. Camacho Vanegas, B. Giusti, T. Brunelli, R. Marcucci, M. Attanasio, O. Rickards, G.F. De Stefano, D. Prisco, G.F. Gensini et al. 1998. Heterogeneity in world distribution of the thermolabile C677T mutation in 5,10-methylenetetrahydrofolate reductase. Am J Hum Genet 63: 917-920.

Pickett, F.B. and D.R. Meeks-Wagner. 1995. Seeing double: appreciating genetic redundancy. Plant Cell 7: 1347-1356.

Piedrahita, J.A., S.H. Zhang, J.R. Hagaman, P.M. Oliver, and N. Maeda. 1992. Generation of mice carrying a mutant apolipoprotein E gene inactivated by gene targeting in embryonic stem cells. Proc Natl Acad Sci U S A 89: 4471-4475.

Piedrahita, J.A., B. Oetama, G.D. Bennett, J. van Waes, B.A. Kamen, J. Richardson, S.W. Lacey, R.G. Anderson, and R.H. Finnell. 1999. Mice lacking the folic acid-binding protein Folbp1 are defective in early embryonic development. Nat Genet 23: 228-232.

Pogribny, I.P., L. Muskhelishvili, B.J. Miller, and S.J. James. 1997. Presence and consequence of uracil in preneoplastic DNA from folate/methyl-deficient rats. Carcinogenesis 18: 2071-2076.

Pogribny, I., P. Yi, and S.J. James. 1999. A sensitive new method for rapid detection of abnormal methylation patterns in global DNA and within CpG islands. Biochem Biophys Res Commun 262: 624-628.

Raghunathan, K., J.C. Schmitz, and D.G. Priest. 1997. Impact of schedule on leucovorin potentiation of fluorouracil antitumor activity in dietary folic acid deplete mice. Biochem Pharmacol 53: 1197-1202.

Rebhan, M., V. Chalifa-Caspi, J. Prilusky, and D. Lancet. 1998. GeneCards: a novel functional genomics compendium with automated data mining and query reformulation support. Bioinformatics 14: 656-664.

Reder, C. 1988. Metabolic control theory: a structural approach. J Theor Biol 135: 175-201.

Refsum, H., P.M. Ueland, O. Nygard, and S.E. Vollset. 1998. Homocysteine and cardiovascular disease. Annu Rev Med 49: 31-62.

Rimm, E.B., W.C. Willett, F.B. Hu, L. Sampson, G.A. Colditz, J.E. Manson, C. Hennekens, and M.J. Stampfer. 1998. Folate and vitamin B6 from diet and supplements in relation to risk of coronary heart disease among women. Jama 279: 359-364.

Risch, N. 2001. The genetic epidemiology of cancer: interpreting family and twin studies and their implications for molecular genetic approaches. Cancer Epidemiol Biomarkers Prev 10: 733-741.

Rise, M.L., W.N. Frankel, J.M. Coffin, and T.N. Seyfried. 1991. Genes for epilepsy mapped in the mouse. Science 253: 669-673.

Rivera, M.C., R. Jain, J.E. Moore, and J.A. Lake. 1998. Genomic evidence for two functionally distinct gene classes. Proc Natl Acad Sci U S A 95: 6239-6244.

Robillon, J.F., B. Canivet, M. Candito, J.L. Sadoul, D. Jullien, P. Morand, P. Chambon, and P. Freychet. 1994. Type 1 diabetes mellitus and homocyst(e)ine. Diabete Metab 20: 494-496.

Rosenblatt, D.S. and W.A. Fenton. 2001. Inherited disorders of folate and cobalamin transport and metabolism. In The metabolic & molecular bases of inherited disease (eds. C.R. Scriver, A.L. Beaudet, W.S. Sly, and D. Valle), pp. 3897-3933. McGraw-Hill, New York.

Rosenquist, T.H., A.M. Schneider, and D.T. Monaghan. 1999. N-methyl-D-aspartate receptor agonists modulate homocysteine-induced developmental abnormalities. Faseb J 13: 1523-1531.

Rosenquist, T.H. and R.H. Finnell. 2001. Genes, folate and homocysteine in embryonic development. Proc Nutr Soc 60: 53-61.

Saeed, A.I., V. Sharov, J. White, J. Li, W. Liang, N. Bhagabati, J. Braisted, M. Klapa, T. Currier, M. Thiagarajan et al. 2003. TM4: a free, open-source system for microarray data management and analysis. Biotechniques 34: 374-378.

Schachinger, V., M.B. Britten, and A.M. Zeiher. 2000. Prognostic impact of coronary vasodilator dysfunction on adverse long-term outcome of coronary heart disease. Circulation 101: 1899-1906.

Schadt, E.E., J. Lamb, X. Yang, J. Zhu, S. Edwards, D. Guhathakurta, S.K. Sieberts, S. Monks, M. Reitman, C. Zhang et al. 2005. An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet 37: 710-717.

Schilling, C.H., D. Letscher, and B.O. Palsson. 2000. Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective. J Theor Biol 203: 229-248.

Schuster, S., T. Dandekar, and D.A. Fell. 1999. Detection of elementary flux modes in biochemical networks: a promising tool for pathway analysis and metabolic engineering. Trends Biotechnol 17: 53-60.

Seshadri, S., A. Beiser, J. Selhub, P.F. Jacques, I.H. Rosenberg, R.B. D'Agostino, P.W. Wilson, and P.A. Wolf. 2002. Plasma homocysteine as a risk factor for dementia and Alzheimer's disease. N Engl J Med 346: 476-483.

Seyfried, T.N., R.K. Yu, and G.H. Glaser. 1980. Genetic analysis of audiogenic seizure susceptibility in C57BL/6J X DBA/2J recombinant inbred strains of mice. Genetics 94: 701-718.

Sidow, A. 1996. Gen(om)e duplications in the evolution of early vertebrates. Curr Opin Genet Dev 6: 715-722.

Singer, J.B., A.E. Hill, L.C. Burrage, K.R. Olszens, J. Song, M. Justice, W.E. O'Brien, D.V. Conti, J.S. Witte, E.S. Lander et al. 2004. Genetic dissection of complex traits with chromosome substitution strains of mice. Science 304: 445-448.

Slansky, J.E., Y. Li, W.G. Kaelin, and P.J. Farnham. 1993. A protein synthesis-dependent increase in E2F1 mRNA correlates with growth regulation of the dihydrofolate reductase promoter. Mol Cell Biol 13: 1610-1618.

Slansky, J.E. and P.J. Farnham. 1996. Transcriptional regulation of the dihydrofolate reductase gene. Bioessays 18: 55-62.

Smart, E.J., C. Mineo, and R.G. Anderson. 1996. Clustered folate receptors deliver 5-methyltetrahydrofolate to cytoplasm of MA104 cells. J Cell Biol 134: 1169-1177.

Smithells, R.W., S. Sheppard, and C.J. Schorah. 1976. Vitamin deficiencies and neural tube defects. Arch Dis Child 51: 944-950.

Smithells, R.W., S. Sheppard, C.J. Schorah, M.J. Seller, N.C. Nevin, R. Harris, A.P. Read, and D.W. Fielding. 1980. Possible prevention of neural-tube defects by periconceptional vitamin supplementation. Lancet 1: 339-340.

Song, J., A. Medline, J.B. Mason, S. Gallinger, and Y.I. Kim. 2000. Effects of dietary folate on intestinal tumorigenesis in the apcMin mouse. Cancer Res 60: 5434-5440.

Southam, E., S.C. Stratton, R.S. Sargent, K.T. Brackenborough, C. Duffy, R.M. Hagan, G.D. Pratt, S.A. Jones, and P.F. Morgan. 2002. Broad spectrum anticonvulsant activity of BW534U87: possible role of an adenosine-dependent mechanism. Pharmacol Biochem Behav 74: 111-118.

Spiegelstein, O., R.M. Cabrera, D. Bozinov, B. Wlodarczyk, and R.H. Finnell. 2004. Folate-regulated changes in gene expression in the anterior neural tube of folate binding protein-1 (Folbp1)-deficient murine embryos. Neurochem Res 29: 1105-1112.

Starkebaum, G. and J.M. Harlan. 1986. Endothelial cell injury due to copper-catalyzed hydrogen peroxide generation from homocysteine. J Clin Invest 77: 1370-1376.

Stegmann, K., A. Ziegler, E.T. Ngo, N. Kohlschmidt, B. Schroter, A. Ermert, and M.C. Koch. 1999. Linkage disequilibrium of MTHFR genotypes 677C/T-1298A/C in the German population and association studies in probands with neural tube defects (NTD). Am J Med Genet 87: 23-29.

Stroes, E.S., E.E. van Faassen, M. Yo, P. Martasek, P. Boer, R. Govers, and T.J. Rabelink. 2000. Folic acid reverts dysfunction of endothelial nitric oxide synthase. Circ Res 86: 1129-1134.

Stuart, J.M., E. Segal, D. Koller, and S.K. Kim. 2003. A gene-coexpression network for global discovery of conserved genetic modules. Science 302: 249-255.

Su, L.J. and L. Arab. 2001. Nutritional status of folate and colon cancer risk: evidence from NHANES I epidemiologic follow-up study. Ann Epidemiol 11: 65-72.

Su, A.I., M.P. Cooke, K.A. Ching, Y. Hakak, J.R. Walker, T. Wiltshire, A.P. Orth, R.G. Vega, L.M. Sapinoso, A. Moqrich et al. 2002. Large-scale analysis of the human and mouse transcriptomes. Proc Natl Acad Sci U S A 99: 4465-4470.

Su, A.I., T. Wiltshire, S. Batalov, H. Lapp, K.A. Ching, D. Block, J. Zhang, R. Soden, M. Hayakawa, G. Kreiman et al. 2004. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci U S A 101: 6062-6067.

Subramanian, A., P. Tamayo, V.K. Mootha, S. Mukherjee, B.L. Ebert, M.A. Gillette, A. Paulovich, S.L. Pomeroy, T.R. Golub, E.S. Lander et al. 2005. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A In press.

Surtees, R. 1998. Demyelination and inborn errors of the single carbon transfer pathway. Eur J Pediatr 157 Suppl 2: S118-121.

Suwaidi, J.A., S. Hamasaki, S.T. Higano, R.A. Nishimura, D.R. Holmes, Jr., and A. Lerman. 2000. Long-term follow-up of patients with mild coronary disease and endothelial dysfunction. Circulation 101: 948-954.

Swanson, D.A., M.L. Liu, P.J. Baker, L. Garrett, M. Stitzel, J. Wu, M. Harris, R. Banerjee, B. Shane, and L.C. Brody. 2001. Targeted disruption of the methionine synthase gene in mice. Mol Cell Biol 21: 1058-1065.

Tanaka, T.S., S.A. Jaradat, M.K. Lim, G.J. Kargul, X. Wang, M.J. Grahovac, S. Pantano, Y. Sano, Y. Piao, R. Nagaraja et al. 2000. Genome-wide expression profiling of mid-gestation placenta and embryo using a 15,000 mouse developmental cDNA microarray. Proc Natl Acad Sci U S A 97: 9127-9132.

Thomas, J.H. 1993. Thinking about genetic redundancy. Trends Genet 9: 395-399.

Thompson, J.D., D.G. Higgins, and T.J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673-4680.

Toffoli, G., R. Gafa, A. Russo, G. Lanza, R. Dolcetti, F. Sartor, M. Libra, A. Viel, and M. Boiocchi. 2003. Methylenetetrahydrofolate reductase 677 C-->T polymorphism and risk of proximal colon cancer in north Italy. Clin Cancer Res 9: 743-748.

Tong, A.H., G. Lesage, G.D. Bader, H. Ding, H. Xu, X. Xin, J. Young, G.F. Berriz, R.L. Brost, M. Chang et al. 2004. Global mapping of the yeast genetic interaction network. Science 303: 808-813.

Tsai, J., R. Sultana, Y. Lee, G. Pertea, S. Karamycheva, V. Antonescu, J. Cho, B. Parvizi, F. Cheung, and J. Quackenbush. 2001. RESOURCERER: a database for annotating and linking microarray resources within and across species. Genome Biol 2: software0002.

Tusher, V.G., R. Tibshirani, and G. Chu. 2001. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 98: 5116-5121.

Ueland, P.M., H. Refsum, S.P. Stabler, M.R. Malinow, A. Andersson, and R.H. Allen. 1993. Total homocysteine in plasma or serum: methods and clinical applications. Clin Chem 39: 1764-1779.

Ueland, P.M. 1995. Homocysteine species as components of plasma thiol status. Clin Chem 41: 340-342.

Ulrich, C.M., E. Kampman, J. Bigler, S.M. Schwartz, C. Chen, R. Bostick, L. Fosdick, S.A. Beresford, Y. Yasui, and J.D. Potter. 2000. Lack of association between the C677T MTHFR polymorphism and colorectal hyperplastic polyps. Cancer Epidemiol Biomarkers Prev 9: 427-433.

van der Put, N.M., R.P. Steegers-Theunissen, P. Frosst, F.J. Trijbels, T.K. Eskes, L.P. van den Heuvel, E.C. Mariman, M. den Heyer, R. Rozen, and H.J. Blom. 1995. Mutated methylenetetrahydrofolate reductase as a risk factor for spina bifida. Lancet 346: 1070-1071.

van der Put, N.M., F. Gabreels, E.M. Stevens, J.A. Smeitink, F.J. Trijbels, T.K. Eskes, L.P. van den Heuvel, and H.J. Blom. 1998. A second common mutation in the methylenetetrahydrofolate reductase gene: an additional risk factor for neural-tube defects? Am J Hum Genet 62: 1044-1051.

Venter, J.C., M.D. Adams, E.W. Myers, P.W. Li, R.J. Mural, G.G. Sutton, H.O. Smith, M. Yandell, C.A. Evans, R.A. Holt et al. 2001. The sequence of the human genome. Science 291: 1304-1351.

Vita, J.A., C.B. Treasure, E.G. Nabel, J.M. McLenachan, R.D. Fish, A.C. Yeung, V.I. Vekshtein, A.P. Selwyn, and P. Ganz. 1990. Coronary vasomotor response to acetylcholine relates to risk factors for coronary artery disease. Circulation 81: 491-497.

Wagner, A. 2000. Robustness against mutations in genetic networks of yeast. Nat Genet 24: 355-361.

Wald, D.S., M. Law, and J.K. Morris. 2002. Homocysteine and cardiovascular disease: evidence on causality from a meta-analysis. Bmj 325: 1202.

Walsh, J.B. 1995. How often do duplicated genes evolve new functions? Genetics 139: 421-428.

Walsh, C.P., J.R. Chaillet, and T.H. Bestor. 1998. Transcription of IAP endogenous retroviruses is constrained by cytosine methylation. Nat Genet 20: 116-117.

Wang, H.X., A. Wahlin, H. Basun, J. Fastbom, B. Winblad, and L. Fratiglioni. 2001. Vitamin B(12) and folate in relation to the development of Alzheimer's disease. Neurology 56: 1188-1194.

Wang, H., X. Jiang, F. Yang, J.W. Gaubatz, L. Ma, M.J. Magera, X. Yang, P.B. Berger, W. Durante, H.J. Pownall et al. 2003. Hyperhomocysteinemia accelerates atherosclerosis in cystathionine beta-synthase and apolipoprotein E double knock-out mice with and without dietary perturbation. Blood 101: 3901-3907.

Watanabe, M., J. Osada, Y. Aratani, K. Kluckman, R. Reddick, M.R. Malinow, and N. Maeda. 1995. Mice deficient in cystathionine beta-synthase: animal models for mild and severe homocyst(e)inemia. Proc Natl Acad Sci U S A 92: 1585-1589.

Waterland, R.A. and R.L. Jirtle. 2003. Transposable elements: targets for early nutritional effects on epigenetic gene regulation. Mol Cell Biol 23: 5293-5300.

Waterston, R.H., K. Lindblad-Toh, E. Birney, J. Rogers, J.F. Abril, P. Agarwal, R. Agarwala, R. Ainscough, M. Alexandersson, P. An et al. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562.

Weiss, N., S. Heydrick, Y.Y. Zhang, C. Bierl, A. Cap, and J. Loscalzo. 2002. Cellular redox state and endothelial dysfunction in mildly hyperhomocysteinemic cystathionine beta-synthase-deficient mice. Arterioscler Thromb Vasc Biol 22: 34-41.

Werstuck, G.H., S.R. Lentz, S. Dayal, G.S. Hossain, S.K. Sood, Y.Y. Shi, J. Zhou, N. Maeda, S.K. Krisans, M.R. Malinow et al. 2001. Homocysteine-induced endoplasmic reticulum stress causes dysregulation of the cholesterol and triglyceride biosynthetic pathways. J Clin Invest 107: 1263-1273.

Wiback, S.J. and B.O. Palsson. 2002. Extreme pathway analysis of human red blood cell metabolism. Biophys J 83: 808-818.

Wilcken, D.E. and B. Wilcken. 1976. The pathogenesis of coronary artery disease. A possible role for methionine metabolism. J Clin Invest 57: 1079-1082.

Willett, W. 1989. The search for the causes of breast and colon cancer. Nature 338: 389-394.

Wilson, A.C., S.S. Carlson, and T.J. White. 1977. Biochemical evolution. Annu Rev Biochem 46: 573-639.

Wolff, G.L., R.L. Kodell, S.R. Moore, and C.A. Cooney. 1998. Maternal epigenetics and methyl supplements affect agouti gene expression in Avy/a mice. FASEB J 12: 949-957.

Woo, K.S., P. Chook, Y.I. Lolin, J.E. Sanderson, C. Metreweli, and D.S. Celermajer. 1999. Folic acid improves arterial endothelial function in adults with hyperhomocystinemia. J Am Coll Cardiol 34: 2002-2006.

Wouters, M.G., M.T. Moorrees, M.J. van der Mooren, H.J. Blom, G.H. Boers, L.A. Schellekens, C.M. Thomas, and T.K. Eskes. 1995. Plasma homocysteine and menopausal status. Eur J Clin Invest 25: 801-805.

Wright, F. 1990. The 'effective number of codons' used in a gene. Gene 87: 23-29.

Wu, D. and W.M. Pardridge. 1999. Blood-brain barrier transport of reduced folic acid. Pharm Res 16: 415-419.

Yang, I.V., E. Chen, J.P. Hasseman, W. Liang, B.C. Frank, S. Wang, V. Sharov, A.I. Saeed, J. White, J. Li et al. 2002. Within the fold: assessing differential expression measures and reproducibility in microarray assays. Genome Biol 3: research0062.

Zeisel, S.H. and J.K. Blusztajn. 1994. Choline and human nutrition. Annu Rev Nutr 14: 269-296.

Zhang, S., D.J. Hunter, S.E. Hankinson, E.L. Giovannucci, B.A. Rosner, G.A. Colditz, F.E. Speizer, and W.C. Willett. 1999. A prospective study of folate intake and the risk of breast cancer. JAMA 281: 1632-1637.

Zhang, L. and W.H. Li. 2004. Mammalian housekeeping genes evolve more slowly than tissue-specific genes. Mol Biol Evol 21: 236-239.

Zhao, Q., R.R. Behringer, and B. de Crombrugghe. 1996. Prenatal folic acid treatment suppresses acrania and meroanencephaly in mice mutant for the Cart1 homeobox gene. Nat Genet 13: 275-283.
