PDF hosted at the Radboud Repository of the Radboud University Nijmegen

The following full text is a publisher's version.

For additional information about this publication click this link. http://hdl.handle.net/2066/106938

Text mining and information extraction for the life sciences: an enhanced science approach

Doctoral thesis

to obtain the degree of doctor from Radboud University Nijmegen, on the authority of the Rector Magnificus, prof. mr. S.C.J.J. Kortmann, according to the decision of the Council of Deans, to be defended in public on Thursday 16 May 2013 at 10.30 hours precisely

by

Wilco Wilhelmus Maria Fleuren

born on 8 April 1978 in Nijmegen

Supervisor (promotor): Prof. dr. J. de Vlieg

Co-supervisor (copromotor): Dr. W.B.L. Alkema (NIZO food research BV)

Manuscript committee: Prof. dr. S.S. Wijmenga, Prof. dr. H.G. Brunner, Prof. dr. A.J. van Gool

This work was performed within the NBIC BioRange project “A Systems Bioinformatics Approach For Evaluating And Translating Drug-Target Effects In Disease Related Pathways” (BR4.2) and continued within the Netherlands eScience project “Creation of food specific ontologies for food focused text mining and knowledge management studies” (027.011.306).

Title: Text mining and information extraction for the life sciences: an enhanced science approach.

Copyright © 2013 Wilco Fleuren, Nijmegen, The Netherlands

ISBN/EAN: 978-90-9027438-6

Cover design: Walter van Rooij, Buro Brandstof

Printed by: CPI Wöhrmann Print Service, Zutphen

"Vorsprung durch Technik", Audi AG

TABLE OF CONTENTS

Chapter 1 General introduction
Chapter 2 CoPub update: CoPub 5.0 a text mining system to answer biological questions
Chapter 3 Identification of new biomarker candidates for glucocorticoid induced insulin resistance using literature mining
Chapter 4 Prednisolone-induced changes in gene-expression profiles in healthy volunteers
Chapter 5 Prednisolone induces the Wnt signaling pathway in 3T3-L1 adipocytes
Chapter 6 Prednisolone-induced differential gene expression in mouse liver carrying wild type or a dimerization-defective glucocorticoid receptor
Chapter 7 Identification of T lymphocyte signal transduction pathways
Chapter 8 Functional annotation of microorganisms using text mining
Chapter 9 General discussion

Summary

Samenvatting (summary in Dutch)

Dankwoord (acknowledgements)

Curriculum vitae

Bibliography

CHAPTER 1 General introduction


General introduction

Products for human consumption and intake need to be safe and should not be harmful. Hence the pharmaceutical and food industries spend millions of dollars on research and development (R&D) to develop better and safer products. The safety of drugs and food products is tested in experiments before they are marketed. These experiments result in enormous amounts of biological data that need to be analyzed in order to draw conclusions, to initiate follow-up studies, and to generate new hypotheses. This so-called data-driven research is becoming more and more important in these industries and requires specialists who are able to extract useful information from these biological data. Results from experiments are published in scientific papers and reports, the abstracts of which are often deposited in the publicly available Medline database. Medline contains over 22 million journal citations and abstracts from the biomedical literature and is growing exponentially. Extracting useful knowledge from these abstracts in order to annotate experimental results requires methods that can perform this extraction automatically.

Text mining and information extraction

Text mining (TM) and automatic information extraction (IE) have evolved over the years into a sophisticated and specialized discipline in the biomedical sciences. The process of TM and IE involves high-quality extraction of information from the literature and subsequently linking this information to experimental data. Currently, the most widely used approaches to retrieve knowledge from text are based on co-occurrence and on natural language processing (NLP) [1].

Co-occurrence-based methods

Co-occurrence-based methods use the simple concept that two keywords that co-occur in the same body of text are often functionally related. For example, the co-occurrence of retinol-binding protein 4 (RBP4) and insulin resistance suggests a functional relationship between the gene and the disease [2, 3]. However, many keywords also co-occur in text without being functionally related. These methods therefore often use a statistical score, based on how frequently each keyword occurs alone and how frequently the two occur together, to add a degree of significance to the relationship. Co-occurrence-based methods do not provide information about the type of relation between keywords, but for that same reason they are able to detect any type of relationship.
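The idea of scoring a co-occurrence against the individual keyword frequencies can be illustrated with a small sketch. It assumes plain substring matching and uses pointwise mutual information as the score; both are illustrative choices rather than the method of any particular tool, and the four abstracts are invented toy data.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_scores(abstracts, keywords):
    """Score keyword pairs by pointwise mutual information (PMI) over abstracts."""
    n_total = len(abstracts)
    single = Counter()  # number of abstracts mentioning each keyword
    pair = Counter()    # number of abstracts mentioning both keywords of a pair
    for text in abstracts:
        lowered = text.lower()
        present = sorted(k for k in keywords if k.lower() in lowered)
        single.update(present)
        pair.update(combinations(present, 2))

    scores = {}
    for (a, b), n_ab in pair.items():
        p_ab = n_ab / n_total
        p_a = single[a] / n_total
        p_b = single[b] / n_total
        # PMI > 0: the pair co-occurs more often than expected by chance.
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Invented toy abstracts, for illustration only.
abstracts = [
    "RBP4 levels are elevated in insulin resistance.",
    "Insulin resistance is associated with obesity.",
    "RBP4 contributes to insulin resistance in mice.",
    "Obesity is a growing health problem.",
]
scores = cooccurrence_scores(abstracts, ["RBP4", "insulin resistance", "obesity"])
for pair_key, value in sorted(scores.items()):
    print(pair_key, round(value, 3))
```

In this toy corpus the pair RBP4 and insulin resistance receives a positive score, whereas insulin resistance and obesity co-occur less often than their individual frequencies would predict. Note that the score says nothing about the kind of relation involved.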


NLP-based methods

NLP-based methods use prior knowledge to retrieve information from text. This prior knowledge can be, for instance, how language is structured, or specific knowledge of how biological information is mentioned in the literature. NLP is defined as "the processing of natural language, i.e. human languages by computers". In contrast to co-occurrence-based systems, NLP systems are phrase based and include information about the relationship between two keywords, e.g. gene A inhibits gene B, or gene C is involved in disease Y. NLP-based systems are very suitable for searching for specific relationships. However, these systems are limited to predefined relationships and are often computationally intensive.
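A heavily simplified sketch of phrase-based extraction is given below. It assumes a single hand-written regular expression with three predefined relation phrases, which is far cruder than a real NLP pipeline; the pattern and the function name `extract_relations` are hypothetical and serve only to show why such systems return typed relations but miss anything outside their predefined set.

```python
import re

# Hypothetical pattern: "<gene X> <relation> <gene|disease Y>" with a fixed
# list of relation phrases. Real NLP systems use parsing, not one regex.
RELATION_PATTERN = re.compile(
    r"\b(?P<subject>gene \w+)\s+"
    r"(?P<relation>inhibits|activates|is involved in)\s+"
    r"(?P<object>(?:gene|disease) \w+)",
    re.IGNORECASE,
)


def extract_relations(sentence):
    """Return (subject, relation, object) triples found in a sentence."""
    return [
        (m.group("subject"), m.group("relation").lower(), m.group("object"))
        for m in RELATION_PATTERN.finditer(sentence)
    ]


relations = extract_relations(
    "Gene A inhibits gene B, whereas gene C is involved in disease Y."
)
# A sentence with no predefined relation phrase yields nothing, even though
# the two genes co-occur in it.
missed = extract_relations("Gene A and gene B were measured.")
print(relations)
print(missed)
```

The second call returns an empty list: the two genes co-occur, but no predefined relation phrase links them, which is exactly the case a co-occurrence-based method would still pick up.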

Drug discovery

The first part of this thesis describes the application of bioinformatics, and especially IE and TM, to study drug (side) effects. The development of a new drug is very complex, time-consuming and expensive. On average, the development of a drug takes about twelve years and costs on the order of 1 billion US dollars. In the United States alone, only about one in six drugs that enter the clinical testing pipeline eventually obtains approval to be marketed [4]. Despite the increasing investment in research and development (R&D), the attrition rate of drug candidates is increasing. Early detection of failing drug candidates is therefore essential to reduce drug development costs. Traditionally, the drug development process is divided into distinct phases, starting with the target discovery phase and eventually leading to the successful market approval of a drug candidate (Figure 1).

Target discovery

In the first phase, possible new therapeutic targets are identified. Information about the function and the role of these targets in the development of the disease is gathered and subsequently used to get an overview of the pathways and genes/proteins that are involved. This step is very important since it gives insight into the biological context of the target and its relation to the disease. Typically in this phase, information about the target is retrieved from earlier studies and from the scientific literature. Promising targets are validated in both in vitro and in vivo experiments, in which the therapeutic effect and the relation of the target to the disease are further investigated.

Figure 1 A schematic representation of the drug development pipeline. The process ranges from the identification of potential drug targets in the Target discovery stage to the successful introduction of the drug on the market in the Marketing & sales stage.

Lead discovery and lead optimization

Once possible targets have been detected and verified, large compound libraries are screened to identify so-called lead compounds, which are able to modulate the biological activity of the targets. Subsequently, in the lead optimization phase, these compounds are chemically modified to improve their potency, selectivity and efficacy. The efficacy of compounds is determined in genome-wide expression studies using microarray technology, in which changes in gene expression are measured as a result of compound treatment. These expression profiles can be compared with expression profiles from other compounds and can also be used to gain insight into the mechanisms and pathways affected. In the target discovery, lead discovery and lead optimization phases it is important to gather as much information as possible about the target, its biological context and the compound in order to predict (side) effects. TM and IE are important techniques to retrieve this information from the scientific literature.

Pre-clinical development

Compounds that show a promising effect on the biological target move to the pre-clinical development phase, where they are further tested for efficacy and safety. Compounds are tested for toxicity in animal models. Furthermore, in pharmacokinetics experiments the compounds are tested for their absorption, distribution, metabolism, and excretion (ADME) properties in the body. The pre-clinical development phase is important because it determines the ultimate safety profile of the compounds. Based on this safety profile, decisions are made to move compounds to the next phase, in which they are tested in humans. Approximately two thirds of the compounds fail in the pre-clinical development phase because of unfavorable toxicity and/or pharmacokinetic properties.

Clinical trials

In the last phase, compounds, now called drug candidates, are tested in humans in so-called clinical trials. Clinical trials normally take place in four phases. In phase I, the toxicity and possible side effects of the drug candidate are tested in a small group of healthy volunteers (20-80), together with an assessment of the correct dosage. These tests continue in phase II, but in larger groups (100-300) of both healthy volunteers and patients, and the effectiveness of the drug candidate is tested further. In phase III, the drug candidate is administered to large groups of people (1000-3000) to continue to investigate its effectiveness and to compare its effects with commonly used treatments.

The results from these phases are documented and reviewed by agencies such as the European Medicines Agency (EMEA) and the US Food and Drug Administration (FDA) in order to obtain market approval for the drug candidate. Phase IV trials are post-market studies in which long-term treatment is monitored to assess long-term effects of the drug. Clinical trials are time-consuming, taking several years, and account for 50-70% of the total drug development costs.

Studying drug effects on different levels

With rising drug development costs and increasing pressure from governments to introduce safer and more valuable drugs, the pharmaceutical industry faces the challenge of reducing these costs in order to keep delivering affordable drugs. Early detection of failing drug candidates is therefore essential. To predict the effects of drug candidates, it is important to understand which drug targets they are able to modulate and what the biological context of these targets is. Many drug targets influence multiple phenotypic traits, genes, proteins and pathways. The techniques used in this thesis to study drug effects on genes, pathways and proteins are described in more detail below and in Figure 2.

Figure 2 Techniques used in the studies described in this thesis to study drug effects.

Genome-wide expression profiling

Genome-wide expression profiling makes it possible to study the expression of thousands of genes simultaneously. In this way, drug effects can be studied by looking at changes in gene expression as a result of treatment with a drug or compound. Gene expression is measured using microarray technology. A microarray consists of spots on which pieces of DNA, called probes, are attached to the surface of the array. RNA extracted from the biological sample is labelled with a fluorescent tag and hybridized to the probes on the microarray. Subsequently, the hybridized chip is scanned to measure fluorescence intensities. Typically, drug effects on gene expression are studied by comparing the gene expression of a drug-treated sample with that of a control sample.
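The comparison of a treated sample with a control sample is commonly summarized per gene as a log2 fold change. The sketch below assumes already normalized intensities and a simple twofold cutoff; real analyses add normalization, replicates and statistical testing, none of which are specified here. The gene names and intensity values are invented.

```python
import math


def log2_fold_changes(treated, control):
    """Return log2(treated / control) for genes measured in both samples."""
    return {
        gene: math.log2(treated[gene] / control[gene])
        for gene in treated
        # Skip genes missing from the control or with non-positive intensity.
        if gene in control and treated[gene] > 0 and control[gene] > 0
    }


def differentially_expressed(fold_changes, threshold=1.0):
    """Genes whose absolute log2 fold change meets the threshold (1.0 = twofold)."""
    return sorted(g for g, fc in fold_changes.items() if abs(fc) >= threshold)


# Invented, already normalized fluorescence intensities.
treated = {"GENE_A": 800.0, "GENE_B": 100.0, "GENE_C": 210.0}
control = {"GENE_A": 200.0, "GENE_B": 400.0, "GENE_C": 200.0}

fold_changes = log2_fold_changes(treated, control)
hits = differentially_expressed(fold_changes)
print(hits)
```

Using the logarithm makes up- and down-regulation symmetric: a fourfold increase and a fourfold decrease give +2 and -2 respectively.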

Protein profiling using antibody arrays

Differentially expressed proteins can be measured using antibody arrays, which makes it possible to study drug effects on proteins. Instead of DNA probes, protein arrays consist of antibodies that are attached to the microarray surface. These antibodies detect target proteins from the biological sample by means of specific labelled secondary antibodies. Currently, protein arrays can hold only a few hundred antibodies, in contrast to DNA microarrays, which cover thousands of genes. Nevertheless, with the introduction of antibody arrays the gap between the effects seen at the gene expression level and the effects seen at the protein level can be narrowed.

Text mining to interpret experiments

Co-occurrence-based TM methods have been successfully used to analyze microarray data, identifying gene-disease and gene-drug relationships [5-8]. We developed CoPub, a text mining tool that is able to search Medline abstracts for occurrences of different types of keywords from a wide variety of thesauri, e.g. gene names, disease names, drug names and pathway names. CoPub uses the principle that a co-occurrence of two keywords in the same abstract is an indication of a functional relation between both keywords. The significance of a relation is described by the R-scaled score, which indicates the strength of a co-occurrence of two keywords given their individual frequencies of occurrence [9]. CoPub was originally developed for the interpretation of microarray data [5, 9] and its technology has been expanded to other areas, such as the identification of novel hidden relations [10]. In chapter 2 of this thesis we describe CoPub 5.0, which combines all CoPub technologies with extensive search and filter algorithms that can be used to answer a variety of biological questions.
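As a rough illustration of how a co-occurrence strength can be rescaled to a fixed range, the sketch below assumes a mutual-information-style ratio S = P(AB) / (P(A) P(B)) that is log-transformed and mapped linearly onto 1-100 between a minimum and maximum value. This is an assumption made for illustration; the exact definition of the R-scaled score should be taken from [9]. The function names and all counts are hypothetical.

```python
import math


def mutual_information_ratio(n_ab, n_a, n_b, n_total):
    """S = P(AB) / (P(A) * P(B)), estimated from abstract counts."""
    p_ab = n_ab / n_total
    p_a = n_a / n_total
    p_b = n_b / n_total
    return p_ab / (p_a * p_b)


def scale_to_1_100(s, s_min, s_max):
    """Map log10(S) linearly onto the range 1..100 (assumed scaling)."""
    low, high = math.log10(s_min), math.log10(s_max)
    return 1 + 99 * (math.log10(s) - low) / (high - low)


# Hypothetical counts: 5 shared abstracts out of 1000, with keywords A and B
# occurring in 10 and 20 abstracts respectively.
s = mutual_information_ratio(n_ab=5, n_a=10, n_b=20, n_total=1000)
# Hypothetical minimum and maximum S values over all keyword pairs.
r = scale_to_1_100(s, s_min=0.1, s_max=1000.0)
print(round(s, 2), round(r, 1))
```

The rescaling does not change the ranking of keyword pairs; it only makes scores comparable on a fixed scale, so that a single cutoff can be applied across queries.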

Gene literature networks

Studying genes that are related to a disease or potential drug target makes it possible to better understand which mechanisms are involved. For this, it is important to study genes in their functional context. Typically, such a functional context is the biological process in which the gene is involved. An example of such a biological process is glycolysis, in which glucose is converted into pyruvate in ten steps. The functional context of phosphofructokinase, an enzyme in glycolysis, can be defined by the other enzymes in this process. Although the functional context of phosphofructokinase in this example is straightforward and well defined, in most cases the functional context of a gene is not that clear. Another way to define a functional context for a gene is to search for co-occurrences of this gene with other genes in Medline abstracts. In this way, a literature network of co-occurring genes can be created. This network can be used to study drug effects by mapping drug-induced gene expression data, or drug-related information from the literature, onto the network. This approach makes it possible to study the effects of a drug not only on a single gene but also on the genes to which it is functionally related in the network. In chapter 3 of this thesis, this approach is used to study the effects of dexamethasone, a synthetic glucocorticoid used as an anti-inflammatory agent for the treatment of inflammatory diseases, on a network of insulin resistance (IR) related genes.
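A literature network of this kind can be sketched as an undirected graph in which two genes are linked when they are mentioned in the same abstract. The example below assumes that gene mentions have already been extracted per abstract; the gene sets, the fold-change values and the function names are hypothetical toy data, not results from this thesis.

```python
from collections import defaultdict
from itertools import combinations


def build_gene_network(abstract_genes, min_abstracts=1):
    """Link two genes when they co-occur in at least `min_abstracts` abstracts."""
    counts = defaultdict(int)
    for genes in abstract_genes:
        for a, b in combinations(sorted(set(genes)), 2):
            counts[(a, b)] += 1

    network = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_abstracts:
            network[a].add(b)
            network[b].add(a)
    return dict(network)


def neighbours_with_expression(network, gene, fold_changes):
    """Map expression data onto the direct network neighbours of a gene."""
    return {n: fold_changes[n] for n in network.get(gene, ()) if n in fold_changes}


# Toy input: the genes mentioned in each of three abstracts.
abstract_genes = [{"PFKM", "HK1"}, {"PFKM", "GAPDH"}, {"HK1", "INS"}]
network = build_gene_network(abstract_genes)

# Invented log2 fold changes after drug treatment.
fold_changes = {"HK1": 1.5, "INS": -0.4}
mapped = neighbours_with_expression(network, "PFKM", fold_changes)
print(network["PFKM"], mapped)
```

Raising `min_abstracts` is one simple way to suppress links that rest on a single, possibly coincidental, co-mention.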

Glucocorticoids

The studies described in chapters 3-5 of this thesis focus on the metabolic side effects of treatment with synthetic glucocorticoids (GCs). GCs are naturally occurring hormones that are released by the adrenal cortex in response to stress. They are part of the feedback mechanism of the immune system in response to inflammation. The most abundant form in the human body is cortisol, which binds with high affinity to the glucocorticoid receptor (GR). This receptor is expressed in almost every cell in the body, and unbound GR is mainly present in the cytosol of the cell. Studies have indicated that GR regulates up to 10-20% of the genes in different cell types, which shows the important role of this receptor. GC effects can roughly be divided into two types [11, 12]. The best studied are the so-called genomic effects, in which the GC-GR complex moves into the nucleus of the cell, where it regulates gene expression. The GC-GR complex enhances or represses gene transcription by binding directly as a homodimer to the glucocorticoid-responsive element (GRE), by binding to other transcription factors, or by direct GRE binding combined with interactions with transcription factors bound to neighboring sites of a gene (Figure 3). These genomic effects take place within a couple of hours. However, GCs also induce rapid effects that take place within minutes. These so-called non-genomic effects are far less well studied and are thought to be mediated by effects on the properties of cell membranes or through binding to intracellular or membrane-bound glucocorticoid receptors [13, 14].

Therapeutic usage

Because of their immunosuppressive properties, GCs have been the most commonly prescribed anti-inflammatory drugs for the treatment of inflammatory diseases since the mid 1950s, when prednisolone, dexamethasone and betamethasone were introduced. Patients suffering from rheumatoid arthritis, asthma and psoriasis are often treated with GCs. In fact, up to 60% of rheumatoid arthritis patients are more or less continuously treated with GCs [15, 16]. Although the treatment of these inflammatory diseases with GCs is very successful, it is hampered by dose-dependent side effects, such as osteoporosis, skin atrophy, hypertension and severe metabolic disturbances such as insulin resistance, which eventually leads to diabetes and obesity. This limits the long-term usage of these synthetic GCs. The general assumption is that the immunosuppressive effects of GCs are mainly driven by transrepression, in which the GC-GR complex, together with other transcription factors, prevents the transcription of pro-inflammatory genes. The unwanted side effects are thought to be mainly driven by transactivation of gene expression. A lot of effort has been put into the development of novel GC ligands with an improved efficacy/side effect ratio compared to conventional GCs. This requires knowledge about the mechanisms behind GC-induced metabolic side effects. In the studies described in this thesis, a combination of microarrays, antibody arrays, bioinformatics and TM/IE is used to characterize GC-induced metabolic side effects in both in vitro and in vivo studies.

Figure 3 General mechanisms of action of glucocorticoids and the glucocorticoid receptor in the inhibition of inflammation. TNFα denotes tumor-necrosis factor α, HSP heat-shock protein, mRNA messenger RNA, and P phosphate. The three mechanisms are nongenomic activation, DNA-dependent regulation, and protein interference mechanisms (e.g., NF-κB elements). Black arrows denote activation, the red line inhibition, the red dashed arrow repression, and the red X lack of product (i.e., no mRNA). Reproduced with permission from [12], Copyright Massachusetts Medical Society.

Studying microorganisms in food production and health

IE and TM have been used successfully in drug discovery to study drug effects and can also be used in other areas, such as microbiology. Microorganisms play crucial roles in processes ranging from waste-product degradation in ecosystems to industrial processes, where they are used for the production of fermented foods and beverages. Furthermore, microorganisms are the main contributor to food spoilage and can cause disease.

Microorganisms in food

The biggest contribution of microorganisms to food development is in the process of fermentation, in which organic compounds are transformed under the influence of bacteria and yeasts. Louis Pasteur was one of the first scientists to discover that yeasts are responsible for alcoholic fermentation [17]. This discovery initiated scientific and industrial interest in food microbiology, and fermentation is nowadays a highly industrialized process. Lactic acid bacteria (LAB) are very important for the fermentation of food. For instance, Lactobacillus delbrueckii subsp. bulgaricus and Streptococcus thermophilus strains are used for making yoghurt [18]. Lactobacillus lactis contributes to the flavor development of cheese [19, 20], and Oenococcus oeni is important for malolactic fermentation in the process of wine making, giving wine its characteristic taste and aroma [20, 21]. It is estimated that about one third of the human diet consists of fermented foods and drinks [22]. In addition to these food-related characteristics, some LAB strains have proven health-promoting effects on humans and are used as probiotics.

Probiotics

Probiotics are defined as "live microorganisms which when administered in adequate amounts confer a health benefit on the consumer". Probiotics have been proposed to contribute to human health in a number of ways: via competitive exclusion of pathogenic bacteria, via enhancement of epithelial barrier function by modulation of signaling pathways that lead to enhanced mucus production, or by modulation of the immune system of the host [23]. Probiotic effects have been established in the prevention of a number of diseases, including diarrhea, necrotizing enterocolitis and pouchitis. However, probiotic effects have not been consistently observed in studies of other diseases, such as irritable bowel syndrome, atopic dermatitis, and inflammatory bowel disease. More scientific research is therefore needed to improve the industrial application of probiotics for the promotion of human health.

Community profiling studies

In order to use microorganisms in industrialized processes for food production and to develop cures against pathogenic microorganisms, the relationships between microorganisms and a given process or treatment need to be studied. Metagenomics is a well-established technique to study the dynamics of microbial communities in so-called community profiling studies. Metagenomics is the genomic analysis of microbial DNA that is extracted directly from environmental samples, here by means of pyrosequencing of 16S rRNA genes, and has been proposed as the most accurate approach for the assessment of taxonomic composition [24]. One of the outcomes of such a metagenomics experiment is a list of microorganisms that are affected by a given process or treatment. Annotation of these lists is essential to understand how these microorganisms are involved in the studied processes.

Microorganism-focused text mining and knowledge building

Similar to the use of TM for the annotation of drug-induced gene expression, in chapter 8 of this thesis we describe how TM can be used to gain quick insight into the biological properties of microorganisms. Thesaurus-based TM is used to annotate lists of microorganisms that result from community profiling studies. Furthermore, TM without a thesaurus is used to study LAB in order to gain insight into the biological processes in which they are involved.
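Thesaurus-based annotation of a list of microorganisms can be sketched as follows. The example assumes a thesaurus that maps each concept to a list of synonyms and counts, per organism, the abstracts in which a concept is mentioned together with that organism. The thesaurus entries (including the synonym "malolactic conversion"), the abstracts and the function name `annotate_organisms` are hypothetical; this is not the procedure of chapter 8.

```python
import re
from collections import Counter


def annotate_organisms(abstracts, organisms, thesaurus):
    """Count, per organism, the abstracts that mention each thesaurus concept."""
    annotations = {organism: Counter() for organism in organisms}
    for text in abstracts:
        lowered = text.lower()
        found = [o for o in organisms if o.lower() in lowered]
        if not found:
            continue
        # A concept is present when any of its synonyms matches as a whole term.
        concepts = [
            concept
            for concept, synonyms in thesaurus.items()
            if any(
                re.search(r"\b" + re.escape(s.lower()) + r"\b", lowered)
                for s in synonyms
            )
        ]
        for organism in found:
            annotations[organism].update(concepts)
    return annotations


# Hypothetical thesaurus and invented abstracts.
thesaurus = {
    "malolactic fermentation": ["malolactic fermentation", "malolactic conversion"],
    "yoghurt": ["yoghurt", "yogurt"],
}
abstracts = [
    "Oenococcus oeni drives malolactic fermentation in wine.",
    "Streptococcus thermophilus is used in yoghurt production.",
    "Malolactic conversion by Oenococcus oeni shapes wine aroma.",
]
organisms = ["Oenococcus oeni", "Streptococcus thermophilus"]
annotations = annotate_organisms(abstracts, organisms, thesaurus)
print(annotations["Oenococcus oeni"].most_common())
```

Because synonyms are folded into one concept, the two differently worded abstracts about Oenococcus oeni both count towards the same annotation.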


Thesis outline

This thesis consists of two parts. Both parts focus on the development and application of TM and IE for the annotation of biological experiments. In the first part, bioinformatics and IE/TM are used to study the effects of drugs in order to unravel the underlying mechanisms and affected pathways. The majority of the studies described in this thesis focus on the mechanisms that underlie the unwanted metabolic side effects of glucocorticoid treatment. In the second part, we demonstrate that the experience gained with TM in drug discovery can be used to retrieve information from the literature in order to study microorganisms. A schematic summary of the work in this thesis is depicted in Figure 4.

Figure 4 Outline of this thesis.

In Chapter 2 we describe CoPub 5.0, a publicly available text mining system, which uses Medline abstracts to calculate robust statistics for keyword co-occurrences. In CoPub 5.0, all available CoPub technology has been integrated into a single application with a renewed user interface. CoPub 5.0 makes it possible to answer a variety of biological questions, ranging from microarray data analysis and keyword enrichment to the search for new biological relations.

Chapter 3 describes CoPubGene, a method to mine Medline abstracts for functional gene-disease relations. The method has been used to create a literature network of genes related to insulin resistance, a known side effect of glucocorticoid treatment for which the mechanisms are not completely understood. The study gave new insights into these mechanisms and indicated a possible involvement of the sex-steroid synthesis pathway in the development of GC-induced IR.

Chapter 4 describes a genome-wide expression profiling study to identify prednisolone-induced gene signatures in CD4+ T lymphocytes and CD14+ monocytes derived from healthy volunteers, and to link these signatures to underlying biological pathways involved in metabolic adverse effects of prednisolone treatment.

Chapter 5 describes the use of genome-wide expression data and adipokine profiling, in combination with literature mining techniques, to study mechanisms and signaling pathways involved in prednisolone-induced metabolic disturbances in murine 3T3-L1 adipocytes. We found the Wnt signaling pathway to be involved in metabolic disturbances in adipocytes and suggested this pathway for follow-up experiments to study its role in GC-induced metabolic effects.

Chapter 6 describes a genome-wide gene expression profiling study to examine the role of GR dimerization in the regulation of gene expression in mouse liver. Biological pathways targeted by prednisolone in the liver of wild type (WT) mice and of mice lacking the ability to form glucocorticoid receptor dimers (GRdim) have been identified and compared.

Chapter 7 describes a study that identifies pathways and genes involved in the differentiation of Jurkat T cells towards Th1 and Th2 subtype cells, using various stimuli and pathway inhibitors. Specific gene profiles of IL-2, associated with a Th1 type of response, and CCL1, associated with a Th2 type of response, have been identified. TM analysis was used to find additional evidence for these Th1- and Th2-specific fingerprints.

In Chapter 8, the use of TM to add meaning to the results of metagenomics experiments is demonstrated. Abstracts in which microorganisms occur are automatically analyzed to retrieve useful keywords that can be used to understand the biology of microorganisms and their involvement in the processes studied.

Chapter 9 contains a short discussion and some concluding remarks with regard to this thesis.

References

1. Cohen, K.B. and L. Hunter, Getting started in text mining. PLoS Comput Biol, 2008. 4(1): p. e20.
2. Mody, N., et al., Decreased clearance of serum retinol-binding protein and elevated levels of transthyretin in insulin-resistant ob/ob mice. Am J Physiol Endocrinol Metab, 2008. 294(4): p. E785-93.
3. Reinehr, T., B. Stoffel-Wagner, and C.L. Roth, Retinol-binding protein 4 and its relation to insulin resistance in obese children before and after weight loss. J Clin Endocrinol Metab, 2008. 93(6): p. 2287-93.
4. DiMasi, J.A., et al., Trends in risks associated with new drug development: success rates for investigational drugs. Clin Pharmacol Ther, 2010. 87(3): p. 272-7.
5. Frijters, R., et al., CoPub: a literature-based keyword enrichment tool for microarray data analysis. Nucleic Acids Res, 2008. 36(Web Server issue): p. W406-10.
6. Maier, H., et al., LitMiner and WikiGene: identifying problem-related key players of gene regulation using publication abstracts. Nucleic Acids Res, 2005. 33(Web Server issue): p. W779-82.
7. Blaschke, C., et al., Extracting information automatically from biological literature. Comp Funct Genomics, 2001. 2(5): p. 310-3.
8. Garten, Y., N.P. Tatonetti, and R.B. Altman, Improving the prediction of pharmacogenes using text-derived drug-gene relationships. Pac Symp Biocomput, 2010: p. 305-14.
9. Alako, B.T., et al., CoPub Mapper: mining MEDLINE based on search term co-publication. BMC Bioinformatics, 2005. 6: p. 51.
10. Frijters, R., et al., Literature mining for the discovery of hidden connections between drugs, genes and diseases. PLoS Comput Biol, 2010. 6(9).
11. Oakley, R.H. and J.A. Cidlowski, Cellular processing of the glucocorticoid receptor gene and protein: new mechanisms for generating tissue-specific actions of glucocorticoids. J Biol Chem, 2011. 286(5): p. 3177-84.
12. Rhen, T. and J.A. Cidlowski, Antiinflammatory action of glucocorticoids--new mechanisms for old drugs. N Engl J Med, 2005. 353(16): p. 1711-23.
13. Losel, R. and M. Wehling, Nongenomic actions of steroid hormones. Nat Rev Mol Cell Biol, 2003. 4(1): p. 46-56.
14. Lowenberg, M., et al., Novel insights into mechanisms of glucocorticoid action and the development of new glucocorticoid receptor ligands. Steroids, 2008. 73(9-10): p. 1025-9.
15. Buttgereit, F., et al., Glucocorticoids in the treatment of rheumatic diseases: an update on the mechanisms of action. Arthritis Rheum, 2004. 50(11): p. 3408-17.
16. Thiele, K., et al., Current use of glucocorticoids in patients with rheumatoid arthritis in Germany. Arthritis Rheum, 2005. 53(5): p. 740-7.
17. Mortimer, R.K., Evolution and variation of the yeast (Saccharomyces) genome. Genome Res, 2000. 10(4): p. 403-9.
18. Thevenard, B., et al., Characterization of Streptococcus thermophilus two-component systems: In silico analysis, functional analysis and expression of response regulator genes in pure or mixed culture with its yogurt partner, Lactobacillus delbrueckii subsp. bulgaricus. Int J Food Microbiol, 2011. 151(2): p. 171-81.
19. Van Hoorde, K., et al., Selection, application and monitoring of Lactobacillus paracasei strains as adjunct cultures in the production of Gouda-type cheeses. Int J Food Microbiol, 2010. 144(2): p. 226-35.
20. Smit, G., B.A. Smit, and W.J. Engels, Flavour formation by lactic acid bacteria and biochemical flavour profiling of cheese products. FEMS Microbiol Rev, 2005. 29(3): p. 591-610.
21. Ugliano, M., A. Genovese, and L. Moio, Hydrolysis of wine aroma precursors during malolactic fermentation with four commercial starter cultures of Oenococcus oeni. J Agric Food Chem, 2003. 51(17): p. 5073-8.
22. van Hylckama Vlieg, J.E., et al., Impact of microbial transformation of food on health-from fermented foods to fermentation in the gastro-intestinal tract. Curr Opin Biotechnol, 2011. 22(2): p. 211-9.
23. Bron, P.A., P. van Baarlen, and M. Kleerebezem, Emerging molecular insights into the interaction between probiotics and the host intestinal mucosa. Nat Rev Microbiol, 2012. 10(1): p. 66-78.
24. von Mering, C., et al., Quantitative phylogenetic assessment of microbial communities in diverse environments. Science, 2007. 315(5815): p. 1126-30.

CHAPTER 2 CoPub update: CoPub 5.0 a text mining system to answer biological questions

Wilco W.M. Fleuren1,2, Stefan Verhoeven3, Raoul Frijters1, Bart Heupers4, Jan Polman3, René van Schaik3, Jacob de Vlieg1,3 and Wynand Alkema3

1Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, 2Netherlands Bioinformatics Centre (NBIC), PO Box 9101, 6500 HB Nijmegen, 3Molecular Design and Informatics, MSD, PO Box 20, 5340 BH Oss and 4SARA Computing and Network Services, Amsterdam, The Netherlands.

Nucleic Acids Research, 2011, 39:W450-4.


Abstract

In this article, we present CoPub 5.0, a publicly available text mining system, which uses Medline abstracts to calculate robust statistics for keyword co-occurrences. CoPub was initially developed for the analysis of microarray data, but we broadened the scope by implementing new technology and new thesauri. In CoPub 5.0, we integrated existing CoPub technology with new features, and provided a new advanced interface, which can be used to answer a variety of biological questions. CoPub 5.0 allows searching for keywords of interest and their relations to curated thesauri and provides highlighting and sorting mechanisms, using its statistics, to retrieve the most important abstracts in which the terms co-occur. It also provides a way to search for indirect relations between genes, drugs, pathways and diseases, following an ABC principle, in which A and C have no direct connection but are connected via shared B intermediates. With CoPub 5.0, it is possible to create, annotate and analyze networks using the layout and highlight options of Cytoscape web, allowing for literature based systems biology. Finally, operations of the CoPub 5.0 Web service make it possible to implement the CoPub technology in bioinformatics workflows. CoPub 5.0 can be accessed through the CoPub portal http://www.copub.org.

Introduction

Medline abstracts are a very useful source of biomedical information covering topics such as biology, biochemistry, molecular evolution, medicine, pharmacy and health care. This knowledge is useful to better understand the complexity of living organisms and can, for instance, be used to study groups of genes or metabolites in their biological context. In the 2008 Web Server issue of NAR, we presented CoPub as a publicly available text mining system. This system uses Medline abstracts to calculate robust statistics for keyword co-occurrences, to be used for the biological interpretation of microarray data [1,2]. Since then, CoPub has been used intensively in the analysis of several microarray experiments and toxicogenomics studies [3–8]. However, literature data can be applied far beyond questions related to microarray studies. Therefore, we broadened the scope of CoPub by implementing new technology and adding new thesauri to the database. We developed a new technology called CoPub Discovery, which can be used to mine the literature for new relationships following a simple ABC principle, in which keywords A and C have no direct relationship, but are connected via shared B intermediates [9]. This technology can, for instance, be used to study mechanisms behind diseases, to connect new genes to pathways or to find novel applications for existing drugs. To reflect all these developments, we created CoPub 5.0, which has a completely new user interface and in which we integrated all CoPub technologies. CoPub 5.0 makes the CoPub functionality available in a dynamic, interactive manner by allowing easy switching between multiple analysis modes, and is well suited to answer a variety of biological questions. It is also accessible using operations of the CoPub 5.0 Web Service (SOAP or JSON), which makes it possible to embed the CoPub functionality into bioinformatics workflows. CoPub 5.0 and the CoPub 5.0 Web Service can be accessed at the CoPub portal http://www.copub.org.

Methods

CoPub 5.0 has three analysis modes: a ‘term search’ mode that retrieves abstracts and keyword relations for a single term, a ‘pair search’ mode that analyzes known or new relations between a pair of terms, and a ‘set of terms’ mode that deals with the relations between multiple terms (Figure 1).

Figure 1 Schematic representation of CoPub. The CoPub database holds co-occurrence information between categories in Medline abstracts. The CoPub functionality can be used via three modes using the web interface or via the CoPub web services, either via SOAP or JSON.

‘Term search’ mode

The ‘term search’ mode provides a way to search for keywords and subsequently shows their relations with other categories in the CoPub database. This mode provides a table and cloud view which can be used to answer questions such as ‘to which diseases is this gene related?’ or ‘in which biological processes is my metabolite involved?’ For instance, the cloud view, in which strongly connected terms (i.e. terms with a high R-scaled score [1]) are displayed with a larger font, can be used to immediately show the most important relations of the term with keywords from one or more categories in the database (Figure 2A). The evidence for these relations lies in the Medline abstracts in which both terms occur. CoPub retrieves these abstracts, highlights both terms in them and ranks the abstracts with the most term occurrences first (Figure 2B). The example in Figure 2 shows that CXCR4 is strongly connected to its ligand CXCL12 and to CXCR7, with which it forms a heterodimer, and that it mediates HIV infections. Besides co-occurrences, it is also possible to search for new hidden relations between the term and selected categories via shared intermediates using the ‘open discovery’ mode (see ‘Hidden relations’ section). From the ‘term search’ mode, it is possible to add a term to the current set and switch to the ‘set of terms’ mode.

Figure 2 An example of the term search view for the human chemokine receptor 4. (A) In the cloud view, it is immediately clear, by the large font of the terms, that CXCR4 is strongly connected to its ligand CXCL12 and to CXCR7, with which it forms a heterodimer. Also, CXCR4 is strongly connected to ‘HIV infections’ (category: disease), which are mediated by CXCR4, and to ‘stromal’ cell, to which CXCR4 is linked because of its stromal derived ligand CXCL12. (B) An example of the underlying abstracts for the co-occurrences.
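The ranking of abstracts by term occurrences can be illustrated with a minimal Python sketch. It assumes plain case-insensitive whole-word matching and ranks by the summed occurrence count of both terms; CoPub itself matches terms through curated thesauri, so the functions `count_term` and `rank_abstracts` and the made-up example texts are illustrative assumptions, not the CoPub implementation.

```python
import re


def count_term(text, term):
    """Count case-insensitive whole-word occurrences of a term in a text."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def rank_abstracts(abstracts, term_a, term_b):
    """Return ids of abstracts mentioning both terms, most occurrences first."""
    scored = []
    for abstract_id, text in abstracts.items():
        n_a = count_term(text, term_a)
        n_b = count_term(text, term_b)
        if n_a and n_b:  # keep only abstracts in which both terms occur
            scored.append((n_a + n_b, abstract_id))
    # Sort by descending total count; ties are broken by id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [abstract_id for _, abstract_id in scored]


# Made-up example texts (hypothetical, not real Medline abstracts).
toy_abstracts = {
    "abs1": "CXCR4 binds CXCL12. CXCL12 signalling through CXCR4 was studied.",
    "abs2": "CXCR4 expression was measured in these cells.",
    "abs3": "CXCL12 activates CXCR4.",
}
ranking = rank_abstracts(toy_abstracts, "CXCR4", "CXCL12")
print(ranking)
```

In this toy input the second text mentions only one of the two terms and is therefore not part of the ranking.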

‘Pair search’ mode

The ‘pair search’ mode can be used to search for specific relations between existing keywords in the CoPub database, e.g. to search for a relation between a gene and a drug. A wizard guides the user in the search for relations between terms. CoPub will first search for co-occurrences and, if no co-occurrence is found, the user can search for hidden relations using the ‘closed discovery’ mode (see ‘Hidden relations’ section). The pair search mode can be useful to search the literature for additional evidence that supports relations found in experiments, for instance, between a drug and a pathway or between a gene and a pathway, or that supports a hypothesis.

‘Set of terms’ mode

Biological research often aims at a better understanding of the complexity of living organisms, for instance, to better understand the development of a disease or to gain more insight into complex signaling pathways [10,11]. This requires a systems approach in which groups of genes or metabolites are studied in relation to a disease, drugs or pathways. In CoPub 5.0, we provide such a systems approach via the ‘set of terms’ mode. In this mode, a set of keywords can be uploaded either by copy–pasting them or by uploading a text file. Terms can belong to multiple categories (e.g. insulin belongs to the category human gene and to the category drug), which can be further restricted to only the desired categories using the ‘Members of category’ option. An uploaded set of terms can be analyzed in a number of ways.

Set enrichment analysis. To see with which categories the set has significant relations, an enrichment analysis can be performed. In this analysis, the relation of a given term from a category with the set is tested using the Fisher exact test against a background set. The calculated P-values are corrected using the Benjamini–Hochberg multiple testing correction method. In case the set consists of multiple categories, only one type of category can be chosen to be used as a background set. For each enriched term, the number of contributors of the set is shown and the contributors can be accessed by clicking on this number. All statistical tests are done using the R Statistics package (http://www.r-project.org).
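The enrichment test and the multiple testing correction described above can be sketched in plain Python. The sketch assumes a one-sided Fisher exact test for over-representation, computed from the hypergeometric distribution; the text does not state whether CoPub uses a one-sided or two-sided test, and CoPub itself runs these tests in R. The function names and the example counts are hypothetical.

```python
from math import comb


def fisher_right_tail(k, n, K, N):
    """One-sided Fisher exact test: P(X >= k) for a hypergeometric X.

    N: size of the background set, K: background terms linked to the keyword,
    n: size of the uploaded set, k: set members linked to the keyword.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: 5 of 20 set members are linked to a keyword that is
# linked to 50 of the 1000 terms in the background set.
p_example = fisher_right_tail(5, 20, 50, 1000)
adjusted_example = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(p_example, adjusted_example)
```

The adjusted P-values keep the order of the input list, so they can be reported next to the enriched terms they belong to.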

Set annotation. A set of terms can be annotated by searching for co-occurrences between the set and categories in the database. The cloud view immediately shows for each term in the set, the most significant associated terms (in larger font) per category. Categories from the database can be added or removed from this view. All co-occurring annotation can be downloaded from this view using the download button.

Network. To analyze the relations between terms in the set, a literature network of the set can be created. Subsequently, the network will be visualized using the Cytoscape web plugin (http://cytoscapeweb.cytoscape.org/). Strongly connected terms have thick edges (high R-scaled score), which immediately shows important relations (Figure 3). For large networks (>500 nodes), the network can be downloaded and visualized in a standalone Cytoscape environment.

Figure 3 Network of a group of mixed terms using the Cytoscape plugin. In the network the gene IL4 has strong connections to the genes IL2 and CCL11 and is also strongly connected to the biological processes ‘isotype switching’ and ‘cytokine biosynthesis’. This is indicated by the thick edges between these nodes. Clicking on an edge will show the abstracts in which both terms occur, allowing for more detailed analysis of the biological context in which the terms are related.

Add additional terms. At any time an uploaded set can be extended with additional terms. These additional terms can be provided by the user (via ‘add additional terms’), by searching for co-occurrences between the set and categories in the database (via ‘Grow set with co-occurrences’) or by adding a specific term via the ‘term search’ mode, from which it can be added to the set using the ‘Add term to set’ button.

Hidden relations

From the ‘term search’ mode and the ‘pair search’ mode in the website, it is possible to search for hidden relations using the CoPub Discovery technology [9]. CoPub Discovery uses an ‘open discovery’ and a ‘closed discovery’ process to search for new hidden relations. Both processes follow an ABC principle. In ‘open discovery’, the user provides a term A (e.g. a disease) and searches the literature for hidden relations with a category (C) via intermediates (B). In ‘closed discovery’, the user tests the hypothesis that, for instance, a gene (A) is related to a disease (C) and searches the literature for shared intermediates (B) which support this hypothesis. This technology can be useful to find different roles of genes in new pathways or to get more insight into mechanisms behind diseases.
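The ABC principle can be made concrete with a small Python sketch on a toy co-occurrence map. Ranking the C terms by their number of shared B intermediates is an assumption made here for illustration; the scoring used by CoPub Discovery is described in [9] and is not reproduced. All term names, categories and function names below are hypothetical placeholders.

```python
def open_discovery(a, cooc, c_category, category_of):
    """Find C terms of one category that share B intermediates with A
    but do not co-occur with A directly."""
    direct = cooc.get(a, set())
    hits = {}
    for b in direct:
        for c in cooc.get(b, set()):
            if c == a or c in direct:
                continue  # skip A itself and terms already linked to A
            if category_of.get(c) != c_category:
                continue  # keep only terms of the requested category
            hits.setdefault(c, set()).add(b)
    # Assumed ranking: most shared intermediates first, then by name.
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))


def closed_discovery(a, c, cooc):
    """Return the B intermediates that co-occur with both A and C."""
    return sorted(cooc.get(a, set()) & cooc.get(c, set()))


# Hypothetical symmetric co-occurrence map.
toy_cooc = {
    "A": {"B1", "B2"},
    "B1": {"A", "C1", "C2"},
    "B2": {"A", "C1"},
    "C1": {"B1", "B2"},
    "C2": {"B1"},
}
toy_categories = {"A": "disease", "B1": "gene", "B2": "gene",
                  "C1": "drug", "C2": "drug"}

open_hits = open_discovery("A", toy_cooc, "drug", toy_categories)
closed_hits = closed_discovery("A", "C1", toy_cooc)
print(open_hits, closed_hits)
```

In this toy map, C1 is reached through two intermediates and C2 through one, so C1 is listed first.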

CoPub Web Service

The operations of the CoPub Web Service make it possible to embed CoPub functionality into workflows and to use it in an automated fashion. These operations can be used either via SOAP or via JSON. The description of these operations can be found in the help files of the CoPub 5.0 website and an example script, showing how operations can be used, is accessible via the CoPub portal http://www.copub.org.

Discussion and conclusion

CoPub 5.0 can be used to answer a wide variety of biological questions and bridges the gap between indexed searching of PubMed and dedicated manually curated pathway databases such as WikiPathways [12], Ingenuity Pathway Analysis (http://www.ingenuity.com) and Metacore (GeneGo) (http://www.genego.com/metacore.php). There are a number of tools that provide part of the technology offered by CoPub. For example, Chilibot [13] is an NLP based tool that retrieves abstracts for user defined pairs of terms, but has no curated dictionaries, meaning that only relations between user defined terms are found, thus limiting the possibility to discover new relations. FACTA [14] offers curated dictionaries but does not provide indirect relation searching or network possibilities. Arrowsmith [15] is a tool for the discovery of hidden relations but does not contain curated ontologies, nor does it provide networking possibilities or term mode options. Furthermore, options for analyzing enrichment in term lists are not provided by these tools, limiting their use for the analysis of ‘omics’ sets. The advantage of CoPub is that it integrates the approaches offered by the above methods and combines this with advanced graphical output, web service ability and multiple options for analyzing lists of terms and creating networks. The statistical framework of CoPub 5.0, together with the cloud view functionality, is very suitable for the analysis of large ‘omics’ data sets: a broad scan using enrichment gives a general overview of the data, after which the user can zoom in on relevant pathways, focusing on strong connections (by means of the R-scaled score) in the data. Together with the hidden relations technology, this can be used to generate new hypotheses.
Future steps could include a better interface to Gene Set Enrichment Analysis (GSEA) software [16] and the incorporation of Natural Language Processing (NLP) to filter even better on biologically relevant information.

Acknowledgements

The authors thank SARA Computing and Networking Services (Amsterdam, The Netherlands) for maintaining and hosting our CoPub database and web server.

Funding

Grants received from the Netherlands Bioinformatics Centre (NBIC) under the BioAssist program and from Merck Sharp & Dohme (MSD). Funding for open access charge: NBIC.

Conflict of interest statement. None declared.


References

1. Alako,B.T., Veldhoven,A., van Baal,S., Jelier,R., Verhoeven,S., Rullmann,T., Polman,J. and Jenster,G. (2005) CoPub Mapper: mining MEDLINE based on search term co-publication. BMC Bioinformatics, 6, 51.
2. Frijters,R., Heupers,B., van Beek,P., Bouwhuis,M., van Schaik,R., de Vlieg,J., Polman,J. and Alkema,W. (2008) CoPub: a literature-based keyword enrichment tool for microarray data analysis. Nucleic Acids Res., 36, W406–W410.
3. Frijters,R., Verhoeven,S., Alkema,W., van Schaik,R. and Polman,J. (2007) Literature-based compound profiling: application to toxicogenomics. Pharmacogenomics, 8, 1521–1534.
4. Mitterhuemer,S., Petzl,W., Krebs,S., Mehne,D., Klanner,A., Wolf,E., Zerbe,H. and Blum,H. Escherichia coli infection induces distinct local and systemic transcriptome responses in the mammary gland. BMC Genomics, 11, 138.
5. Shimizu,T., Krebs,S., Bauersachs,S., Blum,H., Wolf,E. and Miyamoto,A. Actions and interactions of progesterone and estrogen on transcriptome profiles of the bovine endometrium. Physiol. Genomics, 42A, 290–300.
6. Friberg,P.A., Larsson,D.G. and Billig,H. Transcriptional effects of progesterone receptor antagonist in rat granulosa cells. Mol. Cell. Endocrinol., 315, 121–130.
7. Merkl,M., Ulbrich,S.E., Otzdorff,C., Herbach,N., Wanke,R., Wolf,E., Handler,J. and Bauersachs,S. Microarray analysis of equine endometrium at days 8 and 12 of pregnancy. Biol. Reprod., 83, 874–886.
8. Frijters,R., Fleuren,W., Toonen,E.J., Tuckermann,J.P., Reichardt,H.M., van der Maaden,H., van Elsas,A., van Lierop,M.J., Dokter,W., de Vlieg,J. et al. Prednisolone-induced differential gene expression in mouse liver carrying wild type or a dimerization-defective glucocorticoid receptor. BMC Genomics, 11, 359.
9. Frijters,R., van Vugt,M., Smeets,R., van Schaik,R., de Vlieg,J. and Alkema,W. Literature mining for the discovery of hidden connections between drugs, genes and diseases. PLoS Comput. Biol., 6, e1000943.
10. Ideker,T. and Lauffenburger,D. (2003) Building with a scaffold: emerging strategies for high- to low-level cellular modeling. Trends Biotechnol., 21, 255–262.
11. Sharan,R. and Ideker,T. (2006) Modeling cellular machinery through biological network comparison. Nat. Biotechnol., 24, 427–433.
12. Pico,A.R., Kelder,T., van Iersel,M.P., Hanspers,K., Conklin,B.R. and Evelo,C. (2008) WikiPathways: pathway editing for the people. PLoS Biol., 6, e184.
13. Chen,H. and Sharp,B.M. (2004) Content-rich biological network constructed by mining PubMed abstracts. BMC Bioinformatics, 5, 147.
14. Tsuruoka,Y., Tsujii,J. and Ananiadou,S. (2008) FACTA: a text search engine for finding associated biomedical concepts. Bioinformatics, 24, 2559–2560.
15. Smalheiser,N.R., Torvik,V.I. and Zhou,W. (2009) Arrowsmith two-node search interface: a tutorial on finding meaningful links between two disparate sets of articles in MEDLINE. Comput. Meth. Prog. Bio., 94, 190–197.
16. Subramanian,A., Tamayo,P., Mootha,V.K., Mukherjee,S., Ebert,B.L., Gillette,M.A., Paulovich,A., Pomeroy,S.L., Golub,T.R., Lander,E.S. et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA, 102, 15545–15550.

CHAPTER 3 Identification of new biomarker candidates for glucocorticoid induced insulin resistance using literature mining.

Wilco W.M. Fleuren1,2, Erik J.M. Toonen3, Stefan Verhoeven4, Raoul Frijters1,a, Tim Hulsen1,b, Ton Rullmann5, René van Schaik4, Jacob de Vlieg1,4 and Wynand Alkema1,c

1Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, P.O. Box 9101, 6500 HB Nijmegen, the Netherlands, 2Netherlands Bioinformatics Centre (NBIC), P.O. Box 9101, 6500 HB Nijmegen, The Netherlands, 3Department of Medicine; Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands, 4Netherlands eScience Center, Amsterdam, The Netherlands, 5TNO, Zeist, The Netherlands.

aPresent address: Rijk Zwaan Nederland BV, Fijnaart, The Netherlands bPresent address: Philips Research Laboratories, Eindhoven, The Netherlands. cPresent address: NIZO Food Research BV, Ede, The Netherlands.

BioData Min. 2013 Feb 4;6(1):2.


Abstract

Background

Glucocorticoids are potent anti-inflammatory agents used for the treatment of diseases such as rheumatoid arthritis, asthma, inflammatory bowel disease and psoriasis. Unfortunately, usage is limited because of metabolic side-effects, e.g. insulin resistance, glucose intolerance and diabetes. To gain more insight into the mechanisms behind glucocorticoid induced insulin resistance, it is important to understand which genes play a role in the development of insulin resistance and which genes are affected by glucocorticoids. Medline abstracts contain many studies about insulin resistance and the molecular effects of glucocorticoids and thus are a good resource to study these effects.

Results

We developed CoPubGene, a method to automatically identify gene-disease associations in Medline abstracts. We used this method to create a literature network of genes related to insulin resistance and to evaluate the importance of the genes in this network for glucocorticoid induced metabolic side effects and anti-inflammatory processes. With this approach we found several genes that already are considered markers of GC induced IR, such as phosphoenolpyruvate carboxykinase (PCK) and glucose-6-phosphatase, catalytic subunit (G6PC). In addition, we found genes involved in steroid synthesis that have not yet been recognized as mediators of GC induced IR.

Conclusions

With this approach we are able to construct a robust informative literature network of insulin resistance related genes that gave new insights to better understand the mechanisms behind GC induced IR. The method has been set up in a generic way so it can be applied to a wide variety of disease networks.

Keywords: literature mining, insulin resistance, glucocorticoids, gene networks.

Background

Glucocorticoids (GCs) are often prescribed for the treatment of inflammatory diseases such as rheumatoid arthritis, asthma, inflammatory bowel disease and psoriasis [1-3]. Despite their excellent efficacy, usage is limited because of side-effects such as insulin resistance, glucose intolerance, diabetes, central adiposity, dyslipidemia, skeletal muscle wasting and osteoporosis [4-8]. GCs bind to the glucocorticoid receptor (GR), which then dimerizes and translocates to the nucleus where it influences gene transcription. Positive regulation of genes (transactivation) is mainly mediated by direct binding of the GR-GC complex to glucocorticoid response elements located in the regulatory region of a target gene. The GR-GC complex may also bind to negative glucocorticoid response elements, which leads to a negative regulation of genes (transrepression). It is believed that transrepression, in which proinflammatory genes are downregulated, is mainly responsible for the efficacy of GCs as anti-inflammatory drugs [5, 7], while transactivation might be responsible for the GC-induced adverse effects [9]. An important side effect is the development of insulin resistance (IR), because it is the onset of many metabolic diseases and conditions such as obesity, diabetes mellitus and hypertension. IR is a physiological condition in which a given concentration of insulin produces a less-than-expected biological effect. These biological effects are different depending on the tissue in which they occur. For instance, under IR conditions, fat and muscle cells fail to adequately respond to circulating insulin, which results in reduced glucose uptake, and subsequently higher glucose levels in blood [10, 11]. In liver cells the IR effects can be seen in reduced glycogen synthesis and storage, and a failure to suppress glucose production and release into the blood.
One way by which GCs induce IR is by inhibition of the recruitment of the GLUT4 glucose transporter, which results in reduced insulin-stimulated glucose transport in skeletal muscle [12]. However, not all mechanisms involved in GC-induced side effects are completely understood. To gain more insight into mechanisms behind GC induced IR, it is important to understand which genes play a role in the development of insulin resistance and which genes are affected by GCs. It has been widely recognized that a systems approach, in which networks of genes are studied in their functional context, contributes to a better understanding of the mechanisms and pathways related to the disease and the drug effects [13-17]. To study a gene network related to a disease such as IR, a list of disease related genes as well as a notion of the interactions between these genes is needed. Literature databases such as Medline contain many studies about IR and the molecular effects of synthetic glucocorticoids and thus are a good resource that can be used to create and study disease related gene networks. The retrieval of relevant gene-disease associations out of the millions of abstracts in Medline is very labor intensive and thus a text mining system is needed to do this in an automated fashion. In previous work we reported on CoPub [18-20], a publicly available text mining system, which has successfully been used for the analysis of microarray data and in toxicogenomics studies [21-26]. CoPub calculates keyword co-occurrences in titles and abstracts from the entire Medline database, using thesauri for genes, diseases, drugs and pathways. We used this technology to develop CoPubGene, a rapid gene–disease network building tool. To evaluate the importance of genes in these networks we implemented a method to score the importance of genes in biological processes of interest by incorporating their functional neighborhood. We used CoPubGene to create a network of genes related to insulin resistance and to evaluate the importance of the genes in this network for glucocorticoid induced metabolic side effects and anti-inflammatory processes. By using this method, we identified several genes that already are considered markers of GC induced IR, such as phosphoenolpyruvate carboxykinase (PCK) and glucose-6-phosphatase, catalytic subunit (G6PC) [27, 28]. Even more importantly, we were able to identify genes involved in steroid synthesis that have not yet been recognized as mediators of GC induced IR.

Methods

CoPubGene

We constructed CoPubGene as a SOAP based web service (Table 1). The CoPub Web Service WSDL was created in Eclipse using the so-called Document Literal Wrapped style. The web service provider code is written in Perl using the SOAP::WSDL module and is available via the CoPub portal http://www.copub.org.

Retrieval of Gene-Disease associations

To create disease related gene networks, we used CoPubGene to retrieve gene-disease and gene-gene associations from Medline abstracts. Disease terms which had significant gene associations based on the R-scaled score (rs > 35) and literature count (lc > 5) in Medline abstracts were extracted from the CoPub thesaurus.

Disease clustering

Disease clustering was done in R (http://www.r-project.org) using the pvclust R package with the “complete” setting for hierarchical clustering, based on correlation distance of R-scaled scores between genes and diseases, with 100 bootstrap replications. The hierarchical cluster was visualized using Dendroscope [29]. Additional gene set enrichment analysis against the GENETIC_ASSOCIATION_DB_DISEASE was done with the annotation server DAVID [30, 31].
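The correlation distance underlying this clustering can be illustrated with a short Python sketch. It computes one minus the Pearson correlation between disease profiles, which is how pvclust defines its “correlation” distance; the hierarchical clustering and the bootstrap resampling themselves are not reproduced here. The profiles use R-scaled scores for six genes as listed in Table 2, and the helper names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long score vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Distance between two disease profiles: 1 - Pearson correlation."""
    return 1.0 - pearson(x, y)


def nearest(term, profiles):
    """Return the disease term whose profile is closest to that of `term`."""
    others = [t for t in profiles if t != term]
    return min(others,
               key=lambda t: correlation_distance(profiles[term], profiles[t]))


# R-scaled scores from Table 2 for the genes
# RBP4, RETN, IRS1, PADI4, PTPN22 and CHI3L1 (in this order).
profiles = {
    "insulin resistance": [53, 51, 50, 0, 0, 37],
    "obesity": [46, 47, 40, 0, 0, 0],
    "rheumatoid arthritis": [0, 0, 0, 48, 46, 44],
    "psoriasis": [0, 0, 0, 41, 41, 0],
}
print(nearest("insulin resistance", profiles))
print(nearest("rheumatoid arthritis", profiles))
```

With these six genes the metabolic terms end up close to each other and far from the inflammatory terms, which is the behaviour the clustering relies on.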

Creation of IR gene network

CoPubGene was used to create a set of genes related to IR, by searching for associations between genes and IR in Medline abstracts using default values (rs > 30 and lc > 5). Subsequently the IR-gene network was created by connecting genes that had significant co-occurrences with each other.
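The two steps described above, selecting genes by threshold and connecting genes that co-occur, can be sketched as follows. The sketch assumes that the same thresholds apply to gene-gene co-occurrences as to gene-term associations, which the text does not state explicitly. The gene names, scores, counts and function names are hypothetical placeholders.

```python
def select_genes(associations, min_rs=30, min_lc=5):
    """Keep genes whose association with the term exceeds both thresholds.

    associations maps gene -> (R-scaled score, literature count).
    """
    return {gene for gene, (rs, lc) in associations.items()
            if rs > min_rs and lc > min_lc}


def build_network(genes, pair_scores, min_rs=30, min_lc=5):
    """Connect selected genes that have a significant co-occurrence.

    pair_scores maps (gene1, gene2) -> (R-scaled score, literature count).
    Returns a sorted list of weighted edges (gene1, gene2, R-scaled score).
    """
    edges = []
    for (gene1, gene2), (rs, lc) in pair_scores.items():
        if gene1 in genes and gene2 in genes and rs > min_rs and lc > min_lc:
            edges.append((gene1, gene2, rs))
    return sorted(edges)


# Hypothetical gene-term associations and gene-gene co-occurrences.
toy_associations = {
    "geneA": (53, 40),
    "geneB": (45, 12),
    "geneC": (50, 3),   # too few abstracts
    "geneD": (20, 60),  # R-scaled score too low
}
toy_pairs = {
    ("geneA", "geneB"): (42, 9),
    ("geneA", "geneC"): (48, 15),  # geneC was not selected
    ("geneB", "geneD"): (39, 8),   # geneD was not selected
}

selected = select_genes(toy_associations)
network = build_network(selected, toy_pairs)
print(sorted(selected), network)
```

The resulting edge list carries the R-scaled score as weight, so it can be passed on to a network viewer that draws strongly connected genes with thicker edges.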

Keyword enrichment analysis of IR related genes

Keyword enrichment analysis on the list of IR related genes was done against disease and drug terms from the CoPub database, using default threshold values.

Analysis of the IR gene network and calculation of neighbor score for genes

The IR gene network was analyzed by mapping specific occurrences of the IR related genes with ‘inflammation’ and ‘dexamethasone’ in Medline abstracts onto the network. To evaluate the involvement of a gene, the literature score for a given gene and a given term also includes the effects of dexamethasone and inflammation on the connecting genes. The literature score for gene g with term d is calculated in the following way:

In which g1 is the R-scaled score of gene g with term d, and Ns is the literature score of its neighboring genes with term d. This latter score Ns is calculated using the R-scaled score of each neighboring gene of gene g with term d (g2, g3,..,gn) relative to its relation (R-scaled score) with gene g (rg2, rg3,..,rgn).

Results

We developed CoPubGene by creating a number of web service operations that can be used to construct networks of genes based on their co-occurrences in Medline abstracts. These web service operations can be combined to answer a variety of biological questions (Table 1). For example, the question “to what biological processes is this gene related?” can be answered by running the “get genes” and “get literature neighbours” functions. Subsequently using the “get references” function will return all the relevant entries in which the gene and keywords co-occur. By applying the “get keywords” and “get literature neighbours” functions one can retrieve all disease terms that are linked to a given drug term in Medline abstracts, or vice versa, retrieve all drug terms that are linked to a given disease term in abstracts. The networks that are created can be written to Cytoscape for downstream applications and visualizations. Also more advanced questions, such as the construction of disease related gene networks and subsequent calculation of keyword enrichment in this network, can be addressed in an automatic way. In Table 1 the available web service operations are shown.

Table 1 List of available operations of the CoPub Web Service. Biological identifiers are used by CoPub to identify biological concepts in the system. Each biological concept has a unique identifier.

Name | Operation name | Input | Output | Description
Get genes | Get_genes | Gene name, gene identifier | Biological identifier(s), with gene specific information | Each gene in CoPub belongs to an internal identifier (biological identifier). Get_genes converts the input gene to such a biological identifier. This biological identifier serves as an input for subsequent operations.
Get keywords | Get_keywords | Keyword | Biological identifier(s), with keyword specific information | Retrieves, for a set of keywords, the biological identifiers to which these keywords belong in CoPub. These biological identifiers serve as an input for subsequent operations.
Get references | Get_references | Biological identifier(s) | Literature references | Given a biological identifier, retrieves all abstracts in which the term occurs.
Get literature neighbours | Get_literature_neighbours | Biological identifier(s) | Literature neighbors | Given a biological identifier, retrieves a list of keywords which are mentioned in the literature together with the input term.
Get enriched keywords | Get_enriched_keywords | List of gene identifiers | List of enriched keywords | For a list of genes, this operation calculates a keyword overrepresentation.
Get literature network | Get_literature_network | Biological identifier(s) | SVG / Cytoscape network | For a set of genes, the operation creates a network of genes.
Get categories | Get_Categories | - | List of categories | Returns a list of categories of terms in CoPub.
Get chips | Get_chips | - | List of microarrays | Returns a list of available Affymetrix chip names in CoPub.
Version | Version | - | Version of code and literature | Returns the version of the code and literature.
Selftest | Selftest | - | Diagnostic information | -

Retrieval of gene-disease associations Our aim was to get insight into the pathways and genes that are involved in insulin resistance, and the effect of glucocorticoids on this network. As a first step we created a list of genes associated with insulin resistance using CoPubGene. This yielded a list of 384 genes each of them connected to IR with an R scaled score (in Table 2A the top scoring genes with IR are shown, the full list is available in supplementary table 2). To evaluate the quality of this list and to investigate whether this gene list is unique for IR or whether this list contains a large number of genes that are associated with multiple diseases we constructed a gene association list for all diseases in the disease thesaurus of CoPub, using similar parameter settings as used for construction of the IR gene list. This yielded a list of disease profiles with for each disease, a number of genes connected to that disease with an R scaled score. (Table 2 shows the results for a few selected diseases, the full

Figure 1 Hierarchical cluster of disease terms from the CoPub database. The top 80 disease terms with the most gene associations are shown. Disease terms are clustered together based on having the same gene associations. Red numbers at the nodes represent approximately unbiased bootstrap values (%). Identification of new biomarker candidates for GC induced IR using literature mining 43

Table 2 Part of the disease matrix, which has been used for the clustering. A link between a gene and a disease term is given by the R-scaled score. Top scoring genes with insulin resistance (IR) are related to other metabolic disease terms (indicated in green) and not to inflammatory disease terms (indicated in red) (A). Top scoring genes with rheumatoid arthritis are related to other inflammatory disease terms (shown in green) and not to metabolic disease terms (shown in red) (B). IBD: inflammatory bowel disease.

A

| Gene Symbol | Gene Name | IR | Diabetes mellitus, type 2 | Obesity | Rheumatoid arthritis | Psoriasis | IBD |
|---|---|---|---|---|---|---|---|
| RBP4 | retinol binding protein 4, plasma | 53 | 47 | 46 | 0 | 0 | 0 |
| GPR83 | G protein coupled receptor 83 | 51 | 48 | 39 | 0 | 0 | 0 |
| BSCL2 | Bernardinelli Seip congenital lipodystrophy 2 | 51 | 0 | 0 | 0 | 0 | 0 |
| RETN | resistin | 51 | 46 | 47 | 0 | 0 | 37 |
| AGPAT2 | 1 acylglycerol 3 phosphate O acyltransferase 2 | 50 | 0 | 0 | 0 | 0 | 0 |
| SERPINA12 | serpin peptidase inhibitor, clade A, member 12 | 50 | 0 | 47 | 0 | 0 | 0 |
| IRS1 | insulin receptor substrate 1 | 50 | 43 | 40 | 0 | 0 | 0 |
| ADIPOR2 | adiponectin receptor 2 | 50 | 0 | 45 | 0 | 0 | 0 |
| STEAP4 | STEAP family member 4 | 50 | 0 | 46 | 0 | 0 | 0 |
| ADIPOQ | adiponectin, C1Q and collagen domain containing | 50 | 44 | 46 | 0 | 0 | 0 |

B

| Gene Symbol | Gene Name | IR | Diabetes mellitus, type 2 | Obesity | Rheumatoid arthritis | Psoriasis | IBD |
|---|---|---|---|---|---|---|---|
| PADI4 | peptidyl arginine deiminase, type IV | 0 | 0 | 0 | 48 | 41 | 0 |
| FCRL3 | Fc receptor like 3 | 0 | 0 | 0 | 48 | 0 | 0 |
| NCOA5 | nuclear receptor coactivator 5 | 0 | 0 | 0 | 48 | 0 | 0 |
| OLIG3 | oligodendrocyte 3 | 0 | 0 | 0 | 47 | 0 | 0 |
| TNFAIP3 | tumor necrosis factor, alpha induced protein 3 | 0 | 0 | 0 | 46 | 45 | 0 |
| PTPN22 | protein tyrosine phosphatase, non receptor type 22 | 0 | 0 | 0 | 46 | 41 | 39 |
| HLADRB4 | major histocompatibility complex, class II, DR beta 4 | 0 | 0 | 0 | 45 | 0 | 36 |
| HLADRB5 | major histocompatibility complex, class II, DR beta 5 | 0 | 0 | 0 | 45 | 0 | 36 |
| CHI3L1 | chitinase 3 like 1 | 37 | 0 | 0 | 44 | 0 | 43 |
| HLADRB1 | major histocompatibility complex, class II, DR beta 1 | 0 | 0 | 0 | 44 | 0 | 36 |

table is available in supplementary table 3). These disease profiles were clustered using hierarchical clustering with multiscale bootstrap resampling, grouping together disease terms that have a similar profile, i.e. that co-occur with the same genes (Figure 1). A number of clusters of similar disease terms, i.e. disease terms known to have similar symptoms or a similar mode of action, could be identified. For instance, cancer related terms such as ‘cancer of breast’, ‘cancer of prostate’ and ‘colon cancer’ cluster together, as do inflammation related disease terms such as ‘psoriasis’, ‘inflammatory bowel disease’ and ‘asthma’. These clusters also have high approximately unbiased (AU) bootstrap values, indicating strong evidence for these clusters. To further confirm that the gene-disease associations found by CoPub are indeed biologically relevant, we collected, for each sub-cluster in Figure 1, the union of all genes for that sub-cluster and used these genes to perform a functional annotation analysis against the genetic association disease database using DAVID. DAVID indeed returned disease terms similar to those found by CoPub (for the results of this analysis see supplementary tables). These analyses showed that with CoPubGene we are able to construct a relevant list of specific IR related genes that can be used for further analysis, and that CoPubGene can be used to create a variety of disease related gene lists.
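The idea that disease terms group together when they share gene associations can be illustrated with the R-scaled scores printed in Table 2. The sketch below is only an illustration: it assumes cosine similarity as the way to compare two disease profiles (the text does not state which distance measure was used for the hierarchical clustering), it leaves out the multiscale bootstrap step, and the function names are made up for this example.

```python
from math import sqrt

# Disease terms (columns) and R-scaled scores (rows) copied from Table 2A and 2B.
DISEASES = ["IR", "Diabetes mellitus, type 2", "Obesity",
            "Rheumatoid arthritis", "Psoriasis", "IBD"]
SCORES = {
    "RBP4":      [53, 47, 46, 0, 0, 0],
    "GPR83":     [51, 48, 39, 0, 0, 0],
    "BSCL2":     [51, 0, 0, 0, 0, 0],
    "RETN":      [51, 46, 47, 0, 0, 37],
    "AGPAT2":    [50, 0, 0, 0, 0, 0],
    "SERPINA12": [50, 0, 47, 0, 0, 0],
    "IRS1":      [50, 43, 40, 0, 0, 0],
    "ADIPOR2":   [50, 0, 45, 0, 0, 0],
    "STEAP4":    [50, 0, 46, 0, 0, 0],
    "ADIPOQ":    [50, 44, 46, 0, 0, 0],
    "PADI4":     [0, 0, 0, 48, 41, 0],
    "FCRL3":     [0, 0, 0, 48, 0, 0],
    "NCOA5":     [0, 0, 0, 48, 0, 0],
    "OLIG3":     [0, 0, 0, 47, 0, 0],
    "TNFAIP3":   [0, 0, 0, 46, 45, 0],
    "PTPN22":    [0, 0, 0, 46, 41, 39],
    "HLADRB4":   [0, 0, 0, 45, 0, 36],
    "HLADRB5":   [0, 0, 0, 45, 0, 36],
    "CHI3L1":    [37, 0, 0, 44, 0, 43],
    "HLADRB1":   [0, 0, 0, 44, 0, 36],
}


def disease_profile(disease):
    """Return the column of scores for one disease term, one value per gene."""
    column = DISEASES.index(disease)
    return [row[column] for row in SCORES.values()]


def cosine_similarity(a, b):
    """Cosine of the angle between two score vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(disease):
    """Return the other disease term whose gene profile is closest to `disease`."""
    profile = disease_profile(disease)
    others = [d for d in DISEASES if d != disease]
    return max(others, key=lambda d: cosine_similarity(profile, disease_profile(d)))
```

With these values the profile closest to IR is obesity, and the profile closest to rheumatoid arthritis is IBD. Because Table 2 only lists genes that were selected as top scoring for IR or rheumatoid arthritis, part of this separation is there by construction; the full disease matrix would be needed for a fair comparison.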

Network of insulin resistance related genes

To create the IR gene network, we used the 384 genes from the IR gene list and connected the genes based on their co-occurrences with each other in Medline abstracts. The resulting network is shown in Figure 2A. We found that 381 genes of the IR gene list were connected to at least one other gene. We identified a number of hubs, such as peroxisome proliferator-activated receptor gamma (PPARG), insulin receptor substrate 1 (IRS1), v-akt murine thymoma viral oncogene homolog 1 (AKT1), insulin receptor (INSR), solute carrier family 2 (facilitated glucose transporter), member 4 (SLC2A4) and insulin (INS), which were connected to more than 100 other genes. The resulting network is scale free, as indicated by its connectivity distribution, which follows a power law (supplementary Figure 1) [32]. Although this network has the characteristics of a biological network and contains the expected genes as central hubs, without additional annotations the network representation is still largely uninformative and contains too little substructure to draw biological conclusions.
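The construction described above (genes as nodes, co-occurrence as edges, hubs as highly connected nodes) can be sketched in a few lines. This is a minimal illustration, not the CoPubGene implementation: the gene pairs below are hypothetical and were not retrieved from CoPub, and the hub threshold is a parameter rather than the value of 100 used in the text.

```python
from collections import Counter, defaultdict

# Hypothetical co-occurring gene pairs, for illustration only (not CoPub output).
CO_OCCURRENCES = [
    ("INS", "INSR"), ("INS", "IRS1"), ("INS", "AKT1"), ("INS", "SLC2A4"),
    ("INS", "PPARG"), ("INSR", "IRS1"), ("IRS1", "AKT1"), ("AKT1", "SLC2A4"),
    ("PPARG", "ADIPOQ"), ("PPARG", "RETN"),
]


def build_network(pairs):
    """Build an undirected network as {gene: set of neighbouring genes}."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self co-occurrences
            neighbours[a].add(b)
            neighbours[b].add(a)
    return dict(neighbours)


def degrees(network):
    """Number of distinct neighbours per gene."""
    return {gene: len(linked) for gene, linked in network.items()}


def hubs(network, min_degree):
    """Genes connected to more than `min_degree` other genes."""
    return sorted(g for g, k in degrees(network).items() if k > min_degree)


def degree_distribution(network):
    """Map each degree k to the number of genes with exactly k neighbours."""
    return dict(sorted(Counter(degrees(network).values()).items()))
```

For a scale free network, plotting the degree distribution on log-log axes gives an approximately straight line. The toy network here is far too small to show that; it only demonstrates the bookkeeping.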

Annotation of the network with drug and disease terms

As a first step towards annotating the network and identifying sub-networks with a shared biological function, we investigated which drugs and diseases in the literature are specifically linked to this network, using a keyword enrichment analysis on the list of IR related genes (for details about the enrichment method see Table 1). This enrichment yielded a number of drugs that are known drugs for the treatment of diabetes, such as ‘rosiglitazone’, ‘metformin’ and ‘pioglitazone’, and also ‘glucagon’ and ‘insulin’, which are frequently used for the treatment of hypoglycemia and hypoinsulinemia (Table 3A; for the full list see supplementary table 4A). Notably, among these top scoring drugs we found dexamethasone, a well known synthetic glucocorticoid. High scoring genes with dexamethasone are for instance CEBPA, SERPINA6, PCK2 and GPD1 (for a full list of genes per enriched drug term, see supplementary Table 4A.2), which have also been mentioned in the development of several metabolic diseases [33-37]. Several top scoring over-represented terms are related to metabolic diseases, e.g. ‘diabetes mellitus’, ‘obesity’, ‘diabetes mellitus, type 2’ and ‘hyperinsulinemia’ (Table 3B). That these terms score high is expected, since we constructed the gene network based on the keyword insulin resistance. However, we also found diseases that share a common origin with insulin resistance, such as cardiovascular disease (Table 3B). The most interesting high scoring term for our particular research question was the non-metabolic term ‘inflammation’, which was represented in the network by genes such as IL6, IL18, IL1RA, SOCS1, SOCS3, CCL2 and CCR2. Several of these genes have been mentioned

Table 3 Over-represented drug and disease terms (P-value < 0.05). The top scoring drug terms in the IR network from the CoPub database (A). Top scoring disease terms from the CoPub database in the IR network (B).

A

| Term | Number of genes |
|---|---|
| insulin | 358 |
| dexamethasone | 195 |
| nitric oxide | 193 |
| estrogen | 169 |
| adenosine | 151 |
| estradiol | 145 |
| rosiglitazone | 125 |
| actinomycin | 124 |
| actinomycin d | 121 |
| glucagon | 120 |
| thrombin | 108 |
| progesterone | 97 |
| trypsin | 86 |
| nicotinamide | 85 |
| metformin | 84 |
| pioglitazone | 82 |

B

| Term | Number of genes |
|---|---|
| insulin resistance | 381 |
| obesity | 263 |
| inflammation | 219 |
| diabetes mellitus | 190 |
| cardiovascular disease | 181 |
| Diabetes mellitus, type 2 | 173 |
| Oxygen deficiency | 164 |
| fibrosis | 138 |
| hyperinsulinemia | 137 |
| Cancer of breast | 131 |
| Adiposity | 130 |
| cancer | 128 |
| starvation | 120 |

in studies to be involved in the development of metabolic diseases. For instance, elevated levels of IL6 in subjects with obesity and diabetes showed an association between insulin resistance and IL6 [38]. Studies in mice showed that CCR2 deficiency or antagonism of this receptor attenuated systemic insulin resistance and the development of obesity, suggesting a modulating role for CCR2 in these processes [39, 40]. These results show that even with an unbiased, data driven construction of a gene network, the relation between IR, dexamethasone and inflammation is discovered based on the genes that play a role in these effects. We subsequently highlighted the genes in the IR network that are related to inflammation and dexamethasone (Figure 2).
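The keyword enrichment that produced the over-represented drug and disease terms is specified in Table 1, which is not reproduced here. As a rough illustration of what an over-representation P-value means, the sketch below uses a one-sided hypergeometric test. This is an assumption made for the example and is not necessarily the test CoPub applies; the function name and all counts in the usage note are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """One-sided hypergeometric P-value, P(X >= hits).

    population: total number of genes that could have been drawn
    annotated:  genes in the population that are linked to the term
    sample:     size of the gene list being tested (e.g. the IR network)
    hits:       genes in the list that are linked to the term
    """
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total
```

As a toy check, drawing 3 genes from a population of 10 of which 4 carry the term gives `hypergeom_pvalue(10, 4, 3, 2)` equal to 40/120, i.e. one third. For a real analysis the population size and the number of genes linked to each term would have to come from the CoPub thesaurus, and the resulting P-values would normally be corrected for multiple testing.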

Figure 2 Literature network of insulin resistance related genes (A). Genes, represented by nodes are linked, based on co-occurrences in Medline abstracts. The thickness of the edge indicates the strength of the link between two genes (R-scaled score). Genes in blue have a co-occurrence with dexamethasone in Medline abstracts (R-scaled score). The strength of the link with dexamethasone is given by the color shading, ranging from no link (white) to a strong link (dark blue). The strength of the link with inflammation (R-scaled score) is given by the size of the node of the gene, ranging from no link (normal size of the node) to a strong link with inflammation (large size of the node). Sub-network for gene PPARG (B). Sub-network of Cytochrome P450s (C).

Genes linked to inflammation and glucocorticoids in the context of insulin resistance

From a drug development perspective it is interesting to separate the desired effect of GCs on inflammatory processes from the undesired effect on metabolic processes. To rank each gene with respect to its relation with GCs and inflammation, we calculated for each gene a literature score with dexamethasone and with inflammation. Subsequently we focused on genes that score low on inflammation and high on dexamethasone (Figure 3). These genes are thought to be more exclusively related to GC induced IR. For these genes we also calculated a literature neighbor score, which additionally includes the relations of dexamethasone and inflammation with the genes to which the gene is connected in the network. Figure 3 shows that many genes that are not directly connected to inflammation (grey dots) are nevertheless influenced by inflammation via their connecting genes (black dots). The majority of the genes in Figure 3 are directly involved in important metabolic processes such as gluconeogenesis (PCK2, G6PC, PC and GCG), glycolysis (GCK, GCG), glucose uptake, lipid metabolism (ACACA, CHPT1, GPD1) and carbohydrate metabolism (GPD1). Others are directly involved in insulin signaling (GIP, IGF2, IPF1, IAPP).
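The selection step (high score with dexamethasone, low score with inflammation) and the idea of a literature neighbor score can be sketched as follows. The thresholds of 25 are those given for Figure 3. Everything else is an assumption made for illustration: the scores and network are hypothetical, and the neighbor score is taken here to be the mean of a gene's own score and the scores of its direct neighbors, because the text does not give the exact formula.

```python
# Hypothetical R-scaled scores per gene: (dexamethasone, inflammation).
DIRECT_SCORES = {
    "PCK2": (45, 0),
    "GPD1": (40, 10),
    "IL6": (35, 50),
    "SOCS3": (20, 48),
}

# Hypothetical co-occurrence network: {gene: set of neighbouring genes}.
NETWORK = {
    "PCK2": {"GPD1", "IL6"},
    "GPD1": {"PCK2"},
    "IL6": {"PCK2", "SOCS3"},
    "SOCS3": {"IL6"},
}

DEXAMETHASONE, INFLAMMATION = 0, 1  # positions in the score tuples


def select_candidates(scores, dex_min=25, infl_max=25):
    """Genes scoring above dex_min with dexamethasone and below infl_max with inflammation."""
    return sorted(
        gene for gene, (dex, infl) in scores.items()
        if dex > dex_min and infl < infl_max
    )


def neighbor_score(gene, scores, network, term):
    """ASSUMED definition: mean of the gene's own score and its neighbours' scores."""
    members = [gene, *network.get(gene, ())]
    return sum(scores[m][term] for m in members) / len(members)
```

In this toy example PCK2 has no direct link with inflammation (score 0), but its neighbor score for inflammation is 20 because one of its neighbors, IL6, scores high. That is the kind of shift from a direct score to a neighbor score that the arrows in Figure 3 depict.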

Figure 3 Influence of dexamethasone and inflammation on IR genes that have a high score with dexamethasone (> 25) and a low score with inflammation (< 25). The direct scores of these genes with dexamethasone and inflammation are shown in grey. The literature neighbor scores for these genes, which also include the relations of dexamethasone and inflammation with the genes to which the gene is connected in the network, are shown in black. The grey arrows indicate the migration of the gene from a direct score to a literature neighbor score.

Sex steroid physiology in relation to insulin resistance

Interestingly, in Figure 3 we also see three cytochrome P450s, i.e. CYP17A1, CYP19A1 and CYP21A2, which are key regulatory enzymes in steroid synthesis (Figure 4). The sub-network in Figure 2C shows the three cytochrome P450s and their direct gene neighbors. Analysis of this sub-network showed that many of the genes in the network are mentioned in studies of women suffering from polycystic ovary syndrome (PCOS), in which there is an imbalance of the female sex hormones. PCOS is characterized by insulin resistance, possibly because of hyperandrogenism and low levels of SHBG. The latter effect has also been observed in men suffering from the metabolic syndrome [41]. In addition, a study by Macut et al. suggested that alterations of the cross-talk between glucocorticoid signaling and metabolic parameters are related to PCOS pathophysiology [42]. Additional topological analysis of the sub-network using cytohubba [58] revealed that IGF1R, HSD11B2, IGF2 and SHBG have a high betweenness centrality, i.e. they have many shortest paths going through them, analogous to major bridges and tunnels on a highway map. Studies show that such bottlenecks play important roles in biological networks [43, 44].
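Betweenness centrality, the measure used above to find bottlenecks, counts for each node how many shortest paths between other node pairs pass through it. The sketch below computes it with Brandes' algorithm for an unweighted, undirected network. It is a generic illustration and not the cytohubba implementation; the example wiring in the usage note is hypothetical and does not reproduce the sub-network of Figure 2C.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalised betweenness for an undirected graph {node: set of neighbours}."""
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        visit_order = []
        predecessors = {node: [] for node in graph}
        n_paths = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in graph[v]:
                if distance[w] < 0:  # w is seen for the first time
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate path dependencies, farthest nodes first.
        dependency = {node: 0.0 for node in graph}
        for w in reversed(visit_order):
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Every unordered pair was counted from both endpoints.
    return {node: value / 2 for node, value in centrality.items()}
```

As a usage example, take a hypothetical five-gene network in which SHBG is linked to CYP17A1, CYP19A1 and IGF1R, and IGF1R is additionally linked to IGF2. SHBG then lies on the shortest path of five gene pairs and IGF1R on three, while the three peripheral genes score zero, so SHBG and IGF1R are the bottlenecks of this toy network.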

Figure 4 Steroid synthesis. Enzymes indicated with a red box have been found in our analysis. CYP17A1 encodes for an enzyme which has both a 17α-hydroxylase and a 17,20 lyase function. CYP21A2 encodes for a steroid 21-hydroxylase and CYP19A1 encodes for an aromatase. Figure derived from the image Steroidogenesis.png in Wikipedia, by David Richfield and Mikael Häggström, licensed under Creative Commons CC BY-SA 3.0 and GFDL.

CYP19A1 encodes an aromatase which is responsible for the aromatization of androgens into estrogens, thus influencing the androgen to estrogen balance. Several studies showed that an imbalance between androgens and estrogens caused by aromatase deficiency resulted in the development of symptoms related to the metabolic syndrome [45-48]. The fact that dexamethasone can regulate aromatase activity [49-51] suggests a role of aromatase in GC induced IR. CYP17A1 is a key regulator of androgen synthesis and catalyzes the reactions in which pregnenolone and progesterone are converted into their 17-alpha-hydroxylated products and subsequently into dehydroepiandrosterone (DHEA). A decline in DHEA and its sulfated ester (DHEA-S) has been suggested to be causally linked to insulin resistance and obesity [52-55]. The possible inhibitory effects of dexamethasone on Cyp17a1 [56, 57] suggest a role for this gene in GC induced IR. CYP21A2 encodes the cytochrome P450 enzyme 21-hydroxylase, which is involved in the biosynthesis of the steroid hormones aldosterone and cortisol. A defect in this gene leads to congenital adrenal hyperplasia (CAH), in which cortisol and aldosterone secretion are out of balance. CAH patients are characterized by insulin resistance, lower insulin sensitivity and hyperinsulinemia [58-61]. Some studies indicate that the development of IR in this patient group is caused by GC treatment [62-64]. Whether these patients develop IR because of CAH and deficiency of 21-hydroxylase, or because they are often treated with synthetic GCs, needs to be elucidated.

Genes involved in osteoporosis

Another side effect of GC treatment is the development of glucocorticoid induced osteoporosis (GIOP) [65]. GIOP is characterized by reduced bone mineral density (BMD), decreased bone mass and disturbance of the bone matrix, leading to increased susceptibility to fractures. We applied CoPubGene to deduce important genes involved in GIOP by analyzing top scoring genes with OP (in total 131 genes associated with OP were found, see supplementary table 5; the network of these top scoring genes with relations to dexamethasone and inflammation is shown in supplementary figure 3). The majority of the genes are involved in bone remodeling and resorption (TNFRSF11A, TNFRSF11B, TNFSF11, SP7, CTSK), in bone mineralization (PTH, Klotho, VDR, Calca, BGLAP) or are part of the Wnt signaling pathway that is involved in the regulation of bone formation (SOST, DKK1, LRP5, LRP6) [66]. Among these genes are known biomarkers of GIOP such as osteoprotegerin (encoded by TNFRSF11B) and the ligand RANK-L (encoded by TNFSF11) [67]. Here we also searched for genes with a low score with inflammation. Several genes in the set, such as BGLAP, COL1A1 and SP7, are affected by GCs [68-72], have low associations with inflammation and are therefore interesting biomarker candidates for GIOP.

Discussion

In the work presented here we used Medline abstracts to study mechanisms and genes involved in glucocorticoid induced insulin resistance. We created CoPubGene, a set of web service operations that can be used to retrieve relevant gene-disease, gene-drug and gene-gene associations from Medline abstracts using the CoPub technology. The clustering of disease terms based on their associations with genes in Medline abstracts showed that CoPubGene is able to generate a list of specific IR genes that can be used for further analysis, and that this method can also be used to generate a variety of other gene-disease associations. We used this clustering to evaluate the quality of disease related gene lists generated with a text mining approach because, to our knowledge, there is no real gold standard data set that covers a sufficient range of gene-disease associations. Databases such as OMIM and the KEGG disease database [73] only cover a sub range of diseases, which makes these datasets difficult to use in this type of evaluation. Next, we studied the IR genes in their functional context by including genes with which they co-occur in Medline abstracts. In this gene network we focused on genes that are strongly linked to dexamethasone and less strongly to inflammation. These genes are thought to be more exclusively related to GC induced IR and therefore might be interesting markers for this effect. However, all of them are to a certain extent related to inflammation, either directly or indirectly via their neighbors, which suggests that these genes cannot be used as exclusive markers for GC induced IR. This might have consequences for the search for dissociating compounds, i.e. compounds which only have the immune suppressive properties and not the unwanted side effects. Instead, the search should focus on compounds that show a reduced effect on the expression of these IR genes.
The majority of the IR genes that have a low literature neighbor score for inflammation (< 25) and a high score for dexamethasone (literature neighbor score > 25) code for enzymes and hormones directly involved in important metabolic processes, such as glycolysis, gluconeogenesis, glucose uptake and lipid metabolism. All these processes are tightly regulated by insulin. This suggests that, in the first instance, the search for mechanisms of GC induced IR should focus on these processes. Additionally, we identified a sub-network of genes involved in sex steroid synthesis that, to our knowledge, have not yet been recognized as mediators of GC induced side effects. Key enzymes involved in steroid synthesis, i.e. CYP17A1, CYP21A2 and CYP19A1, keep the balance between several steroids, and an impairment of this balance could possibly result in metabolic disturbances such as IR. Additional topological analyses could further prioritize this sub-network for follow-up studies to determine the influence of GCs on sex steroid synthesis and the relation to IR. In such a study one could look at the influence of GCs on the balance between the steroids in combination with their influence on insulin stimulated glucose uptake in glucose sensitive tissues such as adipose and muscle tissue.

Conclusions

Using CoPubGene we were able to construct an informative literature network of IR related genes using only information from Medline abstracts. Our approach revealed genes that at first glance were not considered to be involved in GC induced IR, and thus gave new insights that might lead to a better understanding of the mechanisms behind GC induced IR.

Financial Contribution

Author WF was supported by the Biorange project (BR4.2) “A Systems Bioinformatics Approach For Evaluating And Translating Drug-Target Effects In Disease Related Pathways” of NBIC.

Conflict of interest

Authors have no conflict of interest.

Authors' contributions

WF performed all data analysis and research design, and wrote the paper. ET supervised the biological interpretation and helped with writing the paper. SV and TH developed the web service operations. TR, RVS, RF and JVD helped with research design. WA supervised the work and helped with research design, analyzing the data and writing the manuscript.

Supplementary information can be retrieved on request from the author of this thesis.

References

1. Del Rosso Do, J.Q., Combination topical therapy for the treatment of psoriasis. J Drugs Dermatol, 2006. 5(3): p. 232-4.
2. Schwartz, M. and R. Cohen, Optimizing conventional therapy for inflammatory bowel disease. Curr Gastroenterol Rep, 2008. 10(6): p. 585-90.
3. Hillier, S.G., Diamonds are forever: the cortisone legacy. J Endocrinol, 2007. 195(1): p. 1-6.
4. De Bosscher, K. and G. Haegeman, Minireview: latest perspectives on antiinflammatory actions of glucocorticoids. Mol Endocrinol, 2009. 23(3): p. 281-91.
5. Rhen, T. and J.A. Cidlowski, Antiinflammatory action of glucocorticoids--new mechanisms for old drugs. N Engl J Med, 2005. 353(16): p. 1711-23.
6. Rockall, A.G., et al., Computed tomography assessment of fat distribution in male and female patients with Cushing's syndrome. Eur J Endocrinol, 2003. 149(6): p. 561-7.
7. Schacke, H., W.D. Docke, and K. Asadullah, Mechanisms involved in the side effects of glucocorticoids. Pharmacol Ther, 2002. 96(1): p. 23-43.
8. Schacke, H., et al., Insight into the molecular mechanisms of glucocorticoid receptor action promotes identification of novel ligands with an improved therapeutic index. Exp Dermatol, 2006. 15(8): p. 565-73.
9. Diamond, M.I., et al., Transcription factor interactions: selectors of positive or negative regulation from a single DNA element. Science, 1990. 249(4974): p. 1266-72.
10. Schenk, S., M. Saberi, and J.M. Olefsky, Insulin sensitivity: modulation by nutrients and inflammation. J Clin Invest, 2008. 118(9): p. 2992-3002.
11. Kalupahana, N.S., N. Moustaid-Moussa, and K.J. Claycombe, Immunity as a link between obesity and insulin resistance. Mol Aspects Med, 2011.
12. Weinstein, S.P., et al., Dexamethasone inhibits insulin-stimulated recruitment of GLUT4 to the cell surface in rat skeletal muscle. Metabolism, 1998. 47(1): p. 3-6.
13. Ideker, T. and D. Lauffenburger, Building with a scaffold: emerging strategies for high- to low-level cellular modeling. Trends Biotechnol, 2003. 21(6): p. 255-62.
14. Alkema, W., T. Rullmann, and A. van Elsas, Target validation in silico: does the virtual patient cure the pharma pipeline? Expert Opin Ther Targets, 2006. 10(5): p. 635-8.
15. Sharan, R. and T. Ideker, Modeling cellular machinery through biological network comparison. Nat Biotechnol, 2006. 24(4): p. 427-33.
16. Goh, K.I., et al., The human disease network. Proc Natl Acad Sci U S A, 2007. 104(21): p. 8685-90.
17. Plake, C. and M. Schroeder, Computational polypharmacology with text mining and ontologies. Curr Pharm Biotechnol, 2011. 12(3): p. 449-57.
18. Alako, B.T., et al., CoPub Mapper: mining MEDLINE based on search term co-publication. BMC Bioinformatics, 2005. 6: p. 51.
19. Frijters, R., et al., CoPub: a literature-based keyword enrichment tool for microarray data analysis. Nucleic Acids Res, 2008. 36(Web Server issue): p. W406-10.
20. Fleuren, W.W., et al., CoPub update: CoPub 5.0 a text mining system to answer biological questions. Nucleic Acids Res, 2011. 39(Web Server issue): p. W450-4.
21. Friberg, P.A., D.G. Larsson, and H. Billig, Transcriptional effects of progesterone receptor antagonist in rat granulosa cells. Mol Cell Endocrinol, 2010. 315(1-2): p. 121-30.
22. Frijters, R., et al., Prednisolone-induced differential gene expression in mouse liver carrying wild type or a dimerization-defective glucocorticoid receptor. BMC Genomics, 2010. 11: p. 359.
23. Frijters, R., et al., Literature-based compound profiling: application to toxicogenomics. Pharmacogenomics, 2007. 8(11): p. 1521-34.
24. Merkl, M., et al., Microarray analysis of equine endometrium at days 8 and 12 of pregnancy. Biol Reprod, 2010. 83(5): p. 874-86.
25. Mitterhuemer, S., et al., Escherichia coli infection induces distinct local and systemic transcriptome responses in the mammary gland. BMC Genomics, 2010. 11: p. 138.
26. Shimizu, T., et al., Actions and interactions of progesterone and estrogen on transcriptome profiles of the bovine endometrium. Physiol Genomics, 2010. 42A(4): p. 290-300.
27. Voice, M.W., A.P. Webster, and A. Burchell, The in vivo regulation of liver and kidney glucose-6-phosphatase by dexamethasone. Horm Metab Res, 1997. 29(3): p. 97-100.
28. Franckhauser, S., et al., Expression of the phosphoenolpyruvate carboxykinase gene in 3T3-F442A adipose cells: opposite effects of dexamethasone and isoprenaline on transcription. Biochem J, 1995. 305 (Pt 1): p. 65-71.
29. Huson, D.H., et al., Dendroscope: An interactive viewer for large phylogenetic trees. BMC Bioinformatics, 2007. 8: p. 460.
30. Huang da, W., B.T. Sherman, and R.A. Lempicki, Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc, 2009. 4(1): p. 44-57.
31. Huang da, W., B.T. Sherman, and R.A. Lempicki, Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res, 2009. 37(1): p. 1-13.
32. Barabasi, A.L., Scale-free networks: a decade and beyond. Science, 2009. 325(5939): p. 412-3.
33. Xu, H., et al., Dual specificity MAPK phosphatase 3 activates PEPCK gene transcription and increases gluconeogenesis in rat hepatoma cells. J Biol Chem, 2005. 280(43): p. 36013-8.
34. Park, J.J., et al., GRB14, GPD1, and GDF8 as potential network collaborators in weight loss-induced improvements in insulin action in human skeletal muscle. Physiol Genomics, 2006. 27(2): p. 114-21.
35. Krempler, F., et al., Leptin, peroxisome proliferator-activated receptor-gamma, and CCAAT/enhancer binding protein-alpha mRNA expression in adipose tissue of humans and their relation to cardiovascular risk factors. Arterioscler Thromb Vasc Biol, 2000. 20(2): p. 443-9.
36. Chia, Y.Y., et al., Amelioration of glucose homeostasis by glycyrrhizic acid through gluconeogenesis rate-limiting enzymes. Eur J Pharmacol, 2012. 677(1-3): p. 197-202.
37. Fernandez-Real, J.M., et al., Serum corticosteroid-binding globulin concentration and insulin resistance syndrome: a population study. J Clin Endocrinol Metab, 2002. 87(10): p. 4686-90.
38. Kern, P.A., et al., Adipose tissue tumor necrosis factor and interleukin-6 expression in human obesity and insulin resistance. Am J Physiol Endocrinol Metab, 2001. 280(5): p. E745-51.
39. Weisberg, S.P., et al., CCR2 modulates inflammatory and metabolic effects of high-fat feeding. J Clin Invest, 2006. 116(1): p. 115-24.
40. Tamura, Y., et al., C-C chemokine receptor 2 inhibitor improves diet-induced development of insulin resistance and hepatic steatosis in mice. J Atheroscler Thromb, 2010. 17(3): p. 219-28.
41. Bhasin, S., et al., Sex hormone-binding globulin, but not testosterone, is associated prospectively and independently with incident metabolic syndrome in men: the Framingham Heart Study. Diabetes Care, 2011. 34(11): p. 2464-70.
42. Macut, D., et al., Age, body mass index, and serum level of DHEA-S can predict glucocorticoid receptor function in women with polycystic ovary syndrome. Endocrine, 2010. 37(1): p. 129-34.
43. Yu, H., et al., The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics. PLoS Comput Biol, 2007. 3(4): p. e59.
44. McDermott, J.E., et al., Bottlenecks and hubs in inferred networks are important for virulence in Salmonella typhimurium. J Comput Biol, 2009. 16(2): p. 169-80.
45. Bader, M.I., et al., Comparative assessment of estrogenic responses with relevance to the metabolic syndrome and to menopausal symptoms in wild-type and aromatase-knockout mice. J Steroid Biochem Mol Biol, 2011.
46. Jones, M.E., et al., Of mice and men: the evolving phenotype of aromatase deficiency. Trends Endocrinol Metab, 2006. 17(2): p. 55-64.
47. Maffei, L., et al., Dysmetabolic syndrome in a man with a novel mutation of the aromatase gene: effects of testosterone, alendronate, and estradiol treatment. J Clin Endocrinol Metab, 2004. 89(1): p. 61-70.
48. Takeda, K., et al., Progressive development of insulin resistance phenotype in male mice with complete aromatase (CYP19) deficiency. J Endocrinol, 2003. 176(2): p. 237-46.
49. Zhao, H., et al., A novel promoter controls Cyp19a1 gene expression in mouse adipose tissue. Reprod Biol Endocrinol, 2009. 7: p. 37.
50. Simpson, E.R., et al., Estrogen formation in stromal cells of adipose tissue of women: induction by glucocorticosteroids. Proc Natl Acad Sci U S A, 1981. 78(9): p. 5690-4.
51. Enjuanes, A., et al., Regulation of CYP19 gene expression in primary human osteoblasts: effects of vitamin D and other treatments. Eur J Endocrinol, 2003. 148(5): p. 519-26.
52. Koga, M., et al., Serum dehydroepiandrosterone sulphate levels in patients with non-alcoholic fatty liver disease. Intern Med, 2011. 50(16): p. 1657-61.
53. Kurzman, I.D., E.G. MacEwen, and A.L. Haffa, Reduction in body weight and cholesterol in spontaneously obese dogs by dehydroepiandrosterone. Int J Obes, 1990. 14(2): p. 95-104.
54. Sanchez, J., et al., Dehydroepiandrosterone prevents age-associated alterations, increasing insulin sensitivity. J Nutr Biochem, 2008. 19(12): p. 809-18.
55. Schriock, E.D., et al., Divergent correlations of circulating dehydroepiandrosterone sulfate and testosterone with insulin levels and insulin receptor binding. J Clin Endocrinol Metab, 1988. 66(6): p. 1329-31.
56. Lee, T.C., W.L. Miller, and R.J. Auchus, Medroxyprogesterone acetate and dexamethasone are competitive inhibitors of different human steroidogenic enzymes. J Clin Endocrinol Metab, 1999. 84(6): p. 2104-10.
57. Trzeciak, W.H., et al., Dexamethasone inhibits corticotropin-induced accumulation of CYP11A and CYP17 messenger RNAs in bovine adrenocortical cells. Mol Endocrinol, 1993. 7(2): p. 206-13.
58. Mooij, C.F., et al., Unfavourable trends in cardiovascular and metabolic risk in paediatric and adult patients with congenital adrenal hyperplasia? Clin Endocrinol (Oxf), 2010. 73(2): p. 137-46.
59. Speiser, P.W., et al., Insulin insensitivity in adrenal hyperplasia due to nonclassical steroid 21-hydroxylase deficiency. J Clin Endocrinol Metab, 1992. 75(6): p. 1421-4.
60. Paula, F.J., et al., Androgen-related effects on peripheral glucose metabolism in women with congenital adrenal hyperplasia. Horm Metab Res, 1994. 26(11): p. 552-6.
61. Saygili, F., A. Oge, and C. Yilmaz, Hyperinsulinemia and insulin insensitivity in women with nonclassical congenital adrenal hyperplasia due to 21-hydroxylase deficiency: the relationship between serum leptin levels and chronic hyperinsulinemia. Horm Res, 2005. 63(6): p. 270-4.
62. Kroese, J.M., et al., Pioglitazone improves insulin resistance and decreases blood pressure in adult patients with congenital adrenal hyperplasia. Eur J Endocrinol, 2009. 161(6): p. 887-94.
63. Charmandari, E., et al., Children with classic congenital adrenal hyperplasia have elevated serum leptin concentrations and insulin resistance: potential clinical implications. J Clin Endocrinol Metab, 2002. 87(5): p. 2114-20.
64. Bachelot, A., et al., Long-term outcome of patients with congenital adrenal hyperplasia due to 21-hydroxylase deficiency. Horm Res, 2007. 67(6): p. 268-76.
65. den Uyl, D., I.E. Bultink, and W.F. Lems, Glucocorticoid-induced osteoporosis. Clin Exp Rheumatol, 2011. 29(5 Suppl 68): p. S93-8.
66. Issack, P.S., D.L. Helfet, and J.M. Lane, Role of Wnt signaling in bone remodeling and repair. HSS J, 2008. 4(1): p. 66-70.
67. Canalis, E., Mechanisms of glucocorticoid-induced osteoporosis. Curr Opin Rheumatol, 2003. 15(4): p. 454-7.
68. Kauh, E., et al., Prednisone affects inflammation, glucose tolerance, and bone turnover within hours of treatment in healthy individuals. Eur J Endocrinol, 2012. 166(3): p. 459-67.
69. Eastell, R., et al., Bone formation markers in patients with glucocorticoid-induced osteoporosis treated with teriparatide or alendronate. Bone, 2010. 46(4): p. 929-34.
70. Zheng, H.F., T.D. Spector, and J.B. Richards, Insights into the genetics of osteoporosis from recent genome-wide association studies. Expert Rev Mol Med, 2011. 13: p. e28.
71. Fu, H., et al., Osteoblast differentiation in vitro and in vivo promoted by Osterix. J Biomed Mater Res A, 2007. 83(3): p. 770-8.
72. Advani, S., et al., Dexamethasone suppresses in vivo levels of bone collagen synthesis in neonatal mice. Bone, 1997. 20(1): p. 41-6.
73. Kanehisa, M., et al., KEGG for linking genomes to life and the environment. Nucleic Acids Res, 2008. 36(Database issue): p. D480-4.

CHAPTER 4 Prednisolone-induced changes in gene-expression profiles in healthy volunteers

Erik J.M. Toonen1, Wilco W.M. Fleuren2,3, Ulla Nässander1, Marie-José C. van Lierop1, Susanne Bauerschmidt1, Wim H.A. Dokter1 and Wynand Alkema1,3

1MSD Research Laboratories, Oss, The Netherlands, 2Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre and 3Netherlands Bioinformatics Centre (NBIC), Nijmegen, The Netherlands.

Pharmacogenomics. 2011 Jul;12(7):985-98.

Prednisolone-induced changes in healthy volunteers 59

Abstract

Background: Prednisolone and other glucocorticoids (GCs) are potent anti-inflammatory and immunosuppressive drugs. However, prolonged use at a medium or high dose is hampered by side effects, of which the metabolic side effects are most evident. Relatively little is known about their effect on gene-expression in vivo, the effect on cell subpopulations and the relation to the efficacy and side effects of GCs.

Aim: To identify and compare prednisolone-induced gene signatures in CD4+ T lymphocytes and CD14+ monocytes derived from healthy volunteers and to link these signatures to underlying biological pathways involved in metabolic adverse effects.

Materials & methods: Whole-genome expression profiling was performed on CD4+ T lymphocytes and CD14+ monocytes derived from healthy volunteers treated with prednisolone. Text-mining analysis was used to link genes to pathways involved in metabolic adverse events.

Results: Induction of gene-expression was much stronger in CD4+ T lymphocytes than in CD14+ monocytes with respect to fold changes, but the number of truly cell-specific genes, where a strong prednisolone effect in one cell type was accompanied by a total lack of prednisolone effect in the other cell type, was relatively low. Subsequently, a large set of genes was identified with a strong link to metabolic processes, for some of which the association with GCs is novel.

Conclusion: The identified gene signatures provide new starting points for further study into GC-induced transcriptional regulation in vivo and the mechanisms underlying GC-mediated metabolic side effects.

Keywords: gene signature, glucocorticoids, metabolic side effects, prednisolone, whole-genome expression profiling.

Introduction

Glucocorticoids (GCs) display strong anti-inflammatory effects and are therefore among the most prescribed agents for the treatment of numerous inflammatory conditions [1]. Exogenous GCs are used in daily clinical practice to treat inflammatory, autoimmune and allergic disorders, to attenuate organ rejection after transplantation, to treat brain edema, shock and various blood cancers [2]. Most of the effects of GCs are mediated through binding to the intracellular GC receptor (GR). This receptor can be found in almost all tissues of the human body. Upon entering the cell, GCs bind to the GR, which then dimerizes and translocates to the nucleus where it influences gene transcription. Positive regulation of genes (transactivation) is mainly mediated by direct binding of the GR homodimer to glucocorticoid response elements located in the regulatory region of a target gene. Ligand-activated GR homodimers may also bind to negative glucocorticoid response elements, which leads to a negative regulation of genes (transrepression). Transrepression is also achieved by interaction of GR monomers or homodimers with other transcription factors via protein–protein interactions [1,3]. It is believed that the transrepression pathway, in which proinflammatory genes are downregulated, is mainly responsible for the efficacy of GCs as anti-inflammatory drugs [1,3]. However, in spite of excellent efficacy, the clinical use of GCs is hampered by a wide range of dose-dependent side effects. Persistent exposure to elevated levels of circulating GCs is related to insulin resistance, glucose intolerance, diabetes, adiposity, dyslipidemia, skeletal muscle wasting and osteoporosis [1,2,4]. The mechanisms of action behind these adverse events are largely unknown. It is thought that the transactivation pathway might be responsible for these GC-induced adverse effects [5].
A number of synthetic analogs of the natural human GC cortisol have been developed, and one of the most widely used GCs in the daily clinic is prednisolone [6]. For example, prednisolone is used for the treatment of rheumatoid arthritis, inflammatory bowel disease, psoriasis and asthma [7–9]. As seen for all GCs, treatment with prednisolone also induces, besides the desired effects, a large number of (metabolic) adverse events. Therefore, there is a high medical need for improved anti-inflammatory drugs that are as effective as classical GCs but have a better safety profile. Since the precise mechanisms involved in prednisolone-induced side effects remain unclear, it is of importance to elucidate genes or gene pathways that contribute to these side effects. In recent years, genome-wide gene-expression analysis using microarrays has become a key component to unraveling the underlying transcriptional regulation of various complex mechanisms and diseases [10,11]. Indeed, a large number of experiments detailing the effects of GC-induced transcriptional regulation have been described [12–14]. However, most of these experiments have been performed in cell lines or cultured primary cells or whole blood. Relatively little is known regarding the effects of GCs in an in vivo setting and the effect on the individual cell types that are present in blood. To address these issues and to monitor the acute effects of GCs in an in vivo setting in which GC-induced transcriptional regulation is measured in the context of all other relevant signals, we performed gene-expression profiling on CD4+ and CD14+ cells isolated from healthy volunteers that were treated with prednisolone. These cell types were chosen since T cells and monocytic cells are the two most important immune cells involved in inflammation and are also accessible.
The healthy volunteers that participated in the trial demonstrated metabolic adverse events after prednisolone treatment such as a decrease in the C-peptide:glucose ratio and impaired glucose sensitivity [15]. The aim of the present study was to identify and compare acute prednisolone-induced gene signatures in CD4+ T lymphocytes and CD14+ monocytes. In addition, we aimed to link these signatures to underlying biological pathways involved in previously described metabolic adverse effects [15]. The results demonstrate that prednisolone induces significant changes in the transcriptional activity of T cells and monocytes, with a large overlap in the gene sets that are affected in both cell types and a relatively low number of genes regulated by prednisolone exclusively in one cell type. Several genes were identified that play crucial roles in metabolic pathways and that might be good candidates for further studies aimed at identifying biomarkers for GC-induced adverse events and the development of improved GCs.

Materials & methods

Study design

The study was a single-centre, double-blind, randomized, placebo-controlled study, consisting of two distinct parts which were performed in parallel. In the first part of the study the acute effects of prednisolone treatment were assessed (acute protocol). Subjects were treated for 1 day with either two doses of 15 mg prednisolone (30 mg group; n = 6) or one dose of 75 mg prednisolone (n = 6). Prednisolone and cortisol levels were monitored at several time points (from 7:55 AM to 8:00 AM the next day) at day 0 and day 1 (predose) and day 2 (24 h or 12 h after the last dosing). Blood was sampled at day 1 (predose) and day 2 (24 h or 12 h after the last dosing) and used for whole-genome expression profiling. In the second part of the study, subjects were treated for 15 days with a placebo (n = 11) or 30 mg/day prednisolone (n = 12) and blood was sampled for expression analysis at day 0 (predose) and day 15 (predose) (24 h after the last dose on day 14) (2-week protocol). In this study, we focused on the results from the acute protocol and used results from the 2-week cohort for validation. In both parts, medication was taken in the morning (8:00 AM) and for the 2 × 15 mg prednisolone group also at 8:00 PM. Prednisolone tablets were obtained from Pfizer AB (Sollentuna, Sweden); placebo tablets were provided by MSD (Oss, The Netherlands). Tablets were encapsulated in order to allow the treatment to be blinded. Blood prednisolone and cortisol levels were determined by using the normal-phase HPLC procedure as described previously [16]. After sampling, the blood was separated into CD4+ T lymphocytes and CD14+ monocytes. Subsequently, RNA was isolated and used for whole-genome expression analysis. All participants provided written informed consent. The study was approved by an independent ethics committee and was conducted in accordance with the Declaration of Helsinki, using good clinical practice.

Study population

Both the acute and 2-week protocols enrolled healthy male volunteers (age range between 20 and 45 years; BMI 22–30 kg/m²). Health status was confirmed by medical history, physical and laboratory examination, ECG and vital signs recordings. Participants were excluded if they had a clinically relevant history or presence of a medical disorder known to influence the investigational parameters.

Isolation of CD4+ & CD14+ cells

Human peripheral blood mononuclear cells (PBMCs) were isolated by Ficoll density gradient centrifugation. Next, CD4+ T cells were isolated from PBMCs by the use of a MACS® CD4+ T-Cell Isolation Kit II according to protocol (Miltenyi Biotec, Utrecht, The Netherlands). In short, T cells are isolated by depletion of non-T cells (negative selection). Non-T cells are indirectly magnetically labeled with a cocktail of biotin-conjugated monoclonal antibodies, as primary labeling reagent, and antibiotin monoclonal antibodies conjugated to Microbeads® (MicroBeads, Cateno, AS, Norway), as secondary labeling reagent. The magnetically labeled non-T cells are depleted by retaining them on a MACS Column in the magnetic field of a MACS Separator, while the unlabeled T cells pass through the column. CD14+ monocytes were isolated from PBMCs by using the MACS Monocyte Isolation Kit II, which uses a positive selection procedure.

RNA isolation

RNA was isolated using the RNeasy® midi kit according to the manufacturer’s protocol (Qiagen Benelux BV, Venlo, The Netherlands). To remove residual traces of genomic DNA, the RNA was treated with DNase I (Invitrogen, Leek, The Netherlands) while bound to the RNeasy column. Quality and quantity of the purified RNA were checked using a NanoDrop® spectrophotometer (Nanodrop technologies, DE, USA). RNA integrity was assessed using the 2100 Bioanalyzer (Agilent technologies, PA, USA).

Expression analysis using Affymetrix GeneChip Human Genome U133 Plus 2.0

Amplification of 20 ng total RNA was performed with the Two-Cycle Eukaryotic Target Labeling kit (Affymetrix, CA, USA). Briefly, after the first round of cDNA synthesis, an unlabeled ribonucleotide mix (MEGAscript® T7 kit, Ambion, TX, USA) was used to generate unlabeled cRNA according to the protocol of Affymetrix. After clean-up of the cRNA with a GeneChip® Sample Cleanup Module IVT Column (Affymetrix, CA, USA), the unlabeled cRNA concentration was determined and 150 ng was reverse transcribed using random primers. Subsequently, the T7-Oligo(dT) Promoter Primer was used in the second-strand cDNA synthesis to generate double-stranded cDNA containing T7 promoter sequences. The resulting double-stranded cDNA was then amplified and biotin labeled using the IVT Labeling kit (Affymetrix). Biotin-labeled cRNA was fragmented at 1 μg/μl following the manufacturer’s protocol. After fragmentation, cRNA (10 μg) was hybridized at 45°C for 16 h to the Human Genome U133 Plus 2.0 array (Affymetrix). Following hybridization, the arrays were washed, stained with phycoerythrin–streptavidin conjugate (Molecular Probes, OR, USA), and the signals were amplified by staining the array with biotin-labeled antistreptavidin antibody (Vector Laboratories, CA, USA) followed by phycoerythrin–streptavidin. The arrays were laser scanned with a GeneChip Scanner 3000 7G (Affymetrix) according to the manufacturer’s instructions. Data were saved as a raw image file and quantified using AGCC version 1.0 (Affymetrix). Raw data files are available on request.

Synthesis of cDNA & quantitative PCR

Total RNA (1 μg) was reverse transcribed using a commercially available cDNA synthesis kit (iScript®, BioRad Laboratories, CA, USA). Quantitative PCR was performed by SYBR® Green-based quantification according to the manufacturer’s protocol (Applied Biosystems, CA, USA). Primers were developed for FKBP5, DDIT4, KLF9, ZBTB16, FLT3 and ABCA1 using Primer3 (Supplementary Table 1; www.futuremedicine.com/doi/suppl/10.2217/pgs.11.34) [17]. All primer pairs were exon-spanning. The genes HUWE1 and FBXO9 were chosen as endogenous controls since the expression arrays did not demonstrate any differences in expression of these genes between the experimental groups. PCR products were selected to be between 80 and 120 bp long. Samples were run on the 7900 HT Real-Time PCR System (Applied Biosystems) using the following protocol: 10 min denaturation at 95°C, followed by 40 cycles of 15 s denaturation at 95°C and 1 min annealing and extension at 60°C. All primer pairs were validated in triplicate using serial cDNA dilutions that resulted in a final concentration of the equivalent of 800, 400, 200, 100 and 50 pg/μl input of total RNA in the first strand synthesis. Primer pairs that were 100 ± 10% efficient, which implies a doubling of PCR product in each cycle, were used to quantify mRNA levels. Threshold cycle numbers (referred to as Ct) were obtained using the 7900 HT SDS software version 2.3 (Applied Biosystems). All samples were measured twice and duplicate samples with a standard deviation larger than 0.5 were excluded from the analysis. The relative quantity of the gene-specific mRNA was calculated from the average value of the ΔCt (target gene Ct − endogenous control gene Ct) for each sample. Differences in expression between two samples were calculated by the 2^(−ΔCt) method [18,19].
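The quantification steps above can be illustrated with a short worked sketch. It is not the authors' analysis code: the function names and Ct values are hypothetical, and the sign convention (a lower Ct means more template, so the ratio between two samples is 2 raised to minus the difference of their ΔCt values) is an assumption. The efficiency helper uses the common relation between the slope of a dilution series and amplification efficiency, where a slope of about −3.32 corresponds to 100%.

```python
import statistics

def mean_ct(duplicates, max_sd=0.5):
    """Mean Ct of duplicate measurements, or None if the duplicates disagree.

    Duplicates with a sample standard deviation above max_sd are excluded,
    mirroring the exclusion rule stated in the text.
    """
    if statistics.stdev(duplicates) > max_sd:
        return None
    return statistics.mean(duplicates)

def delta_ct(target_ct, control_ct):
    """Delta Ct = target gene Ct minus endogenous control gene Ct."""
    return target_ct - control_ct

def fold_difference(delta_ct_a, delta_ct_b):
    """Expression in sample A relative to sample B.

    Assumes 100% efficiency (product doubles each cycle) and the
    convention that a lower Ct means more starting template.
    """
    return 2.0 ** -(delta_ct_a - delta_ct_b)

def efficiency_from_slope(slope):
    """Amplification efficiency from the slope of Ct versus log10(input)."""
    return 10.0 ** (-1.0 / slope) - 1.0

# Hypothetical Ct values for one target gene and one control gene.
predose = delta_ct(mean_ct([24.1, 24.3]), mean_ct([20.0, 20.2]))
postdose = delta_ct(mean_ct([25.2, 25.4]), mean_ct([20.1, 20.1]))
ratio = fold_difference(postdose, predose)
print(f"post-dose / pre-dose expression: {ratio:.2f}")
print(f"efficiency at slope -3.32: {efficiency_from_slope(-3.32):.2f}")
```

A ratio below 1 in this sketch corresponds to downregulation after dosing; with two endogenous controls, one option would be to average their Ct values before calling `delta_ct`, which is likewise an assumption rather than a detail given in the text.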

Statistical analysis

Raw data were analyzed in R using packages from the Bioconductor library [101]. Expression data were normalized using guanine cytosine robust multiarray analysis (GC-RMA). Identification of regulated genes was carried out using the limma package from Bioconductor. For identifying differentially expressed genes after treatment with prednisolone, the following selection criteria were used: fold change >2 and p-value <0.05 after correction for multiple testing using the Benjamini–Hochberg procedure.
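The selection criteria can be made concrete with a minimal sketch. The study itself used R and limma; the Python below only illustrates the Benjamini–Hochberg adjustment and the combined filter, taking the per-probe-set p-values as given. The function names and example numbers are hypothetical, and reading "fold change >2" as applying in either direction is an assumption.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_regulated(log2_ratios, p_values, min_fold=2.0, alpha=0.05):
    """Indices of probe sets passing both the fold-change and p-value cut."""
    adjusted = benjamini_hochberg(p_values)
    threshold = math.log2(min_fold)  # fold change 2 equals |log2 ratio| 1
    return [
        i
        for i, (ratio, q) in enumerate(zip(log2_ratios, adjusted))
        if abs(ratio) > threshold and q < alpha
    ]

# Hypothetical statistics for five probe sets (day 2 versus day 1).
log2_ratios = [1.8, -1.3, 0.4, 2.5, -0.2]
p_values = [0.001, 0.004, 0.03, 0.2, 0.6]
selected = select_regulated(log2_ratios, p_values)
print(selected)
```

In this toy input the fourth probe set has the largest fold change but fails the adjusted p-value cut, which shows why both criteria are applied together.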

Results

Comparison between subjects of both study protocols

Subjects of the acute and the 2-week protocol were matched regarding age, BMI and physical activity at baseline. No physical complaints were reported in either protocol when compared with the placebo group. The completion rate of the study was 100%; none of the healthy volunteers withdrew owing to prednisolone-induced adverse events [15].

Changes in lymphocyte & monocyte cell counts after treatment with prednisolone in the acute protocol

Glucocorticoid treatment will affect the number of immune cells in the blood. Leukocyte count (10⁹/l) changed from 6.37 (± 2.21) at baseline to 8.77 (± 2.06) after treatment with 2 × 15 mg prednisolone at day one and from 5.62 (± 0.85) at baseline to 8.67 (± 0.66) after treatment with 1 × 75 mg prednisolone at day one. Lymphocyte count (10⁹/l) changed from 2.63 (± 0.93) to 2.18 (± 0.63) in the prednisolone 2 × 15 group and from 2.23 (± 0.34) to 2.67 (± 0.10) in the prednisolone 1 × 75 group after treatment. Monocyte count (10⁹/l) changed from 0.47 (± 0.197) to 0.52 (± 0.172) in the prednisolone 2 × 15 group and from 0.45 (± 0.084) to 0.53 (± 0.052) in the prednisolone 1 × 75 group after treatment.

Identification of prednisolone-induced gene-expression profiles in CD4+ T lymphocytes & CD14+ monocytes

In order to study GC effects on distinct subpopulations in human blood, CD4+ and CD14+ cells were isolated from PBMCs, as described in the methods section, from the samples obtained in the acute protocol. These purification steps yielded >98% purified CD4+ T cells and >90% CD14+ cells. In total, 892 probe sets were found to be significantly regulated by prednisolone (in CD4+ and/or CD14+ cells). Figure 1 shows the distribution of fold changes for the differentially expressed probe sets under each condition. The maximum fold changes that are observed are between four- and eight-fold, the majority of the genes being upregulated. The fold changes were much larger in the CD4+ T cells than in the CD14+ monocytes. Surprisingly, the classical GC target genes FKBP5, ZBTB16, DDIT4, KLF9 and FLT3 were downregulated in this study (FKBP5: 2.1-fold; ZBTB16: 2.9-fold; DDIT4: threefold; KLF9: 3.8-fold), whereas most studies reported an upregulation of these genes after GC treatment [12–14,20–22]. These observed downregulations were also found by microarray analysis in the independent validation cohort (2-week protocol) and by quantitative PCR validation. It is known that expression of these genes is induced not only by GCs administered as a drug, but also by the endogenous GC cortisol. We observed that in the acute protocol, cortisol levels decreased after treatment with prednisolone. Even 24 h after prednisolone intake, cortisol levels were still reduced compared with baseline, although prednisolone levels had fallen to zero after a few hours (Figure 2). These reduced cortisol levels after 24 h may provide an explanation for the observed downregulation of FKBP5, ZBTB16, DDIT4, KLF9 and FLT3.

In order to identify probe sets that are specifically regulated in CD4+ or CD14+ cells, we followed two approaches. The first approach consisted of identifying probe sets that demonstrated the largest difference in prednisolone effects between both cell types. A large number of the probe sets identified using this criterion were regulated in both cell types and differed only in the strength of the regulation. GAS7 is an example of such a gene (Figure 3, left panel). This indicates that the difference between the cell types is caused more by a difference in magnitude of the prednisolone effect than by an on–off effect in which a prednisolone effect in one of the cell types is completely absent. To identify the latter category, which we regard as a better representation of cell-specific probe sets, we selected probe sets that demonstrated regulation in one cell type (using the criteria stated earlier) and an absence of regulation in the other (defined as a log2 ratio of <0.1). This analysis yielded 62 CD4+ specific probe sets and 14 CD14+ specific probe sets (Table 1, Supplementary Tables 2 & 3). Both cell-specific sets included many genes related to inflammation and immune regulation. However, a link between gene function and cell specificity could be made for a number of genes. For example, FCER1G and IFNGR1, regulated in CD4+ cells but not in CD14+ cells, are involved in T-helper cell signaling and differentiation [23–25], processes that are predominantly mediated by CD4+ T lymphocytes. The genes PECAM1 and CXCL9, regulated in CD14+ cells but not in CD4+ cells, are known to have a role in antigen presentation and phagocytosis, which are functions mainly performed by CD14+ cells [26,27]. The relatively small sets of cell-type specific regulated genes indicate that only a minority of the genes demonstrate true cell-specific behavior and that the majority of prednisolone-regulated genes are regulated in both cell types.

Figure 1 Distribution of ratios of differentially expressed probe sets in CD4+ and CD14+ subsets under both conditions (30 mg and 75 mg prednisolone). The ordered gene index shown on the x-axis presents all the probe sets that were differentially regulated on day 2 compared with day 1. These probe sets were ordered on the basis of the fold change between day 2 and day 1.

Figure 2 Cortisol and prednisolone levels in acute protocol. The solid lines show the prednisolone levels during the course of the trial. Prednisolone was administered at 8 am on day 1 at two doses: 2 × 15 mg (second dose at 8 pm; lighter colored line) and 75 mg (darker colored line) as indicated by the arrows. Cortisol levels (dashed lines) were suppressed for almost 20 h by these doses of prednisolone compared with the levels at day 0 after placebo administration. Samples for gene-expression analysis were taken predose at day 1 and after 24 h.

Figure 3 Examples of genes which show differences in prednisolone effects concerning magnitude or cell specificity. (A) Example of a gene which is regulated by prednisolone in both CD4+ and CD14+ cells but shows a difference in the magnitude of expression after treatment. (B) Example of a gene which demonstrates expression in both cell types (cut-off: intensity >20) but is regulated by prednisolone only in CD4+ cells (CD4+ specific). (C) Example of a gene which demonstrates expression in both cell types (cut-off: intensity >20) but is regulated by prednisolone only in CD14+ cells (CD14+ specific).

Table 1 Number of cell-specific regulated probe sets. Cut-off for determining whether a probe set was expressed or not was set at an average intensity of 20. Numbers denote probe sets that demonstrated a log2 ratio of <0.1 in the other cell type (corresponding to a fold change of <1.07).

Cells in which specific regulation occurs | Total number of regulated genes | Genes expressed in other cell type | Genes not expressed in other cell type
CD4+ | 62 | 33 | 29
CD14+ | 15 | 1 | 14
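The on–off criterion for cell-specific probe sets can be sketched as a small classification rule. This is an illustration under stated assumptions rather than the authors' code: the probe names and values are hypothetical, the "regulated" flag stands for having passed the selection criteria from the statistical analysis, and the cut-off is applied to the absolute log2 ratio.

```python
ABSENT_LOG2_CUTOFF = 0.1  # |log2 ratio| below this counts as "no regulation"

def classify(probe):
    """Label one probe set by the on-off criterion described in the text.

    `probe` holds, per cell type, the log2 ratio (day 2 versus day 1) and
    a flag saying whether the probe set met the selection criteria.
    """
    cd4_on = probe["cd4_regulated"]
    cd14_on = probe["cd14_regulated"]
    cd4_flat = abs(probe["cd4_log2"]) < ABSENT_LOG2_CUTOFF
    cd14_flat = abs(probe["cd14_log2"]) < ABSENT_LOG2_CUTOFF
    if cd4_on and cd14_on:
        return "regulated in both"
    if cd4_on and cd14_flat:
        return "CD4+ specific"
    if cd14_on and cd4_flat:
        return "CD14+ specific"
    # Regulated in one cell type with a weak, sub-threshold effect in the other.
    return "unclassified"

# Hypothetical probe sets, not data from the study.
probes = {
    "probe_A": {"cd4_log2": 2.1, "cd4_regulated": True,
                "cd14_log2": 1.2, "cd14_regulated": True},
    "probe_B": {"cd4_log2": -1.6, "cd4_regulated": True,
                "cd14_log2": 0.04, "cd14_regulated": False},
    "probe_C": {"cd4_log2": 0.6, "cd4_regulated": False,
                "cd14_log2": 1.4, "cd14_regulated": True},
}
labels = {name: classify(probe) for name, probe in probes.items()}
print(labels)
print(f"fold change at the cut-off: {2 ** ABSENT_LOG2_CUTOFF:.2f}")
```

The hypothetical `probe_C` shows why the strict rule yields small sets: a probe set that is regulated in one cell type and moves modestly in the other is counted as neither shared nor cell-specific.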

Table 2 Top five over-represented pathways as identified by Ingenuity Pathway Analysis and CoPub Analysis. The enrichment was calculated for 892 probe sets that were differentially regulated in CD4+ or CD14+ samples. The total list of over-represented pathways is outlined in Supplementary Table 4.

Pathway | P-value

Ingenuity Pathway Analysis
TREM1 Signaling | 2.45E-09
Role of Pattern Recognition Receptors in Recognition of Bacteria and Viruses | 7.41E-08
B Cell Receptor Signaling | 2.00E-07
Production of Nitric Oxide and Reactive Oxygen Species in Macrophages | 5.75E-07
Toll-like Receptor Signaling | 2.51E-06

CoPub Analysis
cell adhesion | 0.00E+00
neutrophil activation | 0.00E+00
respiratory burst | 0.00E+00
t-cell proliferation, t-cell homeostatic proliferation | 9.79E-11
cell development | 9.79E-11

Pathways affected by prednisolone treatment

In order to investigate which biological pathways were mostly affected by prednisolone, we used Ingenuity Pathway Analysis software version 8.8 (Ingenuity systems, CA, USA) and CoPub ([28,102]) to map the regulated probe sets to canonical pathways. The top five over-represented pathways as identified by Ingenuity Pathway Analysis and CoPub analysis are outlined in Table 2; the total list is outlined in Supplementary Table 4. The top scoring terms appeared to have a strong immunological component, exemplified by pathways such as Toll-like receptor signaling, acute phase response, T-cell and B-cell signaling. Also a link to autoimmune diseases such as systemic lupus erythematosus and rheumatoid arthritis was observed. In addition, a link to specific signaling pathways via NF-κB, IL-8, IL-6 and GM-CSF was observed, all of which are signaling pathways that play an important role in inflammatory processes. This indicates that the main effects that are induced by prednisolone in CD4+ and CD14+ cells are related to the pharmacological, anti-inflammatory properties of prednisolone. Interestingly, these effects can thus be observed in healthy, unchallenged volunteers.
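Over-representation of a pathway in a gene list is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below shows that generic calculation only; the text does not specify which statistic Ingenuity Pathway Analysis or CoPub computes, so this should not be read as either tool's method. The function name and all counts are hypothetical.

```python
from math import comb

def enrichment_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """Probability of seeing at least n_overlap pathway genes by chance.

    Models the selected genes as a draw without replacement from a
    universe of n_universe genes, n_pathway of which belong to the pathway.
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 genes measured, 100 in the pathway,
# 600 regulated genes, 12 of which fall in the pathway.
p_value = enrichment_p_value(20_000, 100, 600, 12)
print(f"enrichment p-value: {p_value:.2e}")
```

Because the p-value depends on how far the overlap exceeds its chance expectation, a pathway in which only a few genes are regulated scores poorly however important those genes are.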

Text-mining analysis to identify genes that are involved in prednisolone-induced metabolic side effects

The above-mentioned results demonstrated that we can clearly identify GC signatures in both CD4+ and CD14+ cell types and that these signatures have a strong link to immunological processes. The above analysis is, however, an enrichment analysis in which only the most affected pathways are identified. If only a small, but possibly crucial, number of genes in a pathway are regulated, this pathway will not receive a high enrichment score and might therefore easily be missed when the results are interpreted. To make a link to possible prednisolone-induced metabolic side effects, we set out to identify genes that are regulated by prednisolone and are involved in metabolic pathways and processes. To perform this in an unbiased way we used CoPub to identify, from the set of regulated genes, all genes with a strong literature link to keywords describing glucose and fatty-acid metabolism. Since we were also interested in the interplay between immunological and metabolic processes and the possible relation with GC regulation, we also examined for each of the genes whether there was a strong known link with keywords describing GR-mediated processes and immunological pathways. The keywords that were used are presented in Supplementary Table 5. We identified 253 genes with a link to one or more of the metabolic keywords that were used for searching. The top scoring metabolic genes (34 genes) are shown in Figure 4, together with their literature scores. A short description regarding the function of ten of these 34 genes is outlined in Table 3; in Supplementary Table 6 the function of all 34 genes is summarized. It appeared that a large number of these genes had not only a strong link to metabolic processes but were also connected to inflammatory processes.
For example, MAPK14 (p38 MAP kinase) is involved in signal transduction of inflammatory stimuli, but has also been reported to play a role in insulin-induced glucose transport in muscle [29]. Other genes with a strong link to both metabolic and inflammatory processes are MAPK1 and IL1B. However, for a number of genes an immunological function is less obvious, and these may represent a set of genes that have a predominantly metabolic function and are less important for the anti-inflammatory action of prednisolone. Examples of such genes are genes related to fatty acid transport and metabolism: ABCA1, ACOX1, PDK4, ACSL1 and EPHX2 [30–36], and genes related to (regulation of) glucose and glycogen metabolism: PYGL, GYG1, SLC16A3, PFKFB3, PFKFB2, AQP9, GK and SLC2A3 (also known as GLUT3) (Figure 4) [32,35,37–42]. These results clearly indicate that the glucose and lipid metabolic signaling pathways, in which these genes play a role, are affected by GC treatment. Some of the above-mentioned genes, for example PYGL, GYG1 and AQP9, display a weak literature link with GC. A detailed search in PubMed confirms that these genes indeed have not been reported to be under control of GCs. Interestingly, in promoter regions of human and pig AQP9, putative GC response elements have been identified [43,44], indicating that this gene might be under the direct control of GCs. A metabolic role for GYG1 and PYGL in PBMCs is supported by the fact that these genes were also strongly regulated in PBMCs of healthy volunteers after a high calorie breakfast [45].

Figure 4 Text-mining analysis to identify genes involved in prednisolone-induced metabolic side effects. Genes depicted in this list are sorted according to high literature co-occurrence score for keywords involved in metabolism (cut-off ≥0.5).

Table 3 Function of ten of 34 genes that were identified by text mining. Information was obtained from two sources. Information on the general function of the gene was obtained from the EntrezGene database from the NCBI (http://www.ncbi.nlm.nih.gov/gene). The link between the gene and metabolic processes was obtained from manual inspection of the PubMed abstracts returned by the CoPub text mining strategy, i.e., those abstracts in which the genes and metabolic keywords co-occur, which is the underlying evidence for the literature scores depicted in Figure 4. The total gene list of 34 genes is outlined in Supplementary Table 6.

PYGL (Phosphorylase, glycogen, liver)
Function: Encodes a homodimeric protein that catalyses the cleavage of alpha-1,4-glucosidic bonds to release glucose-1-phosphate from liver glycogen stores.
Role in metabolism: Humans have three glycogen phosphorylase isozymes that are primarily expressed in liver, brain and muscle, respectively. The liver isozyme serves the glycemic demands of the body in general while the brain and muscle isozymes supply just those tissues. In glycogen storage disease type VI, or Hers disease, mutations in liver glycogen phosphorylase inhibit the conversion of glycogen to glucose and result in moderate hypoglycemia, mild ketosis, growth retardation and hepatomegaly.

GYG1 (Glycogenin 1)
Function: Encodes a member of the glycogenin family. Glycogenin is a glycosyltransferase that catalyzes the formation of a short glucose polymer from uridine diphosphate glucose in an autoglucosylation reaction.
Role in metabolism: This gene is expressed in muscle and other tissues and has a key role in glycogen formation. Mutations in this gene result in glycogen storage disease XV.

AQP9 (Aquaporin 9)
Function: The aquaporins are a family of water-selective membrane channels. The protein encoded by this gene allows passage of a wide variety of noncharged solutes.
Role in metabolism: AQP9 stimulates urea transport and osmotic water permeability; there are contradicting reports about its role in providing glycerol permeability. It may also play a role in specialized leukocyte functions such as immunological response and bactericidal activity.

PDK4 (Pyruvate dehydrogenase kinase, isozyme 4)
Function: PDK4 is located in the matrix of the mitochondria and inhibits the pyruvate dehydrogenase complex by phosphorylating one of its subunits.
Role in metabolism: By inhibiting the pyruvate dehydrogenase complex it contributes to the regulation of glucose metabolism. Expression of this gene is regulated by glucocorticoids, retinoic acid and insulin.

GLUL (Glutamate-ammonia ligase)
Function: The protein catalyzes the synthesis of glutamine from glutamate and ammonia.
Role in metabolism: Glutamine is a main source of energy and is involved in cell proliferation, inhibition of apoptosis, and cell signaling.

ACSL1 (Acyl-CoA synthetase long-chain family member 1)
Function: Isozyme of the long-chain fatty-acid-coenzyme A ligase family. All isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters.
Role in metabolism: Plays a key role in lipid biosynthesis and fatty acid degradation.

CEBPB (CCAAT/enhancer binding protein (C/EBP), beta)
Function: The protein encoded by this intronless gene is a bZIP transcription factor which can bind as a homodimer to certain DNA regulatory regions.
Role in metabolism: Polymorphisms in CEBPB are associated with abdominal obesity and related metabolic abnormalities like type 2 diabetes.

IL1B (Interleukin 1, beta)
Function: This cytokine is an important mediator of the inflammatory response, and is involved in a variety of cellular activities, including cell proliferation, differentiation, and apoptosis.
Role in metabolism: Inflammation is associated with obesity, the metabolic syndrome, and diabetes. Weight reduction results in a decrease in the mRNA expression of IL-1beta (IL1B), IL-1 receptor antagonist, and tumor necrosis factor alpha. Decrease in IL1B expression is correlated with an increase in insulin sensitivity index. The strong correlation between the decrease in IL1B expression and the increase in insulin sensitivity suggests a contribution of IL1B to insulin-resistant states found in obesity and the metabolic syndrome.

MAPK14 (Mitogen-activated protein kinase 14)
Function: Member of the MAP kinase family. MAP kinases act as an integration point for multiple biochemical signals, and are involved in a wide variety of cellular processes such as proliferation, differentiation, transcription regulation and development.
Role in metabolism: Regulates the activities of many transcription factors and proteins/enzymes and thus has a wide spectrum of biological effects. Patients with insulin resistance and/or type 2 diabetes have high levels of plasma free fatty acids, inflammatory cytokines, and/or glucose, and over-activation of the cardiovascular renin-angiotensin system, all of which are capable of activating MAPK14. MAPK14 plays a central role in hepatic glucose and lipid metabolism, leading to increased hepatic glucose production and decreased hepatic lipogenesis. The role of MAPK14 in insulin-mediated glucose uptake in skeletal muscle and adipose tissue remains controversial.

MAPK1 (Mitogen-activated protein kinase 1)
Function: Member of the MAP kinase family. MAP kinases act as an integration point for multiple biochemical signals, and are involved in a wide variety of cellular processes such as proliferation, differentiation, transcription regulation and development.
Role in metabolism: Recent evidence suggests that common stress-activated signaling pathways such as nuclear factor-kappaB, MAPK1, and NH2-terminal Jun kinases/stress-activated protein kinases underlie the development of diabetic complications. In type 2 diabetes, there is evidence that the activation of these same stress pathways by glucose and possibly FFA leads to both insulin resistance and impaired insulin secretion.

Discussion

It is well known that GCs are very efficacious drugs in humans and that prolonged use of GCs induces metabolic side effects in humans. However, knowledge of the effects of GCs on immune-related and metabolic pathways in an in vivo setting is relatively scarce. Therefore, we conducted genome-wide profiling of gene-expression in healthy human volunteers who were treated with prednisolone as a typical, broadly used GC. To our knowledge this is the first study in which gene-expression profiles in prednisolone-treated healthy volunteers are described. As a first aim of this study, we identified prednisolone-induced gene signatures in CD4+ T lymphocytes and CD14+ monocytes, cells which are known to be peripheral immune target cells for GC action. These expression signatures will provide insight into the effects of GCs in an in vivo setting. We observed a much stronger induction of gene-expression in CD4+ T lymphocytes than in CD14+ monocytes with respect to fold changes, but the number of truly cell-specific genes regulated by prednisolone was relatively low. Tissue-specific regulation is known for GCs and large differences between expression profiles of liver, muscle and kidney can be observed [46–48]. Genes such as TAT, G6PC and PCK1 are specifically regulated in the liver, and the interplay between GR- and tissue-specific cofactors has been studied in detail [49–51]. However, to the best of our knowledge, no literature studies exist in which in vivo GC regulation in monocytes and T cells has been compared. Apparently, the tissue specificity of GC is less pronounced between cells from the same organ and lineage. It should, however, be noted that, although we found a strong prednisolone effect on immune-related pathways, these cells were derived from healthy volunteers; in diseased people the effect of GCs might well be larger and more cell specific than observed in our study. We found that classical GC target genes FKBP5 and ZBTB16 were downregulated.
The product of FKBP5 functions as a co-chaperone molecule of the GR that affects the transport of GR into the nucleus. ZBTB16 encodes a zinc finger transcription factor that contains nine Kruppel-type zinc finger domains at the carboxyl terminus. This protein is located in the nucleus and is involved in cell cycle progression. Expression of both genes has previously been reported as being strongly induced shortly after GC administration in both cell lines and primary patient cells [12,52–55]. The main difference between those studies and our study is that we studied prednisolone-induced gene-expression in a complex in vivo setting, in contrast to other studies in which most often cell lines, animals or cultured primary cells were investigated [12–14,20–22]. Interestingly, other known target genes of GR (for example GLUL) demonstrated an upregulation by prednisolone after 24 h, whereas other known GC target genes such as DUSP1 and GILZ were not regulated by prednisolone in this study. The exact interplay between exogenous and endogenous cortisol levels (which were strongly reduced after 24 h), kinetics of induction of gene-expression and mRNA stability determines the effect of GC on its target genes in a complex (in vivo) setting and can only partly be replicated in isolated in vitro studies. This demonstrates that when using the expression of GR target genes as biomarkers for GR-induced processes, either time-resolved measurements are needed or markers should be chosen that are relatively independent of the time window at which sampling takes place. In our study, we used purified CD4+ and CD14+ cells to study the effect of prednisolone. For large scale clinical trials, gene-expression studies with whole-blood samples are preferred because of ease of handling and reproducibility of the results [56].
A potential problem in measuring prednisolone-induced changes in gene expression in whole blood is that direct effects of prednisolone on gene transcription may be confounded by the shifts in T cell and monocyte subsets that can be induced by prednisolone. If prednisolone acted on largely distinct gene sets for each cell type, changes in whole-blood gene expression could be unambiguously assigned to a specific cell type. However, the fact that we observed a large overlap in gene sets regulated by prednisolone in CD4+ and CD14+ cells indicates that when measuring prednisolone-induced gene-expression in whole blood, cell counts of leukocyte subset populations or gene-expression markers in these subsets need to be measured to account for shifts in cell populations [57]. Clinical measurements that were carried out during this trial demonstrated that both acute and short-term exposure to prednisolone impaired different aspects of pancreatic β-cell function, such as glucose sensitivity and fasting insulin secretion, which contribute to a diabetic phenotype [15]. Therefore, we used the expression profiles obtained from CD4+ and CD14+ cells to study whether we could find a link to pathways and mechanisms that might be involved in prednisolone-induced metabolic adverse events. Although blood cells are not the tissues most relevant for metabolic side effects (these are most likely liver, pancreas, muscle and adipose tissue), there is growing evidence that blood cells can act as sentinels of disease [58–61]. The permanent interactions between blood cells and every tissue in the body make it plausible that subtle changes occurring in relation to disease within cells or tissues of the body trigger specific changes in gene-expression in blood cells, thereby reflecting the disease state of affected cells and tissues.
Indeed, Liew and colleagues demonstrated that over 80% of the genes expressed in other tissues overlapped with the genes expressed in peripheral blood [58]. These findings suggest that peripheral blood-expression profiles can provide information concerning the health or disease state of any particular tissue in the body and may serve as a tool for diagnostic purposes. Text-mining analysis identified 253 prednisolone-regulated genes with a metabolic link. Among the top scoring genes were a number of genes that play a crucial role in the regulation of glucose and lipid metabolism (Figure 4). From these genes, PDK4 and ACSL1 are particularly interesting. PDK4 is a crucial regulator of glucose and lipid metabolism.

In the well-fed state, the pyruvate dehydrogenase complex reduces blood glucose levels by converting it to acetyl-coenzyme A which will enter the citric acid cycle. In the fasting state, inactivation of the pyruvate dehydrogenase complex is mainly achieved by PDK4, thereby maintaining blood glucose levels [62]. Increased PDK4 activity has been implicated in the pathogenesis of insulin resistance, diabetes and obesity [62,63]. Our results demonstrate an upregulation of PDK4 in CD4+ and CD14+ cells after prednisolone treatment. These results imply that prednisolone-induced insulin resistance might partly be mediated through PDK4 regulation. ACSL1 encodes an isozyme of the long-chain fatty acid-CoA ligase family that converts free fatty acids into fatty acyl-CoA esters that can be used for lipid biosynthesis and oxidation. In adipose tissue, insulin signaling leads to the translocation of ACSL1 to the membrane, together with other enzymes involved in fatty acid binding and translocation. SNPs in this gene have been demonstrated to be associated with metabolic syndrome [32]. It was observed that ACSL1 expression in subcutaneous adipose tissue was negatively correlated with insulin resistance [64]. Whereas the regulation of PDK4 and ACSL1 in metabolic tissues such as the muscle, adipose tissue and the liver, and the effect of GC on this regulation, has been well studied [65], relatively little is known regarding the regulation of these genes in T cells or monocytes. The results obtained by text-mining analyses identify interesting genes that might be involved in GC-induced metabolic adverse events. Expression-profiling studies demonstrated that several of the genes identified in this study are also regulated in the muscle of collagen-induced arthritic mice after treatment with prednisolone [Van Lierop et al., Unpublished Data]. Genes that were regulated after prednisolone treatment are AQP9, PDK4, GLUL, CEBPB and CD14.
Taken together, these results indeed demonstrate that prednisolone has a major impact on metabolic processes in blood cells. Whether this is caused by a direct effect of prednisolone on these cells or whether this is a reaction of the blood cells to a changed metabolic state of the primary metabolic organs such as the liver, remains to be studied. Text mining also revealed a number of prednisolone-induced genes with a metabolic function that had not yet been linked to GCs in the literature. These genes might be interesting new candidate genes for further investigation of GC-induced metabolic side effects. This demonstrates that a combined transcriptomics and text-mining approach may reveal new genes and pathways involved in GC-induced metabolic side effects and provide new hypotheses for follow-up research.
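
The confounding of whole-blood measurements by shifts in cell populations, raised earlier in this discussion, can be made concrete with a small numerical sketch. It assumes a simple mixture model in which the whole-blood signal for a gene is the fraction-weighted sum of the per-cell-type expression levels. The model, the function name and all numbers are illustrative assumptions and are not data or methods from this study.

```python
# Illustrative sketch only: every number below is hypothetical.

def whole_blood_signal(fractions, expression):
    """Bulk signal as the fraction-weighted sum of per-cell-type expression."""
    return sum(fractions[cell] * expression[cell] for cell in fractions)

# Hypothetical per-cell expression of one gene. It is kept identical before and
# after treatment, i.e. the drug has no direct transcriptional effect here.
expression = {"CD4": 10.0, "CD14": 100.0, "other": 20.0}

# Hypothetical leukocyte composition before and after treatment.
before = {"CD4": 0.30, "CD14": 0.10, "other": 0.60}
after = {"CD4": 0.15, "CD14": 0.20, "other": 0.65}

signal_before = whole_blood_signal(before, expression)
signal_after = whole_blood_signal(after, expression)

# An apparent change in whole blood that is caused purely by composition.
apparent_fold_change = signal_after / signal_before
print(f"apparent fold change: {apparent_fold_change:.2f}")
```

In this toy setting the whole-blood signal rises although no cell type changes its expression, which is why cell counts or subset-specific markers are needed to interpret whole-blood profiles.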

Conclusion

By using whole-genome expression profiling, we generated expression signatures in blood CD4+ and CD14+ cells that reflect anti-inflammatory effects but also metabolic side effects. Our findings demonstrate that prednisolone induces a very specific profile in both cell types, consisting of a large number of primary GC-responsive genes that are regulated in both cell types, as well as a set of cell-type-specific genes. Combined with the fact that prednisolone-induced pancreatic β-cell failure was observed in these volunteers, these findings suggest that peripheral blood-expression profiles can provide information on the prednisolone-induced metabolic state. The identified genes may be useful as biomarkers for GC-induced side effects or provide a starting point for the development of improved GCs that demonstrate the same anti-inflammatory effect but have a better safety profile.

Financial & competing interests disclosure

Clinical trial number: NCT00971724. This work was performed within the framework of Dutch Top Institute Pharma, project ‘Glucocorticoid-induced insulin resistance’ (project nr. T1-106-1). Ulla Nässander, Marie-José C van Lierop, Susanne Bauerschmidt, Wim HA Dokter and Wynand Alkema are all employees of Merck, Sharp and Dohme. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed. No writing assistance was utilized in the production of this manuscript.

Ethical conduct of research

The authors state that they have obtained appropriate institutional review board approval or have followed the principles outlined in the Declaration of Helsinki for all human or animal experimental investigations. In addition, for investigations involving human subjects, informed consent has been obtained from the participants involved.

Supplementary information can be found online at the website of the publisher or can be retrieved on request by the author of this thesis.

References

Papers of special note have been highlighted as: * of interest ** of considerable interest

1. Rhen T, Cidlowski JA: Antiinflammatory action of glucocorticoids – new mechanisms for old drugs. N. Engl. J. Med. 353(16), 1711–1723 (2005).
2. de Bosscher K, Haegeman G: Minireview: latest perspectives on antiinflammatory actions of glucocorticoids. Mol. Endocrinol. 23(3), 281–291 (2009).
3. Schacke H, Berger M, Rehwinkel H, Asadullah K: Selective glucocorticoid receptor agonists (SEGRAs): novel ligands with an improved therapeutic index. Mol. Cell Endocrinol. 275(1–2), 109–117 (2007).
4. Schacke H, Rehwinkel H, Asadullah K, Cato AC: Insight into the molecular mechanisms of glucocorticoid receptor action promotes identification of novel ligands with an improved therapeutic index. Exp. Dermatol. 15(8), 565–573 (2006). ** Informative overview of approaches used for optimizing glucocorticoid receptor ligands.
5. Diamond MI, Miner JN, Yoshinaga SK, Yamamoto KR: Transcription factor interactions: selectors of positive or negative regulation from a single DNA element. Science 249(4974), 1266–1272 (1990).
6. Liberman AC, Druker J, Perone MJ, Arzt E: Glucocorticoids in the regulation of transcription factors that control cytokine synthesis. Cytokine Growth Factor Rev. 18(1–2), 45–56 (2007).
7. Hillier SG: Diamonds are forever: the cortisone legacy. J. Endocrinol. 195(1), 1–6 (2007).
8. Schwartz M, Cohen R: Optimizing conventional therapy for inflammatory bowel disease. Curr. Gastroenterol. Rep. 10(6), 585–590 (2008).
9. Del Rosso Do JQ: Combination topical therapy for the treatment of psoriasis. J. Drugs Dermatol. 5(3), 232–234 (2006).
10. Myers AJ, Gibbs JR, Webster JA et al.: A survey of genetic human cortical gene expression. Nat. Genet. 39(12), 1494–1499 (2007).
11. Goring HH, Curran JE, Johnson MP et al.: Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes. Nat. Genet. 39(10), 1208–1216 (2007).
12. Tissing WJ, den Boer ML, Meijerink JP et al.: Genomewide identification of prednisolone-responsive genes in acute lymphoblastic leukemia cells. Blood 109(9), 3929–3935 (2007).
13. Hulleman E, Kazemier KM, Holleman A et al.: Inhibition of glycolysis modulates prednisolone resistance in acute lymphoblastic leukemia cells. Blood 113(9), 2014–2021 (2009).
14. Almon RR, Dubois DC, Jin JY, Jusko WJ: Temporal profiling of the transcriptional basis for the development of corticosteroid-induced insulin resistance in rat muscle. J. Endocrinol. 184(1), 219–232 (2005).
15. Van Raalte DH, Nofrate V, Bunck M et al.: Acute and 2-week exposure to prednisolone impair different aspects of β-cell function in healthy men. Eur. J. Endocrinol. 162(4), 729–735 (2010).
16. Mager DE, Lin SX, Blum RA, Lates CD, Jusko WJ: Dose equivalency evaluation of major corticosteroids: pharmacokinetics and cell trafficking and cortisol dynamics. J. Clin. Pharmacol. 43(11), 1216–1227 (2003).
17. Rozen S, Skaletsky H: Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol. 132, 365–386 (2000).
18. Livak KJ, Schmittgen TD: Analysis of relative gene-expression data using real-time quantitative PCR and the 2(-ΔΔC(T)) method. Methods 25(4), 402–408 (2001).
19. Pfaffl MW: A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Res. 29(9), e45 (2001).
20. Franchimont D, Galon J, Vacchio MS et al.: Positive effects of glucocorticoids on T-cell function by up-regulation of IL-7 receptor alpha. J. Immunol. 168(5), 2212–2218 (2002).
21. Galon J, Franchimont D, Hiroi N et al.: Gene profiling reveals unknown enhancing and suppressive actions of glucocorticoids on immune cells. FASEB J. 16(1), 61–71 (2002).
22. Caulfield J, Fernandez M, Snetkov V, Lee T, Hawrylowicz C: CXCR4 expression on monocytes is up-regulated by dexamethasone and is modulated by autologous CD3+ T cells. Immunology 105(2), 155–162 (2002).
23. Arques JL, Regoli M, Bertelli E, Nicoletti C: Persistence of apoptosis-resistant T cell-activating dendritic cells promotes T helper type-2 response and IgE antibody production. Mol. Immunol. 45(8), 2177–2186 (2008).
24. Maldonado RA, Soriano MA, Perdomo LC et al.: Control of T helper cell differentiation through cytokine receptor inclusion in the immunological synapse. J. Exp. Med. 206(4), 877–892 (2009).
25. Blink SE, Fu YX: IgE regulates T helper cell differentiation through FcgammaRIII mediated dendritic cell cytokine modulation. Cell Immunol. 264(1), 54–60 (2010).
26. Koch S, Kohl K, Klein E, von BD, Bieber T: Skin homing of Langerhans cell precursors: adhesion, chemotaxis, and migration. J. Allergy Clin. Immunol. 117(1), 163–168 (2006).
27. Wu Y, Stabach P, Michaud M, Madri JA: Neutrophils lacking platelet-endothelial cell adhesion molecule-1 exhibit loss of directionality and motility in CXCR2-mediated chemotaxis. J. Immunol. 175(6), 3484–3491 (2005).
28. Frijters R, Heupers B, van Beek P et al.: CoPub: a literature-based keyword enrichment tool for microarray data analysis. Nucleic Acids Res. 36, W406–W410 (2008).
29. Geiger PC, Wright DC, Han DH, Holloszy JO: Activation of p38 MAP kinase enhances sensitivity of muscle glucose transport to insulin. Am. J. Physiol. Endocrinol. Metab. 288(4), E782–E788 (2005).
30. Koehn J, Fountoulakis M, Krapfenbauer K: Multiple drug resistance associated with function of ABC-transporters in diabetes mellitus: molecular mechanism and clinical relevance. Infect. Disord. Drug Targets 8(2), 109–118 (2008).
31. Woo MN, Jeon SM, Kim HJ et al.: Fucoxanthin supplementation improves plasma and hepatic lipid metabolism and blood glucose concentration in high-fat fed C57BL/6N mice. Chem. Biol. Interact. 186(3), 316–322 (2010).
32. Phillips CM, Goumidi L, Bertrais S et al.: Gene-nutrient interactions with dietary fat modulate the association between genetic variation of the ACSL1 gene and metabolic syndrome. J. Lipid Res. 51(7), 1793–1800 (2010).
33. Botion LM, Green A: Long-term regulation of lipolysis and hormone-sensitive lipase by insulin and glucose. Diabetes 48(9), 1691–1697 (1999).
34. Langin D: Adipose tissue lipolysis as a metabolic pathway to define pharmacological strategies against obesity and the metabolic syndrome. Pharmacol. Res. 53(6), 482–491 (2006).
35. Mattson MP: Calcium and neuronal injury in Alzheimer’s disease. Contributions of β-amyloid precursor protein mismetabolism, free radicals, and metabolic compromise. Ann. NY Acad. Sci. 747, 50–76 (1994).
36. Hommelberg PP, Plat J, Sparks LM et al.: Palmitate-induced skeletal muscle insulin resistance does not require NF-κB activation. Cell Mol. Life Sci. 68(7), 1215–1225 (2010).
37. De Mello VD, Kolehmainen M, Schwab U et al.: Effect of weight loss on cytokine messenger RNA expression in peripheral blood mononuclear cells of obese subjects with the metabolic syndrome. Metabolism 57(2), 192–199 (2008).
38. Wanecq E, Prevot D, Carpene C: Lack of direct insulin-like action of visfatin/Nampt/PBEF1 in human adipocytes. J. Physiol. Biochem. 65(4), 351–359 (2009).
39. Salehzadeh F, Al-Khalili L, Kulkarni SS, Wang M, Lonnqvist F, Krook A: Glucocorticoid-mediated effects on metabolism are reversed by targeting 11β-hydroxysteroid dehydrogenase type 1 in human skeletal muscle. Diabetes Metab. Res. Rev. 25(3), 250–258 (2009).
40. Shibata M, Hakuno F, Yamanaka D et al.: Paraquat-induced oxidative stress represses phosphatidylinositol 3-kinase activities leading to impaired glucose uptake in 3T3-L1 adipocytes. J. Biol. Chem. 285(27), 20915–20925 (2010).
41. Piatkiewicz P, Czech A, Taton J, Gorski A: Investigations of cellular glucose transport and its regulation under the influence of insulin in human peripheral blood lymphocytes. Endokrynol. Pol. 61(2), 182–187 (2010).
42. Zhang Q, Xiao X, Feng K et al.: Berberine moderates glucose and lipid metabolism through multipathway mechanism. Evid Based Complement Alternat Med 2011; 2011(Pt 2) 924851 (2010).
43. Li X, Lei T, Xia T et al.: Molecular characterization, chromosomal and expression patterns of three aquaglyceroporins (AQP3, 7, 9) from pig. Comp. Biochem. Physiol. B. Biochem. Mol. Biol. 149(3), 468–476 (2008).
44. Tsukaguchi H, Weremowicz S, Morton CC, Hediger MA: Functional and molecular characterization of the human neutral solute channel aquaporin-9. Am. J. Physiol. 277(5 Pt 2), F685–F696 (1999).
45. van Erk MJ, Blom WA, van OB, Hendriks HF: High-protein and high carbohydrate breakfasts differentially change the transcriptome of human blood cells. Am. J. Clin. Nutr. 84(5), 1233–1241 (2006).
46. Chrousos GP, Kino T: Glucocorticoid signaling in the cell. Expanding clinical implications to complex human behavioral and somatic disorders. Ann. NY Acad. Sci. 1179, 153–166 (2009).
47. Kino T: Tissue glucocorticoid sensitivity: beyond stochastic regulation on the diverse actions of glucocorticoids. Horm. Metab. Res. 39(6), 420–424 (2007).
48. Lu NZ, Cidlowski JA: Glucocorticoid receptor isoforms generate transcription specificity. Trends Cell Biol. 16(6), 301–307 (2006).
49. Panin LE, Usynin IF: Role of glucocorticoids and resident liver macrophages in induction of tyrosine aminotransferase. Biochemistry 73(3), 305–309 (2008).
50. Vander Kooi BT, Onuma H, Oeser JK et al.: The glucose-6-phosphatase catalytic subunit gene promoter contains both positive and negative glucocorticoid response elements. Mol. Endocrinol. 19(12), 3001–3022 (2005).
51. Quinn PG, Yeagley D: Insulin regulation of PEPCK gene-expression: a model for rapid and reversible modulation. Curr. Drug Targets Immune Endocr. Metabol. Disord. 5(4), 423–437 (2005).
52. Schmidt S, Rainer J, Riml S et al.: Identification of glucocorticoid-response genes in children with acute lymphoblastic leukemia. Blood 107(5), 2061–2069 (2006).
53. Vermeer H, Hendriks-Stegeman BI, van der Burg B, van Buul-Offers SC, Jansen M: Glucocorticoid-induced increase in lymphocytic FKBP51 messenger ribonucleic acid expression: a potential marker for glucocorticoid sensitivity, potency, and bioavailability. J. Clin. Endocrinol. Metab. 88(1), 277–284 (2003). * One of the first papers that identifies FKBP51 as a biomarker for glucocorticoid sensitivity.
54. U M, Shen L, Oshida T, Miyauchi J, Yamada M, Miyashita T: Identification of novel direct transcriptional targets of glucocorticoid receptor. Leukemia 18(11), 1850–1856 (2004).
55. Nehme A, Lobenhofer EK, Stamer WD, Edelman JL: Glucocorticoids with different chemical structures but similar glucocorticoid receptor potency regulate subsets of common and unique genes in human trabecular meshwork cells. BMC Med. Genomics 2, 58 (2009).
56. Debey S, Zander T, Brors B, Popov A, Eils R, Schultze JL: A highly standardized, robust, and cost-effective method for genome-wide transcriptome analysis of peripheral blood applicable to large-scale clinical trials. Genomics 87(5), 653–664 (2006).
57. Whitney AR, Diehn M, Popper SJ et al.: Individuality and variation in gene expression patterns in human blood. Proc. Natl Acad. Sci. USA 100(4), 1896–1901 (2003).
58. Liew CC, Ma J, Tang HC, Zheng R, Dempsey AA: The peripheral blood transcriptome dynamically reflects system wide biology: a potential diagnostic tool. J. Lab. Clin. Med. 147(3), 126–132 (2006). ** Results from this study show that peripheral blood cells reflect the disease state of other less accessible tissues. This suggests that easily accessible blood cells can be used to monitor disease state, which might be a valuable tool for diagnostic purposes.
59. Liew CC: Expressed genome molecular signatures of heart failure. Clin. Chem. Lab. Med. 43(5), 462–469 (2005).
60. Griffin TA, Barnes MG, Ilowite NT et al.: Gene-expression signatures in polyarticular juvenile idiopathic arthritis demonstrate disease heterogeneity and offer a molecular classification of disease subsets. Arthritis Rheum 60(7), 2113–2123 (2009).
61. Radich JP, Mao M, Stepaniants S et al.: Individual-specific variation of gene expression in peripheral blood leukocytes. Genomics 83(6), 980–988 (2004).
62. Jeoung NH, Harris RA: Role of pyruvate dehydrogenase kinase 4 in regulation of blood glucose levels. Korean Diabetes J. 34(5), 274–283 (2010).
63. Majer M, Popov KM, Harris RA, Bogardus C, Prochazka M: Insulin downregulates pyruvate dehydrogenase kinase (PDK) mRNA: potential mechanism contributing to increased lipid oxidation in insulin-resistant subjects. Mol. Genet. Metab. 65(2), 181–186 (1998).
64. Gertow K, Rosell M, Sjogren P et al.: Fatty acid handling protein expression in adipose tissue, fatty acid composition of adipose tissue and serum, and markers of insulin resistance. Eur. J. Clin. Nutr. 60(12), 1406–1413 (2006).
65. Connaughton S, Chowdhury F, Attia RR et al.: Regulation of pyruvate dehydrogenase kinase isoform 4 (PDK4) gene-expression by glucocorticoids and insulin. Mol. Cell Endocrinol. 315(1–2), 159–167 (2010).

Websites

101. The R Project for Statistical Computing www.r-project.org
102. CoPub www.copub.org
103. The National Center for Biotechnology Information www.ncbi.nlm.nih.gov/gene

CHAPTER 5

Prednisolone induces the Wnt signaling pathway in 3T3-L1 adipocytes

Wilco W.M. Fleuren1,2, Margot M.L. Linssen3, Erik J.M. Toonen4, Gerard C.M. van der Zon3, Bruno Guigas3,5, Jacob de Vlieg1,6, Wim H.A. Dokter7, D. Margriet Ouwens8,9 and Wynand Alkema1,a

1 CDD, CMBI, NCMLS, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
2 Netherlands Bioinformatics Centre (NBIC), Nijmegen, The Netherlands
3 Department of Molecular Cell Biology, Leiden University Medical Center, Leiden, The Netherlands
4 Department of Medicine, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
5 Department of Parasitology, Leiden University Medical Center, Leiden, The Netherlands
6 Netherlands eScience Center, Amsterdam, The Netherlands
7 Synthon Biopharmaceuticals BV, Nijmegen, The Netherlands
8 Institute of Clinical Biochemistry and Pathobiochemistry, German Diabetes Center, Düsseldorf, Germany
9 Department of Endocrinology, Ghent University Hospital, Ghent, Belgium
a Present address: NIZO Food Research BV, Ede, The Netherlands

Accepted for publication in Arch Physiol Biochem. 2013


Abstract

Synthetic glucocorticoids are potent anti-inflammatory drugs but show dose-dependent metabolic side effects such as the development of insulin resistance and obesity. The precise mechanisms involved in these glucocorticoid-induced side effects, and especially the participation of adipose tissue in them, are not completely understood. We used a combination of transcriptomics, antibody arrays and bioinformatics approaches to characterize prednisolone-induced alterations in gene expression and adipokine secretion, which could underlie metabolic dysfunction in 3T3-L1 adipocytes. Several pathways, including cytokine signaling, Akt signaling, and Wnt signaling were found to be regulated at multiple levels, showing that these processes are targeted by prednisolone. These results suggest that the mechanisms by which prednisolone induces insulin resistance include dysregulation of Wnt signaling and immune response processes. These pathways may provide interesting targets for the development of improved glucocorticoids.

Keywords: glucocorticoids, metabolic dysfunction, gene profiling


Introduction

Synthetic glucocorticoids (GCs) such as prednisolone and dexamethasone are widely used for the treatment of inflammatory diseases such as rheumatoid arthritis, asthma, inflammatory bowel disease and psoriasis [1-3]. Despite their excellent efficacy, GC usage is hampered because of adverse (metabolic) side effects, such as insulin resistance, glucose intolerance, diabetes, central adiposity, dyslipidemia, skeletal muscle wasting and osteoporosis [4-8]. The precise mechanisms involved in these GC-induced side effects are not completely understood and are tissue specific [9-12]. One of the key tissues thought to be involved in metabolic side effects is the adipose tissue. Several studies have shown that the side-effects of GCs in adipose tissue include the development of central adiposity [7], dyslipidaemia [13, 14] and the inhibition of insulin-stimulated glucose uptake [15-18]. It is now well established that adipose tissue is not only involved in energy storage, but also functions as an endocrine organ that secretes hundreds of bioactive substances known as adipokines [19-21]. Previously, GCs have been found to affect the secretion of several adipokines, including TNF-α, IL-6, Resistin, Adiponectin and Leptin in rodents [22]. Because recent proteomic approaches have led to the characterization of numerous novel adipokines, we applied a combination of genomics, antibody arrays and bioinformatics approaches to identify prednisolone-induced alterations in (adipokine) gene expression and secretion which could potentially underlie prednisolone-induced metabolic dysfunction of 3T3-L1 adipocytes.

Materials and Methods

Set up of the study

In this study, differentiated 3T3-L1 adipocytes were treated with prednisolone or DMSO for 0, 1, 6, 24 and 48 hours and the effects of prednisolone were studied at gene expression and protein secretion level.
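
As a minimal sketch, the treatment and time-point combinations named above can be enumerated as a design grid. The variable names are illustrative, and the sketch assumes a fully crossed design; the number of replicates and whether the 0 hour point is shared between treatments are not stated here and are therefore left out.

```python
from itertools import product

# Treatments and incubation times as listed in the study set-up.
treatments = ["prednisolone", "DMSO"]
time_points_h = [0, 1, 6, 24, 48]

# Assumption: every treatment is combined with every time point.
conditions = [
    {"treatment": treatment, "time_h": hours}
    for treatment, hours in product(treatments, time_points_h)
]

for condition in conditions:
    print(f"{condition['treatment']:>12} {condition['time_h']:>2} h")
```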

Cell culture

3T3-L1 pre-adipocytes obtained from ATCC were cultured and differentiated into adipocytes as previously described [23]. Cells were used seven days after completion of the differentiation process. Only cultures in which >95% of cells displayed adipocyte morphology were used. Prior to use, adipocytes were serum starved for 16 hours with DMEM supplemented with 0.5% fetal bovine serum.

Analysis of insulin signaling

Differentiated 3T3-L1 adipocytes, grown in 12-well plates, were incubated with prednisolone (1 μM) or DMSO at day 8 for 48 hours. After incubation with prednisolone or DMSO, cells were serum-starved for 2 hours, and then stimulated with insulin (100 nM) for 10 minutes. Following insulin stimulation, cells were washed twice with ice-cold PBS, and lysed as described previously [24]. Protein expression and phosphorylation were determined by Western blotting as described [24] using the following antibodies: phospho-insulin receptor substrate 1-Tyr1222, phospho-Akt-Thr308, phospho-Akt-Ser473 (all from Cell Signalling Technology, Danvers, MA, USA), total Akt, phospho-Akt substrate of 160-kDa (AS160), total AS160 (all from Millipore, MA, USA), glucose transporter 4 (GLUT4), and β-actin (both from Abcam, Cambridge, UK).

Assay of 2-deoxy-D-glucose (2DOG) uptake

Differentiated 3T3-L1 adipocytes, grown in 12-well plates, were incubated with prednisolone (1 µM) or DMSO at day 8 for 6, 24 and 48 hours. After incubation with prednisolone or DMSO for the indicated times, the cells were washed once with HEPES buffer (50 mM HEPES, 0.14 M NaCl, 1.85 mM CaCl2, 1.3 mM MgSO4, and 4.8 mM KCl [pH 7.4]), and then incubated in this buffer for an additional hour at 37°C. Then, cells were stimulated with insulin (100 nM) or kept untreated. After 15 min, 2DOG uptake was initiated by the addition of 2-deoxy-D-[14C]glucose (0.075 µCi per well) in 3 mM 2DOG. After 10 min, the assay was terminated by three quick washes with ice-cold PBS. Cells were lysed in 0.1 M NaOH, 0.2% SDS, whereafter incorporated 2-deoxy-D-[14C]glucose was determined by liquid scintillation counting.
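
One common way to summarize an uptake assay of this kind is the fold stimulation by insulin, computed per treatment as the ratio of mean insulin-stimulated counts to mean basal counts. The sketch below illustrates that calculation only; the function name, the triplicate layout and all counts are hypothetical, and the normalization actually applied in this study is not described here.

```python
def fold_stimulation(basal_counts, insulin_counts):
    """Ratio of mean insulin-stimulated to mean basal 2DOG uptake."""
    basal_mean = sum(basal_counts) / len(basal_counts)
    insulin_mean = sum(insulin_counts) / len(insulin_counts)
    return insulin_mean / basal_mean

# Hypothetical scintillation counts (cpm per well), three wells per condition.
dmso = {"basal": [1000, 1100, 900], "insulin": [5000, 5200, 4800]}
prednisolone = {"basal": [1000, 950, 1050], "insulin": [3000, 3100, 2900]}

fold_dmso = fold_stimulation(dmso["basal"], dmso["insulin"])
fold_pred = fold_stimulation(prednisolone["basal"], prednisolone["insulin"])

# A lower fold stimulation after treatment would indicate reduced insulin
# responsiveness of glucose uptake in this toy example.
print(f"DMSO: {fold_dmso:.1f}-fold, prednisolone: {fold_pred:.1f}-fold")
```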

RNA isolation

Differentiated 3T3-L1 adipocytes grown in 6-well plates were incubated with prednisolone (1 µM) for 1, 6, 24 and 48 hours and used for RNA isolation and whole-genome expression profiling. RNA was isolated using Tripure (Roche), and subsequently purified using the RNeasy mini kit according to the manufacturer’s protocol (Qiagen Benelux B.V. Venlo, The Netherlands). To remove residual traces of genomic DNA, the RNA was treated with DNase I (Invitrogen, Leek, The Netherlands) while bound to the RNeasy column. Quality and quantity of the purified RNA were checked using a NanoDrop spectrophotometer (Nanodrop technologies, Montchanin, DE, USA). RNA integrity was investigated by using the 2100 Bioanalyser (Agilent technologies, Philadelphia, PA, USA).

Whole-genome expression analysis using Affymetrix GeneChip Mouse Genome 430 2.0 Array

Amplification of 20 ng total RNA was performed with the Two-Cycle Eukaryotic Target Labeling kit (Affymetrix, Santa Clara, CA). Briefly, after the first round of cDNA synthesis, an unlabeled ribonucleotide mix (MEGAscript T7 kit, Ambion) was used to generate unlabeled cRNA according to the Affymetrix protocol. After cleanup of the cRNA with a GeneChip Sample Cleanup Module IVT Column (Affymetrix, Santa Clara, CA), the unlabeled cRNA concentration was determined and 150 ng was reverse transcribed using random primers. Subsequently, the T7-Oligo(dT) Promoter Primer was used in the second-strand cDNA synthesis to generate double-stranded cDNA containing T7 promoter sequences. The resulting double-stranded cDNA was then amplified and biotin labeled using the IVT Labeling kit (Affymetrix, Santa Clara, CA). Biotin-labeled cRNA was fragmented at 1 μg/μl following the manufacturer's protocol. After fragmentation, cRNA (10 μg) was hybridized at 45°C for 16 hours to the Mouse Genome 430 2.0 Array (Affymetrix, Santa Clara, CA). Following hybridization, the arrays were washed, stained with phycoerythrin-streptavidin conjugate (Molecular Probes, Eugene, OR), and the signals were amplified by staining the array with biotin-labeled anti-streptavidin antibody (Vector Laboratories, Burlingame, CA) followed by phycoerythrin-streptavidin. The arrays were laser scanned with a GeneChip Scanner 3000 7G (Affymetrix, Santa Clara, CA) according to the manufacturer's instructions. Data were saved as raw image files and quantified using AGCC v 1.0 (Affymetrix, Santa Clara, CA).

Murine adipokine antibody arrays

This array allows simultaneous detection of 308 secreted mouse proteins in cell culture supernatants (AAM-BLM, L-series, RayBiotech, Inc., Norcross, GA, USA). Differentiated 3T3-L1 adipocytes were exposed to prednisolone or DMSO for 4 hours, rinsed twice, and then cultured for 44 hours in medium containing 0.2% FBS and prednisolone (1 µM) or vehicle. To prepare samples for the antibody array, the supernatants were applied to Centricon tubes (Millipore Amicon Ultra-15 Ultracel 3K) and washed 3 times with PBS, pH 8.0. Finally, proteins were recovered in 2 ml PBS. Samples were biotin-labelled according to the manufacturer's protocol. Then, membranes were blocked and incubated with the biotinylated samples overnight at 4°C, and finally incubated with HRP-conjugated streptavidin. Signals were detected using enhanced chemiluminescence and quantified on a LumiImager system using LumiAnalyst software version 3.1 (Roche Diagnostics, Mannheim, Germany).

Gene expression data

Microarray data were analyzed with packages from the Bioconductor library [25]. Gene expression data were normalized using gcrma. Regulated probes were identified using the limma package from Bioconductor. To identify probes that were differentially expressed after incubation with prednisolone, the following selection criteria were used: fold change > 2, intensity > 20, and p-value < 0.05 after Benjamini-Hochberg correction for multiple testing. All statistical analyses were performed in R. The significance of regulation of the Wnt pathway was calculated using globaltest.
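The selection criteria above can be illustrated with a minimal Python sketch. The analysis itself was done in R with limma, so the code below is not that pipeline; it only shows a Benjamini-Hochberg adjustment followed by the three cut-offs. The probe records, the field names (`log2fc`, `intensity`, `pvalue`) and the reading of "fold change > 2" as an absolute log2 fold change above 1 are assumptions made for this example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_probes(probes, min_abs_log2fc=1.0, min_intensity=20.0, alpha=0.05):
    """Apply the three cut-offs to hypothetical probe records.

    Each probe is a dict with the keys 'id', 'log2fc', 'intensity' and
    'pvalue' (illustrative field names, not taken from the original analysis).
    """
    adjusted = benjamini_hochberg([p["pvalue"] for p in probes])
    selected = []
    for probe, adj_p in zip(probes, adjusted):
        passes_fold_change = abs(probe["log2fc"]) > min_abs_log2fc  # > 2-fold, up or down
        passes_intensity = probe["intensity"] > min_intensity
        if passes_fold_change and passes_intensity and adj_p < alpha:
            selected.append(probe["id"])
    return selected


# Made-up records, for illustration only.
example = [
    {"id": "A", "log2fc": 1.5, "intensity": 100.0, "pvalue": 0.001},
    {"id": "B", "log2fc": 0.5, "intensity": 100.0, "pvalue": 0.001},
    {"id": "C", "log2fc": -1.4, "intensity": 50.0, "pvalue": 0.01},
    {"id": "D", "log2fc": 2.0, "intensity": 10.0, "pvalue": 0.001},
    {"id": "E", "log2fc": 1.2, "intensity": 80.0, "pvalue": 0.2},
]
print(select_probes(example))
```

In this toy input, probe B fails the fold-change cut-off, D fails the intensity cut-off and E fails the adjusted p-value cut-off, so only A and C are retained.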

Secreted protein data obtained from murine adipokine antibody arrays

Proteins regulated by prednisolone treatment were selected using a p-value < 0.05 (calculated using a standard Student's t-test).

Keyword enrichment analysis

Keyword enrichment analysis on the microarray data was performed using CoPub, a text mining algorithm that detects co-occurring biomedical concepts in abstracts from the MEDLINE literature database [26], using the following threshold values: p-value < 0.05, R-scaled score > 35 and abstract count > 3. A heatmap of the pathway terms was generated using a transformation of the p-value (pval_transformed).

Additional gene annotation enrichment analysis was done using the functional annotation module of the Database for Annotation, Visualization and Integrated Discovery (DAVID) tool [27, 28] with default settings against the KEGG pathway database.

Results

Effect of prednisolone on insulin action in 3T3-L1 adipocytes

Exposure of differentiated 3T3-L1 adipocytes to 1 µM prednisolone had no effect on the expression of key signaling molecules, such as the insulin receptor β-subunit, Akt, AS160 and GLUT4. Prednisolone significantly impaired the phosphorylation of Akt-Thr308 and Akt-Ser473 in response to insulin, whereas no effects were observed on IRS1-Tyr1222 (Figure 1). Intriguingly, but in line with a study on the effects of dexamethasone on insulin action in 3T3-L1 adipocytes [29], prednisolone did not affect insulin-induced phosphorylation of AS160-Thr642 (Figure 1). The mechanism(s) underlying this discordance in signaling from Akt to AS160 are not understood. Nevertheless, prednisolone impaired insulin-stimulated glucose uptake as early as 6 hours after the start of incubation, and this inhibition was still present after 48 hours of incubation (Figure 1). These findings are in line with earlier observations using dexamethasone in 3T3-L1 adipocytes [15, 29] and confirm the induction of insulin resistance by prednisolone under our experimental conditions.

Figure 1 Effect of prednisolone on insulin action in 3T3-L1 adipocytes. Effects on protein expression of the insulin receptor (IR) β-subunit, Akt, AS160, and GLUT4, as well as the phosphorylation of insulin receptor substrate 1 (IRS1-Tyr1222), Akt-Thr308, Akt-Ser473, and AS160-Thr642 were determined by Western blotting in cells exposed to 1 μM prednisolone for 48 hours prior to stimulation with insulin (10 min, 100 nM). Data are presented as mean ± standard error of the mean of >4 independent experiments (A-D), and representative Western blots (F). E. Effect of prednisolone on 2-deoxyglucose uptake in 3T3-L1 adipocytes after 6, 24 and 48 hours of incubation with prednisolone. Data are mean ± standard error of the mean of >4 independent experiments. In all bar graphs, open columns represent the basal condition, and filled bars depict insulin-stimulated cells. The effects of prednisolone on insulin action were analyzed using a two-way ANOVA followed by Bonferroni analysis for multiple comparisons. *** indicates P<0.001; **, P<0.01. Effects of insulin versus basal were analyzed using a Student's t-test. ###, P<0.001; ##, P<0.01; #, P<0.05.

Whole genome expression profiling

We next studied prednisolone-induced changes in gene expression following exposure of 3T3-L1 adipocytes to prednisolone or DMSO for 1, 6, 24 and 48 hours. Principal component analysis (PCA) showed a separation between prednisolone- and DMSO-treated samples after 6, 24 and 48 hours (Supplementary Figure 1). At each time point, the probes that were significantly differentially expressed between cells treated with prednisolone and DMSO were identified (Supplementary Table 2). Comparison of probes significantly affected by prednisolone at 6 hours with probes significantly affected at 48 hours showed that probes with a high fold change were regulated at both time points (Figure 2A). Probes that were regulated only at 6 hours or only at 48 hours had a lower fold change. The same trend was observed when comparing prednisolone-regulated probes at 6 hours with those at 24 hours (Figure 2B). At all time points, 25 genes represented by 34 probes were found to be significantly regulated by prednisolone (Table 1); in fact 26 genes were found, but no annotation could be found for probe 1434025_at. Most of these genes were up-regulated and showed a sustained response to prednisolone at all time points, starting after one hour of incubation, with the highest response after 6 hours, and still present after 48 hours of incubation.

Figure 2 Probes regulated by prednisolone at 6 hours versus 48 hours (A) and at 6 hours versus 24 hours (B). Probes represented by red dots are regulated at both time points. Probes represented by blue dots are regulated by prednisolone only at 6 hours. Probes represented by green dots are regulated by prednisolone only at the later time point (48 hours in A, 24 hours in B). The scale on both axes is log2 fold change.
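The comparison of two time points amounts to splitting the regulated probes into three groups and comparing their fold changes. The sketch below shows one way to do this in Python; the probe identifiers, the fold-change values and the function names are hypothetical and are not taken from the study data.

```python
def classify_probes(regulated_first, regulated_second):
    """Split probe IDs into shared and time-point-specific groups."""
    first, second = set(regulated_first), set(regulated_second)
    return {
        "both": first & second,          # regulated at both time points
        "only_first": first - second,    # regulated only at the first time point
        "only_second": second - first,   # regulated only at the second time point
    }


def mean_abs_log2fc(probe_ids, log2fc):
    """Mean absolute log2 fold change over a group of probes (0.0 if empty)."""
    ids = list(probe_ids)
    if not ids:
        return 0.0
    return sum(abs(log2fc[p]) for p in ids) / len(ids)


# Made-up input, for illustration only.
regulated_6h = {"p1", "p2", "p3"}
regulated_48h = {"p2", "p3", "p4"}
log2fc_6h = {"p1": 1.1, "p2": 3.0, "p3": -2.5, "p4": 0.4}

groups = classify_probes(regulated_6h, regulated_48h)
for name in ("both", "only_first", "only_second"):
    print(name, sorted(groups[name]), mean_abs_log2fc(groups[name], log2fc_6h))
```

With real data, a higher mean absolute fold change in the "both" group than in the two time-point-specific groups would correspond to the trend described in the text.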

Table 1 Probes (genes) regulated by prednisolone after 1, 6, 24 and 48 hours of incubation. For each probe the log fold change at each time point is shown. Probes that are up-regulated are indicated with a red color and probes that are down-regulated are indicated with a green color.

Affymetrix ID | Entrez Gene ID | Gene symbol | Gene name | PRED_T1-DMSO_T1 | PRED_T6-DMSO_T6 | PRED_T24-DMSO_T24 | PRED_T48-DMSO_T48
1451021_a_at | 12224 | Klf5 | Kruppel-like factor 5 | 1.92 | 3.61 | 2.62 | 4.32
1423233_at | 12609 | Cebpd | CCAAT/enhancer binding protein (C/EBP), delta | 1.97 | 2.97 | 2.57 | 1.54
1448700_at | 14373 | G0s2 | G0/G1 switch gene 2 | -1.55 | -1.53 | -2.38 | -1.12
1425281_a_at | 14605 | Tsc22d3 | TSC22 domain family, member 3 | 1.76 | 3.27 | 3.33 | 2.60
1420772_a_at | 14605 | Tsc22d3 | TSC22 domain family, member 3 | 1.60 | 2.74 | 2.58 | 2.21
1450297_at | 16193 | Il6 | interleukin 6 | -1.32 | -2.06 | -2.86 | -2.58
1422557_s_at | 17748 | Mt1 | metallothionein 1 | 1.13 | 2.16 | 1.96 | 2.26
1428942_at | 17750 | Mt2 | metallothionein 2 | 2.27 | 4.23 | 3.49 | 3.13
1421471_at | 18166 | Npy1r | neuropeptide Y receptor Y1 | 1.32 | 1.02 | 1.97 | 1.77
1448830_at | 19252 | Dusp1 | dual specificity phosphatase 1 | 1.68 | 2.71 | 1.40 | 1.20
1420380_at | 20296 | Ccl2 | chemokine (C-C motif) ligand 2 | -1.42 | -3.12 | -1.94 | -2.00
1421228_at | 20306 | Ccl7 | chemokine (C-C motif) ligand 7 | -1.50 | -2.00 | -1.14 | -1.48
1416029_at | 21847 | Klf10 | Kruppel-like factor 10 | -1.42 | -1.71 | -1.48 | -1.00
1417273_at | 27273 | Pdk4 | pyruvate dehydrogenase kinase, isoenzyme 4 | 1.55 | 2.82 | 2.75 | 3.56
1417309_at | 57259 | Tob2 | transducer of ERBB2, 2 | 1.48 | 2.98 | 2.09 | 2.21
1448666_s_at | 57259 | Tob2 | transducer of ERBB2, 2 | 1.40 | 2.55 | 1.87 | 1.87
1417310_at | 57259 | Tob2 | transducer of ERBB2, 2 | 1.40 | 2.71 | 2.14 | 2.11
1448667_x_at | 57259 | Tob2 | transducer of ERBB2, 2 | 1.32 | 2.16 | 1.73 | 1.61
1435595_at | 69068 | 1810011O10Rik | RIKEN cDNA 1810011O10 gene | 2.38 | 3.42 | 1.99 | 2.18
1451415_at | 69068 | 1810011O10Rik | RIKEN cDNA 1810011O10 gene | 2.24 | 3.35 | 2.04 | 1.99
1436839_at | 72123 | 2010109K11Rik | RIKEN cDNA 2010109K11 gene | 1.58 | 2.81 | 2.54 | 2.61
1429100_at | 72123 | 2010109K11Rik | RIKEN cDNA 2010109K11 gene | 1.42 | 2.68 | 2.24 | 2.16
1419816_s_at | 74155 | Errfi1 | ERBB receptor feedback inhibitor 1 | 3.21 | 5.12 | 4.29 | 4.08
1416129_at | 74155 | Errfi1 | ERBB receptor feedback inhibitor 1 | 2.68 | 4.60 | 3.46 | 3.83
1428306_at | 74747 | Ddit4 | DNA-damage-inducible transcript 4 | 2.35 | 2.24 | 1.31 | 1.23
1437424_at | 214804 | Syde2 | synapse defective 1, Rho GTPase, homolog 2 (C. elegans) | 1.21 | 1.62 | 1.63 | 1.22
1457035_at | 226691 | AI607873 | expressed sequence AI607873 | 2.12 | 4.44 | 5.89 | 1.99
1460011_at | 232174 | Cyp26b1 | cytochrome P450, family 26, subfamily b, polypeptide 1 | 1.07 | 2.29 | 1.88 | 3.61
1439293_at | 235493 | BC031353 | cDNA sequence BC031353 | 1.53 | 1.93 | 1.82 | 3.03
1434202_a_at | 268709 | Fam107a | family with sequence similarity 107, member A | 1.70 | 6.79 | 6.37 | 9.00
1434228_at | 381511 | Pdp1 | pyruvate dehydrogenase phosphatase catalytic subunit 1 | 1.25 | 1.62 | 1.70 | 1.18
1438201_at | 381511 | Pdp1 | pyruvate dehydrogenase phosphatase catalytic subunit 1 | 1.22 | 1.71 | 1.44 | 1.07
1457102_at | 399638 | A030001D16Rik | RIKEN cDNA A030001D16 gene | 1.34 | 3.85 | 2.02 | 2.57
1434025_at | NA | NA | NA | 1.87 | 3.42 | 2.56 | 4.46

Biological pathways targeted by prednisolone

To identify biological pathways that were affected by prednisolone, we mapped the regulated genes to biological pathway terms using the text mining algorithm CoPub [26] (Figure 3). The most significant pathway terms could be divided into three categories: immune system/inflammation (eicosanoid metabolism, prostaglandin metabolism, cytokine receptors, cytokine network, immune system, platelet activation), general metabolism (fatty acid metabolism, glucose metabolism, gluconeogenesis, lipid metabolism, glycolysis) and signaling (insulin signaling, TGF-β signaling, Akt signaling, Wnt signaling, MAPK signaling). Additional functional analysis of these regulated genes using DAVID, an annotation server based on Gene Ontology and KEGG pathways [27], gave similar results (results not shown). We are interested in mechanisms and pathways that could underlie prednisolone-induced metabolic dysfunction and, more specifically, prednisolone-induced insulin resistance in adipocytes. Therefore, for the remainder of the paper, we focused on pathways that are known to play a role in the disturbance of insulin signaling that could eventually lead to insulin resistance, i.e. Akt/insulin signaling, cytokine signaling, TGF-β signaling and MAPK signaling. Furthermore, we also focused on pathways that play a role in the disturbance of adipocyte differentiation that could lead to dyslipidaemia and obesity, i.e. Wnt signaling and cytokine signaling. In Figure 4 the prednisolone-induced genes have been categorized according to these pathways. Genes that have an effect on insulin signaling, e.g. the growth factor insulin-like growth factor I (Igf1), insulin receptor substrate 1 (Irs1) and platelet derived growth factor receptor, beta polypeptide (Pdgfrb), and genes from the TGF-β pathway, such as MAD homolog 3 (Drosophila) (Smad3), gremlin 1 (Grem1), transforming growth factor, beta 2 (Tgfb2) and transforming growth factor, beta induced (Tgfbi), were down-regulated. Phosphatase and tensin homolog (Pten), an important regulator of the Akt pathway, was up-regulated. The majority of the genes coding for cytokines, such as interleukin 6 (Il6), chemokine (C-X-C motif) ligand 5 (Cxcl5), chemokine (C-X-C motif) ligand 10 (Cxcl10) and chemokine (C-C motif) ligand 7 (Ccl7), were down-regulated by prednisolone. Other cytokine signaling genes, such as interleukin 1 receptor, type I (Il1r1) and interleukin 1 receptor accessory protein (Il1rap), were up-regulated. Many Wnt signaling pathway members were significantly down-regulated by prednisolone in our study. We observed down-regulation of the frizzled receptors Fzd1, Fzd2, Fzd4 and Fzd5, together with the ligand wingless-related MMTV integration site 5A (Wnt5a) and the antagonist secreted frizzled-related protein 2 (Sfrp2). Several downstream molecules, such as Wisp1, Wisp2 and Id2, were also down-regulated (Figure 4).

Figure 3 Heatmap representation of biological pathways identified using CoPub analysis of regulated genes. The color shading indicates the significance of the over-represented pathways (based on pval_transformed), from the lowest value shown in red to the highest value shown in white. Columns represent the time points at which the pathways were found after incubation with prednisolone: 1 hour (T1), 6 hours (T6), 24 hours (T24) and 48 hours (T48).


Figure 4 Prednisolone-induced genes after 1 hour (T1), 6 hours (T6), 24 hours (T24) and 48 hours (T48). Genes up-regulated by prednisolone are shown in red, genes down-regulated by prednisolone are shown in green. For clarity, genes have been assigned to major pathways. However, one should keep in mind that genes can belong to multiple pathways, depending on the definition of a pathway.

Regulation of adipokines measured after 48 hours of incubation with prednisolone

The microarray data showed several transcripts that encode hormones secreted by adipocytes, like cytokines, and regulators of the Wnt, Akt and TGF-β pathways. We therefore decided to study to what extent the altered levels of gene expression would lead to changes in protein expression, by analyzing the effects of prednisolone on adipokine secretion by 3T3-L1 adipocytes treated with DMSO or prednisolone for 48 hours. For this purpose, the supernatants of the 3T3-L1 adipocytes that were used in the gene expression study were analyzed. Out of the 308 proteins on the array, 25 proteins were significantly regulated by prednisolone (Table 2, p-value < 0.05; for all proteins, see Supplementary Table 1). The majority of these proteins are cytokines involved in inflammatory and immune response related processes (indicated with a purple background in Table 2). Cytokines such as CXCL2, CCL3, CCL9 and IL31 were down-regulated by prednisolone, while CCR6, IL12B, TNFSF8, TNFRSF7 and TNFRSF17 were up-regulated. Several components of the Wnt signaling pathway were also significantly regulated at the protein level (Table 2, indicated with a blue background). DKK4, an inhibitor of the Wnt signaling pathway, was up-regulated. Furthermore, the downstream regulators of the Wnt signaling pathway LRP6 and WISP1 were down-regulated. Additionally, the insulin signaling related proteins IGFBP5, IGFBP3 and resistin were down-regulated. These results indicate that the regulation of the Akt/insulin signaling, cytokine signaling and Wnt signaling pathways by prednisolone is not only found at the transcriptome level, but is also translated into altered protein levels in these pathways.

Table 2 Regulated proteins after 48 hours of incubation with prednisolone. Inflammatory related proteins are indicated with a purple background, Wnt signaling related proteins with a blue background and insulin signaling related proteins with a green background. Up-regulated proteins are indicated with a red color. Down-regulated proteins are indicated with a green color.

Name | Protein symbol | Mean log2 ratio | P-value | Function description derived from UniProt
MIP-1 gamma | CCL9 | -0.32 | 0.007 | Monokine with inflammatory, pyrogenic and chemokinetic properties. Circulates at high concentrations in the blood of healthy animals.
WISP-1 / CCN4 | WISP1 | -0.17 | 0.033 | Downstream regulator in the Wnt/Frizzled-signaling pathway.
Gremlin | GREM1 | 0.24 | 0.019 | Cytokine that may play an important role during carcinogenesis and metanephric kidney organogenesis, as a BMP antagonist required for early limb outgrowth and patterning in maintaining the FGF4-SHH feedback loop.
CD30 L | TNFSF8 | 0.37 | 0.010 | Cytokine that binds to TNFRSF8/CD30. Induces proliferation of T-cells.
CD27 / TNFRSF7 | TNFRSF7 | 0.51 | 0.040 | Receptor for CD70/CD27L. May play a role in survival of activated T-cells. May play a role in apoptosis through association with SIVA1.
Dkk-4 | DKK4 | 0.57 | 0.024 | Antagonizes canonical Wnt signaling by inhibiting LRP5/6 interaction with Wnt and by forming a ternary complex with the transmembrane protein KREMEN that promotes internalization of LRP5/6.
IL-12 p70 | IL12B | 0.64 | 0.024 | Cytokine that can act as a growth factor for activated T and NK cells, enhance the lytic activity of NK/lymphokine-activated killer cells, and stimulate the production of IFN-gamma by resting PBMC.
LIF | LIF | 0.64 | 0.016 | LIF has the capacity to induce terminal differentiation in leukemic cells. Its activities include the induction of hematopoietic differentiation in normal and myeloid leukemia cells, the induction of neuronal cell differentiation, and the stimulation of acute-phase protein synthesis in hepatocytes.
TLR1 | TLR1 | 0.78 | 0.024 | Participates in the innate immune response to microbial agents. Cooperates with TLR2 to mediate the innate immune response to bacterial lipoproteins or lipopeptides. Acts via MYD88 and TRAF6, leading to NF-kappa-B activation, cytokine secretion and the inflammatory response.
Decorin | DCN | 0.84 | 0.003 | May affect the rate of fibrils formation.
Artemin | ARTN | 0.85 | 0.015 | Ligand for the GFR-alpha-3-RET receptor complex but can also activate the GFR-alpha-1-RET receptor complex.
CCR6 | CCR6 | 0.87 | 0.035 | Receptor for a C-C type chemokine. Binds to MIP-3-alpha/LARC and subsequently transduces a signal by increasing the intracellular calcium ions level.
BCMA / TNFRSF17 | TNFRSF17 | 0.99 | 0.045 | Receptor for TNFSF13B/BLyS/BAFF and TNFSF13/APRIL. Promotes B-cell survival and plays a role in the regulation of humoral immunity.
TCCR / WSX-1 | IL27RA | -0.36 | 0.000 | Receptor for IL27. Requires IL6ST/gp130 to mediate signal transduction in response to IL27. This signaling system acts through STAT3 and STAT1. Involved in the regulation of Th1-type immune responses. Also appears to be involved in innate defense mechanisms.
MIP-2 | CXCL2 | -0.35 | 0.040 | Chemotactic for human polymorphonuclear leukocytes but does not induce chemokinesis or an oxidative burst.
VEGF | VEGFA | -0.35 | 0.035 | Growth factor active in angiogenesis, vasculogenesis and endothelial cell growth. Induces proliferation, promotes cell migration, inhibits apoptosis and induces permeabilization of blood vessels.
IL-31 | IL31 | -0.42 | 0.049 | Activates STAT3 and possibly STAT1 and STAT5 through the IL31 heterodimeric receptor composed of IL31RA and OSMR. IL31 may function in skin immunity.
Resistin | RETN | -0.43 | 0.022 | Hormone that seems to suppress insulin ability to stimulate glucose uptake into adipose cells. Potentially links obesity to diabetes.
P-Selectin | SELP | -0.69 | 0.010 | Ca2+-dependent receptor for myeloid cells that binds to carbohydrates on neutrophils and monocytes.
MIP-1 alpha | CCL3 | -0.65 | 0.016 | Monokine with inflammatory, pyrogenic and chemokinetic properties. Has a potent chemotactic activity for eosinophils. Binding to a high-affinity receptor activates calcium release in neutrophils.
IGFBP-5 | IGFBP5 | -0.65 | 0.015 | IGF-binding proteins prolong the half-life of IGFs and have been shown to either inhibit or stimulate the growth promoting effects of the IGFs on cell culture.
LRP-6 | LRP6 | -0.60 | 0.008 | Component of the Wnt-Fzd-LRP5-LRP6 complex that triggers beta-catenin signaling through inducing aggregation of receptor-ligand complexes into ribosome-sized signalsomes.
VEGF R1 | FLT1 | -0.57 | 0.027 | The VEGF-kinase ligand/receptor signaling system plays a key role in vascular development and regulation of permeability.
Integrin beta 2 / CD18 | ITGB2 | -0.86 | 0.040 | Integrin alpha-L/beta-2 is a receptor for ICAM1, ICAM2, ICAM3 and ICAM4. Integrins alpha-M/beta-2 and alpha-X/beta-2 are receptors for the iC3b fragment of the third complement component and for fibrinogen.
IGFBP-3 | IGFBP3 | -0.71 | 0.043 | IGF-binding proteins prolong the half-life of IGFs and have been shown to either inhibit or stimulate the growth promoting effects of the IGFs on cell culture.

Regulation of the Wnt signaling pathway by prednisolone

Individual Wnt signaling components were found to be significantly regulated by prednisolone at both the gene and the protein level. Therefore, we next investigated whether the entire Wnt signaling pathway was affected by prednisolone. The KEGG database was used to select genes upstream in the canonical Wnt signaling pathway that are more exclusively linked to Wnt signaling than downstream genes like Smad3, Lef, Tak1 and Pparδ, which are also linked to TGF-β signaling, MAPK signaling and the cell cycle (Supplementary Table 3). We used the expression profiles of these 51 Wnt signaling genes to test whether there is a difference in expression for the entire Wnt signaling pathway between prednisolone- and DMSO-treated samples. In this method, genes that do not meet the fold change > 2 and p-value < 0.05 cut-offs, but that may nevertheless have a changed expression level, are taken into consideration. A significant difference in expression for the entire Wnt signaling pathway was observed after 6 hours (p-value = 0.031). In Figure 5, the genes that had the most influence on this difference are shown. In this figure, the gene expression ratios (open dots) and protein ratios (black dots) as a result of prednisolone treatment are also shown. Genes that had an influence on the difference between DMSO- and prednisolone-treated samples and that showed a higher expression in DMSO-treated samples (red bars) were down-regulated by prednisolone treatment (open dots). The same effect on the Wnt signaling pathway was also seen after 48 hours of treatment (p-value = 0.064) (Supplementary Figure 3). This analysis indicates that in 3T3-L1 adipocytes, the entire Wnt signaling pathway is down-regulated by prednisolone.
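The idea of testing a whole gene set rather than individual genes can be illustrated with a small permutation test. This is explicitly not the globaltest procedure used in the study (a model-based test from the Bioconductor package of the same name); it is a simplified stand-in that asks the same kind of question. The statistic (sum of squared differences between group means), the function names and the expression values are all assumptions made for this sketch.

```python
import random
from statistics import mean


def pathway_statistic(expr, labels):
    """Sum over genes of the squared difference between the two group means.

    expr   : list of per-gene lists, one value per sample
    labels : list of 0 (control) / 1 (treated), one entry per sample
    """
    total = 0.0
    for gene_values in expr:
        treated = [v for v, lab in zip(gene_values, labels) if lab == 1]
        control = [v for v, lab in zip(gene_values, labels) if lab == 0]
        total += (mean(treated) - mean(control)) ** 2
    return total


def permutation_pvalue(expr, labels, n_perm=2000, seed=0):
    """Estimate a p-value by shuffling the sample labels."""
    rng = random.Random(seed)
    observed = pathway_statistic(expr, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pathway_statistic(expr, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up log-scale expression values: 3 genes, 4 control and 4 treated samples.
expr = [
    [8.0, 8.1, 7.9, 8.05, 7.0, 7.1, 6.9, 7.05],
    [5.0, 5.2, 4.9, 5.1, 4.4, 4.5, 4.3, 4.6],
    [6.0, 6.1, 5.9, 6.0, 6.0, 5.9, 6.1, 6.0],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(permutation_pvalue(expr, labels))
```

Because the statistic pools all genes in the set, genes with modest individual changes still contribute, which mirrors the point made in the text that genes below the single-gene cut-offs are taken into consideration.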

Figure 5 Wnt signaling genes that have an influence on the difference between prednisolone and DMSO, after 6 hours of incubation. Genes that are up-regulated in DMSO-treated samples are shown in red and genes that are up-regulated in prednisolone-treated samples are shown in green. Ratios of prednisolone-induced genes are indicated with open dots. Prednisolone-induced ratios of the proteins are indicated with black dots. Ratios are given as log2 fold change.

Discussion and conclusions

The prolonged use of glucocorticoids is hampered by metabolic side effects such as insulin resistance, which eventually leads to diabetes and obesity. Adipose tissue is thought to play an important role in the GC-induced development of metabolic side effects. The mechanisms behind these side effects are not completely understood. Key findings of our study indicate that prednisolone reduces insulin-stimulated glucose uptake in 3T3-L1 cells, which is reflected by changes in the expression of genes and proteins from the Akt/insulin, cytokine and Wnt signaling pathways. To our knowledge, this is the first work that uses a combination of gene expression and antibody arrays to study the effects of prednisolone in adipocytes. The set of 25 genes that showed regulation by prednisolone at all time points includes known target genes of the glucocorticoid receptor, such as Ddit4, Dusp1 and Cebpd [30, 31], and genes such as Fam107a and Pdk4, which were also found to be regulated by prednisolone in in vivo studies in mouse and human [30, 32]. This indicates that prednisolone acts via its target genes and initiates a GC-specific response in these 3T3-L1 cells.

Multiple cytokine molecules were regulated by prednisolone. Increasing evidence suggests that a chronic low-grade state of inflammation in adipose tissue contributes to the development of systemic insulin resistance and diabetes [33-35]. In the absence of IL1, mice are protected against high fat diet induced insulin resistance, which is accompanied by a reduction in local adipose tissue inflammation [36, 37]. Our results showed up-regulation of IL1 signaling by means of Il1r1 and Il1rap. This suggests a role for IL1 signaling in the GC-induced development of insulin resistance in 3T3-L1 cells. Other studies also reported up-regulation of cytokines such as Il6, Cxcl5, Ccl2 and Cxcl10 in insulin resistance and obesity (Figure 6) [35, 38-42]. In contrast to these studies, we observed a down-regulation of the expression of these genes by prednisolone. Based on these results, it is difficult to deduce whether the prednisolone-induced regulation of cytokines in adipocytes contributes to the development of insulin resistance, is the result of the immune suppressing properties of prednisolone, or is a combination of both.

Furthermore, our results indicated that the reduction in insulin-stimulated glucose uptake by prednisolone acts partly via the down-regulation of insulin signaling. This regulation might be mediated by Pten, a known suppressor of insulin signaling that acts via the PI3K/Akt signaling pathway [43]. Additional studies confirm this observation and suggest an important role for Pten in the development of insulin resistance and diabetes [44-46]. Another apparently contrasting effect of prednisolone is the down-regulation of proteins like Igfbp3 and resistin, which reduce glucose uptake in adipose tissue [47, 48]. Inhibition of TGF-β/Smad3 signaling results in diminished adiposity, improved glucose tolerance and insulin sensitivity, and in protection from insulin resistance, diabetes and obesity [49-51]. The observed down-regulation of Smad3 suggests that the disturbance of glucose uptake and adiposity by prednisolone in 3T3-L1 cells acts via additional mechanisms.

Activation of the Wnt signaling pathway has been linked to inhibition of adipogenesis [52-55]. Recent studies toward the characterization of the adipocyte secretome have shown that multiple regulators of the activity of Wnt signaling, such as WISP2 and SFRP5, are secreted by adipocytes [54, 56]. Interestingly, expression and circulating levels of these factors are altered in insulin resistance and obesity [54, 57-59], underlining the suggestion that a dysregulated secretion of Wnt regulators could lead to metabolic disturbances and eventually to the development of obesity and diabetes [54, 60-63]. Yet, in the context of our study, it remains to be investigated to what extent the factors affected by prednisolone interfere with insulin action in adipocytes.

Prednisolone regulated Pten, Smad3 and Lef1, which are able to mediate cross-talk between the signaling pathways [64]. Via the alteration of Pten function, TGF-β is able to influence Akt activity, and via Smad3 the Akt pathway is able to restrict the TGF-β pathway. Smad activity is also involved in the cross-talk between Wnt signaling and TGF-β signaling, as part of the Smad/β-catenin/Lef protein complex in the nucleus. These genes could play a central role in the development of prednisolone-induced metabolic effects because of their ability to connect multiple pathways (Figure 6). The sometimes opposing effects of prednisolone on individual pathway members of Smad3 and Akt signaling, in comparison with earlier work, require additional in vivo studies in which the systemic effect of prednisolone on these pathways can be determined.

To our knowledge, this is the first time that prednisolone has been shown to affect the expression and secretion of Wnt regulators in adipocytes. The fact that Wnt signaling also participates in the development of metabolic disturbances in other tissues [65, 66] may direct future work towards a dedicated study of the Wnt signaling pathway in GC-induced metabolic effects. In an in vivo setting, the Wnt signaling mediated GC effects in multiple glucose-responsive tissues, such as muscle, liver and adipose tissue, could be determined. Hence, new GC compounds with an improved efficacy/side effects ratio should have a reduced effect on components of the Wnt signaling pathway.

Figure 6 Overview of pathway members of insulin/Akt signaling, TGF-β, Wnt signaling and inflammatory pathways that were regulated by prednisolone in 3T3-L1 cells. For these pathway members the literature links to metabolic disorders and other pathways are shown. The numbers at the arrows indicate the references in the literature in which the association between pathway member and pathways/disease phenotype is mentioned. 1 [59], 2 [44], 3 [43], 4 [42], 5 [62], 6 [48], 7 [47], 8 [63], 9 [58], 10 [52], 11 [64], 12 [51], 13 [50], 14 [36], 15 [34], 16 [35], 17 [37], 18 [38], 19 [33], 20 [65], 21 [66], 22 [39], 23 [67], 24 [68], 25 [69], 26 [70], 27 [71], 28 [72], 29 [73], 30 [74], 31 [75], 32 [76], 33 [77], 34 [78], 35 [79].

Declaration of interest

This work was performed within the framework of Dutch Top Institute Pharma (TIP) project “Glucocorticoid-induced insulin resistance” (project nr. T1-106-1). DMO is supported by the Federal Ministry of Health, the Ministry of Innovation, Science, Research and Technology of the German State of North-Rhine Westphalia. WF is supported by the Biorange project (BR4.2) “A Systems Bioinformatics Approach For Evaluating And Translating Drug-Target Effects In Disease Related Pathways” of NBIC. From 1 October 2012, WF is supported by the Netherlands eScience project (027.011.306) “Creation of food specific ontologies for food focused text mining and knowledge management studies”. The authors Wynand Alkema, Wim Dokter and Jacob de Vlieg were employed by Merck, where part of this work was initiated. The authors Erik J.M. Toonen, Wilco W.M. Fleuren, Margot M.L. Linssen, Gerard C.C. van der Zon, Bruno Guigas and D. Margriet Ouwens have no conflict of interest. Wim Dokter and Jacob de Vlieg own stock in Merck.

Supplementary information can be obtained on request from the author of this thesis or can be found at the website of the journal.

References

1. Del Rosso Do, J.Q., Combination topical therapy for the treatment of psoriasis. J Drugs Dermatol, 2006. 5(3): p. 232-4.
2. Hillier, S.G., Diamonds are forever: the cortisone legacy. J Endocrinol, 2007. 195(1): p. 1-6.
3. Schwartz, M. and R. Cohen, Optimizing conventional therapy for inflammatory bowel disease. Curr Gastroenterol Rep, 2008. 10(6): p. 585-90.
4. De Bosscher, K. and G. Haegeman, Minireview: latest perspectives on antiinflammatory actions of glucocorticoids. Mol Endocrinol, 2009. 23(3): p. 281-91.
5. Rhen, T. and J.A. Cidlowski, Antiinflammatory action of glucocorticoids--new mechanisms for old drugs. N Engl J Med, 2005. 353(16): p. 1711-23.
6. Schacke, H., et al., Insight into the molecular mechanisms of glucocorticoid receptor action promotes identification of novel ligands with an improved therapeutic index. Exp Dermatol, 2006. 15(8): p. 565-73.
7. Rockall, A.G., et al., Computed tomography assessment of fat distribution in male and female patients with Cushing's syndrome. Eur J Endocrinol, 2003. 149(6): p. 561-7.
8. Schacke, H., W.D. Docke, and K. Asadullah, Mechanisms involved in the side effects of glucocorticoids. Pharmacol Ther, 2002. 96(1): p. 23-43.
9. Chrousos, G.P. and T. Kino, Glucocorticoid signaling in the cell. Expanding clinical implications to complex human behavioral and somatic disorders. Ann N Y Acad Sci, 2009. 1179: p. 153-66.
10. Kino, T., Tissue glucocorticoid sensitivity: beyond stochastic regulation on the diverse actions of glucocorticoids. Horm Metab Res, 2007. 39(6): p. 420-4.
11. Lu, N.Z. and J.A. Cidlowski, Glucocorticoid receptor isoforms generate transcription specificity. Trends Cell Biol, 2006. 16(6): p. 301-7.
12. van Raalte, D.H., D.M. Ouwens, and M. Diamant, Novel insights into glucocorticoid-mediated diabetogenic effects: towards expansion of therapeutic options? Eur J Clin Invest, 2009. 39(2): p. 81-93.
13. Taskinen, M.R., et al., Plasma lipoproteins, lipolytic enzymes, and very low density lipoprotein triglyceride turnover in Cushing's syndrome. J Clin Endocrinol Metab, 1983. 57(3): p. 619-26.
14. Wajchenberg, B.L., Subcutaneous and visceral adipose tissue: their relation to the metabolic syndrome. Endocr Rev, 2000. 21(6): p. 697-738.
15. Bazuine, M., et al., Mitogen-activated protein kinase (MAPK) phosphatase-1 and -4 attenuate p38 MAPK during dexamethasone-induced insulin resistance in 3T3-L1 adipocytes. Mol Endocrinol, 2004. 18(7): p. 1697-707.
16. Buren, J., et al., Insulin action and signalling in fat and muscle from dexamethasone-treated rats. Arch Biochem Biophys, 2008. 474(1): p. 91-101.
17. Buren, J., et al., Dexamethasone impairs insulin signalling and glucose transport by depletion of insulin receptor substrate-1, phosphatidylinositol 3-kinase and protein kinase B in primary cultured rat adipocytes. Eur J Endocrinol, 2002. 146(3): p. 419-29.
18. Sakoda, H., et al., Dexamethasone-induced insulin resistance in 3T3-L1 adipocytes is due to inhibition of glucose transport rather than insulin signal transduction. Diabetes, 2000. 49(10): p. 1700-8.
19. Kershaw, E.E. and J.S. Flier, Adipose tissue as an endocrine organ. J Clin Endocrinol Metab, 2004. 89(6): p. 2548-56.
20. Ouchi, N., et al., Adipokines in inflammation and metabolic disease. Nat Rev Immunol, 2011. 11(2): p. 85-97.
21. Lehr, S., et al., Identification and validation of novel adipokines released from primary human adipocytes. Mol Cell Proteomics, 2011.
22. Fasshauer, M. and R. Paschke, Regulation of adipocytokines and insulin resistance. Diabetologia, 2003. 46(12): p. 1594-603.
23. van den Berghe, N., et al., Activation of the Ras/mitogen-activated protein kinase signaling pathway alone is not sufficient to induce glucose uptake in 3T3-L1 adipocytes. Mol Cell Biol, 1994. 14(4): p. 2372-7.
24. Linssen, M.M., et al., Prednisolone-induced beta cell dysfunction is associated with impaired endoplasmic reticulum homeostasis in INS-1E cells. Cell Signal, 2011. 23(11): p. 1708-15.
25. The R Project for Statistical Computing package. Available from: http://www.r-project.org.
26. Fleuren, W.W., et al., CoPub update: CoPub 5.0 a text mining system to answer biological questions. Nucleic Acids Res, 2011. 39(Web Server issue): p. W450-4.
27. Huang da, W., B.T. Sherman, and R.A. Lempicki, Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc, 2009. 4(1): p. 44-57.
28. Huang da, W., B.T. Sherman, and R.A. Lempicki, Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res, 2009. 37(1): p. 1-13.
29. Hoehn, K.L., et al., IRS1-independent defects define major nodes of insulin resistance. Cell Metab, 2008. 7(5): p. 421-33.
30. Toonen, E.J., et al., Prednisolone-induced changes in gene-expression profiles in healthy volunteers. Pharmacogenomics, 2011. 12(7): p. 985-98.
31. Yang, H., et al., Expression and activity of C/EBPbeta and delta are upregulated by dexamethasone in skeletal muscle. J Cell Physiol, 2005. 204(1): p. 219-26.
32. Frijters, R., et al., Prednisolone-induced differential gene expression in mouse liver carrying wild type or a dimerization-defective glucocorticoid receptor. BMC Genomics, 2010. 11: p. 359.
33. Harford, K.A., et al., Fats, inflammation and insulin resistance: insights to the role of macrophage and T-cell accumulation in adipose tissue. Proc Nutr Soc, 2011: p. 1-10.
34. Heilbronn, L.K. and L.V. Campbell, Adipose tissue macrophages, low grade inflammation and insulin resistance in human obesity. Curr Pharm Des, 2008. 14(12): p. 1225-30.
35. Bastard, J.P., et al., Recent advances in the relationship between obesity, inflammation, and insulin resistance. Eur Cytokine Netw, 2006. 17(1): p. 4-12.
36. de Roos, B., et al., Attenuation of inflammation and cellular stress-related pathways maintains insulin sensitivity in obese type I interleukin-1 receptor knockout mice on a high-fat diet. Proteomics, 2009. 9(12): p. 3244-56.
37. McGillicuddy, F.C., et al., Lack of interleukin-1 receptor I (IL-1RI) protects mice from high-fat diet-induced adipose tissue inflammation coincident with improved glucose homeostasis. Diabetes, 2011. 60(6): p. 1688-98.
38. Chavey, C., et al., CXC ligand 5 is an adipose-tissue derived factor that links obesity to insulin resistance. Cell Metab, 2009. 9(4): p. 339-49.
39. Lagathu, C., et al., Chronic interleukin-6 (IL-6) treatment increased IL-6 secretion and induced insulin resistance in adipocyte: prevention by rosiglitazone. Biochem Biophys Res Commun, 2003. 311(2): p. 372-9.
40. Rotter, V., I. Nagaev, and U. Smith, Interleukin-6 (IL-6) induces insulin resistance in 3T3-L1 adipocytes and is, like IL-8 and tumor necrosis factor-alpha, overexpressed in human fat cells from insulin-resistant subjects. J Biol Chem, 2003. 278(46): p. 45777-84.
41. Tateya, S., et al., An increase in the circulating concentration of monocyte chemoattractant protein-1 elicits systemic insulin resistance irrespective of adipose tissue inflammation in mice. Endocrinology, 2010. 151(3): p. 971-9.
42. Murdolo, G., et al., In situ profiling of adipokines in subcutaneous microdialysates from lean and obese individuals. Am J Physiol Endocrinol Metab, 2008. 295(5): p. E1095-105.
43. Tang, X., et al., PTEN, but not SHIP2, suppresses insulin signaling through the phosphatidylinositol 3-kinase/Akt pathway in 3T3-L1 adipocytes. J Biol Chem, 2005. 280(23): p. 22523-9.
44. Ikubo, M., et al., Impact of lipid phosphatases SHIP2 and PTEN on the time- and Akt-isoform-specific amelioration of TNF-alpha-induced insulin resistance in 3T3-L1 adipocytes. Am J Physiol Endocrinol Metab, 2009. 296(1): p. E157-64.
45. Lazar, D.F. and A.R. Saltiel, Lipid phosphatases as drug discovery targets for type 2 diabetes. Nat Rev Drug Discov, 2006. 5(4): p. 333-42.
46. Nakashima, N., et al., The tumor suppressor PTEN negatively regulates insulin signaling in 3T3-L1 adipocytes. J Biol Chem, 2000. 275(17): p. 12889-95.
47. Chan, S.S., et al., Insulin-like growth factor binding protein-3 leads to insulin resistance in adipocytes. J Clin Endocrinol Metab, 2005. 90(12): p. 6588-95.
48. Kim, H.S., et al., Insulin-like growth factor binding protein-3 induces insulin resistance in adipocytes in vitro and in rats in vivo. Pediatr Res, 2007. 61(2): p. 159-64.
49. Tsurutani, Y., et al., The roles of transforming growth factor-beta and Smad3 signaling in adipocyte differentiation and obesity. Biochem Biophys Res Commun, 2011. 407(1): p. 68-73.
50. Tan, C.K., et al., Smad3 deficiency in mice protects against insulin resistance and obesity induced by a high-fat diet. Diabetes, 2011. 60(2): p. 464-76.
51. Yadav, H., et al., Protection from obesity and diabetes by blockade of TGF-beta/Smad3 signaling. Cell Metab, 2011. 14(1): p. 67-79.
52. Christodoulides, C., et al., The Wnt antagonist Dickkopf-1 and its receptors are coordinately regulated during early human adipogenesis. J Cell Sci, 2006. 119(Pt 12): p. 2613-20.
53. Longo, K.A., et al., Wnt signaling protects 3T3-L1 preadipocytes from apoptosis through induction of insulin-like growth factors. J Biol Chem, 2002. 277(41): p. 38239-44.
54. Ouchi, N., et al., Sfrp5 is an anti-inflammatory adipokine that modulates metabolic dysfunction in obesity. Science, 2010. 329(5990): p. 454-7.
55. Bennett, C.N., et al., Regulation of Wnt signaling during adipogenesis. J Biol Chem, 2002. 277(34): p. 30998-1004.
56. Lehr, S., et al., Identification and validation of novel adipokines released from primary human adipocytes. Mol Cell Proteomics, 2012. 11(1): p. M111 010504.
57. Dahlman, I., et al., Functional annotation of the human fat cell secretome. Arch Physiol Biochem, 2012. 118(3): p. 84-91.
58. Mori, H., et al., Secreted frizzled-related protein 5 suppresses adipocyte mitochondrial metabolism through WNT inhibition. J Clin Invest, 2012. 122(7): p. 2405-16.
59. Hu, W., et al., Circulating Sfrp5 Is a Signature of Obesity-Related Metabolic Disorders and Is Regulated by Glucose and Liraglutide in Humans. J Clin Endocrinol Metab, 2012.
60. Jin, T., The WNT signalling pathway and diabetes mellitus. Diabetologia, 2008. 51(10): p. 1771-80.
61. Liu, Z. and J.F. Habener, Wnt signaling in pancreatic islets. Adv Exp Med Biol, 2010. 654: p. 391-419.
62. Oh, D.Y. and J.M. Olefsky, Medicine. Wnt fans the flames in obesity. Science, 2010. 329(5990): p. 397-8.
63. Schinner, S., Wnt-signalling and the metabolic syndrome. Horm Metab Res, 2009. 41(2): p. 159-63.
64. Guo, X. and X.F. Wang, Signaling cross-talk between TGF-beta/BMP and other pathways. Cell Res, 2009. 19(1): p. 71-88.
65. Abiola, M., et al., Activation of Wnt/beta-catenin signaling increases insulin sensitivity through a reciprocal regulation of Wnt10b and SREBP-1c in skeletal muscle cells. PLoS One, 2009. 4(12): p. e8509.
66. Anagnostou, S.H. and P.R. Shepherd, Glucose induces an autocrine activation of the Wnt/beta-catenin pathway in macrophage cell lines. Biochem J, 2008. 416(2): p. 211-8.
67. Lin, H.M., et al., Transforming growth factor-beta/Smad3 signaling regulates insulin gene transcription and pancreatic islet beta-cell function. J Biol Chem, 2009. 284(18): p. 12246-57.
68. Zhang, J., et al., Bone morphogenetic protein signaling inhibits hair follicle anagen induction by restricting epithelial stem/progenitor cell activation and expansion. Stem Cells, 2006. 24(12): p. 2826-39.
69. Ross, S.E., et al., Inhibition of adipogenesis by Wnt signaling. Science, 2000. 289(5481): p. 950-3.
70. Yamamoto, T., et al., Cross-talk between IL-6 and TGF-beta signaling in hepatoma cells. FEBS Lett, 2001. 492(3): p. 247-53.
71. Walia, B., et al., TGF-beta down-regulates IL-6 signaling in intestinal epithelial cells: critical role of SMAD-2. FASEB J, 2003. 17(14): p. 2130-2.
72. Kamei, N., et al., Overexpression of monocyte chemoattractant protein-1 in adipose tissues causes macrophage recruitment and insulin resistance. J Biol Chem, 2006. 281(36): p. 26602-14.
73. Sell, H. and J. Eckel, Monocyte chemotactic protein-1 and its role in insulin resistance. Curr Opin Lipidol, 2007. 18(3): p. 258-62.
74. Clemmons, D.R., Role of insulin-like growth factor I in maintaining normal glucose homeostasis. Horm Res, 2004. 62 Suppl 1: p. 77-82.
75. Gronning, L.M., et al., Glucose induces increases in levels of the transcriptional Id2 via the hexosamine pathway. Am J Physiol Endocrinol Metab, 2006. 290(4): p. E599-606.
76. Chen, H., et al., PDGF signalling controls age-dependent proliferation in pancreatic beta-cells. Nature, 2011. 478(7369): p. 349-55.
77. Yuasa, T., et al., Platelet-derived growth factor stimulates glucose transport in skeletal muscles of transgenic mice specifically expressing platelet-derived growth factor receptor in the muscle, but it does not affect blood glucose levels. Diabetes, 2004. 53(11): p. 2776-86.
78. Whiteman, E.L., J.J. Chen, and M.J. Birnbaum, Platelet-derived growth factor (PDGF) stimulates glucose transport in 3T3-L1 adipocytes overexpressing PDGF receptor by a pathway independent of insulin receptor substrates. Endocrinology, 2003. 144(9): p. 3811-20.
79. Alessi, M.C., et al., Plasminogen activator inhibitor 1, transforming growth factor-beta1, and BMI are closely associated in human adipose tissue during morbid obesity. Diabetes, 2000. 49(8): p. 1374-80.
80. Garcia-Diaz, D.F., et al., Glucose and insulin modify thrombospondin 1 expression and secretion in primary adipocytes from diet-induced obese rats. J Physiol Biochem, 2011. 67(3): p. 453-61.
81. Yu, X., et al., Egr-1 decreases adipocyte insulin sensitivity by tilting PI3K/Akt and MAPK signal balance in mice. EMBO J, 2011. 30(18): p. 3754-65.
82. Fujimoto, S., et al., Insulin resistance induced by a high-fat diet is associated with the induction of genes related to leukocyte activation in rat peripheral leukocytes. Life Sci, 2010. 87(23-26): p. 679-85.
83. Mehta, N.N., et al., Experimental endotoxemia induces adipose inflammation and insulin resistance in humans. Diabetes, 2010. 59(1): p. 172-81.
84. Bala, M., et al., Type 2 diabetes and lipoprotein metabolism affect LPS-induced cytokine and chemokine release in primary human monocytes. Exp Clin Endocrinol Diabetes, 2011. 119(6): p. 370-6.

CHAPTER 6

Prednisolone-induced differential gene expression in mouse liver carrying wild type or a dimerization-defective glucocorticoid receptor

Raoul Frijters1, Wilco W.M. Fleuren1, Erik J.M. Toonen3, Jan P Tuckermann5, Holger M Reichardt6, Hans van der Maaden4, Andrea van Elsas3, Marie-Jose van Lierop3, Wim Dokter3, Jacob de Vlieg1,2 and Wynand Alkema2

1Computational Drug Discovery (CDD), Nijmegen Centre for Molecular Life Sciences (NCMLS), Radboud University Nijmegen Medical Centre, Geert Grooteplein Zuid 26-28, 6525 GA Nijmegen, The Netherlands. 2Department of Molecular Design & Informatics, Schering-Plough, Molenstraat 110, 5342 CC Oss, The Netherlands. 3Department of Immunotherapeutics, Schering-Plough, Molenstraat 110, 5342 CC Oss, The Netherlands. 4Molecular Pharmacology Department, Schering-Plough, Molenstraat 110, 5342 CC Oss, The Netherlands. 5Tuckermann Lab, Leibniz Institute for Age Research - Fritz Lipmann Institute, Beutenbergstraße 11, 07745 Jena, Germany. 6Department of Cellular and Molecular Immunology, University of Göttingen Medical School, Humboldtallee 34, 37073 Göttingen, Germany.

BMC Genomics 2010, Jun 5;11:359

Abstract

Background

Glucocorticoids (GCs) control expression of a large number of genes via binding to the GC receptor (GR). Transcription may be regulated either by binding of the GR dimer to DNA regulatory elements or by protein-protein interactions of GR monomers with other transcription factors. Although the type of regulation for a number of individual target genes is known, the relative contribution of both mechanisms to the regulation of the entire transcriptional program remains elusive. To study the importance of GR dimerization in the regulation of gene expression, we performed gene expression profiling of livers of prednisolone-treated wild type (WT) mice and mice that have lost the ability to form GR dimers (GRdim).

Results

The GR target genes identified in WT mice were predominantly related to glucose metabolism, the cell cycle, apoptosis and inflammation. In GRdim mice, the level of prednisolone-induced gene expression was significantly reduced compared to WT, but not completely absent. Interestingly, for a set of genes involved in cell cycle and apoptosis processes and strongly related to Foxo3a and p53, induction by prednisolone was completely abolished in GRdim mice. In contrast, glucose metabolism-related genes were still modestly upregulated in GRdim mice upon prednisolone treatment. Finally, we identified several novel GC-inducible genes, of which Fam107a, a putative histone acetyltransferase complex interacting protein, was most strongly dependent on GR dimerization.

Conclusions

This study on prednisolone-induced effects in livers of WT and GRdim mice identified a number of interesting candidate genes and pathways regulated by GR dimers and sheds new light onto the complex transcriptional regulation of liver function by GCs.

Background

Naturally occurring glucocorticoids (GCs), such as cortisol, play an important role in the regulation of cardiovascular, metabolic and immunological processes. GCs are potent suppressors of inflammatory indices and are widely used to treat chronic inflammatory diseases such as rheumatoid arthritis and asthma [1-3]. However, chronic use of GCs induces side effects, such as diabetes mellitus, fat redistribution, osteoporosis, muscle atrophy, glaucoma and skin thinning [4]. One of the aspects influencing the balance between the desired anti-inflammatory effects and the side effects of GCs is the activation and repression of gene expression. After binding of GCs to the cytosolic GC receptor (GR), the receptor-ligand complex translocates to the nucleus where it alters gene expression. Ligand-bound GR can influence the expression of target genes either by binding as a dimer to palindromic GC response elements (GRE) or by tethering to other DNA-bound transcription factors [5, 6]. It is generally assumed that the anti-inflammatory actions of GCs are mainly driven by transrepression, in which ligand-bound GR binds to the pro-inflammatory transcription factors NF-κB (Nuclear factor kappa-light-chain-enhancer of activated B cells), AP-1 (Activator protein 1), IRF-3 (Interferon regulatory factor 3) or other factors, forming an inactive transcription machinery complex, which prevents expression of pro-inflammatory genes [7-16]. However, recent studies demonstrated that under some inflammatory conditions DNA dimer-dependent gene expression could also contribute to the anti-inflammatory activities of the GR [17]. The occurrence of side effects has been linked mainly to transactivation of gene transcription. The well-documented GC-induced upregulation of Pck1 (Phosphoenolpyruvate carboxykinase 1; also known as Pepck) and G6pc (Glucose-6-phosphatase), two genes encoding key enzymes that control gluconeogenesis, is one example linking transactivation to metabolic side effects [18-20].

One of the mechanisms by which GR activates gene transcription is by binding as a homodimer to a GRE in the promoter region of a target gene. GR dimerization involves the D loop located in the DNA-binding domain of the GR, in which several amino acids interact to facilitate receptor dimer formation. It was shown that an A458T point mutation, introduced in the D loop of the GR (GRdim), impairs homodimerization and ablates DNA binding [21]. Mice carrying this GRdim mutation are almost as effective as wild type (WT) mice in repressing AP-1 and NF-κB-modulated gene transcription, whereas GR-mediated Tat (Tyrosine aminotransferase) mRNA induction is largely abolished [21, 22]. The mechanism of dimer-dependent transactivation by the GR has been studied in detail for a limited number of GR target genes, including Fkbp5 (FK506 binding protein 5), Tat, Pck1 and Dusp1 (Dual specificity phosphatase 1) [23-26], as has tethering-facilitated transrepression of Mmp1/Mmp13 (Matrix metallopeptidase 1/13), IL-8 (Interleukin 8) and others [27-29].

In order to perform a comprehensive study of the genes and cellular processes that are affected by GC treatment and influenced by GR dimerization, we have performed genome-wide gene expression profiling in liver of prednisolone-treated WT and GRdim mice. We found that prednisolone predominantly influenced expression of genes involved in glucose metabolism, inflammation, the cell cycle and apoptosis. In general, activation of transcription, including transactivation of known GR marker genes such as Fkbp5 and Tat, was significantly reduced, but not totally absent, in the GRdim mutant. However, a total absence of transcriptional transactivation upon GC treatment was observed in GRdim mice for a subset of genes, all related to regulation of the cell cycle. In contrast, a small subset of genes was found to be regulated by GCs exclusively in GRdim mice. Furthermore, we identified Fam107a (Family with sequence similarity 107, member a; also known as Tu3a and Drr1) as a novel GC-inducible gene that completely relies on GR dimerization for transactivation.

Results

Clustering of individual samples

In order to chart the gene expression profile induced by prednisolone in WT and GRdim mice, 6 mice per group were treated with prednisolone or vehicle. A principal component analysis (PCA) on the normalized intensity data showed that the samples clustered into four distinct groups (Figure 1). The largest separation, represented by PC1, was observed between samples derived from male versus female mice. The genes with the highest gender specificity belong to the family of cytochrome P450 proteins, such as Cyp3a41, Cyp3a44 and Cyp4a12, which are known for their gender-specific expression [30, 31]. In addition to the gender-based separation, a clear separation was observed between prednisolone-treated and vehicle-treated WT mice, whereas prednisolone-treated GRdim mice clustered closely together with vehicle-treated GRdim and WT mice (PC2). These results indicate that there is a significant prednisolone effect in WT mice, which is strongly reduced in prednisolone-treated GRdim mice. Genes with high loading scores on PC2 include Tat and Fkbp5, which are known GR-regulated genes in liver [32, 33] (Figure 2). These results show that the transcriptional program that is induced by prednisolone in WT mice is strongly reduced in mice carrying the GRdim mutation. This supports the hypothesis that the loss of GR dimerization leads to a reduction in transcriptional transactivation by the GR and underscores the importance of GR dimerization in induction of gene expression. The effect of prednisolone on up- and downregulated genes and pathways is discussed in more detail below.

Figure 1 Principal component analysis on gene expression intensity data. A principal component analysis (PCA) was performed on normalized gene expression intensity data from 12 wild type (WT) mice (vehicle and prednisolone treatment) and 12 GRdim mice (vehicle and prednisolone treatment). Bullets: WT mice; Triangles: GRdim mice. White symbols: vehicle-treated mice; Black symbols: prednisolone-treated mice. The largest separation was seen between samples derived from male versus female mice (PC1).

Figure 2 Intensity profiles of Tat and Fkbp5 in vehicle and prednisolone-treated wild type versus GRdim mice. The expression intensities of Tat (Tyrosine aminotransferase) and Fkbp5 (FK506 binding protein 5) in vehicle-treated (white bars) and prednisolone-treated (grey bars) wild type (WT) and GRdim mice are plotted. Symbols: Bullets: male mice; Triangles: female mice. Both Tat and Fkbp5 are known GR marker genes and show strong differential regulation in prednisolone-treated WT mice.
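The principal component analysis described in this section can be illustrated with a minimal sketch. It assumes a samples-by-probe-sets matrix of normalized intensities and uses a plain singular value decomposition; the function name `pca_scores`, the synthetic data and the effect sizes are hypothetical and are not taken from the study, whose actual software and settings are not specified here.

```python
import numpy as np


def pca_scores(intensities, n_components=2):
    """Project samples (rows) onto their first principal components.

    intensities: array of shape (n_samples, n_probe_sets) holding
    normalized expression values (assumed layout, for illustration).
    Returns (scores, loadings, explained_variance_ratio).
    """
    x = np.asarray(intensities, dtype=float)
    centered = x - x.mean(axis=0)            # center every probe set
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].T            # one row per probe set
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Synthetic data for illustration only (not the study's measurements).
rng = np.random.default_rng(0)
n_samples, n_probes = 24, 200
sex = np.repeat([0.0, 1.0], 12)               # hypothetical sex labels
treated_wt = np.zeros(n_samples)
treated_wt[[0, 1, 2, 12, 13, 14]] = 1.0       # hypothetical treated WT samples
data = rng.normal(size=(n_samples, n_probes))
data[:, :40] += 4.0 * sex[:, None]            # strong sex-specific probe sets
data[:, 40:60] += 3.0 * treated_wt[:, None]   # weaker treatment-specific probe sets

scores, loadings, explained = pca_scores(data)
```

In this toy setting the sex effect is deliberately made the largest source of variance, so it is captured by the first component, analogous to the PC1 separation reported above. Probe sets with large absolute values in a column of `loadings` are the ones that drive the corresponding component.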

Prednisolone-induced changes in gene expression

To quantify the differences that were seen in the multivariate analysis, differentially expressed probe sets were identified for several treatment comparisons (e.g. prednisolone versus vehicle-treated WT mice and prednisolone versus vehicle-treated GRdim mice). Probe sets were marked as differentially regulated when the p-value, corrected for multiple testing, was below the cutoff value of 0.01. Prednisolone treatment resulted in a significant differential regulation of 518 probe sets in WT mice (347 upregulated and 171 downregulated), whereas in prednisolone-treated GRdim mice only 34 probe sets were differentially regulated (29 upregulated and 5 downregulated). Notably, no differentially regulated probe sets were found in the comparison between vehicle-treated GRdim and vehicle-treated WT mice, suggesting that the GRdim mutation itself does not cause differential gene regulation and that the differences between WT and GRdim mice only become apparent after treatment with prednisolone. The observation that only 34 probe sets are differentially regulated in GRdim mice upon prednisolone treatment suggests that there is almost no regulation of gene expression by prednisolone in GRdim mice. However, a large number of the probe sets that are differentially regulated in WT mice showed the same direction of regulation by prednisolone in GRdim mice, but did not meet the cutoff value of 0.01 for the p-value. This indicates that induction of gene expression by prednisolone in GRdim mice is strongly reduced but not completely absent (see below). Taken together, these data are in line with the hypothesis that transactivation of gene expression through the GR is significantly reduced in mice carrying the GRdim mutation.
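The selection rule used above (a p-value corrected for multiple testing below 0.01) can be sketched as follows. The text does not state which correction method was applied, so the Benjamini-Hochberg procedure used here is an assumption, and the function names and example values are hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the source only says "corrected for multiple testing";
    this particular procedure is used purely for illustration.
    """
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted


def count_regulated(adjusted_p, signed_fold_change, cutoff=0.01):
    """Count up- and downregulated probe sets below the cutoff."""
    adjusted_p = np.asarray(adjusted_p, dtype=float)
    fc = np.asarray(signed_fold_change, dtype=float)
    significant = adjusted_p < cutoff
    up = int(np.sum(significant & (fc > 0)))
    down = int(np.sum(significant & (fc < 0)))
    return up, down


# Hypothetical raw p-values and signed fold changes for five probe sets.
raw_p = [0.03, 0.0001, 0.5, 0.0004, 0.02]
fold_changes = [-1.2, 2.1, 1.0, -1.8, 1.5]
adjusted = benjamini_hochberg(raw_p)
n_up, n_down = count_regulated(adjusted, fold_changes)
```

A probe set that changes in the same direction in GRdim mice but has an adjusted p-value above the cutoff is not counted by `count_regulated`, which is the situation the paragraph above describes for most WT-regulated probe sets.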

Biological processes targeted by prednisolone in WT mice

In order to identify cellular processes on which prednisolone has a prominent effect, a gene set enrichment analysis was conducted using the 518 probe sets that were differentially regulated in WT mice by prednisolone. For this analysis we used CoPub, a text mining tool that calculates which biological processes are significantly associated with a set of genes [34]. It appeared that the differentially regulated genes are predominantly involved in three major processes: (glucose) metabolism, cell cycle/apoptosis and immune/inflammatory response (Table 1). In order to gain insight into the relationships between genes and cellular processes, a network representation was created in which differentially regulated genes are plotted together with enriched keywords. In this representation, cellular processes that are affected by prednisolone can be appreciated as separate areas in the network, such as cell cycle and apoptosis, acute-phase response, and metabolism-related processes such as amino acid metabolism, gluconeogenesis and lipid metabolism. The most influential genes appear as highly connected hubs (Figure 3). Important factors that seem to connect the gluconeogenesis and the cell cycle and apoptosis subnetworks are the transcription factors Foxo1 and Foxo3a (Forkhead box O1 and Forkhead box O3a; see below). Strong effects of prednisolone treatment on gluconeogenesis, the cell cycle and apoptosis are in line with the anticipated effect of the GR in the liver. The fact that many inflammation and acute-phase response-related genes, such as IL6r (Interleukin 6 receptor) and Cxcl12 (Chemokine (C-X-C motif) ligand 12), were regulated (Figure 3) is interesting, given that we analyzed livers from healthy mice that were not challenged with inflammatory stimuli. The direction and magnitude of prednisolone-induced differential regulation of genes involved in gluconeogenesis and the cell cycle were studied in more detail to gain more insight into these prednisolone-affected processes.
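The idea behind a keyword enrichment analysis can be illustrated with a generic over-representation test. This is a minimal sketch based on the hypergeometric distribution and is not CoPub's own scoring, which is described in [34]; the function name and all counts in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n_drawn, n_annotated, n_total):
    """P(X >= k) for a hypergeometric draw.

    n_total:     genes in the background set
    n_annotated: background genes associated with the keyword
    n_drawn:     genes in the regulated set
    k:           regulated genes associated with the keyword
    """
    max_k = min(n_drawn, n_annotated)
    denom = comb(n_total, n_drawn)
    numer = sum(
        comb(n_annotated, i) * comb(n_total - n_annotated, n_drawn - i)
        for i in range(k, max_k + 1)
    )
    return numer / denom


# Hypothetical counts: 21 of 400 regulated genes carry a keyword that is
# attached to 300 of 20000 background genes.
p_enrichment = hypergeom_upper_tail(21, 400, 300, 20000)
```

A small value of `p_enrichment` indicates that the keyword occurs more often in the regulated set than expected by chance. When many keywords are tested, these p-values would themselves need a multiple-testing correction.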

Table 1 CoPub keyword enrichment analysis on prednisolone-regulated genes. CoPub was used to calculate keyword enrichment in the set of prednisolone-regulated probe sets in wild type mice. The analysis revealed that the differentially regulated genes (non-redundant set) are predominantly involved in three main processes: metabolism, cell cycle and apoptosis, and immune/inflammatory response. The right column shows the number of differentially regulated genes associated with a specific enriched keyword.

Biological process                               # of associated genes

Metabolism
  Glucose metabolism / transport                 22
  Gluconeogenesis                                21
  Lipid metabolism / glycosylation               21
  Glycolysis                                     14
  Carbohydrate metabolism / transport            14
  Amino acid metabolism / transport              13
Cell cycle and apoptosis
  Apoptosis                                      77
  Cell proliferation                             72
  Cell cycle                                     65
  Cell growth and/or maintenance, cell growth    63
  Homeostasis                                    58
  Cell differentiation                           49
  Cell cycle arrest                              37
Immune / inflammatory response
  Inflammatory response                          31
  Acute-phase response                           20

Gluconeogenesis

Table 2 shows the prednisolone-regulated genes in WT mice identified by CoPub as being associated with gluconeogenesis. Among the upregulated genes are Pck1, encoding the rate-limiting enzyme in gluconeogenesis [35], Ppargc1a (Peroxisome proliferative activated receptor, gamma, coactivator 1 alpha), a transcriptional coactivator that coordinates the expression of genes involved in gluconeogenesis and ketogenesis [36], and Sds (Serine dehydratase) and Aass (Aminoadipate-semialdehyde synthase), two genes encoding enzymes involved in amino acid catabolism and amino acid utilization for gluconeogenesis [37, 38]. The set of downregulated genes includes Irs1 (Insulin receptor substrate 1), a downstream mediator of the growth factor/insulin signaling pathway that negatively regulates gluconeogenesis [39], and Pklr (Pyruvate kinase), which encodes a glycolysis-associated enzyme known to catalyze the production of pyruvate from phosphoenolpyruvate [40]. The direction of regulation and the function of these genes suggest that in the liver of prednisolone-treated WT mice, glucose metabolism is balanced towards gluconeogenesis, which is in line with the reported gluconeogenic effect of glucocorticoids [41, 42].

Table 2 Prednisolone-regulated genes associated with ‘gluconeogenesis’ in literature. CoPub identified 21 prednisolone-regulated genes in wild type (WT) mice that are associated in literature with gluconeogenesis. The identified genes share at least 3 publications and have an R-scaled score [34] of at least 35 with keyword ‘gluconeogenesis’ in Medline abstracts. For each gene, the fold change (Fc) in prednisolone versus vehicle-treated WT mice (Fc WT) is given. Additionally, for each gene the Fc in prednisolone versus vehicle-treated GRdim mice (Fc GRdim) is shown. *It should be noted that in GRdim mice these genes did not meet the p-value cutoff of 0.01 for significance after correction for multiple testing, and are mentioned to illustrate a small effect of prednisolone on gene expression in GRdim mice.

Biological process - Gluconeogenesis

Gene name                                                                Symbol    Fc WT   Fc GRdim*
Insulin-like growth factor binding protein 1                             Igfbp1    9.7     3.6
Tyrosine aminotransferase                                                Tat       4.1     1.6
6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3                    Pfkfb3    3.9     2.3
Peroxisome proliferative activated receptor, gamma, coactivator 1 alpha  Ppargc1a  3.2     3.2
CCAAT/enhancer binding protein (C/EBP), beta                             Cebpb     3.1     1.6
Protein-tyrosine sulfotransferase 2                                      Tpst2     2.7     1.5
Solute carrier family 2 (facilitated glucose transporter), member 1      Slc2a1    2.6     1.9
Forkhead box O1                                                          Foxo1     2.2     1.7
Serine dehydratase                                                       Sds       2.1     1.5
Aminoadipate-semialdehyde synthase                                       Aass      2.1     1.3
Phosphoenolpyruvate carboxykinase 1, cytosolic                           Pck1      1.9     1.3
Tryptophan 2,3-dioxygenase                                               Tdo2      1.5     1.2
Aconitase 2, mitochondrial                                               Aco2      1.4     1.3
Transketolase                                                            Tkt       -1.2    -1.1
Protein kinase, AMP-activated, beta 1 non-catalytic subunit              Prkab1    -1.5    1.0
Adiponectin receptor 2                                                   Adipor2   -1.6    -1.3
Purinergic receptor P2Y, G-protein coupled, 5                            P2ry5     -1.6    -1.3
Pyruvate kinase liver and red blood cell                                 Pklr      -2.1    -1.4
CCAAT/enhancer binding protein (C/EBP), alpha                            Cebpa     -2.3    -1.2
Sterol regulatory element binding factor 1                               Srebf1    -3.0    -1.6
Insulin receptor substrate 1                                             Irs1      -4.0    -1.5

Table 3 Prednisolone-regulated genes associated with ‘cell cycle’ in literature. CoPub identified 65 prednisolone-regulated genes in wild type (WT) mice that are associated in literature with the cell cycle (top 20 shown). The identified genes share at least 3 publications and have an R-scaled score [34] of at least 35 with keyword ‘cell cycle’ in Medline abstracts. For each gene, the fold change (Fc) in prednisolone versus vehicle-treated WT mice (Fc WT) is given. Additionally, for each gene also the Fc in prednisolone versus vehicle-treated GRdim mice (Fc GRdim) is shown. *It should be noted that in GRdim mice these genes did not meet the p-value cutoff of 0.01 for significance after correction for multiple testing, and are mentioned to illustrate a small effect of prednisolone on gene expression in GRdim mice.

Biological process - Cell cycle
Symbol    Fc WT  Fc GRdim*  Gene name
Gadd45g    17.8        2.9  Growth arrest and DNA-damage-inducible 45 gamma
Dusp1       8.2        1.0  Dual specificity phosphatase 1
Cdkn1a      8.1        1.0  Cyclin-dependent kinase inhibitor 1A (P21)
Gadd45b     5.1        1.0  Growth arrest and DNA-damage-inducible 45 beta
Plk3        4.5        1.0  Polo-like kinase 3
Rbbp8       4.5        1.3  Retinoblastoma binding protein 8
Bcl2l1      3.0        1.1  Bcl2-like 1
Tns1        2.6        1.2  Tensin 1
Ptp4a1      2.3        1.1  Protein tyrosine phosphatase 4a1
Foxo1       2.2        1.7  Forkhead box O1
Paxip1      1.9        1.0  PAX interacting protein 1
Foxo3a      1.9       -1.1  Forkhead box O3a
Ep300       1.8        1.0  E1A binding protein p300
Nek7        1.8        1.1  NIMA (never in mitosis gene a)-related expressed kinase 7
Cdc2l6      1.8        1.2  Cell division cycle 2-like 6 (CDK8-like)
Cdc40       1.4        1.0  Cell division cycle 40 homolog
Ccnd2      -1.5       -1.2  Cyclin D2
Nek6       -1.7       -1.2  NIMA (never in mitosis gene a)-related expressed kinase 6
Cks2       -2.3       -1.3  CDC28 protein kinase regulatory subunit 2
Tsc22d1    -4.0       -2.3  TSC22 domain family, member 1

The reduced fold changes in GRdim mice compared to WT mice (Table 2) indicate that glucocorticoid-induced gluconeogenesis is reduced in GRdim mice.

Cell cycle

Table 3 shows the prednisolone-regulated genes in WT mice that CoPub identified as being associated with the cell cycle. Upregulated genes include Gadd45b, Gadd45g (Growth arrest and DNA-damage-inducible 45 beta and gamma), Cdkn1a (Cyclin-dependent kinase inhibitor 1a; also known as p21) and Plk3 (Polo-like kinase 3), which are stress sensors involved in cell cycle arrest, DNA repair and apoptosis [43-45]. Other upregulated genes are Bcl2l1 (also known as Bcl2-like 1 and Bcl-xl), an anti-apoptotic protein that is enhanced by binding to the Gadd45 family [46, 47], and Dusp1, a p53 target gene involved in cell cycle regulation [48]. Amongst the downregulated genes are Ccnd2 (Cyclin D2) and Cks2 (Cdc28 protein kinase regulatory subunit 2), both involved in cell cycle progression [49, 50]. Overall, the direction of the prednisolone-induced differential regulation of genes associated with the cell cycle suggests that prednisolone induces a cytostatic response in the liver of WT mice.

Foxo transcription factors

From the literature-based network, the two transcription factors Foxo1 and Foxo3a appear to be intermediates between the gluconeogenesis subnetwork and the cell cycle and apoptosis subnetwork (Figure 3). Foxo transcription factors are key mediators of cell cycle progression, apoptosis, glucose metabolism, reactive oxygen species detoxification and DNA damage repair [51-54]. Their activity is tightly controlled by the insulin and growth factor-inducible Pi3k/Akt pathway. Akt-mediated phosphorylation of Foxo transcription factors promotes their translocation from the nucleus to the cytosol and thereby their inactivation through binding of 14-3-3 proteins [54-56]. Foxo transcription factors regulate gene expression of enzymes that are important regulators of gluconeogenesis, such as Pck1 and G6p [57, 58]. In addition, they suppress genes that are involved in glycolysis, the pentose shunt and lipogenesis [59]. Analysis of Foxo1-overexpressing mice revealed an increased expression of Igfbp1 (Insulin-like growth factor binding protein 1), Pck1, Tat and Tdo (Tryptophan 2,3-dioxygenase) and downregulation of Srebf1 (Sterol regulatory element binding factor 1) and Adipor2 (Adiponectin receptor 2) in liver [59]. Interestingly, these genes are similarly regulated in prednisolone-treated WT mice (Figure 3 and Table 2). Finally, Foxo transcription factors are regulators of cell cycle progression and were shown to induce Cdkn1a and Gadd45b [60] and to suppress Ccnd2 [54]. These genes also show the same direction of regulation in prednisolone-treated WT mice (Figure 3 and Table 3). These observations suggest that GCs in WT mice induce Foxo transcription factors, which in turn may synergize with the GR to modulate the expression of their respective target genes.
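The bridging role of the Foxo factors can be partly checked from the gene symbols listed in Tables 2 and 3. The Python sketch below intersects the two lists; the variable and function names are illustrative assumptions, not part of CoPub. Because Table 3 shows only the top 20 of 65 cell cycle genes and the network in Figure 3 uses different thresholds, the overlap found this way is a lower bound and is not expected to reproduce the network.

```python
# Gene symbols as listed in Table 2 (gluconeogenesis, 21 genes).
GLUCONEOGENESIS = {
    "Igfbp1", "Tat", "Pfkfb3", "Ppargc1a", "Cebpb", "Tpst2", "Slc2a1",
    "Foxo1", "Sds", "Aass", "Pck1", "Tdo2", "Aco2", "Tkt", "Prkab1",
    "Adipor2", "P2ry5", "Pklr", "Cebpa", "Srebf1", "Irs1",
}

# Gene symbols as listed in Table 3 (cell cycle, top 20 of 65 genes only).
CELL_CYCLE_TOP20 = {
    "Gadd45g", "Dusp1", "Cdkn1a", "Gadd45b", "Plk3", "Rbbp8", "Bcl2l1",
    "Tns1", "Ptp4a1", "Foxo1", "Paxip1", "Foxo3a", "Ep300", "Nek7",
    "Cdc2l6", "Cdc40", "Ccnd2", "Nek6", "Cks2", "Tsc22d1",
}


def shared_genes(first, second):
    """Return the sorted gene symbols that occur in both sets."""
    return sorted(first & second)


if __name__ == "__main__":
    print(shared_genes(GLUCONEOGENESIS, CELL_CYCLE_TOP20))  # ['Foxo1']
```

Of the listed genes, only Foxo1 appears in both tables; Foxo3a is listed in Table 3 but does not appear among the gluconeogenesis genes in Table 2.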

Prednisolone-induced differential gene expression in WT versus GRdim mice

Tables 2 and 3 list a number of genes that are significantly regulated by prednisolone in WT mice. In the GRdim mice these genes do not meet the significance cutoff for differential regulation (p<0.01), but for most of them a small effect of prednisolone on the expression level can be observed in GRdim mice. To analyze these differences quantitatively, we plotted the log2 ratios of gene expression after prednisolone treatment in WT versus GRdim mice (Figure 4). This figure shows that on average, genes in WT mice are more strongly regulated than in GRdim mice after prednisolone treatment and confirms that induction of gene expression by prednisolone is not abrogated in GRdim mice but reduced to on average 33% of the level in WT mice, as indicated by the slope of the dotted line in Figure 4.

Figure 3 Literature-based network of glucocorticoid-induced effects. A network representation of the enrichment results was generated, in which differentially expressed genes together with enriched keywords are plotted. Genes are shown as circles (Red: upregulated; Green: downregulated), whereas enriched keywords are shown as squares. Connections between genes and keywords represent co-publications in Medline abstracts. To avoid an over-complex network, thresholds were set to simplify the interpretation of the results. Only keywords and genes that share at least 5 publications and have an R-scaled score of at least 45 are plotted in the network [34]. Several higher-order biological processes that are affected by prednisolone can be appreciated: cell cycle and apoptosis, acute-phase response, stress response, amino acid metabolism, gluconeogenesis and lipid metabolism.

Figure 4 Log2 ratio plot of prednisolone-induced differential gene expression in wild type versus GRdim mice. Genes on the solid black line show an equal induction of differential gene expression by prednisolone in wild type (WT) versus GRdim mice. The black-dotted line indicates the average slope of the data and shows that the level of gene regulation in prednisolone-treated GRdim mice is 33% of that observed in WT mice. A subset of genes that show obvious differences in the magnitude of regulation by prednisolone in WT versus GRdim mice is highlighted, as well as genes that show equal magnitude of regulation in WT versus GRdim mice.
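To illustrate how an average slope of this kind can be obtained, the Python sketch below fits a line through the origin to log2 fold changes, using the 21 gene pairs from Table 2 as example input. Two points are assumptions rather than statements from the text: negative Fc values are read as downregulation by a factor |Fc|, and the line is fitted by least squares without an intercept. The text does not say how the dotted line in Figure 4 was fitted, and Table 2 covers only a subset of the regulated genes, so the value obtained here need not equal the reported 33%.

```python
import math

# (Fc WT, Fc GRdim) pairs copied from Table 2; negative values denote downregulation.
TABLE2_FC = [
    (9.7, 3.6), (4.1, 1.6), (3.9, 2.3), (3.2, 3.2), (3.1, 1.6), (2.7, 1.5),
    (2.6, 1.9), (2.2, 1.7), (2.1, 1.5), (2.1, 1.3), (1.9, 1.3), (1.5, 1.2),
    (1.4, 1.3), (-1.2, -1.1), (-1.5, 1.0), (-1.6, -1.3), (-1.6, -1.3),
    (-2.1, -1.4), (-2.3, -1.2), (-3.0, -1.6), (-4.0, -1.5),
]


def signed_log2(fc):
    """Convert a signed fold change to log2 scale (assumed convention:
    Fc >= 1 means up, Fc <= -1 means down by a factor |Fc|)."""
    return math.copysign(math.log2(abs(fc)), fc)


def slope_through_origin(xs, ys):
    """Least-squares slope b of the model y = b * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


wt = [signed_log2(fc_wt) for fc_wt, _ in TABLE2_FC]
grdim = [signed_log2(fc_dim) for _, fc_dim in TABLE2_FC]
slope = slope_through_origin(wt, grdim)

if __name__ == "__main__":
    print(f"GRdim response relative to WT (log2 scale): {slope:.2f}")
```

A slope of 1 would correspond to the solid diagonal in Figure 4 (equal regulation in both genotypes), and a slope of 0 to genes lying on the x-axis.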

Identification of genes that differentially respond to prednisolone in GRdim mice as compared to WT mice

In Figure 4, several genes are highlighted that show a different response upon prednisolone treatment in WT mice than in GRdim mice. Unexpectedly, a small subset of these genes shows stronger upregulation by prednisolone in GRdim mice compared to WT mice. These genes include Car8 (Carbonic anhydrase 8), a protein associated with proliferation and invasiveness of colon cancer cells [61], Zdhhc23 (Zinc finger, DHHC-type containing 23), a Nitric oxide synthase-binding and activation protein [62], Flvcr2 (Feline leukemia virus subgroup C cellular receptor family, member 2), a calcium-chelate transporter [63], and Amy2 (Amylase 2), an amylase that catalyzes the endohydrolysis of 1,4-alpha-D-glucosidic linkages in oligosaccharides. Genes that appear to strictly rely on GR dimerization include Dusp1, Gadd45b, Cdkn1a, Foxo3a, Plk3 and Fam107a (Figures 4 and 5). As discussed above, Foxo3a, Dusp1, Gadd45b, Cdkn1a and Plk3 all participate in cell cycle regulation, apoptosis and DNA damage repair [43-45, 52]. The most striking difference in the level of regulation by prednisolone between WT and GRdim mice was observed for Fam107a (also known as Tu3a and Drr1) (Figures 4, 5 and 6). The expression profile suggests that transcription of Fam107a is under direct control of the GR and strictly depends on its dimerization. Fam107a encodes a ubiquitously expressed protein that was first described in the context of renal cell carcinoma, in which Fam107a expression is reduced or absent due to promoter hypermethylation [64-66]. Overexpression of Fam107a in Fam107a-negative cell lines leads to growth retardation and apoptosis, indicating that Fam107a might act as a tumor suppressor [65, 67]. We also identified a subset of genes that showed an equal induction of gene expression by prednisolone in WT and GRdim mice (located on the diagonal black line in Figure 4): Dst (Dystonin), a cytoskeleton-interacting protein postulated to cross-link cytoskeletal filaments and thereby maintain cellular integrity [68], Gtf2a2 (General transcription factor II A, 2), a subunit of the transcription initiation factor TFIIA [69], Agpat6 (1-acylglycerol-3-phosphate O-acyltransferase 6), an enzyme involved in triglyceride synthesis [70], Marcks (Myristoylated alanine rich protein kinase C substrate), a cytoskeletal protein involved in cell adhesion and cell motility [71], Obfc2a (Oligonucleotide/oligosaccharide-binding fold containing 2A), a single-stranded nucleic-acid-binding protein [72], and Ppargc1a.
Prednisolone-induced changes in gene expression in WT and GRdim mice were validated by Q-PCR for Fam107a, Dusp1, Cdkn1a and Car8. Figure 6 shows the fold-changes of the selected genes in prednisolone versus vehicle-treated WT and GRdim mice. For all four genes, the regulation by prednisolone in WT and GRdim mice was qualitatively similar to what was found by microarray analysis.

Figure 5 Intensity profiles of Plk3, Dusp1, Gadd45b, Cdkn1a, Foxo3a and Fam107a in vehicle and prednisolone-treated wild type and GRdim mice. Several genes were identified that showed differences between prednisolone treatment in wild type (WT) and GRdim mice. Amongst those, Plk3, Dusp1, Gadd45b, Cdkn1a, Foxo3a and Fam107a were differentially regulated in prednisolone-treated WT mice, but not in GRdim mice. White bars represent vehicle-treated mice, whereas grey bars represent prednisolone-treated mice. Symbols: triangles, female mice; bullets, male mice.

Figure 6 Validation of microarray-obtained gene expression by Q-PCR. The relative expression of Fam107a, Dusp1, Cdkn1a and Car8 in prednisolone-treated versus vehicle-treated wild type (WT) and GRdim mice was validated by Q-PCR. Differences in expression between two samples were calculated by the 2^(-ΔΔCt) method [103, 104]. Mean differences in gene expression between WT and GRdim mice were analyzed using the Student’s t-test. No expression was observed for Fam107a and Cdkn1a in vehicle and prednisolone-treated GRdim mice (Ct>30). Asterisks denote p-values as follows: *p < 0.05, **p < 0.01 and ***p < 0.001.

Discussion and conclusion

In this study we performed liver gene expression profiling of WT and GRdim mice after prednisolone administration. Our aim was to chart the biological processes in the liver that are affected by GCs and to study their dependence on DNA-binding and dimerization of the GR. CoPub keyword enrichment analysis with prednisolone-regulated genes in WT mice showed enrichment of keywords associated with glucose, lipid and amino acid metabolism, the cell cycle and apoptosis (Table 1 and Figure 3). This is in agreement with cellular processes known to be affected by GCs [73, 74]. Interestingly, the forkhead transcription factors Foxo1 and Foxo3a are regulators of these cellular processes and are induced by prednisolone in WT mice. Together with the observation that a subset of the prednisolone-induced genes are Foxo1 and Foxo3a target genes and are similarly regulated in Foxo1 and Foxo3a expression profiling experiments [54, 59, 60], this suggests that the GR synergizes with these transcription factors in mouse liver to control lipid and glucose metabolism and the cell cycle. This concept is further supported by the recent finding that the MurF1 (Muscle RING finger 1) promoter contains adjacent binding sites for the GR and Foxo transcription factors and is synergistically activated when both are co-expressed [75]. However, it is likely that a subset of the prednisolone-induced genes is under direct control of Foxo1 and Foxo3a without the need for synergy with the GR. Other studies, addressing small gene sets or individual genes, have also identified a role for Foxo1 and Foxo3a in GR-induced gene expression [58, 75-77]. The list of genes differentially regulated by prednisolone in WT mice overlaps with that of an earlier study by Phuc Le et al., which combined chromatin immunoprecipitation (ChIP) data with gene expression data to identify direct GR target genes in mouse liver [78]. Most of the genes found in our study displayed the same up- and downregulation of gene expression, such as Tat, Foxo1 and Fkbp5 that are upregulated and Adipor2, CCAAT/enhancer binding protein alpha (C/EBPα) and Tkt that are downregulated. From a candidate gene set of 302 genes, Phuc Le et al. identified metabolism, cell proliferation and programmed cell death as important processes in their GO-term analysis, which is in agreement with our findings. Interestingly, in contrast to our study, Phuc Le et al. did not observe upregulation of bona fide GR target genes, such as Pck and Igfbp, in fed CD1 mice and explain this by lack of response due to a potential inhibition by insulin signaling in the fed state [79].
The fact that we observed upregulation of these genes indicates that in our experimental setup, which differs in several aspects from the setup used by Phuc Le et al., such as mouse strain used for the study, GC dosage and the fed state of the mice, the inhibitory effects of fed-induced insulin signaling do not play a role. The overall induction of gene expression was strongly reduced in prednisolone-treated GRdim mice compared to WT mice (Figure 4). Nevertheless, in many cases residual gene induction by prednisolone was still observed. This indicates that GR dimerization is indeed an important mechanism for activation of these genes, although some regulation can take place even in the absence of GR homodimers. In fact, GR monomers are in principle capable of binding to GC response elements (GREs) and evoking a basal induction of gene expression. However, due to the lack of cooperative binding they are less potent than GR homodimers. This is in line with the regulation reported for Pnmt (Phenylethanolamine-N-methyltransferase) and Amy2, in which binding of GR monomers to GREs or a GRE half-site (i.e. only one-half of the classical palindrome) was sufficient to confer induction by GCs [80, 81]. In case of the Pnmt gene, multiple GRE half-sites have been identified in the promoter region, allowing receptor clustering and thereby stable binding of GR monomers independent of the DNA-binding domain (DBD) interface [80]. This also explains why expression of the Pnmt gene is not compromised in GRdim mice despite the lack of GR dimer formation [21]. Another explanation for the residual gene induction by prednisolone in GRdim mice is that the mutant GR still forms homodimers but that these are far less stable than in WT mice. A recent study showed that the specific sequence of a GRE differently affects the conformation of the GR and thereby its activity towards specific target genes [82].
Mutation analysis of overexpressed GR domains that are involved in transcriptional activation, namely the dimerization region in the DBD (Dim) as well as AF1 and AF2 (activation function 1 and 2), showed that the dependency on each of them was specific for the sequence of the GR binding site and that genes differed in their dependence on dimerization [82]. The fold induction of Tat and Fkbp5 by dexamethasone in the Dim mutant was around 30% for Tat and 50% for Fkbp5 of that for WT GR, which is in broad agreement with our own observations in vivo (Figure 4). The effect of the GRdim mutation was found in all major pathways that were induced by prednisolone, i.e. for genes in inflammatory pathways, gluconeogenesis and cell cycle. Moreover, the attenuating effect of the GRdim mutation was found for genes that were upregulated as well as for genes that were downregulated by prednisolone. This suggests that genes that fail to be repressed by prednisolone in GRdim mice are either regulated through GR binding to negative GRE elements (nGREs) or indirectly regulated via other GR target genes. Interestingly, analysis of the magnitude of gene regulation by prednisolone in WT mice compared to GRdim mice showed that the cell cycle-related genes are more dependent on the dim interface than genes related to gluconeogenesis (Tables 2 and 3). This can also be appreciated in Figure 4; several genes that are on the x-axis (i.e. show no differential gene expression upon prednisolone treatment in GRdim mice) are cell cycle-related.
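The difference in dim-interface dependence between the two gene sets can be made concrete with the fold changes printed in Tables 2 and 3. The Python sketch below counts, per table, the genes that show no residual regulation in GRdim mice. The criterion used here (Fc GRdim equal to 1.0 at the precision of the table, or a change in the opposite direction to WT) is an assumption chosen for illustration and is not the statistical test used in the study; Table 3 lists only the top 20 of 65 cell cycle genes.

```python
# (Fc WT, Fc GRdim) pairs copied from Table 2 (gluconeogenesis, 21 genes).
GLUCONEOGENESIS = [
    (9.7, 3.6), (4.1, 1.6), (3.9, 2.3), (3.2, 3.2), (3.1, 1.6), (2.7, 1.5),
    (2.6, 1.9), (2.2, 1.7), (2.1, 1.5), (2.1, 1.3), (1.9, 1.3), (1.5, 1.2),
    (1.4, 1.3), (-1.2, -1.1), (-1.5, 1.0), (-1.6, -1.3), (-1.6, -1.3),
    (-2.1, -1.4), (-2.3, -1.2), (-3.0, -1.6), (-4.0, -1.5),
]

# (Fc WT, Fc GRdim) pairs copied from Table 3 (cell cycle, top 20 of 65 genes).
CELL_CYCLE = [
    (17.8, 2.9), (8.2, 1.0), (8.1, 1.0), (5.1, 1.0), (4.5, 1.0), (4.5, 1.3),
    (3.0, 1.1), (2.6, 1.2), (2.3, 1.1), (2.2, 1.7), (1.9, 1.0), (1.9, -1.1),
    (1.8, 1.0), (1.8, 1.1), (1.8, 1.2), (1.4, 1.0), (-1.5, -1.2), (-1.7, -1.2),
    (-2.3, -1.3), (-4.0, -2.3),
]


def lacks_residual_response(fc_wt, fc_grdim):
    """True if the gene is unchanged in GRdim mice (|Fc| == 1.0 as printed)
    or changes in the opposite direction to WT mice."""
    same_direction = (fc_wt > 0) == (fc_grdim > 0)
    return abs(fc_grdim) == 1.0 or not same_direction


def count_without_residual(pairs):
    """Number of genes in the list without residual regulation in GRdim mice."""
    return sum(lacks_residual_response(wt, dim) for wt, dim in pairs)


if __name__ == "__main__":
    for name, pairs in (("gluconeogenesis", GLUCONEOGENESIS), ("cell cycle", CELL_CYCLE)):
        print(f"{name}: {count_without_residual(pairs)} of {len(pairs)}")
```

With the listed values, 1 of the 21 gluconeogenesis genes and 8 of the 20 listed cell cycle genes fall into this category, which is consistent with the stronger dim-interface dependence of the cell cycle-related genes described above.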

We identified several genes that showed stronger upregulation by prednisolone in GRdim mice compared to WT mice, such as Amy2, Car8 and Zdhhc23, and several genes that showed equal induction of gene expression by prednisolone in WT and GRdim mice, amongst them Dst and Ppargc1a (Figure 4). These genes are potential candidates for having GRE half-sites in their promoter regions, which could explain why these genes show equal or even higher induction of gene expression by prednisolone in GRdim mice compared to WT mice. As mentioned earlier, the presence of GRE half-sites in the promoter region of Amy2 was indeed experimentally confirmed [81]. For the other genes, however, we did not find literature evidence for the presence of GRE half-sites in their promoter regions. Therefore, follow-up studies on these genes to determine the functional GRE sites in their regulatory region would be of interest to study the significance of dimer-interface independent GRE binding. Genes that showed equal or stronger upregulation by prednisolone in GRdim mice compared to WT mice can also be secondary response genes, under transcriptional control of transcription factors other than the GR. Keyword enrichment analysis performed with this set of genes did not identify enrichment for a particular cellular process.

We identified several genes that are absolutely dependent on GR dimerization for the induction of gene expression by prednisolone, such as Cdkn1a, Gadd45b, Dusp1, Plk3 and Foxo3a (Figure 4). Interestingly, all of these have a strong relationship with p53: some are under direct transcriptional control of p53 (Cdkn1a [83], Gadd45b [84] and Dusp1 [48]) and others physically interact with p53 (Foxo3a [85, 86] and Plk3 [87]). The cellular responses mediated by Foxo3a and p53 are highly similar; the two factors share some of their target genes (e.g. Cdkn1a and Gadd45b) and use similar mechanisms to regulate post-translational modification [51, 88]. Therefore, Foxo3a and p53 can be regarded as partners that positively as well as negatively regulate each other, depending on the context [51]. This observation might indicate that GC-induced activation of Foxo3a and/or p53 is hampered in GRdim mice. In line with this hypothesis, Foxo3a was recently shown to be required for GC-induced apoptosis in lymphocytes [89]. Interestingly, this process is defective in GRdim mice, highlighting a possible link between the GR, Foxo3a and induction of lymphocyte apoptosis [21]. Fam107a showed the largest induction of gene expression by prednisolone in WT mice (Figures 4 and 5). Analysis of protein-protein interactions revealed that Fam107a interacts with Tada2a [90, 91], a protein that, together with its binding partner Tada3a (Transcriptional adaptor 2 and 3 alpha), is a core component of the histone acetyltransferase (HAT) complex [92]. HAT complexes are involved in chromatin structure modification for initiation of gene transcription, but can also acetylate non-histone proteins to modify their activity and stability [92, 93]. Interestingly, Fam107b, a paralog of Fam107a with 84% protein similarity (not regulated in mouse liver by prednisolone), was shown to interact with Tada3a [94].
The observation that Fam107a inhibits cell proliferation and induces apoptosis when overexpressed [65, 67] suggests that Fam107a, like Foxo3a, Dusp1, Gadd45b, Cdkn1a and Plk3, plays a role in regulating the cell cycle. Furthermore, the association of Fam107a with a core protein of the HAT complex might indicate that Fam107a serves as a cofactor in the transcription machinery complex. The activity and function of Foxo3a and p53 are strongly modulated by acetylation [51, 88, 95]. Hence, Fam107a is an interesting candidate gene for follow-up experiments to study whether it modulates GR-induced gene expression and/or acetylation of GR-associated transcription factors such as Foxo3a, p53, C/EBPα and C/EBPβ (CCAAT/enhancer binding protein, beta). For a more comprehensive view on GC-regulated processes in the liver, experiments with multiple time points, different doses of GCs and one or more inflammatory stimuli could be considered.

Methods

Animals

All mice (WT and GRdim; Balb/c) were bred at the Institute of Virology and Immunobiology at the University of Würzburg. In total 24 mice (8 male WT, 4 female WT, 8 male GRdim and 4 female GRdim) were included in the study. Mice were treated subcutaneously with 1 mg/kg prednisolone (5% DMSO and 5% Cremophor in mannitol, 10 ml/kg) once and sacrificed 150 minutes later by cervical dislocation, which was approved by the responsible authorities in Bavaria (Regierung von Unterfranken). All experiments were performed in the morning between 9 and 10 AM. The mice were exposed to a regular dark-light cycle (lights on between 6 AM and 6 PM) and had access to water and food ad libitum at any time.

RNA isolation

Liver biopsy specimens were collected into aluminum containers, snap-frozen in liquid N2 and stored at -80°C before use. RNA isolation was done using Trizol, followed by RNeasy clean-up to enhance the A260/A230 ratio. RNA quantity and quality were determined using the NanoDrop Spectrophotometer and Agilent Bioanalyzer. For all samples subjected to microarray hybridization, the RIN (RNA integrity number) was 9.0 - 10.

Microarray data processing

Processed RNA samples were hybridized on GeneChip Mouse Genome 430 2.0 arrays (MOE430-2) from Affymetrix [96]. Processing and downstream statistical analysis of the microarray data was done using the R-Statistics package [97]. Data were normalized using the gcrma algorithm, pair-wise ratios between treatments were built using the limma package and annotation for the probe sets was derived from the mouse430-2 library, all as provided in BioConductor [98]. In all contrast matrices, a correction for gender type was applied. Data were deposited in the NCBI Gene Expression Omnibus (GEO), accession number GSE21048.

Keyword enrichment analysis

Keyword enrichment analysis on the microarray data was performed using CoPub [34] with default settings as provided by the web server [99]. The literature network of enriched keywords and genes (nodes) and their co-publications (edges) was visualized using Cytoscape software [100, 101].
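The gene-keyword thresholds quoted in the captions of Tables 2 and 3 (at least 3 shared publications and an R-scaled score of at least 35) and of Figure 3 (at least 5 publications and a score of at least 45) amount to a simple filter on co-publication records. The Python sketch below shows such a filter. It is not CoPub's API: the record type, field names and function names are hypothetical, the example records are invented for illustration, and the R-scaled score is taken as a given input rather than computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoPublication:
    """Hypothetical record for one gene-keyword pair."""
    gene: str
    keyword: str
    n_publications: int  # number of abstracts mentioning both terms
    r_scaled: float      # R-scaled score, taken as given


def passes(pair, min_publications, min_r_scaled):
    """True if the pair meets both thresholds."""
    return pair.n_publications >= min_publications and pair.r_scaled >= min_r_scaled


def filter_pairs(pairs, min_publications=3, min_r_scaled=35.0):
    """Keep gene-keyword pairs that meet the thresholds (defaults: table captions)."""
    return [p for p in pairs if passes(p, min_publications, min_r_scaled)]


# Invented records for illustration only; these are not values from the study.
pairs = [
    CoPublication("GeneA", "gluconeogenesis", 12, 52.0),
    CoPublication("GeneB", "gluconeogenesis", 4, 38.5),
    CoPublication("GeneC", "cell cycle", 2, 60.0),
]

table_level = filter_pairs(pairs)             # thresholds quoted for Tables 2 and 3
network_level = filter_pairs(pairs, 5, 45.0)  # stricter thresholds quoted for Figure 3

if __name__ == "__main__":
    print([p.gene for p in table_level])    # ['GeneA', 'GeneB']
    print([p.gene for p in network_level])  # ['GeneA']
```

Raising the thresholds trades coverage for a sparser network, which is the stated reason for the stricter cutoffs in Figure 3.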

Validation of microarray results with Q-PCR

Total RNA (1 µg) was reverse transcribed using a commercially available cDNA synthesis kit (iScript, BioRad Laboratories, Hercules, CA, USA). Q-PCR was performed by SYBR Green-based quantification according to the manufacturer’s protocol (Applied Biosystems, Foster City, CA, USA). Primers were developed for Fam107a (Family with sequence similarity 107, member a; Fw: TCATCAAACCCAAGAAGCTG; Rev: CTCAGGCTTGCTGTCCATAC), Car8 (Carbonic anhydrase 8; Fw: CACACCATTCAAGTCATCCTG; Rev: ACCACGCTGGTTTTCTCTTC), Cdkn1a (Cyclin-dependent kinase inhibitor 1a; Fw: TCTTGCACTCTGGTGTCTGAG; Rev: ATCTGCGCTTGGAGTGATAG) and Dusp1 (Dual specificity phosphatase 1; Fw: GTGCCTGACAGTGCAGAATC; Rev: CCAGGTACAGGAAGGACAGG) using Primer3 [102]. All primer pairs were exon-spanning. The gene Azgp1 (alpha-2-glycoprotein 1, zinc-binding; Fw: AAGGAAAGCCAGCTTCAGAG; Rev: ACCAAACATTCCCTGAAAGG) was chosen as endogenous control since the expression arrays did not show any differences in expression of this gene between the two experimental groups (WT vs GRdim). PCR products were selected to be between 80 and 120 bp long. Samples were run on the 7500 Fast Real-Time PCR System (Applied Biosystems) using the following protocol: 10 min denaturation at 95°C, followed by 40 cycles of 15 sec denaturation at 95°C and 60 sec annealing and extension at 60°C. All primer pairs were validated in triplicate using serial cDNA dilutions. Primer pairs that were 100±10% efficient, which implies a doubling of PCR product in each cycle, were used to quantify mRNA levels. Threshold cycle numbers (referred to as Ct) were obtained using the 7900 HT System SDS software version 2.3 (Applied Biosystems). All samples were measured three times and samples with a standard deviation (SD) larger than 0.5 were excluded from the analysis. The relative quantity (RQ) of the gene-specific mRNA was calculated from the average value of the ΔCt (target gene Ct - endogenous control gene Ct) for each of the 24 analyzed samples. Differences in expression between two samples were calculated by the 2^(-ΔΔCt) method [103, 104]. For Q-PCR, mean differences in expression between groups were analyzed using the Student’s t-test. A p-value of <0.05 was considered statistically significant in each situation.
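The quantification steps described above can be sketched in a few lines of Python. The sketch makes several assumptions that are not stated in the text: primer efficiency is derived from the slope of a standard curve (Ct versus log10 of input amount), the SD of the triplicates is the sample standard deviation, and the fold change is computed as 2^(-ΔΔCt) with ΔCt defined as target Ct minus endogenous control Ct. All function names and Ct values are hypothetical.

```python
import statistics


def amplification_efficiency(slope):
    """Efficiency from a standard-curve slope (Ct vs log10 input).
    1.0 corresponds to 100%, i.e. a doubling of product per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def mean_ct(replicates, max_sd=0.5):
    """Mean Ct of technical replicates, or None if their SD exceeds max_sd."""
    if statistics.stdev(replicates) > max_sd:
        return None
    return statistics.mean(replicates)


def fold_change(ct_target_treated, ct_control_treated,
                ct_target_vehicle, ct_control_vehicle):
    """Relative expression, treated versus vehicle, by 2^(-ddCt)."""
    delta_treated = ct_target_treated - ct_control_treated
    delta_vehicle = ct_target_vehicle - ct_control_vehicle
    return 2.0 ** -(delta_treated - delta_vehicle)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    print(mean_ct([24.0, 25.5, 24.2]))          # None (SD > 0.5, sample excluded)
    print(fold_change(22.0, 18.0, 25.0, 18.0))  # 8.0
```

In the second call the target gene crosses the threshold three cycles earlier in the treated sample at an unchanged control Ct, which corresponds to an eightfold induction.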

Protein-protein interaction data

Protein-protein interaction data were retrieved from the Biological General Repository for Interaction Datasets (BioGRID) database [105, 106].

Abbreviations

GR: Glucocorticoid receptor, WT: Wild type, GRdim: GR with the A458T point mutation in the dimerization region of the DNA-binding domain, GCs: Glucocorticoids, GRE: Glucocorticoid response element.

Authors' contributions

RF, AvE, MJvL, JPT, HMR, WD, JdV and WA designed research. JPT and HMR generated and provided GRdim and WT mice. HvdM performed RNA isolations and microarray hybridization experiments. EJMT designed and performed Q-PCR experiments. RF, WF and WA analyzed the microarray data and wrote the paper. All authors read and approved the final manuscript.

Acknowledgements

WF is sponsored by a grant received from the Netherlands Bioinformatics Centre (NBIC) under the BioRange program.


References

1. Kirwan J, Power L: Glucocorticoids: action and new therapeutic insights in rheumatoid arthritis. Curr Opin Rheumatol 2007, 19:233-237. 2. Ito K, Getting SJ, Charron CE: Mode of glucocorticoid actions in airway disease. ScientificWorldJournal 2006, 6:1750-1769. 3. Ito K, Chung KF, Adcock IM: Update on glucocorticoid action and resistance. J Allergy Clin Immunol 2006, 117:522-543. 4. Schacke H, Docke WD, Asadullah K: Mechanisms involved in the side effects of glucocorticoids. Pharmacol Ther 2002, 96:23-43. 5. Zhou J, Cidlowski JA: The human glucocorticoid receptor: one gene, multiple proteins and diverse responses. Steroids 2005, 70:407-417. 6. Schoneveld OJ, Gaemers IC, Lamers WH: Mechanisms of glucocorticoid signalling. Biochim Biophys Acta 2004, 1680:114-128. 7. Newton R, Holden NS: Separating transrepression and transactivation: a distressing divorce for the glucocorticoid receptor? Mol Pharmacol 2007, 72:799-809. 8. De Bosscher K, Vanden Berghe W, Haegeman G: The interplay between the glucocorticoid receptor and nuclear factor-kappaB or activator protein-1: molecular mechanisms for gene repression. Endocr Rev 2003, 24:488-522. 9. Hermoso MA, Cidlowski JA: Putting the brake on inflammatory responses: the role of glucocorticoids. IUBMB Life 2003, 55:497-504. 10. Heck S, Kullmann M, Gast A, Ponta H, Rahmsdorf HJ, Herrlich P, Cato AC: A distinct modulating domain in glucocorticoid receptor monomers in the repression of activity of the transcription factor AP-1. EMBO J 1994, 13:4087-4095. 11. Scheinman RI, Gualberto A, Jewell CM, Cidlowski JA, Baldwin AS, Jr.: Characterization of mechanisms involved in transrepression of NF-kappa B by activated glucocorticoid receptors. Mol Cell Biol 1995, 15:943-953. 12. Lucibello FC, Slater EP, Jooss KU, Beato M, Muller R: Mutual transrepression of Fos and the glucocorticoid receptor: involvement of a functional domain in Fos which is absent in FosB. EMBO J 1990, 9:2827-2834. 13. 
Adcock IM, Ito K, Barnes PJ: Glucocorticoids: effects on gene transcription. Proc Am Thorac Soc 2004, 1:247-254. 14. Ray A, Prefontaine KE: Physical association and functional antagonism between the p65 subunit of transcription factor NF-kappa B and the glucocorticoid receptor. Proc Natl Acad Sci U S A 1994, 91:752- 756. 15. Heck S, Bender K, Kullmann M, Gottlicher M, Herrlich P, Cato AC: I kappaB alpha-independent downregulation of NF-kappaB activity by glucocorticoid receptor. EMBO J 1997, 16:4698-4707. 16. Reily MM, Pantoja C, Hu X, Chinenov Y, Rogatsky I: The GRIP1:IRF3 interaction as a target for glucocorticoid receptor-mediated immunosuppression. EMBO J 2006, 25:108-117. 17. Tuckermann JP, Kleiman A, Moriggl R, Spanbroek R, Neumann A, Illing A, Clausen BE, Stride B, Forster I, Habenicht AJ, Reichardt HM, Tronche F, Schmid W, Schutz G: Macrophages and neutrophils are the targets for immune suppression by glucocorticoids in contact allergy. J Clin Invest 2007, 117:1381- 1390. 18. Opherk C, Tronche F, Kellendonk C, Kohlmuller D, Schulze A, Schmid W, Schutz G: Inactivation of the glucocorticoid receptor in hepatocytes leads to fasting hypoglycemia and ameliorates hyperglycemia in streptozotocin-induced diabetes mellitus. Mol Endocrinol 2004, 18:1346-1353. 19. Watts LM, Manchem VP, Leedom TA, Rivard AL, McKay RA, Bao D, Neroladakis T, Monia BP, Bodenmiller DM, Cao JX, Zhang HY, Cox AL, Jacobs SJ, Michael MD, Sloop KW, Bhanot S: Reduction of hepatic and adipose tissue glucocorticoid receptor expression with antisense oligonucleotides improves hyperglycemia and hyperlipidemia in diabetic rodents without causing systemic glucocorticoid antagonism. Diabetes 2005, 54:1846-1853. 20. Vander Kooi BT, Onuma H, Oeser JK, Svitek CA, Allen SR, Vander Kooi CW, Chazin WJ, O’Brien RM: The glucose-6-phosphatase catalytic subunit gene promoter contains both positive and negative glucocorticoid response elements. Mol Endocrinol 2005, 19:3001-3022. 21. 
Reichardt HM, Kaestner KH, Tuckermann J, Kretz O, Wessely O, Bock R, Gass P, Schmid W, Herrlich P, Prednisolone-induced differential gene expression in mouse liver 129

Angel P, Schutz G: DNA binding of the glucocorticoid receptor is not essential for survival. Cell 1998, 93:531-541. 22. Reichardt HM, Tuckermann JP, Gottlicher M, Vujic M, Weih F, Angel P, Herrlich P, Schutz G: Repression of inflammatory responses in the absence of DNA binding by the glucocorticoid receptor. EMBO J 2001, 20:7168-7173. 23. Vermeer H, Hendriks-Stegeman BI, van der Burg B, van Buul-Offers SC, Jansen M: Glucocorticoid- induced increase in lymphocytic FKBP51 messenger ribonucleic acid expression: a potential marker for glucocorticoid sensitivity, potency, and bioavailability. J Clin Endocrinol Metab 2003, 88:277-284. 24. Alexandrova M: Stress induced tyrosine aminotransferase activity via glucocorticoid receptor. Horm Metab Res 1994, 26:97-99. 25. Imai E, Stromstedt PE, Quinn PG, Carlstedt-Duke J, Gustafsson JA, Granner DK: Characterization of a complex glucocorticoid response unit in the phosphoenolpyruvate carboxykinase gene. Mol Cell Biol 1990, 10:4712-4719. 26. Kassel O, Sancono A, Kratzschmar J, Kreft B, Stassen M, Cato AC: Glucocorticoids inhibit MAP kinase via increased expression and decreased degradation of MKP-1. EMBO J 2001, 20:7108-7116. 27. Smoak KA, Cidlowski JA: Mechanisms of glucocorticoid receptor signaling during inflammation. Mech Ageing Dev 2004, 125:697-706. 28. Tuckermann JP, Reichardt HM, Arribas R, Richter KH, Schutz G, Angel P: The DNA binding-independent function of the glucocorticoid receptor mediates repression of AP-1-dependent genes in skin. J Cell Biol 1999, 147:1365-1370. 29. Rogatsky I, Luecke HF, Leitman DC, Yamamoto KR: Alternate surfaces of transcriptional coregulator GRIP1 function in different glucocorticoid receptor activation and repression contexts. Proc Natl Acad Sci U S A 2002, 99:16701-16706. 30. Sakuma T, Bhadhprasit W, Hashita T, Nemoto N: Synergism of glucocorticoid hormone with growth hormone for female-specific mouse Cyp3a44 gene expression. Drug Metab Dispos 2008, 36:878-884. 31. 
Muller DN, Schmidt C, Barbosa-Sicard E, Wellner M, Gross V, Hercule H, Markovic M, Honeck H, Luft FC, Schunck WH: Mouse Cyp4a isoforms: enzymatic properties, gender- and strain-specific expression, and role in renal 20-hydroxyeicosatetraenoic acid formation. Biochem J 2007, 403:109-118. 32. Wang JC, Derynck MK, Nonaka DF, Khodabakhsh DB, Haqq C, Yamamoto KR: Chromatin immunoprecipitation (ChIP) scanning identifies primary glucocorticoid receptor target genes. Proc Natl Acad Sci U S A 2004, 101:15603-15608. 33. Suh DS, Rechler MM: Hepatocyte nuclear factor 1 and the glucocorticoid receptor synergistically activate transcription of the rat insulin-like growth factor binding protein-1 gene. Mol Endocrinol 1997, 11:1822-1831.

34. Frijters R, Heupers B, van Beek P, Bouwhuis M, van Schaik R, de Vlieg J, Polman J, Alkema W: CoPub: a literature-based keyword enrichment tool for microarray data analysis. Nucleic Acids Res 2008, Chapter 6 Chapter 36:W406-410. 35. Sun Y, Liu S, Ferguson S, Wang L, Klepcyk P, Yun JS, Friedman JE: Phosphoenolpyruvate carboxykinase overexpression selectively attenuates insulin signaling and hepatic insulin sensitivity in transgenic mice. J Biol Chem 2002, 277:23301-23307. 36. Yoon JC, Puigserver P, Chen G, Donovan J, Wu Z, Rhee J, Adelmant G, Stafford J, Kahn CR, Granner DK, Newgard CB, Spiegelman BM: Control of hepatic gluconeogenesis through the transcriptional coactivator PGC-1. Nature 2001, 413:131-138. 37. Hagopian K, Ramsey JJ, Weindruch R: Serine utilization in mouse liver: influence of caloric restriction and aging. FEBS Lett 2005, 579:2009-2013. 38. Markovitz PJ, Chuang DT: The bifunctional aminoadipic semialdehyde synthase in lysine degradation. Separation of reductase and dehydrogenase domains by limited proteolysis and column chromatography. J Biol Chem 1987, 262:9353-9358. 39. Thirone AC, Huang C, Klip A: Tissue-specific roles of IRS proteins in insulin signaling and glucose transport. Trends Endocrinol Metab 2006, 17:72-78. 40. Beutler E, Baronciani L: Mutations in pyruvate kinase. Hum Mutat 1996, 7:1-6. 41. Scott DK, Stromstedt PE, Wang JC, Granner DK: Further characterization of the glucocorticoid response unit in the phosphoenolpyruvate carboxykinase gene. The role of the glucocorticoid receptor-binding sites. Mol Endocrinol 1998, 12:482-491. 42. Barthel A, Schmoll D: Novel concepts in insulin regulation of hepatic gluconeogenesis. Am J Physiol Endocrinol Metab 2003, 285:E685-692. 130 Chapter 6

43. Liebermann DA, Hoffman B: Gadd45 in stress signaling. J Mol Signal 2008, 3:15. 44. Xiong Y, Hannon GJ, Zhang H, Casso D, Kobayashi R, Beach D: p21 is a universal inhibitor of cyclin kinases. Nature 1993, 366:701-704. 45. Bahassi el M, Conn CW, Myer DL, Hennigan RF, McGowan CH, Sanchez Y, Stambrook PJ: Mammalian Polo-like kinase 3 (Plk3) is a multifunctional protein involved in stress response pathways. Oncogene 2002, 21:6633-6640. 46. Brunelle JK, Letai A: Control of mitochondrial apoptosis by the Bcl-2 family. J Cell Sci 2009, 122:437- 441. 47. Smith GB, Mocarski ES: Contribution of GADD45 family members to cell death suppression by cellular Bcl-xL and cytomegalovirus vMIA. J Virol 2005, 79:14923-14932. 48. Li M, Zhou JY, Ge Y, Matherly LH, Wu GS: The phosphatase MKP1 is a transcriptional target of p53 involved in cell cycle regulation. J Biol Chem 2003, 278:41059-41068. 49. Sherr CJ: G1 phase progression: cycling on cue. Cell 1994, 79:551-555. 50. Rother K, Dengl M, Lorenz J, Tschop K, Kirschner R, Mossner J, Engeland K: Gene expression of cyclin- dependent kinase subunit Cks2 is repressed by the tumor suppressor p53 but not by the related proteins p63 or p73. FEBS Lett 2007, 581:1166-1172. 51. van der Horst A, Burgering BM: Stressing the role of FoxO proteins in lifespan and disease. Nat Rev Mol Cell Biol 2007, 8:440-450. 52. Arden KC: FOXO animal models reveal a variety of diverse roles for FOXO transcription factors. Oncogene 2008, 27:2345-2350. 53. Nakae J, Oki M, Cao Y: The FoxO transcription factors and metabolic regulation. FEBS Lett 2008, 582:54- 67. 54. Furukawa-Hibi Y, Kobayashi Y, Chen C, Motoyama N: FOXO transcription factors in cell-cycle regulation and the response to oxidative stress. Antioxid Redox Signal 2005, 7:752-760. 55. Puigserver P, Rhee J, Donovan J, Walkey CJ, Yoon JC, Oriente F, Kitamura Y, Altomonte J, Dong H, Accili D, Spiegelman BM: Insulin-regulated hepatic gluconeogenesis through FOXO1-PGC-1alpha interaction. 
Nature 2003, 423:550-555. 56. Dong XC, Copps KD, Guo S, Li Y, Kollipara R, DePinho RA, White MF: Inactivation of hepatic Foxo1 by insulin signaling is required for adaptive nutrient homeostasis and endocrine growth regulation. Cell Metab 2008, 8:65-76. 57. Barthel A, Schmoll D, Kruger KD, Bahrenberg G, Walther R, Roth RA, Joost HG: Differential regulation of endogenous glucose-6-phosphatase and phosphoenolpyruvate carboxykinase gene expression by the forkhead transcription factor FKHR in H4IIE-hepatoma cells. Biochem Biophys Res Commun 2001, 285:897-902. 58. Nakae J, Kitamura T, Silver DL, Accili D: The forkhead transcription factor Foxo1 (Fkhr) confers insulin sensitivity onto glucose-6-phosphatase expression. J Clin Invest 2001, 108:1359-1367. 59. Zhang W, Patil S, Chauhan B, Guo S, Powell DR, Le J, Klotsas A, Matika R, Xiao X, Franks R, Heidenreich KA, Sajan MP, Farese RV, Stolz DB, Tso P, Koo SH, Montminy M, Unterman TG: FoxO1 regulates multiple metabolic pathways in the liver: effects on gluconeogenic, glycolytic, and lipogenic gene expression. J Biol Chem 2006, 281:10105-10117. 60. Gomis RR, Alarcon C, He W, Wang Q, Seoane J, Lash A, Massague J: A FoxO-Smad synexpression group in human keratinocytes. Proc Natl Acad Sci U S A 2006, 103:12747-12752. 61. Nishikata M, Nishimori I, Taniuchi K, Takeuchi T, Minakuchi T, Kohsaki T, Adachi Y, Ohtsuki Y, Onishi S: Carbonic anhydrase-related protein VIII promotes colon cancer cell growth. Mol Carcinog 2007, 46:208-214. 62. Saitoh F, Tian QB, Okano A, Sakagami H, Kondo H, Suzuki T: NIDD, a novel DHHC-containing protein, targets neuronal nitric-oxide synthase (nNOS) to the synaptic membrane through a PDZ-dependent interaction and regulates nNOS activity. J Biol Chem 2004, 279:29461-29468. 63. 
Brasier G, Tikellis C, Xuereb L, Craigie J, Casley D, Kovacs CS, Fudge NJ, Kalnins R, Cooper ME, Wookey PJ: Novel hexad repeats conserved in a putative transporter with restricted expression in cell types associated with growth, calcium exchange and homeostasis. Exp Cell Res 2004, 293:31-42. 64. Yamato T, Orikasa K, Fukushige S, Orikasa S, Horii A: Isolation and characterization of the novel gene, TU3A, in a commonly deleted region on 3p14.3-->p14.2 in renal cell carcinoma. Cytogenet Cell Genet 1999, 87:291-295. 65. Wang L, Darling J, Zhang JS, Liu W, Qian J, Bostwick D, Hartmann L, Jenkins R, Bardenhauer W, Schutte J, Opalka B, Smith DI: Loss of expression of the DRR 1 gene at chromosomal segment 3p21.1 in renal Prednisolone-induced differential gene expression in mouse liver 131

cell carcinoma. Genes Cancer 2000, 27:1-10. 66. Awakura Y, Nakamura E, Ito N, Kamoto T, Ogawa O: Methylation-associated silencing of TU3A in human cancers. Int J Oncol 2008, 33:893-899. 67. Liu Q, Zhao XY, Bai RZ, Liang SF, Nie CL, Yuan Z, Wang CT, Wu Y, Chen LJ, Wei YQ: Induction of tumor inhibition and apoptosis by a candidate tumor suppressor gene DRR1 on 3p21.1. Oncol Rep 2009, 22:1069-1075. 68. Young KG, Pinheiro B, Kothary R: A Bpag1 isoform involved in cytoskeletal organization surrounding the nucleus. Exp Cell Res 2006, 312:121-134. 69. Nakadai T, Shimada M, Shima D, Handa H, Tamura TA: Specific interaction with transcription factor IIA and localization of the mammalian TATA-binding protein-like protein (TLP/TRF2/TLF). J Biol Chem 2004, 279:7447-7455. 70. Vergnes L, Beigneux AP, Davis R, Watkins SM, Young SG, Reue K: Agpat6 deficiency causes subdermal lipodystrophy and resistance to obesity. J Lipid Res 2006, 47:745-754. 71. Arbuzova A, Schmitz AA, Vergeres G: Cross-talk unfolded: MARCKS proteins. Biochem J 2002, 362:1-12. 72. Kang HS, Beak JY, Kim YS, Petrovich RM, Collins JB, Grissom SF, Jetten AM: NABP1, a novel RORgamma- regulated gene encoding a single-stranded nucleic-acid-binding protein. Biochem J 2006, 397:89-99. 73. Rogatsky I, Trowbridge JM, Garabedian MJ: Glucocorticoid receptor-mediated cell cycle arrest is achieved through distinct cell-specific transcriptional regulatory mechanisms. Mol Cell Biol 1997, 17:3181-3193. 74. Sionov RV, Kfir S, Zafrir E, Cohen O, Zilberman Y, Yefenof E: Glucocorticoid-induced apoptosis revisited: a novel role for glucocorticoid receptor translocation to the mitochondria. Cell Cycle 2006, 5:1017- 1026. 75. Waddell DS, Baehr LM, van den Brandt J, Johnsen SA, Reichardt HM, Furlow JD, Bodine SC: The glucocorticoid receptor and FOXO1 synergistically activate the skeletal muscle atrophy-associated MuRF1 gene. Am J Physiol Endocrinol Metab 2008, 295:E785-797. 76. 
Zhao W, Qin W, Pan J, Wu Y, Bauman WA, Cardozo C: Dependence of dexamethasone-induced Akt/ FOXO1 signaling, upregulation of MAFbx, and protein catabolism upon the glucocorticoid receptor. Biochem Biophys Res Commun 2009, 378:668-672. 77. Kwon HS, Huang B, Unterman TG, Harris RA: Protein kinase B-alpha inhibits human pyruvate dehydrogenase kinase-4 gene induction by dexamethasone through inactivation of FOXO transcription factors. Diabetes 2004, 53:899-910. 78. Phuc Le P, Friedman JR, Schug J, Brestelli JE, Parker JB, Bochkis IM, Kaestner KH: Glucocorticoid receptor-dependent gene regulatory networks. PLoS Genet 2005, 1:e16. 79. Pierreux CE, Rousseau GG, Lemaigre FP: Insulin inhibition of glucocorticoid-stimulated gene

transcription: requirement for an insulin response element? Mol Cell Endocrinol 1999, 147:1-5. 80. Adams M, Meijer OC, Wang J, Bhargava A, Pearce D: Homodimerization of the glucocorticoid receptor is Chapter 6 Chapter not essential for response element binding: activation of the phenylethanolamine N-methyltransferase gene by dimerization-defective mutants. Mol Endocrinol 2003, 17:2583-2592. 81. Slater EP, Hesse H, Muller JM, Beato M: Glucocorticoid receptor binding site in the mouse alpha- amylase 2 gene mediates response to the hormone. Mol Endocrinol 1993, 7:907-914. 82. Meijsing SH, Pufall MA, So AY, Bates DL, Chen L, Yamamoto KR: DNA binding site sequence directs glucocorticoid receptor structure and activity. Science 2009, 324:407-410. 83. Hill R, Bodzak E, Blough MD, Lee PW: p53 Binding to the p21 promoter is dependent on the nature of DNA damage. Cell Cycle 2008, 7:2535-2543. 84. Balliet AG, Hatton KS, Hoffman B, Liebermann DA: Comparative analysis of the genetic structure and chromosomal location of the murine MyD118 (Gadd45beta) gene. DNA Cell Biol 2001, 20:239-247. 85. Nemoto S, Fergusson MM, Finkel T: Nutrient availability regulates SIRT1 through a forkhead-dependent pathway. Science 2004, 306:2105-2108. 86. Brunet A, Sweeney LB, Sturgill JF, Chua KF, Greer PL, Lin Y, Tran H, Ross SE, Mostoslavsky R, Cohen HY, Hu LS, Cheng HL, Jedrychowski MP, Gygi SP, Sinclair DA, Alt FW, Greenberg ME: Stress-dependent regulation of FOXO transcription factors by the SIRT1 deacetylase. Science 2004, 303:2011-2015. 87. Xie S, Wu H, Wang Q, Cogswell JP, Husain I, Conn C, Stambrook P, Jhanwar-Uniyal M, Dai W: Plk3 functionally links DNA damage to cell cycle arrest and apoptosis at least in part via the p53 pathway. J Biol Chem 2001, 276:43305-43312. 88. Spange S, Wagner T, Heinzel T, Kramer OH: Acetylation of non-histone proteins modulates cellular signalling at multiple levels. Int J Biochem Cell Biol 2009, 41:185-198. 132 Chapter 6

89. Ma J, Xie Y, Shi Y, Qin W, Zhao B, Jin Y: Glucocorticoid-induced apoptosis requires FOXO3A activity. Biochem Biophys Res Commun 2008, 377:894-898. 90. Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, Klitgord N, Simon C, Boxem M, Milstein S, Rosenberg J, Goldberg DS, Zhang LV, Wong SL, Franklin G, Li S, Albala JS, Lim J, Fraughton C, Llamosas E, Cevik S, Bex C, Lamesch P, Sikorski RS, Vandenhaute J, Zoghbi HY, et al: Towards a proteome-scale map of the human protein-protein interaction network. Nature 2005, 437:1173-1178. 91. Ewing RM, Chu P, Elisma F, Li H, Taylor P, Climie S, McBroom-Cerajewski L, Robinson MD, O’Connor L, Li M, Taylor R, Dharsee M, Ho Y, Heilbut A, Moore L, Zhang S, Ornatsky O, Bukhman YV, Ethier M, Sheng Y, Vasilescu J, Abu-Farha M, Lambert JP, Duewel HS, Stewart, II, Kuehl B, Hogue K, Colwill K, Gladwish K, Muskat B, et al: Large-scale mapping of human protein-protein interactions by mass spectrometry. Mol Syst Biol 2007, 3:89. 92. Wang YL, Faiola F, Xu M, Pan S, Martinez E: Human ATAC Is a GCN5/PCAF-containing acetylase complex with a novel NC2-like histone fold module that interacts with the TATA-binding protein. J Biol Chem 2008, 283:33808-33815. 93. Nagy Z, Tora L: Distinct GCN5/PCAF-containing complexes function as co-activators and are involved in transcription factor and global histone acetylation. Oncogene 2007, 26:5341-5357. 94. Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksoz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H, Wanker EE: A human protein-protein interaction network: a resource for annotating the proteome. Cell 2005, 122:957-968. 95. Kruse JP, Gu W: Modes of p53 regulation. Cell 2009, 137:609-622. 96. Affymetrix [http://www.affymetrix.com] 97. 
The R Project for Statistical Computing [http://www.r-project.org] 98. BioConductor [http://www.bioconductor.org] 99. CoPub web server [http://www.copub.org] 100. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003, 13:2498-2504. 101. Cytoscape: Analyzing and Visualizing Network Data [http://www.cytoscape.org] 102. Rozen S, Skaletsky H: Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol 2000, 132:365-386. 103. Livak KJ, Schmittgen TD: Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 2001, 25:402-408. 104. Pfaffl MW: A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Res 2001, 29:e45. 105. Breitkreutz BJ, Stark C, Reguly T, Boucher L, Breitkreutz A, Livstone M, Oughtred R, Lackner DH, Bahler J, Wood V, Dolinski K, Tyers M: The BioGRID Interaction Database: 2008 update. Nucleic Acids Res 2008, 36:D637-640. 106. The BioGRID [http://www.thebiogrid.org] Prednisolone-induced differential gene expression in mouse liver 133

CHAPTER 7 Identification of T lymphocyte signal transduction pathways

Wilco W.M. Fleuren2,3, Ruben L Smeets1,6,9, Xuehui He7, Paul M Vink1, Frank Wijnands1, Monika Gorecka1, Henri Klop1, Susanne Bauerschmidt4, Anja Garritsen1, Hans JPM Koenen7, Irma Joosten7, Annemieke MH Boots1,8 and Wynand Alkema2,3,4,5

1Department of Immune Therapeutics, Merck Research Laboratories (MRL), MSD, Oss, the Netherlands. 2Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands. 3Netherlands Bioinformatics Centre (NBIC), Nijmegen, The Netherlands. 4Department of Molecular Design and Informatics, Merck Research Laboratories (MRL), MSD, Oss, the Netherlands. 5NIZO Food Research, Ede, The Netherlands. 6Department of Laboratory Medicine, laboratory for Clinical Chemistry, Radboud University Medical Centre, Nijmegen, The Netherlands. 7Department of Laboratory Medicine, laboratory for Medical Immunology, Radboud University Medical Centre, Nijmegen, The Netherlands. 8Department of Rheumatology and Clinical Immunology, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands. 9Department of Laboratory Medicine, laboratory for Clinical Chemistry, Radboud University Medical Centre, Nijmegen, Geert Grooteplein 10, Postbus 9101, 6500 HB Nijmegen, The Netherlands.

This work is adapted from ‘Molecular pathway profiling of T lymphocyte signal transduction pathways; Th1 and Th2 genomic fingerprints are defined by TCR and CD28-mediated signaling.’ Ruben L Smeets1,6,9, Wilco W.M. Fleuren2,3, Xuehui He7, Paul M Vink1, Frank Wijnands1, Monika Gorecka1, Henri Klop1, Susanne Bauerschmidt4, Anja Garritsen1, Hans JPM Koenen7, Irma Joosten7, Annemieke MH Boots1,8 and Wynand Alkema2,3,4,5 BMC Immunol. 2012 Mar 14;13(1):12.


Abstract

Background: T lymphocytes are orchestrators of adaptive immunity. Naïve T cells may differentiate into Th1, Th2, Th17 or iTreg phenotypes, depending on environmental co-stimulatory signals. To identify genes and pathways involved in differentiation of Jurkat T cells towards Th1 and Th2 subtypes we performed comprehensive transcriptome analyses of Jurkat T cells stimulated with various stimuli and pathway inhibitors. Results from these experiments were validated in a human experimental setting using whole blood and purified CD4+ T cells.

Results: Calcium-dependent activation of T cells using CD3/CD28 and PMA/CD3 stimulation induced a Th1 expression profile reflected by increased expression of T-bet, RUNX3, IL-2, and IFNγ, whereas calcium-independent activation via PMA/CD28 induced a Th2 expression profile which included GATA3, RXRA, CCL1 and Itk. Knock down with siRNA and gene expression profiling in the presence of selective kinase inhibitors showed that proximal kinases Lck and PKCθ are crucial signaling hubs during T helper cell activation, revealing a clear role for Lck in Th1 development and for PKCθ in both Th1 and Th2 development. Medial signaling via MAP kinases appeared to be less important in these pathways, since specific inhibitors of these kinases displayed a minor effect on gene expression. Translation towards a primary, whole blood setting and purified human CD4+ T cells revealed that PMA/CD3 stimulation induced a more pronounced Th1 specific, Lck and PKCθ dependent IFNγ production, whereas PMA/CD28 induced Th2 specific IL-5 and IL-13 production, independent of Lck activation. PMA/CD3-mediated skewing towards a Th1 phenotype was also reflected in mRNA expression of the master transcription factor T-bet, whereas PMA/CD28-mediated stimulation enhanced GATA3 mRNA expression in primary human CD4+ T cells.

Conclusions: This study identifies stimulatory pathways and gene expression profiles for in vitro skewing of T helper cell activation. PMA/CD3 stimulation enhances a Th1-like response in an Lck and PKCθ dependent fashion, whereas PMA/CD28 stimulation results in a Th2-like phenotype independent of the proximal TCR-tyrosine kinase Lck. This approach offers a robust and fast translational in vitro system for skewed T helper cell responses in Jurkat T cells, primary human CD4+ T cells and in a more complex matrix such as human whole blood.

Keywords: Signal transduction pathways, Gene expression profiling, T lymphocytes, Th1 and Th2 development

Background

Activation of T helper 0 (Th0) cells leads to differentiation into several lineages. These lineages include the Th1 and Th2 subsets as well as the more recently described subsets such as induced T regulatory cells and Th17 cells. The Th1 cells protect against intracellular pathogens and are in general characterized by their ability to produce IFNγ, IL-2 and TNFα and express the Th1-specific transcription factor T-bet. The Th2 subset, which is involved in the defense against extracellular pathogens, is characterized by the production of IL-4, IL-5 and IL-13 and is controlled by the master transcription factor GATA3 [1,2].

In a properly functioning immune system, these different T helper subsets are well-balanced and co-operate to eliminate invading pathogens and to maintain homeostasis. Hyperactivation of one T helper subset, however, can tip the balance from health towards disease: a Th2 overshoot can lead to inappropriate immune responses and to diseases like allergy and asthma. Alternatively, overshoot towards a Th1 or Th17 phenotype can cause autoimmune diseases, like rheumatoid arthritis and multiple sclerosis [3,4].

For effective CD4 T cell activation, the antigen presenting cell (APC) provides a key contact point to facilitate T cell activation and polarization towards different T helper subsets. A crucial event in this process is the interaction between the antigen presented via the MHCII receptor and the TCR (signal 1). The nature of activation, defined by the strength of the TCR stimulation, can affect T helper cell polarization towards Th1 or Th2, in which a high affinity interaction favors Th1 development and low affinity drives Th2 development [5-8]. Besides the TCR signal transduction, an additional signal is provided by the APC in the form of a co-stimulatory signal (signal 2). This signal is provided via CD28-B7 interaction and has been shown to be important for effective T cell activation [9]. Furthermore, CD28-mediated co-stimulation has been implicated in effective polarization of T cells towards a Th2 phenotype [10,11]. Other co-stimulatory molecules, including ICOS and OX40, have also been positively correlated with Th2 differentiation [12,13]. The results from these studies underline the importance of both signal 1 and signal 2, but also underline the complexity of these integrated signaling pathways.

The cascade of biochemical events linking cell surface receptor engagement to cellular responses has been a focus of many studies. Detailed investigation of these signal transduction events has led to identification and functional characterization of many kinases and phosphatases downstream of the TCR and CD28-receptor. TCR ligation results in the recruitment of p56Lck (Lck), a proximal TCR Src family kinase, which kick-starts the signal transduction cascade leading to phosphorylation of the ITAM motifs in the TCR, which recruits and activates ZAP70 [14]. This initial step leads to the activation of PLCγ that hydrolyzes PIP2 into IP3, which is the second messenger molecule responsible for the sustained intracellular calcium flux in T cells. CD28-ligation on T cells results in the recruitment of PI3K, with PIP2 and PIP3, which serve as pleckstrin homology (PH) domain membrane anchors. Via this mechanism PDK1 and PKB/Akt are recruited and regulate several pathways that increase cellular metabolism [15]. Additionally, CD28-signaling has been shown to initiate NFκB signaling, via a mechanism that is functionally linked through recruitment of PKCθ to CD28 in the immunological synapse [16-18].

Members of the Mitogen-activated protein kinase family, which can be activated via TCR signaling, also play a role in the differentiation of Th1 and Th2 subsets. In a thorough review by Dong et al., the role of p38, JNK and ERK in T helper cell differentiation has been outlined [19]. ERK is important for Th2 differentiation, whereas p38 and JNK2 appear to be involved in Th1 development. TCR/CD3 stimulation and CD28 stimulation alone are weak activators of T cell signaling. It is generally conceived that CD28 signaling merely acts as a signal potentiator on top of the initiator signal mediated via the TCR/CD3. Ledbetter and June et al. described that CD28 stimulation in the absence of cross-linking on top of PMA stimulation can activate T cells, without increasing calcium flux [20,21]. This suggests that co-stimulatory pathways synergize with biochemical pathways induced via the TCR. Whether CD28 ligation, in the absence of TCR signaling, leads to activation and differentiation has not been fully explored.

These findings show that effective T cell activation and differentiation towards effector subsets is the result of precise integration of multiple signaling routes. To explore the pathways underlying these distinct routes towards T cell activation and differentiation we used comprehensive biochemical characterization and gene expression profiling of Jurkat T cells that were activated with various co-stimulatory signals in the presence of various inhibitors of specific signaling routes. With this approach we identified specific PMA/CD3 and PMA/CD28 signal transduction and genomic fingerprints. PMA/CD3 stimulation induced a Th1 phenotype, dependent on both Lck and PKCθ, whereas PMA/CD28 stimulation, which is independent of TCR-mediated activation of Lck, resulted in a profound activation of T cells, skewing towards a Th2 phenotype.

Results and discussion

Activation of Jurkat T cells by various stimuli leads to differential signaling fingerprints

Jurkat T cells were activated by anti-CD3, anti-CD28, PMA, or ionomycin or combinations of these single stimuli, in order to map the contribution of these stimuli towards the activation of proximal, medial and distal signal transduction pathways. As shown in Figure 1A, CD3-stimulation and ionomycin/PMA were able to increase intracellular levels of Ca2+. Interestingly, neither CD28 nor PMA stimulation alone affected intracellular Ca2+ levels. As expected, CD3-signaling resulted in an Lck-dependent phosphorylation of ZAP70 (Figure 1B). Stimuli containing PMA directly activated the MAPK pathway, which is reflected by the phosphorylation of ERK, p38 and JNK (Figure 1B). Furthermore, PMA addition directly activated PKC, which was not reflected in the autophosphorylation of PKCθ, but was clearly detectable in the phosphorylation of the PKC substrate MARCKS (Figure 1B). CD3-mediated stimulations and PMA-induced stimulations both resulted in the activation of AP1 family transcription factors c-Jun and ATF2 (Figure 1B). Analysis of nuclear translocation of NFATc1 and c-Jun (AP1)/NFκB p65, as part of the distal signaling events, revealed that indeed CD3-mediated signaling induced both NFAT and c-Jun/NFκB, of which the latter pathways were potentiated by CD28-mediated signaling (Figure 1C). In line with the calcium release data, PMA or PMA/CD28-mediated signaling did not induce NFAT nuclear translocation but highly activated the CD28 responsive element transcription factors c-Jun and NFκB p65 (Figure 1C). These results indicate that two distinct co-stimulatory profiles can be identified: a CD3/28 and PMA/CD3 stimulus that signals via Lck, increasing intracellular Ca2+ and activating NFAT, and a calcium-independent PMA/CD28 (co)-stimulatory activation that signals via PKCθ and MARCKS. Next, the molecular mechanisms involved in these signaling pathways were further explored in genomics studies.

Differential regulation of genes after PMA/CD3 and CD3/28 vs PMA/CD28 stimulation

In order to further characterize the different signal transduction events induced by different (co)-stimulatory signals, we performed a first gene expression experiment with Jurkat T cells that were stimulated for 1 or 8 hours with PMA/CD3, CD3/28 and PMA/CD28. After 1 hour only a limited response at the transcription level was seen, whereas after 8 hours of stimulation several hundreds of genes were regulated. Furthermore, PMA/CD3 and PMA/CD28 regulate more genes compared to CD3/28, reflecting the strength of the stimuli used (Figure 2). Multivariate analysis by principal component analysis and hierarchical clustering showed that the 3 stimuli lead to clearly distinct gene expression profiles. At both time points the profile induced by PMA/CD28 is clearly distinct from the profiles induced by PMA/CD3 and CD3/28 (Figure 2).
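The probe set counts and overlap fractions summarized in Figure 2 can be illustrated with a short sketch. The cutoffs (fold change of 2, p-value below 1 × 10^-6) are the ones quoted in the Figure 2 legend. The input layout, the probe identifiers, the treatment of both up- and down-regulation, and the definition of overlap as the fraction of one stimulus's regulated probe sets that is also regulated by the other stimulus are assumptions made for illustration and are not taken from the original analysis. That definition is at least consistent with the asymmetric values in the table: at 1 hr, 0.53 × 338 and 0.92 × 194 both correspond to roughly 179 shared probe sets.

```python
from typing import Dict, Set, Tuple

FOLD_CHANGE_CUTOFF = 2.0  # cutoff quoted in the Figure 2 legend
P_VALUE_CUTOFF = 1e-6     # cutoff quoted in the Figure 2 legend

# Hypothetical input layout: probe id -> (fold change vs. control, p-value).
ProbeStats = Dict[str, Tuple[float, float]]


def regulated_probe_sets(stats: ProbeStats) -> Set[str]:
    """Return probes changed at least 2-fold (up or down) with p below the cutoff."""
    selected = set()
    for probe, (fold_change, p_value) in stats.items():
        up = fold_change >= FOLD_CHANGE_CUTOFF
        down = fold_change <= 1.0 / FOLD_CHANGE_CUTOFF
        if (up or down) and p_value < P_VALUE_CUTOFF:
            selected.add(probe)
    return selected


def directional_overlap(row: Set[str], column: Set[str]) -> float:
    """Fraction of the row stimulus's regulated probes also regulated by the column stimulus."""
    if not row:
        return 0.0
    return len(row & column) / len(row)


# Made-up toy values, for illustration only (not measured data).
toy = {
    "CD3_PMA": {"p1": (4.0, 1e-8), "p2": (0.2, 1e-9), "p3": (3.0, 1e-7), "p4": (1.1, 0.5)},
    "PMA_CD28": {"p1": (2.5, 1e-7), "p2": (1.0, 0.9), "p3": (1.9, 1e-9), "p5": (6.0, 1e-10)},
}
regulated = {name: regulated_probe_sets(stats) for name, stats in toy.items()}
overlap_matrix = {
    (a, b): directional_overlap(regulated[a], regulated[b])
    for a in regulated
    for b in regulated
}
```

Because the denominator is the size of the row set, the matrix is not symmetric, which is why a small set can overlap almost completely with a larger one while the reverse fraction stays low.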

Differentially regulated genes CCL1 and IL-2 are profile-specific secreted proteins

Gene profiles of the differential stimuli were ranked on the level of induction and evaluated on whether or not the translated protein is secreted. This resulted in the identification of the PMA/CD28-specific transcript CCL1 (Figure 3A), the CD3-specific transcripts IL-2 (Figure 3B) and XCL1/2 (data not shown). Small but significant inductions of these genes were observed after 1 hour of stimulation for both CCL1 and IL-2. However, both genes were highly induced after 8 hours of stimulation. The secretion of the protein 24 hours after stimulation reveals an identical profile compared to the mRNA (Figure 3A/B, right hand panel).
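The ranking step described in this section, ordering transcripts by induction and keeping those whose protein product is secreted, can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the function name and the gene labels are hypothetical, and the fold-induction values are placeholders rather than measured data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Transcript:
    gene: str              # hypothetical label, not a real gene symbol
    fold_induction: float  # placeholder value, not measured data
    secreted: bool         # whether the encoded protein is annotated as secreted


def rank_secreted(transcripts: List[Transcript], top_n: int) -> List[str]:
    """Sort by induction (highest first), then keep the top_n secreted products."""
    ranked = sorted(transcripts, key=lambda t: t.fold_induction, reverse=True)
    return [t.gene for t in ranked if t.secreted][:top_n]


example_profile = [
    Transcript("GENE_A", 5.0, False),
    Transcript("GENE_B", 40.0, True),
    Transcript("GENE_C", 3.0, False),
    Transcript("GENE_D", 8.0, True),
]
candidates = rank_secreted(example_profile, top_n=2)
```

Filtering after sorting keeps the induction order among the secreted candidates, so the most strongly induced secreted transcript comes first.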

A

55000 PMA/ionomycine αCD3/αCD28

45000 PMA/αCD3 αCD3

35000 RFU αCD28 PMA/αCD28 25000 PMA Control 15000 0 25 50 75 100 125 150 175 seconds

B

Proximal signal transduction Medial signal transduction Distal signal transduction events events events

C

Figure 1 Signal transduction in Jurkat T cells. Jurkat T cells were stimulated with different

combinations of stimuli in order to elucidate the different signal transduction pathways. A; Jurkat 7 Chapter T cells were stimulated as indicated and intracellular Ca2+ release was monitored over/in time. B; Intracellular signal transduction routes were charted via phospho-analysis using western blot. Jurkat T cells were stimulated for 15 min using different stimulations. Proximal (Lck, ZAP70, PKCΘ and the PKC substrate MARCKs), medial (MAPK phosphorylation) and distal (c-Jun and ATF2) signaling was monitored based on the phosphorylation status of the described proteins.C ; Nuclear translocation of the transcription factors NFAT, NFkB and c-Jun was evaluated 15 minutes after stimulation. Pathway profiling with multiple stimuli and inhibitors To investigate the contribution of proximal, medial and distal signaling events onthe CD3/28, PMA/CD3 and PMA/CD28 stimuli, we performed a second gene expression pro- filing experiment with different selective inhibitors, including proximal kinase inhibitors 142 Chapter 7

             Number of        Overlap with
             probe sets   CD3_PMA   PMA_CD28   CD3_CD28
T = 1 hr
  CD3_PMA        338        1.00      0.53       0.83
  PMA_CD28       194        0.92      1.00       0.89
  CD3_CD28       319        0.88      0.54       1.00
T = 8 hrs
  CD3_PMA       1567        1.00      0.72       0.42
  PMA_CD28      1853        0.60      1.00       0.30
  CD3_CD28       786        0.84      0.70       1.00

[Right panel: hierarchical clustering of the PMA + CD28, PMA + CD3 and CD3 + CD28 samples.]

Figure 2 PMA/CD3 and CD3/28 stimulations differ from PMA/CD28 stimulation. The table shows the number of regulated probe sets and their overlap at 1 and 8 hrs following stimulation with CD3/PMA, PMA/CD28 and CD3/CD28 (using a fold change cut-off of 2 and a p-value cut-off of < 1 × 10^-6). The right panel shows a hierarchical clustering of the data for the regulated probe sets. Most of the variation is caused by the time difference (1 hr vs 8 hrs). At both time points the PMA/CD28 stimulus was clearly different from the CD3/CD28 and CD3/PMA stimuli. Repeating this analysis with different values for the fold change and p-value cut-offs yielded essentially the same results for the multivariate analysis.

Lck (A420983), PKCθ (AEB071), medial MAPK inhibitors PD98059 (MEK/ERK), SP600125 (pan JNK), Org 48762-0 (P38) and the distal Calcineurin (Cn) inhibitor Cyclosporin A (CsA). Jurkat T cells were stimulated with PMA, CD28 and CD3 alone and combinations thereof in the presence of the above-mentioned inhibitors. Based on the results of the first gene expression profiling experiment, we chose to evaluate gene expression after 8 hours of stimulation. A principal component analysis on the ratio data set is shown in Additional file 1: Figure S1. The PMA/CD28, CD3/CD28 and PMA/CD3 co-stimuli induced several hundreds of genes, whereas the effect of a single stimulus was smaller, with the exception of the PMA single stimulus (Additional file 1: Figure S1, Additional file 2: Table S1). The gene set induced by a PMA stimulus showed a larger overlap with the genes induced by the PMA/CD28 stimulus than with the CD3/PMA-induced gene set. Whereas PMA and CD3 as a single stimulus induce a large number of genes, CD28 elicits only a minor effect. CsA, AEB071 and A420983 induce the largest effects on gene regulation, whereas the inhibitors of the MAPK pathway have only a minor effect on gene expression. This finding is corroborated by the number of regulated genes: the MAPK inhibitors regulate only a small number of genes, whereas A420983, CsA and AEB071 regulate many genes (Additional file 2: Table S1). A420983 and CsA show a significant effect only on the PMA/CD3 and CD3/CD28 pathways, in which the effect of CsA is smaller than the effect of A420983. AEB071 is the only compound that also shows a significant effect on the PMA/CD28-induced pathway. These analyses were rerun with different settings for the thresholds used for gene selection. In all cases similar results were obtained, indicating that the results were not critically dependent on the threshold settings used.

Inspection of the profiles of CCL1 and IL-2 revealed that CCL1 mRNA is highly induced via the PMA/CD28 pathway. This induction depends on PKC signaling and is negatively regulated via Lck signaling. Apparently this effect was upstream of Cn, since the inhibitor CsA did not increase CCL1 mRNA induction (Figure 4A).
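The probe-set selection and overlap comparison described for Figure 2, together with the re-runs at other thresholds, can be illustrated with a minimal Python sketch. It is not the analysis code used in this study. The function names, the toy probe sets and the alternative cut-offs are hypothetical, and the reading of the overlap as the fraction of a row's probe sets that are also regulated under the column stimulus is an assumption (it agrees with the asymmetric values in the Figure 2 table).

```python
import math

def regulated(stats, fc_cutoff=2.0, p_cutoff=1e-6):
    """Return probe sets passing both the fold-change and the p-value cut-off.

    `stats` maps a probe-set id to (log2 ratio, p-value). Up- and
    down-regulation both count, so the absolute log2 ratio is tested.
    """
    min_abs_log2 = math.log2(fc_cutoff)
    return {probe for probe, (log2_ratio, p_value) in stats.items()
            if abs(log2_ratio) >= min_abs_log2 and p_value < p_cutoff}

def overlap_fraction(row_set, col_set):
    """Fraction of `row_set` that is also in `col_set` (not symmetric)."""
    return len(row_set & col_set) / len(row_set) if row_set else 0.0

# Hypothetical toy data (probe id -> (log2 ratio, p-value)), not study data.
toy = {
    "CD3_PMA":  {"p1": (2.1, 1e-8), "p2": (1.5, 1e-9),
                 "p3": (0.4, 1e-9), "p4": (-1.2, 1e-7)},
    "PMA_CD28": {"p1": (1.8, 1e-7), "p2": (0.2, 0.5),
                 "p3": (1.1, 1e-8), "p4": (-1.0, 1e-3)},
}

selected = {stimulus: regulated(stats) for stimulus, stats in toy.items()}
for row, row_set in selected.items():
    fractions = [round(overlap_fraction(row_set, col_set), 2)
                 for col_set in selected.values()]
    print(row, len(row_set), fractions)

# Re-run the selection with other (hypothetical) cut-offs to see whether
# the set sizes change qualitatively.
for fc, p in [(1.5, 1e-4), (2.0, 1e-6), (3.0, 1e-8)]:
    sizes = {s: len(regulated(stats, fc, p)) for s, stats in toy.items()}
    print(fc, p, sizes)
```

Because the overlap is normalised by the size of the row set, a small set that is largely contained in a larger one gives a high value in one direction and a lower value in the other, which is the pattern seen for PMA_CD28 versus CD3_PMA at 1 hr.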


Figure 3 Differential regulation of IL-2 and CCL1. Jurkat T cells were activated for 1 and 8 hours using different stimuli (as indicated). Differentially regulated genes of which the proteins were secreted were identified and the highest ranking genes were selected. A; CCL1 mRNA is highly and differentially regulated after PMA/CD28 stimulation. Right panel shows the validation of the mRNA levels at the protein level, as secreted by activated Jurkat T cells. B; IL-2 mRNA is highly and differentially regulated after CD3/28 and PMA/CD3 stimulation. Right panel shows the validation of the mRNA levels at the protein level, as secreted by activated Jurkat T cells.

Figure 4 Involvement of signal transduction pathways on differentially regulated genes. Jurkat T cells were stimulated as indicated previously and the involvement of different signaling pathways in gene regulation was elucidated using inhibitors of specific pathways, including Lck (1 µM), PKC (10 µM), Calcineurin (1 µM), and MAPKs p38, JNK, and MEK/ERK (all 10 µM). A; Regulation of CCL1 mRNA in Jurkat T cells after stimulation with PMA/CD28, CD3/28 or PMA/CD3, cultured for 8 hours in the presence or absence of signal transduction pathway inhibitors. B; Regulation of IL-2 mRNA in Jurkat T cells after stimulation with PMA/CD28, CD3/28 or PMA/CD3, cultured for 8 hours in the presence or absence of signal transduction pathway inhibitors. The height of the bars represents the average intensity of 3 biological replicates; the error bars indicate the variation (min and max values). Significance of changes in expression level between the stimulated sample and the sample that was stimulated in the presence of the pathway inhibitor is indicated as follows: *** P value < 0.0001; ** 0.001 < P value < 0.01; * 0.01 < P value < 0.05.

As expected, PMA/CD3 and CD3/28 stimuli, and to a lesser extent PMA/CD28, resulted in a marked expression of IL-2, which is highly dependent on the Lck/Cn signal transduction pathway and the PKC pathway. Interestingly, inhibition of MAPK signaling (with the exception of the MEK/ERK pathway) does not affect IL-2 mRNA induction (Figure 4B). These effects on CCL1 and IL-2 production by inhibitors of the Lck/Cn and PKC pathways were further substantiated in a full dose-response experiment. Figure 5A shows that AEB071 indeed dose-dependently inhibited PMA/CD28-induced CCL1 production, which is slightly enhanced in the presence of an Lck inhibitor and is not affected by Cn inhibition via CsA. IL-2 production can be blocked by inhibition of both the Lck/Cn and PKC pathways (Figure 5B). The involvement of Lck in the CD3-mediated pathway and of PKC in the CD3- and CD28-mediated pathways was further confirmed by knock-down of both kinases under the distinct stimuli. Knock-down of Lck did not affect PMA/CD28-induced CCL1 production, whereas knock-down of PKCθ resulted in significant inhibition of both IL-2 and CCL1 (Figure 5C). These results clearly show that PMA/CD28-induced gene profiles are highly dependent on PKCθ signaling pathways but are independent of Lck/Cn and MAPK signaling pathways, whereas CD3-mediated signaling pathways are dependent on both Lck/Cn and PKCθ signal transduction and independent of MAPK signaling events.

PMA/CD3, PMA/CD28 and CD3/CD28 induce distinct genomic fingerprints

The above analysis indicated that treating Jurkat T cells with multiple combinations of stimuli and inhibitors highlights pathways that are regulated by specific combinations of stimulus and inhibitor, revealing the involvement of certain kinases as signaling hubs under specific stimulatory conditions. In order to identify additional genes in the pathways that are exemplified by CCL1 and IL-2, we searched for genes with profiles similar to these pathway genes. Figure 6 shows genes related to CCL1, identified by a strong up-regulation following PMA/CD28 stimulation only and a down-regulation by AEB071. Interestingly, besides CCL1, which is a chemoattractant for Th2 cells, many Th2-associated genes co-clustered with CCL1, including GATA3, Itk, RXRA, c-FLIP (CFLAR), ICOS and the IL-31 receptor, as well as other genes that are associated with Th2 development (see Figure 6; CCL1 gene cluster).

Likewise, a very specific IL-2 profile can be constructed by selecting genes that are up-regulated under all three conditions and down-regulated by all three inhibitors, with CsA being the weakest inhibitor (see Figure 6; IL-2 cluster). This gene cluster appeared to contain Th1-associated genes, including the Th1 master transcription factor Tbet (TBX21), Th1 chemokine XCL1/2, IFNγ, granzyme, RUNX3, FASL, OX40L (TNFRSF4), CD27, and the IL-21 receptor. Of note, inhibition of both Lck and Cn under PMA/CD3 stimulation enhanced the expression of Th2 master transcription factors GATA3 and RXRA, but also peptidoglycan recognition protein 4 (PGLYRP4) and G protein-coupled receptor 84 (GPR84).

A third example is provided by genes clustering together with EGR1, which show an up-regulation by all stimuli but are specifically regulated by AEB071 in all conditions and by A420983 only after the CD3/CD28 stimulus. Although IL-2 and EGR1 show a similar regulation by the stimuli used, the profiles can clearly be discriminated by the effects of the various inhibitors on their expression. The list of genes shown in Figure 7, together with their annotation and the correlation score to the CCL1, IL-2 and EGR1 profiles, is given in Additional file 1: Figure S1, Additional file 2: Table S2, Additional file 3: Table S3. This analysis shows that by applying multiple stimuli and selective compound treatments, pathways can be unraveled at high resolution.
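The search for genes whose profiles resemble CCL1, IL-2 or EGR1 can be sketched as a correlation ranking against a reference profile. This is only a minimal illustration: the similarity measure behind the "correlation score" is not specified here, so Pearson correlation is an assumption, and the gene names, condition order, expression values and the 0.9 threshold are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no pattern to correlate with
    return cov / (sd_x * sd_y)

def rank_by_profile(reference, profiles, min_r=0.9):
    """Genes whose profile correlates with `reference` at r >= min_r, best first."""
    scored = ((gene, pearson(reference, values)) for gene, values in profiles.items())
    return sorted(((g, r) for g, r in scored if r >= min_r),
                  key=lambda item: item[1], reverse=True)

# Hypothetical condition order:
# PMA/CD28, PMA/CD28 + AEB071, CD3/CD28, PMA/CD3
ccl1_like_reference = [1.0, 0.0, 0.0, 0.0]  # induced by PMA/CD28 only, lost with AEB071

profiles = {  # hypothetical log2 ratios, not study data
    "gene_a": [5.0, 0.5, 0.2, 0.1],
    "gene_b": [3.0, 2.9, 3.1, 3.0],
    "gene_c": [0.1, 0.2, 4.0, 4.2],
}

hits = rank_by_profile(ccl1_like_reference, profiles)
print(hits)
```

Using an idealised reference vector instead of the measured CCL1 profile is a design choice for the sketch; with real data the measured profile of the seed gene could be passed as `reference` in the same way.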

Figure 5 PMA/CD28-induced CCL1 production is not dependent on the Lck/Cn pathway. Jurkat T cells were stimulated using PMA/CD3 and PMA/CD28 in the presence of Lck (A420983), Cn (CsA) and PKC (AEB071) pathway inhibitors for 24 hours. Dose-response effects of the inhibitors were evaluated on the production of CCL1 after PMA/CD28 stimulation (A) and of IL-2 after PMA/CD3 stimulation (B) in supernatant of the cell cultures. The data are representative of 3 independent experiments. C; Knock-down of Lck and PKCθ resulted in a clear dose-dependent reduction of the protein (500, 100, 20 nM siRNA). 24-hour culture supernatants were collected after stimulation with PMA/CD28 and PMA/CD3 and the effect of knock-down on CCL1 and IL-2, respectively, was determined. Data of two independent experiments are presented as mean + SEM. Significance of differences is indicated by ** p<0.01, * p<0.05, # no difference, using a one-way ANOVA with Bonferroni's post-hoc test. N.B. the values shown in this figure are log2 values, not to be confused with the intensity values shown in Figure 3.


Figure 6 Heatmap of stimulation-dependent gene clusters. A: CCL1 cluster characterized by induction of genes exclusively following CD28/PMA stimulation and subsequent repression by AEB071. B: IL-2 cluster characterized by induction of genes by all stimulations, and down-regulated by A420983 and AEB071 and to a lesser extent by CsA. C: EGR1 cluster characterized by induction of genes by all stimulations, but down-regulated only by AEB071 when stimulated with PMA/CD28 or PMA/CD3. A red color denotes an up-regulation, a green color a down-regulation. The MAPK inhibitors induced very little regulation and have been omitted from this figure for clarity. The gene lists shown in this figure, with extended annotation and their distance to the CCL1, IL-2 and EGR1 profiles, are provided in supplemental files 1, 2 and 3.

Translation of PMA/CD3 and PMA/CD28 stimulations; differential modulation of primary T cell cytokine responses in human donor blood

In order to assess whether the stimulation profiles and signal transduction profiles identified in Jurkat T cells were also relevant in a primary human setting, the stimulation protocols were adapted and a primary assay was established using healthy human donor whole blood. The effect of differential stimulation was evaluated using IFNγ and IL-17 as Th1- and Th17-associated read-outs respectively, and both IL-5 and IL-13 as Th2-associated read-outs. Stimulation of human whole blood cells with CD3/CD28 was unsuccessful: no cytokine release was detected. However, PMA/CD3 stimulation of human blood cells resulted in a high production of IFNγ (Figure 7A). Interestingly, IFNγ production levels were significantly lower after PMA/CD28 stimulation. A similar observation was made when analyzing IL-17 production. Thus, higher production levels of both IFNγ and IL-17 were seen following PMA/CD3 stimulation when compared to PMA/CD28 stimulation. Furthermore, when analyzing Th2-associated IL-5 and IL-13 production, we found that PMA/CD28 stimulation was superior to PMA/CD3 stimulation in enhancing production of these cytokines (Figure 7B). Of note, CCL1 production could not be detected in this assay system (data not shown). In aggregate, the data suggest that PMA/CD28 stimulation favours Th2 responsiveness in this assay.

Since PMA/CD28 signaling was shown to be independent of Lck but mainly dependent on PKCθ, whereas PMA/CD3 signaling was both Lck and PKCθ dependent, we evaluated the effect of inhibiting both proximal kinases in this human whole blood assay, using IFNγ and IL-13 production as read-outs since these cytokines were most readily produced. Figures 7C and 7D show that PMA/CD3-induced IFNγ production indeed depends on both Lck and PKCθ signaling, whereas PMA/CD28-induced IL-13 production is Lck-independent and PKCθ-dependent. These results clearly show that the differential stimulations identified in the Jurkat assay can be translated to a primary human cellular assay and depend on the same proximal signaling hubs. Furthermore, in this setting too, PMA/CD3 stimulation diverges more towards a Th1-like phenotype, whereas PMA/CD28 stimulation skews more towards a Th2-like response.

PMA/CD3 stimulation of purified human CD4+ T cells enhances Th1 activation, whereas PMA/CD28 potentiates Th2 activation

Using purified human CD4+ T cells we validated the effects observed on Jurkat T cells and in a primary human whole blood assay setting. Of all stimuli used, PMA/CD3 appeared to be the most powerful inducer of IFNγ production. Also in this setting, inhibition of either Lck using A-420983 or PKC using AEB071 completely inhibited PMA/CD3-induced IFNγ production. CD3/CD28-mediated stimulation, which can be successfully applied in this assay format, induced IFNγ production that was dependent on both Lck- and PKC-mediated signal transduction pathways. Of interest, and comparable to the effects observed on CCL1 production by Jurkat T cells, Lck inhibition under PMA/CD28 stimulation did not inhibit IFNγ production and even appeared to slightly enhance it (Figure 8A). The observed effect of the different stimuli on IFNγ production is in line with the effects observed on the induction of the Th1 master transcription factor Tbet (Figure 8B). Inhibition of either Lck or PKC reduced CD3/28- and PMA/CD3-mediated induction of Tbet, whereas Lck inhibition did not affect PMA/CD28-induced expression of Tbet. PMA/CD28 was the strongest inducer of IL-13 in CD4+ cells (Figure 8C). Interestingly, IL-13 production under all stimulatory conditions used is dependent on PKC, whereas Lck inhibition does not affect IL-13 production under PMA/CD3 or PMA/CD28 culture conditions. Under all culture conditions inhibition of PKC reduced IL-13, which was paralleled by reduced GATA3 expression (Figure 8D), whereas inhibition of Lck appeared

[Figure 7, panels A-D. Panel A: [IFNγ] and [IL-17] (pg/ml) for control, PMA/CD3 and PMA/CD28 (** p=0.001, * p=0.028). Panel B: [IL-5] and [IL-13] (pg/ml) for control, PMA/CD3 and PMA/CD28 (* p=0.013, * p=0.033). Panels C and D: [IFNγ] and [IL-13] (pg/ml) for control, AEB-071 and A-420983 (significance marks *, *, *, ns).]

Figure 7 PMA/CD3 and PMA/CD28 stimulation differentially modulate primary T cell cytokine responses in human healthy donor blood. Healthy donor blood was stimulated with PMA/CD3 and PMA/CD28 and cultured for 24 hours in the presence or absence of pathway inhibitors for Lck and PKC. Culture supernatants were analyzed for A; IFNγ, IL-17 (Th1/17 cytokines) and B; IL-13 and IL-5 (Th2 cytokines). C; involvement of Lck (A-420983; 1 µM) and PKC (AEB071; 1 µM) signal transduction in PMA/CD3-induced IFNγ production. D; effect of Lck (A-420983; 1 µM) and PKC (AEB071; 1 µM) signal transduction pathways on PMA/CD28-induced IL-13 production. Data are presented as results from 3 donors measured as biological duplicates. Significance of differences is presented as follows: * p<0.05, ** p<0.01 using a one-way ANOVA with Dunnett's post hoc testing.

to promote Th2 development under all stimuli used, which was reflected by enhanced expression of GATA3.

Figure 8 PMA/CD3 and PMA/CD28 effects on purified primary human CD4+ T cells. Purified CD4+ T cells were stimulated with anti-CD3/CD28, PMA/CD3 and PMA/CD28 and cultured for 24 hours in the presence or absence of pathway inhibitors for Lck (A420983; 1 µM) and PKC (AEB071; 1 µM). Culture supernatants were analyzed for IFNγ production (A), mRNA expression of the Th1 master transcription factor T-bet (B), the production of the Th2 cytokine IL-13 (C) and expression of the Th2 master transcription factor GATA3 (D). Messenger RNA expression was normalized against the ribosomal housekeeping gene RPL19. Data are from three independent experiments using purified human CD4+ T cells from three different healthy donors. Data are presented as mean + SD. Significance of differences is indicated by ** p<0.01, * p<0.05, using a Mann-Whitney U-test.

Text mining of IL-2 cluster genes and CCL1 cluster genes

The gene expression study showed that PMA/CD3 induced an IL-2 gene profile which has been associated with a Th1 type of response. Similarly, PMA/CD28 induces a CCL1 gene profile which is associated with a Th2 response. In order to find additional evidence for the Th1 cell specificity of IL-2 and the Th2 cell specificity of CCL1, we used CoPub to search Medline abstracts for other studies which support these results. First we performed a keyword enrichment analysis of the genes that showed a gene expression profile similar to CCL1 (CCL1 cluster) and of the genes that showed an expression profile similar to IL-2 (IL-2 cluster) against a list of cell terms from the CoPub database. The enrichment analyses of the IL-2 cluster and the CCL1 cluster resulted in similar enriched terms such as 'macrophage', 'lymph cell', 'Thymus derived lymphocyte' and 'monocyte' (Table 1). These results are not surprising since both clusters of genes were found to be expressed in Th1 and Th2 cells. Th1 and Th2 are T helper cells which play important roles in the immune system in recruiting other cells. This analysis confirms that these clusters of genes are involved in immune responses.

Table 1 Gene set enrichment analysis of CCL1 cluster and IL2 cluster with cell terms from the CoPub database. P-value < 0.01.

CCL1 cluster                                      IL2 cluster
Cell name                           Nr. of genes  Cell name                           Nr. of genes
macrophage                          16            Lymph cell                          28
epithelial cell                     15            Neoplastic cell                     27
monocyte                            14            Cancer cell                         23
fibroblast                          14            macrophage                          23
Thymus derived lymphocyte           13            fibroblast                          22
Lymph cell                          13            Thymus derived lymphocyte           21
Neoplastic cell                     13            epithelial cell                     21
Cancer cell                         12            endothelial cell                    20
dendritic cell                      12            dendritic cell                      18
hepatocyte                          11            monocyte                            18
endothelial cell                    11            Stem cell                           18
Th2 cell                            10            thymocyte                           16
Keratinocyte                        10            granulocyte                         15
Bursa-equivalent lymphocyte         10            leukocyte                           15
leukocyte                           10            Leukemic cell                       14
Peripheral blood mononuclear cell   9             Bursa-equivalent lymphocyte         12
Antigen-presenting cell             9             Neutrophilic leukocyte              12
mast cell                           9             Hela cell                           11
Neutrophilic leukocyte              9             Myeloid cell                        11
granulocyte                         9             T-lymphocyte, cytotoxic             11
eosinophil                          8             natural killer cell                 11
Effector cell                       8             Peripheral blood mononuclear cell   11
Breast cancer cell                  7             hepatocyte                          11
Th1 cell                            7             plasma cell                         11
adipocyte                           7             stromal cell                        11
splenocyte                          7             Breast cancer cell                  10
thymocyte                           7             splenocyte                          10
T-lymphocyte, cytotoxic             7             Melanoma cell                       10
natural killer cell                 7             Squamous cell                       10
Melanoma cell                       6             hybridoma                           10
Muscle cell                         6             Muscle cell                         10
stromal cell                        6             neuron                              10
Stem cell                           6             Th2 cell                            9
T4 lymphocyte                       5             Th1 cell                            9

Both the CCL1 cluster and the IL-2 cluster are enriched with the terms 'Th1 cell' and 'Th2 cell'. Therefore we studied the relation of each cluster with the Th1 and Th2 cell types in more detail. We created a literature network by searching for co-occurrences between the genes in each cluster and identifying the links of the genes in the network with Th1 and Th2 cells in Medline abstracts. Figure 9A shows the network of the IL-2 cluster. In the network we can identify 7 hub genes, i.e. IL-2, IFNG, IL2RA, GZMB, FASLG, TNFRSF4 and TNFRSF7, which are connected to several other genes in the network. With the exception of GZMB, all these hub genes have connections to both Th1 and Th2. From the network it is also clear that Th1 has a slightly higher connectivity (9 connections) than Th2 (8 connections). Figure 9B shows the network of the CCL1 cluster. In this network too, highly connected genes such as CCL1, CXCL10, CXCL9, GATA3 and ISG20 have connections to both Th1 and Th2. Hence from the literature network, the specificity of the genes for either Th1 or Th2 is not immediately clear. However, in the network (Figure 9B) Th2 has more connections (9 connections) than Th1 (7 connections).
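A keyword enrichment analysis of the kind summarised in Table 1 asks whether a cell term is linked to more genes in a cluster than expected by chance. The statistical test applied by CoPub is not stated in this section, so the sketch below assumes a one-sided hypergeometric test; the function name and all counts are hypothetical.

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes linked to the term,
    n: genes in the cluster,    k: cluster genes linked to the term.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical counts: 1000 background genes, 50 linked to a cell term,
# a cluster of 40 genes of which 10 are linked to that term.
p_value = hypergeom_tail(10, 40, 50, 1000)
print(p_value, p_value < 0.01)
```

With these toy numbers only about 2 linked genes are expected in the cluster, so observing 10 gives a small tail probability. Terms tested in parallel would normally also need a multiple-testing correction, which is left out of the sketch.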


Figure 9 Literature networks of the IL-2 cluster (A) and the CCL1 cluster (B). In both networks the link to Th1 cell and Th2 cell is shown. The thickness of the line between co-occurring genes or between genes and cell terms indicates the strength of the co-occurrence, expressed as the R-scaled score. In the networks IL-2 and CCL1 are highlighted with a red border color.

Conclusions

In this study we systematically explored pathways involved in T cell activation by molecular profiling. We showed that TCR-driven stimulation profiles (both CD3/28 and PMA/CD3) are truly distinct from co-stimulatory profiles mediated via PMA/CD28. Secondly, using selective inhibitors and siRNA we found that the proximal kinase Lck is involved in CD3 and not in PMA/CD28 activation, whereas PKCθ appears to be a crucial central signaling kinase in both TCR and PMA/CD28 (co-)stimulatory activation of T cells. Finally, stimulations involving TCR/CD3 appear to preferentially induce a Th1-like fingerprint, whereas lack of TCR/CD3 signaling in the presence of PMA/CD28 stimulation diverts T cells towards a Th2-like state. It has been suggested that the strength of TCR signaling can regulate the fate determination of naive T cells: high-potency signals skew towards Th1 differentiation, whereas low-potency signals promote Th2 differentiation [6,22]. Although TCR and co-stimulatory pathways have been the focus of many studies in the previous decades, the direct contribution of TCR stimulation vs. co-stimulatory signals towards Th differentiation is not fully understood. By stimulating T cells with PMA/CD3 and PMA/CD28 we dissected signaling pathways and explored the activation profiles. CD3-mediated signaling rapidly increased intracellular Ca2+, a second messenger that activates many enzymes including Calcineurin, which resulted in an increased nuclear translocation of NFATc1. Interestingly, PMA/CD28 stimulation did not result in a Ca2+-mediated response (and was therefore marked as a calcium-independent stimulation) but enhanced many of the co-stimulatory mediators, including MAPK/AP1 and NFκB signal transduction.
These results are in line with earlier studies that showed differential effects of cyclosporine A (CsA) and dexamethasone on CD3- vs CD28-mediated signaling, which revealed that PMA/CD28 stimulation was insensitive to CsA-mediated Calcineurin inhibition, in contrast to PMA/CD3 stimulation [20,23].

Gene expression induced by combinations of stimulatory signals revealed pathway-specific biomarkers or fingerprints. PMA/CD3-induced gene profiles included IL-2, IFNγ, XCL1, granzyme B, and FASL, which have been associated with a Th1 type of response [24,25]. Also, sustained NFAT signaling, which is also induced via PMA/CD3 stimulation, has been shown to promote Th1-like gene transcripts, including IFNγ, FasL and P-selectin glycoprotein ligand 1 [26]. Our results are further substantiated by the finding that T-bet (TBX21), the Th1 master transcription factor [27], and RUNX3, which together with Tbet is crucial for inducing IFNγ and repressing IL-4 [28], were highly expressed under PMA/CD3-stimulatory conditions.

PMA/CD28 stimulation does not induce a Ca2+ flux, nor does it increase nuclear translocation of NFAT. However, it provides the cell with a high level of co-stimulatory signaling and induces a completely distinct genomic fingerprint compared to PMA/CD3 stimulation. Following PMA/CD28 stimulation, Jurkat T cells highly expressed CCL1/I309, a chemokine which is highly expressed during a Th2-eosinophil response in allergic airway diseases [29,30]. Lymphotoxin (LT), a cytokine which is associated with a Th2 type of response controlling IgE production [31], was also highly expressed under PMA/CD28 stimulation. In conjunction with this finding, the master transcription factors for Th2, GATA3 [32] and the Retinoid X Receptor (RXR) [33], were induced under the PMA/CD28 stimulatory condition. Notably, Th2-associated cytokines like IL-4, IL-5 and IL-13 were not induced in Jurkat T cells after PMA/CD28 stimulation, in contrast with PMA/CD28 stimulation of human whole blood and purified CD4+ T cells; this could be due to the developmental blockage of Jurkat T cells. Additional file 6: Figure 2 shows a schematic overview summarizing the involvement of the signaling pathways and genes induced under differential stimulation as observed in this study and highlights their relation to T helper 1 and 2 development.

Our results are in line with the notion that high calcium levels drive Th1 and CTL responses and low calcium levels drive Th2 responses [7,34,35], which was further substantiated by our results using inhibitors for Lck and Cn, which modulate calcium signaling in T cells. These inhibitors repressed Th1-associated genes under PMA/CD3 stimulation, but induced the Th2 transcription factors GATA3 and RXRA, revealing a skewing of Th1 towards Th2 profiles. In contrast, under PMA/CD28 stimulation in the presence of Lck and Cn inhibition, Th2-associated genes, e.g. CCL1 or IL-13 in CD4+ T cells, were not affected or were even induced. The crucial role of calcium and Lck in driving the Th1 response is in line with the observation that knock-down of Lck affects the virus-specific Th1/CTL response in mice and that Lck deficiency increases Th2-associated cytokine production [36,37]. Interestingly, lack of calcium signaling can give rise to an anergic T cell phenotype (reviewed in [38]).
Therefore it would be of interest to explore in more detail the role of Lck in calcium-dependent activation via PMA/CD3 on Th1/CTL responses, and the role of calcium-independent activation of T cells via PMA/CD28 in the induction of anergy.

CD28 signaling has been functionally linked with PKCθ-induced activation of NFκB [39], which was also validated using PMA/CD28 as stimulus [40]. Previously it has been reported that CD28 co-stimulation induces GATA3 expression and Th2 differentiation via the activation of NFκB [41,42]. Additional studies in mice revealed that PKCθ is involved in mounting both Th2- and Th1-mediated lung inflammation, although Th2-mediated inflammation is more PKCθ-dependent [43]. Our studies show that inhibition of PKCθ can indeed inhibit a PMA/CD28 stimulation, which was reflected by the effect of PKCθ inhibition on the PMA/CD28-induced Th2-like gene expression profile. These observations are in line with the results from CD28 knock-out mice and inhibition of CD28 signaling using CTLA4Ig, showing that CD28 co-stimulatory signaling is crucial for mounting a proper Th2 response. In contrast, Th1 and CTL responses were found to be less dependent on CD28 signaling [44,45]. Of interest, PKCθ inhibition in our hands also affected PMA/CD3-induced Th1-like expression profiles. These results underline the duality of PKCθ in the integration of TCR- and CD28-mediated signaling events, which is evident from PKCθ KO mice experiments.

Text mining analysis has been used to find additional evidence for the observed specificity of the IL-2 cluster for Th1 cells and the CCL1 cluster for Th2 cells. Keyword enrichment analysis with both gene clusters did not show this specificity for either Th1 or Th2. However, enrichment analysis resulted in immune-related cells for both gene clusters, thus confirming the experimentally observed results presented in this paper. Literature networks for both the IL-2 and the CCL1 cluster showed that the highly connected genes in each network are connected to both cell types. Although Th1 has a higher connectivity than Th2 in the IL-2 network and Th2 a higher connectivity than Th1 in the CCL1 network, the cell specificity of both clusters is not clearly visible. Moreover, the literature networks show that Th1 cell and Th2 cell are highly connected to each other. The reason that the cell specificity could not be detected using co-occurrence-based text mining could lie in the fact that more than 30% of the abstracts in which Th1 occurs also mention Th2, and vice versa. This makes it impossible to find a specificity of the genes for one of the cell types using co-occurrence-based methods such as CoPub. Hence, in order to detect this type of specificity in text, other types of information extraction methods such as natural language processing (NLP) are needed. NLP is able to identify the type of relation between a gene and a cell type and enables searching for specific relations.

Finally, our results also show that this differential stimulation does not only occur in Jurkat T cells, but also plays a role in primary human T cells. These cells showed a Th1-like response (Tbet-IFNγ) upon PMA/CD3 stimulation, whereas PMA/CD28 stimulation led to a Th2 activation profile (GATA3-IL-5/IL-13).
In these cells inhibition of the Lck/Cn/NFAT pathway was only effective after PMA/CD3 stimulation, whereas inhibition of PKCθ inhibited both PMA/CD3-induced IFNγ production and PMA/CD28-induced IL-13 production. These results illustrate that the findings in the Jurkat T cell line were successfully translated to, and are relevant in, a human primary cellular setting. Interestingly, PMA/CD3 stimulation also enhanced IL-17 production in the primary human whole blood assay and increased the expression of the IL-21 receptor, which is crucial for Th17 induction [46,47], in Jurkat T cells. These results suggest that additional signals, like IL-21 in conjunction with TGFβ and IL-6, might be necessary to differentiate from a Th1-like phenotype towards a Th17 phenotype, whereas the absence of TGFβ in the presence of high levels of IL-2 will favor Treg development or stabilization. Therefore further exploration of these differential stimulations in the presence of defined cytokine stimuli could further elucidate T helper cell differentiation and establish subset-specific genome profiles. The findings described in this paper offer a robust platform for in vitro activation of T cells, in which observed responses can be easily translated from Jurkat T cells to purified CD4+ T cells and even human whole blood. This can be of interest for efficiency and selectivity profiling of kinase inhibitors or for pathway-specific biomarker identification for future drug development and clinical studies.
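The limitation of co-occurrence-based text mining discussed above can be made concrete with a small sketch. It is not the CoPub implementation: abstracts are reduced to hypothetical sets of recognised terms, and the gene list and function names are invented for illustration. The sketch computes how often two cell terms share an abstract and which genes each cell term is connected to.

```python
def cooccurrence_fraction(abstracts, term, other):
    """Of the abstracts mentioning `term`, the fraction that also mention `other`."""
    with_term = [terms for terms in abstracts if term in terms]
    if not with_term:
        return 0.0
    return sum(other in terms for terms in with_term) / len(with_term)

def connections(abstracts, cell_term, genes):
    """Genes that co-occur with `cell_term` in at least one abstract."""
    return {gene for gene in genes
            if any(cell_term in terms and gene in terms for terms in abstracts)}

# Hypothetical abstracts, each reduced to the set of terms found in it.
abstracts = [
    {"Th1", "Th2", "IL2"},
    {"Th1", "IFNG"},
    {"Th2", "GATA3"},
    {"Th1", "Th2", "GATA3"},
    {"Th2", "CCL1"},
]
genes = ["IL2", "IFNG", "GATA3", "CCL1"]

th1_links = connections(abstracts, "Th1", genes)
th2_links = connections(abstracts, "Th2", genes)
print(cooccurrence_fraction(abstracts, "Th1", "Th2"))
print(sorted(th1_links), sorted(th2_links), sorted(th1_links & th2_links))
```

In the toy data every abstract that mentions both cell terms links its genes to both of them, so a gene can end up connected to Th1 and Th2 without the text stating which cell type it is specific for. Separating the two requires knowing the type of relation, which co-occurrence counting does not capture.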

Methods

Compounds

Inhibitors selectively targeting defined pathways used in this study were A-420983 (1 µM; Lck inhibitor) [48], AEB-071 (10 µM; PKCθ inhibitor) [49] and Cyclosporin A (1 µM; Calcineurin inhibitor). Additionally, inhibitors of the MAPK pathway, SP600125 (10 µM; pan JNK), PD98059 (10 µM; MEK1/2) and Org 48762-0 (10 µM; P38) [50], were used. All compounds were dissolved in 100% DMSO. The maximal final concentration of DMSO in the culture assays was 0.1% v/v.

Cell culture

Jurkat E6.2.11 T cells were cultured in DMEM F12 medium (#041-94895 M, Gibco) supplemented with 10% FBS (#10099-141, Invitrogen) and 80 U/ml penicillin/80 µg/ml streptomycin (#15140-122, Gibco). Cells were cultured at concentrations between 1-2 × 10⁵ cells/ml at 37°C/5% CO2. Cells were stimulated for 15 minutes up to 24 hours with anti-CD3 (1 µg/ml, OKT3), anti-CD28 (1 µg/ml, pericluster CD28 #M1456, Sanquin, the Netherlands), PMA (10 ng/ml, Sigma, USA) and ionomycin (1 µg/ml, Sigma, USA), or combinations thereof. For gene expression profiling Jurkat T cells were seeded in T25 culture flasks at a concentration of 1 × 10⁶ cells/ml (1 × 10⁷ cells in total) and cultured overnight at 37°C/5% CO2, one day prior to stimulation. On the day of the experiment cells were preincubated with the compound of interest for 30 minutes, followed by stimulation with either CD3/CD28, PMA/CD28 or PMA/CD3, at concentrations of 10 ng/ml PMA, 1 µg/ml CD3 and 1 µg/ml CD28. Jurkat T cells were cultured in the presence or absence of stimulation for one or eight hours in total, after which the cells were washed in ice-cold PBS. Thereafter cell pellets were collected and snap frozen at -80°C. Cell pellets were stored until further processing.

Isolation and quality check of mRNA

Total RNA was isolated from Jurkat T cells using the RNeasy mini extraction kit (Qiagen #74106) according to the manufacturer’s protocol. RNA was dissolved and diluted in RNAse-free water and the RNA concentration was determined via Nanodrop analysis. The quality of total RNA was evaluated by capillary electrophoresis using an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA). Double-stranded cDNA was synthesized from 1.5 µg total RNA using the One-Cycle Target Labeling Kit (Affymetrix, Santa Clara, CA), and used as a template for the preparation of biotin-labeled cRNA using the GeneChip IVT Labeling Kit (Affymetrix, Santa Clara, CA). Biotin-labeled cRNA was fragmented at 1 µg/µl following the manufacturer’s protocol. After fragmentation, cRNA (10 µg) was hybridized at 45°C for 16-17 hours to the Human Genome U133A 2.0 Array or the Human Genome U133 Plus 2.0 Array (Affymetrix, Santa Clara, CA). Following hybridization, the arrays were washed, stained with phycoerythrin-streptavidin conjugate (Molecular Probes, Eugene, OR), and the signals were amplified by staining the array with biotin-labeled anti-streptavidin antibody (Vector Laboratories, Burlingame, CA) followed by phycoerythrin-streptavidin. The arrays were laser scanned with a GeneChip Scanner 3000 6 G (Affymetrix, Santa Clara, CA) according to the manufacturer’s instructions. Data were saved as raw image files and quantified using GCOS (Affymetrix).

Statistical analysis

The .CEL files were analyzed with R (http://www.r-project.org) and the Bioconductor software package (http://www.bioconductor.org). Normalization was done using gcrma. Building of the experimental design and calculation of the ratios was done with the limma package. Regulated probe sets were selected on the basis of the fold change and the adjusted p-value (Benjamini-Hochberg correction). Multivariate data analysis and clustering were done with standard methods in R. For the principal component analysis and hierarchical clustering, ratio data were used. The ratio data were calculated for each treatment relative to its corresponding control. For the treatment with the stimuli, the untreated cells were taken as the control. For the treatment with stimulus + compound combinations, the treatment with the stimulus alone was taken as the control. Results were expressed as mean ± SEM. Significance of differences was determined using a one-way ANOVA followed by post-hoc testing as indicated. Data sets can be found in GEO (http://www.ncbi.nlm.nih.gov/gds/) under accession number GSE30678.
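
To make the probe set selection step concrete, the following is a minimal Python sketch of a Benjamini-Hochberg adjustment combined with a fold-change filter. It is not the limma code used in this study. The function names and the cut-offs (an absolute log2 ratio of 1 and an adjusted p-value of 0.05) are assumptions chosen for illustration, since the thresholds that were applied are not stated here.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_regulated(log2_ratios: List[float], p_values: List[float],
                     min_abs_log2: float = 1.0,   # assumed cut-off
                     max_adj_p: float = 0.05) -> List[int]:  # assumed cut-off
    """Indices of probe sets passing both the fold-change and adjusted p filter."""
    adjusted = benjamini_hochberg(p_values)
    return [i for i, (ratio, q) in enumerate(zip(log2_ratios, adjusted))
            if abs(ratio) >= min_abs_log2 and q <= max_adj_p]


if __name__ == "__main__":
    ratios = [2.0, 0.2, -1.5, 0.5]          # treatment vs. control, log2 scale
    p_vals = [0.01, 0.04, 0.03, 0.005]
    print(select_regulated(ratios, p_vals))
```

In this toy input only the first and third probe sets pass both filters; the second and fourth have adjusted p-values below 0.05 but too small a fold change.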

FLIPR calcium flux assay

96-well plates were coated with poly-L-lysine in PBS for 1 h at 37°C. Jurkat T cells were seeded at a concentration of 7 × 10⁵ in culture medium and rested for 1 hour at 37°C/5% CO2. Thereafter cells were incubated for 1 hour in the dark with FLIPR calcium buffer, according to the manufacturer’s protocol. Stimuli were added via the Flexstation384 (Molecular Devices, Sunnyvale, USA) and calcium release was monitored over time.

Western blotting and nuclear translocation assay

Cells were washed in ice-cold PBS and pellets were lysed on ice in lysis buffer (Biosource, cat no FNN0011, supplemented with 1 × protease inhibitor cocktail, Roche cat no 11873580051; 1 mM PMSF, Fluka cat no 93482; phosphatase inhibitor cocktail I, Sigma cat no P2850; phosphatase inhibitor cocktail II, Sigma cat no P5726), followed by an incubation for 30 min on ice. The lysates were stored at -80°C until further analysis. Phosphorylation of proteins from stimulated Jurkat cells was evaluated via western blot analysis. Briefly, samples were run on 4-12% NuPage gels (#NP321BOX, Invitrogen) for 35 min at 200 V in 1 × MES buffer (#NP0002, Invitrogen, USA) and subsequently transferred to a PVDF membrane (#162-0184, Biorad). The blots were blocked in PBS/0.05% Tween-20 with 1% skim milk (#232100, Difco) and 1% BSA. Blots were incubated O/N at 4°C in a roller bottle with the primary antibody diluted 1:1000 in block buffer, followed by incubation with a secondary detection antibody. Thereafter blots were incubated in ECL (#RPN2106V1 and RPN2106V2, Amersham Pharmacia) and hyperfilms (#RPN3103K, Amersham Pharmacia) were exposed and developed. For the detection the following antibodies were used: pLck/Src (Tyr416, Cat N° 2101 L), Lck (Cat N° 2752), pZAP70 (Tyr319, Cat N° 2701), pPKCθ (Thr538, Cat N° 9377), pMARCKS (Ser152/156, Cat N° 4273) and ATF-2 (Thr71, Cat N° 9221) (all Cell Signaling Technology, Danvers, USA). pP38 (Thr180/Tyr182, Cat N° 44-684 G), pERK (Thr185/Tyr187, Cat N° 44-680 G), pJNK1/2 (Thr183/Tyr185, Cat N° 44-682 G) and JNK1 (Cat N° 44-690 G) were obtained from Invitrogen (Carlsbad, USA) and c-Jun (pSer73) was obtained from Calbiochem (cat no 420114).
For the analysis of nuclear translocation of the transcription factors NFAT, NFkBp65 and c-JUN, nuclear fractions of activated Jurkat T cells were isolated via hypotonic shock and levels of activated transcription factors in the nuclear lysates were tested in a TransAM transcription factor ELISA according to the manufacturer’s protocol (Active Motif, Carlsbad, USA).

Knock down of PKCθ and Lck in Jurkat T cells

Jurkat T cells (2 × 10⁷ cells/ml) were mock transfected or electroporated (250 V/975 μF) with siRNA targeting Lck (RefSeq accession number NM_005356; sense 5’-GCACGCUGCUCAUCCGAAAdTdT) and PKCθ (RefSeq accession number NM_006257; sense 5’-GCAGCAAUUUCGACAAAGAdTdT) at concentrations of 500, 100 and 20 nM, or with scrambled control siRNA (Thermo Scientific, Dharmacon Inc, Lafayette, USA). Electroporated cells were cultured in M505 supplemented with 10% FCS. Cells were stimulated 72 hours after electroporation and knockdown efficiency of the specified proteins was checked via western blot analysis. Culture supernatants of PMA/CD28- or PMA/CD3-stimulated cells were collected after 24 hours and production of CCL1 and IL-2, respectively, was determined.

Experiments with WBA

Peripheral whole blood was obtained by venipuncture from healthy adult volunteers (male/female) and was collected into lithium-heparinised tubes. The time between puncture and processing was less than 1 hour. Blood was diluted 1:4 with RPMI 1640 (Life Technologies, cat no. 32404-014) supplemented with penicillin/streptomycin (GibcoBRL, cat 15140-122) and 2 mM L-glutamine (GibcoBRL, cat 25030-024) and distributed (200 μl/well) into 96-well plates (Nunc, cat 167008). Blood cultures were stimulated with soluble αCD3/PMA or soluble αCD28/PMA, or left unstimulated. Blood (200 μl/well) in RPMI 1640 medium was incubated with 25 μl compound (maximum of 0.1% v/v DMSO).

Cytokine determination

The cytokines and chemokines IL-2, CCL1/I309 and XCL1, secreted into the supernatant of stimulated Jurkat T cells, were determined via ELISA (R&D Systems, USA). The cytokines IL-17, IFNγ, IL-13 and IL-5, produced by human CD4+ T cells activated in whole blood and cultured in the presence or absence of compounds, were determined in the culture supernatant using a bead-based human cytokine multiplex kit (Bioplex system; Bio-Rad, Veenendaal, The Netherlands) according to the manufacturer’s instructions. Culture supernatants were collected at day 1 of culture. Samples were analyzed using a Luminex-100 analyzer (Luminex, Austin, USA) with Bio-plex Manager Software 3.0 (Bio-Rad). Proteins were discriminated based on the fluorescent label of the bead and the PE levels were corrected for background levels of negative controls. The sensitivity of the cytokine assay was less than 5 pg/ml for all cytokines measured.

cDNA synthesis and q-PCR

Primary human CD4+ T cells were isolated from buffy coats from three healthy donors using MACS negative CD4+ purification technology (Miltenyi Biotec, Germany), yielding an overall 96% pure CD4+ T cell population. RNA from stimulated CD4+ T cells was isolated using an RNeasy minikit (QIAGEN GmbH, Germany). RNA content of samples was analyzed using Nanodrop (Agilent Technologies, CA, USA) and purity was analyzed using the Agilent RNA 6000 nanokit protocol on the RNA nano labchip using the Agilent 2100 bioanalyzer (Agilent Technologies, Waldbronn, Germany). Three micrograms of RNA were used for cDNA synthesis using random hexamer primer mix (Invitrogen), 10 mM dNTP, M-MLV RT buffer and M-MLV reverse transcriptase (Promega). The RT reaction was performed at 42°C for 1 hour followed by deactivation for 5 minutes at 90°C. cDNA of the Th1 master transcription factor Tbx21 (Tbet), the Th2 transcription factor GATA3 or the control household gene RPL19 was amplified using Power SYBR green mastermix (Applied Biosystems, Warrington, UK) and expression was monitored on the ABI Prism 7900 HT sequence detection system (Applied Biosystems, Warrington, UK). Ct values were normalized for the expression of the RPL19 gene.
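
As an illustration of normalizing Ct values to a household gene, the sketch below computes a ΔCt and converts it to a relative expression level. The 2^-ΔCt conversion assumes close to 100% amplification efficiency and is a common convention, not necessarily the exact calculation used here. The function name and the example Ct values are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of a target gene relative to a reference gene.

    Assumes a doubling of product per cycle (about 100% PCR efficiency),
    so a difference of one cycle corresponds to a two-fold difference.
    """
    delta_ct = ct_target - ct_reference   # normalization to the reference gene
    return 2.0 ** (-delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: a target gene at 24.0 and RPL19 at 20.0.
    print(relative_expression(24.0, 20.0))
```

A target that crosses the threshold four cycles later than the reference is, under this assumption, expressed at one sixteenth of the reference level.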

Additional material can be found in the online version of this paper at the publisher’s website or can be obtained on request from the author of this thesis.

Abbreviations

AP1: Activator protein 1; APC: Antigen presenting cell; ATF: Activating transcription factor; CCL1: CC chemokine ligand 1; Cn: Calcineurin; CsA: Cyclosporin A; CTL: Cytotoxic T cell; CTLA4: Cytotoxic lymphocyte antigen 4; ERK: Extracellular-signal-regulated kinase; FASL: TNFSF6 Fas ligand; GATA3: T cell specific transcription factor binding to DNA sequence GATA; IFNγ: Interferon gamma; IL: Interleukin; ITAM: Immuno-tyrosine based activation motif; Itk: IL-2-inducible T cell kinase; JNK: c-JUN N-terminal kinase; Lck: p56 Lymphocyte-specific protein tyrosine kinase; MAPK: Mitogen activated protein kinase; MARCKS: Myristoylated alanine-rich C-kinase substrate; MHC: Major histocompatibility complex; NFAT: Nuclear factor of activated T cells; NFκB: Nuclear factor kappa B; OX40: CD134; P38: p38 Mitogen-activated protein kinase; PCA: Principal component analysis; PDK1: 3-phosphoinositide dependent protein kinase-1; PI3K: Phosphatidylinositol-3-kinase; PIP: Phosphatidylinositolphosphate; PLCγ: Phospholipase C gamma; PKB/AKT: Protein kinase B; PKCθ: Protein kinase C theta; PMA: Phorbol 12-myristate 13-acetate; RUNX3: Runt-related transcription factor 3; RXRA: Retinoid X receptor alpha; T-bet: T-box transcription factor 21 (TBX21); TCR: T cell receptor; Th: T helper cell; TNF: Tumor necrosis factor; ZAP70: Zeta-chain-associated protein 70.

Acknowledgements

The authors would like to thank Hans van de Maaden for performing the PKCθ and Lck knock down experiments and Roselinde van Ravestein-van Os and Roger van de Wetering (Merck Research Laboratories, Oss, the Netherlands) for their contribution to the microarray analyses. WF was supported by a grant from NBIC, the Netherlands.

Authors’ contributions

RLS, WA, SB, PV, XH, JPMK, IJ, AG and AMHB designed research; RLS, FW, MG, HK and WA performed research; RLS, WF, WA, XH, JPMK, IJ, SB and AMHB analyzed data; and RLS, WF, WA, SB, XH, JPMK, IJ and AMHB wrote the paper. All authors read and approved the final manuscript.

Competing interests

The authors of this manuscript have no conflicts of interest to disclose as described by the Journal. The work performed here was funded by NV Organon, now part of MSD, Merck.


References

1. Murphy KM, Reiner SL: The lineage decisions of helper T cells. Nat Rev Immunol 2002, 2:933-944.
2. Szabo SJ, Sullivan BM, Peng SL, Glimcher LH: Molecular mechanisms regulating Th1 immune responses. Annu Rev Immunol 2003, 21:713-758.
3. Dardalhon V, Korn T, Kuchroo VK, Anderson AC: Role of Th1 and Th17 cells in organ-specific autoimmunity. J Autoimmun 2008, 31:252-256.
4. Wan YY, Flavell RA: How diverse-CD4 effector T cells and their functions. J Mol Cell Biol 2009, 1:20-36.
5. Brogdon JL, Leitenberg D, Bottomly K: The potency of TCR signaling differentially regulates NFATc/p activity and early IL-4 transcription in naive CD4+ T cells. J Immunol 2002, 168:3825-3832.
6. Constant SL, Bottomly K: Induction of Th1 and Th2 CD4+ T cell responses: the alternative approaches. Annu Rev Immunol 1997, 15:297-322.
7. Sloan-Lancaster J, Steinberg TH, Allen PM: Selective loss of the calcium ion signaling pathway in T cells maturing toward a T helper 2 phenotype. J Immunol 1997, 159:1160-1168.
8. Tao X, Constant S, Jorritsma P, Bottomly K: Strength of TCR signal determines the costimulatory requirements for Th1 and Th2 CD4+ T cell differentiation. J Immunol 1997, 159:5956-5963.
9. Lenschow DJ, Walunas TL, Bluestone JA: CD28/B7 system of T cell costimulation. Annu Rev Immunol 1996, 14:233-258.
10. Green JM, Noel PJ, Sperling AI, Walunas TL, Gray GS, Bluestone JA, Thompson CB: Absence of B7-dependent responses in CD28-deficient mice. Immunity 1994, 1:501-508.
11. Lenschow DJ, Herold KC, Rhee L, Patel B, Koons A, Qin HY, Fuchs E, Singh B, Thompson CB, Bluestone JA: CD28/B7 regulation of Th1 and Th2 subsets in the development of autoimmune diabetes. Immunity 1996, 5:285-293.
12. Gonzalo JA, Tian J, Delaney T, Corcoran J, Rottman JB, Lora J, Al-garawi A, Kroczek R, Gutierrez-Ramos JC, Coyle AJ: ICOS is critical for T helper cell-mediated lung mucosal inflammatory responses. Nat Immunol 2001, 2:597-604.
13. Salek-Ardakani S, Song J, Halteman BS, Jember AG, Akiba H, Yagita H, Croft M: OX40 (CD134) controls memory T helper 2 cells that drive lung inflammation. J Exp Med 2003, 198:315-324.
14. Palacios EH, Weiss A: Function of the Src-family kinases, Lck and Fyn, in T-cell development and activation. Oncogene 2004, 23:7990-8000.
15. Rudd CE, Taylor A, Schneider H: CD28 and CTLA-4 coreceptor expression and signal transduction. Immunol Rev 2009, 229:12-26.
16. Wang D, Matsumoto R, You Y, Che T, Lin XY, Gaffen SL, Lin X: CD3/CD28 costimulation-induced NF-kappaB activation is mediated by recruitment of protein kinase C-theta, Bcl10, and IkappaB kinase beta to the immunological synapse through CARMA1. Mol Cell Biol 2004, 24:164-171.
17. Wang D, You Y, Case SM, McAllister-Lucas LM, Wang L, DiStefano PS, Nunez G, Bertin J, Lin X: A requirement for CARMA1 in TCR-induced NF-kappa B activation. Nat Immunol 2002, 3:830-835.
18. Yokosuka T, Kobayashi W, Sakata-Sogawa K, Takamatsu M, Hashimoto-Tane A, Dustin ML, Tokunaga M, Saito T: Spatiotemporal regulation of T cell costimulation by TCR-CD28 microclusters and protein kinase C theta translocation. Immunity 2008, 29:589-601.
19. Dong C, Davis RJ, Flavell RA: MAP kinases in the immune response. Annu Rev Immunol 2002, 20:55-72.
20. June CH, Ledbetter JA, Gillespie MM, Lindsten T, Thompson CB: T-cell proliferation involving the CD28 pathway is associated with cyclosporine-resistant interleukin 2 gene expression. Mol Cell Biol 1987, 7:4472-4481.
21. Ledbetter JA, Parsons M, Martin PJ, Hansen JA, Rabinovitch PS, June CH: Antibody binding to CD5 (Tp67) and Tp44 T cell surface molecules: effects on cyclic nucleotides, cytoplasmic free calcium, and cAMP-mediated suppression. J Immunol 1986, 137:3299-3305.
22. Badou A, Savignac M, Moreau M, Leclerc C, Foucras G, Cassar G, Paulet P, Lagrange D, Druet P, Guery JC, Pelletier L: Weak TCR stimulation induces a calcium signal that triggers IL-4 synthesis, stronger TCR stimulation induces MAP kinases that control IFN-gamma production. Eur J Immunol 2001, 31:2487-2496.
23. Furue M, Ishibashi Y: Differential regulation by dexamethasone and cyclosporine of human T cells activated by various stimuli. Transplantation 1991, 52:522-526.
24. Mosmann TR, Sad S: The expanding universe of T-cell subsets: Th1, Th2 and more. Immunol Today 1996, 17:138-146.
25. Muller K, Bischof S, Sommer F, Lohoff M, Solbach W, Laskay T: Differential production of macrophage inflammatory protein 1gamma (MIP-1gamma), lymphotactin, and MIP-2 by CD4(+) Th subsets polarized in vitro and in vivo. Infect Immun 2003, 71:6178-6183.
26. Porter CM, Clipstone NA: Sustained NFAT signaling promotes a Th1-like pattern of gene expression in primary murine CD4+ T cells. J Immunol 2002, 168:4936-4945.
27. Szabo SJ, Kim ST, Costa GL, Zhang X, Fathman CG, Glimcher LH: A novel transcription factor, T-bet, directs Th1 lineage commitment. Cell 2000, 100:655-669.
28. Djuretic IM, Levanon D, Negreanu V, Groner Y, Rao A, Ansel KM: Transcription factors T-bet and Runx3 cooperate to activate Ifng and silence Il4 in T helper type 1 cells. Nat Immunol 2007, 8:145-153.
29. Bishop B, Lloyd CM: CC chemokine ligand 1 promotes recruitment of eosinophils but not Th2 cells during the development of allergic airways disease. J Immunol 2003, 170:4810-4817.
30. Romagnani S: Cytokines and chemoattractants in allergic inflammation. Mol Immunol 2002, 38:881-885.
31. Kang HS, Blink SE, Chin RK, Lee Y, Kim O, Weinstock J, Waldschmidt T, Conrad D, Chen B, Solway J, et al: Lymphotoxin is required for maintaining physiological levels of serum IgE that minimizes Th1-mediated airway inflammation. J Exp Med 2003, 198:1643-1652.
32. Zheng W, Flavell RA: The transcription factor GATA-3 is necessary and sufficient for Th2 cytokine gene expression in CD4 T cells. Cell 1997, 89:587-596.
33. Stephensen CB, Rasooly R, Jiang X, Ceddia MA, Weaver CT, Chandraratna RA, Bucy RP: Vitamin A enhances in vitro Th2 development via retinoid X receptor pathway. J Immunol 2002, 168:4495-4503.
34. Badou A, Bennasser Y, Moreau M, Leclerc C, Benkirane M, Bahraoui E: Tat protein of human immunodeficiency virus type 1 induces interleukin-10 in human peripheral blood monocytes: implication of protein kinase C-dependent pathway. J Virol 2000, 74:10551-10562.
35. Gajewski TF, Lancki DW, Stack R, Fitch FW: “Anergy” of TH0 helper T lymphocytes induces downregulation of TH1 characteristics and a transition to a TH2-like phenotype. J Exp Med 1994, 179:481-491.
36. Al-Ramadi BK, Nakamura T, Leitenberg D, Bothwell AL: Deficient expression of p56(lck) in Th2 cells leads to partial TCR signaling and a dysregulation in lymphokine mRNA levels. J Immunol 1996, 157:4751-4761.
37. Molina TJ, Bachmann MF, Kundig TM, Zinkernagel RM, Mak TW: Peripheral T cells in mice lacking p56lck do not express significant antiviral effector functions. J Immunol 1993, 151:699-706.
38. Baine I, Abe BT, Macian F: Regulation of T-cell tolerance by calcium/NFAT signaling. Immunol Rev 2009, 231:225-240.
39. Takeda K, Harada Y, Watanabe R, Inutake Y, Ogawa S, Onuki K, Kagaya S, Tanabe K, Kishimoto H, Abe R: CD28 stimulation triggers NF-kappaB activation through the CARMA1-PKCtheta-Grb2/Gads axis. Int Immunol 2008, 20:1507-1515.
40. Matsumoto R, Wang D, Blonska M, Li H, Kobayashi M, Pappu B, Chen Y, Lin X: Phosphorylation of CARMA1 plays a critical role in T Cell receptor-mediated NF-kappaB activation. Immunity 2005, 23:575-585.
41. Das J, Chen CH, Yang L, Cohn L, Ray P, Ray A: A critical role for NF-kappa B in GATA3 expression and TH2 differentiation in allergic airway inflammation. Nat Immunol 2001, 2:45-50.
42. Rodriguez-Palmero M, Hara T, Thumbs A, Hunig T: Triggering of T cell proliferation through CD28 induces GATA-3 and promotes T helper type 2 differentiation in vitro and in vivo. Eur J Immunol 1999, 29:3914-3924.
43. Salek-Ardakani S, So T, Halteman BS, Altman A, Croft M: Differential regulation of Th2 and Th1 lung inflammatory responses by protein kinase C theta. J Immunol 2004, 173:6440-6447.
44. Keane-Myers A, Gause WC, Linsley PS, Chen SJ, Wills-Karp M: B7-CD28/CTLA-4 costimulatory pathways are required for the development of T helper cell 2-mediated allergic airway responses to inhaled antigens. J Immunol 1997, 158:2042-2049.
45. Shahinian A, Pfeffer K, Lee KP, Kundig TM, Kishihara K, Wakeham A, Kawai K, Ohashi PS, Thompson CB, Mak TW: Differential T cell costimulatory requirements in CD28-deficient mice. Science 1993, 261:609-612.
46. Korn T, Bettelli E, Gao W, Awasthi A, Jager A, Strom TB, Oukka M, Kuchroo VK: IL-21 initiates an alternative pathway to induce proinflammatory T(H)17 cells. Nature 2007, 448:484-487.
47. Yang L, Anderson DE, Baecher-Allan C, Hastings WD, Bettelli E, Oukka M, Kuchroo VK, Hafler DA: IL-21 and TGF-beta are required for differentiation of human T(H)17 cells. Nature 2008, 454:350-352.
48. Borhani DW, Calderwood DJ, Friedman MM, Hirst GC, Li B, Leung AK, McRae B, Ratnofsky S, Ritter K, Waegell W: A-420983: a potent, orally active inhibitor of lck with efficacy in a model of transplant rejection. Bioorg Med Chem Lett 2004, 14:2613-2616.
49. Evenou JP, Wagner J, Zenke G, Brinkmann V, Wagner K, Kovarik J, Welzenbach KA, Weitz-Schmidt G, Guntermann C, Towbin H, et al: The potent protein kinase C-selective inhibitor AEB071 (sotrastaurin) represents a new class of immunosuppressive agents affecting early T cell activation. J Pharmacol Exp Ther 2009, 330:792-801.
50. Mihara K, Almansa C, Smeets RL, Loomans EE, Dulos J, Vink PM, Rooseboom M, Kreutzer H, Cavalcanti F, Boots AM, Nelissen RL: A potent and selective p38 inhibitor protects against bone damage in murine collagen-induced arthritis: a comparison with neutralization of mouse TNFalpha. Br J Pharmacol 2008, 154:153-164.

CHAPTER 8 Functional annotation of microorganisms using text mining

Wilco W.M. Fleuren1, Marijn Sanders2, Jos Boekhorst3,4, Jacob de Vlieg1,2 and Wynand Alkema1,4.

1 CDD, CMBI, NCMLS, Radboud University Medical Centre, Nijmegen, The Netherlands. 2 Netherlands eScience Center, Amsterdam, The Netherlands. 3 Bacterial genomics, CMBI, NCMLS, Radboud University Medical Centre, Nijmegen, The Netherlands. 4 NIZO Food Research BV, Ede, The Netherlands.

In preparation


Abstract

Background
Metagenomics by means of 16S rRNA sequencing allows the identification of microbial communities that are directly retrieved from environmental samples. This technique is used in so-called community profiling studies and results in lists of organisms that are affected by a certain treatment or process. Knowledge of the biological properties of these organisms is important to understand their role in these processes. This knowledge is for a large part contained in the scientific literature. Because these organism lists contain dozens of organisms, retrieving the biological properties from the literature is labor-intensive and therefore requires methods that can do this automatically.

Method
To this end we developed a method that automatically retrieves biological properties of microorganisms from Medline abstracts.

Results
We analyzed lists of microorganisms derived from metagenomics experiments that were aimed at the characterization of microbiota in female urine and the characterization of the oral fungal microbiome in healthy individuals. With the organisms from both lists, we searched for associations with diseases in Medline abstracts. Several specific disease associations were found, indicating the pathogenic potential of these microorganisms. Furthermore, we automatically identified biological properties and industrial applications for groups of lactic acid bacteria (LAB). We identified a cluster of LAB with opposite roles in the process of wine making and a cluster of LAB that are involved in food spoilage as well as in fermentation processes to make sauerkraut and kimchi.

Conclusions
Automatic information extraction on microorganisms gives rapid insight into the pathogenic potential of microorganisms by linking them to diseases in Medline abstracts. Furthermore, word clouds of highly occurring keywords in these abstracts give an overview of the biological properties of LAB and the processes in which they play a role. This overview can help to select LAB for fermentation processes in order to improve the characteristics of fermented products.

Keywords: Microorganism annotation, text mining, community profiling, metagenomics data, lactic acid bacteria.

Background

Microorganisms play crucial roles in a number of biological processes, ranging from the decomposition of waste products [1, 2] to industrial processes where they are used for the development of fermented foods and beverages [3-5]. Furthermore, pathogenic microorganisms such as Mycobacterium tuberculosis, Salmonella and Clostridium tetani can be harmful, causing tuberculosis, food poisoning and tetanus, respectively. In metagenomics studies, microbial DNA is directly extracted from environmental samples and sequenced by means of 16S rRNA sequencing. This technique is well established for studying the dynamics of microbial communities. One of the outcomes of these so-called community profiling studies is a list of microorganisms that are affected by a given process or treatment. Insight into the biological properties of these microorganisms is essential to better understand how they are involved in these processes. Because these organism lists contain dozens of microorganisms, retrieval of the biological properties of these organisms from the scientific literature is labor-intensive and requires methods that can do this retrieval automatically. Publications about microorganisms are growing exponentially [6] and literature databases such as the Medline database have over 1.5 million abstracts related to microbiota and microorganisms (Table 1).

Table 1 Number of abstracts found for each term when searching in Pubmed (till December 2012).

Search term                                                             Number of hits found
microbiota OR microbiome OR microflora OR bacteria OR microorganisms    1581863
Salmonella                                                              72488
Mycobacterium tuberculosis                                              42774
Lactococcus lactis                                                      4299
Streptococcus thermophilus                                              1135
Clostridium tetani                                                      995
Oenococcus oeni                                                         293

Depending on the microorganism, the number of abstracts ranges from a few hundred to tens of thousands (Table 1). This indicates that the Medline database is a suitable source for the retrieval of biological properties of microorganisms. In earlier work we reported on CoPub [7], a publicly available text mining system that has successfully been used for the analysis of gene expression data by automatically searching for co-occurrences between genes and pathways, biological processes, diseases and drug terms in Medline abstracts [8-11].

Here we report the development of a tool that identifies organism names in any kind of text and maps these to NCBI taxonomy ids. With this tool, analogous to the gene list based text mining described for CoPub, lists of microorganisms can be annotated with important biological properties of these microorganisms and the processes in which they are involved. The abstracts in which these organisms are mentioned are subsequently analyzed in two ways. The first method consists of searching for significant occurrences of disease and pathway terms from thesauri in these Medline abstracts. In the second method, highly occurring informative keywords are identified in these abstracts and presented as a word cloud. We show that with both methods we can annotate lists of microorganisms from community profiling studies, find evidence about the pathogenic potential of these microorganisms, and predict biological functions for single lactic acid bacteria and for clusters of lactic acid bacteria. Thus text mining is a valuable approach to quickly retrieve biological properties of microorganisms from text and can help experts to gain insight into how these microorganisms are involved in the studied processes.

Materials and Methods

Detection of organism names in Medline abstracts

A set of regular expressions was implemented in Perl and used to tag organism names in Medline abstracts. Subsequently, for organisms with a known taxonomy identifier in the NCBI Taxonomy Database [12], the hits in the Medline abstracts were stored in a database. This resulted in one or more organism hits per Medline abstract.
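
The tagging step can be illustrated with a small sketch. The tagger described above was written in Perl; the version below uses Python instead and is not the original implementation. The mini-thesaurus, the function names and the matching rules (case-insensitive, flexible whitespace) are assumptions, and the taxonomy ids shown are illustrative values that should be verified against the NCBI Taxonomy Database.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mini-thesaurus: organism name or synonym -> taxonomy id.
# Ids are illustrative; verify against the NCBI Taxonomy Database before use.
NAME_TO_TAXID: Dict[str, int] = {
    "Oenococcus oeni": 1247,
    "Leuconostoc oenos": 1247,   # former name of Oenococcus oeni
    "Lactococcus lactis": 1358,
}


def build_pattern(names):
    """Compile one alternation over all names, longest name first."""
    ordered = sorted(names, key=len, reverse=True)
    # Allow any run of whitespace between the words of a name.
    alternatives = [r"\s+".join(re.escape(part) for part in name.split())
                    for name in ordered]
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def tag_organisms(text: str) -> List[Tuple[str, int]]:
    """Return (matched text, taxonomy id) for every organism name in text."""
    pattern = build_pattern(NAME_TO_TAXID)
    lookup = {name.lower(): taxid for name, taxid in NAME_TO_TAXID.items()}
    hits = []
    for match in pattern.finditer(text):
        # Normalize case and whitespace before looking up the identifier.
        normalized = " ".join(match.group(0).lower().split())
        hits.append((match.group(0), lookup[normalized]))
    return hits


if __name__ == "__main__":
    abstract = "Malolactic fermentation by Leuconostoc oenos and Oenococcus oeni."
    for name, taxid in tag_organisms(abstract):
        print(name, taxid)
```

Because synonyms map to the same identifier, an abstract that only uses a former name is still counted as a hit for the current organism name.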

Calculation of significant co-occurrences

The significance of a co-occurrence between two organisms, or between an organism and a disease term from the controlled vocabulary [7], was determined by calculating a p-value using the R implementation of Fisher’s exact test. We considered co-occurrences with a p-value < 0.05 significant. Co-occurrence matrices were clustered with the R package pvclust using the “complete” setting for hierarchical clustering. For further analysis sub clusters were created (tree cut-off: 0.9).
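
For readers unfamiliar with the test, the sketch below computes a Fisher’s exact test p-value for a 2 × 2 co-occurrence table using only the Python standard library. It is a stand-in for the R function, not the code used in this study. It implements the one-sided (over-representation) alternative, which is an assumption: the alternative that was used is not stated here, and the R default is two-sided. The function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(both: int, only_a: int, only_b: int, neither: int) -> float:
    """One-sided Fisher's exact test p-value for over-representation.

    both    : abstracts mentioning term A and term B
    only_a  : abstracts mentioning A but not B
    only_b  : abstracts mentioning B but not A
    neither : abstracts mentioning neither term
    """
    n_a = both + only_a                      # all abstracts with A
    n_b = both + only_b                      # all abstracts with B
    total = both + only_a + only_b + neither
    denominator = comb(total, n_b)
    p_value = 0.0
    # Sum the hypergeometric probabilities of the observed overlap or larger.
    for k in range(both, min(n_a, n_b) + 1):
        p_value += comb(n_a, k) * comb(total - n_a, n_b - k) / denominator
    return p_value


if __name__ == "__main__":
    # Hypothetical counts for an organism and a disease term.
    p = fisher_exact_greater(both=12, only_a=88, only_b=40, neither=9860)
    print(p < 0.05)
```

With the 0.05 threshold mentioned above, a pair would be called significant when the returned value is below 0.05.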

Selection of organism lists from community profiling studies

Bacteria identified in female urine were selected using Table S1 from the paper by Siddiqui et al. [13]. Fungi identified in oral rinse samples were selected from Table 3 in combination with fungi from Supplementary table 3 for which the genera were presented in Figure 3 in the paper by Ghannoum et al. [14].

Generation of word clouds of highly occurring keywords

The set of abstracts for a given organism (organism_abstracts) was used to generate word clouds for the organism of interest. Keywords with fewer than 5 characters and keywords occurring in < 2% of the abstracts were discarded. The significance of occurrence of a keyword in the organism_abstracts was determined by calculating the relative keyword frequency of the keyword in organism_abstracts and the relative keyword frequency of the keyword in a background set. The ratio of these relative frequencies was used to rank the keywords. Word clouds were generated using the R package “wordcloud”.
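
The keyword ranking can be sketched as follows. This is an illustrative Python version, not the original implementation. It assumes that the relative keyword frequency is the fraction of abstracts in which a keyword occurs, and it assigns keywords absent from the background a small floor frequency to avoid division by zero. Both choices, as well as the function names and the toy abstracts, are assumptions.

```python
import re
from typing import Dict, List, Tuple


def document_frequency(abstracts: List[str]) -> Dict[str, float]:
    """Fraction of abstracts in which each lower-cased word occurs."""
    counts: Dict[str, int] = {}
    for abstract in abstracts:
        # Count each word at most once per abstract.
        for word in set(re.findall(r"[a-z]+", abstract.lower())):
            counts[word] = counts.get(word, 0) + 1
    return {word: count / len(abstracts) for word, count in counts.items()}


def rank_keywords(organism_abstracts: List[str],
                  background_abstracts: List[str],
                  min_length: int = 5,
                  min_fraction: float = 0.02) -> List[Tuple[str, float]]:
    """Rank keywords by foreground / background relative frequency."""
    foreground = document_frequency(organism_abstracts)
    background = document_frequency(background_abstracts)
    # Assumed floor for keywords that never occur in the background set.
    floor = 1.0 / (len(background_abstracts) + 1)
    ranked = [(word, freq / background.get(word, floor))
              for word, freq in foreground.items()
              if len(word) >= min_length and freq >= min_fraction]
    # Highest ratio first; ties are broken alphabetically.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    organism = ["malolactic fermentation in wine",
                "wine malolactic bacteria",
                "bacteria in wine"]
    background = ["bacteria cause disease", "wine and bacteria",
                  "disease in humans", "bacteria in soil"]
    for word, ratio in rank_keywords(organism, background):
        print(word, round(ratio, 2))
```

The resulting ratios could then be passed as weights to a word cloud renderer, so that keywords that are specific for the organism are drawn larger than keywords that are common throughout the background set.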

Results

Text mining vs. manual searching in Pubmed In order to retrieve information about microorganisms from Medline abstracts we automatically tagged 3502122 unique organism names in these abstracts. We were interested to see whether tagging of organism names, called tagging hits, yielded good results in comparison with when searching for these organism names in Pubmed, referred to as Pubmed hits. For 100 organisms we compared the number of tagging hits with the number of Pubmed hits. The results are shown in Figure 1. For about 40 % of the organisms the number of tagging hits was similar to the number of Pubmed hits. Differences between both search strategies can be explained by that fact there is a discrepancy of about 5% between the Medline database in Pubmed and the Medline database as provided to licensees for download, in which the tagging hits were found. Moreover, Pubmed hits were also found in Mesh Terms (Medical Subject Headings from controlled vocabulary thesaurus used for indexing articles for Pubmed) of abstracts, this in contrast to the tagging hits for which we only searched in the title and the abstract. For example for Clostridium tetani several tagging hits [15-19] were not retrieved, because the organism name is only used in the Mesh Term header. For 35 out of the 100 organisms more tagging hits were retrieved in comparison with Pubmed hits. This difference can be explained by the fact that tagging hits were found by searching with all names and synonyms of the particular organism, while Pubmed hits were found by searching only with one name or synonym. For example in the case of Oenococcus oeni, tagging hits also included abstracts in which Leuconostoc oenos (the previous name of Oenococcus oeni), was found [20, 21]. Similarly for Lactococcus lactis subsp. cremoris additional tagging hits were found with its synonym Streptococcus cremoris [22, 23]. 
On the other hand, for Leuconostoc, Moraxella, Listeria, Neisseria and Salmonella more Pubmed hits were retrieved. All these organisms are classified as genus in the NCBI taxonomy database, and the Pubmed hits also included abstracts about organisms that belong to these genera, e.g. Leuconostoc mesenteroides. Tagging hits for Leuconostoc resulted in abstracts about Leuconostoc and not in abstracts about organisms that belong to this genus. These latter abstracts were mapped to the specific organisms. Taken together, these results show that tagging of organism names yielded good results in comparison with searching for these organism names in Pubmed and outperforms Pubmed searches for certain microorganisms.
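The effect of searching with all names and synonyms can be illustrated with a small Python sketch of dictionary-based tagging. This is a simplified illustration, not the tagger used in this study: the dictionary contains only the two synonym pairs mentioned above, and the function `build_tagger` is a hypothetical name.

```python
import re

# Hypothetical excerpt of a synonym dictionary: canonical name -> all known names
SYNONYMS = {
    "Oenococcus oeni": ["Oenococcus oeni", "Leuconostoc oenos"],
    "Lactococcus lactis subsp. cremoris": [
        "Lactococcus lactis subsp. cremoris",
        "Streptococcus cremoris",
    ],
}


def build_tagger(synonyms):
    """Return a function that maps a text to the set of canonical organism names."""
    lookup = {
        name.lower(): canonical
        for canonical, names in synonyms.items()
        for name in names
    }
    # Longest names first, so a full name wins over a shorter name it contains
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in alternatives) + r")(?!\w)",
        re.IGNORECASE,
    )

    def tag(text):
        return {lookup[match.group(0).lower()] for match in pattern.finditer(text)}

    return tag


tag = build_tagger(SYNONYMS)
hits = tag("Medium for screening Leuconostoc oenos strains")
```

An abstract that mentions only the older name Leuconostoc oenos is thus still assigned to Oenococcus oeni, whereas a query with a single name would miss it.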

Figure 1 Retrieval of the number of abstracts for 100 microorganisms. The x-axis represents the number of Pubmed hits, the y-axis represents the number of tagging hits. Ten percent of the microorganisms for which the most tagging hits were found are shown in green. Ten percent of the microorganisms for which the most Pubmed hits were found are shown in red. Tagging of organism names was done on Medline abstracts up to October 2012. Pubmed queries were done on abstracts up to the end of November 2012.

Annotation of metagenomics experiments

Metagenomics and community profiling studies often result in lists of microorganisms. To see if we could automatically find biologically meaningful descriptions in Medline abstracts, we used tagging to retrieve these descriptions for two sets of microorganisms: one from a study aimed at the characterization of microbiota in urine from healthy females [13] and one from a study aimed at the characterization of the oral fungal microbiome in healthy individuals [14]. For both studies we were interested to see if we could find evidence in the literature about the pathogenic potential of these microorganisms.

Annotation of organisms derived from female urine samples

Although urine within the urinary tract is generally considered “sterile”, several microorganisms with a pathogenic potential were found in these samples [13]. To see if we could find more specific evidence to support these observations, we searched for associations between the organisms and disease terms in Medline abstracts (Table 2). Similar to the observations and manual annotations by Siddiqui et al., we found disease terms related to infection or inflammation of the urinary tract, e.g. urinary tract infection [24-26] and urethritis (inflammation of the urethra) [27, 28], and several diseases associated with female urogenital pathology, e.g. salpingitis (infection and inflammation of the fallopian tubes) [29], endometritis (infection of the inner lining of the uterus) [29-31] and pelvic inflammatory disease (inflammation of the uterus, fallopian tubes and/or ovaries) [29, 32]. In addition we found associations with intestine-related diseases, such as inflammatory bowel disease [33, 34], Crohn’s disease [35, 36], irritable bowel syndrome [37] and gastroenteritis [38], and with necrotizing fasciitis [39, 40]. These results show that tagging in combination with thesaurus-based matching is as effective as manual annotation.

Table 2 Associations between organisms from urine samples of healthy volunteers and disease terms found in the literature. NA indicates no association was found.

Tax ID | Organism | Disease term (p-value)
165188 | uncultured Megasphaera sp. | NA
28130 | Prevotella disiens | pelvic inflammatory disease (3.12e-11); cancer (2.38e-06); cellulitis (0.000152); abscess (0.000311); Endometritis (0.00066); furunculosis (0.0112); carcinoma (0.0178); salpingitis (0.019); bacterial infection (0.0419)
1376 | Aerococcus urinae | urinary tract infection (5.16e-26); balanitis (3.17e-05); prostatic disease (0.0152)
361505 | Anaerococcus sp. gpac155 | NA
119206 | Aerococcus sanguinicola | urinary tract infection (3.36e-06); prostatic disease (0.00147)
322095 | Porphyromonas somerae | NA
159274 | uncultured Porphyromonas sp. | NA
844 | Wolinella succinogenes | NA
174293 | uncultured candidate division OP11 bacterium | NA
189716 | Flexistipes sp. E3_33 | NA
33043 | Coprococcus eutactus | irritable bowel syndrome (0.00404); inflammatory bowel disease (0.0124)
172733 | uncultured Clostridiales bacterium | NA
38303 | Corynebacterium pseudogenitalium | urinary tract infection (0.00257); corneal ulcer (0.00683); Endometritis (0.00783); urethritis (0.0121); Nosocomial infections (0.0274)
208548 | uncultured Neisseriaceae bacterium | NA
361501 | Peptoniphilus sp. gpac121 | NA
47770 | Lactobacillus crispatus | urinary tract infection (0.000916); gonorrhea (0.00145); trichomonas infection (0.00598); inflammation (0.0378)
221218 | uncultured candidate division OD1 bacterium | NA
202789 | Actinobaculum massiliense | urinary tract infection (0.000289)
76517 | Campylobacter hominis | Gastrointestinal disease (0.00578); disseminated intravascular coagulation (0.00912); gastroenteritis (0.0135); periodontal disease (0.0164)
195049 | uncultured Clostridiaceae bacterium | NA
163550 | Eubacterium sp. oral clone BU061 | NA
179000 | uncultured Methylophilus sp. | NA
54007 | Anaerococcus octavius | NA
827 | Bacteroides ureolyticus | urethritis (5.13e-31); gastroenteritis (0.000701); abscess (0.00217); brain abscess (0.00444); Gastrointestinal disease (0.0125); periodontal disease (0.0127)
158755 | uncultured Cytophagales bacterium | NA
178214 | Facklamia hominis | NA
87541 | Aerococcus christensenii | urinary tract infection (0.0276)
181675 | Lactobacillus coleohominis | Crohn's disease (0.0393)
339338 | uncultured Allisonella sp. | NA
187319 | uncultured Fibrobacteres bacterium | NA
33012 | Propionimicrobium lymphophilum | NA
359408 | Methylotenera mobilis | NA
254354 | uncultured Peptoniphilus sp. | NA
166587 | uncultured Chloroflexi bacterium | NA
159272 | uncultured Prevotella sp. | NA
159268 | uncultured Veillonella sp. | NA
293422 | uncultured Eggerthella sp. | NA
853 | Faecalibacterium prausnitzii | Crohn's disease (1.68e-21); inflammatory bowel disease (1.09e-10); cancer (1.69e-10); colitis ulcerative (6.08e-08); irritable bowel syndrome (0.0205); colon cancer (0.045)
28128 | Prevotella corporis | NA
1260 | Finegoldia magna | Acne (4.03e-09); fasciitis necrotizing (0.000368); Endometritis (0.000552); fasciitis (0.00331); abscess (0.00371); Respiratory tract infection (0.00869)
293428 | uncultured Anaerococcus sp. | NA
59505 | Actinobaculum schaalii | urinary tract infection (1.69e-21); Fournier's gangrene (0.00546); bacteremia (0.00618); kidney calculi (0.0298)
169971 | uncultured Peptostreptococcus sp. | NA
211450 | uncultured Pelobacter sp. | NA
171953 | uncultured Acidobacteria bacterium | NA
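To make concrete how a p-value for an organism-disease association can be derived from abstract counts, the following Python sketch uses a one-sided hypergeometric test on co-occurrence counts. The statistical test is not restated in this section, so the choice of test is an assumption made for illustration only, and the function name and its arguments are hypothetical.

```python
from math import comb


def cooccurrence_p_value(n_total, n_organism, n_term, n_both):
    """P(X >= n_both) for X ~ Hypergeometric(n_total, n_organism, n_term).

    n_total:    number of abstracts in the collection
    n_organism: abstracts in which the organism is tagged
    n_term:     abstracts in which the disease term is tagged
    n_both:     abstracts in which both are tagged
    """
    upper = min(n_organism, n_term)
    numerator = sum(
        comb(n_organism, k) * comb(n_total - n_organism, n_term - k)
        for k in range(n_both, upper + 1)
    )
    return numerator / comb(n_total, n_term)


# Small worked case: 10 abstracts, 5 mention the organism, 5 mention the
# disease term, and all 5 overlap. Only one of the C(10, 5) = 252 possible
# selections gives this overlap, so the p-value is 1/252.
p_value = cooccurrence_p_value(10, 5, 5, 5)
```

A small p-value indicates that the organism and the disease term are mentioned together more often than expected by chance. Whether the association is biologically meaningful still has to be checked in the underlying abstracts.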

Annotation of the fungal microbiome in healthy individuals

Next we were interested to see if we could also find evidence about the pathogenic potential of fungi identified in oral rinse samples from healthy individuals. We searched for associations between these fungi and diseases (Table 3). A number of diseases related to the oral cavity and respiratory system were found, such as candidiasis [41-43], aspergillosis [44-47], asthma [48, 49], rhinitis [50, 51] and respiratory tract disease [52]. Several of these pathogenic fungi are also able to infect other parts of the body, such as the skin [53] and the vagina [54, 55].

The risk of infection by these pathogenic fungi increases especially in individuals with a compromised immune system, for instance in cancer, leukemia and diabetes mellitus patients [56-58]. Typically these patients stay in a hospital environment, which also increases the risk of nosocomial infections [59]. As indicated by the authors of the study [14], the presence of potentially pathogenic fungi in healthy individuals could be the first step towards infection by them. We demonstrated that tagging in combination with thesaurus-based matching can be used to gain insight into specific diseases that could be caused by these fungi.

Table 3 Associations between fungi from oral rinse samples of healthy volunteers and disease terms found in the literature. NA indicates no association was found.

Tax ID | Organism | Disease term (p-value)
41959 | aspergillus penicillioides | NA
5516 | fusarium culmorum | NA
5507 | fusarium oxysporum | cancer (5.24e-235); leukemia (3.29e-17); keratitis (3.4e-13)
41960 | aspergillus caesiellus | NA
5074 | penicillium brevicompactum | NA
92950 | cladosporium sphaerospermum | carcinoma (0.0189); Respiratory tract disease (0.0316)
119927 | alternaria tenuissima | Iga deficiency (0.0166); retroperitoneal fibrosis (0.0213)
4897 | schizosaccharomyces japonicus | NA
4956 | zygosaccharomyces rouxii | NA
279121 | ophiostoma pulvinisporum | NA
4896 | schizosaccharomyces pombe | cancer (0); starvation (6.13e-79); Hypovolemic shock (5.47e-14)
4932 | saccharomyces cerevisiae | cancer (0); starvation (1.26e-243); Hypovolemic shock (9.17e-66)
415599 | glomus fulvum | NA
5480 | candida parapsilosis | candidiasis (9e-305); cancer (7.29e-231); Mycoses (4.9e-39); onychomycosis (4.73e-36); Nosocomial infections (7.65e-21); fever (3.86e-13); neutropenia (6.19e-11); diabetes mellitus (3.81e-10); Immunologic deficiency syndrome acquired (2.06e-08); Systemic inflammatory response syndrome (0.035)
4931 | saccharomyces bayanus | NA
69773 | penicillium glabrum | hypersensitivity (6.88e-05); pneumonia (0.00391)
29917 | cladosporium cladosporioides | sick building syndrome (3.66e-08); keratitis (0.00582); allergy (0.00625)
5580 | aureobasidium pullulans | corneal ulcer (0.000127); fever (0.016); asthma (0.0304)
5062 | aspergillus oryzae | Hypovolemic shock (3.27e-09)
70808 | cladosporium tenuissimum | NA
396024 | aspergillus ruber | hypersensitivity immediate (0.00323)
216836 | alternaria triticina | NA
749552 | phoma foveata | NA
62271 | saccharomyces ellipsoideus | NA
5059 | aspergillus flavus | cancer (2.46e-266); aspergillosis (5.85e-222); keratitis (1.79e-12); leukemia (7.53e-10); sinusitis (1.05e-07); pneumonia (1.69e-05)
89919 | cryptococcus diffluens | dermatitis (2.34e-06); cryptococcosis (0.0102); candidiasis (0.033)
227369 | zygosaccharomyces pseudorouxii | NA
260535 | candida khmerensis | NA
89918 | cryptococcus cellulolyticus | NA
104300 | ophiostoma floccosum | NA
29918 | cladosporium herbarum | allergy (5.7e-46); asthma (2.29e-20); rhinitis (2.94e-12); eczema (1.49e-07); sick building syndrome (3.75e-06); hypersensitivity immediate (0.00056); hypersensitivity (0.00108); aspergillosis (0.00316)
531293 | phoma plurivora | NA
273372 | candida metapsilosis | Nosocomial infections (0.000623); candidiasis (0.00141)
5476 | candida albicans | candidiasis (0); cancer (0); inflammation (2.81e-233); Mycoses (1.32e-151); pneumonia (3.47e-51); Immunologic deficiency syndrome acquired (5.37e-22); Systemic inflammatory response syndrome (9e-05)
5054 | eurotium amstelodami | Lung disease (5.09e-08)
5482 | candida tropicalis | cancer (3.27e-204); candidiasis (3.48e-197); neutropenia (3.82e-10); fever (3.12e-09); Immunologic deficiency syndrome acquired (4.59e-08); leukemia (8.16e-07)
27381 | glomus mosseae | NA
63822 | penicillium spinulosum | NA
36050 | fusarium poae | NA

Finding relevant information for lactic acid bacteria

We showed that automatically linking microorganisms to keywords in thesauri is as effective as manual annotation. However, many properties of microorganisms are described in abstracts with keywords that are not present in thesauri. We were interested to see if we could automatically retrieve these keywords. Furthermore, we evaluated whether presenting these keywords in word clouds provides sufficient information about the biological properties of microorganisms.

For Streptococcus thermophilus, a lactic acid bacterium widely used in the dairy industry, we identified 976 abstracts. We created a word cloud of highly occurring keywords in these abstracts (Figure 2). The word cloud reveals the role of Streptococcus thermophilus in fermentation processes, where it is used as a starter culture for the preparation of yoghurt [60, 61]. In cheese, Streptococcus thermophilus is mainly responsible for the acidification, which is important for the ripening and the characteristics of cheese [62, 63]. Additionally, some Streptococcus thermophilus strains produce exopolysaccharides (EPS), which are important ingredients controlling the texture of yoghurt and cheese and have been suggested to be beneficial for the treatment of chronic gastritis [64]. These results show that the word cloud, constructed from keywords retrieved from abstracts about Streptococcus thermophilus, gives an overview of the industrial processes and applications in which Streptococcus thermophilus is involved. Besides keywords that give insight into the applications and the biological properties of Streptococcus thermophilus, the word cloud also contains (parts of) names of other bacteria, e.g. bifidobacterium, Lactobacillus acidophilus and Lactobacillus delbrueckii subsp. bulgaricus. This indicates that Streptococcus thermophilus is functionally related to these LAB in fermentation processes.

Figure 2 Word cloud of highly occurring keywords in Medline abstracts about Streptococcus thermophilus, describing biological properties and applications of Streptococcus thermophilus.

Microbial community literature study of LAB

To better understand how these LAB are related to each other and how they are jointly involved in biological processes, we searched the literature for evidence about which LAB are often mentioned together and in which processes and applications. We started with Lactobacillus plantarum, Streptococcus thermophilus and Oenococcus oeni and retrieved the organisms with which they co-occur. To see if taxonomic differences are also reflected in the scientific literature, we also included the non-LAB Bacillus subtilis. For each of these four bacteria we obtained a list of the most significant co-occurring organisms (p-value < 0.05). For Bacillus subtilis we found the highest number of co-occurring organisms, 328 organisms in 8963 abstracts, and for Oenococcus oeni the lowest number, 17 organisms in 133 abstracts. Subsequently, we ranked the organisms in each list by p-value and retrieved the top-ranked co-occurring organisms from each list. This yielded a unique set of 51 organisms (organisms classified as ‘superkingdom’ or ‘genus’ were discarded). To find out how these 51 organisms are related to each other, we determined whether they co-occur in Medline abstracts. Based on these co-occurrences, the organisms were clustered (Figure 3). Bacillus subtilis, which is not classified as a LAB, clusters separately from the LAB. This indicates that the taxonomic difference between Bacillus subtilis and LAB is also reflected in the literature context. Furthermore, Lactobacillus plantarum, Streptococcus thermophilus and Oenococcus oeni, with which we started our search, cluster in separate sub-clusters, each with similar LAB.

Figure 3 Hierarchical tree of LAB related microorganisms. For each cluster (designated with a character), a word cloud of highly occurring terms in the abstracts in which organisms in the cluster are mentioned is shown (indicated with the corresponding character).
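Clustering organisms on the basis of their co-occurrence in abstracts can be sketched in Python as follows. The distance measure and linkage method are not specified in this section. As an assumption for illustration, the sketch uses the Jaccard distance between the sets of abstracts in which two organisms are mentioned and a simple greedy average-linkage procedure. The function names and the abstract identifiers are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of abstract identifiers."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 1.0


def cluster_organisms(abstract_ids, n_clusters):
    """Greedy average-linkage clustering of organisms by shared abstracts."""
    clusters = [[name] for name in sorted(abstract_ids)]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(jaccard_distance(abstract_ids[x], abstract_ids[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two closest clusters (i < j, so deleting j keeps i valid)
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up abstract identifiers, for illustration only
abstract_ids = {
    "Oenococcus oeni": {1, 2, 3},
    "Pediococcus damnosus": {2, 3, 4},
    "Streptococcus thermophilus": {10, 11, 12},
    "Lactobacillus delbrueckii subsp. bulgaricus": {10, 11, 13},
    "Bacillus subtilis": {20, 21},
}
clusters = cluster_organisms(abstract_ids, n_clusters=3)
```

In this toy input the two wine-related organisms and the two yoghurt-related organisms share abstracts and are merged, while Bacillus subtilis remains a cluster of its own.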

To see how LAB that clustered together are functionally related to each other, we created word clouds based on the abstracts connected to these clusters.

Determining the biological properties of groups of LAB

To retrieve the processes and applications for LAB that clustered together, we first divided the tree in Figure 3 into 9 clusters. For each cluster we retrieved the abstracts in which the microorganisms from the cluster are mentioned. This yielded between 46 and 14582 abstracts per cluster. We used these abstracts to generate a word cloud of highly occurring keywords (Figure 3).

Cluster i in Figure 3 consists of microorganisms that are involved in the wine making process, but with different functions. Oenococcus oeni is a malolactic starter for the wine making process, giving the wine its characteristic taste and aroma [65]. However, glucan-producing strains of Pediococcus damnosus are considered wine spoilers [66]. Similarly, Pediococcus parvulus is considered unwanted in wine because of its ability to cause ropiness [67].

Cluster h (Figure 3) consists of LAB that are used for the preparation of yoghurt and cheese [68]. Many food fermentations are performed using mixed cultures of LAB. For example, yoghurt is produced by the symbiotic growth of Streptococcus thermophilus and Lactobacillus delbrueckii subsp. bulgaricus [60]. Cluster h also contains Lactobacillus acidophilus and several bifidobacterium species that are used as probiotics [69]. Some dairy products that are made using Streptococcus thermophilus serve as a carrier for these probiotic bacteria [70, 71]. However, Streptococcus thermophilus itself is not used as a probiotic, since it does not survive the stomach acid in healthy humans.

Cluster e (Figure 3) consists of microorganisms that are known on the one hand to cause food spoilage and on the other hand for their use in the production of fermented products. Leuconostoc mesenteroides causes spoilage of blood sausage and vacuum-packed meat [72-74], but is also used for improving the texture of sourdough [75] and for the fermentation of sauerkraut [76]. Similarly, Lactobacillus brevis is used for the fermentation of kimchi (a fermented cabbage product originating from Korea) [77], but is also a known beer spoiler [78]. These apparently opposite functions can be explained by the fact that the effects are caused by different strains of the microorganism. This analysis shows that it is important to provide the underlying literature in which these effects are described, to gain insight into which strain of the microorganism causes the effect and in which context this occurs.

Additionally, several bacteriocins, e.g. nisin, bavaricin, lactococcin and mundticin, are shown in the word clouds. Bacteriocins are small proteins produced by bacteria to inhibit the growth of similar or closely related bacteria. Bacteriocins from LAB are often used to increase the shelf life of fermented products. Examples are nisin (produced by Lactococcus lactis) against Staphylococcus aureus (a food spoiler) and lactococcin for the conservation of cheese and raw milk [79-81]. These bacteriocins give insight into the mechanisms by which LAB are used for the conservation of food products. These results show that word clouds of keywords for consortia of LAB give insight into which LAB are functionally related and for which processes and products they are important.

Discussion and conclusions

The presented strategy for the retrieval of biological information for microorganisms is, to the best of our knowledge, the first approach that uses automatic information extraction from scientific literature for this purpose. A number of methods exist that are able to automatically tag organism names in text, but these methods do not provide an extensive analysis of this text in order to retrieve functional keywords that describe the biological properties of these organisms [82, 83]. Some studies include text mining to extract organism-related information from scientific literature, but only for specific model organisms [84, 85], or they focus on species-specific extraction of genetic entities [86]. Our tagging method can be improved by also searching in MeSH terms. Because MeSH terms are partly manually curated, this once again illustrates that expert curation together with text mining is a good combination for finding information in the literature.

We showed that automatic information extraction from Medline abstracts gives rapid insight into the biology of microorganisms, which can be used to better understand how these microorganisms are involved in the studied processes. For two lists of organisms derived from healthy individuals we found evidence about the pathogenic potential of these microorganisms by finding specific disease associations. This analysis showed that with text mining we are able to retrieve results similar to manual annotation, and that we are able to retrieve additional associations with other diseases. Linking the microorganisms to a controlled vocabulary of drug terms would make it possible to study the relation between the microorganisms found and known therapeutic treatments. We also showed that text mining without using thesauri yields highly occurring keywords from abstracts in which LAB are mentioned. By presenting these keywords in word clouds, we get an overview of the biological properties of LAB and the industrial processes in which they are used. The knowledge provided by these keywords can be used to make informed decisions in selecting LAB for fermentation processes and for finding new applications for them.

By using the NCBI taxonomy database we are limited to microorganisms in this database. Many specific bacterial strains are not present in the taxonomy database, but are still detected in the abstracts using our regular expressions. These bacterial strain hits are mapped to the most closely related bacterium that is present in the NCBI taxonomy database. This means that a search for abstracts in which Streptococcus thermophilus is mentioned can also return abstracts describing a specific strain of Streptococcus thermophilus that is not present in the NCBI taxonomy database. Therefore careful interpretation of the subsequently presented keywords is required, since these keywords could also be related to specific bacterial strains. Although the Medline database covers over 1.5 million abstracts related to microbiology, the database is focused on biomedical research. To increase coverage of literature in which food-related studies are described, other literature sources such as the FSTA database [87] could be used.
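The mapping of strain hits to the most closely related taxonomy entry, as described above, can be illustrated with a short Python sketch. It assumes that the closest entry is the longest taxonomy name that forms a prefix of the detected strain mention. This is a simplification for illustration and not necessarily the rule used by our regular expressions, and the function name and the name list are hypothetical.

```python
def closest_taxonomy_entry(mention, taxonomy_names):
    """Return the longest taxonomy name that is a whole-word prefix of the mention.

    Returns None when no taxonomy name matches.
    """
    mention_lower = mention.lower()
    best = None
    for name in taxonomy_names:
        lowered = name.lower()
        if mention_lower == lowered or mention_lower.startswith(lowered + " "):
            if best is None or len(name) > len(best):
                best = name
    return best


# Hypothetical excerpt of the taxonomy name list
taxonomy_names = ["Streptococcus", "Streptococcus thermophilus"]

# The strain designation is not in the name list, so the hit is mapped
# to the species-level entry.
entry = closest_taxonomy_entry("Streptococcus thermophilus CRL 1190", taxonomy_names)
```

Because the strain designation is lost in this mapping, keywords retrieved for a species may in fact describe a property of a single strain.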

Acknowledgements

Author WF was supported by the Biorange project (BR4.2) “A Systems Bioinformatics Approach For Evaluating And Translating Drug-Target Effects In Disease Related Pathways” of NBIC when this work was initiated. From 1 October 2012, WF is supported by the Netherlands eScience project (027.011.306) “Creation of food specific ontologies for food focused text mining and knowledge management studies”.

References

1. Leahy, J.G. and R.R. Colwell, Microbial degradation of hydrocarbons in the environment. Microbiol Rev, 1990. 54(3): p. 305-15.
2. Martinkova, L., et al., Biodegradation potential of the genus Rhodococcus. Environ Int, 2009. 35(1): p. 162-77.
3. St-Gelais, D., et al., Production of fresh Cheddar cheese curds with controlled postacidification and enhanced flavor. J Dairy Sci, 2009. 92(5): p. 1856-63.
4. Horiuchi, H. and Y. Sasaki, Short communication: effect of oxygen on symbiosis between Lactobacillus bulgaricus and Streptococcus thermophilus. J Dairy Sci, 2012. 95(6): p. 2904-9.
5. Sisto, A. and P. Lavermicocca, Suitability of a probiotic Lactobacillus paracasei strain as a starter culture in olive fermentation and development of the innovative patented product "probiotic table olives". Front Microbiol, 2012. 3: p. 174.
6. Collison, M., et al., Data mining the human gut microbiota for therapeutic targets. Brief Bioinform, 2012. 13(6): p. 751-68.
7. Fleuren, W.W., et al., CoPub update: CoPub 5.0 a text mining system to answer biological questions. Nucleic Acids Res, 2011. 39(Web Server issue): p. W450-4.
8. Frijters, R., et al., Prednisolone-induced differential gene expression in mouse liver carrying wild type or a dimerization-defective glucocorticoid receptor. BMC Genomics, 2010. 11: p. 359.
9. Toonen, E.J., et al., Prednisolone-induced changes in gene-expression profiles in healthy volunteers. Pharmacogenomics, 2011. 12(7): p. 985-98.
10. Merkl, M., et al., Microarray analysis of equine endometrium at days 8 and 12 of pregnancy. Biol Reprod, 2010. 83(5): p. 874-86.
11. Friberg, P.A., D.G. Larsson, and H. Billig, Transcriptional effects of progesterone receptor antagonist in rat granulosa cells. Mol Cell Endocrinol, 2010. 315(1-2): p. 121-30.
12. Sayers, E.W., et al., Database resources of the National Center for Biotechnology Information. Nucleic Acids Res, 2009. 37(Database issue): p. D5-15.
13. Siddiqui, H., et al., Assessing diversity of the female urine microbiota by high throughput sequencing of 16S rDNA amplicons. BMC Microbiol, 2011. 11: p. 244.
14. Ghannoum, M.A., et al., Characterization of the oral fungal microbiome (mycobiome) in healthy individuals. PLoS Pathog, 2010. 6(1): p. e1000713.
15. Hope, V.D., et al., A decade of spore-forming bacterial infections among European injecting drug users: pronounced regional variation. Am J Public Health, 2012. 102(1): p. 122-5.
16. Zielinski, A., [Tetanus in Poland in 2009]. Przegl Epidemiol, 2011. 65(2): p. 271-2.
17. Chapter 2-12-4. Anaerobic infections (individual fields): tetanus. J Infect Chemother, 2011. 17 Suppl 1: p. 125-32.
18. Langner, K.F., et al., Localised tetanus in a cat. Vet Rec, 2011. 169(5): p. 126.
19. Za'tsev, E.M., et al., [Monitoring of antibodies against diphtheria, tetanus and pertussis in pregnant women]. Zh Mikrobiol Epidemiol Immunobiol, 2010(1): p. 32-5.
20. Ramos, A., et al., Enzyme Basis for pH Regulation of Citrate and Pyruvate Metabolism by Leuconostoc oenos. Appl Environ Microbiol, 1995. 61(4): p. 1303-10.
21. Cavin, J.F., et al., Medium for Screening Leuconostoc oenos Strains Defective in Malolactic Fermentation. Appl Environ Microbiol, 1989. 55(3): p. 751-3.
22. van der Lelie, D., J.M. van der Vossen, and G. Venema, Effect of Plasmid Incompatibility on DNA Transfer to Streptococcus cremoris. Appl Environ Microbiol, 1988. 54(4): p. 865-71.
23. Inamine, J.M., L.N. Lee, and D.J. LeBlanc, Molecular and genetic characterization of lactose-metabolic genes of Streptococcus cremoris. J Bacteriol, 1986. 167(3): p. 855-62.
24. Greub, G. and D. Raoult, "Actinobaculum massiliae," a new species causing chronic urinary tract infection. J Clin Microbiol, 2002. 40(11): p. 3938-41.
25. Haller, P., et al., Vertebral osteomyelitis caused by Actinobaculum schaalii: a difficult-to-diagnose and potentially invasive uropathogen. Eur J Clin Microbiol Infect Dis, 2007. 26(9): p. 667-70.
26. Bank, S., et al., Actinobaculum schaalii, a common uropathogen in elderly patients, Denmark. Emerg Infect Dis, 2010. 16(1): p. 76-80.
27. Mazuecos, J., et al., Anaerobic bacteria in men with urethritis. J Eur Acad Dermatol Venereol, 1998. 10(3): p. 237-42.
28. Woolley, P.D., Anaerobic bacteria and non-gonococcal urethritis. Int J STD AIDS, 2000. 11(6): p. 347-8.
29. Brook, I., Microbiology and management of polymicrobial female genital tract infections in adolescents. J Pediatr Adolesc Gynecol, 2002. 15(4): p. 217-26.
30. Topiel, M.S. and G.L. Simon, Peptococcaceae bacteremia. Diagn Microbiol Infect Dis, 1986. 4(2): p. 109-17.
31. Mikamo, H., et al., In vitro and in vivo antibacterial activities of biapenem in the fields of obstetrics and gynecology. Chemotherapy, 2000. 46(2): p. 95-9.
32. Crombleholme, W.R., et al., Ampicillin/sulbactam versus metronidazole-gentamicin in the treatment of soft tissue pelvic infections. Am J Obstet Gynecol, 1987. 156(2): p. 507-12.
33. Swidsinski, A., et al., Active Crohn's disease and ulcerative colitis can be specifically diagnosed and monitored based on the biostructure of the fecal flora. Inflamm Bowel Dis, 2008. 14(2): p. 147-61.
34. Khan, M.T., et al., How Can Faecalibacterium prausnitzii Employ Riboflavin for Extracellular Electron Transfer? Antioxid Redox Signal, 2012. 17(10): p. 1433-40.
35. Schwiertz, A., et al., Microbiota in pediatric inflammatory bowel disease. J Pediatr, 2010. 157(2): p. 240-244 e1.
36. Willing, B., et al., Twin studies reveal specific imbalances in the mucosa-associated microbiota of patients with ileal Crohn's disease. Inflamm Bowel Dis, 2009. 15(5): p. 653-60.
37. Malinen, E., et al., Association of symptoms with gastrointestinal microbiota in irritable bowel syndrome. World J Gastroenterol, 2010. 16(36): p. 4532-40.
38. Lawson, A.J., D. Linton, and J. Stanley, 16S rRNA gene sequences of 'Candidatus Campylobacter hominis', a novel uncultivated species, are found in the gastrointestinal tract of healthy humans. Microbiology, 1998. 144 (Pt 8): p. 2063-71.
39. Lee, S., et al., A case of necrotizing fasciitis due to Streptococcus agalactiae, Arcanobacterium haemolyticum, and Finegoldia magna in a dog-bitten patient with diabetes. Korean J Lab Med, 2008. 28(3): p. 191-5.
40. Brook, I., Aerobic and anaerobic microbiology of necrotizing fasciitis in children. Pediatr Dermatol, 1996. 13(4): p. 281-4.
41. Oliveira, M.A., et al., Microbiological and immunological features of oral candidiasis. Microbiol Immunol, 2007. 51(8): p. 713-9.
42. Belazi, M., et al., Oral Candida isolates in patients undergoing radiotherapy for head and neck cancer: prevalence, azole susceptibility profiles and response to antifungal treatment. Oral Microbiol Immunol, 2004. 19(6): p. 347-51.
43. Nakajima, J., et al., Effect of a novel phyto-compound on mucosal candidiasis: further evidence from an ex vivo study. J Dig Dis, 2007. 8(1): p. 48-51.
44. Cho, H., et al., Invasive oral aspergillosis in a patient with acute myeloid leukaemia. Aust Dent J, 2010. 55(2): p. 214-8.
45. Correa, M.E., et al., Primary aspergillosis affecting the tongue of a leukemic patient. Oral Dis, 2003. 9(1): p. 49-53.
46. Bor, O., et al., Successful treatment of tongue aspergillosis caused by Aspergillus flavus with liposomal amphotericin B in a child with acute lymphoblastic leukemia. Med Mycol, 2006. 44(8): p. 767-70.
47. Alp, S. and S. Arikan, Investigation of extracellular elastase, acid proteinase and phospholipase activities as putative virulence factors in clinical isolates of Aspergillus species. J Basic Microbiol, 2008. 48(5): p. 331-7.
48. Jaakkola, M.S., A. Ieromnimon, and J.J. Jaakkola, Are atopy and specific IgE to mites and molds important for adult asthma? J Allergy Clin Immunol, 2006. 117(3): p. 642-8.
49. Niedoszytko, M., et al., Association between sensitization to Aureobasidium pullulans (Pullularia sp) and severity of asthma. Ann Allergy Asthma Immunol, 2007. 98(2): p. 153-6.
50. Rid, R., et al., Isolation and immunological characterization of a novel Cladosporium herbarum allergen structurally homologous to the alpha/beta hydrolase fold superfamily. Mol Immunol, 2010. 47(6): p. 1366-77.
51. Modrzynski, M. and E. Zawisza, Specific nasal provocation tests in patients hypersensitive to mould allergens. Med Sci Monit, 2005. 11(1): p. CR44-8.
52. Ng, K.P., et al., Sequencing of Cladosporium sphaerospermum, a Dematiaceous fungus isolated from blood culture. Eukaryot Cell, 2012. 11(5): p. 705-6.
53. Sugita, T., et al., The basidiomycetous yeasts Cryptococcus diffluens and C. liquefaciens colonize the skin of patients with atopic dermatitis. Microbiol Immunol, 2003. 47(12): p. 945-50.
54. Mahmoudi Rad, M., et al., The epidemiology of Candida species associated with vulvovaginal candidiasis in an Iranian patient population. Eur J Obstet Gynecol Reprod Biol, 2011. 155(2): p. 199-203.
55. Ahmad, A. and A.U. Khan, Prevalence of Candida species and potential risk factors for vulvovaginal candidiasis in Aligarh, India. Eur J Obstet Gynecol Reprod Biol, 2009. 144(1): p. 68-71.
56. Michael, R.C., et al., Mycological profile of fungal sinusitis: An audit of specimens over a 7-year period in a tertiary care hospital in Tamil Nadu. Indian J Pathol Microbiol, 2008. 51(4): p. 493-6.
57. Schelenz, S., et al., Epidemiology of oral yeast colonization and infection in patients with hematological malignancies, head neck and solid tumors. J Oral Pathol Med, 2011. 40(1): p. 83-9.
58. Rothe, A., et al., Combination therapy of disseminated Fusarium oxysporum infection with terbinafine and amphotericin B. Ann Hematol, 2004. 83(6): p. 394-7.
59. Sabino, R., et al., Isolates from hospital environments are the most virulent of the Candida parapsilosis complex. BMC Microbiol, 2011. 11: p. 180.
60. Sieuwerts, S., et al., Mixed-culture transcriptome analysis reveals the molecular basis of mixed-culture growth in Streptococcus thermophilus and Lactobacillus bulgaricus. Appl Environ Microbiol, 2010. 76(23): p. 7775-84.
61. Ebel, B., et al., Use of gases to improve survival of Bifidobacterium bifidum by modifying redox potential in fermented milk. J Dairy Sci, 2011. 94(5): p. 2185-91.
62. Milesi, M.M., et al., Two strains of nonstarter lactobacilli increased the production of flavor compounds in soft cheeses. J Dairy Sci, 2010. 93(11): p. 5020-31.
63. Charlet, M., et al., Multiple interactions between Streptococcus thermophilus, Lactobacillus helveticus and Lactobacillus delbrueckii strongly affect their growth kinetics during the making of hard cooked cheeses. Int J Food Microbiol, 2009. 131(1): p. 10-9.
64. Rodriguez, C., et al., Therapeutic effect of Streptococcus thermophilus CRL 1190-fermented milk on chronic gastritis. World J Gastroenterol, 2010. 16(13): p. 1622-30.
65. Bloem, A., A. Lonvaud-Funel, and G. de Revel, Hydrolysis of glycosidically bound flavour compounds from oak wood by Oenococcus oeni. Food Microbiol, 2008. 25(1): p. 99-104.
66. Gindreau, E., E. Walling, and A. Lonvaud-Funel, Direct polymerase chain reaction detection of ropy Pediococcus damnosus strains in wine. J Appl Microbiol, 2001. 90(4): p. 535-42.
67. Dols-Lafargue, M., et al., Characterization of gtf, a glucosyltransferase gene in the genomes of Pediococcus parvulus and Oenococcus oeni, two bacterial species commonly found in wine. Appl Environ Microbiol, 2008. 74(13): p. 4079-90.
68. Rossetti, L., et al., Grana Padano cheese whey starters: microbial composition and strain distribution. Int J Food Microbiol, 2008. 127(1-2): p. 168-71.
69. de Vrese, M. and J. Schrezenmeir, Probiotics, prebiotics, and synbiotics. Adv Biochem Eng Biotechnol, 2008. 111: p. 1-66.
70. Ashraf, R. and N.P. Shah, Selective and differential enumerations of Lactobacillus delbrueckii subsp. bulgaricus, Streptococcus thermophilus, Lactobacillus acidophilus, Lactobacillus casei and Bifidobacterium spp. in yoghurt--a review. Int J Food Microbiol, 2011. 149(3): p. 194-208.
71. Choi, S.C., et al., Probiotic Fermented Milk Containing Dietary Fiber Has Additive Effects in IBS with Constipation Compared to Plain Probiotic Fermented Milk. Gut Liver, 2011. 5(1): p. 22-8.
72. Diez, A.M., I. Jaime, and J. Rovira, The influence of different preservation methods on spoilage bacteria populations inoculated in morcilla de Burgos during anaerobic cold storage. Int J Food Microbiol, 2009. 132(2-3): p. 91-9.
73. Chenoll, E., et al., Lactic acid bacteria associated with vacuum-packed cooked meat product spoilage: population analysis by rDNA-based methods. J Appl Microbiol, 2007. 102(2): p. 498-508.
74. Schirmer, B.C., E. Heir, and S. Langsrud, Characterization of the bacterial spoilage flora in marinated pork products. J Appl Microbiol, 2009. 106(6): p. 2106-16.

75. Bounaix, M.S., et al., Characterization of glucan-producing Leuconostoc strains isolated from sourdough. Int J Food Microbiol, 2010. 144(1): p. 1-9. 76. Kim, S.Y., et al., Monitoring of Leuconostoc population during sauerkraut fermentation by quantitative real-time polymerase chain reaction. J Microbiol Biotechnol, 2011. 21(10): p. 1069-72. 77. Islam, S.M., et al., Organophosphorus hydrolase (OpdB) of Lactobacillus brevis WCP902 from kimchi is able to degrade organophosphorus pesticides. J Agric Food Chem, 2010. 58(9): p. 5380-6. 78. Sakamoto, K. and W.N. Konings, Beer spoilage bacteria and hop resistance. Int J Food Microbiol, 2003. 89(2-3): p. 105-24.

79. Sudagidan, M. and A. Yemenicioglu, Effects of Nisin and Lysozyme on Growth Inhibition and Biofilm Formation Capacity of Staphylococcus aureus Strains Isolated from Raw Milk and Cheese Samples. J Food Prot, 2012. 75(9): p. 1627-33. 80. Zottola, E.A., et al., Utilization of cheddar cheese containing nisin as an antimicrobial agent in other foods. Int J Food Microbiol, 1994. 24(1-2): p. 227-38. 81. Alegria, A., et al., Bacteriocins produced by wild Lactococcus lactis strains isolated from traditional, starter-free cheeses made of raw milk. Int J Food Microbiol, 2010. 143(1-2): p. 61-6. 82. Naderi, N., et al., OrganismTagger: detection, normalization and grounding of organism entities in biomedical documents. Bioinformatics, 2011. 27(19): p. 2721-9. 83. Rebholz-Schuhmann, D., et al., Text processing through Web services: calling Whatizit. Bioinformatics, 2008. 24(2): p. 296-8. 84. Van Auken, K., et al., Text mining in the biocuration workflow: applications for literature curation at WormBase, dictyBase and TAIR. Database (Oxford), 2012. 2012(0): p. bas040. 85. Pillai, L., et al., Developing a biocuration workflow for AgBase, a non-model organism database. Database (Oxford), 2012. 2012: p. bas038. 86. McQuilton, P., Opportunities for text mining in the FlyBase genetic literature curation workflow. Database (Oxford), 2012. 2012(0): p. bas039. 87. FSTA, Food Science and Technology database. http://www.ifis.org/fsta/, -.

CHAPTER 9 General discussion


Background

In this thesis, the application of text mining (TM) and automatic information extraction (IE) to the study of drug effects and microorganisms was presented. In drug discovery, the effects of a drug on gene expression are studied in genome-wide expression profiling studies. This often results in hundreds of regulated genes that need to be analyzed. Similarly, microorganisms are studied in community profiling studies, where metagenomics is used to identify microorganisms that are affected by a process or treatment, resulting in tens to hundreds of organisms that are affected by the experimental conditions. In both research disciplines, annotation of the results from these studies is crucial in order to draw conclusions and to initiate follow-up research. The Medline database contains millions of abstracts and literature citations that researchers use to keep up to date with state-of-the-art developments in their research area. For this reason, the Medline database is very suitable for the annotation of biological experiments, because it can be used to find what is already known in the literature about the genes and microorganisms of interest. The number of scientific papers in the Medline database grows exponentially. Therefore, TM and IE are needed to automatically extract useful information from these papers.

The concept of high-throughput screening

Literature studies to get insight into the developments of a specific research area involve manual selection and careful reading of interesting papers. Depending on the topic and how much has been written about it, this can take weeks. With the rapid accumulation of new scientific papers, it is very likely that the amount of time needed to read and select useful information from scientific literature will only increase in the future. Moreover, manual searching has the drawback of not being all-inclusive, so valuable information may be missed. In drug discovery, high-throughput screening (HTS) is a widely used technique to rapidly screen the activity of large compound libraries to identify compounds that change the activity of a molecular target. The concept of HTS has been used by the cheminformatics community to perform virtual screening to identify bioactive compounds through computational means [1]. Similarly, the concept of HTS can be applied to automatically screen literature for interesting information using computational methods (Figure 1). HTS of literature using TM and IE involves selection of papers about a certain topic, extraction of useful information from these papers and subsequent presentation of this information in such a way that it can be used by scientists. HTS of literature saves time since it automatically reduces the large amount of literature to a subset of papers that needs to be analyzed. The exact screening method and subsequent selection of relevant papers depend on the research question. For example, if we are interested in information about the process of cheese making, we first screen the literature for the occurrence of ‘cheese’ or ‘cheese making’ and select those papers in which these terms occur. This results in 7184 abstracts, out of over 22 million, that need to be analyzed. The next step in HTS of literature is the automatic extraction of detailed information about cheese making from this subset of articles.
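The selection step described above can be illustrated with a minimal sketch. It assumes that abstracts are available as plain text keyed by an identifier; the function name `screen_abstracts`, the identifiers and the example texts are hypothetical and serve only as illustration, not as the procedure used in this thesis.

```python
import re

def screen_abstracts(abstracts, terms):
    """Return the ids of abstracts that mention at least one search term."""
    # Whole-word, case-insensitive matching; longer terms are tried first
    # so that 'cheese making' is preferred over 'cheese'.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [pmid for pmid, text in abstracts.items() if pattern.search(text)]

# Hypothetical mini-corpus; ids and texts are invented for illustration.
abstracts = {
    "a1": "Starter cultures used in cheese making affect flavour.",
    "a2": "Prednisolone alters gene expression in mouse liver.",
    "a3": "Cheese ripening depends on lactic acid bacteria.",
}
selected = screen_abstracts(abstracts, ["cheese", "cheese making"])
print(selected)
```

Only the abstracts in `selected` would be passed on to the extraction step, which is where the reduction in reading effort comes from.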

Figure 1 High-throughput screening of literature involves the total package of selection, extraction and visualization of information from literature.

Extraction of information from papers can be done in different ways. In this thesis we used CoPub. CoPub has been built on the principle that keywords that co-occur in Medline abstracts have a functional relationship [2]. This principle can also be used to find hidden relations by searching for shared intermediates between two keywords that are not directly related [3, 4], or to perform ‘concept profiling’, in which two keywords or concepts that have a similar profile of matching keywords may be related to each other [5]. Another way to extract information from text is by means of natural language processing (NLP). NLP is able to search for specific relations between two keywords and has successfully been used for the extraction of protein-protein interaction information and for pathway reconstruction [6-9]. Although HTS of literature is focused on fast retrieval of relevant literature and extraction of interesting concepts, experts are still needed to set up intelligent search strategies and to analyze the search results returned by these strategies.
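The co-occurrence principle and the search for shared intermediates can be sketched as follows. This is a simplified illustration that only counts co-occurrences; the statistical scoring used by CoPub is not reproduced here. It assumes that the keywords found in each abstract are already available as sets, and all keyword names and function names are hypothetical placeholders.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(keyword_sets):
    """Count in how many abstracts each pair of keywords occurs together."""
    counts = Counter()
    for keywords in keyword_sets:
        # Sorting gives every pair a single canonical order.
        for a, b in combinations(sorted(keywords), 2):
            counts[(a, b)] += 1
    return counts

def shared_intermediates(counts, x, y):
    """Keywords that co-occur with both x and y (candidate hidden links)."""
    def partners(k):
        return {b if a == k else a for (a, b) in counts if k in (a, b)}
    return (partners(x) & partners(y)) - {x, y}

# Hypothetical keyword sets, one per abstract (placeholders, not real data).
keyword_sets = [
    {"drug_X", "gene_A"},
    {"gene_A", "disease_Y"},
    {"drug_X", "gene_B"},
]
counts = cooccurrence_counts(keyword_sets)
print(counts[("drug_X", "gene_A")])
print(shared_intermediates(counts, "drug_X", "disease_Y"))
```

In this toy example the drug and the disease never occur in the same abstract, yet both co-occur with the same gene, which is the kind of indirect relation a shared-intermediate search is meant to surface.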

Automatic extraction of gene names from text

IE and TM are well established for retrieving information about genes from literature [10-12]. In the majority of the studies described in this thesis, CoPub was used to identify higher level biological relationships for lists of genes [13-15]. One of the biggest challenges in text mining is the recognition of gene names in text, since genes in scientific literature are described by different names and symbols, and some genes even share symbols and names. Results from the gene normalization task of the BioCreative II contest underlined this challenge, since none of the participating systems was able to correctly extract all human gene names from a set of expert-curated Medline abstracts [16]. These types of contests help IE systems to improve their automatic gene recognition procedures, but could also lead to overfitting of a system to the training dataset, resulting in reduced performance on ‘real’ datasets. A possible solution to the problem of correctly identifying gene names in text is to include Entrez Gene numbers as unique gene identifiers in papers submitted for publication [17]. In general, TM and IE would benefit greatly if papers were written according to standardized structures and rules. Part of this standardization could be that two versions are produced of every publication. One is a human-readable version that follows a standardized format. The second version is completely structured, with important relations marked by the authors. This standardization of scientific papers would demand great discipline from the authors and should be a joint effort of international research institutes, universities, companies, publishers and the text mining community [18-20].
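Why shared symbols make gene recognition difficult can be illustrated with a small dictionary-based sketch. The thesaurus below is entirely invented (the identifiers `ID1` and `ID2` and all names and symbols are hypothetical), and the matching is deliberately naive; it is not the recognition procedure of CoPub or of any BioCreative system.

```python
import re

# Hypothetical thesaurus: gene identifier -> names and symbols (all invented).
THESAURUS = {
    "ID1": ["foo kinase 1", "FOO1"],
    "ID2": ["bar factor", "FOO1", "BRF"],
}

def build_index(thesaurus):
    """Map each lower-cased name or symbol to the identifiers that use it."""
    index = {}
    for gene_id, names in thesaurus.items():
        for name in names:
            index.setdefault(name.lower(), set()).add(gene_id)
    return index

def normalize(text, index):
    """Return {mention: identifiers} for every thesaurus entry found in text."""
    lowered = text.lower()
    return {
        name: ids
        for name, ids in index.items()
        if re.search(r"\b" + re.escape(name) + r"\b", lowered)
    }

index = build_index(THESAURUS)
hits = normalize("Expression of FOO1 and BRF was increased.", index)
# Mentions that map to more than one identifier cannot be resolved
# from the symbol alone.
ambiguous = {mention for mention, ids in hits.items() if len(ids) > 1}
print(hits, ambiguous)
```

A unique identifier printed next to the symbol in the paper, as suggested above, would remove exactly the ambiguity that ends up in `ambiguous`.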

Co-occurrence based approach vs NLP based approach

Although co-occurrence based TM is very powerful for the analysis of large datasets, it also has drawbacks. In chapter 7 of this thesis, CoPub was used to find additional evidence that CCL1 and genes with a similar profile show a specific Th2 cell response and that IL-2 and genes with a similar profile show a specific Th1 cell response. However, CoPub was not able to find this additional evidence. This is caused by the fact that in over 30% of the abstracts in which Th1 occurs, Th2 also occurs, and vice versa. For a co-occurrence based system, Th1 and Th2 are too closely related to detect specific relationships between genes and either Th1 or Th2. This requires a high resolution approach such as NLP. In the sentence “CCL1 is expressed in Th2 cells, but CCL1 is not expressed in Th1”, NLP will only detect a relation between CCL1 and Th2, whereas co-occurrence based TM would find relations between CCL1 and Th1 and between CCL1 and Th2. However, NLP is computationally intensive and results in complex extracted information structures, which makes it less suitable for the analysis of large datasets. A combination of co-occurrence based TM with NLP might be a good solution for the extraction of information, in which the co-occurrence based system captures the biology at a higher level, while NLP can be used to find more specific relations.
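The difference between the two approaches can be made concrete on the example sentence. The sketch below uses a single hand-written pattern as a crude stand-in for NLP; a real NLP system would rely on parsing rather than on one regular expression, and both function names are hypothetical.

```python
import re

SENTENCE = "CCL1 is expressed in Th2 cells, but CCL1 is not expressed in Th1"

def cooccurrence_relations(text, gene, contexts):
    """Co-occurrence view: every context mentioned alongside the gene counts."""
    if gene not in text:
        return set()
    return {c for c in contexts if re.search(r"\b" + re.escape(c) + r"\b", text)}

def pattern_relations(text, gene, contexts):
    """Crude rule: keep '<gene> is expressed in <context>', skip negated clauses."""
    pattern = re.compile(re.escape(gene) + r" is (not )?expressed in (\w+)")
    return {
        m.group(2)
        for m in pattern.finditer(text)
        if m.group(1) is None and m.group(2) in contexts
    }

contexts = {"Th1", "Th2"}
by_cooccurrence = cooccurrence_relations(SENTENCE, "CCL1", contexts)
by_pattern = pattern_relations(SENTENCE, "CCL1", contexts)
print(by_cooccurrence, by_pattern)
```

The co-occurrence view links CCL1 to both cell types, whereas the pattern that takes the negation into account keeps only Th2.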

Text mining to study effects of synthetic glucocorticoids

In chapters 3-6, TM was used in multiple ways to obtain information from the scientific literature about genes and pathways induced by glucocorticoids (GCs). Treatment with synthetic GCs results in side effects, and understanding how these side effects are initiated can help to develop new GC compounds that have reduced side effects while maintaining the desired immunosuppressive properties. In chapters 4 and 6 of this thesis, prednisolone-induced genes were analyzed by searching for co-occurrences of the genes with disease and pathway terms. Although this approach was very effective for gaining a general insight into the biological processes affected by prednisolone, it lacks specific information on the mechanisms by which prednisolone initiates the side effects. This requires in-depth analysis of the affected processes by experts. A more detailed approach to study GC-induced metabolic side effects was presented in chapter 3, where a network of insulin resistance (IR) related genes was created based on the co-occurrences of these IR related genes in the literature. Subsequently, associations of the IR genes with dexamethasone were mapped onto this network. This systems approach identified the effects of dexamethasone on a single gene and on its neighboring genes in the network. Although the study in chapter 3 was entirely based on abstracts from published studies, it nevertheless gave new insights into the development of GC-induced IR. We suggested the possible involvement of the sex steroid pathway in GC-induced IR. In chapter 5 we used TM to create a knowledge framework in which we captured the observations of GC effects on genes and proteins. This framework enabled us to identify pathways affected by prednisolone that could be involved in metabolic disturbances in adipocytes.

These studies show that TM can assist research on various levels, from gene and protein level to pathway level, to identify relevant information about GC biology that can be used by experts to annotate their experiments. Moreover, methods such as CoPubGene (chapter 3 of this thesis) and CoPub Discovery, a method to find hidden relations between keywords by searching for shared intermediates [3], can provide novel insights into the side effects of GC treatment.
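The network approach of chapter 3, in which drug associations are mapped onto a literature network of genes, can be sketched in a few lines. This is an illustration of the idea only, not the CoPubGene implementation; the gene names are placeholders and the function names `build_network` and `map_drug` are hypothetical.

```python
from collections import defaultdict

def build_network(gene_pairs):
    """Undirected gene network from pairs of genes that co-occur in literature."""
    network = defaultdict(set)
    for a, b in gene_pairs:
        network[a].add(b)
        network[b].add(a)
    return network

def map_drug(network, drug_genes):
    """For each gene: is it linked to the drug, and which neighbours are?"""
    return {
        gene: {
            "direct": gene in drug_genes,
            "neighbours": sorted(neighbours & drug_genes),
        }
        for gene, neighbours in network.items()
    }

# Hypothetical inputs: placeholder gene names, not results from this thesis.
pairs = [("gene_A", "gene_B"), ("gene_B", "gene_C"), ("gene_C", "gene_D")]
drug_associated = {"gene_A", "gene_C"}
overview = map_drug(build_network(pairs), drug_associated)
print(overview["gene_B"])
```

A gene such as `gene_B`, which has no direct association with the drug but is surrounded by associated neighbours, is the kind of candidate that only becomes visible when the network context is taken into account.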

Study the dynamics of microorganisms using literature mining

In chapter 8, thesaurus based text mining was used to find evidence for the pathogenic potential of microorganisms derived from two community profiling studies. Co-occurrences of these microorganisms with disease terms from a thesaurus resulted in many specific disease associations. However, thesaurus based TM is limited to the keywords in the thesaurus. Especially in food research, thesaurus based TM is not applicable because of the lack of specialized thesauri that can be used to find associations in the literature between industrially important microorganisms and food components. Therefore, we were interested to see whether we could apply TM without a thesaurus to identify keywords that describe the properties of lactic acid bacteria (LAB). Moreover, we evaluated whether presentation of these keywords in a word cloud would give a sufficient overview of the processes and applications in which these LAB are involved.

Word clouds

Word clouds are a powerful solution for quickly summarizing text by maximizing the display of the most relevant keywords in a minimum amount of space. Many web applications use word clouds to summarize news articles [21]. In biomedical research, word clouds have also been used to summarize gene related information [22], to sum up annotation for biological networks [23], or to summarize information in MeSH descriptors [24]. The word clouds indeed provided a sufficient overview of the biological properties of LAB and the processes in which these LAB were involved. These word clouds are a starting point for specific searches by experts to gain insight into the context in which the keywords found are mentioned. However, the word clouds also contained a number of keywords that belong to the same concept, e.g. ‘fermentation’ and ‘fermented’ have the same meaning, and ‘yogurt’ and ‘yoghurt’ are spelling variants of the same word. Nevertheless, the power of this TM approach lies in the fact that interesting keywords about LAB can be retrieved from abstracts without prior knowledge. Additionally, the information in the word clouds could be improved by giving keywords that belong to the same category the same color, or by indicating how keywords in the word cloud are related to each other. Moreover, word clouds could be extended by selecting the keywords to display on the basis of a p-value calculated by comparison with a background word cloud.
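The suggested extension, weighting keywords by a p-value against a background, could look like the sketch below. The choice of a hypergeometric test on document frequencies is an assumption made for this illustration; the stop word list, the example abstracts and the function names are hypothetical, and spelling variants such as ‘yogurt’ and ‘yoghurt’ are not merged here.

```python
import re
from collections import Counter
from math import comb

STOPWORDS = {"the", "of", "in", "and", "is", "are", "a", "by", "for", "with"}

def doc_frequencies(abstracts):
    """Number of abstracts in which each non-stopword occurs."""
    freq = Counter()
    for text in abstracts:
        freq.update(set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS)
    return freq

def enrichment_p(k, n, K, N):
    """Hypergeometric P(X >= k): k of the n selected abstracts contain the
    word, while K of all N abstracts in the background contain it."""
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Hypothetical corpus: the first two abstracts form the selected subset.
corpus = [
    "Fermentation of milk by lactic acid bacteria",
    "Yoghurt fermentation with starter cultures",
    "Gene expression in mouse liver",
    "Expression of genes in adipocytes",
]
selected = corpus[:2]
sel, bg = doc_frequencies(selected), doc_frequencies(corpus)
p_values = {w: enrichment_p(sel[w], len(selected), bg[w], len(corpus)) for w in sel}
for word, p in sorted(p_values.items(), key=lambda item: item[1])[:3]:
    print(word, round(p, 3))
```

Keywords with a low p-value are over-represented in the selected abstracts relative to the background and would be drawn larger in the word cloud.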

Other literature sources

TM of information about microorganisms was done on Medline abstracts. The Medline database has a strong focus on biomedical research and is therefore very suitable for finding relations between microorganisms and diseases. However, food related topics are under-represented in the Medline database. Therefore, additional literature resources such as the Food Science and Technology Abstracts (FSTA) database are needed [25]. FSTA covers food-related literature from 1969 to the present and might be useful for the retrieval of food related information on microorganisms. It would be interesting to see whether such a dedicated literature resource indeed yields more food-related abstracts (recall) or more abstracts that are relevant for food-related research (precision) in comparison with the Medline database.
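Such a comparison could be quantified as sketched below, assuming that a curated set of relevant abstracts is available as a reference. The identifiers and the two retrieved sets are invented and do not represent results for Medline or FSTA.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a set of relevant ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical abstract ids; a curated set of relevant abstracts is assumed.
relevant = {"a1", "a2", "a3", "a4"}
from_source_1 = {"a1", "a2", "x1", "x2"}
from_source_2 = {"a1", "a2", "a3"}
print(precision_recall(from_source_1, relevant))
print(precision_recall(from_source_2, relevant))
```

In this toy example the second source scores higher on both measures; with real data a source may well improve one measure at the expense of the other.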

Conclusion

In this thesis, TM was used for high-throughput analysis and annotation of gene expression and protein datasets by linking these sets to diseases, drug terms and biological processes in the scientific literature. This gave insight into the underlying mechanisms of GC-induced metabolic side effects on multiple levels. Furthermore, TM was used to retrieve disease associations for sets of microorganisms from community profiling studies and to retrieve biological properties of groups of LAB, in order to get an overview of the biological processes and industrial applications in which they are involved. At a time when the generation of biological data is exploding, TM can make a valuable contribution to translating these data into biological knowledge.


References

1. Scior, T., et al., Recognizing Pitfalls in Virtual Screening: A Critical Review. J Chem Inf Model, 2012.
2. Alako, B.T., et al., CoPub Mapper: mining MEDLINE based on search term co-publication. BMC Bioinformatics, 2005. 6: p. 51.
3. Frijters, R., et al., Literature mining for the discovery of hidden connections between drugs, genes and diseases. PLoS Comput Biol, 2010. 6(9).
4. van Haagen, H.H., et al., Novel protein-protein interactions inferred from literature context. PLoS One, 2009. 4(11): p. e7894.
5. Jelier, R., et al., Literature-based concept profiles for gene annotation: the issue of weighting. Int J Med Inform, 2008. 77(5): p. 354-62.
6. Bandy, J., D. Milward, and S. McQuay, Mining protein-protein interactions from published literature using Linguamatics I2E. Methods Mol Biol, 2009. 563: p. 3-13.
7. Miyao, Y., et al., Evaluating contributions of natural language parsers to protein-protein interaction extraction. Bioinformatics, 2009. 25(3): p. 394-400.
8. Yuryev, A., et al., Automatic pathway building in biological association networks. BMC Bioinformatics, 2006. 7: p. 171.
9. Oda, K., et al., New challenges for text mining: mapping between text and manually curated pathways. BMC Bioinformatics, 2008. 9 Suppl 3: p. S5.
10. Krallinger, M., A. Valencia, and L. Hirschman, Linking genes to literature: text mining, information extraction, and retrieval applications for biology. Genome Biol, 2008. 9 Suppl 2: p. S8.
11. Gerner, M., et al., BioContext: an integrated text mining system for large-scale extraction and contextualization of biomolecular events. Bioinformatics, 2012. 28(16): p. 2154-61.
12. Jelier, R., et al., Anni 2.0: a multipurpose text-mining tool for the life sciences. Genome Biol, 2008. 9(6): p. R96.
13. Merkl, M., et al., Microarray analysis of equine endometrium at days 8 and 12 of pregnancy. Biol Reprod, 2010. 83(5): p. 874-86.
14. Friberg, P.A., D.G. Larsson, and H. Billig, Transcriptional effects of progesterone receptor antagonist in rat granulosa cells. Mol Cell Endocrinol, 2010. 315(1-2): p. 121-30.
15. Frijters, R., et al., Literature-based compound profiling: application to toxicogenomics. Pharmacogenomics, 2007. 8(11): p. 1521-34.
16. Morgan, A.A., et al., Overview of BioCreative II gene normalization. Genome Biol, 2008. 9 Suppl 2: p. S3.
17. Bennani-Baiti, B. and I.M. Bennani-Baiti, Gene symbol precision. Gene, 2012. 491(2): p. 103-9.
18. Williams, A.J., et al., Open PHACTS: semantic interoperability for drug discovery. Drug Discov Today, 2012. 17(21-22): p. 1188-98.
19. Hirst, A. and D.G. Altman, Are peer reviewers encouraged to use reporting guidelines? A survey of 116 health research journals. PLoS One, 2012. 7(4): p. e35621.
20. Dalgleish, R., et al., Solving bottlenecks in data sharing in the life sciences. Hum Mutat, 2012. 33(10): p. 1494-6.
21. Kuo BYL, H.T., Good BM, Wilkinson MD, Proceedings of the 16th international conference on World Wide Web: 2007. ACM, 2007.
22. Desai, J., et al., Visual presentation as a welcome alternative to textual presentation of gene annotation information. Adv Exp Med Biol, 2010. 680: p. 709-15.
23. Oesper, L., et al., WordCloud: a Cytoscape plugin to create a visual semantic summary of networks. Source Code Biol Med, 2011. 6: p. 7.
24. Sarkar, I.N., et al., LigerCat: using "MeSH Clouds" from journal, article, or gene citations to facilitate the identification of relevant biomedical literature. AMIA Annu Symp Proc, 2009. 2009: p. 563-7.
25. IFIS and FSTA. -; Available from: http://www.ifis.org/media-centre/media-resources/.


Summary


Bioinformatics approaches are important for the interpretation of results from experiments that are used to study drug effects and in microbiology. Annotation of these experiments is crucial in order to draw conclusions and to initiate follow-up research. Automatic information extraction (IE) and text mining (TM) are important techniques to find relevant information in the ever-growing (scientific) literature to aid scientific research. In this thesis, IE and TM are used to study the metabolic side effects of glucocorticoid (GC) treatment on genes and biological processes. Furthermore, it is shown that these techniques can also be used to study microorganisms by annotating lists of organisms that result from metagenomics experiments, to better understand how they influence biological processes and diseases.

In chapter 2, CoPub 5.0 is presented, a publicly available text mining system that calculates keyword co-occurrences in Medline abstracts. CoPub 5.0 integrates existing CoPub technology with new features and a new advanced interface that can be used to answer a variety of biological questions. With CoPub 5.0, direct and indirect relations between genes, drugs, diseases and biological processes can be found.

Chapter 3 describes CoPubGene, a method to create literature networks of disease related genes. CoPubGene has been used to create a literature network of genes related to insulin resistance (IR), a known side effect of glucocorticoid treatment for which the mechanisms are not completely understood. Subsequent mapping of dexamethasone relations onto this gene network gave new insights into the effects of this synthetic glucocorticoid on IR related genes. The method revealed the possible involvement of sex steroid synthesis in the development of GC-induced IR. Additionally, implementation of the CoPubGene technology in web service operations makes it possible to incorporate this technology into bioinformatics workflows and thus to create a wide variety of literature networks of disease related genes.

Chapter 4 describes a genome-wide expression profiling study to identify prednisolone-induced gene signatures in CD4+ T lymphocytes and CD14+ monocytes derived from healthy volunteers. Identified gene signatures were linked to underlying metabolism-related biological pathways. This makes it possible to study the adverse effects of prednisolone on metabolism. TM revealed several affected genes, such as PDK4 and ACSL1, which have a predominantly metabolic function and are less important for the anti-inflammatory action of prednisolone. These genes might be suitable gene-based biomarker candidates for the unwanted metabolic effects of GC treatment.

Chapter 5 describes a study in which the integration of genome-wide expression data, adipokine profiling data and TM techniques identifies pathways and mechanisms involved in prednisolone-induced metabolic disturbances in murine 3T3-L1 adipocytes. Several members of the Wnt signaling pathway were found to be regulated by prednisolone both at gene expression level and at protein secretion level. Hence, the Wnt signaling pathway has been suggested as a suitable candidate pathway for follow-up research into its involvement in the development of metabolic disturbances in adipocytes as a result of GC treatment.

Chapter 6 presents a genome-wide gene expression profiling study to examine the role of glucocorticoid receptor (GR) dimerization in the regulation of gene expression in mouse liver. Biological pathways targeted by prednisolone in the liver of wild type (WT) mice and mice lacking the ability to form glucocorticoid receptor dimers (GRdim) were identified and compared using the CoPub technology. In GRdim mice, the level of prednisolone-induced gene expression was significantly reduced compared to WT, but not completely absent. Furthermore, several novel GC-inducible genes were identified, of which Fam107a, which encodes a putative histone acetyltransferase complex interacting protein, was most strongly dependent on GR dimerization.

Chapter 7 describes a study that identifies pathways and genes involved in the differentiation of Jurkat T cells towards Th1 and Th2 subtype cells, using various stimuli and pathway inhibitors. PMA/CD3 induced an IL-2 gene profile associated with a Th1 type of response, while PMA/CD28 induced a CCL1 gene profile associated with a Th2 type of response. TM analysis using CoPub was used to find additional evidence in the scientific literature for these Th1 and Th2 specific gene expression patterns of IL-2 and CCL1. Analysis with CoPub technology did not result in additional evidence for this specificity. This shows the limitations of co-occurrence based TM methods. However, new and challenging applications of this type can be used to improve the CoPub technology.

In chapter 8 it is shown that TM/IE techniques used in drug discovery can be transferred to other research areas such as microbiology. TM was used for the annotation of organism lists from metagenomics experiments by searching for co-occurrences of the microorganisms and disease terms. Furthermore, analysis of abstracts in which microorganisms occur resulted in information about the biological properties of these microorganisms and the (industrial) processes in which they are used. This information can be used to assist research towards improving the use of microorganisms in existing applications, but also to find new applications. Furthermore, this results in new hypotheses for industrially important microorganisms such as lactic acid bacteria.


Samenvatting

208 Samenvatting 209

Samenvatting

Het toepassen van bioinformatica-methoden zijn belangrijk voor het begrijpen en interpreteren van resultaten van experimenten die gebruikt worden in studies naar medicijnwerking en in studies naar de toepassing van micro-organismen. Annotatie van deze experimenten is van cruciaal belang om de juiste conclusies te kunnen trekken en om vervolg onderzoek te kunnen initiëren. Automatische informatie extractie (IE) en text mining (TM) zijn belangrijke technieken voor het vinden van relevante informatie in de steeds groeiende (wetenschappelijke) literatuur. In dit proefschrift worden IE en TM gebruikt voor het bestuderen vande ongewenste effecten als gevolg van de behandeling met glucocorticoïden (GC), op genen en biologische processen die betrokken zijn bij de stofwisseling. Verder wordt gedemonstreerd dat deze technieken ook gebruikt kunnen worden voor het annoteren van lijsten van micro-organismen die het resultaat zijn van metagenomische experimenten met als doel zo beter te kunnen begrijpen wat de invloed en werking van deze micro- organismen zijn op biologische processen en ziektes. In hoofdstuk 2 wordt CoPub 5.0 gepresenteerd, een TM systeem dat relaties tussen trefwoorden (keywords) berekent in Medline samenvattingen. Een nieuwe geavanceerde interface, die bestaande CoPub-technologie combineert met nieuwe kenmerken, kan worden gebruikt voor het beantwoorden van een aan verscheidenheid aan biologische vragen. Met CoPub 5.0 kunnen directe en indirecte relaties tussen genen, medicijnen, ziekten en biologische processen gevonden worden. Hoofdstuk 3 beschrijft CoPubGene, een methode voor het maken van netwerken van ziekte-gerelateerde genen op basis van informatie in de literatuur. CoPubGene werd gebruikt voor het maken van een literatuur netwerk van genen, gerelateerd aan insulineresistentie (IR), een bekende maar onbegrepen bijwerking van de behandeling met GC. 
Het mappen van dexamethasone-relaties op het verkregen gen-netwerk gaf nieuwe inzichten in de effecten van deze synthetische GC op IR gerelateerde genen. De methode onthulde de mogelijke betrokkenheid van leden van het sex-steroïde-synthese proces bij de ontwikkeling van GC geïnduceerde IR. Implementatie van de CoPubGene- technologie in webservice operaties stelt ons bovendien in staat om deze technologie te gebruiken in bioinformatica-workflows, om aldus literatuur netwerken voor een verscheidenheid aan ziekte gerelateerde genen te kunnen maken en bestuderen. In Hoofdstuk 4 wordt een genoomwijde genexpressie-profilering-studie beschreven, waarin gen signaturen worden geidentificeerd, die geïnitieerd worden door prednisolon in CD4+ T lymphocyten en CD14+ monocyten, afkomstig van gezonde vrijwilligers. Deze geïdentificeerde genen werden in dit onderzoek gekoppeld aan de onderliggende metabole biologische processen. Op deze wijze werden we in staat gesteld om de ongunstige effecten van prednisolon op de stofwisseling te bestuderen. Door gebruik te maken van TM, werden genen zoals PDK4 en ACSL1 gevonden die voornamelijk bij de

210

stofwisseling betrokken zijn, en minder bij de anti-inflammatoire effecten van prednisolon. Deze genen kunnen mogelijk gebruikt worden als gen-gebaseerde biomarker-kandidaten voor de ongewenste effecten op de stofwisseling als gevolg van de behandeling met GC. In hoofdstuk 5 wordt een studie beschreven die door het combineren van genoomwijde genexpressie data, adipokine-profilerings data en TM technieken, processen en mechanismen identificeert die betrokken zijn bij prednisolon-geïnduceerde metabole verstoringen in murine 3T3-L1 adipocyten. Verschillende leden van de wnt signaling pathway werden gereguleerd door predisolon op zo wel gen-expressie niveau als op eiwit niveau. Daarom is de wnt signaling pathway een goede kandidaat voor vervolgonderzoek om zo de betrokkenheid van dit proces bij de ontwikkeling van prednisolon-geïnduceerde metabole verstoringen in adipocyte cellen te bestuderen. Hoofdstuk 6 presenteert een genoomwijde genexpressie studie om de rol van het vormen van dimeren van de glucocorticoide receptor (GR) op de regulatie van gen expressie in muizenlever te achterhalen. Biologische processen die door prednisolon beïnvloed werden in de lever van wildtype (WT) muizen en in de lever van genetisch gemodificeerde muizen die niet meer in staat zijn om dimeren van GR(GRdim) te vormen, zijn m.b.v. de CoPub- technologie vergeleken met elkaar. In GRdim muizen was het niveau van prednisolon geïnduceerde gen-expressie significant lager in vergelijking met WT muizen, maar niet helemaal afwezig. Verder werden er verschillende nieuwe GC-induceerbare genen gevonden, waarvan Fam107a, dat codeert voor een eiwit dat mogelijk een binding aan gaat met een histone acetyltransferase complex, daarbij sterk afhankelijk is van GR dimerisatie. 
Chapter 7 describes a study in which processes and genes were identified that are involved in the differentiation of Jurkat T cells into specific Th1 and Th2 cell subtypes, using various stimulating agents and pathway inhibitors. PMA/CD3 induced an IL-2 gene profile that is associated with a Th1-type response, whereas PMA/CD28 induced a CCL1 gene profile that is associated with a Th2-type response. CoPub was then used to find additional evidence in the scientific literature for these Th1- and Th2-specific expression patterns of IL-2 and CCL1. The analysis with the CoPub technology, however, yielded no additional evidence for this specificity and thereby also demonstrates the limitations of co-occurrence-based TM methods. Such new and challenging applications can nevertheless be used to further improve the CoPub technology. Chapter 8 demonstrates that TM/IE techniques as used in drug research can also be applied in other research fields, such as microbiology. TM can be used to annotate lists of organisms from metagenomic experiments by searching the scientific literature for relations between these organisms and diseases. In addition, automatic analysis of scientific literature in which microorganisms are mentioned yields information on the biological properties of these organisms and the (industrial) applications in which they are used. This information supports research into both improving existing microbiological applications and finding new ones. It also raises new hypotheses regarding industrially important microorganisms such as lactic acid bacteria.

Acknowledgements

And now for the acknowledgements… This thesis could only have come about thanks to the people who supported me during my PhD research. A personal word of thanks to you is therefore well in place here.

Jacob, your way of doing applied science has always appealed to me. Once, as a master's student at the CMBI, I gave a presentation about my internship project. You were in the audience and afterwards asked the ever so relevant question: “But what can you actually do with this?” I kept that one question in mind throughout my PhD research. I am therefore grateful that I was able to carry out my PhD research in your academic group, where applied science matters.

Wynand, I have learned so very much from you. Your tireless dedication, enthusiasm and pragmatic way of working strongly motivated me to eventually write this ‘booklet’. Our collaboration started at MDI in Oss, where we developed text mining methods and applied them in drug discovery projects. After MSD announced the closure of its research departments, you decided to continue your career at NIZO in Ede. I am very grateful that even after this career switch you were still willing to supervise my PhD research, even though it meant doing so in your spare time, alongside your work at NIZO. You were always willing to make time for a meeting, whether by phone, at Starbucks at the station or, later, at NIZO. This collaboration has also led to us now continuing to work together on text mining for food research in an eScience project.

I carried out the first part of my PhD research within the walls of the MDI department of Schering-Plough (formerly Organon, later MSD). The pleasant working atmosphere there, partly thanks to a number of by now former colleagues, undoubtedly contributed to the completion of this ‘booklet’. I certainly do not want to forget you here: Ton, Cees, Rene, Tinka, Ruud, Ross, Scott, Rita, the Jans, Susanne, Peter, Maarten, Thea, Karin and Ria. I would like to mention Stefan in particular. Thanks to your inexhaustible knowledge of computers and programming, I could always turn to you when I had questions or problems. I would like to thank you for all your help and everything I have learned from you.

The (former) colleagues of the CDD group and the CMBI also deserve to be mentioned here. First of all Raoul “CoPub Berry”. Together we worked on text mining, which was also your PhD topic and resulted in you obtaining your doctorate in September 2011. You were a pleasant colleague to work with. Good luck with your job as a bioinformatician at Rijk-Zwaan. Simon “Koetje”. You had already left when I started my PhD research, but the contact we kept and our shared history at the CMBI make me want to mention you here all the same. The visits, during which one could always chat so nicely with you about everything and nothing at all, were always very relaxing. These days you are at TNO and the proud father of daughter Isis! Dave “Dr. D. indeed” Wood. Although you were only there for a short period, it was very nice meeting you. I wish you all the best in London with your wife Alex and son George. Sander, besides a by now former colleague, you are above all a good friend. I am therefore very proud that you are willing to be my paranymph. You were always interested in my work, and outside work the door at your, Lisa, Marrit and Mirte's home was always open for a pleasant get-together. Good luck with your career at Lead Pharma and your upcoming fatherhood. Marijn “You know what it is” Sanders. Together we were the PhD students in the CDD group, and we were a great support to each other during all the changes we were confronted with during our PhD research. Now we work together on text mining in the eScience project. Juma, my roommate at the CMBI: thanks for your help and support. Jurgen, thank you for the good coffee and the great stories (good luck at d’n Epic). Barbara, thank you for all administrative matters.

Erik “just send the sh*t over” Toonen. Thank you for the many discussions. I have learned a lot from you about glucocorticoid biology and about writing scientific articles. It is not for nothing that you are listed as co-author on four chapters of this thesis!

Relaxation during PhD research is important to take your mind off things for a while. Where better to do this than at LOKOMOTIV NIJMEGEN? The many hours on the bike, but also the enjoyable beers at Frowijn, have always helped me keep my PhD research in perspective. Thanks, guys! I would also like to give special thanks here to Walter for the beautiful cover of my thesis!

Niels (wonderful that you are willing to be my paranymph) and Judith, thank you for all the pleasant dinners and the fun evenings out. Enjoy your beautiful daughter Ika.

Daniel, thanks for your support and the nice discussions about science and about life. I wish you and Ghislaine all the best for the future!

And finally, I want to thank my family. First of all my sister Karin, brother-in-law Bas and my nephews and niece Jelmar, Yulotte, Tibbe and Lauritz. A wonderful family I can always turn to. To my parents-in-law Peter and Heike: your home in Idstein is like a second home to me. Thank you very much for your support! Dear mum and dad, sometimes it was hard to understand exactly what kind of work I was doing, but you were always interested and tried to understand it. You were, and still are, there for me whenever I need you. I therefore cannot thank you enough for the fantastic support that motivated me to write this thesis!

And then the most important ladies in my life: Dear Lara, you were born when I had been working on my PhD research for almost two years. PhD research and fatherhood sometimes proved hard to combine, but I am so proud to be your dad. You have been my inspiration for writing this thesis. Big kiss from daddy!

Liebe Tina, Du bist das Ziel einer langen Reise Die Perfektion der besten Art und Weise In stillen Momenten leise Die Schaumkrone der Woge der Begeisterung Bergauf, mein Antrieb und Schwung Ich wollte dir nur mal eben sagen Dass du das Größte für mich bist Und sichergehen, ob du denn dasselbe für mich fühlst Für mich fühlst (Von dem Lied "Ein Kompliment", Sportfreunde Stiller)

Wilco
Nijmegen, March 2013

CURRICULUM VITAE

Wilco Fleuren was born on 8 April 1978 in Nijmegen. After obtaining his HAVO diploma at the Thomas college in Venlo in 1996, he started the Hogere Informatica programme at Fontys hogescholen in Eindhoven. He carried out his graduation internship at Philips Medical Systems (PMS), in the Systems Development Cardio Vascular (SDCV) department in Best. Under the supervision of dr. Paul Luttikhuizen, several software packages were evaluated for monitoring the safety requirements of cardiovascular diagnostic systems. He obtained his diploma in February 2001. From February through June of that same year he continued this work at PMS in the role of requirements analyst. In September 2001 he started the master's programme in bioinformatics at the Radboud University in Nijmegen. At the Centre for Molecular and Biomolecular Informatics (CMBI), under the supervision of prof. dr. Gert Vriend, he worked among other things on a database for protein families and, in collaboration with dr. Berend Snel, on a method for predicting protein-protein interactions in prokaryotes using correlated mutation analysis. He obtained his master's degree in August 2003. From December 2003 through December 2008 he worked as a scientific programmer in the Computational Drug Discovery (CDD) group of prof. dr. Jacob de Vlieg, part of the Centre for Molecular and Biomolecular Informatics (CMBI) at Radboud University Nijmegen. In this position he carried out his work within the Molecular Design and Informatics (MDI) department of Organon (later Schering-Plough and now Merck Sharp & Dohme), under the supervision of dr. Peter Groenen and dr. Wynand Alkema. He worked on a database in which protein sequences of different organisms were stored.
These protein sequences were compared with each other to gain better insight into the differences between the organisms at the protein level. Subsequently, from December 2008 through September 2009, he worked as a bioinformatician within this MDI department, where among other things he optimized the CoPub method in collaboration with Supercomputer Center SARA, Amsterdam. In October 2009 he started his PhD research under the supervision of prof. dr. Jacob de Vlieg (promotor) and dr. Wynand Alkema (co-promotor). The largest part of the PhD research, aimed at developing and applying bioinformatics and automatic information extraction methods to study side effects of glucocorticoids, was carried out at the MDI department in collaboration with dr. Erik Toonen. The second part of the PhD research, developing and applying automatic information extraction techniques to study microorganisms, was carried out at the CMBI in collaboration with NIZO. The results of the research are described in this thesis.

Bibliography

Fleuren WW, Sanders MP, Boekhorst J, de Vlieg J, Alkema W. Functional annotation of microorganisms using text mining. (2013), in preparation.

Ellero S, Fleuren WW, Bauerschmidt S, Dokter WH, Toonen EJ. Identification of gene signatures for prednisolone-induced metabolic dysfunction in collagen-induced arthritis mice. (2013), in preparation.

Verhoeven S, Fleuren WW, Wagener M, de Vlieg J, Ritschel T. KRIPOweb: Detect and analyze similar protein binding sites beyond protein family boundaries. (2013), in preparation.

Fleuren WW, Linssen MM, Toonen EJ, van der Zon GCM, Guigas B, de Vlieg J, Dokter WH, Ouwens DM, Alkema W. Prednisolone induces the Wnt signaling pathway in 3T3-L1 adipocytes. (2013), Accepted for publication in Archives of Physiology and Biochemistry.

Fleuren WW, Toonen E, Frijters R, Verhoeven S, Hulsen T, Rullmann T, van Schaik R, de Vlieg J, Alkema W. Identification of new biomarker candidates for glucocorticoid induced insulin resistance using literature mining. (2013), BioData Min. 6(1):2.

Smeets RL, Fleuren WW, He X, Vink PM, Wijnands F, Gorecka M, Klop H, Bauerschmidt S, Garritsen A, Koenen HJPM, Joosten I, Boots AMH, Alkema W. Molecular pathway profiling of T lymphocyte signal transduction pathways; Th1 and Th2 genomic fingerprints are defined by TCR and CD28-mediated signaling. (2012), BMC Immunol, 13(1):12.

Sanders MP, Fleuren WW, Verhoeven S, van den Beld S, Alkema W, de Vlieg J, Klomp JP. ss-TEA: Entropy based identification of receptor specific ligand binding residues from a multiple sequence alignment of class A GPCRs. (2011), BMC Bioinformatics, 12:332.

Fleuren WW, Verhoeven S, Frijters R, Heupers B, Polman J, van Schaik R, de Vlieg J, Alkema W. CoPub update: CoPub 5.0 a text mining system to answer biological questions. (2011), Nucleic Acids Res, 39:W450-4.

Toonen EJ, Fleuren WW, Nässander U, van Lierop MJ, Bauerschmidt S, Dokter WH, Alkema W. Prednisolone induced changes in gene expression profiles in healthy volunteers. (2011), Pharmacogenomics, 12(7):985-98.

Frijters R, Fleuren W, Toonen EJ, Tuckermann JP, Reichardt HM, van der Maaden H, van Elsas A, van Lierop MJ, Dokter W, de Vlieg J, Alkema W. Prednisolone-induced differential gene expression in mouse liver carrying wild type or a dimerization-defective glucocorticoid receptor. (2010), BMC Genomics, 11:359.

Folkertsma S, van Noort P, Van Durme J, Joosten HJ, Bettler E, Fleuren W, Oliveira L, Horn F, de Vlieg J, Vriend G. A family-based approach reveals the function of residues in the nuclear receptor ligand-binding domain. (2004), J Mol Biol, 341(2):321-35.