VU Research Portal

Stress-free springtails
de Boer, T.E.

2010

document version Publisher's PDF, also known as Version of record


citation for published version (APA) de Boer, T. E. (2010). Stress-free springtails: Determining natural gene expression profiles in collembolans. Ipskamp Drukkers B.V.


Stress-free springtails – Determining natural gene expression profiles in collembolans

Cover design: Janine Mariën Lay-out: Désirée Hoonhout Printing: Ipskamp Drukkers B.V., Enschede

VRIJE UNIVERSITEIT

Stress-free springtails

Determining natural gene expression profiles in collembolans

ACADEMIC DISSERTATION (ACADEMISCH PROEFSCHRIFT)

to obtain the degree of Doctor at the Vrije Universiteit Amsterdam, by authority of the rector magnificus prof.dr. L.M. Bouter, to be defended in public before the doctoral committee of the Faculty of Earth and Life Sciences on Wednesday 1 December 2010 at 11.45 in the aula of the university, De Boelelaan 1105

by

Tjalf Elmer de Boer

born in Alkmaar

promotor: prof.dr. N.M. van Straalen
copromotor: dr.ir. D. Roelofs

Contents

Chapter 1: Introduction

Chapter 2: Reference genes for QRT-PCR tested under various stress conditions in Folsomia candida and Orchesella cincta (Insecta, Collembola)

Chapter 3: The effect of soil pH and temperature on Folsomia candida transcriptional regulation

Chapter 4: Transcriptional plasticity of a soil arthropod across different ecological conditions

Chapter 5: The effects of aged copper pollution on Folsomia candida physiology

Chapter 6: Discussion

Summary
Samenvatting (summary in Dutch)
Dankwoord (acknowledgements)
Curriculum Vitae
Publications

Chapter 1

General introduction

All over the world, from mountains to deserts, soil fulfills many important functions for plants, animals and humans. Plants grow in it and extract nutrients and water from it. Animals and humans are exposed to it directly by walking on it or living in it, or indirectly by eating plants or other animals. There are theories that soil, in the form of clay particles, played a major role in the origin of life. These theories state that clay particles are able to catalyze the reactions needed for the formation of primitive life (Ponnamperuma et al., 1982). Hanczyc et al. (2003) reported that particles of a certain type of clay, called montmorillonite, are able to catalyze the formation of lipid bilayers, simple forms of cell membranes, from single fatty acids. Because soil plays an important role in organismal functioning, soil pollution can have a major impact on plants and animals. Soil pollutants can be taken up by plants and animals, in which they can adversely affect development and ultimately threaten the survival of species (McGrath et al., 1995, Lande, 1998). Soil pollution can arise from natural phenomena, such as volcanic eruptions, or be caused by human society. The latter, also called anthropogenic pollution, results when substances are emitted into the environment faster than natural systems can eliminate them. Anthropogenic pollution can be historic; an example is heavy metal pollution. In the Bronze Age (3300 - 1200 BC) human civilization started working metals on a large scale. The production of these metals, first copper and tin but later also iron and zinc, requires mining and smelting of the various metal-containing ores. During the smelting process other metals such as lead and cadmium were also emitted. This is why soils and litter at many historic smelting sites in Europe are heavily polluted with these heavy metals (Nriagu, 1996).
To understand the effect of soil pollutants on soil flora and fauna, it is important to know the interaction between soil properties and pollutants.

The origin of soil

To understand the functional properties of soil we need to understand its composition and how it was formed. Soil is generally made up of minerals, metals, organic molecules, water and air. The distribution of these elements in the earth’s crust was established during the formation of our solar system and the earth. The earth took shape approximately 4.5 billion years ago. After its formation the young planet heated up rapidly due to different processes, after which it entered a mostly liquid state. In that phase, as is still the case today, 93% of the planet’s mass was made up of only four elements: iron (35%), oxygen (30%), silicon (15%) and magnesium (13%). However, due to its mass and under the influence of gravity, iron started to sink towards the centre of the planet, where it makes up the core. This process is called chemical zonation (Rama Murthy and Hall, 1972): lighter elements aggregate in the crust of the planet while heavier elements sink and concentrate in the centre. This is why the crust has a different relative elemental abundance than the planet as a whole. 82% of the earth’s crust is made up of oxygen (46%), silicon (28%) and aluminum (8%). Iron, although the most abundant element in the planet as a whole, accounts for only 6% of the mass of the crust (Press and Siever, 1986). As the planet cooled, the first rock formations started to form the crust. The earth’s crust is mostly built up of minerals. Minerals are homogeneous substances with a fixed composition and crystal structure. 98% of the minerals that make up the earth’s crust are composed of different combinations of eight elements (Si, O, Al, Fe, Ca, Na, K and Mg) (Yaroshevsky, 2006). Silicon and oxygen form the most common minerals in the form of silicon oxides. Silicon oxides form minerals in the ratio of one silicon atom to two oxygen atoms, the simplest being SiO2, also known as quartz (Lyon and Burns, 1963). In the crust, minerals form larger aggregations called rock. Rock is formed out of one or multiple minerals and sometimes out of organic substances.
According to their origin, rocks are generally classified into three classes: igneous rock, sedimentary rock and metamorphic rock. Igneous rock forms by the cooling and solidification of volcanic magma (Le Maitre, 2002). This can happen either at the surface during a volcanic eruption or in between other rock layers when the pressure in the volcanic trench is not high enough to cause a surface eruption. Sedimentary rock is formed by the weathering and erosion of other rock types, mainly igneous rock, into sediment, which is deposited elsewhere after transportation by rivers, glaciers, wind and gravity. This sediment can, under the right conditions, lithify into rock. Sedimentary rock exists in many types and forms since there are many mineral and even organic sources of sediment (Einsele, 2000). Metamorphic rock is formed by the transformation of other rock: igneous, sedimentary or other metamorphic rock. This transformation is often caused by high pressure deep within the crust or by plate tectonics (Bucher and Frey, 2002).


The vertical soil profile

Soil is formed by the deposition of sedimentary minerals eroded from rocks and of organic breakdown products of plants and animals. Soil-specific structure is formed when the mineral and the organic fractions of the soil are bound to each other. Sediment deposition is often a long-term process and can have multiple parent rock sources over time. This results in different layers, called soil horizons, which can often be distinguished from each other by eye. There are many types of soil horizons but the most important ones are, from surface to deep soil, the O, A, B and C horizons (2000). The O (organic) horizon is composed of the litter layer on top of the soil. The A horizon is the surface soil, in which most of the biological activity in the form of plant and animal life takes place. This part of the soil also contains most of the organic material. The B horizon, also referred to as the subsoil, contains mineral layers which may contain different concentrations of clay. The C horizon is the parent rock on which the other horizons rest. Different types of interaction are possible between the soil horizons. Metal ions and clay particles, for example, are often transported by vertical water transport or chemical leaching from the A to the B horizon, where they accumulate, a process called illuviation (Lundström et al., 2000). In this thesis the top soil, or A horizon, is most important as this horizon has the greatest impact on plant and animal life. The soil samples used in Chapters 4 and 5 were taken from the top soil. The interaction between soil horizons, however, cannot be neglected. In Chapter 5, for example, a soil spiked with copper in situ is investigated, and the interaction between the A and B horizons has a major impact on copper behavior in that soil.

Classification of soil texture according to particle size

Soil type is often classified according to the size of the particles that make up the soil. There are three main types: sand, silt and clay. Sand has the largest particles, with a particle diameter between 0.05 and 2 mm. Particles larger than 2 mm are considered gravel. Silt has a particle diameter between 0.002 and 0.05 mm, while all particles smaller than 0.002 mm are classified as clay. This classification follows the United States Department of Agriculture (USDA) soil texture classification system (Davis and Bennet, 1927). A soil hardly ever consists of a single type but is usually a mix of different types. Figure 1 depicts a soil triangle, which is often used to classify and name different mixes of soil types. Mixes of the different types are often referred to as loam, with a specific name according to the ratio between the types (see Figure 1). In Chapter 4 it becomes apparent how much influence soil texture can have on the physiology of soil-living animals. In the next part of this paragraph two soil types, sand and clay, will be discussed in more detail as these are the types most relevant to the work presented in this thesis.
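The size classes listed above can be written down as a simple lookup. The following Python sketch is illustrative only: the function name classify_particle is hypothetical, and the assignment of the boundary values (0.002, 0.05 and 2 mm) to one class or the other is an assumption, since the text gives the ranges without stating which class includes each boundary.

```python
def classify_particle(diameter_mm: float) -> str:
    """Return the size class of a single particle from its diameter in mm.

    Thresholds follow the ranges quoted in the text. Which class a
    boundary value belongs to is an assumption made for this sketch.
    """
    if diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    if diameter_mm < 0.002:
        return "clay"
    if diameter_mm < 0.05:
        return "silt"
    if diameter_mm <= 2.0:
        return "sand"
    return "gravel"


if __name__ == "__main__":
    # Example diameters are arbitrary illustration values.
    for d in (0.001, 0.01, 0.5, 3.0):
        print(d, "mm ->", classify_particle(d))
```

Note that this classifies individual particles only; naming a whole soil (for example as a loam) requires the relative proportions of sand, silt and clay, as read from the soil triangle.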

Figure 1: A soil type triangle that depicts the soil classification according to the particle size distribution and the ratios between the different soil components.

Sand

Sand is the soil component with the largest particles. The most common constituent of sand is silicon oxide; however, due to contamination with other elements many forms are possible (Nesbitt et al., 1997). While clay is formed by chemical erosion, which produces complicated minerals, sand is often formed by physical erosion, which yields mineral compositions similar to those of the parent rock. Due to the large particle size, sandy soils often have an open structure with large pores and therefore do not retain water very well (Haverkamp and Parlange, 1986). This means that other elements and minerals are easily leached from sandy soils.

Clay

Clay is the soil component with the smallest particle size and is a product of the slow chemical weathering of, mainly, sedimentary rock. This chemical erosion is often caused by acids such as carbonic acid. One of the basic types of minerals found in clay is the aluminosilicates. Kaolinite, a type of clay that is used in the standard OECD test soil discussed in Chapter 3, is mainly formed by this type of mineral. Other metals, such as magnesium and potassium, are found in other types of clay such as montmorillonite and illite (Guggenheim and Martin, 1995). Due to their small particle size, clay soils are often compacted, which may influence soil flora and fauna.

Figure 2: Schematic representation of the different processes (solid arrows) and their parent entities that contribute to soil formation. Dashed arrows display entities that influence each other.


Soil water and organic matter

In the previous paragraphs the origin and properties of the mineral component of soil were explained. There are, however, other soil components: the organic component, soil air and soil water. Soil water is often considered as part of the so-called ‘three phase system’ (see also Figure 2 for an overview). In this system, the soil particles are considered the solid phase, water that fills the pores between particles is the liquid phase, and air that fills the pores between particles is the gas phase (Kuipers, 1976). The liquid phase is important because salts and other substances can dissolve in it, become mobile, be transported to other layers of the soil and eventually be leached from the soil. When dissolved, these substances also become bio-available to soil organisms and can, if toxic, cause harmful effects. The amount of water a soil can hold in its pores, also called the water holding capacity (WHC), is determined by the soil type. In general, clay soils have a greater WHC than sandy soils, through which water passes more easily. Other properties such as organic matter content and soil structure also affect the WHC (Hudson, 1994). In soil quality bioassays using collembolans, such as those described in later chapters of this thesis, the WHC is an important factor to take into account. By setting the soil moisture level at a fixed percentage of the WHC, all test animals experience comparable moisture conditions regardless of soil type (Abdellatif et al., 1967). The soil is home to all kinds of life, from single-celled organisms such as bacteria and archaea to invertebrates such as insects, collembolans and worms, and to mammals (Oades, 1993, Roper and Gupta, 1995). Since the origin of life the remains of these organisms and of plants have formed organic matter, which has become part of the soil. In the Netherlands, for example, on average 4% of the soil is composed of organic matter.
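The moisture adjustment described above for bioassays comes down to a small calculation. The sketch below is a minimal illustration under stated assumptions: the WHC is taken to be expressed as grams of water per gram of dry soil, the function name water_to_add is hypothetical, and the numbers in the example (100 g of soil, a WHC of 0.4 g/g, a target of 50% of the WHC) are illustrative values that are not taken from this thesis.

```python
def water_to_add(dry_soil_g: float,
                 whc_g_per_g: float,
                 target_fraction: float,
                 current_water_g: float = 0.0) -> float:
    """Grams of water to add so that soil moisture equals a fraction of WHC.

    Assumes WHC is given as g water per g dry soil (an assumption of this
    sketch). Returns 0.0 if the soil is already at or above the target.
    """
    if dry_soil_g <= 0 or whc_g_per_g <= 0:
        raise ValueError("soil mass and WHC must be positive")
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must lie between 0 and 1")
    target_water_g = dry_soil_g * whc_g_per_g * target_fraction
    return max(target_water_g - current_water_g, 0.0)


if __name__ == "__main__":
    # Illustrative values only: two soils with different WHC, same target.
    for name, whc in (("sandy soil", 0.2), ("clay soil", 0.4)):
        print(name, water_to_add(100.0, whc, 0.5), "g water")
```

Because the target is defined relative to each soil's own WHC, a clay soil receives more water than a sandy soil for the same nominal moisture level.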
The soil organic matter can have a large influence on soil properties. The organic matter in the soil is often part of a cycle between inorganic and organic matter. In this cycle plants take up inorganic matter in the form of water, carbon dioxide and minerals such as nitrogen, phosphorus and potassium and turn them into organic molecules in processes such as photosynthesis and carbon fixation. The remains of animals and plants are broken down into organic molecules by primary decomposers, and these molecules are in turn converted into humus by a process called humification. Humus is considered stable organic matter and can remain so for a long time, but it can also be broken down by microorganisms into inorganic matter, thus completing the cycle (Schlesinger and Andrews, 2000). This cycle eventually prevents the accumulation of organic matter in the soil. Another source of organic matter in the soil is the root exudates that plants excrete into the soil. These exudates are organic acids that often assist the plant in multiple processes such as the uptake of nutrients and metal detoxification (Jones, 1998). These root exudates can have a major influence on the soil and on soil microorganisms. Soil microorganisms such as bacteria and fungi also add to the organic matter content. Fungi especially can contribute organic matter to the soil (Tisdall and Oades, 1982). Figure 2 gives an overview of the different components that make up soil in general. Organic molecules in soil, such as phenols and lignin but also humic acids, can have adverse effects on soil fauna (Potapow, 2001) as they are often hard to digest and can be toxic. An example of this is given in Chapter 4, where gene expression profiles derived from collembolans exposed to forest soils, which are rich in lignin and polyphenols, show a detoxification signature. Organic matter can also have an indirect effect on soil flora and fauna. It can form aggregates with inorganic mineral particles. In sandy soils this can enhance soil fertility because it prevents leaching of inorganic nutrients such as nitrogen and phosphorus. In clay soil, organic matter can prevent the soil from compacting too much, so that more aeration pores can form, which is beneficial to plant roots. Organic matter also influences the cation exchange capacity (CEC) of the soil. The CEC is determined by the number of sites on the soil particles that can bind cations and the number of those sites where cations can be exchanged for other cations. The CEC is important for nutrient retention and general soil fertility. In sandy soils the CEC is almost totally determined by the organic matter content. Organic matter can sometimes negatively influence the CEC in clay soils because the organic matter particles interact with clay particles, so that the two cancel out each other’s capacity to bind. The CEC is also important in metal contamination since metal cations are retained better at higher cation exchange capacities (Rieuwerts et al., 1998).
Soil pH also influences the CEC and general metal bio-availability. The soil pH is determined by a number of factors. The parent rock material from which the soil was formed influences soil pH; for example, soil formed from erosion of limestone will be very alkaline. Weathering can also influence soil pH: when rainfall is greater than transpiration, alkaline soil minerals can be leached and replaced by more acidic ones, so that the soil pH goes down. In forested areas breakdown products from leaves or needles can lower the soil pH as they form acids such as humic acid.

The influence of soil properties on pollutants and bioassays

As described in the previous paragraphs, soil comes in many different forms and with many different properties. These properties can influence the bio-availability and toxicity of soil pollutants. Soil pH, for example, greatly influences the bio-availability of metals in the soil. At a lower soil pH there are more free protons, which compete with metal ions for binding sites on soil particles. Therefore, metal pollution generally becomes more available to soil flora and fauna at lower soil pH. Metal bio-availability is also negatively correlated with organic matter content (Crommentuijn et al., 1997). Organic compounds such as phenanthrene and benzene bind strongly to organic matter and are therefore less bio-available in high organic matter soils (Rutherford et al., 1992, Karapanagioti et al., 2000, Gestel and Ma, 1993). There are several bio-assays for testing soil quality, which use different test organisms including collembolans and worms. Since soil properties influence the behavior of soil pollutants, they can also influence pollutant uptake differently in different animals. Therefore soil properties need to be taken into account when selecting the appropriate soil quality test. In this thesis the ISO 11267 test (ISO, 1999) is used, which uses the collembolan Folsomia candida as a test animal. This small arthropod has a size of up to 4-5 mm, and its soil-dwelling lifestyle, short generation time and ease of culture make it a valid animal for soil quality tests (Fountain and Hopkin, 2005). This collembolan lives in soil pores, and its pollutant exposure routes are via its food (soil bacteria and fungi) and via pore water, which is taken up by an organ at the underside of the animal called the ventral tube. Due to these exposure routes, soil properties such as organic matter content and CEC can influence test results.

Soil in the Netherlands

In the previous paragraphs general soil properties and the factors that influence them were discussed. As the work presented in this thesis concerns field soils from the Netherlands, a short overview of the geological history of the Netherlands and of the processes that influenced soil formation is in order. In general, two processes have shaped the Dutch landscape into what we see today. The first is sediment deposition during glacial eras. During the Saalian glacial era in the Pleistocene (238,000 – 128,000 years ago) glaciers from Scandinavia reached the northern part of the Netherlands, forcing the fluvial systems of both the Rhine and Meuse rivers westward instead of north (Busschers et al., 2008). During this era both rivers deposited vast amounts of material, creating a delta system that stretched out as far as Scotland. The deposited material mainly consisted of sand, loam and gravel, and it forms the subsoil of most of the country today (Busschers et al., 2007). In later glacial eras the ice did not reach the Netherlands, but due to the cold, dry climate a layer of coarse sand was deposited, covering most of the country. During the Holocene, when the ice began to melt due to a warmer climate, most of the western and northern part of the country turned into swamp land and peat soil formed. Approximately 5000 years BC the rising sea level created the Strait of Calais and the Dutch coastline began to resemble its current shape. During this time sea clay was deposited along the coast and river clay near the rivers (Kuipers, 1976). The Dutch soils are thus of five main types: clay-loam from the Pleistocene glaciation, sand from the Holocene, peat, river clay and marine clay.

The other process that has influenced the Dutch landscape, particularly in the western part of the Netherlands, is land reclamation from the sea. From the 15th century onwards large parts of land in the provinces of Noord-Holland, Zuid-Holland and Flevoland have been reclaimed from the sea. In general, the eastern and southern parts of the Netherlands consist mainly of sandy soils while the western and north-western parts consist mainly of clay soils. These differences in soil type can have a major impact on soil ecotoxicological testing, as described in Chapter 4 of this thesis.

Soil quality assessment

To determine whether a soil is healthy its quality needs to be assessed. Soil, however, is not an isolated object but part of the landscape and the ecosystem. In soil quality assessment the future use of the landscape and ecosystem is therefore often taken into account. In this paragraph I will discuss two general types of soil assessment and a specific system used in the Netherlands to determine soil quality.

Ecosystem services

Human civilization benefits from a multitude of resources and processes that are supplied by natural ecosystems. Collectively, these benefits are known as ecosystem services (ES). An example of an ES product is clean drinking water, while processes may include the decomposition of organic waste into fertile soil. Ecosystem services have been discussed by scientists for decades but were formalized by the 2004 Millennium Ecosystem Assessment (MA) (Carpenter et al., 2006). In the MA a new conceptual framework was used for understanding the effects of environmental change on ecosystems and human well-being. This framework focused on the services that ecosystems provide for society and on how human actions alter ecosystems and the services they provide (Carpenter et al., 2009). Soil and soil organisms provide a number of ecosystem services. They support agriculture as a production service, but also provide regulating services such as control of greenhouse gas fluxes, carbon sequestration, flood control and detoxification (Lavelle et al., 2006). Some of these services are directly measurable and can serve as indicators of soil health. When soils become disturbed by pollution or climate change, functions can be lost through alteration of soil properties or through a change in the composition of soil flora or fauna. Measuring ecosystem services can then provide an indication of the societal and economic impact of this soil disturbance.

Triad approach

The Triad approach to soil assessment is a more hands-on technique than the ecosystem services approach. It is often depicted as a triangle with, on each corner, one of the three types of measurement that are used together for risk assessment of potentially polluted soils: chemical, toxicological and ecological. The chemical corner of the triangle measures the actual, total concentration of chemicals and pollutants in the soil. The toxicological corner uses bio-assays to determine the impact of chemicals on test animals or plants. The difference between the results of the chemical and toxicological corners is caused by the bio-availability of the pollutants. Finally, the ecological corner looks at the impact that pollutants may have on the ecosystem. While the Triad approach focuses more on the environmental aspects of soil quality, the ecosystem services approach also includes economic aspects in determining soil health (Rutgers et al., 2005).

BoBI

The BoBI program (BISQ in English), where BoBI stands for soil biological indicator, is a Dutch soil monitoring program executed by the Dutch Institute for Public Health and Environment (RIVM) and the Ministry of Spatial planning, Housing and Environment (VROM). The objective of this program is to provide an instrument that may be used to formulate environmental policy goals for the sustainable use of biodiversity and soil function. In this program an indicator set has been built which uses multiple types of measurement (chemical and biological) to evaluate soil quality. This indicator set was used to assess over 200 soils, sampled over multiple years, from all over the Netherlands (Schouten et al., 2001). The soils used in Chapter 4 of this thesis were all taken from the BoBI program.

Gene expression methods

The use of molecular DNA techniques in the field of ecology has become more common in the last 10 years (Gibson, 2002, Kammenga et al., 2007). Measuring differences in gene expression between two groups of differently exposed test organisms offers a better understanding of the modes of action through which stresses affect animals and may better predict the long-term effects these stresses may have. Quantitative PCR (Q-PCR) and microarray analysis are the two gene expression measurement techniques used in this thesis. Compared to high-throughput microarray analysis, Q-PCR methods remain rather low throughput, even with recent advances in capillary PCR. However, Q-PCR methods often perform better quantitatively, making them more suitable for dose-effect studies. Q-PCR also remains a less expensive platform when only small numbers of genes are of interest. Microarray technology was first developed in 1995 (Schena et al., 1995) as an advance on Southern blot technology (Southern, 1975) and has since become the standard technique for high-throughput measurement of gene expression. New advances in next generation sequencing, however, are rapidly gaining interest since they can provide not only quantitative data but also gene sequence data. Microarray technology uses glass slides that have been printed with oligonucleotide probes (in the past, cDNA probes), which represent gene fragments. mRNA isolated from each sample is labeled with a fluorescent dye, and two samples, each labeled with a different dye, are competitively hybridized to the probes on the array. From the ratio between the two dye signals we can calculate, per gene, whether its expression in one sample is higher or lower than in the other sample.
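The per-gene comparison of the two dye signals described above is commonly expressed as a log2 ratio, so that a doubling and a halving of expression give values of equal size and opposite sign. The following Python sketch illustrates this under stated assumptions: the gene names and intensity values are hypothetical, and real microarray analysis adds steps that are omitted here, such as background correction and normalization between dyes and arrays.

```python
import math


def log2_ratio(sample_intensity: float, reference_intensity: float) -> float:
    """Log2 of the ratio between two dye intensities for one probe.

    Positive values mean higher signal in the sample channel, negative
    values mean higher signal in the reference channel.
    """
    if sample_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(sample_intensity / reference_intensity)


if __name__ == "__main__":
    # Hypothetical (sample, reference) intensities per gene.
    spots = {
        "gene_A": (2000.0, 500.0),
        "gene_B": (250.0, 1000.0),
        "gene_C": (800.0, 800.0),
    }
    for gene, (sample, reference) in spots.items():
        print(gene, round(log2_ratio(sample, reference), 2))
```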

Aim of this thesis

The work presented in this thesis is part of a larger program with the general goal of developing a new method for determining the adverse effects that chemicals and soil pollution may have on a soil-dwelling invertebrate. This new method could then be used, in combination with existing methods, for risk assessment of chemicals and pollutants in the environment. More specifically, a recognized ISO test (ISO, 1999), which measures survival and reproduction in the collembolan Folsomia candida after 28 days of exposure to polluted or chemically spiked soil, was expanded by adding gene expression analysis as a measured end-point. The thesis of Benjamin Nota (2010), who was also involved in this program, showed that measuring gene expression with microarray technology in collembolans exposed to soil containing cadmium or phenanthrene made it possible to elucidate specific pathways induced by these chemical stresses. He was also able to distinguish gene expression patterns of F. candida exposed to different metals, showing that the enhanced soil quality test is not only sensitive but also specific. This is not the only example of gene expression methods being used to test the impact of chemicals or pollutants on ecologically relevant test species (Bierkens et al., 1998, Fountain, 2004, Hankard et al., 2005), since these techniques can give an indication of the organism’s physiological state at the moment of exposure.


Soil quality tests, however, come with a number of confounding factors (Gibson, 2002). As described in the previous paragraphs, soil properties can have a major impact on both the bio-availability and the chemical state of soil pollutants. The soil properties themselves can also influence a test animal’s physiology and thus its gene expression. Gene expression analysis itself complicates matters further: instead of one or a few endpoints, thousands of endpoints are measured, some of which are independent and some of which interact with others. To distinguish between effects caused by a pollutant and effects caused by confounding factors we need to determine how the test animal, in this case Folsomia candida, responds to a clean, unpolluted soil. The main aim of this thesis is to investigate how the Folsomia candida transcriptome responds to control situations, in order to separate stress-induced gene expression profiles from gene expression profiles found under unstressed conditions. Investigating the transcriptome under unstressed conditions is also biologically interesting, as it gives information about the natural variation of gene expression, which can lead to insights into the fundamental ecological niche of this collembolan.

Outline of this thesis

In Chapter 2 of this thesis the stability of reference genes is investigated. Gene expression analysis methods such as Q-PCR often rely on genes that show a stable expression pattern across the different treatments to act as control or reference genes. These reference genes make it possible to calculate relative expression levels for differentially regulated genes and to make comparisons between different conditions or treatments. Folsomia candida and another collembolan species, Orchesella cincta, were exposed to a range of treatments and the expression stability of different standard reference genes was determined for both species. Chapter 3 deals with the effects abiotic stress has on Folsomia candida transcriptional regulation. Collembolans were exposed to different soil pH values and temperatures, all in standard OECD soil, and the expression levels of nine stress-implicated genes were measured. In Chapter 4 the concept of a Natural Operating Range is introduced. If gene expression levels are to be measured as valid endpoints in soil ecotoxicological testing, a reference database with information on how the test animal responds to natural, unpolluted soils has to be established. 26 Dutch field soils, all part of the biological soil indicator network (BoBI) of the RIVM, were sampled, F. candida was exposed to these soils and gene expression was measured. A survival and reproduction test of 28 days was also performed to compare the gene expression results to these original ISO test end-points.

In Chapter 5 the impact of an aged copper-polluted field soil on F. candida gene expression was determined. In 1980 a field in Bennekom, the Netherlands, was spiked with four copper concentrations and four pH treatments. The pH treatments were repeated over the years, but no additional copper was added to the soil. This design made the site an ideal, controlled, aged metal-polluted soil to test. All copper/pH combinations were sampled, and gene expression, reproduction and growth were measured. Chapter 6 is the general discussion, in which the implications of the findings from the previous chapters are discussed and a general conclusion is drawn. Future considerations are also discussed.

References

2000. World reference base for soil resources. World Soil Resources Reports. Rome: Food and Agriculture Organization of the United Nations. Abdellatif, M. A., Hermanson, H. P. & Reynolds, H. T. 1967. Effect of Soil Clay and Organic Matter Content upon Systemic Efficacy of Two Carbamate Insecticides. Journal of Economic Entomology, 60, 1445-1449. Bierkens, J., Klein, G., Corbisier, P., Van Den Heuvel, R., Verschaeve, L., Weltens, R. & Schoeters, G. 1998. Comparative sensitivity of 20 bioassays for soil quality. Chemosphere, 37, 2935-2947. Bucher, K. & Frey, M. 2002. Petrogenesis Of Metamorphic Rocks, Springer. Busschers, F. S., Balen, R. T. V., Cohen, K. M., Kasse, C., Weerts, H. J. T., Wallinga, J. & Bunnik, F. P. M. 2008. Response of the Rhine–Meuse fluvial system to Saalian ice-sheet dynamics. Boreas, 37, 377-398. Busschers, F. S., Kasse, C., Van Balen, R. T., Vandenberghe, J., Cohen, K. M., Weerts, H. J. T., Wallinga, J., Johns, C., Cleveringa, P. & Bunnik, F. P. M. 2007. Late Pleistocene evolution of the Rhine-Meuse system in the southern North Sea basin: imprints of climate change, sea-level oscillation and glacio-isostacy. Quaternary Science Reviews, 26, 3216- 3248. Carpenter, S. R., Defries, R., Dietz, T., Mooney, H. A., Polasky, S., Reid, W. V. & Scholes, R. J. 2006. Millennium Ecosystem Assessment: Research needs. Science, 314, 257-258. Carpenter, S. R., Mooney, H. A., Agard, J., Capistrano, D., Defries, R. S., Díaz, S., Dietz, T., Duraiappah, A. K., Oteng-Yeboah, A., Pereira, H. M., Perrings, C., Reid, W. V., Sarukhan, J., Scholes, R. J. & Whyte, A. 2009. Science for managing ecosystem services: Beyond the Millennium Ecosystem Assessment. Proceedings of the National Academy of Sciences, 106, 1305-1312. Crommentuijn, T., Doornekamp, A. & Van Gestel, C. A. M. 1997. Bioavailability and ecological effects of cadmium on Folsomia candida (Willem) in an artificial soil substrate as influenced by pH and organic matter. Applied Soil Ecology, 5, 261-271. Davis, R. 
& Bennet, H. 1927. Grouping of soils on the basis of mechanical analysis. In: USDA (ed.) United States Department of Agriculture Departmental Circulation no. 419 USDA. Einsele, G. 2000. Sedimentary Basins, Evolution, Facies, and Sediment Budget Springer. Fountain, M. T. 2004. Biodiversity of Collembola in urban soils and the use of Folsomia candida to assess soil 'quality'. Fountain, M. T. & Hopkin, S. P. 2005. Folsomia candida (collembola): A "Standard" Soil Arthropod. Annual Review of Entomology, 50, 201-222. Gestel, C. A. M. & Ma, W.-C. 1993. Development of QSAR's in soil ecotoxicology: Earthworm toxicity and soil sorption of chlorophenols, chlorobenzenes and chloroanilines. Water, Air, & Soil Pollution, 69, 265-276. Gibson, G. 2002. Microarrays in ecology and evolution: a preview. Molecular Ecology, 11, 17-24. Guggenheim, S. & Martin, R. T. 1995. Definition of clay and clay mineral; joint report of the AIPEA nomenclature and CMS nomenclature committees. Clays and Clay Minerals, 43, 255-256. Hanczyc, M. M., Fujikawa, S. M. & Szostak, J. W. 2003. Experimental Models of Primitive Cellular Compartments: Encapsulation, Growth, and Division. Science, 302, 618-622.

Hankard, P. K., Bundy, J. G., Spurgeon, D. J., Weeks, J. M., Wright, J., Weinberg, C. & Svendsen, C. 2005. Establishing principal soil quality parameters influencing earthworms in urban soils using bioassays. Environmental Pollution, 133, 199-211. Haverkamp, R. & Parlange, J. Y. 1986. Predicting the Water-Retention Curve From Particle- Size Distribution: Sandy Soils Without Organic Matter. Soil Science, 142, 325-339. Hudson, B. D. 1994. Soil organic matter and available water capacity. Journal of Soil and Water Conservation, 49, 189-194. Iso 1999. ISO, Soil Quality. Inhibition of Reproduction of Collembola (Folsomia candida). ISO Guideline 11267. International Standardization Organization. Zwitserland. Jones, D. L. 1998. Organic acids in the rhizosphere – a critical review. Plant and Soil, 205, 25-44. Kammenga, J. E., Herman, M. A., Ouborg, N. J., Johnson, L. & Breitling, R. 2007. Microarray challenges in ecology. Trends in Ecology & Evolution, 22, 273-279. Karapanagioti, H. K., Kleineidam, S., Sabatini, D. A., Grathwohl, P. & Ligouis, B. 2000. Impacts of heterogeneous organic matter on phenanthrene sorption: Equilibrium and kinetic studies with aquifer material. Environmental Science & Technology, 34, 406-414. Kuipers, S. 1976. Bodemkunde, Culemborg, Tjeenk Willink. Lande, R. 1998. Anthropogenic, ecological and genetic factors in extinction and conservation. Researches on Population Ecology, 40, 259-269. Lavelle, P., Decaëns, T., Aubert, M., Barot, S., Blouin, M., Bureau, F., Margerie, P., Mora, P. & Rossi, J. P. 2006. Soil invertebrates and ecosystem services. European Journal of Soil Biology, 42, S3-S15. Le Maitre, R. W. 2002. Igneous Rocks: A Classification and Glossary of Terms, Recommendations of the International Union of Geological Sciences, Cambridge University Press. Lundström, U. S., Van Breemen, N. & Bain, D. 2000. The podzolization process. A review. Geoderma, 94, 91-107. Lyon, R. J. P. & Burns, E. A. 1963. 
Analysis of rocks and minerals by reflected infrared radiation. Economic Geology, 58, 274-284. Mcgrath, S. P., Chaudri, A. M. & Giller, K. E. 1995. Long-term effects of metals in sewage sludge on soils, microorganisms and plants. Journal of Industrial Microbiology and Biotechnology, 14, 94-104. Nesbitt, H. W., Fedo, C. M. & Young, G. M. 1997. Quartz and Feldspar Stability, Steady and Non‐steady‐State Weathering, and Petrogenesis of Siliciclastic Sands and Muds. The Journal of Geology, 105, 173-192. Nota, B. 2010. Ecotoxicogenomics of springtails. PhD thesis, VU University. Nriagu, J. O. 1996. A history of global metal pollution. Science, 272, 223-224. Oades, J. M. 1993. The role of biology in the formation, stabilization and degradation of soil structure. Geoderma, 56, 377-400. Ponnamperuma, C., Shimoyama, A. & Friebele, E. 1982. Clay and the origin of life. Origins of Life and Evolution of Biospheres, 12, 9-40. Potapow, M. 2001. Synopses on Palaearctic Collembola., Staatsliches Museum fur Naturkunde Gorlitz. Press, F. & Siever, R. 1986. Earth, New York, W.H. Freeman & company. Rama Murthy, V. & Hall, H. T. 1972. The origin and chemical composition of the earth's core. Physics of The Earth and Planetary Interiors, 6, 123-130. Rieuwerts, J. S., Thornton, I., Farago, M. E. & Ashmore, M. R. 1998. Factors influencing metal bioavailability in soils: preliminary investigations for the development of a critical loads approach for metals. Chemical Speciation and Bioavailability, 10, 61-75.

Roper, M. & Gupta, V. 1995. Management-practices and soil biota. Australian Journal of Soil Research, 33, 321-339. Rutgers, M., Mesman, M. & Otte, P. 2005. TRIADE. Instrumentarium voor geintegreerde ecotoxicologische beoordeling van bodemverontreinigingen. Den Haag, SDU Uitgevers. Rutherford, D. W., Chiou, C. T. & Kile, D. E. 1992. Influence of soil organic matter composition on the partition of organic compounds. Environmental Science & Technology, 26, 336-340. Schena, M., Shalon, D., Davis, R. W. & Brown, P. O. 1995. Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA Microarray. Science, 270, 467-470. Schlesinger, W. H. & Andrews, J. A. 2000. Soil respiration and the global carbon cycle. Biogeochemistry, 48, 7-20. Schouten, A., Rutgers, M. & Breure, A. 2001. BISQ on the road. Interim evaluation of the project Biological Indicator for Soil Quality RIVM. Southern, E. M. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. Journal of Molecular Biology, 98, 503-&. Tisdall, J. M. & Oades, J. M. 1982. Organic matter and water-stable aggregates in soils. European Journal of Soil Science, 33, 141-163. Yaroshevsky, A. 2006. Abundances of chemical elements in the Earth’s crust. Geochemistry International, 44, 48-55.

Chapter 2

Reference genes for QRT-PCR tested under various stress conditions in Folsomia candida and Orchesella cincta (Insecta, Collembola)

Muriel E de Boer *, Tjalf E de Boer *, Janine Mariën, Martijn JTN Timmermans, Benjamin Nota, Nico M van Straalen, Jacintha Ellers and Dick Roelofs

BMC Molecular Biology (2009) 10:54

* These authors contributed equally to the work

Abstract

Background

Genomic studies measuring transcriptional responses to changing environments and stress are currently making their way into the fields of evolutionary ecology and ecotoxicology. To investigate a small to medium number of genes, or to confirm large-scale microarray studies, Quantitative Reverse Transcriptase PCR (Q-PCR) can achieve high accuracy of quantification when key standards, such as normalization, are carefully set. In this study, we validated potential reference genes for their use as endogenous controls under different chemical and physical stresses in two species of soil-living collembolans, Folsomia candida and Orchesella cincta. Treatments for F. candida were cadmium exposure, phenanthrene exposure, desiccation, heat shock and pH stress, and for O. cincta cadmium, desiccation, heat shock and starvation.

Results

Eight potential reference genes for F. candida and seven for O. cincta were ranked by their stability per stress factor using the programs geNorm and Normfinder. For F. candida the succinate dehydrogenase (SDHA) and eukaryotic transcription initiation factor 1A (ETIF) genes were found to be the most stable over the different treatments, while for O. cincta the beta-actin (ACTb) and tyrosine 3-monooxygenase (YWHAZ) genes were the most stable.

Conclusion

We present a panel of reference genes for two emerging ecological genomic model species tested under a variety of treatments. Within each species, different treatments resulted in differences in the most stable reference genes. Moreover, the two species differed in suitable reference genes even when exposed to similar stresses. This might be attributed to dissimilarity in physiology. It is vital to rigorously test a panel of reference genes for each species and treatment before relative quantification of Q-PCR gene expression measurements.

Background

Genomic techniques have undergone major developments in the last two decades. As a result, they have become well suited to evolutionary and ecotoxicological studies, which generally use organisms that are not genomic models. Quantitative Reverse-Transcriptase Polymerase Chain Reaction (QRT-PCR or Q-PCR) is a technique to estimate gene expression levels. It is often used to confirm results from high-throughput systems such as microarrays. Its application has mainly been limited to small numbers of genes per experiment, because of its low throughput and relatively high costs per assay. This is about to change, as high-throughput Q-PCR systems using small-volume (capillary) PCR are becoming available (Morrison et al., 1998). Q-PCR is a valuable tool for ecological studies as it provides a relatively straightforward way to measure the direct transcriptional response of an organism exposed to different treatments (Snell et al., 2003). Q-PCR has been applied to study adaptive evolution at the transcriptional level (Ellers et al., 2008, Muller et al., 2004, Scharf et al., 2003, Zientz et al., 2006). For instance, Roelofs et al. (2007) conducted a Q-PCR study to assess the relevance of transcriptional regulation in the adaptive evolution of stress tolerance. The conditions that have to be met for a successful Q-PCR experiment are reviewed by Bustin (2002).

Different strategies have been developed for quantifying gene expression with Q-PCR data. The most widely used method is to quantify the relative amount of target mRNA between samples, using for instance the comparative CT (2^-ΔΔCT) method (Applied Biosystems, 2001), or the more comprehensive method developed by Pfaffl, in which relative quantities are adjusted for amplification efficiencies. Relative quantification methods depend on reference genes for normalization (Pfaffl, 2001).
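The two relative quantification approaches mentioned above can be illustrated with a minimal Python sketch. The function names and the Ct values in the example are hypothetical and are not taken from this study. The comparative CT calculation assumes that both assays double the product every cycle, whereas the efficiency-corrected ratio takes an amplification factor per assay (2.0 meaning perfect doubling).

```python
def ddct_ratio(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Comparative CT (2^-ddCT) ratio; assumes perfect doubling per cycle."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


def efficiency_corrected_ratio(e_target, e_ref,
                               ct_target_control, ct_target_treated,
                               ct_ref_control, ct_ref_treated):
    """Efficiency-corrected ratio in the style of Pfaffl (2001).

    e_target and e_ref are amplification factors per cycle (2.0 = doubling).
    """
    target_term = e_target ** (ct_target_control - ct_target_treated)
    ref_term = e_ref ** (ct_ref_control - ct_ref_treated)
    return target_term / ref_term


# Hypothetical Ct values: the target gene appears 3 cycles earlier after
# treatment, while the reference gene stays at the same Ct.
fold_ddct = ddct_ratio(22.0, 18.0, 25.0, 18.0)
fold_equal_eff = efficiency_corrected_ratio(2.0, 2.0, 25.0, 22.0, 18.0, 18.0)
fold_lower_eff = efficiency_corrected_ratio(1.9, 2.0, 25.0, 22.0, 18.0, 18.0)
```

With equal, perfect efficiencies the two calculations agree. A lower amplification factor for the target assay reduces the estimated fold change, which is why the efficiency correction can matter.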
Q-PCR reference genes, sometimes called 'housekeeping' genes, can either be internal, when measured in the same reaction tube as the target gene, or external, when measured in a different reaction tube. The use of reference genes is necessary to correct for factors such as differences in RNA input and variation in reverse transcriptase efficiency (Huggett et al., 2005). An essential requirement for Q-PCR reference genes is transcriptional stability across the various conditions to which an organism is exposed during an experiment. Classically, the most used reference genes are carry-overs from the Northern-blot days (Heckmann et al., 2006), such as beta actin (ACTb), glyceraldehyde-3P-dehydrogenase (GAPDH) and ribosomal RNA genes (18S, 28S). Unfortunately, these classical reference genes are not always suitable for this use. For example, mRNA levels of ACTb and GAPDH can fluctuate widely in human T-cells exposed to different treatments (Bas et al., 2004). Also, commonly used housekeeping genes like ACTb, GAPDH, cyclophilin A (CYP) and 28S are up- or down-regulated in cell lines exposed to hypoxic stress (Zhong and Simons, 1999). Therefore, it is vital to validate the stability of a panel of reference genes in order to select the most suitable ones for each new treatment and species of choice.

Currently, two Visual Basic applications for Microsoft Excel are widely used to determine reference gene suitability: geNorm (Vandesompele et al., 2002) and Normfinder (Andersen et al., 2004). GeNorm is based on the principle that the expression ratio of two ideal control genes should be identical in all samples and experimental conditions. It calculates a gene expression stability measure (M), which is the mean pair-wise variation between an individual gene and all other tested reference genes. Subsequently, the least stable reference gene (highest M value) is excluded from the set and the calculation is repeated until the two most stable reference genes remain. Normfinder (Andersen et al., 2004) estimates intergroup and intragroup variation to calculate reference gene stability and to rank the genes. Like geNorm, it calculates a stability value for each potential reference gene, but it uses a model-based variance estimation approach (mixed linear effect modeling) instead of the iterative approach used by geNorm. Normfinder calculates the intragroup variability for the genes in each of the groups, and the intergroup variability or bias among the groups (Bergkvist et al., 2008). Since the program can differentiate between groups, Normfinder is best suited when the stability of reference genes needs to be assessed over multiple treatments. When reference gene stability only needs to be calculated for samples exposed to a single treatment, the two methods are similar and should give the same results (Andersen et al., 2004). In this paper we have developed a panel of reference genes for two species of collembolans and studied their stability across various treatments.
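The stepwise exclusion performed by geNorm, as described above, can be sketched in a few lines of Python. This is a simplified illustration and not the geNorm program itself. It assumes that the input is a dictionary of relative quantities on a linear scale, one value per sample, and that the pair-wise variation is the standard deviation of the log2-transformed expression ratios. The gene names and values are hypothetical.

```python
import math
import statistics


def genorm_m(quantities):
    """Return the stability measure M for every gene.

    quantities maps a gene name to a list of relative quantities (linear
    scale), one per sample. M is the mean, over all other genes, of the
    standard deviation of the log2 expression ratios.
    """
    genes = list(quantities)
    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [math.log2(a / b)
                          for a, b in zip(quantities[gene], quantities[other])]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[gene] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


def genorm_rank(quantities):
    """Exclude the least stable gene (highest M) until two genes remain."""
    remaining = dict(quantities)
    excluded = []  # least stable gene first
    while len(remaining) > 2:
        m_values = genorm_m(remaining)
        worst = max(m_values, key=m_values.get)
        excluded.append(worst)
        del remaining[worst]
    return excluded, sorted(remaining)


# Hypothetical data: REF_A and REF_B keep a constant ratio across the four
# samples, whereas REF_C fluctuates independently of both.
example = {
    "REF_A": [1.0, 2.0, 4.0, 8.0],
    "REF_B": [2.0, 4.0, 8.0, 16.0],
    "REF_C": [1.0, 8.0, 1.0, 8.0],
}
excluded, best_pair = genorm_rank(example)
```

In this example REF_C is excluded first, because its ratio to the other two genes varies between samples. REF_A and REF_B remain as the final pair even though their absolute quantities change, which reflects the ratio-based principle described above.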
Collembolans are important model organisms in evolutionary ecology and ecotoxicological studies. The soil-dwelling collembolan Folsomia candida is a standard test animal used in a soil pollution test that is certified by the International Organization for Standardization (ISO, 1999). Its parthenogenetic mode of reproduction and the availability of a recently sequenced Expressed Sequence Tag (EST) database also make F. candida a suitable test animal for genomic studies on the effects of soil pollution (Fountain and Hopkin, 2005, Timmermans et al., 2007a, Timmermans et al., 2007b). An F. candida microarray is currently in use for testing soil toxicants (Nota et al., 2008) and physical conditions (de Boer et al., 2010, Timmermans et al., 2009). Confirmation of microarray results by Q-PCR is therefore frequently needed. Orchesella cincta is a sexually reproducing collembolan that lives in the litter layer rather than in the soil and is generally used to study adaptation and phenotypic plasticity (Bahrndorff et al., 2006, Liefting and Ellers, 2008). Furthermore, it is an emerging genomic model to study adaptive evolution in polluted environments (Timmermans et al., 2007a, Timmermans et al., 2007b, Roelofs et al., 2006).

Here we study the stability of potential Q-PCR reference genes across various treatments. We exposed the two collembolan species to several stressors that are currently under investigation in gene expression studies (F. candida: cadmium, phenanthrene, desiccation, temperature and pH stress; O. cincta: cadmium, desiccation, temperature stress and starvation). Based on previous studies, such as those described by Heckmann et al. (2006) and Spinsanti et al. (2006), a panel of potential Q-PCR reference genes was developed (Table 1) using the collembolan EST database Collembase (http://www.collembase.org). In addition, a target gene (expected to be differential) was measured for each treatment, to validate the impact of the treatment at the transcriptional level of the organism. The stability of the potential reference genes was determined using both geNorm and Normfinder.

Table 1: Overview of the reference and target genes used in this study, including GenBank accession numbers (a) and Collembase clusters (b). FC: Folsomia candida; OC: Orchesella cincta.

Gene name | Symbol | GenBank accession no. (a) | Collembase id (b) | Gene Ontology
Beta Actin | ACTb | FC: EV473840; OC: AY779737 | FC: Fcc01756; OC: n/a | Structural constituent of cytoskeleton (F) GO:0005200
Glyceraldehyde 3-phosphate dehydrogenase | GAPDH | FC: EV479869; OC: FJ009068 | FC: Fcc05545; OC: n/a | Glycolysis (P) GO:0006096
Ubiquitin conjugating enzyme | UBC | FC: EV475860; OC: n/a | FC: Fcc00615; OC: n/a | Ubiquitin-protein ligase activity (F) GO:0004842
Succinate dehydrogenase | SDHA | FC: EV476739; OC: FJ009079 | FC: Fcc06005; OC: n/a | Tricarboxylic acid cycle (P) GO:0006099
Tyrosine 3-monooxygenase | YWHAZ | FC: EV474941; OC: n/a | FC: Fcc02512; OC: Occ00412 | Protein domain specific binding (F) GO:0019904
Elongation factor 1-alpha | EF1a | FC: EV473706; OC: AH009877 | FC: Fcc05454; OC: n/a | Protein biosynthesis (P) GO:0006412
Eukaryotic Transcription Initiation Factor 5a | ETIF | FC: EV479461; OC: n/a | FC: Fcc02111; OC: n/a | Protein biosynthesis (P) GO:0006412
Cyclophilin A | CYP | FC: EV475615; OC: n/a | FC: Fcc01655; OC: n/a | Protein folding (P) GO:0006457
28S | 28S | FC: n/a; OC: AF483443 | FC: n/a; OC: n/a | Large ribosomal subunit (C) GO:0022625
Alpha Tubulin | TBa | FC: n/a; OC: GD180623 | FC: n/a; OC: n/a | Contributes to the structural integrity of a cytoskeletal structure (F) GO:0005200
Heat Shock Protein 70 | HSP70 | FC: EV473626; OC: FJ009069 | FC: Fcc01609; OC: n/a | Response to unfolded protein (P) GO:0006986
V-type ATPase | ATPase | FC: EV476428; OC: n/a | FC: Fcc04630; OC: n/a | Hydrogen transport activity (F) GO:0015078
Cuticle Protein | CP | FC: EV479600; OC: n/a | FC: Fcc04701; OC: n/a | A molecule that contributes to the structural integrity of a cuticle (F) GO:0042302
Mitochondrial chaperone BCS1 | BCS1 | FC: EV473062; OC: n/a | FC: Fcc00101; OC: n/a | ATP binding (F) GO:0005524
Metallothionein | MT | FC: n/a; OC: AF036345 | FC: n/a; OC: Occ00204 | Metal ion binding (F) GO:0046872

Results

Expression levels under different conditions

Average cycle threshold (Ct) values varied widely between conditions and treatments, ranging from 9.7 (28S, O. cincta) to 28.6 (CYP, F. candida; see Additional file 1 for an overview). An example of the different expression levels in treatments and conditions is given in Figure 1 for ACTb and the differential metallothionein gene (MT) of O. cincta. In O. cincta, the target gene MT showed a 7-fold up-regulation (FR) in the cadmium treatment (P < 0.05) and HSP70 was 5-fold up-regulated after heat shock (P < 0.001). Starvation and desiccation treatments showed no significant changes in either of the two target genes, implying that those treatments had fairly small effects, at least on the transcriptional response of these two genes. In F. candida, all but one treatment showed a significant effect on the target genes (cadmium: mitochondrial chaperone BCS-1 (BCS1), FR = 20, P < 0.0001; phenanthrene: BCS1, FR = 6.4, P = 0.001; temperature: HSP70, FR > 2, P < 0.001 (for all conditions); desiccation: Cuticle protein (CP), FR = 4, P < 0.05). The pH treatment showed no significant differences in the target V-type ATPase (ATPase); in fact, the ATPase gene used in the pH treatment was more stable than most reference genes, again implying that this treatment had only a modest effect on the transcriptional level of this gene.

Figure 1: Average Ct values between the biological replicates of the beta actin (ACTb) and metallothionein (MT) genes from Orchesella cincta exposed to the cadmium, desiccation, starvation and temperature treatments. “Exp” stands for the exposed group, “contr” for the control group. 10, 20 and 35 in the temperature treatment stand for the different temperatures (in °C) the animals were exposed to.

Ranking of the reference genes

Stability rankings were established using geNorm and Normfinder. As described in the methods section, the Normfinder analysis should be executed with genes that show no substantial treatment-specific response (Bergkvist et al., 2008). Therefore, differential genes were excluded in most cases, as well as some of the other genes that showed a systematic response to the treatment (see Table 2 for an overview). The stability rankings generated by geNorm and Normfinder were largely similar, even though the ranking order of the genes differed to some extent. An overview of which genes to use for each treatment is found in Table 3. Of the three treatments tested for both species, only the cadmium treatment gave similar results between F. candida (tyrosine 3-monooxygenase (YWHAZ), succinate dehydrogenase (SDHA) and GAPDH) and O. cincta (alpha-tubulin (TBa), which was not tested in F. candida, SDHA and YWHAZ). To compare interspecies parallels, an additional geNorm analysis was done including only the reference genes available for both species (see Additional file 2). This did not change the fact that most rankings did not correspond between species.

In F. candida, overall analyses show the same outcomes for the cadmium treatment and the phenanthrene treatment: YWHAZ, SDHA, and GAPDH. For the temperature and desiccation treatments (F. candida) results also overlapped, with SDHA, ETIF and elongation factor 1α (EF1a) being the best suited reference genes. The results for O. cincta seem to be quite variable. The commonly used reference gene ACTb is placed in all top rankings except that of the cadmium treatment. 28S, a well-known but also controversial reference gene, showed a high stability in the temperature treatment. Also remarkable is the stability of HSP70, expected to be differential, in the desiccation and starvation treatments. A similar result has been reported previously for Drosophila melanogaster, which did not show an increase in HSP70Aa mRNA levels in response to starvation and desiccation (Sinclair et al., 2007). Although the ranking of reference genes shows no overall uniformity across the different treatments, the most generally applicable reference genes suggested are ACTb, YWHAZ and TBa for O. cincta and SDHA, ETIF and YWHAZ for F. candida.

Table 2a: Omitted and remaining genes and their absolute total biases after preselection for Normfinder analysis for Folsomia candida

Treatment | Bias threshold | Omitted genes (bias) | Remaining genes (bias)
Temperature | 2.24 | YWHAZ (3.78), HSP70 (4.95) | ETIF (1.01), UBC (1.05), SDHA (1.19), EF1a (1.38), CYP (1.42), GAPDH (1.74), ACTb (2.02)
Desiccation | 1.72 | GAPDH (2.19), CP (4.99) | UBC (0.43), CYP (0.70), EF1a (0.70), ETIF (0.97), YWHAZ (1.13), SDHA (1.18), ACTb (1.54)
Cadmium | 1.44 | ETIF (1.64), BCS1 (4.31) | ACTb (0.20), EF1a (0.32), CYP (0.84), YWHAZ (0.90), SDHA (0.95), GAPDH (1.11), UBC (1.24)
Phenanthrene | 0.85 | ETIF (0.89), BCS1 (3.00) | UBC (0.14), ACTb (0.21), EF1a (0.33), YWHAZ (0.37), SDHA (0.38), GAPDH (0.55), CYP (0.79)
pH | 0.67 | EF1a (0.67), GAPDH (0.81), YWHAZ (0.90), CYP (1.02) | ETIF (0.36), UBC (0.38), SDHA (0.49), ACTb (0.57), ATPase (0.59)

Table 2b: Omitted and remaining genes and their absolute total biases after preselection for Normfinder analysis for Orchesella cincta

Treatment | Bias threshold | Omitted genes (bias) | Remaining genes (bias)
Temperature | 1.08 | MT (1.38), HSP70 (3.49) | TBa (0.24), SDHA (0.32), YWHAZ (0.45), EF1a (0.51), 28S (0.58), ACTb (0.71), GAPDH (0.87)
Desiccation | 0.16 | SDHA (0.26), 28S (0.27), MT (0.34) | HSP70 (0.02), GAPDH (0.06), YWHAZ (0.07), EF1a (0.09), TBa (0.10), ACTb (0.15)
Cadmium | 0.76 | GAPDH (0.81), MT (2.88) | HSP70 (0.01), ACTb (0.03), SDHA (0.27), 28S (0.29), YWHAZ (0.44), TBa (0.50), EF1a (0.60)
Starvation | 0.33 | MT (0.35), EF1a (0.55), GAPDH (0.58), SDHA (0.67) | ACTb (0.07), HSP70 (0.09), YWHAZ (0.13), TBa (0.14), 28S (0.16)
All treatments | 1.21 | total RNA (2.06), MT (3.33) | ACTb (0.34), SDHA (0.53), TBa (0.55), 28S (0.91), YWHAZ (1.00), EF1a (1.04), GAPDH (1.05), HSP70 (1.20)

Optimum number of reference genes

The optimum numbers of reference genes are shown in Table 3. In only two out of nine treatments did the two methods agree on the number of genes to use for normalization (two genes in these cases). However, in nearly all cases either geNorm or Normfinder recommended the use of two reference genes. When the two sets proposed by the programs were compared, the differences in significance levels and relative expression levels of a differential gene were found to be very small.
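How geNorm arrives at an optimum number of genes is not detailed in this chapter. As described by Vandesompele et al. (2002), it compares normalization factors based on the n and the n + 1 most stable genes, and an additional gene is considered unnecessary once the pair-wise variation V between the two factors drops below a cut-off, commonly 0.15. The Python sketch below illustrates that idea under these assumptions. The function names, gene names and values are hypothetical, and the 0.15 cut-off is the commonly cited default rather than a value stated in this study.

```python
import math
import statistics


def normalization_factors(quantities, genes):
    """Per-sample geometric mean of the relative quantities of the given genes."""
    n_samples = len(quantities[genes[0]])
    return [
        math.prod(quantities[g][i] for g in genes) ** (1.0 / len(genes))
        for i in range(n_samples)
    ]


def pairwise_variation(quantities, ranked_genes, n):
    """V(n/n+1): standard deviation of log2(NF_n / NF_n+1) across samples."""
    nf_n = normalization_factors(quantities, ranked_genes[:n])
    nf_next = normalization_factors(quantities, ranked_genes[:n + 1])
    return statistics.stdev([math.log2(a / b) for a, b in zip(nf_n, nf_next)])


def optimum_number(quantities, ranked_genes, cutoff=0.15):
    """Smallest n (at least 2) for which adding one more gene changes little."""
    for n in range(2, len(ranked_genes)):
        if pairwise_variation(quantities, ranked_genes, n) < cutoff:
            return n
    return len(ranked_genes)


ranking = ["REF_A", "REF_B", "REF_C"]  # most stable gene first (hypothetical)

# Case 1: the third gene follows the first two closely.
consistent = {
    "REF_A": [1.0, 2.0, 4.0, 8.0],
    "REF_B": [2.0, 4.0, 8.0, 16.0],
    "REF_C": [3.0, 6.1, 11.9, 24.0],
}
# Case 2: the third gene behaves differently, so it shifts the factor.
divergent = dict(consistent, REF_C=[1.0, 8.0, 1.0, 8.0])

n_consistent = optimum_number(consistent, ranking)
n_divergent = optimum_number(divergent, ranking)
```

In the first case two genes suffice. In the second case V stays above the cut-off, so the procedure keeps the third gene, because a large V means that the extra gene has a substantial effect on the normalization factor.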

Table 3: The most stable reference genes and the optimum number of genes per treatment calculated by geNorm and Normfinder per collembolan species.

Folsomia candida
Treatment | Top 3 most stable genes (Normfinder) | Top 3 most stable genes (geNorm) | Optimum no. genes (Normfinder) | Optimum no. genes (geNorm)
Temperature | SDHA, ETIF, EF1a | SDHA, ETIF, EF1a | 2 | >3
Desiccation | SDHA, ETIF, EF1a | SDHA, ETIF, GAPDH | 2 | 3
Cadmium | YWHAZ, SDHA, GAPDH | YWHAZ, SDHA, GAPDH | 3 | 2
Phenanthrene | SDHA, YWHAZ, GAPDH | SDHA, YWHAZ, GAPDH | 2 | 2
pH | UBC, ETIF, ACTb | ACTb, CYP, EF1a | >3 | >3
Starvation | n/a | n/a | n/a | n/a
All treatments | n/a | n/a | n/a | n/a

Orchesella cincta
Treatment | Top 3 most stable genes (Normfinder) | Top 3 most stable genes (geNorm) | Optimum no. genes (Normfinder) | Optimum no. genes (geNorm)
Temperature | 28S, ACTb, YWHAZ | YWHAZ, ACTb, 28S | >3 | 2
Desiccation | ACTb, GAPDH, EF1a | ACTb, GAPDH, EF1a | 2 | 2
Cadmium | TBa, SDHA, YWHAZ | TBa, YWHAZ, SDHA | >3 | 2
Phenanthrene | n/a | n/a | n/a | n/a
pH | n/a | n/a | n/a | n/a
Starvation | ACTb, YWHAZ, TBa | TBa, YWHAZ, SDHA | >3 | 2
All treatments | ACTb, TBa, YWHAZ | n/a | n/a | n/a

Effect of method of normalization on relative expression of HSP70

We selected the temperature treatment with the differential HSP70 to illustrate the effects of reference gene selection on the calculated relative expression level of a gene of interest. This treatment was chosen because there were four different conditions and HSP70 expression clearly responded differently to each of these conditions. We normalized HSP70 expression in the F. candida temperature treatment with four different sets of reference genes: i) a commonly used single reference gene (ACTb) only; ii) the appropriate number of selected genes according to the geNorm analysis; iii) the appropriate number of selected genes according to the Normfinder analysis; and iv) all available reference genes in the F. candida temperature dataset. Each set of reference genes showed significant up-regulation of HSP70 expression in the animals exposed to 0°C and 30°C as compared to those exposed to 10°C and 20°C (P < 0.05), as well as a significant difference between 10°C and 20°C (Figure 2). However, the difference in HSP70 expression between 0°C and 30°C was only significant when using the reference genes proposed by the geNorm and Normfinder analyses. This indicates that the selection of reference genes can influence the resolution with which differences in gene expression between two samples can be detected.
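To illustrate how the choice of reference genes can change a calculated expression level, the following Python sketch normalizes a target gene against different sets of reference genes. It is a simplified example and not the calculation used in this study. It assumes that Ct values are converted to relative quantities with a fixed amplification factor, and that several reference genes are combined through their geometric mean, as proposed by Vandesompele et al. (2002). All gene names and Ct values are hypothetical.

```python
import math


def relative_quantities(ct_values, efficiency=2.0):
    """Quantities relative to the sample with the lowest Ct (highest expression)."""
    min_ct = min(ct_values)
    return [efficiency ** (min_ct - ct) for ct in ct_values]


def normalized_expression(target_ct, reference_cts, efficiency=2.0):
    """Target quantity divided by the geometric mean of the reference quantities.

    target_ct is a list of Ct values, one per sample. reference_cts maps a
    reference gene name to its list of Ct values for the same samples.
    """
    target_q = relative_quantities(target_ct, efficiency)
    ref_q = [relative_quantities(cts, efficiency) for cts in reference_cts.values()]
    normalized = []
    for i, quantity in enumerate(target_q):
        factor = math.prod(q[i] for q in ref_q) ** (1.0 / len(ref_q))
        normalized.append(quantity / factor)
    return normalized


# Two hypothetical samples (control, treated). The target gains two cycles.
target = [24.0, 22.0]
stable_ref = {"REF_STABLE": [20.0, 20.0]}
drifting_ref = {"REF_DRIFT": [18.0, 17.0]}  # itself responds to the treatment
both_refs = {"REF_STABLE": [20.0, 20.0], "REF_DRIFT": [18.0, 17.0]}


def fold_change(values):
    """Expression in the treated sample relative to the control sample."""
    return values[1] / values[0]


fold_stable = fold_change(normalized_expression(target, stable_ref))
fold_drifting = fold_change(normalized_expression(target, drifting_ref))
fold_both = fold_change(normalized_expression(target, both_refs))
```

With the stable reference the target shows a 4-fold induction, with the drifting reference only a 2-fold induction, and with both combined an intermediate value. The same mechanism can blur or sharpen differences between treatments, depending on which reference genes are used.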

Figure 2: HSP70 expression in different pools of F. candida exposed to different temperatures, normalized with four different sets of reference genes (ACTb: only beta-actin; the best set selected by geNorm (SDHA & ETIF); the best set selected by Normfinder (ETIF, SDHA, EF1a & UBC); and all available reference genes). One-way ANOVA with Bonferroni post-hoc test revealed significant differences (P < 0.05) between 0, 10 and 20°C and between 10, 20 and 30°C in all normalization sets. Significant differences between 0 and 30°C were only found when normalized with the optimum sets of reference genes selected by either geNorm or Normfinder. The letters above the bars indicate significance: bars with different letters differ significantly within one normalization set.

Ranking O. cincta reference genes over all treatments

Making comparisons between experiments is difficult when no standardization of reaction conditions has taken place (Bustin et al., 2005). Reverse transcription is the most crucial step in the introduction of technical sampling variation in the Q-PCR procedure (Stahlberg et al., 2004). In the O. cincta dataset we attempted to meet the standards needed for a multi-treatment analysis, which allowed us to analyze the whole O. cincta dataset with Normfinder in the same way as the single-treatment analyses. We also included a non-normalized situation ('total RNA input'), which was equal for all samples. The differential MT gene and 'total RNA input' were both omitted from the ranking because their bias exceeded the bias threshold (Table 2; see materials section and Additional file 3). Although geNorm is less suited to handle heterogeneity than Normfinder, both algorithms produced the same top three of generally applicable reference genes: ACTb, TBa, and YWHAZ.
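The preselection step can be made concrete with the 'All treatments' values for O. cincta reported in Table 2. The Python sketch below assumes the simplest reading of the procedure, namely that a candidate is omitted when its absolute total bias exceeds the threshold. The function name is hypothetical, and the handling of values exactly at the threshold is an assumption.

```python
def preselect(biases, threshold):
    """Split candidates into omitted and remaining sets by absolute total bias.

    Assumption: a candidate is omitted when its bias is strictly greater
    than the threshold. Remaining candidates are sorted by increasing bias.
    """
    omitted = sorted(g for g, b in biases.items() if b > threshold)
    remaining = sorted((g for g, b in biases.items() if b <= threshold),
                       key=biases.get)
    return omitted, remaining


# Absolute total biases for O. cincta, all treatments combined (Table 2).
all_treatments = {
    "total RNA": 2.06, "MT": 3.33, "ACTb": 0.34, "SDHA": 0.53, "TBa": 0.55,
    "28S": 0.91, "YWHAZ": 1.00, "EF1a": 1.04, "GAPDH": 1.05, "HSP70": 1.20,
}
omitted, remaining = preselect(all_treatments, threshold=1.21)
```

Under this reading, MT and 'total RNA input' are omitted and the eight remaining candidates are passed on to the ranking, which matches the corresponding row of Table 2.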

Discussion and Conclusion

In this study we identified appropriate reference genes for two collembolan species that show invariant expression across experimental treatments. The most important result of our study is the lack of similarity in stability of genes i) within a species upon exposure to different treatments, and ii) between species that have undergone similar treatments. The two collembolan species have a distinct physiology and ecology, but they belong to phylogenetically related families (D'Haese, 2002). An earlier study on two flatfish species, with a phylogenetic relatedness comparable to that of the two collembolan species used here, found congruent results in reference gene stabilities between the two species (Infante et al., 2008). In our study, differences were most pronounced in the temperature and desiccation treatments, while the cadmium treatments showed a greater congruency between species. It may be argued that the physiological responses to xenobiotic stress are more similar in the two species than the responses to environmental factors for which specific adaptations exist. Still, experimental methods also differed between species and therefore it cannot be excluded that part of the observed interspecific differences was due to differences in experimental set-up.

Previously, it has been suggested that the results of reference gene selection studies might serve as a resource for future gene expression studies in the same or related species (Infante et al., 2008, Scharlaken et al., 2008). Indeed, some genes were stable across different treatments. ACTb and YWHAZ were the dominant genes found at the top of the rankings of three out of four O. cincta treatments, but this general pattern did not hold for ACTb in the cadmium treatment and YWHAZ in the desiccation treatment. In F. candida, the two chemical treatments (cadmium and phenanthrene) and the environmental treatments (temperature, desiccation and pH stress) differed in selected reference genes, indicating that different classes of physiological responses, for instance detoxification and acclimation, can be expected to cause fluctuations in different 'housekeeping' genes to maintain cellular homeostasis. Therefore one cannot assume that two species responding similarly to a certain treatment will also respond in a similar way to another treatment. In addition, differences in priming strategy and Q-PCR assay characteristics can also introduce variation in stability comparisons. In the light of our present results we must caution against the use of literature data from related species to select a normalization standard, unless careful note is taken of the parallels and differences in the technical context of the experiment as well as the internal processes of the organisms that are being studied.

Optimizing a set of reference genes not only requires making the right choice of genes, it also implies choosing the right number of genes. The difference in calculated levels of HSP70 expression, using different sets of reference genes, exemplifies the importance of the number of reference genes used. Resolution was too low to detect small differences in expression when either a single reference gene was used for normalization or when all available reference genes were included. In fact, the small difference between 0°C and 30°C was only statistically significant when normalization was done with the sets proposed by geNorm and Normfinder. In the geNorm analysis, inclusion of a differential gene gives insight into the variability of all other genes. Hypothetically, this differential gene should be the most unstable gene in the dataset.
In both our species and all treatments (except for the pH treatment) at least one of the selected differential genes was indeed the most variable of the set tested (Vandesompele et al., 2002). This provides evidence that the applied treatments caused an effect at the transcriptional level, which strengthens the validation of the reference genes that remained stable under the changed regimes. The only exception was observed for the pH treatment, where the pre-selected differential gene was positioned among the housekeeping genes. Most likely the treatment was not severe enough, or the result may indicate that some exposure types do not initiate an effect at the transcriptional level. Normfinder performs relative comparisons between the potential reference genes; hence this method is more sensitive to the presence of differentially expressed genes in the dataset than geNorm. In this study, we set a threshold for the maximal allowable bias at 0.13 times the standard deviation of the treatments' intergroup variation as calculated with Normfinder. After the pre-selection, the reference genes selected by geNorm and Normfinder were remarkably similar. In all treatments, except for the F. candida pH treatment and the O. cincta starvation treatment, the top three consisted of the same three genes even though the order differed.

Q-PCR has proven its value in many areas of genetic and genomic research. Knowledge of genetic pathways and molecular responses to external environmental cues and chemical factors now also makes a significant contribution to evolutionary ecology and ecotoxicology. Experiments in these scientific disciplines often focus on non-model organisms, such as collembolans, and routinely use large sample sizes. The availability of high-throughput Q-PCR systems (Morrison et al., 2006) means that this technique will become more valuable for molecular ecological research. The reference genes presented in this study can therefore act as a starting point for scientists who use the same collembolan species for ecological, evolutionary and ecotoxicological research. Nonetheless, as previously stated by Stürzenbaum and Kille (2001), it should always be kept in mind that technically successful Q-PCR depends not only on the genes of interest, and therefore reference genes should be carefully validated prior to experimentation.
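The geNorm ranking referred to above rests on the gene-stability measure M of Vandesompele et al. (2002): for each candidate gene, the average standard deviation of its log-transformed expression ratios with every other candidate. The following is a minimal Python sketch of that measure, not the geNorm applet itself. It assumes an amplification efficiency of exactly 2 for every assay, so that log2 expression ratios reduce to Ct differences; the function name, gene labels and Ct values are hypothetical.

```python
from statistics import mean, stdev


def genorm_m(ct_by_gene):
    """Return a dict gene -> stability value M (lower means more stable).

    ct_by_gene maps each gene to a list of Ct values, one per sample,
    with all lists in the same sample order.
    """
    genes = list(ct_by_gene)
    m_values = {}
    for gene_j in genes:
        pairwise_sd = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            # Assumption: efficiency 2, so log2(expr_j / expr_k) = Ct_k - Ct_j.
            log_ratios = [
                ct_k - ct_j
                for ct_j, ct_k in zip(ct_by_gene[gene_j], ct_by_gene[gene_k])
            ]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene_j] = mean(pairwise_sd)
    return m_values


# Hypothetical Ct values for four samples; "DIFF" mimics a differential gene.
example_ct = {
    "REF_A": [20.0, 20.5, 21.0, 20.2],
    "REF_B": [22.0, 22.5, 23.0, 22.2],
    "REF_C": [18.1, 18.5, 19.1, 18.2],
    "DIFF": [25.0, 23.0, 21.0, 26.0],
}
m_values = genorm_m(example_ct)
least_stable = max(m_values, key=m_values.get)
print(least_stable, m_values)
```

In this toy dataset the treatment-responsive gene receives the highest M, which is the behaviour the differential genes were included to check.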

Materials and methods

Collembolan cultures Folsomia candida was kept in plastic containers with a water-saturated plaster of Paris base containing 10% charcoal at 20°C in a 12:12 light-dark regime. The animals were fed dried baker's yeast (Dr. Oetker) ad libitum. For all experiments animals at least 20 days old were used. Orchesella cincta was kept under comparable conditions, but was fed algae (Desmococcus sp.) growing on twigs of pine trees. For all experiments animals of 4–5 weeks old were used, with a maximum age difference of seven days.

Folsomia candida treatments

pH & temperature treatment Animals of 20 days old were exposed in 100 ml jars with 30 grams of OECD artificial soil. OECD soil pH was adjusted with CaCO3 (J.T. Baker) to four different values (3.5, 4.5, 5.5 & 6.5) according to OECD guideline 207 (OECD, 1984). Pools of thirty animals per jar were exposed to the different pH values for three days at 75% humidity and 20°C. For the temperature treatment OECD soil at a pH of 5.5 was used. Pools of thirty animals per jar were exposed to 0, 10, 20 and 30°C for 3 days. For each condition two biological replicates were used (separately) for RNA extraction.

Desiccation treatment Drought exposure was performed as described by Bayley and Holmstrup (1999). Per replicate, a pool of twenty-five to thirty animals was exposed in plastic containers to a relative air humidity controlled at 98.2% by placing a NaCl solution of 31.6 g L-1 inside the closed container. Animals were sacrificed for RNA isolation after 8, 27, 53 and 174 hours of exposure. For each condition two separate RNA extractions were done (biological replicates).

Cadmium and phenanthrene treatment For the cadmium and phenanthrene treatments animals were exposed in 100 ml jars on a compressed layer of 10 g wet weight of LUFA 2.2 soil (for details see Droge et al., 2006). Procedures of the standard ISO protocol 11267 (ISO, 1999) were followed to spike the soils to nominal concentrations equivalent to the 28-day LC50 for cadmium (6.86 mmol kg-1 dry soil; Van Gestel and Koolhaas, 2004) and phenanthrene (422 μmol kg-1 dry soil; Droge et al., 2006). For the cadmium spiking a solution of hydrated CdCl2 (purity 99%; J.T. Baker, The Netherlands) was used, while phenanthrene (purity 96%; Sigma-Aldrich Chemie, Germany) was dissolved in acetone (Riedel-de Haën, Seelze, Germany), followed by overnight evaporation of the acetone. Neither soil underwent a period of aging. Pools of 15 animals per jar were exposed for 48 hr (cadmium) or 96 hr (phenanthrene) to spiked and clean control soils. For each condition four biological replicates were used (separately) for RNA extraction.

O. cincta treatments

Cadmium treatment Cadmium exposure was performed as described in Roelofs et al. (2006). Animals were exposed individually to a nominal concentration of 1 μmole cadmium per gram dry weight food (algal paste). The exposure was started immediately after moulting to exclude hormonal effects on gene expression and lasted for three days. Five individuals were pooled per replicate; a control was used with the same set-up but fed clean algae. For each condition three biological replicates were used (separately) for RNA extraction.

Temperature exposure Springtails were exposed in glass vials containing slightly moistened foam at the bottom and moistened foam stoppers. Five individuals per vial were exposed to three temperature treatments: cold (10°C), control (20°C) and heat (35°C). The cold and control treatments consisted of four hours at 10°C or 20°C, respectively, in a climate room; the heat treatment consisted of one hour in a water bath at 35°C followed by a one-hour recovery period at 20°C. For each condition three biological replicates of five pooled animals were used (separately) for RNA extraction.

Desiccation treatment The desiccation treatment followed the protocol described by Bahrndorff et al. (2007). Springtails were exposed to 97.2% relative humidity in a tightly sealed container containing a NaCl solution of 50.66 g L-1 at 20°C for five days. The control treatment followed the same protocol, but instead of the NaCl solution demineralized water was used. We used five individuals per vial and three vials per treatment. Animals from each vial were pooled for RNA isolation.
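The relation between the NaCl concentrations and the relative humidities quoted for the desiccation treatments can be checked with a short calculation. The sketch below estimates the water activity above a NaCl solution from ln(a_w) = -φ·ν·m·M_w. Several inputs are assumptions that do not come from the text: an osmotic coefficient φ of 0.92 (a typical value for NaCl solutions in this concentration range), full dissociation into two ions, and one litre of solution being treated as one kilogram of water. The function name is hypothetical.

```python
import math

M_NACL = 58.44     # g/mol, molar mass of NaCl
M_WATER = 18.015   # g/mol, molar mass of water
IONS_PER_NACL = 2  # Na+ and Cl-, assuming full dissociation


def relative_humidity_over_nacl(grams_per_litre, osmotic_coefficient=0.92):
    """Estimate equilibrium relative humidity (%) above a NaCl solution.

    Assumption: 1 L of solution is approximated as 1 kg of water, so the
    concentration in g/L is used directly to obtain a molality.
    """
    molality = grams_per_litre / M_NACL  # mol NaCl per kg water (approximate)
    ln_water_activity = (
        -osmotic_coefficient * IONS_PER_NACL * molality * M_WATER / 1000.0
    )
    return 100.0 * math.exp(ln_water_activity)


rh_f_candida = relative_humidity_over_nacl(31.6)   # F. candida treatment
rh_o_cincta = relative_humidity_over_nacl(50.66)   # O. cincta treatment
print(round(rh_f_candida, 1), round(rh_o_cincta, 1))
```

Under these assumptions the estimates fall within about 0.1 percentage point of the 98.2% and 97.2% relative humidity stated for the two treatments; an ideal-solution calculation (φ = 1) would give slightly lower values.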

Starvation treatment We used five individuals per vial and three vials for the starvation treatment and its control. Springtails were transferred to glass vials containing slightly moistened foam at the bottom and moistened foam stoppers. In the control treatment food was made available by adding a piece of bark overgrown with green algae to the animals, while the animals in the starvation treatment did not have access to food. The experimental vials were kept at 20°C in a 12:12 light dark regime for 8 days. Animals from each vial were pooled for RNA isolation.

RNA isolation and reverse transcription After exposure animals were snap-frozen in liquid nitrogen and total RNA was isolated with the SV Total RNA isolation system (Promega) according to the manufacturer's instructions. Genomic DNA was removed via a DNAse treatment supplied with the kit. RNA integrity was confirmed on a 1% agarose gel and RNA quantities were assessed with a Nanodrop ND-1000 spectrophotometer (Nanodrop Technologies) and ranged between 30 and 100 ng μL-1 of total RNA. As indicated by 260/280 and 260/230 nm ratios, all samples used in this study were assumed free from protein contamination and (organic) salts. Absence of amplicons after PCR with Taq-polymerase (MRC Holland, The Netherlands) and 1 μL RNA solution confirmed that no trace DNA contamination was present in the samples used in the further analyses. Synthesis of cDNA was performed using the reverse-transcriptase system of Promega with the M-MLV reverse transcriptase enzyme and an oligo-dT primer (F. candida samples) or random hexamer primers (O. cincta). Random hexamer primers were used in the case of O. cincta because 28S ribosomal RNA was used as one of the potential reference genes. In the case of the O. cincta samples, reverse transcription input amounts were equalized by diluting the total RNA concentrations to 0.5 μg μL-1, and samples were reverse transcribed together in a single run. cDNA samples were diluted four times before QPCR was carried out. All treatment conditions were reverse transcribed in triplicate, except for the pH treated samples which were performed in duplicate.

QPCR Besides reference genes, one or two differentially expressed genes were included in order to observe transcriptional effects of the treatment. These differentials were previously assessed for their response to the majority of the treatments (Bahrndorff et al., 2009, Roelofs et al., 2006, de Boer et al., 2010, Timmermans et al., 2009). For the desiccation and starvation treatments in O. cincta, lack of prior knowledge of characteristic responsive genes led us to use a fixed set of two differentials for all four treatments. QPCR assays for seven candidate reference genes (Table 1) and two genes differential for a range of stressful conditions, HSP70 and MT (Sorensen et al., 2003, Coyle et al., 2002), were developed. For F. candida, eight candidate reference genes were analyzed (Table 1) together with one differential gene for each treatment. Primers were based on sequences present in the Collembolan EST database Collembase (http://www.collembase.org) and generated with Primer Express 1.5 (Applied Biosystems) with the following settings: melting temperatures were kept between 59°C and 60°C and the amplified fragment length was kept between 90 and 120 base pairs with an optimum amplicon melting temperature of 80°C. For primer sequences, GC content and melting temperatures see Additional file 4. Details on the positions of the QPCR amplicons in the full coding sequences are given in Additional file 5. Reaction efficiency of each of the QPCR assays was determined by means of a standard curve consisting of 5 samples, each fourfold diluted, from an initial cDNA pool. Each reaction was carried out in a total volume of 20 μL, using 2 μL cDNA template, 10 μL SYBR Green I master mix (Applied Biosystems) and 20 pmol of each gene specific primer (supplied by Isogen). QPCR cycling was performed on a DNA engine Opticon1 (Biorad), with three technical replicates per sample. Cycling conditions were kept constant for all assays. For details on PCR mix and program see Roelofs et al. (2006). Specificity of the PCR products was confirmed by analysis of the melting curve; 60–90°C with a heating rate of 0.1°C per second and one fluorescence measurement per second. Each run included a non-template control for each assay. For two of the O. cincta assays (GAPDH and SDHA) no sequence information was available. Therefore degenerate primers were developed based on sequences from multiple organisms that were taken from Genbank. Generated PCR products were subsequently cloned, sequenced and used as a template for the development of QPCR primers as described above. Proper QPCR data were not retrieved from one F. candida pH treatment replicate (pH 3.5). Therefore this sample was discarded from further analysis.
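The standard-curve step described above can be illustrated with a small calculation. The sketch below fits a straight line through Ct against log10 of the relative template amount for a fourfold dilution series and converts the slope to an amplification efficiency with the commonly used relation E = 10^(-1/slope). The text does not state which software or formula was used for this step, so the relation, the function name and the Ct values are all assumptions for illustration.

```python
import math


def amplification_efficiency(ct_values, dilution_factor=4.0):
    """Estimate amplification efficiency from a serial dilution series.

    ct_values: Ct of each standard, ordered from the undiluted pool to the
    most diluted sample, each step diluted by dilution_factor.
    Returns E, where E = 2.0 means perfect doubling per cycle.
    """
    n = len(ct_values)
    # log10 of relative template amount: 0, -log10(4), -2*log10(4), ...
    log_amount = [-i * math.log10(dilution_factor) for i in range(n)]
    x_mean = sum(log_amount) / n
    y_mean = sum(ct_values) / n
    # Ordinary least-squares slope of Ct versus log10(amount).
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(log_amount, ct_values))
    s_xx = sum((x - x_mean) ** 2 for x in log_amount)
    slope = s_xy / s_xx
    return 10 ** (-1.0 / slope)


# Hypothetical standard curve of 5 samples: with perfect doubling,
# every fourfold dilution delays the Ct by exactly two cycles.
standard_ct = [20.0, 22.0, 24.0, 26.0, 28.0]
efficiency = amplification_efficiency(standard_ct)
print(efficiency)
```

For the idealised series in the example the estimate is 2 to within floating-point error; real assays usually give somewhat lower values.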

Data analysis Ct values were calculated with the Opticon Monitor 3 software (Biorad), using a manually set cycle threshold at a level of 0.01 raw fluorescent units, which in all assayed plates fell within the exponential phases of the QPCR reactions. Averages of the three technical replicates were used when their standard deviation was lower than 0.5 Ct. When the standard deviation exceeded this number, fluorescence curves were evaluated. The analysis was always based on at least two replicates. Ranking of reference genes was determined using the geNorm and Normfinder applications, as implemented in the GenEx Light software package (MultiD_Analyses_AB, 2008). The optimum numbers of reference genes are based on a Vn/Vn+1 value of > 0.15 for geNorm (Vandesompele et al., 2002) and the minimum of accumulated standard deviations for Normfinder (Bergkvist et al., 2008). The original geNorm VBA applet for Excel was used for automated calculation of Vn/Vn+1. Relative gene expressions of all differential genes were calculated with the Pfaffl method and normalized with both of the optimum sets of reference genes proposed by geNorm and Normfinder. Significance levels were tested by Student's t-test, for both algorithms. To be conservative we report only the larger P-value in the results. The temperature exposures of F. candida (Figure 2) were normalized as described in the results section, and tested for their significance by one-way ANOVA with a Bonferroni corrected post-hoc test between the different temperatures for each normalization method.
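As an illustration of the normalization step, the sketch below computes an efficiency-corrected expression ratio in the style of Pfaffl (2001), where the ratio equals E_target^ΔCt_target divided by E_ref^ΔCt_ref and ΔCt is the control Ct minus the treated Ct. Extending the denominator to a geometric mean over several reference genes is an assumption here, made in the spirit of the geometric averaging of Vandesompele et al. (2002); the text does not specify how the software combined the reference genes. The function name and numbers are hypothetical.

```python
from math import prod


def pfaffl_ratio(target_efficiency, target_delta_ct, references):
    """Efficiency-corrected relative expression of a target gene.

    target_delta_ct: Ct(control) - Ct(treated) for the target gene.
    references: list of (efficiency, delta_ct) tuples, one per reference
    gene, with delta_ct defined in the same way as for the target.
    """
    target_term = target_efficiency ** target_delta_ct
    reference_terms = [eff ** delta_ct for eff, delta_ct in references]
    # Assumption: reference genes are combined by their geometric mean.
    reference_geomean = prod(reference_terms) ** (1.0 / len(reference_terms))
    return target_term / reference_geomean


# Hypothetical example: the target Ct drops by 3 cycles after treatment,
# while two reference genes shift by +0.5 and -0.5 cycles.
ratio_two_refs = pfaffl_ratio(2.0, 3.0, [(2.0, 0.5), (2.0, -0.5)])

# With a single reference gene the function reduces to the original formula.
ratio_one_ref = pfaffl_ratio(2.0, 3.0, [(2.0, 1.0)])
print(ratio_two_refs, ratio_one_ref)
```

In the first call the opposite shifts of the two reference genes cancel in the geometric mean, which shows why a well-chosen set of references can be more robust than any single gene.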

Bias threshold definition In Normfinder the total stability ('variability') of a gene is defined by the magnitude of the intragroup variation relative to the intergroup variation (Andersen et al., 2004). Differential genes, which respond to one or more treatments, will greatly increase the intergroup variation and hence have a disproportionately large effect on the calculated variability and the ranking order of the other candidate genes. To avoid such biases, differential genes should be excluded from the Normfinder analysis using a pre-selection procedure. The pre-selection procedure consisted of an initial Normfinder analysis with all genes, from which we calculated a bias threshold for the amount of total absolute bias a gene was allowed to have, compared to the total mean absolute bias of the group. Assuming a normal distribution of biases and genes, we considered the 10% most stable genes to be suitable as potential reference genes, which sets the bias threshold between -0.13*SD and 0.13*SD of the normal distribution (MultiD_Analyses_AB, 2008). In the final Normfinder analysis only those genes that met the criterion of being among the 10% most stable genes were included. Ranking of the remaining genes was subsequently done by determining the standard deviation in a Normfinder analysis with the software settings set not to take groups into account (M. Kubista, pers. comm.).
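The factor 0.13 in the bias threshold follows directly from the normal distribution: the central 10% of a standard normal distribution lies between the 45th and 55th percentiles. A minimal check using only the Python standard library is sketched below; the function name is hypothetical.

```python
from statistics import NormalDist


def central_interval_halfwidth(fraction):
    """Return z such that P(-z < Z < z) = fraction for a standard normal Z."""
    return NormalDist().inv_cdf(0.5 + fraction / 2.0)


# The 10% of genes closest to zero bias lie within +/- z standard deviations.
z = central_interval_halfwidth(0.10)
print(round(z, 2))  # 0.13
```

The same function gives the threshold for any other choice of the most stable fraction, should a less strict pre-selection be preferred.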

Acknowledgements The authors of this paper would like to thank Mikael Kubista for help with the Genex software and the data analysis and Martin Holmstrup for providing the samples for the F. candida desiccation treatment. This work was supported by grants from the Netherlands Genomics Initiative and the BSIK; “Assessing the Living Soil”. JE is supported by the Netherlands Organization for Scientific Research, VIDI-grant no. 864.03.003.

Supportive information Supportive tables and figures mentioned in the main text can be found online as additional files at http://www.biomedcentral.com/1471-2199/10/54

References

Andersen, C. L., Jensen, J. L. & Orntoft, T. F. 2004. Normalization of Real-Time Quantitative Reverse Transcription-PCR Data: A Model-Based Variance Estimation Approach to Identify Genes Suited for Normalization, Applied to Bladder and Colon Cancer Data Sets. Cancer Research, 64, 5245-5250.
Applied Biosystems 2001. User Bulletin #2, ABI PRISM 7700 Sequence Detection System.
Bahrndorff, S., Holmstrup, M., Petersen, H. & Loeschcke, V. 2006. Geographic variation for climatic stress resistance traits in the springtail Orchesella cincta. Journal of Insect Physiology, 52, 951-959.
Bahrndorff, S., Loeschcke, V., Marien, J. & Ellers, J. 2009. Dynamics of heat-induced thermal stress resistance and hsp70 expression in the springtail, Orchesella cincta. Functional Ecology, 23, 233-239.
Bahrndorff, S., Petersen, S., Loeschcke, V., Overgaard, J. & Holmstrup, M. 2007. Differences in cold and drought tolerance of high arctic and sub-arctic populations of Megaphorura arctica Tullberg 1876 (Onychiuridae: Collembola). Cryobiology, 55, 315-323.
Bas, A., Forsberg, G., Hammarstrom, S. & Hammarstrom, M. L. 2004. Utility of the Housekeeping Genes 18S rRNA, Actin and Glyceraldehyde-3-Phosphate-Dehydrogenase for Normalization in Real-Time Quantitative Reverse Transcriptase-Polymerase Chain Reaction Analysis of Gene Expression in Human T Lymphocytes. Scandinavian Journal of Immunology, 59, 566-573.
Bayley, M. & Holmstrup, M. 1999. Water vapor absorption in arthropods by accumulation of myoinositol and glucose. Science, 285, 1909-1911.
Bergkvist, A., Forootan, A., Zoric, N., Strombom, L., Sjoback, R. & Kubista, M. 2008. Choosing a Normalization Strategy for RT-PCR. Genetic Engineering & Biotechnology, 28.
Bustin, S. A. 2002. Quantification of mRNA using real-time reverse transcription PCR (RT-PCR): trends and problems. Journal of Molecular Endocrinology, 29, 23-39.
Bustin, S. A., Benes, V., Nolan, T. & Pfaffl, M. W. 2005. Quantitative real-time RT-PCR - a perspective. Journal of Molecular Endocrinology, 34, 597-601.
Coyle, P., Philcox, J., Carey, L. & Rofe, A. 2002. Metallothionein: The multipurpose protein. Cellular and Molecular Life Sciences, 59, 627-647.
D'Haese, C. A. 2002. Were the first springtails semi-aquatic? A phylogenetic approach by means of 28S rDNA and optimization alignment. Proceedings of the Royal Society of London Series B: Biological Sciences, 269, 1143-1151.
De Boer, T. E., Holmstrup, M., Van Straalen, N. M. & Roelofs, D. 2010. The effect of soil pH and temperature on Folsomia candida transcriptional regulation. Journal of Insect Physiology, 56, 350-355.
Droge, S., Paumen, M., Bleeker, E., Kraak, M. & Van Gestel, C. A. M. 2006. Chronic toxicity of polycyclic aromatic compounds to the springtail Folsomia candida and the enchytraeid Enchytraeus crypticus. Environmental Toxicology and Chemistry, 25, 2423-2431.
Ellers, J., Marien, J., Driessen, G. & Van Straalen, N. M. 2008. Temperature-induced gene expression associated with different thermal reaction norms for growth rate. Journal of Experimental Zoology Part B-Molecular and Developmental Evolution, 310B, 137-147.
Fountain, M. T. & Hopkin, S. P. 2005. Folsomia candida (Collembola): A "Standard" Soil Arthropod. Annual Review of Entomology, 50, 201-222.
Heckmann, L.-H., Connon, R., Hutchinson, T., Maund, S., Sibly, R. & Callaghan, A. 2006. Expression of target and reference genes in Daphnia magna exposed to ibuprofen. BMC Genomics, 7, 175.
Huggett, J., Dheda, K., Bustin, S. & Zumla, A. 2005. Real-time RT-PCR normalisation; strategies and considerations. Genes and Immunology, 6, 279-284.
Infante, C., Matsuoka, M. P., Asensio, E., Canavate, J. P., Reith, M. & Manchado, M. 2008. Selection of housekeeping genes for gene expression studies in larvae from flatfish using real-time PCR. BMC Molecular Biology, 9.
ISO 1999. Soil Quality. Inhibition of Reproduction of Collembola (Folsomia candida). ISO Guideline 11267. International Standardization Organization, Switzerland.
Liefting, M. & Ellers, J. 2008. Habitat-specific differences in thermal plasticity in natural populations of a soil arthropod. Biological Journal of the Linnean Society, 94, 265-271.
Morrison, T., Hurley, J., Garcia, J., Yoder, K., Katz, A., Roberts, D., Cho, J., Kanigan, T., Ilyin, S., Horowitz, D., Dixon, J. & Brenan, C. 2006. Nanoliter high throughput quantitative PCR. Nucleic Acids Research, 34.
Morrison, T., Weis, J. & Wittwer, C. 1998. Quantification of Low-Copy Transcripts by Continuous SYBR Green I Monitoring during Amplification. BioTechniques, 24, 954-962.
Muller, W. E. G., Grebenjuk, V. A., Thakur, N. L., Thakur, A. N., Batel, R., Krasko, A., Muller, I. M. & Breter, H. J. 2004. Oxygen-controlled bacterial growth in the sponge Suberites domuncula: toward a molecular understanding of the symbiotic relationships between sponge and bacteria. Applied and Environmental Microbiology, 70, 2332-2341.
MultiD_Analyses_AB 2008. GenEx Light Software, version 4.3.5. MultiD Analyses AB.
Nota, B., Timmermans, M., Franken, O., Montagne-Wajer, K., Marien, J., De Boer, M. E., De Boer, T. E., Ylstra, B., Van Straalen, N. M. & Roelofs, D. 2008. Gene Expression Analysis of Collembola in Cadmium Containing Soil. Environmental Science & Technology, 42, 8152-8157.
OECD 1984. Test no 207: Earthworm, acute toxicity tests. OECD Guidelines for the testing of chemicals.
Pfaffl, M. W. 2001. A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Research, 29, 2002-2007.
Roelofs, D., Marien, J. & Van Straalen, N. M. 2007. Differential gene expression profiles associated with heavy metal tolerance in the soil insect Orchesella cincta. Insect Biochemistry and Molecular Biology, 37, 287-295.
Roelofs, D., Overhein, L., De Boer, M. E., Janssens, T. K. S. & Van Straalen, N. M. 2006. Additive genetic variation of transcriptional regulation: metallothionein expression in the soil insect Orchesella cincta. Heredity, 96, 85-92.
Scharf, M. E., Wu-Scharf, D., Pittendrigh, B. R. & Bennett, G. W. 2003. Caste- and development-associated gene expression in a lower termite. Genome Biology, 4.
Scharlaken, B., De Graaf, D. C., Goossens, K., Brunain, M., Peelman, L. J. & Jacobs, F. J. 2008. Reference gene selection for insect expression studies using quantitative real-time PCR: The head of the honeybee, Apis mellifera, after a bacterial challenge. Journal of Insect Science, 8.
Sinclair, B. J., Gibbs, A. G. & Roberts, S. P. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Molecular Biology, 16, 435-443.
Snell, T. W., Brogdon, S. E. & Morgan, M. B. 2003. Gene Expression Profiling in Ecotoxicology. Ecotoxicology, 12, 475-483.
Sorensen, J., Kristensen, T. & Loeschcke, V. 2003. The evolutionary and ecological role of heat shock proteins. Ecology Letters, 6, 1025-1037.
Spinsanti, G., Panti, C., Lazzeri, E., Marsili, L., Casini, S., Frati, F. & Fossi, C. 2006. Selection of reference genes for quantitative RT-PCR studies in striped dolphin (Stenella coeruleoalba) skin biopsies. BMC Molecular Biology, 7, 32.
Stahlberg, A., Kubista, M. & Pfaffl, M. 2004. Comparison of Reverse Transcriptases in Gene Expression Analysis. Clinical Chemistry, 50, 1678-1680.
Stürzenbaum, S. & Kille, P. 2001. Control genes in quantitative molecular biological techniques: the variability of invariance. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 130, 281-289.
Timmermans, M., De Boer, M., Nota, B., De Boer, T., Marien, J., Klein-Lankhorst, R., Van Straalen, N. & Roelofs, D. 2007a. Collembase: a repository for springtail genomics and soil quality assessment. BMC Genomics, 8, 341.
Timmermans, M., Ellers, J. & Van Straalen, N. M. 2007b. Allelic diversity of metallothionein in Orchesella cincta (L.): traces of natural selection by environmental pollution. Heredity, 98, 311-319.
Timmermans, M. J. T. N., Roelofs, D., Nota, B., Ylstra, B. & Holmstrup, M. 2009. Sugar sweet springtails: on the transcriptional response of Folsomia candida (Collembola) to desiccation stress. Insect Molecular Biology, 18, 737-46.
Van Gestel, C. & Koolhaas, J. 2004. Water-extractability, free ion activity, and pH explain cadmium sorption and toxicity to Folsomia candida (Collembola) in seven soil-pH combinations. Environmental Toxicology and Chemistry, 23, 1822-1833.
Vandesompele, J., De Preter, K., Pattyn, F., Poppe, B., Van Roy, N., De Paepe, A. & Speleman, F. 2002. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biology, 3, research0034.1-research0034.11.
Zhong, H. & Simons, J. W. 1999. Direct Comparison of GAPDH, [beta]-Actin, Cyclophilin, and 28S rRNA as Internal Standards for Quantifying RNA Levels under Hypoxia. Biochemical and Biophysical Research Communications, 259, 523-526.
Zientz, E., Beyaert, N., Gross, R. & Feldhaar, H. 2006. Relevance of the endosymbiosis of Blochmannia floridanus and carpenter ants at different stages of the life cycle of the host. Applied and Environmental Microbiology, 72, 6027-6033.

Chapter 3

The effect of soil pH and temperature on Folsomia candida transcriptional regulation

Tjalf E. de Boer, Martin Holmstrup, Nico M. van Straalen and Dick Roelofs

Journal of Insect Physiology (2010) 56(4); p350-355

Abstract

Differences in abiotic factors like temperature and soil pH can have a significant physiological impact on soil dwelling invertebrates and may confound results in ecotoxicological testing. In this study we exposed Folsomia candida to two abiotic stress treatments (pH and temperature) for 3 days and measured gene expression of a panel of nine stress-response genes with real-time Q-PCR. The exposure to different pH values had a minimal effect on the expression of the nine selected genes: only V-ATPase expression increased significantly with decreasing pH, possibly due to increased proton trafficking across the cell membrane at lower pH. HSP70 was up-regulated in collembolans exposed to 30°C, and along with HSP40 at 0°C. We speculate that the minor pH effect on gene expression, compared to the temperature treatment, can be explained by the spatially restricted exposure to the external pH in the gut. Our data showed that only one or two stress response genes were transcriptionally affected by pH and temperature, so these treatments exerted only minimal effects. The physiological effects of these treatments on F. candida might indicate interesting novel molecular mechanisms.

Introduction

Organisms are confronted with changes in their local environment throughout their lives, including abiotic factors, which in their extremes may lead to physiological stress in the organism. Two major types of abiotic stressors can be distinguished: physical and chemical stressors. Stress induced by physical abiotic factors can originate from changes in, e.g., temperature or moisture availability, while chemical abiotic stress may be caused by chemical pollution, e.g., heavy metals. Soil pH can exert stress on animals directly and/or indirectly. Changes in soil pH are a relevant environmental stress factor, as evident in the acidification of forest soils, often caused by anthropogenic activities (Tamm and Hallbäcken, 1988). The bioavailability and toxicity of heavy metals in soil is, in part, dependent on pH. In general, heavy metals become more available to organisms via the soil pore water when the pH decreases (van Straalen and Verhoef, 1997, Sijm et al., 2000). Van Straalen and Verhoef (1997) developed a soil pH indicator system based on soil pH preferences of collembolans, mites and woodlice. They found that most of the test organisms had a broad pH tolerance (between pH 2 and 9).

Another important abiotic factor is temperature. Most literature on the physiological effects of differences in temperature on insects and collembolans is either on cold tolerance, heat shock or thermal phenotypic plasticity (Bahrndorff et al., 2007). The response to high temperatures, heat shock, is a well understood stress response mechanism. At higher than optimal temperatures protein structure may become unstable and newly synthesised proteins may be impaired in their folding. Molecular chaperone proteins, like the large family of Heat Shock Proteins (HSP), assist proteins in establishing their native structure. The genes encoding these proteins are highly conserved and universal throughout life. Cold shock or exposure to sub-optimal temperature exerts a different kind of stress on animals. At temperatures much below 0°C, cellular liquids can freeze, causing cellular injuries. Cold tolerance in so-called freeze-avoiding arthropods is based on supercooling, a mechanism that prevents freezing of body fluids at temperatures below the melting point of these fluids. This effect is largely achieved by the removal of gut contents that may seed ice formation and the synthesis and accumulation of cryoprotectants such as glycerol or sorbitol (Zachariassen, 1985). HSPs are not only involved in the heat shock response but also in the recovery from cold shock in insects. Koštál & Tollarová-Borovanská (2009) investigated HSP70 expression in Pyrrhocoris apterus and found it was up-regulated in insects recovering from a 5 day treatment at -5°C.
Animals injected with anti-HSP70 interfering RNA (RNAi) exhibited little up-regulation in HSP70 RNA levels after cold shock, compared to mock-injected controls. RNAi injected insects did not show a different phenotype directly after the cold shock, but it became apparent after a few days that these insects suffered from unrepaired chilling injuries. In contrast, these injuries were repaired in the control group (Koštál and Tollarová-Borovanská, 2009).

Folsomia candida (Collembola, Isotomidae) is a small soil dwelling collembolan (~3 mm body length). An ISO (International Organization for Standardization) and OECD accepted soil toxicity test (ISO, 1999) uses F. candida with survival and fecundity as toxicological endpoints of chemicals or soil pollution. This species has also been used in other physiological studies, including water vapour absorption by use of sugars and polyols under drought conditions (Bayley and Holmstrup, 1999, Timmermans et al., 2009). The transcriptome of F. candida has been partly sequenced, enabling genetic or molecular studies (Timmermans et al., 2007). Slotsbo et al. (2009) and Holmstrup et al. found that when F. candida was exposed to sublethal concentrations of mercury its tolerance to cold and heat was reduced. An example where abiotic factors do not only influence the stressor mechanism but also cause a specific effect was given by Crommentuijn et al. (1997), who investigated the influence of pH and organic matter content of soil on the toxicity of cadmium to F. candida. The pore water concentration of cadmium increased when the pH was decreased, but reproduction decreased when the soil pH was increased above pH 7.3.

With the emergence of genomic techniques in the fields of ecology and ecotoxicology new possibilities for effect measurements arise (Kammenga et al., 2007). Techniques such as real-time Q-PCR and microarray analysis can enhance effect measurement sensitivity. However, there is also the risk that other factors may confound gene expression results. We studied the effects of pH and temperature on gene expression in F. candida using a panel of nine stress related genes with real-time quantitative PCR (Q-PCR). In soil quality measurements pH and temperature could be considered confounding as they can influence the effects that pollutants exert on animal fitness; however, the effects these treatments exert on gene expression of this invertebrate may bring about new mechanistic insights into the physiological response at the molecular level. The aim of this study is to investigate the effect pH and temperature have, as confounding factors, on gene expression in a soil quality assay and, possibly, to elucidate the biological effect these factors have on F. candida. Our hypothesis is that differences in pH will not show a severe effect on F. candida gene expression as this collembolan has a broad pH tolerance. Also, not the whole animal will be exposed to the changes in pH since it can regulate the pH of its hemolymph. The temperature treatment will possibly cause a greater effect as the whole animal is exposed to the differences in temperature.

Materials & methods

Animal cultures F. candida (Berlin strain, VU Amsterdam) stock populations were cultured in PVC containers with a plaster of Paris base, containing 10% charcoal, in a climate room at 20°C, 75% relative humidity and a 12/12 hours light-dark regime. The animals were fed dried baker’s yeast (Dr. Oetker) ad libitum. Animals used in experiments were synchronized for age prior to exposure according to Fountain & Hopkin (2005). Briefly, adult animals were transferred to fresh culture vessels for 2 days where they were allowed to lay eggs. F. candida of 22 days old were used for the experimental treatments. For each exposure 30 collembolans were used per biological replicate. After each exposure the animals were extracted from the soil by floatation and left to dry on a plaster of Paris base. They were then snap frozen in liquid nitrogen for RNA extraction. The time between soil extraction and freezing was kept to a minimum.

pH treatment An artificial soil, OECD soil (70% sand, 20% kaolinite clay, 10% peat), was mixed 24 hours prior to the exposure to equilibrate. Moisture content was set to 50% of the soil water holding capacity (WHC) and the pH was adjusted by the addition of CaCO3 (Sigma Aldrich): pH 3.5: 0.42 g/kg dry soil, pH 4.5: 1.43 g/kg, pH 5.5: 2.79 g/kg and pH 6.5: 5.62 g/kg soil. Twenty-two day old F. candida were randomly selected from the synchronized stock for exposure. Three biological replicates containing 25 g wet soil and 30 collembolans were exposed for 3 days. An identical experimental setup was applied in a time course experiment of 12, 24, 48, 72 and 120 hours of exposure.


Temperature treatment

OECD soil with a pH of 5.5 was used in the temperature experiment. Soil was moistened to 50% of its WHC. Thirty F. candida were exposed to 0, 10, 20 and 30 °C for 3 days (3 replicates per treatment). No thermal acclimation was applied prior to exposure. Humidity levels were kept constant to prevent drought effects.

RNA isolation and Q-PCR

Total RNA was isolated for each biological replicate separately using the SV total RNA kit from Promega, according to the manufacturer's instructions. RNA concentration and integrity were checked on a Nanodrop spectrophotometer (Thermo Fisher Scientific) and by 1% agarose gel electrophoresis, respectively. Ten μl of total RNA (approximately 60 ng/μl RNA) was reverse transcribed into cDNA using 200 units of MMLV reverse transcriptase (Promega) and 0.5 μg of oligo-dT primer. cDNA was diluted four-fold prior to Q-PCR. Q-PCR primers (Isogen Biosciences) were developed from the F. candida EST database Collembase (www.collembase.org) (Timmermans et al., 2007) using Primer Express 1.5 software (Applied Biosystems). Q-PCR was performed using PowerSYBR Master Mix (Applied Biosystems) on an Opticon 1 thermocycler (Biorad) using a 3-step Q-PCR program (95 °C for 15 s, 60 °C for 30 s, 72 °C for 30 s, 40 cycles) with melting curve analysis. Three technical replicates were used per sample. Beta-Actin was used as a reference gene (de Boer et al., 2009).

Data analysis & statistics

Q-PCR threshold cycles (Ct) were measured using Opticon Monitor 3 software (Biorad) with baseline subtraction and a manual threshold setting of 0.005. Relative normalized expression values compared to the reference gene were calculated using Q-Gene software (Muller et al., 2002) according to the manual, including primer pair efficiencies. Primer pair efficiency was determined using a standard dilution curve with a mixed cDNA standard. Relative expression values were log transformed for statistical analysis to obtain a normal distribution. Statistical analysis was performed using the software package SPSS14 (IBM). Pearson correlations were performed for the pH treatment and one-way ANOVA tests for the temperature treatment. Data used for the ANOVA tests were tested for normal distribution.
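The normalization step can be illustrated with a minimal sketch. It uses a commonly applied efficiency-corrected formula, in which the normalized expression equals E_ref^Ct_ref divided by E_target^Ct_target, with Ct values averaged over technical replicates. We assume this corresponds to what Q-Gene computes; the exact averaging procedure used here is not stated in the text. The Ct values, the efficiencies and the choice of log10 are hypothetical.

```python
import math
from statistics import mean


def normalized_expression(ct_target, ct_ref, eff_target=2.0, eff_ref=2.0):
    """Efficiency-corrected expression of a target gene relative to a reference gene.

    ct_target, ct_ref: Ct values of the technical replicates of one sample.
    eff_target, eff_ref: amplification factor per cycle (2.0 = perfect doubling).
    """
    return eff_ref ** mean(ct_ref) / eff_target ** mean(ct_target)


# Hypothetical Ct values for one biological replicate (three technical replicates).
hsp70_ct = [24.1, 24.0, 23.9]
actb_ct = [18.0, 18.1, 17.9]

# Hypothetical primer pair efficiencies taken from a standard dilution curve.
ne = normalized_expression(hsp70_ct, actb_ct, eff_target=1.95, eff_ref=1.98)

# Log transformation before statistical testing (base 10 is an assumption).
log_ne = math.log10(ne)
```

With perfect efficiencies and a Ct difference of two cycles, the function returns 0.25, which is the familiar 2^(-ΔCt) result.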


Results

pH experiment

In order to assess the effects of abiotic stressors on F. candida we measured gene expression for nine stress-related genes (Table 1). The genes were selected to cover a wide variety of effects that different stressors can exert on an animal. For example, heat shock proteins (HSP40 and HSP70) were included for general stress, while the DNA mismatch repair protein MSH2 was selected for specific genotoxic effects.

Table 1: Genes investigated in Folsomia candida. ACTb was used for Q-PCR data normalization. Collembase IDs are F. candida EST sequencing clusters. For more information see www.collembase.org.

| Gene name | Symbol | GenBank accession | Collembase ID | Gene function |
|---|---|---|---|---|
| Beta Actin | ACTb | EV473840 | Fcc01756 | Structural constituent of cytoskeleton |
| Nuclear Factor kappa B | NF-kB | EV480501 | Fcc05118 | Stress response transcription factor |
| DNA mismatch repair protein 2 | MSH2 | EV478898 | Fcc02154 | DNA mismatch repair |
| Heat Shock Protein 40 | HSP40 | EV477249 | Fcc03002 | Molecular chaperone |
| Heat Shock Protein 70 | HSP70 | EV481509 | Fcc01609 | Molecular chaperone |
| Transforming Growth Factor B | TGFb | EV481120 | Fcc04724 | Cell proliferation/apoptosis |
| MapK Kinase | MapKK | EV481273 | Fcc04304 | Cell signalling |
| Vacuolar ATPase | ATPase | EV476428 | Fcc04630 | H+ transport across plasma membrane |
| Aquaporin | Aqua | EV475341 | Fcc01971 | Active water transport |
| Acetylcholine Receptor | ACR | EV473445 | Fcc05034 | Neural signalling |

In the pH experiment, F. candida was exposed to four pH values for 3 days. Gene expression was measured with Q-PCR and the expression of four genes (ATPase, HSP70, NF-kB and ACR) was significantly correlated with pH (Pearson correlation, P < 0.05) (Table 2). One-way ANOVA with Tukey post-hoc test was performed on these four genes to investigate which pH treatments differed significantly from each other.
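As an illustration of the correlation analysis, the following sketch computes a Pearson correlation coefficient between soil pH and log-transformed expression, together with the t statistic that is compared against a t distribution with n - 2 degrees of freedom. The analysis in this study was run in SPSS; the code below is a plain re-implementation of the textbook formulas, and the expression values are invented for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero (n - 2 degrees of freedom, |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Four pH levels with three biological replicates each, as in the design above.
ph = [3.5] * 3 + [4.5] * 3 + [5.5] * 3 + [6.5] * 3

# Hypothetical log-transformed relative expression values (NOT the thesis data).
log_expr = [-0.30, -0.25, -0.35, -0.40, -0.45, -0.38,
            -0.60, -0.55, -0.65, -0.62, -0.70, -0.58]

r = pearson_r(ph, log_expr)
t = t_statistic(r, len(ph))
```

A negative r here would correspond to higher expression at lower pH, the direction reported for ATPase in Table 2.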


In the ANOVA only ATPase and NF-kB showed a significant result. The post-hoc test, however, only indicated significant differences for the ATPase gene, between pH 3.5 and 5.5 and between pH 3.5 and 6.5. The expression of this gene was higher at the lower pH values (Fig. 1). To confirm that ATPase expression was influenced by differences in pH we repeated the experiment and measured ATPase expression again. The second experiment confirmed higher ATPase expression at the lower pH values (Fig. 1).

Figure 1: Relative expression levels of the ATPase gene in Folsomia candida exposed to different soil pH levels. Light grey bars denote the original experiment and dark grey bars the repeated treatment. ATPase gene expression is up-regulated at the lower pH values (3.5 and 4.5) compared to the higher ones (5.5 and 6.5).

To investigate temporal effects on pH-mediated gene expression in our exposure window we conducted a time course experiment with identical pH conditions and exposure times of 12, 24, 48, 72 and 120 hours. ATPase expression was only significantly different after 72 hours of exposure. HSP70 expression seemed to be up-regulated at the shorter and the longest exposure times (12, 24 and 120 hours, compared to 48 and 72 hours); however, the variation, especially at the shorter exposure times, was also increased. HSP70 is a protein involved in general stress and its elevated expression was probably caused by handling stress of the animals and adaptation to a new environment instead of the pH treatment. The increased variation was probably caused by heterogeneity in exposure (Fig. 2).


[Figure: HSP70 time course experiment. x-axis: exposure time (12, 24, 48, 72 and 120 h); y-axis: relative HSP70 expression (0 to 0.8); series: pH 3.5, 4.5, 5.5 and 6.5.]

Figure 2: Results of the experiment in which Folsomia candida was exposed to different soil pH levels for different exposure times. The different pH values per time point are on the x-axis, while the relative HSP70 expression is on the y-axis. HSP70 expression is higher at the shorter exposure times, possibly due to handling stress.

Temperature experiment

The effect of exposure to different temperatures on the expression of our panel of stress-related genes was also investigated. According to Fountain & Hopkin (2005) the optimum temperature for egg laying in Folsomia candida is 21 °C, while the organism is often exposed to lower temperatures in its natural environment. We exposed the animals to four different temperatures for 3 days. Only two genes, NF-kB (P = 0.002) and TGFb (P = 0.004), were significantly correlated with temperature (Pearson correlation). However, the expression differences between the temperatures showed that gene expression patterns followed an optimum curve instead of a linear response. HSP70 expression, for example, was up-regulated in animals exposed to 30 °C, as expected, but also at 0 °C (1.8 times higher expression compared to 20 °C). Therefore, correlation analysis is insufficient to explain the observed data. ANOVA, however, showed a significant difference in HSP70 expression between the different temperatures (P < 0.001). Other differentially expressed genes were Aquaporin, NF-kB and ACR, although their differences in gene expression were far less dramatic than those observed for HSP70 (see Table 2).
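Why a correlation test can miss an optimum-curve response while ANOVA detects it can be shown with a small sketch. The expression values below are invented and deliberately symmetric around the middle of the temperature range; they are not the measurements of this study. Both statistics are implemented from the textbook formulas rather than taken from SPSS.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical U-shaped response: high expression at 0 and 30 degrees C.
temperatures = [0, 10, 20, 30]
expression = [[0.9, 1.0, 1.1], [0.4, 0.5, 0.6], [0.4, 0.5, 0.6], [0.9, 1.0, 1.1]]

x = [t for t, g in zip(temperatures, expression) for _ in g]
y = [v for g in expression for v in g]

r = pearson_r(x, y)                  # essentially zero: no linear trend
f_value = one_way_anova_f(expression)  # large: group means clearly differ
```

In this constructed case the linear correlation vanishes although the group means differ strongly, which is the situation described above for HSP70.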


Table 2: Overview of the Pearson correlation scores and one-way ANOVA F-values from the statistical tests on the pH and temperature treatments.

| Treatment | Gene | Pearson correlation score | One-way ANOVA F score, Exp 1 | One-way ANOVA F score, Exp 2 |
|---|---|---|---|---|
| pH | ATPase | -0.786 (P = 0.002) | 6.26 (P = 0.017) | N/A |
| pH | HSP70 | 0.681 (P = 0.015) | 2.67 (P = 0.119) | N/A |
| pH | NF-kB | -0.740 (P = 0.006) | 6.21 (P = 0.017) | N/A |
| pH | ACR | -0.777 (P = 0.005) | 4.16 (P = 0.055) | N/A |
| Temperature | HSP70 | N/A | 79.71 (P < 0.001) | 136.67 (P < 0.001) |
| Temperature | HSP40 | N/A | 1.67 (P = 0.250) | 77.47 (P < 0.001) |
| Temperature | NF-kB | N/A | 12.516 (P = 0.002) | 1.26 (P = 0.350) |
| Temperature | Aquaporin | N/A | 7.875 (P = 0.009) | 0.22 (P = 0.812) |
| Temperature | ACR | N/A | 8.46 (P = 0.007) | 0.44 (P = 0.664) |
| Temperature | MSH2 | N/A | 1.95 (P = 0.200) | 23.95 (P = 0.001) |
| Temperature | ATPase | N/A | 1.59 (P = 0.266) | 23.78 (P = 0.001) |

One explanation for HSP70 up-regulation at 0 °C might be recovery when the collembolans are removed from the soil. The animals were exposed for 3 days and may have adapted to the low temperature; hence, warming the collembolans could induce HSP70. To exclude recovery effects the experiment was repeated under identical conditions, but warming during the soil extraction was avoided by working in a cold room. Expression analysis on these animals showed that HSP70 was further up-regulated at 0 °C compared to 20 °C (2.6 times, ANOVA P < 0.001) (Figure 3a). Interestingly, HSP40, which was not up-regulated in the previous experiment, was also significantly up-regulated in this experiment (5.3 times higher expression than at 20 °C, ANOVA P < 0.001; see Figure 3b).

Discussion

Figure 3: Expression of HSP70 (left) and HSP40 (right) in Folsomia candida exposed to different temperatures in two separate experiments (dark grey for experiment 1 and light grey for experiment 2).

The soil surface layer is subject to wide fluctuations in moisture and temperature and, as a soil-dwelling animal, Folsomia candida is exposed to these widely variable conditions. These factors can confound ecotoxicological testing with this model organism for risk assessment of soil pollutants, especially when sensitive endpoints like gene expression are measured. We determined the expression of nine stress-response genes in animals exposed to these confounding factors. Overall, we observed that pH and temperature hardly influenced expression of these nine genes. Only ATPase showed a significant effect in the pH treatment. In the temperature treatment the heat shock protein genes HSP40 and HSP70 showed a significant differential regulation. The pH treatment induced a slightly weaker gene-regulatory effect than the temperature treatment. This may be explained by differences in pH not affecting the whole animal but only tissues such as the gut and the cuticle (Harrison, 2001), while differences in temperature affect all the cells of the animal. When gene expression is analysed on whole animals (F. candida is too small to focus on single organs), the effects in different tissue types are obscured and gene expression measurements are diluted.

As mentioned above, the only gene significantly affected by differences in pH was an ATPase gene. It is homologous to Drosophila melanogaster Vha68-2 (E value of -119), which encodes subunit A of a vacuolar ATPase (V-ATPase) expressed on the plasma membranes of epithelial tissues (e.g., salivary glands, midgut and Malpighian tubules). In the tobacco hornworm, Manduca sexta, V-ATPase is expressed on goblet cells in the gut, where it transports protons out of the cell into the lumen. This proton transport balances the activity of a H+/K+ antiporter which transports protons into the cell while transporting potassium out of the cell into the gut. Potassium is then used in secondary transport, where the gut columnar cells transport amino acids together with the excreted potassium into the gut epithelium (Wieczorek et al., 2000). It is possible that a lowered pH in the gut lumen (due to low environmental pH) could reduce intracellular pH due to the activity of the H+/K+ antiporter (Harrison, 2001).
One mechanism of maintaining the intracellular pH at normal levels may be up-regulation of the expression of V-ATPase.

HSP70 expression was elevated in the first 24 hours of exposure, but this response was not significant due to high variation. The elevated expression was probably caused by handling stress of the collembolans, whilst the variation might be caused by heterogeneity in exposure due to the short exposure time. In the time course experiment, ATPase was only significantly affected by different soil pH after 72 hours of exposure, which may be associated with the moulting cycle. During this process the activity of V-ATPase is decreased by dissociation of the different subunits that make up the complete V-ATPase holoenzyme (Sumner et al., 1995), but it has also been shown that transcript levels of the genes for the different subunits are down-regulated upon moulting (Wieczorek et al., 2000). Any specific pH effect on the expression of V-ATPase might be overruled by the effect of moulting on gene expression.

Not much is known about the physiological effects of temperature stress on collembolans like F. candida, as most literature describes it either in combination with drought stress or addresses effects of temperature on life history traits (Stam et al., 1996; Bayley et al., 2001). For instance, F. candida reared at a lower temperature has a longer life span (240 days at 15 °C compared to 111 days at 24 °C) (Fountain and Hopkin, 2001). Eggs from F. candida exposed to temperatures higher than 28 °C fail to hatch, and we found that 30 °C was the maximum temperature they could endure for 3 days. It should be mentioned that we observed mortality at 30 °C in the second temperature experiment. Nevertheless, we observed up-regulation of HSP70 at 30 °C when compared to the 20 °C control treatment, confirming that the proteins respond to heat (Feder and Hofmann, 1999; Slotsbo et al., 2009).

HSP70 and HSP40 expression were also up-regulated in animals exposed to 0 °C. The role of HSP70 in the cold tolerance of arthropods has been debated in recent years (e.g. Nielsen et al., 2005; Sinclair et al., 2007). A number of studies have shown that HSP70 expression is up-regulated during recovery from cold shock (Goto et al., 1998; Sejerkilde et al., 2003), but expression of HSP70 during cold exposure is often absent or very limited (Kelty and Lee, 2001; Nielsen et al., 2005; Overgaard et al., 2005) or connected to cold-resistant diapausing life stages (Yocum et al., 1991; Rinehart et al., 2000). In our second temperature experiment we ensured that collembolans did not experience a temperature change during extraction and handling before snap freezing by isolating them from the soil at the same temperature as during exposure in soil.
Again, HSP70 expression remained up-regulated, clearly showing that the induction of HSP70 was caused by the cold treatment and not by warming during termination of the cold exposure. This observation indicates that repair mechanisms are induced during cold exposure and points to a protective role of HSP70. HSP40 was also up-regulated, whereas in the first experiment it seemed unaffected by the cold treatment. HSP40 regulation may act very fast, so that mRNA abundance of this gene had returned to normal levels before we measured it in the first experiment. Finally, we should note that HSP40 over-expression may confer cold tolerance in prokaryotes (Chow and Tung, 1998). However, HSP40 regulation has, to our knowledge, never been associated with cold tolerance in higher organisms.

It is not known if F. candida can diapause, but it is able to enter a dormant state in which it down-regulates its metabolic activity during cold periods. Collembola species from Arctic regions are able to enter diapause, e.g., Hypogastrura tullbergi (Birkemoe and Leinaas, 1999). These Collembola have a limited reproductive period. Entering diapause may enable synchrony of sexual reproduction (Ims, 1990). F. candida, however, is an asexually reproducing species that does not need to be synchronized before a reproductive period. Since F. candida does not enter diapause, but merely lowers its metabolic rate during colder periods, HSP70 up-regulation under a low temperature regime may offer protection against, and repair after, cold injuries inflicted by denaturing of proteins (Ramlov, 2000).

With genomic gene expression techniques like microarrays (Nota et al., 2008) and high-throughput quantitative PCR becoming more common in the field of ecology and ecotoxicology, it is important to investigate the impact of conflicting causes and effects. When, for example, using a microarray platform to assess the impact of soil pollutants on test animals, it is important to tease out whether the effects are caused by the pollutant or by other abiotic factors. Of the nine genes we tested, only one responded to differences in soil pH and two responded to temperature changes. Even though factors such as pH and temperature are considered to be confounding in soil quality testing (Amorim et al., 2008), they showed an interesting physiological effect. The up-regulation of ATPase at a low soil pH is associated with the presence of goblet cells in higher insects. E1/Azan stained microscopic slides of F. candida gut epithelium did not show any goblet cells (data not shown), so this might indicate a novel gut organization. The up-regulation of HSP70 at lower temperatures has not been observed before. HSP70 up-regulation was previously only associated with recovery from cold. However, up-regulation under cold might indicate a novel cold defense mechanism, which remains to be elucidated in future studies.

Acknowledgments:

The authors would like to thank Martijn Timmermans, Muriel de Boer and Ben Nota for help with animal care and real time PCR experiments. This work was supported by grants from the Netherlands Genomics Initiative in the programme: “Assessing the Living Soil” (Ecogenomics)


References

Amorim, M. J. B., Novais, S., Römbke, J. & Soares, A. M. V. M. 2008. Avoidance test with Enchytraeus albidus (Enchytraeidae): Effects of different exposure time and soil properties. Environmental Pollution, 155, 112-116.
Bahrndorff, S., Petersen, S. O., Loeschcke, V., Overgaard, J. & Holmstrup, M. 2007. Differences in cold and drought tolerance of high arctic and sub-arctic populations of Megaphorura arctica Tullberg 1876 (Onychiuridae: Collembola). Cryobiology, 55, 315-323.
Bayley, M. & Holmstrup, M. 1999. Water vapor absorption in arthropods by accumulation of myoinositol and glucose. Science, 285, 1909-1911.
Bayley, M., Petersen, S. O., Knigge, T., Kohler, H. R. & Holmstrup, M. 2001. Drought acclimation confers cold tolerance in the soil collembolan Folsomia candida. Journal of Insect Physiology, 47, 1197-1204.
Birkemoe, T. & Leinaas, H. P. 1999. Reproductive biology of the arctic collembolan Hypogastrura tullbergi. Ecography, 22, 31-39.
Chow, K.-C. & Tung, W. L. 1998. Overexpression of dnaK/dnaJ and groEL Confers Freeze Tolerance to Escherichia coli. Biochemical and Biophysical Research Communications, 253, 502-505.
Crommentuijn, T., Doornekamp, A. & Van Gestel, C. A. M. 1997. Bioavailability and ecological effects of cadmium on Folsomia candida (Willem) in an artificial soil substrate as influenced by pH and organic matter. Applied Soil Ecology, 5, 261-271.
De Boer, M., De Boer, T., Marien, J., Timmermans, M., Nota, B., Van Straalen, N., Ellers, J. & Roelofs, D. 2009. Reference genes for Q-PCR tested under various stress conditions in Folsomia candida and Orchesella cincta (Insecta, Collembola). BMC Molecular Biology, 10, 54.
Feder, M. E. & Hofmann, G. E. 1999. Heat-shock proteins, molecular chaperones, and the stress response: Evolutionary and Ecological Physiology. Annual Review of Physiology, 61, 243-282.
Fountain, M. T. & Hopkin, S. P. 2001. Continuous Monitoring of Folsomia candida (Insecta: Collembola) in a Metal Exposure Test. Ecotoxicology and Environmental Safety, 48, 275-286.
Fountain, M. T. & Hopkin, S. P. 2005. Folsomia candida (Collembola): A "Standard" Soil Arthropod. Annual Review of Entomology, 50, 201-222.
Goto, S. G., Yoshida, K. M. & Kimura, M. T. 1998. Accumulation of Hsp70 mRNA under environmental stresses in diapausing and nondiapausing adults of Drosophila triauraria. Journal of Insect Physiology, 44, 1009-1015.
Harrison, J. F. 2001. Insect acid-base physiology. Annual Review of Entomology, 46, 221-250.
Holmstrup, M., Aubail, A. & Damgaard, C. 2008. Exposure to mercury reduces cold tolerance in the springtail Folsomia candida. Comparative Biochemistry and Physiology Part C: Toxicology & Pharmacology, 148, 172-177.
Ims, R. A. 1990. The ecology and evolution of reproductive synchrony. Trends in Ecology & Evolution, 5, 135-140.
ISO 1999. Soil Quality. Inhibition of Reproduction of Collembola (Folsomia candida). ISO Guideline 11267. International Standardization Organization. Switzerland.
Kammenga, J. E., Herman, M. A., Ouborg, N. J., Johnson, L. & Breitling, R. 2007. Microarray challenges in ecology. Trends in Ecology & Evolution, 22, 273-279.


Kelty, J. D. & Lee, R. E. 2001. Rapid cold-hardening of Drosophila melanogaster (Diptera: Drosophilidae) during ecologically based thermoperiodic cycles. Journal of Experimental Biology, 204, 1659-1666.
Koštál, V. & Tollarová-Borovanská, M. 2009. The 70 kDa Heat Shock Protein Assists during the Repair of Chilling Injury in the Insect, Pyrrhocoris apterus. PLoS ONE, 4, e4546.
Muller, P. Y., Janovjak, H., Miserez, A. R. & Dobbie, Z. 2002. Processing of gene expression data generated by quantitative real-time RT PCR (vol 32, pg 1378, 2002). Biotechniques, 33, 514-514.
Nielsen, M. M., Overgaard, J., Sorensen, J. G., Holmstrup, M., Justesen, J. & Loeschcke, V. 2005. Role of HSF activation for resistance to heat, cold and high-temperature knock-down. Journal of Insect Physiology, 51, 1320-1329.
Nota, B., Timmermans, M., Franken, O., Montagne-Wajer, K., Marien, J., De Boer, M. E., De Boer, T. E., Ylstra, B., Van Straalen, N. M. & Roelofs, D. 2008. Gene Expression Analysis of Collembola in Cadmium Containing Soil. Environmental Science & Technology, 42, 8152-8157.
Overgaard, J., Sorensen, J. G., Petersen, S. O., Loeschcke, V. & Holmstrup, M. 2005. Changes in membrane lipid composition following rapid cold hardening in Drosophila melanogaster. Journal of Insect Physiology, 51, 1173-1182.
Ramlov, H. 2000. Aspects of natural cold tolerance in ectothermic animals. Human Reproduction, 15, 26-46.
Rinehart, J. P., Yocum, G. D. & Denlinger, D. L. 2000. Developmental upregulation of inducible hsp70 transcripts, but not the cognate form, during pupal diapause in the flesh fly, Sarcophaga crassipalpis. Insect Biochemistry and Molecular Biology, 30, 515-521.
Sejerkilde, M., Sorensen, J. G. & Loeschcke, V. 2003. Effects of cold- and heat hardening on thermal resistance in Drosophila melanogaster. Journal of Insect Physiology, 49, 719-726.
Sijm, D., Kraaij, R. & Belfroid, A. 2000. Bioavailability in soil or sediment: exposure of different organisms and approaches to study it. Environmental Pollution, 108, 113-119.
Slotsbo, S., Heckmann, L.-H., Damgaard, C., Roelofs, D., De Boer, T. & Holmstrup, M. 2009. Exposure to mercury reduces heat tolerance and heat hardening ability of the springtail Folsomia candida. Comparative Biochemistry and Physiology Part C: Toxicology & Pharmacology, 150, 118-123.
Stam, E. M., Vandeleemkule, M. A. & Ernsting, G. 1996. Trade-offs in the life history and energy budget of the parthenogenetic collembolan Folsomia candida (Willem). Oecologia, 107, 283-292.
Sumner, J.-P., Dow, J. A. T., Earley, F. G. P., Klein, U., Jäger, D. & Wieczorek, H. 1995. Regulation of Plasma Membrane V-ATPase Activity by Dissociation of Peripheral Subunits. Journal of Biological Chemistry, 270, 5649-5653.
Tamm, C. O. & Hallbäcken, L. 1988. Changes in Soil Acidity in Two Forest Areas with Different Acid Deposition: 1920s to 1980s. Ambio, 17, 56-61.
Timmermans, M., De Boer, M., Nota, B., De Boer, T., Marien, J., Klein-Lankhorst, R., Van Straalen, N. & Roelofs, D. 2007. Collembase: a repository for springtail genomics and soil quality assessment. BMC Genomics, 8, 341.
Timmermans, M. J. T. N., Roelofs, D., Nota, B., Ylstra, B. & Holmstrup, M. 2009. Sugar sweet springtails: on the transcriptional response of Folsomia candida (Collembola) to desiccation stress. Insect Molecular Biology, 18, 737-46.
Van Straalen, N. M. & Verhoef, H. A. 1997. The Development of a Bioindicator System for Soil Acidity Based on Arthropod pH Preferences. The Journal of Applied Ecology, 34, 217-232.


Wieczorek, H., Grüber, G., Harvey, W. R., Huss, M., Merzendorfer, H. & Zeiske, W. 2000. Structure and regulation of insect plasma membrane H(+)V-ATPase. Journal of Experimental Biology, 203, 127-135.
Yocum, G. D., Joplin, K. H. & Denlinger, D. L. 1991. Expression of Heat-Shock Proteins in Response to High and Low-Temperature Extremes in Diapausing Pharate Larvae of the Gypsy-Moth, Lymantria-Dispar. Archives of Insect Biochemistry and Physiology, 18, 239-249.
Zachariassen, K. E. 1985. Physiology of Cold Tolerance in Insects. Physiological Reviews, 65, 799-832.


Chapter 4

Transcriptional plasticity of a soil arthropod across different ecological conditions

Tjalf E. de Boer, Adriana Birlutiu, Zoltan Bochdanovits, Martijn Timmermans, Tjeerd Dijkstra, Nico M. van Straalen, Bauke Ylstra and Dick Roelofs

Submitted to Molecular Ecology


Abstract

Ecological functional genomics, dealing with the responses of organisms to their natural environment is confronted with a complex pattern of variation and a large number of confounding environmental factors. For gene expression studies to provide meaningful information on conditions deviating from normal, a baseline or Natural Operating Range (NOR) response needs to be established which indicates how an organism’s transcriptome reacts to naturally varying ecological factors. Here we determine the transcriptional plasticity of a soil arthropod, Folsomia candida, exposed to various natural environments, as part of a first attempt in establishing such a NOR. Animals were exposed to 26 different field soils after which gene expression levels were measured. The main factor found to regulate gene expression was soil-type (sand or clay). Cell homeostasis and DNA replication were affected in collembolans exposed to sandy soil, indicating general stress. Multivariate analysis identified soil fertility as the main factor influencing gene expression. Regarding land-use, only forest soils showed an expression pattern deviating from the others. No significant effect of land-use, agricultural practice or soil type on fitness was observed, but arsenic concentration was negatively correlated with reproductive output. In conclusion, transcriptional responses remained within a limited range across the different land-uses but were significantly affected by soil-type. This may be caused by the contrasting soil physicochemical properties to which F. candida strongly responds. The broad range of conditions over which this soil-living detritivore is able to survive and reproduce, indicates a strategy of high plasticity, which comes with extensive gene expression regulation.


Introduction

Genomic and gene expression measurements are becoming commonly used techniques in ecological studies (Gibson 2002; Kammenga et al. 2007). By measuring gene expression, we can determine the physiological state of animals when exposed to different ecological conditions and determine whether some of these conditions cause adverse effects. In soil ecology one or more species of animals or plants are exposed to disturbed soils in order to determine the impact of local environmental change on soil dwelling species (Roelofs et al. 2008). To arrive at meaningful results for species functioning under stressed conditions, a reference or baseline level of functioning is needed. We dub this baseline response the Natural Operating Range (NOR). Similar concepts (such as the normal operating range) have been used in ecological studies for a long time. Odum et al (1979) introduced the concept of a normal operating range in their ecosystem perturbation theory already in the seventies (Odum et al. 1979), where they stated that perturbation is any deviation or displacement from the nominal state. The nominal state was defined not as single or fixed, but as a range of normal functioning, including expected variance, hence normal operating range. We propose to apply the concept of Natural Operating Range to gene expression measurements. This NOR can be applied to differentiate organismal responses with negative effects on fitness, from responses to the natural environment. In complicated ecological studies, where it can be difficult to standardize environmental factors and where a large number of endpoints are measured, such a NOR will be essential to study the influence of stress factors on gene expression. Establishing a NOR however, is a daunting task. In this paper we take a first step towards the operationalization of this NOR concept by studying the plasticity of a test species with a single genotype exposed to a multitude of natural conditions. 
The variation of gene expression under these conditions reflects physiologically relevant natural variation and may provide an insight into how animals maintain homeostasis in the field.

Transcriptome analysis has become a standard method for assessing the physiological state of an organism. Microarray technology is used to measure the expression of a large number of genes at once (Schena et al. 1995). In most microarray analyses a treatment sample (e.g. mRNA from animals subjected to some stress factor) is compared to a control sample (e.g. mRNA from a reference group). Less attention is paid to variation within the controls, that is, the range of gene expression variation shown by animals living under conditions that can be considered to fall within their ecological niche. Studies of such variation will help to define baseline levels of gene expression for an organism. Pritchard et al. (2001) investigated gene expression differences among six normal male C57BL6 mice. Significant expression variation was found in 0.8 to 3.3% of the measured genes, depending on the tissue investigated. Among the differentially expressed genes, immune-system related genes, stress-induced genes and hormonally regulated genes were highly represented. Cavalieri et al. (2000) characterised gene expression between phenotypic variants of progeny from a single parental strain and detected 6% of genes differentially expressed between these variants. Both studies emphasise the importance of determining baseline expression variation in order to avoid misinterpretation of microarray data.

Ecological studies often focus on non-genomic model organisms, can be complex and involve a number of naturally varying confounding factors. Here we focus on an upcoming genomic model species with high ecological relevance for soil quality assessment, the collembolan Folsomia candida. This collembolan is an important test animal in soil ecotoxicology and is part of an ISO-recognized toxicity test (ISO 1999). Folsomia candida is easy to culture, has a short reproduction time and has low genetic variation among individuals in the same culture due to its parthenogenetic mode of reproduction (Fountain & Hopkin 2005). At the same time, physiological differences between cultures from different origins are readily observed. These attributes make it a suitable animal for laboratory experiments on population genetics and evolution (Noël et al. 2006; Smit & Van Gestel 1998). Part of the F. candida transcriptome has been sequenced and used for gene transcriptional studies (Nota et al. 2008; Timmermans et al. 2007). Timmermans et al. (2009) used an oligonucleotide microarray platform for a physiological study on the molecular mechanism of drought tolerance in this springtail. They suggested carbohydrate transport, sugar catabolism and cuticle maintenance to be important biological processes involved in combating desiccation stress. Interestingly, Bayley & Holmstrup (1999) showed that F. candida becomes hyper-osmotic during desiccation by accumulating glucose and myo-inositol, so that water vapour can be extracted from the environment under desiccating conditions. Thus, the transcriptomic data supported previous physiological observations. Another study, by Nota et al. (2009), investigated the biotransformation pathway of xenobiotic substances in F. candida. Indeed, genes involved in phases I, II and III of the biotransformation pathway were significantly affected by the xenobiotic compound phenanthrene.


The ISO soil toxicity test measures F. candida survival and reproduction after an exposure of 28 days. This test, however, tends to show considerable variation between replicates (Crouau & Cazes 2003) and it does not provide any information on the mode of action that a chemical or pollution stress exerts on the animal. Also, measurement of survival and reproduction does not reveal comparative information about modes of action between different test species. Measuring gene expression as an endpoint in a soil ecotoxicological test can possibly solve these issues by providing a sensitive and mode-of-action-specific way of measuring toxicity in soil (van Straalen & Roelofs 2008). Furthermore, gene expression analysis can be combined with survival/reproduction measurements to link molecular information to ecological endpoints. The aim of this paper is to explore how gene expression in this collembolan varies under different ecological conditions that are part of its natural environment and which main factors determine such variation. Generally, F. candida is found over a wide range of soil depths and inhabits agricultural ecosystems, forests and edges of streams (Fountain & Hopkin 2005). Previous studies have reported on different population densities (Kaneda & Kaneko 2008) and growth rates (Kaneda & Kaneko 2002) of F. candida exposed to a variety of natural soils. To obtain more mechanistic insight into the ability of F. candida to survive and reproduce under such a broad range of conditions, we measured gene expression in animals exposed to a wide variety of natural soils. We want to contribute to the development of a baseline indicative of the transcriptional Natural Operating Range in this collembolan.

Materials and Methods

Soil sampling

Twenty-six soils from different sites in the Netherlands were sampled between March and June 2007. Soil samples were taken from fields with a known history of land-use (dairy farming, agriculture, natural grassland and forest). Information on specific agricultural practice (conventional versus organic) for the agricultural and dairy farming land uses was also available. The soils were chosen to represent different soil types (clay versus sand). The sites jointly represented several replicated combinations of land-use, agricultural practice and soil type, but not all combinations could be made (Table 1). Folsomia candida, the test animal used in this study, is normally found in these kinds of soils, although we did not establish its presence in every field. Fields were sampled at five spots in a square of 20 by 20 m (on each corner and in the middle). At each spot, five subsamples were taken within a radius of 2 m (see Table 1 for soil details and map coordinates). All soils were sampled with consent of their respective owners or caretakers. Samples per soil plot were mixed, sieved over a 4 mm grid to exclude gravel and plant material and stored at 5 °C. The pre-treatment procedure resulted in removal of larger infauna, although the soils were not sterilized and retained their natural physicochemical properties. Chemical analysis on 250 g subsamples was performed by Blgg, Oosterbeek, the Netherlands, who measured eight metals (total Cd, Cr, Cu, Hg, Pb, Ni and As); pH; total nitrogen; total carbon; phosphate (Ptotal, Pw, P-AL and P-PAE); clay content and particle size distribution according to certified methods. We also measured the water holding capacity (WHC) for every soil. In addition to the Dutch field soils, the standard LUFA 2.2 reference soil was used (Landwirtschaftliche Untersuchungs und Forschungsanstalt, Speyer, Germany). This control soil is a sandy soil from the western part of Germany and is often used in bioassays conducted under internationally harmonized protocols.

Table 1: The specific data for each soil concerning soil-type, land-use and practice as well as GPS coordinates (latitude a and longitude b).

Soil  Soil type     Land-use           Practice      North coordinate a  East coordinate b
1     sand          dairy              conventional  52° 14' 849"        006° 16' 130"
2     sand          dairy              conventional  52° 14' 368"        006° 41' 413"
3     sand          dairy              conventional  51° 27' 992"        005° 54' 350"
4     sand          forest             N/A           52° 08' 173"        005° 11' 185"
5     sand          forest             N/A           51° 25' 627"        005° 47' 074"
6     sand          forest             N/A           53° 04' 920"        006° 28' 031"
7     sand          agriculture        conventional  53° 06' 337"        006° 22' 531"
8     sand          agriculture        conventional  52° 43' 515"        006° 37' 160"
9     sand          agriculture        conventional  53° 05' 309"        006° 49' 945"
10    sand          agriculture        organic       52° 13' 220"        005° 40' 160"
11    sand          agriculture        organic       52° 51' 711"        006° 43' 203"
12    sand          agriculture        organic       52° 01' 241"        006° 12' 048"
13    clay (sea)    agriculture        organic       51° 34' 150"        003° 35' 066"
14    clay (sea)    agriculture        organic       53° 12' 922"        005° 27' 593"
15    clay (sea)    agriculture        conventional  51° 33' 008"        003° 28' 059"
16    clay (sea)    agriculture        conventional  52° 52' 076"        005° 02' 067"
17    clay (sea)    agriculture        conventional  53° 12' 667"        005° 31' 021"
18    clay (river)  dairy              conventional  51° 53' 196"        006° 17' 340"
19    clay (river)  dairy              conventional  52° 22' 862"        006° 04' 833"
20    clay (river)  dairy              conventional  51° 29' 268"        005° 18' 115"
21    sand          natural grassland  N/A           51° 32' 706"        005° 18' 375"
22    sand          natural grassland  N/A           52° 03' 515"        005° 33' 899"
23    sand          natural grassland  N/A           N/A                 N/A
24    sand          natural grassland  N/A           52° 00' 524"        005° 35' 853"
25    sand          agriculture        organic       51° 42' 917"        005° 46' 725"
26    clay (river)  dairy              conventional  51° 51' 262"        005° 55' 181"

Folsomia candida culture and exposure

Folsomia candida (VU Berlin strain) was maintained in PVC containers with a plaster of Paris base containing 10% charcoal. The animals were fed baker's yeast (Dr. Oetker, Amersfoort, The Netherlands) ad libitum. Following ISO (1999), age-synchronized cultures were obtained by transferring adult collembolans to fresh culture containers where they were allowed to lay eggs for two days. After two days the animals were removed and their hatchlings were used for the experiments. All collembolans (stocks and exposed) were kept at 20 °C in a climate-controlled room with a 12-hour dark/light cycle at 75% relative air humidity. Exposures were performed in 100 ml glass jars. The field soils were moistened to 50% of their WHC 24 hours before exposure and left to equilibrate. For the gene expression analysis four replicate jars per soil were used. Each replicate contained 25 g wet soil and 30 collembolans (23 days old) randomly selected from the synchronized stock. After 2 days of exposure, the animals were removed from the soil by floatation on water and snap frozen in liquid nitrogen for RNA extraction. The 30 animals from each jar were pooled and considered one biological replicate. For practical reasons the exposures for sandy soils and clay soils were performed consecutively. For the 28-day reproduction test, 10-day-old synchronized collembolans were used. Per replicate, ten animals were exposed to 25 g of wet soil (at 50% of WHC) in a 100 ml glass container for 28 days. Containers were opened twice a week for aeration, the animals were fed once a week and moisture levels were adjusted twice during the exposure. After 28 days the containers were filled with 100 ml water and emptied into a glass beaker, and after gentle stirring digital photographs were taken of the floating collembolans. The number of juveniles was determined with CellD software (Olympus, Hamburg, Germany).

RNA extraction and microarray hybridisation

Total RNA was extracted with the SV Total RNA kit from Promega according to the manufacturer’s instructions, which included a DNase treatment. RNA integrity and concentrations were measured on a Bioanalyzer (Bioanalyzer 2100, Agilent Technologies, Santa Clara, USA) and Nanodrop spectrophotometer (Nanodrop ND-1000, Fisher Scientific, Waltham, USA). Total RNA (500 ng per sample) was used as input for amplification and labelling with the Low-Input Fluorescent Linear Amplification Kit (Agilent Technologies), according to the manufacturer's guidelines. In this protocol total RNA is used as input for reverse transcription into cDNA, which in turn is used as template for labelled cRNA transcription. This results in greater amplification of the labelled material. One modification to the standard protocol was applied: the cRNA transcription reactions were done in half volume. Labelled cRNA was hybridized to 8 × 15K custom Agilent microarrays overnight, washed and scanned, all according to the manufacturer’s instructions. The custom microarray contains 5069 unique F. candida gene fragments in triplicate (Nota et al. 2009), and is based on the Folsomia candida EST sequencing database Collembase (Timmermans et al. 2007) (www.collembase.org).

Microarray experimental design and analysis

For the microarray experimental design we used an interwoven loop design (Altman & Hua 2006). Separate loop designs were used for the clay and the sandy soil exposures. In each loop a control was incorporated which consisted of collembolans exposed to the standard soil LUFA 2.2, so the two experimental series could be compared to each other. Microarray fluorescent intensities were measured with Feature Extraction software (version 9.5.1, Agilent Technologies). The data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus (Edgar et al. 2002) and are accessible through GEO Series accession number GSE21213. Pre-processing and normalization (global loess) were performed using the Limma package in R (Smyth 2004). Limma was also used for background correction (Edwards 2003), which included a minimum intensity off-set of 30 to avoid zero or negative intensities. Technical replicates on the array (three per gene) were averaged before Microarray ANOVA (MAANOVA) analysis, for which we used the MAANOVA package in R (Kerr et al. 2000). The microarray ANOVA model was described by the following formula:

$$y_{ijkl} = \mu + S_i + L_j + (S \times L)_{ij} + D_k + A_l + \varepsilon_{ijkl}$$

where $y_{ijkl}$ is the signal from the sample derived from the ith soil-type in combination with the jth land-use, labelled with the kth dye and hybridized to the randomly assigned lth array. The parameter $\mu$ is the overall mean, $S_i$ is the soil-type effect, $L_j$ is the land-use effect, $(S \times L)_{ij}$ is the soil-type land-use interaction, $D_k$ is the dye effect (Cy3 or Cy5), $A_l$ is the random effect of the array to which the sample has been assigned and $\varepsilon_{ijkl}$ is the stochastic error. In the soil-type MAANOVA results a fold regulation cut-off was determined by comparing the LUFA2.2 controls from both sand and clay exposures and this cut-off was used to remove low fold-change genes from the results. The clipped list was subjected to Gene Ontology (GO) analysis using the topGO package in R (Alexa et al. 2006) and clustered with unsupervised hierarchical clustering (Pearson un-centred) using TIGR MeV 4.5.1 (Saeed et al. 2006). The significance cut-off of the GO terms was set at 0.05 after weighting the results from the Fisher’s exact test. Weighting was done to prevent overestimation of significance due to the hierarchically structured tree of GO terms. GO terms which were represented by only one gene in the Folsomia candida database were removed from further analysis. The significant gene list generated from the land-use MAANOVA calculation was subjected to K-means clustering (four groups, 100 iterations) using the MAANOVA package in R. To determine overall gene expression stability, the Coefficient of Variation (CoV) for each gene was calculated within the soil-type/land-use combinations. Also, to determine gene expression stability between soil-type/land-use combinations, a series of pair-wise F-tests was performed on variance data per soil-type/land-use subset. Calculated P-values were adjusted according to Benjamini and Hochberg’s step-up procedure (Benjamini & Hochberg 1995).
A Canonical Correlation Analysis (CCA) was performed to link gene expression to soil analytical data. CCA is a method for determining the relationship between two sets of variables; it seeks linear combinations of the variables in both datasets that are maximally correlated with each other.
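The Benjamini and Hochberg step-up adjustment referred to in this section can be illustrated with a short sketch. This is not the code used in the study (the analysis was run in R); it is a minimal Python illustration, and the function name `benjamini_hochberg` and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step up from the largest p-value, keeping adjusted values monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes (not data from the study).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
# Genes whose adjusted p-value falls below the 0.05 threshold.
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

In this toy example the third and fourth genes have raw P-values below 0.05 but are no longer significant after adjustment, which is the behaviour the correction is meant to provide when thousands of genes are tested at once.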

Results

Variation of LUFA controls in time

The microarray experimental design included two interwoven loops (sandy soils and clay soils). To investigate possible systematic differences between the two loops, a LUFA2.2 control was added to each loop, which was treated identically in each exposure loop (moisture content, pH, four replicates per exposure). The microarray intensities for both LUFA2.2 controls were normalized, log transformed and averaged per exposure. The two exposure controls were compared directly to each other with linear regression. The linear regression revealed a slope of 0.997 and an intercept of 0.077 (compared to 1 and 0 for a perfect regression where the two sets are identical). The maximum log2 fold change between the two controls was 0.42 (see Figure 1). This shows that gene expression signatures of F. candida across standard soils are highly reproducible. Clay and sand datasets were therefore compared directly in further analyses and the expression patterns from LUFA soils were ignored.

Figure 1: MA plot between the averages for both the clay soil exposure controls and the sandy soil exposure controls. M is the ratio or fold-change between sand and clay and A is the average intensity between the two spots. The maximum fold-change between clay and sandy soil exposure was 0.42 (Log2) which was used as a fold-change cut-off to separate transcripts regulated by soil-type from transcripts regulated by exposure effects.
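As an illustration of how M and A values and a control-derived fold-change cut-off relate to each other, the sketch below computes both for a handful of spots and then filters genes by the cut-off. All intensities, gene names and the sign convention (clay minus sand) are assumptions made for the example, not values from the study.

```python
import math


def ma_values(intensity_a, intensity_b):
    """Return (M, A) per spot: M = log2 ratio, A = mean log2 intensity."""
    m = [math.log2(a) - math.log2(b) for a, b in zip(intensity_a, intensity_b)]
    a = [(math.log2(x) + math.log2(y)) / 2 for x, y in zip(intensity_a, intensity_b)]
    return m, a


# Hypothetical averaged control intensities from the two exposure series.
clay_control = [1024.0, 512.0, 2048.0, 256.0]
sand_control = [1024.0, 640.0, 1792.0, 256.0]

m_ctrl, a_ctrl = ma_values(clay_control, sand_control)
# Largest absolute log2 difference between controls, used as the cut-off.
cutoff = max(abs(m) for m in m_ctrl)

# Hypothetical log2 fold changes (clay minus sand) for four genes.
gene_m = {"gene_a": 0.90, "gene_b": -0.10, "gene_c": -1.30, "gene_d": 0.30}
up_in_clay = sorted(g for g, m in gene_m.items() if m > cutoff)
up_in_sand = sorted(g for g, m in gene_m.items() if m < -cutoff)
```

Genes whose fold change stays within the range seen between the two controls are dropped, since a difference of that size could arise from the exposure series alone.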


Gene expression analysis

To measure gene expression differences in F. candida exposed to the different soils we determined two main levels of variation: soil-type (clay or sandy soil) and land-use (e.g. forest or organic agriculture); see Table 1 for the details per soil. A Microarray ANOVA model was used to determine differences in gene expression within soil-type and land-use and to investigate if there was an interaction between these two main factors. In the main factor soil-type we found, after FDR adjustment (P < 0.05, adaptive method), 2819 out of the 5069 genes to be significantly differentially expressed between the two soil types. To exclude the possibility of an exposure effect, fold change cut-offs of 0.42 and -0.42 (Log2) were used, since this was the maximum fold change observed between the LUFA2.2 controls of the sand and clay exposures. The number of transcripts left after the fold change cut-off was 936. These 936 genes were divided into two groups according to their response type (positive or negative). The group with a positive fold change (genes up-regulated in clay soils) consisted of 449 genes and the group with a negative fold change (genes up-regulated in sandy soils) contained 487 genes. Figure 2 depicts a clustering of the 936 remaining genes, clearly showing contrasting profiles for clay and sandy soils. Both sets of genes were subjected to a gene enrichment (GO) analysis (Alexa et al. 2006). In the GO analysis we focused on the terms “Biological Process” and “Molecular Function” since these seemed to be the most indicative of any effects on a biological level. In the clay up-regulated group of genes, we found for Biological Process GO terms that were mainly involved in protein maintenance, cytoskeleton regulation and signal transduction. Analysis on Molecular Function in the clay up-regulated group revealed terms which were involved in protein stabilization, protein homeostasis and fatty acid metabolism.
In the sand up-regulated group both Biological Process and Molecular Function revealed a large group of GO terms involved in DNA replication/repair and the cell cycle. A smaller group which was also found in both “Biological Process” and “Molecular Function” included GO terms for amino acid metabolism. The other main factor considered in the analysis was land-use. Here we compared animals exposed to soil under organic agriculture, conventional agriculture, dairy farming, forest and natural grassland. This analysis revealed twelve genes which were significantly differentially expressed between the different land uses. K-means clustering showed specific expression patterns where forest soils caused the main effect. Four genes were found up-regulated in collembolans exposed to forest soils while two genes were down-regulated. According to BLAST analysis, two of the four up-regulated genes were ABC transporters while the two down-regulated genes were glucuronosyl transferases. Glucuronosyl transferases and ABC transporters are part of phases II and III, respectively, of the detoxification pathway involved in the removal of xenobiotic substances from the body. This could indicate exposure to potentially harmful organic compounds, such as polyphenols and humic acids, which are more abundant in forest soils than in agricultural soils (Hattenschwiler et al. 2005). We also performed a MAANOVA calculation in which we investigated the interaction between soil-type and land-use. The analysis yielded no significant genes after FDR adjustment, indicating that the differences in gene expression between soil types are independent of land-use. It appears that the main dichotomy in the data is between clay and sand, while land-use has a smaller effect on the transcriptome, not interacting with soil-type.

Figure 2: Hierarchical clustering of the expression results for the 936 genes differentially regulated between soil types. Two main clusters, exposure to sandy soils (left) and exposure to clay soils, can be seen. Sand is further divided into two clusters, but this division remains unexplained.

Variation in gene expression

Genome-wide increased variation in gene expression may indicate stress (Oleksiak et al. 2002). We calculated the Coefficient of Variation (CoV) per gene for each soil-type/land-use combination in order to compare expression stability between the different soil-type/land-use combinations (Figure 3). The CoV ranged from 0.005 to 0.090 with 90% of the genes showing a CoV lower than 0.030. Apparently, F. candida gene expression is relatively stable under different natural soil conditions. The 30 genes with the highest CoV included two Isopenicillin-N-synthase (IPNS) genes and an ACV synthase gene. In fungi and bacteria these genes are involved in antibiotic synthesis and are currently under investigation as possible novel antibiotic synthesizing genes present in soil arthropods (Nota et al. 2008). Also, two Niemann-Pick type C genes were identified. In insects these genes are essential in cholesterol synthesis which is used for the production of the moulting hormone ecdysone (Huang et al. 2005). To investigate if some land uses induced greater variance in gene expression than others, a series of pair-wise F-tests was performed on soil-type/land-use combination variance data. Before the F-tests, the data were tested for normality and genes that did not show a normal distribution were removed from the dataset. The F-distribution per pair-wise test was calculated for the remaining 3420 transcripts and calculated P-values were adjusted for multiple testing according to Benjamini and Hochberg’s step-up procedure. In general, low variation in gene expression was observed across the different land uses. Three pair-wise tests (forest soil versus natural grassland on sand, conventional agriculture versus dairy farming on sand and dairy farming on sand versus dairy farming on clay) showed significant differences in gene expression variation (see Table 2).

Figure 3: Average CoV per gene calculated from different soil-type/land-use combinations, ordered from low CoV to high CoV. The majority of the genes (90%) show an average CoV lower than 0.03.

Table 2: The number of significant genes in the different pair-wise tests performed to investigate differences in gene expression variance between soil-type/land-use combinations.

Pair-wise test                                            Significant genes
Sand: organic agriculture vs. conventional agriculture    0
Sand: conventional agriculture vs. dairy farming          204
Clay: organic agriculture vs. conventional agriculture    0
Sand: forest vs. natural grassland                        272
Clay vs. sand: organic agriculture                        28
Clay vs. sand: conventional agriculture                   34
Clay vs. sand: dairy farming                              272
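The two stability measures used here, the per-gene CoV and the variance ratio underlying a pair-wise F-test, can be sketched as follows. This is only an illustration under stated assumptions: the replicate values are invented, the helper names are hypothetical, and the P-value step is omitted because it needs an F-distribution (available, for example, in SciPy as `scipy.stats.f`).

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def variance_ratio(group_a, group_b):
    """F statistic with the larger sample variance in the numerator.

    Returns (F, numerator df, denominator df). Assumes both groups
    have a non-zero variance.
    """
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    if var_a >= var_b:
        return var_a / var_b, len(group_a) - 1, len(group_b) - 1
    return var_b / var_a, len(group_b) - 1, len(group_a) - 1


# Hypothetical log-scale expression of one gene in four replicates each.
forest = [10.0, 10.4, 9.6, 10.0]
grassland = [10.0, 10.1, 9.9, 10.0]

cov_forest = coefficient_of_variation(forest)
f_stat, df_num, df_den = variance_ratio(forest, grassland)
```

Repeating the variance ratio for every gene yields one P-value per gene and per comparison, which is why a multiple-testing adjustment is applied afterwards.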

Reproduction

Reproduction of F. candida was measured in the same set of soils after a 28-day exposure. Variation in the numbers of juveniles between different soils was high. On average, clay soils resulted in fewer juveniles than sandy soils (406 for clay and 558 for sand), but this was not statistically significant considering the large variation within each soil. Variation between the soils ranged from 173 juveniles on average for soil 15 (organic agriculture on clay) to 809 juveniles on average for soil 8 (conventional agriculture on sand), while the control soil LUFA2.2 contained 634 juveniles on average. We observed no statistically significant pattern in the reproduction data concerning land-use. The variation seen in the experiments (200 – 800 juveniles after 28 days) falls in the range which is normally observed in experiments of this type (Smit & Van Gestel 1998). Of all the environmental factors that were measured, only the soil arsenic concentration was significantly correlated with the number of juveniles per soil (Pearson correlation, r = -0.495, P = 0.01) (Figure 4). According to Crouau and Moïa (2006) the concentration of As causing 50% reduction of reproduction over 28 days (EC50) is 21.7 mg/kg soil. In one soil (soil 1, dairy farming on sand) the arsenic concentration exceeded this EC50 (26 mg/kg); this soil also had a lower reproduction (61% of the average reproduction for sandy soils).
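For readers unfamiliar with the statistic, the Pearson correlation reported above can be computed as in the following sketch. The arsenic concentrations and juvenile counts are invented placeholders, not the measurements of this study, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical soil arsenic (mg/kg) and mean juvenile counts per soil.
arsenic = [2.0, 5.0, 8.0, 12.0, 26.0]
juveniles = [700.0, 650.0, 500.0, 560.0, 340.0]

r = pearson_r(arsenic, juveniles)  # negative when juveniles drop as As rises
```

A negative coefficient, as in the study, means that soils with more arsenic tended to yield fewer juveniles; it does not by itself show that arsenic is the cause.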

Multivariate analysis on gene expression and environmental factors

A principal component analysis (PCA) was performed on the gene expression data to investigate factors causing variance in the dataset. The first two principal components (PC) accounted for 68% of the variation in the dataset (44% for the first and 24% for the second PC). The first PC explains the variation between the two soil types while the second PC accounts for other variation, including variation induced by differences in land-use. In order to link gene expression differences to the soil environmental factors to which the collembolans were exposed, we performed a canonical correlation analysis (CCA). The obvious effect due to soil-type was removed before the analysis. The soil characteristics that showed the highest correlation with gene expression differences were three measures of phosphate: phosphate in pore water (Pw), bio-available phosphate (P-AL) and total phosphate (P-total). Even though these are all phosphate measurements, they showed no correlation with each other, reinforcing that there is a real effect of soil fertility on gene expression.
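The percentages of variation attributed to each principal component can be obtained from the singular values of the centred data matrix. The sketch below assumes NumPy is available and uses a small invented matrix (rows are samples, columns are genes); it is not the software or data used in the study.

```python
import numpy as np


def explained_variance_ratio(data):
    """Fraction of total variance carried by each principal component."""
    x = np.asarray(data, dtype=float)
    # Centre every column (gene) on its mean across samples.
    centred = x - x.mean(axis=0)
    # Singular values are returned in descending order.
    singular_values = np.linalg.svd(centred, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Hypothetical expression matrix: four samples by three genes.
expr = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.4],
    [3.0, 5.9, 0.6],
    [4.0, 8.0, 0.5],
]
ratios = explained_variance_ratio(expr)
```

Summing the first two entries of `ratios` gives the share of variation captured by the first two components, which is the quantity reported as 68% in the text.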


Figure 4: Correlation between Folsomia candida reproduction (y-axis) and soil arsenic concentration (x-axis). Reproduction in F. candida is negatively affected at higher arsenic concentrations in the soil.

The weight distribution of the genes in the CCA was calculated to test which genes correlated best with increasing phosphate concentration in the soil, either by up-regulation or down-regulation. Gene correlation weights ranged from -0.053 to 0.050. Cut-offs of -0.025 and 0.025 were used to exclude genes with a low correlation score, yielding 201 genes that were up-regulated with increasing phosphate concentration and 200 genes that were down-regulated with increasing phosphate. Both sets of genes were subjected to GO analysis focusing on biological process and molecular function. In the GO terms that were up-regulated with increasing phosphate concentrations no clear pattern could be observed.


However, in the terms that were negatively correlated with phosphate concentration we found a clear signature of protein synthesis. The terms “Translation” (GO:0006412), “Transcription from Polymerase I, II and III promoters” (GO:0006360, GO:0006383 and GO:0006367) and “RNA elongation from Polymerase II promoter” (GO:0006368) were all significant in biological process.

Discussion

In this study we introduced a new concept, the Natural Operating Range of a transcriptome, which determines the baseline variation of an organism under natural ecological conditions. As a first step towards establishing this NOR we investigated the transcriptional plasticity of a soil-dwelling collembolan exposed to different natural soils. The largest source of variation in the gene expression patterns was observed between two soil types: sand and clay. It indicates that this collembolan experiences the physicochemical properties of clay and sandy soils as very different from each other, despite the fact that it can survive and reproduce equally well in these soils. F. candida prefers to lay its eggs in the soil rather than on the surface to protect them against predators (Fountain & Hopkin 2005). The texture of the soil is therefore a factor of immediate relevance for egg-laying. Additionally, soil texture influences chemical composition. Clay soils tend to have higher concentrations of metal ions bound to the exchange complex of lutum particles, while sandy soils have lower lutum and organic matter contents with lower concentrations of exchangeable metals. In our soil chemical analysis lutum content was positively correlated with the metals Cr, Zn, Ni and Cu. Despite the fact that it can easily survive in sandy soils and is often found there, Folsomia candida prefers soils that are rich in organic matter (Potapow 2001). In the gene expression GO analysis, we found for sandy soils that GO terms involved in the cell cycle and DNA replication and repair were up-regulated while for animals exposed to clay soils GO terms in protein metabolism were up-regulated. Collembolans exposed to sandy soils seem to experience more general stress probably due to the lower organic matter content of the soil. However, specific stress pathways such as those that deal with the metabolism of xenobiotic substances were not found to be regulated in either soil-type. 
The land-use of the soil had much less impact on gene expression than the soil-type. This might be caused by the differences in soils within the same land-use. For instance, even though locations were classified to have the same land-use, there were differences in, for example, vegetation and fertilization schedules. Of all the land uses, only forest soils induced a specific pattern. Gene transcripts found differentially regulated in forest soils included ABC transporters and glucuronosyl transferases. These genes are involved in the removal of plant secondary metabolites and recalcitrant aromatics which can be harmful to the collembolan. Forest soils are likely to have a higher concentration of humic acids and polyphenols than agricultural soils, due to the higher plant and tree coverage. Decomposition of plant and wood material, such as lignins, by bacteria and fungi can produce residues that are toxic to collembolans when ingested (Berg et al. 2004). The forest soils were the only soils that induced this specific detoxification pathway in Folsomia candida. We also investigated the gene expression data for an interaction effect between soil-type and land-use but no such interaction was found. One of the reasons for this might have been the small number of genes that were differentially expressed between the different land uses. Thus, soil factors associated with land-use do not influence the animal’s physiology to a great extent. Several authors have emphasized the biological relevance of natural variation in gene expression (Whitehead & Crawford 2006). For example, Crawford and Oleksiak (2007) measured expression of metabolic genes in different individuals of an out-bred strain of Fundulus heteroclitus and found that up to 81% of the variation in physiological metabolism in the heart could be explained by gene expression. Heritable variation in gene expression among individuals can also contribute to genetic differences between populations and evolutionary change (Roelofs et al. 2009).
In a parthenogenetically reproducing species, such as Folsomia candida, genetic adaptation might be less dynamic but the accumulation of mutations goes faster due to the fact that they are less frequently filtered out by recombination. The 28-day reproduction test showed considerable variation in the number of juveniles among the different soils. F. candida exposed to clay soils showed a lower average reproduction (not significant), but there was no clear pattern in the number of juveniles concerning land-use. However, a significant correlation was established between the concentration of soil arsenic and F. candida fitness, despite the fact that the arsenic concentration in all soils was below the reference value for the Netherlands (29 mg/kg soil) (VROM 2000). These unexpected sub-lethal effects of arsenic might be due to toxicity of arsenic itself, but they could also be due to an interaction with phosphorus, since arsenic and phosphorus are known to compete during soil sorption and uptake by plants.


A reason for the high variation in the reproduction test might be the fact that the soils were not sterilized before exposure and thus may harbor different microbial communities. We deliberately did not sterilize the soils to keep the soil microbial community intact and to maintain soil ecosystem functions. Kaneda and Kaneko (2002) showed that F. candida growth rate is influenced by soil bacterial activity, which may lead to differential reproductive output after 28 days of exposure. Also, small invertebrates such as nematodes, mites and other collembolans were not removed. These biotic factors may have additional effects on the reproduction test over 28 days, but are considered to be less relevant in the 2-day gene expression test. Endogenous F. candida was not observed in the soils. Three natural soils showed less than 50% of F. candida reproduction when compared to reproduction in the standard soil LUFA 2.2 (soils 6, 12 and 14), despite the fact that these soils were all unpolluted. This suggests that soil characteristics alone can confound bioassays and cause effects on F. candida gene expression when the soil is compared to a standard lab soil. Such effects can also occur when testing a suspected polluted soil, so that reproduction as an endpoint for soil quality assessment is of limited value and should be supported by additional tests based on gene expression. In a canonical correlation analysis we linked gene expression to the data from the soil chemical analysis. The main soil characteristic correlated with gene expression was soil fertility. Soil fertility is mainly determined by phosphate, which was measured with four different methods (total, soluble, etc.). Three of the four methods resulted in strong correlations of phosphate with gene expression. Folsomia candida is often found in agricultural fields and prefers soils with elevated organic matter content (Potapow 2001).
The fact that gene expression, even in the short term, correlates with soil fertility indicates that habitat preference of this animal is measurable on the transcriptional level. Soil fertility did not, however, correlate with F. candida reproduction. Higher concentrations of phosphate in the soil were negatively correlated with protein synthesis, which indicates that collembolans exposed to lower soil fertility up-regulate their protein synthesis. Since the exposure was only 2 days, this means that F. candida, exposed to a (less preferred) soil of low fertility, stages a gene expression cascade to reach homeostasis, which involves increased protein synthesis. In this paper we took the first steps to establish a transcriptional NOR for the ecologically relevant test animal Folsomia candida by determining transcriptional plasticity under various natural conditions. The soils used in this study have soil types and land uses in which this collembolan is normally found. The Folsomia candida transcriptome seems to be rather plastic, which is indicative of the fact that it can tolerate a wide range of abiotic factors (e.g. soil pH, moisture content, etc.). Plasticity seems to be a common property of soil-living detritivores and plastic species may be favored when communities shift under environmental change (Berg & Ellers 2010). We have shown that plasticity comes with significant regulation of gene expression, which may involve 18.5% of the transcriptome. The next step in completing a Natural Operating Range for this invertebrate would be to vary exposure times and use multiple genotypes from natural populations to investigate response differences over time and in lineages sampled from different soil conditions.

Acknowledgements

The authors would like to thank Bart Pieterse and Eiko Kuramae for their help with soil sampling; Francois Rustenburg and Paul Eijk for assisting with the microarray experiments; Ben Nota and Thierry Janssens for help with gene expression analysis; and Janine Mariën for assistance during RNA isolation. This work was supported by a grant from the Netherlands Genomics Initiative (NGI) to the Ecogenomics consortium "Assessing the Living Soil".


References

Alexa, A., Rahnenfuhrer, J. & Lengauer, T. 2006. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics, 22, 1600-1607.
Altman, N. S. & Hua, J. 2006. Extending the loop design for two-channel microarray experiments. Genetical Research, 88, 153-163.
Bayley, M. & Holmstrup, M. 1999. Water vapor absorption in arthropods by accumulation of myoinositol and glucose. Science, 285, 1909-1911.
Benjamini, Y. & Hochberg, Y. 1995. Controlling the False Discovery Rate - a Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society Series B-Methodological, 57, 289-300.
Berg, M. & Ellers, J. 2010. Trait plasticity in species interactions: a driving force of community dynamics. Evolutionary Ecology, 24, 617-629.
Berg, M. P., Stoffer, M. & van den Heuvel, H. H. 2004. Feeding guilds in Collembola based on digestive enzymes. Pedobiologia, 48, 589-601.
Cavalieri, D., Townsend, J. P. & Hartl, D. L. 2000. Manifold anomalies in gene expression in a vineyard isolate of Saccharomyces cerevisiae revealed by DNA microarray analysis. Proceedings of the National Academy of Sciences of the United States of America, 97, 12369-12374.
Crawford, D. L. & Oleksiak, M. F. 2007. The biological importance of measuring individual variation. J Exp Biol, 210, 1613-1621.
Crouau, Y. & Cazes, L. 2003. What causes variability in the Folsomia candida reproduction test? Applied Soil Ecology, 22, 175-180.
Crouau, Y. & Moïa, C. 2006. The relative sensitivity of growth and reproduction in the springtail, Folsomia candida, exposed to xenobiotics in the laboratory: An indicator of soil toxicity. Ecotoxicology and Environmental Safety, 64, 115-121.
Edgar, R., Domrachev, M. & Lash, A. E. 2002. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Research, 30, 207-210.
Edwards, D. 2003. Non-linear normalization and background correction in one-channel cDNA microarray studies. Bioinformatics, 19, 825-833.
Fountain, M. T. & Hopkin, S. P. 2005. Folsomia candida (Collembola): A "Standard" Soil Arthropod. Annual Review of Entomology, 50, 201-222.
Gibson, G. 2002. Microarrays in ecology and evolution: a preview. Molecular Ecology, 11, 17-24.
Hattenschwiler, S., Tiunov, A. V. & Scheu, S. 2005. Biodiversity and litter decomposition in terrestrial ecosystems. Annual Review of Ecology Evolution and Systematics, 36, 191-218.
Huang, X., Suyama, K., Buchanan, J., Zhu, A. J. & Scott, M. P. 2005. A Drosophila model of the Niemann-Pick type C lysosome storage disease: dnpc1a is required for molting and sterol homeostasis. Development, 132, 5115-5124.
ISO 1999. Soil Quality. Inhibition of Reproduction of Collembola (Folsomia candida). ISO Guideline 11267. International Standardization Organization, Switzerland.
Kammenga, J. E., Herman, M. A., Ouborg, N. J., Johnson, L. & Breitling, R. 2007. Microarray challenges in ecology. Trends in Ecology & Evolution, 22, 273-279.
Kaneda, S. & Kaneko, N. 2002. Influence of soil quality on the growth of Folsomia candida (Willem) (Collembola). Pedobiologia, 46, 428-439.


Kaneda, S. & Kaneko, N. 2008. Collembolans feeding on soil affect carbon and nitrogen mineralization by their influence on microbial and nematode activities. Biology and Fertility of Soils, 44, 435-442.
Kerr, M. K., Martin, M. & Churchill, G. A. 2000. Analysis of variance for gene expression microarray data. Journal of Computational Biology, 7, 819-837.
Noël, H. L., Hopkin, S. P., Hutchinson, T. H., Williams, T. D. & Sibly, R. M. 2006. Population Growth Rate And Carrying Capacity For Springtails Folsomia candida Exposed To Ivermectin. Ecological Applications, 16, 656-665.
Nota, B., Bosse, M., Ylstra, B., van Straalen, N. M. & Roelofs, D. 2009. Transcriptomics reveals extensive inducible biotransformation in the soil-dwelling invertebrate Folsomia candida exposed to phenanthrene. BMC Genomics, 10.
Nota, B., Timmermans, M., Franken, O., Montagne-Wajer, K., Marien, J., De Boer, M. E., De Boer, T. E., Ylstra, B., Van Straalen, N. M. & Roelofs, D. 2008. Gene Expression Analysis of Collembola in Cadmium Containing Soil. Environmental Science & Technology, 42, 8152-8157.
Odum, E. P., Finn, J. T. & Franz, E. H. 1979. Perturbation theory and the subsidy-stress gradient. Bioscience, 29, 349-352.
Oleksiak, M. F., Churchill, G. A. & Crawford, D. L. 2002. Variation in gene expression within and among natural populations. Nature Genetics, 32, 261-266.
Potapow, M. 2001. Synopses on Palaearctic Collembola. Staatliches Museum für Naturkunde Görlitz.
Pritchard, C. C., Hsu, L., Delrow, J. & Nelson, P. S. 2001. Project normal: Defining normal variance in mouse gene expression. Proceedings of the National Academy of Sciences, 98, 13266-13271.
Roelofs, D., Aarts, M., Schat, H. & Van Straalen, N. 2008. Functional ecological genomics to demonstrate general and specific responses to abiotic stress. Functional Ecology, 22, 8-18.
Roelofs, D., Janssens, T. K. S., Timmermans, M., Nota, B., Marien, J., Bochdanovits, Z., Ylstra, B. & Van Straalen, N. M. 2009. Adaptive differences in gene expression associated with heavy metal tolerance in the soil arthropod Orchesella cincta. Molecular Ecology, 18, 3227-3239.
Saeed, A. I., Bhagabati, N. K., Braisted, J. C., Liang, W., Sharov, V., Howe, E. A., Li, J., Thiagarajan, M., White, J. A. & Quackenbush, J. 2006. TM4 Microarray Software Suite. Methods in Enzymology. Academic Press.
Schena, M., Shalon, D., Davis, R. W. & Brown, P. O. 1995. Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA Microarray. Science, 270, 467-470.
Smit, C. E. & Van Gestel, C. A. M. 1998. Effects of soil type, prepercolation, and ageing on bioaccumulation and toxicity of zinc for the springtail Folsomia candida. Environmental Toxicology and Chemistry, 17, 1132-1141.
Smyth, G. K. 2004. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol, 3, Article3.
Timmermans, M., de Boer, M., Nota, B., de Boer, T., Marien, J., Klein-Lankhorst, R., van Straalen, N. & Roelofs, D. 2007. Collembase: a repository for springtail genomics and soil quality assessment. BMC Genomics, 8, 341.
Timmermans, M. J. T. N., Roelofs, D., Nota, B., Ylstra, B. & Holmstrup, M. 2009. Sugar sweet springtails: on the transcriptional response of Folsomia candida (Collembola) to desiccation stress. Insect Mol Biol, 18, 737-746.
Van Straalen, N. M. & Roelofs, D. 2008. Genomics technology for assessing soil pollution. Journal of Biology, 7, 19.
VROM 2000. Streefwaarden en interventiewaarden bodemsanering. Staatscourant, 39.


Whitehead, A. & Crawford, D. L. 2006. Variation within and among species in gene expression: raw material for evolution. Molecular Ecology, 15, 1197-1211.


Chapter 5

The effects of aged copper pollution on Folsomia candida physiology

Tjalf E. de Boer, Erwin Temminghof, Nico M. van Straalen, Bauke Ylstra and Dick Roelofs


Abstract

Heavy metal pollution of the soil is a long-standing problem in many European countries. Heavy metals have been released into the environment by the mining and smelting of metals since the Bronze Age. With industrialization, the combustion of fossil fuels and its associated emissions have turned metal pollution into a more widespread problem. Heavy metal pollution can have adverse effects on animals and plants by causing toxic effects such as oxidative stress. One such metal is copper, which is an essential trace element present in the catalytic domain of several enzymes but is also highly toxic at high concentrations. In this paper we studied the effect of aged copper pollution on reproduction, growth and gene expression in the collembolan Folsomia candida. The soil copper content had no effect on reproduction and growth, but there was an obvious impact on gene expression. At the highest copper concentrations, 68 genes were differentially expressed, many of which are part of the vesicle-mediated secretion system, which is used to eliminate harmful substances from cells. Since the transcriptional response was limited to a relatively small number of genes, we concluded that the bioavailability of the (aged) copper pollution is low. The available copper content in the soil was not high enough to induce oxidative stress, which is often found in copper toxicity. The gene expression analysis, however, was able to distinguish between the different soil copper concentrations, making it a valuable extension of the standard ISO test, in which survival and reproduction of F. candida are used to determine the impact of soil pollution or chemicals.


Introduction

Heavy metal pollution is a worldwide problem that affects humans, animals and plants. In Europe, soil pollution is characterized and monitored by the European Environment Agency (EEA), which was established by the European Union. In its 2000 report the EEA states that, even though emissions of heavy metals such as lead and cadmium have decreased significantly since 1990, a large number of sites are still polluted with heavy metals (EEA, 2000). In order to determine the environmental risks associated with specific sites, the effects of metals and their confounding factors need to be established. Metal toxicity in soil is heavily dependent on the physico-chemical properties of the soil. Criel et al. (2008) spiked multiple European field soils, all with different soil characteristics, with copper and determined the effect on two soil invertebrates: Eisenia fetida and Folsomia candida. The concentrations causing a 50% effect on reproduction (EC50) varied widely over the different soils in both species. Their conclusion was that variability in copper toxicity could best be explained by the cation exchange capacity (CEC) of the soil. One factor that influences the CEC is soil pH, which plays a major role in metal bio-availability. A lower soil pH means that there are more free protons, which compete with metal ions for binding to soil particles. Aging of pollutants also plays a role in soil metal toxicity. A large part of the metal pollution in Europe is historical and sometimes dates back as far as 2000 years (Nriagu, 1996). During aging, a fraction of the metal pollution leaches out of the soil into the ground water, but the part that remains usually becomes more firmly bound to soil particles and is therefore less bio-available. In this study we focus on the effects of aged copper pollution at multiple soil pH values and on the impact of copper on reproduction, growth and gene expression in the collembolan Folsomia candida.
Copper is an essential trace element required for the catalytic activity of a number of enzymes that catalyze oxidation-reduction reactions (Llanos and Mercer, 2004). Because the free copper ion is extremely toxic even at low concentrations, copper homeostasis and binding are tightly regulated at the cellular level (Lutsenko and Petris, 2003). Copper and other heavy metals can enter collembolans such as Folsomia candida in two ways: via the gut, when the animal eats soil particles or the micro-organisms that inhabit the soil, or via an organ called the ventral tube. The ventral tube is located on the underside of the animal and is used to extract water from the soil (Fountain and Hopkin, 2005); the rest of the cuticle is rather hydrophobic and probably not important for copper uptake. Not much is known about the mechanism of copper toxicity in F. candida at the cellular level. However, copper toxicity has been studied in other organisms such as baker's yeast and C. elegans. In yeasts such as Saccharomyces cerevisiae and Candida albicans, copper enters the cell via two copper transport proteins located in the cell membrane (Dancis et al., 1994, Marvin et al., 2003). These copper transporting proteins are encoded by the CTR1 and CTR3 genes. The expression of these genes is limiting for copper uptake, as indicated by the finding that yeast mutants over-expressing these genes contain higher copper levels in the cytosol. The CTR1 gene is also present in humans and fruit flies, where the protein transports copper from the gut lumen into the gut epithelial cells. This indicates a high level of conservation across the animal kingdom for this copper transport system (Hua et al., 2010). Once inside the cell, copper is transported further to various locations by multiple chaperone and transport proteins such as ATOX1, COX17 and CCS/SOD (Culotta et al., 1997, Glerum et al., 1996). Since free copper ions are tightly regulated, most of the knowledge on copper toxicity in higher animals and humans comes from disorders caused by mutations in copper regulating genes (e.g. Wilson's disease, caused by a mutation in a P-type ATPase that transports copper into the bile). Copper toxicity is conferred by its chemical redox potential. This redox potential gives copper ions the ability to catalyze the formation of reactive radicals such as the hydroxyl radical and other reactive oxygen species, which can cause oxidative stress (Gaetke and Chow, 2003). Oxidative stress occurs when free radicals react with cellular molecules such as fatty acids, proteins or DNA. These macromolecules are damaged by the reaction, which has an adverse effect on cellular functioning and ultimately on fitness.
When free radicals react with DNA, mutations can form, which might lead to disease, neoplasia or premature aging (Imlay and Linn, 1988).

In the second half of the 20th century, there were indications that the high copper content then present in fertilizers could have adverse effects on crops, on the animals feeding on them and on humans. To investigate these effects, an experimental field in Bennekom, the Netherlands, was spiked with four different copper concentrations and subjected to four different pH treatments in a full factorial design with eight field replicates. Normal land use practice with a three-year crop rotation was followed. The spike-in was performed in 1980. A number of previous studies have been performed on samples from this field. Tobor-Kaplon et al., for example, investigated the microbial community and soil respiration at the two extreme copper concentrations and the two extreme soil pH values in combination with a secondary stress. They found that microorganisms in soil samples with a low pH and high copper concentrations showed less resilience to the secondary stress (Tobor-Kaplon et al., 2005). Kuenen et al. looked at community composition and measured the biomass of all functional groups. They found no effect of either soil pH or copper concentration on community structure, but communities in the unstressed samples seemed to be more constant over time (Kuenen, 2009). In this study we sampled all the copper/pH combinations present at the field site. Four field replicates of each combination were sampled to obtain a balanced design. As a test animal we used Folsomia candida, because this collembolan is part of an ISO certified soil quality test and has already been used as a model organism to test the impact of soil pollutants, including heavy metals, on gene expression (ISO, 1999, Nota et al., 2010). We measured gene expression, reproduction and growth of collembolans exposed to these soils in order to investigate the effect that copper and pH have on this animal.

Materials & Methods

Soil sampling

The test site where the soil samples were taken is located in Bennekom, the Netherlands, and is locally known as the “Bovenbuurtse site” (51° 59’ 34.89” N, 5° 40’ 15.85” E). The site was spiked with different copper concentrations in 1980 by application of CuSO4 (Lexmond, 1980). A single field was divided into 128 adjoining plots measuring 6 by 11 meters each. The plots received four copper concentrations and four pH treatments in a full factorial, randomized block design with eight field replicates per copper/pH combination: the field was divided into eight blocks, and each block contained one plot of each of the 16 combinations. The original copper applications were 0, 250, 500 and 750 kg/ha. Since the spiking in 1980 no extra copper has been added, but the pH has been adjusted every five years by liming or by application of sulphur powder. Using copper concentrations measured in 2001, we calculated the average copper concentration per copper/pH combination and selected the four out of eight field replicates that were closest to this average. In total, 64 plots were sampled (four per copper/pH combination). All plots were sampled on a single day. The sample taken from each plot consisted of three subsamples taken in the centre of the plot, which were pooled and mixed on-site. A small subsample of the mixed sample was used for copper measurements, while the bulk of the sample was stored at 5 °C. Very wet soil samples were dried at room temperature for a week, and all samples were sieved over a 4 mm grid to remove stones and plant material.
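The replicate selection described above (keeping the four of eight field replicates whose 2001 copper concentration lies closest to the mean of their copper/pH combination) can be sketched in Python as follows. This is a minimal illustration only: the function name, the plot labels and the concentrations are hypothetical and are not taken from the field data.

```python
from statistics import mean


def select_closest_to_mean(replicates, n_select=4):
    """Pick the n_select plots whose value is closest to the group mean.

    replicates: dict mapping a (hypothetical) plot label to its copper
    concentration for ONE copper/pH combination.
    """
    group_mean = mean(replicates.values())
    # Rank plots by absolute distance to the mean; the label breaks ties.
    ranked = sorted(
        replicates,
        key=lambda plot: (abs(replicates[plot] - group_mean), plot),
    )
    return sorted(ranked[:n_select])


# Hypothetical concentrations for the eight replicates of one combination.
example_group = {
    "A1": 40.0, "B3": 55.0, "C2": 61.0, "D4": 48.0,
    "E1": 90.0, "F2": 52.0, "G3": 30.0, "H4": 57.0,
}
selected_plots = select_closest_to_mean(example_group)
print(selected_plots)
```

Applying the function to each of the 16 copper/pH combinations in turn would give the 64 plots of a balanced design.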

Copper measurements

The Cu content in the soil samples was determined by 0.43 M HNO3 extraction (solid/solution ratio (SSR) 0.1 kg/L). This method is used in other studies as a measure of the amount of reactive metals in soil samples. The Cu concentration was measured by ICP-AES (IRIS). Bioavailable Cu, soil pH and DOC concentrations in the soil samples were determined by 0.01 M CaCl2 extraction. Cu was measured by HR-ICP-MS and DOC was measured with a TOC analyzer (Skalar SK12).

Folsomia candida culture and exposure

Folsomia candida (VU Berlin strain) was maintained in PVC containers with a plaster of Paris base containing 10% charcoal. The animals were fed baker's yeast (Dr. Oetker) ad libitum. All collembolan cultures were kept at 20 °C in a 12 hour dark/light cycle at 75% relative air humidity. Test animals were synchronized beforehand according to the ISO guideline (ISO 11267). To obtain age-synchronized cultures, adult collembolans were transferred to fresh culture containers, where they were allowed to lay eggs for two days. After two days the adults were removed, and their offspring were used for the exposure at an age of 23 days. Exposures were performed in 100 ml glass jars. The soil samples were moistened to 50% of their water holding capacity (WHC) 48 hours before exposure and left to equilibrate. For each copper/pH condition four replicate field samples were used. Each soil replicate was also treated as a biological replicate; that is, the animals from a single glass jar were pooled for RNA extraction. Each jar contained 25 g of wet soil and 30 collembolans of 23 days old, randomly selected from the synchronized stock. After 2 days of exposure, the animals were removed from the soil by flotation on water and snap-frozen in liquid nitrogen for RNA extraction.

For the 28-day reproduction and growth test, synchronized collembolans of 10 days old were used. Per replicate, ten test animals were exposed to 25 g of wet soil (at 50% WHC) for 28 days. Two replicates were used per biological field replicate. Test containers were opened twice a week for aeration, the animals were fed once a week, and moisture levels were adjusted halfway through the exposure. After 28 days the containers were filled with 100 ml water and emptied into a standard beaker. After gentle stirring of the soil-water suspension, digital photographs were taken of the floating collembolans. The number of juveniles and the size distribution of the animals were determined with CellD software (Olympus).

RNA extraction and microarray hybridization

Total RNA was extracted with the SV Total RNA kit from Promega according to the manufacturer’s instructions. The RNA isolation included a DNAse step to remove genomic DNA. RNA integrity and concentrations were measured on a Bioanalyzer (Agilent Technologies) and a Nanodrop spectrophotometer (Fisher Scientific). Per sample, 500 ng of total RNA was used as input for amplification and labeling with the Low-Input Fluorescent Linear Amplification Kit (Agilent Technologies), according to the manufacturer's guidelines. One modification to the standard protocol was included: the cRNA transcription reactions were performed in half volume. Labeled cRNA was hybridized overnight on 8×15K custom Agilent microarrays, washed and scanned, all according to the manufacturer’s instructions. The custom microarray (Nota et al., 2009) contains 5069 unique F. candida gene fragments in triplicate and is based on the Folsomia candida EST sequence database Collembase (Timmermans et al., 2007) (www.collembase.org).

Microarray experimental design and analysis

Microarray hybridization was done according to an interwoven loop design (Altman and Hua, 2006). This type of design maximizes statistical power while using the smallest number of microarrays. Each of the 16 copper/pH combinations was replicated four times, and RNA samples from the four replicates were dye-swapped during labeling (labeled twice with Cy3 and twice with Cy5). The loop design ensured that replicates of one condition were never hybridized to the same replicates of another condition more than once. This was to make sure that any interaction between two conditions was kept to a minimum. For this experiment we used the 8×15K Agilent custom microarray platform, which contains eight microarrays on a single microscope slide. Since four slides were used in total (32 arrays), we divided the four replicates of each copper/pH combination evenly over the four slides to avoid confounding with hybridization or washing effects. Microarray intensities were extracted with Feature Extraction software (version 9.5.1, Agilent Technologies). Raw intensities were background subtracted (Edwards method, offset of 30) and normalized within arrays (LOESS) and between arrays (Aquantile) using the Limma package in the R environment (Edwards, 2003, Smyth, 2004). The dataset was divided into three groups according to the bio-available copper concentrations (low, medium and high), and the lowest concentration samples were set as a reference. A linear model was used in the Limma package to determine genes that were significantly differentially expressed between the medium and low and between the high and low copper concentration samples. Calculated P values were adjusted for multiple testing according to the method developed by Benjamini and Hochberg (Benjamini and Hochberg, 1995). A similar method was used for the pH calculation, but instead of dividing the dataset in three it was divided in four different pH values, with the highest pH value set as reference.
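The Benjamini and Hochberg adjustment mentioned above can be illustrated with a short Python sketch. The analysis in this chapter was run with the Limma package in R; the function below is only a stand-alone illustration of the step-up procedure, and its name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values
    # never increase when the raw P value decreases (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for five genes.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

In this made-up example the third and fourth genes have raw P values below 0.05 but are no longer significant after adjustment, which is the behaviour the correction is meant to provide when thousands of genes are tested at once.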

Results

Copper measurements

Total and CaCl2 extractable copper concentrations were measured for each soil sample. The CaCl2 extractable copper concentration represents the biologically available concentration in the soil and thus counts as the copper load which the test animals experience. Metal bioavailability is influenced by soil pH, which means that samples with similar total copper concentrations can differ in bio-available copper concentration and have to be considered as different. Since total copper concentration is not directly influenced by pH, we expected the field replicates to have similar total copper concentrations, yielding four distinct copper concentration plateaus across all samples. Figure 1A depicts soil total copper concentrations ordered from low to high, and it clearly shows that the original four levels of copper concentration have disappeared. There are several reasons for this. The first is ploughing: every year the field is ploughed in one run, which means that individual plots have become mixed over the years. Another reason is copper leaching. The field soil composition is heterogeneous, and differences in soil composition lead to differences in copper leaching rate. Soil pH also influences the leaching rate, since copper leaches out more easily at low pH. We compared the total copper concentrations to the bio-available copper concentrations to investigate how much of the soil copper is bio-available. Between 0.55 and 6.48% of the total soil copper was available, with an average of 2.15%. Because the bio-available copper fraction represents the copper load which is experienced by the collembolans, we decided to focus on these concentrations for the gene expression analysis. This also means that copper can be analyzed independently from the soil pH.

Figure 1: A: Total copper concentration in the soil samples ordered from low to high. No trace of the four original copper concentrations is left. B: The CaCl2 extractable copper concentration in the soil. The bars below the graph represent the samples chosen for the gene expression analysis (light grey: low, dark grey: medium and black: high).

Figure 1B shows bio-available copper concentrations ordered from low to high. Based on the CaCl2 extracted copper concentrations we divided the soils into three groups: low, medium and high. Data from soils with intermediate concentrations were removed, leaving expression profiles of 41 soil samples. This set of 41 samples was also divided into three groups according to the soil pH. See Table 1 for the average copper concentrations of the three copper groups and the average pH of the three pH groups. Linear regression of soil pH against copper concentration, however, showed that these two factors were heavily correlated (P < 0.0001).
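The bio-available fraction reported above is the CaCl2 extractable copper expressed as a percentage of total copper. A minimal Python sketch of that calculation is given below. It assumes that total copper is expressed in mg/kg and CaCl2 extractable copper in µg/kg; the function name and the example values are hypothetical and were chosen only so that the result equals the reported average of 2.15%.

```python
def bioavailable_percent(total_cu_mg_per_kg, cacl2_cu_ug_per_kg):
    """CaCl2 extractable copper as a percentage of total copper.

    Assumed units: total copper in mg/kg, CaCl2 extractable copper
    in ug/kg (converted to mg/kg before taking the ratio).
    """
    if total_cu_mg_per_kg <= 0:
        raise ValueError("total copper must be positive")
    cacl2_mg_per_kg = cacl2_cu_ug_per_kg / 1000.0
    return 100.0 * cacl2_mg_per_kg / total_cu_mg_per_kg


# Hypothetical sample: 100 mg/kg total copper, 2150 ug/kg extractable.
example_percent = bioavailable_percent(100.0, 2150.0)
print(round(example_percent, 2))
```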

Table 1: The average copper concentration in each of the classes (low, medium and high) used for the gene expression analysis. The same set of samples was divided according to soil pH. The average pH in this second classification is given in the lower rows.

Soil property    Class    Average
Copper (µg/kg)   low      328.9
Copper (µg/kg)   medium   1474.2
Copper (µg/kg)   high     3761.3
Soil pH          low      4.1
Soil pH          medium   4.6
Soil pH          high     5.0

Gene expression analysis

To investigate whether the copper and pH groups could be analyzed separately, we determined whether there was an interaction in gene expression between the pH and copper groups. For this analysis a microarray ANOVA model was used. After adjustment for multiple testing (BH step-up procedure, P < 0.05), no genes showed a significant interaction effect between the soil copper concentration and the soil pH. Because there was no interaction effect, we analyzed the copper and pH data separately. A linear model was used to investigate gene expression differences between the three copper concentrations. We designated the low concentration samples as a reference and compared the medium and high concentration samples directly to the reference. In this analysis 41 of the 64 samples were used. The samples that were removed from the analysis did not fit into one of the three categories. For the high copper concentration we found 68 genes to be differentially expressed (17 down regulated and 51 up regulated), and for the medium concentration 2 genes (both up regulated). Of the two genes significant in the medium copper concentration group, one gene (Fcc02218, polyketide synthase) was also found in the high copper concentration group. The gene was up regulated in both groups. A Gene Ontology analysis was performed on the genes that were significantly differentially expressed in the high copper concentration samples. The GO terms significant in Biological Process (BP) were mostly involved in general secretion and vesicle mediated secretion (see Figure 2), suggesting that copper is already being secreted from the animal after two days of exposure. In total, seven GO terms were found to be significant and contained two or more genes in the high copper concentrations (GO terms containing only one gene were omitted). See Table 2 for all significant GO terms.

Figure 2: Hierarchical GO tree for the copper analysis presenting the different levels for secretion and vesicle mediated secretion. Darker colors mean more significance. Square boxes have a P-value lower than 0.05 (Fisher exact test).

A linear model was also used to determine the impact of soil pH on gene expression. The highest pH was set as a reference because a pH between 5.0 and 5.5 seems to be the optimum for this animal according to previously published studies on this collembolan in combination with soil pH (van Straalen and Verhoef, 1997, de Boer et al., 2010). The gene expression analysis showed that only the low pH group was statistically different from the reference (high) group. Interestingly, the effect of low pH was greater than the copper effect: 221 genes were significantly differentially regulated (92 down regulated and 129 up regulated genes). Of these genes, 55 were differentially regulated in both the copper and the pH samples. Gene Ontology analysis of the significant genes found in the low pH samples showed 19 significant GO terms in Biological Process, but, interestingly, none of these terms were found in the significant list of the copper analysis. This suggests that the effects caused by pH differ from those caused by copper. GO terms found in the significant list were mainly involved in nucleotide and protein metabolism and in the (acute) inflammatory response; see Table 2 for all terms. The V-type ATPase, which we implicated to be influenced by pH differences in a previous study, was significantly up regulated in the low pH samples (fold change of 0.48, adjusted P = 0.028).
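The Fisher exact test that underlies the GO analysis can be sketched as a one-sided hypergeometric tail probability. The Python example below is a simplified, unweighted illustration; the analysis in this chapter used a weighted Fisher exact test, so this sketch is not expected to reproduce the P-values in Table 2. The function name, the parameter names and the toy numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(universe, annotated, selected, overlap):
    """One-sided Fisher exact test for over-representation.

    universe:  number of genes with any GO annotation
    annotated: genes in the universe carrying the GO term of interest
    selected:  significant genes drawn from the universe
    overlap:   significant genes that carry the GO term
    Returns P(X >= overlap) under the hypergeometric distribution.
    """
    upper = min(annotated, selected)
    total_draws = comb(universe, selected)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total_draws


# Toy example: 10 genes, 4 carry the term, 3 are significant, 2 overlap.
toy_p = enrichment_p_value(universe=10, annotated=4, selected=3, overlap=2)
print(toy_p)
```

In terms of Table 2, `annotated` corresponds to the "Annotated" column and `overlap` to the "Significant" column; the sizes of the gene universe and of the annotated significant gene list would have to be taken from the array annotation.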

Reproduction and growth test A four week experiment was performed to investigate the impact of soil copper on reproduction and growth. The soil copper content (total copper or bio-available copper) did not have any impact on reproduction. Growth was determined by calculating the average collembolan size in each sample which was compared to the total copper concentration, CaCl2 extractable copper concentration and pH. None of the above mentioned factors influenced F. candida growth. We also performed a copper spike-in experiment to determine the EC50 on reproduction in F. candida to compare it to the field soil copper concentrations. We estimated

100

Table 2: significant GO terms found between low and high copper concentration (copper) and low and high soil pH (pH). a: the total number of genes present in this GO term. b:the number of genes found for this GO term in the significant gene list and c: P value calculated by the weighted Fisher exact test.

GO term Description (Biological Process) Annotated a Significant b P-value c Copper GO:0045045 secretory pathway 64 6 0.0050 GO:0032940 secretion by cell 68 6 0.0067 GO:0046903 secretion 76 6 0.0116 GO:0045055 regulated secretory pathway 22 3 0.0176 GO:0016192 vesicle-mediated transport 110 7 0.0198 GO:0006091 generation of precursor metabolites and energy 65 5 0.0236 GO:0006635 fatty acid beta-oxidation 10 2 0.0258 pH GO:0051260 protein homooligomerization 12 4 0.0045 GO:0006308 DNA catabolic process 8 3 0.0100 GO:0006085 acetyl-CoA biosynthetic process 3 2 0.0107 GO:0006297 nucleotide-excision repair, DNA gap filling 3 2 0.0107 GO:0015671 oxygen transport 3 2 0.0107 GO:0006044 N-acetylglucosamine metabolic process 11 3 0.0203 GO:0006200 ATP catabolic process 4 2 0.0206 GO:0006221 pyrimidine nucleotide biosynthetic process 4 2 0.0206 GO:0007004 telomere maintenance via telomerase 4 2 0.0206 GO:0000302 response to reactive oxygen species 18 4 0.0208 GO:0001737 establishment of imaginal disc-derived wing 5 2 0.0329 hair orientation GO:0022611 dormancy process 5 2 0.0329 GO:0042067 establishment of ommatidial polarity 5 2 0.0329 GO:0008406 gonad development 13 3 0.0407 GO:0002526 acute inflammatory response 7 3 0.0465 GO:0007173 Epidermal growth factor receptor signaling 6 2 0.0474 pathway GO:0016319 mushroom body development 6 2 0.0474 GO:0015986 ATP synthesis coupled proton transport 14 3 0.0496

the EC50 on reproduction to be approximately 100 mg/kg (nominal concentration), which is equivalent to the highest total field copper concentrations. However, spiked-in copper, even though it was aged for two weeks, is far more bio-available (up to 30%) than the copper in the field soils. This might explain why the copper present in the field soils did not elicit an effect on F. candida reproduction or growth after 28 days of exposure.
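The contrast between field and spiked copper can be made concrete with a small back-of-the-envelope calculation. The sketch below is illustrative only: it applies the average bio-available fraction of the field soils (2.15%, as reported in the Discussion) and the upper bound for spiked-in copper (30%) to the same total concentration of 100 mg/kg. Applying an average fraction to a single soil is an assumption, since availability varies strongly with soil pH, and the function and variable names are hypothetical.

```python
def available_copper(total_mg_per_kg, available_fraction):
    """Return the bio-available copper concentration in mg/kg."""
    return total_mg_per_kg * available_fraction


TOTAL_CU = 100.0         # mg/kg: approximate EC50 and highest field total
FIELD_FRACTION = 0.0215  # average available fraction in the field soils
SPIKED_FRACTION = 0.30   # upper bound reported for spiked-in copper

field = available_copper(TOTAL_CU, FIELD_FRACTION)
spiked = available_copper(TOTAL_CU, SPIKED_FRACTION)

print(f"field soil : {field:.2f} mg/kg available")
print(f"spiked soil: {spiked:.2f} mg/kg available (upper bound)")
print(f"ratio      : {spiked / field:.1f}")
```

Under these assumptions, animals at the EC50 in the spike-in test may be exposed to roughly an order of magnitude more available copper than animals in a field soil with the same total concentration.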

Discussion

In this study we investigated the impact of exposure to an aged, copper-polluted soil on the physiology of Folsomia candida, a soil-dwelling springtail. Since the original copper spike-in in 1981, a large part of the copper has leached out of the soil and the remaining fraction is largely bound to soil particles and thus biologically unavailable. On average, only 2.15% of the total copper is biologically available. The four different pH treatments have a large impact on available copper concentrations. Across the four total copper concentrations, on average 4.9 times as much copper was available at the lowest soil pH as at the highest soil pH. Lower soil pH also affects copper leaching rates, since available copper is leached out more easily. This means that over the years more copper has leached out of the lower pH soils than out of the higher pH soils. Ploughing of the field also has an impact on total and bio-available copper concentrations. The test field is divided into adjoining plots which contain the different pH/copper treatments. The field is subjected to a crop rotation scheme, which means it is ploughed every year in one single run. The ploughing, in combination with the random plot design, causes plots to be mixed. Both the differences in leaching rates and the yearly ploughing have leveled out the (bio-available) copper concentrations over the years, causing the extremes to converge. Due to the large differences in bio-available copper, caused by differences in soil pH, among samples with the same total copper concentration, it was impossible to compare the four original copper concentrations to each other in the gene expression analysis. Instead, the dataset was divided into three copper levels: low, medium and high. This classification of copper levels was based on our own copper measurements, which are the most recent. 
The samples in each of these levels had more similar bio-available copper concentrations and therefore it was the logical choice to compare these levels to each other in order to obtain biologically relevant results. The lowest copper concentration level was set as a reference to

compare the other two levels to. Even though this reference level also contains copper and was therefore not a true control, we chose to set it as a reference because there was no true control without copper among the field soil samples, due to the effects described above. A standard soil such as LUFA2.2 or OECD could have been used; however, since this study is based on testing field soils, we decided not to include one. Also, standard soils often contain very little copper, which could cause adverse effects on the test animals since copper is an essential trace element. It is often observed that copper, in low concentrations, has a stimulatory effect on reproduction of F. candida. This effect is called hormesis (Calabrese and Baldwin, 2000) and we felt that a possible hormesis effect would overly complicate the gene expression analysis. In the gene expression analysis the medium copper concentration samples did not show a large effect. Two genes were significantly differentially expressed compared to the low concentration reference, one of which was also found among the genes differentially expressed in the high copper concentration samples. The high copper concentration samples showed a larger effect compared to the reference, with 68 genes significantly differentially expressed. This effect remains rather minor, however, when compared to other studies that investigated the effect of metals and other pollutants on gene expression in F. candida (Nota et al., 2008). Gene Ontology analysis of the genes differentially expressed in the high copper concentration samples clearly showed overexpression of vesicle-mediated excretion. Little is known about copper excretion in this collembolan or about the cellular components involved. However, in mammalian hepatic cells copper is excreted by vesicles formed at the Golgi apparatus. 
Copper ions are transported to the Golgi apparatus by P-type ATPase (ATP7B) proteins and caeruloplasmin, which is a copper binding protein, similar to metallothionein (Wijmenga and Klomp, 2004). The copper is then excreted in vesicles. ATP7B was not up-regulated and caeruloplasmin is not found in the Collembase database, but vesicle-mediated excretion was found to be significant. Since this is a pathway that is used to excrete metals such as copper, it indicates that the copper is taken up by the animal, metabolized and excreted again. GO analysis did not show the oxidative stress effect that is typical for copper toxicity. BLAST information on individual genes also did not show a typical oxidative stress signature. Only one gene involved in oxidative stress, an HSP70 (Fcc04630), was slightly up-regulated in the high copper concentration samples. In general, all differentially regulated genes showed only low fold changes, ranging from -1.10 to 1.05 (log2). This minor effect is probably due to the low soil copper concentration in combination with the ease with which copper is metabolized and excreted.
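Because the fold changes above are reported on a log2 scale, it can help to convert them back to linear expression ratios. The following minimal sketch does so for the two extremes quoted in the text; the function name is hypothetical.

```python
def log2_to_fold(log2_fold_change):
    """Convert a log2 fold change to a linear expression ratio."""
    return 2.0 ** log2_fold_change


for log2_fc in (-1.10, 1.05):
    ratio = log2_to_fold(log2_fc)
    print(f"log2 FC {log2_fc:+.2f} -> ratio {ratio:.2f}")
```

Both extremes correspond to roughly a twofold change, down or up, relative to the low copper reference.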


The same set of samples used to obtain the three copper levels was also used to investigate the effects of differences in soil pH. According to the soil pH, the dataset was divided into three pH levels: low, medium and high. An effect of pH on gene expression was only observed when the low pH was compared to the high pH. This effect was greater than the effect of soil copper, and 55 of the 221 significant genes were shared with the significant gene set from the low/high copper comparison. Since soil pH and available copper concentration were highly correlated in this set of 41 samples, it seemed that pH effects could not be separated from copper effects. The Gene Ontology analysis, however, showed different gene expression responses to the two treatments. This shows that data reduction methods such as GO analysis can help to identify the specific processes that respond to different treatments. The difference in response to copper and pH, where soil pH showed a greater effect than the copper treatment, is consistent with other studies performed on this soil, which also found a greater effect of pH (Tobor-Kaplon et al., 2005; Kuenen, 2009). The impact of these copper-containing soils on Folsomia candida reproduction and growth was also investigated in order to link gene expression differences to a physiological effect. The copper pollution in these soils, however, had no impact on either reproduction or growth in this collembolan. Survival is also often used as an endpoint to measure soil quality; here, however, we did not measure survival because there was no impact on reproduction, which is considered to be a more sensitive endpoint. These results are consistent with earlier findings by Bruus Pedersen et al. (1997), who also measured reproduction and growth of F. candida exposed to this soil and did not find any reduction in reproduction with increasing copper concentrations. 
The copper concentration and availability in this soil are not high enough to cause an effect on either reproduction or growth. Survival and reproduction of this collembolan are part of an ISO certified soil quality test and are, together with growth and avoidance behavior, measured as endpoints to test the impact of chemicals and soil pollutants (ISO, 1999). This test, however, comes with a number of drawbacks. For example, there is often high variability among the replicates, yielding high uncertainty in the calculated effect concentrations. Also, no mode of action is determined, and with chemical mixtures it is difficult to measure which component contributes in what manner to the effect. By measuring the expression of multiple genes as endpoints in combination with survival and reproduction, a mode of action of the chemical or pollutant concerned can be determined. This might lead to a better understanding of how chemicals and pollutants influence physiology, and in some cases may predict impact on future generations if, for example, genes involved in reproduction or genome integrity are affected. In this

study we show that measuring gene expression as an endpoint in soil quality testing can be more sensitive than endpoints such as survival or reproduction. Low concentration or aged pollution with essential elements such as copper can be hard to detect with bio-assays that rely on endpoints such as reproduction, because the uptake, metabolism and elimination of these elements are highly regulated. Chronic exposure to low concentration contamination, however, might have effects over multiple generations, and therefore low concentration pollution cannot be set aside as unimportant. Here we show that gene expression analysis is sensitive enough to detect effects caused by low concentration pollutants.

Acknowledgments

The authors would like to thank Ben Nota for the copper spike-in data, Wilfred Roling and Neslihan Tas for their help with soil sampling, Francois Rustenburg and Paul Eijk for technical support with the microarray experiments and Thierry Janssens for help with the gene expression analysis. This work was supported by a grant from the Netherlands Genomics Initiative to the Ecogenomics consortium "Assessing the Living Soil".


References

Altman, N. S. & Hua, J. 2006. Extending the loop design for two-channel microarray experiments. Genetical Research, 88, 153-163.
Benjamini, Y. & Hochberg, Y. 1995. Controlling the False Discovery Rate - a Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society Series B-Methodological, 57, 289-300.
Calabrese, E. J. & Baldwin, L. A. 2000. Chemical hormesis: its historical foundations as a biological hypothesis. Human and Experimental Toxicology, 19, 2-31.
Criel, P., Lock, K., Van Eeckhout, H., Oorts, K., Smolders, E. & Janssen, C. R. 2008. Influence of soil properties on copper toxicity for two soil invertebrates. Environmental Toxicology and Chemistry, 27, 1748-1755.
Culotta, V. C., Klomp, L. W. J., Strain, J., Casareno, R. L. B., Krems, B. & Gitlin, J. D. 1997. The Copper Chaperone for Superoxide Dismutase. Journal of Biological Chemistry, 272, 23469-23472.
Dancis, A., Haile, D., Yuan, D. S. & Klausner, R. D. 1994. The Saccharomyces cerevisiae copper transport protein (Ctr1p). Biochemical characterization, regulation by copper, and physiologic role in copper uptake. Journal of Biological Chemistry, 269, 25660-25667.
De Boer, T. E., Holmstrup, M., Van Straalen, N. M. & Roelofs, D. 2010. The effect of soil pH and temperature on Folsomia candida transcriptional regulation. Journal of Insect Physiology, 56, 350-355.
Edwards, D. 2003. Non-linear normalization and background correction in one-channel cDNA microarray studies. Bioinformatics, 19, 825-833.
EEA 2000. Management of contaminated sites in Western Europe.
Fountain, M. T. & Hopkin, S. P. 2005. Folsomia candida (Collembola): A "Standard" Soil Arthropod. Annual Review of Entomology, 50, 201-222.
Gaetke, L. M. & Chow, C. K. 2003. Copper toxicity, oxidative stress, and antioxidant nutrients. Toxicology, 189, 147-163.
Glerum, D. M., Shtanko, A. & Tzagoloff, A. 1996. Characterization of COX17, a Yeast Gene Involved in Copper Metabolism and Assembly of Cytochrome Oxidase. Journal of Biological Chemistry, 271, 14504-14509.
Hua, H., Georgiev, O., Schaffner, W. & Steiger, D. 2010. Human copper transporter Ctr1 is functional in Drosophila, revealing a high degree of conservation between mammals and insects. Journal of Biological Inorganic Chemistry, 15, 107-113.
Imlay, J. & Linn, S. 1988. DNA damage and oxygen radical toxicity. Science, 240, 1302-1309.
ISO 1999. ISO, Soil Quality. Inhibition of Reproduction of Collembola (Folsomia candida). ISO Guideline 11267. International Standardization Organization. Switzerland.
Kuenen, F. 2009. Food webs under stress. PhD Thesis, VU University.
Lexmond, T. M. 1980. The Effect of Soil-pH on Copper Toxicity to Forage Maize Grown under Field Conditions. Netherlands Journal of Agricultural Science, 28, 164-183.
Llanos, R. M. & Mercer, J. F. B. 2004. The Molecular Basis of Copper Homeostasis Copper-Related Disorders. DNA and Cell Biology, 21, 259-270.
Lutsenko, S. & Petris, M. J. 2003. Function and Regulation of the Mammalian Copper-transporting ATPases: Insights from Biochemical and Cell Biological Approaches. Journal of Membrane Biology, 191, 1-12.
Marvin, M. E., Williams, P. H. & Cashmore, A. M. 2003. The Candida albicans CTR1 gene encodes a functional copper transporter. Microbiology, 149, 1461-1474.
Nota, B., Bosse, M., Ylstra, B., Van Straalen, N. M. & Roelofs, D. 2009. Transcriptomics reveals extensive inducible biotransformation in the soil-dwelling invertebrate Folsomia candida exposed to phenanthrene. BMC Genomics, 10.
Nota, B., Timmermans, M., Franken, O., Montagne-Wajer, K., Marien, J., De Boer, M. E., De Boer, T. E., Ylstra, B., Van Straalen, N. M. & Roelofs, D. 2008. Gene Expression Analysis of Collembola in Cadmium Containing Soil. Environmental Science & Technology, 42, 8152-8157.
Nota, B., Verweij, R. A., Molenaar, D., Ylstra, B., Van Straalen, N. M. & Roelofs, D. 2010. Gene Expression Analysis Reveals a Gene Set Discriminatory to Different Metals in Soil. Toxicological Sciences, 115, 34-40.
Nriagu, J. O. 1996. A history of global metal pollution. Science, 272, 223-224.
Pedersen, M. B., Temminghoff, E. J. M., Marinussen, M. P. J. C., Elmegaard, N. & Van Gestel, C. A. M. 1997. Copper accumulation and fitness of Folsomia candida Willem in a copper contaminated sandy soil as affected by pH and soil moisture. Applied Soil Ecology, 6, 135-146.
Smyth, G. K. 2004. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3, Article3.
Timmermans, M., De Boer, M., Nota, B., De Boer, T., Marien, J., Klein-Lankhorst, R., Van Straalen, N. & Roelofs, D. 2007. Collembase: a repository for springtail genomics and soil quality assessment. BMC Genomics, 8, 341.
Tobor-Kaplon, M. A., Bloem, J., Romkens, P. & De Ruiter, P. C. 2005. Functional stability of microbial communities in contaminated soils. Oikos, 111, 119-129.
Van Straalen, N. M. & Verhoef, H. A. 1997. The Development of a Bioindicator System for Soil Acidity Based on Arthropod pH Preferences. The Journal of Applied Ecology, 34, 217-232.
Wijmenga, C. & Klomp, L. W. J. 2004. Molecular regulation of copper excretion in the liver. Proceedings of the Nutrition Society, 63, 31-39.


Chapter 6

General discussion

In this thesis we investigated the response of a soil-dwelling collembolan to various natural soil conditions in order to see whether it was possible to establish a Natural Operating Range (NOR) for the Folsomia candida transcriptome. This NOR could then be applied in a soil quality test, where it would be used to tell gene expression profiles caused by natural processes apart from profiles induced by pollution or chemicals. We partly succeeded in establishing the NOR for natural Dutch soils. There are, however, some remarks that have to be made. First, the large difference in the response of this collembolan to clay and sandy soil means that the two soil types are not interchangeable. For example, a polluted sandy soil cannot be compared to a clean clay soil. Also, this is a relatively small set of soils. For a greater resolution more soils will have to be tested. A world with a growing population implies an increasing demand on local and worldwide resources. According to the ecosystem services model, a healthy soil can be considered one of these resources. This increase in resource demand calls for new and improved techniques for measuring environmental status, which includes soil quality. In 2002 a research consortium was founded in the Netherlands (“Ecogenomics: Assessing the Living Soil”) with the goal of developing new techniques to determine and monitor soil quality and possibly measure the effectiveness of soil remediation techniques (Roelofsen et al., 2008; Van Straalen, 2003). One of these new techniques is the expansion of an existing, ISO certified soil quality test (ISO, 1999), which measures survival and reproduction in the collembolan Folsomia candida as an indicator of soil health. In this expansion, gene expression indicators are added, which could be more sensitive and specific. In Chapter 5 of this thesis, collembolans exposed to soils with low copper concentrations gave an indication that gene expression can indeed be more sensitive and specific. 
It was found that in animals exposed to copper concentrations that did not cause an effect on reproduction, genes involved in vesicle-mediated excretion were clearly up-regulated. This type of excretion is normally used to remove metal ions from the body. In survival, reproduction and growth tests on the same soils no effect was found. For a soil quality test to be able to measure the quality of field soils, information is needed on how the test animal responds to clean soils and to different natural soil properties. One of these important soil properties is the pH, which, unlike moisture content and exposure

temperature, is not easily controlled in a soil quality test. Soil pH, in conjunction with the cation exchange capacity (CEC), greatly influences the bio-availability of metals (Crommentuijn et al., 1997). Because of this interaction between pH and metal bio-availability, it is important to determine the effect of differences in pH on the test animal Folsomia candida. In Chapters 2 and 5 of this thesis, soil pH was an experimental factor. In Chapter 2, collembolans were exposed to four different soil pH values ranging from 3.5 to 6.5 and the expression levels of a set of 10 stress-related genes were measured. In Chapter 5, collembolans were exposed to a field soil which had been spiked with four different copper concentrations and four different pH treatments (pH ranging from 4.0 to 5.1), and gene expression levels were measured via microarray analysis. Reproduction, survival and growth after 28 days of exposure to the soil were also measured in the latter experiments. In the real-time PCR experiments only one gene responded consistently to the pH treatments. This gene, a V-type ATPase gene which transports protons across the cell membrane, was up-regulated in low pH conditions. The other eight measured genes did not respond to the pH treatment. In the microarray analysis of animals exposed to the copper/pH treated soils, 221 genes were significantly differentially expressed (4.3% of the total measured genes) between the lowest and the highest pH values. According to the GO analysis, the pH treatment caused a different effect than the copper treatment. The V-type ATPase gene found in the real-time PCR experiments was differentially expressed between the low and high pH. In general, soil pH had a mild effect on gene transcription in F. candida. This can partly be explained by strong regulatory mechanisms in the tissues in contact with the environment. The only organs in this animal which are exposed to the outside pH are the skin and the gut. 
Since whole animals are used for RNA extraction, cells and organs that are not exposed to the outside pH will not respond, and they dilute the effect of those organs that are exposed to the treatment. This is consistent with the fact that this animal is not very responsive to pH; Van Straalen and Verhoef (1997) measured the pH preference of F. candida in choice tests with a range of pH 2 to 9 and found that this collembolan has a broad pH preference. Since soil pH is such a complicated factor, influencing a great number of other soil components, it is beneficial for a transcriptomics-based soil quality test that soil pH in itself does not show a large effect on the gene expression measured in the whole animal. In Chapter 4 we measured transcriptional variation in F. candida exposed to various field soils in order to try to establish the Natural Operating Range (NOR) of the Folsomia candida transcriptome. The NOR could give insight into how variable gene expression is and indicate which genes may not be very good general biomarkers for soil pollution because they

react strongly to certain soil types. It is not very common in transcriptomic research to measure the variation in a large number of controls, so it is difficult to determine whether the variation we found is normal in these circumstances. In studies by Pritchard et al. (2001, 2006) on variation in gene expression within and between mouse strains, it was found that 23-44% of the genes within inbred strains could be significantly differentially expressed. Between strains this percentage was lower (3%). We found a large number of genes (2819 in total, 936 after fold change cut-off) differentially expressed between collembolans exposed to sand and clay soils. Within these soil types and between the different land-uses we only found minor differences in gene expression patterns. The number of significant genes found between clay and sandy soils is comparable to the numbers found by Nota et al. (2008) (1586 significant genes) and Timmermans et al. (2009) (2116 significant genes), who exposed this collembolan to chemical stress (cadmium) and to physical stress (drought), respectively. This indicates that this collembolan experiences these soil types as very different and that the effect of the difference between them is comparable to that of heavy stress. The differences in transcriptional activity between collembolans exposed to the two soil types are probably caused by the physicochemical properties of the soil types. A clay soil has much smaller particles than a sandy soil and leaves less space between the soil particles, so collembolans are more often forced to stay on the soil surface. Another explanation might be that the microbial community in clay soils is different from the community in sandy soils. Collembolans eat microorganisms, which might in turn influence gene expression in this animal. Currently we are investigating whether there are interactions between F. candida and the soil microbial community. 
This large difference in the responses to the two soil types raises the question of what a good reference or control soil is. In chemical spike-in experiments, a standard soil such as LUFA2.2 or the artificial OECD soil is often used. LUFA2.2 soil is a standard sandy soil in which chemicals will behave differently than in, for example, a field clay soil. But not only chemicals behave differently in these soil types; test animals will also respond differently to the LUFA2.2 control soil than to the field clay soil. LUFA2.2 will therefore not be a good control soil when testing clay soils for soil quality. The best option for a control would be to take an unpolluted soil from as close as possible to the soil of interest, with similar physicochemical properties. We also looked at gene expression stability within similar soil-type/land-use combinations by measuring the coefficient of variation (CoV). For example, genes that respond to cadmium pollution but show a wide variation in expression within similar soils might not be very good biomarkers. Folsomia candida reproduces parthenogenetically, so all animals that

were exposed to the natural field soils were clones, all descended from a single female. This greatly reduces genetic variation in the test animals, but it should be noted that it can have a negative effect on the reproducibility of experiments or tests by other laboratories, because they may use a clone of the same test animal that is genetically different from the one we used. CoV values for most genes were low, indicating that for most soil-type/land-use combinations gene expression is stable. Initially, one of the aims of the experiment with the copper-spiked soils from Bennekom was to calibrate the Natural Operating Range estimated from the natural field soils. The copper effect on F. candida transcription, however, proved to be much smaller than anticipated. Also, no LUFA2.2 control was included in the copper soil design. We decided not to include this for practical reasons, but also because a hormesis effect was observed in copper reproduction tests, which indicated that LUFA2.2 soil does not contain enough copper for this animal to perform optimally. Using such a control as the reference in the gene expression analysis would have complicated the analysis, because the animals could have performed better in the low copper concentration samples than in the control. The lack of a control, however, means that soils from the NOR experiment cannot be directly compared to the copper-spiked soils from Chapter 5, because effects caused by differences between the two exposures cannot be ruled out. For future experiments it would be beneficial if a LUFA2.2 soil were included in the experimental setup, even when field soils are tested. There are other datasets by Nota et al. (2008, 2009, 2010) which include exposures to cadmium and phenanthrene but also to heavily polluted field soils. These datasets include LUFA2.2 controls and can be compared to the NOR soils. 
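The stability measure used above can be sketched in a few lines, assuming that CoV denotes the coefficient of variation (standard deviation divided by the mean) computed per gene over replicate samples of one soil-type/land-use combination. The gene identifiers, the intensity values and the 0.25 threshold below are hypothetical, and the sketch assumes linear-scale (not log-transformed) expression values.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    centre = mean(values)
    if centre == 0:
        raise ValueError("coefficient of variation is undefined for mean 0")
    return stdev(values) / abs(centre)


def unstable_genes(expression, threshold):
    """Return gene IDs whose CoV across replicates exceeds the threshold."""
    return sorted(
        gene
        for gene, values in expression.items()
        if coefficient_of_variation(values) > threshold
    )


# Hypothetical linear-scale intensities for replicates of one soil group.
expression = {
    "gene_A": [100.0, 105.0, 98.0, 102.0],
    "gene_B": [100.0, 300.0, 50.0, 180.0],
}
print(unstable_genes(expression, threshold=0.25))
```

Genes flagged in this way vary widely within otherwise similar soils, which is the property that would make them poor candidates for pollution biomarkers.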
Recommendations for the future of this project are twofold: the genomic perspective of the project and the soil perspective. For the genomic perspective, more sequence information on Folsomia candida genes and the genome in general is needed. When Timmermans et al. (2007) sequenced part of the F. candida transcriptome in 2006, it gave experiments using this ecological test animal an edge over other ecological and ecotoxicological model organisms for which less gene information was known. It also allowed for the rapid development of the microarray platform for gene expression studies that was used in Chapters 4 and 5 of this thesis (Nota et al., 2009). With the next-generation sequencing methods developed in the last few years (Shendure and Ji, 2008; Harris et al., 2008), however, it has become much easier, faster and less expensive to sequence transcriptomes and even whole genomes. One of the first steps should be to obtain more sequence information, either by another Expressed Sequence Tag (EST) sequencing run or by sequencing the whole genome. In this

way, this project and this new technique to assess soil quality can remain competitive and yield better publications. Of all the transcripts on the F. candida microarray platform, 40% have a Basic Local Alignment Search Tool (BLAST) match, and many of these match a hypothetical protein or other unclear term. This lack of BLAST information for transcripts hinders progress in understanding the mode(s) of action that chemicals or pollutants have on this test animal. It can also cause a bias in the interpretation of the data. Interpretation of gene expression analyses often focuses on those transcripts or genes that have BLAST information assigned to them, in order to group genes together and investigate which pathways are affected. This can bias the analysis and, in a way, generalize the results. For example, the function of heat shock proteins in the stress response is well understood and their gene sequences are in general well conserved (Feder and Hofmann, 1999). When genes in this class are found in a significant gene list of animals exposed to a certain treatment, they often become the focus of attention. This focus might then lead to an overestimation of the importance of heat shock proteins, while unknown genes in the list could have explained the effect better if functional information had been available. One way to reduce the risk of bias in the analysis is Gene Ontology (GO) analysis (Alexa et al., 2006). In this analysis genes are assigned to GO terms which can be used, with a statistical test, to investigate the over-representation of classes of genes or processes in the dataset. For this reason, GO analyses were used in the gene expression analyses in Chapters 4 and 5. From the soil perspective, my recommendations for future research would first be to test more soils. We investigated the Natural Operating Range of the F. 
candida transcriptome based on soils from the Netherlands so in essence we developed the Dutch Natural Operating Range. For a soil quality test it is beneficial if it is widely applicable. Soils in other parts of Europe can have very different properties compared to the soils that were tested here, so to broaden the NOR, more soils from all over Europe should be tested. There are multiple European soil programs that could be used for this purpose. The Eurosoils, a project from the EU, could be used to start with. This is a set of seven soils from different European countries, which are regularly used as reference soils (Gawlik et al., 2001). Another strategy to develop a broader NOR is to test the impact of different soil properties on F. candida gene expression rather than soils themselves. We investigated the impact of soil pH both on a small set of genes in OECD soil and with microarrays in the copper/pH treated soil. This can be extended to other soil properties such as organic carbon content, CEC, clay content, etc. A standard soil

such as OECD or LUFA2.2 could even be adjusted for each of these properties in turn, so that the effect of a single property can be analyzed in isolation. This information could also be used to study the interaction between soil properties and pollutants. Finally, we would like to emphasize that the soils tested here to establish the Natural Operating Range are not a set of reference soils whose gene expression data can be used as a control for other, polluted soils. These soils merely gave an indication of the variation in gene expression in the springtail Folsomia candida when exposed to natural soils. The best control for testing a polluted soil is a clean soil from near the soil of interest with similar physicochemical properties.


References

Alexa, A., Rahnenfuhrer, J. & Lengauer, T. 2006. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics, 22, 1600-1607.
Crommentuijn, T., Doornekamp, A. & Van Gestel, C. A. M. 1997. Bioavailability and ecological effects of cadmium on Folsomia candida (Willem) in an artificial soil substrate as influenced by pH and organic matter. Applied Soil Ecology, 5, 261-271.
Feder, M. E. & Hofmann, G. E. 1999. Heat Shock Proteins, molecular chaperones & the Cellular Stress Response: Evolutionary and Ecological Physiology. Annual Review of Physiology, 61, 243-282.
Gawlik, B. M., Lamberty, A., Muntau, H. & Pauwels, J. 2001. Eurosoils – A set of CRMs for comparability of soil-measurements. Fresenius' Journal of Analytical Chemistry, 370, 220-223.
Harris, T. D., Buzby, P. R., Babcock, H., Beer, E., Bowers, J., Braslavsky, I., Causey, M., Colonell, J., Dimeo, J., Efcavitch, J. W., Giladi, E., Gill, J., Healy, J., Jarosz, M., Lapen, D., Moulton, K., Quake, S. R., Steinmann, K., Thayer, E., Tyurina, A., Ward, R., Weiss, H. & Xie, Z. 2008. Single-Molecule DNA Sequencing of a Viral Genome. Science, 320, 106-109.
ISO 1999. ISO, Soil Quality. Inhibition of Reproduction of Collembola (Folsomia candida). ISO Guideline 11267. International Standardization Organization. Switzerland.
Nota, B., Bosse, M., Ylstra, B., Van Straalen, N. M. & Roelofs, D. 2009. Transcriptomics reveals extensive inducible biotransformation in the soil-dwelling invertebrate Folsomia candida exposed to phenanthrene. BMC Genomics, 10.
Nota, B., Timmermans, M., Franken, O., Montagne-Wajer, K., Marien, J., De Boer, M. E., De Boer, T. E., Ylstra, B., Van Straalen, N. M. & Roelofs, D. 2008. Gene Expression Analysis of Collembola in Cadmium Containing Soil. Environmental Science & Technology, 42, 8152-8157.
Nota, B., Verweij, R. A., Molenaar, D., Ylstra, B., Van Straalen, N. M. & Roelofs, D. 2010. Gene Expression Analysis Reveals a Gene Set Discriminatory to Different Metals in Soil. Toxicological Sciences, 115, 34-40.
Pritchard, C., Coil, D., Hawley, S., Hsu, L. & Nelson, P. S. 2006. The contributions of normal variation and genetic background to mammalian gene expression. Genome Biology, 7.
Pritchard, C. C., Hsu, L., Delrow, J. & Nelson, P. S. 2001. Project normal: Defining normal variance in mouse gene expression. Proceedings of the National Academy of Sciences, 98, 13266-13271.
Roelofsen, A., Broerse, J. E. W., De Cock Buning, T. & Bunders, J. F. G. 2008. Exploring the future of ecological genomics: Integrating CTA with vision assessment. Technological Forecasting and Social Change, 75, 334-355.
Shendure, J. & Ji, H. 2008. Next-generation DNA sequencing. Nat Biotech, 26, 1135-1145.
Timmermans, M., De Boer, M., Nota, B., De Boer, T., Marien, J., Klein-Lankhorst, R., Van Straalen, N. & Roelofs, D. 2007. Collembase: a repository for springtail genomics and soil quality assessment. BMC Genomics, 8, 341.
Timmermans, M. J. T. N., Roelofs, D., Nota, B., Ylstra, B. & Holmstrup, M. 2009. Sugar sweet springtails: on the transcriptional response of Folsomia candida (Collembola) to desiccation stress. Insect Molecular Biology, 18, 737-46.
Van Straalen, N. M. 2003. Peer Reviewed: Ecotoxicology Becomes Stress Ecology. Environmental Science & Technology, 37, 324A-330A.
Van Straalen, N. M. & Verhoef, H. A. 1997. The Development of a Bioindicator System for Soil Acidity Based on Arthropod pH Preferences. The Journal of Applied Ecology, 34, 217-232.


Summary

Soil and the organisms that live in it provide a number of services to human society. Most of our food is grown in it and we build our houses on it. Soil pollution can severely impair these functions and can threaten human and animal health. Given the large number of chemicals manufactured by modern society, it is important that the impact of these potential pollutants can be assessed in a fast and sensitive manner. Gene expression analysis in selected, ecologically relevant test animals may be such a method for testing the effects that chemicals and pollutants may have on animal and human health. Soil, however, comes in many distinct forms with very different characteristics. These characteristics may themselves affect gene expression or may influence pollutant properties. The main question addressed in this thesis is how soil properties influence gene expression in the springtail Folsomia candida and how they affect the toxicity of soil pollutants. The results can be used to evaluate the effectiveness of a soil quality test based on gene expression in F. candida.

In Chapter 2 a set of reference genes is described that may be used for gene expression studies in the collembolans Folsomia candida and Orchesella cincta. These genes are especially important in quantitative real-time polymerase chain reaction (Q-PCR) experiments, where differences in the expression of a gene of interest between treatments are measured relative to a gene that is unaffected by the treatment. Some genes are thought to have a stable expression pattern regardless of outside influences such as heat or cold treatment or chemical exposure. These genes, also called housekeeping genes, are highly sought after as reference genes for Q-PCR experiments. A set of potential reference genes was developed for both collembolan species, and gene expression stability was investigated by exposing collembolans to a set of treatments, both abiotic (heat, drought, soil pH) and chemical (cadmium and phenanthrene), and measuring gene expression with Q-PCR. When gene expression stability was ranked, it became apparent that in both species the three most stable genes differed between treatments. The conclusion from these data is that universally stable reference genes are very hard to find, if they exist at all. When performing Q-PCR experiments on genes of interest it is therefore necessary to determine which reference genes are the most stable under the selected treatments.
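The ranking step described for Chapter 2 can be illustrated with a minimal sketch. It assumes that stability is scored by the sample standard deviation of Ct values across replicates within a treatment, which is a simple stand-in and not necessarily the stability measure used in the thesis. The gene names and Ct values are hypothetical and serve only to show how the most stable gene can differ between treatments.

```python
from statistics import stdev

# Hypothetical Ct values (made up for illustration, not thesis data):
# treatment -> candidate reference gene -> replicate Ct values.
ct_values = {
    "heat": {
        "gene_A": [20.1, 20.3, 20.2],
        "gene_B": [18.0, 19.5, 17.2],
        "gene_C": [22.0, 22.4, 21.8],
    },
    "cadmium": {
        "gene_A": [20.0, 21.9, 19.1],
        "gene_B": [18.1, 18.2, 18.0],
        "gene_C": [22.1, 22.6, 21.7],
    },
}


def rank_by_stability(gene_cts):
    """Sort gene names from most to least stable (smallest Ct spread first).

    Assumption: the sample standard deviation of Ct values is used as the
    stability score; other measures could be substituted here.
    """
    return sorted(gene_cts, key=lambda gene: stdev(gene_cts[gene]))


# One ranking per treatment, so rankings can be compared across treatments.
rankings = {
    treatment: rank_by_stability(genes)
    for treatment, genes in ct_values.items()
}

for treatment, ranking in rankings.items():
    print(treatment, ranking)
```

In this made-up example gene_A ranks first under heat but last under cadmium, which mirrors the kind of treatment dependence reported above.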


Chapter 3 deals with the effects of abiotic stress on transcriptional regulation in Folsomia candida. Chemicals in the soil are influenced by abiotic soil factors. Soil pH, for example, influences the availability of heavy metals for uptake. In metal-containing soils with a low pH, metal ions are free to be taken up by animals and plants, and the toxic effect of these metals can therefore be greater in low-pH soils. Soil pH can itself also cause effects at the transcriptional level. To test the effect of soil pH on F. candida transcriptional regulation, we exposed collembolans to four soil pH values (3.5, 4.5, 5.5 and 6.5), all in standard OECD soil, and measured the expression levels of nine stress-implicated genes. Environmental temperature can also influence gene expression because it alters the speed at which cellular processes take place. Higher temperatures may also impair protein folding, which causes stress. Collembolans were therefore also exposed to four temperatures (0, 10, 20 and 30 °C) and the same panel of genes was tested. In the pH experiment only one gene responded to differences in soil pH: a vacuolar ATPase, which transports protons across the cell membrane. In the temperature experiment only Heat Shock Proteins (HSP) 40 and 70 were affected. Remarkably, they were up-regulated not only at 30 °C but also at 0 °C. This up-regulation of HSP40 and HSP70 is a novel finding and merits further investigation.

In Chapter 4 the concept of a Natural Operating Range (NOR) is introduced. If gene expression levels are to be used as valid endpoints in soil ecotoxicological testing, a reference database describing how the test animal responds to natural, unpolluted soils has to be established. Twenty-six Dutch field soils, all part of the biological soil indicator network (BoBI), an RIVM program, were sampled. F. candida was exposed to these soils and gene expression was measured by microarray analysis. A 28-day survival and reproduction test was also performed to compare the gene expression results with these original ISO test endpoints. The difference between sandy and clay soils had the largest effect on gene expression: almost 20% of the genes were differentially expressed between these two soil types. Land use had only a minor effect on gene expression in the exposed animals; 12 genes were differentially expressed. A multivariate analysis linking gene expression to the results of the soil chemical analysis showed that soil fertility was correlated with gene expression. No effect of soil type or land use was detected on survival or reproduction, but soil arsenic content was negatively correlated with reproduction.

In Chapter 5 the impact of an aged, copper-polluted field soil on F. candida gene expression was determined. In 1980 a field in Bennekom, the Netherlands, was spiked with four copper concentrations and four pH treatments. The pH treatments were repeated over the years, but no additional copper was added to the soil. This design made the site an ideal, controlled, aged metal-polluted soil to test. All copper/pH combinations were sampled, and gene expression, reproduction and growth were measured. A minor copper-induced effect on gene expression was found: 68 genes were differentially regulated. Gene Ontology (GO) enrichment analysis showed that GO terms involved in vesicle-mediated excretion were up-regulated at the high copper concentrations. The pH treatment induced a larger effect on gene expression, with 221 genes differentially regulated. Neither copper treatment nor pH treatment had a significant effect on survival, reproduction or growth in these springtails. The results indicate that the copper concentration and availability in these soils were too low to produce a toxic effect. The gene expression test, however, was able to differentiate between the pH and copper effects and was sensitive enough to detect non-toxic effects.

In this thesis the effects of soil properties on Folsomia candida transcriptional regulation were discussed. Knowledge of these effects is necessary to separate them from effects induced by soil pollutants and chemicals. One of the remarkable findings was the large difference in genomic response to clay and sandy soils. Abiotic soil factors such as pH do exert an effect, albeit small, on gene expression. Therefore, for proper soil testing it is necessary to use a control soil that is similar to the soil of interest. This makes standard lab soils such as LUFA2.2 or OECD soil less suitable as controls for field soils when tests with many endpoints, such as gene expression analysis, are used. In this thesis a start was made on developing a NOR for the springtail Folsomia candida. To develop a complete NOR, many more natural soils should be tested.
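To give a sense of the span of the soil pH treatments used in Chapter 3, the following sketch converts the four pH values to approximate proton concentrations. It assumes the textbook relation pH = -log10([H+]) applied to the soil solution and treats activity as concentration; the function and variable names are illustrative only.

```python
# Soil pH values of the Chapter 3 exposure, as listed in the summary.
soil_ph_values = [3.5, 4.5, 5.5, 6.5]


def proton_concentration(ph):
    """Approximate proton concentration in mol/L from pH = -log10([H+])."""
    return 10.0 ** (-ph)


concentrations = {ph: proton_concentration(ph) for ph in soil_ph_values}

# Ratio between the most acidic and the least acidic treatment.
fold_difference = concentrations[3.5] / concentrations[6.5]

for ph, conc in concentrations.items():
    print(f"pH {ph}: [H+] ~ {conc:.2e} mol/L")
print(f"pH 3.5 vs pH 6.5: ~{fold_difference:.0f}-fold difference")
```

Under these assumptions each pH unit is a tenfold step, so the lowest and highest treatments differ by about three orders of magnitude in proton concentration.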


Samenvatting (Dutch summary)

Title: Stress-free springtails: determining natural transcriptional profiles in collembolans

Soil and the organisms that live in it form an important link in our daily existence. A large part of our food is grown in it, we build our houses on it and we use it for recreation. Clean and healthy soil is therefore essential to the success of our society. In the European Union many kinds of chemicals are produced every year that can cause pollution if they end up in the environment. Advanced environmental tests are needed to determine the influence of these potential pollutants on flora and fauna. Our group has recently developed a potential new test that measures soil pollution by means of gene expression measurements in the springtail Folsomia candida. Springtails, also called collembolans, are small, six-legged organisms related to the insects. Folsomia candida lives in the soil, which makes it a relevant test organism for measuring soil pollution. Recent gene expression techniques, such as microarrays, can measure the overall state of an organism in a single assay from the response that genes show when the organism is exposed to pollution. These techniques are therefore a good complement to the more traditional ecotoxicological techniques, in which the survival and reproductive success of test organisms are measured. Because the genes of Folsomia candida can also respond to soil properties, such as pH or organic matter content, the aim of this thesis is to test the response of this springtail to healthy soils and to soil properties, so that the reaction to pollutants can be separated from the response to the natural soil.

In Chapter 2 a group of reference genes is tested that can be used in gene expression measurements such as real-time quantitative PCR (Q-PCR). With Q-PCR the response of a single gene in the test organism to multiple exposures can be measured. This does, however, require another gene that can serve as a control and that does not respond to the exposure. A set of candidate reference genes for the springtail species Folsomia candida and Orchesella cincta was tested in several kinds of exposure, both chemical (cadmium and phenanthrene) and physical (heat, soil pH, etc.). Strikingly, no single gene was the most stable in all exposures, and the general conclusion of this chapter is therefore that a set of reference genes needs to be tested with each new kind of exposure that is carried out.

In Chapter 3 the Q-PCR technique described above is used to measure the influence of temperature and soil pH on a test set of nine stress-related genes. Soil pH can influence the toxicity of pollutants such as heavy metals, but could also affect gene expression in general. Temperature can also influence toxicants because it affects cellular processes. Of the nine genes tested, only one responded to differences in pH. This gene is an ATPase gene encoding a protein that transports protons across the cell membrane. Because the proton concentration is higher at low pH, this transport protein has more difficulty with the transport and the expression of this protein is up-regulated. Two general stress genes, HSP40 and HSP70, responded to temperature differences. These genes were known to respond to heat, but in this case both were also up-regulated on exposure to zero degrees Celsius.

An important concept for this thesis, the Natural Operating Range (NOR), is introduced in Chapter 4. The NOR is important because it can be used to separate effects induced by natural conditions from effects caused by toxicants and pollutants. The NOR is, however, very complex and would, strictly speaking, only be complete once all possible soils have been tested. In this chapter we make a start on establishing the NOR for Folsomia candida by exposing this springtail to a set of natural soils and measuring the gene expression response. Strikingly, the responses to clay and sandy soil differed greatly: 20% of the measured F. candida genes responded differently to sandy soil than to clay soil. When gene expression was correlated with the chemical data of the soils, gene expression turned out to be positively correlated with soil fertility. In a survival and reproduction test on these soils no direct relationship with soil type or land use was found. Reproduction was, however, negatively correlated with the arsenic concentration in the soil.

In Chapter 5 Folsomia candida was exposed to a natural, polluted soil and gene expression was measured. In 1981 this soil was deliberately treated with 4 copper concentrations, and the soil pH was adjusted to 4 pH values, giving 16 combinations. All of these combinations were tested. The copper in the soil induced a small gene expression response in which, at the highest copper concentration measured, 68 genes were differentially regulated. These genes were involved in the excretion of metals, which indicates that the copper taken up is processed in the body of the organism and excreted again. The copper pollution in this soil was not high enough to affect the survival, reproduction or growth of F. candida, which means that a test based on gene expression is more sensitive, because it is able to detect a response to this copper pollution.

The aim of my PhD research was to investigate how the properties of natural soils influence the expression of genes in the springtail Folsomia candida. The microarray gene expression test used in this thesis measures more than 5000 genes at once. Because so many genes are measured, a successful soil quality test requires knowledge of which genes also respond to normal soil properties, so that effects caused by pollution can be distinguished from effects caused by the soil itself. Strikingly, the gene expression response differed greatly when F. candida was exposed to clay or sandy soil. To test a field soil properly, the control soil must therefore have comparable properties. This makes commonly used standard soils such as LUFA2.2 and OECD soil less suitable as control soils in the testing of field soils. In this thesis a start was made on establishing a NOR for Folsomia candida. To truly determine this NOR, more soils, both within the Netherlands and in the rest of Europe, will have to be tested.


Acknowledgements (Dankwoord)

The English clergyman, poet and philosopher John Donne once wrote: ‘No man is an island, entire of itself’. A line that, in my opinion, applies very well to PhD research. Over the past four years I have received a great deal of help from the people around me, and that this thesis has come to a good end is certainly due in large part to them.

First of all I want to thank Nico and Dick. You were my promotor and co-promotor for the past four years and I have learned an enormous amount from you. I experienced our collaboration as pleasant, sociable and relaxed, but also as effective. Dick, I also want to thank you for always being willing to come along on fieldwork, something that is much more fun with two than alone.

I also want to thank my fellow ecogenomics companions: Martijn, Muriel, Janine, Ben and Thierry. I learned a great deal from you, both in practical matters and in data analysis. I think this thesis would be far from finished without your help. My thanks also go to Kees and Rudo. Kees, you were always ready to answer my questions about soil, and there were quite a few of them, since as a molecular biologist I did not know all that much about it. Rudo, you often helped me with the practical side of soil research.

I also want to thank all colleagues at the Department of Animal Ecology (Dierecologie) for a good collaboration and a wonderful time. In particular Maartje, Mieke and Miriam: I had a really great time with you in our office and at the conferences we attended together. At the microarray facility of the VUMC I would like to thank Paul Eijk and François Rustenburg for their help in running the microarrays.

My family and friends, thank you for all the support I have received from you. Even though some of you still think that I walk around all day with test tubes and petri dishes, or: “It was something with bacteria in the soil, wasn't it?”, you have meant a lot to me over the past years.

Finally I want to thank my father. Dad, over the past four years I have bought two houses, sold one house, moved twice and, by my calculation, we spent about six months in total on DIY, and in between I occasionally did some work on my PhD research. Through all of this you were always willing to help and think along; thank you very much for that!


Curriculum Vitae

Tjalf de Boer was born on 4 October 1979 in Alkmaar. In 1999 he began his studies in Higher Laboratory Education (Hoger Laboratorium Onderwijs, HLO) at the Hogeschool Alkmaar. During these studies he completed two internships: one at the Netherlands Cancer Institute (Nederlands Kanker Instituut), where genetic abnormalities in pleural mesothelioma were investigated, and one at Macrozyme/AMC, where the effects of the lipid storage disorder Gaucher disease were studied. After completing the HLO he followed a Master's course in Biomolecular Sciences at the University of Amsterdam (Universiteit van Amsterdam). An internship was carried out at the Department of Epigenetics under the supervision of Professor Arie Otte. The subject of the internship was the identification of target genes of the chromatin-modifying protein Enhancer of Zeste 2. The author graduated in September 2005, after which the PhD research that led to this thesis was started in early 2006.


Publications

T.E. de Boer, M. Holmstrup, N.M. van Straalen & D. Roelofs. 2010. The effect of soil pH and temperature on Folsomia candida transcriptional regulation. Journal of Insect Physiology, 56, 350-355.

M.E. de Boer, T.E. de Boer, J. Marien, M.J.T.N. Timmermans, B. Nota, N.M. van Straalen, J. Ellers & D. Roelofs. 2009. Reference genes for QRT-PCR tested under various stress conditions in Folsomia candida and Orchesella cincta (Insecta, Collembola). BMC Molecular Biology, 10, 54.

S. Slotsbo, L.H. Heckmann, C. Damgaard, D. Roelofs, T.E. de Boer & M. Holmstrup. 2009. Exposure to mercury reduces heat tolerance and heat hardening ability of the springtail Folsomia candida. Comparative Biochemistry and Physiology Part C: Toxicology & Pharmacology, 150, 118-123.

B. Nota, M.J.T.N. Timmermans, O. Franken, K. Montagne-Wajer, J. Marien, M.E. de Boer, T.E. de Boer, B. Ylstra, N.M. van Straalen & D. Roelofs. 2008. Gene Expression Analysis of Collembola in Cadmium Containing Soil. Environmental Science & Technology, 42, 8152-8157.

M.J.T.N. Timmermans, M.E. de Boer, B. Nota, T.E. de Boer, J. Marien, R.M. Klein-Lankhorst, N.M. van Straalen & D. Roelofs. 2007. Collembase: a repository for springtail genomics and soil quality assessment. BMC Genomics, 8, 341.
