Applying the Exposome Concept in Birth Cohort Research: a Review of Statistical Approaches

European Journal of Epidemiology (2020) 35:193–204 https://doi.org/10.1007/s10654-020-00625-4 ESSAY Applying the exposome concept in birth cohort research: a review of statistical approaches Susana Santos1,2,3 · Léa Maitre3,4,5 · Charline Warembourg3,4,5 · Lydiane Agier6 · Lorenzo Richiardi7 · Xavier Basagaña3,4,5 · Martine Vrijheid3,4,5 Received: 29 August 2019 / Accepted: 17 March 2020 / Published online: 27 March 2020 © The Author(s) 2020 Abstract The exposome represents the totality of life course environmental exposures (including lifestyle and other non-genetic factors), from the prenatal period onwards. This holistic concept of exposure provides a new framework to advance the understanding of complex and multifactorial diseases. Prospective pregnancy and birth cohort studies provide a unique opportunity for exposome research as they are able to capture, from prenatal life onwards, both the external (including lifestyle, chemical, social and wider community-level exposures) and the internal (including infammation, metabolism, epigenetics, and gut microbiota) domains of the exposome. In this paper, we describe the steps required for applying an exposome approach, describe the main strengths and limitations of diferent statistical approaches and discuss their challenges, with the aim to provide guidance for methodological choices in the analysis of exposome data in birth cohort studies. An exposome approach implies selecting, pre-processing, describing and analyzing a large set of exposures. Several statistical methods are currently available to assess exposome-health associations, which difer in terms of research question that can be answered, of bal- ance between sensitivity and false discovery proportion, and between computational complexity and simplicity (parsimony). Assessing the association between many exposures and health still raises many exposure assessment issues and statistical challenges. The exposome favors a holistic approach of environmental infuences on health, which is likely to allow a more complete understanding of disease etiology. Keywords Exposome · Birth cohorts · Life-course epidemiology · Omics · Environmental epidemiology Introduction * Susana Santos Disease risk is largely determined by behavioral, environ- [email protected] mental, and occupational risk factors [1]. Many of these risk 1 The Generation R Study Group, Erasmus MC, factors are modifable and therefore can be a potential target University Medical Center Rotterdam, PO Box 2040, for prevention. However, current insights into the associa- 3000 CA Rotterdam, The Netherlands tion between many of these risk factors and health are too 2 Department of Pediatrics, Erasmus MC – Sophia Children’s limited for efective prevention. A substantial proportion Hospital, University Medical Center Rotterdam, Rotterdam, of the modifable risk of major chronic diseases remains The Netherlands to be discovered [1]. Also, not much is known about the 3 Institute for Global Health, ISGlobal, Barcelona, Spain complex interrelations between risk factors and how they 4 Universitat Pompeu Fabra (UPF), Barcelona, Spain relate to biological endogenous responses. The holistic expo- 5 CIBER Epidemiología y Salud Pública (CIBERESP), some concept may provide a useful framework to broaden Barcelona, Spain our understanding of the impact over the life course of a 6 Team of Environmental Epidemiology, IAB, multitude of risk factors on human health, and ultimately Institute for Advanced Biosciences, Inserm, CNRS, contribute to explaining the unattributable burden of disease CHU-Grenoble-Alpes, University Grenoble-Alpes, Grenoble, and developing efective preventive measures. Pregnancy France and birth prospective cohorts, combining a wealth of data 7 Cancer Epidemiology Unit, Department of Medical Sciences, collected at the individual and population level by methods University of Turin and CPO-Piemonte, Turin, Italy Vol.:(0123456789)1 3 194 S. Santos et al. such as questionnaires, physical examinations, biological starting point to develop a lifetime exposome [6]. Other samples and geo-spatial modelling, represent a unique and possible key time points where measures of the exposome resourceful opportunity to perform exposome studies of the could be made include the infancy, childhood, adolescence fetal and early life onwards. and adulthood life periods [3]. Whilst it is clear that full In this paper, we describe the steps required for apply- measurement of the exposome at even a single time point ing an exposome approach in epidemiologic studies, with a is currently impossible, even partial exposome coverage, specifc focus on prospective birth cohort studies. We intend where neither the totality of exposures nor the dynamic to provide researchers with relevant information to perform coverage are assessed, is valuable [3]. Important pro- exposome analyses, understand the main strengths and limi- gresses in uncovering the various aspects of the exposome tations of diferent statistical approaches and list the chal- are currently being made by diferent research projects lenges for future research on the exposome. worldwide, including the LifeCycle (www.lifec ycle-proje ct.eu), HELIX (www.proje cthel ix.eu), EXPOsOMICS (www.exposomics proje ct.eu ), HEALS (www.heals-eu.eu ), The exposome HERCULES (https ://emory hercu les.com) and CHEAR (https ://chear progr am.org) projects [7–11]. Further, an R The concept of the exposome was frst proposed in 2005 package called rexposome, which allows exposome data by the cancer researcher Christopher Wild to encompass loading, exploration, and analysis has been developed ‘life-course environmental exposures (including lifestyle [12]. In the next sections of this paper, we describe the factors), from the prenatal period onwards’ [2]. It was main steps undertaken in exposome research. developed to highlight the need for more accurate, reliable and comprehensive environmental exposure data to complement the impressive advances made in measuring the human genome. The exposome has been proposed to Selecting and pre‑processing exposures comprise three overlapping and complementary domains: (1) a general external domain, including factors such as The selection of exposures to consider in the study depends climate, urban/rural environment and societal factors, on the research question and issues of feasibility and data which are mainly assessed at the community level by accessibility. Previous studies have included a wide range geographical mapping methods; (2) a specifc external of external environmental exposures from various families, domain, including environmental pollutants, tobacco and or have been specifcally focused on certain components of diet, which are assessed at the individual level by ques- the exposome, such as the urban or the chemical exposome, tionnaires or biomonitoring, and (3) an internal domain, assessed at various developmental periods [13–19]. Socio- including internal body processes such as infammation, and public health exposome conceptual frameworks have metabolism, and endogenous circulating hormones, which also been proposed and can guide the selection of exposures are often assessed by high-throughput molecular omics [20, 21]. Once the research question and set of exposure methodologies [3]. The internal exposome is expected variables have been clarifed, a considerable amount of pre- to, at least partly, reflect the external exposome. The processing of the data is required. exposome is complex and challenging in many respects. Measuring it implies the accurate and reliable assessment of many time-varying exposures over the life course. Handling missing data in the exposures The technical challenges of measuring the external and internal domains of the exposome are beyond the scope Missing data is problematic in an exposome context that of this review and have been discussed elsewhere [4, 5]. examines exposures jointly because the number of complete Continuous exposome assessment is difcult and thus the cases may decrease as the number of included exposures life course exposome is usually obtained from exposure increases. The use of imputation techniques is therefore rec- assessments at specifc time points. The choice of the time ommended. Multiple imputation is commonly used to handle points is of utmost importance since the health efects of missing data in epidemiological studies, and Rubin’s rule a given exposure may vary during the various develop- can be used to combine coefcient estimates obtained from mental periods or may be altered when it co-occurs with each imputed dataset [22]. Applying multiple imputation to other exposures through synergistic/interactive efects [6]. large datasets involves additional difculties that were pre- Since the developing fetus is particularly vulnerable to viously described [23]. It is recommended that imputation the efects of environmental exposures and since adverse models include no more than 15–25 predictors, since adding exposures in utero during critical windows may have a life- more predictors usually provides little gain and can lead to time health impact, the pregnancy period is an important problems of convergence due to predictors collinearity [24]. 1 3 195 Applying the exposome concept in birth cohort research: a review of statistical approaches Dealing with exposure values below the limits all types of regression models or to all settings [29–31]. A of detection and quantifcation simulation study showed that two measurement error models (simulation

Applying the Exposome Concept in Birth Cohort Research: a Review of Statistical Approaches

Development of Virtual Screening and in Silico Biomarker Identification Model for Pharmaceutical Agents

Welcome to the 2018 Mechanobiology Symposium!

Lighting up the Mechanome

NORTHWESTERN UNIVERSITY Single Audit Report Year Ended August 31, 2018 (With Independent Auditors’ Report Thereon)

Pathway Analysis in Other Data Types

Why Precision Medicine Is Poised to Play Perfectly to the Strengths of Pathology and Lab Medicine

An Emerging Field Between Biology and Biomechanics

Introduction to Pathway and Network Analysis

Released March 2021

The Materiome

Materiomics: Biological Protein Materials, from Nano to Macro

Pathway Analysis in Other Data Types