Addressing Challenges of Hierarchical Structural Equation Modeling in Animal Agriculture
Total Page:16
File Type:pdf, Size:1020Kb
Addressing Challenges of Hierarchical Structural Equation Modeling in Animal Agriculture by Kessinee Chitakasempornkul B.BA., Mahidol University International Colleges, 2008 M.S., California State University Long Beach, 2012 AN ABSTRACT OF A DISSERTATION submitted in partial fulfillment of the requirements for the degree DOCTOR OF PHILOSOPHY Department of Statistics College of Arts and Sciences KANSAS STATE UNIVERSITY Manhattan, Kansas 2019 Abstract Feeding a world population of 9 billion people by 2050 is a fundamental challenge of our generation. Animal agriculture is positioned to play a major role on this challenge by ensuring a safe and secure supply of animal protein for food. Thus motivated, an understanding of the mechanistic interconnections between multiple outcomes in agricultural production systems is critical. Structural equation models (SEM) are being increasingly used for investigating directionality in the associations between outcomes in the system. Agricultural data pose peculiar challenges to the implementation of SEM, among them its structured architecture and its multidimensional heterogeneity. For example, observations on a given outcome collected at the animal level are often not mutually independent but rather likely to have correlation patterns due to clustering within pens or cohorts, which in turn may be subjected to common management or business practices defined at the level of a commercial operation. Also, agricultural outcomes of interest are often correlated and at multiple levels. Furthermore, a key assumption underlying SEM is that of a causal homogeneity, whereby the structural coefficients defining functional links in a network are assumed homogeneous and impervious to environmental conditions or management factors. This assumption seems particularly questionable in the context of animal agriculture, where production systems are regularly subjected to explicit interventions intended to optimize the necessary trade-offs between efficacy and efficiency of production. Despite the recent extension of SEM to a mixed-models framework, inferential issues related to hierarchical data architecture in the context of designed experiments and observational studies are not well understood. Hence, my dissertation work investigates the importance of properly specifying data structure in a hierarchical SEM. My research further develops methodological extensions to SEM to account for heterogeneity in the structural coefficients of a network and characterizes problems to be expected otherwise. Throughout my PhD dissertation, I implemented SEM in a hierarchical Bayesian framework, and I used as motivator an observational dataset in beef cattle feedlot production and a dataset from a designed experiment in swine reproduction. I first evaluated the inferential implications of properly accounting for (or ignoring) existent correlation structure due to data architecture when modeling feedlot data with SEM. Results indicated impaired model fit, biased estimation and precision loss for SEM parameters when data architecture was mispecified or ignored. I then investigated potential causal interconnections between reproductive performance outcomes in swine, for which I leveraged the mixed-models adapted inductive causation algorithm to search for and infer upon causal links. Results indicated reproductive networks distinctive by parity groups, thereby suggesting potential network heterogeneity; this finding was in direct conflict with the standard SEM assumption of causal homogeneity. Therefore, I proposed and developed a methodological extension to hierarchical SEM that explicitly specifies structural coefficients as functions of systematic and non-systematic sources of variation, thus allowing for hierarchical heterogeneity of network links in structured data. I validated the proposed method using a simulation study and applied it to assess heterogeneity of functional reproductive links in swine. Overall, this dissertation characterizes problems of SEM-based modeling and develops a general approach to hierarchical SEM for the joint network-type analysis of multiple outcomes with potential heterogeneity in functional links. Addressing Challenges of Hierarchical Structural Equation Modeling in Animal Agriculture by Kessinee Chitakasempornkul B.BA., Mahidol University International Colleges, 2008 M.S., California State University Long Beach, 2012 A DISSERTATION submitted in partial fulfillment of the requirements for the degree DOCTOR OF PHILOSOPHY Department of Statistics College of Arts and Sciences KANSAS STATE UNIVERSITY Manhattan, Kansas 2019 Approved by: Major Professor Nora M. Bello, DVM, PhD Copyright © Kessinee Chitakasempornkul 2019. Abstract Feeding a world population of 9 billion people by 2050 is a fundamental challenge of our generation. Animal agriculture is positioned to play a major role on this challenge by ensuring a safe and secure supply of animal protein for food. Thus motivated, an understanding of the mechanistic interconnections between multiple outcomes in agricultural production systems is critical. Structural equation models (SEM) are being increasingly used for investigating directionality in the associations between outcomes in the system. Agricultural data pose peculiar challenges to the implementation of SEM, among them its structured architecture and its multidimensional heterogeneity. For example, observations on a given outcome collected at the animal level are often not mutually independent but rather likely to have correlation patterns due to clustering within pens or cohorts, which in turn may be subjected to common management or business practices defined at the level of a commercial operation. Also, agricultural outcomes of interest are often correlated and at multiple levels. Furthermore, a key assumption underlying SEM is that of a causal homogeneity, whereby the structural coefficients defining functional links in a network are assumed homogeneous and impervious to environmental conditions or management factors. This assumption seems particularly questionable in the context of animal agriculture, where production systems are regularly subjected to explicit interventions intended to optimize the necessary trade-offs between efficacy and efficiency of production. Despite the recent extension of SEM to a mixed-models framework, inferential issues related to hierarchical data architecture in the context of designed experiments and observational studies are not well understood. Hence, my dissertation work investigates the importance of properly specifying data structure in a hierarchical SEM. My research further develops methodological extensions to SEM to account for heterogeneity in the structural coefficients of a network and characterizes problems to be expected otherwise. Throughout my PhD dissertation, I implemented SEM in a hierarchical Bayesian framework, and I used as motivator an observational dataset in beef cattle feedlot production and a dataset from a designed experiment in swine reproduction. I first evaluated the inferential implications of properly accounting for (or ignoring) existent correlation structure due to data architecture when modeling feedlot data with SEM. Results indicated impaired model fit, biased estimation and precision loss for SEM parameters when data architecture was mispecified or ignored. I then investigated potential causal interconnections between reproductive performance outcomes in swine, for which I leveraged the mixed-models adapted inductive causation algorithm to search for and infer upon causal links. Results indicated reproductive networks distinctive by parity groups, thereby suggesting potential network heterogeneity; this finding was in direct conflict with the standard SEM assumption of causal homogeneity. Therefore, I proposed and developed a methodological extension to hierarchical SEM that explicitly specifies structural coefficients as functions of systematic and non-systematic sources of variation, thus allowing for hierarchical heterogeneity of network links in structured data. I validated the proposed method using a simulation study and applied it to assess heterogeneity of functional reproductive links in swine. Overall, this dissertation characterizes problems of SEM-based modeling and develops a general approach to hierarchical SEM for the joint network-type analysis of multiple outcomes with potentially heterogeneity in functional links. Table of Contents List of Figures ................................................................................................................................ xi List of Tables ................................................................................................................................ xv Acknowledgements ..................................................................................................................... xvii Dedication .................................................................................................................................. xviii Preface........................................................................................................................................ xviii Chapter 1 - Introduction .................................................................................................................. 1 1.1. Classical multiple-trait model .......................................................................................... 2 1.2. Structural equation modeling framework .......................................................................