Optimal Design for A/B Testing in the Presence of Covariates and Network Connection 1 2 Qiong Zhang∗ and Lulu Kang† 1School of Mathematical and Statistical Sciences, Clemson University 2Department of Applied Mathematics, Illinois Institute of Technology Abstract A/B testing, also known as controlled experiments, refers to the statistical pro- cedure of conducting an experiment to compare two treatments applied to different testing subjects. For example, many companies offering online services frequently to conduct A/B testing on their users who are connected in social networks. Since two connected users usually share some similar traits, we assume that their measurements are related to their network adjacency. In this paper, we assume that the users, or the test subjects of the experiments, are connected on an undirected network. The subjects’ responses are affected by the treatment assignment, the observed covariate features, as well as the network connection. We include the variation from these three sources in a conditional autoregressive model. Based on this model, we propose a design criterion on treatment allocation that minimizes the variance of the estimated treatment effect. Since the design criterion depends on an unknown network correla- arXiv:2008.06476v1 [stat.ME] 14 Aug 2020 tion parameter, we propose a Bayesian optimal design method and a hybrid solution approach to obtain the optimal design. Examples via synthetic and real social networks are shown to demonstrate the performances of the proposed approach. Keywords: A/B testing; Conditional autoregressive model; Controlled experiments; Bayesian optimal design; D-optimal design; Network correlation. ∗
[email protected] †
[email protected] 1 1 Introduction A/B testing, also known as controlled experiments approach, has been widely used in agri- cultural, clinical trials, engineering and science studies, marketing research, etc.