Marketing Science Institute Working Paper Series 2017 Report No. 17-100

The Effects of Mobile Apps on Shopper Purchases and Product Returns

Unnati Narang and Venkatesh Shankar

“The Effects of Mobile Apps on Shopper Purchases and Product Returns” © 2017 Unnati Narang and Venkatesh Shankar; Report Summary © 2017 Marketing Science Institute

MSI working papers are distributed for the benefit of MSI corporate and academic members and the general public. Reports are not to be reproduced or published in any form or by any means, electronic or mechanical, without written permission.

Report Summary

In recent years, the penetration of mobile devices has reached unprecedented levels. In particular, mobile apps are increasingly dominating mobile device use. Despite the widespread use and the potentially significant effects of mobile apps and different app features, their consequences have been underexplored. Not much is known about how app adoption affects the net monetary value of purchases after accounting for product returns.

In this research, Unnati Narang and Venkatesh Shankar reveal new insights about the effects of mobile app adoption on shopping behavior across channels. They study a rich set of managerially important outcomes, such as individual purchase incidence and amount and product return incidence and amount. They use a large-scale dataset from an omnichannel retailer of video games, consumer electronics, and wireless services with 30 million shoppers. This unique dataset identifies individual app adoption date and allows the authors to isolate the effects of the app using a difference-in-differences regression.

The research reveals that app adopters buy 21% more often but spend 12% less per purchase occasion and return 73% more often than non-adopters after adoption. Overall, app adoption results in a 24% increase in net monetary value of purchases. In monetary terms, the retailer’s net annual revenue increase due to app launch ranges from $550 million to $890 million. An analysis of app features used by the shoppers reveals the key role of offer- and rewards-related features. Surprisingly, the number of unique app features accessed by the shopper has an inverted U-shaped relationship with shopping outcomes, suggesting managerial caution against “all-in-one” app designs.

Managerial implications

Their study offers five key managerial implications. First, the estimates provide a useful benchmark for managers to evaluate any app introduction decision. Second, their finding that purchase frequency is higher but monetary value per occasion is lower for app adopters suggests that managers should plan for shoppers visiting the physical and online stores more often and spending less on each occasion. These findings imply that the key task of sales associates is to encourage shoppers to visit again.

Third, the finding that app adoption leads to greater product returns exposes managers to a darker side of apps. Managers need to proactively monitor return incidence from app adopters and devise interventions to keep product returns in check. Fourth, the findings on app feature usage suggest that managers need to ensure that interactive features such as redeeming reward points and activating offers are easily accessible. Finally, the authors caution managers against an all-in-one app design and in favor of a more thoughtful combination of app features to avoid information overload. Managers should adapt their mobile app design strategies to their context, including the product category.

Unnati Narang is a Ph.D. student in Marketing and Venkatesh Shankar is Professor of Marketing and Coleman Chair in Marketing and Director of Research, Center for Retailing Studies, at the Mays Business School, Texas A&M University.

Acknowledgments

The authors thank participants at the 2016 Theory and Practice in Marketing (TPM) Conference and the 2016 Professors’ Institute meeting of Marketing EDGE for valuable comments.


1. Introduction

In recent years, the penetration of mobile devices has reached unprecedented levels. By January 2016, 79.1% of the United States’ (U.S.) population, or 198.5 million people, owned a smartphone (comScore 2016a). By the end of 2016, over two billion people worldwide will be smartphone users (eMarketer 2014), and by 2020, more than 70% of the world population will own a smartphone (Ericsson Mobility Report 2016).

Mobile devices play a unique role in influencing shoppers along and beyond their paths to purchase. Mobile devices are interactive, engaging, portable, wireless, location-specific, and personal. As a result, they are uniquely positioned to influence shoppers at various stages of the shopping process – need recognition, information search, alternative evaluation, purchase, and post-purchase (Shankar and Balasubramanian 2009).

Little wonder, then, that mobile marketing is becoming a strategic priority for firms. U.S. firms spent over $28 billion on mobile advertising in 2015 and are projected to double this level by 2018 (eMarketer 2016). Chief Marketing Officers (CMOs) of leading firms already allocate up to 20% of their budget to mobile (Forrester 2016). Terry Lundgren, Macy’s CEO, views mobile as the starting point for shopping: “shoppers are starting the journey with their phone, doing their research. Then they might buy in the store or they’ll buy at Macys.com or Bloomingdales.com” (Peterson 2015). Mobile devices have changed the way people shop, giving rise to the emerging area of mobile shopper marketing and revolutionizing retail (Shankar et al. 2016). More than 80% of U.S. shoppers use a mobile device to shop even within a store (Google M/A/R/C Study 2013). Nearly 70% of Amazon’s customers used mobile to shop in the 2015 holiday season (Eadicicco 2015).


Mobile apps are increasingly dominating mobile device use. Mobile apps account for 87% of mobile usage, which constitutes the bulk of digital media time (comScore 2016b). By 2015, there were over 250 billion app downloads from the App Store and Google Play (Sims 2015). About 20% of all Starbucks transactions originated from its “order and pay” app (Forbes 2015).

Do mobile apps influence shopper behavior? Mobile apps offer informational (e.g., product and store information) and experiential (e.g., offers, loyalty program reward redemption) benefits. These benefits may lead shoppers to purchase more often and spend more money. However, while a mobile app can induce purchases through the app or the mobile web, does it increase overall purchases across all channels, including brick-and-mortar and online? Furthermore, a mobile app can prompt a shopper to act and make a purchase, but such an action can also result in post-purchase regret, leading to higher product returns. Therefore, the net effect of mobile apps on the monetary value of purchases is unclear.

Furthermore, app features, such as product search, store check-in, loyalty programs, and promotional offers, may have specific effects on shopper purchases and returns and help explain the effects of a mobile app on shopping outcomes. The use of a greater number of app features may lead to more purchases and even to more product returns.

Despite the widespread use and the potentially significant effects of mobile apps and their features, the consequences of mobile apps and the effects of app features have been underexplored. A few studies (e.g., Kim et al. 2015; Gill et al. 2016) have considered the effects of mobile apps on loyalty points accrued, purchase intent, website visits, or aggregate purchase amounts. We extend prior research by isolating the effects of mobile app adoption on a richer, managerially important set of outcomes, such as individual purchase incidence and amount and return incidence and amount. Importantly, we explain these effects through app feature-related mechanisms. Specifically, we address three research questions:

• Does mobile app adoption lead to higher or lower incidence and monetary value of purchases and returns?
• What are the sizes of differences in purchases and returns between app adopters and non-adopters?
• What are the effects of the number and type of app features on shopping outcomes, and how do they help explain the overall effects of a mobile app on shopping outcomes?

We address our research questions using a unique dataset from a large omnichannel retailer of video games, consumer electronics, and wireless services. Teasing out the effects of mobile apps on purchase outcomes is complex. A major challenge is endogeneity and self-selection: variables affecting both app adoption and shopping outcomes can confound the estimated effects. We tackle this challenge in two ways by: (a) combining a difference-in-differences method with propensity score matching and a Heckman selection correction, and (b) carrying out a series of robustness tests to rule out alternative explanations.

Our results show that app adopters buy 21% more often but spend 12% less per purchase occasion and return 73% more often than non-adopters in the month after adoption. Overall, app adoption results in a 24% increase in net monetary value of purchases. Surprisingly, the number of unique app features accessed by the shopper has an inverted U-shaped relationship with shopping outcomes, suggesting managerial caution against “all-in-one” app designs.

Our research contributes to the mobile marketing and omnichannel marketing literatures in at least three ways. First, we expand the scope of mobile marketing literature by examining the impact of mobile apps on both purchases and returns across channels. To our knowledge, no other mobile marketing study has examined the net monetary value by accounting for returns.

Second, unlike most prior studies that focus on associations between mobile interventions (e.g., coupons) and purchase outcomes, we isolate the effect of mobile app adoption on purchases and returns. Finally, we examine the effects of a comprehensive set of mobile app features that enable managers to improve resource allocation to the conceptualization, development, launch, and maintenance of mobile apps.

2. Related Literature and Framework for Empirical Analysis

Mobile devices influence shoppers both in- and out-of-store by offering convenient and interactive anytime-anywhere access to relevant information (Shankar et al. 2016). Mobile apps may affect purchases in two major ways. First, mobile apps can provide information benefits at the right time and place to shoppers. Such benefits include product information, product reviews, and store location information (Danaher et al. 2015; Dubé et al. 2015; Fong et al. 2015). Second, mobile apps can offer experiential/interactive benefits through loyalty program use, notifications, offers, and store check-ins (Shankar and Balasubramanian 2009). For example, Starbucks offers consumers, via its mobile app, location-based promotions for declaring their loyalty on social networks, status badges for store check-ins, and a mobile pay option (Andrews et al. 2016a).

Two streams of research are relevant to our research questions. First, the literature on mobile apps has focused on the effects of apps on a few outcomes. Mobile app use improves attitude and purchase intention (Bellman et al. 2011). A brand’s mobile app also promotes visits to its website (Xu et al. 2014), enhances loyalty points accrued (Kim et al. 2015), and can influence purchase probability (Dinner et al. 2015). Furthermore, the use of a mobile app can increase shoppers’ spending in the business-to-consumer (B2C) (Einav et al. 2014) as well as the business-to-business (B2B) (Gill et al. 2016) context. Informational apps are more effective in driving purchase intention than experiential apps (Bellman et al. 2011), and app features, such as information lookup, significantly affect loyalty points accrual (Kim et al. 2015).


Second, studies on mobile device adoption have concentrated on its effects on purchase incidence or monetary value. Wang et al. (2015) examine changes in shopper spending after mobile device adoption and find that purchase incidence and monetary value of purchases increase. In contrast, Lee et al. (2016) find that the monetary value of each purchase is lower for shoppers who transact more using mobile than web. Xu et al. (2016) study the effect of tablet adoption on commerce through smartphones and computers in the online retail context and conclude that commerce through tablets substitutes desktop commerce but complements smartphone commerce.

Our study complements and extends these research streams as shown in Table 1. We study the effects of app adoption on a variety of shopping outcomes: purchase incidence, purchase amount, return incidence, and return amount. In addition, we examine the effects of a comprehensive set of app features on shopping outcomes.

(Table 1 follows References.)

Based on the evidence from these related research streams, we develop a conceptual framework delineating the drivers of shopper decisions and outcomes at various shopping stages. It appears in Figure 1. In this framework, some existing shoppers of the firm adopt the app, while others do not. The app adoption decision depends on shopper demographics, past shopping behavior, and the data connectivity environment. Both adopters and non-adopters of the mobile app make purchases. In addition to app adoption, recency of last purchase, income, and customer tenure drive the incidence and monetary value of purchases. Some shoppers return some of the purchases made. The recency of last purchase, distance to the nearest store, number of stores in the shopper’s zip code, and order size determine the incidence and monetary value of returns.

(Figure 1 follows References.)


3. Data and Research Setting

We collect data from a large U.S.-based retailer of video games, consumer electronics, and wireless services. Our data span July 2014 to June 2015. In addition to transaction-related data from the retailer’s 4,175 stores across the U.S. and its ecommerce website, we have access to data on the mobile app usage of over 32 million customers and members of the retailer’s loyalty program.

The loyalty program accounts for nearly 75 percent of total transactions, reflecting the retailer’s overall customer base. The retailer’s primary channel is its store network; only a small proportion of sales transactions take place through its ecommerce site. We have data on shoppers’ transactions. From these data, we identify the relevant outcomes to map shopper behavior. Purchase and return incidence (whether a shopper makes a purchase or return) and the monetary value of purchases and returns are our key outcome variables. The key variables, their operationalization, and descriptive statistics appear in Table 2. We supplement these data with publicly available region-specific data connectivity information (e.g., number of wireless providers, data speeds) from the U.S. Federal Communications Commission (FCC).

(Table 2 follows References.)

Our focal independent variable is app adoption. The retailer launched its app in July 2014 without any targeted campaign. A subset of shoppers adopted the app over time. The purpose of the app is to allow shoppers to browse the retailer’s catalog of products, get exposure to deals and offers, order online, or locate store information to buy offline. The app allows shoppers to learn about the retailer’s stores, including nearby locations, opening hours, phone numbers, and driving directions. The app does not offer in-app purchases. Web Appendix A provides screenshots from the app.


The mobile app data are organized at the shopper/app session level. A new session is recorded when a user first starts the app or loads the app after not loading it in the previous 15 minutes. For each shopper/app session, the data contain a random session ID and the app features accessed by the shopper. App features capture in-app activities during the session (e.g., browsing the product catalog, clicking offers, and checking reward points).
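To illustrate the 15-minute inactivity rule concretely, the following is a minimal sketch of how sessions could be reconstructed from timestamped app events; the file name and column names (shopper_id, timestamp, feature) are hypothetical and not the retailer’s actual schema.

```python
# Minimal sessionization sketch: a new session starts at a shopper's first event
# or after more than 15 minutes of inactivity. All names are hypothetical.
import pandas as pd

events = pd.read_csv("app_events.csv", parse_dates=["timestamp"])  # shopper_id, timestamp, feature
events = events.sort_values(["shopper_id", "timestamp"])

gap = events.groupby("shopper_id")["timestamp"].diff()
new_session = gap.isna() | (gap > pd.Timedelta(minutes=15))  # first event or >15-minute gap
events["session_id"] = new_session.cumsum()                  # running session counter across shoppers
```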

For the analysis, we divide our data into two periods: a calibration period and an estimation period. We use the three-month period August-October 2014 as the calibration period. The rationale for setting aside this period is to compute shoppers’ past behavioral measures, such as past spending, which help identify similar shoppers for a valid comparison of app adopters and non-adopters. We treat the months starting November 2014 as our estimation period, identifying and using shoppers’ app adoption timing as cut-off points for estimating the effects of adoption.

4. Analyses

4.1. Relationship between App Adoption and Shopping Outcomes: Descriptive Analysis

We define app adopters as shoppers who started using the app for the first time during our data period. Non-adopters are those who did not access the app even once during the study period.

Unlike most prior studies (e.g., Kim et al. 2015; Gill et al. 2016), our data allow us to uniquely identify each shopper’s app adoption date. We draw random samples of app adopters from different periods and compare their pre- and post-adoption outcomes relative to app non-adopters. Our main analysis reports results for app adopters from December 2014.1

We draw a random sample of adopters and non-adopters who have complete demographic information and who made at least one purchase in the calibration period (Xu et al. 2016).

Table 3 reports the mean statistics for 1,629 random app adopters who started using the app on December 1, 2014 and for 7,956 non-adopters. A simple comparison of shopping outcomes shows that the average monetary value of purchases increased by 43.48% ($126.25 to $181.15 per month) for app adopters between the month before and the month after adoption, while it increased by only 25.93% ($46.89 to $59.05 per month) for non-adopters over the same period (p < 0.001). However, the average monetary value of returns for app adopters also increased by 96.09% ($9.46 to $18.55 per month), compared to app non-adopters, who experienced a marginal increase of 0.75% ($4.02 to $4.05 per month) in the same period (p < 0.001). Overall, the net monetary value increased by more than 39.22% for app adopters compared with 28.30% for non-adopters. It is notable that the number of purchase transactions for app adopters increased by over 60%, relative to non-adopters, who experienced only a 16% increase. As a result, the monetary value per incidence declined by 10.33% ($85.66 to $76.81 per month) for app adopters, while it increased for non-adopters by 8.63% ($66.28 to $72 per month). Histograms depicting these model-free data appear in Figure 2. (Table 3 and Figure 2 follow References.)

1 We subsequently performed robustness checks using other samples (see Web Appendix B for details).
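As a point of reference, the model-free comparison above can be reproduced with a short script; this is a minimal sketch assuming a shopper-month table with hypothetical columns (adopter, post, purchase_value, return_value), not the retailer’s actual data.

```python
# Minimal sketch of the model-free pre/post comparison by group.
import pandas as pd

df = pd.read_csv("shopper_month_panel.csv")  # hypothetical shopper-month panel

monthly = df.groupby(["adopter", "post"])[["purchase_value", "return_value"]].mean()
pct_change = (monthly.xs(1, level="post") / monthly.xs(0, level="post") - 1) * 100
print(pct_change)  # percentage change from the pre- to the post-month, by adopter status
```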

4.2. Econometric Model: A Quasi-experimental Approach

To estimate the effect of mobile app adoption, the ideal approach would be to compare the shopping outcomes when shoppers adopt the mobile app to the counterfactual, that is, to outcomes when the same shoppers do not adopt the mobile app. However, because we do not observe the counterfactual (a shopper cannot be both an app adopter and a non-adopter) and because the treatment is not randomly assigned (a shopper self-selects into adopting the app), we develop a quasi-experimental design to replicate the ideal experimental scenario under reasonable assumptions (Campbell and Stanley 1963). Specifically, we employ a difference-in-differences (DIFF-IN-DIFF) approach to compare pre- and post-adoption outcomes for app adopters and similar non-adopters. After specifying the baseline difference-in-differences regression model, we outline, in the next section, our strategy to address the endogeneity of treatment using propensity score matching (Rosenbaum and Rubin 1983) and Heckman correction procedures (Heckman 1979). We rule out several competing explanations for our results in the robustness checks. The complete list of analyses is laid out in Table 4.

(Table 4 follows References.)

Baseline Difference-in-Differences Model

We adopt a difference-in-differences approach to compare the change in outcomes for the app adopters one month before and one month after app adoption to the change in outcomes for the non-adopters over the same time period.2 Formally, our baseline difference-in-differences model can be specified as a two-period linear regression model:

\[ Y_{it} = \alpha_0 + \alpha_1 A_i + \alpha_2 P_t + \alpha_3 A_i P_t + \epsilon_{it} \quad (1) \]

where i is the individual, t is the month, Y is the outcome variable (number and monetary value of purchases and returns), A is a dummy variable denoting treatment (1 if shopper i is an app adopter and 0 otherwise), P is a dummy variable denoting the period (1 for the period after the app has been downloaded and 0 otherwise), α = (α_0, α_1, α_2, α_3) is a coefficient vector, and ε_it is an error term. The coefficient α_3 of A_i P_t (TREAT * POST) identifies the treatment effect.
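As an illustration, the two-period specification in equation (1) could be estimated in a few lines of code; this is a minimal sketch assuming a shopper-month panel with hypothetical column names (purchase_value, adopter, post), not the study’s actual data.

```python
# Minimal difference-in-differences sketch for equation (1); the interaction
# coefficient (adopter:post) corresponds to the treatment effect alpha_3.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("shopper_month_panel.csv")  # hypothetical two-period shopper panel

did = smf.ols("purchase_value ~ adopter * post", data=df).fit()
print(did.params["adopter:post"])  # estimated treatment effect
```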

The underlying identification strategy in the DIFF-IN-DIFF approach is that the change in outcomes observed in the non-adopter group offers a good counterfactual for the change in outcomes that would have been observed in the adopter group in the absence of app adoption.

The validity of this assumption relies on (a) the similarity between the app adopters and non-adopters along their observed and unobserved characteristics (including, for example, common trends in outcomes in the pre-treatment periods), and (b) the absence of any idiosyncratic shock to either group in the study period (e.g., no unique marketing promotions should have been sent to one group and not to the other). In our case, assumption (b) holds since there were no unique shocks to either group in the data period. In the absence of natural randomization, we ensure the validity of assumption (a) by employing matching estimates and the Heckman two-step correction process, carefully observing the balance between the two groups through several checks.

2 We subsequently also examined alternative time periods in our (a) comparisons of 15-, 45- and 60-day pre- and post-adoption outcomes (Web Appendix B), and (b) robustness check of app use vs. non-use for extended periods (Table 8).

4.3. Endogeneity and Self-selection

In the absence of randomization in an observational study, endogeneity of treatment becomes a major challenge in estimating the causal effects. Two sources of endogeneity exist in our setting.

First, omitted variables can affect both app adoption and shopping behavior. Consider customers who are gaming and technology enthusiasts. It is possible that they are more interested in the video game product category and therefore purchase gaming-related products. They may also be spending more time on their mobile devices, including exploring the available apps. As a result, their likelihood of adopting the app is higher. Similarly, it is possible that they read app reviews and technology columns in the media and become more aware of app functionalities. In such cases, both game purchases and app adoption are likely, but the effect may not necessarily be causal. Second, mobile app usage and purchase transactions may occur together. Imagine a customer using a mobile device while purchasing at a store or on a website. Due to the simultaneous occurrence, it is difficult to tease out causality. This issue can result in endogeneity from simultaneity (Wooldridge 2002).

Our quasi-experimental research design combined with a series of robustness and falsification checks allows us to address the endogeneity concern.

4.3.1. Selection on Observables: Matching Estimates


Propensity score matching allows us to match app adopters and non-adopters on observed demographic and behavioral covariates, while tackling the curse of dimensionality. Underlying propensity score matching is the idea of conceptualizing “the observational data set as having arisen from a complex randomized experiment, where the rules used to assign the treatment condition have been lost and must be reconstructed” (Rubin 2008; Guo and Fraser 2014).

We begin by calculating each shopper’s propensity score, which is defined as the shopper’s probability of adopting the app. We do this using a binomial logit model.3 Next, we identify non-adopters similar to adopters based on the estimated propensity scores to create a control group. This approach is in line with Rosenbaum and Rubin’s approach to create a control group “that is similar to a treated group with respect to the distribution of observed covariates” (Rosenbaum and Rubin 1983). We match each app adopter to a non-adopter based on 1:1 nearest neighbor matching without replacement.4 Formally, if P(Xi) is individual i’s propensity score, the treated individual i is matched to the control individual j that minimizes ||P(Xi) – P(Xj)||, creating matched pairs that are closest to each other (Wangenheim and Bayón 2007; Huang et al. 2012).

What factors explain the decision to adopt the app? Consistent with extant literature (Hung et al. 2003; Kim et al. 2015), we model app adoption as dependent on individual shopper demographics (e.g., age, gender), behavioral measures (e.g., past spend, past returns, past online buying), and other related measures (e.g., distance to the nearest store, number of stores in the shopper’s zip code, presence of competitor stores, loyalty program membership level on the adoption day) that are likely to influence shoppers.

\[ U_i = \gamma' D_i + \delta' D_i^2 + \varepsilon_i \quad (2) \]

\[ \Pr(A_i = 1) = \frac{\exp(\gamma' D_i + \delta' D_i^2)}{1 + \exp(\gamma' D_i + \delta' D_i^2)} \quad (3) \]

where i is the customer, U is the utility from app adoption, D is a vector of covariates, (γ, δ) is a coefficient vector, and ε is an error term, distributed as double exponential. We also include squared terms of the covariates to allow for nonlinear relationships and for improved model fit (Huang et al. 2012).

3 We also estimated propensity scores using a probit model and found no significant difference in our results.
4 We present alternative matching estimates, including caliper and Mahalanobis metric matching, in Web Appendix C.
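The propensity score and matching steps described above can be sketched as follows; this is a minimal illustration under assumed, hypothetical covariate names rather than the exact specification in equations (2)-(3).

```python
# Minimal sketch: logit propensity scores followed by greedy 1:1 nearest-neighbor
# matching without replacement. Covariate and column names are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("shopper_level_covariates.csv")   # hypothetical shopper-level table with an 'adopter' flag

covs = ["age", "past_spend", "past_returns", "past_online_buys",
        "store_distance", "stores_in_zip"]          # assumed covariates
X = df[covs].copy()
for c in covs:
    X[c + "_sq"] = X[c] ** 2                        # squared terms, as in the adoption model
X = sm.add_constant(X)

pscore = sm.Logit(df["adopter"], X).fit().predict(X)  # estimated propensity scores P(X_i)

treated = df.index[df["adopter"] == 1]
controls = set(df.index[df["adopter"] == 0])
pairs = []
for i in treated:                                   # greedy nearest-neighbor match
    j = min(controls, key=lambda c: abs(pscore[i] - pscore[c]))
    pairs.append((i, j))
    controls.remove(j)                              # without replacement
```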

We conduct a series of statistical analyses to test the goodness of our propensity score matches, including the Kolmogorov-Smirnov test, Standardized Bias Reduction, and Rosenbaum’s Hidden Bias Sensitivity test (Rosenbaum 2005). The tests show that the match balance between adopters and non-adopters improved significantly after matching and that there was no concern about sensitivity of outcomes to hidden bias. The detailed results of these checks as well as alternative matching methods are reported in Web Appendix C.

4.3.2. Selection on Unobservables: Heckman Correction

To more formally account for the non-randomness of the app adoption due to unobserved factors, we use a two-stage Heckman correction procedure (Heckman 1979). In general, the rich set of demographic and behavioral covariates used for matching and the common trends between treated and control outcomes in the pre-treatment period should offer convincing evidence that the groups are comparable. However, we further test for any unobserved confounders through the Heckman procedure (Gill et al. 2016). In the first stage, we model the choice to adopt the app using a probit model. For identification, we require an exclusion restriction that affects the decision of the shopper to adopt the app without affecting the shopping outcomes. We identify three such exclusion restrictions that relate to data connectivity and technology environment:

1. Local wireless network access, operationalized as the proportion of the population in the shoppers’ counties with access to four or more wireless providers, will likely affect the shoppers’ mobile usage patterns and, hence, the shoppers’ probability of downloading apps. If there is greater access to wireless networks, shoppers are likely to engage in more mobile use and more app download activity, regardless of their intrinsic preference for a specific firm. At the same time, wireless network access is not likely to affect purchases from any one retailer in particular, especially in stores, which serve as the primary channel for the retailer in this setting. This measure serves as a proxy for unobserved endogenous firm preference by making app download a function of exogenous network access.

2. Symmetric upload and download speeds, operationalized as the percentage of the population in the shoppers’ state with symmetric Digital Subscriber Lines (DSL) (same download and upload speed) relative to asymmetric DSL (higher download speeds than upload), may lead to low adoption of apps in those regions due to slower downloads, without affecting purchases. This measure serves as a proxy for unobserved endogenous firm preference by making app downloading a function of exogenous network speeds.

3. Online purchases in the calibration period, operationalized as whether or not the shoppers used the online channel to make at least one purchase, are likely to affect the shoppers’ perceived value of the mobile app but may not influence how they buy across channels. On the one hand, it may be valuable for online buyers to adopt the app to augment their experience. On the other hand, since the app does not allow direct purchase, it may be perceived as less valuable by those who already buy online. This measure serves as a proxy for shoppers’ tech-savviness.

These exclusion restrictions result in the following first-stage selection equation for modeling Ai, the probability of shopper i adopting the app.

\[ A_i^* = \lambda_1\, \text{WIRENET}_i + \lambda_2\, \text{SYMSPEED}_i + \lambda_3\, \text{ONLINEBUYER}_i + \lambda_4' Q_i + \eta_i, \qquad A_i = 1 \text{ if } A_i^* > 0 \quad (4) \]

where WIRENET is local wireless network access, SYMSPEED is the symmetry of network speed, ONLINEBUYER is a dummy indicating prior online purchases, Q is a vector of other covariates, λ = (λ_1, λ_2, λ_3, λ_4) is a coefficient vector, and η is an error term. We compute the inverse Mills ratio from this probit regression. In the second stage, we augment the difference-in-differences model by including the inverse Mills ratio as an additional covariate. In a further robustness check for selection due to unobservables, we use future app adopters to identify a similar control group for the current app adopters. The premise for doing this is that there should not be unobserved differences among app adopters who adopt the app at different points in time. Thus, the DIFF-IN-DIFF estimating the treatment effects for current app adopter (treated) cohorts relative to future app adopters (control) across the same time period acts as a falsification test (Manchanda et al. 2015) and provides consistent results (Web Appendix D).
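The two-step correction could be implemented along the following lines; this is a minimal sketch with hypothetical column names (wirenet, symspeed, onlinebuyer) and a standard control-function construction of the correction term, which may differ in detail from the paper’s exact implementation.

```python
# Minimal two-step Heckman-style sketch: first-stage probit for adoption with the
# exclusion restrictions, then the DIFF-IN-DIFF augmented with the inverse Mills ratio.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import norm

df = pd.read_csv("shopper_month_panel.csv")  # hypothetical two-period shopper panel

# Stage 1: probit selection equation (covariate names are hypothetical).
sel = smf.probit("adopter ~ wirenet + symspeed + onlinebuyer + age + past_spend",
                 data=df).fit()
xb = sel.fittedvalues                        # linear index from the probit

# Correction term: inverse Mills ratio for adopters, its analogue for non-adopters.
df["imr"] = np.where(df["adopter"] == 1,
                     norm.pdf(xb) / norm.cdf(xb),
                     -norm.pdf(xb) / (1 - norm.cdf(xb)))

# Stage 2: difference-in-differences augmented with the correction term.
out = smf.ols("purchase_value ~ adopter * post + imr", data=df).fit()
print(out.params[["adopter:post", "imr"]])
```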

4.4. Decomposing Treatment Effects: Exponential Type II Tobit Model

While the difference-in-differences model provides estimates for the aggregate effects, it is not informative about the source of the effects. Do app adopters buy relatively more or less frequently or do they buy more or less whenever they decide to buy? How do the monetary values of purchases and returns vary conditional on the decision to purchase or return?

We jointly model the incidence (whether or not to purchase or return) and the monetary value of purchases and returns. We use an exponential Type II Tobit for the following reasons: (1) we have a censored model with a mass point at zero, (2) our outcome of interest is an empirical counterpart of a latent variable (utility from purchase or return), and (3) we want our fitted values to remain in the range of the LDV (limited dependent variable), which in this case is non-negative for monetary values, and within 0 and 1 for incidence. In the first stage of our model, we specify a probit model for the binary outcome of whether a shopper purchases (returns) in a given period. In the second stage, we model the monetary value of purchases (returns) per occasion (Wooldridge 2002).


We now describe our model setup. A shopper i chooses whether to make a purchase or not at time t. If the shopper’s expected utility from purchasing is greater than zero, we expect the purchase incidence to be positive for that time period. The latent utility depends on mobile app adoption and other covariates.5 Mobile app adoption in our sample is exogenous to outcomes conditional on propensity scores. In other words, we use the propensity score matched samples to estimate the Tobit models.

Purchase Incidence: Let h_it, the purchase incidence of shopper i in period t, be given by:

\[ h_{it} = 1 \text{ if } h_{it}^* = \theta' X_{it} + \varepsilon_{it}^{PI} > 0, \qquad h_{it} = 0 \text{ otherwise} \quad (5) \]

where h_it* is the latent utility of purchasing; X is a vector of covariates, θ is a coefficient vector, and ε^PI is an error term. The probability of the ith shopper making a purchase at time t is:

\[ \Pr(h_{it} = 1) = \Phi\left( \frac{\theta' X_{it}}{\sigma_{PI}} \right) \quad (6) \]

where Φ(·) is the standard normal cumulative distribution function and σ_PI is the standard deviation of the error term ε^PI. We next create our conditional likelihood function across t periods for any individual i to apply the Maximum Likelihood Estimator (MLE):

\[ L_i^{PI} = \prod_{t} \Pr(h_{it} = 1)^{h_{it}} \left[ 1 - \Pr(h_{it} = 1) \right]^{1 - h_{it}} \quad (7) \]

Monetary Value of Purchases: Let m_it, the monetary value of purchases per purchase occasion for shopper i in period t, be given by:

\[ \ln(m_{it}) = \beta' V_{it} + \varepsilon_{it}^{PA} \quad (8) \]

where ln(m_it) is the log of the monetary value of purchases per purchase occasion for shopper i in time t. We observe it when a shopper makes a purchase.6 β is a coefficient vector and ε^PA is an error term. The purchase incidence and monetary value equations form an exponential Type II Tobit model.

5 As a robustness check, we estimated the Tobit models without the covariates for purchase and return amounts, and found consistent estimates for the treatment effect.

Return Incidence: Let r_it, the return incidence of shopper i in period t, be given by:

\[ r_{it} = 1 \text{ if } r_{it}^* = \psi' Z_{it} + \varepsilon_{it}^{RI} > 0, \qquad r_{it} = 0 \text{ otherwise} \quad (9) \]

where r_it* is the latent utility of returning, Z is a vector of covariates, ε^RI is an error term, and the other terms are as defined earlier. The probability of the ith shopper making a return at time t is:

\[ \Pr(r_{it} = 1) = \Phi\left( \frac{\psi' Z_{it}}{\sigma_{RI}} \right) \quad (10) \]

where σ_RI is the standard deviation of the error term ε^RI. The conditional likelihood function for t periods for any individual i conditional on Z_it is:

\[ L_i^{RI} = \prod_{t} \Pr(r_{it} = 1)^{r_{it}} \left[ 1 - \Pr(r_{it} = 1) \right]^{1 - r_{it}} \quad (11) \]

Monetary Value of Returns: Let n_it, the monetary value of returns per return occasion for shopper i in period t, be given by:

\[ \ln(n_{it}) = \omega' W_{it} + \varepsilon_{it}^{RA} \quad (12) \]

where ln(n_it) is the log of the monetary value of returns per occasion for shopper i in time t. We observe it when a shopper makes a return. W is a vector of covariates and ε^RA is an error term. Return incidence and monetary value form an exponential Type II Tobit model. We assume that the errors in the Tobit models are normally distributed: ε^PI ~ N(0, σ_PI²), ε^PA ~ N(0, σ_PA²), ε^RI ~ N(0, σ_RI²), and ε^RA ~ N(0, σ_RA²).

6 V is a vector of shopper covariates, such as income (proxied by average monthly past spending) and tenure (time elapsed since becoming a customer), denoting the income effect on spending and the experience effect on spending, respectively (Thaler 1990; Bolton 1998). Prior research (e.g., Ailawadi and Neslin 1998; Kushwaha et al. 2015) shows that the monetary value of purchases also depends on the inventory effect, modeled as the durability of the product category last purchased. In our context, it would make sense to model video game console buyers differently from games-only buyers because, unlike games, consoles are infrequent and high-value purchases. We use the data in the calibration period to classify shoppers into console-only buyers, game-only buyers, and buyers of both categories. We find that less than 1% of our sample bought a console. Moreover, 99% of the video game buyers are game-only buyers, possibly because they already own a console. Furthermore, there is no difference in the percentage of multiple-category shoppers between the treatment and control groups; 30% of shoppers in both groups are multi-category buyers. Therefore, we do not control for the inventory effect.
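As a simplified illustration of the purchase-side model, the sketch below estimates a two-part (Cragg 1971-style) version with independent errors: a probit for incidence (equations 5-7) and an OLS regression on the log monetary value per occasion conditional on purchase (equation 8). Column names are hypothetical, and this is not the paper’s exact joint maximum likelihood estimator.

```python
# Minimal two-part sketch of the purchase equations under independent errors.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("matched_shopper_month_panel.csv")  # hypothetical matched panel

# Stage 1: purchase incidence (probit), equations (5)-(7).
inc = smf.probit("purchase_inc ~ adopter * post + recency + income + tenure",
                 data=panel).fit()

# Stage 2: log monetary value per purchase occasion, equation (8),
# estimated on shopper-months with at least one purchase.
buyers = panel[panel["purchase_inc"] == 1].copy()
buyers["log_value"] = np.log(buyers["purchase_value"])
amt = smf.ols("log_value ~ adopter * post + income + tenure", data=buyers).fit()

print(inc.params["adopter:post"], amt.params["adopter:post"])
```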

A summary of the covariates and their likely relationships with shopping outcomes appears in Table 6. The table also lists the covariate notations and the relevant supporting research.

(Table 6 follows References.)

5. Results and Robustness Checks

5.1. Results

The results from the baseline difference-in-differences model in Panel (A) of Table 5 show a positive and significant effect of app adoption on the incidence and monetary value of purchases and returns (p < 0.001). App adopters spend $42.73 more than non-adopters in the month after adoption and engage in a higher number of purchases (α3=0.772, p < 0.001). Interestingly, relative to non-adopters, app adopters also return $9.06 more worth of products and make a greater number of returns (α3=0.133, p < 0.001) each month.

Panel (B) of Table 5 refines these estimates by controlling for self-selection and ensuring that the two groups, app adopters and non-adopters, are comparable, based on a propensity score matching model with 1:1 nearest neighbor matching without replacement. App adoption has a positive and significant effect on shopper outcomes, including the number and monetary value of purchases and returns (p < 0.001). Relative to the baseline model, the coefficients reflect a higher positive significant effect of app adoption for the matched sample. App adopters spend $47.91 more than non-adopters in the month after adoption and buy a greater number of times (α3=0.869, p < 0.001). Relative to non-adopters, app adopters also return $10.76 more worth of products and return products a greater number of times (α3=0.144, p < 0.001) each month. Overall, app adoption leads to a higher net monetary value of $37.15 (p < 0.001).

The results hold even after we include the Heckman correction term in the model (see Panel (C) of Table 5). We do not find evidence for selection on unobservables, as the coefficient of the inverse Mills ratio is insignificant (p > 0.4). The detailed results of the first stage probit model appear in Web Appendix D. Histograms depicting the differences in monetary value of purchases and returns for the matched samples appear in Figure 3. The pattern is similar to that in Figure 2.

(Figure 3 follows References.)

Histograms showing the distribution of propensity scores for the treated and the control group before and after matching appear in Figure 4. Matching on propensity scores improves the percentage balance of propensity scores by 99.32%, making the matched treated and control groups comparable.7 (Figure 4 follows References.)

How do the monetary values of purchases and returns vary conditional on the decision to purchase or return? The results of the exponential Type II Tobit model (Table 7) provide rich insights. Interestingly, while app adopters are likely to buy 21% more often, the effect on the monetary value of purchases per purchase occasion is negative and significant (p < 0.05) relative to non-adopters. The magnitude is close to 88%, implying that adopters in the post-period spend 12% less per purchase occasion than they would have in the absence of the app, relative to the pre-period. From the purchase models in Panel (A) of Table 7, we further note that recency negatively influences purchase incidence (p < 0.001), and income and tenure positively influence the monetary value of purchases per occasion (p < 0.05). (Table 7 follows References.)

7 Appendix C reports the evidence of similarity of the two groups in the mean values of each observed covariate before and after matching, additional checks, and the results of the logit model used to compute propensity scores.

From the exponential Type II Tobit returns model in panel (B) of Table 7, we observe a negative effect of recency on return incidence (p < 0.001). Distance to the nearest store and the number of stores in the shoppers’ zip codes do not have significant effects (p > 0.10). Our key finding from this model is that relative to non-adopters, app adopters are 73% more likely to return products (p < 0.001). Conditional on return incidence, however, there is no significant effect on the monetary value of returns per occasion (p > 0.10). We explain the intuition and possible mechanisms for these results in Section 6.

5.2. Checking Robustness and Ruling out Alternative Explanations

We perform several robustness checks and tests to rule out alternative explanations for the effect of app adoption on purchases and returns. A summary appears in Table 8, following References.

5.2.1. Alternative measures of app adoption: To rule out idiosyncrasy in the app adoption measure, we estimated an alternative model with a more nuanced measure of app adoption. In our main model, we compared app adopters and non-adopters along their outcomes one month before and after app download. In the alternative model, we apply a fine-grained measure of mobile app adoption based on app usage. In this test, we create a new quasi-experiment. We focus on only app adopters in the post-adoption period. We segment adopters into app users and non-users. A user is any app adopter who logs into the app at least once in a given month; thus, a user in one month could be a non-user in the next month. This process acts as a robustness check in that it shows that the effects are not driven by individual characteristics over time (Xu et al. 2016). Column (A) of Table 8 reports the results from the comparison of app users and non-users. Further, we re-estimate this model with propensity score matching, by matching users and non-users for each month’s activity dynamically. We report the estimates from four different propensity score matches, those for users and non-users based on usage status in the four months from January to April 2015, in Web Appendix B; the outcomes are measured from December 2014 to June 2015. The effects are robust and consistent with previous findings.

5.2.2. Outliers: Another possible explanation for the effects could be outliers, such as the top spenders (and not the average shopper). We test this possible explanation by first removing the top spenders from our sample (those whose spending in the pre-period exceeds the mean plus two standard deviations) and then carrying out the propensity score matching and difference-in-differences methods as done earlier. Our estimates are robust, as shown in column (B) of Table 8. We also test robustness to outliers based on spending in the calibration period and find consistent results.

5.2.3. Shopper heterogeneity: While we match shoppers on a broad set of covariates, one possible untested alternative explanation is that the effects are driven largely by already deal-prone shoppers and not by the use of the app. In general, marketing promotions and offers are not a threat to our difference-in-differences estimates because the retailer did not send any unique offers to mobile app adopters that the non-adopters did not receive, or vice-versa. We do not expect deals to affect one set of shoppers idiosyncratically. Yet, to rule out the possibility that actual redemption or use of offers prior to adoption could influence the two groups differently, we repeat the analyses after removing deal-prone customers. Column (C) of Table 8 reports the estimates for the sample that did not use deals in the pre-period. We find robust estimates. We also test robustness to deal-proneness based on offer use in the calibration period and find consistent results. This finding is managerially significant because app adopters do not buy more simply because they are sensitive to deals; they buy more even otherwise. However, we do notice that the percentage of instances of deal usage among shoppers after app adoption increases for app adopters (from 6% to 50% of shoppers using deals) while remaining stagnant for non-adopters, suggesting goal-switching effects of app usage (Shankar et al. 2016).

5.2.4. Alternative matching methods: Our main analysis relies on the commonly used 1:1 nearest neighbor matching algorithm. In addition, we use the Mahalanobis metric and a refined caliper matching approach by defining the bandwidth within which to identify matched control units (Silverman 1986). We test using a bandwidth of 0.16 times the standard deviation of the propensity scores, in line with the Silverman rule of thumb. To enhance support for our matches, we also adopt a trimming approach, in which we drop the observations whose propensity score is smaller than the minimum and larger than the maximum in the opposite group (Caliendo and Kopeinig 2005). Web Appendix C reports the results for these alternative matched samples. We find that these estimates are consistent with those from our proposed method.
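The caliper and trimming variants could be sketched as follows, reusing the propensity scores (pscore), matched pairs (pairs), and shopper table (df) from the earlier matching sketch; the details are assumptions rather than the paper’s exact implementation.

```python
# Minimal sketch of the caliper and common-support trimming robustness variants.
caliper = 0.16 * pscore.std()  # rule-of-thumb bandwidth: 0.16 x std of propensity scores

# Keep only matched pairs whose propensity scores differ by no more than the caliper.
pairs_caliper = [(i, j) for (i, j) in pairs if abs(pscore[i] - pscore[j]) <= caliper]

# Trim observations outside the region of common support before re-matching.
lo = max(pscore[df["adopter"] == 1].min(), pscore[df["adopter"] == 0].min())
hi = min(pscore[df["adopter"] == 1].max(), pscore[df["adopter"] == 0].max())
common_support = df[(pscore >= lo) & (pscore <= hi)]
```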

5.2.5. Alternative samples: Our main analysis reports the results for a sample of 3,258 shoppers; the treatment group for this random sample comprises those who started using the mobile app on December 1, 2014. To verify that the results are generalizable to other samples, we replicated the analyses for two different types of samples: (a) a random sample selected on a different date, February 1, 2015 (column D of Table 8) and (b) a random sample selected from each month for a period of four months during (February-May) 2015 (see Web Appendix B for details). In (b), the treatment group users could have started using the app on any date in that month. We treat the month of adoption as part of the post-treatment period, similar to other studies (Xu et al. 2016). Our results are consistent across four such samples of 11,380 shoppers.


5.2.6. App novelty effect: An alternative explanation for the increased net monetary value of purchases after app adoption could be the novelty of the app. It is possible that the app triggers a heightened shopping response only due to a temporary novelty effect that fades after a few days.

To test this explanation, we re-estimate our models using extended windows of time, that is, 45 days and 60 days, instead of one month. The effects of the app persist in these varying windows of time, and in fact seem to increase over time (Table B4 in Web Appendix B).

5.2.7. Future adopters as control group: Column E of Table 8 shows the results of an alternative DIFF-IN-DIFF model that uses future app adopters as a control group for current adopters. The results are substantively similar.

6. Mechanisms Explaining App Adoption Effects: App Features and Usage Patterns

Our findings provide robust evidence for the influence of mobile app adoption on shopper behavior. We find that app adopters buy 21% more often but spend 12% less per purchase occasion and return 73% more often than non-adopters in the month after adoption. Overall, app adoption results in a 24% increase in net monetary value of purchases.

What mechanisms underlie higher purchases and returns due to app adoption? App adopters’ use of app features may help answer this question. An investigation into the use of app features shows that the two most commonly used features, the offer feature (e.g., clicking current deals on products) and the loyalty reward feature (e.g., checking loyalty points), could potentially explain app adoption effects on shopping outcomes. These features primarily involve an experiential outcome via interactivity (Bellman et al. 2011). The interactivity of mobile devices is characterized by the control that users have over the device and the notion of presence, that is, the ability to experience an environment closely through the technology. Activities like redeeming reward points and activating offers will likely lead to greater engagement and spending by shoppers (Kim et al. 2015). Over 90% of app adopters who accessed the app use its interactive features.

An analysis of app adopters’ use of the offer and loyalty reward features is noteworthy because it helps explain our key finding of lower monetary value per purchase occasion. The descriptive statistics pertaining to the use of the offer and loyalty features before and after app adoption appear in Tables 9 and 10, respectively. From these tables, we examine the differences in shopping outcomes for app adopters who access the offer and loyalty features versus those who do not. As expected, the value of each purchase for users of these features falls by about 16-20% per purchase occasion between the pre- and post-period, while remaining virtually the same for those who do not use such features. Furthermore, in the data, shoppers who use the mobile app show increasing instances of offer usage post adoption, from 6% of shoppers using offers in the pre-adoption period to over 50% in the post-adoption period. To verify the mechanism of offer exposure through the app, we further examined the nature of app usage by shoppers on the day they make a purchase and one day before. Indeed, out of 1,712 transactions made by app users in the post-adoption period, over 42% were triggered after the use of offer-related features and 53% after the use of loyalty rewards-related features. (Tables 9 and 10 follow References.)

If increased exposure to offers and rewards is indeed one of the mechanisms for higher purchase incidence and lower purchase values, we can readily explain higher returns for app adopters. Higher incidence of returns due to app adoption could result from three related reasons. First, buying decisions made at the lure of an offer by app adopters may lead to post-purchase disutility and returns. Second, higher return incidence could result from exposure to disconfirming information after the shopper has made the purchase, because app users are likely to get exposed to negative information about the product through reviews. Our conversations with managers from the retail company confirmed this insight; the executives revealed that when social media opinion leaders and influencers share negative reviews about a video game, even if the game received positive reactions and pre-orders in the pre-release period, it is common to see spikes in return incidence among buyers. Third, app adopters may be engaging in returns more often simply because they become less inhibited after using the app; some studies indicate that the lack of social cues in electronic device-mediated communications prompts individuals to become less inhibited (Sproull and Kiesler 1986).

Finally, contrary to intuition, shoppers who access a higher number of unique features do not always buy more. In fact, beyond a point, their purchase and return outcomes weaken. Figure 5 demonstrates this inverted U-shaped relationship through a scatter plot of users’ average number of unique features accessed in the app against their shopping outcomes. Shoppers who use a very high number of features in the app may experience disutility from information overload and an increased focus and attention on the device itself, rather than on actively thinking about a purchase and taking action. (Figure 5 follows References.)
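One simple way to probe this pattern, beyond the scatter plot in Figure 5, is to regress an outcome on the number of unique features and its square: an inverted U implies a positive linear and a negative quadratic coefficient. The sketch below assumes a hypothetical user-level table (users) with columns purchase_value and n_unique_features.

```python
# Minimal sketch of a quadratic check for the inverted U-shaped relationship.
import statsmodels.formula.api as smf

quad = smf.ols("purchase_value ~ n_unique_features + I(n_unique_features ** 2)",
               data=users).fit()
print(quad.params[["n_unique_features", "I(n_unique_features ** 2)"]])
```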

7. Managerial Implications

Our results offer several key managerial implications. First, based on the difference in net monetary value of purchases due to app adoption from the propensity score matched DIFF-IN-DIFF model ($37) and the most conservative estimate from the robustness checks ($23), across the retailer’s two million adopters, we estimate the retailer’s net annual revenue increase due to app launch to range from $550 million to $890 million. This estimate provides a useful benchmark for managers to evaluate any app introduction decision. This estimate is likely to be higher if the retailer can convince more shoppers from its 32 million shopper base to adopt its mobile app.
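For transparency, the range above is consistent with annualizing the monthly per-adopter estimates over 12 months and scaling by the two million adopters, assuming the monthly effect persists throughout the year:

\[ \$23 \times 12 \times 2{,}000{,}000 \approx \$552 \text{ million}; \qquad \$37 \times 12 \times 2{,}000{,}000 \approx \$888 \text{ million} \]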

Second, the finding that purchase frequency (monetary value of purchases per occasion) is higher (lower) for adopters than for non-adopters suggests that managers should plan for shoppers visiting the physical and online stores more often and spending less on each occasion. These findings also suggest reduced interaction of store associates and online agents with shoppers on a given store visit, so the key task of associates is to encourage shoppers to visit again.

Third, the finding that app adoption leads to greater product returns exposes managers to a darker side of apps. Managers need to proactively monitor return incidence from app adopters and devise interventions to keep product returns in check. To the extent that some of the returns are due to the gap between the expected and the actual product delivered, managers can minimize the gap by offering clearer pictures, videos, and descriptions of the products in the app.

Fourth, the findings on the role of offers and reward features indicate the importance of dynamic experiential content in the app that provides additional value to shoppers. To promote engagement, managers need to ensure that interactive features such as redeeming reward points and activating offers are easily accessible.

Finally, we caution managers against an all-in-one app design and in favor of a more thoughtful combination of app features to avoid information overload. In doing so, managers should adapt their mobile app design strategies to their context, including the product category.

8. Conclusion, Limitations, and Extension

We addressed our three research questions rigorously using two complementary research designs: a difference-in-differences method and the exponential Type II Tobit model. We tested a variety of alternative explanations that could contaminate our estimates. Our findings are threefold. First, mobile app adoption leads to higher purchase incidence, return incidence, and net monetary value of purchases. Second, app adopters buy 21% more often but spend 12% less per purchase occasion and return 73% more often than non-adopters in the month after adoption. Overall, app adoption results in a 24% increase in net monetary value of purchases. Third, experiential app features like promotional offers and loyalty rewards significantly affect shopping outcomes, and the number of unique app features accessed by the shopper has an inverted U-shaped relationship with shopping outcomes.

We tested our results for several alternative explanations including shopper heterogeneity in deal-proneness. The robustness of our estimates to pre-adoption deal-proneness of shoppers shows that mobile apps influence non deal-prone shoppers as well. However, once they adopt the app, exposure to offers and reward features plays an instrumental role in driving app adopters’ shopping outcomes.

Although our research is the first to quantify the effects of mobile apps on a broad range of shopping outcomes, including returns, it has some limitations that future research can address.

First, our research relies on a quasi-experimental approach. The gold standard for causal inference is randomized field experiments. Randomized field experiments, if feasible, could provide future researchers with unique opportunities for testing specific app-related manipulations. Second, while we have examined the net monetary value of purchases, our data do not contain cost information. If cost data are available, it would be interesting to study the effect of app adoption and use on customer lifetime value. Third, we have data from only one retailer across channels. Future studies on mobile apps can examine data on multiple retailers to map shoppers’ brand loyalty and preference resulting from app adoption. Likewise, future studies can examine these research questions in the context of other retailer types, such as pure-play retailers with a growing bricks-and-mortar presence (e.g., Warby Parker, Bonobos). Such a setting could also offer interesting comparative insights on the effects of mobile apps on shopping outcomes in different channels. Finally, there is immense potential to continue to uncover the mechanisms underlying engagement with and use of mobile apps. Furthermore, what marketing mix strategies should firms adopt to improve adoption of and engagement through apps? These are ripe areas for future investigation.


References

Ailawadi KL, Neslin SA (1998) The effect of promotion on consumption: Buying more and consuming it faster. J. Marketing Res. 35(3):390–398.
Anderson E, Hansen K, Simester D (2009) The option value of returns: Theory and empirical evidence. Marketing Sci. 28(3):405–423.
Andrews M, Goehring J, Hui S, Pancras J, Thornswood L (2016a) Mobile promotions: A framework and research priorities. J. Interactive Marketing 34:15–24.
Andrews M, Luo X, Fang Z, Ghose A (2016b) Mobile ad effectiveness: Hyper-contextual targeting with crowdedness. Marketing Sci. 35(2):218–233.
Bellman S, Potter RF, Treleaven-Hassard S, Robinson JA, Varan D (2011) The effectiveness of branded mobile phone apps. J. Interactive Marketing 25(4):191–200.
Bolton RN (1998) A dynamic model of the duration of the customer’s relationship with a continuous service provider: The role of satisfaction. Marketing Sci. 17(1):45–65.
Caliendo M, Kopeinig S (2005) Some practical guidance for the implementation of propensity score matching. IZA Discussion Paper No. 1588.
Campbell D, Stanley JC (1963) Experimental and Quasi-Experimental Designs for Research (Houghton Mifflin, Boston).
comScore (2016a) comScore reports January 2016 US smartphone subscriber market share. comScore. Accessed July 12, 2016, http://tinyurl.com/gq3x5s5
comScore (2016b) The 2016 US mobile app report. comScore. Accessed October 20, 2016, http://tinyurl.com/j43tjyw
Cragg JG (1971) Some statistical models for limited dependent variables with application to the demand for durable goods. Econometrica 39(5):829–844.
Danaher PJ, Smith MS, Ranasinghe K, Danaher TS (2015) Where, when, and how long: Factors that influence the redemption of mobile phone coupons. J. Marketing Res. 52(5):710–725.
Dinner I, Heerde HV, Neslin S (2015) Creating customer engagement via mobile apps: How app usage drives purchase behavior. Working paper.
Dubé JP, Fang Z, Fong NM, Luo X (2015) Competitive price targeting with smartphone coupons. NBER Working Paper No. 22067.
Eadicicco L (2015) More people now shop on Amazon using smartphones and tablets than computers. TIME. Accessed October 2, 2016, http://tinyurl.com/hddh88p
Einav L, Levin J, Popov I, Sundaresan N (2014) Growth, adoption, and use of mobile e-commerce. Amer. Econom. Rev. 104(5):489–494.
eMarketer (2014) 2 Billion consumers worldwide to get smart(phones) by 2016. eMarketer. Accessed October 2, 2016, http://tinyurl.com/kkpxevo
eMarketer (2016) Mobile ad spend to top $100 billion worldwide in 2016, 51% of digital market. eMarketer. Accessed August 10, 2016, http://tinyurl.com/p79lymk
Ericsson (2016) Ericsson Mobility Report. Ericsson. Accessed August 10, 2016, http://tinyurl.com/gmnezg6
Fong NM, Fang Z, Luo X (2015) Geo-conquesting: Competitive locational targeting of mobile promotions. J. Marketing Res. 52(5):726–735.
Forbes (2015) How mobile ordering can impact Starbucks’ valuation. Forbes. Accessed October 2, 2016, http://tinyurl.com/zrekrwp
Forrester (2016) 2016 mobile and app marketing trends. Forrester.
Gill M, Sridhar S, Grewal R (2016) On returns to business-to-business mobile engagement apps. Working paper.
Google M/A/R/C Study (2013) Mobile in-store research: How in-store shoppers are using mobile devices. Google M/A/R/C. Accessed October 2, 2016, http://tinyurl.com/gr7ghpn
Guo S, Fraser MW (2014) Propensity Score Analysis: Statistical Methods and Applications (Sage Publications).
Heckman J (1979) Sample selection bias as a specification error. Econometrica 47(1):153–161.
Huang Q, Nijs VR, Hansen K, Anderson ET (2012) Wal-Mart’s impact on supplier profits. J. Marketing Res. 49(2):131–143.
Hui SK, Inman JJ, Huang Y, Suher J (2013) The effect of in-store travel distance on unplanned spending: Applications to mobile promotion strategies. J. Marketing 77(2):1–16.
Hung SY, Ku CY, Chang CM (2003) Critical factors of WAP services adoption: An empirical study. Electronic Commerce Research and Applications 2(1):42–60.
Jing X, Lewis M (2011) Stockouts in online retailing. J. Marketing Res. 48(2):342–354.
Kim SJ, Wang RJ-H, Malthouse EC (2015) The effects of adopting and using a brand’s mobile application on customers’ subsequent purchase behavior. J. Interactive Marketing 31:28–41.
Kushwaha T, Shankar V, Li S (2015) Multichannel marketing: Asymmetries across customer-channel segments and optimal marketing allocation. Working paper.
Lee J, Zhuang M, Kozlenkova I, Fang E (2016) The dark side of mobile channel expansion strategies. MSI working paper.
Lewis M, Singh V, Fay S (2006) An empirical study of the impact of nonlinear shipping and handling fees on purchase incidence and expenditure decisions. Marketing Sci. 25(1):51–64.
Manchanda P, Packard G, Pattabhiramaiah A (2015) Social dollars: The economic impact of customer participation in a firm-sponsored online customer community. Marketing Sci. 34(3):367–387.
Ofek E, Katona Z, Sarvary M (2011) “Bricks and clicks”: The impact of product returns on the strategies of multichannel retailers. Marketing Sci. 30(1):42–60.
Peterson H (2015) Macy’s CEO says there’s one thing everyone is getting wrong about the retail industry. Business Insider. Accessed October 2, 2016, http://tinyurl.com/h7yf6ab
Rosenbaum PR (2005) Sensitivity analysis in observational studies. Everitt BS, Howell DC, eds. Encyclopedia of Statistics in Behavioral Science (Wiley, New York), 1809–1814.
Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70(1):41–55.
Rubin DB (2008) For objective causal inference, design trumps analysis. Ann. Appl. Statist. 2(3):808–840.
Shankar V, Balasubramanian S (2009) Mobile marketing: A synthesis and prognosis. J. Interactive Marketing 23(2):118–129.
Shankar V, Kleijnen M, Ramanathan S, Rizley R, Holland S, Morrissey S (2016) Mobile shopper marketing: Key issues, current insights, and future research avenues. J. Interactive Marketing 34:37–48.
Silverman BW (1986) Density Estimation for Statistics and Data Analysis (Chapman and Hall, London).
Sims G (2015) Google Play store vs the Apple App store: By the numbers. Android Authority. Accessed October 2, 2016, http://tinyurl.com/zp4ufdq
Sproull L, Kiesler S (1986) Reducing social context cues: Electronic mail in organizational communication. Management Sci. 32(11):1492–1512.
Thaler RH (1990) Saving, fungibility, and mental accounts. J. Economic Perspectives 4(1):193–205.
Wang RJ-H, Malthouse EC, Krishnamurthi L (2015) On the go: How mobile shopping affects customer purchase behavior. J. Retailing 91(2):217–234.
Wangenheim FV, Bayón T (2007) Behavioral consequences of overbooking service capacity. J. Marketing 71(4):36–47.
Wooldridge JM (2002) Econometric Analysis of Cross Section and Panel Data (MIT Press, Cambridge).
Xu J, Forman C, Kim JB, Ittersum KV (2014) News media channels: Complements or substitutes? Evidence from mobile phone usage. J. Marketing 78(4):97–112.
Xu K, Chan J, Ghose A, Han SP (2016) Battle of the channels: The impact of tablets on digital commerce. Management Sci. Forthcoming.


Table 1. Selected Related Literature and Our Contribution

Prior research on the effects of app adoption on dependent variables (DVs*):
Bellman et al. (2011) | Effect of app use on brand attitude and purchase intention | Other DV: purchase intent | Methods: online survey and lab study | Context: multiple branded retail apps
Einav et al. (2014) | Analysis of eBay’s mobile app adoption and platform revenues | DV: PA | Methods: descriptive analysis | Context: online retail
Xu et al. (2014) | Effect of mobile app on demand at the mobile site | Other DV: site visits | Methods: diff-in-diff | Context: online news
Dinner et al. (2015) | Effect of app adoption on probability of making an online and offline purchase | DV: PI | Methods: fixed effects panel | Context: high-end clothing retailer’s iOS app (online and a store)
Kim et al. (2015) | Effect of use of app check-ins and information look-ups on loyalty point accruals | Other DV: loyalty point accruals | Methods: PSM and diff-in-diff | Context: Air Miles Reward Program app
Gill et al. (2016) | Effect of manufacturer’s mobile app on B2B revenues | DV: PA | Methods: diff-in-diff | Context: B2B engagement app of a tools manufacturer

Prior research on the effects of mobile device adoption on dependent variables (DVs):
Wang et al. (2015) | Changes in customers’ spending behavior upon adopting m-shopping | DVs: PI, PA | Methods: PSM, log-log, hazard models | Context: online grocery retailer’s app
Lee et al. (2016) | Effect of mobile shopping ratio (mobile vs. web) on purchases | DVs: PI, PA | Methods: panel data regression | Context: online retail
Xu et al. (2016) | Effect of tablet adoption on digital commerce via smartphones and PC devices | DV: PA | Methods: diff-in-diff | Context: online retail

Our paper (2016) | Effects of app adoption on purchases and returns across all channels and the roles of app features on shopping outcomes | DVs: PI, PA, RI, RA; comprehensive app features | Methods: PSM, diff-in-diff, exponential Type II Tobit | Context: retailer with a chain of stores and an ecommerce site

Notes: *DV refers to the key dependent variables used in the study, including purchase incidence (PI), monetary value of purchases (PA), return incidence (RI), and monetary value of returns (RA). Methods include the difference-in-differences approach (diff-in-diff) and propensity score matching (PSM).


Table 2. Variable Definitions and Descriptive Statistics
Variable | Notation | Operationalization | Mean | St. Dev. | Min. | Max.
Purchase Incidence | PI / h | Dummy variable indicating if at least one purchase was made in the time period (=1); else (=0) | 0.48 | 0.50 | 0 | 1
Monetary Value of Purchases | PA / g | Amount associated with purchases in the period ($) | 70.09 | 154.87 | 0 | 3,733.89
Return Incidence | RI / r | Dummy variable indicating if at least one return was made in the time period (=1); else (=0) | 0.07 | 0.25 | 0 | 1
Monetary Value of Returns | RA / t | Amount associated with returns in the period ($) | 5.73 | 35.91 | 0 | 919.96
App Adoption | TREAT / A | Dummy variable indicating if the shopper adopted the app (=1) or not (=0) in the data period | 0.17 | 0.38 | 0 | 1
Time Period | POST / P | Dummy variable indicating if the time period is before (=0) or after (=1) adoption | 0.5 | 0.5 | 0 | 1
Recency | RECENCY | Number of days since the shopper’s last purchase at the start of time t | 36.81 | 29.01 | 1 | 118
Tenure | TENURE | Number of days of being a customer at the start of time t | 453.41 | 51.33 | 153 | 482
Estimation window:
Order Size | QNT | Number of items in an order | 0.95 | 1.38 | 0 | 39
Age | AGE | Age of shopper in years at the start of the data period | 32.57 | 11.13 | 11 | 82
Gender | GENDER | Gender of shopper (Female=1, Male=0) | 0.21 | 0.41 | 0.00 | 1.00
Distance to Nearest Store | DIST | Distance in miles between the geographical centers of the shopper’s and the nearest store’s zip codes | 4.29 | 7.86 | 2.06 | 196.12
Number of Stores | NSTORES | Number of focal retailer’s stores in shopper’s zip code | 0.57 | 0.72 | 0 | 4
Loyalty Program Level | LPROG | Dummy indicating if the shopper is enrolled in the basic (=0) or professional (=1) membership on the app adoption date | 0.43 | 0.49 | 0 | 1
Area Population | AREAPOPL | Population of zip code based on the 2010 US census | 31,611 | 19,009 | 6 | 113,916
Competitor Stores | COMPSTORE | Number of competing stores in shopper’s zip code | 0.53 | 0.67 | 0 | 5
Online Buyer | ONLINEBUYER | Dummy variable indicating whether the shopper made an online purchase (=1) or not (=0) in the calibration period | 0.05 | 0.21 | 0 | 1
Past Purchase Amount | PASTSPEND | Monetary value of average monthly purchases in the calibration period ($) | 44.56 | 62.19 | 0 | 844.12
Past Return Amount | PASTRETURN | Monetary value of average monthly returns in the calibration period ($) | 4.17 | 17.82 | 0 | 482.64
Calibration window:
Average Purchase Frequency | APF | Average number of sales transactions in a month in the calibration period | 0.79 | 0.73 | 0 | 15


Table 3. Model-free Evidence: Mean Statistics
Variable | Treated pre period | Treated post period | Control pre period | Control post period
Purchase incidence | 0.686 | 0.822 | 0.416 | 0.441
Number of purchases | 1.474 | 2.359 | 0.707 | 0.820
Monetary value of purchases | 126.252 | 181.149 | 46.887 | 59.048
Return incidence | 0.096 | 0.185 | 0.049 | 0.062
Number of returns | 0.134 | 0.282 | 0.062 | 0.077
Monetary value of returns | 9.461 | 18.555 | 4.021 | 4.052
Net monetary value of purchases | 116.792 | 162.594 | 42.866 | 54.995
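
As a quick cross-check on the regression results, the unmatched difference-in-differences estimate for the net monetary value of purchases can be read directly off these means (the arithmetic below is ours, not a separate analysis in the paper):

(162.594 − 116.792) − (54.995 − 42.866) = 45.802 − 12.129 = 33.673,

which matches the TREAT*POST coefficient of 33.673 for net monetary value of purchases in Panel A of Table 5.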

Table 4. Overview of Analyses
Section | Analysis | Objective | Key insight/Conclusion
4.2 | Baseline difference-in-differences (DIFF-IN-DIFF) regression | Quantify the treatment effect of app adoption on shopping outcomes | App adoption leads to higher incidence and monetary value of purchases and returns than non-adoption
4.3 | DIFF-IN-DIFF regression with (a) selection on observables (propensity score matched [PSM] sample) and (b) selection on unobservables (Heckman correction) | Correct for potential bias in treatment effects due to self-selection | App adoption leads to higher incidence and monetary value of purchases and returns than non-adoption after correcting for endogeneity of app adoption
4.4 | Exponential Type II Tobit | Decompose the effects of app adoption into the incidence of purchase (returns) and, conditional on it, the monetary value of purchase (returns) per occasion in a two-stage model | App adoption leads to higher purchase and return incidence but lower monetary value of purchase per occasion
5.2 | Robustness checks: (a) alternative measure of app adoption, (b) outliers, (c) customer heterogeneity in deal proneness, (d) alternative matching, (e) alternative samples, (f) app novelty and alternative time periods | Rule out alternative explanations for the results | App adoption treatment effects are robust to alternative explanations, such as outliers, customer deal proneness, app novelty, and other adoption measures, samples, and time periods
Web Appendix | Additional checks: Web Appendix B (visual plots for common trends); Web Appendix C (Kolmogorov-Smirnov test, standardized bias reduction, hidden bias sensitivity analysis); Web Appendix D (alternative control group using future adopters to tackle unobservables) | Evaluate robustness to any other potential threats to the main methods | (a) Pre-adoption purchase trends in the control and treated groups are parallel; (b) PSM significantly improves balance between the treated and control groups; (c) effects are robust to alternative control groups


Table 5. Results of Difference-in-Differences Models, Coeff. (Std. Err.)

Panel A. Unmatched Samples
Variable | Number of purchases | Monetary value of purchases | Number of returns | Monetary value of returns | Net monetary value of purchases
TREAT | 0.767*** (0.042) | 79.365*** (6.099) | 0.072*** (0.013) | 5.44*** (1.24) | 73.925*** (5.767)
POST | 0.113*** (0.019) | 12.16*** (1.953) | 0.015* (0.005) | 0.032 (0.458) | 12.129*** (1.819)
TREAT*POST | 0.772*** (0.07) | 42.736*** (8.637) | 0.133*** (0.024) | 9.063*** (2.094) | 33.673*** (8.009)
Intercept | 0.707*** (0.013) | 46.887*** (1.332) | 0.062*** (0.004) | 4.021*** (0.348) | 42.866*** (1.226)

Panel B. Propensity Score Matched Samples
Variable | Number of purchases | Monetary value of purchases | Number of returns | Monetary value of returns | Net monetary value of purchases
TREAT | 0.499*** (0.052) | 64.708*** (6.837) | 0.039* (0.017) | 3.205 (1.586) | 61.503*** (6.352)
POST | 0.016 (0.048) | 6.983 (4.861) | 0.004 (0.015) | -1.665 (1.269) | 8.648 (4.441)
TREAT*POST | 0.869*** (0.083) | 47.913*** (9.717) | 0.144*** (0.028) | 10.759*** (2.405) | 37.154*** (8.976)
Intercept | 0.975*** (0.033) | 61.545*** (3.364) | 0.095*** (0.011) | 6.256*** (1.048) | 55.289*** (2.932)

Panel C. Matched Sample with Heckman Correction using Inverse Mills Ratio (IMR) as Covariate
Variable | Number of purchases | Monetary value of purchases | Number of returns | Monetary value of returns | Net monetary value of purchases
TREAT | 0.499*** (0.052) | 64.731*** (6.84) | 0.039* (0.017) | 3.21 (1.585) | 61.521*** (6.355)
POST | 0.016 (0.048) | 6.983 (4.862) | 0.004 (0.015) | -1.665 (1.269) | 8.648 (4.441)
TREAT*POST | 0.869*** (0.083) | 47.913*** (9.717) | 0.144*** (0.028) | 10.759*** (2.405) | 37.154*** (8.975)
IMR | 0.022 (0.095) | 5.754 (11.22) | 0.024 (0.03) | 1.296 (2.791) | 4.458 (10.321)
Intercept | 0.944*** (0.14) | 53.33** (16.534) | 0.06 (0.045) | 4.406 (4.182) | 48.924** (15.183)

Notes: Robust standard errors in parentheses; *** p < 0.001, ** p < 0.01, * p < 0.05.


Table 6. Covariates and their Relationships with Outcomes for the Exponential Type II Tobit Model
Variable | Notation | PI | PA | RI | RA | Support from research in other contexts
Mobile app | A*P | ✔ | ✔ | ✔ | ✔ | Kim et al. (2015); Wang et al. (2015)
Recency | RECENCY | ✔ |  | ✔ |  | Lewis et al. (2006); Jing and Lewis (2011)
Income/Past Spend | INCOME/PASTSPEND |  | ✔ |  |  | Thaler (1990)
Tenure | TENURE |  | ✔ |  |  | Bolton (1998); Kushwaha et al. (2015)
Distance to nearest store | DIST |  |  | ✔ |  | Anderson et al. (2009); Ofek et al. (2011)
Number of stores | NSTORES |  |  | ✔ |  | Anderson et al. (2009); Ofek et al. (2011)
Order size | QNT |  |  |  | ✔ | Anderson et al. (2009)
Note: Purchase incidence (PI), monetary value of purchases (PA), return incidence (RI), and monetary value of returns (RA).

Table 7. Results of Exponential Type II Tobit Model

(A) Purchases model
Log value of purchases per occasion:
POST (P) | 0.134** (0.046)
TREAT (A) | 0.259*** (0.047)
TREAT * POST (A * P) | -0.125* (0.062)
TENURE | 0.383*** (0.102)
INCOME/PAST SPEND | 0.0004* (0.001)
Intercept | 1.308* (0.621)
Purchase incidence:
POST (P) | -0.014 (0.044)
TREAT (A) | 0.470*** (0.045)
TREAT * POST (A * P) | 0.442*** (0.066)
RECENCY | -0.007*** (0.001)
Intercept | 0.246*** (0.036)
Log likelihood: -9,359.39; correlation (rho): 0.171

(B) Returns model
Log value of returns per occasion:
POST (P) | -0.293* (0.117)
TREAT (A) | 0.088 (0.117)
TREAT * POST (A * P) | 0.251 (0.171)
QNT | 0.05 (0.027)
Intercept | 3.367*** (0.553)
Return incidence:
POST (P) | 0.081 (0.067)
TREAT (A) | 0.213** (0.065)
TREAT * POST (A * P) | 0.308*** (0.088)
DIST | -0.005 (0.004)
NSTORES | -0.007 (0.035)
RECENCY | -0.007*** (0.001)
Intercept | -1.283*** (0.061)
Log likelihood: -2,982.73; correlation (rho): 0.193

Notes: There are 2,421 (5,827) censored observations for the purchase (returns) model; standard errors in parentheses; *** p < 0.001, ** p < 0.01, * p < 0.05.


Table 8. Robustness Checks for Treatment Effects
Variable | (A) App use vs. non-use for six months | (B) Outliers | (C) Deal use heterogeneity | (D) Alternative sample | (E) Future treated as control
Number of purchases | 0.433*** (0.051) | 0.88*** (0.083) | 0.945*** (0.089) | 0.647*** (0.094) | 0.688*** (0.097)
Monetary value of purchases | 25.505*** (4.934) | 87.592*** (7.662) | 66.781*** (11.295) | 36.521*** (9.064) | 34.718** (11.646)
Number of returns | 0.055*** (0.015) | 0.13*** (0.028) | 0.139*** (0.027) | 0.073* (0.03) | 0.108** (0.036)
Monetary value of returns | 2.236* (1.054) | 10.81*** (1.974) | 11.943*** (2.528) | 8.986*** (2.126) | 5.25 (2.871)
Net monetary value of purchases | 23.269*** (4.668) | 76.782*** (7.113) | 54.838*** (10.616) | 27.536*** (8.349) | 29.468** (10.664)
Number of observations | 9,774 | 5,828 | 4,624 | 4,356 | 6,516
Notes: Robust standard errors in parentheses; *** p < 0.001, ** p < 0.01, * p < 0.05.

Table 9. Shopping Outcomes for Subgroups of App Adopters based on Offer Feature Usage
Variable | Offer features used: Pre | Offer features used: Post | Offer features not used: Pre | Offer features not used: Post
Number of purchases | 1.55 | 2.61 | 1.39 | 2.06
Monetary value of purchases | 136.27 | 191.58 | 114.19 | 168.59
Number of returns | 0.16 | 0.32 | 0.10 | 0.23
Monetary value of returns | 10.80 | 20.50 | 7.85 | 16.22
Net monetary value of purchases | 125.47 | 171.08 | 106.34 | 152.37
Monetary value of purchases per occasion | 87.92 | 73.40 | 82.15 | 81.84

Table 10. Shopping Outcomes for Subgroups of App Adopters based on Loyalty Feature Usage
Variable | Loyalty reward features used: Pre | Loyalty reward features used: Post | Loyalty reward features not used: Pre | Loyalty reward features not used: Post
Number of purchases | 1.50 | 2.40 | 1.40 | 2.26
Monetary value of purchases | 130.35 | 178.17 | 116.62 | 188.15
Number of returns | 0.14 | 0.27 | 0.13 | 0.31
Monetary value of returns | 8.95 | 17.99 | 10.67 | 19.89
Net monetary value of purchases | 121.40 | 160.19 | 105.95 | 168.26
Monetary value of purchases per occasion | 86.9 | 74.23 | 83.30 | 83.25


Figure 1. Mobile Apps and Shopper Choices

Figure 2. Model-free Evidence: Monetary Value of Purchases and Returns for App Adopters and Non-adopters


Figure 3. Propensity Score Matched Sample: Monetary Value of Purchases and Returns for App Adopters and Non-adopters

Figure 4. Distribution of Propensity Scores Pre- and Post-Matching

Figure 5. Model-free Evidence: Number of App Features and Monetary Value of Purchases and Returns

[Two panels: monetary value of purchases ($) and monetary value of returns ($) plotted against the number of unique app features used.]


The Effects of Mobile Apps on Shopper Purchases and Product Returns

WEB APPENDIX

Web Appendix A. Screenshots

Figure A1. App Screenshots on iPhone

Web Appendix B. Robustness Checks for Alternative Samples, Periods, and Common Trends in the Pre-period

In this section, we present the results for robustness checks relating to two alternative samples (Tables B1-B2), alternative app adoption measures (Table B3), and varying time periods (Table B4), along with the common trends plots (Figures B1-B4) for the treated and control groups.

The two samples are: (a) a random sample selected on a different date, February 1, 2015 (similar to our main December 1, 2014 sample, with app adopters selected at a common adoption date), and (b) random samples selected in each of four months, February through May 2015. In case (b), we treat the month of adoption as the post-period; the implicit assumption is that shoppers who adopted the app did so at the beginning of the month. Such an aggregation approach will induce a downward bias in our estimates, because it assumes that shoppers start showing signs of increased spending right at the beginning of the period (Manchanda et al. 2015).

Similar to our main estimation, we match app adopters and non-adopters using a rich set of covariates in a binary logit model and subsequently carry out a difference-in-differences estimation. The binary logit model specifications are tailored for best fit; for instance, for the February 2015 sample, we matched the samples on each past month’s spending instead of average past spending. We also replicate the analysis for an alternative control group sample of random non-adopters. In Tables B1 and B2, we present the results for the two sets of samples described earlier.
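
To make the mechanics concrete, the sketch below illustrates the general two-step logic used throughout these checks: a binary logit for propensity scores, one-to-one nearest-neighbor matching, and a difference-in-differences regression on the matched panel. It is a minimal illustration, not the authors’ code; the file names, the column names (adopter, post, net_value, and the covariates), and the matching-with-replacement shortcut are assumptions made for brevity.

```python
# Minimal sketch: propensity score matching followed by a difference-in-
# differences regression. File and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.neighbors import NearestNeighbors

shoppers = pd.read_csv("shoppers.csv")        # one row per shopper (pre-period covariates)
panel = pd.read_csv("shopper_periods.csv")    # two rows per shopper: pre- and post-period

# 1. Propensity scores from a binary logit on pre-period covariates.
ps = smf.logit("adopter ~ age + gender + dist + lprog + past_spend + apf",
               data=shoppers).fit()
shoppers["pscore"] = ps.predict(shoppers)

# 2. Greedy 1:1 nearest-neighbor matching on the propensity score
#    (with replacement here for brevity).
treated = shoppers[shoppers.adopter == 1]
control = shoppers[shoppers.adopter == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_ids = pd.concat([treated.shopper_id, control.iloc[idx.ravel()].shopper_id])

# 3. Difference-in-differences on the matched panel; the adopter:post
#    coefficient is the treatment effect.
did = smf.ols("net_value ~ adopter * post",
              data=panel[panel.shopper_id.isin(matched_ids)]).fit(cov_type="HC1")
print(did.params["adopter:post"])
```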

Next, in Table B3, we report the estimates for a refined measure of app adoption – app use vs. non-use – based on December adopters’ usage in the four months from January to April 2015. These estimates demonstrate that the effect is indeed due to app use and is robust to individual characteristics over time. Finally, in Table B4, we report the estimates for varying time windows to rule out a possible novelty effect of the app. More specifically, we find robust results using shorter (15-day) and longer (45- and 60-day) pre-post windows than the 30-day window used in the main estimation.

In Figures B1-B4, we present the graphs showing the monetary values of purchases for adopters and non-adopters before app adoption to illustrate that the common trends assumption central to the difference-in-differences design holds.
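
The common-trends check itself is easy to script; the snippet below is a minimal sketch (hypothetical file and column names) that plots mean pre-adoption monthly purchase value for adopters and non-adopters, mirroring Figures B1-B4.

```python
# Minimal sketch of the common-trends plot: mean monthly purchase value for
# adopters vs. non-adopters over the pre-adoption months. Names are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

pre = pd.read_csv("pre_period_purchases.csv")   # shopper-month purchase values
trend = (pre.groupby(["month", "adopter"])["purchase_value"]
            .mean()
            .unstack("adopter"))                # columns ordered 0 (non-adopter), 1 (adopter)
trend.columns = ["Non-adopters", "Adopters"]
trend.plot(marker="o")
plt.xlabel("Month")
plt.ylabel("Monetary value of purchases ($)")
plt.title("Pre-adoption purchase trends")
plt.tight_layout()
plt.show()
```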


Table B1. Difference-in-Differences Model Results for the February 1, 2015 Sample
Treatment effect, Coeff. (Std. Err.)
Variable | (A) Nearest neighbor matches | (B) Caliper matches | (C) Excluding outliers | (D) Excluding offer users
Number of purchases | 0.647*** (0.094) | 0.6153*** (0.0887) | 0.6417*** (0.092) | 0.5569*** (0.0952)
Monetary value of purchases | 36.521*** (9.064) | 34.7187*** (8.5425) | 41.6567*** (8.7232) | 29.1181** (9.1233)
Number of returns | 0.073* (0.03) | 0.088** (0.0287) | 0.0623* (0.0299) | 0.0449 (0.0309)
Monetary value of returns | 8.986*** (2.126) | 7.3061*** (2.0938) | 5.8777** (2.1032) | 5.887** (2.1571)
Net monetary value of purchases | 27.536*** (8.349) | 27.4126*** (7.8975) | 35.779*** (8.084) | 23.2311** (8.4989)
Number of individuals | 2,178 | 2,090 | 1,926 | 1,828
Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses.

Table B2. Difference-in-Differences Model Results for February-May 2015 Samples
Treatment effect, Coeff. (Std. Err.)
Variable | (A) February adopters | (B) March adopters | (C) April adopters | (D) May adopters
Number of purchases | 1.191*** (0.0862) | 1.032*** (0.0936) | 1.088*** (0.0818) | 1.152*** (0.0838)
Monetary value of purchases | 120.551*** (8.6463) | 107.667*** (9.8333) | 86.048*** (18.8776) | 71.956*** (7.2029)
Number of returns | 0.101** (0.0335) | 0.175*** (0.0332) | 0.131*** (0.0299) | 0.139*** (0.026)
Monetary value of returns | 8.658* (4.1254) | 18.449*** (3.2732) | 7.229** (2.6187) | 8.411*** (2.1607)
Net monetary value of purchases | 111.894*** (7.4844) | 89.217*** (8.5666) | 78.819*** (18.5387) | 63.546*** (6.4654)
Number of individuals | 3,180 | 2,750 | 2,804 | 2,646
Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses.


Table B3. Results of Alternative Model Comparing App Users and Non-users
Effect of app use, Coeff. (Std. Err.)
Variable | (A) Jan 2015 matches | (B) Feb 2015 matches | (C) March 2015 matches | (D) April 2015 matches
Number of purchases | 0.805*** (0.052) | 0.786*** (0.053) | 0.844*** (0.061) | 0.791*** (0.054)
Monetary value of purchases | 60.822*** (4.78) | 58.268*** (5.179) | 59.120*** (5.715) | 58.938*** (5.883)
Number of returns | 0.104*** (0.016) | 0.112*** (0.016) | 0.133*** (0.018) | 0.083*** (0.016)
Monetary value of returns | 6.939*** (1.194) | 9.418*** (1.544) | 7.665*** (1.372) | 6.840*** (1.543)
Net monetary value of purchases | 53.883*** (4.421) | 48.849*** (4.567) | 51.455*** (5.234) | 52.098*** (5.378)
Number of individuals | 348 | 298 | 273 | 234
Number of observations | 4,872 | 4,172 | 3,822 | 3,276
Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses.

Table B4. Difference-in-Differences Model Results for Varying Periods Based on Time from Adoption
Treatment effect, Coeff. (Std. Err.)
Variable | (A) 15 days pre and post | (B) 45 days pre and post | (C) 60 days pre and post
Number of purchases | 0.556*** (0.054) | 1.304*** (0.1) | 1.629*** (0.116)
Monetary value of purchases | 32.806*** (7.6) | 75.089*** (10.847) | 92.645*** (11.83)
Number of returns | 0.074*** (0.019) | 0.225*** (0.034) | 0.247*** (0.037)
Monetary value of returns | 5.773** (1.752) | 15.713*** (2.859) | 17.41*** (3.039)
Net monetary value of purchases | 27.033*** (7.11) | 59.376*** (9.945) | 75.235*** (10.836)
Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses; the number of individuals for these models is the same as in the main sample, that is, 3,258 treated and control individuals.


Figure B1. Purchase Trends Before App Adoption for the February 2015 Sample

[Line graph: monetary value of purchases ($) by month, August 2014 through January 2015, for adopters and non-adopters.]

Figure B2. Purchase Trends Before App Adoption for the March 2015 Sample

[Line graph: monetary value of purchases ($) by month, August 2014 through February 2015, for adopters and non-adopters.]


Figure B3. Purchase Trends Before App Adoption for the April 2015 Sample

[Line graph: monetary value of purchases ($) by month for adopters and non-adopters.]

Figure B4. Purchase Trends Before App Adoption for the May 2015 Sample

[Line graph: monetary value of purchases ($) by month for adopters and non-adopters.]

Web Appendix C. Tests for Propensity Score Matching

First, we report the results of the binomial logit model of app adoption used to compute propensity scores. Table C1 presents the logit coefficients for the likelihood of a shopper becoming an app adopter. Shoppers who are more likely to adopt the retailer’s app tend to be younger, male, online buyers, paid loyalty members on the day of app adoption, and higher-frequency shoppers. We selected this logit model after evaluating the fit of several other specifications, including probit and logit models with and without non-linear covariates.

Next, we present the results of various post-matching checks. First, the Kolmogorov-Smirnov test (Table C2) shows that the distributions of propensity scores of the matched treated and control groups are statistically similar. Second, the percentage reduction in bias after matching shows significant improvements in the balance of covariates across app adopters and non-adopters, making the two groups comparable (Table C3). Finally, there is no concern about potential hidden bias due to unobservables (Table C4) or about alternative matching methods (Table C5).
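
As one concrete illustration of these checks, the snippet below runs the two-sample Kolmogorov-Smirnov comparison of propensity scores before and after matching; the file names and column names (pscore, adopter) are placeholders rather than the authors’ code.

```python
# Minimal sketch of the KS balance check on propensity scores.
import pandas as pd
from scipy.stats import ks_2samp

def ks_balance(df):
    """Two-sample KS test comparing treated and control propensity scores."""
    treated = df.loc[df.adopter == 1, "pscore"]
    control = df.loc[df.adopter == 0, "pscore"]
    return ks_2samp(treated, control)

shoppers = pd.read_csv("shoppers_with_pscores.csv")   # full sample
matched = pd.read_csv("matched_sample.csv")           # PSM sample

print("Before matching:", ks_balance(shoppers))
print("After matching: ", ks_balance(matched))
```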

We discuss these checks in detail next. The results in Table C2 show that the distribution of propensity scores is nearly identical after matching. Table C3 shows the standardized bias before and after matching, which we calculate as follows (Rosenbaum and Rubin 1983):

SB_BM = 100 × (X̄_T − X̄_C) / √[(V_T(X) + V_C(X)) / 2],

where, for the standardized bias before matching (SB_BM), the numerator is the difference between the mean of covariate X for the treated individuals before matching and the mean of X for all unmatched control individuals before matching, and the denominator is the square root of the equally weighted variance of X in the two groups. Likewise, the standardized bias after matching (SB_AM) uses the treated and control means after matching. We then calculate the percentage reduction in bias as:

Bias reduction (%) = 100 × (|SB_BM| − |SB_AM|) / |SB_BM|.
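
The calculation above is straightforward to script; the sketch below computes the standardized bias for a single covariate before and after matching, and the implied percentage reduction, using hypothetical file and column names.

```python
# Minimal sketch of the standardized bias and percent bias reduction defined above.
import numpy as np
import pandas as pd

def standardized_bias(x_treated, x_control):
    """100 * (mean difference) / sqrt of the equally weighted variance."""
    pooled_var = (x_treated.var(ddof=1) + x_control.var(ddof=1)) / 2.0
    return 100 * (x_treated.mean() - x_control.mean()) / np.sqrt(pooled_var)

treated = pd.read_csv("treated_covariates.csv")
control_all = pd.read_csv("control_covariates_unmatched.csv")
control_matched = pd.read_csv("control_covariates_matched.csv")

sb_before = standardized_bias(treated["past_spend"], control_all["past_spend"])
sb_after = standardized_bias(treated["past_spend"], control_matched["past_spend"])
pct_reduction = 100 * (abs(sb_before) - abs(sb_after)) / abs(sb_before)
print(round(sb_before, 2), round(sb_after, 2), round(pct_reduction, 2))
```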

In Table C4, we summarize the results of a sensitivity test to assess whether hidden bias is a cause for concern. In this test, we manipulate the estimated odds of receiving the treatment to see how much the estimated treatment effects may vary; in other words, we check that the estimates are robust to plausible ranges of “hidden bias.” According to Rosenbaum (2005), a sensitivity analysis in an observational study asks what an unmeasured covariate would have to be like to alter the conclusions of the study.

Suppose we have two individuals j and k with the same covariates but possibly different chances of receiving the treatment: X_j = X_k, but A_j and A_k, their probabilities of adopting the app, may differ. The odds that they adopt the app are A_j/(1 − A_j) and A_k/(1 − A_k). Let the ratio of these odds be bounded by Γ = e^Δ, where Δ indexes the degree of hidden bias. If Γ were exactly one (equivalently, Δ exactly zero), there would be no hidden bias: individuals with equal X_j and X_k would have equal odds of being treated. Γ therefore measures the degree of departure from a study that is free of hidden bias (Guo and Fraser 2014). For values of Γ from one through two, we calculate bounds (intervals) of p-values that reflect the uncertainty due to hidden bias. Our test shows that the inference is robust, that is, the study is insensitive to hidden bias over this range; extremely high values of Γ would be needed to change the inference.
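
For reference, bounds of this kind can be approximated with the usual normal approximation to the Wilcoxon signed-rank statistic (similar in spirit to the psens routine in the R package rbounds). The sketch below is a minimal Python version under that assumption; diffs stands in for the matched-pair treated-minus-control outcome differences and the input file name is hypothetical.

```python
# Minimal sketch of Rosenbaum bounds for the Wilcoxon signed-rank test on
# matched-pair differences, using the normal approximation.
import numpy as np
from scipy.stats import norm, rankdata

def rosenbaum_bounds(diffs, gamma):
    d = np.asarray(diffs)
    d = d[d != 0]                                # drop zero differences
    ranks = rankdata(np.abs(d))
    t_stat = ranks[d > 0].sum()                  # signed-rank statistic
    s = len(d)
    rank_sum = s * (s + 1) / 2.0
    rank_var = s * (s + 1) * (2 * s + 1) / 6.0
    bounds = []
    for p in (1.0 / (1 + gamma), gamma / (1.0 + gamma)):   # lower, then upper
        z = (t_stat - p * rank_sum) / np.sqrt(p * (1 - p) * rank_var)
        bounds.append(1 - norm.cdf(z))
    return tuple(bounds)                         # (lower, upper) one-sided p-values

diffs = np.loadtxt("matched_pair_differences.txt")
for g in np.arange(1.0, 2.05, 0.1):
    print(round(g, 1), rosenbaum_bounds(diffs, g))
```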


Table C1. App Adoption Model: Logit Estimates
Variable | Coeff. (Std. Err.)
1. (Intercept) | 67.187* (26.133)
2. ln(1+AGE) | -0.872*** (0.085)
3. GENDER (F=1, M=0) | -0.454*** (0.079)
4. ln(1+DIST) | 0.584* (0.252)
5. ln(1+NSTORES) | 1.122 (0.609)
6. ln(1+AREAPOPL) | 0.091 (0.359)
7. ln(1+COMPSTORE) | -0.044 (0.09)
8. ln(1+TENURE) | -24.709** (9.166)
9. LPROG | 0.549*** (0.058)
10. ONLINEBUYER | 0.424*** (0.116)
11. ln(1+PASTSPEND) | -0.098 (0.125)
12. ln(1+PASTRETURN) | -0.117 (0.086)
13. ln(1+NSTORES) Sq. | -0.538 (0.356)
14. ln(1+AREAPOPL) Sq. | -0.004 (0.019)
15. ln(1+TENURE) Sq. | 2.226* (0.801)
16. ln(1+DIST) Sq. | -0.135* (0.058)
17. ln(1+PASTSPEND) Sq. | 0.006 (0.018)
18. ln(1+PASTRETURN) Sq. | 0.024 (0.023)
19. ln(1+APF) | 2.371*** (0.317)
20. ln(1+APF) Sq. | -0.507** (0.168)
Notes: Null deviance: 8,737.9 on 9,584 degrees of freedom; residual deviance: 8,109.8 on 9,565 degrees of freedom; AIC: 8,149.8; number of Fisher scoring iterations: 5; log likelihood: -4,054.91; McFadden’s pseudo R-squared: 0.072; *** p < 0.001, ** p < 0.01, * p < 0.05.

Table C2. KS Test Results
Two-sample Kolmogorov-Smirnov test
Before matching: D = 0.29467, p-value < 2.2e-16
After matching: D = 0.004911, p-value = 1


Table C3. Propensity Score Matching Results: Percentage Reduction in Bias After Matching
Variable | Means treated (before matching) | Means control (before matching) | Means control (after matching) | Percent balance improvement
(Intercept) | 0.227 | 0.158 | 0.226 | 99.318
ln(1+AGE) | 3.379 | 3.476 | 3.380 | 98.467
GENDER (FEMALE) | 0.135 | 0.225 | 0.141 | 93.883
ln(1+DIST) | 1.053 | 1.050 | 1.041 | -269.836
ln(1+NSTORES) | 0.357 | 0.359 | 0.352 | -136.892
ln(1+AREAPOPL) | 10.100 | 10.092 | 10.127 | -240.089
ln(1+COMPSTORE) | 0.342 | 0.343 | 0.334 | -472.658
ln(1+TENURE) | 6.086 | 6.074 | 6.087 | 87.540
LPROG | 0.561 | 0.400 | 0.570 | 94.299
ONLINEBUYER | 0.079 | 0.040 | 0.076 | 92.149
ln(1+PASTSPEND) | 3.532 | 3.175 | 3.528 | 98.853
ln(1+PASTRETURN) | 0.604 | 0.411 | 0.642 | 80.196
ln(1+NSTORES) Sq. | 0.302 | 0.303 | 0.296 | -326.628
ln(1+AREAPOPL) Sq. | 102.810 | 102.676 | 103.290 | -257.693
ln(1+TENURE) Sq. | 37.053 | 36.919 | 37.069 | 88.140
ln(1+DIST) Sq. | 2.239 | 2.260 | 2.145 | -349.191
ln(1+PASTSPEND) Sq. | 13.699 | 11.196 | 13.687 | 99.507
ln(1+PASTRETURN) Sq. | 1.957 | 1.268 | 2.063 | 84.568
ln(1+APF) | 0.655 | 0.501 | 0.649 | 96.139
ln(1+APF) Sq. | 0.557 | 0.327 | 0.541 | 93.251
Notes: 1,629 adopters are matched with 1,629 non-adopters out of a pool of 7,956 non-adopters pre-matching.

Table C4. Hidden Bias Sensitivity Test Results (Rosenbaum Sensitivity Test for Wilcoxon Signed Rank P-Value)
Unconfounded estimate: 0
Gamma | Lower bound | Upper bound
1.0 | 0 | 0
1.1 | 0 | 0
1.2 | 0 | 0
1.3 | 0 | 0
1.4 | 0 | 0
1.5 | 0 | 0
1.6 | 0 | 0
1.7 | 0 | 0
1.8 | 0 | 0
1.9 | 0 | 0
2.0 | 0 | 0


Table C5. Results of Difference-in-Differences Model with Different Matching Methods
Treatment effect, Coeff. (Std. Err.)
Variable | (M1) Matching (common support with trimming) | (M2) Matching (Mahalanobis metric) | (M3) Matching (calipers)
Number of purchases | 0.799*** (0.083) | 0.823*** (0.082) | 0.899*** (0.084)
Monetary value of purchases | 39.61*** (9.785) | 45.294*** (9.485) | 48.157*** (9.949)
Number of returns | 0.139*** (0.028) | 0.137*** (0.027) | 0.14*** (0.027)
Value of returns | 9.446*** (2.286) | 11.109*** (2.253) | 10.052*** (2.337)
Net monetary value of purchases | 30.164** (9.092) | 34.186*** (8.862) | 56.777*** (6.583)
Number of observations | 6,516 | 6,404 | 6,484
Note: *** p < 0.001, ** p < 0.01.

Web Appendix D. Selection on Unobservables

In this section, we present (a) the results of the first-stage probit model used for the Heckman correction (Table D1), and (b) the results of an alternative difference-in-differences estimation that uses future treated cohorts of app adopters as controls (Table D2).
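
For concreteness, the sketch below illustrates the two-step logic behind this correction: a first-stage probit for app adoption (with the exclusion-restriction variables from Table D1), the inverse Mills ratio, and a second-stage difference-in-differences regression that adds the IMR as a covariate (as in Panel C of Table 5). It is a minimal illustration with hypothetical lowercase column and file names, not the authors’ code, and it assumes complete cases.

```python
# Minimal sketch of the Heckman-style correction: first-stage probit,
# inverse Mills ratio (IMR), and second-stage DiD with the IMR as a covariate.
import numpy as np
import pandas as pd
from scipy.stats import norm
import statsmodels.formula.api as smf

shoppers = pd.read_csv("shoppers.csv")
panel = pd.read_csv("shopper_periods.csv")

# First stage: probit of adoption on covariates plus exclusion restrictions.
probit = smf.probit("adopter ~ wirenet + onlinebuyer + symspeed + age + gender"
                    " + tenure + dist + lprog + compstore", data=shoppers).fit()
xb = np.dot(probit.model.exog, probit.params)        # linear index x'beta

# IMR: pdf/cdf for adopters, -pdf/(1 - cdf) for non-adopters.
shoppers["imr"] = np.where(shoppers.adopter == 1,
                           norm.pdf(xb) / norm.cdf(xb),
                           -norm.pdf(xb) / (1 - norm.cdf(xb)))

# Second stage: DiD on the matched panel with the IMR as an extra regressor.
panel = panel.merge(shoppers[["shopper_id", "imr"]], on="shopper_id")
second = smf.ols("net_value ~ adopter * post + imr", data=panel).fit(cov_type="HC1")
print(second.params[["adopter:post", "imr"]])
```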

Table D1. First-Stage Probit Model Results (DV = App Adoption)
Variable | Coeff. (Std. Err.)
Wireless network (WIRENET)* | 0.196** (0.092)
Online buying (ONLINEBUYER)* | 0.381*** (0.066)
Symmetry in upload and download speeds (SYMSPEED)* | -0.255** (0.103)
Age | -0.015*** (0.001)
Gender | -0.256*** (0.042)
Tenure | 0.002*** (0.0000)
Distance | -0.002 (0.002)
Loyalty program level | 0.35*** (0.031)
Competitor stores | -0.017 (0.024)
Intercept | -1.152*** (0.14)
Notes: *** p < 0.01, ** p < 0.05; * indicates exclusion restrictions.


Table D2. Alternative Difference-in-Differences Model Results with Future App Adopters as Control Group
Treatment effect, Coeff. (Std. Err.)
Variable | Unmatched sample | Matched sample
Number of purchases | 0.758**** (0.076) | 0.688**** (0.097)
Monetary value of purchases | 37.236** (11.749) | 34.718*** (11.646)
Number of returns | 0.122**** (0.026) | 0.108*** (0.036)
Monetary value of returns | 7.216**** (2.207) | 5.25* (2.871)
Net monetary value of purchases | 30.021* (11.236) | 29.468*** (10.664)
Notes: **** p < 0.001, *** p < 0.01, ** p < 0.05, * p < 0.10; in this method, future app adopters from February-May 2015 are used as controls for current app adopters from December 2014.
