Comparative Analysis of Coffee Franchises in the Cambridge-Boston Area

May 10, 2010
ESD.86: Models, Data, and Inference for Socio-Technical Systems
Paul T. Grogan
[email protected]
Massachusetts Institute of Technology

Introduction

The placement of storefronts is a difficult question on which many corporations spend a great deal of time, effort, and money. Placement depends on a careful interplay between the environment, potential customers, other storefronts from the same franchise, and storefronts of competing franchises. From the customer’s perspective, the convenience of storefronts, especially for “discretionary” products or services, is of the utmost importance. In fact, some franchises develop mobile phone applications to give their customers an easy way to find the nearest storefront.[1]

This project takes an in-depth look at the storefront placements of Dunkin’ Donuts and Starbucks, two competing franchises with strong presences in the Cambridge-Boston area. Both franchises purvey coffee, coffee drinks, light meals, and pastries and cater especially well to sleep-deprived graduate students. However, Dunkin’ Donuts typically puts more emphasis on take-out (convenience) customers looking to grab a quick coffee before class, whereas Starbucks provides an environment conducive to socializing, meetings, writing theses, or studying over a longer duration. These differences in target customers may drive differences in the distribution of storefronts in the area.

The goal of this project is to apply some of the concepts on probabilistic modeling learned in ESD.86 to the real-world system of franchise storefronts and customers. The analysis focuses on the “convenience” of accessing storefronts, measured by the distance from a random customer to the nearest location. The “nearest neighbor” probabilistic model is a natural choice for this problem.
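For reference, the standard nearest-neighbor result for a homogeneous spatial Poisson process can be written compactly. The symbols below are notation assumed here rather than taken from this report: γ is the storefront density (storefronts per unit area) and D is the distance from a random customer to the nearest storefront. The Manhattan case additionally assumes travel along a fixed orthogonal grid.

```latex
% Euclidean metric: no storefront lies in a circle of area \pi r^2
F_D(r) = P(D \le r) = 1 - e^{-\gamma \pi r^2}, \qquad
f_D(r) = 2 \pi \gamma r \, e^{-\gamma \pi r^2}, \qquad
E[D] = \int_0^\infty e^{-\gamma \pi r^2} \, dr = \frac{1}{2\sqrt{\gamma}}

% Manhattan metric: no storefront lies in a diamond of area 2 r^2
F_D(r) = 1 - e^{-2 \gamma r^2}, \qquad
E[D] = \int_0^\infty e^{-2 \gamma r^2} \, dr = \sqrt{\frac{\pi}{8\gamma}} \approx \frac{0.627}{\sqrt{\gamma}}
```

In both cases the expected distance scales as the inverse square root of the density, so quadrupling the number of storefronts in a fixed area halves the expected distance to the nearest one.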
Under this model, the distance from a random, uniformly distributed customer to the closest storefront, with storefronts distributed as a spatial Poisson process, can be expressed with a closed-form equation. Of course, several assumptions must be checked against the real-world system:

• Can the franchise storefronts be modeled with a spatial Poisson distribution?
• Can the customers be modeled with a uniform distribution?
• Does the “nearest-neighbor” distance correlate with the “actual” closest storefront distance?
• Is the Euclidean or Manhattan distance metric appropriate for pedestrian walking paths?

To answer these questions, as well as the broader question of which coffee franchise provides better service to the residents of the Cambridge-Boston area, the project is broken down into three parts. First, data must be gathered on the existing storefront locations within an area of interest. Fortunately, both franchises provide “store locator” services on their corporate web sites. Additionally, data representing the demand distribution, either through population density or other relevant features, are required for constructing the customer model. Second, probabilistic distributions will be created in accordance with the nearest neighbor model. Using the data gathered in the first phase, storefront locations will be modeled as spatial Poisson distributions and customers will be modeled with uniform distributions. Finally, comparative analysis will investigate the differences between the two franchises as well as the underlying assumptions and accuracy of the probabilistic models.

[1] myStarbucks App for iPhone and iPod Touch, http://www.starbucks.com/coffeehouse/mobile-apps/mystarbucks

Data Gathering

The data gathering portion of the project assembles the information required to build the probabilistic models. Two primary kinds of data are needed: positional data and population data.
Positional data provides coordinates for the storefront locations of both franchises as well as locations of other features that may be helpful in the analysis. Population data provides a sense of customer density that will be used to help drive customer demand models.

Positional Coordinates

Not long ago, gathering position coordinates in a format conducive to numerical analysis would have been an insurmountable challenge for a term project. Fortunately, with the confluence of several technologies, it is now within scope to build a very accurate representation of the real world. The general process to gather location data is as follows:

1. Aggregate addresses using services or documents available online
2. Process addresses into GPS coordinates using the online GeoCoder tool[2]
3. Visualize GPS coordinates using online mapping applications such as Google Maps, iterating on improperly identified addresses as necessary
4. Transform GPS coordinates into Cartesian coordinates using the haversine formula[3]

The main innovation in the above steps is the availability of the GeoCoder tool, which allows batch queries of addresses to either Yahoo or Google mapping applications. Though the query results are not always correct, the tool dramatically reduces the time required to generate GPS coordinates (latitude and longitude) from text-based addresses.

Franchise Storefronts

The franchise storefront addresses are readily available on both the Dunkin’ Donuts[4] and Starbucks[5] corporate websites. In both cases, the search was limited to a target area within five miles of ZIP code 02139, which resolves to a location near Central Square in Cambridge, MA.
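The coordinate transformation in step 4 and the five-mile restriction can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the project: the function names, the Earth radius value, and the choice to build x and y from signed haversine distances along the reference parallel and meridian are all assumptions, and any reference coordinates passed in are the caller's approximation of the target-area center.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def to_cartesian(lat, lon, lat0, lon0):
    """Map (lat, lon) to local (x, y) in miles around a reference point.

    Assumed construction: x is the signed east-west haversine distance
    measured along the reference latitude, y the signed north-south
    distance measured along the reference longitude.
    """
    x = math.copysign(haversine_miles(lat0, lon0, lat0, lon), lon - lon0)
    y = math.copysign(haversine_miles(lat0, lon0, lat, lon0), lat - lat0)
    return x, y


def within_radius(points, lat0, lon0, radius_miles=5.0):
    """Keep only the (lat, lon) points within radius_miles of the reference."""
    return [(lat, lon) for lat, lon in points
            if haversine_miles(lat0, lon0, lat, lon) <= radius_miles]
```

Over an area only ten miles across, this local flattening introduces very little distortion, which is why a simple planar (x, y) representation is adequate for the nearest-neighbor analysis.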
In addition, all franchise storefront locations at Logan International Airport were removed under the assumption that airline passengers are not part of the locally quantifiable customer base. With these restrictions, a total of 163 Dunkin’ Donuts and 59 Starbucks franchise storefronts were identified in the target area.

[2] GeoCoder tool provides search queries using Yahoo or Google: http://www.gpsvisualizer.com/geocoder/
[3] Haversine formula computes great-circle distances: http://en.wikipedia.org/wiki/Haversine_formula
[4] Dunkin’ Donuts store locator: https://www.dunkindonuts.com/aboutus/store/Search.aspx
[5] Starbucks store locator (legacy): http://ie.starbucks.com/en-ie/_Our+Stores/

MBTA Stations

As noted in one journal article, the optimal storefront placement for “discretionary services” may be at intersections of high pedestrian traffic.[6] In the Boston area, the MBTA public transportation system carried an average weekday ridership of 1.24 million customers as of April 2010[7] and is a prime target for storefront placement. In this project, MBTA stations on the red, blue, green, orange, and silver lines were considered as inputs for a potential customer model. Also, as addresses are not widely used for these stations, a freely distributable list of 142 stations, current through 2006 and including GPS coordinates, was used for station location data.[8]

Visualizations

As an important part of gathering data, visualizations were used throughout the project to verify locations. Figure 1 (below) shows plots of the storefront locations and MBTA stations using both GPS and Cartesian coordinate systems. In the Cartesian coordinate system, the five-mile radius is highlighted.

Figure 1: a) Raw GPS Position Coordinates b) Cartesian Position Coordinates with 5-Mile Radius Highlighted

To provide context for the franchise storefronts and MBTA stations, the location data was overlaid on an area map,[9] as shown in Figure 2.
[6] Berman, O., Larson, R., Fouska, N., “Optimal Location of Discretionary Service Facilities,” Transportation Science, Vol. 26, No. 3, pp. 201-211, August 1992.
[7] Davey, R., “MBTA Scorecard,” April 2010. Retrieved 4/25/2010 from http://mbta.com/about_the_mbta/scorecard/
[8] Demaine, E., “Boston Subway Google Map.” Retrieved 4/25/2010 from http://erikdemaine.org/maps/mbta/
[9] Background map retrieved from Google Maps: http://maps.google.com

Figure 2: Location Data Overlaid on Map

Population Density

Gathering population density data was a challenge for this project. Although population data is readily available from decadal censuses, it is commonly aggregated by county or city, which is not conducive to spatial analysis. Fortunately, an online “Digital Atlas” of Boston includes population maps based on the 1990 census that use red dots, each representing 100 persons, randomly distributed within a census tract.[10] With some post-processing in Adobe Photoshop, the image was cropped, resized, and filtered to display only the population information, which can be read using built-in MATLAB image processing functions. The processed data is shown in Figure 3. Though there are some concerns over the accuracy of the resulting population data,[11] it should be internally consistent and helpful to the modeling process.

[10] Bowen, W., “Boston and Vicinity: Total Population,” 1997. Retrieved 4/25/2010 from http://130.166.124.2/boston/bos1.GIF
[11] It is unclear whether a “dot” is one pixel or two and whether the pixels were sampled with or without replacement. In some cases, one pixel could represent somewhere between 50 and 100 people, or more if dots can overlap, though rough estimates suggest that 100 people per pixel provides accurate population data.
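As an illustration of how the filtered dot image can be turned into a population estimate, the following Python sketch aggregates a binary dot mask into square grid cells. It is a stand-in for, not a reproduction of, the MATLAB processing described above: the function name, the nested-list mask format, and the cell size are hypothetical, and it adopts the one-pixel-per-dot, 100-persons-per-dot reading discussed in the footnote.

```python
PERSONS_PER_DOT = 100  # atlas convention; one pixel per dot is an assumption


def population_grid(mask, cell_size):
    """Aggregate a binary dot mask into population counts per square cell.

    mask is a list of rows, each a list of 0/1 values where 1 marks a
    population dot. cell_size is the side length of a grid cell in pixels.
    """
    n_rows = -(-len(mask) // cell_size)      # ceiling division
    n_cols = -(-len(mask[0]) // cell_size)
    grid = [[0] * n_cols for _ in range(n_rows)]
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                grid[r // cell_size][c // cell_size] += PERSONS_PER_DOT
    return grid


# Hypothetical 4x4 mask containing three dots, aggregated into 2x2-pixel cells.
example_mask = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
example_grid = population_grid(example_mask, 2)
```

Dividing each cell count by the cell's ground area would give a population density that could weight a customer demand model.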
Figure 3: a) Raw Population Data for Boston Area b) Processed Population Data of Target Area

Probabilistic Modeling

Within the topics covered in ESD.86, the discussion of spatial probability distributions involved the “nearest neighbor” problem of finding the expected distance to the closest