INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer.

The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction.

In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion.

Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand corner and continuing from left to right in equal sections with small overlaps. Each original is also photographed in one exposure and is included in reduced form at the back of the book.

Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6" x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order.

University Microfilms International
A Bell & Howell Information Company
300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA
313/761-4700  800/521-0600

Order Number 9411996

Solving large scale location-spatial interaction models for retail analysis: A GIS-supported heuristic approach

Lao, Yong, Ph.D.

The Ohio State University, 1993

SOLVING LARGE SCALE LOCATION-SPATIAL INTERACTION MODELS FOR RETAIL ANALYSIS: A GIS-SUPPORTED HEURISTIC APPROACH

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Yong Lao, B.A., M.A.

The Ohio State University 1993

Dissertation Committee:                    Approved by

M.E. O'Kelly
D.F. Marble                                Adviser
L.A. Brown                                 Department of Geography


To My Family

ACKNOWLEDGMENTS

I would like to express my sincere appreciation to Dr. Morton O'Kelly for his advice and guidance throughout the dissertation research. His encouragement and support during my years of graduate study at Ohio State have been, and will continue to be, my inspiration to work happier and harder. I am also deeply grateful to Dr. Duane Marble, who is always a source of valuable stimulation and insight. A special thank-you goes to Dr. Lawrence Brown for his great enthusiasm, help, and trust in my career development. The summer internship offered by ESRI is gratefully acknowledged. I am especially indebted to Dr. Jay Sandhu for his unfailing support during my work at ESRI. Thanks are also due to Mr. Paul Galimore and Mr. Xiaobo Zhang for their very useful suggestions on my Arc/Info work. To my best friends, Zaiyong Gou, Lin Liu, Qin Tang, Deming Xiong, and Ling Li: words would never be enough to express my gratitude for your endless support of my professional pursuits, and for your everyday kindness and understanding. Finally, I dedicate this dissertation to my parents, Lao Keying and Liang Yan, my sister, Lao Jia, and my uncle Lee Chu Hing, who are forever behind me with their love and trust.

VITA

February 8, 1967 ...... Born - Kunming, P. R. China
1984-1987 ............. Beijing University, Beijing, P. R. China
1988 .................. B.A., The Ohio State University, Columbus, Ohio
1990 .................. M.A., The Ohio State University, Columbus, Ohio
1988-1993 ............. Teaching Associate, Department of Geography, The Ohio State University

PUBLICATION

1991. M.E. O'Kelly and Y. Lao, "Mode choice in a hub-and-spoke network: a zero-one linear programming approach." Geographical Analysis, Vol. 23, No. 4, pp. 283-297.

FIELDS OF STUDY

Major Field: Geography

Studies in Location Analysis and Quantitative Methods, with Morton O'Kelly; Geographic Information Systems, with Duane Marble.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGMENTS
VITA
LIST OF TABLES
LIST OF FIGURES
LIST OF PLATES

CHAPTER

I. INTRODUCTION
   1.1 Problem Statement
   1.2 Background of Research
       1.2.1 Linking Location Allocation and Spatial Interaction
       1.2.2 Using Heuristic Algorithms
       1.2.3 Combining Location Analysis with GIS
   1.3 Research Organization

II. LITERATURE REVIEW
   2.1 Introduction
   2.2 Location-Spatial Interaction (LSI) Models
       2.2.1 Spatial Interaction Based Allocation
       2.2.2 The LSI Models
             the cost minimizing approach
             the benefits maximizing approach
             the entropy maximizing approach
       2.2.3 Summary of LSI Modeling
   2.3 Heuristic Approaches to Location-Allocation Modeling
       2.3.1 General Heuristic Methods in Location Analysis
       2.3.2 New Heuristic Approaches
       2.3.3 The Evaluation of Heuristic Methods
       2.3.4 Summary of Heuristic Approaches
   2.4 GIS Supported Visual Interactive Modeling
       2.4.1 Interactive Visualization
       2.4.2 Interactive Optimization
       2.4.3 GIS Supported Visual Interactive Modeling
       2.4.4 Summary of GIS Supported Visual Interactive Modeling
   2.5 Summary and Conclusion

III. MODEL DEVELOPMENT AND SOLUTION PROCEDURE
   3.1 Introduction
   3.2 Model Development
       3.2.1 Model Assumptions
       3.2.2 Model Formulation
       3.2.3 Model Complexity
   3.3 Solution Strategies
       3.3.1 Initialization
             random start
             interactive start
             the greedy heuristic
             the voting heuristic
       3.3.2 Search
             the vertex substitution method
             tabu search
             the hashing strategy
       3.3.3 Evaluation and Improvement
             lagrangian relaxation and subgradient search
             interactive evaluation
   3.4 Summary and Conclusion

IV. DESIGN AND IMPLEMENTATION OF THE GIS PROTOTYPE
   4.1 Introduction
   4.2 System Design
       4.2.1 Data Requirement
       4.2.2 Functional Requirement
       4.2.3 User Interface Design
   4.3 System Implementation
       4.3.1 Database Construction
       4.3.2 The Menu Structure
       4.3.3 Integration of Arc/Info and LSI Model
   4.4 Summary and Conclusion

V. RESULTS
   5.1 Introduction
   5.2 Solution Quality
       5.2.1 Problem Design
       5.2.2 The Primal Versus the Dual
   5.3 Solution Dynamics
       5.3.1 The Average Trip Length
       5.3.2 The Cost Effectiveness
       5.3.3 The Location and Spatial Interaction Pattern
   5.4 Computational Experience
       5.4.1 Run Time Analysis
       5.4.2 The Memory Requirement
   5.5 Summary and Conclusion

VI. CONCLUSION
   6.1 Research Summary
   6.2 Contributions
   6.3 Directions for Further Research
       6.3.1 Vertical Development
       6.3.2 Horizontal Development

BIBLIOGRAPHY

LIST OF TABLES

1. Major Survey Articles on Location Analysis
2. Published Books on Location Analysis
3. An Example of the Allocation Table
4. Data of the Sample Network
5. The Initial Solution Using Random Start
6. The Initial Solution Using Interactive Start
7. The Result of Using Step One of Greedy Add
8. The Result of Using Step Two of Greedy Add
9. The Node Centrality Index of the Sample Network
10. The Result of Applying Voting Rule Two to the Sample Network
11. The Accessibility Index of the Sample Network
12. The Voting Process of Rule Number Three
13. Comparison of Initialization Strategies
14. The Degree of Cut Provided by Different Tabu Strategies
15. The Hash Table for the Sample Problem
16. Functional Process within the Modeling Module
17. The Coverage and Info Files for the City of Redlands
18. The Menu Structure of GRALSIM
19. Solutions of the First Set LSI Problem
20. Solutions of the Second Set LSI Problem
21. The Computational Experience of the First Set LSI Problem
22. The Computational Experience of the Second Set LSI Problem
23. The Computational Experience of the P-median Problem

LIST OF FIGURES

1. An Output Example of the Planar Version Huff Model
2. A Taxonomy for Locational Decision Making Displays
3. The Sample Network
4. The Tabu Decision Tree
5. The Way of Hashing
6. The DFD of the Prototype Interface
7. The DFD of the Information Module
8. The DFD of the Modeling Module
9. The DFD of the Tools Module
10. The Procedure of Creating Nodal Population
11. The Main Menu
12. The Information Menu
13. The Modeling Menu
14. The Tools Menu
15. The Scenario Planning Menu
16. The LSI Modeling Menu
17. The Model Initialization Menu
18. The Model Evaluation Menu
19. The Structure of GRALSIM
20. The Solution Quality of the LSI Model
21. The Solution Quality of the P-median Model
22. The Average Trip Length Versus Beta
23. The Average Trip Length Versus P
24. The Cost-Effectiveness of the LSI Model
25. The Cost-Effectiveness of the P-median Model
26. The CPU Time Versus Problem Size
27. The Memory Requirement Versus Problem Size

xi LIST OF PLATES PLATE PAGE I. An Example of Site Candidate Selection . . . 138 II. The Display of LSI Model Solution...... 142 III. The Display of the P-median Solution .... 143 IV. The Display of Both LSI and P-median Solutions...... 144 V. Multiple Windows for Viewing Solutions . . . 145 VI. The Solution Pattern of the LSI Model (P = 5, M = N = 397, S = 2 . 0 ) ...... 165 VII. The Solution Pattern of the LSI Model (P = 5, M = N = 397, fi = 3 . 5 ) ...... 167 VIII. The Solution Pattern of the LSI Model (P = 5, M = N = 397, B = 5 . 0 ) ...... 169 IX. The Solution Pattern of the P-median Model (P = 5, M = N = 397) 171 X. Trade Area Mapping for Two Selected S t o r e s ...... 176

CHAPTER I

INTRODUCTION

1.1 Problem Statement

This research integrates optimization methods and Geographic Information Systems (GIS) to perform highly sophisticated retail location analysis. The fundamental question being asked is: given the current modeling capability and GIS technology, how can we effectively tackle the large scale location-spatial interaction (LSI) problems which typically arise in siting retail outlets? Here "large scale" refers to problems that usually involve hundreds of facility candidates. The primary objectives of the research are twofold. First, to develop an efficient hybrid algorithm based on the investigation of a variety of heuristic techniques. Second, to link the solution procedures with GIS software so that a retail site analysis system is created, allowing for visual interactive modeling, mapping, and query. The research will contribute to the theories of location analysis and spatial interaction, as well as to retail applications within a GIS environment.

Finally, in step three the model and solution strategies are integrated with GIS. From the modeling perspective, this step introduces interactivity and visual graphic support, which are widely recognized to have many advantages over the traditional batch-mode approach. From the GIS perspective, this step greatly enhances the analytical functionality within current GIS systems. Ultimately, the goal is to take advantage of previous studies and current GIS technology and to develop more efficient and effective ways to deal with real world location-spatial interaction (LSI) problems.

1.2.1 Linking Location Allocation and Spatial Interaction

The early researchers in locational studies focused on well-formulated location problems, with the assumption that both location and allocation are under the control of the model designers. In general, attention was mainly paid to the following prototype problems as well as their variations: (1) the classical Weber problem, first extended as a location-allocation problem by Cooper (1963); (2) the p-median and p-center problems, first introduced by Hakimi (1964); (3) the set covering problem, first formulated by Toregas et al (1972); (4) the simple plant location problem, often attributed to the work by Kuehn and Hamburger (1963) and by Manne (1964).

In all these location problems, a deterministic allocation rule is adopted such that customers are assigned to facilities based on the minimum travel distance or time. This is the so-called nearest center hypothesis. However, when applied in retail location analysis, this hypothesis has invited criticism from many social scientists who are engaged in the study of consumer travel behavior. Their research suggests that the nearest center rule may not be followed by consumers in the real world (see, for example, Clark and Rushton, 1970; Fingleton, 1975; Hubbard, 1978; O'Kelly, 1983; Bacon, 1984). Besides travel cost, many other factors such as store size, service quality, types of goods, and price range can also affect people's choices of shopping places. Therefore, they argue that more realistic representations should be included in location allocation modeling to more closely reflect human travel behavior. This has led to the extension and refinement of the traditional location models such that probabilistic allocation rules are employed.
Within geography, since spatial interaction modeling has grown into a well recognized field with rich theories and applications, its fruits have been readily incorporated into location allocation models (for reviews, see Wilson et al, 1981; Fotheringham and O'Kelly, 1989). Basically, the strategy is to convert the nearest center, all-or-nothing assignment rules into the rules suggested by spatial interaction theory:

a) customers (demand) are assigned to a facility not only based on their traveling cost, but also according to the attractiveness of the facility.

b) if more than one facility is to be located, they are considered to compete with each other. Consequently, the degree to which a facility attracts a customer is designated as the probability that the customer will travel to this facility.

In other words, the traditional all-or-nothing assignment has been changed to a probabilistic assignment. In short, the location-spatial interaction (LSI) model differs from previous location allocation models in that the flows between demand points and facility locations are allocated according to the facilities' characteristics and their spatial relationships (O'Kelly, 1987). Such an integration improves the realism of location allocation modeling, making it more insightful in terms of understanding real world consumer search behavior in connection with the site selection process.

1.2.2 Using Heuristic Algorithms

For many years solving large scale location allocation problems has been a challenging task facing scientists. Some major obstacles are: 1) models of location applications often involve thousands of variables and constraints (see, for example, Domich et al, 1991); 2) many location models are nonconvex, nonlinear optimization problems (see Brandeau and Chiu, 1989); 3) many have been shown to belong to the class of intractable problems known as NP-hard (for discussions, see Cornuejols et al, 1977; Garey and Johnson, 1979; Kariv and Hakimi, 1979; Papadimitriou, 1981; Megiddo and Supowit, 1984). In other words, an increase in problem size will result in an exponential or factorial increase in the number of operations needed to optimally solve the problem (Current and Schilling, 1990a); 4) aggregation approaches to large location allocation problems are not attractive because of various errors encountered in estimating the inter-point distances and the objective function, as well as boundary effects (Rushton, 1989; Current and Schilling, 1990a; Densham and Rushton, 1991). For literature concerning aggregation effects, see Goodchild (1979), Bach (1981), Casillas (1987), Daskin et al (1987), and Current and Schilling (1987, 1990b).

Due to the above complications, it is often necessary to develop efficient heuristic methods for dealing with large location allocation problems. A heuristic is defined as "an approximate algorithm that finds good (or near optimal) solutions, but not necessarily the optimal solution, to a problem" (Current and Schilling, 1990a). Clearly the tradeoff here is to save computational time and effort at the price of less accurate solutions. As a matter of fact, the use of heuristics is often more desirable in real world applications.
This is because "optimal" solutions are often subject to modifications based on many decision elements and concerns that are usually impossible to include in the optimization model. In this sense, relatively satisfactory solutions generated by heuristics may well be accepted by decision makers.
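To make the tradeoff concrete, here is a minimal sketch of a greedy-add location heuristic of the kind surveyed later: it opens p facilities one at a time, always choosing the site that most reduces total assignment cost. The five-node distance matrix is hypothetical illustration data, not an instance taken from this dissertation.

```python
# Greedy-add heuristic sketch: finds a good set of p sites, but with
# no guarantee of optimality in general. The distance matrix is
# hypothetical illustration data.
dist = [
    [0, 2, 7, 5, 4],
    [2, 0, 4, 6, 3],
    [7, 4, 0, 3, 6],
    [5, 6, 3, 0, 2],
    [4, 3, 6, 2, 0],
]

def total_cost(sites):
    """Total travel cost when each demand node uses its nearest open site."""
    return sum(min(dist[i][j] for j in sites) for i in range(len(dist)))

def greedy_add(p):
    """Open p sites one at a time; each step adds the candidate that
    yields the lowest total cost given the sites already chosen."""
    sites = []
    while len(sites) < p:
        candidates = [j for j in range(len(dist)) if j not in sites]
        sites.append(min(candidates, key=lambda j: total_cost(sites + [j])))
    return sites

print(greedy_add(2), total_cost(greedy_add(2)))  # [1, 3] 7
```

Each greedy step is cheap, which is why heuristics of this style scale to the hundreds of candidate sites mentioned in Section 1.1, at the risk of missing the true optimum on larger instances.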

1.2.3 Combining Location Analysis with GIS

From the very beginning, both location allocation and spatial interaction modeling have capitalized on rapid advances in computer technology. There is no doubt that the use of powerful computer hardware, software, and solution algorithms has made such analysis more challenging and rewarding. In recent years, the development and application of GIS technology has opened a new avenue for even more realistic and advanced spatial analysis. It is well recognized that a new trend is emerging in an attempt to integrate location analysis with the strong GIS capability for spatial data handling (Gaile and Willmott, 1989, p.781). From the perspective of spatial analysis, such an integration has many advantages, for example:

(a) easy access and manipulation of spatial data;
(b) more realistic representation of the environment;
(c) interactive modeling and evaluation;
(d) scientific visualization and output control.

On the other hand, the complexity of the integrative approach should also be fully understood. In general, the following questions are of primary concern:

(a) what problems are suitable for integrative analysis?
(b) how should the problems be represented, and at what scale and level of detail?
(c) what are the appropriate solution methods?
(d) how can the models be implemented coherently in a GIS environment?

Clearly these are challenging questions faced by every researcher who works on GIS applications. They can only be answered with respect to specific problems and the GIS domain at hand. Therefore, one of the major objectives of this research is to explore such complexities and to realize the integrative advantages.

1.3 Research Organization

This research includes a sequence of distinct stages: i) an extensive review of the literature on location-spatial interaction (LSI) models, solution methods, and relevant modeling approaches in GIS; ii) formulation of a LSI model and construction of solution strategies; iii) development of a prototype GIS system for retail site location analysis by incorporating the LSI model as well as the solution strategies created in stage ii; and iv) simulation and experimentation with the model parameters, algorithms, and system performance.

Chapter II summarizes previous literature relevant to the research. This review encompasses multidisciplinary studies focusing on model formulations, solution procedures, and the integration with GIS. Although this research emphasizes the solution strategies, the model formulations are crucial in determining how to translate the basic concepts of spatial interaction into mathematical language that is logically linked with traditional location allocation models. The formulations also suggest the complexity of the model, thus directly affecting the solution procedures. Since one major objective of this research is to develop an efficient algorithm for solving large scale location-spatial interaction (LSI) problems, it is necessary to learn about previous algorithmic ideas and strategies, as well as their effectiveness and weaknesses. A substantial investigation of heuristic methods will provide a comprehensive picture of the existing solution techniques. They form the basis for constructing new algorithms in this research. There has been no previous attempt to systematically summarize the literature on GIS supported interactive modeling. The review in Chapter II is intended to fill this gap. In doing so, a variety of issues associated with both GIS and optimization modeling will be discussed, with the focus on how to integrate GIS and modeling coherently.
It is believed this part of the review will provide constructive guidelines for designing and implementing the proposed prototype system.

Chapter III is divided into two parts. The first part identifies a specific location-spatial interaction (LSI) model for retail site selection. Attention will be given to variable definitions, the model objective function, and constraints. Even for the same problem, there are many ways to generate model formulations. The main purpose is to use an example to show how concepts in spatial interaction theory can be incorporated into conventional location allocation models. The second part of Chapter III is devoted to the development of solution procedures. It represents perhaps the most important contribution of this research. Therefore, every detail of the algorithm construction process will be examined. Like many other heuristics, the solution procedures are divided into a starting phase, a search phase, and an evaluation phase. The most challenging task is to analyze different solution strategies and to organize them into a logical and efficient framework. Since the heuristic is used for interactive modeling, consideration also needs to be given to user involvement.

Chapter IV presents the prototype system that integrates modeling and GIS functionality. This attempt to link the modeling function with the database management function has raised many interesting issues. In particular, database construction and user interface design are among the major concerns. Again the whole process will be illustrated in detail. Nevertheless, many of these issues are simply beyond the scope of this research; for example, the spatial data structure issue opens a wide list of unsolved problems. Since not much work has been done on combining optimization models and GIS, hopefully the current exploration can make a good contribution and offer feedback for further research in this area.

In Chapter V, the detailed experimental results are reported.
The experiments are conducted to demonstrate how the interaction among user, solution algorithm, and GIS can effectively deal with retail location problems under different assumptions and in various situations. Specifically, the following aspects will be reported: (1) solution quality analysis: given different problem sizes and parameters, how well does the model perform in reaching an optimal solution; (2) solution dynamics analysis: how the model solutions differ under different assumptions and parameters; in particular, what spatial implications attach to these different model outputs, and how user involvement may enhance the effectiveness of location and spatial interaction; (3) run time and memory analysis: the computer running time and storage space needed to solve a problem, and how they are affected by different model elements and solution procedures; (4) map generation: the hard copy maps that contain the modeling results.

Finally, in Chapter VI, the research as a whole will be summarized and conclusions will be drawn. In general, three things are discussed: (i) a summary of the entire working effort; (ii) the contributions of this research; and (iii) directions for further research.

CHAPTER II

LITERATURE REVIEW

2.1 Introduction

The field of location analysis has experienced tremendous growth in the last three decades. It spans many academic disciplines, including business, computer science, economics, engineering, geography, mathematics, operations research, planning, and regional science (Current and Schilling, 1990a). For instance, a 1985 bibliography on location analysis lists approximately 1800 articles (Domschke and Drexl, 1985). The extent of the literature is also evident in the number of survey papers and books published on this subject, as indicated in Table 1 and Table 2. For the purposes of the current research project, only one branch of the literature, namely location-spatial interaction modeling, is reviewed in this chapter. Although both formulations and solution methods are important components of modeling, the focus of the review is on the latter, due to the methodological orientation of this research. In the next section of this chapter the theoretical framework of location-spatial interaction (LSI)


Table 1. Major Survey Articles on Location Analysis

Author(s)                 Year          Subject
Current and Schilling     1990a         Facility Location Analysis
Current, et al            1990          Multiobjective Location Analysis
Brandeau and Chiu         1989          Classification of Location Problems
Erkut and Neuman          1989          Locating Undesirable Facilities
ReVelle                   1987          Urban Public Facility Location
Aikens                    1985          Location and Distribution Planning
Wong                      1985          Bibliography on Network Location
Francis, et al            1983          Planar, Warehouse, Network, and Discrete Locational Models
Hansen, et al             1983          Public Facility Location Models
Krarup and Pruzan         1983          The Simple Plant Location Problem
Tansel, et al             1983          Network Location
Leonardi                  1981a, 1981b  Public Facility Location Models
Krarup and Pruzan         1979          Center and Median Problems
Hodgart                   1978          Central Facility Location
ReVelle, et al            1977          Facility Location and EMS Models
Lea                       1973          Bibliography on Location-Allocation Systems
Francis and Goldstein     1974          Bibliography on Location Theory
ReVelle, et al            1970          Public and Private Location Models
Scott                     1970          Location-Allocation Models

Table 2. Published Books on Location Analysis

Author(s)                      Year  Subject
Brown                          1992  Retail Location
Mirchandani and Francis (ed.)  1990  Discrete Location Theory
Hurter and Martinich           1989  Location and Production
Louveaux, et al (ed.)          1989  Location Theory and Application
Berry and Parr                 1988  Retail Location
Love, et al                    1988  Facility Location
Ghosh and McLafferty           1987  Retail and Service Location
Ghosh and Rushton (ed.)        1987  Location-Allocation
Thisse and Zoller (ed.)        1983  Public Facility Location
Handler and Mirchandani        1979  Network Location
Francis and White              1974  Facility Layout and Location
Haggett                        1966  Location Analysis in Human Geography
Isard                          1956  Location and Space Economy
Losch                          1954  The Economics of Location

models is discussed. This is followed by an extensive review of heuristic procedures that might be used to solve LSI problems. Finally, the literature on GIS supported location modeling is examined in section 2.4.

2.2 Location-Spatial Interaction (LSI) Models

In general, LSI models are mathematical programs that simultaneously determine site locations and link allocations with spatial interaction theory. They are mostly applied to retail site location problems involving suppliers and shoppers. As mentioned earlier, the main motivation for using the LSI approach is that a spatial interaction based allocation provides a better assessment of the flows between origins (residential zones or demand places) and destinations (stores or supply places). Further, it is well known that the conventional all-or-nothing assignment is simply a special case of spatial interaction based assignment (Evans, 1973; Wilson and Senior, 1974). The differences among LSI models, then, are often reflected in the different ways of setting model objectives when incorporating spatial interaction theory into location analysis.

2.2.1 Spatial Interaction Based Allocation

The pioneering work applying spatial interaction to retail analysis is attributed to Huff (1964) and to Lakshmanan and Hansen (1965).¹ They first introduced the following planar version of a probabilistic model used for predicting the market share of a retail outlet:

¹The Lakshmanan-Hansen model adds a power parameter α to the single attractiveness variable (W_j).

P_ij = W_j^α C_ij^(-β) / Σ_{k∈N_i} W_k^α C_ik^(-β)          (1)

where the variables are:

i = origin;
j = store of interest;
P_ij = the visiting probability from place i to store j;
W_j = the attractiveness of store j, usually defined as j's size;
C_ij = the cost of travel from i to j;
α, β = non-negative parameters indicating the degree of influence;
N_i = the set of competing facilities.

The model states that the flow probability between demand i and store j equals the ratio of outlet j's gravity-form utility to the sum of the utilities of all existing stores. In other words, both attractiveness and accessibility become factors governing consumers' spatial choices. This approach is regarded as more realistic in portraying shoppers' behavior than the conventional nearest center allocation rule. As a result, the model is widely adopted by location theorists and is usually an essential element in LSI models. Figure 1 provides an output example of the planar version Huff model, in which probability contours are drawn to delimit the market area for store number four.

One major modification of the Huff model was made by Nakanishi and Cooper (1974), who proposed what they called the multiplicative competitive interaction (MCI) model. The MCI model extends the Huff model so that more measures are allowed to represent store attractiveness. The formulation may be written as:

P_ij = A_j exp(-βC_ij) / Σ_{k∈N_i} A_k exp(-βC_ik)          (2)

A_j = W_1j^α1 · W_2j^α2 · ... · W_qj^αq          (3)

Notice that in the Huff model, only one variable (W_j^α) is employed to measure the attractiveness of store j. In contrast, the product of multiple variables (W_1j^α1 · W_2j^α2 · ... · W_qj^αq) is used in the MCI model. In cases where more data on various store attributes are available, the MCI model is very likely to provide a more accurate estimation of market share (see, for example, Jain and Mahajan, 1979; Hansen and Weinberg, 1979; Naert and Weverbergh, 1981; Ghosh, Neslin, and Shoemaker, 1984). Another often used model for predicting spatial interaction is the multinomial logit (MNL) model. It differs from the Huff and MCI models in that it mainly deals with disaggregated interaction data at the individual level.


Figure 1. An Output Example of the Planar Version Huff Model (from an unpublished manuscript by H.J. Miller and M.E. O'Kelly)

The formulation of the MNL model is not discussed here because it is seldom incorporated into LSI models. A general discussion of the MNL model can be found in McFadden (1974) and Hensher and Johnson (1981).
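As a concrete illustration of the allocation rules in this section, the following sketch computes Huff-style visiting probabilities (attractiveness raised to α, travel cost to −β, normalized over the competing stores) and an MCI-style attractiveness built as a product of attribute powers. All attractiveness scores, costs, and parameter values are hypothetical illustration data.

```python
import math

# Huff-style probabilities: store j's gravity-form utility divided by
# the summed utilities of all competing stores.
def huff_probabilities(W, C, alpha, beta):
    utils = [w ** alpha * c ** -beta for w, c in zip(W, C)]
    total = sum(utils)
    return [u / total for u in utils]

# MCI-style attractiveness: a product of several attribute measures
# (size, service quality, price range, ...), each with its own exponent.
def mci_attractiveness(attrs, alphas):
    return math.prod(a ** al for a, al in zip(attrs, alphas))

W = [10.0, 25.0, 15.0]   # attractiveness of three competing stores
C = [2.0, 5.0, 3.0]      # travel cost from one origin to each store
P = huff_probabilities(W, C, alpha=1.0, beta=2.0)
assert abs(sum(P) - 1.0) < 1e-9   # the shares form a probability distribution
```

With β = 2.0 the nearby, moderately attractive store captures the largest share; lowering β shifts probability toward the most attractive store. This sensitivity to β is the kind of solution dynamics examined in Chapter V.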

2.2.2 The LSI Models

The majority of LSI models aim to find the optimal locations for a retail system, provided that the system allocations (cash flows or customer trips) follow some sort of spatial interaction rule. Consequently, a term S_ij based either on the Huff model or on the MCI model is often included in a LSI model. In terms of model assumptions, the following aspects usually need to be addressed, and they vary widely across approaches: (1) demand: whether the demand is elastic or inelastic, static or random; (2) supply: whether the supply side is competitive or cooperative; (3) travel cost: whether it is linear, nonlinear, or stochastic; (4) spatial structure: whether it is planar, network, or hierarchical; (5) facilities: whether they are static or dynamic, capacitated or uncapacitated.

It is beyond the scope of this research to review the details of these issues. For excellent discussions, see Leonardi (1981a), O'Kelly (1987), and Brown (1992). For the sake of simplicity, three types of LSI models are identified here: the cost minimizing approach, the benefits maximizing approach, and the entropy maximizing approach. Each approach represents a unique perspective on how new retail outlets should be chosen in the context of many complicated issues -- demand versus supply, cost versus benefit, and competition versus cooperation.

(1) The cost minimizing approach The cost minimizing approach is basically an extension of the traditional p-median location model. It either replaces the all-or-nothing assignment variable (W_ij) with the spatial interaction variable (S_ij), or changes W_ij to W_ij S_ij. Thus the objective becomes minimizing a cost function that includes flow variables satisfying a spatial interaction model. Mathematically, the objective function might be written as:

\[ \min \; Z = \sum_i \sum_j O_i \, S_{ij} \, c_{ij} \qquad (4) \]

where O_i is the demand originating at zone i and c_ij is the travel cost between zone i and store j.

Here S_ij can be regarded as the probability of customer i visiting store j, given that S_ij is based on a spatial interaction model. The work by Brotchie (1969) and by Dickey and Najafi (1973) attempted to minimize the combination of facility establishment cost and gravity-based travel cost. Hodgson (1978) presented a p-median version location-allocation model that was embedded in a doubly constrained spatial interaction model. He solved the LSI model using the Teitz and Bart heuristic. Goodchild and Booth (1980) applied a production constrained LSI model for optimally siting public swimming pools in London, Ontario. Mirchandani and Oudjit (1982) formulated a competitive m-median problem on a network, in which a probabilistic spatial choice variable was introduced into the p-median objective function. The rationale behind these approaches is that by minimizing aggregated interaction cost, the retail system will realize the maximum market share or demand. Moreover, with probabilistic demand based on both the attraction of stores and travel cost, the competition effects are implicitly captured in the model (Lea and Menger, 1990b). For excellent reviews and discussions, see Fotheringham and O'Kelly (1989, pp. 151-169). Another somewhat different approach, proposed by Goodchild (1978), was to solve the p-median model initially, and to determine the size of facilities according to the nearest, all-or-nothing allocations at each facility. The size was then used as an input for solving the spatial interaction model and obtaining probabilistic allocations. These allocations were again accumulated at each facility to obtain a new size. The loop would stop when an equilibrium interaction pattern was reached. The assumption of the model was that, besides other elements, the size of a facility should also be a function of its expected usage level. The model might be useful in simulating competition among existing outlets, or in dealing with some time-dependent spatial flows.
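The cost minimizing objective with probabilistic allocation can be sketched as follows. This is a minimal illustration: the exponential deterrence form, the decay parameter β, and all of the numbers are assumptions, not a formulation taken from any of the models cited above.

```python
import math

def interaction_cost(demand, cost, open_sites, attraction, beta=1.5):
    """Objective of the form sum_i sum_j O_i * S_ij * c_ij, with S_ij taken
    from a production-constrained (Huff-type) spatial interaction model."""
    total = 0.0
    for i, o_i in enumerate(demand):
        utils = {j: attraction[j] * math.exp(-beta * cost[i][j])
                 for j in open_sites}
        denom = sum(utils.values())
        for j, u in utils.items():
            s_ij = u / denom        # probability that zone i patronizes site j
            total += o_i * s_ij * cost[i][j]
    return total

# Tiny illustrative instance: 3 demand zones, sites 0 and 2 open.
demand = [100.0, 80.0, 120.0]
cost = [[1.0, 2.0, 3.0],
        [2.0, 1.0, 2.0],
        [3.0, 2.0, 1.0]]
attraction = [10.0, 10.0, 10.0]
print(interaction_cost(demand, cost, [0, 2], attraction))
```

A location heuristic would evaluate this function over candidate sets of open sites and keep the set with the lowest aggregate interaction cost.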
(2) The benefits maximizing approach The benefits maximizing approach suggests that, in order to truly reflect consumer behavior in the process of location and spatial interaction, some kind of locational or consumer benefits needs to be maximized. In the context of transportation and land use evaluation, Neuberger (1971) first derived a locational surplus function by aggregating random choice models. His idea was further extended and summarized by Williams (1976) and Leonardi (1978). Generally, the locational surplus or user benefits function can be formulated as:

\[ LS = \frac{1}{\beta} \sum_i O_i \ln \sum_j W_j \, e^{-\beta c_{ij}} \qquad (5) \]

Since its introduction, maximizing LS has become a widely used objective in LSI models. Among these benefits maximizing models, two typical approaches can be identified in terms of how location allocation and spatial interaction are integrated. The first approach has been developed mainly by a group of British geographers (Coelho and Wilson, 1976; Harris and Wilson, 1978; Clarke and Wilson, 1985; Wilson, 1988; Williams, Kim and Martin, 1990). In general, the problem they considered is to determine the size of stores, often expressed as total floor space, as well as consumer expenditures at each chosen store, such that the total user benefits or store profits are maximized. Since a store location is available in every predefined zone, no explicit site location variable appears in the model. Within this framework, a variety of issues were studied with respect to equilibrium solutions, competitive configurations, demand elasticity, and organizational dynamics (Williams and Kim, 1990a, 1990b). The second approach evolved from the traditional planar or discrete version of the location-allocation model. Again, given the assumption that allocation follows the spatial interaction rule, the problem is to search for store sites on a continuous plane or a network so as to maximize the locational or consumers' benefits. Beaumont (1980) first generalized such a model by introducing spatial interaction elements into the Weber problem. He proved the model was in fact a two-dimensional Hotelling ice-cream-vendor location model. Further, the Weber problem is simply a special case of the planar LSI model. Using the consumers' surplus maximization objective suggested by Wilson (1976), Hodgson (1981) devised a discrete LSI model for locating motor vehicle licensing facilities in Edmonton, Canada. He showed that the model's optimal solutions were similar to those produced by the p-median model.
Erlenkotter and Leonardi (1985) formulated a spatial interaction based simple plant location model and solved it by a nonlinear branch-and-bound algorithm. Excellent reviews can be found in Leonardi (1981a, 1981b). In addition to the above benefits maximizing approaches, several location models have adopted the objective of maximizing system profit or revenue, with allocations following the MCI model (Achabal, Gorr, and Mahajan, 1982; Ghosh and McLafferty, 1982; Ghosh and Craig, 1983, 1991). Such models are usually designed to deal with multiple location problems for franchise systems such as chain stores and fast food restaurants. Excellent reviews and discussions of MCI based location-allocation models can be found in Craig, Ghosh and McLafferty (1984) and in Ghosh and McLafferty (1987).

(3) The entropy maximizing approach The entropy maximizing approach seeks the optimal facility locations embedded in the most likely O-D flow pattern, subject to the spatial interaction constraints. Clearly, there are many combinations of individual decisions that may lead to an O-D flow pattern satisfying the spatial interaction constraints. According to the idea of entropy (Wilson, 1967, 1970, 1974), each possible combination is a "state" of the spatial interaction system, and each state has an equal likelihood of occurring. For a given distribution pattern S = {..., S_ij, ...}, the number of states associated with it can be written as:

\[ H(S) = \frac{N!}{\prod_{ij} S_{ij}!} \qquad (6) \]

where N is the total number of trips, i.e., N = Σ_ij S_ij. Based on information theory (see Jaynes, 1957), the O-D trip pattern with the highest probability of occurring is the one that corresponds to the maximum number of states. Mathematically, this pattern can be determined by maximizing equation 6 or its logarithm:

\[ \max \; \ln H = \ln N! - \sum_{ij} \ln S_{ij}! \qquad (7) \]

Since the term ln N! is a constant, it can be dropped from equation 7. The objective function can now be rewritten as:

\[ \max \; -\sum_{ij} \ln S_{ij}! \qquad (8) \]

If all S_ij are large, then using Stirling's approximation, ln x! ≈ x ln x - x, equation 8 becomes the commonly seen entropy objective function:

\[ \max \; -\sum_{ij} S_{ij} \left( \ln S_{ij} - 1 \right) \]

The entropy maximizing objective is widely adopted in traffic assignment models in which facility locations are not considered. For example, Jornsten (1980) presented an entropy-based combined distribution and assignment model, with explicit flow cost and flow capacity constraints. He solved the problem by using Benders' decomposition. For excellent discussions, see Erlander (1980) and Sheffi (1985). In the LSI literature, O'Kelly (1987) formulated a p-median based location-allocation model with a combined entropy maximizing and cost minimizing objective function. He demonstrated that the conventional p-median model is a special case of his LSI model. He also presented a dual-based exact algorithm for tackling the LSI model. It is notable that the benefits maximizing approach and the entropy maximizing approach have generated very similar objective functions, although the formulations are derived from different perspectives. In this regard, Neuburger (1971) wrote that: "... (Wilson's) entropy function should be one of the admissible utility functions. The resemblance seems to be purely formal, and any attempt to interpret entropy as utility or vice versa is likely to be futile." Whether such linkages are pure mathematical coincidence, as Neuburger speculated, remains to be investigated. Similarly, the work by Evans (1973), Wilson and Senior (1974), and Erlander (1980) has demonstrated that mathematical linkages do exist among the cost minimizing, benefits maximizing and entropy maximizing approaches. But these studies did not provide clear micro-economic or behavioral interpretations.
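The Stirling step used above can be checked numerically. In this short sketch, `math.lgamma` supplies the exact value of ln x! for comparison, and the sample flow values are arbitrary.

```python
import math

def ln_factorial(x):
    """Exact ln x! via the log-gamma function: ln x! = ln Gamma(x + 1)."""
    return math.lgamma(x + 1)

def stirling(x):
    """Stirling's approximation: ln x! ~ x ln x - x."""
    return x * math.log(x) - x

# The relative error of the approximation shrinks as the flow values grow,
# which is why the substitution requires all S_ij to be large.
for s in (10.0, 100.0, 1000.0):
    exact, approx = ln_factorial(s), stirling(s)
    print(s, round((exact - approx) / exact, 5))
```

For a flow of 10 the approximation is off by more than ten percent, while for a flow of 1000 the error is below a tenth of a percent.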

2.2.3 Summary of LSI Modeling Like traditional location-allocation models, LSI models deal with a variety of issues arising in the locational process. Despite the complications, the key to LSI modeling is to replace the deterministic assignment rule with a probabilistic assignment rule. The review of the literature indicates that Huff's gravity-based approach and the MCI model are the ones most often adopted in LSI models. The MCI model is more general in allowing multiple attractiveness measures to be included in the LSI model, yet it requires more effort in terms of data collection and calibration. The MNL model is seldom incorporated into location-allocation models due to the difficulties of obtaining disaggregate data. Most LSI models embed an objective function in a production constrained or a doubly constrained spatial interaction model. Generally, there are three types of objective functions: the cost minimizing objective, the benefits maximizing objective, and the entropy maximizing objective. Although derived from different perspectives, these objectives either have similar mathematical formulations or generate close optimal solutions. The theoretical and operational linkages among different LSI models remain an interesting research topic.

2.3 Heuristic Approaches to Location-Allocation Modeling Although exact algorithms (e.g., branch and bound, decomposition, and dynamic programming) have been widely applied to solve location-allocation problems, such approaches are often infeasible in the following situations (Brandeau and Chiu, 1989): (a) large-scale, discrete location problems that have too many potentially optimal solutions; (b) nonlinear location problems for which derivatives are not well-defined, so that many differentiable optimization techniques cannot be applied; (c) nonlinear location problems with nonconvex objective functions, so that there exist many locally optimal solutions; (d) planar location problems that have an infinite number of potentially optimal solutions. Therefore, heuristic methods are appealing in the above contexts for offering trade-offs between computational effort and solution quality. A heuristic is defined as "an approximate algorithm that finds good (or near optimal) solutions, but not necessarily the optimal solution, to a problem" (Current and Schilling, 1990a). Usually heuristic approaches involve much less complicated algorithmic procedures than exact methods, resulting in lower costs in terms of computing time and storage memory. In fact, in many practical situations where the goal is to compare alternative location choices and to seek a satisfactory rather than an optimal decision, using exact methods is not absolutely necessary. Even in cases where exact algorithms are applicable, heuristic methods are often employed as a first step to obtain a good starting solution and an upper bound on the optimal objective function (Handler and Mirchandani, 1979, p. 58). A good example can be seen in the work by Domich et al. (1991).

2.3.1 General Heuristic Methods in Location Analysis Many heuristic techniques have been developed for solving large combinatorial problems. A general classification can be found in Ball and Magazine (1981). In the following section, some of the most often employed heuristic approaches in location analysis are discussed.

(1) The greedy heuristic The rationale behind a greedy heuristic is to start by making an initial guess or computing a feasible solution, and then to move along certain directions toward the goal as fast as possible, getting as close to the goal as possible. The heuristic stops when the current solution can no longer be improved by moving. In solving location-allocation problems, the greedy heuristic has two sub-approaches: the ADD-procedure and the DROP-procedure. The ADD-procedure, initially developed by Kuehn and Hamburger (1963), was applied to solve a warehouse location problem. It contains two parts, the "main program" and the "bump and shift routine". At each step, the main program adds to the configuration the warehouse that generates the greatest cost savings. The program stops when no additional facility can be added without incurring a higher total cost. The bump and shift routine enters after the main program terminates. It provides a final chance to refine the location configuration created by the main program. The bumping procedure drops facilities that are no longer economical because their customers have been reassigned to facilities located subsequently. The shifting procedure considers moving each facility to other potential sites within its territory in an attempt to further reduce the total cost. The Kuehn and Hamburger heuristic was applied to twelve sample problems, all involving 24 potential locations and 50 markets, under assumptions of linear facility operating and transportation costs. The solutions reached were either equal to or better than those obtained through the alternative methods considered. In fact, the main program performed so well that the bump and shift routine may not be needed, since it produced little or no improvement in the cost function. In contrast to the ADD-procedure, an opposite greedy heuristic is the DROP-procedure, first designed by Feldman, Lehrer and Ray (1966).
The technique begins with every site containing a facility; that is, all locations are open. At each iteration, a facility is dropped at the location where the largest saving is achieved. The process terminates when no further reduction in the total cost function can be obtained. The DROP-procedure was tested on the same problems used in the Kuehn and Hamburger paper. It was reported that the drop heuristic improved upon the add heuristic by an average of 3%. The authors also argued that the DROP-procedure was better suited for a location system with concave operating costs, since in such cases the technique was more likely than the add technique to avoid premature termination.

(2) The alternate (or sequential) location-allocation heuristic The alternate location-allocation (ALA) approach was first developed by Rapp (1962), and formally presented by Cooper (1964) as well as by Maranzana (1964). The basic idea includes the following steps: (i) divide the whole service area into m approximately equal sub-regions; (ii) find a single optimal facility location for each sub-region; (iii) given the above location pattern, redefine the service area for each facility; (iv) repeat (ii) and (iii) until no further improvement can be made. Notice that the ALA heuristic differs from the greedy heuristic in that it allows earlier decisions to be changed, instead of permitting only one-way moves (from 0 to 1 in ADD, from 1 to 0 in DROP) associated with the largest savings. On the other hand, the ALA method can be integrated with the greedy heuristic as well as with other exact or approximate methods in steps (ii) and (iii) (Jacobsen, 1983). For example, one could use the ADD-procedure in step (ii) and the DROP-procedure in step (iii), or vice versa. Such advantages have led to the ALA heuristic being widely adopted for solving warehouse location and network median location problems (see, for example, Eilon et al., 1971; Baxter, 1981; Love and Juel, 1982; Brandeau et al., 1986).
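Steps (i)-(iv) of the ALA heuristic can be sketched as follows. This minimal version assumes squared Euclidean cost, so the single-facility location step reduces to taking the centroid of each sub-region; the demand point coordinates are hypothetical.

```python
import math
import random

def ala(points, m, iters=100, seed=0):
    """Alternate location-allocation (Cooper-style) sketch on the plane.
    The locate step uses the centroid, i.e. squared Euclidean cost is assumed."""
    rng = random.Random(seed)
    centers = rng.sample(points, m)
    for _ in range(iters):
        # Allocate: assign each demand point to its nearest current center.
        groups = [[] for _ in range(m)]
        for p in points:
            k = min(range(m), key=lambda j: math.dist(p, centers[j]))
            groups[k].append(p)
        # Locate: move each center to the centroid of its sub-region.
        new_centers = []
        for g, c in zip(groups, centers):
            if g:
                new_centers.append((sum(x for x, _ in g) / len(g),
                                    sum(y for _, y in g) / len(g)))
            else:
                new_centers.append(c)
        if new_centers == centers:   # stop when no further improvement is made
            break
        centers = new_centers
    return centers

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(ala(pts, 2))
```

As the text notes, either step can be swapped for a more elaborate procedure; the allocate/locate alternation is the essential structure.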
(3) The vertex substitution method (the interchange heuristic) The vertex substitution method (VSM), first presented by Teitz and Bart (1968), is one of the most widely used algorithms for dealing with location-allocation problems. Like the ALA, it was originally designed to solve the p-median problem. The algorithm starts with an initial feasible solution, then substitutes a location not in the current solution for each one that is in the solution. The substitution that yields the greatest improvement is accepted. The process continues until no improvement can be made by such interchanges. Extensive computational studies have been published based on tests of the VSM and its variations for generating solutions in location-allocation models (see, for example, Garfinkel et al., 1974; Khumawala et al., 1974; Cornuejols et al., 1977; Rosing et al., 1979; Jacobsen, 1983; Krarup and Pruzan, 1983; Densham and Rushton, 1991; Klincewicz, 1991). According to these reports, although the VSM is not known to be polynomially bounded and does not perform as well as the greedy heuristic on the basis of worst-case analysis, overall it is superior to other heuristics because of its operational simplicity and computational robustness. Specifically, the VSM has the following attractive characteristics: (a) It frequently reaches optimal or close-to-optimal solutions. For instance, Khumawala et al. (1974) reported that their VSM converged to the optimum in all but one of 27 problems tested. Klincewicz (1991) used both single-exchange and double-exchange heuristics for solving the p-hub problem. In most cases the VSM performed very well, enabling him to capture the optimal solutions. In the largest problem tested, with 10 hubs and 52 nodes, the VSM arrived at a solution less than 6% from the optimum within 11 seconds. (b) The VSM has been demonstrated to be easily integrated with other exact and heuristic methods.
For example, Jarvinen et al. (1972) combined a branch-and-bound algorithm with the VSM in an attempt to solve the m-median problem more effectively. Cornuejols et al. (1977) and Jacobsen (1983) showed how the VSM could be successfully integrated with the greedy ADD and DROP procedures. (c) Since the VSM is not tied to a particular data structure or implementation strategy, it can be modified and applied to many location-allocation models besides the p-median problem (Hillsman, 1984; Densham and Rushton, 1992). In fact, several spatial-interaction-based location-allocation models have been solved by using modified versions of the VSM (see, for example, Hodgson, 1978; Jacobsen, 1987).
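The basic substitution loop can be sketched for the p-median problem as follows. This first-improvement variant accepts any improving swap rather than scanning for the single best one, and the distance matrix is a made-up example.

```python
def pmedian_cost(d, sites):
    """Total distance with each demand node assigned to its nearest open site."""
    return sum(min(row[j] for j in sites) for row in d)

def teitz_bart(d, p, start=None):
    """Vertex substitution (Teitz-and-Bart-style) sketch for the p-median
    problem; d[i][j] is the distance from demand node i to candidate site j."""
    n = len(d[0])
    current = set(start if start is not None else range(p))
    improved = True
    while improved:
        improved = False
        for out_site in sorted(current):
            for in_site in range(n):
                if in_site in current:
                    continue
                trial = (current - {out_site}) | {in_site}
                if pmedian_cost(d, trial) < pmedian_cost(d, current):
                    current, improved = trial, True
                    break            # restart the scan from the new solution
            if improved:
                break
    return current, pmedian_cost(d, current)

# 5 demand nodes, 5 co-located candidate sites (symmetric toy distances).
d = [[0, 2, 9, 8, 7],
     [2, 0, 7, 6, 5],
     [9, 7, 0, 1, 3],
     [8, 6, 1, 0, 2],
     [7, 5, 3, 2, 0]]
print(teitz_bart(d, 2))
```

The loop terminates as soon as no single swap improves the objective, which is the stopping rule described in the text.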

2.3.2 New Heuristic Approaches Recently, five new heuristic approaches have emerged for improving the efficiency of handling complex combinatorial problems: genetic algorithms, neural networks, simulated annealing, tabu search, and target analysis (for a review, see Glover and Greenberg, 1989). These methods are largely linked with artificial intelligence (AI). Some are inspired by natural phenomena or processes, such as genetic algorithms and simulated annealing. Others represent the philosophy of designing heuristics that have learning and reasoning capabilities similar to those of human beings, as in the case of tabu search and target analysis. Among them, genetic algorithms and tabu search seem to be particularly promising for solving large-scale location-allocation problems.

(1) Genetic algorithms Genetic algorithms, initially introduced by Holland (1975), are based on an analogy between the optimization process and survival behavior in biological systems. Essentially, genetic algorithms are adaptive search strategies that attempt to evolve the initial trial solutions toward the global optimum solution. There are four steps involved (Glover and Greenberg, 1989): (i) start with a set of trial solutions to a problem, called "parent solutions"; (ii) propagate "offspring solutions" from the parent solutions based on certain rules traditionally applied in genetics; (iii) retain the best offspring solutions for the next generation of mating; (iv) repeat (ii) and (iii) until the highest quality offspring compatible with the environment (i.e., the problem constraints) is found. Hosage and Goodchild (1986) explored the application of genetic algorithms to location-allocation solutions. Although their algorithms did not lead to competitive computational results, they believe there is great potential to refine the current algorithms and to further improve their performance.
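Steps (i)-(iv) can be sketched for p-median site selection as follows. This is one simple design among many, not Hosage and Goodchild's algorithm: the recombination and mutation rules, population size, mutation rate, and test matrix are all assumptions made for illustration.

```python
import random

def pmedian_cost(d, sites):
    return sum(min(row[j] for j in sites) for row in d)

def genetic_pmedian(d, p, pop_size=20, generations=60, seed=1):
    """Minimal genetic-algorithm sketch: a parent solution is a set of p open
    sites; offspring recombine two parents and occasionally mutate one site."""
    rng = random.Random(seed)
    n = len(d[0])
    # (i) start with a population of random parent solutions
    pop = [frozenset(rng.sample(range(n), p)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: pmedian_cost(d, s))
        survivors = pop[:pop_size // 2]          # (iii) keep the fittest
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)      # (ii) propagate offspring
            pool = sorted(a | b)
            child = set(rng.sample(pool, min(p, len(pool))))
            if rng.random() < 0.3:               # mutation: drop one site
                child.discard(rng.choice(sorted(child)))
            while len(child) < p:                # top up to exactly p sites
                child.add(rng.randrange(n))
            children.append(frozenset(child))
        pop = survivors + children               # (iv) next generation
    best = min(pop, key=lambda s: pmedian_cost(d, s))
    return best, pmedian_cost(d, best)

d = [[0, 2, 9, 8, 7],
     [2, 0, 7, 6, 5],
     [9, 7, 0, 1, 3],
     [8, 6, 1, 0, 2],
     [7, 5, 3, 2, 0]]
print(genetic_pmedian(d, 2))
```

Keeping the fittest parents in every generation (elitism) guarantees that the best solution found so far is never lost, at the cost of some diversity.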

(2) Tabu search* Tabu search was first introduced by Glover (1977). A detailed review can be found in two of his recent papers (Glover, 1989, 1990). The method is basically a constrained yet very flexible search procedure. It aims to avoid cycling, as well as becoming trapped in a local optimum, by forbidding certain moves (making them tabu) and by selecting the remaining moves not on the tabu list according to certain learning and unlearning rules. As a result, the heuristic in which it is embedded will continue to proceed despite a lack of improvement, thereby receiving more opportunities to arrive at the global optimum. There are four major rules which are the key elements of tabu search (Glover, 1989, 1990): (a) Short term memory function: this rule is also called short term strategic forgetting. It prevents backtracking moves within a small number of iterations. (b) Intermediate term memory function: this rule is also named intermediate term regional intensification. It "learns" the common features associated with competitive solutions during a particular period of search. Then it looks for new solutions that exhibit such features. (c) Long term memory function: this rule is also known as long term global diversification. Its goal is to diversify the search strategies by employing rules opposite to those in regional intensification. Thus it

* The word "tabu" means a forbidden move.

actually learns from the past and intends to generate new starting solutions. (d) Aspiration level function: this rule allows the tabu status of a move to be overruled if certain conditions are met. Typically, if a forbidden move leads to the best solution found so far, its tabu status can be ignored. Tabu search has been successfully applied to a variety of combinatorial problems, including the location-allocation problem. Klincewicz (1990) reported that by embedding tabu search in the VSM, he was able to obtain optimal solutions for the p-hub problem in 29 out of 32 cases. In the three cases where only suboptimal solutions were captured, the average gap from the optimum was just 1.0%. Furthermore, the computational results indicated that good solutions tended to be found in the early stages of tabu search.

(3) The allocation table approach In addition to the above AI-related approaches, there have been continuing efforts to fine-tune traditional heuristic methods. The allocation table approach, designed by Densham and Rushton (1992), represents such an example. The idea is to modify the VSM by developing more efficient implementation strategies such that the spatial structure of location-allocation problems can be fully exploited. In their paper, Densham and Rushton presented three major strategies, as outlined in the following. First, when using the VSM to solve an uncapacitated location-allocation problem, they noticed that during the process of evaluating a substitution and determining a new allocation pattern, only a subset of demand nodes needs to be examined in order to relocate them and to calculate the net changes in the objective function. This point was neglected by Teitz and Bart, since their algorithm evaluated the allocation of every demand node after each substitution. Thus large savings in processing could be achieved by exploiting this spatial structure, especially in large-scale location-allocation problems.
Second, Densham and Rushton argued that the traditional data structure employed by the VSM--distance matrices--was very inefficient for computer storage and retrieval. For instance, the storage requirement increases with the square of problem size (i.e., M², where M is the number of demand nodes). They proposed a more efficient alternative: distance strings. A distance string is a base node together with an ordered list of nodes and the distances between each of those nodes and the base node. Two types of distance strings can be developed: the demand string and the candidate string. The demand string records distance information (in ascending order) between every demand node and the potential candidates that might serve it. Similarly, the candidate string records distance information (in ascending order) between every candidate node and the potential demand nodes it might serve. According to Densham and Rushton (1992), the use of distance strings can reduce the upper bound on data storage requirements from M² to (M * N), where N is the number of candidates. Third, in order to further reduce data computation and access time, Densham and Rushton introduced the allocation table, in place of the swap matrix used by Teitz and Bart. An allocation table contains M columns (again, M is the number of demand nodes) and six rows. Rows 1 and 2 record the identifier and the weighted distance, respectively, of the closest facility for each demand node. In the same way, rows 3 and 4 record the information for the second closest facility for each demand node. Rows 5 and 6 show the identifier and the weighted distance of the candidate resulting from a substitution. Finally, net changes in the objective function are listed at the bottom of the table (Table 3).
By using the allocation table in conjunction with the distance strings and other data pre-processing techniques, Densham and Rushton showed that it was easier and faster to carry out node substitutions and to update information on the objective function. Consequently, great savings in data storage, access and processing could be realized. These fine-tuning strategies enabled them to handle large problems of up to 3,000 nodes in a micro-computer environment.
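The distance string and allocation table ideas can be sketched as follows. This is a simplified reading of Densham and Rushton's data structures, not their implementation; the distance matrix is a made-up example.

```python
def build_demand_strings(d):
    """Demand strings: for each demand node, the candidate sites sorted in
    ascending order of distance (one ordered list of length N per node)."""
    return [sorted(range(len(row)), key=row.__getitem__) for row in d]

def allocation_table(d, open_sites):
    """Per-demand-node record of the closest and second closest open facility
    (identifier and distance), mirroring rows 1-4 of Table 3.
    Assumes at least two open sites."""
    table = []
    for row in d:
        first, second = sorted(open_sites, key=lambda j: row[j])[:2]
        table.append({"closest": (first, row[first]),
                      "second": (second, row[second])})
    return table

d = [[0, 2, 9, 8, 7],
     [2, 0, 7, 6, 5],
     [9, 7, 0, 1, 3],
     [8, 6, 1, 0, 2],
     [7, 5, 3, 2, 0]]
print(build_demand_strings(d)[0])
print(allocation_table(d, [1, 3]))
```

When a substitution closes a facility, only the demand nodes whose "closest" entry names that facility need to be reallocated, and the "second" entry supplies the replacement assignment without rescanning every candidate -- the saving that the first strategy above exploits.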

Table 3. An Example of the Allocation Table (P.J. Densham and G. Rushton, 1992)

Demand Nodes                           1    2    3    4    5   ...

Closest Facility       Identifier
                       Weighted Distance

2nd Closest Facility   Identifier
                       Weighted Distance

Candidate              Identifier
                       Weighted Distance

Net Changes in Objective Function

2.3.3 The Evaluation of Heuristic Methods Since heuristic methods do not guarantee an optimal solution to a problem, and their performance varies greatly with both the test problems and the computing environment, there is always a need to evaluate the quality of any heuristic technique. Ball and Magazine (1981) suggested seven criteria for evaluating the performance of heuristic algorithms: (1) Quality of solution. This includes the closeness of the solution to the optimum and the ability to find a feasible solution if one exists. (2) Running time and storage requirement. This refers to the time and storage space needed to solve a problem. (3) Difficulty of implementation. This refers to the complexity of the coding and the extent of the data requirement. (4) Flexibility. That is, the ease of handling changes in the objective function and constraints. (5) Robustness. This refers to such capabilities as performing sensitivity analysis, generating bounds, reporting information on the dual, etc. (6) Simplicity and analyzability. Good heuristics should be clearly constructed and readily lend themselves to analysis. (7) Interactive computing. This indicates the ability to take advantage of man-machine interaction. In terms of techniques for evaluating heuristics, a brief review can be found in Brandeau and Chiu (1989). Generally, three techniques are often used in addition to empirical studies: (a) Bounds on the optimal solution. This approach seeks bounds on the optimal solution of a problem so that the solution generated by a heuristic can be compared and contrasted (see, for example, Efroymson and Ray, 1966; Cornuejols et al., 1977; Rosing et al., 1979; Juel, 1981; Love and Dowling, 1989). (b) Worst-case analysis. This technique studies the performance of a heuristic under the worst situations. It specifies the maximum possible gap between a heuristic solution and an optimal solution.
Excellent research on this subject can be found in Cornuejols et al. (1977) and in Fisher (1981). (c) Probabilistic analysis. In this approach, certain probability distributions are assumed for the problem data. Based on this assumption, the performance of a heuristic on a set of test problems is studied and its probabilistic properties are established (Cornuejols et al., 1980; Hochbaum, 1984).

2.3.4 Summary of Heuristic Approaches A variety of heuristic methods exist in the literature of location-allocation modeling. In this section, three traditional and three newly developed heuristics have been reviewed. The traditional approaches--the greedy heuristic, the ALA heuristic, and the VSM--have been widely applied for solving location-allocation problems, and their properties and performance have been well examined under various conditions. Generally, the VSM is considered superior to other heuristics because of its operational simplicity and computational robustness. Among the many newly developed heuristic methods, genetic algorithms and tabu search seem to have the greatest potential for solving large-scale location-allocation problems. However, few attempts have been made to test their effectiveness in solving location problems. This prompts the need for future research on the applicability and usefulness of these AI-based heuristics in location modeling. The allocation table approach is essentially an extension and enhancement of the Teitz and Bart heuristic. Its major advantages are achieved by adopting a new way of storing data and by modifying the rule of vertex substitution. So far the method has only been applied to location problems with deterministic allocation rules. Whether it is suitable for dealing with LSI models remains an unanswered question. Performance evaluation of heuristic methods is itself a complicated research agenda. There are many criteria and techniques that can be employed, and they vary greatly depending on the nature of the test problems and the computing environment. In most cases, the main concerns are the quality of the solution and the operational efficiency in terms of running time and storage requirements.

2.4 GIS Supported Visual Interactive Modeling A geographic information system (GIS) is a computer-based system for dealing with spatial data. It is defined to have the following components (Marble, 1990a): a) a data input subsystem which collects and/or processes spatial data; b) a data storage and retrieval subsystem; c) a data manipulation and analysis subsystem which performs a variety of tasks ranging from data aggregation to space-time optimization or simulation; d) a data reporting and visualization subsystem. Since GIS represents the state of the art in computer technology for spatial data handling, it will inevitably exert a significant impact on the theory and methodology of spatial analysis (Marble, 1990b). As a matter of fact, the development and application of GIS technology in recent years has achieved remarkable success, especially in areas such as environmental assessment, resource and facility management, automated mapping, and urban and regional planning (Tomlinson, 1987). However, the majority of GIS applications have focused on the limited domain of spatial data input, data storage, and map output, i.e., on subsystems a), b), and d) of Marble's definition. Research and development on the data analysis subsystem is still very limited, which has resulted in a so-called "data rich-theory poor" GIS environment (Goodchild, 1987; Burrough, 1990; Openshaw, 1990). This is partly due to the fact that the field of GIS is still in its infancy. Many complex technical issues related to data storage, display and exchange have not been fully resolved, making it expensive and difficult for GIS to support highly sophisticated spatial analysis. On the other hand, many enthusiasts of GIS are simply attracted by the use of high technology for producing high quality maps, and are less interested in developing proper spatial analytical theories and methods amenable to GIS (Goodchild, 1987).
The lack of analytical capability in the modern GIS is also reflected in location-allocation related GIS applications. Goodchild wrote in 1987 that:

There are no systems currently on the market which recognize the object-pair and its importance to spatial analysis. Furthermore, the capabilities of most current systems for sophisticated analysis are very limited. Well-known algorithms for optimal spatial search, such as location allocation, have, until recently, been virtually unknown in the GIS field (Goodchild, 1987).

A review of the recent literature confirms Goodchild's comment. While some attempts have been made to implement graphics-based location-allocation models, only a few of them aimed to link location-allocation modeling with a true geographic information system possessing all the components of Marble's definition. Based on the level of data representation and modeling capability, three types of approaches can be identified: interactive visualization, interactive optimization, and GIS supported interactive modeling.

2.4.1 Interactive Visualization

It is natural that many location analysts have started to use GIS as a tool for interactive visualization, since the process of location and allocation, whether on a plane or on a network, is essentially a spatial phenomenon and needs to be shown on maps. The development of GIS technology has made interactive visualization an increasingly important vehicle for carrying out location-allocation modeling, and one much easier to implement than before. Research dealing with interactive visualization generally focuses on how to effectively display the results of location-allocation models. In other words, the computer system is not involved in the location decision making process; it simply performs an allocation and display function to allow interactive evaluation by the users. This method is useful when the primary concern is site analysis of known facility locations. In particular, if only one store location is to be selected from a limited number of potential locations, interactive display and evaluation can be applied to each candidate site to examine and compare aspects such as trade area, potential sales, and traffic flows. However, if there are multiple stores to locate, or if the number of candidate sites is great, it becomes less appealing to use interactive visualization without first running an optimal location model. Hopmans (1986) developed a spatial decision support system for viewing and evaluating spatial interaction patterns for given branch bank locations. Allard and Hodgson (1987) demonstrated how to use INTERGRAPH to portray allocations for LSI models. Kalinski (1992) constructed a prototype system in Arc/Info 5.0 to draw 3-D maps of retail potential surfaces. Armstrong et al. (1992) summarized the state of the art of mapping location-allocation solutions. According to them, three types of cartographic displays can be identified to support location decision making (Figure 2).
The first type, the chorognostic displays, are used to show the study area and its geographical background, such as demand, supply, and annular information. The second type, the monoplan displays, are used to show a single solution of location-allocation; the often seen center-border maps, spider diagrams, and so on belong to this category. The third type, the delta displays, are designed to compare alternative locational choices; they can be further classified as center-delta maps and allocation-delta maps. From the above review it is clear that there exist many possibilities in location-allocation mapping. To choose a good way of showing the locational process, several key issues need to be considered in the context of individual location problems. First, what are the known data or information about the study area? Are they spatial or aspatial? And how accurate are they? Second, what computer hardware and software are available? How can spatial data and their attributes be represented? What are the system's capabilities and limits in terms of data structure, storage memory, display

[Figure 2 content: Chorognostic displays (used to show general information about the study area): reference, demand, supply, annular, ancillary. Monoplan displays (used to show a single solution): center-border, nodal-chromatic, spider. Delta displays (used to compare two solutions): center-delta, allocation-delta.]
Figure 2. A Taxonomy for Locational Decision-Making Displays (From Armstrong, M.P. et al., 1992)

resolution, and graphics functionality (for example, 3-D view, dynamic segmentation, etc.)? Finally, the most important issue is how to achieve scientific visualization, that is, to organize all the information in a clear and logical way and to present it through intelligently designed cartographic outputs. These issues will be examined in detail later in this research project.

2.4.2 Interactive Optimization

Work on interactive optimization has mostly been undertaken by researchers and analysts from OR/MS. Usually the objective is to use a graphics-based, interactive algorithm for solving location-allocation problems. A typical example is the paper by Brady and Rosenthal (1980), in which they solved the planar version of the minimax location problem with an algorithm linked with man-machine interaction. Malczewski and Ogryczak (1990) devised an interactive multiobjective programming model for locating pediatric hospitals in Warsaw, Poland. The system they implemented on a PC allowed users to communicate with the optimization model by defining and modifying aspiration and/or reservation levels, the mechanism for controlling the modeling process. Recently two real-world applications using interactive optimization have been reported: one developed AT&T's telemarketing site selection system (Spencer III et al., 1990), and the other located tax facilities for the IRS (Domich et al., 1991). A general discussion of interactive optimization with respect to facility location planning can be found in an early paper by Geoffrion (1975), while more recent reviews are provided by Fisher (1985) and by Bell (1991). In the last several years some attempts have been made to enhance the interactive optimization approach by combining operations research and geographic information systems. For instance, some commercial GIS software such as Arc/Info (ESRI), TransCAD (Caliper Corporation), and GeoRoute (GIRO, Inc.; see Lapalme et al., 1992) is equipped with optimization modules containing well-developed computer algorithms from OR. The purpose is to help decision makers perform transportation planning (routing, scheduling, traffic assignment, etc.) more efficiently and at a more realistic level.
However, so far most commercial efforts at combining OR and GIS have been directed toward network-related applications; location-related applications have been almost entirely neglected.

2.4.3 GIS Supported Visual Interactive Modeling

Research in this category aims to combine interactive visualization, interactive optimization, and a user-friendly interface into a coherent system. It enables users to search the database, to navigate the process of mathematical modeling, and to control the cartographic output. In the field of location analysis, such an integration with GIS is widely regarded as having great promise (Gaile and Willmott, 1989, p. 781). First, solving location problems requires manipulation of spatial objects in a two-dimensional domain. Thus location analysis readily lends itself to a geographic information system for data input, storage, and display. For instance, facility locations can be represented by points; routes or linkages can be shown by lines; and facility hinterlands or allocation patterns can be defined by polygons. Second, with regard to modeling, the optimization process in location analysis is, in essence, to search the attributes of demand points and to determine the best spatial relationships among them subject to a set of constraints. Similarly, geographic information systems are frequently built upon a relational database, and function by dealing with spatial objects and their associated attributes. Thus, current GIS technology should have no problem incorporating the optimization process. Third, a GIS can input data from multiple sources (census data, satellite data, etc.) and has highly efficient mapping capability. This allows for more realistic data representation and location-allocation modeling (Current and Schilling, 1990a). Early works of this kind tended to be graphics-oriented modeling without sophisticated GIS database management functions. Goodchild and Noronha (1983) built a microcomputer-based system for location-allocation called the PLACE suite. Goodchild (1984) also created a retail site selection system named ILACS.
This system incorporated the theory of spatial interaction, allowed for realistic barriers to interaction (such as rivers), and considered competition among different groups of facilities. Fotheringham (1988) presented a program named MARKET1 based on the multinomial logit model. It was designed to assess the marketing impact of opening or closing a store. Some maps were drawn to show the spatial pattern of store location and market share, but not in an interactive mode. Recently several researchers have suggested establishing a spatial decision support system (SDSS) by integrating model-based location-allocation with a geographic information system. An SDSS would provide a complete environment for location analysis, with detailed maps, attribute tables, text, statistical tools, and interactive algorithms. A general discussion of various aspects of SDSS may be found in Densham and Rushton (1988). So far only two contributions are found. The first was a knowledge-based SDSS developed by Armstrong et al. (1990). The system included three parts: (1) a knowledge base which elicits and stores different kinds of knowledge separately, facilitating quick data access and modification by users; (2) a problem-solving subsystem, which combines the user's knowledge with general techniques, strategies, and algorithms for dealing with a variety of location-allocation problems; and (3) a metaplanner, which interacts with the user to assist in problem formulation, knowledge search, and solution evaluation. This knowledge-based SDSS may be regarded as the best effort so far at integrating location-allocation modeling and GIS. The second contribution came from Kohsaka (1993), who constructed a retail SDSS called MLDSS based on his location and trade area delimitation model (Kohsaka, 1989, 1992). The system dealt with planar retail location problems and could draw three-dimensional trade area maps using IDRISI.
The details of system implementation were not discussed in the paper, but the author pointed out that the system was very slow due to heavy computing burdens. For example, it took ten minutes to compute the interpolation of a trade area at 50 m intervals on a 386-based computer. From the above review it is not hard to see that the integrative approach requires a thorough knowledge of both modeling and GIS. Most work to date has constructed SDSS from scratch. This is certainly time- and effort-consuming, and can hardly be done by one person. Further, a specialized SDSS may not be able to match the strong database management functionality existing in commercial GIS software.

2.4.4 Summary of GIS Supported Visual Interactive Modeling

There is no doubt that GIS supported interactive modeling has great potential in both theoretical research and real applications. In this section the approaches of interactive visualization, interactive optimization, and GIS-based interactive modeling were reviewed. Interactive visualization is highly problem- as well as system-dependent. Most research in this area focuses on the ways and effectiveness of display, without linking the modeling process to the cartographic output. Interactive optimization, on the other hand, has emphasized the design of interactive algorithms for user involvement during the location decision making process. Only in recent years has some commercial software started to link interactive optimization with GIS, and mostly for network-related routing and scheduling problems. Today the trend is toward an integrative approach that combines modeling, database management, and visualization into a coherent system. Although some attempts have been made, such integrative approaches are still in an early, experimental phase. Many issues, in particular how to efficiently link optimization and GIS, remain unclear.

2.5 Conclusion

In this chapter, a broad body of literature on the modeling aspects of location-spatial interaction and GIS is reviewed. This review encompasses both theoretical and applied research, and spans the disciplines of Geography, Management Science, Operations Research, and Regional Science. Location-spatial interaction (LSI) models represent a unique contribution by geographers to retail location analysis. Both model formulations and solution methods are reviewed in detail to provide insight into various aspects of LSI modeling. Despite many different versions of model formulation, most approaches have similar location objectives and spatial interaction constraints; the differences lie mainly in the variation of model assumptions and locational scenarios. Not much attention has been paid to the solution methods of LSI models. This gives rise to the need to improve the efficiency of heuristics for solving LSI models. There is a possibility that hybrid algorithms could be developed with strategies from both conventional heuristics and newly introduced AI-based heuristics. This will be discussed and demonstrated in the next chapter. The use of GIS in LSI modeling will certainly enhance the effectiveness of location decision making. Yet most work on linking location modeling and GIS is far from satisfactory. Therefore, it is necessary to explore the various issues associated with the design and implementation of the application system, as well as to examine the advantages and disadvantages of the integrative approach. This will be the focus of Chapter V.

CHAPTER III

MODEL DEVELOPMENT AND SOLUTION PROCEDURE

3.1 Introduction

In this chapter, an LSI model is devised and a heuristic algorithm is developed for solving it. The work creates a major building block for the construction of the GIS-based prototype system, which follows in Chapter IV. Model development is discussed in section 3.2, including model assumptions, model formulation, and model complexity. In section 3.3, a hybrid algorithm is presented based on the exploration of a variety of heuristic strategies. The algorithm consists of three phases: the initialization phase, the search phase, and the evaluation and improvement phase. Section 3.4 provides a summary and conclusion of the chapter.

3.2 Model Development

3.2.1 Model Assumptions

The decision problem of interest in this research is to search for the optimal locations of multiple retail outlets based on location-spatial interaction theory. As reviewed earlier, even within the LSI modeling framework, this is a "classic" problem that has been tackled by many researchers

from different perspectives. Therefore, it is necessary to lay out the model assumptions of this research before any mathematical formulations are devised. The following assumptions are held for the current model development and the later GIS implementation:

(1) Discrete location on the network. The optimal store locations are to be chosen among a set of known candidate sites on a network. Usually these sites are demand points represented by nodes; thus the number of nodes on the network largely determines the size of the location problem. This assumption is considered more realistic than the continuous space assumption, under which facilities may be located anywhere within the study region. After all, candidate sites in the real world are often quite restricted, and the solution space is better represented by a street network than by a continuous plane.

(2) Shortest path traveling with a distance-decay function. In many conventional location-allocation problems, straight-line distance is used to indicate the traveling cost between any origin i and destination j. In this research, since the solution space is a street network, the cost of traveling is the shortest path distance between i and j. If the speed limit is known for every street on the network, the accumulated time along the shortest path can be calculated to represent the cost of traveling. Further, a distance-impedance function is employed to capture the well known role of distance friction in spatial interaction. Following Wilson (1981), O'Kelly (1987), and many other researchers, a parameter β is used to define the rate at which distance friction impedes the traveling process.

(3) Inelastic demand based on nodal population. The retail shoppers are fixed at nodes on the given network. This assumption is supported by the argument that normally the total demand for services within a study area is independent of store locations (O'Kelly, 1987).
Further, it is relatively straightforward to transform the model to an elastic version (Sheppard, 1980). The determination of nodal population is a GIS operation and will be discussed in the next chapter.

(4) Spatial interaction based allocation rule. This is perhaps the most important assumption of the model. It adopts Huff's idea, reviewed earlier, as the probabilistic allocation rule for store location and interaction analysis. That is, the number of trips from place i to store j is positively related to the attractiveness of the store, and negatively related to the travel impedance. The location version of the equation differs a little from the usual gravity model in that a binary variable Y_j is used to signify the competing facilities. It can be derived from information minimization theory (see Wilson et al., 1981; O'Kelly, 1987), with either an exponential or a power function for distance decay:

    S_{ij} = O_i \frac{W_j Y_j \exp(-\beta C_{ij})}{\sum_k W_k Y_k \exp(-\beta C_{ik})}    (10)

or

    S_{ij} = O_i \frac{W_j Y_j C_{ij}^{-\beta}}{\sum_k W_k Y_k C_{ik}^{-\beta}}    (11)

where the variables are:
    S_ij  the number of trips (or amount of expenditure) from place i to store j;
    O_i   the proportion of demand (trips or amount of expenditure) originating at place i; O_i is normalized so that \sum_i O_i = 1;
    W_j   the attractiveness of store j, usually defined as j's size;
    C_ij  the cost of travel from i to j;
    Y_j   = 1 if a store is located at j; = 0 otherwise;
    β     a non-negative parameter indicating the degree of distance friction.
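The allocation rule is straightforward to compute once the set of open stores is fixed. The following is a minimal sketch of the exponential-decay form; the function name and data layout are illustrative assumptions, not part of the dissertation:

```python
import math

def huff_allocation(O, W, C, open_sites, beta):
    """Probabilistic (Huff-type) allocation with exponential distance decay.
    O[i]: proportion of demand originating at place i
    W[j]: attractiveness of candidate store j
    C[i][j]: shortest-path travel cost from i to j
    open_sites: the indices j for which Y_j = 1
    Returns S[(i, j)], the flow from place i to open store j."""
    S = {}
    for i, o_i in enumerate(O):
        # Denominator: gravity-type utilities summed over all open stores
        denom = sum(W[j] * math.exp(-beta * C[i][j]) for j in open_sites)
        for j in open_sites:
            S[(i, j)] = o_i * W[j] * math.exp(-beta * C[i][j]) / denom
    return S
```

By construction, the flows out of each origin i sum to O_i, which matches the demand-conservation requirement of the model.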

3.2.2 Model Formulation

Given the above assumptions and variable definitions, an LSI model for retail location can be written as follows:

    \min Z = \frac{1}{\beta} \sum_i \sum_j S_{ij} (\ln S_{ij} - 1) + \sum_i \sum_j S_{ij} C_{ij}    (12)

Subject to

    \sum_j S_{ij} Y_j - O_i = 0,    i = 1, \ldots, m    (13)

    \sum_j Y_j = P    (14)

    \sum_i S_{ij} \le Y_j,    j = 1, \ldots, n    (15)
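For a fixed set of open stores, the objective can be evaluated directly from the flow pattern. A minimal sketch (the helper name is hypothetical; it assumes the flows S_ij have already been computed by the allocation rule):

```python
import math

def lsi_objective(S, C, beta):
    """Z = (1/beta) * sum S_ij (ln S_ij - 1) + sum S_ij * C_ij.
    The first (entropy) term favors dispersed flow patterns; the second
    (cost) term favors short trips; beta arbitrates between the two."""
    entropy = sum(s * (math.log(s) - 1.0) for s in S.values() if s > 0.0)
    cost = sum(s * C[i][j] for (i, j), s in S.items())
    return entropy / beta + cost
```

For small beta the (negative) entropy term dominates the value; as beta grows, its weight 1/beta shrinks and the cost term takes over, consistent with the p-median limit discussed below.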

The objective function is a nonlinear convex function that consists of two parts. The first part is an entropy-based objective that seeks the most probable trip distribution pattern. The second part is a cost minimization objective that reduces the total trip cost as much as possible. Thus the objective function has the dual goals of seeking the most likely trip distribution pattern and minimizing the total trip cost. Note that the two goals are somewhat conflicting in nature. The entropy maximization goal is inclined to assign the demand of a place equally or proportionally to all the competing facilities, depending on the gravity-based utility of each facility. The cost minimization goal, on the other hand, tends to assign all the demand of a place to the facility with the lowest traveling cost. To reconcile the conflict, the non-negative parameter β is introduced into the objective function. If β is small, the traveling friction is small; as a result, the entropy maximization goal, weighted by 1/β, is favored in the objective function, and the model tends to assign the visiting probability of each demand node to each retail outlet based more on the store attractiveness W_j. As β increases, the distance friction increases, the entropy-based objective diminishes, and priority shifts to the cost minimization objective. As β approaches infinity, the first part of the objective approaches zero. With only the cost minimization goal, the model then becomes a traditional p-median location model (O'Kelly, 1987). Therefore, the p-median location problem can be viewed as a special case of the LSI model. Since β plays a critical role in the model, one may wish to know what an appropriate β value is for a given study area. By definition, β designates the degree of traveling friction, which in the real world could be affected by many different factors such as surface topography, traffic conditions, weather, and time.
This makes it very difficult to develop a universal formula for specifying the best β value. Our suggestion here is to compute the range of β values for a given area, and to determine good β values by simulation or calibration. Assuming C_min and C_max are the shortest and the longest traveling distances within the study area, one simple way to calculate the range of β values is as follows.

Suppose

    \exp(\beta_{max} C_{min}) = C_{max}

Then

    \beta_{max} = \frac{\ln C_{max}}{C_{min}}

Similarly, suppose

    \exp(\beta_{min} C_{max}) = C_{min}

Then

    \beta_{min} = \frac{\ln C_{min}}{C_{max}}

As a result, the range of β can be written as:

    \frac{\ln C_{min}}{C_{max}} \le \beta \le \frac{\ln C_{max}}{C_{min}}    (16)

Another important aspect of the objective function concerns the interpretation of the entropy function. Although the entropy objective is derived from information minimization theory, it has been found to bear interesting locational implications, too. As Erlander (1980) demonstrated, the entropy function can be viewed as a measure of dispersion from the statistical perspective. In the LSI model, to maximize entropy implies maximizing the dispersion of S_ij among the set of open retail outlets. Intuitively, S_ij reflects the volume of shopping flow from demand node i to facility j, so the phenomenon of more dispersed S_ij could be interpreted as consumers having more comparably appealing choices, or stores being more competitive in terms of their locational attractiveness. That is why in the LSI literature, maximizing entropy or the like is often referred to as maximizing consumer benefits or maximizing locational surplus. Finally, notice that S_ij is a probability value between 0 and 1, so the entropy function \sum_i \sum_j S_{ij} (\ln S_{ij} - 1) has a negative value. This often causes the final objective function to be negative, which differs from many conventional location models whose objective functions are usually positive. The model constraints are relatively straightforward. The first constraint states that for every origin i, its demand should equal the total trips from i to all the competing stores where Y_j = 1. The second constraint states that the total number of stores in the study area equals P, a known parameter indicating the number of existing plus planned retail facilities; this constraint also implies a budget restriction on current location planning. The third constraint ensures that trips out of demand places can end up at destinations only if those destinations contain retail outlets.

3.2.3 Model Complexity

It is useful to do some simple analysis of model complexity before moving to the development of solution procedures.
In terms of decision variables, clearly the binary variable Y_j is essential to the model. Suppose there are m demand nodes and n candidate nodes within the study area, with the location status of each candidate represented by its Y_j value; then there are a total of n binary variables. Once Y_j is known, it is not difficult to calculate all the S_ij values and the objective function. In terms of model constraints, it is easy to see there are m+n+1 constraints. Finally, like many other location problems, the total number of feasible location configurations is a function of n and p, which can be computed from the following equation:

    T(n, p) = \frac{n!}{p!(n-p)!}    (17)

This equation portrays an explosive growth in the number of possible solutions as the size of the problem increases. For example, with n = 300 and p = 10, there are about 1.398 x 10^18 alternative solutions. Therefore, seeking practical solutions to this sort of combinatorial problem has become a major challenge of current research.
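The growth of T(n, p) in equation (17) is easy to verify numerically; a small sketch (the function name is illustrative):

```python
import math

# Number of feasible configurations, equation (17): T(n, p) = n! / (p! (n - p)!)
def num_configurations(n, p):
    return math.comb(n, p)

print(f"{num_configurations(300, 10):.3e}")  # prints 1.398e+18
print(num_configurations(6, 2))              # prints 15 (the sample network: 2 of 6 nodes)
```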

3.3 Solution Strategies

The LSI model is similar to the traditional location models in that the binary variable Y_j plays a critical role in the modeling process. Once the status of Y_j is known for all the candidate sites, the rest of the operation becomes quite straightforward. This similarity makes it possible to borrow many of the heuristic methods reviewed earlier for dealing with the LSI model. However, these methods may not be very useful without careful and creative modifications. First, some effective strategies in the conventional heuristics are designed for solving p-median related problems with an all-or-nothing assignment rule. They may not be applicable to the LSI model, which has a probabilistic assignment rule. For example, several fast algorithms, including the allocation table methods (see Whitaker, 1983; Densham and Rushton, 1992), rely on the fact that during the vertex substitution process only some demand nodes are affected when new facilities are entered, so much saving in data storage and analysis can be achieved by attending only to the demand nodes that are affected. This fact, unfortunately, does not hold in the LSI model, in which the allocation probability S_ij must be recomputed whenever there is a change in the membership of the set of competing stores. Second, whereas traditional location algorithms are mostly run in batch mode and tend to regard computer resources as a minor constraint, the current LSI model will be run interactively in connection with a GIS, thereby forcing us to tackle the issue of practical compatibility. We believe that interactive user involvement is essential if the location process is to offer significant practical application of theoretically optimal solutions.
In this section a hybrid heuristic algorithm is developed to solve the LSI model. Efforts are made to combine good heuristic strategies from the literature while paying special attention to the structure and operational behavior of the LSI model. Since it is impossible to enumerate all the feasible solutions, the rationale of the algorithm is: (1) to start with a reasonably good feasible solution; (2) to examine as many promising solutions as possible without being trapped in a local optimum; (3) to obtain tightened lower bounds for the heuristic solution; and (4) to take advantage of visual interactive control and evaluation. The following sections are devoted to the detailed design and implementation of the algorithm, which includes three major phases: initialization, search, and evaluation. For illustration purposes, a small sample network was randomly generated, as shown in Figure 3. It consists of six nodes, with their shortest path distances (C_ij), levels of demand (O_i), and potential attractiveness (W_j) displayed in Table 4. Assume a firm wishes to locate two stores (p = 2) in the area and all six demand nodes are feasible sites. Since C_min = 4 and C_max =

Figure 3. The Sample Network

14, using equation (16), the distance decay parameter has an approximate range of 0.10 ≤ β ≤ 0.66. We will test the LSI model with a small β value (0.1), a middle β value (0.3), and a large β value (0.6).
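The range computation of equation (16) for the sample network can be checked in a few lines (the function name is illustrative):

```python
import math

def beta_range(c_min, c_max):
    # Equation (16): ln(C_min)/C_max <= beta <= ln(C_max)/C_min
    return math.log(c_min) / c_max, math.log(c_max) / c_min

lo, hi = beta_range(4, 14)  # sample network: C_min = 4, C_max = 14
print(round(lo, 2), round(hi, 2))  # prints 0.1 0.66
```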

Table 4. Data of the Sample Network

                    Travel Distance (C_ij)
Place      1     2     3     4     5     6    Demand(%)   Attractiveness
  1        0     4    10    11     6     4       17             2
  2        4     0     6     8     4     8       16             3
  3       10     6     0     4     9    14       15             4
  4       11     8     4     0     5    10       17             3
  5        6     4     9     5     0     5       18             2
  6        4     8    14    10     5     0       19             1

3.3.1 Initialization

The objective of the initialization phase is to obtain a good starting solution (i.e., a good upper bound) for the problem.³ Because it is always feasible to pick any combination of P facilities as a starting solution, many different strategies can be used to make the P choices. We will first examine and compare some of the most often used procedures, then make a judgement about which strategies are good for our LSI model.

3.3.1.1 The Random Start Approach

The simplest way to get a starting set of facilities is to randomly pick P candidate nodes. This approach is easy to implement in terms of programming effort, but often ends up with poor objective function values. For example, in our sample network, if we pick nodes one and two as a starting solution, the result is:

Table 5. The Initial Solution Using Random Start

Starting Solution (1,2):

   β       Entropy Function (Z1)   Average Trip Length (Z2)   Objective Function (Z1 + Z2)
  0.1           -44.508                     5.141                       -39.367
  0.3           -14.484                     4.601                        -9.883
  0.6            -6.877                     4.094                        -2.783
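The random start itself amounts to a uniform draw of P distinct candidates; a sketch (names are illustrative; a seed makes the draw reproducible):

```python
import random

def random_start(candidates, p, seed=None):
    """Random start: pick p distinct candidate nodes uniformly at random."""
    return sorted(random.Random(seed).sample(candidates, p))

print(random_start([1, 2, 3, 4, 5, 6], 2, seed=1))
```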

³Although a "too good" starting solution may create a local optimum trap for the search phase, we consider this an unusual case and assume that the closer to the optimal value, the better a starting solution is.

3.3.1.2 The Interactive Start Approach

This approach relies on the user's experience to get started. The user may choose any P desired candidate sites as an initial solution. Obviously this approach has many advantages over the random start approach. At the least, the human eye would not pick sites that are apparently unsuitable. The map layers in the GIS might also be tapped for clues about suitable initial solutions. If the user is very knowledgeable about the study area, he or she might be able to select site locations with great potential from the very beginning. Therefore, such an approach provides an opportunity to compare and contrast the decision making of human experience and of mathematical models. Nevertheless, we should be aware that the human eye is not always accurate, especially when the problem (P and N) gets large and when the user's information is limited. Let us designate Z1 and Z2 as the entropy function value and the average trip length, so that the objective function value equals Z1 + Z2. In the sample network, suppose one picks node two and node five as a starting facility set (since they look quite centrally located); the resulting objective function values, shown in Table 6, provide better starting solutions than those from the random start approach. Note there are other ways to get started interactively. For example, the Alternate Location Allocation (ALA) method reviewed earlier offers an excellent way of choosing locations

Table 6. The Initial Solution Using Interactive Start

Starting Solution (2,5):

   β       Entropy Function (Z1)   Average Trip Length (Z2)   Objective Function (Z1 + Z2)
  0.1           -44.533                     4.776                       -39.757
  0.3           -14.537                     4.307                       -10.230
  0.6            -6.914                     3.819                        -3.095

on a plane. The user can interactively divide the candidate points into P subgroups, and then allow the computer to choose the weighted centroid of each group as a member of the starting solution. But it seems difficult to apply such methods on a network, because graph partitioning is itself a complicated problem.

3.3.1.3 The Greedy Approach

The greedy approach, which includes the greedy add and greedy drop procedures, has been a popular starting strategy for location-allocation problems. Since the greedy drop procedure is computationally very expensive, and is not suitable for problems with a convex objective function, here we consider only the greedy add procedure. The greedy add algorithm consists of P iterations; at each iteration one facility location is added such that the decrease in the objective function is maximized. In our sample network, the process is as follows.

Step 1. Define P = 1 and calculate the objective function for every candidate site. The results are:

Table 7. The Result of Using Step One of Greedy Add

Y_j = 1    Entropy Function (Z1)    Average Trip Length (Z2)    Objective Function (Z1 + Z2)

Clearly when β = 0.1, node 3 performs the best among the six candidates; thus the first facility location is at node 3. Node two is the first choice when β = 0.3 or 0.6.

Step 2. Define P = 2, and repeat step 1 with node 3 as an existing facility for β = 0.1, and node 2 as an existing facility for β = 0.3 and 0.6. The result is:

Table 8. The Result of Using Step Two of Greedy Add

           Y_j = 1    Entropy Function (Z1)    Average Trip Length (Z2)    Objective Function (Z1 + Z2)
β = 0.1    3 1        -45.853                  5.677                       -40.176
           3 2        -47.714                  5.811                       -41.903
           3 4        -47.948                  6.871                       -41.077
           3 5        -45.688                  5.653                       -40.035
           3 6        -43.975                  6.065                       -37.910
β = 0.3    2 1        -14.484                  4.601                        -9.883
           2 3        -15.126                  4.594                       -10.532
           2 4        -14.807                  4.187                       -10.620
           2 5        -14.537                  4.307                       -10.230
           2 6        -13.514                  4.146                        -9.368
β = 0.6    2 1         -6.877                  4.094                        -2.783
           2 3         -7.030                  3.827                        -3.203
           2 4         -7.052                  3.780                        -3.344
           2 5         -6.914                  3.819                        -3.095
           2 6         -6.497                  3.762                        -2.735

As the above table indicates, the greedy add strategy picks (2,3) as the starting solution when β = 0.1, and (2,4) as the starting solution when β = 0.3 and 0.6. Note that in each case the initial solution does not have the shortest average trip length (Z2); it is the addition of the entropy function (Z1) that makes the difference. On the other hand, as β increases, the average trip length in the initial solution gets closer to the lowest trip length available. In other words, the distance minimization function contributes more to the objective function as β increases.
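The greedy add procedure above can be sketched as follows. The routine `objective(sites)` stands in for the LSI objective Z1 + Z2 and must be supplied by the caller; it is a hypothetical callback, not part of the original text.

```python
def greedy_add(candidates, P, objective):
    """Greedy add: starting from an empty set, perform P iterations;
    at each iteration add the candidate that gives the lowest (best)
    objective value for the enlarged facility set."""
    solution = []
    for _ in range(P):
        # evaluate every candidate not yet in the solution
        best = min((c for c in candidates if c not in solution),
                   key=lambda c: objective(solution + [c]))
        solution.append(best)
    return solution
```

With a toy objective such as `lambda s: -sum(s)`, the procedure simply picks the P largest-numbered candidates, which makes its behavior easy to verify.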

3.3.1.4 The voting algorithm

If the strategies discussed earlier are considered "top-down" or "global" approaches, then the voting algorithm is a "bottom-up" or "local" approach. The procedure is analogous to a familiar voting process. Suppose the demand nodes are endowed with certain voting powers according to their population. Besides voting for itself, each of them is required to vote for one (or P-1) facility location(s) among the candidate sites. The P candidate sites that receive the highest number of votes become the starting solution set. The critical part of a voting algorithm is how to design the voting rules. Generally the following issues are of major concern: a) Including itself, how many facility locations can each node vote for? b) How will the votes be allocated: in a deterministic fashion or a probabilistic fashion? c) Should each node vote for the candidate sites based on closeness, or based on a gravity type of utility? Preliminary study reveals that for the LSI problem, the performance of the voting procedure is closely related to the β value. When β is small, travel friction is of minor importance. The situation becomes similar to the two-dimensional ice cream vendor location problem, where central locations have accessibility advantages over peripheral locations. Thus the location pattern tends to be clustered to capture the dispersed shopping flows. When β is large, the LSI problem is more like a P-median location problem, in which the location pattern tends to be dispersed whereas shopping flows are more spatially concentrated. Such logical assertions have motivated the development of the following "voting rules".

Rule Number One

When β is very small, say less than or equal to β_low, the voting process can be reduced to a simple ranking process. Since traveling friction is of little concern, a "global" perspective becomes more effective.
The computer simply picks the top P nodes with the highest ranks in terms of the combination of attractiveness and centrality. Assuming i represents a demand node and j represents a candidate site, the procedure includes the following two steps:

Step 1. Compute the following centrality index for every candidate site j:
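The printed formula for this index did not reproduce legibly. One plausible form, consistent with the gravity utility of equation (20) and stated here as an assumption rather than the author's exact expression, is:

```latex
I_j = \sum_{i=1}^{m} O_i \, W_j \, e^{-\beta c_{ij}},
\qquad j = 1, 2, \dots, n
```

with the indices possibly rescaled to a common range before ranking.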

Step 2. Select the P sites with the highest index value as an initial solution. Using our sample network, the node centrality is:

Table 9. The Node Centrality Index of the Sample Network

Voting Rule One, β = 0.1

Node    Centrality Index
1       0.34
2       0.58
3       0.52
4       0.46
5       0.42
6       0.15

Therefore, nodes 2 and 3 become the initial solution. This is the same selection as reached by the greedy add strategy, but computationally much easier to obtain. In fact, given β ≤ 0.1, this is also the optimal solution for the sample problem.

Rule Number Two

When β is very large, say greater than or equal to β_high, voting becomes a good way of reflecting the local or regional perspective. The voting process is deterministic in that all the votes of each demand node go to only one candidate site. Driven by the consideration that the problem is close to a P-median case, each demand node votes for itself, plus the one nearest candidate site. The rationale here is to look for the regional centers that are able to capture a great deal of weighted nearest traveling distance, in addition to their own demand. The strategies are4:

4If it is handy, one can use the solution of the p-median problem as the starting location pattern.

Step 1. For every demand node i, find out which candidate site j is closest to it, and calculate the votes as the weighted travel cost: V_i = O_i c_ij. If there exists a tie in terms of the nearest site, pick the site with the higher combination of demand and attractiveness. In the sample problem, the result is:

Table 10. The Result of Applying Voting Rule Two to the Sample Network

Voting Rule Two, β = 0.6

Node    Vote for    Votes
1       6           68
2       5           64
3       4           60
4       3           68
5       2           72
6       1           76

Step 2. For each candidate site j, obtain the accessibility index A_j by adding all the votes it receives, plus its own demand:

A_j = Σ_i V_i + O_j,  ∀ j = 1, 2, ..., n    (19)

where the sum runs over the demand nodes i that voted for site j.

In our example, each candidate node receives exactly one vote; the final votes are:

Table 11. The Accessibility Index of the Sample Network

Voting Rule Two, β = 0.6

Node    Accessibility Index
1       93
2       88
3       83
4       62
5       82
6       87

Step 3. Choose the P highest indexed nodes as an initial solution. In this case nodes 1 and 2 become the initial solution.

Step 4. Check if there exists "unfair voting". If there is, replace the lower ranked node with the next highest ranked node and repeat the checking process. Unfair voting refers to the situation where two nodes in the initial solution vote for each other. In the example, since nodes 1 and 2 do not vote for each other, the initial solution remains the same. This happens to be the same starting solution as the one picked by the random start strategy. The objective function is -2.783, with an average trip length of 4.094, which does not appear as good as those reached by the interactive and greedy add strategies. However, notice that for β = 0.6, node 2 has been selected by all the strategies tested, whereas node 1, which is actually the most important regional center and will appear in the final optimal solution, is discovered only by the voting strategy.

Rule Number Three

When the β value is medium, say between β_low and β_high5, the voting process is probabilistic. Each demand node can vote for P facilities, and its votes (population) are allocated to each of the P facilities according to Huff's probabilistic model.
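Voting rule two can be sketched as follows. The sketch assumes, as in the sample network, that demand nodes and candidate sites coincide; `O` is the list of demands and `c` a symmetric cost matrix. The unfair-voting check of Step 4 is omitted for brevity, and all names are illustrative.

```python
def voting_rule_two(O, c, P):
    """Voting rule two (sketch): every demand node i casts
    V_i = O_i * c_ij votes for its nearest candidate j; a candidate's
    accessibility index is its received votes plus its own demand
    (eq. 19). The P highest-indexed nodes form the initial solution."""
    n = len(O)
    votes = [0.0] * n
    for i in range(n):
        # nearest candidate other than node i itself
        j = min((k for k in range(n) if k != i), key=lambda k: c[i][k])
        votes[j] += O[i] * c[i][j]          # weighted travel cost
    index = [votes[j] + O[j] for j in range(n)]
    return sorted(range(n), key=lambda j: -index[j])[:P]
```

On a small three-node instance the accessibility indices can be checked by hand against equation (19).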

5The range may be adjusted depending on individual cases.

Step 1. For each demand node, compute the gravity utility it may get from every candidate site (not including itself). The utility function is:

U_ij = W_j exp(-βc_ij),  ∀ i = 1, 2, ..., m; j = 1, 2, ..., n    (20)

Step 2. For each demand node, vote for the top P candidate sites with the highest utility values. If there is a tie between two candidates, choose the candidate with the higher level of demand, or split the flow if the demand is also equal. For example, in our case node 2 receives the highest utility from node 3 and the second highest from nodes 1 and 5; it will vote only for nodes 3 and 5, as node 5 has a slightly higher demand than node 1. Defining K_i as the set containing the P candidates voted for by node i, the votes allocation for each i can be written as:

V_ij = O_i U_ij / Σ_{k∈K_i} U_ik,  ∀ i = 1, 2, ..., m; j ∈ K_i    (21)

Step 3. For each candidate site, add all the probabilistic votes it receives, along with its self-vote (demand). The top P sites with the highest final votes then become the initial solution. Again using the sample network, the process is illustrated in Table 12 for β equal to 0.3. Since nodes 2 and 5 have the highest votes, they are the initial solution.

Table 12. The Voting Process of Rule Number Three

Votes cast by each voter to its two chosen candidates:
Voter 1: 12.44, 4.56;  Voter 2: 8.37, 7.63;  Voter 3: 5.32, 9.68;
Voter 4: 12.41, 4.59;  Voter 5: 10.34, 7.66;  Voter 6: 10.91, 8.09.

Self votes: 17.00, 15.00, 18.00, 19.00, 16.00, 17.00.
Total votes: 27.91, 35.78, 19.00, 44.10, 34.34, 42.87 (the two highest totals, 44.10 and 42.87, belong to nodes 2 and 5).

With β = 0.3, the objective function value is -10.230, and the average trip length is 4.307. Note that rule number three is computationally more costly than the other two rules. When the problem gets very large, it is possible to make rule number three easier to handle by requiring that each demand node vote for only one candidate site instead of P sites.
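Voting rule three can be sketched as follows, using the gravity utility of equation (20) and the Huff allocation of equation (21). As before, demand nodes and candidate sites are assumed to coincide, and all names are illustrative.

```python
import math

def voting_rule_three(O, W, c, beta, P):
    """Voting rule three (sketch): each demand node computes the
    gravity utility U_ij = W_j * exp(-beta * c_ij) of every other
    candidate (eq. 20), votes for its top-P sites, and splits its
    population among them in Huff proportions (eq. 21). Final votes
    include each candidate's self-vote (its own demand)."""
    n = len(O)
    votes = list(O)                       # self-votes (own demand)
    for i in range(n):
        U = {j: W[j] * math.exp(-beta * c[i][j])
             for j in range(n) if j != i}
        chosen = sorted(U, key=U.get, reverse=True)[:P]
        total = sum(U[j] for j in chosen)
        for j in chosen:
            votes[j] += O[i] * U[j] / total   # Huff allocation
    return sorted(range(n), key=lambda j: -votes[j])[:P]
```

Because each voter's allocated votes sum to its population O_i, the total votes across candidates always equal twice the total demand when P covers every voter's choices, which gives a quick consistency check.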

3.3.1.5 Comparison of initialization strategies

Table 13 summarizes the initial solutions reached by the five different approaches. When β = 0.1, both greedy add and voting rule one chose (2,3) as their starting solution, which in this case was also the optimal solution. The node set (2,5) selected by interactive start appeared to be quite good, and was better than the random start choice (1,2). When β = 0.3, greedy add continued to perform best by reaching the optimal solution (2,4). The voting strategy and interactive start selected a near optimal solution (2,5). Random start remained the least appealing strategy with its poor performance. When β = 0.6, greedy add picked (2,4), which was the closest upper bound. Interactive start ranked second with its choice of (2,5). The voting strategy and random start ranked third with the pick of (1,2). Since the sample problem was randomly designed for illustration purposes, it might be too small to reflect the capability of these strategies. This gives rise to the necessity of further experiments with large problems. Overall, some preliminary remarks can be made about the initialization strategies. First, the random start approach is the easiest method to use but often results in poor solutions. It should not be employed except as a strategy to diversify the starting pattern.

Table 13. Comparison of Initialization Strategies (m = n = 6, P = 2)

Strategy            β     Initial Solution   Objective Function   Average Trip Length   Comment
Random Start        0.1   (1,2)              -39.367              5.141                 easy to implement, result unpredictable
                    0.3   (1,2)               -9.883              4.601
                    0.6   (1,2)               -2.783              4.094
Interactive Start   0.1   (2,5)              -39.757              4.776                 easy to implement, result depending on user experience
                    0.3   (2,5)              -10.230              4.307
                    0.6   (2,5)               -3.095              3.819
Greedy Add          0.1   (2,3)              -41.903              5.811                 able to get good upper bounds, computationally expensive to implement
                    0.3   (2,4)              -10.620              4.187
                    0.6   (2,4)               -3.344              3.780
Voting Rule One     0.1   (2,3)              -41.903              5.811                 easy to implement, good result when β is small
Voting Rule Three   0.3   (2,5)              -10.230              4.307                 able to get good upper bounds when β is medium, computationally costly to implement
Voting Rule Two     0.6   (1,2)               -2.783              4.094                 easy to implement, good result when β is large

Second, the performance of the interactive start approach relies entirely on the user's ability. This method is preferred when the user has full knowledge of the study area and wants to take more control of the location process. With interactive modeling, this should always be an option for the user. On the other hand, when P gets larger, interactive start will be less efficient and more likely to involve errors. Third, the greedy add method has performed quite well in many studies (for example, Kuehn and Hamburger, 1963; Cornuejols et al., 1977; Whitaker, 1983). For LSI models, the greedy add approach is recommended when β is medium or large. Another possibility is to use "interactive greedy add" if P is not big (usually less than 10)6. That is, to give the user the ultimate control over the greedy add process. This should increase the effectiveness of the initialization process. Finally, it is the first time that a new procedure, the voting algorithm, is introduced to deal with the LSI model in a comprehensive fashion. It contains three voting rules that pay special attention to the nature of the LSI problem as the β value changes.
Specifically, the algorithm computes different indices to guide the voting direction and allocation based on the interplay between facility attractiveness, demand level, and travel cost. Rule number one targets the central

6This depends on the willingness of users to interactively add stores, one by one, using a cursor or a mouse.

locations when β is small. Rule number two searches for regional medians when β gets very large. And rule number three favors a balanced probabilistic voting approach for problems with medium β values. The strength of the voting algorithm also lies in its flexibility. That is, the voting rules can be modified and redesigned for different types of location problems. From the earlier analysis it appears that rules number one and two are easy to handle, but rule number three is computationally costly. However, considering that the voting procedure is based on a "local perspective" with each node behaving in the same way, there is a very good chance that parallel processing can be used for managing the voting procedure. This would overcome the computing bottleneck for large problems and make the algorithm more appealing and applicable.

3.3.2 Search

The search phase is the primary component of the heuristic. It starts with the initial solution and attempts to move to a reasonably good (if not optimal) solution. As reviewed in Chapter II, the vertex substitution method (VSM) has been the most widely adopted heuristic for accomplishing the search task. In this section, a modified version of the VSM incorporating tabu search and a hashing procedure is developed. We will first use the conventional VSM to solve the sample problem, then discuss how tabu search and the hashing strategy can help improve the performance of the VSM.

3.3.2.1 The vertex substitution method

The VSM starts with an initial feasible solution, then substitutes a facility not in the current solution for each one that is in the solution. The substitution that yields the greatest improvement is accepted. The process continues until no improvement can be made by such interchanges. Applying the

VSM to our sample problem (n = 6, m = 6, β = 0.6), the following two iterations can be generated:

Iteration One

1. Suppose the starting solution is nodes 1 and 2, as suggested by the voting strategy:
S* = (1, 2)  Z* = -2.783  NJ = {3, 4, 5, 6}  Z_old = -2.783
S* and Z* are symbols for the current location pattern and objective function. NJ is defined as the set containing the candidate sites that are not in the current solution. Z_old records the initial objective function.

2. Substitute node 3 for each node in the current set:
S1 = (3, 2)  Z1 = -3.203
S2 = (1, 3)  Z2 = -3.312
Z* = -2.783
Both substitutions would improve the current solution. Since Z2 < Z1, substitute node 3 for node 2: S* = (1, 3), Z* = -3.312.

3. Substitute node 4 for each node in the current set:
S1 = (4, 3)  Z1 = -1.616
S2 = (1, 4)  Z2 = -3.469
Z* = -3.312
Since Z2 < Z*, substitute node 4 for node 3: S* = (1, 4), Z* = -3.469.

4. Substitute node 5 for each node in the current set:
S1 = (5, 4)  Z1 = -3.008
S2 = (1, 5)  Z2 = -2.742
Z* = -3.469
Since Z1 > Z* and Z2 > Z*, do not make the exchange.

5. Substitute node 6 for each node in the current set:
S1 = (6, 4)  Z1 = -2.634
S2 = (1, 6)  Z2 = -1.042
Z* = -3.469
Since Z1 > Z* and Z2 > Z*, do not make the exchange.

6. Because all the nodes not in the current solution have been examined, terminate this iteration. Define the current best solution value: Z_old = -3.469.

Iteration Two

If Z_old equals Z*, stop. Otherwise, repeat iteration one using Z_old as the initial objective function. That is, with S* = (1, 4), Z* = -3.469, NJ = {3, 2, 5, 6}, Z_old = -3.469, go back to the substitution process. Notice that in the second run of iteration one, the evaluation process will encounter many patterns already examined before. As a matter of fact, in our example only (2, 4) is a new pattern in the second run; the rest of the substitutions, (3, 4), (1, 3), (1, 2), (5, 4), (1, 5), (6, 4), and (1, 6), all lead to repeated patterns. Later we will introduce hashing strategies to avoid such redundancies.
Since the objective function associated with (2, 4) is -3.344, there is no improvement over the current best solution. The heuristic stops and the set (1, 4) becomes the final solution. Although in this case (1, 4) is indeed the optimal solution, for large problems it is highly possible that the solution reached by the VSM is a local optimum. To minimize the chance of being trapped in a local optimum, it is necessary to introduce other strategies to guide the search process.
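The conventional VSM walked through above can be sketched as follows. `objective` is again a caller-supplied evaluation of a facility set (an assumption of this sketch), and the swap acceptance follows the worked example: for each outside node, the best position is evaluated and accepted only if it improves on the current value.

```python
def vertex_substitution(candidates, start, objective):
    """Conventional VSM (sketch): for each facility b not in the
    current solution, evaluate substituting b for every facility in
    the solution; accept the best such swap if it improves the
    objective. Repeat until no swap improves (objective minimized)."""
    current = list(start)
    z = objective(current)
    improved = True
    while improved:
        improved = False
        for b in candidates:
            if b in current:
                continue
            # best placement of b among the current facilities
            trials = [[b if x == a else x for x in current]
                      for a in current]
            best = min(trials, key=objective)
            if objective(best) < z:
                current, z = best, objective(best)
                improved = True
    return current, z
```

With `objective=sum`, starting from (4, 3) over candidates {1, 2, 3, 4}, the sketch converges to the set {1, 2}, mirroring the stepwise improvement of the worked example.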

3.3.2.2 Tabu search

The word "tabu" means a forbidden move. Tabu search is basically a constrained yet very flexible search procedure. It aims to avoid cycling as well as being trapped in a local optimum by forbidding certain moves (making them tabu), and by selecting remaining moves that are not on the tabu list based on "certain learning and unlearning rules" (Glover and Greenberg, 1989). When incorporated into the VSM, two major types of tabu procedure can be used: forward tabu and backward tabu.

1. Forward tabu

Forward tabu is designed to sharpen the search direction of the VSM. It can be viewed as a memory function that provides a "cut" for reducing the feasible region throughout the substitution process. As a result, the heuristic moves forward in a more efficient way by avoiding "bad swaps" that are recorded on the tabu list.

Definition. If the exchange of two nodes could result in the improvement of the objective function, then the node being switched out becomes a forward tabu.

Mathematically, suppose a and b are two nodes, with a in the current solution and b not: a ∈ S*, b ∈ NJ. Z* is the current solution value, and Z' is the new solution value with b ∈ S*, a ∈ NJ. If Z' < Z*, make the exchange, and node a becomes tabu. There exist some variations in the treatment of the forward tabu, so some further definitions need to be made. First, based on the scope of the tabu status, we may have targeting tabu and universal tabu.

a) targeting tabu. The tabu status of a node is targeted only toward the node that replaces it. This can be written as:

If node a is swapped out by node b, tabu(a) = b.

In other words, it is still possible for node a to come back to the current solution by replacing nodes other than b. If a node is replaced by different nodes at various times, only the most recent tabu is recognized. For instance, the following situation could occur: tabu(a) = b, tabu(b) = c, tabu(c) = a, tabu(a) = d. At this point the targeting tabu status of node a has changed to node d, so node a is now allowed to swap with node b provided such an exchange would improve the objective function. Notice that the cycling effect among nodes a, b, and c would not happen unless some other nodes in the current solution had been exchanged.

b) universal tabu. Once in tabu status, a node is not allowed to substitute for any node in the current solution. This can be written as:

If node a is swapped out by node b, tabu(a) = 1.

Second, according to the ease of entering tabu status, we can distinguish hard tabu and soft tabu.

c) hard tabu. A node enters tabu only when it is actually swapped out of the current solution.

d) soft tabu. A node can become tabu if it merely has the potential to be switched out of the current solution, even though the exchange does not happen. This occurs when a newly introduced candidate is so good that it could switch with more than one node in the current solution, yet only the substitution that yields the greatest improvement of the objective function is carried out. A soft tabu must be a targeting tabu; otherwise the cut is too strong.

Third, depending on how long a tabu status can last, we can have temporary tabu and permanent tabu.

e) temporary tabu. A length limit is imposed on the tabu list such that after a certain number of iterations, the first node on the tabu list is freed from its tabu status and allowed to return to the substitution process. In other words, the VSM will not touch a tabu node within those iterations.

f) permanent tabu. If no length limit is imposed on the tabu list, once a node is switched out and enters tabu, it will never again appear in the substitution process or be considered as a candidate site.

Given the above classifications, there are eight possible combinations of tabu, as shown by the decision tree in Figure 4. One must keep in mind that using forward tabu will not improve the solution quality; rather, its purpose is to enhance the effectiveness of the VSM during the second iteration. Computational experience indicates that the "hard tabu tree" is a more favorable choice than the "soft tabu tree".

Forward Tabu
    Hard
        Permanent: Targeting / Universal
        Temporary: Targeting / Universal
    Soft
        Permanent: Targeting / Universal
        Temporary: Targeting / Universal

Figure 4. The Tabu Decision Tree

This leaves four possible combinations as the commonly used tabu strategies. It is difficult to determine which one is superior to the others, since their effectiveness is largely problem dependent. Table 14 compares the degree of "cut" created by the hard tabu tree. A stronger cut implies that more feasible solutions will be ruled out from exchange consideration in the VSM.

Table 14. The Degree of Cut Provided by Different Tabu Strategies

Tabu Strategies    Targeting     Universal
Temporary          weak cut      strong cut
Permanent          strong cut    strongest cut

2. Backward tabu

Backward tabu is designed to force the continuation of the search process despite a lack of progress. It can help to avoid local optimum traps. As a result, the heuristic it is embedded in has more opportunities to arrive at the global optimum.

Definition. If the current solution has been reached by iteration one, then any node from the initial solution that has never been replaced becomes a backward tabu and is forced out of the current solution. The node replacing it can be chosen randomly or based on certain probabilities. The search then renews its substitution process.

To summarize, forward tabu is an intensification strategy that focuses the search on good solutions, whereas backward tabu is a diversification strategy that aims to jump out of local optimum traps. When used appropriately with the vertex substitution heuristic, tabu search can greatly enhance the capability of the VSM.
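One of the eight combinations above, the targeting, temporary, hard forward tabu, can be sketched as a small bookkeeping structure. The class and its interface are illustrative (not from the original); the VSM would call `record_swap` after each accepted exchange and consult `allowed` before proposing one.

```python
from collections import deque

class ForwardTabu:
    """Targeting, temporary, hard forward tabu (sketch): tabu(a) = b
    means node a may not re-enter by replacing node b. Only the most
    recent tabu per node is kept (targeting), and entries expire after
    `length` recorded swaps (temporary)."""
    def __init__(self, length):
        self.length = length
        self.queue = deque()      # nodes in order of entering tabu
        self.tabu = {}            # node a -> node b it may not replace

    def record_swap(self, a, b):
        """Node a was swapped out by node b: a becomes tabu toward b."""
        if a in self.tabu:
            self.queue.remove(a)  # keep only the most recent status
        self.tabu[a] = b
        self.queue.append(a)
        if len(self.queue) > self.length:
            freed = self.queue.popleft()   # temporary tabu expires
            del self.tabu[freed]

    def allowed(self, a, b):
        """May node a swap back in, replacing node b?"""
        return self.tabu.get(a) != b
```

A universal tabu would store a flag instead of a target node, and a permanent tabu would simply drop the length limit, which shows how the eight combinations share one mechanism.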

3.3.2.3 The hashing strategy

The hashing strategy is designed to avoid checking repeated location patterns during the vertex substitution process. It also enables the algorithm to start from different initial solutions without doing redundant evaluations. The fundamental idea of hashing is to use a simple table recording the addresses of a very large dynamic set, which in our case is the set of location patterns that have been examined. As a result of hashing, for a given element, the computer can quickly check whether it has a slot in the hash table (this is termed a sequential probe). If it does, it is already a member of the set; otherwise it is assigned a slot in the hash table. There are many ways to manage hashing. Figure 5 illustrates how the hashing strategy works in our algorithm. Each location solution is addressed by a key and a hash function. It is possible to have key collisions, that is, to allow many solutions to share the same key. However, it is prohibited for more than one solution to share both the same key and the same hash value. The key determines which

Key     Hash Function H(K)
K1      H(2)  H(3)  ...
K2      H(1)  H(4)  ...
...

Figure 5. The Way of Hashing

row a solution is stored in; the hash function determines which column it settles in. In addition, there is a dynamic array that records the hashing length of each row. It is used to help the sequential probe as well as the assignment of slots to newcomers.

Definition. Each location pattern can be uniquely represented by a key and a hash function: {k, h(k,i)}. The key of a solution equals the sum of its node numbers, and the hash function equals the sum of the squared node numbers.

Table 15 shows how the 15 possible solutions in our sample problem can be represented by a hash table. Notice that at the beginning the hash table is empty; it is gradually filled up as the search process goes on. For instance, given a solution (3,4), the hashing strategy is:

Step 1. Calculate the key and the hash value: k = 3 + 4 = 7, h(k) = 3^2 + 4^2 = 25.

Step 2. Go to row 7, then probe the columns sequentially: h(7,1), h(7,2), ..., h(7,length(7)). If the number 25 is encountered, (3,4) has been examined before; stop the hashing. Otherwise, go to step 3.

Step 3. Update the hashing length of this row and add the new hash value: length(7) = length(7) + 1, hash(7,length(7)) = 25.

Table 15. The Hash Table for the Sample Problem

Key k   h(k,1)       h(k,2)       h(k,3)       Hashing Length
1       --           --           --           length(1) = 0
2       --           --           --           length(2) = 0
3       5  (1,2)     --           --           length(3) = 1
4       10 (1,3)     --           --           length(4) = 1
5       17 (1,4)     13 (2,3)     --           length(5) = 2
6       26 (1,5)     20 (2,4)     --           length(6) = 2
7       37 (1,6)     29 (2,5)     25 (3,4)     length(7) = 3
8       40 (2,6)     34 (3,5)     --           length(8) = 2
9       41 (4,5)     --           --           length(9) = 1
10      52 (4,6)     --           --           length(10) = 1
11      61 (5,6)     --           --           length(11) = 1

Although hashing provides a fast way to retrieve and insert, maintaining an increasingly large hash table does require considerable effort and storage space. Thus it is advisable to employ hashing only when a) the computation of the objective function is very costly, and b) at the same time the chance of evaluating redundant location patterns is high.
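The three steps of the hashing strategy can be sketched directly from the definition (key = sum of node numbers, hash value = sum of squared node numbers); the class and method names are illustrative.

```python
def key_and_hash(pattern):
    """Key = sum of node numbers; hash value = sum of squared node
    numbers, as in the definition above."""
    return sum(pattern), sum(v * v for v in pattern)

class PatternTable:
    """Hash table of examined location patterns (sketch). Rows are
    indexed by the key; each row holds the hash values seen so far,
    and the row's length plays the role of the dynamic length array."""
    def __init__(self):
        self.rows = {}            # key -> list of hash values

    def seen_before(self, pattern):
        """Probe the pattern's row sequentially; record the pattern
        if it is new. Returns True if it was already examined."""
        k, h = key_and_hash(pattern)
        row = self.rows.setdefault(k, [])
        if h in row:              # sequential probe found it
            return True
        row.append(h)             # assign a new slot
        return False
```

For the example in the text, the pattern (3,4) maps to key 7 and hash value 25; a second probe for (3,4) finds 25 in row 7 and reports the pattern as already examined, while (2,5) shares key 7 but has hash value 29, so it gets its own slot.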

3.3.3 Evaluation and Improvement

At this stage we have presumably obtained a very good solution, if not an optimal one, to our LSI model. However, the following two questions remain unanswered: 1) How good is the solution reached by the search procedure? Is it optimal? If not, how close is it to the optimum? 2) Is the current solution practically acceptable? What if some assumptions or conditions are changed? In order to answer the first question, Lagrangian relaxation and subgradient search are employed to find the lower bound of the LSI problem. For the second question, interactive evaluation procedures are designed for performing various scenario analyses.

3.3.3.1 Lagrangian relaxation and subgradient search

Ever since it was introduced (Geoffrion, 1974), Lagrangian relaxation has been extensively applied to many location and trip distribution problems (see, for example, Geoffrion and McBride, 1978; Christofides and Beasley, 1983; O'Kelly, 1987; Galvao and Raggi, 1989; Holmberg and Jornsten, 1989; Cornuejols, Sridharan, and Thizy, 1991). Excellent surveys can be found in Bazaara and Goode (1979), Fisher (1981), Krarup and Pruzan (1983), and Cornuejols et al. (1989). The idea is to partition the constraints of a program into two sets. A relaxed problem is then constructed by removing (relaxing) one set of constraints, and by introducing a penalty function into the objective function for violation of the relaxed constraints. The aim is to make the relaxed problem easy (sometimes trivial) to handle for a specified set of Lagrangian multipliers. Then by solving the dual of the Lagrangian problem, a tightened bound can be obtained for the original problem. The degree to which this is successful depends on the problem structure as well as the values of the multipliers. And the most often used way to adjust and update the multipliers is subgradient search. Let's first look at our LSI model: Min

Z = (1/β) Σ_i Σ_j S_ij (ln(S_ij / W_j) - 1) + Σ_i Σ_j S_ij c_ij    (22)

Subject to

Σ_j S_ij = O_i,  i = 1, ..., m    (23)

Σ_j Y_j = P    (24)

S_ij ≤ O_i Y_j,  ∀ i, j    (25)

Following O'Kelly (1987), let u_i ≥ 0 and relax the first constraint; we get the Lagrangian:

L = (1/β) Σ_i Σ_j S_ij (ln(S_ij / W_j) - 1) + Σ_i Σ_j S_ij c_ij + Σ_i u_i (Σ_j S_ij - O_i)    (26)

This is the same as:

L = (1/β) Σ_i Σ_j S_ij (ln(S_ij / W_j) - 1 + βc_ij + βu_i) - Σ_i u_i O_i    (27)

By setting dL/dS_ij = 0, we get:

dL/dS_ij = (1/β)(1 + ln(S_ij / W_j) - 1 + βc_ij + βu_i) = 0    (28)

Therefore:

S_ij = W_j exp[-β(c_ij + u_i)]    (29)

Now substituting this back to L:

L = (1/β) Σ_i Σ_j S_ij (-1) - Σ_i u_i O_i = -(1/β) Σ_i Σ_j S_ij - Σ_i u_i O_i    (30)

Clearly the second part of L is always less than or equal to zero. Thus it becomes trivial to solve the Lagrangian problem. We can select the P smallest values of -(1/β) Σ_i S_ij, which is the same as choosing the P largest values of

Σ_i W_j exp[-β(c_ij + u_i)].

Given the above Lagrangian problem, for a specified set of u_i we can always obtain its trivial solution as a lower bound for the original problem. However, in order to find the tightest lower bound, we want the trivial solution to be as large as possible. In other words, we need to solve the following Lagrangian dual: Max L_D = min L, i.e.

Max Min L = -(1/β) Σ_i Σ_j S_ij - Σ_i u_i O_i    (31)

Subject to

u_i ≥ 0,  i = 1, ..., m    (32)

S_ij = W_j exp[-β(c_ij + u_i)],  ∀ i, j    (33)

Now let's turn to the subgradient method, which can help to adjust and update the vector u_i such that L_D is pushed toward the maximum. Define:

k: the current iteration;
μ: a non-negative parameter decreasing over iterations;
Z_UB: the upper bound arrived at by the VSM;
Z_D(u^k): the lower bound of the current iteration;
f_i(k): the subgradient for u_i at the current iteration, f_i(k) = O_i - Σ_j S_ij.

The general procedures of subgradient search are:

Step 1. Compute the initial Lagrangian multipliers u_i. First, with the primal solution from the VSM, we can get the S_ij values for every demand node i (to avoid confusion, let's denote them as P_ij):

P_ij = O_i W_j exp(-βc_ij) / Σ_k W_k exp(-βc_ik)    (34)

Recall that in the Lagrangian dual, S_ij = W_j exp[-β(c_ij + u_i)]. If the current primal solution is optimal, then P_ij = S_ij. Thus, from P_ij = W_j exp[-β(c_ij + u_i)], the initial vector u_i can be calculated as:

u_i = -(1/β) ln(P_ij / W_j) - c_ij    (35)

Step 2. Solve L for the given vector {u_i}; the result Z_D(u^k) gives the current lower bound. Update the lower bound if necessary. Also compute the subgradient f_i(k) for the current iteration.

Step 3. Terminate the subgradient search if one of the following three cases occurs; otherwise go to step 4:

a) all subgradients are zero: f_i(k) = 0. This indicates that given the current vector u_i, the relaxed constraint is not violated by the Lagrangian solution. The result of the Lagrangian dual has converged to the result of the original problem. Thus Z_UB is proved to be the optimal solution value of the LSI problem.

b) the duality gap is small enough, as set by the small number λ: (Z_UB - Z_D(u^k)) / Z_UB × 100 < λ. For instance, if λ = 5, the current upper bound Z_UB deviates less than 5% from the optimal solution.

c) a predefined number of iterations has been reached: k > K.

Step 4. Update the Lagrangian multipliers and go to step 2. The equations are:

u_i(k+1) = u_i(k) - τ_k f_i(k)    (36)

where

τ_k = μ[Z_UB - Z_D(u^k)]    (37)

τ_k is the so-called step size, which controls the pace of movement from u_i(k) to u_i(k+1). It usually decreases as k increases, since fewer corrections need to be made as the subgradient search proceeds. f_i(k) is the subgradient that controls the moving direction of u_i. When f_i(k) > 0, u_i(k+1) decreases, so u_i(k+1) ≤ u_i(k) and Σ_j S_ij would increase. Conversely, if f_i(k) < 0, u_i(k+1) increases, so u_i(k+1) ≥ u_i(k) and Σ_j S_ij would decrease. If f_i(k) = 0, no adjustment is needed (see O'Kelly, 1987).

3.3.3.2 Interactive evaluation

Aside from the Lagrangian relaxation and subgradient search, interactive procedures are designed for users to perform post-optimization analysis. This provides an opportunity for more experiments and comparisons, as well as for entailing elements not considered by the LSI model. It gives the user the power to control the modeling process and the final decision making. The following operations can be performed interactively:

1. To change model input, including the total number of facilities (i.e., the P value), demand, supply, the distance decay function, and the β value. This is helpful for experimenting with different assumptions and scenarios for the study area.

2. To regulate model operation by specifying a new set of required and prohibited sites, and by selecting different initialization strategies. This is a good way of simulating retail competition as well as testing initialization strategies.

3. To modify model output by adding new facilities, or by removing or replacing facilities chosen by the model. This is particularly useful when the model has selected sites that are practically unacceptable due to physical conditions, zoning regulations, or other social, political and administrative concerns.

4. To compare the result of the LSI problem with that of the p-median location problem.
Since the p-median problem is the most popular model in the location literature, it can provide a benchmark for the LSI model in terms of running time, solution quality, and sensitivity to G. For example, it would be interesting to see how large the β value must be before the LSI solution coincides with the p-median solution. All these interactive procedures are not only part of the LSI modeling process, but also closely associated with the geographic information system. To a large extent, the role of user involvement is determined by the user interface in GIS. Therefore, more detailed information on the functionality and implementation of these interactive procedures is given in the next chapter.
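As a toy illustration of such a benchmark, the p-median objective (total demand-weighted distance from each demand node to its nearest open facility) can be evaluated for any candidate solution. The function and data layout below are assumptions for illustration, not part of the dissertation's code.

```python
def p_median_cost(demand, dist, open_sites):
    """demand: dict i -> O_i; dist: dict (i, j) -> C_ij;
    open_sites: list of open facility sites j.
    Returns the total demand-weighted distance to nearest open sites."""
    return sum(o_i * min(dist[i, j] for j in open_sites)
               for i, o_i in demand.items())
```

When β is large enough, the LSI solution should score (nearly) the same under this objective as the p-median solution, which is one way to operationalize the comparison described above.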

3.4 Summary and Conclusion

Following the work of O'Kelly (1987), a specific LSI model is formulated and solution strategies are developed in this chapter. The model aims to locate P retail stores with the most likely gravity-based spatial interaction pattern as well as with the minimum total traveling cost. The balance between the two goals is maintained by the distance decay parameter β. When β is small, the first goal dominates the model objective; as β gets larger, the model increasingly focuses on minimizing the traveling cost.

The heuristic algorithm presents a variety of strategies that can be used to tackle the LSI model. In general, three phases are employed sequentially. The initialization phase looks for a reasonably good starting solution for the LSI problem. Besides some conventional tactics, a so-called voting algorithm is introduced for initialization purposes. It consists of three voting rules designed to capture the interplay between facility attraction and travel friction as β changes. This is followed by the search phase, which aims to obtain a satisfactory (if not optimal) solution. The search adopts a modified version of the vertex substitution method in which a tabu search and a hashing strategy are introduced. By designating bad swaps as forbidden moves, forward tabu search intensifies the search process. Conversely, by temporarily designating good swaps as forbidden moves, backward tabu search diversifies the search process and enables the VSM to escape local optimum traps. The hashing procedure records all the location patterns that have been examined and helps the VSM avoid redundant swaps and computations. Finally, in the evaluation and improvement phase, the lower bound of the LSI problem is generated through Lagrangian relaxation and subgradient search.
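The hashing of examined location patterns mentioned above can be sketched with a hashed-set cache; a Python set of frozensets is used here as a stand-in for the dissertation's hashing strategy, so the details are assumptions.

```python
def make_pattern_cache():
    """Records every location pattern the search has examined; the VSM can
    then skip swaps that would merely revisit an old pattern."""
    seen = set()

    def is_new(pattern):
        key = frozenset(pattern)   # a pattern is a set of sites: order-free
        if key in seen:
            return False
        seen.add(key)
        return True

    return is_new
```

Because the key is order-free, a pattern reached again through a different sequence of swaps is still recognized as already examined.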
Moreover, users are allowed to interactively manipulate various aspects of the LSI modeling in order to test model capability and to support the final location decision making. The next chapter reports how the LSI model and the heuristic are linked to a geographic information system. This includes discussions of the ways of constructing such an integrated system, the design of GIS-based retail analysis in general, and the detailed implementation of the prototype GIS for location and spatial interaction modeling.

CHAPTER IV

DESIGN AND IMPLEMENTATION OF THE GIS PROTOTYPE

4.1 Introduction

In this chapter, the design and implementation of the GIS-based prototype system are presented. The primary goal of this system is to combine a geographic information system with the LSI model and solution procedure developed in Chapter III. It is hoped that such an integration will realize the dual advantages of a strong database management system and an effective interactive modeling capability.

The first question we have to answer is: how do we want to construct the prototype system? Generally there are three ways of integrating a spatial analysis model with GIS (Shaw, 1993). The first is to create powerful modeling routines and commands within a GIS software package. This is done mainly within private GIS companies. For instance, Arc/Info 6.2 will have some commands dealing with location analysis. The result is a unified database management and modeling environment. Ideally, this is the kind of system that people would like to have for their GIS applications. Nevertheless, due to the limitations of data structures as well as marketing concerns, the modeling functionalities within a GIS software package are often

generalized and may not be able to deal with the variety of special applications of interest to users. Currently, no GIS software is equipped with functionality that can be directly applied to handling LSI models.

The second way is to build an SDSS from scratch, that is, to develop a specific modeling system with certain database management and mapping capabilities. The work by Armstrong et al. (1990), Spencer III et al. (1990), and Domich et al. (1991) represents typical examples of this approach. The resulting spatial decision support system usually has strong modeling capabilities that meet the special needs of users. It also requires less time to learn compared to a complicated GIS software package. However, since the SDSS emphasizes modeling, its data representation may not be suitable even for routine database management functions, such as spatial query, retrieval, and overlay. The lack of sufficient database management and graphics functionality has thus become the major disadvantage of this approach. In addition, constructing a decent SDSS often requires significant and distinct expertise and effort; it is very difficult for a single person to carry out the entire workload.

The third approach is to develop a user interface to link a GIS software package with user-developed modeling systems. The GIS software and the users' programs operate on different data structures, so data have to be transferred between them, usually through ASCII files.
This strategy is often favored by individual users because 1) it leads to a GIS modeling framework that meets the deep and narrow needs of user applications while maintaining the broad and general GIS database management functions; 2) in most cases the design and implementation procedures involve less time and effort compared to the other two approaches mentioned above; and 3) the resulting application system is easy to learn for both experienced and inexperienced users, regardless of their familiarity with GIS and spatial analytical modeling. Yet on the negative side, with this strategy two separate data systems must be maintained, which creates data redundancy and potential data inconsistency (Shaw, 1993). Data transformation may also considerably slow down the operating speed.

Overall, the third approach appears more suitable for this research from a cost-benefit perspective. With this resolved, the remaining part of the chapter is organized as follows. Section 4.2 presents system design strategies. This is followed by a report on system implementation in section 4.3. Finally, a summary and conclusion of the chapter is given in section 4.4.

4.2 System Design

The design of the prototype system is carried out in light of retail location analysis. It is driven by three major considerations: functionality, interactivity, and flexibility. In other words, the system should be designed in such a way that a) it meets the functional requirements of retail applications, in particular supporting retail location analysis at various levels; b) it has a user-friendly environment that facilitates human-computer interaction; c) it is sufficiently flexible to accommodate further modifications and refinements. The design methodology follows the concepts and strategies of structured analysis (Yourdon, 1989). Specifically, data flow diagrams (DFD) are used as a modeling tool to depict the functional dynamics of the prototype system. The design process centers on three issues: data requirement, functional requirement, and user interface.

4.2.1 Data Requirement

The first step in the design of the prototype involves choosing and organizing different types of data based on system objectives and user analysis. Since the system mainly serves as a spatial analytical tool for retail location analysis, it is naturally in our best interest to obtain all the valuable data that can fulfill the operational requirements without compromise. From a theoretical perspective, four major types of data can be identified for carrying out high quality retail location analysis. Aside from the temporal dimension, each type of data contains a variety of spatial and aspatial attributes.

1. Demographic Data

Demographic data are associated with the population within the study area (Block Group, Zip Code, Census Tract, Township, City, County, MSA, State, Country). They can be further classified according to the kind of aspatial attributes attached:

a) natural profile: number, age structure, sex ratio.
b) cultural profile: race, ancestry, language, religion.
c) economic profile: employment, occupation, income, education, poverty.
d) household profile: owner cost, renter cost, house type, family type, etc.
e) consumer profile: life style, shopping expenditure, brand preference, etc.
f) traveling profile: number of vehicles, travel time to work, major travel routes.

2. Retail Data

Retail data are usually either store based or product based (Beaumont, 1989). Here store based data are preferred. They provide information about types of services, store characteristics, operation strategies, current competition, cost, and profitability. The data will vary with different retail sectors. The following seven segments are considered to offer great potential for GIS-based analysis:

a) restaurants, especially fast food chains
b) grocery stores and supermarkets
c) mass merchandisers
d) discounters
e) department stores
f) convenience stores
g) mail-order businesses

3. Real Estate Data

Data on real estate properties are crucial in determining the sites of retail outlets. The major information items are the location of sites, building conditions, available space (including parking space), leasing cost, etc. In addition, site images and zoning regulations can also be considered part of the real estate data.

4. Network Data

Road networks provide the linkages between consumers and retailers. A good retail location should be highly accessible to potential shoppers and consumers. Thus the data on road condition, structure, speed limit, congestion level, etc. are among the major elements for retail accessibility analysis.

The above overview provides useful insights into the vital data elements in retail location analysis. The data requirement is set in a general and flexible way, considering that in reality it is nearly impossible to acquire all those data for a given region at a given time. Even when the data are available, they may be quite expensive, or they may not be in a compatible data format or at an appropriate spatial-temporal level. Overcoming such pitfalls and problems is a challenging task at the implementation stage.

4.2.2 Functional Requirement

The functional requirement specifies what types of functions the prototype should be able to perform. Assume the primary users of the prototype are retail consultants, researchers, and store managers. The following general functional requirements can be perceived.

1. Visualization Function

The GIS prototype should allow users to interactively navigate the relational database as well as the model results. Users can derive and view any existing objects in a well designed format, such as 2-D and 3-D maps, aerial photographs, and even satellite pictures.

2. Query Function

The query function provides users instant access to spatial objects and their attributes. The query process is often combined with the visualization function. There are basically three types of queries:

a) spatial query: users can use a pointing device to define an area or place on the map, and obtain all the relevant information associated with it.
b) logical query: users may specify certain characteristics they are interested in, and the computer will display all the objects with such features.
c) statistical query: users are allowed to perform simple statistical analysis on the database, such as max, min, sum, mean, etc.

3. Modeling Function

The modeling function is the result of implementing retail analysis models in GIS. These models provide insightful perspectives on how to examine, optimize, and simulate various aspects of retail services. When combined with GIS, they become more effective due to human-computer interaction and scientific visualization. The following is a list of useful retail modeling techniques:

a) LSI modeling
b) traffic analysis
c) site analysis
d) scenario planning

4. Report Function

The system should permit users to interactively generate data files and map compositions for reporting or data exchange purposes.

5. Other Functions

Besides the above major functional requirements, the prototype should offer users many basic functions whenever necessary. For instance, HELP, CANCEL, KEYBOARD, ZOOM IN and ZOOM OUT, SCREEN SNAPSHOT, etc. should be available to users at most times. Some network editing functions might be needed for modeling and simulation purposes.

4.2.3 User Interface Design

The user interface is a collection of menus through which users communicate with the prototype system. It interconnects with users, the database, and all the functional elements of the system. The general requirements for designing the user interface are:

1. The menus should be structurally organized and logically presented to mirror the data properties and functional objectives of the system.
2. The menus should be displayed in graphical windows, visually clear and friendly.
3. Certain standards in screen layout and menu choice should be maintained throughout the design process.
4. Mouse and cursor are the major supporting devices to encourage interactivity. Keystroke use should be kept to a minimum but remain an option for users.

Keeping these requirements in mind, a data flow diagram (DFD) shown in Figure 6 is constructed to help the design of the user interface. The three bubbles refer to the three basic functional modules within the user interface. Information leads to a set of data browsing functions. Modeling launches a series of location-spatial interaction analyses. Tools offers simple statistical analysis and report generating functions. When a bubble is selected, it guides users to perform the functions belonging to the corresponding unit.

Figure 6. The DFD of the Prototype Interface

The three bubbles are relatively independent from each other yet interact heavily with the external entities (users, database, and programming languages) represented by boxes. Only through user involvement can they affect each other. For example, if a user modifies the network structure during the data browsing process, the input for modeling and the output for reporting will later be affected.

Figure 7 is a DFD for the information module. There are four basic functions (bubbles) within the module: query, edit, overlay, and display. They are designed for database browsing purposes. The external entity that the bubbles deal with is mainly the relational database. Users can employ these functions to interactively navigate the four types of data as well as their attributes. Besides its unique functions, each bubble also shares many common database management functions with the other bubbles, such as SELECT, IDENTIFY, LIST, CLASSIFY, SAVE, CANCEL, etc.

The modeling module invites users to perform retail analyses that are mainly developed through mathematical programming techniques. As shown in Figure 8, the DFD contains four bubbles: LSI modeling, traffic analysis, site analysis, and scenario planning. Each bubble involves complex external entities, including users, the database, GIS functionalities, operating system functionalities, and different programming languages. Table 16 lists the major contents within each bubble.

Figure 7. The DFD of the Information Module

Figure 8. The DFD of the Modeling Module

Table 16. Functional Process within the Modeling Module

Bubble             Objective                        Primary Considerations
LSI Modeling       perform site selection and       available sites; population;
                   trade area mapping               traveling cost; current
                                                    competition
Traffic Analysis   analyze the accessibility        network structure; road
                   of the retail environment        conditions; major barriers;
                                                    traffic flow pattern and
                                                    congestion level
Site Analysis      investigate conditions of        available space; rent; parking
                   individual sites                 conditions; proximity to
                                                    competitors; complementarity
                                                    of neighboring stores; zoning
                                                    regulations
Scenario Planning  integrate the above analyses     population change; store
                   to answer a set of "what if"     relocation; product promotion;
                   questions on the dynamics of     sales forecasting; traffic
                   the retail process               simulation

Figure 9 depicts the DFD of the tools module, which aims to provide functional tools for generating output. Bubble 1 allows users to interactively perform simple statistical analysis and measurement, for example, calculating the total population within certain areas or measuring the size of a polygon. Bubble 2 opens an interactive mapping session for users to create their own maps. Bubble 3 starts a file management function that helps users obtain selected data files in the format they desire. Finally, upon selection, bubble 4 produces plot files and invokes the plotter for hard copy maps.

Figure 9. The DFD of the Tools Module

4.3 System Implementation

The retail analysis prototype system is implemented using Arc/Info 6.1 on a Sparc 2 workstation. This is the best software and hardware combination available to the current project. Arc/Info 6.1 represents perhaps the most advanced GIS software on the current market. It not only provides highly sophisticated spatial analytical functions for automated data handling and interactive mapping, but also allows users to create customized interfaces for their own applications. The implementation is carried out with the Arc Macro Language (AML) and FORTRAN 77. The resulting system is called GRALSIM, which stands for GIS-based Retail Analysis with Location-Spatial Interaction Modeling. So far a relational database with 2 Arc/Info coverages and 4 Info files has been created. There are 47 AML programs, 31 menus, more than 45 icons, and 3 FORTRAN programs. The remainder of this section reports the technical procedures of the implementation.

4.3.1 Database Construction

The development of an application database is always a challenging task (Berry and Maclean, 1989). It is almost impossible to obtain all the spatial and aspatial data designed earlier. For testing purposes, the population and street network data for the city of Redlands, California are adopted. However, the retail data are hard to find except for some locations of current shopping centers and major stores, and no real estate data for the area are available. Although it is still possible to run the LSI model, the problem of data shortage has largely limited the ability of the prototype to perform more comprehensive and in-depth retail analysis.

The Arc/Info coverages and attribute tables of population and street network are derived from census tract data and TIGER files. The process involves many complicated technical issues. The general procedures are summarized as follows (see Liu, 1991).

Step 1. Get the county coverage where the city is located from the TIGER file. In this case the Arc command TIGERARC is used to generate a coverage of San Bernardino county, which contains the city of Redlands.

Step 2. Get the city coverage from the county coverage using the Arc command CLIP or CREATE. In order to preserve census blocks from the TIGER file, it is suggested to determine the city boundaries with the census block lines that best match the true city boundaries. Otherwise, if the true city boundaries are used, they might cut through census blocks.

Step 3. Create thematic coverages of the city. This is done by first selecting the desired county coverage attribute records (population, roads, rivers, etc.) that fall within the city boundaries. These extracted attribute data may then be saved in ASCII files and added into the INFO files of the corresponding city coverages. The major Arc commands used for these operations are TABLES, SELECT, UNLOAD, and ADD.
In the case of Redlands, a population coverage with a polygon attribute table (PAT) and a street network coverage with an arc attribute table (AAT) are produced.

Step 4. Create the approximate "nodal population" of the city. This is necessary because the LSI model requires the population to reside on the nodes of the street network, but the census tract population is associated with polygons. The strategy employed here is first to generate a point coverage on the basis of the centroid of each polygon, and assign the population of each polygon to the corresponding point. The major commands used are CENTROIDLABEL in Arc and SELECT and PUT in Arcedit. Next, the point coverage is snapped to the network coverage such that each point population is assigned to the nearest node. This is accomplished with the Arc command POINTNODE. As a result, a node attribute table (NAT) is created with nodal population (Figure 10).

Table 17 summarizes the Arc/Info database built for the prototype system. There are three principal info files: one polygon attribute table (PAT), one arc attribute table (AAT), and one node attribute table (NAT). Each contains a series of topological and socio-economic attributes.

Figure 10. The Procedure of Creating Nodal Population

Table 17. The Coverages and Info Files for the City of Redlands

Coverages    Info Files       Major Info Items (Attributes)
Population   population.pat   area, perimeter, population#, population-id,
                              county, tract, total, white, black, hispanic,
                              amerind, asian, other, total18+
Street       street.aat       street#, street-id, fnode#, tnode#, lpoly#,
                              rpoly#, length, street-code, address,
                              street.name, street.type, walk-imped,
                              ft-drive-imped, tf-drive-imped, direction,
                              county, city, boundary-flag, county-flag,
                              city-flag, tract-flag, zip-flag
             street.nat       node#, node-id, x-coord, y-coord, total, white,
                              black, hispanic, amerind, asian, other,
                              total18+, supply, candidacy-flag

Topological structures are maintained by assigning unique ID numbers and locational flags to all the nodes, arcs, and polygons. Since the population data are transformed from polygons to nodes, the population.pat and street.nat share the same demographic items. The street.aat describes the arc features of the network. Note that all info items and records can be deleted, added, and modified according to users' needs. Once changes are made within an info table, all the relevant information in the other info files is also updated. In addition, some temporary info files may be created in the process of interactive operation, for instance, files needed for recording user input, selections, and location-spatial interaction output.
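The centroid-to-node transfer of Step 4 can be sketched as a nearest-node assignment. This is an illustrative stand-in for what CENTROIDLABEL and POINTNODE accomplish in Arc/Info, with hypothetical data structures; the sample values echo the 280/500/150 populations shown in Figure 10.

```python
import math

def assign_nodal_population(centroids, nodes):
    """centroids: list of (x, y, population) polygon centroids;
    nodes: dict node_id -> (x, y) network nodes.
    Adds each centroid's population to its nearest network node."""
    pop = {nid: 0 for nid in nodes}
    for cx, cy, p in centroids:
        # find the network node closest to this polygon centroid
        nearest = min(nodes, key=lambda nid: math.hypot(nodes[nid][0] - cx,
                                                        nodes[nid][1] - cy))
        pop[nearest] += p
    return pop
```

Two centroids snapping to the same node simply accumulate, which matches the intent of producing one nodal population value per node.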

4.3.2 The Menu Structure

The prototype user interface is built with numerous well organized form menus. Pop-up windows are also used whenever necessary. Based on the structured interface design, these menus can be divided into three hierarchical levels, as illustrated in Table 18.

4.3.2.1 Level one menu

The main menu is the only level one menu. It appears on the screen when the prototype system is started (Figure 11).

Table 18. The Menu Structure of GRALSIM

Level One   Level Two          Level Three
main.menu   information.menu   pop_browser.menu
                               retail_browser.menu
                               real_estate.menu
                               road_browser.menu
                               help.menu
            modeling.menu      site_analysis.menu
                               loc_alloc.menu
                               traffic.menu
                               planning.menu
                               help.menu
            tools.menu         map_making.menu
                               stat_analysis.menu
                               report.menu
                               plot.menu
                               help.menu

There are six functional buttons in the main menu:

a. database: invokes the information.menu.
b. modeling: invokes the modeling.menu.
c. tools: invokes the tools.menu.
d. help: describes the usage of each button in the main menu.
e. keyboard: temporarily leaves the main menu and returns to the keyboard control mode. Users need to type the AML directive &return in order to come back to the main menu.
f. quit: quits the prototype.

4.3.2.2 Level two menu

There are three level two menus, which are called by the main.menu. At this stage the menus are still not directly connected with AML programs. The selection of a button within a level two menu will a) lead users to the next level of menus; or b) go back to the main menu if the button "Main Menu" is clicked; or c) hide the menu if "Hide" is chosen. The three menus are described as follows, along with figures showing exactly how they appear on the screen.

1. information.menu (Figure 12) systematically presents users the database of the prototype. At this time the population and network browsers are implemented. The retail browser and real estate browser are provided (but unused) as hooks for systems where retail and real estate data are available.

Figure 12. The Information Menu

Figure 13. The Modeling Menu

2. modeling.menu (Figure 13) allows users to perform retail modeling. The location-allocation analysis is fully implemented. Only the shortest path and traveling salesman functions are implemented under traffic analysis. Site analysis and scenario planning are not implemented.

3. tools.menu (Figure 14) helps users to produce graphics and text output. This has not been implemented.

4.3.2.3 Level three menu

Level three menus are those branched out from level two menus. When invoked, they either go to other sub-menus or start an AML program. The following is a list of level three menus.

1. Called by information.menu:
pop_browser.menu: provides population information.
retail_browser.menu: provides store information.
real_estate.menu: provides real estate information.
road_browser.menu: provides road network information.

2. Called by modeling.menu:
site_analysis.menu: site history, image, trade area, accessibility, etc.
loc_alloc.menu: LSI modeling; open, close, move stores.
traffic.menu: shopping flows, routes, etc.
planning.menu: retail scenario planning, competition analysis.

3. Called by tools.menu:
view.menu: quick display of different map layers.
query.menu: quick view of the database, statistical analysis.
report.menu: generates report tables in ASCII format.
plot.menu: generates hard copy maps.

Although some of the above menus and functionalities are not yet implemented, the existing non-functional menus do provide insight and guidance for future implementation work. For example, Figure 15 shows how scenario planning might be interactively presented to users as a choice menu.


Figure 15. The Scenario Planning Menu 133 4.3.3 Integration of Arc/Info and the LSI Model One of the most important implementation tasks is to link Arc/Info with the LSI model and algorithms. To accomplish this, attention must be paid to the data input and output operations. Specifically, the user input has to be transformed from AML variables and info files to LSI input files in ASCII format. On the other hand, the LSI model output has to be sent back to info database for interactive display and analysis. As shown in the LSI modeling menu (Figure 16), the process contains three sequential steps: model initialization, candidate selection, and model operation.

4.3.3.1 Model Initialization

The purpose of model initialization is to interactively set up the essential model parameters. When the button "SELECT MODEL PARAMETERS" is invoked, a sub-menu shown in Figure 17 appears on the screen. The following parameters need to be chosen:

1. total_number_of_facilities (P): users can type in an integer number. The default value is five.

2. demand_item (O_i): the info item that is used as demand data. When the third button of the mouse is pushed, a pop-up window will appear showing all the node attributes of the current network. Users can select one desired item as demand. For instance, one may choose the Asian population as the demand item if the model is intended for locating oriental grocery stores in the study area. The default item is the total population.

Figure 16. The LSI Modeling Menu

Figure 17. The Model Initialization Menu

3. supply_item (W_j): this is the same operation as the above except that a supply item is chosen. The default supply item is store size.

4. distance_input (C_ij): three buttons give the available choices for obtaining traveling distance. It can be computed from latitude and longitude, from x-y coordinates on a plane, or taken from an already existing distance matrix. For network location problems, only the third button is a valid choice. The all-pairs shortest path distances can be obtained as a by-product of running the INTERACTION command under Arcplot.

5. distance_decay_function: this can be a power function or an exponential function. For a small area like a city, the exponential function is a more realistic representation.

6. beta_value: the β value can be typed in or selected using the slide bar. The range of β is between 0.0 and 4.0.

7. search_radius: this allows the user to impose an attracting range for stores. In other words, the probability of assigning a demand node to a store becomes zero if the node is located outside the store's search radius. This is not implemented at this time, so the default search radius is set to a very large number.

All the above user inputs are recorded as AML global variables. They are then written to a text file called INPUT1
As shown in the initialization menu (Figure 17), users also have the options of selecting required and prohibited sites. A required site acts as an existing facility and will always show up in the final model choices. On the contrary, a prohibited site will never be considered as a location candidate. In fact, what happens is that initially each node will bear a value of 0 in its candidacy_flag. Once it is selected as a required or prohibited site, its info record of candidacy_f lag will be assigned a value of 1 or 2. Consequently, the system is able to recognize users choices and to write such information into input files for the LSI algorithm.

Plate I provides an example of candidate sites selection.

In this case a polygon is used to define the candidate sites.

One node in southwest Redlands is chosen as a required facility, and a circle is used to define the nodes surrounding the required site as prohibited locations. This way of selecting sites implies that a) nodes in the peripheral area of the city are unlikely to be favored as optimal locations; and

Plate I. An Example of Site Candidate Selection

b) the required facility ought to have a threshold area to protect its sales. Moreover, by specifying some nodes as prohibited locations, computational cost is reduced in the vertex substitution process. This is an advantage that cannot be achieved with batch-mode modeling.

4.3.3.3 Model operation

Given the model parameter input and candidate site information, the LSI algorithm can be called in an AML program by using the directive &system (Arc/Info treats the command following &system as an operating system command). The command (under Arc or Arcplot) is: &system a.out, where a.out is a compiled FORTRAN program. The model output can be sent back to the INFO database by using the commands DEFINE and ADD under TABLES. DEFINE creates an INFO file with items that record the location-allocation result, and ADD uploads the model output file into the INFO tables for display and evaluation. The algorithm is currently written with the following capabilities:

1. initialization. The voting algorithm is employed to obtain a starting solution.

2. search. A targeted, permanent tabu list embedded in the vertex substitution method is used. A hashing strategy is also utilized throughout the search process.

3. bounding. Lagrangian relaxation is adopted for obtaining lower bounds for the heuristic solution.

4. interactive evaluation. This is carried out through a sub-menu (Figure 18). There are six functional buttons within the menu, each representing a possible evaluation strategy.

a) move facilities. Users can use a mouse to move any of the facilities in the current solution to new sites, then re-run the LSI model and see the result.

b) close facilities. When facilities are closed, they become prohibited sites. If the LSI model is re-run, the algorithm will search for new facilities to replace the closed ones. This is the same as running the LSI model with the same P value, with the non-closed facilities as required facilities.

c) add facilities. It is possible to add new facilities to the current solution set. This option is particularly useful if one wants to study the competition effect when new stores are added.

d) remove facilities. This option permanently deletes one or more facilities from the solution set.
Then the LSI model is run with a smaller P value.

e) path. This is actually a traffic analysis function. It finds the shortest path between any pair of nodes. The result is displayed on the screen with descriptions of travel directions from the origin to the destination.

f) tour. This function produces a traveling-salesman tour for a given origin, destination, and stops. The display and direction guidance are similar to those of the path function.
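The search capability listed above (vertex substitution with a permanent tabu set) can be sketched roughly as follows. This is an illustrative Python reimplementation under an assumed interface, not the dissertation's FORTRAN code, and it omits the hashing strategy.

```python
def vertex_substitution(candidates, cost, p, tabu=None):
    """One-swap (Teitz-Bart style) vertex substitution with a tabu set.

    candidates: ordered list of eligible sites.
    cost(solution): evaluates a set of p sites (lower is better).
    tabu: sites that are never allowed to enter the solution.
    Hypothetical interface for illustration only; assumes the first p
    candidates are not tabu.
    """
    tabu = tabu or set()
    current = set(candidates[:p])          # naive starting solution
    best = cost(current)
    improved = True
    while improved:
        improved = False
        for out_site in list(current):
            for in_site in candidates:
                if in_site in current or in_site in tabu:
                    continue               # tabu sites never enter the solution
                trial = (current - {out_site}) | {in_site}
                trial_cost = cost(trial)
                if trial_cost < best:      # accept the improving swap, rescan
                    current, best = trial, trial_cost
                    improved = True
                    break
            if improved:
                break
    return current, best
```

Because prohibited (tabu) sites are skipped before the swap is even evaluated, enlarging the tabu set directly shrinks the search space, which is the computational advantage of letting users mark prohibited nodes interactively.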

All the above modeling capabilities can also be applied to the P-median problem. This is useful because comparisons can be made between the LSI and the P-median model. In the loc_alloc.menu, a button is designated to display the LSI solution and the P-median solution together.

As shown in Plate II, a solution of the LSI model is displayed

Plate II. The Display of the LSI Model Solution

with colored clusters around store locations indicating the highest visiting probability for each demand node. The P-median solution, on the other hand, is shown with a conventional star-allocation pattern (Plate III). Plate IV is the picture when both solutions are put together. To facilitate better evaluation and judgment, multiple windows can also be opened to zoom in on sub-areas of the city and view details of the location-allocation pattern (Plate V).

Plate III. The Display of the P-median Solution

Plate IV. The Display of Both LSI and P-median Solutions

Plate V. Multiple Windows for Viewing Solutions

4.5 Summary and Conclusion

This chapter has presented the design and implementation of GRALSIM, a GIS-based retail analysis prototype. The system is established by developing a user-friendly interface to link database management, retail modeling, and user involvement (Figure 19). Such an integrative approach has many advantages over the traditional batch-mode modeling approach, and it is suitable for and supportive of applications involving both experienced and inexperienced users.

[Figure 19 is a structure diagram: the user interface links the end user with the database, the models, and the algorithms.]

Figure 19. The Structure of GRALSIM

The conceptual design of GRALSIM is based on the structured analysis method, which utilizes data flow diagrams to model the functional processes within individual modules. It centers on issues of data requirements, system functionality, and user interface. Although cast from an idealistic perspective, the design has provided useful insights into important aspects of a GIS-supported retail analysis system. The implementation is carried out using Arc/Info 6.1 on a Sun SPARC-2 workstation. Due to limitations of time and data, the focus of implementation is on the construction of the user interface and GIS-based LSI modeling. It demonstrates how GIS technology may be effectively combined with optimization-based location-allocation and spatial interaction modeling. In the future, more data can be collected and other spatial analytical models can be incorporated to support more comprehensive and in-depth analysis. In the next chapter, details of running the LSI model within GRALSIM will be discussed. This includes a report of solution quality, analysis of model dynamics, and evaluation of the computational experience.

CHAPTER V

RESULTS

5.1 Introduction

This chapter presents the experimental results of the GIS-supported LSI model developed in Chapters III and IV. The solution algorithms are coded in FORTRAN-77 and run on a Sun SPARC-2 workstation (JAGUAR) at the Department of Geography GIS Lab. The goals of these experiments are: (a) to test the ability of the heuristic in dealing with large-scale LSI problems; (b) to capture the intrinsic relationships among various model elements; (c) to generalize the spatial pattern and process of the LSI model under different assumptions and circumstances; and (d) to gain useful computational experience. The chapter is organized as follows. Section 5.2 evaluates the quality of the model solutions. Section 5.3 discusses the dynamics of the model solutions, such as the changing characteristics of average trip length, the comparison of cost-effectiveness between the LSI and P-median models, and the contrast of their spatial patterns and processes. The computational experience in terms of model running time and memory requirements is reported in Section 5.4. Section 5.5 provides a summary and some concluding remarks.

5.2 Solution Quality

5.2.1 Problem Design

The road network of Redlands, California has more than 2,000 nodes. After transforming census tract data to nodal population, 397 nodes received populations ranging from 100 to 1,000 people. These 397 nodes thus become the eligible store location candidates in our experiment. The shortest-path distances among these nodes are created using the INTERACTION command in ARCPLOT and saved as a model input file. In order to obtain useful and objective test results, two sets of large-scale location problems are designed. The first set of sample problems is interactively selected and submitted, with user-specified required and prohibited sites. The resulting customized network environment contains 1 required site, 51 prohibited sites, 288 supply nodes (candidate sites), and 397 demand nodes. The second set of sample problems is submitted in conventional batch mode, with no user specifications on required or prohibited sites, so there are 397 demand nodes as well as 397 supply nodes. For comparison purposes, each test problem is run three times, with a small, a medium, and a large β value respectively. In addition, the P-median location model is applied to each test problem, since it is our intention to contrast the performance and results of the LSI and P-median models.

5.2.2 The Primal Versus the Dual

The first issue is whether the heuristic designed earlier is able to obtain optimal solutions for given problems. Specifically, the following results are needed: (1) What is the gap between the primal and the dual solution? (2) How well does the initialization strategy, i.e., the voting algorithm, perform? Tables 19 and 20 present the solutions for the LSI problems. A total of 27 runs were made on the two sets of problems. The algorithm stops if the gap between primal and dual is less than 1% or a maximum of 50 iterations has been reached. In general the following observations are made:
1. Overall the performance of the heuristic is very good. Out of the 27 problems, the algorithm obtained 9 optimal solutions. In most cases the gap between the primal and the dual is between 1% and 2%; in only three cases is the gap larger than 2%.

2. Intuitively we would expect it to be more difficult to reach optimal solutions as P gets larger, since the problem becomes more complicated with a larger number of facilities. However, as shown in Figures 20 and 21, for both the LSI and P-median problems the gap between the primal and the dual tends to decrease as P increases. On the other hand, when the β value becomes larger, the gap also gets larger.

Table 19. Solutions of the First Set of LSI Problems

 P   OPEN   M    CLOSE   N     β     INITIAL   PRIMAL    DUAL
 5    1    288    51    397   0.05   -162.40   -162.40   -162.40
                              0.5     -14.99    -15.30    -15.30
                              5.0      -0.59     -0.93     -1.02
10    1    288    51    397   0.05   -176.28   -176.28   -176.28
                              0.5     -16.46    -16.71    -16.71
                              5.0      -0.93     -1.08     -1.15
20    1    288    51    397   0.05   -190.16   -190.16   -190.16
                              0.5     -17.82    -18.10    -18.09
                              5.0      -1.16     -1.21     -1.24
30    1    288    51    397   0.05   -198.27   -198.27   -198.27
                              0.5     -18.72    -18.91    -18.91
                              5.0      -1.26     -1.29     -1.31
40    1    288    51    397   0.05   -204.02   -204.02   -204.02
                              0.5     -19.35    -19.48    -19.48
                              5.0      -1.31     -1.35     -1.36

Table 20. Solutions of the Second Set of LSI Problems

 P   OPEN   M    CLOSE   N     β     INITIAL   PRIMAL    DUAL
 5    0    397    0     397   0.06   -135.23   -135.23   -135.23
                              0.8      -8.79     -9.23     -9.25
                              4.0      -0.94     -1.27     -1.42
10    0    397    0     397   0.06   -146.77   -146.77   -146.77
                              0.8      -9.57    -10.08    -10.12
                              4.0      -1.22     -1.47     -1.49
20    0    397    0     397   0.06   -158.31   -158.31   -158.31
                              0.8     -10.63    -10.93    -10.96
                              4.0      -1.58     -1.66     -1.69
30    0    397    0     397   0.06   -165.07   -165.07   -165.07
                              0.8     -11.28    -11.45    -11.45
                              4.0      -1.71     -1.76     -1.77

Figure 20. The Solution Quality of the LSI Model



Figure 21. The Solution Quality of the P-median Model (M = 288, N = 397)

3. It is also indicated by Figures 20 and 21 that the initialization strategy generally works quite well. When β is small (in these cases less than 0.1), the initialization strategy (voting rule one) obtains starting solutions that are also the optimal solutions. This verifies the theoretical expectation that facility locations tend to cluster at central sites when travel friction is of minor importance in the model. When β increases, the initial solutions tend to lie further from the optimal solution; in other words, the search strategies become more crucial as β gets larger.

4. It should be pointed out that the quality of the dual solution relies heavily on the choice of the step size in the Lagrangian procedure. Although analytical methods exist for selecting the step size (see Bazaraa and Sherali, 1981), it seems easier and more straightforward to experiment with different step-size values. In GRALSIM, users are allowed to interactively test different step-size and iteration values if they are not satisfied with the current dual solution.
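The step-size choice discussed in point 4 can be illustrated with a standard subgradient update of the Lagrange multipliers. The Held-Karp style formula below is one common choice; the rule actually used in GRALSIM may differ, and the function interface is hypothetical.

```python
def subgradient_step(lam, subgrad, upper, lower, mu=2.0):
    """One multiplier update in a Lagrangian subgradient search.

    Step length t = mu * (upper - lower) / ||g||^2, then lam <- lam + t * g,
    where `upper` is the best primal value, `lower` the current dual value,
    and `g` a subgradient of the dual function.  An over-aggressive mu causes
    the oscillation described in the text; a tiny mu stalls the search.
    """
    norm_sq = sum(g * g for g in subgrad)
    if norm_sq == 0:
        return lam                     # zero subgradient: dual optimum reached
    t = mu * (upper - lower) / norm_sq
    return [l + t * g for l, g in zip(lam, subgrad)]
```

Interactive tuning, as GRALSIM allows, amounts to letting the user pick mu and the iteration count and re-running this update until the primal-dual gap is acceptable.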

5.3 Solution Dynamics

Perhaps the most important task of our experiment is to search for the spatial implications of the model solutions. In other words, given the LSI model and its optimal or near-optimal solutions, what do they mean to retail analysts? How useful and different are the results compared to the conventional P-median model? To answer these questions, we will analyze several aspects of the model outputs and compare them to the results of the P-median model.

5.3.1 The Average Trip Length

The average trip length is represented by the second part of the objective function in the LSI model, i.e., Σ_i Σ_j S_ij C_ij. It is an aggregate measure of the average traveling cost in the current location-allocation system. In the P-median problem, the model objective is to make this cost as small as possible, whereas in the LSI model it is only part of the objective function, a part which plays a more important role as β increases. Figure 22 shows how the average trip length varies with the changing value of β. As expected, when β gets larger, the cost of traveling goes up, so the probability of visiting the nearest store increases, which results in a shorter average trip length. A closer look at the decrease of trip length indicates that the rate of change actually slows down as β becomes larger. For example, when β changes from 2 to 2.5, the average trip length decreases from 7.38 to 6.78, about 8.13%. When β increases from 4.5 to 5, the average trip length only declines about 3.74%, from 4.81 to 4.63. Thus we can conclude that with the increase of β, its "marginal effect" upon the objective function tends to diminish. As a matter of fact, given P = 5, the location pattern of the LSI model is the same as that of the P-median model when β reaches 5.0.
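For reference, a plausible algebraic form of the objective being described is the following, where S_ij denotes the flow from demand node i to store j and C_ij the travel cost between them. The exact weighting and sign conventions follow the formulation in Chapter III, so this should be read as a sketch rather than the dissertation's exact equation:

```latex
\min_{S} \; Z \;=\; \sum_{i}\sum_{j} S_{ij}\left(\ln S_{ij} - 1\right)
\;+\; \beta \sum_{i}\sum_{j} S_{ij}\, C_{ij}
```

The first term is the (negative) entropy of the flow pattern and the second is the total travel cost; as β grows, the cost term dominates and the solution approaches the P-median's all-or-nothing assignment.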


Figure 22. The Average Trip Length Versus Beta (P = 5, M = N = 397)

Notice that in this case any β value larger than 5.0 will no longer affect the store location pattern, but the average trip length will continue to decrease as more people travel to the nearest outlet. Another way to examine the dynamics of the average trip length is to see how it changes with the value of P. As shown in Figure 23, the variation is not only affected by P but is also under the strong influence of β. In most cases the average trip length decreases with the increase of P, since more stores are opened to provide easier access for shoppers. When β gets larger, the average trip length drops more quickly and the model behaves more like a P-median model. For instance, when P increases from 5 to 10, if β = 0.8 the average trip length drops from 9.67 to 9.62, only about 0.5%; if β = 4, the average trip length drops from 5.26 to 4.23, about 19.6%. In the P-median model, an increase of P from 5 to 10 produces a decrease of average trip length from 4.22 to 3.23, about 23.5%. Similar to the situation of changing β, when P becomes larger, a further increase of P has less impact on the decline of average trip length. This is clearly indicated by the shape of the curves in Figure 23. A unique case occurs when β becomes so small that an increase of P leads to a decrease in the probability of visiting the nearest outlets. As a result, the average trip length may actually increase. If we look at the objective function, we can find that this increase of average trip length is more than offset by the decrease of the entropy objective, which aims to reach an equilibrium pattern in terms of flow allocation. In the real world, it is not unusual that when more stores are introduced, customers take fewer trips to the old, perhaps closer outlets, but pay more visits to the new, more competitive outlets, even though these new stores are farther away. Such possibilities will never be reflected in the P-median model, but can easily be captured by the LSI model.

Figure 23. The Average Trip Length Versus P

5.3.2 The Cost-Effectiveness

The cost-effectiveness analysis aims to evaluate the relationship between the benefit and the cost in the model solution. In the LSI model, since the objective function combines maximizing entropy and minimizing average trip length, it is difficult to tell directly the exact amount of benefit for a given value of the objective function. On the other hand, the cost is quite straightforward: it is attributed to the number of stores designated by the P value. What we will do, then, is to see how the objective function decreases with an increasing number of stores. Such analysis is useful if one wants to simulate the level of profitability of individual outlets, provided the demand (purchasing power) and the total number of stores are known. Figure 24 depicts the cost-effectiveness pattern for the LSI model with β equal to 0.06, 0.8, and 4.0 respectively. We can see that the β value does not seem to generate much impact upon the overall cost-effectiveness pattern. When β = 0.06, the objective function value drops on average about 0.88% with the addition of one new outlet. In contrast, the average declining rate of the objective function is approximately 0.97% and 1.54% for β = 0.8 and β = 4.0. For comparison purposes, the cost-effectiveness pattern of the P-median model is displayed in Figure 25. Apparently the per-unit decrease of the objective function is much more evident in the P-median model, where the average declining rate is about 2.6%. Note that the above comparison should not be interpreted as meaning that the LSI model is less cost-effective than the P-median model. Rather, it simply indicates that the goal is more costly to achieve for the LSI model than for the P-median model. This is not unexpected, because the LSI model has to take care of objectives other than cost minimization. Finally, notice that in both the LSI and P-median models, the larger the P value, the slower the per-unit decrease of the objective function. That is, the marginal benefit declines for newly added stores.

Figure 24. The Cost-Effectiveness of the LSI Model (N = 397)

Figure 25. The Cost-Effectiveness of the P-median Model

5.3.3 The Location and Spatial Interaction Pattern

This section presents a case study of the LSI problem. What we intend to do is to locate 5 stores among the 397 candidate sites in Redlands, California; that is, P = 5, M = N = 397. We want to examine the location and spatial interaction patterns for a set of β values and to compare them to the P-median location pattern. It is believed the study will enhance our understanding of the model's properties as well as its spatial implications.

1. β = 2.0

The heuristic has chosen nodes 803, 1385, 1522, 1541 and 2259 as store locations. The primal objective function equals -3.16, the dual equals -3.20. The average trip length is 7.38. Plate VI displays the resulting location pattern, as well as the highest visiting probability of each demand node. The spatial interaction pattern is also summarized by the following model output:

  ID    INFLOW %   NRST %   TRIP LENGTH
  803     20.81     70.94      6.406
 1385     19.50     45.93      7.559
 1522     20.23     23.86      8.036
 1541     19.71     27.16      7.608
 2259     19.75     77.54      7.349

INFLOW %: percentage of customers captured by the store.
NRST %: percentage of visitors who are patronizing their closest outlet.

It is clear that an equilibrium allocation pattern is maintained by the LSI model, with each store capturing approximately 20% of the total customers. Among them, stores 1385, 1522, and 1541 have formed a cluster in downtown Redlands. Most customers visiting these three stores are not visiting their closest outlet, as indicated by the NRST % statistics. Stores 803 and 2259 are located in the peripheral area of the city, with over 70% of their customers being nearest visitors.

Plate VI. The Solution Pattern of the LSI Model (P = 5, M = N = 397, β = 2.0)

In terms of visiting probability, people in northeast and southwest Redlands are predominantly attracted to stores 803 and 2259 respectively. Therefore the area around these two stores displays a visiting probability between 50% and 75%. The stores located downtown have captured larger trade areas compared to the other two stores, but their visiting probabilities are relatively lower, only around 25% to 50%. This is mainly due to the competition among themselves.

2. β = 3.5

The heuristic in this case has chosen nodes 803, 1195, 1743, 1816 and 2259 as store locations. The primal objective function equals -1.53, the dual equals -1.67. The average trip length is 5.60. Plate VII displays the resulting location pattern, as well as the highest visiting probability of each demand node. The spatial interaction pattern is as follows:

  ID    INFLOW %   NRST %   TRIP LENGTH
  803     22.88     84.21      4.716
 1195     19.17     51.70      6.439
 1743     18.25     56.19      6.013
 1816     18.10     74.73      4.845
 2259     21.60     90.14      6.091

The location pattern differs from the earlier case in that the three clustered stores in the downtown area are replaced by three more distant stores. With the increase of β, the percentage of customers who visit their closest outlets has largely increased.

Plate VII. The Solution Pattern of the LSI Model (P = 5, M = N = 397, β = 3.5)

This phenomenon is reflected in the decline of average trip length from 7.38 to 5.60. A close look at the map reveals that each demand node has increased its visiting probability to the closest outlet. This is because the increasing traveling cost has made the nearest outlet more attractive. For instance, the areas surrounding stores 803 and 2259 previously had a visiting probability between 50% and 75%; now the probability is over 75%. If we check the INFLOW % statistics, we can see the flow captured by each store is less balanced compared to the earlier situation. The two stores at nodes 803 and 2259 have expanded their customer bases thanks to their location advantages.

3. β = 5.0

The algorithm has picked nodes 368, 778, 1682, 1743, and 2259 as store locations. The primal objective function equals -0.93, the dual equals -1.16. The average trip length is 4.64. Plate VIII displays the resulting location and spatial interaction pattern. Now it is easy to see that the gravity effect becomes quite remarkable, with each store being a dominant regional attraction center. Again some basic statistics are as follows:

  ID    INFLOW %   NRST %   TRIP LENGTH
  368     11.77     83.79      4.475
  778     21.59     92.14      3.660
 1682     22.02     83.65      4.235
 1743     22.05     78.30      4.855
 2259     22.58     93.83      5.837

Plate VIII. The Solution Pattern of the LSI Model (P = 5, M = N = 397, β = 5.0)

In general, the changing location and spatial interaction patterns exhibit the same characteristics as when β increases from 2.0 to 3.5. They can be summarized as: (1) a more dispersed location pattern, i.e., store locations with increasing inter-facility distance; (2) a higher probability of visiting the nearest outlet; (3) a decreasing average trip length; and (4) a less balanced flow allocation among competing stores.

In addition to the above common properties, the so-called "survival of the best" rule also seems to be valid. For example, node 2259 remains a store location in all three cases. Its location advantage has made it less vulnerable than any other candidate site. Experienced users thus can make node 2259 a required site when performing other tests, to save computational effort. As mentioned earlier, when β = 5.0, the resulting location pattern is also the same as the result from the P-median model. Plate IX illustrates the location-allocation pattern of the P-median solution, in which all the population of each demand node is assigned to the nearest facility. Notice that even though both the LSI and the P-median models have selected the same facility locations, their spatial interaction patterns are not the same. Since the LSI model follows a probabilistic allocation rule, a small percentage of the population still attends non-closest facilities, which leads to a longer average trip length. Another important difference

Plate IX. The Solution Pattern of the P-median Model

between the two models lies in trade area delimitation. It is very straightforward to define the boundary of a trade area in the P-median solution, but in the LSI solution the trade area boundaries are blurred. Plate X shows the trade areas of stores 368 and 2259 without considering the impact of the other three outlets. Such maps are particularly useful for simulating the competition among different stores.
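The per-store statistics reported above (INFLOW %, NRST %, and mean trip length) can be recomputed from any allocation matrix as follows. This is a sketch of how such output columns might be derived; the exact definitions used by the FORTRAN program are assumed.

```python
def store_summary(probs, costs, population):
    """Per-store INFLOW %, NRST %, and mean trip length from an allocation.

    probs[i][j]: probability that demand node i visits store j (rows sum
    to 1); costs[i][j]: travel cost; population[i]: people at node i.
    NRST % is read as the share of a store's inflow coming from nodes for
    which that store is the closest open outlet.
    """
    n_stores = len(probs[0])
    total_pop = float(sum(population))
    stats = []
    for j in range(n_stores):
        inflow = sum(population[i] * probs[i][j] for i in range(len(probs)))
        # flow into store j from nodes whose nearest open store is j
        nearest = sum(
            population[i] * probs[i][j]
            for i in range(len(probs))
            if min(range(n_stores), key=lambda k: costs[i][k]) == j
        )
        trip = sum(
            population[i] * probs[i][j] * costs[i][j] for i in range(len(probs))
        )
        stats.append({
            "inflow_pct": 100.0 * inflow / total_pop,
            "nrst_pct": 100.0 * nearest / inflow if inflow else 0.0,
            "trip_length": trip / inflow if inflow else 0.0,
        })
    return stats
```

Under this reading, the near-equal INFLOW % values in the β = 2.0 case reflect the entropy term's push toward balanced flows, while the rising NRST % values at larger β reflect the shift toward nearest-outlet patronage.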

5.4 Computational Experience

This section reports the computational experience of running the test problems. The focus is on running time and memory requirements. It is our hope that such analysis can help to establish benchmarks for future test work, and to provide a sense of how computational effort varies with different model elements. One result is that, by looking at the CPU time, we are able to learn what problem sizes are appropriate for interactive modeling.

5.4.1 Run Time Analysis

Tables 21, 22 and 23 record the total CPU time needed to run every LSI and P-median test problem, including input/output, initialization, vertex substitution, and Lagrangian relaxation/subgradient search. It should be pointed out that many elements can affect the CPU time; only the most important ones are discussed here.

Table 21. The Computational Experience of the First Set of LSI Problems

 P   OPEN   M    CLOSE   N     β     MEMORY   RUN TIME (SECONDS)
 5    1    288    51    397   0.05    920k          108
                              0.5     928k          108
                              5.0    1128k          151
10    1    288    51    397   0.05   1148k          169
                              0.5    1156k          170
                              5.0    1200k          195
20    1    288    51    397   0.05   1392k          418
                              0.5    1388k          420
                              5.0    1396k          444
30    1    288    51    397   0.05   1480k          805
                              0.5    1480k          810
                              5.0    1480k          836
40    1    288    51    397   0.05   1528k         1314
                              0.5    1524k         1324
                              5.0    1532k         1321

Table 22. The Computational Experience of the Second Set of LSI Problems

 P   OPEN   M    CLOSE   N     β     MEMORY   RUN TIME (SECONDS)
 5    0    397    0     397   0.06   1132k          164
                              0.8    1136k          166
                              4.0    1284k          247
10    0    397    0     397   0.06   1436k          280
                              0.8    1448k          284
                              4.0    1776k          644
20    0    397    0     397   0.06   1724k          733
                              0.8    1908k         1176
                              4.0    1944k         1751
30    0    397    0     397   0.06   1796k         1456
                              0.8    1996k         2634
                              4.0    2008k         3171

Table 23. The Computational Experience of the P-median Problem

 P   OPEN   M    CLOSE   N    MEMORY   RUN TIME (SECONDS)
 5    1    288    51    397    844k          119
10    1    288    51    397    884k          139
20    1    288    51    397    956k          216
30    1    288    51    397   1004k          332
40    1    288    51    397   1036k          482
 5    0    397    0     397    984k          187
10    0    397    0     397   1040k          215
20    0    397    0     397   1168k          356
30    0    397    0     397   1256k          567

Plate X. Trade Area Mapping for Two Selected Stores

1. Problem size. The values of P, M and N directly determine the number of operations needed to solve the problem. They are the most critical factors affecting running time. As indicated by Figure 26, increasing P causes the CPU time to go up dramatically. That is why users are encouraged to set required and prohibited sites: such participation actually decreases the size of the problem and therefore saves much computational effort. For example, consider the first test problem in Table 21, with OPEN = 1, CLOSE = 51, P = 5, M = 288, N = 397; the CPU cost is about 108 seconds. The same problem without any user specifications (P = 5, M = N = 397) takes about 164 seconds to run, a 51.8% increase in CPU time.

2. β value. As shown in Figure 26, the running time tends to be longer with larger β values, because it is more costly to calculate distance decay functions with larger β. In the P-median model, since no calculation of distance decay functions or allocation probabilities is involved, the CPU time is mainly affected by problem size; thus the running time increases less rapidly than that of the LSI model.


Figure 26. The CPU Time Versus Problem Size (M = N = 397)

3. Lagrangian relaxation. The CPU time is also influenced by the selection of step size and number of iterations during the Lagrangian relaxation process. An inappropriate step size (too big or too small) can lead to numerical errors or ineffective moves in the search for the dual objective function, causing the CPU time to go up dramatically. As for the number of iterations, obviously an increase costs more running time.

4. Initialization strategy. The initialization strategy may significantly affect the CPU time once the problem size gets very large. For instance, in the test problem where P = 30 and M = N = 397, the running time is longer for β = 0.8 than for β = 4.0, which appears quite unusual. The reason is that voting rule number three, associated with the medium β value, is more complicated than voting rule number two, associated with the large β value. It is the effort spent on initialization that makes the difference in CPU time.

5.4.2 The Memory Requirement

As indicated by Tables 21 and 22, there is generally a positive relationship between memory requirement and total CPU time. On average the LSI problems demand between 1,000 and 2,000 kilobytes to run. Therefore computer memory should not be a concern when the model runs on a workstation. However, it would be very difficult to move the modeling process to the PC platform.


Figure 27. The Memory Requirement Versus Problem Size (M = N = 397)

Figure 27 displays the changing pattern of memory requirement as the problem size goes up. Clearly, in comparison to the P-median model, the LSI model requires more memory to run. Usually problems with larger β values also need more memory. But unlike the curves of CPU cost, the pace of increase of the memory requirement actually slows down as the problem size expands. Many researchers have developed strategies to cut the memory requirement (see, for example, Densham and Rushton, 1992). Such strategies are usually more critical on the PC platform, where computer memory is a constant obstacle.

R.R fliiiwnnry and Conclusion In this chapter, the results from testing the LSI and the P-median model are presented. There are total about 30 testing cases, which represent typical retail location problems with a variety of model parameters. Basically three aspects of model performance are examined: the quality of model solution, the dynamics of location and spatial interaction pattern, and the computational experience. Overall the heuristic algorithm designed in Chapter 3 has performed very well. The initialization strategies often reach solutions quite close or even equal to the optimal solutions. About one-third of the test problems are solved to optimum. These are all cases with small £ values. The rest 182 of test problems on average had primal solutions within 1% to 2% of the dual objectives. The solution dynamics are mainly under the influence of two factors: the problem size and the S value. The problem size controls the complexity of the modeling process, whereas the fi value directs the inter-play between the entropy maximization objective and the cost minimization objective. It is observed that when the number of facilities (P) increases, the marginal utilities associated with the changing location and spatial interaction pattern tend to decrease. On the other hand, when S increases, the LSI model behaves more and more like a P-median model. Such comparative analysis of various model aspects has greatly enhanced our understanding to the dynamic forces and elements behind the spatial analytical process of retail site selection. It also reveals the great potential of the GIS supported LSI model in dealing with all kinds competitive location and spatial interaction problems. The computational experiences offer information about the cost of CPU time and computer memory, which tend to be neglected by previous research on the LSI models. 
It turns out that the LSI model consumed at least 50% more time and about 15% more memory than a P-median model of the same size. In addition, the roles played by different model elements in affecting computer running time and memory requirements are examined. The current experiences also provide a benchmark for further location-spatial interaction analysis. In the next chapter, the primary goals achieved by this research will be summarized, and its significance as well as its weaknesses will be discussed in a broader context. Directions for further research will also be addressed.

CHAPTER VI

CONCLUSION

6.1 Research Summary

Location-spatial interaction (LSI) modeling involves simultaneously selecting facility locations and assigning demand to each facility in a gravity-based probabilistic fashion. As such, it emphasizes both the characteristics of facilities and their spatial relationship with demand places. In the realm of location analysis, LSI models are mostly applied in retail site selection and trade area analysis. The objective of this research was to establish a GIS-supported modeling framework that facilitates large scale, comprehensive retail location and spatial interaction analysis. An LSI model was formulated based on the work of O'Kelly (1987). Solution strategies were developed and linked with GIS software through a user-friendly interface. This has created a single-user environment for interactive modeling, database management, visualization, and mapping. Experimental tests were conducted to explore various aspects of the LSI model, and to evaluate the efficiency and effectiveness of the solution algorithms. Such an integrative approach is relatively new in the literature of both location analysis and GIS applications.

In Chapter II, an extensive review of the literature on LSI modeling and related GIS applications was presented. This review encompassed location-spatial interaction theories, heuristic solution procedures, and operational issues of GIS. The substantial body of literature, which represents fruitful contributions from many disciplines, provided a solid background to guide this research. Theoretically, LSI modeling is only one branch of the vast location literature. The development of LSI models attempts to improve the effectiveness of retail location models by linking the allocations with spatial interaction theory. In contrast to the conventional all-or-nothing assignment to the nearest facility, an LSI model seeks a probabilistic assignment of flows between origins (residential zones or demand places) and destinations (stores or supply places). The gravity-based function is often in the form of a Huff model, or a multiplicative competitive interaction (MCI) model. The degree to which the probabilities are distributed among facility locations can be controlled through model parameters. This makes LSI models flexible enough to be applied in different geographical settings and environments. In terms of model formulation, three types of LSI models can be identified according to the model objectives: the cost-minimizing approach, the benefit-maximizing approach, and the entropy-maximizing approach. Each approach offers a unique perspective on how new retail outlets should be located in the context of many complicated issues: demand versus supply, cost versus benefit, and competition versus cooperation. Nevertheless, these approaches have not adequately treated retail site selection as a multiobjective process. There exists a wide range of solution methods that can be used to tackle location problems; they are either exact methods or heuristic procedures.
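The probabilistic assignment just reviewed can be made concrete with a few lines of code. The sketch below computes Huff-style visiting probabilities for a single origin; the store attractiveness values, travel costs, and the exponential deterrence form are illustrative assumptions, not parameters taken from the models reviewed here.

```python
import math

def huff_probabilities(attractiveness, distances, beta):
    """Probability that a shopper at one origin visits each store.

    attractiveness: store sizes W_j (hypothetical values)
    distances: travel costs c_ij from the origin to each store
    beta: travel-friction parameter; larger beta -> distance matters more
    An exponential deterrence function is assumed here; a power function
    is another common choice in the Huff/MCI family.
    """
    utilities = [w * math.exp(-beta * c) for w, c in zip(attractiveness, distances)]
    total = sum(utilities)
    return [u / total for u in utilities]

# Two stores: a large distant one and a small nearby one (made-up numbers).
W, C = [10.0, 2.0], [5.0, 1.0]
low = huff_probabilities(W, C, beta=0.1)   # low friction: flows spread out
high = huff_probabilities(W, C, beta=5.0)  # high friction: nearly all-or-nothing
```

With a small friction parameter both stores capture substantial shares; with a large one the assignment collapses toward the nearest store, which is exactly the all-or-nothing behavior of the conventional assignment.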
For large scale, nonlinear location problems such as the LSI models, heuristic approaches seem most appealing due to their tractability and lower cost in computing time and storage. Traditionally, three heuristic methods are most commonly employed: the greedy heuristic, the alternate location-allocation heuristic, and the vertex substitution heuristic. In recent years, AI-based heuristic algorithms such as genetic algorithms and tabu search have also been developed for handling complex combinatorial problems. Despite all these efforts, not much attention has been paid to the development of solution procedures suitable for attacking large scale LSI problems. It is widely agreed that GIS-supported interactive modeling has great potential in both theoretical research and real applications. However, research on how to link optimization-based location analysis with GIS is still in the experimental phase, with many unresolved issues. The trend is toward an integrative approach that combines database management and visual interactive modeling into a coherent system.

Chapter III introduced a location-spatial interaction model and proposed a heuristic algorithm for solving the problem. The LSI model has two objectives. The first objective is to seek the most likely trip distribution pattern according to the entropy maximization principle. It suggests that, due to intensive competition, the shopping flow pattern will settle into an equilibrium in which every retail outlet captures a fraction of consumers commensurate with its size and location. The second objective is to minimize the average trip length of shoppers. This implies that the retail system should provide as much accessibility to consumers as it possibly can. The balance of these two somewhat conflicting goals is maintained by the non-negative parameter β, which stands for the degree of travel friction. When β is small, the entropy maximization objective prevails.
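Generically, this trade-off can be stated in an entropy-median form. The following is a sketch in assumed notation (S_ij for flows, c_ij for travel costs, O_i for demand at origin i, x_j for siting decisions, and the friction parameter written here as β), not the exact statement from Chapter III:

```latex
% Sketch of a generic entropy-median LSI objective (assumed notation):
\min_{S,\,x}\; \sum_{i}\sum_{j} S_{ij}\,c_{ij}
  \;+\; \frac{1}{\beta}\sum_{i}\sum_{j} S_{ij}\ln S_{ij}
\quad\text{s.t.}\quad \sum_{j} S_{ij} = O_{i}\;\;\forall i,
\qquad \sum_{j} x_{j} = P,
\qquad x_{j}\in\{0,1\}.
```

Because the entropy term carries the weight 1/β, a small β lets entropy dominate, while a large β leaves essentially the pure transport-cost objective.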
As β increases, the cost minimization objective takes more control. Eventually, as β approaches infinity, the LSI model becomes equivalent to a conventional P-median problem. A set of heuristic strategies was designed to tackle the problem by focusing on (a) selecting good candidates; (b) avoiding bad candidates; and (c) skipping identical location patterns. The resulting algorithm consists of three sequential phases: initialization, search, and evaluation. The initialization phase aims to discover a good starting solution for the LSI problem. In particular, a so-called voting algorithm was devised for starting up the LSI model. It consists of three voting rules which capture the interplay between facility attraction and travel friction as β changes. Each rule is employed to produce the initial solution in cases where β is small, large, or in between. On the basis of the starting solution, the search phase aims to obtain a satisfactory (if not optimal) solution. The search process combines the vertex substitution method (VSM), tabu search, and a hashing strategy. Tabu search is able to intensify as well as diversify the vertex substitution process. The hashing procedure records all the location patterns that have been examined and helps the VSM avoid redundant swaps and computations. Finally, in the evaluation phase, a lower bound for the LSI problem is generated through Lagrangian relaxation and subgradient search. The bound provides information about the approximate gap between the current solution and the best possible solution. Moreover, since the algorithm is run interactively, many options have been designed to allow users to manipulate various aspects of the LSI model. These include performing sensitivity analysis, adjusting model parameters, and controlling model input and output.

Chapter IV was devoted to the design and implementation of the prototype GRALSIM (GIS-based Retail Analysis with Location-Spatial Interaction Modeling).
The concepts and strategies of structured analysis were employed to guide the entire process. Specifically, three issues were the focus of the work: the database, the functionality, and the user interface. For retail location analysis, a high quality database should comprise at least four types of data: demographic data, retail data, real estate data, and network data. Aside from the temporal dimension, each type of data consists of a variety of closely related spatial and aspatial attributes. There were basically four functions supported by the system: the visualization function, the query function, the modeling function, and the report function. Within each basic function, many different routines and operations were implemented. For example, both the LSI model and the P-median model were part of the modeling function. The user interface was created in accord with the database organization, the functionality requirements, and users' convenience. It included a set of hierarchically organized menus and pop-up windows. These menus could be identified as belonging to three basic branches: the "information" menus, the "modeling" menus, and the "tools" menus. Decision makers, whether experienced or not, will find the user interface a powerful and friendly consulting environment.

Chapter V presented the results of experimenting with GRALSIM. The focus was on the performance of the LSI and the P-median model. More than 30 test cases were carefully designed to represent typical retail location problems with a variety of model parameters. Three aspects of model performance were examined: the quality of the model solution, the dynamics of the location and spatial interaction pattern, and the computational experience. Overall, the heuristic algorithm designed in Chapter III performed very well. The initialization strategy often reached solutions quite close to or even equal to the optimal solutions.
The general tendency was that the smaller the β value, the more effective the voting algorithm. The test problems on average had primal solutions within 1% to 2% of the dual objectives. The solution dynamics were reflected by the changing characteristics of average trip length, cost-effectiveness, and the location-interaction process. They were mainly under the influence of two factors: the problem size (P) and the β value. It was observed that when the number of facilities (P) increased, the marginal utilities associated with the changing location and spatial interaction pattern tended to decrease. On the other hand, when β increased, the LSI model behaved more and more like a P-median model. It was shown that, in general, the average trip length would decrease with an increase of β or P. However, in cases where β was very small, an increase of P would sometimes lead to an increase of average trip length due to intensive store competition. This phenomenon was discovered for the first time by the LSI model, and would never be captured by a P-median model. In terms of cost-effectiveness, with the addition of each new retail outlet, the decrease of the objective function was usually much slower in the LSI model than in the P-median model. In other words, the objective was more costly to achieve for the LSI model than for the P-median model. It was also indicated that the β value had little influence upon the cost-effectiveness of the LSI model. The location and spatial interaction pattern of the LSI model exhibited several tendencies. When β was very small, store locations were more likely to cluster at the geographical center of the study area. As β increased, a more dispersed pattern formed. At the same time, the probability of shoppers visiting the closest store also went up. This in turn resulted in a more unbalanced flow allocation among competing stores.
The computational experience offered information about the cost of CPU time and computer memory, which had been neglected by previous research on LSI models. It turned out that the LSI model consumed at least 50% more time and about 15% more memory than a P-median model of the same size. The computational cost tended to be higher as the β value increased.
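To make the search phase summarized in this chapter more tangible, the following sketch caricatures one-swap vertex substitution with a short forward tabu list and a hash set of visited location patterns. It evaluates a P-median-style objective for simplicity, omits the voting start and the Lagrangian bound, and all names and the tiny test instance are hypothetical.

```python
def total_cost(sites, demand_pts, cost):
    # P-median-style evaluation: each demand point is served by its cheapest
    # open site (an LSI objective would use the probabilistic allocation).
    return sum(min(cost[i][j] for j in sites) for i in demand_pts)

def vertex_substitution(candidates, demand_pts, cost, p, tabu_len=5):
    current = set(candidates[:p])       # naive start; the voting start is omitted
    seen = {frozenset(current)}         # hashing: every pattern examined so far
    tabu = []                           # forward tabu: recently swapped-out sites
    improved = True
    while improved:
        improved = False
        best = total_cost(current, demand_pts, cost)
        for out_site in sorted(current):
            if improved or out_site in tabu:
                continue
            for in_site in candidates:
                if in_site in current:
                    continue
                trial = (current - {out_site}) | {in_site}
                if frozenset(trial) in seen:    # skip repeated patterns
                    continue
                seen.add(frozenset(trial))
                trial_cost = total_cost(trial, demand_pts, cost)
                if trial_cost < best:           # first-improvement swap
                    best, current, improved = trial_cost, trial, True
                    tabu = (tabu + [out_site])[-tabu_len:]
                    break
    return current, best

# Toy instance: four points on a line in two clusters; locate p = 2 sites.
pos = [0, 1, 10, 11]
cost = [[abs(a - b) for b in pos] for a in pos]
sites, best = vertex_substitution([0, 1, 2, 3], [0, 1, 2, 3], cost, p=2)
```

On this toy instance the procedure escapes the naive start (both sites in one cluster) and places one site in each cluster, reaching the total cost of 2.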

6.2 Contributions

This research spans several realms connected with retail location analysis. It directly contributes to the existing literature on location-spatial interaction modeling and to the general effort of promoting GIS applications in human geography. The contribution takes place at both the methodological and the substantive level. Methodologically, the proposed LSI model casts a multiobjective view upon the process of retail site selection. It not only considers the attractiveness of stores and the travel cost for consumers, but also connects these elements with the competition among different facility locations. The model is also flexible enough to include a variety of spatial and aspatial elements, yet remains consistent with location and spatial interaction theory. For instance, the interaction term Sij may be defined in accord with different models of the spatial interaction family. Further, there is great potential to derive new LSI models based on the current approach; this will be discussed in the next section. The heuristic algorithm has provided a more comprehensive approach to tackling the LSI model than previous methods. The hybrid strategies uniquely combine the voting start, tabu search, hashing, vertex substitution, as well as Lagrangian relaxation and subgradient search. These strategies are relatively independent of each other and of the model assumptions, which means they can be selectively employed depending on problem characteristics. They have proved to be very effective in solving the LSI problem. If properly combined, they can also be successfully applied to other location problems. The use of GIS has significantly strengthened the effectiveness of the retail location modeling process. By linking GIS software with a user-developed application model, the prototype system is able to retain both sophisticated database management functionality and strong modeling capabilities.
In particular, a new way of displaying LSI model output has been developed to portray the probabilistic allocations. With interactivity and visualization, users have been transformed from passive model operators into active decision makers. The current approach represents a practical way of integrating quantitative modeling and GIS. It demonstrates how modern technology can become a powerful driving force behind scientific research. Substantively, the spatial implications carried by the LSI model have been fully explored. First, no previous research has reported so many details in comparing the LSI and the P-median model. It is confirmed that the LSI model works better than the P-median model in characterizing the complexities of location and spatial interaction in the real world. The major disadvantage of the LSI model is that it costs more computer resources to operate. However, this will become less relevant as computer technology advances. Second, the comparative analysis of various model aspects has provided useful insights into the dynamic forces and elements behind the spatial analytical process of retail site selection. These insights can help to improve current as well as future approaches, from model construction and algorithm design to system refinement. For example, the system feature for interactively testing the step size during subgradient search was not part of the original design; only after many experiments did it become clear that this option should be added for users. Third, the experience gained from the experiments provides valuable feedback and benchmarks for further location-spatial interaction analysis. New models may be devised and compared to the current model in the same fashion as in this research. New algorithms may be developed to improve the performance of the current heuristic. New systems may be created with better design and more sophisticated functions. This leads to the discussion of future research possibilities.

6.3 Directions for Further Research

This research has demonstrated the theoretical and practical significance of using a GIS-supported LSI model to deal with retail location and spatial interaction problems. Now the question is: what are the future research directions in the context of the current work? In general, there are two major directions. The first is called vertical development, which refers to the extension and refinement of this research. The second is called horizontal development, which means the exploration and expansion of new research realms in light of the current approach. Vertical development represents the short term research goals, whereas horizontal development points to the long term research directions.

6.3.1 Vertical Development

6.3.1.1 Extension of the LSI Model

As indicated earlier, the proposed LSI model is flexible enough to encompass many different factors related to retail site selection. Therefore, a straightforward extension of the model is to add more complexity to the model variables, objectives, and constraints. Some possibilities are summarized as follows.

1. Model variables. The attractiveness variable Wj may be extended to an MCI approach with an increased number of variables. The travel cost variable Cij may be changed from shortest path distance to more complicated cost functions that simultaneously consider distance, flow congestion, and other elements associated with individual search activity (Miller, 1991). The demand variable Oi might be embedded into a population projection model. The interaction term Sij can be further divided into Sijt to address the temporal dimension of spatial interaction, which is usually ignored by current spatial interaction models.

2. Model objectives. If the current LSI model is treated as an entropy-median approach, it is not difficult to imagine that other types of objective functions may be developed in a similar way. For example, entropy-center, entropy-covering, and entropy-medi-center objectives all seem to represent interesting perspectives on retail location and competition.

3. Model constraints. As usual, more constraints can be added to transform the LSI model into a capacitated version, or a model with restrictions on inter-facility distance. In addition, instead of imposing a maximum threshold radius for stores or a search radius for shoppers, it is possible to define such restrictions in a probabilistic fashion, or in both manners. For instance, a store would not take into account shoppers beyond a certain distance, as well as those with a visiting probability lower than 30%.
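As an illustration of the combined restriction suggested in item 3, a store's allocation could be truncated by both a distance threshold and a visiting-probability floor and then renormalized; the function name, parameters, and values below are hypothetical.

```python
def restricted_allocation(probs, distances, max_radius, min_prob):
    """Drop stores beyond max_radius or below a visiting-probability floor,
    then renormalize the surviving probabilities (hypothetical sketch)."""
    kept = [p if (d <= max_radius and p >= min_prob) else 0.0
            for p, d in zip(probs, distances)]
    total = sum(kept)
    return [k / total for k in kept] if total > 0 else kept

# Four stores; the last is beyond the radius and also below the 10% floor.
shares = restricted_allocation([0.5, 0.3, 0.15, 0.05],
                               [1.0, 2.0, 3.0, 10.0],
                               max_radius=5.0, min_prob=0.1)
```

Renormalizing after truncation keeps the remaining shares a valid probability distribution, so the model's flow-conservation constraints are preserved.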
6.3.1.2 Improvement of the Solution Strategies

Although the heuristic developed in this research has proved to be very effective, there are unanswered questions and room for improvement. First, the role played by tabu search during vertex substitution needs to be further examined. It is not clear (a) how much computational effort may be saved with the cut provided by the forward tabu; (b) how vulnerable the solution quality is to the tabu cut; (c) how effective the backward tabu search is in helping the VSM avoid local optimum traps; and (d) what potential is associated with different combinations of a forward tabu strategy and the vertex substitution process. Second, the man-computer relationship during interactive modeling has not been fully explored. It is necessary to conduct more experiments with both experienced and inexperienced users. Their comments and suggestions will help to determine the appropriate level of interactive user control, and to improve the performance of the prototype system. Third, it is noticed that the voting algorithm is essentially a ranking method applied to the site selection process. There is a possibility that such a ranking mechanism can be borrowed for the vertex substitution process. Theoretically, if an algorithm consistently picks the candidate with the greatest potential to enter and the candidate with the least contribution to leave, it should have a better chance of arriving at the optimal solution. Nevertheless, it is not worth implementing this approach if the computational cost turns out to be too high.

6.3.1.3 Refinement of the Prototype System

Due to the limitations of data availability and time, some of the original system designs were not implemented. It is necessary to continue working on the prototype system to make it complete and better. Attention should be paid to the following three aspects.

1. Database.
The current data are far from sufficient to meet the needs of the system design. In particular, there is a lack of store-based retail data and real estate data. This has become a major obstacle hindering more practical and sophisticated retail analysis. Future work on the refinement of GRALSIM thus should start with data collection. The database needs to be expanded to include compatible information on different types of stores, and on different regions at various times.

2. Functionality. There is always a continuing need to enhance and expand the system functionality. Besides bringing in new location and spatial interaction models, one immediate direction for GRALSIM is to improve its support for statistical analysis. It is very inconvenient and even frustrating to export model outputs to spreadsheet software for the purpose of performance evaluation. Given current GIS technology, such statistical analysis should be possible to carry out interactively within the prototype system. In addition, it should be possible to display the LSI-based trade areas in 3-D maps.

3. Empirical study. Substantively, the experiments completed in this research are at a generalized level, focusing on the exploration of the spatial implications of the LSI model. If more data are collected, it will be very interesting to use the prototype system to examine one particular type of store: for instance, to simulate the changing trade area pattern and sales potential of department stores as new facilities open or old outlets close. Alternatively, one may compare the dynamic features of store location and spatial interaction in different regions.

6.3.2 Horizontal Development

6.3.2.1 LSI Routing

So far the literature of LSI modeling has ignored the routing process, assuming people travel either along straight-line distances on a plane or along shortest paths on a network. In reality, though, such assumptions are often incorrect. For example, people may go shopping on their way back from work. They may also choose a shopping route that is less congested, or a route that links several stores. On the other hand, many stores rely on delivery service to maintain their profit. The location decisions of these stores also need to take into account routing possibilities. Such behaviors on both the demand and supply sides have given rise to the need to develop LSI models that treat the routing procedures in a more realistic fashion. In fact, almost all retail outlets, especially gas stations, have to count on the flows passing by their sites to generate profit. The flow pattern, to a large extent, is related to the routing pattern within the study area. Therefore, the development of LSI routing models is a promising realm for future research.

6.3.2.2 Parallel Processing

The application of parallel processing to location modeling has just gotten under way (Densham and Ding, 1993). It is relevant to this research because, as mentioned in Chapter III, the voting algorithm is based on a "local perspective," with each node behaving in the same way. There is a very good chance that parallel processing may be employed to manage the voting procedure. Similarly, the vertex substitution process can be viewed as repeated actions of exchanges and evaluations. By using parallel processing, multiple swaps and evaluations could be handled at the same time. This would overcome the computing bottleneck for large problems and make the current algorithms more appealing and applicable.
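The idea of handling several swaps at once can be sketched with a thread pool. This executor-based formulation is only an illustration of the concept, not the parallel design discussed by Densham and Ding (1993); the evaluate function stands for any objective supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_swaps_in_parallel(current, candidates, evaluate):
    """Score every one-for-one swap of the current location pattern
    concurrently and return the best (score, pattern) pair."""
    swaps = [(frozenset(current) - {out_site}) | {in_site}
             for out_site in current
             for in_site in candidates if in_site not in current]
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(evaluate, swaps))  # order matches `swaps`
    return min(zip(scores, swaps))

# Toy check: with "sum of site indices" as the objective, the cheapest swap
# removes site 1 and brings in candidate 2.
best_score, best_sites = evaluate_swaps_in_parallel({0, 1}, [0, 1, 2], sum)
```

Since each swap evaluation is independent of the others, the work divides naturally across workers; in a full implementation the evaluation would be the expensive LSI objective rather than a trivial sum.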

6.3.2.3 Intelligent Interactive Modeling

Under the current approach, a decision maker has control over model input, output, and parameter specification. However, model objectives and constraints are predefined and cannot be modified by users. In contrast, intelligent interactive modeling would allow users to construct their own models by interactively specifying model assumptions, objectives, constraints, and various parameters. With object-oriented coding, the algorithm would be able to identify customized models and to solve them accordingly. Obviously, intelligent interactive modeling would offer maximum flexibility and control to decision makers. It is surely a very promising direction for future research.

BIBLIOGRAPHY

Achabal, D. D., W. L. Gorr and V. Mahajan 1982. MULTILOC: a multiple store location decision model, Journal of Retailing 58(2):5-25.
Aikens, C. H. 1985. Facility location models for distribution planning, European Journal of Operational Research 22:263-279.
Allard, L. and M. J. Hodgson 1987. Interactive graphics for mapping location-allocation solutions, American Cartographer 14(1):49-60.
Armstrong, M. P., S. De, P. J. Densham, P. Lolonis, G. Rushton and V. K. Tewari 1990. A knowledge-based approach for supporting locational decision-making, Environment and Planning B 17:341-364.
Armstrong, M. P., P. J. Densham, P. Lolonis, and G. Rushton 1992. Cartographic displays to support locational decision making, Cartography and Geographic Information Systems 19(3):154-164.
Bach, L. 1981. The problem of aggregation and distance for analysis of accessibility and access opportunity in location-allocation models, Environment and Planning A 13:955-978.
Bacon, R. W. 1984. Consumer Spatial Behavior. Oxford University Press, New York.
Ball, M. and M. Magazine 1981. The design and analysis of heuristics, Networks 11:215-219.
Baxter, J. 1981. Local optima avoidance in depot location, Journal of the Operational Research Society 32:815-819.

Bazaraa, M. S. and J. J. Goode 1979. A survey of various tactics for generating Lagrangian multipliers in the context of Lagrangian duality, European Journal of Operational Research 3:322-338.

Beaumont, J. R. 1980. Spatial interaction models and the location-allocation problem, Journal of Regional Science 20(1):37-50.
Beaumont, J. R. 1989. Towards an integrated information system for retail management, Environment and Planning A 21:299-309.
Beaumont, J. R. 1991. GIS and market analysis, in Geographical Information Systems: Principles and Applications. Maguire, D. J., M. F. Goodchild and D. W. Rhind (eds.), Longman, London, Vol. 2, pp. 139-151.

Bell, P. C. 1991. Visual interactive modeling: the past, the present, and the prospects, European Journal of Operational Research 54:274-286.
Berry, B. and J. Parr 1988. Market Centers and Retail Location: Theory and Applications. Prentice-Hall, Englewood Cliffs, N.J.
Berry, J. and F. Maclean 1989. Managing the development of a customer marketing database, Environment and Planning A 21:617-623.

Brady, S. D. and R. R. Rosenthal 1980. Interactive computer graphical solutions of constrained minimax location problems, AIIE Transactions 12(3):241-248. Brandeau, M. L. and S. S. Chiu 1989. An overview of representative problems in location research, Management Science 35(6):645-674.

Brandeau, M. L., S. S. Chiu and R. Batta 1986. Finding the two-medians of a tree network with continuous link demands, Annals of Operations Research 6:223-253.

Brotchie, J. F. 1969. A general planning model, Management Science 16:265.
Brown, S. 1992. Retail Location: A Micro-Scale Perspective. Avebury, Brookfield, Vermont.
Burkard, R. E. 1990. Locations with spatial interactions: the quadratic assignment problem, in Discrete Location Theory. Mirchandani, P. B. and R. L. Francis (ed.), John Wiley & Sons, Inc., New York, pp. 387-437.
Burrough, P. A. 1990. Methods of spatial analysis in GIS, International Journal of Geographical Information Systems 4(3):221-223.
Caliper Corporation 1988. TransCAD Users Manual V. 1.20. Caliper Corporation, Newton.
Casillas, P. 1987. Aggregation problems in location-allocation modeling, in Spatial Analysis and Location-Allocation Models. Ghosh, A. and G. Rushton (ed.), Van Nostrand Reinhold, New York, pp. 327-344.

Christofides, N. and J. E. Beasley 1983. Extensions to a Lagrangian relaxation algorithm for the capacitated plant location problem, European Journal of Operational Research 12:19-28.
Clarke, M. and A. G. Wilson 1985. The dynamics of urban spatial structure: the progress of a research program, Transactions of the Institute of British Geographers 10:427-451.
Clark, W. A. V. and G. Rushton 1970. Models of intraurban consumer behavior and their implications for central place theory, Economic Geography 46:486-497.
Coelho, J. D. and A. G. Wilson 1976. The optimal location and size of shopping centers, Regional Studies 10:413-421.
Cooper, L. 1963. Location-allocation problems, Operations Research 11:331-343.
Cooper, L. 1964. Heuristic methods for location-allocation problems, SIAM Review 6:37-52.

Cornuejols, G., M. L. Fisher and G. L. Nemhauser 1977. Location of bank accounts to optimize float: an analytic study of exact and approximate algorithms, Management Science 23(8):789-810.
Cornuejols, G., G. L. Nemhauser and L. A. Wolsey 1980. Worst-case and probabilistic analysis of algorithms for a location problem, Operations Research 28(4):847-858.

Cornuejols, G., G. L. Nemhauser and L. A. Wolsey 1989. The uncapacitated facility location problem, in Discrete Location Theory. Mirchandani, P. B. and R. L. Francis (ed.), John Wiley & Sons, Inc., New York, pp. 119-173.
Cornuejols, G., R. Sridharan and J. M. Thizy 1991. A comparison of heuristics and relaxations for the capacitated plant location problem, European Journal of Operational Research 50:280-297.
Craig, C. S., A. Ghosh and S. McLafferty 1984. Models of the retail location process: a review, Journal of Retailing 60(1):5-36.
Current, J. R. and D. A. Schilling 1987. Elimination of source A and B errors in p-median location problems, Geographical Analysis 19:95-110.
Current, J. R. and D. A. Schilling 1990a. Facility location modeling, working paper series 90-70, College of Business, The Ohio State University.
Current, J. R. and D. A. Schilling 1990b. Analysis of errors due to demand data aggregation in the set covering and maximal covering location problem, Geographical Analysis 22:116-126.

Current, J. R., H. Min and D. A. Schilling 1990. Multiobjective analysis of facility location decisions, European Journal of Operational Research 49:295-307.

Daskin, M. S., A. B. Haghani, M. Khanal and C. Malandraki 1987. Aggregation effects in maximum covering models, in Proceedings of the International Symposium on Locational Decisions IV, Namur, Belgium.

Densham, P. J. and G. Rushton 1988. Decision support systems for locational planning, in Behavioral Modelling in Geography and Planning. Golledge, R. and H. Timmermans (ed.), Croom Helm, London, pp. 56-90.

Densham, P. J. and G. Rushton 1992. Strategies for solving large location-allocation problems by heuristic methods, Environment and Planning A 24:289-304.

Densham, P. J. and Y. Ding 1993. Integrating parallel location algorithms with GIS, paper presented at the 89th Annual Meeting of the Association of American Geographers, abstracted in 1993 AAG Annual Meeting Abstracts, Atlanta, Georgia, p. 54.

Dickey, J. W. and F. J. Najafi 1973. Regional land use schemes generated by TOPAZ, Regional Studies 7:373-386.

Domich, P. D., K. L. Hoffman, R. H. F. Jackson and M. A. McClain 1991. Locating tax facilities: a graphics-based microcomputer optimization model, Management Science 37(8):960-979.

Domschke, W. and A. Drexl 1985. Location and layout planning: an international bibliography, in Lecture Notes in Economics and Mathematical Systems, vol. 238, Springer-Verlag, New York.
Efroymson, M. A. and T. L. Ray 1966. A branch-and-bound algorithm for plant location, Operations Research 14:361-368.

Eilon, S., C. D. T. Watson-Gandy and N. Christofides 1971. Distribution Management: Mathematical Modelling and Practical Analysis. Griffin Publishing, London.

Eilon, S. and R. D. Galvao 1978. Single and double vertex substitution in heuristic procedures for the p-median, Management Science 24(16):1763-1766.

Erkut, E. and S. Neuman 1989. Analytical models for locating undesirable facilities, European Journal of Operational Research 40:275-291.

Erlander, S. 1980. Optimal spatial interaction and the gravity model, in Lecture Notes in Economics and Mathematical Systems, vol. 173, Springer-Verlag, New York.

Erlenkotter, D. and G. Leonardi 1985. Facility location with spatially-interactive behavior, Sistemi Urbani 1:29-41.

ESRI 1991. ARC/INFO Network Users Guide. Environmental Systems Research Institute, Redlands.

Evans, S. 1973. The relationship between the gravity model for trip distribution and the transportation problem in linear programming, Transportation Research 7:39-61.

Feldman, E., F. A. Lehrer and T. L. Ray 1966. Warehouse location under continuous economies of scale, Management Science 12(9):670-684.

Fingleton, B. 1975. A factorial approach to the nearest center hypothesis, Institute of British Geographers Transactions 65:131-140.

Fisher, M. L. 1980. Worst-case analysis of heuristic algorithms, Management Science 26(1):1-17.

Fisher, M. L. 1981. The Lagrangian relaxation method for solving integer programming problems, Management Science 27(1):1-18.

Fisher, M. L. 1985. Interactive optimization, Annals of Operations Research 5:541-556.

Fotheringham, A. S. 1988. Market share analysis techniques: a review and illustration of current US practice, in Store Choice, Store Location and Market Analysis. Wrigley, N. (ed.), Routledge, Chapman and Hall, Andover, Hants, pp. 120-159.

Fotheringham, A. S. and M. E. O'Kelly 1989. Spatial Interaction Models: Formulations and Applications. Kluwer Academic Publishers.

Francis, R. L. and J. M. Goldstein 1974. Location theory: a selected bibliography, Operations Research 22:400-410.

Francis, R. L. and J. A. White 1974. Facility Layout and Location: An Analytical Approach. Prentice-Hall, Englewood Cliffs, New Jersey.

Francis, R. L., L. F. McGinnis, and J. A. White 1983. Locational analysis, European Journal of Operational Research 12:220-252.

Galvao, R. D. and L. A. Raggi 1989. A method for solving to optimality uncapacitated location problems, Annals of Operations Research 18:225-244.

Gaile, G. L. and Willmott, C. J. (ed.) 1989. Geography in America. Merrill Publishing Company.

Garey, M. and D. Johnson 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, San Francisco.

Garfinkel, R. S., A. W. Neebe and M. R. Rao 1974. An algorithm for the m-median plant location problem, Transportation Science 8:217-236.

Geoffrion, A. M. 1974. Lagrangian relaxation and its use in integer programming, Mathematical Programming Study 2:82-114.

Geoffrion, A. M. 1975. A guide to computer-assisted methods for distribution systems planning, Sloan Management Review 16:17-41.

Geoffrion, A. M. and R. McBride 1978. Lagrangian relaxation applied to capacitated facility location problems, AIIE Transactions 10:40-47.

Ghosh, A. and C. S. Craig 1983. Formulating retail location strategy in a changing environment, Journal of Marketing 47:56-68.

Ghosh, A. and C. S. Craig 1991. FRANSYS: a franchise distribution location model, Journal of Retailing 67(4):466-495.

Ghosh, A. and C. A. Ingene 1991. Spatial Analysis in Marketing: Theory, Methods, and Applications. JAI Press, Greenwich.

Ghosh, A. and S. McLafferty 1982. Locating stores in uncertain environments: a scenario planning approach, Journal of Retailing 58(4):5-22.

Ghosh, A. and S. McLafferty 1987. Location Strategies for Retail and Service Firms. Lexington Books, Lexington, Mass.

Ghosh, A., S. Neslin and R. W. Shoemaker 1984. A comparison of market share models and estimation procedures, Journal of Marketing Research 21:202-210.

Ghosh, A. and G. Rushton (ed.) 1987. Spatial Analysis and Location-allocation Models. van Nostrand Reinhold, New York.

Glover, F. 1977. Heuristics for integer programming using surrogate constraints, Decision Sciences 8:156-166.

Glover, F. 1989. Tabu search, part I, ORSA Journal on Computing 1(3):190-206.

Glover, F. 1990. Tabu search, part II, ORSA Journal on Computing 2(1):4-32.

Glover, F. and H. J. Greenberg 1989. New approaches for heuristic search: a bilateral linkage with artificial intelligence, European Journal of Operational Research 39:119-130.

Goodchild, M. F. 1979. The aggregation problem in location-allocation, Geographical Analysis 11:240-255.

Goodchild, M. F. 1984. ILACS: a location-allocation model for retail site selection, Journal of Retailing 60(1):84-100.

Goodchild, M. F. 1987. A spatial analytical perspective on geographical information systems, International Journal of Geographical Information Systems 1(4):327-334.

Goodchild, M. F. and P. J. Booth 1980. Location and allocation of recreation facilities: public swimming pools in London, Ontario, Ontario Geography 15:35-51.

Goodchild, M. F. and V. Noronha 1983. Location-allocation for small computers, Monograph No. 8, Department of Geography, The University of Iowa, Iowa City.

Haggett, P. 1966. Locational Analysis in Human Geography. St. Martin's Press, New York.

Hakimi, S. L. 1964. Optimum locations of switching centers and the absolute centers and medians of a graph, Operations Research 12:450-459.

Hakimi, S. L. 1990. Locations with spatial interactions: competitive locations and games, in Discrete Location Theory. Mirchandani, P. B. and R. L. Francis (ed.), John Wiley & Sons, Inc., New York, pp. 439-478.

Handler, G. Y. and P. B. Mirchandani 1979. Location on Networks: Theory and Algorithms. M.I.T. Press, Cambridge, Massachusetts.

Hansen, M. H. and C. A. Weinberg 1979. Retail market share in a competitive environment, Journal of Retailing 56(1):37-46.

Hansen, P., D. Peeters, and J. F. Thisse 1983. Public facility location models: a selective survey, in Locational Analysis of Public Facilities. Thisse, J. F. and H. G. Zoller (ed.), North-Holland, New York, pp. 223-262.

Harris, B. and A. G. Wilson 1978. Equilibrium values and dynamics of attractiveness terms in production-constrained spatial-interaction models, Environment and Planning A 10:371-388.

Hensher, D. A. and L. W. Johnson 1981. Applied Discrete Choice Modeling. John Wiley & Sons, New York.

Hillsman, E. L. 1984. The p-median structure as a unified linear model for location-allocation analysis, Environment and Planning A 16:305-318.

Hochbaum, D. S. 1984. When are NP-hard location problems easy?, Annals of Operations Research 1:201-214.

Hodgart, R. L. 1978. Optimizing access to public services: a review of problems, models and methods of locating central facilities, Progress in Human Geography 2:17-48.

Hodgson, M. J. 1978. Towards more realistic allocation in location-allocation models: an interaction approach, Environment and Planning A 10:1273-1285.

Hodgson, M. J. 1981. A location-allocation model maximising consumers' welfare, Regional Studies 15:493-506.

Holland, J. H. 1975. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI.

Holmberg, K. and K. Jornsten 1989. Exact methods for gravity trip-distribution models, Environment and Planning A 21:81-97.

Holmes, J., F. B. Williams and L. A. Brown 1972. Facility location under maximum travel restriction: an example using day care facilities, Geographical Analysis 4:258-266.

Hopmans, A. C. M. 1986. A spatial interaction model for branch bank accounts, European Journal of Operational Research 27:242-250.

Hosage, C. M. and M. F. Goodchild 1986. Discrete space location-allocation solutions from genetic algorithms, Annals of Operations Research 6:35-46.

Hubbard, R. 1978. A review of selected factors conditioning consumer travel behavior, Journal of Consumer Research 5:1-21.

Huff, D. L. 1964. Defining and estimating a trading area, Journal of Marketing 28:34-38.

Huff, D. L. 1966. A programmed solution for approximating an optimum retail location, Land Economics 42:293-303.

Hurter, A. and J. Martinich 1989. Facility Location and the Theory of Production. Kluwer Academic Publishers, Boston.

Isard, W. 1956. Location and Space-Economy. M.I.T. Press, Massachusetts.

Jain, A. K. and V. Mahajan 1979. Evaluating the competitive environment in retailing using multiplicative competitive interactive models, in Research in Marketing. Sheth, J. (ed.), JAI Press, Greenwich, Conn.

Jacobsen, S. K. 1987. On heuristics for some entropy maximizing location models, Research Report 12/87, IMSOR, Institute of Mathematical Statistics and Operations Research, Technical University of Denmark.

Jacobsen, S. K. 1983. Heuristics for the capacitated plant location model, European Journal of Operational Research 12:253-261.

Jarvinen, P., I. Rajala and H. Sinervo 1972. A branch-and-bound algorithm for seeking the m-median, Operations Research 20:173-178.

Jaynes, E. T. 1957. Information theory and statistical mechanics, Physical Review 106(4):620-630.

Jornsten, K. O. 1980. A maximum entropy combined distribution and assignment model solved by Benders decomposition, Transportation Science 12(3):262-276.

Juel, H. 1981. Bounds in the location-allocation problem, Journal of Regional Science 21:277-282.

Kalinski, A. A. 1992. A GIS-based prototype for retail potential surface mapping and analysis, URISA Proceedings, Vol. 4, pp.15-23.

Kariv, O. and S. L. Hakimi 1979. An algorithmic approach to network location problems, SIAM Journal on Applied Mathematics 37:513-560.

Khumawala, B. M., A. W. Neebe and D. G. Dannenbring 1974. A note on El-Shaieb's new algorithm for locating sources among destinations, Management Science 21:230-233.

Klincewicz, J. G. 1990. Avoiding local optima in the p-hub location problem using tabu search and GRASP, presented at ISOLDE V, June 1990.

Klincewicz, J. G. 1991. Heuristics for the p-hub location problem, European Journal of Operational Research 53:25-37.

Kohsaka, H. 1993. A monitoring and locational decision support system for retail activity, Environment and Planning A 25:197-211.

Krarup, J. and P. M. Pruzan 1979. Selected families of location problems, Annals of Discrete Mathematics 5:327-387.

Krarup, J. and P. M. Pruzan 1983. The simple plant location problem: survey and synthesis, European Journal of Operational Research 12:36-81.

Kuehn, A. A. and J. J. Hamburger 1963. A heuristic program for locating warehouses, Management Science 9(4):643-667.

Lakshmanan, T. R. and W. G. Hansen 1965. A retail market potential model, Journal of the American Institute of Planners 31:134-143.

Lapalme, G., J. Rousseau, S. Chapleau, M. Cormier, P. Cossette and S. Roy 1992. GeoRoute: a geographic information system for transportation applications, Communications of the ACM 35(1):80-88.

Lea, A. C. 1973. Location-Allocation Systems: An Annotated Bibliography. Discussion Paper No. 13, Department of Geography, University of Toronto, Canada.

Lea, A. C. 1989. An overview of formal methods for retail site evaluation and sales forecasting: part 1, The Operational Geographer 7(2):8-17.

Lea, A. C. and G. L. Monger 1990a. An overview of formal methods for retail site evaluation and sales forecasting: part 2, spatial interaction models, The Operational Geographer 8(1):17-23.

Lea, A. C. and G. L. Monger 1990b. An overview of formal methods for retail site evaluation and sales forecasting: part 3, location-allocation models, The Operational Geographer 8(3):17-26.

Leonardi, G. 1978. Optimum facility location by accessibility maximizing, Environment and Planning A 10:1287-1306.

Leonardi, G. 1981a. A unifying framework for public facility location problems-part 1: a critical overview and some unsolved problems, Environment and Planning A 13:1001-1028.

Leonardi, G. 1981b. A unifying framework for public facility location problems-part 2: some new models and extensions, Environment and Planning A 13:1085-1108.

Liu, Lin 1991. TIGER and census population data conversion for redistricting application at city level, Technical Report for ESRI.

Lösch, A. 1954. The Economics of Location, trans. Woglom, W. H. and Stolper, W. F., Yale University Press, New Haven.

Louveaux, F. V., M. Labbe, and J. F. Thisse (ed.) 1989. Facility Location Analysis: Theory and Applications, Annals of Operations Research 18:1-372.

Love, R. F. and H. Juel 1982. Properties and solution methods for large location-allocation problems, Journal of the Operational Research Society 33:443-452.

Love, R. F., J. G. Morris, and G. O. Wesolowsky 1988. Facilities Location: Models and Methods. North-Holland, New York.

Love, R. F. and P. D. Dowling 1989. A generalized bounding method for multifacility location models, Operations Research 37(4):653-657.

Malczewski, J. and W. Ogryczak 1990. An interactive approach to the central facility location problem: locating pediatric hospitals in Warsaw, Geographical Analysis 22(3):245-258.

Manne, A. S. 1964. Plant location under economies of scale: decentralization and computation, Management Science 11:213-235.

Maranzana, F. E. 1964. On the location of supply points to minimize transport costs, Operational Research Quarterly 15:261-270.

Marble, D. F. 1990a. Geographic information systems: an overview, in Introductory Readings in Geographic Information Systems. Peuquet, D. J. and D. F. Marble (ed.), Taylor and Francis, pp. 8-17.

Marble, D. F. 1990b. The potential methodological impact of geographical information systems on the social sciences, in Interpreting Space: GIS in Archaeology and Anthropology. Allen, Zubrow, and Green (ed.), Taylor & Francis, London.

McFadden, D. 1974. Conditional logit analysis of qualitative choice behavior, in Frontiers in Econometrics. Zarembka, P. (ed.), Academic Press, New York.

Megiddo, N. and K. J. Supowit 1984. On the complexity of some common geometric location problems, SIAM Journal on Computing 13:182-196.

Mirchandani, P. B. and R. L. Francis (ed.) 1990. Discrete Location Theory. John Wiley & Sons, New York.

Mirchandani, P. B. and A. Oudjit 1982. Probabilistic demands and costs in facility location problems, Environment and Planning A 14:917-932.

Naert, P. A. and M. Weverbergh 1981. On the prediction power of market share attraction models, Journal of Marketing Research 18:146-153.

Nakanishi, M. and L. G. Cooper 1974. Parameter estimates for multiplicative competitive interactive models: least squares approach, Journal of Marketing Research 11:303-311.

Neuberger, H. 1971. User benefit in the evaluation of transport and land use plans, Journal of Transport Economics and Policy 5:52-75.

O'Kelly, M. E. 1983. Impacts of multistop multipurpose trips on retail distributions, Urban Geography 4:173-190.

O'Kelly, M. E. 1987. Spatial interaction based location-allocation models, in Spatial Analysis and Location-allocation Models. Ghosh, A. and G. Rushton (ed.), van Nostrand Reinhold, New York, pp. 302-326.

O'Kelly, M. E. and J. E. Storbeck 1984. Hierarchical location models with probabilistic allocation, Regional Studies 18(2):121-129.

Openshaw, S. 1990. Spatial analysis and geographical information systems: a review of progress and possibilities, in Geographical Information Systems for Urban and Regional Planning. Scholten, H. J. and J. C. H. Stillwell (ed.), Kluwer Academic Publishers.

Papadimitriou, C. H. 1981. Worst-case and probabilistic analysis of a geometric location problem, SIAM Journal on Computing 10:542-557.

Rapp, Y. 1962. Planning of exchange locations and boundaries, Ericsson Technics 2:1-22.

ReVelle, C. 1987. Urban public facility location, in Handbook of Regional and Urban Economics. Chapter 27, Vol. II, Mills, E. S. (ed.), Elsevier Science Publishers B.V., pp. 1053-1096.

ReVelle, C., D. Bigman, D. A. Schilling, J. Cohon, and R. Church 1977. Facility location: a review of context-free and EMS models, Health Services Research 12:129-146.

ReVelle, C., D. Marks, and J. C. Liebman 1970. An analysis of private and public sector location models, Management Science 16:692-699.

Rosing, K. E., E. L. Hillsman and H. Rosing-Vogelaar 1979. A note comparing optimal and heuristic solutions to the p-median problem, Geographical Analysis 11:86-89.

Rushton, G. 1989. Applications of location models, Annals of Operations Research 18:25-42.

Scott, A. J. 1970. Location-allocation systems: a review, Geographical Analysis 2:95-119.

Shaw, S. L. 1993. GIS for urban travel demand analysis: requirements and alternatives, Computers, Environment and Urban Systems 17:15-29.

Sheffi, Y. 1985. Urban Transportation Networks: Equilibrium Analysis with Mathematical Programming Methods. Prentice-Hall, Inc., Englewood Cliffs, New Jersey.

Sheppard, E. S. 1980. Location and the demand for travel, Geographical Analysis 12:111-128.

Spencer III, T., A. J. Brigandi, D. R. Dargon and M. J. Sheehan 1990. AT&T's telemarketing site selection system offers customer support, Interfaces 20:83-96.

Tansel, B. C., R. L. Francis, and T. J. Lowe 1983. Location on networks: a survey, Management Science 29:482-511.

Teitz, M. B. and P. Bart 1968. Heuristic methods for estimating the generalized vertex median of a weighted graph, Operations Research 16:955-961.

Thisse, J. F. and H. G. Zoller (ed.) 1983. Locational Analysis of Public Facilities. North-Holland, Amsterdam.

Tomlinson, R. F. 1987. Current and potential uses of geographical information systems: the North American experience, International Journal of Geographical Information Systems 1(3):203-218.

Toregas, C. R., R. Swain, C. ReVelle and L. Bergman 1971. The location of emergency service facilities, Operations Research 19:1363-1373.

Weber, A. 1909. Über den Standort der Industrien. English translation: Theory of the Location of Industries. Friedrich, C. J. (ed. and transl.), Chicago University Press, Chicago, 1929.

Whitaker, R. A. 1983. A fast algorithm for the greedy interchange for large-scale clustering and median location problems, INFOR 21(2):95-108.

Williams, H. C. W. L. 1976. Travel demand models, duality relations and user benefit analysis, Journal of Regional Science 16(2):147-166.

Williams, H. C. W. L., K. S. Kim and D. Martin 1990. Location-spatial interaction models: 1. benefit-maximizing configurations of services, Environment and Planning A 22:1079-1089.

Williams, H. C. W. L. and K. S. Kim 1990a. Location-spatial interaction models: 2. competition between independent firms, Environment and Planning A 22:1155-1168.

Williams, H. C. W. L. and K. S. Kim 1990b. Location-spatial interaction models: 3. competition between organizations, Environment and Planning A 22:1281-1290.

Wilson, A. G. 1967. A statistical theory of spatial distribution models, Transportation Research 1:253-269.

Wilson, A. G. 1970. The use of the concept of entropy in system modelling, Operational Research Quarterly 21:247-265.

Wilson, A. G. 1974. Urban and Regional Models in Geography and Planning. Wiley, London.

Wilson, A. G. 1976. Retailers' profits and consumers' welfare in a spatial interaction shopping model, in Theory and Practice in Regional Science. Masser, I. (ed.), Pion, London, pp. 42-59.

Wilson, A. G. 1988. Store and shopping-centre location and size: a review of British research and practice, in Store Choice, Store Location and Market Analysis. Wrigley, N. (ed.), Routledge, Chapman and Hall, Andover, Hants, pp. 160-186.

Wilson, A. G., J. D. Coelho, S. M. Macgill and H. C. W. L. Williams 1981. Optimization in Locational and Transport Analysis. John Wiley & Sons, New York.

Wilson, A. G. and M. L. Senior 1974. Some relationships between entropy maximizing models, mathematical programming models, and their duals, Journal of Regional Science 14:207-215.

Wong, R. T. 1985. Location and network design, in Combinatorial Optimization: Annotated Bibliographies. O'hEigeartaigh, M., J. K. Lenstra and A. H. G. Rinnooy Kan (ed.), Wiley, New York.

Wrigley, N. 1988. Store Choice, Store Location and Market Analysis. Routledge, Chapman and Hall, Andover, Hants.

Yourdon, E. 1989. Modern Structured Analysis. Prentice-Hall.