Exploration of Customer Churn Routes Using Machine Learning Probabilistic Models

UNIVERSITAT POLITÈCNICA DE CATALUNYA

DOCTORAL THESIS

Author: David L. García
Advisors: Dr. Àngela Nebot and Dr. Alfredo Vellido
Soft Computing Research Group
Departament de Llenguatges i Sistemes Informàtics
Universitat Politècnica de Catalunya. BarcelonaTech
Barcelona, 10th April 2014

A bird doesn’t sing because it has an answer, it sings because it has a song.
Maya Angelou

To Mónica, Olivia, Pol and Jana

Acknowledgments

I have always thought that things in life do not happen by chance; they are the sum of small (and large) contributions, coincidences, ideas, challenges, luck, love, high standards and support, encouragement and discouragement, effort, inspiration, reflections and criticism and, ultimately, of everything that shapes us as people. This Doctoral Thesis is no exception. Without small doses of these ingredients contributed by many people, this Thesis would never have seen the light of day. To all of them, my public gratitude.

To Drs. Àngela Nebot and Alfredo Vellido: first, for recognizing some kind of distinctive 'value' in a profile like mine and accepting me into the Soft Computing Group as a doctoral student; and second, and more importantly, for being the soul of this work. If you look closely, you will notice their presence and ideas throughout the Doctoral Thesis. To them, my gratitude and admiration.

To my clients and colleagues in the business world, with whom I have shared countless hours and problems, and whose high standards have tested, every day, the meaning and practical usefulness of the ideas and methodologies presented here. One of the latter, Francesc Massip, has been of invaluable help in putting these ideas into practice at some of the large companies where we have worked. I hereby acknowledge his tireless work, an acknowledgment I extend to the endless group of collaborators alongside whom I have been consulting for more than 20 years. Thank you for letting me share in your high standards and excellence at work.

And finally, to my family (starting with my wife Mónica and her infinite patience, and my children Olivia, Pol and Jana; and also my mother Pepita, my brother Jordi and my sister-in-law Montse), the most important thing in my life, who have always encouraged me to keep going and who can finally breathe again now that they see me freed from the author's bondage. Without their love, example and support, I would never have been able to complete this doctoral study.

Abstract

The ongoing processes of globalization and deregulation are changing the competitive framework in the majority of economic sectors. The appearance of new competitors and technologies entails a sharp increase in competition and a growing preoccupation among service-providing companies with creating stronger bonds with customers. Many of these companies are shifting resources away from the goal of capturing new customers and are instead focusing on retaining existing ones. In this context, anticipating the customer’s intention to abandon, a phenomenon also known as churn, and facilitating the launch of retention-focused actions represent clear elements of competitive advantage. Data mining, as applied to market surveyed information, can provide assistance to churn management processes. In this thesis, we mine real market data for churn analysis, placing a strong emphasis on the applicability and interpretability of the results. Statistical Machine Learning models for simultaneous data clustering and visualization lay the foundations for the analyses, which yield an interpretable segmentation of the surveyed markets. To achieve interpretability, much attention is paid to the intuitive visualization of the experimental results.
Given that the modelling techniques under consideration are nonlinear in nature, this represents a non-trivial challenge. Newly developed techniques for data visualization in nonlinear latent models are presented. They are inspired by geographical representation methods and suited to both static and dynamic data representation.

Contents

1 Introduction
1.1 Main goals of the thesis
1.1.1 Market analysis goals of the thesis
1.1.2 Data modelling goals of the thesis
1.2 Structure of the Thesis
2 Customer Continuity Management as a foundation for churn Data Mining
2.1 Customer churn: the business case
2.1.1 Customer continuity management
2.2 Customer churn prevention: loyalty construction
2.2.1 Basic concepts
2.2.2 Models of loyalty
3 Predictive models in churn management
3.1 Building predictive models of abandonment
3.1.1 Stage 1: Identifying and obtaining the best data
3.1.2 Stage 2: Selection of attributes
3.1.3 Stage 3: Development of a predictive model
3.1.4 Stage 4: Validation of results
3.2 A summarized review of the literature
4 Manifold learning: visualizing and clustering data
4.1 Latent variable models
4.2 Self Organizing Maps
4.2.1 Basic SOM
4.2.2 The batch-SOM algorithm
4.3 Generative Topographic Mapping
4.3.1 The GTM Standard Model
4.3.2 The Expectation-Maximization (EM) algorithm
4.3.3 Data visualization and clustering through GTM
4.3.4 Magnification Factors for the GTM
5 Supervised customer loyalty analysis
5.1 Petrol station customer satisfaction, loyalty and switching barriers
5.2 Methods
5.2.1 Bayesian ANN with ARD
5.2.2 Orthogonal Search-based Rule Extraction
5.3 Experiments
5.3.1 Results and discussion
5.4 Conclusions
6 Unsupervised churn analysis in a telecommunications company
6.1 Problem Description
6.2 Telecommunications customer data
6.3 Methods
6.3.1 The FRD-GTM
6.3.2 Two-Tier market segmentation
6.4 Experiments
6.4.1 Visualization and clustering using GTM
6.4.2 Visualization and clustering using FRD-GTM
6.5 Conclusions
7 Distortion visualization in GTM
7.1 Visualization of multivariate data using cartograms
7.1.1 Methods
7.1.1.1 Density-equalizing cartograms
7.1.2 Cartogram visualization of the GTM magnification factors
7.2 Experiments
7.2.1 Experiments with artificial data
7.2.1.1 Preliminary experiment with 3-D artificial data
7.2.1.2 Further experiments with artificial data
7.2.2 Experiments with real data
7.3 Conclusions
8 Distortion and Flow Maps visualization for churn analysis
8.1 Methods and Materials
8.1.1 Flow Maps for the visualization of customer migrations in GTM
8.1.2 Brazilian telecommunication company
8.1.3 Spanish pay-per-view television company
8.2 Experiments
8.2.1 Brazilian telecommunication company
8.2.1.1 Experimental Setup
8.2.1.2 Results
8.2.2 Discussion
8.2.3 Spanish pay-per-view company
8.2.3.1 Results and discussion
8.3 Conclusions
9 Conclusions and future research
9.1 Conclusions
9.2 Suggestions for future research
Publications
References

List of Figures

2 Customer Continuity Management as a foundation for churn Data Mining
2.1 Commercial development alternatives in mature markets.
2.2 Stages in a client’s lifecycle.
2.3 Customer lifecycle management model: Customer Continuity Management.
2.4 Illustration of the effect of developing satisfaction excellence policies and procedures and positive switching barriers.
2.5 Models analyzed by Cronin
2.6 Model proposed by Jones
2.7 Model proposed by Chen
2.8 Model proposed by Ranaweera
2.9 Model proposed by Patterson
2.10 Model proposed by Gounaris
2.11 Model proposed by Kim
2.12 Model proposed by Lam
2.13 Integrated model of retail service relationships proposed by Fullerton
2.14 Second variant of integrated model proposed by Fullerton
2.15 Model proposed by Vázquez
2.16.
