Using Data Science to Predict Booking Cancellations

IUL School of Technology and Architecture Department of Information Science and Technology Hotel Revenue Management: Using Data Science to Predict Booking Cancellations Nuno Miguel da Conceição António Thesis specially presented for the fulfilment of the degree of Doctor in Information Science and Technology Supervisors: PhD Ana de Almeida, Assistant Professor, ISCTE-IUL PhD Luis Nunes, Assistant Professor, ISCTE-IUL June, 2019 IUL School of Technology and Architecture Department of Information Science and Technology Hotel Revenue Management: Using Data Science to Predict Booking Cancellations Nuno Miguel da Conceição António Thesis specially presented for the fulfilment of the degree of Doctor in Information Science and Technology Jury: PhD Pedro Ramos, Associate Professor, ISCTE-IUL PhD Markku Vieru, Associate Professor, University of Lapland PhD Paulo Rita, Full Professor, NOVA IMS (Universidade Nova de Lisboa) PhD Francisco Serra, Coordinator Professor, Universidade do Algarve PhD Catarina Silva, Associate Professor, Instituto Politécnico de Leiria PhD Ana de Almeida, Associate Professor, ISCTE-IUL June, 2019 To Sofia for all the love and support. In memory of my mother, Ana Rosa. Hotel Revenue Management: Using Data Science to Predict Booking Cancellations ABSTRACT In the hotel industry, demand forecast accuracy is highly impacted by booking cancellations. Trying to overcome loss, hotels tend to implement restrictive cancellation policies and employ overbooking tactics which in turn reduces the number of bookings and reduces revenue. To tackle the uncertainty arising from cancellations, models for the prediction of a booking’s cancellation were developed. Data from hotels’ reservations systems was combined with data from other sources (events, holidays, online prices/inventory, social reputation and weather). Despite data class imbalance, concept drift, and dataset shift problems, it was possible to demonstrate that to predict cancellations of bookings is not only possible but also accurate. Moreover, it helped to better understand what the cancellation drivers can be. In order to assess the models under real conditions, a prototype was developed for field tests allowing to evaluate how an automated machine learning system that predicts booking’s cancellations could be integrated into hotels’ systems. The model’s performance in a real environment was assessed, including the impact on the business. The prototype implementation enable an understanding of adjustments to be made in the models so that they could effectively work in a real environment, as well as fostered the creation of a new measure of performance evaluation. The prototype enabled hoteliers to act upon identified bookings and effectively decrease cancellations. Moreover, results confirmed that booking cancellation prediction models can improve demand forecast, allowing hoteliers to understand their net demand, i.e., current demand minus predicted cancellations. Keywords: Data science, Hotel industry, Machine learning, Predictive analytics, Revenue management i ii Hotel Revenue Management: Using Data Science to Predict Booking Cancellations RESUMO Na indústria hoteleira, a precisão da previsão da procura é altamente impactada pelos cancelamentos de reservas. Na tentativa de mitigar as consequências dos cancelamentos, os hotéis tendem a implementar políticas de cancelamento restritivas e táticas de overbooking, o que, por sua vez, reduz o número de reservas e a receita. Para combater a incerteza decorrente dos cancelamentos, foram desenvolvidos modelos capazes de prever a probabilidade de cada reserva vir a ser cancelada. Neste desenvolvimento foram utilizados dados de oito sistemas de gestão de reservas de outros tantos hotéis, conjuntamente com dados de outras fontes (eventos, feriados, preços/inventário online, reputação social e clima). Apesar dos problemas de desequilíbrio de classe de dados, desvio de conceito e variação de distribuição entre variáveis ao longo do tempo, foi possível demonstrar que prever cancelamentos de reservas não é apenas possível realizar, mas que é possível de fazer com elevada precisão. A elaboração dos modelos ajudou ainda a compreender os fatores que influenciam o cancelamento. Para avaliar os modelos em condições reais, foi desenvolvido um protótipo, o qual permitiu avaliar como um sistema automatizado baseado em aprendizagem automática para prever os cancelamentos de reservas pode ser integrado nos sistemas dos hotéis. Este protótipo permitiu ainda avaliar o desempenho dos modelos num ambiente real, incluindo o seu impacto na operação. A implementação possibilitou também compreender os ajustes a serem feitos aos modelos para que pudessem efetivamente trabalhar num ambiente real, bem como fomentou a criação de uma nova medida de avaliação de desempenho. O protótipo permitiu que os hoteleiros agissem sobre as reservas identificadas e efetivamente diminuíssem os cancelamentos. Para além disso, os resultados confirmaram que os modelos de previsão de cancelamento de reservas podem melhorar a previsão de procura, permitindo que os hoteleiros compreendam melhor a sua procura líquida, ou seja, a procura atual menos os cancelamentos previstos. Palavras-chave: Aprendizagem automática, Ciência de dados, Gestão de receita, Indústria hoteleira, Modelos preditivos iii iv Hotel Revenue Management: Using Data Science to Predict Booking Cancellations ACKNOWLEDGMENTS Gratitude is perhaps the word that best defines what I feel in completing this long and challenging journey that is to do a doctorate. Although it is a journey that is made up of many periods of solitary work, it is a journey that is impossible to accomplish without the support of other people and institutions. It is good to know that when we need it, there are so many people willing to help. To all of them, and there were many, I want to leave my deepest gratitude here. Firstly, I would like to express my enormous gratitude to my supervisors, Professor Ana de Almeida and Professor Luis Nunes. Without knowing me, they not only believed in my initial proposal, but they also made an effective contribution in transforming it in a PhD research project and accepted to be my supervisors. Without Professor Ana de Almeida, and Professor Luis Nunes knowledge, guidance, motivation, availability, and patience (I know that sometimes I can be annoying) this project would not have been completed. Their willingness to help in all matters, including in bureaucracies, conferences participations, among other non-research related things, is an example that I will try to follow and something I will never forget. Besides my supervisors, I would like to thank also the rest of my thesis committee: Professor Paulo Rita and Professor Francisco Serra. Before starting this doctoral course, I did not know Professor Paulo Rita, but during the course, I had the privilege of getting to know him better and beneficiating from his insightful comments and suggestions. Professor Francisco Serra is someone I admire, as a professional and as a person, since he was my MsC supervisor. Although he started working in a public office at the same time I started the doctoral course, he was always available for helping, allowing me to beneficiate from his revenue management and hotel industry expertise. I would like to thank also all the professors of the classes I had in the doctoral course, at ISCTE- IUL. However, I have to highlight Professors Fernando Batista and Ricardo Ribeiro. It was their classes and subsequent collaboration in two papers, that made me treasure natural language processing. A special thanks to Professor Paul Phillips from the University of Kent, U.K. The opportunity to work with him was an honor that I cherish and that made me grow as a researcher. A word of gratitude also to my colleagues at the Universidade do Algarve, in particular, to Marisol Correia, Célia Ramos, Carlos Afonso, José Santos, and Ana Marreiros who have taught me much. Even though for privacy reasons I cannot mention any names, I would like to express my sincere gratitude to the hotels’ board members and hotels’ managers who allow me to use their hotels’ data in my research and for sharing their valuable time in meetings with me. Conducting doctoral research and making it known is something that can require some investment. In my case, this was particularly true due to the computational power and data storage v space required. I managed to secure all the infrastructure resources with the help of three different entities. First, ITBASE, the company I work, lent me a server for the duration of the course. Second, a friend, Pedro Reis from Liderlink, lent me two servers on his company datacenter. Third, Microsoft. By conceding me a Microsoft Azure Data Science Grant Award in the value of 50 000 USD it allowed me to overcome all the remaining computational restrictions I had. I also had the financial support of ISCTE-IUL, FCT (funded by FCT/MEC through national funds under the project UID/EEA/50008/2013), ISTAR-IUL, CISUC, Instituto das Telecomunicações, and Fundação Luso-Americana para o Desenvolvimento for sharing and publishing my PhD research. To all of them, my deepest thanks. A very special gratitude goes to Carlos Soares and Carlos Alexandre Soares that understood what the PhD meant for me and not only agreed on being ITBASE to support the tuition costs but also gave me the latitude to manage my working time according to the PhD requirements. A big thank you to my friend and fellow diver from PORTISUB, Peter Bolton, for the time he spent revising the English of some of my publications. A big thank you also to my colleague Tatiana Simões for revising some of my many PowerPoint presentations. I want to thank my family, friends, and fellow members of the board of PORTISUB, for their support and for understanding my absence from many events during this period of the doctoral course. Last but not least, I would like to thank my wife Sofia. It was several years with virtually no vacations. Many weekends spent working. Many days waking up early or staying up until late to work on the PhD research. Without Sofia’s understanding and support it would have been impossible to get to the end of this journey.

Using Data Science to Predict Booking Cancellations

25 Little Known Ways to Get Paid to Speak.Pages

Module 1: How to Find Speaking Opportunities (The Secret Is They Are EVERYWHERE!)

Online Research Tools

Transcript for #Ls11 Tuesday.Pdf

2015 PCA Conference Final Program

United States Securities and Exchange Commission Form

September 17, 2019 High School Library AGENDA I

The Future Is

Lunch & Learn Your Way to Knowledge!

Topical Community Detection in Event-Based Social Network

Row Labels Count of Short Appname Mobileiron 3454 Authenticator 2528

Thanks for Downloading Our Big List of Marketing Tools for Event Organisers!