Causal Inference, Cricket and Retail by Muhammad Jehangir Amjad B.S.E, Electrical Engineering, Princeton University (2010)

Sequential Data Inference via Matrix Estimation: Causal Inference, Cricket and Retail by Muhammad Jehangir Amjad B.S.E, Electrical Engineering, Princeton University (2010) Submitted to the Sloan School of Management, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Operations Research at the Massachusetts Institute of Technology September 2018 c 2018 Massachusetts Institute of Technology All Rights Reserved. Signature of Author: Muhammad Jehangir Amjad Operations Research Center August 15, 2018 Certified by: Devavrat Shah Professor of Electrical Engineering and Computer Science Thesis Supervisor Accepted by: Patrick Jaillet Dugald C. Jackson Professor of Electrical Engineering and Computer Science Co-Director, Operations Research Center 2 Sequential Data Inference via Matrix Estimation: Causal Inference, Cricket and Retail by Muhammad Jehangir Amjad Submitted to the Sloan School of Management in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Operations Research Abstract This thesis proposes a unified framework to capture the temporal and longitudinal variation across multiple instances of sequential data. Examples of such data include sales of a product over a period of time across several retail locations; trajectories of scores across cricket games; and annual tobacco consumption across the United States over a period of decades. A key component of our work is the latent variable model (LVM) which views the sequential data as a matrix where the rows correspond to multiple sequences while the columns represent the sequential aspect. The goal is to utilize information in the data within the sequence and across different sequences to address two inferential questions: (a) imputation or “filling missing values" and \de-noising" observed values, and (b) forecasting or predicting \future" values, for a given sequence of data. Using this framework, we build upon the recent developments in \matrix estimation" to address the inferential goals in three different applications. First, a robust variant of the popular \synthetic control" method used in observational studies to draw causal statistical inferences. Second, a score trajectory forecasting algorithm for the game of cricket using historical data. This leads to an unbiased target resetting algorithm for shortened cricket games which is an improvement upon the biased incumbent approach (Duckworth-Lewis-Stern). Third, an algorithm which leads to a consistent estimator for the time- and location-varying demand of products using censored observations in the context of retail. As a final contribution, the algorithms presented are implemented and packaged as a scalable open-source library for the imputation and forecasting of sequential data with applications beyond those presented in this work. 3 4 Acknowledgments It's a marathon, not a sprint. I recall a former mentor's valuable \career" advice. The last five years have felt like a bit of both. For whatever it was that I thought I would be doing during the course of this PhD, it was nothing like it. This has been a complex, trying and deeply challenging but ultimately a most rewarding journey. From rushing to submit seconds before a deadline to planning out years in advance, my way through this PhD experience has been anything but monotonous. Five years ago as a first year graduate student, I took refuge in the comforts of routine coursework and exams but also realized that I was not optimally prepared for the challenges of a rigorous program at MIT. I spent several days contemplating whether a PhD and I were meant for each other. Life was much easier, much simpler before MIT. But five years on, I couldn't be more grateful for the experience and would not change a thing. Not only have I begun to understand and appreciate the material that comprises the core of my research, I have come to know myself better than ever before. The rigors of academia have also necessitated profound soul-searching. From not having a clue about \research" or what I wanted to work on to letting myself meander aimlessly through life, five years on I am proud of the ideas I have explored and am closer to being the person I have always aspired to be. It is inconceivable that I would have persisted through the duration of the program had it not been for the constant and dedicated support of an army of people, far too many to name. I am especially grateful to my advisor, Prof Devavrat Shah, who has demonstrated that success in one's career can gracefully coexist with kindness of personality. Devavrat's patience and constant encouragement are the chief sources of my academic and research successes at MIT. His words of wisdom and advising style have left a deep impression on me{lessons I hope to emulate in my own career. At the half-way mark of my PhD, when my research was still in its infancy and I was struggling to find a breakthrough, the most unthinkable tragedy struck: my dear sister, Aliya, passed away due to an unexpected illness. That was the most difficult time in my life and I had to dig deep in my resolve to be there for the rest of my family while 5 6 we all struggled to come to terms with that unbearable loss. Aliya was the heart and soul of our family and life without her was unimaginable. Aliya's passing brought on weeks and months of introspection but I returned to MIT more determined and focused than ever. My \struggles" over the first couple of years paled in comparison to the experience of Aliya's passing and that provided the necessary motivation for me to finally turn a leaf with progress on my research. Like much else, I owe most of my research \accomplishments" to Aliya. Those who know me understand well how significant a role the game of cricket plays in my life. It is a great source of satisfaction for me that I was able to think through and propose solutions to two key problems related to cricket which are discussed in this thesis. Through these problems I experienced the pure euphoria of thinking about problems from scratch, break them in to smaller problems and make steady progress and eventually proposing novel solutions. The coming together of my research and passion (cricket) is the greatest gift for me. This is all the more special because I am not only a keen observer and student of the game but I also play the sport on a regular basis. I helped found the MIT Cricket Club which has been my lifeline at MIT. The club has provided me with some of the most memorable moments of the past few years, both exhilarating and the utterly disappointing, and for that I will remain forever indebted to the team members who are my family at MIT. I am indebted to all my friends and extended family for their kindness and faith in me. I am blessed to have close and devoted friends across the globe who I have met at various stages of my life: from my time in Rawalpindi, Islamabad, Lahore, Flekke, Princeton, Seattle, New York, New Jersey and Cambridge. Their friendship has sustained me at every step along my way and I cannot imagine my life without each and every one of them. Through the events of birth and marriage(s), I have been blessed with an extensive set of relatives, some of whom I am very close to. My in-laws, and my sister's in-laws deserve a special thanks for being among my most ardent supporters and who believe I can move mountains. Every trip home during this PhD has provided me with a much-needed doze of adrenaline and belief which is largely due to my closest friends and relatives. Even little Meesha, born a few weeks before my thesis defense, provided me with those much needed pick-me-ups that I needed to sustain high levels of productivity and concentration in that stressful period. My wife, Tabinda, is without doubt my most enthusiastic advocate. Without her presence, love, devotion and level-headedness, this body of work may not have seen the light of day for another few years, at least. And it is not just a matter of emotional support. Tabinda has also paid keen attention to the finer details of my research and teaching commitments and developed an acute understanding of the material. For any researcher, having a captive audience is priceless and Tabinda's attention and feedback have provided me with the greatest confidence in the merits of my work. Her devotion to my work has ensured I remain focused which has resulted in higher quality and greater 7 efficiency than at any other point in my academic career. From marital bliss to higher levels of productivity, it has been a remarkable start to our marriage and I have Tabinda to thank for it all. I am a believer in divine interventions. It can only be due to fortune that I was blessed with parents and sisters who have sacrificed everything for me. They have all selflessly vested themselves in my career and have experienced all the joys and disappointments as their own. Fareeha and Fizza, who mean the world to me, have not only been besides me through the thick and thin of life but ensured that my absence at home is more than compensated for. Without their personal sacrifices, this PhD would have been inconceivable. It has been my parents' dream for me to get my PhD. Academics themselves, they understood first hand what this journey entails and this journey would have been impossible without their unwavering support and encouragement. I wouldn't be anywhere near where I am today without their sacrifices and devotion.

Causal Inference, Cricket and Retail by Muhammad Jehangir Amjad B.S.E, Electrical Engineering, Princeton University (2010)

11TOIDC COL 21R1.QXD (Page 1)

The Cricketer Annual Report & Year Book 2003-2004 Contents

Triple Tie at Top

West Indies England Zimbabwe

West Indies England Zimbabwe

BAFSAL Layout 13/2

7 March 2002 Every Number 10 Batsman Dreams Of

Cricket Ground, Pudong 2002 CRICKET Articles

Practice Questions

Cricket Memorabilia Society Postal Auction Friday 9

16TOIDC COL 01R2.QXD (Page 1)

West Indies England Zimbabwe