
This thesis has been approved by

The Honors Tutorial College and the Department of Business Administration

______

Dr. William A. Young II, O’Bleness Associate Professor of Analytics & Information Systems, Thesis Advisor

______

Dr. Ehsan Ardjmand, Assistant Professor, Analytics & Information Systems, Committee Member

______

Dr. Raymond Frost, Director of Studies, Business Administration, Analytics & Information Systems

______


ALL-NBA TEAM VOTING PATTERNS: USING CLASSIFICATION MODELS TO IDENTIFY HOW AND WHY PLAYERS ARE NOMINATED

______

A Thesis

Presented to

The Honors Tutorial College

Ohio University

______

In Partial Fulfillment of the Requirements for Graduation from the Honors Tutorial College with the degree of

Bachelor of Business Administration

______

by

Graydon R. Levine

May 2019


Abstract

This research aims to connect the implications of on-court performance in the National Basketball Association (NBA) to All-NBA Team voting patterns. At the end of each regular season, a panel of writers and broadcasters votes on who they deem to be the 15 best players of that season, forming the All-NBA Teams. As both the ultimate signifier of high performance and the ultimate determinant of maximum contracts, a player’s selection to one of the All-NBA Teams can change both the course of his career and the long-term success of his team. Due to the outsize importance of nominations and the lack of available research in this field, this research’s main purpose is two-fold: it aims to find both the best classification model for voting patterns and, concurrently, the most sensitive attributes, in order to construct a framework for future players. Subjects in the initial data set consist of all players who started at least 82 regular-season games from 2010 to 2017, with the dependent variable being whether or not they were nominated to one of the three teams within the same timeframe. Due to the rarity of an All-NBA Team selection, this data set contains a dominant majority class. Twenty-one models will be constructed from it, and the methodology utilizes existing classification methods—bagging, boosting, random forests, logistic regressions, and classification trees—for model generation. After constructing these models on the original imbalanced data set, an oversampled set, and an undersampled set, the study scores all of the models against a control group and runs a sensitivity analysis on the best model. The results of the study found that the bagged classification tree run on the original imbalanced data set best predicted voting patterns, and that the assists per game a player registers during the regular season in which he either received his first nomination or posted his best PER is the most sensitive attribute.

Keywords: classification; multiple models; voting; ensembles; basketball


Table of Contents

1 Introduction
1.1 Motivation for the Study
1.2 Significance of the Research
1.3 Organization of the Report
2 Literature Review
2.1 Existing Literature in the NBA
2.1.1 Research on player decision-making and performance in the NBA
2.1.2 Research on team decision-making and performance in the NBA
2.2 Classification Techniques
2.2.1 Classification trees
2.2.2 Logistic regressions
2.3 Ensemble modeling
2.3.1 Aggregation of models and practical applications of methodology
3 Methodology
3.1 Data Set Formulation
3.2 Pre-Analysis PER Insights
3.3 Pre-Processing
3.4 Model Creation
4 Results
5 Conclusion
6 Reflection
6.1 Initial Project Formulation
6.1.1 Personal interests
6.1.2 Academic field
6.1.3 Professional pursuits
6.1.4 Initial project
6.2 Deviation, New Project Formulation
6.3 Learning Outcomes
6.3.1 Academic
6.3.2 Personal
7 Bibliography


Acknowledgements

Both the formulation of my thesis idea and its execution have been long, arduous processes that could not have happened without the help of many, many people.

First of all, I would like to express my sincere thanks and appreciation for my director of studies, Dr. Raymond Frost. Seeing as he is the man that granted me a position within the Honors Tutorial College in the first place, without him, none of this would be possible. Ever since my freshman year in 2015, Dr. Frost has continually encouraged me to pursue whatever weird class or project I desired with vigor. Moreover, his emphasis on fraternizing with fellow HTC Business Administration students has been instrumental in bringing me closer to my cohort; as the expression goes, you are the average of the people you spend the most time around, and Dr. Frost has facilitated my association with many, many incredible students. Finally, his more-than-evident sense of general care has provided me with the academic backbone I knew I needed the second I applied for college.

The inception of my thesis occurred during the spring semester of my sophomore year in Dr. Katherine Hartman’s research tutorial class. In her class, my cohort and I were tasked with learning different research methods each week by collecting reports incorporating various methodologies. Throughout the semester, we were to collect sources that pertained to one central idea. Mine happened to be basketball-related, and, at the end of the semester, I had compiled a sizeable bibliography with reports that can be found in this very paper. Thank you, Dr. Hartman, for giving me the foundation necessary for this report. Also, I am sorry for never joining the CRC.

Next, I would like to give a huge, huge thank-you to Dr. Norman O’Reilly. As my first thesis advisor, Dr. O’Reilly provided my fledgling idea with a solid analytical background. Working with him week-to-week was a pleasure, as it is not often that an undergraduate student gets to work with one of the best researchers in any field, and even less often when that researcher is patient, receptive, and kind. Even after moving back to Canada, Dr. O’Reilly took time out of his incredibly busy schedule to work with me—for that, I am very thankful. I am sure that if Dr. O’Reilly had never moved, I would have a different yet equally successful thesis fully under his guidance.

The road to completing my thesis was jagged and included one massive fork: switching advisors. After parting ways with Dr. O’Reilly, I chose Dr. William A. Young II to be my “anchor” of sorts. With a CV longer than my thesis and a breadth of wisdom to boot, Dr. Young was the perfect advisor to take my thesis home. Due to the circumstances of my thesis being delayed and my impending graduation, I forced a sped-up timeline of completion onto Dr. Young. Thankfully, he did not flinch. Through countless mental battles, countless needs for clarification, and countless emails and texts, Dr. Young was nothing but supportive and helpful to me. He even went the extra mile by bringing in Dr. Ehsan Ardjmand (too many esteemed colleagues to count!), yet another friendly face who provided new insights that were instrumental in this project’s completion.

Finally, I would like to thank the Honors Tutorial College as a whole. While my entire HTC experience has been guided by the idea of “wearing my honors lightly,” here, I want to put on an extra metaphorical coat. HTC has provided me with too many opportunities to count, including: linking me up with other amazing people in the honors dorms, letting me bypass generic education classes, exposing me to the best professors Ohio University has to offer on an intimate basis, giving me full tuition, and, yes, allowing me to conduct this very research. While I could have and should have been more directly involved with the college throughout my years in it, I cannot thank Cary Frith, Margie Huber, Kathy White, and Dean Webster enough. You all have given me the best four years I could have ever asked for.


1 Introduction

The National Basketball Association (NBA) is one of the four major U.S. sports leagues, along with the National Football League, the National Hockey League, and Major League Baseball. While the NBA has held this position for more than a half-century, as of 2019, it is more popular and profitable than ever. In the 2017-2018 season, the league generated $7.4 billion in revenue (Nath 2018). Reasons for the league’s surge in popularity are many, including: its non-stop nature, the increased exposure and vulnerability of its players, the alignment between the league and the current political climate, and more.

More than anything, though, the NBA is more popular than ever due to its aesthetically pleasing nature. As of 2018, all thirty teams have access to SportVu, a tracking system that analyzes the movement of players in real time and has opened up new ways to run offensive sets. This development, along with a new wave of marketable star players, has created an environment in which the league’s 30 teams play a brand of basketball that is fast, high-scoring, and beautiful (Kopf 2017).

The democratization of access to on-court metrics enables all 30 teams to construct competitive rosters, but salary stipulations make roster construction challenging to navigate, especially for small-market organizations. To address salary-related issues, the league’s executives periodically meet with the National Basketball Players Association (NBPA) to adjust salary rules as part of collective bargaining agreements (CBAs). Set by the 2016 collective bargaining agreement, the cap for each team (the determinant of how much it can spend per year on total salaries) in the 2017-2018 season rested at $99.093 million. This hike represented a $5 million increase from the season before, which is quite significant given previous, incremental changes. Conversely, the minimum amount teams were required to spend was 90 percent of the cap, or $89.18 million (Nath 2018). If teams decide to exceed the cap in any given year, they are forced to pay an additional “luxury tax.” Looking at recent data, this overspending strategy proves to be successful, as, from 2012 to 2017, the five teams that spent the most money year-over-year all made the NBA playoffs (HoopsHype 2018). With spending loosely correlated to team success, it is imperative for all teams to understand how to maximize efficiency when team building.

1.1 Motivation for the Study

Under the new CBA, measures such as performance and longevity are linked to larger contracts. Contracts for the following scenarios are all subject to higher pay than before (Aldridge 2016):

• Veteran minimums for players with over ten years of experience

• Maximum contracts for players with seven to nine years of experience

• Players with five or more years of playing experience

• Rookie contracts

• Mid-level exceptions

• The bi-annual exception for players with over ten years of experience

Possibly more impactful than any of the previous scenarios is the new “supermax.” Introduced in the 2016 CBA, the “supermax” permits a player “entering (or have just completed) his eighth or ninth season” who has achieved at least one of the following: making one of the three All-NBA Teams or being named either Defensive Player of the Year or MVP in the year prior, making one of the three All-NBA Teams or being named Defensive Player of the Year in two of the past three seasons, or being named MVP in one of the past three seasons. Additionally, the player looking to receive a “supermax” must still be on the team that drafted him, unless he was traded while still on his rookie deal. For all teams, the value of a “supermax” can be up to 35 percent of the team’s total salary cap (Adams 2017).

The ultimate measure of individual success in the NBA is the All-NBA Teams. Each year, the three teams have somewhat rigid positional requirements: each must be composed of two designated backcourt players (point guards and shooting guards), two designated frontcourt players (small forwards and power forwards), and one designated center. Each year, players are nominated by a:

126-member voting panel of writers and broadcasters throughout the United States and Canada [consisting] of national media members and members from each of the league’s 30 teams. . .The media [votes] for All-NBA First, Second and Third Teams by position with points awarded on a 5-3-1 basis. (NBA 2006)
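As a concrete illustration of the 5-3-1 basis quoted above, the short sketch below tallies hypothetical ballots. The player names and ballots are invented for illustration and are not actual voting data; only the point weights come from the NBA’s description.

```python
from collections import defaultdict

# Point values per the 5-3-1 basis: First Team = 5, Second = 3, Third = 1.
POINTS = {"First": 5, "Second": 3, "Third": 1}

def tally_votes(ballots):
    """Sum each player's weighted All-NBA points across all ballots."""
    totals = defaultdict(int)
    for ballot in ballots:
        for player, team in ballot:
            totals[player] += POINTS[team]
    return dict(totals)

# Two hypothetical ballots from a (much smaller) voting panel.
ballots = [
    [("Player A", "First"), ("Player B", "Second")],
    [("Player A", "First"), ("Player B", "Third")],
]
print(tally_votes(ballots))  # {'Player A': 10, 'Player B': 4}
```

In the real process, the players with the highest point totals at each position fill the First, Second, and Third Teams in order.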

The implications of these teams are three-fold. First and foremost, they most closely point to overall individual success, as they consider a player’s complete repertoire of talents. While many people believe an All-NBA selection to be equivalent to an All-Star designation, it is not, as All-Star selections are often simply popularity contests that over-emphasize fan voting and undersell true performance. For example, in many years, retiring legends are granted All-Star status in their waning seasons, even though they no longer contribute to successful on-court outcomes. Second, All-NBA Team nominations open pathways to newer and higher contract negotiations, shaking up not only a player’s personal finances, but the team’s. To illustrate how drastic such a shift can be, consider that, from 2013 to 2016, Golden State Warriors point guard Stephen Curry made $11 million per year. During this period, Curry received multiple All-NBA Team nominations; his salary in 2019 rests at $40 million per year (Spotrac 2019). Finally, selection to even one of the three teams is rare, creating a large disparity among players in any given year. In summation, if an organization drafts or trades for a player with at least one All-NBA Team selection, it is safe to say that it has acquired an asset that can shift the entire franchise, both on the court and in the finance department.

1.2 Significance of the Research

While the ramifications of All-NBA Team selections are well-documented and understood, research in the field still has not answered the questions of how and why players are selected to the team. That is the purpose of this research.

Theoretically, this research aims to uncover any linkages from on-court performance to All-NBA Team voting patterns by comparing multiple classification models against each other through a controlled data set. As previously stated, there is always a massive imbalance between active players with and without All-NBA Team selections to their name; the original data set is no different. First, three training data sets will be constructed—an imbalanced set, an undersampled set, and an oversampled set. Then, each data set will be classified by both ensemble and non-ensemble classification methods. After the models are constructed, each will be scored against the control group in order to find the most instructive model. Finally, a sensitivity analysis will be run on the model of best fit to see which attributes prove most sensitive in predicting All-NBA Team voting patterns. The chosen journal for this research’s anticipated submission is the Journal of Quantitative Analysis in Sports. As “an official journal of the American Statistical Association,” the Journal of Quantitative Analysis in Sports is the perfect place for research of this type. It contains a multitude of analytical reports that represent both interesting topical pieces in sport and complex methodologies that align with those found in this report (Journal of Quantitative Analysis in Sports 2019).

If done correctly, this analysis will grant future player-personnel managers a tool that optimizes roster construction and minimizes harmful, long-term decision-making. Moreover, it will be one small step toward incorporating and encouraging more forward-thinking research in NBA circles. As one of the best-positioned leagues going forward, the inclusion of future studies along these lines will give the NBA’s on-court product one more competitive advantage to lean on.

1.3 Organization of the Report

This study will be organized in the traditional ILMRC format. Its literature review will discuss both existing quantitative research done on the NBA and the types of methodologies found in this report. Its methodology section will describe the pre-processing, processing, and post-processing techniques used in the report. Its results section will then lay out the results of each of the models in detail. Finally, its conclusion will both summarize the report as a whole and highlight the limitations and areas of improvement associated with the research. After the conclusion, a reflection portion will detail my thesis journey from start to finish.

2 Literature Review

In 1980, Bill James deemed the emerging advanced analysis of baseball “Sabermetrics,” effectively beginning the era of sports analytics (Birnbaum 2019). For years, Sabermetrics grew within the baseball community. However, it was not until the new millennium that basketball started accepting and utilizing analytical tools similar to Sabermetrics. Since then, multitudes of quantitative studies have been conducted on the NBA. While the vast majority of the research focuses on off-court matters, a still-emerging sector has tried to describe on-court issues. This literature review will first describe the existing research done on the NBA both chronologically and by subject type. Then, it will explain the various types of classification models used in this report.

2.1 Existing Literature in the NBA

While mostly descriptive, a multitude of quantitative studies surrounding on-court performance factors in the NBA have been conducted since the turn of the millennium. For the sake of the report, this research will be broken into two camps: research surrounding individual performance and research surrounding team performance.

2.1.1 Research on Player Decision-Making and Performance in the NBA

In 2010, both the availability of real-time player-tracking data and NBA research were limited. A year before the introduction of John Hollinger’s Player Efficiency Rating (PER), researchers and teams had few holistic measures to work with (Hollinger 2011). In March of that year, researcher Joseph Sill conducted a study questioning the validity of an already-introduced, yet little-tested, performance measurement—Adjusted Plus-Minus (APM), a statistic that measures the difference between points scored for and against a given team with any given player on the floor. Through comparing various plus-minus splits from 2007 to 2009 while setting parameters based on years and regression, Sill found that APM is accurate, provided it is regularized by ridge regression and parameter tuning (Sill 2010).

In 2011, researchers Matthew Goldman and Justin Rao aimed to expand the young field of NBA descriptive analytics by conducting a research report describing how NBA players select which shots to take. In their report, the researchers broke the idea of a shot into two segments: dynamic efficiency, which “requires that marginal shot value exceeds the continuation value of the possession,” and allocative efficiency, which “is the additional requirement that at that ‘moment’, each player in the line-up has equal marginal efficacy” (Goldman and Rao 2011). Using player-tracked shot data from the 2006-2010 NBA seasons, the researchers found that NBA players, as a general rule, select shots at optimal volumes and times (Goldman and Rao 2011).

In 2012, Goldman and Rao continued studying how and why NBA players make decisions when they conducted a longitudinal study of free-throw efficiency and effectiveness in the NBA. Using on-court statistics from 2005 to 2010, the researchers found that, in pressure-filled free-throw situations, the home team performed worse at a statistically significant level while the away team experienced no significant changes (Goldman and Rao 2012).

By the end of that year, the NBA’s increased awareness of sports analytics was apparent, as more research was coming out more frequently. In late 2012, researcher Kirk Goldsberry conducted research seeking new ways to quantify and measure NBA players’ shooting abilities through a case study that employed CourtVision—at the time, recently developed player-tracking technology—in the hopes of concretely finding the NBA’s best shooter. Via a case study that “[used] game data sets for every NBA game played between 2006 and 2011. . .[that included] player name, shot location, and shot outcome for over 700,000 attempts,” Goldsberry created two new metrics: Spread and Range, with the former measuring how much of the shooting area a player utilizes and the latter finding the “percentage of the scoring area in which a player averages more than 1 PPA” (Goldsberry 2012).

In 2013, Goldsberry, along with researcher Eric Weiss, published another report, this time with the goal of measuring defensive performance better via two case studies: “The Basket Proximity Condition” and “The Shot Proximity Condition.” Data for the studies came from “player tracking data provided by STATS (SportVu). . .for over 75,000 NBA shots during the 2011-2012 and 2012-2013 seasons. . .in the presence of 52 NBA interior defenders who faced at least 500 shot attempts during the study period” (Goldsberry and Weiss 2013). Results for “The Basket Proximity Condition” study indicated that its best player was Roy Hibbert, as, while he was within five feet of the basket, opposing players had the lowest field-goal percentage. Results for “The Shot Proximity Condition” study indicated that its best player was Dwight Howard, as, throughout the duration of his time within five feet of the basket, opposing players shot the lowest percentage of shots (Goldsberry and Weiss 2013).

With the turn of the new year to 2014, the NBA reached an inflection point with analytics: while previous studies focused on describing simplistic measures with fairly simplistic data, new studies broke new ground. In April of 2014, sports analytics researchers Christopher Barnes and Eric Uhlmann aimed to discover both how NBA players make decisions when faced with high-pressure situations and how their decisions correlate to their salaries. In their report, the researchers cited the internal conflict NBA players face when choosing between individual glory and team performance, hypothesizing that the former option is pursued more often than not. After collecting both on-court data and salaries for all NBA teams from 2004 to 2012, the researchers employed a “multi-level analysis using hierarchical linear modeling” to check their hypothesis. Barnes and Uhlmann found that the ratio of assists per made field goal decreased significantly in the playoffs, from 0.59 to 0.54. Moreover, they found a positive correlation between team cooperation and player performance. Finally, they found that field goals positively correlated with salary increases while assists did not (Barnes and Uhlmann 2016).

In 2015, multiple analytics researchers conducted research that aimed to empirically categorize NBA players by both star quality and role. For data, the researchers looked at 1,230 games, with 548 players, from the 2013 season. After dividing key performance metrics by minutes per game, the researchers conducted descriptive discriminant analysis. They found elbow touches, defensive rebounds, pull-up points, close points, close touches, and defensive speed to be the key separating statistics between regular and All-Star-caliber players. Moreover, they found that “total distance covered in offense and defense,” as well as variations in passing, were the best identifiers of specific NBA roles (Balcunias, et al. 2015).

2.1.2 Research on Team Decision-Making and Performance in the NBA

Running parallel to the development of analysis on specific players in the NBA was the development of case studies examining team dynamics. For example, in a 2010 study, researchers Chad Cross and Masaru Teramoto hypothesized that how teams win games in the NBA differs between the regular season and the playoffs. The two took data from 1999 to 2009 (Cross and Teramoto 2010):

Specifically, we examined the contributions of overall efficiency. . .along with the Four Factors (effective field goal percentage, turnover percentage, [offensive] rebound percentage, and [free throw] rate) to winning games in the regular season and the playoffs, using a multiple linear regression and a logistic regression analysis.

The researchers found that, in the regular season, offensive and defensive efficiency were essential to winning, with the importance of defensive efficiency increasing in the playoffs. Moreover, they found that shooting efficiency is the most important factor in both settings (Cross and Teramoto 2010).

The amount of team-related research was thin over the next half-decade. However, in 2015, a study conducted by researchers Brian Skinner and Stephen Guy aimed to create a method that could instantly predict the success of random NBA line-ups based on individual talent and team interactions. Using “hand-recorded data from the 2011 playoff series between the Oklahoma City Thunder and the [opposing team],” they constructed a network-modeled view of NBA offenses (Guy and Skinner 2015). For nodes, they separated the court into unique, distinct regions, so as to identify specific areas in which offenses can move. Next, to describe player movement within and throughout the nodes, the researchers combined the ideas of NBA players acting as either “random agents” or “coached agents.” For modeling, they split the network into two parts: the areas above and below the free-throw line. Finally, the researchers introduced an inference algorithm that accounted for coaching intentions and original play design, so as to explain NBA networks contextually by accounting for the success rate of attempted plays, the skill sets of the players involved, and play designs. Results of the study found that players’ skill sets translated identically to any lineup, regardless of role (Guy and Skinner 2015).

2.2 Classification Techniques

While a fairly large collection of classification techniques exists, the subsequent sections discuss only the ones used in this report’s methodology.

2.2.1 Classification Trees

Classifying a data set by creating trees is one of the most popular classification techniques. When predicting a data set, classification trees compartmentalize the overall data set of interest into subsets, creating pathways with start and end nodes. Each node has branches that represent its attribute’s values. Each pathway is built by “partitioning the data into subsets that contain instances with similar value,” as determined by the homogeneity of the samples’ standard deviations. This is done by splitting up the data set by attributes, finding the reduction in standard deviation achieved by splitting on each attribute, and choosing the attribute with the largest reduction—the decision node. The process repeats until each pathway reaches its end, defined as the point when the coefficient of deviation of each branch falls below a previously defined threshold (Sayad 2019).
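The thesis builds its trees in Frontline’s Excel add-in; purely as an illustrative sketch, the same partitioning idea can be reproduced with scikit-learn’s DecisionTreeClassifier on synthetic data. The feature names (PTS, AST, TRB) and the rule generating the labels are invented stand-ins for per-game statistics, not values from the study.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-game stats; the label loosely marks an
# "All-NBA-like" season when scoring and playmaking are both high.
X = rng.normal(loc=[15.0, 4.0, 6.0], scale=[5.0, 2.0, 2.5], size=(300, 3))
y = ((X[:, 0] > 20) & (X[:, 1] > 5)).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Print the learned splits (the decision nodes) in readable form.
print(export_text(tree, feature_names=["PTS", "AST", "TRB"]))
```

The printed output shows the successive partitions the tree chose, mirroring the decision-node selection described above.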

2.2.2 Logistic Regressions

Logistic regression is the most apt means of classification for binary data sets. Its purpose is to describe data by explaining “the relationship between one dependent binary variable and one or more nominal, ordinal, interval or ratio-level independent variables” (Complete Dissertation 2019). To run it correctly, the data set should be cleansed of outliers via a multitude of methods, including standardization, normalization, etc. When its algorithm runs, a logistic regression models the “log odds” of the event being measured as a linear combination of the independent variables (Complete Dissertation 2019).

2.3 Ensemble Modeling

Ideally, data sets used for analysis have balanced majority and minority classes—in other words, the dependent variable is evenly distributed. However, in most cases, this is impossible, as real-world scenarios can be messy. When dealing with imbalanced data sets, simple classification models often fail to accurately create a framework due to the data’s skew. In these cases, combining multiple, weaker models can help fit the data set more accurately. This is the core principle of ensemble modeling. There are three main types of ensemble methods that can be applied to complement commonly used classification techniques (Nagpal 2017):

1. Bagging – creates random samples of the training data, classifies them, and averages out the classifiers to reduce variance

2. Boosting – sequentially classifies the data set, adding weight to missed observations to keep correcting for prior errors

a. Gradient Boosting combines gradient descent with boosting

3. Random Forest – a variation of bagging that also incorporates random selection of features

If desired, two tactics are available when trying to balance majority and minority samples in a data set: oversampling and undersampling. The former entails the artificial replication of minority-class samples, while the latter entails randomly clustering groups of samples together in order to find and subsequently erase clusters with the highest disparity of majority-to-minority samples (Young, et al. 2015). In this report specifically, the undersampled data set will be generated in a fashion similar to that of researchers Davis and Rahman (2013). In a variation on traditional K-means clustering, the researchers divided their data set (in a study on cardiovascular research) into sets composed purely of majority and minority samples. Then, they clustered the majority samples together and combined these clusters with the minority sample in order to reduce the gap in ratios, creating a combined data set of K clusters (Davis and Rahman 2013).
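As a toy illustration of the two balancing tactics, the sketch below applies simple random oversampling and undersampling (not the cluster-based variant of Davis and Rahman that the thesis adapts). The 321/49 split mirrors the study’s sample sizes; the feature values are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(42)

def oversample(X, y):
    """Replicate random minority-class rows until classes are balanced."""
    minority, majority = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    idx = np.concatenate([majority, minority, extra])
    return X[idx], y[idx]

def undersample(X, y):
    """Keep a random majority-class subset the size of the minority class."""
    minority = np.flatnonzero(y == 1)
    keep = rng.choice(np.flatnonzero(y == 0), size=len(minority), replace=False)
    idx = np.concatenate([keep, minority])
    return X[idx], y[idx]

# 321 players, 49 of them All-NBA selections, as in the study's sample.
X = rng.normal(size=(321, 4))
y = np.zeros(321, dtype=int)
y[:49] = 1

X_over, y_over = oversample(X, y)
X_under, y_under = undersample(X, y)
print(y_over.mean(), y_under.mean())  # both 0.5 once balanced
```

Oversampling grows the data set (here to 544 rows) while undersampling shrinks it (to 98 rows); both end with an even class split.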

2.3.1 Aggregation of Models and Practical Applications of Methodology

Many real-world scenarios with binomial data sets are improved not only by using the aforementioned classification techniques and models, but by the construction of multiple models. Often, one model fits a specific data set better than another, which can only be discovered through the trial and error of building several. For example, in 2014, multiple sports researchers conducted a study that tried to predict the outcomes of each individual NCAA men’s basketball tournament game. After using a combination of logistic regressions, neural networks, ensemble models, gradient boosts, and log-losses, the researchers found that the logistic models, gradient-boosted models, and neural networks worked best (Borrn, et al. 2015).
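The trial-and-error comparison described above can be sketched as follows, again with scikit-learn and synthetic data as stand-ins (the thesis itself builds its models in Excel). One candidate per technique named in the methodology is scored by cross-validation.

```python
import numpy as np
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)  # synthetic nonlinear target

# One candidate per classification technique used in this report.
models = {
    "classification tree": DecisionTreeClassifier(max_depth=4),
    "logistic regression": LogisticRegression(),
    "bagging": BaggingClassifier(n_estimators=50),
    "boosting": GradientBoostingClassifier(),
    "random forest": RandomForestClassifier(n_estimators=100),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {acc:.3f}")
```

Whichever model scores best on held-out folds is the analogue of the “model of best fit” the thesis later probes with a sensitivity analysis.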

3 Methodology

In this section, a full description of the data set’s design process, its pre-processing, and the models developed from it is provided. All pre-processing, processing, and post-processing was conducted in Microsoft Excel using Frontline Solvers’ Analytic Data Mining for Excel extension pack (Frontline Solvers 2019).

3.1 Data Set Formulation

Play-style in the NBA is incredibly dynamic and cyclical. The league is apt to switch from one mode of play to another from season to season, making it imperative to define what kind of NBA needs measuring. To address this, data used in this study begin with the 2010-2011 NBA season and run through the 2017-2018 season. The 2010-2011 season was chosen as the start of the data’s time period because it was the first year of the LeBron James-Dwyane Wade-Chris Bosh trio on the Miami Heat. In forming the “super” team, the three changed the way the league played, forcing other teams to adapt in two ways. Offensively, that Heat roster, composed of three elite players with a multitude of role players filling out the edges, emphasized efficient shot selection by aiming for close shots and three-pointers. Defensively, the team employed a hyper-aggressive scheme that required all five players on the court to be both fast and able to switch defensive assignments. In response to the Heat, the league’s 29 other teams began fundamentally changing how they played to create stylistic matches, effectively creating the current “pace-and-space” era of basketball analyzed in this report.

Subjects of the report include all players who started at least 82 games between the 2010 and 2017 NBA seasons. This specific parameter was chosen for three reasons (NBA 2019):

1. No rookie has been named to one of the three All-NBA Teams since 2010.
2. 82 games represent a full season's worth of starts, ensuring that every player in the sample had the chance to earn a nomination after playing at the highest possible level of involvement: starting the equivalent of a full 82-game season.
3. Rarely, if ever, have bench players been nominated to the All-NBA Team.

With these limitations in place, 321 players qualified for the study, including the 49 players who made one of the three teams within the timeframe. The binary dependent variable for the study is, of course, whether the player being examined made one of the three All-NBA Teams from 2010 to 2017, encoded as either a "1" or a "0," with a "1" indicating a nomination. The ratio of majority to minority classes in the sample can be found below in Table 1.

Table 1: Sample Class Distribution

Subject Type    Count    Percent
1               49       15.26
0               272      84.74
Total           321      100.00


For the 272 players that did not make the All-NBA Team during the allotted time span, ESPN columnist John Hollinger's Player Efficiency Rating (PER) statistic was used to define their best season, in other words, the season in which the player was most likely to make one of the three teams. According to Hollinger, "The PER sums up all a player's positive accomplishments, subtracts the negative accomplishments, and returns a per-minute rating of a player's performance" (Greenberg 2017). While it is not a perfect encapsulation of a player's talent (it over-emphasizes statistics such as rebounds), it was necessary for the sake of comparison. The league-average PER is 15.00.
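The best-season selection step reduces to a maximum over each player's seasons. A minimal sketch, with invented player rows and PER values (the study pulled actual PER values from Basketball-Reference):

```python
# Hypothetical sketch: for each non-nominated player, pick the season with the
# highest PER as their "best season". Players and values here are invented.
seasons = [
    {"player": "A", "season": "2012-13", "PER": 16.4},
    {"player": "A", "season": "2013-14", "PER": 19.1},
    {"player": "B", "season": "2011-12", "PER": 14.8},
    {"player": "B", "season": "2012-13", "PER": 13.2},
]

best = {}
for row in seasons:
    cur = best.get(row["player"])
    if cur is None or row["PER"] > cur["PER"]:
        best[row["player"]] = row

print(best["A"]["season"])  # -> 2013-14, the higher-PER season for player A
```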

All performance metrics for the study were taken from Basketball-Reference's website. While the site contains thousands of personal performance metrics, specific sets were chosen for their perceived voting impact. They can be broken into eight distinct per-game camps, organized by time duration, with 184 variables in total. A full description of the inputs can be found below in Table 2.

Table 2: Input Variables

Type           Time window                                               Count
Traditional    Regular season of the first nomination / best PER season   26
Traditional    Career regular-season averages through that season         24
Traditional    Career playoff averages through the prior postseason       24
Traditional    Playoffs in the postseason prior to that season            26
Advanced       Regular season of the first nomination / best PER season   21
Advanced       Career regular-season averages through that season         21
Advanced       Career playoff averages through the prior postseason       21
Advanced       Playoffs in the postseason prior to that season            21
Total                                                                    184

All windows are defined relative to the season in which the given player either earned their first All-NBA nomination or registered their best season according to their PER within the period examined.

Data for each player was organized in this fashion to account for several underlying beliefs. Career averages for both regular seasons and postseasons were chosen to account for possible longevity biases: the more exposure a player has to voters, the more, or less, likely they may be to be chosen for one of the three All-NBA Teams. Averages during the season in which the player either made the All-NBA Team for the first time or registered their best PER within the study's time period were taken to account for recency bias, as were averages for the playoffs during the season prior. Finally, as previously stated, while many other player-tracking metrics exist (such as shooting splits, play-by-play logs, etc.), the most standard traditional and advanced metrics were chosen for both their truly summative qualities and ease of comparison. All variables were indexed against the maximum value registered for each metric across the respective players, with a 1.00 representing the top score on that metric.
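One reading of this indexing step can be sketched as follows. The sketch assumes each metric (column) is divided by the maximum value observed for that metric, so that the top score becomes 1.00; the metric names and values are invented.

```python
# Sketch of max-indexing, assuming column-wise division by each metric's
# maximum observed value (invented example values).
data = {
    "ppg": [28.0, 14.0, 7.0],   # points per game for three stand-in players
    "apg": [10.0,  5.0, 2.5],   # assists per game
}

indexed = {
    metric: [v / max(values) for v in values]
    for metric, values in data.items()
}
print(indexed["ppg"])  # -> [1.0, 0.5, 0.25]
```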

3.2 Pre-Analysis Insights

Table 3: All Other Players PER Measures

                 PER/1st Nom/BST    PER/PBFR    PER/G       PER/1st Nom/BST
                 > PER/PBFR         > PER/PC    > PER/PC    > PER/G
No               35                 162         1           177
Yes              237                110         271         95
Total            272                272         272         272

Table 4: All-NBA Team Players Only PER Measures

                 PER/1st Nom/BST    PER/PBFR    PER/G       PER/1st Nom/BST
                 > PER/PBFR         > PER/PC    > PER/PC    > PER/G
No               0                  18          10          27
Yes              49                 31          39          22
Total            49                 49          49          49

Table 3 and Table 4 (see above) illustrate various dynamics associated with the sample subjects' PERs. In both tables, it is apparent that the PERs during the players' best seasons were almost universally higher than their PERs during the playoffs before. This is partially due to the fact that many players in the sample did not even make the playoffs the year before their best PER season, but it also speaks to the extra challenges the playoffs present. Not only are most of the subjects' single-season record PERs better than their PERs during the previous year's playoffs, but their career-average regular season PERs are also higher than their career-average playoff PERs, most likely for the same two reasons.

The final similarity the two tables share is the uniformity of the subjects' best single-season PERs being lower than their career-average PERs, indicative of the large number of older subjects who peaked before the timespan of the study. Finally, and perhaps most interestingly, there is one disparity between the two tables: while the All-NBA Team group has a majority of players with higher pre-best-season playoff PERs than career-average playoff PERs, the rest of the subjects show the opposite, possibly indicating the outsized impact the previous season's playoffs have on a given year's voting patterns.

Table 5: Descriptive Statistics of All Players

                      Max     Min      Mean    Median   Mode    Std. Dev.
Max                   1.00    0.11     0.63    0.68     1.00    0.37
Mean                  0.97    -0.29    0.31    0.23     0.15    0.28
Min                   0.88    -1.80    0.14    0.00     0.00    0.22
Standard Deviation    0.06    0.26     0.09    0.16     0.31    0.03

Above in Table 5 are multiple statistics that describe the parameters of the original data set by player (i.e., column one, row one describes the average maximum value of each player's best attribute across all input variables). The descriptors in the table bring two particularly interesting things to light. First, the averages of all mean and median values are within 0.08 of each other, which, in a data set with a total range of 2.80 between absolute minimum and maximum values and an average mode of 0.15, indicates a fairly normally distributed data set. Second, the average mode of 0.15 itself points to the overall tendency of players to register lower attribute scores across all inputs relative to the maximum attribute value; this makes sense, as the majority class is composed of players with lower skill levels.

3.3 Pre-Processing

One of the main theoretical goals of this report is to test the validity of multiple classification models against each other in order to find which one best fits the data. Therefore, a control group was created to standardize the results of each model.

Before constructing models, the original data set of 321 players was first randomly partitioned. Each of the 184 on-court variables was used as an input for the partition, with the dependent variable, whether the subject made the All-NBA Team between 2010 and 2017, serving as the output. The data was randomly split, with 80 percent (255 records) used for training and 20 percent (66 records) held out as a control group for scoring after construction of the models.
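The partition step can be sketched with scikit-learn on stand-in data; the study itself performed this step in Frontline Solvers' Excel add-in, and the random labels below are invented placeholders for the All-NBA indicator.

```python
# Sketch of the 80/20 random partition on stand-in data (321 players x 184
# inputs); labels are random placeholders, not real All-NBA outcomes.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(321, 184))             # 321 players x 184 on-court inputs
y = (rng.random(321) < 0.15).astype(int)    # ~15% minority class, as in Table 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0)
print(len(X_train), len(X_test))
```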


3.4 Model Creation

Below in Figure 1 is the workflow process breakdown for this research's methodology:

[Figure 1 is a flow diagram: the original data set, a balanced oversampled set, and a balanced undersampled set each receive an 80 percent training partition (with a 20 percent test set held out from the original data). Each partition feeds ensemble methods (bagging and boosting, each producing a scored classification tree and a scored stepwise logistic regression, plus a random forest producing a scored classification tree) and non-ensemble methods (a scored classification tree and a scored stepwise logistic regression). The oversampled and undersampled sets follow the same path as the original 80 percent partition.]

Figure 1: Methodology Workflow Process Breakdown

For the sake of comparison, model construction was split into three camps: working with the partitioned imbalanced training set and with its oversampled and undersampled offshoots. The oversampled data set was generated by random seeding and produced a training set with a 50-50 majority-minority split, with the total number of records coming out to 166. The undersampled data set was created manually. First, a cutoff of 40 percent was determined to be sufficient for the proportion of majority records in the undersampled data set. From there, the K-Means clustering method discussed in the literature review was used to create a training set with a 40-60 split between the majority and minority samples. The total number of records in this set came out to 89.
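The two resampling ideas can be sketched on a toy labeled set. The sketch replaces the study's cluster-based selection of majority records with simple random choice, so it illustrates the class-balance arithmetic only:

```python
# Toy sketch of oversampling to a 50-50 split and undersampling toward a
# roughly 40-60 majority-minority split (random choice stands in for the
# cluster-based selection used in the study).
import random

random.seed(0)
majority = [("maj", 0)] * 20   # stand-in majority-class records
minority = [("min", 1)] * 4    # stand-in minority-class records

# Oversample: draw minority records with replacement until classes are even.
oversampled = majority + random.choices(minority, k=len(majority))

# Undersample: keep a random majority subset sized for a ~40:60 ratio
# against the untouched minority class.
target_majority = round(0.4 / 0.6 * len(minority))
undersampled = random.sample(majority, target_majority) + minority
print(len(oversampled), len(undersampled))
```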

For each data set, classification models were then split again according to whether they were created using ensemble or non-ensemble techniques. The non-ensemble models constructed for each were simple logistic regressions and classification trees. For the ensemble models, a random forest, a boosted model, and a bagged model were constructed for each of the three data sets. For each boosted and bagged model, the classified data sets were again run through logistic regressions and classification trees. However, for both the undersampled and oversampled data sets, additional work had to be done before the logistic regressions could be applied: because of the landscape nature of the two sets (far more attributes than records), the attribute count had to be reduced before Frontline Solvers' software could run. To reduce each model's set of attributes, attributes were eliminated in order of lowest variance until the regression could be applied.

Each ensemble and non-ensemble model constructed utilized adjusted normalization to standardize the data with a 0.01 correction. For all models, the Success Probability was set to 0.5 and the Prior Probability Calculation was set empirically.

For each data set's logistic regression models, 50 iterations were run. Stepwise Selection was the method of choice, with the F-Statistic (In) and F-Statistic (Out) thresholds set to 3.84 and 2.71, respectively. For each data set's classification tree, the maximum numbers of levels, splits, and nodes were set to 10, 50, and 20, respectively. For scoring, a fully-grown tree was created that displayed up to seven levels in addition to feature importance. For all bagged and boosted models, 25 weak learners were used for processing. Finally, all boosted models were generated according to the M1_Freund method (Freund and Schapire 1996).
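The tree limits above can be approximated in scikit-learn terms as a rough sketch: `max_depth` caps the levels and `max_leaf_nodes` caps the terminal nodes, while the 50-split cap has no direct scikit-learn equivalent and is omitted here.

```python
# Approximate sketch of the study's tree limits (10 levels, 20 nodes) using
# scikit-learn settings on synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=255, n_features=30, random_state=1)

tree = DecisionTreeClassifier(max_depth=10, max_leaf_nodes=20, random_state=1)
tree.fit(X, y)
print(tree.get_depth(), tree.get_n_leaves())  # both within the caps
```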

4 Results

The results of all models were scored against the holdout data, in other words, the random 20 percent set aside from the original data set. The charts detailing each model's kappa statistic, accuracy rate, true positive rate, true negative rate, and precision rate can be found below in Figures 2-6.
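The five scores in Figures 2-6 can all be computed from a model's confusion matrix. A short sketch with an invented confusion matrix (the counts are illustrative, not the study's results):

```python
# The five reported scores, computed from an invented confusion matrix.
tp, fp, fn, tn = 12, 0, 3, 51   # invented counts for illustration

n = tp + fp + fn + tn
accuracy  = (tp + tn) / n
tpr       = tp / (tp + fn)      # true positive rate (sensitivity/recall)
tnr       = tn / (tn + fp)      # true negative rate (specificity)
precision = tp / (tp + fp)      # share of predicted positives that are real

# Cohen's kappa: observed agreement corrected for chance agreement.
p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
kappa = (accuracy - p_exp) / (1 - p_exp)
print(round(accuracy, 3), round(kappa, 3))
```

Because kappa discounts agreement expected by chance, it is a more honest score than raw accuracy on an imbalanced set like this one, where always predicting the majority class already yields high accuracy.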

[Figure 2 is a bar chart of the kappa statistic for each of the 21 models; values range from 4 percent to 83 percent.]

Figure 2: Kappa Statistic by Model

[Figure 3 is a bar chart of accuracy rate by model; values range from 55 percent to 95 percent.]

Figure 3: Accuracy Rate by Model

[Figure 4 is a bar chart of true positive rate by model; values range from 42 percent to 92 percent.]

Figure 4: True Positive Rate by Model

[Figure 5 is a bar chart of true negative rate by model; values range from 52 percent to 100 percent.]

Figure 5: True Negative Rate by Model

[Figure 6 is a bar chart of precision rate by model; values range from 20 percent to 100 percent. The models charted include the imbalanced, undersampled, and oversampled variants of the stepwise logistic regression, bagged classification tree, bagged logistic regression, random forest classification tree, boosted classification tree, boosted logistic regression, and standalone classification tree.]

Figure 6: Precision Rate by Model

After looking at each model's measurements across the five statistics, it appears that the bagged classification tree run on the imbalanced training set models the scenario best. It has a 95.5 percent accuracy rate and a 4.5 percent error rate, the highest and lowest respective rates of all 21 models. Moreover, its true negative rate was a perfect 100 percent, equivalent to a false positive rate of zero, meaning that it did not falsely predict any All-NBA Team nominations. Perfection in this category is especially pertinent in practice, as the model would not lead teams to falsely bet on players wrongly believed to have promising careers. Finally, its precision rate of 100 percent indicates that every player the model flagged as a nominee was in fact nominated to the All-NBA Team.

When examining the 21 models across all five rates, a few trends are apparent. First, almost all of the plain logistic regressions, across all three data sets, modeled the data poorly. This is especially evident in the oversampled and undersampled models, as their logistic regressions are the two worst models generated. When comparing the three groups of seven models by the data set they were run on, the models run on the original imbalanced data set fit best; this may be due to the inaccuracy inherent in running models on the small number of samples left in the undersampled and oversampled data sets. Finally, it is evident that, no matter the data set, classification trees garnered the best results.

After running a sensitivity analysis on the bagged classification tree model, it appears that the most sensitive attribute, with a standard deviation of 0.25, is the assists per game a player registers during the regular season in which they either received their first nomination or registered their best PER (hereafter, their best season). The next most sensitive attribute is the player's free throw attempts per game during that best season. Rounding out the top five are the total Win Shares a player earns during their best season, the three-point field goals the player makes per game during their best season, and the offensive rebounds the player grabs per game during their best season.
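One common way to run this kind of sensitivity analysis is permutation importance: shuffle one input at a time and measure how much the model's score degrades. The Excel add-in used in the study may compute sensitivity differently, so the sketch below is illustrative only, on synthetic data:

```python
# Illustrative permutation-based sensitivity check on a bagged tree model,
# using synthetic stand-in data (not the study's actual procedure or data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=255, n_features=10, n_informative=4,
                           random_state=0)
model = BaggingClassifier(n_estimators=25, random_state=0).fit(X, y)

# Shuffle each feature 5 times; larger mean accuracy drop = more sensitive.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
most_sensitive = int(np.argmax(result.importances_mean))
print("most sensitive input index:", most_sensitive)
```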

The distribution of the aforementioned attributes’ standard deviations and sensitivities, as well as the highlighted sensitivity of the most sensitive attribute, can be found below in Figure 7, Figure 8, and Figure 9.

[Figure 7 is a bar chart of the standard deviation of each attribute in the bagged classification tree, with values ranging from 0.00 to roughly 0.25.]

Figure 7: Imbalanced Data Set – Standard Deviation of Attributes of the Bagged Classification Tree

[Figure 8 is a bar chart of sensitivity (0 to 50 percent) for each attribute of the bagged classification tree, led by AST/1st Nom/BS, FTA/1st Nom/BS, WS/1st Nom/BST, 3P/1st Nom/BS, and TRB/1st Nom/BS, and including attributes such as ORB/PLBFR, 2P/1st Nom/BS, DRB/1st Nom/BS, VORP/1st Nom/BST, TOV/1st Nom/BS, FG/1st Nom/BS, PER/1st Nom/BST, OWS/1st Nom/BST, PTS/1st Nom/BS, and others.]

Figure 8: Imbalanced Data Set – Bagged Classification Tree Sensitivity Analysis by Attribute

[Figure 9 charts the sensitivity of AST/1st Nom/BS on a 0-100 percent scale: its minimum and mean sensitivities are 0 percent and its maximum is 44 percent.]

Figure 9: Sensitivity of AST/1st Nom/BS

5 Conclusion

The NBA is an ever-changing, ever-growing league. Since the turn of the decade, both its popularity and the breadth of quantitative research surrounding it have exploded, making sound decision-making within the league more important than ever. While available descriptive research is plentiful, as of 2019, the amount of predictive research conducted is minuscule.

This research was conducted in the hopes of both exploiting this lack of research and better informing NBA front offices about how to build successful rosters. The main practical purpose of the study was to discover which on-court attributes were most likely to predict a player being nominated to one of the three All-NBA Teams, a nomination that carries both on-court impact and off-court salary implications for all players. Theoretically, this research was conducted for the sake of finding the best existing classification model by comparing the validity of multiple models against one another, in order to find the best framework for predicting future All-NBA Team nominations.

For comparison, 21 models were created: seven were run on the original imbalanced data set, seven on an oversampled set, and seven on an undersampled set. All models were then scored against a random control group that encapsulated 66 records from the initial data set. In short, the best classification model for this specific scenario was found to be the bagged classification tree run on the imbalanced training data. This makes sense, as a multi-layered model with multiple classifiers would be the most instructive on an imbalanced data set. Moreover, the most sensitive attributes (listed above) make sense.

The most sensitive attribute, assists per game during the regular season in which the player either received their first nomination or registered their best PER, could make a pronounced impact on All-NBA Team voters, as players with high per-game assist numbers may be viewed as selfless and worthy of a vote. Furthermore, per-game free throw attempts and per-game total rebounds both highlight a player's willingness to hustle and to look for contact, traits traditionally coded as masculine that may be admirable in the eyes of voters. The sensitivity of Win Shares is also interesting and explainable, as the statistic encapsulates the lump-sum contributions of a player towards winning, both perceivable and not. Finally, the sensitivity of per-game three-point field goals makes sense given the way the NBA plays in 2019: if a player makes three-pointers, he is more valuable and thus more likely to earn a nomination.

Though this research is a step in the right direction, it comes with many parameters and is far from perfect. For one, the design of the original data set and its concurrent models was more landscape than portrait, in that it had many attributes yet few subjects. This design proved difficult during analysis, as a portrait structure (many samples relative to fewer attributes) is preferred. Future studies could correct this shortcoming by including more subjects for analysis, either by lowering the standards of admittance or by lengthening the given time period. Moreover, the set of on-court statistics chosen for the study was extensive but incomplete: totals, per-36-minute statistics, per-100-possession statistics, shot location statistics, play-by-play statistics, game highs, All-Star game statistics, and Similarity Scores are all measurement points listed on Basketball-Reference's website, yet all were left out. Maybe even more important than the types of statistics excluded is the study's disposition towards time: while career averages were included, players who earned All-NBA Team nominations before 2010 were still treated as if those nominations had never happened, thus ignoring the possible effect of repeat voting. Not only were multiple on-court, player-specific metrics left out of the research, but multiple team statistics, such as team success and franchise value, were omitted too. If the full breadth of individual and team-oriented statistics were included, more holistic and informative classification models could be constructed.

Methodology-wise, the full available catalog of classification models was not used, as discriminant analysis, neural networks, k-Nearest Neighbors, and Naïve Bayes methods were not employed. This presents a huge opportunity for future research: if subsequent studies add each of the aforementioned classification models, the possibility of finding an even better fit for the scenario would make the results even more usable. Finally, the omission of various off-court inputs, such as social media following, city data, and endorsement deals, made the analysis much more black-and-white than the reality it models. All-NBA Team voting is most certainly impacted by off-court intangibles such as these, and future models that incorporate them would be truly honest.

If future studies corrected the aforementioned ills, a truly consummate predictive model could be constructed. Even so, this research is absolutely a step in the right direction. Practically, it proposes a set of on-court metrics that were quantitatively shown to be quite sensitive when it comes to All-NBA Team voting. By demonstrating this empirically, in spite of the multitude of parameters listed, the study gives NBA front offices an informative framework that clearly highlights which players would maximize both on-court success and salary cap negotiations. Theoretically, this study expands the already-growing area of quantitative NBA research by utilizing both predictive models and a mixture-of-models approach. While most of the existing literature at the time of this report is purely descriptive, hopefully this study will encourage future researchers to be more exploratory and predictive in their methodologies.

6 Reflection

While my thesis is no more or less demanding than others in the literal sense, its completion has been a long, strenuous journey. Though I began more than a calendar year ago, in September of 2017, it will never feel truly complete. From changing advisors, to changing methodology, to changing (and fighting) software multiple times, I have certainly not made it any easier on myself. In this reflective portion of the report, I will detail my journey towards publication.

6.1 Initial Project Formulation

In September of 2017, I began brainstorming possible topics for my thesis. In the brainstorming process, it was my goal to try to find a topic at the intersection of my personal interests, my current academic fields, and my aspiring professional career path.

6.1.1 Personal Interests

Sports are and have been a huge passion of mine ever since I was born. I love not only the literal entertainment value they bring, but also the broader implications and consequences sports entail. While I am a fan of sports in general, basketball has always been my favorite for a few reasons. For one, it has a dynamic playstyle that makes every single game watchable regardless of each game's relative importance. For two, the visibility inherent in the sport is incredibly appealing: in all professional leagues, players' faces are exposed, and, in most, seasons are long. For three, in the NBA specifically, ever since the league's inception, players have been at the forefront of political conversation and change. Finally, basketball has always been right behind baseball when it comes to accepting analytics and year-over-year playstyle changes, thus making it an incredible historical case study.

After considering all four of these factors, I decided that NBA basketball would be the basis of my study.

6.1.2 Academic Field

While my degree upon graduation will indicate that I pursued a Bachelor's in Business Administration, my academic career at Ohio University has revolved around a specific field within business: analytics.

At Ohio University, I have taken analytics classes in business intelligence and information management, descriptive analytics, predictive analytics, and prescriptive analytics. Throughout my time exploring each of these analytics-related fields, I have come to understand the value they instill in both the academic and professional communities. While qualitative studies are excellent anecdotally, quantitative studies and measurements may be even more powerful. In a society with more available data than ever, the ability to accurately analyze and convey it can change the world. Nowhere is this data influx more visible than in sports. Though a relatively new field of research, sports analytics is booming, especially in the NBA. Front offices are increasingly taking advantage of player tracking data to make more informed decisions, thus changing the landscape of the league forever. Whereas, in the past, academics conducting studies on NBA analytics would be shunted aside, personnel employees are becoming more and more receptive. If a researcher conducts a quantitative study on the NBA now, it has a much higher likelihood of being utilized by a front office employee. Realizing this, I thought it only natural to pursue an analytics-based project.

6.1.3 Professional Pursuits

Though I went into my undergraduate career knowing only that I wanted to pursue a career in business, my time at Ohio University has directed me to one place: an NBA front office. I have always wanted to work in both a field and specific position that value innovation, efficiency, and maximum impact—NBA front offices value all three. If I do end up in an NBA front office, I wish to continue conducting research just like this, in the hopes of advancing both the field of sports analytics and the team I am on.

6.1.4 Initial Project

After deciding on my thesis' field and vessel, I then had to choose what specific hypothesis to explore. In late 2017, most of the existing literature surrounding NBA analytics tended to focus on two areas: how to maximize team revenue via off-court measures, and how to more accurately describe on-court performance. While each of these two fields is incredibly instructive and useful for both academics and NBA front offices, they are somewhat saturated. Seeing this, I wanted to exploit an area that lacked existing research.

After examining the array of reports currently available for public use, I found something incredibly compelling: descriptive modeling that linked the implications of current on-court performance to overall roster construction and overall talent. While many on-court descriptive models existed that described how and why players and teams operated the way they did, few models existed that could describe what attributes, if possessed by a player, indicated overall success. If more were created, their frameworks would not only be more robust and usable long-term but would significantly add to the NBA analytics research repertoire.

In light of all of this, I decided that my initial hypothesis was going to revolve around a somewhat-definitive player performance model. I wanted to describe not how and why players succeeded in certain situations, but what it was about them and their situations that made them valuable assets. To measure a player's overall value, I first had to create a dependent variable that described what value meant for an NBA player: whether or not they made the All-NBA Team. After this, I had to define my parameters for both my subjects and for the attributes used to measure them. While the on-court parameters and attributes remain the same in my current iteration, I initially aimed to encapsulate multiple off-court measurements in my study, including:

• Players' physical measurements, taken at their NBA Draft Combine
• Their team and their team's franchise value, number of championships, and conference
• Their college status, draft position, and citizenship
• Their jersey number, age, and the number of franchises they had played for at the time of their best season
• Their average salary, peak salary, and contract situation at the time of their best season
• Their Olympic history
• Census data of their team's host city at the time of their best season

I included all of these attributes due to their perceived impact on All-NBA Team voting. I assumed that physical measurements played a part in that they allow for quick "eye test" analysis that could be favorable or unfavorable depending on whether a player was below or above average across a variety of factors, including height, weight, and shooting hand. I included each player's team and team-adjacent information due to the perceived impacts of conference skew and team history. College status, draft position, and citizenship were included to account for possible college-centric bias (whether the player was already favored before college or had to spend a prolonged amount of time there), draft position bias (the higher the pick, the greater the expectations), and U.S.-centric bias. Jersey number, age, and franchises played for were included due to jersey implications, age bias, and visibility throughout the league, respectively. Salary information was included to account for the expectations that come with a higher or lower salary. Olympic history was included to account for possible international exposure bias. Finally, census data was included to account for the possible ramifications of market size.

All of the aforementioned data was collected manually. Over a period spanning multiple months, I collected data from:

• Basketball-Reference.com
• The U.S. Census Bureau
• Forbes

All data was copied and pasted into a master Excel spreadsheet. In total, I had compiled 73,199 unique records, or cells, for analysis.

After compiling my master list, my first thesis advisor, Dr. Norman O'Reilly, and I began constructing descriptive models. The overall hypothesis of the report was to run a descriptive model on my 321-player sample that aimed to describe why the players that were selected to one of the All-NBA Teams were selected, and why the others were not. In addition to the overall hypothesis, we constructed four sub-questions (with means of analysis listed):

1. What is the platonic ideal of each position? In other words, what unique mixture of attributes listed in the data most accurately correlated to an All-NBA Team vote?
2. Which set of attributes correlated most strongly to an All-NBA Team vote: advanced metrics or traditional metrics?
3. Does regular season or playoff performance have a more pronounced impact on All-NBA Team voting?
4. How could one produce the "perfect roster"? In other words, how could one get five players on the same team at the same time, all having career years?

6.2 Deviation, New Project Formulation

After formulating my initial thesis' hypothesis and adjacent questions, insurmountable issues arose. For one, I realized that the scope of my research was so broad and so unfounded that it was very unlikely any meaningful outcomes could be reached. While the on-court metrics fully encapsulated everything that could and should be measured pertaining to my dependent variable, the off-court measurements did not. For example, census data for players on the Toronto Raptors could not be found and perfectly matched with that of their American counterparts. Moreover, for players that were traded during their best season, no census or team data could be tracked. With these loads of missing records, the models were sure to be lacking in quality.

More sinister than these issues, however, was my general lack of aim. If I wished to be published in a leading sports analytics journal, my report had to be tightened up. By including specific off-court metrics, I would then have had to defend my thesis against countless other omitted off-court metrics, from social media numbers to family and more.

Furthermore, the four sub-questions muddled the overall purpose and significance of my report, as I obscured and deviated from my main point of interest.

Finally, the issue of advisor relations became too much to overcome. While Dr. O’Reilly was still at Ohio University at the start of my thesis, he took a new position at a Canadian university in the fall of 2018. Though we continued trying to work together through this period, the impossibility of frequent, face-to-face meetings made it necessary to switch advisors.

In January of 2019, I officially switched advisors, choosing Dr. William Young of the Ohio University Analytics and Information Systems department. In choosing Dr. Young, I gained yet another incredibly knowledgeable sports analytics researcher and, more importantly, an advisor who could meet face-to-face weekly.

After an initial meeting with Dr. Young, I reworked my thesis topic in three key ways:

1. I decided to use only my already-existing on-court metrics with my already-existing sample group, so as to tighten the results of the analysis and the overall research scope.

2. I decided to make my initial hypothesis predictive: by examining the impact of each attribute on All-NBA Team voting outcomes, I would create a framework that could be used to predict future players’ fates, yielding an even more robust model.

3. I eliminated the sub-questions entirely, so as not to distract from the report’s main idea.

After switching both ideas and advisors, my thesis officially had a clear end point. Now I had a clear idea and an advisor I could meet with face-to-face on a weekly basis.

6.3 Learning Outcomes

Throughout my year-and-a-half thesis journey, I learned a multitude of academic and personal lessons. The main takeaways of each are below.

6.3.1 Academic

While I have always been an excellent student, delving into a true research project was something of a shock. In the past, writing essays involved little to no planning, little to no organization, and little to no interdependence. I would decide on a topic, write about it in solitude, and forget it the second a first draft was written.

This thesis turned every previous mode of operation I had internalized upside down. Time and again while collecting data, analyzing data, and writing, I moved too quickly, only to fail even more quickly. Whether it was typing in data incorrectly, running the wrong analysis, or writing about something completely irrelevant, I learned that planning should take precedence over doing. After slowing down and writing out frameworks for each of the three aforementioned areas, I was able to move through my plan with ease and without error, though the initial process was intensive and slow.

Second to learning the virtues of planning was my realization of the need to defend everything I write. With my (hopefully) publishable manuscript and my attached scholarly essay, I realized that I could not write anything unsubstantiated. Whereas, in the past, my essays and reports fell on professors’ ears with little to no consequence, the public and serious nature of these reports mandated that they include only information that would hold up in true academic settings. No longer will I write without a reason.

6.3.2 Personal

As an incoming salesman, I thought that I understood the necessities of expectations, transparency, and accountability; however, this journey proved me wrong on numerous occasions.

While working with both Dr. O’Reilly and Dr. Young, I learned how to hold both myself and my associates truly accountable. As the writer and conductor of my thesis, I came to understand that it was my responsibility to set proper guidelines for both myself and my advisors. Too often, I put forth loose deadlines and loose demands, which led to nothing but confusion. In doing so, I not only slowed the completion of the thesis but hurt my own social and professional capital with my advisors. Upon completion of my research, I now understand how important it is to set SMART (specific, measurable, achievable, relevant, and time-bound) goals.

Not only did I learn to hold myself and my associates accountable in conducting this research, but I also learned the value of being open and clear. Many times, between deliverables and assigned work, I would communicate that I had either already finished something or was working on it when I was not. Though these were white lies, they vastly inhibited my ability to communicate effectively with my advisors; had I been honest, I might have received more pertinent help than I actually did. Moreover, I tended to wait to reach out for help until it was absolutely necessary, raising my stress, tightening my timeline, and instilling false confidence. After conducting this research, I now know the importance of reaching out frequently and clearly.

Finally, my thesis journey taught me how honest I must be with myself and my associates. Personally, I let too many things slide without any guilt or self-reflection. This, again, led to a false sense of security that only masked what really needed to get done. With my advisors, my initial lack of candor was even more apparent. When working with Dr. O’Reilly (who was wonderful to work with; I only left due to the aforementioned geographical constraints and my own inability to formulate a proper research scope), I frequently allowed days, and sometimes weeks, of no communication while he was in Canada. Through no fault of his own, I decided to keep working with him out of fear of alienating him when, in reality, parting ways would have been best for both parties. Moreover, in working with both him and Dr. Young, I often convinced myself that I understood the various analytical processes I was employing when, in fact, I did not.

Now and forevermore, I will make sure to be honest with both myself and my associates so as to foster more mutually beneficial personal and academic relationships.


Bibliography

Adams, Jonathan. 2017. NBA Free Agency 2017: What Is a Supermax Contract? July 1. Accessed February 10, 2019. https://heavy.com/sports/2017/07/nba-supermax-contract-deal-salary-how-much-designated-player/.

Aldridge, David. 2016. NBA, NBPA reach tentative seven-year CBA agreement. December 14. Accessed February 12, 2019. https://www.nba.com/article/2016/12/14/nba-and-nbpa-reach-tentative-labor-deal.

Balciunas, Mindaugas, Julio Calleja-Gonzalez, Tim McGarry, Sergio Saiz, Jaime Sampaio, and Xavi Schelling. 2015. "Exploring Game Performance in the National Basketball Association Using Player Tracking Data." PLOS ONE.

Barnes, Christopher, and Eric Uhlmann. 2016. "Selfish Play Increases during High-Stakes NBA Games and is Rewarded with More Lucrative Contracts." PLOS ONE.

Birnbaum, Phil. 2019. A Guide To Sabermetric Research. Accessed February 11, 2019. https://sabr.org/sabermetrics.

Bornn, Luke, Peter Bull, Dmitri Illushin, Aaron Kaufman, Anthony Liu, Andrew Reece, Sherrie Wang, Alec Yeh, and Lo-Hua Yuan. 2015. "A mixture-of-modelers approach to forecasting NCAA tournament outcomes." Journal of Quantitative Analysis in Sports 13-27.

Complete Dissertation. 2019. What is Logistic Regression. Accessed February 15, 2019. https://www.statisticssolutions.com/what-is-logistic-regression/.

Cross, Chad, and Masaru Teramoto. 2010. "Relative Importance of Performance Factors in Winning NBA Games in Regular Season versus Playoffs." Journal of Quantitative Analysis in Sports 2-2.

Davis, Darryl, and Mostafizur Rahman. 2013. "Cluster Based Under-Sampling for Unbalanced Cardiovascular Data." Proceedings of the World Congress on Engineering.

Frontline Solvers. 2019. Analytic Solver Data Mining Add-In For Excel (Formerly XLMiner). Accessed January 11, 2019. https://www.solver.com/xlminer-data-mining.

Freund, Yoav, and Robert Schapire. 1996. "Experiments with a New Boosting Algorithm." Machine Learning: Proceedings of the Thirteenth International Conference 1-9.


Goldman, Matt, and Justin Rao. 2011. Allocative and Dynamic Efficiency in NBA Decision Making. Accessed March 2, 2019. http://www.sloansportsconference.com/wp-content/uploads/2011/08/Allocation-and-Dynamic-Efficiency-in-NBA-Decision-Making1.pdf.

—. 2012. Effort vs. Concentration: The Asymmetric Impact of Pressure on NBA Performance. March. Accessed January 17, 2019. https://pdfs.semanticscholar.org/35be/724131d8f47ba2f5835020717b1d6a9ad1cd.pdf.

Goldsberry, Kirk. 2012. CourtVision: New Visual and Spatial Analytics for the NBA. Accessed March 1, 2019. http://www.sloansportsconference.com/wp-content/uploads/2012/02/Goldsberry_Sloan_Submission.pdf.

Goldsberry, Kirk, and Eric Weiss. 2013. The Dwight Effect: A New Ensemble of Interior Defense Analytics for the NBA. Accessed February 20, 2019. http://www.sloansportsconference.com/wp-content/uploads/2013/The%20Dwight%20Effect%20A%20New%20Ensemble%20of%20Interior%20Defense%20Analytics%20for%20the%20NBA.pdf.

Greenberg, Neil. 2017. What is Player Efficiency Rating? April 13. Accessed March 2, 2019. https://www.washingtonpost.com/what-is-player-efficiency-rating/37939879-1c08-4cfa-aff3-51c2a2ae060e_note.html?noredirect=on&utm_term=.923c116f0419.

Guy, Stephen, and Brian Skinner. 2015. "A Method for Using Player Tracking Data in Basketball to Learn Player Skills and Predict Team Performance." PLOS ONE.

Hollinger, John. 2011. What is PER? August 8. Accessed January 17, 2019. http://www.espn.com/nba/columns/story?columnist=hollinger_john&id=2850240.

HoopsHype. 2018. 2017/18 NBA Salaries. Accessed January 6, 2019. https://hoopshype.com/salaries/2017-2018/.

Journal of Quantitative Analysis in Sports. 2019. Journal of Quantitative Analysis in Sports. Accessed December 15, 2018. https://www.degruyter.com/view/j/jqas.

Kopf, Dan. 2017. Data analytics have made the NBA unrecognizable. October 18. Accessed January 11, 2019. https://qz.com/1104922/data-analytics-have-revolutionized-the-nba/.

Nagpal, Anuja. 2017. Decision Tree Ensembles- Bagging and Boosting: Random Forest and Gradient Boosting. October 17. Accessed December 19, 2018. https://towardsdatascience.com/decision-tree-ensembles-bagging-and-boosting-266a8ba60fd9.


Nath, Trevir. 2018. The NBA's Business Model. June 22. Accessed February 18, 2019. https://www.investopedia.com/articles/investing/070715/nbas-business-model.asp.

NBA. 2006. MVP Nash Highlights All-NBA First Team. May 17. Accessed March 3, 2019. http://www.nba.com/news/AllNBA_060517.html.

—. 2019. Year-by-year list of All-NBA Teams. Accessed March 3, 2019. https://www.nba.com/history/awards/all-nba-team.

Sayad, Saed. 2019. Decision Tree - Regression. Accessed December 18, 2018. https://www.saedsayad.com/decision_tree_reg.htm.

Sill, Joseph. 2010. Improved NBA Adjusted +/- Using Regularization and Out-of-Sample Testing. March 6. Accessed February 10, 2019. http://www.sloansportsconference.com/wp-content/uploads/2015/09/joeSillSloanSportsPaperWithLogo.pdf.

Spotrac. 2019. Stephen Curry. Accessed March 16, 2019. https://www.spotrac.com/nba/golden-state-warriors/stephen-curry-6287/.

Young, William, Scott Nykl, Gary Weckman, and David Chelberg. 2015. "Using Voronoi diagrams to improve classification performances when modeling imbalanced datasets." Neural Computing and Applications 1041-1054.