Project Basketball (Formerly Project Football)
Aaron Luo, Shravan Byra, Michael Rosenston, Ashwin John

Abstract: Our project presents basketball statistics in a convenient, easy-to-interpret form through a specialized search engine. A user enters the names of two NBA players as the query. They are then shown a page that contains pictures of each player, a table of their statistics, relevant headlines, and a radar overlay chart comparing their key statistics. The radar chart lets fans and analysts choose the better player by checking which player's overlay covers the most area on the chart. Our tool reduces the uncertainty of visually comparing many numbers by turning the key statistics into one easy-to-read chart.

Introduction: Basketball fans, reporters, coaches, players, and executives often like to compare the statistics of two players to determine who is superior. Current statistics sites simply give the user a table with the statistics of the two players side by side. This data alone makes it challenging to choose the better player objectively in most situations. Eyeballing and weighting comparisons across multiple statistical categories is difficult and often open to the influence of fan bias. It is hard to keep track of who is better, in what area, and by how much. Our tool seeks to remedy this issue by providing basketball enthusiasts with an easy-to-read visual comparison.

A radar chart places points around a circle. The interior circles represent increasing measurements, with the outermost circle being the maximum. Data is then filled in to extend toward the points. It is essentially a multi-axis graph. Each point on the outside is a different basketball statistic. The two differently colored areas represent the two players. The player with the larger overall area would often be judged to be the better player.
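The area comparison described above can be made concrete with a small sketch. It is not the project's code: the function names, the axis maxima, and the two stat lines ("Player A", "Player B") are hypothetical placeholders chosen only for illustration. The sketch assumes each statistic is scaled to the range 0 to 1 by a per-axis maximum and that the axes are equally spaced around the circle. Note that the resulting area depends on the order of the axes and on the chosen maxima, so it is best read as a rough summary rather than an exact ranking.

```python
import math


def normalize(stats, maxima):
    """Scale each statistic to [0, 1] using a per-axis maximum.

    The key order of `maxima` fixes the order of the axes on the chart.
    """
    return [min(stats[name] / maxima[name], 1.0) for name in maxima]


def radar_area(values):
    """Area of the polygon drawn by `values` on equally spaced radar axes.

    Adjacent axes and the center form a triangle whose area is
    0.5 * r1 * r2 * sin(angle between the axes); the polygon is their sum.
    """
    n = len(values)
    if n < 3:
        raise ValueError("a radar polygon needs at least three axes")
    wedge = math.sin(2 * math.pi / n) / 2
    return wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))


# Hypothetical axis maxima and stat lines, for illustration only.
AXIS_MAX = {"points": 40.0, "rebounds": 20.0, "assists": 15.0,
            "steals": 4.0, "blocks": 5.0}
player_a = {"points": 30.0, "rebounds": 5.0, "assists": 7.5,
            "steals": 2.0, "blocks": 0.5}
player_b = {"points": 20.0, "rebounds": 12.0, "assists": 3.0,
            "steals": 1.0, "blocks": 2.5}

area_a = radar_area(normalize(player_a, AXIS_MAX))
area_b = radar_area(normalize(player_b, AXIS_MAX))
print("larger overlay:", "Player A" if area_a > area_b else "Player B")
```

Clamping at 1.0 keeps an unusually large value from extending past the outermost circle.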
A player whose area extends further toward one category would be judged to be better in that statistic. When a user of our tool enters two player names, we deliver one of these charts comparing the two players they requested. Additionally, we return two images of each player. These images are located with an image web crawler that we designed for this purpose. Lastly, we use an API or crawler to pull the statistics of the two players and display them under our chart. This way the user receives the statistics tables they are used to as well as our radar chart. The user can then use our chart to make a quick judgment and, if they so choose, explore the numbers in more depth on their own. If applicable, recent headlines about the players are also pulled and displayed.

Related Work: Basketball-reference.com contains a plethora of basketball statistics for teams and players both past and present. It also provides numbers on the performance of coaches as well as outcomes and box scores of every NBA game ever played. It does not, however, feature a visual tool to easily compare players. At espn.com/nba, users can look up players, view their statistics, see a photograph of them, and read the latest news on that player. What they cannot do is compare two players with an easily readable graph. NBA.com offers features similar to ESPN's, but still no easy way to visually compare two players.

Our tool sets itself apart by allowing users to visually compare two players with a radar chart. With a high number of statistics attributed to each player, it is very difficult to keep track of which player has the best numbers. Most related sites display a numeric comparison of players, but without a radar graph to help interpret the data, it is very difficult to see at first glance who the superior player is. Our tool addresses this problem where the others do not.

Problem Definition: We look to solve the problem of comparing players easily and quickly.
We solve this by taking a query of two players from the user and displaying the relevant information to them in a timely fashion. The input to our system is a query from the user; it consists of two NBA players. Our problem is then to retrieve their statistics using an API, information extractor, or crawler, crawl the web for images, make a radar chart comparing them, retrieve relevant headlines, and display all of this data in an aesthetically pleasing manner. Returning our output in a visually pleasing and timely fashion is essential. If we take too long, users will become unhappy and not use our product. If the data is displayed in a disorganized manner, it will be hard to read and interpret. This would reduce the utility of our tool.

Methods: The back end was written mainly in Python; it consisted of parsing, API calls, crawlers (Scrapy), and text extraction/interpretation. The API we used was limited in its ability to retrieve data: it could only retrieve statistics if the player is currently on a known roster. So, upon receiving the input names, the first thing we did was run our crawlers (common seed sites were Google, ESPN.com, NBA.com, sports.yahoo.com, etc.). The crawler would find the player's current team if he is an active player, and retrieve statistics (using information extraction and pattern matching) if he is an inactive (retired) player. This approach also tolerates typos in the name and yields up-to-date data, since the API pulls the most current data to the minute. For inactive players, the extracted statistics are immediately used to create the radar chart, and the graph, statistics, and relevant headlines (if any) are sent directly to the front end. For active players, a series of API calls is used to retrieve and parse the statistics before they are handled the same way.

Crawler4j is an open-source Java crawler that we used to crawl websites and select photographs to use in our search.
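One common way to select photographs from crawled links is to keep only URLs whose path ends in an image file extension. The sketch below illustrates that idea in Python; the project's image crawler was built on Crawler4j in Java, so this is not its code. The function names, the exact extension list, and the example.org URLs are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Assumed extension list; adjust to the formats the crawler should accept.
IMAGE_PATTERN = re.compile(r"\.(png|jpe?g|gif|tiff?)$", re.IGNORECASE)


def is_image_url(url):
    """Return True if the URL path ends in a known image extension.

    Only the path is checked, so a query string such as ?width=200
    does not hide the extension.
    """
    return bool(IMAGE_PATTERN.search(urlparse(url).path))


def select_images(urls):
    """Keep only the URLs that look like image files."""
    return [url for url in urls if is_image_url(url)]


# Hypothetical crawled links.
links = [
    "https://example.org/players/portrait.png",
    "https://example.org/players/profile.html",
    "https://example.org/players/action.JPG?width=200",
]
print(select_images(links))
```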
The original crawler was modified to find "patterns" on the page that match image file extensions (.png, .jpg, .tiff, etc.). The crawler controller was also modified to take a player name as input, so that it starts crawling that specific player's page to pull images. Since the starting crawl page was Wikipedia.org, the crawler was able to specify what size of image it wanted to download. Other pages such as Google Images would not have been able to provide such a precise image size.

For the front end we used AngularJS, so that we could follow a more MVC-style design for the website. Because of Angular there are essentially no page reloads. index.html is loaded initially, and Angular then watches the URL for changes; when the URL changes, it loads the new view that corresponds to the new URL. The new view is copied into the <div ng-view></div> tag in index.html, and the appropriate model and controller are loaded with it. The search button is linked to a function in the Angular controller, and this function is called every time the button is clicked. The function then calls another function that sends an HTTP POST request to the back-end Flask server. The Flask server handles the POST request at the specified URL. For instance, if the front end sends a request to "/mySearch", we add a URL rule to the Flask server to accept POST requests at "/mySearch", and we can then use the information from the POST in our search functions.

Evaluation/Sample Results: After overcoming some development challenges, our tool became functional. Sample queries usually return good results. However, the text extractor is not perfect. Some historical players have missing statistics or wildly implausible values. We hardcoded an acceptable range of numbers for each statistic to fall within. If a statistic falls outside its range, it is tossed out.
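The range check described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the statistic names and the bounds are hypothetical, and the choice to also drop statistics that have no configured range or cannot be parsed as numbers is an assumption.

```python
# Hypothetical acceptable ranges (per-game values), for illustration only.
ACCEPTABLE_RANGES = {
    "points_per_game": (0.0, 60.0),
    "rebounds_per_game": (0.0, 30.0),
    "assists_per_game": (0.0, 20.0),
}


def filter_stats(raw_stats, ranges=ACCEPTABLE_RANGES):
    """Keep only statistics that parse as numbers within their range.

    Values with no configured range, values that are not numeric, and
    values outside the range are all tossed out (assumed behavior).
    """
    kept = {}
    for name, value in raw_stats.items():
        bounds = ranges.get(name)
        if bounds is None:
            continue
        try:
            number = float(value)
        except (TypeError, ValueError):
            continue
        low, high = bounds
        if low <= number <= high:  # NaN fails this comparison and is dropped
            kept[name] = number
    return kept


# Hypothetical extractor output with one garbled and one unparsable value.
extracted = {
    "points_per_game": "25.3",
    "rebounds_per_game": 412,
    "assists_per_game": "n/a",
}
print(filter_stats(extracted))
```

A statistic removed this way is simply absent from the result, so the chart and table code need to tolerate missing axes.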
Missing or implausible statistics can come from two places: 1) certain statistics were not tracked before a certain time, so the player's era plays a factor; 2) the format of the statistics was too difficult for the extractor to understand. Tests on misspelled names work just as well (with help from Google). The graph is easily interpretable and helps basketball enthusiasts decide on the better player. The statistical tables can be seen clearly and are well organized. The pictures of the players are sized so as not to dominate the page, but still give the user a feel for the players whose performance they are comparing. If required, users also have the option to inspect the statistics numerically. This allows for more detailed analysis and versatility. If only one player is entered, hardcoded position averages are used and the player is "compared" to the average player at his position.

Conclusions and Future Work: Our tool takes a query from the user in the form of two basketball players. It then returns photos of the players, tables of their statistics, relevant headlines, and, most importantly, a radar chart comparing their performance. Our tool is unique and superior to other basketball references when it comes to comparing players. We allow users to easily view a radar chart of two players. This method is quick and objective. This tool, if employed by a highly trafficked sports website, could be of great use to basketball fans. A further expansion of our tool would be to allow the comparison of three or more players. The challenge is that each additional overlay makes the graph harder to read. Additionally, it requires sending and receiving more data.