
International Journal of Engineering Technology Science and Research (IJETSR), www.ijetsr.com, ISSN 2394 – 3386, Volume 4, Issue 6, June 2017

AI Methodology for Automated Selection of Playing XI in IPL

C. Deep Prakash, Assistant System Engineer Trainee, TCS CTO, NOIDA
C. Patvardhan, Dayalbagh Educational Institute, Agra, India
C. Vasantha Lakshmi, Dayalbagh Educational Institute, Agra, India

ABSTRACT

T20 cricket has revolutionized competitive cricket, with fans finding the shortest format ideal for an exciting evening. The Indian Premier League (IPL), whose 10th edition was completed in May 2017, is a case in point. The stakes in IPL tournaments are huge, and franchises spend enormous sums of money to acquire the best talent to represent them. This raises the problem of selecting the best playing XI to benefit from the investment. This paper presents a completely automated and objective procedure based on comprehensive analytics of performance data using state-of-the-art AI techniques. The approach is validated on data from IPL 9. In 73.3% of cases, the team whose actual playing XI more closely matches the XI selected by the proposed methodology wins. The proposed approach is therefore suitable for use by team management as an automated solution to this complex problem. The methodology is completely objective and free of bias.

Keywords: Machine Learning, Team Selection, Clustering, IPL, ReliefF

INTRODUCTION

The first T20 World Cup was organised in South Africa in 2007. India won the tournament after defeating Pakistan by 5 runs in a high-voltage final. This prompted the Board of Control for Cricket in India (BCCI) [1] to introduce a T20 league, named the Indian Premier League (IPL) [2], in 2008. In the first season, eight of the largest cities in India (Bangalore, Chennai, Delhi, Hyderabad, Jaipur, Kolkata, Mohali, Mumbai) were chosen and eight franchises (Royal Challengers Bangalore, Chennai Super Kings, Delhi Daredevils, Deccan Chargers, Rajasthan Royals, Kolkata Knight Riders, Kings XI Punjab, Mumbai Indians) were assigned to them. The franchises selected their squads according to the rules of IPL through competitive bidding from a pool of Indian and foreign players selected by BCCI. BCCI has organized the IPL T20 tournament every year since then. Ten IPL tournaments have been held to date, the latest being completed in May 2017.

Analytical and statistical modeling is important in every aspect of cricket, such as batting, bowling, fielding, team selection, result prediction, player ranking, team ranking and target revision in a rain-affected match. Analytics are popular because Indian fans are also followers of statistical records. Analysis of IPL data thus becomes more important.

Some data analytics studies related to cricket reported in the literature are as follows. Deep Prakash et al. presented the Deep Performance Index for ranking batsmen and bowlers [3]. Barr et al. used a weighted combination of average and strike rate to evaluate the performance of both batsmen and bowlers who played in the first T20 World Cup and, based on the resulting ranking, presented a World XI [4]. Deep Prakash et al. presented a team selection methodology based on heuristics and the Random Forests algorithm for IPL season 9 [5], and another approach using a memetic genetic algorithm for selecting the best playing XI for each team in IPL season 9 [6]. However, no attempt was made to validate the approach. Amit Kumar and Sindhu [7] use a variety of detailed metrics to analyze batting performance in IPL. The metrics are
designed to reflect the impact of a player's performance on the match outcome. However, the detailed data necessary to compute them is not easily available. Dey and Ghosh [8] employ an MCDM approach for evaluating bowlers' performance in IPL. In this approach the number of features is very small, and almost 25% of the importance is assigned to the number of matches played, the number of overs bowled and the number of wickets taken. Thus players who have played more matches may get an undue benefit over young talented players.

There are many performance evaluation metrics, but they are based on very limited considerations. There is a need for a more comprehensive metric which incorporates a larger set of cricketing attributes of the players. This is the motivation for developing an analytical framework of detailed features, so that all cricketing attributes can be taken into account when evaluating the performance of the players. In this work a complete analysis of IPL season 9 is carried out. The data is considered in three parts as follows.

(i) Overall T20 International career data of the player up to IPL season 9.
(ii) Previous year's international T20 data, in order to take current form into account [9], and
(iii) IPL career data up to season 9 [10].

38 features are calculated for batsmen, of which 22 are extracted from their career data, 8 from their previous year's data and 8 from their IPL career data. Similarly, 37 features are calculated for bowlers, of which 22 are extracted from their career data, 8 from their previous year's data and 7 from their IPL career data. This is the first work in which such a comprehensive set of features has been considered. After extraction of these features, players are clustered on the basis of similarity between players. A new integrated and comprehensive performance index called the Cluster Based Index (CBI) is developed and computed. A team selection strategy is developed that uses these clusters and the CBI values to identify the best playing eleven for each team.

The rest of the paper is organized as follows. In section II, the features and their extraction are described. In section III, the clustering approach is described. In section IV, the new Cluster Based Index (CBI) and the proposed team selection methodology are given. In section V, results are described. In section VI, conclusions and future work are given.

FEATURES EXTRACTION FOR BATSMEN AND BOWLERS

For every batsman who played in IPL season 9, 38 features are calculated from the respective datasets (Career, Previous Year, IPL). For every bowler who played in IPL season 9, 37 features are calculated from the respective datasets (Career, Previous Year, IPL). The features and their mathematical definitions are given for batsmen and bowlers in tables 1 and 2 respectively.

CLUSTERING OF BATSMEN AND BOWLERS

The team needs to be balanced in terms of the availability of players for different roles such as openers, middle order batsmen, finishers, fast bowlers, spinners, wicketkeeper, etc. A team lacking one particular type of player would find it difficult to win. A clustering based solution is proposed for the team selection problem. Players are clustered so that players with similar abilities and prospective roles can be compared against each other while selecting the best playing XI. Each player has a different role in the team, and clustering gives relevant information.

K-Means Clustering [11] is used to cluster the batsmen and bowlers into different clusters. K-Means Clustering is an unsupervised learning algorithm [12] in Data Mining [13]. All the batsmen are partitioned into k clusters such that each batsman belongs to the cluster with the nearest mean. Let the feature vectors of the players be P(1), P(2), P(3), ..., P(m), where m is the number of players. Since the algorithm is unsupervised, there is no target variable. The goal of the K-Means Clustering algorithm is to compute K centroids (one for each cluster) and assign a cluster C(i) to each player P(i). The Elbow Method [14] is used to determine the number of clusters K. In the Elbow Method the percentage of variance is examined as a function of the number of clusters K, and K is chosen such that adding another cluster does not give much better modeling of the data. The number of clusters for batsmen is determined using the Elbow Method in Figure 1.

Fig 1. Determining number of clusters using Elbow method

Applying K-means clustering to the batsmen yields 6 clusters of sizes 28, 22, 9, 8, 17 and 11 players. In order to analyze the significance of each cluster and the importance of individual features for that cluster, the ReliefF algorithm [15] is used to determine the relative weightage of each feature within the cluster. The Most Valuable Player Index (MVPI) defined by Rediff Cricket [16] is taken as the target value. MVPI is computed as follows.

$$\text{Batting} = \left[ \frac{\text{Player's Batting Average}}{\text{Tournament Batting Average}} \times \text{Runs Scored by the Player} + \left( \frac{\text{Player's Batting Strike Rate}}{\text{Tournament Batting Strike Rate}} \right)^{2} \times \text{Runs Scored by the Player} \right]$$

ReliefF works on a dataset with m batsmen, each having p features. The feature values are normalized to the range [0, 1]. The algorithm starts with a weight vector of length p, with every element initialized to 0. It then iterates. At each iteration, it picks a random feature vector X from the dataset and finds the instance closest to X (by Euclidean distance) from each class. The near-hit is the closest same-class instance and the near-miss is the closest different-class instance. The weight vector is updated as follows.

$$W_i = W_i - (X_i - \text{nearHit}_i)^2 + (X_i - \text{nearMiss}_i)^2$$

The weight of a feature is reduced if X differs on that feature more from the nearby instance of the same class than from the nearby instance of the other class. It is increased in the opposite case. After m iterations, the relevance vector is obtained by dividing each element of the weight vector by m.

The features for batsmen and their corresponding weights, according to their respective cluster, are given in table 3. In cluster 1, the ability of the batsman to hit boundaries is the most prominent feature; the ability to maintain a good strike rate on Asian pitches as well as on other continental pitches, while maintaining consistency and getting big scores, is also important. For cluster 2 players, strike rate matters most, along with their ability to stay [...], their consistency in winning matches as well as in IPL, and their experience. The ability to hit boundaries is the most prominent feature for players in cluster 3; their ability to remain not out, their ability to rotate the strike and their performance in winning matches are also important. For cluster 4 batsmen, experience matters most; their consistency in target chasing matches and in winning matches and their strike rate are also important. For cluster 5 batsmen, consistency in big matches matters most; their consistency while batting first, their experience and their strike rate in losing matches are also important. For cluster 6 batsmen, strike rotating ability matters most; their strike rate in the past year, their overall strike rate and their consistency in winning matches are also important. Thus different features are important in every cluster. The number of clusters for bowlers is determined using the Elbow Method in Figure 2.

Fig 2. Determining the number of clusters using Elbow method

Features are computed and K-means clustering is applied to the bowlers, yielding 7 clusters with 9, 17, 12, 16, 30, 12 and 11 bowlers respectively. Again, the ReliefF algorithm is used to determine the importance of each feature within each cluster. The Most Valuable Player Index (MVPI) defined by Rediff Cricket is used as the target value for this purpose.

$$\text{Bowling} = \left[ \frac{\text{Tournament Bowling Average}}{\text{Player's Bowling Average}} + \frac{\text{Tournament Economy Rate}}{\text{Player's Economy Rate}} \times 2 \right] \times \text{Wickets Taken by the Player}$$

The features for bowlers with their corresponding weights, according to their respective cluster, are given in table 4. For cluster 1 bowlers, the number of times they have taken big wickets (4 or 5) in the previous year is the most prominent feature. Further, their economy in night matches and in winning matches, their consistency and the attacking index are also important features. For cluster 2 bowlers, short performance in the previous year matters most; their wicket-taking ability, strike rate and consistency are also important features. For cluster 3 bowlers, the number of matches played in the previous year is the most important feature; their wicket-taking ability, strike rate, consistency and short performance in IPL are also important. For cluster 4 bowlers, the night economy index is the most prominent feature; their economy in big matches, their experience, consistency and wicket-taking capability are also important. For cluster 5 bowlers, economy in winning matches matters most; their short performance capability, their ability to bowl maiden overs and their strike rate are also important features. For cluster 6 bowlers, short performance in IPL matters most; their ability to take big wickets, average, consistency and experience are also important features. For cluster 7 bowlers, consistency is the most important feature; their economy and strike rate in night matches are also important. Thus different features dominate in every cluster.

TEAM SELECTION METHODOLOGY

Deciding the playing XI for a particular team out of the available players is a big task for each franchise. Some criteria have to be satisfied while selecting the playing eleven.

• There cannot be more than 4 foreign players in the playing eleven.
• There must be one captain and one wicketkeeper in the playing eleven.
• There must be 2 openers, 3 middle order batsmen and 2 all-rounders.
• There must be at least one uncapped Indian player (a player who has not played for the country).

A careful analysis of each cluster according to cricketing knowledge leads to a viable approach for team selection. For batsmen, all the experienced players with good consistency, good strike rate and power hitting capability fall in cluster 1. Thus, while selecting the team, players in cluster 1 are given first preference. Batsmen in cluster 5 have a good T20 record and are thus given the next preference. Next, cluster 3 players are considered because these are the young uncapped Indian players. All the young talented Indian players with a good IPL record fall in cluster 2. Since there must be at least 7 Indians in the team, players from this cluster get second preference in the proposed team selection methodology. Cluster 4 contains players who have a very good T20 record but did not play any T20 matches in the last year. They can be considered for the team because of their brilliant past career record. Last, cluster 6, which contains foreign players who are playing IPL for the first time, is considered. Thus the clustering algorithm has found meaningful clusters of batsmen based upon the features which were defined.

Similarly for bowlers, cluster 6 bowlers, who have performed very well in the IPL, are considered first. Cluster 4 players, the experienced players with good consistency and wicket-taking capability, are given the next preference. Cluster 5 contains all the young Indian bowlers who have not had much chance at international level. Then cluster 2 players are considered; these are the young and key bowlers from all the teams, with a brilliant short performance index and wicket-taking capability. After that, cluster 3 bowlers are selected; these are the bowlers who have caught attention with their performance in the previous year. Then cluster 7 bowlers, who have good consistency, are taken into consideration. Then cluster 1 bowlers who
have performed well on some occasions will be considered.

A Cluster Based Index (CBI) is defined in order to calculate the ranks of players within a cluster. If there are n features,

$$\text{CBI} = \sum_{i=1}^{n} F_i \, W_i$$

where F_i is the feature value and W_i is the corresponding feature weight, which is calculated using the ReliefF algorithm.

RESULTS

The CBIs of batsmen in the various teams are given in table 5. Team-wise bowlers and their corresponding CBIs are given in table 6. The teams formed using the CBI are given in table 7. For each match, the actual playing XI of each team is compared against the best XI found, and the team whose actual XI has fewer players in common with its best XI is predicted to be the loser of that match. In 73.33% of matches the prediction made on this basis was found to be true, which is quite encouraging given that T20 cricket is known to be hard to predict. The team-wise results are shown in figure 3 and match-wise results are shown in table 8. One factor that has not yet been considered is that some players left the tournament midway due to other commitments or injuries, e.g. K. Pietersen, M. Marsh, F. du Plessis, S. Marsh, L. Simmons, T. Perera, Yuvraj Singh etc. A more detailed analysis of these cases would certainly further improve the prediction accuracy. However, the aim of this work is to present a new methodology with sufficient validation, which is accomplished.

Fig 3. Predictions based on Best XI (true and false predictions per team: MI, DD, RPS, KKR, GLR, RCB, SRH, KXIP)

CONCLUSIONS

The paper presents a novel methodology for team selection. First the players are classified into various categories, and they are ranked on the basis of a weighting function determined specifically for each category using a machine learning algorithm. The ranking is used to select the best team according to heuristics that ensure a balanced team. The procedure is completely automatic and is based on analyzing a comprehensive set of features that are more detailed than those considered hitherto. The procedure is simulated on IPL IX data to validate it. The high accuracy of the predictions is encouraging. The clustering methodology and the ranking scheme can also be combined with an optimization algorithm such as a Genetic Algorithm for the team selection problem. It would also be interesting to explore the performance of Deep Learning algorithms for the team selection problem. Efforts are being directed in these areas.

REFERENCES

[1] Board of Control for Cricket in India, https://en.wikipedia.org/wiki/Board_of_Control_for_Cricket_in_India
[2] Indian Premier League, https://en.wikipedia.org/wiki/Indian_Premier_League
[3] Prakash, C. Deep, C. Patvardhan, and Sushobhit Singh. "A new Machine Learning based Deep Performance Index for Ranking IPL T20 Cricketers." International Journal of Computer Applications (0975–8887) Volume.
[4] Barr, G. D. I., C. G. Holdsworth, and B. S. Kantor. "Evaluating performances at the 2007 cricket world cup." South African Statistical Journal 42.2 (2008): 125.
[5] Prakash, C. Deep, C. Patvardhan, and C. Vasantha Lakshmi. "Team Selection Strategy in IPL 9 using Random Forests Algorithm." International Journal of Computer Applications (0975–8887) Volume.
[6] Prakash, C. Deep. "A New Team Selection Methodology using Machine Learning and Memetic Genetic Algorithm for IPL-9." Int. Jl. of Electronics, Electrical and Computational System IJEECS ISSN (2016).
[7] Amit Kumar and Ritu Sindhu, "Reflection against perception: Data Analysis of IPL Batsmen", International Journal of Engineering Science Invention, Vol. 3, Issue 6, June 2014, pp 7 – 11. ISSN No. 2319-6734.
[8] Ahmad F, Kalyanmoy Deb and Abhilash Jindal, "Multiobjective Optimization and decision making approaches to cricket team selection", Applied Soft Computing, Vol 13, 2013, pp 402 – 414.
[9] http://www.espncricinfo.com/india/content/player/28081.html, T20 statistics of each player
[10] http://www.iplt20.com/teams/Royal-challengers-bangalore/squad/236/chris-gayle, IPL statistics of each player
[11] Hartigan, John A., and Manchek A. Wong. "Algorithm AS 136: A k-means clustering algorithm." Journal of the Royal Statistical Society. Series C (Applied Statistics) 28.1 (1979): 100-108.
[12] Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. "Unsupervised learning." The elements of statistical learning. Springer New York, 2009. 485-585.
[13] Berkhin, Pavel. "A survey of clustering data mining techniques." Grouping multidimensional data. Springer Berlin Heidelberg, 2006. 25-71.
[14] Optimal number of Clusters, https://www.r-bloggers.com/optimal-number-of-clusters/
[15] Kononenko, I., Simec, E., & Robnik-Sikonja, M. (1997). Overcoming the myopia of inductive learning algorithms with RELIEFF.
[16] http://www.rediff.com/cricket/report/icc-world-cup-devilliers-maintains-big-lead-shami-rises-to-7th-in-mostvaluable-player-table/20150320.htm

Table 1: Batting features and their mathematical formulae

| S.No. | Feature | Definition |
|---|---|---|
| 1 | Consistency Index | (Total Runs Scored) / (No. Innings Played in which he got out) |
| 2 | Attacking Index | (Total Runs Scored / Total Balls Faced) * 100 |
| 3 | Experience Index | No. Matches Played |
| 4 | Runs Index | Total Runs Scored |
| 5 | Big Score Index | (No. Hundreds + Fifties) / (No. Innings Played) |
| 6 | Power Hitter Index | (No. Boundaries) / (No. Balls Faced) |
| 7 | Finishing Index | (No. Not Outs) / (No. Innings) |
| 8 | Strike Rotating Index | (No. Runs Scored without Boundaries) / (No. Balls Faced without a Boundary) |
| 9 | Winning Consistency Index | (Total Runs Scored in Winning Matches) / (No. Innings Played in Winning Matches in which he got out) |
| 10 | Winning Attacking Index | (Total Runs Scored in Winning Matches / Total Balls Faced in Winning Matches) * 100 |
| 11 | Losing Consistency Index | (Total Runs Scored in Losing Matches) / (No. Innings Played in Losing Matches in which he got out) |
| 12 | Losing Attacking Index | (Total Runs Scored in Losing Matches / Total Balls Faced in Losing Matches) * 100 |
| 13 | Pressure Consistency Index | (Total Runs Scored in Finals or Semi-Final Matches) / (No. Innings Played in Finals or Semi-Final Matches in which he got out) |
| 14 | Pressure Attacking Index | (Total Runs Scored in Finals or Semi-Final Matches / No. Balls Faced in Finals or Semi-Final Matches) * 100 |
| 15 | Night Consistency Index | (Total Runs Scored in Night Matches) / (No. Innings Played in Night Matches in which he got out) |
| 16 | Night Attacking Index | (Total Runs Scored in Night Matches / Total Balls Faced in Night Matches) * 100 |
| 17 | Target Setting Consistency Index | (Total Runs Scored while Batting First) / (No. Innings Played while Batting First in which he got out) |
| 18 | Target Setting Attacking Index | (Total Runs Scored while Batting First / Total Balls Faced while Batting First) * 100 |
| 19 | Target Chasing Consistency Index | (Total Runs Scored while Batting Second) / (No. Innings Played while Batting Second in which he got out) |
| 20 | Target Chasing Attacking Index | (Total Runs Scored while Batting Second / Total Balls Faced while Batting Second) * 100 |
| 21 | Asian Pitch Consistency Index | (Total Runs Scored on Asian Pitches) / (No. Innings Played on Asian Pitches in which he got out) |
| 22 | Asian Pitch Attacking Index | (Total Runs Scored on Asian Pitches / Total Balls Faced on Asian Pitches) * 100 |
| 23 | Previous Consistency Index | (Total Runs Scored in Previous Year) / (No. Innings Played in Previous Year in which he got out) |
| 24 | Previous Attacking Index | (Total Runs Scored in Previous Year / Total Balls Faced in Previous Year) * 100 |
| 25 | Previous Experience Index | No. Matches Played in Previous Year |
| 26 | Previous Runs Index | Total Runs Scored in Previous Year |
| 27 | Previous Big Score Index | (No. Hundreds + Fifties in Previous Year) / (No. Innings Played in Previous Year) |
| 28 | Previous Power Hitter Index | (No. Boundaries in Previous Year) / (No. Balls Faced in Previous Year) |
| 29 | Previous Finishing Index | (No. Not Outs in Previous Year) / (No. Innings in Previous Year) |
| 30 | Previous Strike Rotating Index | (No. Runs Scored without Boundaries in Previous Year) / (No. Balls Faced without Scoring a Boundary in Previous Year) |
| 31 | IPL Consistency Index | (Total Runs Scored in IPL) / (No. Innings Played in IPL in which he got out) |
| 32 | IPL Attacking Index | (Total Runs Scored in IPL / Total Balls Faced in IPL) * 100 |
| 33 | IPL Experience Index | No. Matches Played in IPL |
| 34 | IPL Runs Index | Total Runs Scored in IPL |
| 35 | IPL Big Score Index | (No. Hundreds + Fifties in IPL) / (No. Innings Played in IPL) |
| 36 | IPL Power Hitter Index | (No. Boundaries in IPL) / (No. Balls Faced in IPL) |
| 37 | IPL Finishing Index | (No. Not Outs in IPL) / (No. Innings in IPL) |
| 38 | IPL Strike Rotating Index | (No. Runs Scored without Boundaries in IPL) / (No. Balls Faced without Scoring a Boundary in IPL) |

Table 2. Bowling features and their mathematical definitions

| S.No. | Feature | Definition |
|---|---|---|
| 1 | Consistency Index | (No. Runs Given) / (No. Overs Bowled) |
| 2 | Attacking Index | (No. Balls Bowled) / (No. Wickets Taken) |
| 3 | Experience Index | No. Matches Played |
| 4 | Wickets Index | No. Wickets Taken |
| 5 | Big Wickets Index | (No. Times 4 or 5 Wickets Taken) / (No. Matches Played) |
| 6 | Short Performance Index | (No. Wickets Taken Excluding the Big Wickets) / (No. Matches Played Excluding the Big Wicket Matches) |
| 7 | Average Index | (No. Runs Given) / (No. Wickets Taken) |
| 8 | Maiden Index | No. Maiden Overs Bowled |
| 9 | Winning Economy Index | (No. Runs Given in Winning Matches) / (No. Overs Bowled in Winning Matches) |
| 10 | Winning SR Index | (No. Balls Bowled in Winning Matches) / (No. Wickets Taken in Winning Matches) |
| 11 | Losing Economy Index | (No. Runs Given in Losing Matches) / (No. Overs Bowled in Losing Matches) |
| 12 | Losing SR Index | (No. Balls Bowled in Losing Matches) / (No. Wickets Taken in Losing Matches) |
| 13 | Big Match Economy Index | (No. Runs Given in Finals or Semi-Final Matches) / (No. Overs Bowled in Finals or Semi-Final Matches) |
| 14 | Big Match SR Index | (No. Balls Bowled in Finals or Semi-Final Matches) / (No. Wickets Taken in Finals or Semi-Final Matches) |
| 15 | Night Economy Index | (No. Runs Given in Night Matches) / (No. Overs Bowled in Night Matches) |
| 16 | Night SR Index | (No. Balls Bowled in Night Matches) / (No. Wickets Taken in Night Matches) |
| 17 | Target Defending Economy | (No. Runs Given while Bowling Second) / (No. Overs Bowled while Bowling Second) |
| 18 | Target Defending SR | (No. Balls Bowled while Bowling Second) / (No. Wickets Taken while Bowling Second) |
| 19 | Target Restricting Economy | (No. Runs Given while Bowling First) / (No. Overs Bowled while Bowling First) |
| 20 | Target Restricting SR | (No. Balls Bowled while Bowling First) / (No. Wickets Taken while Bowling First) |
| 21 | Asia Economy Index | (No. Runs Given on Asian Pitches) / (No. Overs Bowled on Asian Pitches) |
| 22 | Asia SR Index | (No. Balls Bowled on Asian Pitches) / (No. Wickets Taken on Asian Pitches) |
| 23 | Previous Consistency Index | (No. Runs Given in Previous Year) / (No. Overs Bowled in Previous Year) |
| 24 | Previous Attacking Index | (No. Balls Bowled in Previous Year) / (No. Wickets Taken in Previous Year) |
| 25 | Previous Experience Index | No. Matches Played in Previous Year |
| 26 | Previous Wickets Index | No. Wickets Taken in Previous Year |
| 27 | Previous Big Wickets Index | (No. Times 4 or 5 Wickets Taken in Previous Year) / (No. Matches Played in Previous Year) |
| 28 | Previous Short Performance Index | (No. Wickets Taken Excluding the Big Wickets in Previous Year) / (No. Matches Played Excluding the Big Wicket Matches in Previous Year) |
| 29 | Previous Average Index | (No. Runs Given in Previous Year) / (No. Wickets Taken in Previous Year) |
| 30 | Previous Maiden Index | No. Maiden Overs Bowled in Previous Year |
| 31 | IPL Consistency Index | (No. Runs Given in IPL) / (No. Overs Bowled in IPL) |
| 32 | IPL Attacking Index | (No. Balls Bowled in IPL) / (No. Wickets Taken in IPL) |
| 33 | IPL Experience Index | No. Matches Played in IPL |
| 34 | IPL Wickets Index | No. Wickets Taken in IPL |
| 35 | IPL Big Wickets Index | (No. Times 4 or 5 Wickets Taken in IPL) / (No. Matches Played in IPL) |
| 36 | IPL Short Performance Index | (No. Wickets Taken Excluding the Big Wickets in IPL) / (No. Matches Played Excluding the Big Wicket Matches in IPL) |
| 37 | IPL Average Index | (No. Runs Given in IPL) / (No. Wickets Taken in IPL) |
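
The definitions in Tables 1 and 2 are simple ratios of raw counts, so they are straightforward to compute once the counts are available. The following minimal sketch computes six of the batting indices from Table 1. The `BattingTotals` class, its field names, the `ratio` helper and the example numbers are hypothetical and are not taken from the paper. Reporting a ratio with a zero denominator as 0.0 is also an assumption, since the paper does not say how such cases are handled.

```python
from dataclasses import dataclass


@dataclass
class BattingTotals:
    """Raw counts for one batsman over one dataset (hypothetical field names)."""
    runs: int
    balls_faced: int
    innings: int
    not_outs: int
    hundreds: int
    fifties: int
    boundaries: int      # balls hit for four or six
    boundary_runs: int   # runs scored off those balls


def ratio(numerator, denominator):
    # Assumption: a ratio with a zero denominator is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def batting_indices(t):
    """Compute six of the Table 1 indices from raw counts."""
    dismissals = t.innings - t.not_outs  # innings in which he got out
    return {
        "Consistency Index": ratio(t.runs, dismissals),
        "Attacking Index": ratio(t.runs, t.balls_faced) * 100,
        "Big Score Index": ratio(t.hundreds + t.fifties, t.innings),
        "Power Hitter Index": ratio(t.boundaries, t.balls_faced),
        "Finishing Index": ratio(t.not_outs, t.innings),
        "Strike Rotating Index": ratio(t.runs - t.boundary_runs,
                                       t.balls_faced - t.boundaries),
    }


# Made-up counts, used only to exercise the formulas.
example = BattingTotals(runs=500, balls_faced=400, innings=20, not_outs=4,
                        hundreds=0, fifties=4, boundaries=60, boundary_runs=280)
indices = batting_indices(example)
```

For these example counts the Consistency Index is 500 / 16 = 31.25 and the Attacking Index is 125.0. The Previous Year and IPL variants in the table would use the same functions applied to counts restricted to the corresponding dataset.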


Table 3. Cluster wise features and their corresponding weights (batsmen)

| Feature | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 |
|---|---|---|---|---|---|---|
| Consistency Index | 0.056 | 0.856 | 0 | 0.263 | 0.0005 | 0.203 |
| Attacking Index | 0.005 | 0.813 | 0 | -0.030 | -0.003 | -0.051 |
| Experience Index | 0.082 | 0.608 | 0 | 0.112 | 0.137 | 0.378 |
| Runs Index | 0.120 | 0.849 | 0 | 0.296 | 0.250 | 0.424 |
| Big Score Index | 0.061 | 0 | 0 | 0.293 | 0.003 | 0.171 |
| Power Hitter Index | -0.004 | 0.831 | 0 | -0.035 | -0.024 | -0.021 |
| Finishing Index | 0.0009 | 0.038 | 0 | 0.002 | -0.063 | -0.066 |
| Strike Rotating Index | -0.018 | 0.340 | 0 | -0.057 | -0.017 | -0.0255 |
| Winning Consistency Index | 0.070 | 0.831 | 0 | 0.097 | -0.005 | 0.128 |
| Winning Attacking Index | 0.0009 | 0.831 | 0 | -0.036 | -0.024 | -0.040 |
| Losing Consistency Index | 0.007 | 0.074 | 0 | 0.148 | -0.006 | 0.043 |
| Losing Attacking Index | -0.009 | 0.074 | 0 | 0.004 | -0.008 | -0.049 |
| Pressure Consistency Index | 0.052 | 0 | 0 | 0.248 | 0 | 0.151 |
| Pressure Attacking Index | -0.007 | 0 | 0 | -0.044 | 0 | 0.303 |
| Night Consistency Index | 0.041 | 0.039 | 0 | 0.078 | 0.027 | 0.047 |
| Night Attacking Index | 0.010 | 0.039 | 0 | -0.032 | -0.023 | -0.050 |
| Target Setting Consistency Index | 0.031 | 0 | 0 | 0.247 | 0.012 | 0.109 |
| Target Setting Attacking Index | 0.014 | 0 | 0 | 0.010 | -0.006 | -0.073 |
| Target Chasing Consistency Index | 0.044 | 0.866 | 0 | -0.032 | 0.037 | 0.084 |
| Target Chasing Attacking Index | -0.003 | 0.694 | 0 | -0.074 | 0.008 | -0.070 |
| Asian Pitch Consistency Index | 0.072 | 0.039 | 0 | 0.257 | -0.050 | -0.049 |
| Asian Pitch Attacking Index | 0.008 | 0.039 | 0 | -0.0006 | -0.049 | -0.076 |
| Previous Consistency Index | 0.038 | 0.039 | 0 | 0 | -0.002 | 0.217 |
| Previous Attacking Index | 0.037 | 0.039 | 0 | 0 | 0.0002 | 0.060 |
| Previous Experience Index | -0.001 | 0.062 | 0 | 0 | -0.023 | 0.218 |
| Previous Runs Index | 0.022 | 0.039 | 0 | 0 | 0.011 | 0.353 |
| Previous Big Score Index | 0.013 | 0 | 0 | 0 | 0 | 0.347 |
| Previous Power Hitter Index | 0.035 | 0 | 0 | 0 | 0.012 | 0.096 |
| Previous Finishing Index | -0.006 | 0 | 0 | 0 | -0.054 | -0.062 |
| Previous Strike Rotating Index | 0.019 | 0.039 | 0 | 0 | -0.036 | -0.068 |
| IPL Consistency Index | -0.013 | -0.016 | 0 | 0.160 | 0.087 | 0 |
| IPL Attacking Index | 0.016 | -0.109 | 0 | -0.042 | -0.026 | 0 |
| IPL Experience Index | -0.017 | 0.070 | 0 | 0.080 | -0.041 | -0.072 |
| IPL Runs Index | -0.013 | 0.111 | 0 | -0.025 | 0.050 | 0 |
| IPL Big Score Index | -0.010 | -0.030 | 0 | 0.0195 | 0.152 | 0 |
| IPL Power Hitter Index | 0.0002 | -0.061 | 0 | -0.050 | 0.0003 | 0 |
| IPL Finishing Index | -0.011 | 0.013 | 0 | -0.103 | -0.023 | 0 |
| IPL Strike Rotating Index | -0.015 | 0.043 | 0 | -0.104 | -0.026 | 0 |
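
The Cluster Based Index defined in the team selection section is a weighted sum of a player's feature values using the weights of that player's cluster. The sketch below illustrates this with four of the Cluster 1 weights from Table 3. The two players and their feature values are hypothetical. Scaling the feature values to [0, 1] before taking the weighted sum is also an assumption: the paper states this normalization for ReliefF but does not say whether it is applied when computing the CBI.

```python
def cbi(features, weights):
    """Cluster Based Index: sum over features of value times cluster weight."""
    return sum(value * weights.get(name, 0.0) for name, value in features.items())


# Four of the Cluster 1 weights listed in Table 3.
CLUSTER_1_WEIGHTS = {
    "Consistency Index": 0.056,
    "Experience Index": 0.082,
    "Runs Index": 0.120,
    "Big Score Index": 0.061,
}

# Hypothetical players; feature values assumed already scaled to [0, 1].
players = {
    "Batsman A": {"Consistency Index": 0.8, "Experience Index": 0.5,
                  "Runs Index": 0.6, "Big Score Index": 0.4},
    "Batsman B": {"Consistency Index": 0.6, "Experience Index": 0.9,
                  "Runs Index": 0.7, "Big Score Index": 0.2},
}

scores = {name: cbi(feats, CLUSTER_1_WEIGHTS) for name, feats in players.items()}
# Rank players within the cluster from highest to lowest CBI.
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these made-up values, Batsman B scores about 0.2036 against about 0.1822 for Batsman A and is therefore ranked first within the cluster. A full computation would use all 38 weights of the player's cluster.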

Table 4. Cluster wise features and their corresponding weights

| Feature | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 |
|---|---|---|---|---|---|---|---|
| Consistency Index | -0.131 | -0.019 | -0.018 | 0.0005 | 0.985 | -0.052 | 0 |
| Attacking Index | -0.031 | 0.037 | 0.087 | 0.004 | 1 | 0.097 | 0 |
| Experience Index | -0.095 | 0.173 | 0.188 | 0.158 | 0.986 | 0.083 | 0 |
| Wickets Index | 0.650 | 0.212 | 0.206 | 0.231 | 1 | 0.203 | 0 |
| Big Wickets Index | 0.942 | 0.015 | -0.074 | 0.115 | 0 | 0.129 | 0 |
| Short Performance Index | 0.302 | -0.019 | 0.045 | 0.079 | 1 | 0.014 | 0 |
| Average Index | -0.078 | 0.019 | 0.070 | 0.001 | 1 | 0.023 | 0 |
| Maiden Index | 0 | 0.087 | 0.044 | 0.172 | 0 | 0.003 | 0 |
| Winning Economy Index | 0.313 | -0.010 | -0.045 | -0.052 | 1 | -0.038 | 0 |
| Winning SR Index | -0.088 | -0.001 | 0.076 | -0.003 | 1 | 0.112 | 0 |
| Loosing Economy Index | -0.048 | 0.002 | 0.008 | -0.008 | -0.034 | -0.034 | 0 |
| Loosing SR Index | -0.118 | 0.008 | -0.027 | -0.011 | 0 | -0.057 | 0 |
| Big Match Economy Index | 0 | -0.039 | -0.071 | 0.068 | 0 | -0.045 | 0 |
| Big Match SR Index | 0 | -0.039 | 0.089 | 0.006 | 0 | -0.023 | 0 |
| Night Economy Index | -0.150 | 0.008 | 0.014 | 0.020 | 0 | -0.032 | 0 |
| Night SR Index | -0.121 | 0.017 | 0.074 | 0.058 | 0 | 0.011 | 0 |
| Target Defending Economy | 0.050 | -0.004 | 0.028 | -0.041 | 0 | -0.049 | 0 |
| Target Defending SR | -0.144 | 0.024 | 0.077 | 0.014 | 0 | 0.199 | 0 |
| Target Restricting Economy | -0.098 | 0.016 | -0.039 | -0.035 | -0.034 | -0.018 | 0 |
| Target Restricting SR | -0.104 | 0.019 | -0.051 | -0.038 | 0 | 0.016 | 0 |
| Asia Economy Index | -0.151 | -0.003 | -0.002 | -0.019 | 0.980 | -0.049 | 0 |
| Asia SR Index | -0.013 | -0.017 | 0.080 | -0.018 | 1 | 0.029 | 0 |
| Previous Consistency Index | -0.043 | 0.006 | -0.010 | 0.095 | 0.474 | 0.034 | 0 |
| Previous Attacking Index | -0.119 | -0.006 | 0.055 | 0 | 0.408 | 0.007 | 0 |
| Previous Experience Index | -0.093 | 0.034 | 0.016 | 0.095 | 0.408 | -0.012 | 0 |
| Previous Wickets Index | -0.119 | 0.018 | 0.034 | 0 | 0.864 | 0.0008 | 0 |
| Previous Big Wickets Index | 0 | -0.033 | -0.074 | 0 | 0 | -0.013 | 0 |
| Previous Short Performance Index | -0.119 | -0.002 | -0.030 | 0 | 0.932 | -0.018 | 0 |
| Previous Average Index | -0.119 | -0.004 | 0.046 | 0 | 0.241 | 0.047 | 0 |
| Previous Maiden Index | 0 | 0 | -0.027 | 0 | 0 | 0.093 | 0 |
| IPL Consistency Index | 0.195 | -0.081 | -0.044 | -0.056 | -0.071 | -0.039 | 0 |
| IPL Attacking Index | -0.116 | -0.011 | -0.058 | -0.043 | -0.020 | -0.021 | 0 |
| IPL Experience Index | -0.112 | -0.042 | 0.032 | -0.020 | 0.004 | 0.035 | 0 |
| IPL Wickets Index | -0.110 | -0.056 | 0.055 | -0.019 | 0.013 | 0.017 | 0 |
| IPL Big Wickets Index | -0.145 | -0.035 | 0.006 | -0.027 | 0.709 | -0.035 | 0 |
| IPL Short Performance Index | 0.104 | -0.047 | 0.034 | -0.007 | 0.043 | -0.042 | 0 |
| IPL Average Index | -0.012 | -0.015 | -0.064 | -0.050 | -0.028 | -0.042 | 0 |


Table 5. Team-wise batsmen and their CBI

Cluster 1: JPDuminy (DD, CBI=0.44), DeKock (DD, CBI=0.28), McCullum (GLR, CBI=0.48), A.Finch (GLR, CBI=0.38), S.Raina (GLR, CBI=0.28), DJBravo (GLR, CBI=0.26), S.A.Hasan (KKR, CBI=0.30), A.Russell (KKR, CBI=0.19), R.Uthappa (KKR, CBI=0.18), D.Miller (KXIP, CBI=0.27), Maxwell (KXIP, CBI=0.24), R.Sharma (MI, CBI=0.35), Simmons (MI, CBI=0.34), K.Pollard (MI, CBI=0.21), V.Kohli (RCB, CBI=0.62), C.Gayle (RCB, CBI=0.43), S.Watson (RCB, CBI=0.37), DeVilliers (RCB, CBI=0.32), F.DuPlesis (RPS, CBI=0.36), MSDhoni (RPS, CBI=0.34), T.Perera (RPS, CBI=0.26), S.Smith (RPS, CBI=0.20), A.Rahane (RPS, CBI=0.16), E.Morgan (SRH, CBI=0.36), D.Warner (SRH, CBI=0.36), Williamson (SRH, CBI=0.33), Yuvraj (SRH, CBI=0.31), S.Dhawan (SRH, CBI=0.19)

Cluster 2: M.Aggarwal (DD, CBI=0.04), K.Nair (DD, CBI=0.07), P.Negi (DD, CBI=0.08), S.Iyer (DD, CBI=0.09), J.Holder (KKR, CBI=0.67), SKYadav (KKR, CBI=0.06), R.Sathish (KKR, CBI=0.03), W.Saha (KXIP, CBI=0.01), M.Vohra (KXIP, CBI=-0.06), GS Mann (KXIP, CBI=-0.06), A.Rayudu (MI, CBI=0.73), U.Chand (MI, CBI=-0.05), Mandeep (RCB, CBI=-0.03), KL Rahul (RCB, CBI=-0.03), Sarfaraz (RCB, CBI=0.07), S.Tiwary (RPS, CBI=0.0008), R.Bhatia (RPS, CBI=-0.01), N.Ojha (SRH, CBI=1.18), K.Sharma (SRH, CBI=-0.02), B.Sharma (SRH, CBI=-0.05), A.Tare (SRH, CBI=-0.07), D.Hooda (SRH, CBI=-0.09)

Cluster 3: R.Pant (DD, CBI=0), A.Nath (GLR, CBI=0), E.Dwivedi (GLR, CBI=0), I.Kishan (GLR, CBI=0), N.Naik (KXIP, CBI=0), K.Pandya (MI, CBI=0), N.Rana (MI, CBI=0), S.Baby (RCB, CBI=0), Handscomb (RPS, CBI=0)

Cluster 4: D.Smith (GLR, CBI=0.38), D.Karthik (GLR, CBI=0.03), G.Gambhir (KKR, CBI=0.66), Y.Pathan (KKR, CBI=0.09), K.Peterson (RPS, CBI=0.93), G.Bailey (RPS, CBI=0.27), I.Pathan (RPS, CBI=0.20), A.Morkel (RPS, CBI=0.15)

Cluster 5: S.Samson (DD, CBI=-0.0009), C.Morris (DD, CBI=-0.19), R.Jadeja (GLR, CBI=-0.11), J.Faulkner (GLR, CBI=-0.13), M.Pandey (KKR, CBI=0.02), C.Lynn (KKR, CBI=0.07), S.Marsh (KXIP, CBI=0.12), M.Vijay (KXIP, CBI=0.01), A.Patel (KXIP, CBI=-0.12), P.Patel (MI, CBI=-0.02), H.Pandya (MI, CBI=-0.09), Jadhav (RCB, CBI=-0.08), S.Binny (RCB, CBI=-0.11), D.Wiese (RCB, CBI=-0.17), M.Marsh (RPS, CBI=-0.13), Henriques (SRH, CBI=-0.04), B.Cutting (SRH, CBI=-0.08)

Cluster 6: S.Billings (DD, CBI=0.23), Braithwet (DD, CBI=-0.06), C.Munro (KKR, CBI=0.86), Hastings (KKR, CBI=-0.13), H.Amla (KXIP, CBI=1.25), Behradien (KXIP, CBI=0.25), M.Stoinis (KXIP, CBI=-0.20), M.Guptil (MI, CBI=1.698), J.Butler (MI, CBI=1.096), T.Head (RCB, CBI=0.01), U.Khwaja (RPS, CBI=0.42)


Table 6. Team-wise bowlers and their CBI

Cluster 1: M.Stoinis (KXIP, CBI=0.31), L.Simmons (MI, CBI=1.45), R.Sharma (MI, CBI=0.32), S.Binny (RCB, CBI=0.02), Richardson (RCB, CBI=0.26), S.Boland (RPS, CBI=0.17), K.Sharma (SRH, CBI=0.05), B.Cutting (SRH, CBI=0.16), Williamson (SRH, CBI=-0.27)

Cluster 2: CoulterNile (DD, CBI=0.04), C.Morris (DD, CBI=0.01), J.Faulkner (GLR, CBI=0.04), Hastings (KKR, CBI=0), J.Holder (KKR, CBI=0.03), U.Yadav (KKR, CBI=0.14), K.Abott (KXIP, CBI=0.19), A.Patel (KXIP, CBI=0.03), T.Southee (MI, CBI=0.23), K.Pollard (MI, CBI=0.008), D.Wiese (RCB, CBI=0.05), S.Arvind (RCB, CBI=0.13), A.Zampa (RPS, CBI=0.06), M.Marsh (RPS, CBI=-0.05), Rehman (SRH, CBI=0.15), Yuvraj (SRH, CBI=0.06), T.Boult (SRH, CBI=0.02)

Cluster 3: Braithwet (DD, CBI=0.5), Duminy (DD, CBI=0.21), S.Raina (GLR, CBI=0.21), A.Russell (KKR, CBI=0.16), M.Sharma (KXIP, CBI=0.21), McCleneghan (MI, CBI=0.32), J.Bumrah (MI, CBI=0.20), H.Pandya (MI, CBI=0.06), A.Milne (RCB, CBI=0.21), C.Jordan (RCB, CBI=0.19), C.Gayle (RCB, CBI=0.16), T.Perera (RPS, CBI=0.27)

Cluster 4: A.Mishra (DD, CBI=0.05), Z.Khan (DD, CBI=0.004), M.Shami (DD, CBI=0.04), P.Kumar (GLR, CBI=0.03), D.Smith (GLR, CBI=0.07), M.Morkel (KKR, CBI=0.37), Y.Pathan (KKR, CBI=0.01), B.Hogg (KKR, CBI=0.02), P.Chawla (KKR, CBI=0.10), Johnson (KXIP, CBI=0.23), V.Kumar (MI, CBI=0.05), I.Pathan (RPS, CBI=0.11), A.Dinda (RPS, CBI=0.05), RP Singh (RPS, CBI=0.05), I.Sharma (RPS, CBI=-0.1), Henriques (SRH, CBI=-0.09)

Cluster 5: P.Negi (DD, CBI=3.7), J.Yadav (DD, CBI=0.09), S.Nadeem (DD, CBI=0.02), S.Jakati (GLR, CBI=0.26), P.Tambe (GLR, CBI=0.21), D.Kulkarni (GLR, CBI=0.006), P.Sangwan (GLR, CBI=0.01), S.Ladda (GLR, CBI=0.03), Unadkat (KKR, CBI=0.006), A.Rajpoot (KKR, CBI=0.02), R.Sathish (KKR, CBI=0.05), S.Sharma (KXIP, CBI=1.15), GSMann (KXIP, CBI=0.01), Anureet (KXIP, CBI=0.02), R.Dhawan (KXIP, CBI=0.02), Cariappa (KXIP, CBI=0.02), S.Gopal (MI, CBI=-0.005), J.Suchith (MI, CBI=-0.02), Y.Chahal (RCB, CBI=0.008), I.Abdulla (KKR, CBI=0.011), Mandeep (RCB, CBI=-0.01), H.Patel (RCB, CBI=-0.01), V.Aaron (RCB, CBI=-0.01), P.Rasool (RCB, CBI=-0.03), F.DuPlesis (RPS, CBI=0.28), R.Bhatia (RPS, CBI=0.085), A.Sharma (RPS, CBI=-0.02), A.Reddy (SRH, CBI=-0.005), B.Sharma (SRH, CBI=-0.01), D.Hooda (SRH, CBI=-0.07)

Cluster 6: I.Tahir (DD, CBI=0.06), D.Steyn (GLR, CBI=0.02), R.Jadeja (GLR, CBI=0.10), DJBravo (GLR, CBI=0.09), S.Hasan (KKR, CBI=0.20), S.Narine (KKR, CBI=0.12), Harbhajan (MI, CBI=0.15), S.Watson (RCB, CBI=0.11), R.Ashwin (RPS, CBI=0.20), A.Morkel (RPS, CBI=0.01), B.Kumar (SRH, CBI=0.09), A.Nehra (SRH, CBI=0.068)

Cluster 7: A.Nath (GLR, CBI=0), S.Kaushik (GLR, CBI=0), K.Yadav (KKR, CBI=0), M.Vijay (KXIP, CBI=0), P.Sahu (KXIP, CBI=0), S.Singh (KXIP, CBI=0), K.Pandya (MI, CBI=0), T.Shamsi (RCB, CBI=0), M.Ashwin (RPS, CBI=0), D.Chahar (RPS, CBI=0), B.Sran (SRH, CBI=0)

Table 7. Playing XI for each team

DD: De Kock, M.Agarwal, S.Samson, K.Nair, JP Duminy, R.Pant, C.Morris, A.Mishra, M.Shami, Zaheer, I.Tahir

GLR: B.McCullum, A.Finch, S.Raina, D.Karthik, DJ Bravo, R.Jadeja, J.Faulkner, P.Kumar, D.Kulkarni, S.Jakati, P.Tambe

KKR: G.Gambhir, R.Uthappa, M.Pandey, C.Lynn, Suryakumar, Y.Pathan, A.Russel, SA Hassan, S.Narine, P.Chawla, J.Unadkat

KXIP: M.Vijay, S.Marsh, G.Maxwell, D.Miller, W.Saha, N.Naik, Axar Patel, Anureet, M.Johnson, R.Dhawan, S.Sharma

MI: P.Patel, L.Simmons, N.Rana, R.Sharma, K.Pollard, K.Pandya, H.Pandya, T.Southee, R.Vinay Kumar, M.McCleneghan, Harbhajan

RCB: C.Gayle, , AB de Villiers, S.Watson, K.Jadhav, S.Binny, D.Wiese, H.Patel, Y.Chahal, I.Abdulla, V.Aaron

RPS: A.Rahane, F.Du Plesis, S.Smith, S.Tiwary, MS Dhoni, T.Perera, M.Marsh, I.Pathan, R.Ashwin, A.Dinda, RP Singh

SRH: D.Warner, S.Dhawan, K.Williamson, E.Morgan, Yuvraj, M.Henriques, N.Ojha, B.Sharma, B.Kumar, A.Reddy, A.Nehra


Table 8. Match results prediction

| Match No. | Teams | Winner | Prediction |
|---|---|---|---|
| 1 | MI (7) vs RPS (7) | RPS | True |
| 2 | DD (7) vs KKR (7) | KKR | True |
| 3 | KXIP (7) vs GLR (8) | GLR | True |
| 4 | RCB (8) vs SRH (8) | RCB | True |
| 5 | KKR (7) vs MI (7) | MI | True |
| 6 | GLR (11) vs RPS (7) | GLR | True |
| 7 | DD (7) vs KXIP (8) | DD | False |
| 8 | SRH (7) vs KKR (9) | KKR | True |
| 9 | MI (8) vs GLR (11) | GLR | True |
| 10 | KXIP (7) vs RPS (7) | KXIP | True |
| 11 | RCB (8) vs DD (8) | DD | True |
| 12 | SRH (8) vs MI (7) | SRH | True |
| 13 | KXIP (7) vs KKR (9) | KKR | True |
| 14 | MI (8) vs RCB (7) | MI | True |
| 15 | GLR (10) vs SRH (7) | SRH | False |
| 16 | RCB (6) vs RPS (6) | RCB | True |
| 17 | DD (9) vs MI (8) | DD | True |
| 18 | KXIP (8) vs SRH (7) | SRH | False |
| 19 | RCB (7) vs GLR (10) | GLR | True |
| 20 | RPS (7) vs KKR (8) | KKR | True |
| 21 | MI (8) vs KXIP (8) | MI | True |
| 22 | SRH (8) vs RPS (9) | SRH | True |
| 23 | GLR (9) vs DD (9) | GLR | True |
| 24 | KKR (9) vs MI (8) | MI | False |
| 25 | RPS (7) vs GLR (8) | GLR | True |
| 26 | DD (8) vs KKR (7) | DD | True |
| 27 | SRH (8) vs RCB (6) | SRH | True |
| 28 | KXIP (7) vs GLR (8) | KXIP | False |
| 29 | RPS (7) vs MI (8) | MI | True |
| 30 | RCB (8) vs KKR (9) | KKR | True |
| 31 | GLR (8) vs DD (9) | DD | True |
| 32 | KKR (8) vs KXIP (6) | KKR | True |
| 33 | DD (7) vs RPS (6) | RPS | False |
| 34 | GLR (8) vs SRH (8) | SRH | True |
| 35 | RPS (7) vs RCB (6) | RCB | False |
| 36 | KXIP (6) vs DD (8) | KXIP | False |
| 37 | SRH (8) vs MI (8) | SRH | True |
| 38 | KKR (8) vs GLR (9) | GLR | True |
| 39 | RCB (9) vs KXIP (6) | RCB | True |
| 40 | SRH (8) vs RPS (7) | SRH | True |
| 41 | RCB (7) vs MI (8) | MI | True |
| 42 | SRH (8) vs DD (9) | DD | True |
| 43 | MI (7) vs KXIP (6) | KXIP | False |
| 44 | RCB (7) vs GLR (9) | RCB | False |
| 45 | RPS (7) vs KKR (9) | KKR | True |
| 46 | KXIP (7) vs SRH (7) | SRH | True |
| 47 | MI (7) vs DD (8) | MI | False |
| 48 | KKR (9) vs RCB (7) | RCB | False |
| 49 | DD (10) vs RPS (7) | RPS | False |
| 50 | RCB (7) vs KXIP (5) | RCB | True |
| 51 | KKR (8) vs GLR (9) | GLR | True |
| 52 | SRH (8) vs DD (7) | DD | False |
| 53 | KXIP (6) vs RPS (7) | RPS | True |
| 54 | MI (8) vs GLR (9) | GLR | True |
| 55 | KKR (7) vs SRH (7) | KKR | True |
| 56 | DD (7) vs RCB (7) | RCB | True |
| 57 | GLR (9) vs RCB (7) | RCB | False |
| 58 | SRH (7) vs KKR (6) | SRH | True |
| 59 | GLR (8) vs SRH (7) | SRH | False |
| 60 | SRH (7) vs RCB (7) | SRH | True |
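The Prediction column of Table 8 can be tallied to check the overall success rate. The following is a minimal Python sketch, not the authors' code: it assumes the 16 match numbers listed are exactly those marked False in Table 8, and the names `FALSE_MATCHES`, `TOTAL_MATCHES` and `prediction_accuracy` are illustrative only.

```python
# Match numbers marked "False" in the Prediction column of Table 8
# (transcribed by hand from the table; an assumption of this sketch).
FALSE_MATCHES = {7, 15, 18, 24, 28, 33, 35, 36, 43, 44, 47, 48, 49, 52, 57, 59}
TOTAL_MATCHES = 60  # Table 8 lists matches 1 to 60


def prediction_accuracy(false_matches, total):
    """Return (number of True predictions, accuracy as a percentage)."""
    correct = total - len(false_matches)
    return correct, 100.0 * correct / total


correct, accuracy = prediction_accuracy(FALSE_MATCHES, TOTAL_MATCHES)
print(f"{correct}/{TOTAL_MATCHES} predictions marked True = {accuracy:.1f}%")
```

With these inputs the tally is 44 of 60 matches, i.e. 73.3 %, which agrees with the proportion quoted in the abstract.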
