Offensive Direction Inference in Real-World Football Video
Total Page:16
File Type:pdf, Size:1020Kb
AN ABSTRACT OF THE THESIS OF Qingkai Lu for the degree of Master of Science in Computer Science presented on June 10, 2015. Title: Offensive Direction Inference in Real-World Football Video Abstract approved: Alan P. Fern Automatic analysis of American football videos can help teams develop strategies and ex tract patterns with less human effort. In this work, we focus on the problem of automat ically determining which team is on offense/defense, which is an important subproblem for higher-level analysis. While seemingly mundane, this problem is quite challenging when the source of football video is relatively unconstrained, which is the situation we face. Our football videos are collected from a web-service used by more than 13; 000 high school, college, and professional football teams. The videos display huge variation in camera viewpoint, lighting conditions, football field properties, camera work, among many other factors. These factors taken together make standard off-the-shelf computer vision algorithms ineffective. The main contribution of this thesis is to design and eval uate two novel approaches for the offense/defense classification problem from raw video, which make minimal assumptions about the video properties. Our empirical evalua tion on approximately 1200 videos of football plays from 10 diverse games validate the effectiveness of the approaches and highlight their differences. ©c Copyright by Qingkai Lu June 10, 2015 All Rights Reserved Offensive Direction Inference in Real-World Football Video by Qingkai Lu A THESIS submitted to Oregon State University in partial fulfillment of the requirements for the degree of Master of Science Presented June 10, 2015 Commencement June 2015 Master of Science thesis of Qingkai Lu presented on June 10, 2015. APPROVED: Major Professor, representing Computer Science Director of the School of Electrical Engineering and Computer Science Dean of the Graduate School I understand that my thesis will become part of the permanent collection of Oregon State University libraries. My signature below authorizes release of my thesis to any reader upon request. Qingkai Lu, Author ACKNOWLEDGEMENTS Firstly, I would like to express my appreciation to my advisor Prof. Alan Fern, who guided me through my master studies with great support and help. I would like to thank you for giving me the opportunity to work on this interesting research project and providing me precious suggestions to learn and improve. I also would like to appreciate Prof Todorovic’s help in my research project. Secondly, I would like to thank the rest of my thesis committee members: Prof. Borello, Prof. Bailey and Prof. Tadepalli. I appreciate your support and time. Thirdly, I would like to thank our group members(especially Sheng), who helped me a lot in this project. Last but not the least, I would like to thank my family and friends, especially my parents and two sisters, who has been supporting and encouraging me all the time. TABLE OF CONTENTS Page 1 Introduction 1 1.1 Motivation .................................... 1 1.2 Background and Challenges ........................... 2 1.3 Problem Statement and Overview of Our Approaches . 4 2 Related Work 5 2.1 Activity Recognition ............................... 5 2.2 American Football Video Analysis ....................... 5 3 Approaches 7 3.1 KLT Trajectories Analysis ............................ 8 3.1.1 Motivation and Overview ....................... 8 3.1.2 KLT Trajectories Generation and Pre-processing . 9 3.1.3 LOS and Wide Receivers Detection . 10 3.1.4 Offensive Direction Inference ..................... 11 3.2 Spatial Pyramid Matching ............................ 12 3.2.1 Introduction of Spatial Pyramid Match Kernels . 12 3.2.2 Motivation and Overview ....................... 15 3.2.3 Players Foreground Generation .................... 16 3.2.4 Registration .............................. 17 3.2.5 LOS Detection ............................. 18 3.2.6 Feature Extraction ........................... 20 3.2.7 Classifiers Training .......................... 21 3.3 A Method Based on the Spatial Distribution of Players . 23 4 Experiments and Results 25 4.1 Data-set ...................................... 25 4.2 Quantitative Results ............................... 26 4.2.1 Accuracy of Different Methods .................... 26 4.2.2 Error Sources Analysis ........................ 28 4.2.3 LOS Detection Evaluation ...................... 29 4.2.4 SVM with Different Kernels ..................... 30 4.2.5 KNN .................................. 32 TABLE OF CONTENTS (Continued) Page 4.3 Running Time Analysis of Different Methods . 32 5 Conclusion 34 Bibliography 34 LIST OF FIGURES Figure Page 1.1 Mos frames of 6 football plays which show the huge huge diversities of our football videos. ................................. 3 3.1 Red bounding boxes represent DPM player detections. There are three problems here: the player in the yellow box is missing, the referee in the blue box is detected as a player, and two close players in the green box are detected as one player. 8 3.2 Two frames showing two wide receivers (in red boxes) running from right to left. ...................................... 9 3.3 Pipeline of KLT tracks analysis method. 9 3.4 Plot of one video’s KLT trajectories with LOS bounding box (purple rect angle) and ranges for WRs (green rectangles). Red (blue) vectors are the KLT trajectories outside (inside) the range of the WRs. 11 3.5 A simple example of a three level pyramid[18]. Three types of features are represented by circles, crosses and diamonds respectively. Firstly, the image is subdivided into 3 levels. Secondly, count the number of features inside each grid to construct the histogram. Finally, compute the weight of different resolution levels using equation 3.2 . 15 3.6 Pipeline of spatial pyramid matching method. 16 3.7 The left image is the MOS frame of a play. The right image is the fore ground of the left image. In the right image, the yellow pixels represent the players foreground and the green pixels represent the background. 17 3.8 The left image is the registration result of the MOS frame of Fig 3.7. The right image is the registration result of the foreground of Fig 3.7. The blue bounding boxes represent the LOS detection result in both images. The blue dots inside the LOS bounding box are the detected LOS center. 18 3.9 The left image is the registered foreground of the play p. The LOS de tection of p using foreground can be seen from the middle image. The LOS detection of p using both foreground and color is shown in the right image. The blue rectangle represents the LOS bounding box in both the middle and right image. ............................ 20 LIST OF FIGURES (Continued) Figure Page 3.10 An example of our four-level spatial pyramid. Each level is represented by one image. Red rectangles represent the spatial pyramid grids. Blue rectangle is the LOS bounding box. Blue dot is the LOS center. 21 3.11 An example of football formation from Wikipedia (http : ==en:wikipedia:org=wiki=Americanf ootbal lrules). Red O symbols represent offensive players, blue X symbols represent de fensive players. In this diagram, the offense is in the I formation and the defense is in the 4-3 formation, which are both common formations. 24 3.12 The MOS frame of one football play in our dataset. The left yellow team is on defense and the right white team is on offense. 24 4.1 Three error cases of KLT trajectories analysis. For each image of a cer tain play, purple rectangle is the LOS bounding box, green rectangles are the ranges for WRs, red (blue) vectors are the KLT trajectories outside (inside) the range of the WRs. From left to right, the main error sources for these 3 cases are: wrong MOS detection and large camera motions, inexact low WR range detection, and LOS detection failure, respectively. 28 4.2 Three error cases of spatial pyramid matching. In each image, red rectan gles represent the spatial pyramid cells, blue rectangle is the LOS bound ing box, blue dot is the LOS center. From left to right, the main error sources for these 3 cases are: wrong LOS detection caused by both the field logo and wrong MOS detection, missing of players and false posi tive foreground generated by the field logo, and wrong MOS detection, respectively. .................................. 29 4.3 One error case of the players spatial distribution method, which is due to: i) the difference of the variance between the players spatial distribution of offensive and defensive team is relatively small, ii)false positive player foreground generated by field logos/boundaries/markings. 29 4.4 LOS center detection accuracy for KLT tracks analysis with different offset pixel distance. 30 4.5 LOS center detection accuracy for spatial pyraimd matching with different offset pixel distance. .............................. 31 LIST OF TABLES Table Page 4.1 Accuracy of different methods ......................... 27 4.2 Leave-one-play-out accuracy of SVM with different kernels. 31 4.3 Leave-one-out accuracy of of KNN with different K ............. 32 LIST OF ALGORITHMS Algorithm Page Chapter 1: Introduction 1.1 Motivation Activity recognition is an important research area in the field of computer vision, which aims at automatically analysing ongoing events and their contexts from video data [29]. Various state-of-the-art techniques have been developed for these prevalent ac tivity recognition applications, such as security surveillance, patient monitoring and human-computer interactions. However, the activity recognition in sports domain is still under-served. American football is the most popular sport in the United States. The game planning of American football teams needs a lot of annotation and analysis of videos of their own and opponent games. However, the existing services for football game planning still involve large amount of manual management, annotation, and analysis of videos, since they usually provide only essential user interface functionalities. Even only for one game, video annotation needs to be done repetitively for an average of around 150 plays.