Video Transcoding Using Machine Learning
Total Page:16
File Type:pdf, Size:1020Kb
VIDEO TRANSCODING USING MACHINE LEARNING by Christopher Holder A Thesis Submitted to the Faculty of The College of Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of Master of Science Florida Atlantic University Boca Raton, Florida December 2008 ACKNOWLEDGMENTS I would like to thank Dr. Kalva for being an awesome advisor, professor and friend. Also I would like to thank anyone who reads this work. I would like to also thank Paula for printing the thesis. iii ABSTRACT Author: Christopher Holder Title: Video Transcoding Using Machine Learning Institution: Florida Atlantic University Thesis Advisor: Dr. Hari Kalva Degree: Master of Science Year: 2008 The field of Video Transcoding has been evolving throughout the past ten years. The need for transcoding of video files has greatly increased because of the new upcoming standards which are incompatible with old ones. This thesis takes the method of using machine learning for video transcoding mode decisions and discusses ways to improve the process of generating the algorithm for implementation in different video transcoders. The transcoding methods used decrease the complexity in the mode decision inside the video encoder. Also methods which automate and improve results are discussed and implemented in two different sets of transcoders: H.263 to VP6 , and MPEG-2 to H.264. Both of these transcoders have shown a complexity loss of almost 50%. Video transcoding is important because the quantity of video standards have been increasing while devices usually can only decode one specific codec. iv TABLE OF CONTENT TABLES ..........................................................................................................................viii FIGURES........................................................................................................................... ix Chapter 1. INTRODUCTION............................................................................................. 1 1.1 Overview and Motivation ......................................................................................... 1 1.2 Video Standards........................................................................................................ 1 1.2.1 H.263.................................................................................................................. 1 1.2.2 VP6 standard codec............................................................................................ 2 1.2.3 MPEG-2 standard codec .................................................................................... 2 1.2.4 H.264 standard codec......................................................................................... 3 1.2.5 Comparison of Video Standards ........................................................................ 4 1.4 Design of a Video Decoder....................................................................................... 6 1.5 Design of a Video Encoder....................................................................................... 7 1.6 Basic Video Transcoding Techniques ...................................................................... 8 1.3.1 Video Transcoding in the Transform Domain................................................... 8 1.3.2 Video Transcoding in the Pixel Domain............................................................ 9 v 1.7 Transcoding Complexity......................................................................................... 12 1.7 Thesis Organization ................................................................................................ 13 Chapter 2. TOOLS............................................................................................................ 14 2.1. Weka Tree Assistant .............................................................................................. 14 2.2 Graph Assistant....................................................................................................... 17 2.3 Weka ....................................................................................................................... 17 3.4 Visual Studios......................................................................................................... 18 2.5 Intel Ipp Transcoder................................................................................................ 18 Chapter 3. H.263 TO VP6 TRANSCODING................................................................... 20 3.1 Transcoding H.263 to VP6 ..................................................................................... 20 3.2 Complexity Reduction Using Dynamic Search Range........................................... 22 3.3 Complexity Reduction Using Dynamic Search Window ....................................... 23 3.4 Machine learning techniques .................................................................................. 25 3.4.1 Trees Structure Style:....................................................................................... 27 3.5 Conclusions............................................................................................................. 28 3.6 Future Work............................................................................................................ 31 Chapter 4. MPEG2 TO H.264 TRANSCODING............................................................. 32 4.1. Introduction............................................................................................................ 32 vi 4.2. Machine Learning based Transcoding................................................................... 33 4.3 Machine Learning Techniques................................................................................ 34 4.3.1 Classifier C4.5 (J48) ....................................................................................... 34 4.3.2 Mode Separated ............................................................................................... 35 4.3.3 On-line Multi-Tree Boosting ........................................................................... 35 4.3.4 Training Process............................................................................................... 37 4.4. Results.................................................................................................................... 37 4.5. Comparasion of transcoder .................................................................................... 40 4.6. Conlcusion ............................................................................................................. 41 Chapter 5. CONCLUSIONS............................................................................................. 44 5.1 Future Work on Video Transcoding ....................................................................... 44 5.2 Lessons Learned...................................................................................................... 44 5.3 Papers Published ..................................................................................................... 45 REFERENCES ................................................................................................................. 47 vii TABLES Table 1 H.263 Vs VP6........................................................................................................ 5 Table 2 MPEG-2 vs. H.264................................................................................................. 5 Table 3 Wekatree.exe Options.......................................................................................... 16 Table 4 MB Mode Mapping for Foreman Sequence (CIF, 297 frames) in H.263 to VP6 Transcoding....................................................................................................................... 21 Table 6 Football H.263 to VP6 Results ............................................................................ 28 Table 7 Use of Variables in the trees................................................................................ 36 Table 8 Results for Mode Seperated................................................................................. 39 Table 9 Results for Online Boosting................................................................................. 39 Table 10 Results for One Frame Trained Tree ................................................................. 39 viii FIGURES Figure 1 Macroblock partitions: 16x16, 16x8, 8x16, 8x8................................................... 4 Figure 2 Sub-macroblock partitions: 8x8, 8x4, 4x8 and 4x4.............................................. 4 Figure 3 Decoder Block Diagram....................................................................................... 6 Figure 4 Encoder Block Diagram For H.264 Encoder. ...................................................... 7 Figure 5 Motion Vector refinement on 3x3 Window ....................................................... 10 Figure 6 MPEG-2 modes against average residue [8] ...................................................... 11 Figure 7 8x8 separation using motion vectors surrounding for weighted average [7] ..... 12 Figure 9 First H.263 to VP6 results .................................................................................. 24 Figure 10 Time Results for H.263 to VP6 transcoder for Coastguard.............................. 29 Figure 11 Time Results for H.263 to VP6 transcoder for Football .................................. 29 Figure 12 PSNR curve for Coastguard ............................................................................