3D-Tv Content Generation and Multi-View Video Coding
Total Page:16
File Type:pdf, Size:1020Kb
3D-TV CONTENT GENERATION AND MULTI-VIEW VIDEO CODING by Mahsa Talebpourazad B.A.Sc. Iran University of Science and Technology, 2000 M.A.Sc., University of Manitoba, 2004 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in The Faculty of Graduate Studies (Electrical and Computer Engineering) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) June 2010 © Mahsa Talebpourazad, 2010 ABSTRACT The success of the 3D technology and the speed at which it will penetrate the entertainment market will depend on how well the challenges faced by the 3D- broadcasting system are resolved. The three main 3D-broadcasting system components are 3D content generation, 3D video transmission and 3D display. One obvious challenge is the unavailability of a wide variety of 3D content. Thus, besides generating new 3D- format videos, it is equally important to convert existing 2D material to the 3D format. This is because the generation of new 3D content is highly demanding and in most cases, involves post-processing correction algorithms. Another major challenge is that of transmitting a huge amount of data. This problem becomes much more severe in the case of multiview video content. This thesis addresses three aspects of the 3D-broadcasting system challenges. Firstly, the problem of converting 2D acquired video to a 3D format is addressed. Two new and efficient methods were proposed, which exploit the existing relationship between the motion of objects and their distance from the camera, to estimate the depth map of the scene in real-time. These methods can be used at the transmitter and receiver- ends. It is especially advantageous to employ them at the receiver-end since they do not increase the transmission bandwidth requirements. Performance evaluations show that our methods outperform the other existing technique by providing better depth approximation and thus a better 3D visual effect. ii Secondly, we studied one of the problems caused by unsynchronized zooming in stereo-camera video acquisition. We developed an effective algorithm for correcting unsynchronized zoom in 3D videos. The proposed scheme finds corresponding pairs of pixels between the left and right views and the relationship between them. This relationship is used to estimate the amount of scaling and translation needed to align the views. Experimental results show our method produces videos with negligible scale difference and vertical parallax. Lastly, the transmission of 3D-content problem is addressed and two schemes for multiview video coding (MVC) are proposed. While both methods outperform the current MVC standard, one of them introduces significantly less random access delay compared to the MVC standard. iii TABLE OF CONTENTS Abstract.............................................................................................................................. ii Table of Contents ............................................................................................................. iv List of Tables ................................................................................................................... vii List of Figures................................................................................................................. viii Glossary ..............................................................................................................................x Acknowledgements ......................................................................................................... xii Dedication .........................................................................................................................xv Co-Authorship Statement ............................................................................................. xvi Chapter 1: Introduction and Overview ...........................................................................1 1.1 Introduction .....................................................................................................1 1.2 The Human 3D Visual System........................................................................4 1.3 3D Content Generation....................................................................................6 1.3.1 Stereoscopic dual-camera approach ............................................................7 1.3.2 3D depth-range camera approach ................................................................8 1.3.3 2D to 3D video conversion approach ........................................................10 1.3.4 Multiview video camera approach.............................................................11 1.4 3D Video Coding...........................................................................................12 1.4.1 Depth-based coding...................................................................................13 1.4.2 Multiview video coding.............................................................................13 1.4.3 Multiview video plus depth coding ...........................................................16 1.5 3D Displays...................................................................................................17 1.5.1 Binocular-with passive glasses..................................................................18 1.5.2 Binocular-with active glasses ....................................................................21 1.5.3 Autostereoscopic displays.........................................................................22 1.6 Specific Challenges in 3D Technology .........................................................24 1.7 Thesis Objectives...........................................................................................27 1.8 Thesis Contributions......................................................................................27 1.9 Thesis Summary ............................................................................................30 1.10 References .....................................................................................................33 Chapter 2: An H.264-based Scheme for 2D to 3D Video Conversion.........................36 2.1 Introduction ...................................................................................................36 2.2 Proposed 2D-TO-3D Conversion Scheme ....................................................40 2.2.1 Motion vector estimation...........................................................................42 2.2.2 Camera motion correction .........................................................................43 iv 2.2.3 Correction of displacement estimates........................................................44 2.2.4 Displacement correction of object borders................................................45 2.2.5 Perceptual depth enhancement ..................................................................46 2.3 Performance Evaluation................................................................................47 2.4 Conclusion.....................................................................................................52 2.5 References .....................................................................................................54 Chapter 3: Generating the Depth Map from the Motion Information of H.264-Encoded 2D Video Sequence ...............................................................................56 3.1 Introduction ...................................................................................................56 3.2 Background....................................................................................................61 3.3 Proposed Scheme...........................................................................................64 3.3.1 Motion vector estimation...........................................................................64 3.3.2 Camera motion correction .........................................................................67 3.3.3 Correction of false displacement estimates ...............................................68 3.3.4 Displacement correction of object-border pixels.......................................71 3.3.5 Displacement correction of object-body pixels .........................................73 3.3.6 Perceptual depth enhancement ..................................................................75 3.4 Performance Evaluation and Discussion.......................................................76 3.5 Conclusion.....................................................................................................84 3.6 References .....................................................................................................85 Chapter 4: Unsynchronized Zoom Correction in 3D Video ........................................87 4.1 Introduction ...................................................................................................87 4.2 Impact of Zoom Mismatch on Subjective 3D Quality ..................................91 4.3 Proposed Zoom Correction Method ..............................................................94 4.4 Experimental Results.....................................................................................98 4.4.1 Objective results on digitally zoomed videos............................................98 4.4.2 Results on 3D video with unsynchronized optical zoom.........................100 4.5 Conclusions .................................................................................................100 4.6 References