Mobile Vision Mixer
A System for Collaborative Live Mobile Video Production

Ramin Toussi
Department of Computer and Systems Sciences
January 2011

Advisor: Oskar Juhlin
Second Advisor: Arvid Engström
Examiner: Fredrik Kilander

This master's thesis corresponds to 30 credits.

Summary. Mobile phones equipped with video cameras and access to high-bandwidth networks enable a new generation of mobile applications: live amateur video production. There are already mobile technologies that allow capturing, editing and broadcasting live video from mobile phones; however, some production techniques still remain exclusively in the hands of professionals, multi-camera filming in a live broadcast of a sporting event being an obvious example. This thesis describes a system developed to address these needs of amateur video producers. Specifically, it focuses on providing a real-time collaborative mobile environment for mobile users who are interested in producing live video from various angles using only their phones. A user study was also conducted to evaluate the system and to see how people would use it to collaboratively produce a live video. Results from the study show that although producing a live mobile video is not always easy and straightforward, features such as a live preview and the ability to actively communicate with one another are of great importance and help.

Acknowledgments

This project would never have become possible without the contribution of several people. First and foremost I would like to thank my advisor and MobileLife centre director, Oskar Juhlin. Thank you for the great support, for trusting me and guiding me through the project. I also want to thank Arvid Engström, my second advisor and Mobility Studio director at MobileLife, whose knowledge and talent have always inspired me. Arvid also led the team while evaluating the prototype in Malmö and Göteborg. When it comes to the evaluation of the work, acknowledgments should also go to Alexandra Weilenmann and Emelie Dahlström, with whom I had fantastic experiences. A great thank you to Mahak Memar and Mudassar Ahmed Mughal, my friends and colleagues at SICS and MobileLife, who helped us run the tests by acting as remote operators in Stockholm. I also have to mention all those anonymous young people who volunteered to participate in our test sessions. Emelie wrote a separate report about the evaluation. It was written in Swedish, but later on she helped me translate some parts of it to English for this thesis and the paper. Tack så mycket, Emelie!

The project, the dissertation and all other related reports and papers received invaluable comments from other contributors to this effort. First and most of all I want to thank Goranka Zoric (Goga) for the fruitful discussions we had when writing the paper. She really inspired me with her knowledge and patience. This also includes Kari Gustafsson, Michael Kitzler and Per Sivborg, with whom we collaborated a lot on patenting the idea; the thesis, and particularly its technical parts, was greatly inspired by these people.

While implementing the system, I received advice and technical support from a couple of people at external companies. Among them I mention Bambuser and MediaLooks, who provided us with resources about their services and were both smart and fast in responding to my technical questions. In particular, I want to mention Måns Adler from Bambuser and Hanno Roding from MediaLooks.
A very special thanks to Fredrik Kilander, my teacher at DSV, manager of the KTH Interactive System Engineering program, and the examiner of this thesis. I believe I was very lucky to meet you and to have you as my examiner. You impressed me most with your kindness, patience and commitment to your job and to your students. The present dissertation also received several excellent comments from you. Thank you so very much.

Acknowledgments as well to all the people at MobileLife, SICS and the Interactive Institute, for every wonderful moment we shared together.

And last but not least, I have to thank my family and friends from the bottom of my heart. My parents, for always being there; my father, who has always been a real father, supportive, caring and compassionate. To mum, for all the goodness you have in your soul, your love and devotion. My brother and his wife, for all the laughter we had together, and because you will be amazing parents soon!

Contents

1 Introduction
  1.1 Research Problem
  1.2 Methodology
  1.3 Contribution
  1.4 Layout
2 Background
  2.1 Video in HCI and CSCW
  2.2 Video Production
    2.2.1 Professional Production
    2.2.2 Amateur and semi-professional practices
    2.2.3 Comparison and conclusion
  2.3 Mobile Broadcasting, More Mobility
  2.4 Related Work
3 System Overview
  3.1 Inspiration and Implication for Design
  3.2 Employed Technologies and Components
  3.3 Ideal Architecture
  3.4 Use Scenario
  3.5 First Attempts and Lessons Learnt
  3.6 Implemented Architecture
    3.6.1 Bambuser API
    3.6.2 Broadcasting with Adobe Flash Media Live Encoder
    3.6.3 Combiner Process
    3.6.4 Switch Process
    3.6.5 Vision Mixer Mobile Application
    3.6.6 Communication
  3.7 Further Technical Improvements
4 System Evaluation
  4.1 Method and Setting
  4.2 Study Results
  4.3 Problems Found
  4.4 Discussion
5 Conclusion and Further Work
  5.1 Further Work
References

List of Figures

1.1 SKY Sport24 news channel production control room
1.2 A typical combination of live streams
1.3 The Mobile Vision Mixer application prototype running on a Nokia N86 8MP phone
3.1 FLV playback with DirectShow in GraphEdit
3.2 Vision Mixer Architecture
3.3 Mobile Vision Mixer in Operation
3.4 Simplified Vision Mixer Architecture
3.5 Mobile Vision Mixer in Operation, in a Simplified Architecture
3.6 Combiner Flash component layout
3.7 Abstract Model of Communication and Data Flow in Mobile Vision Mixer
3.8 Conceptual Design of the Ideal Video Combiner and Switch Component Integration

List of Tables

3.1 Some of the available metadata fields contained in a typical video object returned by the Bambuser "getVideos" API function

1 Introduction

This thesis reports on the Mobile Vision Mixer (MVM) system, an application prototype that can provide mobile users with a real-time collaborative environment through which they can make a live broadcast with their own footage of any event.
It can be especially useful for mobile users who are interested in video practices. This work can be recognized as a significant step forward in mobile video; between July and December 2010 it attracted some press interest (e.g. http://www.metro.se/2010/09/22/49027/gor-gerillatv-med-din-mobiltelefon/) and was recognized as an innovation, while a patent application is also pending with the European Patent Office (under Rule 19(3) EPC) and is expected to be finalized soon.

With features such as video cameras and high-bandwidth network access integrated into recent (2010) mobile technologies, mobile users are now able to create social media first-hand. With this, mobile phones have moved beyond being devices for communication and passive media consumption. This integration, by taking advantage of a distinctive characteristic of mobile devices, namely being available everywhere and all the time, has led to the emergence and development of new services for immediate publishing and sharing of live images and video [16, 26, 27]. ComVu Pocket Caster, launched in 2005 and later renamed Livecast, was the pioneer in live mobile video publishing. In the years that followed, more services were introduced, such as Qik, Kyte, Flixwagon, Floobs, Next2Friends, Stickam, Ustream and Bambuser (http://bambuser.com/), among which Qik and Bambuser are the most widely used [22].

Employing this sort of "capture-and-share-straightaway" [23] service allows people to instantly share their captured mobile images through manageable web pages instead of using emails, paper prints and web publishing [23, 26]. Mobile phones in this way enhance the shared experience among the spectators of a live event. Moreover, in distributed events such as car rallies or bicycle races, this experience becomes even more enjoyable [16, 18]. However, results from previous studies and research show that although these live mobile broadcasting services are available to individuals, allowing people only to broadcast from their mobile devices is not enough. This includes situations in which a group collaborates to create a live show, such as sporting events or live TV interviews. Accordingly, challenges still remain for the designers of these services to provide their users with features that so far have been exclusively in the hands of professionals [22].

The production of live TV shows usually takes place under time-critical conditions [19] and needs to be highly coordinated among all the members of the production team. At the same time, events like team-based sports may be distributed over a large area or happen too fast (as in ice hockey and football) to be covered by only one camera; hence there is a strong need for real-time coordination of several cameras [19]. In such multi-camera settings, each camera films from a position defined by the director, and the corresponding video streams are simultaneously transmitted to the production control room.
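To make the idea of vision mixing concrete: a vision mixer keeps track of the incoming camera feeds and lets a director cut the broadcast ("program") output from one feed to another. The following Python sketch is a minimal conceptual illustration only; it is not part of the thesis prototype (which builds on Bambuser, Adobe Flash and DirectShow), and all class names, method names and stream URLs here are illustrative assumptions.

# Minimal conceptual sketch of a vision mixer: several live camera feeds,
# one of which is selected as the program output at any moment.
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CameraFeed:
    camera_id: str      # identifier of the phone/camera operator
    stream_url: str     # where the live stream can be pulled from


@dataclass
class VisionMixer:
    feeds: Dict[str, CameraFeed] = field(default_factory=dict)
    program: Optional[str] = None   # camera_id currently "on air"

    def register_feed(self, feed: CameraFeed) -> None:
        """Add a camera feed; the first registered feed becomes the program."""
        self.feeds[feed.camera_id] = feed
        if self.program is None:
            self.program = feed.camera_id

    def switch_to(self, camera_id: str) -> CameraFeed:
        """Cut the program output to another camera, as a director would."""
        if camera_id not in self.feeds:
            raise KeyError(f"unknown camera: {camera_id}")
        self.program = camera_id
        return self.feeds[camera_id]


if __name__ == "__main__":
    mixer = VisionMixer()
    for i in range(1, 5):   # four phones filming the same event
        mixer.register_feed(CameraFeed(f"cam{i}", f"rtmp://example.org/live/cam{i}"))
    print("on air:", mixer.program)     # cam1
    mixer.switch_to("cam3")             # director cuts to camera 3
    print("on air:", mixer.program)     # cam3

In a professional control room this switching is done on dedicated hardware; the point of the Mobile Vision Mixer, as described in the following chapters, is to make the same kind of selection possible for amateurs on a mobile phone.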