Methods for Objective and Subjective Video Quality Assessment and for Speech Enhancement
Muhammad Shahid

Blekinge Institute of Technology Doctoral Dissertation Series No. 2014:15
Doctoral Dissertation in Applied Signal Processing
Department of Applied Signal Processing
Blekinge Institute of Technology
SWEDEN

© 2014 Muhammad Shahid
Department of Applied Signal Processing
Publisher: Blekinge Institute of Technology, SE-371 79 Karlskrona, Sweden
Printed by Lenanders Grafiska, Kalmar, 2014
ISBN: 978-91-7295-294-2
ISSN: 1653-2090
urn:nbn:se:bth-00603

Abstract

The overwhelming trend in the usage of multimedia services has raised consumers’ awareness of quality. Both service providers and consumers are interested in the delivered level of perceptual quality. The perceptual quality of an original video signal can be degraded by compression and by transmission over a lossy network. Video quality assessment (VQA) has to be performed in order to gauge the level of video quality. Generally, it can be performed by following subjective methods, where a panel of humans judges the quality of video, or by using objective methods, where a computational model yields an estimate of the quality. Objective methods, and specifically No-Reference (NR) or Reduced-Reference (RR) methods, are preferable because they are practical for implementation in real-time scenarios.
This doctoral thesis begins with a review of existing approaches proposed in the area of NR image and video quality assessment. In the review, recently proposed methods of visual quality assessment are classified into three categories. This is followed by the chapters related to the description of studies on the development of NR and RR methods as well as on conducting subjective experiments of VQA. In the case of NR methods, the required features are extracted from the coded bitstream of a video, and in the case of RR methods additional pixel-based information is used. Specifically, NR methods are developed with the help of suitable techniques of regression using artificial neural networks and least-squares support vector machines. Subsequently, in a later study, linear regression techniques are used to elaborate the interpretability of NR and RR models with respect to the selection of perceptually significant features. The presented studies on subjective experiments are performed using laboratory-based and crowdsourcing platforms. In the laboratory-based experiments, the focus has been on using standardized methods in order to generate datasets that can be used to validate objective methods of VQA. The subjective experiments performed through crowdsourcing relate to the investigation of non-standard methods in order to determine perceptual preferences for various adaptation scenarios in the context of adaptive streaming of high-definition videos.

Lastly, the use of an adaptive gain equalizer in the modulation frequency domain for speech enhancement has been examined. To this end, two methods of demodulating speech signals, namely spectral center of gravity carrier estimation and convex optimization, have been studied.

Keywords: Video Quality Assessment, No-Reference Methods, Reduced-Reference Methods, Subjective Experiments, Speech Enhancement, Adaptive Gain Equalizer

To my parents and siblings!
To Mahwish, the love of my life!
To Ayesha, the blessing of my life!

Acknowledgments

The years during the course of my PhD studies have been significantly important for shaping my professional growth as well as a reformation of my personality. Undoubtedly, thinking like a researcher has given me a pragmatic sight to envisage the world differently. This dissertation, the knowledge that I gained, and the candid consuetude that I learnt have been made possible by the enormous support of many. I would like to pay my heartiest gratitude to all of them.

By all means, I consider that Dr. Benny Lövström played the most influential role in this awakening. Since the time I began my master's degree thesis back in 2010 and later on continued with my PhD under his supervision, his mentorship has always been there in most of my research activities. Not only that, he has been a source of counseling for me in many of my daily-life personal and social pursuits as well. I believe that I cannot thank him enough for what he has been doing for me during these years. It has been a matter of great pleasure, learning, and prosperity that I have been endowed with his guidance. Many thanks, Benny!

Prof. Ingvar Claesson, my supervisor and examiner, has also been a great source of guidance for me. I still remember the time when I met him while he was heading the panel for my admission interview for PhD studies. Since then, I have found him compassionate, supportive, and propitious in all matters, be it related to research work or the availability of resources for a comfortable work environment. Besides guidance from my official supervisors, I had the great opportunity of learning from Prof. Hans-Jürgen Zepernick while writing articles with him. Many thanks to him for saying, "This sentence is correct but it hardly possesses any academic value!"

I would really like to mention Dr. Andreas Rossholm, a very good friend of mine, as one of those great people who played very important roles in my PhD studies.
He has been a kind support during my master's degree thesis and a considerate collaborator during the research work of my PhD studies. I should also acknowledge the kindness of all people from VQEG, especially Kjell Brunnström. Many thanks to my research collaborators outside BTH, including Katerina Pandremmenou, Rizwan Ishaq, Jacob Søgaard, Jeevan Pokhrel, and Samira Tavakoli.

Many thanks to Rector Anders Hederstierna and Vice-rector Eva Pettersson for their support regarding many official matters including IEEE BTH SB. Since the inception of IEEE BTH SB, I never fell short of a speaker on any topic, as I always had the support of Prof. Wlodek Kulesza. Not only this, I would like to thank him a lot for his comments and discussions on improving the presentation of the research aims of this thesis. I really appreciate the support of Dr. Niklas Lavesson and Lena Vogelius regarding IEEE BTH SB matters.

Since the time of my master's degree at BTH, Prof. Lars Håkansson has been a really kind and supportive teacher to me. Many thanks to Dr. Siamak Khatibi for all the thoughtful discussions. Similarly, it has been really nice to have discussions with Prof. Mats Petterson. Dr. Sven Johansson has been a great support related to department and educational affairs. A very special thanks to my previous office mates Josef Ström Bartunek and Irina Gertsovich for being so cooperative. Many thanks to my previous and current colleagues/teachers including Dr. Nedelko Grbic, Dr. Jörgen Nordberg, Dr.