Content-Prioritised Video Coding for British Sign Language Communication
CONTENT-PRIORITISED VIDEO CODING FOR BRITISH SIGN LANGUAGE COMMUNICATION

LAURA JOY MUIR

A thesis submitted in partial fulfilment of the requirements of The Robert Gordon University for the degree of Doctor of Philosophy

October 2007

Abstract

Video communication of British Sign Language (BSL) is important for remote interpersonal communication and for the equal provision of services for deaf people. However, the use of video telephony and video conferencing applications for BSL communication is limited by inadequate video quality.

BSL is a highly structured, linguistically complete, natural language system that expresses vocabulary and grammar visually and spatially using a complex combination of facial expressions (such as eyebrow movements, eye blinks and mouth/lip shapes), hand gestures, body movements and finger-spelling that change in space and time. Accurate natural BSL communication places specific demands on visual media applications which must compress video image data for efficient transmission. Current video compression schemes apply methods to reduce statistical redundancy and perceptual irrelevance in video image data based on a general model of Human Visual System (HVS) sensitivities.

This thesis presents novel video image coding methods developed to achieve the conflicting requirements for high image quality and efficient coding. Novel methods of prioritising visually important video image content for optimised video coding are developed to exploit the HVS spatial and temporal response mechanisms of BSL users (determined by Eye Movement Tracking) and the characteristics of BSL video image content. The methods implement an accurate model of HVS foveation, applied in the spatial and temporal domains, at the pre-processing stage of a current standard-based system (H.264).
Comparison of the performance of the developed and standard coding systems, using methods of video quality evaluation developed for this thesis, demonstrates improved perceived quality at low bit rates. BSL users, broadcasters and service providers benefit from the perception of high quality video over a range of available transmission bandwidths. The research community benefits from a new approach to video coding optimisation and better understanding of the communication needs of deaf people.

Acknowledgements

I am grateful to The Robert Gordon University for giving me the opportunity to embark on a part-time PhD within a year of taking up my lecturing appointment. I wish to acknowledge the contribution of all those who inspired, guided and supported me from inception to completion. In particular, I wish to thank those individuals who contributed to the development of my knowledge and understanding of sign language, video communication systems and the importance of human factors in the design of systems for people.

Lillian Lawson, Director of the Scottish Council on Deafness, provided the motivation for this research. Professor Iain Richardson gave his expert guidance and asked the right questions to help me develop and adduce my ideas. Professor Richardson and my colleagues at the School of Engineering and the Centre for Video Communications, Dr Tony Miller, Dr Yafan Zhao and Dr Sampath Kannangara, proffered highly valued encouragement, advice, support and friendship.

The research conducted for this thesis would not have been possible without the support and participation of the deaf community in Aberdeen. I am most grateful to Edith Ewen, Honorary Chairperson of the Aberdeen Deaf Social and Sports Club, for organising volunteers for the experimental work and to all the very willing members of the Club who participated in the research.
I would like to thank Lisa Davidson, Kathleen Cameron and Michael Tocher for signing in the video sequences used in the experimental work. My special thanks go to Jim Hunter, an enthusiastic and indefatigable supporter of the deaf community, for his BSL interpreting skills. I am immensely grateful to Jim, his wife Norma and the profoundly deaf people who participated in this research for giving me an insight into the needs, aspirations and contributions of deaf people. I would also like to thank Isobel Slessor and Mags Christie for their expert BSL tuition and their patience.

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Abbreviations and Acronyms

PART ONE: BACKGROUND

CHAPTER 1: INTRODUCTION
  1.1 The Research Problem and Rationale
  1.2 Research Hypothesis
  1.3 Aim and Objectives
  1.4 Research Contributions
    1.4.1 New Approach to Video Coding Optimisation
    1.4.2 New Methods of Measuring the Perceived Quality of BSL Video Communication
    1.4.3 New Insights into BSL communication
    1.4.4 New Methods for Content-Prioritised BSL Video Coding
    1.4.5 Perceptually Optimised Video Compression
  1.5 Research Impact
  1.6 Structure of the Thesis

CHAPTER 2: HUMAN VISION
  2.1 The Human Visual System
    2.1.1 Anatomy of the Eye
    2.1.2 Visual Processing
    2.1.3 Perception of Objects and Features
      2.1.3.1 Recognition of Image Features
      2.1.3.2 Perception and Recognition of Visual Objects and Faces
  2.2 Eye Movements and Attention
    2.2.1 Eye Movements
    2.2.2 Fixations
    2.2.3 Fixation Patterns and Scan Path Theory
    2.2.4 Attention and Selective Visual Perception
  2.3 The Spatio-Temporal Response
    2.3.1 Spatial and Temporal Filtering
    2.3.2 The Spatial Response
    2.3.3 The Temporal Response
  2.4 The Perception of Motion
  2.5 Summary and Model of Visual Perception

CHAPTER 3: BRITISH SIGN LANGUAGE COMMUNICATION
  3.1 British Sign Language
    3.1.1 The Visual Representation of Sign Language
    3.1.2 The Diversity of Sign Languages
    3.1.3 The Complexity of Sign Languages
  3.2 Video Telephony
    3.2.1 Video Telephones
    3.2.2 PC-Based Systems
    3.2.3 Video Relay Services
  3.3 Quality of Service Requirements