
FRAME RATE PREFERENCES IN LOW BIT RATE VIDEO

Gayatri Yadavalli, Mark Masry and Sheila S. Hemami

Cornell University, School of Electrical and Computer Engineering, Ithaca, NY
email:[email protected], {masry, hemami}@ece.cornell.edu

ABSTRACT

A double stimulus subjective evaluation was performed to determine preferred frame rates at a fixed bit rate for low bit rate video. Stimuli consisted of eight reference color video sequences of size 352 × 240 pixels. These were compressed at rates of 100, 200 and 300 kbps for low, medium, and high motion sequences, respectively, using three encoders and frame rates of 10, 15 and 30 frames per second. Twenty-two viewers ranked their frame rate preferences using an adjectival categorical scale. Their preferences were analyzed across sequence content, motion type, and encoder. Viewers preferred a frame rate of 15 frames per second across all categories, with several notable content-based exceptions.

1. INTRODUCTION

As the demand for streaming video rises among Internet users, providers are faced with new challenges to maximize video quality under fixed bandwidth constraints. At a given bit rate, it is not known how the frame rate affects the perceived quality of the video. The majority of digital video available today is coded at 30 frames per second (fps). A lower frame rate improves the quality of individual frames at the expense of the smooth motion of video coded at high frame rates. Example frames from one of the video sequences in the Video Quality Experts Group (VQEG) [1] database coded at 10 and 30 fps are shown in Figure 1. Given this tradeoff, it is useful for video providers to have a set of rules governing the selection of an optimal frame rate for a particular content type, given a fixed bit rate.

This paper presents the results of a subjective evaluation of viewer-preferred frame rates for a variety of low bit rate video content. Low, medium, and high motion sequences were selected; the motion content of each sequence determined the bit rate used, ranging from 100 to 300 kbps. The reference sequences were selected from a wide assortment of film and television programming to span the range of available content types. In order to test frame rate preferences, each reference sequence was compressed with a given encoder at three different frame rates: 10, 15, and 30 fps. The resulting set of three compressed video sequences was presented and evaluated together. Since the bit rate was fixed for each reference sequence and each of the eight reference sequences was processed with three encoders, there were 24 such test sets, all of which were evaluated by each viewer. The evaluation was performed using the double-stimulus, five-grade adjectival categorical quality scale of ITU-R BT.500-11 [2]. A rank-based analysis of variance was performed to analyze the significance of viewer preferences in terms of sequence content, motion level, and encoder. Fifteen fps was generally preferred across categories. Specific exceptions are noted and discussed.

The paper is organized as follows: the frame rate subjective preference test is described in Section 2. In Section 3, a statistical analysis is used to determine viewer preferences across subgroups. Section 4 concludes the paper.

2. TEST DESCRIPTION

This section presents the design of the frame rate preference evaluation. It includes a summary of content chosen, followed by a discussion of the coding conditions and a description of the test environment.

2.1. Video Sequences

To determine the effects of frame rate on different content types, eight reference sequences of approximately eight seconds each were chosen specifically to represent a variety of streaming video content and resized to 352 × 240 pixels. All sequences were digitized using the 4:2:2 YUV color space. The reference sequences are described here:

Low motion:
• videoconference - a head-and-shoulders view of a woman talking to a stationary camera.
• news - news footage consisting of one scene with a stationary camera.

Medium motion:
• crowd - a moving crowd with a high level of detail and some camera motion.
• martial arts - a fight scene involving several people and fast motion.
• airport - several people walking through a moving crowd with high camera motion.
• animation - several scenes of computer animated characters with some camera motion.

High motion:
• sports - a panning shot of a football game with a large number of moving players on a stationary background.
• car chase - a very high-speed car chase scene with many cuts and extremely high motion.

2.2. Coding Conditions

Each of the sequences was encoded in color using three different motion-compensated video coding algorithms: the Sorenson Professional video coder version 2.1 [3], the University of British Columbia's H.263+ coder version 3.0 [4], and a wavelet-based rate-distortion optimized coder developed at Cornell University [5]. The three coders implement vector quantization, the Discrete Cosine Transform (DCT), and the wavelet transform, respectively, and are referred to as VQ, H.263+ and Wavelet throughout this paper.

Different bit rates were chosen for each motion category. The low motion sequences were encoded at 100 kbps, the medium motion sequences at 200 kbps, and the high motion sequences at 300 kbps in order to maintain a similar level of quality across the categories.

The encoded sequences exhibited a range of blocking and blurring artifacts typical of coded video sequences due to the low bit rates used. These artifacts differed across coders. Blockiness occurs when quantization causes the appearance of distinct edges between adjacent blocks. Blurriness is the loss of high frequency detail. Sequences coded with the H.263+ and VQ coders exhibited a higher degree of blocking, while those coded with the Wavelet coder showed greater blurriness. The Wavelet coder also exhibited a pulsing effect that was most visible on low motion sequences. H.263+ was the only coder that dropped frames in order to meet bit rate targets.

2.3. Test Environment

The test environment was designed to simulate standard viewing conditions as nearly as possible for low bit rate video. Room lighting was fixed at approximately 230 lux. The video sequences were displayed on a 21" Nokia Multigraph 445XPro monitor at a display resolution of 1024 × 768. Viewing distance was fixed at 6 picture heights. Monitor gamma was 2.3. Maximum and minimum luminances were measured at 98.9 and 1.5 cd/m², respectively.

The test setup consisted of a single screen with two display areas. Each of the twenty-four test sets of sequences with 10, 15 and 30 fps was presented to every viewer in random order to remove contextual effects. For each of these sets, subjects were first shown the broadcast quality reference video coded at 4 Mbps and 30 fps in the left display area. This video remained available to replay at any time during the test. Subjects then viewed each of the three possible frame rate sequences in the right area. Three buttons labeled A, B, and C located just below the right display area allowed users to replay the test sequences as desired. Each of the three encodings in a test set was assigned to these buttons in a pseudorandom order to eliminate the possibility of viewer acclimatization to a button/fps combination. Viewers were able to examine all three video sequences before rating them.

Viewers used a five-position slider to rate each of the three sequences in each test set on a five-grade categorical scale; ratings were then converted to ordinal rankings for the purposes of analysis. Ties were allowed in order to account for lack of preference.

The test subjects consisted of twenty-two viewers - eleven male and eleven female - with varying levels of experience viewing and rating video quality. Each viewer performed two full trials of the test, and the results from both trials have been consolidated into the overall results. Viewers had normal (20/20) visual acuity or corrective lenses.

Fig. 1. Example sections of frames compressed at 275 kbps using H.263+ and frame rates of (a) 30 fps and (b) 10 fps

3. RESULTS AND ANALYSIS

This section analyzes the results of testing frame rate preferences for twenty-two viewers. The rank-based Friedman Test used to perform the statistical analysis is described. Results are discussed in terms of sequence content, motion category, and encoder to determine the effects of each on viewer preference.

3.1. Statistical Analysis

Twenty-four test sets were generated as described in Section 2. Ratings for the sequences within a test set were then converted into ordinal rankings using a three rank scale (i.e. ranks of 1, 2 and 3) for analysis purposes. Tied values were assigned as the average of the rankings they would otherwise occupy.

Lower ranking numbers corresponded to higher viewer preference. The rankings for each test set were grouped and summed by frame rate to determine overall preferences for each viewer. In order to test preference across a particular subset of sequences, the rankings of each viewer's ratings of the sequences in that subset were summed. The sums for each viewer were then ranked again. The Friedman Test [6] was performed on the resulting set of 22

                             Sum of Viewer Rankings
                                 by Frame Rate           Preference
Grouping            Motion     10      15      30      10   15   30     χ²      Confidence Level

By Video Sequence
  Videoconference     L       49.5    31.5    51        2    1    3    10.70    99.5%
  News                L       31.5    36.5    62        1    2    3    24.38    99.9%
  Crowd               M       43      31.5    56.5      2    1    3    14.25    99.9%
  Martial Arts        M       45      33      53        3    1    2     9.23    99.0%
  Airport             M       52      27      53        2    1    2    19.73    99.9%
  Animation           M       39.5    39      51.5      1    1    3     8.07    98.2%
  Sports              H       54      37      41        3    1    2     7.18    97.2%
  Car chase           H       39.5    37.5    55        2    1    3     8.34    98.4%

By Motion
  Low Motion          L       39.5    34      58.5      2    1    3    15.02    99.9%
  Medium Motion       M       43      32      57        2    1    3    14.27    99.9%
  High Motion         H       47      35.5    49.5      3    1    2     5.07    92.0%

By Encoder
  Wavelet             All     54      35      43        3    1    2     8.27    98.4%
  H.263+              All     45      32      55        2    1    3    12.09    99.7%
  VQ                  All     37.5    36      58.5      2    1    3    14.39    99.9%

Table 1.
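
The rating-to-rank conversion described in Section 3.1 can be illustrated with a minimal Python sketch. It assumes that a higher slider rating means a better sequence, so the highest rating receives rank 1, and it gives tied ratings the average of the ranks they would otherwise occupy. The function names and the example ratings are hypothetical and are not taken from the study.

```python
def average_ranks(ratings):
    """Rank values so the highest gets rank 1; ties share the mean rank."""
    # Indices sorted from highest to lowest rating (assumption: higher is better).
    order = sorted(range(len(ratings)), key=lambda i: -ratings[i])
    ranks = [0.0] * len(ratings)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and ratings[order[end + 1]] == ratings[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2  # average of the occupied ranks
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def sum_ranks(test_set_ratings):
    """Sum one viewer's per-test-set ranks for each frame rate (column)."""
    totals = [0.0] * len(test_set_ratings[0])
    for ratings in test_set_ratings:
        for j, rank in enumerate(average_ranks(ratings)):
            totals[j] += rank
    return totals


def rerank_sums(test_set_ratings):
    """Rank the summed ranks again; the lowest sum (most preferred) gets rank 1."""
    sums = sum_ranks(test_set_ratings)
    return average_ranks([-s for s in sums])


# Hypothetical ratings from one viewer for two test sets, ordered (10, 15, 30 fps).
viewer = [[3, 5, 1], [4, 4, 2]]
print(sum_ranks(viewer))    # per-frame-rate rank sums for this viewer
print(rerank_sums(viewer))  # final ranks used as one row of the Friedman test
```

Negating the sums in `rerank_sums` reuses the same tie-averaging routine for the second ranking step, where a lower sum indicates a stronger preference.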

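
As a rough illustration of the Friedman Test named in Section 3.1, the sketch below computes the standard textbook statistic from per-frame-rate rank sums. It assumes no tie correction, since the exact variant used in the study is not stated in this section. With three frame rates the statistic is compared against a chi-square distribution with two degrees of freedom, whose survival function is exp(-x/2). The function names are hypothetical.

```python
import math


def friedman_statistic(rank_sums, n_viewers):
    """Textbook Friedman chi-square from column rank sums (no tie correction)."""
    k = len(rank_sums)  # number of conditions, here the three frame rates
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n_viewers * k * (k + 1)) * sum_sq - 3.0 * n_viewers * (k + 1)


def confidence_level_k3(chi2):
    """Confidence (1 - p) for k = 3 conditions, i.e. 2 degrees of freedom."""
    return 1.0 - math.exp(-chi2 / 2.0)


# Rank sums of the Videoconference row of Table 1, 22 viewers.
chi2 = friedman_statistic([49.5, 31.5, 51.0], 22)
print(round(chi2, 2), round(100 * confidence_level_k3(chi2), 1))
```

For the Videoconference row this uncorrected form gives a statistic of about 10.70 and a confidence level of about 99.5%. Other rows are not guaranteed to agree with it, because the handling of ties in the original analysis is not specified here.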