Intel® Quick Sync Video Technology on Intel® Iris™ Graphics
WHITE PAPER
Intel Quick Sync Video

Intel® Quick Sync Video Technology on Intel® Iris™ Graphics and Intel® HD Graphics family: Flexible Transcode Performance and Quality

The 4th generation Intel® Core™ Processor introduces Intel® Iris™ Pro Graphics, Intel® Iris™ Graphics and Intel® HD Graphics (4200+ Series), featuring the newest iteration of Intel® Quick Sync Video technology. In addition, Intel Quick Sync technology is also available in the Intel Xeon® Processor E3-1200 v3 product family with Intel HD Graphics P4600/P4700. For the purposes of this whitepaper, we will refer simply to Intel Quick Sync technology. This advanced graphics solution sets a milestone in performance by delivering a blazingly fast transcode experience, and balances image quality and performance through a series of optimized user targets for easy adoption. This paper explains the changes in Intel Quick Sync Video transcode target usages, describes the new high-performance hardware acceleration engine, and sets performance and quality expectations for each target usage.

Introduction

In the past, transcoding has been a compute-intensive task that demanded a large amount of precious CPU resources. With multiple cores and more powerful processors driving today's systems, transcoding happens faster, and is commonly used to support the format requirements of a whole range of video consumption devices. Intel Quick Sync Video can enable hardware-accelerated transcoding to deliver better performance than transcoding in the CPU without sacrificing quality.

Video transcoding involves converting one compressed video format to another. The process can apply changes to the format, such as moving from MPEG2 to H.264, or transcoding can change the properties of a given format, like bit rate or resolution. Conversions cannot generally happen in a single step; rather, a series of transitions is usually involved. At a high level, video transcoding requires these steps:

1. Read the source video from a file
2. Decode frames from the source codec format into raw pictures
3. Perform any desired video processing and image enhancement (e.g. picture size scaling, noise reduction, de-interlacing, etc.)
4. Encode the raw pictures into the target codec format
5. Write the resulting bit stream to a file.

Intel Quick Sync Video technology was first introduced with the 2nd generation Intel Core processor in 2011. Early generations of Intel Quick Sync Video technology enabled fast H.264 transcode speed with acceptable visual quality on mobile devices. The integrated hardware acceleration actually provided a faster than real-time H.264 transcode solution.

Intel Quick Sync Video on the Intel 3rd generation Core processor with Intel HD Graphics 4000 is capable of transcoding 1080p high quality to 1080p H.264 normal quality at up to eight times the speed of real-time (8x), depending on the content. Transcoding a 720p24 video (AVC High Profile L4.1 at 3.6Mbps) to the same format with normal quality reaches about 11x. Transcode speed varies with the content selection and with other factors such as system load and memory. Practically speaking, at this speed, it is possible for a user to convert 40 minutes of video in less than 4 minutes and get the output ready to upload, for example to a social networking website for sharing with family and friends.

Market research has revealed a fast-growing consumer segment of media enthusiasts who record high-quality videos. These individuals, sometimes called "prosumers", use high-end digital camcorders and DSLR cameras, and have a strong need to edit and create high quality content for large, high-resolution monitors and TVs. Although prosumers welcome faster transcode speeds, quality is often a much higher priority. Quality versus transcode speed is a trade-off prosumers have patiently accepted as long as sufficient disk space was available to handle the output. Now, with Intel Quick Sync Video, prosumers can get both speed and quality.

New flexible Intel Quick Sync Video

Intel Quick Sync Video on the 4th Generation Intel Core Processor includes these new H.264 encoding features:

1. Per-MB bit rate control
2. Trellis quantization
3. Multi-level hierarchical motion estimation
4. Multi-reference
5. Multi-predictor
6. B-pyramid
7. Lookahead

Figure 1 depicts the high-level design of the new Intel® Iris™ and Intel® Iris™ Pro Graphics (5100+ Series) and Intel HD Graphics (4200+ Series), which has seen a number of significant enhancements. Many new features and capabilities are included to improve performance, reduce power consumption, and boost image quality.

The key improvements are the following:

• Additional JPEG/MJPEG decode in the Multi-format codec engine. This support is on top of existing energy-efficient, high-performance AVC encode/decode that sustains multiple 4K and Ultra HD video streams
• A dedicated new video quality engine to provide extensive video processing at low power consumption
• Programmable and media-optimized EU (execution units)/Samplers for high quality
• A newly-designed media sampler capable of executing faster than previous generations
• A scalable architecture with the flexibility to enable acceleration based on application demand

Figure 1. Intel HD Graphics 4200+ Block Diagram Overview

Evolution of Target Usage

Intel Quick Sync Video technology pre-defines various target usages (TU), based on a range of different performance and quality tradeoffs. The baseline configurations are defined for a low-level driver interface. Table 1 summarizes the low-level target usage evolution for each Intel Quick Sync Video generation. Target usages highlighted with the same color share the same configuration, quality, and performance.
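To make the idea of a pre-defined target usage concrete, the following is a minimal Python sketch, not Intel Media SDK code. It assumes the seven-level numbering this paper describes for the current generation, in which TU1 favors quality and TU7 favors speed. The names TargetUsage, is_faster and pick_target_usage, and the linear mapping from a quality preference to a TU number, are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class TargetUsage(IntEnum):
    """Hypothetical labels for the seven target usages (not an SDK API)."""
    TU1 = 1  # best quality, slowest
    TU2 = 2
    TU3 = 3
    TU4 = 4
    TU5 = 5
    TU6 = 6
    TU7 = 7  # best speed, lowest quality


def is_faster(a: TargetUsage, b: TargetUsage) -> bool:
    """Return True if `a` is expected to transcode faster than `b`.

    This relies only on the numbering convention: a higher TU number
    trades quality for speed.
    """
    return a > b


def pick_target_usage(quality_weight: float) -> TargetUsage:
    """Map a preference in [0, 1] (1 = favor quality) to a target usage.

    The linear mapping is an illustrative assumption, not a documented rule.
    """
    if not 0.0 <= quality_weight <= 1.0:
        raise ValueError("quality_weight must be between 0 and 1")
    return TargetUsage(round(7 - 6 * quality_weight))


if __name__ == "__main__":
    for weight in (0.0, 0.5, 1.0):
        print(weight, pick_target_usage(weight).name)
```

An application that exposes only a few presets could, for example, map "quality" to TU1, "balanced" to TU4 and "speed" to TU7. Which presets to expose is a design decision for the application and is outside the scope of this sketch.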
Note that at the Intel® Media SDK API level, there are additional options available for developers to further fine-tune their targets. Full details are available in the SDK documentation.

Prior generations of Intel Quick Sync Video focused on three pre-defined configurations or usages, but now application developers have more options. Intel Quick Sync Video now defines seven distinct target usages (TUs), each with a particular balance of quality and performance. Many software vendors only use two or three of the defined target usages in their application, saving development time.

Speed goes up in this numbering scheme, so in general, TU7 gives the best performance, while TU1 gives the best video quality.

There are different benefits to software and hardware approaches for encoding solutions. Typically, software encoders have the luxury of performing extremely complicated motion estimation and exhaustive rate-distortion optimization on the CPU, in order to obtain the best possible quality. The tradeoff comes in a very high computation cost. Hardware encoders, on the other hand, have commonly been regarded as less flexible and thought to be incapable of delivering the necessary quality. This is the reason many hardware encoders are assumed to be only good enough for simple video production and viewing. However, with the advances described in this whitepaper, Intel hardware encoders should improve video quality.

The screenshots in Figure 2 demonstrate the different quality levels observed in popular hardware encoders, as compared to the previous-generation Intel HD Graphics 4000. Clearly, Intel HD Graphics 4000 provides superior visual quality for wider usages, as compared with these other hardware encoders. With the new acceleration power and flexibility provided by the latest Intel Quick Sync Video technology, even better hardware-based encoding is possible. Developers should quickly learn to take advantage of state-of-the-art algorithms that should improve the quality.

Figure 2. Intel HD Graphics 4000 vs. unidentified HW encoders, at 1920x1080. Note that the highlighted areas in the top photos are expanded in the bottom row, and improve moving left to right.

Table 1. Evolution of Low-level Target Usage Definition.

The benefit of the approach illustrated in Table 1 is better usage experience coverage across the wide range of performance and quality requirements. In the next section of this paper, we will focus on balancing performance and quality.

Performance and Quality Differentiation Across Target Usages

To examine the performance and quality differentiation across all target usages, we will compare the Intel Iris™ Pro Graphics 5200 capabilities with the Intel HD Graphics 4000. The image quality will be the same across all versions of the 4th Generation Intel Core Processor with Intel® Iris™ and Intel® Iris™ Pro and Intel HD Graphics and Xeon.

... transcode them to the same resolution and same profile, using Constant QP and 2 B-frame encode settings. In order to make it easier to observe the performance difference across target usages, we normalized all performance to Intel Demo Clip1 TU1 transcode speed.

As we can observe in Figure 3 below, performance increases monotonically with each target usage for the Intel Iris™ Pro Graphics 5200. Performance for TU7 is ~3.5x faster than for TU1. In addition, TU1 also provides an average of ~0.6 dB quality improvement (see Figure 4). Note that the encode/transcode speed is highly dependent on content selection.

Understanding video encode quality is not trivial, and it becomes more complicated ...

... reference from ITU-T at http://www-ee.uta.edu/dip/courses/ee5356/BDPSNR.doc. However, keep in mind that the objective metric does not necessarily correlate to subjective visual perception for the human eye. It is possible to have one content sample look better visually than the other content when the two are mathematically close in an objective metric. Therefore, always observe image quality subjectively in addition to using objective metrics to form a complete quality evaluation.

It is also worth noting that Constant QP (Constant Quantization Parameter) and VBR (Variable Bit Rate Control) are two different types of usages in video transcoding. VBR is used when a predictable file size is important to users. The encoder needs to meet