Isize Bitsave:High-Quality Video Streaming at Lower Bitrates Via
Total Page:16
File Type:pdf, Size:1020Kb
White Paper iSIZE BitSave: High-Quality Video Streaming at Lower Bitrates via Amazon ECS Fast and easily integrated with any encoder including Streaming high-resolution video comes with an inevitable trade-off between AVC, HEVC, VP9 and AV1, without breaking standards or requiring any changes on end user device available bandwidth and quality-of-experience for the end user. Delivering iSIZE BitSave pre-processing is single-pass through the uncompromising video quality typically requires excessively high bitrates, content with 1 frame latency – it can be used with VoD, which can result in slow starts, video buffering and high CDN and live streaming, gaming, IoT storage costs. As the percentage of IP traffic attributed to high-resolution video iSIZE BitSave supports: multi-codec, multi-recipe, multi- increases, these problems are only exacerbated. bitrate and multi-resolution ABR ladders iSIZE deep neural network models improve significantly Current solutions that attempt to minimize bandwidth without compromising on with each new generation, providing increased gains with each new generation of pre-processing models quality are centered around the development of more intelligent video encoders; either by replacing parts, or the entirety of a standard video coding Next generation sustainable results and significant reduction in the energy consumption for video delivery pipeline. The latter, in particular, is a risky proposition for video encoding services, given the inevitable requirement for bespoke transport mechanisms and decoders or post-processors on client devices. On the other hand, any improvement to a standards-based coder is severely constrained by the need for compliance to the utilized standard. iSIZE’s innovative BitSave technology comprises an AI-based perceptual optimization that adaptively filters the video content before it reaches the video encoder (Fig. 1), such that the output after compressing with any iSIZE Technologies standard video codec is perceptually optimized (e.g., less motion artifacts or Russell Anam blurring), for the same bitrate. BitSave neural networks are able to isolate areas of perceptual importance, such as those with high motion or detailed texture, Ilya Fadeev and optimize their structure so that they are preserved better at lower bitrate by any subsequent encoder. As shown in our recent paper [1], when integrated as a Aaron Chadha preprocessor prior to a video encoder, BitSave is able to reduce bitrate Yiannis Andreopoulos requirements for a given quality level by up to 20% versus that encoder. When coupled with open-source AVC, HEVC and AV1 encoders, BitSave allows for up to 40% bitrate reduction at no quality compromise in comparison to leading third-party AVC, HEVC and AV1 encoding services on Amazon Web Services. Lower bitrates equate to smaller filesizes, lower storage and distribution costs, reduced energy consumption and more satisfied customers. iSIZE BitSave Storage in multiple Container containers like on Amazon ECS: Any External Any Streaming mp4, webm, Preprocessing only Video Encoder or CDN Service Apple QuickTime Source Content Preprocessed raw video AVC, HEVC or AV1 Reducing cost up to 20% iSIZE BitSave Storage in multiple Container on Amazon ECS: containers like Any Streaming Preprocessing and Encoding with mp4, webm, or CDN Service Open-Source Encoders Apple QuickTime Source Content AVC, HEVC or AV1 Reducing cost up to 40% Figure 1. iSIZE’s BitSave Container pipeline for preprocessing and encoding with third-party encoders (top) and preprocessing and encoding with open-source encoders (bottom) Importantly, given that BitSave pre-processes video content without changing the encoding, these gains are attainable without any of the disadvantages associated with replacing components of encoders, streaming infrastructures, or client devices. iSIZE BitSave: High-quality Video Streaming at Lower Bitrates via Amazon ECS iSIZE’s proprietary deep learning-based models can run on all BitSave is compatible with any existing codec (including CPU and GPU hardware. The current generation of BitSave MPEG AVC/H.264, HEVC/H.265, EVC, VVC and models is able to run on 4 CPU cores or higher, achieving AOMedia VP9, AV1, AV2), and supports all resolutions up real-time preprocessing and AVC encoding on a to 8K. The preprocessing accepts all video formats as g4dn.8xlarge AWS instance with 12% average bitrate input and outputs same-resolution raw Y4M, which can be saving versus x264 ‘slow’ preset and 33% average bitrate optionally downscaled, before passing to any standard saving versus a leading AWS quality-based variable video encoder. Only a single pass processing is bitrate (QVBR) encoding service. The remainder of this required (with single frame latency), prior to any white paper details the expected bitrate savings and speed number of subsequent encodings with any video benchmarks on a range of CPU and GPU-enabled AWS encoder. instances. Figure 2. BitSave preprocessing performs a perceptual and rate-constrained decomposition of each frame to ensure that the output of any standard codec is perceptually optimized for a given bitrate Bitrate savings on typical full-HD test (1080p) videos are We also compare against a widely-used and highly- reported in Table 1 below for the regime of 2 to 10 mbps. respected third-party encoding service on AWS. The More negative values indicate higher bitrate saving versus corresponding bitrate savings versus that benchmark are the corresponding video encoder (anchor) with the same reported in Table 2. All BitSave encodings were done preset and the same bitrate targets. The average bitrate with the settings shown in Table 1. All third-party saving across multiple encoders and VMAF, VMAF_NEG encodings were done with the recommended settings of and SSIM was 11%, with BitSave offering up to 20% that external service, including the use of its proprietary saving on difficult content (e.g., ‘ducks take off’, QVBR rate control. The results show that, against well- ‘touchdown pass’, ‘park joy’). Importantly, all encoded established third-party encoders, BitSave and open- bitstreams are fully standard compliant and do not require source encoders lead to bitrate savings between 17% to any changes in content packaging, delivery or client side. 40%. Figure 3 (next page) illustrates some examples of This means they can be decoded by any device that typical segments of sequences at the same bitrate, supports AVC, HEVC or AV1 decoding without requiring showcasing the quality improvement obtained when any downscaling or upscaling on the client device. utilizing BitSave. Bitrate Saving (%) over Anchor Encoder Bitrate Saving (%) over Leading QVBR Encoding Service on AWS Sequence AVC x264 HEVC x265 AV1 svt-av1 Sequence AVC HEVC AV1 ‘slow’ preset ‘slow’ preset (cpu=5) two-pass two-pass single-pass red kayak -8.8 -5.8 -6.5 red kayak -28.4 -17.5 -24.4 ducks take off -15.2 -16.2 -20.3 ducks take off -35.9 -39.4 -30.8 riverbed -16.3 -3.3 -6.0 riverbed -26.1 -19.6 -22.9 west wind easy -10.4 -0.1 -6.0 west wind easy -41.4 -17.6 -22.3 rush field cuts -8.6 -8.8 -9.3 rush field cuts -32.0 -21.1 -21.8 crowd run -11.0 -9.2 -11.0 crowd run -29.6 -32.2 -25.6 park joy -12.8 -15.1 -18.4 park joy -34.0 -41.5 -32.4 touchdown pass -12.9 -18.8 -17.2 touchdown pass -36.0 -27.7 -20.9 Average -12.0 -9.7 -11.8 Average -32.9 -27.1 -25.1 Table 1: Example BD-rates (%) of BitSave+anchor encoder versus Table 2: Representative BD-rates (%) of BitSave+encoder (x264, anchor encoder on 8 XIPH sequences, with the encoder being x264, x265, svt-av1) versus a market-leading VBR-based external encoding x265 and svt-av1 with bitrate targets set to [2,4,6,8,10] mbps, service for on 16 XIPH sequences, with bitrate targets and GOP GOP=200 for x264 and x265 and GOP=120 for svt-av1. All BD-rates sizes set as for Table 1. All BD-rates are average BD-rates for SSIM, are average BD-rates for the SSIM, VMAF_NEG and VMAF quality VMAF_NEG and VMAF. Higher negative values indicate higher metrics. Higher negative values indicate higher bitrate saving. bitrate saving. iSIZE BitSave: High-quality Video Streaming at Lower Bitrates via Amazon ECS Third-party AVC QVBR, 4.46mbps x264 slow, 4.15mbps BitSave + x264 slow, 4.15mbps VMAF=43.0, VMAF_NEG=41.8, SSIM=0.925 VMAF=50.2, VMAF_NEG=48.5, SSIM=0.927 VMAF=58.7, VMAF_NEG=54.9, SSIM=0.943 Third-party AVC QVBR, 2.12mbps x264 slow, 1.93mbps BitSave + x264 slow, 1.93mbps VMAF=50.4, VMAF_NEG=48.9, SSIM=0.940 VMAF=57.7, VMAF_NEG=55.8, SSIM=0.944 VMAF=62.4, VMAF_NEG=59.6, SSIM=0.947 Figure 3. Visual comparison of encodings corresponding to Table 1 and Table 2 (top: ‘crowd run’; bottom: ‘rush field cuts’). x264 and BitSave + x264 is at approx. 8% lower bitrate than the third-party AVC QVBR. Typical speed benchmarks obtained on various AWS and g4dn.8xlarge instances. Live processing and encoding instances are reported in Table 3. The g4 instances are is achieved for 1080p@50fps on a g4dn.8xlarge instance. GPU-enabled, while c5d is CPU-based. As shown in BitSave and HEVC and AV1 encoding remains at high Table 3, BitSave achieves over 50fps for 720p and over speed in comparison to third-party alternatives, while 24fps for 1080p processing on all instance types. When offering significant bitrate savings, as shown in Table 2. combined with x264 ‘slow’ preset, live processing and encoding is achieved for 720p@50fps on g4dn.4xlarge Frames-Per-Second (FPS) Workload c5d.12xlarge g4dn.4xlarge g4dn.8xlarge 720p BitSave only 60 267 267 720p BitSave + x264 slow 39 58 74 720p BitSave + x265 slow 12 9 11 720p BitSave + svt-av1 9 4 7 1080p BitSave only 26 115 115 1080p BitSave + x264 slow 19 36 56 1080p BitSave + x265 slow 8 6 9 1080p BitSave + svt-av1 5 2 4 Table 3: Runtime (FPS) for various AWS instances processing HD or full-HD input resolutions.