Encoding at Scale for Live Video Streaming
WHITEPAPER

In this whitepaper, discover how to deliver an economical, high-quality, and scalable cloud encoding architecture for live video streaming.

Introduction
The Need for Video Encoding
Video Streaming Delivery Architecture
  Video Source
  Display Devices
  Video Encoding and Transcoding
  Video Delivery and Content Distribution Network (CDN)
Alternatives for Encoding Processing
Introducing the Codensity™ T400 Video Transcoding Solution
  Flexibility and Scalability through U.2 Packaging and NVMe Infrastructure
  Density from SoC-based Encoding and Transcoding
  Real-time Encoding Quality
  Integration through FFmpeg Interface
Benefits
Conclusions and Summary
References and Notes

Introduction

The generation, storage, and consumption of streaming video are booming across the Internet. According to Domo’s “Data Never Sleeps 6.0” 2018 report [1], Netflix streams 97,222 hours of content every minute, while 400 hours of new video content are uploaded to YouTube every minute. According to the Cisco 2018 VNI report [2], IP video traffic will be 82 percent of all consumer Internet traffic by 2021, up from 73 percent in 2016.

While video on demand (VOD) growth has been strong, the next frontier of growth for over-the-top (OTT) and direct-to-customer video streaming will be delivering live video content. The Cisco 2018 VNI report forecasts that live video will grow 15-fold from 2016 to 2021 [2]. With significant growth forecast for live video streaming content, the industry needs economical and scalable video encoding solutions.
This paper outlines an innovative video encoding solution that combines the performance and density of System-on-Chip (SoC) encoding with NVMe-based cloud infrastructure to provide an economical, high-quality, and scalable way to deliver encoding at scale for live video streaming.

The Need for Video Encoding

Raw video files are very large, which makes transmitting and storing them prohibitively expensive. To make video content economical to transmit and store, video is compressed using a video codec (short for coder/decoder), which encodes the video data into a much smaller format for transmission or a much smaller file for storage. The corresponding decoder at the receiving device or browser-based video player reverses the encoding for playback.

As shown in Figure 1, sourced from Bitmovin’s Video Developer Report 2018 [3], H.264 Advanced Video Coding (AVC) is the most common video codec in use today, achieving a compression ratio of ~50x compared to raw video file size. Support for H.264 is almost ubiquitous across the full spectrum of video devices, from HDTVs to legacy 3G mobile phones. The second most common codec is the newer H.265 High Efficiency Video Coding (HEVC), which can deliver comparable quality to H.264 using ~50% less bandwidth and can thereby reduce bandwidth-driven video delivery costs by a similar ~50%. HEVC usage is expected to grow in the near term following Apple’s 2017 announcement that H.265 HEVC will be natively supported in iOS, tvOS, and macOS. VP9 is another current-generation video codec, supported in Chrome browsers, with compression benefits similar to those of H.265; however, VP9 usage is lower than that of H.265.
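To put the ~50x and ~50% figures above in concrete terms, the following is a minimal arithmetic sketch. It assumes a 1080p source at 30 frames per second with 8-bit 4:2:0 sampling (12 bits per pixel); those source parameters and the function name are illustrative assumptions, not values taken from this paper.

```python
def raw_bitrate_mbps(width: int, height: int, fps: float, bits_per_pixel: int = 12) -> float:
    """Uncompressed video bitrate in megabits per second.

    bits_per_pixel=12 corresponds to 8-bit 4:2:0 sampling; this is an
    assumption made for illustration, not a figure from this paper.
    """
    return width * height * bits_per_pixel * fps / 1_000_000


H264_COMPRESSION_RATIO = 50   # "~50x" from the text
HEVC_BANDWIDTH_SAVING = 0.50  # "~50% less bandwidth" from the text

raw = raw_bitrate_mbps(1920, 1080, 30)   # assumed 1080p source at 30 fps
h264 = raw / H264_COMPRESSION_RATIO      # approximate H.264 bitrate
hevc = h264 * (1 - HEVC_BANDWIDTH_SAVING)  # approximate HEVC bitrate

print(f"raw:   {raw:8.1f} Mbps")
print(f"H.264: {h264:8.1f} Mbps")
print(f"HEVC:  {hevc:8.1f} Mbps")
```

Under these assumptions the raw stream is roughly 746 Mbps, the H.264 stream roughly 15 Mbps, and the HEVC stream roughly 7.5 Mbps. Actual bitrates depend heavily on content, encoder settings, and target quality.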
Figure 1. Video Codec Usage [3] (bar chart of 2017 and 2018 usage, on a 0% to 100% scale, for H.264/AVC, H.265/HEVC, VP9, and AV1)

The tradeoff for improved codec compression efficiency is that more processing resources are required. H.265 encoding requires approximately 10x the processing resources of H.264 encoding. Encoding complexity for the newest AV1 video codec is even higher. Interest in AV1 is growing, with promises of an additional ~30% compression and bandwidth savings compared to H.265 and VP9; however, today’s early AV1 encoder implementations deliver “glacially slow encoding” [4], making AV1 economically prohibitive for most live content publishers in the short term.

In summary, live streaming content and service providers are advised to focus on solutions that can economically encode and scale H.264 AVC and H.265 HEVC video, which account for the majority of use cases today.

Video Streaming Delivery Architecture

The objective of a video streaming delivery architecture (Figure 2) is to distribute video content from a source capture device (left side of the diagram) to as many subscribers as possible, using a variety of playback devices (right side of the diagram). This section presents a high-level overview of the key components and functions in a video streaming delivery architecture to highlight the key requirements for scalable video encoding.

Figure 2. Video Streaming Delivery Architecture (a 4K/UHD or 1080p H.264/H.265 video source feeds encoding and transcoding, which produces an H.264/H.265 encoding ladder of 4K/UHD, 1080p, 720p, 480p, and 360p streams, delivered by ABR streaming through the content distribution network to display devices)

Video Source

Advances in video cameras, video production tools, and live video streaming technologies are fueling an explosion in the variety and quantity of live video content producers on the Internet today. Example use cases include:

• Live Sporting Events: Sports fans today pay premium subscription fees to watch their favorite sports teams, FIFA soccer, or Olympic events. Live sporting events are often streamed to tens of thousands of viewers, so compressing video streams through efficient encoding can generate significant savings in bandwidth costs.

• Conferences and Breaking News: While sports events might result in hundreds of streams, the conference and news category might have thousands of unique producers. As live streaming content becomes more specialized, with fewer viewers for each video source, encoding costs will grow as a percentage of overall costs, so encoding scaling efficiencies become increasingly important.

• User-Generated Content, including Social Media and “Gamers”: With the capabilities of modern smartphones and affordable HD video capture devices, the number of individual video content producers is exploding. Facebook™ Live is a great example of individuals posting a huge variety of live streaming content to their followers. Another example is skilled “gamers” who livestream themselves playing a video game to their followers, as seen on services such as Twitch™.

Regardless of the content type or business model, the video source is normally captured at the highest resolution possible, typically 1080p or 4K UHD, and then encoded (usually using H.264) for economical distribution to subscribers or followers.
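The observation above, that encoding becomes a larger share of overall cost as the audience per stream shrinks, can be illustrated with a small calculation. This is only a sketch: the function name is hypothetical, and the per-hour encoding and per-viewer delivery costs are made-up placeholder values, not figures from this paper or from any provider.

```python
def encoding_cost_share(encode_cost_per_hour: float,
                        delivery_cost_per_viewer_hour: float,
                        viewers: int) -> float:
    """Fraction of the hourly cost of one live stream spent on encoding.

    The cost model (one fixed encoding cost per stream plus a delivery
    cost that scales linearly with viewers) is a simplifying assumption.
    """
    delivery = delivery_cost_per_viewer_hour * viewers
    return encode_cost_per_hour / (encode_cost_per_hour + delivery)


# Hypothetical placeholder costs, chosen only to show the trend.
ENCODE_COST = 1.00      # cost units per stream-hour (assumed)
DELIVERY_COST = 0.01    # cost units per viewer-hour (assumed)

for viewers in (50_000, 500, 50, 5):
    share = encoding_cost_share(ENCODE_COST, DELIVERY_COST, viewers)
    print(f"{viewers:>6} viewers: encoding is {share:.1%} of hourly cost")
```

With these placeholder values, encoding is a fraction of a percent of the cost of a stream watched by tens of thousands of viewers, but the majority of the cost of a stream watched by a few dozen. The absolute numbers are not meaningful; only the direction of the trend is.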
For video content intended for post-production and later VOD playback, the video might be provided as an MP4 file. For live video streaming, however, the video is typically provided as a continuous Real-Time Messaging Protocol (RTMP) stream.

Display Devices

At the right side of Figure 2 are the consumers of the video content: potentially thousands of subscribers or followers watching each unique live video stream.

Consumers want to enjoy their video content from wherever they are, using a variety of playback devices. Consumers watching on PCs or 4K UHD TVs with broadband connections will be served a high-quality video stream delivered at a high bitrate. Other consumers, however, might be using a Wi-Fi-connected tablet with a 720p video player inside a browser or downloadable application, or possibly even a 3G mobile phone with only 640x360 (360p) resolution. Sending a 4K UHD stream to a 3G mobile phone not only wastes bandwidth and connectivity expense, but is also likely to cause buffering and clipping during playback, resulting in a poor quality of experience (QoE) for the mobile subscriber.

For these reasons and others, every source of original video content, whether intended for later VOD playback or for a live streaming service, is usually encoded at several different bitrates and resolutions, a set sometimes known as an encoding ladder.

Video Encoding and Transcoding

The goal of the video transcoding function is to ingest the live video stream and generate the encoding ladder for distribution through the Content Distribution Network (CDN) to the end devices.

The top step of the encoding ladder is the best-quality encoding of the original source material, delivered at the highest bitrate and frame rate and intended for high-resolution video displays and devices. If the source video can be captured at 4K UHD resolution, then the top step of the ladder can also be a 4K UHD output stream. The bottom step of the encoding ladder is the same original video content, transcoded down to a lower bitrate, frame rate, and resolution for “worst case scenario” streaming conditions to the smallest smartphone. In between are the incremental steps of the encoding ladder, with increasing bitrates and resolutions, based on the variety of target devices or video players that consumers might use to watch the live streaming content.
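The encoding ladder described above can be sketched as a simple data structure, together with one possible rule for matching a viewer to a step. The resolutions follow the steps shown in Figure 2; the bitrates, the `Rung` type, and the `pick_rung` selection rule are hypothetical and for illustration only. In practice, adaptive bitrate players apply their own selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rung:
    """One step of an encoding ladder."""
    name: str
    width: int
    height: int
    bitrate_kbps: int  # hypothetical value, not from this paper


# Steps follow Figure 2; bitrates are illustrative assumptions.
LADDER = [
    Rung("4K/UHD", 3840, 2160, 16000),
    Rung("1080p", 1920, 1080, 6000),
    Rung("720p", 1280, 720, 3000),
    Rung("480p", 854, 480, 1500),
    Rung("360p", 640, 360, 800),
]


def pick_rung(ladder: list[Rung], bandwidth_kbps: int, max_height: int) -> Rung:
    """Return the highest-bitrate rung that fits the viewer's bandwidth and
    screen height, falling back to the lowest-bitrate rung if none fits."""
    candidates = [
        r for r in ladder
        if r.bitrate_kbps <= bandwidth_kbps and r.height <= max_height
    ]
    if not candidates:
        return min(ladder, key=lambda r: r.bitrate_kbps)
    return max(candidates, key=lambda r: r.bitrate_kbps)


# A 4K TV on broadband, a tablet on Wi-Fi, and a 3G phone (assumed bandwidths).
print(pick_rung(LADDER, 25000, 2160).name)
print(pick_rung(LADDER, 4000, 720).name)
print(pick_rung(LADDER, 1000, 360).name)
```

The fallback to the bottom step reflects the “worst case scenario” role of that step: a viewer whose connection cannot sustain any step still receives the smallest stream rather than nothing.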