Thailand Tuullitu

THAILANDUS009979960B2 TUULLITU (12 ) United States Patent ( 10) Patent No. : US 9 ,979 , 960 B2 Zhu et al. ( 45 ) Date of Patent: May 22 , 2018

(54 ) FRAME PACKING AND UNPACKING (3858 ) Field of Classification Search BETWEEN FRAMES OF CHROMA CPC . . HO4N 19 / 00436 ; H04N 19 /63 ; H04N 19 /70 ; SAMPLING FORMATS WITH DIFFERENT H04N 19/ 503 ; H04N 19/ 436 ; CHROMA RESOLUTIONS (Continued ) (71 ) Applicant : Microsoft Corporation , Redmond , WA (56 ) References Cited (US ) U . S. PATENT DOCUMENTS (72 ) Inventors : Lihua Zhu , Mountain View , CA (US ) ; 5 ,047 ,853 A * 9 / 1991 Hoffert ...... G06T 9 /005 Sridhar Sankuratri, Campbell, CA 375 / 240 . 01 (US ) ; B . Anil Kumar, Saratoga , CA 5 ,412 ,428 A * 5 / 1995 Tahara .. .. H04N 7 / 54 (US ); Yongjun Wu, Bellevue , WA 375 / 240. 25 (US ); Sandeep Kanumuri, Redmond , WA (US ) ; Shyam Sadhwani, Bellevue, (Continued ) WA ( US ) ; Gary J. Sullivan , Bellevue , FOREIGN PATENT DOCUMENTS WA (US ) CN 1627830 6 / 2005 ( 73 ) Assignee: Microsoft Technology Licensing , LLC , CN 102801988 11 / 2012 Redmond , WA (US ) ( Continued ) ( * ) Notice : Subject to any disclaimer, the term of this OTHER PUBLICATIONS patent is extended or adjusted under 35 U . S . C . 154 ( b ) by 373 days . International Search Report and Written Opinion dated Dec . 2 , 2013 , from International Patent Application No . PCT/ US2013 / (21 ) Appl. No .: 14 /027 ,013 061784 , 15 pp . (Continued ) ( 22 ) Filed : Sep . 13, 2013 Primary Examiner - Jay Patel (65 ) Prior Publication Data Assistant Examiner — Marnie Matt US 2014 / 0092998 A1 Apr. 3 , 2014 (74 ) Attorney , Agent, or Firm — Klarquist Sparkman , LLP Related U . S . Application Data ( 57 ) ABSTRACT (60 ) Provisional application No .61 / 708 ,328 , filed on Oct . Video frames of a higher - resolution chroma sampling format 1 , 2012 . such as YUV 4 : 4 : 4 are packed into video frames of a lower - resolution chroma sampling format such as YUV (51 ) Int. CI. 4 : 2 : 0 for purposes of video encoding. For example , sample HO4N 19 / 186 ( 2014 .01 ) values for a frame in YUV 4 :4 : 4 format are packed into two H04N 19 / 30 ( 2014 .01 ) frames in YUV 4 : 2 :0 format. After decoding , the video ( Continued ) frames of the lower -resolution chroma sampling format can ( 52 ) U .S . CI. be unpacked to reconstruct the video frames of the higher CPC ...... H04N 19 / 00436 (2013 . 01 ); H04N 1 /646 resolution chroma sampling format . In this way , available ( 2013 .01 ) ; H04N 9 64 ( 2013 .01 ) ; encoders and decoders operating at the lower -resolution (Continued ) (Continued )

900

Y 444 plane U444 plane V444 plane for frame 801 for frame 801 for frame 801 of YUV= 4 : 4 : 4 format of YUV= 4 : 4 : 4 format of YUV= 4 : 4 : 4 format pack as two video frames of YUV 4 : 2 : 0 format, with every second row as B4 and B5

first frame 902 of YUV 4 :2 :0 B1 format Y # 44 plane B2 B3 Udzo plane | V4zo plane B6 B8 ( second frame 903 BA of YUV 4 : 2 : 0 odd rows of U444 plane B7 B9 format) B5 -odd rows of V444 plane BB US 9 ,979 ,960 B2 Page 2 chroma sampling format can be used , while still retaining 9 ,288 ,500 B23 /2016 Budagavi higher resolution chroma information . In example imple 2002 /0101536 A1 * 8 /2002 Cook ...... HO4N 9 /641 348 / 453 mentations, frames in YUV 4 : 4 : 4 format are packed into 2003 /0108248 AL 6 /2003 Huang et al. frames in YUV 4 : 2 : 0 format such that geometric correspon 2005 /0024384 A1 * 2 /2005 Evans ...... GO9G 5/ 363 dence is maintained between Y , U and V components for the 345 /604 frames in YUV 4 : 2 : 0 format. 2005 /0053294 A1 * 3 /2005 Mukerjee ...... H04N 19 /52 382 /236 29 Claims, 11 Drawing Sheets 2005 /0228654 Al 10 /2005 Prieto et al . 2006 / 0013490 A1 1 / 2006 Sun 2007 /0074266 A1 3 /2007 Raveendran et al . 2007 /0110153 A1 5 / 2007 Cho et al . 2008 / 0043852 Al 2/ 2008 Park et al. ( 51 ) Int. Cl. 2008 /0069247 Al 3 / 2008 He H04N 19 /59 ( 2014 . 01 ) 2009 / 0003435 Al 1 / 2009 Cho et al . H04N 19 / 70 ( 2014 .01 ) 2009 /0219994 A1 * 9 /2009 Tu ...... HO4N 19 /186 H04N 19 / 80 ( 2014 .01 ) 375 /240 .08 H04N 19 / 88 ( 2014 . 01 ) 2009 /0225225 A1 9 /2009 Nakagawa et al . HO4N 1 /64 ( 2006 .01 ) 2010 /0046612 A1 2 /2010 Sun et al. H04N 9 /64 ( 2006 . 01 ) 2010 /0046635 AL 2 /2010 Pandit et al. H04N 21/ 2343 ( 2011 .01 ) 2011/ 0199542 AL 8 / 2011 Hirai H04N 19 / 33 2011 /0280316 A1 * 11/ 2011 Chen ...... HO4N 13 / 0048 ( 2014 .01 ) 375 / 240 . 25 (52 ) U .S . CI. 2011/ 0286530 A1 * 11/ 2011 Tian ...... HO4N 21/ 2365 CPC ...... H04N 19 / 186 ( 2014 . 11 ) ; H04N 19 /30 375 / 240 .25 ( 2014 . 11 ) ; H04N 19 /59 ( 2014 . 11) ; H04N 2012/ 0008679 A1 * 1/ 2012 Bakke ...... H04N 19 / 186 19 / 70 (2014 . 11 ); H04N 19 / 80 ( 2014 . 11) ; 375 /240 . 08 H04N 19 /88 ( 2014 . 11 ) ; H04N 21 /234327 2012 /0020413 A1* 1 /2012 Chen ...... H04N 19 /597 (2013 .01 ) ; H04N 21/ 234363 (2013 . 01 ) 375 /240 . 26 2012/ 0093226 A1* 4 / 2012 Chien ...... HO4N 19 / 139 (58 ) Field of Classification Search 375 / 240 . 16 CPC ...... HO4N 19 / 186 ; H04N 19 /30 ; HO4N 19 / 59 ; 2012/ 0236115 AL 9 /2012 Zhang et al. HO4N 19 / 80 ; HO4N 19 / 88 ; HO4N 1 /646 ; 2012 /0307904 Al 12 /2012 Yi et al. HO4N 9 /64 ; H04N 21/ 234327 ; HO4N 2013 /0003840 AL 1 /2013 Gao et al. 21 / 234363 2013 /0106998 A1 * 5 /2013 Pahalawatta ...... H04N 13/ 0029 USPC ...... 375 / 240 . 08 , 240 . 29 348 / 43 See application file for complete search history . 2013 /0113884 A1 5 / 2013 Leonatris 2013 /0121415 Al 5 /2013 Wahadaniah et al. (56 ) References Cited 2013 / 0188744 A1 7 /2013 Van de Auwera 2013/ 0202201 A1 8 / 2013 Park et al. U . S . PATENT DOCUMENTS 2013 /0243076 A1 9 / 2013 Malladi 5 ,650 ,824 A * 7 / 1997 Huang ...... H04N 7 /012 2013 /0287097 A1 10 /2013 Song et al. 348/ 450 2014 /0022460 A11 /2014 Li et al . 5 , 712, 687 A * 1 / 1998 Naveen ...... HO4N 9 /64 2014 / 0064379 A1 3 / 2014 Mrak et al. 348 / 441 2014 / 0072027 Al 3 / 2014 Li et al. 5 ,742 ,892 A * 4 / 1998 Chaddha ...... G06T 9 /008 2014 /0072048 A13 / 2014 Ma et al. 375 / 240 . 11 2014 / 0112394 A1 4 / 2014 Sullivan et al . 6 ,208 ,350 B13 / 2001 Herrera 2014 /0169447 A1 6 /2014 Hellman 6 ,674 ,479 B2 * 1 / 2004 Cook ...... H04N 9 /641 2014 / 0192261 A1 7 / 2014 Zhu 345 /600 2014 / 0247890 A1 9 /2014 Yamaguchi 6 ,700 , 588 B13 / 2004 MacInnis et al. 2014 /0301464 A 10 / 2014 Wu 6 , 937 , 659 B1 * 8 / 2005 Nguyen ...... H04N 19 / 503 2014 /0341305 Al 11 /2014 Qu et al . 348/ 397 . 1 2015 / 0010068 A1 1 / 2015 Francois et al . 6 . 938. 105 B2 8 / 2005 Osa 7 , 551 , 792 B2 6 / 2009 Kong et al. 2015 /0016501 A1 1 / 2015 Guo et al. 7 ,924 ,292 B24 / 2011 Bujold et al . 2016 /0212373 A1 7 /2016 Aharon et al. 7 , 995 ,069 B2 8 / 2011 Van Hook et al . 2016 /0212433 A1 7/ 2016 Aharon et al. 8 , 054 , 886 B2 11 / 2011 Srinivasan et al . 2016 /0277762 A1 9 /2016 Zhang et al. 8 ,139 , 081 B1 * 3 /2012 Daniel ...... H04N 19 /40 345 / 546 FOREIGN PATENT DOCUMENTS 8 , 472 , 731 B2 6 / 2013 Suzuki et al. 8 , 532 ,175 B2 9 / 2013 Pandit et al. EP 0788282 8 / 1997 8, 532 ,424 B2 9 / 2013 Zarubinsky et al . EP 1542476 6 / 2005 8 , 625 ,666 B2 1 / 2014 Bakke 2456204 5 / 2012 8 , 639, 057 B1 1/ 2014 Mendhekar et al. 2904769 8 / 2015 8 , 737, 466 B2 5 / 2014 Demos 2001 - 197499 7 / 2001 8 ,780 ,996 B2 7 / 2014 Bankoski et al . 2004 - 120134 4 / 2004 8, 787 ,443 B2 7 / 2014 Sun et al. 2005 - 176383 6 / 2005 8 , 787 ,454 B1 7 / 2014 Chechik et al. 2010 - 110007 5 /2010 8 , 817, 179 B2 8 / 2014 Zhu et al. JP 2010 - 531609 9 / 2010 8 , 837, 826 B1 . 9 / 2014 Gaddy WO WO 99 / 37097 7 / 1999 8 , 953 , 673 B2 2 / 2015 Tu et al. WO WO 2008 / 143156 11 / 2008 9 ,153 , 017 B1 10 /2015 Russell et al. WO WO 2009 /002061 12 / 2008 9 ,185 ,421 B2 11/ 2015 Wang et al. WO WO 2013 / 128010 9 / 2013 US 9, 979 , 960 B2 Page 3

( 56 ) References Cited ISO /IEC , " Information Technology - Coding of Audio - Visual Objects : Visual, ISO / IEC 14496 -2 , Committee Draft, ” 330 pp . (Mar . OTHER PUBLICATIONS 1998 ) . ITU - R , Recommendation ITU - R BT. 601 - 7 , “ Studio Encoding ISO / IEC , “ Information Technology Coding of Audio - Vi Parameters of Digital Television for Standard 4 : 3 and Wide -screen sual Objects — Part 10 : Advanced Video Coding ,” ISO / IEC 16 : 9 Aspect Ratios, " 19 pp . (Mar . 2011) . 14496 - 10 , 7th edition , 720 pp . (May 2012 ). ITU - R , Recommendation ITU - R BT. 709 - 5 , “ Parameter Values for Jin et al. , “ Resynchronization and Remultiplexing for Transcoding the HDTV Standards for Production and International Programme to H . 264 / AVC , ” Journal of Zhejiang University Science A , pp . Exchange ," 32 pp . ( Apr. 2002 ) . 76 -81 ( Jan . 2006 ). ITU - R , Recommendation ITU - R BT. 2020 , “ Parameter Values for Ma et al. , “ De - Ringing Filter for Scalable Video Coding, ” IEEE Ultra -high Definition Television Systems for Production and Inter Int 'l Conf. on Multimedia and Expo Workshops, 4 pp . ( Jul. 2013 ) . national Programme Exchange, " 7 pp . ( Aug . 2012 ). Rao et al ., Techniques and Standards for Image , Video , and Audio ITU - T , “ ITU - T Recommendation H . 261, Video Codec for Audio Coding , Prentice -Hall , Ch . 2 , pp . 9 - 16 ( 1996 ) . visual Services at px64 kbits, ” 28 pp . ( Mar. 1993 ) . Reddy et al ., “ Subband Decomposition for High -Resolution Color ITU - T, “ ITU - T Recommendation H . 262 , Information Technol in HEVC and AVC 4 :2 :0 Video Coding Systems, ” Microsoft ogy Generic Coding of Moving Pictures and Associated Audio Research Tech Report MSR - TR - 2014 -31 , 12 pp . (Mar . 2014 ) . Information : Video , ” 218 pp . ( Jul. 1995 ) . Rossholm et al. , “ Adaptive De -Blocking De -Ringing Post Filter, " ITU - T , “ ITU - T Recommendation H . 263 , Video Coding for Low Bit IEEE Int' l Conf. on Image Process , vol. 2 , 4 pp . (Sep . 2005 ). Rate Communication , " 167 pp . ( Feb . 1998 ) . Schwarz et al . , “ Overview of the Scalable Video Coding Extension ITU - T, H . 264 , “ Advanced Video Coding for Generic Audiovisual of the H .264 / AVC Standard, ” IEEE Trans. on Circuits and Systems Services ,” 680 pp . (Jan . 2012 ) . for Video Technology, vol. 17 , No. 9 , pp . 1103 - 1120 (Sep . 2007 ) . ITU - T , T .800 , " Information Technology — JPEG 2000 Image Cod Wan et al ., “ Perceptually Adaptive Joint Deringing - Deblocking ing System : Core Coding System , ” 217 pp . ( Aug. 2002 ) . Filtering for Scalable Video Transmission over Wireless Networks, ” Le Gall et al. , “ Sub -band Coding of Digital Images Using Sym Proc. of Signal Processing : Image Communication , vol. 22 , Issue 3 , metric Short Kernel Filters and Arithmetic Coding Techniques ,” 25 pp . (Mar . 2007) . IEEE Trans. on Acoustics , Speech , and Signal Processingpp . 761 Wong , “ Enhancement and Artifact Removal for Transform Coded 764 ( Apr. 1988 ). Document Images, " powerpoint slides, 45 pp . (Apr . 2010 ) . Lin et al ., “ Syntax and Semantics of Dual -coder Mixed Chroma Ying et al. , “ 4 : 2 : 0 Chroma Sample Format for Phase Difference sampling - rate (DMC ) Coding for 4 : 4 : 4 Screen Content, " JCTVC Eliminating and Color Space Scalability ," JVT- 0078 , 13 pp . ( Apr. JO233 , 4 pp . ( Jul. 2012 ) . 2005 . ) . Microsoft Corporation , “ Clear Type Information ,” downloaded from Ali et al . , " Survey of Dirac : A Wavelet Based Video Codec for http : / /www .microsoft . com / typography /cleartypeinfo .mspx , 2 pp . Multiparty Video Conference and Broadcasting , " Intelligent Video ( Jan . 2010 ) . Event Analysis & Understanding , pp . 211 -247 (Jan . 2011) . Microsoft Corporation , “ Microsoft RemoteFX ,” downloaded from Bross et al . , “ High Efficiency Video Coding (HEVC ) Text Speci http : / /technet .microsoft . com /en -us / library / ff817578 (WS . 10 ) .aspx , fication Draft 7 ," JCTVC -11003 _ d5 , 294 pp . ( Apr. 2012 ) . 10 pp . ( Feb . 2011) . Bross et al. , “ High Efficiency Video Coding (HEVC ) Text Speci Mohanbabu et al. , “ Chroma Subsampling Based Compression of fication Draft 9 ," JCTVC -K1003 , 311 pp . (Oct . 2012 ). Bayer -Pattern Video Sequences using Gradient Based Interpola Bross et al . , “ High Efficiency Video Coding (HEVC ) Text Speci tion , ” European Journal of Scientific Research , vol. 86 , No . 4 , pp . fication Draft 10 ( for FDIS & Last Call ), ” JCTVC -L1003 , 310 pp . 556 - 564 (Sep . 2012 ) . (Jan . 2013 ) . Rothweiler, “ Polyphase Quadrature Filters A New Subband Cod Bross et al. , “ Proposed Editorial Improvements for High Efficiency ing Technique, ” IEEE Trans. on Acoustics, Speech , and Signal Video Coding (HEVC ) Text Specification Draft 8 , " JCTVC -K0030 , Processing , pp . 1280 - 1283 ( Apr. 1983 ) . 276 pp . (Oct . 2012 ) . Smith et al ., “ Exact Reconstruction Techniques for Tree - Structured Calderbank et al ., “ Wavelet Transforms That Map Integers to Subband Coders , ” IEEE Trans. on Acoustics , Speech , and Signal Integers ,” Applied and Computational Harmonic Analysis , vol. 5 , Processing , vol. ASSP - 34 , No . 3 , pp . 434 -441 ( Jun . 1986 ) . pp . 332 - 369 ( 1998 ) . SMPTE 421M , “ VC - 1 Compressed Video Bitstream Format and Chen et al ., “ R - D Cost Based Effectiveness Analysis of Dual- coder Decoding Process, ” 493 pp . ( Feb . 2006 ) . Mixed Chroma- sampling - rate (DMC ) Coding for 4 : 4 : 4 Screen Sullivan et al ., “ Recommended 8 - Bit YUV Formats for Video Content, " JCTVC - J0353, 6 pp . ( Jul. 2012 ) . Rendering (Windows ) , ” downloaded from http :/ / msdn .microsoft . Cohen et al. , “ Biorthogonal Bases of Compactly Supported Wave com /en -us / library /windows / desktop / dd206750 % 28v = vs .85 % 29 . lets , " Communications on Pure and Applied Mathematics , pp . aspx , 14 pp . (Nov . 2008 ) . 485 - 560 ( 1992 ) . Sullivan et al. , " Overview of the High Efficiency Video Coding Flynn et al. , “ High Efficiency Video Coding ( HEVC ) Range Exten (HEVC ) Standard ,” IEEE Trans. on Circuits and Systems for Video sions Text Specification : Draft 3 , " JCTVC -M1005 , 315 pp . ( Apr. Technology , vol. 22 , No. 12 , pp . 1649 - 1668 (Dec . 2012 ) . 2013 ) . Sweldens, “ The Lifting Scheme: A Construction of Second Gen Gold , “ Stop Worrying About Compression With an on - Camera eration Wavelets , ” to appear in SIAM Journal on Mathematical Video Recorder, " downloaded from http : / /www .bhphotovideo .com / Analysis, 42 pp . (May 1995 ) . indepth /video /hands - reviews / stop -worrying - about- compression Uytterhoeven et al. , “ Wavelet Based Interactive Video Communi camera - video - recorder , 6 pp . ( Jul. 2012 ) . cation and Image Database Consulting - Wavelet TransformsUsing He et al. , “ De- Interlacing and YUV 4 : 2 : 2 to 4 : 2 :0 Conversion on the Lifting Scheme, " 24 pp . (Apr . 1997 ) . TMS320DM6446 Using the Resizer, ” Texas Instruments Applica Vetro , “ Frame Compatible Formats for 3D Video Distribution ," tion Report SPRAAK3B , 18 pp . (Dec . 2008 ) . Mitsubishi Electric Research Laboratories TR2010 - 099 , 6 pp . (Nov . “ HEVC Software Repository , " downloaded from https : // hevc .hhi . 2010 ). fraunhofer. de / svn /svn _ HEVCSoftware , 1 p . (downloaded on Sep . Villasenor et al ., “ Wavelet Filter Evaluation for Image Compres 17 , 2013 ) . sion ,” IEEE Trans. on Image Processing , vol. 4 , No . 8 , pp . 1053 ISO / IEC , “ Information Technology — JPEG 2000 Image Coding 1060 (Aug . 1995 ) . System - Part 11 : Wireless, ” ISO / IEC FCD 15444 - 11 , 72 pp . (Mar . Wiegand et al. , “ Overview of the H . 264 /AVC Video Coding Stan 2005 ) . dard ,” IEEE Trans. on Circuits and Systems for Video Technology , ISO /IEC , “ ISO /IEC 11172 - 2 , Information Technology - Coding of vol. 13 , No . 7 ,pp . 560 -576 ( Jul. 2003 ) . Moving Pictures and Associated Audio for Digital Storage Media at Wikipedia , “ Chroma Subsampling ," 8 pp . ( last modified Aug. 20 , up to About 1 .5 Mbit /s , " 122 pp . ( Aug. 1993) . 2013 ) . US 9, 979 , 960 B2 Page 4

( 56 ) References Cited Lin et al. , “ Mixed Chroma Sampling -Rate High Efficiency Video Coding for Full - Chroma Screen Content, ” IEEE Trans . on Circuits OTHER PUBLICATIONS and Systems for Video Technology , vol. 23 ,No . 1, pp . 173 -185 ( Jan . 2013 ) . Wu et al. , “ Frame Packing Arrangement SEI for 4 : 4 : 4 Content in Su et al . , " Image Interpolation by Pixel Level Data - Dependent 4 : 2 : 0 Bitstreams, ” JCTVC -K0240 , 6 pp . (Oct . 2012 ) . Wu et al. , “ Tunneling High -Resolution Color Content through 4 : 2 : 0 Triangulation ,” Computer Graphics Fourm , 13 pp . ( Jun . 2004 ) . HEVC and AVC Video Coding Systems, ” IEEE Data Compression Wiegand , “ Study of Final Committee Draft of Joint Video Speci Conf. , 10 pp . (Mar . 2013 ) . fication ( ITU - T Rec . H . 264 | ISO / IEC 14496 - 10 AVC ” JVT- F100 , Zhang et al. , “ Additional Experiment Results for Frame Packing 242 pp . (Dec . 2002 ) . Arrangement SEI Message for 4 :4 :4 Content in 4 : 2: 0 Bitstreams, " Dumic et al ., “ Image Quality of 4 :2 : 2 and 4 : 2 : 0 Chroma JCTVC -M0281 , 11 pp . ( Apr. 2013 ) . Subsampling Formats ,” Int 'l Symp . ELMAR , pp . 19 - 24 (Sep . 2009 ) . Zhang et al. , “ BD - rate Performance vs . Dictionary Size and Hash First Office Action dated Jun . 19 , 2017 , from Chinese Patent table Memory Size in Dual - coder Mixed Chroma- sampling- rate Application No. 201380051471. 9 , 17 pp . (DMC ) Coding for 4 :4 :4 Screen Content ,” JCTVC - J0352, 3 pp . Khairat et al. , “ Adaptive Cross- Component Prediction for 4 :4 :4 ( Jul. 2012 ) . High Efficiency Video Coding, ” Int' l Conf. on Image Processing , Zhang et al. , “ Updated Proposal for Frame Packing Arrangement pp . 3734 - 3738 (Oct . 2014 ) . SEI for 4 : 4 : 4 Content in 4 : 2 : 0 Bitstreams, ” JCTVC - L0316 , 10 pp . Lin et al. , “ Mixed Chroma Sampling - Rate Coding : Combining the ( Jan . 2013) . Merits of 4 : 4 :4 and 4 : 2 : 0 and Increasing the Value of Past 4 : 2 : 0 Zhang et al. , “Updated Proposal with Software for Frame Packing Investment, ” JCTVC -H0065 , 5 pp . (Feb . 2012 ). Arrangement SEI Message for 4 : 4 : 4 Content in 4 : 2 : 0 Bitstreams, " Notification of Reasons for Refusal dated Jun . 20 , 2017 , from JCTVC -N0270 , 9 pp . ( Aug. 2013 ) . Japanese Patent Application No . 2015 - 534632 , 6 pp . Communication Pursuant to Rules 161( 1 ) and 162 EPC dated May 11 , 2015 , from European Patent Application No. 13774029 . 6 , 2 pp . * cited by examiner U . S . Patent May 22, 2018 Sheet 1 of 11 US 9 ,979 , 960 B2

Figure 1 ------| computing environment 100 communication Ir------connection (s ) 170 | | central graphics or Til processing co -processing input device( s ) 150 unit 110 unit 115 =

= output device ( s ) 160 memory 120 mcmory 125 storage 140 L = 2 . . . i - - - - - — — — — — — — software 180 implementing one or more innovations for frame packing and /or unpacking for higher - resolution chroma sampling formats

Figure 2a 201

RTC tool 210 RTC tool 210 encoder 220 encoder 220 network 250 340decoder 270 - decoder3 270

202 playback tool 214 Figure 2b decoder 270 encoding tool 212 network 250 playback tool 214 encoder 220 decoder 270 U . S . Patent May 22, 2018 Sheet 2 of 11 US 9 ,979 , 960 B2

outputframe386ofhigher outputframe381of reschromasamplingformat lower-reschroma samplingformat 2output destination390 9 frameunpacker 385 [un]decoder360

-> - coded data341 framepackingarrangement metadata317 channel 350 300 coded data341 -* videosource 310 framepacker 315 encoder340 arrivingsourceframe(s) sourceframc(s)316of Figure3 311ofhigher-res chromasamplingformat lower-reschroma samplingformat U . S . Patent May 22 , 2018 Sheet 3 of 11 US 9 .979 ,960 B2

] ?

- channel coder 480 codedtemporary 470arcadata coded data471 channel490 ;

Figure4 1 400 decodingprocess450 emulator MMCOinformation442 coded |frame(s)441 frame 451 ??????????? decoded -agram storagearea460 decodedframe temporarymcmory 461| coreencoder440 4631146n ? decoded(ref) frame(s)469, arrivingsourceframe(s) source framc(s)431 411ofhigher-reschromaformat sampling memorystoragearca420 frame packer 415 sourceframetemporary 421|42242n selector 430 frame(s) 429 Sourceframe(s)416 sour source oflower-reschroma samplingformat mcm ???video source 410 U . S . Patent May 22, 2018 Sheet 4 of 11 US 9 ,979 , 960 B2

coded data521 channel decoder 520 temporarycoded dataarea530

$ - 2 , - 510 channel -

- Figure5 - -

- 500 - - somdet550decodercore - frame(s) - MMCOinformation532 ??? coded 531 -

1 - 551 - decoded frame decoded(ref) - 569frame)(s - decodedframe temporarymemory 560areastorage -

561562 56356n - DD - - of586frameoutput - higher-reschroma samplingformat - -

- frame unpacker 585 output scqucncer 580 { - - - - - L |outputframe581of1 |lower-reschroma |samplingformat output destination 590 L U . S . Patent May 22, 2018 Sheet 5 of 11 US 9 ,979 , 960 B2

encoded data695 600 encoded 695data buffer690 buffer690 IntraPath InterPath cntropy coder680 entropycoder680 motion information 615 -02-05quantizer 670 inverse quantizer 676 inverse quantizer676 quantizer 670 residual645 L= frequency transfor mer660216 inverse frequency tran.666 inverse frcquencytran666. E39-frcqucncy transfor 660mer 605offramecurrent 65-9-EH 635 lower-reschroma samplingformat currentframe605oflower-res in-loop dcblock filter622 motion compen 630sator chromasamplingformat reference frame(s)625 frame storc(s)620 motion estimator 610 U . S . Patent May 22, 2018 Sheet 6 of 11 US 9 ,979 , 960 B2

700700 cncoded data795 – encoded 795data buffer790 buffer790

frame(s)725 rangement- referenceLo entropy decoder 780 motion compen 730sator entropy decoder 780 715 predicted frame735 15-3inverse quantizer 770 inverse quantizer 770

storc(s) residuals745 IntraPath InterPath framc 720 reconstructed inverse frequency 760tran. inverse frequency tran,760

in-loop dcblock filter710 705framereconstructed 705framereconstructed oflower-reschroma samplingformat oflower-reschroma samplingformat Figure7 post-proc. deblock filter708 U . S . Patent May 22, 2018 Sheet 7 of 11 US 9 ,979 , 960 B2

Figure 8 800

Y 444 planc U444 planc V444 planc for framc 801 for framc 801 for framc 801 of YUV 4 : 4 : 4 format of YUV 4 : 4 : 4 format of YUV 4 : 4 :4 format

partition spatially

Q1- U444 : Q2- U444 Q1- V 444 : Q2 - V 444 Y 444 plane plane , plane plane plane for frame 801 of YUV 4 :4 :4 format H2- U444 plane H2- V444 plane

reorganize as one or two video frames of YUV 4 :2 :0 = 93BEformat

first frame 802 of YUV 4 :2 :0 format Y444 plane Q1- U444 Q1- V444 plane plane (second frame 803 H2- U444 plane Q2 -U444 Q2 - V444 of YUV 4 : 2 : 0 plane plane format) H2- V444 plane U . S . Patent May 22, 2018 Sheet 8 of 11 US 9 ,979 , 960 B2

Figure 9 900

Y 444 planc U444 planc V444 planc for frame 801 for frame 801 for frame 801 of YUV 4 :4 :4 format of YUV 4 :4 : 4 format of YUV 4 :4 : 4 format

pack as two vidco frames of YUV = =4 : 2 : 0 format , with =every second row as B4 and B5

first frame 902 of YUV 4 : 2 : 0 B1 format Y 444 plane B2 B3 M U420 plane V420 plane BA B6 OSB8 ( sccond framc 903 of YUV 4 : 2 : 0 odd rows of U444 planc B7 | B9 format) 2 au tem automobiliariosB5 Roman andodd rows of V444 planeplans usU . S . Patentreell Maysom 22, 2018. Sheet 9 of 11 US 9 ,979 , 960 B2

Figure 10 1000

Yote Us V4444

.: .

* * * * * * * * . :.

------...... ,

. YUV 4 : 4 : 4 ??????????????????????????????????????? . YYYYYYYYYYYYYYYYYYYYYYYY

. . .: . frame 1001 . . - - -

. : . * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

. . :

Y 420 U 420 V 420

.. . ????????????????????

-** * * * .* .* . . ... * * * * * * * * * * * * * * * " main view 1002 . LES . ... . of YUV 4 : 2 : 0 > . ...

frame . *** . * . .. * ** ...... *** * * ---. .. .4. * * - .

Y 420 U420 V 420

YYYYYYYYYYYYYYYYYYYYYYYYYY SELLE auxiliary view ...... RES

- 1003 of YUV -

. - -

4 : 2 : 0 frame . -

PPP. - rrrrrrrrrrrrrrr - rrrrrrrrrrrrrrrrrr - U . S . Patent May 22, 2018 Sheet 10 of 11 US 9 ,979 , 960 B2

Figure 11 1100

Y 444 plane U444 plane V 444 plane for frame 801 for frame 801 for frame 801 of YUV 4 : 4 : 4 format of YUV 4 :4 : 4 format of YUV 4 : 4 : 4 format

= = pack as two video= frames of YUV 4 : 2 : 0 format, with cvery second 13column as A4 and A5 first framc 1102 of YUV 4 : 2 : 0 A1 format Y444 planc A2 A3 U420 planc V420g planc (second frame A4 A5 A6 A7 A8 A9 1103 of YUV odd columns odd columns 4 : 2 : 0 format ) ZELTof U444planc of V444 planc U . S . Patent May 22, 2018 Sheet 11 of 11 US 9 ,979 , 960 B2

Figure 12 Figure 13

1200 1300 Start Start

- 1210 1310 Pack frame( s) of higher - res Decode frame( s ) of lower chroma sampling format into res chroma sampling format. frame( s ) of lower -res chroma sampling format. 1320 Unpack frame( s ) of lower res chroma sampling format Encode framc( s ) of lower - 1220 into frame( s ) of higher - res res chroma sampling format. chroma sampling format . End 01H02End US 9 ,979 , 960 B2 FRAME PACKING AND UNPACKING 4 : 4 : 4 and YUV 4 : 2 : 0 chroma sampling formats are more BETWEEN FRAMES OF CHROMA easily perceived by viewers . For example , for encoding / SAMPLING FORMATS WITH DIFFERENT decoding of computer screen text content , animated video CHROMA RESOLUTIONS content with artificial hard - edged boundaries, or certain 5 features of video contentmore generally (such as scrolling CROSS REFERENCE TO RELATED titles and hard -edged graphics, or video with information APPLICATION concentrated in chroma channels ), a 4 : 4 : 4 format may be preferable to a 4 : 2 : 0 format. Although screen capture codecs This application claims the benefit of U . S . Provisional supporting encoding and decoding in a 4 : 4 : 4 format are Patent Application No. 61/ 708 ,328 , filed Oct. 1 , 2012 , the 10 available , the lack of widespread support for codecs sup disclosure of which is hereby incorporated by reference . porting 4: 4 :4 formats (especially with respect to hardware BACKGROUND codec implementations) is a hindrance for these use cases. Engineers use compression (also called source coding or 15 SUMMARY source encoding ) to reduce the bit rate of digital video . Compression decreases the cost of storing and transmitting In summary , the detailed description presents innovations video information by converting the information into a in frame packing of video frames of a higher - resolution lower bit rate form . Decompression (also called decoding ) chroma sampling format into video frames of a lower reconstructs a version of the original information from the 20 resolution chroma sampling format for purposes of encod compressed form . A " codec ” is an encoder / decoder system . ing. For example , the higher - resolution chroma sampling Over the last two decades , various video codec standards format is a YUV 4 : 4 : 4 format, and the lower - resolution have been adopted , including the ITU - T H . 261, H . 262 chroma sampling format is a YUV 4 : 2 : 0 format. After (MPEG - 2 or ISO / IEC 13818 - 2 ) , H . 263 and H . 264 (MPEG - 4 decoding , the video frames of the lower- resolution chroma AVC or ISO /IEC 14496 - 10 ) standards and the MPEG - 1 25 sampling format can be unpacked to reconstruct the video ( ISO /IEC 11172 - 2 ) , MPEG - 4 Visual (ISO /IEC 14496 - 2 ) and frames of the higher -resolution chroma sampling format . In SMPTE 421M standards. More recently, the HEVC standard this way, available encoders and decoders operating at the ( ITU - T H .265 or ISO /IEC 23008 -2 ) has been under devel- lower -resolution chroma sampling format can be used , while opment. See , e . g . , draft version JCTVC - 11003 of the HEVC still retaining higher resolution chroma information . standard — “ High efficiency video coding (HEVC ) text 30 For example , a computing device packs one or more specification draft 7 , ” JCTVC - 11003 _ d5 , 9th meeting , frames of a higher- resolution chroma sampling format into Geneva , April 2012 . A video codec standard typically one or more frames of a lower -resolution chroma sampling defines options for the syntax of an encoded video bitstream , format . The computing device can then encode the one or detailing parameters in the bitstream when particular fea - more frames of the lower -resolution chroma sampling for tures are used in encoding and decoding . In many cases, a 35 mat. video codec standard also provides details about the decod - As another example , a computing device unpacks one or ing operations a decoder should perform to achieve confor - more frames of a lower -resolution chroma sampling format mant results in decoding . Aside from codec standards, into one or more frames of a higher- resolution chroma various proprietary codec formats define other options for sampling format. Before such unpacking , the computing the syntax of an encoded video bitstream and corresponding 40 device can decode the one or more frames of the lower decoding operations. resolution chroma sampling format. A video source such as a camera , animation output, screen The packing or unpacking can be implemented as part of capture module , etc. typically provides video that is con - a method , as part of a computing device adapted to perform verted to a format such as a YUV 4 : 4 : 4 chroma sampling the method or as part of a tangible computer - readable media format . A YUV format includes a luma ( or Y ) component 45 storing computer - executable instructions for causing a com with sample values representing approximate brightness puting device to perform the method . values as well as multiple chroma ( or U and V ) components The foregoing and other objects , features , and advantages with sample values representing color difference values . In of the invention will become more apparent from the fol a YUV 4 : 4 : 4 format, chroma information is represented at lowing detailed description , which proceeds with reference the same spatial resolution as luma information . 50 to the accompanying figures. Many commercially available video encoders and decod ers support only a YUV 4 : 2 : 0 chroma sampling format. A BRIEF DESCRIPTION OF THE DRAWINGS YUV 4 : 2 : 0 format is a format that sub - samples chroma information compared to a YUV 4 : 4 : 4 format, so that FIG . 1 is a diagram of an example computing system in chroma resolution is half that of luma resolution both 55 which some described embodiments can be implemented . horizontally and vertically. As a design principle , the deci FIGS . 2a and 2b are diagrams of example network sion to use a YUV 4 : 2: 0 format for encoding /decoding is environments in which some described embodiments can be premised on the understanding that, for most use cases such implemented . as encoding / decoding of natural camera - captured video con - FIG . 3 is a diagram of a generalized frame packing tent, viewers do not ordinarily notice many visual differ - 60 unpacking system in which some described embodiments ences between video encoded /decoded in a YUV 4 : 2 : 0 can be implemented format and video encoded /decoded in a YUV 4 : 4 : 4 format. FIG . 4 is a diagram of an example encoder system in The compression advantages for the YUV 4 : 2 : 0 format, conjunction with which some described embodiments can be which has fewer samples per frame, are therefore compel- implemented . ling . There are some use cases, however , for which video has 65 FIG . 5 is a diagram of an example decoder system in richer color information and higher color fidelity may be conjunction with which some described embodiments can be justified . In such use cases , the differences between YUV implemented . US 9 ,979 , 960 B2 FIG . 6 is a diagram illustrating an example video encoder information ( that is , chroma information is represented at in conjunction with which some described embodiments can the same resolution as luma information ) . As a design be implemented . principle , the decision to use a YUV 4 :2 :0 format for FIG . 7 is a diagram illustrating an example video decoder encoding /decoding is premised on the understanding that , in conjunction with which some described embodiments can 5 for most use cases such as encoding /decoding of natural be implemented . camera -captured video content, viewers do not ordinarily FIG . 8 is a diagram illustrating an example approach to notice many visual differences between video encoded / frame packing that uses spatial partitioning of frames. decoded in a YUV 4 : 2 : 0 format and video encoded /decoded FIG . 9 is a diagram illustrating an example approach to in a YUV 4 :4 : 4 format . The compression advantages for the frame packing in which every second row of chroma com - 10 YUV 4 . 2 . 0 forn ponent planes of frames of a higher -resolution aroma chroma com - " YUV 4 : 2: 0 format, which has fewer samples per frame, are sampling format is copied . therefore compelling . FIG . 10 is a diagram illustrating example frames packed There are some use cases, however, for which the differ according to the approach of FIG . 9 . ences between the two formats are more easily perceived by FIG . 11 is a diagram illustrating an example approach to 15 viewers . For example , for encoding / decoding of computer frame packing in which every second column of chroma screen text content (especially text rendered using component planes of frames of a higher- resolution chroma ClearType technology ), animated video content with artifi sampling format is copied . cial hard -edged boundaries, or certain features of video FIG . 12 is a flow chart illustrating a generalized technique content more generally ( such as scrolling titles and hard for frame packing for frames of a higher - resolution chroma 20 edged graphics, or video with information concentrated in sampling format . chroma channels ), a 4 :4 : 4 format may be preferable to a FIG . 13 is a flow chart illustrating a generalized technique 4 : 2 : 0 format . The lack of widespread support for video for frame unpacking for frames of a higher- resolution codecs supporting 4 : 4 : 4 formats (especially with respect to chroma sampling format. hardware codec implementations ) is a hindrance for these 25 use cases . DETAILED DESCRIPTION The detailed description presents various approaches to packing frames of a higher -resolution chroma sampling A video source such as a camera , animation output, screen format into frames of a lower - resolution chroma sampling capture module , etc . typically provides video that is con verted to a format such as a YUV 4 : 4 : 4 chroma sampling 30 formatF . The frames of the lower - resolution chroma sampling format ( an example of a 4 : 4 : 4 format, more generallysamping ) . A 30 format can then be encoded using an encoder designed for YUV format includes a luma (or Y ) component with sample the lower - resolution chroma sampling format . After decod values representing approximate brightness values as well as ing (using a decoder designed for the lower - resolution multiple chroma (or U and V ) components with sample chroma sampling format ) , the frames at the lower - resolution values representing color difference values . The precise 355 chromachr sampling format can be output for further processing definitions of the color difference values and conversion and display. Or, after such decoding, the frames of the operations to / from a YUV color space to another color space higher -resolution chroma sampling format can be recovered such as RGB ) depend on implementation . In general , as used through frame unpacking for output and display . In many herein , the term YUV indicates any color space with a luma cases , these approaches alleviate the shortcomings of the ( or luminance ) component and one or more chroma ( or 40 prior approaches by preserving chroma information from the chrominance ) components , including Y 'UV , YIQ , Y 'IQ and frames in the higher - resolution chroma sampling format , YDbDr as well as variations such as YCbCr and YCOCg. while leveraging commercially available codecs adapted for The component signal measures that are used may be the lower - resolution chroma sampling format. In particular, adjusted through the application of a non - linear transfer widely available codecs with specialized , dedicated hard characteristics function ( generally known as " gamma pre - 45 ware can provide faster encoding decoding with lower compensation ” and often denoted by the use of a prime power consumption for YUV 4 : 4 : 4 video frames packed into symbol, although the prime symbol is often omitted for YUV 4 : 2 : 0 video frames . typographical convenience ) . Or, the component signal mea - The described approaches can be used to preserve chroma sures may be in a domain that has a linear relationship with information for frames of one chroma sampling format when light amplitude . The luma and chroma component signals 50 encoding / decoding uses another chroma sampling format . may be well aligned with the perception of brightness and Some examples described herein involve frame packing color for the human visual system , or the luma and chroma unpacking of frames of a YUV 4 : 4 : 4 format for encoding component signals may somewhat deviate from such mea - decoding using a codec adapted for a YUV 4 : 2 :0 format. sures (e . g ., as in the YCOCg variation , in which formulas are Other examples described herein involve frame packing applied that simplify the computation of the color compo - 55 unpacking of frames of a YUV 4 : 2 : 2 format for encoding nent values ) . Examples of YUV formats as described herein decoding using a codec adapted for a YUV 4 : 2 : 0 format. include those described in the international standards known More generally, the described approaches can be used for as ITU - R BT. 601 , ITU - R BT. 709 , and ITU - R BT. 2020 . other chroma sampling formats . For example, in addition to Examples of chroma sample types are shown in Figure E - 1 variations of YUV color spaces such as Y 'UV , YIQ , Y 'IQ , of the H . 264 / AVC standard . A 4 : 4 : 4 format can be a YUV 60 YdbDr, YCbCr, YCOCg , etc . in sampling ratios such as 4 :4 :4 format or format for another color space, such as RGB 4 :4 : 4 , 4 :2 : 2 , 4 : 2: 0 , 4 : 1: 1 , 4 :0 :0 , etc ., the described or GBR approaches can be used for color spaces such as RGB ,GBR , Many commercially available video encoders and decod - etc . in sampling ratios such as 4 : 4 : 4 , 4 : 2 : 2 , 4 : 2 : 0 , 4 : 1 : 1 , ers support only a YUV 4 : 2 : 0 chroma sampling format (an 4 : 0 : 0 , etc. as the chroma sampling formats . example of a 4 : 2 : 0 format, more generally ) . YUV 4 : 2 : 0 is a 65 In example implementations, specific aspects of the inno format that sub - samples chroma information compared to a vations described herein include , but are not limited to , the YUV 4 : 4 : 4 format, which preserves full -resolution chroma following: US 9 ,979 ,960 B2 6 Packing a 4 : 4 : 4 frame into two 4 : 2 : 0 frames, and encod Additional???? innovative aspects of frame packing and ing the two 4 : 2 : 0 frames using a video encoder unpacking for higher -resolution chroma sampling formats designed for 4 :2 :0 format. are also described . The described techniques may be applied Decoding the encoded frames using a video decoder to additional applications other than video coding /decoding , designed for 4 : 2 : 0 format, and unpacking the two 5 such as still - image coding , medical scan content coding , decoded 4 : 2 : 0 frames to form a decoded 4 : 4 : 4 frame. multispectral imagery content coding, etc . Although opera Performing the packing for a YUV format such that a tions described herein are in places described as being geometric correspondence is maintained between Y , U performed by an encoder ( e . g ., video encoder ) or decoder and V components for each of the two 4 : 2 : 0 frames . ( e . g . , video decoder ) , in many cases the operations can Performing the packing for a YUV format such that one 10 alternatively be performed by another type of media pro of the two 4 :2 :0 frames (a main view ) represents the cessing tool. complete scene being represented by 4 : 4 :4 frame, albeit Some of the innovations described herein are illustrated with chroma components at a lower resolution , while with reference to syntax elements and operations specific to the other 4 : 2 : 0 frame (an auxiliary view ) packs the the HEVC standard . For example , reference is made to the remaining chroma information . 15 draft version JCTVC - 11003 of the HEVC standard — “ High Signaling an indication of use of the two 4 :2 :0 frames efficiency video coding (HEVC ) text specification draft 7 ,” with a type of supplemental enhancement information JCTVC - 11003 _ d5 , 9th meeting , Geneva , April 2012 . The (“ SEI” ) message or other metadata , such that a decoder innovations described herein can also be implemented for that processes this SEI message can output the 4 : 4 : 4 other standards or formats . For example , innovations frame or the 4 : 2 : 0 frame that represents the scene . 20 described herein can be implemented for the H . 264 / AVC Pre- processing and post - processing operations that can standard using frame packing arrangement SEI messages. improve the quality of the final displayed frame for a More generally , various alternatives to the examples YUV format when only one 4 : 2 : 0 frame (out of the two described herein are possible . For example , any of the 4 :2 : 0 frames ) is used for final display . In conjunction methods described herein can be altered by changing the with such pre - processing and post - processing opera - 25 ordering of the method acts described , by splitting , repeat tions, the 4 : 2 : 0 frames can have a higher bit depth for ing , or omitting certain method acts , etc . The various aspects encoding/ decoding , so as to avoid loss of chroma of the disclosed technology can be used in combination or information in pre -processing and post - processing separately . Different embodiments use one or more of the operations . described innovations. Some of the innovations described Packing a 4 : 2 : 2 frame into two or less ) 4 : 2 :0 frames , and 30 herein address one or more of the problems noted in the encoding the 4 :2 : 0 frames using a video encoder background . Typically , a given technique/ tool does not solve designed for 4 : 2 : 0 format. all such problems. Decoding the encoded frames using a video decoder I. Example Computing Systems. designed for 4 : 2: 0 format, and unpacking the decoded FIG . 1 illustrates a generalized example of a suitable 4 : 2 : 0 frames to form a decoded 4 : 2 : 2 frame. 35 computing system ( 100 ) in which several of the described In specific example implementations that use frame pack - innovations may be implemented . The computing system ing arrangement SEI messages, the definition of frame ( 100 ) is not intended to suggest any limitation as to scope of packing arrangement SEI message is extended to support use or functionality , as the innovations may be implemented representing 4 : 4 : 4 content in a nominally 4 : 2 : 0 bitstream . In in diverse general - purpose or special- purpose computing some examples, one constituent frame ( e . g . , in a top -bottom 40 systems. packing or alternating - frame coding scheme) can be With reference to FIG . 1, the computing system ( 100 ) decoded compatibly as an ordinary 4 : 2 : 0 image , or can be includes one or more processing units ( 110 , 115 ) and supplemented with the data from another constituent frame memory ( 120 , 125 ) . In FIG . 1 , this most basic configuration to form a complete 4 : 4 : 4 image representation . Since YUV ( 130 ) is included within a dashed line . The processing units 4 : 2 : 0 is the most widely supported format in products 45 ( 110 , 115 ) execute computer - executable instructions . A pro ( especially with respect to hardware codec implementa - cessing unit can be a general- purpose central processing unit tions ), having an effective way of conveying YUV 4 : 4 : 4 (“ CPU ” ) , processor in an application - specific integrated content through such decoders can provide the substantial circuit or any other type of processor . In a multi - processing benefit of enabling widespread near- term deployment of system , multiple processing units execute computer - execut YUV 4 : 4 : 4 capabilities ( especially for screen content cod - 50 able instructions to increase processing power. For example , ing ) . In example implementations, the samples of a 4 : 4 :4 FIG . 1 shows a central processing unit ( 110 ) as well as a frame are packed into two 4 : 2 : 0 frames , and the two 4 : 2 :0 graphics processing unit or co - processing unit ( 115 ) . The frames are encoded as the constituent frames of a frame tangible memory (120 , 125 ) may be volatile memory ( e . g . , packing arrangement. For implementations that use the registers , cache, RAM ), non - volatile memory ( e . g ., ROM , frame packing arrangement SEI message , the semantics of 55 EEPROM , flash memory , etc . ) , or some combination of the the content _ interpretation _ type syntax element are extended two , accessible by the processing unit ( s ). The memory ( 120 , to signal this usage . The content _ interpretation _ type syntax 125 ) stores software (180 ) implementing one or more inno element signals how to interpret the data that are packed vations for frame packing and / or unpacking for higher using a packing arrangement, and the frame configuration resolution chroma sampling formats , in the form of com for the packing arrangement is signaled with a different 60 puter - executable instructions suitable for execution by the syntax element. Some approaches described herein have processing unit ( s ) . high practical value for applications involving screen con - A computing system may have additional features. For tent. Also , relative to native 4 : 4 :4 encoding , some example , the computing system ( 100 ) includes storage approaches described herein can provide the advantage of ( 140 ), one or more input devices ( 150 ) , one or more output compatibility with the ordinary 4 : 2 : 0 decoding process that 65 devices ( 160 ) , and one or more communication connections is expected to be more widely supported in decoding prod ( 170 ) . An interconnection mechanism ( not shown ) such as a ucts . bus, controller , or network interconnects the components of US 9 ,979 , 960 B2 the computing system ( 100 ). Typically , operating system ASIC digital signal process unit (“ DSP ' ) , a graphics pro software ( not shown ) provides an operating environment for cessing unit (“ GPU ” ) , or a programmable logic device other software executing in the computing system ( 100 ) , and (“ PLD " ) , such as a field programmable gate array coordinates activities of the components of the computing (“ FPGA " ) ) specially designed or configured to implement system ( 100 ). 5 any of the disclosed methods . The tangible storage ( 140 ) may be removable or non - For the sake of presentation , the detailed description uses removable , and includes magnetic disks, magnetic tapes or terms like " determine " and " use " to describe computer cassettes , CD -ROMs , DVDs, or any other medium which operations in a computing system . These terms are high can be used to store information in a non - transitory way and level abstractions for operations performed by a computer , which can be accessed within the computing system ( 100 ) . 10 and should not be confused with acts performed by a human The storage ( 140 ) stores instructions for the software ( 180 ) being . The actual computer operations corresponding to implementing one or more innovations for frame packing these terms vary depending on implementation . and / or unpacking for higher- resolution chroma sampling II . Example Network Environments. formats . FIGS. 2a and 2b show example network environments The input device ( s ) ( 150 ) may be a touch input device 15 (201 , 202 ) that include video encoders ( 220 ) and video such as a keyboard , mouse , pen , or trackball , a voice input decoders ( 270 ) . The encoders (220 ) and decoders ( 270 ) are device , a scanning device , or another device that provides connected over a network ( 250 ) using an appropriate com input to the computing system ( 100 ) . For video encoding , munication protocol. The network (250 ) can include the the input device ( s ) ( 150 ) may be a camera , video card , TV Internet or another computer network . tuner card , or similar device that accepts video input in 20 In the network environment ( 201 ) shown in FIG . 2a , each analog or digital form , or a CD -ROM or CD - RW that reads real- time communication (“ RTC ” ) tool ( 210 ) includes both video samples into the computing system ( 100 ). The output an encoder (220 ) and a decoder ( 270 ) for bidirectional device ( s ) ( 160 ) may be a display, printer, speaker , CD - communication . A given encoder ( 220 ) can produce output writer, or another device that provides output from the compliant with the SMPTE 421M standard , ISO / IEC 14496 computing system (100 ) . 25 10 standard (also known as H .264 /AVC ) , H . 265 /HEVC The communication connection ( s ) ( 170 ) enable commu- standard , another standard , or a proprietary format , with a nication over a communication medium to another comput- corresponding decoder (270 ) accepting encoded data from ing entity . The communication medium conveys information the encoder ( 220 ). The bidirectional communication can be such as computer- executable instructions , audio or video part of a video conference , video telephone call , or other input or output , or other data in a modulated data signal. A 30 two - party communication scenario . Although the network modulated data signal is a signal that has one or more of its environment ( 201 ) in FIG . 2a includes two real - time com characteristics set or changed in such a manner as to encode m unication tools (210 ) , the network environment ( 201 ) can information in the signal. By way of example , and not instead include three or more real- time communication tools limitation , communication media can use an electrical, opti - (210 ) that participate in multi -party communication . cal, RF, or other carrier. 35 A real- time communication tool ( 210 ) manages encoding The innovations can be described in the general context of by an encoder (220 ) . FIG . 4 shows an example encoder computer -readable media . Computer -readable media are any system (400 ) that can be included in the real- time commu available tangible media that can be accessed within a nication tool (210 ) . Alternatively, the real- time communica computing environment. By way of example , and not limi- tion tool (210 ) uses another encoder system . A real- time tation , with the computing system ( 100 ) , computer -readable 40 communication tool ( 210 ) also manages decoding by a media include memory ( 120 , 125 ), storage ( 140 ) , and com - decoder ( 270 ) . FIG . 5 shows an example decoder system binations of any of the above . (500 ) , which can be included in the real- time communica The innovations can be described in the general context of tion tool (210 ). Alternatively , the real- time communication computer - executable instructions , such as those included in tool (210 ) uses another decoder system . program modules, being executed in a computing system on 45 In the network environment (202 ) shown in FIG . 2b , an a target real or virtual processor. Generally , program mod encoding tool ( 212 ) includes an encoder ( 220 ) that encodes ules include routines , programs, libraries, objects , classes , video for delivery to multiple playback tools (214 ) , which components , data structures , etc . that perform particular include decoders (270 ) . The unidirectional communication tasks or implement particular abstract data types . The func - can be provided for a video surveillance system , web camera tionality of the program modules may be combined or split 50 monitoring system , remote desktop conferencing presenta between program modules as desired in various embodi- tion or other scenario in which video is encoded and sent ments . Computer -executable instructions for program mod from one location to one or more other locations . Although ules may be executed within a local or distributed computing the network environment (202 ) in FIG . 2b includes two system . playback tools ( 214 ) , the network environment ( 202 ) can The terms " system " and " device ” are used interchange - 55 include more or fewer playback tools ( 214 ) . In general , a ably herein . Unless the context clearly indicates otherwise , playback tool (214 ) communicates with the encoding tool neither term implies any limitation on a type of computing (212 ) to determine a stream of video for the playback tool system or computing device . In general, a computing system (214 ) to receive . The playback tool ( 214 ) receives the or computing device can be local or distributed , and can stream , buffers the received encoded data for an appropriate include any combination of special- purpose hardware and /or 60 period , and begins decoding and playback . general - purpose hardware with software implementing the FIG . 4 shows an example encoder system (400 ) that can functionality described herein . be included in the encoding tool (212 ) . Alternatively , the The disclosed methods can also be implemented using encoding tool ( 212 ) uses another encoder system . The specialized computing hardware configured to perform any encoding tool ( 212 ) can also include server - side controller of the disclosed methods. For example , the disclosed meth - 65 logic for managing connections with one or more playback ods can be implemented by an integrated circuit ( e . g ., an tools (214 ) . FIG . 5 shows an example decoder system (500 ) , application specific integrated circuit (“ ASIC ” ) ( such as an which can be included in the playback tool (214 ). Alterna US 9 ,979 , 960 B2 10 tively, the playback tool ( 214 ) uses another decoder system . rate of, for example , 30 frames per second . As used herein , A playback tool (214 ) can also include client- side controller the term “ frame" generally refers to source , coded or recon logic for managing connections with the encoding tool structed image data . For progressive - scan video , a frame is ( 212 ) . a progressive - scan video frame. For interlaced video , in III . Example Frame Packing /Unpacking Systems. 5 example embodiments , an interlaced video frame is de FIG . 3 is a block diagram of a generalized frame packing interlaced prior to encoding . Alternatively , two complemen unpacking system (300 ) in conjunction with which some tary interlaced video fields are encoded as an interlaced described embodiments may be implemented . video frame or as separate fields . Aside from indicating a The system ( 300 ) includes a video source ( 310 ) , which progressive - scan video frame, the term “ frame” can indicate produces source frames (311 ) of a higher - resolution chroma 10 a single non - paired video field , a complementary pair of sampling format such as a 4 : 4 :4 format . The video source video fields, a video object plane that represents a video ( 310 ) can be a camera , tuner card , storage media , or other object at a given time, or a region of interest in a larger digital video source . image . The video object plane or region can be part of a The frame packer ( 315 ) rearranges the frames ( 311 ) of the larger image that includes multiple objects or regions of a higher -resolution chroma sampling format to produce source 15 scene . After color space conversion from the capture format frames ( 316 ) of a lower -resolution chroma sampling format ( e . g ., an RGB format ) , the source frames ( 411 ) are in a such as a 4 :2 : 0 format. Example approaches to frame higher -resolution chroma sampling format such as a 4 : 4 : 4 packing are described below . The frame packer ( 315 ) can format . signalmetadata ( 317 ) that indicates whether and how frame The frame packer (415 ) rearranges the frames ( 411 ) of the packing was performed , for use by the frame unpacker (385 ) 20 higher- resolution chroma sampling format to produce source after decoding . Example approaches to signaling frame frames (416 ) of a lower- resolution chroma sampling format packing arrangement metadata are described below . such as a 4 : 2 : 0 format. Example approaches to frame The encoder (340 ) encodes the frames (316 ) of the packing are described below . The frame packer (415 ) can lower- resolution chroma sampling format. Example encod - signal metadata ( not shown ) that indicates whether and how ers are described below with reference to FIGS . 4 and 6 . The 25 frame packing was performed , for use by a frame unpacker encoder ( 340 ) outputs coded data ( 341 ) over a channel after decoding . Example approaches to signaling frame ( 350 ) , which represents storage , a communications connec packing arrangement metadata are described below . The tion , or another channel for the output. frame packer (415 ) can perform pre- processing operations , The decoder ( 360 ) receives the encoded data ( 341 ) and for example , as described below . decodes the frames ( 316 ) of the lower -resolution chroma 30 An arriving source frame (416 ) is stored in a source frame sampling format . Example decoders are described below temporary memory storage area ( 420 ) that includes multiple with reference to FIGS. 5 and 7 . The decoder outputs frame buffer storage areas ( 421, 422 , . . . , 42n ) . A frame reconstructed frames ( 381 ) of the lower - resolution chroma buffer ( 421 , 422 , etc . ) holds one source frame in the source sampling format . frame storage area (420 ) . After one or more of the source The frame unpacker (385 ) optionally rearranges the 35 frames (416 ) have been stored in frame buffers (421 , 422 , reconstructed frames (381 ) of the lower -resolution chroma etc .) , a frame selector (430 ) periodically selects an indi sampling format to reconstruct the frames (386 ) of the vidual source frame from the source frame storage area higher -resolution chroma sampling format. Example (420 ) . The order in which frames are selected by the frame approaches to frame unpacking are described below . The selector ( 430 ) for input to the encoder (440 ) may differ from frame unpacker (385 ) can receive the metadata (317 ) that 40 the order in which the frames are produced by the video indicates whether and how frame packing was performed , source ( 410 ), e. g . , a selected framemay be ahead in order , and use such metadata ( 317 ) to guide unpacking operations. to facilitate temporally backward prediction . The frame unpacker (385 ) outputs the reconstructed frames The order of the frame packer (415 ) and frame storage of the higher - resolution chroma sampling format to an area (420 ) can be switched . Before the encoder (440 ) , the output destination ( 390 ) . 45 encoder system (400 ) can include another pre -processor ( not IV . Example Encoder Systems. shown ) that performs pre -processing ( e. g ., filtering ) of the FIG . 4 is a block diagram of an example encoder system selected frame (431 ) before encoding . (400 ) in conjunction with which some described embodi The encoder ( 440 ) encodes the selected frame ( 431) (of ments may be implemented . The encoder system (400 ) can the lower - resolution chroma sampling format) to produce a be a general- purpose encoding tool capable of operating in 50 coded frame (441 ) and also produces memory management any of multiple encoding modes such as a low - latency control operation (“ MMCO ” ) signals (442 ) or reference encoding mode for real - time communication , transcoding picture set (“ RPS ” ) information . If the current frame is not mode , and regular encoding mode for media playback from the first frame that has been encoded , when performing its a file or stream , or it can be a special - purpose encoding tool encoding process, the encoder ( 440 ) may use one or more adapted for one such encoding mode . The encoder system 55 previously encoded decoded frames ( 469 ) that have been ( 400 ) can be implemented as an operating system module, as stored in a decoded frame temporary memory storage area part of an application library or as a standalone application . (460 ). Such stored decoded frames (469 ) are used as refer Overall, the encoder system (400 ) receives a sequence of ence frames for inter - frame prediction of the content of the source video frames ( 411 ) ( of a higher - resolution chroma current source frame ( 431 ) . Generally , the encoder ( 440 ) sampling format such as a 4 : 4 : 4 format ) from a video source 60 includes multiple encoding modules that perform encoding (410 ) , performs frame packing to a lower- resolution chroma tasks such as motion estimation and compensation , fre sampling format such as a 4 : 2 : 0 format , encodes frames of quency transforms, quantization and entropy coding . The the lower - resolution chroma sampling format, and produces exact operations performed by the encoder ( 440 ) can vary encoded data as output to a channel ( 490 ) . depending on compression format. The format of the output The video source (410 ) can be a camera , tuner card , 65 encoded data can be a Windows Media Video format, VC - 1 storage media , or other digital video source . The video format, MPEG -X format ( e. g . , MPEG - 1 , MPEG - 2 , or source ( 410 ) produces a sequence of video frames at a frame MPEG - 4 ) , H . 26x format ( e . g ., H . 261, H . 262, H .263 , H . 264) , US 9 ,979 ,960 B2 HEVC format or other format. In general, the encoder (440 ) ( 461, 462 , etc . ) with frames that are no longer needed by the is adapted for encoding frames of the lower - resolution encoder (440 ) for use as reference frames. After modeling chroma sampling format . the decoding process , the decoding process emulator ( 450 ) For example , within the encoder (440 ), an inter -coded , stores a newly decoded frame (451 ) in a frame buffer ( 461 , predicted frame is represented in terms of prediction from 5 462 , etc . ) that has been identified in this manner. reference frames. A motion estimator estimates motion of The coded frames (441 ) and MMCO /RPS information sets of samples of a source frame (441 ) with respect to one (442 ) are also buffered in a temporary coded data area (470 ) . or more reference frames (469 ) . A set of samples can be a The coded data that is aggregated in the coded data area macroblock , sub -macroblock or sub -macroblock partition (470 ) can also include media metadata relating to the coded (as in the H . 264 standard ) , or it can be a coding tree unit or 10 video data ( e . g ., as one or more parameters in one or more prediction unit ( as in the HEVC standard ) . Generally , as SEI messages ( such as frame packing arrangement SEI used herein , the term " block " indicates a set of samples, messages ) or video usability information (“ VUI” ) mes which may be a single two - dimensional (“ 2D ” ) array or sages ) . multiple 2D arrays ( e . g ., one array for a luma component The aggregated data (471 ) from the temporary coded data and two arrays for chroma components ) . When multiple 15 area (470 ) are processed by a channel encoder ( 480 ) . The reference frames are used , the multiple reference frames can channel encoder (480 ) can packetize the aggregated data for be from different temporal directions or the same temporal transmission as a media stream ( e . g ., according to a media direction . The motion estimator outputs motion information container format such as ISO /IEC 14496 - 12 ) , in which case such as motion vector information , which is entropy coded . the channel encoder (480 ) can add syntax elements as part A motion compensator applies motion vectors to reference 20 of the syntax of the media transmission stream . Or, the frames to determine motion - compensated prediction values . channel encoder (480 ) can organize the aggregated data for The encoder determines the differences ( if any ) between a storage as a file ( e . g . , according to a media container format block 's motion -compensated prediction values and corre such as ISO /IEC 14496 - 12 ), in which case the channel sponding original values. These prediction residual values encoder ( 480 ) can add syntax elements as part of the syntax ( i . e . , residuals, residue values ) are further encoded using a 25 of the media storage file . Or, more generally , the channel frequency transform , quantization and entropy encoding . encoder ( 480 ) can implement one or more media system Similarly , for intra prediction , the encoder ( 440 ) can deter - multiplexing protocols or transport protocols, in which case mine intra - prediction values for a block , determine predic - the channel encoder ( 480 ) can add syntax elements as part tion residual values , and encode the prediction residual of the syntax of the protocol ( s ) . Such syntax elements for a values . The entropy coder of the encoder (440 ) compresses 30 media transmission stream , media storage stream , multi quantized transform coefficient values as well as certain side plexing protocols or transport protocols can include frame information ( e . g . , motion vector information , QP values, packing arrangement metadata . The channel encoder (480 ) mode decisions , parameter choices ) . Typical entropy coding provides output to a channel (490 ) , which represents storage , techniques include Exp -Golomb coding , arithmetic coding a communications connection , or another channel for the differential coding, Huffman coding , run length coding , 35 output. variable - length - to -variable - length (“ V2V ” ) coding, vari - V . Example Decoder Systems. able - length - to - fixed - length ( “ V2F ” ) coding, LZ coding , dic FIG . 5 is a block diagram of an example decoder system tionary coding , probability interval partitioning entropy cod (500 ) in conjunction with which some described embodi ing ( “ PIPE ” ) , and combinations of the above . The entropy ments may be implemented . The decoder system (500 ) can coder can use different coding techniques for different kinds 40 be a general- purpose decoding tool capable of operating in of information , and can choose from among multiple code any of multiple decoding modes such as a low - latency tables within a particular coding technique . decoding mode for real- time communication and regular The coded frames (441 ) and MMCO /RPS information decoding mode for media playback from a file or stream , or (442 ) are processed by a decoding process emulator ( 450 ) . it can be a special- purpose decoding tool adapted for one The decoding process emulator ( 450 ) implements some of 45 such decoding mode . The decoder system (500 ) can be the functionality of a decoder, for example , decoding tasks implemented as an operating system module , as part of an to reconstruct reference frames that are used by the encoder application library or as a standalone application . Overall , ( 440 ) in motion estimation and compensation . The decoding the decoder system ( 500 ) receives coded data from a channel process emulator ( 450 ) uses the MMCO /RPS information (510 ) , decodes frames of a lower -resolution chroma sam ( 442 ) to determine whether a given coded frame (441 ) needs 50 pling format such as a 4 : 2 : 0 format , optionally performs to be stored for use as a reference frame in inter -frame frame unpacking from the lower -resolution chroma sam prediction of subsequent frames to be encoded . If the pling format to a higher - resolution chroma sampling format MMCO /RPS information ( 442 ) indicates that a coded frame such as a 4 : 4 : 4 format, and produces reconstructed frames ( 441 ) needs to be stored , the decoding process emulator ( of the higher- resolution chroma sampling format) as output ( 450 ) models the decoding process that would be conducted 55 for an output destination (590 ) . by a decoder that receives the coded frame (441 ) and The decoder system (500 ) includes a channel (510 ) , produces a corresponding decoded frame (451 ) . In doing so , which can represent storage, a communications connection , when the encoder (440 ) has used decoded frame ( s ) ( 469 ) or another channel for coded data as input. The channel that have been stored in the decoded frame storage area (510 ) produces coded data that has been channel coded . A ( 460 ) , the decoding process emulator ( 450 ) also uses the 60 channel decoder (520 ) can process the coded data . For decoded frame( s ) ( 469 ) from the storage area (460 ) as part example , the channel decoder (520 ) de -packetizes data that of the decoding process. has been aggregated for transmission as a media stream The decoded frame temporary memory storage area (460 ) ( e .g . , according to a media container format such as ISO /IEC includes multiple frame buffer storage areas ( 461 , 462 , . . . 14496 - 12 ) , in which case the channel decoder (520 ) can , 46n ) . The decoding process emulator (450 ) uses the 65 parse syntax elements added as part of the syntax of the MMCO /RPS information ( 442 ) to manage the contents of media transmission stream . Or, the channel decoder (520 ) the storage area ( 460) in order to identify any frame buffers separates coded video data that has been aggregated for US 9 ,979 ,960 B2 13 14 storage as a file ( e . g ., according to a media container format structed frame. The decoder (550 ) can similarly combine such as ISO /IEC 14496 - 12 ) , in which case the channel prediction residuals with spatial predictions from intra pre decoder (520 ) can parse syntax elements added as part of the diction . A motion compensation loop in the video decoder syntax of the media storage file . Or, more generally , the (550 ) includes an adaptive de -blocking filter to smooth channel decoder (520 ) can implement one or more media 5 discontinuities across block boundary rows and/ or columns system demultiplexing protocols or transport protocols , in in the decoded frame (551 ) . which case the channel decoder (520 ) can parse syntax The decoded frame temporary memory storage area (560 ) elements added as part of the syntax of the protocol( s ) . Such includes multiple frame buffer storage areas (561 , 562 , . . . syntax elements for a media transmission stream , media , 56n ). The decoded frame storage area (560 ) is an example storage stream , multiplexing protocols or transport protocols 10 of a DPB . The decoder (550 ) uses the MMCO /RPS infor can include frame packing arrangement metadata . mation (532 ) to identify a frame buffer (561 , 562 , etc . ) in The coded data (521 ) that is output from the channel which it can store a decoded frame (551 ) of the lower decoder (520 ) is stored in a temporary coded data area (530 ) resolution chroma sampling format. The decoder (550 ) until a sufficient quantity of such data has been received . The stores the decoded frame (551 ) in that frame buffer . coded data (521 ) includes coded frames (531 ) ( at the lower - 15 An output sequencer (580 ) uses the MMCO /RPS infor resolution chroma sampling format ) and MMCO / RPS infor m ation (532 ) to identify when the next frame to be produced mation (532 ) . The coded data (521 ) in the coded data area in output order is available in the decoded frame storage area (530 ) can also include media metadata relating to the (560 ) . When the next frame (581 ) of the lower -resolution encoded video data ( e . g ., as one or more parameters in one chroma sampling format to be produced in output order is or more SEI messages such as frame packing arrangement 20 available in the decoded frame storage area (560 ) , it is read SEI messages or VUImessages ) . In general, the coded data by the output sequencer (580 ) and output to either ( a ) the area (530 ) temporarily stores coded data (521 ) until such output destination (590 ) ( e . g . , display ) for display of the coded data (521 ) is used by the decoder (550 ). At that point, frame of the lower- resolution chroma sampling format , or coded data for a coded frame (531 ) and MMCO /RPS ( b ) the frame unpacker (585 ) . In general, the order in which information (532 ) are transferred from the coded data area 25 frames are output from the decoded frame storage area (560 ) (530 ) to the decoder (550 ) . As decoding continues , new by the output sequencer (580 ) may differ from the order in coded data is added to the coded data area (530 ) and the which the frames are decoded by the decoder (550 ) . oldest coded data remaining in the coded data area (530 ) is The frame unpacker (585 ) rearranges the frames (581 ) of transferred to the decoder ( 550 ) . the lower - resolution chroma sampling format to produce The decoder (550 ) periodically decodes a coded frame 30 output frames (586 ) of a higher- resolution chroma sampling (531 ) to produce a corresponding decoded frame (551 ) of format such as a 4 : 4 : 4 format . Example approaches to frame the lower- resolution chroma sampling format . As appropri - unpacking are described below . The frame packer (585 ) can ate, when performing its decoding process , the decoder use metadata (not shown ) that indicates whether and how (550 ) may use one or more previously decoded frames (569 ) frame packing was performed , to guide frame unpacking as reference frames for inter - frame prediction . The decoder 35 operations. The frame unpacker (585 ) can perform post (550 ) reads such previously decoded frames (569 ) from a processing operations, for example , as described below . decoded frame temporary memory storage area (560 ) . Gen - VI. Example Video Encoders . erally , the decoder (550 ) includes multiple decoding mod FIG . 6 is a block diagram of a generalized video encoder ules that perform decoding tasks such as entropy decoding , (600 ) in conjunction with which some described embodi inverse quantization , inverse frequency transforms and 40 ments may be implemented . The encoder (600 ) receives a motion compensation . The exact operations performed by sequence of video frames of a lower- resolution chroma the decoder (550 ) can vary depending on compression sampling format such as a 4 : 2 : 0 format, including a current format. In general, the decoder (550 ) is adapted for decoding frame (605 ) , and produces encoded data (695 ) as output. frames of the lower- resolution chroma sampling format. The encoder (600 ) is block - based and uses a macroblock For example , the decoder (550 ) receives encoded data for 45 format that depends on implementation . Blocks may be a compressed frame or sequence of frames and produces further sub -divided at different stages, e . g . , at the frequency output including decoded frame (551 ) of the lower - resolu - transform and entropy encoding stages . For example , a tion chroma sampling format . In the decoder (550 ) , a buffer frame can be divided into 16x16 macroblocks , which can in receives encoded data for a compressed frame and makes the turn be divided into 8x8 blocks and smaller sub - blocks of received encoded data available to an entropy decoder. The 50 pixel values for coding and decoding . entropy decoder entropy decodes entropy - coded quantized The encoder system (600 ) compresses predicted frames data as well as entropy - coded side information , typically and intra - coded frames . For the sake of presentation , FIG . 6 applying the inverse of entropy encoding performed in the shows an “ intra path ” through the encoder (600 ) for intra encoder. A motion compensator applies motion information frame coding and an “ inter path ” for inter- frame coding . to one or more reference frames to form motion - compen - 55 Many of the components of the encoder ( 600 ) are used for sated predictions of blocks ( e . g . , macroblocks, sub -macro - both intra - frame coding and inter - frame coding . The exact blocks, sub -macroblock partitions, coding tree units , predic operations performed by those components can vary tion units , or parts thereof) of the frame being reconstructed . depending on the type of information being compressed . An intra prediction module can spatially predict sample If the current frame (605 ) is a predicted frame, a motion values of a current block from neighboring , previously 60 estimator (610 ) estimates motion of blocks ( e . g . , macrob reconstructed sample values . The decoder (550 ) also recon locks, sub -macroblocks , sub -macroblock partitions, coding structs prediction residuals . An inverse quantizer inverse tree units , prediction units , or parts thereof ) of the current quantizes entropy - decoded data . An inverse frequency trans - frame ( 605 ) with respect to one or more reference frames . former converts the quantized , frequency domain data into The frame store (620 ) buffers one or more reconstructed spatial domain information . For a predicted frame, the 65 previous frames ( 625 ) for use as reference frames . When decoder (550 ) combines reconstructed prediction residuals multiple reference frames are used , the multiple reference with motion -compensated predictions to form a recon - frames can be from different temporal directions or the same US 9 ,979 ,960 B2 15 16 temporal direction . The motion estimator (610 ) outputs as QP values and other control parameters to control quanti side information motion information (615 ) such as differ zation of luma components and chroma components during ential motion vector information . encoding . For example , the controller can vary QP values to The motion compensator (630 ) applies reconstructed dedicate more bits to luma content of a given frame (which motion vectors to the reconstructed reference frame( s ) (625 ) 5 could be a main view or auxiliary view in a frame packing when forming a motion -compensated current frame (635 ) . approach ) compared to chroma content of that frame. Or, in The difference ( if any) between a block of the motion - a frame packing approach with main and auxiliary views, the compensated current frame (635 ) and corresponding part of controller can vary QP values to dedicate more bits to the the original current frame (605 ) is the prediction residual main view ( including luma and sub - sampled chroma com ( 645 ) for the block . During later reconstruction of the 10 ponents ) compared to the auxiliary view ( including remain current frame, reconstructed prediction residuals are added ing chroma information ). to the motion - compensated current frame (635 ) to obtain a In some approaches to frame packing , even after chroma reconstructed frame that is closer to the original current information from frames in a higher -resolution chroma frame (605 ) . In lossy compression , however, some informa- sampling format has been packed into to -be - encoded frames tion is still lost from the original current frame (605 ). The 15 of the lower -resolution chroma sampling format, the intra path can include an intra prediction module (not encoder can exploit geometric correspondence among shown ) that spatially predicts pixel values of a current block sample values of the chroma components in several ways. from neighboring , previously reconstructed pixel values . The term geometric correspondence indicates a relationship A frequency transformer (660 ) converts spatial domain between ( 1 ) chroma information at positions of a (nomi video information into frequency domain ( i . e ., spectral, 20 nally luma component of a frame constructed from the transform ) data . For block -based video frames , the fre - lower -resolution chroma sampling format and ( 2 ) chroma quency transformer (660 ) applies a discrete cosine trans - information at corresponding scaled positions of chroma form , an integer approximation thereof, or another type of components of the frame of the lower- resolution chroma forward block transform to blocks of pixel value data or sampling format. A scaling factor applies between positions prediction residual data , producing blocks of frequency 25 of the luma and chroma components . For example , for 4 : 2 : 0 transform coefficients . A quantizer (670 ) then quantizes the the scaling factor is two both horizontally and vertically , and transform coefficients . For example , the quantizer (670 ) for 4 : 2 : 2 , the scaling factor is two horizontally and one applies non - uniform , scalar quantization to the frequency vertically . domain data with a step size that varies on a frame -by -frame The encoder can use the geometric correspondence to basis , macroblock -by - macroblock basis or other basis . 30 guide motion estimation , QP selection , prediction mode When a reconstructed version of the current frame is selection or other decision -making processes from block - to needed for subsequent motion estimation /compensation , an block , by first evaluating recent results of neighboring inverse quantizer (676 ) performs inverse quantization on the blocks when encoding a current block of the to - be -encoded quantized frequency coefficient data . The inverse frequency frame. Or, the encoder can use the geometric correspon transformer (666 ) performs an inverse frequency transform , 35 dence to guide such decision -making processes for high producing blocks of reconstructed prediction residuals or resolution chroma information packed into chroma compo pixel values . For a predicted frame, the encoder ( 600 ) nents of the to -be - encoded frame, using results from combines reconstructed prediction residuals (645 ) with encoding of high - resolution chroma information packed into motion - compensated predictions (635 ) to form the recon - a " luma " component of the to - be - encoded frame. Or, more structed frame (605 ). (Although not shown in FIG . 6 , in the 40 directly , the encoder can use the geometric correspondence intra path , the encoder (600 ) can combine prediction residu - to improve compression performance, where motion vec als with spatial predictions from intra prediction . ) The frame tors , prediction modes , or other decisions for high - resolution store (620 ) buffers the reconstructed current frame for use in chroma information packed into a “ luma” component of the subsequent motion - compensated prediction . to -be - encoded frame are also used for high - resolution A motion compensation loop in the encoder (600 ) 45 chroma information packed into chroma components of the includes an adaptive in - loop deblock filter (622 ) before or to - be -encoded frame. In particular, in some approaches after the frame store (620 ) . The decoder (600 ) applies described herein ( e . g ., approach 2 , below ) , when chroma in - loop filtering to reconstructed frames to adaptively information is packed into an auxiliary frame of the lower smooth discontinuities across boundaries in the frames . The resolution chroma sampling format, spatial correspondence adaptive in - loop deblock filter (622 ) can be disabled for 50 and motion vector displacement relationships between the some types of content. For example , in a frame packing nominally luma component of the auxiliary frame and approach with main and auxiliary views, the adaptive in nominally chroma components of the auxiliary frame are loop deblock filter (622 ) can be disabled when encoding an preserved . Sample values at corresponding spatial positions auxiliary view ( including remaining chroma information in Y , U and V components of the auxiliary frame tend to be that is not part of a main view ) so as to not introduce artifacts 55 consistent, which is useful for such purposes as spatial block such as blurring. size segmentation and joint coding of coded block pattern The entropy coder (680 ) compresses the output of the information or other information that indicates presence / quantizer (670 ) as well as motion information (615 ) and absence of non - zero coefficient values . Motion vectors for certain side information ( e. g . , QP values ) . The entropy coder corresponding parts of Y , U and V components of the (680 ) provides encoded data (695 ) to the buffer (690 ) , which 60 auxiliary frame tend to be consistent ( e . g ., a vertical or multiplexes the encoded data into an output bitstream . horizontal displacement of two samples in Y corresponds to A controller ( not shown ) receives inputs from various a displacement of 1 sample in U and V ) , which also helps modules of the encoder . The controller evaluates interme- coding efficiency . diate results during encoding , for example , setting QP values Depending on implementation and the type of compres and performing rate -distortion analysis . The controller 65 sion desired , modules of the encoder can be added , omitted , works with other modules to set and change coding param - split into multiple modules, combined with other modules , eters during encoding. In particular , the controller can vary and /or replaced with like modules . In alternative embodi US 9 ,979 , 960 B2 17 18 ments, encoders with different modules and / or other con auxiliary view ( including remaining chroma information figurations of modules perform one or more of the described that is not part of a main view ). techniques . Specific embodiments of encoders typically use In FIG . 7 , the decoder (700 ) also includes a post -process a variation or supplemented version of the encoder (600 ) . ing deblock filter (708 ) . The post -processing deblock filter The relationships shown between modules within the 5 (708 ) optionally smoothes discontinuities in reconstructed encoder (600 ) indicate general flows of information in the frames . Other filtering (such as de -ring filtering ) can also be applied as part of the post -processing filtering . Typically , encoder ; other relationships are not shown for the sake of reconstructed frames that are subjected to later frame simplicity . unpacking bypass the post -processing deblock filter ( 708 ). VII . Example Video Decoders . 00 ) 10 Depending on implementation and the type of decom FIG . 7 is a block diagram of a generalized decoder (700 ) pression desired , modules of the decoder can be added , in conjunction with which several described embodiments omitted , split into multiple modules , combined with other may be implemented . The decoder (700 ) receives encoded modules, and /or replaced with like modules. In alternative data (795 ) for a compressed frame or sequence of frames and embodiments, decoders with different modules and / or other produces output including a reconstructed frame ( 705 ) of a ? lower - resolution chroma sampling format such as a 4 : 2 : 0 described techniques. Specific embodiments of decoders format. For the sake of presentation , FIG . 7 shows an “ intra typically use a variation or supplemented version of the path ” through the decoder ( 700 ) for intra - frame decoding decoder (700 ). The relationships shown between modules and an “ inter path ” for inter - frame decoding . Many of the within the decoder (700 ) indicate general flows of informa components of the decoder (700 ) are used for both intra - 20 tion in the decoder ; other relationships are not shown for the frame decoding and inter - frame decoding. The exact opera sake of simplicity . tions performed by those components can vary depending on VIII . Frame Packing /Unpacking for Higher- Resolution the type of information being decompressed Chroma Sampling Formats . A buffer ( 790 ) receives encoded data ( 795 ) for a com This section describes various approaches to packing pressed frame and makes the received encoded data avail - 25 frames of a higher -resolution chroma sampling format into able to the parser/ entropy decoder (780 ) . The parser /entropy frames of a lower -resolution chroma sampling format. The decoder (780 ) entropy decodes entropy - coded quantized frames of the lower -resolution chroma sampling format can data as well as entropy - coded side information , typically then be encoded using an encoder designed for the lower applying the inverse of entropy encoding performed in the resolution chroma sampling format . After decoding ( using a encoder. 30 decoder designed for the lower - resolution chroma sampling A motion compensator (730 ) applies motion information format) , the frames at the lower -resolution chroma sampling (715 ) to one or more reference frames (725 ) to form motion format can be output for further processing and display. Or, compensated predictions (735 ) of blocks ( e . g . , macroblocks, after such decoding , the frames of the higher - resolution sub -macroblocks , sub -macroblock partitions, coding tree chroma sampling format can be recovered through frame units , prediction units , or parts thereof ) of the frame ( 705 ) 35 unpacking for output and display . being reconstructed . The frame store (720 ) stores one or A . Approaches to Frame Packing /Unpacking for YUV more previously reconstructed frames for use as reference 4 : 4 : 4 Video . frames . Various approaches described herein can be used to The intra path can include an intra prediction module ( not preserve chroma information for frames of a 4 :4 : 4 format shown ) that spatially predicts pixel values of a current block 40 when encoding/ decoding uses a 4 : 2 :0 format, as one specific from neighboring , previously reconstructed pixel values . In example . In these approaches, for example , a YUV 4 : 4 : 4 the inter path , the decoder ( 700 ) reconstructs prediction frame is packed into two YUV 4 : 2 : 0 frames . A typical 4 : 4 : 4 residuals . An inverse quantizer (770 ) inverse quantizes frame contains 12 sample values for every 4 pixel positions, entropy -decoded data . An inverse frequency transformer while a 4 : 2 : 0 frame contains only 6 sample values for every ( 760 ) converts the quantized , frequency domain data into 45 4 pixel positions . So , all the sample values contained in a spatial domain information . For example , the inverse fre - 4 : 4 : 4 frame can be packed into two 4 : 2 : 0 frames . quency transformer ( 760 ) applies an inverse block transform 1 . Approach 1 . to frequency transform coefficients , producing pixel value I n approach 1 , a YUV 4 : 4 : 4 frame is packed into two data or prediction residual data . The inverse frequency YUV 4 : 2 : 0 frames using spatial partitioning . FIG . 8 shows transform can be an inverse discrete cosine transform , an 50 this approach (800 ) to frame packing that uses spatial integer approximation thereof, or another type of inverse partitioning of the YUV 4 : 4 : 4 frame. frequency transform . A Y444 plane , U444 plane, and V444 plane are the three For a predicted frame , the decoder (700 ) combines recon - component planes for the YUV 4 :4 :4 frame (801 ). Each structed prediction residuals (745 ) with motion - compen - plane has the resolution of width W and height H . For sated predictions (735 ) to form the reconstructed frame 55 convenience in describing the examples used herein , both W ( 705 ) . ( Although not shown in FIG . 7 , in the intra path , the and H are divisible by 4 , without implying that this is a decoder ( 700 ) can combine prediction residuals with spatial limitation of the approach . The approach (800 ) to packing predictions from intra prediction . ) A motion compensation the YUV 4 :4 : 4 frame into two YUV 4 : 2 : 0 frames splits the loop in the decoder ( 700 ) includes an adaptive in - loop YUV 4 : 4 : 4 frame as shown in FIG . 8 . The Um plane of the deblock filter ( 710 ) before or after the frame store (720 ) . The 60 YUV 4 : 4 : 4 frame ( 801 ) is partitioned into a bottom half decoder (700 ) applies in - loop filtering to reconstructed H2 - U444 and two upper quarters Q1- U444 and Q2 -U444 using frames to adaptively smooth discontinuities across bound - spatial partitioning . The V4 plane of the YUV 4 : 4 : 4 frame aries in the frames . The adaptive in - loop deblock filter ( 710 ) (801 ) is partitioned into a bottom half H2- V 444 and two can be disabled for some types of content, when it was upper quarters Q1- V444 and Q2 - V 444 using spatial partition disabled during encoding . For example , in a frame packing 65 ing . approach with main and auxiliary views, the adaptive in - The partitioned planes of the YUV 4 : 4 : 4 frame (801 ) are loop deblock filter ( 710 ) can be disabled when decoding an then reorganized as one or more YUV 4 : 2 :0 frames . The US 9 ,979 , 960 B2 20 Y444 plane for the YUV 4 : 4 : 4 frames becomes the luma For area B1, Y 420 main ( x , y ) = Y444 ( x , y ), where the range of component plane of a first frame (802 ) of the YUV 4 : 2 : 0 (x , y ) is [0 , W - 1] x [O , H - 1] . format . The bottom halves of the U444 plane and the V444 plane become the luma component plane of a second frame For area B2 ,U420 main (x , y )= U444( 2x , 2y ), where the range (803 ) of the YUV 4 : 2 :0 format. The top quarters of the U4445 of ( x , y ) is plane and the V444 plane become the chroma component planes of the first frame (802 ) and second frame (803 ) of the YUV 4 :2 : 0 format as shown in FIG . 8 . [0 , * - 1] x[ 0 , 5 - 11. The first frame (802 ) and second frame ( 803 ) of the YUV 4 : 2 : 0 format can be organized as separate frames (separated by the dark line in FIG . 8 ) . Or, the first frame ( 802 ) and For area B3 , V420main ( x , y ) = V444 ( 2x , 2y ) , where the range second frame (803 ) of the YUV 4 : 2 : 0 format can be orga of (x , y ) is nized as a single frame having a height of 2xH (ignoring the dark line in FIG . 8 ) . Or, the first frame ( 802 ) and second 15 frame (803 ) of the YUV 4 : 2 : 0 format can be organized as a [0 , " -1 ] [0 , *2. -1 ] single frame having a width of 2xW . Or, the first frame (802 ) and second frame (803 ) of the YUV 4 : 2 : 0 format can be organized as a single frame or multiple frames using any of For area B4, Y 42044 * (x , y ) = U444 (X , 2y + 1) , where the the methods defined for frame_ packing _ arrangement_ type 20 range of ( x , y ) is in the H . 264 / AVC standard or the HEVC standard . Although this type of frame packing works, it does not result in geometric correspondence between Y , U and V components within each of the two YUV 4 : 2 :0 frames. In [0 , W –1 ] X [ 0 , 5 -1 ] . particular, for the second frame (803 ) of the YUV 4 :2 :0 25 format , there is typically not a geometric correspondence between the luma component and chroma components. For area B5 , Other packing approaches described herein typically achieve much better geometric correspondence . Alternatively , Approach 1 can be used for color spaces 30 Y928 (x , ? + y) = V441 ( x, 2y + 1) , such as RGB , GBR , etc . in sampling ratios such as 4 : 4 : 4 , 4 : 2 : 2 , 4 : 2 : 0 , etc ., as the chroma sampling formats . 2 . Approach 2 . where the range of ( x , y ) is In approach 2 , a YUV 4 : 4 : 4 frame is packed into two YUV 4 :2 :0 frames while maintaining geometric correspon - 35 dence for chroma information of the YUV 4 : 4 : 4 frame. YUV 4 : 2 : 0 frames with good geometric correspondence among [0 , 6 – 1] X [0 , 5 - 1) . their Y , U and V components can be compressed better because they fit the model expected by a typical encoder adapted to encode YUV 4 : 2 : 0 frames. 40 For area B6 , U420 Aux (x , y ) = U444 (2x + 1, 4y ), where the The packing can also be done such that one of the two range of ( x , y ) is YUV 4 : 2 : 0 frames represents the complete scene being represented by the YUV 4 : 4 : 4 frame, albeit with color components at a lower resolution . This provides options in decoding . A decoder that cannot perform frame unpacking, 45 [0 , " - 1 ] [0 , 4 - 1] or chooses not to perform frame unpacking , can just take a reconstructed version of the YUV 4 :2 :0 frame that repre For area B7 , sents the scene and directly feed it to the display . FIG . 9 illustrates one example approach (900 ) to frame packing that is consistent with these design constraints . In 50 this approach ( 900 ) , a YUV 4 : 4 : 4 frame ( 801 ) is packed into vero( x , ** + y ) = V444( 2x + 1, 4y) , two YUV 4 : 2 : 0 frames ( 902 , 903 ) . The first frame ( 902 ) provides a “ main view ” in YUV 4 : 2 : 0 format a lower chroma resolution version of the complete scene represented where the range of ( x , y ) is by the YUV 4 :4 : 4 frame ( 801) . The second frame ( 903 ) 55 provides an " auxiliary view ” in YUV 4 : 2 : 0 format and contains remaining chroma information . In FIG . 9 , the areas B1 . . . B9 are different areas within [0. -1 ] • [ [0 , * -11 the respective frames ( 902 , 903 ) of YUV 4 : 2 : 0 format. The sample values of odd rows of the U444 plane and V444 plane 60 For area B8 ,V aux (x , y )= U . (2x + 1. 4y + 2 ), where the of the YUV 4 : 4 : 4 frame ( 801 ) are assigned to the areas B4 range of (x , y ) is and B5 , and the sample values of even rows of the U444 plane and V444 plane of the YUV 4 : 4 :4 frame (801 ) are Specificallydistributed ,between sample valuesthe areas of theB2 , YuB3 planeand ,B6 U44 . . plane. B9., 65 10 , ñ - 1 x |0 , h - 11. and V444 plane of the YUV 4 : 4 : 4 frame (801 ) map to the areas B1 . . . B9 as follows. US 9 , 979 , 960 B2 21 22 For area B9, The auxiliary view ( 1003 ) contains chroma information for the YUV 4 : 4 : 4 frame ( 1001 ) . Even so , the auxiliary view ( 1003) fits the content model of a YUV 4 :2 :0 frame and is well suited for compression using a typical YUV 4 : 2 : 0 video vam lx. * + y )= V114( 2x + 1, 4y + 2) , encoder. Within the frame, the auxiliary view ( 1003 ) exhib its geometric correspondence across its Y , U and V compo nents . Between frames , the auxiliary views are expected to where the range of (x , y) is show motion that is highly correlated across Y , U and V components. FIG . 11 illustrates another example approach ( 1100 ) to [0 , 5 – 1] x[ 0 , * - 1] . frame packing that is consistent with these design con straints . In this approach ( 1100 ) , a YUV 4 : 4 : 4 frame ( 801 ) is packed into two YUV 4 : 2 :0 frames (1102 , 1103 ). Much Alternatively , the sample values of the Y444 plane, U444 like the approach (900 ) of FIG . 9 , in the approach ( 1100 ) of plane , and Vsa plane of the YUV 4 : 4 : 4 frame ( 801 ) can be 15 FIG . 11 . the first frame ( 1102) provides a “ main view ” in assigned to the areas B1 . . . B9 in a different way . For YUV 4 : 2 : 0 format a lower chroma resolution version of example , the sample values of even rows of the U444 plane the complete scene represented by the YUV 4 : 4 : 4 frame and V444 plane of the YUV 4 : 4 : 4 frame ( 801) are assigned ( 801 ) while the second frame ( 1103 ) provides an “ auxil to the areas B4 and B5, and the sample values of odd rows iary view ” in YUV 4 : 2 : 0 format and contains remaining of the U . plane and Vm plane of the YUV 4 : 4 : 4 frame 20 chroma information . (801 ) are distributed between the areas B2, B3 and B6 . . . In FIG . 11 , the areas A1 . . . AI are different areas within B9. Or, as another example , data from the original U plane the respective frames ( 1102 , 1103 ) of YUV 4 : 2 : 0 format . of the YUV 4 : 4 : 4 frame can be arranged in the U plane of The sample values of odd columns of the U444 plane and the auxiliary YUV 4 : 2 : 0 frame, and data from the original V V plane of the YUV 4 : 4 : 4 frame (801 ) are assigned to the plane of the YUV 4 : 4 : 4 frame can be arranged in the V plane 25 areas A4 and A5, and the sample values of even columns of of the auxiliary YUV 4 : 2 : 0 frame. In this example , com - the U44 plane and Vam plane of the YUV 4 : 4 : 4 frame (801 ) pared to FIG . 9 , the sample values from Vam ( 2x + 1 , 4y ) that are distributed between the areas A2 , A3 and A6 . . . A9 . are assigned to area B7 in the equations above can instead Specifically , sample values of the Y 444 plane , U444 plane , be assigned to area B8 , and the sample values from U444 and Va plane of the YUV 4 : 4 : 4 frame ( 801 ) map to the ( 2x + 1 , 4y + 2 ) that are assigned to area B8 in the equations 30 areas A1 . . . A9 as follows. above can instead be assigned to area B7. Or, the same For area A1, Y420main ( x , y ) = Y444 ( x , y ) ,where the range of sample values from U 444 can be copied into a single area for (x , y ) is [ 0 , W - 1 ] x [ 0 , H - 1 ]. B6 and B7 without separating every second row , and the For area A2 , U420main ( x , y ) = U444 ( 2x , 2y ) , where the range same sample values from V444 can be copied into a single of (x , y ) is area for B8 and B9 without separating every second row . 35 Either way , the U plane ( or V plane ) of the auxiliary YUV 4 : 2 : 0 frame is constructed from the U plane (or V plane ) of the YUV 4 : 4 :4 frame, without mixing content from different [0 , - 1] x [0 . 5 - 1] original U and V planes . ( In contrast , in the example of FIG . 9 , the U plane (or V plane ) of the auxiliary YUV 4 : 2: 0 frame 40 has a mixture of data from the U and V components of the For area A3, V420 main (x , y) = V444 ( 2x ,2y ) , where the range YUV 4 : 4 : 4 frame. The upper half of the U plane (or V plane ) of (x , y ) is of the auxiliary YUV 4 : 2 :0 frame contains data from the original U plane , and the lower half contains data from the original V plane . ) 45 The first frame (902 ) and second frame (903 ) of the YUV [0 . " -1 ] * [ 0 . 5 -1 ) 4 : 2 : 0 format can be organized as separate frames (separated by the dark line in FIG . 9 ) . Or, the first frame ( 902 ) and For area A4 , Y420 aux ( x , y )= U444 (2x + 1 , y ), where the second frame ( 903 ) of the YUV 4 : 2 : 0 format can be orga range of ( x , y ) is nized as a single frame having a height of 2xH ( ignoring the 50 dark line in FIG . 9 ) . Or, the first frame ( 902 ) and second frame ( 903 ) of the YUV 4 : 2 : 0 format can be organized as a [0 , W - 1] Xx [ 0 , H H –- 11 )] . single framehaving a width of 2xW . Or, the first frame ( 902 ) and second frame (903 ) of the YUV 4 : 2 : 0 format can be organized as a single frame using any of the methods defined 55 For area A5 , for frame _ packing_ arrangement _ type in the H . 264/ AVC standard or the HEVC standard . FIG . 10 illustrates example frames packed according to W the approach (900 ) of FIG . 9 . FIG . 10 shows a YUV 4 : 4 : 4 You + x, y ) = V444( 2x + 1, y) , frame (1001 ) that includes a Y444 plane , U444 plane , and 60 V444 plane . After frame packing, the main view ( 1002 ) ( first YUV where the range of ( x , y ) is 4 : 2 : 0 frame) is the YUV 4 : 2 : 0 equivalent of the original YUV 4 :4 : 4 frame ( 1001 ) . A decoding system can simply display a reconstructed version of the main view ( 1002 ) if 65 [0 , " - - 1X0 1 ]* [0 , H – 1 )]. YUV 4 : 4 : 4 is either not supported or considered not neces sary . US 9 ,979 , 960 B2 23 24 For area A6, U4200+ (x , y )= U444 (4x , 2y + 1 ) , where the The first frame ( 1102 ) and second frame ( 1103 ) of the range of ( x , y ) is YUV 4 : 2 : 0 format can be organized as separate frames (separated by the dark line in FIG . 11 ) . Or, the first frame (1102 ) and second frame ( 1103 ) of the YUV 4 : 2 : 0 format 5 can be organized as a single frame having a height of 2xH [0 , * - 1] x[ 0 , " , - 1] . ( ignoring the dark line in FIG . 11 ) . Or, the first frame ( 1102 ) and second frame ( 1103 ) of the YUV 4 : 2 : 0 format can be organized as a single frame having a width of 2xW . Or, the For area A7 , first frame ( 1102 ) and second frame ( 1103 ) of the YUV 4 : 2 : 0 10 format can be organized as a single frame using any of the methods defined for frame_ packing _ arrangement_ type in uz ( + x , y) = V441( 4x , 2y + 1) , the H . 264 / AVC standard or the HEVC standard . Frame unpacking can simply mirror frame packing . Samples assigned to areas of with frames of YUV 4 : 2 : 0 where the range of (x , y ) is 15 format are assigned back to original locations in chroma components of frames of YUV 4 : 4 : 4 format. In one imple mentation , for example , during frame unpacking , samples in areas B2 . . . B9 of frames of YUV 4 : 2 : 0 format are assigned [o . * -1 ] > [0 . 5 " -1 ] to reconstructed chroma components U ' 444 and V ' 444 of 20 frame of YUV 4 : 4 : 4 format as shown in the following For area A8, V420 aux (x , y ) = U444 ( 4x + 2, 2y + 1 ), where the pseudocode. range of ( x , y ) is for ( x = 0 ; x < ( W > > 1 ); x + + ) { for( y = 0 ; y < ( H > > 1 ) ; y + + ) { 25 U ' 444( 2x , 2y + 1) = Y " 420 aux ( 2x , y) [o . " -1 ] +[ o . 4 - 11 V ' 444 (2x , 2y + 1 ) = Y " 420 aux ( 2x , (H > > 1 ) + y ) U ' 444 ( 2x + 1 ,2y + 1 ) = Y " 42090 (2x + 1 ,y ) V ' 444 (2x + 1 , 2y + 1 ) = Y " 420420 ( 2x + 1 , ( H > > 1 ) + y ) For area A9, if( y % 2 = = 0 ) { U '444 (2x + 1 , 2y ) = U " 420 aux (x , y > > 1) 30 V '444 (2x + 1 ,2y ) = U " 420422 (x , (H > > 2 ) + (y > > 1 )) } else { U '444 ( 2x + 1, 2y ) = V " 420 Qux (x , y > > 1) v ( +x , y ) = V44( 4x +2 , 2y +1 ) , V '444 ( 2x + 1, 2y ) = V " 420 aux (x , (H > > 2 ) + (y > > 1 )) U ' 444 ( 2x, 2y ) = U " 420main ( x , y ) where the range of (x , y ) is 35 V ' 444 ( 2x , 2y) = V " 420main (x , y )

[0 , -1 ]X [ 0 , " – 1] . where the " mark indicates reconstruction from (possibly 40 lossy ) coding.

Alternatively , the sample values of the Yum plane, UM44 B . Syntax and Semantics of Values for Signaling Frame plane , and VM plane of the YUV 4 : 4 : 4 frame ( 801 ) can be Packing Information . assigned to the areas Al . . . A9 in a different way. For In example implementations, a frame packing arrange ment SEI message is used to signal that two 4 : 2 : 0 frames example , the sample values of even columns of the U444 45 include a packed 4 : 4 : 4 frame . Frame packing arrangement plane and V444 plane of the YUV 4 : 4 :4 frame (801 ) are SEI messages are defined in the H . 264/ AVC standard as well assigned to the areas A4 and A5 , and the sample values of as the HEVC standard , although such frame packing odd columns of the U444 plane and V444 plane of the YUV arrangement SEImessages have previously been used for a 4 : 4 : 4 frame ( 801 ) are distributed between the areas A2 , A3 different purpose . and A6 . . . A9 . Or, as another example , data from theC so50 A frame packing arrangement SEImessage was designed original U plane of the YUV 4 : 4 : 4 frame can be arranged in to send stereoscopic 3D video frames using a 2D video the U plane of the auxiliary YUV 4 : 2 : 0 frame, and data from codec . In such a case , two 4 : 2 : 0 frames represent the left and the original V plane of the YUV 4 :4 : 4 frame can be arranged right views of a stereoscopic 3D video scene . For the in the V plane of the auxiliary YUV 4 : 2 :0 frame. In this approaches described herein , the scope of the frame packing example , compared to FIG . 11 , the sample values from 55 arrangement SEI message can be extended to instead sup Va (4x , 2y + 1 ) that are assigned to area A7 in the equations port encoding / decoding of two 4 : 2 : 0 frames obtained from above are instead assigned to area A8 , and the sample values a single 4 : 4 : 4 frame, followed by frame unpacking to from U444 (4x + 2 , 2y + 1 ) that are assigned to area A8 in the recover the 4: 4 : 4 frame. The two 4 : 2 :0 frames represent a equations above are instead assigned to area A7. Or, the main view and an auxiliary view . Both the main and same sample values from U444 can be copied into a single 60 auxiliary views ( frames ) are in a format that is an equivalent area for A6 and A7 without separating every second column , of a 4 : 2 : 0 format. The main view ( frame ) may be indepen and the same sample values from V444 can be copied into a dently useful, while the auxiliary view ( frame ) is useful single area for A8 and A9 without separating every second when interpreted appropriately together with the main view . column . Either way, the U plane (or V plane ) of the auxiliary Thus, these approaches can use the frame packing arrange YUV 4 : 2 : 0 frame is constructed from the U plane ( or V 65 ment SEI message to effectively support encoding/ decoding plane ) of the YUV 4 : 4 : 4 frame, without mixing content from 4 : 4 : 4 frames using video codecs capable of coding /decoding different original U and V planes . 4 :2 : 0 frames. US 9 ,979 , 960 B2 25 26 To this end , the SEI message is extended . For example , arranged . For example , frame_ packing _ arrangement _ the semantics of the syntax element content_ type = 3 indicates side -by - side packing of the two constitu interpretation _ type are extended as follows. In the relevant e nt frames , frame _ packing _ arrangement _ type = = 4 indicates frame packing approaches, for a YUV 4 : 4 : 4 frame, there are top - bottom packing of the two constituent frames, and two constituent YUV 4 : 2 : 0 frames a first frame for a main 5 frame _ packing _ arrangement_ type = = 5 indicates temporal view , and a second frame for an auxiliary view . The con - interleaving of the two constituent frames . The syntax tent _ interpretation _ type indicates the intended interpreta - element frame_ packing _ arrangement _ type can be used tion of the constituent frames as specified in the following similarly in conjunction with values of content _ interpreta table . The values 0 , 1 and 2 are interpreted as in the tion _ type that indicate packing of frames of a higher H .264 / AVC standard and HEVC standard . New values for 10 resolution chroma sampling format . For example , content _ interpretation _ type are defined to indicate that the frame _ packing _ arrangement_ type = = 3 can indicate side -by constituent frames should be interpreted as containing data side packing of main and auxiliary frames , frame_ packin from YUV 4 : 4 : 4 frames : g _ arrangement_ type = 4 can indicate top - bottom packing of

Value Interpretation

O Unspecified relationship between the frame packed constituent frames. Indicates that the two constituent frames form the left and right views of a stereo view scene , with frame o being associated with the left view and frame 1 being associated with the right view . Indicates that the two constituent frames form the right and left views of a stereo view scene , with frame ( being associated with the right view and frame 1 being associated with the left view . Indicates that the two constituent frames form the main and auxiliary YUV 4 : 2 : 0 frames representing a YUV 4 : 4 : 4 frame, with frame o being associated with the main view and frame 1 being associated with the auxiliary view . Indicates that the chroma samples of frame o should be interpreted as unfiltered samples of the 4 : 4 : 4 frame (without anti - alias filtering ) . Indicates that the two constituent frames form the main and auxiliary YUV 4 : 2 : 0 frames representing a YUV 4 : 4 : 4 frame, with frame o being associated with the main view and frame 1 being associated with the auxiliary view . Indicates that the chroma samples of frame ( should be interpreted as having been anti- alias filtered prior to frame packing . Indicates that the two constituent frames form the main and auxiliary YUV 4 : 2 : 0 frames representing a YUV 4 : 4 : 4 frame, with frame 1 being associated with the main view and frame ( being associated with the auxiliary view . Indicates that the chroma samples of frame 1 should be interpreted as unfiltered samples of the 4 : 4 : 4 frame (without anti- alias filtering ) . 6 Indicates that the two constituent frames form the main and auxiliary YUV 4 : 2 : 0 frames representing a YUV 4 : 4 : 4 frame, with frame 1 being associated with the main view and frame ( being associated with the auxiliary view . Indicates that the chroma samples of frame 1 should be interpreted as having been anti -alias filtered prior to frame packing . 40 Alternatively , different values for the syntax element main and auxiliary frames , and frame_ packing _ arrange content _ interpretation _ type are associated with the interpre - ment_ type = = 5 can indicate temporal interleaving of main tations shown in the preceding table . Or, other and/ or and auxiliary frames. Or, frame packing arrangement meta additional interpretations for content _ interpretation _ type data is signaled in some other way . Alternatively , instead of can be used to support encoding /decoding of frames of a 45 extending the semantics of the content _ interpretation _ type lower - resolution chroma sampling format obtained from one syntax element to indicate packing of frames of a higher or more frames of a higher -resolution chroma sampling resolution chroma sampling format, the semantics of format by frame packing. frame _ packing _ arrangement_ type can be extended to indi In addition , for the purpose of simplification , one ormore cate packing of frames of a higher -resolution chroma sam of the following constraints may also be imposed for other “ pling format . For example , frame packing arrangement syntax elements of a frame packing arrangement SEI mes - metadata ( such as values of frame _ packing_ arrangement _ sage . When content _ interpretation _ type has a value between type higher than 5 ) can indicate whether frame packing/ 3 and 6 ( that is , for cases involving frame packing of YUV unpacking is used or not used , whether filtering or other 4 : 4 : 4 frames into YUV 4 : 2 : 0 frames ), the values of the 55 pre - processing operations were used or not used ( and hence syntax elements quincunx _ sampling _ flag , spatial_ whether corresponding post -processing filtering or other flipping _ flag, frameO _ grid _ position _ x , post- processing operations should be used or not used ), the framel _ grid _ position _ y, framel _ grid _ position _ x , and type of post -processing operations to perform , or other framel _ grid _ position _ y shall be 0 . Furthermore, when con - information about frame packing /unpacking , in addition to tent_ interpretation _ type is equal to 3 or 5 ( indicating 60 indicating how the main and auxiliary views are arranged . absence of filtering in pre - processing ) , chroma _ loc _ In these examples, the frame packing arrangement SEI info _ present_ flag shall be 1 , and the values of chro - message informs a decoder that the decoded pictures contain ma _ sample _ loc _ type _ top _ field and chroma_ sample _ main and auxiliary views of a 4 :4 : 4 frame as the constituent loc _ type _ bottom _ field shall be 2 . frames of the frame packing arrangement. This information In the H . 264 /AVC standard (and in the HEVC standard ), 65 can be used to process the main and auxiliary views appro the syntax element frame_ packing _ arrangment _ type indi- priately for display or other purposes. For example , when cates how two constituent frames of a stereoscopic view are the system at the decoding end desires the video in 4 :4 : 4 US 9 ,979 , 960 B2 27 28 format and is capable of reconstructing the 4 : 4 : 4 frames tent. ) For a given chroma sample location type, if the chroma from the main and auxiliary views, the system may do so , sample aligns with the luma samples for a particular direc and the output format will be 4 : 4 : 4 . Otherwise , only the tion (horizontal or vertical) , then an odd - tap symmetric filter main view is given as output , and the output format will then ( such as [ 1 2 1 ] / 4 , or [ 0 .25 0 . 5 0 .25 ), along with a rounding be 4 : 2 :0 . 5 operation ) is used to filter chroma in that direction . On the C . Pre- Processing and Post -Processing Operations. other hand , if the chroma sample does not align with the Simple sub - sampling of the chroma sample values of luma samples for a particular direction (horizontal or verti cal) , and the chroma sample grid positions are centered frames of a higher -resolution chroma sampling format can between the luma sample positions for a particular direction introduce aliasing artifacts in the downsampled chroma (horizontal / vertical) , then an even - tap symmetric filter ( typi sample values . To mitigate aliasing , frame packing can 10 cally [ 1 1 ]/ 2 , or [ 0 . 5 0 . 5 ] , along with a rounding operation ) include pre -processing operations to filter chroma sample is used to filter chroma in that direction . Another possible values . Such filtering can be termed anti -alias filtering . filter choice for the latter case is [ 1 3 3 1 ]/ 8 , or [ 0 . 125 0 . 375 Corresponding frame unpacking can then include post 0 . 375 0 . 125 ] , along with a rounding operation . The choice of processing operations to compensate for the pre - processing post -processing operation is usually made such that the filtering of the chroma sample values . For example , with 15 pospost - processing operation compensates for the pre - process reference to the preceding table, when the content _ interpre ing operation . In some cases post - processing directly inverts tation _ type is 4 or 6 , pre -processing operations can be used pre -processing , while in other cases post -processing only to filter the chroma sample values during frame packing , and approximately inverts pre - processing , as explained below . frame unpacking can include corresponding post - processing In implementations of frame packing / unpacking in con operations . 20 junction with AVC coding / decoding or HEVC coding/ de There are various reasons for pre - processing and post- coding, if the chroma sample location type is 1 for chro processing adapted to frame packing / unpacking. ma _ sample _ loc _ type _ top _ field and For example , pre -processing can help improve quality chroma _ sample _ loc _ type _ bottom _ field syntax elements , when only the YUV 4 : 2 : 0 frame representing the main view the chroma sample does not align with luma samples in is used for display. This can permit a decoder to ignore the 25 either horizontal or vertical direction , and hence the filter YUV 4 : 2 : 0 frame representing the auxiliary view without [ 0 . 5 0 . 5 ] is applied in both the horizontal and vertical running the risk of aliasing artifacts caused by simple directions for the pre -processing operation . In such a case , sub - sampling of chroma information . Without pre -process for the approach (900 ) illustrated with reference to FIG . 9 , ing (when the chroma signal for the YUV 4 : 2: 0 frame the equations for deriving the sample values areas B2 and B3 representing the main view is obtained by direct sub - 30 are as follows. sampling of the chroma signal from the YUV 4 : 4 : 4 frame) , For area B2 : U420main _ filt (x , y ) = [U444 (2x , 2y ) +U444 (2x + aliasing artifacts can be seen on some content, for example , 1 ,2y ) + U444 (2x , 2y + 1 ) + U444 (2x + 1 , 2y + 1 ) + 2 ] 4 , and ClearType text content , when only the main view is used to For area B3 : V420 main filt (x , y ) = [V444 (2x , 2y ) + V444 (2x + generate output. 1 , 2y ) + V444( 2x , 2y + 1 ) + V444 ( 2x + 1 , 2y + 1 ) + 2 ]/ 4 , As another example , pre - processing and post - processing 35 where the range of ( x , y ) is can help maintain / enforce consistency and smoothness of the compressed chroma signal in the YUV 4 : 4 : 4 domain . When frame packing is used to pack a YUV 4 : 4 : 4 frame into two YUV 4 : 2 : 0 frames , the chroma signal is split into (0 , - 1] x [0 , - 1] . multiple areas, and each area may get compressed differently 40 ( e . g . , with a different level of quantization ) depending on its for both areas . location . Because of this , when the chroma signal is Due to this filtering , the sample values at positions assembled again by interleaving the data from 1 multiple U444U . . (2x , 2y) and V444 ( 2x , 2y ) from the YUV 4 : 4: 4 frame are areas , artificial discontinuities and high - frequency noise not represented directly in the main view (902 ) ; instead , may be introduced . A post - processing operation can help 45 filtered sample values (U . main _ filt ( x , y ) and V main _ filt ( x . smooth the differences caused in these areas due to com - y ) ) are at positions in the main view ( 902 ) . The sample pression . values at U444 (2x + 1, 2y ), U444 (2x , 2y + 1 ), U444 (2x + 1, 2y + 1) , As another example, pre -processing can help enhance the V444 ( 2x + 1 ,2y ) , V444 ( 2x , 2y + 1 ) and V444 ( 2x + 1 , 2y + 1 ) from compression of the YUV 4 : 2: 0 framene representingrepresenting the aux - the YUV 4 : 4 :4 frame are still represented directly in the iliary view , which contains the remaining chroma informa - 50 auxiliary view ( 903 ) among the areas B4 . . . B9. tion . In corresponding filtering as part of post -processing In some example implementations, the pre - processing operations when frames in YUV 4 : 4 : 4 format are to be operations and post - processing operations are limited such output , the sample values for position U444 (2x , 2y ) and that they affect only the chroma signal that is part of the V444 (2x , 2y ) of the YUV 4 :4 :4 frame can be calculated as YUV 4 : 2 : 0 frame representing the main view . That is , the 55 U ' ^ ( 2x , 2y ) and V ( 2x , 2y ) , from values in the packed filtered sample values are part of the chroma components of frame. as follows: the main view . Additionally , for frame packing/ unpacking in conjunction U '44 (2x , 2y ) = (1 + 3a )* U " 420 Main _ filt( x , y ) - c * [ U " 44 (2x + with AVC coding / decoding or HEVC coding/ decoding , pre 1 ,2y ) + U " 444( 2x , 2y + 1 ) + U " 444 ( 2x + 1 , 2y + 1 ) ] , and processing operations and post -processing operations can be 60 V '444 ( 2x , 2y = (1 + 30 ) * V " 420 main filt( x , y )- a * [V " 444 ( 2x + based on the chroma sample location type ( indicating 1 ,2y ) + V " 444 ( 2x , 2y + 1 ) + V " 444 ( 2x + 1 , 2y + 1 ) ] . chroma sample grid alignment with luma sample grid ) . The chroma sample location type is determined from chro ma _ sample _ loc _ type _ top _ field and chroma _ sample _ loc _ type _bottom _ field syntax elements signaled as part of 65 10 , î - 1 ] * 10 , 3 - 1 ) the compressed bitstream . ( These two elements would ordi narily have equal values for progressive - scan source con US 9 ,979 ,960 B2 29 30 a is a weighting factor that depends on implementation , and 2y + 1 ) and ( 2x + 1 , 2y + 1 ) are filtered to produce a sample the " mark indicates reconstruction from (possibly lossy ) value 17 .25 , which is rounded to 17 . The value filtered coding. With chroma sample grid positions centered sample value of 17 is used in place of the original sample between luma sample positions both horizontally and ver value of 29 . During post - processing , the sample value for tically , with the suggested anti -alias filter of [ 0 .5 0 .5 ] , the 5 the position ( 2x , 2y ) is reconstructed to be 68 – 15 - 7 - 18 = 28 . value a = 1 would perfectly reconstruct the input values in the The difference between the original sample value (29 ) and absence of quantization error and rounding error, directly reconstructed sample value (28 ) shows loss of precision due inverting the filtering performed in pre- processing. For other to the filtering for the pre- processing operation . values of a , filtering during post -processing only approxi - Alternatively , a device can selectively skip filtering opera mately inverts the filtering performing in pre - processing . 10 tions during post -processing , even when filtering was per When considering quantization error, using a somewhat formed during pre -processing . For example , a device can smaller value of a ( e . g . , a = 0 . 5 ) may be advisable in order skip filtering during post- processing to reduce the compu to reduce perceptible artifacts. In general, a should be in the tational load of decoding and playback . range from 0 . 0 to 1 . 0 , and a should be smaller when the 5 Alternatively , the pre -processing operations and post quantization step size is larger. Using a high value of a may processing operations are not limited to the chroma signal of exacerbate artifacts introduced due to lossy compression . Or, different weights can be assigned for different sample the 4 : 4 : 4 frame that is part of the 4 : 2 : 0 frame representing positions. The sample values for position U444 ( 2x , 2y ) and the main view ( for example , areas B2 and B3 for the frame V444 (2x , 2y ) of the YUV 4 : 4 : 4 frame can be calculated as 902 represented in FIG . 9) . Instead , the pre -processing U ' 444 ( 2x , 2y ) and V '444 ( 2x , 2y) , from values in the packedì 20 operations and post - processing operations are also per frame, as follows: formed for the chroma signal of the 4 : 4 : 4 frame that is part U '444 (2x , 2y )= (1 + 2 + B + y) * U " 420 main fill (x , y )- a * U " 444 of the 4 : 2 : 0 frame representing the auxiliary view ( for (2x + 1 , 2y ) - B * U " 444( 2x , 2y + 1 ) - y * U " 444 (2x + 1, 2y + 1) , example , areas B4 to B9 of the frame 903 represented in V ' 444 (2x , 2y ) = (1 + 2 + B + y) * V " 420Main Fill( x , y ) - c * V " 444 FIG . 9 ) . Such pre- processing and post -processing operations ( 2x + 1 , 2y ) - B * V " W ( 2x , 2y + 1 ) - * * V " ( 2x + 1 , 2y + 1 ) , 25 (for the chroma signal of the 4 : 4 : 4 frame that is part of the where the range of (x , y ) is 444 4 : 2 : 0 frame representing the auxiliary view ) can use differ ent filtering operations than the pre -processing and post processing of the chroma signal of the 4 : 4 : 4 frame that is made part of the 4 : 2 : 0 frame representing the main view . [0 , 5 – 1] x[ 0 , 5 - 1] , 30 In the foregoing examples of pre - processing operations and post -processing operations, an averaging filtering is a , ß and y are weighting factors that depend on implemen used during pre - processing and corresponding filtering is tation , and the " mark indicates reconstruction from (possi - used during post- processing . Alternatively , the pre -process bly lossy) coding. With chroma sample grid positions cen - 35 ing operations and post - processing operations can imple tered between luma sample positions both horizontally and ment a transform / inverse transform pair. For example, the vertically , with the suggested anti -alias filter of 0 . 5 0 . 5 ) , the transform / inverse transform pair can be one of the class of values a = B = y = 1 would perfectly reconstruct the input val wavelet transformations, lifting transformations and other ues in the absence of quantization error and rounding error, transformations. Specific transforms can also be designed directly inverting the filtering performed in pre -processing . 40 depending on use case scenarios , so as to satisfy the different For other values of a , ß and y , filtering during post design reasons mentioned above for the use of pre- process processing only approximately inverts the filtering perform ing operations and post- processing operations in the context ing in pre -processing . When considering quantization error , of packing 4 : 4 : 4 frames . Or, the pre - processing and post using somewhat smaller values of a , B and y ( e . g ., processing can use other filter structures, with other filter a = B = y = 0 . 5 ) may be advisable in order to reduce perceptible 45 regions of support or use filtering that is adaptive with artifacts . In general, A , B and y should be in the range from respect to content and / or fidelity ( e . g ., adaptive with respect 0 . 0 to 1 . 0 , and A , B and y should be smaller when the to the quantization step sizes used for the encoding ) . quantization step size is larger . Using high values of a , ß and In some example implementations, the representation y may exacerbate artifacts introduced due to lossy compres - and /or compression of the frame- packed 4 : 2 : 0 content can sion . The values of a , ß and y can be designed for condi- 50 use a higher sample bit depth than the original sample bit tional optimality using cross -correlation analysis . depth of the 4 : 4 : 4 content . For example , the sample bit depth When a = B = r = 1 , the sample values for position U444 ( 2x , of the 4 : 4 : 4 frames is 8 bits per sample , and the sample bit 2y ) and V444 (2x , 2y ) of the YUV 4 : 4 : 4 frame can simply be depth of the frame- packed 4 : 2 : 0 frames is 10 bits per calculated as U '444 (2x , 2y ) and V '444 (2x , 2y ) , from values in sample . This can help reduce precision loss during the the packed frame , as follows : 55 application of pre - processing operations and post- process U 444( 2x, 2y ) = 4 * U " 420 Main _ filt (x , y ) - U " 444 ( 2x + 1 ,2y ) ing operations . Or, this can help achieve higher level of U " 444( 2x , 2y + 1 ) - U " 444( 2X + 1, 2y + 1 ) , and fidelity when 4 : 2 : 0 frames are encoded using lossy com V ' 444( 2x , 2y )= 4 * V " 420 main_ filt ( x , y) - V " 144( 2x + 1, 2y ) pression . For example, if the 4 : 4 : 4 content has a sample bit V " 444 (2x , 2y + 1 ) - V " 444 ( 2x + 1, 2y + 1 ) , depth of 8 bits per sample , and the frame- packed 4 : 2 : 0 where the range of ( x , y ) is content has a sample bit depth of 10 bits per sample , the bit depth of 10 bits per sample can be maintained in all or most internal modules of the encoder and decoder . The sample bit depth can be reduced to 8 bits per sample , if necessary, after [0 , " -1 ] > [o, 5 –1 ] . unpacking the content to 4 : 4 : 4 format at the receiving end . 65 More generally , the sample values of frames of the higher For example , during pre- processing, the sample values resolution chroma sampling format can have a first bit depth 29, 15 , 7 , and 18 for locations ( 2x, 2y ) , (2x + 1 , 2y ) , ( 2x , (such as 8 , 10 , 12 or 16 bits per sample ) , while the sample US 9 ,979 ,960 B2 31 32 values of frames of the lower - resolution chroma sampling FIG . 11 , the approach ( 1100 ) can be used as is , except that format ( following frame packing ) have a second bit depth the areas A4 and A5 will not have data . higher than the first bit depth . In example implementations, new values for content _ int D . Alternatives for YUV 4 : 2 : 2 Video . erpretation _ type are defined to signal the packing of YUV In many of the foregoing examples, YUV 4 :4 :4 frames are 5 4 :2 : 2 frames into the constituent YUV 4 : 2: 0 frames, as packed into YUV 4 : 2 :0 frames for encoding and decoding. shown in the following table .

Value Interpretation O Unspecified relationship between the frame packed constituent frames . 1 Indicates that the two constituent frames form the left and right views of a stereo view scene , with frame ( being associated with the left view and frame 1 being associated with the right view . 2 Indicates that the two constituent frames form the right and left views of a stereo view scene , with frame 0 being associated with the right view and frame 1 being associated with the left view . Indicates that the two constituent frames form the main and auxiliary YUV 4 : 2: 0 frames representing a YUV 4 : 2 : 2 frame, with frame ( being associated with the main view and frame 1 being associated with the auxiliary view . Indicates that the chroma samples of frame o should be interpreted as unfiltered samples of the 4 : 2 : 2 frame (without anti - alias filtering ). Indicates that the two constituent frames form the main and auxiliary YUV 4 : 2 : 0 frames representing a YUV 4 : 2 : 2 frame, with frame ( being associated with the main view and frame 1 being associated with the auxiliary view . Indicates that the chroma samples of frame ( should be interpreted as having been anti - alias filtered prior to frame packing . Indicates that the two constituent frames form the main and auxiliary YUV 4 : 2 : 0 frames representing a YUV 4 : 2 : 2 frame, with frame 1 being associated with the main view and frame o being associated with the auxiliary view . Indicates that the chroma samples of frame 1 should be interpreted as unfiltered samples of the 4 : 2 : 2 frame (without anti - alias filtering) . 10 Indicates that the two constituent frames form the main and auxiliary YUV 4 : 2 :0 frames representing a YUV 4 : 2 : 2 frame, with frame 1 being associated with the main view and frame ( being associated with the auxiliary view . Indicates that the chroma samples of frame 1 should be interpreted as having been anti - alias filtered prior to frame packing.

In other examples , YUV 4 :2 :2 frames are packed into YUV 35 Alternatively , different values for the syntax element 4 :2 : 0 frames for encoding and decoding. A typical 4 :2 : 2 content_ interpretation _ type are associated with the interpre frame contains 8 sample values for every 4 pixel positions, tations shown in the preceding table . Or, other and /or while a 4 : 2 : 0 frame contains only 6 sample values for every additional interpretations for content_ interpretation _ type 4 pixel positions . So , the sample values contained in a 4 :2 : 2 can be used to support encoding /decoding of frames of a frame can be packed into 4 / 3 4 :2 : 0 frames . That is , when 40 lower -resolution chroma sampling format obtained from one packed efficiently , three 4 : 2 : 2 frames can be packed into four or more frames of a higher - resolution chroma sampling 4 :2 : 0 frames . format by frame packing . In one approach , the frame packing for 4 : 2 : 2 frames is E . Other Chroma Sampling Formats . done in a simple manner similar to the simple approach . Many of the examples described herein involve variations (800 ) illustrated in FIG . 8 for 4 :4 :4 to 4 : 2 : 0 frame packing of YUV color spaces such as Y 'UV , YIQ , Y 'IQ , YdbDr, In other approaches, a YUV 4 : 2 : 2 frame is packed into YCbCr, YCOCg , etc . in sampling ratios such as 4 : 4 : 4 , 4 : 2 : 2 , YUV 4 :2 :0 frames while maintaining geometric correspon 4 : 2 :0 , etc ., as the chroma sampling formats . Alternatively , dence for chroma information of the YUV 4 : 2 : 2 frame. The the described approaches can be used for color spaces such resulting YUV 4 : 2 :0 frames with good geometric correspon - 50 as RGB , GBR , etc . in sampling ratios such as 4 : 4 : 4 , 4 :2 : 2 , dence among their Y , U and V components can be com 4 : 2: 0 , etc. , as the chroma sampling formats . For example , a pressed better because they fit the model expected by a device can pack frames of a higher- resolution non - YUV typical encoder adapted to encoded YUV 4 : 2 : 0 frames . At chroma sampling format ( such as RGB 4 :4 : 4 or GBR 4 : 4 : 4 ) the same time, the packing can be done such that a YUV into frames of a lower resolution format ( such as a 4 : 2 : 0 4 : 2 : 0 frame represents the complete scene being represented 55 format ) , which may then be encoded . In the encoding, the by YUV 4 : 2 : 2 frame , albeit with color components at a nominally luma component and nominally chroma compo lower resolution . nents represent sample values of the non - YUV components These design constraints can be satisfied while packing a ( rather than approximate brightness and color -difference YUV 4 :2 : 2 into two YUV 4 : 2 : 0 frames (main view and values ) . In corresponding unpacking , a device unpacks auxiliary view ) . The auxiliary view will have " empty ” areas, 60 frames of the lower resolution format ( such as a 4 : 2 : 0 but these areas can be filled using a fixed value or by format ) into frames of the higher -resolution non - YUV replicating chroma values . Or, the empty areas can be used chroma sampling format ( such as RGB 4 : 4 : 4 or GBR 4 : 4 : 4 ) . to indicate other information such as depth of a scene . For Also , the described approaches can be used for frame example , for the packing approach (900 ) described with packing of video content of a 4 : 4 : 4 format, 4 : 2 : 2 format or reference to FIG . 9 , the approach ( 900 ) can be used as is , 65 4 :2 : 0 format into a 4 :0 : 0 format , which is typically used for except that the areas B4 and B5 will not have data . Or, for gray scale or monochrome video content . The chroma the packing approach ( 1100 ) described with reference to information from a frame of the 4 : 4 : 4 format, 4 : 2 : 2 format US 9 ,979 , 960 B2 33 34 or 4 : 2 : 0 format can be packed into the primary component ing can be signaled as part of a supplemental enhancement of one or more additional or auxiliary frames of 4 : 0 : 0 information message or as some other type of metadata . format . FIG . 13 shows a generalized technique ( 1300 ) for frame F . Generalized Techniques for Frame Packing /Unpacking . unpacking . A computing device that implements a frame FIG . 12 shows a generalized technique ( 1200 ) for frame 5 unpacker , for example , as described with reference to FIG . packing . A computing device that implements a frame 5 . can perform the technique ( 1300 ). packer, for example , as described with reference to FIG . 4 , can perform the technique (1200 ) . Before the frame unpacking itself, the device can decode The device packs ( 1210 ) one or more frames of a higher ( 1310 ) the frame( s) of a lower- resolution chroma sampling resolution chroma sampling format into one or more frames 10 format . Alternatively, a different device performs the decod of a lower- resolution chroma sampling format. For example , ing ( 1310 ) . the device packs frame( s ) of 4 :4 : 4 format ( e .g ., YUV 4 :4 :4 The device unpacks ( 1320 ) one or more frames of the format ) into frame( s ) of 4 : 2 : 0 format ( e . g . , YUV 4 : 2 : 0 lower -resolution chroma sampling format into one or more format ) . Or, the device packs frame( s ) of 4 : 2 : 2 format ( e . g ., frames of a higher - resolution chroma sampling format . For YUV 4 : 2 : 2 format) into frame( s ) of 4 : 2 : 0 format (eg . YUV 15 example , the device unpacks frame( s ) of 4 : 2 : 0 format ( e . g . , 4 : 2 : 0 format) . Or, the device packs frame( s ) of 4 : 4 : 4 format YUV 4 : 2 : 0 format) into frame( s ) of 4 : 4 : 4 format ( e . g ., YUV ( e . g . , YUV 4 : 4 :4 format ) into frame ( s ) of 4 : 2 : 2 format ( e .g ., 4 : 4 : 4 format ) . Or, the device unpacks frame( s ) of 4 : 2 :0 YUV 4 :2 : 2 format) . format ( e . g ., YUV 4 : 2 : 0 format) into frame( s ) of 4 : 2 : 2 For YUV formats , the device can perform the frame format ( e . g . , YUV 4 : 2 : 2 format ) . Or, the device unpacks packing ( 1210 ) so as to maintain geometric correspondence 20 frame( s ) of 4 :2 : 2 format ( e. g ., YUV 4 :2 : 2 format) into between adjacent sample values of chroma components of frame( s ) of 4 :4 : 4 format ( e . g . , YUV 4 : 4 : 4 format ) . the frame( s ) of the higher - resolution chroma sampling for - When a lower chroma resolution version of the frame( s ) mat after the packing . For example , such sample values are of the higher- resolution chroma sampling format is embed maintained as adjacent samples and /or collocated portions of ded as part of the frame ( s ) of the lower- resolution chroma luma and chroma components of the frame ( s ) of the lower - 25 sampling format, the device has options for display . The part resolution chroma sampling format. Later encoding can of the frame( s ) of the lower - resolution chroma sampling exploit such geometric correspondence . format that represents a lower chroma resolution version of In some frame packing approaches, the device can embed the frame( s ) of the higher -resolution chroma sampling for a lower chroma resolution version of the frame ( s ) of the mat can be reconstructed for output and display . The rest of higher - resolution chroma sampling format as part of the 30 the frame( s ) of the lower - resolution chroma sampling format frame( s ) of the lower - resolution chroma sampling format. represents remaining chroma information from the frame( s ) Thus , part of the frame( s ) of the lower- resolution chroma of the higher - resolution chroma sampling format, and can be sampling format represents a lower chroma resolution ver - used as part of frame unpacking . In other frame unpacking sion of the frame( s ) of the higher- resolution chroma sam approaches , to reverse spatial partitioning of the frame( s ) of pling format. The rest of the frame ( s ) of the lower - resolution 35 the higher -resolution chroma sampling format, the device chroma sampling format represents remaining chroma infor - assigns sample values of luma and chroma components of mation from the frame( s ) of the higher - resolution chroma the frame( s ) of the lower - resolution chroma sampling format sampling format. In other frame packing approaches to chroma components of the frame( s ) of the higher - reso according to spatial partitioning of the frame( s ) of the lution chroma sampling format . higher - resolution chroma sampling format, the device 40 During the frame unpacking, the sample values of chroma assigns sample values of chroma components of the frame( s ) components of the frame( s ) of the higher -resolution chroma of the higher- resolution chroma sampling format to luma sampling format can be filtered as part of post -processing . In and chroma components of the frame( s ) of the lower - some implementations , at least some sample values of the resolution chroma sampling format. chroma components of the frame( s ) of the higher- resolution During the frame packing, the sample values of chroma 45 chroma sampling format have a higher bit depth ( e . g . , 10 bits components of the frame ( s ) of the higher - resolution chroma per sample ) before the post- processing filtering , and such sampling format can be filtered , and filtered sample values sample values have a lower bit depth ( e . g . , 8 bits per sample ) are assigned to parts of chroma components of the frame( s ) after the post- processing filtering . of the lower- resolution chroma sampling format . In some The device can also receive metadata about frame pack implementations , the sample values of the chroma compo - 50 ing /unpacking . For example , the device receives metadata nents of the frame ( s ) of the higher - resolution chroma sam - that indicates whether frame packing/ unpacking is used or pling format have a lower bit depth ( e. g ., 8 bits per sample ), not used . Or, the device receives an indication that the and the filtered sample values have a higher bit depth ( e . g . , sample values of the chroma components of the frame( s ) of 10 bits per sample ) for encoding at the higher bit depth . the higher - resolution chroma sampling format have been The device can then encode (1220 ) the frame( s ) of the 55 filtered during the frame packing , and should be filtered as lower - resolution chroma sampling format. Alternatively , a part of post - processing . The metadata about frame packing / differentdevice performsthe encoding ( 1220 ). The device ( s ) unpacking can be signaled as part of a supplemental can repeat the technique ( 1200 ) on a frame -by - frame basis enhancement information message or as some other type of or other basis . metadata . The device can signal metadata about frame packing / 60 The device ( s ) can repeat the technique ( 1300 ) on a unpacking . For example , the device signals metadata that frame- by - frame basis or other basis . indicates whether frame packing/ unpacking is used or not In view of the many possible embodiments to which the used . Or, the device signals an indication that the sample principles of the disclosed invention may be applied , it values of the chroma components of the frame ( s ) of the should be recognized that the illustrated embodiments are higher -resolution chroma sampling format have been filtered 65 only preferred examples of the invention and should not be during the frame packing , and should be filtered as part of taken as limiting the scope of the invention . Rather, the post - processing . Themetadata about framepacking / unpack - scope of the invention is defined by the following claims. We US 9 ,979 , 960 B2 35 36 therefore claim as our invention all that comes within the maintaining geometric correspondence among the sample scope and spirit of these claims. values of the chroma components of the one or more frames of the higher - resolution chroma sampling for We claim : mat after the packing; and 1 . A method comprising: 5 embedding a lower chroma resolution version of the one receiving one or more frames of a higher -resolution or more frames of the higher- resolution chroma sam chroma sampling format , the one or more frames of the higher- resolution chroma sampling format including pling format as part of the one or more frames of the sample values of chroma components and sample val lower- resolution chroma sampling format . ues of luma components ; and 8 . The method of claim 1 wherein the luma component of packing the one or more frames of the higher - resolution one of the one or more frames of the lower- resolution chroma sampling format into one or more frames of a chroma sampling format is a luma component of an auxil lower - resolution chroma sampling format , wherein the iary view among the one or more frames of the lower lower -resolution chroma sampling format has a lower resolution chroma sampling format , and wherein the pack chroma resolution than the higher- resolution chroma 15 ing 1includes , for the chroma components of the one or more sampling format , and wherein the packing includes frames of the higher- resolution chroma sampling format: assigning some of the sample values of the chroma assigning sample values corresponding to every second components of the one or more frames in the higher row or column of the chroma components of the one or resolution chroma sampling format to sample values of more frames of the higher - resolution chroma sampling a luma component of one of the one or more frames of 20 format to sample values of the luma component of the the lower- resolution chroma sampling format. auxiliary view among the one or more frames of the 2 . The method of claim 1 further comprising , after the lower -resolution chroma sampling format; packing: assigning every second sample value corresponding to encoding the one or more frames of the lower- resolution other rows or columns of the chroma components of the chroma sampling format. 25 one or more frames of the higher - resolution chroma 3 . The method of claim 2 wherein the packing maintains sampling format to sample values of chroma compo geometric correspondence between adjacent sample values nents of a main view among the one or more frames of of the chroma components of the one or more frames of the the lower- resolution chroma sampling format ; and higher- resolution chroma sampling format as adjacent assigning other sample values corresponding to the other samples and / or collocated portions of luma and chroma 30 rows or columns of the chroma components of the one components of the one or more frames of the lower -resolu or more frames of the higher - resolution chroma sam tion chroma sampling format, and wherein the encoding pling format to sample values of chroma components of exploits the geometric correspondence . the auxiliary view among the one or more frames of the 4 . The method of claim 3 wherein the encoding exploits lower- resolution chroma sampling format. the geometric correspondence ( a ) in encoding operations 35 9 . The method of claim 1 wherein the packing includes : that include one or more of derivation of motion vectors and anti - alias filtering the sample values of the chroma com derivation of prediction modes , and / or (b ) to guide encoding ponents of the one or more frames of the higher decisions that include one or more of motion estimation , resolution chroma sampling format; and selection of quantization parameters and selection of pre assigning the filtered sample values to sample values of diction modes . 40 chroma components of a main view among the one or 5 . The method of claim 1 wherein first parts of the one or more frames of the lower -resolution chroma sampling more frames of the lower - resolution chroma sampling for format . mat represent a lower chroma resolution version of the one 10 . The method of claim 1 wherein the sample values of or more frames of the higher- resolution chroma sampling the one or more frames of the higher - resolution chroma format, and wherein second parts of the one or more frames 45 sampling format have a first bit depth , and wherein sample of the lower -resolution chroma sampling format, the second values of the one or more frames of the lower -resolution parts including the luma component of one of the one or chroma sampling format have a second bit depth higher than more frames of the lower- resolution chroma sampling for the first bit depth . mat, represent remaining chroma information from the one 11 . The method of claim 1 further comprising : or more frames of the higher - resolution chroma sampling 50 signaling an indication of whether the sample values of format . the chroma components of the one or more frames of 6 . The method of claim 1 further comprising: the higher -resolution chroma sampling format have signaling metadata that indicates that first parts of the one been filtered as part of the packing. or more frames of the lower -resolution chroma sam 12 . A computing device comprising : pling format represent a lower chroma resolution ver - 55 one or more processing units ; sion of the one or more frames of the higher - resolution volatile memory ; and chroma sampling format, and indicates that second non - volatile memory and / or storage , the non - volatile parts of the one or more frames of the lower - resolution memory and / or storage having stored therein computer chroma sampling format, the second parts including the executable instructions for causing the computing luma component of one of the one or more frames of 60 device , when programmed thereby , to perform opera the lower- resolution chroma sampling format , repre tions to reconstruct frames of a higher -resolution sent remaining chroma information from the one or chroma sampling format , the operations comprising : more frames of the higher- resolution chroma sampling receiving one or more frames of a lower - resolution format. chroma sampling format , wherein the lower -resolu 7 . The method of claim 1 wherein the packing is consis - 65 tion chroma sampling format has a lower chroma tent with plural design constraints , the plural design con resolution than the higher- resolution chroma sam straints including : pling format; and US 9 , 979 , 960 B2 37 38 unpacking the one or more frames of the lower - reso to sample values at other positions of the other rows or lution chroma sampling format into one or more columns of the chroma components of the one or more frames of the higher -resolution chroma sampling frames of the higher -resolution chroma sampling for format, the one or more frames of the higher -reso mat . lution chroma sampling format including sample 5 17 . The computing device of claim 12 wherein the values of chroma components and sample values of unpacking includes : luma components , wherein the unpacking includes filtering the sample values of the chroma components of assigning sample values of a luma component of one the one ormore frames of the higher - resolution chroma of the one or more frames of the lower - resolution chroma sampling format to some of the sample 10 sampling format . values of the chroma components of the one or more 18 . The computing device of claim 12 wherein the sample frames in the higher -resolution chroma sampling values of the one or more frames of the higher- resolution format . chroma sampling format have a first bit depth , and wherein 13 . The computing device of claim 12 wherein the opera sample values of the one or more frames of the lower tions further comprise , before the receiving and the unpack - 15 resolution€ chroma sampling format have a second bit depth ing : higher than the first bit depth . decoding the one or more frames of the lower - resolution 19 . The computing device of claim 12 wherein the opera chroma sampling format. tions further comprise : 14 . The computing device of claim 12 wherein first parts receiving an indication of whether the sample values of of the one or more frames of the lower - resolution chroma 20 chroma components of the one or more frames of the sampling format represent a lower chroma resolution ver higher - resolution chroma sampling format have been sion of the one or more frames of the higher - resolution filtered as part of packing. chroma sampling format, and wherein second parts of the 20 . A computing device comprising a processor and one or more frames of the lower- resolution chroma sampling memory , wherein the computing device implements : format , the second parts including the luma component of 25 a video encoder adapted to encode and / or video decoder one of the one or more frames of the lower- resolution adapted to decode frames of a YUV 4 : 2 :0 format ; and chroma sampling format, represent remaining chroma infor a module for processing a supplemental enhancement mation from the one or more frames of the higher - resolution information message that indicates ( a ) first parts of one chroma sampling format. or more frames of the YUV 4 : 2 : 0 format represent a 15 . The computing device of claim 12 wherein the opera - 30 lower chroma resolution version of one or more frames tions further comprise : of a YUV 4 : 4 : 4 format, ( b ) second parts of the one or receiving metadata that indicates first parts of the one or more frames of the YUV 4 : 2 : 0 format represent more frames of the lower - resolution chroma sampling remaining chroma information from the one or more format represent a lower chroma resolution version of frames of the YUV 4 : 4 : 4 format, and ( c ) whether the one or more frames of the higher - resolution chroma 35 sample values of chroma components of the one or sampling format, and indicates second parts of the one more frames of the YUV 4 : 4 : 4 format have been or more frames of the lower- resolution chroma sam filtered as part of packing into the one or more frames pling format, the second parts including the luma of the YUV 4 : 2 : 0 format, some of the sample values of component of one of the one or more frames of the the chroma components of the one or more frames of lower- resolution chroma sampling format, represent 40 the YUV 4 :4 : 4 format being assigned to sample values remaining chroma information from the one or more of a luma component among the second parts of the one frames of the higher- resolution chroma sampling for or more frames of the YUV 4 :2 : 0 format. mat. 21 . One or more tangible computer -readable media stor 16 . The computing device of claim 12 wherein the luma ing computer- executable instructions for causing one or component of one of the one or more frames of the lower - 45 more processing units , when programmed thereby , to per resolution chroma sampling format is a luma component of form operations to reconstruct frames of a higher -resolution an auxiliary view among the one or more frames of the chroma sampling format, the operations comprising : lower - resolution chroma sampling format, and wherein the receiving one or more frames of a lower - resolution unpacking includes , for the chroma components of the one chroma sampling format, wherein the lower -resolution or more frames of the higher -resolution chroma sampling 50 chroma sampling format has a lower chroma resolution format : than the higher- resolution chroma sampling format ; assigning the sample values corresponding to the luma and component of the auxiliary view among the one or unpacking the one or more frames of the lower - resolution more frames of the lower- resolution chroma sampling chroma sampling format into one or more frames of the format to sample values of every second row or column 55 higher -resolution chroma sampling format, the one or of the chroma components of the one or more frames of more frames of the higher - resolution chroma sampling the higher -resolution chroma sampling format; format including sample values of chroma components assigning sample values corresponding to chroma com and sample values of luma components , wherein the ponents of a main view among the one or more frames unpacking includes assigning sample values of a luma of the lower - resolution chroma sampling format to 60 component of one of the one or more frames of the sample values at every second position of other rows or lower - resolution chroma sampling format to some of columns of the chroma components of the one or more the sample values of the chroma components of the one frames of the higher -resolution chroma sampling for or more frames in the higher - resolution chroma sam mat ; and pling format. assigning sample values corresponding to chroma com - 65 22 . The one or more tangible computer- readable media of ponents of the auxiliary view among the one or more claim 21 wherein the operations further comprise , before the frames of the lower - resolution chroma sampling format receiving and the unpacking : US 9 ,979 ,960 B2 39 40 decoding the one or more frames of the lower -resolution higher - resolution chroma sampling format to sample chroma sampling format. values of a luma component of one of the one or 23 . The one or more tangible computer - readable media of more frames of the lower -resolution chroma sam claim 21 wherein the luma component of one of the one or pling format. more frames of the lower- resolution chroma sampling for - 5 26 . The computing device of claim 25 wherein the pack mat is a luma component of an auxiliary view among the one ing maintains geometric correspondence between adjacent or more frames of the lower - resolution chroma sampling sample values of the chroma components of the one or more format , and wherein the unpacking includes , for the chroma frames of the higher- resolution chroma sampling format as components of the one or more frames of the higher adjacent samples and / or collocated portions of luma and resolution chroma sampling format: 10 chroma components of the one or more frames of the assigning the sample values corresponding to the luma lower -resolution chroma sampling format . component of the auxiliary view among the one or 27 . The computing device of claim 26 wherein the opera more frames of the lower - resolution chroma sampling tions further comprise , after the packing , encoding the one format to sample values of every second row or column or more frames of the lower - resolution chroma sampling of the chroma components of the one ormore frames off 1515 format" , wherein the encoding exploits the geometric corre the higher - resolution chroma sampling format ; spondence . assigning sample values corresponding to chroma com 28 . The computing device of claim 25 wherein the opera ponents of a main view among the one or more frames tions further comprise : of the lower- resolution chroma sampling format to signaling metadata that indicates that first parts of the one sample values at every second position of other rows or 20 or more frames of the lower - resolution chroma sam columns of the chroma components of the one or more pling format represent a lower chroma resolution ver frames of the higher- resolution chroma sampling for sion of the one or more frames of the higher - resolution mat ; and chroma sampling format, and indicates that second assigning sample values corresponding to chroma com parts of the one or more frames of the lower -resolution ponents of the auxiliary view among the one or more 25 chroma sampling format , the second parts including the frames of the lower - resolution chroma sampling format luma component of one of the one or more frames of to sample values at other positions of the other rows or the lower -resolution chroma sampling format, repre columns of the chroma components of the one or more sent remaining chroma information from the one or frames of the higher- resolution chroma sampling for more frames of the higher- resolution chroma sampling mat. format. 24 . The one or more tangible computer -readable media of 29 . The computing device of claim 25 wherein the luma claim 21 wherein the sample values of the one or more component of one of the one or more frames of the lower frames of the higher - resolution chroma sampling format resolution chroma sampling format is a luma component of have a first bit depth , and wherein sample values of the one an auxiliary view among the one or more frames of the or more frames of the lower - resolution chroma sampling 35 lower- resolution chroma sampling format , and wherein the format have a second bit depth higher than the first bit depth . packing includes, for the chroma components of the one or 25 . A computing device comprising : more frames of the higher -resolution chroma sampling for one or more processing units ; mat: volatile memory ; and assigning sample values corresponding to every second non -volatile memory and / or storage, the non - volatile 40 row or column of the chroma components of the one or memory and / or storage having stored therein computer more frames of the higher- resolution chroma sampling executable instructions for causing the computing format to sample values of the luma component of the device , when programmed thereby, to perform opera auxiliary view among the one or more frames of the tions comprising : lower- resolution chroma sampling format ; receiving one or more frames of a higher- resolution 45 assigning every second sample value corresponding to chroma sampling format, the one or more frames of other rows or columns of the chroma components of the the higher - resolution chroma sampling format one or more frames of the higher - resolution chroma including sample values of chroma components and sampling format to sample values of chroma compo sample values of luma components ; and nents of a main view among the one or more frames of packing the one or more frames of the higher - resolution 50 the lower- resolution chroma sampling format; and chroma sampling format into one or more frames of assigning other sample values corresponding to the other a lower -resolution chroma sampling format , wherein rows or columns of the chroma components of the one the lower- resolution chroma sampling format has a or more frames of the higher- resolution chroma sam lower chroma resolution than the higher- resolution pling format to sample values of chroma components of chroma sampling format , and wherein the packing 55 the auxiliary view among the one or more frames of the includes assigning some of the sample values of the lower- resolution chroma sampling format . chroma components of the one or more frames in the * * * * *