Learning Better Lossless Compression Using Lossy Compression
Fabian Mentzer (ETH Zurich), Luc Van Gool (ETH Zurich), Michael Tschannen (Google Research, Brain Team)
[email protected], [email protected], [email protected]

arXiv:2003.10184v1 [cs.CV] 23 Mar 2020

Abstract

We leverage the powerful lossy image compression algorithm BPG to build a lossless image compression system. Specifically, the original image is first decomposed into the lossy reconstruction obtained after compressing it with BPG and the corresponding residual. We then model the distribution of the residual with a convolutional neural network-based probabilistic model that is conditioned on the BPG reconstruction, and combine it with entropy coding to losslessly encode the residual. Finally, the image is stored using the concatenation of the bitstreams produced by BPG and the learned residual coder. The resulting compression system achieves state-of-the-art performance in learned lossless full-resolution image compression, outperforming previous learned approaches as well as PNG, WebP, and JPEG2000.

Figure 1. Overview of the proposed learned lossless compression approach. To encode an input image x, we feed it into the Q-Classifier (QC) CNN to obtain an appropriate quantization parameter Q, which is used to compress x with BPG. The resulting lossy reconstruction xl is fed into the Residual Compressor (RC) CNN, which predicts the probability distribution of the residual, p(r|xl), conditionally on xl. An arithmetic coder (AC) encodes the residual r to a bitstream, given p(r|xl). In gray we visualize how to reconstruct x from the bitstream. Learned components are shown in violet.

1. Introduction

The need to efficiently store the ever growing amounts of data generated continuously on mobile devices has spurred a lot of research on compression algorithms. Algorithms like JPEG [51] for images and H.264 [53] for videos are used by billions of people daily.

After the breakthrough results achieved with deep neural networks in image classification [27], and the subsequent rise of deep-learning based methods, learned lossy image compression has emerged as an active area of research (e.g. [6, 45, 46, 37, 2, 4, 30, 28, 48]). In lossy compression, the goal is to achieve small bitrates R given a certain allowed distortion D in the reconstruction, i.e., the rate-distortion trade-off R + λD is optimized. In contrast, in lossless compression, no distortion is allowed, and we aim to reconstruct the input perfectly by transmitting as few bits as possible. To this end, a probabilistic model of the data can be used together with entropy coding techniques to encode and transmit data via a bitstream. The theoretical foundation for this idea is given in Shannon's landmark paper [40], which proves a lower bound for the bitrate achievable by such a probabilistic model, and the overhead incurred by using an imprecise model of the data distribution. One beautiful result is that maximizing the likelihood of a parametric probabilistic model is equivalent to minimizing the bitrate obtained when using that model for lossless compression with an entropy coder (see, e.g., [29]). Learning parametric probabilistic models by likelihood maximization has been studied to a great extent in the generative modeling literature (e.g. [50, 49, 39, 34, 25]). Recent works have linked these results to learned lossless compression [29, 18, 47, 24].

Even though recent learned lossy image compression methods achieve state-of-the-art results on various data sets, the results obtained by the non-learned H.265-based BPG [43, 7] are still highly competitive, without requiring sophisticated hardware accelerators such as GPUs to run. While BPG was outperformed by learning-based approaches across the bitrate spectrum in terms of PSNR [30] and visual quality [4], it still excels particularly at high-PSNR lossy reconstructions.

In this paper, we propose a learned lossless compression system by leveraging the power of the lossy BPG, as illustrated in Fig. 1. Specifically, we decompose the input image x into the lossy reconstruction xl produced by BPG and the corresponding residual r. We then learn a probabilistic model p(r|xl) of the residual, conditionally on the lossy reconstruction xl. This probabilistic model is fully convolutional and can be evaluated using a single forward

Hoogeboom et al. [18] propose Integer Discrete Flows (IDFs), defining an invertible transformation for discrete data. In contrast to L3C, the latter works focus on smaller data sets such as MNIST, CIFAR-10, ImageNet32, and ImageNet64, where they achieve state-of-the-art results.
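The equivalence between likelihood maximization and bitrate minimization mentioned in the introduction can be made concrete with a few lines of Python. This is a self-contained toy, not code from the paper: the expected code length of an entropy coder driven by a model q equals the entropy of the true distribution p plus the KL divergence between p and q, so a better likelihood directly means a lower bitrate.

```python
import math

def expected_bits(p, q):
    """Average code length (bits/symbol) when symbols drawn from the
    true distribution p are coded by an entropy coder driven by model q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.125, 0.125]              # true symbol distribution
entropy = expected_bits(p, p)              # Shannon lower bound: 1.75 bits/symbol
mismatched = expected_bits(p, [0.25] * 4)  # imprecise uniform model: 2.0 bits/symbol
# The 0.25-bit gap is exactly KL(p || q), i.e. the overhead incurred by
# using an imprecise model, as in Shannon's bound [40].
```

The same accounting underlies the learned residual coder: lowering the cross-entropy of p(r|xl) on real residuals lowers the bitstream length produced by the arithmetic coder.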
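The decomposition of Fig. 1 can likewise be sketched in a few lines. Everything below is an illustrative stand-in, not the paper's system: BPG is replaced by crude scalar quantization, the RC CNN's p(r|xl) by a fixed probability table, and the arithmetic coder by its ideal code length of -log2 p(r) bits per residual value.

```python
import math
import random

def ideal_bits(residual, prob_model):
    """Ideal total code length: an arithmetic coder driven by p(r | xl)
    spends about -log2 p(r) bits per residual value."""
    return sum(-math.log2(prob_model(r)) for r in residual)

random.seed(0)
x  = [random.randint(0, 255) for _ in range(16)]  # toy "image"
xl = [min(255, 4 * round(v / 4)) for v in x]      # crude stand-in for the BPG reconstruction
r  = [a - b for a, b in zip(x, xl)]               # residual, here always in {-2, ..., 2}

def toy_prob(r_val):
    """Fixed table standing in for the CNN-predicted p(r | xl)."""
    table = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}
    return table.get(r_val, 1e-6)

bits = ideal_bits(r, toy_prob)
# Decoding mirrors the gray path in Fig. 1: recover r from the bitstream,
# then add it back to the lossy reconstruction.
x_rec = [b + c for b, c in zip(xl, r)]
assert x_rec == x  # lossless by construction
```

Because r is transmitted exactly, the scheme is lossless regardless of how coarse the lossy reconstruction is; a better model of p(r|xl) only changes how many bits the residual costs.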