High-Throughput Lossy-To-Lossless 3D Image Compression
IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 40, NO. 2, FEBRUARY 2021, p. 607

Diego Rossinelli, Gilles Fourestey, Felix Schmidt, Björn Busse, and Vartan Kurtcuoglu

Manuscript received September 9, 2020; revised October 17, 2020; accepted October 20, 2020. Date of publication October 23, 2020; date of current version February 2, 2021. This work was supported, in part, by the Swiss National Science Foundation through NCCR Kidney.CH as well as through Grants 153523 and 182683. (Corresponding author: Diego Rossinelli.)

Diego Rossinelli is with the Institute of Physiology, University of Zurich, 8057 Zürich, Switzerland, and also with Lucid Concepts AG, 8005 Zürich, Switzerland (e-mail: [email protected]). Gilles Fourestey is with SCITAS, EPFL, 1015 Lausanne, Switzerland (e-mail: gilles.fourestey@epfl.ch). Felix Schmidt and Björn Busse are with the Center for Experimental Medicine, Institute of Osteology and Biomechanics, Universitätsklinikum Hamburg-Eppendorf (UKE), 20251 Hamburg, Germany (e-mail: [email protected]; [email protected]). Vartan Kurtcuoglu is with the Institute of Physiology, University of Zurich, 8057 Zürich, Switzerland (e-mail: [email protected]).

Abstract— The rapid increase in medical and biomedical image acquisition rates has opened up new avenues for image analysis, but has also introduced formidable challenges. This is evident, for example, in selective plane illumination microscopy, where acquisition rates of about 1-4 GB/s sustained over several days have redefined the scale of I/O bandwidth required by image analysis tools. Although the effective bandwidth could, principally, be increased by lossy-to-lossless data compression, this is of limited value in practice due to the high computational demand of current schemes such as JPEG2000, which reach a compression throughput one order of magnitude below that of image acquisition. Here we present a novel lossy-to-lossless data compression scheme with a compression throughput well above 4 GB/s and compression rates and rate-distortion curves competitive with those achieved by JPEG2000 and JP3D.

Index Terms— Image compression, integration of multi-scale information, parallel computing.

I. INTRODUCTION

Merkel has recently referred to the challenge of effectively handling very large sets of images as "the struggle with image glut" [1]. While this struggle is already evident in medical imaging, the development of adjacent fields foreshadows what is yet to come. For instance, selective plane illumination microscopy (SPIM), a tool employed, e.g., in developmental biology, may generate data at rates of up to 4 GB/s [2]. Image sets of 10-30 TB in size are not unusual. Other biomedical imaging modalities producing large footprint images include optical coherence tomography [3] and high-resolution peripheral quantitative computed tomography (HR-pQCT) [4], used clinically to assess bone structure and strength, particularly in the elderly. Such large footprints call for parallel file systems for archival as well as HPC clusters for image analysis, as illustrated by Reynaud et al. [5]. Furthermore, sharing those large images over the network is still a largely unsolved problem.

Over the last three decades, the signal processing community has developed sophisticated data compression schemes for images, relying on the concept of multi-resolution analysis (MRA) and wavelets [6]–[11]. These schemes have brought two game-changing benefits: a substantial increase in effective storage capacity and a drastic increase in effective I/O and network bandwidth. Part of this research culminated in the JPEG 2000 standard [12], with 3D images considered in Part 10 [13], which is usually referred to as "JP3D".

The bitstreams generated by JPEG2000 are not just compressed, but also lossy-to-lossless, scalable with respect to quality and resolution, and region of interest (ROI)-accessible. Lossy-to-lossless refers to the ability to read just a fraction of the compressed bitstream to get a reasonable approximation of the entire image, whereas reading the entire bitstream leads to a lossless decompression. Quality-scalable bitstreams provide us with the ability to control the distortion by prescribing a reading bitrate. The efficiency of a quality-scalable bitstream is generally characterized by its rate-distortion curve (r-d curve) in terms of peak signal-to-noise ratio (PSNR) versus bits-per-sample (BPS). ROI-accessibility allows us to read exclusively those portions of the bitstream describing a specific ROI, whereas resolution-scalable bitstreams directly expose sequences of bits representing a specific resolution.

While lossy-to-lossless is, principally, a promising approach to handling and sharing the image glut in medicine and the life sciences, the JPEG2000 and JP3D formats are inadequate: on the latest CPUs, their performance is one order of magnitude below what is required to keep up with the highest image acquisition rates [1], [5], [14]. Pursuing substantial improvements in compression speed, Amat et al. [14] proposed the Keller-Lab Block (KLB) file format, achieving a throughput of about 600 MB/s (which would correspond to 3 GB/s on the platforms considered here, assuming perfect scaling). This was bought at the expense of dropping all bitstream features but ROI-accessibility. As the file format acronym suggests, the raw file is decomposed into spatiotemporal tiles - hence the ROI-accessibility - and each block is independently compressed, exploiting the available thread-level parallelism (TLP).
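To make the rate-distortion characterization concrete, the following minimal Python sketch (NumPy only) computes one (BPS, PSNR) point of an r-d curve from an original image and the approximation decoded from a truncated bitstream prefix. The function names `psnr` and `rd_point` are our own illustrative choices, not part of any codec API discussed here.

```python
import numpy as np

def psnr(original: np.ndarray, approx: np.ndarray, bit_depth: int = 8) -> float:
    """Peak signal-to-noise ratio in dB; higher PSNR means lower distortion."""
    mse = np.mean((original.astype(np.float64) - approx.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical arrays: lossless reconstruction
    peak = (1 << bit_depth) - 1  # e.g. 255 for 8-bit samples
    return 10.0 * np.log10(peak ** 2 / mse)

def rd_point(nbits_read: int, original: np.ndarray, approx: np.ndarray):
    """One point of the rate-distortion curve: bits-per-sample vs. PSNR."""
    bps = nbits_read / original.size  # bits of the bitstream actually read
    return bps, psnr(original, approx)
```

Sweeping `nbits_read` over prefixes of a quality-scalable bitstream and decoding each prefix traces out the full r-d curve.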
Color versions of one or more of the figures in this article are available online at https://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMI.2020.3033456. 0278-0062 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Today's image analysis software is primarily limited by I/O bandwidth – much more so than by storage [1]. Quality-scalable bitstreams directly address this issue. Visually lossless image previews can be achieved by reading perhaps just 10% of the bitstream, resulting in a 10X boost of the effective I/O bandwidth. While the KLB format typically leads to a 2:1 compression rate, it brings no benefits in terms of effective I/O bandwidth, as KLB bitstreams are not quality-scalable.

A. Contributions

The contributions of this article are as follows. We demonstrate that it is possible to devise data compression schemes generating scalable lossy-to-lossless bitstreams at throughputs exceeding the acquisition rates of the latest microscopes and scanners. Specifically, the scheme described herein leads to compressed bitstreams featuring:
• quality-scalability and ROI-accessibility
• multiresolution representation
• compression rates comparable to those of JP3D
• r-d curves comparable to those of JP3D
• lossless compression throughput of 30 GB/s, per-node
• lossless decompression throughput of 30 GB/s, per-node
• lossy decompression throughput of 80 GB/s, per-node

We are not aware of open source or commercial counterparts with comparable performance. The effective I/O bandwidth achieved with our scheme allows out-of-core analysis algorithms to be accelerated by one to two orders of magnitude. In the following text we describe our approach and assess both timings and compression rates against the state of the art. The assessment relies on three datasets acquired with three different modalities: wide-field microCT, scanning electron microscopy (SEM) and HR-pQCT.

1) Lifting Scheme: The lifting scheme factors a wavelet transform into prediction and update steps, corresponding to a polyphase matrix factorization of the transform. Among several other advantages, this decomposition provides a two-fold algorithmic improvement over first-generation wavelets [23]. Integer wavelets, whose transform is implemented exclusively with integer operations [21], [24], show the power of the lifting scheme and have direct implications for data compression.

2) Zerotree Codecs: A major advancement in image compression came in the form of zerotree codecs in conjunction with wavelets, such as the embedded zerotrees wavelet codec (EZW) proposed by Shapiro [25]. These codecs generate quality-scalable bitstreams by exploiting the parent-children coefficient correlation across the image's MRA. Zerotree-based codecs saw a peak in recognition with the work of Said and Pearlman [26], [27], where the reasons for the outstanding EZW efficiency were elucidated and the set partitioning in hierarchical trees (SPIHT) codec was outlined. The SPIHT algorithm is surprisingly compact and improves upon the compression results achieved by EZW.

With their ability to encode zerotrees - insignificant pyramidal regions within the MRA - with a single symbol, zerotree codecs [25]–[27] lead to outstanding compression rates. With respect to other 2D and 3D codecs, zerotree codecs provide us with several other advantages:
• compactness and low computational complexity
• flexibility in granularity
• capture of data correlation across subbands

Simplicity and compactness enable us to quickly assess (and often discard) ways to map these codecs onto current CPUs. The ability to process groups of coefficients rather than individual coefficients gives us the flexibility to trade compression efficiency for speed.
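To illustrate integer lifting, here is a minimal Python sketch of one level of a reversible LeGall 5/3 wavelet transform, the filter pair used for lossless coding in JPEG2000. The predict step turns odd samples into detail coefficients, the update step smooths the even samples, and because only integer arithmetic is used the inverse recovers the input exactly. The function names and the simple boundary extension are our own illustrative choices, not the authors' implementation.

```python
import numpy as np

def fwd_53(x: np.ndarray):
    """One level of the integer LeGall 5/3 lifting transform (even-length input)."""
    x = x.astype(np.int64)
    even, odd = x[0::2], x[1::2]
    # Predict: detail = odd sample minus the average of its even neighbors,
    # replicating the last even sample at the right boundary.
    right = np.append(even[1:], even[-1])
    d = odd - ((even + right) >> 1)
    # Update: smooth the even samples using neighboring details,
    # replicating the first detail at the left boundary.
    left = np.append(d[0], d[:-1])
    s = even + ((left + d + 2) >> 2)
    return s, d

def inv_53(s: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Invert by undoing the lifting steps in reverse order with flipped signs."""
    left = np.append(d[0], d[:-1])
    even = s - ((left + d + 2) >> 2)
    right = np.append(even[1:], even[-1])
    odd = d + ((even + right) >> 1)
    out = np.empty(even.size + odd.size, dtype=np.int64)
    out[0::2], out[1::2] = even, odd
    return out
```

Each lifting step adds a function of the other half of the samples, so the inverse simply subtracts the same function in reverse order; this is what makes integer lifting exactly reversible and hence suitable for lossy-to-lossless coding.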