Biased Run-Length Coding of Bi-Level Classification Label Maps of Hyperspectral Images

Amir L. Liaghati, W. David Pan, Senior Member, IEEE, Zhuocheng Jiang

A. L. Liaghati is with the Boeing Company, 1100 Redstone Gateway SW, Huntsville, AL 35824, USA, and W. D. Pan and Z. Jiang are with the Dept. of Electrical and Computer Engineering, University of Alabama in Huntsville, Huntsville, AL 35899, USA. The first two authors contributed equally to this work.

Abstract: For efficient coding of bi-level sources with some dominant symbols often found in classification label maps of hyperspectral images, we proposed a novel biased run-length (BRL) coding method, which codes the most probable symbols separately from other symbols. To determine the conditions in which the BRL coding method would be effective, we conducted an analysis of the method using statistical models. We first analyzed the effect of two-dimensional blocking of pixels which were assumed to have generalized Gaussian distributions. The analysis showed that the resulting symbol blocks tended to have lower entropies than the original source without symbol blocking. We then analyzed the BRL coding method applied on the sequence of block symbols characterized by a first-order Markov model. Information-theoretic analysis showed that the BRL coding method tended to generate codewords that have lower entropies than the conventional run-length coding method. Furthermore, numerical simulations on lossless compression of actual data showed improvement over the state of the art. Specifically, an end-to-end implementation integrating symbol blocking, BRL and Huffman coding achieved up to 4.3% higher compression than the JBIG2 standard method, and up to 3.2% higher compression than the conventional run-length coding method on classification label maps of the widely used “Indian Pines” dataset.

I. INTRODUCTION

Hyperspectral imaging techniques have been used in a wide array of earth observing applications such as material quantification and target detection. A hyperspectral image is often organized as a three-dimensional dataset with two spatial dimensions and one spectral dimension. As an example, see Fig. 1(a), which is the 30th spectral band (out of a total of 220 bands) from NASA’s Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) hyperspectral image dataset [1]. The high spectral resolution makes it possible to address various applications requiring very high discrimination capabilities in the spectral domain. However, the large data volume of hyperspectral images presents a challenge for both data transmission and storage [2].

Our recent work in [3] addressed lossless methods for efficiently compressing arbitrarily shaped “sub-images” belonging to a certain class of particular interest (see Fig. 1 for an example of the bi-level classification label maps for 16 classes resulting from pattern classification via the support vector machine [4]). Any pixel within a hyperspectral image either belongs to a certain class or belongs to one of the other classes. Therefore, the class label map is a bi-level image, with the same size as that of the original image in any spectral band. The map provides critical spatial information required to reconstruct the pixel values belonging to a certain class. Therefore, efficient lossless compression of these bi-level maps would be useful or even critical in some remote sensing applications with severely limited bandwidths [5]. This is a separate problem not addressed in our prior work [3], [6].

Fig. 1: Sample dataset (“Indian Pines”) and the classification label maps. (a) A sample band (spectral band 30). (b) 16 classes identified using the support vector machine (SVM) method [6], [14]. (c) Class “0” belongs to the pixels that are not actually classified (due to lack of ground truth for assessment). (d)-(s) Individual classification label maps for Classes 1 to 16 (pixels belonging to the same class are shown in white, with the pixels of other classes shown in black).

Conventional bi-level image compression techniques include run-length coding [7], arithmetic coding [8], and geometric-based coding [9], [10]. In addition, international standards for binary image compression have been developed [11], including JBIG2 [12] and JPEG 2000 [13]. In order to exploit the pixel correlations in both horizontal and vertical directions, our previous work [14] adopted a symbol packing approach, where a binary image was first partitioned into blocks, with each block being scanned in a raster-scan order. The resulting sequence of bi-level symbols was then converted to a binary representation of that block. We observed that in many bi-level images, either the all-1 or the all-0 block symbol tends to be the most probable one among all possible symbols. To take advantage of the redundancy associated with the most probable symbols, we introduced a biased run-length encoding method, which run-length codes only the most probable block symbol.

In the following, we first give a brief survey in Section II on the run-length coding methods. We then point out the novelty of the proposed biased run-length method. Section III presents the analysis of the proposed biased run-length coding method, based on the mathematical model given in Section IV. Simulation results are given in Section V. The paper is concluded in Section VI.

II. RUN-LENGTH CODING METHODS

A run is a sequence of pixels having an identical value, and the number of such pixels is the length of the run. Run-length encoding (RLE) is a very simple form of lossless data compression in which runs of data are stored as a single data value and count, rather than as the original run. This is most useful on data that contains many such runs [7], [15]. In some recent work, [16] introduces a so-called extended frequency-directed run length coding for test data compression, and [17] proposes a variable prefix dual run length coding for VLSI test data compression, among other similar run length coding methods proposed for this purpose [18]–[20]. Furthermore, [21] combined run-length coding with Huffman coding for lossless compression of fluoroscopy medical images. [22] presented a lossless audio coding method using the Burrows-Wheeler Transform in combination with run-length coding. Additional work on the application of run-length coding includes papers on image and data compression [23]–[26], feature and region data extraction [27], [28], image quality index [29] and image hiding [30]. To improve the conventional RLE, [31] introduces an adaptive scheme that encodes run and level separately using adaptive binary arithmetic coding and context modeling. [32] proposed a method to parse the binary data sequences to make run-length coding more efficient, where the run-length code belongs to the family of variable-length constrained sequence codes [33].

In contrast to the existing methods, we proposed to code only the most probable block symbol to avoid an excessive number of short runs (a major source of coding inefficiency). Thus the main novelty of the proposed biased run-length (BRL) coding method lies in its separate special treatment of the most probable symbol from the rest of the block symbols. The run-lengths of the most probable block symbol are entropy coded using a variable-length code such as the Huffman code. On the other hand, we use a separate Huffman code to compress the modified sequence of block symbol values. Fig. 2 shows the block diagram for the biased run-length coding method. The goal of this paper is to provide an in-depth analysis of the compression performance of the proposed biased run-length coding method based on statistical models, instead of relying on empirical data.

III. BIASED RUN-LENGTH CODING METHOD

In order to exploit the pixel correlations in both horizontal and vertical directions, we proposed a symbol packing approach that packs more pixels into a block symbol, because the background appears to contain the majority of the pixels. In addition, objects usually have some “thickness”, meaning that there might be more grouped “0” pixels in the image than isolated “0” pixels. To show that this is indeed the case, we calculated the probability distribution of block symbols for all the bi-level maps considered. Fig. 3 shows the nine most probable block symbols in descending order from left to right. Note that all 512 possible block symbols will be used in constructing the Huffman code table. Block symbol “511” is the most probable block symbol, with a probability of 0.9192. Block symbol “0” is the next most probable block symbol, with a probability of 0.0408. We can see that the probabilities of the block symbols with the same pixel values grouped together (groups of white or black pixels) are higher than those of block symbols with random patterns (isolated black or white dots).

Note that using larger blocks would allow us to better exploit the spatial correlations in the source image; however, a large alphabet of block symbols would make the actual implementation of entropy coding (e.g., Huffman coding) overly complicated. Given a block size of N × N, the number of distinct block symbols is 2^(N^2). In this work, we found that the block size of 3 × 3 pixels (corresponding to 2^9 = 512 possible block symbols) offered a good tradeoff. A further increase of the block size to, for instance, 4 × 4 would generate 2^16 = 65,536 possible block symbols, making the Huffman code table too large to be manageable in practical implementations.
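
As an illustration of the 3 × 3 symbol packing and of the separate treatment of the most probable symbol described above, a minimal Python sketch follows. It is not the authors' implementation. The function names (pack_blocks, brl_encode, brl_decode), the most-significant-bit-first ordering of pixels within a block, the requirement that the image dimensions be multiples of the block size, and the exact layout of the two output streams are all assumptions made for this example. The Huffman coding stages are omitted.

```python
from collections import Counter


def pack_blocks(image, n=3):
    """Pack each n-by-n block of a 0/1 image into one integer symbol.

    Blocks are visited in raster order, and pixels inside a block are read
    in raster order with the first pixel as the most significant bit
    (an assumed bit order, chosen so that an all-1 3x3 block maps to 511).
    """
    rows, cols = len(image), len(image[0])
    if rows % n or cols % n:
        # Padding policy is not specified in the text; this sketch avoids it.
        raise ValueError("this sketch assumes dimensions divisible by n")
    symbols = []
    for r in range(0, rows, n):
        for c in range(0, cols, n):
            value = 0
            for dr in range(n):
                for dc in range(n):
                    value = (value << 1) | image[r + dr][c + dc]
            symbols.append(value)
    return symbols


def brl_encode(symbols):
    """Run-length code only the most probable symbol (MPS).

    Returns (mps, runs, others). runs[i] counts the MPS occurrences that
    immediately precede others[i] (zero is allowed), and the last entry of
    runs is the trailing MPS run. This stream layout is an assumption.
    """
    mps = Counter(symbols).most_common(1)[0][0]
    runs, others, run = [], [], 0
    for s in symbols:
        if s == mps:
            run += 1
        else:
            runs.append(run)
            others.append(s)
            run = 0
    runs.append(run)
    return mps, runs, others


def brl_decode(mps, runs, others):
    """Invert brl_encode, rebuilding the original symbol sequence."""
    out = []
    for run, s in zip(runs, others):
        out.extend([mps] * run)
        out.append(s)
    out.extend([mps] * runs[-1])
    return out
```

In this arrangement, the runs stream and the others stream would each be passed to its own entropy coder, mirroring the two separate Huffman codes mentioned in the text. When one block symbol dominates, most of the sequence is absorbed into a small number of run lengths.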
