electronics
Article Medical Video Coding Based on 2nd-Generation Wavelets: Performance Evaluation
Merzak Ferroukhi 1, Abdeldjalil Ouahabi 2,3,* , Mokhtar Attari 1, Yassine Habchi 4 and Abdelmalik Taleb-Ahmed 5
1 Laboratory of Instrumentation, Faculty of Electronics and Computers, University of Sciences and Technology Houari Boumediene, 16111 Algiers, Algeria; [email protected] (M.F.); [email protected] (M.A.) 2 Polytech Tours, Imaging and Brain, INSERM U930, University of Tours, 37200 Tours, France 3 Computer Science Department, LIMPAF, University of Bouira, 10000 Bouira, Algeria 4 Electrical Engineering Department, LTIT, Bechar University, 08000 Bechar, Algeria; [email protected] 5 IEMN DOAE UMR CNRS 8520, UPHF, 59313 Valenciennes, France; [email protected] * Correspondence: [email protected]; Tel.: +33-603894463
Received: 5 October 2018; Accepted: 8 January 2019; Published: 14 January 2019
Abstract: The operations of digitization, transmission and storage of medical data, particularly images, require increasingly effective encoding methods, not only in terms of compression ratio and information flow but also in terms of visual quality. At first, there was the DCT (discrete cosine transform), then the DWT (discrete wavelet transform), and their associated standards in terms of coding and image compression. Second-generation wavelets now seek to be positioned against, and compared with, the image and video coding methods currently in use. It is in this context that we suggest a method combining bandelets and the SPIHT (set partitioning in hierarchical trees) algorithm. There are two main reasons for our approach: the first lies in the ability of the bandelet transform to capture the geometrical complexity of the image structure. The second reason is the suitability of encoding the bandelet coefficients with the SPIHT encoder. Quality measurements indicate that in some cases (at low bit rates) the performance of the proposed coding competes with that of well-established codecs (H.264/MPEG-4 AVC and H.265/HEVC) and opens up new application prospects in the field of medical imaging.
Keywords: bandelets; medical imaging; quadtree decomposition; SPIHT coder; video coding; MPEG4 AVC; HEVC; video quality measure
1. Introduction and Motivation
The extensive volume of patient medical data recorded at every moment in hospitals, medical imaging centers and other medical organizations has become a major issue. The need for quasi-infinite storage space and efficient real-time transmission in specific applications such as medical imaging, military imaging and satellite imaging requires advanced techniques and technologies, including coding methods to reduce the amount of data to be stored or transmitted. Without coding of information, it is very difficult and sometimes impossible to find a way to store or communicate big data (large volumes of high-velocity and complex images, audio and video information) via the internet. The encoding of these data is obtained by eliminating the redundant or unnecessary information in the original frame that cannot be identified with the naked eye. Telemedicine offers valuable specialty diagnostic services to underserved patients in rural areas, but often requires significant image compression to transmit medical images using limited bandwidth
Electronics 2019, 8, 88; doi:10.3390/electronics8010088 www.mdpi.com/journal/electronics
networks. The difficulty is that compressing images such as those for tele-echography can introduce artifacts and reduce medical image quality, which directly affects the diagnosis [1]. This possible disadvantage is inherent in all lossy compression methods.
Figure 1 recalls the key steps of video coding. Transformation is a linear and reversible operation that allows the image to be represented in the transform domain, where the useful information is well localized. This property makes it possible, during compression, to achieve good discrimination, that is to say the suppression of unnecessary or redundant information. The discrete cosine transform and the wavelet transform are among the most popular transforms proposed for image and video coding.
Figure 1. Simplified video coding scheme.
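The transform and quantization stages of Figure 1 can be sketched in a few lines. The following Python/NumPy snippet is purely illustrative (it is not the coder evaluated in this paper, and the function names are ours): it applies an orthonormal 2D DCT-II to an 8 × 8 block and quantizes the coefficients uniformly, showing how few coefficients survive for smooth content.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix (rows are basis vectors)."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)  # the DC row has a different normalization
    return m

def transform_quantize(block: np.ndarray, q_step: float) -> np.ndarray:
    """2D DCT of a block followed by uniform scalar quantization."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T       # separable 2D DCT
    return np.round(coeffs / q_step)  # quantization sends most coefficients to 0

# A smooth 8x8 test block: a horizontal luminance ramp.
block = np.tile(np.linspace(0, 255, 8), (8, 1))
q = transform_quantize(block, q_step=16.0)

# Almost all energy is packed into a few low-frequency coefficients,
# so very few quantized values remain nonzero.
print("nonzero quantized coefficients:", np.count_nonzero(q), "of 64")
```

On this smooth block only a handful of low-frequency coefficients remain nonzero; this concentration is precisely what makes the subsequent entropy coding effective.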
Quantization is the step of the compression process during which a large part of the elimination of unnecessary or redundant information occurs.
Coding techniques, e.g., Huffman coding and arithmetic coding [2], are the best-known entropy coding methods, particularly in image and video coding.
It is well known that lossless compression methods have the advantage of providing an image without any loss of quality and the disadvantage of compressing weakly. Hence, this type of compression is not suitable for transmission or storage of big data such as video. Conversely, lossy compression produces significantly smaller file sizes, often at the expense of image quality. However, lossy coding based on transforms has gained great importance as several applications have appeared. In particular, lossy coding represents an acceptable option for coding medical images, e.g., DICOM (Digital Imaging and Communications in Medicine) JPEG (Joint Photographic Experts Group) based on the discrete cosine transform (DCT) and DICOM JPEG 2000 based on wavelets. In this context, wavelet-based image coding, e.g., JPEG2000, using multiresolution analysis [3,4], exhibits performance that is highly superior to other methods such as those based on DCT,
e.g., JPEG.
In 1993, Shapiro introduced a lossy image compression algorithm that he called embedded zerotrees of wavelet (EZW) [5]. The success of wavelet coding approaches is largely due to the occurrence of effective sub-band coders. The EZW encoder is the first to provide remarkable distortion-rate performance while allowing progressive decoding of the image. The principle of zerotrees, or other partitioning structures in sets of zeros, makes it possible to take account of the residual dependence of the coefficients between them. More precisely, since the high-energy coefficients are spatially grouped, their position is effectively computed by indicating the position of the sets of low-energy coefficients. After separation of the low-energy coefficients and the energetic coefficients, the latter are relatively independent and can be efficiently coded using quantification and entropy coding techniques.
In 1996,
Said and Pearlman proposed a hierarchical tree partitioning coding scheme called set partitioning in hierarchical trees (SPIHT) [6], based on the same basics as EZW and exhibiting better performance than the original EZW.
Video compression algorithms such as H.264 and MPEG use inter-frame prediction to reduce video data between a series of images. This involves techniques such as differential coding, where an image is compared with a reference image and only the pixels that have changed with respect to that reference image are coded. MPEG (Moving Picture Experts Group) allows encoding of video using three coding modes:
Intra coded frames (I): frames are coded separately without reference to previous images (i.e., as JPEG coding).
Predictive coded frames (P): images are described by difference from previous images.
Bidirectionally predictive coded frames (B): the images are described by difference with the previous image and the following image.
In order to optimize MPEG coding, the image sequences are in practice coded in a series of I, B, and P images, the order of which has been determined experimentally. The typical sequence, called a GOP (group of pictures), is as follows (see Figure 2):
Figure 2. Typical sequence with I, B and P frames. A P frame can only refer to the previous I or P frames, while a B frame can refer to previous or subsequent I or P frames.
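The P-frame idea can be made concrete with a minimal differential-coding sketch (illustrative Python, not an MPEG implementation; the function names are ours). Only pixels that differ from the reference frame are stored as residuals:

```python
import numpy as np

def encode_p_frame(frame: np.ndarray, reference: np.ndarray, threshold: int = 0):
    """Differential coding sketch: store only the pixels that changed
    with respect to the reference frame."""
    diff = frame.astype(np.int16) - reference.astype(np.int16)
    changed = np.abs(diff) > threshold
    positions = np.argwhere(changed)   # coordinates of changed pixels
    values = diff[changed]             # residuals to transmit
    return positions, values

def decode_p_frame(reference: np.ndarray, positions, values) -> np.ndarray:
    """Reconstruct the frame from the reference plus the residuals."""
    frame = reference.astype(np.int16).copy()
    frame[tuple(positions.T)] += values
    return frame.astype(np.uint8)

reference = np.full((8, 8), 100, dtype=np.uint8)
frame = reference.copy()
frame[2:4, 2:4] = 180                  # a small "moving object"

pos, vals = encode_p_frame(frame, reference)
rec = decode_p_frame(reference, pos, vals)

print("residuals stored:", len(vals), "of", frame.size)  # 4 of 64
assert np.array_equal(rec, frame)
```

Only the four changed pixels are transmitted; everything else is predicted from the reference, which is the essence of P-frame coding.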
To evaluate the visual quality of the reconstructed video, several measurements of the quality of the MPEG video are validated in [7,8].
As a successor to the famous H.264 or MPEG-4 Part 10 standards [9,10], HEVC (High-Efficiency Video Coding), or the H.265 standard, has recently been defined as a promising standard for video coding. In June 2013 [11], the first version of HEVC was announced, and to improve the coding efficiency of this standard, a set of its components requires more development. In addition, this new system requires new hardware and software investments, delaying its adoption in applications such as the medical field.
In our case, the use of wavelets enhances the contours, which minimizes the impact of information loss in medical imaging, mainly in tumor detection [12–15]. Moreover, bandelets are 2nd-generation wavelets and the corresponding transform is an orthogonal multi-resolution transform that is able to capture the geometric content of images and surfaces. Hence, the possible
presence Hence, of artifacts the possible will be presenceconsiderably of artifacts reduced. will be considerably reduced. Our motivation lies in the prospect of offering a new codec that is simple to implement with performances superior to the current H.264 standard and which could compete with the H.265 or HEVC standard for some applications. This This superiority superiority is is based based on on the SPIHT algorithm and the second-generation wavelets. One can ask: “Why using second generation wavelets (instead of classical wavelets)?” This new new generation generation of of wavelets wavelets makes makes it itpossib possiblele to to generate generate decorrelated decorrelated coefficients, coefficients, and and to eliminateto eliminate any any redundancy redundancy by by retaining retaining only only the the ne necessarycessary information information and and it it is is better better suited suited to geometric data (ridges,(ridges, contours,contours, curves,curves, oror singularities).singularities). ThisThis 2nd-generation2nd-generation wavelet wavelet consists of Xlets such asas shearlets,shearlets, bandelets,bandelets, curvelets, curvelets, contourlets, contourlets, ridgelets, ridgelets, noiselets, noiselets, etc. etc. These These Xlets Xlets provide provide an aninteresting interesting performance performance in some in applications, some applications, e.g., contourlets-based e.g., contourlets-based denoising [16], contourlets-baseddenoising [16], contourlets-based video coding [17,18], curvelets-based contrast enhancement [19], bandelets-based geometric image representations [20], shearlets-based image denoising [21], …
In this paper, we analyse and compare to the current state-of-the-art coding methods the performance of a new coding method based on the bandelet transform, where its coefficients are encoded using set partitioning in hierarchical trees (SPIHT).
The remainder of this paper is organized as follows: Section 2 presents and recalls the main properties of the bandelet transform. Section 3 introduces bandelet-SPIHT-based video coding. Section 4 focuses on the experimental aspect and the analysis of the results. It is worth noting that this work is an extension of the conference paper (IECON 2016—42nd Annual Conference of the IEEE Industrial Electronics Society) entitled "Towards a New Standard in Medical Video Compression" [22]. Compared to this conference paper, there are substantial differences in this revised version, including the large databases used for benchmarking and the introduction of HEVC, which is the most recent standardized video compression technology. Finally, Section 5 concludes this study.
2. Bandelet Transform Algorithm
This section describes the different steps of the bandelet approximation algorithm. This algorithm was originally proposed by Peyré and Mallat [23,24] in the context of geometric image representation.
2.1. Bandelet Transform
Bandelets are 2nd-generation wavelets and the corresponding transform is an orthogonal multi-resolution transform that is able to capture the geometric content of images and surfaces. The bandelet transform is carried out as follows:
First, using an orthogonal or bi-orthogonal wavelet, we calculate the wavelet transform of the original medical image.
Second, to construct a bandelet basis on the whole wavelet domain, we use a quadtree segmentation at each wavelet scale in dyadic squares, as shown in Figure 3, where the main steps for computing the bandelet transform are summarized and illustrated on a brain magnetic resonance image (MRI). Note that we define a different quadtree for each scale of the wavelet transform (in Figure 3 only the quadtree of the finest scale is depicted). More specifically, we use the same quadtree for each of the three orientations of the wavelet transform at a fixed scale.
[Figure 3 panels: (a) MRI (brain) video; (b) video frame; (c) wavelet decomposition with quadtrees A and B; (d) extraction of a sub-square; (e) warped wavelet coefficients; (f) resulting 1D wavelet transform; (g) bandelet coefficients space.]
Figure 3. Example of graphical steps of the bandelet transformation algorithm illustrated on a brain magnetic resonance image (MRI) (T2-weighted image acquired at 3 T).
A complete representation in a bandelet basis is thus composed of:
• An image segmentation at each scale, defined by a quadtree,
• For each dyadic square of the segmentation:
– a polynomial flow,
– bandelet coefficients.
The MATLAB implementation of the bandelet transform is freely available in [25].
2.2. Geometric Flows
The bandelet bases are based on an association between the wavelet decomposition and an estimation of the image information of geometric character. Instead of describing the image geometry through edges, which are most often ill defined, the image geometry is characterized with a geometric flow of vectors. These vectors give the local directions in which the image has regular variations. Orthogonal bandelet bases are constructed by dividing the image support into regions inside which the geometric flow is parallel. The estimation of the geometry is done by studying the contours present in an image. A contour is then seen as a parametric curve that will be characterized by its tangents. To do this, we look for gradients of significant importance in the frame.
Around each region of frames, the local geometry directions, in which the frame has regular variations in the neighborhood of each pixel, are determined by a two-dimensional geometric flow of a vector field.
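As a rough illustration of this idea, the dominant direction of regular variation in a patch can be taken orthogonal to the average gradient. The sketch below is a simplified stand-in for the geometric flow estimation of [23,24], not the exact procedure, and the function name is ours:

```python
import numpy as np

def local_flow_direction(patch: np.ndarray) -> np.ndarray:
    """Illustrative sketch: estimate the local direction of regular
    variation as the unit vector orthogonal to the mean gradient.
    (A simplified stand-in for geometric flow estimation.)"""
    gy, gx = np.gradient(patch.astype(float))  # gradients along rows, columns
    mx, my = gx.mean(), gy.mean()              # average gradient over the patch
    flow = np.array([-my, mx])                 # image varies least orthogonally
    norm = np.hypot(*flow)
    return flow / norm if norm > 0 else flow

# A patch whose intensity varies along x only (vertical structure):
patch = np.tile(np.arange(8, dtype=float), (8, 1))
d = local_flow_direction(patch)
print("flow direction (x, y):", d)  # vertical: (0, 1)
```

Real bandelet geometry estimation is more careful (it works per dyadic square in the wavelet domain and selects among candidate directions by Lagrangian cost), but the principle of following the direction of least variation is the same.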
2.3. Quadtree Division Process
In this algorithm, the bandelet transform is performed over each dyadic square and scale in the wavelet domain. This is of course a redundant transform. The best segmentation, represented as a quadtree, is obtained through Lagrangian optimization. An example of quadtree decomposition is shown in Figure 4.
Figure 4. Example of quadtree decomposition.
In order to minimize the distortion rate, we use a parallel, vertical or horizontal flow in each dyadic square. The macro block is considered uniformly regular and a wavelet basis is used if there is no geometric flow. Otherwise, the sub-block must be processed by warped wavelets, as explained in [23,24].
The best quadtree requires a minimization, for each direction, of the following Lagrangian:
L(fd, R) = ||fd − fdR||² + λQ² ∑j (RjS + RjG + RjB) (1)
where
fd: Original signal.
fdR: Recovered signal using the inverse 1D wavelet transform.
RjS: Number of bits needed to encode the dyadic segmentation.
RjG: Number of bits needed to encode the geometric parameter.
RjB: Number of bits needed to encode the quantized bandelet coefficients.
λ: Lagrangian multiplier, heuristically chosen equal to 3/4; to justify this value, see [20].
Q: Quantization step.
The following steps implement the quadtree structure. To compute the quadtree, we first compute the Lagrangian for each dyadic square at the smallest size. Then, the algorithm goes from the smallest size to the biggest, and tries to merge each group of four squares. To do so, it simply computes the Lagrangian of the merged square and compares it to the sum of the four Lagrangians (plus 1 bit, which is the cost of coding a split in the tree).
Using MATLAB code, one can compute a single quadtree for each scale and for each orientation. Sharing the same quadtree for the three orientations leads to a small efficiency increase, at the price of a more complex implementation. The quadtree structure and geometry are stored in an image that is the same size as the original image (and they have the same hierarchical structure as the wavelet transformed image).
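The bottom-up merge decision can be sketched as follows. This is an illustrative Python toy in which each square is approximated by a single coefficient (its mean) and the rate terms of Equation (1) are replaced by a flat per-coefficient cost; it is not the paper's bandelet Lagrangian, but it reproduces the split-or-merge comparison, including the 1-bit split cost:

```python
import numpy as np

LAMBDA, Q = 0.75, 8.0  # Lagrangian multiplier (3/4) and quantization step

def square_cost(sq: np.ndarray) -> float:
    """Toy Lagrangian for one dyadic square: distortion of approximating
    the square by its mean, plus lambda*Q^2 per coded coefficient.
    (A stand-in for the full rate terms of Eq. (1).)"""
    distortion = float(np.sum((sq - sq.mean()) ** 2))
    return distortion + LAMBDA * Q * Q * 1

def split_or_merge(image: np.ndarray, x: int, y: int, size: int) -> str:
    """Bottom-up quadtree decision: compare the merged square's Lagrangian
    with the sum of its four children's, plus 1 bit coding the split."""
    h = size // 2
    children = sum(square_cost(image[y + dy:y + dy + h, x + dx:x + dx + h])
                   for dy in (0, h) for dx in (0, h))
    merged = square_cost(image[y:y + size, x:x + size])
    split_bit = LAMBDA * Q * Q
    return "split" if children + split_bit < merged else "merge"

# Two flat regions: splitting isolates them, so the quadtree splits here.
image = np.zeros((8, 8))
image[:, 4:] = 100.0
print(split_or_merge(image, 0, 0, 8))             # -> split

# A uniform square: one merged coefficient is cheaper than four.
print(split_or_merge(np.zeros((8, 8)), 0, 0, 8))  # -> merge
```

The real algorithm runs this comparison recursively from the smallest squares upward and, for each square, also optimizes over candidate flow directions.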
2.4. Warped Wavelet along Geometric Flows
In order to handle complex geometry and remove the redundancy of orthogonal wavelet coefficients, the bandelet basis decomposition is applied with a fast application of the geometric flow, quadtree decomposition, warping and bandeletization.
In each dyadic square and at each scale, the geometric flow is used to warp the wavelet basis along the regularity direction. After warping, the next step is to construct bandelets by applying a bandeletization procedure.
The wavelet includes high-pass filters and vanishing moments at lower resolutions; this is valid for vertical and diagonal detail coefficients, but not for horizontal detail coefficients. The problem of regularity along the geometric flow is due to the scaling function, which includes low-pass filters and does not have a vanishing moment at lower resolutions.
To take advantage of the regularity along the geometric flow for horizontal detail coefficients, the warped wavelet basis is bandeletized by replacing the horizontal wavelet by a modified horizontal wavelet function.
Near the singularities, wavelet coefficient correlations are removed using the bandeletization operation.
After the warping and bandeletization operations, the regions are regular along the vertical or horizontal direction. In the end, warped wavelets are used to compute bandelet coefficients with a 1D discrete wavelet transform, which are then encoded using the SPIHT coder.
Figure 5 illustrates this feature and compares histograms: wavelet coefficients vs. bandelet coefficients. Figure 5 clearly shows that the bandelet coefficients have larger amplitudes than the wavelet coefficients. This gives the bandelets the ability to carry almost only useful information, since what is considered undesirable information is represented by small-amplitude coefficients rarely present in the histogram of bandelet coefficients. Bandelets are, therefore, a natural candidate for effective image compression.
Hence, the SPIHT coder, initially
applied to wa Bandeletsvelet coefficients, are, therefore, will become a natural bandelet-SPIHT candidate for effectiveand will image be applied compression. to bandelet coefficients: this is what justifies our contribution.
Figure 5. Histograms of bandelet vs. wavelet coefficients. The bandelet transform contains only the most significant coefficients and, therefore, those which carry information.
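The practical consequence of such a histogram is energy compaction: a transform whose significant coefficients are few and large allows most coefficients to be discarded. The sketch below illustrates the measurement using a one-level 2D Haar transform as a generic stand-in for the transforms discussed here (illustrative Python, not the bandelet transform):

```python
import numpy as np

def haar_1level(img: np.ndarray) -> np.ndarray:
    """One-level 2D orthonormal Haar transform, used here only as a
    generic example of a sparsifying transform."""
    def rows(a):
        lo = (a[:, ::2] + a[:, 1::2]) / np.sqrt(2)  # averages
        hi = (a[:, ::2] - a[:, 1::2]) / np.sqrt(2)  # details
        return np.hstack([lo, hi])
    return rows(rows(img.astype(float)).T).T

# A piecewise-constant test image (a sharp edge, like an anatomical contour).
img = np.zeros((32, 32))
img[:, 16:] = 200.0
c = haar_1level(img)

# Fraction of total energy carried by the 12.5% largest-magnitude coefficients.
mags = np.sort(np.abs(c).ravel())[::-1]
k = max(1, int(0.125 * mags.size))
fraction = np.sum(mags[:k] ** 2) / np.sum(mags ** 2)
print(f"top 12.5% of coefficients carry {100 * fraction:.1f}% of the energy")  # 100.0%
```

On this edge-dominated image, every bit of signal energy sits in a small fraction of the coefficients; the remaining ones can be zeroed with no loss, which is the behavior Figure 5 suggests bandelets improve further on medical content.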
3. Process of Encoding
3.1. Set Partitioning in Hierarchical Trees (SPIHT) Algorithm Principles
The SPIHT algorithm is one of the most efficient encoders [6]. It is used in many applications such as progressive web browsing, flexible image storage and retrieval, and image transmission over heterogeneous networks [26,27]. The SPIHT algorithm takes up the principles of EZW (embedded zerotree wavelet coding) [5] while proposing to recursively partition the coefficient trees. Set partitioning in hierarchical trees (SPIHT) is an image-compression algorithm that exploits the inherent similarities across the subbands in a wavelet decomposition of an image. This algorithm codes the most important wavelet transform coefficients first, and transmits the bits so that an increasingly refined copy of the original image can be obtained progressively. This method represents an important advance in the field and deserves special attention because it provides the following:
- Highest image quality;
- Progressive image transmission;
- A fully embedded coded file;
- A simple quantization algorithm;
- Fast coding/decoding;
- Complete adaptivity;
- Error protection.
For more details on the SPIHT algorithm, refer to the founding paper by Said et al. [6]. SPIHT MATLAB code is available for both grayscale and true-color images; it requires the Image Processing Toolbox and the Wavelet Toolbox. Thanks to these features, we have chosen this algorithm as the basis for our contribution, called the bandelet-SPIHT algorithm.
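The embedded, progressive principle described above (the most significant coefficient bits are transmitted first, so the stream can be truncated at any bit budget and still decoded) can be illustrated with a toy bit-plane coder. This is a simplified sketch only: it omits SPIHT's zerotree set partitioning across subbands, and all function names are ours, not from the paper's MATLAB code.

```python
import numpy as np

def bitplane_encode(coeffs, n_planes=4):
    """Toy embedded coder: emit magnitude bits from the most significant
    bit-plane downward. Truncating the stream early still yields a coarse
    reconstruction (the embedded property SPIHT relies on)."""
    mags = np.abs(coeffs).astype(np.int64)
    signs = np.sign(coeffs).astype(np.int64)
    top = int(mags.max()).bit_length() - 1           # index of the MSB plane
    planes = [((mags >> p) & 1) for p in range(top, top - n_planes, -1)]
    return planes, signs, top

def bitplane_decode(planes, signs, top):
    """Rebuild coefficient magnitudes from however many planes were kept."""
    mags = np.zeros_like(signs)
    for k, bits in enumerate(planes):
        mags |= bits << (top - k)
    return signs * mags

coeffs = np.array([[100, -3], [7, -45]])
planes, signs, top = bitplane_encode(coeffs, n_planes=4)
recovered = bitplane_decode(planes, signs, top)
# Each decoded value is within 2**(top - n_planes + 1) of the original.
```

Keeping only the first plane or two reproduces just the largest coefficients, which is why a sparse coefficient histogram like the bandelet one in Figure 5 suits this kind of coder.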
3.2. New Coding Algorithm: Bandelet-SPIHT

The proposed coding method is formulated as follows:
• Step 1: Input: medical sequences of size 512 × 512;
• Step 2: Perform a 2D discrete wavelet transform to decompose each frame into subbands;
• Step 3: Decompose the frame recursively into dyadic squares S;
• Step 4: Estimate the geometric flow in each square;
• Step 5: Warp the discrete wavelets along the geometric flow;
• Step 6: Perform a 1D discrete wavelet transform to bandelize the warped wavelet bases;
• Step 7: Encode the resulting bandelet coefficients with the SPIHT encoder;
• Step 8: Evaluate the quality of the recovered video sequences according to both objective and subjective criteria.
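As a rough illustration of Steps 2 and 3, the following sketch uses a one-level Haar transform as a stand-in for the 2D DWT and shows the dyadic-square partitioning of a subband. The geometry-driven Steps 4-6 (flow estimation, warping, bandeletization) are only stubbed in comments, since they are beyond a short sketch; every name here is illustrative and not the authors' implementation.

```python
import numpy as np

def haar2d(frame):
    """Step 2 (sketch): one level of a 2D Haar wavelet transform, standing
    in for the separable DWT; returns the four subbands LL, LH, HL, HH."""
    a = frame[0::2, 0::2].astype(float)
    b = frame[0::2, 1::2].astype(float)
    c = frame[1::2, 0::2].astype(float)
    d = frame[1::2, 1::2].astype(float)
    ll = (a + b + c + d) / 4.0   # approximation subband
    lh = (a - b + c - d) / 4.0   # horizontal detail
    hl = (a + b - c - d) / 4.0   # vertical detail
    hh = (a - b - c + d) / 4.0   # diagonal detail
    return ll, lh, hl, hh

def dyadic_squares(subband, square=4):
    """Step 3 (sketch): split the subband into dyadic squares S. A real
    bandelet transform would then estimate a geometric flow per square
    (Step 4), warp the coefficients along it (Step 5), and apply a 1D DWT
    (Step 6) before SPIHT encoding (Step 7)."""
    h, w = subband.shape
    return [subband[i:i + square, j:j + square]
            for i in range(0, h, square)
            for j in range(0, w, square)]

frame = np.arange(64, dtype=float).reshape(8, 8)   # stand-in 8x8 frame
ll, lh, hl, hh = haar2d(frame)
squares = dyadic_squares(ll, square=2)
```

In the actual codec the input frames are 512 × 512 and the decomposition is recursive; the 8 × 8 frame here only keeps the example small.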
4. Experimental Results and Discussion

We propose in this work a new algorithm for medical video coding based on the bandelet transform coupled with the SPIHT coder, which is able to capture the complex geometric content of images and surfaces and to drastically reduce the redundancy existing in video.
The accuracy of the algorithm is evaluated over a bit-rate range of 0.1 to 0.5 Mbps on medical video sequences of various sizes. Each sequence has a variable number of frames, and the frame rate is equal to 30 frames/second. Figure 6 shows a sample of the medical frames used for evaluation:
- MR imaging is the modality of choice for accurate local staging of bladder cancer (see Figure 6d). In addition, bladder MR imaging helps detect lymph node involvement. With bladder cancer (T2-weighted axial image), MRI is used mainly to look for any signs that the cancer has spread beyond the bladder into nearby tissues or lymph nodes.
- Cardiac MRI is used to detect or monitor cardiac disease and to evaluate the heart's anatomy, for example, in the evaluation of congenital heart disease. The MR image of the heart, shown in Figure 6c, is derived from gradient echo imaging employed in the assessment of ventricular function, blood velocity and flow measurements, assessment of valvular disease, and myocardial perfusion. This MR image is acquired with the use of intravenous contrast agents (gadolinium chelates) to enhance the signal of pathology or to better visualize the blood pool or vessels. The paramagnetic effect of gadolinium causes a shortening of the T1 relaxation time, thus causing areas with gadolinium to appear bright on T1-weighted images. Contrast agent is used in cardiac MRI for the evaluation of myocardial perfusion, delayed enhanced imaging (myocardial infarctions, myocarditis, infiltrative processes), differentiation of intracardiac masses (neoplasm vs. thrombus), etc.
- Computed tomography (CT) of the abdomen and pelvis, shown in Figure 6b, is a diagnostic imaging test used to help detect diseases of the small bowel, colon and other internal organs and is often used to determine the cause of unexplained pain. In emergency cases, it can reveal internal injuries and bleeding quickly enough to help save lives.
- Figure 6a shows a coronary angiogram of the left coronary circulation. A coronarography exam consists of a filmed radiography, typically at 30 frames per second, of the coronary arteries using X-rays and an iodine-based tracer product. The procedure involves inserting a catheter (a thin flexible tube) into a blood vessel in the groin or arm area to guide it to the heart. Once the catheter is in place, contrast material is injected into the coronary arteries so that physicians can see blocked or narrowed sections of arteries. The test also verifies the condition of the valves and heart muscle.
Figure 6. Medical video used for assessment. (a) Coronary angiography (X-ray); (b) abdomen/pelvis computed tomography (CT); (c) heart (axial MRI); (d) bladder (MRI).
The study is based on real data from INSERM (French National Institute of Health and Medical Research), including large databases used for the comparative analysis. The objective is to show the feasibility of the proposed codec. The images shown are a simple illustration, and their technical and medical characteristics are summarized in Table 1.
Table 1. Details and characteristics of all tested videos.

Anatomical Region/Modality | Number of Frames | Number of Frames/Second | Width × Height | Video Format
Brain-MRI (T2-weighted image acquired at 3 T) | 144 | 30 | 240 × 240 | AVI
Bladder-MRI (T2-weighted axial image) | 662 | 30 | 480 × 320 | AVI
Heart-MRI (gradient echo imaging) | 87 | 30 | 340 × 336 | AVI
Abdomen/pelvis-CT | 295 | 30 | 340 × 352 | AVI
Coronary angiogram-X-ray (+ a tracer product based on iodine) | 594 | 30 | 360 × 360 | AVI
The results presented in this paper are based on a statistical study (mean, 95% confidence interval, mean squared error/peak signal-to-noise ratio (MSE/PSNR)). Each value represents an average over a large sample of sequences (see Table 1). The perceived quality of the recovered frames has been measured. Moreover, a comparison of bandelet-SPIHT against the wavelet transform with the SPIHT encoder, MPEG-4, and H.26x is carried out. To evaluate the quality of the compressed image, subjective and objective quality measures are used. However, subjective quality measures based upon a group of observers are too time consuming [28–31], so objective quality measures such as MSE, PSNR, etc., are preferred.
4.1. Objective Measurement of Video Quality

For the analysis of decoded video, we can distinguish data metrics, which measure the fidelity of the signal without considering its content, and picture metrics, which treat the video data as the visual information that it contains. For compressed video delivery over packet networks, there are also packet- or bitstream-based metrics, which look at the packet header information and the encoded bitstream directly without fully decoding the video. Furthermore, metrics can be classified into full-reference, no-reference and reduced-reference metrics based on the amount of reference information they require.
4.1.1. Data Metrics

The image and video processing community has long been using MSE and PSNR as fidelity metrics (mathematically, PSNR is just a logarithmic representation of MSE). There are a number of reasons for the popularity of these two metrics. The formulas for computing them are as simple to understand and implement as they are easy and fast to compute. Minimizing MSE is also very well understood from a mathematical point of view. Over the years, video researchers have developed a familiarity with PSNR that allows them to interpret the values immediately. There is probably no other metric as widely recognized as PSNR, which is also due to the lack of alternative standards. Despite its popularity, PSNR has only an approximate relationship with the video quality perceived by human observers, simply because it is based on a byte-by-byte comparison of the data without considering what they actually represent. PSNR is completely ignorant of things as basic as pixels and their spatial relationship, or of things as complex as the interpretation of images and image differences by the human visual system. PSNR is the ratio between the maximum possible power of a signal and the power of the noise. PSNR is usually expressed on a logarithmic decibel scale:
\mathrm{PSNR} = 10 \log_{10} \frac{(2^n - 1)^2}{\mathrm{MSE}} \quad (2)

where
• (2^n − 1) is the dynamic of the signal. In the standard case, (2^n − 1) = 255 is the maximum possible amplitude for an 8-bit image.
• MSE represents the mean squared error between two frames, namely the original frame f(i, j) and the recovered frame f_r(i, j), both of size M × N, and is given by:
\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \big( f(i,j) - f_r(i,j) \big)^2 \quad (3)
MSE can be identified as the power of noise.
The PSNR is an easy, fast and very popular quality measurement metric, widely used to compare the quality of video encoding and decoding. Although a high PSNR generally means a good quality reconstruction, this is not always the case. Indeed, PSNR requires the original image for comparison, which may not be available in every case; moreover, PSNR does not correlate well with subjective video quality measures, so it is not very suitable for assessing perceived visual quality.
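Equations (2) and (3) translate directly into a few lines of code. The sketch below is ours, not the paper's implementation; note that it writes the peak term as (2^n − 1)^2, the usual power-ratio convention.

```python
import numpy as np

def mse(f, fr):
    """Mean squared error between original f and recovered fr, Eq. (3)."""
    return np.mean((np.asarray(f, dtype=float) - np.asarray(fr, dtype=float)) ** 2)

def psnr(f, fr, n_bits=8):
    """PSNR in dB, Eq. (2); (2**n_bits - 1) is the signal dynamic (255 for 8-bit)."""
    e = mse(f, fr)
    if e == 0:
        return float('inf')      # identical frames: noise power is zero
    peak = 2 ** n_bits - 1
    return 10.0 * np.log10(peak ** 2 / e)

original = np.full((4, 4), 100.0)
recovered = original + 5.0       # uniform error of 5 grey levels
# mse(original, recovered) == 25.0; psnr(original, recovered) is about 34.15 dB
```

In practice the per-frame PSNR is averaged over the whole sequence, which is how the sequence-level values reported later are obtained.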
4.1.2. Picture Metrics

As an alternative to the data metrics described above, better visual quality measures have been designed that take into account the effects of distortions on perceived quality. An engineering approach consists primarily of the extraction and analysis of certain features or artifacts in the video. These can be either structural image elements such as contours, or specific distortions that are introduced by a particular video processing step, compression technology or transmission link, such as block artifacts. This does not necessarily mean that such metrics disregard human vision, as they often consider psychophysical effects as well, but image content and distortion analysis, rather than fundamental vision modeling, is the conceptual basis for their design.

1. Image Quality Assessment Using the Structural Similarity (SSIM) Index

Wang et al.'s structural similarity (SSIM) index [32] computes the mean, variance and covariance of small patches inside a frame and combines the measurements into a distortion map. Two image signals x and y are aligned with each other (e.g., spatial patches extracted from each image). If we consider one of the signals to have perfect quality, then the similarity measure can serve as a quantitative measurement of the quality of the second signal. The SSIM index combines three key features: luminance l, contrast c and structure s. This metric is defined in (4):

\mathrm{SSIM}(f, f_r) = l(f, f_r) \times c(f, f_r) \times s(f, f_r) \quad (4)
- The luminance comparison l(x, y) is defined as a function of the mean intensities M_x and M_y of signals x and y:

l(x, y) = \frac{2 M_x M_y + C_1}{M_x^2 + M_y^2 + C_1} \quad (5)

where

M_x = \frac{1}{N} \sum_{i=1}^{N} x_i \quad \text{and} \quad M_y = \frac{1}{N} \sum_{i=1}^{N} y_i

and the constant C_1 is included to avoid instability when M_x^2 + M_y^2 is close to zero (similar considerations hold for the presence of C_2 and C_3).
- The contrast comparison c(x, y) is a function of the standard deviations σ_x and σ_y, and takes the following form:

c(x, y) = \frac{2 \sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \quad (6)

where the standard deviation of x is

\sigma_x = \left( \frac{1}{N-1} \sum_{i=1}^{N} \left( x_i^2 - (M_x)^2 \right) \right)^{1/2}
- A simple and effective way to quantify the structural similarity s(x, y) is to estimate the correlation coefficient σ_xy between x and y. The structure comparison s(x, y) is defined as follows:

s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \quad (7)

where the covariance of x and y is

\sigma_{xy} = \frac{1}{N-1} \sum_{i=1}^{N} x_i y_i - M_x M_y
Specifically, it is possible to choose C_i = (K_i D)^2, i = 1, 2, where K_i is a constant such that K_i ≪ 1, and D is the dynamic range of the pixel values (D = 255 corresponds to a grey-scale digital image when the number of bits/pixel is 8). For C_3 = C_2/2, the explicit expression of the structural similarity (SSIM) index is:

\mathrm{SSIM}(x, y) = \frac{(2 M_x M_y + C_1)(2 \sigma_{xy} + C_2)}{(M_x^2 + M_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \quad (8)
SSIM is symmetrical and less than or equal to one (it is equal to one if and only if f = f_r, i.e., if x = y in expression (8)). Generally, over the whole video coding, a mean value of SSIM is required, the mean SSIM (MSSIM):
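For a single aligned window, Expression (8) can be sketched as below. The constants follow the choice C_i = (K_i D)^2 described above; the particular values K_1 = 0.01 and K_2 = 0.03 are commonly used defaults assumed here for illustration, not taken from this paper.

```python
import numpy as np

def ssim_index(x, y, K1=0.01, K2=0.03, D=255):
    """SSIM of Eq. (8) over one aligned window, with C_i = (K_i * D)**2,
    K_i << 1, and D the dynamic range of the pixel values."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    C1, C2 = (K1 * D) ** 2, (K2 * D) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(ddof=1), y.var(ddof=1)       # sigma_x**2, sigma_y**2
    cxy = np.cov(x, y)[0, 1]                    # sigma_xy
    return ((2 * mx * my + C1) * (2 * cxy + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))

x = np.arange(64, dtype=float).reshape(8, 8)    # toy 8x8 patch
# ssim_index(x, x) == 1.0 (identical patches); any distortion lowers it.
```

Averaging this per-window value over sliding local windows of the frame yields exactly the MSSIM of Equation (9).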
\mathrm{MSSIM}(f, f_r) = \frac{1}{L} \sum_{i=1}^{L} \mathrm{SSIM}(f_i, f_{ri}) \quad (9)

where f_i and f_ri are the contents of the frames (original and recovered, respectively) at the ith local window (or sub-image), and L is the total number of local windows in the frame. The MSSIM values exhibit greater consistency with the visual quality.

2. Image Quality Assessment Using the Visual Information Fidelity (VIF) Index

Sheikh and Bovik [33] proposed a new paradigm for video quality assessment: visual information fidelity (VIF). This criterion quantifies the Shannon information that is shared between the original and recovered images relative to the information contained in the original image itself. It uses natural scene statistics modelling in conjunction with an image-degradation model and a human visual system (HVS) model. Visual information fidelity uses the Gaussian scale mixture model in the wavelet domain. To obtain VIF, one performs a scale-space-orientation wavelet decomposition using the steerable pyramid and models each sub-band in the source as C = SU, where S is a random field of scalars and U is a Gaussian vector. The distortion model is GC + n, where G is a scalar gain field and n is an additive Gaussian noise. VIF then assumes that the distorted and source images pass through the human visual system, and the HVS uncertainty is modelled as a visual noise N. The model is then:

Reference signal: E = C + N \quad (10)
Test signal: F = D + N \quad (11)

where E and F denote the visual signals at the output of the HVS model from the reference and the test videos, respectively (see Figure 7), from which the brain extracts cognitive information.
Figure 7. Diagram of the visual information fidelity (VIF) metric.
The VIF measure takes values between 0 and 1, where 1 means perfect quality. Thus, the visual information fidelity (VIF) measure is given by: