
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 50, NO. 11, NOVEMBER 2002, p. 2843

An Improvement to Multiple Description Transform Coding

Yao Wang, Senior Member, IEEE, Amy R. Reibman, Senior Member, IEEE, Michael T. Orchard, Fellow, IEEE, and Hamid Jafarkhani, Senior Member, IEEE

Abstract—A multiple description transform coding (MDTC) method has been reported previously. The redundancy rate distortion (RRD) performance of this coding scheme for the independent and identically distributed (i.i.d.) two-dimensional (2-D) Gaussian source has been analyzed using mean squared error (MSE) as the distortion measure. At the small redundancy region, the MDTC scheme can achieve excellent RRD performance because a small increase in redundancy can reduce the single description distortion at a rate faster than exponential, but the performance of MDTC becomes increasingly poor at larger redundancies. This paper describes a generalization of the MDTC (GMDTC) scheme, which introduces redundancy both by transform and through correcting the error resulting from a single description. Its RRD performance is closer to the theoretical bound in the entire range of redundancy. Analysis both for a single pair of variables and for multiple variables is presented.

Index Terms—Error resilience, multiple description coding, source coding.

Manuscript received August 13, 2001; revised June 24, 2002. The associate editor coordinating the review of this paper and approving it for publication was Prof. Sheila S. Hemami.
Y. Wang is with Polytechnic University, Brooklyn, NY 11201 USA.
A. R. Reibman is with AT&T Labs—Research, Florham Park, NJ 07932-0971 USA.
M. T. Orchard is with the Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005 USA.
H. Jafarkhani is with the Center for Pervasive Communications, Department of Electrical and Computer Engineering, University of California, Irvine, CA 92697 USA.
Digital Object Identifier 10.1109/TSP.2002.804062.

I. INTRODUCTION

MULTIPLE description (MD) coding addresses the problem of encoding a source into two (or more) bitstreams such that a high-quality reconstruction is decodable from the two bitstreams together, while a lower, but still acceptable, quality reconstruction is decodable if either of the two bitstreams is lost. To accomplish this goal, each description alone must carry a sufficient amount of information about the original source. This necessarily means that there is a certain amount of shared information and, hence, correlation between the two descriptions. This correlation will increase the bit rate required to code the two descriptions beyond that required for a single bitstream optimized for coding efficiency. The extra bit rate is the redundancy introduced by the MD coder to reduce the single description distortion. MD coding was first studied with the goal of finding the achievable rate-distortion region for a given source using multiple descriptions. Since then, various practical coders have been proposed to achieve the MD objective. For a comprehensive review of the literature in both theoretical and algorithmic development, see the review paper by Goyal [1].

The performance of an MD coder can be evaluated by the redundancy rate distortion (RRD) function, which measures how fast the side distortion ($D_1$) decreases with increasing redundancy ($\rho$) when the central distortion ($D_0$) is fixed. As background material, we first present a bound on the RRD curve for an i.i.d. Gaussian source with the MSE as the distortion measure. This bound was derived by Goyal and Kovacevic [2] and was translated from the achievable region for multiple descriptions, which was previously derived by Ozarow [3]. It can be seen that $D_1$ decays at a super-exponential rate for small $\rho$ and then gradually slows down to an exponential rate, similar to the rate-distortion (RD) function for the Gaussian source.

As part of the background material, we also review two MD coders developed previously and compare their RRD performance with the bound. The first coder, called multiple description transform coding (MDTC), was first presented in [4]–[6]. Goyal et al. extended the MDTC idea to consider the generation of more than two descriptions, which they refer to as “generalized multiple description coding” [2], [7], [8]. The basic idea of MDTC is to introduce a controlled amount of correlation between two originally uncorrelated variables, generally with unequal variances, by using a pairwise correlating transform (PCT). The resulting variables are assigned to two separate streams, which are each then quantized and coded to form one description. The correlation between the two variables enables the estimation of one from the other, but this correlation also leads to a loss in coding efficiency (i.e., redundancy), compared with coding the original two uncorrelated variables. The redundancy can be controlled precisely by a single transform parameter. We show that the RRD function of the MDTC scheme matches the bound very closely at small redundancies, but at the higher redundancy regime, $D_1$ decays slower than exponentially for increasing $\rho$ and converges to a constant that is half of the smaller of the variances of the two variables.

The second coder, called multiple description layered coding (MDLC), is built on top of layered coding [9], [15], [16]. It duplicates the base layer bits from a layered coder in both descriptions and splits the enhancement layer bits between the two descriptions. Obviously, the redundancy in the MDLC system is equal to the bit rate used for the base layer. Because redundancy bits are exclusively used to protect the more important part of the signal, this scheme is intuitively appealing. We will see that the RRD function of this approach has an exponential decay rate in the entire range of the redundancy, such that it is worse than

1053-587X/02$17.00 © 2002 IEEE

MDTC at small redundancies but better at higher redundancies. At the higher redundancy regime, it is above the bound by a factor of two.

To circumvent the disadvantages associated with both MDTC and MDLC, we have developed a generalization of the MDTC (GMDTC) scheme,$^1$ which essentially combines the ideas in MDTC and MDLC such that it matches the performance of MDTC at small redundancies and matches that of MDLC at large redundancies. The poor performance of MDTC at high redundancies is because MDTC includes only one variable in each description so that the estimation error for the original two variables cannot be reduced to the two-description distortion even at very high redundancy. To overcome this problem, with GMDTC, in each description, we include not only one of the transformed variables but some information about the estimation error for the other variable as well. The bits used for coding the estimation error contribute to a second mode (called perp-mode$^2$) of redundancy, in addition to the first mode (called transform-mode) introduced by the correlating transform. The RRD performance of the GMDTC scheme is optimized by allocating a given total redundancy between the two modes to minimize $D_1$. An interesting result is that at a total redundancy lower than a critical point, all the given redundancy should be used in the transform mode. Beyond this point, the transform should only introduce the critical amount of redundancy, and the remaining bits should be exclusively applied for coding the estimation error. Thus, the GMDTC coder corresponds to a transform coder at low redundancies and a “hybrid” coder at higher redundancies.

In Section III, we describe the GMDTC scheme for a single pair of Gaussian variables, discuss how to allocate redundancies between the two modes of operations, and derive the overall RRD function. We will see that the overall RRD performance of GMDTC is much closer to the theoretical RRD bound: It has a super-exponential decay-rate at smaller redundancies but an exponential decay-rate at large redundancies. However, it is still above the RRD bound by a factor of two in the large redundancy range, as was the MDLC scheme.

Then, in Section IV, we consider how to apply GMDTC to more than two Gaussian variables. We discuss how to optimally allocate redundancy among pairs for a given pairing scheme and which pairing scheme achieves the best RRD performance. We show that for large redundancies, all pairing combinations are equal, and the RRD performance only depends on the geometric mean of the variances of all the variables, but for small redundancies, the optimal pairing scheme is to pair the variable with the largest variance with the variable having the smallest variance.

$^1$Note that in other papers, e.g., [2], the term “Generalized MDC” has been used to refer to MDC with more than two descriptions. The GMDTC scheme presented here still generates two descriptions, but it modifies the original MDTC coder to allow the specification of the coefficient estimation error as enhancement information.

$^2$The name “perp” is used because the information coded is orthogonal or perpendicular to what has been transmitted.

II. RRD BOUND AND REVIEW OF PREVIOUS WORK

A. RRD Bound for Gaussian Variables

First, we review the definition of the RRD function, which was introduced in [5]. For a given source, we call a coder that is optimized for coding efficiency a single description (SD) coder. Let $R^*$ represent the base rate needed by the SD coder to achieve a distortion $D_0$. Let $R$ represent the rate required by an MD coder to achieve the same distortion $D_0$ (called the central distortion) when both descriptions are available. Further, let $D_1$ (called the side distortion) represent the corresponding average distortion when only a single description is available. To reduce $D_1$, redundancy must be introduced so that $R > R^*$. The excess rate $\rho = R - R^*$ is defined as the redundancy. An RRD function captures the relationship between $D_1$ and $\rho$ for a fixed $D_0$ or, equivalently, $R^*$. In this subsection, we present the lower bound on the side distortion for an i.i.d. Gaussian source with the MSE as the distortion measure. We first consider a one-dimensional (1-D) source and then extend it to a two-dimensional (2-D) source.

Let $R_i$ and $D_i$, $i = 1, 2$, represent the rate and distortion associated with description 1, 2, and let $D_0$ represent the distortion from both descriptions. Ozarow [3] derived the necessary and sufficient conditions for a quintuple ($R_1$, $R_2$, $D_0$, $D_1$, $D_2$) to be achievable for an i.i.d. Gaussian source with unit variance using the MSE distortion criterion. Combining this result with the RD function for the Gaussian source, one can derive the lower bound on the side distortion for a given central distortion and redundancy. For the balanced case of $R_1 = R_2$, $D_1 = D_2$, Goyal and Kovacevic [2] derived such a lower bound. For a source with variance $\sigma^2$, this bound can be written as (1), shown at the bottom of the page, where $D_0 = \sigma^2 2^{-2R^*}$.

In this paper, we are primarily concerned with the behavior of the above bound when the central distortion $D_0$ is substantially smaller than the side distortion $D_1$ and the redundancy $\rho$ is far below the base rate $R^*$. Therefore, compared with the range of $D_1$ and $\rho$, one can consider $D_0 \to 0$, $R^* \to \infty$. In this case, the bound in (1) becomes

$$D_1 \ge \frac{\sigma^2}{2}\left[1 - \sqrt{1 - 2^{-2\rho}}\,\right]. \qquad (2)$$

Because the slope of $D_1$ at $\rho = 0$ is negative infinity, a very small amount of redundancy can yield a very large reduction in $D_1$. For $\rho$ large enough so that $2^{-2\rho} \ll 1$, we can approximate $\sqrt{1 - 2^{-2\rho}}$ by $1 - 2^{-2\rho}/2$, yielding

$$D_1 \ge \frac{\sigma^2}{4}\, 2^{-2\rho} \quad \text{when } 2^{-2\rho} \ll 1. \qquad (3)$$

Therefore, in the high-redundancy regime, the RRD bound decreases exponentially, similar to the distortion-rate function of an i.i.d. Gaussian source. Interestingly, in the very small redundancy regime, the RRD bound decays super-exponentially. Note

that the slope of the RRD bound does not change suddenly but rather gradually and continuously.

(1)

The RRD bound given in (2) is for coding a 1-D i.i.d. Gaussian source. In order to evaluate our transform-based coder against this bound, we need to find an equivalent bound for coding a 2-D vector i.i.d. Gaussian source in which each vector sample contains two independent variables $A$ and $B$, generally with different variances $\sigma_A^2$ and $\sigma_B^2$. In the 2-D case, we define the redundancy as the excess rate per variable required by an MD coder over that required by an SD coder. Specifically, $\rho = R - R^*$, where $R^*$ is the average rate per variable required by an SD coder to achieve an average per variable distortion $D_0$ when coding $A$ and $B$, and $R$ is the average rate per variable required by an MD coder to achieve the same average central distortion.

One way to code the 2-D source into two descriptions is by first finding an optimal allocation of the total base rate between the two components to achieve the specified average central distortion and then finding an optimal allocation of the total redundancy between the two components that will minimize the average single description distortion $D_1$. Using this approach, Goyal and Kovacevic derived an approximate lower bound on the side distortion for the 2-D Gaussian source [2] for different central distortions. In an unpublished addendum [10], Goyal showed that the central distortion split that is optimal for the rate-distortion problem is not necessarily best for minimizing the side distortion. The correct way to derive the side distortion bound is by simultaneously searching over the redundancy and central distortion allocations between the two components for the combination that minimizes the side distortion under the given constraints. The difference between the approximate bound and the correct bound is, however, negligible, when the central distortion is very small. Since we are only concerned with the case when $D_0 \ll D_1$, we can find the RRD bound for the 2-D Gaussian source by optimal redundancy allocation that minimizes $D_1$ in (2), i.e.,

$$\text{minimize} \quad D_1 = \frac{1}{2}\left\{\frac{\sigma_A^2}{2}\left[1 - \sqrt{1 - 2^{-2\rho_A}}\,\right] + \frac{\sigma_B^2}{2}\left[1 - \sqrt{1 - 2^{-2\rho_B}}\,\right]\right\}$$

$$\text{subject to} \quad \frac{\rho_A + \rho_B}{2} = \rho. \qquad (4)$$

The above-constrained optimization problem can be solved numerically. The solid curve in Fig. 3 shows the solution for a pair of variables with $\sigma_A = 1$, $\sigma_B = 0.2$. We can still see super-exponential decay for small $\rho$. When both $\rho_A$ and $\rho_B$ are sufficiently large, the approximation in (3) applies so that the minimizing function in (4) becomes

$$D_1 \approx \frac{1}{8}\left(\sigma_A^2\, 2^{-2\rho_A} + \sigma_B^2\, 2^{-2\rho_B}\right) \quad \text{when } 2^{-2\rho_A} \ll 1,\ 2^{-2\rho_B} \ll 1. \qquad (5)$$

It is easy to show that the optimal redundancy allocation yields

$$D_1 \ge \frac{\sigma_A \sigma_B}{4}\, 2^{-2\rho}. \qquad (6)$$

Therefore, at large redundancy, the RRD function for a 2-D Gaussian source also decays exponentially, with the constant factor depending on the geometric mean of the variances of the two variables. This is similar to the RD function for coding two independent Gaussian variables with optimal rate allocation [11].

In the next two subsections, we evaluate the operational RRD functions of two MD coders that we developed previously and compare them against the bound.

B. Basic MDTC Scheme

The MDTC scheme was first introduced in [4] and [5] and further developed in [2], [6], [7], [14]. Fig. 1 shows a block diagram of this scheme. An input pair of variables $A$ and $B$ are transformed into $C$ and $D$, using

(7)

Fig. 1. Basic MDTC scheme for a single pair of variables.

The transform parameter controls the correlation between $C$ and $D$, which in turn controls the redundancy of the MDTC coder. In our analysis, we assume $A$ and $B$ are two independent Gaussian variables with variances $\sigma_A^2$ and $\sigma_B^2$, respectively. Therefore, $C$ and $D$ are also Gaussian variables.

Because the transform is, in general, nonorthogonal, quantizing $C$ and $D$ will lead to degraded quantization performance. Therefore, $A$ and $B$ are first quantized to yield quantized indices, and the transform is accomplished in the integer domain to yield quantized transformed coefficients. We used the implementation described in [12]. A more general description of integer-to-integer transforms can be found in [13].

At the receiver, if both quantized transformed coefficients are available, an integer version of the inverse transform is applied to recover the quantized indices of $A$ and $B$. An inverse quantization is then applied to recover the quantized values of $A$ and $B$. If only one channel is working, say, the channel carrying $C$, inverse quantization is applied to yield its quantized value. Then, $D$ is estimated from this value, using the optimal linear estimator

(8)

Fig. 2. MDC using layered coding (MDLC).

Finally, the estimates of $A$ and $B$ are obtained by the inverse of (7). One can also combine the estimation and inverse transformation to estimate $A$ and $B$ directly from the received variable, as shown in Fig. 1.

In [6] and [14], we derived the operational RRD function of the previously mentioned MDTC scheme with consideration of both estimation and quantization errors. When the quantization error is small so that $D_0 \ll D_1$, the contribution of $D_0$ to $D_1$ can be neglected. In this case, the RRD function can be expressed as

(9)

As with the bound given in (2), (9) has a negative infinite slope at $\rho = 0$ so that it decays super-exponentially for small redundancy. However, at large redundancies, it does not decay exponentially, as does the bound. In fact, at the large redundancy region, the RRD function in (9) can be approximated by

(10)

Therefore, the one channel distortion cannot be reduced to less than $\sigma_B^2/2$, even when the redundancy is allowed to get very high. This is due to the fact that the MDTC method sends one variable ($C$ or $D$) on each channel, regardless of redundancy. Thus, for any redundancy, one-channel reconstruction accuracy can get no better than the best 1-D approximation of the two-dimensional (2-D) source ($A$, $B$). When MDTC is operating at the largest possible redundancy, $C$ and $D$ both are equal to the larger variable $A$ so that $A$ is represented with zero error (ignoring the quantization error), whereas $B$ is estimated with an error equal to $\sigma_B^2$. Thus, the average is $\sigma_B^2/2$.

C. MD Coder Using Layered Coding

As shown previously, the basic MDTC scheme works very well at the low-redundancy region but becomes very inefficient when the redundancy increases further. In [9], we presented an alternative approach for MD coding. It produces two equally important descriptions by adapting a layered coder, which produces a base layer and an enhancement layer for a given source. To achieve the MD objective, we include the base layer in both channels and split the enhancement layer equally between the two channels. This method is referred to as MDLC. This approach is intuitively appealing because it applies redundancy exclusively to the most significant part of the signal. Similar ideas have been used in the robust audio tool (RAT) developed by Hardman et al. [15] for audio streaming over the Internet, where, for each portion of a signal described in a previous packet, a low temporal resolution description of this portion is piggy-backed in a following packet. The MD coder proposed by Jiang and Ortega [16] also uses this idea: Each description contains a finely quantized version of one signal component (say the even samples), as well as a coarsely quantized version of another signal component (say the odd samples). Both algorithms can be considered to be a special case of the MDLC framework, with the base layer containing either a temporally subsampled signal or a coarsely quantized signal. More general ideas of combining layered coding with unequal error protection for realizing MDC with more than two descriptions have been considered in [17] and [18].

In the following, we derive the RRD function for the MDLC scheme. Consider an i.i.d. Gaussian source with variance $\sigma^2$. We use $R_b$ bits to code the base layer, which is sent on both channels. Then, we use $R_e$ bits to code the error between the original source and the signal reproduced from the base layer description. The resulting bit streams are sent on the two channels alternately using time sharing, with $R_e/2$ bits on each channel on average. Fig. 2 shows the block diagram of this coder. Because the base layer, using $R_b$ bits, is duplicated in both descriptions, the redundancy of this coder is obviously $\rho = R_b$.

When only the base layer is received on a channel, the distortion is $\sigma^2 2^{-2R_b}$, according to the RD bound for the i.i.d. Gaussian source [19]. When the enhancement layer is also received, the distortion is $\sigma^2 2^{-2(R_b + R_e)}$. This is possible because the Gaussian source is successively refinable [20], [21]. Because a channel always receives the base layer and it receives the enhancement layer only half of the time, the average distortion on a single channel is

$$D_1 = \frac{1}{2}\left(\sigma^2 2^{-2R_b} + \sigma^2 2^{-2(R_b + R_e)}\right). \qquad (11)$$

Obviously, the two-channel distortion of this coder is $D_0 = \sigma^2 2^{-2(R_b + R_e)}$. This is obtained with a total of $R = 2R_b + R_e$ bits. An optimal SD coder using $R^*$ bits would yield a distortion $\sigma^2 2^{-2R^*}$. Letting this distortion equal $D_0$, we see that $R^* = R_b + R_e$.

Fig. 3. RRD functions for a pair of Gaussian variables with $\sigma_A = 1$, $\sigma_B = 0.2$. Solid line: Bound derived from optimal redundancy allocation using numerical solutions. Dotted: RRD function of the MDTC scheme given by (9). Dash: RRD function of the MDLC scheme given by (18); “+”: RRD function of the GMDTC scheme given by (24). The RRD functions of MDTC and GMDTC move closer to the bound for a larger ratio of $\sigma_A/\sigma_B$.
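The factor-of-two gap between MDLC and the 1-D bound can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the small-$D_0$ form of the bound is $(\sigma^2/2)[1 - \sqrt{1 - 2^{-2\rho}}]$ and that the MDLC side distortion with negligible $D_0$ is $(\sigma^2/2)\,2^{-2\rho}$, which follows from duplicating a base layer coded at $\rho$ bits. The function names are illustrative only.

```python
import math


def rrd_bound_1d(rho, var=1.0):
    """Assumed small-D0 form of the 1-D bound: (var/2) * (1 - sqrt(1 - 2^(-2*rho)))."""
    return 0.5 * var * (1.0 - math.sqrt(1.0 - 2.0 ** (-2.0 * rho)))


def rrd_mdlc_1d(rho, var=1.0):
    """MDLC side distortion when D0 is negligible: (var/2) * 2^(-2*rho)."""
    return 0.5 * var * 2.0 ** (-2.0 * rho)


def gap_factor(rho, var=1.0):
    """Ratio of the MDLC side distortion to the bound at redundancy rho."""
    return rrd_mdlc_1d(rho, var) / rrd_bound_1d(rho, var)


if __name__ == "__main__":
    # Tabulate both curves; the ratio starts at 1 and approaches 2.
    for rho in (0.0, 0.05, 0.25, 0.5, 1.0, 2.0, 4.0):
        print(f"rho={rho:4.2f}  bound={rrd_bound_1d(rho):.6f}  "
              f"mdlc={rrd_mdlc_1d(rho):.6f}  ratio={gap_factor(rho):.4f}")
```

Under these assumptions the two curves coincide at zero redundancy, the bound drops much faster for small $\rho$, and the ratio settles near 2 once $2^{-2\rho} \ll 1$.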

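Therefore, the redundancy is $\rho = R - R^* = R_b$, as expected. The one channel distortion in (11) can be rewritten as

$$D_1 = \frac{1}{2}\left(\sigma^2 2^{-2\rho} + D_0\right). \qquad (12)$$

As with the rest of this paper, we are primarily concerned with the performance of MDLC when $D_0$ is negligible compared with $D_1$. In this case, the RRD function of MDLC is reduced to

$$D_1 = \frac{\sigma^2}{2}\, 2^{-2\rho}. \qquad (13)$$

From (13), we see that with MDLC, $D_1$ always decays exponentially in redundancy, even for small $\rho$. For large $\rho$, the MDLC scheme is inferior to the bound in (3) by a factor of 2.

To compare this scheme with MDTC, we apply this scheme to a pair of variables $A$ and $B$. Description one includes both the base and enhancement layers for $A$ but only base layer for $B$. Likewise, description two includes both base and enhancement layers for $B$ but only base layer for $A$. Let $R_{b,X}$ and $R_{e,X}$ represent the bit rate used for variable $X$ ($X = A, B$) for the base layer and enhancement layer, respectively. Clearly, the redundancy for variable $X$ is $\rho_X = R_{b,X}$, as the base layer bits for each variable are included in both descriptions. Using the RD function and the successive refinement property for the i.i.d. Gaussian source, the average distortion in $A$ and $B$, when both descriptions are received, is

$$D_0 = \frac{1}{2}\left(\sigma_A^2 2^{-2(R_{b,A} + R_{e,A})} + \sigma_B^2 2^{-2(R_{b,B} + R_{e,B})}\right). \qquad (14)$$

On the other hand, the average distortion when only description one is received is $\frac{1}{2}\left(\sigma_A^2 2^{-2(R_{b,A} + R_{e,A})} + \sigma_B^2 2^{-2R_{b,B}}\right)$. Similarly, the average distortion when only description two is received is $\frac{1}{2}\left(\sigma_A^2 2^{-2R_{b,A}} + \sigma_B^2 2^{-2(R_{b,B} + R_{e,B})}\right)$. Thus, the average side distortion is

$$D_1 = \frac{1}{4}\left(\sigma_A^2 2^{-2R_{b,A}} + \sigma_B^2 2^{-2R_{b,B}}\right) + \frac{1}{2} D_0. \qquad (15)$$

As before, we only consider the case when $D_0 \ll D_1$. Under this assumption, and the fact that $\rho_X = R_{b,X}$, we have

$$D_1 \approx \frac{1}{4}\left(\sigma_A^2 2^{-2\rho_A} + \sigma_B^2 2^{-2\rho_B}\right). \qquad (16)$$

Given the total per variable redundancy $\rho$, one should determine $\rho_A$ and $\rho_B$ so that

$$\text{minimize} \quad D_1 = \frac{1}{4}\left(\sigma_A^2 2^{-2\rho_A} + \sigma_B^2 2^{-2\rho_B}\right)$$

$$\text{subject to} \quad \frac{\rho_A + \rho_B}{2} = \rho, \quad \rho_A \ge 0,\ \rho_B \ge 0. \qquad (17)$$

The solution to the above constrained minimization problem is

$$D_1 = \begin{cases} \dfrac{\sigma_A \sigma_B}{2}\, 2^{-2\rho}, & \rho \ge \dfrac{1}{2}\log_2\dfrac{\sigma_A}{\sigma_B} \\[2ex] \dfrac{1}{4}\left(\sigma_A^2 2^{-4\rho} + \sigma_B^2\right), & \text{otherwise.} \end{cases} \qquad (18)$$

We see, again, a factor-of-2 degradation compared with the bound in (6) for large redundancies.

D. Comparison of MDTC and MDLC With the RRD Bound

Fig. 3 compares four RRD functions: the bound and those achievable by the MDTC, MDLC, and GMDTC schemes, all obtained for a pair of variables with $\sigma_A = 1$ and $\sigma_B = 0.2$. We will describe the GMDTC scheme in Section III. The bound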

Fig. 4. GMDTC scheme for a single pair of variables.

(solid line) is obtained by numerically solving the optimization problem in (4); the MDTC curve is calculated according to (9); the MDLC curve is obtained based on (18). We can see that the MDTC curve matches closely the bound in the small redundancy region, with both decaying super-exponentially, but the MDTC curve starts to level off at an intermediate redundancy, and it eventually converges to a nonzero value $\sigma_B^2/2$. On the other hand, the MDLC curve decays slower than the bound and the MDTC curve at smaller redundancies, but it eventually becomes parallel with the bound because both decay exponentially at large redundancies. The crossing point between MDTC and MDLC curves moves farther to the right when the ratio of the smaller variance to the larger variance becomes smaller.

III. GENERALIZATION OF MULTIPLE DESCRIPTION TRANSFORM CODING

Recognizing the inferior behavior of MDTC when the redundancy is greater than a certain value, we have developed a GMDTC scheme, which was first presented in [6]. Recall that the MDTC scheme sends one variable ($C$ or $D$) on each channel, regardless of redundancy. Thus, for any redundancy, one-channel reconstruction accuracy can get no better than the best 1-D approximation of the 2-D source ($A$, $B$). To allow one-channel distortion to converge to two-channel distortion for high redundancy, we include in each description the error between the original signal and that reconstructed from $C$ or $D$. More specifically, we include, in the channel carrying the variable $C$, some information about a variable $C^\perp$, which is the orthogonal complement to $C$ in the Hilbert space spanned by ($A$, $B$). By definition, $C^\perp$ is the component of $D$ that cannot be predicted from $C$ or from its quantized value. The channel carrying $D$ also codes a variable $D^\perp$ that is similarly defined. Note that bits used to code $C^\perp$ and $D^\perp$ are pure redundancy since they are irrelevant when both channels are received. Fig. 4 shows the steps required in implementing this generalized framework for a given pair.

Conceptually, we can think of $C$ or $D$ as the base layer information about $A$ and $B$, whereas $C^\perp$ or $D^\perp$ is the enhancement information. Unlike in MDLC, where the same base layer information is duplicated in both descriptions, $C$ and $D$ have an adjustable amount of correlation. In this sense, the GMDTC approach can be viewed as a hybrid of MDTC and MDLC. In the following, we analyze redundancy allocation between coding ($C$, $D$) and ($C^\perp$, $D^\perp$) and derive the RRD function of the GMDTC scheme.

With the transform given in (7), the estimation error for $D$ from $C$ and that for $C$ from $D$ are the same. Therefore, the variables $C^\perp$ and $D^\perp$ have identical variances and, hence, should be coded with identical rates. Thus, we only discuss the case when the channel carrying $D$ and $D^\perp$ is lost (the other case follows from symmetry). We distinguish the per-variable redundancy used to code $C$ and $D$ (the transform-mode redundancy) from the per-variable rate applied to coding $C^\perp$ and $D^\perp$ (the perp-mode redundancy). The variance of $C^\perp$ is equal to the estimation error for $D$ given $C$. The one-channel distortion, which is the MSE for $A$ and $B$ when only $C$ is received, is related, via the MDTC inverse transform matrix, to the quantization error for $C$ and the estimation error for $D$ from $C$ by

(19)

The approximation in the preceding equation is based on the fact that, when the central distortion is sufficiently small, the quantization error for $C$ is significantly smaller than the estimation error for $D$ from $C$. Therefore, the variance of $C^\perp$ can be written in terms of the estimation error in (19). Since the original variables $A$ and $B$ are independent Gaussian random variables, if we ignore the quantization errors for $C$ and $D$, all derived variables $C$, $D$, $C^\perp$, and $D^\perp$ are also Gaussian. Therefore, the quantization error$^3$ for $C^\perp$ follows the RD function for the Gaussian source, i.e., the variance of $C^\perp$ reduced by a factor of $2^{-2r}$ when $C^\perp$ is coded at a rate of $r$ bits. Note that this quantization error is equal to the final reconstruction error for $D$. Based on the relationship between the distortion in $C$ and $D$ and that in $A$ and $B$ given by (19), the overall single description distortion with GMDTC is

(20)

$^3$In GMDTC, the quantization error for $C$ and $D$ is significantly smaller than that for $C^\perp$ and $D^\perp$, except when $\rho$ is close to $R^*$. Therefore, when determining $D_1$, one can ignore the quantization error for $C$ and $D$ but not that for $C^\perp$ and $D^\perp$.
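The definition of the perp variable as the unpredictable part of $D$ given $C$ can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes a symmetric correlating transform of the hypothetical form $C = aA + bB$, $D = -aA + bB$, where the coefficients `a` and `b` and all function names are illustrative and are not taken from (7). Quantization is ignored.

```python
import math
import random


def perp_statistics(var_a, var_b, a, b):
    """Predictor gain and residual variance for the hypothetical transform
    C = a*A + b*B, D = -a*A + b*B with independent zero-mean A and B."""
    var_c = a * a * var_a + b * b * var_b    # E[C^2]; E[D^2] has the same value
    cov_cd = -a * a * var_a + b * b * var_b  # E[C*D]
    gamma = cov_cd / var_c                   # best linear predictor: D_hat = gamma * C
    var_perp = var_c - cov_cd ** 2 / var_c   # variance of the residual D - gamma*C
    return gamma, var_perp


def sample_perp(var_a, var_b, a, b, n=20000, seed=0):
    """Monte Carlo estimates of E[perp * C] and E[perp^2], perp = D - gamma*C."""
    rng = random.Random(seed)
    gamma, _ = perp_statistics(var_a, var_b, a, b)
    cross = 0.0
    power = 0.0
    for _ in range(n):
        x_a = rng.gauss(0.0, math.sqrt(var_a))
        x_b = rng.gauss(0.0, math.sqrt(var_b))
        c = a * x_a + b * x_b
        d = -a * x_a + b * x_b
        perp = d - gamma * c
        cross += perp * c
        power += perp * perp
    return cross / n, power / n


if __name__ == "__main__":
    gamma, var_perp = perp_statistics(1.0, 0.04, 1.0, 0.5)
    cross, power = sample_perp(1.0, 0.04, 1.0, 0.5)
    print(f"gamma={gamma:.4f}  var_perp={var_perp:.5f}")
    print(f"sample E[perp*C]={cross:.5f}  sample E[perp^2]={power:.5f}")
```

In this assumed transform, $C$ and $D$ have equal variance, so predicting $C$ from $D$ leaves the same residual variance as predicting $D$ from $C$. This is consistent with the statement that $C^\perp$ and $D^\perp$ have identical variances.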

Fig. 5. RRD functions obtained based on theoretical analysis and simulations for one pair of variables, with $\sigma_A = 1$, $\sigma_B = 0.4$. “Generalized” corresponds to GMDTC. “Transform-only” corresponds to MDTC.

To determine how to allocate the redundancy between the two modes, given a total redundancy budget $\rho$, we determine the optimal transform-mode redundancy that minimizes $D_1$ in (20), with the perp-mode redundancy taking up the remainder of the budget. Setting the derivative of $D_1$ with respect to the transform-mode redundancy to zero yields

(21)

We denote the transform-mode redundancy that satisfies (21) as $\rho^*$. Interestingly, $\rho^*$ is independent of $\rho$. In fact, $\rho^*$ is the point where the MDTC RRD function has a slope that is exactly equal to that of an exponential decay with exponent $-2\rho$. This result implies that when $\rho \le \rho^*$, we should apply all the allowed redundancy to introduce correlation between $C$ and $D$. On the other hand, when $\rho > \rho^*$, we should use the transform to introduce correlation only up to the amount specified by $\rho^*$ and use the remaining redundancy to code $C^\perp$ and $D^\perp$. Solving (21) yields

(22)

The single-channel distortion at $\rho^*$ is

(23)

The RRD function over the entire redundancy range is then

(24)

Comparing (24) with (6), we see that, for large redundancy, GMDTC is a factor of 2 above the bound, as was MDLC. However, at smaller redundancy, GMDTC achieves the super-exponential decay behavior of the bound, which is better than MDLC. This is more clearly revealed in Fig. 3, where we compare GMDTC with the bound as well as MDLC and MDTC. We can see that the GMDTC curve is initially (for $\rho \le \rho^*$) the same as MDTC; both are very close to the bound. When $\rho > \rho^*$, the MDTC curve starts to flatten, whereas the GMDTC curve continues to decay at an exponential rate. As with the MDTC curve, the GMDTC curve moves closer to the bound when the ratio of the smaller variance to the larger variance becomes smaller.

Fig. 5 compares the RRD curves obtained by the MDTC and GMDTC methods for two variables with $\sigma_A = 1$ and $\sigma_B = 0.4$. We show both theoretical curves calculated according to (9) and (24) as well as results from numerical simulations. To obtain the simulation points, we generate pairs of independent Gaussian random variables $A$ and $B$ with specified variances, uniformly quantize $A$ and $B$ with a selected step-size, then apply PCT to the quantized values using the integer-to-integer implementation of [12] to yield the quantized $C$ and $D$, and finally reconstruct $A$ and $B$ for both the one- and two-channel cases. Points with different redundancies are generated by varying the transform parameter. For the GMDTC simulations, we also need to generate $C^\perp$ and $D^\perp$ and quantize each using a coarser step-size. One- and two-channel distortions are calculated from the reconstructed variables. The bit rates $R^*$ and $R$ are estimated based on the entropies of the quantized values of $A$, $B$, $C$, $D$, $C^\perp$, and $D^\perp$. The redundancy rates are determined by $\rho = R - R^*$. To avoid the bias caused by the random generator, 8000 simulation points are generated, and the resulting $D_1$ and $\rho$ are averaged. For MDTC and the transform-mode in GMDTC, a fixed, small quantization step size is used. This ensures that the rate $R^*$ is high enough so that the quantization error for $C$ and $D$ can indeed be ignored, as has been done in our theoretical derivations. For the perp-mode in GMDTC, the step-size is varied to generate points with varying overall redundancies.
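The rate and distortion measurements used in such a simulation can be sketched as follows. This is a simplified illustration rather than the authors' procedure: it covers only uniform scalar quantization of a single Gaussian variable and an entropy-based rate estimate. The pairwise transform and the perp variables are omitted, and the step size, seed, and function names are assumptions.

```python
import math
import random
from collections import Counter


def quantize_index(x, step):
    """Uniform mid-tread quantizer: index of the nearest multiple of step."""
    return math.floor(x / step + 0.5)


def empirical_entropy(indices):
    """First-order entropy estimate, in bits per sample, of a list of indices."""
    n = len(indices)
    counts = Counter(indices)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def simulate(var, step, n=8000, seed=1):
    """Return (estimated rate in bits, MSE) for one quantized Gaussian variable."""
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, math.sqrt(var)) for _ in range(n)]
    indices = [quantize_index(x, step) for x in samples]
    mse = sum((x - i * step) ** 2 for x, i in zip(samples, indices)) / n
    return empirical_entropy(indices), mse


if __name__ == "__main__":
    rate, mse = simulate(var=1.0, step=0.1)
    rd_rate = 0.5 * math.log2(1.0 / mse)  # Gaussian RD rate for the same MSE
    print(f"rate={rate:.3f} bits  mse={mse:.6f}  RD rate={rd_rate:.3f} bits")
```

With a small step size, the measured MSE is close to $\Delta^2/12$, and the entropy estimate exceeds the Gaussian RD rate for the same distortion by roughly a quarter of a bit, which is the known high-rate penalty of entropy-coded uniform scalar quantization.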

Note that the points generated using MDTC and the transform-mode in GMDTC match very well with the theoretical curve. However, in the perp-mode, the simulated RRD curve for GMDTC deviates from the theoretical one more noticeably. This can be explained as follows. Theoretically, the RD function for the Gaussian source is achievable only when an infinite number of variables are quantized and coded together. Our simulation uses scalar quantization followed by entropy coding; hence, it requires a higher bit rate than that indicated by the RD bound to achieve a given distortion. In the perp-mode, because the additional redundancy is directly determined from the bit rates required for the additional coefficients, the simulated RRD function deviates from the theoretical one. However, in the transform-mode (and in the case of MDTC), because we apply scalar quantization and entropy coding in both the SD (quantizing and coding the two original variables individually) and MD (quantizing and coding the two transformed variables individually) simulations, the extra bits beyond the RD function in these two cases are similar and cancel when we calculate the redundancy.

Relationship Between GMDTC and Overcomplete Transforms: It is worthwhile to note that, without considering quantization, another way to generate the four sets of coefficients from the two original variables is by using an overcomplete 4 × 2 linear transform, which can be considered to be a frame expansion. Goyal et al. [22] have proposed an MDC scheme using frame expansion. The difference between that work and the GMDTC presented here lies mainly in two aspects. The first and more important is that there is an innate hierarchy among the four sets of coefficients: two of them are useless unless their respective counterparts are available. On the other hand, MDC using frame expansion usually generates symmetric descriptions. Therefore, instead of considering GMDTC as generating four descriptions using frame expansion, it is better to think of GMDTC as generating two descriptions, with each description containing two layers. The second difference lies in the way quantization is handled. The GMDTC framework allows us to map from the original variables to the transformed variables in the quantized domain. With the frame expansion scheme proposed in [22], all four variables will be first generated from the two original variables and then quantized. As the transform is nonorthogonal, quantizing the transformed coefficients is less efficient than quantizing the original variables.

IV. REDUNDANCY ALLOCATION AND OPTIMAL PAIRING SCHEME

The GMDTC method presented in the previous section is applied to coding a single pair of variables. To generalize this method to more than two variables, we group them into pairs (assuming an even number of variables) and apply the basic GMDTC scheme to each pair. For a given set of variables, many pairing combinations are possible, and for each pairing, different amounts of redundancy can be introduced into each pair. This section discusses how to optimize the RRD performance with respect to these degrees of freedom.

A. Optimal Redundancy Allocations Among Pairs

First, we assume a given pairing combination and consider how to allocate redundancy among pairs. Mathematically, the problem is to minimize the average side distortion over the pairs subject to a constraint on the average redundancy, where the RRD function for each pair is given by (24), with the variances and redundancy replaced by those of that pair.

The optimal redundancy allocation requires that each pair operates at the same slope on its RRD function. Direct solution of this problem is difficult because the RRD function of each pair has two parts. Instead, we could first determine the optimal redundancy allocation by assuming that the RRD function of each pair is an exponentially decaying function over the entire redundancy range. This function matches the actual RRD function at large redundancy and will be referred to as the virtual RRD function. Optimal redundancy allocation for this virtual RRD function can be derived easily using the Lagrange multiplier method, which gives the redundancy of each pair for a given desired average redundancy.

Then, for each pair, we check whether the actual and virtual RRD functions coincide at the allocated redundancy. If this is the case, we keep the allocated redundancy. Otherwise, we find a point in the first segment of the actual RRD function that matches the slope of the virtual RRD function at the allocated redundancy. Here, the first segment is the super-exponential segment of the actual RRD function. Because this segment decays faster than the virtual RRD function, the redundancy found in this way is no smaller than the one originally allocated. The average redundancy obtained from this solution may thus exceed the specified one. However, this solution is optimal for the realized average redundancy because each pair operates at the same slope on its RRD function. If one is interested in a solution for the original specified average redundancy, then the process can be repeated by starting with a slightly lower redundancy until the realized average redundancy equals the desired one.

Using the redundancy allocation scheme described previously, the RRD functions obtained with different pairings will all converge at large redundancy to a virtual RRD function whose form involves the geometric mean of the variances. This function is the average obtained with optimal redundancy allocation on the virtual RRD function of each pair. Note that the geometric mean of the variances is the same no matter how the variables are paired. This means that for large redundancy, all possible pairing schemes are equal. At smaller redundancy ranges,
however, this is not true. In this case, the actual RRD curve associated with any pairing is above the virtual one.

Fig. 6. Determination of the best pairing scheme. (a) Pairing with a smaller merging redundancy has a lower RRD curve. (b) Deriving the merging redundancy for two pairs.

B. Optimal Pairing Scheme

To determine which pairing is optimal for an arbitrary number of variables, we first consider the case of four variables. Without loss of generality, we assume these variables are ordered so that they have decreasing variances. There are only three different pairing schemes: {(1, 2), (3, 4)}, {(1, 3), (2, 4)}, and {(1, 4), (2, 3)}. We refer to the redundancy at which the real RRD function for a given pairing meets the virtual RRD as its merging redundancy. As illustrated in Fig. 6(a), all three pairings have identical side distortion at zero
redundancy, and they all eventually converge to the same virtual curve. Furthermore, the RRD function associated with each pairing combination is convex. Therefore, the pairing combination that has the smallest merging redundancy (the redundancy at which its real RRD curve meets the virtual one) will yield the lowest (thus, the best) RRD curve because it merges with the virtual curve earliest. We will show that pairing combination {(1, 4), (2, 3)}, i.e., pairing the biggest variable to the smallest one, yields the smallest merging redundancy.

For a given pairing combination containing two pairs, the merging redundancy is the smallest average redundancy at which the actual RRD curves for both pairs meet with their respective virtual RRD curves, and the two RRD curves have equal slope. Given the variances of the two pairs, we can determine the point at which each pair's actual RRD curve meets its own virtual curve using (22) and (23). As shown in Fig. 6(b), when the first pair is the one whose curve meets its virtual curve later, the merging redundancy is achieved when the first RRD function is operating exactly at its meeting point, whereas the second RRD function is at a point beyond its own. That point should be where the second RRD function is in its exponential segment and has the same slope as the first RRD function at its meeting point. Specifically, it can be determined by matching the slope of the exponential segment of the second RRD function with the slope of the first RRD function at its meeting point. This yields the expression in (25).

Consider now the merging redundancies associated with the pairing combinations {(1, 2), (3, 4)}, {(1, 3), (2, 4)}, and {(1, 4), (2, 3)}, respectively. For each pairing combination containing two pairs, we can determine which pair meets its virtual curve later based on the coefficient variances and then apply (25) correspondingly to derive its merging redundancy. Then, by making use of the assumed ordering of the variances, one can show that the pairing {(1, 4), (2, 3)} yields the smallest merging redundancy and, consequently, is the best pairing combination for the four-variable case.

Now, consider an arbitrary number of variables. For any given pairing, choose any two pairs. If the largest variance is not paired with the smallest one for the four variables involved, we can change the pairing for these four variables to the big-to-small pairing. The previous result shows that this can only improve the RRD performance. Repeating this process for each set of two pairs will yield a pairing that pairs the k-th largest variance with the k-th smallest, which is the optimal pairing scheme.

Fig. 7. RRD functions obtained with GMDTC for coding six variables using different pairing combinations. The variances for the variables are σ = 0.4 , i = 0, 1, ..., 5. The different curves are obtained based on theoretical analysis for the pairing methods indicated. The symbols "*," "+," and "x" indicate the simulation results for "Pairing neighbors," "Another pairing," and "Pairing big-to-small."

Fig. 7 shows theoretical and simulated RRD for the case of six variables using GMDTC. The variances of the variables are as given in the caption of Fig. 7, and a fixed quantization step-size is used in the transform mode. Of the possible 15 ways to pair six variables, we show performance for three possible pairings:
• pairing neighbors {(1, 2), (3, 4), (5, 6)};
• the pairing {(1, 4), (2, 3), (5, 6)};
• the pairing that pairs the biggest variable with the smallest {(1, 6), (2, 5), (3, 4)}.
Simulation results are also shown, with the three pairings indicated by "*," "+," and "x," respectively. We can see that the pairing that performs best over the entire range of redundancies is {(1, 6), (2, 5), (3, 4)}.

As with the single pair case shown in Fig. 5, simulation results for GMDTC deviate from the theoretical RRD curve. For multiple pairs, this effect is enhanced when neighboring variables are paired. As a result, the performance advantage of using
the correct pairing is greater in practice than that predicted by theory.

V. CONCLUSION

We have described a GMDTC scheme, which improves on a previously reported MDTC scheme. By combining the transform operation mode of MDTC and the idea of successive refinement, the GMDTC scheme can retain the excellent performance of MDTC at small redundancies and yet overcome the inefficiency of MDTC at large redundancies.

With the current GMDTC method, when both descriptions are received, the information carried by two of the four coefficient sets is simply ignored. One possible way to improve the performance of GMDTC is by jointly designing the quantizers for one pair of coefficient sets (and similarly for the other pair) so that one quantized member of the pair can be used to refine the reconstruction of the other beyond that obtainable from its own quantized value. This way, for the same total bit rate, the two-description distortion can be reduced. One way to achieve this is by interleaving the quantizer cells of the two quantizers using the multiple description scalar quantization scheme proposed by Vaishampayan [23].

ACKNOWLEDGMENT

This work was conducted at AT&T Labs–Research.

REFERENCES

[1] V. K. Goyal, "Multiple description coding: Compression meets the network," IEEE Signal Processing Mag., vol. 18, pp. 74–93, Sept. 2001.
[2] V. K. Goyal and J. Kovacevic, "Generalized multiple description coding with correlating transforms," IEEE Trans. Inform. Theory, vol. 47, pp. 2199–2224, Sept. 2001.
[3] L. Ozarow, "On a source coding problem with two channels and three receivers," Bell Syst. Tech. J., vol. 59, p. 1921, Dec. 1980.
[4] Y. Wang, M. Orchard, and A. Reibman, "Multiple description image coding for noisy channels by pairing transform coefficients," in Proc. IEEE First Workshop Multimedia Signal Process., June 1997.
[5] M. Orchard, Y. Wang, V. Vaishampayan, and A. Reibman, "Redundancy rate distortion analysis of multiple description image coding using pairwise correlating transforms," in Proc. Int. Conf. Image Process., Santa Barbara, CA, Oct. 1997.
[6] Y. Wang, M. Orchard, and A. Reibman, "Optimal pairwise correlating transforms for multiple description coding," in Proc. Int. Conf. Image Process., vol. 1, Chicago, IL, Oct. 1998, pp. 679–683.
[7] V. K. Goyal and J. Kovacevic, "Optimal multiple description transform coding of Gaussian vectors," in Proc. Data Compress. Conf., Snowbird, UT, Mar. 1998, pp. 388–397.
[8] V. K. Goyal, J. Kovacevic, R. Arean, and M. Vetterli, "Multiple description transform coding of images," in Proc. IEEE Int. Conf. Image Process., vol. 1, Oct. 1998, pp. 674–678.
[9] A. R. Reibman, H. Jafarkhani, M. T. Orchard, and Y. Wang, "Performance of multiple description coders on a real channel," in Proc. ICASSP, Phoenix, AZ, May 1999.
[10] V. K. Goyal. Computing the Side Distortion Lower Bound for Multiple Description Coding of Two-Dimensional Gaussian Sources. [Online]. Available: http://lcavwww.epfl.ch/~goyal/Papers/it-mdtc-addendum-bound.pdf
[11] J.-Y. Huang and P. M. Schultheiss, "Block quantization of correlated Gaussian random variables," IEEE Trans. Commun., vol. COM-11, pp. 289–296, Sept. 1963.
[12] X. Li, B. Tao, and M. T. Orchard, "On implementing transforms from integers to integers," in Proc. Int. Conf. Image Process., vol. 3, Chicago, IL, Oct. 1998, pp. 881–885.
[13] R. Calderbank, I. Daubechies, W. Sweldens, and B.-L. Yeo, "Wavelet transforms that map integers to integers," Appl. Comput. Harmon. Anal., vol. 5, no. 3, pp. 332–369, July 1998.
[14] Y. Wang, M. Orchard, V. Vaishampayan, and A. Reibman, "Multiple description coding using pairwise correlating transforms," IEEE Trans. Image Processing, vol. 10, pp. 351–366, Mar. 2001.
[15] V. Hardman, A. Sasse, M. Handley, and A. Watson, "Reliable audio for use over the Internet," in Proc. INET, Honolulu, HI, June 1995.
[16] W. Jiang and A. Ortega, "Multiple description coding via polyphase transform and selective quantization," in Proc. SPIE Conf. Visual Commun. Image Process., San Jose, CA, Jan. 1999.
[17] A. E. Mohr, E. A. Riskin, and R. E. Ladner, "Generalized multiple description coding through unequal loss protection," in Proc. IEEE Int. Conf. Image Process., vol. 1, Kobe, Japan, Oct. 1999, pp. 411–415.
[18] R. Puri and K. Ramchandran, "Multiple description source coding using forward error correction," in Conf. Rec. 33rd Asilomar Conf. Signals, Syst., Comput., vol. 1, Oct. 1999, pp. 342–346.
[19] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[20] W. H. R. Equitz and T. M. Cover, "Successive refinement of information," IEEE Trans. Inform. Theory, vol. 37, pp. 269–275, Mar. 1991.
[21] W. H. R. Equitz and T. M. Cover, "Addendum to 'Successive refinement of information'," IEEE Trans. Inform. Theory, vol. 39, pp. 1465–1466, July 1993.
[22] V. K. Goyal, J. Kovacevic, and M. Vetterli, "Quantized frame expansion as source-channel codes for erasure channels," in Proc. Data Compress. Conf., Mar. 1999, pp. 326–335.
[23] V. A. Vaishampayan, "Design of multiple description scalar quantizer," IEEE Trans. Inform. Theory, vol. 39, pp. 821–834, May 1993.

Yao Wang (M'90–SM'98) received the B.S. and M.S. degrees in electronic engineering from Tsinghua University, Beijing, China, in 1983 and 1985, respectively, and the Ph.D. degree in electrical and computer engineering from the University of California, Santa Barbara, in 1990.
Since 1990, she has been with the faculty of Polytechnic University, Brooklyn, NY, where she is presently Professor of electrical and computer engineering. She was on sabbatical leave at Princeton University in 1998 and was a visiting professor at the University of Erlangen, Germany, in the summer of 1998. She was a consultant with AT&T Labs–Research (formerly AT&T Bell Laboratories) from 1992 to 2000. Her research areas include communications, multimedia signal processing, and medical imaging. She is the leading author of a textbook entitled Video Processing and Communications and has published over 100 papers in journals and conference proceedings.
Dr. Wang has served as an Associate Editor for IEEE TRANSACTIONS ON MULTIMEDIA and the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY. She received the New York City Mayor's Award for Excellence in Science and Technology in the Young Investigator Category in 2000.

Amy R. Reibman (M'87–SM'02) received the B.S., M.S., and Ph.D. degrees in electrical engineering from Duke University, Durham, NC, in 1983, 1984, and 1987, respectively.
From 1988 to 1991, she was an Assistant Professor with the Department of Electrical Engineering, Princeton University, Princeton, NJ. In 1991, she joined AT&T Bell Laboratories, Holmdel, NJ, and became a Distinguished Member of Technical Staff in 1995. She is currently a Technology Consultant in the Communication Sciences Research Department at AT&T Laboratories. Her research interests include video compression systems for transport over packet and wireless networks and video quality metrics.
Dr. Reibman received the IEEE Communications Society Leonard G. Abraham Prize Paper Award in 1998. She was the Technical Program Chair for the Sixth International Workshop on Packet Video in 1994 and Technical Co-chair for the First IEEE Workshop on Multimedia Signal Processing in 1997. She is the Technical Co-chair of the IEEE International Conference on Image Processing in 2002.

Michael Orchard (F'00) was born in Shanghai, China, and grew up in New York, NY. He received the B.S. and M.S. degrees in electrical engineering from San Diego State University, San Diego, CA, in 1980 and 1986 and the M.A. and Ph.D. degrees in electrical engineering from Princeton University, Princeton, NJ, in 1988 and 1990.
He has been a Professor with the Department of Electrical and Computer Engineering at Rice University, Houston, TX, since July 2001. Prior to joining Rice University, he was on the faculty at Princeton University from 1995 to 2001 and was an Assistant Professor at the University of Illinois at Urbana-Champaign from 1990 to 1995. From 1982 to 1986, he worked at the Government Products Division of Scientific Atlanta, Atlanta, GA, developing passive sonar DSP applications. From 1988 to 1999, he was a consultant with the Visual Communications Department of AT&T Bell Laboratories and with NEC Research Institute in Princeton in 1999.
Dr. Orchard received the National Science Foundation Young Investigator Award in 1993 and the Army Research Office Young Investigator Award in 1996.

Hamid Jafarkhani (SM'01) received the B.S. degree in electronics from Tehran University, Tehran, Iran, in 1989 and the M.S. and Ph.D. degrees, both in electrical engineering, from the University of Maryland, College Park, in 1994 and 1997, respectively.
From June 1996 to September 1996, he was a summer intern at Lucent Technologies (Bell Labs). He joined AT&T Labs–Research, Florham Park, NJ, as a Senior Technical Staff Member in August 1997. Later, he was promoted to Principal Technical Staff Member. He was with Broadcom Corp. as a Senior Staff Scientist from July 2000 to September 2001. Currently, he is with the Department of Electrical and Computer Engineering, University of California, Irvine, where he is also the Deputy Director of the Center for Pervasive Communications.
Dr. Jafarkhani ranked first in the nationwide entrance examination of Iranian universities in 1984. He was a co-recipient of the American Division Award of the 1995 Texas Instruments DSP Solutions Challenge. He is an Associate Editor for the IEEE COMMUNICATIONS LETTERS.