Simultaneous Low-Pass Filtering and Total Variation Denoising

Ivan W. Selesnick, Harry L. Graber, Douglas S. Pfeil, and Randall L. Barbour

Contact author: I. Selesnick, email: [email protected], phone: 718 260-3416, fax: 718 260-3906. I. W. Selesnick is with the Department of Electrical and Computer Engineering, Polytechnic Institute of New York University, 6 Metrotech Center, Brooklyn, NY 11201. H. L. Graber, D. S. Pfeil, and R. L. Barbour are with the Department of Pathology, SUNY Downstate Medical Center, Brooklyn, NY 11203. This research was supported by the NSF under Grant No. CCF-1018020, the NIH under Grant Nos. R42NS050007, R44NS049734, and R21NS067278, and by DARPA project N66001-10-C-2008.

Abstract—This paper seeks to combine linear time-invariant (LTI) filtering and sparsity-based denoising in a principled way in order to effectively filter (denoise) a wider class of signals. LTI filtering is most suitable for signals restricted to a known frequency band, while sparsity-based denoising is suitable for signals admitting a sparse representation with respect to a known transform. However, some signals cannot be accurately categorized as either band-limited or sparse. This paper addresses the problem of filtering noisy data for the particular case where the underlying signal comprises a low-frequency component and a sparse or sparse-derivative component. A convex optimization approach is presented and two algorithms derived, one based on majorization-minimization (MM), the other based on the alternating direction method of multipliers (ADMM). It is shown that a particular choice of discrete-time filter, namely zero-phase non-causal recursive filters for finite-length data formulated in terms of banded matrices, makes the algorithms effective and computationally efficient. The efficiency stems from the use of fast algorithms for solving banded systems of linear equations. The method is illustrated using data from a physiological-measurement technique (i.e., near infrared spectroscopic time series imaging) that in many cases yields data that is well-approximated as the sum of low-frequency, sparse or sparse-derivative, and noise components.

Keywords: total variation denoising, sparse signal, sparsity, low-pass filter, Butterworth filter, zero-phase filter.

I. INTRODUCTION

Linear time-invariant (LTI) filters are widely used in science, engineering, and general time series analysis. The properties of LTI filters are well understood, and many effective methods exist for their design and efficient implementation [62]. Roughly, LTI filters are most suitable when the signal of interest is (approximately) restricted to a known frequency band. At the same time, the effectiveness of an alternate approach to signal filtering, based on sparsity, has been increasingly recognized [22], [34], [60], [74]. Over the past 10-15 years, the development of algorithms and theory for sparsity-based signal processing has been an active research area, and many algorithms for sparsity-based denoising (and reconstruction, etc.) have been developed [67], [73]. These are most suitable when the signal of interest either is itself sparse or admits a sparse representation.

However, the signals arising in some applications are more complex: they are neither isolated to a specific frequency band nor do they admit a highly sparse representation. For such signals, neither LTI filtering nor sparsity-based denoising is appropriate by itself. Can conventional LTI filtering and more recent sparsity-based denoising methods be combined in a principled way, to effectively filter (denoise) a wider class of signals than either approach can alone?

This paper addresses the problem of filtering noisy data where the underlying signal comprises a low-frequency component and a sparse or sparse-derivative component. It is assumed here that the noisy data y(n) can be modeled as

    y(n) = f(n) + x(n) + w(n),   n = 0, ..., N − 1   (1)

where f is a low-pass signal, x is a sparse and/or sparse-derivative signal, and w is stationary white Gaussian noise. For noisy data such as y in (1), neither conventional low-pass filtering nor sparsity-based denoising is suitable. Further, (1) is a good model for many types of signals that arise in practice, for example, in nano-particle biosensing (e.g., Fig. 3a in [30]) and near infrared spectroscopic (NIRS) imaging (e.g., Fig. 9 in [3]).

Note that if the low-pass signal f were observed in noise alone (y = f + w), then low-pass filtering (LPF) would provide a good estimate of f; i.e., f ≈ LPF(f + w). On the other hand, if x were a sparse-derivative signal observed in noise alone (y = x + w), then total variation denoising (TVD) would provide a good estimate of x; i.e., x ≈ TVD(x + w) [68]. Given noisy data of the form y = f + x + w, we seek a simple optimization-based approach that enables the estimation of f and x individually.
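As a concrete illustration of model (1) and of the remark above, the following sketch (not taken from the paper; the signal shapes, filter order, cutoff, and noise level are arbitrary choices) synthesizes y = f + x + w and applies a zero-phase low-pass Butterworth filter. The low-pass filter alone recovers a smooth trend but smears the step edges contributed by x, which is the behavior that motivates combining low-pass filtering with sparsity-based (TV) denoising.

    # Illustrative sketch of model (1): y = f + x + w (all parameter values are arbitrary)
    import numpy as np
    from scipy.signal import butter, filtfilt

    N = 1000
    n = np.arange(N)
    f = 0.5 * np.sin(2 * np.pi * 0.005 * n)                  # low-frequency component f
    x = np.zeros(N)
    x[400:700] = 2.0                                         # sparse-derivative (step) component x
    w = 0.3 * np.random.default_rng(0).standard_normal(N)    # stationary white Gaussian noise w
    y = f + x + w                                            # noisy data, as in (1)

    # Zero-phase (forward-backward) low-pass Butterworth filtering:
    # a good estimate of f when x = 0, but it blurs the edges of x.
    b, a = butter(2, 0.02)
    y_lpf = filtfilt(b, a, y)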
In this paper, an optimization approach is presented that enables the simultaneous use of low-pass filtering and sparsity-based denoising to estimate a low-pass signal and a sparse signal from a single noisy additive mixture, cf. (1). The optimization problem we formulate involves the minimization of a non-differentiable, strictly convex cost function. We present two iterative algorithms.¹ The first algorithm models x in (1) as having a sparse derivative and is derived using the majorization-minimization (MM) principle. The second algorithm models x in (1) as having a sparse derivative, or being sparse itself, or both. This algorithm is derived using the alternating direction method of multipliers (ADMM). The second algorithm is more general and can be used in place of the first. However, as will be illustrated below (Sec. VII-C), in cases where the first algorithm is applicable, it is preferable to the second one because it converges faster and does not require a step-size parameter as the second algorithm does.

¹Software is available at http://eeweb.poly.edu/iselesni/lpftvd/

In addition, this paper explains how a suitable choice of discrete-time filter makes the proposed approach effective and computationally efficient. Namely, we describe the design and implementation of a zero-phase non-causal recursive filter for finite-length data, formulated in terms of banded matrices. We choose recursive filters for their computational efficiency in comparison with non-recursive filters, and the zero-phase property to eliminate phase distortion (phase/time offset issues). As the algorithms are intended primarily for batch-mode processing, the filters need not be causal. We cast the recursive discrete-time filter in terms of a matrix formulation so as to easily and accurately incorporate it into the optimization framework and because it facilitates the implementation of the filter on finite-length data. Furthermore, the formulation is such that all matrix operations in the devised algorithms involve only banded matrices, thereby exploiting the high computational efficiency of solvers for banded linear systems [65, Sect 2.4] and of sparse matrix multiplication.
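To illustrate why the banded formulation matters computationally, the sketch below solves a simple smoothing problem whose system matrix is pentadiagonal, using SciPy's banded Cholesky solver. It is only an illustration of the banded-system idea, not the zero-phase Butterworth filter developed in Sec. VI; the second-order difference matrix D and the value of lam are arbitrary choices made for this example.

    # Illustration of the banded-system idea (not the filter of Sec. VI):
    # solve (I + lam * D'D) g = y, where D is a banded second-order
    # difference matrix, so the system matrix is symmetric pentadiagonal.
    import numpy as np
    from scipy import sparse
    from scipy.linalg import solveh_banded

    def banded_smooth(y, lam=50.0):
        N = len(y)
        e = np.ones(N)
        D = sparse.spdiags([e, -2 * e, e], [0, 1, 2], N - 2, N)   # banded difference matrix
        A = sparse.identity(N) + lam * (D.T @ D)                  # pentadiagonal, positive definite
        ab = np.zeros((3, N))                                     # upper banded storage for solveh_banded
        ab[0, 2:] = A.diagonal(2)
        ab[1, 1:] = A.diagonal(1)
        ab[2, :] = A.diagonal(0)
        return solveh_banded(ab, y)                               # banded solve, O(N) work

Because only a fixed number of diagonals are stored and factored, the cost grows linearly with the signal length N, which is the kind of efficiency the banded formulation is intended to provide.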
The computational efficiency of the proposed algorithms also draws on recent developments in sparse-derivative signal denoising (i.e., total variation (TV) denoising [10], [19], [68]). In particular, we note that the exact solution to the 1D TV denoising problem can be calculated by fast constructive algorithms [27], [49]. The algorithms presented here draw on this and the ‘fused lasso signal approximator’ [40].

After Sec. II on preliminaries, Sec. III presents the formulation of the optimization problem for simultaneous low-pass filtering and sparse-signal denoising. Section IV derives an iterative algorithm for solving the optimization problem. Section V addresses the case where both the signal itself and its derivative are sparse. Section VI presents recursive discrete-time filters to be used in the algorithms. Section VII illustrates the proposed algorithms on data, including NIRS time series.

A related approach is that of Ref. [44], which addresses general linear inverse problems and involves both 1D signals and images. The work described in this paper can be differentiated from that of Ref. [44] by noting that this work: (1) utilizes LTI filtering, which provides a more convenient way to specify the frequency response of the smoothing operator, in comparison with Tikhonov regularization, (2) utilizes compound regularization (see Sec. II-C), and (3) explicitly exploits fast algorithms for banded systems.

Many papers have addressed the problem of filtering/denoising piecewise smooth signals, a class of signals that includes the signals taken up in this paper, i.e. y in (1). However, as noted in Ref. [71], much of the work on this topic explicitly or implicitly models the underlying signal of interest as being composed of smooth segments separated by discontinuities (or blurred discontinuities) [16], [33], [57]. This is particularly appropriate in image processing wherein distinct smooth regions correspond to distinct objects and discontinuities correspond to the edges of objects (e.g., one object occluding another) [43]. Under this model, smoothing across discontinuities should be avoided, to prevent blurring of edges. The signal model (1) taken up in this paper differs in an important way: it models the smooth behavior on the two sides of a discontinuity as being due to a common low-pass signal, i.e. f in (1). In contrast to most methods developed for processing piecewise smooth signals, the proposed method seeks to exploit the common smooth behavior on both sides of a discontinuity, as in Refs. [44], [70], [71].

The problem addressed in this paper is a type of sparsity-based denoising problem, and, as such, it is related to the general problem of sparse signal estimation. Many papers, especially over the last fifteen years, have addressed the problem of filtering/denoising signals, both 1D and multidimensional, using sparse representations via suitably chosen transforms (wavelet, etc.) [29], [53], [64].
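For reference, the 1D TV denoising problem and the ‘fused lasso signal approximator’ [40] mentioned above have the following standard forms (the notation λ, λ0, λ1 for the regularization parameters is ours, not necessarily that of the paper):

    TV denoising:
        \min_x \; \tfrac{1}{2}\sum_{n=0}^{N-1} \big(y(n) - x(n)\big)^2 + \lambda \sum_{n=1}^{N-1} \big|x(n) - x(n-1)\big|

    Fused lasso signal approximator:
        \min_x \; \tfrac{1}{2}\sum_{n=0}^{N-1} \big(y(n) - x(n)\big)^2 + \lambda_0 \sum_{n=0}^{N-1} |x(n)| + \lambda_1 \sum_{n=1}^{N-1} \big|x(n) - x(n-1)\big|

Setting λ0 = 0 reduces the second problem to the first, which is consistent with a single formulation covering x being sparse, having a sparse derivative, or both, as in the second algorithm described above.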