A Fast SVD for Multilevel Block Hankel Matrices with Minimal Memory Storage

Numer Algor DOI 10.1007/s11075-014-9930-0 ORIGINAL PAPER A fast SVD for multilevel block Hankel matrices with minimal memory storage Ling Lu · Wei Xu · Sanzheng Qiao Received: 2 January 2014 / Accepted: 6 October 2014 © Springer Science+Business Media New York 2014 Abstract Motivated by the Cadzow filtering in seismic data processing, this paper presents a fast SVD method for multilevel block Hankel matrices. A seismic data presented as a multidimensional array is first transformed into a two dimensional multilevel block Hankel (MBH) matrix. Then the Lanczos process is applied to reduce the MBH matrix into a bidiagonal or tridiagonal matrix. Finally, the SVD of the reduced matrix is computed using the twisted factorization method. To achieve high efficiency, we propose a novel fast MBH matrix-vector multiplication method for the Lanczos process. In comparison with existing fast Hankel matrix-vector multiplication methods, our method applies 1-D, instead of multidimensional, FFT and requires minimum storage. Moreover, a partial SVD is performed on the reduced matrix, since complete SVD is not required by the Caszow filtering. Our numerical experiments show that our fast MBH matrix-vector multiplication method signif- icantly improves both the computational cost and storage requirement. Our fast MBH SVD algorithm is particularly efficient for large size multilevel block Hankel matrices. This work was supported by the Natural Science Foundation of China (Project No: 11101310) and Shanghai Key Laboratory of Contemporary Applied Mathematics of Fudan University. L. Lu · W. Xu () Department of Mathematics, Tongji University, Shanghai, 200092, People’s Republic of China e-mail: [email protected] L. Lu e-mail: [email protected] S. Qiao Department of Computing and Software, McMaster University, Hamilton, Ontario, L8S 4K1, Canada e-mail: [email protected] Numer Algor Keywords Multi-level Block Hankel Matrix · SVD · Cadzow filtering · Seismic data processing Mathematics Subject Classification (2010) 15B05 · 15A18 · 65F20 · 65F25 · 65F50 1 Introduction A matrix is called Hankel if its entries on each antidiagonal are equal. Hankel matrices arise in many applications, such as digital information processing, sys- tem theory and automatic control theory. In particular, in seismic data processing, Cadzow filtering [5] involves Hankel matrices. Denoising and completion (regular- ization) are possible by iteratively finding a low-rank approximation that honors the original observations. Cadzow filtering is one kind of rank-reduction methods which can be employed to attenuate noise and for prestack seismic data regulariza- tion. It is equivalent to time series analysis called multichannel singular spectrum analysis [4, 12]. The application of Cadzow filtering approach for random noise attenuation on seismic data started from the works of Ulrych et al. [24] on eigenimage filtering of seismic data. For one dimensional case, the Cadzow filtering consists of the following three steps: (1) Convert the original 1D data into the frequency domain and form a Han- kel matrix; (2) find a low-rank approximation of the Hankel matrix; (3) recover the signal by averaging the elements on each antidiagonal of the low-rank approximation and transforming into the time domain. Note that the Hankel structure can be destroyed during step (2) and restored by the averaging process at step (3). These two steps are performed only once to produce a Hankel matrix, which need not be the nearest approximation to the original Hankel matrix. There are some other structure-preserving low-rank approximation methods which guarantee both structure and “nearest” approximation [7, 15]. However, they require a number of iterations involving the computation of the singular value decomposition (SVD), introducing significant additional computational cost. Thus they are impractical for large data matrices in seismic data processing. In the absence of noise, the Hankel marix formed by seismic data is rank deficient [21, 22]. However, due to noise in the observed data, the rank of the data Hankel matrix is higher than it should be. Cadzow filtering removes the noise by reducing the rank of the Hankel matrix and then recovers trace values from the rank-reduced matrix. Thus, seismic data reconstruction and noise attenuation can be posed as a Hankel matrix rank-reduction problem. Trickett furthered Cadzow filtering by applying eigenimage filtering to 3D data frequency slices and later extended F-x Cadzow filtering to F-xy Cadzow filtering by forming a larger Hankel matrix of Hankel matrices (Level-2 Block Hankel matrix) in multiple spatial dimensions [21–23]. In 2013, Gao et al. [13] developed a rank reduction denoising and reconstruction scheme that is used to reconstruct prestack data that depend on four spatial dimensions by forming a Level-4 Block Toeplitz matrix. This is often called 5D interpolation because reconstruction algorithms operate on Numer Algor volumes that depend on four spatial dimensions and time or frequency. Falkovskiy et al. [9] introduced an additional dimension for composing the Hankel matrices. Instead of using only spatial dimensions for composing these matrices, they proposed to add a frequency dimension to create an extended matrix from a series of frequency slices. This Frequency Extension approach improved filter quality through better statistics by utilizing these multi-frequency slices. If using Cadzow filtering with Frequency Extension, we need to construct a Level-5 Block Hankel matrix. In practice, the seismic signal data are in dozens even hundreds of gigabytes, whose generated Hankel matrix size is usually in hundreds of thousands. Thus, fast and memory efficient matrix reduction methods are necessary. The most stable matrix rank-reduction method is the singular value decomposition (SVD). The subroutine ZGESDD in LAPACK, adopted by MATLAB function svd, computes the SVD of a general matrix. But neither the truncated classical SVD nor the randomized SVD algorithm [14, 16, 17] take advantage of the special structure of the Multilevel Block Hankel (MBH) matrix arising from Cadzow filtering methods. A general SVD method, like ZGESDD, is too expensive for large size matrices occured in seismic data processing. By exploiting the Hankel structure, Xu and Qiao [27] proposed a fast SVD algorithm for (level-1) Hankel matrices. The fast SVD algorithm for Hankel matrices in [27] consists of two stages. First, the Hankel matrix is bidiagonalized through the Lanczos method. When the Hankel matrix is square, it is tridiagonalized to maintain its symmetry. Second, the SVD of the bidiagonal or tridiagonal matrix is computed using the twisted factorization method in [2, 26]. It is well known that the computational cost of the Lanczos method is dominated by matrix-vector multiplications. Thus, efficient matrix-vector multiplication is crucial. Motivated by the seismic application, in this paper, we consider the Cadzow filtering for multidimensional data arrays and propose an efficient MBH matrix-vector multiplication in the Lanczos method. A multidimensional seismic data array is first transformed into a two-dimensional MBH matrix, then the 1-D FFT [1] is applied to perform a fast MBH matrix-vector multiplication. In contrast to other fast MBH matrix-vector multiplication methods employing multidimensional FFTs in [6, 13], our algorithm only requires 1-D FFT and minimal memory storage. Therefore, the main contribution of this article is our fast MBH matrix-vector multiplication with minimal memory storage requirement and its application to fast MBH SVD. In this paper, we focus on complex MBH matrices arising from the seismic data denoising. Our method is also applicable to real MBH matrices and multi-level block Teo- plitz matrices from other applications, for example, the vibration signal measured on ship or mechanical equipment[29] and discrete dipole approximation (DDA) from electromagnetic scattering [20]. The rest of the paper is organized as follows. In Section 2, we present an algorithm for constructing a two-dimensional MBH from a signal data array of multiple spatial dimensions. In Section 3, we propose a fast algorithm for the MBH matrix- vector multiplication. Then, based on the fast MBH matrix-vector multiplication, the Lanczos method and the twisted factorization is given in Section 4. In Section 5 we present our numerical experimental results on some random MBH matrices and the experiments on seismic signal data. Finally, we conclude this paper with some remarks. Numer Algor 2 Constructing multilevel block Hankel matrices In this section, we present our algorithm for constructing a two dimensional MBH matrix from a signal data array S of multiple spatial dimensions in frequency domain. The two-dimensional MBH matrix allows us to develop a fast MBH matrix-vector multiplication using 1-D FFT. For clarity, we present a recursive version of our algorithm. In the trivial case when the data signal array S is one dimensional, that is, an n- T by-1 complex column vector [s1 s2 ... sn] , representing an array of n traces along a constant-frequency slice. The Cadzow filtering for this signal model can be achieved by constructing the Hankel matrix [17]: ⎡ ⎤ s1 s2 ··· sn−p+1 ⎢ ⎥ ⎢ s2 s3 ··· sn−p+2 ⎥ ⎢ ⎥ H = ⎢ s3 s4 ··· sn−p+3 ⎥ . ⎢ . ⎥ (2.1) ⎣ . ··· . ⎦ sp sp+1 ··· sn Apparently, a Hankel matrix is determined by its first column and last row. In this paper, we set the Hankel matrix as near to square as possible. Specifically, in (2.1), we choose p =n/2+1, where a denotes the largest integer that is not greater than a. Thus H is either square when n is odd or p-by-(p − 1) when n is even. Then, T given a trace vector S =[s1 s2 ... sn] , the Hankel matrix H is uniquely determined. Moreover, a longer main diagonal is desirable in the denoising process, since the rank of a matrix is limited by the length of its main diagonal. Now, we consider the general case when S is a data signal array of k-dimensions.

Load more