1290 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 32, NO. 7, JULY 2013

Highly Undersampled Magnetic Resonance Image Reconstruction Using Two-Level Bregman Method With Dictionary Updating

Qiegen Liu*, Shanshan Wang, Kun Yang, Jianhua Luo, Yuemin Zhu, and Dong Liang

Abstract—In recent years, the Bregman method (and the related augmented Lagrangian method) has been shown to be an efficient optimization technique for various inverse problems. In this paper, we propose a two-level Bregman method with dictionary updating for highly undersampled magnetic resonance (MR) image reconstruction. The outer-level Bregman iterative procedure enforces the sampled k-space data constraints, while the inner-level Bregman method is devoted to updating the dictionary and the sparse representation of small overlapping image patches, emphasizing local structure adaptively. A modified sparse coding stage and a simple dictionary updating stage applied in the inner minimization make the whole algorithm converge in a relatively small number of iterations, and enable accurate MR image reconstruction from highly undersampled k-space data. Experimental results on both simulated MR images and real MR data consistently demonstrate that the proposed algorithm can efficiently reconstruct MR images and presents advantages over the current state-of-the-art reconstruction approach.

I. INTRODUCTION

Magnetic resonance imaging (MRI) is an essential medical diagnostic tool which provides clinicians with important anatomical information in the absence of ionizing radiation. However, despite its superiority in obtaining high-resolution images and excellent depiction of soft tissues, MRI still has its own limitations; specifically, one property accompanying MRI is that its scanning time is linearly related to the amount of data it acquires. As reported in [1], increased scan duration may introduce potential issues such as physiological motion artifacts and patient discomfort. Therefore, it is necessary to reduce the acquisition time. On the other hand, reducing the acquisition time may result in quality degradation of MR images due to the undersampling, which compromises their diagnostic value.
Index Terms—Augmented Lagrangian, Bregman iterative method, dictionary updating, image reconstruction, magnetic resonance imaging (MRI), sparse representation.

Manuscript received December 15, 2012; revised March 18, 2013; accepted March 25, 2013. Date of publication April 02, 2013; date of current version June 26, 2013. This work was supported in part by the High Technology Research Development Plan of China under 2006AA020805, in part by the NSFC of China under 30670574 and 61262084, in part by Shanghai International Cooperation Grant under 06SR07109, in part by Region Rhone-Alpes of France under the project Mira Recherche 2008, in part by the joint project of the Chinese NSFC (under 30911130364) and the French ANR 2009 (under ANR-09-BLAN-0372-01), and in part by the China Scholarship Council under Grant 2011623084. Asterisk indicates corresponding author.

*Q. Liu is with the Department of Electronic Information Engineering, Nanchang University, Nanchang 330031, China (e-mail: [email protected]).
D. Liang is with the Paul C. Lauterbur Research Centre for Biomedical Imaging, Shenzhen Key Laboratory for MRI, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China (e-mail: [email protected]).
S. Wang is with the School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China, and with the Biomedical and Multimedia Information Technology (BMIT) Research Group, School of Information Technologies, The University of Sydney, NSW 2006, Australia (e-mail: sophiaw@it.usyd.edu.au).
K. Yang is with the Department of Electrical Computer Engineering, National University of Singapore, 117576 Singapore (e-mail: [email protected]).
J. Luo is with the College of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: [email protected]).
Y. Zhu is with CREATIS, CNRS UMR 5220, Inserm U630, INSA Lyon, University of Lyon 1, Lyon, France (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMI.2013.2256464

0278-0062/$31.00 © 2013 IEEE

In this sense, accurate reconstruction from highly undersampled k-space data is of great necessity for both quick MR image acquisition and clinical diagnosis. Compressed sensing (CS) theory, as a fundamental and newly developed methodology in the information society, has provided a crucial theoretical foundation for quick MR image acquisition. Specifically, the application of CS to MRI is known as CS-MRI [2]–[6].

The basis for CS to work is sparsity, namely that the image has a sparse representation in a certain domain. Normally, the transforms which allow the image to have a sparse representation are named sparsifying transforms. Total variation (TV) and the wavelet transform are two such transforms [1], [6]–[9] frequently employed in CS recovery problems. For instance, Lustig et al. [1] focused on MR image reconstruction with a TV penalty and the Daubechies wavelet transform. Trzasko et al. [8] proposed a homotopic $\ell_0$-minimization strategy, instead of $\ell_1$-minimization, to reconstruct the MR image. The work of [10] presented an edge guided compressive sensing reconstruction (EdgeCS) method, which alternately performs TV-based CS reconstruction and edge detection, with each step benefiting from the latest solution of the other. However, since the TV prior prefers cartoon-like images which are piecewise constant, it does not fully apply to MR images, which contain crucial details for clinical diagnosis. Bredies et al. thus introduced the total generalized variation (TGV) model for MRI problems [11], [12]. Unfortunately, although this has improved the reconstruction result, it is still a TV-based regularization, which can be considered as only forcing the reconstructed image to be sparse with respect to spatial differences.

Other analytically designed dictionaries, such as wavelets and shearlets, also have this intrinsic deficiency, i.e., a lack of adaptability to various images. We refer to the sparsity that a fixed, global transform confers on the image as global sparsity.

In addition to global sparsity, nonlocal similarity is another popular patch-based sparsity, which describes the resemblance of small image patches within an image. This property has been successfully applied in image denoising problems [13]–[15], and many authors have also incorporated this nonlocal information into CS recovery problems. For instance, Liang et al. [16] applied nonlocal total variation (NLTV) regularization to reduce the blocky effect introduced by TV regularization. This method replaces the conventional gradient functional used in TV with a weighted nonlocal gradient function and improves the signal-to-noise ratio of the reconstruction in parallel imaging. The work of [17] further incorporated a semi-nonlocal prior into the homotopic minimization for the reconstruction of breast MR images. Egiazarian et al. [18] proposed a recursive filtering procedure for CS image reconstruction: to excite the algorithm, random noise is injected into the unobserved part of the spectrum at each iteration, and then a spatially adaptive image denoising filter [in particular, the state-of-the-art block-matching and 3-D filtering (BM3D) [14]] is applied in the image domain to remove the noise and recover the detail information of the image. In addition, under the assumption that each image patch can be sparsely represented, the K-singular value decomposition (K-SVD) algorithm proposed by Elad et al. [15], [19] has also been used for MR image reconstruction [20]–[22]. In particular, Ravishankar et al. [22] proposed an outstanding two-step alternating method named DLMRI: in one step, the sparsifying dictionary is learned to obtain a sparse representation with noise and aliasing removed, while in the other step, the missing k-space data are restored and filled into the spectrum.

Since patch-based methodologies can capture local image features effectively and recover MR images robustly, their results generally improve upon those achieved by the global transform-based methods. Motivated by this, just as in [22], we prefer a patch-based adaptive dictionary learning algorithm to attack CS-MRI reconstruction. Nevertheless, two difficulties lie in these patch-based, highly nonlinear methods, namely the computational complexity and the sensitivity to initialization. This is typically serious in dictionary learning related problems [15], [22], [23]. In detail, although the DLMRI method [22] has designed an outstanding strategy (learning the intermediate dictionary with a fixed number of 10 iterations and using only part of the data samples extracted from the previous images), the two problems still exist in some sense, as will be discussed in the experiment section.

Considering the high computational load, it is advisable to explore and develop efficient iterative algorithms for the implementation of these methods. The Bregman iterative method is one of the outstanding iterative regularization schemes. It was originally developed for image denoising, where it alleviated a drawback of the classical TV algorithm proposed by Rudin et al. [24], and it was later proved to be very close to, or in some particular cases equivalent to, the well-known augmented Lagrangian (AL) scheme [25], [26]. As a promising iterative mechanism, its significance was soon demonstrated in image deblurring and MR image reconstruction problems [25], [27]–[33].

In this work, we exploit the strengths of both patch-based adaptive dictionaries and the Bregman iteration technique. The main contribution of this paper is the development of a fast and robust numerical algorithm, named two-level Bregman method with dictionary updating (TBMDU), for MR image reconstruction. The proposed algorithm consists of a two-level solver employing the Bregman technique: one level estimates the recovered image, and the other calculates the dictionary and the sparse coefficients of the image patches. A modified strategy is applied to the sparse coding step of the inner minimization, enabling the efficiency of the whole algorithm.

II. BACKGROUND AND RELATED WORK

A. CS-MRI With Various Regularizers

With CS theory intensively studied [2], [3] and successfully applied in practical problems, it has become well known that images can be reconstructed from very few linear measurements as long as the image has a sparse representation. To explore the sparsity inherent in MR images, researchers often apply sparsifying transforms to convert them into a representation with few values significantly different from zero. Table I is a brief summary of some commonly used sparsity-promoting regularization terms [denoted by $J(u)$]. More generally, if a sparsifying transform $\Psi$ is defined, the model ideally assumes that the sparsest representation of an image under $\Psi$ is a good estimation of the image, provided it matches the available observed data. Mathematically, this model can be described as

$$\min_{u} \|\Psi u\|_0 \quad \text{s.t.} \quad F_p u = f \tag{1}$$

where $u$ denotes the original image, $\|\cdot\|_0$ counts the number of nonzero elements in the vector, $F_p$ represents the partially sampled Fourier encoding matrix, and $f$ is the raw measurement data in k-space. $J(u) = \|\Psi u\|_0$ is the sparsity-promoting regularization criterion, subject to the data consistency $F_p u = f$. If the channels of MRI are contaminated with white Gaussian noise, the minimization problem in (1) changes into

$$\min_{u} \|\Psi u\|_0 \quad \text{s.t.} \quad \|F_p u - f\|_2^2 \le \sigma^2 \tag{2}$$

where $\sigma$ is the standard deviation of the zero-mean complex Gaussian noise added to the measured k-space samples. The constrained problem (2) can be converted into an analogous unconstrained Lagrangian form

$$\min_{u} \|\Psi u\|_0 + \frac{\lambda}{2}\|F_p u - f\|_2^2 \tag{3}$$

with $\lambda$ denoting the Lagrangian multiplier. There are still some differences between (2) and (3). As declared in [34], compared to (2), (3) possesses a more intuitive Bayesian interpretation and is usually easier to solve. On the other hand, (2) has a more straightforward interpretation of the parameter $\sigma$, which is the noise standard deviation.

TABLE I COMMONLY USED REGULARIZATION TERMS
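To make the sampling model in (1)–(3) concrete, the following numpy sketch (ours, not the authors' code) builds a partially sampled Fourier operator $F_p$ and its adjoint, simulates measurements $f$ on a toy image, and evaluates the Lagrangian objective of (3) with the identity as a stand-in sparsifying transform.

```python
import numpy as np

def fp(u, mask):
    """Partially sampled Fourier encoding F_p u: FFT, then keep sampled locations."""
    return np.fft.fft2(u, norm="ortho") * mask

def fp_adj(f, mask):
    """Adjoint F_p^H f: zero-fill unsampled locations, then inverse FFT."""
    return np.fft.ifft2(f * mask, norm="ortho")

rng = np.random.default_rng(0)
u_true = rng.random((64, 64))            # stand-in for an MR image
mask = rng.random((64, 64)) < 0.3        # roughly 3.3-fold random undersampling
f = fp(u_true, mask)                     # measured (undersampled) k-space data

u_zf = fp_adj(f, mask)                   # zero-filled Fourier estimate
# Lagrangian objective of (3) with Psi = identity, an assumption for illustration
lam = 1.0
obj = np.sum(np.abs(u_zf)) + lam / 2 * np.linalg.norm(fp(u_zf, mask) - f) ** 2
```

Note that the zero-filled estimate is already data-consistent ($F_p u_{zf} = f$); what it lacks is the regularization that removes aliasing, which is exactly what the remainder of the paper supplies.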

Furthermore, it is much easier to set the value of $\sigma$ than that of the parameter $\lambda$. As for the practical implementation, the $\ell_1$-norm is usually employed to relax the $\ell_0$-norm due to the computational intractability of the latter. Therefore, our paper also focuses on the constrained problems (1) and (2) with the convex $\ell_1$-norm.

Essentially, MR images contain a rich variety of local structural patterns, which cannot be well represented using only one fixed transform or basis. Therefore, TV models and many types of wavelets will introduce various faults in the reconstruction. In this work, we prefer to employ the sparse representation model of image patches with an adaptive dictionary as the regularizer, because of its strong ability to preserve fine image structure. For computational efficacy, we resort to the Bregman iterative method, since it possesses some wonderful properties such as simplicity, efficiency, and stability.

B. Bregman Iterative Method for Sparse Representation

The Bregman method was originally developed to solve the following problem:

$$\min_{u} J(u) \quad \text{s.t.} \quad H(u) = 0 \tag{4}$$

by starting with the concept of the Bregman distance, which for the single-variable case is

$$D_{J}^{p}(u, v) = J(u) - J(v) - \langle p, u - v\rangle \tag{5}$$

where $J$ and $H$ are convex functions, $H$ is differentiable, and $p$ is a subgradient of the function $J$ at the point $v$. The advantage of Bregman iteration is to transform the constrained problem (4) into a sequence of unconstrained subproblems

$$u^{k+1} = \arg\min_{u} D_{J}^{p^{k}}(u, u^{k}) + H(u), \qquad p^{k+1} = p^{k} - \nabla H(u^{k+1}) \tag{6}$$

In the case of (1) and (2), $H$ can be approximately formulated as $H(u) = \frac{\lambda}{2}\|F_p u - f\|_2^2$. After applying the Bregman iterations and letting $f^{0} = f$, two essential updating rules can be obtained

$$u^{k+1} = \arg\min_{u} J(u) + \frac{\lambda}{2}\|F_p u - f^{k}\|_2^2, \qquad f^{k+1} = f^{k} + f - F_p u^{k+1} \tag{7}$$

where $\lambda$ is a weighting parameter. For more details, readers can refer to [25], [27]–[32]. A merit of the Bregman iteration is that the residual sequence $\|F_p u^{k} - f\|_2$ generated by (7) converges to zero monotonically. Furthermore, after substituting (5) into (7), (7) has another equivalent formulation

$$u^{k+1} = \arg\min_{u} J(u) - \langle p^{k}, u - u^{k}\rangle + \frac{\lambda}{2}\|F_p u - f\|_2^2 \tag{8}$$

with $p^{k+1} = p^{k} - \lambda F_p^{H}(F_p u^{k+1} - f)$ and $p^{0} = 0$. This method has been widely applied in sparse signal and image recovery problems with state-of-the-art results achieved [25], [27]–[31]. For example, in [31], the authors solve the following optimization problem of MR image reconstruction from sparse radial samples

$$\min_{u} \mathrm{TV}(u) + \mu\|\Psi u\|_1 \quad \text{s.t.} \quad F_p u = f \tag{9}$$

where $\mathrm{TV}(u)$ denotes the bounded variation of the image calculated by finite differences, $\Psi$ denotes the wavelet transform, and $\mu$ is the weighting parameter for the second regularization term. The Bregman iterative method is employed in the following way: a nonlinear conjugate gradient (CG) descent algorithm [1] is first applied to solve the minimization subproblem in (9), and then the two-step alternating procedure in (9) is implemented until a stop criterion is satisfied.
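The "add back the residual" rules in (7) can be sketched in a few lines of numpy. The inner solver below is deliberately crude (plain zero-filling, our choice purely to exercise the loop); in the papers cited above, it would be a CG or shrinkage-based regularized solver.

```python
import numpy as np

def bregman_recon(f, mask, solve_inner, n_outer=5):
    """Outer Bregman iteration in the form of (7): repeatedly solve the
    regularized subproblem with residual-corrected data f_k, then add the
    k-space residual back.  `solve_inner` stands for any inner solver."""
    f_k = f.copy()
    u = np.zeros(f.shape, dtype=complex)
    for _ in range(n_outer):
        u = solve_inner(f_k, mask)                                    # u^{k+1}
        f_k = f_k + (f - np.fft.fft2(u, norm="ortho") * mask)         # f^{k+1} = f^k + f - F_p u^{k+1}
    return u

# toy inner "solver" (zero-filling), standing in for a real regularized step
zero_fill = lambda f_k, mask: np.fft.ifft2(f_k * mask, norm="ortho")
```

With a genuine regularized inner solver, the loop realizes the monotone decrease of the residual $\|F_p u^k - f\|_2$ noted above.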

III. TBMDU

A. The Image Recovery Model

As discussed and summarized above, dictionary learning possesses a strong capability for preserving fine structures and details in image recovery problems. Additionally, due to its adaptability to various image contents, it is an ideal model to attack such problems. Consider $L$ image patches, which are extracted from an image $u$ and form a set $X = [x_1, \ldots, x_L]$. Dictionary learning [15], [19], [20] assumes that each image patch $x_l$ of size $\sqrt{n}\times\sqrt{n}$, formulated in vector form [19], can be sparsely represented by a learned dictionary $D$, i.e., $x_l \approx D\alpha_l$. With all the patches considered, the model can be written as

$$\min_{D,\Gamma} \sum_{l=1}^{L} \Big( \nu\|\alpha_l\|_1 + \frac{\mu}{2}\|D\alpha_l - x_l\|_2^2 \Big) \tag{10}$$

where $\Gamma = [\alpha_1, \ldots, \alpha_L]$ is the collection of optimal sparse coefficients and $D = [d_1, \ldots, d_K]$ is the trained dictionary with its atoms denoted by $d_m$. With (10) employed as the regularization term, i.e., in (8), the following objective function can be gained:

$$\min_{u,D,\Gamma} \sum_{l=1}^{L} \Big( \nu\|\alpha_l\|_1 + \frac{\mu}{2}\|D\alpha_l - R_l u\|_2^2 \Big) + \frac{\lambda}{2}\|F_p u - f\|_2^2 \tag{11}$$

where $R_l$ denotes the operator extracting the $l$-th patch, i.e., $x_l = R_l u$, and $\nu$ determines the sparse level of the image patches in the "optimal" dictionary. The number of atoms is $K = \rho n$, where $\rho$ measures the degree of overcompleteness of the dictionary. At this point, the target model is formulated and presented in (11).

B. Bregman Technique for Solving Subproblem (10)

Clearly, one key procedure in the model (11) is to solve subproblem (10); in other words, it is crucial to efficiently update the dictionary $D$ and learn the coefficient matrix $\Gamma$. This section starts with a review of our previous work on dictionary learning. Right afterwards, we present in detail the training of the dictionary and the sparse coding for updating $\Gamma$.

1) Our Previous Work on Dictionary Learning: As stated in the early literature [15], [23], [35], the computational complexity and the sensitivity to initialization are two issues in the dictionary learning technique because of its implicitly highly nonlinear and nonconvex nature. To overcome these drawbacks, we have developed a class of Bregman iterative/AL based dictionary learning methods [36], [37] to solve the approximately equivalent dictionary learning model

$$\min_{D,\Gamma} \sum_{l=1}^{L} \Big( \nu\|\alpha_l\|_1 + \frac{\mu}{2}\|D\alpha_l - x_l\|_2^2 \Big) \tag{12}$$

The work of [36] presents a predual dictionary learning (PDL) method which extends the predual proximal point algorithm (PPPA) by updating the dictionary via a gradient descent after each inner minimization step of PPPA. Theoretical analysis illustrates that PDL possesses an excellent iterative property which is beneficial to dictionary learning and image recovery. However, PDL only applies to dictionary learning problems where $\Gamma$ is nonnegative. The AL based multi-scale dictionary learning (ALM-DL) method proposed in [37], on the other hand, applies to such problems directly. Both methods fall into the "Bregman iteration/augmented Lagrangian framework," which is subject to the bilinear constraint $z_l = D\alpha_l$, with $z_l$ denoting an auxiliary variable. In this paper, we further investigate the idea inherent in these two approaches and develop a new technique to solve the model (10), which is a subproblem in the sparse representation model (11). Therefore, the proposed method TBMDU, addressing the general inverse problem of MR image reconstruction, can be considered an extension of the ideas employed in [36] and [37].

2) Dictionary Updating: For simple notation and clear explanation, we utilize $x_l$ to denote the image patch $R_l u$. Furthermore, the problem in (10) is decoupled into $L$ distinct problems, and an auxiliary variable $z_l$ is introduced to convert the unconstrained problem (10) into the constrained formulation

$$\min_{D,\Gamma,Z} \sum_{l=1}^{L} \Big( \nu\|\alpha_l\|_1 + \frac{\mu}{2}\|z_l - x_l\|_2^2 \Big) \quad \text{s.t.} \quad z_l = D\alpha_l \tag{13}$$

Then the split Bregman method, which is equivalent to the AL method [25], [26], is used to solve the problem (13), namely

$$(\alpha_l^{j+1}, z_l^{j+1}) = \arg\min_{\alpha_l, z_l}\ \nu\|\alpha_l\|_1 + \frac{\mu}{2}\|z_l - x_l\|_2^2 + \frac{\beta}{2}\|z_l - D\alpha_l - b_l^{j}\|_2^2 \tag{14}$$

and

$$b_l^{j+1} = b_l^{j} - \big(z_l^{j+1} - D\alpha_l^{j+1}\big) \tag{15}$$

where $b_l$ denotes the Bregman (dual) variable and $\beta$ is the penalty parameter. In conventional dictionary learning approaches, the dictionary is usually updated after the optima of the sparse coefficients are achieved, and the whole learning procedure is carried out iteratively until certain stop criteria are satisfied. In contrast, here we update the dictionary after each inner iteration of (14) and (15). In order to update the dictionary, we have to consider all samples. By taking the derivative of the functional $\frac{\beta}{2}\sum_{l}\|z_l - D\alpha_l - b_l\|_2^2$ with respect to $D$, we obtain the following update rule

$$D^{t+1} = D^{t} + \eta\beta\sum_{l=1}^{L}\big(z_l - D^{t}\alpha_l - b_l\big)\alpha_l^{H} \tag{16}$$

where $\eta$ is the step size. After the gradient descent updating, the columns of the designed dictionary ($d_m$, $m = 1, \ldots, K$) are additionally constrained to be of unit norm so as to avoid the scaling ambiguity [19]. One property that should be noted is that each dictionary update can be considered a refinement operation. For more information, please refer to our previous papers [36], [37].

3) Modified Strategy for Sparse Coding: In this subsection, we focus on the inner problem, i.e., sparse coding with respect to the variable $\alpha_l$, where $j$ is used to denote the inner iteration number. In accordance with the index above, $t$ is used to denote the iteration number for updating the dictionary in the outer iteration of dictionary learning. When $j$ and $t$ appear as superscripts at the same time, the variable is updated in both the inner and outer iterations of dictionary learning. Firstly, the minimization in (14) with respect to $z_l$ is computed analytically as an expression of $\alpha_l$

$$z_l^{j+1} = \frac{\mu x_l + \beta\big(D\alpha_l^{j} + b_l^{j}\big)}{\mu + \beta} \tag{17}$$

where $z_l$ is implicitly updated as an expression of $\alpha_l$.

Furthermore, we take another trick: $z_l$ is replaced by its latest state in each inner iteration of the iterative shrinkage/thresholding algorithm (ISTA) [25] with regard to $\alpha_l$, as follows:

$$\alpha_l^{j+1} = \mathrm{shrink}\Big(\alpha_l^{j} - \tau\beta D^{H}\big(D\alpha_l^{j} + b_l^{j} - z_l^{j+1}\big),\ \tau\nu\Big) \tag{18}$$

where $z_l^{j+1}$ denotes the update of the variable $z_l$ in the inner loop, derived from the analytical expression (17), and $\mathrm{shrink}(x, t) = \mathrm{sign}(x)\max(|x| - t, 0)$ is the soft-thresholding operator. Likewise, with $z_l$ eliminated, we attain the minimization functional of $\alpha_l$

$$\min_{\alpha_l}\ \nu\|\alpha_l\|_1 + \frac{c}{2}\|D\alpha_l + b_l^{j} - x_l\|_2^2, \qquad c = \frac{\mu\beta}{\mu + \beta}.$$

Now the least squares solution of $\alpha_l$ (ignoring the $\ell_1$ term) can be obtained as

$$\alpha_l^{LS} = \big(D^{H}D\big)^{-1}D^{H}\big(x_l - b_l^{j}\big) \tag{19}$$

Then, following the ISTA algorithm and (17), we can obtain the solution of $\alpha_l$

$$\alpha_l^{j+1} = \mathrm{shrink}\Big(\alpha_l^{j} - \tau c\, D^{H}\big(D\alpha_l^{j} + b_l^{j} - x_l\big),\ \tau\nu\Big) \tag{20}$$

where the step size $\tau$ is normalized such that $\tau \le 1/(c\|D\|_2^2)$.

In summary, the proposed Bregman-based method consists of a two-level nested loop. The outer loop updates the dual variables and the dictionary, while the inner loop minimizes the primal variables and, in the meantime, secures the accuracy of the algorithm. In Algorithm 1, the method is summarized and presented in matrix form, where $Z$ and $B$ are respectively the collections of $z_l$ and $b_l$.
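The ISTA-style sparse-coding step in (18)–(20) reduces, for one patch, to gradient steps on the quadratic term followed by soft-thresholding. The sketch below implements plain ISTA on $\nu\|\alpha\|_1 + \frac{1}{2}\|D\alpha - x\|_2^2$ (the weight $c$ folded into the data term); it is an illustration of the standard building block, not the paper's exact modified scheme.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding (shrink operator), valid for complex input."""
    mag = np.abs(x)
    return np.where(mag > t, (1 - t / np.maximum(mag, 1e-12)) * x, 0)

def ista_sparse_code(D, x, nu=0.1, n_iter=100):
    """ISTA on  nu*||a||_1 + 0.5*||D a - x||^2 :
    a <- shrink(a - tau * D^H (D a - x), tau * nu)."""
    tau = 1.0 / np.linalg.norm(D, 2) ** 2     # step size 1/L with L = ||D||_2^2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = soft(a - tau * (D.conj().T @ (D @ a - x)), tau * nu)
    return a
```

Because each iteration is a monotone descent step, the objective never rises above its value at the zero initialization, which mirrors the stability property exploited by the inner loop.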

C. Bregman Technique for MR Image Reconstruction

This section returns to the topic of MR image reconstruction in (11). Recall that the Bregman technique applied in Section III-B can be naturally incorporated into (11). Consequently, the image variable $u$ can be obtained by tackling the following minimization:

$$\min_{u,D,\Gamma} \sum_{l=1}^{L}\Big(\nu\|\alpha_l\|_1 + \frac{\mu}{2}\|D\alpha_l + b_l - R_l u\|_2^2\Big) + \frac{\lambda}{2}\|F_p u - f^{k}\|_2^2 \tag{21}$$

As presented in Section III-B, for updating $D$, $\Gamma$, and $B$, by eliminating the constant variables we gain a new update functional of $u$. Furthermore, in order to reach faster convergence, the variable $u$ is updated at each inner iteration of the Bregman iterative process, and it yields

$$u^{j+1} = \arg\min_{u}\ \frac{\mu}{2}\sum_{l=1}^{L}\|D\alpha_l^{j+1} + b_l^{j} - R_l u\|_2^2 + \frac{\lambda}{2}\|F_p u - f^{k}\|_2^2 \tag{22}$$

Now the least squares solution of $u$ can be obtained from the normal equations

$$\Big(\mu\sum_{l} R_l^{T}R_l + \lambda F_p^{H}F_p\Big)u = \mu\sum_{l} R_l^{T}\big(D\alpha_l^{j+1} + b_l^{j}\big) + \lambda F_p^{H}f^{k} \tag{23}$$

Let $F$ denote the full Fourier encoding matrix, normalized such that $F^{H}F = I$, so that $F u$ represents the full k-space data; we can substitute it into (23) to obtain the following update rule:

$$\Big(\mu F\sum_{l} R_l^{T}R_l F^{H} + \lambda F F_p^{H}F_p F^{H}\Big)F u = F\Big(\mu\sum_{l} R_l^{T}\big(D\alpha_l^{j+1} + b_l^{j}\big) + \lambda F_p^{H}f^{k}\Big) \tag{24}$$

In our formulation, we use periodically positioned, overlapping 2-D image patches, which are assumed to "wrap around" at image boundaries. Pixels near the right and bottom image boundaries can also constitute the top-left corners of patches; the patches in this scenario begin near the image boundary and wrap around to the opposite side of the image. When the image patches are sampled, the overlapping stride $r$ is defined as the distance in pixels between corresponding pixel locations in adjacent image patches. Under the wrap-around condition, $\sum_{l} R_l^{T}R_l = (n/r^2)I$. Similarly as described in [22], the matrix $F F_p^{H}F_p F^{H}$ is a diagonal matrix consisting of ones and zeros; the ones are at those diagonal entries that correspond to sampled locations in k-space, and $\Omega$ represents the subset of k-space that has been sampled from the Fourier measurements. Let $\gamma = \mu n/r^{2}$ for short; the matrix premultiplying $F u$ in (24) then becomes diagonal and trivially invertible. We average the patch results and then transform them to the Fourier domain

$$S_0 = F\Big(\frac{r^{2}}{n}\sum_{l} R_l^{T}\big(D\alpha_l^{j+1} + b_l^{j}\big)\Big) \tag{25}$$

To sum up, the solution of (24) is

$$F u^{j+1}(k_x, k_y) = \begin{cases} S_0(k_x, k_y), & (k_x, k_y)\notin\Omega \\[4pt] \dfrac{\gamma S_0(k_x, k_y) + \lambda f^{k}(k_x, k_y)}{\gamma + \lambda}, & (k_x, k_y)\in\Omega \end{cases} \tag{26}$$

where $F u^{j+1}(k_x, k_y)$ represents the updated value at location $(k_x, k_y)$ and $\Omega$ stands for the subset of k-space that has been sampled.
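The k-space blend in (25)–(26) is straightforward to realize: keep the spectrum of the patch-averaged image at unsampled frequencies and take a weighted average with the measured data at sampled ones. A numpy sketch (the ratio argument plays the role of $\lambda/\gamma$; the name is ours):

```python
import numpy as np

def frequency_interpolation(u_patch, f, mask, lam_over_gamma=10.0):
    """K-space update in the spirit of (25)-(26): at unsampled frequencies keep
    the patch-averaged estimate; at sampled ones blend it with the data f."""
    S0 = np.fft.fft2(u_patch, norm="ortho")        # spectrum of patch-averaged image
    S = S0.copy()
    S[mask] = (S0[mask] + lam_over_gamma * f[mask]) / (1 + lam_over_gamma)
    return np.fft.ifft2(S, norm="ortho")           # updated image u^{j+1}
```

As the weight on the data term grows, the sampled entries are pinned to the measurements, recovering exact data consistency in the noiseless limit.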

Similarly to [22], we name (25) and (26) the frequency interpolation step. The update of $u$ is then specified by substituting the value of (26) into the inverse Fourier transform.

D. The Summary for TBMDU

Up to this point, the two-level structure of the proposed TBMDU has been presented. We summarize the main steps of TBMDU in Algorithm 2. In the outer loop, the value of $f^{k}$ is progressively updated to obtain the minimum of (1) or (2). The inner loop employs the two-step approach, iteratively attaining sparse representations of the image patches and updating (25) and (26), to minimize the penalty function for the given value of $f^{k}$. In particular, we utilize the simple iterative solver to adaptively attain the sparse representation of the image patches. For the implementation of TBMDU, we initialize with the zero-filled Fourier estimate $u^{0} = F_p^{H}f$.

The proposed method involves four parameters: $\lambda$, $\beta$, $\nu$, and $\eta$. The first two are positive parameters for Bregman/AL iteration algorithms. As analyzed in [27] and [32], the values of $\lambda$ and $\beta$ have little effect on the final reconstruction quality provided they are sufficiently small. Furthermore, it should be noted that the smaller the Bregman parameters are, the more iterations are needed to reach the stop criteria. $\nu$ stands for the sparse level of the image patches and can be determined empirically. As for the step size $\eta$ for updating the dictionary, it can be set to a small positive number such as 0.01. The works of [25], [28], [29] demonstrate that the Bregman iteration/AL algorithm is guaranteed to converge when the dictionary is fixed, even for inexact minimization. If the dictionary is updated, the global solution of problem (1) or (2) may not be found due to the nonconvexity and nonlinearity of (1) and (2).

E. Computational Cost

As summarized in Algorithm 2, TBMDU mainly consists of alternating operations in the image domain and the k-space domain. Similarly to [22], for the update of the image $u$, the computational complexity is dominated by two FFTs. On the other hand, different from DLMRI, which uses K-SVD for dictionary learning, TBMDU employs our own developed dictionary learning algorithm [37]. The dictionary update of K-SVD uses the complicated SVD decomposition, while our method involves only simple matrix multiplications. With the patch size denoted by $n$, the number of image patches by $L$, and the number of atoms by $K$, the per-iteration costs of the sparse coding, dictionary updating, and patch averaging stages each scale with products of these quantities, and the total computational cost of TBMDU additionally scales with the numbers of inner and outer iterations. Please refer to [22] for the computational cost of DLMRI.

Fig. 1. Convergence property of the proposed TBMDU. (a) Pseudo radial sampling with 7.11-fold undersampling. (b)–(d) PSNR, HFEN, and function value versus the number of iterations.

For a better illustration, we present in Fig. 1 the result of reconstructing an axial T2-weighted brain image employing pseudo radial sampling at a 7.11-fold undersampling ratio. TBMDU was implemented with its own recommended parameter settings, i.e., its default patch size, dictionary overcompleteness, and patch overlap. We display the peak signal-to-noise ratio (PSNR) and high-frequency error norm (HFEN) [22] values as functions of the outer iteration number in Fig. 1(b) and (c), from which it can be observed that both measures change quickly during the first few iterations. Similarly, the plot in Fig. 1(d) shows that our TBMDU has a notable ability to decrease the function value. Fig. 2 presents the computation time against the outer iteration number, where it can be observed that the average computation time per iteration is about 89.16 s.
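The two quality measures tracked in Fig. 1 can be computed as follows; the LoG kernel size (15) and width ($\sigma = 1.5$) follow the convention popularized with DLMRI and should be treated as assumptions here, as should all function names.

```python
import numpy as np

def psnr(ref, rec):
    """Peak signal-to-noise ratio in dB, with the peak taken as max |ref|."""
    mse = np.mean(np.abs(ref - rec) ** 2)
    return 10 * np.log10(np.max(np.abs(ref)) ** 2 / mse)

def hfen(ref, rec, size=15, sigma=1.5):
    """High-frequency error norm: l2 norm of the difference of the two images
    after Laplacian-of-Gaussian (LoG) filtering."""
    r = np.arange(size) - size // 2
    xx, yy = np.meshgrid(r, r)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    log = (xx ** 2 + yy ** 2 - 2 * sigma ** 2) / sigma ** 4 * g
    log -= log.mean()                       # zero-mean LoG kernel

    def filt(img):                          # FFT-based 'same'-size convolution
        s = (img.shape[0] + size - 1, img.shape[1] + size - 1)
        out = np.fft.ifft2(np.fft.fft2(img, s) * np.fft.fft2(log, s)).real
        k = size // 2
        return out[k:k + img.shape[0], k:k + img.shape[1]]

    return np.linalg.norm(filt(np.abs(ref)) - filt(np.abs(rec)))
```

PSNR rewards overall intensity fidelity, while HFEN isolates errors in edges and fine texture, which is why the two curves in Fig. 1 are reported together.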

IV. EXPERIMENTAL RESULTS

In this section, the performance of the proposed method is tested under a variety of sampling schemes and with different undersampling factors. The sampling schemes include 2-D random sampling [8], Cartesian sampling with random phase

Fig. 2. Plot of the computation time against the outer iteration number.

encodes (1-D random) [1], [8], and pseudo radial sampling [8], [22]. The MR images tested in the experiments are in vivo MR scans of size 512 × 512 (many of which are courtesy of [38] and were used in [22]). Following many prior works on CS-MRI [1], [7], [22], the CS data acquisition is simulated by subsampling the 2-D discrete Fourier transform of the MR images, except for the test with real frequency data. To justify the promising performance of our proposed method, we have compared TBMDU with the recently proposed DLMRI¹ [22], which has substantially outperformed other CS-MRI methods such as the method² proposed in [1] (known as LDP) and the classical zero-filling reconstruction. All algorithms were implemented in MATLAB 7.1 on a PC equipped with an AMD 2.31 GHz CPU and 3 GB RAM.

¹Available at https://netfiles.uiuc.edu/ravisha3/www/DLMRICODE.zip.
²Available at http://www.stanford.edu/ mlustig/.

For a fair comparison with the DLMRI method, we implement TBMDU with the same parameter settings for dictionary learning, namely the same patch size, overcompleteness of the dictionary, and patch overlap (and, correspondingly, the same number of data samples for a 512 × 512 image under the wrap-around condition). For the parameters not shared by the two methods, DLMRI is implemented with its built-in settings. As for TBMDU, the number of outer iterations is set according to the stopping rule, and the parameter controlling the sparse level is empirically chosen as 12. Real-valued and complex-valued dictionaries are respectively used for the experiments on real-valued and complex-valued images. Furthermore, we use both the peak signal-to-noise ratio (PSNR) and the high-frequency error norm (HFEN) [22] for a quantitative comparison of reconstruction results. It needs to be emphasized that, owing to the excellent ability in recovering fine details and the simplicity of the dictionary updating scheme, we only need to initialize the dictionary at the beginning of the algorithm; all the data samples are then used to update the dictionary at each iteration. This is an important distinction between our TBMDU method and other dictionary-related methods [20]–[22].

A. Impact of the Initial Dictionary

This section investigates how different initial dictionaries affect the efficacy of the proposed method in reconstructing MR images. To this end, three different initializations are respectively used, namely the redundant DCT matrix, a random matrix, and a singular value decomposition (SVD)-based matrix. The random matrix is obtained by randomly selecting image patches from the training data set, while the SVD-based initialization is formed by the left singular vectors of the training data. Both DLMRI and TBMDU are implemented to reconstruct a transverse slice of a noncontrast MR angiography (MRA) of the circle of Willis (COW) using Cartesian sampling at 4-fold undersampling. Fig. 3 presents the results produced by the two methods with the three initializations; their corresponding reconstruction error magnitudes are shown in Fig. 4. It can be seen that the final results gained by our TBMDU are almost the same regardless of initialization. Although both DLMRI and TBMDU present excellent performance in suppressing aliasing artifacts, our TBMDU has provided a better reconstruction of object edges (such as the vessels) and preserved finer texture information (such as the bottom-right of the reconstruction). Fig. 3(g)–(i) present a microscopic comparison between the reference image and the images reconstructed by DLMRI and TBMDU to highlight the differences. In general, our proposed method provides greater intensity fidelity to the image reconstructed from the full data.

Fig. 3. The reconstruction results of COW using Cartesian sampling at four-fold undersampling with different initializations: the results gained by DLMRI with (a) DCT, (b) random, and (c) SVD-based matrix as initializations; the results gained by the proposed TBMDU with (d) DCT, (e) random, and (f) SVD-based matrix as initializations; comparison at the red arrow-pointed regions: the original image (left) and the images reconstructed by DLMRI (middle) and TBMDU (right) with three different initializations (g) DCT, (h) random, and (i) SVD-based matrix.

Fig. 4. The reconstruction errors for different initializations: the errors introduced by DLMRI with (a) DCT, (b) random, and (c) SVD-based matrix as initializations; the errors introduced by the proposed TBMDU with (d) DCT, (e) random, and (f) SVD-based matrix as initializations.

Fig. 5. (a) PSNR and (b) HFEN versus the number of iterations of the DLMRI and our TBMDU methods, with random, DCT, and SVD as initial dictionaries, respectively.

Fig. 6. Reconstruction of an axial T2-weighted brain image at 7.11-fold undersampling. Reconstruction using TBMDU and DLMRI by employing (a), (d) the variable density 2-D random sampling, (b), (e) Cartesian sampling, and (c), (f) sampling the central k-space (Cartesian) phase encoding lines, respectively. Panels (g) and (h) are the PSNR and HFEN versus the number of iterations of DLMRI and TBMDU.
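The variable-density 2-D random sampling compared in Fig. 6 can be illustrated with a simple polynomial-density sketch. The masks actually used in our experiments are those of [22]; the density law and the `decay` value below are our own illustrative choices.

```python
import numpy as np

def variable_density_mask(shape, accel, decay=3.0, seed=0):
    # Illustrative variable-density 2-D random sampling mask: keeps roughly
    # 1/accel of the k-space locations, favoring low spatial frequencies;
    # `decay` controls how fast the sampling density falls off.
    rng = np.random.default_rng(seed)
    ny, nx = shape
    ky = (np.arange(ny) - ny // 2) / (ny / 2)
    kx = (np.arange(nx) - nx // 2) / (nx / 2)
    r = np.sqrt(ky[:, None] ** 2 + kx[None, :] ** 2)
    pdf = (1 - np.clip(r, 0, 1)) ** decay        # high density at the center
    pdf *= (ny * nx / accel) / pdf.sum()         # scale to the target rate
    return rng.random(shape) < np.clip(pdf, 0, 1)

mask = variable_density_mask((256, 256), accel=7.11)
# achieved undersampling factor: total locations / sampled locations
# (slightly above `accel` because the density is clipped at 1 near the center)
factor = mask.size / mask.sum()
```

A fully Cartesian scheme would instead select whole phase-encoding lines, which is what makes its aliasing coherent and harder to remove.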

To quantitatively measure the performance of both methods with different initializations, the variations of PSNR and HFEN versus the number of iterations are presented in Fig. 5. From Fig. 5(a), it can easily be observed that the results obtained by TBMDU are superior to those of DLMRI in all three cases. The final PSNR values of the DLMRI results with random, DCT, and SVD initializations are 35.68 dB, 35.73 dB, and 35.83 dB, respectively, while for TBMDU they are 36.91 dB, 36.89 dB, and 36.90 dB. The biggest relative difference in PSNR among the three DLMRI results is 0.15 dB, while ours is only 0.02 dB, which empirically indicates that our method is less sensitive to the initialization. From Fig. 5(b), it can be observed that TBMDU outperforms DLMRI in all three cases as well. The final HFEN values obtained by DLMRI with random, DCT, and SVD initializations are 1.8685, 1.8581, and 1.8294, respectively, while ours are 1.5651, 1.5702, and 1.5730. The performance gap for DLMRI in HFEN is 0.0391, while ours is much smaller, i.e., 0.0079.

B. Impact of Undersampling Schemes

This subsection evaluates the performance of TBMDU under different sampling trajectories and at various undersampling factors. In Fig. 6, we present the results of reconstructing an axial T2-weighted brain image at 7.11-fold undersampling with three different sampling schemes, namely the variable density 2-D random sampling, Cartesian sampling, and the sampling of central k-space (Cartesian) phase encoding lines [22]. For these three cases, the obtained PSNR values of DLMRI are 40.54 dB, 35.78 dB, and 31.12 dB, while ours are 44.01 dB, 37.41 dB, and 31.26 dB, respectively; our results thus surpass those of DLMRI by 3.47 dB, 1.63 dB, and 0.14 dB. The PSNR and HFEN curves plotted in Fig. 6(g) and (h) also indicate the superiority of our method over DLMRI. These examples show that the more incoherent the acquisition is, the better the reconstruction will be; hence the PSNRs obtained with random sampling schemes are higher than those with Cartesian sampling schemes, as shown in Fig. 6(a)–(e). On the other hand, when only the central k-space data are sampled, both DLMRI and TBMDU produce obscured results due to the complete absence of high-frequency information in the phase encoding direction.

In Fig. 7, the performance of DLMRI and TBMDU is illustrated at a range of undersampling factors, including 2.5, 4, 6, 8, 10, and 20, where zero-mean complex white Gaussian noise with a fixed standard deviation is added to the 2-D random sampled k-space. It should be noted that the sampling masks used are from the work of [22]. Since the number of outer iterations is determined by our stopping rule, it takes the values of 3, 3, 6, 7, 7, and 11 for the undersampling factors of 2.5, 4, 6, 8, 10, and

Fig. 7. Reconstruction results at 4-, 8-, and 20-fold undersampling obtained by (a)–(c) TBMDU and (d)–(f) DLMRI, respectively. (g) PSNR versus undersampling factor for the two methods. (h) and (i) Reconstruction error magnitudes for DLMRI and TBMDU at 20-fold undersampling, respectively.

20, respectively. For the subjective comparison, reconstruction results under 4-, 8-, and 20-fold undersampling are displayed in Fig. 7(a)–(f). For the quantitative comparison, the variation of PSNR as a function of the undersampling factor is shown in Fig. 7(g), from which we can observe that TBMDU performs better than DLMRI at all the tested k-space undersampling factors. In particular, at 20-fold undersampling, the magnitude image of the reconstruction error for TBMDU [Fig. 7(i)] presents fewer pixel errors and less residual structure than that of DLMRI [Fig. 7(h)].

C. Performance at Different Noise Levels

To investigate the sensitivity of DLMRI and our method to different levels of complex white Gaussian noise, both methods are applied to reconstruct a T2-weighted sagittal view of the lumbar spine under pseudo-radial sampling at 6.09-fold undersampling. Fig. 8 presents the reconstruction results of both methods at a variety of noise levels, where the noise is added to the k-space samples. Fig. 8(c) presents the PSNRs of the MR images recovered by DLMRI (blue curves) and TBMDU (red curves) over a sequence of noise standard deviations. At the lowest noise level considered, the PSNR of the image obtained by TBMDU reaches 39.49 dB while that obtained by DLMRI is only 38.22 dB; the gap between the two methods is thus most significant at low noise levels. Reconstruction results with noise standard deviations of 2, 8, and 10 are displayed in Fig. 8(d)–(i). It can be observed that the reconstruction by TBMDU is clearer and sharper than that by DLMRI, and is relatively devoid of aliasing artifacts. In particular, the skeletons in the top half of the TBMDU reconstruction appear less obscured than those in the DLMRI result. The magnitudes of the reconstruction errors for both methods at noise standard deviation 8, shown in Fig. 8(j) and (k), reveal that our method provides a more accurate reconstruction of image contrast and a sharper anatomical depiction.

Fig. 8. (a) Reference T2-weighted sagittal view of the lumbar spine. (b) Sampling mask in k-space with 6.09-fold undersampling. (c) PSNR versus noise level for TBMDU and DLMRI. Reconstruction results of the two methods with noise standard deviations 2, 8, and 10 are displayed in (d) and (g), (e) and (h), and (f) and (i), respectively. (j) and (k) Reconstruction error magnitudes for DLMRI and TBMDU with noise standard deviation 8, respectively.

D. Results on Complex-Valued Data

In Fig. 9, a Cartesian fast spin echo (FSE) sequence is used to acquire the T2-weighted k-space data of a brain. We utilize the randomly undersampled phase encodes of the 2-D FSE data provided online by Lustig to compare DLMRI with TBMDU. From Fig. 9(a) and (b), it can be seen that there is no large visual difference between the reconstructions by DLMRI and our method at 2.5-fold undersampling. When the phase encodes are undersampled further to 5-fold, the reconstructions of the two methods show some difference in visual quality: Fig. 9(f) shows that the visible aliasing artifacts along the phase encoding direction (horizontal in the image plane) are stronger in the DLMRI reconstruction than in the TBMDU reconstruction. To facilitate the observation, we also present the relative difference images between the images reconstructed at 2.5-fold and 5-fold undersampling in Fig. 9(g) and (h).

Fig. 9. Cartesian sampling at different undersampling ratios. (a), (b) Reconstruction using DLMRI and TBMDU at 2.5-fold undersampling, respectively. (c), (d) Reconstruction using DLMRI and TBMDU at 5-fold undersampling, respectively. (e) Close-up of the results in (a) and (b). (f) Close-up of the results in (c) and (d). (g) The relative difference image between (c) and (a). (h) The relative difference image between (d) and (b).

Fig. 10. Reconstruction comparison of a physical phantom MR image. (a) MRI of the physical phantom. (b), (c) Reconstruction using DLMRI and TBMDU at 2.5-fold undersampling, respectively. (d), (e) Reconstruction using DLMRI and TBMDU at 5-fold undersampling, respectively. (f), (g) Enlargements of (a), (d), and (e).

Fig. 11. Comparison of reconstructing a physical phantom MR image with added noise. (a) The reference physical phantom. (b), (c) Reconstructions using DLMRI and TBMDU at 5-fold undersampling, respectively.

To better demonstrate the superiority of our TBMDU over DLMRI in terms of PSNR and visual quality, we choose another test data set under the same experimental setting. Fig. 10 shows the comparison results for reconstructing an MRI physical phantom, which is often used to assess the resolution of an MRI system. Fig. 10(a) displays the fully-sampled reconstruction of the physical phantom, and Fig. 10(b) and (c) show the results of DLMRI and TBMDU at 2.5-fold undersampling. Although the PSNR of DLMRI is 33.67 dB while that of our result is 37.51 dB, the reconstructions do not show much visual difference. However, when the phase encodes are undersampled further to 5-fold, as shown in Fig. 10(d) and (e), the PSNRs for DLMRI and TBMDU are 18.66 and 26.82 dB, respectively. In this case, the reconstructions of the two methods display some differences in visual quality.
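The noise experiments in this and the preceding subsection add zero-mean complex white Gaussian noise directly to the sampled k-space locations, and the visual comparisons rely on relative difference images. Both steps can be sketched as follows; this is our own minimal version, in which the noise standard deviation is defined per complex sample (it may equally be defined per real/imaginary channel) and the peak normalization of the difference image is our assumption.

```python
import numpy as np

def add_kspace_noise(kspace, mask, sigma, seed=0):
    # Zero-mean complex white Gaussian noise with standard deviation `sigma`
    # per complex sample, added only where the sampling mask is nonzero.
    rng = np.random.default_rng(seed)
    noise = (sigma / np.sqrt(2)) * (rng.standard_normal(kspace.shape)
                                    + 1j * rng.standard_normal(kspace.shape))
    return kspace + noise * mask

def relative_difference(rec, ref):
    # Relative difference image between two magnitude reconstructions,
    # normalized by the peak of the reference image.
    return np.abs(np.abs(rec) - np.abs(ref)) / np.abs(ref).max()
```

Adding the noise only at sampled locations mirrors the acquisition model: unsampled k-space entries carry no measurement, and hence no measurement noise.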
(d), (e) Enlargements of (a), (b), and (c).

The reconstruction by TBMDU is clearer and sharper than that by DLMRI, and is relatively devoid of aliasing artifacts. In particular, the result of our method presents fewer ringing artifacts [see Fig. 10(f)], and the spot and circle in the bottom-left part of the TBMDU reconstruction appear less obscured than those in the DLMRI result [see Fig. 10(g)].

To investigate the noise sensitivity in the k-space of complex MR data, complex Gaussian noise is added to the k-space at 5-fold undersampling. In this case, the PSNRs of the DLMRI and TBMDU methods are 17.93 and 22.94 dB, respectively. The reconstruction results of the two methods are shown in Fig. 11, with enlargements presented in Fig. 11(d) and (e). As can be observed, the DLMRI estimate exhibits more oscillating artifacts than that of TBMDU.

V. DISCUSSION

Similar to the DLMRI method [22], this section evaluates the sensitivity of the proposed method to parameter settings by varying one parameter at a time while keeping the rest fixed at their nominal values, which have been defined in Section III-E. The brain image is used for the evaluation, with 2-D random sampling and 10-fold undersampling. The parameters including

the patch size and the overcompleteness of the dictionary are evaluated, and the results are shown in Fig. 12. Fig. 12(a) and (b) show that both PSNR and HFEN improve when the patch size increases from 25 to 81 or the redundancy factor goes up from 1 to 5. However, when the patch size is greater than 36 or the redundancy factor is bigger than 2, the amount of improvement is rather small. Considering that the dictionary updating step in our TBMDU method is very simple, we recommend a patch size of 36 and a redundancy factor of 2 for practical implementation, even though we conducted all the experiments in the previous section with the same settings for the shared parameters as DLMRI for the purpose of a fair comparison. Furthermore, Fig. 12(c) shows that the changes in PSNR and HFEN are very limited as the third parameter increases (from 6 to 96). This indicates that it is a very robust parameter that can be chosen empirically, similar to the corresponding parameter in DLMRI (see Table I). Finally, the behavior of TBMDU with respect to the two Bregman/AL parameters is very similar to that observed in the experiments conducted in [36], [37], and [39]; we refer the reader to these papers for more details.

Fig. 12. Parameter evaluation. PSNR and HFEN versus (a) the patch size, (b) the overcompleteness of the dictionary, and (c) the third parameter.

VI. CONCLUSION

Following the Bregman iterative framework proposed by Osher et al., in this paper we proposed a two-level Bregman method with dictionary updating (TBMDU) for MR image reconstruction. The method consists of a two-level Bregman strategy: one level is related to the CS data-fidelity term, and the other to the image patch-based coefficient matrix and dictionary. The whole algorithm converges in a small number of iterations by means of the accelerated sparse coding and simple dictionary updating. Various experimental results demonstrate the superior performance of the algorithm under a variety of sampling trajectories and k-space undersampling factors; it provides highly accurate reconstructions even for severely undersampled MR measurements. The proposed method is also very robust to the initial dictionary and the parameter settings. The proposed framework will be extended to partially parallel imaging in a forthcoming study.

ACKNOWLEDGMENT

The authors sincerely thank the anonymous referees for their valuable comments on this work. The authors would also like to thank Ravishankar et al. for sharing their experimental materials and source code.

REFERENCES

[1] M. Lustig, D. Donoho, and J. M. Pauly, "Sparse MRI: The application of compressed sensing for rapid MR imaging," Magn. Reson. Med., vol. 58, no. 6, pp. 1182–1195, 2007.
[2] E. J. Candès, J. K. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006.
[3] D. L. Donoho, "Compressed sensing," IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[4] J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.
[5] K. T. Block, M. Uecker, and J. Frahm, "Undersampled radial MRI with multiple coils. Iterative image reconstruction using a total variation constraint," Magn. Reson. Med., vol. 57, no. 6, pp. 1086–1098, 2007.
[6] M. Lustig, J. Santos, D. Donoho, and J. Pauly, "k-t SPARSE: High frame rate dynamic MRI exploiting spatio-temporal sparsity," in Proc. 13th Annu. Meet. ISMRM, Seattle, WA, 2006, p. 2420.
[7] S. Ma, W. Yin, Y. Zhang, and A. Chakraborty, "An efficient algorithm for compressed MR imaging using total variation and wavelets," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2008, pp. 1–8.
[8] J. Trzasko and A. Manduca, "Highly undersampled magnetic resonance image reconstruction via homotopic l0-minimization," IEEE Trans. Med. Imag., vol. 28, no. 1, pp. 106–121, Jan. 2009.
[9] M. Akçakaya, S. Nam, P. Hu, M. Moghari, L. Ngo, V. Tarokh, W. Manning, and R. Nezafat, "Compressed sensing with wavelet domain dependencies for coronary MRI: A retrospective study," IEEE Trans. Med. Imag., vol. 30, no. 5, pp. 1090–1099, May 2011.
[10] W. Guo and W. Yin, "EdgeCS: Edge guided compressive sensing reconstruction," CAAM, Rice Univ., Houston, TX, Rep. TR10-02, 2010.
[11] F. Knoll, K. Bredies, T. Pock, and R. Stollberger, "Second order total generalized variation (TGV) for MRI," Magn. Reson. Med., vol. 65, no. 2, pp. 480–491, 2011.
[12] K. Bredies, K. Kunisch, and T. Pock, "Total generalized variation," SIAM J. Imag. Sci., vol. 3, no. 3, pp. 492–526, 2010.
[13] A. Buades, B. Coll, and J. Morel, "A non-local algorithm for image denoising," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2005, vol. 2, pp. 60–65.
[14] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, "Image denoising by sparse 3-D transform-domain collaborative filtering," IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–2095, Aug. 2007.
[15] M. Elad and M. Aharon, "Image denoising via sparse and redundant representations over learned dictionaries," IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736–3745, Dec. 2006.
[16] D. Liang, H. Wang, Y. Chang, and L. Ying, "Sensitivity encoding reconstruction with nonlocal total variation regularization," Magn. Reson. Med., vol. 65, no. 5, pp. 1384–1392, 2011.
[17] A. Wong, A. Mishra, P. Fieguth, and D. Clausi, "Sparse reconstruction of breast MRI using homotopic minimization in a regional sparsified domain," IEEE Trans. Biomed. Eng., vol. 60, no. 3, pp. 743–752, Mar. 2010.
[18] K. Egiazarian, A. Foi, and V. Katkovnik, "Compressed sensing image reconstruction via recursive spatially adaptive filtering," in Proc. IEEE Int. Conf. Image Process., 2007, vol. 1, pp. I-549.
[19] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311–4322, 2006.

[20] Y. Chen, X. Ye, and F. Huang, "A novel method and fast algorithm for MR image reconstruction with significantly under-sampled data," Inverse Probl. Imag., vol. 4, no. 2, pp. 223–240, 2010.
[21] A. Bilgin, Y. Kim, F. Liu, and M. Nadar, "Dictionary design for compressed sensing MRI," in Proc. 18th Sci. Meet. ISMRM, Stockholm, Sweden, 2010, p. 4887.
[22] S. Ravishankar and Y. Bresler, "MR image reconstruction from highly undersampled k-space data by dictionary learning," IEEE Trans. Med. Imag., vol. 30, no. 5, pp. 1028–1041, May 2011.
[23] K. Skretting and K. Engan, "Recursive least squares dictionary learning algorithm," IEEE Trans. Signal Process., vol. 58, no. 4, pp. 2121–2130, Apr. 2010.
[24] L. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Phys. D: Nonlinear Phenom., vol. 60, no. 1, pp. 259–268, 1992.
[25] W. Yin, S. Osher, D. Goldfarb, and J. Darbon, "Bregman iterative algorithms for l1-minimization with applications to compressed sensing," SIAM J. Imag. Sci., vol. 1, no. 1, pp. 143–168, 2008.
[26] M. Afonso, J. Bioucas-Dias, and M. Figueiredo, "Fast image recovery using variable splitting and constrained optimization," IEEE Trans. Image Process., vol. 19, no. 9, pp. 2345–2356, Sep. 2010.
[27] S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin, "An iterative regularization method for total variation-based image restoration," SIAM Multiscale Model. Simul., vol. 4, no. 2, pp. 460–489, 2005.
[28] T. Goldstein and S. Osher, "The split Bregman method for L1-regularized problems," SIAM J. Imag. Sci., vol. 2, no. 2, pp. 323–343, 2009.
[29] X. Zhang, M. Burger, X. Bresson, and S. Osher, "Bregmanized nonlocal regularization for deconvolution and sparse reconstruction," SIAM J. Imag. Sci., vol. 3, no. 3, pp. 253–276, 2010.
[30] S. Ramani and J. Fessler, "Parallel MR image reconstruction using augmented Lagrangian methods," IEEE Trans. Med. Imag., vol. 30, no. 3, pp. 694–706, Mar. 2011.
[31] T. Chang, L. He, and T. Fang, "MR image reconstruction from sparse radial samples using Bregman iteration," in Proc. 13th Annu. Meet. ISMRM, Seattle, WA, 2006, p. 696.
[32] B. Liu, K. King, M. Steckner, J. Xie, J. Sheng, and L. Ying, "Regularized sensitivity encoding (SENSE) reconstruction using Bregman iterations," Magn. Reson. Med., vol. 61, no. 1, pp. 145–152, 2008.
[33] J. Aelterman, H. Luong, B. Goossens, A. Pizurica, and W. Philips, "Augmented Lagrangian based reconstruction of non-uniformly sub-Nyquist sampled MRI data," Signal Process., vol. 91, no. 12, pp. 2731–2742, 2011.
[34] M. Afonso, J. Bioucas-Dias, and M. Figueiredo, "An augmented Lagrangian approach to the constrained optimization formulation of imaging inverse problems," IEEE Trans. Image Process., vol. 20, no. 3, pp. 681–695, Mar. 2011.
[35] R. Otazo and D. Sodickson, "Adaptive compressed sensing MRI," in Proc. 18th Sci. Meet. ISMRM, Stockholm, Sweden, 2010, p. 4867.
[36] M. Yaghoobi, T. Blumensath, and M. Davies, "Dictionary learning for sparse approximations with the majorization method," IEEE Trans. Signal Process., vol. 57, no. 6, pp. 2178–2191, Jun. 2009.
[37] Q. Liu, S. Wang, and J. Luo, "A novel predual dictionary learning algorithm," J. Vis. Commun. Image Represent., vol. 23, no. 1, pp. 182–193, 2012.
[38] Q. Liu, J. Luo, S. Wang, M. Xiao, and M. Ye, "An augmented Lagrangian multi-scale dictionary learning algorithm," EURASIP J. Adv. Signal Process., vol. 2011, no. 1, pp. 1–16, 2011.
[39] Am. Radiol. Services [Online]. Available: http://www3.americanradiology.com/pls/web1/wwmain.home
[40] Q. Liu, S. Wang, J. Luo, Y. Zhu, and M. Ye, "An augmented Lagrangian approach to general dictionary learning for image denoising," J. Vis. Commun. Image Represent., vol. 23, no. 5, pp. 753–766, 2012.