A Fast and Accurate Matrix Completion Method Based on QR Decomposition and L2,1-Norm Minimization
Qing Liu, Franck Davoine, Jian Yang, Ying Cui, Jin Zhong, Fei Han

To cite this version:

Qing Liu, Franck Davoine, Jian Yang, Ying Cui, Jin Zhong, et al. A Fast and Accurate Matrix Completion Method based on QR Decomposition and L2,1-Norm Minimization. IEEE Transactions on Neural Networks and Learning Systems, IEEE, 2019, 30 (3), pp. 803-817. 10.1109/TNNLS.2018.2851957. hal-01927616

HAL Id: hal-01927616 https://hal.archives-ouvertes.fr/hal-01927616 Submitted on 20 Nov 2018


A Fast and Accurate Matrix Completion Method based on QR Decomposition and L2,1-Norm Minimization

Qing Liu, Franck Davoine, Jian Yang, Member, IEEE, Ying Cui, Zhong Jin, and Fei Han

Abstract—Low-rank matrix completion aims to recover matrices with missing entries and has attracted considerable attention from machine learning researchers. Most of the existing methods, such as weighted nuclear-norm-minimization-based methods and QR-decomposition-based methods, cannot provide both convergence accuracy and convergence speed. To investigate a fast and accurate completion method, an iterative QR-decomposition-based method is proposed for computing an approximate Singular Value Decomposition (CSVD-QR). This method can compute the largest r (r > 0) singular values of a matrix by iterative QR decomposition. Then, under the framework of matrix tri-factorization, a CSVD-QR-based L2,1-norm minimization method (LNM-QR) is proposed for fast matrix completion. Theoretical analysis shows that this QR-decomposition-based method can obtain the same optimal solution as a nuclear norm minimization method, i.e., the L2,1-norm of a submatrix can converge to its nuclear norm. Consequently, an LNM-QR-based iteratively reweighted L2,1-norm minimization method (IRLNM-QR) is proposed to improve the accuracy of LNM-QR. Theoretical analysis shows that IRLNM-QR is as accurate as an iteratively reweighted nuclear norm minimization method, which is much more accurate than the traditional QR-decomposition-based matrix completion methods. Experimental results obtained on both synthetic and real-world visual datasets show that our methods are much faster and more accurate than the state-of-the-art methods.

Index Terms—Matrix Completion, QR Decomposition, Approximate SVD, Iteratively Reweighted L2,1-Norm.

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. U1713208, 61672287, 61602244, 91420201, 61472187, and 61572241, in part by the Natural Science Foundation of Zhejiang Province (LQ18F030014), in part by the National Basic Research Program of China under Grant No. 2014CB349303, and in part by the Innovation Foundation from the Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education (JYB201706). This work was also carried out in the framework of the Labex MS2T, program Investments for the future, French ANR (Ref. ANR-11-IDEX-0004-02). (Corresponding author: Zhong Jin.)

Q. Liu is with the School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China, and with the School of Software, Nanyang Institute of Technology, Nanyang, 473004, China (e-mail: [email protected]).
F. Davoine is with Sorbonne Universités, Université de technologie de Compiègne, CNRS, Heudiasyc, UMR 7253, Compiègne, France (e-mail: [email protected]).
J. Yang and Z. Jin are with the School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China (e-mail: [email protected]; [email protected]).
Y. Cui is with the College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China (e-mail: [email protected]).
F. Han is with the School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu, 212013, China (e-mail: [email protected]).

I. INTRODUCTION

THE problem of recovering an incomplete matrix with missing values has recently attracted considerable attention from researchers in the image processing [1-10], signal processing [11-13], and machine learning [14-20] fields. Conventional methods formulate this task as a low-rank matrix minimization problem. Suppose that M (M ∈ R^{m×n}, m ≥ n > 0) is an incomplete matrix; then, the traditional low-rank minimization problem is formulated as follows:

min_X rank(X),  s.t.  X_{i,j} = M_{i,j}, (i, j) ∈ Ω,  (1)

where X ∈ R^{m×n} is the considered low-rank matrix, rank(X) is the rank of X, and Ω is the set of locations corresponding to the observed entries. The problem in Eq. (1) is NP-hard and is difficult to optimize. Fortunately, the missing values in a matrix can be accurately recovered by nuclear norm minimization under broad conditions [21, 22]. The most widely used methods based on the nuclear norm are singular value thresholding (SVT) [23] and the accelerated proximal gradient method [24]. These methods are not fast because of the high computational cost of singular value decomposition (SVD) iterations. Moreover, these methods are not very accurate when recovering matrices with complex structures. One of the reasons is that the nuclear norm may not be a good approximation of the rank function [28] in these cases.
To improve the accuracies of nuclear-norm-based methods, some improved methods based on the Schatten p-norm [25, 35, 36], weighted nuclear norm [27], γ-norm [33], and arctangent rank [34] have been proposed. In 2015, F. Nie et al. [25] proposed a joint Schatten p-norm and Lp-norm robust matrix completion method. This method can obtain a better convergence accuracy than that of SVT. However, it may become slow when addressing large-scale matrices because of using SVD in each iteration. C. Lu et al. proposed an iteratively reweighted nuclear norm minimization (IRNN) method [26] in 2016. By using nonconvex functions to update the weights for the singular values, IRNN is much more accurate than SVT. However, it still relies on SVD to obtain the singular values for recovering incomplete matrices, which may cause it to be slow when applied to real-world datasets. Some other methods in references [33] and [34] also face the same difficulty.

To improve the speed of SVT, some methods based on matrix factorization [13, 17, 18, 31] have recently been proposed. In 2013, a fast tri-factorization (FTF) method [32] based on QR decomposition [29, 30] was proposed.

FTF relies on the cheaper QR decomposition as a substitute for SVD to extract the orthogonal bases of the rows and columns of an incomplete matrix and applies SVD to a submatrix whose size can be set in advance. FTF is very fast when applied to low-rank data matrices. However, it will become slow if the test matrices are not of low rank. Moreover, the FTF method is not as accurate as a weighted-nuclear-norm-based method, such as IRNN [26]. A more recent work, i.e., the robust bilinear factorization (RBF) method [18], is slightly more accurate than FTF. However, it is still not fast and not accurate enough for real applications. Some other methods based on matrix factorization proposed in references [13], [17], and [31] also have similar characteristics. Thus, the traditional methods based on the weighted nuclear norm and matrix factorization cannot provide both convergence speed and convergence accuracy.

Recently, the L2,1-norm was successfully applied to feature selection [37, 38], optimal mean robust principal component analysis [53], and low-rank representation [39-42]. The feature selection methods and the method in [53] use a combination of the nuclear norm and the L2,1-norm as their loss function to extract the subspace structures of test datasets. Because they use the L2,1-norm, they are more robust with respect to outliers. However, they are still not fast because they use SVD to search for the optimal solutions. In low-rank representation, the outliers among data points can be removed by solving an L2,1-norm minimization problem, the optimal solution of which can be obtained without using SVD. However, an L2,1-norm-based matrix completion method does not exist. In general, developing a fast and accurate matrix completion method remains a significant open challenge.

In fact, the singular values and singular vectors can also be obtained by QR decomposition [51], which is much faster than SVD. Additionally, the L2,1-norm can be applied to fast matrix completion methods under the framework of matrix tri-factorization. Thus, this paper aims to propose a fast and accurate matrix completion method based on L2,1-norm minimization and QR decomposition to address the aforementioned open challenge. Our main contributions are as follows:

• A QR-decomposition-based method for computing an approximate SVD (CSVD-QR) is proposed. It can compute the largest r (r > 0) singular values of a given matrix.
• A CSVD-QR-based L2,1-norm minimization method (LNM-QR) is proposed for fast matrix completion. By using QR as a substitute for SVD, LNM-QR is much faster than the methods that utilize SVD.
• A CSVD-QR-based iteratively reweighted L2,1-norm minimization method (IRLNM-QR) is proposed to improve the accuracy of LNM-QR. A theoretical analysis shows that IRLNM-QR has the same optimal solution as that of the IRNN method, which is much more accurate than the traditional QR-decomposition-based matrix completion methods.
• The L2,1-norm of a matrix is proven to be the upper bound of its nuclear norm (in Section III.C). Thus, the proposed methods can also be applied to improve the performances of the nuclear-norm-based low-rank representation [42, 44], multi-view data analysis [45-47], and matrix/tensor completion [54, 55] methods.

II. RELATED WORK

In this section, the SVD of a matrix and some widely used matrix completion methods based on SVD and QR decomposition are respectively introduced.

A. Singular Value Decomposition

Suppose that X ∈ R^{m×n} is an arbitrary real matrix; then, the SVD of X is as follows:

X = U Λ V^T,  (2)
U = (u_1, ..., u_m) ∈ R^{m×m},  (3)
V = (v_1, ..., v_n) ∈ R^{n×n},  (4)

where U and V are column orthogonal matrices, the columns of which are the left and right singular vectors of X, respectively. Λ ∈ R^{m×n} is a diagonal matrix with diagonal entries Λ_ii = σ_i(X) that are assumed to be in order of decreasing magnitude. σ_i(X) is the ith singular value of X.

Many papers on how to compute the singular values of X exist [48-50]. Here, we introduce a simple SVD method (SVD-SIM), which was proposed by Paul Godfrey [51] in 2006. In this method, the singular values and singular vectors can be computed by iterative QR decomposition. The QR decomposition [30] of X is as follows:

X = LR,  (5)

where L ∈ R^{m×m} is an orthogonal matrix, the columns of which are the basis of the space spanned by the columns of X, and R ∈ R^{m×n} is a weakly upper-triangular matrix.

Let Λ_1 = X^T, U_1 = eye(m, m), and V_1 = eye(n, n). In the jth iteration of SVD-SIM, the variables, i.e., Λ_j, U_j, and V_j, are alternately updated in two steps.

In step 1, U_{j+1} is updated by QR decomposition. Suppose that the QR decomposition of Λ_j^T is

Λ_j^T = Q_1 S_1,  (6)

where Q_1 ∈ R^{m×m} and S_1 ∈ R^{m×n} are intermediate variables. U_{j+1} is updated as follows:

U_{j+1} = U_j Q_1.  (7)

In step 2, the variable S_1^T in Eq. (6) is decomposed by QR decomposition as follows:

S_1^T = Q_2 S_2,  (8)

where Q_2 ∈ R^{n×n} and S_2 ∈ R^{n×m} are intermediate variables. Λ_{j+1} and V_{j+1} are updated as follows:

Λ_{j+1} = S_2,  (9)
V_{j+1} = V_j Q_2.  (10)

U_j, Λ_j, and V_j produced by Eqs. (6-10) can converge to U, Λ, and V, respectively, where Λ is a diagonal matrix with entries satisfying

||Λ_ii||_1 = σ_i(X),  (11)

where ||Λ_ii||_1 is the l1-norm of Λ_ii. The columns of U and V are the corresponding left and right singular vectors of X, respectively.
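To make the two-step iteration of Eqs. (6)-(10) concrete, a minimal numpy sketch of SVD-SIM is given below. The function name svd_sim, the stopping rule, and the use of complete QR factorizations are our assumptions for illustration, not Godfrey's original code.

```python
import numpy as np

def svd_sim(X, n_iter=500, tol=1e-10):
    """Sketch of SVD-SIM (Eqs. (6)-(10)): the iterates keep X = U_j Lambda_j^T V_j^T
    while Lambda_j is driven towards a diagonal matrix whose absolute diagonal
    entries are the singular values of X (Eq. (11))."""
    m, n = X.shape
    U, V = np.eye(m), np.eye(n)
    Lam = X.T                                             # Lambda_1 = X^T
    for _ in range(n_iter):
        Q1, S1 = np.linalg.qr(Lam.T, mode='complete')     # Eq. (6): Lambda_j^T = Q1 S1
        U = U @ Q1                                        # Eq. (7)
        Q2, S2 = np.linalg.qr(S1.T, mode='complete')      # Eq. (8): S1^T = Q2 S2
        V = V @ Q2                                        # Eq. (10)
        Lam = S2                                          # Eq. (9)
        r0 = min(m, n)
        off = Lam.copy()
        off[np.arange(r0), np.arange(r0)] = 0.0           # stop when Lambda_j is diagonal
        if np.linalg.norm(off) < tol:
            break
    return U, np.abs(np.diag(Lam)), V                     # X ~= U @ Lam.T @ V.T

# quick check against numpy's SVD on a random matrix
A = np.random.randn(6, 4)
U, s, V = svd_sim(A)
print(np.max(np.abs(np.sort(s)[::-1] - np.linalg.svd(A, compute_uv=False))))
```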

B. Matrix Completion Methods Based on SVD and QR Decomposition

1) Singular Value Thresholding (SVT) Method: The SVT method [23] recovers an incomplete matrix M by solving the following nuclear norm minimization problem:

min_X τ||X||_* + (1/2)||X||_F^2,  s.t.  P_Ω(X) = P_Ω(M),  (12)

where τ > 0 and the nuclear norm of X is defined as follows:

||X||_* = Σ_{i=1}^n σ_i(X),  (13)

where σ_i(X) is the ith singular value of X and P_Ω(X) is

(P_Ω(X))_{i,j} = X_{i,j} if (i, j) ∈ Ω, and 0 if (i, j) ∉ Ω.  (14)

The Lagrange function of the problem in Eq. (12) is

Lag = µ||X||_* + (1/2)||X||_F^2 + tr(Y^T P_Ω(X − M)),  (15)

where Y ∈ R^{m×n} and µ > 0. The variable X can be updated by solving the following problem:

X = arg min_X µ||X||_* + (1/2)||X − P_Ω(Y)||_F^2.  (16)

The problem in Eq. (16) can be solved by the singular value shrinking operator [23] shown in Lemma 1.

Lemma 1 [23]: For each τ ≥ 0, Y ∈ R^{m×n} is a given real matrix, where Y = UΛV^T is the SVD decomposition of Y. The global solution to

S_τ(Y) = arg min_X τ||X||_* + (1/2)||X − Y||_F^2  (17)

is given by the singular value shrinking operator

S_τ(Y) = U diag(P(Λ_ii), i = 1, ..., n) V^T,  (18)

where P(Λ_ii) = max{Λ_ii − τ, 0} and i = 1, ..., n.

The SVT method converges efficiently on synthetic data matrices with strict low-rank structures. However, it is not always accurate and not always fast when recovering matrices with complex structures [28].
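A short numpy sketch of the singular value shrinking operator of Lemma 1 (Eqs. (17)-(18)) follows; the function name svt_shrink is ours, and in SVT it would be applied as in Eq. (16).

```python
import numpy as np

def svt_shrink(Y, tau):
    """Singular value shrinking operator S_tau(Y) of Eq. (18):
    soft-threshold the singular values of Y by tau."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# one SVT-style update as in Eq. (16): X = S_mu(P_Omega(Y))
```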
2) Iteratively Reweighted Nuclear Norm (IRNN) Method: An IRNN method [26] has been proposed for improving the convergence accuracy of SVT. This method solves the following minimization problem in its kth iteration:

min_X Σ_{i=1}^n ∇g(σ_i(X)) · σ_i(X) + (α/2)||F(X)||_F^2,  (19)

where F(X) = X − (1/α)(X_k − ∇f(X_k)), f(X) = P_Ω(X − M), and α > 0. g(x) is a continuous, concave, and monotonically increasing function on [0, ∞). ∇g(x_0) is the supergradient of g(x) at x_0, and ∇g(x) obeys

∇g(σ_i(X)) ≤ ∇g(σ_j(X)),  (20)

where σ_i(X) ≥ σ_j(X) and 1 ≤ i ≤ j ≤ n.

The IRNN method is much faster and more accurate than SVT. However, it is still not fast enough for real applications because of the high computational cost of SVD iterations.

3) Fast Tri-Factorization (FTF) Method: An FTF method based on QR decomposition was recently proposed by Liu et al. [32] for fast matrix completion. Suppose that X ∈ R^{m×n} is a real matrix whose rank is r, which can be decomposed as

X = LDR,  (24)

where L ∈ R^{m×r}, D ∈ R^{r×r}, R ∈ R^{r×n}, and r ∈ (0, n]. If L is a column orthogonal matrix and R is a row orthogonal matrix, then the following conclusion will be obtained:

||X||_* = ||D||_*.  (25)

Consequently, the nuclear norm minimization problem on X in SVT can be modified as follows:

min_D ||D||_*,  s.t.  L^T L = I, R R^T = I, P_Ω(LDR) = P_Ω(M),  (26)

where r is a preset parameter that regulates the computational cost of FTF.

The variables L, R, and D can be alternately optimized by fixing the other two variables. L and R can be updated by applying QR decomposition, the computational cost of which is much lower than that of SVD, to two matrices of sizes m × r and r × n, respectively [32]. Additionally, D is updated by applying the singular value shrinking operator to a matrix of size r × r. Therefore, FTF is much faster than the traditional methods, such as SVT and IRNN, because it applies SVD only to small-scale matrices. However, it may become slow when recovering matrices with complex structures, the ranks of which are full or near full. The reason is that the parameter r should be given a large value, which makes the computational cost large in that case. Another disadvantage of FTF is that it is still a nuclear-norm-minimization-based method, which is less accurate than a weighted-nuclear-norm-based method, such as IRNN.

In general, we may conclude that the weighted-nuclear-norm-based methods and QR-decomposition-based methods cannot achieve satisfactory levels of both convergence speed and convergence accuracy. Thus, a fast and accurate method for matrix completion should be investigated.

C. An L2,1-Norm Minimization Solver for Low-Rank Representation

Recently, the L2,1-norm was successfully used in low-rank representation [39] to optimize the noise data matrix E ∈ R^{m×n}. The optimal E can be updated by solving the minimization problem as follows:

min_E τ||E||_{2,1} + (1/2)||E − C||_F^2,  (27)

where C ∈ R^{m×n} is a given real matrix and τ > 0. The L2,1-norm of E is defined as

||E||_{2,1} = Σ_{j=1}^n sqrt(Σ_{i=1}^m E_{ij}^2).  (28)

The optimal E(:,j) (denoting the jth column of E) of the problem in Eq. (27) obeys

E(:,j) = ((||C(:,j)||_2 − τ)_+ / ||C(:,j)||_2) C(:,j),  (29)

where

||C(:,j)||_2 = sqrt(Σ_{i=1}^m C_{ij}^2),  (30)
(x)_+ = max{x, 0},  (31)

with x ∈ (−∞, +∞) being a real number. The L2,1-norm minimization solver in Eq. (29) is referred to as LNMS in this paper for convenience. The computational cost of the LNMS is much lower than that of the singular value shrinking operator. In this paper, a fast and accurate matrix completion method using the LNMS under the matrix tri-factorization framework is introduced. A deeper analysis is presented in the next section.
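A minimal numpy sketch of the LNMS of Eqs. (29)-(31) is given below; the function name lnms is ours, and the per-column shrinkage is vectorized.

```python
import numpy as np

def lnms(C, tau):
    """L2,1-norm minimization solver (LNMS, Eqs. (29)-(31)):
    closed-form minimizer of tau*||E||_{2,1} + 0.5*||E - C||_F^2,
    obtained by shrinking each column of C towards zero."""
    E = np.zeros_like(C, dtype=float)
    norms = np.linalg.norm(C, axis=0)          # ||C(:, j)||_2 for each column j
    keep = norms > tau                         # columns with norm <= tau vanish
    E[:, keep] = (norms[keep] - tau) / norms[keep] * C[:, keep]
    return E
```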
III. OUR PROPOSED METHODS

A. Motivation

This paper aims to investigate a fast and accurate matrix completion method based on L2,1-norm minimization.

1) Application of L2,1-Norm Minimization to Matrix Completion: The nuclear norm minimization problem in Lemma 1 is a special case of an L2,1-norm minimization problem. Suppose that X is a variable whose SVD is X = UΛV^T. The problem in Eq. (17) is equivalent to the following problem:

min_{U,Λ,V} τ Σ_{j=1}^n Λ_{jj} + (1/2)||UΛV^T − Y||_F^2,  (32)

where τ > 0 and Y is a given real matrix. Because Λ is a diagonal matrix, we have the following conclusion:

Σ_{j=1}^n Λ_{jj} = ||Λ||_{2,1}.  (33)

Thus, the problem in Eq. (32) can be reformulated as

min_{U,Λ,V} τ||Λ||_{2,1} + (1/2)||UΛV^T − Y||_F^2.  (34)

Because the variables U and V are column orthogonal matrices, the optimal solution to the problem in Eq. (32) is equivalent to that of the following problem:

min_{U,Λ,V} τ||Λ||_{2,1} + (1/2)||Λ − U^T Y V||_F^2.  (35)

The optimal Λ to the problem in Eq. (35) can be given by the LNMS as follows:

Λ(:,j) = ((||C(:,j)||_2 − τ)_+ / ||C(:,j)||_2) C(:,j),  (36)
C = U^T Y V,  (37)

where j = 1, ..., n. The LNMS in Eq. (36) can recover the columns of Λ one by one, which shows that the matrix C in Eq. (36) does not need to be diagonal. Therefore, it is suitable to decompose X into three matrices as follows:

X = LDR,  (38)

where L ∈ R^{m×r}, D ∈ R^{r×r}, R ∈ R^{r×n}, and r ∈ (0, n]. The variables L and R denote a column orthogonal matrix and a row orthogonal matrix, respectively. Specifically, they are the orthogonal bases of the columns and rows of X, respectively. The matrix D does not need to be diagonal, which is different from the matrix Λ in SVD. Then, we formulate the following L2,1-norm minimization problem:

min_{L,D,R} τ||D||_{2,1} + (1/2)||LDR − Y||_F^2,  s.t.  X = LDR.  (39)

According to Eqs. (36-37), the variable D can be optimized very efficiently after obtaining the variables L and R. The problem in Eq. (32) is a special case of the problem in Eq. (39). Therefore, the L2,1-norm minimization problem in Eq. (39) can also be applied to matrix completion. One key issue is how to extract the orthogonal bases L and R. Because D does not need to be diagonal, L and R can be obtained via a method more efficient than SVD.

2) Using QR Decomposition to Extract Orthogonal Bases L and R: The orthogonal bases L and R can also be computed by QR decomposition [31, 32], the computational cost of which is approximately ten percent that of SVD. According to the SVD-SIM [51] method, the left and right singular vectors can be directly obtained by QR decomposition. One may think that the variables L and R in Eq. (39) can be obtained by SVD-SIM with only a few iterations. However, SVD-SIM computes all the singular values and the corresponding singular vectors simultaneously, which allows us to forego this idea. Obviously, computing all the singular values may reduce the recovery speed of a matrix completion method [43], which motivates us to propose a method for computing the largest r (r ∈ (0, n]) singular values and the corresponding singular vectors. Some methods have already been proposed, such as the power method [30, 52]. However, the computational cost of the power method increases sharply with increasing r as a result of applying SVD to a submatrix.

In this paper, a method for computing the SVD of a matrix by iterative QR decomposition (CSVD-QR) is proposed. This method can compute the largest r (r ∈ (0, n]) singular values and the corresponding singular vectors of a matrix, which is different from SVD-SIM. Consequently, the orthogonal bases L and R in Eq. (39) can be computed by CSVD-QR with only a few iterations. Then, using the results obtained by CSVD-QR, two fast matrix completion methods based on the L2,1-norm are proposed:

• A CSVD-QR-based L2,1-norm minimization method (LNM-QR) is proposed for matrix completion. By using QR decomposition as a substitute for SVD, LNM-QR is much faster than the compared methods using SVD.
• A CSVD-QR-based iteratively reweighted L2,1-norm minimization method (IRLNM-QR) is proposed to improve the accuracy of LNM-QR. IRLNM-QR has advantages in terms of both convergence speed and convergence accuracy over the traditional methods.

We can now introduce the proposed method for computing an approximate SVD based on QR decomposition and the two matrix completion methods based on the L2,1-norm.

B. Method for Computing an Approximate SVD based on QR Decomposition (CSVD-QR)

Suppose that X ∈ R^{m×n} is a given real matrix. In this section, we propose a method that can compute the largest r (r ∈ (0, n]) singular values and the corresponding singular vectors of X by QR decompositions directly. Specifically, we aim to find three matrices, i.e., L, D, and R, such that

||X − LDR||_F^2 ≤ ε_0,  (40)

where ε_0 is a positive tolerance. Please see Eq. (38) for the definitions of L, D, and R. Consequently, a minimization problem is formulated:

min_{L,D,R} ||X − LDR||_F^2,  s.t.  L^T L = I, R R^T = I.  (41)

The minimization function in Eq. (41) is convex with respect to each one of the variables, i.e., L, D, and R, when the remaining two are fixed. Thus, the variables can be alternately updated one by one. Suppose that L_j, D_j, and R_j denote the results of the jth iteration in the alternating method. Let L_1 = eye(m, r), D_1 = eye(r, r), and R_1 = eye(r, n). In the jth iteration, L_{j+1} is updated with fixed D_j and R_j as follows:

L_{j+1} = arg min_L ||X − L D_j R_j||_F^2.  (42)

Since R_j is a row orthogonal matrix, the optimal solution to Eq. (42) is as follows:

L_{j+1} = X R_j^T D_j^+,  (43)

where D_j^+ is the Moore–Penrose pseudo-inverse of D_j. Because the optimal L should be a column orthogonal matrix, L_{j+1} can be set to the orthogonal basis of the range space spanned by the columns of X R_j^T D_j^+ as follows:

L_{j+1} = orth(X R_j^T D_j^+),  (44)

where orth(X) is an operator that extracts the orthogonal basis of the columns of X. In view of Eq. (40), L_{j+1} is the orthogonal basis of the columns of X, which can be set to the orthogonal basis of XA, where A ∈ R^{n×r} is a random matrix [30]. Therefore, the solution in Eq. (44) can also be given as

L_{j+1} = orth(X R_j^T).  (45)

In this paper, we use QR decomposition to compute the orthogonal basis of X R_j^T in Eq. (45) as follows:

[Q, T] = qr(X R_j^T),  (46)
L_{j+1} = Q(q_1, ..., q_r),  (47)

where Q ∈ R^{m×m} and T ∈ R^{m×r} are intermediate variables. Eq. (46) indicates that the QR decomposition of X R_j^T is X R_j^T = QT. Similarly, R_{j+1} can be updated as follows:

[Q, T] = qr(X^T L_{j+1}),  (48)
R_{j+1} = Q(q_1, ..., q_r),  (49)

where Q ∈ R^{n×n} and T ∈ R^{n×r} are intermediate variables. Since the optimal R is a row orthogonal matrix, we set

R_{j+1} = R_{j+1}^T,  (50)

i.e., the transpose of the matrix obtained in Eq. (49). Finally, D_{j+1} is updated as follows:

D_{j+1} = arg min_D ||X − L_{j+1} D R_{j+1}||_F^2  (51)
        = L_{j+1}^T X R_{j+1}^T.  (52)

According to Eq. (48), we have

T^T = L_{j+1}^T X Q.  (53)

Because R_{j+1} is generated by Eqs. (49-50), we have

D_{j+1} = T^T(1...r, 1...r).  (54)

The sequences of {L_j}, {R_j}, and {D_j} (j = 1, ..., n, ...) generated by Eqs. (46-47), Eqs. (48-50), and Eq. (54) can converge to matrices L, R, and D, respectively, with matrix D satisfying

||D_ii||_1 = ||Λ_ii||_1,  (55)

where i = 1, ..., r and Λ_ii is the ith singular value of X. The columns of L and R^T are the left and right singular vectors corresponding to the largest r singular values, respectively. This method of computing an approximate SVD based on QR decomposition is called CSVD-QR, the main steps of which are shown in Table I.

TABLE I: MAIN STEPS OF CSVD-QR.
Input: X, a real matrix;
Output: L, D, R (X = LDR);
Initialization: r > 0, q > 0, j = 1; Itmax > 0; ε_0 is a positive tolerance; C = eye(n, r); L_1 = eye(m, r); D_1 = eye(r, r); R_1 = eye(r, n).
Repeat:
  L_{j+1}: Eqs. (46-47);
  R_{j+1}: Eqs. (48-50);
  D_{j+1}: Eqs. (48, 54);
  j = j + 1;
Until: ||L_j D_j R_j − X||_F^2 ≤ ε_0 or j > Itmax.
Return: L = L_j, D = D_j, R = R_j.

Theorem 1: When r = n, the diagonal entries of D_j at the jth (j > 1) iteration of CSVD-QR are equal to those of Λ_j in SVD-SIM. For the proof, please see Appendix A.

Theorem 1 shows that CSVD-QR can converge as fast as SVD-SIM. In the proposed methods for matrix completion, we use only the output of CSVD-QR with one iteration (see more details in Section III.C).
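The following is a minimal numpy sketch of the CSVD-QR iteration of Table I (Eqs. (46)-(54)), using reduced QR factorizations; the function name csvd_qr and the stopping rule are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def csvd_qr(X, r, n_iter=30, tol=1e-8):
    """Approximate rank-r SVD via iterative QR decompositions (Table I).
    Returns L (m x r), D (r x r), R (r x n) with X ~= L D R; the diagonal
    of D converges to the r largest singular values of X (Eq. (55))."""
    m, n = X.shape
    L = np.eye(m, r)
    R = np.eye(r, n)
    D = np.eye(r)
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(X @ R.T)        # Eqs. (46)-(47): update L
        L = Q[:, :r]
        Q, T = np.linalg.qr(X.T @ L)        # Eqs. (48)-(50): update R
        R = Q[:, :r].T
        D = T.T[:r, :r]                     # Eq. (54): D = T^T(1:r, 1:r)
        if np.linalg.norm(X - L @ D @ R) ** 2 <= tol:
            break
    return L, D, R
```

With one iteration and a warm start for R, this routine is exactly the building block reused inside the completion methods of Section III.C.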

C. Proposed Methods for Fast and Accurate Matrix Completion

According to Section III.A.1, the L2,1-norm minimization problem in Eq. (39) can be applied to matrix completion. The variable D in Eq. (39) does not need to be diagonal. Consequently, the variables L and R can be given by CSVD-QR with only a few iterations. Moreover, the orthogonal subspace of the recovery result may not considerably change after two consecutive iterations [43]. Thus, in the proposed matrix completion methods, we use the orthogonal bases of the rows and columns of the matrix recovered in the previous iteration to initialize CSVD-QR. By using this smart warm initialization, L and R computed from CSVD-QR with one iteration can be used to recover the original incomplete matrix.

1) An L2,1-Norm Minimization based on QR Decomposition for Matrix Completion (LNM-QR): According to the analysis in Section III.A, the considered matrix X can be decomposed as in Eq. (38). Moreover, the original incomplete matrix can be recovered efficiently by solving the L2,1-norm minimization problem in Eq. (39). The relationship between ||D||_* and ||D||_{2,1} can confirm this conclusion. The matrix D can be decomposed as follows:

D = Σ_{j=1}^r D^j,  (56)
D^j_{k,i} = D_{k,j} if i = j, and 0 if i ≠ j,  (57)

where i, j, k = 1, ..., r, and D^j ∈ R^{r×r}. From Eq. (56), we have

||D||_* = ||Σ_{j=1}^r D^j||_*.  (58)

Because the nuclear norm is a convex function, we have

||D||_* ≤ Σ_{j=1}^r ||D^j||_*.  (59)

Because the Σ_{j=1}^r ||D^j||_* term is equal to ||D||_{2,1}, i.e.,

Σ_{j=1}^r ||D^j||_* = ||D||_{2,1},  (60)

we obtain the following conclusion:

||D||_* ≤ ||D||_{2,1}.  (61)

From Eq. (61), the L2,1-norm of a matrix is clearly the upper bound of its nuclear norm. This conclusion motivates us to apply the L2,1-norm minimization problem in Eq. (39) to matrix completion as follows:

min_D ||D||_{2,1},  s.t.  L^T L = I, R R^T = I, X = LDR, P_Ω(LDR) = P_Ω(M).  (62)

Please see Eq. (38) for the definitions of L, D, and R. From the analysis in Section III.A, the variable D does not need to be diagonal. Because the optimization function in Eq. (62) is convex, the corresponding problem can be solved by the alternating direction method of multipliers (ADMM). The augmented Lagrange function of the problem in Eq. (62) is

Lag = ||D||_{2,1} + tr(Y^T(X − LDR)) + (µ/2)||X − LDR||_F^2,  (63)

where µ > 0 and Y ∈ R^{m×n}. Suppose that X_k denotes the result of the kth iteration in the ADMM. The Lagrange function is optimized in two steps. In step 1, L_{k+1} and R_{k+1} are updated by solving the following minimization problem:

min_{L,R} ||(X_k + Y_k/µ_k) − L D_k R||_F^2.  (64)

According to the analyses corresponding to Eq. (41), L_{k+1} and R_{k+1} can be given by CSVD-QR. If CSVD-QR is initialized by L_k and R_k, it will converge within a few iterations because the matrices L and R will not considerably change in two consecutive iterations [43]. In our method, X_k + Y_k/µ_k is decomposed in one iteration of CSVD-QR as follows:

X_k + Y_k/µ_k = L_{k+1} D_T R_{k+1},  (65)

where D_T ∈ R^{r×r}. By using this smart warm initialization, LNM-QR can converge very quickly.

In step 2, X_{k+1} is updated by solving an L2,1-norm minimization. First, the variable D can be optimized, with X_k, Y_k, L_{k+1}, and R_{k+1} held fixed, by solving the following problem:

D_{k+1} = arg min_D (1/µ_k)||D||_{2,1} + (1/2)||D − L_{k+1}^T (X_k + Y_k/µ_k) R_{k+1}^T||_F^2.  (66)

From Eqs. (65-66), we have the following conclusion:

L_{k+1}^T (X_k + Y_k/µ_k) R_{k+1}^T = D_T,  (67)

where D_T is as shown in Eq. (65). Therefore, Eq. (66) can be reformulated as follows:

D_{k+1} = arg min_D (1/µ_k)||D||_{2,1} + (1/2)||D − D_T||_F^2.  (68)

According to Eqs. (29-31), the minimization problem in Eq. (68) can be solved by the LNMS as follows:

D_{k+1} = D_T K,  (69)

where K is a diagonal matrix, i.e.,

K = diag(k_1, ..., k_r),  (70)

where the jth entry k_j can be given as follows:

k_j = (||D_T(:,j)||_F − 1/µ_k)_+ / ||D_T(:,j)||_F.  (71)

Second, by fixing the variables L_{k+1}, D_{k+1}, R_{k+1}, and Y_k, X_{k+1} is updated as follows:

X_{k+1} = L_{k+1} D_{k+1} R_{k+1} + P_Ω(M) − P_Ω(L_{k+1} D_{k+1} R_{k+1}).  (72)

Finally, by fixing the variables, i.e., L_{k+1}, D_{k+1}, R_{k+1}, and X_{k+1}, Y_{k+1} and µ_k are updated as follows:

Y_{k+1} = Y_k + µ_k(X_{k+1} − L_{k+1} D_{k+1} R_{k+1}),  (73)
µ_{k+1} = ρ µ_k,  (74)

where ρ ≥ 1. The proposed L2,1-norm minimization method based on CSVD-QR is called LNM-QR, the main steps of which are summarized in Table II.
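As a companion to Table II, the following is a minimal numpy sketch of LNM-QR built from Eqs. (64)-(74); the function name lnm_qr, the boolean mask argument, and the default parameter values are illustrative choices under our assumptions, not the authors' MATLAB implementation.

```python
import numpy as np

def lnm_qr(M, mask, r, mu=1e-2, rho=1.4, n_iter=50):
    """Sketch of LNM-QR: ADMM on Eq. (62) with X = L D R.
    M: observed matrix (entries outside Omega arbitrary); mask: True on Omega."""
    m, n = M.shape
    X = np.where(mask, M, 0.0)          # X_0 = P_Omega(M)
    Y = np.zeros((m, n))
    L = np.eye(m, r)
    R = np.eye(r, n)
    for _ in range(n_iter):
        Z = X + Y / mu
        # Step 1 (Eq. (65)): one warm-started CSVD-QR iteration gives L, R, D_T.
        Q, _ = np.linalg.qr(Z @ R.T)
        L = Q[:, :r]
        Q, T = np.linalg.qr(Z.T @ L)
        R = Q[:, :r].T
        DT = T.T                         # Z ~= L DT R
        # Step 2 (Eqs. (69)-(71)): column-wise L2,1 shrinkage of DT by 1/mu.
        col = np.linalg.norm(DT, axis=0)
        k = np.maximum(col - 1.0 / mu, 0.0) / np.maximum(col, 1e-12)
        D = DT * k                       # D_{k+1} = DT @ diag(k)
        # Eq. (72): keep the observed entries of M.
        LDR = L @ D @ R
        X = LDR + np.where(mask, M - LDR, 0.0)
        # Eqs. (73)-(74): dual and penalty updates.
        Y = Y + mu * (X - LDR)
        mu *= rho
    return X
```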

TABLE II: MAIN STEPS OF LNM-QR.
Input: M, a real matrix with missing values; Ω, the set of locations corresponding to the observed entries.
Output: Xopt, the recovery result.
Initialization: r > 0, q > 0, k = 0; Itmax > 0; C = eye(n, r); L_1 = eye(m, r); D_1 = eye(r, r); R_1 = eye(r, n); X_0 = M; ε_0 is a positive tolerance.
Repeat:
  Step 1: L_{k+1}, R_{k+1}: Eq. (65);
  Step 2: D_{k+1}: Eq. (69); X_{k+1}: Eq. (72); k = k + 1.
Until: ||X_k − X_{k−1}||_F^2 ≤ ε_0 or k > Itmax.
Return: Xopt = X_k.

Because the convex optimization function in Eq. (62) is minimized by the ADMM, which is a gradient-search-based method, LNM-QR can converge to its optimal solution. Suppose that N iterations are required for LNM-QR to converge. If the updating steps in LNM-QR are continued, then X_k (k > N) will be equal to X_N. Because CSVD-QR is initialized by the matrices L and R of the previous iteration, LNM-QR can fall back to CSVD-QR. Thus, the sequence of {D_k} produced by LNM-QR (see Eqs. (69-71)) can converge to a diagonal matrix D with entries D_jj that obey

||D_jj||_1 = σ_j(X_N).  (75)

Therefore, the L2,1-norm minimization function of the LNM-QR model (in Eq. (62)) can converge to the nuclear norm of D, which motivates us to improve LNM-QR as an iteratively reweighted L2,1-norm minimization method.

2) Extension of LNM-QR: Since the L2,1-norm of D can converge to its nuclear norm and the IRNN [26] method performs much better than the nuclear-norm-based methods, it is suitable to use an iteratively reweighted L2,1-norm minimization to replace the L2,1-norm minimization in step 2 of LNM-QR. According to Eq. (60), the weighted L2,1-norm of X is denoted as

||X||_{w·(2,1)} = Σ_{j=1}^n w_j ||X^j||_*,  (76)

where w_j > 0 (j ∈ [1, n]). The definition of X^j is the same as that of D^j in Eq. (56). The minimization problem of LNM-QR in Eq. (62) can be modified as follows:

min_D Σ_{j=1}^r ∇g(||D^j||_*) ||D^j||_*,  s.t.  X = LDR, P_Ω(LDR) = P_Ω(M).  (77)

Please see Eq. (38) for the definitions of L, D, and R. ∇g(x_0) is the supergradient of g(x) at x_0. g(x) is a continuous and monotonically increasing function on [0, +∞). In each iteration of this extension, the variables L_{k+1} and R_{k+1} are still given by CSVD-QR (see Eq. (65)), and the variables X_{k+1}, Y_{k+1}, and µ_k are updated according to Eq. (72), Eq. (73), and Eq. (74), respectively. The value of D_{k+1} can be determined by solving the following problem:

min_D (1/µ_k) Σ_{j=1}^r ∇g(||D^j||_*) ||D^j||_* + (1/2)||D − D_T||_F^2,  (79)

where D_T is as shown in Eq. (65). In each iteration of IRNN, g(x) is a concave function [26]. In this paper, we design a novel function for g(x). In each iteration of this extension model, the ∇g(||D^j||_*) term obeys

∇g(||D^j||_*) = µ(1 − k̄_j)||D_T^j||_*,  (80)

where 1 ≥ k̄_1 ≥ k̄_2 ≥ ... ≥ k̄_r > 0, µ > 0, and j ∈ [1, ..., r].

Theorem 2: For any given real matrix C ∈ R^{r×r} and µ > 0, the optimal solution to the following problem

min_{X ∈ R^{r×r}} (1/µ)||X||_{w·(2,1)} + (1/2)||X − C||_F^2  (81)

can be given as follows:

Xopt = CK,  (82)

where

K = diag(k_1, ..., k_r),  (83)
k_j = (Λ_jj − w_j/µ)_+ / Λ_jj,  (84)

with j ∈ [1, ..., r] and Λ_jj being the singular value of C^j. The definition of C^j is the same as that of D^j (see Eq. (56)). For the proof, please see Appendix B.

According to Theorem 2 and Eq. (80), the variable D_{k+1} can be updated as follows:

D_{k+1} = D_T K̄,  (85)
K̄ = diag(k̄_1, ..., k̄_r).  (86)

The proposed iteratively reweighted L2,1-norm minimization method based on CSVD-QR for matrix completion is called IRLNM-QR.

Theorem 3: If the weights ∇g(||D^j||_*) (j = 1, ..., r) in Eq. (77) are given by Eq. (80), and if (k̄_1, ..., k̄_r) in Eq. (80) are arranged in decreasing order, then IRLNM-QR will converge to the optimal solution of an iteratively reweighted nuclear norm minimization method. For the proof, please see Appendix C.

According to Theorem 3, the weights in Eqs. (85-86) should be arranged in decreasing order. In the experiments, k̄_j (j = 1, ..., r) are given as follows:

k̄_j = 1, 1 ≤ j ≤ S,  (87)

with the remaining weights for S < j ≤ r decreasing in j according to Eq. (88), where r is the row number of K̄, and S (S < r) and θ are preset parameters.
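As a complement to Eqs. (82)-(86), the sketch below implements the weighted column-wise shrinkage of Theorem 2 in numpy. The helper name weighted_lnms and its arguments are our illustrative choices; for IRLNM-QR the weight vector would be derived from Eqs. (80) and (87)-(88) and kept nonincreasing, as required by Theorem 3.

```python
import numpy as np

def weighted_lnms(C, weights, mu):
    """Weighted L2,1 shrinkage of Theorem 2 (Eqs. (82)-(84)):
    minimizer of (1/mu)*||X||_{w,(2,1)} + 0.5*||X - C||_F^2, column by column.
    weights[j] plays the role of w_j (the supergradient of g at column j)."""
    norms = np.linalg.norm(C, axis=0)                   # Lambda_jj = ||C(:, j)||_2
    k = np.maximum(norms - np.asarray(weights) / mu, 0.0) / np.maximum(norms, 1e-12)
    return C * k                                        # X_opt = C K, K = diag(k)
```

In the IRLNM-QR step 2, D_{k+1} would be obtained by applying this shrinkage to D_T with the weights arranged in decreasing order of the column index.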


D. Complexity Analysis

In this section, the computational complexities of SVT, IRNN, FTF, LNM-QR, and IRLNM-QR are analyzed. Suppose that X ∈ R^{m×n} is a real matrix. The computational cost of SVD on X is O(mn^2). The main CPU times of SVT and IRNN are consumed by performing SVD on X. Thus, their computational complexities are O(mn^2). The main CPU time of FTF is consumed by performing QR decomposition twice to update L and R and SVD once on the submatrix D. Thus, the computational cost of FTF is O(r^2(m + n) + r^3), where r ≪ min(m, n). The main CPU times of LNM-QR and IRLNM-QR are consumed by performing QR decomposition twice to update L and R (please see Eqs. (46-50)). Thus, the computational complexities of LNM-QR and IRLNM-QR are O(r^2(m + n)). Clearly, the computational complexities of LNM-QR and IRLNM-QR are much smaller than those of FTF, SVT, and IRNN. Hence, LNM-QR and IRLNM-QR are much faster than the traditional methods based on SVD.

IV. EXPERIMENTAL RESULTS

To demonstrate the effectiveness of the proposed methods, several comparative experiments are performed. First, the convergence of CSVD-QR is tested. Then, the proposed LNM-QR and IRLNM-QR methods for matrix completion are evaluated using synthetic and real-world datasets. The experiments are performed on a MATLAB 2012a platform equipped with an i5-6300U CPU and 4 GB of RAM.

A. Convergence of CSVD-QR

In this section, CSVD-QR is tested on a synthetic matrix X that is generated as follows:

X = M_L^{m×r1} M_R^{r1×n},  (89)
M_L^{m×r1} = randn(m, r1),  (90)
M_R^{r1×n} = randn(r1, n),  (91)

where r1 ∈ [1, n] is the rank of X. Suppose that H^k = (h_1, h_2, ..., h_r) is a vector, the ith entry of which, h_i, is equal to D_k(i, i), where D_k (in Eq. (52)) is the submatrix in the kth iteration of CSVD-QR. The CSVD-QR method is stopped when the relative error |Σ_{i=1}^t (||h_i||_1 − σ_i(X))| / Σ_{i=1}^t σ_i(X) < ε_0, where t = min(r, r1). The experiments for CSVD-QR are conducted as follows.

First, let m = n = r = 300, r1 = 250, and ε_0 = 0.001. All the singular values of matrix X are computed via the CSVD-QR and SVD-SIM methods. Because r is larger than r1, CSVD-QR can compute all the singular values of X. The CPU times of CSVD-QR and SVD-SIM are 0.830 s and 0.900 s, respectively. Their CPU times are almost equal to each other because the diagonal entries of D_k in the kth iteration of CSVD-QR (r = n) are equal to those of Λ_k in SVD-SIM (see Theorem 1). More details are shown by the relative error curve of the CSVD-QR method in Fig. 1.

Fig. 1 The relative error curve of CSVD-QR.

Fig. 1 shows that the relative errors of CSVD-QR and SVD-SIM are equal to each other in every iteration, which is consistent with the conclusion in Theorem 1. The relative error of CSVD-QR is also shown to decrease sharply during the first 10 iterations and converge gradually after approximately 30 iterations. Thus, the proposed CSVD-QR can accurately compute all the singular values of X.

Second, let m = n = 300, r1 = 250, r = 50, and ε_0 = 0.0005, with the largest r singular values computed by CSVD-QR. The variable D_k can converge to a diagonal matrix when CSVD-QR converges. Suppose that T_k ∈ R^{r×r} is a matrix whose entries are T_k(i,j) = |D_k(i,j)| (i, j = 1, ..., r). The convergence of D_k can be shown by plotting 10T_k (k = 2, 10, 30, 60), as in Fig. 2.

Fig. 2 The convergence procedure of D_k in CSVD-QR. (A) k = 2, (B) k = 10, (C) k = 20, (D) k = 60.

Fig. 2 shows that the sequence of {D_k} can converge to a diagonal matrix. The relative error of the singular values computed by CSVD-QR reaches 0.0005 in the 60th iteration, which means that CSVD-QR can accurately compute the largest r singular values. In the proposed LNM-QR and IRLNM-QR methods, the matrices L and R computed by CSVD-QR with one iteration are used to recover the original incomplete matrix. Therefore, the proposed methods are much faster than the traditional methods based on SVD.

B. Experimental Results of the Proposed Methods for Matrix Completion

In this section, LNM-QR and IRLNM-QR are tested using synthetic datasets and real-world images. The convergence accuracies and speeds are compared with those of the SVT [23], IRNN-SCAD (IRNN with the SCAD function) [26], FTF [32], and RBF (a nuclear-norm-based method) [18] methods. The maximum numbers of iterations for LNM-QR, IRLNM-QR, FTF, RBF, SVT, and IRNN-SCAD are 50, 50, 200, 200, 200, and 200, respectively. The parameters of FTF, RBF, SVT, and IRNN-SCAD are set to their optimal values. The total reconstruction error (ERR) and peak signal-to-noise ratio (PSNR), which are two measures commonly used for evaluation purposes, are defined as follows:

ERR = ||X_REC − X||_F,  (92)
PSNR = 10 × log10(255^2 / MSE),  (93)
MSE = SE / (3T),  (94)
SE = ERR_r^2 + ERR_g^2 + ERR_b^2,  (95)

where T is the total number of missing entries, X is the original matrix, and X_REC is the recovered matrix.
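For reference, a small numpy helper computing ERR and PSNR as in Eqs. (92)-(95) is sketched below. The function name and argument layout are ours; it assumes 8-bit RGB images whose observed entries are reproduced exactly by the completion methods, so the per-channel errors are effectively accumulated over the missing entries.

```python
import numpy as np

def recovery_metrics(X_rec, X, n_missing):
    """ERR and PSNR of Eqs. (92)-(95) for an RGB image with values in [0, 255].
    X_rec, X: (h, w, 3) arrays; n_missing: number of missing entries per channel (T)."""
    err_c = [np.linalg.norm(X_rec[..., c] - X[..., c]) for c in range(3)]  # ERR_r, ERR_g, ERR_b
    se = sum(e ** 2 for e in err_c)                                        # Eq. (95)
    mse = se / (3 * n_missing)                                             # Eq. (94)
    psnr = 10 * np.log10(255.0 ** 2 / mse)                                 # Eq. (93)
    err = np.linalg.norm(X_rec - X)                                        # Eq. (92)
    return err, psnr
```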
1) Synthetic Data: LNM-QR and IRLNM-QR are tested on a synthetic low-rank data matrix M, generated as follows:

M = M_L^{m×r1} M_R^{r1×n} + P_Ω(σ · randn(m, n)),  (96)

where M_L and M_R are generated as in Eqs. (90-91), respectively. r1 > 0 is the rank of M, and σ regulates the noise level of M. Clearly, a larger σ (σ > 0) can make M more difficult to recover. In this section, m = 1000, n = 1000, r1 = 50, and 50% of the entries of M are randomly missing. Let µ_0 = 10^{-2} and ρ = 1.4 (see Eq. (74)) for LNM-QR and IRLNM-QR, and let µ_0 = 10^{-4} and ρ = 1.4 for FTF and RBF. The parameter S in Eqs. (87-88) was tested from 2 to 40 to determine the best value for IRLNM-QR. The parameter θ in Eqs. (87-88) was set to 20 for IRLNM-QR.

First, the effects of r (the rank of D) on LNM-QR, IRLNM-QR, and FTF are tested. Let σ = 0.5 and r increase from 10 to 300, with a step size of 10. The effects of r on the three methods are shown in Fig. 3.

Fig. 3 The effects of parameter r (rank of D) on the LNM-QR, IRLNM-QR, and FTF methods using synthetic data matrices.

Fig. 3 shows that the reconstruction error of FTF is much larger than those of LNM-QR and IRLNM-QR when r > 50. Thus, LNM-QR and IRLNM-QR are much more robust with respect to r than FTF. This result also shows that the three methods obtain the best reconstruction error at r = 50, i.e., the rank of M. Because estimating the rank of an incomplete matrix is difficult, we must set r to a larger value in FTF. Consequently, the accuracy and speed of FTF will be reduced.

Second, the effects of the noise level (σ) on LNM-QR and IRLNM-QR are compared with those on the FTF, RBF, SVT, and IRNN methods. Let r = 1.5 r1 for FTF, LNM-QR, and IRLNM-QR. Then, we repeat the experiments 10 times, with σ increasing from 0.1 to 0.9 for each method. The mean reconstruction errors with standard errors of the six methods are shown in Table III (the standard errors are shown in parentheses), and the corresponding CPU times are shown in Fig. 4.

TABLE III: MEAN RECONSTRUCTION ERRORS AND STANDARD ERRORS OF THE SIX METHODS USING SYNTHETIC DATA, 50% OF WHICH IS RANDOMLY MISSING.

Noise level | LNM-QR | IRLNM-QR | IRNN-SCAD | SVT | FTF | RBF
0.1 | 41.463 (0.153) | 35.493 (0.065) | 36.266 (0.084) | 55.325 (0.027) | 59.825 (16.770) | 59.751 (5.931)
0.2 | 81.741 (0.274) | 71.254 (0.261) | 72.423 (0.119) | 108.333 (0.145) | 115.333 (16.082) | 103.768 (2.871)
0.3 | 123.654 (0.357) | 107.247 (0.263) | 106.708 (0.175) | 176.110 (0.140) | 162.110 (41.325) | 150.452 (1.668)
0.4 | 169.648 (0.702) | 142.676 (0.405) | 140.824 (0.341) | 200.791 (0.143) | 209.791 (26.347) | 199.547 (1.770)
0.5 | 218.630 (0.765) | 177.684 (0.495) | 175.013 (0.504) | 302.106 (0.479) | 310.106 (27.893) | 247.318 (1.891)
0.6 | 261.513 (1.191) | 213.589 (0.452) | 211.327 (0.612) | 323.247 (0.453) | 333.247 (25.894) | 295.879 (1.918)
0.7 | 308.720 (1.213) | 249.447 (0.571) | 253.629 (0.637) | 343.896 (0.726) | 353.896 (28.369) | 343.460 (1.752)
0.8 | 354.001 (1.348) | 284.824 (0.689) | 281.419 (0.792) | 384.264 (0.657) | 387.264 (26.232) | 390.843 (1.984)
0.9 | 398.841 (1.419) | 320.044 (0.976) | 315.756 (1.174) | 412.214 (1.123) | 412.214 (26.367) | 412.175 (1.998)

Table III shows that the IRLNM-QR method, which is as accurate as IRNN-SCAD, is more accurate than LNM-QR, SVT, RBF, and FTF. The reason is that IRLNM-QR can obtain the same optimal solution as an iteratively reweighted nuclear norm minimization method. Table III also shows that IRLNM-QR and LNM-QR are as stable as SVT and IRNN-SCAD, which are considerably more stable than FTF. The standard errors of IRLNM-QR and LNM-QR are much smaller than those of FTF and RBF. Unlike the other tested methods, the standard error of RBF is relatively large when the noise level is equal to 0.1 and becomes small when the noise level varies from 0.2 to 0.9. This may result from the fact that optimizing the l1-norm term in RBF [18] makes it more stable when recovering matrices with high noise levels.

Fig. 4 The CPU times of the six methods on random data.

Fig. 4 shows that LNM-QR and IRLNM-QR are much faster than SVT and IRNN-SCAD. The speed of IRLNM-QR is approximately 80∼100 times those of SVT and IRNN-SCAD. The reason is that the computational cost of a full SVD in each iteration of SVT or IRNN-SCAD is much larger than that of the QR decomposition in LNM-QR and IRLNM-QR when r = 75 (1.5 r1). The speed of LNM-QR or IRLNM-QR is only 2∼3 times that of FTF or RBF.

When r = 75, the computational cost of the SVD in FTF is not very large. However, FTF will become slow if r is given a large value.

2) Real-World Data: Recovering useful information covered by the texts in an image is much more challenging than recovering matrices with missing entries randomly distributed. We first check the positions of the texts and then initialize the corresponding entries to be zero to generate incomplete images. The original images and incomplete images in Fig. 5 (on the next page) are 1024 × 1024 in size. Because the color images have three channels, we treat each channel separately and then combine the results to form the final recovered images. The incomplete images in Fig. 5 are recovered by LNM-QR and IRLNM-QR. Then, their results, i.e., convergence accuracies, CPU times, and numbers of iterations, are compared with those of RBF, FTF, SVT, and IRNN-SCAD.

First, the effects of the parameter r on FTF, LNM-QR, and IRLNM-QR are tested. Let µ_0 = 10^{-5} and ρ = 1 for FTF and RBF, and let µ_0 = 10^{-3} and ρ = 1 for LNM-QR and IRLNM-QR. Let r increase from 10 to 300, with a step size of 10. By using these values, the 5th incomplete image in Fig. 5 is recovered by the three methods. The PSNR values and CPU times of the three methods for different r are shown in Fig. 6 and Fig. 7, respectively.

Fig. 6 The effects of r on the three methods, i.e., FTF, LNM-QR, and IRLNM-QR.

Fig. 7 The CPU times of the three methods with different r.

Fig. 6 shows that the convergence accuracies of the three methods increase with the parameter r. The PSNR curves of LNM-QR and IRLNM-QR converge when r > 180 and r > 200, respectively. Similarly, the PSNR curve of FTF increases sharply when r < 120 and increases gradually when 120 < r, while the CPU time of FTF becomes large because the computational cost of the SVD in FTF is dominated by the parameter r. The CPU times of the three methods for different values of r (Fig. 7) confirm this conclusion.

Fig. 7 shows that the CPU time of FTF increases very quickly compared to those of LNM-QR and IRLNM-QR. Approximately 0.5 s to 42 s are required for LNM-QR and IRLNM-QR, and 9 s to 190 s for FTF, to recover the 5th incomplete image in Fig. 5 when r ranges from 10 to 300. Thus, LNM-QR and IRLNM-QR are approximately 5∼18 times faster than FTF. Furthermore, the accuracy of IRLNM-QR is much better than those of FTF and LNM-QR.

Second, the convergence accuracies of LNM-QR and IRLNM-QR are compared with those of RBF, FTF, SVT, and IRNN-SCAD. Let µ_0 = 10^{-5} and ρ = 1.1 for RBF, µ_0 = 10^{-5} and ρ = 1 for FTF, and µ_0 = 10^{-3} and ρ = 1 for LNM-QR and IRLNM-QR. From the analyses of Fig. 6, it is suitable to let r = 200 for RBF, FTF, LNM-QR, and IRLNM-QR when recovering the eight incomplete images in the second row of Fig. 5. The parameter S (see Eqs. (87-88)) is tested from 2 to 20 to choose the best value for IRLNM-QR, and θ (in Eqs. (87-88)) is set to 3. With these values, the incomplete images are recovered by the six methods. The convergence accuracies, recovered images, PSNR curves, and CPU times of the six methods are shown in Table IV, Fig. 8, Fig. 9, and Fig. 10, respectively.

TABLE IV: PSNR OF RECOVERY RESULTS OF THE SIX METHODS ON THE INCOMPLETE IMAGES IN FIG. 5.

Images (1-8) | SVT | FTF | RBF | IRLNM-QR | LNM-QR | IRNN-SCAD
1 | 39.244 | 39.471 | 40.191 | 41.202 | 40.534 | 41.186
2 | 33.741 | 33.994 | 37.171 | 38.143 | 37.187 | 38.695
3 | 31.576 | 32.664 | 33.966 | 35.625 | 33.789 | 35.739
4 | 30.092 | 29.765 | 31.517 | 35.173 | 32.143 | 34.379
5 | 25.510 | 26.247 | 26.346 | 27.769 | 26.606 | 27.732
6 | 20.225 | 20.299 | 21.273 | 22.217 | 21.329 | 22.184
7 | 22.478 | 22.787 | 24.373 | 25.314 | 24.264 | 25.200
8 | 25.529 | 25.635 | 27.172 | 27.844 | 27.129 | 28.045

As shown in Table IV, the accuracy of LNM-QR is much better than that of SVT and is slightly better than those of RBF and FTF. However, the PSNR of IRLNM-QR on the eight images, which is much better than those of RBF and FTF, is approximately equal to that of IRNN-SCAD. The reason is that by using the matrices L and R of the previous iteration as an initialization, IRLNM-QR can converge to the optimal solution of an iteratively reweighted nuclear norm method. Some of the recovery results are plotted in Fig. 8 due to space limitations.

As shown in Fig. 8, the recovery result of IRLNM-QR is as clear as that of IRNN-SCAD but much clearer than that of SVT. The recovery results of RBF, FTF, and LNM-QR are very similar to each other, with some abnormal points being present (please see the points between the bird and the tree in Fig. 8 (D), (E), and (F)). Note that by using the outputs of CSVD-QR with one iteration, LNM-QR and IRLNM-QR can converge very efficiently. The PSNR curves of the six methods on the 5th incomplete image in Fig. 5 are shown in Fig. 9.

Fig. 5 The real-world images (1-8) used in Section IV.B.2), which are 1024 × 1024 in size. (A) The original images used in Section IV.B.2). (B) The incomplete images generated.

Fig. 8 The recovery results of the six methods on image 3 in Fig. 5 with text noise. (A) The original image. (B) The incomplete image. (C) IRLNM-QR, PSNR=35.625. (D) LNM-QR, PSNR=33.789. (E) RBF, PSNR=33.966. (F) FTF, PSNR=32.664. (G) SVT, PSNR=31.576. (H) IRNN-SCAD, PSNR=35.739.

Fig. 9 The PSNR curve of the six methods on the 5th incomplete image with texts in Fig. 5.

Fig. 10 The CPU times of the six methods.

Fig. 9 shows that IRLNM-QR is much more accurate, and that IRLNM-QR and LNM-QR can converge with fewer iterations than SVT, RBF, FTF, and IRNN-SCAD. IRLNM-QR converges to the optimal solution after approximately 50 iterations, whereas IRNN-SCAD, RBF, FTF, and SVT require at least 130, 120, 180, and 150 iterations, respectively.

Fig. 10 shows that LNM-QR is a bit faster than IRLNM-QR. The reason is that IRLNM-QR may require a few more iterations to search for a better solution. IRNN-SCAD and SVT require approximately 400 s ∼ 450 s and 580 s ∼ 680 s, respectively. The FTF method, which is almost as fast as RBF, is not fast when recovering real-world images, as the parameter r in FTF should be set to a large value to improve its convergence accuracy.

In general, LNM-QR and IRLNM-QR are approximately 15 times, 15 times, 35 times, and 20 times faster than the FTF, RBF, SVT, and IRNN-SCAD methods, respectively.

V. CONCLUSIONS

To investigate a fast and accurate completion method, a QR-decomposition-based method for computing an approximate SVD (CSVD-QR) is proposed. This method can be used to compute the largest r (r > 0) singular values of a matrix by QR decomposition iteratively. Then, under the framework of matrix tri-factorization, a CSVD-QR-based L2,1-norm minimization method (LNM-QR) is proposed for fast matrix completion. Theoretical analysis shows that the L2,1-norm of a submatrix in LNM-QR can converge to its nuclear norm. Consequently, an LNM-QR-based iteratively reweighted L2,1-norm minimization method (IRLNM-QR) for improving the accuracy of LNM-QR is proposed. Theoretical analysis shows that IRLNM-QR is as accurate as an iteratively reweighted nuclear norm minimization method, which is much more accurate than the traditional QR-decomposition-based matrix completion methods. The experimental results obtained using both synthetic and real-world visual datasets show that LNM-QR and IRLNM-QR are much faster than the FTF, RBF, SVT, and IRNN-SCAD methods. The experimental results also show that IRLNM-QR is almost as accurate as the IRNN method.

VI. APPENDIX

In this appendix, some mathematical details regarding CSVD-QR, LNM-QR, and IRLNM-QR are provided. In addition, Theorems 1, 2 and 3 are proven.

A. Proof of Theorem 1

Suppose that X ∈ R^{m×n} (m ≥ n) is a real matrix, the QR decomposition of which [30] is as follows:

X = LR.  (97)

Let T = QX, where Q ∈ R^{m×m} is an orthogonal matrix. Then, the QR decomposition of T is as follows:

T = L' R',  (98)

where L' ∈ R^{m×m} and R' ∈ R^{m×n} satisfy

L' = QL,  (99)
R' = R,  (100)

where L' is an orthogonal matrix.

Let r = n, L_1 = eye(m, r), and R_1 = eye(r, n). In the jth iteration of CSVD-QR, L_{j+1} is updated as follows:

[Q, T_j] = qr(X R_j^T),  (101)
L_{j+1} = Q(q_1, ..., q_r),  (102)

R_{j+1} is updated as follows:

[Q, T'_j] = qr(X^T L_{j+1}),  (103)
R_{j+1} = Q(q_1, ..., q_r),  (104)
R_{j+1} = R_{j+1}^T,  (105)

and D_{j+1} is updated as follows:

D_{j+1} = T'^T_j(1...r, 1...r).  (106)

If m > n, we let Q ∈ R^{m×n} and T_j ∈ R^{n×n} for Eq. (101).

When j = 1, we have

[L_2, T_1] = qr(X),  (107)
[R_2, D_2^T] = qr(T_1^T L_2^T L_2).  (108)

Since L_2^T L_2 = I, D_2^T is equal to Λ_2 in SVD-SIM.

When j = 2, we have

[L_3, T_2] = qr(X R_2^T),  (109)
X R_2^T = L_2 D_2 R_2 R_2^T  (110)
        = L_2 D_2.  (111)

Consequently,

[L_3, T_2] = qr(L_2 D_2).  (112)

According to Eqs. (99-100),

[L_2^T L_3, T_2] = qr(D_2).  (113)

Because D_2^T is equal to Λ_2, T_2 in Eq. (113) is equal to the S_1 term in Eq. (6). D_3 is updated as follows:

[R_3, D_3^T] = qr(X^T L_3),  (114)
X^T L_3 = R_2^T D_2^T L_2^T L_3.  (115)

According to Eq. (113), Eq. (115) is equal to

X^T L_3 = R_2^T T_2^T.  (116)

According to Eqs. (114) and (116), we have

[R_2 R_3, D_3^T] = qr(T_2^T).  (117)

According to Eq. (6) in Section II.A, D_3^T is equal to Λ_3. Thus, we can conclude that the diagonal entries of D_j are equal to those of Λ_j, where j = 1, ..., N (N > 1).

B. Proof of Theorem 2

We rewrite the problem in Eq. (81) as follows:

min_{X ∈ R^{r×r}} (1/µ)||X||_{w·(2,1)} + (1/2)||X − C||_F^2,  (118)

where µ > 0. According to Eq. (76), the optimal solution Xopt to the problem in Eq. (118) is as follows:

Xopt^j = arg min_{Xopt^j} (w_j/µ)||X^j||_* + (1/2)||X^j − C^j||_F^2,  (119)

where j = 1, ..., r and Xopt = Σ_{j=1}^r Xopt^j. Suppose C^j = UΛV^T is the SVD of C^j. C^j has only one singular value, which can be denoted as σ(C^j). According to Lemma 1, the optimal solution to Eq. (119) is

Xopt^j = (||C^j||_F − w_j/µ)_+ U V^T,  (120)

where ||C^j||_F = σ(C^j). Then, we have

Xopt^j = ((||C^j||_F − w_j/µ)_+ / ||C^j||_F) C^j.  (121)

1 2 Finally, we form the final optimal Xopt as follows: [6] C. Lu, C. Zhu, C. Xu, S. Yan, and Z. Lin. Generalized Singular 3 Value Thresholding, in Proc. of the AAAI Conference on Artificial Xopt = CK, (122) Intelligence, 2015. 4 [7] Y. Kang, Robust and Scalable Matrix Completion, in Proc. of the K = diag(k , ,k ), (123) 5 1 ··· r International Conference on Big Data and Smart Computing, 2016. j wj [8] Y. Chen, Incoherence-Optimal Matrix Completion, IEEE Transactions 6 ( C )+ k = F − µ . (124) on Information Theory, Vol. 61, pp. 2909 – 2923, March, 2015. 7 j j [9] H. Bi, B. Zhang, and W. Hong, Matrix-Completion-Based Airborne # #C F 8 # ∥# ∥ Tomographic SAR Inversion Under Missing Data, IEEE Geoscience and Remote Sensing Letters,Vol. 12, No. 11, pp. 2346 – 2350, Nov. 2015. 9 C. Proof of Theorem 3 [10] J. Huang, F. Nie, and H. Huang, Robust Discrete Matrix Completion, in 10 Proc. of the 27th AAAI Conference on Artificial Intelligence, pp. 424 Because the sequence of D produced by IRLNM-QR 11 { k} – 430, 2012. can converge to a diagonal matrix D that obeys Eq. (75) in [11] R. Ma, N. Barzigar, A. Roozgard, and S. Cheng, Decomposition 12 Approach for Low-Rank Matrix Completion and Its Applications, IEEE Section III, the DT term in Eq. (85) in Section III can also 13 r r Trans. on Signal Processing, Vol. 62, No. 7, Apr. 2014. converge to a diagonal matrix T R × , the entries of which 14 ∈ [12] C. Tzagkarakis, S. Becker, and A. Mouchtaris, Joint low-rank represen- obey Tii Tjj (i 0, and j [1, ,r]. According to Eq. (128) [20] C. Dorffer, M. Puigt, G. Delmaire, and G. Roussel, Fast nonnegative 2 ≥ ··· r ∈ ··· matrix factorization and completion using Nesterov iterations, in Proc. 33 and Eq. (129) (Eq. (80) in Section III), we have of International Conference on Latent Variable Analysis and Signal 34 Separation, Springer, pp. 26 – 35, 2017. 1 ¯ [21] M. Fazel, Matrix Rank Minimization with Applications, PhD thesis, 35 Pj(Λjj)= Tjj (1 kj ) Tjj (130) ∥ ∥1 − µ − ∥ ∥1 Stanford Univ., 2002. 36 k = k¯ T . (131) [22] E. Cande`s and B. Recht, Exact Matrix Completion via Convex Opti- 37 j ∥ jj∥1 mization, Foundations on Computational Math, Vol. 9, pp. 717 – 772, 2009. 38 When k¯ k¯ , P (Λ ) obeys Eq. (23) in Lemma 2, i.e., i ≥ j i ii [23] J. Cai, E. Cande`s, and Z. Shen, A Singular Value Thresholding Method 39 for Matrix Completion, SIAM J. Optimization, Vol. 20, pp. 1956 – 1982, (k¯ T k¯ T )( T T ) 0. (132) 40 i ∥ ii∥1 − j ∥ jj∥1 ∥ ii∥1 −∥ jj∥1 ≥ 2010. 41 [24] K. C. Toh and S. Yun, An accelerated proximal gradient algorithm for Thus, IRLNM-QR can converge to the optimal solution of an nuclear norm regularized linear least squares problems, Pacific Journal 42 iteratively reweighted nuclear norm minimization method and of Optimization, Vol. 6, No. 3, pp. 615 – 640, 2010. 43 can converge with an accuracy equal to that of IRNN. [25] F. Nie, H. Wang, and C. Ding, Joint Schatten-p norm and lp norm 44 robust matrix completion for missing value recovery, Knowledge and Information Systems , Vol. 42, No. 3, pp. 525 – 544, 2015. 45 REFERENCES [26] C. Lu, J. Tang, S. Yan, and Z. Lin, Non-convex Non-smooth Low Rank 46 Minimization via Iteratively Reweighted Nuclear Norm, IEEE Trans. on 47 [1] C. Huang, X. Ding, C. Fang, and D. Wen, Robust Image Restoration via Image Processing, Vol. 25, No. 2, pp. 829 – 839, 2016. Adaptive Low-Rank Approximation and Joint Kernel Regression, IEEE [27] S. Gu, L. Zhang, W. Zuo, and X. Feng, Weighted nuclear norm 48 Trans. on Image Processing, Vol. 23, No. 12, pp. 5284 – 5297, Dec. 

Q. Liu received a B.S. in information and computer science from Inner Mongolia University, China, in 2009, an M.S. in pattern recognition and intelligent systems from Jiangsu University, China, in 2013, and a Ph.D. in control science and engineering from Nanjing University of Science and Technology, China, in 2017. Since 2018, he has been a faculty member in the School of Software, Nanyang Institute of Technology, China. His current research interests include image processing and machine learning.

F. Davoine received his Ph.D. in 1995 from Grenoble, France. He was appointed at Université de technologie de Compiègne, Heudiasyc Lab., France, in 1997 as an associate professor and in 2002 as a researcher at CNRS. From 2007 to 2014, he was on leave at the LIAMA Sino-European Lab. in Beijing, P.R. China, as PI of a project with CNRS and Peking University on multi-sensor-based perception and reasoning for intelligent vehicles. Since 2015, he has been back in Compiègne as PI of a challenge-team within the Laboratory of Excellence MS2T, focusing on collaborative vehicle perception and urban scene understanding for autonomous driving.

J. Yang received a Ph.D. in pattern recognition and intelligent systems from Nanjing University of Science and Technology (NUST), Nanjing, China, in 2002. He was a Post-Doctoral Researcher with the University of Zaragoza, Zaragoza, Spain, in 2003. He was a Post-Doctoral Fellow with the Biometrics Centre, Hong Kong Polytechnic University, Hong Kong, from 2004 to 2006, and with the Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA, from 2006 to 2007. He is currently a Professor at the School of Computer Science and Technology, NUST. His current research interests include pattern recognition and machine learning. He is also an Associate Editor of Pattern Recognition Letters and the IEEE Transactions on Neural Networks and Learning Systems.

Y. Cui received a B.S. in computer science and technology and a Ph.D. in pattern recognition and intelligent systems from Nanjing University of Science and Technology, China, in 2008 and 2015, respectively. From 2013 to 2014, she was a visiting student at the University of Technology, Sydney, Australia. Since 2015, she has been a faculty member in the College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China. Her research interests include pattern recognition and image processing.

Z. Jin received a B.S. in mathematics, an M.S. in applied mathematics, and a Ph.D. in pattern recognition and intelligent systems from Nanjing University of Science and Technology, Nanjing, China, in 1982, 1984, and 1999, respectively. His current interests are in the areas of pattern recognition and face recognition.

F. Han received an M.A. from Hefei University of Technology in 2003 and a Ph.D. from the University of Science and Technology of China in 2006. He is currently a Professor of computer science at Jiangsu University. His research interests include neural networks and bioinformatics.