On the Generic Low-Rank Matrix Completion
Yuan Zhang, Yuanqing Xia∗, Hongwei Zhang, Gang Wang, and Li Dai

arXiv:2102.11490v2 [cs.IT] 7 May 2021

This work was supported in part by the China Postdoctoral Innovative Talent Support Program under Grant BX20200055, the China Postdoctoral Science Foundation under Grant 2020M680016, the National Natural Science Foundation of China under Grant 62003042, and the State Key Program of National Natural Science Foundation of China under Grant 61836001. The authors are with the School of Automation, Beijing Institute of Technology, Beijing 100081, China. Yuanqing Xia is the corresponding author.

Abstract—This paper investigates the low-rank matrix completion (LRMC) problem from a generic vantage point. Unlike most existing work, which has focused on recovering a low-rank matrix from a subset of the entries with specified values, the only information available here is the pattern (i.e., positions) of observed entries. To be precise, given an n × m pattern matrix M (n ≤ m) whose entries are fixed zeros, unknown generic values, or missing values that are free to be chosen from the complex field, the question of interest is whether there is a matrix completion with rank no more than n − k (k ≥ 1) for almost all values of the unknown generic entries. This problem is called the generic low-rank matrix completion (GLRMC) associated with (M, k), abbreviated as GLRMC(M, k). Leveraging an elementary tool from the theory of systems of polynomial equations, we give a simple proof of the genericity of this problem, which was first observed by Király et al.; that is, depending on the pattern of the observed entries, the answer is either yes or no for almost all realizations of the unknown generic entries. We provide necessary and sufficient conditions ensuring feasibility of GLRMC(M, 1). Also, we provide a sufficient condition and a necessary condition (which we conjecture to be sufficient too) for the general case (i.e., k ≥ 1). In addition, two randomized algorithms are presented to determine upper and lower bounds for the generic minimum completion rank. Our approaches are based on algebraic geometry and on a novel basis preservation principle discovered herein. Finally, numerical simulations are given to corroborate the theoretical findings and the effectiveness of the proposed algorithms.

Index Terms—Low-rank matrix completion, generic property, algebraic geometry, basis preservation principle, graph theory

I. INTRODUCTION

Matrix completion, that is, the problem of completing a matrix and recovering the missing entries, has long been a research focus in different communities [1–6]. The key idea enabling the recovery is the “regularity” present in both the missing and the available entries. Low rank is an attribute characterizing such regularity, and it has been thoroughly exploited in the so-called low-rank matrix completion (LRMC). Low-rank matrices naturally arise in a variety of practical scenarios, including collaborative filtering in recommendation systems [7], image and video compression and recovery [8], reduced-order controller design [4], and tensor decomposition in hyperspectral imaging [6], to name a few. In computer vision, for example, recovering or denoising a scene from a sequence of images (frames) can be formulated as an LRMC problem, where the matrix to be completed consists of spatial coordinates and temporal (frame) indices [9, 10].

The holy grail of the LRMC is that the essential information of a low-rank matrix could be contained in a small subset of entries. Therefore, there is a chance to complete a matrix from a few observed entries [5, 11, 12].

Thanks to the widespread applications of LRMC in diverse fields, many LRMC techniques have been developed [4, 5, 8, 12–16]; see [11, 17] for related surveys. Since the rank constraint is nonconvex, the minimum rank matrix completion (MRMC) problem, which asks for the minimum rank over all possible matrix completions, is NP-hard [4]. A natural surrogate of the MRMC problem is to relax the rank optimization problem using nuclear norm minimization (NNM), which can be solved efficiently by semi-definite programming (SDP) [4, 13, 18]. It has been proven that the NNM can perfectly recover a low-rank matrix, provided that the number of observed entries, whose locations are uniformly sampled at random, exceeds a certain threshold depending on the rank and coherence of the low-rank matrix. A regularization term was added to the NNM function, resulting in a singular value thresholding (SVT) algorithm of lower computational complexity than SDP [14]. Some algorithms target Frobenius norm minimization (FNM) subject to known rank constraints. These include, to name a few, ADMiRA, which uses greedy projection to identify a set of rank-one matrices that approximate the original matrix [19]; LMaFit, based on a nonlinear successive over-relaxation technique [8]; SVP, using singular value projection [15]; and RKHS, via reproducing kernel Hilbert spaces [16]. Inference and uncertainty quantification in noisy matrix completion has recently been addressed in [12].

Despite the above achievements, in a more practical scenario where the patterns of observed entries may not be random, or where no prior knowledge about the rank or coherence of the low-rank matrix is available, little can be claimed for these algorithms in terms of exact recovery guarantees. This is particularly the case when the observed entries are corrupted by noise with unknown statistical characteristics, rendering their exact values unavailable. This has inspired us to ask a meaningful and fundamental question: given only a pattern of the observed and missing entries, does a matrix completion of a prescribed rank exist for almost all values of the observed entries?

To address this question, this paper inherits and develops the framework termed generic low-rank matrix completion (GLRMC). In the literature, ‘genericity’ means ‘typical behavior’ or ‘happens with probability 1’. In other words, a property is generic if it holds for almost all¹ parameters in the corresponding space; the parameters violating the property have zero measure. Genericity has attracted considerable attention in control theory [20–22] and rigidity theory [23]. The study of genericity in the LRMC was initiated in the pioneering work [24], where the authors introduced an algebraic-combinatorial approach to study some fundamental problems in the LRMC from a generic viewpoint. This approach was later extended to matrix completions in the real field [25, 26], as well as to symmetric matrix completions [27]. However, except for some special cases, such as rank-one matrix completion or the completion of an r × r square matrix to rank r − 1 [24, 26], the above-mentioned question remains largely unresolved.

In the GLRMC of this paper, given a pattern of observed entries, the observed entries are classified into two types, namely, fixed zero entries and unknown generic entries (the former need not be present), and the missing entries are free to be chosen from the complex field. The introduction of fixed zero entries is intended to fit a broader scenario where the zero/nonzero information of the observed entries may be available, which also distinguishes the present work from previous ones on this topic [24–27]. We show that the existence of a matrix completion with a prescribed rank r is generic. Put differently, either the LRMC with rank r is feasible for almost all realizations of the unknown generic entries, or it is infeasible for almost all realizations. The genericity in LRMC without the fixed zero entries was first proved in [24], where the Jacobian of the masking operator was used. Here, we provide an alternative, simpler proof based on an elementary tool from the theory of systems of polynomial equations. For an n × m rectangular pattern (n ≤ m), we give necessary and sufficient deterministic conditions for the feasibility of [...]

[...] operations such as maximum matching computations and set unions/intersections, and does not involve the Jacobian matrix or random matrices adopted in [24]. As the exact values of observed entries are no longer needed, the proposed approach is robust against rounding errors and noise corruption. In particular, in scenarios where exact values of the observed entries are not available or are corrupted by noise with unknown statistical characteristics, the GLRMC can provide information beyond the scope of most existing LRMC approaches. Notably, for LRMC techniques that require a priori rank knowledge, such as LMaFit [8] and SVP [15], our approach may come in handy by providing reasonable prior bounds.

It is also worth stressing that genericity does not rule out ‘exceptional’ cases. When additional knowledge on the observed entries is available, corresponding modifications of the GLRMC approach might be needed before it becomes valid. As a matter of fact, for a specific pattern, [28] has provided conditions for the uniqueness of rank-r matrix completion, but it may still require exact values of the observed entries. GLRMC also differs from the minimum rank problem of a graph studied in [29, chap. 46]. The latter seeks the smallest possible rank over all matrices whose (i, j)th entry is nonzero whenever (i, j) is an edge of that graph and zero otherwise.

The rest of this paper is organized as follows. Section II presents the problem formulation. Section III establishes genericity of the feasibility of the LRMC for a given pattern. Section IV provides necessary and sufficient conditions for the feasibility of GLRMC under the constraint that the rank of the completion is reduced by at least one, with the proof given in Section V. Section VI extends the results to the general case. Numerical simulations are presented in Section VII to validate the theoretical findings.