GRAPH : FOUNDATIONS AND EMERGING DIRECTIONS

Yuichi Tanaka, Yonina C. Eldar, Antonio Ortega, and Gene Cheung

Sampling Signals on Graphs From theory to applications

he study of sampling signals on graphs, with the goal of building an analog of sampling for standard signals in the T time and spatial domains, has attracted considerable atten- tion recently. Beyond adding to the growing theory on graph signal processing (GSP), sampling on graphs has various promising applications. In this article, we review the current progress on sampling over graphs, focusing on theory and potential applications. Although most methodologies used in graph signal sam- pling are designed to parallel those used in sampling for standard signals, sampling theory for graph signals signifi- cantly differs from the theory of Shannon–Nyquist and shift- invariant (SI) sampling. This is due, in part, to the fact that the definitions of several important properties, such as shift invariance and bandlimitedness, are different in GSP systems. Throughout this review, we discuss similarities and differenc- es between standard and graph signal sampling and highlight open problems and challenges.

Introduction Sampling is one of the fundamental tenets of digital signal pro- cessing (see [1] and the references therein). As such, it has been studied extensively for decades and continues to draw consid- erable research efforts. Standard sampling theory relies on concepts of frequency domain analysis, SI signals, and band- ©ISTOCKPHOTO.COM/ALISEFOX limitedness [1]. The sampling of time and spatial domain sig- nals in SI spaces is one of the most important building blocks of digital signal processing systems. However, in the big data era, the signals we need to process often have other types of connections and structure, such as network signals described by graphs. This article provides a comprehensive overview of the theo- ry and algorithms for the sampling of signals defined on graph domains, i.e., graph signals. GSP [2]–[4]—a fast developing field in the signal processing community—generalizes key sig- nal processing ideas for signals defined on regular domains to discrete-time signals defined over irregular domains described Digital Object Identifier 10.1109/MSP.2020.3016908 Date of current version: 28 October 2020 abstractly by graphs. GSP has found numerous promising

14 IEEE SIGNAL PROCESSING MAGAZINE | November 2020 | 1053-5888/20©2020IEEE

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. applications across many engineering disciplines, including []X ij . A subvector of x is denoted xS with indicator index set S. NM# ||||RC# image processing, wireless communications, machine learn- Similarly, a submatrix of X ! R is denoted XRC ! R , ing, and data mining [2], [3], [5]. where indicator indices of its rows and columns are given by R Network data are pervasive and found in applications such and C, respectively; XRR is simply written as XR . as sensor, neuronal, transportation, and social networks. The number of nodes in such networks is often very large: process- Review: GSP and sampling in Hilbert spaces ing and storing all of the data as is can require huge computa- tion and storage resources, which may not be tolerable even The basics of GSP in modern, high-performance communication and computer We denote by GV= (,E) a graph where V and E are the sets systems. Therefore, it is often of interest to reduce the amount of vertices and edges, respectively. The number of vertices is of data while keeping the important information as much as N = | V | unless otherwise specified. We define an adjacency possible. Sampling of graph signals addresses this issue: how matrix W where entry []W mn represents the weight of the edge one can reduce the number of samples on a between vertices m and n; []W mn = 0 for graph and reconstruct the underlying signal, GSP has found numerous unconnected vertices. The degree matrix generalizing the standard sampling paradigm promising applications D is diagonal, with mth diagonal element to graph signals. across many engineering []DWmm = Rnm[ ] n . In this article, we con- Generalization of the sampling prob- disciplines, including sider undirected graphs without self-loops, lem to GSP raises a number of challenges. i.e., []WWmn = []nm and []W nn = 0 for all First, for a given graph and graph operator, image processing, m and n, but most theory and methods dis- the notion of frequency for graph signals wireless communications, cussed can be extended to signals on direct- is mathematically straightforward, but the machine learning, and ed graphs. connection of these frequencies to the actual data mining. GSP uses different variation operators [2], properties of signals of interest (and, thus, [3] depending on the application and assumed the practical meaning of concepts, such as bandlimitedness and signal and/or network models. Here, for concreteness, we focus on smoothness) is still being investigated. Second, periodic sam- the graph Laplacian LD:=-W or its symmetrically normal- pling, widely used in traditional signal processing, is not appli- ized version LD:.= --12//LD 12 The extension to other variation cable in the graph domain (e.g., it is unclear how to select “every operators (e.g., adjacency matrix) is possible with a proper modifi- other” sample). Although the theory of sampling on graphs and cation of the basic operations discussed in this section. Since L is a manifolds has been studied in the past (see [6], [7], and follow- real symmetric matrix, it always possesses an eigendecomposition < up works), early works have not considered the problems inher- LU= KU , where Uu= [,1 f,]uN is an orthonormal matrix ent in applications, e.g., how to select the best set of nodes from containing the eigenvectors ui, and K = diag(,mm1 f,)N con- a given graph. 
Therefore, developing practical techniques for sists of the eigenvalues mi. We refer to mi as the graph frequency. sampling set selection that can adapt to local graph topology is A graph signal x :V " R is a function that assigns a value to very important. Third, the work to date has mostly focused on each node. Graph signals can be written as vectors, x, in which the direct nodewise sampling, while there has been only limited nth element, xn[], represents the signal value at node n. Note that work on developing more advanced forms of sampling, e.g., any vertex labeling can be used, since a change in labeling simply adapting SI sampling [1] to the graph setting [8]. Finally, graph results in row/column permutation of the various matrices, their signal sampling and reconstruction algorithms must be imple- corresponding eigenvectors, and the vectors representing graph mented efficiently to achieve a good tradeoff between accuracy signals. The graph Fourier transform (GFT) is defined as and complexity. To address these challenges, various graph sampling N- 1 xit []==GHuxii,[/ u nxn][]. (1) approaches have recently been developed (e.g., [7] and [9]–[14]) n = 0 based on different notions of graph frequency, bandlimitedness, and shift invariance. For example, a common approach to define Other GFT definitions can also be used without changing the the graph frequency is based on the spectral decomposition of framework. In this article, for simplicity, we assume real-val- different variation operators, such as the adjacency matrix or ued signals. Although the GFT basis is real valued for undi- variants of graph Laplacians. The proposed reconstruction pro- rected graphs, extensions to complex-valued GSP systems are cedures in the literature differ in their objective functions, lead- straightforward. ing to a tradeoff between accuracy and complexity. Our goal is A linear graph filter is defined by G ! RNN# , which, applied to provide a broad overview of existing techniques, highlighting to x, produces an output what is known to date to inspire further research on sampling yG= x. (2) over graphs and its use in a broad class of applications in signal processing and machine learning. Vertex- and frequency-domain graph filter designs considered In what follows, we use boldfaced lowercase (uppercase) sym- in the literature both lead to filters G that depend on the struc- bols to represent vectors (matrices); the ith element in a vector x ture of the graph G. Vertex-domain filters are defined as poly- is xi[] or xi; and the ith row, jth column of a matrix X is given by nomials of the variation operator, i.e.,

IEEE SIGNAL PROCESSING MAGAZINE | November 2020 | 15

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. P p is the sampled cross correlation. Thus, we may view sampling yG==xLc p x, (3) / in the Fourier domain as multiplying the input spectrum by the eop = 0 filter’s frequency response and subsequently aliasing the result where the output at each vertex is a linear combination of sig- with uniform intervals that depend on the sampling period. In nal values in its P-hop neighborhood. In frequency-domain fil- BL sampling, st()-=sinc(/tT), where sinc()tt= sin()rr/( t), ter design, G is chosen to be diagonalized by U so that while s(t) may be chosen more generally in the generalized sam- pling framework. yG==xUgt ()K Ux< , (4) Recovery of the signal from its samples c is represented as

* where ggtt()K :(= diag ()mm1 ,,f gt ()N ) is the graph frequen- xWu ==Hc WH()Sx, (8) cy response. Filtering via (4) is analogous to filtering in the Fourier domain for conventional signals. When there where W is a set transformation corresponding to a basis exist repeated eigenvalues mmij= , their graph frequency {}wn for the reconstruction space, which spans a closed responses must be the same, i.e., ggtt()mmij= (). If gt ()mi is subspace W of H. The transform H is called the correc- a Pth order polynomial, (4) coincides with vertex-domain tion transformation and operates on the samples c prior to filtering (3). recovery. The reconstruction problem is to choose H so that xu is either equal to x or as close as possible under a desired Generalized sampling in Hilbert spaces metric. Typically, solving this problem requires making as- We next briefly review generalized sampling in Hilbert spac- sumptions about x, e.g., that it lies in a known subspace or es [1] (see Figure 1). We highlight generalized sampling with is smooth. a known SI signal subspace [1], a generalization of Shan- In the SI setting, the recovery corresponding to (8) is given by non–Nyquist sampling beyond bandlimited (BL) signals. Reviewing these cases will help us illustrate similarities and xtu ()=-/ ([hn][) cn])wt()nT , (9) differences with respect to sampling and reconstruction in n ! Z the graph setting. where a discrete-time correction filter hn[] is first applied to Let x be a vector in a Hilbert space H and cn[] be its cn[]: the output dn[]= hn[]) cn[] is interpolated by a filter nth sample, cn[]=GHsxn, , where {}sn is a Riesz basis, and w(t) to produce the recovery xtu (). GH·,· is an inner product. Denoting by S the set transformation Suppose that x lies in an arbitrary subspace A of H , and * corresponding to {}sn , we can write the samples as cS= x, assume that A is known. This is one of the well-studied sig- where ·* represents the adjoint. The subspace generated by nal models, i.e., signal priors, that have been considered in {}sn is denoted by S. In the SI setting, ssn =-()tnT for a generalized sampling. With this subspace prior, x can be rep- real function s(t) and a given period T. The samples can then resented as be expressed as x ==/ dna[] n Ad, (10) cn[]=-GHst()nT ,(xt)(=-xt)() st),tn= T (5) where {}an is an orthonormal basis for A, and dn[] are the where * denotes convolution. The continuous-time Fourier expansion coefficients of x. In the SI setting, x(t) is written as transform (CTFT) of cn[],(C ~), can be written as xt()=-/ dnat[] ()nT (11) CR()~~= SX (), (6) n ! Z where for some sequence dn[], where a(t) is a real generator satisfy- ing the Riesz condition. In the Fourier domain, (11) becomes 3 12* ~r--k ~r2 k RSX ()~ := S X (7) / jT~ T k =-3 ``T jjT XD()~~= ()eA(), (12)

SamplingReconstruction Generation SamplingReconstruction c[n] d[n] x ! RN c ! RM x S∗ HWx dxAHS W Sampling Correction Recon. Correction Recon. Transformation Transformation Transformation Operator Operator

(a) (b)

FIGURE 1. The generalized sampling frameworks for (a) sampling in Hilbert space and (b) its GSP counterpart.

16 IEEE SIGNAL PROCESSING MAGAZINE | November 2020 |

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. where A()~ is the CTFT of a(t), and De()jT~ is the discrete- in Hilbert Spaces” section can be written in matrix form [15]. time Fourier transform of dn[] and is 2r/T periodic. Similar to (10), we can assume any graph signal x is modeled In this article, we focus on generalized sampling for the by a known generator matrix A ! RNK# (KN# ) and expan- unconstrained case, where an arbitrary transformation can sion coefficients d ! RK as follows: be used as W. We can also consider generalized sampling xA:.= d (17) for a predefined W (see [1] and the references therein). In the unconstrained setting, we may recover a signal in A The graph sampling operator is a matrix S ! RNM# MN# by choosing WA= in (8). If SA* is invertible, then perfect that, without loss of generality, is assumed to have linearly^h in- recovery of any x ! A is possible by using HS= ()*1A - . dependent columns that span a sampling space, S 1 RN . The Invertibility can be ensured by the direct-sum (DS) condition: sampled signal is then given by A and S= intersect only at the origin and span H jointly so that HA= 5 S=. Under the DS condition, a unique recovery cS:.= < x ! RM (18) is obtained by an oblique projection operator onto A along S= given by Since A is known, signal recovery may be given by using (16): xAu ==()SA**-1 Sx x. (13) xu ==AHcA()SA< @

2 and S as well as efficient implementation of the pseudoin- xxu = argmin (15) verse described. In some cases, analogous to the SI setting in xS! A, * xc= standard sampling, this inverse can be implemented by fil- and is given by tering in the graph Fourier domain, as we show in the next section. Next, we describe some typical graph signal models xAu = ()SA**@ Sx. (16) (i.e., specific A’s) as well as two sampling approaches (i.e., The correction transformation is HS= ()* A @ , and ·@ represents choices of S). the Moore–Penrose pseudoinverse. Its corresponding form in the SI setting is the same as (14), but H()~ = 0 if RSA ()~ = 0. Graph signal models These results on sampling in Hilbert space are based on a The signal generation and recovery models of (17) and (19) are signal model where x ! A, with A assumed to be a known valid for any arbitrary signal subspace represented by A. Here, subspace. These results have also been extended to include we introduce several models of A proposed in the literature various forms of smoothness on x as well as sparsity. that are related to the specific graph G on which we wish to process data. Graph sampling theory The most widely studied graph signal model is the BL In this section, we first describe a general graph signal sam- signal model. This model corresponds to AU= VB , where pling and reconstruction framework that is inspired by that of B :{= 1,,f K}. A BL signal is, thus, written by

the “Generalized Sampling in Hilbert Spaces” section. Then, K we discuss the graph signal subspaces proposed in the litera- xuBL ==/ di i UdVB , (20) ture. Two definitions of graph signal sampling, which are gen- i = 1 eralizations of those studied in standard sampling, are also de- where ~m:= K is called the bandwidth or cutoff frequency of scribed. Finally, we present recovery experiments for BL and the graph signal. The signal subspace of ~-BL graph signals

full-band signals, both of which can be perfectly reconstructed on G is often called the Paley–Wiener space PW~ (G) [7], based on the proposed framework. [10]. In spectral graph theory [16], [17], it is known that ei- genvectors corresponding to low graph frequencies are smooth General sampling and recovery framework within clusters, i.e., localized in the vertex domain. We consider finite dimensional graphs and graph signals for A more general frequency domain subspace model could which the generalized sampling in the “Generalized Sampling be obtained as

IEEE SIGNAL PROCESSING MAGAZINE | November 2020 | 17

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. The Relationship Between Periodic Graph Spectrum and Shift-Invariant Signals

To connect signal generation models of graph signals to D(ej T) (Repeated Twice) those of shift-invariant (SI) signals, the periodic graph spec- trum (PGS) subspace has been proposed in [8]. 1 A( ) Definition S1: PGS subspace A K-dimensional PGS subspace, where KN# , of a given graph G is a space of graph signals that can be ex­­ pressed as a graph Fourier transform (GFT) spectrum fil- tered by a given generator: 0 r 2r N-1 XPGS ==xn[]xn[] | daiKmod t ()miiun[], (S1) (a) ( i = 0 2

where at ()mi is the graph frequency-domain response

of the generator, and di is an expansion coefficient. A signal in a PGS subspace can be represented in (22). Definition S1 assumes the graph spectrum is periodic. j T The PGS subspace is related to the signal subspace X( ) = D(e ) A( ) X[i]: DFT Spectrum of continuous-time SI signals in (12). Suppose that T in 1 (12) is a positive integer; i.e., the spectrum De()jT~ is (Spectrum on a Cycle Graph) repeated T times within ~r! 02,, and A()~ in 6 @ (12) has support ~r! 02,. In this case, a sequence jT~ 6 @ X[]i = De()A()~ u~r= 2 iN/ (iN=-01,,f ) corresponds to the discrete Fourier transform (DFT) spectrum of length N. Therefore, this X[]i can be regarded as a 0 2r graph signal spectrum in a PGS subspace if the GFT is (b) identical to the DFT (by relaxing to a complex U); e.g., the graph G is a cycle graph, i.e., a periodic FIGURE S1. The relationship between PGS and SI signals for T = 2: graph consisting of a ring. This relationship is illus- (a) the signal generation model in SI space and (b) the spectrum of trated in Figure S1. the sampled SI signal. The correction filter HS= ()R A @ for signals in a PGS sub- H is equivalent to filtering in the graph frequency space mimics the frequency response of (14). Suppose that domain with correction filter [8] the graph frequency-domain sampling in the “Definition 2: Graph Frequency-Domain Sampling” section is used. t 1 u ! h()mi = , (S2) The DS condition in this case implies Rga ()mi 0 for all Ru ga ()mi

iK=1,,f , where Ru ga ()mmii:= |,,gt()++Kiat ()m K, . If x ! XPGS and the DS condition holds, then the correction transform which clearly parallels the SI expression (14).

K N where Dsamp is the matrix for the GFT domain upsampling xu==//dai t i ()m jj Ad, (21) i ==11j (details are given in the “Definition 2: Graph Frequency Do- main Sampling” section). This model parallels those stud- K N where d ! R , and the ith column of A is R j = 1 at ij()m u j. ied in the SI setting and leads to recovery methods based In this case, each of the at i ()m imposes a certain spectral on filtering in the graph frequency domain, similar to (14). shape (e.g., exponential), and the parameter di controls This relationship­ is described in “The Relationship Between how much weight is given to the ith spectral shape. It is Periodic Graph Spectrum and Shift-Invariant Signals.” clear that (21) includes (20) as a special case by choosing The vertex-domain subspace model can also be consid- at i ()m appropriately. ered. Let {Ti} iK= 1,,f be a partition of V, where each ^h Another special case of (21) has been proposed by assum- node in Ti is locally connected within the cluster. A piece- ing periodicity of the spectrum [8]. A signal in a periodic graph wise constant graph signal is then given by spectrum (PGS) subspace can be represented as K

xdPC ==di 11TiK[,TT1 f,]1 , (23) < / xUPGS = at ()K Ddsamp , (22) i = 1

18 IEEE SIGNAL PROCESSING MAGAZINE | November 2020 |

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. where []1Ti n = 1 when the node n is in Ti and 0 otherwise design of graph wavelets/filter banks [19]. Graph frequency-

[18]. In this case, A = [,11T1 f,]TK . Piecewise smooth graph domain sampling is defined as follows. signals can be similarly defined. Graph signals parameterized by the various models intro- Definition 2: Graph frequency-domain sampling duced earlier can be perfectly recovered by (19) beyond the BL Let xt ! RN be the original signal in the graph frequency do- assumption in (20) as long as the class of signals to be recon- main, i.e., xUt = < x, and let gt ()K be an arbitrary sampling filter structed corresponds to a vector subspace of sufficiently low expressed in the graph frequency domain. For a sampling ratio dimension relative to the sampling rate. While many studies M ! Z, where M is assumed to be a divisor of N for simplicity, the have focused on the BL setting, the important point we stress sampled graph signal in the graph frequency domain is given by here is that, when a proper signal model is given by prior infor- mation or by estimating/learning from data, recovery is often cD= samp gtt()K x, (25) possible whether or not the signal is BL. Appropriate modeling of graph signal subspaces beyond those already shown is an where interesting future research topic from both a theoretical and an application point of view. DIsamp = MMI f (26) 6 @ Sampling methods is the spectrum folding matrix. Similar to time- and frequency-domain sampling for signals in The sampling matrix S< in the graph frequency domain is, SI space, i.e., (5) and (6), graph signal sampling can be defined thus, given by in both the vertex and spectral domains. For time-domain sig- << nals, there is a simple relationship between sampling in both SD= samp gt ()K U . (27) domains, as can be seen from (11) and (12). In contrast, for general graphs, direct nodewise sampling in the vertex domain While this definition is clearly an analog of the frequency- (i.e., selecting a subset of nodes) does not correspond to a spec- domain sampling in (6), in general, it cannot be written as an trum folding operation in the graph frequency domain, and vice operator of the form of IGTV , i.e., graph filtering, as defined versa. Thus, we discuss vertex and frequency-domain graph in the “Review: GSP and Sampling in Hilbert Spaces” section, sampling separately. and vertex-domain sampling, except for some specific cases, such as cycle or bipartite graphs [8], [12], [19]. Therefore, graph Vertex-domain sampling frequency-domain sampling requires samples in all nodes to Vertex-domain sampling is an analog of time-domain sam- be available before performing sampling. pling. Samples are selected on a predetermined sampling set, While this property may not be desirable for a direct appli- TV! , containing T = M nodes. Sampling set selection is cation of sampling, e.g., obtaining samples at a subset of nodes described in the “Sampling Set Selection and Efficient Com- in a sensor network, assuming all of the samples are available putation Methods” section. For a given T, we define sampling before sampling is reasonable for several applications. For as follows. example, graph filter banks require the whole signal prior to filtering and sampling (whether it is performed in the vertex or Definition 1: Vertex-domain sampling graph frequency domain), along with a strategy to downsample Let x ! RN be a graph signal and G ! RNN# be an arbitrary the filtered signals to achieve critical sampling. graph filter in (2). Suppose that the sampling set T is given a It has been shown that a filter bank with frequency-domain priori. The sampled graph signal c ! 
RM is defined by [9], [10] downsampling can outperform one using vertex-domain sam- pling [19]. Graph frequency-domain sampling also outperforms cI= TV Gx, (24) that in the vertex domain with generalized graph signal sampling frameworks [8], [15]. See “An Illustrative Example of Sampling where ITV is a submatrix of the identity matrix whose rows Procedures” for a comparison between graph signal sampling are indicated by the sampling set T. and conventional discrete-time signal sampling, which shows < The sampling matrix is, therefore, given by SI= TV G. the lack of equivalence between vertex and frequency sampling. Though many methods in the literature consider direct sam- pling, where GI= , an arbitrary G can be used prior to node- Remarks on correction and reconstruction transforms wise sampling. In the SI setting, signal recovery can be implemented in the time domain as (9) with counterparts in the Fourier domain as Graph frequency-domain sampling in (14). However, this is not the case for vertex-domain sam- Sampling in the graph frequency domain [12] parallels the pling: While the sampling matrix S in Definition 1 is designed Fourier domain sampling in (6): the graph Fourier-transformed to parallel sampling in the time domain, the correction matrix input xt is first multiplied by a graph spectral filter gt ()K ; the HS= ()< A @ does not have a diagonal graph frequency re- product is subsequently folded with period M, resulting in sponse in general. Refer to “Bandlimited Signal Recovery With aliasing for full-band signals, which can be utilized for the Vertex-Domain Sampling” for an example with BL signals.

IEEE SIGNAL PROCESSING MAGAZINE | November 2020 | 19

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. An Illustrative Example of Sampling Procedures

In Figure S2(a), standard discrete-time sampling is shown DFT of the frequency-sampled signal. Figure S2(b) illus- in both the time and frequency domains. Pointwise sam- trates graph signal sampling in the vertex and graph fre- pling in the time domain corresponds to folding of the dis- quency domains (Definitions S1 and 2), which do not yield crete Fourier transform (DFT) spectrum [1]. Note that both the same output, unlike their conventional signal counter- sampling methods yield the same output after the inverse parts of Figure S2(a).

Original Discrete-Time Signal Time Domain Time-Domain Sampling

n n

DFT DFT Frequency Domain DFT Spectrum Inverse DFT

k

Frequency-Domain Sampling (Spectrum Folding)

(a)

Vertex Domain Original Graph Signal

Vertex-Domain Sampling

GFT Not Identical Graph Frequency Discrete GFT Spectrum Domain

k Frequency-Domain Sampling (Spectrum Folding)

(b)

FIGURE S2. The sampling procedures for (a) standard and (b) graph signals. GFT: graph Fourier transform.

20 IEEE SIGNAL PROCESSING MAGAZINE | November 2020 |

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. Bandlimited Signal Recovery With Vertex-Domain Sampling

Assume we have a bandlimited signal xBL as defined in cannot be written as a graph spectral filter having a diago- (20) and we use direct nodewise sampling, i.e., GI= . nal frequency response. Even if the sampling filter G is This is a well-studied setting for graph sampling theory. applied before nodewise sampling, perfect or least-squares @ The DS condition in this case is often called the uniqueness ­recovery is obtained by replacing ()UTB in (S3) with @ set condition [7], [10] and requires a full-rank IUTV VB = ()GUTV VB , while AU= VB does not change. An approxi- MM# UTB ! R [9], [10]. (We assume MK= .) In this case, mate recovery of bandlimited graph signals is possible with

we have AU= VB, and (19) is reduced to an alternative approach, e.g., an iterative algorithm using polynomial filters and projection onto convex sets [S1]. @ u VB TB xU= ()Uc, (S3) Reference [S1] S. K. Narang, A. Gadde, and A. Ortega, “Signal processing techniques M = TB= . for interpolation in graph structured data,” in Proc. IEEE Int. Conf. Acoustics, where we assume In other words, the cor- Speech and Signal Processing (ICASSP), 2013, pp. 5445–5449. doi: @ rection transform is given by H = ()UTB . As a result, H 10.1109/ICASSP.2013.6638704.

Different Reconstruction Operators

The reconstruction in (19) allows for perfect signal recov- consistent solution, i.e., such that Sx< = Sx< u, and instead ery under the discrete-sum condition, when the signal solve the following problem: lies in a given subspace. However, we may not always < 2 xSu =-argmin xc2 + c Vx p, (S5) have such a strong prior. For example, we may only x ! RN know that our signal is smooth in some sense. A popular approach in this case is to consider the following recov- where c20 is a regularization parameter. Fast and effi- ery [1], [20]: cient interpolation algorithms have also been studied in [S1] and [S2] based on generalizations of standard signal xVu = argmin x p, (S4) Sx< = c processing techniques to the graph setting.

V p $1. Reference where is a matrix that measures smoothness, and [S2] A. Heimowitz and Y. C. Eldar, “Smooth graph signal interpolation for If there is noise, we can relax the goal of achieving a big data,” 2018, arXiv:1806.03174.

M In this section, for simplicity, we have considered the function is at ()mmii=-12/,mmax and each element in d ! R case where measurement, sampling, and recovery are is drawn from N(,11). The sampling filter is also full-band noise-free. In the presence of noise, the reconstruction of (19) where gt ()mmii=-exp(/2). may be replaced by noise-robust methods. See sidebar “Differ- As shown in Figure 2(a), both vertex and frequency sam- ent Reconstruction Operators” for an example. Note that the pling methods can recover the BL graph signal. Note that c is recovery procedures for the noisy cases have been well studied identical to d for graph frequency-domain sampling. In con- in the context of (generalized) sampling theory for standard trast, Figure 2(b) shows that the original signal oscillates in the signals [1] as well as compressed sensing [20]. Robustness vertex domain due to its full-band generator function. Also, c against noisy measurements is also a major motivation to opti- of graph frequency-domain sampling does not match the origi- mize the sampling set selection of graphs, which we discuss in nal spectrum due to aliasing and the sampling filter. However, the next section. even in that case, the original signal is perfectly recovered when the signal subspace is given. Graph signal sampling and recovery example for synthetic signals Sampling set selection and To illustrate signal recovery both for BL and full-band settings, efficient computation methods we consider the random sensor graph example of Figure 2, The recovery method in (19) can be possible only if the signal with N = 64 and M = 15. The first scenario is the well-known subspace, e.g., cutoff frequency in the BL setting, is known BL setting, where the signal is BL as in (20), with K = 15, and perfectly a priori. However, in practice, the cutoff frequency is the sampling filter is the identity matrix, i.e., GI= . In the sec- often unknown (and, thus, can at best be estimated), or the sig- ond scenario, we use the full-band generator in the graph fre- nal is smooth but not strictly BL in the first place. Furthermore, quency domain with the PGS model in (22) [8]. The generator observed samples may be corrupted by additive noise. Thus,

IEEE SIGNAL PROCESSING MAGAZINE | November 2020 | 21

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. and the x 1 0.5 0 –0.5 –1 1 0.5 0 –0.5 –1 1 0.5 0 –0.5 –1 1 0.5 0 –0.5 –1 ma m /, 12 =- and the sampling filter is ii 15 mm () = t a K tex Sampling): MSE = 3.62e-23 tex Sampling): MSE = 1.80e-23 . Reconstructed (GFT Sampling): MSE = 3.74e-31 Reconstructed (GFT Sampling): MSE = 8.34e-31 Reconstructed (Ver Reconstructed (Ver 4 4 1 0.5 0 –0.5 –1 1 0.5 0 –0.5 –1 (a) BL sampling and recovery, where the signal is BL with (a) BL sampling and recovery, . 15 tex Sampling) tex Sampling) m m = M by using both vertex and graph frequency-domain sampling without changing the framework. Graph (a) (b) c Original Spectrum Sampled Signal (GFT Sampling) Original Spectrum Sampled Signal (GFT Sampling) has length c Sampled Signal (Ver Sampled Signal (Ver 0123 0123

4 3 2 1 0 4 2 0

–1 –2 Spectrum Spectrum The sample . 64 = N tex-Domain tex-Domain Ver Sampling GFT-Domain Sampling Ver Sampling GFT-Domain Sampling 1 0.5 0 –0.5 –1 1 0.5 0 –0.5 –1 Even in this case, the original signal is perfectly recovered from ). 2 (/ p ex =- ii Original Original mm () t g BL Sampling Non-BL Sampling Sampling examples for signals on a random sensor graph with FIGURE 2. the identity matrix. (b) Sampling and recovery of the graph signal lying in a known subspace, where original is generated by PGS model [8] with generator function sampling function is Signal Processing toolbox (GSPBOX) (https://epfl-lts2.github.io/gspbox-html/) is used for obtaining the random sensor graph. MSE: mean square error

22 IEEE SIGNAL PROCESSING MAGAZINE | November 2020 |

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. practical sampling set selection algorithms often aim at Deterministic sampling set selection ­maximizing robustness to noise or imperfect knowledge Two main types of deterministic sampling set selection methods of sampled signal characteristics. In this section, efficient have been proposed in the literature. First, vertex-based methods sampling set selection methods for vertex-domain sampling have been studied extensively in machine learning and sensor net- are examined. work communities as a sensor placement ­problem. (See fur- Along with signal reconstruction quality, computa- ther discussion on applications in the “Applications” section). tional complexity is another key concern when design- Second, spectrum-based methods—selection schemes grounded ing sampling algorithms since signals may reside on very in graph frequency assumptions—represent a relatively new ap- large graphs. Often, one would like to avoid computing the proach and have been studied in the context of graph sampling eigendecomposition of the chosen graph variation operator, theory. We focus on the latter approach due to space limitations. such as the graph Laplacian, which requires large computa- See [11] for a summary of existing vertex-based methods. tional cost [O()N3 in the general case]. In this section, we also provide an overview of fast and efficient sampling set The exact BL case selection methods. For simplicity, suppose we directly observe the samples, i.e., GI= , and choose a BL signal model in (20). To optimize the Sampling set selection: Deterministic sampling set, we can define an objective function to quan- and random approaches tify reconstruction error in the presence of noise. The sampled A list of representative sampling methods is given in Table 1. signal y ! RM is then yc=+n, where n is an independent One of the first considerations when deciding on a sampling identically distributed additive noise introduced during the scheme is whether it is deterministic or random. Deterministic measurement or sampling process. Using­ the LS recovery approaches [9]–[11], [13], [21]–[23] choose a fixed node subset (13), the reconstructed signal xu is then given by to optimize a predetermined cost function. Since sampling set @@@ selection is, in general, combinatorial and NP-hard, many de- xu ==UUVB TB yUVB UcTB + UUVB TB n. (28) terministic selection methods are greedy, adding one locally optimal node at a time until the sampling budget is exhausted. The LS reconstruction error thus becomes ex:=-u x = @ Though a greedy selection method is suboptimal in general, it UUVB TB n. Many deterministic methods consider an optimi- gives a constant-factor approximation of the original combi- zation objective based on the error covariance matrix: natorial optimization problem if the cost function for the sam- < <<-1 pling set selection is maximization of a submodular function E := E[]ee = UUVB ()TB UUTB VB . (29) [14], [23]. Advantages of deterministic sampling set selection methods include the following: Given (29), one can choose different optimization criteria ■■ The “importance” of individual nodes is computed and based on the optimal design of experiments [27]. 
For example, totally ordered for greedy selection; if the sampling budget the A-optimality criterion minimizes the average errors by changes, one can add or remove nodes easily without seeking T, which minimizes the trace of the matrix inverse rerunning the entire selection algorithm. [9], [11], [14]: ■■ The selected node subset remains fixed as long as the < -1 graph structure is the same. min Tr ()UUTB TB , (30) TT|| |= M ^h In contrast, random methods [24]–[26] select nodes ran- domly according to a predetermined probability distribution. while E-optimality minimizes the worst-case errors by maxi- Typically, the distribution is designed so that more “important” mizing the smallest eigenvalue of the information matrix < nodes are selected with higher probabilities. One key merit of UTB UTB [9], [11], [14]: random methods is low compu- tational cost. Once the probabili- ty distribution is determined, the Table 1. A comparison of graph spectrum-based sampling set selection methods. selection itself can be realized quickly in a distributed manner. Deterministic/ Localization in Localization in Graph In practice, random sampling Methods Random Kernel Vertex Domain Frequency Domain k Maximizing the cutoff frequency [10] Deterministic m k ! Z+ ü may perform well on average, ^ h but it often requires more sam- Error covariance [9], [14] Deterministic Ideal ü ples than deterministic methods Approximate supermodularity [25] Deterministic Ideal ü to achieve the same reconstruc- Localized operator [11] Deterministic Arbitrary ü ü ) tion quality, even if the signal is Neumann series [24] Deterministic Ideal ü ü Gershgorin disk alignment [13] Deterministic m ü BL [25]. One may also combine Cumulative coherence [27] Random Ideal ü) ü deterministic and random selec- Global/local uncertainty [28] Random Arbitrary ü ü tion methods in finding a sam- pling set. *Localized in the vertex domain only if the ideal kernel is approximated by a polynomial.

IEEE SIGNAL PROCESSING MAGAZINE | November 2020 | 23

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. < max mmin UUTB TB . (31) randomly without taking into account the underlying graph TT= M ^h [9], [24], [25], which results in very low computational cost. In either case, sampling set selection based on the error covari- However, theoretical results based on studies on compressed ance matrix (29) requires (partial) singular value decomposi- sensing [25] have shown that the number of required nodes for tion (SVD) of an MM# matrix, even when the GFT matrix recovery of BL graph signals tends to be larger than for graph- U is given a priori. This results in a large computational cost. dependent deterministic selections [25]. To alleviate this burden, greedy sampling without performing Second, graph-dependent random selection methods [24]– SVD has been recently proposed. This category includes meth- [26] assume that node importance varies according to the ods using spectral proxies, which approximately maximize underlying graph, e.g., important nodes are connected to many cutoff frequency [10], a graph filter submatrix that avoids SVD other nodes with large edge weights. In these approaches, a by utilizing a fast GFT and block matrix inversion [22], and a sampling probability distribution p ! RN, where pi[]$ 0 for polynomial filtering-based approach that maximizes a vertex all ii(,=-01f,)N and Ri pi[]= 1, is first obtained prior domain support of graph spectral filters [11]. to running a random selection algorithm. They assume G is given a priori. In addition to graph information, we can incor- Smooth signals porate information about observations (samples) to obtain p Instead of a strict BL assumption, one can assume the target [24]. The sampling set is then randomly chosen based on p. signal x is smooth with respect to the underlying graph, where As an example, in the graph coherence-based random selec- smoothness is measured via an operator V. One can thus tion method for ~-BL graph signals of [25], the sampling distri- < 2 ­reconstruct via a regularization-based optimization in (S5); bution is given as pi[]:= <

N method [28]. The random approach does not choose any nodes }mgi, []nN:(= / gut kk)[iu][k n], (34) in the rightmost cluster in this realization. The entropy-based k = 1 methods select many nodes in small clusters because they which can be viewed as the “impulse response” of a graph fil- tend to favor low-degree nodes. In contrast, the deterministic ter by rewriting (34) in vector form as approach selects nodes that are more uniformly distributed

< across clusters. }dgi, = UUgt ()K i, (35) In the second example, Figure 3(b), we use graph signal where di is an indicator vector for the ith node, i.e., the unit sampling to select pixels in an image. Each graph node corre- impulse. In [11], it has been shown that many proposed cost sponds to a pixel. Sampling set selection is based on maximiz- functions can be interpreted as having the form of (35) for dif- ing the cutoff frequency [10]. We use two variation operators, ferent kernels. the combinatorial and symmetric normalized graph Lapla- cians, leading to very different sampling sets. When using the Random sampling set selection combinatorial Laplacian, the selected pixels tend to be closer Random selection methods can be classified into two cat- to image contours or the image boundary. In contrast, pixels egories. First, graph-independent approaches select nodes selected using the normalized Laplacian are more uniformly

24 IEEE SIGNAL PROCESSING MAGAZINE | November 2020 |

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. Community Graph Deterministic: Localization Operator Combinatorial Graph Laplacian

Random: Gaussian Process: Cumulative Coherence Entropy Normalized Graph Laplacian

(a) (b)

FIGURE 3. A comparison of sampling sets. (a) The sampling sets for a community graph with N = 256, where 10 nodes are selected. The sizes of the clusters are 44,,81,,6326, 4 from the top left cluster in a clockwise direction. At the top are the original graph and sampling set by a deterministic ap- proach [11]. "At the bottom are ,a sampling set of one realization by a random approach [27] and a sampling set by a Gaussian process-based method [30]. GSPBOX (https://epfl-lts2.github.io/gspbox-html/) is used for obtaining the cluster graph. (b) The sampling sets for image pixels. Each pixel is a node, and edge weights are chosen based on the method in https://github.com/STAC-USC/NNK_Image_graph. A sampling set selection based on maximizing the cutoff frequency [10] is used. The enlarged portions are shown for better visualization. The sampling sets use the (top) combinatorial and (bottom) normalized graph Laplacians, respectively.

distributed within the image. See also “To Nor- Table 2. Computational complexities of GSP-based deterministic sampling set selection. malize or Not to Normalize” for a comparison of variation operators. Method Preparation Selection Maximizing the cutoff frequency [10] Ok( E MT()k ) ON( M) 3 4 Computational complexity Error covariance: E-optimal [9] OM(( E + CM ))Te ON( M ) Selecting a sampling set can be divided into two Error covariance: A-optimal [9], [14] ON( M4) phases: 1) preparation, which includes computing Error covariance: T-optimal [14] ON( M) 3 required prior information, e.g., the eigendecom- Error covariance: D-optimal [14] OM( ) Approximate supermodularity [25] ON( M2) position of the graph operator, and 2) selection, Localized operator [11] OP(( E ++N ))J OM(J ) the main routine that selects nodes for sampling, Neumann series [24] O()NN22log ON( M3) e.g., calculating the cost function for candidate Gershgorin disk alignment [13] OJ((log2 1/)h ) OM((J log2 1/)h ) nodes in each iteration. Computational complexi- ties of different methods are summarized next. We assume MK= for simplicity. Parameters: Tk(): the average number of iterations required for conver- gence of a single eigenpair, where k is a tradeoff factor between performance and complexity; Te: the number of iterations to convergence for the eigenpair computations using a block version of Rayleigh quo- Deterministic selection tient minimization; C : constant; P: the approximation order of the Chebyshev polynomial approximation; Deterministic selection methods, studied in J : the number of nonzero elements in the localization operator; h: the numerical precision to terminate the the context of graph sampling theory, basical- binary search in Gershgorin disk alignment. ly need to calculate eigenpairs of (a part of) the variation operator in the selection phase. Their computational ­computational complexities of these eigendecomposition-free costs mostly depend on the number of edges in the graph and methods compare with previous sampling methods [9], [10] the assumed bandwidth. A recent trend is to investigate ei- that require computation of multiple eigenpairs. gendecomposition-free sampling set selection algorithms [11], [13], [22]. These recent methods approximate a graph spectral Random selection cost function with vertex-domain processing, like polynomi- Random selection methods typically entail a much smaller com- al approximation of the filter kernel. Table 2 shows that the putational cost in the selection phase than their ­deterministic

IEEE SIGNAL PROCESSING MAGAZINE | November 2020 | 25

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. To Normalize or Not to Normalize

Different graph variation operators lead to different sam- 1D real line (i.e., the ­horizontal axis of the figure), and the pling sets, as shown in Figure 3. This difference in behav- GFT bases are shown as a series of 1D signals stacked ior is due to normalization. As an example, consider a vertically (lowest frequency at the bottom and highest at three-cluster graph with N = 27 [S3] and compare the the top), with values only in points on the 1D line corre- combinatorial graph Laplacian and its symmetric normal- sponding to nodes. Nodes in clusters A, B, and C are now ized version, with sampling set selection based on maxi- grouped in the left, middle, and right regions, respectively, mizing the cutoff frequency [10], as seen in Figure S3. on the 1D line. The vertex-domain expressions in Figure S3(a) represent The sampling order is represented with red circles: the colors as the node selection orders (the first-chosen nodes first selected node is shown with a red circle in the lowest- are blue, while the last-chosen ones are red). Observe frequency basis (bottom signal), while the last one chosen that, for the combinatorial Laplacian, most nodes in cluster appears in the highest-frequency GFT basis (top signal). A are selected in the last stage, while for the normalized Since high-frequency eigenvectors of the combinatorial graph Laplacian, the nodes in cluster A (and the other clus- graph Laplacian are highly localized in cluster A (see the ters) are selected at all stages. This is due to the localiza- light blue circle), the method in [10] more likely selects tion of the graph Fourier transform (GFT) bases: nodes in cluster A in its last stage. In contrast, for the nor- eigenvectors of the combinatorial graph Laplacian are malized version, the ­eigenvectors are less localized, so localized in the vertex domain compared to the normal- selected nodes are more balanced among clusters. ized one. This is illustrated by the spectral representations Reference in Figure S3(b), using the visualization technique in [S3]. [S3] B. Girault and A. Ortega, “What’s in a frequency: New tools for graph In this visualization, graph nodes are embedded into the Fourier transform visualization,” 2019, a r X i v :19 0 3 . 0 8 8 2 7.

Combinatorial Symmetric Normalized Graph Laplacian Graph Laplacian Cluster B Cluster C Last Selection Order

Cluster A First

(a)

ABCABC Cluster 3 L L i i

i 1.5 i 20 m 20 m Last 2 1

10 10 1 0.5 Graph Frequency Graph Frequency Selection Order Graph Frequency Index Graph Frequency Index

0 0 0 First 0 0 0.2 0.4 0 0.2 0.4 Selected Vertex Embedding Vertex Embedding Nodes (b)

FIGURE S3. The selection orders for different variation operators and oscillations of GFT bases, shown in (a) the vertex domain and (b) the graph frequency domain, along with oscillations of the GFT bases. GFT: graph Fourier transform.

26 IEEE SIGNAL PROCESSING MAGAZINE | November 2020 |

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. counterparts. As discussed, given a sampling probability dis- where Nr and Nc are often very large. One well-known­ tribution p, all sampled nodes can be chosen quickly and in example is the Netflix challenge (https://en.wikipedia.org/ parallel using p. Hence, only the preparation phase needs wiki/Netflix_Prize): to recommend movies to viewers, miss- to be considered. For graph-independent random selection, ing movie ratings in a large matrix, with viewers and movies pi[]= 1 N for all i, and its computational cost is negligible. as rows and columns, respectively, are estimated based on a Many graph-dependent approaches require repeated calcula- small subset of available viewer ratings. As an ill-posed prob- < N tions of UUgt ()K v, where v ! R is a random vector or di . lem, signal priors are required for regularization. One popular While the naive implementation still requires eigendecompo- prior is the low-rank prior: target matrix signal X should be of sition, the graph filter response gt ()m is often approximated by low dimensionality and, thus, low rank. However, rank()X is

a polynomial: the preparation phase requires iterative vertex nonconvex, and convexifying it to the nuclear norm X ) (the domain processing. Typically, p is estimated after L filtering sum of singular values) still requires computing the SVD per operations of random vectors (typically LN= 2 log [25]), iteration in a proximal gradient method, which is expensive. which leads to OELP + N complexity [25]. The underlying assumption of a low-rank prior is that the ^ ^ hh items along the rows and columns are similar. One can thus Applications alternatively model these pairwise similarity relations using Graph signal sampling has been used across a wide range of two graphs [30], [31]. Specifically, columns of X are assumed applications, such as wireless communications, data mining, to be smooth with respect to an undirected weighted row graph and 3D imaging. We select a few interesting applications for GVrr= ,,ErrW with vertices Vr = 1,,f m and edges ^h " , in-depth discussion in this section. EVrr3 # Vr . Weight matrix Wr specifies pairwise simi- larities among vertices in Gr . The combinatorial graph Lapla- Sensor placement cian matrix of Gr is LDrr=-Wr, where the degree matrix

Sensor placement [28], [29] has long been studied in the Dr is diagonal with entries DWr ii = / j r ij. The work in wireless communication community. The basic problem is [31] assumes that all columns66 of @@the matrix signal X are BL to choose a subset of locations from a discrete feasible set to with respect to the graph frequencies defined using Lr . As an place sensors to monitor a physical phenomenon, such as tem- alternative to strict bandlimitedness, [30] assumes that the col- perature or radiation, over a large geographical area of interest. umns of X are smooth with respect to Lr, resulting in a small R Commonly, the field signal is assumed to be represented by Tr XLr X . ^h a low-dimensional parameter vector with a measurement ma- Similarly, one can define a column graph Gcc= V ,,EccW , ^h trix U generated by a Gaussian process. Different criteria have with vertices Vc,,= 1 f,,n edges Ecc3 VV# c, and weight " , been proposed to optimize the corresponding error covariance matrix Wc, for the rows of X. One can thus assume bandlim- matrix, including A-optimality, E-optimality, D-optimality, itedness for the rows of X with respect to the corresponding and frame potential [29]. Laplacian Lc [31] or, simply, that the rows of X are smooth As a concrete example, one formulation is to maximize with respect to Lc [30]. the smallest eigenvalue mmin of the inverse error covariance Given a sampling set X = ij,,iN!!11ff,,r j ,, "^h "", matrix (information matrix) via selection of a sensor subset Nc , denote by AX the sampling matrix: T, with T = M: ,, 1, if ij,;! X < AX = ^h (37) max mmin UUTV TV , (36) ij 0, otherwise. TT= M ^h 6 @ '

where UTV is a submatrix of U with selected rows indicated We can now formulate the matrix completion problem with by set T, and maximization leads to E-optimality [27] as men- double graph Laplacian regularization as follows [32]: tioned in the “Deterministic Sampling Set Selection” section. 1 2 a < If the measurement matrix U is the matrix UM contain- min f XA=-X % XY + Tr XLr X X ^^hh2 F 2 ^ h ing the first M eigenvectors of a graph Laplacian matrix L, b < then we can interpret (36) as a graph signal sampling problem + Tr XLc X , (38) 2 under an M-BL assumption. Sampling set selection methods ^ h described in the “Sampling Set Selection and Efficient Com- where a and b are weight parameters. To solve the uncon- putation Methods” section can thus be used to solve (36). strained quasi-peak problem (38), one can take the derivative Specifically, recent fast graph sampling schemes [11] have with respect to X, set it to zero, and solve for X, resulting in a been used for sensor selection with improved execution speed system of linear equations for the unknown vectorized vec X) : ^h and reconstruction quality compared to Gaussian process- ) based methods. AIu XX++abnr77LLcmIXvecv= ec AY% , (39) ^ h ^ h ^ h u Sampling for matrix completion where AAXX= diag vec , vec $ , means a vector form of ^h Matrix completion is the problem of filling or interpolating miss- a matrix by stacking^^ its columns,hh and diag $ creates a di- ^h ing values in a partially observable matrix signal X ! RNNrc# , agonal matrix with the input vector as its diagonal elements.

IEEE SIGNAL PROCESSING MAGAZINE | November 2020 | 27

Authorized licensed use limited to: Weizmann Institute of Science. Downloaded on November 01,2020 at 09:14:01 UTC from IEEE Xplore. Restrictions apply. A solution to (39) can be efficiently found, e.g., by using the can perform PC subsampling: selection of a representative conjugate gradient method. 3D point subset such that the salient geometric features of the There are practical scenarios where the available observed original PC are well preserved. entries AYX % in a matrix X are not provided a priori but must Previous works in PC subsampling either employ a random be actively sampled first. This problem of how to best choose or regular sampling approach that does not preserve shape matrix entries for later completion given a sampling budget is characteristics proactively [33], [34] or maintain only obvious called active matrix completion—a popular research topic in features like sharp corners and edges [36]. Instead, leveraging the machine learning community. Extending sampling algo- on a recent fast eigendecomposition-free graph sampling algo- rithms for signals in single graphs, as discussed in previous sec- rithm [13], [35], PC sampling is performed that preserves the tions, the authors in [31] and [32] propose sampling algorithms overall shape of the original PC in a worst-case reconstruction to select matrix entries, assuming that the target signal X is BL sense. After connecting 3D points in a PC into a k-nearest- or smooth over both row and column graphs, respectively. neighbor graph, a postprocessing procedure is first assumed In a nutshell, the approach in [31] first selects rows and col- to superresolve a subsampled PC to full resolution based on umns separately based on BL assumptions on row and column a variant of a graph-based regularization, similar in form to graphs and then chooses matrix entries that are indexed by the (32), that expects the sought signal to be smooth with respect to selected rows and columns. In contrast, [32] greedily selects the constructed graph. Like (32), the superresolution procedure one matrix entry at a time by considering the row and column amounts to solving a system of linear equations, where the graph smoothness alternately, where each greedy selection coefficient matrix B is a function of the PC sampling matrix seeks to maximize the smallest eigenvalue of the coefficient H. They then derive a sampling objective that maximizes the matrix in (39)—the E-optimality criterion. In Figure 4, we see smallest eigenvalue mmin B of B—the E-optimality criteri- an example of a low-rank matrix and sampling performance on—through the selection^h of H. In Figure 5, one can observe (in root-mean-square error) of [31] (BL) under different band- that PCs subsampled via graph sampling [35] (denoted by width assumptions for row and column graphs and [32] itera- “GS”) outperformed other schemes [33], [34] in reconstruction tive Gershgorin circle shift (IGCS). We see that BL and IGCS quality after subsequent PC superresolution. perform comparably for a large sample budget, but BL is sensi- tive to the assumed row and column graph bandwidths. Closing remarks In this article, we overview sampling on graphs from theory to Subsampling of 3D point cloud applications. The graph sampling framework is similar to sam- A point cloud (PC)—a collection of discrete geometric samples pling for standard signals; however, its realization is completely­ (i.e., 3D coordinates) of a physical object in 3D space—is a pop- different due to the irregular nature of the graph domain. 
Subsampling of 3D point cloud
A point cloud (PC)—a collection of discrete geometric samples (i.e., 3D coordinates) of a physical object in 3D space—is a popular visual signal representation for 3D imaging applications, such as virtual reality. PCs can be very large, with millions of points, making subsequent image processing tasks, such as viewpoint rendering or object detection/recognition, very computation intensive. To lighten this computation load, one can perform PC subsampling: the selection of a representative 3D point subset such that the salient geometric features of the original PC are well preserved.

Previous works on PC subsampling either employ a random or regular sampling approach that does not proactively preserve shape characteristics [33], [34] or maintain only obvious features, such as sharp corners and edges [36]. Instead, leveraging a recent fast eigendecomposition-free graph sampling algorithm [13], [35], PC sampling is performed so that the overall shape of the original PC is preserved in a worst-case reconstruction sense. After connecting the 3D points of a PC into a k-nearest-neighbor graph, a postprocessing procedure is first assumed to superresolve a subsampled PC to full resolution based on a variant of a graph-based regularization, similar in form to (32), that expects the sought signal to be smooth with respect to the constructed graph. Like (32), the superresolution procedure amounts to solving a system of linear equations, where the coefficient matrix B is a function of the PC sampling matrix H (a code sketch of this step is given after Figure 5). A sampling objective is then derived that maximizes the smallest eigenvalue λmin(B) of B—the E-optimality criterion—through the selection of H. In Figure 5, one can observe that PCs subsampled via graph sampling [35] (denoted by "GS") outperformed the other schemes [33], [34] in reconstruction quality after subsequent PC superresolution.

[Figure 4(b) plots root-mean-square error versus sample size for random sampling, BL with bandwidth assumptions (11, 9) and (9, 11), and IGCS.]
FIGURE 4. A performance comparison of the graph sampling algorithms (random, [31], [32]) for matrix completion: (a) an example of a low-rank matrix and (b) root-mean-square error comparison.



FIGURE 5. The reconstruction results from different 3D PC subsampling methods—RS [33], MESS [34], and GS [36]—for (a) bunny and (b) armadillo models. The surface is colorized by the distance from the ground-truth surface: green/yellow means smaller errors, and blue/red means larger errors. RS: random sampling; MESS: mesh element subsampling; GS: graph sampling.
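To complement Figure 5, the following Python sketch illustrates the superresolution step described above under the assumption that the coefficient matrix takes the common graph-regularized least-squares form B = H^T H + mu L, with L the Laplacian of a k-nearest-neighbor graph with Gaussian edge weights. The graph construction, the random subsampling, and all function names are illustrative assumptions rather than the exact pipeline of [13], [35], [36], which instead choose the samples so as to maximize λmin(B) via Gershgorin disc analysis; the linear system is solved with conjugate gradients, as suggested for (39), and B is positive definite provided the graph is connected and at least one point is sampled.

import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import cg
from scipy.spatial import cKDTree

def knn_laplacian(points, k=8):
    """Combinatorial Laplacian L = D - W of a k-nearest-neighbor graph over
    3D points, with Gaussian edge weights exp(-d^2 / sigma^2)."""
    tree = cKDTree(points)
    dist, idx = tree.query(points, k=k + 1)   # the nearest neighbor is the point itself
    sigma = np.mean(dist[:, 1:]) + 1e-12      # heuristic kernel bandwidth
    n = len(points)
    rows = np.repeat(np.arange(n), k)
    cols = idx[:, 1:].ravel()
    vals = np.exp(-(dist[:, 1:].ravel() ** 2) / sigma ** 2)
    W = csr_matrix((vals, (rows, cols)), shape=(n, n))
    W = W.maximum(W.T)                        # symmetrize
    return diags(np.asarray(W.sum(axis=1)).ravel()) - W

def superresolve(L, sampled_idx, sampled_coords, mu=0.1):
    """Graph-regularized least-squares upsampling of a subsampled PC:
    solve (H^T H + mu L) x = H^T y per coordinate with conjugate gradients."""
    n = L.shape[0]
    h = np.zeros(n)
    h[sampled_idx] = 1.0                      # diagonal of H^T H
    B = diags(h) + mu * L                     # coefficient matrix B(H)
    recon = np.zeros((n, 3))
    for c in range(3):                        # x, y, and z coordinates
        rhs = np.zeros(n)
        rhs[sampled_idx] = sampled_coords[:, c]
        recon[:, c], _ = cg(B, rhs)
    return recon

# Toy usage: keep 20% of a synthetic PC at random and upsample it back.
pts = np.random.rand(500, 3)
L = knn_laplacian(pts)
keep = np.random.choice(500, size=100, replace=False)
pts_hat = superresolve(L, keep, pts[keep])

The choice of the retained index set (here random) is exactly where the graph sampling methods compared in Figure 5 differ.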

Closing remarks
In this article, we overview sampling on graphs from theory to applications. The graph sampling framework is similar to sampling for standard signals; however, its realization is completely different due to the irregular nature of the graph domain. Current methods have found several interesting applications. At the same time, the following issues, covering both theoretical and practical aspects, are still open:
■ Interconnection between vertex and spectral representations of sampling: As shown in the "Sampling Methods" section, two definitions are possible for graph signal sampling. Can these sampling approaches be described in a more unified way beyond a few known special cases? This may lead to a more intuitive understanding of graph signal sampling.
■ Studies beyond BL graph signals: Most studies of graph signal sampling are based on the sampling and reconstruction of BL (or smooth) graph signals. However, as shown in the "Generalized Sampling in Hilbert Spaces" section, sampling methods beyond the BL setting have been studied in standard sampling. Investigating GSP systems beyond the BL assumption will be beneficial for many practical applications, since real data are often not BL. Such examples include generalized graph signal sampling [15] and PGS sampling [8].
■ Fast and efficient deterministic sampling: Eigendecomposition-free methods are a current trend in graph signal sampling, as seen in the "Deterministic Sampling Set Selection" section, but their computational complexities are still high compared to random methods. Furthermore, current deterministic approaches are mostly based on greedy sampling. Faster deterministic graph sampling methods that are tractable for graphs with millions and even billions of nodes are required.
■ Fast and distributed reconstruction: Similar to sampling, the reconstruction step also requires an eigendecomposition-free interpolation algorithm. Such an algorithm is expected to be implemented in a distributed manner. While fast filtering methods have been studied, as briefly introduced in the "Graph-Sampling Theory" section, fast and more accurate interpolation methods for signals on a graph are still required.
■ Applications: Some direct applications of graph signal sampling have been introduced in the "Applications" section. Note that sampling itself is ubiquitous in signal processing and machine learning: many applications can use graph signal sampling as an important ingredient. For example, graph neural networks and PC processing are potential areas of application because it is often convenient to treat available data as signals on a structured graph. Continued discussions with domain experts in different areas can facilitate applications of graph sampling theory and algorithms to the wider field of data science.

Acknowledgments
We thank Sarath Shekkizhar, Benjamin Girault, Fen Wang, and Chinthaka Dinesh for preparing the figures. This work was supported in part by the Japan Science and Technology Agency Precursory Research for Embryonic Science and Technology under grant JPMJPR1935, the Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research under grants 19K22864 and 20H02145, the European Union's Horizon 2020 Research and Innovation Program under grant 646804-ERC-COG-BNYQ, the U.S. National Science Foundation under grants CCF-1410009 and CCF-1527874, and Natural Sciences and Engineering Research Council grants RGPIN-2019-06271 and RGPAS-2019-00110.

Authors
Yuichi Tanaka ([email protected]) is an associate professor in the Department of and Computer Science, Tokyo University of Agriculture and Technology, Japan. Currently, he has a cross appointment as a PRESTO Researcher with the Japan Science and Technology Agency. He served as an associate editor of IEEE Transactions on Signal Processing from 2016 to 2020. His current research interests are in the field of high-dimensional signal processing and machine learning, which includes graph signal processing, geometric deep learning, sensor networks, image/video processing in extreme situations, biomedical signal processing, and remote sensing. He is a Senior Member of IEEE.
Yonina C. Eldar ([email protected]) is a professor in the Department of Math and at the Weizmann Institute of Science, where she heads the center for Biomedical Engineering and Signal Processing.
She is also a visiting professor at the Massachusetts Institute of Technology and at the Broad Institute and an adjunct professor at , and she was a visiting professor at . She is the editor-in-chief of Foundations and Trends in Signal Processing and serves the IEEE on several technical and award committees. She has received many awards for excellence in research and teaching, including the IEEE Signal Processing Society Technical Achievement Award, IEEE Aerospace and Electronic Systems Society Fred Nathanson Memorial Radar Award, and IEEE Kiyo Tomiyasu Award. She is a member of the Israel Academy of Sciences and Humanities and a fellow of the European Association for Signal Processing. She is a Fellow of IEEE.
Antonio Ortega ([email protected]) is a professor of Electrical and Computer Engineering at the University of Southern California. He is the editor-in-chief of IEEE Transactions on Signal and Information Processing Over Networks and recently served as a member of the board of governors of the IEEE Signal Processing Society. He has received several paper awards, including the 2016 Signal Processing Magazine Award. His research interests include graph signal processing, machine learning, multimedia compression, and wireless sensor networks. He is a fellow of the European Association for Signal Processing and a member of the Association for Computing Machinery and Asia-Pacific Signal and Information Processing Association. He is a Fellow of IEEE.
Gene Cheung ([email protected]) received his Ph.D. degree in electrical engineering and computer science from the University of California, Berkeley, in 2000. He is an associate professor at York University, Toronto. He has served as associate editor for multiple journals, including IEEE Transactions on Multimedia (2007–2011) and IEEE Transactions on Image Processing (2015–2019). He is a coauthor of several papers that received awards, including the Best Student Paper Award at IEEE ICIP 2013; ICIP 2017; and IEEE Image, Video, and Multidimensional Signal Processing Workshop 2016. He is a recipient of the Canadian Natural Sciences and Engineering Research Council Discovery Accelerator Supplement 2019. His research interests include 3D imaging and graph signal processing.

References
[1] Y. C. Eldar, Sampling Theory: Beyond Bandlimited Systems. Cambridge, U.K.: Cambridge Univ. Press, 2015.
[2] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, Oct. 2013. doi: 10.1109/MSP.2012.2235192.
[3] A. Ortega, P. Frossard, J. Kovačević, J. M. F. Moura, and P. Vandergheynst, "Graph signal processing: Overview, challenges, and applications," Proc. IEEE, vol. 106, no. 5, pp. 808–828, May 2018. doi: 10.1109/JPROC.2018.2820126.
[4] A. Sandryhaila and J. M. F. Moura, "Discrete signal processing on graphs," IEEE Trans. Signal Process., vol. 61, no. 7, pp. 1644–1656, Apr. 2013. doi: 10.1109/TSP.2013.2238935.
[5] G. Cheung, E. Magli, Y. Tanaka, and M. Ng, "Graph spectral image processing," Proc. IEEE, vol. 106, no. 5, pp. 907–930, May 2018. doi: 10.1109/JPROC.2018.2799702.
[6] I. Pesenson, "A sampling theorem on homogeneous manifolds," Trans. Amer. Math. Soc., vol. 352, no. 9, pp. 4257–4269, 2000. doi: 10.1090/S0002-9947-00-02592-7.
[7] I. Pesenson, "Sampling in Paley–Wiener spaces on combinatorial graphs," Trans. Amer. Math. Soc., vol. 360, no. 10, pp. 5603–5627, 2008. doi: 10.1090/S0002-9947-08-04511-X.
[8] Y. Tanaka and Y. C. Eldar, "Generalized sampling on graphs with subspace and smoothness priors," IEEE Trans. Signal Process., vol. 68, pp. 2272–2286, Mar. 2020. doi: 10.1109/TSP.2020.2982325.
[9] S. Chen, R. Varma, A. Sandryhaila, and J. Kovačević, "Discrete signal processing on graphs: Sampling theory," IEEE Trans. Signal Process., vol. 63, no. 24, pp. 6510–6523, Dec. 2015. doi: 10.1109/TSP.2015.2469645.
[10] A. Anis, A. Gadde, and A. Ortega, "Efficient sampling set selection for bandlimited graph signals using graph spectral proxies," IEEE Trans. Signal Process., vol. 64, no. 14, pp. 3775–3789, July 2016. doi: 10.1109/TSP.2016.2546233.
[11] A. Sakiyama, Y. Tanaka, T. Tanaka, and A. Ortega, "Eigendecomposition-free sampling set selection for graph signals," IEEE Trans. Signal Process., vol. 67, no. 10, pp. 2679–2692, May 2019. doi: 10.1109/TSP.2019.2908129.
[12] Y. Tanaka, "Spectral domain sampling of graph signals," IEEE Trans. Signal Process., vol. 66, no. 14, pp. 3752–3767, July 2018. doi: 10.1109/TSP.2018.2839620.
[13] Y. Bai, F. Wang, G. Cheung, Y. Nakatsukasa, and W. Gao, "Fast graph sampling set selection using Gershgorin disc alignment," IEEE Trans. Signal Process., vol. 68, pp. 2419–2434, 2020. doi: 10.1109/TSP.2020.2981202.
[14] P. D. Lorenzo, S. Barbarossa, and P. Banelli, "Sampling and recovery of graph signals," in Cooperative and Graph Signal Processing, P. M. Djurić and C. Richard, Eds. Amsterdam, The Netherlands: Elsevier, 2018, pp. 261–282.
[15] S. P. Chepuri, Y. C. Eldar, and G. Leus, "Graph sampling with and without input priors," in Proc. IEEE Int. Conf. Acoust., Speech and Signal Processing (ICASSP), 2018, pp. 4564–4568.
[16] F. R. K. Chung, Spectral Graph Theory (CBMS Regional Conference Series in Mathematics, No. 92). American Mathematical Society, 1997.
[17] A. M. Duval and V. Reiner, "Perron–Frobenius type results and discrete versions of nodal domain theorems," Linear Algebra Appl., vol. 294, no. 1–3, pp. 259–268, 1999. doi: 10.1016/S0024-3795(99)00090-7.
[18] S. Chen, R. Varma, A. Singh, and J. Kovačević, "Representations of piecewise smooth signals on graphs," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 6370–6374. doi: 10.1109/ICASSP.2016.7472903.
[19] A. Sakiyama, K. Watanabe, Y. Tanaka, and A. Ortega, "Two-channel critically-sampled graph filter banks with spectral domain sampling," IEEE Trans. Signal Process., vol. 67, no. 6, pp. 1447–1460, Mar. 2019. doi: 10.1109/TSP.2019.2892033.
[20] Y. C. Eldar and G. Kutyniok, Compressed Sensing: Theory and Applications. Cambridge, U.K.: Cambridge Univ. Press, 2012.
[21] A. Gadde, A. Anis, and A. Ortega, "Active semi-supervised learning using sampling theory for graph signals," in Proc. 20th ACM SIGKDD Int. Conf. Knowledge Discovery Data Mining, 2014, pp. 492–501. doi: 10.1145/2623330.2623760.
[22] F. Wang, G. Cheung, and Y. Wang, "Low-complexity graph sampling with noise and signal reconstruction via Neumann series," IEEE Trans. Signal Process., vol. 67, no. 21, pp. 5511–5526, 2019. doi: 10.1109/TSP.2019.2940129.
[23] L. F. O. Chamon and A. Ribeiro, "Greedy sampling of graph signals," IEEE Trans. Signal Process., vol. 66, no. 1, pp. 34–47, 2017. doi: 10.1109/TSP.2017.2755586.
[24] S. Chen, R. Varma, A. Singh, and J. Kovačević, "Signal recovery on graphs: Fundamental limits of sampling strategies," IEEE Trans. Signal Inf. Process. Netw., vol. 2, no. 4, pp. 539–554, 2016. doi: 10.1109/TSIPN.2016.2614903.
[25] G. Puy, N. Tremblay, R. Gribonval, and P. Vandergheynst, "Random sampling of bandlimited signals on graphs," Appl. Comput. Harmon. Anal., vol. 44, no. 2, pp. 446–475, Mar. 2018. doi: 10.1016/j.acha.2016.05.005.
[26] N. Perraudin, B. Ricaud, D. I. Shuman, and P. Vandergheynst, "Global and local uncertainty principles for signals on graphs," APSIPA Trans. Signal Inform. Process., vol. 7, p. e3, Apr. 2018. doi: 10.1017/ATSIP.2018.2.
[27] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2009.
[28] A. Krause, A. Singh, and C. Guestrin, "Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies," J. Mach. Learn. Res., vol. 9, pp. 235–284, Feb. 2008.
[29] J. Ranieri, A. Chebira, and M. Vetterli, "Near-optimal sensor placement for linear inverse problems," IEEE Trans. Signal Process., vol. 62, no. 5, pp. 1135–1146, 2014. doi: 10.1109/TSP.2014.2299518.
[30] V. Kalofolias, X. Bresson, M. Bronstein, and P. Vandergheynst, "Matrix completion on graphs," 2014, arXiv:1408.1717.
[31] G. Ortiz-Jiménez, M. Coutino, S. P. Chepuri, and G. Leus, "Sparse sampling for inverse problems with tensors," IEEE Trans. Signal Process., vol. 67, no. 12, pp. 3272–3286, 2019. doi: 10.1109/TSP.2019.2914879.
[32] F. Wang, Y. Wang, G. Cheung, and C. Yang, "Graph sampling for matrix completion using recurrent Gershgorin disc shift," IEEE Trans. Signal Process., vol. 68, pp. 2814–2829, Apr. 2020. doi: 10.1109/TSP.2020.2988784.
[33] F. Pomerleau, F. Colas, R. Siegwart, and S. Magnenat, "Comparing ICP variants on real-world data sets," Auton. Robots, vol. 34, no. 3, pp. 133–148, 2013. doi: 10.1007/s10514-013-9327-2.
[34] P. Cignoni, M. Callieri, M. Corsini, M. Dellepiane, F. Ganovelli, and G. Ranzuglia, "MeshLab: An open-source mesh processing tool," in Proc. Eurographics Italian Chapter Conf., 2008, pp. 129–136.
[35] S. Chen, D. Tian, C. Feng, A. Vetro, and J. Kovačević, "Fast resampling of three-dimensional point clouds via graphs," IEEE Trans. Signal Process., vol. 66, no. 3, pp. 666–681, Feb. 2018. doi: 10.1109/TSP.2017.2771730.
[36] C. Dinesh, G. Cheung, F. Wang, and I. Bajić, "Sampling of 3D point cloud via Gershgorin disc alignment," in Proc. IEEE Conf. Image Processing, to be published.

