
arXiv:1601.05972v3 [math.SP] 1 Jul 2017

On the Graph Fourier Transform for Directed Graphs

Stefania Sardellitti, Member, IEEE, Sergio Barbarossa, Fellow, IEEE, and Paolo Di Lorenzo, Member, IEEE

S. Sardellitti and S. Barbarossa are with the DIET Dept., Sapienza University of Rome, Via Eudossiana 18, 00184 Rome, Italy (e-mail: [email protected], [email protected]). P. Di Lorenzo is with the Dept. of Engineering, University of Perugia, Via G. Duranti 93, 06125 Perugia, Italy (e-mail: [email protected]). This work has been supported by the TROPIC Project, Nr. ICT-318784. The work of P. Di Lorenzo was funded by the "Fondazione Cassa di Risparmio di Perugia". A Matlab code to implement the algorithms proposed in this paper is available at https://sites.google.com/site/stefaniasardellitti/code-supplement.

Abstract—The analysis of signals defined over a graph is relevant in many applications, such as social and biological networks, big data, or smart grids, and a key tool for analyzing these signals is the so-called Graph Fourier Transform (GFT). Alternative definitions of the GFT have been suggested in the literature, based on the eigen-decomposition of either the graph Laplacian or the adjacency matrix. In this paper, we address the general case of directed graphs and we propose an alternative approach that builds the graph Fourier basis as the set of orthonormal vectors that minimize a continuous extension of the graph cut size, known as the Lovász extension. To cope with the non-convexity of the problem, we propose two alternative iterative optimization methods, properly devised for handling the orthogonality constraints. Finally, we extend the method to minimize a continuous relaxation of the balanced cut size. The resulting problem is again non-convex and we propose an efficient solution method based on an explicit-implicit gradient algorithm.

Index Terms—Graph signal processing, Graph Fourier Transform, total variation, clustering.

I. INTRODUCTION

Graph signal processing (GSP) has attracted a lot of interest in the last years because of its many potential applications, from social and economic networks to gene regulatory networks, smart grids, and so on. GSP represents a promising tool for the processing and analysis of discrete signals defined on the vertices of a (possibly weighted) graph. Many recent works in the literature attempt to extend classical discrete-time signal processing (DSP) theory from signals defined over time, or images, to signals defined over the vertices of a graph [1]–[3], by introducing the basic concepts of graph-based filtering [7], graph-based transforms [8]–[12], and sampling and uncertainty principles. A central role in GSP is played by spectral analysis, which is based on the introduction of the so-called Graph Fourier Transform (GFT). Alternative definitions of the GFT have been introduced, coming from different motivations, like building a basis with minimal variation, filtering, etc.; see, e.g., [4], [5], [8], [13], [14]. Two basic approaches have been suggested. The first one is rooted in spectral graph theory and uses the graph Laplacian as the reference unit; see, e.g., [5] and the references therein. This approach applies to undirected graphs: the Fourier basis is constituted by the eigenvectors of the graph Laplacian, which represent the basis that minimizes the ℓ2-norm graph total variation. The approach is well motivated on undirected graphs, where minimizing the ℓ2-norm total variation is equivalent to minimizing the quadratic form built on the Laplacian matrix, so that an orthonormal basis is guaranteed to exist. However, these properties do not hold anymore in the directed graph case. An alternative approach, valid for the more general and challenging case of directed graphs, was proposed in [1], [4]. That method builds the GFT basis on the Jordan decomposition of the adjacency matrix and uses the associated generalized eigenvectors as the basis. This second method is rooted in the association of the adjacency matrix with the shift operator, which is at the basis of the algebraic framework for shift-invariant linear filtering of graph signals [15], [16], and it paved the way to the GFT definition proposed in [4]. However, the framework proposed in [4] raises some important issues requiring further investigation. First, the basis vectors are in general linearly independent, but not orthogonal, so that the resulting transform is not unitary and does not preserve scalar products. Second, the total variation introduced in [4] does not respect some desirable properties; for example, a constant signal does not have zero total variation [17], [18]. Finally, it is well known that the numerical computation of the Jordan decomposition often incurs numerical instabilities, even for moderate-size matrices [19], although alternative decomposition methods have recently been suggested to tackle this instability [20].

In some applications, one of the major motivations for using the GFT is the analysis of signals defined over graphs exhibiting clustering properties, i.e. signals that are smooth within subsets of highly interconnected nodes (clusters), while they can vary arbitrarily across different clusters. In such cases, the GFT of these signals is typically sparse and its sparsity carries relevant information on the data under analysis. These signals are, in the graph setting, the analogy of band-limited time signals. Within the machine learning context, GSP can play a key role in unsupervised and semi-supervised learning, as suggested in [21], [22]. In these applications, the input is a point cloud and the goal is to detect clusters, with limited or without supervision. Graph-based methods tackle these problems by associating a graph to the point cloud: the vertices are the points themselves, whereas edges between pairs of points are established if two points are sufficiently close. The goal of clustering/classification is to associate a different label to each cluster. If we look at the labels as a signal defined over the points (vertices), this signal is band-limited by construction [21], [22].

In this paper, we propose a novel alternative approach to build the GFT basis for the general case of directed graphs. Rather than starting from the decomposition of one of the graph matrix descriptors, either adjacency or Laplacian, we start by identifying an objective function to be minimized, and then we build an orthogonal matrix that minimizes that objective function. More specifically, we choose as objective function the graph cut size, as its minimization leads to identifying clusters. We consider the general case of directed graphs, which subsumes undirected graphs as a particular case. The cut function is a set function and its minimization is NP-hard; however, exploiting the sub-modularity property of the cut size, it has been shown that there exists a lossless convex relaxation of the cut size, named its Lovász extension [23], [24], whose minimization preserves the optimality of the solution of the original non-convex problem. Interestingly, the Lovász extension of the cut size gives rise to an alternative definition of total variation of a graph signal that captures the edges' directivity. Furthermore, in the case of undirected graphs, the Lovász extension reduces to the ℓ1-norm total variation of a graph signal, which represents the discrete counterpart of the total variation of continuous-time signals, a quantity playing a fundamental role in the continuous-time Fourier Transform; see, e.g., [17], [13]. We define the GFT basis as the set of orthonormal vectors that minimize the Lovász extension of the cut size. Unfortunately, even though the objective function is convex, the resulting problem is non-convex because of the orthogonality constraint imposed on the basis vectors. Thus, to find a (possibly local) solution of the problem in an efficient manner, we exploit two recently developed methods that are specifically tailored to handle non-convex orthogonality constraints, namely, the splitting orthogonality constraints (SOC) method [25], and the proximal alternating minimized augmented Lagrangian (PAMAL) method [26]. The SOC method is quite simple to implement and, even if no convergence proof has been provided yet, extensive numerical results validate the effectiveness and robustness of such a strategy. Conversely, the PAMAL algorithm, which hybridizes the augmented Lagrangian method and the proximal minimization scheme, is known to guarantee convergence. Furthermore, any limit point of each sequence generated by the PAMAL method satisfies the Karush-Kuhn-Tucker conditions of the original non-convex problem [26]. Finally, to prevent the resulting basis vectors from being excessively sparse, we consider the minimization of a continuous relaxation of the balanced cut size. To solve the corresponding non-convex fractional problem, we adopt an efficient and convergent algorithm based on the explicit-implicit gradient method [27].

The paper is organized as follows. Sec. II introduces the graph signal variations as the continuous Lovász extension of the min-cut size. In Sec. III, we define the GFT as the set of optimal orthonormal vectors minimizing the graph signal variation, and in Sec. IV we illustrate the optimization methods used for solving the resulting non-convex problem. Then, in Sec. V we conceive the GFT as the solution of a balanced min-cut problem, while Sec. VI illustrates some numerical examples validating the effectiveness of the proposed approaches. Finally, Sec. VII draws some conclusions.

II. MIN-CUT SIZE AND ITS LOVÁSZ EXTENSION

In this section, we recall the definitions of cut size and Lovász extension, as they will form the basic tools for our definition of the GFT. We consider a graph G = {V, E} consisting of a set of N vertices (or nodes) V = {1, ..., N} along with a set of edges E = {a_ij}, i, j ∈ V, such that a_ij > 0 if there is a direct link from node j to node i, or a_ij = 0 otherwise. We denote by |V| the cardinality of V, i.e. the number of elements of V. A signal s on a graph G is defined as a mapping from the vertex set to a real vector of size N = |V|, i.e. s : V → R. Let A denote the N × N adjacency matrix with entries given by the edge weights a_ij for i, j = 1, ..., N. The graph Laplacian is defined as L := D − A, where the in-degree matrix D is a diagonal matrix whose ith diagonal entry is d_i = Σ_j a_ij.

One of the basic operations over graphs is clustering, i.e. the partition of the graph into disjoint subgraphs, such that the vertices within each subgraph (cluster) are highly interconnected, whereas there are only a few links between different clusters. Finding a good partition can be formulated as the minimization of the cut size [28], whose definition is reported here below. Let us consider a subset of vertices S ⊂ V, and its complement set in V denoted by S̄. The edge boundary of S is defined as the set of edges with one end in S and the other end in S̄. The cut size between S and S̄ is defined as the sum of the weights over the boundary [28], i.e.

cut(S, S̄) := Σ_{i∈S, j∈S̄} a_ji.   (1)

Finding the partition that minimizes the cut size in (1) is an NP-hard problem. To overcome this difficulty, we exploit the sub-modularity property of the cut size [24], which ensures that its Lovász extension is a convex function [24]. We briefly recall some of the main definitions and properties here below.

Given the set V and its power set 2^V, i.e. the set of all its subsets, let us consider a real-valued set function F : 2^V → R. The cut size in (1) is an example of set function, with F(S) := cut(S, S̄). Every element of the power set 2^V may be associated to a vertex of the hyper-cube {0, 1}^N. Namely, a set S ⊆ V can be uniquely identified with the indicator vector 1_S, i.e. the vector which is 1 at entry j if j ∈ S, and 0 otherwise. Then, a set function F can be defined on the vertices of the hyper-cube {0, 1}^N. The Lovász extension of a set function F [23], [24] allows the extension of a set function defined on the vertices of the hyper-cube {0, 1}^N to the full hypercube [0, 1]^N, and hence to the entire space R^N. We recall its definition hereafter.

Definition 1: Let F : 2^V → R be a set function with F(∅) = 0. Let x ∈ R^N be ordered w.l.o.g. in increasing order, such that x_1 ≤ x_2 ≤ ... ≤ x_N. Define C_0 ≜ V and C_i ≜ {j ∈ V : x_j > x_i} for i > 0. Then, the Lovász extension f : R^N → R of F, evaluated at x, is given by:

f(x) = Σ_{i=1}^{N} x_i (F(C_{i−1}) − F(C_i))
     = Σ_{i=1}^{N−1} F(C_i)(x_{i+1} − x_i) + x_1 F(V).   (2)
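Definition 1 can be turned into a short numerical check. The sketch below (our illustration; the toy adjacency matrix is an assumption, with the paper's convention that A[i, j] = a_ij weighs the edge from node j to node i) computes the Lovász extension of the cut-size set function and verifies the interpolation property f(1_S) = F(S):

```python
import numpy as np

def lovasz_extension(F, x):
    """Lovász extension of Definition 1: sort x increasingly and accumulate
    f(x) = sum_i x_(i) * (F(C_{i-1}) - F(C_i)), with C_0 = V and C_i the
    suffix sets of the sorted order."""
    order = np.argsort(x)                  # indices of x in increasing order
    xs = np.asarray(x, dtype=float)[order]
    N = len(xs)
    f = 0.0
    C_prev = set(range(N))                 # C_0 = V
    for i in range(N):
        C_i = set(order[i + 1:])           # vertices ranked strictly above i
        f += xs[i] * (F(C_prev) - F(C_i))
        C_prev = C_i
    return f

# Toy directed graph; A[i, j] = a_ij = weight of the edge from j to i.
A = np.array([[0, 2, 0],
              [0, 0, 1],
              [3, 0, 0]], dtype=float)

def cut(S):
    """Cut size of eq. (1): F(S) = sum_{i in S, j in Sbar} a_ji."""
    Sbar = set(range(3)) - set(S)
    return float(sum(A[j, i] for i in S for j in Sbar))

# Interpolation property: f agrees with F on all indicator vectors.
for S in [set(), {0}, {0, 1}, {1, 2}]:
    ind = np.array([1.0 if j in S else 0.0 for j in range(3)])
    assert np.isclose(lovasz_extension(cut, ind), cut(S))
```

On general (non-indicator) vectors, this same routine evaluates the convex relaxation whose closed form for the cut size is derived in Sec. III.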

Note that f(x) is piecewise affine w.r.t. x, and F(S) = f(1_S) for all S ⊆ V. An interesting class of set functions is given by the submodular set functions, whose definition follows next.

Definition 2: A set function F : 2^V → R is submodular if and only if, ∀ A, B ⊆ V, it satisfies the following inequality:

F(A) + F(B) ≥ F(A ∪ B) + F(A ∩ B).

A fundamental property of a submodular set function is that its Lovász extension is a convex function. This is formally stated in the following proposition [24, p. 23].

Proposition 1: Let F : 2^V → R be a submodular function and f be its Lovász extension. Then, it holds

min_{S⊆V} F(S) = min_{x∈{0,1}^N} f(x) = min_{x∈[0,1]^N} f(x).

Moreover, the set of minimizers of f(x) on [0, 1]^N is the convex hull of the minimizers of f(x) on {0, 1}^N.

The cut size function in (1) is known for being submodular; see, e.g., [24], [29]. More specifically, as shown in [24, p. 54], the cut function is equal to the positive linear combination of the functions G_ij : S ↦ (1_S)_i [1 − (1_S)_j], i.e.

cut(S) = Σ_{i,j∈V} a_ji G_ij(S).

The function G̃_ij is the extension to V of a function G_ij defined only on the power set of {i, j}, where G_ij({i}) = 1 and all other values are zero, so that, from (2), its Lovász extension is G̃_ij(x_i, x_j) = [x_i − x_j]_+ with [y]_+ := max{y, 0}. Therefore, the Lovász extension of the cut size function, in the general case of directed graphs, is given by:

f(x) = Σ_{i,j=1}^{N} a_ji [x_i − x_j]_+ := GDV(x).   (3)

We term this function the Graph Directed Variation (GDV), as it captures the edges' directivity. For undirected graphs, imposing a_ij = a_ji, the Lovász extension of the cut size boils down to

f(x) = Σ_{i,j=1, i>j}^{N} a_ji |x_i − x_j| := GAV(x).   (4)

Interestingly, this function, which we call Graph Absolute Variation (GAV), represents the discrete counterpart of the ℓ1-norm total variation, which plays a key role in the classical analysis of continuous-time signals [17], [13].

III. GRAPH FOURIER BASIS AND DIRECTED TOTAL VARIATION

Alternative definitions of the GFT have been proposed in the literature, depending on the different perspectives used to emphasize specific signal features. In the case of undirected graphs, the GFT of a vector s was defined as [5]

ŝ = U^T s,   (5)

where the columns of the matrix U are the eigenvectors of the Laplacian L, i.e. L = U Λ U^T. This definition is basically rooted in the clustering properties of these eigenvectors; see, e.g., [30]. In fact, by definition of eigenvector, the Fourier basis used in (5) can be thought of as the solution of the following sequence of optimization problems:

u_k = arg min_{u_k∈R^N} u_k^T L u_k := arg min_{u_k∈R^N} GQV(u_k)
s.t. u_k^T u_ℓ = δ_kℓ, ℓ = 1, ..., k,   (6)

for k = 2, ..., N, where δ_kℓ is the Kronecker delta, and we used the property that, for undirected graphs, the quadratic form built on the Laplacian is the ℓ2-norm, or graph quadratic variation (GQV), i.e.

GQV(x) := Σ_{i,j=1, j>i}^{N} a_ji (x_i − x_j)^2.

Thus, the Fourier basis obtained from (6) coincides with the set of orthonormal vectors that minimize the ℓ2-norm total variation. In all applications where the graph signals exhibit a cluster behavior, meaning that the signal is relatively smooth within each cluster, whereas it can vary arbitrarily from cluster to cluster, the GFT defined as in (5) helps emphasizing the presence of clusters [30]. However, the identification of the Laplacian eigenvectors as the orthonormal vectors that minimize the GQV is only valid for undirected graphs, for which the quadratic form built on the Laplacian reduces to the GQV. For directed graphs, the quadratic form in (6) captures only properties associated to the symmetrized Laplacian (i.e., L_s = (L + L^T)/2), and hence it cannot capture the edges' directivity. The generalization to directed graphs was proposed in [4] as

ŝ = V^{−1} s,   (7)

where V comes from the Jordan decomposition of the non-symmetric adjacency matrix A, i.e. A = V J V^{−1}.
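The two variation measures in (3) and (4) are straightforward to implement. The sketch below (ours; the random graph is arbitrary, and the paper's convention A[i, j] = a_ij is assumed) also checks that GDV vanishes on constant signals, is positively homogeneous, and reduces to GAV when the graph is undirected:

```python
import numpy as np

def gdv(A, x):
    """Graph Directed Variation, eq. (3): sum_{i,j} a_ji [x_i - x_j]_+ ."""
    d = np.maximum(x[:, None] - x[None, :], 0.0)   # d[i, j] = [x_i - x_j]_+
    return float(np.sum(A.T * d))                  # (A.T)[i, j] = a_ji

def gav(A, x):
    """Graph Absolute Variation, eq. (4): sum_{i>j} a_ji |x_i - x_j|."""
    d = np.abs(x[:, None] - x[None, :])            # symmetric |x_i - x_j|
    return float(np.sum(np.triu(A * d, 1)))        # one term per ordered pair

rng = np.random.default_rng(0)
A = rng.uniform(0, 1, (6, 6))
np.fill_diagonal(A, 0)                             # no self-loops
x = rng.normal(size=6)

# Property ii): a constant signal has zero directed variation.
assert np.isclose(gdv(A, np.full(6, 3.0)), 0.0)
# Property iii): positive homogeneity.
assert np.isclose(gdv(A, 2.5 * x), 2.5 * gdv(A, x))
# Undirected case: symmetrizing A makes eq. (3) collapse to eq. (4).
As = (A + A.T) / 2
assert np.isclose(gdv(As, x), gav(As, x))
```

The last check mirrors the derivation in the text: with a_ij = a_ji, each unordered pair {i, j} contributes a_ji([x_i − x_j]_+ + [x_j − x_i]_+) = a_ji |x_i − x_j|.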
It is easy to show that the directed variation GDV satisfies the following properties:
i) GDV(x) ≥ 0, ∀x ∈ R^N;
ii) GDV(x) = 0, ∀x = c1 with c ≥ 0;
iii) GDV(αx) = α GDV(x), ∀α ≥ 0, i.e. it is positively homogeneous;
iv) GDV(x + y) ≤ GDV(x) + GDV(y), ∀x, y ∈ R^N.
GDV is neither a proper norm nor a semi-norm since, in the latter case, it should be absolutely homogeneous. However, it meets the desired property ii), ensuring that a constant graph signal has zero total variation.

To estimate the variations of the graph Fourier basis and to identify an ordering among frequencies, the total variation of a vector s was defined in [4] as

TVA(s) = ||s − A_norm s||_1,   (8)

where A_norm := A/|λ_max(A)|. The previous definition leads to the elegant theory of algebraic signal processing over graphs [1], [4], [15], [16]. However, there are some critical issues associated with that definition that need to be further explored. First, the definition of total variation as given in (8) does not ensure that a constant graph signal has zero total variation, and this collides with the common meaning of total variation [17], [13], [18]. Second, the columns of V are linearly independent complex generalized eigenvectors, but in general they are not orthogonal. This gives rise to a GFT that does not preserve inner products when passing from the observation to the transformed domain. Furthermore, the computation of the Jordan decomposition incurs serious and intractable numerical instabilities when the graph size exceeds even moderate values [19], and more stable matrix decomposition methods have to be adopted to tackle its instability issues [20]. To overcome some of these criticalities, very recently the authors of [14] proposed a shift operator based on the directed Laplacian of a graph. Using the Jordan decomposition, the graph Laplacian is decomposed as

L = V_L J_L V_L^{−1}   (9)

and the GFT is defined in [14] as

ŝ = V_L^{−1} s.   (10)

To quantify oscillations in the graph harmonics and to order the frequencies, the total variation was defined in [14] as

TVL(s) = ||L s||_1.   (11)

This definition of total variation ensures a zero value for constant graph signals. Furthermore, the eigenvalues with small absolute value correspond to low frequencies. Nevertheless, the GFT given by F = V_L^{−1} is still a non-unitary transform, and its computation is affected by the numerical instabilities associated with the Jordan decomposition.

In this paper, we propose a novel method to build the graph Fourier basis as the set of N orthonormal vectors x_i, i = 1, ..., N, that minimize the total variation defined in (3), which represents the continuous convex Lovász extension of the graph cut size in (1). The first vector is certainly the constant vector, i.e. x_1 = b1, with b = 1/√N, as this (unit-norm) vector yields a total variation equal to zero. Let us introduce the matrix X := (x_1, ..., x_N) ∈ R^{N×N} containing all the basis vectors. Thus, the search for the GFT basis can be formally stated as the search for the orthonormal vectors that minimize the directed total variation in (3), i.e.

min_{X∈R^{N×N}} GDV(X) := Σ_{k=1}^{N} GDV(x_k)
s.t. X^T X = I, x_1 = b1.   (P)

The constraints are used to find an orthonormal basis and to prevent the trivial null solution. Although the objective function is convex, problem P is non-convex due to the orthogonality constraint. In the next section, we present two alternative optimization strategies aimed at solving the non-convex, non-differentiable problem P in an efficient manner.

IV. OPTIMIZATION ALGORITHMS

To avoid handling the non-convex orthogonality constraints directly, several methods have been proposed in the literature based on the solution of a sequence of unconstrained problems approaching the feasibility condition, such as the penalty methods [31], [32] and the augmented Lagrangian based methods [33], [34]. The penalty method is generally simple, but it suffers from slow convergence and ill-conditioning. On the other hand, the standard augmented Lagrangian method solves a sequence of sub-problems that usually have no analytical solutions, and the choice of initial points ensuring a fast convergence rate is usually nontrivial. To cope with these issues, in this section we present two alternative iterative algorithms to solve the non-convex, non-smooth problem P, hinging on some recently developed methods for solving non-differentiable problems with non-convex constraints [25], [26]. The first method, introduced in [25] and called the splitting orthogonality constraints (SOC) method, is based on the alternating direction method of multipliers (ADMM) [35], [36] and the split Bregman method [37], [38]. The SOC method leads to some important benefits, as it is simple to implement and the resulting non-convex sub-problem with orthonormality constraint admits a closed-form solution. Although no convergence proof of the SOC method has been provided yet, numerical results validate its value and robustness.

An alternative optimization method that tackles the non-convex minimization problem P and guarantees convergence is the PAMAL algorithm recently developed in [26]. The algorithm combines the augmented Lagrangian method with proximal alternating minimization, and a convergence proof was provided in [26]. More specifically, this method has the so-called sub-sequence convergence property, i.e. there exists at least one convergent sub-sequence, and any limit point satisfies the Karush-Kuhn-Tucker (KKT) conditions of the original non-convex problem. Building on these algorithms, in the sequel we introduce two efficient optimization strategies that build the basis for the Graph Fourier Transform as the solution of problem P.

A. SOC method

The SOC algorithm was developed in [25] and tackles orthogonality constrained problems by iteratively solving a convex problem and a quadratic problem that admits a closed-form solution. More specifically, introducing an auxiliary variable P = X to split the orthogonality constraint, problem P is equivalent to

min_{X,P∈R^{N×N}} GDV(X)
s.t. X = P, x_1 = b1, P^T P = I.   (12)

The first constraint is linear and, as discussed in [25], it can be handled using Bregman iteration. Therefore, by adding the Bregman penalty function [37], problem (12) is equivalent to the following simple two-step procedure:

(X^k, P^k) ≜ arg min_{X,P∈R^{N×N}} GDV(X) + (β/2)||X − P + B^{k−1}||_F^2
             s.t. x_1 = b1, P^T P = I;
B^k = B^{k−1} + X^k − P^k,

where β is a strictly positive constant. Similarly to ADMM and the split Bregman iteration [39], the above problem can be solved by alternately minimizing with respect to X and P.
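A minimal sketch of the SOC-style splitting described above (ours, not the paper's reference implementation): the convex X-step is approximated here by a few subgradient descent steps on GDV plus the Bregman quadratic, while the P-step uses the closed-form SVD projection onto orthogonal matrices; step sizes, iteration counts, and the toy graph are our assumptions.

```python
import numpy as np

def gdv_subgrad(A, x):
    """A subgradient of GDV(x) = sum_{i,j} a_ji [x_i - x_j]_+ ."""
    active = (x[:, None] - x[None, :]) > 0   # terms with x_i > x_j are active
    W = A.T * active                          # W[i, j] = a_ji on active terms
    return W.sum(axis=1) - W.sum(axis=0)      # d/dx_i minus d/dx_j contributions

def soc_gft(A, beta=1.0, outer=50, inner=20, step=0.05, seed=0):
    N = A.shape[0]
    b = 1.0 / np.sqrt(N)
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.normal(size=(N, N)))
    X[:, 0] = b                               # constraint x_1 = b*1
    P, B = X.copy(), np.zeros((N, N))
    for _ in range(outer):
        # X-step: subgradient descent on GDV(X) + (beta/2)||X - P + B||_F^2.
        C = P - B
        for _ in range(inner):
            G = np.column_stack([gdv_subgrad(A, X[:, k]) for k in range(N)])
            X -= step * (G + beta * (X - C))
            X[:, 0] = b                       # re-impose the first column
        # P-step: nearest orthogonal matrix to X + B (Proposition 2).
        Q, _, Rt = np.linalg.svd(X + B)
        P = Q @ Rt
        # Bregman/dual update.
        B += X - P
    return X, P

A = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)     # a directed 4-node cycle (toy)
X, P = soc_gft(A)
assert np.allclose(P.T @ P, np.eye(4), atol=1e-8)  # P orthonormal by construction
assert np.allclose(X[:, 0], 0.5)                   # first vector fixed at 1/sqrt(N)
```

In the paper the X-step is solved as a full convex program; the subgradient loop above is only a cheap stand-in to make the splitting structure executable.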

Algorithm 1: SOC method
Set β > 0, X^0 ∈ R^{N×N} with (X^0)^T X^0 = I and x_1^0 = b1, P^0 = X^0, B^0 = 0, k = 1.
Repeat
  Find X^k as the solution of P_k in (13),
  Y^k = X^k + B^{k−1},
  Compute the SVD decomposition Y^k = Q̄ S R̄^T,
  P^k = Q̄ R̄^T,
  B^k = B^{k−1} + X^k − P^k,
  k = k + 1,
until convergence.

The problem is solved by iteratively minimizing with respect to X and P:

1. X^k ≜ arg min_{X∈R^{N×N}} GDV(X) + (β/2)||X − P^{k−1} + B^{k−1}||_F^2
        s.t. x_1 = b1   (P_k)
2. P^k ≜ arg min_{P∈R^{N×N}} ||P − (X^k + B^{k−1})||_F^2
        s.t. P^T P = I   (Q_k)
3. B^k = B^{k−1} + X^k − P^k.   (13)

The interesting aspect of this formulation is that subproblem P_k is convex, whereas the second constrained quadratic problem Q_k has a closed-form solution, as illustrated in the following proposition.

Proposition 2: Define Y^k = X^k + B^{k−1} and let Y^k = Q̄ S R̄^T be its SVD decomposition, where Q̄, R̄ ∈ R^{N×N} are unitary matrices, and S ∈ R^{N×N} is the diagonal matrix whose entries are the singular values of Y^k. Then, the optimal solution of the quadratic non-convex problem Q_k in (13) is P^k = Q̄ R̄^T.
Proof: See the proof of Theorem 2.1 in [25].

Combining (13) and Proposition 2, the main steps of the SOC method are summarized in Algorithm 1. It is important to remark that the choice of the coefficient β strongly affects the convergence behavior of the algorithm: a large value of β will enforce the equality constraint more strongly, while a too small β might not be able to guarantee that the solution satisfies the orthogonality constraint. Hence, a proper tuning of the coefficient β is important to ensure fast convergence of the algorithm. Although, as remarked in [25], the convergence analysis of the SOC algorithm is still an open problem, we will show next that the numerical results testify the validity and robustness of this method when applied to our case.

B. PAMAL method

As an alternative efficient method to tackle the non-convexity of problem P, we propose here an approach based on the PAMAL algorithm [26]. The method solves the orthogonality constrained problem by iteratively updating the primal variables and the multipliers estimates. To this end, let us reformulate the problem as follows. Let us introduce the sets S_1 ≜ {x : x = ±b1} and S_t ≜ {P ∈ R^{N×N} : P^T P = I}, where S_t represents the Stiefel manifold [40]. For any set S, its indicator function is defined as

δ_S(X) = 0 if X ∈ S, +∞ otherwise.   (14)

Given these symbols, problem (12) is equivalent to the following one:

min_{X,P∈R^{N×N}} f(X, P) ≜ GDV(X) + δ_{S_1}(x_1) + δ_{S_t}(P)
s.t. H(X, P) ≜ P − X = 0.   (P_e)

The basic idea to solve a problem in the form of P_e was proposed in [26], and combines the augmented Lagrangian method [41], [33] with the alternating proximal minimization algorithm, known as the PAM method [42], which deals with non-smooth, non-convex optimization. According to the augmented Lagrangian method, we add a penalty term to the objective function in order to associate a high cost to unfeasible points. In particular, the augmented Lagrangian function associated with the non-smooth problem P_e is

L(X, P, Λ) = f(X, P) + ⟨Λ, H(X, P)⟩ + (ρ/2)||H(X, P)||_F^2,

where ρ is a positive penalty coefficient, Λ ∈ R^{N×N} represents the multipliers matrix, while the matrix inner product is defined as ⟨A, B⟩ ≜ tr(A^T B). The proposed augmented Lagrangian method reduces problem P_e to a sequence of problems that alternately update, at each iteration k, the following three steps:

1. Compute the critical point (X^k, P^k) of the function L(X, P, Λ^k; ρ^k) by solving
   (X^k, P^k) ≜ min_{X,P∈R^{N×N}} L(X, P, Λ^k; ρ^k);   (15)
2. Update the multiplier estimates Λ^k;
3. Update the penalty parameter ρ^k.

We will show next how to implement the previous steps, which are described in detail in Algorithm 2.

Computation of the critical points (X^k, P^k). The optimal solution (X^k, P^k) of problem (15) is computed using an approximate algorithm, i.e. finding a subgradient point Θ^k ∈ ∂L(X^k, P^k, Λ^k; ρ^k) satisfying, with a prescribed tolerance value ǫ^k, the following inequality

||Θ^k||_∞ ≤ ǫ^k   (16)

with P^k ∈ S_t. To evaluate such a point, we exploit a coordinate-descent method with proximal regularization based on the PAM method proposed in [43]. More specifically, at the k-th outer iteration of the algorithm, we compute (X^k, P^k) by iteratively solving, at each inner iteration n, the following proximal regularization of a two-block Gauss-Seidel method:

X^{k,n} = arg min_{X∈R^{N×N}, x_1=b1} L(X, P^{k,n−1}, Λ^k; ρ^k) + (c_1^{k,n−1}/2)||X − X^{k,n−1}||_F^2   (P̃_{k,n})

P^{k,n} = arg min_{P∈R^{N×N}} L(X^{k,n}, P, Λ^k; ρ^k) + (c_2^{k,n−1}/2)||P − P^{k,n−1}||_F^2   (Q̃_{k,n})
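The closed-form solutions in Proposition 2 (and, below, Proposition 3) are instances of the classical nearest-orthogonal-matrix projection: the minimizer of ||P − Y||_F over orthogonal P is the product of the SVD factors of Y with the singular values replaced by ones. A quick numerical sanity check (ours):

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.normal(size=(5, 5))

# Closed form: if Y = Q diag(s) R^T, then P* = Q R^T solves
# min ||P - Y||_F  s.t.  P^T P = I.
Q, s, Rt = np.linalg.svd(Y)
P_star = Q @ Rt

assert np.allclose(P_star.T @ P_star, np.eye(5), atol=1e-10)  # P* is orthogonal

# P* beats any other orthogonal candidate (here: random ones and the identity).
best = np.linalg.norm(P_star - Y)
assert best <= np.linalg.norm(np.eye(5) - Y) + 1e-12
for _ in range(200):
    O, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # a random orthogonal matrix
    assert best <= np.linalg.norm(O - Y) + 1e-12
```

The underlying identity is ||P − Y||_F^2 = ||Y||_F^2 + N − 2 tr(P^T Y), so the minimization reduces to maximizing tr(P^T Y), which is bounded by the sum of the singular values and attained at P* = Q R^T.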

where the proximal parameters c_i^{k,n} can be arbitrarily chosen as long as they satisfy

0 < c ≤ c_i^{k,n} ≤ c̄ < ∞, ∀k, n ∈ N, i = 1, 2, for some c, c̄ > 0.   (17)

The first convex problem P̃_{k,n} can be solved through any convex optimization numerical tool, whereas the second problem Q̃_{k,n} admits a closed-form solution, as stated in the following proposition.

Proposition 3: Define the matrix

F ≜ (c_2^{k,n−1} P^{k,n−1} + ρ^k X^{k,n} − Λ^k)(ρ^k + c_2^{k,n−1})^{−1}

with SVD decomposition F = Q Σ T^T, where Q, T ∈ R^{N×N} are unitary matrices, while Σ is a diagonal matrix with entries given by the singular values of F. The optimal solution of the non-convex problem Q̃_{k,n} is given by P^{k,n} = Q T^T.
Proof: See Appendix A.

Algorithm 2: PAMAL method
Given the parameters {ǫ^k}_{k∈N}, 0 < ǫ^k < 1, τ ∈ [0, 1), γ > 1, k = 1, ρ^k > 0, Λ^k ∈ R^{N×N}, Λ_min ≤ Λ^k ≤ Λ_max.
Repeat
  Step 1: Compute (X^k, P^k) as in Algorithm 3 such that there exists Θ^k ∈ ∂L(X^k, P^k, Λ^k; ρ^k) with ||Θ^k||_∞ ≤ ǫ^k and (P^k)^T P^k = I.
  Step 2: Update the multiplier estimates
    Λ^{k+1} = [Λ^k + ρ^k (P^k − X^k)]_T,
    where [·]_T is the projection on T ≜ {Λ : Λ_min ≤ Λ ≤ Λ_max}.
  Step 3: Set R^k = P^k − X^k, and update the penalty parameter as
    ρ^{k+1} = ρ^k if ||R^k||_∞ ≤ τ ||R^{k−1}||_∞, γρ^k otherwise;
  k = k + 1,
until convergence.

Algorithm 3: PAM method for solving step 1 in Algorithm 2
Let (X^{1,0}, P^{1,0}) be any finite initialization. For k ≥ 2, set (X^{k,0}, P^{k,0}) = (X^{k−1}, P^{k−1}), n = 0.
Repeat
  Step 1: Set n = n + 1. Compute X^{k,n} by solving problem P̃_{k,n}.
  Step 2: P^{k,n} = Q T^T, where Q, T come from the SVD decomposition
    (c_2^{k,n−1} P^{k,n−1} + ρ^k X^{k,n} − Λ^k)/(ρ^k + c_2^{k,n−1}) = Q Σ T^T.
  Step 3: Set (X^k, P^k) = (X^{k,n}, P^{k,n}), Θ^k = Θ^{k,n},
until ||Θ^{k,n}||_∞ ≤ ǫ^k.

Algorithm 2 describes the outer loop of the PAMAL method, whereas in Algorithm 3 we report the inner iterations needed to solve problems P̃_{k,n} and Q̃_{k,n} in step 1 of Algorithm 2. The inner iterations are terminated when there exists a subgradient point Θ^{k,n} ∈ ∂L(X^{k,n}, P^{k,n}, Λ^k; ρ^k) satisfying ||Θ^{k,n}||_∞ ≤ ǫ^k, P^{k,n} ∈ S_t, where Θ^{k,n} ≜ (Θ_1^{k,n}, Θ_2^{k,n}), with the subgradients given by

Θ_1^{k,n} = c_1^{k,n−1}(X^{k,n−1} − X^{k,n}) + ρ^k(P^{k,n−1} − P^{k,n})
Θ_2^{k,n} = c_2^{k,n−1}(P^{k,n−1} − P^{k,n}).   (18)

Update of the multipliers and penalty coefficients. The rule for updating the multipliers matrix in Step 2 of Algorithm 2 needs some further discussion. We adopt the classical first-order approximation by imposing that the estimates of the multipliers must be bounded. Then, we explicitly project the multipliers matrix onto the compact box set T ≜ {Λ : Λ_min ≤ Λ ≤ Λ_max}, with −∞ < [Λ_min]_{i,j} ≤ [Λ_max]_{i,j} < ∞, ∀i, j. The boundedness of the multipliers is a fundamental assumption needed to preserve the property that global minimizers of the original problem are obtained if each outer iteration of the penalty method computes a global minimum of the subproblem. Unfortunately, assumptions that imply boundedness of the multipliers tend to be very strong and often hard to verify. Nevertheless, following [26], [41], [44], we also impose the boundedness of the multipliers. This implies that, in the convergence proofs, we will assume that the true multipliers fall within the bounds imposed by the algorithm; see, e.g., [26]. Regarding the setting of the remaining parameters of the proposed algorithm, we will assume that: i) the sequence of positive tolerance parameters {ǫ^k}_{k∈N} is chosen such that lim_{k→∞} ǫ^k = 0; ii) the penalty parameter ρ^k is updated according to the infeasibility degree by following the rule described in step 3 of Algorithm 2 [26], [33].

Convergence Analysis. We now discuss in detail the convergence properties of the proposed PAMAL method. Assume that: i) the proximal parameters {c_i^{k,n}}, ∀k, n, are arbitrarily chosen as long as they satisfy (17); ii) the sequence {ǫ^k}_{k∈N} is chosen such that lim_{k→∞} ǫ^k = 0; iii) the penalty parameter ρ^k is updated according to the rule described in Algorithm 2. The PAM method, as given in Algorithm 3, guarantees global convergence to a critical point [43, Th. 6.2], provided that the penalty parameters {ρ^k}_{k∈N} in Algorithm 2 satisfy some mild conditions, as stated in the following theorem.

Theorem 1: Denote by {(X^{k,n}, P^{k,n})}_{n∈N} the sequence generated by Algorithm 3. The function L in (15) satisfies the Kurdyka-Łojasiewicz (K-Ł) property¹. Then, Θ^{k,n} defined by (18) satisfies

Θ^{k,n} ∈ ∂L(X^{k,n}, P^{k,n}, Λ^k; ρ^k), ∀n ∈ N.   (19)

Also, if γ > 1 and ρ^1 > 0, for each k ∈ N it holds

||Θ^{k,n}||_∞ → 0, as n → ∞.   (20)

Proof: See Appendix B.

The convergence claim for Algorithm 2 to a stationary solution of problem P_e is stated in the following theorem.

Theorem 2: Let {(X^k, P^k)}_{k∈N} be the sequence generated by Algorithm 2. Suppose ρ^1 > 0 and γ > 1. Then, the set of limit points of {(X^k, P^k)}_{k∈N} is non-empty, and every limit point satisfies the KKT conditions of the original problem P_e.
Proof: The proof follows similar arguments as in [26, Th. 3.1-3.5], and thus is omitted due to space limitations.

Remark 1: Note that both Algorithms 1 and 3 have to compute, at each step of their loops, the SVD of an N × N matrix. Therefore, at each iteration their computational cost is proportional to O(N^3). So, clearly, there is a complexity issue that deserves further investigation to enable the application to large-size graphs. In this paper, we have not investigated methods to reduce the complexity of the approach by exploiting, for instance, the sparsity of the graphs under analysis. Also, we have not optimized the selection of the parameters involved in both the SOC and PAMAL methods. However, even if complexity is an issue, the proposed approach is more numerically stable than the only method available today for the analysis of directed graphs, based on the Jordan decomposition.

¹The reader can refer to Appendix B for a definition of the Kurdyka-Łojasiewicz (K-Ł) property.

Algorithm 4: Balanced graph signal variation
For k = 2, ..., N
  Set n = 0, x_k^0 = x a nonzero vector with m(x_k^0) = 0, α > 0, 0 < ǫ ≪ 1.
  Repeat

Remark 2.
multipliers fall within the bounds imposed by the algorithm, P see, e.g. [26]. Regarding the setting of the remaining param- Proof. The proof follows similar arguments as in [26, Th. 3.1- eters of the proposed algorithm, we will assume that: i) the 3.5], and thus is omitted due to space limitation. k Remark 1. Note that both Algorithms 1 and 3 at each step sequence of positive tolerance parameters ǫ k∈N is chosen k { } k of their loops have to compute the SVD of an N N such that limk→∞ ǫ = 0; ii) the penalty parameter ρ is × updated according to the infeasibility degree by following the matrix. Therefore, at each iteration their computational cost is proportional to (N 3). So, clearly, there is a complexity issue rule described in step 3 of Algorithm 2 [26], [33]. O Convergence Analysis. We now discuss in details the conver- that deserves further investigations to enable the application gence properties of the proposed PAMAL method. Assume to large size graphs. In this paper, we have not investigated k,n methods to reduce the complexity of the approach exploiting, that: i) the proximal parameters ci ∀k,n are arbitrarily { } k for instance, the sparsity of the graphs under analysis. Also, we chosen as long as they satisfy (17); ii) the sequence ǫ k∈N k { } have not optimized the selection of the parameters involved in is chosen such that limk→∞ ǫ =0; iii) the penalty parameter k ρ is updated according to the rule described in Algorithm 2. 1The reader can refer to Appendix B for a definition of the Kurdyka- The PAM method, as given in Algorithm 3, guarantees global Łojasiewicz (K-Ł) property. 7 both SOC and PAMAL methods. However, even if complexity Algorithm 4 : Balanced graph signal variation is an issue, the proposed approach is more numerically stable For k = 2,...,N n 0 n than the only method available today for the analysis of Set n = 0, xk = x nonzero vector with m(xk ) = 0, α> 0, 0 < ǫ ≪ 1. directed graphs, based on the Jordan decomposition. Repeat Remark 2. 
The two alternative methods proposed above to wn n ∈ sign(xk ), solve the non-convex problem are robust to random initial- vn = wn − mean(wn)1, P hn = xn + αvn, izations, as testified also by the numerical results presented k n n+1 E(xk ) hn 2 in the sequel. In terms of implementation complexity, SOC xˆk = arg min f(xk)+ k xk − k2, ∈X b 2α algorithm is easier to code even though, to the best of our xk k n+1 n+1 n+1 knowledge, a theoretical proof of its convergence is still yk = xˆk − m(xˆk ), yn+1 lacking. n+1 k xk = n+1 , n = n + 1, k yk k2 n n−1 INIMIZATION OF BALANCED TOTAL VARIATION until | E(xk ) − E(xk ) |< ǫ, V. M n+1 n+1 xˆk The minimization of the total variation as in (12) is inspired xk = n+1 , k xˆk k2 by the min-cut problem. However, in some cases, this might end. favor the appearance of very sparse vectors or of very small clusters, possibly also isolated nodes. One way to prevent these undesired solutions passes through the introduction of the N bases xk k=1, with x1 = b1, by iteratively solving, for balanced cut [45], [46]. A popular definition for the balanced k =2,...,N{ } , the following problem cut of undirected graph is the Cheeger cut [47], which is given b by: min E(xk) ( k) x ∈RN cut( , ¯) k P (25) min S S . (21) s.t. x T x = δ , ℓ =1, . . . , k. S⊆V min( , ¯ ) k ℓ k,ℓ |S| |S| Note that min( , ¯ ) attains its maximum when = ¯ = Note that problem b is non-convex in both the constraints set Pk N/2, so that,|S| for| aS| given value of cut( , ¯), the|S| minimum|S| and the objective function. Recently, several algorithms [45], occurs when and ¯ have approximatelyS equalS size. While the [49], [48], [46], have been proposed to minimize relaxations problem statedS aboveS is NP-hard, a tight continuous relaxation of the balanced cut problem that are similar to (22). Typi- of the balanced cut problems has recently been shown to cally, these algorithms give excellent numerical performance, provide excellent clustering results [46,48,49]. 
In [49], [27] although theoretical convergence proofs are not available. For it was proved that the balanced Cheeger cut problem in (21) instance, in [27], the authors proposed an algorithm minimiz- for undirected graphs admits the following exact continuous ing (22), along with a proof of convergence to a critical point relaxation of the original problem. This method is a new steepest descent a x x algorithm based on the explicit-implicit gradient [50] of the i j,i>j ji i j f(x ) min | − | (22) function E(x ) , k where B(x )= x (i) m(x ) . N k B(x ) k i k k x∈R P P xi m(x) k | − | i | − | The explicit-implicit subgradient of theP non-smooth function P where m(x) stands for the median value of x. Note that since E(xk) is given by it holds xi m(x) = 0, x span 1 , problem (22) i | − | ∀ ∈ { } xn+1 xn ∂x f(xn+1) E(xn)∂x B(xn) is well-definedP if x 1. Then, the problem in (22) can be k − k = k k − k k k (26) ⊥ n n recast as: τ − B(xk )

aji xi xj or i j,i>j | − | min . (23) n+1 x∈RN ,x⊥1 P P m x n i xi ( ) n+1 n n ∂xk f(xk ) n E(xk ) n | − | xk = xk τ n + τ n ∂xk B(xk ). (27) In [49] it was proved thatP (22) is an exact relaxation of the − B(xk ) B(xk ) Cheeger cut problem and, for any minimizer x, there is a Let us now consider the following proximal minimization number ν such that, i, the binary solution xν (i)=1 if x(i) > ∀ problem ν and xν (i)=0 for x(i) ν, is also a minimizer of the ≤ Cheeger cut problem. Then, from the equivalence of problems B(xn) xn+1 , x k x gn 2 (22) and (23), this result holds true also for any minimizer of k arg min f( k)+ n k . (28) x ∈RN 2τ k − k (23). In the sequel, we formulate the problem of finding the k Fourier basis minimizing the balanced total variation in both Any stationary solution of (28) will be also solution of the cases of directed and undirected graphs. To this end, let us subgradient equation define the function τ n x ∂ f(x )+ x gn =0, (29) , f( k) n xk k k E(xk) (24) B(xk ) − xk(i) m(xk) i | − | P so that at step n +1 one gets where f(xk) = GAV(xk) in (4), or f(xk) = GDV(xk) in (3), in case of undirected or directed graphs, respectively. τ n xn+1 gn x (30) According to problem (22), we can find a set of Fourier k = n ∂xk f( k). − B(xk ) 8
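To make the updates (26)-(30) concrete, the following sketch (in Python/NumPy rather than the paper's Matlab; the 4-node path graph, step size τ, and starting vector are toy choices of ours) evaluates the ratio E(x) = f(x)/B(x) for the undirected case f = GAV and performs the explicit half of one iteration, g^n = x^n + τ (E(x^n)/B(x^n)) w^n, with the subgradient choice w^n = sign(x − m(x)) ∈ ∂B(x^n):

```python
import numpy as np

def median_center(x):
    # B(x) = sum_i |x_i - m(x)|, with m(x) the median of x
    m = np.median(x)
    return np.sum(np.abs(x - m)), m

def gav(A, x):
    # graph absolute variation: sum_{i>j} a_ji |x_i - x_j| (undirected case)
    n = len(x)
    return sum(A[j, i] * abs(x[i] - x[j])
               for i in range(n) for j in range(n) if i > j)

def explicit_step(A, x, tau):
    # explicit half of the explicit-implicit iteration:
    # g = x + tau * E(x)/B(x) * w,  with w = sign(x - m(x)) in dB(x)
    B, m = median_center(x)
    E = gav(A, x) / B
    w = np.sign(x - m)
    return x + tau * (E / B) * w

# toy 4-node path graph (our own example, not from the paper)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
x = np.array([1.0, 0.5, -0.5, -1.0])
B, _ = median_center(x)
E = gav(A, x) / B        # scale-invariant ratio: E(c x) = E(x) for c > 0
g = explicit_step(A, x, tau=0.5)
```

The implicit half is then the proximal problem (28), which in practice is solved subject to the orthogonality constraints with respect to the previously computed basis vectors.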

Fig. 1: Examples of graphs with: (a) 2 directed links; (b) 3 directed links; (c) 1 directed cycle.

Replacing in this last equality the expression of x_k^{n+1} given in (27), we obtain the following set of two equations to be iteratively updated:

g^n = x_k^n + τ^n ( E(x_k^n) / B(x_k^n) ) w^n,  with w^n ∈ ∂_{x_k} B(x_k^n)

x_k^{n+1} = arg min_{x_k ∈ X̃_k} f(x_k) + ( B(x_k^n) / (2τ^n) ) ‖x_k − g^n‖²

where we define X̃_k ≜ {x_k ∈ R^N : x_k^T x_ℓ = 0, for ℓ = 1, ..., k − 1}. Note that X̃_k is a set of linear constraints since, for each vector x_k, the previously computed vectors x_ℓ, for ℓ = 1, ..., k − 1, are assumed to be known. The unit-norm constraint is satisfied through a simple projection of the optimal solution onto the unit sphere. As shown in [49], [27], the algorithm decreases the objective function and preserves the zero-mean property of the successive iterates. It was also observed in [27] that a faster convergence rate can be achieved when the step size is chosen as τ^n = α B(x_k^n)/E(x_k^n), with α > 0.

The formal description of the iterative optimization method is given in Algorithm 4, where we denote by sign(a) and mean(a), respectively, the element-wise sign and the mean value of a vector a. The convergence analysis of the algorithm to a critical point of E was derived in [49], [27] for undirected graphs. However, since for directed graphs f(x_k) preserves all the required properties (i.e., it is non-smooth and convex), the convergence results in [49], [27] hold also for the minimization of the balanced directed variation.

Algorithm 4: Balanced graph signal variation
For k = 2, ..., N
  Set n = 0, x_k^0 = x, a nonzero vector with m(x_k^0) = 0, α > 0, 0 < ǫ ≪ 1.
  Repeat
    w^n ∈ sign(x_k^n),
    v^n = w^n − mean(w^n)1,
    h^n = x_k^n + α v^n,
    x̂_k^{n+1} = arg min_{x_k ∈ X̃_k} f(x_k) + ( E(x_k^n) / (2α) ) ‖x_k − h^n‖²₂,
    y_k^{n+1} = x̂_k^{n+1} − m(x̂_k^{n+1}),
    x_k^{n+1} = y_k^{n+1} / ‖y_k^{n+1}‖₂,  n = n + 1,
  until | E(x_k^n) − E(x_k^{n−1}) | < ǫ,
  x_k = x̂_k^{n+1} / ‖x̂_k^{n+1}‖₂.
end

VI. NUMERICAL RESULTS

In this section, we present some numerical results to assess the effectiveness of the proposed strategy for building the GFT basis. First, we illustrate some examples of application and then we compare the proposed approach with alternative definitions of the GFT basis, as given in [5], [4], [14]. In all our experiments, the parameters of the SOC and PAMAL methods are set as follows (unless stated otherwise): β = 100, τ = 0.5, γ = 1.5, ρ^1 = 50, ǫ^k = (0.9)^k, ∀ k ∈ N, Λ_min = −1000·I, Λ_max = 1000·I, Λ^1 = 0, c^{k,n} = c_i = c̄ = 0.5, ∀ i, k, n.

Examples of bases for directed graphs. For the sake of understanding the structure of the GFT basis vectors obtained with our methods, we start by considering the simple directed graphs depicted in Fig. 1, i.e., a directed graph composed of N = 15 nodes with three clusters, connected by (a) 2 directed links, (b) 3 directed links, and (c) a directed cycle. As a first example, in Fig. 2 we report the basis vectors {x_k}_{k=1}^{15} obtained through Algorithm 2 for graph (a) in Fig. 1. The intensity of the vector entries is encoded in the color associated to each vertex. Directed and undirected edges are represented by arrowed and continuous lines, respectively. The order chosen to plot the basis vectors corresponds to increasing values of the directed variation GDV(x_k) (reported on top of each subgraph). It is possible to notice that the basis vectors tend to identify clusters and, furthermore, the value assumed by the basis vectors within each cluster is exactly constant. This is a useful property in view of applications to unsupervised or semi-supervised clustering, where the label (signal) associated to each cluster is exactly constant within the cluster. This property does not hold with current methods based on the eigenvectors of either the Laplacian or adjacency matrices, whose behavior within each cluster is only smooth, but not exactly constant. To grasp the reason for this difference, it is worth noticing that, in the case of undirected graphs, the above property is a consequence of having minimized an ℓ1-norm (see, e.g., (4)), rather than an ℓ2-norm, as in the case of the Laplacian eigenvectors. It is interesting to remark from Fig. 2 how there are three basis vectors that yield a zero directed variation. In particular, besides the constant vector x_1, the vectors x_2 and x_3, even if not constant, yield zero variation just by assigning values to the entries of the cluster {11÷15} smaller than the values of the clusters {1÷5} and {6÷10}. Since there is no directed edge between the clusters {1÷5} and {6÷10}, there are two ways to enforce the previous property, still maintaining vector orthogonality. As a further example, let us consider graph (b) in Fig. 1, where we added a directed link from node 7 to node 5. From Fig. 3 we observe that, in this case, the number of basis vectors having zero directed variation reduces to two, since the presence of the new directed link leaves only one possible way, besides the constant vector, to have GDV = 0 while still preserving basis orthogonality. In Fig. 4, we report the optimal basis, computed using Algorithm 2, for the graph with a directed cycle depicted in Fig. 1c. Interestingly, in this case, there can only be one vector that yields zero directed variation: the constant vector. In fact, the cyclical structure of the graph now prevents the existence of non-constant vectors able to null the directed variation. The properties described above are a
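The zero-variation vectors discussed above are easy to reproduce numerically. The sketch below builds a toy graph of our own (two 3-node clusters joined by a single directed link, not one of the graphs in Fig. 1) and checks, taking GDV(x) = Σ_{i,j} a_ji max(x_i − x_j, 0) as the directed-variation expression used in the Appendix, that a vector which is constant within clusters and assigns the smaller value to the downstream cluster yields zero GDV, whereas the reversed assignment does not:

```python
import numpy as np

def gdv(A, x):
    # directed variation: sum_{i,j} a_ji * max(x_i - x_j, 0);
    # zero when x is non-increasing along every directed edge
    n = len(x)
    return sum(A[j, i] * max(x[i] - x[j], 0.0)
               for i in range(n) for j in range(n))

# toy graph: two 3-node clusters, undirected edges inside each cluster
# (encoded as symmetric entries) and ONE directed edge 2 -> 3 between them
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1.0          # intra-cluster, undirected
A[2, 3] = 1.0                         # directed link into cluster {3, 4, 5}

x_down = np.array([1., 1., 1., 0., 0., 0.])   # downstream cluster smaller
x_up   = np.array([0., 0., 0., 1., 1., 1.])   # downstream cluster larger
```

Here gdv(A, x_down) = 0 while gdv(A, x_up) > 0, mirroring how the optimal basis vectors achieve zero variation by decreasing along directed links.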

[Figs. 2 and 3 (panels omitted): each basis vector is plotted on the graph with its directed variation as panel title. Fig. 2: GDV(x_1) = GDV(x_2) = GDV(x_3) = 0, growing up to GDV(x_15) = 5. Fig. 3: GDV(x_1) = GDV(x_2) = 0, GDV(x_3) = 2, growing up to GDV(x_15) = 5.]
Fig. 2: Optimal basis vectors x_k, k = 1, ..., 15, for Algorithm 2 and the directed graph in Fig. 1a. Fig. 3: Optimal basis vectors x_k, k = 1, ..., 15, for Algorithm 2 and the graph in Fig. 1b.

unique and interesting consequence of the edge directivity. In fact, as can be observed from Fig. 5, the optimal basis for the corresponding undirected graph (obtained by simply removing the edge directivity) has only one vector with zero variation, the constant vector. Conversely, in the cases shown before, we had three, two, and one vectors yielding zero variation.

Convergence test. Since the optimization problem P is non-convex, there is of course the possibility that the proposed methods fall into a local minimum. Furthermore, while the PAMAL method guarantees convergence, the SOC algorithm might also fail to converge because, theoretically speaking, there is no convergence analysis. To test what happens, we considered several independent initializations of both SOC and PAMAL algorithms in the search for a basis for the graph of Fig. 1a. In Fig. 6, we report the average behavior (± the standard deviation) of the directed variation versus the iteration index m, which counts the overall number of (outer and inner) iterations for Algorithms 1 and 2. The curves refer to 200 independent initializations of the SOC and PAMAL algorithms, using the same initialization for both. We can observe that in all cases the algorithms converge, but there is indeed a spread in the final variation, meaning that both methods can incur local minima. Nonetheless, the spread is quite limited, which suggests that bases associated to different local minima behave similarly in terms of total variation. Additionally, since the PAMAL algorithm solves the orthogonality-constrained, non-convex problem by iteratively updating the primal variables and the multipliers, the objective function evaluated at each (inner and outer) iteration does not necessarily follow a monotonic decay, as can be noticed in the lower subplot of Fig. 6.

Comparison with alternative GFT bases. We now compare the GFT basis found with our methods with the bases associated to either the Laplacian or the adjacency matrix, as proposed in [5], [4] and references therein. To compare the results, we applied all algorithms to several independent realizations of random graphs. We chose as family of random graphs the so-called scale-free graphs, as they are known to fit many situations of practical interest [51]. In the generation of random scale-free graphs, it is possible to set the minimum degree d_min of each node. To compare our method with the GFT definition proposed in [1], since the eigenvectors of an asymmetric matrix can be complex and the directed total variation GDV, as defined in (3), does not represent a valid metric for complex vectors, we restricted the comparison to undirected scale-free graphs, in which case the adjacency and Laplacian matrices are real and
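The paper does not detail its scale-free generator; one standard construction consistent with a settable minimum degree d_min is Barabási-Albert preferential attachment with m = d_min edges per new node. This choice, and all parameters below, are our assumption, not the authors' code:

```python
import numpy as np

def barabasi_albert(n, m, seed=0):
    # preferential attachment: each new node attaches m edges to existing
    # nodes with probability proportional to their current degree;
    # every node then ends with degree >= m (the minimum degree d_min)
    rng = np.random.default_rng(seed)
    A = np.zeros((n, n))
    # start from a small connected seed: a clique on m + 1 nodes
    A[:m + 1, :m + 1] = 1 - np.eye(m + 1)
    for v in range(m + 1, n):
        deg = A[:v, :v].sum(axis=1)
        targets = rng.choice(v, size=m, replace=False, p=deg / deg.sum())
        for t in targets:
            A[v, t] = A[t, v] = 1.0
    return A

A = barabasi_albert(30, m=3)   # undirected scale-free graph, d_min = 3
deg = A.sum(axis=1)
```

The resulting adjacency matrix is real and symmetric, matching the setting used for the comparison with the eigenvector-based GFT definitions.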

[Figs. 4 and 5 (panels omitted): each basis vector is plotted on the graph with its variation as panel title. Fig. 4 (directed cycle): GDV(x_1) = 0 and all other vectors have strictly positive variation, up to GDV(x_15) = 5.2. Fig. 5 (undirected counterpart): GAV(x_1) = GQV(x_1) = 0, growing up to GAV(x_15) = 5.65, GQV(x_15) = 4.9.]
Fig. 4: Optimal basis vectors x_k, k = 1, ..., 15, for Algorithm 2 and the graph in Fig. 1c. Fig. 5: Optimal basis vectors x_k, k = 1, ..., 15, for Algorithm 2 and the undirected counterpart of the graph in Fig. 1c.

symmetric, so that their eigenvectors are real. In the sequel, we will use the notations GAV(X) := Σ_{k=1}^N GAV(x_k) and GQV(X) := Σ_{k=1}^N GQV(x_k) to denote, respectively, the total graph absolute and quadratic variation of a matrix X. In Fig. 7, we compare the following metrics: a) GAV(X⋆), derived by solving problem P through the SOC and PAMAL methods; b) GAV(V), where V are the eigenvectors of the adjacency matrix, according to the GFT defined in (7); c) GAV(U), where U are the eigenvectors of the Laplacian matrix, assuming the GFT as in (5), which for undirected graphs is equivalent to the GFT defined in (10). More specifically, Fig. 7 shows the previous metrics vs. the minimum degree of the graph, averaged over 100 independent realizations of scale-free graphs of N = 20 nodes. As we can notice from Fig. 7, the bases built using the SOC and PAMAL algorithms yield a significantly lower total variation than the conventional bases built with either adjacency or Laplacian eigenvectors. This is primarily due to the fact that our optimization methods tend to assign constant values within each cluster. Finally, in Fig. 8 we compare the alternative basis vectors using the GQV as performance metric. So, in Fig. 8 we report the GQV(X⋆) metric derived from the SOC and PAMAL methods, together with GQV(V) and GQV(U) obtained, respectively, from the eigenvectors of the adjacency and the Laplacian matrix. Again, the results are averaged over 100 independent realizations of scale-free graphs, vs. the average minimum degree, under the same settings of Fig. 7. Interestingly, even if our basis vectors X⋆ do not coincide with V or U, they provide the same GQV, within negligible numerical inaccuracies. Indeed, the invariance of the metric GQV(X), for any square, orthogonal matrix X, can be easily proved from the equality GQV(X) = Σ_{k=1}^N x_k^T L x_k = trace(X^T L X), by observing that trace(X^T L X) = trace(L) for any orthogonal matrix X. Interestingly, this implies that, for undirected graphs, our orthogonal matrix X⋆ can be obtained by applying an orthogonal transform to the Laplacian eigenvector basis.

Complexity issues. Clearly, looking at both SOC and PAMAL methods, complexity is a non-trivial issue which deserves further investigation, especially when the size of the graph increases. To get an idea of the computing time, in Fig. 9 we report the execution time of both SOC and PAMAL algorithms as a function of the number of vertices in the graph. The results have been obtained by running a non-compiled Matlab program, with no optimization of the parameters involved, by setting ρ^1 = β = 20. The program ran on a laptop with an Intel Core i7-4500 processor (CPU 1.8-2.4 GHz). The graphs under test were generated as geometric random graphs with an equal percentage of directed links as N increases.
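The GQV invariance claimed above follows from trace(X^T L X) = trace(L X X^T) = trace(L) for any orthogonal X. A quick numerical check (random toy Laplacian and a QR-generated orthogonal matrix, both our own choices, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(0)

# random undirected toy graph and its Laplacian L = D - A
A = rng.random((8, 8))
A = (A + A.T) / 2
np.fill_diagonal(A, 0)
L = np.diag(A.sum(axis=1)) - A

# any orthogonal X leaves GQV(X) = trace(X^T L X) unchanged
X, _ = np.linalg.qr(rng.standard_normal((8, 8)))
gqv = np.trace(X.T @ L @ X)   # equals trace(L) up to round-off
```

This is why the quadratic variation cannot discriminate between orthonormal bases, while the absolute (ℓ1-type) variation can.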

Fig. 6: Average directed variation (± the standard deviation) for SOC and PAMAL methods vs. the iteration index m for the graph of Fig. 1a, averaged over 200 random initializations of the algorithms. Fig. 8: Average GQV versus the average minimum degree according to alternative GFT definitions for undirected scale-free graphs with N = 20 nodes.

Fig. 9: Execution time vs. the number of nodes for RGGs with 25% of directed links and β = ρ^1 = 20. Fig. 7: Average absolute total variation versus the average minimum degree according to alternative GFT definitions for undirected scale-free graphs with N = 20 nodes.

Examples with real networks. As an application to real graphs, in Fig. 10 we considered the directed graph obtained from the street map of Rome, incorporating the true directions of traffic lanes in the area around Mazzini square. The graph is composed of 239 nodes. Even though the scope of this paper is to propose a method to build a GFT basis, so that we do not dig further into applications, this is an example that has interesting applications in GSP. The problem in this case is to build a map of vehicular traffic in a city, starting from a subset of measurements collected by road-side units or sent by cars equipped with ad hoc equipment. The problem can be interpreted as the reconstruction of the entire graph signal from a subset of samples, and then it builds on graph sampling theory [10]. In Fig. 11 we report some basis vectors obtained by using Algorithm 2 with ρ^1 = 10. We can observe that the basis vectors highlight clusters, while capturing the edges' directivity.

Balanced total variation. In some cases, the solution of the total variation problem in (12) can cut the graph into subsets of very different cardinality. As an extreme case, it may not be uncommon to have a subset composed of only one node and the other set containing all the rest of the network. To prevent such a behavior, Algorithm 4 aims at minimizing the balanced total variation. An example of its application to the graph of Fig. 10 is reported in Fig. 12, where we show some basis vectors computed using Algorithm 4. Comparing these vectors with the corresponding ones obtained with the PAMAL algorithm, see, e.g., Fig. 11, we can see how clusters of single nodes are now avoided.

VII. CONCLUSION

In this paper we have proposed an alternative approach to build an orthonormal basis for the Graph Fourier Transform (GFT). The approach considers the general case of a directed graph and then includes the undirected case as a particular example. The search method starts from the identification of an objective function and then looks for an orthonormal basis that minimizes that function. More specifically, motivated by the need to detect clustering behaviors in graph signals, we chose the cut size as objective function. We showed that this approach leads, without loss of optimality, to the minimization of a function that represents a directed total variation of graph signals, as it captures the edges' directivity. Interestingly, in the case of undirected graphs, this function converts into an ℓ1-norm total variation, which represents the graph (discrete) counterpart of the ℓ1-norm total variation that plays a key role in the classical Fourier Transform of continuous-time signals [17]. We compared our basis vectors with the eigenvectors of either the Laplacian or adjacency matrix, assuming as performance metric either our graph absolute variation or the graph quadratic variation. As expected, our method outperforms the other methods when using the absolute variation, as it is built by minimizing that metric. However, what was interesting to see is that our basis performs as well as the alternative bases when we assumed the graph quadratic variation as performance metric. Before concluding, we wish to point out that, as always, our alternative approach to build a GFT basis has its own merits and shortcomings when compared to alternative approaches. For example, having restricted the search to the real domain, differently from available methods, our method fails to find the complex exponentials as the GFT basis in the case of circular graphs. Furthermore, other methods, like the ones in [1], starting from the identification of the adjacency matrix as the shift operator, are more suitable than our approach to devise a filtering theory over graphs.

Fig. 10: Directed graph associated to the street map of Rome (Piazza Mazzini).

APPENDIX

A. Closed-form solution for problem Q̃^{k,n}

In this section we provide a closed-form solution for the non-convex problem Q̃^{k,n}. This problem can be equivalently written as

P^{k,n} = arg min_{P∈R^{N×N}} g_{k,n−1}(P)  s.t.  P^T P = I    (31)

where g_{k,n−1}(P) ≜ ⟨Λ^k, P − X^{k,n−1}⟩ + (ρ^k/2) ‖P − X^{k,n−1}‖_F² + (c_2^{k,n−1}/2) ‖P − P^{k,n−1}‖_F². Our proof consists of two steps: i) first, we find the stationary solutions by solving the KKT necessary conditions; ii) then, we prove that the resulting closed-form solution is a global minimum of the non-convex problem (31). The Lagrangian function L_P associated to (31) can be written as

L_P = ⟨Λ^k, P − X^{k,n−1}⟩ + (ρ^k/2) ‖P − X^{k,n−1}‖_F² + (c_2^{k,n−1}/2) ‖P − P^{k,n−1}‖_F² + ⟨Λ_1, P^T P − I⟩    (32)

where Λ_1 ∈ R^{N×N} is the multipliers' matrix associated to the orthogonality constraint. The KKT conditions then become

a) ∇_P L_P = P[(ρ^k + c_2^{k,n−1}) I + 2Λ_1] − c_2^{k,n−1} P^{k,n−1} − ρ^k X^{k,n−1} + Λ^k = 0,
b) Λ_1 ⊥ (P^T P − I) = 0    (33)

where we chose Λ_1 = Λ_1^T. Hence, defining B ≜ I + 2Λ_1/(ρ^k + c_2^{k,n−1}), from equation a) one gets

P B = F    (34)

with F ≜ ( c_2^{k,n−1} P^{k,n−1} + ρ^k X^{k,n−1} − Λ^k ) / ( ρ^k + c_2^{k,n−1} ). Let QΣT^T be the SVD decomposition of F. From (34), it turns out that

P B = QΣT^T    (35)

and, using the orthogonality condition b) in (33), it holds

B^T B = TΣ²T^T  ⇒  B = TΣT^T.    (36)

Therefore, replacing B in (35), we get

P TΣT^T = QΣT^T  ⇒  P = QT^T.    (37)

It remains to prove that P⋆ = P^{k,n} = QT^T is a global minimum for problem (31). To this end, it is sufficient to show that

g_{k,n−1}(P⋆) ≤ g_{k,n−1}(P),  ∀ P : P^T P = I    (38)

i.e., using the equalities ‖P⋆‖_F² = ‖P‖_F² = N, we have to prove that, ∀ P : P^T P = I, it results

trace(P⋆^T (Λ^k − ρ^k X^{k,n−1} − c_2^{k,n−1} P^{k,n−1})) ≤ trace(P^T (Λ^k − ρ^k X^{k,n−1} − c_2^{k,n−1} P^{k,n−1})).    (39)

Using the above definition of F, (39) reduces to

trace(P⋆^T F) ≥ trace(P^T F),  ∀ P : P^T P = I    (40)

and, since P⋆ = QT^T, the final inequality to hold true is

trace(Σ) ≥ trace(T^T P^T QΣ),  ∀ P : P^T P = I.    (41)
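The closed form P⋆ = QT^T derived above is an orthogonal-Procrustes-type solution: among orthogonal matrices, it maximizes trace(P^T F), attaining the value trace(Σ), the sum of the singular values of F. A numerical sanity check with a random F (our own toy data, not a quantity from the algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
F = rng.standard_normal((N, N))

# closed-form orthogonal maximizer of trace(P^T F):
# P* = Q T^T, where F = Q Sigma T^T is the SVD of F
Q, s, Tt = np.linalg.svd(F)   # Tt is T^T in the paper's notation
P_star = Q @ Tt

# P* is orthogonal and trace(P*^T F) = trace(Sigma) = sum of singular values
val_star = np.trace(P_star.T @ F)

# no random orthogonal candidate should beat it
worse = [np.trace(np.linalg.qr(rng.standard_normal((N, N)))[0].T @ F)
         for _ in range(200)]
```

In the algorithm, F collects the proximal and multiplier terms of the subproblem, so this single SVD replaces an iterative solver for the orthogonality-constrained step.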

[Fig. 11 (panels omitted): each of the six basis vectors is plotted on the graph with its directed variation as panel title: GDV(x_3) = 0, GDV(x_5) = 0, GDV(x_17) = 0, GDV(x_27) = 0.44, GDV(x_29) = 0.57, GDV(x_63) = 0.91.]

Fig. 11: Optimal basis vectors x_k, k = 3, 5, 17, 27, 29, 63, for Algorithm 2 and the graph in Fig. 10.

[Fig. 12 (panels omitted): each of the six basis vectors is plotted on the graph with its directed variation as panel title: GDV(x_2) = 0, GDV(x_3) = 0, GDV(x_4) = 0.026, GDV(x_5) = 0.34, GDV(x_6) = 0.4, GDV(x_7) = 0.53.]

Fig. 12: Optimal basis vectors x_k, k = 2, ..., 7, for Algorithm 4 and the graph in Fig. 10.

Define Z := T^T P^T Q, so that Z^T Z = I. Then, from (41) we get

trace(Σ) ≥ trace(Z^T Σ),  ∀ Z : Z^T Z = I.    (42)

This last inequality holds because Σ_ii > 0 and Z_ii ≤ |Z_ii| ≤ 1, ∀ i, where the latter is implied by Z^T Z = I [40]. Additionally, Z_ii = 1, ∀ i, if and only if Z = I, so that the equality in (42) holds if and only if Z = I, i.e., P⋆ = QT^T.

B. Proof of Theorem 1

For lack of space, we omit here the details of the proof, which proceeds using similar arguments as in the proof of Proposition 2.5 in [26]. However, to invoke this correspondence, we need to prove that the following properties hold true: i) the function L_k in (15) satisfies the Kurdyka-Łojasiewicz (K-Ł) property; ii) L_k is a coercive function. To prove point i), let us first introduce some definitions [52].

Definition 3: A semi-algebraic subset of R^n is a finite union of sets of the form

{x ∈ R^n : P_1(x) = 0, ..., P_k(x) = 0, Q_1(x) > 0, ..., Q_l(x) > 0}    (43)

where P_1, ..., P_k and Q_1, ..., Q_l are polynomials in n variables.

Definition 4: A function f : R^n → R is said to be semi-algebraic if its graph, defined as gph f := {(x, f(x)) | x ∈ R^n}, is a semi-algebraic set.

It is shown [cf. [42], Th. 3] that semi-algebraic functions satisfy the K-Ł property.

Definition 5: A function φ(x) satisfies the Kurdyka-Łojasiewicz (K-Ł) property at a point x̄ ∈ dom(∂φ) if there exists θ ∈ [0, 1) such that

|φ(x) − φ(x̄)|^θ / dist(0, ∂φ(x))    (44)

is bounded around x̄.

The global convergence of the PAM method established in [43] requires the objective function to satisfy the K-Ł property. Define W := (X, P) and consider the function L_k in (15), i.e.,

L_k(W) = L(X, P, Λ^k; ρ^k) = f_1(X) + f_2(P) + g_k(X, P)    (45)

where f_1(X) = GDV(X), f_2(P) = δ_{S_t}(P) and g_k(X, P) = ⟨Λ^k, P − X⟩ + (ρ^k/2) ‖P − X‖_F². Observe that f_1(X) = Σ_{i,j=1}^N a_ji max(x_i − x_j, 0) is the weighted sum of the functions f_ij(x_i, x_j) = max(x_i − x_j, 0). Since a finite sum of semi-algebraic functions is also a semi-algebraic function, it is sufficient to show that f_ij is semi-algebraic. Assume, w.l.o.g., y_ij = x_i − x_j, so that z = f_ij(y_ij) = max(y_ij, 0). The graph of f_ij becomes

gph f_ij = {(y_ij, z) : z = y_ij, y_ij ≥ 0} ∪ {(y_ij, z) : z = 0, y_ij ≤ 0}

and, according to Definition 3, it is a semi-algebraic set. Then f_1(X), as a sum of semi-algebraic functions, is also semi-algebraic. Since f_2(P) and g_k(X, P) are semi-algebraic functions, it follows that L_k(W) is also semi-algebraic. It remains to prove point ii), i.e., to assess that L_k is a coercive function, namely that L_k(W) → ∞ when ‖W‖_∞ → ∞. Clearly, the term f_2(P) is coercive. The remaining terms in (45) can be written as

f_1(X) + g_k(X, P) = GDV(X) + (ρ^k/2)⟨X, X⟩ − ρ^k⟨P, X⟩ − ⟨Λ^k, X⟩ + ⟨Λ^k, P⟩ + (ρ^k/2)‖P‖_F².

Since P ∈ S_t, it holds ‖P‖_F² = N. Thus, from the inequalities ⟨A, B⟩ ≥ −‖A‖_F ‖B‖_F and ‖B‖_F ≤ ‖B‖_1, it holds ⟨Λ^k, P⟩ ≥ −√N ‖Λ^k‖_1, so that one gets

f_1(X) + g_k(X, P) ≥ GDV(X) + (ρ^k/2)⟨X, X⟩ − ρ^k‖X‖_1 − ⟨Λ^k, X⟩ − √N ‖Λ^k‖_1 + ρ^k N/2

where we used the inequality ρ^k⟨P, X⟩ ≤ ρ^k‖X‖_1. Observe that the sequence {ρ^k}_{k∈N} is non-decreasing when γ > 1, so that ρ^k > ρ^1. Then the function f_1(X) + g_k(X, P) is coercive, GDV(X) + (ρ^k/2)⟨X, X⟩ being a positive function.

REFERENCES

[1] A. Sandryhaila and J. M. F. Moura, "Discrete signal processing on graphs," IEEE Trans. Signal Process., vol. 61, no. 7, pp. 1644–1656, Apr. 2013.
[2] S. K. Narang and A. Ortega, "Perfect reconstruction two-channel wavelet filterbanks for graph structured data," IEEE Trans. Signal Process., vol. 60, no. 6, pp. 2786–2799, 2012.
[3] ——, "Compact support biorthogonal wavelet filter banks for arbitrary undirected graphs," IEEE Trans. Signal Process., vol. 61, no. 19, pp. 4673–4685, 2013.
[4] A. Sandryhaila and J. M. F. Moura, "Discrete signal processing on graphs: Frequency analysis," IEEE Trans. Signal Process., vol. 62, no. 12, pp. 3042–3054, Jun. 2014.
[5] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, May 2013.
[6] D. K. Hammond, P. Vandergheynst, and R. Gribonval, "Wavelets on graphs via spectral graph theory," Appl. Comput. Harmon. Anal., vol. 30, pp. 129–150, 2011.
[7] S. K. Narang, G. Shen, and A. Ortega, "Unidirectional graph-based wavelet transforms for efficient data gathering in sensor networks," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Mar. 2010, pp. 2902–2905.
[8] I. Pesenson, "Sampling in Paley-Wiener spaces on combinatorial graphs," Trans. of the American Math. Society, vol. 360, no. 10, pp. 5603–5627, Oct. 2008.
[9] A. Agaskar and Y. M. Lu, "A spectral graph uncertainty principle," IEEE Trans. Inform. Theory, vol. 59, no. 7, pp. 4338–4356, Jul. 2013.
[10] M. Tsitsvero, S. Barbarossa, and P. Di Lorenzo, "Signals on graphs: Uncertainty principle and sampling," IEEE Trans. Signal Process., vol. 64, no. 18, pp. 4845–4860, Sep. 2016.
[11] M. Tsitsvero and S. Barbarossa, "On the degree of freedom of signals on graphs," in Proc. European Signal Process. Conf., Nice, Sep. 2015, pp. 1521–1525.
[12] S. Chen, R. Varma, A. Sandryhaila, and J. Kovačević, "Discrete signal processing on graphs: Sampling theory," IEEE Trans. Signal Process., vol. 63, no. 24, pp. 6510–6523, Dec. 2015.
[13] X. Zhu and M. Rabbat, "Approximating signals supported on graphs," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Mar. 2012, pp. 3921–3924.
[14] R. Singh, A. Chakraborty, and B. S. Manoj, "Graph Fourier transform based on directed Laplacian," in Proc. Int. Conf. Signal Process. and Commun. (SPCOM), Jun. 2016, pp. 1–5.
[15] M. Püschel and J. M. F. Moura, "Algebraic signal processing theory: Foundation and 1-D time," IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3572–3585, Aug. 2008.
[16] ——, "Algebraic signal processing theory: 1-D space," IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3586–3599, Aug. 2008.
[17] S. Mallat, A Wavelet Tour of Signal Processing: The Sparse Way. Academic Press, 2009.
[18] F. Lozes, A. Elmoataz, and O. Lézoray, "Partial difference operators on weighted graphs for image processing on surfaces and point clouds," IEEE Trans. Image Process., vol. 23, no. 9, pp. 3896–3909, Sep. 2014.
[19] G. H. Golub and J. H. Wilkinson, "Ill-conditioned eigensystems and the computation of the Jordan canonical form," SIAM Review, vol. 18, no. 4, pp. 578–619, Oct. 1976.
[20] B. Girault, "Signal processing on graphs - Contributions to an emerging field," Ph.D. thesis, Ecole normale supérieure de Lyon - ENS Lyon, Dec. 2015. [Online]. Available: https://tel.archives-ouvertes.fr/tel-01256044
[21] A. Gadde, A. Anis, and A. Ortega, "Active semi-supervised learning using sampling theory for graph signals," in Proc. 20th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (KDD '14), New York, NY, USA: ACM, 2014, pp. 492–501.
[22] A. Anis, A. E. Gamal, S. Avestimehr, and A. Ortega, "Asymptotic
[44] E. G. Birgin, D. Fernández, and J. M. Martínez, "On the boundedness of penalty parameters in an augmented Lagrangian method with constrained subproblems," Optimization Methods and Software, vol. 27, pp. 1001–1024, 2012.
[45] J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, Aug. 2000.
[46] M. Hein and S. Setzer, "Beyond spectral clustering - tight relaxations of balanced graph cuts," in Advances in Neural Inform. Process. Systems (NIPS), 2011, pp. 2366–2374.
[47] J. Cheeger, "A lower bound for the smallest eigenvalue of the Laplacian," in Problems in Analysis, R. C. Gunning, ed., Princeton Univ. Press, pp. 195–199, 1970.
[48] M. Hein and T. Bühler, "An inverse power method for nonlinear eigenproblems with applications in 1-spectral clustering and sparse PCA," in Advances in Neural Inform. Process.
Systems (NIPS), 2010, justification of bandlimited interpolation of graph signals for semi- pp. 847–855. supervised learning,” in Proc. IEEE Int. Conf. Acoust., Speech Signal [49] A. Szlam and X. Bresson, “Total variation and Cheeger cuts,” in Proc. Process. (ICASSP), Apr. 2015, pp. 5461–5465. 27th Int. Conf. on (ICML), 2010, pp. 1039–1046. [23] L. Lov´asz, “Submodular functions and convexity,” in A. Bachem et al. [50] S. Boyd and N. Parikh, Proximal Algorithms. Foundations and Trends (eds.) Math. Program. The State of the Art, Springer Berlin Heidelberg, in Optimization, 2013, vol. 1, no. 3. pp. 235–257, 1983. [51] R. Albert and A.-L. Barab´asi, “Statistical mechanics of complex net- [24] F. Bach, “Learning with submodular functions: A convex optimization works,” Rev. Mod. Phys, pp. 47–97, 2002. perspective,” Foundations and Trends in Machine Learning, vol. 6, no. [52] J. Bochnak, M. Coste, and M. F. Roy, Real Algebraic Geometry. 2–3, pp. 145–373, 2013. Springer-Verlag, Berlin, 1998. [25] R. Lai and S. Osher, “A splitting method for orthogonality constrained problems,” J. Scientific Computing, vol. 58, no. 2, pp. 431–449, Feb. 2014. [26] W. Chen, H. Ji, and Y. You, “An augmented Lagrangian method for l1- regularized optimization problems with orthogonality constraints,” SIAM J. Scientific Computing, vol. 38, no. 4, pp. B570–B592, 2016. [27] X. Bresson, T. Laurent, D. Uminsky, and J. H. von Brecht, “Convergence and energy landscape for Cheeger cut clustering,” in Advances in Neural Inform. Process. Systems (NIPS), 2012, pp. 1394–1402. [28] M. Newman, Networks: An Introduction. New York, NY, USA: Oxford Univ. Press, 2010. [29] L. Jost, S. Setzer, and M. Hein, “Nonlinear eigenproblems in data analysis: Balanced graph cuts and the ratioDCA-Prox,” in Extraction of Quantifiable Information from Complex Systems, Springer Intern. Publishing, vol. 102, pp. 263–279, 2014. [30] F. R. K. Chung, Spectral Graph Theory. American Math. Soc., 1997. [31] J. Nocedal and S. J. 
Wright, Numerical Optimization. Springer, 2006. [32] F. Bethuel, H. Brezis, and F. H´elein, “Asymptotics for the minimization of a Ginzburg-Landau functional,” Calculus of Variations and Partial Differential Equations, vol. 1, no. 2, pp. 123–148, 1993. [33] D. P. Bertsekas, Constraint optimization and Lagrange multiplier meth- ods. Belmont Massachusetts: Athena Scientific, 1999. [34] M. Fortin and R. Glowinski, Augmented Lagrangian Methods: Applica- tions to the Numerical Solution of Boundary-Value Problems. North Holland, 2000, vol. 15. [35] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning, 2010, vol. 3, no. 1. [36] R. Glowinski and P. Le Tallee, Augmented Lagrangian and Operator- Splitting Methods in Nonlinear Mechanics. SIAM, 1989. [37] W. Yin, S. Osher, D. Goldfarb, and J. Darbon, “Bregman iterative algorithms for l1-minimization with application to compressed sensing,” SIAM J. Imag. Sciences, vol. 1, pp. 143–168, 2008. [38] S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin, “An iterative regu- larization method for total variation-based image restoration,” Multiscale Model. Simul., vol. 4, no. 2, pp. 460–489, 2005. [39] T. Goldstein and S. Osher, “The split Bregman method for l1-regularized problems,” SIAM J. Imag. Sciences, vol. 2, no. 2, pp. 323–343, 2009. [40] J. H. Manton, “Optimization algorithms exploiting unitary constraints,” IEEE Trans. Signal Process., vol. 50, no. 3, pp. 635–650, Mar. 2002. [41] R. Andreani, E. G. Birgin, J. M. Mart´ınez, and M. L. Schuverdt, “On augmented Lagrangian methods with general lower–level constraints,” SIAM J. Optimiz., vol. 18, no. 4, pp. 1286–1309, 2007. [42] J. Bolte, S. Sabach, and M. Teboulle, “Proximal alternating linearized minimization for nonconvex and nonsmooth problems,” Math. Program., vol. 146, no. 1–2, pp. 459–494, Aug. 2014. [43] H. Attouch, J. 
Bolte, and B. F. Svaiter, “Convergence of descent methods for semi–algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods,” Math. Program., vol. 137, no. 1–2, pp. 91–129, Feb. 2013.
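As a concrete aside, the weighted sum $f_1(\mathbf{X}) = \sum_{i,j} a_{ji} \max(x_i - x_j, 0)$ used in the semi-algebraicity argument is straightforward to evaluate numerically. The NumPy sketch below is illustrative only: the function name `gdv` is ours, and the convention that `W[i, j]` stores the weight multiplying $\max(x_i - x_j, 0)$ is an assumption that must be matched to the paper's adjacency indexing.

```python
import numpy as np

def gdv(W, X):
    """Weighted directed variation: sum_{i,j} W[i, j] * max(x_i - x_j, 0),
    accumulated over every column x of X.  Illustrative sketch only; the
    pairing of W[i, j] with the ordered pair (i, j) is an assumption."""
    X = np.atleast_2d(X.T).T            # treat a 1-D signal as one column
    total = 0.0
    for x in X.T:                       # loop over graph signals (columns)
        diff = x[:, None] - x[None, :]  # diff[i, j] = x_i - x_j
        total += np.sum(W * np.maximum(diff, 0.0))
    return total

# Tiny 3-node example: only the pairs (1,2) and (2,3) carry weight.
W = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])
x = np.array([3.0, 1.0, 0.0])
# 1*max(3-1,0) + 2*max(1-0,0) = 4; a signal that increases along the
# weighted pairs contributes nothing.
print(gdv(W, x))                          # -> 4.0
print(gdv(W, np.array([0.0, 1.0, 3.0])))  # -> 0.0
```

Each term $\max(x_i - x_j, 0)$ is piecewise linear, which is exactly the two-piece polyhedral decomposition of $\mathrm{gph}\, f_{ij}$ invoked above.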
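Likewise, the remaining two terms of $\mathcal{L}_k$ in (45) are easy to make concrete: $g_k(\mathbf{X}, \mathbf{P})$ is a polynomial in the entries of $(\mathbf{X}, \mathbf{P})$, which is why it is trivially semi-algebraic, while $\delta_{St}(\mathbf{P})$ only tests membership in the Stiefel manifold ($\mathbf{P}^T \mathbf{P} = \mathbf{I}$). A hedged sketch, with function names and the numerical tolerance chosen here for illustration (not the paper's Matlab code):

```python
import numpy as np

def stiefel_indicator(P, tol=1e-8):
    """delta_St(P): 0 if P has orthonormal columns (P^T P = I), else +inf.
    The tolerance is an implementation choice for floating point."""
    k = P.shape[1]
    return 0.0 if np.allclose(P.T @ P, np.eye(k), atol=tol) else np.inf

def g_aug(X, P, Lam, rho):
    """Augmented term <Lam, P - X> + (rho/2)||P - X||_F^2 -- a polynomial
    in the entries of (X, P), hence semi-algebraic."""
    D = P - X
    return np.sum(Lam * D) + 0.5 * rho * np.sum(D * D)

# With P = X and P on the Stiefel manifold, every term vanishes.
P = np.eye(5)[:, :2]          # trivially orthonormal columns
Lam = np.ones((5, 2))
print(stiefel_indicator(P))                       # -> 0.0
print(g_aug(P, P, Lam, rho=10.0))                 # -> 0.0
# For X = 0: <Lam, P> + (rho/2)||P||_F^2 = 2 + 5*2 = 12
print(g_aug(np.zeros((5, 2)), P, Lam, rho=10.0))  # -> 12.0
```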