Variants of the Graph Laplacian with Applications in Machine Learning

Sven Kurras

Dissertation for the degree of Doctor of Natural Sciences (Dr. rer. nat.) in the Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg

Hamburg, October 2016

This doctorate was funded by the Deutsche Forschungsgemeinschaft (German Research Foundation), Research Unit 1735 "Structural Inference in Statistics: Adaptation and Efficiency".

Supervisor: Prof. Dr. Ulrike von Luxburg
Date of the oral defense: 22 March 2017
Chair of the examination committee: Prof. Dr. Matthias Rarey
First reviewer: Prof. Dr. Ulrike von Luxburg
Second reviewer: Prof. Dr. Wolfgang Menzel

Summary

Graphs can be found in all areas of life. For example, people spend a lot of time traversing the edges of the internet graph. Further examples of graphs are social networks, public transport, molecules, financial transactions, fishing nets, family trees, and the graph in which every pair of natural numbers with the same digit sum is connected by an edge. Graphs can be represented by their adjacency matrix W. Beyond that, a multitude of alternative graph matrices exists. Many structural properties of graphs, for example acyclicity, the number of spanning trees, or random walk hitting times, are reflected in one way or another in algebraic properties of their graph matrices. This fundamental interrelation makes it possible to study graphs using all results of linear algebra, applied to graph matrices. Spectral graph theory studies graphs in particular by means of the eigenvalues and eigenvectors of their graph matrices. The Laplacian matrix L = D − W is of primary importance here, but it has many variants, for example the normalized Laplacian, the signless Laplacian and the Diplacian.

Most variants are based on a "syntactically small" change to L, such as D + W instead of D − W. Such modifications usually change the information encoded in the eigenvalues and eigenvectors completely. In this way, novel connections to graph properties can emerge.

This thesis investigates new and existing variants of Laplacian matrices. The f-adjusted Laplacian is introduced and shown to be interpretable as a specific diagonal modification of the normalized Laplacian matrix. In the context of random geometric neighborhood graphs, it is proven that this matrix describes a concrete manipulation of the underlying probability density. This intuition allows novel approaches to various machine learning problems, for example image segmentation in the case of non-uniformly sampled pixel positions.

This thesis further studies the iterated application of the normalization step $W \mapsto D^{-1/2} W D^{-1/2}$, which underlies the normalized Laplacian matrix. The outcome is new results on convergence behavior. These lead to the definition of the f-fitted Laplacian, which is suited, for example, to correcting a certain type of sampling bias.

The last chapter studies the signed Laplacian, an extension of the Laplacian matrix to graphs with both positive and negative edge weights. This thesis offers a reinterpretation of the smallest eigenvalue of the signed Laplacian, which reveals the corresponding eigenvector as the canonical candidate for spectral correlation clustering. The results of the algorithm developed in this way are state of the art. The algorithm is implemented in a comprehensive practical application whose goal is the automated identification of unfair behavior in a multiplayer online game.

Abstract

Graphs are everywhere.
For example, people spend a lot of time traversing the hyperlink edges of the web graph. Other examples of graphs are social networks, public transportation maps, molecules, financial transactions, fishing nets, family trees, and the graph that you get by connecting any two natural numbers that have the same digit sum. A graph can be represented by its adjacency matrix W, but there exist several alternative graph matrices. Many structural properties of a graph, such as the absence of cycles, the number of spanning trees, or random walk hitting times, are reflected in algebraic properties of these graph matrices. This fundamental relation makes it possible to study graph properties by applying all the machinery of linear algebra to graph matrices. Spectral graph theory focuses particularly on the eigenvalues and eigenvectors of graph matrices. In this field of research, the graph Laplacian matrix L = D − W is particularly well known, but there are numerous variants such as the normalized Laplacian, the signless Laplacian and the Diplacian. Most variants are defined by a "syntactically tiny" modification of L, for example D + W instead of D − W. Nevertheless, such modifications can drastically affect the information that is encoded in the eigenvalues and eigenvectors. This can in turn lead to new relations to graph properties.

This thesis studies novel and existing variants of Laplacian matrices. The f-adjusted Laplacian is introduced and proven to be a specific diagonal modification of the normalized Laplacian matrix. In the context of random geometric neighborhood graphs, this matrix can be understood as a specific distortion that is applied to the underlying probability density. This intuition allows for new approaches to various applications in machine learning, for example image segmentation in the case of non-uniformly sampled pixel positions.

This thesis further studies the repeated application of the normalization step $W \mapsto D^{-1/2} W D^{-1/2}$ that underlies the normalized Laplacian. It contributes a novel convergence result that leads to the f-fitted Laplacian as another strategy to remove a certain type of sampling bias.

The last chapter studies the signed Laplacian, which generalizes the Laplacian to graphs with both positive and negative edge weights. This thesis contributes a novel reinterpretation of the smallest eigenvalue of the signed Laplacian. It identifies the corresponding eigenvector as the canonical candidate for a spectral approach to correlation clustering. The suggested algorithm is shown to be competitive with the state of the art. The algorithm is implemented in an extensive real-world application that aims at automatically detecting unfair user behavior in a multiplayer online game.

Acknowledgements

First of all, I want to thank my advisor Ulrike von Luxburg for guiding my research with the right balance between clear orientation and total freedom. Thank you for introducing me to the fascinating world of machine learning by sharing your knowledge and intuition in many inspiring discussions. These three years of research in your working group were really a great time in which I learned interesting stuff at a higher rate than ever before!

Special thanks go also to my office mates Matthäus Kleindessner and Morteza Alamgir. I really enjoyed the time with you, and I am grateful for your helpful input, for the nice conference travels, for your pleasant scientific spirit, and for all the sweet pastries from Austria and Iran!

I am grateful to the German Research Foundation, which funded my research in the brilliant Research Unit 1735, Structural Inference in Statistics: Adaptation and Efficiency. I highly appreciated all our meetings, in particular the excellent spring schools!

I say "thank you for everything" to all the wonderful interim group members, workmates, students, professors, IT admins, and the administration staff, in particular to Hildegard Westermann.
A thousand thanks go further to all the people that I met at conferences, guest talks, poster sessions or colloquia, whose names I forgot or never knew, but who nevertheless contributed to great scientific and non-scientific conversations. Such a friendly and competent research community is really something that is worth paying a lot of taxes for!

Since I made the almost-mistake of starting to work in industry in parallel to finishing "just the last 10%" of this thesis, big thanks go to my employer Risk.Ident GmbH, who gave me the opportunity to focus solely on finishing this thesis during the hot final phase. Special thanks go to Marco Fisichella, who never stopped asking me about the progress of my thesis, and to the terrific data science team: you all rock!

A million thanks go to my parents, family and friends, who sustained me during the last year of living on a submarine, surfacing only every now and then.

Last but absolutely not least, my warmest thanks go to my wonderful wife-to-be Julia and our sweet lovely baby Charlotte, who were patient with my tight time-window scheduling for such a long time. Now the windows are open and our full-time life begins! I ♥ you!

Contents

List of Illustrations
List of Symbols
Chapter 1: Introduction
  1.1 The world of spectral graph theory
  1.2 Structural overview on this thesis
  1.3 Machine learning context
    1.3.1 Graphs
    1.3.2 Random graph models
    1.3.3 Random walks on graphs
    1.3.4 Linear algebra background
    1.3.5 Graph matrices
    1.3.6 Spectral clustering
  1.4 Summary of the main results and publications
Chapter 2: The f-adjusted Laplacian
  2.1 Chapter introduction
  2.2 Informal summary of the main contributions
  2.3 Formal setup
    2.3.1 Random
