Graph Matching Networks for Learning the Similarity of Graph Structured Objects

Graph Matching Networks for Learning the Similarity of Graph Structured Objects Yujia Li 1 Chenjie Gu 1 Thomas Dullien 2 Oriol Vinyals 1 Pushmeet Kohli 1 Abstract In the past few years graph neural networks (GNNs) have emerged as an effective class of models for learning rep- This paper addresses the challenging problem of resentations of structured data and for solving various su- retrieval and matching of graph structured ob- pervised prediction problems on graphs. Such models are jects, and makes two key contributions. First, we invariant to permutations of graph elements by design and demonstrate how Graph Neural Networks (GNN), compute graph node representations through a propagation which have emerged as an effective model for var- process which iteratively aggregates local structural infor- ious supervised prediction problems defined on mation (Scarselli et al., 2009; Li et al., 2015; Gilmer et al., structured data, can be trained to produce embed- 2017). These node representations are then used directly for ding of graphs in vector spaces that enables effi- node classification, or pooled into a graph vector for graph cient similarity reasoning. Second, we propose a classification. Problems beyond supervised classification or novel Graph Matching Network model that, given regression are relatively less well-studied for GNNs. a pair of graphs as input, computes a similarity In this paper we study the problem of similarity learning score between them by jointly reasoning on the for graph structured objects, which appears in many impor- pair through a new cross-graph attention-based tant real world applications, in particular similarity based matching mechanism. We demonstrate the ef- retrieval in graph databases. One motivating application fectiveness of our models on different domains is the computer security problem of binary function simi- including the challenging problem of control-flow- larity search, where given a binary which may or may not graph based function similarity search that plays contain code with known vulnerabilities, we wish to check an important role in the detection of vulnerabili- whether any control-flow-graph in this binary is sufficiently ties in software systems. The experimental analy- similar to a database of known-vulnerable functions. This sis demonstrates that our models are not only able helps identify vulnerable statically linked libraries in closed- to exploit structure in the context of similarity source software, a recurring problem (CVE, 2010; 2018) for learning but they can also outperform domain- which no good solutions are currently available. Figure1 specific baseline systems that have been carefully shows one example from this application, where the binary hand-engineered for these problems. functions are represented as control flow graphs annotated with assembly instructions. This similarity learning problem is very challenging as subtle differences can make two 1. Introduction graphs be semantically very different, while graphs with different structures can still be similar. A successful model arXiv:1904.12787v2 [cs.LG] 12 May 2019 Graphs are natural representations for encoding relational for this problem should therefore (1) exploit the graph struc- structures that are encountered in many domains. Expect- tures, and (2) be able to reason about the similarity of graphs edly, computations defined over graph structured data are both from the graph structures as well as from learned se- employed in a wide variety of fields, from the analysis of mantics. molecules for computational biology and chemistry (Gilmer et al., 2017; Yan et al., 2005), to the analysis of knowl- In order to solve the graph similarity learning problem, we edge graphs or graph structured parses for natural language investigate the use of GNNs in this context, explore how understanding. they can be used to embed graphs into a vector space, and learn this embedding model to make similar graphs close 1 2 DeepMind Google. Correspondence to: Yujia Li <yu- in the vector space, and dissimilar graphs far apart. One [email protected]>. important property of this model is that, it maps each graph Proceedings of the 36 th International Conference on Machine independently to an embedding vector, and then all the sim- Learning, Long Beach, California, PMLR 97, 2019. Copyright ilarity computation happens in the vector space. Therefore, 2019 by the author(s). Graph Matching Networks 0xf90 push R15, RSP push R14, RSP 0x138c70 mov R14D , ESI push RBP, RSP push R13, RSP push R15, RSP push R12, RSP push R14, RSP push R12, RSP beddings for similarity learning; (2) we propose the new mov R13D , EDI push RBX, RSP push RBP, RSP mov R12, RDX push RBX, RSP mov RBP, RDI mov EDI, 560 mov RDI, [R12 + 28] mov EBP, EDX xor R15D , R15D 0x138cd3 test RDI, RDI 0x1950 sub RSP, 18 jz 138e4e mov RAX, [28] push RBP, RSP mov [ RSP + 8], RAX push R15, RSP xor EAX, EAX call f70 0x138c8f Graph Matching Networks that computes similarity through push R14, RSP 0x138d02 push R13, RSP cmp RDI, [R12 + 30] jnbe 138e4e push R12, RSP push RBX, RSP movsxd R14, EAX push RAX, RSP 0xfc0 mov EBP, EDX 0x138c9a mov RBX, RAX mov R14D , ESI mov EBX, [R12] lea EAX, R13 + f mov EDX, [RSI + 10] mov R15D , EDI mov ECX, 10 mov R11D , [RSI + 18] cross-graph attention-based matching; (3) empirically we mov EDI, 560 mov R15D , EAX mov R10, [RSI] lea R8, RDI + R14 * 1 mov [ R12 + 28], 0 mov EAX, EDX call 1028 cdq EDX, EAX shr EAX, 3 and R15D , f0 mov EAX, [R10 + RAX * 1 ] idiv EDX, EAX, ECX mov ECX, EDX test RBX, RBX and CL, 7 shr EAX, CL jz 108a mov ECX, 20 cmp R8, [R12 + 20] jmp 138e4e sub ECX, EBX shl/sal EAX, CL 0x196d shr EAX, CL add EDX, EBX show that the proposed graph similarity learning models mov RBX, RAX cmp R11D , EDX 0xfdf cmovbe EDX, R11D test RBX, RBX mov [ RSI + 10], EDX jbe 138d10 jz 19ac test EAX, EAX 0x108a lea R12D , RAX + 2 jz 138d02 lea RAX, 2179f9 xor EAX, EAX test BPL, 8 mov [ RBX], RAX jz 100e 0x138cd3 0x138d02 movsxd R14, EAX mov [ R12 + 28], 0 lea R8, RDI + R14 * 1 achieve good performance across a range of applications, 0x1975 jmp 138e4e cmp R8, [R12 + 20] jbe 138d10 lea R12D , R15 + f mov EAX, R12D 0xff3 0x19ac sar EAX, 1f shr EAX, 1c mov EAX, EBP 0x138d10 and EAX, 3 xor EBX, EBX lea R13D , R15 + RAX * 1 + f 0x100e mov EAX, EDX jmp 1a44 mov [ RBX + 520 ], EAX shr EAX, 3 sar R13D , 4 0x138ce1 mov [ RBX + 520 ], 1 mov EAX, EBP movzx EAX, [R10 + RAX * 1 ] xor EBP, EBP outperforming structure agnostic models and established lea RAX, 21bcc9 sar EAX, 4 lea RDX, b18f00 mov [ RBX + 524 ], 1 cmp EDX, R11D mov ESI, 10 mov [ RBX], RAX and EAX, 3 setl BPL xor EAX, EAX add EBP, EDX test BPL, 8 mov [ RBX + 524 ], EAX mov RDI, RBP mov [ RSI + 10], EBP jmp 1022 call 67b88 jnz 19b3 and DL, 7 movzx ECX, DL bt EAX, ECX jnb 138d8a 0x1022 hand-engineered baselines. 0x19b3 0x138d10bt EBP, 13 0x138d33 0x199b jnb 1035 mov EAX, EBP mov EAX, EBP shr EAX, 3 and EAX, 3 mov EDX, [R10 + RAX * 1 ] mov [ RBX + 520 ], 1 mov ECX, EBP mov [ RBX + 520 ], EAX mov EAX, 1 0x138d8a and CL, 7 0x138cf7 shr EDX, CL mov EAX, EBP cmp RDI, R8 mov R15D , bebbb1b7 jmp 19c6 add EBP, 4 jnb 138e4e jmp 138e4e shr EAX, 4 0x1035 cmp R11D , EBP and EAX, 3 cmovbe EBP, R11D mov EAX, EBP xor R15D , R15D 0x1028 and EDX, f movshr EAX, 1f EAX, EDX mov [ RSI + 10], EBP bt EBP, 1d call f10 jz 138e3c mov [ RBX + 514 ], EAX jnb 104d 0x19c6 shr EAX, 3 0x138d5c and R12D , f0 mov EAX, EBP add R13D , 2 shr EAX, 3 movzx EAX, [R10 + RAX * 1 ] mov [ RBX + 524 ], EAX 0x102d 0x1046 mov ECX, EBP test EBP, 80000 and CL, 7 0x138ce1 shr EAX, CL mov [ RBX + 514 ], EAX 0x138d93 jnz 1a1b movzxor [ RBX + 514 ], 2 EAX, [R10 + RAX 0x138e3c * and1 ]EAX, 1 jmp 1067 lea R9, 10e02a9 xor ECX, ECX 2. Related Work xor EDX, EDX jmp 138daa cmp EBP, R11D setl CL add ECX, EBP mov [ RSI + 10], ECX mov ECX, EAX xor EBP, EBP neg ECX 0x104d 0x19dc xor EDX, ECX add EDX, EAX bt EBP, 1e jmp 138e3e mov ECX, EBP lea RDX, b18f00 jnb 105a shr ECX, 1f cmp EDX, R11D mov EAX, EBP 0x138daa shr EAX, 1c = 6= 0x1a1b and EAX, 2 movsxd RAX, [R12 + 4] 0x1053 lea RAX, RAX + RAX * 2 mov ESI, 10 lea EDX, RAX + RCX * 1 mov RDX, [R9 + RAX * 8 + 8 ] call fc0 or [ RBX + 514 ], 4 mov ECX, EBP test EBP, 40000000 setl BPL shr ECX, 3 Graph Neural Networks and Graph Representation lea ECX, RAX + RCX * 1 + 4 mov EBX, [R10 + RCX * 1 ] 0x138e3e mov ECX, EBP cmovne EDX, ECX and CL, 7 movzx ESI, DL shr EBX, CL mov RDX, R14 xor EAX, EAX mov [ RBX + 514 ], EDX mov ECX, 20 call 67248 test EBP, 10000000 0x105a sub ECX,[ R9 + RAX*8 ] add EBP, EDX shl/sal EBX, CL jz 1a26 bt EBP, 1c shr EBX, CL movsx EAX, [RDX + RBX * 4 + 2 ] jnb 1067 add EAX, EBP cmp R11D , EAX mov RDI, RBP mov [ RSI + 10],jnbe 138de4EBP Learning The history of graph neural networks (GNNs) 0x1a07 0x1060 0x138de1 0x138e49 call 67b88 0x1a20 test EBP, 40000000 or [ RBX + 514 ], 1 mov EAX, R11D add [ R12 + 28], R14 cmove ECX, EAX and DL, 7 mov [ RBX + 514 ], EAX or ECX, 1 mov [ RBX + 514 ], ECX 0x138de4 goes back to at least the early work by Gori et al.(2005) jmp 1a26 0x1067 movsx RCX, [RDX + RBX * 4 ] mov [ RSI + 10], EAX movzx mov R8 D, R12 D ECX, DL movzx EDX, [R12 + RCX * 1 + 8 ] mov ECX, R15D test EDX, EDX mov EDX, R14D jz 138e21 mov ESI, R13D 0x1a26 mov RDI, RBX call 4ab0 bt EAX, ECX 0x138df6 mov RDI, RBX and Scarselli et al.(2009), who proposed to use a propaga- mov ESI, R15D mov ECX, EAX shr ECX, 3 mov EDX, R14D movzx EBP, [R10 + RCX * 1 ] mov ECX, R12D mov ECX, EAX 0x107b and CL, 7 mov R8D, R13D jnb 138d8a shr EBP, CL call 1a56 and EBP, 1 mov [ RBX + 510 ], ffffffff xor ECX, ECX mov RAX, RBX cmp EAX, R11D setl CL jmp 108c add ECX, EAX mov [ RSI + 10], ECX tion process to learn node representations.

Graph Matching Networks for Learning the Similarity of Graph Structured Objects

Deep Similarity Learning for Sports Team Ranking

RELIEF Algorithm and Similarity Learning for K-NN

Deep Metric Learning: a Survey

Cross-Domain Visual Matching Via Generalized Similarity Measure and Feature Learning

Low-Rank Similarity Metric Learning in High Dimensions

Learning Mahalanobis Distance Metric: Considering Instance Disturbance Helps∗

Unsupervised Domain Adaptation with Similarity Learning

Algorithms for Similarity Relation Learning from High Dimensional Data Phd Dissertation

Supervised Learning with Similarity Functions

Unsupervised Graph-Based Similarity Learning Using Heterogeneous Features

Coupled Attribute Similarity Learning on Categorical Data Can Wang, Xiangjun Dong, Fei Zhou, Longbing Cao, Senior Member, IEEE, and Chi-Hung Chi

Learning Similarities for Linear Classification: Theoretical