Machine Learning Assisted Graph Algorithms

Management of a large-scale IT project requires establishing and following an elaborate set of rules. They range from code-style checking to project-wide workflows. For instance, we might establish thorough requirements traceability, meaning that every requirement should have both implementation and a set of tests, and vice-versa: every code change and every test should reference a requirement it addresses.

Now to automate traceability checking, all of the code, tests, documentation and requirements need to be parsed and represented as a dependency graph. The rule itself is a small graph depicting required relationships among appropriate types of objects.

After that checking, the rule comes down to finding all the instances where corresponding subgraphs match or fail to do so while they should. The same method can be employed for all kinds of analysis, from finding bugs in the code to the whole organisational network analysis.

Representing rules as graphs and using subgraph matching, in addition to rule validation, also implements knowledge extraction from a knowledge graph. After matching a rule like “a requirement should have a corresponding test”, a developer knows that the first matched thing is a requirement and the second one is a test. Despite the fact that requirements and tests are represented by complex subgraphs and have many other relations.

Unfortunately, subgraph matching is an NP-complete problem meaning it scales poorly to big graphs. Thus all practical algorithms solving NP-complete problems use clever sets of heuristics in order to speed-up the search process. Specifically for the problems of subgraph matching and graph isomorphism, researchers from the University of Salerno proposed the VF3 algorithm [1], which achieves state-of-the-art performance, especially on large and dense graphs. And yet subgraph matching on very big graphs can still take hours. This is perfectly fine for offline analyses in chemistry, pharmacology or genetics but unacceptable for interactive development.

Recently researchers started augmenting graph algorithms with Graph Neural Networks [2], which show impressive results in the tasks of node and edge properties prediction (say, travel time or max flow) but do not naturally generalise to the tasks involving several graphs like subgraph isomorphism problem [3].

One way to overcome this problem is to compute graph similarity using graph embeddings [4] because graphs with unsimilar embeddings cannot possibly be isomorphic. The same idea can be extended to subgraph matching. Dividing big graphs into small patches and computing similarity among them [5] makes it possible to discover parts of graphs that align.

Another line of research combines GNNs with Reinforcement Learning in order to speed up matching small query graphs in large graphs [6]. It is a common task for graph databases and knowledge bases. An even more ambitious approach employs topological and spectral methods in order to tackle a generalised version of subgraph matching, the graph alignment problem [7]. Unfortunately, these methods don’t always reduce computational demand, often provide only an approximate solution that can’t be turned into an exact one, and might have weird false positive corner cases.

Neural Sheaf Diffusion [8] seeks to recover the underlying structure of a graph in order to overcome GNN oversmoothing and improve performance in heterophilic settings, which opens up the possibility of employing this technique to speed up graph algorithms. But efficient application of this method to subgraph matching remains an open research and engineering problem.

[1] V. Carletti, P. Foggia, A. Saggese, and M. Vento, “Introducing VF3: A New Algorithm for Subgraph Isomorphism”, in Graph-Based Representations in Pattern Recognition, P. Foggia, C.-L. Liu, and M. Vento, Eds., in Lecture Notes in Computer Science, vol. 10310. Cham: Springer International Publishing, 2017, pp. 128–139.

[2] L. Wu, P. Cui, J. Pei, L. Zhao, and X. Guo, Graph Neural Networks: Foundation, Frontiers and Applications. 2022.

[3] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How Powerful are Graph Neural Networks?”, p. 17, 2019

[4] Y. Li, C. Gu, T. Dullien, O. Vinyals, and P. Kohli, Graph Matching Networks for Learning the Similarity of Graph Structured Objects. Cornell University, 2019, pp. 3835–3845.

[5] Z. Ying, A. H. -j. Wang, J. You, W. Chengtao, A. Canedo, and J. Leskovec, “Neural Subgraph Matching”. May 2021.

[6] Y. Bai, D. Xu, Y. Sun, and W. Wang, “Detecting Small Query Graphs in A Large Graph via Neural Subgraph Search”. Sep. 29, 2022.

[7] J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, and P. Karras, “GRASP: Graph Alignment Through Spectral Signatures,” in Lecture Notes in Computer Science, Springer Science+Business Media, 2021, pp. 44–52.

[8] C. Bodnar, F. Di Giovanni, B. P. Chamberlain, P. Liò, and M. M. Bronstein, “Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs”, Feb. 2022.

References

[1] V. Carletti, P. Foggia, A. Saggese, and M. Vento, “Introducing VF3: A New Algorithm for Subgraph Isomorphism”, in Graph-Based Representations in Pattern Recognition, P. Foggia, C.-L. Liu, and M. Vento, Eds., in Lecture Notes in Computer Science, vol. 10310. Cham: Springer International Publishing, 2017, pp. 128–139.

[2] L. Wu, P. Cui, J. Pei, L. Zhao, and X. Guo, Graph Neural Networks: Foundation, Frontiers and Applications. 2022.

[3] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How Powerful are Graph Neural Networks?”, p. 17, 2019

[4] Y. Li, C. Gu, T. Dullien, O. Vinyals, and P. Kohli, Graph Matching Networks for Learning the Similarity of Graph Structured Objects. Cornell University, 2019, pp. 3835–3845.

[5] Z. Ying, A. H. -j. Wang, J. You, W. Chengtao, A. Canedo, and J. Leskovec, “Neural Subgraph Matching”. May 2021.

[6] Y. Bai, D. Xu, Y. Sun, and W. Wang, “Detecting Small Query Graphs in A Large Graph via Neural Subgraph Search”. Sep. 29, 2022.

[7] J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, and P. Karras, “GRASP: Graph Alignment Through Spectral Signatures,” in Lecture Notes in Computer Science, Springer Science+Business Media, 2021, pp. 44–52.

[8] C. Bodnar, F. Di Giovanni, B. P. Chamberlain, P. Liò, and M. M. Bronstein, “Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs”, Feb. 2022.

Blog

© 2024 Noeon Research. All rights reserved.

Midtown Tower 18F, 9-7-1 Akasaka, Minato-ku, Tokyo, Japan

Midtown Tower 18F, 9-7-1 Akasaka, Minato-ku, Tokyo, Japan