Noeon Research

Machine Learning Assisted Graph Algorithms

Dec 06, 2023

Management of a large-scale IT project requires establishing and following an elaborate set of rules. They range from code-style checking to project-wide workflows. For instance, we might establish thorough requirements traceability, meaning that every requirement should have both implementation and a set of tests, and vice-versa: every code change and every test should reference a requirement it addresses.

Now to automate traceability checking, all of the code, tests, documentation and requirements need to be parsed and represented as a dependency graph. The rule itself is a small graph depicting required relationships among appropriate types of objects.

After that checking, the rule comes down to finding all the instances where corresponding subgraphs match or fail to do so while they should. The same method can be employed for all kinds of analysis, from finding bugs in the code to the whole organisational network analysis.

Representing rules as graphs and using subgraph matching, in addition to rule validation, also implements knowledge extraction from a knowledge graph. After matching a rule like “a requirement should have a corresponding test”, a developer knows that the first matched thing is a requirement and the second one is a test. Despite the fact that requirements and tests are represented by complex subgraphs and have many other relations.

Unfortunately, subgraph matching is an NP-complete problem meaning it scales poorly to big graphs. Thus all practical algorithms solving NP-complete problems use clever sets of heuristics in order to speed-up the search process. Specifically for the problems of subgraph matching and graph isomorphism, researchers from the University of Salerno proposed the VF3 algorithm [1], which achieves state-of-the-art performance, especially on large and dense graphs. And yet subgraph matching on very big graphs can still take hours. This is perfectly fine for offline analyses in chemistry, pharmacology or genetics but unacceptable for interactive development.

Recently researchers started augmenting graph algorithms with Graph Neural Networks [2], which show impressive results in the tasks of node and edge properties prediction (say, travel time or max flow) but do not naturally generalise to the tasks involving several graphs like subgraph isomorphism problem [3].

One way to overcome this problem is to compute graph similarity using graph embeddings [4] because graphs with unsimilar embeddings cannot possibly be isomorphic. The same idea can be extended to subgraph matching. Dividing big graphs into small patches and computing similarity among them [5] makes it possible to discover parts of graphs that align.

Another line of research combines GNNs with Reinforcement Learning in order to speed up matching small query graphs in large graphs [6]. It is a common task for graph databases and knowledge bases. An even more ambitious approach employs topological and spectral methods in order to tackle a generalised version of subgraph matching, the graph alignment problem [7]. Unfortunately, these methods don’t always reduce computational demand, often provide only an approximate solution that can’t be turned into an exact one, and might have weird false positive corner cases.

Neural Sheaf Diffusion [8] seeks to recover the underlying structure of a graph in order to overcome GNN oversmoothing and improve performance in heterophilic settings, which opens up the possibility of employing this technique to speed up graph algorithms. But efficient application of this method to subgraph matching remains an open research and engineering problem.

Dec 06, 2023

[1] V. Carletti, P. Foggia, A. Saggese, and M. Vento, “Introducing VF3: A New Algorithm for Subgraph Isomorphism”, in Graph-Based Representations in Pattern Recognition, P. Foggia, C.-L. Liu, and M. Vento, Eds., in Lecture Notes in Computer Science, vol. 10310. Cham: Springer International Publishing, 2017, pp. 128–139.

[2] L. Wu, P. Cui, J. Pei, L. Zhao, and X. Guo, Graph Neural Networks: Foundation, Frontiers and Applications. 2022.

[3] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How Powerful are Graph Neural Networks?”, p. 17, 2019

[4] Y. Li, C. Gu, T. Dullien, O. Vinyals, and P. Kohli, Graph Matching Networks for Learning the Similarity of Graph Structured Objects. Cornell University, 2019, pp. 3835–3845.

[5] Z. Ying, A. H. -j. Wang, J. You, W. Chengtao, A. Canedo, and J. Leskovec, “Neural Subgraph Matching”. May 2021.

[6] Y. Bai, D. Xu, Y. Sun, and W. Wang, “Detecting Small Query Graphs in A Large Graph via Neural Subgraph Search”. Sep. 29, 2022.

[7] J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, and P. Karras, “GRASP: Graph Alignment Through Spectral Signatures,” in Lecture Notes in Computer Science, Springer Science+Business Media, 2021, pp. 44–52.

[8] C. Bodnar, F. Di Giovanni, B. P. Chamberlain, P. Liò, and M. M. Bronstein, “Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs”, Feb. 2022.

References

[2] L. Wu, P. Cui, J. Pei, L. Zhao, and X. Guo, Graph Neural Networks: Foundation, Frontiers and Applications. 2022.

[3] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How Powerful are Graph Neural Networks?”, p. 17, 2019

[4] Y. Li, C. Gu, T. Dullien, O. Vinyals, and P. Kohli, Graph Matching Networks for Learning the Similarity of Graph Structured Objects. Cornell University, 2019, pp. 3835–3845.

[5] Z. Ying, A. H. -j. Wang, J. You, W. Chengtao, A. Canedo, and J. Leskovec, “Neural Subgraph Matching”. May 2021.

[6] Y. Bai, D. Xu, Y. Sun, and W. Wang, “Detecting Small Query Graphs in A Large Graph via Neural Subgraph Search”. Sep. 29, 2022.

[8] C. Bodnar, F. Di Giovanni, B. P. Chamberlain, P. Liò, and M. M. Bronstein, “Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs”, Feb. 2022.

Blog

Sheaf Theory Applications and Use Cases

We’ll continue our overview of what sheaves are, how they can be useful, and their real-world applications in areas like document analysis, recommendation systems, engineering, and molecular design.

Sheaf Theory: From Deep Geometry to Deep Learning

Mathematics often provides unexpected tools that revolutionize how we think about practical problems. One of its more recent and wildly useful tools? Sheaf theory.

AI Safety

How does Noeon Research ensure its technology remains safe while improving in capability?

Goal Decomposition

How to make an AI system that can decompose complex problems on different abstraction levels for efficient reasoning?

Initial Population of Knowledge Base

What is the cheap and reliable way to incorporate both common and domain-specific knowledge into the knowledge base?

Knowledge Representation

What is the best way to architect knowledge representation that helps to utilise it in downstream tasks?

Pragmatic Communication

How the system can identify a lack of knowledge and make informationally dense request?

Noeon Research UK Ltd is a registered company in England and Wales. Registration number: 16093898. VAT registration number: 490 4632 84.
C/O Mackrell Solicitors, 60 St Martins Lane, Covent Garden, London, United Kingdom, WC2N 4JS.

(01)

(02)

(03)

(04)

(05)

(06)