Noeon Research

Initial Population of Knowledge Base

Oct 10, 2023

For the Noeon Research system to be applicable, it needs to automate the processing and structuring of organisational knowledge. It should convert project specifications, documentation, and code into a rich internal knowledge representation capable of handling both highly structured and natural language data. This representation should connect to a large body of pre-existing common-sense and IT-specific knowledge so that the system can learn project specifics quickly and automatically.

Every knowledge-based system needs an initial population of its knowledge base, whatever form that base takes: an explicit knowledge graph or non-interpretable LLM weights. Cyc – arguably the largest and most comprehensive knowledge graph and symbolic reasoning system – has been in development for almost 40 years, and much of that time was spent entering handcrafted facts and rules [1]. LLMs, by contrast, require only a few weeks of training distributed across a fleet of machines. The trade-off is that ontologies [2] and knowledge graphs are laborious to build, while LLMs are opaque and hard to interpret or reason about.

However, it is possible to automatically extract knowledge from LLMs in the form of a symbolic knowledge graph using symbolic knowledge distillation [3] techniques. Applying the same approach to current multi-modal LLMs could extend coverage beyond common sense to specialised knowledge from diverse areas such as physics, chemistry and IT, including highly structured knowledge in the form of formulae, code and diagrams.

This approach, however, is very new, and its applicability to wider contexts remains speculative. Moreover, it needs an extensive seed dataset that expresses the relevant knowledge in restricted natural language, so that the LLM can be coaxed into generating further pieces of knowledge in the same domain.
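The loop described above can be sketched roughly as follows: seed statements are turned into a few-shot prompt, a generator proposes completions, and a critic filters them before they enter the graph. This is a minimal sketch, not the method of [3] in detail. The names `Triple`, `build_prompt`, `distill`, `fake_generate` and `fake_critic` are all hypothetical, the pipe-separated prompt format is an assumption, and the generator and critic are stubs standing in for an LLM call and a trained filter.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Triple:
    """One symbolic fact: head --relation--> tail (hypothetical schema)."""
    head: str
    relation: str
    tail: str


def build_prompt(seeds: List[Triple], head: str, relation: str) -> str:
    """Format seed facts as few-shot examples, ending with an open query line."""
    lines = [f"{t.head} | {t.relation} | {t.tail}" for t in seeds]
    lines.append(f"{head} | {relation} |")
    return "\n".join(lines)


def distill(
    seeds: List[Triple],
    queries: Iterable[Tuple[str, str]],
    generate: Callable[[str], List[str]],
    critic: Callable[[Triple], float],
    threshold: float = 0.5,
) -> Set[Triple]:
    """Grow a set of triples from seeds, keeping only critic-approved candidates."""
    graph = set(seeds)
    for head, relation in queries:
        prompt = build_prompt(seeds, head, relation)
        for completion in generate(prompt):
            tail = completion.strip()
            if not tail:
                continue  # skip empty generations
            candidate = Triple(head, relation, tail)
            if critic(candidate) >= threshold:
                graph.add(candidate)
    return graph


# --- Stubs for illustration only (assumptions, not real model calls) ---
def fake_generate(prompt: str) -> List[str]:
    """Stand-in for an LLM: returns canned completions for the final prompt line."""
    canned = {"git commit | requires |": ["staged changes", "  ", "a unicorn"]}
    return canned.get(prompt.splitlines()[-1], [])


def fake_critic(triple: Triple) -> float:
    """Stand-in for a trained critic: rejects one obviously wrong tail."""
    return 0.0 if "unicorn" in triple.tail else 0.9


seeds = [
    Triple("git push", "requires", "a remote repository"),
    Triple("unit test", "is used for", "checking a function"),
]
graph = distill(seeds, [("git commit", "requires")], fake_generate, fake_critic)
```

In this sketch the seed list plays the role of the restricted natural language dataset: its size and quality bound what the generator can be prompted to produce, and the critic threshold trades the size of the resulting graph against its precision.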

Another promising direction is to fine-tune pre-trained LLMs on a structured corpus so that they produce interpretable causal explanations [4]. These explanations follow a predetermined structure in a restricted fragment of natural language and can be easily parsed into a symbolic form [5]. This approach also requires a relevant seed dataset for fine-tuning. Moreover, the limited context window demands careful prompt engineering to present the LLM with all the relevant information and structure needed to elicit a correct generation.
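To illustrate why a predetermined structure makes parsing straightforward, here is a minimal sketch that turns one restricted-language causal statement into a symbolic rule. The `antecedent >Connective> consequent` layout and the `Someone_A` style variables are loosely modelled on the semi-structured rules of [4], but the exact grammar, the connective list, and the names `Rule` and `parse_rule` are assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Assumed set of allowed connectives for this sketch.
CONNECTIVES = {"causes", "enables", "motivates", "results in"}

# antecedent >Connective> consequent
_PATTERN = re.compile(r"^(?P<lhs>.+?)\s*>(?P<conn>[^>]+)>\s*(?P<rhs>.+)$")
# Generalised variables such as Someone_A or Something_B.
_VARIABLE = re.compile(r"\b[A-Z][a-z]+_[A-Z]\b")


@dataclass(frozen=True)
class Rule:
    """Symbolic form of one causal explanation (hypothetical schema)."""
    antecedent: str
    connective: str
    consequent: str
    variables: Tuple[str, ...]


def parse_rule(text: str) -> Rule:
    """Parse a restricted-language statement; raise ValueError if it does not fit."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not in the expected form: {text!r}")
    connective = match.group("conn").strip().lower()
    if connective not in CONNECTIVES:
        raise ValueError(f"unknown connective: {connective!r}")
    lhs = match.group("lhs").strip()
    rhs = match.group("rhs").strip()
    variables = tuple(sorted(set(_VARIABLE.findall(f"{lhs} {rhs}"))))
    return Rule(lhs, connective, rhs, variables)


rule = parse_rule("Someone_A merges Something_A >Causes> Something_A is deployed")
```

Because the parser rejects anything outside the agreed fragment, malformed generations surface as errors instead of silently entering the knowledge base.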


References

[1] W. Knight, “An AI that spent 30 years learning some common sense is ready for work,” MIT Technology Review, 2016.

[2] S. Borgo and P. Hitzler, “Some Open Issues After Twenty Years of Formal Ontology,” 2018, pp. 1–9.

[3] P. West et al., “Symbolic Knowledge Distillation: from General Language Models to Commonsense Models,” 2021.

[4] N. Mostafazadeh et al., “GLUCOSE: GeneraLized and COntextualized Story Explanations,” 2020.

[5] A. Kalyanpur, T. Breloff, and D. A. Ferrucci, “Braid: Weaving Symbolic and Neural Knowledge into Coherent Logical Explanations,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 10, pp. 10867–10874, Jun. 2022.


Noeon Research UK Ltd is a registered company in England and Wales. Registration number: 16093898. VAT registration number: 490 4632 84.
C/O Mackrell Solicitors, 60 St Martins Lane, Covent Garden, London, United Kingdom, WC2N 4JS.
