Noeon Research
Initial Population of Knowledge Base
Oct 10, 2023

For the Noeon Research system to be applicable, it needs to automate the processing and structuring of organisational knowledge. It should convert project specifications, documentation, and code into a rich internal knowledge representation capable of handling both highly structured and natural language data. This representation should connect to a vast body of pre-existing common-sense and IT-specific knowledge so that the system can learn project specifics swiftly and automatically.

Every knowledge-based system needs an initial population of its knowledge base, whatever form that takes: an explicit knowledge graph or non-interpretable LLM weights. Cyc, arguably the most comprehensive knowledge graph and symbolic reasoning system, has been in development for almost 40 years, and much of that time was spent entering handcrafted facts and rules [1]. LLMs, by contrast, require only a few weeks of training distributed across a fleet of machines. But while ontologies [2] and knowledge graphs are laborious to build, LLMs are opaque and hard to interpret and reason about.

However, it is possible to automatically extract knowledge from LLMs in the form of a symbolic knowledge graph using symbolic knowledge distillation [3] techniques. Applying the same approach to current multi-modal LLMs has the potential to also cover less common-sense knowledge from diverse areas such as physics, chemistry and IT, including highly structured knowledge in the form of formulae, code and diagrams.

This approach, however, is very new, and its applicability to wider contexts remains speculative. Moreover, it requires an extensive dataset, written in restricted natural language and expressing the relevant knowledge, to coax the LLM into generating further pieces of knowledge in the same domain.
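As a rough illustration of the generate-then-filter loop behind this kind of distillation, the sketch below builds a few-shot prompt from seed triples, asks a generator for completions, and keeps only candidates that a critic scores above a threshold. Everything here is hypothetical: the `Triple` type, the `head | relation | tail` prompt format, and the `fake_generate` and `fake_critic` stand-ins, which take the place of a real LLM call and a trained critic model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Triple:
    """One edge of the knowledge graph (hypothetical minimal schema)."""
    head: str
    relation: str
    tail: str


def build_prompt(seeds: list[Triple], head: str, relation: str) -> str:
    """Render seed triples as restricted-language lines, then an open slot."""
    lines = [f"{s.head} | {s.relation} | {s.tail}" for s in seeds]
    lines.append(f"{head} | {relation} |")
    return "\n".join(lines)


def distill(
    seeds: list[Triple],
    queries: Iterable[tuple[str, str]],
    generate: Callable[[str], list[str]],
    critic: Callable[[Triple], float],
    threshold: float = 0.5,
) -> set[Triple]:
    """Grow a set of triples from seeds by generating and filtering candidates."""
    graph = set(seeds)
    for head, relation in queries:
        prompt = build_prompt(seeds, head, relation)
        for completion in generate(prompt):
            tail = completion.strip()
            if not tail:
                continue  # skip empty generations
            candidate = Triple(head, relation, tail)
            if critic(candidate) >= threshold:
                graph.add(candidate)
    return graph


# Stand-ins so the sketch runs without a model (assumptions, not real APIs).
def fake_generate(prompt: str) -> list[str]:
    last_line = prompt.splitlines()[-1]
    if last_line.startswith("git commit"):
        return [" a repository", " a sandwich", ""]
    return []


def fake_critic(triple: Triple) -> float:
    return 0.0 if "sandwich" in triple.tail else 0.9


seeds = [
    Triple("pip install", "requires", "a Python interpreter"),
    Triple("docker build", "requires", "a Dockerfile"),
]
graph = distill(seeds, [("git commit", "requires")], fake_generate, fake_critic)
```

In a real setting the seed triples would come from the restricted natural language dataset mentioned above, and the quality of the resulting graph would depend heavily on how well the critic separates plausible from implausible candidates.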

Another promising direction is to fine-tune pre-trained LLMs on a structured corpus so that they produce interpretable causal explanations [4]. These explanations follow a predetermined structure in a restricted fragment of natural language and can be easily parsed into a symbolic form [5]. This approach also requires a relevant seed dataset for fine-tuning. Moreover, the limited context window demands careful prompt engineering to present the LLM with all relevant information and structure and to elicit correct generation.
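To make the parsing step concrete, here is a minimal sketch of turning explanations written in a restricted fragment of natural language into symbolic cause/effect pairs. The two sentence templates ("X because Y." and "If Y, then X.") and the `CausalFact` type are assumptions chosen for illustration; the actual fragment used in [4] and [5] may look quite different.

```python
import re
from typing import NamedTuple, Optional


class CausalFact(NamedTuple):
    """Symbolic form of one explanation (hypothetical schema)."""
    cause: str
    effect: str


# Each pattern encodes one allowed sentence template of the restricted fragment.
PATTERNS = [
    re.compile(r"^(?P<effect>.+?) because (?P<cause>.+?)\.?$", re.IGNORECASE),
    re.compile(r"^if (?P<cause>.+?),? then (?P<effect>.+?)\.?$", re.IGNORECASE),
]


def parse_explanation(sentence: str) -> Optional[CausalFact]:
    """Return a CausalFact if the sentence fits a known template, else None."""
    text = sentence.strip()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return CausalFact(match["cause"].strip(), match["effect"].strip())
    return None  # outside the restricted fragment; reject rather than guess


fact = parse_explanation("The build fails because the dependency is missing.")
```

Returning `None` for anything outside the templates is a deliberate choice in this sketch: a generation that does not fit the predetermined structure is easier to discard or regenerate than to repair.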

© 2025 Noeon Research. All rights reserved.