Knowledge Representation

Efficient completion of an IT project, automated or not, requires good Knowledge Representation. Noeon Research architects its Knowledge Representation to handle different notions at different levels of abstraction. For example, structured entities like code, configuration, and data models are treated at a different level of abstraction than facts and their negations, such as requirements, architectural decisions, and constraints; these, in turn, must be treated at a different level than causal relations, for instance, how workload affects performance and memory consumption. At a yet higher level sit the relations between the project's objectives and the objectives of adjacent projects, and so on.
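
As a rough sketch of what such layering could look like (the class names and fields below are purely illustrative, not Noeon Research's actual schema), the levels can be modelled as distinct kinds of entries in a single knowledge store:

```python
from dataclasses import dataclass

# Purely illustrative sketch of knowledge at four levels of abstraction.
# Class names and fields are hypothetical, not an actual Noeon Research schema.

@dataclass
class StructuredEntity:    # level 1: code, configuration, data models
    name: str
    kind: str              # e.g. "module", "config", "schema"
    content: str

@dataclass
class Fact:                # level 2: requirements, decisions, constraints
    statement: str
    negated: bool = False  # facts and their negations

@dataclass
class CausalRelation:      # level 3: e.g. how workload affects memory consumption
    cause: str
    effect: str
    direction: str         # "increases" or "decreases"

@dataclass
class ObjectiveLink:       # level 4: this project's objectives vs. adjacent projects
    objective: str
    related_project: str
    relation: str          # e.g. "depends_on", "conflicts_with"
```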

LLMs represent knowledge as learned weights in vast neural networks that approximate a conditional probability distribution. This representation proves remarkably versatile. For instance, ChatGPT natively answers programming questions, producing code in various programming languages. Surprisingly, ChatGPT often performs on par with, or even better than, fine-tuned systems like Copilot or CodeWhisperer [1], presumably due to a larger context window and cross-domain knowledge transfer. However, LLM-based systems struggle with hallucinations and made-up facts [2], inventing non-existent functions and APIs.
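
In slightly more formal terms, an autoregressive LLM with learned weights θ approximates the distribution of the next token given the preceding context, and generates prose or code by sampling from it token by token:

```latex
p_\theta(x_t \mid x_1, \dots, x_{t-1})
```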

For an enterprise system to be trustworthy, it must return correct answers wherever correct answers exist, and it must be possible to check that correctness. To achieve this, we need much more structured representations.

To overcome the opaqueness of the weights and biases of neural network layers, researchers apply symbolic distillation [3] to recover the structure of an LLM's knowledge in the form of a Knowledge Graph. Being symbolic, structured, and explicit, Knowledge Graphs support direct reasoning about causal relationships and fact-checking. However, it is unclear whether this technique can be effectively transferred from common-sense general knowledge to domain-specific knowledge like programming.
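
As a toy illustration of the difference (the triples below are invented for the example, and real symbolic distillation pipelines are considerably more involved), explicit triples can be queried and checked directly, unlike knowledge hidden in network weights:

```python
# Facts stored as explicit (subject, predicate, object) triples can be queried
# and verified directly. The triples here are invented for this example.

triples = {
    ("requests.get", "returns", "requests.Response"),
    ("requests.Response", "has_attribute", "status_code"),
    ("high_workload", "increases", "memory_consumption"),
}

def is_supported(subject: str, predicate: str, obj: str) -> bool:
    """Check whether a claim is backed by an explicit triple in the graph."""
    return (subject, predicate, obj) in triples

# A claim produced by an LLM can now be verified instead of trusted blindly.
print(is_supported("requests.Response", "has_attribute", "status_code"))  # True
print(is_supported("requests.Response", "has_attribute", "body_text"))    # False
```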

In particular, if the process of compressing a corpus into an internal LLM representation loses information about relationships between domain objects, we will not be able to recover that information in the Knowledge Graph. This matters little in common-sense reasoning, where progress is measured in percentage points of accuracy, but it matters greatly in software engineering, where a single incorrect API call can crash the program.
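
A minimal example of why this is unforgiving: a plausible-looking but non-existent function is a hard failure, not a small drop in accuracy.

```python
import json

data = json.loads('{"ok": true}')   # real API: parses the document
data = json.parse('{"ok": true}')   # hallucinated API: raises AttributeError at runtime
```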

It is tempting to think that recovering a symbolic representation of knowledge entirely obviates the need for an LLM. However, instructions are usually given in natural language, which must be translated into a graph query language before it can interact with the symbolic knowledge base. LLMs are the best-known tool for this kind of translation.
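
A minimal sketch of that translation step, assuming a hypothetical `ask_llm` helper in place of a concrete LLM API and a property-graph schema invented for the example:

```python
# Hypothetical sketch: translate a natural-language instruction into a Cypher
# query over a knowledge graph. `ask_llm` stands in for any LLM completion API;
# the schema and the returned query are invented for illustration.

PROMPT_TEMPLATE = """Translate the instruction into a Cypher query over a graph
with (:Service)-[:DEPENDS_ON]->(:Service) edges.

Instruction: {instruction}
Cypher:"""

def ask_llm(prompt: str) -> str:
    # Placeholder for a real LLM call.
    return "MATCH (s:Service)-[:DEPENDS_ON]->(d:Service {name: 'auth'}) RETURN s.name"

def to_graph_query(instruction: str) -> str:
    return ask_llm(PROMPT_TEMPLATE.format(instruction=instruction))

print(to_graph_query("Which services depend on the auth service?"))
```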

Ontologies [4] are the best-known form of structured Knowledge Representation. However, there is no universal philosophical and methodological approach to ontology construction, which results in a multitude of mutually incompatible domain-specific ontologies [5]. Moreover, because most ontologies are based on Description Logic [6], they are not suited to representing procedural (algorithmic) knowledge, which severely limits their usefulness for automatic code transformation.
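
As an illustration of the mismatch (the axiom below is an invented example), a Description Logic TBox axiom naturally captures terminological facts, for instance that every microservice exposes at least one endpoint:

```latex
\mathit{Microservice} \sqsubseteq \exists \mathit{exposes}.\mathit{Endpoint}
```

But it offers no natural way to express step-by-step procedures such as a retry loop or a migration script, which is exactly the kind of knowledge automatic code transformation needs.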

Aug 18, 2023

References

[1] B. Yetiştiren, I. Özsoy, M. Ayerdem, and E. Tüzün, “Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT”, Apr. 21, 2023.
[2] Z. Ji et al., “Survey of Hallucination in Natural Language Generation”, ACM Computing Surveys, vol. 55, no. 12, pp. 1–38, Mar. 2023.
[3] P. West et al., “Symbolic Knowledge Distillation: from General Language Models to Commonsense Models”, Nov. 28, 2022.
[4] S. Staab and R. Studer, “What Is an Ontology?”, in Handbook on Ontologies, Springer Science & Business Media, 2013.
[5] S. Borgo and P. Hitzler, “Some Open Issues After Twenty Years of Formal Ontology”, 2018, pp. 1–9.
[6] L. F. Sikos, “Description Logics: Formal Foundation for Web Ontology Engineering”, in Description Logics in Multimedia Reasoning, 2017.