
Every programmer relies on optimizing compilers - systems that improve your code before running it. Almost no one thinks of them as primitive reasoning engines. But look at what compilers actually do: examine a program, consider alternatives, make decisions, iteratively refine. They analyze, transform, improve, repeat - sometimes dozens of passes before generating an executable. It's reasoning; we just never call it that.
Which raises a question: what if optimization and execution are actually endpoints of a continuum? The more you optimize at compile-time, the less work remains at run-time. Push this boundary to its limit and you get systems that think rather than run.
Optimizing compilers are built on pattern-matching and rewrite rules. When a compiler encounters code matching a known pattern, it applies a transformation. Industry-grade compilers like GCC or LLVM contain thousands of such rules, organized into optimization passes: constant propagation, dead code elimination, loop transformations, instruction scheduling. Each rule is handcrafted by compiler engineers and written for specific language syntax, often tuned for particular hardware architectures.
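To make the pattern-match-then-transform loop concrete, here is a toy rewrite-rule pass over a miniature expression AST - a hypothetical sketch for illustration, not how industrial compilers like GCC or LLVM are actually structured (they operate on much richer intermediate representations with dedicated pass managers):

```python
# A toy rewrite-rule optimizer over a tiny expression AST.
# Illustrative sketch only; real compiler IRs and pass pipelines are far richer.
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

def rewrite(node):
    """Recursively apply rewrite rules bottom-up."""
    if isinstance(node, (Add, Mul)):
        node = type(node)(rewrite(node.left), rewrite(node.right))
    # Rule: constant folding -- evaluate operations on two known constants.
    if isinstance(node, Add) and isinstance(node.left, Const) and isinstance(node.right, Const):
        return Const(node.left.value + node.right.value)
    if isinstance(node, Mul) and isinstance(node.left, Const) and isinstance(node.right, Const):
        return Const(node.left.value * node.right.value)
    # Rule: algebraic identities -- x + 0 = x, x * 1 = x, x * 0 = 0.
    if isinstance(node, Add) and node.right == Const(0):
        return node.left
    if isinstance(node, Mul) and node.right == Const(1):
        return node.left
    if isinstance(node, Mul) and node.right == Const(0):
        return Const(0)
    return node

# (x * 1) + (2 + 3)  simplifies to  x + 5
optimized = rewrite(Add(Mul(Var("x"), Const(1)), Add(Const(2), Const(3))))
print(optimized)  # Add(left=Var(name='x'), right=Const(value=5))
```

Note how every rule is handwritten and tied to the concrete shape of the AST - exactly the syntax-coupling discussed below. A production compiler contains thousands of such rules, applied over many passes.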
This approach works. Decades of engineering have produced compilers that generate efficient code. However, it has fundamental limitations. Each optimization rule must be rewritten for every programming language. A rule that optimizes Python code won't work for Rust or Java without reimplementation, even when the underlying transformation is identical. Compiler engineers spend enormous effort duplicating the same conceptual optimizations across different syntaxes.
This syntax dependence runs deeper than mere inconvenience. If the compiler has a rule for optimizing one pattern but not an equivalent variant, the optimization simply won't fire. Two programs that mean the same thing mathematically may receive different optimizations because they're written differently. We can't prove general properties: to verify that a compiler never makes code slower, you must check every individual rule, one by one. And there's no principled method for discovering new optimizations; engineers must manually identify, test, and add patterns.
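A deliberately naive example of this syntactic brittleness (hypothetical, using a string-level rule purely for illustration): a strength-reduction rule that rewrites `x * 2` into a shift fires on one spelling of doubling but silently misses an equivalent one.

```python
# A naive pattern rule: rewrite "x * 2" into the cheaper shift "x << 1".
# Illustrative only -- real compilers match over IR, not source strings.
import re

def strength_reduce(src: str) -> str:
    return re.sub(r"\bx \* 2\b", "x << 1", src)

doubled_a = "x * 2"   # matches the pattern: gets optimized
doubled_b = "x + x"   # same function, different syntax: missed entirely

print(strength_reduce(doubled_a))  # x << 1
print(strength_reduce(doubled_b))  # x + x  -- unchanged
```

Both expressions denote the same function on integers, yet only the spelling the rule anticipates is improved. Scaling this up to thousands of rules across dozens of languages is precisely the duplication problem described above.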
What these limitations point to is a missing mathematical framework — one parametrized by the structural information a transformation is permitted to discard. Functional equivalence is a necessary baseline, but far richer structure can be extracted, preserved, and carried forward. We can describe individual transformations and prove they preserve specific invariants, but we lack any general account of what makes a transformation an optimization in the first place.
Compare this to other areas of programming language theory. Categorical semantics provided syntax-independent foundations for types. Linear logic gave us a structural account of resources. The Curry-Howard correspondence (proofs-as-programs) revealed that two seemingly different objects — programs and proofs — are the same underlying structure. Each transcended specific syntaxes to reveal underlying mathematical structure. Optimization has no equivalent.
We have taxonomies of optimizations, correctness proofs for specific transformations, and sophisticated engineering. What we lack is a cartography of optimization: which structures could be preserved, which may be traded away, how these choices relate to one another, and where the mathematical limits lie. A structural theory would chart this landscape in terms independent of any particular language or notation — the way categorical semantics mapped the terrain of types without committing to any syntax.
Building a structural theory of optimization is, arguably, the "final boss" of theoretical computer science. Narrower foundational problems have taken decades — and some remain unsettled.
Formalizing concurrency took sustained effort across the 1970s and 80s — producing Milner's CCS, Hoare's CSP, and the process algebra tradition — and it addressed a more constrained problem than optimization. Yet as Samson Abramsky noted, "we still don't know" what the fundamental structures of concurrency are. Linear logic (Girard, 1987) was a breakthrough, but it took years, and it addressed a single, well-scoped question about resource-sensitive reasoning.
These were hard but narrow problems. By contrast, optimization cuts across semantics, structure, and transformation simultaneously. A theory of optimization must explain not just how programs change, but why some changes count as improvements, what properties must be preserved across all languages and notations, and what it means to reach a limit. No prior foundational framework had to answer all of these at once.
Progress has been made in adjacent areas. CompCert proved one compiler preserves semantics — a major achievement, but not a general theory. Partial evaluation and the Futamura projections explored the compile-time/run-time boundary — but the structural characterization remains elusive.
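The compile-time/run-time continuum that partial evaluation explores can be shown in miniature. Below is a toy specializer (a hypothetical sketch, nothing like a real system such as those behind the Futamura projections): given a two-argument function and a statically known argument, it emits a cheaper residual program with the loop already unrolled.

```python
# Partial evaluation in miniature: specialize a general program on its
# statically known argument, producing a residual one-argument program.
# Toy sketch of the idea only -- real partial evaluators work on program text
# or IR with binding-time analysis, not eval() tricks.

def power(x, n):
    """General program: both arguments arrive at run time."""
    result = 1
    for _ in range(n):
        result *= x
    return result

def specialize_power(n):
    """'Compile time': n is known now, so unroll the loop into source code."""
    body = " * ".join(["x"] * n) if n > 0 else "1"
    src = f"lambda x: {body}"
    return eval(src)  # residual program: the loop is gone

cube = specialize_power(3)         # behaves like lambda x: x * x * x
print(cube(5), power(5, 3))        # 125 125
```

The more work `specialize_power` does up front, the less remains at run time — the continuum from the opening of this piece. The first Futamura projection pushes this idea to its extreme: specializing an interpreter on a source program yields a compiled program.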
Despite decades of research, no structural theory of optimization exists. By structural theory we mean one that brings model-theoretic tools to bear: identifies the semantic objects transformations act on, characterizes equivalence and improvement in terms of those objects, and makes it possible to reason about entire classes of transformations. Without this, engineers spend millions of hours adapting compilers to new languages and processor architectures, duplicating work that proper foundations could systematize.
A structural theory would have immediate practical value: clearer foundations for compiler correctness, better tools for reasoning about program transformations, more principled compiler construction. Optimizations could be derived systematically rather than discovered through trial and error. Entire classes of transformation could be proven to satisfy desired properties.
But there's something even larger. An optimizing compiler is a system that reasons about a program before executing it. A structural account of this process — independent of any particular language or syntax — would be a formal framework for systems that think before they act. Every system that reasons before acting — planners, optimizers, agents — would inherit this theory.
At Noeon, our work focuses on a unified knowledge representation system in which computation and knowledge are not separate concepts. Programs are knowledge; knowledge is executable. This has direct implications for optimization. When a program and its transformation are represented in the same structural space, questions about validity and equivalence become structural questions — ones that can be reasoned about directly, rather than handled separately for each language or syntax.
We aren't implementing an existing structural theory of optimization - there isn't one. And we're not interested in writing yet another optimizing compiler without any theory. We're developing both together, letting the system and the mathematics inform each other.
If you've looked at this problem and thought "someone should work on this" - we are. Join us.