
Every programmer relies on optimizing compilers - systems that improve your code before running it. Almost no one thinks of them as primitive reasoning engines. But look at what compilers actually do: examine a program, consider alternatives, make decisions, iteratively refine. They analyze, transform, improve, repeat - sometimes dozens of passes before generating an executable. It's reasoning; we just never call it that.
Which raises a question: what if optimization and execution are actually endpoints of a continuum? The more you optimize at compile-time, the less work remains at run-time. Push this boundary to its limit and you get systems that think rather than run.
Optimizing compilers are built on pattern-matching and rewrite rules. When a compiler encounters code matching a known pattern, it applies a transformation. Industry-grade compilers like GCC or LLVM contain thousands of such rules, organized into optimization passes: constant propagation, dead code elimination, loop transformations, instruction scheduling. Each rule is handcrafted by compiler engineers and written for specific language syntax, often tuned for particular hardware architectures.
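To make the pattern-match-then-transform loop concrete, here is a toy rewrite-rule pass over a miniature expression AST - a hypothetical sketch for illustration, not how industrial compilers like GCC or LLVM are actually structured (they operate on much richer intermediate representations with dedicated pass managers):

```python
# A toy rewrite-rule optimizer over a tiny expression AST.
# Illustrative sketch only; real compiler IRs and pass pipelines are far richer.
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

def rewrite(node):
    """Recursively apply rewrite rules bottom-up."""
    if isinstance(node, (Add, Mul)):
        node = type(node)(rewrite(node.left), rewrite(node.right))
    # Rule: constant folding -- evaluate operations on two known constants.
    if isinstance(node, Add) and isinstance(node.left, Const) and isinstance(node.right, Const):
        return Const(node.left.value + node.right.value)
    if isinstance(node, Mul) and isinstance(node.left, Const) and isinstance(node.right, Const):
        return Const(node.left.value * node.right.value)
    # Rule: algebraic identities -- x + 0 = x, x * 1 = x, x * 0 = 0.
    if isinstance(node, Add) and node.right == Const(0):
        return node.left
    if isinstance(node, Mul) and node.right == Const(1):
        return node.left
    if isinstance(node, Mul) and node.right == Const(0):
        return Const(0)
    return node

# (x * 1) + (2 + 3)  simplifies to  x + 5
optimized = rewrite(Add(Mul(Var("x"), Const(1)), Add(Const(2), Const(3))))
print(optimized)  # Add(left=Var(name='x'), right=Const(value=5))
```

Note how every rule is handwritten and tied to the concrete shape of the AST - exactly the syntax-coupling discussed below. A production compiler contains thousands of such rules, applied over many passes.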
This approach works. Decades of engineering have produced compilers that generate efficient code. However, it has fundamental limitations. Each optimization rule must be rewritten for every programming language. A rule that optimizes Python code won't work for Rust or Java without reimplementation, even when the underlying transformation is identical. Compiler engineers spend enormous effort duplicating the same conceptual optimizations across different syntaxes.
This syntax dependence runs deeper than mere inconvenience. If the compiler has a rule for optimizing one pattern but not an equivalent variant, the optimization simply won't fire. Two programs that mean the same thing mathematically may receive different optimizations because they're written differently. We can't prove general properties: to verify that a compiler never makes code slower, you must check every individual rule, one by one. And there's no principled method for discovering new optimizations; engineers must manually identify, test, and add patterns.
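A deliberately naive example of this syntactic brittleness (hypothetical, using a string-level rule purely for illustration): a strength-reduction rule that rewrites `x * 2` into a shift fires on one spelling of doubling but silently misses an equivalent one.

```python
# A naive pattern rule: rewrite "x * 2" into the cheaper shift "x << 1".
# Illustrative only -- real compilers match over IR, not source strings.
import re

def strength_reduce(src: str) -> str:
    return re.sub(r"\bx \* 2\b", "x << 1", src)

doubled_a = "x * 2"   # matches the pattern: gets optimized
doubled_b = "x + x"   # same function, different syntax: missed entirely

print(strength_reduce(doubled_a))  # x << 1
print(strength_reduce(doubled_b))  # x + x  -- unchanged
```

Both expressions denote the same function on integers, yet only the spelling the rule anticipates is improved. Scaling this up to thousands of rules across dozens of languages is precisely the duplication problem described above.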
What these limitations point to is a missing mathematical framework — one parametrized by the structural information a transformation is permitted to discard. Functional equivalence is a necessary baseline, but far richer structure can be extracted, preserved, and carried forward. We can describe individual transformations and prove they preserve specific invariants, but we lack any general account of what makes a transformation an optimization in the first place.
Compare this to other areas of programming language theory. Categorical semantics provided syntax-independent foundations for types. Linear logic gave us a structural account of resources. The Curry-Howard correspondence (proofs-as-programs) revealed that two seemingly different objects — programs and proofs — are the same underlying structure. Each transcended specific syntaxes to reveal underlying mathematical structure. Optimization has no equivalent.
We have taxonomies of optimizations, correctness proofs for specific transformations, and sophisticated engineering. What we lack is a cartography of optimization: which structures could be preserved, which may be traded away, how these choices relate to one another, and where the mathematical limits lie. A structural theory would chart this landscape in terms independent of any particular language or notation — the way categorical semantics mapped the terrain of types without committing to any syntax.
Building a structural theory of optimization is, arguably, the "final boss" of theoretical computer science. Narrower foundational problems have taken decades — and some remain unsettled.
Formalizing concurrency took sustained effort across the 1970s and 80s — producing Milner's CCS, Hoare's CSP, and the process algebra tradition — and it addressed a more constrained problem than optimization. Yet as Samson Abramsky noted, "we still don't know" what the fundamental structures of concurrency are. Linear logic (Girard, 1987) was a breakthrough, but it took years, and it addressed a single, well-scoped question about resource-sensitive reasoning.
These were hard but narrow problems. By contrast, optimization cuts across semantics, structure, and transformation simultaneously. A theory of optimization must explain not just how programs change, but why some changes count as improvements, what properties must be preserved across all languages and notations, and what it means to reach a limit. No prior foundational framework had to answer all of these at once.
Progress has been made in adjacent areas. CompCert proved one compiler preserves semantics — a major achievement, but not a general theory. Partial evaluation and the Futamura projections explored the compile-time/run-time boundary — but the structural characterization remains elusive.
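The compile-time/run-time continuum that partial evaluation explores can be shown in miniature. Below is a toy specializer (a hypothetical sketch, nothing like a real system such as those behind the Futamura projections): given a two-argument function and a statically known argument, it emits a cheaper residual program with the loop already unrolled.

```python
# Partial evaluation in miniature: specialize a general program on its
# statically known argument, producing a residual one-argument program.
# Toy sketch of the idea only -- real partial evaluators work on program text
# or IR with binding-time analysis, not eval() tricks.

def power(x, n):
    """General program: both arguments arrive at run time."""
    result = 1
    for _ in range(n):
        result *= x
    return result

def specialize_power(n):
    """'Compile time': n is known now, so unroll the loop into source code."""
    body = " * ".join(["x"] * n) if n > 0 else "1"
    src = f"lambda x: {body}"
    return eval(src)  # residual program: the loop is gone

cube = specialize_power(3)         # behaves like lambda x: x * x * x
print(cube(5), power(5, 3))        # 125 125
```

The more work `specialize_power` does up front, the less remains at run time — the continuum from the opening of this piece. The first Futamura projection pushes this idea to its extreme: specializing an interpreter on a source program yields a compiled program.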
Despite decades of research, no structural theory of optimization exists. By structural theory we mean one that brings model-theoretic tools to bear: identifies the semantic objects transformations act on, characterizes equivalence and improvement in terms of those objects, and makes it possible to reason about entire classes of transformations. Without this, engineers spend millions of hours adapting compilers to new languages and processor architectures, duplicating work that proper foundations could systematize.
A structural theory would have immediate practical value: clearer foundations for compiler correctness, better tools for reasoning about program transformations, more principled compiler construction. Optimizations could be derived systematically rather than discovered through trial and error. Entire classes of transformation could be proven to satisfy desired properties.
But there's something even larger. An optimizing compiler is a system that reasons about a program before executing it. A structural account of this process — independent of any particular language or syntax — would be a formal framework for systems that think before they act. Every system that reasons before acting — planners, optimizers, agents — would inherit this theory.
At Noeon, our work focuses on a unified knowledge representation system in which computation and knowledge are not separate concepts. Programs are knowledge; knowledge is executable. This has direct implications for optimization. When a program and its transformation are represented in the same structural space, questions about validity and equivalence become structural questions — ones that can be reasoned about directly, rather than handled separately for each language or syntax.
We aren't implementing an existing structural theory of optimization - there isn't one. And we're not interested in writing yet another optimizing compiler without any theory. We're developing both together, letting the system and the mathematics inform each other.
If you've looked at this problem and thought "someone should work on this" - we are. Join us.