E-Graphs: Optimizing CAD Under the Hood

Every time you union two shapes, fillet an edge, or sweep a profile, the CAD kernel builds a tree of operations. That tree can grow deep and redundant fast. A single bracket model might accumulate hundreds of Boolean nodes, many of which could be simplified without changing the final geometry. Traditional CAD kernels leave those trees alone and brute-force the evaluation. E-graphs offer a fundamentally different approach.

What Is an E-Graph?

An e-graph (equivalence graph) is a data structure that compactly represents many equivalent versions of the same expression simultaneously. Instead of storing a single canonical form, an e-graph groups nodes into equivalence classes. Two nodes belong to the same class if a rewrite rule proves they produce the same result.

Consider a simple arithmetic analogy. The expressions (a + b) + c and a + (b + c) are associatively equivalent. An e-graph stores both forms within one equivalence class, along with any other form discovered by applying rules. When you later ask “what is the cheapest form?”, the e-graph can search all known variants and return the optimal one.

In a CAD context, “expressions” are not arithmetic --- they are geometry operations. A union of two spheres, a smooth-union with blending radius, a difference followed by a re-union: each of these can be represented as a term in an expression language. E-graphs let the kernel explore algebraic identities over those terms.

Equality Saturation: The Engine

The process of populating an e-graph is called equality saturation. It works in three phases:

Seed. Insert the original expression into the e-graph.
Saturate. Apply a set of rewrite rules exhaustively. Each rule match merges equivalence classes or adds new nodes. Rules fire until no new merges are possible or a budget is hit.
Extract. Traverse the saturated e-graph using a cost function and extract the cheapest equivalent expression.

The critical insight is that saturation applies rules in any order without losing information. Traditional term rewriting is destructive --- you pick a direction and commit. E-graphs keep every intermediate form alive, which prevents the phase-ordering problem that plagues classical optimizers.

Rewrite Rules for Geometry

What do rewrite rules look like for a CAD kernel? Here are a few categories:

Boolean algebra. Union(A, A) => A. Difference(A, Empty) => A. Intersection(A, Universe) => A.
Commutativity. Union(A, B) => Union(B, A). This lets the extractor choose whichever ordering is cheaper to evaluate.
Smooth operator equivalence. SmoothUnion(A, B, r=0) => Union(A, B). When the blending radius is zero, the smooth variant degenerates to the sharp version, which is cheaper to compute.
Distributivity. Intersection(Union(A, B), C) => Union(Intersection(A, C), Intersection(B, C)). Distributing can sometimes expose sub-expressions that cancel or simplify.
Transform fusion. Translate(Translate(A, v1), v2) => Translate(A, v1 + v2). Collapsing stacked transforms reduces evaluation depth.

Each rule is proven correct for signed distance fields before it enters the rule set. Incorrect rules would silently corrupt geometry, so the rule library is treated as a verified artifact.

Why This Matters for CAD Performance

Evaluation Depth

Implicit geometry kernels evaluate fields by traversing the operation tree at every sample point. If you are ray-marching a model with 200 operations deep, each ray step touches 200 nodes. If e-graph extraction collapses that tree to 80 operations, you get a roughly 2.5x speedup on every single ray step --- and there may be millions of steps per frame.

Memory Pressure

E-graphs also deduplicate shared sub-expressions. If two features reference the same base cylinder, the e-graph naturally merges them into one equivalence class. The extracted expression reuses that node instead of duplicating it, reducing memory footprint.

Determinism

Because extraction uses a well-defined cost function (typically AST size + estimated evaluation cost), the result is deterministic. The same input model always produces the same optimized tree, regardless of the order in which the user built the features. This matters for regression testing and for collaborative workflows where two engineers expect byte-identical evaluation results.

How NeuroCAD Integrates E-Graphs

NeuroCAD uses e-graphs as a core optimization pass inside the kernel_field crate. When a model’s operation graph is finalized (or when incremental re-evaluation triggers), the field evaluator converts the relevant sub-graph into an e-graph term language, saturates it against the rule library, and extracts the cheapest form. The optimized graph is then used for all downstream evaluation: meshing, ray-marching, constraint solving, and export.

The rule library is versioned alongside the kernel. Adding a new rule requires a proof sketch and a regression test that compares the SDF output of the original and rewritten forms over a dense point cloud. This keeps the optimization pass trustworthy.

Admissibility Checks

Not every rewrite is safe for every context. A rule that is valid for exact SDF may break the Lipschitz bound needed by sphere-tracing. NeuroCAD enforces admissibility checks --- each rule carries metadata about which evaluation modes it is valid for (exact SDF, bounded SDF, occupancy field). During saturation, only rules compatible with the current evaluation mode fire.

Practical Impact

In internal benchmarks on models with 150—500 operations, e-graph optimization typically reduces evaluation depth by 30—60% and total sample time by 20—45%. The saturation phase itself costs single-digit milliseconds for models of this size, which is negligible compared to the evaluation savings over millions of sample points.

For engineers, this is invisible. You build your model with whatever feature sequence makes design sense. The kernel optimizes behind the scenes. There is no “optimize” button to press and no need to restructure your feature tree for performance.

Limitations and Open Problems

E-graph saturation is not free. The data structure can grow exponentially if rules interact in pathological ways. Budget limits (maximum number of e-class merges or maximum wall-clock time) are essential to keep saturation tractable. NeuroCAD uses both a node budget and a time budget, whichever is hit first terminates saturation.

Extraction with complex cost functions can itself be NP-hard. In practice, ILP-based extractors or greedy bottom-up extractors work well for the expression sizes seen in CAD models, but this remains an active research area.

Finally, the rule library must be curated carefully. An incorrect rule produces silently wrong geometry, which is the worst failure mode for an engineering tool. Formal verification of rewrite rules for SDF operators is an ongoing research direction.

E-Graphs: Optimizing CAD Under the Hood

E-Graphs: Optimizing CAD Under the Hood

What Is an E-Graph?

Equality Saturation: The Engine

Rewrite Rules for Geometry

Why This Matters for CAD Performance

Evaluation Depth

Memory Pressure

Determinism

How NeuroCAD Integrates E-Graphs

Admissibility Checks

Practical Impact

Limitations and Open Problems

Further Reading

Discuss NeuroCAD as a pilot or investment opportunity.