notes

Slotted E-Graphs

Rudi Schneider (TU Berlin), Thomas Kœhler (INRIA), Michel Steuwer (TU Berlin)

[conference details] [video] [live video (at 02:04:54)] [paper] [implementation]

An approach for representing bound variables in e-graphs. In their slotted e-graph, e-classes are parameterized by slots abstracting over all free variables.

Why not use de Bruijn levels? Then β-reduction the example given for de Bruijn indices doesn’t have the same sharing problem:

β-reduction: (λx. M) N ==> M[x := N]

η-reduction: λx. M x ==> M when M does not contain x free

η-expansion: M ==> λx. M x

However, adding or removing binders still breaks sharing.

Key idea: unify e-nodes that are equivalent up to renaming of variables into an e-class which is parameterized by its variable names.

Example: f(x, y) and f(x', y') have no sharing in conventional e-graphs. With slotted e-graphs, this uses a variable node, v with a slot for the binding. Then, the two calls instantiate v with their bindings: f(v[x], v[y]) and f(v[x'], v[y']). Then, those two nodes are unified as, say, n[a, b] = f(v[a], v[b]). (Brackets are my notation. The visual graphs have a header to declare the slots and edges instantiate slots at the user end of the line.)

Conventional e-class: set of equivalent terms.

Slotted e-class: function from variable names to a set of equivalent terms.

Challenge 1: Parameter mismatch. When an e-class has a node which depends on a variable and a node which doesn’t, their implementation says the e-class doesn’t depend on a variable. For example, x * 0 depends on x and 0 doesn’t.

Challenge 2: Symmetry. Merging a + b with b + a is merging the same node with itself. To make it symmetric, e-classes depend on a set of parameter lists. In that case, {(a, b), (b, a)}. If you call it with any of them, it’s the same. That parameter set list can be represented nicely as a permutation group.

Follow-up

(Question asked on Zulip.)

Parameter mismatch and symmetry could be handled by the same mechanism. As described in the talk, x * 0 has signature (x) and 0 has signature (), so when unified, the e-class has signature (). Instead of deleting a variable, the signatures could be unioned in a set: {(), (x)}. This also works with symmetry.

As an example, suppose you start with the expression: (x + y) * 0

and rules:

Applying rewrites would yield the e-class of:

In this model, each e-class is partitioned by the unique signatures. Let’s tentatively call them var-classes. e-nodes then point to var-classes with instantiations instead of e-classes.

What about permutation groups? Would that then need to be expanded instead of compactly represented (or however they do it)?

What about types? Do var-classes need to be partitioned by sorts in addition to arity? Is it possible for a rewrite to map a binding to another type? I don’t think so, so var nodes should be strongly typed.

Terminology: “Signature” is not used in the paper, but I use it for the list of slots. “Shape” is an e-node, where its slots are renamed in ascending order according to first occurrence in the e-node.

The expression (x * 2) + y with rules x + y ==> y + x and x * y ==> y * x forms the following e-class:

Although the two var-classes have the same arity, they cannot be unified.

The expression ((x * 2) * 0) + (y * 0) with rule x * 0 ==> 0 forms the following e-class:

Again, although the 1-ary var-classes have the same arity, they cannot be unified.