Rudi Schneider (TU Berlin), Thomas Kœhler (INRIA), Michel Steuwer (TU Berlin)
[conference details] [video] [live video (at 02:04:54)] [paper] [implementation]
An approach for representing bound variables in e-graphs. In their slotted e-graph, e-classes are parameterized by slots abstracting over all free variables.
Why not use de Bruijn levels? Then β-reduction the example given for de Bruijn indices doesn’t have the same sharing problem:
β-reduction: (λx. M) N ==> M[x := N]
η-reduction: λx. M x ==> M when M does not contain x free
η-expansion: M ==> λx. M x
However, adding or removing binders still breaks sharing.
Key idea: unify e-nodes that are equivalent up to renaming of variables into an e-class which is parameterized by its variable names.
Example: f(x, y)
and f(x', y')
have no sharing in conventional e-graphs.
With slotted e-graphs, this uses a variable node, v
with a slot for the
binding. Then, the two calls instantiate v
with their bindings:
f(v[x], v[y])
and f(v[x'], v[y'])
. Then, those two nodes are unified as,
say, n[a, b] = f(v[a], v[b])
. (Brackets are my notation. The visual graphs
have a header to declare the slots and edges instantiate slots at the user end
of the line.)
Conventional e-class: set of equivalent terms.
Slotted e-class: function from variable names to a set of equivalent terms.
Challenge 1: Parameter mismatch. When an e-class has a node which depends on a variable and a node which doesn’t, their implementation says the e-class doesn’t depend on a variable. For example, x * 0 depends on x and 0 doesn’t.
Challenge 2: Symmetry. Merging a + b with b + a is merging the same node with itself. To make it symmetric, e-classes depend on a set of parameter lists. In that case, {(a, b), (b, a)}. If you call it with any of them, it’s the same. That parameter set list can be represented nicely as a permutation group.
(Question asked on Zulip.)
Parameter mismatch and symmetry could be handled by the same mechanism. As described in the talk, x * 0 has signature (x) and 0 has signature (), so when unified, the e-class has signature (). Instead of deleting a variable, the signatures could be unioned in a set: {(), (x)}. This also works with symmetry.
As an example, suppose you start with the expression: (x + y) * 0
and rules:
Applying rewrites would yield the e-class of:
In this model, each e-class is partitioned by the unique signatures. Let’s tentatively call them var-classes. e-nodes then point to var-classes with instantiations instead of e-classes.
What about permutation groups? Would that then need to be expanded instead of compactly represented (or however they do it)?
What about types? Do var-classes need to be partitioned by sorts in addition to
arity? Is it possible for a rewrite to map a binding to another type? I don’t
think so, so var
nodes should be strongly typed.
Terminology: “Signature” is not used in the paper, but I use it for the list of slots. “Shape” is an e-node, where its slots are renamed in ascending order according to first occurrence in the e-node.
The expression (x * 2) + y with rules x + y ==> y + x and x * y ==> y * x forms the following e-class:
Although the two var-classes have the same arity, they cannot be unified.
The expression ((x * 2) * 0) + (y * 0) with rule x * 0 ==> 0 forms the following e-class:
Again, although the 1-ary var-classes have the same arity, they cannot be unified.