notes

Egraphs and Automated Reasoning: Looking Back to Look Forward

Philip Zucker

[conference details] [pre-recorded video] [conference video] [live video (at 00:24:43)] [slides]

[blog post] [abstract]

(The audio is corrupted for most of the conference-recorded talk.)

Simplification and completion

Basic form of completion:

  1. Orient according to term order. Make a bidirectional rule directed in the direction of which would yield a cheaper result, e.g., [X * 2 = X << 2] ==> {X * 2 -> X << 2}.
  2. Add critical pairs (CPs) as equations. Find possible phase ordering problems: two LHSes which can overlap in a non-trivial way. Derive equations, e.g., {X * 2 -> X << 1; (X * Y) / Y -> X} => [(X << 1) / 2 = X].
  3. Reduce equations. Reduce the equations wrt the rewrite rules, to simplify and discard redundant equations that the system can already prove.
  4. Repeat.

This repairs confluence.

More advanced completion (Baader and Nipkow):

If new rewrite rules are derived, which imply the old, then the R-simplify and L-simplify rules let you compact and canonicalize your rewrite system.

Completion sounds like what Cranelift does statically to its rewrite rules before generating core for them.

Union-find is Ground Atomic Completion

Union-find is an instance of completion, in particular ground atomic completion. Instead of variables for patterns, we have atomic symbols.

Term rewrite system Union-find
Run TRS Find
Add equation Union
R/L simplify Path compression
Term ordering Tie breaking (e.g., depth or rank)

Twee has the best visualization of proof production of any automated theorem prover he’s seen. Remark: The user names axioms (rewrites), which would help for identification and provenance.

E-Graph is Ground Term Completion

TRS E-graph
Canonical term E-class
R/L-simplify Canonicalization
Run rules Extraction
Term orders Extraction objective
KBO weights Weights

Most important point: a Compressed Ground Rewrite System can be diagrammatically represented in just as compelling or a simpler way than e-graphs:

Where are the e-classes and e-nodes?

Flattening: introduce fresh constants for every term and rank them less in term ordering.

This has two types of rules: f(e1, e2) -> e3 uses the e-node table (which e-nodes belong to which e-classes) and e3 -> e1 uses union-find (e-class ID to e-class ID).

E-matching

Run rules in reverse to generate all equivalent terms.

Bottom-up e-matching is perhaps related to the ae-graph.

Bottom-up e-matching is better behaved in regards to polynomial and destructive rewrites.

Strategies

In completion, everything is put into a single sack and run repeatedly. In e-graphs, there’s a distinction between the rules and e-graph; rules don’t apply to each other to make inferences.

Set of Support strategy: there are active and passive rules, where passive rules aren’t allowed to infer among themselves; only active against passive or active against active is allowed. Passive is the eqsat rewrite rules and active is the e-graph.

Prolog and Datalog can be seen as incomplete strategies on resolution.

Propositional and equational system correspondences

Propositional Equational
Resolution Paramodulation
Ordered resolution Superposition
? Completion
Ground Ordered Resolution E-graph / Ground Completion / Ground Superposition
Prolog Functional Logic Programming
Datalog Egglog
Answer Set Programming ?
Lambda Prolog ?
Hypothetical Datalog ?
Minikanren ?