notes

Adapton: Composable, Demand-Driven Incremental Computation

Matthew A. Hammer, Yit Phang Khoo, Michael Hicks, and Jeffrey S. Foster

Programming Language Design and Implementation (PLDI 2014), Edinburgh, Scotland, June 2014

[paper]

Lightly-edited excerpts:

5.1 Adapton change propagation algorithm

Adapton operates on an acyclic demanded computation graph (DCG), corresponding to demanded computation traces. Each node in the graph represents a ref or a thunk, and each directed edge represents a dependency pointing from a thunk to another ref or thunk. The DCG is initially empty at the beginning of the execution. Nodes are added to the DCG whenever a new ref or thunk is created via ref, thunk, or a memo constructor (when memoization misses). Edges are added when a thunk calls get or force; edges are labeled with the value returned by that call. We maintain edges bidirectionally: Each node stores an ordered list of outgoing edges that is appended by each call to get or force, as well as an unordered set of incoming edges. This allows us to traverse the DCG from caller to callee or vice-versa.

The key to making Adapton efficient is to split change propagation into two phases—dirtying and propagation. The dirtying phase occurs when we make calls to set to update inputs to the incremental program. For each call to set, we traverse the DCG starting from the ref backward, marking all traversed edges as “dirty”. Note that unlike Section 2, in our implementation only edges are dirtied; a node is implicitly dirty if any of its outgoing edges are dirty.

The propagation phase occurs when we make calls to force to demand results from the incremental program. For each call to force on a thunk, we perform an in-order traversal starting from the thunk’s dirty outgoing edges, re-evaluating thunks as necessary. We check if we need to re-evaluate a thunk by iterating over its dirty outgoing edges in the order they were added. For each dirty edge, we first clean its dirty flag. If the target node is a thunk, we recursively check if we need to re-evaluate the target thunk. Then, we compare the value of the target ref or thunk against the label of the outgoing edge. If the value is inconsistent, we know that at least one input to the thunk has changed, so we need to re-evaluate the thunk, and need not check its remaining outgoing edges. In fact, we first remove all its outgoing edges, since some edges may no longer be relevant due to the changed input; relevant edges will be re-added when get or force is called during re-evaluation.

Note that, in both dirtying and propagation, we only traverse an edge if it is clean or dirty, respectively. We can do so because the above procedures maintain an invariant that, if an edge is dirty at the end of a dirtying or propagation phase, then all edges transitively reachable by traversing incoming edges beginning from the source node will also be dirty; dually, if an edge is clean, then all edges transitively reachable by traversing outgoing edges beginning from the target node will also be clean. This greatly improves efficiency by amortizing dirtying costs across consecutive calls to set, and propagation cost across consecutive calls to force.

By traversing the DCG in-order, Adapton re-evaluates inconsistent nodes in the same order as a standard non-incremental lazy evaluator would force thunks. Therefore, Adapton avoids re-evaluating nodes unnecessarily, such as thunks that were conditionally forced due to certain inputs, but may no longer be forced under updated inputs.

6.2 Micro-benchmark

We observed that benchmarks spend a significant portion of time in the garbage collector (sometimes more than 50%).