Matthew A. Hammer, Yit Phang Khoo, Michael Hicks, and Jeffrey S. Foster
Programming Language Design and Implementation (PLDI 2014), Edinburgh, Scotland, June 2014
[paper]
Lightly-edited excerpts:
Adapton operates on an acyclic demanded computation graph (DCG), corresponding
to demanded computation traces. Each node in the graph represents a ref
or a
thunk
, and each directed edge represents a dependency pointing from a thunk
to another ref
or thunk
. The DCG is initially empty at the beginning of the
execution. Nodes are added to the DCG whenever a new ref
or thunk
is created
via ref
, thunk, or a memo constructor (when memoization misses). Edges are
added when a thunk
calls get
or force
; edges are labeled with the value
returned by that call. We maintain edges bidirectionally: Each node stores an
ordered list of outgoing edges that is appended by each call to get
or
force
, as well as an unordered set of incoming edges. This allows us to
traverse the DCG from caller to callee or vice-versa.
The key to making Adapton efficient is to split change propagation into two
phases—dirtying and propagation. The dirtying phase occurs when we make calls to
set
to update inputs to the incremental program. For each call to set
, we
traverse the DCG starting from the ref
backward, marking all traversed edges
as “dirty”. Note that unlike Section 2, in our implementation only edges are
dirtied; a node is implicitly dirty if any of its outgoing edges are dirty.
The propagation phase occurs when we make calls to force
to demand results
from the incremental program. For each call to force
on a thunk
, we perform
an in-order traversal starting from the thunk
’s dirty outgoing edges,
re-evaluating thunk
s as necessary. We check if we need to re-evaluate a
thunk
by iterating over its dirty outgoing edges in the order they were added.
For each dirty edge, we first clean its dirty flag. If the target node is a
thunk
, we recursively check if we need to re-evaluate the target thunk
.
Then, we compare the value of the target ref
or thunk
against the label of
the outgoing edge. If the value is inconsistent, we know that at least one input
to the thunk
has changed, so we need to re-evaluate the thunk
, and need not
check its remaining outgoing edges. In fact, we first remove all its outgoing
edges, since some edges may no longer be relevant due to the changed input;
relevant edges will be re-added when get
or force
is called during
re-evaluation.
Note that, in both dirtying and propagation, we only traverse an edge if it is
clean or dirty, respectively. We can do so because the above procedures maintain
an invariant that, if an edge is dirty at the end of a dirtying or propagation
phase, then all edges transitively reachable by traversing incoming edges
beginning from the source node will also be dirty; dually, if an edge is clean,
then all edges transitively reachable by traversing outgoing edges beginning
from the target node will also be clean. This greatly improves efficiency by
amortizing dirtying costs across consecutive calls to set
, and propagation
cost across consecutive calls to force
.
By traversing the DCG in-order, Adapton re-evaluates inconsistent nodes in the
same order as a standard non-incremental lazy evaluator would force thunks.
Therefore, Adapton avoids re-evaluating nodes unnecessarily, such as thunk
s
that were conditionally forced due to certain inputs, but may no longer be
forced under updated inputs.
We observed that benchmarks spend a significant portion of time in the garbage collector (sometimes more than 50%).