Optimizing tail recursion is trivial, because the stack is reused, but general TCO requires additional cleanup or resizing.
In assembly, TCO can be implemented
by replacing a call
followed by a ret
with a jmp
. There may be call frame
setup between the two instructions, which would need to be adjusted.
LLVM has the -tailcallelim
pass and optimizes tail calls
and sibling calls.
It extends the basic algorithm to move trivial instructions from between the
call and return and rewrites associative expressions to use an accumulator (as
in naïve factorial or Fibonacci). It also performs it on functions that return
void, the callee’s result, or a run-time constant on all exits, or if it can
prove that callees don’t access the caller stack frame. TCO is only performed
for the fastcc
, GHC, HiPE, tailcc
, and
swifttailcc
calling conventions.
The Func dialect has func.call
,
func.call_indirect
, and func.return
, and the LLVM dialect
has llvm.call
and llvm.return
, but their docs make no mention of tail calls.
Presumably, MLIR leans on LLVM’s TCO.
WebAssembly has the regular call instructions,
call
, call_indirect
, and call_ref
as well as their proposed
tail-call counterparts, return_call
, return_call_indirect
, and
return_call_ref
, which unwind the frame before performing the call. The callee
does not necessarily need to be the same as the caller and can have a different
type, so is more general than tail self-recursive calls.
V8 has implemented this proposal.
Turbofan (V8’s optimizing compiler) handles the stack and register manipulations for reusing the stack space by generating a list of moves that are semantically executed in parallel. The “gap resolver” then orders moves such that all sources are read before being overwritten, using temporary registers or stack slots to break cycles.
Liftoff (V8’s baseline compiler) does not optimize tail calls. It instead updates the stack frame as for a regular call, then shifts it downwards to discard the caller frame.
Wasmtime details their implementation in an RFC.
In RFC 3407, it is proposed that
Rust use the already-reserved become
keyword to denote a tail call. It is used
opt-in, in place of a return, and its argument must be a call. Tail calls to and
from closures will not be supported, due to the high implementation effort.
Zig uses separate @call
syntax to specify modifiers, such as to control tail calls or inlining.
The JVM eliminates tail-recursive calls, but not general tail calls or mutual tail recursion.
The Graal compiler has internal support for tail calls and limited support for Truffle languages.