Description
An alternative dispatch mechanism to switch or computed gotos is to use tail calls.
In a tail calling interpreter, dispatch is implemented as a tail call to the next instruction.
For example: the C code generated for NOP
would look something like:
funcptr INSTRUCTION_TABLE[256] = {
[NOP] = nop_func,
...
};
RETURN_TYPE nop_func(PyObject **sp, PyInterpreterFrame *fp, PyThreadState *ts, _Py_CODEUNIT *next_instr, int oparg)
{
next_instr += 1;
int opcode = next_instr->op.code;
int oparg = next_instr->op.arg;
funcptr next = INSTRUCTION_TABLE[opcode];
return next(sp, fp, ts, next_instr, oparg);
}
If we can get the C compiler to generate jumps for tailcalls and avoid lots of expensive spills, this can be faster than computed gotos.
The problem is that C compilers do not always generate tailcalls and the platform ABI force them to spill a lot of registers.
However, upcoming versions of Clang may support annotations to force tail calls, and LLVM already supports the GHC (Haskell) calling convention which should eliminate the spills.
This has the potential to speed up the tier 1 interpreter by several percent.