JIT: Optimize common C calls via replication #133020

Open
@Fidget-Spinner

Description

Feature or enhancement

Proposal:

Currently, a C call is executed roughly as follows in the specializing interpreter:

(py_func->m_ml->ml_meth)(...args)

This has two sources of overhead:

  1. The double pointer lookup (py_func->m_ml, then m_ml->ml_meth), while cheap, is still something we can remove.
  2. The JIT cannot inline the function without PGO (which the JIT currently does not have, and will probably never have), because the call target is only known at runtime.

We can optimize it to the following in the JIT:

PyCFunction cfunc = LOOKUP_TABLE[1..n]; // Via replicate(n)
DEOPT_IF(cfunc != py_func->m_ml->ml_meth);
cfunc(...args);

LOOKUP_TABLE will be populated with common C functions that we know Python code uses.
This removes the second source of overhead, allowing the JIT to inline and optimize these calls.
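
To make the guard-then-call shape concrete, here is a minimal standalone sketch in plain C. It does not use the real CPython types or the real replicate(n) machinery: method_def, cfunction_object, cfunc_t, call_replica, call_generic and the three example functions are all hypothetical stand-ins, and the table index is passed as a parameter where the real proposal would have one compile-time constant per replica.

```c
#include <stddef.h>

/* Hypothetical stand-ins for PyMethodDef / PyCFunctionObject. */
typedef long (*cfunc_t)(long);

typedef struct {
    const char *ml_name;
    cfunc_t ml_meth;
} method_def;

typedef struct {
    method_def *m_ml;
} cfunction_object;

/* Example "C functions"; the first two play the role of common builtins. */
static long builtin_abs(long x) { return x < 0 ? -x : x; }
static long builtin_negate(long x) { return -x; }
static long uncommon_double(long x) { return 2 * x; }

/* Table of common targets, fixed when the JIT is built. */
static const cfunc_t LOOKUP_TABLE[] = { builtin_abs, builtin_negate };

enum { CALL_DEOPT = 0, CALL_OK = 1 };

/* One replica: guard that the table entry matches the live ml_meth,
 * then call the table entry instead of the pointer loaded from the object.
 * Returns CALL_DEOPT without calling anything if the guard fails. */
static int call_replica(size_t index, const cfunction_object *py_func,
                        long arg, long *result)
{
    cfunc_t cfunc = LOOKUP_TABLE[index];
    if (cfunc != py_func->m_ml->ml_meth) {
        return CALL_DEOPT;
    }
    *result = cfunc(arg);
    return CALL_OK;
}

/* Generic path used after a deopt: the original double pointer lookup. */
static long call_generic(const cfunction_object *py_func, long arg)
{
    return py_func->m_ml->ml_meth(arg);
}
```

In this sketch a caller would try call_replica first and fall back to call_generic on CALL_DEOPT. Note that the guard still reads py_func->m_ml->ml_meth, so only overhead 2 goes away; the benefit is that cfunc comes from a table of known functions rather than from an opaque runtime pointer.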

If we want, there's an even more extreme optimization we could do: burn in the C function directly and call it, which would also save the overhead of 1. However, I don't think this can be done without breaking unusual code that sets ml_meth dynamically, so I would be more cautious with that.
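
The risk with burning in the target can be illustrated with a small standalone sketch. Again this is plain C with hypothetical names (method_def, original_impl, patched_impl, call_burned_in, call_through_pointer), not CPython code: once ml_meth is reassigned, an unguarded burned-in call keeps calling the old function.

```c
/* Hypothetical stand-in for a method definition with a mutable ml_meth. */
typedef long (*cfunc_t)(long);

typedef struct {
    cfunc_t ml_meth;
} method_def;

static long original_impl(long x) { return x + 1; }
static long patched_impl(long x) { return x + 100; }

/* "Burned in" call: the target is fixed at compile time and ml_meth is
 * never read, so later changes to def->ml_meth are not observed. */
static long call_burned_in(const method_def *def, long arg)
{
    (void)def;
    return original_impl(arg);
}

/* Regular call: always follows the current value of ml_meth. */
static long call_through_pointer(const method_def *def, long arg)
{
    return def->ml_meth(arg);
}
```

Both functions agree while ml_meth still points at original_impl; they diverge as soon as something stores patched_impl into it, which is the behavior the DEOPT_IF guard in the table-based version protects against.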

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

Metadata

    Labels

    interpreter-core: (Objects, Python, Grammar, and Parser dirs)
    performance: Performance or resource usage
    topic-JIT
    type-feature: A feature request or enhancement
