✨[Feature] Evaluate using Profile-Guided Optimization (PGO) and Post-Link Optimization (PLO) #2511

Open
@zamazan4ik

Is your feature request related to a problem? Please describe.

Not a problem, but an idea for improving TensorRT performance.

I have evaluated Profile-Guided Optimization (PGO) and Post-Link Optimization (PLO) on multiple projects. The results are available here. According to those tests, these optimizations can improve performance in many cases and for many kinds of applications: compilers and interpreters, static analysis tools, databases, networking software, etc. For this reason, I think optimizing TensorRT (its C++ part) with PGO and PLO would be a good idea.

Describe the solution you'd like

I can suggest the following things:

  • Perform PGO benchmarks on TensorRT. If they show improvements, add a note to the documentation about the possible performance gains from building TensorRT with PGO.
  • Provide an easier way (e.g. a build option) to build TensorRT with PGO. This would help end users and maintainers, since they could optimize TensorRT for their own workloads.
  • Optimize the pre-built TensorRT binaries with PGO.
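
To make the benchmark suggestion more concrete, here is a minimal sketch of the usual three-stage Clang PGO workflow (instrumented build, training run, optimized build), written as a Python helper that only assembles the commands and does not run them. The flags `-fprofile-generate`, `-fprofile-use` and the `llvm-profdata merge` step come from Clang's documented instrumentation-based PGO. The function name, the build directory names and the workload command are hypothetical, and it is an assumption that the TensorRT CMake build picks up `CMAKE_CXX_FLAGS` for every target.

```python
from pathlib import PurePosixPath


def pgo_commands(src_dir, profile_dir, workload_cmd):
    """Return the Clang PGO workflow as a list of argv lists (nothing is executed)."""
    src = PurePosixPath(src_dir)
    prof = PurePosixPath(profile_dir)
    merged = prof / "merged.profdata"
    # Hypothetical build directory names, chosen only for this illustration.
    instrumented = src / "build-pgo-gen"
    optimized = src / "build-pgo-use"

    def configure(build_dir, flag):
        # CMAKE_CXX_FLAGS is a standard CMake variable; that the project's
        # build scripts honour it everywhere is an assumption.
        return [
            "cmake", "-S", str(src), "-B", str(build_dir),
            "-DCMAKE_BUILD_TYPE=Release",
            "-DCMAKE_CXX_COMPILER=clang++",
            f"-DCMAKE_CXX_FLAGS={flag}",
        ]

    return [
        # Stage 1: build with instrumentation that writes raw profiles to `prof`.
        configure(instrumented, f"-fprofile-generate={prof}"),
        ["cmake", "--build", str(instrumented)],
        # Stage 2: run a representative workload against the instrumented build.
        list(workload_cmd),
        # Merge the raw profiles into one indexed profile.
        ["llvm-profdata", "merge", f"-output={merged}", str(prof)],
        # Stage 3: rebuild, letting the compiler use the merged profile.
        configure(optimized, f"-fprofile-use={merged}"),
        ["cmake", "--build", str(optimized)],
    ]


if __name__ == "__main__":
    # "./run_benchmark" is a placeholder for whatever workload is profiled.
    for cmd in pgo_commands("/src/TensorRT", "/tmp/pgo-profiles", ["./run_benchmark"]):
        print(" ".join(cmd))
```

Benchmarking the `build-pgo-use` output against a plain Release build on the same workload would then show whether PGO helps. The quality of the result depends on how representative the training workload is.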

Additional context

As an additional optimization step after PGO, I can suggest Post-Link Optimization (PLO) with a tool like LLVM BOLT. I think it is worth evaluating only after PGO has been integrated into TensorRT.
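
For reference, a BOLT evaluation could follow the sampling-based flow described in the BOLT README: record a profile with `perf`, convert it with `perf2bolt`, then rewrite the binary with `llvm-bolt`. The sketch below only assembles those commands. The helper name and file names are hypothetical, `-j any,u` assumes hardware with branch-sampling (LBR) support, and the target binary is assumed to have been linked with relocations preserved (`--emit-relocs`) so that functions can be reordered.

```python
def bolt_commands(binary, workload_cmd, perf_data="perf.data", fdata="perf.fdata"):
    """Return the perf + BOLT workflow as a list of argv lists (nothing is executed)."""
    optimized = f"{binary}.bolt"
    return [
        # Sample user-space branches while the workload runs.
        ["perf", "record", "-e", "cycles:u", "-j", "any,u",
         "-o", perf_data, "--", *workload_cmd],
        # Convert the perf samples into BOLT's profile format.
        ["perf2bolt", "-p", perf_data, "-o", fdata, binary],
        # Rewrite the binary's code layout using the profile.
        ["llvm-bolt", binary, "-o", optimized, f"-data={fdata}",
         "-reorder-blocks=ext-tsp", "-reorder-functions=hfsort",
         "-split-functions", "-split-all-cold", "-dyno-stats"],
    ]


if __name__ == "__main__":
    # Both the library name and the workload command are placeholders.
    for cmd in bolt_commands("libexample.so", ["./run_benchmark"]):
        print(" ".join(cmd))
```

Since BOLT works on the already linked binary, it can be applied on top of a PGO build, which matches the suggested order of evaluation.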

Here I have collected several PGO-related links (more PGO-related materials are available at https://github.com/zamazan4ik/awesome-pgo/).

Examples of how PGO optimization is integrated into other projects:

I have some examples of how PGO information looks in the documentation:

Regarding LLVM BOLT integration, I have the following examples:
