Refactoring memory planning to allow running multiple algorithms #8440


Merged 1 commit on Mar 24, 2025

@tarun292 (Contributor) commented Feb 13, 2025

This diff introduces `memory_planning_algorithm_suite`, a method that runs multiple memory planning algorithms and picks the one that gives the best result, i.e. the least memory consumed.

Each of these algorithms must produce a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. The algorithms, like before, don't update the `TensorSpec`s directly. Instead, `memory_planning_algorithm_suite` works out which algorithm gave the best result and then updates the `TensorSpec`s with the values (offsets etc.) returned by that algorithm.
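
To make the flow above concrete, here is a minimal, self-contained sketch of the "run several planners, keep the smallest result, then write it back" pattern. It is not the code from this diff: the `TensorSpec` and `MemoryAlgoResult` classes below are toy stand-ins with assumed fields (`nbytes`, `first_use`, `last_use`, `mem_offset`, `offsets`, `bufsize`), and the `naive` and `greedy` planners are hypothetical examples written only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class TensorSpec:
    """Toy stand-in (assumed fields): size, lifetime, and planned offset."""
    name: str
    nbytes: int
    first_use: int
    last_use: int
    mem_offset: Optional[int] = None


@dataclass
class MemoryAlgoResult:
    """Toy stand-in (assumed fields): proposed offsets and total buffer size."""
    offsets: Dict[str, int] = field(default_factory=dict)
    bufsize: int = 0


Algo = Callable[[List[TensorSpec]], MemoryAlgoResult]


def naive(specs: List[TensorSpec]) -> MemoryAlgoResult:
    """Hypothetical planner: every tensor gets its own region, no reuse."""
    result = MemoryAlgoResult()
    for spec in specs:
        result.offsets[spec.name] = result.bufsize
        result.bufsize += spec.nbytes
    return result


def greedy(specs: List[TensorSpec]) -> MemoryAlgoResult:
    """Hypothetical planner: first-fit reuse for non-overlapping lifetimes."""
    result = MemoryAlgoResult()
    placed = []  # (spec, offset) pairs already assigned
    for spec in sorted(specs, key=lambda s: s.nbytes, reverse=True):
        # Byte ranges held by tensors that are live at the same time as `spec`.
        busy = sorted(
            (off, off + other.nbytes)
            for other, off in placed
            if not (other.last_use < spec.first_use
                    or spec.last_use < other.first_use)
        )
        offset = 0
        for start, end in busy:
            if offset + spec.nbytes <= start:
                break  # the gap before this range is large enough
            offset = max(offset, end)
        placed.append((spec, offset))
        result.offsets[spec.name] = offset
        result.bufsize = max(result.bufsize, offset + spec.nbytes)
    return result


def memory_planning_algorithm_suite(
    specs: List[TensorSpec], algos: List[Algo]
) -> MemoryAlgoResult:
    """Run every planner, keep the smallest plan, then update the specs."""
    if not algos:
        raise ValueError("at least one algorithm is required")
    # Planners only return results; none of them mutates the specs.
    results = [algo(specs) for algo in algos]
    best = min(results, key=lambda r: r.bufsize)
    # Only the winning plan is written back.
    for spec in specs:
        spec.mem_offset = best.offsets[spec.name]
    return best
```

Keeping the planners free of side effects is what makes the comparison safe: a losing algorithm leaves no partial state on the specs, and the write-back happens exactly once.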

Differential Revision: D69515056

pytorch-bot bot commented Feb 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8440

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit c2fa1c1 with merge base 94ec549:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Feb 13, 2025
@facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D69515056

facebook-github-bot pushed a commit that referenced this pull request Feb 13, 2025
Summary: Pull Request resolved: #8440

Differential Revision: D69515056
@tarun292 changed the title from "Memory planning updates" to "Refactoring memory planning to allow running multiple algorithms" on Feb 27, 2025
@github-project-automation github-project-automation bot moved this to To triage in ExecuTorch Core Feb 27, 2025
tarun292 added a commit that referenced this pull request Feb 27, 2025
@JacobSzwejbka (Contributor) commented:

What is the regression on lowering time?

@mergennachin mergennachin self-requested a review February 28, 2025 16:00
@tarun292 (Contributor, Author) commented Mar 4, 2025

> What is the regression on lowering time?

@JacobSzwejbka Measured on llama3, running 2 algorithms and picking the best one adds about 100 ms.

tarun292 added a commit that referenced this pull request Mar 7, 2025
tarun292 added a commit that referenced this pull request Mar 20, 2025
Reviewed By: JacobSzwejbka
facebook-github-bot pushed a commit that referenced this pull request Mar 21, 2025

tarun292 added a commit that referenced this pull request Mar 21, 2025
tarun292 added a commit that referenced this pull request Mar 22, 2025

@facebook-github-bot facebook-github-bot merged commit d3863a8 into main Mar 24, 2025
80 of 83 checks passed
@facebook-github-bot facebook-github-bot deleted the export-D69515056 branch March 24, 2025 23:45
@github-project-automation github-project-automation bot moved this from In progress to Done in ExecuTorch Core Mar 24, 2025
Labels: CLA Signed, fb-exported, topic: not user facing
Project status: Done
Linked issue: Refactor memory planning to support running multiple algorithms and choose the one that has the best output results.
3 participants