Refactoring memory planning to allow running multiple algorithms #8440

tarun292 · 2025-02-13T00:44:39Z

This diff introduces `memory_planning_algorithm_suite`, a method that runs multiple memory planning algorithms and picks the one that gives the best result, i.e. the least memory consumed.

Each of these algorithms must generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. As before, the algorithms don't update the `TensorSpec` directly; instead, `memory_planning_algorithm_suite` figures out which algorithm gave the best result and then updates the `TensorSpec`s with the values (offsets etc.) returned by that algorithm.
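The select-then-apply flow described above can be illustrated with a small, self-contained sketch. It does not use the ExecuTorch API: `ToySpec`, `ToyAlgoResult`, `naive_plan`, `first_fit_plan` and `run_suite` are hypothetical names invented for this example, and the two planners are simplified stand-ins rather than the algorithms in this diff. The point is only that each algorithm returns its plan without touching the specs, and the suite writes back the offsets of the winning plan.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToySpec:
    """Hypothetical stand-in for a TensorSpec: size, lifetime, planned offset."""
    name: str
    size: int
    first_use: int
    last_use: int
    offset: int = -1  # written only after the best plan has been chosen


@dataclass
class ToyAlgoResult:
    """Hypothetical stand-in for MemoryAlgoResult: a plan kept apart from the specs."""
    offsets: Dict[str, int] = field(default_factory=dict)
    total_size: int = 0


def naive_plan(specs: List[ToySpec]) -> ToyAlgoResult:
    """Give every tensor its own region, with no reuse."""
    result = ToyAlgoResult()
    for s in specs:
        result.offsets[s.name] = result.total_size
        result.total_size += s.size
    return result


def first_fit_plan(specs: List[ToySpec]) -> ToyAlgoResult:
    """Place each tensor at the lowest offset free during its whole lifetime."""
    result = ToyAlgoResult()
    placed: List[ToySpec] = []
    for s in sorted(specs, key=lambda t: -t.size):
        # Regions held by already-placed tensors whose lifetimes overlap with s.
        busy = sorted(
            (result.offsets[p.name], result.offsets[p.name] + p.size)
            for p in placed
            if not (p.last_use < s.first_use or s.last_use < p.first_use)
        )
        offset = 0
        for start, end in busy:
            if offset + s.size <= start:
                break  # s fits in the gap before this region
            offset = max(offset, end)
        result.offsets[s.name] = offset
        result.total_size = max(result.total_size, offset + s.size)
        placed.append(s)
    return result


def run_suite(
    specs: List[ToySpec],
    algos: List[Callable[[List[ToySpec]], ToyAlgoResult]],
) -> ToyAlgoResult:
    """Run every algorithm, keep the smallest plan, then update the specs once."""
    results = [algo(specs) for algo in algos]
    best = min(results, key=lambda r: r.total_size)
    for s in specs:
        s.offset = best.offsets[s.name]
    return best


specs = [
    ToySpec("a", size=4, first_use=0, last_use=1),
    ToySpec("b", size=4, first_use=2, last_use=3),
    ToySpec("c", size=2, first_use=0, last_use=3),
]
best = run_suite(specs, [naive_plan, first_fit_plan])
print(best.total_size, [s.offset for s in specs])
```

In this toy input, `a` and `b` are never live at the same time, so the first-fit plan lets them share offset 0 and needs 6 units in total, while the naive plan needs 10; the suite therefore applies the first-fit offsets. Because the planners only return a result object, running an extra algorithm cannot leave the specs in a half-updated state.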

Differential Revision: D69515056

pytorch-bot · 2025-02-13T00:44:43Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8440

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit c2fa1c1 with merge base 94ec549:

NEW FAILURE - The following job has failed:

pull / unittest / macos / macos-job (gh)
backends/xnnpack/test/passes/test_convert_to_linear.py::TestConvertToLinear::test_fp32_convert_to_linear

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-02-13T00:44:52Z

This pull request was exported from Phabricator. Differential Revision: D69515056

Summary: Pull Request resolved: #8440 Differential Revision: D69515056

facebook-github-bot · 2025-02-13T00:48:02Z

This pull request was exported from Phabricator. Differential Revision: D69515056

Summary: Pull Request resolved: #8440 Differential Revision: D69515056

facebook-github-bot · 2025-02-13T00:50:10Z

This pull request was exported from Phabricator. Differential Revision: D69515056


facebook-github-bot · 2025-02-27T18:03:03Z

This pull request was exported from Phabricator. Differential Revision: D69515056


JacobSzwejbka · 2025-02-27T18:17:50Z

What is the regression on lowering time?

tarun292 · 2025-03-04T22:27:16Z

> What is the regression on lowering time?

@JacobSzwejbka In a measurement on llama3, running 2 algorithms and picking the best one takes about an additional 100 ms.

Summary: Pull Request resolved: #8440 Reviewed By: JacobSzwejbka Differential Revision: D69515056

facebook-github-bot · 2025-03-21T05:55:54Z

This pull request was exported from Phabricator. Differential Revision: D69515056


facebook-github-bot · 2025-03-21T18:38:43Z

This pull request was exported from Phabricator. Differential Revision: D69515056


facebook-github-bot · 2025-03-21T20:08:10Z

This pull request was exported from Phabricator. Differential Revision: D69515056


facebook-github-bot · 2025-03-21T22:04:59Z

This pull request was exported from Phabricator. Differential Revision: D69515056


facebook-github-bot · 2025-03-22T01:42:58Z

This pull request was exported from Phabricator. Differential Revision: D69515056


facebook-github-bot · 2025-03-24T18:09:17Z

This pull request was exported from Phabricator. Differential Revision: D69515056

facebook-github-bot added the CLA Signed label Feb 13, 2025

facebook-github-bot added the fb-exported label Feb 13, 2025

tarun292 added the topic: not user facing label Feb 13, 2025

facebook-github-bot pushed a commit that referenced this pull request Feb 13, 2025

Memory planning updates (#8440)

8eed6f3

Summary: Pull Request resolved: #8440 Differential Revision: D69515056

facebook-github-bot force-pushed the export-D69515056 branch from e6b5faf to 8eed6f3 Compare February 13, 2025 00:47

facebook-github-bot force-pushed the export-D69515056 branch from 8eed6f3 to af63ca9 Compare February 13, 2025 00:50

facebook-github-bot pushed a commit that referenced this pull request Feb 13, 2025

Memory planning updates (#8440)

af63ca9

Summary: Pull Request resolved: #8440 Differential Revision: D69515056

tarun292 changed the title from "Memory planning updates" to "Refactoring memory planning to allow running multiple algorithms" Feb 27, 2025

tarun292 added this to ExecuTorch Core Feb 27, 2025

github-project-automation bot moved this to To triage in ExecuTorch Core Feb 27, 2025

tarun292 linked an issue Feb 27, 2025 that may be closed by this pull request

Add heap based greedy planning algorithm and graph coloring algorithm for memory planning. #8467

Open

tarun292 removed a link to an issue Feb 27, 2025

Add heap based greedy planning algorithm and graph coloring algorithm for memory planning. #8467

Open

tarun292 linked an issue Feb 27, 2025 that may be closed by this pull request

Refactor memory planning to support running multiple algorithms and choose the one that has the best output results. #8466

Closed

tarun292 force-pushed the export-D69515056 branch from af63ca9 to 32e3fa4 Compare February 27, 2025 17:59

tarun292 requested review from JacobSzwejbka, larryliu0820 and SS-JIA as code owners February 27, 2025 17:59

tarun292 force-pushed the export-D69515056 branch from 32e3fa4 to b323ab5 Compare February 27, 2025 18:03

mergennachin self-requested a review February 28, 2025 16:00

tarun292 force-pushed the export-D69515056 branch from b323ab5 to ba0fa16 Compare March 7, 2025 19:44

tarun292 force-pushed the export-D69515056 branch from 69081ff to 27d74e2 Compare March 20, 2025 21:33

facebook-github-bot force-pushed the export-D69515056 branch from 27d74e2 to a2cfdc9 Compare March 21, 2025 05:55

facebook-github-bot force-pushed the export-D69515056 branch from a2cfdc9 to a8802ae Compare March 21, 2025 18:38

tarun292 force-pushed the export-D69515056 branch from a8802ae to f365bba Compare March 21, 2025 20:04

tarun292 force-pushed the export-D69515056 branch from f365bba to 89496fc Compare March 21, 2025 20:08

tarun292 force-pushed the export-D69515056 branch from 89496fc to 1e51dc0 Compare March 21, 2025 22:01

tarun292 force-pushed the export-D69515056 branch from 1e51dc0 to 1a9b21f Compare March 21, 2025 22:05

tarun292 force-pushed the export-D69515056 branch from 1a9b21f to ff1e53b Compare March 22, 2025 01:39

tarun292 force-pushed the export-D69515056 branch from ff1e53b to f91623a Compare March 22, 2025 01:43

facebook-github-bot force-pushed the export-D69515056 branch from f91623a to c2fa1c1 Compare March 24, 2025 18:09

facebook-github-bot merged commit d3863a8 into main Mar 24, 2025
80 of 83 checks passed

facebook-github-bot deleted the export-D69515056 branch March 24, 2025 23:45

github-project-automation bot moved this from In progress to Done in ExecuTorch Core Mar 24, 2025
