-
Notifications
You must be signed in to change notification settings - Fork 568
Refactoring memory planning to allow running multiple algorithms #8440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8440
Note: Links to docs will display an error until the docs builds have been completed. ❌ 1 New FailureAs of commit c2fa1c1 with merge base 94ec549 ( NEW FAILURE - The following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This pull request was exported from Phabricator. Differential Revision: D69515056 |
Summary: Pull Request resolved: #8440 Differential Revision: D69515056
e6b5faf
to
8eed6f3
Compare
This pull request was exported from Phabricator. Differential Revision: D69515056 |
8eed6f3
to
af63ca9
Compare
Summary: Pull Request resolved: #8440 Differential Revision: D69515056
This pull request was exported from Phabricator. Differential Revision: D69515056 |
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Differential Revision: D69515056
af63ca9
to
32e3fa4
Compare
This pull request was exported from Phabricator. Differential Revision: D69515056 |
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Pull Request resolved: #8440 Differential Revision: D69515056
32e3fa4
to
b323ab5
Compare
What is the regression on lowering time? |
@JacobSzwejbka from a measurement on llama3 it takes about an additional 100ms to run 2 algorithms and pick the best one. |
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Differential Revision: D69515056
b323ab5
to
ba0fa16
Compare
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Pull Request resolved: #8440 Reviewed By: JacobSzwejbka Differential Revision: D69515056
69081ff
to
27d74e2
Compare
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Reviewed By: JacobSzwejbka Differential Revision: D69515056
27d74e2
to
a2cfdc9
Compare
This pull request was exported from Phabricator. Differential Revision: D69515056 |
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Reviewed By: JacobSzwejbka Differential Revision: D69515056
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Reviewed By: JacobSzwejbka Differential Revision: D69515056
a2cfdc9
to
a8802ae
Compare
This pull request was exported from Phabricator. Differential Revision: D69515056 |
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Reviewed By: JacobSzwejbka Differential Revision: D69515056
a8802ae
to
f365bba
Compare
This pull request was exported from Phabricator. Differential Revision: D69515056 |
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Pull Request resolved: #8440 Reviewed By: JacobSzwejbka Differential Revision: D69515056
f365bba
to
89496fc
Compare
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Reviewed By: JacobSzwejbka Differential Revision: D69515056
89496fc
to
1e51dc0
Compare
This pull request was exported from Phabricator. Differential Revision: D69515056 |
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Pull Request resolved: #8440 Reviewed By: JacobSzwejbka Differential Revision: D69515056
1e51dc0
to
1a9b21f
Compare
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Reviewed By: JacobSzwejbka Differential Revision: D69515056
1a9b21f
to
ff1e53b
Compare
This pull request was exported from Phabricator. Differential Revision: D69515056 |
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Pull Request resolved: #8440 Reviewed By: JacobSzwejbka Differential Revision: D69515056
ff1e53b
to
f91623a
Compare
Summary: This diff introduces `memory_planning_algorithm_suite` which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed. The requirement for each of these algorithms is that they should generate a `MemoryAlgoResult` that contains the results of the memory planning done by that algorithm. These algos like before don't update the `TensorSpec` directly, but rather in `memory_planning_algorithm_suite` we figure out which algo gave us the best result and then update the `TensorSpec`'s with values (offsets etc.) returned by that algo. Reviewed By: JacobSzwejbka Differential Revision: D69515056
f91623a
to
c2fa1c1
Compare
This pull request was exported from Phabricator. Differential Revision: D69515056 |
This diff introduces memory_planning_algorithm_suite which is a method that allows us to iterate through multiple memory planning algorithms and pick the one that gives us the best results i.e. least memory consumed.
The requirement for each of these algorithms is that they should generate a MemoryAlgoResult that contains the results of the memory planning done by that algorithm. These algos like before don't update the TensorSpec directly, but rather in memory_planning_algorithm_suite we figure out which algo gave us the best result and then update the TensorSpec's with values (offsets etc.) returned by that algo.
Differential Revision: D69515056