Qualcomm AI Engine Direct - alias_copy op #10319

winskuo-quic · 2025-04-21T03:07:11Z

Summary

Remove alias_copy op.

Test plan

Add UTs to ensure alias_copy is removed.

pytorch-bot · 2025-04-21T03:07:14Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10319

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 24f8528 with merge base 17cbef5:

NEW FAILURE - The following job has failed:

pull / test-llava-runner-linux / linux-job (gh)
RuntimeError: Command docker exec -t b83706795ed398fcbccf2f8f99f9ef3b3d732666fb33e34ffc2ccdc5a1923eb1 /exec failed with exit code 139

This comment was automatically generated by Dr. CI and updates every 15 minutes.

winskuo-quic · 2025-04-21T03:11:28Z

Hi @cccclai, @billmguo,
This PR removes the alias_copy op.
Please have a look.
Thanks

billmguo

Works for me

cccclai · 2025-04-21T17:56:41Z

backends/qualcomm/tests/test_qnn_delegate.py

+    def test_qnn_backend_alias(self):
+        module = Alias()  # noqa: F405
+        sample_input = (torch.randn(1, 10),)
+        self.lower_module_and_test_output(module, sample_input)


Are we testing that the alias op still works, meaning it is not removed?

Hi @cccclai,
Thanks for reviewing the PR.
The alias op will be removed.
This test just shows that alias can be removed properly and that the model still works afterwards.
I have added some comments under the model.

Hmm, I assume this test will pass with or without alias_op, correct? I was thinking we could export the graph, run to_backend, and check that there is no alias op after that.

This test is to reproduce @billmguo's error.
Without adding exir_ops.edge.aten.alias_copy.default to the RemoveRedundancy pass, this test fails in qnn_partitioner, which has no op_builder for aten.alias_copy.default.

Thanks for the suggestion. That will probably be more straightforward, as the unit test is not actually running the alias_op since it is dropped during RemoveRedundancy. Maybe we can add a new UT class that checks whether passes work as expected.
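
A pass-level check of the kind suggested above could look roughly like the following. This is a toy sketch in plain Python, not ExecuTorch code: `Node`, `remove_redundancy`, and the target strings are hypothetical stand-ins for an FX graph and the RemoveRedundancy pass. It only shows the shape of the assertion: run the pass, then check that no alias_copy node remains and that its user is rewired to the original input.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a graph node; not the torch.fx.Node API."""
    name: str
    target: str
    inputs: List["Node"] = field(default_factory=list)


# Assumed target string, used here only for illustration.
REDUNDANT_TARGETS = {"aten.alias_copy.default"}


def remove_redundancy(nodes: List[Node]) -> List[Node]:
    """Drop redundant nodes and rewire their users to the node's first input.

    `nodes` is assumed to be in topological order.
    """
    replacement: Dict[str, Node] = {}
    kept: List[Node] = []
    for node in nodes:
        # Point each input at whatever replaced it earlier in the walk.
        node.inputs = [replacement.get(i.name, i) for i in node.inputs]
        if node.target in REDUNDANT_TARGETS:
            replacement[node.name] = node.inputs[0]
        else:
            kept.append(node)
    return kept


# x -> alias_copy -> relu becomes x -> relu
x = Node("x", "placeholder")
alias = Node("alias", "aten.alias_copy.default", [x])
relu = Node("relu", "aten.relu.default", [alias])
graph = remove_redundancy([x, alias, relu])

assert all(n.target not in REDUNDANT_TARGETS for n in graph)
assert relu.inputs[0] is x
```

Unlike an end-to-end output comparison, a check like this fails if the pass stops removing the op, regardless of whether the lowered model still produces correct output.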

facebook-github-bot · 2025-04-22T05:26:31Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cccclai · 2025-04-22T16:18:29Z

It seems like the test is failing

AssertionError: False is not true : ref_output:
tensor([[ 0.5914, -0.8622,  0.0214, -0.3349,  0.0285, -0.9833, -1.2256,  0.3349,
         -0.7197,  0.3278]])

model_output:
tensor([[ 0.5914,  0.5914, -1.2185, -1.2185,  0.5914, -1.2185, -1.2185, -1.2185,
         -1.2185, -1.2185]])

winskuo-quic · 2025-04-23T13:33:55Z

Thanks for sharing the results. I think this is failing by chance. I tested locally a couple of times, and it passed every time. I have switched to a simpler op, since the purpose here is to verify that alias_copy is handled properly.
Would you mind if I push another PR later for the UT refactor, adding a new class for pass-related tests? I think we need to discuss internally first how to refactor the UT classes.
Thanks

cccclai · 2025-04-23T14:22:51Z

Actually, the failure might be due to this change, #10362. Can you help check whether the test still passes with pytorch/pytorch#151436?

tugsbayasgalan · 2025-04-23T20:32:19Z

I have been looking at the failure and still haven't made much progress. It looks like the Qualcomm passes do something special to the exported artifact's state dicts down the line. Is that accurate? Can you point me to the places where you deal with module state dicts?

winskuo-quic · 2025-04-24T03:35:00Z

Hi @cccclai, @tugsbayasgalan,

Yes I am able to reproduce the issue after bumping the torch nightly version to "dev20250422".

I believe the reason is that we are trying to convert the quant weights back to floating point during

executorch/backends/qualcomm/_passes/annotate_quant_attrs.py

Line 144 in d31ef13

set_parameter(param, n.args[0], self.edge_program)

However, after bumping the torch version, it seems the FP weights are not properly saved.
We are using the following functions to get and set parameters.

executorch/backends/qualcomm/builders/utils.py

Line 30 in d31ef13

def get_parameter(

executorch/backends/qualcomm/builders/utils.py

Line 47 in d31ef13

def set_parameter(

It would be appreciated if you could share best practices for modifying parameters.
Thanks

tugsbayasgalan · 2025-04-24T14:26:16Z

I am still confused how this logic could break your flow tho. Before returning state dict, i just shallow copy in export meaning they should be the same state dict lol. In what part of set_parameter does it fail?

winskuo-quic · 2025-04-25T13:04:01Z

I think I might know why it is not updating, but I need some time to verify if it is actually the root cause as I am currently blocked by something else.
I think when we are initializing the passes, we passed the exported_program into the constructor.

executorch/backends/qualcomm/_passes/qnn_pass_manager.py

Line 160 in 9e59c19

kwargs["edge_program"] = exported_program

Next, during the lowering process, when decomposition is called, we get a new exported_program whose state_dict is a shallow copy (new_state_dict) of the original: https://github.com/pytorch/pytorch/blob/ad81eeb7c7c906e0cdd04a5cc8fdb9592281c317/torch/export/exported_program.py#L880
Then, when the passes run, they write our values to the old state_dict in the old exported_program.
This is why we keep getting incorrect params in the QNN op builder.

Please let me know if there's anything that is still unclear.
Thanks
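
The stale-reference behavior described above can be reproduced with plain Python dicts. This is only an illustration: the dicts stand in for the two state_dicts, floats and a list stand in for tensors, and the key names are made up. Whether set_parameter rebinds a key or mutates a tensor in place is an assumption here; the thread only says that values are written to the old state_dict.

```python
# Plain dicts stand in for the state_dict of the old and new exported programs.
old_state_dict = {"linear.weight": 0.5}   # e.g. a quantized value
new_state_dict = dict(old_state_dict)     # shallow copy, as described for decomposition

# A pass that still holds the old program rebinds the key on the OLD dict.
old_state_dict["linear.weight"] = 1.25    # the "restored" floating-point value

# A later consumer reading the NEW dict never sees the update.
stale_value = new_state_dict["linear.weight"]

# By contrast, an in-place mutation of a shared object is visible through both dicts.
shared = {"w": [1.0]}
copied = dict(shared)
shared["w"][0] = 2.0
visible_value = copied["w"][0]

assert stale_value == 0.5
assert visible_value == 2.0
```

A shallow copy shares the stored objects but not the key-to-object bindings, so replacing an entry in one dict leaves the other pointing at the original object.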

tugsbayasgalan · 2025-04-25T14:20:36Z

Ahh this makes sense. You should always work with state_dict of ep after running decompositions because the state dict can change after decompositions.
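
One way to follow this advice is to hand each pass the current program at call time instead of capturing it in a constructor. The sketch below is hypothetical and is not the change made in the Qualcomm backend: `Program`, `run_decompositions` as written here, and `RestoreFloatWeights` are toy stand-ins that only mimic the shallow-copy behavior discussed in this thread.

```python
class Program:
    """Hypothetical stand-in for an exported program that owns a state_dict."""

    def __init__(self, state_dict):
        self.state_dict = state_dict

    def run_decompositions(self):
        # Mimics the behavior described in the thread:
        # a new program object with a shallow-copied state_dict.
        return Program(dict(self.state_dict))


class RestoreFloatWeights:
    """Hypothetical pass that receives the program when called, not in __init__."""

    def __call__(self, program, name, value):
        program.state_dict[name] = value
        return program


ep = Program({"linear.weight": 0.5})
ep = ep.run_decompositions()                           # keep working with the returned program
ep = RestoreFloatWeights()(ep, "linear.weight", 1.25)  # the update lands on the current state_dict

assert ep.state_dict["linear.weight"] == 1.25
```

Because the pass never stores a program reference of its own, it cannot end up writing to a state_dict that a later decomposition has already replaced.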

tugsbayasgalan · 2025-04-29T15:54:55Z

@winskuo-quic Just checking in, will you be working on making the qualcomm code compatible with new export behaviour?

winskuo-quic · 2025-04-30T03:01:14Z

Hi @tugsbayasgalan,
Thanks for the suggestions.
We will work on Qualcomm code to make it compatible with new export behavior.
As you mentioned, it is probably safer to work on state dict of the ep after decomposition.

Remove alias_copy op

34223a0

winskuo-quic requested a review from cccclai as a code owner April 21, 2025 03:07

facebook-github-bot added the CLA Signed label Apr 21, 2025

billmguo approved these changes Apr 21, 2025

cccclai reviewed Apr 21, 2025

cccclai added the release notes: qualcomm label Apr 22, 2025

change to relu

24f8528
