Skip to content

Qualcomm AI Engine Direct - alias_copy op #10319

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

winskuo-quic
Copy link
Collaborator

Summary

Remove alias_copy op.

Test plan

Add UTs to ensure alias_copy is removed.

@winskuo-quic winskuo-quic requested a review from cccclai as a code owner April 21, 2025 03:07
Copy link

pytorch-bot bot commented Apr 21, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10319

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 24f8528 with merge base 17cbef5 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 21, 2025
@winskuo-quic
Copy link
Collaborator Author

Hi @cccclai, @billmguo,
This PR is the remove alias_copy operation.
Please have a look.
Thanks

Copy link
Contributor

@billmguo billmguo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Work for me

def test_qnn_backend_alias(self):
module = Alias() # noqa: F405
sample_input = (torch.randn(1, 10),)
self.lower_module_and_test_output(module, sample_input)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we testing alias op should still work, meaning they're not removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @cccclai,
Thanks for reviewing the PR.
Alias op will be removed.
This test is just to show alias can be properly removed and model is still working properly.
I have added some comments under the model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I assume this test will work regardless with or without alias_op, correct? I was thinking export the graph and run to_backend, and check there is no alias op after that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I assume this test will work regardless with or without alias_op, correct? I was thinking export the graph and run to_backend, and check there is no alias op after that.

This test it to reproduce @billmguo error.
Without adding exir_ops.edge.aten.alias_copy.default to RemoveRedundancy pass, this test will fail during qnn_partitioner where it does not have op_builder for aten.alias_copy.default.

Thanks for the suggestion. That will probably be more straight forward as the unit test is not actually running the alias_op since it is dropped during RemoveRedundancy. Maybe we can add a new class for UT, targeting whether passes are working as expected.

@facebook-github-bot
Copy link
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai cccclai added the release notes: qualcomm Changes to the Qualcomm backend delegate label Apr 22, 2025
@cccclai
Copy link
Contributor

cccclai commented Apr 22, 2025

It seems like the test is failing

AssertionError: False is not true : ref_output:
tensor([[ 0.5914, -0.8622,  0.0214, -0.3349,  0.0285, -0.9833, -1.2256,  0.3349,
         -0.7197,  0.3278]])

model_output:
tensor([[ 0.5914,  0.5914, -1.2185, -1.2185,  0.5914, -1.2185, -1.2185, -1.2185,
         -1.2185, -1.2185]])

@winskuo-quic
Copy link
Collaborator Author

It seems like the test is failing

AssertionError: False is not true : ref_output:
tensor([[ 0.5914, -0.8622,  0.0214, -0.3349,  0.0285, -0.9833, -1.2256,  0.3349,
         -0.7197,  0.3278]])

model_output:
tensor([[ 0.5914,  0.5914, -1.2185, -1.2185,  0.5914, -1.2185, -1.2185, -1.2185,
         -1.2185, -1.2185]])

Thanks for sharing the results. I think this is failing by chances. I tested locally a couple times, and it all passed. I have changed to a simpler OP instead since the purpose here is to verify alias_copy is handled properly.
Would you mind if I push another PR for UT refactor to add a new class for passes related stuff later on? I think we will need to discuss internally first on how to refactor the UT classes.
Thanks

@cccclai
Copy link
Contributor

cccclai commented Apr 23, 2025

Actually the failing might due to this change #10362, can you help checking if the test is still passing with pytorch/pytorch#151436?

@tugsbayasgalan
Copy link
Contributor

I have been looking at the failure and still haven't made much progress. It looks like qualcomm passes do something special to exported artifact state dicts down the line. Is it accurate? Can you point to me places where you deal with module state dicts?

@winskuo-quic
Copy link
Collaborator Author

Actually the failing might due to this change #10362, can you help checking if the test is still passing with pytorch/pytorch#151436?

I have been looking at the failure and still haven't made much progress. It looks like qualcomm passes do something special to exported artifact state dicts down the line. Is it accurate? Can you point to me places where you deal with module state dicts?

Hi @cccclai, @tugsbayasgalan,

Yes I am able to reproduce the issue after bumping the torch nightly version to "dev20250422".

I believe the reason is that we are trying to convert the quant weights back to floating point during

set_parameter(param, n.args[0], self.edge_program)

However, after the bumping the torch version, it seems like the FP weights are not properly saved.
We are using the following functions to get and set parameters.

It would be appreciated if you could share the best practices on handling parameters modification.
Thanks

@tugsbayasgalan
Copy link
Contributor

I am still confused how this logic could break your flow tho. Before returning state dict, i just shallow copy in export meaning they should be the same state dict lol. In what part of set_parameter does it fail?

@winskuo-quic
Copy link
Collaborator Author

I am still confused how this logic could break your flow tho. Before returning state dict, i just shallow copy in export meaning they should be the same state dict lol. In what part of set_parameter does it fail?

I think I might know why it is not updating, but I need some time to verify if it is actually the root cause as I am currently blocked by something else.
I think when we are initializing the passes, we passed the exported_program into the constructor.

kwargs["edge_program"] = exported_program

Next, during the lowering process, when decomposition is called, we get a new exported_program with a shallow copy new_state_dict. https://github.com/pytorch/pytorch/blob/ad81eeb7c7c906e0cdd04a5cc8fdb9592281c317/torch/export/exported_program.py#L880
Then, when we are running the passes, we are updating our values to old state_dict in the old exported_program.
This is why we keep on getting incorrect params during QNN op builder.

Please let me know if there's anything that is still unclear.
Thanks

@tugsbayasgalan
Copy link
Contributor

Ahh this makes sense. You should always work with state_dict of ep after running decompositions because the state dict can change after decompositions.

@tugsbayasgalan
Copy link
Contributor

@winskuo-quic Just checking in, will you be working on making the qualcomm code compatible with new export behaviour?

@winskuo-quic
Copy link
Collaborator Author

@winskuo-quic Just checking in, will you be working on making the qualcomm code compatible with new export behaviour?

Hi @tugsbayasgalan,
Thanks for the suggestions.
We will work on Qualcomm code to make it compatible with new export behavior.
As you mentioned, it is probably safer to work on state dict of the ep after decomposition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. release notes: qualcomm Changes to the Qualcomm backend delegate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants