Skip to content

[Dependency Scanning] Fix search path context hashing hole and avoid serializing unnecessary field #81244

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 2, 2025

Conversation

artemcm
Copy link
Contributor

@artemcm artemcm commented May 1, 2025

  • [Dependency Scanning][Serialization] Do not serialize auxiliary files
    The field is only used to store information to be used in finalize stage, in caching builds. When loading scan results from the cache, the entries are finalized already and have the file info encoded in CASIDs already.
    Resolves rdar://150307865

  • Include system-ness of framework and import search paths in the PCH hash
    This hash is also used for the dependency scanning hash. In both cases, PCH contents may differ based on whether a certain module they depend on is found in a system or non-system search path. In dependency scanning, systemness should cause a full change of scanning context requiring a from-scratch scan.
    Resolves rdar://150334077

artemcm added 2 commits May 1, 2025 12:49
The field is only used to store information to be used in finalize stage, in caching builds. When loading scan results from the cache, the entries are finalized already and have the file info encoded in CASIDs already.

Resolves rdar://150307865
This hash is also used for the dependency scanning hash. In both cases, PCH contents may differ based on whether a certain module they depend on is found in a system or non-system search path. In dependency scanning, systemness should cause a full change of scanning context requiring a from-scratch scan.

Resolves rdar://150334077
@artemcm
Copy link
Contributor Author

artemcm commented May 1, 2025

@swift-ci test

@artemcm artemcm marked this pull request as ready for review May 2, 2025 16:47
Copy link
Contributor

@cachemeifyoucan cachemeifyoucan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add a test to make sure multiple incremental scan works correctly?

I am hoping you can write a test that makes sure .swiftmoduledeps file stays the same if you do an incremental scanning from reusing the result.

Otherwise, the fix LGTM.

@artemcm
Copy link
Contributor Author

artemcm commented May 2, 2025

Is it possible to add a test to make sure multiple incremental scan works correctly?

I am hoping you can write a test that makes sure .swiftmoduledeps file stays the same if you do an incremental scanning from reusing the result.

I will think of such a test as a follow-up. I am thinking about using counters of actual new module lookup queries executed, to check that an incremental scan produces an identical output without actually doing much work.

@artemcm artemcm merged commit 6baa3de into swiftlang:main May 2, 2025
5 checks passed
@artemcm artemcm deleted the IncrementalScanUnlimitedGrowth branch May 2, 2025 17:01
@cachemeifyoucan
Copy link
Contributor

I will think of such a test as a follow-up. I am thinking about using counters of actual new module lookup queries executed, to check that an incremental scan produces an identical output without actually doing much work.

That is not enough. The bug you fixed actually produced the same output (because the field is not printed) and there are other things that can hide the wrong serialization from changing scanning output. I think it is important to test the serialization stays the same if the result is expected to be reused.

@artemcm
Copy link
Contributor Author

artemcm commented May 2, 2025

I will think of such a test as a follow-up. I am thinking about using counters of actual new module lookup queries executed, to check that an incremental scan produces an identical output without actually doing much work.

That is not enough. The bug you fixed actually produced the same output (because the field is not printed) and there are other things that can hide the wrong serialization from changing scanning output. I think it is important to test the serialization stays the same if the result is expected to be reused.

Good point. I think this would also involve a little bit of work to ensure the various data structures used are ordered and deterministic, which is worth doing. I will look into that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants