[clang][modules] Fix filesystem races in ModuleManager
#131354
Draft
+99
−140
The `ModuleManager` uses `FileEntry` objects to uniquely identify module files. This requires first consulting the `FileManager` (and therefore the file system) when loading PCM files. This is problematic: the lookup may load a different PCM file from the one already in the `InMemoryModuleCache`, which then fails the size and mtime checks.

This PR changes things so that module files are identified by their file system path. This removes the need to know the file entry up front and allows us to fix the race. The downside is that we no longer get the `FileManager` inode-based uniquing, meaning symlinks in the module cache no longer work. I think this is fine, since Clang itself never creates symlinks in that directory. Moreover, we already had to work around filesystems recycling inode numbers, so this actually seems like a win to me (and resolves a long-standing FIXME-like comment).

Note that this change also requires that all places calling into `ModuleManager` from within a single `CompilerInstance` use consistent paths to refer to module files. This might be problematic if there are concurrent Clang processes operating on the same module cache directory but with different spellings of its path: the `IMPORT` records in PCM files will be valid and consistent within a single process, but may break uniquing in another process. We might need to do some canonicalization of the module cache paths to avoid this issue.
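
To illustrate the path-spelling concern, here is a minimal standalone sketch. It is not code from this PR: `ModuleRegistry`, `lookupOrAdd`, and `canonicalKey` are hypothetical names, and `std::filesystem` stands in for whatever path handling Clang would actually use. It shows how a table keyed by path string treats two spellings of the same file as distinct, and how purely lexical normalization could unify them.

```cpp
#include <filesystem>
#include <map>
#include <string>

// Hypothetical stand-in for a module table keyed by path rather than by
// FileEntry. This is an illustration only, not ModuleManager's real layout.
struct ModuleRegistry {
  std::map<std::string, int> Modules; // path key -> module index

  // Returns the index for Key, assigning the next free index on first use.
  int lookupOrAdd(const std::string &Key) {
    int NextIndex = static_cast<int>(Modules.size());
    auto Result = Modules.try_emplace(Key, NextIndex);
    return Result.first->second;
  }
};

// Assumed canonicalization: lexical only. It collapses "." and ".."
// components but never touches the file system, so symlinks are not resolved.
std::string canonicalKey(const std::string &Path) {
  return std::filesystem::path(Path).lexically_normal().generic_string();
}
```

With raw keys, `/cache/mods/A.pcm` and `/cache/./mods/../mods/A.pcm` get two different indices; passing both through `canonicalKey` first maps them to the same entry. Because the normalization is lexical, it avoids reintroducing a file system query at lookup time, at the cost of not unifying paths that differ only through symlinks.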