-
Notifications
You must be signed in to change notification settings - Fork 49
[5.7] Recover from parser errors #519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
hamishknight
merged 6 commits into
swiftlang:swift/release/5.7
from
hamishknight:totally-5.7
Jun 30, 2022
Merged
[5.7] Recover from parser errors #519
hamishknight
merged 6 commits into
swiftlang:swift/release/5.7
from
hamishknight:totally-5.7
Jun 30, 2022
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
c36e539
to
c6ad784
Compare
c6ad784
to
dfe3d67
Compare
stephentyrone
approved these changes
Jun 30, 2022
This stores both a source location, and has the ability to be `nil`, which is necessary to enable parser recovery in cases where we expect a number but parse something that e.g overflows.
Currently we use Swift error handling for parser errors. While this is convenient, it has a number of drawbacks: - Any AST parsed gets thrown away as soon as we encounter an error. This prevents clients from being able to get any useful information from invalid AST (rdar://93677069). - Multiple diagnostics cannot be issued, meaning that e.g a basic syntactic error could obscure a more useful semantic error. - It doesn't extend nicely to e.g warning diagnostics, meaning that we'd eventually end up with 2 ways of emitting diagnostics. - The thrown errors relied on `recordLoc` blocks to annotate them with source location info, which could lead to errors without location info if we forgot to add the appropriate `recordLoc` calls. Additionally, in some cases we want a more fine grained location info than the block would give us. Therefore this commit removes the use of Swift error handling throughout the parser. The parser is now a total function that _always_ returns an AST. If errors are encountered while parsing, they are recorded, and are attached to the resulting AST by the parser. The parser attempts to recover as much of the AST it can when encountering an error. As such, there is now are now `.invalid` atom and character property kinds. Sema then runs and can attach more diagnostics onto the AST. For now, the compiler interface remains the same, and we pick a single error to `throw`, but this will be changed in a later PR to allow multiple errors and warnings, as well as AST recovery. This also means we can better preserve the capture type in the presence of parser errors. Fortunately, in most cases, this is quite a mechanical transformation. It entails: - Moving the lexical analysis methods onto the `Parser`. We were already passing `ParsingContext` parameters for most of them, so it's not clear they were benefitting from the isolation that `Source` offered. Effectively this means that all parsing has access to the context and diagnostics. - Converting error throwing statements into calls to the parser's `error` method (or `unreachable` method for unreachables). This commit also updates the parser tests to be able to be able to match against multiple diagnostics.
Scan to the closing delimiter of an invalid identifier, and better diagnose an invalid text segment option.
We now always run validation, which is fine because the resulting AST can still be returned.
dfe3d67
to
c1c4e61
Compare
@swift-ci please test |
This was referenced Jun 30, 2022
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
5.7 cherry-pick of #481
Currently we use Swift error handling for parser errors. While this is convenient, it has a number of drawbacks:
Any AST parsed gets thrown away as soon as we encounter an error. This prevents clients from being able to get any useful information from invalid AST (rdar://93677069).
Multiple diagnostics cannot be issued, meaning that e.g a basic syntactic error could obscure a more useful semantic error.
It doesn't extend nicely to e.g warning diagnostics, meaning that we'd eventually end up with 2 ways of emitting diagnostics.
The thrown errors relied on
recordLoc
blocks to annotate them with source location info, which could lead to errors without location info if we forgot to add the appropriaterecordLoc
calls. Additionally, in some cases we want a more fine grained location info than the block would give us.Therefore this PR removes the use of Swift error handling throughout the parser. The parser is now a total function that always returns an AST. If errors are encountered while parsing, they are recorded, and are attached to the resulting AST by the parser. The parser attempts to recover as much of the AST it can when encountering an error. As such, there is now are now
.invalid
atom and character property kinds. Sema then runs and can attach more diagnostics onto the AST.For now, the compiler interface remains the same, and we pick a single error to
throw
, but this will be changed in a later PR to allow multiple errors and warnings, as well as AST recovery. This also means we can better preserve the capture type in the presence of parser errors.Fortunately, in most cases, this is quite a mechanical transformation. It entails:
Moving the lexical analysis methods onto the
Parser
. We were already passingParsingContext
parameters for most of them, so it's not clear they were benefitting from the isolation thatSource
offered. Effectively this means that all parsing has access to the context and diagnostics.Converting error throwing statements into calls to the parser's
error
method (orunreachable
method for unreachables).This PR also updates the parser tests to be able to be able to match against multiple diagnostics.
Part of the fix for rdar://93677069
Resolves #449