[Integration] main (005e0fb) -> swift/main #492
Merged
PCRE and ICU both support quoted sequences that don't have a terminating `\E`. Update the parsing to allow this. Additionally, allow empty quoted sequences outside of custom character classes, which is consistent with ICU. Finally, don't allow quoted sequences to span multiple lines in extended syntax literals.
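As a rough illustration of the quoted-sequence behavior described above, the sketch below compiles a terminated `\Q...\E` pattern and the same pattern without the closing `\E`. It assumes a Swift 5.7 or later toolchain; the variable names are made up for this example. The unterminated form is compiled with `try?` because a parser that predates this change would reject it.

```swift
// Everything between \Q and \E is taken literally, so "." only matches a period.
let terminated = try! Regex(#"\Qa.b\E"#)

// Same pattern without the closing \E. Compiled with `try?` since acceptance
// depends on the parser including the change described above.
let unterminated = try? Regex(#"\Qa.b"#)

let literalHit = "a.b".wholeMatch(of: terminated) != nil
let wildcardMiss = "axb".wholeMatch(of: terminated) == nil

// nil if the pattern was rejected, otherwise whether it matched literally.
let unterminatedHit = unterminated.map { "a.b".wholeMatch(of: $0) != nil }

print(literalHit, wildcardMiss, String(describing: unterminatedHit))
```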
rdar://92459215 has been fixed.
Use inits instead of as methods Add ARO tests
This change makes `Regex`, `RegexComponent`, and its component types `Sendable`. Regex stores a `Program` instance, which lazily lowers the DSLTree into a compiled program. Without synchronization, this lazy compilation is unsafe under concurrency. This change uses atomic initialization for the compiled program.
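The hazard with unsynchronized lazy compilation can be sketched with a small stand-in type. This is not the library's `Program`, and it uses a lock rather than the atomic initialization the change describes; `LazyCompiled` and `compileCount` are hypothetical names chosen for the example. The point is only that concurrent first accesses must agree on a single compiled value.

```swift
import Foundation
import Dispatch

/// Hypothetical stand-in for a lazily compiled program (not the library's type).
/// Uses a lock, not atomics, to keep the sketch short.
final class LazyCompiled<Value>: @unchecked Sendable {
  private let lock = NSLock()
  private var cached: Value?
  private let compile: () -> Value
  private(set) var compileCount = 0

  init(_ compile: @escaping () -> Value) {
    self.compile = compile
  }

  /// Returns the compiled value, compiling it on first access only.
  func get() -> Value {
    lock.lock()
    defer { lock.unlock() }
    if let value = cached { return value }
    compileCount += 1
    let value = compile()
    cached = value
    return value
  }
}

// Pretend "compilation" just measures the pattern string.
let box = LazyCompiled<Int> { "a+b*".count }

// Eight concurrent first accesses; only one of them should compile.
DispatchQueue.concurrentPerform(iterations: 8) { _ in
  _ = box.get()
}

print(box.get(), box.compileCount)
```

Without the lock, two threads could both observe `cached == nil` and race on the write, which is the kind of unsafety the commit message refers to.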
Obtain match output elements without materializing the output.
- Add a test where the capture transform produces a `Substring` from a `Substring`.
- Add a test where the capture transform wraps a `Substring` in an `Optional`.
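The two transform shapes these tests cover might look roughly like the following. This is an illustrative sketch against the `RegexBuilder` module in Swift 5.7 or later, not the tests added by the commit; the patterns and names are invented for the example.

```swift
import RegexBuilder

// Transform that maps a Substring to another Substring.
let trimmed = Regex {
  "id:"
  Capture {
    OneOrMore(.digit)
  } transform: { (digits: Substring) -> Substring in
    digits.dropFirst()
  }
}

// Transform that wraps the Substring in an Optional.
let wrapped = Regex {
  "id:"
  Capture {
    OneOrMore(.digit)
  } transform: { (digits: Substring) -> Substring? in
    Optional(digits)
  }
}

let trimmedValue: Substring? = "id:0042".wholeMatch(of: trimmed)?.output.1
let wrappedValue: Substring? = "id:0042".wholeMatch(of: wrapped).flatMap { $0.output.1 }

print(String(describing: trimmedValue), String(describing: wrappedValue))
```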
Work around rdar://94763190
Disable Prototypes to work around a CI failure
…custom types

* Track the whole match as an element of the "capture list" in the matching engine. Do so by emitting code as an implicit `capture` around the root node.
* No longer handle `matcher` as a special case within `capture` lowering, because the matcher can be arbitrarily nested within "output-forwarding" nodes, such as a `changeMatchingOptions` non-capturing group. Instead, make the bytecode emitter carry a result value so that a custom output can be propagated through any forwarding nodes.

  ```swift
  Regex {
    Capture(
      SemanticVersionParser()
        .ignoringCase()
        .matchingSemantics(.unicodeScalar)
    ) // This would not work previously.
  }
  ```

* Collapse DSLTree node `transform` into `capture`, because a transform can never be standalone (without a `capture` parent). This greatly simplifies `capture` lowering.
* Make the bytecode's capture transform use type `(Input, _StoredCapture) -> Any` so that it can transform any whole match, not just `Substring`. This means you can now transform any captured value, including a custom-consuming regex component's result!

  ```swift
  Regex {
    "version:"
    OneOrMore(.whitespace)
    Capture {
      SemanticVersionParser() // Regex<SemanticVersion>
    } transform: {
      // (SemanticVersion) -> SomethingElse
    }
  }
  ```

  The transforms of `Capture` and `TryCapture` are now generalized from taking `Substring` to taking generic parameter `W` (the whole match).
* Fix an issue where initial options were applied based solely on whether the bytecode had any instructions, failing examples such as `((?i:.))`. It now checks whether the first matchable atom has been emitted.
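To make the generalized transform concrete, here is a small self-contained sketch in the same spirit as the `SemanticVersionParser` snippets above. `IntParser` is a hypothetical custom-consuming component written for this example, not something from the PR. It assumes the `CustomConsumingRegexComponent` protocol as shipped in Swift 5.7 or later.

```swift
import RegexBuilder

/// Hypothetical custom component: consumes ASCII digits and yields an Int.
struct IntParser: CustomConsumingRegexComponent {
  typealias RegexOutput = Int

  func consuming(
    _ input: String,
    startingAt index: String.Index,
    in bounds: Range<String.Index>
  ) throws -> (upperBound: String.Index, output: Int)? {
    var end = index
    while end < bounds.upperBound, input[end].isASCII, input[end].isNumber {
      input.formIndex(after: &end)
    }
    // Fail the match if no digits were consumed or the value overflows Int.
    guard end > index, let value = Int(input[index..<end]) else { return nil }
    return (end, value)
  }
}

// The transform receives the component's Int output rather than a Substring.
let versionRegex = Regex {
  "version:"
  OneOrMore(.whitespace)
  Capture {
    IntParser()
  } transform: { (major: Int) -> String in
    "v\(major)"
  }
}

let label: String? = "version: 12".wholeMatch(of: versionRegex)?.output.1
let noDigits = "version: x".wholeMatch(of: versionRegex) == nil

print(String(describing: label), noDigits)
```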
Fully generalize "whole match" in the engine and enable transforming custom types
This change preserves the lazy atomic initialization, so using Regex will still be thread-safe by default, even without the annotation.
Add additional capture transform tests.
`buildEither` was removed from the regex builder DSL proposal. See swiftlang/swift-evolution#1634.
Parse, but diagnose in Sema
Remove (?X) and (?u) for now
Remove `buildEither`.
top level code is real weird, let's not talk about it
`^` and `$` should match the start and end of the callee, even if that callee is a substring. Currently, `^` and `$` instead match the start and end of the callee's base string. In addition, when replacing within a subrange, `^` and `$` should match only the start and end of the callee, not the start and end of the subrange.
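A minimal sketch of the substring case, assuming a Swift 5.7 or later toolchain that includes this fix. The strings and names are made up for illustration, and the subrange-replacement case is not covered here.

```swift
let base = "abc def ghi"
let start = base.index(base.startIndex, offsetBy: 4)
let end = base.index(start, offsetBy: 3)
let callee = base[start..<end]  // Substring "def", sharing storage with `base`

let anchored = try! Regex(#"^def$"#)

// "def" is neither at the start nor at the end of the base string.
let baseMatches = base.firstMatch(of: anchored) != nil

// With the fix, the anchors refer to the bounds of the substring itself.
let calleeMatches = callee.firstMatch(of: anchored) != nil

print(baseMatches, calleeMatches)
```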
This was caused by the fact that we'd walk into `expectUnicodeScalar` if we saw `\o`, but we only want to parse `\o{`. Instead, change it to be a `lex..` method, and bail if we don't lex a scalar.
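For context, the braced form `\o{...}` denotes a Unicode scalar written in octal. The short sketch below, assuming a Swift 5.7 or later toolchain, only exercises the braced form; how a bare `\o` is handled is left to the parser and is not shown.

```swift
// Octal 101 is decimal 65, the scalar for "A".
let octalA = try! Regex(#"\o{101}"#)

let matchesA = "A".wholeMatch(of: octalA) != nil
let matchesB = "B".wholeMatch(of: octalA) != nil

print(matchesA, matchesB)
```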
Add regex benchmarker
@swift-ci please test
rxwei
approved these changes
Jun 16, 2022