[Integration] main (e87149a) -> swift/main #521
Merged
This shouldn't include e.g. `namedCapturesOnly`.
This should always be set in a multi-line literal, with extended syntax potentially being set and unset as we parse.
Relax the ban on unsetting extended syntax in a multi-line literal so that it does not apply to a scoped unset, e.g. `(?-x:...)`, as long as the scope does not span multiple lines. This commit also bans the use of `(?^)` in a multi-line literal, unless it is scoped and does not span multiple lines. Instead, `(?^x)` must be written, as PCRE defines `(?^)` to be equivalent to `(?-imnsx)`.
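A minimal sketch of what a scoped unset means in practice. It assumes a run-time `Regex` built from a string with `(?x)` behaves like the extended mode of a multi-line literal; the single-line pattern is an illustration only and does not exercise the multi-line ban itself.

```swift
// Extended syntax is on for the whole pattern, so the bare spaces are ignored.
// Inside the scoped unset `(?-x: - )` whitespace is literal again, so the
// group matches exactly " - ".
let scoped = try Regex(#"(?x) \d+ (?-x: - ) \d+"#)

let spaced = try scoped.wholeMatch(in: "12 - 34") != nil
let unspaced = try scoped.wholeMatch(in: "12-34") != nil

print(spaced, unspaced)
```

Because the unset is scoped to the group, extended syntax resumes after the closing parenthesis, which is why the trailing ` \d+` still ignores its leading space.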
* Add a `clearThrough` instruction
  This will let us fix lookahead assertions that have leftover save points in the subpattern on success, and also allow us to implement atomic groups.
* Fix lookaheads with quantifiers
  On success, the subpatterns in lookaheads like `(?=.*e)` had a save point that persisted, causing the logic in the lookahead group to be invalid.
* Implement atomic non-capturing group support
  In addition to the `(?>...)` syntax, this is what's underneath `Local`.
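The behaviour these commits aim for can be sketched with run-time regexes. The patterns and inputs below are illustrative and not taken from the PR's test suite.

```swift
// Lookaheads containing quantifiers: each assertion is checked at the same
// position and must not leave backtracking state behind once it succeeds.
let lookahead = try Regex(#"(?=.*e)(?=.*o)(?!.*z)."#)
let input = "hello"
let firstChar = input.firstMatch(of: lookahead).map { String(input[$0.range]) }

// Atomic group: once `(?>a+)` has consumed every "a", the engine may not
// backtrack into the group to give one back for the trailing "ab".
let greedy = try Regex(#"a+ab"#)
let atomic = try Regex(#"(?>a+)ab"#)
let greedyMatches = "aaab".firstMatch(of: greedy) != nil
let atomicMatches = "aaab".firstMatch(of: atomic) != nil

print(firstChar ?? "no match", greedyMatches, atomicMatches)
```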
* Allow CustomConsuming types to match w/ zero width
  We previously asserted if a custom consuming type matches with zero width, but that isn't necessary or good. A custom type can implement a lookaround assertion or act as a tracer.
* Rename `Processor.advance(to:)` to `resume(at:)`
  Since the given index doesn't need to advance, this name is less misleading.
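As a rough illustration of a zero-width custom component, the sketch below implements a lookahead-style check. `NextIsDigit` is a hypothetical type invented for this example; only the `CustomConsumingRegexComponent` protocol and its `consuming(_:startingAt:in:)` requirement come from the library.

```swift
import RegexBuilder

// Hypothetical component: succeeds only when the next character is a digit,
// but consumes nothing (it returns the index it was given).
struct NextIsDigit: CustomConsumingRegexComponent {
  typealias RegexOutput = Substring

  func consuming(
    _ input: String,
    startingAt index: String.Index,
    in bounds: Range<String.Index>
  ) throws -> (upperBound: String.Index, output: Substring)? {
    guard index < bounds.upperBound, input[index].isNumber else { return nil }
    return (upperBound: index, output: input[index..<index])
  }
}

let idBeforeDigit = Regex {
  "id"
  NextIsDigit()
}

let hit = "id7".firstMatch(of: idBeforeDigit)?.output
let miss = "idx".firstMatch(of: idBeforeDigit)?.output

print(hit ?? "no match", miss ?? "no match")
```

The match covers only "id", since the custom component contributes no characters of its own.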
This prepares for adopting an opaque result type for matches(of:) and ranges(of:). The old, CollectionConsumer-based model moves index-by-index, and isn't aware of the regex's semantic level, which results in inaccurate results for regexes that match at a mid-character index.
20x perf speedup in the "BasicBacktrack" benchmarks.
* Re-use the same executor, remember semantic mode.
  Gives around a 20% perf improvement to first-match style benchmarks.
* Remove history preservation
  Cuts down on memory usage and avoids some ARC overhead. ~20% gains on "AllMatches" and related benchmarks.
* Lower-level matchSeq
  Avoid collection algorithms inside matchSeq, which are liable to add ARC and inefficiencies. Results in a 3x improvement to ReluctantQuantWithTerminal.
Gives a 7x improvement to firstMatch-style benchmarks like "FirstMatch", 2-3x to CSS and basic backtracking benchmarks. Thanks to @rctcwyvrn for the original code.
Currently, the unary regex component builder simply forwards the component's base type. However, this is inconsistent with non-unary builder results. The current behavior may lead to surprising results when the user marks a property with `@RegexComponentBuilder`.

This patch makes `RegexComponentBuilder.buildPartialBlock<R>(first: R)` return a `Regex<R.RegexOutput>` rather than `R` itself.

---

Before:

```swift
// error: cannot convert value of type 'OneOrMore<Substring>' to specified type 'Regex<Substring>'
@RegexComponentBuilder
var r: Regex<Substring> {
  OneOrMore("a")
  // Adding other components below will make the error go away.
}

struct MyCustomRegex: RegexComponent {
  // error: cannot convert value of type 'OneOrMore<Substring>' to specified type 'Regex<Substring>'
  var regex: Regex<Substring> {
    OneOrMore("a")
  }
}
```

After: No errors.
Make unary builder return `Regex` type consistently
* [benchmark] Add no-capture version of grapheme breaking exercise
* [benchmark] Add cross-engine benchmark helpers
* [benchmark] Hangul Syllable finding benchmark
* Avoid double execution by avoiding Array init
* De-genericize processor, engine, etc.
  Provides only modest performance improvements (it was already getting specialized), but makes it possible to add String-specific specializations.
* Add debug mode
* Fix typo in css regex
* Add HTML benchmark
* Add email regex benchmarks
* Add save/compare functionality to the benchmarker
* Clean up compare and add cli flags
)" (swiftlang#507) This reverts commit e0af639.
oops Repeat does not get to participate in inline fix tests
Handle atoms as things to be wrapped in One
[Printer] Unconditionally print a regex block for concatenations
This separates the two different ideas for boundaries in the base input:

- subjectBounds: These represent the actual subject in the input string. For a `String` callee, this will cover the entire bounds, while for a `Substring` these will represent the bounds of the substring in the base.
- searchBounds: These represent the current search range in the subject. These bounds can be the same as `subjectBounds` or a subrange when searching for subsequent matches or replacing only in a subrange of a string.

* firstMatch shouldn't update searchBounds on iteration
  When we move forward while searching for the first match, the search bounds should stay the same. Only the currentPosition needs to move forward. This will allow us to implement the `\G` start of match anchor, with which `/\Gab/` matches "abab" twice, compared with `/^ab/`, which only matches once.
* Make matches(of:) and ranges(of:) boundary-aware
  With this change, RegexMatchesCollection keeps the subject bounds and search bounds separately, modifying the search bounds with each iteration. In addition, the replace methods that only operate on a subrange can specify that specifically, getting the correct anchor behavior while only matching within a portion of a string.
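The distinction between subject bounds and search bounds shows up in how anchors behave. The following is a small sketch using the public string APIs. It deliberately avoids `\G`, which the text above describes as not yet implemented, and the inputs are made up for illustration.

```swift
let anchored = try Regex("^ab")
let unanchored = try Regex("ab")

// The search position moves forward between matches, but `^` is still
// measured against the start of the subject, so it can only match once.
let anchoredCount = "abab".matches(of: anchored).count
let unanchoredCount = "abab".ranges(of: unanchored).count

// For a Substring, the subject is the substring itself rather than its base,
// so `^` can match where the substring begins.
let base = "xxab"
let tail = base.dropFirst(2)
let tailMatches = tail.firstMatch(of: anchored) != nil
let baseMatches = base.firstMatch(of: anchored) != nil

print(anchoredCount, unanchoredCount, tailMatches, baseMatches)
```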
* [benchmark] Add no-capture version of grapheme breaking exercise
* [benchmark] Add cross-engine benchmark helpers
* [benchmark] Hangul Syllable finding benchmark
* Add debug mode
* Fix typo in css regex
* Add HTML benchmark
* Add email regex benchmarks
* Add save/compare functionality to the benchmarker
* Clean up compare and add cli flags
* Make fixes
* oops, remove some leftover code
* Fix linux build issue + add cli option for specifying compare file
* Add benchmarks

Co-authored-by: Michael Ilseman <[email protected]>
- Space out the names properly instead of relying on tabs
- Add a decimal point to the percentage
- Filter out NS benchmarks from the comparison
- Sort comparisons by amount of improvement/regression (by seconds, not by percentage, because we have lots of variance and low-runtime benchmarks)
We can do the semantic members check up-front.
Tighten up validation of character class range operands such that we reject quotes and custom character classes. This includes rejecting syntax that would be a subtraction in .NET. We throw a custom error that suggests using `--` instead.
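A short sketch of the `--` subtraction form that the error message points users toward. The character classes here are illustrative. The .NET-style spelling is attempted with `try?` and only its outcome is reported, since whether it is rejected depends on the toolchain including this change.

```swift
// Lowercase ASCII letters minus the vowels, written with `--`.
let consonant = try Regex("[[a-z]--[aeiou]]")
let matchesB = "b".wholeMatch(of: consonant) != nil
let matchesA = "a".wholeMatch(of: consonant) != nil

// .NET-style subtraction uses a custom character class as a range operand.
// Per the commit message above this should now fail to parse.
let dotNetStyle = try? Regex("[a-z-[aeiou]]")

print(matchesB, matchesA, dotNetStyle == nil ? "rejected" : "accepted")
```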
@swift-ci please test
No description provided.