Documentation/Evolution/RegexSyntax.md
46 additions & 11 deletions
@@ -10,27 +10,56 @@ Hello, we want to issue an update to [Regular Expression Literals](https://forum
A regex declares a string processing algorithm using syntax familiar across a variety of languages and tools throughout programming history. We propose the ability to create a regex at run time from a string containing regex syntax (detailed here), API for accessing the match and captures, and a means to convert between an existential capture representation and concrete types.
-The overall story is laid out in [Regex Type and Overview](https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/Evolution/RegexTypeOverview.md) and each individual component is tracked in [Pitch and Proposal Status](https://github.com/apple/swift-experimental-string-processing/issues/107).
+The overall story is laid out in [Regex Type and Overview][overview] and each individual component is tracked in [Pitch and Proposal Status](https://github.com/apple/swift-experimental-string-processing/issues/107).
## Motivation
Swift aims to be a pragmatic programming language, striking a balance between familiarity, interoperability, and advancing the art. Swift's `String` presents a uniquely Unicode-forward model of string, but currently suffers from limited processing facilities.
-<!--
-... tools need run time construction
-... ns regular expression operates over a fundamentally different model and has limited syntactic and semantic support
-... we propose a best-in-class treatment of familiar regex syntax
--->
+`NSRegularExpression` can construct a processing pipeline from a string containing [ICU regular expression syntax][icu-syntax]. However, it is inherently tied to ICU's engine and thus it operates over a fundamentally different model of string than Swift's `String`. It is also limited in features and carries a fair amount of Objective-C baggage.
+
+```swift
+let pattern = #"(\w+)\s\s+(\S+)\s\s+((?:(?!\s\s).)*)\s\s+(.*)"#
+let nsRegEx = try! NSRegularExpression(pattern: pattern)

-The full string processing effort includes a regex type with strongly typed captures, the ability to create a regex from a string at runtime, a compile-time literal, a result builder DSL, protocols for intermixing 3rd party industrial-strength parsers with regex declarations, and a slew of regex-powered algorithms over strings.
+func processEntry(_ line: String) -> Transaction? {
+  let range = NSRange(line.startIndex..<line.endIndex, in: line)
+  guard let result = nsRegEx.firstMatch(in: line, range: range),
+        let kindRange = Range(result.range(at: 1), in: line),
+        let kind = Transaction.Kind(line[kindRange]),
+        let dateRange = Range(result.range(at: 2), in: line),
+        let date = try? Date(String(line[dateRange]), strategy: dateParser),
+        let accountRange = Range(result.range(at: 3), in: line),
+        let amountRange = Range(result.range(at: 4), in: line),
+Fixing these fundamental limitations requires migrating to a completely different engine and type system representation. This is the path we're proposing with `Regex`, outlined in [Regex Type and Overview][overview]. Details on the semantic mismatch between ICU and Swift's `String` are discussed in [Unicode for String Processing][pitches].
+
+Run-time construction is important for tools and editors. For example, SwiftPM allows the user to provide a regular expression to filter tests via `swift test --filter`.
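
As an illustration of the tooling use case, the sketch below filters a list of test names with a pattern supplied at run time. The test names and the `filterTests` helper are hypothetical and are not SwiftPM's implementation. The code uses the `Regex(_:)` initializer as shipped in Swift 5.7, whereas the pitch text spells the same operation `Regex(compiling:)`.

```swift
// Hypothetical test names for illustration; not taken from SwiftPM.
let testNames = [
    "RegexTests.testParse",
    "RegexTests.testCompile",
    "StringTests.testSplit",
]

/// Returns the names that contain a match for a user-supplied pattern,
/// or nil when the pattern is not valid regex syntax.
func filterTests(_ names: [String], matching userPattern: String) -> [String]? {
    // The pattern is only known at run time, so construction can fail.
    // Assumption: the Swift 5.7 spelling `Regex(_:)` stands in for the
    // pitch-era `Regex(compiling:)`.
    guard let regex = try? Regex(userPattern) else { return nil }
    return names.filter { $0.contains(regex) }
}

if let selected = filterTests(testNames, matching: #"^RegexTests\."#) {
    for name in selected {
        print(name)
    }
}
```

Returning `nil` for an invalid pattern keeps the error handling at the call site, where a tool would typically report the bad pattern back to the user.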
-This proposal specifically hones in on the _familiarity_ aspect by providing a best-in-class treatment of familiar regex syntax.
## Proposed Solution
-<!--
-... regex compiling and existential match type
--->
+We propose run-time construction of `Regex` from a best-in-class treatment of familiar regular expression syntax. A `Regex` is generic over its `Output`, which includes capture information. This may be an existential `AnyRegexOutput`, or a concrete type provided by the user.
+
+```swift
+let pattern = #"(\w+)\s\s+(\S+)\s\s+((?:(?!\s\s).)*)\s\s+(.*)"#
+let regex = try! Regex(compiling: pattern)
+// regex: Regex<AnyRegexOutput>
+
+let regex: Regex<(Substring, Substring, Substring, Substring, Substring)> =
+  try! Regex(compiling: pattern)
+```
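
To make the two output shapes concrete, here is a minimal sketch that runs the same pattern against a sample ledger line, once with the type-erased `AnyRegexOutput` and once with a tuple of `Substring`s. The sample line is an assumption made for illustration. The code uses the initializers as shipped in Swift 5.7 (`Regex(_:)` and `Regex(_:as:)`) rather than the `Regex(compiling:)` spelling used in the pitch.

```swift
let pattern = #"(\w+)\s\s+(\S+)\s\s+((?:(?!\s\s).)*)\s\s+(.*)"#

// Hypothetical input line; fields are separated by two or more spaces.
let line = "DEBIT     03/03/2022    Totally Legit Shell Corp     $2,000,000.00"

// Existential output: captures are reached by position and are optional.
let anyRegex = try! Regex(pattern)
let anyMatch = line.firstMatch(of: anyRegex)!
let kindErased = anyMatch.output[1].substring

// Concrete output: the capture structure is checked when the regex is
// constructed, and the match output is a plain tuple.
let typedRegex = try! Regex(
    pattern,
    as: (Substring, Substring, Substring, Substring, Substring).self
)
let typedMatch = line.firstMatch(of: typedRegex)!
let (_, kind, date, account, amount) = typedMatch.output

print(kindErased ?? "<no capture>")
print(kind, date, account, amount)
```

The typed form moves the "does this pattern have four captures" check to construction time, so code that consumes the match can destructure the tuple without optional handling.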
### Syntax
@@ -866,3 +895,9 @@ This proposal regards _syntactic_ support, and does not necessarily mean that ev
Documentation/Evolution/RegexTypeOverview.md
1 addition & 1 deletion
@@ -14,7 +14,7 @@ We propose addressing this basic shortcoming through an effort we are calling re
3. A literal for compile-time construction of a regex with statically-typed captures, enabling powerful source tools.
4. An expressive and composable result-builder DSL, with support for capturing strongly-typed values.
5. A modern treatment of Unicode semantics and string processing.
-6. A treasure trove of string processing algorithms, along with library-extensible protocols enabling industrial-strength parsers to be used seamlessly as regex components.
+6. A slew of regex-powered string processing algorithms, along with library-extensible protocols enabling industrial-strength parsers to be used seamlessly as regex components.
This proposal provides details on \#1, the `Regex` type and captures, and gives an overview of how each of the other proposals fits into regex in Swift.