Skip to content

Untangle _RegexParser from RegexBuilder #299

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

natecook1000
Copy link
Member

With the change to make _StringProcessing import _RegexParser as implementation only, no AST types can be used by the RegexBuilder module. This adds wrapper types in _StringProcessing to hide those types, particularly as associated values in DSLTree.Node. It also provides a wrapper type for DSLTree.Node that conforms to _TreeNode for the purpose of building up the tree's capture structure. (It sure would be nice if implementation-only-imported protocols were just treated as internal.)

@natecook1000 natecook1000 requested a review from rxwei April 19, 2022 06:01
@natecook1000
Copy link
Member Author

@swift-ci Please test

@natecook1000 natecook1000 merged commit 182da3b into swiftlang:main Apr 19, 2022
@natecook1000 natecook1000 deleted the untangle_regex_parser_builder branch April 19, 2022 16:33
let amount = amount._patternBase
let kind = kind._patternBase
let amount = amount.ast._patternBase
let kind = (kind.ast ?? .eager)._patternBase
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this depend on options?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does in two ways:

  • If this quantifier was written in the DSL, it can have a default value that gets drawn from the current options
  • If this quantifier was written in syntax, .eager and .reluctant can get flipped, depending on the options (the AST is just the syntax)

We defer both of those effects until byte code generation — we'd need to maintain option state in a similar way here.

case .default:
return ".eager"
}
(ast ?? .eager)._patternBase
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this depend on options?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, see above.

@@ -134,6 +142,12 @@ extension DSLTree {
self.isInverted = isInverted
}

public static func generalCategory(_ category: Unicode.GeneralCategory) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this API? What proposal is it in?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in the Unicode for String Processing proposal — we should provide more ways of making character classes over Unicode properties.

extension DSLTree {
/// Presents a wrapped version of `DSLTree.Node` that can provide an internal
/// `_TreeNode` conformance.
struct _Tree: _TreeNode {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for? Why isn't this DSLTree?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more or less a compiler limitation workaround. _TreeNode is a protocol defined in _RegexParser, and since DSLTree is public (as SPI), its conformance to _TreeNode is also public. However, _RegexParser is imported as implementation only, so its symbols can't/shouldn't be included in the public interface of _StringProcessing. So this type basically provides that DSLTree: _TreeNode conformance by composition, and since it's internal, it doesn't cause the issue with exported implementation-only symbols.

}

@_spi(RegexBuilder)
public enum _AST {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the AST relevant at all to DSLTree? Is it because we don't have or want to have versions of these enums?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the things that we hadn't duplicated into the DSLTree layer, so we were just storing AST symbols in DSLTree nodes. My goal with this change was just to wrap those, not necessarily replace them wholesale. At some point it would be great to have better separation between the two modules.

@@ -294,7 +298,17 @@ extension _CharacterClassModel: CustomStringConvertible {
}

extension _CharacterClassModel {
public func makeAST() -> AST.Node? {
public func makeDSLTreeCharacterClass() -> DSLTree.CustomCharacterClass? {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this public? Should it be SPI?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_CharacterClassModel is SPI, so this is as well — I should probably at least add a comment so it isn't jarring to see this in an extension.

natecook1000 added a commit to natecook1000/swift-experimental-string-processing that referenced this pull request May 12, 2022
This makes the changes necessary for _RegexParser to be imported as an
implementation-only dependency. The change provides _StringProcessing wrappers
for all `AST` types that need to be publicly visible via SPI, and a
DSLTree.Node wrapper for internal conformance to _TreeNode.

Co-authored-by: Richard Wei <[email protected]>
natecook1000 added a commit to natecook1000/swift-experimental-string-processing that referenced this pull request May 16, 2022
This makes the changes necessary for _RegexParser to be imported as an
implementation-only dependency. The change provides _StringProcessing wrappers
for all `AST` types that need to be publicly visible via SPI, and a
DSLTree.Node wrapper for internal conformance to _TreeNode.

Co-authored-by: Richard Wei <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants