Skip to content

Support NS/CFURL re-core in Swift #1238

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 12, 2025
Merged

Conversation

jrflat
Copy link
Contributor

@jrflat jrflat commented Apr 4, 2025

Changes to allow NSURL and CFURL to use the Swift URL implementation. With this refactoring, the behavior of URL should be identical, except for a few minor compatibility/bug fixes to the parser and URL functions, revealed by NS/CFURL test failures with the re-core enabled. The overall architecture has a bit of complexity in order to support:

  • file reference NSURLs, but not URLs,
  • subclassed NSURLs, and
  • the ability to bridge a NSURL to Swift while maintaining the same object pointer returned by functions like .absoluteURL (returning self) and .baseURL, which is the previous behavior, and one I believe could have implications for reference counting in ObjC/C land.

This PR changes struct URL to wrap a single class that implements each of its methods. In FOUNDATION_FRAMEWORK, the inner class types of URL conform to a new _URLProtocol. Outside FOUNDATION_FRAMEWORK, only _SwiftURL is used, so the protocol is not needed.

Note: Except for baseURL, a nil URL? return value in the protocol means that struct URL should return self.


class _SwiftURL provides the new Swift implementation for URL, using the same parser and URLParseInfo as URLComponents, but with a few compatibility behaviors.


class _BridgedURL wraps an NSURL reference. Its methods use the old implementations, which call directly into NSURL methods. _BridgedURL is used when an NSURL subclass is bridged to Swift, allowing us to 1) return the same subclass object when bridging back to ObjC and 2) call methods that are overridden by the NSURL subclass like we did before. _BridgedURL also helps us to easily compare the results of the old and new URL implementation.

Note: If the NSURL subclass does not override a method, NSURL will call into the underlying _SwiftURL implementation.


class _BridgedNSSwiftURL wraps an _NSSwiftURL, which is the Swift subclass of NSURL. _BridgedNSSwiftURL is used when an _NSSwiftURL is bridged to Swift, allowing us to return the same object (pointer) when bridging back to ObjC, such as in cases where .absoluteURL should return self, or .baseURL should return a pointer to the same NSURL from initialization. At the same time, this still allows us to use the new _SwiftURL for NSURLs bridged to Swift.

@jrflat
Copy link
Contributor Author

jrflat commented Apr 4, 2025

@swift-ci please test

@jrflat
Copy link
Contributor Author

jrflat commented Apr 4, 2025

@swift-ci please test

@FranzBusch
Copy link
Member

I understand the goals of this change and I haven't looked at the detailed implementation but the following sounds problematic to me:

This PR changes struct URL to wrap a single class that implements each of its methods

URL is a very common currency type used all over the place. Backing it with a class which makes it unconditionally allocate is going to result in serious performance problems and will make it problematic to use in a lot of server use-cases.

Since this PR also introduces differences between platforms. Is there any resulting difference in top-level-API behaviour between platforms?

#if FOUNDATION_FRAMEWORK
private static var _type: any _URLProtocol.Type {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the performance impacts of this and other existential on Darwin platforms?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

URL-perf-after-nsurl-recore

I tested the URL benchmarks in this repo before and after the change. Creating a URL from a valid URL string, e.g.

blackHole(URL(string: validURLString))

is 47% faster, partly since we now lazily create an NSURL if we're actually going to bridge instead of up front.

Interestingly, getting encoded components, e.g.

blackHole(encodedURL.scheme)
blackHole(encodedURL.user())
blackHole(encodedURL.password())
blackHole(encodedURL.host())
blackHole(encodedURL.path())
blackHole(encodedURL.query())
blackHole(encodedURL.fragment())

is 195% faster now, even with just calling nearly identical code on the stored any _URLProtocol & AnyObject. So in these cases, I think the existential may have performance benefit over the old architecture.

Parsing an invalid port URL to return nil is likely regressed in part due to an additional compatibility check for URL to allow an empty port. Maybe we're also allocating the class before we know we're just going to return nil? Regardless, parsing an invalid URL string is probably something we don't need to optimize for.

@jrflat
Copy link
Contributor Author

jrflat commented Apr 7, 2025

URL is a very common currency type used all over the place. Backing it with a class which makes it unconditionally allocate is going to result in serious performance problems and will make it problematic to use in a lot of server use-cases.

URL has always been backed by some form of class. Before this, it was backed by a relative and base class URLParseInfo. Originally, I tried to make URLParseInfo a struct for this use-case, but that caused compatibility problems since the size of URL went from 8 bytes (backed by an NSURL) to ~400 bytes in order to accommodate all the optional component ranges. Apps that used highly nested structs and enums with URLs suddenly began crashing since these types would now overflow the stack.

We could try storing some ranges/data directly in the struct URL, but unless there are changes around Swift's handling of large nested types on the stack (which there could have been since my original testing), I think it's a bit risky.

Since this PR also introduces differences between platforms. Is there any resulting difference in top-level-API behaviour between platforms?

There's no differences in behavior for URL APIs other than the compatibility/link-check behaviors for Darwin that were already in place. There is code related to NSURL bridging and file reference URL resolution, which are Darwin-only.

@jrflat
Copy link
Contributor Author

jrflat commented Apr 8, 2025

@swift-ci please test

@jrflat
Copy link
Contributor Author

jrflat commented Apr 8, 2025

@swift-ci please test

@jrflat
Copy link
Contributor Author

jrflat commented Apr 11, 2025

@swift-ci please test

@jrflat
Copy link
Contributor Author

jrflat commented Apr 11, 2025

@swift-ci please test

@jrflat jrflat requested review from parkera and jmschonfeld April 11, 2025 22:44
@parkera
Copy link
Contributor

parkera commented Apr 11, 2025

Thank you!

@jrflat
Copy link
Contributor Author

jrflat commented Apr 11, 2025

Will merge after Swift smoke tests pass with this branch at swiftlang/swift#75354

@jrflat jrflat merged commit 914b9f7 into swiftlang:main Apr 12, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants