Skip to content

Add new feature zlib-rs #1873

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 20, 2025
Merged

Add new feature zlib-rs #1873

merged 2 commits into from
Mar 20, 2025

Conversation

NobodyXu
Copy link
Contributor

@NobodyXu NobodyXu commented Mar 3, 2025

@NobodyXu NobodyXu force-pushed the zlib-rs branch 3 times, most recently from 438f5bf to 34889b9 Compare March 5, 2025 13:10
@NobodyXu
Copy link
Contributor Author

NobodyXu commented Mar 5, 2025

I need some help with the windows CI, not sure why it's pulling libz-sys and libz-ng-sys when it already has libz-rs

Other CIs, except for MSRV, are not related to this PR

@NobodyXu
Copy link
Contributor Author

NobodyXu commented Mar 6, 2025

Tried checking using cargo tree --workspace for x86_64-pc-windows-msvc, but no luck, it shows that flate2 only pulls in zlib-rs and miniz_oxide

I have no clue what's going on here

@NobodyXu
Copy link
Contributor Author

cc @Byron I might need some help from you, since you also maintain flate2

@emilazy
Copy link
Contributor

emilazy commented Mar 16, 2025

This would be great for Jujutsu. We’ve held off on defaulting to zlib-ng because of packaging concerns. It could even go in max-performance-safe, I think, since it shouldn’t require a C compiler, or even be the default.

I’m not sure what’s up with Windows, but it looks kind of like both the zlib and zlib-rs features are getting enabled at once? If there’s appetite for this change, I can try and find the time to dig into it further.

I think the justfile will need adjusting, also.

Please let us know what we can do to help move this along, @Byron :)

@@ -67,6 +66,8 @@ crc32 = ["dep:crc32fast"]
zlib = ["dep:flate2", "flate2?/rust_backend", "dep:thiserror"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be turning on rust_backend when using zlib-rs?

Copy link
Contributor Author

@NobodyXu NobodyXu Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rust_backend is another pure rust implementation that is much much slower than other backend, AFAIK it's only there as a fallback

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but the zlib-rs feature pulls in the zlib feature, and enables the other Rust backend, which we probably don’t want? I’m wondering if that might somehow be part of the CI problems.

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making this happen, and apologies for the late response.

I am still catching up with what accumulated over just 7 days, it's going to be a while.

Meantime, I think this PR needs changes, but nothing huge or in the way of a merge.
Unfortunately, I also don't know why 32bit ARM and Windows suffer from duplicate definitions. There is this one issue of the same happening on Linux, but there it seems to be about pulling in multiple versions of libz-sys.

To get this PR going, I think the default performance backend should remain zlib-ng, but jj for instance could configure their unix builds to use zlib-rs more easily in future.

@emilazy is cordially invited to help shape this PR so gitoxide would work better for jj.

What do you think?

@@ -32,7 +32,7 @@ gix-object = { version = "^0.47.0", path = "../gix-object" }
gix-path = { version = "^0.10.14", path = "../gix-path", optional = true }
gix-date = { version = "^0.9.3", path = "../gix-date" }

flate2 = { version = "1.0.33", optional = true }
flate2 = { version = "1.0.33", optional = true, default-features = false }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that default-features is a problem - these, i.e. the rust-backend are superseded by any other choice.

@@ -67,6 +66,8 @@ crc32 = ["dep:crc32fast"]
zlib = ["dep:flate2", "flate2?/rust_backend", "dep:thiserror"]
## Use the C-based zlib-ng backend, which can compress and decompress significantly faster.
zlib-ng = ["zlib", "flate2?/zlib-ng"]
## Use the rust-based zlib backend, which can compress and decompress significantly faster and is on par with zlib-ng or even faster.
zlib-rs = ["zlib", "flate2?/zlib-rs"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to see this here!

gix/Cargo.toml Outdated
## Note that some platforms might suffer from compile failures, which is when `max-performance-safe` should be used.
max-performance = ["max-performance-safe", "zlib-ng", "fast-sha1"]
max-performance = ["max-performance-safe", "zlib-rs", "fast-sha1"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to push my luck 😅, and would prefer to start by merely making the use of zlib-rs easy with the provided feature.

Copy link
Contributor Author

@NobodyXu NobodyXu Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe another feature then?

Yeah that seems fine, cargo-binstall could enable the feature separately.

In cargo-binstall we want to get rids of zlib-ng as we already pulled in zlib-rs, having zlib-ng actually makes cross compilation harder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Byron reverted this change

@emilazy
Copy link
Contributor

emilazy commented Mar 18, 2025

To get this PR going, I think the default performance backend should remain zlib-ng, but jj for instance could configure their unix builds to use zlib-rs more easily in future.

I agree that this is the conservative choice. However, in the future, I’d hope that zlib-rs could be the default even without an extra performance flag, as it won’t pull in any native code and seems like a clearly superior choice to the current default Rust implementation. In fact, the feature could probably be dropped entirely at that point, as the sha1 package seems to use intrinsics rather than depending on a C compiler: https://github.com/RustCrypto/hashes/blob/e6bc891b48f6bf304d12d248ef4d21b46ee11812/sha1/src/compress/aarch64.rs. So it seems to me that zlib-rs will enable us to have maximum performance without any C compiler dependency at all.

@Byron
Copy link
Member

Byron commented Mar 19, 2025

And there even is a test for that, so it should be straightforward to validate that indeed no C-compiler is needed anymore.

Also, I'd be happy to reach that goal, and if it can be achieved earlier, that's better, too.

For now it seems the easiest way to pacify CI is to undo the default backend change. This buys time for figuring out what's actually going on there.

My thinking is that ideally, flate2 won't have to rely on the C bindings of zlib-rs either, it's a Rust library and there shouldn't be conflicts on the symbol level. Even if something somewhere in the dependency graph uses the C version of Zlib, flate2 shouldn't be affected by it. In case this is feasible, CC @folkertdev .

@folkertdev
Copy link

I believe the problem is this

│   │   │   ├── gix-transport v0.45.0 (/tmp/gitoxide/gix-transport)
│   │   │   │   ├── base64 v0.22.1
│   │   │   │   ├── bstr v1.11.3 (*)
│   │   │   │   ├── curl v0.4.47
│   │   │   │   │   ├── curl-sys v0.4.80+curl-8.12.1
│   │   │   │   │   │   ├── libc v0.2.170
│   │   │   │   │   │   ├── libz-sys v1.1.21
│   │   │   │   │   │   │   └── libc v0.2.170

I suspect that is hard to work around (though using reqwest and rustls does sound like the better long-term solution than curl). So for the immediate future prefixing the symbols is the quickest solution.

We do want to provide a nice rust interface. The brave rustls folks already use our rust interface, but we basically break it on every release, so it's not something I'd recommend. As it stands, creating a good rust interface would require some funding.

@Byron
Copy link
Member

Byron commented Mar 19, 2025

Thanks for chiming in! This would mean zlib-rs will work either without transport support, or when building the pure-rust version that will use a reqwest powered HTTP transport implementation.
But it also means that it's just a matter of time until flate2 can start using the Rust interface :).
Regarding funding, zlib-rs would probably qualify for support by the Sovereign Tech Fund, I did some research myself, but realized that even managing funds and driving development according to some plan is a full-time job (at least at first).

@NobodyXu NobodyXu changed the title Use zlib-rs for max-performance Add new feature zlib-rs Mar 19, 2025
Also bump flate2 to 1.1.0 to make sure it support zlib-rs

Signed-off-by: Jiahao XU <[email protected]>
@NobodyXu NobodyXu requested a review from Byron March 19, 2025 13:50
@erikjee
Copy link

erikjee commented Mar 19, 2025

@Byron I'm the person at Trifecta Tech Foundation (where zlib-rs lives) blessed with fundraising duties.
Last summer we applied to Sovereign Tech for zlib-rs (and zstd) but that got rejected.
Applying together might have better chances, either at STA or another fund. Please let me know if you want to talk about that.

So far, funding zlib-rs has been difficult and only partly successful because everyone seems to like the free stuff better I haven't explained its relevance well enough.

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, it's ready to go now!

@erikjee Will you be at RustWeek, because I am considering to go. If I go, I think with the right amount of support ("You can do it" [patting shoulder]) I could finally go through with the application for gitoxide and just get it done while there. Applications might be closed already, but I'd hope they reopen in a couple of months.
Strangely, despite being in a good spot, the more successful gitoxide is, the less time I have to actually work on it, and funded outside help could definitely fix that. And yes, why not make zlib-rs a sub-project of that application.

@Byron Byron enabled auto-merge March 20, 2025 01:01
@Byron Byron merged commit 316f113 into GitoxideLabs:main Mar 20, 2025
37 of 39 checks passed
@erikjee
Copy link

erikjee commented Mar 20, 2025

@Byron

Will you be at RustWeek, because I am considering to go.

I'm one of the organizers :) So, yes!
Ready to pat you on the back etc. The unconf would be best, because I'll actually have some time. I think Folkert already sent you an email inviting you to join. I'll reply there.

@emilazy
Copy link
Contributor

emilazy commented Mar 20, 2025

We do want to provide a nice rust interface. The brave rustls folks already use our rust interface, but we basically break it on every release, so it's not something I'd recommend. As it stands, creating a good rust interface would require some funding.

Just a thought, but maybe you could provide a Rust API that is 1:1 with the C API first? That is, define the C API with non-#[extern] Rust definitions, and then have extern "C"s with the same signatures that just forward on the calls – or even just have one set of definitions with #[cfg_attr]s. That way, a Rust package like flate2 that wants to avoid polluting the symbol table can use the C API directly from Rust, without anyone getting blocked on API design concerns.

@NobodyXu
Copy link
Contributor Author

Just a thought, but maybe you could provide a Rust API that is 1:1 with the C API first?

I think this is the status quo?

@folkertdev
Copy link

No, we don't do this part

define the C API with non-#[extern] Rust definitions

That's an interesting idea, I'll have to think on that a bit

@emilazy
Copy link
Contributor

emilazy commented Mar 20, 2025

Not quite. If you look at https://docs.rs/libz-rs-sys/latest/libz_rs_sys/fn.deflate.html, it’s extern "C-unwind" and has #[export_name = "deflate"]. That is what results in the symbol conflict we see here. It could instead work like this:

pub unsafe fn deflate(strm: *mut z_stream, flush: i32) -> c_int {}

#[cfg(feature = "c-api")]
#[export_name = "deflate"]
pub unsafe extern "C-unwind" fn export_deflate(strm: *mut z_stream, flush: i32) -> c_int {
    deflate(strm, flush)
}

(Actually I guess this very slightly changes semantics, because &deflate won’t necessarily be the same function pointer seen from C. But it probably doesn’t matter too much. You could also keep the extern but just #[cfg_attr(not(feature = "c-api"), export_name = "zlib_rs_deflate")], though that’s a bit of a hack.)

@emilazy
Copy link
Contributor

emilazy commented Mar 20, 2025

Hmm, actually it looks like there’s already a custom-prefix feature! But of course that has the problem of breaking crates which do need the C API because of features being additive, and needing wiring up externally by packaging. Still, that means it should be technically possible to build something that uses both curl and zlib-rs already. But I think my proposal of splitting the exports (or just using a macro to conditionalize the extern on a crate feature) is probably best.

@folkertdev
Copy link

we've added a export-symbols feature flag now (trifectatechfoundation/zlib-rs#322), which will be part of the next release. Thanks @emilazy for that suggestion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants