Skip to content

Regression in rlua crate tests between rustc 1.23.0 and 1.24.0, only on windows. #48251

Closed
@kyren

Description

@kyren

[The following explanation of the problem contains a lot of assumptions about the problem's nature, I could be extremely wrong! Take this explanation with a grain of salt]

Wild guess, it has something to do with #46833.

So, I'm not sure this is a rust bug exactly, but I wanted to make you aware of a crate regression that's causing me quite a lot of problems. Ultimately, I believe this comes from me doing some things that are not.. exactly definitely kosher, but that I don't have a convenient workaround for.

Ultimately, the problem stems from the Lua C API having error handling fundamentally based around setjmp / longjmp. In order to write a usable wrapper in Rust, without having to write inconvenient, slow C shims, Rust must, at times, call into Lua C API functions that in turn trigger a longjmp, and that longjmp must necessarily pass over Rust frames.

I realize how unsafe this is! However, there are some limitations here that I think make it at least somewhat reasonable to be able to do:

  1. There are only Copy types on the rust stack frame being jumped over.
  2. Rust is never calling setjmp, only triggering a longjmp through a Lua C API call (such as lua_error).
  3. Making this unsafe to do means that it is impossible to wrap a C API such as Lua's without resorting to writing more C.

This last point is unfortunate, and causes me a lot of headaches, and at least one strange practical problem. We at Chucklefish build rlua for consoles, but for consoles the gcc crate does not work, and we have to build Lua out of tree (also to include Lua console specific patches). Needing to include C shims in rlua means that we would also have to copy paste those shims into the console projects, and this is a less than ideal situation.

The reason I think this issue has to do with #46833 is that I can actually fix the regression if I simply add an #[unwind] attribute to each rust function that calls lua_error. However, this is only necessary for some reason on windows, and the attribute is currently unstable!

I think it's completely reasonable to need to mark functions as #[unwind] if they will trigger a longjmp, but I'm not exactly sure what to do in the meantime until unwind_attributes becomes stable. Maybe this behavior on windows IS simply a bug? Maybe it's a bug on linux / macos that this doesn't abort? I would especially not like to have to write C shims if ultimately I can get rid of them when unwind_attributes becomes stable.

(Side Note: Chucklefish actually uses the stable rust compiler in production now, so this is especially annoying. We were very excited about the 1.24 release, because it includes incremental compilation and rustfmt-preview while also being stable, and we noticed this bug in trying to switch to the stable compiler on windows.)

Edit: If people have to dig into the rlua source, I'm sorry, it is a tangled nest of "unsafe, unsafe everywhere". The best example to look at is the test_error test, and the method Lua::create_callback_function. Adding the #[unwind] attribute to: callback_call_impl, safe_pcall, and safe_xpcall fixes that test on windows, on unstable.

Edit 2: Edited for clarity, but also I need to point out that I need to do something, because this bug currently makes rlua just basically broken on windows. It's not "just" a test failure, error handling in general will just abort.

Metadata

Metadata

Assignees

Labels

C-bugCategory: This is a bug.O-windowsOperating system: WindowsP-highHigh priorityT-compilerRelevant to the compiler team, which will review and decide on the PR/issue.regression-from-stable-to-stablePerformance or correctness regression from one stable version to another.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions