Closed
Description
On nightly 2021-03-05, with RUSTFLAGS="-Ccodegen-units=1 -Cinline-threshold=0 -Clink-dead-code", I am experiencing undefined behaviour which seems to involve char::encode_utf8.
I ran this code:
fn make_string(ch: char) -> String {
    let mut bytes = [0u8; 4];
    ch.encode_utf8(&mut bytes).into()
}

fn main() {
    let ch = '😃';
    dbg!(ch);
    let string = make_string(ch);
    dbg!(string);
}
I expected to see the following output:
[src/bin/string_crash.rs:8] ch = '😃'
[src/bin/string_crash.rs:10] string = "😃"
I get the above output on stable 1.50.0, or on the same nightly version if I remove at least one of the three RUSTFLAGS listed above.
With all three flags present, on nightly 2021-03-05 I see the following output:
[src/bin/string_crash.rs:8] ch = '😃'
memory allocation of 140730017967032 bytes failed
Aborted
The exact byte count varies, so this looks like UB to me.
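For reference, the expected behaviour can be written as explicit assertions. This is only a sketch: `check_round_trip` is a hypothetical helper that is not part of the original report, and `make_string` is repeated so the snippet compiles on its own. On an unaffected toolchain the assertions should pass. On the affected nightly with all three flags, the behaviour is undefined, so the assertions may or may not fire before the allocation failure occurs.

```rust
/// Same helper as in the report: encode `ch` into a 4-byte stack buffer,
/// then copy the resulting `&mut str` into an owned `String`.
fn make_string(ch: char) -> String {
    let mut bytes = [0u8; 4];
    ch.encode_utf8(&mut bytes).into()
}

/// Hypothetical helper (not from the report): checks that the string has
/// exactly as many bytes as the char's UTF-8 encoding and that it decodes
/// back to the same single char.
fn check_round_trip(ch: char) -> bool {
    let s = make_string(ch);
    s.len() == ch.len_utf8() && s.chars().eq(std::iter::once(ch))
}

fn main() {
    let ch = '😃'; // U+1F603, which needs four bytes in UTF-8
    let s = make_string(ch);

    // The string length must be 4, not an arbitrary huge value.
    assert_eq!(s.len(), 4);
    // The bytes must be the UTF-8 encoding of U+1F603.
    assert_eq!(s.as_bytes(), &[0xF0u8, 0x9F, 0x98, 0x83][..]);
    assert!(check_round_trip(ch));
}
```

A length other than `ch.len_utf8()` would be consistent with the huge allocation sizes in the output above, although the assertions alone do not show where the bad length comes from.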
Meta
rustc +nightly --version --verbose:
rustc 1.52.0-nightly (caca2121f 2021-03-05)
binary: rustc
commit-hash: caca2121ffe4cb47d8ea2d9469c493995f57e0b5
commit-date: 2021-03-05
host: x86_64-unknown-linux-gnu
release: 1.52.0-nightly
LLVM version: 12.0.0
Metadata
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
Category: This is a bug.
Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness
Bugs identified for the LLVM ICE-breaker group
Critical priority
Relevant to the compiler team, which will review and decide on the PR/issue.
Performance or correctness regression from stable to nightly.