Probable bug with the output of option_env!("NON_UNICODE_ENV_VAR") #122669

Closed
@beetrees


I tried this code:

fn main() {
    dbg!(option_env!("NON_UNICODE_ENV_VAR"));
}

compiled with NON_UNICODE_ENV_VAR=$'\xFF' rustc code.rs (using Bash shell syntax; any non-Unicode value for NON_UNICODE_ENV_VAR will work).

I expected to see this happen: a Some value, as the documentation currently states "If the named environment variable is present at compile time, this will expand into an expression of type Option<&'static str> whose value is Some of the value of the environment variable".

Instead, this happened: The program outputted:

[code.rs:2:2] option_env!("NON_UNICODE_ENV_VAR") = None

Currently, the documentation doesn't consider the possibility of a non-Unicode environment variable value (as only Unicode values can be expressed as a &'static str), and the implementation emits None, the same as for a non-existent environment variable.
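To make the representability point concrete, here is a minimal sketch showing that the byte used in the reproduction above (0xFF) cannot be turned into a &str at all. The helper name representable_as_str is hypothetical and exists only for illustration; it is not part of rustc or the standard library.

```rust
use std::str;

/// Hypothetical helper: returns true if `bytes` is valid UTF-8 and could
/// therefore be stored in a `&str`.
fn representable_as_str(bytes: &[u8]) -> bool {
    str::from_utf8(bytes).is_ok()
}

fn main() {
    // The single byte set by NON_UNICODE_ENV_VAR=$'\xFF' in the report.
    // 0xFF never occurs in valid UTF-8, so the conversion fails.
    println!("0xFF representable: {}", representable_as_str(&[0xFF]));

    // An ordinary ASCII value converts without problems.
    println!("\"hello\" representable: {}", representable_as_str(b"hello"));
}
```

Since no &'static str can hold such a value, Some(value) is not an available result for the macro in this case.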

There are three possibilities that I can think of:

  1. The current behaviour is a bug, and a compilation error should be emitted as it isn't possible to represent a non-Unicode value as a &'static str, and the documentation should also be updated to reflect this. Note that this would be a breaking change to the compiler itself, but would be allowed as it would be a bugfix.
  2. The current behaviour is not a bug, and the documentation should be updated to reflect the current behaviour.
  3. Same as 2, but the behaviour is unexpected enough to warrant a warn/deny-by-default lint.

Personally, I believe 1 to be the case, as the current behaviour is unexpected and leads to code patterns like let was_defined_at_compile_time = option_env!("VAR").is_some() silently returning incorrect results. There doesn't seem to be a "correct" value to return, as None is already taken by "does not exist" and Some(&str) cannot represent a non-Unicode value. It seems unlikely that anyone is relying on the current behaviour, given Rust's UTF-8-everywhere design, the fact that the same result can be achieved much more easily by just not setting the environment variable, and the platform-specific nature of what non-Unicode values can exist (Windows UTF-16 unpaired surrogates vs. Unix invalid UTF-8 bytes). A non-Unicode value therefore seems more likely to occur when someone has mistyped something in their shell or due to data corruption, and the current output would make that rather confusing to track down (as it would seem to say the variable was missing, rather than set incorrectly).
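For comparison, the runtime API std::env::var already distinguishes the three states that option_env! collapses into two. The sketch below is only a runtime analogue, not how the macro is implemented; the VarState enum and the classify function are hypothetical names introduced for illustration.

```rust
use std::env::{self, VarError};

/// Hypothetical three-way classification of an environment variable.
#[derive(Debug, PartialEq)]
enum VarState {
    /// The variable does not exist.
    Unset,
    /// The variable exists and its value is valid Unicode.
    Unicode(String),
    /// The variable exists but its value is not valid Unicode.
    NonUnicode,
}

/// Maps the result of `std::env::var` onto the three states.
fn classify(result: Result<String, VarError>) -> VarState {
    match result {
        Ok(value) => VarState::Unicode(value),
        Err(VarError::NotPresent) => VarState::Unset,
        Err(VarError::NotUnicode(_)) => VarState::NonUnicode,
    }
}

fn main() {
    // Read at run time, unlike option_env!, which reads at compile time.
    let state = classify(env::var("NON_UNICODE_ENV_VAR"));
    println!("{:?}", state);
}
```

An is_some()-style check on a two-state Option cannot tell Unset apart from NonUnicode, which is the ambiguity described above.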

Meta

rustc --version --verbose:

commit-hash: 07dca489ac2d933c78d3c5158e3f43beefeb02ce
commit-date: 2024-02-04
host: x86_64-unknown-linux-gnu
release: 1.76.0
LLVM version: 17.0.6

I believe the decision of what to do is down to libs-api, so

@rustbot label +T-libs-api

    Labels

    - A-macros: Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..)
    - C-bug: Category: This is a bug.
    - T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
    - T-libs-api: Relevant to the library API team, which will review and decide on the PR/issue.
