Skip to content

The kstring integration in gix-attributes is unsound #1460

Closed
@ssbr

Description

@ssbr

Current behavior 😯

gix-attributes (in state::ValueRef) unsafely creates a &str from a &[u8] containing non-UTF8 data, with the justification that so long as nothing reads the &str and relies on it being UTF-8 in the &str, there is no UB:

// SAFETY: our API makes accessing that value as `str` impossible, so illformed UTF8 is never exposed as such.

The problem is that the non-UTF8 str is exposed to outside code: first to the kstring crate itself, which requires UTF-8 in its documentation and may have UB as a consequence of this, but also to serde, where it propagates to e.g. serde_json, serde_yaml, etc., where the same problems occur.

As far as I know, this is not sound, and either is or can cause UB down the line in these places that can view the &str.

Expected behavior 🤔

I think gix-attributes should probably use a Vec<u8> or smallvec or similar, at least until kstring can support bytes.

Absolute worst case, one could at least add an unsafe feature, so that code that opts out of unsafe can get the slower Vec<u8>-based implementation which is guaranteed to be sound.

Git behavior

N/A

Steps to reproduce 🕹

N/A

Metadata

Metadata

Assignees

Labels

acknowledgedan issue is accepted as shortcoming to be fixed

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions