Skip to content

Reading a CString safely without overhead from Read #59229

Closed
@DevQps

Description

@DevQps

Hello everyone,

When I was inspecting libflate for unsafe code I found this piece of code:

fn read_cstring<R>(mut reader: R) -> io::Result<CString>
where
    R: io::Read,
{
    let mut buf = Vec::new();
    loop {
        let b = reader.read_u8()?;
        if b == 0 {
            return Ok(unsafe { CString::from_vec_unchecked(buf) });
        }
        buf.push(b);
    }
}

The Problem

If I am correct and didn't miss anything this function is completely safe. However because there is (for as far as I know) no functionality that can safely read a CString without performance overhead, the author probably felt forced to implement it himself.

I think reading CStrings from an object that implements Read is a pretty common operation when handling binary files, so it might be good to provide functionality in the standard library for doing so without sacrificing performance.

Currently two solutions have been presented.

Solution 1: Add a CString::from_reader method

One solution would be to add a CString::from_reader method as follows:

fn CString::from_reader(mut reader: impl Read) -> Result<CString, std::io::Error>
{
    let mut buffer = Vec::new();
    let mut character: u8 = 0;
    
    loop {
        let slice = std::slice::from_mut(&mut character);
        reader.read_exact(slice)?;
    
        // Check if a null character has been found, if so return the Vec as CString.
        if character == 0 {
            return Ok(unsafe { CString::from_vec_unchecked(buffer) });
        }
            
        // Push a new non-null character.
        buffer.push(character);
    }
}

Pro's and Con's:

  • Pro: CStrings can be read using a simple one liner if the source being read implements Read. &[u8] also implements Read so I think this is already quite flexible.
  • Con: It only works when reading from objects that impl Read. I am not sure if there are any scenario's on which this would not be sufficient?

-> Playground example

Solution 2: Add way to convert from Vec<NonZeroU8> to CString

Credits to @alex

Another solution would be to add a conversion method (by for example using the From trait) for Vec<NonZeroU8> to CString. Since we can be sure that no zero characters are included in the Vector we could perform a cheap conversion. I took an attempt to implementing From for CString but it might be improved.

impl From<Vec<NonZeroU8>> for CString
{
    fn from(vector: Vec<NonZeroU8>) -> Self
    {
        unsafe {
            let vector = std::mem::transmute::<Vec<NonZeroU8>, Vec<u8>>(vector);
            CString::from_vec_unchecked(vector)
        }
    }
}

Pro's and Con's:

  • Pro: Everything that can be converted to a Vec<NonZeroU8> can be safely converted into a CString.
  • Con: People would still have to read bytes from a Read````-able object manually and converting them into a Vec``` in order for this to work.

-> Playground example

I wonder what you all think of this proposal and whether or not it could be improved.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-FFIArea: Foreign function interface (FFI)C-feature-requestCategory: A feature request, i.e: not implemented / a PR.T-libs-apiRelevant to the library API team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions