Skip to content

Rename std::io::Read::chars to utf8_chars. #33761

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

SimonSapin
Copy link
Contributor

Unlike str::chars where UTF-8 is implied since that’s always the encoding of str (whose contents is guaranteed to be well-formed), the bytes read from io::Read are arbitrary.

Unlike `str::chars` where UTF-8 is implied since that’s always
the encoding of `str` (whose contents is guaranteed to be well-formed),
the bytes read from `io::Read` are arbitrary.
@rust-highfive
Copy link
Contributor

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@sfackler sfackler added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 20, 2016
@steveklabnik
Copy link
Member

Hm, I mean, char is already a data type in Rust, though, and it implies UTF-8.

@SimonSapin
Copy link
Contributor Author

The char type is not related to UTF-8. It is represented in memory as a u32 of the code point value, making [char] slices effectively UTF-32.

str is always UTF-8 internally, so it’s fine for str::chars not to include utf8 in its name. But str is not involved in Read::chars / Read::utf8_chars.

@eefriedman
Copy link
Contributor

It might make sense to hold off on merging this until the other problems with chars() are addressed (moving it to BufRead, and the error handling problems). In particular, depending on the resolution of the error handling issues, we might end up renaming it again, to something like chars_utf8_lossy.

@SimonSapin
Copy link
Contributor Author

I think making UTF-8 explicit in the name is orthogonal to all of these issues. Similar to how we have both String::from_utf8 and String::from_utf8_lossy, lossy decoding should be an addition to something that yields results, not a replacement.

SimonSapin added a commit to SimonSapin/rust that referenced this pull request May 22, 2016
Unlike `str::chars` where UTF-8 is implied since that’s always
the encoding of `str` (whose contents is guaranteed to be well-formed),
the bytes read from `io::Read` are arbitrary.

Fixes rust-lang#33761.
@SimonSapin SimonSapin mentioned this pull request May 22, 2016
@alexcrichton
Copy link
Member

I'm gonna close this in favor of #33801 as it looks like it's included there and cuts down on the number of PRs, thanks though @SimonSapin!

@SimonSapin
Copy link
Contributor Author

Yes, I meant to do this and forgot.

@SimonSapin SimonSapin deleted the io-utf8-chars branch May 23, 2016 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants