Consistent interface to get text and bytes

Follow-up to #610, #790, and #893.

General policy:

- Do not implement `str()`
- To get the Unicode string, use `.text`
- In `.text`, decode as UTF-8 with the *replace* error handler (the rationale for *replace* is explained in https://github.com/libgit2/pygit2/pull/790#issuecomment-385906316)
- To get the byte string use `.data` or `.raw` (*this is to be decided*)
- For attributes, the plain attribute name returns text; prefix it with `raw_` to get bytes. For instance, `Signature.name` and `Signature.raw_name`
- Implement the buffer protocol and `bytes(...)` where appropriate
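
To make the policy above concrete, here is a minimal pure-Python sketch. `SketchSignature` is a hypothetical stand-in, not pygit2's actual `Signature`; it only illustrates the text/`raw_` naming convention, the UTF-8 plus *replace* decoding, and `bytes(...)`. Pure-Python classes cannot implement the buffer protocol before Python 3.12, so `__bytes__` is used here in its place.

```python
class SketchSignature:
    """Hypothetical stand-in used for illustration; not pygit2's Signature."""

    def __init__(self, raw_name: bytes):
        # The `raw_` prefixed attribute holds the bytes exactly as stored.
        self.raw_name = raw_name

    @property
    def name(self) -> str:
        # Text view: decode as UTF-8 with the "replace" error handler, so
        # invalid bytes become U+FFFD instead of raising UnicodeDecodeError.
        return self.raw_name.decode("utf-8", errors="replace")

    def __bytes__(self) -> bytes:
        # Lets callers write bytes(sig); stands in for the buffer protocol.
        return self.raw_name

    # __str__ is deliberately not defined, following "Do not implement str()".


# A name stored in Latin-1 is not valid UTF-8.
sig = SketchSignature(b"Jos\xe9")
text_name = sig.name      # never raises, the bad byte is replaced
raw_name = sig.raw_name   # lossless bytes
```

With *replace*, reading `.name` cannot fail on badly encoded data, while `raw_name` and `bytes(sig)` still give the caller the original bytes to decode however they see fit.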

Open for discussion.

TODO:

- [x] Replace `TreeEntry._name` by `.raw_name`
- [ ] Replace `DiffLine.content` by `.text`
- [ ] Inventory all the places where we get bytes, text, or the buffer protocol
- [ ] Settle on `.data` or `.raw`
- [ ] Replace `DiffLine.raw_content` by `.data` or `.raw`
- [ ] Replace `Object.read_raw()` by `.data` (or `.raw`), then remove `Blob.data` (it will inherit from `Object`)
- [ ] Settle on `str()`, `bytes()`, and the buffer protocol

The case of `Oid`, what we have now:

- `oid.raw` returns the byte string (that's good, unless we decide to settle on `.data`)
- `str(oid)` and `oid.hex` both return the hex representation, always `<str>` (bytes in Python 2 and text in Python 3)
- Oid is the only place where we implement `str(...)`
- `Object.hex` and `TreeEntry.hex` behave the same: they always return `<str>`. Apparently these are the only places where we always return `<str>`.
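
For reference, the relationship between `.raw`, `.hex` and `str(...)` described above can be sketched like this. `SketchOid` is a hypothetical stand-in written for Python 3 only, not pygit2's `Oid`, and it assumes a 20-byte SHA-1 id.

```python
class SketchOid:
    """Hypothetical stand-in used for illustration; not pygit2's Oid."""

    def __init__(self, raw: bytes):
        # Assumption: SHA-1 object ids, which are 20 bytes long.
        if len(raw) != 20:
            raise ValueError("expected a 20-byte id")
        self.raw = raw  # byte string

    @property
    def hex(self) -> str:
        # Hex representation, a 40-character text string in Python 3.
        return self.raw.hex()

    def __str__(self) -> str:
        # The one place where str(...) is implemented; same value as .hex.
        return self.hex


oid = SketchOid(bytes(range(20)))
hex_id = oid.hex
round_trip = bytes.fromhex(hex_id)  # back to the raw bytes
```

Since the hex form is pure ASCII, converting it never runs into the encoding problems that motivate *replace* elsewhere.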

Consistent interface to get text and bytes #895
