M.-A. Lemburg writes:
I completely agree with Stephen that bytes are in fact often an encoding of text. If that text is ASCII compatible, I don't see any reason why we should not continue to expose the C lib standard string APIs available for text manipulations on bytes.
We already *have* a type in Python 3.3 that provides text manipulations on arrays of 8-bit objects: str (per PEP 393).
BTW: I don't know why so many people keep asking for use cases. Isn't it obvious that text data without a known (but ASCII compatible) encoding, or with multiple different encodings in a single data chunk, is part of life?
Isn't it equally obvious that if you create or read all such ASCII-compatible chunks as (encoding='ascii', errors='surrogateescape') that you *don't need* string APIs for bytes? Why do these "text chunks" need to be bytes in the first place? That's why we ask for use cases. AFAICS, reading and writing ASCII-compatible text data as 'latin1' is just as fast as bytes I/O. So it's not I/O efficiency, and (since in this model we don't do any en/decoding on bytes/str), it's not redundant en/decoding of bytes to str and back.
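
To make the point concrete, here is a minimal sketch of what I mean, assuming Python 3.3 or later. The sample header line and the variable names are made up for illustration; only the codec names and the 'surrogateescape' error handler are real.

```python
# Hypothetical input: an ASCII-compatible header line containing one
# non-ASCII byte (0xE9) whose encoding we do not know.
raw = b"Subject: caf\xe9 menu\r\n"

# Decode as ASCII; each undecodable byte becomes a lone surrogate
# (0xE9 -> U+DCE9) instead of raising UnicodeDecodeError.
text = raw.decode("ascii", errors="surrogateescape")

# Ordinary str methods now work on the ASCII parts of the data.
name, _, value = text.partition(":")
value = value.strip()

# Encoding with the same error handler restores the original bytes exactly.
roundtrip = text.encode("ascii", errors="surrogateescape")
assert roundtrip == raw

# The 'latin1' variant: every byte maps to one code point, so the
# round trip is also lossless, though non-ASCII bytes may be mislabeled.
text_l1 = raw.decode("latin-1")
assert text_l1.encode("latin-1") == raw
```

With 'surrogateescape' the unknown bytes stay visibly "unknown" (they are lone surrogates, so encoding them to a real encoding such as UTF-8 without the handler fails), whereas with 'latin1' they silently become Latin-1 characters. Either way, no bytes-level string API is involved.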