[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

Terry Reedy tjreedy at udel.edu
Sat Jan 11 22:28:34 CET 2014


On 1/11/2014 1:44 PM, Stephen J. Turnbull wrote:

> We already *have* a type in Python 3.3 that provides text
> manipulations on arrays of 8-bit objects: str (per PEP 393).
>
>   > BTW: I don't know why so many people keep asking for use cases.
>   > Isn't it obvious that text data without known (but ASCII compatible)
>   > encoding or multiple different encodings in a single data chunk
>   > is part of life ?
>
> Isn't it equally obvious that if you create or read all such ASCII-
> compatible chunks as (encoding='ascii', errors='surrogateescape') that
> you *don't need* string APIs for bytes?
>
> Why do these "text chunks" need to be bytes in the first place?
> That's why we ask for use cases.  AFAICS, reading and writing ASCII-
> compatible text data as 'latin1' is just as fast as bytes I/O.  So
> it's not I/O efficiency, and (since in this model we don't do any
> en/decoding on bytes/str), it's not redundant en/decoding of bytes to
> str and back.
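
For concreteness, the round trip described in the quoted text might look
like the sketch below. The sample bytes and variable names are made up
for illustration; only the 'ascii' codec and the 'surrogateescape' error
handler come from the discussion.

```python
# Hypothetical ASCII-compatible chunk whose real encoding is unknown.
raw = b"Subject: caf\xe9 \xff\n"

# Decode without losing information: each non-ASCII byte becomes a
# lone surrogate in the range U+DC80..U+DCFF.
text = raw.decode("ascii", errors="surrogateescape")

# Ordinary str APIs now work on the ASCII parts of the data.
header, _, value = text.partition(": ")
formatted = "{}={}".format(header.lower(), value.strip())

# Encoding with the same handler restores the original bytes.
out = formatted.encode("ascii", errors="surrogateescape")
roundtrip = text.encode("ascii", errors="surrogateescape")
```

The escaped bytes pass through str.format() untouched, so no separate
bytes formatting API is needed for this particular case.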

The problem with some criticisms of using 'unicode in Python 3' is that 
there really is no single such thing. Unicode in 3.0 to 3.2 used the old 
internal model inherited from 2.x. Unicode in 3.3+ uses a different 
internal model that is a game changer with respect to certain issues of 
space and time efficiency (and cross-platform correctness and 
portability). So at least some of the criticisms that were valid under 
the old model are now out of date.
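
One rough way to see the space side of the 3.3+ model is to measure how
much storage each additional character costs. This is only a sketch: the
helper name bytes_per_char is made up here, and the numbers reflect
CPython 3.3+ implementation details as reported by sys.getsizeof(), not
anything guaranteed by the language.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate per-character storage for a string made only of `ch`.

    Takes the size difference between two strings of the same kind so
    that the fixed object header cancels out (CPython-specific).
    """
    short = ch
    long_ = ch * (n + 1)
    return (sys.getsizeof(long_) - sys.getsizeof(short)) // n

ascii_cost = bytes_per_char("a")            # pure ASCII text
latin1_cost = bytes_per_char("\xe9")        # max code point below U+0100
bmp_cost = bytes_per_char("\u20ac")         # max code point below U+10000
astral_cost = bytes_per_char("\U0001F600")  # code point outside the BMP
```

On a CPython 3.3+ build the ASCII and Latin-1 strings should cost one
byte per character, the BMP string two, and the astral string four,
whereas 3.0 to 3.2 used a fixed two or four bytes per code unit
depending on how the interpreter was built.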

-- 
Terry Jan Reedy


