[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

Steven D'Aprano steve at pearwood.info
Sun Jan 12 02:31:51 CET 2014


On Sat, Jan 11, 2014 at 04:28:34PM -0500, Terry Reedy wrote:

> The problem with some criticisms of using 'unicode in Python 3' is that 
> there really is no such thing. Unicode in 3.0 to 3.2 used the old 
> internal model inherited from 2.x. Unicode in 3.3+ uses a different 
> internal model that is a game changer with respect to certain issues of 
> space and time efficiency (and cross-platform correctness and 
> portability). So at least some of the valid criticisms based on the old 
> model are out of date and no longer valid.

While the Flexible String Representation (FSR, PEP 393) in Python 3.3 
definitely brings performance savings (particularly of memory), for the 
use-case we're talking about, Unicode strings in Python 3.1 and 3.2 (and 
for that matter, 2.2 through 2.7) should be perfectly adequate. The 
textual data being used is ASCII, and the binary blobs are decoded as 
Latin-1, so everything falls within a subset of Unicode, namely U+0000 
to U+00FF. That means there are no astral characters, and no behavioural 
differences between wide and narrow builds (apart from memory use).
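
To make the Latin-1 point concrete, here is a minimal sketch in Python 3. 
The blob and the header template are made-up illustration data, not taken 
from any real protocol. It only shows that every byte value maps to exactly 
one code point in U+0000 to U+00FF and back again without loss.

```python
# Hypothetical data: a blob containing every possible byte value.
blob = bytes(range(256))

# Decoding as Latin-1 maps byte N to code point U+00NN, one to one.
text = blob.decode('latin-1')
assert len(text) == len(blob)
assert max(map(ord, text)) == 0xFF   # nothing outside U+0000..U+00FF

# Encoding back to Latin-1 recovers the original bytes exactly.
assert text.encode('latin-1') == blob

# ASCII text can be formatted as str, joined with the decoded blob,
# and the whole message encoded in one step.
header = 'Content-Length: %d\r\n\r\n' % len(blob)   # made-up template
message = (header + text).encode('latin-1')
assert message.endswith(blob)
```

Since every code point here fits in a single code unit, len() and indexing 
should give the same answers on narrow and wide builds alike.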


-- 
Steven

