Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

12 Jan 2014


On Sat, Jan 11, 2014 at 04:28:34PM -0500, Terry Reedy wrote:
...
> The problem with some criticisms of using 'unicode in Python 3' is that 
> there really is no such thing. Unicode in 3.0 to 3.2 used the old 
> internal model inherited from 2.x. Unicode in 3.3+ uses a different 
> internal model that is a game changer with respect to certain issues of 
> space and time efficiency (and cross-platform correctness and 
> portability). So at least some of the valid criticisms based on the old 
> model are out of date and no longer valid.

While the FSR (the flexible string representation of PEP 393) in Python 
3.3 definitely brings performance savings (particularly of memory), for 
the use-case we're talking about, 
Python 3.1 and 3.2 (and for that matter, 2.2 through 2.7) Unicode 
strings should be perfectly adequate. The textual data being used is 
ASCII, and the binary blobs are encoded to Latin-1, so everything is a 
subset of Unicode, namely U+0000 to U+00FF. That means there are no 
astral characters, and no behavioural differences between wide and 
narrow builds (apart from memory use).
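
A minimal sketch of that claim, assuming Python 3 (any 3.x should do). The header text and the variable names are made up for illustration and are not taken from the PEP:

```python
# Every byte value 0..255, standing in for an arbitrary binary blob.
blob = bytes(range(256))

# Latin-1 maps byte N to code point U+00NN, so decoding never fails.
text = blob.decode("latin-1")

# Mix ASCII text and the blob-as-text using ordinary str formatting.
# (The "Content-Length" header is a hypothetical example.)
message = "Content-Length: %d\r\n\r\n%s" % (len(blob), text)

# Every code point stays within U+0000..U+00FF, so there are no astral
# characters and nothing that could differ between narrow and wide builds.
highest = max(ord(ch) for ch in message)

# Encoding back to Latin-1 recovers the original bytes exactly.
wire = message.encode("latin-1")
```

Here highest comes out as 0xFF and wire ends with the original blob, byte for byte, which is all the use-case needs from the string type.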


-- 
Steven

Steven D'Aprano