[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5
Steven D'Aprano
steve at pearwood.info
Sun Jan 12 02:31:51 CET 2014
On Sat, Jan 11, 2014 at 04:28:34PM -0500, Terry Reedy wrote:
> The problem with some criticisms of using 'unicode in Python 3' is that
> there really is no such thing. Unicode in 3.0 to 3.2 used the old
> internal model inherited from 2.x. Unicode in 3.3+ uses a different
> internal model that is a game changer with respect to certain issues of
> space and time efficiency (and cross-platform correctness and
> portability). So at least some of the valid criticisms based on the old
> model are out of date and no longer valid.
While the Flexible String Representation (FSR, PEP 393) in Python 3.3
definitely brings performance savings (particularly of memory), for the
use-case we're talking about, Unicode strings in Python 3.1 and 3.2 (and
for that matter, 2.2 through 2.7) should be perfectly adequate. The
textual data being used is ASCII, and the binary blobs are decoded as
Latin-1, so everything falls within a subset of Unicode, namely U+0000
to U+00FF. That means there are no astral characters, and no
behavioural differences between wide and narrow builds (apart from
memory use).
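
As a rough illustration of the point above (a minimal sketch, not code
from this thread; the variable names and the HTTP-style header are made
up for the example), decoding arbitrary bytes as Latin-1 maps each byte
to the code point with the same value, so the result never leaves
U+0000 to U+00FF and round-trips losslessly:

```python
# Hypothetical example: every possible byte value, treated as a binary blob.
blob = bytes(range(256))

# Decoding as Latin-1 maps byte N to code point U+00NN, one to one.
text = blob.decode("latin-1")

# The highest code point is U+00FF, so there are no astral characters.
max_code_point = max(ord(ch) for ch in text)

# Encoding back to Latin-1 recovers the original bytes exactly.
round_tripped = text.encode("latin-1")

# ASCII text and the decoded blob can be combined with ordinary str
# formatting, then encoded once at the end. The header is illustrative.
header = "Content-Length: %d\r\n\r\n" % len(blob)
message = (header + text).encode("latin-1")
```

Because nothing here goes above U+00FF, the same approach should behave
identically on narrow and wide builds; only the memory footprint
differs.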
--
Steven