[Python-Dev] PEP 414

Sat Mar 3 10:26:12 CET 2012

Lennart Regebro <regebro <at> gmail.com> writes:

> So the question is if you have any proposal that is *less* confusing
> while still being practical. Because we do need to distinguish between
> binary, Unicode and "native" strings. Isn't this the least confusing
> solution?

It's a matter of the degree of confusion caused (hard to assess) and also a
question of taste, so there will be differing views on this. With
unicode_literals in effect, 'xxx' for text, b'yyy' for bytes and a function
wrapper to mark native strings, it becomes clear that native strings are the
special case: they are encountered much less often in code than 'xxx' / b'yyy',
so there are fewer opportunities for confusion. Where native strings need to be
discussed, it is not unreasonable, nor I believe incorrect, to explain that
they exist to suit the requirements of legacy APIs which pre-date Python 3 and
the latest versions of Python 2. In terms of practicality, it is IMO quite
practical (assuming support for 2.5 and earlier can be dropped) to move to a
2.6+/3.x-friendly codebase, e.g. by using Armin's python-modernize.
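
As a rough illustration of the three kinds of literal discussed above, here is
a minimal sketch. The helper name native() and its ASCII default are
assumptions made for this example only; they are not taken from PEP 414 or
from any particular library.

```python
from __future__ import unicode_literals  # 'xxx' is text on 2.6+ and on 3.x

import sys

PY2 = sys.version_info[0] == 2


def native(s, encoding="ascii"):
    """Return s as the interpreter's own str type (hypothetical helper).

    On 2.x, str is a byte string, so text is encoded.
    On 3.x, str is text, so bytes are decoded.
    """
    if isinstance(s, str):
        return s  # already the native type
    if PY2:
        return s.encode(encoding)  # unicode -> str (bytes) on 2.x
    return s.decode(encoding)  # bytes -> str (text) on 3.x


text = 'xxx'                  # text everywhere, thanks to unicode_literals
data = b'yyy'                 # bytes everywhere
method = native('GET')        # native string, e.g. for a legacy API
```

Only the comparatively rare native-string case needs the wrapper; ordinary
text and bytes literals stay unadorned.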

Regards,

Vinay Sajip