Lennart Regebro
So the question is if you have any proposal that is *less* confusing while still being practical. Because we do need to distinguish between binary, Unicode and "native" strings. Isn't this the least confusing solution?
It's a matter of the degree of confusion caused (hard to assess) and also a question of taste, so there will be differing views on this. With unicode_literals in use, 'xxx' for text, b'yyy' for bytes, and a function wrapper to mark native strings, it becomes clear that native strings are special cases: they are encountered much less often when reading code than 'xxx' / b'yyy', so there are fewer opportunities for confusion.

Where native strings need to be discussed, it is not unreasonable, nor I believe incorrect, to explain that they are there to suit the requirements of legacy APIs which pre-date Python 3 and the latest versions of Python 2.

In terms of practicality, it is IMO quite practical (assuming support for 2.5 and earlier can be dropped) to move to a 2.6+/3.x-friendly codebase, e.g. by using Armin's python-modernize.

Regards,

Vinay Sajip
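
To illustrate the convention described above, here is a minimal sketch. The helper name `n` and the `PY2` flag are hypothetical, chosen only for this example; they are not taken from the thread or from any particular library. It assumes the native strings involved are ASCII-safe, which is typical of the legacy APIs in question but not guaranteed.

```python
from __future__ import unicode_literals

import sys

# Hypothetical flag: True when running under Python 2.
PY2 = sys.version_info[0] == 2


def n(s, encoding="ascii"):
    """Hypothetical wrapper: return s as the interpreter's native str type.

    On Python 2 the native str is a byte string; on Python 3 it is text.
    """
    if PY2:
        # Under unicode_literals a plain literal is unicode, so encode it.
        return s if isinstance(s, str) else s.encode(encoding)
    # On Python 3, decode bytes and pass text through unchanged.
    return s.decode(encoding) if isinstance(s, bytes) else s


text = 'xxx'       # text on both 2.6+ and 3.x, thanks to unicode_literals
data = b'yyy'      # bytes on both 2.6+ and 3.x
native = n('zzz')  # native str, explicitly marked for a legacy API
```

Only the third form needs the wrapper, so the unusual case stands out visually while ordinary text and bytes literals stay unadorned.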