Re: [Python-Dev] Internal representation of strings and Micropython

June 5, 2014


On Thu, Jun 5, 2014 at 11:59 AM, Paul Moore <p.f.moore@gmail.com> wrote:
...
On 5 June 2014 14:15, Nick Coghlan <ncoghlan@gmail.com> wrote:
...
As I've said before in other contexts, find me Windows, Mac OS X and
JVM developers, or educators and scientists that are as concerned by
the text model changes as folks that are primarily focused on Linux
system (including network) programming, and I'll be more willing to
concede the point.

There is once again a strong selection bias in this discussion, by its
very nature. People who like the new model don't have anything to
complain about, and so are not heard.

Just to support Nick's point, I for one find the Python 3 text model a
huge benefit, both in practical terms of making my programs more
robust, and educationally, as I have a far better understanding of
encodings and their issues than I ever did under Python 2. Whenever a
discussion like this occurs, I find it hard not to resent the people
arguing that the new model should be taken away from me and replaced
with a form of the old error-prone (for me) approach - as if it was in
my best interests.

Internal details don't bother me - using UTF8 and having indexing be
potentially O(N) is of little relevance. But make me work with a
string type that *doesn't* abstract a string as a sequence of Unicode
code points and I'll get very upset.
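
To illustrate the trade-off mentioned above (UTF-8 storage with code point indexing that may be O(N)), here is a rough sketch. It is not MicroPython's or CPython's implementation; `utf8_codepoint_at` is a hypothetical helper written only for this example. It keeps the code point abstraction while the storage is plain UTF-8 bytes, at the cost of a linear scan per lookup.

```python
def utf8_codepoint_at(buf: bytes, index: int) -> str:
    """Return code point number `index` from the UTF-8 bytes in `buf`.

    Hypothetical helper: walks the buffer from the start, so each lookup
    is O(N) in the length of the buffer. Assumes `buf` is valid UTF-8.
    """
    if index < 0:
        raise IndexError("negative indexes are not handled in this sketch")
    count = -1    # number of the code point currently being scanned
    start = None  # byte offset where the requested code point begins
    for pos, byte in enumerate(buf):
        # Continuation bytes look like 0b10xxxxxx; anything else starts
        # a new code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == index:
                start = pos
            elif count == index + 1:
                return buf[start:pos].decode("utf-8")
    if start is not None:
        # The requested code point was the last one in the buffer.
        return buf[start:].decode("utf-8")
    raise IndexError("code point index out of range")


text = "naïve café ☃"
data = text.encode("utf-8")
# The caller still sees a sequence of code points, not bytes.
third = utf8_codepoint_at(data, 2)
```

From the caller's side the result matches ordinary `str` indexing (`text[2]`); only the cost of the lookup differs.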
Once you get past whether str + bytes throws an exception, which seems
to be the divide most people focus on, you can discover new things
like dance-encoded strings, bytes decoded using an incorrect encoding
intended to be transcoded into the correct encoding later, surrogates
that work perfectly until .encode(), str(bytes), APIs that disagree
with you about whether the result should be str or bytes, APIs that
return either string or bytes depending on their initializers and so
on. Unicode can still be complicated in Python 3 independent of any
judgement about whether it is worse, better, or different than Python
2.
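
A few of the cases listed above can be reproduced in a short Python 3 session. This is only a sketch of my reading of them; the sample byte strings and variable names are made up for illustration, and it assumes the interpreter is run without the -bb flag (which turns str(bytes) into an error).

```python
# 1. str + bytes raises TypeError in Python 3.
mixed_error = None
try:
    "abc" + b"def"
except TypeError as exc:
    mixed_error = type(exc).__name__

# 2. str(bytes) does not decode; it returns the repr of the bytes object.
as_text = str(b"abc")

# 3. A lone surrogate is a perfectly usable str until it is encoded.
lone = "\udc80"
lone_length = len(lone)
surrogate_failed = False
try:
    lone.encode("utf-8")
except UnicodeEncodeError:
    surrogate_failed = True

# 4. The surrogateescape error handler smuggles undecodable bytes through
#    str as lone surrogates, and restores them on encode.
raw = b"caf\xe9"  # Latin-1 bytes, not valid UTF-8
smuggled = raw.decode("utf-8", "surrogateescape")
round_tripped = smuggled.encode("utf-8", "surrogateescape")

# 5. Bytes decoded with the wrong encoding, then transcoded later.
utf8_bytes = "café".encode("utf-8")
mojibake = utf8_bytes.decode("latin-1")
repaired = mojibake.encode("latin-1").decode("utf-8")
```

None of this says whether the Python 3 model is better or worse; it only shows that the str/bytes split does not remove every encoding surprise.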

Daniel Holth