"convert" string to bytes without changing data (encoding)
driscoll at cs.wisc.edu
Thu Mar 29 18:31:23 CEST 2012
Ross Ridge wrote:
> Evan Driscoll<driscoll at cs.wisc.edu> wrote:
>> People like you -- who write to assumptions which are not even remotely
>> guaranteed by the spec -- are part of the reason software sucks.
>> This email is a bit harsher than it deserves -- but I feel not by much.
> I don't see how you could feel the least bit justified. Well-meaning,
> if unhelpful, lies about the nature of Python strings in order to try to
> convince someone to follow what you think are good programming practices
> is one thing. Maliciously lying about someone else's code that you've
> never seen is another thing entirely.
I'm not even talking about code that you or the OP has written. I'm
talking about your suggestion that

    I can in fact say what the internal byte string representation
    of strings is in any given build of Python 3.

Aside from the questionable truth of this assertion (there's no
guarantee that an implementation uses one encoding or data structure
consistently), that's of no consequence because you can't depend on
what the representation is. So why even bring it up?
Also irrelevant is:

    In practice the number of ways that CPython (the only Python 3
    implementation) represents strings is much more limited.
    Pretending otherwise really isn't helpful.

If you can't depend on CPython's implementation (and, I would argue,
your code is broken if you do), then pretending otherwise *is* helpful.
Saying that "you can just look at what CPython does" is what is
unhelpful.
That said, looking again, I did misread the post that I sent that harsh
reply to; I was reading it perhaps a bit too much through the lens of
the CPython comment I quoted above, interpreting it as "I can say what
the internal representation is in CPython, so just give me that", and
launched into my spiel. If that's not what was intended, I retract my
statement. As long as everyone is clear on the fact that Python 3
implementations can use whatever encoding and data structures they want,
perhaps even different encodings or data structures for equal strings,
and that as a consequence saying "what's the internal representation of
this string" is a meaningless question as far as Python itself is
concerned, I'm happy.
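
To make that last point concrete, here is a minimal sketch, assuming any
Python 3 interpreter. It shows that a string only has a byte
representation relative to an encoding you name explicitly. The helper
encodings_of and the sample string are hypothetical, made up for
illustration; they are not from this thread.

```python
def encodings_of(text, codecs=("utf-8", "utf-16-le", "utf-32-le")):
    """Hypothetical helper: map each codec name to the bytes it yields."""
    return {name: text.encode(name) for name in codecs}


sample = "h\u00e9"  # the two-character string 'hé'
table = encodings_of(sample)

# One string, three different byte sequences, all equally valid.
for name, raw in table.items():
    print(name, len(raw), raw)

# Each byte sequence decodes back to an equal string, so none of them
# can be called "the" bytes of the string as far as the language goes.
round_trips = [raw.decode(name) == sample for name, raw in table.items()]
print(all(round_trips))
```

Whatever an implementation stores internally never enters into this; the
only bytes a program can rely on are the ones it requested by naming an
encoding.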