"convert" string to bytes without changing data (encoding)

Evan Driscoll driscoll at cs.wisc.edu
Thu Mar 29 18:31:23 CEST 2012

Ross Ridge wrote:
> Evan Driscoll<driscoll at cs.wisc.edu>  wrote:
>> People like you -- who write to assumptions which are not even remotely
>> guaranteed by the spec -- are part of the reason software sucks.
> ...
>> This email is a bit harsher than it deserves -- but I feel not by much.
> I don't see how you could feel the least bit justified.  Well meaning,
> if unhelpful, lies about the nature of Python strings in order to try to
> convince someone to follow what you think are good programming practices
> is one thing.  Maliciously lying about someone else's code that you've
> never seen is another thing entirely.

I'm not even talking about code that you or the OP has written. I'm 
talking about your suggestion that

    I can in fact say what the internal byte string representation
    of strings is in any given build of Python 3.

Aside from the questionable truth of this assertion (there's no
guarantee that an implementation uses a single encoding or data
structure for all strings), it's of no consequence, because you can't
depend on what the representation is. So why even bring it up?

Also irrelevant is:

   In practice the number of ways that CPython (the only Python 3
   implementation) represents strings is much more limited.
   Pretending otherwise really isn't helpful.

If you can't depend on CPython's implementation (and, I would argue,
your code is broken if you do), then pretending otherwise *is* helpful.
Saying that "you can just look at what CPython does" is what is
unhelpful.
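
To make the "more than one representation" point concrete, here is a
rough sketch. It assumes a CPython build with the flexible string
storage from PEP 393 (3.3 or later); other implementations, and older
CPython builds, may report something entirely different, which is
rather the point. The helper name storage_sizes is made up for this
example.

```python
import sys

def storage_sizes(samples):
    """Map each sample string to whatever sys.getsizeof reports for it."""
    return {s: sys.getsizeof(s) for s in samples}

# Three strings of the same length (4 characters) whose widest code
# point differs: ASCII, a BMP character, and a non-BMP character.
samples = ["abcd", "abc\u20ac", "abc\U0001F600"]

for text, size in storage_sizes(samples).items():
    # The sizes are an implementation detail; the language does not
    # promise any particular value, or that they differ at all.
    print(ascii(text), len(text), size)
```

Whatever numbers this prints on your build, nothing in the language
says they will be the same on another one.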

That said, on rereading I see that I misread the post I sent that harsh
reply to. I was looking at it a bit too much through the lens of the
CPython comment quoted above, interpreted it as "I can say what
CPython's internal representation is, so just give me that", and
launched into my spiel. If that's not what was intended, I retract my
statement. As long as everyone is clear that Python 3 implementations
can use whatever encoding and data structures they want, perhaps even
different encodings or data structures for equal strings, and that as
a consequence "what's the internal representation of this string" is a
meaningless question as far as Python itself is concerned, I'm happy.
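
For what it's worth, a minimal sketch of the portable alternative:
name the encoding yourself when you want bytes. str.encode and
bytes.decode are standard; the helper name to_bytes and the sample
string are just for illustration.

```python
def to_bytes(text, encoding="utf-8"):
    """Encode text with an explicitly named encoding."""
    return text.encode(encoding)

text = "caf\u00e9"  # 'cafe' with an acute accent on the e

as_utf8 = to_bytes(text)                # b'caf\xc3\xa9'
as_utf16 = to_bytes(text, "utf-16-le")  # b'c\x00a\x00f\x00\xe9\x00'
as_latin1 = to_bytes(text, "latin-1")   # b'caf\xe9'

# Each result round-trips through its own encoding, on any
# conforming implementation, regardless of how the str is stored.
assert as_utf8.decode("utf-8") == text
assert as_utf16.decode("utf-16-le") == text
assert as_latin1.decode("latin-1") == text
```

The same string gives three different byte sequences, so "the bytes of
a string" only means something once an encoding has been chosen.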

