"convert" string to bytes without changing data (encoding)

Ross Ridge rridge at csclub.uwaterloo.ca
Wed Mar 28 21:10:23 CEST 2012


Tim Chase  <python.list at tim.thechases.com> wrote:
>Internally, they're a series of bytes, but they are MEANINGLESS 
>bytes unless you know how they are encoded internally.  Those 
>bytes could be UTF-8, UTF-16, UTF-32, or any of a number of other 
>possible encodings[1].  If you get the internal byte stream, 
>there's no way to meaningfully operate on it unless you also know 
>how it's encoded (or you're willing to sacrifice the ability to 
>reliably get the string back).
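
To make the quoted point concrete, here is a minimal sketch.  The sample
text and the variable names are only illustrative and don't come from the
thread.  The same string turns into different byte sequences under
different encodings, and the bytes only give the original string back when
they are decoded with the encoding that produced them.

```python
# Illustrative sample only; any text with non-ASCII characters would do.
text = "naïve €"

# Encode the same string three ways.  The "-le" variants are used so that
# no byte order mark is prepended and the lengths are easy to reason about.
encoded = {name: text.encode(name)
           for name in ("utf-8", "utf-16-le", "utf-32-le")}

for name, data in encoded.items():
    print(name, len(data), data)

# Decoding with the matching codec recovers the string exactly.
round_trips = all(data.decode(name) == text
                  for name, data in encoded.items())

# Decoding with the wrong codec "works" (latin-1 accepts any byte value)
# but silently produces a different string.
misread = encoded["utf-8"].decode("latin-1")
print(round_trips, repr(misread))
```

None of this touches the interpreter's internal storage; it only shows why
a run of bytes is ambiguous until an encoding is named.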

In practice the number of ways that CPython (the only Python 3
implementation) represents strings is much more limited.  Pretending
otherwise really isn't helpful.
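
As a rough illustration of how limited the choices are, the sketch below
assumes CPython 3.3 or later, where PEP 393 stores each string with 1, 2
or 4 bytes per code point depending on the widest character it contains.
That layout postdates this message; CPython 3.2 builds instead used a
fixed 2 or 4 bytes per code unit.  The helper name bytes_per_code_point is
made up for this example, and sys.getsizeof() figures are an
implementation detail, so other implementations may report something else.

```python
import sys

def bytes_per_code_point(ch, n=1000):
    """Estimate the storage cost of one code point in a str.

    Hypothetical helper: compares the sizes of two strings made of the
    same character so that the fixed object header cancels out.
    Assumes CPython 3.3+ (PEP 393).
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

# One sample character from each width class.
for ch in ("a", "\u00e9", "\u20ac", "\U0001F600"):
    print("U+%04X" % ord(ch), bytes_per_code_point(ch))
```

On such an interpreter the four sample characters should come out as 1, 1,
2 and 4 bytes respectively.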

Still, if Chris Angelico had used your much less misleading explanation,
then this could've been resolved much more quickly.  The original poster
didn't buy Chris's bullshit for a minute; instead, he had to find out on
his own that the internal representation of strings wasn't what he
expected it to be.

					Ross Ridge

-- 
 l/  //	  Ross Ridge -- The Great HTMU
[oo][oo]  rridge at csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //	  


