"convert" string to bytes without changing data (encoding)
driscoll at cs.wisc.edu
Wed Mar 28 21:20:50 CEST 2012
Ross Ridge wrote:
> Steven D'Aprano<steve+comp.lang.python at pearwood.info> wrote:
>> The right way to convert bytes to strings, and vice versa, is via
>> encoding and decoding operations.
> If you want to dictate to the original poster the correct way to do
> things then you don't need to do anything more than that. You don't need to
> pretend like Chris Angelico that there isn't a direct mapping from
> his Python 3 implementation's internal representation of strings
> to bytes in order to label what he's asking for as being "silly".
That mapping may as well be:
    import random
    # some_string stands for whatever str you started with
    length = random.randint(len(some_string), 5*len(some_string))
    bytes = [0] * length
    for i in range(length):
        bytes[i] = random.randint(0, 255)
Of course this is hyperbole, but it gives you about as much of a
guarantee as to what the result will be.
As many others have said, the encoding isn't defined, and I would guess
it varies between implementations. (E.g. if Jython and IronPython use their
host platforms' native strings, both have 16-bit chars and thus probably
use UTF-16 encoding. I am not sure what CPython uses, but I bet it's
It's not even guaranteed that the byte representation won't change! If
something is lazily evaluated or you have a COW (copy-on-write) string
or something, the bytes backing it can differ from one moment to the next.
So yes, you can say that pretending there's not a mapping of strings to
internal representation is silly, because there is. However, there's
nothing you can say about that mapping.
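By contrast, an explicit encoding does give you something you can rely on. A small sketch to make that concrete (the sample string and the three encodings are arbitrary choices of mine for illustration, not anything from the original question): the same text turns into different bytes depending on the encoding you name, and decoding with the same name gets the text back.

```python
# Illustrative only: the sample text and the encodings listed are arbitrary.
text = "na\u00efve"  # five characters, one of them non-ASCII

# The same characters map to different byte sequences under each encoding.
encoded = {name: text.encode(name) for name in ("utf-8", "utf-16-le", "utf-32-le")}
for name, raw in encoded.items():
    print(name, len(raw), raw)

# Decoding with the matching encoding recovers the original string.
round_trips = all(raw.decode(name) == text for name, raw in encoded.items())
print(round_trips)
```

The byte lengths differ across the three encodings, which is the point: "the bytes of a string" only means something once you have said which encoding you want.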