"convert" string to bytes without changing data (encoding)
Evan Driscoll
driscoll at cs.wisc.edu
Wed Mar 28 15:20:50 EDT 2012
Ross Ridge wrote:
> Steven D'Aprano<steve+comp.lang.python at pearwood.info> wrote:
>> The right way to convert bytes to strings, and vice versa, is via
>> encoding and decoding operations.
>
> If you want to dictate to the original poster the correct way to do
> things then you don't need to do anything more than that. You don't need
> to pretend like Chris Angelico that there isn't a direct mapping from
> his Python 3 implementation's internal representation of strings
> to bytes in order to label what he's asking for as being "silly".
That mapping may as well be:
    def get_bytes(some_string):
        import random
        # Pick an arbitrary length, then fill it with arbitrary byte values.
        length = random.randint(len(some_string), 5 * len(some_string))
        data = [0] * length
        for i in range(length):
            data[i] = random.randint(0, 255)
        return bytes(data)
Of course this is hyperbole, but it offers essentially as much of a
guarantee as to what the result is.
As many others have said, the encoding isn't defined, and I would guess
it varies between implementations. (E.g. if Jython and IronPython use their
host platforms' native strings, both have 16-bit chars and thus probably
use UTF-16 encoding. I am not sure what CPython uses, but I bet it's
*not* that.)
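To make that concrete, here is a small sketch of my own (the helper name
encodings_of is made up for illustration; only the codec names are standard
Python). It shows that the same string yields different byte sequences
depending on which encoding you name, so "the bytes of a string" means
nothing until you pick one:

```python
def encodings_of(text):
    """Return the bytes produced for `text` by a few standard codecs."""
    # Hypothetical helper: the codec names are real, the function is illustrative.
    codecs = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: text.encode(name) for name in codecs}


# "h" followed by U+00E9 (e with acute accent): two characters.
result = encodings_of("h\u00e9")
for name, data in result.items():
    # Same two characters, three different lengths and byte patterns.
    print(name, len(data), data)
```

None of these is "the" representation; an implementation is free to use any
of them internally, or something else entirely.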
It's not even guaranteed that the byte representation won't change! If
something is lazily evaluated or you have a COW (copy-on-write) string or
something, the bytes backing it can differ from one moment to the next.
So yes, you can say that pretending there's not a mapping of strings to
internal representation is silly, because there is. However, there's
nothing you can say about that mapping.
Evan