"convert" string to bytes without changing data (encoding)

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Mar 28 19:54:20 CEST 2012


On Wed, 28 Mar 2012 11:36:10 -0400, Ross Ridge wrote:

> Chris Angelico  <rosuav at gmail.com> wrote:
>>What is a string? It's not a series of bytes.
> 
> Of course it is.  Conceptually you're not supposed to think of it that
> way, but a string is stored in memory as a series of bytes.

You don't know that. They might be stored as a tree, or a rope, or some 
even more complex data structure. In fact, in Python, they are stored as 
an object.

But even if they were stored as a simple series of bytes, you don't know 
what bytes they are. That is an implementation detail of the particular 
Python build being used, and since Python doesn't give direct access to 
memory (at least not in pure Python) there's no way to retrieve those 
bytes using Python code.
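
As a rough illustration of how much of this is an implementation detail, here is a small sketch. It assumes CPython 3.3 or later, where the interpreter picks an internal width per string based on its contents; other implementations and older builds behave differently, and sys.getsizeof() may not even be supported elsewhere.

```python
import sys

# Two strings with the same number of characters.
ascii_text = "a" * 100
euro_text = "\u20ac" * 100  # EURO SIGN, outside Latin-1

assert len(ascii_text) == len(euro_text) == 100

# On CPython 3.3+ the in-memory object sizes differ, because the
# interpreter chooses its own internal layout for each string.
size_ascii = sys.getsizeof(ascii_text)
size_euro = sys.getsizeof(euro_text)
print(size_ascii, size_euro)
```

The exact numbers vary by version and platform, which is rather the point: nothing in the language promises what those bytes look like.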

Saying that strings are stored in memory as bytes is no more sensible 
than saying that dicts are stored in memory as bytes. Yes, they are. So 
what? Taken out of context in a running Python interpreter, those bytes 
are pretty much meaningless.


> What he's asking for may not be very useful or practical, but if that's
> your problem here then that's what you should be addressing, not
> pretending that it's fundamentally impossible.

The right way to convert bytes to strings, and vice versa, is via 
encoding and decoding operations. What the OP is asking for is as silly 
as somebody asking to turn a float 1.3792 into a string without calling 
str() or any equivalent float->string conversion. They're both made up of 
bytes, right? Yeah, they are. So what?
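
For concreteness, a minimal sketch of the explicit round trip in Python 3. The sample text and the two codecs (UTF-8 and UTF-16-LE) are only assumptions picked for illustration; any codec that can represent the text would serve.

```python
text = "caf\u00e9"  # four characters; the last one is e-acute

# The same string produces different bytes under different encodings.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

print(utf8)   # b'caf\xc3\xa9'
print(utf16)  # b'c\x00a\x00f\x00\xe9\x00'

# Decoding with the matching codec recovers the original string.
assert utf8.decode("utf-8") == text
assert utf16.decode("utf-16-le") == text
```

There is no single "the bytes" for a string until you name an encoding.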

Even if you do a hex dump of float 1.3792, the result will NOT be the 
string "1.3792". And likewise, even if you somehow did a hex dump of the 
memory representation of a string, the result will NOT be the equivalent 
sequence of bytes except *maybe* for some small subset of possible 
strings.
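
The float half of that comparison can be sketched with the struct module. This assumes the usual case where a Python float is an IEEE 754 double, and the little-endian format "<d" is just a choice made for the example.

```python
import struct

x = 1.3792

# Raw 8-byte IEEE 754 double, little-endian.
raw = struct.pack("<d", x)

# The text form, produced by an actual float->string conversion.
as_text = str(x).encode("ascii")

print(raw.hex())  # sixteen hex digits, nothing resembling "1.3792"
print(as_text)    # b'1.3792'

assert len(raw) == 8
assert raw != as_text
# The raw bytes only mean something once unpacked with the same format.
assert struct.unpack("<d", raw)[0] == x
```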



-- 
Steven


