"convert" string to bytes without changing data (encoding)
Terry Reedy
tjreedy at udel.edu
Wed Mar 28 14:11:28 EDT 2012
On 3/28/2012 11:36 AM, Ross Ridge wrote:
> Chris Angelico<rosuav at gmail.com> wrote:
>> What is a string? It's not a series of bytes.
>
> Of course it is. Conceptually you're not supposed to think of it that
> way, but a string is stored in memory as a series of bytes.
*If* it is stored in byte memory. If you execute a 3.x program mentally
or on paper, then there are no bytes.
If you execute a 3.3 program on a byte-oriented computer, then the 'a'
in the string might be represented by 1, 2, or 4 bytes, depending on the
other characters in the string. For the 2- and 4-byte forms, the byte
order will depend on whether the system is big- or little-endian.
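A rough way to observe the 1, 2, or 4 byte widths, assuming CPython 3.3 or later, is to measure how much a string grows when it is doubled. This is only a sketch: sys.getsizeof() reports an implementation detail, and the helper name bytes_per_char is made up for illustration.

```python
import sys

def bytes_per_char(s):
    """Estimate the per-character storage width CPython chose for s.

    Assumes CPython 3.3+ (PEP 393); other implementations may differ.
    Doubling the string leaves the fixed object header unchanged, so the
    size difference is just the storage for len(s) extra characters.
    """
    return (sys.getsizeof(s + s) - sys.getsizeof(s)) // len(s)

print(bytes_per_char("abc"))           # every code point is below 256
print(bytes_per_char("ab\u20ac"))      # contains U+20AC, needs 2 bytes each
print(bytes_per_char("ab\U0001F600"))  # contains a non-BMP code point
```

Note that the 'a' in each of these strings is stored with a different width, which is the point being made above.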
My impression is that at the physical bit level there may, again, be no
'bytes' as a physical construct, since the bits may be stored in
parallel across multiple RAM chips.
> What he's asking for may not be very useful or practical, but if that's
> your problem here then that's what you should be addressing, not
> pretending that it's fundamentally impossible.
The python-level way to get the bytes of an object that supports the
buffer interface is memoryview(). 3.x strings intentionally do not
support the buffer interface as there is not any particular
correspondence between characters (codepoints) and bytes.
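As a small illustration of that difference, the sketch below wraps a bytes object in memoryview() and then checks which built-in types accept it. The helper name supports_buffer is hypothetical, introduced only for this example.

```python
def supports_buffer(obj):
    """Return True if memoryview() accepts obj (hypothetical helper)."""
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True

data = "h\u00e9llo".encode("utf-8")   # an explicit encoding yields bytes
view = memoryview(data)
print(list(view[:3]))                 # the first three byte values of data

print(supports_buffer(b"abc"))            # bytes expose a buffer
print(supports_buffer(bytearray(b"abc"))) # so does bytearray
print(supports_buffer("abc"))             # a 3.x str does not
```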
The OP could get the ordinal for each character and decide how *he*
wants to convert them to bytes.
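# s is the str to convert
ba = bytearray()
for c in s:
    i = ord(c)
    # Append the bytes chosen to represent i. One possible choice,
    # not the only one: 4 bytes per character, little-endian.
    ba.extend(i.to_bytes(4, 'little'))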
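To show how much the result depends on that decision, here is a minimal sketch that wraps the loop in a function with the width and byte order as parameters. The name chars_to_bytes and its parameters are assumptions for illustration, not an existing API.

```python
def chars_to_bytes(s, width=4, byteorder="little"):
    """Pack each code point of s into `width` bytes (hypothetical helper)."""
    ba = bytearray()
    for c in s:
        # int.to_bytes raises OverflowError if ord(c) does not fit in `width` bytes
        ba.extend(ord(c).to_bytes(width, byteorder))
    return bytes(ba)

s = "a\u20ac"
print(chars_to_bytes(s, 4, "little"))  # matches s.encode("utf-32-le")
print(chars_to_bytes(s, 2, "big"))     # matches s.encode("utf-16-be") for BMP-only text
```

With width=1 the same string raises OverflowError, because U+20AC does not fit in a single byte. Each choice of width and byte order amounts to choosing an encoding.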
To get the particular bytes used for a particular string on a particular
system, OP should use the C API, possibly through ctypes.
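One very rough way to peek at those bytes, assuming CPython, is to copy the object's memory with ctypes. This reads raw memory rather than calling a specific C API function, it relies on id() returning the object's address and sys.getsizeof() returning the size of its memory block, and it is not a supported interface. The helper name raw_object_bytes is made up for this sketch, which is only exercised here on a short ASCII string.

```python
import ctypes
import sys

def raw_object_bytes(s):
    """Copy the memory block of the str object s (CPython-only assumption).

    The result includes the object header as well as the character data,
    and its layout can change between CPython versions and platforms.
    """
    return ctypes.string_at(id(s), sys.getsizeof(s))

raw = raw_object_bytes("abc")
print(raw[-4:])  # tail of the block: the character data and a trailing NUL
```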
--
Terry Jan Reedy