"convert" string to bytes without changing data (encoding)

Michael Ströder michael at stroeder.com
Fri Mar 30 03:04:49 EDT 2012


Steven D'Aprano wrote:
> On Thu, 29 Mar 2012 17:36:34 +0000, Prasad, Ramit wrote:
> 
>>>> Technically, ASCII goes up to 256 but they are not A-z letters.
>>>>
>>> Technically, ASCII is 7-bit, so it goes up to 127.
>>
>>> No, ASCII only defines 0-127.  Values >=128 are not ASCII.
>>>
>>> From https://en.wikipedia.org/wiki/ASCII:
>>>
>>>   ASCII includes definitions for 128 characters: 33 are non-printing
>>>   control characters (now mostly obsolete) that affect how text and
>>>   space is processed and 95 printable characters, including the space
>>>   (which is considered an invisible graphic).
>>
>>
>> Doh! I was mistaking extended ASCII for ASCII. Thanks for the
>> correction.
> 
> There actually is no such thing as "extended ASCII" -- there is a whole 
> series of many different "extended ASCIIs". If you look at the encodings 
> available in (for example) Thunderbird, many of the ISO-8859-* and 
> Windows-* encodings are "extended ASCII" in the sense that they extend 
> ASCII to include bytes 128-255. Unfortunately they all extend ASCII in a 
> different way (hence they are different encodings).

Yup.
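
To make the point about the different extensions concrete, here is a small
sketch that decodes the same byte under a few 8-bit encodings. The byte
value 0xA4 and the three codecs are just my picks for illustration; any
byte in the 128-255 range would show a similar spread.

```python
# One byte, several "extended ASCII" interpretations.
raw = bytes([0xA4])

# Decode the same byte with three encodings that all agree on 0-127
# but assign different characters to the upper half.
decoded = {name: raw.decode(name)
           for name in ("latin-1", "iso8859-15", "iso8859-5")}

for name, char in decoded.items():
    print(f"{name:>11}: U+{ord(char):04X}")

# Plain ASCII is 7-bit, so the very same byte is rejected outright.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

The bytes never change here; only the table used to interpret them does.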

Looking at RFC 1345 some years ago (while having to deal with EBCDIC) made
this all pretty clear to me. I appreciate that someone did this heavy work of
collecting historical encodings.
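
For anyone who has not met EBCDIC: it is not an ASCII extension at all, so
even plain letters get different byte values. A minimal sketch, assuming
Python's cp037 codec as the EBCDIC variant (that is only one of several
EBCDIC code pages, picked here for illustration):

```python
text = "ABC"

# ASCII puts 'A' at 0x41; EBCDIC code page 037 puts it at 0xC1.
ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")

print(ascii_bytes.hex(), ebcdic_bytes.hex())

# Decoding with the matching codec recovers the original text.
round_trip = ebcdic_bytes.decode("cp037")
```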

Ciao, Michael.


