[Python-3000] New io system and binary data

Charles D Hixson charleshixsn at earthlink.net
Sun Sep 23 19:24:24 CEST 2007


Guido van Rossum wrote:
> On 9/19/07, Bill Janssen <janssen at parc.com> wrote:
>   
>> This really isn't a UTF-8 problem.  It is the problem with file opens
>> defaulting to "text" mode instead of "binary" mode rearing its ugly
>> head again.
>>     
>
> You can repeat that until you're blue in the face but it's not going
> to change. Way more programs (especially simple ones) deal with text
> than with binary data.
>
>   
OTOH, almost all of that text is ASCII.  Even if the system mode is set 
to utf-8, ascii is still ascii.

Still, this won't affect me much, as I rarely send anything complex via 
pipes.  (I know, I should.  It's more secure.  But the fact is, I 
don't.  I use files.)

But this is the kind of thing that could make dealing with, say, xpm 
files a real hassle.  (Probably won't, as ascii is still ascii, but it 
will introduce corner cases.)  A lot of the time what I'm really dealing 
with is bytes rather than characters.  I think of them as characters, 
and try to choose values that display nicely as characters, because 
that's the way that's been convenient for decades.  But they ARE bytes, 
sometimes signed bytes.  And this is going to mean that there are lots 
of cases where they don't map nicely to something that's trying to 
understand them as unicode.
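
A minimal sketch of that corner case, assuming Python 3 semantics for
bytes and str.  The sample values and the helper name decodes_as_utf8
are invented for illustration and are not from any real file format:

```python
# Pure ASCII bytes decode to the same text under ASCII and UTF-8.
ascii_data = b"/* XPM */"
assert ascii_data.decode("ascii") == ascii_data.decode("utf-8")

# An illustrative record of raw byte values; 0xFF never occurs in valid UTF-8.
raw = bytes([0x41, 0xFF, 0x80])


def decodes_as_utf8(data):
    """Return True if data is valid UTF-8, False otherwise (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Reinterpret the same bytes as signed values in the range -128..127.
signed = [b - 256 if b > 127 else b for b in raw]
```

Here decodes_as_utf8(raw) is False even though the first byte is a
plain "A", and the signed view of the record is [65, -1, -128].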

So there needs to be an easy and obvious way to deal with files whose 
records are arrays of byte-valued data that are commonly manipulated by 
an editor using ascii-8.
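
One way this could look, as a sketch only: open the file in binary mode
so that nothing is decoded, and read fixed-size records as bytes
objects.  The record length, the sample data, and the name read_records
are assumptions made for the example, not part of any proposed io API:

```python
import os
import tempfile

RECORD_SIZE = 4  # hypothetical fixed record length, in bytes


def read_records(path, record_size=RECORD_SIZE):
    """Return the file's contents as a list of fixed-size bytes records."""
    records = []
    # "rb" means no text decoding and no newline translation.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(record_size)
            if not chunk:
                break
            records.append(chunk)
    return records


# Write two illustrative records, including values above 127, and read them back.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(bytes([65, 66, 200, 255, 10, 13, 0, 127]))
records = read_records(path)
os.remove(path)
```

Each record comes back as a bytes object, so list(records[0]) gives the
integer values [65, 66, 200, 255] with no attempt to interpret them as
characters.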


