[Python-3000] New io system and binary data
Charles D Hixson
charleshixsn at earthlink.net
Sun Sep 23 19:24:24 CEST 2007
Guido van Rossum wrote:
> On 9/19/07, Bill Janssen <janssen at parc.com> wrote:
>
>> This really isn't a UTF-8 problem. It is the problem with file opens
>> defaulting to "text" mode instead of "binary" mode rearing its ugly
>> head again.
>>
>
> You can repeat that until you're blue in the face but it's not going
> to change. Way more programs (especially simple ones) deal with text
> than with binary data.
>
>
OTOH, almost all of that text is ASCII. Even if the system encoding is
set to UTF-8, ASCII is still ASCII.
Still, this won't affect me much, as I rarely send anything complex via
pipes. (I know, I should. It's more secure. But the fact is, I
don't. I use files.)
But this is the kind of thing that could make dealing with, say, XPM
files a real hassle. (Probably won't, as ASCII is still ASCII, but it
will introduce corner cases.) A lot of the time what I'm really dealing
with is bytes rather than characters. I think of them as characters,
and try to choose values that display nicely as characters, because
that's the way that's been convenient for decades. But they ARE bytes,
sometimes signed bytes. And this is going to mean that there are lots
of cases where they don't map nicely to something that's trying to
understand them as Unicode.
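A small sketch of that mismatch, written against present-day Python 3
rather than anything that existed when this was posted. The byte values
and the helper name try_decode are made up for illustration. A pure-ASCII
run decodes the same way under UTF-8, but byte values above 0x7F are not
valid UTF-8 on their own, and treating them as signed numbers needs no
text decoding at all.

```python
# Illustrative data only: three ASCII bytes followed by two high bytes.
raw = bytes([0x58, 0x50, 0x4D, 0xFF, 0x80])


def try_decode(data, encoding):
    """Return the decoded string, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


ascii_part = try_decode(raw[:3], "utf-8")  # pure ASCII decodes fine
whole = try_decode(raw, "utf-8")           # 0xFF is never valid in UTF-8

# Reinterpret each byte as a signed value (-128..127), no text decoding involved.
signed = [b - 256 if b > 127 else b for b in raw]
```

Here ascii_part is a normal string, whole comes back as None, and signed
holds plain integers.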
So there needs to be an easy and obvious way to deal with files whose
records are arrays of byte-valued data that is commonly manipulated by
an editor using ascii-8.
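For what it's worth, one way this could look in present-day Python 3 is
to open the file in binary mode and keep each record as a bytearray. This
is only a sketch under assumptions: the fixed record length RECORD_SIZE,
the function name read_records, and the sample data are all hypothetical
and not taken from this thread.

```python
import os
import tempfile

RECORD_SIZE = 4  # hypothetical fixed record length, in bytes


def read_records(path, record_size=RECORD_SIZE):
    """Read a file in binary mode and return its records as bytearrays."""
    records = []
    with open(path, "rb") as f:  # "rb": no decoding, no newline translation
        while True:
            chunk = f.read(record_size)
            if not chunk:
                break
            records.append(bytearray(chunk))
    return records


# Write two sample records, then read them back and edit one in place.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"ab\xff\x00" + b"cd\x80\x7f")

records = read_records(path)
records[0][0] = ord("A")  # a byte can still be set from an ASCII character
os.remove(path)
```

The values stay bytes throughout, so the high bytes survive untouched,
while ord() and bytes literals still allow picking values that display
nicely as characters.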