[I18n-sig] Re: [Python-Dev] Pre-PEP: Python Character Model

Paul Prescod paulp@ActiveState.com
Wed, 07 Feb 2001 14:53:53 -0800

Toby Dickenson wrote:
> I dislike the idea of burdening the file object interface with
> separate functions for binary and text IO, and a way of changing the
> encoding. There are many other types/classes that support the file
> interface, and I think it is desirable to support text IO on all of
> them.

It is not burdensome to change each of them over. It's probably about 10
lines of code each.

> The wrapper approach from the codecs module seems better, since it can
> be used to convert any byte file into a text file.

The wrapper approach is not user-friendly, and users will not use it
unless they are already i18n experts. My goal is to nudge people
toward thinking about i18n.
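
For reference, a minimal sketch of the codecs wrapper approach under
discussion. It assumes present-day Python 3, and the in-memory byte
stream and its contents are hypothetical stand-ins for a real
byte-oriented file object:

```python
import codecs
import io

# Stand-in for any byte-oriented file-like object (hypothetical data).
raw = io.BytesIO("caf\u00e9\n".encode("utf-8"))

# codecs.getreader() returns the StreamReader class for the named
# encoding; instantiating it around the byte stream gives a wrapper
# whose read() returns decoded text instead of bytes.
reader = codecs.getreader("utf-8")(raw)
text = reader.read()
```

The caller has to know that the wrapper exists and which encoding to
name, which is the usability cost being argued about here.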

> Also consider a hypothetical new storage device that stores unicode
> natively: how should it implement readbytes?

It could simply choose not to.

> We can unify these two only if we change the default encoding from
> ASCII to latin1, otherwise:

I prefer to think of it not as a "default encoding of Latin1" but as
"doing the obvious thing." C has a character 245. Python has a
character 245. Only someone who knows too much would expect anything
other than an obvious mapping.
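
A small sketch of what the "obvious mapping" amounts to, assuming
present-day Python 3 (where bytes and text are already separate
types): Latin-1 maps each byte value to the code point with the same
number, whereas ASCII rejects anything above 127.

```python
# Latin-1 decodes byte value N (0..255) to the character with ordinal N.
byte_245 = bytes([245])
as_latin1 = byte_245.decode("latin-1")
same_ordinal = (ord(as_latin1) == 245)

# ASCII, by contrast, refuses any byte above 127.
try:
    byte_245.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```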

> The counter-argument from last time around was that this will do the
> wrong thing for anyone mixing unicode objects with plain strings
> containing non-latin1 content. This argument goes away once there is
> only one type used for storing text.

That's where I'm trying to get to, but I want to minimize the amount
of cruft added to the language between here and there.

 Paul Prescod