[I18n-sig] Modified open() builtin (Re: Python Character Model)

Paul Prescod paulp@ActiveState.com
Sun, 11 Feb 2001 01:59:08 -0800

Andy Robinson wrote:
> ....
> I can see three separate proposals going on here.  Here's what I
> think:
> (1) introduce b"whatever".
> I'm 100% in favour - breaks nothing, adds clarity, and having it early
> may ease the pain if we ever do break old code in a few years.
> (2) widen the string representation so they can hold single or
> multi-byte
> data but without implying their semantics.

This is not a short-term proposal because it involves more
implementation work than the others.

> I'm not sure on this one - it goes further than any other language
> and the extra power may lead to new classes of errors.  

Actually, the way you describe it, it sounds a lot like wchar.

> (3) changing open().
> This should be contingent on (2).  As long as u"hello" and "hello"
> have a different type, our current solution is exactly right -
> we have wrapper classes around files which handle Unicode strings,
> but files themselves always do I/O in bytes. We've actually got
> the explicit position you favour right now - to write Unicode to a
> file, I need to explicitly create a wrapper with an encoding.

I don't follow why this should be contingent on widening the basic
string representation! Given a Unicode type, we need to read and write
Unicode data today. In my personal opinion, wrappers are too obscure and
too optional. The average programmer is not going to even know they
exist.

> If you go to (2), it becomes possible to write a string containing
> unicode straight to a file object, and therefore it is desirable
> to let the file object handle conversion, so you need a way to
> specify it etc. 

We already have Unicode strings that we need to write to files!
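
For reference, the wrapper approach under discussion can be sketched
roughly as follows. This is a minimal sketch, assuming the standard
codecs module's getwriter() and getreader(); the helper names
write_unicode and read_unicode and the file name are hypothetical,
chosen only for illustration.

```python
import codecs
import os
import tempfile

def write_unicode(path, text, encoding):
    # The file itself does I/O in bytes; the codecs wrapper encodes
    # each Unicode string before handing it to the underlying file.
    with open(path, "wb") as raw:
        writer = codecs.getwriter(encoding)(raw)
        writer.write(text)

def read_unicode(path, encoding):
    # The reader wrapper decodes bytes from the file back into Unicode.
    with open(path, "rb") as raw:
        reader = codecs.getreader(encoding)(raw)
        return reader.read()

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "greeting.txt")
write_unicode(path, u"h\u00e9llo", "utf-8")
round_tripped = read_unicode(path, "utf-8")

# Inspect the bytes that actually landed on disk.
with open(path, "rb") as f:
    raw_bytes = f.read()
```

The point of contention is visible here: the programmer has to know
that the codecs wrappers exist and has to stack them by hand.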

> I am still not sure this is right.  The stackable
> streams concept is well understood from Java and gives a lot of
> power.

The stackable streams will still exist. But Python is "flatter" than
Java in general. Java's IO libraries are in my opinion almost
incomprehensible. Yes, very powerful once you understand them, but a lot
to learn to do basic things.

I would not be embarrassed to tell a newbie Python programmer that they
should write:

file = open("/etc/passwd.txt", "ASCII")

It's pretty clear what's going on and they don't need any understanding
of Unicode. What's the Java equivalent?
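
To make the shape of that call concrete, here is a minimal sketch of a
function with the same feel. It is not the modified builtin proposed in
this thread; open_encoded is a hypothetical helper, assumed here to be
layered on the existing codecs stream wrappers, and the scratch file
stands in for the path in the example above.

```python
import codecs
import os
import tempfile

def open_encoded(path, encoding, mode="r"):
    """Hypothetical helper: open path and return a Unicode text stream."""
    if mode not in ("r", "w"):
        raise ValueError("mode must be 'r' or 'w'")
    # The underlying file always does I/O in bytes.
    raw = open(path, mode + "b")
    # Pick the decoding or encoding wrapper and stack it on the file.
    factory = codecs.getreader if mode == "r" else codecs.getwriter
    return factory(encoding)(raw)

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "passwd.txt")

out = open_encoded(path, "ASCII", "w")
out.write(u"root:x:0:0\n")
out.close()  # delegated to the underlying file object

f = open_encoded(path, "ASCII")
first_line = f.readline()
f.close()
```

The caller names an encoding once and otherwise treats the result like
an ordinary file object; the stacking happens out of sight.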

 Paul Prescod