[I18n-sig] Modified open() builtin (Re: Python Character Model)

Andy Robinson andy@reportlab.com
Sun, 11 Feb 2001 09:18:57 -0000


> > Any reason why we cannot use a keyword argument for encoding
> > and put it at the end of the argument list ? The result is:
> >
> > 1. no ambiguity
> > 2. backward compatibility
> > 3. good visibility of what the argument stands for (without having
> >    to look up the manual for e.g. the meaning of 'mbcs')
>
> I would like to have the option of one day making it a required
> argument without having to also make mode and bytes required. Mode
> would be a minor inconvenience but bytes would be major.
>
>  Paul Prescod

I can see three separate proposals going on here.  Here's what I
think:

(1) introduce b"whatever".

I'm 100% in favour - breaks nothing, adds clarity, and having it early
may ease the pain if we ever do break old code in a few years.
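
To make (1) concrete, here is a minimal sketch of how such a literal
could read.  It assumes a b prefix that marks a literal as plain byte
data, alongside the existing u prefix for Unicode; the variable names
are only illustrative.

```python
# Assumption: b"..." marks a literal as raw byte data, while u"..."
# marks Unicode text. The variable names are illustrative only.
payload = b"caf\xc3\xa9"   # five bytes, no character semantics implied
text = u"caf\u00e9"        # four Unicode characters

# Moving between the two always names an encoding explicitly.
decoded = payload.decode("utf-8")
encoded = text.encode("utf-8")
```

The point is only that the prefix states the intent at the place where
the literal is written, so nobody has to guess whether a given string
is meant as text or as binary data.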

(2) widen the string representation so that strings can hold
single-byte or multi-byte data, but without implying their semantics.

I'm not sure on this one - it goes further than any other language
and the extra power may lead to new classes of errors.  Alongside the
proposal, we need a bunch of examples of how this could be used, and
of how it could be abused, and then I think we all need to sit on it
for a while.  Which is what you've been saying too.

(3) changing open().

This should be contingent on (2).  As long as u"hello" and "hello"
have a different type, our current solution is exactly right -
we have wrapper classes around files which handle Unicode strings,
but files themselves always do I/O in bytes. We've actually got
the explicit position you favour right now - to write Unicode to a
file, I need to explicitly create a wrapper with an encoding.
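
As a rough illustration of that explicit wrapper style, the sketch
below uses the codecs module.  The in-memory buffer is only a stand-in
for a file opened in binary mode, and the variable names are made up
for the example.

```python
import codecs
import io

# Stand-in for a real file opened in binary mode; it only ever sees bytes.
raw = io.BytesIO()

# The encoding is chosen explicitly when the wrapper is created.
writer = codecs.getwriter("utf-8")(raw)

# The wrapper accepts Unicode text and hands encoded bytes to the file.
writer.write(u"h\u00e9llo")

data = raw.getvalue()
```

The file object itself never learns about encodings; all of that
knowledge lives in the wrapper.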

If you go to (2), it becomes possible to write a string containing
Unicode straight to a file object, and therefore it is desirable
to let the file object handle conversion, so you need a way to
specify the encoding.  I am still not sure this is right.  The
stackable streams concept is well understood from Java and gives a
lot of power.
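
A small sketch of what stacking can look like in Python, in the
spirit of the Java stream classes: each layer does one job and hands
its output to the layer below.  The choice of gzip as the middle layer
is just an assumption for the example; any byte-oriented wrapper would
do.

```python
import codecs
import gzip
import io

text = u"na\u00efve caf\u00e9"

# Bottom layer: a byte sink (stand-in for a file opened in binary mode).
raw = io.BytesIO()

# Middle layer: compresses whatever bytes it is given.
gz = gzip.GzipFile(fileobj=raw, mode="wb")

# Top layer: turns Unicode text into UTF-8 bytes for the layer below.
writer = codecs.getwriter("utf-8")(gz)

writer.write(text)
gz.close()  # flushes the compressed data; raw itself stays open

# Reading back undoes the layers in reverse order.
restored = gzip.decompress(raw.getvalue()).decode("utf-8")
```

With this arrangement the encoding step is just one more layer in the
stack, and the underlying file still deals purely in bytes.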

- Andy