[Python-Dev] file() vs open(), round 7

M.-A. Lemburg mal at egenix.com
Tue Dec 27 18:39:26 CET 2005


Phillip J. Eby wrote:
> At 04:20 PM 12/27/2005 +0100, M.-A. Lemburg wrote:
>> Phillip J. Eby wrote:
>> > At 02:35 PM 12/27/2005 +0100, Fredrik Lundh wrote:
>> >> M.-A. Lemburg wrote:
>> >>
>> >>>> can we add a opentext factory for file/codecs.open while we're at it ?
>> >>> Why a new factory function ? Can't we just redirect to codecs.open()
>> >>> in case an encoding keyword argument is passed to open() ?!
>> >> I think open is overloaded enough as it is.  Using separate
>> >> functions for distinct use cases is also a lot better than keyword
>> >> trickery.
>> >>
>> >> Here's a rough draft:
>> >>
>> >>     def textopen(name, mode="r", encoding=None):
>> >>         if "U" not in mode:
>> >>             mode += "U"
>> >>         if encoding:
>> >>             return codecs.open(name, mode, encoding)
>> >>         return file(name, mode)
>> >
>> > Nice. It should probably also check whether there's a 'b' or 't' in
>> > 'mode' and raise an error if so.
>>
>> Why should it do that ?
> 
> It's not necessary if both codecs.open() and file() raise an error when
> there's both a 'U' and a 't' or 'b' in the mode string, I suppose.

I see what you mean. codecs.open() doesn't work with 'U'.

>> FYI: codecs.open() explicitly adds the 'b' to the mode since
>> we don't want the platform text mode to interfere with the
>> Unicode line breaking.
> 
> I think maybe you're confusing the wrapped file's mode with the
> passed-in mode, here.  The passed-in mode should contain at most one of
> 'b', 't', or 'U', IIUC.  The mode used for the wrapped file should of
> course always be 'b', but that's not visible to the user of the routine.
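
To illustrate the point about the wrapped file's mode, here is a minimal
sketch, assuming a current Python 3 interpreter. The helper name
read_decoded and the sample file are made up for the example. The caller
passes plain 'r', codecs.open() opens the underlying file in binary mode
on its own, and the line endings arrive untranslated:

```python
import codecs
import os
import tempfile


def read_decoded(path, encoding="utf-8"):
    # Hypothetical helper: the caller passes "r" only. codecs.open() adds
    # the 'b' itself whenever an encoding is given, so the platform's
    # text mode never touches the line endings.
    with codecs.open(path, "r", encoding) as f:
        return f.read()


# Write raw bytes with mixed line endings, then read them back decoded.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as raw:
    raw.write(b"one\r\ntwo\n")

text = read_decoded(path)
```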

Thinking about this some more, I think it's better to
make encoding mandatory and to not use file() at all
in the API.

When we move to "all text is Unicode" in Py3k, we'll
have to require this anyway, so why not start with it
now?

That said, I think the name "textfile" would be
more appropriate for the factory function, as you
suggested.

Valid values for mode would then be 'r', 'w' and 'a'.
'U' is not needed, and neither are 'b' and 't'. The '+'
modes don't work well with codecs.
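
A rough sketch of what such a factory could look like, under the
assumptions stated above: the encoding is mandatory, file() is not used
at all, and only 'r', 'w' and 'a' are accepted. The argument order
(encoding before mode) and the choice of ValueError are my own guesses
for the sake of the example, not a settled API:

```python
import codecs

# Modes accepted by this sketch; 'U', 'b', 't' and the '+' variants are
# rejected up front.
VALID_MODES = ("r", "w", "a")


def textfile(name, encoding, mode="r"):
    """Open a text file through a codec and return a file-like object."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be 'r', 'w' or 'a', got %r" % (mode,))
    if not encoding:
        raise ValueError("an encoding is required")
    # codecs.open() takes care of opening the underlying file in binary
    # mode, so nothing else needs to be added to the mode string.
    return codecs.open(name, mode, encoding)
```

With this in place, the "for line in textfile(name, 'utf-8'):" use case
reads as intended, and the returned object is file-like rather than an
instance of file.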

>> > I'd also prefer to call it 'textfile', as that
>> > reads more nicely with "for line in textfile(...):" use cases, and
>> > it does return a file object.
>>
>> Nope: open() is only guaranteed to return a file-like object,
>> e.g. codecs.open() will return a wrapped version of a file
>> object.
> 
> I meant it's a "file object" in use case terms, not that
> isinstance(ob,file).

We usually call this an "xyz-like object" (meaning that
the object provides a certain kind of interface).

-- 
Marc-Andre Lemburg
eGenix.com

