[Python-Dev] file() vs open(), round 7
M.-A. Lemburg
mal at egenix.com
Tue Dec 27 19:48:07 CET 2005
Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>>> Here's a rough draft:
>>>
>>> def textopen(name, mode="r", encoding=None):
>>>     if "U" not in mode:
>>>         mode += "U"
>>
>> The "U" is not needed when opening files using codecs -
>> these always break lines using .splitlines() which
>> breaks lines according to the Unicode rules and also
>> knows about the various line break variants on different
>> platforms.
>
> Still, codecs typically don't implement universal newlines
> correctly. If you specify 'U', then do .read(), you deserve
> to get \n (U+000A) as the line separator; with most codecs,
> you get whatever line breaks were in the file.
>
> Passing 'U' to the underlying stream is wrong, as well:
> if the stream is double-byte oriented (e.g. UTF-16),
> the 'U' filtering will rarely do anything, but if it does
> something, it will be wrong.
>
> I agree that it would be desirable to have textopen always
> default to universal newlines; however, this is difficult
> to implement.

I think that codecs solve the problem in a better way.
If you want to read lines from a stream, you'd use
.readline() or .readlines(), and not expect .read() to
magically apply some conversion to the original data.
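
To illustrate the distinction, here is a minimal sketch using the
stream readers from the codecs module. The sample data and the
make_reader helper are made up for this example; UTF-16-LE is chosen
only because it is a double-byte encoding like the one mentioned above.

```python
import codecs
import io

# Hypothetical sample data: three lines with three different line endings,
# encoded as UTF-16-LE so that every character occupies two bytes.
raw = u"alpha\r\nbeta\rgamma\n".encode("utf-16-le")


def make_reader(data, encoding="utf-16-le"):
    """Wrap raw bytes in the codec's StreamReader (helper for this sketch)."""
    return codecs.getreader(encoding)(io.BytesIO(data))


# .read() decodes the bytes but leaves the line end markers as stored.
whole_text = make_reader(raw).read()

# .readlines() splits the decoded text on each line break variant it finds.
lines = make_reader(raw).readlines()
```

Here .read() hands back the decoded text with \r\n, \r and \n still in
place, while .readlines() yields one entry per line regardless of which
marker ended it.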

Both line methods have a parameter keepends (which defaults to
True). This parameter specifies whether the returned lines keep
their original line end markers, which lets the application
implement whatever logic it finds appropriate.
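
As a rough sketch of what such application logic could look like, the
following uses keepends to either preserve or drop the markers and then
normalizes every line to end in a single LF. The function name
read_normalized_lines and the sample data are hypothetical, not part of
the codecs module.

```python
import codecs
import io


def read_normalized_lines(stream, encoding):
    """Return decoded lines, each ending in a single LF.

    Hypothetical helper for this sketch: the codec reader does the
    line splitting, the application picks the line end marker.
    """
    reader = codecs.getreader(encoding)(stream)
    # keepends=False drops the original markers; we then append our own.
    return [line + u"\n" for line in reader.readlines(keepends=False)]


# Made-up sample data mixing Windows, old Mac and Unix line endings.
raw = u"one\r\ntwo\rthree\n".encode("utf-8")

# Default keepends=True: the original markers stay on each line.
kept = codecs.getreader("utf-8")(io.BytesIO(raw)).readlines()

# keepends=False: the markers are stripped.
stripped = codecs.getreader("utf-8")(io.BytesIO(raw)).readlines(keepends=False)

# Application-level policy: always end lines with LF.
normalized = read_normalized_lines(io.BytesIO(raw), "utf-8")
```

Note that this simple policy also appends a line end to a final line
that had none in the file; an application that cares about that case
would need to check the last line separately.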
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Dec 27 2005)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/