detecting newline character
Thomas 'PointedEars' Lahn
PointedEars at web.de
Sun Apr 24 05:19:27 EDT 2011
Daniel Geržo wrote:
> On 24.4.2011 9:05, jmfauth wrote:
>> Use the io module.
>
> For the record, when I use io.open(file=self.path, mode="rt",
> encoding=enc)) as fobj:
>
> my tests are passing and everything seems to work fine.
>
> That indicates there is a bug with codecs module and universal newline
> support.
No, it proves that you either have not bothered to read the underlying
source code and documentation (even though it has been quoted to you), or
have not understood it.
It is clear now that codecs.open() does not support universal newlines from
at least Python 2.6 forward, as it is *documented* that it opens files in
*binary mode* only. The source code that I have posted shows that it
therefore actively removes 'U' from the mode string when the `encoding'
argument is passed, and always appends 'b' to the mode if not present. As
a result, __builtin__.open() is called without 'U' in the `mode' argument,
which is *documented* to set file.newlines to None (regardless of whether
Python was compiled with universal newline support).
<http://docs.python.org/library/stdtypes.html?highlight=newlines#file.newlines>
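The mode handling described above can be sketched as follows. This is only an illustration of the behaviour as stated in this post, not the actual codecs source; the helper name effective_mode() is hypothetical.

```python
def effective_mode(mode="rb", encoding=None):
    """Hypothetical helper: approximate the mode string that codecs.open()
    hands to the built-in open(), as described in this post."""
    if encoding is not None:
        if "U" in mode:
            # Universal newline support is dropped when an encoding is given.
            mode = mode.strip().replace("U", "")
        if "b" not in mode:
            # Binary mode is forced.
            mode = mode + "b"
    return mode


print(effective_mode("rU", "utf-8"))  # 'U' removed, 'b' appended
print(effective_mode("rt", "utf-8"))  # 'b' appended after 't'
print(effective_mode("rU"))           # no encoding: mode left untouched
```

With an encoding, the built-in open() therefore never sees 'U', which is why file.newlines stays None in that case.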
`io' is a more general module than `codecs'; therefore io.open() does not
have those restrictions (but it has others – RTSL!¹). Did you notice that
your `mode' argument does not contain `b'? Append it and you will see why
this cannot work.
The bug, if any, is that codecs.open() does not check for your wrong `mode'
argument, while io.open() does.
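A minimal sketch of the difference, assuming a current CPython where io.open() refuses a mode that is both text and binary by raising ValueError. The helper names detect_newlines() and text_and_binary_rejected() and the temporary sample file are made up for this illustration.

```python
import io
import os
import tempfile


def detect_newlines(path, enc="utf-8"):
    """Hypothetical helper: report the newline style(s) seen by io.open()."""
    with io.open(path, mode="rt", encoding=enc) as fobj:
        fobj.read()  # newlines is only filled in once data has been read
        return fobj.newlines


def text_and_binary_rejected(path):
    """Hypothetical helper: True if io.open() refuses the mode 'rtb'."""
    try:
        io.open(path, mode="rtb")
    except ValueError:
        return True
    return False


# Create a small sample file with CRLF line endings (illustration only).
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as raw:
    raw.write(b"first\r\nsecond\r\n")

crlf_seen = detect_newlines(sample_path)
rejected = text_and_binary_rejected(sample_path)
os.remove(sample_path)

print(repr(crlf_seen), rejected)
```

In text mode io.open() records the line endings it translated, whereas the same mode with 'b' appended is rejected outright instead of being silently accepted.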
_____
¹ RTSL: Read the Source, Luke!
--
PointedEars