> On Tue, Apr 5, 2011 at 10:46 AM, Christopher Barker <Chris.Barker@noaa.gov> wrote:
>>
>> On 4/4/11 10:35 PM, Charles R Harris wrote:
>> > IIUC, "Ub" is undefined -- "U" means universal newlines, which
>> > makes no sense when used with "b" for binary. I looked at the code
>> > a ways back, and I can't remember the resolution order, but there
>> > isn't any checking for incompatible flags.
>> >
>> > I'd expect that genfromtxt, being txt, and line oriented, should use
>> > 'rU'. but if it wants the raw line endings (why would it?) then rb
>> > should be fine.
>> >
>> >
>> > "U" has been kept around for backwards compatibility, the python
>> > documentation recommends that it not be used for new code.
>>
>> That is for 3.* -- the 2.7.* docs say:
>>
>> """
>> In addition to the standard fopen() values mode may be 'U' or 'rU'.
>> Python is usually built with universal newline support; supplying 'U'
>> opens the file as a text file, but lines may be terminated by any of the
>> following: the Unix end-of-line convention '\n', the Macintosh
>> convention '\r', or the Windows convention '\r\n'. All of these external
>> representations are seen as '\n' by the Python program. If Python is
>> built without universal newline support a mode with 'U' is the same as
>> normal text mode. Note that file objects so opened also have an
>> attribute called newlines which has a value of None (if no newlines have
>> yet been seen), '\n', '\r', '\r\n', or a tuple containing all the
>> newline types seen.
>>
>> Python enforces that the mode, after stripping 'U', begins with 'r', 'w'
>> or 'a'.
>> """
>>
>> which does, in fact, indicate that 'Ub' is NOT allowed. We should be
>> using 'Ur', I think. Maybe the "python enforces" is what we saw the
>> error from -- it didn't use to enforce anything.
>>
>
> 'rbU' works and I put that in as a quick fix.
>>
>> On 4/5/11 7:12 AM, Charles R Harris wrote:
>>
>> > The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in
>> > python, as it works just fine on python 2.7.
>>
>> "Ub" never made any sense anywhere -- "U" means universal newline text
>> file. "b" means binary -- combining them makes no sense. On older
>> pythons, the behaviour of 'Ub' was undefined -- now, it looks like it is
>> supposed to raise an error.
>>
>> does 'Ur' work with \r line endings on Python 3?
>
> Yes.
>
>>
>> According to my read of the docs, 'U' does nothing -- "universal"
>> newline support is supposed to be the default:
>>
>> """
>> On input, if newline is None, universal newlines mode is enabled. Lines
>> in the input can end in '\n', '\r', or '\r\n', and these are translated
>> into '\n' before being returned to the caller.
>> """
>>
>> > It may indeed be desirable
>> > to read the files as text, but that would require more work on both
>> > loadtxt and genfromtxt.
>>
>> Why can't we just open the file with mode 'Ur'? text is text, messing
>> with line endings shouldn't hurt anything, and it might help.
>>
>
> Well, text in the files then gets the numpy 'U' type instead of 'S', and
> there are places where byte streams are assumed for stripping and such.
> Which is to say that changing to text mode requires some work. Another
> possibility is to use a generator:
>
> from numpy.compat import asbytes
>
> def usetext(fname):
>     # Read in text mode, but yield each line as bytes.
>     with open(fname, 'rt') as f:
>         for l in f:
>             yield asbytes(l)
>
> I think genfromtxt could use a refactoring and cleanup, but probably not for
> 1.6.
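
For what it's worth, here is a minimal Python 3 sketch of the generator idea above: open the file in text mode so universal newlines apply, then re-encode each line so downstream code that assumes byte strings keeps working. The names iter_lines_as_bytes and write_temp, and the choice of latin1 as the encoding, are assumptions for illustration only -- none of this is what numpy actually does.

```python
import os
import tempfile


def iter_lines_as_bytes(fname, encoding="latin1"):
    """Yield each line of a text file as bytes, normalised to '\\n' endings."""
    # newline=None is the default: '\r', '\r\n' and '\n' are all
    # translated to '\n' before the line reaches us.
    with open(fname, "r", encoding=encoding, newline=None) as f:
        for line in f:
            yield line.encode(encoding)


def write_temp(data):
    """Write raw bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


# One file mixing old-Mac, Windows and Unix line endings.
path = write_temp(b"1 2\r3 4\r\n5 6\n")
try:
    lines = list(iter_lines_as_bytes(path))
finally:
    os.remove(path)

print(lines)
```

Every line comes back as bytes ending in a plain b"\n", so the stripping and splitting code would not need to know which convention the file used.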