[Numpy-discussion] ANN: Numpy 1.6.0 beta 2

Charles R Harris charlesr.harris at gmail.com
Tue Apr 5 13:20:45 EDT 2011


On Tue, Apr 5, 2011 at 10:46 AM, Christopher Barker
<Chris.Barker at noaa.gov> wrote:

> On 4/4/11 10:35 PM, Charles R Harris wrote:
> >     IIUC, "Ub" is undefined -- "U" means universal newlines, which makes
> >     no sense when used with "b" for binary. I looked at the code a ways
> >     back, and I can't remember the resolution order, but there isn't any
> >     checking for incompatible flags.
> >
> >     I'd expect that genfromtxt, being txt, and line oriented, should use
> >     'rU'. But if it wants the raw line endings (why would it?) then 'rb'
> >     should be fine.
> >
> >
> > "U" has been kept around for backwards compatibility, the python
> > documentation recommends that it not be used for new code.
>
> That is for 3.* -- the 2.7.* docs say:
>
> """
> In addition to the standard fopen() values mode may be 'U' or 'rU'.
> Python is usually built with universal newline support; supplying 'U'
> opens the file as a text file, but lines may be terminated by any of the
> following: the Unix end-of-line convention '\n', the Macintosh
> convention '\r', or the Windows convention '\r\n'. All of these external
> representations are seen as '\n' by the Python program. If Python is
> built without universal newline support a mode with 'U' is the same as
> normal text mode. Note that file objects so opened also have an
> attribute called newlines which has a value of None (if no newlines have
> yet been seen), '\n', '\r', '\r\n', or a tuple containing all the
> newline types seen.
>
> Python enforces that the mode, after stripping 'U', begins with 'r', 'w'
> or 'a'.
> """
>
> which does, in fact, indicate that 'Ub' is NOT allowed. We should be
> using 'Ur', I think. Maybe the "python enforces" is what we saw the
> error from -- it didn't use to enforce anything.
>
>
'rbU' works and I put that in as a quick fix.

>
> On 4/5/11 7:12 AM, Charles R Harris wrote:
>
> > The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in
> > python, as it works just fine on python 2.7.
>
> "Ub" never made any sense anywhere -- "U" means universal newline text
> file. "b" means binary -- combining them makes no sense. On older
> pythons, the behaviour of 'Ub' was undefined -- now, it looks like it is
> supposed to raise an error.
>
> Does 'Ur' work with \r line endings on Python 3?
>

Yes.
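
For anyone who wants to check the line-ending behaviour themselves, here is a
minimal sketch for Python 3. It uses plain 'r' rather than 'Ur', since
universal newlines are the default there (newline=None) and the 'U' flag was
later removed in Python 3.11. The helper name read_lines and the sample data
are made up for illustration; this is not numpy code.

```python
import os
import tempfile


def read_lines(path, mode):
    """Return the list of lines obtained by iterating a file opened with mode."""
    with open(path, mode) as f:
        return list(f)


# Write a small file that mixes old Mac ('\r'), Windows ('\r\n') and Unix ('\n')
# line endings. Binary mode is used so the bytes land on disk unchanged.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"1 2\r3 4\r\n5 6\n")

# Text mode: every line ending is translated to '\n' before we see it.
text_lines = read_lines(path, "r")

# Binary mode: no translation, and iteration only splits on b'\n',
# so the '\r'-terminated line is glued onto the next one.
raw_lines = read_lines(path, "rb")

os.remove(path)

print(text_lines)
print(raw_lines)
```

In text mode this should give three clean lines, while in binary mode the
'\r'-terminated line is not split off at all, which is presumably why
'\r'-only files cause trouble when read as bytes.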


> According to my read of the docs, 'U' does nothing -- "universal"
> newline support is supposed to be the default:
>
> """
> On input, if newline is None, universal newlines mode is enabled. Lines
> in the input can end in '\n', '\r', or '\r\n', and these are translated
> into '\n' before being returned to the caller.
> """
>
> > It may indeed be desirable
> > to read the files as text, but that would require more work on both
> > loadtxt and genfromtxt.
>
> Why can't we just open the file with mode 'Ur'? Text is text, messing
> with line endings shouldn't hurt anything, and it might help.
>
>
Well, text in the files then gets the numpy 'U' type instead of 'S', and
there are places where byte streams are assumed for stripping and such.
Which is to say that changing to text mode requires some work. Another
possibility is to use a generator:

from numpy.compat import asbytes

def usetext(fname):
    # Text mode translates '\r' and '\r\n' to '\n' on Python 3.
    with open(fname, 'rt') as f:
        for l in f:
            yield asbytes(l)
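
A self-contained variant of the generator idea, as a rough sketch only: it
reads in text mode so the line endings are normalized, then hands bytes back
to the caller so that code expecting byte strings keeps working. The name
iter_text_as_bytes and the latin-1 default are assumptions for illustration
(latin-1 maps every byte to one character, so the round trip is lossless);
nothing here is what genfromtxt actually does.

```python
import os
import tempfile


def iter_text_as_bytes(fname, encoding="latin-1"):
    """Yield each line of fname as bytes, with line endings normalized to b'\\n'.

    The file is opened in text mode so that Python 3 translates '\\r' and
    '\\r\\n' to '\\n'; each line is then encoded back to bytes.
    """
    with open(fname, "r", encoding=encoding) as f:
        for line in f:
            yield line.encode(encoding)


# Usage: a file with mixed line endings comes back as uniform byte lines.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"1 2\r3 4\r\n5 6\n")

result = list(iter_text_as_bytes(path))
os.remove(path)

print(result)
```

The appeal of this approach is that downstream stripping and splitting code
that assumes byte strings would not need to change; the cost is an extra
decode/encode pass over every line.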

I think genfromtxt could use a refactoring and cleanup, but probably not for
1.6.

Chuck