ANN: Numpy 1.6.0 beta 2
Hi,

I am pleased to announce the availability of the second beta of NumPy 1.6.0. Due to the extensive changes in the Numpy core for this release, the beta testing phase will last at least one month. Please test this beta and report any problems on the Numpy mailing list.

Many bug fixes went in since beta 1, among which:

- fix installation of source release with python 3
- f2py fixes for assumed shape arrays
- several loadtxt bug fixes and enhancements
- change default floating point error handling from "print" to "warn"
- much more

I quickly counted in the timeline, and in the last few weeks the number of open tickets has been decreased by over 100. Thanks to everyone who contributed to this spring cleaning!

Sources and binaries can be found at http://sourceforge.net/projects/numpy/files/NumPy/1.6.0b2/
For (preliminary) release notes see below.

Enjoy,
Ralf

Note: NumPy 1.6.0 is not yet released.

=========================
NumPy 1.6.0 Release Notes
=========================

This release includes several new features as well as numerous bug fixes and improved documentation. It is backward compatible with the 1.5.0 release, and supports Python 2.4 - 2.7 and 3.1 - 3.2.

Highlights
==========

* Re-introduction of datetime dtype support to deal with dates in arrays.
* A new 16-bit floating point type.
* A new iterator, which improves performance of many functions.

New features
============

New 16-bit floating point type
------------------------------

This release adds support for the IEEE 754-2008 binary16 format, available as the data type ``numpy.half``. Within Python, the type behaves similarly to `float` or `double`, and C extensions can add support for it with the exposed half-float API.

New iterator
------------

A new iterator has been added, replacing the functionality of the existing iterator and multi-iterator with a single object and API.
This iterator works well with general memory layouts different from C or Fortran contiguous, and handles both standard NumPy and customized broadcasting. The buffering, automatic data type conversion, and optional output parameters, offered by ufuncs but difficult to replicate elsewhere, are now exposed by this iterator.

Legendre, Laguerre, Hermite, HermiteE polynomials in ``numpy.polynomial``
-------------------------------------------------------------------------

Extend the number of polynomials available in the polynomial package. In addition, a new ``window`` attribute has been added to the classes in order to specify the range the ``domain`` maps to. This is mostly useful for the Laguerre, Hermite, and HermiteE polynomials, whose natural domains are infinite, and provides a more intuitive way to get the correct mapping of values without playing unnatural tricks with the domain.

Fortran assumed shape array and size function support in ``numpy.f2py``
-----------------------------------------------------------------------

F2py now supports wrapping Fortran 90 routines that use assumed shape arrays. Previously, such routines could be called from Python, but the corresponding Fortran routines received assumed shape arrays as zero length arrays, which caused unpredictable results. Thanks to Lorenz Hüdepohl for pointing out the correct way to interface routines with assumed shape arrays.

In addition, f2py interprets the Fortran expression ``size(array, dim)`` as ``shape(array, dim-1)``, which makes it possible to automatically wrap Fortran routines that use the two-argument ``size`` function in dimension specifications. Previously, users were forced to apply this mapping manually.

Other new functions
-------------------

``numpy.ravel_multi_index`` : Converts a multi-index tuple into an array of flat indices, applying boundary modes to the indices.

``numpy.einsum`` : Evaluate the Einstein summation convention.
Using the Einstein summation convention, many common multi-dimensional array operations can be represented in a simple fashion. This function provides a way to compute such summations.

``numpy.count_nonzero`` : Counts the number of non-zero elements in an array.

``numpy.result_type`` and ``numpy.min_scalar_type`` : These functions expose the underlying type promotion used by the ufuncs and other operations to determine the types of outputs. These improve upon ``numpy.common_type`` and ``numpy.mintypecode``, which provide similar functionality but do not match the ufunc implementation.

Changes
=======

Changes and improvements in the numpy core
------------------------------------------

``default error handling``
--------------------------

The default error handling has been changed from ``print`` to ``warn`` for all except for ``underflow``, which remains as ``ignore``.

``numpy.distutils``
-------------------

Several new compilers are supported for building Numpy: the Portland Group Fortran compiler on OS X, the PathScale compiler suite and the 64-bit Intel C compiler on Linux.

``numpy.testing``
-----------------

The testing framework gained ``numpy.testing.assert_allclose``, which provides a more convenient way to compare floating point arrays than `assert_almost_equal`, `assert_approx_equal` and `assert_array_almost_equal`.

``C API``
---------

In addition to the APIs for the new iterator and half data type, a number of other additions have been made to the C API. The type promotion mechanism used by ufuncs is exposed via ``PyArray_PromoteTypes``, ``PyArray_ResultType``, and ``PyArray_MinScalarType``. A new enumeration ``NPY_CASTING`` has been added which controls what types of casts are permitted. This is used by the new functions ``PyArray_CanCastArrayTo`` and ``PyArray_CanCastTypeTo``. A more flexible way to handle conversion of arbitrary python objects into arrays is exposed by ``PyArray_GetArrayParamsFromObject``.
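As a quick illustration of the Python-level additions described above, the following sketch exercises ``numpy.half``, the new functions, and ``assert_allclose``. The array values and variable names are made up for the example, and it assumes a NumPy installation that provides these names (1.6.0 or later).

```python
import numpy as np
from numpy.testing import assert_allclose

# 16-bit floating point type: each element occupies two bytes.
h = np.array([1.0, 0.5], dtype=np.half)

# Einstein summation: "ij,jk->ik" is an ordinary matrix product.
a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
prod = np.einsum("ij,jk->ik", a, b)
assert_allclose(prod, a.dot(b))

# Number of non-zero entries in an array.
n = np.count_nonzero(np.array([0, 1, 0, 2]))

# Flat index of element (1, 2) in a C-ordered 2x3 array: 1*3 + 2.
flat = np.ravel_multi_index((1, 2), (2, 3))

# Type promotion as applied by the ufuncs.
rt = np.result_type(np.int32, np.float32)
mst = np.min_scalar_type(255)

print(h.itemsize, n, flat, rt, mst)
```

``assert_allclose`` raises ``AssertionError`` when the arrays differ by more than its relative tolerance, so the comparison above doubles as a check on the ``einsum`` subscripts.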
Deprecated features
===================

The "normed" keyword in ``numpy.histogram`` is deprecated. Its functionality will be replaced by the new "density" keyword.

Removed features
================

``numpy.fft``
-------------

The functions `refft`, `refft2`, `refftn`, `irefft`, `irefft2`, `irefftn`, which were aliases for the same functions without the 'e' in the name, were removed.

``numpy.memmap``
----------------

The `sync()` and `close()` methods of memmap were removed. Use `flush()` and "del memmap" instead.

``numpy.lib``
-------------

The deprecated functions ``numpy.unique1d``, ``numpy.setmember1d``, ``numpy.intersect1d_nu`` and ``numpy.lib.ufunclike.log2`` were removed.

``numpy.ma``
------------

Several deprecated items were removed from the ``numpy.ma`` module::

    * ``numpy.ma.MaskedArray`` "raw_data" method
    * ``numpy.ma.MaskedArray`` constructor "flag" keyword
    * ``numpy.ma.make_mask`` "flag" keyword
    * ``numpy.ma.allclose`` "fill_value" keyword

``numpy.distutils``
-------------------

The ``numpy.get_numpy_include`` function was removed, use ``numpy.get_include`` instead.
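For code that still uses the removed or deprecated names, a migration could look roughly like the sketch below. It only covers the replacements the notes above name explicitly (``rfft`` for ``refft``, ``density`` for ``normed``, ``flush()`` and ``del`` for the memmap methods, ``get_include`` for ``get_numpy_include``). The sample data and the temporary file are assumptions made for the example.

```python
import os
import tempfile

import numpy as np

# refft and friends were aliases; the names without the 'e' remain.
spectrum = np.fft.rfft([1.0, 0.0, 0.0, 0.0])

# histogram: use density= where normed= was used before.
counts, edges = np.histogram([0.1, 0.2, 0.7, 0.9], bins=2,
                             range=(0.0, 1.0), density=True)
area = float((counts * np.diff(edges)).sum())

# memmap: flush() replaces sync(), and deleting the object replaces close().
path = os.path.join(tempfile.mkdtemp(), "data.bin")
mm = np.memmap(path, dtype=np.float32, mode="w+", shape=(4,))
mm[:] = 1.0
mm.flush()
del mm
restored = np.fromfile(path, dtype=np.float32)

# get_include() replaces get_numpy_include().
include_dir = np.get_include()

print(len(spectrum), area, restored, include_dir)
```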
Hi all,

On 4 Apr 2011, at 22:04, Ralf Gommers wrote:
> I am pleased to announce the availability of the second beta of NumPy 1.6.0. Due to the extensive changes in the Numpy core for this release, the beta testing phase will last at least one month. Please test this beta and report any problems on the Numpy mailing list.
the tests have a number of Python2.4-incompatibilities, one for a file opening mode and the rest for class declaration styles.

Cheers,
Derek

Running unit tests for numpy
NumPy version 1.6.0b2
NumPy is installed in /sw/lib/python2.4/site-packages/numpy
Python version 2.4.4 (#1, Jan 5 2011, 03:05:41) [GCC 4.0.1 (Apple Inc. build 5493)]
nose version 1.0.0
.....
======================================================================
ERROR: Failure: SyntaxError (invalid syntax (test_multiarray.py, line 1023))
...
  File "/sw/lib/python2.4/site-packages/numpy/core/tests/test_multiarray.py", line 1023
    class TestPutmask():
                      ^
SyntaxError: invalid syntax
======================================================================
ERROR: Failure: SyntaxError (invalid syntax (test_numeric.py, line 1068))
...
  File "/sw/lib/python2.4/site-packages/numpy/core/tests/test_numeric.py", line 1068
    class TestAllclose():
                       ^
SyntaxError: invalid syntax
======================================================================
ERROR: Failure: SyntaxError (invalid syntax (test_scalarmath.py, line 84))
...
  File "/sw/lib/python2.4/site-packages/numpy/core/tests/test_scalarmath.py", line 84
    class TestRepr():
                   ^
SyntaxError: invalid syntax
======================================================================
ERROR: Failure: SyntaxError (invalid syntax (test_twodim_base.py, line 280))
...
  File "/sw/lib/python2.4/site-packages/numpy/lib/tests/test_twodim_base.py", line 280
    class TestTriuIndices():
                          ^
SyntaxError: invalid syntax
======================================================================
ERROR: Failure: SyntaxError (invalid syntax (test_linalg.py, line 243))
...
  File "/sw/lib/python2.4/site-packages/numpy/linalg/tests/test_linalg.py", line 243
    class TestMatrixPower():
                          ^
SyntaxError: invalid syntax
======================================================================
ERROR: test_gft_filename (test_io.TestFromTxt)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/sw/lib/python2.4/site-packages/numpy/lib/tests/test_io.py", line 1327, in test_gft_filename
    assert_array_equal(np.genfromtxt(name), exp_res)
  File "/sw/lib/python2.4/site-packages/numpy/lib/npyio.py", line 1235, in genfromtxt
    fhd = iter(np.lib._datasource.open(fname, 'Ub'))
  File "/sw/lib/python2.4/site-packages/numpy/lib/_datasource.py", line 145, in open
    return ds.open(path, mode)
  File "/sw/lib/python2.4/site-packages/numpy/lib/_datasource.py", line 477, in open
    return _file_openers[ext](found, mode=mode)
IOError: invalid mode: Ub

----------------------------------------------------------------------
Ran 2648 tests in 28.573s

FAILED (KNOWNFAIL=3, SKIP=21, errors=6)
On Mon, Apr 4, 2011 at 8:42 PM, Derek Homeier <derek@astro.physik.uni-goettingen.de> wrote:
> the tests have a number of Python2.4-incompatibilities, one for a file opening mode and the rest for class declaration styles.
> [...]
>   File "/sw/lib/python2.4/site-packages/numpy/linalg/tests/test_linalg.py", line 243
>     class TestMatrixPower():
> SyntaxError: invalid syntax

Subclassing object should fix those, I think.

> ERROR: test_gft_filename (test_io.TestFromTxt)
> [...]
>     fhd = iter(np.lib._datasource.open(fname, 'Ub'))
> [...]
> IOError: invalid mode: Ub

Guess that wasn't tested before ;) I thought that was strange when I saw it. The source of the problem is line 2035 in npyio.py. Additionally, since genloadtxt needs to have byte strings, the 'rb' mode should probably be used. That works on Linux, both for Python 2 and Python 3, but doing that might uncover genfromtxt problems on other platforms.

Chuck
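A minimal sketch of the fix suggested for the class declaration errors: Python 2.4 rejects empty parentheses in a class statement, so naming ``object`` as the base keeps the test modules importable there. The method below is a hypothetical placeholder and is not taken from the NumPy test suite.

```python
class TestPutmask(object):
    """Test class with an explicit base, which also parses on Python 2.4."""

    def test_placeholder(self):
        # Hypothetical test body, present only to make the class complete.
        assert True


TestPutmask().test_placeholder()
```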
On Mon, Apr 4, 2011 at 11:42 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
> Additionally, since genloadtxt needs to have byte strings, the 'rb' mode should probably be used. That works on Linux, both for Python 2 and Python 3, but doing that might uncover genfromtxt problems on other platforms.

"rb" is fine on Windows with Python 3.2 (that's what I tested initially for this bug).

Josef

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Tue, Apr 5, 2011 at 12:03 AM, <josef.pktd@gmail.com> wrote:
> "rb" is fine on Windows with python 3.2, (that's what I tested initially for this bug)

Sorry, I take this back. All our files have \n line endings, not \r\n, so I didn't test the latter case.

Josef
On 4/4/11 9:03 PM, josef.pktd@gmail.com wrote:
> On Mon, Apr 4, 2011 at 11:42 PM, Charles R Harris
>>> File "/sw/lib/python2.4/site-packages/numpy/lib/_datasource.py", line 477, in open
>>>     return _file_openers[ext](found, mode=mode)
>>> IOError: invalid mode: Ub
>> Guess that wasn't tested before ;) I thought that was strange when I saw it. The source of the problem is line 2035 in npyio.py. Additionally, since genloadtxt needs to have byte strings, the 'rb' mode should probably be used. That works on linux, both for python 2 and python 3, but doing that might uncover genfromtxt problems on other platforms.
> "rb" is fine on Windows with python 3.2, (that's what I tested initially for this bug)

IIUC, "Ub" is undefined -- "U" means universal newlines, which makes no sense when used with "b" for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.

I'd expect that genfromtxt, being txt and line oriented, should use 'rU', but if it wants the raw line endings (why would it?) then 'rb' should be fine.

Note that if you only test with unix (\n) and dos (\r\n) line endings, it may work with 'b', if it's splitting on '\n', but not if you try it with Mac endings (\r). Of course, with OS X, Mac endings are getting pretty uncommon.

-Chris
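The point about splitting on '\n' can be checked with a small experiment. This is only a sketch run on Python 3, where plain text mode already applies universal newlines. The helper name ``read_lines`` and the sample data are made up for the illustration.

```python
import os
import tempfile


def read_lines(data, mode):
    """Write raw bytes to a temporary file, reopen it in `mode`, return its lines."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        with open(path, mode) as f:
            return list(f)
    finally:
        os.remove(path)


samples = {
    "unix": b"1 2\n3 4\n",
    "dos": b"1 2\r\n3 4\r\n",
    "mac": b"1 2\r3 4\r",
}

for name, data in samples.items():
    binary_lines = read_lines(data, "rb")  # iteration splits on b"\n" only
    text_lines = read_lines(data, "r")     # universal newlines, all become "\n"
    print(name, len(binary_lines), len(text_lines))
```

In binary mode the dos sample keeps a trailing ``\r`` on each line and the mac sample comes back as one single line, while text mode yields two lines for all three samples.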
On Mon, Apr 4, 2011 at 11:22 PM, Chris Barker <Chris.Barker@noaa.gov> wrote:
> I'd expect that genfromtxt, being txt, and line oriented, should use 'rU'. but if it wants the raw line endings (why would it?) then rb should be fine.

"U" has been kept around for backwards compatibility; the Python documentation recommends that it not be used for new code. Whatever 'new' means in this case ;) If it is unneeded for Python 2.4, I say drop it.

> Note that if you only test with unix (\n) and dos (\r\n) line endings, it may work with 'b', if it's splitting on '\n', but not if you try it with Mac endings (\r). Of course with OS-X mac endings are getting pretty uncommon.

I suppose we could try the different line endings on a single platform and see what happens. It would be nice to know just how portable text really is.

Chuck
On 4/4/11 10:35 PM, Charles R Harris wrote:
> "U" has been kept around for backwards compatibility, the python documentation recommends that it not be used for new code.

That is for 3.* -- the 2.7.* docs say:

"""
In addition to the standard fopen() values mode may be 'U' or 'rU'. Python is usually built with universal newline support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program. If Python is built without universal newline support a mode with 'U' is the same as normal text mode. Note that file objects so opened also have an attribute called newlines which has a value of None (if no newlines have yet been seen), '\n', '\r', '\r\n', or a tuple containing all the newline types seen.

Python enforces that the mode, after stripping 'U', begins with 'r', 'w' or 'a'.
"""

which does, in fact, indicate that 'Ub' is NOT allowed. We should be using 'Ur', I think. Maybe the "python enforces" is what we saw the error from -- it didn't use to enforce anything.

On 4/5/11 7:12 AM, Charles R Harris wrote:
> The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in python, as it works just fine on python 2.7.

"Ub" never made any sense anywhere -- "U" means universal newline text file, "b" means binary -- combining them makes no sense. On older Pythons, the behaviour of 'Ub' was undefined -- now, it looks like it is supposed to raise an error.

Does 'Ur' work with \r line endings on Python 3?

According to my read of the docs, 'U' does nothing -- "universal" newline support is supposed to be the default:

"""
On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller.
"""

> It may indeed be desirable to read the files as text, but that would require more work on both loadtxt and genfromtxt.

Why can't we just open the file with mode 'Ur'? Text is text; messing with line endings shouldn't hurt anything, and it might help.

If we stick with binary, then it comes down to:

- will having an extra \r with Windows files hurt anything? -- probably not.
- Are there many mac-style text files out there anymore? not many.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
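The Python 3 behaviour quoted above can be seen directly through the ``newline`` argument of ``open()``. The sketch below is an illustration only, with made-up file contents: ``newline=None`` (the default) translates every ending to ``\n``, while ``newline=""`` still recognises all three endings but returns them untranslated.

```python
import os
import tempfile

# One temporary file that mixes Mac, Unix and DOS line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"a\rb\nc\r\n")

# Universal newlines with translation (the Python 3 default).
with open(path, "r", newline=None) as f:
    translated = f.readlines()

# Universal newlines without translation: endings are kept as found.
with open(path, "r", newline="") as f:
    untranslated = f.readlines()

os.remove(path)
print(translated)
print(untranslated)
```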
On Tue, Apr 5, 2011 at 10:46 AM, Christopher Barker <Chris.Barker@noaa.gov> wrote:
> which does, in fact indicate that 'Ub' is NOT allowed. We should be using 'Ur', I think. Maybe the "python enforces" is what we saw the error from -- it didn't used to enforce anything.

'rbU' works and I put that in as a quick fix.

> does 'Ur' work with \r line endings on Python 3?

Yes.

> Why can't we just open the file with mode 'Ur'? text is text, messing with line endings shouldn't hurt anything, and it might help.

Well, text in the files then gets the numpy 'U' type instead of 'S', and there are places where byte streams are assumed for stripping and such. Which is to say that changing to text mode requires some work. Another possibility is to use a generator:

    from numpy.compat import asbytes

    def usetext(fname):
        f = open(fname, 'rt')
        # Yield each line exactly once, converted to bytes.
        for l in f:
            yield asbytes(l)

I think genfromtxt could use a refactoring and cleanup, but probably not for 1.6.

Chuck
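A self-contained variant of the generator idea, written for Python 3, might look like the sketch below. The helper ``as_bytes`` is a hypothetical stand-in for ``numpy.compat.asbytes`` (assumed here to latin1-encode text), and the sample file with Mac line endings is made up for the illustration.

```python
import os
import tempfile


def as_bytes(s):
    # Hypothetical stand-in for numpy.compat.asbytes: latin1-encode text.
    return s if isinstance(s, bytes) else s.encode("latin1")


def usetext(fname):
    """Yield the lines of a text file as bytes, with endings translated to newline."""
    with open(fname, "r", encoding="latin1") as f:
        for line in f:
            yield as_bytes(line)


# Sample file using Mac (\r) line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"1 2\r3 4\r")

lines = list(usetext(path))
os.remove(path)
print(lines)
```

Opening in text mode lets Python handle the three line-ending conventions, and the consumer still receives byte strings, so code that assumes bytes for stripping would not need to change.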
On Tue, Apr 5, 2011 at 1:20 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Tue, Apr 5, 2011 at 10:46 AM, Christopher Barker <Chris.Barker@noaa.gov> wrote:
On 4/4/11 10:35 PM, Charles R Harris wrote:
IIUC, "Ub" is undefined -- "U" means universal newlines, which makes no sense when used with "b" for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.
I'd expect that genfromtxt, being txt, and line oriented, should use 'rU'. but if it wants the raw line endings (why would it?) then rb should be fine.
"U" has been kept around for backwards compatibility, the python documentation recommends that it not be used for new code.
That is for 3.* -- the 2.7.* docs say:
""" In addition to the standard fopen() values mode may be 'U' or 'rU'. Python is usually built with universal newline support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program. If Python is built without universal newline support a mode with 'U' is the same as normal text mode. Note that file objects so opened also have an attribute called newlines which has a value of None (if no newlines have yet been seen), '\n', '\r', '\r\n', or a tuple containing all the newline types seen.
Python enforces that the mode, after stripping 'U', begins with 'r', 'w' or 'a'. ""
which does, in fact indicate that 'Ub' is NOT allowed. We should be using 'Ur', I think. Maybe the "python enforces" is what we saw the error from -- it didn't used to enforce anything.
'rbU' works and I put that in as a quick fix.
On 4/5/11 7:12 AM, Charles R Harris wrote:
The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in python, as it works just fine on python 2.7.
"Ub" never made any sense anywhere -- "U" means universal newline text file. "b" means binary -- combining them makes no sense. On older pythons, the behaviour of 'Ub' was undefined -- now, it looks like it is supposed to raise an error.
does 'Ur' work with \r line endings on Python 3?
Yes.
According to my read of the docs, 'U' does nothing -- "universal" newline support is supposed to be the default:
""" On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. """
It may indeed be desirable to read the files as text, but that would require more work on both loadtxt and genfromtxt.
Why can't we just open the file with mode 'Ur'? text is text, messing with line endings shouldn't hurt anything, and it might help.
Well, text in the files then gets the numpy 'U' type instead of 'S', and there are places where byte streams are assumed for stripping and such. Which is to say that changing to text mode requires some work. Another possibility is to use a generator:
    def usetext(fname):
        f = open(fname, 'rt')
        for l in f:
            yield asbytes(l)
I think genfromtxt could use a refactoring and cleanup, but probably not for 1.6.
I think it should also be possible to read "rb" and strip any \r, \r\n in _iotools.py; that's where the bytes are used, from my reading and the initial error message.
Josef
Chuck
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Tue, Apr 5, 2011 at 11:45 AM, <josef.pktd@gmail.com> wrote:
On Tue, Apr 5, 2011 at 1:20 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Tue, Apr 5, 2011 at 10:46 AM, Christopher Barker <Chris.Barker@noaa.gov> wrote:
On 4/4/11 10:35 PM, Charles R Harris wrote:
IIUC, "Ub" is undefined -- "U" means universal newlines, which makes no sense when used with "b" for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.
I'd expect that genfromtxt, being txt, and line oriented, should use 'rU'. but if it wants the raw line endings (why would it?) then rb should be fine.
"U" has been kept around for backwards compatibility, the python documentation recommends that it not be used for new code.
That is for 3.* -- the 2.7.* docs say:
""" In addition to the standard fopen() values mode may be 'U' or 'rU'. Python is usually built with universal newline support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program. If Python is built without universal newline support a mode with 'U' is the same as normal text mode. Note that file objects so opened also have an attribute called newlines which has a value of None (if no newlines have yet been seen), '\n', '\r', '\r\n', or a tuple containing all the newline types seen.
Python enforces that the mode, after stripping 'U', begins with 'r', 'w' or 'a'. """
which does, in fact indicate that 'Ub' is NOT allowed. We should be using 'Ur', I think. Maybe the "python enforces" is what we saw the error from -- it didn't used to enforce anything.
'rbU' works and I put that in as a quick fix.
On 4/5/11 7:12 AM, Charles R Harris wrote:
The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in python, as it works just fine on python 2.7.
"Ub" never made any sense anywhere -- "U" means universal newline text file. "b" means binary -- combining them makes no sense. On older pythons, the behaviour of 'Ub' was undefined -- now, it looks like it is supposed to raise an error.
does 'Ur' work with \r line endings on Python 3?
Yes.
According to my read of the docs, 'U' does nothing -- "universal" newline support is supposed to be the default:
""" On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. """
It may indeed be desirable to read the files as text, but that would require more work on both loadtxt and genfromtxt.
Why can't we just open the file with mode 'Ur'? text is text, messing with line endings shouldn't hurt anything, and it might help.
Well, text in the files then gets the numpy 'U' type instead of 'S', and there are places where byte streams are assumed for stripping and such. Which is to say that changing to text mode requires some work. Another possibility is to use a generator:
    def usetext(fname):
        f = open(fname, 'rt')
        for l in f:
            yield asbytes(l)
I think genfromtxt could use a refactoring and cleanup, but probably not for 1.6.
I think it should also be possible to read "rb" and strip any \r, \r\n in _iotools.py; that's where the bytes are used, from my reading and the initial error message.
Doesn't work for \r, you get the whole file at once instead of line by line. Chuck
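The point about '\r' can be reproduced without numpy. The following is a minimal sketch using only the standard library; the sample data and the helper names `binary_iteration` and `universal_split` are made up for illustration and are not part of numpy. It shows that iterating a binary stream splits only on b'\n', so a CR-only file comes back as one line, while `bytes.splitlines()` recognises all three endings without decoding (at the cost of having the whole content in memory).

```python
import io


def binary_iteration(raw):
    """Return the lines produced by iterating a binary stream."""
    # Binary streams treat only b'\n' as a line terminator.
    return list(io.BytesIO(raw))


def universal_split(raw):
    """Split bytes on LF, CRLF and CR, dropping the terminators."""
    return raw.splitlines()


unix = b"1 2\n3 4\n"
dos = b"1 2\r\n3 4\r\n"
old_mac = b"1 2\r3 4\r"

mac_binary = binary_iteration(old_mac)   # the whole file as a single "line"
dos_binary = binary_iteration(dos)       # two lines, each still ending in CRLF
mac_split = universal_split(old_mac)     # two lines, terminators removed
```

With DOS endings the leftover '\r' can be stripped afterwards, which is presumably why the problem only shows up for CR-only files.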
Hi, On Tue, Apr 5, 2011 at 10:56 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Tue, Apr 5, 2011 at 11:45 AM, <josef.pktd@gmail.com> wrote:
On Tue, Apr 5, 2011 at 1:20 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Tue, Apr 5, 2011 at 10:46 AM, Christopher Barker <Chris.Barker@noaa.gov> wrote:
On 4/4/11 10:35 PM, Charles R Harris wrote:
IIUC, "Ub" is undefined -- "U" means universal newlines, which makes no sense when used with "b" for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.
I'd expect that genfromtxt, being txt, and line oriented, should use 'rU'. but if it wants the raw line endings (why would it?) then rb should be fine.
"U" has been kept around for backwards compatibility, the python documentation recommends that it not be used for new code.
That is for 3.* -- the 2.7.* docs say:
""" In addition to the standard fopen() values mode may be 'U' or 'rU'. Python is usually built with universal newline support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program. If Python is built without universal newline support a mode with 'U' is the same as normal text mode. Note that file objects so opened also have an attribute called newlines which has a value of None (if no newlines have yet been seen), '\n', '\r', '\r\n', or a tuple containing all the newline types seen.
Python enforces that the mode, after stripping 'U', begins with 'r', 'w' or 'a'. """
which does, in fact indicate that 'Ub' is NOT allowed. We should be using 'Ur', I think. Maybe the "python enforces" is what we saw the error from -- it didn't used to enforce anything.
'rbU' works and I put that in as a quick fix.
On 4/5/11 7:12 AM, Charles R Harris wrote:
The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in python, as it works just fine on python 2.7.
"Ub" never made any sense anywhere -- "U" means universal newline text file. "b" means binary -- combining them makes no sense. On older pythons, the behaviour of 'Ub' was undefined -- now, it looks like it is supposed to raise an error.
does 'Ur' work with \r line endings on Python 3?
Yes.
According to my read of the docs, 'U' does nothing -- "universal" newline support is supposed to be the default:
""" On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. """
It may indeed be desirable to read the files as text, but that would require more work on both loadtxt and genfromtxt.
Why can't we just open the file with mode 'Ur'? text is text, messing with line endings shouldn't hurt anything, and it might help.
Well, text in the files then gets the numpy 'U' type instead of 'S', and there are places where byte streams are assumed for stripping and such. Which is to say that changing to text mode requires some work. Another possibility is to use a generator:
    def usetext(fname):
        f = open(fname, 'rt')
        for l in f:
            yield asbytes(l)
I think genfromtxt could use a refactoring and cleanup, but probably not for 1.6.
I think it should also be possible to read "rb" and strip any \r, \r\n in _iotools.py; that's where the bytes are used, from my reading and the initial error message.
Doesn't work for \r, you get the whole file at once instead of line by line.
Thanks for trying to sort out this ugliness. I've added another pull request: https://github.com/numpy/numpy/pull/71 - tests for \n \r\n and \r files, raising skiptest for currently failing 3.2 \r mode. Matthew
Hi, On Tue, Apr 5, 2011 at 9:46 AM, Christopher Barker <Chris.Barker@noaa.gov> wrote:
On 4/4/11 10:35 PM, Charles R Harris wrote:
IIUC, "Ub" is undefined -- "U" means universal newlines, which makes no sense when used with "b" for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.
I disagree that U makes no sense for binary file reading.
In python 3:
'b' means "return byte objects"
't' means "return decoded strings"
'U' means two things:
1) When iterating by line, split lines at any of '\r', '\r\n', '\n'
2) When returning lines split this way, convert '\r' and '\r\n' to '\n'
If you support returning lines from a binary file (which python 3 does), then I think 'U' is a sensible thing to allow - as in this case.
Best,
Matthew
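For text streams, Python 3 already lets these two behaviours be selected separately through the `newline` argument. Here is a small sketch, assuming ASCII sample data and a hypothetical helper `read_lines`: `newline=None` both splits on every terminator and translates it to '\n', while `newline=''` splits on every terminator but returns it untranslated.

```python
import io

raw = b"a\rb\r\nc\n"


def read_lines(raw, newline):
    """Decode raw bytes as ASCII text and return the lines as a list."""
    text = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=newline)
    return list(text)


# Split on CR, CRLF and LF, and translate each terminator to LF.
translated = read_lines(raw, None)

# Split on CR, CRLF and LF, but keep the original terminators.
split_only = read_lines(raw, "")
```

No equivalent switch is shown here for binary streams; the sketch only illustrates that splitting and translation are separable in text mode.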
On Tue, Apr 5, 2011 at 5:56 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Tue, Apr 5, 2011 at 9:46 AM, Christopher Barker <Chris.Barker@noaa.gov> wrote:
On 4/4/11 10:35 PM, Charles R Harris wrote:
IIUC, "Ub" is undefined -- "U" means universal newlines, which makes no sense when used with "b" for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.
I disagree that U makes no sense for binary file reading.
In python 3:
'b' means, "return byte objects" 't' means "return decoded strings"
'U' means two things:
1) When iterating by line, split lines at any of '\r', '\r\n', '\n' 2) When returning lines split this way, convert '\r' and '\r\n' to '\n'
If you support returning lines from a binary file (which python 3 does), then I think 'U' is a sensible thing to allow - as in this case.
U looks appropriate in this case, better than the workarounds. However, to me the python 3.2 docs seem to say that U only works for text mode, and that readline only takes \n as the line separator:
readline(limit=-1)
Read and return one line from the stream. If limit is specified, at most limit bytes will be read. The line terminator is always b'\n' for binary files; for text files, the newlines argument to open() can be used to select the line terminator(s) recognized.
Josef
Best,
Matthew
On 4/5/11 3:36 PM, josef.pktd@gmail.com wrote:
I disagree that U makes no sense for binary file reading.
I wasn't saying that it made no sense to have a "U" mode for binary file reading, what I meant is that by the python2 definition, it made no sense. In Python 2, the ONLY difference between binary and text mode is line-feed translation. As for Python 3:
In python 3:
'b' means, "return byte objects" 't' means "return decoded strings"
'U' means two things:
1) When iterating by line, split lines at any of '\r', '\r\n', '\n' 2) When returning lines split this way, convert '\r' and '\r\n' to '\n'
a) 'U' is default -- it's essentially the same as 't' (in PY3), so 't' means "return decoded and line-feed translated unicode objects" b) I think the line-feed conversion is done regardless of if you are iterating by lines, i.e. with a full-on .read(). At least that's how it works in py2 -- not running py3 here to test.
If you support returning lines from a binary file (which python 3 does), then I think 'U' is a sensible thing to allow - as in this case.
but what is a "binary file"? I THINK what you are proposing is that we'd want to be able to have both linefeed translation and no decoding done. But I think that's impossible -- aren't the linefeeds themselves encoded differently with different encodings?
U looks appropriate in this case, better than the workarounds. However, to me the python 3.2 docs seem to say that U only works for text mode
Agreed -- but I don't see the problem -- your files are either encoded in something that might treat newlines differently (UCS32, maybe?), in which case you'd want it decoded, or you are working with ascii or ansi or utf-8, in which case you can specify the encoding anyway.
I don't understand why we'd want a binary blob for text parsing -- the parsing code is going to have to know something about the encoding to work -- it might as well get passed in to the file open call, and work with unicode. I suppose if we still want to assume ascii for parsing, then we could use 't' and then re-encode to ascii to work with it. Which I agree does seem heavy handed just for fixing newlines.
Also, one problem I've often had with encodings is what happens if I think I have ascii, but really have a couple characters above 127 -- then the default is to get an error in decoding. I'd like to be able to pass in a flag that either skips the un-decodable characters or replaces them with something, but it doesn't look like you can do that with the file open function in py3.
The line terminator is always b'\n' for binary files;
Once you really make the distinction between text and binary, the concept of a "line terminator" doesn't really make sense anyway.
In the ansi world, everyone should always have used 'U' for text. It probably would have been the default if it had been there from the beginning. People got away without it because:
1) dos line feeds have a "\n" in them anyway
2) most of the time it doesn't matter that there is an extra whitespace character in there
3) darn few of us ever had to deal with the mac "\r"
Now that we are in a unicode world (at least a little) there is simply no way around the fact that you can't reliably read a file without knowing how it is encoded.
My thought at this point is to say that the numpy text file reading stuff only works on 1-byte, ansi encoding (and maybe only ascii), and be done with it. utf-8 might be OK -- I don't know if there are any valid files in, say latin-1 that utf-8 will barf on -- you may not get the non-ascii symbols right, but that's better than barfing.
-Chris
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
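Two of the open questions above can be checked with a short sketch. The sample bytes and the helper names `strict_utf8` and `read_text` are invented for illustration. It shows that a byte string which is valid latin-1 can fail strict utf-8 decoding, and that Python 3 text streams accept an `errors` argument (`open()` takes the same parameter) to replace or skip undecodable bytes instead of raising.

```python
import io

# 0xE9 is "e acute" in latin-1, but an incomplete sequence in utf-8.
raw = "caf\xe9 1.5\n".encode("latin-1")


def strict_utf8(raw):
    """Return the utf-8 decoding of raw, or None if it is not valid utf-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def read_text(raw, encoding, errors):
    """Read raw bytes through a text stream with the given error handling."""
    stream = io.TextIOWrapper(io.BytesIO(raw), encoding=encoding, errors=errors)
    return stream.read()


failed = strict_utf8(raw)                      # strict utf-8 decoding fails
replaced = read_text(raw, "utf-8", "replace")  # bad byte becomes U+FFFD
skipped = read_text(raw, "ascii", "ignore")    # bad byte is dropped
as_latin1 = read_text(raw, "latin-1", "strict")
```

Whether replacing or skipping is acceptable for numeric data files is a separate design question; the sketch only shows that the option exists.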
Hi, On Tue, Apr 5, 2011 at 4:12 PM, Christopher Barker <Chris.Barker@noaa.gov> wrote:
On 4/5/11 3:36 PM, josef.pktd@gmail.com wrote:
I disagree that U makes no sense for binary file reading.
I wasn't saying that it made no sense to have a "U" mode for binary file reading, what I meant is that by the python2 definition, it made no sense. In Python 2, the ONLY difference between binary and text mode is line-feed translation.
I think it's right to say that the difference between a text and a binary file in python 2 is - none for unix, and '\r\n' -> '\n' translation in windows. The difference between 'rt' and 'U' is (this is for my own benefit): For 'rt', a '\r' does not cause a line break - with 'U' - it does. For 'rt' _not_ on Windows, '\r\n' stays the same - it is stripped to '\n' with 'U'.
As for Python 3:
In python 3:
'b' means, "return byte objects" 't' means "return decoded strings"
'U' means two things:
1) When iterating by line, split lines at any of '\r', '\r\n', '\n' 2) When returning lines split this way, convert '\r' and '\r\n' to '\n'
a) 'U' is default -- it's essentially the same as 't' (in PY3), so 't' means "return decoded and line-feed translated unicode objects"
Right - my argument is that the behavior implied by 'U' and 't' is conceptually separable. 'U' is for how to do line-breaks, and line-termination translations, 't' is for whether to decode the text or not. In python 3.
b) I think the line-feed conversion is done regardless of if you are iterating by lines, i.e. with a full-on .read(). At least that's how it works in py2 -- not running py3 here to test.
Yes, that looks right.
If you support returning lines from a binary file (which python 3 does), then I think 'U' is a sensible thing to allow - as in this case.
but what is a "binary file"?
In python 3 a binary file is a file which is not decoded, and returns bytes. It still has a concept of a 'line', as defined by line terminators - you can iterate over one, or do .readlines(). In python 2, as you say, a binary file is essentially the same as a text file, with the single exception of the windows \r\n -> \n translation.
I THINK what you are proposing is that we'd want to be able to have both linefeed translation and no decoding done. But I think that's impossible -- aren't the linefeeds themselves encoded differently with different encodings?
Right - so obviously if you open a utf-16 file as binary, terrible things may happen - this was what Pauli was pointing out before. His point was that utf-8 is the standard, and that we probably would not hit many other encodings. I agree with you if you are saying that it would be good to be able to deal with them if we can - presumably by allowing 'rt' file objects, producing python 3 strings.
U looks appropriate in this case, better than the workarounds. However, to me the python 3.2 docs seem to say that U only works for text mode
Agreed -- but I don't see the problem -- your files are either encoded in something that might treat newlines differently (UCS32, maybe?), in which case you'd want it decoded, or you are working with ascii or ansi or utf-8, in which case you can specify the encoding anyway.
I don't understand why we'd want a binary blob for text parsing -- the parsing code is going to have to know something about the encoding to work -- it might as well get passed in to the file open call, and work with unicode. I suppose if we still want to assume ascii for parsing, then we could use 't' and then re-encode to ascii to work with it. Which I agree does seem heavy handed just for fixing newlines.
Also, one problem I've often had with encodings is what happens if I think I have ascii, but really have a couple characters above 127 -- then the default is to get an error in decoding. I'd like to be able to pass in a flag that either skips the un-decodable characters or replaces them with something, but it doesn't look like you can do that with the file open function in py3.
The line terminator is always b'\n' for binary files;
Once you really make the distinction between text and binary, the concept of a "line terminator" doesn't really make sense anyway.
Well - I was arguing that, given we can iterate over lines in binary files, then there must be the concept of what a line is, in a binary file, and that means that we need the concept of a line terminator. I realize this is a discussion that would have to happen on the python-dev list... See you, Matthew
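The utf-16 concern raised earlier in this message can be made concrete with a minimal sketch on made-up sample data. In utf-16 the line feed is a two-byte code unit, so splitting the undecoded bytes on b'\n' cuts each code unit in half, whereas the same split is harmless for utf-8.

```python
text = "1 2\n3 4\n"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")   # little-endian, no byte order mark

# In utf-8 the line feed is the single byte 0x0A, so a byte-level split works.
utf8_pieces = utf8.split(b"\n")

# In utf-16-le the line feed is 0x0A 0x00; splitting on 0x0A leaves the
# trailing 0x00 at the start of the next piece, misaligning everything after.
utf16_pieces = utf16.split(b"\n")

# Decoding first and splitting afterwards gives the intended lines.
decoded_lines = utf16.decode("utf-16-le").splitlines()
```

This is the sense in which line handling on undecoded bytes assumes an ascii-compatible encoding.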
Sorry to keep harping on this, but for history's sake, I was one of the folks that got 'U' introduced in the first place. I was dealing with a nightmare of unix, mac and dos text files, 'U' was a godsend.
On 4/5/11 4:51 PM, Matthew Brett wrote:
The difference between 'rt' and 'U' is (this is for my own benefit):
For 'rt', a '\r' does not cause a line break - with 'U' - it does.
Perhaps semantics, but what 'U' does is actually change any of the line breaks to '\n' -- any line breaking happens after the fact. In Py2, the difference between 'U' and 't' is that 't' assumes that any file read uses the native line endings -- a bad idea, IMHO. Back in the day, Guido argued that text file line ending conversion was the job of file transfer tools. The reality, however, is that users don't always use file transfer tools correctly, nor even understand the implications of line endings. All that being said, mac-style files are pretty rare these days. (though I bet I've got a few still kicking around)
Right - my argument is that the behavior implied by 'U' and 't' is conceptually separable. 'U' is for how to do line-breaks, and line-termination translations, 't' is for whether to decode the text or not. In python 3.
but 't' and 'U' are the same in python 3 -- there is no distinction. It seems you are arguing that there could/should be a way to translate line termination without decoding the text, but ...
In python 3 a binary file is a file which is not decoded, and returns bytes. It still has a concept of a 'line', as defined by line terminators - you can iterate over one, or do .readlines().
I'll take your word for it that it does, but that's not really a binary file then, it's a file that you are assuming is encoded in an ascii-compatible way. While I know that "practicality beats purity", we really should be opening the file as a text file (it is text, after all), and specifying utf-8 or latin-1 or something as the encoding. However, IIUC, then the issue here is later on down the line, numpy uses regular old C code, which expects ascii strings. In that case, we could encode the text as ascii, into a bytes object. That's a lot of overhead for line ending translation, so probably not worth it. But if nothing else, we should be clear in the docs that numpy text file reading code is expecting ascii-compatible data. (and it would be nice to get the line-ending translation)
Right - so obviously if you open a utf-16 file as binary, terrible things may happen - this was what Pauli was pointing out before. His point was that utf-8 is the standard,
but it's not the standard -- it's a common use, but not a standard -- ideally numpy wouldn't enforce any particular encoding (though it could default to one, and utf-8 would be a good choice for that)
Once you really make the distinction between text and binary, the concept of a "line terminator" doesn't really make sense anyway.
Well - I was arguing that, given we can iterate over lines in binary files, then there must be the concept of what a line is, in a binary file, and that means that we need the concept of a line terminator.
maybe, but that concept is built on an assumption that your file is ascii-compatible (for \n anyway), and you know what they say about assumptions...
I realize this is a discussion that would have to happen on the python-dev list...
I'm not sure -- I was thinking that python missed something here, but I don't think so anymore. In the unicode world, there is no choice but to be explicit about encodings, and if you do that, then python's "text or binary" distinction makes sense. .readline() for binary file doesn't, but so be it.
Honestly, I've never been sure in this discussion what code actually needs fixing, so I'm done now -- we've talked enough that the issues MUST have been covered by now!
-Chris
On Mon, Apr 4, 2011 at 11:22 PM, Chris Barker <Chris.Barker@noaa.gov> wrote:
On 4/4/11 9:03 PM, josef.pktd@gmail.com wrote:
On Mon, Apr 4, 2011 at 11:42 PM, Charles R Harris
  File "/sw/lib/python2.4/site-packages/numpy/lib/_datasource.py", line 477, in open
    return _file_openers[ext](found, mode=mode)
IOError: invalid mode: Ub
Guess that wasn't tested before ;) I thought that was strange when I saw it. The source of the problem is line 2035 in npyio.py. Additionally, since genfromtxt needs to have byte strings, the 'rb' mode should probably be used. That works on linux, both for python 2 and python 3, but doing that might uncover genfromtxt problems on other platforms.
"rb" is fine on Windows with python 3.2, (that's what I tested initially for this bug)
IIUC, "Ub" is undefined -- "U" means universal newlines, which makes no sense when used with "b" for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.
I'd expect that genfromtxt, being txt, and line oriented, should use 'rU'. but if it wants the raw line endings (why would it?) then rb should be fine.
Note that if you only test with unix (\n) and dos (\r\n) line endings, it may work with 'b', if it's splitting on '\n', but not if you try it with Mac endings (\r). Of course with OS-X mac endings are getting pretty uncommon.
The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in python, as it works just fine on python 2.7. It may indeed be desirable to read the files as text, but that would require more work on both loadtxt and genfromtxt. Chuck
On Tue, Apr 5, 2011 at 8:12 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Mon, Apr 4, 2011 at 11:22 PM, Chris Barker <Chris.Barker@noaa.gov> wrote:
On 4/4/11 9:03 PM, josef.pktd@gmail.com wrote:
On Mon, Apr 4, 2011 at 11:42 PM, Charles R Harris
  File "/sw/lib/python2.4/site-packages/numpy/lib/_datasource.py", line 477, in open
    return _file_openers[ext](found, mode=mode)
IOError: invalid mode: Ub
Guess that wasn't tested before ;) I thought that was strange when I saw it. The source of the problem is line 2035 in npyio.py. Additionally, since genfromtxt needs to have byte strings, the 'rb' mode should probably be used. That works on linux, both for python 2 and python 3, but doing that might uncover genfromtxt problems on other platforms.
"rb" is fine on Windows with python 3.2, (that's what I tested initially for this bug)
IIUC, "Ub" is undefined -- "U" means universal newlines, which makes no sense when used with "b" for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.
I'd expect that genfromtxt, being txt, and line oriented, should use 'rU'. but if it wants the raw line endings (why would it?) then rb should be fine.
Note that if you only test with unix (\n) and dos (\r\n) line endings, it may work with 'b', if it's splitting on '\n', but not if you try it with Mac endings (\r). Of course with OS-X mac endings are getting pretty uncommon.
The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in python, as it works just fine on python 2.7. It may indeed be desirable to read the files as text, but that would require more work on both loadtxt and genfromtxt.
Curiously, 'rbU' and 'rU' do work on 2.4, but not 'Urb', 'Ub', 'Ur', or 'bU'. Due to Python 3 not handling '\r' in binary mode, the shortest path forward might be to open files as text and call asbytes on the lines as they are read.
Chuck
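A minimal sketch of that "open as text, convert each line to bytes" approach follows, written for Python 3 with only the standard library. The function name `iter_byte_lines` is hypothetical, and latin-1 is assumed here as the encoding because it maps every byte value to a character and back; this is not the actual loadtxt/genfromtxt code.

```python
import os
import tempfile


def iter_byte_lines(fname, encoding="latin-1"):
    """Yield the lines of a text file as bytes, each ending in LF."""
    # newline=None enables universal newlines: CR, CRLF and LF all end a
    # line and are translated to LF before the line is returned.
    with open(fname, "r", encoding=encoding, newline=None) as f:
        for line in f:
            yield line.encode(encoding)


def demo():
    """Write a CR-only file and read it back line by line as bytes."""
    with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
        tmp.write(b"1 2\r3 4\r")
        path = tmp.name
    try:
        return list(iter_byte_lines(path))
    finally:
        os.remove(path)


cr_lines = demo()
```

The downstream parsing code keeps receiving bytes, so only the file-opening step changes; the price is a decode and re-encode of every line.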
On 4/4/2011 1:04 PM, Ralf Gommers wrote:
Hi,
I am pleased to announce the availability of the second beta of NumPy 1.6.0. Due to the extensive changes in the Numpy core for this release, the beta testing phase will last at least one month. Please test this beta and report any problems on the Numpy mailing list.
Many bug fixes went in since beta 1, among which: - fix installation of source release with python 3 - f2py fixes for assumed shape arrays - several loadtxt bug fixes and enhancements - change default floating point error handling from "print" to "warn" - much more
I quickly counted in the timeline, and in the last few weeks the number of open tickets has been decreased by over 100. Thanks to everyone who contributed to this spring cleaning!
Sources and binaries can be found at http://sourceforge.net/projects/numpy/files/NumPy/1.6.0b2/ For (preliminary) release notes see below.
Enjoy, Ralf
Thank you.
numpy 1.6.0b2 builds well on Windows with msvc9/MKL.
No license file is included in the installers.
The installers for Python 3 include 2to3 backup files. This has already been fixed in scipy <https://github.com/scipy/scipy/commit/f7dae4f21593d94735a0377a1af3a9275413b8...>.
All numpy tests pass on win32.
A few numpy tests fail on win-amd64:
======================================================================
ERROR: Ticket #99
----------------------------------------------------------------------
Traceback (most recent call last):
  File "X:\Python26-x64\lib\site-packages\numpy\testing\decorators.py", line 215, in knownfailer
    return f(*args, **kwargs)
  File "X:\Python26-x64\lib\site-packages\numpy\core\tests\test_regression.py", line 148, in test_intp
    np.intp('0x' + 'f'*i_width,16)
TypeError: function takes at most 1 argument (2 given)
======================================================================
FAIL: test_iterator.test_iter_broadcasting_errors
----------------------------------------------------------------------
Traceback (most recent call last):
  File "X:\Python26-x64\lib\site-packages\nose\case.py", line 187, in runTest
    self.test(*self.arg)
  File "X:\Python26-x64\lib\site-packages\numpy\core\tests\test_iterator.py", line 639, in test_iter_broadcasting_errors
    'Message "%s" doesn\'t contain operand shape (2,3)' % msg)
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 34, in assert_
    raise AssertionError(msg)
AssertionError: Message "non-broadcastable output operand with shape (%lld,%lld) doesn't match the broadcast shape (%lld,%lld,%lld)" doesn't contain operand shape (2,3)
======================================================================
FAIL: test_noncentral_f (test_random.TestRandomDist)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "X:\Python26-x64\lib\site-packages\numpy\random\tests\test_random.py", line 297, in test_noncentral_f
    np.testing.assert_array_almost_equal(actual, desired, decimal=14)
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: Arrays are not almost equal to 14 decimals
(mismatch 100.0%)
 x: array([[ 1.62003345, 1.7253997 ], [ 0.96735921, 0.42933718], [ 0.71714872, 6.24979552]])
 y: array([[ 1.405981 , 0.34207973], [ 3.57715069, 7.92632663], [ 0.43741599, 1.17742088]])
One scipy 0.9.0 test fails (32 and 64 bit):
======================================================================
FAIL: test_expon (test_morestats.TestAnderson)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "X:\Python26-x64\lib\site-packages\scipy\stats\tests\test_morestats.py", line 72, in test_expon
    assert_array_less(crit[:-1], A)
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 869, in assert_array_less
    header='Arrays are not less-ordered')
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 613, in assert_array_compare
    chk_same_position(x_id, y_id, hasval='inf')
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 588, in chk_same_position
    raise AssertionError(msg)
AssertionError: Arrays are not less-ordered
x and y inf location mismatch:
 x: array([ 0.911, 1.065, 1.325, 1.587])
 y: array(inf)
Christoph
On Tue, Apr 5, 2011 at 12:10 AM, Christoph Gohlke <cgohlke@uci.edu> wrote:
One scipy 0.9.0 test fails (32 and 64 bit):
======================================================================
FAIL: test_expon (test_morestats.TestAnderson)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "X:\Python26-x64\lib\site-packages\scipy\stats\tests\test_morestats.py", line 72, in test_expon
    assert_array_less(crit[:-1], A)
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 869, in assert_array_less
    header='Arrays are not less-ordered')
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 613, in assert_array_compare
    chk_same_position(x_id, y_id, hasval='inf')
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 588, in chk_same_position
    raise AssertionError(msg)
AssertionError: Arrays are not less-ordered

x and y inf location mismatch:
 x: array([ 0.911, 1.065, 1.325, 1.587])
 y: array(inf)
I think this comes from the change in how infs are now handled in the asserts, but I cannot check because I don't have numpy 1.6 yet. These are the calls I would try:

    np.testing.assert_array_less(2, 3)
    np.testing.assert_array_less(2, np.inf)
    np.testing.assert_array_less(np.inf, np.inf)

The Anderson test really states:

    assert array([ 0.911, 1.065, 1.325, 1.587]) < np.inf

which is correct but not a good test.

Josef
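The three probe calls above can be wrapped so that they report an outcome instead of stopping at the first failure. This is only a sketch: the helper name ``passes`` is made up for illustration, and the results for the ``inf`` cases are deliberately not stated here because they may depend on the installed NumPy version, which is the point of the check.

```python
import numpy as np


def passes(x, y):
    """Return True if np.testing.assert_array_less(x, y) does not raise.

    Hypothetical helper, not part of NumPy.
    """
    try:
        np.testing.assert_array_less(x, y)
    except AssertionError:
        return False
    return True


# Critical values quoted in the failing Anderson test.
crit = np.array([0.911, 1.065, 1.325, 1.587])

# The plain elementwise comparison is unambiguous: finite values are < inf.
plain_ok = bool(np.all(crit < np.inf))

# Outcomes on the installed NumPy; the inf cases are the ones to watch.
results = {
    "2 < 3": passes(2, 3),
    "2 < inf": passes(2, np.inf),
    "inf < inf": passes(np.inf, np.inf),
    "crit < inf": passes(crit, np.inf),
}
print(np.__version__, plain_ok, results)
```

If ``plain_ok`` is true while ``results["crit < inf"]`` is false, the failure comes from the assert helper's handling of ``inf`` rather than from the statistics code.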
Christoph _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Superpack is OK on Windows Vista 64bit with Python 2.7 32bit.

Alan Isaac

Ran 2993 tests in 25.707s
OK (KNOWNFAIL=9, SKIP=5)

Running unit tests for numpy
NumPy version 1.6.0b2
NumPy is installed in c:\Python27\lib\site-packages\numpy
Python version 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)]
nose version 0.11.0
On Mon, Apr 4, 2011 at 9:10 PM, Christoph Gohlke <cgohlke@uci.edu> wrote:
<snip>
A few numpy tests fail on win-amd64:
<snip>
======================================================================
FAIL: test_iterator.test_iter_broadcasting_errors
----------------------------------------------------------------------
Traceback (most recent call last):
  File "X:\Python26-x64\lib\site-packages\nose\case.py", line 187, in runTest
    self.test(*self.arg)
  File "X:\Python26-x64\lib\site-packages\numpy\core\tests\test_iterator.py", line 639, in test_iter_broadcasting_errors
    'Message "%s" doesn\'t contain operand shape (2,3)' % msg)
  File "X:\Python26-x64\lib\site-packages\numpy\testing\utils.py", line 34, in assert_
    raise AssertionError(msg)
AssertionError: Message "non-broadcastable output operand with shape (%lld,%lld) doesn't match the broadcast shape (%lld,%lld,%lld)" doesn't contain operand shape (2,3)

I've pushed a fix for this to the 1.6.x branch, can you confirm that it works on win-amd64?

Thanks,
Mark

<snip>
On 4/5/2011 4:05 PM, Mark Wiebe wrote:
I've pushed a fix for this to the 1.6.x branch, can you confirm that it works on win-amd64?
Thanks, Mark
<snip>
Sorry, I forgot to mention that this test failed on 64-bit Python 2.6 only. I now recognize it is due to a known issue with Python's PyErr_Format function <http://bugs.python.org/issue7228>. Unfortunately the fix will not be backported to Python 2.6. Maybe this test could be marked as known failure on win-amd64-py2.6? Could you please revert your changes or set the format specifier to "%lld"? "%I64d" is not supported by PyErr_Format. Thanks, Christoph
On Tue, Apr 5, 2011 at 5:07 PM, Christoph Gohlke <cgohlke@uci.edu> wrote:
Sorry, I forgot to mention that this test failed on 64-bit Python 2.6 only. I now recognize it is due to a known issue with Python's PyErr_Format function <http://bugs.python.org/issue7228>. Unfortunately the fix will not be backported to Python 2.6. Maybe this test could be marked as known failure on win-amd64-py2.6?
Could you please revert your changes or set the format specifier to "%lld"? "%I64d" is not supported by PyErr_Format.
This means PyString_Format and PyOS_snprintf support different sets of formatting characters, which raises the question: which set of functions are the NPY_*_FMT macros intended for? Currently it looks like only NPY_INTP_FMT is used with PyString_Format/PyErr_Format, and the rest are used with PyOS_snprintf. Also, in other places npy_intp variables are cast to long and "%ld" is used instead of NPY_INTP_FMT.

I suppose I'll just change NPY_INTP_FMT unconditionally to "%lld" when it's long long, since that's what the PyString_Format 2.7 documentation says is the correct portable formatter.

-Mark
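The underlying portability problem is that the C integer types have different widths on different 64-bit platforms, so a length modifier that matches a pointer-sized index on Linux may not match it on win-amd64. The sketch below uses only the standard ``ctypes`` module to print the widths on the running interpreter. The function ``length_modifier_candidates`` is a hypothetical helper written for this illustration, not NumPy or CPython code, and it says nothing about which modifiers a particular formatting function actually accepts.

```python
import ctypes

# Byte widths of the C types involved, as seen by the running interpreter.
sizes = {
    "long": ctypes.sizeof(ctypes.c_long),
    "long long": ctypes.sizeof(ctypes.c_longlong),
    "ssize_t": ctypes.sizeof(ctypes.c_ssize_t),
    "void *": ctypes.sizeof(ctypes.c_void_p),
}


def length_modifier_candidates(sizes):
    """List printf length modifiers whose C type is as wide as a pointer.

    Hypothetical helper: it compares widths only and does not check
    whether a given formatter understands the modifier.
    """
    pointer = sizes["void *"]
    candidates = []
    if sizes["ssize_t"] == pointer:
        candidates.append("z")    # ssize_t / size_t
    if sizes["long"] == pointer:
        candidates.append("l")    # holds on LP64 and 32-bit, not on LLP64
    if sizes["long long"] == pointer:
        candidates.append("ll")   # holds on 64-bit builds
    return candidates


candidates = length_modifier_candidates(sizes)
print(sizes, candidates)
```

On an LLP64 platform such as win-amd64, ``long`` is 4 bytes while pointers are 8, so casting a pointer-sized index to ``long`` and printing it with "%ld" can truncate large values.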
On Tue, Apr 5, 2011 at 5:07 PM, Christoph Gohlke <cgohlke@uci.edu> wrote:
Sorry, I forgot to mention that this test failed on 64-bit Python 2.6 only. I now recognize it is due to a known issue with Python's PyErr_Format function <http://bugs.python.org/issue7228>. Unfortunately the fix will not be backported to Python 2.6. Maybe this test could be marked as known failure on win-amd64-py2.6?
Could you please revert your changes or set the format specifier to "%lld"? "%I64d" is not supported by PyErr_Format.
I've committed an attempted workaround using the %zd formatter, can you check whether it works on 64-bit Windows Python 2.6 now? Thanks, Mark
On 4/5/2011 6:46 PM, Mark Wiebe wrote:
I've committed an attempted workaround using the %zd formatter, can you check whether it works on 64-bit Windows Python 2.6 now?
Thanks, Mark
I have not tried but I do not expect it to work. According to <http://docs.python.org/c-api/string.html> the "%zd" format is 'exactly equivalent to printf("%zd")'. The 'z' prefix character is C99 AFAIK and not supported by Visual Studio compilers <http://msdn.microsoft.com/en-us/library/xdb9w69d%28v=VS.90%29.aspx>. Christoph
On Tue, Apr 5, 2011 at 7:21 PM, Christoph Gohlke <cgohlke@uci.edu> wrote:
I have not tried but I do not expect it to work. According to <http://docs.python.org/c-api/string.html> the "%zd" format is 'exactly equivalent to printf("%zd")'. The 'z' prefix character is C99 AFAIK and not supported by Visual Studio compilers <http://msdn.microsoft.com/en-us/library/xdb9w69d%28v=VS.90%29.aspx>.
I think there's a good chance it will work, since the CPython code is re-interpreting the format codes itself. In the Python 2.6 code, it looks like it will become %Id on Windows, which would be correct. -Mark
On 4/5/2011 7:44 PM, Mark Wiebe wrote:
I think there's a good chance it will work, since the CPython code is re-interpreting the format codes itself. In the Python 2.6 code, it looks like it will become %Id on Windows, which would be correct.
-Mark
Agreed, it'll work if NPY_INTP_FMT is only used with certain Python functions that use %Id instead of %zd on Windows. Then why not simply define NPY_INTP_FMT as %zd?

- #if (PY_VERSION_HEX > 0x02060000)
- #define NPY_INTP_FMT "lld"
- #else
  #define NPY_INTP_FMT "zd"
- #endif

That should work unless 64 bit numpy is run on a LLP64 platform with Python < 2.5.

Christoph
SUSE 11.3, python 2.7, gcc 4.3.4, gfortran from gcc 4.6.0, I get two failures on commit 1439a8ddcb2eda20fa102aa44e846783f29c0af3 (head of 1.6.x maintenance branch).

--George.

======================================================================
FAIL: Test basic arithmetic function errors
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/noc/users/agn/ext/SUSE_11/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer
    return f(*args, **kwargs)
  File "/noc/users/agn/ext/SUSE_11/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 321, in test_floating_exceptions
    lambda a,b:a/b, ft_tiny, ft_max)
  File "/noc/users/agn/ext/SUSE_11/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe
    "Type %s did not raise fpe error '%s'." % (ftype, fpeerr))
  File "/noc/users/agn/ext/SUSE_11/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_
    raise AssertionError(msg)
AssertionError: Type <type 'numpy.float32'> did not raise fpe error 'underflow'.

======================================================================
FAIL: test_kind.TestKind.test_all
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/noc/users/agn/ext/SUSE_11/lib/python2.7/site-packages/nose/case.py", line 187, in runTest
    self.test(*self.arg)
  File "/noc/users/agn/ext/SUSE_11/lib/python2.7/site-packages/numpy/f2py/tests/test_kind.py", line 30, in test_all
    'selectedrealkind(%s): expected %r but got %r' % (i, selected_real_kind(i), selectedrealkind(i)))
  File "/noc/users/agn/ext/SUSE_11/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_
    raise AssertionError(msg)
AssertionError: selectedrealkind(19): expected -1 but got 16

----------------------------------------------------------------------
Ran 3544 tests in 48.039s

FAILED (KNOWNFAIL=4, failures=2)
<nose.result.TextTestResult run=3544 errors=0 failures=2>

On 4 April 2011 21:04, Ralf Gommers <ralf.gommers@googlemail.com> wrote:
Note: NumPy 1.6.0 is not yet released.
=========================
NumPy 1.6.0 Release Notes
=========================
This release includes several new features as well as numerous bug fixes and improved documentation. It is backward compatible with the 1.5.0 release, and supports Python 2.4 - 2.7 and 3.1 - 3.2.
Highlights
==========
* Re-introduction of datetime dtype support to deal with dates in arrays.
* A new 16-bit floating point type.
* A new iterator, which improves performance of many functions.
New features
============

New 16-bit floating point type
------------------------------
This release adds support for the IEEE 754-2008 binary16 format, available as the data type ``numpy.half``. Within Python, the type behaves similarly to `float` or `double`, and C extensions can add support for it with the exposed half-float API.
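As a small illustration of the Python-level behaviour described above, the following sketch does arithmetic in half precision and inspects the format's limits. It assumes a NumPy installation that provides ``numpy.half`` and ``numpy.finfo`` support for it; the variable names are chosen only for this example.

```python
import numpy as np

# Arithmetic between half scalars stays in half precision.
third = np.half(1.0) / np.half(3.0)

# Limits of the 16-bit format.
info = np.finfo(np.half)
print(third.dtype, third.itemsize, info.bits, info.eps, info.max)

# Converting float64 data to half loses precision, but for values in
# [0, 1] the rounding error stays below the half-precision epsilon.
reference = np.linspace(0.0, 1.0, 11)
as_half = reference.astype(np.half)
roundtrip_error = np.abs(as_half.astype(np.float64) - reference).max()
print(roundtrip_error)
```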
New iterator
------------
A new iterator has been added, replacing the functionality of the existing iterator and multi-iterator with a single object and API. This iterator works well with general memory layouts different from C or Fortran contiguous, and handles both standard NumPy and customized broadcasting. The buffering, automatic data type conversion, and optional output parameters, offered by ufuncs but difficult to replicate elsewhere, are now exposed by this iterator.
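The text above does not name the Python-level object. In released NumPy versions the iterator is available as ``numpy.nditer``, and the sketch below assumes that name, along with the context-manager form that only exists in later NumPy releases. It shows broadcasting of two inputs together with an output array allocated by the iterator itself.

```python
import numpy as np

a = np.arange(3).reshape(3, 1)   # shape (3, 1)
b = np.arange(2)                 # shape (2,), broadcasts against a

# Passing None as the third operand asks the iterator to allocate
# an output array with the broadcast shape.
with np.nditer([a, b, None]) as it:
    for x, y, out in it:
        out[...] = x + y
    result = it.operands[2]

print(result)
```

The loop body here is plain Python and therefore slow; the example is meant to show the broadcasting and output handling, not the performance gains mentioned in the highlights.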
Legendre, Laguerre, Hermite, HermiteE polynomials in ``numpy.polynomial``
-------------------------------------------------------------------------
Extend the number of polynomials available in the polynomial package. In addition, a new ``window`` attribute has been added to the classes in order to specify the range the ``domain`` maps to. This is mostly useful for the Laguerre, Hermite, and HermiteE polynomials whose natural domains are infinite and provides a more intuitive way to get the correct mapping of values without playing unnatural tricks with the domain.
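A brief sketch of how ``domain`` and ``window`` interact, assuming the class constructors accept both as keyword arguments as they do in current NumPy. Input values are mapped linearly from ``domain`` onto ``window`` before the series is evaluated; a Chebyshev series is used because the expected values are easy to check by hand.

```python
from numpy.polynomial import Chebyshev, Laguerre

# The series with coefficients [0, 1] is T_1(x) = x. Data on [0, 10]
# is mapped linearly onto the window [-1, 1] before evaluation.
p = Chebyshev([0, 1], domain=[0, 10], window=[-1, 1])
values = [p(0.0), p(5.0), p(10.0)]

# The newly added classes expose the same two attributes.
q = Laguerre([1, 2, 3])
print(values, q.domain, q.window)
```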
Fortran assumed shape array and size function support in ``numpy.f2py``
-----------------------------------------------------------------------

F2py now supports wrapping Fortran 90 routines that use assumed shape arrays. Previously, such routines could be called from Python, but the corresponding Fortran routines received assumed shape arrays as zero length arrays, which caused unpredictable results. Thanks to Lorenz Hüdepohl for pointing out the correct way to interface routines with assumed shape arrays.

In addition, f2py interprets the Fortran expression ``size(array, dim)`` as ``shape(array, dim-1)``, which makes it possible to automatically wrap Fortran routines that use the two-argument ``size`` function in dimension specifications. Previously, users had to apply this mapping manually.
Other new functions
-------------------

``numpy.ravel_multi_index`` : Converts a multi-index tuple into an array of flat indices, applying boundary modes to the indices.

``numpy.einsum`` : Evaluate the Einstein summation convention. Using the Einstein summation convention, many common multi-dimensional array operations can be represented in a simple fashion. This function provides a way to compute such summations.
``numpy.count_nonzero`` : Counts the number of non-zero elements in an array.
``numpy.result_type`` and ``numpy.min_scalar_type`` : These functions expose the underlying type promotion used by the ufuncs and other operations to determine the types of outputs. These improve upon the ``numpy.common_type`` and ``numpy.mintypecode`` which provide similar functionality but do not match the ufunc implementation.
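The functions listed above can be tried out as follows. This is a minimal sketch with made-up inputs, using only the function names given in the list.

```python
import numpy as np

# Flat index of row 1, column 2 in a C-ordered 3x4 array.
flat = np.ravel_multi_index((1, 2), (3, 4))

# Out-of-range indices are handled by the boundary mode; 'clip'
# pulls row 5 back to the last valid row.
clipped = np.ravel_multi_index((5, 2), (3, 4), mode='clip')

# Matrix product and trace written in Einstein notation.
a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)
product = np.einsum('ij,jk->ik', a, b)
trace = np.einsum('ii', np.eye(3))

nonzero = np.count_nonzero([0, 1, 0, 2])

# Type promotion as applied by the ufuncs.
promoted = np.result_type(np.int8, np.float32)
smallest = np.min_scalar_type(10)

print(flat, clipped, trace, nonzero, promoted, smallest)
```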
Changes
=======

Changes and improvements in the numpy core
------------------------------------------
``default error handling``
--------------------------

The default error handling has been changed from ``print`` to ``warn`` for all except for ``underflow``, which remains as ``ignore``.
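In practice, ``warn`` means that a floating point error such as division by zero goes through Python's ``warnings`` machinery instead of being printed. The sketch below sets the mode explicitly with ``numpy.errstate`` so that it does not depend on the installed default, and contrasts it with ``raise``.

```python
import warnings
import numpy as np

one = np.array([1.0])
zero = np.array([0.0])

# 'warn': the operation completes and a RuntimeWarning is issued.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with np.errstate(divide="warn"):
        result = one / zero

# 'raise': the same operation raises FloatingPointError instead.
raised = False
with np.errstate(divide="raise"):
    try:
        one / zero
    except FloatingPointError:
        raised = True

print(np.geterr(), len(caught), result, raised)
```

Because the errors are now ordinary warnings, they can be filtered or turned into exceptions with the standard ``warnings`` filters.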
``numpy.distutils``
-------------------
Several new compilers are supported for building Numpy: the Portland Group Fortran compiler on OS X, the PathScale compiler suite and the 64-bit Intel C compiler on Linux.
``numpy.testing``
-----------------
The testing framework gained ``numpy.testing.assert_allclose``, which provides a more convenient way to compare floating point arrays than `assert_almost_equal`, `assert_approx_equal` and `assert_array_almost_equal`.
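A short sketch of why a relative tolerance is convenient when the compared values span many orders of magnitude. The wrapper ``is_close`` is a hypothetical helper defined only for this example.

```python
import numpy as np
from numpy.testing import assert_allclose

expected = np.array([1.0, 1e-8, 1e8])
actual = expected * (1 + 1e-9)    # relative error of about 1e-9 everywhere

# A single relative tolerance covers all three magnitudes.
assert_allclose(actual, expected, rtol=1e-7)


def is_close(actual, desired, **kwargs):
    """Return True if assert_allclose passes (illustrative helper)."""
    try:
        assert_allclose(actual, desired, **kwargs)
    except AssertionError:
        return False
    return True


loose = is_close(actual, expected, rtol=1e-7)
tight = is_close(actual, expected, rtol=1e-12)
print(loose, tight)
```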
``C API``
---------
In addition to the APIs for the new iterator and half data type, a number of other additions have been made to the C API. The type promotion mechanism used by ufuncs is exposed via ``PyArray_PromoteTypes``, ``PyArray_ResultType``, and ``PyArray_MinScalarType``. A new enumeration ``NPY_CASTING`` has been added which controls what types of casts are permitted. This is used by the new functions ``PyArray_CanCastArrayTo`` and ``PyArray_CanCastTypeTo``. A more flexible way to handle conversion of arbitrary python objects into arrays is exposed by ``PyArray_GetArrayParamsFromObject``.
Deprecated features
===================
The "normed" keyword in ``numpy.histogram`` is deprecated. Its functionality will be replaced by the new "density" keyword.
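A minimal sketch of the "density" keyword with unequal bin widths, using made-up data. With ``density=True`` the returned values are scaled so that the histogram integrates to one over the bin edges.

```python
import numpy as np

data = np.array([0.5, 1.5, 1.7, 2.2, 2.4, 2.9])
bins = [0, 1, 2, 4]               # the last bin is twice as wide

counts, edges = np.histogram(data, bins=bins)
density, _ = np.histogram(data, bins=bins, density=True)

# Area under the density histogram: sum of height times bin width.
area = np.sum(density * np.diff(edges))
print(counts, density, area)
```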
Removed features
================

``numpy.fft``
-------------
The functions `refft`, `refft2`, `refftn`, `irefft`, `irefft2`, `irefftn`, which were aliases for the same functions without the 'e' in the name, were removed.
``numpy.memmap``
----------------
The `sync()` and `close()` methods of memmap were removed. Use `flush()` and "del memmap" instead.
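The replacement pattern can be sketched as follows, writing to a temporary file whose name is chosen only for this example.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "example.dat")

mm = np.memmap(path, dtype=np.float32, mode="w+", shape=(4,))
mm[:] = [1, 2, 3, 4]
mm.flush()   # use instead of the removed sync(): write changes to disk
del mm       # use instead of the removed close(): drop the reference

# Read the file back through an independent code path.
on_disk = np.fromfile(path, dtype=np.float32)
print(on_disk)
```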
``numpy.lib``
-------------
The deprecated functions ``numpy.unique1d``, ``numpy.setmember1d``, ``numpy.intersect1d_nu`` and ``numpy.lib.ufunclike.log2`` were removed.
``numpy.ma``
------------
Several deprecated items were removed from the ``numpy.ma`` module::
    * ``numpy.ma.MaskedArray`` "raw_data" method
    * ``numpy.ma.MaskedArray`` constructor "flag" keyword
    * ``numpy.ma.make_mask`` "flag" keyword
    * ``numpy.ma.allclose`` "fill_value" keyword
``numpy.distutils``
-------------------

The ``numpy.get_numpy_include`` function was removed; use ``numpy.get_include`` instead.
participants (11)

- Alan G Isaac
- Charles R Harris
- Chris Barker
- Christoph Gohlke
- Christopher Barker
- Derek Homeier
- George Nurser
- josef.pktd@gmail.com
- Mark Wiebe
- Matthew Brett
- Ralf Gommers