Hi Travis, hi Olivier,

Thanks for your replies last month about the choose() issue.

I did some further investigation into this. I ran out of time on that project before I could come up with a patch, but here's what I found, which may be of interest:

The compile-time constant NPY_MAXARGS is indeed limiting choose(), but only in recent versions. In NumPy version 1.2.1 this constant was set to the same value of 32, but choose() was not limited in the same way. This code succeeds on NumPy 1.2.1:

----------------

import numpy as np

choices = [[0, 1, 2, 3], [10, 11, 12, 13],
          [20, 21, 22, 23], [30, 31, 32, 33]]

morechoices = choices * 2**22
np.choose([2, 0, 1, 0], morechoices)

----------------

where the list contains about 16.7 million items (4 * 2**22). So this is a real regression ... for heavy-duty users of choose().
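
For anyone hitting the same limit in the meantime, here is a minimal sketch of a possible workaround. It is not a patch to choose() itself. The helper name choose_stacked is hypothetical, and the sketch assumes that all choices have the same shape, that the index array broadcasts to that shape, and that np.take_along_axis is available (NumPy 1.15 or later). It stacks the choices into a single array, so only one array is indexed rather than one iterator per choice:

```python
import numpy as np

def choose_stacked(index, choices):
    """Hypothetical helper: pick choices[index[i]][i] element-wise.

    Assumes every choice has the same shape and that `index`
    broadcasts to that shape.
    """
    # One array of shape (n_choices, *item_shape) instead of n separate arrays.
    stacked = np.asarray(choices)
    index = np.broadcast_to(np.asarray(index), stacked.shape[1:])
    # Reject out-of-range entries, similar to choose() in its default mode.
    if index.size and (index.min() < 0 or index.max() >= stacked.shape[0]):
        raise ValueError("invalid entry in index array")
    # Select along axis 0 and drop the leading length-1 axis afterwards.
    return np.take_along_axis(stacked, index[np.newaxis, ...], axis=0)[0]

choices = [[0, 1, 2, 3], [10, 11, 12, 13],
           [20, 21, 22, 23], [30, 31, 32, 33]]

# 32 choices: the case that triggers the limit with np.choose
result = choose_stacked([2, 3, 1, 0], choices * 8)
print(result)
```

This does not cover the 'wrap' and 'clip' modes of choose() or choices of differing shapes, so it is only a stopgap for the simple same-shape case.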

Thanks again for your thoughts!

Best wishes,
    Ed


On Fri, Jun 17, 2011 at 3:05 AM, Travis Oliphant <oliphant@enthought.com> wrote:
Hi Ed, 

I'm pretty sure that this "bug" is due to the compile-time constant NPY_MAXARGS defined in include/numpy/ndarraytypes.h. I suspect that the versions you are trying it on where it succeeds have a different value for that compile-time constant.

NumPy uses a multi-iterator object (PyArrayMultiIterObject, also defined in ndarraytypes.h) to broadcast arguments together for ufuncs and for functions like choose. The data structure that it uses to do this has a static array of iterator objects with room for NPY_MAXARGS iterators. I think in some versions this compile-time constant has been 40 or higher. Bumping up that constant and recompiling NumPy will of course also require recompiling most extension modules that use the NumPy API.
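
As a small illustration of the multi-iterator idea from the Python side: np.broadcast is, as far as I know, the Python-level type backed by this multi-iterator object. The sketch below only shows what it reports for two small arrays; the array names are made up for the example.

```python
import numpy as np

a = np.arange(3).reshape(3, 1)   # shape (3, 1)
b = np.arange(4)                 # shape (4,)

# One multi-iterator holding one iterator per input array
multi = np.broadcast(a, b)

print(multi.numiter)   # number of iterators held
print(multi.shape)     # broadcast shape of all inputs together

# Iterating yields one tuple of elements per broadcast position
pairs = list(multi)
print(pairs[:4])
```

Each input array takes one iterator slot, which is why a long list of choices runs into a fixed-size slot array.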

Numeric did not use this approach to broadcast the arguments to choose together and so likely does not have the same limitation. It would also not be that difficult to modify the NumPy code to dynamically allocate the iters array when needed to remove the NPY_MAXARGS limitation. In fact, I would not mind seeing all the NPY_MAXDIMS and NPY_MAXARGS limitations removed. To do it well you would probably want to have some minimum storage space pre-allocated (say NPY_MAXDIMS as 7 and NPY_MAXARGS as 10 to avoid the memory allocation in common cases) and just increase that space as needed dynamically.

This would be a nice project for someone wanting to learn the NumPy code base.

-Travis





On Jun 16, 2011, at 1:56 AM, Ed Schofield wrote:

Hi all,

I have been investigating the limitation of the choose() method (and function) to 32 elements. This is a regression in recent versions of NumPy. I have tested choose() in the following NumPy versions:

1.0.4: fine
1.1.1: bug
1.2.1: fine
1.3.0: bug
1.4.x: bug
1.5.x: bug
1.6.x: bug
Numeric 24.3: fine

(To run the tests on versions of NumPy prior to 1.4.x I used Python 2.4.3. For the other tests I used Python 2.7.)

Here 'bug' means the choose() function has the 32-element limitation. I have been helping an organization to port a large old Numeric-using codebase to NumPy, and the choose() limitation in recent NumPy versions is throwing a spanner in the works. The codebase is currently using both NumPy and Numeric side-by-side, with Numeric only being used for its choose() function, with a few dozen lines like this:

a = numpy.array(Numeric.choose(b, c))

Here is a simple example that triggers the bug. It is a simple extension of the example from the choose() docstring:

----------------

import numpy as np

choices = [[0, 1, 2, 3], [10, 11, 12, 13],
          [20, 21, 22, 23], [30, 31, 32, 33]]

np.choose([2, 3, 1, 0], choices * 8)

----------------

A side note: the exception message (defined in core/src/multiarray/iterators.c) is also slightly inconsistent with the actual behaviour:

Traceback (most recent call last):
  File "chooser.py", line 6, in <module>
    np.choose([2, 3, 1, 0], choices * 8)
  File "/usr/lib64/python2.7/site-packages/numpy/core/fromnumeric.py", line 277, in choose
    return _wrapit(a, 'choose', choices, out=out, mode=mode)
  File "/usr/lib64/python2.7/site-packages/numpy/core/fromnumeric.py", line 37, in _wrapit
    result = getattr(asarray(obj),method)(*args, **kwds)
ValueError: Need between 2 and (32) array objects (inclusive).

The actual behaviour is that choose() passes with 31 objects but fails with 32 objects, so this should read "exclusive" rather than "inclusive". (And why the parentheses around 32?)
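
A minimal sketch for checking where the boundary sits on a given installation. The helper name choose_accepts is hypothetical, and the result depends on the NumPy version and the NPY_MAXARGS it was built with, so no particular output is assumed here:

```python
import numpy as np

def choose_accepts(n_choices):
    """Return True if np.choose works with n_choices choice arrays."""
    # n_choices small arrays of length 4, all selectable by index 0
    choices = [np.arange(4) + 10 * i for i in range(n_choices)]
    index = np.zeros(4, dtype=int)
    try:
        np.choose(index, choices)
    except ValueError:
        # The limit is reported as a ValueError, as in the traceback above
        return False
    return True

for n in (31, 32, 33):
    print(n, choose_accepts(n))
```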

Does anyone know what changed between 1.2.1 and 1.3.0 that introduced the 32-element limitation to choose(), and whether we might be able to lift this limitation again for future NumPy versions? I have a couple of days to work on a patch ... if someone can advise me how to approach this.

Best wishes,
    Ed


--
Dr. Edward Schofield
Python Charmers
+61 (0)405 676 229
http://pythoncharmers.com

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


