[Numpy-discussion] Maximum available dimensions in numpy

Tim Hochberg tim.hochberg at cox.net
Mon Jan 16 20:00:03 EST 2006


Paul Barrett wrote:

> On 1/16/06, *Sasha* <ndarray at mac.com <mailto:ndarray at mac.com>> wrote:
>
>     > Before, I think numpy supported up to 32 dimensions. Is there
>     any reason
>     > for this new limit? Just curious.
>
>     It was actually 40 until recently. I don't know the answer to your
>     question (Travis?), but I am curious why would anyone need more than
>     say 4?  Solving PDEs by finite differences method in more than 4
>     dimensional spaces, anyone? I know I sound like some very well known
>     person, but really 20 "ought to be enough for everyone" (TM) :-).
>
>
> How about setting the default case to 3 or 4 dimensions and then 
> special casing the rare higher dimensional arrays, i.e. using malloc 
> for these situations.  The default dimension size could be a compile 
> time option for those who routinely exceed the default size of 3 or 4.

This seems like premature optimization. In most cases, if you're in a 
situation where the dimensional overhead matters (lots of small arrays) 
you are using Numeric/Numarray/NumPy poorly and your code is going to be 
slow and bloated anyway. The way to get good efficiency with these 
extensions is to do block operations on large matrices. This often 
involves a little trickery and several extra dimensions. Reducing the 
default matrix size down to 3 or 4 makes efficient code slower, since 
going through malloc will involve an extra dereference and probably some 
extra branches.
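
To illustrate what "block operations with extra dimensions" can look like, 
here is a minimal sketch in present-day NumPy. The function names and the 
pairwise-distance task are my own illustration, not something from this 
thread. The first version does one tiny array operation per pair of points; 
the second adds an extra axis so a single operation covers every pair.

```python
import numpy as np

def pairwise_sq_dists_loop(points):
    """Slow style: one tiny array operation per pair of points."""
    n = len(points)
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            d = points[i] - points[j]   # small temporary array each time
            out[i, j] = np.dot(d, d)
    return out

def pairwise_sq_dists_block(points):
    """Block style: an extra axis lets one operation cover every pair."""
    # Shapes (n, 1, k) and (1, n, k) broadcast to an (n, n, k) array.
    diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
    return (diff ** 2).sum(axis=-1)
```

Note that the block version goes through a 3-dimensional intermediate even 
though both the input and the result are 2-dimensional, which is the kind 
of case where a low default dimension count would get in the way.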

There's also no point in setting the default max matrix size to <N> if 
the typical default alignment (8 bytes, IIRC) is going to leave some 
bytes unused anyway. If one were to pursue some sort of hybrid scheme 
as proposed above, I'd say at a minimum the default dimension size 
should be 6; larger depending on how the alignment works out.  I am 
also leery of this sort of hybrid scheme, since the code for dimensions 
larger than <N> would be little tested and would thus be a good place 
for bugs to lurk.
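
The alignment point can be checked with a small experiment. The sketch 
below uses ctypes to build a hypothetical array header (a data pointer, a 
dimension count, and N inline 4-byte dimension slots). This layout is an 
assumption made for illustration and is not the actual NumPy struct. It 
reports how many bytes of padding the platform adds for each N.

```python
import ctypes

def header_type(n_inline):
    """Build a hypothetical array header with n_inline inline dimension slots."""
    class Header(ctypes.Structure):
        _fields_ = [
            ("data", ctypes.c_void_p),            # pointer to the element buffer
            ("nd", ctypes.c_int32),               # number of dimensions in use
            ("dims", ctypes.c_int32 * n_inline),  # inline dimension sizes
        ]
    return Header

def header_report(max_inline=8):
    """Return (n_inline, sizeof, padding_bytes) for each inline slot count."""
    rows = []
    for n in range(1, max_inline + 1):
        h = header_type(n)
        # Bytes actually occupied by the fields, before trailing padding.
        used = ctypes.sizeof(ctypes.c_void_p) + 4 + 4 * n
        rows.append((n, ctypes.sizeof(h), ctypes.sizeof(h) - used))
    return rows

if __name__ == "__main__":
    for n, size, pad in header_report():
        print(n, size, pad)
```

On a platform where the struct is 8-byte aligned, some values of N cost 
the same number of bytes as N + 1, so picking the smaller one saves 
nothing. The exact numbers depend on the platform and on the assumed 
field layout.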

I don't see anything wrong with making the maximum dimension size a 
compile time option though. However, since in the common case the extra 
dimensions are unlikely to affect performance in any meaningful way, I'd 
recommend that the maximum number of dimensions stay large by default.  
Thus people who need to conserve bytes, which I'd consider the rare 
case, have the option of reducing the max dimensions, while the arrays 
behave in an unsurprising manner when compiled in the normal manner.

If someone has the extra time, it would be interesting to see some data 
about how always mallocing the dimensions, so that there is no maximum 
dimension limit, affects performance. I'd also be interested in seeing 
cases where the extra dimensions actually affect performance before 
doing stuff to complicate^H^H^H^H^H^H^H^H fix things.
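
As a starting point for gathering that kind of data, here is a rough 
sketch that times the creation of tiny arrays of various ranks from 
Python. The function name and parameters are hypothetical. It only 
measures whatever NumPy build is installed; it cannot by itself compare 
inline storage against malloc, which would require building both variants 
and running the same measurement on each.

```python
import timeit
import numpy as np

def creation_cost(ndims=(1, 2, 4, 8, 16), repeats=20000):
    """Return {ndim: seconds per np.empty call} for tiny arrays of each rank."""
    costs = {}
    for nd in ndims:
        shape = (1,) * nd   # one element, nd dimensions
        total = timeit.timeit(lambda: np.empty(shape), number=repeats)
        costs[nd] = total / repeats
    return costs

if __name__ == "__main__":
    for nd, seconds in creation_cost().items():
        print(nd, seconds)
```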

-tim


>
>  -- Paul
>
> -- 
> Paul Barrett, PhD                     Johns Hopkins University
> Assoc. Research Scientist     Dept of Physics and Astronomy
> Phone: 410-516-5190            Baltimore, MD 21218 






