[Numpy-discussion] Maximum available dimensions in numpy
Tim Hochberg
tim.hochberg at cox.net
Mon Jan 16 20:00:03 EST 2006
Paul Barrett wrote:
> On 1/16/06, *Sasha* <ndarray at mac.com <mailto:ndarray at mac.com>> wrote:
>
> > Before, I think numpy supported up to 32 dimensions. Is there
> > any reason for this new limit? Just curious.
>
> It was actually 40 until recently. I don't know the answer to your
> question (Travis?), but I am curious why would anyone need more than
> say 4? Solving PDEs by finite differences method in more than 4
> dimensional spaces, anyone? I know I sound like some very well known
> person, but really 20 "ought to be enough for everyone" (TM) :-).
>
>
> How about setting the default case to 3 or 4 dimensions and then
> special casing the rare higher dimensional arrays, i.e. using malloc
> for these situations. The default dimension size could be a compile
> time option for those who routinely exceed the default size of 3 or 4.
This seems like premature optimization. In most cases, if you're in a
situation where the dimensional overhead matters (lots of small arrays),
you are using Numeric/Numarray/NumPy poorly and your code is going to be
slow and bloated anyway. The way to get good efficiency with these
extensions is to do block operations on large matrices. This often
involves a little trickery and several extra dimensions. Reducing the
default matrix size down to 3 or 4 makes efficient code slower since
going through malloc will involve an extra dereference and probably some
extra branches.
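To illustrate the point above, here is a small sketch (my example, not from the thread) of the kind of "trickery and several extra dimensions" that block operations involve, using NumPy broadcasting to replace a loop over lots of small arrays:

```python
import numpy as np

# Computing all pairwise differences of N points in one block operation,
# instead of a Python loop over N*N tiny 3-element arrays.
points = np.arange(12.0).reshape(4, 3)  # 4 points in 3-D

# Insert two extra axes so broadcasting produces the whole (4, 4, 3)
# block in a single C-level pass:
#   shape (4, 1, 3) - shape (1, 4, 3) -> shape (4, 4, 3)
diffs = points[:, np.newaxis, :] - points[np.newaxis, :, :]

assert diffs.shape == (4, 4, 3)
```

The intermediate array here uses more dimensions than the data "needs", which is exactly why capping the maximum at 3 or 4 would get in the way of efficient code.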
There's also no point in setting the default max matrix size to <N> if
the typical default alignment (8 bytes, IIRC) is going to leave some
bytes unused due to padding. If one were to pursue some sort of
hybrid scheme as proposed above, I'd say at a minimum the default
dimension size would be 6; larger depending on how the alignment works
out. I am also leery of this sort of hybrid scheme, since the code path
for dimensions larger than <N> would be little tested and would thus
be a good place for bugs to lurk.
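A back-of-envelope sketch of the alignment argument (my numbers and helper, not from the post): if the allocator rounds sizes up to 8-byte boundaries, shrinking the inline dims array below the boundary saves nothing, because the "saved" bytes just become padding.

```python
def padded(nbytes, align=8):
    """Round an allocation size up to the next multiple of `align`."""
    return (nbytes + align - 1) // align * align

# Hypothetical 32-bit platform where each dimension entry is 4 bytes.
intp = 4
for ndims in (3, 4, 5, 6):
    raw = ndims * intp
    print(ndims, raw, padded(raw))
# 3 dims (12 bytes) and 4 dims (16 bytes) both occupy 16 aligned bytes,
# so a limit of 3 buys nothing over a limit of 4 on such a platform.
```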
I don't see anything wrong with making the maximum dimension size a
compile time option though. However, since in the common case the extra
dimensions are unlikely to affect performance in any meaningful way, I'd
recommend that the maximum number of dimensions stay large by default.
Thus people who need to conserve bytes, which I'd consider the rare
case, have the option of reducing the max dimensions, while arrays
behave in an unsurprising manner when compiled normally.
If someone has the extra time, it would be interesting to see some data
about how always mallocing the extra dimensions, so that there is no
maximum-dimensions limit, affects performance. I'd also be interested in
seeing cases where the extra dimensions actually affect performance
before doing stuff to complicate^H^H^H^H^H^H^H^H fix things.
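As a rough sketch of the kind of measurement I mean (my benchmark, not data anyone has collected), here is a comparison showing that per-array overhead, rather than per-dimension storage, is what dominates when you use lots of small arrays instead of one block operation:

```python
import time
import numpy as np

small = [np.ones(3) for _ in range(10000)]  # lots of tiny arrays
big = np.ones((10000, 3))                   # one block

t0 = time.perf_counter()
s1 = sum(a.sum() for a in small)            # 10000 tiny operations
t_small = time.perf_counter() - t0

t0 = time.perf_counter()
s2 = big.sum()                              # one block operation
t_big = time.perf_counter() - t0

assert s1 == s2 == 30000.0
# t_small is typically far larger than t_big; a few bytes of extra
# dims storage per array is noise next to that per-call overhead.
```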
-tim
>
> -- Paul
>
> --
> Paul Barrett, PhD Johns Hopkins University
> Assoc. Research Scientist Dept of Physics and Astronomy
> Phone: 410-516-5190 Baltimore, MD 21218