[Numpy-discussion] Maximum available dimensions in numpy

Paul Barrett pebarrett at gmail.com
Mon Jan 16 20:31:02 EST 2006


On 1/16/06, Tim Hochberg <tim.hochberg at cox.net> wrote:
>
> Paul Barrett wrote:
>
> > On 1/16/06, Sasha <ndarray at mac.com> wrote:
> >
> >     > Before, I think numpy supported up to 32 dimensions. Is there
> >     > any reason for this new limit? Just curious.
> >
> >     It was actually 40 until recently. I don't know the answer to your
> >     question (Travis?), but I am curious why anyone would need more than,
> >     say, 4. Solving PDEs by the finite difference method in spaces of
> >     more than 4 dimensions, anyone? I know I sound like some very well
> >     known person, but really 20 "ought to be enough for everyone" (TM) :-).
> >
> >
> > How about setting the default case to 3 or 4 dimensions and then
> > special-casing the rare higher-dimensional arrays, i.e. using malloc
> > for these situations?  The default dimension size could be a
> > compile-time option for those who routinely exceed the default size
> > of 3 or 4.
>
> This seems like premature optimization. In most cases, if you're in a
> situation where the dimensional overhead matters (lots of small arrays)
> you are using Numeric/Numarray/NumPy poorly and your code is going to be
> slow and bloated anyway. The way to get good efficiency with these
> extensions is to do block operations on large matrices. This often
> involves a little trickery and several extra dimensions. Reducing the
> default number of dimensions to 3 or 4 makes efficient code slower, since
> going through malloc will involve an extra dereference and probably some
> extra branches.


My proposal also avoids the possibility of running up against the maximum
number of dimensions, while conserving memory. For users who create a
multitude of small arrays, the wasted memory might become important.  I
suggested 3 or 4 dimensions only because that would appear to cover 99% of
the cases.  I hope the users creating the other 1% know what they are doing.
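
As a rough illustration of the memory at stake, here is a minimal sketch
using Python's ctypes. The layout (one pointer-sized integer per dimension
and one per stride, stored inline in every array header) is an assumption
made for illustration and is not taken from the NumPy sources; the names
shape_block_bytes and wasted_bytes are hypothetical.

```python
import ctypes


def shape_block_bytes(max_dims):
    """Bytes used by inline dimension and stride arrays sized for max_dims.

    Hypothetical layout: one pointer-sized integer per dimension and one
    per stride, stored inline in every array header.
    """
    class ShapeBlock(ctypes.Structure):
        _fields_ = [
            ("dimensions", ctypes.c_ssize_t * max_dims),
            ("strides", ctypes.c_ssize_t * max_dims),
        ]
    return ctypes.sizeof(ShapeBlock)


def wasted_bytes(max_dims, ndim, n_arrays):
    """Unused shape bytes across n_arrays arrays that each use ndim dims."""
    per_array = shape_block_bytes(max_dims) - shape_block_bytes(ndim)
    return per_array * n_arrays


if __name__ == "__main__":
    # Compare a few candidate limits for one million 2-dimensional arrays.
    for max_dims in (4, 20, 32, 40):
        print(max_dims, shape_block_bytes(max_dims),
              wasted_bytes(max_dims, 2, 1_000_000))
```

The absolute numbers depend on the platform's pointer size, so the sketch
prints them rather than assuming a value.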

> There's also no point in setting the default maximum number of
> dimensions to <N> if the typical default alignment (8 byte, IIRC) is
> going to leave some bytes unused. If one were to pursue some sort of
> hybrid scheme as proposed above, I'd say at a minimum a default
> dimension size would be 6; larger depending on how the alignment works
> out.  I am also leery of this sort of hybrid scheme, since the code for
> dimensions larger than <N> would be little tested and would thus be a
> good place for bugs to lurk.


Agreed.  But 20 or 30 extra dimensions also seem rather wasteful and ad hoc.
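
The alignment point quoted above can be checked with a small ctypes
sketch. The header below is hypothetical (one double standing in for any
8-byte-aligned member, followed by 32-bit dimension slots) and is not
NumPy's actual struct. On platforms where a double is 8-byte aligned, the
struct is padded to a multiple of 8 bytes, so an odd number of 4-byte
slots saves nothing over the next even number.

```python
import ctypes


def header_bytes(n_dims):
    """Size of a hypothetical header with n_dims 32-bit dimension slots."""
    class Header(ctypes.Structure):
        _fields_ = [
            # Stands in for any member that forces 8-byte alignment.
            ("data", ctypes.c_double),
            ("dims", ctypes.c_int32 * n_dims),
        ]
    return ctypes.sizeof(Header)


if __name__ == "__main__":
    print("alignment of double:", ctypes.alignment(ctypes.c_double))
    for n in range(1, 9):
        # Where doubles are 8-byte aligned, sizes grow in steps of 8 bytes.
        print(n, header_bytes(n))
```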

> I don't see anything wrong with making the maximum dimension size a
> compile-time option though. However, since in the common case the extra
> dimensions are unlikely to affect performance in any meaningful way, I'd
> recommend that the maximum number of dimensions stay large by
> default.  Thus people who need to conserve bytes, which I'd consider the
> rare case, have the option of reducing the max dimensions while the
> arrays behave in an unsurprising manner when compiled in the normal manner.
>
> If someone has the extra time, it would be interesting to see some data
> about how always mallocing the extra dimensions, so there was no maximum
> dimensions limit, affects performance. I'd also be interested in seeing
> cases where the extra dimensions actually affect performance before
> doing stuff to complicate^H^H^H^H^H^H^H^H fix things.
>

I'd venture to guess that mallocing the extra dimensions would give a small,
though noticeable, decrease in performance.  Of course, as the array becomes
larger, this overhead will decrease in relation to the time it takes to
perform the operation, much like the overhead seen in numarray, though not
as large.  As you mentioned, arrays with many dimensions are often large
arrays, so the malloc overhead will probably not be too significant.
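
For concreteness, the hybrid scheme discussed in this thread (inline
storage for small ranks, a separate allocation for larger ones) might look
roughly like the sketch below. It models the C layout with ctypes; a real
implementation would be written in C with malloc. The names INLINE_DIMS,
HybridShape, make_shape and get_dim are hypothetical, and the value 4 is
just one of the defaults suggested above.

```python
import ctypes

INLINE_DIMS = 4  # hypothetical default; the thread suggests 3 or 4


class HybridShape(ctypes.Structure):
    """Hypothetical shape record: small ranks inline, larger ranks on the heap."""
    _fields_ = [
        ("ndim", ctypes.c_int),
        ("inline_dims", ctypes.c_ssize_t * INLINE_DIMS),
        # NULL unless ndim > INLINE_DIMS.
        ("heap_dims", ctypes.POINTER(ctypes.c_ssize_t)),
    ]


def make_shape(dims):
    """Build a shape record, allocating separately only for large ranks."""
    shape = HybridShape()
    shape.ndim = len(dims)
    if len(dims) <= INLINE_DIMS:
        for i, d in enumerate(dims):
            shape.inline_dims[i] = d
    else:
        # This separate buffer stands in for a malloc'd block in C.
        buf = (ctypes.c_ssize_t * len(dims))(*dims)
        shape.heap_dims = ctypes.cast(buf, ctypes.POINTER(ctypes.c_ssize_t))
        shape._keepalive = buf  # keep the buffer alive as long as the record
    return shape


def get_dim(shape, i):
    """Read dimension i; note the extra branch and the extra dereference."""
    if not 0 <= i < shape.ndim:
        raise IndexError(i)
    if shape.ndim <= INLINE_DIMS:   # the extra branch
        return shape.inline_dims[i]
    return shape.heap_dims[i]       # the extra dereference
```

Every access to a dimension pays for the branch in get_dim, and only the
high-rank path pays for the allocation and the pointer dereference. That
high-rank path is also the one that would see the least testing.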

 -- Paul

--
Paul Barrett, PhD                     Johns Hopkins University
Assoc. Research Scientist     Dept of Physics and Astronomy
Phone: 410-516-5190            Baltimore, MD 21218