vectorized linalg
David Goldsmith
David.L.Goldsmith at noaa.gov
Mon Oct 9 10:32:12 EDT 2006
Tim Hochberg wrote:
> I periodically need to perform various linalg operations on a large
> number of small matrices. The normal numpy approach of stacking up the
> data into an extra dimension and doing the operations in parallel
> doesn't work because the linalg functions are only designed to work on
> one matrix at a time. At first glance, this restriction seems harmless
> enough since the underlying LAPACK functions are also designed to work
> one matrix at a time and it seem that there would be little to be gained
> by parallelizing the Python level code. However, this turns out to be
> the case.
>
> I have some code, originally written for numarray, but I recently ported
> it over to numpy, that rewrites the linalg function determinant, inverse
> and solve_linear_equations[1] to operate of 2 or 3D arrays, treating 3D
> arrays as stacks of 2D arrays. For operating on large numbers of small
> arrays (for example 1000 2x2 arrays), this approach is over an order of
> magnitude faster than the obvious map(inverse, threeDArray) approach.
>
> So, some questions:
>
> 1. Is there any interest in this code for numpy.linalg? I'd be
> willing to clean it up and work on the other functions in linalg so that
> they were all consistent. Since these changes should all be backwards
> compatible (unless your counting on catching an error from one of the
> linalg functions), I don't see this as a problem to put in after 1.0, so
> there's really no rush on this. I'm only bringing it up now since I just
> ported it over to numpy and it's fresh in my mind.
>
Say "no" to someone offering to make a Python module more feature rich?
Blasphemy! :-)
But I am very curious as to the source/basis of the performance
improvement - did you figure this out?
DG
> 2. If 1., what to do about norm? I think the other functions in
> linalg stack naturally, but norm, since it is meant to work on both
> vectors and matrices is problematic. Norm already seems a bit
> problematic in that, for some values of 'ord', norm can return different
> values for a shape [N] and a shape [1,N] array containing the same values:
>
> >>> linalg.norm(a[:1], 1)
> 0.78120069791102054
> >>> linalg.norm(a[0], 1)
> 1.4588102446677758
>
> My inclination is to introduce stackable 1 and 2D version s of norm,
> vnorm and mnorm for lack of better names. Ideally we'd deprecate the use
> of norm for ord!=None (Froebenius norm) or at least in cases where the
> result depends on the rank [1,N] versus [N].
>
> 3. A similar issue arises with dot. One would like to be able to do
> stacked matrix products, vector products and matrix-vector products. If
> we are making the rest of linalg stackable, linalg would be a sensible
> place for a stackable version of dot to live. However, it's immediately
> clear how to do this without introducing a whole pile (4) of stacking
> dot function to handle the various cases, plus broadcasting, cleanly.
> This would require some more thought. Thoughts?
>
> -tim
>
>
> [1] Those are actually the numarray names, which the current code still
> uses, the numpy names are 'det', 'inv' and 'solve'
>
> [2] Treatment of stacking insolve linear equations is a bit more
> complicated, but I'll ignore that for now.
>
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys -- and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>
-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
More information about the NumPy-Discussion
mailing list