vectorized linalg

David Goldsmith David.L.Goldsmith at noaa.gov
Mon Oct 9 10:32:12 EDT 2006


Tim Hochberg wrote:
> I periodically need to perform various linalg operations on a large 
> number of small matrices. The normal numpy approach of stacking up the 
> data into an extra dimension and doing the operations in parallel 
> doesn't work because the linalg functions are only designed to work on 
> one matrix at a time. At first glance, this restriction seems harmless 
> enough since the underlying LAPACK functions are also designed to work 
> one matrix at a time and it seem that there would be little to be gained 
> by parallelizing the Python level code. However, this turns out to be 
> the case.
>
> I have some code, originally written for numarray, but I recently ported 
> it over to numpy, that rewrites the linalg function determinant, inverse 
> and solve_linear_equations[1] to operate of 2 or 3D arrays, treating 3D 
> arrays as stacks of 2D arrays. For operating on large numbers of small 
> arrays (for example 1000 2x2 arrays), this approach is over an order of 
> magnitude faster than the obvious map(inverse, threeDArray) approach.
>
> So, some questions:
>
>     1. Is there any interest in this code for numpy.linalg? I'd be 
> willing to clean it up and work on the other functions in linalg so that 
> they were all consistent. Since these changes should all be backwards 
> compatible (unless your counting on catching an error from one of the 
> linalg functions), I don't see this as a problem to put in after 1.0, so 
> there's really no rush on this. I'm only bringing it up now since I just 
> ported it over to numpy and it's fresh in my mind.
>   
Say "no" to someone offering to make a Python module more feature rich?  
Blasphemy! :-)

But I am very curious as to the source/basis of the performance 
improvement - did you figure this out?

DG
>     2. If 1., what to do about norm? I think the other functions in 
> linalg stack naturally, but norm, since it is meant to work on both 
> vectors and matrices is problematic. Norm already seems a bit 
> problematic in that, for some values of 'ord', norm can return different 
> values for a shape [N] and a shape [1,N] array containing the same values:
>
>      >>> linalg.norm(a[:1], 1)
>     0.78120069791102054
>      >>> linalg.norm(a[0], 1)
>     1.4588102446677758
>
>     My inclination is to introduce stackable 1 and 2D version s of norm, 
> vnorm and mnorm for lack of better names. Ideally we'd deprecate the use 
> of norm for ord!=None (Froebenius norm) or at least in cases where the 
> result depends on the rank [1,N] versus [N].
>
>     3. A similar issue arises with dot. One would like to be able to do 
> stacked matrix products, vector products and matrix-vector products. If 
> we are making the rest of linalg stackable, linalg would be a sensible 
> place for a stackable version of dot to live. However, it's immediately 
> clear how to do this without introducing a whole pile (4) of stacking 
> dot function to handle the various cases, plus broadcasting, cleanly. 
> This would require some more thought. Thoughts?
>
> -tim
>
>
> [1] Those are actually the numarray names, which the current code still 
> uses, the numpy names are 'det', 'inv' and 'solve'
>
> [2] Treatment of stacking insolve linear equations is a bit more 
> complicated, but I'll ignore that for now.
>
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys -- and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>   


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV




More information about the NumPy-Discussion mailing list