vectorized linalg

David L Goldsmith David.L.Goldsmith at noaa.gov
Mon Oct 9 13:09:39 EDT 2006


Tim Hochberg wrote:
> David Goldsmith wrote:
>   
>> Tim Hochberg wrote:
>>   
>>     
>>> I periodically need to perform various linalg operations on a large
>>> number of small matrices. The normal numpy approach of stacking up the
>>> data into an extra dimension and doing the operations in parallel
>>> doesn't work because the linalg functions are only designed to work on
>>> one matrix at a time. At first glance, this restriction seems harmless
>>> enough, since the underlying LAPACK functions are also designed to work
>>> on one matrix at a time and it seems that there would be little to be
>>> gained by parallelizing the Python-level code. However, this turns out
>>> not to be the case.
>>>
>>> I have some code, originally written for numarray but recently ported
>>> over to numpy, that rewrites the linalg functions determinant, inverse
>>> and solve_linear_equations[1] to operate on 2D or 3D arrays, treating 3D
>>> arrays as stacks of 2D arrays. For operating on large numbers of small
>>> arrays (for example 1000 2x2 arrays), this approach is over an order of
>>> magnitude faster than the obvious map(inverse, threeDArray) approach.
>>>
>>> So, some questions:
>>>
>>>     1. Is there any interest in this code for numpy.linalg? I'd be
>>> willing to clean it up and work on the other functions in linalg so that
>>> they were all consistent. Since these changes should all be backwards
>>> compatible (unless you're counting on catching an error from one of the
>>> linalg functions), I don't see this as a problem to put in after 1.0, so
>>> there's really no rush on this. I'm only bringing it up now since I just
>>> ported it over to numpy and it's fresh in my mind.
>>>   
>>>     
>>>       
>> Say "no" to someone offering to make a Python module more feature rich?  
>> Blasphemy! :-)
>>
>> But I am very curious as to the source/basis of the performance 
>> improvement - did you figure this out?
>>   
>>     
> Yes and no. Here's what the beginning of linalg.det looks like:
>
>     a = asarray(a)
>     _assertRank2(a)
>     _assertSquareness(a)
>     t, result_t = _commonType(a)
>     a = _fastCopyAndTranspose(t, a)
>     n = a.shape[0]
>     if isComplexType(t):
>         lapack_routine = lapack_lite.zgetrf
>     else:
>         lapack_routine = lapack_lite.dgetrf
>     # ...
>
> There are quite a few function calls here, and since the actual call to
> dgetrf probably takes negligible time for a 2x2 array, the cost of the
> computation is dominated by function call overhead, creating the return
> arrays, and so on. I'm not sure which is the biggest culprit; by
> vectorizing, much of that overhead is only incurred once rather than
> hundreds or thousands of times, so it all becomes negligible at that
> point. I didn't bother to track down the worst offender.
>
>   
Ah, that all makes sense (including not bothering to profile the code).
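
A quick way to see that overhead for anyone reading along (a sketch only,
using the current spelling np.linalg.det rather than the older names) is to
time the full det call against the bare 2x2 arithmetic:

    import timeit
    import numpy as np

    a = np.random.rand(2, 2)

    # Full linalg path: asarray, rank/squareness checks, type resolution,
    # copy/transpose, and the LAPACK call itself.
    t_det = timeit.timeit(lambda: np.linalg.det(a), number=100000)

    # The same 2x2 arithmetic with none of the wrapper machinery.
    t_raw = timeit.timeit(lambda: a[0, 0] * a[1, 1] - a[0, 1] * a[1, 0],
                          number=100000)

    print("np.linalg.det:", t_det)
    print("bare formula: ", t_raw)

The second number tends to come out far smaller, which is consistent with the
per-call overhead, not dgetrf, dominating for tiny matrices.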

DG
> -tim
>
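
P.S. For the archive, here is roughly the shape of the trick Tim describes,
written as a standalone sketch for the 2x2 case with current numpy names
(np.linalg.inv rather than the old inverse). This is not Tim's code, just the
simplest illustration of the stacking idea:

    import numpy as np

    def inv2x2_stack(stack):
        # stack has shape (n, 2, 2); all n inverses are computed with a
        # handful of whole-array operations, so the Python-level overhead
        # is paid once per batch instead of once per matrix.
        stack = np.asarray(stack, dtype=float)
        det = stack[:, 0, 0] * stack[:, 1, 1] - stack[:, 0, 1] * stack[:, 1, 0]
        out = np.empty_like(stack)
        out[:, 0, 0] = stack[:, 1, 1]
        out[:, 1, 1] = stack[:, 0, 0]
        out[:, 0, 1] = -stack[:, 0, 1]
        out[:, 1, 0] = -stack[:, 1, 0]
        return out / det[:, None, None]

    a = np.random.rand(1000, 2, 2)

    # The obvious per-matrix approach from the original post ...
    slow = np.array(list(map(np.linalg.inv, a)))
    # ... versus the batched version.
    fast = inv2x2_stack(a)

    print(np.allclose(slow, fast))

The closed-form 2x2 inverse is just the easiest way to show the batching; the
same stacking idea applies equally to determinant and solve.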


-- 
HMRD/ORR/NOS/NOAA <http://response.restoration.noaa.gov/emergencyresponse/>
