[Numpy-discussion] Numpy and PEP 343

Thu Mar 2 15:49:11 EST 2006

David M. Cooke wrote:

>eric jones <eric at enthought.com> writes:
>
>  
>
>>Travis Oliphant wrote:
>>
>>    
>>
>>>Weave can still help with the "auto-compilation" of the specific
>>>library for your type.  Ultimately such code will be faster than
>>>NumPy can ever be.
>>>      
>>>
>>Yes.  weave.blitz() can be used to do the equivalent of this lazy
>>evaluation for you in many cases without much effort.  For example:
>>
>>import weave
>>from scipy import arange
>>
>>a = arange(1e7)
>>b = arange(1e7)
>>c=2.0*a+3.0*b
>>
>># or with weave
>>weave.blitz("c=2.0*a+3.0*b")
>>
>>As Paul D. mentioned, what Tim outlined is essentially template
>>expressions in C++.  blitz++ (http://www.oonumerics.org/blitz/) is a
>>C++ template expressions library for array operations, and weave.blitz
>>translates a Numeric expression into C++ blitz code.  For the example
>>above, you get about a factor of 4 speed-up on large arrays.  (Note
>>that the first time you run the example it will be much slower
>>because of compile time.  Use timings from subsequent runs.)
>>
>>C:\temp>weave_time.py
>>Expression: c=2.0*a+3.0*b
>>Numeric: 0.678311899322
>>Weave: 0.162177084984
>>Speed-up: 4.18253848494
>>
>>All this to say, I think weave basically accomplishes what Tim wants
>>with a different mechanism (letting C++ compilers do the optimization
>>instead of writing this optimization at the python level).  It does
>>require a compiler on client machines in its current form (even that
>>can be fixed...), but I think it might prove faster than
>>re-implementing a numeric expression compiler at the python level
>>(though that sounds fun as well).
>>    
>>
>
>I couldn't leave it at that :-), so I wrote my bytecode idea up.
>
>You can grab it at http://arbutus.mcmaster.ca/dmc/software/numexpr-0.1.tar.gz
>
>Here are the performance numbers (note I use 1e6 elements instead of
>1e7)
>
>cookedm at arbutus$ py -c 'import numexpr.timing; numexpr.timing.compare()'
>Expression: b*c+d*e
>numpy: 0.0934900999069
>Weave: 0.0459051132202
>Speed-up of weave over numpy: 2.03659447388
>numexpr: 0.0467489004135
>Speed-up of numexpr over numpy: 1.99983527056
>
>Expression: 2*a+3*b
>numpy: 0.0784019947052
>Weave: 0.0292909860611
>Speed-up of weave over numpy: 2.67665945222
>numexpr: 0.0323888063431
>Speed-up of numexpr over numpy: 2.42065094572
>
>Wow. On par with weave.blitz, and no C++ compiler!!! :-)
>  
>
That's awesome! I was also tempted by this, but I never got beyond 
prototyping some stuff in Python.

>You use it like this:
>
>from numpy import arange
>from numexpr import numexpr
>
>func = numexpr("2*a+3*b")
>
>a = arange(1e6)
>b = arange(1e6)
>c = func(a, b)
>  
>
Does this just use the order in which variables are first used in the 
expression to determine the input order? I'm not sure I like that. It is 
very convenient for simple expressions though.
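
To make the concern concrete, here is a minimal sketch of what 
"first-use order" could look like. It is only an illustration: 
first_use_order is a hypothetical helper built on the standard ast 
module, not anything taken from numexpr, and whether numexpr actually 
orders its inputs this way is exactly the open question.

```python
import ast


def first_use_order(expression):
    """Return variable names in the order they first appear in the text.

    Hypothetical helper for illustration only. Every Name node counts,
    so a function name such as sin in "sin(a)" would be listed too.
    """
    tree = ast.parse(expression, mode="eval")
    names = [node for node in ast.walk(tree) if isinstance(node, ast.Name)]
    # ast.walk is breadth-first, so sort by source position explicitly.
    names.sort(key=lambda node: (node.lineno, node.col_offset))
    ordered = []
    for node in names:
        if node.id not in ordered:
            ordered.append(node.id)
    return ordered


print(first_use_order("2*a+3*b"))  # ['a', 'b']
print(first_use_order("3*b+2*a"))  # ['b', 'a']
```

The two expressions are mathematically identical, yet the implied 
argument order flips, which is why a signature derived this way is easy 
to get wrong once an expression is edited.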

>Alternatively, you can use it like weave.blitz, like this:
>
>from numpy import arange
>from numexpr import evaluate
>
>a = arange(1e6)
>b = arange(1e6)
>c = evaluate("2*a+3*b")
>  
>
That's pretty sweet. And I see that you cache the expressions, so it 
should be pretty fast if you need to loop.

[snip details]

>If people think it's useful enough, I can check it into scipy's sandbox.
>  
>
Definitely check it in. It won't compile here with VC7, but I'll see if 
I can figure out why.

This is probably thinking too far ahead, but an interesting thing to try 
would be adding conditional expressions:

c = evaluate("(2*a + b) if (a > b) else (2*b + a)")

If that could be made to work, and work fast, it would save both memory 
and time in those cases where you have to vary the computation based on 
the value. At present I end up computing the full arrays for each case 
and then choosing which result to use based on a mask, so it takes three 
times as much space as doing it element by element.
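
A small sketch of the two approaches, for comparison. Plain Python 
lists stand in for arrays here so that it runs without any 
dependencies; the names mask, when_true and when_false are just for 
illustration.

```python
a = [1.0, 5.0, 3.0, 8.0]
b = [4.0, 2.0, 3.0, 6.0]

# Mask approach: both branches are computed in full, then one value per
# element is picked. That means three full-size temporaries.
mask = [x > y for x, y in zip(a, b)]
when_true = [2 * x + y for x, y in zip(a, b)]
when_false = [2 * y + x for x, y in zip(a, b)]
c_masked = [t if m else f for m, t, f in zip(mask, when_true, when_false)]

# Element-by-element approach: only the branch that applies is computed,
# and no full-size temporaries are needed beyond the result.
c_elementwise = [(2 * x + y) if (x > y) else (2 * y + x) for x, y in zip(a, b)]

print(c_masked)       # [9.0, 12.0, 9.0, 22.0]
print(c_elementwise)  # [9.0, 12.0, 9.0, 22.0]
```

Both give the same answer; the difference is only in how much 
intermediate storage and wasted arithmetic it takes to get there.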

-tim