[Numpy-discussion] lazy evaluation

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Tue Jun 5 17:36:11 EDT 2012

On 06/05/2012 10:47 PM, mark florisson wrote:
> On 5 June 2012 20:17, Nathaniel Smith<njs at pobox.com>  wrote:
>> On Tue, Jun 5, 2012 at 7:08 PM, mark florisson
>> <markflorisson88 at gmail.com>  wrote:
>>> On 5 June 2012 17:38, Nathaniel Smith<njs at pobox.com>  wrote:
>>>> On Tue, Jun 5, 2012 at 4:12 PM, mark florisson
>>>> <markflorisson88 at gmail.com>  wrote:
>>>>> On 5 June 2012 14:58, Nathaniel Smith<njs at pobox.com>  wrote:
>>>>>> On Tue, Jun 5, 2012 at 12:55 PM, mark florisson
>>>>>> <markflorisson88 at gmail.com>  wrote:
>>>>>>> It would be great if we implement the NEP listed above, but with a few
>>>>>>> extensions. I think Numpy should handle the lazy evaluation part, and
>>>>>>> determine when expressions should be evaluated, etc. However, for each
>>>>>>> user operation, Numpy will call back a user-installed hook
>>>>>>> implementing some interface, to allow various packages to provide
>>>>>>> their own hooks to evaluate vector operations however they want. This
>>>>>>> will include packages such as Theano, which could run things on the
>>>>>>> GPU, Numexpr, and in the future
>>>>>>> https://github.com/markflorisson88/minivect (which will likely have an
>>>>>>> LLVM backend in the future, and possibly integrated with Numba to
>>>>>>> allow inlining of numba ufuncs). The project above tries to bring
>>>>>>> together all the different array expression compilers in a
>>>>>>> single framework, to provide efficient array expressions specialized
>>>>>>> for any data layout (nditer on steroids if you will, with SIMD,
>>>>>>> threaded and inlining capabilities).
>>>>>> A global hook sounds ugly and hard to control -- it's hard to tell
>>>>>> which operations should be deferred and which should be forced, etc.
>>>>> Yes, but for the user the difference should not be visible (unless
>>>>> operations can raise exceptions, in which case you choose the safe
>>>>> path, or let the user configure what to do).
>>>>>> While it would be less magical, I think a more explicit API would in
>>>>>> the end be easier to use... something like
>>>>>>   a, b, c, d = deferred([a, b, c, d])
>>>>>>   e = a + b * c  # 'e' is a deferred object too
>>>>>>   f = np.dot(e, d)  # so is 'f'
>>>>>>   g = force(f)  # 'g' is an ndarray
>>>>>>   # or
>>>>>>   force(f, out=g)
>>>>>> But at that point, this could easily be an external library, right?
>>>>>> All we'd need from numpy would be some way for external types to
>>>>>> override the evaluation of ufuncs, np.dot, etc.? We've recently seen
>>>>>> several reasons to want that functionality, and it seems like
>>>>>> developing these "improved numexpr" ideas would be much easier if they
>>>>>> didn't require doing deep surgery to numpy itself...
>>>>> Definitely, but besides monkey-patch-chaining I think some
>>>>> modifications would be required, but they would be reasonably simple.
>>>>> Most of the functionality would be handled in one function, which most
>>>>> ufuncs (the ones you care about, as well as ufunc (methods) like add)
>>>>> call. E.g. if ((result = NPy_LazyEval("add", op1, op2))) return result;
>>>>> , which is inserted after argument unpacking and sanity checking. You
>>>>> could also do a per-module hook, and have the function look at
>>>>> sys._getframe(1).f_globals, but that is fragile and won't work from C
>>>>> or Cython code.
>>>>> How did you have overrides in mind?
>>>> My vague idea is that core numpy operations are about as fundamental
>>>> for scientific users as the Python builtin operations are, so they
>>>> should probably be overrideable in a similar way. So we'd teach numpy
>>>> functions to check for methods named like "__numpy_ufunc__" or
>>>> "__numpy_dot__" and let themselves be overridden if found. Like how
>>>> __gt__ and __add__ and stuff work. Or something along those lines.
>>>>> I also found this thread:
>>>>> http://mail.scipy.org/pipermail/numpy-discussion/2011-June/056945.html
>>>>> , but I think you want more than just to override ufuncs, you want
>>>>> numpy to govern when stuff is allowed to be lazy and when stuff should
>>>>> be evaluated (e.g. when it is indexed, slice assigned (although that
>>>>> itself may also be lazy), etc). You don't want some funny object back
>>>>> that doesn't work with things which are not overridden in numpy.
>>>> My point is that probably numpy should *not* govern the decision about
>>>> what stuff should be lazy and what should be evaluated; that should be
>>>> governed by some combination of the user and
>>>> Numba/Theano/minivect/whatever. The toy API I sketched out would make
>>>> those decisions obvious and explicit. (And if the funny objects had an
>>>> __array_interface__ attribute that automatically forced evaluation
>>>> when accessed, then they'd work fine with code that was expecting an
>>>> array, or if they were assigned to a "real" ndarray, etc.)
>>> That's disappointing though, since the performance drawbacks can
>>> severely limit the usefulness for people with big data sets. Ideally,
>>> you would take your intuitive numpy code, and make it go fast, without
>>> jumping through hoops. Numpypy has lazy evaluation,  I don't know how
>>> good a job it does, but it does mean you can finally get fast numpy
>>> code in an intuitive way (and even run it on a GPU if that is possible
>>> and beneficial).
>> All of these proposals require the user to jump through hoops -- the
>> deferred-ufunc NEP has the extra 'with deferredstate' thing, and more
>> importantly, a set of rules that people have to learn and keep in mind
>> for which numpy operations are affected, which ones aren't, which
>> operations can't be performed while deferredstate is True, etc. So
>> this has two problems: (1) these rules are opaque, (2) it's far from
>> clear what the rules should be.
> Right, I guess I should have commented on that. I don't think the
> deferredstate stuff is needed at all, execution can always be deferred
> as long as it does not affect semantics. So if something is marked
> readonly because it is used in an expression and then written to, you
> evaluate the expression and then perform the write. The only way to
> break stuff, I think, would be to use pointers through the buffer
> interface or PyArray_DATA and not respect the sudden readonly
> property. A deferred expression is only evaluated once in any valid
> GIL-holding context (so it shouldn't break threads either).

I think Nathaniel's point is that the point where you get a 10-second 
pause to wait for computation is part of the semantics of current NumPy:

print 'Starting computation'
z = (x + y).sum()
print 'Computation done'
print 'Result was', z

I think that if this wasn't the case, newbies would be tripped up a 
lot and things would feel a lot less intuitive. Certainly when working 
from the IPython command line.

Also, to remain sane in IPython (or when using a debugger, etc.), I'd want

"print z"

to print something like "unevaluated array", not to trigger a 
computation. Same with str(z) and so on.
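
A minimal sketch of that behaviour, in Python 3 and without NumPy so it 
stays self-contained. The Deferred class and its compute() method are 
hypothetical names for illustration, not an existing NumPy API:

```python
class Deferred:
    """Hypothetical wrapper around a computation that has not run yet."""

    def __init__(self, thunk):
        self._thunk = thunk   # zero-argument callable producing the value
        self._value = None
        self._done = False

    def compute(self):
        # Run the computation at most once and cache the result.
        if not self._done:
            self._value = self._thunk()
            self._done = True
        return self._value

    def __repr__(self):
        # Deliberately does not call compute(): printing never forces work.
        if self._done:
            return "<Deferred: evaluated, value=%r>" % (self._value,)
        return "<Deferred: unevaluated array>"

    __str__ = __repr__
```

With this, print(z) at the prompt only reports the state of z; the pause 
happens where the user explicitly calls z.compute().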

I don't think a context manager modifying thread-local global state like

with np.lazy:

would be horribly intrusive.
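
As a rough sketch of what such a context manager could look like 
(Python 3, standard library only; lazy(), lazy_enabled() and add() are 
made-up names, and np.lazy itself does not exist in NumPy):

```python
import threading
from contextlib import contextmanager

# Per-thread storage, so enabling laziness in one thread
# does not affect code running in another.
_state = threading.local()


def lazy_enabled():
    return getattr(_state, "lazy", False)


@contextmanager
def lazy():
    previous = lazy_enabled()
    _state.lazy = True
    try:
        yield
    finally:
        # Restore the previous setting, so nested blocks behave sensibly.
        _state.lazy = previous


def add(a, b):
    # Hypothetical operation that consults the flag: inside the block it
    # returns a zero-argument callable instead of computing right away.
    if lazy_enabled():
        return lambda: a + b
    return a + b
```

Outside the block add(1, 2) returns 3 immediately; inside "with lazy():" 
it returns a callable that produces 3 when called.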

But I also think it'd be good to start with being very explicit (x = 
np.lazy_multiply(a, b); compute(x)) -- such an API should be available 
anyway -- and then have the discussion once that works.
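
To make the explicit variant concrete, here is one possible shape for 
it, in Python 3. LazyExpr and lazy_add are hypothetical helpers added 
for illustration, and lazy_multiply/compute are written as plain 
functions rather than as members of np:

```python
import operator


class LazyExpr:
    """Hypothetical node in an unevaluated expression tree."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args


def lazy_multiply(a, b):
    return LazyExpr(operator.mul, a, b)


def lazy_add(a, b):
    return LazyExpr(operator.add, a, b)


def compute(x):
    # Recursively evaluate the operands, then apply the operation.
    # Anything that is not a LazyExpr is treated as a concrete value.
    if isinstance(x, LazyExpr):
        return x.func(*(compute(arg) for arg in x.args))
    return x
```

Nothing is evaluated until compute() is called, so the user always 
knows where the work happens. With ndarray operands, operator.mul and 
operator.add dispatch to the usual elementwise operations.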

