[Numpy-discussion] ANN: numexpr 2.4.3 released

Mon Apr 27 22:47:37 EDT 2015

On Apr 27, 2015 5:30 PM, "Neil Girdhar" <mistersheik at gmail.com> wrote:
>
>
>
> On Mon, Apr 27, 2015 at 7:42 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>> On Mon, Apr 27, 2015 at 4:23 PM, Neil Girdhar <mistersheik at gmail.com> wrote:
>> > I was told that numba did similar ast parsing, but maybe that's not
>> > true.
>> > Regarding the ast, I don't know about reliability, but take a look at
>> > get_ast in pyautodiff:
>> >
>> > https://github.com/LowinData/pyautodiff/blob/7973e26f1c233570ed4bb10d08634ec7378e2152/autodiff/context.py
>> > It looks up the __file__ attribute and passes that through compile to
>> > get the ast.  Of course that won't work when you don't have source code
>> > (a .pyc only module, or when else?)
>> >
>> > Since I'm looking into this kind of solution for the future of my
>> > code, I'm curious if you think that's too unreliable for some reason?
>>
>> I'd certainly hesitate to rely on it for anything I cared about or
>> would be used by a lot of people... it's just intrinsically pretty
>> hacky. No guarantee that the source code you find via __file__ will
>> match what was used to compile the function, doesn't work when working
>> interactively or from the ipython notebook, etc. Or else you have to
>> trust a decompiler, which is a pretty serious complex chunk of code
>> just to avoid typing quote marks.
>
>
> Those are all good points.  However, it's more than just typing quote
> marks.  The code might have non-numpy things mixed in.  It might have
> context managers and function calls and so on.  More comments below.
>
>>
>>
>> >   From a
>> > usability standpoint, I do think that's better than feeding in strings,
>> > which:
>> > * are not syntax highlighted, and
>> > * require porting code from regular numpy expressions to numexpr
>> >   strings (applying a decorator is so much easier).
>>
>> Yes, but then you have to write a program that knows how to port code
>> from numpy expressions to numexpr strings :-). numexpr only knows a
>> tiny restricted subset of Python...
>>
>> The general approach I'd take to solve these kinds of problems would
>> be similar to that used by Theano or dask -- use regular python source
>> code that generates an expression graph in memory. E.g. this could
>> look like
>>
>> def do_stuff(arr1, arr2):
>>     arr1 = deferred(arr1)
>>     arr2 = deferred(arr2)
>>     arr3 = np.sum(arr1 + (arr2 ** 2))
>>     return force(arr3 / np.sum(arr3))
>>
>> -n
>>
>
> Right, there are three basic approaches:  string processing, AST
> processing, and compile-time expression graphs.
>
> The big advantage to AST processing over the other two is that you can
> write and test your code as regular numpy code along with regular tests.
> Then, with the application of a decorator, you get the speedup you're
> looking for.  The problem with porting the numpy code to numexpr strings
> or Theano-like expression graphs is that porting can introduce bugs, and
> even if you're careful, every time you make a change to the numpy version
> of the code, you have to port it again.
>
> Also, I personally want to do more than just AST transformations of the
> numpy code.  For example, I have some methods that call super.  The super
> calls can be collapsed since the mro is known at compile time.

If you want something that handles arbitrary python code ('with' etc.), and
produces results identical to cpython (so tests are reliable), except in
cases where it violates the semantics for speed (super), then yeah, you
want a full replacement python implementation, and I agree that the proper
input to a python implementation is .py files :-). That's getting a bit far
afield from numexpr's goals though...
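
To make the reliability concern quoted above concrete, here is a minimal
sketch of recovering an AST from a function's source. It uses the standard
inspect.getsource call rather than the __file__ lookup in pyautodiff, and
try_get_ast is a hypothetical helper name, not part of any library. A
function compiled from a string stands in for code typed interactively or
shipped without source.

```python
import ast
import inspect
import textwrap


def try_get_ast(func):
    """Return the parsed AST of func's source, or None if no source is found."""
    try:
        source = inspect.getsource(func)
    except (OSError, TypeError):
        # OSError: no source file could be located for the function.
        # TypeError: the object is a builtin with no Python source at all.
        return None
    return ast.parse(textwrap.dedent(source))


# Compile a function from a string under a made-up filename, so there is
# no file on disk for inspect to read back.
namespace = {}
code = compile("def square(x):\n    return x * x\n", "<generated>", "exec")
exec(code, namespace)
square = namespace["square"]

# The function still runs normally, but its source cannot be recovered.
tree = try_get_ast(square)
```

Any decorator built on this kind of lookup needs a fallback for the None
case, which is part of why the approach feels hacky.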

-n
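
For readers who want to see the expression-graph idea from the quoted
do_stuff example in runnable form, the following is a rough sketch. The
names deferred and force come from that example; the Deferred class and its
internals are assumptions made for illustration, and this is not how Theano
or dask are implemented. It uses plain numbers to stay self-contained, though
the same operator overloading would apply to array operands.

```python
import operator


class Deferred:
    """Node in a tiny expression graph: a leaf value or a pending operation."""

    def __init__(self, op=None, args=(), value=None):
        self.op = op        # callable for interior nodes, None for leaves
        self.args = args    # child nodes of an interior node
        self.value = value  # payload of a leaf node

    def _binary(self, other, op):
        # Wrap plain operands so that every child is a Deferred node.
        if not isinstance(other, Deferred):
            other = Deferred(value=other)
        return Deferred(op=op, args=(self, other))

    def __add__(self, other):
        return self._binary(other, operator.add)

    def __pow__(self, other):
        return self._binary(other, operator.pow)

    def __truediv__(self, other):
        return self._binary(other, operator.truediv)


def deferred(value):
    """Wrap a concrete value as a leaf of the graph."""
    return Deferred(value=value)


def force(node):
    """Evaluate the graph rooted at node by recursing into its children."""
    if node.op is None:
        return node.value
    return node.op(*(force(arg) for arg in node.args))


def do_stuff(x, y):
    x = deferred(x)
    y = deferred(y)
    z = x + (y ** 2)     # builds graph nodes; nothing is computed yet
    return force(z / 2)  # evaluation happens only here
```

Because the whole graph exists before force runs, an optimizer could
inspect or rewrite it first, for example by fusing the elementwise
operations into a single pass.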