
Announcing Numexpr 2.4.3
========================

Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python.

It has multi-threaded capabilities, as well as support for Intel's MKL (Math Kernel Library), which allows an extremely fast evaluation of transcendental functions (sin, cos, tan, exp, log...) while squeezing the last drop of performance out of your multi-core processors. Look here for some benchmarks of numexpr using MKL:

https://github.com/pydata/numexpr/wiki/NumexprMKL

Its only dependency is NumPy (MKL is optional), so it works well as an easy-to-deploy, easy-to-use computational engine for projects that don't want to adopt other solutions requiring heavier dependencies.

What's new
==========

This is a maintenance release to fix an old bug affecting comparisons with empty strings. Fixes #121 and PyTables #184.

If you want to know in more detail what has changed in this version, see:

https://github.com/pydata/numexpr/wiki/Release-Notes

or have a look at RELEASE_NOTES.txt in the tarball.

Where can I find Numexpr?
=========================

The project is hosted on GitHub at:

https://github.com/pydata/numexpr

You can get the packages from PyPI as well (but not for RC releases):

http://pypi.python.org/pypi/numexpr

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.

Enjoy data!

-- Francesc Alted
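For reference, a minimal usage sketch of the string-based API described above (nothing here beyond the documented numexpr.evaluate() call):

    import numpy as np
    import numexpr as ne

    a = np.random.rand(1000000)
    b = np.random.rand(1000000)

    # The expression is compiled for numexpr's virtual machine and evaluated
    # in chunks across all available cores, avoiding the full-size temporaries
    # that plain NumPy would allocate for 3*a and 4*b.
    result = ne.evaluate("3*a + 4*b")

    # By default evaluate() picks up `a` and `b` from the calling frame;
    # they can also be passed explicitly via local_dict.
    result = ne.evaluate("3*a + 4*b", local_dict={"a": a, "b": b})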

I've always wondered why numexpr accepts strings rather than looking at a function's source code, using ast to parse it, and then transforming the AST. I just looked at another project, pyautodiff, which does that. And I think numba does that for LLVM code generation. Wouldn't it be nicer to just apply a decorator to a function than to write the function as a Python string?

On Mon, Apr 27, 2015 at 11:50 AM, Francesc Alted <faltet@gmail.com> wrote:
Announcing Numexpr 2.4.3
========================
Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python.
It has multi-threaded capabilities, as well as support for Intel's MKL (Math Kernel Library), which allows an extremely fast evaluation of transcendental functions (sin, cos, tan, exp, log...) while squeezing the last drop of performance out of your multi-core processors. Look here for some benchmarks of numexpr using MKL:
https://github.com/pydata/numexpr/wiki/NumexprMKL
Its only dependency is NumPy (MKL is optional), so it works well as an easy-to-deploy, easy-to-use computational engine for projects that don't want to adopt other solutions requiring heavier dependencies.
What's new
==========
This is a maintenance release to fix an old bug affecting comparisons with empty strings. Fixes #121 and PyTables #184.
If you want to know in more detail what has changed in this version, see:
https://github.com/pydata/numexpr/wiki/Release-Notes
or have a look at RELEASE_NOTES.txt in the tarball.
Where can I find Numexpr?
=========================
The project is hosted on GitHub at:
https://github.com/pydata/numexpr
You can get the packages from PyPI as well (but not for RC releases):
http://pypi.python.org/pypi/numexpr
Share your experience
=====================
Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.
Enjoy data!
-- Francesc Alted

On Apr 27, 2015 1:44 PM, "Neil Girdhar" <mistersheik@gmail.com> wrote:
I've always wondered why numexpr accepts strings rather than looking at a function's source code, using ast to parse it, and then transforming the AST. I just looked at another project, pyautodiff, which does that. And I think numba does that for LLVM code generation. Wouldn't it be nicer to just apply a decorator to a function than to write the function as a Python string?

Numba works from byte code, not the ast. There's no way to access the ast reliably at runtime in python -- it gets thrown away during compilation.

-n

I was told that numba did similar ast parsing, but maybe that's not true. Regarding the ast, I don't know about reliability, but take a look at get_ast in pyautodiff:

https://github.com/LowinData/pyautodiff/blob/7973e26f1c233570ed4bb10d08634ec...

It looks up the __file__ attribute and passes that through compile to get the ast. Of course that won't work when you don't have source code (a .pyc only module, or when else?) Since I'm looking into this kind of solution for the future of my code, I'm curious if you think that's too unreliable for some reason?

From a usability standpoint, I do think that's better than feeding in strings, which:

* are not syntax highlighted, and
* require porting code from regular numpy expressions to numexpr strings (applying a decorator is so much easier).

Best,

Neil

On Mon, Apr 27, 2015 at 7:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Apr 27, 2015 1:44 PM, "Neil Girdhar" <mistersheik@gmail.com> wrote:
I've always wondered why numexpr accepts strings rather than looking at a function's source code, using ast to parse it, and then transforming the AST. I just looked at another project, pyautodiff, which does that. And I think numba does that for LLVM code generation. Wouldn't it be nicer to just apply a decorator to a function than to write the function as a Python string?
Numba works from byte code, not the ast. There's no way to access the ast reliably at runtime in python -- it gets thrown away during compilation.
-n
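As a rough sketch of the source-lookup approach Neil describes (standard library only; the decorator here is a hypothetical illustration, not pyautodiff's actual get_ast):

    import ast
    import inspect
    import textwrap

    def grab_ast(func):
        """Hypothetical decorator: recover a function's AST from its source file.

        Like the pyautodiff approach, this relies on the source being available
        (it fails for .pyc-only modules or code typed at an interactive prompt)
        and on the file on disk still matching the compiled function.
        """
        source = textwrap.dedent(inspect.getsource(func))  # found via func.__code__.co_filename
        tree = ast.parse(source)                           # Module node wrapping the FunctionDef
        func.__ast__ = tree                                # a real tool would rewrite and recompile here
        return func

    @grab_ast
    def expr(a, b):
        return 3 * a + 4 * b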

Also, FYI: http://numba.pydata.org/numba-doc/0.6/doc/modules/transforms.html

It appears that numba does get the ast similarly to pyautodiff, and only gets the ast from source code as a fallback?

On Mon, Apr 27, 2015 at 7:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
I was told that numba did similar ast parsing, but maybe that's not true. Regarding the ast, I don't know about reliability, but take a look at get_ast in pyautodiff:

https://github.com/LowinData/pyautodiff/blob/7973e26f1c233570ed4bb10d08634ec...

It looks up the __file__ attribute and passes that through compile to get the ast. Of course that won't work when you don't have source code (a .pyc only module, or when else?)

Since I'm looking into this kind of solution for the future of my code, I'm curious if you think that's too unreliable for some reason? From a usability standpoint, I do think that's better than feeding in strings, which:

* are not syntax highlighted, and
* require porting code from regular numpy expressions to numexpr strings (applying a decorator is so much easier).
Best,
Neil
On Mon, Apr 27, 2015 at 7:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Apr 27, 2015 1:44 PM, "Neil Girdhar" <mistersheik@gmail.com> wrote:
I've always wondered why numexpr accepts strings rather than looking at a function's source code, using ast to parse it, and then transforming the AST. I just looked at another project, pyautodiff, which does that. And I think numba does that for LLVM code generation. Wouldn't it be nicer to just apply a decorator to a function than to write the function as a Python string?
Numba works from byte code, not the ast. There's no way to access the ast reliably at runtime in python -- it gets thrown away during compilation.
-n

On Mon, 27 Apr 2015 19:35:51 -0400 Neil Girdhar <mistersheik@gmail.com> wrote:
Also, FYI: http://numba.pydata.org/numba-doc/0.6/doc/modules/transforms.html
It appears that numba does get the ast similarly to pyautodiff, and only gets the ast from source code as a fallback?
That documentation is terribly obsolete (the latest Numba version is 0.18.2). Modern Numba starts from the CPython bytecode; it doesn't look at the AST. We explain the architecture in some detail here:

http://numba.pydata.org/numba-doc/dev/developer/architecture.html

Regards,

Antoine.
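For comparison, the decorator workflow being discussed is essentially what Numba already exposes; a minimal sketch (Numba compiles from the function's bytecode, as Antoine describes, so no string or AST handling is needed on the user's side):

    import numpy as np
    from numba import jit

    @jit(nopython=True)        # compiled to machine code via LLVM, starting from bytecode
    def combine(a, b):
        return 3 * a + 4 * b

    a = np.arange(1e6)
    b = np.arange(1e6)
    combine(a, b)              # first call triggers compilation; later calls run the cached code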

On Mon, Apr 27, 2015 at 4:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
I was told that numba did similar ast parsing, but maybe that's not true. Regarding the ast, I don't know about reliability, but take a look at get_ast in pyautodiff:

https://github.com/LowinData/pyautodiff/blob/7973e26f1c233570ed4bb10d08634ec...

It looks up the __file__ attribute and passes that through compile to get the ast. Of course that won't work when you don't have source code (a .pyc only module, or when else?)
Since I'm looking into this kind of solution for the future of my code, I'm curious if you think that's too unreliable for some reason?
I'd certainly hesitate to rely on it for anything I cared about or would be used by a lot of people... it's just intrinsically pretty hacky. No guarantee that the source code you find via __file__ will match what was used to compile the function, doesn't work when working interactively or from the ipython notebook, etc. Or else you have to trust a decompiler, which is a pretty serious complex chunk of code just to avoid typing quote marks.
From a usability standpoint, I do think that's better than feeding in strings, which:

* are not syntax highlighted, and
* require porting code from regular numpy expressions to numexpr strings (applying a decorator is so much easier).
Yes, but then you have to write a program that knows how to port code from numpy expressions to numexpr strings :-). numexpr only knows a tiny restricted subset of Python...

The general approach I'd take to solve these kinds of problems would be similar to that used by Theano or dask -- use regular python source code that generates an expression graph in memory. E.g. this could look like

    def do_stuff(arr1, arr2):
        arr1 = deferred(arr1)
        arr2 = deferred(arr2)
        arr3 = np.sum(arr1 + (arr2 ** 2))
        return force(arr3 / np.sum(arr3))

-n

--
Nathaniel J. Smith -- http://vorpus.org
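As a rough illustration of what such a deferred/force pair could look like (everything below is a hypothetical toy, not an existing API; a real system like Theano or dask would also fuse and optimize the recorded graph before evaluating it):

    import numpy as np

    class Deferred:
        """Toy expression-graph node: records an operation instead of running it."""
        def __init__(self, compute):
            self._compute = compute   # zero-argument callable producing the value
        def __add__(self, other):
            return Deferred(lambda: force(self) + force(other))
        def __pow__(self, exponent):
            return Deferred(lambda: force(self) ** force(exponent))
        def __truediv__(self, other):
            return Deferred(lambda: force(self) / force(other))

    def deferred(value):
        """Wrap a concrete array (or scalar) as a leaf of the graph."""
        return Deferred(lambda: value)

    def deferred_sum(node):
        """Stand-in for np.sum on graph nodes (a hypothetical helper)."""
        return Deferred(lambda: np.sum(force(node)))

    def force(node):
        """Walk the recorded graph and actually compute the result."""
        return node._compute() if isinstance(node, Deferred) else node

    def do_stuff(arr1, arr2):
        arr1 = deferred(arr1)
        arr2 = deferred(arr2)
        arr3 = deferred_sum(arr1 + (arr2 ** 2))
        # Note: no caching here, so arr3's subgraph is evaluated twice.
        return force(arr3 / deferred_sum(arr3))

    print(do_stuff(np.arange(4.0), np.ones(4)))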

On Mon, Apr 27, 2015 at 7:42 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Mon, Apr 27, 2015 at 4:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
I was told that numba did similar ast parsing, but maybe that's not true. Regarding the ast, I don't know about reliability, but take a look at get_ast in pyautodiff:
https://github.com/LowinData/pyautodiff/blob/7973e26f1c233570ed4bb10d08634ec...
It looks up the __file__ attribute and passes that through compile to get the ast. Of course that won't work when you don't have source code (a .pyc only module, or when else?)
Since I'm looking into this kind of solution for the future of my code, I'm curious if you think that's too unreliable for some reason?
I'd certainly hesitate to rely on it for anything I cared about or would be used by a lot of people... it's just intrinsically pretty hacky. No guarantee that the source code you find via __file__ will match what was used to compile the function, doesn't work when working interactively or from the ipython notebook, etc. Or else you have to trust a decompiler, which is a pretty serious complex chunk of code just to avoid typing quote marks.
Those are all good points. However, it's more than just typing quote marks. The code might have non-numpy things mixed in. It might have context managers and function calls and so on. More comments below.
From a usability standpoint, I do think that's better than feeding in strings, which:

* are not syntax highlighted, and
* require porting code from regular numpy expressions to numexpr strings (applying a decorator is so much easier).
Yes, but then you have to write a program that knows how to port code from numpy expressions to numexpr strings :-). numexpr only knows a tiny restricted subset of Python...
The general approach I'd take to solve these kinds of problems would be similar to that used by Theano or dask -- use regular python source code that generates an expression graph in memory. E.g. this could look like
    def do_stuff(arr1, arr2):
        arr1 = deferred(arr1)
        arr2 = deferred(arr2)
        arr3 = np.sum(arr1 + (arr2 ** 2))
        return force(arr3 / np.sum(arr3))
-n
Right, there are three basic approaches: string processing, AST processing, and compile-time expression graphs.

The big advantage to AST processing over the other two is that you can write and test your code as regular numpy code along with regular tests. Then, with the application of a decorator, you get the speedup you're looking for. The problem with porting the numpy code to numexpr strings or Theano-like expression-graphs is that porting can introduce bugs, and even if you're careful, every time you make a change to the numpy version of the code, you have to port it again.

Also, I personally want to do more than just AST transformations of the numpy code. For example, I have some methods that call super. The super calls can be collapsed since the mro is known at compile time.

Best,

Neil
-- Nathaniel J. Smith -- http://vorpus.org

On Apr 27, 2015 5:30 PM, "Neil Girdhar" <mistersheik@gmail.com> wrote:
On Mon, Apr 27, 2015 at 7:42 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Mon, Apr 27, 2015 at 4:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

I was told that numba did similar ast parsing, but maybe that's not true. Regarding the ast, I don't know about reliability, but take a look at get_ast in pyautodiff:

https://github.com/LowinData/pyautodiff/blob/7973e26f1c233570ed4bb10d08634ec...

It looks up the __file__ attribute and passes that through compile to get the ast. Of course that won't work when you don't have source code (a .pyc only module, or when else?) Since I'm looking into this kind of solution for the future of my code, I'm curious if you think that's too unreliable for some reason?

I'd certainly hesitate to rely on it for anything I cared about or would be used by a lot of people... it's just intrinsically pretty hacky. No guarantee that the source code you find via __file__ will match what was used to compile the function, doesn't work when working interactively or from the ipython notebook, etc. Or else you have to trust a decompiler, which is a pretty serious complex chunk of code just to avoid typing quote marks.

Those are all good points. However, it's more than just typing quote marks. The code might have non-numpy things mixed in. It might have context managers and function calls and so on. More comments below.

From a usability standpoint, I do think that's better than feeding in strings, which:

* are not syntax highlighted, and
* require porting code from regular numpy expressions to numexpr strings (applying a decorator is so much easier).

Yes, but then you have to write a program that knows how to port code from numpy expressions to numexpr strings :-). numexpr only knows a tiny restricted subset of Python...

The general approach I'd take to solve these kinds of problems would be similar to that used by Theano or dask -- use regular python source code that generates an expression graph in memory. E.g. this could look like

    def do_stuff(arr1, arr2):
        arr1 = deferred(arr1)
        arr2 = deferred(arr2)
        arr3 = np.sum(arr1 + (arr2 ** 2))
        return force(arr3 / np.sum(arr3))

-n

Right, there are three basic approaches: string processing, AST processing, and compile-time expression graphs.

The big advantage to AST processing over the other two is that you can write and test your code as regular numpy code along with regular tests. Then, with the application of a decorator, you get the speedup you're looking for. The problem with porting the numpy code to numexpr strings or Theano-like expression-graphs is that porting can introduce bugs, and even if you're careful, every time you make a change to the numpy version of the code, you have to port it again.

Also, I personally want to do more than just AST transformations of the numpy code. For example, I have some methods that call super. The super calls can be collapsed since the mro is known at compile time.

If you want something that handles arbitrary python code ('with' etc.), and produces results identical to cpython (so tests are reliable), except in cases where it violates the semantics for speed (super), then yeah, you want a full replacement python implementation, and I agree that the proper input to a python implementation is .py files :-). That's getting a bit far afield from numexpr's goals though...

-n

I don't think I'm asking for so much. Somewhere inside numexpr it builds an AST of its own, which it converts into the optimized code. It would be more useful to me if that AST were in the same format as the one returned by Python's ast module. This way, I could glue in the bits of numexpr that I like with my code. For my purpose, this would have been the more ideal design.

On Mon, Apr 27, 2015 at 10:47 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Apr 27, 2015 5:30 PM, "Neil Girdhar" <mistersheik@gmail.com> wrote:
On Mon, Apr 27, 2015 at 7:42 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Mon, Apr 27, 2015 at 4:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

I was told that numba did similar ast parsing, but maybe that's not true.
Regarding the ast, I don't know about reliability, but take a look at get_ast in pyautodiff:
https://github.com/LowinData/pyautodiff/blob/7973e26f1c233570ed4bb10d08634ec...
It looks up the __file__ attribute and passes that through compile to get the ast. Of course that won't work when you don't have source code (a .pyc only module, or when else?)
Since I'm looking into this kind of solution for the future of my code, I'm curious if you think that's too unreliable for some reason?
I'd certainly hesitate to rely on it for anything I cared about or would be used by a lot of people... it's just intrinsically pretty hacky. No guarantee that the source code you find via __file__ will match what was used to compile the function, doesn't work when working interactively or from the ipython notebook, etc. Or else you have to trust a decompiler, which is a pretty serious complex chunk of code just to avoid typing quote marks.
Those are all good points. However, it's more than just typing quote marks. The code might have non-numpy things mixed in. It might have context managers and function calls and so on. More comments below.
From a usability standpoint, I do think that's better than feeding in strings, which:

* are not syntax highlighted, and
* require porting code from regular numpy expressions to numexpr strings (applying a decorator is so much easier).
Yes, but then you have to write a program that knows how to port code from numpy expressions to numexpr strings :-). numexpr only knows a tiny restricted subset of Python...
The general approach I'd take to solve these kinds of problems would be similar to that used by Theano or dask -- use regular python source code that generates an expression graph in memory. E.g. this could look like
    def do_stuff(arr1, arr2):
        arr1 = deferred(arr1)
        arr2 = deferred(arr2)
        arr3 = np.sum(arr1 + (arr2 ** 2))
        return force(arr3 / np.sum(arr3))
-n
Right, there are three basic approaches: string processing, AST processing, and compile-time expression graphs.
The big advantage to AST processing over the other two is that you can
write and test your code as regular numpy code along with regular tests. Then, with the application of a decorator, you get the speedup you're looking for. The problem with porting the numpy code to numexpr strings or Theano-like expression-graphs is that porting can introduce bugs, and even if you're careful, every time you make a change to the numpy version of the code, you have to port it again.
Also, I personally want to do more than just AST transformations of the
numpy code. For example, I have some methods that call super. The super calls can be collapsed since the mro is known at compile time.
If you want something that handles arbitrary python code ('with' etc.), and produces results identical to cpython (so tests are reliable), except in cases where it violates the semantics for speed (super), then yeah, you want a full replacement python implementation, and I agree that the proper input to a python implementation is .py files :-). That's getting a bit far afield from numexpr's goals though...
-n

2015-04-28 4:59 GMT+02:00 Neil Girdhar <mistersheik@gmail.com>:
I don't think I'm asking for so much. Somewhere inside numexpr it builds an AST of its own, which it converts into the optimized code. It would be more useful to me if that AST were in the same format as the one returned by Python's ast module. This way, I could glue in the bits of numexpr that I like with my code. For my purpose, this would have been the more ideal design.
I don't think implementing this for numexpr would be that complex. So for example, one could add a new numexpr.eval_ast(ast_expr) function. Pull requests are welcome.

At any rate, what is your use case? I am curious.

-- Francesc Alted
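A thin shim along those lines is already possible from the outside by round-tripping the AST back to source text (a sketch only -- eval_ast is not an existing numexpr function, and ast.unparse needs Python 3.9+; a native implementation inside numexpr could feed the nodes straight to its own compiler):

    import ast
    import numpy as np
    import numexpr

    def eval_ast(expr_node, local_dict=None):
        """Hypothetical eval_ast(): unparse the AST and hand it to numexpr.evaluate()."""
        return numexpr.evaluate(ast.unparse(expr_node), local_dict=local_dict)

    a = np.arange(5.0)
    b = np.ones(5)
    tree = ast.parse("3*a + 4*b", mode="eval").body   # an ast.BinOp node
    print(eval_ast(tree, local_dict={"a": a, "b": b}))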

Sorry for the late reply. I will definitely consider submitting a pull request to numexpr if that's the direction I decide to go. Right now I'm still evaluating all of the many options for my project.

I am implementing a machine learning algorithm as part of my thesis work. I'm in the "make it work" phase, but quickly approaching the "make it fast" part. With research, you usually want to iterate quickly, so whatever solution I choose has to be automated. I can't be coding things in an intuitive, natural way and then porting them to a different implementation to make them fast. What I want is for that conversion to be automated. I'm still evaluating how best to achieve that.

On Tue, Apr 28, 2015 at 6:08 AM, Francesc Alted <faltet@gmail.com> wrote:
2015-04-28 4:59 GMT+02:00 Neil Girdhar <mistersheik@gmail.com>:
I don't think I'm asking for so much. Somewhere inside numexpr it builds an AST of its own, which it converts into the optimized code. It would be more useful to me if that AST were in the same format as the one returned by Python's ast module. This way, I could glue in the bits of numexpr that I like with my code. For my purpose, this would have been the more ideal design.
I don't think implementing this for numexpr would be that complex. So for example, one could add a new numexpr.eval_ast(ast_expr) function. Pull requests are welcome.
At any rate, what is your use case? I am curious.
-- Francesc Alted

On Mon, Apr 27, 2015 at 7:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
There's no way to access the ast reliably at runtime in python -- it gets thrown away during compilation.
The "meta" package supports bytecode to ast translation. See < http://meta.readthedocs.org/en/latest/api/decompile.html>.

Wow, cool! Are there any users of this package?

On Mon, Apr 27, 2015 at 9:07 PM, Alexander Belopolsky <ndarray@mac.com> wrote:
On Mon, Apr 27, 2015 at 7:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
There's no way to access the ast reliably at runtime in python -- it gets thrown away during compilation.
The "meta" package supports bytecode to ast translation. See < http://meta.readthedocs.org/en/latest/api/decompile.html>.
participants (5):

- Alexander Belopolsky
- Antoine Pitrou
- Francesc Alted
- Nathaniel Smith
- Neil Girdhar