[Numpy-discussion] Insights / lessons learned from NumPy design

Wed Jan 9 13:04:23 EST 2013

On 01/09/2013 04:41 PM, Benjamin Root wrote:
>
>
> On Wed, Jan 9, 2013 at 9:58 AM, Nathaniel Smith <njs at pobox.com
> <mailto:njs at pobox.com>> wrote:
>
>     On Wed, Jan 9, 2013 at 2:53 PM, Alan G Isaac <alan.isaac at gmail.com
>     <mailto:alan.isaac at gmail.com>> wrote:
>      > I'm just a Python+NumPy user and not a CS type.
>      > May I ask a naive question on this thread?
>      >
>      > Given the work that has (as I understand it) gone into
>      > making NumPy usable as a C library, why is the discussion not
>      > going in a direction like the following:
>      > What changes to the NumPy code base would be required for it
>      > to provide useful ndarray functionality in a C extension
>      > to Clojure?  Is this simply incompatible with the goal that
>      > Clojure compile to JVM byte code?
>
>     IIUC that work was done on a fork of numpy which has since been
>     abandoned by its authors, so... yeah, numpy itself doesn't have much
>     to offer in this area right now. It could in principle with a bunch of
>     refactoring (ideally not on a fork, since we saw how well that went),
>     but I don't think most happy current numpy users are wishing they
>     could switch to writing Lisp on the JVM or vice-versa, so I don't
>     think it's surprising that no-one's jumped up to do this work.
>
>
> If I could just point out that the attempt to fork numpy for the .NET
> work was done back in the subversion days, and there was little-to-no
> effort to incrementally merge back changes to master, and vice-versa.
> With git as our repository now, such work may be more feasible.

This is a matter of personal software design taste I guess, so the 
following is very subjective.

I don't think there's anything at all to gain from this.  In 2013 (and 
presumably, the future), a static C or C++ library is IMO fundamentally 
incompatible with achieving optimal performance. Going through a major 
refactor simply to end up with something that's no faster and no more 
flexible than what NumPy is today seems sort of pointless to me.

What one wants is to generate ufuncs etc. on the fly using LLVM that are 
tuned to the specific tiling pattern of a specific operation, not a 
static C or C++ library (even with C++ meta-programming, the 
combinatorial explosion kills you if you do it all at compile-time).

Granted, one could probably write a C++ library that was more of a 
compiler, using LLVM to emit code. But that means starting all over, so 
it's not really relevant to the question of a NumPy refactor.

This is how I understand Continuum thinks too, with Numba as a back-end 
for Blaze. (And Travis also spoke about this in his "farewell address".)

Finally, Mark Florisson sort of started this with the 'minivect' library 
last summer, which could serve as a "ufunc" backend for both Cython and 
Numba (which for this purpose are different languages). However, as I 
understand it, the focus is now more on developing Numba directly rather 
than minivect (which is understandable as that's quicker).

Dag Sverre