[Numpy-discussion] big-bangs versus incremental improvements (was: Re: SciPy 2014 BoF NumPy Participation)

Charles R Harris charlesr.harris at gmail.com
Wed Jun 4 22:36:39 EDT 2014


On Wed, Jun 4, 2014 at 7:29 PM, Travis Oliphant <travis at continuum.io> wrote:

> Believe me, I'm all for incremental changes if they are actually possible
> and don't cost more.  It's also why I've been silent until now about
> anything we are doing being a candidate for a NumPy 2.0.  I understand the
> challenges of getting people to change.  But features and solid
> improvements *will* get people to change --- especially if the new
> library can be used alongside the old one and the transition can be done
> gradually. Python 3's struggle stems from its lack of such features.
>
> At some point there *will* be a NumPy 2.0.   What features go into NumPy
> 2.0, how much backward compatibility is provided, and how much porting is
> needed to move your code from NumPy 1.X to NumPy 2.X is the real user
> question --- not whether it is characterized as "incremental" change or
> "re-write".  What I call a re-write and what you call an
> "incremental change" are two points on a spectrum, and they likely
> overlap significantly if we really compared what we are thinking about.
>
> One huge benefit that came out of the Numeric / numarray / NumPy
> transition, and that we mustn't forget, was the extended buffer
> protocol and memoryview objects.  These really do allow multiple array
> objects to co-exist, with each library using the object it prefers, in a
> way that did not exist when numarray and Numeric first came out.  So, we
> shouldn't be afraid of that world.  Easy package managers, which let
> users update environments to try out new features and let applications
> on a single system use multiple versions of the same library, are also
> something that didn't exist before and will make any transition easier
> for users.
>
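[A minimal sketch of the zero-copy interop described above; my
illustration, not from the email. A stdlib array.array exposes its
memory through the buffer protocol, so NumPy can wrap the same bytes
without copying, and a memoryview exposes the NumPy array back out.]

```python
import array
import numpy as np

# A stdlib array.array exposes its memory via the buffer protocol,
# so NumPy can wrap those bytes without copying.
buf = array.array('d', [1.0, 2.0, 3.0])
view = np.frombuffer(buf)   # shares memory with buf, no copy

view[0] = 99.0              # writing through the NumPy view...
print(buf[0])               # ...is visible in the original: 99.0

# A memoryview exposes the NumPy array back out through the same
# protocol, for any other consumer of buffers.
mv = memoryview(view)
print(mv[1])                # 2.0
```

[This is the property that lets several array libraries share one block
of memory instead of converting at every boundary.]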
> One thing I regret about my original work on NumPy is that I didn't
> have the foresight, skill, and understanding to build a better-designed,
> more extensible multiple-dispatch system, so that multiple array objects
> could participate together in an expression flow.  The __numpy_ufunc__
> mechanism gives enough capability in that direction that things may be
> better now.
>
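[The dispatch idea can be sketched with __array_ufunc__, the protocol
NumPy ultimately shipped for this role in 1.13 (__numpy_ufunc__ was the
name still under discussion when this was written). The class here is my
own toy illustration, not anything from the email.]

```python
import numpy as np

class Wrapped:
    """Toy array wrapper that participates in ufunc dispatch.

    Hypothetical example: shows how a non-ndarray object can join an
    expression flow via the __array_ufunc__ protocol (the mechanism
    that eventually shipped in NumPy 1.13).
    """
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap any Wrapped inputs, delegate to the real ufunc,
        # and re-wrap so our type survives the expression.
        unwrapped = [x.data if isinstance(x, Wrapped) else x
                     for x in inputs]
        result = getattr(ufunc, method)(*unwrapped, **kwargs)
        return Wrapped(result)

a = Wrapped([1.0, 2.0, 3.0])
out = np.add(a, np.ones(3))   # NumPy defers to a.__array_ufunc__
print(out.data)               # [2. 3. 4.]
```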
> Ultimately, I don't disagree that NumPy can continue to exist in
> "incremental" change mode (though if you are swapping out whole swaths
> of C code for Cython code, it sounds a lot like a "re-write") as long as
> there are people willing to put the effort into changing it.  I think
> this work actually benefits from the existence of other array objects
> that push the feature envelope without the same constraints --- in much
> the same way the Python standard library benefits from capabilities
> being tried out in several competing implementations before moving into
> the standard library.
>
> I remain optimistic that things will continue to improve in multiple ways
> --- if a little "messier" than any of us would conceive individually.   It
> *is* great to see all the PR's coming from multiple people on NumPy and all
> the new energy around improving things whether great or small.
>

@nathaniel IIRC, one of the objections to the missing values work was that
it changed the underlying array object by adding a couple of variables to
the structure. I'm willing to do that sort of thing, but it would be good
to have general agreement that such changes are acceptable.

As to blaze/dynd, I'd like to steal bits here and there, and maybe in the
long term base numpy on top of it with a compatibility layer. There is a
lot of thought and effort that has gone into those projects and we should
use what we can. As is, I think numpy is good for another five to ten years
and will probably hang on for fifteen, but it will be outdated by the end
of that period. Like great whites, we need to keep swimming just to have
oxygen. Software projects tend to be obligate ram ventilators.

The Python 3 experience is definitely something we want to avoid. And while
blaze does big data and offers some nice features, I don't know that it
offers the more ordinary user compelling reasons to upgrade at this time,
so I'd like to sort of slip it into numpy if possible.

If we do start moving numpy forward in more radical steps, we should try to
have some agreement beforehand as to what sort of changes are acceptable.
For instance, to maintain backward compatibility, is it sufficient that a
recompile will do the job, or must extensions compiled against earlier
releases keep working without a recompile? Do we stay with C, or should we
support C++ code, with its advantages of smart pointers, exception
handling, and templates? We will need a certain amount of flexibility
going forward, and we should decide, or at least discuss, such issues up
front.

Chuck

