I won't be able to make it to SciPy this year, sadly.

I concur with Nathaniel that we can do a lot of things without a full rewrite -- it is all too easy to see what is gained with a rewrite and lose sight of what is lost. I have yet to see a really strong argument for a full rewrite. Rewriting a core may be easier when you have a few full-time people, but that's a different story for a community effort like numpy.

The main issue preventing new features in numpy is the lack of internal architecture at the C level, but that is nothing that could not be fixed by refactoring. Using Cython to move away from the Python C API would be great, though we need to talk with the Cython people so that multiple extensions built with Cython can share common code, to avoid binary size explosion.

There are things that may require some backward incompatible changes in the C API, but that's much more acceptable than a significant break at the python level.

David


On Wed, Jun 4, 2014 at 9:58 AM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Wed, 2014-06-04 at 02:26 +0100, Nathaniel Smith wrote:
> On Wed, Jun 4, 2014 at 12:33 AM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
> > On Tue, Jun 3, 2014 at 5:08 PM, Kyle Mandli <kyle.mandli@gmail.com> wrote:
> >>
> >> Hello everyone,
> >>
> >> As one of the co-chairs in charge of organizing the birds-of-a-feather
> >> sessions at the SciPy conference this year, I wanted to solicit through the
> >> NumPy list to see if we could get enough interest to hold a NumPy centered
> >> BoF this year.  The BoF format would be up to those who would lead the
> >> discussion, a couple of ideas used in the past include picking out a few of
> >> the lead devs to be on a panel and have a Q&A type of session or an open Q&A
> >> with perhaps audience guided list of topics.  I can help facilitate
> >> organization of something but we would really like to get something
> >> organized this year (last year NumPy was the only major project that was not
> >> really represented in the BoF sessions).
> >
> > I'll be at the conference, but I don't know who else will be there. I feel
> > that NumPy has matured to the point where most of the current work is
> > cleaning stuff up, making it run faster, and fixing bugs. A topic that I'd
> > like to see discussed is where do we go from here. One option to look at is
> > Blaze, which looks to have matured a lot in the last year. The problem with
> > making it a NumPy replacement is that NumPy has become quite widespread,
> > with downloads from PyPI running at about 3 million per year. With that much
> > penetration it may be difficult for a new core like Blaze to gain traction.
> > So I'd like to also discuss ways to bring the two branches of development
> > together at some point and explore what NumPy can do to pave the way. Mind,
> > there are definitely things that would be nice to add to NumPy, a better
> > type system, missing values, etc., but doing that is difficult given the
> > current design.
>
> I won't be at the conference unfortunately (I'm on the wrong continent
> and have family commitments then anyway), but I think there's lots of
> exciting stuff that can be done in numpy-land.
>

I would like to come, but to be honest have not planned to yet and it
doesn't fit too well with the stuff I work on mostly right now. So will
have to see.

- Sebastian

> We absolutely could rewrite the dtype system, and this would
> straightforwardly give us excellent support for missing values, units,
> categorical data, automatic differentiation, better datetimes, etc.
> etc. -- and make numpy much more friendly in general to third-party
> extensions.
>
> I'd like to see the ufunc system revisited in the light of all the
> things we know now, to make gufuncs more first-class, provide better
> support for user-defined types, more flexible loop selection (e.g.
> make it possible to implement np.add.reduce(a, type="kahan")), etc.;
> one goal would be to convert a lot of ufunc-like functions (np.mean
> etc.) into being real ufuncs, and then they'd automatically benefit
> from __numpy_ufunc__, which would also massively improve
> interoperability with alternative array types like blaze.
>
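
For readers unfamiliar with the "kahan" loop mentioned above: a minimal
pure-Python sketch of compensated (Kahan) summation next to a plain
accumulation loop. This only illustrates the kind of alternative inner loop
that more flexible loop selection could pick; the type="kahan" keyword in the
quoted text is a proposal from this thread, not an existing argument of
np.add.reduce, and the function names below are made up for illustration.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Compensated (Kahan) summation.

    Hypothetical helper, not a NumPy API: it carries an estimate of the
    rounding error of each addition into the next step.
    """
    total = 0.0
    compensation = 0.0  # estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # apply the correction to the new term
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


values = [0.1] * 1_000_000
reference = math.fsum(values)  # correctly rounded sum from the stdlib

naive_error = abs(naive_sum(values) - reference)
kahan_error = abs(kahan_sum(values) - reference)
print("naive error:", naive_error)
print("kahan error:", kahan_error)
```

The compensated loop does a few extra floating-point operations per element,
which is why it makes sense as an opt-in loop rather than the default.
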
> I'd like to see support for extensible label-based indexing, like pandas.
>
> Internally, I'd like to see us migrating out of C and into
> Cython -- we have hundreds of lines of code that could be replaced
> with a few lines of Cython and no-one would notice. (Combining this
> with a cffi cython backend and pypy would be pretty interesting
> too...)
>
> I'd like to see sparse ndarrays, with integration into the ufunc
> looping machinery so all ufuncs just work. Or even better, I'd like to
> see the right hooks added so that anyone can write a sparse ndarray
> package using only public APIs, and have all ufuncs just work. (I was
> going to put down deferred/loop-fused/out-of-core computation as a
> wishlist item too, but if we do it right then this too could be
> implemented by anyone without needing to be baked into numpy proper.)
>
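
To make the deferred/loop-fused idea above concrete, here is a toy
pure-Python sketch, under the assumption that operands are equal-length
sequences. The Lazy class and its methods are hypothetical names invented for
this illustration; nothing here is a NumPy API or the design proposed in the
thread. Arithmetic only records the expression; a single loop in evaluate()
then computes every element without materialising intermediates.

```python
import operator


class Lazy:
    """Deferred elementwise expression (hypothetical, for illustration)."""

    def __init__(self, element_at, length):
        self._element_at = element_at  # callable: index -> element value
        self._length = length

    @classmethod
    def from_sequence(cls, data):
        data = list(data)
        return cls(lambda i: data[i], len(data))

    def _binary(self, other, op):
        if isinstance(other, Lazy):
            if other._length != self._length:
                raise ValueError("operands must have the same length")
            right = other._element_at
        else:
            scalar = other  # broadcast a plain scalar to every index
            right = lambda i: scalar
        left = self._element_at
        # No computation happens here; we only compose the per-element rule.
        return Lazy(lambda i: op(left(i), right(i)), self._length)

    def __add__(self, other):
        return self._binary(other, operator.add)

    def __mul__(self, other):
        return self._binary(other, operator.mul)

    def evaluate(self):
        # One fused pass over the indices for the whole expression tree.
        return [self._element_at(i) for i in range(self._length)]


a = Lazy.from_sequence([1, 2, 3])
b = Lazy.from_sequence([10, 20, 30])
expr = (a + b) * 2       # builds the expression, computes nothing yet
result = expr.evaluate() # single loop over the three elements
print(result)
```
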
> All of these things would take some work and care, but I think they
> could all be done incrementally and without breaking backwards
> compatibility. Compare to ipython, which -- as Fernando likes to point
> out :-) -- went from a little console program to its current
> distributed-notebook-skynet-whatever-it-is by merging one working PR
> at a time. Certainly these changes would be much easier and less
> disruptive than any plan that involves throwing out numpy and starting
> over. But they also do help smooth the way for an incremental
> transition to a world where numpy is regularly used alongside other
> libraries.
>
> -n
>


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion