Hello everyone,
As one of the co-chairs in charge of organizing the birds-of-a-feather sessions at the SciPy conference this year, I wanted to solicit through the NumPy list to see if we could get enough interest to hold a NumPy-centered BoF this year. The BoF format would be up to those who would lead the discussion; a couple of ideas used in the past include picking out a few of the lead devs to be on a panel for a Q&A type of session, or an open Q&A with perhaps an audience-guided list of topics. I can help facilitate organization of something, but we would really like to get something organized this year (last year NumPy was the only major project that was not really represented in the BoF sessions).
Thanks!
Kyle Mandli (and via proxy Matt McCormick)
On Tue, Jun 3, 2014 at 5:08 PM, Kyle Mandli kyle.mandli@gmail.com wrote:
<snip>
I'll be at the conference, but I don't know who else will be there. I feel that NumPy has matured to the point where most of the current work is cleaning stuff up, making it run faster, and fixing bugs. A topic that I'd like to see discussed is where we go from here. One option to look at is Blaze, which looks to have matured a lot in the last year. The problem with making it a NumPy replacement is that NumPy has become quite widespread, with downloads from PyPI running at about 3 million per year. With that much penetration it may be difficult for a new core like Blaze to gain traction. So I'd like to also discuss ways to bring the two branches of development together at some point and explore what NumPy can do to pave the way. Mind, there are definitely things that would be nice to add to NumPy: a better type system, missing values, etc., but doing that is difficult given the current design.
Chuck
On Wed, Jun 4, 2014 at 12:33 AM, Charles R Harris charlesr.harris@gmail.com wrote:
<snip>
I won't be at the conference unfortunately (I'm on the wrong continent and have family commitments then anyway), but I think there's lots of exciting stuff that can be done in numpy-land.
We absolutely could rewrite the dtype system, and this would straightforwardly give us excellent support for missing values, units, categorical data, automatic differentiation, better datetimes, etc. etc. -- and make numpy much more friendly in general to third-party extensions.
I'd like to see the ufunc system revisited in the light of all the things we know now, to make gufuncs more first-class, provide better support for user-defined types, more flexible loop selection (e.g. make it possible to implement np.add.reduce(a, type="kahan")), etc.; one goal would be to convert a lot of ufunc-like functions (np.mean etc.) into being real ufuncs, and then they'd automatically benefit from __numpy_ufunc__, which would also massively improve interoperability with alternative array types like blaze.
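(For readers unfamiliar with the term: "kahan" above refers to Kahan, or compensated, summation. The `type="kahan"` argument is a wished-for option, not an existing NumPy parameter. What follows is only a minimal pure-Python sketch of the technique itself, with hypothetical helper names `naive_sum` and `kahan_sum`; it says nothing about how NumPy's own reduction loops are written.)

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Compensated (Kahan) summation of an iterable of floats."""
    total = 0.0
    compensation = 0.0  # running estimate of the rounding error so far
    for x in values:
        y = x - compensation            # apply the correction to the next term
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total


# One large term followed by many terms too small to register on their own.
values = [1.0] + [1e-16] * 10_000

print(naive_sum(values))   # the small terms are rounded away one by one
print(kahan_sum(values))   # the compensation term carries them along
print(math.fsum(values))   # correctly rounded reference from the stdlib
```

A selectable loop of this kind would trade some speed for accuracy, which is presumably why it is framed as an opt-in choice rather than a new default.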
I'd like to see support for extensible label-based indexing, like pandas.
Internally, I'd like to see the internals migrating out of C and into Cython -- we have hundreds of lines of code that could be replaced with a few lines of Cython and no-one would notice. (Combining this with a cffi cython backend and pypy would be pretty interesting too...)
I'd like to see sparse ndarrays, with integration into the ufunc looping machinery so all ufuncs just work. Or even better, I'd like to see the right hooks added so that anyone can write a sparse ndarray package using only public APIs, and have all ufuncs just work. (I was going to put down deferred/loop-fused/out-of-core computation as a wishlist item too, but if we do it right then this too could be implemented by anyone without needing to be baked into numpy proper.)
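(An illustrative sketch of the "right hooks" idea. The override protocol discussed in this thread as `__numpy_ufunc__` later shipped in NumPy under the name `__array_ufunc__`, and that released name is what the code below uses. `SparseVector` is a hypothetical toy class, not part of NumPy or any sparse package, and it handles only plain unary ufunc calls.)

```python
import numpy as np


class SparseVector:
    """Toy 1-D sparse container: only the non-zero entries are stored."""

    def __init__(self, size, indices, values):
        self.size = size
        self.indices = np.asarray(indices, dtype=np.intp)
        self.values = np.asarray(values, dtype=float)

    def to_dense(self):
        dense = np.zeros(self.size)
        dense[self.indices] = self.values
        return dense

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # This sketch handles only plain unary calls such as np.sqrt(v).
        if method != "__call__" or len(inputs) != 1 or kwargs:
            return NotImplemented
        # If the ufunc maps 0 to 0, the implicit zeros can stay implicit.
        if ufunc(0.0) == 0.0:
            return SparseVector(self.size, self.indices, ufunc(self.values))
        # Otherwise fall back to a dense result.
        return ufunc(self.to_dense())


v = SparseVector(5, [1, 3], [4.0, 9.0])
root = np.sqrt(v)   # stays sparse: sqrt(0) == 0
wave = np.cos(v)    # becomes a dense ndarray: cos(0) == 1
print(type(root).__name__, root.to_dense())
print(type(wave).__name__, wave)
```

The point of the design is that `np.sqrt` itself is untouched; the container opts in through a public method, which is the "anyone can write it" property asked for above.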
All of these things would take some work and care, but I think they could all be done incrementally and without breaking backwards compatibility. Compare to ipython, which -- as Fernando likes to point out :-) -- went from a little console program to its current distributed-notebook-skynet-whatever-it-is by merging one working PR at a time. Certainly these changes would be much easier and less disruptive than any plan that involves throwing out numpy and starting over. But they also do help smooth the way for an incremental transition to a world where numpy is regularly used alongside other libraries.
-n
On Wed, 2014-06-04 at 02:26 +0100, Nathaniel Smith wrote:
I won't be at the conference unfortunately (I'm on the wrong continent and have family commitments then anyway), but I think there's lots of exciting stuff that can be done in numpy-land.
I would like to come, but to be honest I have not planned to yet, and it doesn't fit too well with the stuff I work on mostly right now. So I will have to see.
- Sebastian
I won't be able to make it at scipy this year sadly.
I concur with Nathaniel that we can do a lot of things without a full rewrite -- it is all too easy to see what is gained with a rewrite and lose sight of what is lost. I have yet to see a really strong argument for a full rewrite. It may be easier to do a rewrite for a core when you have a few full-time people, but that's a different story for a community effort like numpy.
The main issue preventing new features in numpy is the lack of internal architecture at the C level, but that is nothing that could not be fixed by refactoring. Using Cython to move away from the Python C API would be great, though we need to talk with the Cython people so that we can share common code between multiple extensions using Cython, to avoid binary size explosion.
There are things that may require some backward incompatible changes in the C API, but that's much more acceptable than a significant break at the python level.
David
On Wed, Jun 4, 2014 at 9:58 AM, Sebastian Berg sebastian@sipsolutions.net wrote:
<snip>
It sounds like there is a lot to discuss come July, and I am sure there will be others "willing" to voice their opinions as well. The primary goal in all of this would be to have a constructive discussion concerning the future of NumPy. Do you guys have a feeling for what might be the most effective way to do this? A panel comes to mind, but then people for the panel would have to be chosen. In the past I know that we have simply gathered in a circle and discussed, which works as well. Whatever the case, if someone could volunteer to "lead" the discussion and also submit it via the SciPy conference website (you have to sign into the dashboard and submit a new proposal) to help us keep track of everything, I would be very appreciative.
Kyle
On Wed, Jun 4, 2014 at 5:09 AM, David Cournapeau cournape@gmail.com wrote:
<snip>
On Thu, Jun 5, 2014 at 1:32 PM, Kyle Mandli kyle.mandli@gmail.com wrote:
In the past I know that we have simply gathered in a circle and discussed which works as well. Whatever the case, if someone could volunteer to "lead" the discussion
It's my experience that a really good facilitator could make all the difference in how productive this kind of discussion is. I have no idea how to find such a facilitator (it's a pretty rare skill), but it would be nice to try, rather than taking whoever is willing to do the bureaucratic part....
and also submit it via the SciPy conference website (you have to sign into the dashboard and submit a new proposal) to help us keep track of everything I would be very appreciative.
someone could still take on the organizer role while trying to find a facilitator...
-Chris
Hi Everyone,
Fernando Perez has volunteered to act as a facilitator for a BoF on NumPy if there is still interest in having this discussion. I can make sure that it is not scheduled at the same time as some of the other planned BoFs that may interest this crowd. If someone could take the lead on organizing and submitting something to the conference website, with me helping to gather interested parties, I would be much obliged.
Kyle
On Fri, Jun 6, 2014 at 12:17 AM, Chris Barker chris.barker@noaa.gov wrote:
<snip>
On Tue, Jun 17, 2014 at 2:40 PM, Kyle Mandli kyle.mandli@gmail.com wrote:
<snip>
I can submit something to the conference website next week when I get back in town. We need to make sure it fits with the schedule of the Continuum folks.
<snip>
Chuck
Hi Kyle,
On Tue, Jun 17, 2014 at 2:40 PM, Kyle Mandli kyle.mandli@gmail.com wrote:
<snip>
I see you have reserved a slot on July 10 at 1:30 PM in room 204. That looks good to me. Along with discussing the future development of NumPy, I'd like to propose adopting something like the Blaze standard for datetime, which is similar to the current Pandas treatment, but with a different time base. Hmm... there seems to be a proliferation of time implementations; we should try to pick just one.
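(A small illustration of why the choice matters, reading "time base" as the tick unit counted from the epoch; that reading is an assumption, and the thread does not spell out the Blaze or Pandas details. The sketch uses only NumPy's existing `datetime64` type, and `span_years` is a hypothetical helper.)

```python
import numpy as np

stamp = "2014-07-10T13:30"
as_seconds = np.datetime64(stamp, "s")
as_nanos = np.datetime64(stamp, "ns")

# Same instant, but stored as a different int64 tick count since 1970-01-01.
print(as_seconds.astype("int64"))
print(as_nanos.astype("int64"))
print(as_seconds == as_nanos)


def span_years(ticks_per_second):
    """Approximate years representable on each side of the epoch in int64."""
    seconds = 2**63 / ticks_per_second
    return seconds / (365.25 * 24 * 3600)


# A finer tick buys resolution at the cost of representable range.
print(round(span_years(1e9)))  # nanosecond ticks
print(round(span_years(1e6)))  # microsecond ticks
```

Any single standard has to pick a point on that resolution-versus-range trade-off, which is one reason several implementations ended up coexisting.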
<snip>
Chuck
On Jun 27, 2014, at 8:44 PM, Charles R Harris charlesr.harris@gmail.com wrote:
<snip>
I'd like to propose adopting something like the Blaze standard for datetime,
+1 for some focused discussion of datetime. This has been lingering far too long.
-Chris
Hi folks,
I've just created a page on the numpy wiki:
https://github.com/numpy/numpy/wiki/Numpy-BoF-at-Scipy-2014
I'd appreciate it if people could put down on that page the title of topics, plus a brief summary (with links to this or other relevant threads), they'd like to see discussed.
BoFs can be very useful but they can also easily devolve into random digressions. I'll do my best to fulfill my role as discussion facilitator by pushing for an agenda for the discussion that can fit realistically in the time we have, and trying to keep us on track.
It would help a lot if people put their ideas down in advance, so we can sort/refine/prioritize a little bit. Hopefully this will lead to a more productive conversation.
Cheers
f
On Sat, Jun 28, 2014 at 11:25 AM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:
<snip>
On Sat, Jun 28, 2014 at 8:29 PM, Fernando Perez fperez.net@gmail.com wrote:
<snip>
I've been thinking more along the lines of a planning session than a panel discussion, something like the round table discussions that preceded implementation of Python 3 support and the move to GitHub. Perhaps we could arrange something like that as a follow-up to the BoF meetup.
<snip>
Chuck
On Sat, Jun 28, 2014 at 9:19 PM, Charles R Harris charlesr.harris@gmail.com wrote:
<snip>
I've added a preliminary list of topics. I'm sure there is more to be added.
Chuck
Great, thanks! And we can certainly try to either move into planning at the end, or plan for such afterwards.
Best
f
On Sat, Jun 28, 2014 at 8:40 PM, Charles R Harris charlesr.harris@gmail.com wrote:
<snip>
I am really excited to see that we have a great agenda for the BoF. I hope that the discussion will be fruitful!
Kyle
On Sat, Jun 28, 2014 at 10:44 PM, Fernando Perez fperez.net@gmail.com wrote:
Great, thanks! And we can certainly try to either move into planning at the end, or plan for such afterwards.
Best
f
-- Fernando Perez (@fperez_org; http://fperez.org) fperez.net-at-gmail: mailing lists only (I ignore this when swamped!) fernando.perez-at-berkeley: contact me here for any direct mail
I will be at the conference as will Mark Wiebe for at least part of the time. Others from the Blaze team like Andy Terrel and Matthew Rocklin will also be available at least part of the time (so it depends on when the BoF is). I'm sure they will all have opinions about this. I would be happy to be involved with a discussion around the future of NumPy as it is one of the things I've been thinking about for quite a while.
Obviously, what happens will be more a function of what people have resources to do than just what is discussed, but it is helpful to get people from multiple projects discussing what they are working on and how it relates or could relate to a possible NumPy 2.0 effort. I'm happy to participate.
My bias is that I do not believe it is going to be practically possible to simply modify NumPy itself directly. This was the original direction we considered when we started Continuum, and we spent some time and money in that direction, but it's a difficult problem that would require a lot of time, patience, and testing from multiple people. I'm not sure IPython is the right project to compare against here, as its user story is quite different. NumPy is already a hybrid that evolved from Numeric.
Of course it is likely *technically* feasible. We could replace every implementation detail with something different, but not without a likely impact on users and at more cost than simply rewriting sections. However, the challenge is more about the user base (especially the silent but large user base), the semantic expectations of that user base, and the difficulty of creating a test suite that covers the entire surface area of actual NumPy use.
Even relatively simple changes can have significant impact at this point. Nathaniel has laid out a fantastic list of great features. These are the kind of features I have been eager to see as well. This is why I have been working to fund and help explore these ideas in the Numba array object as well as in Blaze. Gnumpy, Theano, Pandas, and other projects also have useful tales to tell regarding a potential NumPy 2.0.
Ultimately, I do think it is time to talk seriously about NumPy 2.0, and what it might look like. I personally think it looks a lot more like a rewrite than a continuation of the modifications of Numeric that became NumPy 1.0. Right out of the gate, for example, I would make sure that NumPy 2.0 objects somehow used PyObject_VAR_HEAD so that they were variable-sized objects where the strides and dimension information was stored directly in the object structure itself instead of allocated separately (thus requiring additional loads and stores from memory). This would be a relatively simple change. But it can't be done while preserving ABI compatibility. It may also, at this point, have an impact on Cython code, or other code that is deeply aware of the NumPy code structure. Some of the changes that should be made will ultimately require a porting exercise for new code, at which point, why not just use a new project?
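To make the layout argument above concrete, here is a minimal sketch using ctypes. It is a simplified model only, not NumPy's actual struct definition: the names SeparateHeader and inline_header are hypothetical, and real CPython objects carry additional header fields (PyObject_VAR_HEAD, for instance, adds an ob_size count). The first type mimics the current arrangement, where the header holds pointers to separately allocated shape and stride arrays; the second stores them inline, so the object's size depends on the number of dimensions.

```python
import ctypes


class SeparateHeader(ctypes.Structure):
    """Fixed-size header: shape and strides live in separately allocated memory."""
    _fields_ = [
        ("ndim", ctypes.c_int),
        ("dimensions", ctypes.POINTER(ctypes.c_ssize_t)),
        ("strides", ctypes.POINTER(ctypes.c_ssize_t)),
    ]


def inline_header(ndim):
    """Build a header type whose size grows with ndim (hypothetical layout)."""
    class InlineHeader(ctypes.Structure):
        _fields_ = [
            ("ndim", ctypes.c_int),
            ("dimensions", ctypes.c_ssize_t * ndim),
            ("strides", ctypes.c_ssize_t * ndim),
        ]
    return InlineHeader


word = ctypes.sizeof(ctypes.c_ssize_t)
size_2d = ctypes.sizeof(inline_header(2))
size_4d = ctypes.sizeof(inline_header(4))

# Two extra dimensions add two shape entries and two stride entries.
assert size_4d - size_2d == 4 * word

# Shape and strides are read straight out of the object, with no pointer hop.
hdr = inline_header(2)()
hdr.ndim = 2
hdr.dimensions[:] = (3, 4)
hdr.strides[:] = (32, 8)
```

Because the inline variant changes both the size of the object and the meaning of its fields, any compiled extension that reads the old fields directly would break, which is the ABI problem described above.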
Dynd (which is a separate but related project from Blaze) is actually a pretty good start to a NumPy 2.0 already: https://github.com/ContinuumIO/dynd-python and https://github.com/ContinuumIO/libdynd (C++ library).
It can be provided with a backwards-compatible API without too much difficulty so that extension modules built for NumPy 1.X would still work. Numba can support Dynd and Numba's array object provides a useful, deferred-expression evaluation mechanism along with JIT compilation when desired that can support the GPU.
I would make the case that by the end of the year this combination of Dynd plus Numba (and its array object) could easily provide much of the functionality needed for a solid NumPy++. Separate from that, Blaze provides a pluggable mechanism so that array-oriented computations can be done on a large variety of backends (including distributed systems).
I agree that users of NumPy should not have to see a big API change in 2.0 --- but any modification of indexing or calculations would present slightly different semantics in certain corner cases --- which I think will be unavoidable in NumPy 2.0 regardless of how it is created. I also think NumPy 2.0 should take the opportunity to look hard at the API and what can be simplified (do we have the right collection of methods?). I'm also a big fan of introducing a common "array of structure" object that has a smaller API footprint than Pandas but has indexing and group-by functionality.
Fortunately, with the buffer protocol in Python, multiple array objects can easily co-exist in the Python ecosystem with no memory copies. I think that is where we are headed and I don't see it as a bad thing. I think agreeing on how to describe types would be very beneficial (it's an under-developed part of the buffer protocol). This is exactly why we have made datashape an independent project that other projects can use as a data-type-description mini-language: https://github.com/ContinuumIO/datashape
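As a small illustration of the zero-copy point above, the following sketch uses only the standard library: an array.array exports its memory through the buffer protocol, and a memoryview acts as a second, independent array object over the same bytes. The variable names are illustrative. The format string printed here is the struct-style type description the protocol currently offers.

```python
import array

# A stdlib array of C doubles; it exports the buffer protocol.
source = array.array("d", [1.0, 2.0, 3.0, 4.0])

# memoryview is a second object over the same memory, not a copy.
view = memoryview(source)

# The exporter describes its layout: element format, item size, shape, strides.
print(view.format, view.itemsize, view.shape, view.strides)

# Writing through the view is visible in the original array.
view[0] = 10.0
assert source[0] == 10.0

# Reinterpret the same bytes as a 2x2 array, still without copying.
grid = view.cast("B").cast("d", shape=[2, 2])
source[3] = 40.0
assert grid[1, 1] == 40.0
```

The format field is a single struct-module code per element, which hints at why a richer, shared way of describing types would be useful across array libraries.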
I think that a really good project for an enterprising young graduate student, post-doc, or professor (who is willing to delay their PhD or risk their tenure) would be to re-write the ufunc system using more modern techniques and put generalized ufuncs front and center as Nathaniel described.
It sounds like many agree that we can improve the ufunc object implementation. A new ufunc system is an entirely achievable goal and could even be shipped as an "add-on" project external to NumPy for several years before being adopted fully. I know at least 4 people with demo-ware versions of a new ufunc object that could easily replace current NumPy ufuncs eventually. If you are interested in that, I would love to share what I know with you.
After spending quite a bit of time thinking about this over the past 2 years, interacting with many in the user community outside of this list, and working with people as they explore a few options --- I do have a fair set of opinions. But, there are also a lot of possibilities and many opportunities. I'm looking forward to seeing what emerges in the coming months and years and cooperating where possible with others having overlapping interests.
Best,
-Travis
Hi Kyle
Kyle Mandli writes:
The BoF format would be up to those who would lead the discussion, a couple of ideas used in the past include picking out a few of the lead devs to be on a panel and have a Q&A type of session or an open Q&A with perhaps audience guided list of topics.
Unfortunately I won't be at the conference this year, but if I were I'd have enjoyed seeing a couple of short presentations, drawn from, e.g., some of the people involved in this discussion (Nathan can perhaps join in via Google Hangout), about possible future directions. That way one can sketch out the playing field to seed the discussion. In addition, those sketches would provide a useful update to all those watching the conference remotely via video.
Regards Stéfan