Numpy BoF at SciPy 2014 - quick report
Hi all,

sorry for not posting earlier, post-conference InboxInfinity blues and all that...

The BoF did go as planned, and it was a good discussion, mostly following the tentative agenda outlined here:

https://github.com/numpy/numpy/wiki/NumpyBoFatScipy2014

Various folks were kind enough to take notes during the conversation on an Etherpad instance:

https://scipy2014.etherpad.mozilla.org/35

For the sake of completeness and future reference, below I'm including a copy of the notes in this email.

Other than what's in the notes, my take home from the discussion is mostly that:

- we probably needed a longer slot than 45 minutes to have a chance to dig in a little deeper.
- it would have been more productive if a focused numpy sprint had also been planned, so that there could be more structured follow-up on the ideas that came up.

It would be great to hear from others who were present at the conference. In particular, Chris Barker brought up a number of things regarding datetime and planned on following up during the sprints, but I'm not sure what ended up happening.

Thanks to everyone who participated!

Cheers

f

#### Copy of Etherpad notes as of 7/16/2014:

Notes from BoF: 1:30, July 19, 2014

Working with topics on this page: https://github.com/numpy/numpy/wiki/NumpyBoFatScipy2014

chuck: where do we go from here? What is the role of numpy now?

Generalized ufuncs
- still some more to do (LA stuff, norms)
- some ufuncs don't implement array interface
- which are those?
- sprint topic?
- zeros_like, ones_like, more...

(duplicate) github issue: https://github.com/numpy/numpy/issues/4862
Here's the original issue: https://github.com/numpy/numpy/issues/3602

Implementation of @ (matrix multiplication)
- will be in Python 3.5 (~18 months)
- no work started yet
- have to make sure we do it.
- @@ was not added.
- The PEP for numpy is well-defined. Not much thinking to be done. (Good for a sprint)

Datetime:
- Can it be done?
- too many calendars
- too many time scales, etc.
- Can we cover most applications?
- DyND: higher abstraction, convert to back-end implementation
- Also look at what R and Julia do?
- Maybe fix up the little issues in datetime64 first?
- Pandas does not use numpy machinery
- uses an array of objects: those objects are subclassed from datetime.datetime
- does use int64, but gets unboxed on storage.
- Root cause is using UTC, rather than a naive time.
- Naive is not associated with a time zone. Can be interpreted in any way.
- Ripping out the locale timezone on I/O would help.
- More often than not, using the locale timezone is not desired.
- For example, many experimental data do not attach time zones. (Or wrong timezone)
- Consider laboratory time (stopwatch rather than a clock). (timedelta)
- The C++ committee is standardizing this.
- A key feature which is missing is being able to choose your epoch.

New DTypes
- Example: quad float types. A solution for missing values? Adding units support.
- Record & structured arrays play around with dtypes. Needs to be easier to use these.
- Improve documentation.
- How to extend to support things like labeled arrays?
- This is orthogonal to dtypes.
- Would rather access time column instead of 3rd column.
- Would provide a better foundation for pandas.
- Key is to keep inputs simple.
- Finish the DataArray push?
- We are very nearly there. It has been sitting there for a while.
- If interested, talk at sprints on July 10.

Missing values?
- maybe improve masked array.
- give up for now.

Inheriting ndarray
- introduces many bugs.
- should discourage this, but make it easier to work with it.

Dynd
- The issues discussed so far were motivation for starting dynd
- for example, a pluggable type system
- adding a categorical type in numpy (at Continuum) broke lots. Easier in dynd.
- Commitment for dynd is to give it a numpy-like API
- Both need to evolve together.
- Find ways to make things more uniform (in numpy)
- Dynd is more in an experimental phase, changing quickly.
- Can we import dynd as np?
- Not a goal. More exploratory in this phase.
- Adding a layer like that at a later time would be good. Not there, yet.
- Do not want to repeat py2->py3 debacle.
- Buffer protocol:
- Supported, but dynd extends it.
- As a pure C++ library, goal is to freeze once stable so systems beyond Python can depend on it as a stable interface for working with array data.

Boost::Python
- Nothing official from numpy for using numpy arrays in C++
- Not prioritized.
- Numpy has gotten better about namespace pollution?
- It kind of works already. Talk to Mike Droettboom

--
Fernando Perez (@fperez_org; http://fperez.org)
fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
fernando.perez-at-berkeley: contact me here for any direct mail
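As a small illustration of the naive-versus-UTC point in the Datetime notes above, here is a minimal sketch using only Python's standard `datetime` module. It is not numpy's datetime64, whose behavior was the subject of the discussion. The date, the -5 hour offset, and all variable names are arbitrary assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# A naive timestamp: no time zone attached, as with much experimental data.
naive = datetime(2014, 7, 9, 13, 30)

# The same wall-clock reading interpreted in two different zones
# (fixed offsets, picked only for the sake of the example).
as_utc = naive.replace(tzinfo=timezone.utc)
as_minus5 = naive.replace(tzinfo=timezone(timedelta(hours=-5)))

# The two interpretations name instants five hours apart, which is why
# silently assuming a zone on input can corrupt the data.
gap = as_minus5 - as_utc

# "Laboratory time" (stopwatch rather than clock) needs no zone at all:
# only the elapsed interval matters, and that is a timedelta.
start = datetime(2014, 7, 9, 13, 30, 0)
stop = datetime(2014, 7, 9, 13, 30, 45)
elapsed = stop - start
```

The naive value carries no claim about where or in which zone it was recorded. The ambiguity only becomes an error once some layer attaches a zone on the user's behalf.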
On Wed, Jul 16, 2014 at 8:08 PM, Fernando Perez <fperez.net@gmail.com> wrote:
> - it would have been more productive if a focused numpy sprint had also been planned, so that there could be more structured follow-up on the ideas that came up.
The trick is people to do it - there is a scarily small number of people with the skills, time, and inclination to work on the core numpy code. Exactly one of them (thanks Chuck!) was there for the sprints this year.

If there were a way to put together a stand-alone numpy sprint at some point, that would be really great!

> In particular, Chris Barker brought up a number of things regarding
> datetime and planned on following up during the sprints, but I'm not sure what ended up happening.
We did indeed follow up. No code was written, but Chuck, Mark W. and I came up with a rough proposal. A handful of other folks came by to chat about it, and seemed to think it would be useful.

In short: some minor changes to time zone handling, with a hook in place to potentially plug in fancier support in the future. Possibly a hook to plug in additional calendars.

We're working on a NEP as we speak (or, correctly speaking, I'm distracted from working on the NEP by reading the numpy list....)

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov