[Numpy-discussion] Numpy BoF at SciPy 2014 - quick report
Fernando Perez
fperez.net at gmail.com
Wed Jul 16 23:08:58 EDT 2014
Hi all,
sorry for not posting earlier, post-conference InboxInfinity blues and all
that...
The BoF did go as planned, and it was a good discussion, mostly following
the tentative agenda outlined here:
https://github.com/numpy/numpy/wiki/Numpy-BoF-at-Scipy-2014
Various folks were kind enough to take notes during the conversation on an
Etherpad instance:
https://scipy2014.etherpad.mozilla.org/35
For the sake of completeness and future reference, below I'm including a
copy of the notes in this email.
Other than what's in the notes, my take home from the discussion is mostly
that:
- we probably needed a longer slot than 45 minutes to have a chance to dig
in a little deeper.
- it would have been more productive if a focused numpy sprint had been
also planned, so that there could be more structured follow-up on the ideas
that came up.
It would be great to hear from others who were present at the conference.
In particular, Chris Barker brought up a number of things regarding
datetime and planned on following up during the sprints, but I'm not sure
what ended up happening.
Thanks to everyone who participated!
Cheers
f
#### Copy of Etherpad notes as of 7/16/2014:
Notes from BoF:
1:30, July 19, 2014
Working with topics on this page:
https://github.com/numpy/numpy/wiki/Numpy-BoF-at-Scipy-2014
chuck: where do we go from here? -- what is the role of numpy now?
Generalized ufuncs -- still some more to do -- (LA stuff - norms)
- some ufuncs don't implement array interface -- which are those -- sprint
topic?
- zeros_like, ones_like, more... (duplicate) github issue:
https://github.com/numpy/numpy/issues/4862
Here's the original issue: https://github.com/numpy/numpy/issues/3602
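For readers unfamiliar with the term, here is a minimal sketch of gufunc-style behaviour for a vector norm. It assumes a NumPy release where np.vectorize accepts a signature argument; the names _norm1d and vector_norm are made up for illustration and are not NumPy API. np.vectorize is only a convenience wrapper, not a compiled generalized ufunc.

```python
import numpy as np

def _norm1d(v):
    # Euclidean norm of a single 1-D vector (hypothetical helper).
    return np.sqrt(np.sum(v * v))

# The signature "(n)->()" says: consume one core dimension of length n,
# produce a scalar. Any leading dimensions are looped over, gufunc-style.
vector_norm = np.vectorize(_norm1d, signature="(n)->()")

batch = np.array([[3.0, 4.0], [6.0, 8.0]])
norms = vector_norm(batch)  # one norm per row, shape (2,)
```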
Implementation of @ (matrix multiplication)
- will be in Python 3.5, ~18 months away
- no work started yet -- have to make sure we do it.
- @@ was not added.
- The PEP for numpy is well-defined. Not much thinking to be done. (Good
for a sprint)
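A rough sketch of what the operator looks like in use, assuming Python 3.5 or later and a NumPy release that implements __matmul__ on ndarray (the notes above say no work had started on this yet). The array values are arbitrary.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

# PEP 465: "a @ b" dispatches to a.__matmul__(b); for 2-D arrays this is
# the ordinary matrix product, the same result as a.dot(b).
c = a @ b

# Stacks of matrices broadcast over the leading dimensions.
stack = np.ones((3, 2, 2))
d = stack @ b  # shape (3, 2, 2)
```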
Datetime:
- Can it be done? -- too many calendars -- too many time scales, etc.
- Can we cover most applications?
- DyND -- higher abstraction -- convert to back end implementation
- Also look at what R and Julia do?
- Maybe fix up the little issues in datetime64, first?
- Pandas does not use numpy machinery
- uses an array of objects: those objects are subclassed from
datetime.datetime
- does use int64, but gets unboxed on storage.
- Root cause is using UTC, rather than a naive time.
- Naive is not associated with a time zone. Can be interpreted in any
way.
- Ripping out the locale timezone on I/O would help.
- More often than not, using the locale timezone is not desired.
- For example, many experimental data do not attach time zones. (Or
wrong timezone)
- Consider laboratory time (stopwatch rather than a clock). (timedelta)
- The C++ committee is standardizing this.
- A key feature which is missing, is being able to choose your epoch.
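To make the naive-time, stopwatch and epoch points concrete, here is a small sketch. It assumes a NumPy release in which datetime64 is timezone-naive (1.11 or later, which postdates these notes). The timestamps and the custom origin are made-up sample values, and doing the epoch arithmetic by hand is only a workaround, not a built-in feature.

```python
import numpy as np

# The strings are parsed as-is: no time zone is attached or applied.
stamps = np.array(
    ["2014-07-09T13:30:00", "2014-07-09T13:30:05", "2014-07-09T13:31:00"],
    dtype="datetime64[s]",
)

# "Laboratory time": elapsed time since the first sample, as timedelta64.
elapsed = stamps - stamps[0]

# Choosing an epoch by hand: keep integer seconds relative to a custom origin.
epoch = np.datetime64("2014-01-01T00:00:00", "s")  # hypothetical origin
offsets = (stamps - epoch).astype("int64")
restored = epoch + offsets.astype("timedelta64[s]")
```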
New DTypes
- Example: quad float types. A solution for missing values? Adding units
support.
- Record & structured arrays play around with dtypes. Needs to be easier
to use these.
- Improve documentation.
- How to extend to support things like labeled arrays?
- This is orthogonal to dtypes.
- Would rather access time column instead of 3rd column.
- Would provide a better foundation for pandas.
- Key is to keep inputs simple.
- Finish the DataArray push?
- We are very close. It has been sitting there for a while.
- If interested, talk at sprints on July 10.
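The "time column instead of 3rd column" point can already be approximated with plain structured arrays. A minimal sketch, in which the field names and values are invented for illustration (this is not the DataArray API):

```python
import numpy as np

# Hypothetical record layout with three named float fields.
record = np.dtype([("x", "f8"), ("y", "f8"), ("time", "f8")])
samples = np.array(
    [(0.0, 1.0, 10.0), (0.5, 1.5, 20.0), (1.0, 2.0, 30.0)],
    dtype=record,
)

by_name = samples["time"]               # access by label
by_position = samples[record.names[2]]  # "3rd column", looked up via its name

# Labels also make selections easier to read.
late = samples[samples["time"] > 15.0]
```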
Missing values?
- maybe improve masked array.
- give up for now.
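For reference, this is roughly what the existing masked-array route looks like today. The sentinel value -999.0 is an assumption chosen for the example.

```python
import numpy as np

raw = np.array([1.0, -999.0, 3.0, -999.0, 5.0])  # -999.0 is a made-up sentinel

# Mask every entry equal to the sentinel.
m = np.ma.masked_values(raw, -999.0)

mean = m.mean()            # reductions skip masked entries
filled = m.filled(np.nan)  # back to a plain ndarray, with NaN in the gaps
```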
Inheriting ndarray
- introduces many bugs.
- should discourage this, but make it easier to work with it.
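A minimal sketch of the kind of subclass in question, to show where the fragility comes from. The class name UnitArray and its unit attribute are hypothetical. The pattern (view casting in __new__, propagation in __array_finalize__) follows the usual subclassing recipe.

```python
import numpy as np

class UnitArray(np.ndarray):
    """Hypothetical ndarray subclass carrying a 'unit' attribute."""

    def __new__(cls, data, unit=None):
        obj = np.asarray(data, dtype=float).view(cls)
        obj.unit = unit
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices as well as explicit construction,
        # so the attribute has to be copied over here.
        if obj is None:
            return
        self.unit = getattr(obj, "unit", None)

x = UnitArray([1.0, 2.0, 3.0], unit="m")
head = x[:1]           # slicing keeps the subclass and its attribute
plain = np.asarray(x)  # any code that calls asarray silently drops both
```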
Dynd
- The issues discussed so far were motivation for starting dynd
- for example, a pluggable type system
- adding a categorical type in numpy (at Continuum) broke lots. Easier in
dynd.
- Commitment for dynd is to give it a numpy-like API
- Both need to evolve together.
- Find ways to make things more uniform (in numpy)
- Dynd is more in an experimental phase, changing quickly.
- Can we import dynd as np?
- Not a goal. More exploratory in this phase.
- Adding a layer like that at a later time would be good. Not there, yet.
- Do not want to repeat py2->py3 debacle.
- Buffer protocol:
- Supported, but dynd extends it.
- As a pure C++ library, goal is to freeze once stable so systems beyond
Python can depend on it as a stable interface for working with array data.
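As a reminder of what the standard buffer protocol provides on its own, here is a small sketch using only NumPy and Python built-ins (it does not use dynd or any of its extensions). The values are arbitrary.

```python
import numpy as np

# A plain Python object that exposes a writable buffer: room for 3 doubles.
buf = bytearray(8 * 3)

# NumPy consumes the buffer without copying; writes go through to buf.
arr = np.frombuffer(buf, dtype=np.float64)
arr[:] = [1.0, 2.0, 3.0]

# NumPy arrays export the protocol too, including format and shape.
view = memoryview(arr)
info = (view.format, view.shape, view.itemsize)
```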
Boost::Python
- Nothing official from numpy for using numpy arrays in C++
- Not prioritized.
- Numpy has gotten better about namespace pollution?
- It kind of works already. Talk to Mike Droettboom
--
Fernando Perez (@fperez_org; http://fperez.org)
fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
fernando.perez-at-berkeley: contact me here for any direct mail