On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk wrote:
> Hi Nathaniel,
> Overall, hugely in favour! For detailed comments, it would be good to
> have a link to a PR; could you put that up?
Well, there's a PR here: https://github.com/numpy/numpy/pull/10706
But, this raises a question :-). (One which also came up here:
There are two sensible workflows we could use (or at least, two that I
can think of):
1. We merge updates to the NEPs as we go, so that whatever's in the
repo is the current draft. Anyone can go to the NEP webpage at
http://numpy.org/neps (WIP, see #10702) to see the latest version of
all NEPs, whether accepted, rejected, or in progress. Discussion
happens on the mailing list, and line-by-line feedback can be done by
quote-replying and commenting on individual lines. From time to time,
the NEP author takes all the accumulated feedback, updates the
document, and makes a new post to the list to let people know about
the updated version.
This is how python-dev handles PEPs.
2. We use Github itself to manage the review. The repo only contains
"accepted" NEPs; draft NEPs are represented by open PRs, and rejected
NEPs are represented by PRs that were closed-without-merging.
Discussion uses Github's commenting/review tools, and happens in the
PR itself.
This is roughly how Rust handles their RFC process, for example.
Trying to do some hybrid version of these seems like it would be
pretty painful, so we should pick one.
Given that historically we've tried to use the mailing list for
substantive features/planning discussions, and that our NEP process
has been much closer to workflow 1 than workflow 2 (e.g., there are
already a bunch of old NEPs in the repo that are effectively
rejected/withdrawn), I think we should maybe continue that way, and
keep discussions here?
So my suggestion is discussion should happen on the list, and NEP
updates should be merged promptly, or just self-merged. Sound good?
Nathaniel J. Smith -- https://vorpus.org
Would it be possible to add the fweights and aweights keyword arguments
from np.cov to np.corrcoef? They would retain their meaning from np.cov as
frequency- or importance-based weightings respectively.
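Until such arguments exist, one workaround is to normalize the weighted covariance that np.cov already computes. This is just a sketch (the helper name is hypothetical, not a NumPy API):

```python
import numpy as np

# Sketch: derive a weighted correlation matrix from np.cov's existing
# fweights/aweights support by normalizing the covariance to unit diagonal.
# The function name is hypothetical, not part of NumPy.
def weighted_corrcoef(x, fweights=None, aweights=None):
    c = np.cov(x, fweights=fweights, aweights=aweights)
    d = np.sqrt(np.diag(c))
    return c / np.outer(d, d)

x = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 2.5, 3.5, 5.0]])
fw = np.array([1, 2, 2, 1])  # frequency weights, as in np.cov
print(np.round(weighted_corrcoef(x, fweights=fw), 3))
```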
In looking to solve issue #9028 "no way to override matmul/@ if
__array_ufunc__ is set", it seems there is consensus around the idea of
making matmul a true gufunc, but matmul can behave differently for
different combinations of array and vector:
(n,k),(k,m) -> (n,m)
(n,k),(k) -> (n)
(k),(k,m) -> (m)
(k),(k) -> ()
Currently there is no way to express that in the ufunc signature. The
proposed solution to issue #9029 is to extend the meaning of a signature
so "syntax like (n?,k),(k,m?)->(n?,m?) could mean that n and m are
optional dimensions; if missing in the input, they're treated as 1, and
then dropped from the output" Additionally, there is an open pull
request #5015 "Add frozen dimensions to gufunc signatures" to allow
signatures like '(3),(3)->(3)'.
I would like to extend ufunc signature handling to implement both these
ideas, in a way that would be backward-compatible with the publicly
exposed PyUFuncObject. PyUFunc_FromFuncAndDataAndSignature is used to
allocate and initialize a PyUFuncObject, are there downstream projects
that allocate their own PyUFuncObject not via
PyUFunc_FromFuncAndDataAndSignature? If so, we could use one of the
"reserved" fields, or extend the meaning of the "identity" field to
allow version detection. Any thoughts?
Any other thoughts about extending the signature syntax?
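As a rough illustration of the optional-dimension rule (a hypothetical shape-resolution sketch in plain Python, not the actual ufunc machinery), "(n?,k),(k,m?)->(n?,m?)" could resolve shapes like this:

```python
# Hypothetical sketch of how optional dimensions in the signature
# "(n?,k),(k,m?)->(n?,m?)" would resolve output shapes: a missing
# optional dimension is treated as 1, then dropped from the output.
def matmul_output_shape(a_shape, b_shape):
    a_vec = len(a_shape) == 1   # n? missing from the first input
    b_vec = len(b_shape) == 1   # m? missing from the second input
    a = (1,) + tuple(a_shape) if a_vec else tuple(a_shape)
    b = tuple(b_shape) + (1,) if b_vec else tuple(b_shape)
    assert a[-1] == b[-2], "core dimension k must match"
    out = (a[-2], b[-1])
    if a_vec:
        out = out[1:]    # drop n from the output
    if b_vec:
        out = out[:-1]   # drop m from the output
    return out

print(matmul_output_shape((3, 4), (4, 5)))  # (3, 5)
print(matmul_output_shape((3, 4), (4,)))    # (3,)
print(matmul_output_shape((4,), (4,)))      # ()
```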
Office Hours, 25 April 2018, 12:00-13:00 PDT
Present: Matti Picus, Allan Haldane, Ralf Gommers, Matthew Brett, Tyler
Reddy, Stéfan van der Walt, Hameer Abbasi
Some of the people were not present for the entire discussion, and the
audio was a little flaky at times.
Grant background overview
Matti has been browsing through issues and pull-requests to try to get a
handle on common themes and community pain points.
- Policy questions:
- Do we close duplicate issues? (answer - Yes, referencing the other
issue, as long as they are true duplicates)
- Do we close tutorial-like issues that are documented? (answer - Yes,
maybe)
- Common theme - there are many issues about overflow, mainly about int32.
Maybe add a mode or command switch for warning on int32 overflow?
- Requested topic for discussion - improving CI and MacOS testing
- How to filter CI issues on github? There is a component:build label but
it is not CI specific
- What about MacOS testing - should it be sending notices? (answer -
- Running ASV benchmarking (https://asv.readthedocs.io/en/latest/). It is
done with SciPy, but it is fragile and not run nightly; we need the
ability to run branches more robustly. Documentation is on the SciPy site.
- Hameer: f2py during testing is the system one, not the internal one
Most of the remaining discussion was a meta-discussion about how the
community will continue to decide priorities and influence how the full-time
developers spend their time.
- Setting up a community-driven roadmap would be useful
- Be aware of the risks of having devoted developer time on a
- Influence can be subtle: ideally, community writes roadmap, instead
of simply commenting on proposal
- Can we distill past lessons to inform future decisions?
- In general, how to determine community priorities?
- Constant communication is paramount; it looks like things are going in
the right direction.
Further resources to consider:
- How did Jupyter organize their roadmap (ask Brian Granger)?
- How did Pandas run the project with a full time maintainer (Jeff Reback)?
- Can we copy other projects' management guidelines?
We did not set a time for another online discussion, since it was felt
that maybe near/during the sprint in May would be appropriate.
I apologize for any misrepresentation.
When introducing the ``axes`` argument for generalized ufuncs, the
plan was to eventually also add ``axis`` and ``keepdims`` for
reduction-like gufuncs. I have now attempted to do so in
It is not completely feature-compatible with reductions in that one
cannot (yet) pass in a tuple or None to ``axis``.
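To illustrate the intended semantics with a plain-NumPy sketch (not the gufunc machinery itself; the helper name is hypothetical): for a reduction-like signature "(i),(i)->()", ``axis=a`` would be shorthand for ``axes=[(a,), (a,), ()]``, and ``keepdims=True`` would re-insert the contracted dimension with length 1:

```python
import numpy as np

# Plain-NumPy sketch of the proposed axis/keepdims semantics for a
# reduction-like gufunc with signature "(i),(i)->()".
def inner1d_with_axis(x, y, axis=-1, keepdims=False):
    x = np.moveaxis(x, axis, -1)   # move the core dimension to the end
    y = np.moveaxis(y, axis, -1)
    out = np.einsum('...i,...i->...', x, y)  # contract the core dimension
    if keepdims:
        out = np.expand_dims(out, axis)      # re-insert it with length 1
    return out

a = np.arange(24.).reshape(2, 3, 4)
print(inner1d_with_axis(a, a, axis=1).shape)                 # (2, 4)
print(inner1d_with_axis(a, a, axis=1, keepdims=True).shape)  # (2, 1, 4)
```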
Comments most welcome.
All the best,
Here is a python code snippet:
# python vers. 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 17:00:18) [MSC v.1900 64 bit (AMD64)]
import numpy as np # numpy vers. 1.14.3
#import matplotlib.pyplot as plt
N = 21
amp = 10
t = np.linspace(0.0,N-1,N)
arg = 2.0*np.pi/(N-1)
y = amp*np.sin(arg*t)
ypad = np.pad(y, (3,2), 'mean')
print(y)
print(ypad)
When I execute this the outputs are:
[ 0.00000000e+00  3.09016994e+00  5.87785252e+00  8.09016994e+00
  9.51056516e+00  1.00000000e+01  9.51056516e+00  8.09016994e+00
  5.87785252e+00  3.09016994e+00  1.22464680e-15 -3.09016994e+00
 -5.87785252e+00 -8.09016994e+00 -9.51056516e+00 -1.00000000e+01
 -9.51056516e+00 -8.09016994e+00 -5.87785252e+00 -3.09016994e+00
 -2.44929360e-15]
[-1.37780134e-16 -1.37780134e-16 -1.37780134e-16 0.00000000e+00
3.09016994e+00 5.87785252e+00 8.09016994e+00 9.51056516e+00
1.00000000e+01 9.51056516e+00 8.09016994e+00 5.87785252e+00
3.09016994e+00 1.22464680e-15 -3.09016994e+00 -5.87785252e+00
-8.09016994e+00 -9.51056516e+00 -1.00000000e+01 -9.51056516e+00
-8.09016994e+00 -5.87785252e+00 -3.09016994e+00 -2.44929360e-15
The left pad is correct, but the right pad is different (and not the mean
of y) --- why?
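For reference, this sketch makes the expectation explicit: with mode 'mean' and no stat_length, both pads should equal the mean of the original vector.

```python
import numpy as np

# What 'mean' padding is expected to produce: both sides filled with
# the mean of the original (unpadded) vector y.
N, amp = 21, 10
t = np.linspace(0.0, N - 1, N)
y = amp * np.sin(2.0 * np.pi / (N - 1) * t)
expected = np.concatenate([np.full(3, y.mean()), y, np.full(2, y.mean())])
print(expected[:3], expected[-2:])  # left and right pads, all y.mean()
```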
I am pleased to announce the release of NumPy 1.14.3. This is a bugfix
release for a few bugs reported following the 1.14.2 release:
* np.lib.recfunctions.fromrecords accepts a list-of-lists, until 1.15
* In Python 2, float types use the new print style when printing to a file
* style arg in "legacy" print mode now works for 0d arrays
The Python versions supported in this release are 2.7 and 3.4 - 3.6. The
Python 3.6 wheels available from PyPI are built with Python 3.6.2 and should
be compatible with all previous versions of Python 3.6. The source releases
were cythonized with Cython 0.28.2.
A total of 6 people contributed to this release. People with a "+" by their
names contributed a patch for the first time.
* Allan Haldane
* Charles Harris
* Jonathan March +
* Malcolm Smith +
* Matti Picus
* Pauli Virtanen
Pull requests merged
A total of 8 pull requests were merged for this release.
* `#10862 <https://github.com/numpy/numpy/pull/10862>`__: BUG: floating
types should override tp_print (1.14 backport)
* `#10905 <https://github.com/numpy/numpy/pull/10905>`__: BUG: for 1.14
back-compat, accept list-of-lists in fromrecords
* `#10947 <https://github.com/numpy/numpy/pull/10947>`__: BUG: 'style'
arg to array2string broken in legacy mode (1.14...
* `#10959 <https://github.com/numpy/numpy/pull/10959>`__: BUG: test, fix
for missing flags['WRITEBACKIFCOPY'] key
* `#10960 <https://github.com/numpy/numpy/pull/10960>`__: BUG: Add
missing underscore to prototype in check_embedded_lapack
* `#10961 <https://github.com/numpy/numpy/pull/10961>`__: BUG: Fix
encoding regression in ma/bench.py (Issue #10868)
* `#10962 <https://github.com/numpy/numpy/pull/10962>`__: BUG: core: fix
NPY_TITLE_KEY macro on pypy
* `#10974 <https://github.com/numpy/numpy/pull/10974>`__: BUG: test, fix
Numpy has three histogram functions - histogram, histogram2d, and
histogramdd.
histogram is by far the most widely used, and in the absence of weights and
normalization, returns an np.intp count for each bin.
histogramdd (for which histogram2d is a wrapper) returns np.float64 in all
cases.
As a contrived comparison
>>> x = np.linspace(0, 1)
>>> h, e = np.histogram(x*x, bins=4); h
array([25, 10, 8, 7], dtype=int64)
>>> h, e = np.histogramdd((x*x,), bins=4); h
array([25., 10., 8., 7.])
https://github.com/numpy/numpy/issues/7845 tracks this inconsistency.
The fix is now trivial: the question is, will changing the return type
break people’s code?
Either we should:
1. Just change it, and hope no one is broken by it
2. Add a dtype argument:
- If dtype=None, behave like np.histogram
- If dtype is not specified, emit a future warning recommending to
use dtype=None or dtype=float
- In future, change the default to None
3. Create a new better-named function histogram_nd, which can also be
created without the mistake that is
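Option 2 could be sketched roughly like this (the wrapper and sentinel are hypothetical illustrations, not proposed API):

```python
import warnings
import numpy as np

_default = object()  # sentinel to detect "dtype not specified"

def histogramdd_compat(sample, bins=10, dtype=_default):
    """Hypothetical sketch of option 2's deprecation dance."""
    hist, edges = np.histogramdd(sample, bins=bins)
    if dtype is _default:
        # dtype not given: keep old float64 behaviour, but warn.
        warnings.warn("the default return dtype of histogramdd will "
                      "change; pass dtype=None or dtype=float",
                      FutureWarning)
        return hist, edges
    if dtype is None:
        # dtype=None: behave like np.histogram (integer counts).
        return hist.astype(np.intp), edges
    return hist.astype(dtype), edges

x = np.linspace(0, 1)
h, _ = histogramdd_compat((x * x,), bins=4, dtype=None)
print(h.dtype)  # integer counts, matching np.histogram
```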
I was surprised recently to discover that neither np.any() nor np.all()
has a way to exit early:
In : import numpy as np
In : data = np.arange(1e6)
In : print(data[:10])
[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
In : %timeit np.any(data)
724 us +- 42.4 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)
In : data = np.zeros(int(1e6))
In : %timeit np.any(data)
732 us +- 52.9 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)
I don't see any discussions about this on the NumPy issue tracker but
perhaps I'm missing something.
I'm curious if there's a way to get a fast early-terminating search in
NumPy? Perhaps there's another package I can depend on that does this? I
guess I could also write a bit of Cython code that does this, but so far
this project is pure Python and I don't want to deal with the packaging
headache of getting wheels built and conda-forge packages set up on all
platforms.
Thanks for your help!
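One pure-Python workaround is to scan the array in chunks, so that np.any can stop shortly after the first True is found instead of reducing the whole array (a sketch; the chunk size is an arbitrary choice):

```python
import numpy as np

# Workaround sketch: chunked scan lets np.any terminate early once a
# nonzero element is found, while staying pure Python + NumPy.
def any_chunked(arr, chunk=4096):
    arr = np.asarray(arr).ravel()
    for start in range(0, arr.size, chunk):
        if arr[start:start + chunk].any():
            return True   # stop as soon as one chunk contains a True
    return False

data = np.zeros(int(1e6))
data[10] = 1.0
print(any_chunked(data))  # True, after scanning only the first chunk
```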