I've just added some exercises to the collection at https://github.com/rougier/numpy-100
(and in the process, I've discovered np.argpartition... nice!)
If you have any ideas/comments/corrections, please share them. Still 20 to go...
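For anyone who hasn't met np.argpartition: it gives you the indices of the k smallest (or largest) elements without doing a full sort. A quick sketch (the array values here are made up for illustration):

```python
import numpy as np

a = np.array([9, 4, 7, 1, 8, 2, 6])
k = 3

# argpartition places the k-th smallest element at position k;
# everything before that position is <= it, in arbitrary order
idx = np.argpartition(a, k)[:k]

print(np.sort(a[idx]))  # [1 2 4]
```

The partition is O(n) rather than the O(n log n) of a full argsort, which is the whole attraction.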
I propose that we upload Windows wheels to pypi. The wheels are
likely to be stable and relatively easy to maintain, but will have
slower performance than other versions of numpy linked against faster
BLAS / LAPACK libraries.
There's a long discussion going on in github issue #5479, where
the old problem of Windows wheels for numpy came up.
For those of you not following this issue, the current situation for
community-built numpy Windows binaries is dire:
* We have so far not provided Windows wheels on pypi, so `pip install
numpy` on Windows will bring you a world of pain;
* Until recently we did provide .exe "superpack" installers on
sourceforge, but these became increasingly difficult to build and we
gave up building them as of the latest (1.10.4) release.
Despite this, the popularity of Windows wheels on pypi is high. A few
weeks ago, Donald Stufft ran a query for the binary wheels most often
downloaded from pypi, for any platform. The top five most
downloaded were (n_downloads, name):
So a) the OSX numpy wheel is very popular and b) despite the fact that
we don't provide a numpy wheel for Windows, matplotlib, scikit_learn
and pandas, which depend on numpy, are the 3rd, 4th and 5th most
downloaded wheels as of a few weeks ago.
So, there seems to be a large appetite for numpy wheels.
I have now built numpy wheels, using the ATLAS blas / lapack library -
the build is automatic and reproducible.
I chose ATLAS to build against, rather than, say, OpenBLAS, because
we've had some significant worries in the past about the reliability
of OpenBLAS, and I thought it better to err on the side of caution.
However, these builds are relatively slow for matrix multiplication and
other linear algebra routines compared to numpy built against OpenBLAS or
MKL (which we cannot use because of its license). In my very
crude array test of a dot product and matrix inversion, the ATLAS
wheels were 2-3 times slower than MKL. Other benchmarks on Julia
found about the same result for ATLAS vs OpenBLAS on 32-bit, but a
much bigger difference on 64-bit (for an earlier version of ATLAS than
we are currently using).
So, our numpy wheels are likely to be stable and give correct results,
but will be somewhat slow for linear algebra.
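For reference, the crude test mentioned above was along these lines (the array size and iteration count here are my own guesses, and the absolute timings will of course vary with the BLAS numpy is linked against):

```python
import time
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so runs are comparable
a = rng.rand(500, 500)

# average time for a matrix product
t0 = time.time()
for _ in range(10):
    a.dot(a)
dot_time = (time.time() - t0) / 10

# average time for a matrix inversion
t0 = time.time()
for _ in range(10):
    np.linalg.inv(a)
inv_time = (time.time() - t0) / 10

print("dot: %.4f s, inv: %.4f s" % (dot_time, inv_time))
```

Running the same script against an ATLAS build and an MKL build is enough to see the 2-3x gap mentioned above.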
I propose that we upload these ATLAS wheels to pypi. The upside is
that this gives our Windows users a much better experience with pip,
and allows other developers to build Windows wheels that depend on
numpy. The downside is that these will not be optimized for
performance on modern processors. In order to signal that, I propose
adding the following text to the numpy pypi front page:
All numpy wheels distributed from pypi are BSD licensed.
Windows wheels are linked against the ATLAS BLAS / LAPACK library,
restricted to SSE2 instructions, so may not give optimal linear
algebra performance for your machine. See
http://docs.scipy.org/doc/numpy/user/install.html for alternatives.
In a way this is very similar to our previous situation, in that the
superpack installers also used ATLAS - in fact an older version of ATLAS.
Once we are up and running with numpy wheels, we can consider whether
we should switch to other BLAS libraries, such as OpenBLAS or BLIS.
I'm posting here hoping for your feedback...
Anyone interested in Google Summer of Code this year?
I think the real challenge is having folks with the time to really put into
mentoring, but if folks want to do it -- numpy could really benefit.
Maybe as a python.org sub-project?
Deadlines are approaching -- so I thought I'd ping the list and see if
folks are interested.
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
There is currently some discussion
<https://github.com/numpy/numpy/pull/7373> on whether or not object arrays
should have an identity for bitwise reductions. Currently, they do not use
the identity for non-empty arrays, so this would only affect reductions on
empty arrays. Currently bitwise_or, bitwise_xor, and bitwise_and will
return (bool_) 0, (bool_) 0, and (int) -1 respectively in that case. Note
that non-object arrays work as they should; the question is only about
object arrays.
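For concreteness, here is what the identities look like for a non-object dtype, where reducing an empty array already returns the ufunc identity:

```python
import numpy as np

empty = np.array([], dtype=np.int64)

# the declared identities of the three bitwise ufuncs
print(np.bitwise_or.identity)   # 0
print(np.bitwise_xor.identity)  # 0
print(np.bitwise_and.identity)  # -1 (all bits set)

# reducing an empty (non-object) array returns the identity
print(np.bitwise_or.reduce(empty))   # 0
print(np.bitwise_and.reduce(empty))  # -1
```

The question in the PR is whether empty object-dtype arrays should behave the same way.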
I'm not sure if I should send this here or to scipy-user, feel free to
redirect me there if I'm off topic.
So, there is something I don't understand using inv and lstsq in numpy.
I've built *on purpose* an ill conditioned system to fit a quadric
a*x**2+b*y**2+c*x*y+d*x+e*y+f, the data points are taken on a narrow
stripe four times longer than wide. My goal is obviously to find
(a,b,c,d,e,f) so I built the following matrix:
A = np.empty((len(data), 6))
A[:,0] = data[:,0]**2
A[:,1] = data[:,1]**2
A[:,2] = data[:,1]*data[:,0]
A[:,3] = data[:,0]
A[:,4] = data[:,1]
A[:,5] = 1
The condition number of A is around 2*1e5 but I can make it much bigger
if needed by scaling the data along an axis.
I then tried to find the best estimate of X in order to minimize the
norm of A*X - B with B being my data points and X the vector
(a,b,c,d,e,f). That's a very basic usage of least squares and it works
fine with lstsq despite the bad condition number.
However, I was expecting to fail to solve it properly using
inv(A.T.dot(A)).dot(A.T).dot(B), but actually, as I scaled up the
condition number, lstsq began to give obviously wrong results (that's
expected) whereas using inv consistently gave "visually good" results. I
have no residuals to show, but lstsq was just plain wrong (again, that is
expected when cond(A) rises) while inv "worked". I was expecting to see
inv fail well before lstsq.
Interestingly, the same dataset fails in Matlab using inv, without any
scaling of the condition number, while it works using \ (mldivide, i.e.
least squares). On Octave it works fine using both methods with the
original dataset; I did not try to scale up the condition number.
So my question is very simple: what's going on here? It looks like
Matlab, Numpy and Octave all use the same lapack functions for inv and
lstsq. As they don't use the same version of lapack, I can understand
that they do not exhibit the same behavior, but how can it be possible to
have lstsq failing before inv(A.T.dot(A)) when I scale up the condition
number of A? I feel like I'm missing something obvious but I can't find it.
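One relevant point: forming A.T.dot(A) squares the condition number (cond(A.T A) = cond(A)**2), which is exactly why the normal-equations route is usually expected to break down first. A self-contained sketch of the setup described above (the data points and coefficients here are invented, not the actual dataset):

```python
import numpy as np

rng = np.random.RandomState(0)

# points on a narrow stripe, four times longer than wide
x = rng.uniform(0.0, 4.0, 200)
y = rng.uniform(0.0, 1.0, 200)

# quadric a*x**2 + b*y**2 + c*x*y + d*x + e*y + f, made-up coefficients
coef = np.array([1.0, -2.0, 0.5, 3.0, -1.0, 2.0])

A = np.column_stack([x**2, y**2, x*y, x, y, np.ones_like(x)])
B = A.dot(coef)

print("cond(A)     = %.3g" % np.linalg.cond(A))
print("cond(A.T A) = %.3g" % np.linalg.cond(A.T.dot(A)))  # roughly cond(A)**2

# least squares (SVD based)
X_lstsq = np.linalg.lstsq(A, B, rcond=None)[0]

# normal equations via inv
X_inv = np.linalg.inv(A.T.dot(A)).dot(A.T).dot(B)

print("lstsq max error: %.2e" % np.abs(X_lstsq - coef).max())
print("inv max error:   %.2e" % np.abs(X_inv - coef).max())
```

Scaling one axis of the data inflates cond(A) and makes the gap between the two routes visible; comparing the printed errors as the scaling grows is a direct way to check which method degrades first on a given lapack.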