We've released pandas 0.10.1, which includes many bug fixes from
0.10.0 (including a number of issues with the new file parser,
e.g. reading multiple files in separate threads), various
performance improvements, and major new PyTables/HDF5-based
functionality contributed by Jeff Reback. I strongly recommend
that all users upgrade.
Thanks to all who contributed to this release, especially Chang
She, Jeff Reback, and Yoval P.
As always, source archives and Windows installers are on PyPI.
What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html
$ git log v0.10.0..v0.10.1 --pretty=format:%aN | sort | uniq -c | sort -rn
59 Wes McKinney
43 Chang She
5 Vincent Arel-Bundock
4 Damien Garaud
3 Christopher Whelan
3 Andy Hayden
2 Jay Parlar
2 Dan Allan
1 Thouis (Ray) Jones
1 Garrett Drapala
1 Dieter Vandenbussche
1 Anton I. Sipos
Happy data hacking!
What is it
pandas is a Python package providing fast, flexible, and
expressive data structures designed to make working with
relational, time series, or any other kind of labeled data both
easy and intuitive. It aims to be the fundamental high-level
building block for doing practical, real-world data analysis in Python.
Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst
Code Repository: http://github.com/pydata/pandas
Mailing List: http://groups.google.com/group/pydata
Will the numpy 1.7.0 'final' be binary compatible with the release
candidate(s)? i.e. Would it be safe for me to release a Windows
installer for a package using the NumPy C API compiled against
the NumPy 1.7.0rc?
I'm specifically interested in Python 3.3, and NumPy 1.7 will be
the first release to support that. For older versions of Python I
can use NumPy 1.6 instead.
This post is to bring the discussion of PR
#2965 <https://github.com/numpy/numpy/pull/2965> to the attention of the
list. There are at least three issues in play here.
1) The PR adds modes 'big' and 'thin' to the current modes 'full', 'r',
'economic' for qr factorization. The problem is that the current 'full' is
actually 'thin' and 'big' should be 'full'. The solution here was to raise
a FutureWarning on use of 'full', alias it to 'thin' for the time being,
and at some distant time change 'full' to alias 'big'.
2) The 'economic' mode serves little purpose. I propose to deprecate it and
add a 'qrf' mode instead, corresponding to scipy's 'raw' mode. We can't use
'raw' itself as traditionally the mode may be specified using the first
letter only and that leads to a conflict with 'r'.
3) As suggested in 2, the use of single letter abbreviations can constrain
the options in choosing mode names and they are not as informative as the
full name. A possibility here is to deprecate the use of the abbreviations
in favor of the full names.
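To make the distinction in 1) concrete, here is a minimal sketch of the two shapes of factorization. It assumes a NumPy recent enough to spell the modes 'reduced' (the 'thin' factorization) and 'complete' (the 'big' one); those are not the names proposed in the PR, and the example matrix is arbitrary.

```python
import numpy as np

a = np.arange(15, dtype=float).reshape(5, 3)  # arbitrary 5x3 example matrix

# "Thin" factorization: Q is M x K and R is K x N, with K = min(M, N).
# Assumption: this NumPy spells the mode 'reduced'.
q_thin, r_thin = np.linalg.qr(a, mode="reduced")

# "Big" factorization: Q is M x M and R is M x N.
# Assumption: this NumPy spells the mode 'complete'.
q_big, r_big = np.linalg.qr(a, mode="complete")

print(q_thin.shape, r_thin.shape)
print(q_big.shape, r_big.shape)
```

Both pairs reproduce `a` when multiplied; they differ only in how many columns of Q (and rows of R) are kept.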
A longer term problem is the divergence between the numpy and scipy
versions of qr. The divergence is enough that I don't see any easy way to
come to a common interface, but that is something that would be desirable
I'm convinced that I saw a while ago a function that uses a list of
interval boundaries to index into an array, either to iterate or to
I thought that's very useful, but didn't make a note.
Now, I have no idea where I saw this (I thought numpy), and I cannot
find it anywhere.
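Two NumPy functions that take a list of boundaries are np.split and np.add.reduceat. Whether either is the function remembered above is only a guess; the sketch below just shows what they do with made-up boundaries.

```python
import numpy as np

data = np.arange(10)
boundaries = [3, 7]  # hypothetical interval boundaries

# np.split cuts the array at the given indices, so the pieces can be
# iterated over one interval at a time.
pieces = np.split(data, boundaries)
for piece in pieces:
    print(piece)

# np.add.reduceat sums each interval; it takes the start index of
# every interval, so the leading 0 has to be included.
sums = np.add.reduceat(data, [0] + boundaries)
print(sums)
```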
Here are the last open issues for 1.7, there are 9 of them:
Of these, 3 are very simple PRs that I just posted.
Let's polish these and get them in.
I propose to release rc2 after that and if all is ok, do the final
release. Some of the issues are not fully addressed, but I don't think
we should be holding the release any longer.
Let me know if that is ok with you.
My apologies for a delay on my side --- my son was just born 2 weeks
ago and I had to submit an important article, with the deadline
yesterday. Things are settled now and I now have time to get this done.
I'm currently trying to build numpy 1.6.2 for Python 3.3 from ports on FreeBSD. Unfortunately, the setup.py execution fails: a gcc command that needs _numpyconfig.h fails because _numpyconfig.h is not generated from _numpyconfig.h.in. How do I manually build the proper header from the .h.in, and why does that not happen automatically?
# gcc46 -DNDEBUG -O2 -pipe -fno-strict-aliasing -O2 -pipe -Wl,-rpath=/usr/local/lib/gcc46 -fno-strict-aliasing -fPIC -Inumpy/core/include -Ibuild/src.freebsd-9.0-RELEASE-amd64-3.3/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/usr/local/include/python3.3m -Ibuild/src.freebsd-9.0-RELEASE-amd64-3.3/numpy/core/src/multiarray -Ibuild/src.freebsd-9.0-RELEASE-amd64-3.3/numpy/core/src/umath -c numpy/core/src/multiarray/multiarraymodule_onefile.c -o build/temp.freebsd-9.0-RELEASE-amd64-3.3/numpy/core/src/multiarray/multiarraymodule_onefile.o
Fatal error: can't create build/temp.freebsd-9.0-RELEASE-amd64-3.3/numpy/core/src/multiarray/multiarraymodule_onefile.o: No such file or directory
In file included from numpy/core/include/numpy/ndarraytypes.h:5:0,
numpy/core/include/numpy/numpyconfig.h:4:26: fatal error: _numpyconfig.h: No such file or directory
I noticed that on
there's a "see also" to a function numpy.savez_compressed, which doesn't
seem to exist (neither on my system nor in the online documentation).
What would be the easiest way to find out where to fix this? For someone
without deeper knowledge of how numpy sources are organized it's hard to
find the place where to fix things. How about adding the "source" link
to the docstrings via sphinx, like in scipy?
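A quick way to check whether an installed NumPy provides the function named in the "see also" is sketched below. It assumes only that np.savez_compressed, where present, accepts a file-like object and keyword arrays in the same way as np.savez; the array and the key name are made up.

```python
import io

import numpy as np

# Does this NumPy build expose the function at all?
has_compressed = hasattr(np, "savez_compressed")
print("savez_compressed available:", has_compressed)

arr = np.arange(6).reshape(2, 3)  # made-up example data
restored = None

if has_compressed:
    # Round-trip through an in-memory buffer instead of a file on disk.
    buf = io.BytesIO()
    np.savez_compressed(buf, a=arr)
    buf.seek(0)
    restored = np.load(buf)["a"]
    print(restored)
```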
Is there a way to speed up Array.take( floatindices.astype(int) )?
astype(int) makes a copy, floor() returns floats.
(Is there a wiki of NumPy one-liners / various tricks?
It would sure beat googling.)
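For reference, the pattern in question looks like the sketch below, with made-up data. It is not a faster alternative, only the baseline plus one detail worth knowing: astype truncates toward zero, while floor rounds down, so the two disagree for negative values. np.intp is used here on the assumption that the platform index type is what take wants.

```python
import numpy as np

values = np.array([10, 20, 30, 40])        # made-up data
float_idx = np.array([0.9, 2.1, 3.0])      # made-up float indices

# Baseline: convert once to the platform index type, then take.
int_idx = float_idx.astype(np.intp)
picked = values.take(int_idx)
print(picked)

# astype truncates toward zero; floor rounds toward minus infinity.
negative = np.array([-0.5])
truncated = negative.astype(np.intp)
floored = np.floor(negative).astype(np.intp)
print(truncated, floored)
```

If the same float indices are reused many times, converting them once and keeping the integer array avoids paying for the copy on every call.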
I am trying to create a subclass of ndarray that has additional
attributes. These attributes are maintained with most numpy functions if
__array_finalize__ is used.
The main exception I have found is concatenate (and hstack/vstack, which
just wrap concatenate). In this case, __array_finalize__ is passed an
array that has already been stripped of the additional attributes, and I
don't see a way to recover this information.
In my particular case at least, there are clear ways to handle corner cases
(like being passed a class that lacks these attributes), so in principle
there is no problem handling concatenate in a general way, assuming I can get
access to the attributes.
So is there any way to subclass ndarray in such a way that concatenate can
be handled properly?
I have been looking extensively online, but have not been able to find a
clear answer on how to do this, or if there even is a way.
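One way to sidestep the problem, rather than solve it inside __array_finalize__, is a small wrapper that collects the attributes from the inputs before concatenating and reattaches them afterwards. The sketch below is only an illustration: the class InfoArray, its info attribute, and the helper concatenate_with_info are all hypothetical names, and collecting the attributes into a list is just one possible merge rule.

```python
import numpy as np


class InfoArray(np.ndarray):
    """Hypothetical ndarray subclass carrying one extra attribute, `info`."""

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # Copy `info` from the array this one was derived from, if any.
        self.info = getattr(obj, "info", None)


def concatenate_with_info(arrays, axis=0):
    """Hypothetical helper: concatenate, then rebuild `info` from the inputs."""
    # Inputs without the attribute contribute None (one corner-case rule).
    infos = [getattr(a, "info", None) for a in arrays]
    plain = np.concatenate([np.asarray(a) for a in arrays], axis=axis)
    out = plain.view(InfoArray)
    out.info = infos  # merge rule assumed here: keep every input's info
    return out


a = InfoArray([1, 2], info="a")
b = InfoArray([3], info="b")
c = concatenate_with_info([a, b])
print(type(c).__name__, c.info, c.tolist())
```

The cost of this approach is that callers must use the wrapper instead of np.concatenate, hstack, or vstack directly.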