[Pandas-dev] Pandas Sprint Recap

Irv Lustig irv at princeton.com
Wed Jul 18 15:05:42 EDT 2018


Stephan Hoyer wrote:

>
> On Tue, Jul 17, 2018 at 3:47 PM Matthew Rocklin <mrocklin at gmail.com>
> wrote:
>
> > Has Pandas ever done a user survey?
> >
> > I would be curious to know the answer to the question "do you make heavy
> > use of the Pandas index" among users, and how that correlates with
> > different domain/industry.
> >
>
> This is a great question. I don't think we've ever done this sort of
> research.
>
> My suspicion is that most of the time users ignore the index, and find the
> way it is used heavily in pandas more annoying than helpful. But certainly
> there are some use-cases for which automatic alignment with an index is
> fantastic.
>

For our team, we heavily use the MultiIndex capability for rows (but not
columns). Our main use of pandas is to read in data from disparate data
sources, and do data wrangling to reshape the data.  We do lots of
joins/merges of different DataFrames, and placing the keys in a MultiIndex
makes it easier to track the join operations.
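To make the join workflow concrete, here is a minimal sketch of the pattern. The data, the frame names (sales, prices), and the key names (region, product) are hypothetical and chosen only for illustration; our real sources are not shown here.

```python
import pandas as pd

# Hypothetical source 1: units sold, keyed by (region, product).
sales = pd.DataFrame(
    {
        "region": ["east", "east", "west"],
        "product": ["a", "b", "a"],
        "units": [10, 20, 30],
    }
).set_index(["region", "product"])

# Hypothetical source 2: prices, keyed by the same two columns.
prices = pd.DataFrame(
    {
        "region": ["east", "east", "west"],
        "product": ["a", "b", "a"],
        "price": [1.5, 2.0, 1.75],
    }
).set_index(["region", "product"])

# With the keys in the index, join() aligns on them without an `on=` argument,
# and the result still carries the keys as a named MultiIndex.
joined = sales.join(prices)
print(joined.index.names)
print(joined)
```

Because the keys stay in the index after the join, the result can be joined against a third frame keyed the same way without restating the key columns.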

From our perspective, the MultiIndex on rows is akin to the primary keys of
a data table. As we explore data, being able to slice the data along
various dimensions is quite valuable.  It is also quite natural that a
groupby() operation returns a Series or DataFrame with a MultiIndex.
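As a small illustration of the groupby() and slicing point, the sketch below uses made-up data and column names (year, region, units); only the pandas calls themselves are real.

```python
import pandas as pd

# Hypothetical data with two natural key columns.
df = pd.DataFrame(
    {
        "year": [2017, 2017, 2018, 2018],
        "region": ["east", "west", "east", "west"],
        "units": [1, 2, 3, 4],
    }
)

# Grouping by two keys returns a Series whose index is a MultiIndex
# with levels named after the grouping columns.
totals = df.groupby(["year", "region"])["units"].sum()
print(type(totals.index).__name__)
print(totals.index.names)

# Slice along the inner level by name: all years for one region.
east = totals.xs("east", level="region")

# Slice along the outer level: all regions for one year.
y2018 = totals.loc[2018]

print(east)
print(y2018)
```

Either slice drops the level that was selected on, so what remains is indexed by the other key.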

What I find a bit frustrating is the lack of symmetry in the API between
renaming the levels of a MultiIndex and renaming columns.  It's why
I created this pull request (https://github.com/pandas-dev/pandas/pull/20046
[ENH: Allow rename_axis to specify index and columns arguments]) and opened
this issue: https://github.com/pandas-dev/pandas/issues/20421 [API: Allow
MultiIndex.rename() to accept a dict as an argument].
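A rough sketch of the symmetry in question follows. It assumes a pandas release in which rename_axis accepts an index keyword with a dict-like mapper, which is the behavior the pull request above proposes; the frame and the names (region, product, sku, units, quantity) are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a two-level row MultiIndex.
df = pd.DataFrame(
    {"units": [10, 20]},
    index=pd.MultiIndex.from_tuples(
        [("east", "a"), ("west", "b")], names=["region", "product"]
    ),
)

# Renaming a column takes a dict from old name to new name.
renamed_cols = df.rename(columns={"units": "quantity"})

# Assumption: this pandas version supports rename_axis(index=...), so an
# index level can be renamed with the same old-to-new dict style.
renamed_levels = df.rename_axis(index={"product": "sku"})

print(list(renamed_cols.columns))
print(list(renamed_levels.index.names))
```

With the dict form, only the level being renamed has to be mentioned, which mirrors how columns are renamed.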

Dr-Irv