[IPython-dev] Integrating pandas into pylab

Wed Oct 26 13:03:01 EDT 2011

On Wed, Oct 26, 2011 at 6:25 AM, John Hunter <jdh2358 at gmail.com> wrote:
> I'm in agreement with most of the ideas here.  As the author of pylab,
> I've caught a lot of flak for dumping everything into a single
> namespace as it is unpythonic, and in classes Fernando and I have been
> pretty careful of late to use namespaces, and mpl facilitates this by
> factoring the plotting part of pylab into pyplot.  It's a little more
> confusing for students at first, but ultimately it helps the students
> to know where things come from when they move from the interactive
> environment to scripting.
>
> That said, the major problem is that the current organization of the
> major packages is not logical or intuitive.  numpy has arrays,
> algorithms and IO, scipy has algorithms and IO, matplotlib has
> plotting and algorithms and IO, pandas has datastructures, IO,
> algorithms and plotting (albeit all organized around the dataframe).
> And so on.  I think there is room for a namespace package that
> integrates across these and makes it more intuitive.  The proper top
> level namespaces are something like: array (or datastructures more
> generally), algo, plot, io.  In this model, you would pull the
> relevant components from numpy, scipy, mpl, pandas, scikits, ETS, etc
> into the relevant namespaces.

I would not want to make the new namespace nested - it should be flat
like the current pylab is.  I think the main point here is that we are
competing with Matlab, Mathematica, etc. which all have completely
flat namespaces.  Not that we want to copy everything these packages
do, but for new users, non-technical folks, and undergraduates the entire
idea of namespaces is confusing.  The problem (for these users) is
namespaces themselves, not just that the existing ones are confusing.
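
To make the nested-versus-flat distinction concrete, here is a minimal
sketch.  It uses standard-library functions as stand-ins for the real
packages, and the group names (algo, io) and the flatten() helper are
hypothetical, chosen only for illustration:

```python
import math
import statistics
from types import SimpleNamespace

# Nested layout: each top-level group says where a function lives.
# (Stand-ins only; a real package would pull from numpy, scipy, etc.)
nested = SimpleNamespace(
    algo=SimpleNamespace(sqrt=math.sqrt, mean=statistics.mean),
    io=SimpleNamespace(show=print),
)


def flatten(ns):
    """Collect every leaf of a two-level namespace into one flat dict."""
    flat = {}
    for group in vars(ns).values():
        for name, obj in vars(group).items():
            if name in flat:
                # A flat namespace has room for only one object per name.
                raise ValueError(f"name collision: {name}")
            flat[name] = obj
    return flat


# Flat layout: the same objects, reachable without any group prefix.
flat = flatten(nested)
```

With the nested layout a user writes nested.algo.sqrt(16); with the flat
one, flat["sqrt"](16), which is what a star import gives at the prompt.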

>  Making everything work together would
> be exceedingly tricky...  Note this is pretty much what scipy used to
> be: at some point they jettisoned plotting and focused on the algo
> part.  It's also a hard and big project and may not attract much
> usage.

What are the main areas that would make it tricky?  I don't have
enough experience with numpy/scipy/matplotlib to know this.

>  If someone pursues this, I would not call it pylab, as this
> will just foster confusion.

I understand why you would not want to separate pylab from matplotlib,
but I don't see that creating yet another namespace would help the
problem.  Confusion would only increase as people would continually
ask "what's the difference between pylab and *?"

> Forgetting about the big problem, and focusing on Thomas's original and
> much more limited question of getting pandas into pylab, there are
> three easy solutions:
>
> * matplotlib.pylab can conditionally try/import * from pandas.  This
> is the path of least resistance, and several of our developers may
> object because they want less, not more, namespace dumping.
> Nonetheless, it can probably be done.
>
> * ipython can have its own configurable "pylab" import which does an
> import * from matplotlib.pylab and anything else you or the users want
> by default.  The downside of a configurable import is that it makes it
> harder to share histories, etc.
>
> * pandas could be incorporated into numpy.  This is my favorite
> solution since it would get pandas onto as many desktops as soon as
> possible and we could all write code that relies on it being there.
> Obviously pandas is on a much faster release schedule than numpy right
> now, and should live on its own, but in six months time or so when Wes
> is ready to take a breather, it would be great to see pandas
> incorporated.  Then matplotlib.pylab would get it by default.
>
>
>>> * It should be removed from matplotlib.
>
> This is highly unlikely.  We are loath to break backwards
> compatibility, and this would be *major* breakage.  It's easier and
> less confusing to simply use a different name.

This makes sense.  Given this, I propose that the existing pylab
module in matplotlib be developed into a more general namespace for
this type of thing.  In that process care can be taken to not break
backwards compat.  I think this is logical as pylab already does
import * from a good number of things.
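
As a rough sketch of the conditional "try/import *" idea quoted above,
something like the following could pull an optional package into a
shared namespace without failing when it is absent.  The helper name
optional_star_import is hypothetical, not an existing matplotlib or
IPython function, and the demo uses the standard-library math module
as a stand-in for pandas so that it runs anywhere:

```python
import importlib


def optional_star_import(module_name, namespace):
    """Emulate `from module_name import *` into `namespace`, if importable.

    Returns True when the module was found, False otherwise.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Optional dependency is missing: leave the namespace untouched.
        return False
    # `import *` honors __all__ when it is defined; otherwise it takes
    # every name that does not start with an underscore.
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in vars(module) if not n.startswith("_")]
    for name in names:
        namespace[name] = getattr(module, name)
    return True


# Demo with stand-ins: "math" plays the role of an installed package,
# "no_such_module_xyz" the role of one that is not installed.
ns = {}
found_math = optional_star_import("math", ns)
found_missing = optional_star_import("no_such_module_xyz", ns)
```

Inside a module such as pylab, the namespace argument would be
globals(), so existing names keep working and the optional ones simply
appear when the package is installed.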

Cheers,

Brian

> JDH
>

-- 
Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu and ellisonbg at gmail.com