[IPython-dev] pyspark and IPython

Brian Granger ellisonbg at gmail.com
Fri Aug 30 13:44:56 EDT 2013


> Sorry, I'm at ampcamp again and can't reply in as much detail as I'd
> like, but the pyspark architecture indeed can be used interactively
> from the notebook, and in fact it works much, much better than their
> default shell.  Here's my quick port of the pyspark tutorial:
>
> http://nbviewer.ipython.org/6384491/Data%20Exploration%20Using%20Spark.ipynb
>
> which I ran yesterday on an AMP cluster that had been configured
> according to my little tutorial:
>
> http://nbviewer.ipython.org/6384491/IPythonNotebookPySparkHowTo.ipynb

This is absolutely fantastic - I am glad it was this easy to get
going.  I am also glad that pyspark was written so that it can be
used interactively.

>
> I'm already talking to the AMPLab folks about how to make this
> integration work seamlessly out of the box with their AMIs.  It should
> be absolutely trivial to do once we have a couple of hours to spend on
> it.

Yep, that is so cool.  This is where open source just rocks - anyone
can plug the notebook into their own systems with ease and no
license/$ hassle.

> Once we ship 1.1 (so the super() bug is fixed and we don't have to go
> patching things manually), I'll sit down with them and finish this up.
>
> The deeper question of ipython.parallel/spark
> integration/competition/complementarity is much harder to answer, and
> I'm not really sure what the answer is yet, to be honest.  A good part
> of the reason I'm here is precisely to think about that.

Great, keep us posted.

Cheers,

Brian

>
> Cheers,
>
> f



-- 
Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu and ellisonbg at gmail.com


