[IPython-dev] pyspark and IPython
ellisonbg at gmail.com
Thu Aug 29 16:35:26 EDT 2013
From a quick glance, it looks like both pyspark and IPython use
similar parallel computing models in terms of the process model. You
might think that would help them to integrate, but in this case I
think it will get in the way of integration. Without learning more
about the low-level details of their architecture it is really
difficult to know if it is possible or not. But I think the bigger
question is what would the motivation for integration be? Both
IPython and Spark provide self-contained parallel computing
capabilities - what use cases are there for using both at the same
time? I think the biggest potential showstopper is that pyspark is
not designed in any way to be interactive as far as I can tell.
Pyspark jobs basically run in batch mode, which is going to make it
really tough to fit into IPython's interactive model. Worth looking
into more, though.
On Thu, Aug 29, 2013 at 11:28 AM, Nitin Borwankar <nborwankar at gmail.com> wrote:
> I'm at AmpCamp3 at UCB and see that there would be huge benefits to
> integrating pyspark with IPython and IPyNB.
> a) Has this been attempted/done? If so, pointers please.
> b) does this overlap the IPyNB parallel computing effort in
> conflicting/competing ways?
> c) if this has not been done yet - does anyone have a sense of how much
> effort this might be? (I've done a small hack integrating postgres psql into
> ipynb so I'm not terrified by that level of deep digging, but are there any
> show stopper gotchas?)
> Thanks much,
> Nitin Borwankar
> nborwankar at gmail.com
Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu and ellisonbg at gmail.com