Suggestions for Python MapReduce?
Wed Jul 22 12:55:51 EDT 2009
Phillip B Oldham <phillip.oldham at gmail.com> writes:
> Implementations like BashReduce
> <http://blog.last.fm/2009/04/06/mapreduce-bash-script> are perfect for
> such scenarios. I'm simply wondering if there's another simpler/smaller
> implementation of MapReduce that plays nicely with Python but doesn't
> require the setup/knowledge overhead of more "robust" implementations
> such as hadoop and disco... maybe similar to Ruby's Skynet.
I usually just spew ssh tasks across whatever computing nodes I can
get my hands on. It's less organized than something like MapReduce,
but I tend to run one-off tasks that I have to keep an eye on anyway.
I've done this across up to 100 or so machines, and I don't think it
would get much worse if the number were a few times higher. I don't
think it would scale to really large clusters (tens of thousands of
nodes).
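
For what it's worth, here is a rough sketch of what "spewing ssh tasks"
might look like from Python. It is only an illustration, not the exact
scripts I use: it assumes Python 3.7 or later, an OpenSSH client on the
PATH, and key-based login already set up on every node. The function
names (ssh_argv, run_one, run_everywhere) and the make_argv hook are
made up for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_argv(host, command):
    """Build the argv for running `command` on `host` over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, command]


def run_one(host, command, make_argv=ssh_argv, timeout=None):
    """Run one task; return (host, exit status, stdout, stderr)."""
    try:
        proc = subprocess.run(
            make_argv(host, command),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The process could not start or ran too long: report it
        # with a status of None rather than aborting the whole batch.
        return host, None, "", str(exc)
    return host, proc.returncode, proc.stdout, proc.stderr


def run_everywhere(hosts, command, max_workers=32,
                   make_argv=ssh_argv, timeout=None):
    """Run `command` on every host, at most `max_workers` at a time.

    Results come back in the same order as `hosts`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_one, host, command, make_argv, timeout)
            for host in hosts
        ]
        return [future.result() for future in futures]
```

Calling run_everywhere(["node01", "node02"], "uname -n") with host names
of your own returns one tuple per host, so failed nodes are easy to spot
and retry by hand. There is no scheduling, data shuffling, or automatic
retry here, which is the sense in which this is less organized than a
real MapReduce framework.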