[Baypiggies] Fw: pydoop -- Python MapReduce and HDFS API for Hadoop

Alec Flett alecf at flett.org
Fri Nov 6 22:19:56 CET 2009


FWIW, Metaweb released its Python Hadoop library as well, called "Happy":

http://code.google.com/p/happy/

(It looks like it hasn't been updated in a while, but I know there have
been ongoing developments internally, including support for Python 2.5.)

Alec

On Fri, Nov 6, 2009 at 12:59 PM, Joel VanderKwaak <joelvanderkwaak at yahoo.com> wrote:

> fyi - I haven't used it, and can't comment ;)
>
> ----- Forwarded Message ----
> *From:* Simone Leo <simone.leo at crs4.it>
> *To:* general at hadoop.apache.org
> *Sent:* Fri, November 6, 2009 9:20:36 AM
> *Subject:* pydoop -- Python MapReduce and HDFS API for Hadoop
>
> Hello everybody,
>
> we recently released pydoop, a Python MapReduce and HDFS API for Hadoop:
>
> http://pydoop.sourceforge.net
>
> It is implemented as a Boost.Python wrapper around the C++ code (pipes
> and libhdfs). It allows you to write complete MapReduce applications in
> CPython, with the same capabilities as the C++ API. Here is a minimal
> wordcount example:
>
>
> from pydoop.pipes import Mapper, Reducer, Factory, runTask
>
> class WordCountMapper(Mapper):
>
>   def __init__(self, context):
>     super(WordCountMapper, self).__init__(context)
>
>   def map(self, context):
>     words = context.getInputValue().split()
>     for w in words:
>       context.emit(w, "1")
>
> class WordCountReducer(Reducer):
>
>   def __init__(self, context):
>     super(WordCountReducer, self).__init__(context)
>
>   def reduce(self, context):
>     s = 0
>     while context.nextValue():
>       s += int(context.getInputValue())
>     context.emit(context.getInputKey(), str(s))
>
> runTask(Factory(WordCountMapper, WordCountReducer))
>
>
> Any feedback would be greatly appreciated.
>
> --
> Simone Leo
> Distributed Computing group
> Advanced Computing and Communications program
> CRS4
> POLARIS - Building #1
> Piscina Manna
> I-09010 Pula (CA) - Italy
> e-mail: simleo at crs4.it
> http://www.crs4.it
>
> _______________________________________________
> Baypiggies mailing list
> Baypiggies at python.org
> To change your subscription options or unsubscribe:
> http://mail.python.org/mailman/listinfo/baypiggies
>
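
For anyone who wants to trace what the quoted wordcount mapper and reducer
do without a Hadoop cluster, here is a minimal local sketch. It does not
import pydoop at all. LocalMapContext, LocalReduceContext, map_words,
reduce_counts and run_local are hypothetical names made up for this
illustration; the context classes mimic only the four methods the quoted
example calls (getInputValue, getInputKey, nextValue, emit). The grouping
step in run_local is an assumed stand-in for the shuffle that Hadoop would
normally do between the map and reduce phases.

```python
from collections import defaultdict


class LocalMapContext:
    """Hypothetical stand-in for a map-side context; not part of pydoop."""

    def __init__(self, value, sink):
        self._value = value
        self._sink = sink

    def getInputValue(self):
        return self._value

    def emit(self, key, value):
        self._sink.append((key, value))


class LocalReduceContext:
    """Hypothetical stand-in for a reduce-side context; not part of pydoop."""

    def __init__(self, key, values, sink):
        self._key = key
        self._values = iter(values)
        self._current = None
        self._sink = sink

    def getInputKey(self):
        return self._key

    def nextValue(self):
        # Advance to the next value for this key; False when exhausted.
        try:
            self._current = next(self._values)
            return True
        except StopIteration:
            return False

    def getInputValue(self):
        return self._current

    def emit(self, key, value):
        self._sink.append((key, value))


def map_words(context):
    # Same logic as the map() method in the quoted example.
    for w in context.getInputValue().split():
        context.emit(w, "1")


def reduce_counts(context):
    # Same logic as the reduce() method in the quoted example.
    s = 0
    while context.nextValue():
        s += int(context.getInputValue())
    context.emit(context.getInputKey(), str(s))


def run_local(lines):
    """Run map, group by key, then reduce, all in one process."""
    mapped = []
    for line in lines:
        map_words(LocalMapContext(line, mapped))

    # Assumed stand-in for the shuffle: collect all values per key.
    grouped = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)

    reduced = []
    for key in sorted(grouped):
        reduce_counts(LocalReduceContext(key, grouped[key], reduced))
    return reduced


if __name__ == "__main__":
    for word, count in run_local(["a b a", "b c"]):
        print(word, count)
```

Keys and values stay strings here, as in the quoted example, which is why
the reducer converts with int() before summing and with str() before
emitting.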