[Baypiggies] Fw: pydoop -- Python MapReduce and HDFS API for Hadoop

Charles Merriam charles.merriam at gmail.com
Thu Nov 12 01:30:54 CET 2009


On a related note, Shevek released KarmaSphere
(http://www.hadoopstudio.org/), which tracks logs and usage across
your jobs on your favorite thousand-node Hadoop network. It's slick.

Charles


On Fri, Nov 6, 2009 at 1:19 PM, Alec Flett <alecf at flett.org> wrote:
> FWIW, Metaweb released its Python Hadoop library as well, called "Happy":
> http://code.google.com/p/happy/
> (It looks like it hasn't been updated in a while, but I know there has
> been ongoing development internally, including support for Python 2.5.)
> Alec
>
> On Fri, Nov 6, 2009 at 12:59 PM, Joel VanderKwaak
> <joelvanderkwaak at yahoo.com> wrote:
>>
>> fyi - I haven't used it, and can't comment ;)
>>
>> ----- Forwarded Message ----
>> From: Simone Leo <simone.leo at crs4.it>
>> To: general at hadoop.apache.org
>> Sent: Fri, November 6, 2009 9:20:36 AM
>> Subject: pydoop -- Python MapReduce and HDFS API for Hadoop
>>
>> Hello everybody,
>>
>> we recently released pydoop, a Python MapReduce and HDFS API for Hadoop:
>>
>> http://pydoop.sourceforge.net
>>
>> It is implemented as a Boost.Python wrapper around the C++ code (pipes
>> and libhdfs). It allows you to write complete MapReduce applications in
>> CPython, with the same capabilities as the C++ API. Here is a minimal
>> wordcount example:
>>
>>
>> from pydoop.pipes import Mapper, Reducer, Factory, runTask
>>
>> class WordCountMapper(Mapper):
>>
>>   def __init__(self, context):
>>     super(WordCountMapper, self).__init__(context)
>>
>>   def map(self, context):
>>     words = context.getInputValue().split()
>>     for w in words:
>>       context.emit(w, "1")
>>
>> class WordCountReducer(Reducer):
>>
>>   def __init__(self, context):
>>     super(WordCountReducer, self).__init__(context)
>>
>>   def reduce(self, context):
>>     s = 0
>>     while context.nextValue():
>>       s += int(context.getInputValue())
>>     context.emit(context.getInputKey(), str(s))
>>
>> runTask(Factory(WordCountMapper, WordCountReducer))
>>
>>
>> Any feedback would be greatly appreciated.
>>
>> --
>> Simone Leo
>> Distributed Computing group
>> Advanced Computing and Communications program
>> CRS4
>> POLARIS - Building #1
>> Piscina Manna
>> I-09010 Pula (CA) - Italy
>> e-mail: simleo at crs4.it
>> http://www.crs4.it
>>
>> _______________________________________________
>> Baypiggies mailing list
>> Baypiggies at python.org
>> To change your subscription options or unsubscribe:
>> http://mail.python.org/mailman/listinfo/baypiggies
>
>
>
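
For anyone who wants to poke at the quoted wordcount logic without a
Hadoop cluster, here is a rough local sketch in plain Python 3. It does
not import pydoop. FakeMapContext, FakeReduceContext and run_local are
hypothetical names made up for illustration; they only mimic the
context method names that appear in the quoted example (getInputValue,
getInputKey, nextValue, emit) and make no claim about how pydoop or
Hadoop pipes behave internally.

```python
from collections import defaultdict


class FakeMapContext:
    """Hypothetical stand-in for a map-side context (not pydoop's class)."""

    def __init__(self, value, sink):
        self._value = value
        self._sink = sink

    def getInputValue(self):
        return self._value

    def emit(self, key, value):
        self._sink.append((key, value))


class FakeReduceContext:
    """Hypothetical stand-in for a reduce-side context (not pydoop's class)."""

    def __init__(self, key, values, sink):
        self._key = key
        self._values = iter(values)
        self._current = None
        self._sink = sink

    def getInputKey(self):
        return self._key

    def nextValue(self):
        # Advance to the next value for this key; False when exhausted.
        try:
            self._current = next(self._values)
            return True
        except StopIteration:
            return False

    def getInputValue(self):
        return self._current

    def emit(self, key, value):
        self._sink.append((key, value))


def map_words(context):
    # Same body as the quoted WordCountMapper.map
    for w in context.getInputValue().split():
        context.emit(w, "1")


def reduce_counts(context):
    # Same body as the quoted WordCountReducer.reduce
    s = 0
    while context.nextValue():
        s += int(context.getInputValue())
    context.emit(context.getInputKey(), str(s))


def run_local(lines):
    """Map every line, group emitted values by key, then reduce each key."""
    mapped = []
    for line in lines:
        map_words(FakeMapContext(line, mapped))
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    reduced = []
    for key in sorted(groups):
        reduce_counts(FakeReduceContext(key, groups[key], reduced))
    return reduced


if __name__ == "__main__":
    for word, count in run_local(["a b a", "b c"]):
        print(word, count)
```

The grouping step in run_local is an in-memory substitute for the
shuffle/sort phase that Hadoop performs between map and reduce, so this
is only useful for checking the mapper and reducer logic on small
inputs.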

