[Baypiggies] [ANN - Berkeley py4science] Talk Weds. April 28: Python capabilities on a Hadoop based cluster

Fernando Perez fperez.net at gmail.com
Tue Apr 27 23:14:03 CEST 2010

Hi all,

a reminder of tomorrow's talk:

    *  April 28, 2pm: Title: Python capabilities on a Hadoop based
cluster. By Dan Starr, Astronomy, UC Berkeley. To make use of several
Hadoop clusters recently made available, I ported portions of our
Python-based project into runnable Hadoop jobs using Hadoop Streaming
and Cascading. I'll discuss tricks which helped make this possible and
give some comparisons between Yahoo's M45 cluster and an Amazon EC2
cluster using customized Cloudera AMIs. I would also like to give an
overview of Hadoop Dumbo and Python hooks for Hive.

As usual, we meet at the Redwood Center's conference room: 508-20
Evans Hall (5th floor).

More information: https://cirl.berkeley.edu/view/Py4Science

Please pass this along to any colleagues who might be interested.


