[Baypiggies] [ANN - Berkeley py4science] Talk Weds. April 28: Python capabilities on a Hadoop based cluster
Fernando Perez
fperez.net at gmail.com
Tue Apr 27 23:14:03 CEST 2010
Hi all,
a reminder of tomorrow's talk:
* April 28, 2pm: Title: Python capabilities on a Hadoop based
cluster. By Dan Starr, Astronomy, UC Berkeley. To make use of several
Hadoop clusters recently made available, I ported portions of our
Python-based project into Hadoop-runnable jobs using Hadoop Streaming
and Cascading. I'll discuss tricks which helped make this possible and
give some comparisons between Yahoo's M45 cluster and an Amazon EC2
cluster using customized Cloudera AMIs. I would also like to give an
overview of Hadoop Dumbo and Python hooks for Hive.
As usual, we meet at the Redwood Center's conference room: 508-20
Evans Hall (5th floor).
More information: https://cirl.berkeley.edu/view/Py4Science
Please pass this along to any colleagues who might be interested.
Regards,
f