Suggestions for Python MapReduce?

python at python at
Wed Jul 22 11:42:07 EDT 2009


We've had great success writing simple, project-specific algorithms to
split content into chunks suited to ETL-type, Python-based processing in
a hosted cloud environment such as Amazon EC2 or the recently launched
Rackspace Cloud Servers. Since we purchase our cloud hosting time in
one-hour blocks, we divide our data into much larger chunks than a
traditional map-reduce technique might use. For many of our projects,
transferring data to and from the cloud accounts for most of the
wall-clock time.
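
To make the chunking idea concrete, here is a rough sketch of one way to
size chunks against a one-hour billing block. It is not our actual code.
The names (plan_chunks, estimated_seconds), the throughput parameters,
and the assumption that output is about the same size as input are all
made up for illustration; you would measure your own rates.

```python
from typing import Iterable, List, Tuple

# Assumption: hosting is billed in one-hour blocks, as described above.
BILLING_BLOCK_SECONDS = 3600


def estimated_seconds(size_bytes: int, transfer_bps: float, process_bps: float) -> float:
    """Estimate upload + processing + download time for a chunk.

    Hypothetical model: output is assumed to be roughly the size of the
    input, so transfer is counted twice.
    """
    return 2 * size_bytes / transfer_bps + size_bytes / process_bps


def plan_chunks(
    items: Iterable[Tuple[str, int]],
    transfer_bps: float,
    process_bps: float,
    budget_seconds: float = BILLING_BLOCK_SECONDS,
) -> List[List[str]]:
    """Greedily pack (name, size_bytes) items into chunks that fit the budget.

    An item too large to fit the budget on its own still gets a chunk
    to itself rather than being dropped.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, size in items:
        would_take = estimated_seconds(current_bytes + size, transfer_bps, process_bps)
        if current and would_take > budget_seconds:
            # Adding this item would spill into another billed hour.
            chunks.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk would then go to one cloud instance. Because the
transfer term counts twice in the estimate, a slow link shrinks the
chunks more than slow processing does, which matches what we see when
transfer time dominates.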


More information about the Python-list mailing list