[Tutor] Sorting a dictionary on a value in a list.
lawrence.wickline at gmail.com
Mon Dec 8 17:55:40 CET 2008
On Dec 6, 2008, at 12:41 AM, Lie Ryan wrote:
> In most cases, in processing involving networking, the bottleneck is
> network speed itself. Optimizing your own code to speed things up
> might not make your download significantly faster (getting 60 seconds
> faster is great for scripts that usually run for 70 seconds, but is a
> waste of development time for scripts that usually run for 1 hour).
> Usually a multi-threading downloader might be a better chance for
> improvement, especially when 1) downloading from different sites, 2)
> remote sites have a speed limit, 3) you have a faster download link
> than the server can give.
In this particular case everything is on the local network. This is
actually part of a Hadoop map/reduce system I am learning, so reducing
CPU usage is of high value. If network pull times become an issue, the
cluster can be expanded and the time between pulls can be reduced. As
of this morning I am being directed to make the reducer usable both in
the mapper and then again as a reducer. This has forced me to rework
everything so that it can be called as a module.
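
For illustration only, here is a rough sketch of what a reducer that is
"callable as a module" might look like. It is not the actual code from
this project: the function names (parse_lines, reduce_pairs, combine,
run) and the tab-separated "key<TAB>count" line format are assumptions
made up for the example.

```python
from itertools import groupby


def parse_lines(lines):
    """Yield (key, int_value) pairs from 'key<TAB>value' lines.

    Assumes well-formed input; a line without a numeric value raises
    ValueError. Blank lines are skipped.
    """
    for line in lines:
        key, _, value = line.rstrip("\n").partition("\t")
        if key:
            yield key, int(value)


def reduce_pairs(pairs):
    """Sum the values for each key.

    Assumes the pairs arrive already grouped by key, as they would be
    after the sort step between map and reduce.
    """
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(value for _, value in group)


def combine(pairs):
    """Mapper-side use: sort the mapper's own pairs, then reuse the reducer."""
    return reduce_pairs(sorted(pairs))


def run(stdin, stdout):
    """Reducer-side use: read key/value lines, write one total per key."""
    for key, total in reduce_pairs(parse_lines(stdin)):
        stdout.write("%s\t%d\n" % (key, total))
```

The mapper would import the module and call combine() on its own
output before emitting it, while the reducer script would call
run(sys.stdin, sys.stdout), so the summing logic lives in one place.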
I have never learned Java, so that wasn't an option, and the more I
work with it, the more Python seems to be the perfect fit for
Hadoop-type work. Really fun stuff.