pairs from a list

Steven D'Aprano steven at REMOVE.THIS.cybersource.com.au
Wed Jan 23 04:37:03 EST 2008


On Tue, 22 Jan 2008 23:33:00 -0800, George Sakkis wrote:

> As I mentioned already, I consider the seeking of the most efficient
> solution a legitimate question, regardless of whether a "dumb" solution
> is fast enough for an application. Call it a "don't be sloppy" principle
> if you wish.

Sure, by why do you limit "efficient" and "don't be sloppy" to mean 
"write the fastest executing code you can, regardless of every other 
trade-off"?

Because there are trade-offs:

* execution time
* compilation time
* memory use at run-time
* size of source code on disk
* size of compiled code on disk
* ease of writing it in the first place
* ease of maintenance
* correctness
* fragility in the face of malformed data
* readability and ease of comprehension
* elegance (however you judge that!)
* making sure the usage of various resources never exceed some set of 
upper bounds
etc.

Some of these are relatively unimportant (e.g. the size of the source 
code), but others are extremely important and far too often ignored (e.g. 
readability and even correctness).

In fact, "fastest" isn't even a meaningful attribute. Does it mean:

* the worst-case is fastest
* the best-case is fastest
* the average-case is fastest
* fastest on typical data
* all of the above
etc.

What counts as "typical" data? How do you average all the cases?

Asking "what's the fastest" without considering these issues is a good 
sign that the person asking is being sloppy, something you should be 
against.


> It's the same reason I always use xrange() instead of
> range() for a loop, although in practice the difference is rarely
> measurable.

Well, that's (now) a no-brainer. Once upon a time, range() used to be 
faster for small enough lists, and whatever difference there is in 
readability is insignificant except for utter newbies, so there's no real 
trade-off to make unless you care about the extra "x".

But... do you write list.__len__() instead of len(list) to save a few 
nanoseconds?

If you do, you're guilty of pessimization instead of optimization: for 
built-in lists, len() is actually faster, by a lot. It's only an 
optimization -- and a significant one -- for custom classes.

So, do you write this:

if isinstance(alist, list):
    value = len(alist)  # optimize for lists
else:
    value = alist.__len__()  # optimize for classes

every time you want the length of an arbitrary sequence?

If you did, you'd be pessimizing again. That code runs nearly twice as 
slowly as just calling the built-in len(). Optimizing the parts doesn't 
mean you have optimized the whole.



-- 
Steven



More information about the Python-list mailing list