[Python-Dev] Re: 2.4 news reaches interesting places

Sat Dec 11 05:41:24 CET 2004

Raymond Hettinger wrote:
> guidelines for applications that demand peek performance (in terms of memory

Peak performance, perhaps? :) Anyway, it looks pretty good to me, but I have a 
few additional ideas.

Add a section of Caveats (we know they exist - might as well be upfront about it):

Caveats
--------
.  This document is based primarily on the CPython interpreter from 
www.python.org. However, most of the recommendations should apply to any Python 
interpreter that supports the mentioned feature.

.  For small data sets, some of the suggestions will perform more slowly than 
the approaches they are given as alternatives too. This is due to the fact that 
those suggestions introduce a little more fixed overhead in order to avoid 
overhead in processing each data item. The crossover point (where the more 
scaleable approach begins to show a net performance gain) is generally 
application specific. Use the diagnostic tools mentioned later to make an 
informed decision for your application.

> Use the best algorithms and fastest tools 
> -----------------------------------------
 > .	Membership testing with sets and dictionaries is much faster,
 > O(1), than searching sequences, O(n).  When testing "a in b", b should
 > be a set or dictionary instead of a list or tuple.

Should the bisect module be mentioned? If you have a sorted list already, using
bisect may be faster than constructing an intermediate set. Also, you can use 
bisect if an item's *position* in the list is important. list.index(x) uses a 
linear search (since it can't assume a sorted list), whereas bisect expects the 
list to be pre-sorted.

> intermediate step.  Py2.4 mitigates this issue somewhat; however, 
> ''.join(seq) remains the best practice.

I'd say 'The CPython 2.4 interpreter', rather than just Py2.4.

> .	List comprehensions run a bit faster than equivalent for-loops.

I'd move this to the next section (the algorithms are the same, but the
interpreter can speed up one more than the other)

> .	Custom sort ordering is best performed with Py2.4's key= option or with the
> traditional decorate-sort-undecorate technique.  Both approaches call the key
> function just once per element.  In contrast, sort's cmp= option is called
> many times per element during a sort.  For example, sort(key=str.lower) is
> faster than sort(cmp=lambda a,b: cmp(a.lower(), b.lower())).

If sorting is going to be performed repeatedly (e.g. maintaining a list in 
sorted order), it may be feasible to store the list of keys, rather than 
regenerating them every time. This also plays nicely with using the bisect 
module to update the list (as the list of keys is available to determine the 
correct index for insertion).

I can't find a 'sorted list' recipe in the Cookbook, so I might have to create one.

> Take advantage of interpreter optimizations 
> -------------------------------------------
[snip]

. Prefer iteration (using for loops, list comprehensions, or generator 
expressions) over iterables to explicit invocation of .next()

> Take advantage of diagnostic tools
> ----------------------------------
> 
> .	The hotshot and profile modules help identify performance bottlenecks.

Currently profile is a fair bit more useful than hotshot, mainly due to the fact 
that it isolates time spent in C functions from the Python code that calls those 
functions.

Perhaps another section for External Libraries? If you're doing serious number 
crunching in Python, using NumPy is practically a requirement.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net