Performance: sets vs dicts.

Terry Reedy tjreedy at udel.edu
Wed Sep 1 16:22:56 EDT 2010


On 9/1/2010 11:40 AM, Aahz wrote:
> I think that any implementation
> that doesn't have O(1) for list element access is fundamentally broken,

Whereas I think that that claim is fundamentally broken in multiple ways.

> and we should probably document that somewhere.

I agree that *current* algorithmic behavior of parts of CPython on 
typical *current* hardware should be documented not just 'somewhere' 
(which I understand it is, in the Wiki) but in a CPython doc included in 
the doc set distributed with each release.

Perhaps someone or some group could write a HowTo on Programming with 
CPython's Builtin Classes that would describe both the implementation 
and performance and also the implications for coding style. In 
particular, it could compare CPython's array lists and tuples to singly 
linked lists (which are easily created in Python also).

But such a document, after stating that array access may be thought of 
as constant time on current hardware to a useful first approximation, 
should also state that repeated seqeuntial accessess may be *much* 
faster than repeated random accessess. People in the high-performance 
computing community are quite aware of this difference between 
simplified lies and messy truth. Because of this, array algorithms are 
(should be) written differently in Fortran and C because Fortran stores 
arrays by columns and C by rows and because it is usually much faster to 
access the next item than one far away.

-- 
Terry Jan Reedy




More information about the Python-list mailing list