list.pop(0) vs. collections.dequeue
Steve Howell
showell30 at yahoo.com
Mon Jan 25 18:30:16 EST 2010
On Jan 25, 1:32 pm, Arnaud Delobelle <arno... at googlemail.com> wrote:
> Steve Howell <showel... at yahoo.com> writes:
>
> [...]
>
> > My algorithm does exactly N pops and roughly N list accesses, so I
> > would be going from N*N + N to N + N log N if switched to blist.
>
> Can you post your algorithm? It would be interesting to have a concrete
> use case to base this discussion on.
>
These are the profile results for an admittedly very large file
(430,000 lines), which shows that pop() consumes more time than any
other low level method. So pop() is not a total red herring. But I
have to be honest and admit that I grossly overestimated the penalty
for smaller files. Typical files are a couple hundred lines, and for
that use case, pop()'s expense gets totally drowned out by regex
handling. In other words, it's a lot cheaper to move a couple hundred
pointers per list element pop than it is to apply a series of regexes
to them, which shouldn't be surprising.
ncalls tottime percall cumtime percall filename:lineno
(function)
230001/1 149.508 0.001 222.432 222.432 /home/showell/workspace/
shpaml_website/shpaml.py:192(recurse)
429999 17.667 0.000 17.667 0.000 {method 'pop' of 'list'
objects}
530000 8.428 0.000 14.125 0.000 /home/showell/workspace/
shpaml_website/shpaml.py:143(get_indented_block)
3700008 7.877 0.000 7.877 0.000 {built-in method match}
5410125/5410121 5.697 0.000 5.697 0.000 {len}
300000 3.938 0.000 22.286 0.000 /home/showell/workspace/
shpaml_website/shpaml.py:96(convert_line)
959999 3.847 0.000 6.759 0.000 /home/showell/workspace/
shpaml_website/shpaml.py:29(INDENT)
959999 3.717 0.000 12.547 0.000 /home/showell/workspace/
shpaml_website/shpaml.py:138(find_indentation)
370000 3.495 0.000 20.204 0.000 /home/showell/workspace/
shpaml_website/shpaml.py:109(apply_jquery)
370000 3.322 0.000 6.528 0.000 {built-in method sub}
1469999 2.575 0.000 2.575 0.000 {built-in method groups}
As an aside, I am a little surprised by how often I call len() and
that it also takes a large chunk of time, but that's my problem to
fix.
More information about the Python-list
mailing list