Binary search tree
Gabriel Genellina
gagsl-py2 at yahoo.com.ar
Tue Nov 13 03:45:51 EST 2007
En Mon, 12 Nov 2007 16:21:36 -0300, Scott SA <pydev at rscorp.ab.ca> escribió:
> I decided to test the speeds of the four methods:
(but one should always check for correctness before checking speed)
> def dict_example(urls):
> d = {}
> for url in urls:
> if url in d:
> d[url] = 1
The above does nothing more than iterate over all urls.
> For the simple quest of "Find the unique URLs in a list", the 'dict'
> example is by far the quickest.
After correction, there is no significative difference between
dict_example and set_example (sometimes one is slightly slower than the
other, and sometimes it's the other way). But the simpler s = set(urls)
suggested by Terry Reedy clearly wins (at least on my tests).
> Buuut, if you need the indexes and possibly a 'hit count', then the
> enumerated example seems about the best (of the possibilities tested)
In this case there is another alternative using defaultict:
def defaultdict_example(urls):
d = defaultdict(list)
for idx,url in enumerate(urls):
d[url].append(idx)
which is faster than setdefault_enum (again, on my tests).
Another point worth menction, I did shuffle(dummy_urls) before testing.
Having all of them ordered may introduce some bias.
--
Gabriel Genellina
More information about the Python-list
mailing list