Slow comparison between two lists
stef.mientki at gmail.com
Thu Oct 23 14:32:17 CEST 2008
On Thu, Oct 23, 2008 at 2:03 PM, Jani Tiainen <redetin at gmail.com> wrote:
> I have rather simple 'Address' object that contains streetname,
> number, my own status and x,y coordinates for it. I have two lists
> both containing approximately 30000 addresses.
> I've defined __eq__ method in my class like this:
>     def __eq__(self, other):
>         return self.xcoord == other.xcoord and \
>             self.ycoord == other.ycoord and \
>             self.streetname == other.streetname and \
>             self.streetno == other.streetno
> But it turns out to be very, very slow.
> Then I set up two lists:
> list_external = getexternal()
> list_internal = getinternal()
> Now I need to get all addresses from 'list_external' that are not in
> 'list_internal', and mark them as "new".
> I did it like this:
>     for addr in list_external:
>         if addr not in list_internal:
>             addr.status = 1 # New address
> But in my case running that loop takes about 10 minutes. What am I
> doing wrong?
Sorry, I don't see what you're doing wrong,
except I'd write "if not (addr in list_internal):",
but you might consider using dictionaries.
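A rough sketch of the dictionary idea, here with a set (a dictionary without values). It assumes your Address class can be given a __hash__ that agrees with its __eq__; the constructor and the sample addresses below are made up for illustration. A membership test on a list compares against the elements one by one, while a set does a hash lookup, so the 30000 x 30000 comparisons should largely disappear:

```python
class Address(object):
    """Hypothetical stand-in for the Address class in the question."""

    def __init__(self, streetname, streetno, xcoord, ycoord):
        self.streetname = streetname
        self.streetno = streetno
        self.xcoord = xcoord
        self.ycoord = ycoord
        self.status = 0

    def _key(self):
        # The same four fields that the original __eq__ compares.
        return (self.xcoord, self.ycoord, self.streetname, self.streetno)

    def __eq__(self, other):
        return self._key() == other._key()

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Equal objects must have equal hashes, so hash the same key.
        return hash(self._key())


def mark_new(list_external, list_internal):
    """Set status = 1 on every external address missing from list_internal."""
    known = set(list_internal)      # built once, one hash per address
    for addr in list_external:
        if addr not in known:       # hash lookup instead of a list scan
            addr.status = 1         # New address


# Made-up sample data, only to show the call.
list_internal = [Address("Main St", 1, 10.0, 20.0)]
list_external = [Address("Main St", 1, 10.0, 20.0),
                 Address("Side St", 7, 11.5, 21.5)]
mark_new(list_external, list_internal)
print([a.status for a in list_external])
```

One caveat: the four key fields must not change while an address sits in the set, otherwise the lookup will no longer find it.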
A couple of days ago I did a speed comparison between C and Python.
We used the lookup of a string in a hash table (= dictionary in Python),
and the results were amazing:
The search string was between 50 and 100 characters long.
The table / dictionary was built with 10 million strings, each also 50 to
100 characters long.
We repeated the search 100 million times and measured the time.
C: about 150 lines of unreadable code
Python: about 20 lines of very easy to read code
C: 1.5 days
Python: 20 minutes
C: 20 seconds
Python: 11 seconds !!
Who dares to say Python is slow?
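For anyone who wants to try something similar, here is a scaled-down sketch of that kind of test. It is not the original test code: the function names, the table size, the number of searches and the use of random ASCII letters are all assumptions chosen so that it runs in a moment.

```python
import random
import string
import time


def random_string(rng, min_len=50, max_len=100):
    """Return a random string of ASCII letters, 50 to 100 characters long."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_letters) for _ in range(length))


def run_lookup_benchmark(n_strings=2000, n_searches=100000, seed=0):
    """Build a dictionary of random strings and time repeated lookups."""
    rng = random.Random(seed)
    keys = [random_string(rng) for _ in range(n_strings)]
    table = dict.fromkeys(keys, True)   # the "hash table"
    needle = keys[n_strings // 2]       # a string known to be in the table
    hits = 0
    start = time.time()
    for _ in range(n_searches):
        if needle in table:
            hits += 1
    elapsed = time.time() - start
    return hits, elapsed


hits, elapsed = run_lookup_benchmark()
print("%d hits in %.3f seconds" % (hits, elapsed))
```

The timings from this small version are not comparable with the numbers above; it only shows the shape of the measurement.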
> Jani Tiainen