Sets in Python

Alex Martelli aleaxit at yahoo.com
Tue Jan 30 12:04:21 EST 2001


"Magnus Lie Hetland" <mlh at idi.ntnu.no> wrote in message
news:956hcd$qef$1 at tyfon.itea.ntnu.no...
>
> I quite often use dictionaries as sets in Python... But it doesn't really
> seem pretty enough to me... Some arbitrariness (as with
> "while 1: ... break" I still don't like that one...) about using what
> to me is a value as a key, and assigning it some value -- like 1,
> for instance... And then checking it with has_key, and removing
> it with del...
>
> Isn't there a better (more beautiful) way of doing this?

Hi Magnus!  It's been a while...

In Python 2.1, you'll be able to use
    if x in dict
as a synonym for
    if dict.has_key(x)
[and *maybe* the for-loop equivalent, too -- I'm still unclear
about that one, and it doesn't seem to be in the first alpha].

This would seem to point to using dictionaries as sets.


> And if one was to write a small wrapper of some kind --
> is there any way to implement "x in s" without a linear search?
> Or does the "in" operator always loop from 0 and up?
> (It would certainly be nice to be able to override it...)

Time machine at work again -- you CAN 'override it' in the
current Python (2.0; unsure if that was in 1.6 too).  The
special method name you're looking for is __contains__ --
takes self and item, can apparently return any true or
false value (at least, that's how I read the docs).

Thus, it would seem that (without having to wait for 2.1)
implementing sets as class instances would not be too
bad, either (tiny overhead wrt using native dicts directly,
and you get to control the syntax sugar:-).  Supporting
'for x in myset' will require some (modest) trickery in
__getitem__ (caching a [sorted?] copy of the keys of the
dictionary attribute one uses for implementation, else
the overhead might become a tad too big, I suspect).  That
gets into murky territory in order to support semi-sensibly
such behavior as altering the set whose contents one is
looping on, considering the possibility that the for loop
may be prematurely terminated with a break (so one can't
know for sure when to throw away said cache), etc.  But, it
seems to me, nothing _too_ horrible.


Alex






More information about the Python-list mailing list