operators and datatypes for sets

Anton Vredegoor a.vredegoor at hccnet.nl
Tue Apr 24 09:32:02 EDT 2001


On Mon, 23 Apr 2001 21:55:35 GMT, "Steve  Holden" <sholden at cox.rr.com>
wrote:

>"Anton Vredegoor" <anton at vredegoor.doge.nl> wrote in message
>news:9c1jdj$phb$1 at news.hccnet.nl...

<snip>

>> http://home.hccnet.nl/a.vredegoor/smallset/smallset.html
>>
>> The basic idea is to use lists as the underlying datatype and to do
>> member administration by setting bits to '1' or '0' in a longint. This
>> has the advantage of doing very fast operations. There are also some
>> disadvantages, one of them is having to specify a 'universe'.
>>
>So you are talking about finite sets, with underlying base sets specifying
>the universe.
>
>Had you thought about mapping the members on to the positions with a
>dictionary to allow more flexibility? Having to compute the element's
>position outside the set is inelegant.

The idea behind the smallset was to create a kind of wrapper around
large integers in order to make them behave as if they were sets. If I
would create a lot of overhead inside the wrapper it would defeat the
purpose of having fast set operations.

>> I have used the 'in' operator for determining if a set is a subset of
>> another set. This is different from its normal use where it is used
>> for checking if an element is in a set.
>
>This is likely to be VERY confusing. Although "contains" is ambiguous in
>English for subset and element set membership, "in" is pretty much expected
>to refer to element membership, so you might want to think about providing
>an "is_subset" method ... but don't ask me which way around the arguments
>should go, that could be a three-week thread all on its own!

Its a bit confusing, I agree. But just imagine having to program a lot
of set operations like:

 if s1 in s2:
	dosomething()
 else:
	dosomethingelse()

Doesn't it feel right? :-)

>> Further I have used the '/'
>> operator for 'set1 without the elements in set2'. This is also
>> unusual.
>
>Subtraction would seem the more normal operation.

I was using the analogy of mathematical notation for sets. Since the
'\' operator is already in use, the closest would be '/'.

<snip>

>Firstly, have you made any provision for different instances to deal with
>different universes? It seems to me that it would be more flexible if the
>smallset.__init__() method were to record some information about the
>universe of that set, so there was a tighter coupling between smallsets and
>the objects they were dealing with.

I would rather separate set operations totally from the objects, that
would not only be faster, but would have interesting possibilities for
mapping one set into an other. A set is just a large integer anyway.
Why not choose a random long integer < n? There you have a new set! Or
create a new universe with size >= n and use a long integer that was
used to describe another set in another universe to access it.

>That way, compress(), toindex() and expand() would be methods of whichever
>type of object were being stored in the smallset (in your example, point),
>and they would be called from within the smallset methods through a
>self.universe instance variable.
>
>For example, your __repr__() mthod might then change from
>
>    def __repr__(self):
>        #delegate printing to objects in the set
>        return '%s' %expand(self.data)
>
>to
>
>    def __repr__(self):
>        #delegate printing to objects in the set
>        return '%s' %self.data.expand()

Since data is a long integer it would take a lot of overhead to make
it behave this way. I feel you are right in pointing out some problem
though. Maybe there is something falling between classes and defs that
I can't yet put my finger on.  For instance a __getitem__() call would
need to know the universe it is getting the items from. 
__getitem__(self, universe) ?

 for x(universe) in set:
	print x

hmm...

>Hope this helps.

Thanks you for your suggestions, I disagreed with your code changes
but still you helped in making things clearer for me.

Anton.

>regards
> sTeVe
>




More information about the Python-list mailing list