set using alternative hash function?

Dave Angel davea at ieee.org
Fri Oct 16 07:48:54 EDT 2009


Austin Bingham wrote:
> On Thu, Oct 15, 2009 at 7:49 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
>   
>> Austin Bingham wrote:
>> I'm feeling really dense about now... What am I missing?
>>     
>
> What you're missing is the entire discussion up to this point. I was
> looking for a way to use an alternative uniqueness criteria in a set
> instance without needing to modify my class.
>
>   
>> So is that the behavior you're wanting, keeping the first object and
>> discarding all others?  Or is there something else I'm still missing?
>>     
>
> Yes and yes. I want "normal" set behavior, but I want the set to use
> user-provided hash and equality tests, i.e. ones that don't
> necessarily call __hash__ and __eq__ on the candidate elements.
>
> Austin
>
>   
Your original post (10/15) never mentioned the __eq__() method, and 
people went off on a tangent discussing hash functions.  But from what 
you've been saying more recently, you want a set-like collection whose 
behavior isn't related to the standard comparison function.

The standard set is intimately tied to ==, in the sense that if two 
objects meet the a==b relationship, then only one will be in the set.  
And it doesn't matter which one, since they're equal.

You want a new collection that works differently.  Great.  It shouldn't 
be that hard to write, but if you try to base it on existing set you'll 
need a second wrapper object.  If you try to base it on dict, you'll 
need a second key object.  You've said you don't want any extra objects 
to be needed, so you'll need custom code.

A couple of dozen lines should do for the basic collection (add, clear, 
remove).  But if you also want difference, symmetric_difference, 
intersection_update, issubset ... it could grow to several dozen.  And 
you'd have to make some interesting decisions, once you have two 
pseudo-sets with possibly different comparison operators.

The only downside I can see is that apparently the built-in set is 
written in C, and if you're implementing this in pure python, it could 
be a lot slower.  On the other hand, in many cases, the comparison 
function-object might be the time-critical piece, in which case you coud 
be close.

DaveA




More information about the Python-list mailing list