Mutable objects which define __hash__ (was Re: Why are tuples immutable?)
Nick Coghlan
ncoghlan at iinet.net.au
Thu Dec 30 02:36:57 EST 2004
Bengt Richter wrote:
> Essentially syntactic sugar to avoid writing id(obj) ? (and to get a little performance
> improvement if they're written in C). I can't believe this thread came from the
> lack of such sugar ;-)
The downside of doing it that way is you have no means of getting from the id()
stored as a key back to the associated object. Meaningful iteration (including
listing of contents) becomes impossible. Doing the id() call at the Python level
instead of internally to the interpreter is also relatively expensive.
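A rough sketch of that downside, using two equal but distinct lists as hypothetical stand-ins for the objects being classified (the names `passed`, `failed`, `labels` and `by_id` are illustrative only):

```python
# Two equal but distinct lists; lists are unhashable, so they cannot be
# used directly as dict keys.
passed = [1, 2, 3]
failed = [1, 2, 3]

# Keying on id() distinguishes them, but the keys are plain integers.
labels = {id(passed): "ok", id(failed): "error"}
keys_are_ints = all(isinstance(key, int) for key in labels)

# Iterating over `labels` yields integers, not the lists themselves.
# To get back to the objects, each one has to be stored next to its value.
# Holding the reference also keeps the object alive, so its id() cannot be
# reused by a different object while the entry exists.
by_id = {id(obj): (obj, label) for obj, label in [(passed, "ok"), (failed, "error")]}
recovered = [obj for obj, _ in by_id.values()]
```

The second dict is the manual workaround: it works, but every user ends up writing the same `id(obj)` plumbing by hand.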
> Or, for that matter, (if you are the designer) giving the objects an
> obj.my_classification attribute (or indeed, property, if dynamic) as part
> of their initialization/design?
The main mutable objects we're talking about here are Python lists. Selecting an
alternate classification scheme using a subclass is the currently recommended
approach - this thread is about alternatives to that.
I generally work with small enough data sets that I just use lists for
classification (sorting test input data into inputs which worked properly, and
those which failed for various reasons). However, I can understand wanting to
use a better data structure when doing frequent membership testing, *without*
having to make fundamental changes to an application's object model.
> Or subclass your graph node so you can do something readable like
> if node.is_leaf: ...
> instead of
> if my_obj_classification[id(node)] == 'leaf': ...
I'd prefer:
    if node in leaf_nodes:
        ...
Separation of concerns suggests that a class shouldn't need to know about all
the different ways it may be classified. And mutability shouldn't be a barrier
to classification of an object according to its current state.
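As a minimal sketch of that style, assuming a hypothetical `Node` class that does not define `__eq__` or `__hash__` (so it keeps Python's default identity-based hashing and can go into an ordinary set even though it is mutable):

```python
class Node:
    """Hypothetical graph node; it knows nothing about how it is classified."""

    def __init__(self, name):
        self.name = name
        self.children = []  # mutable state


root, left, right = Node("root"), Node("left"), Node("right")
root.children.extend([left, right])

# Classify by current state, outside the class definition.
leaf_nodes = {node for node in (root, left, right) if not node.children}

root_is_leaf = root in leaf_nodes
left_is_leaf = left in leaf_nodes

# Mutating a node does not disturb the lookup, because the default hash
# depends on identity rather than on the node's contents. The set is a
# snapshot, though: it has to be rebuilt to reflect the new state.
left.children.append(Node("grandchild"))
left_still_found = left in leaf_nodes
```

This only works because `Node` uses the default hash. For lists, or for classes whose `__hash__` depends on mutable contents, an ordinary set is not an option, which is where an identity-based container would come in.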
>>Hence why I suggested Antoon should consider pursuing collections.identity_dict
>>and collections.identity_set if identity-based lookup would actually address his
>>requirements. Providing these two data types seemed like a nice way to do an end
>>run around the bulk of the 'potentially variable hash' key problem.
>
> I googled for those ;-) I guess pursuing meant implementing ;-)
Yup. After all, the collections module is about high-performance datatypes for
more specific purposes than the standard builtins. identity_dict and
identity_set seem like natural fits for dealing with annotation and
classification problems where you don't want to modify the class definitions for
the objects being annotated or classified.
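For illustration only, here is one way such containers might be sketched in pure Python 3. The names `IdentityDict` and `IdentitySet` are hypothetical, this is not an existing `collections` API, and a real implementation would presumably be written in C for speed:

```python
from collections.abc import MutableMapping, MutableSet


class IdentityDict(MutableMapping):
    """Mapping that looks keys up by identity (id()), not by hash/equality."""

    def __init__(self, items=()):
        # id(key) -> (key, value); storing the key keeps it alive and
        # lets iteration return the original objects.
        self._data = {}
        for key, value in items:
            self[key] = value

    def __getitem__(self, key):
        try:
            return self._data[id(key)][1]
        except KeyError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        self._data[id(key)] = (key, value)

    def __delitem__(self, key):
        try:
            del self._data[id(key)]
        except KeyError:
            raise KeyError(key) from None

    def __iter__(self):
        return (key for key, _ in self._data.values())

    def __len__(self):
        return len(self._data)


class IdentitySet(MutableSet):
    """Set that tests membership by identity, so unhashable items are fine."""

    def __init__(self, items=()):
        self._data = {}  # id(item) -> item
        for item in items:
            self.add(item)

    def __contains__(self, item):
        return id(item) in self._data

    def __iter__(self):
        return iter(self._data.values())

    def __len__(self):
        return len(self._data)

    def add(self, item):
        self._data[id(item)] = item

    def discard(self, item):
        self._data.pop(id(item), None)


# Usage: two equal lists are treated as separate keys and members.
passed, failed = [1, 2], [1, 2]
results = IdentityDict([(passed, "ok"), (failed, "timeout")])
good_inputs = IdentitySet([passed])
```

Unlike a dict keyed on bare `id()` values, iterating over either container hands back the original objects, and mutating `passed` or `failed` afterwards does not affect lookup.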
I don't want the capability enough to pursue it, but Antoon seems reasonably
motivated :)
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at email.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net