[Patches] [ python-Patches-1615701 ] Creating dicts for dict subclasses

SourceForge.net noreply at sourceforge.net
Wed Dec 20 01:13:03 CET 2006


Patches item #1615701, was opened at 2006-12-14 08:08
Message generated for change (Comment added) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1615701&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core (C code)
Group: Python 2.6
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: Creating dicts for dict subclasses

Initial Comment:
This patch changes dictobject.c so that creating dicts from mapping like objects only uses the internal dict functions if the argument is a *real* dict, not a subclass. This means that overwritten keys() and __getitem__() methods are now honored. In addition to that the fallback implementation now tries iterkeys() before trying keys(). It also adds a PyMapping_IterKeys() macro.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2006-12-19 19:13

Message:
Logged In: YES 
user_id=80475
Originator: NO

Since update already supports (key, item) changes, I do not see that
rationale in trying to expand the definition what is dict-like to include a
try-this, then try-that approach.  This is a little too ad-hoc for too
little benefit.

Also, I do not see the point of adding PyMapping_Iterkeys to the C API. 
It affords no advantage over its macro definition (the current
one-way-to-do-it). 

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2006-12-19 18:00

Message:
Logged In: YES 
user_id=80475
Originator: NO

It is also asking for bugs if someone hooks __getitem__ and starts to make
possibly invalid assumptions about what other changes occur implicitly.

Also, this patch kills the performance of builtin subclasses.  If I
subclass dict to add a new method, it would suck to have the performance of
all of the other methods drop percariously.

This patch is somewhat overzealous.  It encroaches on the terriority of
UserDict.DictMixin which was specifically made for propagating new
behaviors.  It unnecessarily exposes implementation details.  It introduces
implicit behaviors that should be done through explicit overrides of
methods whose behavior is supposed to change.  

And, it is on the top of a slippery slope that we don't want to go down
(i.e. do we want to guarantee that list.append is implemented in terms of
list.extend, etc).  Python has no shortage of places where builtin
subclasses make direct calls to the underlying C code -- this patch leads
to a bottomless pit of changes that kill performance and make implicit
side-effects the norm instead of the exception.

----------------------------------------------------------------------

Comment By: Jim Jewett (jimjjewett)
Date: 2006-12-19 17:29

Message:
Logged In: YES 
user_id=764593
Originator: NO

FWIW, I'm not sure I agree on not specifying which methods call share
implementation.

If someone hooks __getitem__ but not get, that is just asking for bugs. 
(The implementation of get may -- but need not -- make its own call to
__getitem__, and not everyone will make the same decision.)

----------------------------------------------------------------------

Comment By: Jim Jewett (jimjjewett)
Date: 2006-12-19 17:26

Message:
Logged In: YES 
user_id=764593
Originator: NO

As I understand it, the problem is that dict.update is assuming any dict
subclass will use the same internal data representation.

Restricting the fast path to exactly builtin dicts (not subclasses) fixes
the bug, but makes the fallback more frequent.

The existing fallback is to call keys(), then iterate over it, retrieving
the value for each key.  (keys is required for a "minimal mapping" as
documented is UserDict, and a few other random places.)

The only potential dependency on other methods is his proposed new
intermediate path that avoids creating a list of all keys, by using
iterkeys if it exists.  (I suggested using iteritems to avoid the lookups.)
 If iter* aren't implemented, the only harm is falling back to the existing
fallback of "for k in keys():"


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2006-12-19 16:07

Message:
Logged In: YES 
user_id=80475
Originator: NO

I'm -1 on making ANY guarantees about which methods underlie others --
that would constitute new and everlasting guarantees about how mappings are
implemented.  Subclasses should explicity override/extend the methods
withed changed behavior.  If that proves non-trivial, then it is likely
there should be a has-a relationship instead of an is-a relationship. 
Also, it is likely that the subclass will have Liskov substitutability
violations.  Either way, there is probably a design flaw.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2006-12-19 14:23

Message:
Logged In: YES 
user_id=89016
Originator: YES

iteritems() has to create a new tuple for each item, so this might be
slower.

----------------------------------------------------------------------

Comment By: Jim Jewett (jimjjewett)
Date: 2006-12-19 12:50

Message:
Logged In: YES 
user_id=764593
Originator: NO

Why are you using iterkeys instead of iteritems?

It seems like if they've filled out the interface enough to have iterkeys,
they've probably filled it out all the way, and you do need the value as
soon as you get the key.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1615701&group_id=5470


More information about the Patches mailing list