Re: [Python-ideas] OrderedDict for kwargs and class statement namespace

Feb. 28, 2013

On Thu, Feb 28, 2013 at 1:16 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
...
On Thu, Feb 28, 2013 at 10:32 AM, Guido van Rossum <guido@python.org> wrote:
...
Hm. I write code regularly that takes a **kwargs dict and stores it
into some other longer-lasting datastructure. Having that other
datastructure suddenly receive OrderedDicts instead of plain dicts
might definitely cause some confusion and possible performance issues.
Could you elaborate on what confusion it might cause?
Well, x.__class__ is different, repr(x) is different, ...
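
The two differences named here are easy to see side by side. A minimal sketch, assuming a function had received its keyword arguments as an OrderedDict (simulated by building one by hand; `plain` and `ordered` are illustrative names, not anything from the proposal):

```python
from collections import OrderedDict

plain = {"a": 1, "b": 2}
ordered = OrderedDict([("a", 1), ("b", 2)])

# The contents compare equal, but the class differs ...
assert plain == ordered
assert type(plain) is dict
assert type(ordered) is OrderedDict

# ... and so does the repr, which shows up in logs, doctests and debug output.
print(repr(plain))
print(repr(ordered))

# isinstance() checks still pass, but exact-type checks do not.
assert isinstance(ordered, dict)
assert type(ordered) is not dict
```

Code that stores the mapping and later compares reprs or exact types would notice the change, even though ordinary lookups behave the same.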
...
As to performance relative to dict, this has definitely been my
primary concern.  I agree that the impact has to be insignificant for
the **kwargs proposal to go anywhere.  Obviously OrderedDict in C had
better be an improvement over the pure Python version or there isn't
much point.  However, it goes beyond that in the cases where we might
replace current uses of dict with OrderedDict.
What happens to the performance if I insert many thousands (or
millions) of items to an OrderedDict? What happens to the space it
uses? The thing is, what started out as OrderedDict stays one, but its
lifetime may be long and the assumptions around dict performance are
rather extreme (or we wouldn't be redoing the implementation
regularly).
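
One rough way to look at these two questions is to time bulk insertion and check the container's own size for both types. This is only a sketch: the helper `fill` and the size `N` are made up for illustration, and the numbers depend entirely on the interpreter and version used.

```python
import sys
import timeit
from collections import OrderedDict


def fill(mapping_type, n):
    """Insert n integer keys into a fresh mapping and return it."""
    m = mapping_type()
    for i in range(n):
        m[i] = i
    return m


N = 100_000  # illustrative size, not a figure from the thread

for mapping_type in (dict, OrderedDict):
    seconds = timeit.timeit(lambda: fill(mapping_type, N), number=5)
    # sys.getsizeof reports only the container's own footprint,
    # not the memory used by the keys and values it refers to.
    size = sys.getsizeof(fill(mapping_type, N))
    print(f"{mapping_type.__name__:12} {seconds:.3f}s for 5 fills, {size} bytes")
```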
...
My plan has been to do a bunch of performance comparison once the
implementation is complete and tune it as much as possible with an eye
toward the main potential internal use cases.  From my point of view,
those are **kwargs and class namespace.  This is in part why I've
brought those two up.
I'm fine with doing this by default for a class namespace; the type of
cls.__dict__ is already a non-dict (it's a proxy) and it's unlikely to
have 100,000 entries.
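
Both points can be checked directly. The sketch below first shows that cls.__dict__ is a proxy rather than a dict, then uses the existing opt-in hook, a metaclass `__prepare__`, to get an ordered class namespace. `OrderedMeta`, `Fields` and `_definition_order` are hypothetical names chosen for the example; the proposal discussed here would make the ordered namespace the default instead of requiring a metaclass.

```python
import types
from collections import OrderedDict


class Plain:
    x = 1


# cls.__dict__ is a read-only proxy object, not a dict.
assert isinstance(Plain.__dict__, types.MappingProxyType)
assert not isinstance(Plain.__dict__, dict)


class OrderedMeta(type):
    """Hypothetical metaclass that records the order of names in the class body."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # The class body is executed with this mapping as its namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Keep only the user-defined names, in definition order.
        cls._definition_order = [k for k in namespace if not k.startswith("__")]
        return cls


class Fields(metaclass=OrderedMeta):
    first = 1
    second = 2
    third = 3


print(Fields._definition_order)
```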

For **kwds I'm pretty concerned; the use cases seem flimsy.
...
For instance, one guidepost I've used is that typically **kwargs is
going to be small.  However, even for large kwargs I don't want any
performance difference to be a problem.
So, in my use case, the kwargs is small, but the object may live a
long and productive life after the function call is only a faint
memory, and it might grow dramatically.

IOW I have very high standards for backwards compatibility here.
...
...
And I don't recall ever having wanted to know the order of the kwargs
in the call.
Though it may sound a little odd, the first use case I bumped into was
with OrderedDict itself:
OrderedDict(a=1, b=2, c=3)
But because of the self-referentiality, this doesn't prove anything. :-)
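
For context, the difficulty behind this use case can be sketched as follows. Keyword arguments reach the OrderedDict constructor through a **kwargs mapping, so the constructor only sees whatever order that mapping provides; passing a sequence of pairs states the order explicitly. The variable names are illustrative only.

```python
from collections import OrderedDict

# A sequence of pairs carries its order into the constructor explicitly.
from_pairs = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
assert list(from_pairs) == ["a", "b", "c"]

# Keyword arguments pass through a **kwargs mapping first, so the
# resulting order is whatever that mapping happens to provide.
from_kwargs = OrderedDict(a=1, b=2, c=3)
print(list(from_kwargs))

# The contents are the same either way; only the ordering guarantee differs.
assert from_pairs == dict(from_kwargs)
```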
...
There were a few other reasonable use cases mentioned in other threads
a while back.  I'll dig them up if that would help.
It would.
...
...
But even apart from that backwards incompatibility I think this
feature is too rarely useful to make it the default behavior -- if
anything I want **kwargs to become faster!
You mean something like possibly not unpacking the **kwargs in a call
into another dict?
    def f(**kwargs):
        return kwargs

    d = {'a': 1, 'b': 2}
    assert d is f(**d)
Certainly faster, though definitely a semantic difference.
No, that would introduce nasty aliasing problems in some cases. I've
actually written code that depends on the copy being made here.
(Yesterday, actually. :-)
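
The copy in question can be demonstrated with a small sketch (the key `added_by_f` is made up for illustration). Because the call builds a fresh mapping, the callee can mutate its kwargs without touching the caller's dict:

```python
def f(**kwargs):
    kwargs["added_by_f"] = True  # mutate the callee's own mapping
    return kwargs


d = {"a": 1, "b": 2}
result = f(**d)

# The callee received a new mapping, not the caller's object ...
assert result is not d
# ... so the caller's dict is unchanged by the mutation inside f.
assert "added_by_f" not in d
assert result == {"a": 1, "b": 2, "added_by_f": True}
```

If the call passed `d` through without copying, the second assertion would fail, which is the aliasing problem described above.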

-- 
--Guido van Rossum (python.org/~guido)