[Python-ideas] OrderedDict for kwargs and class statement namespace

Thu Feb 28 22:28:15 CET 2013

On Thu, Feb 28, 2013 at 1:16 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Thu, Feb 28, 2013 at 10:32 AM, Guido van Rossum <guido at python.org> wrote:
>> Hm. I write code regularly that takes a **kwargs dict and stores it
>> into some other longer-lasting data structure. Having that other
>> data structure suddenly receive OrderedDicts instead of plain dicts
>> could well cause some confusion and possible performance issues.
>
> Could you elaborate on what confusion it might cause?

Well, x.__class__ is different, repr(x) is different, ...
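
A minimal sketch of the kind of difference meant here, assuming the
standard-library collections.OrderedDict. The helper name
observable_differences is hypothetical and only for illustration:

```python
from collections import OrderedDict


def observable_differences(plain, ordered):
    """Report which observable properties differ between two mappings."""
    return {
        # Code that checks the exact type sees a different class.
        "class_differs": plain.__class__ is not ordered.__class__,
        # Logs, doctests and debugging output see a different repr.
        "repr_differs": repr(plain) != repr(ordered),
        # Equality against a plain dict still ignores ordering.
        "still_equal": plain == ordered,
    }


plain = {"a": 1, "b": 2}
ordered = OrderedDict([("a", 1), ("b", 2)])
diff = observable_differences(plain, ordered)
print(diff)
```

So code that only compares contents would not notice, while anything
that inspects the type or prints the object would.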

> As to performance relative to dict, this has definitely been my
> primary concern.  I agree that the impact has to be insignificant for
> the **kwargs proposal to go anywhere.  Obviously OrderedDict in C had
> better be an improvement over the pure Python version or there isn't
> much point.  However, it goes beyond that in the cases where we might
> replace current uses of dict with OrderedDict.

What happens to the performance if I insert many thousands (or
millions) of items into an OrderedDict? What happens to the space it
uses? The thing is, what started out as an OrderedDict stays one, but
its lifetime may be long and the assumptions around dict performance
are rather extreme (or we wouldn't be redoing the implementation
regularly).
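
One rough way to put numbers on these two questions is sketched below.
It is only an illustration, not a proper benchmark: the helper name
build_and_measure and the item count are assumptions, and
sys.getsizeof is CPython-specific and counts the container only, not
the keys and values it holds.

```python
import sys
import time
from collections import OrderedDict


def build_and_measure(factory, n):
    """Insert n integer keys into a fresh mapping; return (seconds, bytes)."""
    mapping = factory()
    start = time.perf_counter()
    for i in range(n):
        mapping[i] = None
    elapsed = time.perf_counter() - start
    # Size of the container itself, as reported by the interpreter.
    return elapsed, sys.getsizeof(mapping)


results = {
    factory.__name__: build_and_measure(factory, 100_000)
    for factory in (dict, OrderedDict)
}
for name, (seconds, size) in results.items():
    print(f"{name:12s} {seconds:.4f}s {size} bytes")
```

The figures depend on the interpreter version and the machine, so they
are best read as a relative comparison between the two types.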

> My plan has been to do a bunch of performance comparison once the
> implementation is complete and tune it as much as possible with an eye
> toward the main potential internal use cases.  From my point of view,
> those are **kwargs and class namespace.  This is in part why I've
> brought those two up.

I'm fine with doing this by default for a class namespace; the type of
cls.__dict__ is already a non-dict (it's a proxy) and it's unlikely to
have 100,000 entries.
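
To illustrate both halves of that point, here is a small sketch. It
assumes Python 3.3 or later (for types.MappingProxyType), and the
metaclass OrderedMeta with its _definition_order attribute is a
hypothetical name used only for this example, not part of the proposal:

```python
import types
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that records the class body's definition order."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # The mapping returned here is used as the class-body namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Keep only the user-defined names, in the order they were bound.
        cls._definition_order = [k for k in namespace if not k.startswith("__")]
        return cls


class Example(metaclass=OrderedMeta):
    b = 1
    a = 2


# The class __dict__ is a read-only proxy, not a plain dict.
is_plain_dict = isinstance(Example.__dict__, dict)
is_proxy = isinstance(Example.__dict__, types.MappingProxyType)
print(is_plain_dict, is_proxy, Example._definition_order)
```

Because callers already get a proxy rather than a dict, changing what
sits behind it is much less visible than changing the type of **kwargs.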

For **kwargs I'm pretty concerned; the use cases seem flimsy.

> For instance, one guidepost I've used is that typically **kwargs is
> going to be small.  However, even for large kwargs I don't want any
> performance difference to be a problem.

So, in my use case, the kwargs is small, but the object may live a
long and productive life after the function call is only a faint
memory, and it might grow dramatically.

IOW I have very high standards for backwards compatibility here.

>> And I don't recall ever having wanted to know the order of the kwargs
>> in the call.
>
> Though it may sound a little odd, the first use case I bumped into was
> with OrderedDict itself:
>
>     OrderedDict(a=1, b=2, c=3)

But because of the self-referentiality, this doesn't prove anything. :-)

> There were a few other reasonable use cases mentioned in other threads
> a while back.  I'll dig them up if that would help.

It would.

>> But even apart from that backwards incompatibility I think this
>> feature is too rarely useful to make it the default behavior -- if
>> anything I want **kwargs to become faster!
>
> You mean something like possibly not unpacking the **kwargs in a call
> into another dict?
>
>     def f(**kwargs):
>         return kwargs
>     d = {'a': 1, 'b': 2}
>     assert d is f(**d)
>
> Certainly faster, though definitely a semantic difference.

No, that would introduce nasty aliasing problems in some cases. I've
actually written code that depends on the copy being made here.
(Yesterday, actually. :-)
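
A sketch of the sort of code that relies on the copy (the function
remember and the registry list are hypothetical names for
illustration):

```python
def remember(registry, **kwargs):
    """Store the keyword arguments for later use and return them."""
    registry.append(kwargs)
    return kwargs


options = {"a": 1, "b": 2}
registry = []
stored = remember(registry, **options)

# The caller keeps mutating its own dict after the call.
options["a"] = 99

# Because ** unpacking builds a new dict, the stored copy is unaffected.
print(stored is options, registry[0]["a"])
```

If the call passed the caller's dict through unchanged, the stored
entry would silently change to 99 as well.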

-- 
--Guido van Rossum (python.org/~guido)