
On Thu, Feb 28, 2013 at 1:16 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Thu, Feb 28, 2013 at 10:32 AM, Guido van Rossum <guido@python.org> wrote:
Hm. I write code regularly that takes a **kwargs dict and stores it into some other longer-lasting datastructure. Having that other datastructure suddenly receive OrderedDicts instead of plain dicts might definitely cause some confusion and possible performance issues.
Could you elaborate on what confusion it might cause?
Well, x.__class__ is different, repr(x) is different, ...
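To make the point above concrete, here is a minimal sketch of what a caller would see if a stored mapping were an OrderedDict rather than a plain dict. The variable names are illustrative only and do not come from the thread.

```python
from collections import OrderedDict

# Two mappings with the same contents but different types.
plain = dict(a=1, b=2)
ordered = OrderedDict([("a", 1), ("b", 2)])

# Comparing an OrderedDict with a plain dict ignores order and type.
same_contents = plain == ordered

# Code that inspects the type or the repr does see a difference.
same_class = type(plain) is type(ordered)
same_repr = repr(plain) == repr(ordered)

# OrderedDict is still a dict subclass, so isinstance checks keep passing.
still_a_dict = isinstance(ordered, dict)
```

Code that only relies on the mapping interface is unaffected. Code that checks `type(x) is dict`, or logs or compares `repr(x)`, is where the confusion would show up.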
As to performance relative to dict, this has definitely been my primary concern. I agree that the impact has to be insignificant for the **kwargs proposal to go anywhere. Obviously OrderedDict in C had better be an improvement over the pure Python version or there isn't much point. However, it goes beyond that in the cases where we might replace current uses of dict with OrderedDict.
What happens to the performance if I insert many thousands (or millions) of items into an OrderedDict? What happens to the space it uses? The thing is, what started out as OrderedDict stays one, but its lifetime may be long and the assumptions around dict performance are rather extreme (or we wouldn't be redoing the implementation regularly).
My plan has been to do a bunch of performance comparisons once the implementation is complete and tune it as much as possible with an eye toward the main potential internal use cases. From my point of view, those are **kwargs and class namespace. This is in part why I've brought those two up.
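A rough sketch of the kind of comparison being discussed might look like the following. It is not the benchmark from the thread. The helper names `fill` and `compare` are hypothetical, and the numbers it produces depend on the interpreter version and machine.

```python
import sys
import timeit
from collections import OrderedDict


def fill(mapping_type, n):
    """Insert n integer keys into a fresh mapping of the given type."""
    m = mapping_type()
    for i in range(n):
        m[i] = i
    return m


def compare(n=100_000, repeat=3):
    """Return {type name: (best insertion time in seconds, shallow size in bytes)}."""
    results = {}
    for mapping_type in (dict, OrderedDict):
        # Best of several runs, to reduce noise from other processes.
        seconds = min(
            timeit.repeat(lambda: fill(mapping_type, n), number=1, repeat=repeat)
        )
        # sys.getsizeof reports only the container itself, not keys or values.
        size = sys.getsizeof(fill(mapping_type, n))
        results[mapping_type.__name__] = (seconds, size)
    return results
```

Calling `compare()` with a small `n` (typical for **kwargs) and a large `n` (a long-lived mapping that keeps growing) covers both ends of the concern raised above.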
I'm fine with doing this by default for a class namespace; the type of cls.__dict__ is already a non-dict (it's a proxy) and it's unlikely to have 100,000 entries. For **kwds I'm pretty concerned; the use cases seem flimsy.
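For the class-namespace case, definition order can already be captured on request with a metaclass `__prepare__` hook. The sketch below illustrates that existing mechanism and is not the proposal itself. `OrderedMeta`, `Example`, and `_definition_order` are hypothetical names.

```python
import types
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that records the order of class attributes."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # The class body is executed with this mapping as its namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Keep the names in definition order, skipping dunder entries.
        cls._definition_order = [k for k in namespace if not k.startswith("__")]
        return cls


class Example(metaclass=OrderedMeta):
    x = 1

    def method(self):
        pass

    y = 2


# cls.__dict__ is a read-only proxy rather than a plain dict.
is_proxy = isinstance(Example.__dict__, types.MappingProxyType)
order = Example._definition_order
```

Because `cls.__dict__` is exposed through a proxy, callers never receive the namespace object itself. That is part of why a different underlying type is less visible for classes than it would be for **kwargs.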
For instance, one guidepost I've used is that typically **kwargs is going to be small. However, even for large kwargs I don't want any performance difference to be a problem.
So, in my use case, the kwargs is small, but the object may live a long and productive life after the function call is only a faint memory, and it might grow dramatically. IOW I have very high standards for backwards compatibility here.
And I don't recall ever having wanted to know the order of the kwargs in the call.
Though it may sound a little odd, the first use case I bumped into was with OrderedDict itself:
OrderedDict(a=1, b=2, c=3)
But because of the self-referentiality, this doesn't prove anything. :-)
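For comparison, a form of the constructor that does not depend on how keyword arguments are collected is to pass an iterable of pairs. This is a minimal sketch with illustrative variable names.

```python
from collections import OrderedDict

# A list of (key, value) pairs has a well-defined order of its own.
pairs = [("a", 1), ("b", 2), ("c", 3)]
od = OrderedDict(pairs)

# Iteration follows insertion order, which is the order of the pairs.
keys = list(od)
```

The keyword form is shorter to write, which is the convenience the use case above is after.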
There were a few other reasonable use cases mentioned in other threads a while back. I'll dig them up if that would help.
It would.
But even apart from that backwards incompatibility I think this feature is too rarely useful to make it the default behavior -- if anything I want **kwargs to become faster!
You mean something like possibly not unpacking the **kwargs in a call into another dict?
    def f(**kwargs):
        return kwargs

    d = {'a': 1, 'b': 2}
    assert d is f(**d)
Certainly faster, though definitely a semantic difference.
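The semantic difference can be seen by checking what happens today. The sketch below reuses `f` and `d` from the snippet above and shows that a call with `**d` builds a new dict, so the callee can change its kwargs without affecting the caller's mapping.

```python
def f(**kwargs):
    return kwargs


d = {"a": 1, "b": 2}
result = f(**d)

# The callee receives a fresh dict, not the caller's object.
is_same_object = result is d

# Mutating the returned dict leaves the original untouched.
result["c"] = 3
caller_unchanged = "c" not in d
```

If the unpacking step were skipped, `result` and `d` would be the same object and the mutation would leak back to the caller.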
No, that would introduce nasty aliasing problems in some cases. I've actually written code that depends on the copy being made here. (Yesterday, actually. :-)

--
--Guido van Rossum (python.org/~guido)