OrderedDict for kwargs and class statement namespace

Just wanted to put out some feelers for the feasibility of these two features:

* have the **kwargs param be an OrderedDict rather than a dict
* have class definitions happen relative to an OrderedDict by default rather than a dict, and still overridable by a metaclass's __prepare__().

Both of these will need OrderedDict in C, which is getting close (issue #16991).

-eric

On Thu, 28 Feb 2013 03:15:14 -0700, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Rather than just feasibility, I would like performance not to regress here.
Both of these will need OrderedDict in C, which is getting close (issue #16991).
Really? Last time I looked, it wasn't getting really close. Regards Antoine.

On Thu, 28 Feb 2013 07:55:03 -0700, Eric Snow <ericsnowcurrently@gmail.com> wrote:
And it also has to be reviewed in depth. To quote you: "The memory-related issues are pushing well past my experience". Regards Antoine.

On Thu, Feb 28, 2013 at 8:14 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Agreed and I appreciate your concern, genuinely. I won't apologize for not having experience in various areas, though I recognize the extra caution it requires. However, I will continue to take opportunities to expand my experience--particularly working on things that others have considered to be good ideas but which no one has advanced. I will continue to do this even if it's slow going and even if my effort eventually bears no fruit other than the experience of having walked that path. Ultimately my goal is to be confident that my fellow stewards feel my contributions are helping Python get better. With OrderedDict, I have no illusions of getting everything done quickly, but do feel that the bulk of the coding is wrapping up. I suppose that in more experienced hands it would be done quickly, but I'm not asking for that. Rather, I want to get a sense of the applicability of OrderedDict to Python's internals since it would be available as a built-in type. I've presented what I consider as two useful internal applications but would certainly like to know what you think. -eric

On Thu, 28 Feb 2013 16:41:28 -0700, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Well, the OrderedDict constructor is quite a strong use case, as you pointed out (and the only one I can think of :-)). Still, in an aesthetical sense, I like the idea of the Python dict being a pure unordered hash table. Ordered dicts are good for some use cases (I do use them too), but it sounds a bit wrong to make them the first-class mapping type; perhaps because it would feel like PHP :-) Regards Antoine.

On Thu, Feb 28, 2013 at 4:27 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I can appreciate not wanting to see "Python 3.4 killed our performance" articles all over news.ycombinator.com. But isn't it worth it to have a (small, but acceptable) performance hit for the sake of opening up use cases which are not currently possible in Python without ugly hacks? For an example of the "recommended" way to get the ordering of your class attributes: http://stackoverflow.com/questions/3288107/how-can-i-get-fields-in-an-origin... It seems to me that the "right thing" for python to do when given an ordered list of key=value pairs in a function call or class definition, is to retain the order. So what's an acceptable level of performance regression for the sake of doing things the "right way" here?

On Thu, 28 Feb 2013 09:30:50 -0600, Don Spaulding <donspauldingii@gmail.com> wrote:
This is already possible with the __prepare__ magic method. http://docs.python.org/3.4/reference/datamodel.html#preparing-the-class-name...
Or, rather, what is the benefit of doing things "the right way"? There are incredibly few cases for relying on the order of key=value pairs in function calls. Regards Antoine.
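For readers unfamiliar with the __prepare__ hook mentioned above, a minimal sketch of recording class attribute order with it might look like the following. The names OrderedMeta, Form and _field_order are made up for illustration and do not come from the thread.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that remembers the order of class attributes."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # The class body is executed against whatever mapping is returned here.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwds)
        # Record the definition order, skipping dunder entries such as __module__.
        cls._field_order = [key for key in namespace if not key.startswith("__")]
        return cls


class Form(metaclass=OrderedMeta):
    name = "text"
    email = "text"
    age = "int"


print(Form._field_order)
```

This works, but it illustrates the cost discussed in the thread: every class that wants ordering has to opt in to a metaclass.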

On Thu, Feb 28, 2013 at 9:48 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Sure. Case in point: Django has been working around it since at least python 2.4.
"If you build it, they will come..." When I originally encountered the need for python to retain the order of kwargs that my caller specified, it surprised me that there wasn't more clamoring for kwargs being an OrderedDict. However, since my development timeline didn't allow for holding the project up while a patch was pushed through python-dev and out into a real python release, I sucked it up, forced my calling code to send in hand-crafted OrderedDicts and called it a day. I think most developers don't even stop to think that the language *could* be different, they just put in the workaround and move on. I think if python stopped dropping the order of kwargs on the floor today, you'd see people start to rely on the order of kwargs tomorrow.

On 2/28/2013 11:41 AM, Ethan Furman wrote:
Could you advance the discussion by elaborating your use case? I've never had need for ordered kwargs, so I'm having a hard time seeing how they would be useful. --Ned.

On 02/28/2013 01:51 PM, Ned Batchelder wrote:
I no longer remember my original use-case, but currently I'm working on a command-line parser (I know, there are already plenty -- it's a learning experience) with multiple subcommands, and the order of the subcommands can make a difference. -- ~Ethan~

Hm. I write code regularly that takes a **kwargs dict and stores it into some other longer-lasting datastructure. Having that other datastructure suddenly receive OrderedDicts instead of plain dicts might definitely cause some confusion and possible performance issues. And I don't recall ever having wanted to know the order of the kwargs in the call. But even apart from that backwards incompatibility I think this feature is too rarely useful to make it the default behavior -- if anything I want **kwargs to become faster! On Thu, Feb 28, 2013 at 2:15 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 1 Mar 2013 03:34, "Guido van Rossum" <guido@python.org> wrote:
And PEP 422 is designed to make it easier to share a common __prepare__ method with different post processing. Cheers, Nick.
On Thu, Feb 28, 2013 at 2:15 AM, Eric Snow <ericsnowcurrently@gmail.com>
wrote:

On Thu, Feb 28, 2013 at 11:02 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
And PEP 422 is designed to make it easier to share a common __prepare__ method with different post processing.
A major use case for __prepare__() is to have the class definition use an OrderedDict. It's even a point of discussion in PEP 3115. While I agree with the conclusion of the PEP that using only OrderedDict is inferior to __prepare__(), I also think defaulting to OrderedDict is viable and useful. Using __prepare__() necessitates the use of a metaclass, which most people consider black magic. Even when you can inherit from a class that has a custom metaclass (like collections.abc.ABC), it still necessitates inheritance and the chance for metaclass conflicts. While I'm on board with PEP 422, I'm not clear on how it helps here. If class namespaces were ordered by default then class decorators, which are typically much easier to comprehend, would steal yet another use case from metaclasses. The recent discussions on enums got me thinking about this. In some cases you want your class customization logic to be inherited. That's where you've got to use a metaclass. In other cases you don't. Decorators work great for that. If your class decorator needs to have ordering information, then you also have to use a metaclass. Having OrderedDict be the default class namespace would address that. -eric

On Fri, Mar 1, 2013 at 7:37 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
(catching up after moving house, haven't read the whole thread)

PEP 422 would make it useful to also add a "namespace" meta argument to type.__prepare__ to give it a namespace instance to return. Then, all uses of such an OrderedDict-based metaclass can be replaced by:

    class MyClass(namespace=OrderedDict()):
        @classmethod
        def __init_class__(cls):
            # The class namespace is the one we passed in!

You could pass in a factory function instead, but I think that's a net loss for readability (you would lose the trailing "()" from the empty namespace case, but have to add "lambda:" or "functools.partial" to the prepopulated namespace case).

Even if type wasn't modified, you could create your own metaclass that accepted a namespace and returned it:

    class UseNamespace(type):
        def __prepare__(cls, namespace):
            return namespace

    class MyClass(metaclass=UseNamespace, namespace=OrderedDict()):
        @classmethod
        def __init_class__(cls):
            # The class namespace is the one we passed in!

I prefer the approach of adding the "namespace" argument to PEP 422, though, since it makes __init_class__ a far more powerful and compelling idea, and between them the two ideas should cover every metaclass use case that *only* customises creation rather than ongoing behaviour. I actually had this idea a week or so ago, but packaging discussions and moving meant I had postponed writing it up and posting it.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Mar 3, 2013 at 12:46 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Oops, that signature is incorrect. Assume it's tweaked appropriately to accept the normal __prepare__ arguments and still retrieve the "namespace" setting from the class header. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
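One way the tweaked signature could look is sketched below. This is an assumption about what Nick had in mind, not code from the thread: __prepare__ takes the usual (name, bases) arguments plus a "namespace" keyword, and __new__ and __init__ swallow that keyword so it is not passed on to type. The sketch leaves out __init_class__, which PEP 422 only proposed.

```python
from collections import OrderedDict


class UseNamespace(type):
    """Sketch: let the class header supply the mapping the body runs in."""

    @classmethod
    def __prepare__(mcls, name, bases, namespace=None, **kwds):
        # Return the caller-supplied mapping, or a plain dict if none was given.
        return {} if namespace is None else namespace

    def __new__(mcls, name, bases, ns, namespace=None, **kwds):
        # Drop the "namespace" keyword before delegating to type.__new__.
        return super().__new__(mcls, name, bases, dict(ns), **kwds)

    def __init__(cls, name, bases, ns, namespace=None, **kwds):
        super().__init__(name, bases, ns)


shared = OrderedDict()


class MyClass(metaclass=UseNamespace, namespace=shared):
    b = 1
    a = 2


# The mapping we passed in was filled by the class body, in definition order.
defined = [key for key in shared if not key.startswith("__")]
print(defined)
```

Keeping a reference to the mapping (here `shared`) is what lets code outside the metaclass see the definition order afterwards.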

On Thu, Feb 28, 2013 at 10:32 AM, Guido van Rossum <guido@python.org> wrote:
Could you elaborate on what confusion it might cause? As to performance relative to dict, this has definitely been my primary concern. I agree that the impact has to be insignificant for the **kwargs proposal to go anywhere. Obviously OrderedDict in C had better be an improvement over the pure Python version or there isn't much point. However, it goes beyond that in the cases where we might replace current uses of dict with OrderedDict. My plan has been to do a bunch of performance comparison once the implementation is complete and tune it as much as possible with an eye toward the main potential internal use cases. From my point of view, those are **kwargs and class namespace. This is in part why I've brought those two up. For instance, one guidepost I've used is that typically **kwargs is going to be small. However, even for large kwargs I don't want any performance difference to be a problem.
And I don't recall ever having wanted to know the order of the kwargs in the call.
Though it may sound a little odd, the first use case I bumped into was with OrderedDict itself:

    OrderedDict(a=1, b=2, c=3)

There were a few other reasonable use cases mentioned in other threads a while back. I'll dig them up if that would help.
You mean something like possibly not unpacking the **kwargs in a call into another dict?

    def f(**kwargs):
        return kwargs

    d = {'a': 1, 'b': 2}
    assert d is f(**d)

Certainly faster, though definitely a semantic difference. -eric
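For contrast with the hypothetical `assert d is f(**d)` above, here is a small sketch of the existing behaviour: the interpreter packs keyword arguments into a new dict on every call, so the function never sees the caller's object. The variable names are illustrative only.

```python
def f(**kwargs):
    return kwargs


d = {"a": 1, "b": 2}
received = f(**d)

same_contents = received == d   # the copy holds the same items
same_object = received is d     # but it is a different dict object

# Mutating the copy inside or after the call does not affect the caller's dict.
received["c"] = 3
caller_untouched = "c" not in d

print(same_contents, same_object, caller_untouched)
```

Skipping the copy would save work on each call, but any function that mutates or stores its kwargs would then alias the caller's dict.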

On Thu, Feb 28, 2013 at 1:16 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Well, x.__class__ is different, repr(x) is different, ...
What happens to the performance if I insert many thousands (or millions) of items to an OrderedDict? What happens to the space it uses? The thing is, what started out as OrderedDict stays one, but its lifetime may be long and the assumptions around dict performance are rather extreme (or we wouldn't be redoing the implementation regularly).
I'm fine with doing this by default for a class namespace; the type of cls.__dict__ is already a non-dict (it's a proxy) and it's unlikely to have 100,000 entries. For **kwds I'm pretty concerned; the use cases seem flimsy.
So, in my use case, the kwargs is small, but the object may live a long and productive life after the function call is only a faint memory, and it might grow dramatically. IOW I have very high standards for backwards compatibility here.
But because of the self-referentiality, this doesn't prove anything. :-)
There were a few other reasonable use cases mentioned in other threads a while back. I'll dig them up if that would help.
It would.
No, that would introduce nasty aliasing problems in some cases. I've actually written code that depends on the copy being made here. (Yesterday, actually. :-) -- --Guido van Rossum (python.org/~guido)
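The differences Guido lists at the start of this message (x.__class__, repr(x), and so on) can be checked directly. The sketch below only uses collections.OrderedDict as it exists in the standard library; the variable names are illustrative.

```python
from collections import OrderedDict

plain = {"a": 1, "b": 2}
ordered = OrderedDict([("a", 1), ("b", 2)])

type_differs = type(ordered) is not dict      # an exact type check now fails
still_a_dict = isinstance(ordered, dict)      # though isinstance still passes
repr_differs = repr(ordered) != repr(plain)   # output in logs and doctests changes
equal_to_plain = ordered == plain             # comparison with a dict ignores order

# Two OrderedDicts compare order-sensitively, unlike two plain dicts.
swapped = OrderedDict([("b", 2), ("a", 1)])
order_sensitive = ordered != swapped
plain_insensitive = plain == swapped

print(type_differs, still_a_dict, repr_differs, equal_to_plain,
      order_sensitive, plain_insensitive)
```

Code that stores kwargs and later compares, prints, or type-checks them is where such differences would surface.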

On Thu, Feb 28, 2013 at 2:28 PM, Guido van Rossum <guido@python.org> wrote:
<snip/>
Good point. My own use of **kwargs rarely sees the object leave the function or get very big, and this aspect of it just hadn't come up to broaden my point of view. I'm glad we're having this discussion. My intuition is that such a use case would be pretty rare, but even then...
IOW I have very high standards for backwards compatibility here.
...Python has an exceptional standard in this regard. FWIW, I agree. I would want to be confident about the ramifications before we made any change like this.
Okay.
You can't blame me for trying to make **kwargs-as-OrderedDict seem like a great idea in comparison. <wink> -eric

On 28.02.2013 22:53, Eric Snow wrote:
I guess most function calls don't need the feature of ordered kwargs. Could we implement yet another prefix that turns unordered keyword arguments into ordered keyword arguments, e.g. ***ordkwargs (3 *) and METH_VARARGS|METH_KEYWORDS|METH_ORDERED PyMethodDef.ml_flags? That would allow ordered keyword arguments while keeping backward compatibility to existing programs. Only functions that ask for ordered kwargs would have to pay the minor performance penalty, too. I don't know if it's feasible or even possible. The interpreter would have to check the function's flags for each method call in order to decide if it has to create an ordinary dict or an OrderedDict. Christian

On Thu, Feb 28, 2013 at 2:28 PM, Guido van Rossum <guido@python.org> wrote:
Other than OrderedDict, I've only found one other thread that had meaningful references: http://mail.python.org/pipermail/python-dev/2012-December/123105.html I know there were at least a couple more. I'll keep digging. -eric

On 28.02.2013 11:15, Eric Snow wrote:
Raymond was/is working on a modification of the dict's internal data structure that reduces its memory consumption. IIRC the modification adds partial ordering as a side effect. The keys are ordered in insertion order until keys are removed. http://mail.python.org/pipermail/python-dev/2012-December/123028.html Christian
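To show why such a layout would yield insertion order as a side effect, here is a toy sketch of the general idea: a sparse index table pointing into a dense list of entries. It is not Raymond's code. The class name CompactDict, the linear probing, and the resize policy are all assumptions made for illustration, and deletion is left out entirely.

```python
class CompactDict:
    """Toy mapping: sparse table of positions plus a dense, append-only entry list."""

    def __init__(self):
        self._indices = [None] * 8   # sparse: slot -> position in self._entries
        self._entries = []           # dense: (key, value) pairs in insertion order

    def _find_slot(self, key):
        # Linear probing; the table size is always a power of two.
        mask = len(self._indices) - 1
        slot = hash(key) & mask
        while True:
            pos = self._indices[slot]
            if pos is None or self._entries[pos][0] == key:
                return slot
            slot = (slot + 1) & mask

    def __setitem__(self, key, value):
        slot = self._find_slot(key)
        pos = self._indices[slot]
        if pos is not None:
            # Existing key: overwrite in place, keeping its original position.
            self._entries[pos] = (key, value)
            return
        self._indices[slot] = len(self._entries)
        self._entries.append((key, value))
        # Grow once the table is two-thirds full so probing always terminates.
        if len(self._entries) * 3 >= len(self._indices) * 2:
            self._resize()

    def _resize(self):
        # Only the small index table is rebuilt; the entries stay where they are.
        self._indices = [None] * (len(self._indices) * 2)
        for pos, (key, _) in enumerate(self._entries):
            self._indices[self._find_slot(key)] = pos

    def __getitem__(self, key):
        pos = self._indices[self._find_slot(key)]
        if pos is None:
            raise KeyError(key)
        return self._entries[pos][1]

    def __iter__(self):
        # Iteration walks the dense list, hence insertion order.
        return (key for key, _ in self._entries)


d = CompactDict()
for key in ["z", "a", "m"] + list(range(20)):
    d[key] = str(key)
d["a"] = "updated"
print(list(d)[:3], d["a"], d[7])
```

Iteration follows the dense list, so order falls out for free as long as nothing is removed. How removals are handled decides whether that order survives, which is the caveat Christian mentions.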

On Thu, Feb 28, 2013 at 11:30 AM, Christian Heimes <christian@python.org> wrote:
While this is potentially convenient, in order for people to really be able to use it for these purposes it would have to be reliable behavior going forward. The question is therefore: do we actually want to explicitly make this behavior part of the dict API such that all future implementations of dict (and all implementations in any Python, not just CPython) must be guaranteed to also behave this way? I suspect the answer should be "no". On Thu, Feb 28, 2013 at 7:30 AM, Don Spaulding <donspauldingii@gmail.com> wrote:
So what's an acceptable level of performance regression for the sake of doing things the "right way" here?
For class declarations, I think a little performance hit might not be too big a deal, but those are really not the most important use case here, since there are already ways to work around this for classes if one wants to. However, for function calls, which I actually believe is the more important question, I think we would have to think very hard about anything that introduced any real performance overhead, because functions are typically called a _lot_. Adding even a small extra delay to function calls could impact the overall performance of some types of Python programs substantially. I also agree with Guido that there may be some unexpected consequences of suddenly changing the type of the kwargs parameter passed to functions which were coded assuming it would always be a plain-ol' dict. I think it might be doable, but we'd have to be very sure that OrderedDict is exactly compatible in every conceivable way. Perhaps instead we could compromise by keeping the default case as it is, but providing some mechanism for the function declaration to specify that it wants ordering information passed to it? I'm not sure what a good syntax for this would be, though ("(*args, ***okwargs)"? "(*args, **kwargs, *kworder)"? Not sure I'm really big on either of those, actually, but you get the idea.) --Alex

On Thu, Feb 28, 2013 at 2:32 PM, Alex Stewart <foogod@gmail.com> wrote:
This is also a good point applied to both **kwargs and class namespace. Before we did either we'd need to be clear on the impact on other Python implementors.
**kwargs-as-OrderedDict impacts performance in two ways: the performance of packing the "unbound" keyword arguments into the OrderedDict and the performance of OrderedDict in normal operation after it's handed off to the function. Otherwise you don't get a performance impact on functions.
For Python backward compatibility is a cornerstone. I'm surprised there isn't something in the Zen about it. <wink> For **kwargs the bar for compatibility is especially high and I agree with that.
**kwargs packing happens in the interpreter rather than in relationship to functionality provided by the function, so whatever the mechanism it would have to be something the interpreter consumes, either a new API/syntax for building functions or a new syntax like you've mentioned. This has come up before. Classes have metaclasses (and __prepare__). Modules have loaders. Poor, poor functions. Because of the same concerns you've already expressed regarding the criticality of function performance, they miss out on all sorts of fun--inside their highly optimized box looking out at the other types showing off their cool new features all the time. It just isn't fair. :) -eric

28.02.2013 11:15, Eric Snow wrote:
While having class namespace ordered sounds very nice, ordered **kwargs sometimes may not be a desirable feature -- especially when you want to keep a cooperatively used interface as simple as possible (because of the risk that keyword argument order could then be taken into consideration by *some* actors while others would still operate with the assumption that argument order cannot be known...).
Both of these will need OrderedDict in C, which is getting close (issue #16991).
Ad issue #16991: will the present semantics of inheriting from OrderedDict be kept untouched? (I mean: which methods call other methods => which methods you need to override etc.) If not we'll have a backward compatibility issue (maybe not very serious but still...). Cheers. *j

On Thu, Feb 28, 2013 at 3:51 PM, Jan Kaliszewski <zuo@chopin.edu.pl> wrote:
You mean like we had with dicts? Now that they are randomized things like docstrings started to break unexpectedly. With **kwargs the OrderedDict is created by the interpreter and passed to the called function. So the writer of the function is the only one in control of how the ordering is interpreted. Granted, an existing function might, as currently written, expose the ordering or even the kwargs. So that aspect has to be considered. However it would still remain in the complete control of the function how the ordering of **kwargs is exposed.
Any OrderedDict written in C must have identical semantics, including regarding subclassing. I've gone through my implementation on several occasions to check this and I'll probably do so again. Keep in mind that the unit tests for OrderedDict will be run against both the pure Python and C version (see PEP 399). That includes some tests regarding subclassing, though there could probably be a few more of those. Bottom line, if it doesn't quack it's not a duck we want. -eric

Le Thu, 28 Feb 2013 03:15:14 -0700, Eric Snow <ericsnowcurrently@gmail.com> a écrit :
Rather than just feasibility, I would like performance not to regress here.
Both of these will need OrderedDict in C, which is getting close (issue #16991).
Really? Last time I looked, it wasn't getting really close. Regards Antoine.

Le Thu, 28 Feb 2013 07:55:03 -0700, Eric Snow <ericsnowcurrently@gmail.com> a écrit :
And it also has to be reviewed in deep. To quote you: "The memory-related issues are pushing well past my experience". Regards Antoine.

On Thu, Feb 28, 2013 at 8:14 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Agreed and I appreciate your concern, genuinely. I won't apologize for not having experience in various areas, though I recognize the extra caution it requires. However, I will continue to take opportunities to expand my experience--particularly working on things that others have considered to be good ideas but which no one has advanced. I will continue to do this even if it's slow going and even if my effort eventually bears no fruit other than the experience of having walked that path. Ultimately my goal is to be confident that my fellow stewards feel my contributions are helping Python get better. With OrderedDict, I have no illusions of getting everything done quickly, but do feel that the the bulk of the coding is wrapping up. I suppose that in more experienced hands it would be done quickly, but I'm not asking for that. Rather, I want to get a sense of the applicability of OrderedDict to Python's internals since it would be available as a built-in type. I've presented what I consider as two useful internal applications but would certainly like to know what you think. -eric

Le Thu, 28 Feb 2013 16:41:28 -0700, Eric Snow <ericsnowcurrently@gmail.com> a écrit :
Well, the OrderedDict constructor is quite a strong use case, as you pointed out (and the only one I can think of :-)). Still, in an aesthetical sense, I like the idea of the Python dict being a pure unordered hash table. Ordered dicts are good for some use cases (I do use them too), but it sounds a bit wrong to make them the first-class mapping type; perhaps because it would feel like PHP :-) Regards Antoine.

On Thu, Feb 28, 2013 at 4:27 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I can appreciate not wanting to see "Python 3.4 killed our performance" articles all over news.ycombinator.com. But isn't it worth it to have a (small, but acceptable) performance hit for the sake of opening up use cases which are not currently possible in Python without ugly hacks? For an example of the "recommended" way to get the ordering of your class attributes: http://stackoverflow.com/questions/3288107/how-can-i-get-fields-in-an-origin... It seems to me that the "right thing" for python to do when given an ordered list of key=value pairs in a function call or class definition, is to retain the order. So what's an acceptable level of performance regression for the sake of doing things the "right way" here?

Le Thu, 28 Feb 2013 09:30:50 -0600, Don Spaulding <donspauldingii@gmail.com> a écrit :
This is already possible with the __prepare__ magic method. http://docs.python.org/3.4/reference/datamodel.html#preparing-the-class-name...
Or, rather, what is the benefit of doing things "the right way"? There are incredibly few cases for relying on the order of key=value pairs in function calls. Regards Antoine.

On Thu, Feb 28, 2013 at 9:48 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Sure. Case in point: Django has been working around it since at least python 2.4.
"If you build it, they will come..." When I originally encountered the need for python to retain the order of kwargs that my caller specified, it surprised me that there wasn't more clamoring for kwargs being an OrderedDict. However, since my development timeline didn't allow for holding the project up while a patch was pushed through python-dev and out into a real python release, I sucked it up, forced my calling code to send in hand-crafted OrderedDicts and called it a day. I think most developers don't even stop to think that the language *could* be different, they just put in the workaround and move on. I think if python stopped dropping the order of kwargs on the floor today, you'd see people start to rely on the order of kwargs tomorrow.

On 2/28/2013 11:41 AM, Ethan Furman wrote:
Could you advance the discussion by elaborating your use case? I've never had need for ordered kwargs, so I'm having a hard time seeing how they would be useful. --Ned.

On 02/28/2013 01:51 PM, Ned Batchelder wrote:
I no longer remember my original use-case, but currently I'm working on a command-line parser (I know, there are already plenty -- it's a learning experience) with multiple subcommands, and the order of the subcommands can make a difference. -- ~Ethan~

Hm. I write code regularly that takes a **kwargs dict and stores it into some other longer-lasting datastructure. Having that other datastructure suddenly receive OrderedDicts instead of plain dicts might definitely cause some confusion and possible performance issues. And I don't recall ever having wanted to know the order of the kwargs in the call. But even apart from that backwards incompatibility I think this feature is too rarely useful to make it the default behavior -- if anything I want **kwargs to become faster! On Thu, Feb 28, 2013 at 2:15 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 1 Mar 2013 03:34, "Guido van Rossum" <guido@python.org> wrote:
And PEP 422 is designed to make it easier to share a common __prepare__ method with different post processing. Cheers, Nick.
On Thu, Feb 28, 2013 at 2:15 AM, Eric Snow <ericsnowcurrently@gmail.com>
wrote:

On Thu, Feb 28, 2013 at 11:02 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
And PEP 422 is designed to make it easier to share a common __prepare__ method with different post processing.
A major use case for __prepare__() is to have the class definition use an OrderedDict. It's even a point of discussion in PEP 3115. While I agree with the the conclusion of the PEP that using only OrderedDict is inferior to __prepare__(), I also think defaulting to OrderedDict is viable and useful. Using __prepare__() necessitates the use of a metaclass, which most people consider black magic. Even when you can inherit from a class that has a custom metaclass (like collections.abc.ABC), it still necessitates inheritance and the chance for metaclass conflicts. While I'm on board with PEP 422, I'm not clear on how it helps here. If class namespaces were ordered by default then class decorators, which are typically much easier to comprehend, would steal yet another use case from metaclasses. The recent discussions on enums got me thinking about this. In some cases you want you class customization logic to be inherited. That's where you've got to use a metaclass. In other cases you don't. Decorators work great for that. If your class decorator needs to have ordering information, then you also have to use a metaclass. Having OrderedDict be the default class namespace would address that. -eric

On Fri, Mar 1, 2013 at 7:37 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
(catching up after moving house, haven't read the whole thread) PEP 422 would make it useful to also add a "namespace" meta argument to type.__prepare__ to give it a namespace instance to return. Then, all uses of such a OrderedDict based metaclass can be replaced by: class MyClass(namespace=OrderedDict()): @classmethod def __init_class__(cls): # The class namespace is the one we passed in! You could pass in a factory function instead, but I think that's a net loss for readability (you would lose the trailing "()" from the empty namespace case, but have to add "lambda:" or "functools.partial" to the prepopulated namespace case) Even if type wasn't modified, you could create your own metaclass that accepted a namespace and returned it: class UseNamespace(type): def __prepare__(cls, namespace): return namespace class MyClass(metaclass=UseNamespace, namespace=OrderedDict()) @classmethod def __init_class__(cls): # The class namespace is the one we passed in! I prefer the approach of adding the "namespace" argument to PEP 422, though, since it makes __init_class__ a far more powerful and compelling idea, and between them the two ideas should cover every metaclass use case that *only* customises creation rather than ongoing behaviour. I actually had this idea a week or so ago, but packaging discussions and moving meant I had postponed writing it up and posting it. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Mar 3, 2013 at 12:46 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Oops, that signature is incorrect. Assume it's tweaked appropriately to accept the normal __prepare__ arguments and still retrieve the "namespace" setting from the class header. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Feb 28, 2013 at 10:32 AM, Guido van Rossum <guido@python.org> wrote:
Could you elaborate on what confusion it might cause? As to performance relative to dict, this has definitely been my primary concern. I agree that the impact has to be insignificant for the **kwargs proposal to go anywhere. Obviously OrderedDict in C had better be an improvement over the pure Python version or there isn't much point. However, it goes beyond that in the cases where we might replace current uses of dict with OrderedDict. My plan has been to do a bunch of performance comparison once the implementation is complete and tune it as much as possible with an eye toward the main potential internal use cases. From my point of view, those are **kwargs and class namespace. This is in part why I've brought those two up. For instance, one guidepost I've used is that typically **kwargs is going to be small. However, even for large kwargs I don't want any performance difference to be a problem.
And I don't recall ever having wanted to know the order of the kwargs in the call.
Though it may sound a little odd, the first use case I bumped into was with OrderedDict itself: OrderedDict(a=1, b=2, c=3) There were a few other reasonable use cases mentioned in other threads a while back. I'll dig them up if that would help.
You mean something like possibly not unpacking the **kwargs in a call into another dict? def f(**kwargs): return kwargs d = {'a': 1, 'b': 2} assert d is f(**d) Certainly faster, though definitely a semantic difference. -eric

On Thu, Feb 28, 2013 at 1:16 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Well, x.__class__ is different, repr(x) is different, ...
What happens to the performance if I insert many thousands (or millions) of items to an OrderedDict? What happens to the space it uses? The thing is, what started out as OrderedDict stays one, but its lifetime may be long and the assumptions around dict performance are rather extreme (or we wouldn't be redoing the implementation regularly).
I'm fine with doing this by default for a class namespace; the type of cls.__dict__ is already a non-dict (it's a proxy) and it's unlikely to have 100,000 entries. For **kwds I'm pretty concerned; the use cases seem flimsy.
So, in my use case, the kwargs is small, but the object may live a long and productive life after the function call is only a faint memory, and it might grow dramatically. IOW I have very high standards for backwards compatibility here.
But because of the self-referentiality, this doesn't prove anything. :-)
There were a few other reasonable use cases mentioned in other threads a while back. I'll dig them up if that would help.
It would.
No, that would introduce nasty aliasing problems in some cases. I've actually written code that depends on the copy being made here. (Yesterday, actually. :-) -- --Guido van Rossum (python.org/~guido)

On Thu, Feb 28, 2013 at 2:28 PM, Guido van Rossum <guido@python.org> wrote:
<snip/>
Good point. My own use of **kwargs rarely sees the object leave the function or get very big, and this aspect of it just hadn't come up to broaden my point of view. I'm glad we're having this discussion. My intuition is that such a use case would be pretty rare, but even then...
IOW I have very high standards for backwards compatibility here.
...Python has an exceptional standard in this regard. FWIW, I agree. I would want to be confident about the ramifications before we made any change like this.
Okay.
You can't blame me for trying to make **kwargs-as-OrderedDict seem like a great idea in comparison. <wink> -eric

Am 28.02.2013 22:53, schrieb Eric Snow:
I guess most function calls don't need the feature of ordered kwargs. Could we implement yet another prefix that turns unordered keyword arguments into ordered keyword arguments, e.g. ***ordkwargs (3 *) and METH_VARARGS|METH_KEYWORDS|METH_ORDERED PyMethodDef.ml_flags? That would allow ordered keyword arguments while keeping backward compatibility to existing programs. Only functions that ask for ordered kwargs would have to pay the minor performance penalty, too. I don't know if it's feasible or even possible. The interpreter would have to check the function's flags for each method call in order to decide if it has to create an ordinary dict or an OrderedDict. Christian

On Thu, Feb 28, 2013 at 2:28 PM, Guido van Rossum <guido@python.org> wrote:
Other than OrderedDict, I've only found one other thread that had meaningful references: http://mail.python.org/pipermail/python-dev/2012-December/123105.html I know there were at least a couple more. I'll keep digging. -eric

Am 28.02.2013 11:15, schrieb Eric Snow:
Raymond was/is working on a modification of the dict's internal data structure that reduces its memory consumption. IIRC the modification adds partial ordering as a side effect. The keys are ordered in insertion order until keys are removed. http://mail.python.org/pipermail/python-dev/2012-December/123028.html Christian
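A toy sketch of how such a layout can yield insertion order as a side effect, assuming the design is a sparse index table pointing into a dense, append-only entry list. The class name, the fixed table size, and the linear probing are simplifications for illustration, not the actual proposal; deletion and resizing are omitted.

```python
class CompactDict:
    """Toy layout: sparse index table plus dense entry list."""

    def __init__(self, size=8):
        self._indices = [None] * size  # slot -> position in _entries
        self._entries = []             # (hash, key, value), insertion order

    def _find(self, key, h):
        # Linear probing; returns (slot, position or None).
        size = len(self._indices)
        slot = h % size
        while True:
            pos = self._indices[slot]
            if pos is None:
                return slot, None
            entry_hash, entry_key, _ = self._entries[pos]
            if entry_hash == h and entry_key == key:
                return slot, pos
            slot = (slot + 1) % size

    def __setitem__(self, key, value):
        h = hash(key)
        slot, pos = self._find(key, h)
        if pos is not None:
            # Existing key: overwrite in place, position unchanged.
            self._entries[pos] = (h, key, value)
            return
        if len(self._entries) + 1 >= len(self._indices):
            raise OverflowError("toy table is full; resizing is omitted")
        self._indices[slot] = len(self._entries)
        self._entries.append((h, key, value))

    def __getitem__(self, key):
        _, pos = self._find(key, hash(key))
        if pos is None:
            raise KeyError(key)
        return self._entries[pos][2]

    def __iter__(self):
        # Iterating the dense list gives keys in insertion order.
        return (key for _, key, _ in self._entries)


d = CompactDict()
d["b"] = 1
d["a"] = 2
d["c"] = 3
d["a"] = 20  # update keeps the original position
print(list(d))
```

Because new entries are only ever appended, iteration order matches insertion order until something removes or moves an entry, which is the caveat noted above.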

On Thu, Feb 28, 2013 at 11:30 AM, Christian Heimes <christian@python.org> wrote:
While this is potentially convenient, in order for people to really be able to use it for these purposes it would have to be reliable behavior going forward. The question is therefore: do we actually want to explicitly make this behavior part of the dict API such that all future implementations of dict (and all implementations in any Python, not just CPython) must be guaranteed to also behave this way? I suspect the answer should be "no".
On Thu, Feb 28, 2013 at 7:30 AM, Don Spaulding <donspauldingii@gmail.com> wrote:
So what's an acceptable level of performance regression for the sake of doing things the "right way" here?
For class declarations, I think a little performance hit might not be too big a deal, but those are really not the most important use case here, since there are already ways to work around this for classes if one wants to. However, for function calls, which I actually believe is the more important question, I think we would have to think very hard about anything that introduced any real performance overhead, because functions are typically called a _lot_. Adding even a small extra delay to function calls could impact the overall performance of some types of Python programs substantially. I also agree with Guido that there may be some unexpected consequences of suddenly changing the type of the kwargs parameter passed to functions which were coded assuming it would always be a plain-ol' dict. I think it might be doable, but we'd have to be very sure that OrderedDict is exactly compatible in every conceivable way. Perhaps instead we could compromise by keeping the default case as it is, but providing some mechanism for the function declaration to specify that it wants ordering information passed to it? I'm not sure what a good syntax for this would be, though ("(*args, ***okwargs)"? "(*args, **kwargs, *kworder)"? Not sure I'm really big on either of those, actually, but you get the idea.) --Alex
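The opt-in idea can be loosely emulated in today's syntax, as a sketch only: the caller hands over ordered pairs, and a helper forwards them as an ordinary **kwargs dict plus a separate tuple of names. The names call_with_order, ordered_kwargs, and kworder are hypothetical and are not proposed API or syntax.

```python
def call_with_order(func, *args, ordered_kwargs=()):
    """Hypothetical helper: forward pairs as kwargs plus their order."""
    pairs = list(ordered_kwargs)
    kworder = tuple(name for name, _ in pairs)
    return func(*args, kworder=kworder, **dict(pairs))


def describe(*args, kworder=(), **kwargs):
    # Only a function that declares kworder sees ordering information;
    # kwargs itself stays a plain dict.
    return [(name, kwargs[name]) for name in kworder]


result = call_with_order(describe, ordered_kwargs=[("b", 2), ("a", 1)])
print(result)
```

Functions that do not ask for the order pay nothing extra, which is the property both suggested spellings are after; the cost is that the caller has to cooperate explicitly.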

On Thu, Feb 28, 2013 at 2:32 PM, Alex Stewart <foogod@gmail.com> wrote:
This is also a good point applied to both **kwargs and class namespace. Before we did either we'd need to be clear on the impact on other Python implementors.
**kwargs-as-OrderedDict impacts performance in two ways: the performance of packing the "unbound" keyword arguments into the OrderedDict and the performance of OrderedDict in normal operation after it's handed off to the function. Otherwise you don't get a performance impact on functions.
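A rough way to get a feel for the first of those costs is to time building each mapping from the same pairs. This is only a sketch: it measures constructor calls from Python code, not the interpreter's own argument-packing path, and the helper names and iteration counts are arbitrary.

```python
import timeit
from collections import OrderedDict

pairs = [("a", 1), ("b", 2), ("c", 3)]


def pack_dict():
    return dict(pairs)


def pack_ordered():
    return OrderedDict(pairs)


def measure(func, number=20000):
    # Best of three runs, in seconds for `number` calls.
    return min(timeit.repeat(func, number=number, repeat=3))


timings = {
    "dict": measure(pack_dict),
    "OrderedDict": measure(pack_ordered),
}
for name, seconds in timings.items():
    print(name, seconds)
```

The numbers depend heavily on the interpreter and on whether OrderedDict is implemented in C, so they are a starting point for discussion rather than a verdict.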
For Python backward compatibility is a cornerstone. I'm surprised there isn't something in the Zen about it. <wink> For **kwargs the bar for compatibility is especially high and I agree with that.
**kwargs packing happens in the interpreter rather than in relationship to functionality provided by the function, so whatever the mechanism it would have to be something the interpreter consumes, either a new API/syntax for building functions or a new syntax like you've mentioned. This has come up before. Classes have metaclasses (and __prepare__). Modules have loaders. Poor, poor functions. Because of the same concerns you've already expressed regarding the criticality of function performance, they miss out on all sorts of fun--inside their highly optimized box looking out at the other types showing off their cool new features all the time. It just isn't fair. :) -eric
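For comparison, this is roughly how a class can already opt in to an ordered namespace through a metaclass's __prepare__(). The metaclass name and the _definition_order attribute are made up for this sketch.

```python
from collections import OrderedDict


class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # The class body executes against whatever mapping is returned here.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Record the order in which non-dunder names were defined.
        cls._definition_order = [
            key for key in namespace if not key.startswith("__")
        ]
        return cls


class Point(metaclass=OrderedMeta):
    y = 0
    x = 1

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


print(Point._definition_order)
```

Functions have no equivalent hook, which is why any ordered-**kwargs mechanism would have to live in the interpreter or in new syntax.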

28.02.2013 11:15, Eric Snow wrote:
While having class namespace ordered sounds very nice, ordered **kwargs sometimes may not be a desirable feature -- especially when you want to keep a cooperatively used interface as simple as possible (because of the risk that keyword argument order could then be taken into consideration by *some* actors while others would still operate with the assumption that argument order cannot be known...).
Both of these will need OrderedDict in C, which is getting close (issue #16991).
Ad issue #16991: will the present semantics of inheriting from OrderedDict be kept untouched? (I mean: which methods call other methods => which methods you need to override etc.) If not we'll have a backward compatibility issue (maybe not very serious but still...). Cheers. *j

On Thu, Feb 28, 2013 at 3:51 PM, Jan Kaliszewski <zuo@chopin.edu.pl> wrote:
You mean like we had with dicts? Now that they are randomized things like docstrings started to break unexpectedly. With **kwargs the OrderedDict is created by the interpreter and passed to the called function. So the writer of the function is the only one in control of how the ordering is interpreted. Granted, an existing function might, as currently written, expose the ordering or even the kwargs. So that aspect has to be considered. However it would still remain in the complete control of the function how the ordering of **kwargs is exposed.
Any OrderedDict written in C must have identical semantics, including regarding subclassing. I've gone through my implementation on several occasions to check this and I'll probably do so again. Keep in mind that the unit tests for OrderedDict will be run against both the pure Python and C versions (see PEP 399). That includes some tests regarding subclassing, though there could probably be a few more of those. Bottom line, if it doesn't quack it's not a duck we want. -eric
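The subclassing question can be made concrete with a small probe, sketched here with a hypothetical subclass that logs calls to its overridden __setitem__. Which of the other methods route through the override is exactly what shared tests would need to pin down, so no particular log is asserted beyond the direct assignment.

```python
from collections import OrderedDict


class LoggingOrderedDict(OrderedDict):
    """Hypothetical subclass that records every call to __setitem__."""

    def __init__(self, *args, **kwargs):
        self.log = []
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.log.append(key)
        super().__setitem__(key, value)


d = LoggingOrderedDict()
d["a"] = 1            # always goes through the override
d.update(b=2)         # may or may not, depending on the implementation
d.setdefault("c", 3)  # likewise
print(d.log)
```

Running the same probe against the pure Python and C implementations and comparing the logs is one way to spot a semantic difference a subclass could observe.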
participants (10)
- Alex Stewart
- Antoine Pitrou
- Christian Heimes
- Don Spaulding
- Eric Snow
- Ethan Furman
- Guido van Rossum
- Jan Kaliszewski
- Ned Batchelder
- Nick Coghlan