OrderedDict for kwargs and class statement namespace
Just wanted to put out some feelers for the feasibility of these two features:
* have the **kwargs param be an OrderedDict rather than a dict
* have class definitions happen relative to an OrderedDict by default rather than a dict, and still overridable by a metaclass's __prepare__().
Both of these will need OrderedDict in C, which is getting close (issue #16991).
-eric
On Thu, 28 Feb 2013 03:15:14 -0700, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Just wanted to put out some feelers for the feasibility of these two features:
* have the **kwargs param be an OrderedDict rather than a dict
Rather than just feasibility, I would like performance not to regress here.
Both of these will need OrderedDict in C, which is getting close (issue #16991).
Really? Last time I looked, it wasn't getting really close. Regards Antoine.
On Feb 28, 2013 3:28 AM, "Antoine Pitrou" <solipsis@pitrou.net> wrote:
Really? Last time I looked, it wasn't getting really close.
Everything's there and there are just a few lingering memory-related issues to iron out. So 50% done then...<wink> -eric
On Thu, 28 Feb 2013 07:55:03 -0700, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Everything's there and there are just a few lingering memory-related issues to iron out. So 50% done then...<wink>
And it also has to be reviewed in depth. To quote you: "The memory-related issues are pushing well past my experience". Regards Antoine.
On Thu, Feb 28, 2013 at 8:14 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
And it also has to be reviewed in depth. To quote you: "The memory-related issues are pushing well past my experience".
Agreed and I appreciate your concern, genuinely. I won't apologize for not having experience in various areas, though I recognize the extra caution it requires. However, I will continue to take opportunities to expand my experience--particularly working on things that others have considered to be good ideas but which no one has advanced. I will continue to do this even if it's slow going and even if my effort eventually bears no fruit other than the experience of having walked that path. Ultimately my goal is to be confident that my fellow stewards feel my contributions are helping Python get better.
With OrderedDict, I have no illusions of getting everything done quickly, but do feel that the bulk of the coding is wrapping up. I suppose that in more experienced hands it would be done quickly, but I'm not asking for that. Rather, I want to get a sense of the applicability of OrderedDict to Python's internals since it would be available as a built-in type. I've presented what I consider as two useful internal applications but would certainly like to know what you think.
-eric
On Thu, 28 Feb 2013 16:41:28 -0700, Eric Snow <ericsnowcurrently@gmail.com> wrote:
With OrderedDict, I have no illusions of getting everything done quickly, but do feel that the bulk of the coding is wrapping up. I suppose that in more experienced hands it would be done quickly, but I'm not asking for that. Rather, I want to get a sense of the applicability of OrderedDict to Python's internals since it would be available as a built-in type. I've presented what I consider as two useful internal applications but would certainly like to know what you think.
Well, the OrderedDict constructor is quite a strong use case, as you pointed out (and the only one I can think of :-)). Still, in an aesthetical sense, I like the idea of the Python dict being a pure unordered hash table. Ordered dicts are good for some use cases (I do use them too), but it sounds a bit wrong to make them the first-class mapping type; perhaps because it would feel like PHP :-) Regards Antoine.
On Thu, Feb 28, 2013 at 4:27 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Thu, 28 Feb 2013 03:15:14 -0700, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Just wanted to put out some feelers for the feasibility of these two features:
* have the **kwargs param be an OrderedDict rather than a dict
Rather than just feasibility, I would like performance not to regress here.
I can appreciate not wanting to see "Python 3.4 killed our performance" articles all over news.ycombinator.com. But isn't it worth it to have a (small, but acceptable) performance hit for the sake of opening up use cases which are not currently possible in Python without ugly hacks? For an example of the "recommended" way to get the ordering of your class attributes: http://stackoverflow.com/questions/3288107/how-can-i-get-fields-in-an-origin... It seems to me that the "right thing" for python to do when given an ordered list of key=value pairs in a function call or class definition, is to retain the order. So what's an acceptable level of performance regression for the sake of doing things the "right way" here?
On Thu, 28 Feb 2013 09:30:50 -0600, Don Spaulding <donspauldingii@gmail.com> wrote:
For an example of the "recommended" way to get the ordering of your class attributes: http://stackoverflow.com/questions/3288107/how-can-i-get-fields-in-an-origin...
This is already possible with the __prepare__ magic method. http://docs.python.org/3.4/reference/datamodel.html#preparing-the-class-name...
It seems to me that the "right thing" for python to do when given an ordered list of key=value pairs in a function call or class definition, is to retain the order. So what's an acceptable level of performance regression for the sake of doing things the "right way" here?
Or, rather, what is the benefit of doing things "the right way"? There are incredibly few cases for relying on the order of key=value pairs in function calls. Regards Antoine.
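For readers unfamiliar with the __prepare__ hook mentioned above, here is a minimal sketch of how a metaclass can record the definition order of class attributes. The names OrderedMeta, Form and _field_order are hypothetical and chosen only for illustration; this is one possible arrangement, not the approach taken by Django or any other project.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that remembers attribute definition order."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body is executed with this mapping as its namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Keep only the user-defined names, in the order they were assigned.
        cls._field_order = [k for k in namespace if not k.startswith('__')]
        return cls


class Form(metaclass=OrderedMeta):
    name = 'text'
    email = 'text'
    age = 'int'


print(Form._field_order)
```

The cost of this approach is the one debated in the thread: every class that wants ordering has to opt in through a metaclass.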
On Thu, Feb 28, 2013 at 9:48 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Thu, 28 Feb 2013 09:30:50 -0600, Don Spaulding <donspauldingii@gmail.com> wrote:
For an example of the "recommended" way to get the ordering of your class attributes:
http://stackoverflow.com/questions/3288107/how-can-i-get-fields-in-an-origin...
This is already possible with the __prepare__ magic method.
http://docs.python.org/3.4/reference/datamodel.html#preparing-the-class-name...
Sure. Case in point: Django has been working around it since at least python 2.4.
It seems to me that the "right thing" for python to do when given an ordered list of key=value pairs in a function call or class definition, is to retain the order. So what's an acceptable level of performance regression for the sake of doing things the "right way" here?
Or, rather, what is the benefit of doing things "the right way"? There are incredibly few cases for relying on the order of key=value pairs in function calls.
"If you build it, they will come..." When I originally encountered the need for python to retain the order of kwargs that my caller specified, it surprised me that there wasn't more clamoring for kwargs being an OrderedDict. However, since my development timeline didn't allow for holding the project up while a patch was pushed through python-dev and out into a real python release, I sucked it up, forced my calling code to send in hand-crafted OrderedDicts and called it a day. I think most developers don't even stop to think that the language *could* be different, they just put in the workaround and move on. I think if python stopped dropping the order of kwargs on the floor today, you'd see people start to rely on the order of kwargs tomorrow.
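The workaround described above, where the caller hand-crafts an OrderedDict instead of using keyword arguments, might look roughly like the following. The function build_query and its field names are hypothetical and only illustrate the pattern; the original project's code is not shown in this thread.

```python
from collections import OrderedDict


def build_query(fields):
    """Hypothetical callee that depends on the order of its named fields."""
    return ' AND '.join('{}={!r}'.format(k, v) for k, v in fields.items())


# The caller spells out the order explicitly as a sequence of pairs,
# rather than relying on the order of keyword arguments.
ordered = OrderedDict([('last_name', 'Snow'), ('first_name', 'Eric')])
query = build_query(ordered)
print(query)
```

The call site is noisier than build_query(last_name='Snow', first_name='Eric'), which is the ergonomic cost being argued about here.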
On 02/28/2013 08:14 AM, Don Spaulding wrote:
I think if python stopped dropping the order of kwargs on the floor today, you'd see people start to rely on the order of kwargs tomorrow.
+1 I'd already be relying on it if it were there. -- ~Ethan~
On 2/28/2013 11:41 AM, Ethan Furman wrote:
On 02/28/2013 08:14 AM, Don Spaulding wrote:
I think if python stopped dropping the order of kwargs on the floor today, you'd see people start to rely on the order of kwargs tomorrow.
+1
I'd already be relying on it if it were there.
Could you advance the discussion by elaborating your use case? I've never had need for ordered kwargs, so I'm having a hard time seeing how they would be useful. --Ned.
-- ~Ethan~ _______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
On 02/28/2013 01:51 PM, Ned Batchelder wrote:
Could you advance the discussion by elaborating your use case? I've never had need for ordered kwargs, so I'm having a hard time seeing how they would be useful.
I no longer remember my original use-case, but currently I'm working on a command-line parser (I know, there are already plenty -- it's a learning experience) with multiple subcommands, and the order of the subcommands can make a difference. -- ~Ethan~
Hm. I write code regularly that takes a **kwargs dict and stores it into some other longer-lasting datastructure. Having that other datastructure suddenly receive OrderedDicts instead of plain dicts might definitely cause some confusion and possible performance issues. And I don't recall ever having wanted to know the order of the kwargs in the call. But even apart from that backwards incompatibility I think this feature is too rarely useful to make it the default behavior -- if anything I want **kwargs to become faster! On Thu, Feb 28, 2013 at 2:15 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Just wanted to put out some feelers for the feasibility of these two features:
* have the **kwargs param be an OrderedDict rather than a dict * have class definitions happen relative to an OrderedDict by default rather than a dict, and still overridable by a metaclass's __prepare__().
Both of these will need OrderedDict in C, which is getting close (issue #16991).
-eric
-- --Guido van Rossum (python.org/~guido)
On 1 Mar 2013 03:34, "Guido van Rossum" <guido@python.org> wrote:
Hm. I write code regularly that takes a **kwargs dict and stores it into some other longer-lasting datastructure. Having that other datastructure suddenly receive OrderedDicts instead of plain dicts might definitely cause some confusion and possible performance issues. And I don't recall ever having wanted to know the order of the kwargs in the call.
But even apart from that backwards incompatibility I think this feature is too rarely useful to make it the default behavior -- if anything I want **kwargs to become faster
And PEP 422 is designed to make it easier to share a common __prepare__ method with different post processing. Cheers, Nick.
But __prepare__ only works for classes, not for methods, right? On Thu, Feb 28, 2013 at 10:02 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 1 Mar 2013 03:34, "Guido van Rossum" <guido@python.org> wrote:
Hm. I write code regularly that takes a **kwargs dict and stores it into some other longer-lasting datastructure. Having that other datastructure suddenly receive OrderedDicts instead of plain dicts might definitely cause some confusion and possible performance issues. And I don't recall ever having wanted to know the order of the kwargs in the call.
But even apart from that backwards incompatibility I think this feature is too rarely useful to make it the default behavior -- if anything I want **kwargs to become faster
And PEP 422 is designed to make it easier to share a common __prepare__ method with different post processing.
Cheers, Nick.
-- --Guido van Rossum (python.org/~guido)
On Thu, Feb 28, 2013 at 11:02 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
And PEP 422 is designed to make it easier to share a common __prepare__ method with different post processing.
A major use case for __prepare__() is to have the class definition use an OrderedDict. It's even a point of discussion in PEP 3115. While I agree with the conclusion of the PEP that using only OrderedDict is inferior to __prepare__(), I also think defaulting to OrderedDict is viable and useful.
Using __prepare__() necessitates the use of a metaclass, which most people consider black magic. Even when you can inherit from a class that has a custom metaclass (like collections.abc.ABC), it still necessitates inheritance and the chance for metaclass conflicts. While I'm on board with PEP 422, I'm not clear on how it helps here.
If class namespaces were ordered by default then class decorators, which are typically much easier to comprehend, would steal yet another use case from metaclasses. The recent discussions on enums got me thinking about this. In some cases you want your class customization logic to be inherited. That's where you've got to use a metaclass. In other cases you don't. Decorators work great for that. If your class decorator needs to have ordering information, then you also have to use a metaclass. Having OrderedDict be the default class namespace would address that.
-eric
On Fri, Mar 1, 2013 at 7:37 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Thu, Feb 28, 2013 at 11:02 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
And PEP 422 is designed to make it easier to share a common __prepare__ method with different post processing.
A major use case for __prepare__() is to have the class definition use an OrderedDict. It's even a point of discussion in PEP 3115. While I agree with the conclusion of the PEP that using only OrderedDict is inferior to __prepare__(), I also think defaulting to OrderedDict is viable and useful.
Using __prepare__() necessitates the use of a metaclass, which most people consider black magic. Even when you can inherit from a class that has a custom metaclass (like collections.abc.ABC), it still necessitates inheritance and the chance for metaclass conflicts. While I'm on board with PEP 422, I'm not clear on how it helps here.
(catching up after moving house, haven't read the whole thread)
PEP 422 would make it useful to also add a "namespace" meta argument to type.__prepare__ to give it a namespace instance to return. Then, all uses of such an OrderedDict based metaclass can be replaced by:

    class MyClass(namespace=OrderedDict()):
        @classmethod
        def __init_class__(cls):
            # The class namespace is the one we passed in!
            pass

You could pass in a factory function instead, but I think that's a net loss for readability (you would lose the trailing "()" from the empty namespace case, but have to add "lambda:" or "functools.partial" to the prepopulated namespace case).
Even if type wasn't modified, you could create your own metaclass that accepted a namespace and returned it:

    class UseNamespace(type):
        def __prepare__(cls, namespace):
            return namespace

    class MyClass(metaclass=UseNamespace, namespace=OrderedDict()):
        @classmethod
        def __init_class__(cls):
            # The class namespace is the one we passed in!
            pass

I prefer the approach of adding the "namespace" argument to PEP 422, though, since it makes __init_class__ a far more powerful and compelling idea, and between them the two ideas should cover every metaclass use case that *only* customises creation rather than ongoing behaviour.
I actually had this idea a week or so ago, but packaging discussions and moving meant I had postponed writing it up and posting it.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, Mar 3, 2013 at 12:46 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
    class UseNamespace(type):
        def __prepare__(cls, namespace):
            return namespace
Oops, that signature is incorrect. Assume it's tweaked appropriately to accept the normal __prepare__ arguments and still retrieve the "namespace" setting from the class header. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
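One way the signature could be "tweaked appropriately" is sketched below, using only metaclass machinery that already exists. It assumes the "namespace" keyword from the class header should be consumed by the metaclass and not forwarded to type. Since __init_class__ is only a proposal in PEP 422, the post-processing is done in __new__ here, and _definition_order is a hypothetical attribute name.

```python
from collections import OrderedDict


class UseNamespace(type):
    """Metaclass that uses the namespace passed in the class header."""

    @classmethod
    def __prepare__(mcls, name, bases, namespace=None, **kwargs):
        # Accept the normal __prepare__ arguments plus the "namespace" keyword.
        return {} if namespace is None else namespace

    def __new__(mcls, name, bases, ns, namespace=None, **kwargs):
        # "namespace" is swallowed here so that type.__new__ never sees it.
        cls = super().__new__(mcls, name, bases, dict(ns), **kwargs)
        cls._definition_order = [k for k in ns if not k.startswith('__')]
        return cls

    def __init__(cls, name, bases, ns, namespace=None, **kwargs):
        super().__init__(name, bases, ns)


class MyClass(metaclass=UseNamespace, namespace=OrderedDict()):
    first = 1
    second = 2


print(MyClass._definition_order)
```

Passing an instance (as opposed to a factory) keeps the class header short, which is the readability trade-off Nick describes.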
On Thu, Feb 28, 2013 at 10:32 AM, Guido van Rossum <guido@python.org> wrote:
Hm. I write code regularly that takes a **kwargs dict and stores it into some other longer-lasting datastructure. Having that other datastructure suddenly receive OrderedDicts instead of plain dicts might definitely cause some confusion and possible performance issues.
Could you elaborate on what confusion it might cause? As to performance relative to dict, this has definitely been my primary concern. I agree that the impact has to be insignificant for the **kwargs proposal to go anywhere. Obviously OrderedDict in C had better be an improvement over the pure Python version or there isn't much point. However, it goes beyond that in the cases where we might replace current uses of dict with OrderedDict. My plan has been to do a bunch of performance comparison once the implementation is complete and tune it as much as possible with an eye toward the main potential internal use cases. From my point of view, those are **kwargs and class namespace. This is in part why I've brought those two up. For instance, one guidepost I've used is that typically **kwargs is going to be small. However, even for large kwargs I don't want any performance difference to be a problem.
And I don't recall ever having wanted to know the order of the kwargs in the call.
Though it may sound a little odd, the first use case I bumped into was with OrderedDict itself: OrderedDict(a=1, b=2, c=3) There were a few other reasonable use cases mentioned in other threads a while back. I'll dig them up if that would help.
But even apart from that backwards incompatibility I think this feature is too rarely useful to make it the default behavior -- if anything I want **kwargs to become faster!
You mean something like possibly not unpacking the **kwargs in a call into another dict?

    def f(**kwargs):
        return kwargs
    d = {'a': 1, 'b': 2}
    assert d is f(**d)

Certainly faster, though definitely a semantic difference. -eric
On Thu, Feb 28, 2013 at 1:16 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Thu, Feb 28, 2013 at 10:32 AM, Guido van Rossum <guido@python.org> wrote:
Hm. I write code regularly that takes a **kwargs dict and stores it into some other longer-lasting datastructure. Having that other datastructure suddenly receive OrderedDicts instead of plain dicts might definitely cause some confusion and possible performance issues.
Could you elaborate on what confusion it might cause?
Well, x.__class__ is different, repr(x) is different, ...
As to performance relative to dict, this has definitely been my primary concern. I agree that the impact has to be insignificant for the **kwargs proposal to go anywhere. Obviously OrderedDict in C had better be an improvement over the pure Python version or there isn't much point. However, it goes beyond that in the cases where we might replace current uses of dict with OrderedDict.
What happens to the performance if I insert many thousands (or millions) of items to an OrderedDict? What happens to the space it uses? The thing is, what started out as OrderedDict stays one, but its lifetime may be long and the assumptions around dict performance are rather extreme (or we wouldn't be redoing the implementation regularly).
My plan has been to do a bunch of performance comparison once the implementation is complete and tune it as much as possible with an eye toward the main potential internal use cases. From my point of view, those are **kwargs and class namespace. This is in part why I've brought those two up.
I'm fine with doing this by default for a class namespace; the type of cls.__dict__ is already a non-dict (it's a proxy) and it's unlikely to have 100,000 entries. For **kwds I'm pretty concerned; the use cases seem flimsy.
For instance, one guidepost I've used is that typically **kwargs is going to be small. However, even for large kwargs I don't want any performance difference to be a problem.
So, in my use case, the kwargs is small, but the object may live a long and productive life after the function call is only a faint memory, and it might grow dramatically. IOW I have very high standards for backwards compatibility here.
And I don't recall ever having wanted to know the order of the kwargs in the call.
Though it may sound a little odd, the first use case I bumped into was with OrderedDict itself:
OrderedDict(a=1, b=2, c=3)
But because of the self-referentiality, this doesn't prove anything. :-)
There were a few other reasonable use cases mentioned in other threads a while back. I'll dig them up if that would help.
It would.
But even apart from that backwards incompatibility I think this feature is too rarely useful to make it the default behavior -- if anything I want **kwargs to become faster!
You mean something like possibly not unpacking the **kwargs in a call into another dict?
    def f(**kwargs):
        return kwargs
    d = {'a': 1, 'b': 2}
    assert d is f(**d)
Certainly faster, though definitely a semantic difference.
No, that would introduce nasty aliasing problems in some cases. I've actually written code that depends on the copy being made here. (Yesterday, actually. :-) -- --Guido van Rossum (python.org/~guido)
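The copy that Guido relies on can be observed directly. The following small sketch shows that the mapping received as **kwargs is a new dict, so mutating it inside (or after) the call does not affect the caller's dict.

```python
def f(**kwargs):
    return kwargs


d = {'a': 1, 'b': 2}
result = f(**d)

# Same contents, but a distinct object: the call unpacks d into a fresh dict.
print(result == d, result is d)

# Mutating the received mapping leaves the caller's dict untouched.
result['c'] = 3
print('c' in d)
```

If the call reused d directly, the last check would come out the other way, which is the aliasing problem mentioned above.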
On Thu, Feb 28, 2013 at 2:28 PM, Guido van Rossum <guido@python.org> wrote:
What happens to the performance if I insert many thousands (or millions) of items to an OrderedDict? What happens to the space it uses? The thing is, what started out as OrderedDict stays one, but its lifetime may be long and the assumptions around dict performance are rather extreme (or we wouldn't be redoing the implementation regularly).
<snip/>
So, in my use case, the kwargs is small, but the object may live a long and productive life after the function call is only a faint memory, and it might grow dramatically.
Good point. My own use of **kwargs rarely sees the object leave the function or get very big, and this aspect of it just hadn't come up to broaden my point of view. I'm glad we're having this discussion. My intuition is that such a use case would be pretty rare, but even then...
IOW I have very high standards for backwards compatibility here.
...Python has an exceptional standard in this regard. FWIW, I agree. I would want to be confident about the ramifications before we made any change like this.
There were a few other reasonable use cases mentioned in other threads a while back. I'll dig them up if that would help.
It would.
Okay.
But even apart from that backwards incompatibility I think this feature is too rarely useful to make it the default behavior -- if anything I want **kwargs to become faster!
You mean something like possibly not unpacking the **kwargs in a call into another dict?
    def f(**kwargs):
        return kwargs
    d = {'a': 1, 'b': 2}
    assert d is f(**d)
Certainly faster, though definitely a semantic difference.
No, that would introduce nasty aliasing problems in some cases. I've actually written code that depends on the copy being made here. (Yesterday, actually. :-)
You can't blame me for trying to make **kwargs-as-OrderedDict seem like a great idea in comparison. <wink> -eric
On 28.02.2013 22:53, Eric Snow wrote:
Good point. My own use of **kwargs rarely sees the object leave the function or get very big, and this aspect of it just hadn't come up to broaden my point of view. I'm glad we're having this discussion. My intuition is that such a use case would be pretty rare, but even then...
I guess most function calls don't need the feature of ordered kwargs. Could we implement yet another prefix that turns unordered keyword arguments into ordered keyword arguments, e.g. ***ordkwargs (3 *) and METH_VARARGS|METH_KEYWORDS|METH_ORDERED PyMethodDef.ml_flags? That would allow ordered keyword arguments while keeping backward compatibility to existing programs. Only functions that ask for ordered kwargs would have to pay the minor performance penalty, too.
I don't know if it's feasible or even possible. The interpreter would have to check the function's flags for each method call in order to decide if it has to create an ordinary dict or an OrderedDict.
Christian
On Fri, 01 Mar 2013 00:31:28 +0100, Christian Heimes <christian@python.org> wrote:
On 28.02.2013 22:53, Eric Snow wrote:
Good point. My own use of **kwargs rarely sees the object leave the function or get very big, and this aspect of it just hadn't come up to broaden my point of view. I'm glad we're having this discussion. My intuition is that such a use case would be pretty rare, but even then...
I guess most function calls don't need the feature of ordered kwargs. Could we implement yet another prefix that turns unordered keyword arguments into ordered keyword arguments, e.g. ***ordkwargs (3 *) and METH_VARARGS|METH_KEYWORDS|METH_ORDERED PyMethodDef.ml_flags? That would allow ordered keyword arguments while keeping backward compatibility to existing programs. Only functions that ask for ordered kwargs would have to pay the minor performance penalty, too.
Well, you know, the performance concern also applies to pure Python functions, not just C ones ;) Regards Antoine.
On Thu, Feb 28, 2013 at 2:28 PM, Guido van Rossum <guido@python.org> wrote:
On Thu, Feb 28, 2013 at 1:16 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
There were a few other reasonable use cases mentioned in other threads a while back. I'll dig them up if that would help.
It would.
Other than OrderedDict, I've only found one other thread that had meaningful references: http://mail.python.org/pipermail/python-dev/2012-December/123105.html I know there were at least a couple more. I'll keep digging. -eric
On 28.02.2013 11:15, Eric Snow wrote:
Just wanted to put out some feelers for the feasibility of these two features:
* have the **kwargs param be an OrderedDict rather than a dict * have class definitions happen relative to an OrderedDict by default rather than a dict, and still overridable by a metaclass's __prepare__().
Both of these will need OrderedDict in C, which is getting close (issue #16991).
Raymond was/is working on a modification of the dict's internal data structure that reduces its memory consumption. IIRC the modification adds partial ordering as a side effect. The keys are ordered in insertion order until keys are removed. http://mail.python.org/pipermail/python-dev/2012-December/123028.html Christian
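To make the "ordered until keys are removed" behaviour concrete, here is a toy sketch of the general idea: a sparse index table pointing into a dense list of entries. It is not Raymond's actual code. The names CompactDict, FREE and DUMMY are made up for illustration, the table has a fixed size and never resizes, and the choice to fill a deleted slot with the last entry is an assumption about how such a design would avoid holes.

```python
FREE, DUMMY = -1, -2  # hypothetical markers: never-used slot / deleted slot


class CompactDict:
    """Toy mapping: sparse index table + dense entries list (no resizing)."""

    def __init__(self, size=64):
        self.indices = [FREE] * size  # sparse: slot -> position in entries
        self.entries = []             # dense: (key, value) in insertion order

    def _find(self, key):
        """Return (slot, entry_index); entry_index is None if key is absent."""
        n = len(self.indices)
        start = hash(key) % n
        reusable = None
        for step in range(n):  # linear probing
            slot = (start + step) % n
            idx = self.indices[slot]
            if idx == FREE:
                return (slot if reusable is None else reusable), None
            if idx == DUMMY:
                if reusable is None:
                    reusable = slot
            elif self.entries[idx][0] == key:
                return slot, idx
        if reusable is None:
            raise MemoryError("toy table is full")
        return reusable, None

    def __setitem__(self, key, value):
        slot, idx = self._find(key)
        if idx is None:
            self.indices[slot] = len(self.entries)
            self.entries.append((key, value))
        else:
            self.entries[idx] = (key, value)

    def __getitem__(self, key):
        _, idx = self._find(key)
        if idx is None:
            raise KeyError(key)
        return self.entries[idx][1]

    def __delitem__(self, key):
        slot, idx = self._find(key)
        if idx is None:
            raise KeyError(key)
        last_idx = len(self.entries) - 1
        if idx != last_idx:
            # Move the last entry into the hole; this is what breaks ordering.
            last_slot, _ = self._find(self.entries[last_idx][0])
            self.entries[idx] = self.entries[last_idx]
            self.indices[last_slot] = idx
        self.indices[slot] = DUMMY
        self.entries.pop()

    def __iter__(self):
        return (key for key, _ in self.entries)


d = CompactDict()
for name in ["a", "b", "c", "d"]:
    d[name] = name.upper()
before = list(d)   # insertion order while nothing has been removed
del d["b"]
after = list(d)    # "d" has been moved into the slot "b" occupied
```

In this sketch iteration follows insertion order only as long as nothing is deleted, which is why the ordering could be called a side effect rather than a guarantee.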
On Thu, Feb 28, 2013 at 11:30 AM, Christian Heimes <christian@python.org> wrote:
Raymond was/is working on a modification of the dict's internal data structure that reduces its memory consumption. IIRC the modification adds partial ordering as a side effect. The keys are ordered in insertion order until keys are removed.
While this is potentially convenient, in order for people to really be able to use it for these purposes it would have to be reliable behavior going forward. The question is therefore: do we actually want to explicitly make this behavior part of the dict API such that all future implementations of dict (and all implementations in any Python, not just CPython) must be guaranteed to also behave this way? I suspect the answer should be "no".
On Thu, Feb 28, 2013 at 7:30 AM, Don Spaulding <donspauldingii@gmail.com> wrote:
So what's an acceptable level of performance regression for the sake of doing things the "right way" here?
For class declarations, I think a little performance hit might not be too big a deal, but those are really not the most important use case here, since there are already ways to work around this for classes if one wants to. However, for function calls, which I actually believe is the more important question, I think we would have to think very hard about anything that introduced any real performance overhead, because functions are typically called a _lot_. Adding even a small extra delay to function calls could impact the overall performance of some types of Python programs substantially.. I also agree with Guido that there may be some unexpected consequences of suddenly changing the type of the kwargs parameter passed to functions which were coded assuming it would always be a plain-ol' dict.. I think it might be doable, but we'd have to be very sure that OrderedDict is exactly compatible in every conceivable way.. Perhaps instead we could compromise by keeping the default case as it is, but providing some mechanism for the function declaration to specify that it wants ordering information passed to it? I'm not sure what a good syntax for this would be, though ("(*args, ***okwargs)"? "(*args, **kwargs, *kworder)"? Not sure I'm really big on either of those, actually, but you get the idea..) --Alex
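For reference, the existing workaround for classes mentioned above can be sketched with a metaclass whose __prepare__() returns an OrderedDict. The names OrderedMeta, Point and _definition_order are hypothetical and chosen only for this illustration.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that records the order of class attributes."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # The class body is executed against this mapping.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Keep only the names written in the class body, in definition order.
        cls._definition_order = [
            key for key in namespace if not key.startswith("__")
        ]
        return cls


class Point(metaclass=OrderedMeta):
    x = 0
    y = 0

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


print(Point._definition_order)
```

The cost here is paid once per class statement, which is why it matters much less than anything on the function-call path.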
On Thu, Feb 28, 2013 at 2:32 PM, Alex Stewart <foogod@gmail.com> wrote:
such that all future implementations of dict (and all implementations in any Python, not just CPython) must be guaranteed to also behave this way?
This is also a good point applied to both **kwargs and class namespace. Before we did either we'd need to be clear on the impact on other Python implementors.
However, for function calls, which I actually believe is the more important question, I think we would have to think very hard about anything that introduced any real performance overhead, because functions are typically called a _lot_. Adding even a small extra delay to function calls could impact the overall performance of some types of Python programs substantially..
**kwargs-as-OrderedDict impacts performance in two ways: the performance of packing the "unbound" keyword arguments into the OrderedDict and the performance of OrderedDict in normal operation after it's handed off to the function. Otherwise you don't get a performance impact on functions.
I also agree with Guido that there may be some unexpected consequences of suddenly changing the type of the kwargs parameter passed to functions which were coded assuming it would always be a plain-ol' dict.. I think it might be doable, but we'd have to be very sure that OrderedDict is exactly compatible in every conceivable way..
For Python backward compatibility is a cornerstone. I'm surprised there isn't something in the Zen about it. <wink> For **kwargs the bar for compatibility is especially high and I agree with that.
Perhaps instead we could compromise by keeping the default case as it is, but providing some mechanism for the function declaration to specify that it wants ordering information passed to it? I'm not sure what a good syntax for this would be, though ("(*args, ***okwargs)"? "(*args, **kwargs, *kworder)"? Not sure I'm really big on either of those, actually, but you get the idea..)
**kwargs packing happens in the interpreter rather than in relationship to functionality provided by the function, so whatever the mechanism, it would have to be something the interpreter consumes, either a new API/syntax for building functions or a new syntax like you've mentioned. This has come up before. Classes have metaclasses (and __prepare__). Modules have loaders. Poor, poor functions. Because of the same concerns you've already expressed regarding the criticality of function performance, they miss out on all sorts of fun--inside their highly optimized box looking out at the other types showing off their cool new features all the time. It just isn't fair. :) -eric
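Without a mechanism the interpreter consumes, a caller who cares about ordering has to pass it explicitly. A minimal sketch of that workaround, assuming a hypothetical helper named make_record that takes (name, value) pairs as positional arguments instead of keywords:

```python
from collections import OrderedDict


def make_record(*pairs):
    """Hypothetical helper: build an ordered mapping from (name, value) pairs."""
    return OrderedDict(pairs)


record = make_record(("first", 1), ("second", 2), ("third", 3))
print(list(record))
```

This keeps the ordering under the caller's control, at the price of a clumsier call syntax than keyword arguments.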
28.02.2013 11:15, Eric Snow wrote:
* have the **kwargs param be an OrderedDict rather than a dict * have class definitions happen relative to an OrderedDict by default rather than a dict, and still overridable by a metaclass's __prepare__().
While having class namespace ordered sounds very nice, ordered **kwargs sometimes may not be a desirable feature -- especially when you want to keep a cooperatively used interface as simple as possible (because of the risk that keyword argument order could then be taken into consideration by *some* actors while others would still operate with the assumption that argument order cannot be known...).
Both of these will need OrderedDict in C, which is getting close (issue #16991).
Ad issue #16991: will the present semantics of inheriting from OrderedDict be kept untouched? (I mean: which methods call other methods => which methods you need to override etc.) If not we'll have a backward compatibility issue (maybe not very serious but still...). Cheers. *j
On Thu, Feb 28, 2013 at 3:51 PM, Jan Kaliszewski <zuo@chopin.edu.pl> wrote:
While having class namespace ordered sounds very nice, ordered **kwargs sometimes may not be a desirable feature -- especially when you want to keep a cooperatively used interface as simple as possible (because of the risk that keyword argument order could then be taken into consideration by *some* actors while others would still operate with the assumption that argument order cannot be known...).
You mean like we had with dicts? Now that they are randomized, things like docstrings started to break unexpectedly. With **kwargs the OrderedDict is created by the interpreter and passed to the called function. So the writer of the function is the only one in control of how the ordering is interpreted. Granted, an existing function might, as currently written, expose the ordering or even the kwargs. So that aspect has to be considered. However, it would still remain in the complete control of the function how the ordering of **kwargs is exposed.
Ad issue #16991: will the present semantics of inheriting from OrderedDict be kept untouched? (I mean: which methods call other methods => which methods you need to override etc.)
If not we'll have a backward compatibility issue (maybe not very serious but still...).
Any OrderedDict written in C must have identical semantics, including regarding subclassing. I've gone through my implementation on several occasions to check this and I'll probably do so again. Keep in mind that the unit tests for OrderedDict will be run against both the pure Python and C version (see PEP 399). That includes some tests regarding subclassing, though there could probably be a few more of those. Bottom line, if it doesn't quack it's not a duck we want. -eric
participants (10)
- Alex Stewart
- Antoine Pitrou
- Christian Heimes
- Don Spaulding
- Eric Snow
- Ethan Furman
- Guido van Rossum
- Jan Kaliszewski
- Ned Batchelder
- Nick Coghlan