
What do you think about providing an `OrderedDict.sort` method? I've been using my own `OrderedDict` subclass that defines `sort` for years, and I always wondered why the stdlib one doesn't provide `sort`. I can write the patch if needed. Thanks, Ram.

On Tue, Sep 24, 2013 at 04:49:20AM -0700, Ram Rachum wrote:
What do you think about providing an `OrderedDict.sort` method? I've been using my own `OrderedDict` subclass that defines `sort` for years, and I always wondered why the stdlib one doesn't provide `sort`.
I can write the patch if needed.
I'm not entirely sure why anyone would need an OrderedDict sort method. Ordered Dicts store keys by insertion order. Sorting the keys goes against the purpose of an OrderedDict. I can understand a request for a SortedDict, that keeps the keys in sorted order as they are deleted or inserted. I personally don't have any need for one, since when I need the keys in sorted order I just sort them on the fly: for key in sorted(dict): ... but in any case, that's a separate issue from sorting an OrderedDict. Can you explain the use-case for why somebody might want to throw away the insertion order and replace with sorted order? -- Steven

I think that your mistake is defining OrderedDict as a dict sorting by insertion order. I see no reason to define it that way, and the fact that insertion order is the default is not a reason in my opinion. It's just a dict with an order, and I see no reason to not let users move elements about as they wish. Yes, I'm aware that the documentation defined OrderedDict your way too; I still think it's a pointless restriction. Regarding examples: I've used my `OrderedDict.sort` at least 10 times. Just today I've used it again. I was putting three items in an ordrered dict, with keys 'low', 'medium' and 'high'. I wanted to have them sorted as 'low', 'medium' and 'high' but the insertion order was different because of the algorithm that calculated them. (Also not all 3 items were guaranteed to exist, I wanted to sort those that existed.) So I created an OrderedDict of my subclass and called `.sort`. I'm sure you can think of a bunch more examples, if not I can give them to you. On Tuesday, September 24, 2013 3:13:15 PM UTC+3, Steven D'Aprano wrote:
On Tue, Sep 24, 2013 at 04:49:20AM -0700, Ram Rachum wrote:
What do you think about providing an `OrderedDict.sort` method? I've been using my own `OrderedDict` subclass that defines `sort` for years, and I always wondered why the stdlib one doesn't provide `sort`.
I can write the patch if needed.
I'm not entirely sure why anyone would need an OrderedDict sort method. Ordered Dicts store keys by insertion order. Sorting the keys goes against the purpose of an OrderedDict.
I can understand a request for a SortedDict, that keeps the keys in sorted order as they are deleted or inserted. I personally don't have any need for one, since when I need the keys in sorted order I just sort them on the fly:
for key in sorted(dict): ...
but in any case, that's a separate issue from sorting an OrderedDict. Can you explain the use-case for why somebody might want to throw away the insertion order and replace with sorted order?
-- Steven _______________________________________________ Python-ideas mailing list Python...@python.org <javascript:> https://mail.python.org/mailman/listinfo/python-ideas

On Tue, Sep 24, 2013 at 7:27 AM, Ram Rachum <ram.rachum@gmail.com> wrote:
I think that your mistake is defining OrderedDict as a dict sorting by insertion order.
That's the definition straight out of the documentation. "An OrderedDict is a dict that remembers the order that keys were first inserted."

Then how do you explain `move_to_end`? Sent from my phone. On Sep 24, 2013 3:50 PM, "Brian Curtin" <brian@python.org> wrote:
On Tue, Sep 24, 2013 at 7:27 AM, Ram Rachum <ram.rachum@gmail.com> wrote:
I think that your mistake is defining OrderedDict as a dict sorting by insertion order.
That's the definition straight out of the documentation. "An OrderedDict is a dict that remembers the order that keys were first inserted."

On 09/24/2013 05:27 AM, Ram Rachum wrote:
I think that your mistake is defining OrderedDict as a dict sorting by insertion order. I see no reason to define it that way [...]
How would you like it sorted? - ascending? you can write an algorithm for that - descending? you can write an algorithm for that - cyclic? you can write an algorithm for that - insertion order? you can *not* write an algorithm for that Insertion order is the one that you either remember, or is lost. As for a practical example, think of classes that want to know which order their attributes were created in -- OrderedDict to the rescue! :) -- ~Ethan~

Ethan, you've misunderstood my message and given a correct objection to an argument I did not make. I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created. Sent from my phone. On Sep 24, 2013 4:42 PM, "Ethan Furman" <ethan@stoneleaf.us> wrote:
On 09/24/2013 05:27 AM, Ram Rachum wrote:
I think that your mistake is defining OrderedDict as a dict sorting by insertion order. I see no reason to define it that way [...]
How would you like it sorted?
- ascending? you can write an algorithm for that
- descending? you can write an algorithm for that
- cyclic? you can write an algorithm for that
- insertion order? you can *not* write an algorithm for that
Insertion order is the one that you either remember, or is lost.
As for a practical example, think of classes that want to know which order their attributes were created in -- OrderedDict to the rescue! :)
-- ~Ethan~ ______________________________**_________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/**mailman/listinfo/python-ideas<https://mail.python.org/mailman/listinfo/python-ideas>
--
--- You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. To unsubscribe from this topic, visit https://groups.google.com/d/** topic/python-ideas/-RFTqV8_**aS0/unsubscribe<https://groups.google.com/d/topic/python-ideas/-RFTqV8_aS0/unsubscribe> . To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe@**googlegroups.com<python-ideas%2Bunsubscribe@googlegroups.com> . For more options, visit https://groups.google.com/**groups/opt_out<https://groups.google.com/groups/opt_out> .

On 09/24/2013 08:23 AM, Ram Rachum wrote:
On Sep 24, 2013 4:42 PM, Ethan Furman wrote:
On 09/24/2013 05:27 AM, Ram Rachum wrote:
I think that your mistake is defining OrderedDict as a dict sorting by insertion order. I see no reason to define it that way [...]
Insertion order is the one that you either remember, or is lost.
Ethan, you've misunderstood my message and given a correct objection to an argument I did not make.
I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created.
Two points: - What happens when a new element is added to the OrderedDict after the user sorts it? - If by 'init' you mean something like `d = OrderedDict(a=1, b=2, c=3)` -- this does not preserve an insertion order as the keywords end up in a regular, unsorted dict that is passed to OrderedDict.__init__ -- ~Ethan~

On Tue, Sep 24, 2013 at 6:36 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
On 09/24/2013 08:23 AM, Ram Rachum wrote:
On Sep 24, 2013 4:42 PM, Ethan Furman wrote:
On 09/24/2013 05:27 AM, Ram Rachum wrote:
I think that your mistake is defining OrderedDict as a dict sorting by insertion order. I see no reason to define it that way [...]
Insertion order is the one that you either remember, or is lost.
Ethan, you've misunderstood my message and given a correct objection to an argument I did not make.
I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created.
Two points:
- What happens when a new element is added to the OrderedDict after the user sorts it?
The exact same thing that happens if the user does `.move_to_end` and then adds a new element, and the exact same thing that happens when a user does `list.sort` and adds a new element, and the exact same thing that happens when a user does `sorted(whatever)` and adds a new element. It just gets put in the end.
- If by 'init' you mean something like `d = OrderedDict(a=1, b=2, c=3)` -- this does not preserve an insertion order as the keywords end up in a regular, unsorted dict that is passed to OrderedDict.__init__
Does this relate to my proposal in any way? I don't see how. (I meant __init__, I was typing from a phone.)
-- ~Ethan~ ______________________________**_________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/**mailman/listinfo/python-ideas<https://mail.python.org/mailman/listinfo/python-ideas>
--
--- You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. To unsubscribe from this topic, visit https://groups.google.com/d/** topic/python-ideas/-RFTqV8_**aS0/unsubscribe<https://groups.google.com/d/topic/python-ideas/-RFTqV8_aS0/unsubscribe> . To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe@**googlegroups.com<python-ideas%2Bunsubscribe@googlegroups.com> . For more options, visit https://groups.google.com/**groups/opt_out<https://groups.google.com/groups/opt_out> .

On 24.09.2013 17:23, Ram Rachum wrote:
Ethan, you've misunderstood my message and given a correct objection to an argument I did not make.
I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created.
The overhead introduced by completely recreating the internal data structure after the sort is just as high as creating a new OrderedDict, so I don't understand why you don't like about: from collections import OrderedDict o = OrderedDict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems())) This even allows you to keep the original insert order should you need it again. If you don't need this, you can just use: o = dict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems())) which is also faster than first creating an OrderedDict and then recreating it with sorted entries. Put those two lines into a function and you have: def SortedOrderedDict(*args, **kws): o = dict(*args, **kws) return OrderedDict(sorted(o.iteritems())) p = SortedOrderedDict(((3,4), (5,4), (1,2))) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 24 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-28: PyDDF Sprint ... 4 days to go 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

I get your point. It's a nice idea. But I think it's slightly less elegant to create another dict. So I think it's almost as good as having a `.sort` method, but not quite as nice. (By the way, couldn't you make the same argument about `list.sort`?) On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 24.09.2013 17:23, Ram Rachum wrote:
Ethan, you've misunderstood my message and given a correct objection to an argument I did not make.
I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created.
The overhead introduced by completely recreating the internal data structure after the sort is just as high as creating a new OrderedDict, so I don't understand why you don't like about:
from collections import OrderedDict o = OrderedDict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
This even allows you to keep the original insert order should you need it again. If you don't need this, you can just use:
o = dict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
which is also faster than first creating an OrderedDict and then recreating it with sorted entries.
Put those two lines into a function and you have:
def SortedOrderedDict(*args, **kws): o = dict(*args, **kws) return OrderedDict(sorted(o.iteritems()))
p = SortedOrderedDict(((3,4), (5,4), (1,2)))
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Source (#1, Sep 24 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-28: PyDDF Sprint ... 4 days to go 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On Sep 24, 2013, at 8:51, Ram Rachum <ram@rachum.com> wrote:
I get your point. It's a nice idea. But I think it's slightly less elegant to create another dict. So I think it's almost as good as having a `.sort` method, but not quite as nice.
Honestly, I think having a sorted mapping in the stdlib would be even nicer in almost any situation where this might be nice. But, given that we don't have such a thing, and getting one into the stdlib is harder than it appears, maybe that's not an argument against your (obviously simpler) idea. Of course in most cases, you just want to iterate once in sorted order, and it's hard to beat this: for k, v in sorted(o.items()):
(By the way, couldn't you make the same argument about `list.sort`?)
You could. Except that list.sort predates sorted. And it's faster and saves memory, which isn't true of your suggestion. I don't know if that would be enough to add it today, but it's more than enough to keep it around.
On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 24.09.2013 17:23, Ram Rachum wrote:
Ethan, you've misunderstood my message and given a correct objection to an argument I did not make.
I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created.
The overhead introduced by completely recreating the internal data structure after the sort is just as high as creating a new OrderedDict, so I don't understand why you don't like about:
from collections import OrderedDict o = OrderedDict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
This even allows you to keep the original insert order should you need it again. If you don't need this, you can just use:
o = dict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
which is also faster than first creating an OrderedDict and then recreating it with sorted entries.
Put those two lines into a function and you have:
def SortedOrderedDict(*args, **kws): o = dict(*args, **kws) return OrderedDict(sorted(o.iteritems()))
p = SortedOrderedDict(((3,4), (5,4), (1,2)))
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Source (#1, Sep 24 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-28: PyDDF Sprint ... 4 days to go 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas

On Tue, Sep 24, 2013 at 7:27 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Sep 24, 2013, at 8:51, Ram Rachum <ram@rachum.com> wrote:
I get your point. It's a nice idea. But I think it's slightly less elegant to create another dict. So I think it's almost as good as having a `.sort` method, but not quite as nice.
Honestly, I think having a sorted mapping in the stdlib would be even nicer in almost any situation where this might be nice. But, given that we don't have such a thing, and getting one into the stdlib is harder than it appears, maybe that's not an argument against your (obviously simpler) idea.
For the record, I think that having a SortedDict in the stdlib would be awesome.
Of course in most cases, you just want to iterate once in sorted order, and it's hard to beat this:
for k, v in sorted(o.items()):
I think that in most of my cases it won't work. Either because I iterate in Django templates, or I iterate several times which would make this cumbersome and wasteful.
(By the way, couldn't you make the same argument about `list.sort`?)
You could. Except that list.sort predates sorted. And it's faster and saves memory, which isn't true of your suggestion. I don't know if that would be enough to add it today, but it's more than enough to keep it around.
On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 24.09.2013 17:23, Ram Rachum wrote:
Ethan, you've misunderstood my message and given a correct objection to an argument I did not make.
I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created.
The overhead introduced by completely recreating the internal data structure after the sort is just as high as creating a new OrderedDict, so I don't understand why you don't like about:
from collections import OrderedDict o = OrderedDict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
This even allows you to keep the original insert order should you need it again. If you don't need this, you can just use:
o = dict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
which is also faster than first creating an OrderedDict and then recreating it with sorted entries.
Put those two lines into a function and you have:
def SortedOrderedDict(*args, **kws): o = dict(*args, **kws) return OrderedDict(sorted(o.iteritems()))
p = SortedOrderedDict(((3,4), (5,4), (1,2)))
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Source (#1, Sep 24 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-28 <http://egenix.com/go492013-09-28>: PyDDF Sprint ... 4 days to go 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas

On Tue, Sep 24, 2013, at 12:33, Ram Rachum wrote:
For the record, I think that having a SortedDict in the stdlib would be awesome.
There are two issues with that. First of all, this demands that every element be orderable with every other element. Since not every element is going to be compared with every other element on insertion, it's easy to imagine a case where this won't be caught until it's sorted again later on. And this is ignoring the pathological behavior of floating-point NaN values, which already silently break list sorting. (Can someone explain to me how nan works as a dict key, by the way?) Secondly, a SortedDict (or SortedSet) implies that the sorting is used _instead of_ hashing, for lookup. This raises the question as to whether keys/elements should be required to be hashable. On the one hand, requiring them to be hashable gives you the implied guarantee of an immutable equality relationship, which is _likely_ to also imply (on orderable types) an immutable ordering, whereas there is nothing else that can be used that directly implies an immutable ordering.

From: "random832@fastmail.us" <random832@fastmail.us> Sent: Tuesday, September 24, 2013 10:24 AM
On Tue, Sep 24, 2013, at 12:33, Ram Rachum wrote:
For the record, I think that having a SortedDict in the stdlib would be awesome.
There are two issues with that.
This discussion comes up at least once every two months, and I don't think anyone wants to have the whole discussion all over again. See http://stupidpythonideas.blogspot.com/2013/07/sorted-collections-in-stdlib.h..., which I wrote one or two iterations ago to collect all of the issues, and please let me know if I missed any or you have anything to add. Your two issues aren't really problems, just choices to be made, and I think everyone who's interested in this who has an opinion is unanimous. (There _is_ a problem, however: there are multiple good implementations out there, but none of them comes with someone who's willing to stdlibify it and maintain it for a few years…) But briefly: Yes, every key must be comparable with every other key, and the comparison must define a strict weak order, and the keys must be comparison-immutable, and there's no way to test either of those automatically. By comparison, a dict needs hashable keys, which can be tested automatically, and equality-immutable and hash-immutable keys, which can't really be tested but in practice hash is an acceptable test. But it's no worse than many other requirements in the stdlib that can't be tested automatically. And yes, NaN is a problem, but it's exactly the same problem it is everywhere else in Python.

Andrew Barnert writes:
See http://stupidpythonideas.blogspot.com/2013/07/sorted-collections-in-stdlib.h..., which I wrote one or two iterations ago to collect all of the issues, and please let me know if I missed any or you have anything to add.
A small nit: SortedSequence and SortedDicts should be mappings, guaranteeing "fast" (preferably O(1)) access for any key (integral and arbitrary, respectively). Therefore, in the case of a SortedDict the user should be no more surprised at a complaint about hashability than they should be in the case of a dict (especially considering the name!) I'll grant that some users might be perfectly happy with O(log N) "reasonably fast" access, but others would not be pleased.

On Sep 24, 2013, at 21:27, "Stephen J. Turnbull" <stephen@xemacs.org> wrote:
Andrew Barnert writes:
See http://stupidpythonideas.blogspot.com/2013/07/sorted-collections-in-stdlib.h..., which I wrote one or two iterations ago to collect all of the issues, and please let me know if I missed any or you have anything to add.
A small nit: SortedSequence and SortedDicts should be mappings, guaranteeing "fast" (preferably O(1)) access for any key (integral and arbitrary, respectively). Therefore, in the case of a SortedDict the user should be no more surprised at a complaint about hashability than they should be in the case of a dict (especially considering the name!)
I'll grant that some users might be perfectly happy with O(log N) "reasonably fast" access, but others would not be pleased.
O(log N) is fast enough for the standard mappings in C++, Java, etc., are python users more demanding of performance than C++? I don't know of any language that has a SortedAndHashedDict in it's stdlib, but there are many that have a SortedDict based on a tree. I don't know of any modules on PyPI that offer the former, but multiple popular modules offer the latter. Also, given a SortedSequence and a dict, you can trivially build a SortedAndHashedDict if you really want it for something; without SortedSequence, you can't. The other way around isn't true; if you want a SortedDict, without the time and space and requirements burden, a SortedAndHashedSequence is no help. If you think the name SortedDict is misleading, we could call it something different, with fewer implications. But, given that libraries like blist generally offer the type under a name like SortedDict, and in other languages that offer both tree-based and hash-based collections the names are always parallel (like map and unordered_map in C++), I don't think this is a problem.

On 09/25/2013 08:59 AM, Andrew Barnert wrote:
On Sep 24, 2013, at 21:27, "Stephen J. Turnbull" <stephen@xemacs.org> wrote:
I'll grant that some users might be perfectly happy with O(log N) "reasonably fast" access, but others would not be pleased.
O(log N) is fast enough for the standard mappings in C++, Java, etc., are python users more demanding of performance than C++?
I admit I know next to nothing about C++ and Java, but in Python the dict is ubiquitous: modules have them, classes have them, nearly every user defined instance has them, they're passed into functions, they're used for dispatch tables, etc., etc.. So I suspect that Python is more demanding of its mapping than the others are. -- ~Ethan~

On Sep 25, 2013, at 9:53, Ethan Furman <ethan@stoneleaf.us> wrote:
On 09/25/2013 08:59 AM, Andrew Barnert wrote:
On Sep 24, 2013, at 21:27, "Stephen J. Turnbull" <stephen@xemacs.org> wrote:
I'll grant that some users might be perfectly happy with O(log N) "reasonably fast" access, but others would not be pleased.
O(log N) is fast enough for the standard mappings in C++, Java, etc., are python users more demanding of performance than C++?
I admit I know next to nothing about C++ and Java, but in Python the dict is ubiquitous: modules have them, classes have them, nearly every user defined instance has them, they're passed into functions, they're used for dispatch tables, etc., etc..
So I suspect that Python is more demanding of its mapping than the others are.
Nobody is suggesting replacing dict with a tree-based mapping, just adding one in the collections module for the use cases where it's what you want.

On Tue, Sep 24, 2013, at 23:27, Andrew Barnert wrote:
Yes, every key must be comparable with every other key, and the comparison must define a strict weak order, and the keys must be comparison-immutable, and there's no way to test either of those automatically. By comparison, a dict needs hashable keys, which can be tested automatically, and equality-immutable and hash-immutable keys, which can't really be tested but in practice hash is an acceptable test.
I think of this as part of the hashable protocol, whereas we know that lists are orderable despite being mutable.
But it's no worse than many other requirements in the stdlib that can't be tested automatically.
And yes, NaN is a problem, but it's exactly the same problem it is everywhere else in Python.
I was serious about wanting to know how dictionaries handle NaN as a key. Is it a special case? The obvious way of implementing it would conclude it is a hash collision but not a match. I notice that Decimal('NaN') and float nan don't match each other (as do any other float/Decimal with the same value) but they do both work as dictionary keys.

On 25 September 2013 14:47, <random832@fastmail.us> wrote:
And yes, NaN is a problem, but it's exactly the same problem it is everywhere else in Python.
I was serious about wanting to know how dictionaries handle NaN as a key. Is it a special case? The obvious way of implementing it would conclude it is a hash collision but not a match. I notice that Decimal('NaN') and float nan don't match each other (as do any other float/Decimal with the same value) but they do both work as dictionary keys.
They're effectively compared by identity:
{float('nan'), float('nan')} set([nan, nan]) a = float('nan') {a, a} set([nan])
Oscar

On Sep 25, 2013, at 6:47, random832@fastmail.us wrote:
On Tue, Sep 24, 2013, at 23:27, Andrew Barnert wrote:
Yes, every key must be comparable with every other key, and the comparison must define a strict weak order, and the keys must be comparison-immutable, and there's no way to test either of those automatically. By comparison, a dict needs hashable keys, which can be tested automatically, and equality-immutable and hash-immutable keys, which can't really be tested but in practice hash is an acceptable test.
I think of this as part of the hashable protocol, whereas we know that lists are orderable despite being mutable.
Please read the blog post rather than the one-line summary if you want to discuss the contents.
But it's no worse than many other requirements in the stdlib that can't be tested automatically.
And yes, NaN is a problem, but it's exactly the same problem it is everywhere else in Python.
I was serious about wanting to know how dictionaries handle NaN as a key. Is it a special case? The obvious way of implementing it would conclude it is a hash collision but not a match.
I believe that, at least in CPython and PyPy, a hash collision is a match if they're identical or equal, which is why NaN values work, and why float("nan") and Decimal("nan") aren't matches, and so on. But is there anything in the documentation that requires this, or is it just a side effect of implementation specifics? I don't know.
I notice that Decimal('NaN') and float nan don't match each other (as do any other float/Decimal with the same value) but they do both work as dictionary keys.

On Tue, 24 Sep 2013 18:51:43 +0300 Ram Rachum <ram@rachum.com> wrote:
I get your point. It's a nice idea. But I think it's slightly less elegant to create another dict. So I think it's almost as good as having a `.sort` method, but not quite as nice.
(By the way, couldn't you make the same argument about `list.sort`?)
list.sort() sorts the list in-place, it doesn't reallocate a new vector to replace the old one. (AFAIR anyway, but I trust Tim and Raymond here (or was it Tim, Tim, Raymond and Tim? :-)). Regards Antoine.

On 24.09.2013 17:51, Ram Rachum wrote:
I get your point. It's a nice idea. But I think it's slightly less elegant to create another dict. So I think it's almost as good as having a `.sort` method, but not quite as nice.
You can avoid the temp dict by doing some introspection of the arguments and using iterators instead.
(By the way, couldn't you make the same argument about `list.sort`?)
The use case is different. With list.sort() you don't want to create a copy of the list, but instead have the list sort itself, since you're not interested in the original order. You'd only use an OrderedDict to begin with if you're interested in the insert order, otherwise you'd start out with a plain dict().
On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 24.09.2013 17:23, Ram Rachum wrote:
Ethan, you've misunderstood my message and given a correct objection to an argument I did not make.
I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created.
The overhead introduced by completely recreating the internal data structure after the sort is just as high as creating a new OrderedDict, so I don't understand why you don't like about:
from collections import OrderedDict o = OrderedDict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
This even allows you to keep the original insert order should you need it again. If you don't need this, you can just use:
o = dict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
which is also faster than first creating an OrderedDict and then recreating it with sorted entries.
Put those two lines into a function and you have:
def SortedOrderedDict(*args, **kws): o = dict(*args, **kws) return OrderedDict(sorted(o.iteritems()))
p = SortedOrderedDict(((3,4), (5,4), (1,2)))
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Source (#1, Sep 24 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-28: PyDDF Sprint ... 4 days to go 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 24 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-28: PyDDF Sprint ... 4 days to go 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On 25 September 2013 04:10, M.-A. Lemburg <mal@egenix.com> wrote:
On 24.09.2013 17:51, Ram Rachum wrote:
I get your point. It's a nice idea. But I think it's slightly less elegant to create another dict. So I think it's almost as good as having a `.sort` method, but not quite as nice.
You can avoid the temp dict by doing some introspection of the arguments and using iterators instead.
(By the way, couldn't you make the same argument about `list.sort`?)
The use case is different. With list.sort() you don't want to create a copy of the list, but instead have the list sort itself, since you're not interested in the original order.
You'd only use an OrderedDict to begin with if you're interested in the insert order, otherwise you'd start out with a plain dict().
Not quite. As Ram showed, it's perfectly possible to sort an OrderedDict in-place, which you couldn't do with a normal dict. In which case you're looking at equivalent semantics as for a list (where items are just added using append) - using Ram's implementation above:
import collections
class SortableOrderedDict(collections.OrderedDict): ... def sort(self, key=None): ... sorted_keys = sorted(self.keys(), key=key) ... for key_ in sorted_keys[1:]: ... self.move_to_end(key_) ... x = [] x.append('c') x.append('b') x.sort() x.append('a')
y = SortableOrderedDict() y['c'] = 1 y['b'] = 2 y.sort() y['a'] = 3
print(x) ['b', 'c', 'a'] print(y) SortableOrderedDict([('b', 2), ('c', 1), ('a', 3)]) print(x == list(y.keys())) True
FWIW Ram I think you should put the implementation up on PyPI. Tim Delaney

On 25 Sep 2013 01:52, "Ram Rachum" <ram@rachum.com> wrote:
I get your point. It's a nice idea. But I think it's slightly less
elegant to create another dict. So I think it's almost as good as having a `.sort` method, but not quite as nice.
(By the way, couldn't you make the same argument about `list.sort`?)
No, because list.sort() both predates the sorted builtin and is optimised to be blazingly fast with reasonable memory overhead by directly interacting with internal details of the list object. It's actually the pre-existing list sorting machinery that powers the builtin. The situation is different now: the sorted builtin provides a generic API to get a sorted version of any iterable. This means a proposed in-place sort() method on a container has to demonstrate a few things to overcome the "default deny" that is applied to any proposal to add more methods to an object interface: - there are common use cases that can't be handled by sorting the input when creating the container in the first place - there are significant speed gains from an in-place sorting operation - there are significant memory gains from an in-place sorting operation Now, in the case of OrderedDict it *may* be possible to back up one or more of those assertions (especially the latter two if you talk to Eric Snow about an in-place sort method for his C implementation of the API). However, in the absence of such evidence, the default reaction will always be to avoid expanding APIs with functionality that can be provided by applying external algorithms to the existing API. Cheers, Nick.
On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 24.09.2013 17:23, Ram Rachum wrote:
Ethan, you've misunderstood my message and given a correct objection
to an
argument I did not make.
I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created.
The overhead introduced by completely recreating the internal data structure after the sort is just as high as creating a new OrderedDict, so I don't understand why you don't like about:
from collections import OrderedDict o = OrderedDict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
This even allows you to keep the original insert order should you need it again. If you don't need this, you can just use:
o = dict(((3,4), (5,4), (1,2))) p = OrderedDict(sorted(o.iteritems()))
which is also faster than first creating an OrderedDict and then recreating it with sorted entries.
Put those two lines into a function and you have:
def SortedOrderedDict(*args, **kws): o = dict(*args, **kws) return OrderedDict(sorted(o.iteritems()))
p = SortedOrderedDict(((3,4), (5,4), (1,2)))
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Source (#1, Sep 24 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-28: PyDDF Sprint ... 4 days to go 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas

Ram Rachum writes:
I disagree with defining the entire class as an insertion ordering class and refusing
There's no refusal. It's just not in the battery pack.
to allow users to reorder it as they wish after it's created.
You can put your inefficient but useful implementation on PyPI. You can write a PEP in which you define the API. You can provide an efficient implementation suitable for the stdlib, or you can convince the gatekeepers that it doesn't need to be efficient. You can promise to maintain it for 5 years.[1] Why don't you? Four or five hackers do it every cycle (although sometimes it takes more than a cycle to actually get approval). Recent successes include Ethan and Steven, who are giving you the benefit of their experience. OTOH, the barrier for mere suggestions (even backed up by proof of concept implementations) these days is quite high. You need to convince somebody to do all of the above, which usually requires an argument that it's at least tricky to do right, and perhaps hard to do at all. Footnotes: [1] Or whatever the going rate is these days.

On Tue, Sep 24, 2013 at 7:02 PM, Stephen J. Turnbull <stephen@xemacs.org>wrote:
Ram Rachum writes:
I disagree with defining the entire class as an insertion ordering class and refusing
There's no refusal. It's just not in the battery pack.
to allow users to reorder it as they wish after it's created.
You can put your inefficient but useful implementation on PyPI. You can write a PEP in which you define the API. You can provide an efficient implementation suitable for the stdlib, or you can convince the gatekeepers that it doesn't need to be efficient. You can promise to maintain it for 5 years.[1]
I can do an inefficient implementation and put it on PyPI. I don't see the need for writing a PEP for a simple method. ("Define the API"? Anything I'm missing beyond a call signature `def sort(self, key=None)`?) If people here are opposed to allowing an implementation of `OrderedDict.sort` in the stdlib, I don't see a reason to waste my time putting an implementation on PyPI. What's that implementation going to help if you won't allow it anyway? Here's a simple inefficient implementation you can use: def sort(self, key=None): ''' Sort the items according to their keys, changing the order in-place. The optional `key` argument, (not to be confused with the dictionary keys,) will be passed to the `sorted` function as a key function. ''' sorted_keys = sorted(self.keys(), key=key) for key_ in sorted_keys[1:]: self.move_to_end(key_) Regarding committing to maintain it for N years: Sorry, that's beyond what I'm willing to do. If that's a requirement for contributing a minor feature to Python, I'll have to withdraw my suggestion.
Why don't you? Four or five hackers do it every cycle (although sometimes it takes more than a cycle to actually get approval). Recent successes include Ethan and Steven, who are giving you the benefit of their experience.
OTOH, the barrier for mere suggestions (even backed up by proof of concept implementations) these days is quite high. You need to convince somebody to do all of the above, which usually requires an argument that it's at least tricky to do right, and perhaps hard to do at all.
Footnotes: [1] Or whatever the going rate is these days.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas
--
--- You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/-RFTqV8_aS0/unsubscribe. To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.

On Sep 24, 2013, at 9:19, Ram Rachum <ram@rachum.com> wrote:
If people here are opposed to allowing an implementation of `OrderedDict.sort` in the stdlib, I don't see a reason to waste my time putting an implementation on PyPI. What's that implementation going to help if you won't allow it anyway?
Do you not see the benefit to ipython, numpy, requests, the various popular web frameworks, fancy collections like blist, tools like scrapy, etc. being a simple pip away? Why wouldn't the same be true for your module? A useful module on PyPI helps thousands of people who otherwise would have had to reproduce all the work themselves or settled for not having it. It also leads to de facto standard ways to do things, which makes it easier to communicate with devs on other projects. (Imagine trying to get help with "my custom multidimensional array class" or "a web scraper that I built from scratch" vs. numpy or scrapy.) Do you think your idea is so trivial that there really is no benefit in any of that?

My code* is *on PyPI, just not isolated to the OrderedDict improvements. My OrderedDict improvements are here: http://pypi.python.org/pypi/python_toolbox This is a big package with all my stuff. On Tue, Sep 24, 2013 at 7:37 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Sep 24, 2013, at 9:19, Ram Rachum <ram@rachum.com> wrote:
If people here are opposed to allowing an implementation of `OrderedDict.sort` in the stdlib, I don't see a reason to waste my time putting an implementation on PyPI. What's that implementation going to help if you won't allow it anyway?
Do you not see the benefit to ipython, numpy, requests, the various popular web frameworks, fancy collections like blist, tools like scrapy, etc. being a simple pip away? Why wouldn't the same be true for your module?
A useful module on PyPI helps thousands of people who otherwise would have had to reproduce all the work themselves or settled for not having it. It also leads to de facto standard ways to do things, which makes it easier to communicate with devs on other projects. (Imagine trying to get help with "my custom multidimensional array class" or "a web scraper that I built from scratch" vs. numpy or scrapy.)
Do you think your idea is so trivial that there really is no benefit in any of that?
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas
--
--- You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/-RFTqV8_aS0/unsubscribe. To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.

On Tue, 24 Sep 2013 22:13:15 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Sep 24, 2013 at 04:49:20AM -0700, Ram Rachum wrote:
What do you think about providing an `OrderedDict.sort` method? I've been using my own `OrderedDict` subclass that defines `sort` for years, and I always wondered why the stdlib one doesn't provide `sort`.
I can write the patch if needed.
I'm not entirely sure why anyone would need an OrderedDict sort method. Ordered Dicts store keys by insertion order. Sorting the keys goes against the purpose of an OrderedDict.
An OrderedDict is basically an associative container with a well-defined ordering. It's not only "insertion order", because you can use move_to_end() to reorder it piecewise. (at some point I also filed a feature request to rotate an OrderedDict: http://bugs.python.org/issue17100) However, sorting would be difficult to implement efficiently with the natural implementation of an OrderedDict, which uses linked lists. Basically, you're probably as good sorting the items separately and reinitializing the OrderedDict with them. Regards Antoine.

Antoine, my concern is not efficiency but convenience. Whoever has high efficiency requirements and wants to sort an ordered dict will have to find their own solution anyway, and the other 99% of people who just want to sort a 20-items-long ordered dict in their small web app could happily use `OrderedDict.sort`. And while we're on that subject, can we also add `OrderedDict.index`? On Tue, Sep 24, 2013 at 3:29 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Tue, 24 Sep 2013 22:13:15 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Sep 24, 2013 at 04:49:20AM -0700, Ram Rachum wrote:
What do you think about providing an `OrderedDict.sort` method? I've been using my own `OrderedDict` subclass that defines `sort` for years, and I always wondered why the stdlib one doesn't provide `sort`.
I can write the patch if needed.
I'm not entirely sure why anyone would need an OrderedDict sort method. Ordered Dicts store keys by insertion order. Sorting the keys goes against the purpose of an OrderedDict.
An OrderedDict is basically an associative container with a well-defined ordering. It's not only "insertion order", because you can use move_to_end() to reorder it piecewise. (at some point I also filed a feature request to rotate an OrderedDict: http://bugs.python.org/issue17100)
However, sorting would be difficult to implement efficiently with the natural implementation of an OrderedDict, which uses linked lists. Basically, you're probably as good sorting the items separately and reinitializing the OrderedDict with them.
Regards
Antoine.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas
--
--- You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/-RFTqV8_aS0/unsubscribe. To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
participants (13)
-
Andrew Barnert
-
Antoine Pitrou
-
Brian Curtin
-
Ethan Furman
-
M.-A. Lemburg
-
Nick Coghlan
-
Oscar Benjamin
-
Ram Rachum
-
Ram Rachum
-
random832@fastmail.us
-
Stephen J. Turnbull
-
Steven D'Aprano
-
Tim Delaney