data:image/s3,"s3://crabby-images/e8710/e87101471e2aba8edbbfbc9ccdef1c5541486e3f" alt=""
Some kind of ordered dictionary would be nice to have in the standard library. e.g. a AVL tree or something like that. It would be nice so we can do things like that: for value in tree[:end_key]: do_something_with(value) del tree[:end_key] A alternative would be just to sort the keys of a dict but that's O( n log n ) for each sort. Depending on what's the more often occurring case (lookup, insert, get key-range, etc.) a other kind of dict object would make sense. What do you think? -panzi
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
Mathias Panzenböck <grosser.meister.morti@gmx.net> wrote:
This has been brought up many times. The general consensus has been 'you don't get what you think you get'. >>> u'a' < 'b' < () < u'a' True That is to say, there isn't a total ordering on objects that would make sense as a sorted key,value dictionary. In Python 3.0, objects that don't make sense to compare won't be comparable, so list.sort() and/or an AVL tree may make sense again. However, the problem with choosing to use an AVL (Red-Black, 2-3, etc.) tree is deciding semantics. Do you allow duplicate keys? Do you allow insertion and removal by position? Do you allow the fetching of the key/value at position X? Do you allow the fetching of the position for key X? Insertion before/after (bisect_left, bisect_right equivalents). Etcetera. In many cases, using a sorted list gets you what you want, is almost as fast, and has the benefit of using less memory. - Josiah
data:image/s3,"s3://crabby-images/e8710/e87101471e2aba8edbbfbc9ccdef1c5541486e3f" alt=""
Josiah Carlson schrieb:
However, the problem with choosing to use an AVL (Red-Black, 2-3, etc.) tree is deciding semantics. Do you allow duplicate keys?
Does dict? no. so no.
Do you allow insertion and removal by position?
Does dict? no. so no.
Do you allow the fetching of the key/value at position X?
Does dict? no. so no.
Do you allow the fetching of the position for key X?
Does dict? no. so no.
Insertion before/after (bisect_left, bisect_right equivalents). Etcetera.
Why should all this be relevant? It just has to be some kind of relation between a key and a value, and the keys should be accessible in a sorted way (and you should not to have to sort them every time). So it would be possible to slice such a container.
In many cases, using a sorted list gets you what you want, is almost as fast, and has the benefit of using less memory.
AFAIK does a doubled link list use the same amount of memory as a (very) simple AVL tree: struct tree_node { struct tree_node left; struct tree_node right; void * data; }; struct list_node { struct list_node prev; struct list_node next; void * data; };
- Josiah
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
Mathias Panzenböck <grosser.meister.morti@gmx.net> wrote:
Very few use-cases of trees involve an ordered key/value dictionary. In 90% of the cases where I have needed (and implemented) trees involved one of the following use-cases; sorted keys (but no values), no but fast insertion of value based on position, sorted keys indexed by position or key with (and without) values, etc. Please also understand that the semantics of Python's dictionary is a function of its implementation as an open-addressed hash table. It's useful for 95% of use-cases, but among the remaining 5% (which includes the use-case you have in mind for the structure), there is a huge variety of just as significant uses that shouldn't be discounted.
Python lists aren't linked lists. If you didn't know this, then you don't know enough about the underlying implementation to make comments about what should or should not be available in the base language. - Josiah
data:image/s3,"s3://crabby-images/dea96/dea96da8fd6a0710ec6541b9f8351d90b6557e3e" alt=""
Josiah Carlson wrote:
I generally agree. I also think that the term "ordered dictionary" ought to be avoided. One the one hand, I have no particular objection to someone creating an implementation of RB trees, B+-trees, PATRICIA radix trees and so on - in fact, these might be very useful things to have as standard collection classes. However, 'dict' has a whole set of semantic baggage that goes along with it that may or may not apply to these other container types; And similarly, these other container types may have operations and semantics that don't correspond to the standard Python dictionary. One expects to be able to do certain things with an RB-tree that are either disallowed or very inefficient with a regular dict, and the converse is true as well. You give a number of examples such as fetching the position for a given key. So my feeling is - let trees be trees, and dicts be dicts, and don't attempt to conflate the two. Otherwise, you end up with what I like to call the "overfactoring" anti-pattern - that is, attempt to generalize and unify two disparate systems that have different purposes and design intents into a single uniform interface. -- Talin
data:image/s3,"s3://crabby-images/e2594/e259423d3f20857071589262f2cb6e7688fbc5bf" alt=""
"Mathias Panzenböck" <grosser.meister.morti@gmx.net> wrote in message news:4628DF1F.3060803@gmx.net... | Some kind of ordered dictionary would be nice to have in the | standard library. This has come up frequently, with 'ordered' having two quite different meanings. 1. Order of entry into the dictionary (for use with class definitions, for instance(though don't ask me why!). When a given key is entered just once, this is relatively easy: just append to a subsidiary list. I believe this is being at least considered for 3.0. 2. Order in the sorting or collation sense, which I presume you mean. To reduce confusion, call this a sorted dictionary, as others have done. Regardless, this has the problem that potential keys are not always comparable. This will become worse when most cross-type comparisons are disallowed in 3.0. So pershaps the __init__ method should require a tuple of allowed key types. | e.g. a AVL tree or something like that. ... | A alternative would be just to sort the keys of a dict but | that's O( n log n ) for each sort. Depending on what's the more | often occurring case (lookup, insert, get key-range, etc.) a | other kind of dict object would make sense. | | What do you think? If not already present in PyPI, someone could code an implementation and add it there. When such has be tested and achieved enough usage, then it might be proposed for addition to the collections module. Terry Jan Reedy
data:image/s3,"s3://crabby-images/fa766/fa766e40d94f4d3ec20cfe84de928f13a528a508" alt=""
On 4/20/07, Terry Reedy <tjreedy@udel.edu> wrote:
If it is not a problem for lists it is not a problem for ordered dictionaries.
And that is how the currently considered for Python 3.0 ordered dict implementation got into Python? I find it amusing that over the years people have argued against having an ordered dict in Python. But now, when one is considered, only THAT version with THOSE semantics, is good. The rest should go to PyPI. -- mvh Björn
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote:
It's about a total ordering. Without a total ordering, you won't necessarily be able to *find* an object even if it is in there.
Also, in 3.0, objects will only be orderable if they are of compatible type. str and tuple are not compatible, so will raise an exception when something like "" < () is performed.
No, the "ordered dict" that is making its way into Python 3.0 is specifically ordered based on insertion order, and is to make more reasonable database interfaces like... class Person(db.table): firstname = str ... Its implementation is also a very simple variant of a dictionary, which isn't the case with any tree implementation. Further, because there are *so many* possible behaviors for a dictionary ordered by keys implemented as a tree, picking one (or even a small set of them) is guaranteed to raise comments of "can't we have one that does X too?" - Josiah
data:image/s3,"s3://crabby-images/e2594/e259423d3f20857071589262f2cb6e7688fbc5bf" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote in message news:740c3aec0704202128g6537c5bfv94c0f60a5d883d76@mail.gmail.com...
Regardless, this has the problem that potential keys are not always comparable.
Current example:
[1, 1j].sort()
Traceback (most recent call last): File "<pyshell#2>", line 1, in -toplevel- [1, 1j].sort() TypeError: no ordering relation is defined for complex numbers
This will become worse when most cross-type comparisons are disallowed in 3.0.
Py 3.0 will raise an exception here as these will all be incomparable.
If it is not a problem for lists it is not a problem for ordered dictionaries.
But it *is* currently a problem for lists that will become much more extensive in the future, so it *is* currently a problem for sorted dicts that will be much more of a problem in the future. Hence, sorted dicts will have to be restricted to one type or one group of truly comparable types. Terry Jan Reedy
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote:
You could, but that would imply a total ordering on elements that Python itself is removing because it doesn't make any sense. Including a list of 'acceptable' classes as Terry has suggested would work, but would generally be superfluous. The moment a user first added an object to the sorted dictionary is the moment the type of objects that can be inserted is easily limited (hello Abstract Base Classes PEP!) - Josiah
data:image/s3,"s3://crabby-images/fa766/fa766e40d94f4d3ec20cfe84de928f13a528a508" alt=""
On 4/21/07, Josiah Carlson <jcarlson@uci.edu> wrote:
Where did the "we are all consenting adults here" mantra go? Java doesn't imply any total order on elements either, yet it manages to fit a TreeMap class that does not artificially limit the kind of items you can put in it. Yes, you can "screw up" by overriding the hashCode and equals methods of the items you put in it. Java, in this case, doesn't try to enforce correctness on the language level, instead it documents the contract the programmer is supposed to follow. m1.equals(m2) should imply that m1.hashCode() == m2.hashCode(). Python suffer the same "problem": class Obj: def __eq__(self, o): return 0 o1 = Obj() o2 = Obj() L = [o1, o2] assert L.index(o2) == 1 Similar fuck ups are possible when using dicts. In practice this is not a problem. An ordered dict doesn't need any more safeguards than Python's already existing data structures. Using the natural order of its items are just fine and when you need something more fancy, override the __eq__ method or give the collections sort method a comparator function argument. -- mvh Björn
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote:
At no point has there been discussion over removing the ability for types which don't have a total ordering to be placed into a dictionary (hash table). a = {1:2, None:6, 'hello':0} will always work. The only thing that anyone has talked about is the removal of >, >=, <, <= for types that make no sense to compare. Like unicode and tuple, or int and tuple, or int and list, etc. The removal of a "total ordering" does not imply that 5 != 'hello' will somehow start failing, it means that 5 < 'hello' will begin to raise an exception because it doesn't make any sense.
Except this series of posts is about a "sorted dict", with a key,value mapping in which the equivalent .items() are sorted() as an ordering (rather than more or less dependant on hash value as in a standard dictionary). But as I, and others have stated before, which you should read once again because you don't seem to get it: THE EXISTANCE OF A TOTAL ORDERING ON VALUES IN PYTHON TODAY IS A LIE. IN FUTURE PYTHONS WE ARE REMOVING THE LIE BECAUSE IT DOESN'T HELP ANYONE. IF YOU DON'T LIKE IT; TOUGH COOKIES. STANDARD PYTHON DICTIONARIES WILL WORK THE WAY THEY ALWAYS HAVE. ONLY PEOPLE WHO BELIEVE THAT INCOMPATIBLE TYPES SHOULD BE ORDERED IN A PARTICULAR WAY IN THINGS LIKE lst.sort() WILL BE AFFECTED. If you want an actual reference, please see PEP 3100 which says, "Comparisons other than == and != between disparate types will raise an exception unless explicitly supported by the type" ... and references: http://mail.python.org/pipermail/python-dev/2004-June/045111.html If you don't understand this, please ask again without profanity or accusing the Python developers of removing the "consenting adults" requirement. Python is getting smarter. Maybe you just don't understand why this is the case. - Josiah
data:image/s3,"s3://crabby-images/fa766/fa766e40d94f4d3ec20cfe84de928f13a528a508" alt=""
On 4/26/07, Josiah Carlson <jcarlson@uci.edu> wrote:
I really do not understand what you are talking about. Maybe you have misunderstood something? Lets talk about Java for a moment. Java contains a class called TreeMap which has all the features that the original poster asked for. The Python equivalent to TreeMap would be a "sorted dictionary." Java, just like Python 3k will, forbids comparisions between disparate Comparable types. It follows that Java does not enforce any "total ordering" on disparate types either. The absence of a total ordering does not mean that Java's TreeMap class' constructor needs to be supplied with a list of "allowed key types" as you and Terry Reedy suggested that Python's hypothetical sorted dictionary would need. Try the following Java code: TreeMap tm = new TreeMap(); tm.put(new Integer(3), "moo"); tm.put(new Double(7), "moo"); Java is definitely not designed according to the "we are all consenting adults" philosophy, but it has no problem whatsoever accepting this code. Note that you did NOT have to specify the "allowed key types" and that Integer and Double are incomparable. But when you run it: Exception in thread "main" java.lang.ClassCastException at java.lang.Double.compareTo(Double.java:642) at java.util.TreeMap.compare(TreeMap.java:1085) at java.util.TreeMap.put(TreeMap.java:463) at comp.main(comp.java:9) Which is exactly how I am suggesting that the hypothetical sorted dict class should work too (in py3k). You can read more about these Java classes and interfaces here http://java.sun.com/j2se/1.4.2/docs/api/java/util/TreeMap.html and here http://java.sun.com/j2se/1.4.2/docs/api/java/util/SortedMap.html. I hope you understand now why having to specify or restrict the allowed "key types" is superfluous and why a sorted dict "doesn't need any more safeguards than Python's already existing data structures." -- mvh Björn
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote:
In your post with Message-Id: <740c3aec0704202128g6537c5bfv94c0f60a5d883d76@mail.gmail.com> you state...
I pointed out in a reply to that message that it would be a problem because the sort will fail in Python 3.x . In your post with Message-Id: <740c3aec0704210817m3e11a9e5lb3325523e2490348@mail.gmail.com> you offer...
Alternatively, you could require a comparator function to be specified at creation time.
Now you offer Java TreeMap as an example of semantics that you would like to duplicate, from which I now understand the context of offering a 'comparator function'. Your current desired semantics are possible in Python 3.x, as it no longer relies on a total ordering, but merely a "natural ordering" (which should be defined for all a,b pairs both of type c), which list.sort() will rely on in 3.x as well. The question at this point is whether or not we want the equivalent of a Java TreeMap in Python. I'm leaning towards no, as I have found very few uses of such things in my own code. There's also the fact that Daniel Stutzbach's BList implementation (which may make it into Python's collections module), would allow for a more or less equivalent implementation of your desired semantics using bisect, though each operation would suffer an O(logn) slowdown for overall O(nlog^2n) sorting and O(log^2n) insertion/deletion/search per item (O(logn) queries, O(logn) per query*). Would that be sufficient? - Josiah * The BList is at worst log base 64 (at beast 128), so if you are emulating a TreeMap using BList and bisect, then in the worst case of 1 billion items in your TreeMap, you are running at around 1/5 the speed of an equivalent Red-Black tree (1/4 at 16 million, 1/3 at 256k, 1/2 at 4k).
data:image/s3,"s3://crabby-images/e2594/e259423d3f20857071589262f2cb6e7688fbc5bf" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote in message news:740c3aec0704301624x60226147x7bb4c1def423441e@mail.gmail.com...
I don't believe I said 'needs' and I already agreed that such a list would be less helpful than I had suggested. But one would help give better messages from __str__ and exceptions. Also, the first item added does not get compared to anything, so without such a list, it effectively determines the key type. That said, propose what you want and see if it gets enough usage to justify addition to the collections module. Anyone who wants a sorted dict with keytype attibute could get one by subclassing one without. Terry Jan Reedy
data:image/s3,"s3://crabby-images/e8710/e87101471e2aba8edbbfbc9ccdef1c5541486e3f" alt=""
Ok, now. Forget all I said. Just a short question: When you have to store values accosiated with keys and the keys have to be accessible in a sorted manner. What container type would you use? What data structure would you implement? (I just thought a AVL tree would have been a good choice.) Thanks, panzi
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
Mathias Panzenböck <grosser.meister.morti@gmx.net> wrote:
If you want to use only things that are available in base Python, use a list and the bisect module. If you need O(logn) insertion and removal, then an AVL/Red-Black/2-3 tree with the semantics you described would also work. (I think there is both an AVL and Red-Black tree implementation in the Python package index [1]) If you only need to concern yourself with ordering every once in a while, then x = dct.items();x.sort() works reasonably well. Sometimes a "pair heap" can get you what you are looking for [2]. Data structure choices are tricky. It is usually better to describe the problem and one's approach (why you choose to use a particular algorithm and structure), rather than strictly asking "where can I find data structure X". - Josiah [1] http://www.python.org/pypi/ [2] http://mail.python.org/pipermail/python-dev/2006-November/069845.html
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
Mathias Panzenböck <grosser.meister.morti@gmx.net> wrote:
This has been brought up many times. The general consensus has been 'you don't get what you think you get'. >>> u'a' < 'b' < () < u'a' True That is to say, there isn't a total ordering on objects that would make sense as a sorted key,value dictionary. In Python 3.0, objects that don't make sense to compare won't be comparable, so list.sort() and/or an AVL tree may make sense again. However, the problem with choosing to use an AVL (Red-Black, 2-3, etc.) tree is deciding semantics. Do you allow duplicate keys? Do you allow insertion and removal by position? Do you allow the fetching of the key/value at position X? Do you allow the fetching of the position for key X? Insertion before/after (bisect_left, bisect_right equivalents). Etcetera. In many cases, using a sorted list gets you what you want, is almost as fast, and has the benefit of using less memory. - Josiah
data:image/s3,"s3://crabby-images/e8710/e87101471e2aba8edbbfbc9ccdef1c5541486e3f" alt=""
Josiah Carlson schrieb:
However, the problem with choosing to use an AVL (Red-Black, 2-3, etc.) tree is deciding semantics. Do you allow duplicate keys?
Does dict? no. so no.
Do you allow insertion and removal by position?
Does dict? no. so no.
Do you allow the fetching of the key/value at position X?
Does dict? no. so no.
Do you allow the fetching of the position for key X?
Does dict? no. so no.
Insertion before/after (bisect_left, bisect_right equivalents). Etcetera.
Why should all this be relevant? It just has to be some kind of relation between a key and a value, and the keys should be accessible in a sorted way (and you should not to have to sort them every time). So it would be possible to slice such a container.
In many cases, using a sorted list gets you what you want, is almost as fast, and has the benefit of using less memory.
AFAIK does a doubled link list use the same amount of memory as a (very) simple AVL tree: struct tree_node { struct tree_node left; struct tree_node right; void * data; }; struct list_node { struct list_node prev; struct list_node next; void * data; };
- Josiah
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
Mathias Panzenböck <grosser.meister.morti@gmx.net> wrote:
Very few use-cases of trees involve an ordered key/value dictionary. In 90% of the cases where I have needed (and implemented) trees involved one of the following use-cases; sorted keys (but no values), no but fast insertion of value based on position, sorted keys indexed by position or key with (and without) values, etc. Please also understand that the semantics of Python's dictionary is a function of its implementation as an open-addressed hash table. It's useful for 95% of use-cases, but among the remaining 5% (which includes the use-case you have in mind for the structure), there is a huge variety of just as significant uses that shouldn't be discounted.
Python lists aren't linked lists. If you didn't know this, then you don't know enough about the underlying implementation to make comments about what should or should not be available in the base language. - Josiah
data:image/s3,"s3://crabby-images/dea96/dea96da8fd6a0710ec6541b9f8351d90b6557e3e" alt=""
Josiah Carlson wrote:
I generally agree. I also think that the term "ordered dictionary" ought to be avoided. One the one hand, I have no particular objection to someone creating an implementation of RB trees, B+-trees, PATRICIA radix trees and so on - in fact, these might be very useful things to have as standard collection classes. However, 'dict' has a whole set of semantic baggage that goes along with it that may or may not apply to these other container types; And similarly, these other container types may have operations and semantics that don't correspond to the standard Python dictionary. One expects to be able to do certain things with an RB-tree that are either disallowed or very inefficient with a regular dict, and the converse is true as well. You give a number of examples such as fetching the position for a given key. So my feeling is - let trees be trees, and dicts be dicts, and don't attempt to conflate the two. Otherwise, you end up with what I like to call the "overfactoring" anti-pattern - that is, attempt to generalize and unify two disparate systems that have different purposes and design intents into a single uniform interface. -- Talin
data:image/s3,"s3://crabby-images/e2594/e259423d3f20857071589262f2cb6e7688fbc5bf" alt=""
"Mathias Panzenböck" <grosser.meister.morti@gmx.net> wrote in message news:4628DF1F.3060803@gmx.net... | Some kind of ordered dictionary would be nice to have in the | standard library. This has come up frequently, with 'ordered' having two quite different meanings. 1. Order of entry into the dictionary (for use with class definitions, for instance(though don't ask me why!). When a given key is entered just once, this is relatively easy: just append to a subsidiary list. I believe this is being at least considered for 3.0. 2. Order in the sorting or collation sense, which I presume you mean. To reduce confusion, call this a sorted dictionary, as others have done. Regardless, this has the problem that potential keys are not always comparable. This will become worse when most cross-type comparisons are disallowed in 3.0. So pershaps the __init__ method should require a tuple of allowed key types. | e.g. a AVL tree or something like that. ... | A alternative would be just to sort the keys of a dict but | that's O( n log n ) for each sort. Depending on what's the more | often occurring case (lookup, insert, get key-range, etc.) a | other kind of dict object would make sense. | | What do you think? If not already present in PyPI, someone could code an implementation and add it there. When such has be tested and achieved enough usage, then it might be proposed for addition to the collections module. Terry Jan Reedy
data:image/s3,"s3://crabby-images/fa766/fa766e40d94f4d3ec20cfe84de928f13a528a508" alt=""
On 4/20/07, Terry Reedy <tjreedy@udel.edu> wrote:
If it is not a problem for lists it is not a problem for ordered dictionaries.
And that is how the currently considered for Python 3.0 ordered dict implementation got into Python? I find it amusing that over the years people have argued against having an ordered dict in Python. But now, when one is considered, only THAT version with THOSE semantics, is good. The rest should go to PyPI. -- mvh Björn
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote:
It's about a total ordering. Without a total ordering, you won't necessarily be able to *find* an object even if it is in there.
Also, in 3.0, objects will only be orderable if they are of compatible type. str and tuple are not compatible, so will raise an exception when something like "" < () is performed.
No, the "ordered dict" that is making its way into Python 3.0 is specifically ordered based on insertion order, and is to make more reasonable database interfaces like... class Person(db.table): firstname = str ... Its implementation is also a very simple variant of a dictionary, which isn't the case with any tree implementation. Further, because there are *so many* possible behaviors for a dictionary ordered by keys implemented as a tree, picking one (or even a small set of them) is guaranteed to raise comments of "can't we have one that does X too?" - Josiah
data:image/s3,"s3://crabby-images/e2594/e259423d3f20857071589262f2cb6e7688fbc5bf" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote in message news:740c3aec0704202128g6537c5bfv94c0f60a5d883d76@mail.gmail.com...
Regardless, this has the problem that potential keys are not always comparable.
Current example:
[1, 1j].sort()
Traceback (most recent call last): File "<pyshell#2>", line 1, in -toplevel- [1, 1j].sort() TypeError: no ordering relation is defined for complex numbers
This will become worse when most cross-type comparisons are disallowed in 3.0.
Py 3.0 will raise an exception here as these will all be incomparable.
If it is not a problem for lists it is not a problem for ordered dictionaries.
But it *is* currently a problem for lists that will become much more extensive in the future, so it *is* currently a problem for sorted dicts that will be much more of a problem in the future. Hence, sorted dicts will have to be restricted to one type or one group of truly comparable types. Terry Jan Reedy
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote:
You could, but that would imply a total ordering on elements that Python itself is removing because it doesn't make any sense. Including a list of 'acceptable' classes as Terry has suggested would work, but would generally be superfluous. The moment a user first added an object to the sorted dictionary is the moment the type of objects that can be inserted is easily limited (hello Abstract Base Classes PEP!) - Josiah
data:image/s3,"s3://crabby-images/fa766/fa766e40d94f4d3ec20cfe84de928f13a528a508" alt=""
On 4/21/07, Josiah Carlson <jcarlson@uci.edu> wrote:
Where did the "we are all consenting adults here" mantra go? Java doesn't imply any total order on elements either, yet it manages to fit a TreeMap class that does not artificially limit the kind of items you can put in it. Yes, you can "screw up" by overriding the hashCode and equals methods of the items you put in it. Java, in this case, doesn't try to enforce correctness on the language level, instead it documents the contract the programmer is supposed to follow. m1.equals(m2) should imply that m1.hashCode() == m2.hashCode(). Python suffer the same "problem": class Obj: def __eq__(self, o): return 0 o1 = Obj() o2 = Obj() L = [o1, o2] assert L.index(o2) == 1 Similar fuck ups are possible when using dicts. In practice this is not a problem. An ordered dict doesn't need any more safeguards than Python's already existing data structures. Using the natural order of its items are just fine and when you need something more fancy, override the __eq__ method or give the collections sort method a comparator function argument. -- mvh Björn
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote:
At no point has there been discussion over removing the ability for types which don't have a total ordering to be placed into a dictionary (hash table). a = {1:2, None:6, 'hello':0} will always work. The only thing that anyone has talked about is the removal of >, >=, <, <= for types that make no sense to compare. Like unicode and tuple, or int and tuple, or int and list, etc. The removal of a "total ordering" does not imply that 5 != 'hello' will somehow start failing, it means that 5 < 'hello' will begin to raise an exception because it doesn't make any sense.
Except this series of posts is about a "sorted dict", with a key,value mapping in which the equivalent .items() are sorted() as an ordering (rather than more or less dependant on hash value as in a standard dictionary). But as I, and others have stated before, which you should read once again because you don't seem to get it: THE EXISTANCE OF A TOTAL ORDERING ON VALUES IN PYTHON TODAY IS A LIE. IN FUTURE PYTHONS WE ARE REMOVING THE LIE BECAUSE IT DOESN'T HELP ANYONE. IF YOU DON'T LIKE IT; TOUGH COOKIES. STANDARD PYTHON DICTIONARIES WILL WORK THE WAY THEY ALWAYS HAVE. ONLY PEOPLE WHO BELIEVE THAT INCOMPATIBLE TYPES SHOULD BE ORDERED IN A PARTICULAR WAY IN THINGS LIKE lst.sort() WILL BE AFFECTED. If you want an actual reference, please see PEP 3100 which says, "Comparisons other than == and != between disparate types will raise an exception unless explicitly supported by the type" ... and references: http://mail.python.org/pipermail/python-dev/2004-June/045111.html If you don't understand this, please ask again without profanity or accusing the Python developers of removing the "consenting adults" requirement. Python is getting smarter. Maybe you just don't understand why this is the case. - Josiah
data:image/s3,"s3://crabby-images/fa766/fa766e40d94f4d3ec20cfe84de928f13a528a508" alt=""
On 4/26/07, Josiah Carlson <jcarlson@uci.edu> wrote:
I really do not understand what you are talking about. Maybe you have misunderstood something? Lets talk about Java for a moment. Java contains a class called TreeMap which has all the features that the original poster asked for. The Python equivalent to TreeMap would be a "sorted dictionary." Java, just like Python 3k will, forbids comparisions between disparate Comparable types. It follows that Java does not enforce any "total ordering" on disparate types either. The absence of a total ordering does not mean that Java's TreeMap class' constructor needs to be supplied with a list of "allowed key types" as you and Terry Reedy suggested that Python's hypothetical sorted dictionary would need. Try the following Java code: TreeMap tm = new TreeMap(); tm.put(new Integer(3), "moo"); tm.put(new Double(7), "moo"); Java is definitely not designed according to the "we are all consenting adults" philosophy, but it has no problem whatsoever accepting this code. Note that you did NOT have to specify the "allowed key types" and that Integer and Double are incomparable. But when you run it: Exception in thread "main" java.lang.ClassCastException at java.lang.Double.compareTo(Double.java:642) at java.util.TreeMap.compare(TreeMap.java:1085) at java.util.TreeMap.put(TreeMap.java:463) at comp.main(comp.java:9) Which is exactly how I am suggesting that the hypothetical sorted dict class should work too (in py3k). You can read more about these Java classes and interfaces here http://java.sun.com/j2se/1.4.2/docs/api/java/util/TreeMap.html and here http://java.sun.com/j2se/1.4.2/docs/api/java/util/SortedMap.html. I hope you understand now why having to specify or restrict the allowed "key types" is superfluous and why a sorted dict "doesn't need any more safeguards than Python's already existing data structures." -- mvh Björn
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote:
In your post with Message-Id: <740c3aec0704202128g6537c5bfv94c0f60a5d883d76@mail.gmail.com> you state...
I pointed out in a reply to that message that it would be a problem because the sort will fail in Python 3.x . In your post with Message-Id: <740c3aec0704210817m3e11a9e5lb3325523e2490348@mail.gmail.com> you offer...
Alternatively, you could require a comparator function to be specified at creation time.
Now you offer Java TreeMap as an example of semantics that you would like to duplicate, from which I now understand the context of offering a 'comparator function'. Your current desired semantics are possible in Python 3.x, as it no longer relies on a total ordering, but merely a "natural ordering" (which should be defined for all a,b pairs both of type c), which list.sort() will rely on in 3.x as well. The question at this point is whether or not we want the equivalent of a Java TreeMap in Python. I'm leaning towards no, as I have found very few uses of such things in my own code. There's also the fact that Daniel Stutzbach's BList implementation (which may make it into Python's collections module), would allow for a more or less equivalent implementation of your desired semantics using bisect, though each operation would suffer an O(logn) slowdown for overall O(nlog^2n) sorting and O(log^2n) insertion/deletion/search per item (O(logn) queries, O(logn) per query*). Would that be sufficient? - Josiah * The BList is at worst log base 64 (at beast 128), so if you are emulating a TreeMap using BList and bisect, then in the worst case of 1 billion items in your TreeMap, you are running at around 1/5 the speed of an equivalent Red-Black tree (1/4 at 16 million, 1/3 at 256k, 1/2 at 4k).
data:image/s3,"s3://crabby-images/e2594/e259423d3f20857071589262f2cb6e7688fbc5bf" alt=""
"BJörn Lindqvist" <bjourne@gmail.com> wrote in message news:740c3aec0704301624x60226147x7bb4c1def423441e@mail.gmail.com...
I don't believe I said 'needs' and I already agreed that such a list would be less helpful than I had suggested. But one would help give better messages from __str__ and exceptions. Also, the first item added does not get compared to anything, so without such a list, it effectively determines the key type. That said, propose what you want and see if it gets enough usage to justify addition to the collections module. Anyone who wants a sorted dict with keytype attibute could get one by subclassing one without. Terry Jan Reedy
data:image/s3,"s3://crabby-images/e8710/e87101471e2aba8edbbfbc9ccdef1c5541486e3f" alt=""
Ok, now. Forget all I said. Just a short question: When you have to store values accosiated with keys and the keys have to be accessible in a sorted manner. What container type would you use? What data structure would you implement? (I just thought a AVL tree would have been a good choice.) Thanks, panzi
data:image/s3,"s3://crabby-images/ccefc/ccefcd2eef7a755338fe5de3b95723fc96f07ed5" alt=""
Mathias Panzenböck <grosser.meister.morti@gmx.net> wrote:
If you want to use only things that are available in base Python, use a list and the bisect module. If you need O(logn) insertion and removal, then an AVL/Red-Black/2-3 tree with the semantics you described would also work. (I think there is both an AVL and Red-Black tree implementation in the Python package index [1]) If you only need to concern yourself with ordering every once in a while, then x = dct.items();x.sort() works reasonably well. Sometimes a "pair heap" can get you what you are looking for [2]. Data structure choices are tricky. It is usually better to describe the problem and one's approach (why you choose to use a particular algorithm and structure), rather than strictly asking "where can I find data structure X". - Josiah [1] http://www.python.org/pypi/ [2] http://mail.python.org/pipermail/python-dev/2006-November/069845.html
participants (5)
-
BJörn Lindqvist
-
Josiah Carlson
-
Mathias Panzenböck
-
Talin
-
Terry Reedy