Add orderedset as set(iterable, *, ordered=False) and similarly for frozenset.
Hundreds of people want an orderedset (http://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) and yet the ordered sets that are in PyPI (oset, ordered-set) are incomplete. Please consider adding an ordered set that has *all* of the same functionality as set. The only major design decision is (I think) whether to store the elements in a linked list or dynamic vector. My personal preference is to specify the ordered set via a flag to the constructor, which can be intercepted by __new__ to return a different object type as necessary. (If anyone wants to add a complete ordered set to PyPI that would also be very useful for me!) Best, Neil
Interesting -- somehow this message came through on the google group mirror of python-ideas only -- so replying to it barfed. He's my reply again, with the proper address this time: On Wed, Feb 4, 2015 at 1:14 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
Hundreds of people want an orderedset ( http://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) and yet the ordered sets that are in PyPI (oset, ordered-set) are incomplete. Please consider adding an ordered set that has *all* of the same functionality as set.
(If anyone wants to add a complete ordered set to PyPI that would also be very useful for me!)
It seems like a good way to go here is to contribute to one of the projects already on PyPI -- and once you like it, propose that it be added to the standard library. And can you leverage the OrderedDict implementation already in the std lib? Or maybe better, the new one in PyPy? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
2015-02-05 23:02 GMT+01:00 Chris Barker <chris.barker@noaa.gov>:
Interesting -- somehow this message came through on the google group mirror of python-ideas only -- so replying to it barfed.
He's my reply again, with the proper address this time:
On Wed, Feb 4, 2015 at 1:14 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
Hundreds of people want an orderedset ( http://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) and yet the ordered sets that are in PyPI (oset, ordered-set) are incomplete. Please consider adding an ordered set that has *all* of the same functionality as set.
(If anyone wants to add a complete ordered set to PyPI that would also be very useful for me!)
It seems like a good way to go here is to contribute to one of the projects already on PyPI -- and once you like it, propose that it be added to the standard library.
And can you leverage the OrderedDict implementation already in the std lib? Or maybe better, the new one in PyPy?
PyPy dicts and sets (the built-in ones) are already ordered by default. -- Amaury Forgeot d'Arc
On Thu, Feb 5, 2015 at 2:04 PM, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
And can you leverage the OrderedDict implementation already in the std
lib? Or maybe better, the new one in PyPy?
PyPy dicts and sets (the built-in ones) are already ordered by default.
Exactly ;-) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
Interesting. It's more than a couple of lines of code to change for Jython (I tried), but now that we require Java 7, we should be able to implement identical ordered behavior to PyPy for dict, __dict__, set, and frozenset using ConcurrentSkipListMap instead of our current usage of ConcurrentHashMap. (Other collections like defaultdict and weakref collections use Google Guava's MapMaker, which doesn't have an ordered option.) I think this is a great idea from a user experience perspective. On Thu, Feb 5, 2015 at 3:04 PM, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
2015-02-05 23:02 GMT+01:00 Chris Barker <chris.barker@noaa.gov>:
Interesting -- somehow this message came through on the google group mirror of python-ideas only -- so replying to it barfed.
He's my reply again, with the proper address this time:
On Wed, Feb 4, 2015 at 1:14 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
Hundreds of people want an orderedset ( http://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) and yet the ordered sets that are in PyPI (oset, ordered-set) are incomplete. Please consider adding an ordered set that has *all* of the same functionality as set.
(If anyone wants to add a complete ordered set to PyPI that would also be very useful for me!)
It seems like a good way to go here is to contribute to one of the projects already on PyPI -- and once you like it, propose that it be added to the standard library.
And can you leverage the OrderedDict implementation already in the std lib? Or maybe better, the new one in PyPy?
PyPy dicts and sets (the built-in ones) are already ordered by default.
-- Amaury Forgeot d'Arc
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- - Jim jim.baker@{colorado.edu|python.org|rackspace.com|zyasoft.com} twitter.com/jimbaker github.com/jimbaker bitbucket.com/jimbaker
On Feb 7, 2015, at 20:02, Jim Baker <jim.baker@python.org> wrote:
Interesting. It's more than a couple of lines of code to change for Jython (I tried), but now that we require Java 7, we should be able to implement identical ordered behavior to PyPy for dict, __dict__, set, and frozenset using ConcurrentSkipListMap instead of our current usage of ConcurrentHashMap.
How? A skip list is a sorted collection--kept in the natural order of the keys, not the insertion order. Of course you can transform each key into a (next_index(), key) pair at insert time to keep insertion order instead of natural sorting order, but then you lose the logarithmic lookup time when searching just by key. And even if I'm missing something stupid and there is a way to make this work, you're still only going to get logarithmic performance, not linear. And you're probably going to change the requirement on dict keys from hashable to fully ordered (which, on top of being different, is substantially harder to detect at runtime, and also doesn't map cleanly to immutable among Python's standard data types). So, I don't think it would be a good idea even if it were possible. (There's a reason that Java, as well as C++ and other languages, provides sets and maps based on both hash tables and sorted logarithmic structures--because they're not interchangeable.)
(Other collections like defaultdict and weakref collections use Google Guava's MapMaker, which doesn't have an ordered option.)
I think this is a great idea from a user experience perspective.
On Thu, Feb 5, 2015 at 3:04 PM, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
2015-02-05 23:02 GMT+01:00 Chris Barker <chris.barker@noaa.gov>:
Interesting -- somehow this message came through on the google group mirror of python-ideas only -- so replying to it barfed.
He's my reply again, with the proper address this time:
On Wed, Feb 4, 2015 at 1:14 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
Hundreds of people want an orderedset (http://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) and yet the ordered sets that are in PyPI (oset, ordered-set) are incomplete. Please consider adding an ordered set that has *all* of the same functionality as set.
(If anyone wants to add a complete ordered set to PyPI that would also be very useful for me!)
It seems like a good way to go here is to contribute to one of the projects already on PyPI -- and once you like it, propose that it be added to the standard library.
And can you leverage the OrderedDict implementation already in the std lib? Or maybe better, the new one in PyPy?
PyPy dicts and sets (the built-in ones) are already ordered by default.
-- Amaury Forgeot d'Arc
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- - Jim
jim.baker@{colorado.edu|python.org|rackspace.com|zyasoft.com} twitter.com/jimbaker github.com/jimbaker bitbucket.com/jimbaker _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
(There's a reason that Java, as well as C++ and other languages, provides sets and maps based on both hash tables and sorted logarithmic structures--because they're not interchangeable.)
Yes. And Python unfortunately only provides them as hash tables. Sturla
Andrew, You're absolutely right - the semantics matter, as they always do. There are significant choices in the Java ecosystem because of this. If we are preserving insertion order, vs a comparator based ordering, skip lists and the specific ConcurrentSkipListMap implementation will not work. As often the case, it's good to try things out. I looked at PyPy 2.5.0 this morning, and both set and dict are insertion order preserving. Interestingly, when creating a set:
x = {1,2,3,4,5,6,7}
the resulting set is the reverse of its input, which I find surprising:
x set([7, 6, 5, 4, 3, 2, 1])
Note that this is not the case if initializing from a sequence:
set([1,2,3,4,5,6,7]) set([1, 2, 3, 4, 5, 6, 7])
Using LinkedHashSet from Jython also does what is expected:
from java.util import LinkedHashSet as LHS LHS([1,2,3,4,5,6,7]) [1, 2, 3, 4, 5, 6, 7]
(Iterators, etc, are consistent with the representations that are printed in these cases from PyPy and Jython.) But there's one more important thing we have to consider. Jython needs ConcurrentMap semantics, so that we can support our interpretation of the Python memory model (sequential consistency, plus weakly consistent iteration is also depended on, see also http://www.jython.org/jythonbook/en/1.0/Concurrency.html#python-memory-model). There's some interesting work out there to support a ConcurrentLinkedHashMap implementation ( https://code.google.com/p/concurrentlinkedhashmap/), the design of which was used as the basis of the design of the LRU cache in Google Guava, but it would require some further investigation to see how this would impact Jython. Trying out an alternative implementation in Jython would still be on the order of a few lines of code changes, however :), which gives us a nice low cost for such experiments. - Jim On Sun, Feb 8, 2015 at 12:31 AM, Andrew Barnert < abarnert@yahoo.com.dmarc.invalid> wrote:
On Feb 7, 2015, at 20:02, Jim Baker <jim.baker@python.org> wrote:
Interesting. It's more than a couple of lines of code to change for Jython (I tried), but now that we require Java 7, we should be able to implement identical ordered behavior to PyPy for dict, __dict__, set, and frozenset using ConcurrentSkipListMap instead of our current usage of ConcurrentHashMap.
How? A skip list is a sorted collection--kept in the natural order of the keys, not the insertion order. Of course you can transform each key into a (next_index(), key) pair at insert time to keep insertion order instead of natural sorting order, but then you lose the logarithmic lookup time when searching just by key.
And even if I'm missing something stupid and there is a way to make this work, you're still only going to get logarithmic performance, not linear. And you're probably going to change the requirement on dict keys from hashable to fully ordered (which, on top of being different, is substantially harder to detect at runtime, and also doesn't map cleanly to immutable among Python's standard data types). So, I don't think it would be a good idea even if it were possible.
(There's a reason that Java, as well as C++ and other languages, provides sets and maps based on both hash tables and sorted logarithmic structures--because they're not interchangeable.)
(Other collections like defaultdict and weakref collections use Google Guava's MapMaker, which doesn't have an ordered option.)
I think this is a great idea from a user experience perspective.
On Thu, Feb 5, 2015 at 3:04 PM, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
2015-02-05 23:02 GMT+01:00 Chris Barker <chris.barker@noaa.gov>:
Interesting -- somehow this message came through on the google group mirror of python-ideas only -- so replying to it barfed.
He's my reply again, with the proper address this time:
On Wed, Feb 4, 2015 at 1:14 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
Hundreds of people want an orderedset ( http://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) and yet the ordered sets that are in PyPI (oset, ordered-set) are incomplete. Please consider adding an ordered set that has *all* of the same functionality as set.
(If anyone wants to add a complete ordered set to PyPI that would also be very useful for me!)
It seems like a good way to go here is to contribute to one of the projects already on PyPI -- and once you like it, propose that it be added to the standard library.
And can you leverage the OrderedDict implementation already in the std lib? Or maybe better, the new one in PyPy?
PyPy dicts and sets (the built-in ones) are already ordered by default.
-- Amaury Forgeot d'Arc
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- - Jim
jim.baker@{colorado.edu|python.org|rackspace.com|zyasoft.com} twitter.com/jimbaker github.com/jimbaker bitbucket.com/jimbaker
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- - Jim jim.baker@{colorado.edu|python.org|rackspace.com|zyasoft.com} twitter.com/jimbaker github.com/jimbaker bitbucket.com/jimbaker
On Feb 8, 2015, at 8:30, Jim Baker <jim.baker@python.org> wrote:
But there's one more important thing we have to consider. Jython needs ConcurrentMap semantics, so that we can support our interpretation of the Python memory model (sequential consistency, plus weakly consistent iteration is also depended on, see also http://www.jython.org/jythonbook/en/1.0/Concurrency.html#python-memory-model). There's some interesting work out there to support a ConcurrentLinkedHashMap implementation (https://code.google.com/p/concurrentlinkedhashmap/), the design of which was used as the basis of the design of the LRU cache in Google Guava, but it would require some further investigation to see how this would impact Jython.
Correct me if I'm wrong, but doesn't a synchronized mapping provide the required semantics? Obviously in some cases the nonblocking implementation is much more scalable (and better parallelization is one of the reasons people sometimes use Jython), and in a few cases that actually matters, but I'm pretty sure that most people who want OrderedSet and OrderedDict will be fine with a synchronized implementation (which will still parallelize better than CPython and PyPy, just not as well as Jython's regular set and dict), and most people who really need a nonblocking ordered set collection but aren't picky enough to want a specific implementation don't exist. So, if the other implementations do add an OrderedSet in 3.5, and Jython has a 3.5 coming soon, and nonblocking linked hash containers are still in the research stage, can't you just use a synchronized linked hash container for now? Also, what does Jython do for collections.OrderedDict today? Surely if that's good enough, you can do the exact same thing for OrderedSet, and if it's not good enough, you can put a good OrderedSet on the same backlog as a good OrderedDict.
Trying out an alternative implementation in Jython would still be on the order of a few lines of code changes, however :), which gives us a nice low cost for such experiments.
- Jim
On Sun, Feb 8, 2015 at 12:31 AM, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
On Feb 7, 2015, at 20:02, Jim Baker <jim.baker@python.org> wrote:
Interesting. It's more than a couple of lines of code to change for Jython (I tried), but now that we require Java 7, we should be able to implement identical ordered behavior to PyPy for dict, __dict__, set, and frozenset using ConcurrentSkipListMap instead of our current usage of ConcurrentHashMap.
How? A skip list is a sorted collection--kept in the natural order of the keys, not the insertion order. Of course you can transform each key into a (next_index(), key) pair at insert time to keep insertion order instead of natural sorting order, but then you lose the logarithmic lookup time when searching just by key.
And even if I'm missing something stupid and there is a way to make this work, you're still only going to get logarithmic performance, not linear. And you're probably going to change the requirement on dict keys from hashable to fully ordered (which, on top of being different, is substantially harder to detect at runtime, and also doesn't map cleanly to immutable among Python's standard data types). So, I don't think it would be a good idea even if it were possible.
(There's a reason that Java, as well as C++ and other languages, provides sets and maps based on both hash tables and sorted logarithmic structures--because they're not interchangeable.)
(Other collections like defaultdict and weakref collections use Google Guava's MapMaker, which doesn't have an ordered option.)
I think this is a great idea from a user experience perspective.
On Thu, Feb 5, 2015 at 3:04 PM, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
2015-02-05 23:02 GMT+01:00 Chris Barker <chris.barker@noaa.gov>:
Interesting -- somehow this message came through on the google group mirror of python-ideas only -- so replying to it barfed.
He's my reply again, with the proper address this time:
On Wed, Feb 4, 2015 at 1:14 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
Hundreds of people want an orderedset (http://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) and yet the ordered sets that are in PyPI (oset, ordered-set) are incomplete. Please consider adding an ordered set that has *all* of the same functionality as set.
(If anyone wants to add a complete ordered set to PyPI that would also be very useful for me!)
It seems like a good way to go here is to contribute to one of the projects already on PyPI -- and once you like it, propose that it be added to the standard library.
And can you leverage the OrderedDict implementation already in the std lib? Or maybe better, the new one in PyPy?
PyPy dicts and sets (the built-in ones) are already ordered by default.
-- Amaury Forgeot d'Arc
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- - Jim
jim.baker@{colorado.edu|python.org|rackspace.com|zyasoft.com} twitter.com/jimbaker github.com/jimbaker bitbucket.com/jimbaker _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- - Jim
jim.baker@{colorado.edu|python.org|rackspace.com|zyasoft.com} twitter.com/jimbaker github.com/jimbaker bitbucket.com/jimbaker _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Wed, Feb 04, 2015 at 01:14:14PM -0800, Neil Girdhar wrote:
Hundreds of people want an orderedset (http://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) and yet the ordered sets that are in PyPI (oset, ordered-set) are incomplete. Please consider adding an ordered set that has *all* of the same functionality as set.
I think you have this backwards: more likely, first you find a complete implementation on PyPI, then you propose it for the standard library.
The only major design decision is (I think) whether to store the elements in a linked list or dynamic vector.
That's an implementation detail and therefore irrelevant (in the sense that it is subject to change without affecting code that uses ordered sets). I think there are plenty of other major design decisions to be made. What's the union between a regular set and an ordered set? Is there a frozen ordered set? Do ordered sets compare unequal if they differ only in order (like lists)?
My personal preference is to specify the ordered set via a flag to the constructor, which can be intercepted by __new__ to return a different object type as necessary.
-1 on a flag to the constructor. I don't know if this rule (well, guideline) has a formal name anywhere, so I'm going to call it Guido's Rule: no constant mode arguments. I think this is a powerful design principle. Functions or methods (including constructors) should not take arguments which, when given, are invariably supplied as constants. This especially goes for "mode" arguments, usually a flag, where the implemention of the function then looks something like this: def thing(arg, flag=True): if flag: return this(arg) else: return that(arg) In cases like this, you should expose this and that as distinct functions. (If somebody -- Guido? -- would like to propose a cleaner explanation for the rule, I suggest we put it somewhere in the documention. Perhaps in the FAQs.) Examples: (1) We have list and tuple as separate functions, not sequences(*args, *, mutable=False). (2) str and unicode (str and bytes), not str(obj, unicode=True). (3) pairs of (sin, sin), (cos, acos), (tan, atan) functions, rather than sin(x, inverse=False). For that matter, same with sin/sinh etc. Note that this is a "should not", not "must not": exceptions are allowed where the API clearly improved by it: sorted(seq, reversed=True) rather than sorted/reverse_sorted.
(If anyone wants to add a complete ordered set to PyPI that would also be very useful for me!)
I'm sure it would be very useful to lots of people. Perhaps you could scratch your own itch and solve their problem at the same time? -- Steve
On Fri Feb 06 2015 at 12:12:19 AM Steven D'Aprano <steve@pearwood.info> wrote:
I think there are plenty of other major design decisions to be made. What's the union between a regular set and an ordered set? Is there a frozen ordered set? Do ordered sets compare unequal if they differ only in order (like lists)?
Are there many options for the union question? I think the only sane choice for any union involving ordered sets is an unordered set - in any other case the resulting order would be wrong or at least not obviously right, and that's almost certainly better expressed explicitly. Ed Kellett
On Thu, Feb 5, 2015 at 5:12 PM, Ed Kellett <edk141@gmail.com> wrote:
On Fri Feb 06 2015 at 12:12:19 AM Steven D'Aprano <steve@pearwood.info> wrote:
I think there are plenty of other major design decisions to be made. What's the union between a regular set and an ordered set? Is there a frozen ordered set? Do ordered sets compare unequal if they differ only in order (like lists)?
Are there many options for the union question? I think the only sane choice for any union involving ordered sets is an unordered set - in any other case the resulting order would be wrong or at least not obviously right, and that's almost certainly better expressed explicitly.
Surely a.union(b) should produce the same result as a.update(b) except out-of-place, and .update() already has well-defined semantics for OrderedDicts. -n -- Nathaniel J. Smith -- http://vorpus.org
participants (9)
-
Amaury Forgeot d'Arc
-
Andrew Barnert
-
Chris Barker
-
Ed Kellett
-
Jim Baker
-
Nathaniel Smith
-
Neil Girdhar
-
Steven D'Aprano
-
Sturla Molden