Maybe it's time to add a new module for sequence-specific functions (seqtools?). It should contain at least two classes or factory functions:
1. A view that represents a sliced subsequence: a lazy equivalent of seq[start:end:step]. This feature is implemented in the third-party module dataview [1].
2. A view that represents a linear sequence as a 2D array. Iterating this view emits non-intersecting chunks of the sequence. For example, it can be used to represent a bytes object as a sequence of 1-byte bytes objects (as in 2.x), a generalized alternative to iterbytes() from PEP 467 [2].
Neither the itertools nor the collections module looks like a good place for these features, since they are not concrete classes and work only with sequences, not general iterables or iterators. On the other hand, mappingproxy and ChainMap look close; maybe the new module should be oriented not toward sequences, but toward views.
[1] https://pypi.python.org/pypi/dataview
[2] https://www.python.org/dev/peps/pep-0467
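To make the first item concrete, here is a minimal sketch of what a lazy slice view might look like. The name SliceView and its constructor signature are assumptions made for illustration; this is not the API of dataview or of any proposed seqtools module. It also assumes the underlying sequence does not change length after the view is created.

```python
from collections.abc import Sequence


class SliceView(Sequence):
    """Hypothetical lazy view of seq[start:stop:step]; no elements are copied."""

    def __init__(self, seq, start=None, stop=None, step=None):
        self._seq = seq
        # Slicing a range resolves omitted and negative bounds against
        # len(seq) and stores only three integers, not the indices themselves.
        self._indices = range(len(seq))[start:stop:step]

    @classmethod
    def _from_indices(cls, seq, indices):
        # Internal helper (also hypothetical): build a view from a ready range.
        view = cls.__new__(cls)
        view._seq, view._indices = seq, indices
        return view

    def __len__(self):
        return len(self._indices)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing a view gives another view over the same data.
            return self._from_indices(self._seq, self._indices[index])
        # Integer access is delegated to the underlying sequence.
        return self._seq[self._indices[index]]


data = list(range(10))
view = SliceView(data, 1, 8, 2)
print(list(view))      # [1, 3, 5, 7]
data[3] = "x"          # the view reflects later changes to the data
print(view[1])         # x
```

Inheriting from collections.abc.Sequence supplies iteration, containment tests, index() and count() on top of the two methods defined here, which is what distinguishes such a view from an iterator.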
On Sun, Jul 17, 2016, 3:22 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
Maybe it's time to add a new module for sequence-specific functions (seqtools?). It should contain at least two classes or fabric functions:
1. A view that represents a sliced subsequence. Lazy equivalent of seq[start:end:step]. This feature is implemented in third-party module dataview [1].
2. A view that represents a linear sequence as 2D array. Iterating this view emits non-intersecting chunks of the sequence. For example it can be used for representing the bytes object as a sequence of 1-byte bytes objects (as in 2.x), a generalized alternative to iterbytes() from PEP 467 [2].
NumPy slicing and reshaping sounds like it satisfies these requirements. Does it not?
I don't want to speak for Serhiy, but it seems like he wants NumPy-like behaviors over generic sequences. I think this idea is appealing.
For example, a Python list has amortized O(1) append, while the equivalent for np.ndarray would be an O(n) copy to a larger array.
Expressing those NumPy affordances genetically feels like a good thing. However, maybe this is something that could live on PyPI first to stabilize the APIs.
On Jul 17, 2016 1:08 PM, "Michael Selik" <michael.selik@gmail.com> wrote:
On Sun, Jul 17, 2016, 3:22 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
Maybe it's time to add a new module for sequence-specific functions (seqtools?). It should contain at least two classes or fabric functions:
1. A view that represents a sliced subsequence. Lazy equivalent of seq[start:end:step]. This feature is implemented in third-party module dataview [1].
2. A view that represents a linear sequence as 2D array. Iterating this view emits non-intersecting chunks of the sequence. For example it can be used for representing the bytes object as a sequence of 1-byte bytes objects (as in 2.x), a generalized alternative to iterbytes() from PEP 467 [2].
NumPy slicing and reshaping sounds like it satisfies these requirements. Does it not?
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Generically, not genetically. Shouldn't reply on phone.
On Jul 17, 2016 2:56 PM, "David Mertz" <mertz@gnosis.cx> wrote:
I don't want to speak for Serhiy, but it seems like he wants NumPy-like behaviors over generic sequences. I think this idea is appealing.
For example, Python list has O(1) append, while the equivalent for np.ndarray would be an O(n) copy to a larger array.
Expressing those NumPy affordances genetically feels like a good thing. However, maybe this is something that could live in PyPI first top stabilize APIs.
On 17.07.16 23:08, Michael Selik wrote:
On Sun, Jul 17, 2016, 3:22 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
Maybe it's time to add a new module for sequence-specific functions (seqtools?). It should contain at least two classes or fabric functions:
1. A view that represents a sliced subsequence. Lazy equivalent of seq[start:end:step]. This feature is implemented in third-party module dataview [1].
2. A view that represents a linear sequence as 2D array. Iterating this view emits non-intersecting chunks of the sequence. For example it can be used for representing the bytes object as a sequence of 1-byte bytes objects (as in 2.x), a generalized alternative to iterbytes() from PEP 467 [2].
NumPy slicing and reshaping sounds like it satisfies these requirements. Does it not?
NumPy has similar features, but they work with packed arrays of specific numeric types, not with general sequences (such as list or str). And NumPy is a large library that provides many features most Python users do not need.
On 19.07.2016 16:43, Serhiy Storchaka wrote:
NumPy have similar features, but they work with packed arrays of specific numeric types, not with general sequences (such as list or str). And NumPy is a large library, providing a number of features not needed for most Python users.
IIRC, numpy also supports object arrays, not just numeric arrays.
Sven
There are a number of generic implementations of these sequence algorithms:
* http://toolz.readthedocs.io/en/latest/api.html#itertoolz
* https://github.com/kachayev/fn.py#itertools-recipes
* http://funcy.readthedocs.io/en/stable/seqs.html
* http://docs.python.org/2/reference/expressions.html#slicings
* https://docs.python.org/2/library/itertools.html#itertools.islice
* https://docs.python.org/3/library/itertools.html#itertools.islice
On Jul 17, 2016 3:23 PM, "Serhiy Storchaka" <storchaka@gmail.com> wrote:
Maybe it's time to add a new module for sequence-specific functions (seqtools?). It should contain at least two classes or fabric functions:
1. A view that represents a sliced subsequence. Lazy equivalent of seq[start:end:step]. This feature is implemented in third-party module dataview [1].
islice?
2. A view that represents a linear sequence as 2D array. Iterating this view emits non-intersecting chunks of the sequence. For example it can be used for representing the bytes object as a sequence of 1-byte bytes objects (as in 2.x), a generalized alternative to iterbytes() from PEP 467 [2].
partition?
http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.partition_all
Neither itertools nor collections modules look good place for these features, since they are not concrete classes and work only with sequences, not general iterables or iterators. On other side, mappingproxy and ChainMap look close, maybe new module should be oriented not on sequences, but on views.
[1] https://pypi.python.org/pypi/dataview [2] https://www.python.org/dev/peps/pep-0467
https://docs.python.org/3/library/collections.abc.html
On Jul 17, 2016 3:23 PM, "Serhiy Storchaka" <storchaka@gmail.com> wrote:
Maybe it's time to add a new module for sequence-specific functions (seqtools?). It should contain at least two classes or fabric functions:
1. A view that represents a sliced subsequence. Lazy equivalent of seq[start:end:step]. This feature is implemented in third-party module dataview [1].
On Sun, Jul 17, 2016 at 3:21 PM, Wes Turner <wes.turner@gmail.com> wrote:
islice?
SortedContainers implements exactly that as a method on SortedList as:
SortedList.islice(start=None, stop=None, reverse=False)
Returns an iterator that slices self from start to stop index, inclusive and exclusive respectively. When reverse is True, values are yielded from the iterator in reverse order. Both start and stop default to None which is automatically inclusive of the beginning and end.
Return type: iterator
Reference: http://www.grantjenks.com/docs/sortedcontainers/sortedlist.html#sortedcontai...
I chose to limit the stride to 1 or -1 using the keyword parameter "reverse." No complaints though it differs from traditional slice syntax.
Grant
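For readers without SortedContainers at hand, the behaviour described in that docstring can be approximated for any plain sequence. The sketch below is an assumption-laden stand-in: islice_seq is a hypothetical helper name, and this is not how SortedList.islice is implemented; it only mirrors the quoted start/stop/reverse signature.

```python
def islice_seq(seq, start=None, stop=None, reverse=False):
    """Hypothetical helper: lazily iterate over seq[start:stop], optionally reversed."""
    # slice.indices() clamps the bounds and resolves None and negative values.
    lo, hi, _ = slice(start, stop).indices(len(seq))
    indices = range(lo, hi)
    if reverse:
        indices = reversed(indices)
    # A generator expression: items are fetched one by one, nothing is copied.
    return (seq[i] for i in indices)


values = [10, 20, 30, 40, 50]
print(list(islice_seq(values, 1, 4)))                # [20, 30, 40]
print(list(islice_seq(values, 1, 4, reverse=True)))  # [40, 30, 20]
```

As in the quoted API, the result is an iterator rather than a sequence, so it has no len() and cannot be indexed.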
On 18 July 2016 at 08:21, Wes Turner <wes.turner@gmail.com> wrote:
There are a number of generic implementations of these sequence algorithms:
* http://toolz.readthedocs.io/en/latest/api.html#itertoolz
* https://github.com/kachayev/fn.py#itertools-recipes
* http://funcy.readthedocs.io/en/stable/seqs.html
* http://docs.python.org/2/reference/expressions.html#slicings
* https://docs.python.org/2/library/itertools.html#itertools.islice
* https://docs.python.org/3/library/itertools.html#itertools.islice
I think the existence of multiple implementations in the context of larger libraries lends weight to the notion of a "seqtools" standard library module that works with arbitrary sequences, just as itertools works with arbitrary iterables.
I don't think combining these algorithms with the "algorithms that work with arbitrary mappings" classes would make sense - for better or for worse, I think the latter is "collections" now, since that's also where the dict variants live (defaultdict, OrderedDict, Counter).
Regards,
Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 18.07.16 01:21, Wes Turner wrote:
There are a number of generic implementations of these sequence algorithms:
* http://toolz.readthedocs.io/en/latest/api.html#itertoolz
* https://github.com/kachayev/fn.py#itertools-recipes
* http://funcy.readthedocs.io/en/stable/seqs.html
* http://docs.python.org/2/reference/expressions.html#slicings
* https://docs.python.org/2/library/itertools.html#itertools.islice
* https://docs.python.org/3/library/itertools.html#itertools.islice
Don't all these implementations work with iterables and iterators?
On Jul 17, 2016 3:23 PM, "Serhiy Storchaka" <storchaka@gmail.com> wrote:
Maybe it's time to add a new module for sequence-specific functions (seqtools?). It should contain at least two classes or fabric functions:
1. A view that represents a sliced subsequence. Lazy equivalent of seq[start:end:step]. This feature is implemented in third-party module dataview [1].
islice?
The result of itertools.islice() is not a sequence.
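A short check of that point, using only the standard library. The variable names are illustrative; the behaviour shown is that of itertools.islice itself.

```python
from collections.abc import Sequence
from itertools import islice

data = list(range(10))
it = islice(data, 2, 8, 2)

# An islice object is a one-shot iterator, not a Sequence.
is_sequence = isinstance(it, Sequence)

try:
    len(it)
    has_len = True
except TypeError:
    has_len = False

try:
    it[0]
    subscriptable = True
except TypeError:
    subscriptable = False

first_pass = list(it)    # consumes the iterator
second_pass = list(it)   # nothing is left on a second pass

print(is_sequence, has_len, subscriptable)  # False False False
print(first_pass, second_pass)              # [2, 4, 6] []
```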
2. A view that represents a linear sequence as 2D array. Iterating this view emits non-intersecting chunks of the sequence. For example it can be used for representing the bytes object as a sequence of 1-byte bytes objects (as in 2.x), a generalized alternative to iterbytes() from PEP 467 [2].
partition?
http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.partition_all
toolz.itertoolz.partition() is just a generator.
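By contrast, the kind of object the second item of the proposal asks for would support len() and indexing. A rough sketch, assuming a hypothetical ChunkView class (the name and behaviour are illustrative, not taken from any existing library); each chunk is produced on demand by ordinary slicing of the underlying sequence.

```python
from collections.abc import Sequence


class ChunkView(Sequence):
    """Hypothetical view of a sequence as consecutive chunks of `size` items."""

    def __init__(self, seq, size):
        if size < 1:
            raise ValueError("size must be positive")
        self._seq = seq
        self._size = size

    def __len__(self):
        # Ceiling division: a shorter final chunk still counts as a row.
        return -(-len(self._seq) // self._size)

    def __getitem__(self, index):
        if isinstance(index, slice):
            raise TypeError("slicing is left out of this sketch")
        n = len(self)
        if index < 0:
            index += n
        if not 0 <= index < n:
            raise IndexError("chunk index out of range")
        start = index * self._size
        # Only the requested chunk is sliced out of the underlying sequence.
        return self._seq[start:start + self._size]


rows = ChunkView(b"abc", 1)
print(len(rows), rows[-1])   # 3 b'c'
print(list(rows))            # [b'a', b'b', b'c']
```

With size 1 over a bytes object this gives the "sequence of 1-byte bytes objects" mentioned in the original message, while remaining indexable and reusable, unlike a generator.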
On Tue, Jul 19, 2016 at 7:48 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 18.07.16 01:21, Wes Turner wrote:
There are a number of generic implementations of these sequence algorithms:
* http://toolz.readthedocs.io/en/latest/api.html#itertoolz
* https://github.com/kachayev/fn.py#itertools-recipes
* http://funcy.readthedocs.io/en/stable/seqs.html
* http://docs.python.org/2/reference/expressions.html#slicings
* https://docs.python.org/2/library/itertools.html#itertools.islice
* https://docs.python.org/3/library/itertools.html#itertools.islice
Aren't all these implementations works with iterables and iterators?
On Jul 17, 2016 3:23 PM, "Serhiy Storchaka" <storchaka@gmail.com> wrote:
Maybe it's time to add a new module for sequence-specific functions (seqtools?). It should contain at least two classes or fabric functions:
1. A view that represents a sliced subsequence. Lazy equivalent of seq[start:end:step]. This feature is implemented in third-party module dataview [1].
islice?
The result of itertools.islice() is not a sequence.
Additionally, itertools.islice doesn't support negative indexing and it's O(start) to get the first element rather than the O(1) that it could be for sequences.
>>> import itertools
>>> x = range(30)
>>> itertools.islice(x, -2, None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Indices for islice() must be None or an integer: 0 <= x <= maxint.
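For comparison, a sequence that knows its length can resolve a negative index up front. The snippet below uses only built-ins; range is a convenient illustration because slicing it returns another lazy range rather than a copy.

```python
x = range(30)

# Slicing a range yields another range: no elements are materialized.
tail = x[-2:]

# slice.indices() shows how the negative bound is resolved against len(x).
bounds = slice(-2, None).indices(len(x))

print(tail)     # range(28, 30)
print(bounds)   # (28, 30, 1)
```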
With that said, I've never really worked with sequences that are big enough for the runtime complexity of normal python slicing to actually matter. I have a feeling that in the typical case, wrapping more abstraction around slicing and creating lazy "views" wouldn't lead to any practical performance benefits. For the cases where it *does* lead to practical performance benefits, `numpy` starts to look a whole lot more attractive as an option. I wonder how many applications would actually benefit from this but can't/shouldn't switch to `numpy` due to other constraints?
--
Matt Gilson
On Jul 19, 2016 10:51 AM, "Serhiy Storchaka" <storchaka@gmail.com> wrote:
On 18.07.16 01:21, Wes Turner wrote:
There are a number of generic implementations of these sequence algorithms:
* http://toolz.readthedocs.io/en/latest/api.html#itertoolz
* https://github.com/kachayev/fn.py#itertools-recipes
* http://funcy.readthedocs.io/en/stable/seqs.html
* http://docs.python.org/2/reference/expressions.html#slicings
* https://docs.python.org/2/library/itertools.html#itertools.islice
* https://docs.python.org/3/library/itertools.html#itertools.islice
Aren't all these implementations works with iterables and iterators?
On Jul 17, 2016 3:23 PM, "Serhiy Storchaka" <storchaka@gmail.com> wrote:
Maybe it's time to add a new module for sequence-specific functions (seqtools?). It should contain at least two classes or fabric functions:
1. A view that represents a sliced subsequence. Lazy equivalent of seq[start:end:step]. This feature is implemented in third-party module dataview [1].
islice?
The result of itertools.islice() is not a sequence.
2. A view that represents a linear sequence as 2D array. Iterating this view emits non-intersecting chunks of the sequence. For example it can be used for representing the bytes object as a sequence of 1-byte bytes objects (as in 2.x), a generalized alternative to iterbytes() from PEP 467 [2].
partition?
http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.partition_all
toolz.itertoolz.partition() is just a generator.
So you're looking for something like strided memoryviews, for nonsequential access over sequences/iterables which are sequential? Sort of like a bitmask?
https://docs.python.org/3/library/stdtypes.html#memoryview
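For reference, this is roughly what a strided memoryview looks like in practice, and where it stops short of the proposal: it works only on objects that support the buffer protocol, so a list is rejected. Variable names are illustrative.

```python
data = b"abcdefgh"
view = memoryview(data)

# Extended slicing gives a strided view over the same buffer, without copying.
every_other = view[::2]
copied = every_other.tobytes()   # an explicit copy, made only on request

# memoryview requires a bytes-like object, so general sequences do not qualify.
try:
    memoryview([1, 2, 3])
    list_supported = True
except TypeError:
    list_supported = False

print(len(every_other), copied)  # 4 b'aceg'
print(list_supported)            # False
```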
participants (8)
- David Mertz
- Grant Jenks
- Matt Gilson
- Michael Selik
- Nick Coghlan
- Serhiy Storchaka
- Sven R. Kunze
- Wes Turner