Make list.reverse() more flexible
Currently, list.reverse() only works for an entire list. If one wants to reverse a section of it 'in-place', one needs to slicing which makes the space complexity no longer O(1). One can also manually make a loop and do the reversal but that is even slower than slicing. List.reverse() does not take any arguments. Wouldn't it be a good if it can take in parameters such as 'start' and 'stop' to enable list.reverse() work even for a section of the list? When no arguments are specified, then it works on the whole list, like usual.
On 2021-03-05 at 16:27:27 -0000, Vincent Cheong <vincentcheong6945@gmail.com> wrote:
Currently, list.reverse() only works for an entire list. If one wants to reverse a section of it 'in-place', one needs to slicing which makes the space complexity no longer O(1). One can also manually make a loop and do the reversal but that is even slower than slicing. List.reverse() does not take any arguments. Wouldn't it be a good if it can take in parameters such as 'start' and 'stop' to enable list.reverse() work even for a section of the list? When no arguments are specified, then it works on the whole list, like usual.
Try this: def slice_reverse(the_list, start, stop): the_list[start:stop] = the_list[stop - 1, start - 1, -1] I'll defer on whether or not this deserves a place in the standard library; the older I get, the more I prefer building new data than mutating existing data.
On Fri, Mar 05, 2021 at 04:27:27PM -0000, Vincent Cheong wrote:
Currently, list.reverse() only works for an entire list. If one wants to reverse a section of it 'in-place', one needs to slicing which makes the space complexity no longer O(1).
The space complexity of a list is not constant, O(1). The lists [] and [1]*100000 do not take the same amount of space. I think that, perhaps, you are trying to say that reversing a list requires no additional storage and can be done in place. But having said that, I think that giving `list.reverse()` optional start and end parameters would probably be a simple and easy change to make, and would be backwards compatible. But even easy changes have some cost: the work has to be done, tests written and run, documentation updated, and that adds one more thing for people to learn. To balance those costs, we require something more than "Wouldn't it be good...?", we require an actual, real life use-case for the feature. (And even then, it's not a guarantee that we will accept the proposed feature.) 1. What are you doing that you need to reverse just a section of the list? 2. Is the performance (time or space) of this task so critical that you can't just use slicing? mylist[start:end] = mylist[end-1:start-1:-1] Some additional questions: - Do we extend this feature to the `reversed()` built-in? - What about sorting? -- Steve
Sorry for not explaining the background of my idea. I'm involved in the research area of sorting algorithms. Reversals are part of sorting and correct me if wrong, `list.reverse()` is the fastest method to reverse an entire list, which is also in-place. Yet, it doesn't work for a subsection of it. If not mistaken, the easiest way is to reassign that subsection with a reversed slicing, One, it is not as fast anymore. Two, by time complexity, the algorithm is no longer in-place because of the slicing. Thus, this is the entire story.
I think that, perhaps, you are trying to say that reversing a list requires no additional storage and can be done in place.
Yes! This is what I meant.
To balance those costs, we require something more than "Wouldn't it be good...?", we require an actual, real life use-case for the feature. (And even then, it's not a guarantee that we will accept the proposed feature.)
I try not to enter into details and technicalities. I try to help by proposing ideas, abstract thoughts, and visions. Anyways, there are many things we do that involve reversals and subsection reversals - it's not an uncommon thing to do. Thus, it will help. It's better to shape this perspective: with the proposed idea, Python gains more power, and along the way, when the time comes we need it, we already have it. For the short term or time being, normal reversal tasks would be the first to benefit.
But even easy changes have some cost: the work has to be done, tests written and run, documentation updated, and that adds one more thing for people to learn.
This is my assessment: list.reverse() lacks functionality because it only works for whole lists. Once any intention goes out of the radius of the stated purpose, it becomes completely unusable. So we should agree that it lacks functionality and versatility. This is unworthy and Python should be more powerful than this. Noticing that it takes no arguments, it has room for improvement where we can conveniently add 'start' and 'end' parameters. Those mentioned costs I believe are part of daily development hassles, but I would like to comment more on the 'adds one more thing for people to learn'. Assuming that we do add the two parameters, I would foresee as just an 'upgraded version' of the function and the changes are as simple as 'Now, list.reverse() can take in two arguments which allows us to reverse a specific range in the list, not the complete list.' Therefore, it will not pose a heavy stuff for learning. It should be a situation where people will be like, for experienced, 'Oh, now list.reverse() works on a specific range instead of the entire list', and for newcomers, 'Oh, list.reverse() can work on specific ranges too.'
Some additional questions: Do we extend this feature to the reversed() built-in? What about sorting?
Interesting. The second one is mainly my lack of explanation which I have cleared at the top. For the first one, I have not given much thought about it since it performs quite differently, hence a quite different territory, but it is interesting to explore. It returns an iterator for any given sequence, if I'm not mistaken. Reversed is used when we need to do something more to each item, in the opposite direction. It's usually used in a loop, list comprehension, etc. It can also be used to make a reversed list by putting it into the list constructor but then you have slicing more suitably for that. Back to the first functionality, we would usually traverse an entire list, and if one needs to traverse only a section of it, we would be passing in a slice (which already creates a copy of it, correct me if wrong). So, issues still tend to revolve around slicing and its copy-making behavior. I understand that we are encouraged to avoid mutating existing data, but that also brings in the issue of necessity - must we always make a copy of something? There are quite a number of things we do which traverse sections of lists, so by adding the two arguments, we are giving reversed() an in-place capability. On one hand, this allows us to save memory for larger and larger lists. On the other hand, it seems unnecessary because we can also use range() and indices within reversed() to traverse a section, but then that would be unclean as compared to having simply declare a 'start' and 'stop'. Here in reversed(), the proposal may not have much of a position of strength for consideration, compared to list.reverse(), but it is worth weighing how simpler, cleaner or better iterators are having 'start' and 'stop' arguments. You have extra flexibility but is it worth it; that would be better answered with more assessments by a wider range of audience.
This sounds like a very verbose repeated statement that "there may be a use in the future." That's definitely not going to get a feature, no matter how many times repeated. If this is something you actually need, write a subclass of list in Cython or C, and add that capability. I cannot think of a time I would have used it in the last 23 years, but maybe someone would. Put it on PyPI and explain why it helps something. On Sat, Mar 6, 2021, 1:42 AM Vincent Cheong <vincentcheong6945@gmail.com> wrote:
Sorry for not explaining the background of my idea. I'm involved in the research area of sorting algorithms. Reversals are part of sorting and correct me if wrong, `list.reverse()` is the fastest method to reverse an entire list, which is also in-place. Yet, it doesn't work for a subsection of it. If not mistaken, the easiest way is to reassign that subsection with a reversed slicing, One, it is not as fast anymore. Two, by time complexity, the algorithm is no longer in-place because of the slicing. Thus, this is the entire story.
I think that, perhaps, you are trying to say that reversing a list requires no additional storage and can be done in place.
Yes! This is what I meant.
To balance those costs, we require something more than "Wouldn't it be good...?", we require an actual, real life use-case for the feature. (And even then, it's not a guarantee that we will accept the proposed feature.)
I try not to enter into details and technicalities. I try to help by proposing ideas, abstract thoughts, and visions. Anyways, there are many things we do that involve reversals and subsection reversals - it's not an uncommon thing to do. Thus, it will help. It's better to shape this perspective: with the proposed idea, Python gains more power, and along the way, when the time comes we need it, we already have it. For the short term or time being, normal reversal tasks would be the first to benefit.
This is my assessment: list.reverse() lacks functionality because it only works for whole lists. Once any intention goes out of the radius of
But even easy changes have some cost: the work has to be done, tests written and run, documentation updated, and that adds one more thing for people to learn. the stated purpose, it becomes completely unusable. So we should agree that it lacks functionality and versatility. This is unworthy and Python should be more powerful than this. Noticing that it takes no arguments, it has room for improvement where we can conveniently add 'start' and 'end' parameters. Those mentioned costs I believe are part of daily development hassles, but I would like to comment more on the 'adds one more thing for people to learn'. Assuming that we do add the two parameters, I would foresee as just an 'upgraded version' of the function and the changes are as simple as 'Now, list.reverse() can take in two arguments which allows us to reverse a specific range in the list, not the complete list.' Therefore, it will not pose a heavy stuff for learning. It should be a situation where people will be like, for experienced, 'Oh , now list.reverse() works on a specific range instead of the entire list', and for newcomers, 'Oh, list.reverse() can work on specific ranges too.'
Some additional questions: Do we extend this feature to the reversed() built-in? What about sorting?
Interesting. The second one is mainly my lack of explanation which I have cleared at the top. For the first one, I have not given much thought about it since it performs quite differently, hence a quite different territory, but it is interesting to explore. It returns an iterator for any given sequence, if I'm not mistaken. Reversed is used when we need to do something more to each item, in the opposite direction. It's usually used in a loop, list comprehension, etc. It can also be used to make a reversed list by putting it into the list constructor but then you have slicing more suitably for that. Back to the first functionality, we would usually traverse an entire list, and if one needs to traverse only a section of it, we would be passing in a slice (which already creates a copy of it, correct me if wrong). So, issues still tend to revolve around slicing and its copy-making behavior. I understand that we are encouraged to avoid mutating existing data, but that also brings in t he issue of necessity - must we always make a copy of something? There are quite a number of things we do which traverse sections of lists, so by adding the two arguments, we are giving reversed() an in-place capability. On one hand, this allows us to save memory for larger and larger lists. On the other hand, it seems unnecessary because we can also use range() and indices within reversed() to traverse a section, but then that would be unclean as compared to having simply declare a 'start' and 'stop'. Here in reversed(), the proposal may not have much of a position of strength for consideration, compared to list.reverse(), but it is worth weighing how simpler, cleaner or better iterators are having 'start' and 'stop' arguments. You have extra flexibility but is it worth it; that would be better answered with more assessments by a wider range of audience.
Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/NAH2RR... Code of Conduct: http://python.org/psf/codeofconduct/
On Sat, Mar 6, 2021 at 7:07 AM David Mertz <mertz@gnosis.cx> wrote:
This sounds like a very verbose repeated statement that "there may be a use in the future." That's definitely not going to get a feature, no matter how many times repeated.
If this is something you actually need, write a subclass of list in Cython or C, and add that capability. I cannot think of a time I would have used it in the last 23 years, but maybe someone would. Put it on PyPI and explain why it helps something.
On Sat, Mar 6, 2021, 1:42 AM Vincent Cheong <vincentcheong6945@gmail.com> wrote:
Sorry for not explaining the background of my idea. I'm involved in the research area of sorting algorithms. Reversals are part of sorting and correct me if wrong, `list.reverse()` is the fastest method to reverse an entire list, which is also in-place. Yet, it doesn't work for a subsection of it.
The key thing is that it's very, very unlikely "research area of sorting algorithms" is going to beat Timsort. But maybe it is. Or maybe it is for some specialized kind of data collections that tend to come with some characteristic ascending or descending runs. Even assuming that is true, and some new algorithm is good for a specific case, it's not going to be especially fast once implemented in Python, even with the hypothetical "in-place reverse of a slice". Maybe it would be faster in C or another low-level language, but not on top of the Python interpreter. So this "research" is inherently doomed to fail.... UNLESS, you do the research not by actual raw timings, but rather in the sensible way of profiling the specific number of operations in an abstracted way. For that, existing "reversal of slice" techniques are perfectly fine to profile algorithms, but with things like internal counters of operations, not as actual machine timings. I was, however, a little curious, so timed a few related things:
a = list(range(1_000_000)) %timeit a.reverse() 450 µs ± 16 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) %timeit list(reversed(a)) 7.38 ms ± 149 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) %timeit a[:] = a[-1:0:-1] 14.1 ms ± 230 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) %timeit a[500_000:550_000] = a[550_000-1:500_000-1:-1] 240 µs ± 1.77 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
So indeed, list(reversed()) is more than 15x slower than list.reverse(). And assigning from a reverse slice is slower still by about 2x. However, assuming you only want to reverse a moderate amount in the middle of a large list, the time remains pretty moderate. In my example, 1/2 the time of reversing the entire list. Other than subclassing list per se, you could also write a Cython or C extension that merely provided extra operations on actual lists. Maybe call it `listtools` in the spirit of `itertools` and `functools`. Some external function like `reverse_middle(a, 500_000, 550_000)` isn't a terrible thing to experiment with. And in concept, that could be as fast as is theoretically possible.
On Sat, Mar 06, 2021 at 07:46:18AM +0000, David Mertz wrote:
So this "research" is inherently doomed to fail.... UNLESS, you do the research not by actual raw timings, but rather in the sensible way of profiling the specific number of operations in an abstracted way.
Sorry, are you trying to say that the way to determine better algorithms is not by measuring the actual time they take with real data on real machines, but by counting hypothetical abstract operations? Obviously we can and do use Big Oh analyse guide our investigations, but ultimately nothing beats actual timings. I can't begin to tell you how much time wasted trying to write O(N) algorithms in Python for some task when Python's O(N log N) sort was faster for any size of data I could create on my machine. -- Steve
Interesting. Just to comment, Mr. Mertz is realistic in the algorithmic sense. Running time is highly affected by various factors. For example, lets just assume that an insertion sort of O(N^2) time on a quantum computer is fast enough for all kinds of task in the world. So, naturally, there should be no problem and we should instead focus on other projects rather than making an O(N log N) sort for the quantum machine as its unnecessary. But, you're N^2, it won't change the fact that you're algorithmically inefficient. You, on the other hand, are realistic in the developmental sense. Why spend time on this instead of other things, right? I believe in 'depends-on-situation' and a balance between thinking about project and about algorithm, because without this project we wouldn't even be here, likewise without the algorithm we wouldn't reach here. This is for discussions-sake and should have no bearing on the main idea. Thanks for the input.
a while ago on this list, we had a bit of discussion about having a sequence view. It started with my more focused idea for a built in slice iterator, which I wrote up some time ago here: https://github.com/PythonCHB/islice-pep Then an even more expansive idea of a general purpose View was proposed. The conversation eventually petered out. Frankly, I didn't see a lot of use-cases at the time -- but now it seems that we have some more, and some more interest. For my part, I think a general purpose View object is far more compelling than adding add-hoc functionality to various methods. For example, he's the OP's suggestion with view semantics: view(a_list)[a_slice].reverse() Pretty nifty! See the discussion for why adding a method to the Sequence ABC is problematic: it could potentially clash with folks' existing methods for non-built-in sequences -- for instance, numpy arrays already have a .view method, and it has a different meaning (or at least necessarily different semantics). So a "view" builtin and maybe a new dunder may be the way to go. Anyway, perhaps it's time to revive the conversation. Here's the old thread --Please look that over before starting a new discussion! https://mail.python.org/archives/list/python-ideas@python.org/thread/QVTGZD6... -CHB On Sun, Mar 7, 2021 at 5:04 AM Vincent Cheong <vincentcheong6945@gmail.com> wrote:
Interesting. Just to comment, Mr. Mertz is realistic in the algorithmic sense. Running time is highly affected by various factors. For example, lets just assume that an insertion sort of O(N^2) time on a quantum computer is fast enough for all kinds of task in the world. So, naturally, there should be no problem and we should instead focus on other projects rather than making an O(N log N) sort for the quantum machine as its unnecessary. But, you're N^2, it won't change the fact that you're algorithmically inefficient. You, on the other hand, are realistic in the developmental sense. Why spend time on this instead of other things, right? I believe in 'depends-on-situation' and a balance between thinking about project and about algorithm, because without this project we wouldn't even be here, likewise without the algorithm we wouldn't reach here.
This is for discussions-sake and should have no bearing on the main idea.
Thanks for the input. _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/4PDBVL... Code of Conduct: http://python.org/psf/codeofconduct/
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Indeed, making a slice a view does pose painful challenges. For a slice iterator, I wonder if there is an bigger overhead in being an iterator or building an iterator. I wholeheartedly agree that 'adding add-hoc functionality' is slightly toy-ish, but I brought up the idea of 'start' and 'stop' parameters because I believed that mentioning just these two things themselves are humbly adequately below that 'troublesome' or 'complex' line. Yes, Numpy's slice is a view and that is memory efficient. Memory_view, lazy_slice, View_object, and other terms, quite an expansive room of considerations. For discussions-sake, I would like to comment about: 'Frankly, I didn't see a lot of use-cases at the time -- but now it seems that we have some more, and some more interest.'. Indeed, if one puts on a perspective glasses of 'use-cases', it's obvious that there is no urgency, no real-time necessity for that. We can see that there is growing interest, but just my opinion, the more deserving point is that it exhibits intelligence and power. Intelligence because a users gets to choose what to do with that sublist 'before' it takes memory - it's intelligent not to use resources unless explicitly told to, like a generator. Power because if I can 'specify' that section of a list without making a copy, I'm effectively achieving the same thing as many would using for-loop + range() + indices. It has that tiny little conceptual resemblance to how reversed() being much better than for-loop + range() + negative_indices, when iterating backwards. I'm schooled by how there was a history archive on this. Thanks for the links. Thanks for the input.
Indeed, if one puts on a perspective glasses of 'use-cases', it's obvious that there is no urgency, no real-time necessity for that. We can see that there is growing interest, but just my opinion, the more deserving
On Mon, Mar 8, 2021 at 1:24 AM Vincent Cheong wrote: point is that it exhibits intelligence and power. As the zen says: “practically beats purity” I can’t speak for everyone, but I think use cases are the primary driver to adding anything to. Python. That is, in order to add something, it needs to be: A) useful And B) provide an easier/clearer way to write to code and/or significantly improve performance. “Exhibiting intelligence and power” is not enough. -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Indeed, from previous replies, I have already learnt that use-cases are the primary driver here around. In fact that should be the general case. I do admit that my assessment is too abstractive for any feasible considerations. I was looking at it from the algorithmic sense, that if a function is performant then a handful if not many problems, discovered or undiscovered, would have been avoided through efficiency. For a little instance, we have the efficient BWT algorithm, life before it was normal and progressing, but with it data compression improved. It wasn't needed, but with it we improved. This is just the line of thought, hehe. Just for comment, now that you have outlined a more conditioned judgement as to how good an idea is, I would like to say that it does improve performance - maybe a little bit of time, but space is a sure. Does it improve coding, well, if the notations remain the same, then no change, if a different semantic is introduced, then it depends. Useful - ah, relates to above, relates to what many have already from before. The Zen is the wisest: since practicality beats purity, a function is only worth used when its code-friendly and readable, which points out that it heavily depends on the semantics we come up with. I think how useful it is realistically how simple it is to read it and code it. I guess it's just semantics! Thanks for the feedback!
I see. I do agree that my reply brings about that 'verbose repeated' feeling, haha. But for the record, it's not about having something in hand now for the future, but it's more of a paradigmatic approach to the implementation. Python has changed for the better in terms of necessity: - map() returns an iterator instead of a whole list at once - generator yields values only when needed (unlike list comprehension) So I thought, 'Why do we need to make a reversed copy to assign it to the original part, when we can simply reverse the original part itself.' That's the paradigm. Anyway, thanks.
On Sat, 6 Mar 2021 at 07:52, Vincent Cheong <vincentcheong6945@gmail.com> wrote:
So I thought, 'Why do we need to make a reversed copy to assign it to the original part, when we can simply reverse the original part itself.' That's the paradigm.
A few points strike me here: 1. The question you asked ("why do we need to") has an obvious and trivial answer. "We don't need to". But so what? It's what we have, and the status quo tends to win. If you're arguing for change, you need to argue that the *cost* of that change is worth it, so you need to be looking at what we will *gain*. 2. To put this another way, as far as this list is concerned, you're phrasing the question backwards. Because backward compatibility and availability of people for implementation and maintenance are real costs for any proposal, the question that matters *here* is "Why do we need partial-reverse to be a built in operation?" 3. If you're interested in the idea, you can, of course, implement it and see how it works out. No-one is stopping you writing either an extension that implements this, or a patch to Python. That's basically the *whole point* of open source :-) And then, coming to this list saying "I made this patch that implements in-place partial reversal of lists, would it be worth submitting it as a PR?" would be a much easier place to start from (because you're offering something that's already eliminated some of those costs, as well as demonstrating that you've looked at the practical aspects of the proposal). Basically, even though this list is about "ideas", purely theoretical "wouldn't it be nice if..." discussions don't tend to get very far (or if they do, it's "far" in the sense of "way off-topic" :-)). A little bit of work or thinking on the practical aspects of a proposal tends to help a lot. Paul
I see. You have coined the term exactly, partial-reverse. Nice. You have also put forward a realistic question of 'why do we need'. Well, surely not everyone needs it and definitely it's not urgently needed, but its just the counterintuitive incompleteness such that 'it works for a whole, but not part of it', you see. About the gain, of course it's unlike a monumental speed boost, but its just a little spot that I saw lacking in power. What I had in mind was the algorithmic cost to the program itself, not the cost in developing it. But now that you explained to me, I understand the situation. Thanks for the information.
On Sat, 6 Mar 2021 at 10:42, Vincent Cheong <vincentcheong6945@gmail.com> wrote:
I see.
You have coined the term exactly, partial-reverse. Nice. You have also put forward a realistic question of 'why do we need'. Well, surely not everyone needs it and definitely it's not urgently needed, but its just the counterintuitive incompleteness such that 'it works for a whole, but not part of it', you see. About the gain, of course it's unlike a monumental speed boost, but its just a little spot that I saw lacking in power.
What I had in mind was the algorithmic cost to the program itself, not the cost in developing it. But now that you explained to me, I understand the situation.
Thanks for the information.
To put this in context, I think that if you were to create a PR for Python, implementing this change, and post it as a feature request to bugs.python.org, it may well be accepted without (much) debate. It's a classic case of "actions speak louder than words", basically - it's fairly easy for a core dev to look at a PR, think "yes, this is a simple and logical enhancement" and focus on tidying up any technical details before simply merging it. Whereas coming to a discussion forum like this one, and opening with (in effect) "it would be nice if someone did X" tends to get everyone thinking about what it would need to persuade them to spend time writing that code, what they'd think about when writing it, etc, etc. And what you get is a long discussion, but little actual action. Paul
06.03.21 09:52, Vincent Cheong пише:
I see. I do agree that my reply brings about that 'verbose repeated' feeling, haha. But for the record, it's not about having something in hand now for the future, but it's more of a paradigmatic approach to the implementation. Python has changed for the better in terms of necessity:
- map() returns an iterator instead of a whole list at once - generator yields values only when needed (unlike list comprehension)
So I thought, 'Why do we need to make a reversed copy to assign it to the original part, when we can simply reverse the original part itself.' That's the paradigm.
Your proposition is not related to changes in map() and other builtins which return now an iterator instead of list. It is more similar to idea of adding parameters start and stop to map(), so it could be appplied to a part of the list. More general approach would allow you to write a[i:j] = a[i:j][::-1] and it would be a zero-overhead operation because slices are lazy or because the compiler recognize the pattern and generate the optimal code. But it requires a sophisticate compiler. Less sophisticate approach is using an explicit wrapper for lazy slices: a[i:j] = view(a)[i:j][::-1]
Indeed, it's not directly related, perhaps misunderstanding, but I'm just drawing the similar idea between the two situations of not taking memory first. If you slice, you make a copy and that takes space. So, the space complexity is no longer O(1). It's just that, not that it has any direct relation to map() function. Perhaps a generator is a better analogy to use in the first place because a generator does not make a whole list first, it does not take up as much space upon creation. Thanks for the feedback.
[This is the revised version of the previous reply which contained mistakes] Sorry for not explaining the background of my idea. I'm involved in the research area of sorting algorithms. Reversals are part of sorting and correct me if wrong, `list.reverse()` is the fastest method to reverse an entire list, which is also in-place. Yet, it doesn't work for a subsection of it. If not mistaken, the easiest way is to reassign that subsection with a reversed slicing, One, it is not as fast anymore. Two, by time complexity, the algorithm is no longer in-place because of the slicing. Thus, this is the entire story.
I think that, perhaps, you are trying to say that reversing a list requires no additional storage and can be done in place. Yes! This is what I meant.
To balance those costs, we require something more than "Wouldn't it be good...?", we require an actual, real life use-case for the feature. (And even then, it's not a guarantee that we will accept the proposed feature.) I try not to enter into details and technicalities. I try to help by proposing ideas, abstract thoughts, and visions. Anyways, there are many things we do that involve reversals and subsection reversals - it's not an uncommon thing to do. Thus, it will help. It's better to shape this perspective: with the proposed idea, Python gains more power, and along the way, when the time comes we need it, we already have it. For the short term or time being, normal reversal tasks would be the first to benefit.
But even easy changes have some cost: the work has to be done, tests written and run, documentation updated, and that adds one more thing for people to learn. This is my assessment: list.reverse() lacks functionality because it only works for whole lists. Once any intention goes out of the radius of the stated purpose, it becomes completely unusable. So we should agree that it lacks functionality and versatility. This is unworthy and Python should be more powerful than this. Noticing that it takes no arguments, it has room for improvement where we can conveniently add 'start' and 'end' parameters. Those mentioned costs I believe are part of daily development hassles, but I would like to comment more on the 'adds one more thing for people to learn'. Assuming that we do add the two parameters, I would foresee as just an 'upgraded version' of the function and the changes are as simple as 'Now, list.reverse() can take in two arguments which allows us to reverse a specific range in the list, not just the complete list.' Therefore, it will not pose a heavy stuff for learning. It should be a situation where people will be like, for experienced, 'Oh, now list.reverse() works on a specific range instead of just the entire list', and for newcomers, 'Oh, list.reverse() can work on specific ranges too.'
Some additional questions: Do we extend this feature to the reversed() built-in? What about sorting? Interesting. The second one is mainly my lack of explanation which I have cleared at the top. For the first one, I have not given much thought about it since it performs quite differently, hence a quite different territory, but it is interesting to explore. It returns an iterator for any given sequence, if I'm not mistaken. Reversed is used when we need to do something more to each item, in the opposite direction. It's usually used in a loop, list comprehension, etc. It can also be used to make a reversed list by putting it into the list constructor but then you have slicing more suitably for that. Back to the first functionality, we would usually traverse an entire list, and if one needs to traverse only a section of it, we would be passing in a slice (which already creates a copy of it, correct me if wrong). So, issues still tend to revolve around slicing and its copy-making behavior. I understand that we are encouraged to avoid mutating existing data, but that also brings in the issue of necessity - must we always make a copy of something? There are quite a number of things we do which traverse sections of lists, so by adding the two arguments, we are giving reversed() an in-place capability. On one hand, this allows us to save memory for larger and larger lists. On the other hand, it seems unnecessary because we can also use range() and indices within reversed() to traverse a section, but then that would be unclean as compared to having simply declare a 'start' and 'stop'. Here in reversed(), the proposal may not have much of a position of strength for consideration, compared to list.reverse(), but it is worth weighing how simpler, cleaner or better iterators are having 'start' and 'stop' arguments. You have extra flexibility but is it worth it; that would be better answered with more assessments by a wider range of audience.
Vincent Cheong writes:
Sorry for not explaining the background of my idea. I'm involved in the research area of sorting algorithms. Reversals are part of sorting
I'm curious: Many of the sorting algorithms I know use swap pairs of elements, but what sorting algorithm reverses segments longer than 3?
Depends on the implementation. If you, instead of swapping pair by pair one by one, rewrite that sequence in the opposite direction and that sequence is longer than 3, it already fits the situation. A block swap algorithm swaps two elements of an array. If out-of-place, you can specify more than 3 items, and you would have been doing that. Thanks.
On Mon, Mar 08, 2021 at 04:59:26PM +0900, Stephen J. Turnbull wrote:
I'm curious: Many of the sorting algorithms I know use swap pairs of elements, but what sorting algorithm reverses segments longer than 3?
I have a feeling that Timsort may do that --- I believe it looks for either ascending or descending runs, sorts them, and reverses them if necessary. Of course I may be wrong. -- Steve
On Fri, Mar 5, 2021 at 3:23 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, Mar 05, 2021 at 04:27:27PM -0000, Vincent Cheong wrote:
Currently, list.reverse() only works for an entire list. If one wants to reverse a section of it 'in-place', one needs to slicing which makes the space complexity no longer O(1).
The space complexity of a list is not constant, O(1).
The lists [] and [1]*100000 do not take the same amount of space.
I think that, perhaps, you are trying to say that reversing a list requires no additional storage and can be done in place.
For what it's worth, that is exactly what it means for the space complexity to be constant. Space complexity of an algorithm refers to the additional storage needed to execute that algorithm, on top of the space used by the input. And so list.reverse() is O(1) in space and O(n) in time, if you hold pointer/index sizes to be O(1). (This is a big "if", and reverse is more accurately O(log n) in space, because pointers/indexes into a sequence of size n are O(log n) if you let n grow arbitrarily large.) I don't know good reading for this, maybe https://en.wikipedia.org/wiki/DSPACE#Machine_models or https://en.wikipedia.org/wiki/In-place_algorithm - What about sorting?
This is a good question! In fact most operations you might want to do with a list, you might want to do with a sub-list. There's even a bunch of methods that take start/stop already, even while there's others that don't, and no good rule of thumb for which methods have them and which don't. For example, list.index() takes optional start/stop, but list.remove() and list.count() don't. It even changes between types: bytes.count *does* support start/stop (as does str.count), even though this is not required of Sequence or even of ByteString. Speaking of which, ByteString.index() didn't have start/stop until 3.5, and... Ad-hoc start/stop parameters are annoying, but honestly it seems like the inconsistency shouldn't and doesn't block adding new start/stop parameters, since the inconsistency is already there, (Other languages that focus on avoiding copying find it more useful to use a vocabulary type that represents a sub-list -- for example, std::span<T> in C++, &[T] in Rust, or []T in Go. (The closest thing in Python that I'm aware of is memoryview.) Once you have that kind of lazy slice, you don't need index start/stop parameters for any particular method, and the whole problem mostly vanishes in a poof of smoke.) -- Devin
Insightful! You mentioned terms like 'memory_view', and 'lazy slice'. You felt the pulse of the situation. But the most elegant thing (I had it a long time ago but you brought it up before, haha) is that you notice the downside of copies - you indicated how a lazy slice is the magic wand that eliminates the problem. I have thought of making slices 'lazy' as a solution which conveniently and smoothly avoids the need for a 'start' and 'stop' parameter, but I could not bring myself to float that idea up here because I already had a hunch that it is, simply said, too much to ask for, too bizarre to talk about. A slice that acts a 'view' rather than a 'copy' shares the same lazy idea as like generators yielding items only when needed and map() returns an 'slim' iterator rather than a 'bulky' list. It's like a slogan that says: 'Do only if necessary'. It gives the flexibility of what to do with the slice, not always taking up memory first. That would be intelligent, in a way. Thanks for the input!
participants (9)
-
2QdxY4RzWzUUiLuE@potatochowder.com
-
Christopher Barker
-
David Mertz
-
Devin Jeanpierre
-
Paul Moore
-
Serhiy Storchaka
-
Stephen J. Turnbull
-
Steven D'Aprano
-
Vincent Cheong