Proposal how to remove all occurrences of a value from a Python list

Hello everyone,

The main idea of this proposal is to create a convenient, maybe more Pythonic, way to remove all occurrences of a value from a Python list.

Suppose we have a list 'arr':

    arr = [1, 2, 3, 1]

and we want to remove all 1s from this list. The most Pythonic way to do it is:

    arr[:] = (x for x in arr if x != 1)

That looks good, at least to an experienced developer, but for Python newcomers the solution will look like this:

    >>> while True:
    ...     try:
    ...         a.remove(1)
    ...     except:
    ...         break

I was surprised that Python doesn't have an easy way to remove all occurrences. I am currently reading the book "Effective Python", and I encountered the good idea that it is also important to have code that is readable for new developers and for developers coming from other languages. To my mind, the current Pythonic 'remove all occurrences' is not readable code and does not give insight (at least at first glance) into what happens in the code.

This way looks a little bit nicer. I think so..

    a.remove(1, all = True)

So, that's the idea in brief. Below you can find a description of how to install and test the patch. The patch is attached. Note: this is a test patch, so some things may not be done in accordance with PEP 7.

How to install the patch:

    hg update 3.5
    cd Objects
    patch < remove_all.patch
    make

How to use the patch:
    >>> arr = [1, 2, 3, 1]
    >>> arr.remove(1)
    >>> arr
    [2, 3, 1]
    >>> arr = [1, 2, 3, 1]
    >>> arr.remove(1, all = True)
    >>> arr
    [2, 3]
    >>> arr = [1, 2, 3, 1]
    >>> arr.remove(1, True)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: remove() takes one argument
    >>> arr.remove(1, al = True)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'al' is an invalid keyword argument for this function

Many thanks for your attention!

- Eduard
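
For readers who want to try the proposed behaviour without building the patch, here is a minimal pure-Python sketch. It is not the attached patch: `RemovableList` is a hypothetical subclass invented for illustration, and what the real patch does when the value is absent is not stated in the post, so this sketch simply does nothing in that case.

```python
class RemovableList(list):
    """Hypothetical list subclass; not part of Python and not the attached patch."""

    def remove(self, value, *, all=False):
        if not all:
            # Built-in behaviour: drop the first match, raise ValueError if absent.
            super().remove(value)
            return
        # Single pass, in place. Assumption: a missing value is not an error here.
        self[:] = [x for x in self if x != value]


arr = RemovableList([1, 2, 3, 1])
arr.remove(1)
first_only = list(arr)       # only the first 1 is gone

arr = RemovableList([1, 2, 3, 1])
arr.remove(1, all=True)
all_removed = list(arr)      # every 1 is gone
```

Making `all` keyword-only mirrors the session above, where `arr.remove(1, True)` is rejected with a TypeError.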

On 10/3/2015 11:05 AM, Eduard Bondarenko wrote:
This is a special case of in-place filtering.
    arr[:] = filter(lambda x: x != 1, arr)
Python programmers really must learn either comprehensions or map and filter. They must also learn that repeatedly scanning a sequence should be avoided when possible.
> I was surprised that Python doesn't have an easy way to remove all occurrences.
The two ways above are pretty easy.
I would not mind this, but I don't know if there are enough use cases to justify an addition.

-- Terry Jan Reedy
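
The phrase "in-place filtering" is worth a small illustration. The sketch below (variable names are only for the example) shows that slice assignment changes the existing list object, so every other reference to it sees the result, while a plain assignment just rebinds one name.

```python
arr = [1, 2, 3, 1]
alias = arr                                # second name for the same list object

arr[:] = filter(lambda x: x != 1, arr)     # slice assignment mutates in place
in_place_result = list(alias)              # the alias sees the filtered contents

arr = [1, 2, 3, 1]
alias = arr
arr = [x for x in arr if x != 1]           # plain assignment rebinds the name only
rebound_alias = list(alias)                # the alias still holds the old contents
```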

On 10/3/2015 17:54, Terry Reedy wrote:
I am generally opposed to this. This is sugar for filtering on a predicate, with the predicate simply being predefined as `bar == foo`. There are a number of other common filtering use cases; should we include them all? Does this make Python an easier language to use? In the new user case, does this teach, or encourage the new user to intuit, how any other filtering operations should be written? Would it not benefit the new user more to just have them learn listcomps, map, filter, and even while loops?

On 03/10/2015 22:54, Terry Reedy wrote:
Other things that might be useful (sorry, I have no use cases in mind):
- a.remove(1, count) # maximum number of removals, analogous to aString.replace(old, new, count)
- a remove-like function that does not raise an error if the item is not present
- a remove-like function that returns the mutated list (like sorted(), as opposed to list.sort())
It's not obvious to me how to design good API(s) to do some/all of this.

Rob Cliffe
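
The three ideas above can be sketched as ordinary functions. All three names (`remove_count`, `discard`, `without`) are hypothetical, chosen only for this illustration, and the behaviour on a missing value is an assumption rather than anything agreed in the thread.

```python
def remove_count(lst, value, count=1):
    """Remove up to `count` leading occurrences of value in place; return how many went."""
    removed = 0
    kept = []
    for x in lst:
        if removed < count and x == value:
            removed += 1          # drop this occurrence
        else:
            kept.append(x)
    lst[:] = kept
    return removed


def discard(lst, value):
    """Like list.remove, but do nothing if value is absent."""
    try:
        lst.remove(value)
    except ValueError:
        pass


def without(lst, value):
    """Return a new list without value, leaving lst alone (as sorted() does)."""
    return [x for x in lst if x != value]
```

Note that `remove_count` does not raise when nothing matches, so even with `count=1` it is not a drop-in replacement for `list.remove`.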

On 04/10/2015 00:14, Rob Cliffe wrote:
Make the default count=1 to preserve the current behaviour and wouldn't everybody be happy?
-- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence

On Oct 3, 2015, at 14:54, Terry Reedy <tjreedy@udel.edu> wrote:
I don't think this is more Pythonic. You've wrapped a perfectly good expression in an unnecessary function just so you can unnecessarily use a higher-order function. (Of course, by the same token, if you already have a perfectly good function lying around, using filter makes more sense than wrapping it up in an unnecessary call expression just so you can avoid a higher-order function. And there are cases where it is unclear which is more appropriate--e.g., if what you have lying around is a type or instance with a bound or unbound method, or something you can call partial on, is that really better than wrapping it in an expression? But this isn't either of those cases.) I'd have no problem with a novice who didn't yet know comprehensions writing this, but I wouldn't want to teach it as a better alternative to comprehensions in cases where it's not actually better.
Agreed. In my experience, most novices who ask about code like this, once you explain to them that remove() has to keep repeatedly scanning from the start, immediately see why that's bad and ask how you can avoid doing that. That's a perfect opportunity to teach them about comprehensions while they're looking for exactly what comprehensions can do. There are some exceptions, but those aren't novices, they're intermediate-experienced C devs who insist that the following is the "simplest" code and must be the fastest or Python is broken (even though it's not necessarily fastest even in C, largely because memmove is so much faster than element by element move):

    idx = skip = 0
    length = len(arr)
    while idx+skip < length:
        if arr[idx+skip] == element_to_remove:
            skip += 1
        else:
            arr[idx] = arr[idx+skip]
            idx += 1
    del arr[idx:]  # arr[:idx] now holds exactly the kept elements
It's readable to anyone who understands comprehensions and slice assignment. And both of those are such fundamental concepts that, if you don't understand either of them yet at all, your intuitions aren't very good yet. But notice that there's nothing stopping you from wrapping this up in a function, or adding whitespace or comments to help yourself work through it:

    def remove_all(arr, value):
        """Remove all instances of value from arr"""
        arr[:] = (x for x in arr      # keep every element
                  if x != value)      # that doesn't equal value

And now, everywhere you use it looks like this:

    remove_all(arr, 1)

And it's hard to imagine anything more readable. And, even if remove_all isn't the kind of function an experienced developer would write, learning how to factor out the tricky bits into documentable and testable functions is one of the most useful skills for any developer in any language.
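
The point about repeated scanning can be made concrete by counting equality comparisons instead of timing anything. This is only an illustrative sketch: the `Counted` wrapper and the `count_comparisons` helper are invented for the example, and the test data (non-matching items first, matching items last) is chosen to make the rescanning visible.

```python
class Counted:
    """Wraps a value and counts every == comparison (illustration only)."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.comparisons += 1
        other_value = other.value if isinstance(other, Counted) else other
        return self.value == other_value


def remove_loop(arr, value):
    while value in arr:        # scans from the start every time
        arr.remove(value)      # scans from the start again, then shifts the tail


def remove_all(arr, value):
    arr[:] = [x for x in arr if x != value]   # one pass over the list


def count_comparisons(func, n=50):
    # n non-matching items followed by n matching ones
    data = [Counted(0) for _ in range(n)] + [Counted(1) for _ in range(n)]
    Counted.comparisons = 0
    func(data, 1)
    return Counted.comparisons


loop_count = count_comparisons(remove_loop)
single_pass_count = count_comparisons(remove_all)
```

The single pass performs one comparison per element, while the loop's count grows much faster as the list gets longer.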

On 04.10.2015 01:32, Andrew Barnert via Python-ideas wrote:
arr.remove_all(1)
> And, even if remove_all isn't the kind of function an experienced developer would write, learning how to factor out the tricky bits into documentable and testable functions is one of the most useful skills for any developer in any language.
True. Btw. the same is true for Python core devs. This said, I would appreciate the method 'remove_all' provided by the stdlib. ;-) Best, Sven

On 10/5/2015 2:28 AM, Sven R. Kunze wrote:
The problem with methods is that they only work with one class. A list.removeall would only remove things equal to a specific item from a list (and presumably in place). We already have both a generic filter function and syntax that will remove all items from any iterable that meet any condition. The stream can be fed into any other function that accepts an iterable.

-- Terry Jan Reedy

On 05.10.2015 22:18, Terry Reedy wrote:
If you have more than a few values to remove, it's often faster to create a new list, since each removal will require a copy operation of all trailing items and (every now and then) a realloc of the list object to free up the unused space:

    new_arr = [x for x in arr if x != 1]
-- Marc-Andre Lemburg

On 06.10.2015 13:24, M.-A. Lemburg wrote:
I think that's a technical detail which can somehow be sorted out more efficiently in C. But the proposal is not about performance; it is about readability and maintainability.
remove also removes in place, but it's not overly useful when you know, or are unsure whether, you need to remove an item multiple times. remove_all would basically complement remove as a more generic alternative. Btw. one could also think of an additional generalization for this that basically takes n arguments, which are then all removed from the list in question:

    a = [1,2,3,5,4,2,3,4,1,2]
    a.remove_all(1,4,2)
    a == [3,5,3]

Just thinking, and I might remember some time where I would have found it useful. Not sure.

Best,
Sven
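
As a rough sketch of that n-argument generalization, written as a free function rather than a list method: the name `remove_all` follows the message above, but the signature and semantics are assumptions made for this example.

```python
def remove_all(lst, *values):
    """Remove every element equal to any of `values`, in place (hypothetical helper)."""
    # A tuple membership test uses ==, so unhashable values work too.
    lst[:] = [x for x in lst if x not in values]


a = [1, 2, 3, 5, 4, 2, 3, 4, 1, 2]
remove_all(a, 1, 4, 2)
```

After the call, `a == [3, 5, 3]`, matching the example in the message.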

On Wed, Oct 7, 2015 at 3:29 AM, Sven R. Kunze <srkunze@mail.de> wrote:
The more generalizations you offer, the more tempting the comprehension looks. It already allows as many generalizations as you like, because comprehensions are a fully-supported part of the language.

ChrisA

On 07.10.2015 00:50, Chris Angelico wrote:
To you, not to everybody.
You miss the point. Common use-cases deserve methods of their own. People will never stop asking for meaningfully named methods, even when pretending comprehensions are the ultimate answer to all questions concerning lists/sets/dicts. They simply are not. Which variant conveys the intent of the developer more clearly?

    a.remove_all(1,4,2)
    a[:] = [x for x in a if x not in {1,4,2}]

I have to admit, the latter variant has a certain appeal if you love special characters.

Best,
Sven

On Wed, Oct 7, 2015 at 10:21 AM, Sven R. Kunze <srkunze@mail.de> wrote:
Maybe, but how many other variants do you need? "remove all elements 2.7<x<7.5"? "remove all strings that begin with 'a'"? Every new method you create adds cognitive load to everyone who reads the docs for the list object, and every new feature of a method adds cognitive load to understanding that method. Is it worth it? How common *is* this case? Is it really worth having the method? Comprehensions already exist, and are already general enough to handle all the variants.

ChrisA
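
The two variants named above can each be written as a one-line comprehension. The sample data is made up for the example.

```python
nums = [1.0, 2.7, 3.5, 7.5, 9.0]
# "remove all elements 2.7<x<7.5": keep everything outside the open interval
nums[:] = [x for x in nums if not 2.7 < x < 7.5]

words = ['apple', 'banana', 'avocado', 'cherry']
# "remove all strings that begin with 'a'"
words[:] = [w for w in words if not w.startswith('a')]
```

Only the condition changes between the two; the surrounding slice assignment is the same as in the single-value case.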

On Wed, 2015-10-07 at 11:01 +1100, Chris Angelico wrote:
Python is supposed to make use of duck-typing. Well... I say the standard library should apply that philosophy. If we are supposed to be a dynamic language, why can't we be like Perl, and do list operations if we see lists, and scalar/single operations if we see single value types? It really comes down to "quack like a duck", in this case - Is it possible to loop over each thing in the given list, treat it like a string, and replace it with the string we apply to each iteration? I believe it should be.

    print(['One thing', 'another thing', 'each of these', 'on their own line'])

Looking at that... isn't the intent very obvious? Why should Python not respect that?

    >>> "A long thing".replace("long ", '')
    'A thing'

    "A plyable soft tube an undescribable color.".replace(['soft', 'plyable'], '').replace('undescribable', 'indescribable')

Isn't the behaviour of this code - as it would be, if this worked - fairly obvious? Frankly, it seems much more readable and step-by-step to me than a comparable comprehension. What actually physically happens in the code is readily apparent, unlike in a comprehension, at least until you take in its local scope variables. And let's face it, nested comprehensions can be rather messy. Since Python does not encourage tail calls, there should be a good and efficient way to apply operations everywhere. But maybe map is best? I haven't seen anyone in this thread present an alternative to changing the function - Perhaps some trick of a decorator or similar. But there is no obvious syntax, in my opinion, for accomplishing what I did with the nested .replace calls.

On Tue, Oct 06, 2015 at 09:06:40PM -0700, Emil Rosendahl Petersen wrote:
Yes, the intent is obvious, because I am a sentient human who can read the English text and guess -- I emphasise that it is just a guess -- what you want. Do you believe that Python should do the same thing that I did? Read the individual strings, analyse them as English text, and understand that because the final item says "on their own line" that is your intent. But what if I wrote this instead:

    print(['one string per line', 'three strings per line',
           'everything on one line', 'nah just kidding',
           'print using two equal-spaced columns'])

What is my intention now? If *you* can't guess what I want, how can the interpreter guess? Do I want the following printed one number per line or all numbers on one line? Should I see the list delimiters? Should the output be formatted into multiple columns? How many columns? Should each number be left-justified, right-justified? Centered?

    print([1, 2, 3, 4, 5, 6, 7, 8, 998, 999])

Trying to have a programming language intuit the programmer's *intent* is a fool's errand: it cannot be done successfully. Computers cannot do what we want, they can only do what we tell them to do.
No, of course not. If you think it is obvious, you haven't thought about it in enough detail. What happens if the replacement strings overlap? What happens if the new string contains one or more of the old strings as substrings? Should the order of the old strings make a difference to the final result? Why a list? Isn't that likely to indicate a programming error? If replace() took multiple target substrings to be replaced, I have answers to those questions. But I don't know if those answers are the same as your answers. Maybe they are, maybe they're not. Who knows?
- overlapping target strings shouldn't make a difference;
- neither should the replacement string containing one or more of the targets;
- or the order of the targets;
- but a list probably means a programming error, I would prefer to require a tuple of substrings to be consistent with other string methods, and to avoid any questions of what happens with arbitrary iterables.

-- Steve
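
One possible set of answers to those questions can be sketched with the `re` module. This is only an illustration of one design, not a proposal from the thread: `replace_any` is a hypothetical name, and preferring the longest target at each position is an assumption made so that the order of the targets does not matter.

```python
import re


def replace_any(text, targets, replacement):
    """Replace every occurrence of any target in one left-to-right pass (hypothetical)."""
    if not isinstance(targets, tuple):
        raise TypeError("targets must be a tuple of str")
    if not targets:
        return text
    # Longest first, so the result does not depend on the order given
    # when one target is a prefix of another.
    ordered = sorted(set(targets), key=lambda t: (-len(t), t))
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    # A function replacement avoids backslash processing of the replacement,
    # and the single pass never rescans text it has already produced.
    return pattern.sub(lambda m: replacement, text)


cleaned = replace_any("A plyable soft tube", ("soft", "plyable"), "")
```

By contrast, chaining two `str.replace` calls gives different results for `"abcd"` depending on whether `"ab"` or `"abc"` is replaced first, which is exactly the kind of ambiguity raised above.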

On Sat, Oct 03, 2015 at 06:05:26PM +0300, Eduard Bondarenko wrote:
What's so special about removing all occurrences from a list? How often do you find yourself doing such a thing? These aren't rhetorical questions. They're questions that need to be answered before your proposal can be turned into a new feature. If the answer is "not very special, and only rarely" that won't necessarily rule out the change, but it will make it harder to convince people that it is desirable.
Part of the process of learning to be a programmer is learning to avoid awful code like the above and instead learn general purpose processing techniques like the list comp. In my experience, a beginner is more likely to come up with this:

    while 1 in arr:
        arr.remove(1)

since it is far more straightforward than your version, and involves fewer concepts ("what does try...except do?").
> I was surprised that Python doesn't have an easy way to remove all occurrences.
That depends on what you mean by "easy". Or "obvious". I would consider the list comp to be both. But of course, I don't expect beginners to see things the same way.
I disagree. The list comp *does* give insight into what happens in the code: you iterate over the existing elements of the list, collecting the ones which don't equal 1, and then save them back into the list. Whereas your suggestion:
> This way looks a little bit nicer. I think so..
> a.remove(1, all = True)
is just a mysterious method call. What insight does it give? How does it work? There is no hint, no clue. It might as well be magic. On the other hand, once the programmer can reason about the task well enough to write the list comprehension, they can easily extend it to slightly different tasks:

    # Skip the first 8 items, then remove numbers less than 5:
    arr[8:] = [x for x in arr[8:] if x >= 5]

Your remove(all=True) method cannot be extended and teaches the programmer nothing except how to solve this one problem.

-- Steve
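
Here is a small worked run of that "skip the first 8 items" idea, with made-up data. Skipping the first 8 items means the slice starts at index 8, since indexing is zero-based.

```python
arr = [9, 1, 8, 2, 7, 3, 6, 4, 0, 5, 1, 6, 2]

# Leave arr[0:8] untouched; filter only the tail, writing it back in place.
arr[8:] = [x for x in arr[8:] if x >= 5]
```

The first eight elements keep their small values, while the tail `[0, 5, 1, 6, 2]` shrinks to `[5, 6]`.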

participants (11)
- Alexander Walters
- Andrew Barnert
- Chris Angelico
- Eduard Bondarenko
- Emil Rosendahl Petersen
- M.-A. Lemburg
- Mark Lawrence
- Rob Cliffe
- Steven D'Aprano
- Sven R. Kunze
- Terry Reedy