
On 12/30/2012 9:10 AM, Ned Batchelder wrote:
On 12/30/2012 5:46 AM, Terry Reedy wrote:
On 12/29/2012 10:25 PM, David Kreuter wrote:
I think it would be nice to have a method in 'list' to replace certain elements by others in-place. Like this:
l = [x, a, y, a]
l.replace(a, b)
assert l == [x, b, y, b]
The alternatives are longer than they should be, imo. For example:
for i, n in enumerate(l):
Note that enumerate is a generic function of iterables, not a specific list method.
    if n == a:
        l[i] = b
I don't see anything wrong with this. It is how I would do it in Python. Wrap it in a function if you want. Or write it on two lines ;-).
My deeper objection is that 'replace_all_in_place' is a generic mutable collection function, not a specific list or even mutable sequence function. Python 1 was more strongly list oriented. Python 3 is mostly iterable oriented, with remnants of the Python 1 heritage.
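For illustration, a minimal sketch of the "wrap it in a function" suggestion, written against the MutableSequence ABC to underline that nothing about it is list-specific (the name replace_all is mine, not from the proposal):

from collections.abc import MutableSequence

def replace_all(seq: MutableSequence, old, new):
    # Replace every item equal to old with new, in place.
    for i, item in enumerate(seq):
        if item == old:
            seq[i] = new

# Works on any mutable sequence, e.g. list or bytearray:
# lst = ['x', 'a', 'y', 'a']; replace_all(lst, 'a', 'b')  # lst == ['x', 'b', 'y', 'b']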
I wonder at the underlying philosophy of things being accepted or rejected in this way. For example, here's a thought experiment: if list.count() and list.index() didn't exist yet, would we accept them as additions to the list methods?
I personally would have deleted list.find in 3.0. Count and index are not list methods but rather sequence methods, part of the sequence ABC. Tuples got them, as their only two public methods, in 3.0 to conform. This ties in to Nick's comment. (Actually, counting a particular item in a collection is not specific to sequences, but having multiple items to count tends to be.) It would be possible for count and index to be functions instead. But their definition as methods goes back to Python 1. Also note that .index has a start parameter, making it useful to get all indexes. See the code below.
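As a quick illustration of that point (my example, not from the original post), any class that implements the Sequence ABC gets count and index as mixin methods for free:

from collections.abc import Sequence

class Chars(Sequence):
    def __init__(self, s):
        self._s = s
    def __getitem__(self, i):
        return self._s[i]
    def __len__(self):
        return len(self._s)

c = Chars('banana')
print(c.count('a'))  # 3 -- count comes from the Sequence ABC
print(c.index('n'))  # 2 -- so does index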
By Terry's reasoning, there's no need to, since I can implement those operations in a few lines of Python.
We constantly get proposals to add new functions and methods that are easily written in a few lines. Everyone thinks their proposal is useful because it is useful in their work. If we accepted all such proposals, Python would have hundreds more.
Does that mean they persist only for backwards compatibility?
Backwards compatibility is important. Changing them to functions would be disruptive without sufficient gain.
Was their initial inclusion a violation of some "list method philosophy"?
No, it was part of the Python 1 philosophy of lists as the common data interchange type. As I said, this has changed in Python 3.
Or is there a good reason for them to exist, and if so, why shouldn't .replace() and .indexes() also exist?
Neither is a list method. Nick gave a generic indexes generator. A specific list indexes generator can use repeated applications of .index with a start argument. I do that 'inline' below.
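Such a generator might look roughly like this (a sketch with my own naming, not Nick's code), using list.index with a start argument exactly as described:

def indexes(lst, value):
    # Yield every index at which value occurs, via repeated .index calls.
    i = -1
    while True:
        try:
            i = lst.index(value, i + 1)
        except ValueError:
            return
        yield i

# list(indexes(['x', 'a', 'y', 'a'], 'a')) == [1, 3]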
I would hate for the main criterion to be, "these are the methods that existed in Python 2.3,"
Then you are hating reality ;-). The .method()s of the basic builtin classes are close to frozen.
There is a perfectly good Python version above that does the necessary search and replace as efficiently as possible. Thank you for posting it.
You say "as efficiently as possible," but you mean, "as algorithmically efficient as possible," which is true, they are linear, which is as good as it's going to get. But surely if coded in C, these operations would be faster.
You are right. Let's do the next-item search in C with .index. If the density of items to be replaced is low, as it would be for most applications, this should dominate.

def enum(lis):
    for i, n in enumerate(lis):
        if n == 1:
            lis[i] = 2

a, b = 100, 10000  # started with 2, 1 for initial tests
start = a*([1]+b*[0])
after = a*([2]+b*[0])

# test that the functions are correct before testing speed!
# since the list is mutated, it must be reset for each test
lis = start.copy()
enum(lis)
print('enum: ', lis == after)

def repin(lis):
    i = -1
    try:
        while True:
            i = lis.index(1, i+1)
            lis[i] = 2
    except ValueError:
        pass

lis = start.copy()
repin(lis)
print('repin: ', lis == after)

from timeit import timeit

# now for speed, remembering to reset for each test
# first measure the copy time, to subtract from the test times
print(timeit('lis = start.copy()',
             'from __main__ import start', number=10))
print(timeit('lis = start.copy(); enum(lis)',
             'from __main__ import start, enum', number=10))
print(timeit('lis = start.copy(); repin(lis)',
             'from __main__ import start, repin', number=10))
# measure a scan without replacements, to give an upper limit
# for a python-coded replace;
# since lis is not mutated here, it only needs to be defined once
print(timeit('repin(lis)',
             'from __main__ import a, b, repin; lis = a*(b+1)*[0]', number=10))

This prints:

enum:  True
repin:  True
0.06801244890066886
0.849063227602523
0.2759397696510706
0.20790119084727898

After subtracting the copy time and dividing by the 10 repetitions, enum takes .078 seconds for 100 replacements in 1000000 items, repin just .021, which is essentially the time it takes just to scan 1000000 items. So doing the replacements also in C would not be much faster. Rerunning with 10000 replacements (a, b = 10000, 100), the times are .080 and .024.

--
Terry Jan Reedy