
On 12/30/2012 10:05 AM, Nick Coghlan wrote:
On Mon, Dec 31, 2012 at 12:10 AM, Ned Batchelder <ned@nedbatchelder.com> wrote:
The two sides (count/index and replace/indexes) seem about the same to me:
- They are unambiguous operations. That is, no one has objected that reasonable people might disagree about how .replace() should behave, which is a common reason not to add things to the stdlib.
- They implement simple operations that are easy to explain and will find use. In my experience, .indexes() is at least as useful as .count().
- All are based on element equality semantics.
- Any of them could be implemented in a few lines of Python.
What is the organizing principle for the methods that list (or any other built-in data structure) should have? I would hate for the main criterion to be, "these are the methods that existed in Python 2.3," for example. Why is .count() in and .replace() out?

The general problem with adding new methods to types rather than adding new functions+protocols is that it breaks ducktyping. We can mitigate that now by adding the new methods to collections.abc.Sequence, but it remains the case that relying on these methods being present rather than using the functional equivalent will needlessly couple your code to the underlying sequence implementation (since not all sequences inherit from the ABC, some are just registered).
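The difference between inheriting from the ABC and merely being registered with it can be sketched as follows. The classes Inherits and Registered are hypothetical names made up for this illustration; only collections.abc.Sequence and its register() method come from the standard library.

```python
from collections.abc import Sequence


class Inherits(Sequence):
    """Subclasses the ABC, so it gets the mixin methods for free."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, i):
        return self._items[i]

    def __len__(self):
        return len(self._items)


class Registered:
    """Implements the same protocol, but is only registered as a virtual subclass."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, i):
        return self._items[i]

    def __len__(self):
        return len(self._items)


Sequence.register(Registered)

inherited = Inherits("abca")
registered = Registered("abca")

# Both objects pass the isinstance check against the ABC...
both_are_sequences = isinstance(inherited, Sequence) and isinstance(registered, Sequence)

# ...but only the real subclass picks up count() and index() from the ABC.
inherited_count = inherited.count("a")
registered_has_count = hasattr(registered, "count")
```

Code that calls seq.count(...) would work on the first object and fail on the second, even though both claim to be sequences.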
We also have a problem with replace() specifically that it *does* already exist in the standard library, as a non-mutating operation on str, bytes and bytearray. Adding it as a mutating method on sequences in general would create an immediate name conflict in the bytearray method namespace. That alone is a dealbreaker for that part of the idea.
I don't understand the conflict? .replace() from sequence does precisely the same thing as .replace() from bytes if you limit the arguments to single-byte values. It seems perfectly natural to me. I must be missing something.
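For reference, a small sketch of how the existing bytearray.replace() behaves, which may help pin down where the two meanings agree and where they differ. The variable names are illustrative only.

```python
data = bytearray(b"xaya")

# The existing method returns a new object and leaves the original untouched.
result = data.replace(b"a", b"b")
original_unchanged = data == bytearray(b"xaya")
result_replaced = result == bytearray(b"xbyb")

# Its arguments are bytes-like subsequences, not integer elements.
try:
    data.replace(ord("a"), ord("b"))
    accepts_ints = True
except TypeError:
    accepts_ints = False

# Because it works on subsequences, the result may differ in length.
longer = data.replace(b"a", b"zz")
```

With single-byte arguments the resulting contents match an element-wise replacement, but the existing method does not mutate in place and does not accept the integers that iterating over a bytearray yields.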
The question of an "indices" builtin or itertools function is potentially more interesting, but really, I don't think the algorithm David noted in his original post rises to the level of needing standardisation or acceleration:
def indices(seq, val):
    for i, x in enumerate(seq):
        if x == val:
            yield i

def map_assign(store, keys, val):
    for k in keys:
        store[k] = val

def replace(seq, old, new):
    map_assign(seq, indices(seq, old), new)

seq = [x, a, y, a]
replace(seq, a, b)
assert seq == [x, b, y, b]
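As a runnable sketch of the same three functions with concrete values, the following applies them unchanged to a list, a bytearray, and an array.array. The sample values are assumptions chosen for illustration; the point is that the functional form needs only iteration and item assignment, not any particular method on the type.

```python
from array import array


def indices(seq, val):
    # Yield each position whose element compares equal to val.
    for i, x in enumerate(seq):
        if x == val:
            yield i


def map_assign(store, keys, val):
    # Assign val at every key; works for any object supporting store[k] = val.
    for k in keys:
        store[k] = val


def replace(seq, old, new):
    # In-place, element-wise replacement; the length of seq never changes.
    map_assign(seq, indices(seq, old), new)


as_list = ["x", "a", "y", "a"]
replace(as_list, "a", "b")

as_bytearray = bytearray(b"xaya")
replace(as_bytearray, ord("a"), ord("b"))  # elements of a bytearray are ints

as_array = array("i", [1, 2, 3, 2])
replace(as_array, 2, 9)
```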
Does this mean that if .index() or .count() didn't already exist, you wouldn't add them to list?
Cheers,
Nick.