[Python-ideas] Another attempt at a sum() alternative: the concatenation protocol
Sergey
sergemp at mail.ru
Wed Jul 17 17:03:50 CEST 2013
On Jul 16, 2013 Oscar Benjamin wrote:
> On 16 July 2013 07:50, Nick Coghlan wrote:
>> I haven't been following the sum() threads fully, but something Ron
>> suggested gave me an idea for a concatenation API and protocol. I
>> think we may also be able to use a keyword-only argument to solve the
>> old string.join vs str.join problem in a more intuitive way.
>>
>> def concat(start, iterable, *, interleave=None):
>>     try:
>>         build = start.__concat__
>>     except AttributeError:
>>         result = start
>>         if interleave is None:
>>             for x in iterable:
>>                 result += x
>>         else:
>>             for x in iterable:
>>                 result += interleave
>>                 result += x
>>     else:
>>         result = build(iterable, interleave=interleave)
(I assume there is a `return result` at the end.)
That's an interesting idea. It is somewhat similar to my suggestion #4,
the one with the awful name __init_concatenable_sequence_from_iterable__.
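For reference, here is the quoted function as a runnable sketch. The only thing added is the assumed `return result`; everything else follows the quoted code, including the fact that `interleave` is added before every item, the first one too.

```python
def concat(start, iterable, *, interleave=None):
    try:
        # Types that opt in provide their own builder.
        build = start.__concat__
    except AttributeError:
        # Fallback: repeated in-place addition, as in the quoted code.
        result = start
        if interleave is None:
            for x in iterable:
                result += x
        else:
            for x in iterable:
                result += interleave
                result += x
    else:
        result = build(iterable, interleave=interleave)
    return result  # assumed; the quoted code stops before this line


print(concat("", ["a", "b", "c"]))             # abc
print(concat((), [(1,), (2, 3)]))              # (1, 2, 3)
print(concat("", ["a", "b"], interleave="-"))  # -a-b
```

Note that with a mutable `start` such as a list, `result += x` changes the object passed in as `start`.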
Two questions about this idea:
* What is obj.__concat__ expected to mean? E.g.
    class X:
        def __add__(self, other):
            # returns a new object that is the sum of `self` and `other`
  But:
    class X:
        def __concat__(self, <what_is_here?>):
            <what is it expected to return?>
* What should happen for mixed lists? I.e. the code:
    concat(["str1", "str2", "str3"])
  looks rather obvious, but what about:
    concat(["string", some_object, some_other_object])
  Would it raise an error or not?
  If not, what type would the result of such an operation be?
  What if `some_object` is somehow "concatenable" with a
  string, while the string has no idea how to concat `some_object`?
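To make both questions concrete, here is a small sketch. Everything in it is hypothetical: `Words`, `Suffix` and `fallback_concat` are made-up names, and the `__concat__` signature is only what the quoted call `build(iterable, interleave=interleave)` seems to imply, not something the proposal spells out. The second half only covers the `+=` fallback branch, where Python's usual reflected-operator rules decide the outcome.

```python
class Words:
    """Hypothetical container that opts in to the proposed protocol."""

    def __init__(self, items=()):
        self.items = list(items)

    def __concat__(self, iterable, *, interleave=None):
        # Assumed signature, read off the quoted call
        # build(iterable, interleave=interleave); returns a new Words.
        out = list(self.items)
        for other in iterable:
            if interleave is not None:
                out.extend(interleave.items)
            out.extend(other.items)
        return Words(out)


class Suffix:
    """Hypothetical object that can follow a str; str knows nothing about it."""

    def __init__(self, text):
        self.text = text

    def __radd__(self, left):
        # Reached for `"string" + Suffix(...)` because str cannot handle Suffix.
        if isinstance(left, str):
            return Suffix(left + self.text)
        return NotImplemented


def fallback_concat(start, iterable):
    # The AttributeError branch of the quoted code, without interleave.
    result = start
    for x in iterable:
        result += x
    return result


joined = Words(["a"]).__concat__([Words(["b"]), Words(["c"])])
print(joined.items)          # ['a', 'b', 'c']

mixed = fallback_concat("string", [Suffix("-x")])
print(type(mixed).__name__)  # Suffix

try:
    fallback_concat("string", [object()])
except TypeError as exc:
    print("TypeError:", exc)
```

So in the fallback branch a mixed list either raises TypeError or silently changes the result type part-way through, depending on what the items define. What a `__concat__` implementation should do in the same situation is exactly the open question.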
> The sum() threads have highlighted one and only one problem which is
> that people are often using (or at least suggesting to use) sum() in
> order to concatenate sequences even though it has quadratic
> performance for this. The stdlib already has a solution for this:
> chain. No one in the sum threads has raised any issue with using chain
> (or chain.from_iterable) except to argue that it is not widely used.
I did. Here's one of the issues.
Imagine a type that somehow modifies the items it stores: removes
duplicates, sorts them, or something else, e.g.:
    class aset(set):
        def __add__(self, other):
            # wrap the union so the result is still an aset
            return aset(self | other)
Now we have this code:
    list_of_sets = [aset(["item1", "item2", "item3"])] * 1000
    [...]
    for i in sum(list_of_sets, aset()):
        deal_with(i)
If you replace `sum` with `chain` you get something like:
    for i in chain.from_iterable(list_of_sets):
        deal_with(i)
Which works (that's the worst part) but produces the WRONG result: the
loop now sees every item of every set, duplicates included, instead of
the three unique items.
This example makes `chain` an error-prone replacement for `sum`. It does
not make `chain` bad; if you understand what you are doing, you are free
to use `chain`. It just makes `chain` a not-so-good general replacement.
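A self-contained version of the comparison, as a sketch: `deal_with` is left out and the two loops are replaced by lists so the difference can be counted. The `aset(...)` wrapper inside `__add__` is there so that the result of each addition is still an `aset`.

```python
from itertools import chain


class aset(set):
    def __add__(self, other):
        # Union instead of concatenation, so duplicates disappear.
        return aset(self | other)


list_of_sets = [aset(["item1", "item2", "item3"])] * 1000

via_sum = list(sum(list_of_sets, aset()))
via_chain = list(chain.from_iterable(list_of_sets))

print(len(via_sum))    # 3
print(len(via_chain))  # 3000
```

Both versions run without any error, which is the point: nothing tells you that the second loop body would run 3000 times instead of 3.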