[Python-ideas] The non-obvious nature of str.join (was Re: sum(...) limitation)

Nathaniel Smith njs at pobox.com
Mon Aug 11 19:53:29 CEST 2014


On Mon, Aug 11, 2014 at 5:22 PM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
>
> On Mon, Aug 11, 2014 at 11:55 AM, Todd <toddrjen at gmail.com> wrote:
>>>
>>> In my experience, it is the asymmetry between x.join(y) and x.split(y)
>>> which causes most of the confusion.  In x.join(y), x is the separator and y
>>> is the data being joined, but in x.split(y), it is the other way around.
>>
>>
>> What would be the solution to this?
>
> Allow sum(list_of_strings, '') and stop mocking people who prefer it to
> ''.join(..).  This will not solve all the issues with join/split, but at
> least a simple task of concatenating a list of strings will have a more or
> less obvious solution.

I don't have any data here, but I bet people who know about str.join
(even for its natural use cases like ", ".join(...)) outnumber the
people who know that sum() takes a second argument by a very large
factor.

Of course this also means that sum()'s special error message is
probably pretty ineffective at reaching the people it's trying to
educate -- to do that we'd need to warn on str += str or something,
which is clearly not happening. So I can see the argument for just
making sum(iterable_of_strings, "") fast.
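
For reference, a minimal sketch of the behaviour under discussion, assuming CPython (the exact error wording is an implementation detail and may differ elsewhere). The names `words`, `message`, `joined` and `looped` are illustrative only.

```python
words = ["spam", "eggs", "ham"]

# sum() accepts an optional start value as its second argument.
assert sum([1, 2, 3], 10) == 16

# With a str start value, sum() refuses to run and raises TypeError;
# in CPython the message points the caller at ''.join instead.
try:
    sum(words, "")
except TypeError as exc:
    message = str(exc)
else:
    message = None  # an implementation that allowed it would land here

# The spelling the error message recommends.
joined = "".join(words)

# The repeated str += str pattern that goes unwarned.
looped = ""
for word in words:
    looped += word
```

Both `joined` and `looped` end up as the same string; only the `sum()` spelling is rejected outright.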

But practically speaking, how would this work? In general str.join and
sum have different semantics. What happens if we descend deep into the
iterable and then discover a non-string (that might nonetheless still
have a + operator)?
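
To make the question concrete, here is a small sketch of how the two could diverge. `Tag` is a hypothetical class invented for this illustration, and the `reduce` call only approximates what an unrestricted sum(items, "") would compute (a left fold with +); it is not a proposal for how sum() would actually be implemented.

```python
from functools import reduce
import operator


class Tag:
    """Hypothetical non-string type that still supports str + Tag."""

    def __init__(self, text):
        self.text = text

    def __radd__(self, other):
        # Called for `some_str + Tag(...)`; returns a plain str.
        return other + "<" + self.text + ">"


items = ["a", "b", Tag("c"), "d"]

# Left fold with +, starting from "": succeeds because of Tag.__radd__.
folded = reduce(operator.add, items, "")

# str.join requires every item to be a str, so it fails on the Tag.
try:
    "".join(items)
except TypeError as exc:
    join_error = str(exc)
else:
    join_error = None
```

Here the fold yields a string while str.join raises TypeError, so a fast path that simply delegated to str.join would change the result for inputs like this one.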

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

