[Python-Dev] join() et al.

M.-A. Lemburg mal@lemburg.com
Wed, 17 May 2000 10:56:19 +0200

Tim Peters wrote:
> [Skip Montanaro]
> > ...
> > It's not a huge deal to me, but I think it mildly violates the
> > principle of least surprise when you try to apply it to sequences
> > of non-strings.
> When sep.join(seq) was first discussed, half the debate was whether str()
> should be magically applied to seq's elements.  I still favor doing that, as
> I have often explained the TypeError in e.g.
>     string.join(some_mixed_list_of_strings_and_numbers)
> to people and agree with their next complaint:  their intent was obvious,
> since string.join *produces* a string.  I've never seen an instance of this
> error that was appreciated (i.e., it never exposed an error in program logic
> or concept, it's just an anal gripe about an arbitrary and unnatural
> restriction).  Not at all like
>     "42" + 42
> where the intent is unknowable.

Uhm, aren't we discussing a generic sequence join API here ?

For strings, I think that " ".join(seq) is just fine... but it
would be nice to have similar functionality for other sequence
items as well, e.g. for sequences of sequences.
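
A minimal sketch of what such a generic join could look like for sequences of sequences. The name join_sequences is hypothetical (it is not an existing mxTools or builtin API), the code uses present-day Python syntax, and it always returns a list, which is just one possible choice:

```python
def join_sequences(sep, seqs):
    """Concatenate the sequences in seqs, placing sep between neighbours.

    Hypothetical helper for illustration only; always returns a list.
    """
    result = []
    for index, item in enumerate(seqs):
        if index:
            # Insert the separator's elements before every item but the first.
            result.extend(sep)
        result.extend(item)
    return result


print(join_sequences([0], [[1, 2], [3], [4, 5]]))   # [1, 2, 0, 3, 0, 4, 5]
```

Whether the result should instead take the type of the separator or of the first item is exactly the kind of semantic question such an API would have to settle.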
> > To extend this into the absurd, what should the following code display?
> >
> >     class Spam: pass
> >
> >     eggs = Spam()
> >     bacon = Spam()
> >     toast = Spam()
> >
> >     print join((eggs,bacon,toast))
> Note that we killed the idea of a new builtin join last time around.  It's
> the kind of muddy & gratuitous hypergeneralization Guido will veto if we
> don't kill it ourselves.

We did ? (I must have been too busy hacking Unicode ;-)

Well, in that case I'd still be interested in hearing about
your thoughts so that I can integrate such a beast in mxTools.
The acceptance level needed for doing that is much lower than
for the core builtins ;-)

>  That said,
>     space.join((eggs, bacon, toast))
> should <wink> produce
>     str(eggs) + space + str(bacon) + space + str(toast)
> although how Unicode should fit into all this was never clear to me.

But that would mask errors and, even worse, "work around" coercion,
which is not a good idea, IMHO. Note that the need to coerce to
Unicode was the reason why the implicit str() in " ".join() was
removed from Barry's original string methods implementation.

space.join(map(str,seq)) is much clearer in this respect: it
forces the user to think about what the join should do with non-
string types.
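
To illustrate the difference, here is a small sketch in present-day Python syntax (the variable names space and mixed are made up for the example). join() applies no implicit str(), so the caller has to state which conversion is wanted:

```python
space = " "
mixed = ["spam", 42, 3.5]

# Joining mixed items directly raises TypeError: no implicit str() is applied.
try:
    space.join(mixed)
except TypeError as exc:
    print("TypeError:", exc)

# The caller picks the conversion explicitly, so the intent is visible.
print(space.join(map(str, mixed)))    # spam 42 3.5
print(space.join(map(repr, mixed)))   # 'spam' 42 3.5
```

The two explicit calls give different results, which is the decision an implicit str() would silently make on the user's behalf.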

> > If a join builtin is supposed to be applicable to all types, we need to
> > decide what the semantics are going to be for all types.
> See above.
> > Maybe all that needs to happen is that you stringify any non-string
> > elements before applying the + operator (just one possibility among
> > many, not necessarily one I recommend).
> In my experience, that it *doesn't* do that today is a common source of
> surprise & mild irritation.  But I insist that "stringify" return a string
> in this context, and that "+" is simply shorthand for "string catenation".
> Generalizing this would be counterproductive.

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/