[Python-Dev] join() et al.

Tim Peters tim_one@email.msn.com
Wed, 17 May 2000 02:45:59 -0400

[Skip Montanaro]
> ...
> It's not a huge deal to me, but I think it mildly violates the
> principle of least surprise when you try to apply it to sequences
> of non-strings.

When sep.join(seq) was first discussed, half the debate was whether str()
should be magically applied to seq's elements.  I still favor doing that, as
I have often explained to people the TypeError raised when string.join is
given a sequence containing non-strings, and I agree with their next
complaint:  their intent was obvious,
since string.join *produces* a string.  I've never seen an instance of this
error that was appreciated (i.e., it never exposed an error in program logic
or concept, it's just an anal gripe about an arbitrary and unnatural
restriction).  Not at all like

    "42" + 42

where the intent is unknowable.
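
As a rough illustration of the two cases contrasted above, here is a minimal
sketch in present-day Python 3 (an assumption; this thread predates it, and
the exact error text differs between versions).  The helpers try_join and
try_add are hypothetical names introduced only for this example.

```python
def try_join(sep, items):
    """Return the joined string, or the name of the exception raised."""
    try:
        return sep.join(items)
    except TypeError as exc:
        return type(exc).__name__


def try_add(left, right):
    """Return left + right, or the name of the exception raised."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__


print(try_join(" ", ["a", "b", "c"]))  # a b c
print(try_join(" ", ["a", 1, 2.5]))    # TypeError
print(try_add("42", 42))               # TypeError
```

Both failures are TypeErrors, but only in the second is it unclear whether a
string or a number was wanted.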

> To extend this into the absurd, what should the following code display?
>     class Spam: pass
>     eggs = Spam()
>     bacon = Spam()
>     toast = Spam()
>     print join((eggs,bacon,toast))

Note that we killed the idea of a new builtin join last time around.  It's
the kind of muddy & gratuitous hypergeneralization Guido will veto if we
don't kill it ourselves.  That said,

    space.join((eggs, bacon, toast))

should <wink> produce

    str(eggs) + space + str(bacon) + space + str(toast)

although how Unicode should fit into all this was never clear to me.
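
The semantics proposed here can be sketched in present-day Python 3 roughly
as follows.  This is not how str.join behaves; join_str is a hypothetical
helper, and the __str__ method on Spam is added (an assumption, not part of
the quoted class) only so the output is predictable.

```python
def join_str(sep, seq):
    """Hypothetical helper: apply str() to every element, then join."""
    return sep.join(map(str, seq))


class Spam:
    # Assumption for the example: give Spam a readable string form.
    def __str__(self):
        return "spam"


space = " "
eggs, bacon, toast = Spam(), Spam(), Spam()

joined = join_str(space, (eggs, bacon, toast))

# Matches the spelled-out catenation of the stringified elements.
assert joined == str(eggs) + space + str(bacon) + space + str(toast)
print(joined)  # spam spam spam
```

Because str() always returns a string, the result is always a string, no
matter what the elements are.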

> If a join builtin is supposed to be applicable to all types, we need to
> decide what the semantics are going to be for all types.

See above.

> Maybe all that needs to happen is that you stringify any non-string
> elements before applying the + operator (just one possibility among
> many, not necessarily one I recommend).

In my experience, that it *doesn't* do that today is a common source of
surprise & mild irritation.  But I insist that "stringify" return a string
in this context, and that "+" is simply shorthand for "string catenation".
Generalizing this would be counterproductive.

> If you want to limit join's inputs to (or only make it semantically
> meaningful for) sequences of strings, then it should probably
> not be a builtin, no matter how visually annoying you find
>     " ".join(["a","b","c"])

This is one of those "doctor, doctor, it hurts when I stick an onion up my
ass!" things <wink>.  space.join(etc) reads beautifully, and anyone who
doesn't spell it that way but hates the above is picking at a scab they
don't *want* to heal <0.3 wink>.
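
For readers unfamiliar with the spelling, space.join(etc) just means binding
the separator to a descriptive name first.  A minimal sketch, assuming
present-day Python 3; the names space, comma and words are illustrative only.

```python
# Bind each separator to a name that says what it is.
space = " "
comma = ", "

words = ["a", "b", "c"]

print(space.join(words))  # a b c
print(comma.join(words))  # a, b, c
```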

having-said-nothing-new-he-signs-off-ly y'rs  - tim