I come to praise .join, not to bury it...

Alex Martelli aleaxit at yahoo.com
Tue Mar 6 10:16:55 CET 2001


"Steve Holden" <sholden at holdenweb.com> wrote in message
news:DNXo6.24266$1D5.975603 at e420r-atl1.usenetserver.com...
    [snip]
> > 2. Can we really join all kinds of lists?
> >
> No. I was somewhat startled to see that a UserList of integers could not be
> given as an argument to join(). Relief of a kind arrived when I realised
> that the same was true of lists themselves.
    [snip]
> It would seem reasonable to expect the join() method to try and coerce
> things to lists, but maybe I'm not a reliable guide to what's reasonable.

It's not an issue of lists vs other stuff -- .join accepts as
its argument any sequence which defines a length, which is
reasonable (although it might be nice to remove that need for
the length being defined -- the sequence-length is only used
to control a for-loop, after all, so the test-for-IndexError
might suffice; but I guess unbounded-sequences might prove to
be a problem here).
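
A minimal sketch of the test-for-IndexError idea, assuming a current
Python 3 interpreter rather than the 2.0 discussed here. There, join
accepts any iterable, and an object that defines only __getitem__ is
iterated by calling it with 0, 1, 2, ... until IndexError is raised, so
no length is needed. The class name Countdown is hypothetical.

```python
class Countdown:
    """Hypothetical sequence: defines __getitem__ but no __len__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # Raising IndexError is what ends the iteration.
        if index >= self.n:
            raise IndexError(index)
        return str(self.n - index)


joined = '-'.join(Countdown(3))
print(joined)  # 3-2-1
```

An unbounded sequence (one that never raises IndexError) would make
this call loop forever, which is the problem alluded to above.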

The key issue, anyway, is with the _items_ in the sequence,
rather than with the sequence itself.  The join method on
a single-byte string wants each item to *BE* a single-byte
string -- peculiar behavior follows if it isn't.  Consider:

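class Sequence:
    # A sequence of N items, every one of which is the same value.
    def __init__(self, N, value):
        self.N = N
        self.value = value
    def __len__(self):
        return self.N
    def __getitem__(self, index):
        # Signal the end of the sequence once index passes the last item.
        if index>=self.N:
            raise IndexError(index)
        return self.value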

>>> sb=Sequence(3,'aa')
>>> 'x'.join(sb)
'aaxaaxaa'
>>> u'y'.join(sb)
u'aayaayaa'
>>> uc=Sequence(3,u'bbb')
>>> u'z'.join(uc)
u'bbbzbbbzbbb'
>>> 't'.join(uc)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: coercing to Unicode: need string or buffer, tuple found

and a Control-Z now crashes/hangs the Python interpreter (2.0, on
Win32).  OK, some kind of ugly bug in 2.0 (gotta get 2.1b1 and
check if the bug is still there...!!!).
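
As for the list of integers in the quoted question, one workaround is
to convert each item to a string before joining. A minimal sketch,
assuming plain integer items and that str() gives the text you want
(the variable names are only illustrative):

```python
numbers = [1, 2, 3]

# join() wants string items, so convert each one explicitly first.
joined = ', '.join(map(str, numbers))
print(joined)  # 1, 2, 3
```

This keeps the conversion explicit at the call site rather than having
join() coerce the items itself.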


Alex





