[Python-Dev] bytes / unicode

Fri Jun 25 00:15:02 CEST 2010

On Fri, Jun 25, 2010 at 1:41 AM, Guido van Rossum <guido at python.org> wrote:
> I don't think we should abuse sum for this. A simple idiom to get the
> *empty* string of a particular type is x[:0] so you could write
> something like this to concatenate a list of strings or bytes:
> xs[:0].join(xs). Note that if xs is empty we wouldn't know what to do
> anyway so this should be disallowed.

That's a good trick, although there's a "[0]" missing from your join
example: it should read xs[0][:0].join(xs). ("type(xs[0])()" is another
way to spell the same idea, but the subscripting version would likely
be faster since it skips the builtin lookup.) Promoting that over explicit use of empty str and
bytes literals is probably step 1 in eliminating gratuitous breakage
of bytes/str polymorphism (this trick also has the benefit of working
with non-builtin character sequence types).
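
As a rough sketch of the idiom with the "[0]" put back in (the helper
name join_like is made up for illustration, and raising ValueError on
empty input is just one way to "disallow" that case):

```python
def join_like(xs):
    """Concatenate a non-empty list of str (or of bytes) without naming the type."""
    if not xs:
        # With no elements there is nothing to infer the type from.
        raise ValueError("cannot infer the string type of an empty sequence")
    empty = xs[0][:0]  # empty string of the same type as the first element
    return empty.join(xs)


print(join_like(["ab", "cd"]))    # abcd
print(join_like([b"ab", b"cd"]))  # b'abcd'
```

Swapping xs[0][:0] for type(xs[0])() should give the same results for
str and bytes; the slice just avoids the call to the type builtin.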

Use of non-empty bytes/str literals is going to be harder to handle -
actually trying to apply a polymorphic philosophy to the Python 3 URL
parsing libraries may be a good way to learn more on that front.

Cheers,
Nick.

P.S. I'm off to Sydney for PyconAU this evening, so I'm not sure how
much time I'll get to follow python-dev until next week.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia