[Python-ideas] str.split() oddness

Nick Coghlan ncoghlan at gmail.com
Mon Mar 7 00:24:23 CET 2011

On Mon, Mar 7, 2011 at 4:32 AM, Mart Sõmermaa <mrts.pydev at gmail.com> wrote:
> However, I myself, the Ruby folks and Nick would rather
> emphasize the "zero-element-nature" [1] of ''.

I did say maybe. As Jesse notes, there's another pattern-based line of
argument that goes:

len(',,'.split(',')) == 3
len(','.split(',')) == 2
len(''.split(',')) == ???

(Well, 1 "obviously", since the pattern suggests that even when there
is no other text in the string, the length of the split result is
always 1 more than the number of separators occurring in the string)
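
The pattern is easy to check in Python. A minimal sketch, using ','
as the separator; the helper name split_len is only for illustration:

```python
def split_len(text, sep=','):
    """Return the number of pieces produced by splitting text on sep."""
    return len(text.split(sep))

# With an explicit separator, the result always has one more element
# than the number of separators in the string, even for ''.
for text in (',,', ',', ''):
    assert split_len(text) == text.count(',') + 1

print([split_len(t) for t in (',,', ',', '')])  # [3, 2, 1]
```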

There are reasonable arguments for "''.split(sep)" as the inverse of
either "sep.join([''])" or "sep.join([])", but once *either* has been
chosen for a given language, none of the arguments are strong enough
to justify switching to the other behaviour.

Note that, independent of which is chosen, the following identity will
hold for an explicit separator:

  sep.join(text.split(sep)) == text

It's only composing them the other way around as
"sep.join(data).split(sep)" that will convert either [] to [''] (as in
Python) or [''] to [] (as in Ruby).
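
A quick sketch of both compositions on the Python side (the sample
strings and the roundtrip_* names are arbitrary, chosen only for
illustration; the Ruby behaviour is not exercised here):

```python
sep = ','

# split-then-join gives back the original text for an explicit separator.
for text in ('', ',', 'a,b', ',a,,b,'):
    assert sep.join(text.split(sep)) == text

# join-then-split is lossy: both [] and [''] join to '', and Python's
# ''.split(sep) returns [''], so [] comes back as [''].
assert sep.join([]) == ''
assert sep.join(['']) == ''
roundtrip_empty = sep.join([]).split(sep)
roundtrip_blank = sep.join(['']).split(sep)
print(roundtrip_empty, roundtrip_blank)  # [''] ['']
```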


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
