First, sorry for such a big delay in replying.

On Mon, Feb 28, 2011 at 2:13 AM, Guido van Rossum <guido@python.org> wrote:
Does Ruby in general leave out empty strings from the result? What does it return when "x,,y" is split on "," ? ["x", "", "y"] or ["x", "y"]?
"x,,y".split(",") => ["x", "", "y"]
But let me remind you that the behaviour of foo.split(x) where foo is not an empty string is not questioned at all; only the behaviour when splitting the empty string is.

         Python          Ruby
join1    [''] => ''      [''] => ''
join2    [ ]  => ''      [ ]  => ''

         Python          Ruby
split    [''] <= ''      [ ]  <= ''

As you can see, join1 and join2 are identical in both languages. Python has chosen to make split the inverse of join1; Ruby, on the other hand, the inverse of join2.
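The Python column of the table can be checked with a short sketch like the one below. The separator "," and the variable names are only chosen for illustration, and the Ruby column is not reproduced here.

```python
# Python side of the table only; "," is an arbitrary separator for illustration.
sep = ","

join1 = sep.join([""])   # a list holding one empty string
join2 = sep.join([])     # the empty list
split_empty = "".split(sep)

print(repr(join1))        # both joins give the empty string
print(repr(join2))
print(repr(split_empty))  # Python returns [''], i.e. the inverse of join1
```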
In Python the generalization is that since "xx".split(",") is ["xx"], and "x".split(",") is ["x"], it naturally follows that "".split(",") is [""].
That is one line of reasoning that emphasizes the "string-nature" of ''. However, I myself, the Ruby folks and Nick would rather emphasize the "zero-element-nature" [1] of ''.

Both approaches are based on solid reasoning; the latter just happens to be more practical. And I would still claim that "Applying the split operator to the zero element of strings should result in the zero element of lists" wins on theoretical grounds as well.

The general problem stems from the fact that my initial expectation, that f_x(a) = x.join(a).split(x), where x is a separator string and a is a list of strings, should be an identity function, cannot be satisfied, as join is non-injective (the example above shows both [''] and [ ] joining to '').

[1] http://en.wikipedia.org/wiki/Zero_element
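To make the round-trip point concrete, here is a minimal sketch in Python. The helper name round_trip is hypothetical and only stands in for the join-then-split composition discussed above.

```python
def round_trip(sep, items):
    """Hypothetical helper: join items with sep, then split the result on sep."""
    return sep.join(items).split(sep)

# For lists whose elements do not contain the separator, the list comes back unchanged.
print(round_trip(",", ["x", "", "y"]))
print(round_trip(",", [""]))

# The empty list does not survive: join maps both [''] and [] to '',
# so split can hand back only one of them.
print(round_trip(",", []))
```

Whichever preimage split picks for '', the other one cannot be recovered, so the choice between [''] and [] is a matter of convention rather than something the identity can decide.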