[Python-ideas] str.split() oddness
Mart Sõmermaa
mrts.pydev at gmail.com
Sun Mar 6 19:32:07 CET 2011
First, sorry for such a big delay in replying.
On Mon, Feb 28, 2011 at 2:13 AM, Guido van Rossum <guido at python.org> wrote:
> Does Ruby in general leave out empty strings from the result? What
> does it return when "x,,y" is split on "," ? ["x", "", "y"] or ["x", "y"]?
In Ruby the empty string is kept:

    >> "x,,y".split(",")
    => ["x", "", "y"]
But let me remind you that the behaviour of foo.split(x) where
foo is not an empty string is not in question at all; only
the behaviour when splitting the empty string is.

         Python         Ruby
join1    [''] => ''     [''] => ''
join2    [ ]  => ''     [ ]  => ''

         Python         Ruby
split    [''] <= ''     [ ]  <= ''

As you can see, join1 and join2 are identical in both
languages. Python has chosen to make split the inverse of
join1; Ruby, on the other hand, the inverse of join2.
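
For reference, the Python column of the tables above can be checked
directly in an interpreter. This is only a small sketch; "," is an
arbitrarily chosen separator and the variable names are mine.

```python
sep = ","  # separator chosen only for illustration

# join1 and join2: both lists collapse to the empty string
join1 = sep.join([""])
join2 = sep.join([])

# split: Python maps the empty string back to [''], i.e. the inverse of join1
split_empty = "".split(sep)

print(repr(join1), repr(join2), split_empty)
```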
> In Python the generalization is that since "xx".split(",") is ["xx"],
> and "x".split(",") is ["x"], it naturally follows that "".split(",")
> is [""].
That is one line of reasoning that emphasizes the
"string-nature" of ''.
However, I myself, the Ruby folks and Nick would rather
emphasize the "zero-element-nature" [1] of ''.
Both approaches are based on solid reasoning, the latter
just happens to be more practical. And I would still claim
that
"Applying the split operator to the zero element of
strings should result in the zero element of lists"
wins on theoretical grounds as well.
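
To illustrate the practical side, here is a rough sketch of the
zero-element behaviour as a wrapper. Note that split_zero is a
hypothetical name made up for this example, not an existing or
proposed API; it only special-cases the empty string.

```python
def split_zero(s, sep):
    """Like s.split(sep), but map the empty string to the empty list.

    Hypothetical helper, for illustration only.
    """
    return s.split(sep) if s else []


# Non-empty strings behave exactly as with str.split
print(split_zero("x,,y", ","))

# Iterating over the fields of an empty value does nothing,
# so no separate "is it empty?" check is needed
for field in split_zero("", ","):
    print(field)
```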
The general problem stems from the fact that my initial
expectation that

    f_x(a) = x.join(a).split(x), where x is a separator string
    and a is a list of strings,

should be an identity function cannot be satisfied, as join
is not injective: as the join1 and join2 rows above show,
both [''] and [ ] are joined to ''.
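
The round trip can be sketched in Python as follows. The function
name roundtrip and the separator "," are my own choices for
illustration; the point is only that two different lists join to
the same string, so no split can undo both.

```python
def roundtrip(sep, items):
    """Join items with sep, then split the result on sep again."""
    return sep.join(items).split(sep)


sep = ","

# Two different lists join to the same string, so join is not injective
assert sep.join([""]) == sep.join([]) == ""

# The round trip is the identity for [''] but not for []
print(roundtrip(sep, [""]))            # ['']
print(roundtrip(sep, []))              # [''], not []
print(roundtrip(sep, ["x", "", "y"]))  # ['x', '', 'y']
```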
[1] http://en.wikipedia.org/wiki/Zero_element