[Python-ideas] str.split() oddness

Guido van Rossum guido at python.org
Mon Mar 7 04:27:31 CET 2011


Well, I'm sorry, but this is not going to change, so I don't see much
point in continuing to discuss it. We can explain the reasoning that
leads to the current behavior (as you note, it's solid), we can
discuss an alternative that could be considered just as solid, but it
can't prevail in this universe. The cost of change is just too high,
so we'll just have to live with the current behavior (and we might as
well accept that it's solid instead of trying to fight it).

--Guido

On Sun, Mar 6, 2011 at 10:32 AM, Mart Sõmermaa <mrts.pydev at gmail.com> wrote:
> First, sorry for such a big delay in replying.
>
> On Mon, Feb 28, 2011 at 2:13 AM, Guido van Rossum <guido at python.org> wrote:
>> Does Ruby in general leave out empty strings from the result? What
>> does it return when "x,,y" is split on "," ? ["x", "", "y"] or ["x", "y"]?
>
> >> "x,,y".split(",")
> => ["x", "", "y"]
>
> But let me remind you that the behaviour of foo.split(x) where
> foo is not an empty string is not questioned at all, only
> behaviour when splitting the empty string is.
>
>              Python           Ruby
> join1     [''] => ''        [''] => ''
> join2     [  ] => ''        [  ] => ''
>
>              Python           Ruby
> split      [''] <= ''        [  ] <= ''
>
> As you can see, join1 and join2 are identical in both
> languages. Python has chosen to make split the inverse of
> join1; Ruby, on the other hand, the inverse of join2.
>
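For reference, the Python column of the tables above can be checked directly. This is only a minimal sketch; the variable names are illustrative and not from the thread.

```python
# Joining: a list holding one empty string and an empty list
# both produce the empty string (join1 and join2 in the table).
join1 = ",".join([""])
join2 = ",".join([])

# Splitting the empty string on an explicit separator gives a
# one-element list, i.e. Python treats split as the inverse of join1.
split_empty = "".split(",")

print(repr(join1), repr(join2), split_empty)
```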
>> In Python the generalization is that since "xx".split(",") is ["xx"],
>> and "x".split(",") is ["x"], it naturally follows that "".split(",")
>> is [""].
>
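One way to read the generalization quoted above is as a counting rule: with an explicit, non-empty separator, the number of pieces is the number of separator occurrences plus one. That framing is an interpretation rather than a quote from the thread, and the names `samples` and `results` are illustrative.

```python
sep = ","
samples = ["x,,y", "xx", "x", ""]

# Split every sample on the explicit separator.
results = {s: s.split(sep) for s in samples}

# The piece count always equals the separator count plus one,
# so a string with zero separators (including '') yields one piece.
for s, pieces in results.items():
    assert len(pieces) == s.count(sep) + 1

print(results)
```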
> That is one line of reasoning that emphasizes the
> "string-nature" of ''.
>
> However, I myself, the Ruby folks and Nick would rather
> emphasize the "zero-element-nature" [1] of ''.
>
> Both approaches are based on solid reasoning; the latter
> just happens to be more practical. And I would still claim
> that
>
> "Applying the split operator to the zero element of
> strings should result in the zero element of lists"
>
> wins on theoretical grounds as well.
>
> The general problem stems from the fact that my initial
> expectation that
>
>  f_x(a) = x.join(a).split(x), where x in strings, a in lists
>
> should be an identity function cannot be satisfied, as join
> is non-injective (both [''] and [] join to '', as shown above).
>
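The round trip described above can be sketched as follows. The helper name `round_trip` is hypothetical, and the sketch assumes list elements that do not themselves contain the separator.

```python
def round_trip(parts, sep=","):
    """Join a list of strings on sep, then split the result on sep."""
    return sep.join(parts).split(sep)

# Non-empty lists survive the round trip unchanged.
typical = round_trip(["x", "", "y"])

# [''] and [] both join to '', so split cannot recover both;
# current Python behavior returns [''] in either case.
from_one_empty = round_trip([""])
from_no_items = round_trip([])

print(typical, from_one_empty, from_no_items)
```

Because two distinct inputs collapse to the same joined string, no choice of split result for '' can make the round trip an identity for both.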
> [1] http://en.wikipedia.org/wiki/Zero_element



-- 
--Guido van Rossum (python.org/~guido)


