On 3/6/2011 1:32 PM, Mart Sõmermaa wrote:
On Mon, Feb 28, 2011 at 2:13 AM, Guido van Rossum<guido-+ZN9ApsXKcEdnm+yROfE0A@public.gmane.org> wrote:
Two minutes before that, I posted a more extensive reply and refutation that you have not replied to.
But let me remind you that the behaviour of foo.split(x) where foo is not an empty string is not questioned at all; only the behaviour when splitting the empty string is.
        Python           Ruby
join1   [''] => ''       [''] => ''
join2   [ ]  => ''       [ ]  => ''

        Python           Ruby
split   [''] <= ''       [ ]  <= ''
As you can see, join1 and join2 are identical in both languages. Python has chosen to make split the inverse of join1; Ruby, on the other hand, the inverse of join2.
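The Python column of the table can be checked in an interpreter. This is only a small sketch: the separator ',' is an arbitrary choice, and the names join1 and join2 are just the row labels from the table.

```python
sep = ","  # arbitrary separator, chosen only for illustration

join1 = sep.join([""])  # join a list holding one empty string
join2 = sep.join([])    # join an empty list

# Both joins collapse to the same empty string ...
print(repr(join1), repr(join2))

# ... so split can invert only one of them; Python inverts join1.
print("".split(sep))
```

Because both lists map to the same string, no definition of split can recover both of them from ''.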
In Python the generalization is that since "xx".split(",") is ["xx"], and "x".split(",") is ["x"], it naturally follows that "".split(",") is [""].
Which I wrote as:

    (n*c).split(c) == (n+1)*['']

The generalization:

    len(s.split(c)) == s.count(c)+1

You want to change these into

    (n*c).split(c) == (n+1)*[''] if n else []
    len(s.split(c)) == s.count(c)+1 if s else 0

which is to say, you want to add an easily forgotten conditional and alternative to the definition of split.
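Both unconditional identities can be spot-checked against the current str.split. The following is a sketch only: the helper name check_split_invariant and the sample strings are made up for illustration, and a single-character separator is assumed.

```python
def check_split_invariant(s: str, c: str) -> bool:
    """Return True if s.split(c) has exactly s.count(c) + 1 pieces."""
    return len(s.split(c)) == s.count(c) + 1


c = ","  # assumed single-character separator
samples = ["", "x", "xx", "a,b", ",", ",,", "a,,b,"]  # illustrative inputs

# The count-based identity holds for every sample, including ''.
assert all(check_split_invariant(s, c) for s in samples)

# n copies of the separator split into n + 1 empty strings, with no
# special case needed for n == 0.
for n in range(5):
    assert (n * c).split(c) == (n + 1) * [""]
```

With the proposed change, both assertions would need the extra `if ... else` branch to keep passing for the empty string.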
That is one line of reasoning that emphasizes the "string-nature" of ''.
I do not see that particularly. I emphasize the algorithmic nature of functions and prefer simpler definitions/algorithms to more complicated ones with unnecessary special cases.
However, I myself, the Ruby folks and Nick would rather emphasize the "zero-element-nature" [1] of ''.
Which says nothing in itself. Saying that one member of the domain of a function is the identity element under some particular operation (concatenation, in this case) says nothing about what that member should be mapped to by any particular function. You seem to emphasize the mapping (set of ordered pairs) nature of functions and are hence willing to change one of the mappings (ordered pairs) without regard to its relation to all the other pairs. This is a consequence of the set view, which by itself denies any relation between its members (the mapping pairs).
"Applying the split operator to the zero element of strings should result in the zero element of lists"
To repeat, 'should' has no justification; it is just hand waving. Would you really say that every function should map identities to identities (and what if domain and range have more than one)? I hope not. Would you even say that every *string* function should map '' to the identity element of the range set? Or more specifically, should every string->list function map '' to []? Nonsense. It depends on the function. To also repeat, if split produced an iterable, then there would be no 'zero element of lists' to talk about. Anyway, it is a moot point, as a change would break code.
The general problem stems from the fact that my initial expectation that
f_a(x) = x.join(a).split(x), where x in strings, a in lists
should be an identity function cannot be satisfied, as join is non-injective (because of the surjective example above).
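The failed round trip is easy to reproduce. In this sketch x is taken to be the separator string and a the list, which is how x.join(a).split(x) reads as Python. The helper name round_trip is hypothetical, and the last case (an element containing the separator) is an extra illustration not taken from the thread.

```python
def round_trip(a, x=","):
    """Join the list a with separator x, then split the result on x."""
    return x.join(a).split(x)


# Lists whose elements are free of the separator usually survive.
assert round_trip(["a", "b"]) == ["a", "b"]
assert round_trip([""]) == [""]

# The empty list does not: [] and [''] both join to '', and split
# returns [''] for that string.
assert round_trip([]) == [""]

# Extra illustration: an element that contains the separator also
# fails to round-trip, for the same non-injectivity reason.
assert round_trip(["a,b"]) == ["a", "b"]
```

Whichever list split returns for '', one of [] and [''] cannot come back unchanged.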
Since I was the first to point this out, I am glad you now agree.

--
Terry Jan Reedy