On 2/26/2011 9:03 AM, Mart Sõmermaa wrote:
> IMHO, x.join(a).split(x) should be "idempotent" in regard to a.
Given that x.join is *not* 1 to 1,

>>> 'a'.join([])
''
>>> 'a'.join([''])
''

it cannot have an inverse for all outputs. In particular, ''.split('a') cannot be both [] and ['']. This could only be fixed by changing the definition of join to not allow joining on [], but that would not be convenient.

I believe joining is otherwise 1 to 1 and invertible for non-empty lists (at least when no piece itself contains the separator). Of course, join input a can be any iterable of strings, whereas split produces a list, so your equality test can only work for list inputs unless generalized to

    c.join(a).split(c) == list(a)

''.split('a') == [''], not [], by the definition of s.split(c): a list of pieces of s that were previously joined by c. In particular,

    string_not_containing_sep.split(sep) == [string_not_containing_sep]

Note that empty pieces are inserted for repeated seps so that splitting on seps (unlike splitting on 'whitespace') *is* 1 to 1.

    'abc'.split('b') == ['a','c']
    'abbc'.split('b') == ['a','','c']

(whereas 'a c'.split() and 'a  c'.split() are both ['a','c'])

Therefore, sep splitting does have an inverse:

    c.join(s.split(c)) == s

The doc for str.split specifies the above and makes clear that splitting with and without a separator are slightly different functions.
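
The points above can be checked directly. What follows is only a small sketch: the helper name roundtrips() is made up for illustration and is not part of str or of anything proposed in this thread. It assumes plain str inputs and a non-empty separator.

```python
# Sketch only: roundtrips() is a hypothetical helper, not an existing API.

def roundtrips(sep, pieces):
    """Return True if sep.join(pieces).split(sep) gives back list(pieces)."""
    pieces = list(pieces)  # join accepts any iterable; split always returns a list
    return sep.join(pieces).split(sep) == pieces

# join is not 1 to 1: two different inputs collapse to the same string.
assert 'a'.join([]) == 'a'.join(['']) == ''

# split can therefore recover only one of them, and it returns [''].
assert ''.split('a') == ['']
assert roundtrips('a', [''])
assert not roundtrips('a', [])

# A piece that itself contains the separator also breaks the round trip.
assert not roundtrips('a', ['a', ''])

# Separator splitting keeps empty pieces, so join undoes it for these strings.
for s in ['', 'abc', 'abbc', 'b', 'bb', 'xyz']:
    assert 'b'.join(s.split('b')) == s

# Whitespace splitting collapses runs, so distinct strings give the same list.
assert 'a c'.split() == 'a  c'.split() == ['a', 'c']
assert ''.split() == []
```

The asserts all pass silently; the only direction that holds without exception is join-after-split with an explicit separator.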
> assert ' '.join(foo).split() == foo
You have pulled a fast one here. ' ' does not equal 'whitespace' ;-) If x in your original expression is nothing (to indicate 'whitespace'), then your desired equality becomes

    .join(a).split() == a

which is not legal ;-).

Some of the above is a rewording and expansion upon what Joao already said, which was all correct.

-- 
Terry Jan Reedy