join and split with empty delimiter
Danilo Coccia
daniloco at acm.org
Thu Jul 18 15:44:27 EDT 2019
On 18/07/2019 12:27, Ben Bacarisse wrote:
> Irv Kalb <Irv at furrypants.com> writes:
>
>> I have always thought that split and join are opposite functions. For
>> example, you can use a comma as a delimiter:
>>
>>>>> myList = ['a', 'b', 'c', 'd', 'e']
>>>>> myString = ','.join(myList)
>>>>> print(myString)
>> a,b,c,d,e
>>
>>>>> myList = myString.split(',')
>>>>> print(myList)
>> ['a', 'b', 'c', 'd', 'e']
>>
>> Works great.
>
> Note that join and split do not always recover the same list:
>
>>>> ','.join(['a', 'b,c', 'd']).split(',')
> ['a', 'b', 'c', 'd']
>
> You don't even have to have the delimiter in one of the strings:
>
>>>> '//'.join(['a', 'b/', 'c']).split('//')
> ['a', 'b', '/c']
>
>> But I've found a case where they don't work that way. If
>> I join the list with the empty string as the delimiter:
>>
>>>>> myList = ['a', 'b', 'c', 'd']
>>>>> myString = ''.join(myList)
>>>>> print(myString)
>> abcd
>>
>> That works great. But attempting to split using the empty string
>> generates an error:
>>
>>>>> myString.split('')
>> Traceback (most recent call last):
>>   File "<pyshell#9>", line 1, in <module>
>>     myString.split('')
>> ValueError: empty separator
>>
>> I know that this can be accomplished using the list function:
>>
>>>>> myString = list(myString)
>>>>> print(myString)
>> ['a', 'b', 'c', 'd']
>>
>> But my question is: Is there any good reason why the split function
>> should give an "empty separator" error? I think the meaning of trying
>> to split a string into a list using the empty string as a delimiter is
>> unambiguous - it should just create a list of single-character
>> strings like the list function does here.
>
> One reason might be that str.split('') is not unambiguous. For example,
> there's a case to be made that there is a '' delimiter at the start and
> the end of the string as well as between letters. '' is a very special
> delimiter because every string that gets joined using it includes it!
> It's a wild version of ','.join(['a', 'b,c', 'd']).split(',').
>
> Of course str.split('') could be defined to work the way you expect, but
> it's possible that the error is there to prompt the programmer to be
> more explicit.
>
It is even more ambiguous if you consider that any string starts with an
infinite number of empty strings, followed by a character, followed by
an infinite number of empty strings, followed by ...
The result wouldn't fit on screen, or in memory for that matter!
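
Joking aside, the two finite readings Ben describes can be seen side by
side. What follows is only a small sketch, not a statement of how
str.split ought to behave: the variable names are made up for
illustration, and the re.split line assumes Python 3.7 or later, where
splitting on an empty match is supported.

```python
import re

s = 'abcd'

# The empty string "occurs" at every position, including both ends,
# so there are len(s) + 1 places where an empty delimiter could sit.
empty_positions = s.count('')

# Reading 1: delimiters only between characters (what list() gives).
chars = list(s)

# Reading 2: delimiters at the start and end as well. re.split with an
# empty pattern takes this reading (assumes Python 3.7 or later).
with_ends = re.split('', s)

print(empty_positions)  # 5
print(chars)            # ['a', 'b', 'c', 'd']
print(with_ends)        # ['', 'a', 'b', 'c', 'd', '']
```

Both results join back to 'abcd' with ''.join(...), so neither reading
is obviously the "right" inverse, which may be part of why str.split
refuses to guess.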