[Python-Dev] string.split

Sjoerd Mullender sjoerd at acm.org
Fri Feb 20 08:37:31 EST 2004


Andreas Beyer wrote:
> The documentation of string.split() says:
> "... The returned list will then have one more item than the number of 
> non-overlapping occurrences of the separator in the string. ..."
> 
> The behaviour of split with Python 2.3.3 is:
>  >>> '\tb'.split()
> ['b']               # Bug?
>  >>> '\tb'.split('\t')
> ['', 'b']
>  >>> 'a\t\tb'.split()
> ['a', 'b']
>  >>> 'a\t\tb'.split('\t')
> ['a', '', 'b']
>  >>>
> 
> I think there are different interpretations of what a separator is. That 
> is not necessarily a bug, because without stripping, a newline at the 
> end of the string would yield a non-intuitive result list. However, the 
> difference between split with and without the 'sep' argument should be 
> documented.

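This is intended behavior and is not going to be changed.  The "one more
item" rule you quote applies only when sep is given.  Without sep, runs
of whitespace count as a single separator and leading and trailing
whitespace is ignored.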

However, I must agree that the documentation for the string method is 
somewhat lacking.  The documentation of the split function in the string 
module is much clearer.

Method:
split([sep [,maxsplit]])
     Return a list of the words in the string, using sep as the
     delimiter string. If maxsplit is given, at most maxsplit splits
     are done. If sep is not specified or None, any whitespace string
     is a separator.
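
A minimal sketch of the difference, using the strings from the original
report.  The helper name compare_split is made up for illustration and
is not part of Python:

```python
# Sketch only: compare_split is a hypothetical helper, not a library function.

def compare_split(s, sep='\t'):
    """Return (split on whitespace, split on explicit sep) for the string s."""
    # No sep: runs of whitespace are one separator, and leading or
    # trailing whitespace produces no empty strings.
    by_whitespace = s.split()
    # Explicit sep: every occurrence delimits a field, so empty strings
    # appear for leading, trailing, or adjacent separators.
    by_sep = s.split(sep)
    return by_whitespace, by_sep


for sample in ['\tb', 'a\t\tb']:
    print(repr(sample), compare_split(sample))
```

For '\tb' this gives ['b'] without sep and ['', 'b'] with sep='\t',
matching the session quoted above.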

string.split function:
split(s[, sep[, maxsplit]])
     Return a list of the words of the string s. If the optional
     second argument sep is absent or None, the words are separated by
     arbitrary strings of whitespace characters (space, tab, newline,
     return, formfeed). If the second argument sep is present and not
     None, it specifies a string to be used as the word separator. The
     returned list will then have one more item than the number of
     non-overlapping occurrences of the separator in the string. The
     optional third argument maxsplit defaults to 0. If it is nonzero,
     at most maxsplit number of splits occur, and the remainder of the
     string is returned as the final element of the list (thus, the
     list will have at most maxsplit+1 elements).
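
The two counting rules in that description can be checked with a short
sketch.  It uses the string method rather than the string module
function, and the helper names has_one_more_item and split_limited are
hypothetical, chosen only for this example:

```python
# Sketch only: both helpers are hypothetical names used for illustration.

def has_one_more_item(s, sep):
    """True if s.split(sep) has one more item than s.count(sep).

    str.count() counts non-overlapping occurrences, which is the same
    notion the documentation text uses.
    """
    return len(s.split(sep)) == s.count(sep) + 1


def split_limited(s, sep, maxsplit):
    """Split at most maxsplit times; the remainder stays in the last item."""
    return s.split(sep, maxsplit)


print(has_one_more_item('a\t\tb', '\t'))   # two tabs, three items
print(split_limited('a,b,c,d', ',', 2))    # at most maxsplit+1 == 3 items
```

The rule holds only with an explicit sep: '\tb'.split() returns one
item although the string contains one tab.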

-- 
Sjoerd Mullender <sjoerd at acm.org>


