[docs] [issue37620] str.split(sep=None, maxsplit=-1,any=False)
Harry Coin
report at bugs.python.org
Fri Jul 19 14:59:56 EDT 2019
Harry Coin <hgcoin at gmail.com> added the comment:
I suspect the number of times the str.split builtin was examined for use
and rejected in favor of the much more complex and 'heavy' re module
far, far exceeds the number of times it found use with more than one
character in the split string.
The str.split documentation 'feels like' the Python equivalent of the
Linux 'tr' utility, which treats the separator characters as a set instead
of a sequence. The default, and the help(str.split) text, tend to
encourage that intuition, because omitting sep= gives very different
behavior: no argument 'removes any whitespace and discards empty strings
from the result'. That leads one to suspect that each character in a
sep string would be treated the same way.
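For reference, a short sketch of how str.split behaves today in the three cases at issue: no separator, a single-character separator, and a multi-character separator. The sample strings and variable names are made up for illustration.

```python
# Sample strings are illustrative only; they do not come from the issue.
text = "ab \t cd  ef"

# No sep: any run of whitespace counts as one separator, and empty
# strings are dropped from the result.
no_sep = text.split()

# Explicit single-character sep: every occurrence splits, so adjacent
# separators produce empty strings.
single_sep = "ab  cd".split(" ")

# Multi-character sep: matched as one sequence, not as a set of characters.
multi_sep = "ab, cd,ef".split(", ")

print(no_sep)      # ['ab', 'cd', 'ef']
print(single_sep)  # ['ab', '', 'cd']
print(multi_sep)   # ['ab', 'cd,ef']
```

Only the no-argument form collapses runs and discards empties, which is the asymmetry that makes the set-of-characters reading tempting.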
Mostly this is obviousness driven by use cases: you'd think Python would
naturally do that in str.split. Very many cases seek to resolve a
string into a list of the interesting bits without regard to whatever mix
of separators (tabs, spaces, etc.) was used to make the file more readable.
I think it would be a heavily used enhancement to add the 'any=True'
parameter.
Or, in the alternative, allow the argument to sep to be an iterable so
that:
'ab, cd'.split(sep=' ,') --> ['ab, cd']
but
'ab, cd'.split(sep=[' ', ',']) --> ['ab', 'cd']
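Until something like that exists, the requested behavior can be approximated in a few lines. The following is only a sketch: split_any is a hypothetical helper name, not part of str or the standard library, and it assumes that empty strings should be discarded the way the no-argument form of str.split does.

```python
import re

def split_any(text, seps):
    """Hypothetical helper: split text on runs of any separator in seps.

    seps may be a string (each character is a separator) or an iterable
    of separator strings. Empty results are discarded, which is an
    assumption borrowed from the no-argument form of str.split.
    """
    seps = list(seps)
    if not seps or "" in seps:
        raise ValueError("separators must be non-empty")
    # Try longer separators first so a short one cannot shadow a long one.
    seps.sort(key=len, reverse=True)
    pattern = "(?:" + "|".join(re.escape(s) for s in seps) + ")+"
    return [part for part in re.split(pattern, text) if part]

print('ab, cd'.split(' ,'))            # current str.split: ['ab, cd']
print(split_any('ab, cd', [' ', ','])) # ['ab', 'cd']
print(split_any('ab\t cd ef', '\t '))  # ['ab', 'cd', 'ef']
```

It still goes through the re module internally, so it hides the regular expression behind a function rather than avoiding it.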
On 7/19/19 1:34 PM, Serhiy Storchaka wrote:
> Serhiy Storchaka <storchaka+cpython at gmail.com> added the comment:
>
> An alternative is to use regular expressions.
>
> >>> re.split('[\t ]+', 'ab\t cd ef')
> ['ab', 'cd', 'ef']
> .
>
> ----------
> nosy: +serhiy.storchaka
>
> _______________________________________
> Python tracker <report at bugs.python.org>
> <https://bugs.python.org/issue37620>
> _______________________________________