Behavior of re.split on empty strings is unexpected
Thomas Jollans
thomas at jollans.com
Mon Aug 2 18:07:58 EDT 2010
On 08/02/2010 11:22 PM, John Nagle wrote:
>> [ s in rexp.split(long_s) if s ]
>
> Of course I can discard the blank strings afterward, but
> is there some way to do it in the "split" operation? If
> not, then the default case for "split()" is too non-standard.
>
> (Also, "if s" won't work; if s != '' might)
Of course it will work. Empty sequences, including the empty string, are considered false in Python.
Python 3.1.2 (release31-maint, Jul 8 2010, 09:18:08)
[GCC 4.4.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> sprexp = re.compile(r'\s+')
>>> [s for s in sprexp.split(' spaces every where ! ') if s]
['spaces', 'every', 'where', '!']
>>> list(filter(bool, sprexp.split(' more spaces \r\n\t\t ')))
['more', 'spaces']
>>>
(Of course, the list comprehension I posted earlier was missing the
"for s" part, which was very careless of me.)
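As for doing it in the "split" operation itself: as far as I know, re.split has no option to suppress empty strings, but a rough sketch of two ways around it follows. The variable names (text, raw, filtered, tokens, words) are just mine for illustration. The idea is to match the tokens with re.findall rather than split on the gaps, or, when the separator is ordinary whitespace, to use str.split() with no arguments.

```python
import re

sprexp = re.compile(r'\s+')
text = ' spaces every where ! '

# Separators at the very start and end of the string leave an
# empty string on each side of the result.
raw = sprexp.split(text)

# '' is false, so a truthiness filter drops those empty strings.
filtered = [s for s in raw if s]

# Matching runs of non-whitespace instead of splitting on the gaps
# never produces empty strings in the first place.
tokens = re.findall(r'\S+', text)

# For plain whitespace, str.split() with no arguments also discards
# leading, trailing, and repeated separators.
words = text.split()

print(raw)
print(filtered)
print(tokens)
print(words)
```

For this input the last three all give the same list; only the raw re.split result carries the empty strings at the ends.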