Behavior of re.split on empty strings is unexpected

Thomas Jollans thomas at
Tue Aug 3 00:07:58 CEST 2010

On 08/02/2010 11:22 PM, John Nagle wrote:
>> [ s in rexp.split(long_s) if s ]
>    Of course I can discard the blank strings afterward, but
> is there some way to do it in the "split" operation?  If
> not, then the default case for "split()" is too non-standard.
>    (Also, "if s" won't work;   if s != ''   might)

Of course it will work. Empty sequences are considered false in Python.

Python 3.1.2 (release31-maint, Jul  8 2010, 09:18:08)
[GCC 4.4.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> sprexp = re.compile(r'\s+')
>>> [s for s in sprexp.split('   spaces   every where !  ') if s]
['spaces', 'every', 'where', '!']
>>> list(filter(bool, sprexp.split('   more  spaces \r\n\t\t  ')))
['more', 'spaces']

(Of course, the list comprehension I posted earlier was missing a couple
of words, namely the leading "s for", which was very careless of me.)
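
As for doing it in the "split" operation itself: as far as I know,
re.split has no option to drop empty strings. A rough sketch of two
alternatives, assuming the separator is plain whitespace as in the
session above (the variable names are only for illustration):

```python
import re

text = '   spaces   every where !  '

# re.split keeps the empty strings produced by leading and trailing separators.
with_empties = re.split(r'\s+', text)

# Alternative 1: str.split() with no arguments splits on runs of
# whitespace and never returns empty strings.
plain = text.split()

# Alternative 2: match the tokens instead of the separators.
tokens = re.findall(r'\S+', text)

print(with_empties)
print(plain)
print(tokens)
```

Neither alternative generalises to every pattern, so filtering the
result of split() is still the general-purpose answer.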
