a splitting headache
jjposner at optimum.net
Fri Oct 16 17:30:10 CEST 2009
>>>> c = '0010000110'
>>>> c.split('0')
> ['', '', '1', '', '', '', '11', '']
> Ok, the consecutive delimiters appear as empty strings for
> reasons unknown (except for the first one). Except when they
> start or end the string in which case the first one is included.
> Maybe there's a reason for this inconsistent behaviour but you
> won't find it in the documentation.
The "reason unknown" is that split() is designed to handle *substrings
separated by delimiters*, not *consecutive character runs*. For
example, TAB-separated (or if you prefer, COMMA-separated) strings.
If you split such a string on the <TAB> character, you really do want
to get an empty string among the result substrings, indicating that
"column 3" is empty.
>>> line = "one\ttwo\t\tfour"
>>> line.split("\t")
['one', 'two', '', 'four']
A result of ['one', 'two', 'four'] would be misleading, no?
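If what you actually want is the runs of non-delimiter characters rather than delimited fields, here is a minimal sketch of the difference. The variable names are mine, purely for illustration, and filtering or re.findall() are just two common ways to get runs, not the only ones:

```python
import re

c = '0010000110'

# Delimiter semantics: every '0' separates two fields, so adjacent
# delimiters (and a leading or trailing one) produce empty fields.
fields = c.split('0')

# Run semantics: either drop the empty fields, or match the runs directly.
runs_filtered = [s for s in c.split('0') if s]
runs_regex = re.findall(r'1+', c)

line = "one\ttwo\t\tfour"
columns = line.split('\t')  # explicit delimiter: the empty column is kept
words = line.split()        # no argument: runs of whitespace are collapsed

print(fields)
print(runs_filtered, runs_regex)
print(columns, words)
```

Note that split() with no argument is the one case where consecutive separators are treated as a single separator, which is handy for whitespace-separated words but would silently lose the empty "column 3" in tabular data.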