Splitting on '^' ?
MRAB
python at mrabarnett.plus.com
Fri Aug 14 17:22:05 EDT 2009
Gary Herron wrote:
> kj wrote:
>> Sometimes I want to split a string into lines, preserving the
>> end-of-line markers. In Perl this is really easy to do, by splitting
>> on the beginning-of-line anchor:
>>
>> @lines = split /^/, $string;
>>
>> But I can't figure out how to do the same thing with Python. E.g.:
>>
>>
>> >>> import re
>> >>> re.split('^', 'spam\nham\neggs\n')
>> ['spam\nham\neggs\n']
>>
>> >>> re.split('(?m)^', 'spam\nham\neggs\n')
>> ['spam\nham\neggs\n']
>>
>> >>> bol_re = re.compile('^', re.M)
>> >>> bol_re.split('spam\nham\neggs\n')
>> ['spam\nham\neggs\n']
>>
>> Am I doing something wrong?
>>
> Just split on the EOL character: the "\n":
> re.split('\n', 'spam\nham\neggs\n')
> ['spam', 'ham', 'eggs', '']
>
> The "^" and "$" characters do not match END-OF-LINE, but rather the
> END-OF-STRING, which was doing you no good.
>
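With the MULTILINE flag, "^" matches START-OF-LINE and "$" matches
END-OF-LINE or END-OF-STRING. Without it, "^" matches only at
START-OF-STRING.
The current re module won't split on a zero-width match, and "^" is
always zero-width, which is why all three attempts above return the
string unsplit.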
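
If the goal is just to keep the end-of-line markers, you don't need
split() at all. Here is a minimal sketch of two alternatives; the names
text, lines and lines_re are only for illustration. One caveat:
str.splitlines() recognises more line boundaries than "\n" alone (for
example "\r" and "\r\n"), so the two approaches can differ on other
input. Also, Python 3.7 and later do split on zero-width matches, so the
limitation described above applies to older versions.

```python
import re

text = 'spam\nham\neggs\n'

# Passing True (keepends) makes splitlines() leave the line ending
# attached to each piece.
lines = text.splitlines(True)
print(lines)

# Regex alternative: instead of splitting on a zero-width "^", match
# each line together with its trailing "\n". The second branch catches
# a final line that has no trailing newline.
lines_re = re.findall(r'[^\n]*\n|[^\n]+', text)
print(lines_re)
```

For this input both give ['spam\n', 'ham\n', 'eggs\n'].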