Splitting on '^' ?

MRAB python at mrabarnett.plus.com
Fri Aug 14 17:22:05 EDT 2009


Gary Herron wrote:
> kj wrote:
>> Sometimes I want to split a string into lines, preserving the
>> end-of-line markers.  In Perl this is really easy to do, by splitting
>> on the beginning-of-line anchor:
>>
>>   @lines = split /^/, $string;
>>
>> But I can't figure out how to do the same thing with Python.  E.g.:
>>
>>   >>> import re
>>   >>> re.split('^', 'spam\nham\neggs\n')
>>   ['spam\nham\neggs\n']
>>   >>> re.split('(?m)^', 'spam\nham\neggs\n')
>>   ['spam\nham\neggs\n']
>>   >>> bol_re = re.compile('^', re.M)
>>   >>> bol_re.split('spam\nham\neggs\n')
>>   ['spam\nham\neggs\n']
>>
>> Am I doing something wrong?
>>   
> Just split on the EOL character:  the "\n":
> re.split('\n', 'spam\nham\neggs\n')
> ['spam', 'ham', 'eggs', '']
> 
> The "^" and "$" characters do not match END-OF-LINE, but rather the  
> END-OF-STRING, which was doing you no good.
> 
With the MULTILINE flag "^" matches START-OF-LINE and "$" matches
END-OF-LINE or END-OF-STRING.
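
A small sketch of the difference, using kj's sample string (the variable
names here are only for illustration):

```python
import re

text = 'spam\nham\neggs\n'

# Without MULTILINE, "^" matches only at the start of the string and "$"
# only at the end (or just before a trailing newline), so no single
# line of a multi-line string can satisfy ^\w+$.
plain = re.findall(r'^\w+$', text)

# With MULTILINE, "^" and "$" anchor at the start and end of every line.
multi = re.findall(r'^\w+$', text, re.M)

print(plain)
print(multi)
```

The first call finds nothing; the second finds each line's word.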

The current re module won't split on a zero-width match, and "^" is
zero-width, which is why all three attempts above return the string
unsplit.
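
Two possible ways to get the lines with their end-of-line markers kept,
without relying on a zero-width split. This is only a sketch: the first
uses str.splitlines() with its keepends argument, the second uses
re.findall() with a pattern chosen here for illustration. Note that
splitlines() also breaks on line boundaries other than "\n" (such as
"\r"), which may or may not be what you want.

```python
import re

text = 'spam\nham\neggs\n'

# keepends=True leaves the line terminator attached to each line.
lines_a = text.splitlines(True)

# Match either a (possibly empty) line ending in "\n", or a final
# non-empty line that has no trailing newline.
lines_b = re.findall(r'.*\n|.+', text)

# The same pattern on a string whose last line is unterminated.
lines_c = re.findall(r'.*\n|.+', 'spam\nham')

print(lines_a)
print(lines_b)
print(lines_c)
```

Later Python versions (3.7 onward) changed re.split() so that it does
split on empty matches, so the result of kj's original attempt depends on
the interpreter version.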


