Splitting text at whitespace but keeping the whitespace in the returned list
MRAB
python at mrabarnett.plus.com
Sun Jan 24 12:24:40 EST 2010
python at bdurham.com wrote:
> I need to parse some ASCII text into 'word' sized chunks of text AND
> collect the whitespace that separates the split items. By 'word' I mean
> any string of characters separated by whitespace (newlines, carriage
> returns, tabs, spaces, soft-spaces, etc). This means that my split text
> can contain punctuation and numbers - just not whitespace.
>
> The split( None ) method works fine for returning the word sized chunks
> of text, but destroys the whitespace separators that I need.
>
> Is there a variation of split() that returns delimiters as well as tokens?
>
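I'd use the re module. Putting the pattern in a capturing group makes
re.split() return the separators as well as the tokens: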
>>> import re
>>> re.split(r'(\s+)', "Hello world!")
['Hello', ' ', 'world!']
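
One caveat worth knowing: when the text starts or ends with whitespace, re.split() with a capturing group puts an empty string at that end of the list. A minimal sketch of one way to handle that follows; the helper name split_keep_whitespace is made up for illustration and is not part of the re module.

```python
import re

def split_keep_whitespace(text):
    """Split text into word and whitespace tokens, in original order.

    Hypothetical helper for illustration; not part of the re module.
    """
    # The capturing group makes re.split() return the separators too.
    # Leading or trailing whitespace yields an empty string at that end
    # of the list, so empty tokens are filtered out here.
    return [token for token in re.split(r'(\s+)', text) if token]

text = "  Hello\tworld!\n"
tokens = split_keep_whitespace(text)
print(tokens)  # ['  ', 'Hello', '\t', 'world!', '\n']

# No characters are dropped, so joining the tokens rebuilds the input.
assert ''.join(tokens) == text
```

If you'd rather not filter, re.findall(r'\S+|\s+', text) should give the same list directly, since it only ever matches non-empty runs.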