Splitting text at whitespace but keeping the whitespace in the returned list

Tim Arnold tim.arnold at sas.com
Mon Jan 25 12:47:25 EST 2010


"MRAB" <python at mrabarnett.plus.com> wrote in message 
news:mailman.1362.1264353878.28905.python-list at python.org...
> python at bdurham.com wrote:
>> I need to parse some ASCII text into 'word' sized chunks of text AND 
>> collect the whitespace that separates the split items. By 'word' I mean 
>> any string of characters separated by whitespace (newlines, carriage 
>> returns, tabs, spaces, soft-spaces, etc). This means that my split text 
>> can contain punctuation and numbers - just not whitespace.
>> The split( None ) method works fine for returning the word sized chunks 
>> of text, but destroys the whitespace separators that I need.
>> Is there a variation of split() that returns delimiters as well as 
>> tokens?
>>
> I'd use the re module:
>
> >>> import re
> >>> re.split(r'(\s+)', "Hello world!")
> ['Hello', ' ', 'world!']
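
A slightly longer sketch of the re.split approach on messier input. The
helper name split_keep_whitespace and the sample string are made up for
illustration; the point is that the capturing group in the pattern is what
makes re.split hand back the separators along with the words.

```python
import re

def split_keep_whitespace(text):
    # Hypothetical helper: the parentheses form a capturing group, so
    # re.split includes each whitespace run in the result list.
    return re.split(r'(\s+)', text)

sample = "Hello,\t world!\nbye"
parts = split_keep_whitespace(sample)
print(parts)                     # ['Hello,', '\t ', 'world!', '\n', 'bye']

# Nothing is lost: joining the pieces gives back the original text.
print(''.join(parts) == sample)  # True

# Leading or trailing whitespace produces empty strings at the ends.
print(split_keep_whitespace(" a "))  # ['', ' ', 'a', ' ', '']
```

If the empty strings at the ends get in the way, they can be filtered out
afterwards with something like [p for p in parts if p].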

Also, partition works, though it returns a tuple instead of a list and
splits only at the first occurrence of the separator you give it.
>>> s = 'hello world'
>>> s.partition(' ')
('hello', ' ', 'world')
>>>
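
To show where partition stops being enough, here is a rough sketch with
more than one separator in the string. The sample text is invented, and
the re.findall line is an alternative I am swapping in, not something
partition does by itself.

```python
import re

s = 'hello big  world'

# partition takes a literal separator and splits once, at its first
# occurrence; everything after that stays in the last element.
first = s.partition(' ')
print(first)    # ('hello', ' ', 'big  world')

# Alternative (a different technique from partition): alternate between
# runs of non-whitespace and runs of whitespace, which gives a flat list
# with no empty strings.
tokens = re.findall(r'\S+|\s+', s)
print(tokens)   # ['hello', ' ', 'big', '  ', 'world']
```

So partition is handy when there is exactly one known separator; for text
with arbitrary runs of whitespace, a regular expression is the simpler fit.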

--Tim Arnold




