Splitting a string

Patrick Maupin pmaupin at gmail.com
Fri Apr 2 09:48:47 EDT 2010


On Apr 2, 6:24 am, Peter Otten <__pete... at web.de> wrote:
> Thomas Heller wrote:
> > Maybe I'm just lazy, but what is the fastest way to convert a string
> > into a tuple containing character sequences and integer numbers, like
> > this:
>
> > 'si_pos_99_rep_1_0.ita'  -> ('si_pos_', 99, '_rep_', 1, '_', 0, '.ita')
>
> >>> import re
> >>> parts = re.compile(r"([+-]?\d+)").split('si_pos_99_rep_1_0.ita')
> >>> parts[1::2] = map(int, parts[1::2])
> >>> parts
> ['si_pos_', 99, '_rep_', 1, '_', 0, '.ita']
>
> Peter

You beat me to it.  re.split() seems underappreciated for some
reason.  When I first started using it, I was really annoyed by the
empty strings it returns between matches, even though it was faster
than the alternatives for the tasks I was using it for.  It is only
within the past couple of years that I have come to appreciate the
elegant solutions those empty strings allow: with a capturing group
in the pattern, the result always alternates between unmatched text
and matches, so the matches always sit at the odd indices.  In short,
for a large class of problems, re.split() is the fastest and most
elegant way I know of to use the re module.
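
To make the point about the empty strings concrete, here is a minimal
sketch.  The helper name split_on_digits is made up for illustration,
and the pattern is a simplified version of the one quoted above:

```python
import re

# One capturing group, so re.split() keeps the matched digits in the result.
DIGITS = re.compile(r"(\d+)")


def split_on_digits(text):
    """Hypothetical helper: return the raw re.split() result for text."""
    return DIGITS.split(text)


for sample in ("ab12cd", "12cd", "ab12", "12"):
    parts = split_on_digits(sample)
    # Even indices hold the text between matches (possibly ''),
    # odd indices hold the captured digits.
    print(repr(sample), parts)
```

The empty strings are what keep the digits at the odd indices no matter
where they occur in the input, which is why a slice like parts[1::2] can
pick them out without any further checks.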

So, the only thing I have to add to this solution is a caveat about
this particular regular expression: if the source string starts or
ends with digits, you will get an empty string at the beginning or
end of the resultant list.  If that is a problem, check for and
discard those, but only after the int conversion, since the
parts[1::2] slice relies on them being there.
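
One way to do that, as a rough sketch (the function name split_numbers
is hypothetical, and returning a tuple is just my reading of the
original question):

```python
import re

# Same pattern as in the quoted solution, written as a raw string.
NUMBER = re.compile(r"([+-]?\d+)")


def split_numbers(text):
    """Split text into a tuple of strings and ints, without empty ends."""
    parts = NUMBER.split(text)
    # Convert first: the matches are at the odd indices only while the
    # leading empty string (if any) is still in place.
    parts[1::2] = [int(p) for p in parts[1::2]]
    # Now discard an empty string at either end.
    if parts and parts[0] == "":
        del parts[0]
    if parts and parts[-1] == "":
        del parts[-1]
    return tuple(parts)


print(split_numbers("si_pos_99_rep_1_0.ita"))
print(split_numbers("99_rep_1"))
```

Note that this only trims the ends.  Two numbers that touch, as in
'1-2', still leave an empty string between them in the middle of the
result; whether to drop that one too depends on what you need.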

Regards,
Pat


