how to split a string (or sequence) into pairs of characters?
eddie at holyrood.ed.ac.uk
Fri Aug 16 12:12:22 CEST 2002
Andrew Koenig <ark at research.att.com> writes:
>Berthold> Andrew Koenig <ark at research.att.com> writes:
>Jason> Can anyone come up with a better way of performing these
>Jason> operations? Extra kudos if it easily extends to any sublength
>Jason> and not just pairs.
>>> >>> import re
>>> >>> re.findall('..', 'abcdef')
>>> ['ab', 'cd', 'ef']
>Berthold> >>> re.findall('..', 'aabbccddee')
>Berthold> ['aa', 'bb', 'cc', 'dd', 'ee']
I suspect the OP didn't want to use REs, but after going to all the effort to
think about it, I'll post my variation anyway.
>>> import re
>>> x = 'aabbccdd'
>>> y = [lh for lh, rh in re.findall(r'((.)\2*)', x)]
>>> print(y)
['aa', 'bb', 'cc', 'dd']
This finds all contiguous blocks of identical characters. You could force it
to match just pairs of identical characters with \2 instead of \2*; using \2+
gets only runs that are longer than 1.
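
To make the difference between the three variants concrete, here is a small
sketch. The input string and the variable names (runs, pairs, repeats) are my
own, chosen so that the runs have different lengths; they are not from the
thread.

```python
import re

x = 'aabbbcd'  # hypothetical input: runs of length 2, 3, 1 and 1

# findall returns (outer group, inner group) tuples; keep only the outer one.
runs = [whole for whole, ch in re.findall(r'((.)\2*)', x)]     # any run length
pairs = [whole for whole, ch in re.findall(r'((.)\2)', x)]     # exactly two identical chars
repeats = [whole for whole, ch in re.findall(r'((.)\2+)', x)]  # runs longer than 1

print(runs)     # ['aa', 'bbb', 'c', 'd']
print(pairs)    # ['aa', 'bb']
print(repeats)  # ['aa', 'bbb']
```

Note that the \2 form silently skips characters that have no identical
neighbour left to pair with, so it is not a general "split into pairs".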
The (other) list comprehension answer was probably the simplest.
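
That answer is not quoted in this message, so the following is only a guess at
its general shape: a slicing-based list comprehension. The function name chunks
and its parameter n are hypothetical. It extends to any sublength and works on
any sliceable sequence, not just strings.

```python
def chunks(seq, n=2):
    """Split seq into consecutive slices of length n (the last may be shorter)."""
    return [seq[i:i + n] for i in range(0, len(seq), n)]

print(chunks('aabbccddee'))     # ['aa', 'bb', 'cc', 'dd', 'ee']
print(chunks('abcdefg', 3))     # ['abc', 'def', 'g']
print(chunks([1, 2, 3, 4, 5]))  # [[1, 2], [3, 4], [5]]
```

Unlike re.findall('..', s), this keeps a trailing short piece instead of
dropping it.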