how to split a string (or sequence) into pairs of characters?

Harvey Thomas hst at empolis.co.uk
Fri Aug 16 06:35:57 EDT 2002


Jason R. Coombs wrote:
> 
> I have a segment of code that seems particularly ugly.  I want to take a
> sequence and divide it into pairs such that
> pairs( 'aabbccdd' ) == ( 'aa','bb','cc','dd' ). Here's code that does what
> I want.
> 
>   even = range( 0, len( s ), 2 )
>   odd = range( 1, len( s ), 2 )
>   even = map( s.__getitem__, even )
>   odd = map( s.__getitem__, odd )
>   pairs = map( operator.add, even, odd )
> 
> I'd like it to be more elegant and more efficient, and I'd like to avoid
> using loops.  I could obviously do:
> 
>   pairs = []
>   for i in range( 0,len(s),2 ):
>     pairs.append( s[i:i+2] )
> 
> But even that seems a bit inefficient, and doesn't take advantage of the
> powerful sequence operations of Python.
> 
> Can anyone come up with a better way of performing these operations? Extra
> kudos if it easily extends to any sublength and not just pairs.
> 
> Jason
> 
OK, I'll go for the kudos:

>>> import re
>>> s = 'abcdefghijklmnop\n\n'
>>> n = 2
>>> re.findall('(?s)(.{%d})' % n, s)
['ab', 'cd', 'ef', 'gh', 'ij', 'kl', 'mn', 'op', '\n\n']
>>> n = 3
>>> re.findall('(?s)(.{%d})' % n, s)
['abc', 'def', 'ghi', 'jkl', 'mno', 'p\n\n']

The (?s) flag lets . match newlines as well. Note that the above assumes the length of the string is a multiple of n; if it isn't, the leftover characters at the end are silently dropped.
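
If the trailing remainder matters, one possible variation (a sketch only, not something tested against the original poster's data) is to use {1,n} instead of {n}, so that the last match may be shorter. The regex approach only works on strings, so a slice-based version is shown next to it for other sequences; it does use a list comprehension, which the original poster may count as a loop. The function names chunks_re and chunks_slice are made up for illustration, and n is assumed to be a positive integer.

```python
import re

def chunks_re(s, n):
    """Split string s into pieces of length n; the last piece may be shorter."""
    # (?s) lets '.' match newlines. {1,n} is greedy, so every piece except
    # possibly the last one has exactly n characters.
    return re.findall('(?s).{1,%d}' % n, s)

def chunks_slice(seq, n):
    """Same idea using slicing; works for any sliceable sequence."""
    return [seq[i:i + n] for i in range(0, len(seq), n)]

print(chunks_re('abcdefg', 3))           # ['abc', 'def', 'g']
print(chunks_slice('abcdefg', 3))        # ['abc', 'def', 'g']
print(chunks_slice([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```

Both versions return an empty list for an empty input.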

HTH

Harvey
