Extracting subsequences composed of the same character
Tim Chase
python.list at tim.thechases.com
Thu Mar 31 21:58:42 EDT 2011
On 03/31/2011 07:43 PM, candide wrote:
> Suppose you have a string, for instance
>
> "pyyythhooonnn ---> ++++"
>
> and you search for the subsequences composed of the same character, here
> you get :
>
> 'yyy', 'hh', 'ooo', 'nnn', '---', '++++'
>>> import re
>>> s = "pyyythhooonnn ---> ++++"
>>> [m.group(0) for m in re.finditer(r"(.)\1+", s)]
['yyy', 'hh', 'ooo', 'nnn', '---', '++++']
>>> [(m.group(0),m.group(1)) for m in re.finditer(r"(.)\1+", s)]
[('yyy', 'y'), ('hh', 'h'), ('ooo', 'o'), ('nnn', 'n'), ('---', '-'), ('++++', '+')]
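
The pattern works because "(.)" captures any single character (newline
excepted, by default) and "\1+" then requires one or more repeats of
that same character, so every match is a run of length two or more.
A regex-free sketch of the same idea, assuming itertools.groupby is
acceptable; the helper name runs() and its min_len parameter are
hypothetical, chosen only for illustration:

```python
from itertools import groupby

def runs(s, min_len=2):
    """Return the maximal runs of one repeated character in s.

    Hypothetical helper: only runs at least min_len long are kept,
    mirroring the "two or more" behaviour of r"(.)\1+".
    """
    out = []
    # groupby() with no key groups consecutive equal characters together
    for _, grp in groupby(s):
        chunk = "".join(grp)
        if len(chunk) >= min_len:
            out.append(chunk)
    return out

print(runs("pyyythhooonnn ---> ++++"))
# ['yyy', 'hh', 'ooo', 'nnn', '---', '++++']
```

One difference to keep in mind: unlike "." without re.DOTALL, this
version also reports runs of newlines.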
-tkc