[Tutor] Finding the "streaks" in heads/tails list
Danny Yoo
dyoo at cs.wpi.edu
Thu Oct 2 00:00:05 CEST 2008
> Regular expressions are for processing strings, not loops.
From a theoretical point of view, this isn't quite true: regular
expressions can deal with sequences of things. It's true that most
regular expression libraries know how to deal only with characters,
but that's a matter of specializing the library for efficiency, and
not a general property of regexes.
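But what regular expressions (i.e. finite-state automata) can't do
very well is count: a finite-state machine has no unbounded memory in
which to keep a running tally, and measuring the length of each streak
is exactly that kind of counting. So the task you're asking about is
fundamentally an anti-regexp one.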
> I would loop through the list with a for loop, keeping track of the
> last value seen and the current count. If the current value is the
> same as the last, increment the count; if it is different, reset the
> count.
Agreed. This seems direct.
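A minimal sketch of that loop, in case it helps to see it spelled out. The function name streaks() and the 'H'/'T' encoding of the flips are my own assumptions for illustration, not anything from the original question:

```python
def streaks(flips):
    """Return (value, run_length) pairs for each run of equal neighbours."""
    result = []
    last = None    # last value seen
    count = 0      # length of the current run; 0 means no run started yet
    for value in flips:
        if count and value == last:
            # Same as the previous value: the current run grows.
            count += 1
        else:
            # Different value: record the finished run (if any), then reset.
            if count:
                result.append((last, count))
            last = value
            count = 1
    # Record the final run, which the loop itself never closes.
    if count:
        result.append((last, count))
    return result


flips = list("HHTHHHTT")   # hypothetical sample data
print(streaks(flips))
```

The one easy thing to forget is the final run: the loop only records a run when it sees the value change, so the last one has to be appended after the loop ends.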
If we want to be cute, we can also use the itertools.groupby()
function to do the clumping of identical sequential values for us.
For example:
#################################################
>>> import itertools
>>> for value, run in itertools.groupby('aaaabbbbcaaabaaaacc'):
...     print(value, len(list(run)))
...
a 4
b 4
c 1
a 3
b 1
a 4
c 2
#################################################
See the standard library documentation for more details on itertools.groupby():
http://www.python.org/doc/lib/itertools-functions.html
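Applied to the heads/tails problem, the same groupby() idea might look roughly like the sketch below. The helper name longest_streak() and the 'H'/'T' list are assumptions of mine for illustration; only itertools.groupby() itself comes from the standard library:

```python
import itertools


def longest_streak(flips):
    """Return (value, run_length) for the longest run, or None if flips is empty."""
    # groupby() clumps identical sequential values; each group is an iterator,
    # so it has to be consumed (here via list) to learn its length.
    runs = [(value, len(list(group))) for value, group in itertools.groupby(flips)]
    if not runs:
        return None
    # max() keeps the first run it meets among runs of equal length.
    return max(runs, key=lambda run: run[1])


flips = list("HHTHHHTT")   # hypothetical sample data
print(longest_streak(flips))
```

Note that groupby() only groups neighbouring values, which is exactly what a streak is; it does not gather all the 'H' entries in the list into a single group.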