a RegEx puzzle
Charles Hartman
charles.hartman at conncoll.edu
Fri Mar 11 15:52:54 EST 2005
Thanks -- not only for the code, which does almost exactly what I need
to do, but for the reminder (thanks also to Jeremy Bowers for this!) to
prefer simple solutions. I was, of course, so tied up in getting my
nifty one-liner right that I totally lost sight of how
straightforwardly the job could be done; and now that I've got it, I've
also got room to tune it. For instance, your code keeps the first
"longest" match if several are equal in length; my program will, I think,
do slightly better if I keep the last "longest" instead, and that
only required changing > into >=, which even I can't screw up.
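
For readers following along, here is a minimal sketch of that tie-breaking change. It assumes Python 3, and the names longest_match_last and pairs_re are hypothetical, chosen for illustration; the loop otherwise mirrors the quoted function below.

```python
import re

def longest_match_last(rx, s):
    """Return (start, length) of the longest match for rx in s.

    When several matches share the greatest length, the LAST one wins.
    Returns (None, None) if nothing matches.
    """
    start = length = current = 0
    while True:
        m = rx.search(s, current)
        if not m:
            break
        m_start, m_end = m.span()
        # Resume one character after this match's start, so that
        # overlapping matches are also considered.
        current = m_start + 1
        # ">=" instead of ">": a later match of equal length
        # replaces the earlier one.
        if (m_end - m_start) >= length:
            start = m_start
            length = m_end - m_start
    if length:
        return start, length
    return None, None

pairs_re = re.compile(r'(x[x/])+')
print(longest_match_last(pairs_re, 'xx//xx'))
```

In 'xx//xx' the candidate matches start at 0, 1 and 4, and all have length 2, so this variant reports (4, 2) where the original comparison with > would report (0, 2).
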
Thanks to everyone who's helped on this. Makes me wish I were going to
pycon.
Charles Hartman
Professor of English, Poet in Residence
http://cherry.conncoll.edu/cohar
http://villex.blogspot.com
Kent Johnson wrote:
> It's pretty simple to put re.search() into a loop where each
> subsequent search starts from the character after where the previous
> match began. Here is a solution that uses a general-purpose longest
> match function:
>
> import re
>
> # RE solution
> def longestMatch(rx, s):
>     ''' Find the longest match for rx in s.
>     Returns (start, length) for the match or (None, None) if no
>     match found.
>     '''
>
>     start = length = current = 0
>
>     while True:
>         m = rx.search(s, current)
>         if not m:
>             break
>
>         mStart, mEnd = m.span()
>         current = mStart + 1
>
>         if (mEnd - mStart) > length:
>             start = mStart
>             length = mEnd - mStart
>
>     if length:
>         return start, length
>
>     return None, None
>
>
> pairsRe = re.compile(r'(x[x/])+')
>
> for s in [ '/xx/xxx///', '//////xx//' ]:
>     print s, longestMatch(pairsRe, s)
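
The quoted loop restarts each search at mStart + 1 rather than at the end of the previous match, which matters because matches of this pattern can overlap. The sketch below, assuming Python 3 and using hypothetical variable names, compares that overlapping scan with the non-overlapping scan that finditer() performs on the first test string.

```python
import re

pairs_re = re.compile(r'(x[x/])+')
s = '/xx/xxx///'

# Non-overlapping scan: finditer() resumes where the previous match ended.
non_overlapping = [m.span() for m in pairs_re.finditer(s)]

# Overlapping scan, as in the quoted loop: resume one character after
# the START of the previous match.
overlapping = []
pos = 0
while True:
    m = pairs_re.search(s, pos)
    if not m:
        break
    overlapping.append(m.span())
    pos = m.start() + 1

def longest(spans):
    """Length of the longest (start, end) span in the list."""
    return max(end - start for start, end in spans)

print(non_overlapping, longest(non_overlapping))
print(overlapping, longest(overlapping))
```

Here the non-overlapping scan finds spans (1, 3) and (4, 8), so its longest match has length 4, while the overlapping scan also finds (2, 8), of length 6. The first match consumes the 'xx' at positions 1-2, hiding the longer run that begins at position 2.
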