a RegEx puzzle

Charles Hartman charles.hartman at conncoll.edu
Fri Mar 11 18:52:17 CET 2005


If I understand you right, then I still didn't explain clearly. 
(Surprise!) If the string is '//////xx//' then possible matches are at 
position 6 and 7 (both length 2, so "longest" doesn't even come into 
it). My code searches from position 0, then 1, then 2, and so on, to 
catch every possible pattern and then compare them for length.
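To make that concrete, here is a minimal sketch of the search-from-every-offset idea run on the example string. It assumes the standard `re` module in place of the `sre` module named in the quoted code, and `candidates` is just an illustrative name for the intermediate list.

```python
import re

# Same pattern as in the quoted code: one or more pairs of 'x' + ('x' or '/').
pat = re.compile(r'(x[x/])+')
marks = '//////xx//'

# Restart the search at every offset so that overlapping candidates
# (here the ones starting at 6 and at 7) are all collected.
candidates = [(fnd.end() - fnd.start(), fnd.start())
              for i in range(len(marks))
              for fnd in pat.finditer(marks, i)]

# max() compares (length, start) tuples, so ties on length go to the later start.
longest, startlongest = max(candidates)
print(sorted(set(candidates)))  # [(2, 6), (2, 7)]
print(longest, startlongest)    # 2 7
```

The match at 6 turns up once for each of the offsets 0 through 6, which is the redundancy the shorter loop is meant to remove.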

You seem to be suggesting a different approach, one I hadn't thought 
of: explicitly test series of pairs, rather than the whole remaining 
string at each point, and do this just once starting at 0, and once 
starting at 1. That sounds as though it would work, though the regex 
would have to be called in a different way so as to seek 
non-overlapping patterns (rather than the elaborate precautions I've 
taken to seek overlapping ones) -- I'm not yet sure quite how, and I'm 
not yet clear that it's any more efficient and/or elegant than what 
I've got now. Hm -- lots to think about here. Thank you.
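One possible reading of the pair-by-pair idea, as a rough sketch only: step through the string two characters at a time, once from offset 0 and once from offset 1, and track the longest unbroken run of valid pairs. The helper name `longest_run` and the single-pair pattern `pair` are made up for illustration, and `re` again stands in for `sre`.

```python
import re

# One unit of the original '(x[x/])+' pattern.
pair = re.compile(r'x[x/]')

def longest_run(marks):
    """Return (length, start) of the longest run of consecutive valid pairs.

    Returns (0, -1) if no pair matches anywhere.
    """
    best = (0, -1)
    for offset in (0, 1):              # the two possible pair alignments
        run_start = None
        for pos in range(offset, len(marks) - 1, 2):
            if pair.match(marks, pos):  # test exactly the pair at pos
                if run_start is None:
                    run_start = pos
                best = max(best, (pos + 2 - run_start, run_start))
            else:
                run_start = None        # run broken; start over
    return best

print(longest_run('//////xx//'))  # (2, 7)
print(longest_run('/xx/x/x/'))    # (6, 2)
```

This is not the same as keeping `finditer` and only shrinking the loop to `range(2)`. `finditer` scans forward to the first `x` it can match regardless of alignment, so on '/xx/x/x/' both offsets 0 and 1 lock onto the 'xx' at position 1 and never see the length-6 run that starts at position 2.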

Charles Hartman
Professor of English, Poet in Residence
http://cherry.conncoll.edu/cohar
http://villex.blogspot.com

>> pat = sre.compile('(x[x/])+')
>> (longest, startlongest) = max([(fnd.end()-fnd.start(), fnd.start())
>>     for i in range(len(marks))
>>     for fnd in pat.finditer(marks,i)])
>
> If I'm understanding that correctly, the only way for you to get
> different best matches are at offsets 0 and 1; offset 2 will yield the
> same matches as 0, with the possibility of excluding the first two
> characters -- i.e. any different matches should be guaranteed to be
> shorter. Therefore
>
> ... for i in range(2) ...
>
> instead of
>
> ... for i in range(len(marks)) ...
>
> should be sufficient.
>
> Peter



