(mostly-)POSIX regular expressions

John Machin sjmachin at lexicon.net
Sun May 28 18:43:08 EDT 2006


On 29/05/2006 7:46 AM, Sébastien Boisgérault wrote:
> Paddy wrote:
> 
>> maybe this: http://www.pcre.org/pcre.txt and ctypes might work for you?
> 
> Well, in the end it doesn't fit. What I need is a "longest match"
> policy in patterns like "(a)|(b)|(c)" and NOT a "left-to-right" policy.
> Additionally, I need to be able to obtain the matched ("captured")
> substring, and PCRE does not allow this in DFA mode.
> 

Perhaps you might like to be somewhat more precise with your 
requirements. "POSIX-compliant" made me think of yuckies like [:fubar:] 
in character classes :-)

Are the operands of | of variable length, so that you can't simply 
write them in descending length order? Care to tell us some more detail 
about those operands?
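
If the operands really are variable-length, one rough workaround with the 
standard re module is to try each top-level alternative separately and keep 
the longest result. The helper name longest_match below is made up for 
illustration, and this is only a sketch: each operand is still matched with 
re's usual backtracking rules, so it is not full POSIX leftmost-longest 
semantics for alternations nested inside an operand.

```python
import re

def longest_match(patterns, text, pos=0):
    """Try every alternative at `pos` and return the longest match object.

    `patterns` is a list of regex strings, one per top-level alternative.
    On a tie the earliest alternative in the list wins.
    Returns None if nothing matches.
    """
    best = None
    for pattern in patterns:
        m = re.compile(pattern).match(text, pos)
        if m is not None and (best is None or m.end() > best.end()):
            best = m
    return best

# re's own alternation stops at the first alternative that matches.
print(re.match(r"(a)|(ab+)|(abbc)", "abbcd").group())           # a
# Trying each alternative on its own and comparing lengths picks the longest.
print(longest_match([r"a", r"ab+", r"abbc"], "abbcd").group())  # abbc
```

The returned object is an ordinary match object, so the matched substring 
and any groups inside the winning operand are available as usual.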

If those operands are simple strings (LOGICAL|LOGIC|LOG) and you've got 
more than a handful of them, try Danny Yoo's ahocorasick module.
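
For just a handful of simple strings, sorting them longest-first before 
joining them with | should give the longest match with plain re. This is 
a minimal sketch using only the standard library (it is not the 
ahocorasick module, and literal_alternation is a name invented here):

```python
import re

def literal_alternation(words):
    """Compile an alternation of literal strings, longest first.

    With the longer strings listed first, re's left-to-right policy
    cannot stop at a shorter string that is a prefix of a longer one.
    """
    ordered = sorted(set(words), key=len, reverse=True)
    return re.compile("|".join(re.escape(w) for w in ordered))

words = ["LOG", "LOGIC", "LOGICAL"]

# Written in ascending length order, the shortest prefix wins.
print(re.match("LOG|LOGIC|LOGICAL", "LOGICAL").group())    # LOG
# Sorted by descending length, the longest string wins.
print(literal_alternation(words).match("LOGICAL").group())  # LOGICAL
```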

HTH,
John


