(mostly-)POSIX regular expressions
John Machin
sjmachin at lexicon.net
Sun May 28 18:43:08 EDT 2006
On 29/05/2006 7:46 AM, Sébastien Boisgérault wrote:
> Paddy wrote:
>
>> maybe this: http://www.pcre.org/pcre.txt and ctypes might work for you?
>
> Well, in the end, it doesn't fit. What I need is a "longest match" policy
> in patterns like "(a)|(b)|(c)" and NOT a "left-to-right" policy.
> Additionally, I need to be able to obtain the matched ("captured")
> substring, and PCRE does not allow this in DFA mode.
>
Perhaps you might like to be somewhat more precise with your
requirements. "POSIX-compliant" made me think of yuckies like [:fubar:]
in character classes :-)
Are the operands of | such that their lengths are not fixed, so that you
can't simply write them in descending length order? Care to tell us some
more detail about those operands?
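
To illustrate the descending-length idea, here is a minimal sketch using the
standard re module. It assumes the operands are fixed strings; the words
LOG, LOGIC and LOGICAL are only sample data. Python's re tries the
alternatives left to right, so putting the longest alternative first gives
the longest match.

```python
import re

text = "LOGICAL"

# Alternatives are tried left to right: the first one that matches wins.
leftmost = re.match(r"LOG|LOGIC|LOGICAL", text)

# Sorting the fixed strings longest first yields the longest match instead.
words = ["LOG", "LOGIC", "LOGICAL"]
pattern = "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))
longest = re.match(pattern, text)

print(leftmost.group(0))  # LOG
print(longest.group(0))   # LOGICAL
```

This trick stops working as soon as an operand can match strings of varying
length, which is why the details of the operands matter.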
If those operands are simple strings (LOGICAL|LOGIC|LOG) and you've got
more than a handful of them, try Danny Yoo's ahocorasick module.
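
For a rough idea of what keyword matching buys you, here is a small
pure-Python sketch of longest-keyword lookup with a plain trie. It is not
Aho-Corasick itself and it does not use the ahocorasick module's API; the
names build_trie and longest_keyword are made up for this example.

```python
def build_trie(words):
    """Build a nested-dict trie; the key None marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[None] = word
    return root


def longest_keyword(trie, text, start=0):
    """Return the longest keyword that begins at text[start], or None."""
    node = trie
    best = None
    for ch in text[start:]:
        node = node.get(ch)
        if node is None:
            break
        if None in node:
            # A keyword ends here; keep walking in case a longer one follows.
            best = node[None]
    return best


trie = build_trie(["LOG", "LOGIC", "LOGICAL"])
print(longest_keyword(trie, "LOGICALLY"))  # LOGICAL
print(longest_keyword(trie, "LOGIN"))      # LOG
print(longest_keyword(trie, "LOOP"))       # None
```

The order in which the keywords are supplied does not matter here: the walk
always remembers the last complete keyword it passed.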
HTH,
John