Why no RE match of A AND B?

Anders J. Munch andersjm at dancontrol.dk
Tue Mar 4 07:26:38 EST 2003


"Rene Pijlman" <reply.in at the.newsgroup> wrote:
> Anders J. Munch:
> >"Rene Pijlman" <reply.in at the.newsgroup> wrote:
> >> I take it this means:
> >>
> >>   match(r1&r2,s) <==> match(r1,s) and match(r2,s)
> >>
> >> I assume (without a formal proof at this point) that r1&r2 can
> >> always be reformulated as a simpler expression, BIMBW.
> >
> >It can always be reformulated as an expression without an intersection
> >('&') operator.  But not necessarily a simpler one.
> 
> Could you give an example?

The notation below is not the usual regexp syntax (literals are
quoted, and named patterns start with a colon), but I hope you can
make something of it anyway.  (Ever since I wrote a utility with a
cleaned-up regexp syntax, traditional syntax is just too painful for
anything non-trivial.)

Here's a regular expression that matches a C comment unambiguously.

:ccomment =  "/*" ([^"*"] | "*"+ [^"*/"])* "*"+ "/"

Say I have another regexp that matches certain sorts of special syntax
that might occur in comments:

:encoding =  "-*-" [" "A-Za-z]* "coding:" [A-Za-z0-9"- "]+ "-*-"

Now combine these two to form a regular expression to match comments
that contain encoding instructions.  With intersection, it's fairly
simple:

:containsencoding = .* :encoding .*
:commentwithencoding = :ccomment & :containsencoding
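
In stock Python the same intersection can be had by running two checks
instead of building one combined pattern.  What follows is only a
sketch: the translations of :ccomment and :encoding into re syntax are
my own, and the names CCOMMENT, ENCODING and comment_with_encoding are
made up for illustration.  The ".* :encoding .*" wrapper becomes a
plain search(), and the '&' becomes 'and'.

```python
import re

# Hypothetical translations of :ccomment and :encoding into Python's re syntax.
CCOMMENT = re.compile(r'/\*(?:[^*]|\*+[^*/])*\*+/')
ENCODING = re.compile(r'-\*-[ A-Za-z]*coding:[A-Za-z0-9\- ]+-\*-')


def comment_with_encoding(s):
    """Emulate  :ccomment & :containsencoding  with two separate checks.

    match(r1 & r2, s)  <==>  match(r1, s) and match(r2, s)
    """
    is_comment = CCOMMENT.fullmatch(s) is not None   # the whole string is one C comment
    has_encoding = ENCODING.search(s) is not None    # .* :encoding .*
    return is_comment and has_encoding


print(comment_with_encoding("/* -*- coding: utf-8 -*- */"))
print(comment_with_encoding("/* plain comment */"))
```

This only answers the yes/no question for a whole string.  It does not
give you a single pattern to embed in a larger expression, which is
what a real '&' operator would.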

I haven't tried to express this without intersection, and I wouldn't
care to try either.
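
That an intersection-free form always exists follows from the standard
product construction on finite automata: run both machines in lockstep
and accept only when both accept.  Here is a minimal sketch on two toy
languages.  The dict-based DFA layout and the helper names run and
intersect are assumptions of this example, not any library's API.

```python
def run(dfa, s):
    """Return True if the DFA accepts s; a missing transition rejects."""
    state = dfa["start"]
    for ch in s:
        state = dfa["delta"].get((state, ch))
        if state is None:
            return False
    return state in dfa["accept"]


def intersect(a, b):
    """Product construction: states are pairs, both components must accept."""
    delta = {}
    for (p, ch_a), p_next in a["delta"].items():
        for (q, ch_b), q_next in b["delta"].items():
            if ch_a == ch_b:
                delta[((p, q), ch_a)] = (p_next, q_next)
    accept = {(p, q) for p in a["accept"] for q in b["accept"]}
    return {"start": (a["start"], b["start"]), "accept": accept, "delta": delta}


# Toy DFA 1: strings over {a, b} with an even number of 'a'.
even_a = {"start": 0, "accept": {0},
          "delta": {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}}
# Toy DFA 2: strings over {a, b} that end in 'b'.
ends_b = {"start": 0, "accept": {1},
          "delta": {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}}

both = intersect(even_a, ends_b)
print(run(both, "aab"), run(both, "ab"))
```

The product has as many states as the two inputs multiplied together,
which is one way to see why the equivalent '&'-free expression need
not be any simpler.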

the-real-fun-begins-when-you-have-a-complement-operator-as-well-ly
y'rs, Anders





