Why no RE match of A AND B?

John Machin sjmachin at lexicon.net
Mon Mar 3 16:20:34 EST 2003


Tim Peters <tim.one at comcast.net> wrote in message news:<mailman.1046651583.23834.python-list at python.org>...
> 
> For a so-called DFA algorithm, yes, because the intersection of regular
> languages is also regular, and membership in any regular language can be
> recognized in one pass over the string in question.  What to do in the
> presence of (non-regular) backreferences is clear as mud, though, and
> Python's regexp package isn't a DFA engine anyway.  I don't know of any
> regexp package that supports an intersection operator, so it would also
> suffer from novelty.  All in all, better to do the obvious thing (i.e., run
> more than one regexp).

I seem to detect two quite different usages of "intersection":

(1) The regular language (RL) characterised by the RE "abc" produces
just one string i.e. 'abc'. Ditto the RL described by the RE "xyz"
produces just one string 'xyz'. I (a mug punter, not an expert) would
understand the intersection of the two RLs to be empty i.e. [] i.e.
the two RLs produce no common strings. Same with "a+" and "b+",
whereas "a*" and "b*" have a non-empty intersection [''].

(2) The effect that the OP (together with Tim and other posters)
seemed to expect from the & operator was that (abc)&(xyz) would match
(abcxyz)|(xyzabc).

<:-)>
And now for a pragmatic point: if the & operator was ever so slightly
useful, or ever so slightly demanded, or ever so slightly easy to
implement, it would have surfaced in a certain language long before
now.
</:-)>




More information about the Python-list mailing list