When Good Regular Expressions Go Bad

Malcolm Tredinnick malcolmt at smart.net.au
Wed Sep 29 04:43:28 EDT 1999


On Wed, Sep 29, 1999 at 02:55:19AM -0400, Tim Peters wrote:
> [Douglas Alan]
> > ...
> > It seems to me that even when a regular expression fails to match a
> > string, you might want to know just how far it was able to get before
> > getting stuck.  (And indeed I do!) For instance, let's say that I have
> > the regular expression "^foo.bar", and I try to match it on the
> > string "fooxbaz".  It might be useful to be able to find out that
> > the regular expression was able to get all the way through "fooxba"
> > before giving up the ghost.
> 
> I believe this would be easy to add to any regexp engine I've ever seen, and
> also believe I've never seen one that keeps track of it.  I confess I'm at a
> loss to think of a compelling use for it, though.

Isn't there a problem here in deciding which "failing match" to
report? Do you take the longest failing match (which will probably be
horrendously expensive, since you have to try every starting position
in the candidate string), the leftmost failing match (which could be
uninteresting a lot of the time), or something else?
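
To make the ambiguity concrete, here is a minimal sketch in Python. It
is a toy matcher, not how any real regexp engine works: it assumes the
pattern contains only literal characters and ".", so there is no
backtracking to keep track of. The function names are made up for
illustration.

```python
def matched_prefix_len(pattern, text, start=0):
    """Count how many leading pattern characters match at text[start:].

    Toy assumption: `pattern` holds only literal characters and '.',
    where '.' matches any single character.
    """
    n = 0
    for p, c in zip(pattern, text[start:]):
        if p != "." and p != c:
            break
        n += 1
    return n


def leftmost_partial(pattern, text):
    """Return (start, length) for the first start that matches anything."""
    for start in range(len(text)):
        n = matched_prefix_len(pattern, text, start)
        if n:
            return start, n
    return None


def longest_partial(pattern, text):
    """Return (start, length) for the start that gets furthest.

    Unlike leftmost_partial, this has to try every start position.
    """
    best = None
    for start in range(len(text)):
        n = matched_prefix_len(pattern, text, start)
        if n and (best is None or n > best[1]):
            best = (start, n)
    return best


text = "foo fooxbaz"
print(leftmost_partial("foo.bar", text))  # (0, 4): got through "foo "
print(longest_partial("foo.bar", text))   # (4, 6): got through "fooxba"
```

The two definitions give different answers on the same input, and the
leftmost one reports the less interesting attempt. With real
quantifiers and alternation the "how far did it get" question also has
to account for backtracking, which this sketch ignores entirely.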

While I agree it's a useful thing to have sometimes (error recovery
during parsing, for instance detecting possible spelling errors), it
needs to be defined a little better before being implemented.

Cheers,
Malcolm Tredinnick.




