Inverse regex?

Daniel Berlin dan at cgsoftware.com
Sat Jul 7 01:41:00 EDT 2001


"Andrew Henshaw" <ahenshaw.at.apower.dot.com at www.cgsoftware.com> writes:

> Has anybody seen or developed what might be called an inverse regex
> generator.  We would like to do some unit testing of modules that have user
> input validated by regular expressions.  It would be nice to throw some some
> strings at them and see if the propagated input causes an error in the code
> downstream.  In other words, it may help to detect if our validation
> routines are insufficient.
> 
> I understand that, in many cases, a complete list of valid strings is
> impossible.  However, for certain regular expressions it would be certainly
> reasonable.  For others, it will be possible, but practically impossible
> (set too large).  For the impossible, and near impossible, it would be nice
> to generate a bounded set that hits many of the degenerate and extreme
> cases.
> 
> Any ideas?
Yes.
There was an automata/regex/cfg toolkit that could do this.
I can't remember the name, but i can describe it, in case someone else
remembers (you should be able to find it off the description, if worse
comes to worst). I know it wasn't bruce watson's fire stuff.

It was written in C++ and consisted of a library and a bunch of
filters that used it.  It could do DFA's, NFA's, CFG's, Regular
expressions (and something else i found neat, but escapes me at the
moment), in all of their various
forms. The filters it built using the library (very simple of course,
the library did all  the real work) consisted of   conversions between
all of the various forms (where possible), enumeration of their
languages (whether they were infinite or not),
minimization of DFA's (by hopcroft's algorithm, as well as 
minimization by reversal), minimization of regular expressions, 

Oh, wait, just found it.

The Grail+ project.

www.csd.uwo.ca/research/grail

One of the filters will do enumeration of the languages.
It'll also let you specify the maximum number of strings you want to
enumerate (in case it's an infinite language).

So just throw a regex (in their regex format) at it, and away you go.

HTH,
Dan


> 
> Andrew Henshaw
> 
> 
> -- 
> http://mail.python.org/mailman/listinfo/python-list

-- 
"Four years ago...  No, it was yesterday.
Today I...  No, that wasn't me.
Sometimes I...  No, I don't.
"-Steven Wright




More information about the Python-list mailing list