Regular expression negative look-ahead

Jason Friedman jsf80238 at gmail.com
Thu Jul 4 04:49:23 CEST 2013


Huh, I did not realize that endswith accepts a tuple of suffixes.  I'll
remember that in the future.

This need is actually for http://schemaspy.sourceforge.net/, which allows
one to include only tables/views that match a pattern.

Either there is a bug in Schemaspy's code, or Java's implementation of
regular expressions differs from Python's, or there is a flaw in my
logic, because the pattern I verified in Python produces different results
when used with Schemaspy.  I suppose I'll open a bug there unless I can
find the aforementioned flaw.
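
One difference worth ruling out, offered as a guess because Schemaspy's
source has not been checked here: Java's Matcher.matches() succeeds only
when the pattern consumes the whole input, whereas Python's re.search()
and re.match() accept a match of any length.  The lookahead-only pattern
matches just the empty string, so under whole-string semantics it would
reject every name.  The sketch below imitates that behavior in Python
with re.fullmatch(); the variant with a trailing ".*" is an assumption
added for illustration, not something from this thread.

```python
import re

names = ["MY_TABLE", "MY_TABLE_CTL", "YOUR_TABLE", "YOUR_TABLE_RUN"]

# Pattern from the thread: a lookahead only, so it consumes no characters.
lookahead_only = re.compile(r"^(?!.*(CTL|DEL|RUN))")
# Assumed variant: the same lookahead followed by .* so it can span the name.
whole_name = re.compile(r"^(?!.*(CTL|DEL|RUN)).*")

# re.search accepts the zero-length match at position 0.
search_hits = [n for n in names if lookahead_only.search(n)]
# re.fullmatch (Python 3.4+) requires the match to cover the entire string.
full_hits_lookahead = [n for n in names if lookahead_only.fullmatch(n)]
full_hits_whole = [n for n in names if whole_name.fullmatch(n)]

print(search_hits)
print(full_hits_lookahead)
print(full_hits_whole)
```

If the tool does match whole strings, the lookahead-only pattern would
select nothing there even though it selects the right names with
re.search() in Python.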


On Mon, Jul 1, 2013 at 11:44 PM, Ian Kelly <ian.g.kelly at gmail.com> wrote:

> On Mon, Jul 1, 2013 at 8:27 PM, Jason Friedman <jsf80238 at gmail.com> wrote:
> > Found this:
> > http://stackoverflow.com/questions/13871833/negative-lookahead-assertion-not-working-in-python
> >
> > This pattern seems to work:
> > pattern = re.compile(r"^(?!.*(CTL|DEL|RUN))")
> >
> > But I am not sure why.
> >
> >
> > On Mon, Jul 1, 2013 at 5:07 PM, Jason Friedman <jsf80238 at gmail.com> wrote:
> >>
> >> I have table names in this form:
> >> MY_TABLE
> >> MY_TABLE_CTL
> >> MY_TABLE_DEL
> >> MY_TABLE_RUN
> >> YOUR_TABLE
> >> YOUR_TABLE_CTL
> >> YOUR_TABLE_DEL
> >> YOUR_TABLE_RUN
> >>
> >> I am trying to create a regular expression that will return true for
> >> only these tables:
> >> MY_TABLE
> >> YOUR_TABLE
> >>
> >> I tried these:
> >> pattern = re.compile(r"_(?!(CTL|DEL|RUN))")
> >> pattern = re.compile(r"\w+(?!(CTL|DEL|RUN))")
> >> pattern = re.compile(r"(?!(CTL|DEL|RUN)$)")
> >>
> >> But all three match every table name.
> >> I do not need to capture anything.
>
>
> For some reason I don't seem to have a copy of your initial post.
>
> The reason that regex works is that it is anchored at the start of the
> string and matches only if ".*(CTL|DEL|RUN)" /doesn't/ match from
> there.  For a name such as "MY_TABLE_CTL" that inner pattern does match
> starting from the beginning of the string, so the pattern as a whole
> does not match.  Note that it rejects a name containing one of those
> strings anywhere, not only at the end.
>
> The reason that the other three do not work is that the lookahead
> assertions are not properly anchored.  The first one can match the
> first underscore in "MY_TABLE_CTL" instead of the second, and then the
> next three characters are "TAB", not any of the verboten strings, so
> it matches.  The second one matches any run of word characters that
> isn't followed by one of those strings.  So it will just match the
> entire string "MY_TABLE_CTL"; what follows is then empty, which is none
> of those three strings, so it too gets accepted.  The third one simply
> matches an empty string that isn't followed by one of those three at
> the end of the string, so it will just match at the very start of the
> string, where the next three characters ("MY_") are none of them.
>
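
The explanation above can be checked directly.  This is a small sketch
that runs the three failed attempts against a few of the table names
with re.search() and records what each one actually matched; the
"matched" dictionary is only a helper for the demonstration.

```python
import re

names = ["MY_TABLE", "MY_TABLE_CTL", "MY_TABLE_DEL", "MY_TABLE_RUN"]

# The three attempts from the original post.
attempts = [
    r"_(?!(CTL|DEL|RUN))",
    r"\w+(?!(CTL|DEL|RUN))",
    r"(?!(CTL|DEL|RUN)$)",
]

# Record the text each attempt matches in each name (None if no match).
matched = {}
for pat in attempts:
    for name in names:
        m = re.search(pat, name)
        matched[(pat, name)] = m.group(0) if m else None
        print(f"{pat!r:28} {name:14} -> {matched[(pat, name)]!r}")
```

Every combination produces a match: the first attempt matches the first
underscore, the second matches the whole name, and the third matches the
empty string at position 0.
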
> Now, all that said, are you sure you actually need a regular
> expression for this?  It seems to me that you're overcomplicating
> things.  Since you don't need to capture anything, your need can be
> met more simply with:
>
> if not table_name.endswith(('_CTL', '_DEL', '_RUN')):
>     pass  # Do whatever
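
Applied to the full list of names from the original post, the endswith
approach might look like the sketch below.  The names SUFFIXES and
base_tables are made up for the example.  Note that str.endswith accepts
a single string or a tuple of strings; passing a list raises TypeError.

```python
names = [
    "MY_TABLE", "MY_TABLE_CTL", "MY_TABLE_DEL", "MY_TABLE_RUN",
    "YOUR_TABLE", "YOUR_TABLE_CTL", "YOUR_TABLE_DEL", "YOUR_TABLE_RUN",
]

# Must be a tuple (or a single str); a list is rejected with TypeError.
SUFFIXES = ("_CTL", "_DEL", "_RUN")

# Keep only the names that do not end with one of the suffixes.
base_tables = [n for n in names if not n.endswith(SUFFIXES)]
print(base_tables)
```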