[ python-Bugs-1597014 ] Can't exclude words before capture group

SourceForge.net noreply at sourceforge.net
Fri Nov 17 12:35:13 CET 2006


Bugs item #1597014, was opened at 2006-11-15 14:27
Message generated for change (Comment added) made by ctimmerman
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1597014&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.4
Status: Closed
Resolution: Invalid
Priority: 5
Private: No
Submitted By: Cees Timmerman (ctimmerman)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: Can't exclude words before capture group

Initial Comment:
Python 2.4.3 (#2, Oct  6 2006, 07:52:30)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2

Tried:

>>> re.findall(r'(?!def)\b(\S+)\(', "def bla(): dof blu()")

>>> re.findall(r'(?:def){0}\b(\S+)\(', "def bla(): dof blu()")

Result:

['bla', 'blu']

Expected:

['blu']


Why doesn't (?!) work like it does here?:

>>> re.findall(r'\b(\S+): (?!bad)', "bob: bad; suzy: good")
['suzy']


Wouldn't it be nice if (^) worked?

>>> re.findall(r'\b(\S+): (^bad)', "bob: bad; suzy: good")
[]

[^()] does, sorta. Also not before a capture group:

>>> re.findall(r'\b(\S+): [^(bad)]', "bob: bad; suzy: good")
['suzy']
>>> re.findall(r'[^(def)]\b(\S+)\(', "def bla(): dof blu()")
['bla', 'blu']
>>> re.findall(r'[^(def)] (\S+)\(', "def bla(): dof blu()")
[]
>>> re.findall(r'(^def) (\S+)\(', "def bla(): dof blu()")
[('def', 'bla')]


----------------------------------------------------------------------

>Comment By: Cees Timmerman (ctimmerman)
Date: 2006-11-17 11:35

Message:
Logged In: YES 
user_id=1646092
Originator: YES

Btw, thanks for the explanation. I think you meant (?<[!=]...)

My final pattern:
>>> re.findall(r'(?<!def)[\W]+([.\w]+)\(', "def bla(): [ryawry.aet().blu()]")
['ryawry.aet', 'blu']
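
One caveat with this pattern, as a rough sketch: (?<!def) only inspects the
three characters immediately before the match, so a name that merely ends in
"def" gets excluded as well. A word-bounded lookbehind, (?<!\bdef), is one
possible refinement. The helper name called_names and the "undef" sample are
made up for illustration and are not part of the original report.

```python
import re

# Final pattern from the comment above.
ORIGINAL = re.compile(r'(?<!def)[\W]+([.\w]+)\(')
# Assumed refinement: \b is zero-width, so the lookbehind stays fixed-width.
BOUNDED = re.compile(r'(?<!\bdef)[\W]+([.\w]+)\(')


def called_names(pattern, source):
    """Hypothetical helper: return every captured name that precedes a '('."""
    return pattern.findall(source)


sample = "def bla(): [ryawry.aet().blu()]"
print(called_names(ORIGINAL, sample))         # ['ryawry.aet', 'blu']
print(called_names(BOUNDED, sample))          # ['ryawry.aet', 'blu']

# "undef" ends in "def" but is not the keyword.
print(called_names(ORIGINAL, "undef foo()"))  # []
print(called_names(BOUNDED, "undef foo()"))   # ['foo']
```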


----------------------------------------------------------------------

Comment By: Cees Timmerman (ctimmerman)
Date: 2006-11-17 10:50

Message:
Logged In: YES 
user_id=1646092
Originator: YES

I tried (^ because [^ works. (^ doesn't seem to do anything. To match ^
inside () you need to use (\^), anyway.
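
For reference, a small sketch of how ^ behaves in each position. Inside a
group it is still the start-of-string anchor, not a negation; only as the
first character of a [...] class does it negate. The sample string "2^10" is
invented for the example.

```python
import re

text = "bob: bad; suzy: good"

# Inside a group, ^ anchors at the start of the string.
anchored_miss = re.findall(r'(^bad)', text)   # text does not start with "bad"
anchored_hit = re.findall(r'(^bob)', text)    # text does start with "bob"

# Escaped, it matches a literal caret.
literal = re.findall(r'(\^)', "2^10")

# At the start of a character class, it negates the set of characters.
negated = re.findall(r'[^bo]', "bob: b")

print(anchored_miss)  # []
print(anchored_hit)   # ['bob']
print(literal)        # ['^']
print(negated)        # [':', ' ']
```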

----------------------------------------------------------------------

Comment By: Georg Brandl (gbrandl)
Date: 2006-11-15 17:20

Message:
Logged In: YES 
user_id=849994
Originator: NO

What you want is
>>> re.findall(r'(?<!def)\s(\S+)\(', "def bla(): dof blu()")

\b doesn't match a space; it is a zero-width assertion that matches at a
word boundary. And to do a lookbehind assertion, use (?<[!]...)

Why on earth do you expect that (^...) works?
[^(bad)] is something entirely different, it's a character class excluding
(, b, a, d and ).
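
A short sketch of the difference, reusing the strings from the report. The
lookahead (?!def) only rejects positions where "def" itself starts, so the
match can still begin at "bla"; the lookbehind rejects positions that come
right after "def". The last string, "bad (dab) x", is made up to show the
character class.

```python
import re

text = "def bla(): dof blu()"

# Negative lookahead: checks what FOLLOWS the current position.
lookahead = re.findall(r'(?!def)\b(\S+)\(', text)

# Negative lookbehind: checks what PRECEDES the current position.
lookbehind = re.findall(r'(?<!def)\s(\S+)\(', text)

# [^(bad)] is the same class as [^abd()]: any single character
# other than '(', 'b', 'a', 'd' or ')'.
char_class = re.findall(r'[^(bad)]', "bad (dab) x")

print(lookahead)   # ['bla', 'blu']
print(lookbehind)  # ['blu']
print(char_class)  # [' ', ' ', 'x']
```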


----------------------------------------------------------------------
