Trouble with regular expressions
John Machin
sjmachin at lexicon.net
Sat Feb 7 09:18:07 EST 2009
On Feb 7, 11:18 pm, LaundroMat <Laun... at gmail.com> wrote:
> Hi,
>
> I'm quite new to regular expressions, and I wonder if anyone here
> could help me out.
>
> I'm looking to split strings that ideally look like this: "Update: New
> item (Household)" into a group.
> This expression works ok: '^(Update:)?(.*)(\(.*\))$' - it returns
> ("Update", "New item", "(Household)")
>
> Some strings will look like this however: "Update: New item (item)
> (Household)". The expression above still does its job, as it returns
> ("Update", "New item (item)", "(Household)").
>
> It does not work however when there is no text in parentheses (eg
> "Update: new item"). How can I get the expression to return a tuple
> such as ("Update:", "new item", None)?
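For reference, here is a quick check of the pattern quoted in the question. It is a sketch in Python 3 syntax, and the variable names are illustrative, not from the original post. The last group, (\(.*\)), is mandatory, so a string with no parentheses cannot match at all:

```python
import re

# Pattern quoted in the question; the final "(...)" group is not optional.
original = re.compile(r'^(Update:)?(.*)(\(.*\))$')

with_parens = original.match("Update: New item (Household)")
without_parens = original.match("Update: new item")

print(with_parens.groups())
print(without_parens)  # None: the mandatory "(...)" group has nothing to match
```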
I don't see how it can be done without some post-matching adjustment.
Try this:
C:\junk>type mathieu.py
import re
tests = [
    "Update: New item (Household)",
    "Update: New item (item) (Household)",
    "Update: new item",
    "minimal",
    "parenthesis (plague) (has) (struck)",
    ]
regex = re.compile("""
    (Update:)?      # optional prefix
    \s*             # ignore whitespace
    ([^()]*)        # any non-parentheses stuff
    (\([^()]*\))?   # optional (blahblah)
    \s*             # ignore whitespace
    (\([^()]*\))?   # another optional (blahblah)
    $
    """, re.VERBOSE)
for i, test in enumerate(tests):
    print "Test #%d: %r" % (i, test)
    m = regex.match(test)
    if not m:
        print "No match"
    else:
        g = m.groups()
        print g
        if g[3] is not None:
            x = (g[0], g[1] + g[2], g[3])
        else:
            x = g[:3]
        print x
    print
C:\junk>mathieu.py
Test #0: 'Update: New item (Household)'
('Update:', 'New item ', '(Household)', None)
('Update:', 'New item ', '(Household)')

Test #1: 'Update: New item (item) (Household)'
('Update:', 'New item ', '(item)', '(Household)')
('Update:', 'New item (item)', '(Household)')

Test #2: 'Update: new item'
('Update:', 'new item', None, None)
('Update:', 'new item', None)

Test #3: 'minimal'
(None, 'minimal', None, None)
(None, 'minimal', None)

Test #4: 'parenthesis (plague) (has) (struck)'
No match
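
The same post-matching adjustment can also be wrapped in a function. This is a minimal sketch in Python 3 syntax. The names ITEM_RE and split_item, and the use of a raw string for the pattern, are illustrative choices and not part of the script above:

```python
import re

# Same pattern as in the script above, written as a raw string.
ITEM_RE = re.compile(r"""
    (Update:)?      # optional prefix
    \s*             # ignore whitespace
    ([^()]*)        # any non-parentheses stuff
    (\([^()]*\))?   # optional (blahblah)
    \s*             # ignore whitespace
    (\([^()]*\))?   # another optional (blahblah)
    $
    """, re.VERBOSE)


def split_item(text):
    """Return (prefix, description, category), or None if text does not match."""
    m = ITEM_RE.match(text)
    if m is None:
        return None
    prefix, body, first, second = m.groups()
    if second is not None:
        # Two parenthesised parts: the first one belongs to the description.
        return (prefix, body + first, second)
    return (prefix, body, first)


print(split_item("Update: New item (item) (Household)"))
print(split_item("Update: new item"))
```

As in the test run above, a string with more than two parenthesised parts does not match, so split_item returns None for it.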
HTH,
John