a regexp riddle: re.search(r'
Yingjie Lan
lanyjie at yahoo.com
Thu Nov 25 04:44:44 EST 2010
- Previous message (by thread): a regexp riddle: re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c') =? ('a', 'bbb', 'c')
- Next message (by thread): a regexp riddle: re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c') =? ('a', 'bbb', 'c')
--- On Thu, 11/25/10, Phlip <phlip2005 at gmail.com> wrote:
> From: Phlip <phlip2005 at gmail.com>
> Subject: a regexp riddle: re.search(r'
> To: python-list at python.org
> Date: Thursday, November 25, 2010, 8:46 AM
> HypoNt:
>
> I need to turn a human-readable list into a list():
>
> print re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c').groups()
>
> That currently returns ('c',). I'm trying to match "any
> word \w+
> followed by a comma, or a final word preceded by and."
>
> The match returns 'a, bbb, and c', but the groups return
> ('bbb', 'c').
> What do I type for .groups() to also get the 'a'?
>
First of all, the 'bbb' corresponds to the first capturing
group and 'c' to the second. The 'a' is forgotten because
it was only the first match of the first group, and it was
then overwritten by the second match, 'bbb'.
Generally, a capturing group only remembers the last match.
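A minimal sketch of that behaviour, using the pattern and input from
the question. It assumes Python 3 syntax, and the variable names are
only illustrative:

```python
import re

# Pattern and input taken from the question.
pattern = r'(?:(\w+), |and (\w+))+'
text = 'whatever a, bbb, and c'

m = re.search(pattern, text)

# The overall match spans every repetition of the outer group...
whole = m.group(0)   # 'a, bbb, and c'

# ...but each capturing group keeps only its most recent capture,
# so 'a' has been overwritten by 'bbb' in group 1.
groups = m.groups()  # ('bbb', 'c')
```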
It also seems that your re may match this: 'and c',
which does not seem to be your intention.
So it may be more intuitively written as:
r'(?:(\w+), )+and (\w+)'
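As a rough check of the difference between the two patterns (Python 3
assumed; the names `original` and `stricter` are just labels for this
sketch):

```python
import re

original = r'(?:(\w+), |and (\w+))+'
stricter = r'(?:(\w+), )+and (\w+)'

# The original pattern accepts a bare 'and c' with no preceding items.
loose_groups = re.search(original, 'and c').groups()   # (None, 'c')

# The stricter pattern needs at least one 'word, ' item before 'and'.
no_match = re.search(stricter, 'and c')                 # None

# On the full input it matches the same span, but group 1 still
# holds only the last repeated capture.
m = re.search(stricter, 'whatever a, bbb, and c')
strict_whole = m.group(0)                               # 'a, bbb, and c'
strict_groups = m.groups()                              # ('bbb', 'c')
```

So the rewrite tightens what is accepted, but it does not by itself
recover the 'a'.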
I'm not sure how to get it done in one step,
but it would be easy to first get the whole
match, then process it with:
re.findall(r'(\w+)(?:,|$)', the_whole_match)
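Put together, the two steps might look like this. It is only a sketch
in Python 3 syntax, reusing the rewritten pattern above; `text` and
`items` are names made up for the example:

```python
import re

text = 'whatever a, bbb, and c'

# Step 1: locate the human-readable list and take the whole match.
m = re.search(r'(?:(\w+), )+and (\w+)', text)
the_whole_match = m.group(0)   # 'a, bbb, and c'

# Step 2: collect every word followed by a comma or by the end of
# the match. 'and' is followed by a space, so it is skipped.
items = re.findall(r'(\w+)(?:,|$)', the_whole_match)
# items == ['a', 'bbb', 'c']
```

Note that step 2 relies on the match ending exactly at the last word,
which is why it is run on the matched text rather than on the full
input string.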
cheers,
Yingjie