Issue with regular expressions
ptmcg at austin.rr.com
Tue Apr 29 16:20:44 CEST 2008
On Apr 29, 8:46 am, Julien <jpha... at gmail.com> wrote:
> I'd like to select terms in a string, so I can then do a search in my
> query = ' " some words" with and "without quotes " '
> p = re.compile(magic_regular_expression)  # <--- the magic happens
> m = p.match(query)
> I'd like m.groups() to return:
> ('some words', 'with', 'and', 'without quotes')
> Is that achievable with a single regular expression, and if so, what
> would it be?
I dabbled with re's for a few minutes trying to get your solution,
then punted and used pyparsing instead. Pyparsing will run slower
than re, but many people find it much easier to work with readable
class names and instances rather than re's typoglyphics:
from pyparsing import (OneOrMore, Word, printables, dblQuotedString,
                       removeQuotes)

# when a quoted string is found, remove the quotes,
# then strip whitespace from the contents
dblQuotedString.setParseAction(removeQuotes, lambda t: t[0].strip())

# define terms to be found in query string
term = dblQuotedString | Word(printables)
query_terms = OneOrMore(term)

# parse query string to extract terms
query = ' " some words" with and "without quotes " '
print(tuple(query_terms.parseString(query)))

Prints:
('some words', 'with', 'and', 'without quotes')
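
On the original question of a single regular expression: m.groups() on one match cannot return a variable number of terms, since the group count is fixed by the pattern. A plain-re approach probably wants findall with an alternation instead. What follows is only a minimal sketch under that assumption; the names TERM_RE and split_terms are made up for illustration, and it does not handle escaped quotes inside a quoted phrase.

```python
import re

# Hypothetical pattern: either a double-quoted run (group 1)
# or a run of non-whitespace characters (group 2).
TERM_RE = re.compile(r'"([^"]*)"|(\S+)')


def split_terms(query):
    """Return quoted phrases (whitespace-stripped) and bare words, in order."""
    terms = []
    for quoted, bare in TERM_RE.findall(query):
        # Exactly one of the two groups is non-empty for a useful match.
        term = quoted.strip() if quoted else bare
        if term:  # skip empty quoted strings such as ""
            terms.append(term)
    return tuple(terms)


query = ' " some words" with and "without quotes " '
print(split_terms(query))
```

The quoted alternative is listed first so that a well-formed quoted phrase wins over the bare-word branch; an unbalanced quote simply falls through and is treated as part of a bare word.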
The pyparsing wiki is at http://pyparsing.wikispaces.com. You'll find
an examples page that includes a search query parser, and pointers to
a number of online documentation and presentation sources.