ANN: Gmatch, a tiny matcher of google-like queries
denis-bz-gg at t-online.de
Mon Feb 2 15:31:43 CET 2009
Gmatch is a tiny matcher of google-like query patterns, called
"goopat"s:

    gm = Gmatch( "color: blue - sky  size: < 100" )
    if gm.match( record ): ...
    matching_records = [rec for rec in records if gm.match( rec )]

A "record" is a dict, or namedtuple, or Dotdict,
or anything with record.get( key ) -> value.
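
For records that have attributes but no get method, a thin wrapper is enough. A minimal sketch: field_value and the Car record are hypothetical names made up for illustration, not part of gmatch.py.

```python
from collections import namedtuple

def field_value(record, key, default=None):
    """Look up key via record.get( key ) if there is one, else as an attribute.

    Hypothetical helper for illustration, not part of gmatch.py.
    """
    get = getattr(record, "get", None)
    if callable(get):
        value = get(key)
        return default if value is None else value
    return getattr(record, key, default)

Car = namedtuple("Car", "color size")   # made-up record type

print(field_value({"color": "Blue"}, "color"))
print(field_value(Car("Red", 120), "size"))
```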
A "goopat" pattern is made up of the following elements, here with
examples:

    color:                  a key or field name, followed by a ":" colon
    blu                     words made of letters and numbers match anywhere
                            in a field, case-insensitive:
                            blu matches Blue, Blur, sublunar
    blue sky                match all the words, anywhere in a field,
                            in any order:
                            bluesky, sky-blue, "frisky ... blues"
    blue - sky              blue but not sky -- put blanks around the "-"
    blue green - sky cloud  blue and green but not sky and not cloud

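
The word rules above can be sketched in a few lines. This is only an illustration of the semantics as described, assuming plain substring tests on one field; words_match is a hypothetical name and gmatch.py may well do it differently.

```python
def words_match(pattern, text):
    """True if every word before " - " occurs in text and no word after it does.

    Hypothetical sketch: case-insensitive substring tests on a single field.
    """
    wanted, _, unwanted = pattern.partition(" - ")
    text = text.lower()
    return (all(w in text for w in wanted.lower().split())
            and not any(w in text for w in unwanted.lower().split()))

for text in ["bluesky", "sky-blue", "frisky ... blues", "Blue sea"]:
    print(text, words_match("blue sky", text), words_match("blue - sky", text))
```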
Double quotes are used for:

    "key.word", "http://a-b-c.net"
                            names containing special characters
    "sky blue"              sky, any amount of white space, then blue
    " sky blue "            an exact word or phrase, not e.g. "frisky blues"
    r"regular expression"   see http://docs.python.org/library/re.html .
                            Gmatch compiles them with re.I, case-insensitive,
                            and re.X, ignore white space except "\ " and "[ ]"

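
One way to picture the quoting rules is as a translation to regular expressions. A rough sketch under assumptions: phrase_regex is a hypothetical helper, "any amount of white space" is taken to mean at least one blank, and an "exact" phrase is approximated with \b word boundaries. The last lines show what re.I and re.X do to an r"..." pattern.

```python
import re

def phrase_regex(quoted):
    """Compile the text found between double quotes (hypothetical helper).

    Blanks just inside the quotes, as in " sky blue ", ask for an exact phrase.
    """
    exact = quoted != quoted.strip()
    body = r"\s+".join(re.escape(word) for word in quoted.split())
    if exact:
        body = r"\b" + body + r"\b"
    return re.compile(body, re.I)

loose = phrase_regex("sky blue")      # also matches inside "frisky   blues"
exact = phrase_regex(" sky blue ")    # whole words only

# r"..." patterns: with re.X, blanks in the pattern are ignored
# unless written as "\ " or "[ ]".
verbose = re.compile(r"sky \s+ blue", re.I | re.X)
escaped = re.compile(r"sky\ blue", re.I | re.X)
```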
Comparing:

    size: < <= > >= == !=   compare record.get( "size" ) to a number or string
    size: < 100             here size must be a number, or a string like "12.3"
    size: >= "Large"        here it must be a string
                            (beware: "Small" > "Large")
    size: >= 100 < 200      matches sizes from 100 up to, but not including, 200

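
The comparisons can be sketched with the operator module. Again a sketch, not the code in gmatch.py: compare, OPS and TERM are hypothetical names, and returning False when a value will not convert to a number is an assumption.

```python
import operator
import re

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

# An operator, then either a "quoted string" or a bare token such as 100.
TERM = re.compile(r'(<=|>=|==|!=|<|>)\s*("[^"]*"|\S+)')

def compare(value, tests):
    """True if value passes every test in e.g. '>= 100 < 200' (hypothetical)."""
    for op, operand in TERM.findall(tests):
        if operand.startswith('"'):
            # Quoted operand: compare as strings.
            left, right = str(value), operand.strip('"')
        else:
            # Bare operand: compare as numbers; "12.3" converts, "Large" does not.
            try:
                left, right = float(value), float(operand)
            except (TypeError, ValueError):
                return False
        if not OPS[op](left, right):
            return False
    return True

print(compare(150, ">= 100 < 200"), compare(200, ">= 100 < 200"))
print(compare("Small", '>= "Large"'))   # string order, hence the "beware"
```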
Notes, ramblings, corner cases
...
(REs are pretty fast, about 20 Mbytes/sec on my Mac PPC with Python 2.5.1.)
Download files, 2feb --
246 gmatch.py http://denis.pastebin.com/m6609dea0
154 gmparse.py http://denis.pastebin.com/m308ba33b