ANN: Gmatch, a tiny matcher of google-like queries

denis-bz-gg at t-online.de
Mon Feb 2 15:31:43 CET 2009


Gmatch is a tiny matcher of google-like query patterns, called
"goopat"s:

    gm = Gmatch( "color: blue - sky  size: < 100" )
    if gm.match( record ): ...
    matching_records = [rec for rec in records if gm.match( rec )]

A "record" is a dict, or namedtuple, or Dotdict,
    or anything with record.get( key ) -> value.
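
As a rough illustration of "anything with record.get( key )", here is a
minimal sketch of a home-made record type. The class name Row and the
sample fields are invented for this example and are not part of Gmatch:

```python
class Row:
    """Hypothetical record type: keeps fields as attributes, exposes get()."""

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def get(self, key, default=None):
        # Same contract as dict.get: return the field, or default if missing.
        return self.__dict__.get(key, default)


records = [
    {"color": "sky blue", "size": 42},   # a plain dict already has get()
    Row(color="navy blue", size=120),    # so does the hypothetical Row
]

for rec in records:
    print(rec.get("color"), rec.get("size"), rec.get("missing"))
```

Both kinds of record can sit in the same list, since a matcher only
needs to call get().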

A "goopat" pattern is made up of the following elements, here with
examples:

    color:      a key or field name, followed by a ":" colon

    blu         words made of letters and numbers match anywhere in a field,
                case-insensitive: blu matches Blue, Blur, sublunar

    blue sky    match all the words, anywhere in a field, in any order:
                    bluesky sky-blue "frisky ... blues"

    blue - sky   blue but not sky -- put blanks around the "-"
    blue green - sky cloud   blue and green but not sky and not cloud
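
The word rules above (every wanted word present, no unwanted word
present, case ignored) can be sketched in a few lines. This is only an
illustration of the semantics as described, not Gmatch's actual code;
the function name words_match and its arguments are made up:

```python
def words_match(field, want, avoid=()):
    """True if every word in `want`, and no word in `avoid`, occurs
    anywhere in `field` as a case-insensitive substring."""
    text = field.lower()
    has_all = all(w.lower() in text for w in want)
    has_bad = any(w.lower() in text for w in avoid)
    return has_all and not has_bad


print(words_match("Sublunar", ["blu"]))                  # True
print(words_match("sky-blue", ["blue", "sky"]))          # True
print(words_match("sky-blue", ["blue"], avoid=["sky"]))  # False
```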

                Double quotes are used for:
    "key.word", "http://a-b-c.net"  names containing special characters
    "sky blue"  sky, any amount of white space, then blue
    " sky blue "  an exact word or phrase, not e.g. "frisky blues"

    r"regular expression"
            see http://docs.python.org/library/re.html .
            Gmatch compiles them with re.I (case-insensitive)
            and re.X (ignore white space except "\ " and "[ ]").
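
To show what re.I and re.X do to a pattern, here is a small sketch
using only the standard re module. The patterns are made-up examples,
and how Gmatch applies the compiled pattern to a field is not shown:

```python
import re

FLAGS = re.I | re.X   # the two flags named above

# Under re.X, blanks in the pattern are ignored, so this is really "skyblue".
plain = re.compile(r"sky blue", FLAGS)
print(bool(plain.search("SkyBlue")))     # True
print(bool(plain.search("sky blue")))    # False

# To match a real blank, write "[ ]" or "\ " (or \s for any white space).
spaced = re.compile(r"sky [ ] blue", FLAGS)
print(bool(spaced.search("Sky Blue")))   # True
print(bool(spaced.search("skyblue")))    # False
```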

            Comparing:
    size: < <= > >= == !=  compare record.get( "size" ) to a number or string
    size: < 100     here size must be a number, or a string like "12.3"
    size: >= "Large"  here it must be a string (beware: "Small" > "Large").
    size: >= 100 < 200  matches sizes from 100 up to, but not including, 200.
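
The comparison rules can be sketched as follows. The helper compare and
the OPS table are invented for this illustration. The assumption here is
that a numeric target converts the record value with float(), and that
string targets use Python's ordinary, case-sensitive string ordering;
Gmatch itself may differ in the details:

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def compare(value, op, target):
    """Hypothetical helper: apply `op` between a record value and a target."""
    if isinstance(target, (int, float)):
        value = float(value)   # accepts numbers and strings like "12.3"
    return OPS[op](value, target)


print(compare("12.3", "<", 100))        # True
print(compare("Small", ">", "Large"))   # True: "S" sorts after "L"

size = 150
print(compare(size, ">=", 100) and compare(size, "<", 200))   # True
```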


Notes, ramblings, corner cases
    ...

    (REs are pretty fast, ~ 20 Mbytes/sec on my Mac PPC with Python 2.5.1.)


Download files, 2feb --

    246  gmatch.py   http://denis.pastebin.com/m6609dea0
    154  gmparse.py  http://denis.pastebin.com/m308ba33b


More information about the Python-announce-list mailing list