ANN: NUCULAR B3 Full text indexing (now on Win32 too)

Paul Rubin http
Fri Feb 22 23:31:06 CET 2008

Aaron Watters <aaron.watters at> writes:
> [apologies to the list: I would have done this offline,
> but I can't figure out Paul's email address.]
> 1) Paul please forward your email address

Will send it privately.  I don't have a public email address any more
(death to spam!!!).  My general purpose online contact point is which currently has an expired certificate that
I'll get around to renewing someday.  Meanwhile you have to click
"accept" to connect using the expired cert.

> 3) Since you seem to know about these things: I was thinking
> of adding an optional feature to Nucular which would allow
> a look-up like "given a word find all attributes that contain
> that word anywhere and give a count of the number of times it
> is found in that attribute as well as the entry id for an example
> instance (arbitrarily chosen)".  I was thinking about calling
> this "inverted faceting", but you probably know a
> better/standard name, yes?  What is it please?  Thanks!
> Answers from anyone else welcomed also.
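
For concreteness, the lookup described above might be sketched roughly
as follows.  This is only an illustration, not Nucular's actual API:
the names build_word_attribute_index and attributes_for_word, the
entries layout (entry id -> {attribute: text}), and the whitespace
tokenization are all assumptions made for the example.

```python
from collections import defaultdict


def build_word_attribute_index(entries):
    """Build word -> attribute -> [count, example_entry_id].

    `entries` is assumed to be a dict of entry_id -> {attribute: text}.
    Words are lowercased and split on whitespace (a simplification).
    """
    index = defaultdict(dict)
    for entry_id, attributes in entries.items():
        for attribute, text in attributes.items():
            for word in text.lower().split():
                # The first entry seen for (word, attribute) becomes
                # the arbitrarily chosen example instance.
                slot = index[word].setdefault(attribute, [0, entry_id])
                slot[0] += 1
    return index


def attributes_for_word(index, word):
    """Return {attribute: (count, example_entry_id)} for `word`."""
    hits = index.get(word.lower(), {})
    return {attr: (count, example) for attr, (count, example) in hits.items()}
```

Calling attributes_for_word(index, "python") then returns, for each
attribute containing the word, the total number of occurrences and one
example entry id.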

In Solr this is called the DisMax (disjunction maximum) handler, I
think.  I tried it and it doesn't work very well, and ended up using a
script written by a co-worker, that expands such queries to more
complex queries that put user-supplied weights on each field.  It is a
somewhat messy problem.  Otis Gospodnetic's book "Lucene in Action"
talks about it some, I believe.  Manning and Schütze are working on a
new book that discusses fancier
methods.  I think these are worth looking into, but I haven't had the
bandwidth to spend time on it so far.
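
To give a rough idea of the kind of query expansion meant above, here
is a minimal sketch.  It is not the co-worker's script; expand_query
and its arguments are hypothetical names, and it assumes the standard
Lucene query syntax where field:term^boost weights a clause.  It does
no escaping of special characters in the terms.

```python
def expand_query(terms, field_weights):
    """Expand bare terms into a Lucene-style query with per-field boosts.

    `terms` is a list of plain words; `field_weights` maps a field name
    to its user-supplied weight.  Each term may match in any field
    (OR), and every term must match somewhere (AND).
    """
    clauses = []
    for term in terms:
        per_field = " OR ".join(
            f"{field}:{term}^{weight}"
            for field, weight in field_weights.items()
        )
        clauses.append(f"({per_field})")
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(expand_query(["full", "text"], {"title": 2.0, "body": 1.0}))
```

The weights are where the mess comes in: they have to be tuned per
collection, which is why the fancier methods seem worth a look.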
