Indexing HTML!

David Mertz mertz at gnosis.cx
Sat Dec 28 17:42:51 EST 2002


|>>>>> "John" == John  <johng2001 at rediffmail.com> writes:
|John> I have been struggling for the past few days to get this done. I
|John> have a few small document (HTML) collections, each of which will
|John> be exposed on an independent simplistic intranet site (Apache on
|John> Linux). I need some indexing solutions.

Martin Christensen <knightsofspamalot-factotum at gvdnet.dk> wrote:
|The full-text indexer is prepared to handle different 'text munchers'
|that apply different filters to texts to prepare them for processing.

Take a look at this article:

    Developing a Full-Text Indexer in Python
    http://gnosis.cx/publish/programming/charming_python_15.html

The associated code is now part of Gnosis Utilities:

    http://gnosis.cx/download/Gnosis_Utils-current.tar.gz
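
For a feel of the 'text muncher' idea quoted above, here is a minimal
sketch. It is NOT the Gnosis Utilities API; every name in it
(HTMLTextExtractor, html_to_text, build_index, search) is made up for
illustration, and it only assumes a modern Python standard library. A
filter strips the markup, and the words that survive go into a plain
in-memory inverted index:

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class HTMLTextExtractor(HTMLParser):
    """Hypothetical 'text muncher': keeps text content, drops the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Only collect text that is not inside <script> or <style>.
        if not self._skip:
            self.chunks.append(data)


def html_to_text(html):
    """Filter an HTML string down to its visible text."""
    parser = HTMLTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def build_index(docs, muncher=html_to_text):
    """Map each lowercased word to the set of document names containing it.

    docs is a dict of {name: raw_content}; muncher is any filter that
    turns raw content into plain text.
    """
    index = defaultdict(set)
    for name, raw in docs.items():
        for word in re.findall(r"[a-z0-9]+", muncher(raw).lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Passing a different muncher (say, one that just returns its input for
plain-text files) lets the same index handle other document types. For
collections of any real size you would want the index persisted to disk
rather than rebuilt in memory on each run.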

--
 mertz@  _/_/_/_/ THIS MESSAGE WAS BROUGHT TO YOU BY: \_\_\_\_    n o
gnosis  _/_/             Postmodern Enterprises            \_\_
.cx    _/_/                                                 \_\_  d o
      _/_/_/ IN A WORLD W/O WALLS, THERE WOULD BE NO GATES \_\_\_ z e




