DocIndexer now handles Unicode (the previous release was only really
comfortable with ASCII). A full list of changes is in the CHANGELOG.

What is it?
-----------
DocIndexer is a document indexer toolkit that uses the PyLucene search
engine for indexing and searching document files. DocIndexer includes
command-line utilities, Python index and search classes plus a Win32
COM server that can be used to integrate indexing and searching into
application software. The current version has parser support for
Microsoft Word, HTML, PDF and plain text documents.

Runtime Requisites
------------------
Win32: None (compiled binary distribution).
Linux: Python 2.5, PyLucene 2, antiword and poppler-utils.

License
-------
MIT

URLs
----
Homepage: http://www.methods.co.nz/docindexer/
SourceForge: http://sourceforge.net/projects/docindexer/

Cheers, Stuart
---
Stuart Rackham