[Mailman-Users] Trouble getting htdig to work
r.barrett at openinfo.demon.co.uk
Sat Feb 8 19:23:00 CET 2003
At 16:53 08/02/2003, Paul Kleeberg wrote:
>I am attempting to get htdig to work on a RedHat 8.0 system with Apache
>2.0, htdig 3.2.0 and Mailman 2.1
>I installed the 4 patches (668685, 661138, 444879 & 444884) to Mailman 2.1
>to create the searchable archives for Mailman with htdig, and then
>reinstalled mailman. Created the link:
> ln -s /var/mailman/archives/htdig /etc/htdig-mailman
As long as htdig was configured with /etc as the default directory for its
configuration files, this should be OK.
>and in mm_cfg.py, to make this compatible with RH8 I added:
> HTDIG_RUNDIG_PATH = '/usr/bin/rundig'
If that's where the Red Hat RPM installed rundig, that's OK.
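
A quick way to confirm that the configured path is usable is sketched below
in Python, the language of mm_cfg.py. The helper name rundig_is_usable is
hypothetical; it is not part of Mailman or of the htdig patches. Run it as
the mailman user, since that is the user nightly_htdig runs rundig under.

```python
import os


def rundig_is_usable(path):
    """Return True if path is a regular file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # '/usr/bin/rundig' is the value quoted above; use whatever
    # HTDIG_RUNDIG_PATH is set to in your own mm_cfg.py.
    print(rundig_is_usable("/usr/bin/rundig"))
```
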
> USE_HTDIG = 1
>and then ran the indexing engine:
> /var/mailman/cron/nightly_htdig -v
>and I get:
> /usr/bin/rundig: line 48: 1104 Aborted $BINDIR/htnotify $opts
> htfuzzy: Unable to open word database /var/lib/htdig/db.words.db
>but I would think htfuzzy should look in:
Have you checked out the section under heading "htdig Permissions
Considerations" in the file INSTALL.htdig-mm which patch 444884 installs in
$build? Some of the htdig 'databases' generated by the components called by
rundig can be safely shared between lists while others need to be list
specific to avoid information leakage from one list's indexes into another's.
htdig Permissions Considerations
Python scripts added by this patch (nightly_htdig and its relatives) run
the htdig rundig script identified by HTDIG_RUNDIG_PATH to build search
indices for Mailman archives. Code added by this patch generates per
list htdig configuration files which are passed as a parameter to the
rundig script. These configuration files identify a list specific
directory ($prefix/archives/private/<listname>/htdig) in which list
specific data files generated by and used by htdig are to be placed.
However, the rundig script identified by HTDIG_RUNDIG_PATH may attempt
to generate some files in htdig's COMMON_DIR when it is first run by
nightly_htdig; the files concerned are likely to be root2word.db,
word2root.db, synonyms.db and possibly some others generated by htdig's
htfuzzy program. The standard rundig script generates these files
selectively if they do not already exist. Depending on how you have
installed htdig and how the rundig script is first run, there may be a
permissions problem when nightly_htdig executes rundig under the mailman
UID if it tries to generate these files.
Basically, you may have to change permissions on the htdig common directory.
For instance, on my internal test system I have the following setup:
mailman at mailman2:/opt/www/htdig> ls -l
drwxr-xr-x 2 root root 4096 Jan 13 16:28 bin
drwxrwxr-x 2 root mailman 4096 Jan 14 17:19 common
drwxr-xr-x 2 root root 4096 Jan 14 17:22 conf
drwxrwxr-x 2 root mailman 4096 Jan 14 17:19 db
mailman at mailman2:/opt/www/htdig> ls -l db
mailman at mailman2:/opt/www/htdig> ls -l common/
-rw-rw-r-- 1 root mailman 84 Jan 13 16:28 bad_words
-rw-rw-r-- 1 root mailman 923308 Jan 13 16:28 english.0
-rw-rw-r-- 1 root mailman 5756 Jan 13 16:28 english.aff
-rw-rw-r-- 1 root mailman 190 Jan 13 16:28 footer.html
-rw-rw-r-- 1 root mailman 877 Jan 13 16:28 header.html
-rw-rw-r-- 1 root mailman 194 Jan 13 16:28 long.html
-rw-rw-r-- 1 root mailman 1390 Jan 13 16:28 nomatch.html
-rw-rw-r-- 1 mailman mailman 2285568 Jan 14 17:19 root2word.db
-rw-rw-r-- 1 root mailman 67 Jan 13 16:28 short.html
-rw-rw-r-- 1 root mailman 14481 Jan 13 16:28 synonyms
-rw-rw-r-- 1 mailman mailman 90112 Jan 14 17:19 synonyms.db
-rw-rw-r-- 1 root mailman 1261 Jan 13 16:28 syntax.html
-rw-rw-r-- 1 mailman mailman 3022848 Jan 14 17:19 word2root.db
-rw-rw-r-- 1 root mailman 1087 Jan 13 16:28 wrapper.html
mailman at mailman2:/opt/www/htdig>
As you can see 3 of the files in common were written by the mailman userid
when nightly_htdig first ran rundig. You will have to tweak things to suit
your htdig installation setup.
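
To see whether a given htdig directory is group-writable, as common and db
are in the listing above, something like the following Python sketch could
be used. The function name group_can_write is hypothetical, and the two
paths are simply those from the listing above; substitute the COMMON_DIR
and database directory of your own htdig installation.

```python
import os
import stat


def group_can_write(path):
    """Return (mode_string, group_writable) for path."""
    mode = os.stat(path).st_mode
    return stat.filemode(mode), bool(mode & stat.S_IWGRP)


if __name__ == "__main__":
    # Paths taken from the example listing; they are assumptions for any
    # other system.
    for d in ("/opt/www/htdig/common", "/opt/www/htdig/db"):
        if os.path.isdir(d):
            print(d, *group_can_write(d))
```

This only reports the mode bits. The directory's group must also be one that
the mailman user belongs to, which the sketch does not check.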
>In addition, when I look at the source for the search form on an archive
>page I see <form method="post" action="/cgi-bin/htsearch">. But on my
>system, htsearch exists in /usr/bin.
The htsearch program has to be available to the web server in a directory
from which the server is prepared to run cgi programs.
Remember that execution of htdig's components is in two parts. The indexing
of the material is typically done by a cron script running htdig components
under some user id from whatever was set up as htdig's bin directory.
The 'search' operation, i.e. looking up stuff in the search indexes using
htsearch, is run as a cgi-bin script under the auspices of the User/Group
your web server is configured to run as.
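I think it is usual for the htdig installation process to involve copying
htsearch into the web server's cgi-bin directory, which is often found under
the directory set by the ServerRoot directive in your web server's httpd.conf
file; the ScriptAlias directive for /cgi-bin/ in that file gives the exact
location.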
Personally, I build htdig from source and in any event run SuSE Linux, so I
do not know how the Red Hat RPMs have been configured.
If all else fails, as root, copy htsearch into the web server's cgi-bin
directory and make sure that it is readable and executable but not writable
by owner, group and other.
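
That last-resort step can be sketched in Python as follows. The function
name install_htsearch is hypothetical, and the cgi-bin location is whatever
your web server actually uses; nothing here is specific to the Red Hat RPMs.

```python
import os
import shutil


def install_htsearch(src, cgi_bin_dir):
    """Copy src into cgi_bin_dir with mode r-xr-xr-x; return the new path."""
    dest = os.path.join(cgi_bin_dir, os.path.basename(src))
    shutil.copyfile(src, dest)  # copies the contents only, not the mode
    os.chmod(dest, 0o555)       # read + execute for owner, group and other
    return dest
```

Run as root, a call such as install_htsearch('/usr/bin/htsearch', cgi_dir),
where cgi_dir is your server's cgi-bin directory, should leave a copy that
the archive page's form action of /cgi-bin/htsearch can reach.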
>What am I overlooking?
>paul at fpen.org
Let me know if you continue to have problems.