[Mailman-Users] Using htdig with private archives?
Richard Barrett
R.Barrett at ftel.co.uk
Mon Mar 12 14:12:31 CET 2001
>Does anyone know if/how htdig can be used to index and search private
>list archives?
>Thanks.
>
>. . . . . . . . . . .
>Michael Dunston
Yes it is possible: the trick is maintaining privacy in a user
friendly manner. The principle I adopted was that the search facility
should not provide access to information nor "tease" the user by
offering access via returned search results which would be denied
through the normal archive access mechanism. By this principle the
use of htdig should not compromise archive privacy in any way nor
annoy the user by returning links which then fail for security
reasons.
I took these requirements to mean:
1. avoid offering to search a list archive which is private unless
the user is known at the search request time to be allowed access to
the archive.
2. avoid returning hypertext links in search results that point to
information for which access will be denied if the user clicks the
link.
There are two patches posted that provide integration of htdig with
Mailman built-in pipermail archiving that deal with these
requirements.
http://sourceforge.net/tracker/index.php?func=detail&aid=402422&group_id=103&atid=300103
http://sourceforge.net/tracker/index.php?func=detail&aid=402423&group_id=103&atid=300103
With these patches, the solution adopted is:
1. the search form for a list is embedded in a list's archive content
page. In the case of a private archive Mailman's regular
authorisation scheme is applied, so the user only gets to this page,
and hence the search form, if they are authorised for the list; the
appropriate authorisation cookie for the list is set up in the
process. This means that only one list will be searched at a time,
which may or may not concern you.
2. the hypertext links returned by the htdig search use an additional
Mailman cgi script called htdig, which is based on the private cgi
script that mediates access to private archives in the standard
Mailman release. The htdig script mediates access by checking whether
an access through it is to a private archive and if so it ensures
that the authorisation cookie is being provided for that list by the
user's browser. If not, the access is refused. This has the advantage
that even if an unauthorised user gets search results by some means,
the security breach is limited. Also, the links indexed by htdig are
homogeneous, with both public and private archive accesses going via
the htdig script, so that if a list archive is changed from private
to public or vice versa the htdig indices remain unchanged while
security requirements are still met.
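As a rough illustration of the check described in point 2, here is a
minimal sketch in present-day Python. It is not the code from the
patch: the function names, the cookie naming scheme
(archive_auth_<listname>) and the idea of comparing the cookie against
a single expected token are all assumptions made for the example.

```python
import hmac
from http.cookies import SimpleCookie


def cookie_name(listname):
    # Hypothetical naming scheme; the real patch may name cookies differently.
    return "archive_auth_" + listname


def is_access_allowed(listname, archive_is_private, cookie_header, expected_token):
    """Decide whether a request routed through the mediating script may proceed.

    Public archives are always served. Private archives are served only if
    the browser sent the authorisation cookie for that particular list.
    """
    if not archive_is_private:
        return True

    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    morsel = cookies.get(cookie_name(listname))
    if morsel is None:
        # No cookie for this list: refuse, even if the URL came from a search result.
        return False

    # Constant-time comparison of the presented value with the expected one.
    return hmac.compare_digest(
        morsel.value.encode("utf-8"), expected_token.encode("utf-8")
    )
```

Because both public and private archives pass through the same
function, the indexed URLs need not change when a list's archive is
switched between private and public; only the archive_is_private flag
differs.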
The patch does everything necessary to make this Mailman/htdig
integration work, including a cron script to update htdig indices
regularly, automatic generation of htdig configuration files for
indexing each list's archives, etc.
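To give a feel for what generating a per-list htdig configuration
might involve, here is a small sketch in present-day Python. It is not
taken from the patch. The function name, the directory layout and the
URL scheme (every archive URL going through a mediating script under
/mailman/htdig/) are assumptions for illustration; database_dir,
start_url and limit_urls_to are htdig configuration attributes, but
check the htdig documentation for the full set a real setup needs.

```python
import posixpath


def render_htdig_conf(listname, base_url, db_root):
    """Return the text of a minimal htdig configuration for one list.

    base_url is the (hypothetical) URL prefix of the mediating CGI script;
    db_root is the (hypothetical) directory holding one index per list.
    """
    start_url = "%s/%s/" % (base_url.rstrip("/"), listname)
    lines = [
        # Keep each list's index databases in their own directory.
        "database_dir: %s" % posixpath.join(db_root, listname),
        # Begin crawling at the list's archive, reached via the mediating script.
        "start_url: %s" % start_url,
        # Do not follow links outside this list's archive.
        "limit_urls_to: %s" % start_url,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_htdig_conf("staff", "http://lists.example.com/mailman/htdig", "/var/lib/htdig"))
```

A cron job could then write one such file per list and run the
indexer against each in turn.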
The posted patches can be applied to 2.0.1 and 2.0.2, and are based
on patches I had posted for earlier beta releases of 2.0. I found a
couple of glitches with the most recent patches. I've got modified
versions of them which I can let you have if you want them.