[Mailman-Users] Using htdig with private archives?

Richard Barrett R.Barrett at ftel.co.uk
Mon Mar 12 14:12:31 CET 2001


>Does anyone know if/how htdig can be used to index and search private
>list archives?
>Thanks.
>
>. . .  .  .   .   .    .    .     .     .
>Michael Dunston

Yes, it is possible: the trick is maintaining privacy in a
user-friendly manner. The principle I adopted was that the search
facility should neither provide access to information nor "tease" the
user by offering, via returned search results, access which would be
denied through the normal archive access mechanism. By this principle
the use of htdig should not compromise archive privacy in any way nor
annoy the user by returning links which then fail for security
reasons.

I took these requirements to mean:

1. avoid offering to search a list archive which is private unless 
the user is known at the search request time to be allowed access to 
the archive.

2.  avoid returning hypertext links in search results that point to
information for which access will be denied if the user clicks the
link.

There are two patches posted that provide integration of htdig with 
Mailman built-in pipermail archiving that deal with these 
requirements.

http://sourceforge.net/tracker/index.php?func=detail&aid=402422&group_id=103&atid=300103

http://sourceforge.net/tracker/index.php?func=detail&aid=402423&group_id=103&atid=300103

With these patches, the solution adopted is:

1. the search form for a list is embedded in a list's archive content
page. In the case of a private archive Mailman's regular
authorisation scheme is applied, so the user only gets to this page,
and hence the search form, if they are authorised for the list, and
the appropriate authorisation cookie for the list is set up in the
process. This means that only one list will be searched at a time,
which may or may not concern you.

2. the hypertext links returned by the htdig search use an additional 
Mailman cgi script called htdig, which is based on the private cgi 
script that mediates access to private archives in the standard 
Mailman release. The htdig script mediates access by checking whether 
an access through it is to a private archive and if so it ensures 
that the authorisation cookie is being provided for that list by the 
user's browser. If not, the access is refused. This has the advantage 
that even if an unauthorised user gets search results by some means, 
the security breach is limited. Also, the links indexed by htdig are 
homogeneous, with both public and private archive accesses going via 
the htdig script, so that if a list archive is changed from private 
to public or vice versa the htdig indices remain unchanged while 
security requirements are still met.
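
As an illustration only, the access decision described in point 2 can
be sketched as below. This is not code from the patches: the function
names, the cookie naming scheme and the way expected cookie values are
looked up are all hypothetical, chosen just to show the shape of the
check (public archives pass through, private ones need the list's
authorisation cookie).

```python
# Hypothetical sketch of the check made by a mediating CGI script.
# None of these names come from Mailman or from the posted patches.

def cookie_name_for(list_name):
    """Assumed naming scheme for a per-list authorisation cookie."""
    return "auth-" + list_name


def may_serve(list_name, browser_cookies, private_lists, expected_cookies):
    """Return True if the archive page may be returned to the browser.

    list_name        -- list whose archive page was requested
    browser_cookies  -- dict of cookie name -> value sent by the browser
    private_lists    -- set of list names whose archives are private
    expected_cookies -- dict of list name -> cookie value that proves
                        the user authenticated for that list (assumed)
    """
    if list_name not in private_lists:
        # Public archive: no authorisation needed, same URL form.
        return True
    expected = expected_cookies.get(list_name)
    presented = browser_cookies.get(cookie_name_for(list_name))
    # Private archive: refuse unless the right cookie is presented.
    return expected is not None and presented == expected
```

Because the decision is taken at request time, a list can be switched
between private and public without the indexed links changing; only
the contents of private_lists in this sketch would differ.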

The patch does all the stuff necessary to make this Mailman/htdig
integration work, including a cron script to update htdig indices
regularly, automatic generation of htdig configuration files for
indexing each list's archives etc.
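
To give a rough idea of what a generated per-list htdig configuration
might contain, here is a minimal sketch. It is not taken from the
patches: the helper name, the URL layout and the directory layout are
assumptions. Only the attribute names (database_dir, start_url,
limit_urls_to) are standard htdig configuration attributes.

```python
# Hypothetical sketch: build the text of a per-list htdig config file.
# The URL and directory layout below are assumptions for illustration.

def htdig_conf_for_list(list_name, base_url, db_root):
    """Return htdig configuration text for one list's archive.

    list_name -- name of the mailing list
    base_url  -- assumed URL prefix under which archives are reached
    db_root   -- assumed directory holding one index database per list
    """
    archive_url = base_url.rstrip("/") + "/" + list_name + "/"
    lines = [
        # Where htdig keeps the index databases for this list.
        "database_dir: " + db_root.rstrip("/") + "/" + list_name,
        # Where indexing starts.
        "start_url: " + archive_url,
        # Keep the crawl inside this one list's archive.
        "limit_urls_to: " + archive_url,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(htdig_conf_for_list(
        "test-list",
        "http://lists.example.com/mailman/htdig",  # made-up host and path
        "/var/lib/htdig",                          # made-up directory
    ))
```

A cron job could then run htdig against each such file in turn, so
each list keeps its own index and is searched separately.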

The posted patches can be applied to 2.0.1 and 2.0.2, and are based 
on patches I had posted for earlier beta releases of 2.0. I found a 
couple of glitches with the most recent patches. I've got modified 
versions of them which I can let you have if you want them.




