[Mailman-Users] Using Google Search Appliance With Mailman Archives?

Jon Forrest nobozo at gmail.com
Thu Oct 4 23:02:22 CEST 2012

I'm in the process of migrating from ezmlm to Mailman.
So far everything is working great. I've even been able
to migrate the ezmlm list archives to Mailman so that
I can see the messages via the Mailman web interface.

One of the goals of this migration is to be able to
use a Google Search Appliance to search the list
archives. What I've found is that the archive for
each list is in /var/lib/mailman/archives/private/listname,
and in this directory are

	1) a directory for each month containing the messages
	submitted during the month in HTML format.
	2) a file for each month containing all the messages
	for the month in text format concatenated together.

I'm trying to figure out the best way to search the archive
with a GSA. I'm worried that if I index #1, I'll find what
I want, but the results will be HTML pages, which won't be
very easy to read. If I index #2, I'll find what I want, but
each hit will be the whole monthly file, which will also
contain a lot of messages I'm not looking for.
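
One possible way around the problem with #2 would be to split
each monthly text file into one document per message before
indexing. The following is only a rough sketch under stated
assumptions: it assumes the monthly file is mbox-style, with
each message starting on a separator line beginning with
"From ", and the function name split_monthly_archive is made
up for illustration. A message body containing an unescaped
line that starts with "From " would be split incorrectly.

```python
import email
from email.message import Message
from typing import List


def split_monthly_archive(text: str) -> List[Message]:
    """Split mbox-style text into one parsed message per "From " line.

    Assumption: every message begins with a separator line that starts
    with "From " (note the trailing space, so "From:" headers are not
    treated as separators).
    """
    messages = []
    current = None  # lines of the message being collected, if any
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            # A separator starts a new message; flush the previous one.
            if current is not None:
                messages.append(email.message_from_string("".join(current)))
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append(email.message_from_string("".join(current)))
    return messages


if __name__ == "__main__":
    # Hypothetical sample data standing in for a monthly archive file.
    sample = (
        "From a at example.com  Thu Oct  4 23:02:22 2012\n"
        "From: a at example.com (A)\n"
        "Subject: first\n"
        "\n"
        "body one\n"
        "\n"
        "From b at example.com  Fri Oct  5 10:00:00 2012\n"
        "From: b at example.com (B)\n"
        "Subject: second\n"
        "\n"
        "body two\n"
    )
    for msg in split_monthly_archive(sample):
        print(msg["Subject"])
```

Each returned message could then be written to its own text
file for the search appliance to crawl, so that a hit points
at a single message rather than the whole month.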

Has anybody worked through these issues with a GSA? I'd be
interested in hearing how you did it.

Jon Forrest

