[Mailman-Developers] Re: [Mailman-Users] Poking and prodding the archiver
Barry A. Warsaw
barry@digicool.com
Mon, 9 Jul 2001 22:28:19 -0400
[Note: this discussion is more appropriate for mailman-developers, so
I've changed the Cc: -baw]
>>>>> "PS" == Phil Stracchino <alaric@babcom.com> writes:
PS> I've looked at some length through the code for the archiver
PS> now, and although I still don't understand python, I've
PS> figured out enough of what the archiver is doing to see that
PS> it's apparently intentional that the path to mbox archives is
PS> .../mailman/archives/private/list.mbox/list.mbox.
Yes, and this is for security reasons as explained in the comment in
Archiver.py (see InitVars()). The comment is slightly out-of-date in
that the file under listname.mbox/ is also called listname.mbox.
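For illustration, the layout described above can be sketched as follows. This is a minimal sketch, not the code in Archiver.py: the function name `mbox_path` and its `private_root` parameter are hypothetical, and only the directory shape (a listname.mbox/ directory containing a listname.mbox file) comes from the discussion.

```python
import os


def mbox_path(private_root, listname):
    """Return the path of the raw mbox file for a list.

    Hypothetical helper: private_root stands in for the
    archives/private directory. The last path element is deliberately
    repeated, giving <root>/<listname>.mbox/<listname>.mbox.
    """
    mbox_name = listname + ".mbox"
    # First element is the per-list directory, second is the file in it.
    return os.path.join(private_root, mbox_name, mbox_name)
```

Calling `mbox_path` with an archives/private root and the list name `list` gives a path ending in list.mbox/list.mbox, matching the one Phil quotes.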
PS> What I haven't been able to figure out is *why* the code is
PS> written to duplicate the last element in the pathname;
See above.
PS> nor why it is that the archiver is written in such a way that
PS> it attempts to access this mbox archive directory with its
PS> duplicated final pathname element even when mbox archives are
PS> disabled, and fails if it doesn't exist.
If this is true (and I haven't tested it), then it's most likely just
old lurking bugs. The archiver/Pipermail stuff is the most neglected
part of the codebase. People keep threatening to help rewrite it, but
so far nothing's materialized, and I have little time or energy to
devote to the Pipermail side.
PS> I find this behavior even more curious in light of the fact
PS> that newlist apparently creates archives/private/list.mbox
PS> when it sets up the list, but does not create the
PS> archives/private/list.mbox/list.mbox without the existence of
PS> which the archiver fails.
Do you mean the archiver fails or that the web access to the archiver
fails? Certainly not the former (unless I misunderstand) because it
works for me, and loads of other people. It's a known buglet that the
Pipermail URL doesn't work until the first message is posted to the list.
PS> I've applied the following patch to my HyperArch.py file
PS> (patch also attached separately):
[patch deleted]
PS> I don't know what impact this has on mbox archives, but for
PS> me, it makes the HTML archiver work.
Hmm, odd. What I think will break is private archives. If you toggle
an archive to private, I seem to remember that you can craft a URL to
trick the web server into vending an archive page for you directly,
instead of forcing you to go through authentication with the
private.py CGI.
PS> I would welcome comment, any explanations for the curious
PS> state of unsatisfied and illogical dependencies described
PS> above, and any advice on fixing anything that this patch
PS> breaks. It's still a mystery to me why the archiver should
PS> even *care* whether or not the mbox archive directory exists,
PS> when mbox archives are disabled in the master configuration
PS> anyway.
It probably shouldn't, but then Mailman probably shouldn't support
ARCHIVE_TO_MBOX=0. Archiving to the mbox is about as fast as it gets,
since it is just a file append, and it's /incredibly/ handy to have
that .mbox file around (even as large as it can get), in case you want
to regenerate your archive, or you want to migrate to a different
external archiver.
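To make the "just a file append" point concrete, here is a minimal sketch using Python's standard `mailbox` module. It is not Mailman's own archiving code; the helper names `append_to_mbox` and `list_subjects` are assumptions made for illustration. The second helper shows why keeping the .mbox file is handy: every message can be read back later, for example to regenerate an archive or feed a different archiver.

```python
import mailbox
from email.message import EmailMessage


def append_to_mbox(mbox_file, msg):
    """Append one message to an mbox file (hypothetical helper).

    mailbox.mbox creates the file if it does not exist yet; add()
    writes the message at the end of the file.
    """
    box = mailbox.mbox(mbox_file)
    box.lock()
    try:
        box.add(msg)
        box.flush()
    finally:
        box.unlock()
        box.close()


def list_subjects(mbox_file):
    """Read every message back, as a regeneration pass would."""
    box = mailbox.mbox(mbox_file)
    try:
        return [message["Subject"] for message in box]
    finally:
        box.close()


def make_message(subject, body):
    """Build a small test message (hypothetical addresses)."""
    msg = EmailMessage()
    msg["From"] = "poster@example.com"
    msg["To"] = "list@example.com"
    msg["Subject"] = subject
    msg.set_content(body)
    return msg
```

Because each post only adds to the end of the file, the cost per message stays small no matter how large the mbox grows; the price is the disk space the raw file takes up.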
-Barry