Usenet gating and archiving
Is anybody currently gating Usenet to a mailing list, /and/ doing archiving with the default Pipermail? If so, you may not have noticed that the archives are pretty broken; nothing that originates on Usenet will show up in the archive.
I believe I understand why this is happening, and although I don't have a fix yet, it shouldn't be too difficult. I should be able to get it into 1.0b11. The problem occurs because Pipermail uses Python's mailbox.UnixMailbox objects, which expect "From " separators. Messages that originate on Usenet don't have these envelopes. The patch should be to simply synthesize these on messages gated off the newsgroup.
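As a rough illustration of what "synthesizing" such an envelope could look like, here is a minimal sketch. It is not the actual 1.0b11 patch: the function name `synthesize_unixfrom`, the `MAILER-DAEMON` fallback, and the use of GMT are all assumptions made for the example.

```python
import time
from email.utils import parseaddr


def synthesize_unixfrom(sender_header, when=None):
    """Build a "From " separator line for a message that lacks one.

    Hypothetical helper for illustration; not Mailman's own code.
    """
    # Keep only the bare address from a header such as "Name <addr>".
    _, addr = parseaddr(sender_header)
    if not addr:
        # Assumed fallback when no usable address is present.
        addr = "MAILER-DAEMON"
    # mbox separators conventionally end with an asctime-style timestamp.
    # when=None means "now"; GMT is an arbitrary choice for this sketch.
    stamp = time.asctime(time.gmtime(when))
    return "From %s %s" % (addr, stamp)
```

The returned line would be written immediately before the gated message's headers, so that an mbox reader sees the message boundary.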
However, there's a more severe breakage. You've actually lost information that is hard to reproduce because the flat listname.mbox files don't have the separators in them either. Running bin/arch over the file will give you even more corrupt archives (the incremental archiver just throws the Usenet messages away; bin/arch will tack them onto the last email-originated message).
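The "tacked onto the previous message" effect can be reproduced with a small sketch. Note the assumption: this uses the `mailbox.mbox` class from current Python rather than the `mailbox.UnixMailbox` class mentioned above, and the addresses and newsgroup name are made up. The splitting rule is the same in spirit: a new message begins only at a "From " line.

```python
import mailbox
import os
import tempfile

# One email-originated message with a separator, followed by a
# Usenet-originated message that has no "From " line of its own.
RAW = (
    "From alice@example.org Fri Apr  2 15:56:00 1999\n"
    "From: alice@example.org\n"
    "Subject: posted by email\n"
    "\n"
    "Body of the email message.\n"
    "\n"
    "Newsgroups: example.test\n"
    "From: bob@example.org\n"
    "Subject: posted on Usenet\n"
    "\n"
    "Body of the Usenet message.\n"
)


def count_and_first_body(raw):
    """Write raw to a temporary mbox file; return (message count, first body)."""
    fd, path = tempfile.mkstemp(suffix=".mbox")
    try:
        with os.fdopen(fd, "w", newline="\n") as f:
            f.write(raw)
        box = mailbox.mbox(path, create=False)
        try:
            msgs = list(box)
            return len(msgs), msgs[0].get_payload()
        finally:
            box.close()
    finally:
        os.remove(path)


count, first_body = count_and_first_body(RAW)
```

Here `count` comes out as 1, and `first_body` contains the Usenet message's headers and text as if they were part of the email message's body.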
What you could do is to trek through the listname.mbox file looking for Newsgroups: headers (which appear, but I don't think are guaranteed, to be the first header in the message). When you find one, you jam in a synthesized "From " envelope.
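A minimal sketch of such a repair pass follows. It is an illustration under stated assumptions, not a tested Mailman tool: the function name, the placeholder sender address, and the timestamp choice are hypothetical, and it matches the standard Usenet header name `Newsgroups:`. It relies on the same heuristic described above, so it will miss Usenet messages whose first header is something else, and it could misfire on a body paragraph that happens to begin with "Newsgroups:".

```python
import time


def add_missing_separators(lines, sender="usenet-gateway@example.invalid",
                           when=None):
    """Insert a "From " line before each headerless-envelope Usenet message.

    lines is a list of text lines (with newlines) from an mbox file.
    sender is a placeholder envelope address assumed for this sketch.
    """
    stamp = time.asctime(time.gmtime(when))
    out = []
    prev_blank = True  # treat the start of the file like a blank line
    for line in lines:
        # A Newsgroups: header directly after a blank line means no
        # "From " separator came first, so synthesize one here.
        if prev_blank and line.lower().startswith("newsgroups:"):
            out.append("From %s %s\n" % (sender, stamp))
        out.append(line)
        prev_blank = (line.strip() == "")
    return out
```

Because a message that already has a separator has its `Newsgroups:` header preceded by the "From " line rather than a blank line, running the pass a second time adds nothing. Working on a copy of the .mbox file would be prudent.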
I would rather not add this hack to Mailman unless a lot of people would benefit from it. For the test lists that I'm running, we're willing to just trash the archives and start over; the lists aren't live anyway. I'd like to know who would be really adversely affected by this problem.
Thanks, -Barry
On 2 Apr 99, at 15:56, Barry A. Warsaw wrote:
> Is anybody currently gating Usenet to a mailing list, /and/ doing archiving with the default Pipermail? If so, you may not have noticed that the archives are pretty broken; nothing that originates on Usenet will show up in the archive.
I have not yet started using the gateway, but it and the automated archives are two of the features that attracted me to mailman. I do plan on using it in the future.
> I would rather not add this hack to Mailman unless a lot of people would benefit from it. For the test lists that I'm running, we're willing to just trash the archives and start over; the lists aren't live anyway. I'd like to know who would be really adversely affected by this problem.
What good is an incomplete archive? I feel that it is very important that the archive is a true and complete record of the list activity, including messages gated in from a newsgroup.
Richard B. Pyne, KB7RMU rpyne@kinfolk.org http://pyne.kinfolk.org/rbp2
"RBP" == Richard B Pyne <rpyne@kinfolk.org> writes:
RBP> I have not yet started using the gateway, but it and the
RBP> automated archives are two of the features that attracted me
RBP> to mailman. I do plan on using it in the future.
And it will work in 1.0b11.
>> I would rather not add this hack to Mailman unless a lot of
>> people would benefit from it. For the test lists that I'm
>> running, we're willing to just trash the archives and start
>> over; the lists aren't live anyway. I'd like to know who would
>> be really adversely affected by this problem.
RBP> What good is an incomplete archive? I feel that it is very
RBP> important that the archive is a true and complete record of
RBP> the list activity, including messages gated in from a
RBP> newsgroup.
Maybe I phrased the question incorrectly. If you have not yet begun to archive a gated list, but only start doing so after you install 1.0b11 (which will be released sometime this morning), you will be fine.
If you are running 1.0b10 or earlier, and have already started archiving a gated group, your archives will be messed up (if you haven't already noticed ;-). Someone would have to write a script to jam in the Unixfrom headers in the .mbox files and then rerun the script. It's probably not hard to write -- I'm just proposing that I don't need to do it, 'cause the number of people affected approaches zero right now. ;-)
-Barry
participants (2): Barry A. Warsaw, Richard B. Pyne