Hullo,
I'm running mailman-2.1.12, with the htdig patches on FreeBSD 7.0
I have a list with archives that are about 10 years old. The archive mbox size is 175M.
I was alerted by a subscriber that the August 2009 archives list 128 "No subject" emails that "look funny."
So I looked.. sure enough they're there. And they look something like this when I click on a single email listed in the archives:
No subject
Mon Aug 10 18:53:40 EDT 2009
* Previous message: [Redacted] Blah...
* Next message: No subject
* Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Tue, 14 Dec 1999 23:27:19 PST
X-Originating-IP: [63.11.227.157]
From: "redacted" <redacted_at...>
To: redacted
Date: Tue, 14 Dec 1999 23:27:19 PST
Mailing-List: contact redacted
X-Mailing-List: redacted
Precedence: bulk
List-Help: <http://www.example.com/redacted/info.html>, <mailto:redacted at example.com>
List-Unsubscribe: <mailto:redacted-unsubscribe at example.com>
List-Archive: <http://www.example.com/redacted/>
Reply-To: redacted
Subject: [Redacted] Redacted
MIME-Version: 1.0
Content-Type: text/plain; format=flowed
Content-Transfer-Encoding: 7bit
Status: RO
Content-Length: 7352
Lines: 174
(body of email starts here)
From Redacted <redacted at u...> Wed Dec 15 00:40:19 1999
Delivered-To: redacted
Received: (listserv 1.291); by f7; 15 Dec 1999 08:43:59 -0000
Delivered-To: redacted
Date: 15 Dec 99 03:44:15 EST
From: Redacted <redacted at u...>
To: redacted
X-Mailing-List: redacted
Precedence: bulk
List-Help: <http://www.example.com/redacted/info.html>, <mailto:redacted at example.com>
List-Unsubscribe: <mailto:redacted-unsubscribe at example.com>
List-Archive: <http://www.example.com/redacted/>
Reply-To: redacted
Subject: [Redacted] Redacted
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
(body of email starts here...)
(another email starts here, as above...)
(end of example)
Everything looks fine if I use mutt -f listname.mbox in the private archives directory for the list.
Has anyone had problems like this? My GoogleFu is failing me, or at least isn't showing me anything like this.
Thanks in advance! --Glenn
Glenn Sieb wrote:
I have a list with archives that are about 10 years old. The archive mbox size is 175M.
I was alerted by a subscriber that the August 2009 archives list 128 "No subject" emails that "look funny." [snip] From: "redacted" <redacted_at...> To: redacted Date: Tue, 14 Dec 1999 23:27:19 PST
(body of email starts here)
From Redacted <redacted at u...> Wed Dec 15 00:40:19 1999 Delivered-To: redacted
Have you tried running bin/cleanarch and then rerunning bin/arch to regenerate the messages? It's possible what you're seeing could be caused by messed up From lines in your old mbox file (used by the archiver to determine the start of messages). Mutt may just have a more forgiving parser.
Be warned, though, if you regenerate the entire archive, then the links in your archive will change (i.e. old posts that people have linked will no longer be in the same spot).
Terri
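To illustrate the kind of problem Terri describes, here is a minimal Python sketch that lists lines beginning with "From " that do not look like well-formed mbox separators. The regular expression is an assumed approximation of a typical separator (a sender address with no spaces, followed by a ctime-style date); it is not taken from Mailman's source, and the name suspicious_from_lines is hypothetical. bin/cleanarch remains the tool to actually use on the list's mbox.

```python
import re

# Assumed shape of a valid separator (an approximation, not Mailman's own rule):
#   From <sender> <Day> <Mon> <dd> <hh:mm[:ss]> [tz] <yyyy>
SEPARATOR = re.compile(
    r"^From \S+\s+\w{3}\s+\w{3}\s+\d{1,2}\s+"
    r"\d{1,2}:\d{2}(:\d{2})?(\s+\S+)?\s+\d{4}\s*$"
)


def suspicious_from_lines(lines):
    """Yield (line_number, text) for lines that start with 'From '
    but do not match the assumed separator pattern."""
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if text.startswith("From ") and not SEPARATOR.match(text):
            yield number, text


# Example use on a real file (the path is a placeholder):
# with open("/path/to/listname.mbox", errors="replace") as fh:
#     for number, text in suspicious_from_lines(fh):
#         print(number, text)
```

Under this assumed pattern, a line like the "From Redacted <redacted at u...> Wed Dec 15 00:40:19 1999" in the example above would be flagged, because the sender part contains spaces. Whether that is what confuses the archiver here is not established, and the redaction may have changed the line.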
Glenn Sieb wrote:
Everything looks fine if I use mutt -f listname.mbox in the private archives directory for the list.
Has anyone had problems like this? My GoogleFu is failing me, or at least isn't showing me anything like this.
Do you see these Dec. 1999 messages when you look with Mutt?
There is a problem with a Debian patch, but the symptom is somewhat different, and you're on FreeBSD anyway, so I don't think this is it.
It looks like someone or some script ran bin/arch on Mon Aug 10 18:53:40 EDT 2009 (and possibly at other times) with some spurious input, but I'm not sure what that input would be. The puzzling part is the "Previous/Next/Sorted" header which only appears in the periodic index files.
As Terri suggests, you could run bin/cleanarch as an additional test/correction on the listname.mbox, since there may be unescaped "From " lines in message bodies that didn't confuse Mutt or that you didn't notice with Mutt, and then run bin/arch --wipe to rebuild the archive. But also be aware, as Terri says, that this may renumber messages and break saved links to archived messages.
Another alternative is to just remove 2009-August/, 2009-August.txt and 2009-August.txt.gz (if any) from archives/private/listname/ and then run bin/arch (without --wipe) with input consisting of just the Aug. 2009 portion of listname.mbox.
But the real questions are how did this happen; do the 128 "messages" all have Mon Aug 10 18:53:40 EDT 2009 timestamps or do they have different timestamps, and what may have been done at that/those times?
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
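One possible way to produce an mbox containing only the August 2009 messages is sketched below with Python's standard mailbox module. It selects messages by their Date: header, which is an assumption: the archiver may file messages by a different date, so check the result before feeding it to bin/arch. The function name copy_month and the file paths are hypothetical.

```python
import mailbox
from email.utils import parsedate_tz


def copy_month(src_path, dst_path, year, month):
    """Copy messages whose Date: header falls in (year, month) from
    src_path to dst_path. Returns the number of messages copied."""
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)  # created if it does not exist
    copied = 0
    try:
        for message in src:
            # parsedate_tz returns None for a missing or unparseable date
            parsed = parsedate_tz(str(message.get("Date", "")))
            if parsed and parsed[0] == year and parsed[1] == month:
                dst.add(message)
                copied += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return copied


# Example (placeholder paths):
# copy_month("/path/to/listname.mbox", "/path/to/2009-August.mbox", 2009, 8)
```

The resulting file could then be passed to bin/arch as the input mbox after the list name. Python's mailbox module also splits messages on lines starting with "From ", so an mbox with unescaped "From " lines in message bodies should be cleaned first.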
Mark Sapiro said the following on 8/12/09 10:05 AM:
Do you see these Dec. 1999 messages when you look with Mutt?
*doublechecking* Yes. They look fine.
It looks like someone or some script ran bin/arch on Mon Aug 10 18:53:40 EDT 2009 (and possibly at other times) with some spurious input, but I'm not sure what that input would be. The puzzling part is the "Previous/Next/Sorted" header which only appears in the periodic index files.
Yup. My archives are indexed automagically by Month-Year...
As Terri suggests, you could run bin/cleanarch as an additional test/correction on the listname.mbox, since there may be unescaped "From " lines in message bodies that didn't confuse Mutt or that you didn't notice with Mutt, and then run bin/arch --wipe to rebuild the archive. But also be aware, as Terri says, that this may renumber messages and break saved links to archived messages.
*nods* This is an instance where I may have to go through manually with vi and fix this email-by-email. :sigh:
It will take forever, considering there are 55k or so messages in the archive.
Another alternative is to just remove 2009-August/, 2009-August.txt and 2009-August.txt.gz (if any) from archives/private/listname/ and then run bin/arch (without --wipe) with input consisting of just the Aug. 2009 portion of listname.mbox.
Ooh. Let me try that one.
But the real questions are how did this happen; do the 128 "messages" all have Mon Aug 10 18:53:40 EDT 2009 timestamps or do they have different timestamps, and what may have been done at that/those times?
It was probably one of the times I ran arch --wipe.
And yes, they all have the same timestamp in the archives.
Let me try re-running the arch command with the 2009-August* files removed....
Odd. I had to manually create the 2009-August directory, but the problem is still there. :-/
(I did bin/arch (listname))
Thanks, Mark! --Glenn
Glenn Sieb wrote:
Mark Sapiro said the following on 8/12/09 10:05 AM:
As Terri suggests, you could run bin/cleanarch as an additional test/correction on the listname.mbox, since there may be unescaped "From " lines in message bodies that didn't confuse Mutt or that you didn't notice with Mutt, and then run bin/arch --wipe to rebuild the archive. But also be aware, as Terri says, that this may renumber messages and break saved links to archived messages.
*nods* This is an instance where I may have to go through manually with vi and fix this email-by-email. :sigh:
It will take forever, considering there are 55k or so messages in the archive.
If, as you imply below, you've already run bin/arch --wipe in the recent past, then you've already renumbered the archive, so don't worry about doing it again.
Another alternative is to just remove 2009-August/, 2009-August.txt and 2009-August.txt.gz (if any) from archives/private/listname/ and then run bin/arch (without --wipe) with input consisting of just the Aug. 2009 portion of listname.mbox.
Ooh. Let me try that one.
But the real questions are how did this happen; do the 128 "messages" all have Mon Aug 10 18:53:40 EDT 2009 timestamps or do they have different timestamps, and what may have been done at that/those times?
It was probably one of the times I ran arch --wipe.
And yes, they all have the same timestamp in the archives.
Let me try re-running the arch command with the 2009-August* files removed....
Odd. I had to manually create the 2009-August directory, but the problem is still there. :-/
(I did bin/arch (listname))
I meant do
bin/arch (listname) /path/to/edited/mbox/containing/only/2009August.
However, if you've actually done "bin/arch --wipe (listname)" and wound up with those strange no-subject messages in the current month, there is either a problem with bin/arch or with the listname.mbox.
What happens if you run
bin/cleanarch < /path/to/listname.mbox > /dev/null
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Participants (3): Glenn Sieb, Mark Sapiro, Terri Oda