Re: [Mailman-Users] Problem with inappropriate breaks in messages

Based on the comments received here, I have gone back and had another look at this, and discovered I was wrong on a number of important issues. Apologies for this, but I am (obviously) new to Mailman, and didn't completely realise what I was seeing, the first time.
This time, having looked at the actual mbox file held in the Archive folder, I can see that incidents of "\nFrom " in the message body of new messages have been received correctly escaped by a ">"; and the mbox file clearly has them marked as >From_ lines. So that is all good.
However, the Pipermail Archive does consistently split messages whenever a message-body "\nFrom " occurs, as I described earlier, with the second part being attributed to "bogus@does.not.exist.com".
I've found that if I then run arch on the list (using the mbox file) the Archive is created correctly, without this splitting, although any subsequent messages with a message-body "\nFrom " cause further split messages.
So it looks like my problem is with the dynamic creation of the Pipermail Archive, rather than the generation from the mbox file. I haven't yet pinned down what script/process is responsible for this.
This suggests a perfectly acceptable "quick fix" of a daily cron job running arch on the list, but I will look into this further when I get time.
Many thanks for your help, which pointed me in the right direction.
Chris

Chris Malme wrote:
[...]
A word of caution. The archiver is a tangled web of subclasses and overridden methods and is quite difficult to follow.
That said, I suspect the underlying OS here is Debian/Ubuntu and Mailman is the Debian/Ubuntu package which has patches in this area which are causing this. The patch is to fix <http://bugs.debian.org/244673>. The 2.1.9 patch is at <http://patch-tracker.debian.org/patch/series/view/mailman/1:2.1.9-7/77_heade...> (if that URL doesn't work, go to <http://patch-tracker.debian.org/package/mailman> and navigate from there - the direct URL is not stable and changes every time there is a package update).
I don't specifically recall if this patch causes your problem or not, but I'm pretty sure it does. I think you can fix it by finding the added code around line 200 of Mailman/Message.py and changing
g = Generator(fp)
to
g = Generator(fp, mangle_from_=True)
I have installed a refactored version of this patch upstream as of Mailman 2.1.13 which doesn't have this problem.
If you're interested, I can provide more detail on this, but I think the above change will fix your problem. It will also cause From_ to be escaped in outgoing non-digest messages (it is already escaped in digests) which may be an esthetic issue for some recipients, but for others, it will have been escaped anyway by an MTA/MDA in the delivery path.
-- 
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
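
For readers who want to see what the mangle_from_ flag changes, here is a minimal sketch. It uses the Generator class from Python 3's standard library, not Mailman 2.1's own Mailman/Message.py (which runs on Python 2 and may wrap or subclass Generator differently). The helper name flatten and the message text are made up for illustration.

```python
import io
from email.generator import Generator
from email.message import Message


def flatten(msg, mangle):
    """Serialize msg to text, with or without From_ escaping (hypothetical helper)."""
    fp = io.StringIO()
    g = Generator(fp, mangle_from_=mangle)
    g.flatten(msg)
    return fp.getvalue()


# A made-up message whose body has a line starting with "From ".
msg = Message()
msg["Subject"] = "test"
msg.set_payload("Hello\nFrom here on, things go wrong.\n")

unescaped = flatten(msg, False)  # body line is left as "From here on, ..."
escaped = flatten(msg, True)     # body line becomes ">From here on, ..."

print(unescaped)
print(escaped)
```

Anything that splits text on lines beginning with "From " would see one message in the escaped output, but could see a second, headerless message in the unescaped one.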

What a star!
Thanks Mark, I will take a look at it later today.
Yes, it is Debian/Ubuntu - I must learn to specify these things from the start.
Chris

Regarding my earlier query about archive messages breaking due to in-message "\nFrom " text: I am pleased to say that Mark's suggestion worked a treat and fixed that archive woe. Many thanks.
However, it uncovered another problem, less serious, but fascinating. One of my users uses a mail client that creates MessageIDs containing the character "-". As far as I can tell, this is completely legit.
I've discovered that for every "-" in the MessageID, that message is moved one place across in the nesting of threads. As his MessageID can contain up to 5 "-" characters, this means any thread he participates in gets messed up somewhat.
Looking at Mailman/Archive/pipermail.py, I can see lines such as
article.threadKey = parent.threadKey + article.date + '/' + article.msgid + '-'
and
self.write_threadindex_entry(article, artkey.count('-') - 1)
which suggests the dash is being used as a delimiter/flag in Pipermail, but I haven't looked into it in any detail, yet.
Before I do so, or begin to experiment, I thought I would ask if this is a known problem with an existing solution? I did do a quick search of the archive, but couldn't find anything obvious. If there isn't an existing fix, I might try something basic, like a global replacement of "'-'" for "'~'" in pipermail.py and just see what it does.
As before, running Mailman 2.1.9/Pipermail 0.09 on Debian/Ubuntu, running Plesk.
Chris

By the way, the mail client in question is Apple Mail 2.1, and I have confirmed that this is normal behaviour for it.
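
The symptom described above can be reproduced from the two quoted expressions alone. The following is a rough sketch, not pipermail's actual code: the function names, the date string, and the Message-IDs are hypothetical, and it assumes a top-level article starts from an empty parent key.

```python
def debian_style_key(parent_key, date, msgid):
    # Same shape as the quoted line:
    # parent.threadKey + article.date + '/' + article.msgid + '-'
    return parent_key + date + "/" + msgid + "-"


def thread_depth(thread_key):
    # Same expression as the quoted call: artkey.count('-') - 1
    return thread_key.count("-") - 1


date = "01249000000"                        # hypothetical date string
plain_id = "<12345@example.com>"            # no dashes
dashed_id = "<4A6F-11DE-9B2C@example.com>"  # two dashes

# Both are top-level articles, so both should sit at depth 0.
root_plain = debian_style_key("", date, plain_id)
root_dashed = debian_style_key("", date, dashed_id)

print(thread_depth(root_plain))   # 0
print(thread_depth(root_dashed))  # 2: one extra level per dash in the Message-ID
```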

Chris Malme wrote:
Good work. It took me quite a bit of effort to figure that one out, even after I knew which bad Debian patch caused it.
[...]
> Before I do so, or begin to experiment, I thought I would ask if this is a known problem with an existing solution?
It is a known problem caused by another bad Debian patch. See some of the gory details in the post at <http://mail.python.org/pipermail/mailman-users/2009-July/066610.html> and related posts.
The cure is to replace the Debian patch with the one at <http://bazaar.launchpad.net/~mailman-coders/mailman/2.1/revision/1186>.
(That URL is currently returning "internal server error". This is generally a temporary Launchpad condition which will correct itself. If you can't get the patch, let me know and I'll send it.)
The bad Debian patch takes statements similar to
myThreadKey = parent.threadKey + article.date + '-'
in five places in pipermail.py and replaces "article.date + '-'" with "article.date + '/' + article.msgid + '-'".
The correct fix is to replace "article.date + '/' + article.msgid + '-'" in the Debian patch with "article.date + '.' + str(article.sequence) + '-'".
Or, you can go to <http://bazaar.launchpad.net/~mailman-coders/mailman/2.1/annotate/head%3A/Mai...> and look at the 5 groups of one or two lines marked revision 1186.
-- 
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
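
To see why swapping the Message-ID for the sequence number helps, the sketch below builds thread keys both ways for a short reply chain and derives each depth by counting dashes. It is an illustration under assumptions, not the code from revision 1186: the Article class, the function names, and the sample values are hypothetical, and the chain is simplified so that each message replies to the previous one.

```python
from dataclasses import dataclass


@dataclass
class Article:
    date: str      # assumed to be a dash-free string
    msgid: str
    sequence: int


def key_with_msgid(parent_key, a):
    # Key shape attributed to the Debian patch.
    return parent_key + a.date + "/" + a.msgid + "-"


def key_with_sequence(parent_key, a):
    # Key shape described as the correct fix.
    return parent_key + a.date + "." + str(a.sequence) + "-"


def depths(articles, make_key):
    """Nesting depth of each article in a linear reply chain."""
    result = []
    parent_key = ""
    for a in articles:
        parent_key = make_key(parent_key, a)
        result.append(parent_key.count("-") - 1)
    return result


chain = [
    Article("01249000000", "<1@example.com>", 0),
    Article("01249000100", "<AA-BB-CC@example.com>", 1),  # two dashes
    Article("01249000200", "<3@example.com>", 2),
]

print(depths(chain, key_with_msgid))     # [0, 3, 4]
print(depths(chain, key_with_sequence))  # [0, 1, 2]
```

Because the sequence number is rendered as digits only, the sole dashes in the key are the delimiters, so the count reflects the real nesting level.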
