Re: Please stop the cross posting

"CT" == Christian Tismer <tismer@appliedbiometrics.com> writes:
CT> I will also propose to change Mailman to handle this
CT> in a better way. My own patched version on Starship
CT> never prepends the list name if it can be matched in the
CT> "re" already.
I thought Mailman already does this too. I'll double check. Chris, you might want to send your patches to mailman-developers@python.org
-Barry

Barry A. Warsaw wrote:
Well, these patches were for a very old Mailman. I don't know whether they still apply. But the idea is simple. Only prepend the prefix if you cannot string.find it.
From maillist.py (V0.95):
    # Prepend the subject_prefix to the subject line.
    subj = msg.getheader('subject')
    prefix = self.subject_prefix
    if prefix:
        prefix = prefix + ' '
    if not subj:
        msg.SetHeader('Subject', '%s(no subject)' % prefix)
    else:
        #CT begin
        #CT adding only if not already present
        if string.find(subj, prefix) < 0:
            msg.SetHeader('Subject', '%s%s' % (prefix, subj))
        #CT end
    dont_send_to_sender = 0
    ack_post = 0
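The same check in present-day Python, as a minimal sketch. `add_prefix` is a made-up helper name, not a Mailman function, and it returns the new subject instead of calling `msg.SetHeader`; the sample prefix is only an example.

```python
def add_prefix(subject, prefix):
    """Return the subject with the list prefix prepended, unless it already occurs in it."""
    if prefix:
        prefix = prefix + ' '
    if not subject:
        return '%s(no subject)' % prefix
    # Equivalent of `string.find(subj, prefix) < 0` in the patch.
    if prefix not in subject:
        return '%s%s' % (prefix, subject)
    return subject


print(add_prefix('Hello', '[Mailman-Developers]'))
print(add_prefix('Re: [Mailman-Developers] Hello', '[Mailman-Developers]'))
```

Because the prefix is compared with its trailing space, a subject that ends exactly with the bare prefix would still get a second copy; that corner case is left alone here.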
--
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home

On Fri, Jan 29, 1999 at 06:29:28PM +0100, Christian Tismer wrote:
Hmm, would anyone else be interested in the option to "merge" x-posted mailings... what I guess I mean is that I'm on a bunch of python.org lists, and when things get crossposted, I get like 400 copies of each message... it'd be nice to figure out a way to only get ONE copy :-)
Chris
| Christopher Petrilli | petrilli@amber.org

On Fri, 29 Jan 1999, Christopher G. Petrilli wrote:
This would be one reason to have an object database to keep track of what gets posted where. I think this sort of thing is likely in a new architecture for mailman, using and as components for a zope framework. (An issue here is whether zope's new relaxation of the attribution requirement, and Open Source certification, will satisfy stallman for gnu-yness - if so, it'd make it easy to integrate mailman and zope more closely, exploit bobopos/zopepos, etc.)
Ken

(Proper handling of the subject line prefix was one of the first things i did - i'm pretty sure it was before 1.0b3, maybe 1.0b1. If i recall correctly, i implemented a slightly more stringent strategy, only looking for the prefix early on in the subject line, after "re:"'s, but before other text. I know it takes care of the vast majority of cases, and i suspect it's preferable more of the time in offbeat cases than looking for the prefix anywhere in the subject line, but i may be wrong. I didn't see the original post - still sorting out environment and email - so i'm not sure about the specific case being discussed...)
Ken
On Fri, 29 Jan 1999, Christian Tismer wrote:

Barry A. Warsaw wrote:
Well, I had a look: New Mailman does this:
    prefix = self.subject_prefix
    if not subj:
        msg.SetHeader('Subject', '%s(no subject)' % prefix)
    elif not re.match("(re:? *)?" + re.escape(self.subject_prefix),
                      subj, re.I):
        msg.SetHeader('Subject', '%s%s' % (prefix, subj))
    if self.anonymous_list:
This is insufficient for cross-posts which are replied to from a different list. I think my string.find approach is crude and simple, but exactly the right thing. Never mind the rare case where a prefix ends up missing...
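A rough simulation of the difference, assuming a thread that bounces between two lists and a mail client that puts "Re: " in front of whatever subject it receives. The list prefixes and the function names are invented for the illustration; only the regular expression is taken from the snippet above.

```python
import re


def anchored(subject, prefix):
    """Prepend prefix unless it directly follows an optional leading 're:'."""
    if not re.match("(re:? *)?" + re.escape(prefix), subject, re.I):
        return prefix + subject
    return subject


def anywhere(subject, prefix):
    """Prepend prefix only if it occurs nowhere in the subject."""
    if prefix not in subject:
        return prefix + subject
    return subject


def zigzag(add, rounds=2):
    """Simulate replies that alternate between two lists, each applying its prefix."""
    subject = "parsing question"
    for _ in range(rounds):
        for prefix in ("[XML-SIG] ", "[Zope] "):
            subject = add("Re: " + subject, prefix)
    return subject


print(zigzag(anchored))
print(zigzag(anywhere))
```

With the anchored match each list finds the other list's prefix in front of its own and adds another copy on every round; with the plain substring test each prefix appears once.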
ciao - chris

You're probably right - the simpler approach would prevent alternation across lists from compounding the respective prefixes. (Though the idea of messages alternating from list to list gives me the willies, but that's beside the point...-)
Ken

Ken Manheimer wrote:
It basically appeared to me as bad manners to reply to cross-posted messages in a zig-zag. On the other hand, if somebody is really on one list and not on the other, this is the only way that works. That thread on XML/Zope sigs became really unreadable since prefixes piled up.
It seems to be hard to tackle this in a more intelligent way. Ok, we can prevent prefixes from stacking. But the bigger problem which hurts at the same time (see Chris Petrilli's message) is that you also get these crossposted things twice.
I don't see an easy way to avoid this when cross posting is allowed. One way would be this:
When Mailman receives a post, it can see all the recipients, especially it can see its own hostname, with different mailing lists. Unfortunately, the receiving process will duplicate the message and send it to the different aliases, one for each list. To capture this, sendmail must be intercepted by something like procmail (just as an example) which first makes sure that only one list gets that message.
The one list which receives the message now reads the recipient list, and temporarily merges the user lists of the mentioned mailing lists into one. This could make sure that you don't get a duplicate message.
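The merging step could look roughly like this. It is only a sketch: the `LISTS` table, the addresses, and the function name are assumptions made for the example, and a real list manager would read memberships from its own storage.

```python
from email.utils import getaddresses

# Hypothetical membership table for the lists served on this host.
LISTS = {
    "xml-sig@python.org": {"alice@example.com", "bob@example.com"},
    "zope@python.org": {"bob@example.com", "carol@example.com"},
}


def merged_recipients(header_values):
    """Union the members of every local list named in the given To/Cc header values."""
    addressed = {addr.lower() for _, addr in getaddresses(header_values)}
    recipients = set()
    for list_addr, members in LISTS.items():
        if list_addr in addressed:
            recipients |= members  # a set, so shared members appear once
    return recipients


print(sorted(merged_recipients(["xml-sig@python.org, zope@python.org"])))
```

Someone subscribed to both lists ends up in the merged set once, which is the whole point of the temporary merge.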
How about that? - ciao - chris

On Fri, Jan 29, 1999 at 07:24:21PM +0100, Christian Tismer wrote:
I'm sure I was partially to blame for that one, unfortunately it's easy to get confused, plus sometimes I get them in different orders, and I usually respond to the first example :-) Oh well, the human brain works in mysterious and lousy ways.
Also, I think it should be possible to prevent xposts period... At least if the lists are hosted on the same machine... or maybe even an ability to define lists that aren't allowed to be xposted to... I dunno, this needs to be thought out more. I have some ideas about how to restructure the mailing list delivery stuff so that it will not send out multiple copies to the same person through different lists.
I don't think this belongs in the MTA, it belongs in the mailing list manager... but that's just me... Also, remember that a lot of us (myself for example) don't use Sendmail any more...
Hmm... I'll try and write down my ideas for a more hmm, "elegant" solution, at least in my mind... think garbage collection, it's a bizarre premise, but...
Basically, you have a list of all recipients on all lists merged together, you then have a list of messages that need to be delivered to recipients with certain interests... you can then attach this message object to all the recipients (via whatever method), thereby determining whether or not the recipient will already be receiving the message.
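One possible reading of that model, sketched under heavy assumptions: the list-to-members table is inverted into a single per-person index, and a message is attached to each person at most once no matter how many of their lists it was sent to. The function names and data shapes are invented here and say nothing about how the Usenet server mentioned below worked.

```python
from collections import defaultdict


def build_index(lists):
    """Invert {list: members} into {member: set of lists}, one entry per person."""
    index = defaultdict(set)
    for name, members in lists.items():
        for addr in members:
            index[addr].add(name)
    return index


def attach(index, message_id, target_lists):
    """Attach the message once to every person subscribed to any of the target lists."""
    targets = set(target_lists)
    return {addr: message_id for addr, subs in index.items() if subs & targets}


index = build_index({"a": ["x", "y"], "b": ["y", "z"]})
print(attach(index, "<m1@example.com>", ["a", "b"]))
```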
I've written (a long time ago) a Usenet server that worked on a similar principle... was quite quick, and very efficient. It was in C, but I can't imagine that it should be that hard in Python... this is NOT a Mailman 1.x exercise :-)
Chris
| Christopher Petrilli | petrilli@amber.org

On Fri, 29 Jan 1999, Christopher G. Petrilli wrote:
I don't think it's a hard problem, and i agree, i don't think it's high enough priority to retrofit to 1.x. If there's sufficient mandate (read, incentives for digital creations) for mailman and zope to exploit each other's capabilities, we may be able to spend some time tailoring them for each other a bit, avoiding locking mailman to zope, but strengthening mailman where zope is available... Exciting prospects, and i actually think there will be the mandate. Got a lot to do before then, though, so don't hold your breath.
(BTW, ironically, i've been tempted to post my messages in this to the zope list!-)
Ken

On Fri, Jan 29, 1999 at 01:47:21PM -0500, Ken Manheimer wrote:
Well, given all the licensing things not being resolved yet, these are just hypothetical ideas... well, real ideas with hypothetical probability!
What I see is this... not Mailman as part of Zope, but a parallel "application" on top of the Zbase... this of course depends on the new ideas proposed by Jim for BoboPOS3, ney Zbase 3? Zope could then be used to provide a management interface to the mailing-list "publisher" that works off the same database.
I don't know... hmm... it's all so complex! :-) I'm going to sit down and think through the object-model that would be necessary for this, maybe whip out a few UML models to put it down on virtual paper... assuming I remember all my modeling theory! ;-)
One idea, and I don't know how feasible it is at THIS point, is to use ZPublisher as the basis for a new management interface, and then wrap that up in such a way as to allow it to be Productized for Zope so someone could just plug the interface in anywhere they want :-)
> (BTW, ironically, i've been tempted to post my messages in this to the zope list!-)
No no, not at this point anyway... I'm new to this whole thing anyway, so maybe I'm just wacko and thinking of things that have been hashed out to death already.
Chris
| Christopher Petrilli | petrilli@amber.org

[... regarding compounding subject prefixes in cross-posted maillist messages, and more generally avoiding duplication of cross posts ...]
On Fri, 29 Jan 1999, Christian Tismer wrote:
Offhand, there seem to me to be two key items here:
- mailman can unequivocally identify the message id and the recipients of members of any lists which it is serving on the same host, so given a decent architecture (modularized message flow tracking db) it should be easy to provide users the option to inhibit receiving more than the first copy of a cross posted message
- this issue does not seem critical enough to retrofit it to the existing architecture (which does virtually no flow tracking, certainly not across lists), at least not compared to the various rough edges that currently need to be polished off (eg, subscription delivery address changing without un/resubscribe, better implementation of admin pending items, lots of other stuff.)
Um, it doesn't make sense to go external when all the info is available internally to mailman as a whole! Instead, the internal infrastructure needs to be developed - something that should go on the list for modular mailman, or whatever it'll be...
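In miniature, the flow tracking described in the first item might be sketched like this. `FlowTracker` and its method are hypothetical names, the store is a plain in-memory set (a real one would need persistence and expiry), and the per-user option is modelled as a set of addresses that asked to keep receiving every copy.

```python
class FlowTracker:
    """Remember which (Message-ID, address) pairs were already delivered on this host."""

    def __init__(self):
        self._delivered = set()

    def recipients_for(self, message_id, members, wants_duplicates=()):
        """Return the members who should get this copy, recording them as delivered."""
        out = []
        for addr in members:
            key = (message_id, addr)
            if key in self._delivered and addr not in wants_duplicates:
                continue  # this address already got the first copy via another list
            self._delivered.add(key)
            out.append(addr)
        return out


tracker = FlowTracker()
print(tracker.recipients_for("<1@example.com>", ["alice@example.com", "bob@example.com"]))
print(tracker.recipients_for("<1@example.com>", ["bob@example.com", "carol@example.com"]))
```

The second call stands for the same message arriving through another list on the host; anyone who got the first copy is left out unless they opted to keep duplicates.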
Ken

(Ken)
Well, I forgot about that. This makes it of course easy.
Of course! For me, identifying the message was the problem, but I should have known better. And instead of waiting for a new Mailman generation, I see that this problem has two sides: A message always appears to have an ID which is generated by the email program (not sure if that is guaranteed, but very likely). That means one could set up the sendmail (qmail, whatever) configuration in a way that it checks the incoming mail for duplicate IDs. This would also work for cross-postings to different Mailman sites. Having both would be perfect and would save bandwidth, but one suffices to yield the desired effect. Should I try to write a little filter which keeps track of message IDs for a short period, and drops those already seen? This would be a small Python tool for the email client side, not touching Mailman at all.
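Such a filter might be sketched as follows. The class name, the one-week default retention, and the in-memory dictionary are assumptions for the example; a tool that runs once per incoming message would have to keep its table on disk. Messages without a Message-ID are let through, since the header is not guaranteed to be present.

```python
import time
from email import message_from_string


class DuplicateFilter:
    """Drop messages whose Message-ID was already seen within the last `ttl` seconds."""

    def __init__(self, ttl=7 * 24 * 3600, clock=time.time):
        self.ttl = ttl
        self.clock = clock
        self._seen = {}  # Message-ID -> time it was first seen

    def is_duplicate(self, raw_message):
        now = self.clock()
        # Forget IDs that are older than the retention window.
        self._seen = {mid: t for mid, t in self._seen.items() if now - t < self.ttl}
        msg_id = message_from_string(raw_message).get("Message-ID")
        if msg_id is None:
            return False  # nothing to compare against, so deliver it
        msg_id = msg_id.strip()
        if msg_id in self._seen:
            return True
        self._seen[msg_id] = now
        return False


raw = "Message-ID: <abc@example.com>\nSubject: hi\n\nbody\n"
dupes = DuplicateFilter()
print(dupes.is_duplicate(raw), dupes.is_duplicate(raw))
```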
Looks quite practical to me. ciao - chris

On Fri, 29 Jan 1999, Christian Tismer wrote:
imo this really isn't something that belongs in mailman, you start getting into all kinds of nasty heuristics ... if you belong to two groups and a message is cross posted which group do you favor and give the message to? It just gets worse as you add groups. Do you address that by taking the intersection and making it look like it only comes from one address? Do you let each user decide?
It would seem better addressed as another list administrator concern. You could help them I suppose, give them the ability to say, "list a and b will not accept crossposts from one another" but generally it's probably enough that they gently remind their users when it occurs excessively.
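An administrator-side rule of that kind could be as small as the sketch below. The setting name, the list addresses, and the `accepts` function are all hypothetical; nothing like this is claimed to exist in Mailman.

```python
# Hypothetical per-list setting: the lists whose cross-posts this list refuses.
REFUSE_CROSSPOSTS_WITH = {
    "list-a@example.org": {"list-b@example.org"},
    "list-b@example.org": {"list-a@example.org"},
}


def accepts(list_addr, addressed):
    """Return False if the post is also addressed to a list this one refuses to share with."""
    blocked = REFUSE_CROSSPOSTS_WITH.get(list_addr, set())
    others = {a.lower() for a in addressed} - {list_addr}
    return not (blocked & others)


print(accepts("list-a@example.org", ["list-a@example.org", "list-b@example.org"]))
print(accepts("list-a@example.org", ["list-a@example.org", "list-c@example.org"]))
```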
On the user side procmail makes filtering via msgid quite feasible. I think http://members.xoom.com/procmail/aks-lib/dupcheck.rc will do what you want.
Darren Henderson darren@jasper.somtel.com
Help fight junk e-mail, visit http://www.cauce.org/

On Fri, Jan 29, 1999 at 03:30:51PM -0500, Darren Henderson wrote:
Well, I figure this isn't THAT hugely common, which of course asks, why do it, but regardless... anyway, the user gets the copy that first shows up in the mail system... which basically would be whichever one is listed first in the recipients list from the MTA's perspective.
That's not relevant, underneath it's the same information. You don't munge anything in the headers, you just adjust the recipient list for the outgoing message (RCPT info), and you deal with it again when 2 copies come in again. I don't see any other way to do this.
Yes you could decide on a per user basis (make duplicate-suppression an option, turned on by default, I suppose).
I think this is a separate issue, though obviously related. Sometimes you do want x-posts... sometimes you don't, that should be an option.
Just my handwaving Chris
| Christopher Petrilli | petrilli@amber.org

Participants (5): Barry A. Warsaw, Christian Tismer, Christopher G. Petrilli, Darren Henderson, Ken Manheimer