Re: [Mailman-Users] Efficient handling of cross-posting

On Monday 28 January 2008 05:16 pm, Mark Sapiro wrote:
Well, depending on the archival method, a message can appear in both archives with only a single copy of it being stored. This can be achieved, for example, using symbolic (or even hard) links. Similarly, if a relational database is employed, the same message can be referred to from multiple places.
Hard links, for example, are how one IMAP server (Cyrus, I believe) stores messages sent to multiple recipients.
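As a rough illustration of the hard-link idea (a sketch only: the function name, the directory names and the file name are made up for the example, and this is not how Cyrus or Mailman actually lay out their storage), the message is written once and then linked into each further archive directory:

```python
import os
import tempfile


def store_cross_post(raw_message, archive_dirs, filename):
    """Write the message once, then hard-link it into every other archive.

    archive_dirs must all live on the same filesystem, because hard links
    cannot cross filesystem boundaries.
    """
    first = os.path.join(archive_dirs[0], filename)
    with open(first, "wb") as fh:
        fh.write(raw_message)

    paths = [first]
    for directory in archive_dirs[1:]:
        target = os.path.join(directory, filename)
        os.link(first, target)  # new directory entry, same data on disk
        paths.append(target)
    return paths


if __name__ == "__main__":
    # Hypothetical two-list layout inside a throwaway directory.
    with tempfile.TemporaryDirectory() as root:
        dirs = [os.path.join(root, name) for name in ("list-a", "list-b")]
        for d in dirs:
            os.mkdir(d)
        a, b = store_cross_post(b"Subject: hello\n\nbody\n", dirs, "0001.eml")
        print("same file:", os.path.samefile(a, b))
        print("link count:", os.stat(a).st_nlink)
```

Both paths then refer to the same stored data, so the cross-posted message takes up space only once while still being visible from each archive.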
But I was referring to search results only. Regardless of how the messages are stored, if I type a search string and select several of the mailing list archives to search through, the same message may appear in my search results more than once. That duplication should not happen -- I don't think anyone would disagree.
Yours,
-mi

Mikhail Teterin wrote:
I disagree. Completely.
I think you are trying to convince everyone that your way is the only proper way to do it, and I don't think you are seeing the wider issues here.
First, there is no search facility built into Mailman. Any such functionality comes from a third-party search facility, either integrated into Mailman by whoever installed it or provided by an outside source such as Google's site indexing.
Second, if a message is stored in multiple archives then it SHOULD show up in the search results if you search those multiple archives. It should only appear once if you search only a single archive. If you use the default pipermail archiver or something like MHonArc to archive your list traffic, those separate hits in each list archive could indeed be different, in that the thread links and the previous-post and next-post links in them may (most likely will) point to different places as people reply to the message.
Doing anything else is, IMO, unnecessarily complex and may serve to obscure the relationship of posts in a thread within a single list.
It would really be a nightmare to implement a multi-list-aware archiving system and search system. How do you keep track of what posts are stored where and which ones link to them in the forward and backward directions for each list (and those things ARE list dependent) if you only store a single copy? I know this can be implemented in a sophisticated RDBMS, but that requires a lot more complexity and having to install and run yet another software component.
Just remember, you didn't pay anything for the right to use the Mailman software; it is freely available under the GPL and is both written and maintained by volunteers. Feature requests are just that: requests, not mandates. The developers may choose to implement something or they may not. The fact that one of the core developers (Mark Sapiro) is telling you that this is not desirable should be an indication that the way it functions is probably not going to change. That leaves you with basically two options: develop the changes you want yourself, or live with the way it works.
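To give a sense of the bookkeeping such a database-backed archive would need, here is a minimal sketch using Python's sqlite3 module. The schema, the table and column names and the helper functions are all assumptions invented for illustration; nothing like this ships with Mailman or pipermail. The message body is stored once, while each list keeps its own ordering, so the previous and next links stay list dependent:

```python
import sqlite3

# Hypothetical schema: one row per distinct message, one row per
# (list, position) placement of that message.
SCHEMA = """
CREATE TABLE messages (
    message_id TEXT PRIMARY KEY,
    raw        TEXT NOT NULL
);
CREATE TABLE list_posts (
    list_name  TEXT    NOT NULL,
    seq        INTEGER NOT NULL,
    message_id TEXT    NOT NULL REFERENCES messages(message_id),
    PRIMARY KEY (list_name, seq)
);
"""


def open_archive():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def archive_post(conn, list_name, message_id, raw):
    """Store the body once, then record its position in this list."""
    conn.execute(
        "INSERT OR IGNORE INTO messages (message_id, raw) VALUES (?, ?)",
        (message_id, raw),
    )
    (seq,) = conn.execute(
        "SELECT COALESCE(MAX(seq), 0) + 1 FROM list_posts WHERE list_name = ?",
        (list_name,),
    ).fetchone()
    conn.execute(
        "INSERT INTO list_posts (list_name, seq, message_id) VALUES (?, ?, ?)",
        (list_name, seq, message_id),
    )
    return seq


def neighbours(conn, list_name, message_id):
    """Return (previous_id, next_id) for a message within one list."""
    row = conn.execute(
        "SELECT seq FROM list_posts WHERE list_name = ? AND message_id = ?",
        (list_name, message_id),
    ).fetchone()
    if row is None:
        raise KeyError((list_name, message_id))

    def at(n):
        hit = conn.execute(
            "SELECT message_id FROM list_posts WHERE list_name = ? AND seq = ?",
            (list_name, n),
        ).fetchone()
        return hit[0] if hit else None

    return at(row[0] - 1), at(row[0] + 1)
```

Even this toy version tracks only date order; real thread links (In-Reply-To and References chains per list) would add further tables, which is the extra complexity being described.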
Dragon
Venimus, Saltavimus, Bibimus (et naribus canium capti sumus)

Mikhail Teterin wrote:
I disagree. If there are multiple ways to access the same message, a search engine should find all of them. If you want the search engine to suppress the duplicates, that's fine, but that's a job for the search engine, not the archiver.
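A search front end could do that suppression by grouping hits on the Message-ID header. The following is only a sketch under assumptions: the hit dictionaries, their keys ("message_id", "subject", "url") and the function name are hypothetical and do not correspond to any particular search engine's output:

```python
def collapse_duplicates(hits):
    """Group search hits that share a Message-ID.

    Each hit is a dict with "message_id", "subject" and "url" keys.
    The result keeps one entry per message, in first-seen order, and
    lists every archive URL under which that message was found.
    """
    grouped = {}  # dicts preserve insertion order in Python 3.7+
    for hit in hits:
        entry = grouped.setdefault(
            hit["message_id"],
            {
                "message_id": hit["message_id"],
                "subject": hit["subject"],
                "urls": [],
            },
        )
        entry["urls"].append(hit["url"])
    return list(grouped.values())
```

The archiver still stores and indexes one copy per list; only the presentation of the results changes, and every per-list location remains reachable from the single collapsed entry.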
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

participants (3)
- Dragon
- Mark Sapiro
- Mikhail Teterin