28 Aug
2002
4:26 p.m.
"TK" == Tokio Kikuchi <tkikuchi@is.kochi-u.ac.jp> writes:
>> An interesting issue came up today while we were playing with a
>> Bayesian spam classifier. Mailman's archives aren't very
>> clean. Messages are sent to the archiver after various
>> header munging steps, including the adding of the List-*
>> headers and the Subject prefix.
TK> The headers are in the raw archive and not in the monthly (or
TK> quarterly, weekly) text format archive. I would rather stop
TK> publicizing the raw archive even if the other archives are
TK> publicly accessible. At least it should be configurable (in
TK> mm_cfg).
Some headers are stripped before being added to the quarterly/weekly mini-archive, but both see messages /after/ they've been munged.
(On the second point, I'll try to look at patch #594771. That would seem like a good opportunity to make raw archives optional.)
>> We still want to do some munging, e.g. for anonymous lists.
>> This tells me that we may want to move ToArchive up before
>> CookHeaders in the global pipeline.
TK> We use a modified version of mailman 2.0.x in Japan and we
TK> like a feature of adding numbers in the subject header. The
TK> users tend to reference articles by the number not by the
TK> archive URL. So, we want the archive to be munged.
That seems to be the consensus, i.e. the archive should reflect what the members get. Makes sense -- if you want a more pristine archive, you can interpose a tee to a file before the message gets to Mailman, or you could add a different handler module. I'll leave things as is.
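For anyone who does want the pristine copy, here is a rough sketch of what such a handler module might look like. It assumes the usual process(mlist, msg, msgdata) handler entry point; the module name RawTee, the RAW_ARCHIVE_PATH location, and the insert_before() helper are made up for illustration, and the pipeline shown in the usage note is abbreviated, not the real default.

```python
# Hypothetical handler module (call it RawTee.py): saves each message to a
# flat file before any header munging happens. Names here are illustrative.
import os
import tempfile

# Hypothetical location for the untouched copies; pick a real path per site.
RAW_ARCHIVE_PATH = os.path.join(tempfile.gettempdir(), "raw-archive.mbox")


def process(mlist, msg, msgdata):
    """Append the message, as received, to RAW_ARCHIVE_PATH.

    mlist and msgdata are accepted to match the handler signature
    but are not used in this sketch.
    """
    with open(RAW_ARCHIVE_PATH, "a") as fp:
        # unixfrom=True writes a "From " envelope line so the file
        # reads as a simple mbox.
        fp.write(msg.as_string(unixfrom=True))
        fp.write("\n")


def insert_before(pipeline, new_handler, anchor):
    """Return a copy of pipeline with new_handler placed just before anchor."""
    result = list(pipeline)
    result.insert(result.index(anchor), new_handler)
    return result
```

With an abbreviated pipeline such as ['SpamDetect', 'CookHeaders', 'ToArchive'], calling insert_before(pipeline, 'RawTee', 'CookHeaders') puts the tee ahead of the munging step while ToArchive keeps seeing what the members get. Whether you wire that into the global pipeline or a per-list one is a site decision.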
TK> BTW, I'm preparing a patch for numbering the subject prefix.
Cool. But this is likely a new feature that will have to wait until after 2.1 final.
Thanks, -Barry