At 16:46 on Thursday, 27 November 2003, Barry Warsaw wrote:
For the next version of Mailman, I'd prefer to see something more generic if possible. That way a site could add SA, SB[1], or some other system. It may be too hard to do this since there are no standards here, but even then, I'd like to see something pluggable rather than tightly integrated.
I'd be happy to help code this pluggable filter, but I'd need your help with the interface design. I believe the biggest problems are:
- how do you plan to weight the different filters? (e.g. using coefficients instead of a rigid, first-match-wins pipeline)
- how do we plug each filter into the UI?
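To make the first question concrete, here is a rough sketch of what I mean by coefficients: every filter returns a spam probability and the scores are merged as a weighted average. All the names (`combined_score`, `classify`) and the cutoff values are made up for illustration; nothing here is existing Mailman or SB code.

```python
from typing import Callable, Sequence, Tuple

# Assumption: a filter is any callable that takes the raw message bytes
# and returns a spam probability between 0.0 (good) and 1.0 (spam).
SpamFilter = Callable[[bytes], float]


def combined_score(raw_message: bytes,
                   weighted_filters: Sequence[Tuple[SpamFilter, float]]) -> float:
    """Weighted average of all filter scores (no first-match-wins)."""
    total_weight = sum(weight for _, weight in weighted_filters)
    if total_weight <= 0:
        raise ValueError("at least one filter needs a positive weight")
    weighted_sum = sum(flt(raw_message) * weight
                       for flt, weight in weighted_filters)
    return weighted_sum / total_weight


def classify(score: float,
             ham_cutoff: float = 0.2,
             spam_cutoff: float = 0.9) -> str:
    """Map a score to 'good', 'spam' or 'unsure' (cutoffs are placeholders)."""
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "good"
    return "unsure"
```

With this shape a site could plug in SA, SB or anything else just by wrapping it in a callable, and the per-filter weight becomes a simple configuration value.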
I'd also like to have a per-list FIFO queue of pristine copies of all messages received, with a configurable max-size, so that when a message is mis-categorized as either good or spam (not unsure) we have a chance to train on the pristine version (not decorated, not header-cooked, not scrubbed, and so on) through the admin interface.
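Something like the following is what I have in mind for the queue. It is only an in-memory sketch under my own assumptions: the class name `PristineQueue`, its methods, and keying on the Message-ID are all hypothetical, and a real version would have to persist to disk per list.

```python
from collections import OrderedDict
from typing import List, Optional


class PristineQueue:
    """Bounded FIFO of untouched messages, keyed by message id."""

    def __init__(self, max_size: int) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._messages: "OrderedDict[str, bytes]" = OrderedDict()

    def add(self, message_id: str, raw_bytes: bytes) -> None:
        """Store the pristine copy, evicting the oldest entries if full."""
        # Re-adding an id moves it to the newest position.
        self._messages.pop(message_id, None)
        self._messages[message_id] = raw_bytes
        while len(self._messages) > self._max_size:
            self._messages.popitem(last=False)  # drop the oldest entry

    def get(self, message_id: str) -> Optional[bytes]:
        """Return the pristine copy, or None if it was never stored or already evicted."""
        return self._messages.get(message_id)

    def pending_ids(self) -> List[str]:
        """Ids still in the queue, oldest first (for the admin UI list)."""
        return list(self._messages)
```

Looking up by id covers the case where the admin pastes an id from a delivered message, while `pending_ids()` covers the case where the admin has to pick from a list.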
Use case A:
- Spammer sends a message to the list
- Message gets mis-categorized as good and forwarded to the list, but a pristine copy is held in a special queue.
- Admin notices the problem and opens the admin UI within a reasonable time
- Admin can recover the pristine copy of that message from the special queue (selecting from a list of still-queued messages, or better by pasting some id copied from the message header as received from the list)
- Admin can train one or all the filters on that pristine copy
Use case B:
- Subscriber sends a message to the list
- Message gets mis-categorized as spam and is neither sent to the list nor kept in the moderation queue, but a pristine copy is held in a special queue.
- Admin is notified of the problem (by angry Subscriber) and opens the admin UI within a reasonable time
- Admin can recover the pristine copy of that message from the special queue, selecting from a list of still-queued messages (no message header is available because the message was not forwarded)
- Admin can train one or all the filters on that pristine copy and/or force re-processing of this message so that subscribers will receive it.
Use case C:
- Someone sends a message to the list
- Message gets categorized as unsure and held in the moderation queue; a pristine copy is held in a special queue anyway.
- Admin is notified of the problem (by Mailman) and opens the admin UI within a reasonable time
- Admin can see the message in the moderation queue and decide what to do, including training on one or all the filters.
Use case D:
- Message was categorized correctly, or admin didn't react within the allowed timeframe, so the pristine copy is silently deleted from the special queue.
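For use case D, if the "allowed timeframe" is enforced by age rather than only by max-size, the cleanup could be as simple as the sketch below. The function name `purge_expired` and the dict layout (message id mapped to a stored-at timestamp plus the raw bytes) are my assumptions, just to show the idea.

```python
import time
from typing import Dict, List, Optional, Tuple


def purge_expired(queue: Dict[str, Tuple[float, bytes]],
                  max_age_seconds: float,
                  now: Optional[float] = None) -> List[str]:
    """Silently drop pristine copies older than max_age_seconds.

    `queue` maps message id -> (stored_at_timestamp, raw_bytes).
    Returns the ids that were removed.
    """
    if now is None:
        now = time.time()
    expired = [message_id
               for message_id, (stored_at, _raw) in queue.items()
               if now - stored_at > max_age_seconds]
    for message_id in expired:
        del queue[message_id]
    return expired
```

This could run from a periodic job, so nothing special is needed on the delivery path.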
I've been using the latest SB with my Evolution client and have been very impressed, although it does take a little bit of training.
For focused traffic (such as on a list), training on 50 good and 50 spam messages is enough for > 99.9% success. At least this is my experience.
-- 
Simone Piunno, chief architect
Wireless Solutions SPA - DADA group
Europe HQ, via Castiglione 25 Bologna
web:www.wseurope.com tel:+390512966811 fax:+390512966800
God is real, unless declared integer