[spambayes-dev] SpamBayes server compliant w/ spamassassin
Jkx at Pythonfr
jkx at pythonfr.org
Sat Apr 24 23:22:11 EDT 2004
On Sunday 25 April 2004 04:31, Skip Montanaro wrote:
> jkx> 1) you still need to create a python process for every incoming
> jkx> mail (sb_bnfilter). And python, even if it is not heavy bloat,
> jkx> eats something like 4.5 MB of memory instead of the mere
> jkx> 500 KB of spamc
>
> The sb_bnfilter/sb_bnserver combination runs several times faster on my
> machine. It would probably be faster if you recoded sb_bnfilter.py in C.
> Feel free.
Faster than what?
- sb_filter?
- spamc + the code attached to my previous email?
But why should I rewrite sb_bnfilter in C, since sb_bnserver doesn't
fit my needs?
> jkx> 2) sb_bnserver needs to be launched by the user (through
> jkx> sb_bnfilter), and it is written that way, so it isn't
> jkx> system-wide filtering. spamc has some useful features like
> jkx> round-robin filtering. For example, if I need to dispatch a lot
> jkx> of mail to mailboxes (a mailing list, for example), it will fork
> jkx> n servers for every user .. and so on?
>
> I don't recall that you said you wanted a single system-wide filter.
> Spambayes isn't designed that way at any rate. It will require some
> significant effort.
Where is the significant effort?
I must be missing something. Have you read the code I provided?
It simply acts as one single server (hammie filter) for a large number
of users, but each of them has their own database.
- one and only one server (not one per user!)
- every user has their own db
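
To make the "one server, one db per user" idea concrete, here is a
minimal sketch in Python. It is not the code attached to the previous
email: the class name, the loader callable and the database file name
are all hypothetical, and the real hammie database would be opened by
whatever loader you pass in.

```python
import os
import threading


class PerUserStores:
    """One server process holding a separate store for each user.

    Hypothetical sketch: `loader` stands in for whatever opens a
    user's hammie database; it is not a SpamBayes API.
    """

    def __init__(self, home_root, db_name, loader):
        self.home_root = home_root  # assumed layout: <home_root>/<user>/<db_name>
        self.db_name = db_name      # hypothetical database file name
        self.loader = loader        # callable: path -> store object
        self._stores = {}
        self._lock = threading.Lock()

    def path_for(self, user):
        # Refuse names that could escape the user's home directory.
        if not user or os.sep in user or user in (".", ".."):
            raise ValueError("bad user name: %r" % (user,))
        return os.path.join(self.home_root, user, self.db_name)

    def get(self, user):
        # Open each user's store lazily, once, and reuse it afterwards.
        with self._lock:
            if user not in self._stores:
                self._stores[user] = self.loader(self.path_for(user))
            return self._stores[user]
```

The server would call `stores.get(user)` with the user name sent by the
client and score the message against that store, so 50 or 1000 accounts
still mean a single process.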
> jkx> I think sb_bn* is pretty nice for a system w/ only a few mail
> jkx> accounts and should perform very well for a burst of email
> jkx> dispatched to a single user, like after a fetchmail... but this
> jkx> isn't my goal.
>
> Some folks have experimented with using Spambayes for system-wide
> filtering. I don't know that anybody's produced any conclusive results.
What you mean by system-wide filtering is: using the same hammie
filter database for all the users.
Once more .. this is not what my code is for.
My code tries to address these problems:
- spawning a python for each incoming mail (hence spamc)
- having one daemon (or more) per user.
> That said, one approach might be to rework sb_bnserver.py to open several
> unix domain sockets (one per user) and listen on all of them. When a
> connection is made on a socket spin off a new thread to handle it and use
> that user's database to score the message. If the user doesn't have a
> database of their own, default to a general database.
Do you really want to open one Unix domain socket per user ?????
I usually work w/ about 50 users right now
(and I wrote this code to run on ~1000 accounts).
Another thing: I don't care about a 'general database'. That isn't the
goal; I want a system that is manageable for a large number of users.
> Once you have that working, you can rewrite sb_bnfilter.py in C to reduce
> memory consumption and maybe improve performance a bit. sb_bnserver.py
> could probably be sped up just by running it with psyco.
psyco has nothing to do with that. The trouble is exec'ing a python for
each email; this is a bad idea. That's why I use code ripped from
spamassassin, because
1) it is really efficient code
2) it is quite clear code (despite too many gotos)
3) it is system-wide:
- it uses syslogd
- it handles errors (you don't lose mail with it)
- it has round-robin capabilities
- and so on ..
I wrote this for sysadmins who want to run spambayes for a large number
of users and who want to easily manage the way mail is filtered:
- only one spambayes server
- all incoming mail is sent (through spamc) to this server
- and every user uses their own hammie database in their home directory.
So even if the server goes down for some strange reason, mail isn't lost
(spamc handles that perfectly).
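
The "mail isn't lost" behaviour can be sketched in a few lines of
Python. This only illustrates the fail-open idea; it is not spamc and
does not speak the spamd protocol. The function name and the trivial
"send everything, read everything back" exchange are assumptions made
for the example.

```python
import socket


def filter_or_passthrough(message, host, port, timeout=5.0):
    """Return the filtered message, or the original one on any failure.

    Hypothetical client: it sends the raw message bytes, half-closes
    the connection, and reads the reply until the server closes.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(message)
            sock.shutdown(socket.SHUT_WR)  # tell the server we are done sending
            chunks = []
            while True:
                data = sock.recv(65536)
                if not data:
                    break
                chunks.append(data)
        reply = b"".join(chunks)
        # An empty reply is treated like a failure: keep the original.
        return reply if reply else message
    except OSError:
        # Server down, connection refused, timeout: deliver unfiltered.
        return message
```

Whatever happens to the server, the delivery agent always gets a message
back, so a crash only means that some mail goes through unfiltered.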
Bye Bye