[spambayes-dev] SpamBayes server compliant w/ spamassassin

Jkx at Pythonfr jkx at pythonfr.org
Sat Apr 24 23:22:11 EDT 2004


On Sunday 25 April 2004 04:31, Skip Montanaro wrote:
>     jkx> 1) you still need to create a python process for every incoming
>     jkx>    mail with sb_bnfilter. And python, even if it is not heavyweight
>     jkx>    bloat, eats something like 4.5 MB of memory instead of the mere
>     jkx>    500 KB of spamc
>
> The sb_bnfilter/sb_bnserver combination runs several times faster on my
> machine.  It would probably be faster if you recoded sb_bnfilter.py in C.
> Feel free.

Faster than what?
- sb_filter?
- spamc + the code attached to my previous email?

But why should I rewrite sb_bnfilter in C, since sb_bnserver doesn't
fit my needs?


>     jkx> 2) sb_bnserver needs to be launched by the user (through
>     jkx>    sb_bnfilter), and it is written that way, so it isn't
>     jkx>    system-wide filtering. spamc has some useful stuff like
>     jkx>    round-robin filtering .. For example, if I need to dispatch a
>     jkx>    lot of mail to mailboxes (a mailing list, for example), for
>     jkx>    every user it will fork n servers .. and so on?
> 
> I don't recall that you said you wanted a single system-wide filter.
> Spambayes isn't designed that way at any rate.  It will require some
> significant effort.

Where is the significant effort?
I am really missing something. Have you read the code I provided?
It just runs as one single server (hammie filter) for a large number
of users, but they all have their own database.
- one and only one server (not one per user!)
- every user has their own db
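
A minimal Python sketch of that layout, for illustration only (this is not
the code attached to the earlier mail). The class name, the `home_root`
layout, the `.hammiedb` file name and the `loader` callable are all
assumptions; `loader` stands in for whatever actually opens a hammie
database.

```python
import os
import threading


class PerUserStore:
    """One server process, one cached classifier object per user."""

    def __init__(self, loader, home_root="/home", db_name=".hammiedb"):
        self._loader = loader        # hypothetical: callable(path) -> classifier
        self._home_root = home_root  # assumption: homes live under one directory
        self._db_name = db_name      # assumption: database file name in each home
        self._cache = {}
        self._lock = threading.Lock()

    def path_for(self, user):
        # Reject names that could point outside the user's home directory.
        if not user or "/" in user or user.startswith("."):
            raise ValueError("bad user name: %r" % (user,))
        return os.path.join(self._home_root, user, self._db_name)

    def get(self, user):
        # Load the user's database on first use, then reuse it.
        with self._lock:
            if user not in self._cache:
                self._cache[user] = self._loader(self.path_for(user))
            return self._cache[user]
```

The point of the cache is that a burst of mail for 1000 accounts costs at
most 1000 database loads in one process, not one python process per mail.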


>     jkx> I think sb_bn* is pretty nice for a system w/ only a few mail
>     jkx> accounts and should perform very well for a burst of email
>     jkx> dispatch for a single user, like after a fetchmail... but this
>     jkx> isn't my goal.
>
> Some folks have experimented with using Spambayes for system-wide
> filtering. I don't know that anybody's produced any conclusive results.

What you mean by system-wide filtering is: using the same hammie
filter database for all the users.

Once more .. this is not what my code is made for.

My code tries to address these problems:
- spawning a python at each incoming mail (hence spamc)
- having one daemon (or more) per user.

> That said, one approach might be to rework sb_bnserver.py to open several
> unix domain sockets (one per user) and listen on all of them.  When a
> connection is made on a socket spin off a new thread to handle it and use
> that user's database to score the message.  If the user doesn't have a
> database of their own, default to a general database.

Do you really want to open one Unix domain socket per user?
I usually work w/ about 50 users right now ..
(and I wrote this code to run on ~1000 accounts ..).

Another thing: I don't care about a 'general database' .. this isn't the goal.
I want a system that is manageable for a large number of users ..


> Once you have that working, you can rewrite sb_bnfilter.py in C to reduce
> memory consumption and maybe improve performance a bit.  sb_bnserver.py
> could probably be sped up just by running it with psyco.

psyco has nothing to do with that. The trouble is 'exec a python' at each
email; this is a bad idea. That's why I use code ripped from spamassassin, because
1) it is really efficient code
2) it is quite clear code (despite too many gotos)
3) it is system-wide:
    - uses syslogd
    - handles errors (you don't lose mails w/ it)
    - has round-robin capabilities ..
    - and so on ..


I wrote this for sysadmins who want to run spambayes for a large number
of users, and who want to easily manage the way mails are filtered ..
- only one spambayes server
- all incoming mails are sent (through spamc) to this server
- and every user uses their own hammie database in their home.
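
For the server to pick the right database it has to learn the user name
from each request. A hedged Python sketch, assuming the request format spamc
sends as far as I understand it: a command line such as
`PROCESS SPAMC/1.2`, then headers like `User:` and `Content-length:`, a
blank line, then the message. The function name is made up for the example.

```python
def parse_request(data):
    """Split a spamd-style request into (command, headers, body)."""
    head, sep, body = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete request header")
    lines = head.decode("ascii").split("\r\n")
    command = lines[0].split()[0]            # e.g. PROCESS or CHECK
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Trust Content-length when present, otherwise take everything received.
    length = int(headers.get("content-length", len(body)))
    return command, headers, body[:length]
```

The `user` header is then the key used to choose that user's hammie
database before scoring the body.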

So even if the server goes down for some strange reason, mails aren't lost ..
(spamc handles that perfectly)
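
The fail-safe idea can be sketched in Python as follows. This is a
simplified illustration, not spamc's real logic or wire protocol: it sends
the raw message, reads the reply until the server closes the connection,
tries each server in the list in turn, and hands back the original message
untouched if none of them answers. The function name and the `servers`
list are assumptions.

```python
import socket


def filter_or_passthrough(message, servers, timeout=5.0):
    """Try each (host, port) in turn; return the message unchanged on failure."""
    for host, port in servers:
        try:
            with socket.create_connection((host, port), timeout=timeout) as s:
                s.sendall(message)
                s.shutdown(socket.SHUT_WR)   # signal that the request is complete
                chunks = []
                while True:
                    chunk = s.recv(65536)
                    if not chunk:
                        break
                    chunks.append(chunk)
            reply = b"".join(chunks)
            if reply:
                return reply                 # filtered message from the server
        except OSError:
            continue                         # refused or timed out: try the next one
    return message                           # no server answered: deliver as-is
```

Whatever happens to the server, the delivery agent always gets a message
back, so a crash only means unfiltered mail, not lost mail.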


Bye Bye 


