[Python-Dev] Re: [Python-checkins] python/nondist/sandbox/spambayes GBayes.py,1.7,1.8

Zack Weinberg zack@codesourcery.com
Wed, 21 Aug 2002 10:03:53 -0700


On Wed, Aug 21, 2002 at 02:22:26AM -0400, Eric S. Raymond wrote:
> 
> I'm on it.  The following is not yet working, but it's a straight road to get
> there....
> 
> There is a public spam-checker port.  Your client program sends it
> packets consisting of a list of header token counts.  You
> can send lots of these blocks; each one has to be under the maximum
> atomic-message size for sockets (I think that's 32K).  
> 
> The server accumulates the frequency counts you ship it until you say
> "OK, what is it?"  Does the Bayes test.  Ships you back a result.

My ISP-postmaster friend's reaction to that:

| As far as it goes, yes.  How would it learn?
|
| On a more mundane note, I'd like to see decoding of base64 in it.
|
| (Oh, and on a blue-sky note, has anyone taken up Graham's suggestion
| of having one of these things that looks at word pairs instead of
| words?)
|
| It's neat that ESR saw immediately that the daemon should be
| self-contained, no access to home directories.  SpamAssassin doesn't
| have a simple way of doing that, and [ISP] is modifying it to have
| one -- and you wouldn't believe the resistance to the proposed
| changes from some of the SA developers.  Some of them really seem
| to think that it's better and simpler to store user configuration
| in a database than to have the client send its config file to the
| server along with each message.

I remember you said you didn't want to do base64 decode because it was
too slow?
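
To make the two requests above concrete, here is a rough Python sketch of
decoding a base64 body before tokenizing, with optional word-pair tokens
alongside single words.  It is an illustration only and not taken from
GBayes.py: the function names, the token regex, and the latin-1 fallback
are all assumptions.

```python
import base64
import binascii
import re
from collections import Counter

# Hypothetical token pattern; the real tokenizer may differ.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'!-]+")


def decode_if_base64(body, transfer_encoding):
    """Return the decoded body if the Content-Transfer-Encoding value
    says base64; otherwise return the body unchanged."""
    if transfer_encoding.strip().lower() != "base64":
        return body
    try:
        raw = base64.b64decode(body)
    except (binascii.Error, ValueError):
        # Malformed base64: fall back to the undecoded text.
        return body
    # Assumption: latin-1 maps every byte to a character, so this
    # never raises, at the cost of mangling other charsets.
    return raw.decode("latin-1")


def token_counts(text, pairs=False):
    """Count lowercased word tokens; with pairs=True, also count
    adjacent word pairs joined by a single space."""
    words = [w.lower() for w in TOKEN_RE.findall(text)]
    counts = Counter(words)
    if pairs:
        counts.update(a + " " + b for a, b in zip(words, words[1:]))
    return counts
```

A client would run decode_if_base64() on each body part, then ship the
result of token_counts() to the server.  This says nothing about whether
the decode step is fast enough, which is the open question.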

zw