[spambayes-dev] RE: [Spambayes] How low can you go?

Tim Peters tim.one at comcast.net
Mon Dec 22 13:35:30 EST 2003


[Seth Goodman]
> But for really unusual messages of the type you were concerned about,
> this may only happen once a year, or so, which is too long for a
> hapax-expiration scheme.

Yes, and I'm aware of that.

> I'm reposting an earlier post that didn't receive any comments (poor
> netiquette, I know) because I feel it's relevant to both comments made
> subsequently in this thread and the question of expiring hapaxes not
> recently used vs. whole messages.  I also asked for a little help
> getting started to be able to test some of my own and/or other
> people's ideas and would still like to do that, unless you folks
> would prefer otherwise.

Sorry, I can't make time to reply now.  Your original message is still
sitting in my queue (actually, several of your msgs are -- you write a lot,
you know <wink>), and I'll get to it when I can.

Let's do the easy ones:

> could someone please give me some hints as to which of the mapping
> features we've discussed in this thread exist

None.  We map string features to pairs of little integers (ham count and
spam count) now, and that's all.
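
To make "string features mapped to pairs of little integers" concrete, here is a
minimal sketch of that kind of mapping. It is an illustration only, not the
actual spambayes classes: the names TokenCounts, learn, lookup and hapaxes are
made up, and counting each distinct token once per message is an assumption of
the sketch.

```python
from collections import defaultdict


class TokenCounts:
    """Hypothetical toy store: token string -> [ham_count, spam_count]."""

    def __init__(self):
        # Each feature maps to a pair of small integers and nothing else.
        self.counts = defaultdict(lambda: [0, 0])

    def learn(self, tokens, is_spam):
        # Assumption: a token is counted once per message, however often
        # it repeats within that message.
        for tok in set(tokens):
            self.counts[tok][1 if is_spam else 0] += 1

    def lookup(self, tok):
        # Use .get so that probing an unseen token does not create an entry.
        ham, spam = self.counts.get(tok, (0, 0))
        return ham, spam

    def hapaxes(self):
        # A hapax here is a token seen in exactly one trained message.
        return sorted(t for t, (h, s) in self.counts.items() if h + s == 1)
```

Nothing in a pair like this records when a token was last seen or which
message it came from, so any expiration scheme would need extra information
stored alongside the counts.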

> or will soon exist

Also none.

> and where I can look for them?

For now, somewhere over the rainbow.

> I saw on spambayes-dev that there is discussion of a new database,

Also just speculation at this time.  We "have problems" with the most-common
Berkeley back end now (there are several other database back ends you
*could* configure spambayes to use already), and mostly those threads are
trying to find ways to sidestep those problems.  "Problems" == error
messages from Berkeley saying that the database is corrupted.  It's very
unusual to see these in the Outlook addin, but it has happened.  For some
people on Linux, they seem downright common.

> so I don't want to go off on a useless fork with the present db if
> that comes to pass.

Try to say something more specific about what you want to investigate, and
you'll probably get a better answer.



