[spambayes-dev] RE: [Spambayes] How low can you go?
T. Alexander Popiel
popiel at wolfskeep.com
Sat Dec 27 01:19:35 EST 2003
In message: <LNBBLJKPBEHFEDALKOLCGEIJIAAB.tim.one at comcast.net>
"Tim Peters" <tim.one at comcast.net> writes:
>
>In your (Alex's) recent "nonedge" incremental training experiment, it looks
>like your training data grew to about a 5.5::1 spam::ham ratio after 400
>days.
Yup. I have a nice picture now of the ratio over time at the bottom
of the report at:
http://www.wolfskeep.com/~popiel/spambayes/nonedge
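For anyone who wants to produce the same kind of ratio-over-time picture from their own training log, here is a minimal sketch. The function name and the (day, is_spam) input format are hypothetical; this is not code from the spambayes test harness, just one way the cumulative ratio could be tallied.

```python
from itertools import groupby

def ratio_over_time(trained):
    """Return a list of (day, spam_count, ham_count, ratio), one per day.

    `trained` is an iterable of (day, is_spam) pairs, sorted by day
    (hypothetical input format). Counts are cumulative; ratio is
    spam/ham, or None while no ham has been trained yet.
    """
    spam = ham = 0
    out = []
    for day, group in groupby(trained, key=lambda rec: rec[0]):
        for _, is_spam in group:
            if is_spam:
                spam += 1
            else:
                ham += 1
        # Avoid dividing by zero before the first ham is trained.
        ratio = spam / ham if ham else None
        out.append((day, spam, ham, ratio))
    return out
```

Plotting the last column against the first gives the spam::ham curve; a value of 5.5 on day 400 would correspond to the 5.5::1 imbalance mentioned above.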
>I know my personal classifiers start acting flaky whenever I've let
>them get imbalanced by more than 2::1 in either direction.
Interestingly enough, though, the nonedge regime did better than TOE
(train on everything), despite a worse imbalance.
>So if I had your data, I'd be curious to try variations that force better
>balance.
I'd love to... but I haven't been able to come up with anything which
maintains the balance better without extreme artificiality. If you
think of any regimes that make sense, I'd be more than happy to run them.
>You have enough data that it
>may well be more interesting to you to try variations including expiration
*grin* That's part of what's been burning my CPU ever since I posted
the last report. I'll have another report, including that, probably
within 3 days. Still have more to test... and my runs are taking
between 6 and 20 hours each, depending on the memory used by the
classifiers.
- Alex