[spambayes-dev] Code changes
Tim Peters
tim.one at comcast.net
Sat Dec 27 22:07:56 EST 2003
The meaning of Outlook2000/export.py's -n option has changed. Here's the
checkin comment:
    INCOMPATIBLE CHANGE: the -n option now gives the number of Set
    subdirectories desired, instead of a number of msgs per Set subdir
    "to shoot for". If you want to run, e.g., 10-fold cross-validation,
    you have to have exactly 10 Set folders, and the # of msgs per folder
    is of much less importance. Also added a note recommending running
    rebal.py afterwards. rebal is the expert in setting up randomized
    Set subdirectories, and the export.py script probably should have
    stuck to just extracting msgs from Outlook.
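
To make the new meaning of -n concrete, here is a rough sketch of spreading msgs over a fixed number of Set buckets. It is not the code in export.py or rebal.py; the function name assign_to_sets, the seed parameter, and the "Set1".."SetN" bucket names are assumptions made up for illustration.

```python
import random

def assign_to_sets(msg_ids, nsets, seed=None):
    """Spread msg_ids over exactly nsets buckets (hypothetical helper).

    The bucket count is fixed by the caller; the number of msgs per
    bucket just falls out, and differs by at most one across buckets.
    """
    if nsets < 1:
        raise ValueError("nsets must be >= 1")
    rng = random.Random(seed)
    shuffled = list(msg_ids)
    rng.shuffle(shuffled)  # randomize so no bucket gets a biased slice
    buckets = {"Set%d" % (i + 1): [] for i in range(nsets)}
    for pos, msg_id in enumerate(shuffled):
        # Deal msgs out round-robin, like dealing cards to nsets players.
        buckets["Set%d" % (pos % nsets + 1)].append(msg_id)
    return buckets

if __name__ == "__main__":
    sets = assign_to_sets(range(103), 10, seed=0)
    for name in sorted(sets, key=lambda s: int(s[3:])):
        print(name, len(sets[name]))
```

With 103 msgs and 10 sets you always get exactly 10 buckets, three of them holding 11 msgs and the rest 10, which is what 10-fold cross-validation needs. Under the old meaning of -n you would instead pick a msgs-per-Set target and take whatever number of Sets came out of it.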
utilities/rebal.py has grown a -t option, which makes it (once again) easy
to use with a standard test setup. It was originally easy to use that way,
but grew -r and -s options, presumably added by someone with a non-standard
test setup. Unfortunately, those with a standard test setup had to use them
too, and they're both clumsy and error-prone to use with a standard test
setup. -t can't be used in the same run with -r or -s. Those with a
standard test setup no longer need to worry about -r or -s, just -t; vice
versa for those with a non-standard test setup.
The changes to testtools/sort+group.py discussed here have been checked in,
after fiddling to play nice with Python 2.2.3 too.