[Eric S. Raymond]
Bogofilter throws out words of length one and two.
Right, I saw that. It's something I'll run experiments against later. I'm running a 5x5 test grid (skipping the diagonal), and, as was also true in speech recognition, if I had been running against just one spam+ham training corpus and just one spam+ham prediction set, I would have erroneously concluded that various things are improvements, are regressions, or don't matter. Some ideas obtained from staring at mistakes from one test run turn out to be irrelevant, or even counterproductive, when applied to other test runs. The idea that some notion of "word" is important seems highly defensible <wink>, but beyond that I discount claims that aren't derived from a similarly paranoid testing setup.
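For anyone who wants to picture the grid, here is a minimal sketch, not the actual test driver. It assumes the 5x5 grid means five spam+ham sets, with a classifier trained on set i and scored on set j for every i != j (the diagonal, training and predicting on the same set, is skipped). The names grid_pairs, run_grid, and the evaluate callback are hypothetical, made up for illustration.

```python
from itertools import permutations

def grid_pairs(n):
    """Return all (train, test) index pairs with train != test.

    Assumption: cell (i, j) means "train on set i, predict on set j";
    skipping the diagonal leaves n * (n - 1) runs.
    """
    return list(permutations(range(n), 2))

def run_grid(n, evaluate):
    """Call the hypothetical evaluate(train_idx, test_idx) for every
    off-diagonal cell and collect the results keyed by (train, test)."""
    return {(i, j): evaluate(i, j) for i, j in grid_pairs(n)}

if __name__ == "__main__":
    # Stand-in scorer; a real one would train on set i and
    # return error rates measured on set j.
    results = run_grid(5, lambda i, j: (i, j))
    print(len(results), "runs")
```

The point of looking at all the cells rather than one is that a change counts as an improvement only if it holds up across most of the runs, not just the one you happened to stare at.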
... And bogofilter includes the headers. This is important, since otherwise you don't rate things like spamhaus addresses and sender names.
Of course -- the reasons I'm not using headers in these particular tests have been spelled out several times. They'll get added later, but for now I don't have a large enough test set where doing so doesn't render the classifier's job trivial.