On 28 August 2002, Tim Peters said:
What's an acceptable false positive rate?
Speaking as one of the people who review suspected spam for python.org and rescue false positives, I would say that the more relevant figure is: how much suspected spam do I have to review every morning? Fewer than 10 messages would be peachy; right now it's around 5-20 messages per day.
Currently there are probably 1-3 FPs per day, although on a bad day there can be 5-10. (E.g. on 2002-08-21, six mailman-users posts from the same guy were all caught, mainly because his ISP added X-AntiAbuse, and his messages were multipart/alternative with unwrapped plain text. This is a perfect example of SpamAssassin screwing up royally.) 1-3 FPs/day I can live with, but the real burden is the manual review: I'd much rather have 5 FPs in a pool of 10 suspects than 1 FP out of 100 suspects.
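To put numbers on that trade-off, here is a minimal Python sketch. The function name review_cost is made up for illustration, and the two scenarios are just the hypothetical figures from the paragraph above, not measured data.

```python
def review_cost(suspects, false_positives):
    """Return (messages to review, messages reviewed per rescued FP).

    Illustrative helper only: the reviewer has to read every suspect,
    so the first number is the daily workload regardless of how many
    of those suspects turn out to be legitimate mail.
    """
    if false_positives == 0:
        return suspects, float("inf")
    return suspects, suspects / false_positives


# 5 FPs in a pool of 10 suspects vs. 1 FP in a pool of 100 suspects.
small_pool = review_cost(10, 5)
large_pool = review_cost(100, 1)

print("small pool:", small_pool)
print("large pool:", large_pool)
```

The small pool has five times as many false positives, but it costs a tenth of the reading, which is the figure that matters to whoever does the review.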
What do we get from SpamAssassin?
Recall the stats I posted this morning; the bulk of spam is in Chinese or Korean, and I have things set up so SpamAssassin never even sees it. I think the only way to meaningfully answer this question is to stash *everything* mail.python.org receives for a day or 10, spam and otherwise, and run it all through SA.
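Once such a corpus exists and each message has been hand-labelled, the tallying itself is simple. The sketch below is only an assumption about how one might do it: the tally function and the (is_spam, flagged) pair format are hypothetical, the sample list is invented, and it does not call SpamAssassin itself; the flagged value would have to come from wherever SA's verdict was recorded.

```python
from collections import Counter


def tally(results):
    """Count outcomes for an iterable of (is_spam, flagged) booleans.

    is_spam is the human label; flagged is the filter's verdict.
    """
    counts = Counter()
    for is_spam, flagged in results:
        if flagged:
            counts["true_positive" if is_spam else "false_positive"] += 1
        else:
            counts["false_negative" if is_spam else "true_negative"] += 1
    return counts


# Invented sample, purely to show the shape of the input.
sample = [
    (True, True),    # spam, caught
    (True, False),   # spam, missed
    (False, True),   # legitimate, wrongly caught (an FP)
    (False, False),  # legitimate, passed
    (False, False),  # legitimate, passed
]

counts = tally(sample)
# Everything flagged lands in the review pool, FP or not.
suspects = counts["true_positive"] + counts["false_positive"]
print(dict(counts), "suspects to review:", suspects)
```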