[Spambayes-checkins] spambayes fpfn.py,NONE,1.1 README.txt,1.25,1.26

Guido van Rossum gvanrossum@users.sourceforge.net
Tue, 24 Sep 2002 18:01:51 -0700


Update of /cvsroot/spambayes/spambayes
In directory usw-pr-cvs1:/tmp/cvs-serv30699

Modified Files:
	README.txt 
Added Files:
	fpfn.py 
Log Message:
Add a tiny utility to extract the filenames of false positives/negatives
from the full test run output.  (Tested with timcv.py output only.)

--- NEW FILE: fpfn.py ---
#! /usr/bin/env python
"""Extract false positive and false negative filenames from timcv.py output."""

import sys
import re

def cmpf(a, b):
    # Sort function that sorts by numerical value
    ma = re.search(r'(\d+)/(\d+)$', a)
    mb = re.search(r'(\d+)/(\d+)$', b)
    if ma and mb:
        xa, ya = map(int, ma.groups())
        xb, yb = map(int, mb.groups())
        return cmp((xa, ya), (xb, yb))
    else:
        return cmp(a, b)

def main():
    for name in sys.argv[1:]:
        try:
            f = open(name + ".txt")
        except IOError:
            f = open(name)
        print "===", name, "==="
        fp = []
        fn = []
        for line in f:
            if line.startswith('    new fp: '):
                fp.extend(eval(line[12:]))
            elif line.startswith('    new fn: '):
                fn.extend(eval(line[12:]))
        fp.sort(cmpf)
        fn.sort(cmpf)
        print "--- fp ---"
        for x in fp:
            print x
        print "--- fn ---"
        for x in fn:
            print x

if __name__ == '__main__':
    main()
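
For readers running a current Python 3 interpreter, the parsing and sorting steps of fpfn.py above can be sketched as follows. This is an illustrative adaptation, not part of the checkin. It assumes the same input shape the script relies on: lines beginning with '    new fp: ' or '    new fn: ', followed by a Python list literal of filenames. The names extract() and sort_key(), and the sample paths, are hypothetical. ast.literal_eval is swapped in for eval so that only literals are accepted, and a key function replaces the cmp-style comparator, which Python 3's list.sort no longer takes. One behavioral difference: names without a trailing number/number component are sorted after the numbered ones instead of being compared as plain strings pairwise.

```python
import ast
import re

FP_PREFIX = '    new fp: '
FN_PREFIX = '    new fn: '


def sort_key(name):
    # Sort by the trailing "<number>/<number>" pair when present,
    # mirroring the numeric comparison done by cmpf() in fpfn.py.
    m = re.search(r'(\d+)/(\d+)$', name)
    if m:
        return (0, tuple(int(g) for g in m.groups()), name)
    # Assumption: names without a numeric tail sort last, alphabetically.
    return (1, (), name)


def extract(lines):
    """Return (fp, fn) filename lists collected from test output lines."""
    fp, fn = [], []
    for line in lines:
        if line.startswith(FP_PREFIX):
            # literal_eval accepts only Python literals, unlike eval.
            fp.extend(ast.literal_eval(line[len(FP_PREFIX):].strip()))
        elif line.startswith(FN_PREFIX):
            fn.extend(ast.literal_eval(line[len(FN_PREFIX):].strip()))
    return sorted(fp, key=sort_key), sorted(fn, key=sort_key)


# Hypothetical sample input; real test output contains many other lines.
sample = [
    "    new fp: ['Data/Ham/Set2/10', 'Data/Ham/Set2/9']\n",
    "    new fn: ['Data/Spam/Set1/3']\n",
    "some unrelated line\n",
    "    new fp: ['Data/Ham/Set1/20']\n",
]
fp, fn = extract(sample)
print("--- fp ---")
for x in fp:
    print(x)
print("--- fn ---")
for x in fn:
    print(x)
```

With the key function, 'Set2/9' sorts before 'Set2/10', which a plain string sort would not do.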

Index: README.txt
===================================================================
RCS file: /cvsroot/spambayes/spambayes/README.txt,v
retrieving revision 1.25
retrieving revision 1.26
diff -C2 -d -r1.25 -r1.26
*** README.txt	24 Sep 2002 08:16:24 -0000	1.25
--- README.txt	25 Sep 2002 01:01:49 -0000	1.26
***************
*** 119,122 ****
--- 119,126 ----
      and the change in average f-p and f-n rates.
  
+ fpfn.py
+     Given one or more TestDriver output files, prints list of false
+     positive and false negative filenames, one per line.
+ 
  
  Test Data Utilities