[Spambayes] spamprob combining

Gary Robinson grobinson@transpose.com
Wed, 09 Oct 2002 21:06:56 -0400


The thing about the geometric mean is that it is much more sensitive to
numbers near 0, so the S/(S+H) technique is biased in that way.

If you want to try something like that, I would suggest using the ARITHMETIC
means in computing S and H and again using S/(S+H). That would remove that
bias.

It wouldn't be invoking that optimality theorem, but whatever works...

It really seems, if only as a matter of getting educated, that the arithmetic
approach is worth trying, provided it doesn't take a lot of trouble.
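To make the two variants concrete, here's a minimal sketch (my own code, not
from the SpamBayes source; the function names are hypothetical) of the
geometric-mean S/(S+H) combining described in Tim's message below, next to
the arithmetic-mean variant suggested above:

```python
import math

def geometric_combine(probs):
    """S/(S+H) with geometric means: S is the geometric mean of the
    spamprobs, H the geometric mean of (1 - p)."""
    n = len(probs)
    S = math.exp(sum(math.log(p) for p in probs) / n)
    H = math.exp(sum(math.log(1.0 - p) for p in probs) / n)
    return S / (S + H)

def arithmetic_combine(probs):
    """The same S/(S+H) form, but with arithmetic means, which removes
    the geometric mean's extra sensitivity to probabilities near 0."""
    n = len(probs)
    S = sum(probs) / n
    H = sum(1.0 - p for p in probs) / n
    return S / (S + H)
```

A single prob near 0 drags the geometric version much further toward ham than
the arithmetic one, which is exactly the bias described above.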

>"but more sensitive to overwhelming amounts of evidence than Gary-combining"

From the email you sent at 1:02 PM yesterday:

0.40    0
0.45    2 *
0.50  412 *********
0.55 3068 *************************************************************
0.60 1447 *****************************
0.65   71 **
0.70    0

One thing I'd like to be clearer on: if I understand the experiment
correctly, you set 10 of the probs to .99 and the other 40 were random.

What percentage actually ended up as > .5, without regard to HOW MUCH over
.5?

> It's hard to know what to make of this, especially in light of the claim
> that Gary-combining has been proven to be the most sensitive possible test
> for rejecting the hypothesis that a collection of probs is uniformly
> distributed.

It's not (S-H)/(S+H) that is the most sensitive test (under certain
conditions); it's that the geometric mean approach for computing S gives a
result that is MONOTONIC WITH a calculation which is the most sensitive.

The real technique would take S and feed it into an inverse chi-square
function with (in this experiment) 100 degrees of freedom. The output
(roughly speaking) would be the probability that that S (or a more extreme
one) might have occurred by chance alone.

Call these numbers S' and H' for S and H respectively.

The calculation (S-H)/(S+H) will be > 0 if and only if (S'-H')/(S'+H') is
> 0 (unless I've made some error).

So, as a binary indicator, the two are equivalent. However, if you used S'
and H', you would see something more like real probabilities, with
magnitudes that would probably be more attractive to you.

You could probably use a table to approximate the inverse chi-square calc
rather than actually doing the computations all the time.
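A rough sketch of that calculation (again my own code, not the SpamBayes
implementation; chi2Q is the standard series expansion for the chi-squared
survival function at an even number of degrees of freedom):

```python
import math

def chi2Q(x2, v):
    """Probability that a chi-squared variate with v degrees of
    freedom (v must be even) is >= x2 -- standard series expansion."""
    assert v % 2 == 0, "v must be even"
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def prob_of_geomean(probs):
    """Feed the geometric mean of n probs through the inverse
    chi-square: under the hypothesis that the probs are uniformly
    distributed, -2*sum(ln p_i) is chi-squared with 2n degrees of
    freedom.  Applied to the spamprobs this gives S'; applied to the
    (1 - p) values it gives H'."""
    n = len(probs)
    g = math.exp(sum(math.log(p) for p in probs) / n)  # geometric mean
    x2 = -2.0 * n * math.log(g)  # same as -2 * sum(ln p_i)
    return chi2Q(x2, 2 * n)
```

With 50 probs that's 2*50 = 100 degrees of freedom, matching the experiment
above.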

I didn't suggest doing that, at first, because I was interested in providing
a binary indicator and wanted to keep things simple -- and from the POV of
a binary indicator, it doesn't make any difference.

So, if it happens that you feel like taking the time to go "all the way" with
this approach, I would suggest actually computing S' and H' and seeing what
happens. I think you would like the results better -- I just didn't suggest
it at first because I didn't know the spread would be of such interest and I
wanted to keep things simple.

I think this would work better than the S/(S+H) approach, because if you use
geometric means, it's more sensitive to one condition than the other, and if
you use arithmetic means, you don't invoke the optimality theorem.

Of course, this is ALL speculative. But the probabilities involved will
DEFINITELY be of greater magnitude, and so give a better-defined spread, if
the inverse chi-square is used.


--Gary


-- 
Gary Robinson
CEO
Transpose, LLC
grobinson@transpose.com
207-942-3463
http://www.emergentmusic.com
http://radio.weblogs.com/0101454


> From: Tim Peters <tim.one@comcast.net>
> Date: Wed, 09 Oct 2002 20:34:15 -0400
> To: SpamBayes <spambayes@python.org>
> Cc: Gary Robinson <grobinson@transpose.com>
> Subject: RE: [Spambayes] spamprob combining
> 
> [Tim]
>> ...
>> Intuitively, it *seems* like it would be good to get something not so
>> insanely sensitive to random input as Paul-combining, but more
>> sensitive to overwhelming amounts of evidence than Gary-combining.
> 
> So there's a new option,
> 
> [Classifier]
> use_tim_combining: True
> 
> The comments (from Options.py) explain it:
> 
> # For the default scheme, use "tim-combining" of probabilities.  This
> # has no effect under the central-limit schemes.  Tim-combining is a
> # kind of cross between Paul Graham's and Gary Robinson's combining
> # schemes.  Unlike Paul's, it's never crazy-certain, and compared to
> # Gary's, in Tim's tests it greatly increased the spread between mean
> # ham-scores and spam-scores, while simultaneously decreasing the
> # variance of both.  Tim needed a higher spam_cutoff value for best
> # results, but spam_cutoff is less touchy than under Gary-combining.
> use_tim_combining: False
> 
> "Tim combining" simply takes the geometric mean of the spamprobs as a
> measure of spamminess S, and the geometric mean of 1-spamprob as a measure
> of hamminess H, then returns S/(S+H) as "the score".  This is well-behaved
> when fed random, uniformly distributed probabilities, but isn't reluctant to
> let an overwhelming number of extreme clues lead it to an extreme conclusion
> (although you're not going to see it give Graham-like 1e-30 or
> 1.0000000000000 scores).
> 
> Don't use a central-limit scheme with this (it has no effect on those).  If
> you test it, use whatever variations on the "all default" scheme you usually
> use, but it will probably help to boost spam_cutoff.  Note that the default
> max_discriminators is still 150, and that's what I used below.
> 
> Here's a 10-set cross-validation run on my data, restricted to 100 ham and
> 100 spam per set, with all defaults, except
> 
>                   before   after
>                   ------   -----
> use_tim_combining   False    True
> spam_cutoff         0.55     0.615
> 
> 
> -> <stat> tested 100 hams & 100 spams against 900 hams & 900 spams
>  [ditto 19 times]
> 
> false positive percentages
>   0.000  0.000  tied
>   1.000  0.000  won   -100.00%
>   1.000  1.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
> 
> won   1 times
> tied  9 times
> lost  0 times
> 
> total unique fp went from 2 to 1 won    -50.00%
> mean fp % went from 0.2 to 0.1 won    -50.00%
> 
> false negative percentages
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   1.000  1.000  tied
>   0.000  0.000  tied
> 
> won   0 times
> tied 10 times
> lost  0 times
> 
> total unique fn went from 1 to 1 tied
> mean fn % went from 0.1 to 0.1 tied
> 
> The real story here is in the score distributions; contrary to what the
> comment said above, the ham-score variance increased with this little data:
> 
> ham mean                     ham sdev
> 30.63   18.80  -38.62%        6.03    6.83  +13.27%
> 29.31   17.35  -40.81%        5.48    6.84  +24.82%
> 29.96   18.50  -38.25%        6.95    9.02  +29.78%
> 29.66   18.12  -38.91%        5.89    6.81  +15.62%
> 29.51   17.34  -41.24%        5.73    6.71  +17.10%
> 29.40   17.43  -40.71%        5.73    6.61  +15.36%
> 29.75   17.74  -40.37%        5.76    6.96  +20.83%
> 29.71   18.17  -38.84%        5.97    6.48   +8.54%
> 31.98   20.41  -36.18%        5.96    8.02  +34.56%
> 29.83   18.11  -39.29%        4.75    5.41  +13.89%
> 
> ham mean and sdev for all runs
> 29.97   18.20  -39.27%        5.90    7.08  +20.00%
> 
> spam mean                    spam sdev
> 79.23   88.38  +11.55%        6.96    5.52  -20.69%
> 79.40   88.70  +11.71%        7.00    5.64  -19.43%
> 78.68   88.06  +11.92%        6.69    5.13  -23.32%
> 79.65   89.01  +11.75%        7.20    5.22  -27.50%
> 79.91   88.87  +11.21%        6.35    4.67  -26.46%
> 80.47   89.16  +10.80%        7.22    6.06  -16.07%
> 80.94   89.78  +10.92%        6.60    4.45  -32.58%
> 80.30   89.41  +11.34%        6.95    5.49  -21.01%
> 78.54   87.70  +11.66%        7.30    6.45  -11.64%
> 80.06   89.06  +11.24%        6.98    5.43  -22.21%
> 
> spam mean and sdev for all runs
> 79.72   88.81  +11.40%        6.97    5.47  -21.52%
> 
> ham/spam mean difference: 49.75 70.61 +20.86
> 
> So before, the score equidistant from both means was 52.78, at 3.87 sdevs
> from each; after, it was 58.03, at 5.63 sdevs from each.  The populations
> are much better separated by this measure.
> 
> Histograms before:
> 
> -> <stat> Ham scores for all runs: 1000 items; mean 29.97; sdev 5.90
> -> <stat> min 13.521; median 29.6919; max 60.8937
> * = 2 items
> ...
> 13  2 *
> 14  0
> 15  2 *
> 16  8 ****
> 17  4 **
> 18  9 *****
> 19 17 *********
> 20 14 *******
> 21 16 ********
> 22 24 ************
> 23 38 *******************
> 24 47 ************************
> 25 62 *******************************
> 26 65 *********************************
> 27 69 ***********************************
> 28 73 *************************************
> 29 70 ***********************************
> 30 76 **************************************
> 31 70 ***********************************
> 32 61 *******************************
> 33 51 **************************
> 34 50 *************************
> 35 34 *****************
> 36 30 ***************
> 37 27 **************
> 38 18 *********
> 39 12 ******
> 40 11 ******
> 41 13 *******
> 42  2 *
> 43  5 ***
> 44  8 ****
> 45  2 *
> 46  1 *
> 47  3 **
> 48  1 *
> 49  0
> 50  3 **
> 51  0
> 52  0
> 53  0
> 54  0
> 55  1 *
> 56  0
> 57  0
> 58  0
> 59  0
> 60  1 *
> ...
> 
> -> <stat> Spam scores for all runs: 1000 items; mean 79.72; sdev 6.97
> -> <stat> min 52.3428; median 79.9799; max 98.1879
> * = 2 items
> ...
> 52  1 *
> 53  0
> 54  0
> 55  0
> 56  3 **
> 57  1 *
> 58  0
> 59  1 *
> 60  4 **
> 61  4 **
> 62  4 **
> 63  3 **
> 64  4 **
> 65  7 ****
> 66  9 *****
> 67 10 *****
> 68 13 *******
> 69 16 ********
> 70 26 *************
> 71 18 *********
> 72 29 ***************
> 73 35 ******************
> 74 40 ********************
> 75 39 ********************
> 76 56 ****************************
> 77 52 **************************
> 78 50 *************************
> 79 76 **************************************
> 80 60 ******************************
> 81 77 ***************************************
> 82 45 ***********************
> 83 61 *******************************
> 84 50 *************************
> 85 43 **********************
> 86 41 *********************
> 87 33 *****************
> 88 19 **********
> 89 11 ******
> 90 11 ******
> 91  8 ****
> 92  2 *
> 93  9 *****
> 94  4 **
> 95  9 *****
> 96  2 *
> 97 11 ******
> 98  3 **
> 99  0
> 
> Histograms after:
> 
> -> <stat> Ham scores for all runs: 1000 items; mean 18.20; sdev 7.08
> -> <stat> min 5.6946; median 17.1757; max 73.1302
> * = 2 items
> ...
> 5  1 *
> 6 13 *******
> 7 16 ********
> 8 25 *************
> 9 22 ***********
> 10 37 *******************
> 11 45 ***********************
> 12 56 ****************************
> 13 70 ***********************************
> 14 61 *******************************
> 15 66 *********************************
> 16 79 ****************************************
> 17 63 ********************************
> 18 59 ******************************
> 19 59 ******************************
> 20 56 ****************************
> 21 47 ************************
> 22 36 ******************
> 23 37 *******************
> 24 32 ****************
> 25  9 *****
> 26 20 **********
> 27 17 *********
> 28  8 ****
> 29  7 ****
> 30 11 ******
> 31  6 ***
> 32  7 ****
> 33  5 ***
> 34  4 **
> 35  2 *
> 36  2 *
> 37  6 ***
> 38  1 *
> 39  0
> 40  3 **
> 41  3 **
> 42  0
> 43  1 *
> 44  1 *
> 45  1 *
> 46  0
> 47  1 *
> 48  0
> 49  0
> 50  2 *
> 51  1 *
> 52  0
> 53  0
> 54  0
> 55  0
> 56  0
> 57  0
> 58  0
> 59  0
> 60  0
> 61  1 *
> 62  0
> 63  0
> 64  0
> 65  0
> 66  0
> 67  0
> 68  0
> 69  0
> 70  0
> 71  0
> 72  0
> 73  1 *
> 
> -> <stat> Spam scores for all runs: 1000 items; mean 88.81; sdev 5.47
> -> <stat> min 54.9382; median 89.5188; max 98.3805
> * = 2 items
> ...
> 54   1 *
> 55   0
> 56   0
> 57   0
> 58   0
> 59   0
> 60   0
> 61   0
> 62   0
> 63   1 *
> 64   3 **
> 65   0
> 66   1 *
> 67   0
> 68   2 *
> 69   2 *
> 70   3 **
> 71   3 **
> 72   2 *
> 73   2 *
> 74   4 **
> 75   4 **
> 76   6 ***
> 77   8 ****
> 78   8 ****
> 79   6 ***
> 80  12 ******
> 81  25 *************
> 82  26 *************
> 83  25 *************
> 84  39 ********************
> 85  58 *****************************
> 86  70 ***********************************
> 87  64 ********************************
> 88  74 *************************************
> 89 106 *****************************************************
> 90  85 *******************************************
> 91  62 *******************************
> 92  86 *******************************************
> 93  79 ****************************************
> 94  37 *******************
> 95  23 ************
> 96  42 *********************
> 97  25 *************
> 98   6 ***
> 99   0
> 
> There are snaky tails in either case, but "the middle ground" here is
> larger, sparser, and still contains the errors.
> 
> Across my full test data, which I actually ran first, you can ignore the
> "won/lost" business; I had spam_cutoff at 0.55 for both runs, and the
> overall results would have been virtually identical had I boosted
> spam_cutoff in the second run (recall that I can't demonstrate an
> improvement on this data anymore!  I can only determine whether something is
> a disaster, and this ain't).
> 
> -> <stat> tested 2000 hams & 1400 spams against 18000 hams & 12600 spams
>  [ditto 19 times]
> ...
> false positive percentages
>   0.000  0.050  lost  +(was 0)
>   0.000  0.050  lost  +(was 0)
>   0.000  0.050  lost  +(was 0)
>   0.000  0.000  tied
>   0.050  0.100  lost  +100.00%
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.000  0.000  tied
>   0.050  0.050  tied
> 
> won   0 times
> tied  6 times
> lost  4 times
> 
> total unique fp went from 2 to 6 lost  +200.00%
> mean fp % went from 0.01 to 0.03 lost  +200.00%
> 
> false negative percentages
>   0.000  0.000  tied
>   0.071  0.071  tied
>   0.000  0.000  tied
>   0.071  0.071  tied
>   0.143  0.071  won    -50.35%
>   0.143  0.000  won   -100.00%
>   0.143  0.143  tied
>   0.143  0.000  won   -100.00%
>   0.071  0.000  won   -100.00%
>   0.000  0.000  tied
> 
> won   4 times
> tied  6 times
> lost  0 times
> 
> total unique fn went from 11 to 5 won    -54.55%
> mean fn % went from 0.0785714285714 to 0.0357142857143 won    -54.55%
> 
> ham mean                     ham sdev
> 25.65   10.68  -58.36%        5.67    5.44   -4.06%
> 25.61   10.68  -58.30%        5.50    5.29   -3.82%
> 25.57   10.68  -58.23%        5.67    5.49   -3.17%
> 25.66   10.71  -58.26%        5.54    5.27   -4.87%
> 25.42   10.55  -58.50%        5.72    5.71   -0.17%
> 25.51   10.43  -59.11%        5.39    5.11   -5.19%
> 25.65   10.40  -59.45%        5.59    5.29   -5.37%
> 25.61   10.51  -58.96%        5.41    5.21   -3.70%
> 25.84   10.80  -58.20%        5.48    5.30   -3.28%
> 25.81   10.85  -57.96%        5.81    5.73   -1.38%
> 
> ham mean and sdev for all runs
> 25.63   10.63  -58.53%        5.58    5.39   -3.41%
> 
> spam mean                    spam sdev
> 83.86   93.17  +11.10%        7.09    4.55  -35.83%
> 83.64   93.16  +11.38%        6.83    4.52  -33.82%
> 83.27   92.91  +11.58%        6.81    4.52  -33.63%
> 83.82   93.14  +11.12%        6.88    4.67  -32.12%
> 83.89   93.29  +11.21%        6.65    4.56  -31.43%
> 83.78   93.11  +11.14%        6.96    4.72  -32.18%
> 83.42   93.00  +11.48%        6.82    4.74  -30.50%
> 83.86   93.29  +11.24%        6.71    4.55  -32.19%
> 83.88   93.22  +11.13%        6.98    4.71  -32.52%
> 83.75   93.28  +11.38%        6.65    4.32  -35.04%
> 
> spam mean and sdev for all runs
> 83.72   93.16  +11.28%        6.84    4.59  -32.89%
> 
> ham/spam mean difference: 58.09 82.53 +24.44
> 
> So the equidistant score changed from 51.73 at 4.68 sdevs from each mean, to
> 55.20 at 8.27 sdevs from each.  That's big.
> 
> The "after" histograms had 200 buckets in this run:
> 
> -> <stat> Ham scores for all runs: 20000 items; mean 10.63; sdev 5.39
> -> <stat> min 0.281945; median 9.69929; max 81.9673
> * = 17 items
> 0.0   7 *
> 0.5  13 *
> 1.0  21 **
> 1.5  41 ***
> 2.0  86 ******
> 2.5 166 **********
> 3.0 239 ***************
> 3.5 326 ********************
> 4.0 466 ****************************
> 4.5 554 *********************************
> 5.0 642 **************************************
> 5.5 701 ******************************************
> 6.0 793 ***********************************************
> 6.5 804 ************************************************
> 7.0 933 *******************************************************
> 7.5 972 **********************************************************
> 8.0 997 ***********************************************************
> 8.5 934 *******************************************************
> 9.0 947 ********************************************************
> 9.5 939 ********************************************************
> 10.0 839 **************************************************
> 10.5 786 ***********************************************
> 11.0 752 *********************************************
> 11.5 760 *********************************************
> 12.0 636 **************************************
> 12.5 606 ************************************
> 13.0 554 *********************************
> 13.5 483 *****************************
> 14.0 461 ****************************
> 14.5 399 ************************
> 15.0 360 **********************
> 15.5 317 *******************
> 16.0 275 *****************
> 16.5 224 **************
> 17.0 193 ************
> 17.5 169 **********
> 18.0 172 ***********
> 18.5 154 **********
> 19.0 153 *********
> 19.5  92 ******
> 20.0 104 *******
> 20.5  99 ******
> 21.0  74 *****
> 21.5  73 *****
> 22.0  73 *****
> 22.5  50 ***
> 23.0  38 ***
> 23.5  50 ***
> 24.0  38 ***
> 24.5  34 **
> 25.0  26 **
> 25.5  39 ***
> 26.0  24 **
> 26.5  34 **
> 27.0  18 **
> 27.5  15 *
> 28.0  20 **
> 28.5  15 *
> 29.0  14 *
> 29.5  15 *
> 30.0  12 *
> 30.5  15 *
> 31.0  14 *
> 31.5  10 *
> 32.0  12 *
> 32.5   6 *
> 33.0  10 *
> 33.5   4 *
> 34.0   8 *
> 34.5   5 *
> 35.0   5 *
> 35.5   6 *
> 36.0   7 *
> 36.5   4 *
> 37.0   2 *
> 37.5   3 *
> 38.0   1 *
> 38.5   4 *
> 39.0   6 *
> 39.5   2 *
> 40.0   2 *
> 40.5   5 *
> 41.0   0
> 41.5   2 *
> 42.0   3 *
> 42.5   3 *
> 43.0   1 *
> 43.5   2 *
> 44.0   1 *
> 44.5   2 *
> 45.0   1 *
> 45.5   1 *
> 46.0   2 *
> 46.5   0
> 47.0   3 *
> 47.5   0
> 48.0   1 *
> 48.5   1 *
> 49.0   1 *
> 49.5   0
> 50.0   1 *
> 50.5   0
> 51.0   2 *
> 51.5   0
> 52.0   1 *
> 52.5   0
> 53.0   0
> 53.5   1 *
> 54.0   1 *
> 54.5   2 *
> 55.0   0
> 55.5   0
> 56.0   1 *
> 56.5   1 *
> 57.0   0
> 57.5   0
> 58.0   0
> 58.5   1 *
> 59.0   0
> 59.5   0
> 60.0   0
> 60.5   0
> 61.0   1 *
> 61.5   0
> 62.0   0
> 62.5   0
> 63.0   0
> 63.5   0
> 64.0   0
> 64.5   0
> 65.0   0
> 65.5   0
> 66.0   0
> 66.5   0
> 67.0   0
> 67.5   0
> 68.0   0
> 68.5   0
> 69.0   0
> 69.5   0
> 70.0   1 *  the lady with the long & obnoxious employer-generated sig
> 70.5   0
> 71.0   0
> 71.5   0
> 72.0   0
> 72.5   0
> 73.0   0
> 73.5   0
> 74.0   0
> 74.5   0
> 75.0   0
> 75.5   0
> 76.0   0
> 76.5   0
> 77.0   0
> 77.5   0
> 78.0   0
> 78.5   0
> 79.0   0
> 79.5   0
> 80.0   0
> 80.5   0
> 81.0   0
> 81.5   1 *  the verbatim quote of a long Nigerian-scam spam
> ...
> 
> -> <stat> Spam scores for all runs: 14000 items; mean 93.16; sdev 4.59
> -> <stat> min 24.3497; median 93.8141; max 99.6769
> * = 15 items
> ...
> 24.0   1 *  not really sure -- it's a giant base64-encoded plain text file
> 24.5   0
> 25.0   0
> 25.5   0
> 26.0   0
> 26.5   0
> 27.0   0
> 27.5   0
> 28.0   0
> 28.5   0
> 29.0   1 *  the spam with the uuencoded body we throw away
> 29.5   0
> 30.0   0
> 30.5   0
> 31.0   0
> 31.5   0
> 32.0   0
> 32.5   0
> 33.0   0
> 33.5   0
> 34.0   0
> 34.5   0
> 35.0   0
> 35.5   0
> 36.0   0
> 36.5   0
> 37.0   0
> 37.5   0
> 38.0   0
> 38.5   0
> 39.0   0
> 39.5   0
> 40.0   0
> 40.5   0
> 41.0   0
> 41.5   0
> 42.0   0
> 42.5   0
> 43.0   0
> 43.5   0
> 44.0   0
> 44.5   0
> 45.0   0
> 45.5   0
> 46.0   1 *  Hello, my Name is BlackIntrepid
> 46.5   0
> 47.0   0
> 47.5   0
> 48.0   0
> 48.5   0
> 49.0   0
> 49.5   0
> 50.0   0
> 50.5   0
> 51.0   0
> 51.5   0
> 52.0   0
> 52.5   0
> 53.0   0
> 53.5   1 *  unclear; a collection of webmaster links
> 54.0   1 *  Susan makes a propsal (sic) to Tim
> 54.5   0
> 55.0   1 *
> 55.5   0
> 56.0   0
> 56.5   1 *
> 57.0   2 *
> 57.5   0
> 58.0   0
> 58.5   1 *
> 59.0   0
> 59.5   0
> 60.0   1 *
> 60.5   2 *
> 61.0   1 *
> 61.5   1 *
> 62.0   0
> 62.5   1 *
> 63.0   1 *
> 63.5   0
> 64.0   1 *
> 64.5   1 *
> 65.0   0
> 65.5   1 *
> 66.0   1 *
> 66.5   2 *
> 67.0   4 *
> 67.5   2 *
> 68.0   0
> 68.5   1 *
> 69.0   0
> 69.5   3 *
> 70.0   1 *
> 70.5   5 *
> 71.0   5 *
> 71.5   3 *
> 72.0   4 *
> 72.5   3 *
> 73.0   3 *
> 73.5   6 *
> 74.0   3 *
> 74.5   4 *
> 75.0   8 *
> 75.5   8 *
> 76.0  10 *
> 76.5  10 *
> 77.0  10 *
> 77.5  17 **
> 78.0  14 *
> 78.5  27 **
> 79.0  16 **
> 79.5  23 **
> 80.0  28 **
> 80.5  29 **
> 81.0  37 ***
> 81.5  37 ***
> 82.0  46 ****
> 82.5  55 ****
> 83.0  47 ****
> 83.5  53 ****
> 84.0  58 ****
> 84.5  68 *****
> 85.0  86 ******
> 85.5 118 ********
> 86.0 135 *********
> 86.5 159 ***********
> 87.0 165 ***********
> 87.5 178 ************
> 88.0 209 **************
> 88.5 231 ****************
> 89.0 299 ********************
> 89.5 391 ***************************
> 90.0 425 *****************************
> 90.5 402 ***************************
> 91.0 501 **********************************
> 91.5 582 ***************************************
> 92.0 636 *******************************************
> 92.5 667 *********************************************
> 93.0 713 ************************************************
> 93.5 685 **********************************************
> 94.0 610 *****************************************
> 94.5 621 ******************************************
> 95.0 721 *************************************************
> 95.5 735 *************************************************
> 96.0 870 **********************************************************
> 96.5 742 **************************************************
> 97.0 449 ******************************
> 97.5 447 ******************************
> 98.0 556 **************************************
> 98.5 561 **************************************
> 99.0 264 ******************
> 99.5 171 ************
> 
> The mistakes are all familiar; the good news is that "the normal cases" are
> far removed from what might plausibly be called a middle ground.  For
> example, if we called the region from 40 thru 70 here "the middle ground",
> and kicked those out for manual review, there would be very few msgs to
> review, but they would contain almost all the mistakes.
> 
> How does this do on your data?  I'm in favor of what works <wink>.
>