[Python-Dev] RE: [Spambayes] Question (or possibly a bug report)

Mark Hammond mhammond@skippinet.com.au
Thu, 24 Jul 2003 14:13:04 +1000


[Tim]
> [Mark Hammond]
> > Hrm - OK - I bit the bullet, and re-booted as German locale.  If I
> > remove all calls to setlocale(), I can provoke some *very* strange
> > math errors.
> > Both:
> >
> >  File "E:\src\spambayes\Outlook2000\manager.py", line 664, in score
> >   return self.bayes.spamprob(bayes_tokenize(email), evidence)
> >  File "E:\src\spambayes\spambayes\classifier.py", line 236, in chi2_spamprob
> >    S = ln(S) + Sexp * LN2
> > exceptions.OverflowError: math range error
>
> Can you investigate this one a bit deeper?  My guess is that
>
>             S *= 1.0 - prob
>
> in the loop before is treating 1.0 as 10.0 due to the .pyc-file
> locale-dependent loading problem I detailed earlier, and that S is
> overflowing to infinity as a result.  Printing S inside the loop would
> shed best light on this, and printing S when the OverflowError occurs
> would nail it:

OK, the code now looks like:

        print repr(S), repr(H)
        S = ln(S) + Sexp * LN2
        H = ln(H) + Hexp * LN2

And I tested on a hammy mail.  I got:

3,0955714375167259e-015 0.0
...
  File "E:\src\spambayes\spambayes\classifier.py", line 238, in chi2_spamprob
    H = ln(H) + Hexp * LN2
exceptions.OverflowError: math range error

A spam yields:
0.0 0.0
  File "E:\src\spambayes\spambayes\classifier.py", line 237, in chi2_spamprob
    S = ln(S) + Sexp * LN2
exceptions.OverflowError: math range error
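
In both runs the failing line takes ln() of a value that printed as 0.0, which by itself is enough to make the logarithm fail. A minimal sketch of that behaviour follows. It assumes a modern Python 3, where math.log(0.0) raises ValueError rather than the OverflowError shown in the tracebacks above, and safe_ln is a hypothetical helper for illustration, not part of classifier.py:

```python
import math

def safe_ln(x):
    """Return ln(x), or None when the logarithm cannot be computed.

    Hypothetical helper for illustration only; not spambayes code.
    """
    try:
        return math.log(x)
    except (ValueError, OverflowError):
        # Python 3 raises ValueError for log(0.0); the tracebacks in this
        # message show OverflowError for the same operation.
        return None

# The hammy mail: S is tiny but positive, H is exactly zero.
print(safe_ln(3.0955714375167259e-15))  # a large negative number
print(safe_ln(0.0))                     # None: ln(0) is undefined
```

This only shows that a zero S or H explains the exception; it says nothing about why the products reached zero in the first place.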

Interestingly, S in the first one is printed with a comma as the decimal
separator, while all the zeroes got '.'
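
To make the comma observation concrete, here is a small sketch that emulates what a locale-aware formatter would produce and shows that the result no longer parses as a Python float. It assumes Python 3, and it does not call setlocale() at all: localized_repr and parses_as_float are hypothetical names that merely imitate a comma decimal point by string replacement.

```python
def localized_repr(x, decimal_point=","):
    """Imitate a locale-aware repr by swapping the decimal point.

    Hypothetical emulation; the real interpreter behaviour depends on
    the C library's locale settings, which this sketch does not touch.
    """
    return repr(x).replace(".", decimal_point)

def parses_as_float(text):
    """Return True if float() accepts the text, False otherwise."""
    try:
        float(text)
        return True
    except ValueError:
        return False

s_text = localized_repr(3.0955714375167259e-15)
print(s_text)                    # same digits, but with ',' as separator
print(parses_as_float(s_text))   # False: float() expects '.'
print(parses_as_float("0.0"))    # True
```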

Cluelessly,

Mark.