Lisp to Python translation criticism?
Andrew Henshaw
andrew.henshaw at mail.com
Sat Aug 17 05:16:00 CEST 2002
John E. Barham wrote:
...snip Lisp code ...
>
> Python:
>
> def spam_word_prob(word, good, bad, ngood, nbad):
>     g = 2 * good.get(word, 0)
>     b = bad.get(word, 0)
>     if g + b >= 5:
>         return max(0.01, min(0.99, float(min(1, b / nbad) / ((min(1, g /
>             ngood) + min(1, b / nbad))))))
>     else:
>         return 0.0
>
> def spam_prob(probs):
>     prod = 1.0
>     for prob in probs:
>         prod = prod * prob
>     inv_probs = [1 - x for x in probs]
>     inv_prob = 1.0
>     for prob in inv_probs:
>         inv_prob = inv_prob * prob
>     return prod / (prob + inv_prob)
>
> Any comments on the correctness, style, efficiency etc. of my translation?
> I'd like to write a Python spam filtering system using Graham's
> techniques.
>
> Please note that this is not meant to revive the perpetual debate over the
> relative merits of Python's lambda... ;)
>
> John
Should that last line be

    return prod / (prod + inv_prob)

?
It's probably not a good idea to use such similar variable names (prob
and prod).
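
To make the difference concrete, here is a small sketch that runs the
return expression as posted next to the suggested correction. The names
spam_prob_posted and spam_prob_fixed are made up for this comparison, and
the input is an arbitrary sample.

```python
def spam_prob_posted(probs):
    # As posted: the denominator uses `prob`, which at this point holds
    # the last element of inv_probs, not the running product.
    prod = 1.0
    for prob in probs:
        prod = prod * prob
    inv_probs = [1 - x for x in probs]
    inv_prob = 1.0
    for prob in inv_probs:
        inv_prob = inv_prob * prob
    return prod / (prob + inv_prob)


def spam_prob_fixed(probs):
    # Suggested correction: the denominator uses `prod`.
    prod = 1.0
    for prob in probs:
        prod = prod * prob
    inv_probs = [1 - x for x in probs]
    inv_prob = 1.0
    for prob in inv_probs:
        inv_prob = inv_prob * prob
    return prod / (prod + inv_prob)


sample = [0.5, 0.5]  # arbitrary input: two words with no evidence either way
print(spam_prob_posted(sample))
print(spam_prob_fixed(sample))
```

For two words that each score 0.5, the corrected version returns 0.5,
while the posted version returns about 0.33.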
On my machine, the fragment

    inv_prob = 1.0
    for prob in inv_probs:
        inv_prob = inv_prob * prob

takes about 50% more time to execute than

    inv_prob = reduce(operator.mul, inv_probs)

(after "import operator") for inv_probs of length 10. The advantage of
the reduce version increases as the length of the list increases. That's
one local optimization that could be made.
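
If you want to repeat the measurement, something along these lines should
work. It is only a sketch: the helper names and the sample list are made
up, timings will differ by machine and interpreter version, and in current
Python reduce has to be imported from functools rather than used as a
builtin. The initial value 1.0 passed to reduce is my addition so that an
empty list gives 1.0 instead of raising an error.

```python
import operator
import timeit
from functools import reduce  # a builtin in 2002; lives in functools in Python 3

# Made-up ten-element sample, standing in for inv_probs.
inv_probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]


def product_loop(values):
    # Explicit loop, as in the original translation.
    result = 1.0
    for value in values:
        result = result * value
    return result


def product_reduce(values):
    # reduce-based product; 1.0 is the starting value.
    return reduce(operator.mul, values, 1.0)


def time_both(number=100000):
    # Returns (loop_seconds, reduce_seconds) for `number` calls of each.
    loop_time = timeit.timeit(lambda: product_loop(inv_probs), number=number)
    reduce_time = timeit.timeit(lambda: product_reduce(inv_probs), number=number)
    return loop_time, reduce_time


if __name__ == "__main__":
    print(time_both())
```

Both helpers multiply in the same order, so they should return the same
value; only the running time is expected to differ.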
I'd say that you would increase both clarity and speed by collapsing the
three loops in spam_prob into one loop:

    def spam_prob(probs):
        inv_prob = prod = 1.0
        for prob in probs:
            prod *= prob
            inv_prob *= (1 - prob)
        return prod / (prod + inv_prob)

This is twice as fast as the original on my machine for a ten-element
list.
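
As a sanity check on the single-loop version, it can be compared with an
independently written formulation. The sketch below assumes Python 3.8 or
later for math.prod, which did not exist when this was posted;
spam_prob_reference is a made-up name, and the sample inputs are arbitrary.

```python
import math


def spam_prob(probs):
    # Single-loop version: accumulate both products in one pass.
    inv_prob = prod = 1.0
    for prob in probs:
        prod *= prob
        inv_prob *= (1 - prob)
    return prod / (prod + inv_prob)


def spam_prob_reference(probs):
    # Hypothetical reference version built on math.prod (Python 3.8+).
    prod = math.prod(probs)
    inv_prob = math.prod(1 - p for p in probs)
    return prod / (prod + inv_prob)


# Arbitrary sample inputs; the two versions should agree on each.
for sample in ([0.5, 0.5], [0.99, 0.99, 0.01], [0.2, 0.9, 0.7, 0.4]):
    assert math.isclose(spam_prob(sample), spam_prob_reference(sample))
```

Note that either version divides by zero if both products come out as 0.0,
for example when the list contains both a 0.0 and a 1.0.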
--
Andrew Henshaw