[Spambayes] Re: caching stuff
Tim Stone - Four Stones Expressions
Fri Nov 22 21:19:11 2002
11/22/2002 2:50:55 PM, "T. Alexander Popiel" <firstname.lastname@example.org> wrote:
>In message: <B7WSUSLKKEKGJITQ3CBB9ECTPNHA6ZU.3dde92fe@riven>
> <tim@fourstonesExpressions.com> writes:
>>From my careful and time consuming examination of the code <wink>, it
>>appeared to me that meta revision only changed when nham or nspam changed.
>>Therefore, caching on the ratios rather than nham and nspam allowed the
>>cache to be pertinent all the time. Nuking a cache is expensive...
>Unfortunately, preserving the cache when nham or nspam changes is bad,
>because the bayesian adjustment changes, even if the ham and spam
>ratios don't. :-(
>Nuking a cache in toto is a lot less expensive than individually
>invalidating or updating records (which was update_probabilities'
>downfall). Either is a lot less expensive than giving the wrong answers.
Well, if the Bayesian prob changes even when the ham and spam ratios don't,
then of course the caching scheme is bad. But I certainly don't see that in
the code that I changed. Maybe I'm looking in the wrong place...
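To make the disagreement concrete, here's a simplified sketch of a per-word
probability computation with Robinson-style smoothing (the S/X smoothing is an
assumed stand-in for the actual classifier code, not quoted from it). The
point is that the smoothing term depends on the raw counts, not just on the
ham and spam ratios, so two trainings with identical ratios can still yield
different probabilities:

```python
def word_prob(hamcount, spamcount, nham, nspam, S=0.45, X=0.5):
    """Toy per-word spam probability (illustrative only)."""
    # The ratio-based part depends only on hamcount/nham and spamcount/nspam:
    hamratio = hamcount / nham
    spamratio = spamcount / nspam
    prob = spamratio / (hamratio + spamratio)
    # Robinson-style smoothing pulls prob toward the unknown-word prior X,
    # weighted by the raw evidence count n -- which is NOT a ratio:
    n = hamcount + spamcount
    return (S * X + n * prob) / (S + n)

# Same ratios (1% ham hits, 2% spam hits) but different raw counts:
p1 = word_prob(1, 2, 100, 100)
p2 = word_prob(10, 20, 1000, 1000)
assert p1 != p2  # smoothing differs, so ratios alone are a bad cache key
```

If the real code includes a smoothing step like this, a cache keyed on the
ratios would serve stale probabilities after training; if it doesn't, caching
on the ratios is safe, which is exactly the question at issue here.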
>>As for indexing on an integer vs a float. Both are immutable types, so
>>you're really indexing on an object reference, not the value.
>Eh, I don't think so... but I don't know enough python internals to
>be sure. (Sure, they are immutable types, but I strongly doubt that
>they're hashed as objects; that would imply that all references to
>a float value 3.0 were references to the same object... which means
>some sort of search for the 3.0 object when you added 2.5 and 0.5...
>which would be a severe performance loss. It seems far more likely
>that they're hashed by value instead (even if that value is currently
>boxed in an object).)
>Does anyone with more python mojo have a definitive answer? Guido?
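For what it's worth, a quick check shows that Python dicts do key on value,
not object identity, for numbers: `__hash__` and `__eq__` decide lookup, and
equal numeric values hash equally even across int and float. A small sketch
(my own illustration, not from the thread):

```python
# Dict lookup uses __hash__/__eq__, not object identity.
d = {3.0: "cached"}

# An int equal to the float key finds the same entry,
# because hash(3) == hash(3.0) and 3 == 3.0:
assert d[3] == "cached"

# A freshly computed float is a distinct object but still hits
# the same slot -- no search for "the" 3.0 object is needed:
key = 2.5 + 0.5
assert d[key] == "cached"

assert hash(3) == hash(3.0)
```

So indexing on an int vs. an equal float makes no difference to dict lookup;
both resolve to the same entry by value.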