[Spambayes] Re: caching stuff

Tim Stone - Four Stones Expressions tim@fourstonesExpressions.com
Fri Nov 22 21:19:11 2002

11/22/2002 2:50:55 PM, "T. Alexander Popiel" <popiel@wolfskeep.com> wrote:

>In message:  <B7WSUSLKKEKGJITQ3CBB9ECTPNHA6ZU.3dde92fe@riven>
>             <tim@fourstonesExpressions.com> writes:
>>From my careful and time-consuming examination of the code <wink>, it
>>appeared to me that meta revision only changed when nham or nspam changed.
>>Therefore, caching on the ratios rather than nham and nspam allowed the
>>cache to be pertinent all the time.  Nuking a cache is expensive...
>Unfortunately, preserving the cache when nham or nspam changes is bad,
>because the bayesian adjustment changes, even if the ham and spam
>ratios don't.  :-(
>Nuking a cache in toto is a lot less expensive than individually
>invalidating or updating records (which was update_probabilities'
>downfall).  Either is a lot less expensive than giving the wrong
>answer.

Well, if the Bayesian probability changes even when the ham and spam ratios 
don't, then of course the caching scheme is bad.  But I certainly don't see 
that in the code that I changed.  Maybe I'm looking in the wrong place...
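
For what it's worth, here's a rough sketch of the kind of Robinson-style 
adjustment I understand spambayes to be doing (names and constants below are 
illustrative, not lifted from the actual source).  It suggests how the 
adjusted probability can move even when the ham and spam ratios stay fixed, 
because the evidence count n still grows:

```python
# Illustrative sketch of a Robinson-style adjusted word probability.
# S and X play the role of spambayes' unknown-word strength and prior;
# the exact names/values here are assumptions, not the real code.

S = 0.45   # strength given to the prior
X = 0.5    # prior probability for a never-seen word

def adjusted_prob(hamcount, spamcount, nham, nspam):
    hamratio = hamcount / nham
    spamratio = spamcount / nspam
    p = spamratio / (hamratio + spamratio)   # raw ratio-based probability
    n = hamcount + spamcount                 # amount of evidence for the word
    return (S * X + n * p) / (S + n)         # Bayesian adjustment toward p

# Same ratios, 10x the absolute counts -> different adjusted probability:
a = adjusted_prob(1, 3, 100, 100)       # ratios 0.01 and 0.03
b = adjusted_prob(10, 30, 1000, 1000)   # identical ratios, more evidence
assert a != b       # the adjustment shifted, though the ratios did not
```

If that's roughly what the real code computes, then a cache keyed only on the 
ratios would indeed go stale whenever nham or nspam changes.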

- TimS
>>As for indexing on an integer vs. a float: both are immutable types, so
>>you're really indexing on an object reference, not the value.
>Eh, I don't think so... but I don't know enough python internals to
>be sure.  (Sure, they are immutable types, but I strongly doubt that
>they're hashed as objects; that would imply that all references to
>a float value 3.0 were references to the same object... which means
>some sort of search for the 3.0 object when you added 2.5 and 0.5...
>which would be a severe performance loss.  It seems far more likely
>that they're hashed by value instead (even if that value is currently
>boxed in an object).)
>Does anyone with more python mojo have a definitive answer?  Guido?
>- Alex
- Tim
