[Spambayes] Hammiefilter doesn't write out the pickle
Richie Hindle
richie@entrian.com
Mon Nov 18 18:02:07 2002
Hi Neale,
> Neale thinks this is the right way to do it. If the Bayes.* classes
> write out their state on destruction, we can treat them all the same.
> That's easy enough, just have them call self.store() in the __del__
> method.
Richie thinks this is a bad move. Here's a minor rant I sent to Tim Stone
when he did exactly this in his Bayes module:
--------------------------------------------------------------------------
PersistentBayes.__del__() calls store() - this seems like a bad thing for
three reasons. One is that I might not want to save my changes to the
database - pop3proxy has explicit "Save & Shutdown" and "Shutdown"
buttons to give the user control over whether the database is saved or not
(to let you do speculative training and discard the results, for instance).
[This is the least important of the three reasons. Four, four reasons!]
Also, the pop3proxy self-test uses an in-memory bayes instance that it
never wants to write to disk. Secondly, it's unpredictable when __del__
will be called, or even *whether* it will be called - this:
    class A:
        def __del__(self):
            print "A.__del__"

    class B:
        def __del__(self):
            print "B.__del__"

    a = A()
    b = B()
    a.b = b
    b.a = a
    print "Exiting..."
won't call either __del__ method in the current CPython implementation.
Thirdly, if users of PersistentBayes explicitly call store() - which seems
like the right thing to do - the database will be written out twice. [And
that can take *a long time*.]
[snip]
I've found another reason why PersistentBayes.__del__() is a bad thing -
self.db_name isn't set in the case where a PickledBayes is created using a
filename that doesn't exist (which is done by the pop3proxy self-test) -
that was leading to exceptions being thrown from __del__, which is a
notoriously hard problem to track down.
--------------------------------------------------------------------------
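To make the last of those failure modes concrete, here is a rough sketch.
The class name and constructor below are made up for illustration and are
not the real PickledBayes code; the only thing taken from the message above
is that db_name can end up unset while __del__ still tries to store().

```python
class FakePickledBayes(object):
    """Hypothetical stand-in for a classifier that saves itself in __del__."""

    def __init__(self, filename=None):
        # Assumption for this sketch: db_name is only set when a filename
        # is supplied, mimicking the "file doesn't exist" case.
        if filename is not None:
            self.db_name = filename

    def store(self):
        # Raises AttributeError if db_name was never set.
        return "would write to %s" % self.db_name

    def __del__(self):
        # Runs whenever the object is finalised, including for instances
        # that never got a db_name.
        self.store()
```

Calling store() directly on FakePickledBayes() raises an AttributeError you
can catch and trace. The same error raised from __del__ cannot be caught by
the caller; CPython just reports it on stderr and carries on, which is what
makes it so hard to track down.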
I'd much rather have an explicit store() method and document the fact that
storage may be pre-empted by certain implementations. Relying on __del__
is nasty.
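
A minimal sketch of what I mean, under some assumptions: the class name,
the wordinfo dictionary and the dirty flag are all hypothetical, and the
real Bayes classes may look quite different. The dirty flag is my own
addition here; it is one way to stop a second store() call from rewriting
an unchanged database.

```python
import pickle


class ExplicitStoreBayes(object):
    """Hypothetical classifier wrapper that saves only when asked to."""

    def __init__(self, db_name):
        self.db_name = db_name
        self.wordinfo = {}      # word -> (spam_count, ham_count)
        self.dirty = False      # True when there are unsaved changes

    def learn(self, word, is_spam):
        spam, ham = self.wordinfo.get(word, (0, 0))
        if is_spam:
            spam += 1
        else:
            ham += 1
        self.wordinfo[word] = (spam, ham)
        self.dirty = True

    def store(self):
        """Write the state to db_name; return True if anything was written."""
        if not self.dirty:
            return False
        with open(self.db_name, "wb") as fp:
            pickle.dump(self.wordinfo, fp)
        self.dirty = False
        return True
```

There is deliberately no __del__ here. A caller that wants "Shutdown"
without "Save" just never calls store(), and an in-memory test instance
never touches the disk.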
--
Richie Hindle
richie@entrian.com