[Spambayes] [ spambayes-Bugs-709051 ] Config file loading and saving is fragile

SourceForge.net noreply at sourceforge.net
Fri Apr 4 04:05:16 EST 2003


Bugs item #709051, was opened at 2003-03-25 09:19
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=709051&group_id=61702

Category: Outlook
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mark Hammond (mhammond)
Assigned to: Mark Hammond (mhammond)
>Summary: Config file loading and saving is fragile

Initial Comment:
There was a report of this error using the second
binary release:

SpamAddin - Connecting to Outlook
pythoncom error: Failed to call the universal dispatcher
Traceback (most recent call last):
  File "E:\src\pythonex\com\win32com\universal.py", line 170, in dispatch
  File "E:\src\pythonex\com\win32com\server\policy.py", line 322, in _InvokeEx_
  File "E:\src\pythonex\com\win32com\server\policy.py", line 601, in _invokeex_
  File "E:\src\pythonex\com\win32com\server\policy.py", line 541, in _invokeex_
  File "E:\src\spambayes\Outlook2000\addin.py", line 655, in OnConnection
  File "E:\src\spambayes\Outlook2000\manager.py", line 475, in GetManager
  File "E:\src\spambayes\Outlook2000\manager.py", line 152, in __init__
  File "E:\src\spambayes\Outlook2000\manager.py", line 355, in LoadConfig
exceptions.EOFError: 

While there is another problem that caused this error,
we should not die completely when loading the config pickle
if it has been corrupted.  However, as this means
spambayes will be unconfigured, we do need a scheme to
let the user know this (as we do in the few other
places where we disable spambayes due to config errors).
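
A minimal sketch of what tolerating a damaged config pickle might look
like. It is written for Python 3 rather than the Python 2 of this report,
and load_config, its defaults argument and the returned error value are
hypothetical names for illustration, not the actual manager.py code.

```python
import pickle


def load_config(path, defaults):
    """Return (config, error); error is None when nothing went wrong."""
    try:
        with open(path, "rb") as fp:
            return pickle.load(fp), None
    except FileNotFoundError:
        # No config file yet: an ordinary first run, not an error.
        return dict(defaults), None
    except (EOFError, pickle.UnpicklingError, AttributeError,
            ImportError, IndexError, OSError) as exc:
        # Truncated or corrupt pickle (an empty file raises EOFError).
        # Fall back to defaults, but hand the error back so the caller
        # can tell the user that the configuration was reset.
        return dict(defaults), exc
```

The caller would check the second value and, when it is not None, show
whatever warning is already used elsewhere when the addin is disabled.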

----------------------------------------------------------------------

>Comment By: Mark Hammond (mhammond)
Date: 2003-04-04 22:05

Message:
Logged In: YES 
user_id=14198

I'm afraid you are wrong about the config file being the
same as the word database.  You are, however, correct about
the saving problem.  As we have two pickles, I will track the
Outlook config pickle in this bug, and have opened:
https://sourceforge.net/tracker/index.php?func=detail&aid=715248&group_id=61702&atid=498103
to track the word database bug.


----------------------------------------------------------------------

Comment By: Simone Piunno (pioppo)
Date: 2003-04-04 20:28

Message:
Logged In: YES 
user_id=227443

I disagree, the "configuration pickle" and the "word database" are the 
very same file. 
 
Moreover, without this patch you seriously risk completely losing
your word database if execution stops between open() (which
truncates the file to zero length) and pickle.dump().
 
Execution could stop for whatever reason, from CTRL+C to system 
crash, so it's of vital importance that the file update is atomic, which 
should be guaranteed by rename(). 
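
A sketch of the write-then-rename idea in Python 3 terms. The helper name
save_pickle_atomically is made up for illustration and is not part of
spambayes. It uses os.replace() (Python 3.3+) instead of os.rename(),
because os.rename() raises an error on Windows when the destination
already exists.

```python
import os
import pickle
import tempfile


def save_pickle_atomically(obj, path, protocol=pickle.HIGHEST_PROTOCOL):
    """Write obj to path so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename never has to
    # cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fp:
            pickle.dump(obj, fp, protocol)
            fp.flush()
            os.fsync(fp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # swaps the new file into place
    except BaseException:
        # The old file is untouched; just remove the partial temp file.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

If the process dies before os.replace(), the worst case is a stray
temp file; the previous database is still intact.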
 
I think this is a bug for sure, even if you don't plan to add support for
concurrency.
 
Of course this is not enough when you add concurrency, because you
could lose some training information if two separate instances try to
update the word database at the same time (they will both read the old
file, then they will both create the temp file, then the second rename()
will overwrite the result of the first one).
 
To solve this, you should add some locking mechanism (in addition to
atomic rename()), which could be out of your scope, I understand, but
I think this would be a very useful enhancement to spambayes
usability.
 
If you need a code example, you can look at Mailman's handling
of the MailList object persistence.
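
One possible shape for such a lock, as a rough Python 3 sketch. This is
not Mailman's or spambayes' code; lock_file and its timeout and poll
parameters are invented for illustration. It relies on os.open() with
O_CREAT | O_EXCL failing when the lock file already exists.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def lock_file(path, timeout=10.0, poll=0.05):
    """Hold an exclusive advisory lock, represented by path + '.lock'."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if someone else holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError("could not acquire %s" % lock_path)
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)
```

The whole read, train, write cycle would run inside "with
lock_file(db_name):" so that a second instance waits instead of
overwriting the first one's update. Stale locks left behind by a crash
are not handled in this sketch.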

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2003-04-04 09:19

Message:
Logged In: YES 
user_id=14198

Check whether the traceback is the same as yours - this error is
in loading the configuration pickle, not the word database.
Thus, locking shouldn't be the issue, as I can't see how two
threads or processes could write this file at once (Outlook
appears to have its own lock for startup; I've never seen
spambayes running twice in different processes.)

So, your patch won't help with this exception.  However, if you
are getting a slightly different EOFError, your patch may apply.

----------------------------------------------------------------------

Comment By: Simone Piunno (pioppo)
Date: 2003-04-04 08:01

Message:
Logged In: YES 
user_id=227443

Maybe this patch would be a little (but insufficient)  
improvement?  I'd upload it as a separate file, but there's  
no "upload" button.... 
 
--- spambayes/storage.py.orig   2003-04-03 23:35:47.000000000 +0200
+++ spambayes/storage.py        2003-04-03 23:43:16.000000000 +0200
@@ -59,6 +59,7 @@ 
 import cPickle as pickle 
 import errno 
 import shelve 
+import os 
 from spambayes import dbmstorage 
  
 # Make shelve use binary pickles by default. 
@@ -121,9 +122,10 @@ 
         if options.verbose: 
             print 'Persisting',self.db_name,'as a pickle' 
  
-        fp = open(self.db_name, 'wb') 
+        fp = open(self.db_name+'.tmp', 'wb') 
         pickle.dump(self, fp, PICKLE_TYPE) 
         fp.close() 
+        os.rename(self.db_name+'.tmp', self.db_name) 
  
  
 class DBDictClassifier(classifier.Classifier): 
 
  

----------------------------------------------------------------------

Comment By: Simone Piunno (pioppo)
Date: 2003-04-04 05:38

Message:
Logged In: YES 
user_id=227443

I have another case, but without apparent cause: 
 
Traceback (most recent call last): 
  File "/home/mailman21/Mailman/Queue/Runner.py", line 105, in _oneloop
    self._onefile(msg, msgdata)
  File "/home/mailman21/Mailman/Queue/Runner.py", line 155, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/home/mailman21/Mailman/Queue/OutgoingRunner.py", line 69, in _dispose
    mlist.Load()
  File "/home/mailman21/Mailman/MailList.py", line 626, in Load
    self._spamdb = hammie.open(path, 0)
  File "/home/mailman21/pythonlib/spambayes/hammie.py", line 262, in open
    b = storage.PickledClassifier(filename)
  File "/home/mailman21/pythonlib/spambayes/storage.py", line 80, in __init__
    self.load()
  File "/home/mailman21/pythonlib/spambayes/storage.py", line 98, in load
    tempbayes = pickle.load(fp)
EOFError 
 
It happens quite often but not always; I believe it is a concurrency
issue (e.g. lack of locking).

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2003-03-25 09:56

Message:
Logged In: YES 
user_id=14198

The reporter just let me know that the problem was caused by
about 20 power failures over a short period.  So I don't think
we can cure the cause here, just the symptoms.

----------------------------------------------------------------------
