Unsafe Pickle Saving
SpamBayes has, in storage.py, some code for storing pickles safely, but that code is only used for the training database and not the numerous other pickles used in SpamBayes, e.g. for caches. The result was that, when I was doing server-side filtering, cache pickles would be read while being written, and SpamBayes wasn't prepared to deal with the corruption. I submitted a patch at http://sourceforge.net/tracker/index.php?func=detail&aid=1816240&group_id=61702&atid=498103 that factors that code out and uses it wherever pickles are saved. Frankly, it would be worth looking at all the places where SpamBayes saves files, because this is probably not a pickle-specific issue.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
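[For context, the standard trick behind that kind of safe-saving code is to write the pickle to a temporary file in the same directory and then rename it over the target, so a concurrent reader sees either the complete old file or the complete new one, never a partial write. A minimal sketch of that pattern; the helper name is hypothetical, not SpamBayes's actual storage.py API, and os.replace is the modern (Python 3.3+) spelling of the atomic rename:]

```python
import os
import pickle
import tempfile

def safe_pickle_dump(obj, path):
    """Atomically write obj as a pickle to path.

    Dump to a temp file in the same directory (so the rename stays
    on one filesystem), flush to disk, then rename over the target.
    Hypothetical helper name; illustrates the pattern only.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes hit the disk
        os.replace(tmp, path)     # atomic on POSIX and Windows
    except BaseException:
        os.unlink(tmp)            # don't leave the temp file behind
        raise
```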
David> The result was that, when I was doing server-side filtering,
David> cache pickles would be read while being written, and SpamBayes
David> wasn't prepared to deal with the corruption.

David,

Can you explain how the pickle would be read and written simultaneously? Are we talking multiple apps or the same app from multiple threads? (Haven't looked at your patch yet...)

Thx,

Skip
on Sun Oct 21 2007, skip-AT-pobox.com wrote:
David> The result was that, when I was doing server-side filtering,
David> cache pickles would be read while being written, and SpamBayes
David> wasn't prepared to deal with the corruption.
David,
Can you explain how the pickle would be read and written simultaneously? Are we talking multiple apps or the same app from multiple threads? (Haven't looked at your patch yet...)
I assume it's the same app from multiple processes. I don't know how my server hands messages off to procmail -- it could well be running them in parallel. And then there's my cron job that does training once daily, which could happen at the same time as any message arrives.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
>> Can you explain how the pickle would be read and written
>> simultaneously?

Dave> I assume it's the same app from multiple processes.... And then
Dave> there's my cron job that does training once daily, which could
Dave> happen at the same time as any message arrives.

That would be it. There is no locking of the various databases between processes. I suppose we could implement something, but the vagaries of network file systems and cross-platform demands have always made that task problematic. Maybe we should expand your safe_pickle code into safe_pickle_read and safe_pickle_write with a corresponding bit of file locking code they can invoke (which will probably be ugly as sin - but hidden away in a dark corner - once it's complete).

Skip
on Mon Oct 22 2007, skip-AT-pobox.com wrote:
>> Can you explain how the pickle would be read and written
>> simultaneously?
Dave> I assume it's the same app from multiple processes.... And then
Dave> there's my cron job that does training once daily, which could
Dave> happen at the same time as any message arrives.
That would be it.
Well, the cron job does tte with a fresh database, so it's only the various caches that can get into trouble.
There is no locking of the various databases between processes. I suppose we could implement something, but the vagaries of network file systems and cross-platform demands have always made that task problematic. Maybe we should expand your safe_pickle code into safe_pickle_read and safe_pickle_write with a corresponding bit of file locking code they can invoke (which will probably be ugly as sin - but hidden away in a dark corner - once it's complete).
Sounds fine to me. I doubt it would be so terribly ugly either. Somebody should have already written a file lock library that encapsulates it, so maybe we can use that. And if it doesn't exist, it should!

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
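[The safe_pickle_read/safe_pickle_write idea could be sketched with POSIX fcntl.flock. This is an illustrative sketch only: the function names are hypothetical, flock is POSIX-only (a portable version would need msvcrt.locking on Windows), and flock over NFS is exactly the kind of "vagary" Skip mentions:]

```python
import fcntl
import pickle

def locked_pickle_load(path):
    """Read a pickle under a shared (read) lock.

    Blocks while another process holds the exclusive write lock,
    so we never read a half-written file.  Hypothetical helper.
    """
    with open(path, "rb") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_SH)
        try:
            return pickle.load(f)
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)

def locked_pickle_dump(obj, path):
    """Write a pickle under an exclusive lock, blocking all readers."""
    with open(path, "wb") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
            f.flush()
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```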
Dave> Somebody should have already written a file lock library that
Dave> encapsulates it, so maybe we can use that. And if it doesn't
Dave> exist, it should!

It does exist, in several flavors, as my query to python-dev indicates. I've gotten three responses so far (Twisted, Mailman, and something called zc.lockfile). They are all different implementations, and it's not clear they all satisfy the same constraints. I'll look at them and then do something (not sure what at this point).

Skip
skip@pobox.com wrote:
Dave> Somebody should have already written a file lock library that
Dave> encapsulates it, so maybe we can use that. And if it doesn't
Dave> exist, it should!
It does exist, in several flavors, as my query to python-dev indicates.
Sounds like a good addition to Python, but I'm not sure it's the solution in this case. The cron job probably should tte using a different database filename, then atomically swap the new database in place of the old. This ensures you don't leave a partial database in place if the machine crashes, is shut down, or the process is killed during the tte run. It also allows procmail to run on the old database concurrently with the training run, which I guess a lock file would prevent. Or am I confused here? I've just re-read David's original post, and you say the problem is caches rather than the database. Which caches are these?

--
Toby Dickenson
(happy procmail user)
on Tue Oct 23 2007, Toby Dickenson <tdickenson-OuhhWZXUDiwPIx3zVcpl1ZqQE7yCjDx5-AT-public.gmane.org> wrote:
skip@pobox.com wrote:
Dave> Somebody should have already written a file lock library that
Dave> encapsulates it, so maybe we can use that. And if it doesn't
Dave> exist, it should!
It does exist, in several flavors, as my query to python-dev indicates.
Sounds like a good addition to Python, but I'm not sure it's the solution in this case. The cron job probably should tte using a different database filename, then atomically swap the new database in place of the old.
It does do that. However, the caches for dns and images (which are pickles) are the same in both cases.
This ensures you don't leave a partial database in place if the machine crashes, is shut down, or the process is killed during the tte run. It also allows procmail to run on the old database concurrently with the training run, which I guess a lock file would prevent.
Or am I confused here? I've just re-read David's original post, and you say the problem is caches rather than the database. Which caches are these?
Bingo; see above.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
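[Independent of locking, the reader side of those dns and image caches could also defend itself by treating a truncated or corrupt pickle as a cache miss rather than crashing, since a cache can always be rebuilt. A sketch; load_cache and rebuild are hypothetical names, not SpamBayes code:]

```python
import pickle

def load_cache(path, rebuild):
    """Load a cache pickle, falling back to rebuilding it.

    A missing, truncated, or corrupt file is treated as a cache
    miss: rebuild is a callable that recreates the cache contents
    from scratch.  Hypothetical helper illustrating the idea.
    """
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        return rebuild()
```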
participants (3)
- David Abrahams
- skip@pobox.com
- Toby Dickenson