[spambayes-dev] A new and altogether different bsddb breakage
Richie Hindle
richie at entrian.com
Mon Dec 15 04:00:19 EST 2003
[Richie]
> >>> print [db[k] for k in db]
> KeyError: 'pics'
[Tim]
> Ouch. What do you get if you open the database directly, instead of
> indirecting thru a shelf? I'm just trying to make sure it's really the
> database that's hosed.
I think we're using different versions of bsddb - your code fails for me:
>>> d = bsddb.hashopen("/src/tests/spambayes/hammie.db")
>>> len(d)
52331
>>> len([k for k in d if d.get(k, None) is None])
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in -toplevel-
    len([k for k in d if d.get(k, None) is None])
  File "C:\Python23\lib\bsddb\__init__.py", line 86, in __getitem__
    return self.db[key]
TypeError: Integer keys only allowed for Recno and Queue DB's
I think this is because GET_ITER is creating a list-style iterator rather
than a dict-style one. bsddb objects don't look much like dictionaries:
>>> len([k for k in d.keys() if d.get(k, None) is None])
Traceback (most recent call last):
  File "<pyshell#11>", line 1, in -toplevel-
    len([k for k in d.keys() if d.get(k, None) is None])
AttributeError: _DBWithCursor instance has no attribute 'get'
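To illustrate what I suspect is going on with the first traceback, here is a minimal sketch. NoIterMapping is a made-up stand-in, not the real bsddb class; the assumption is only that the object defines __getitem__ and keys() but no __iter__. In that case "for k in d" falls back to the old sequence protocol and asks for d[0], d[1], ..., so a string-keyed store fails on the very first integer index:

```python
# Hypothetical stand-in for a bsddb-like object (not the real class):
# it has __getitem__ and keys(), but deliberately no __iter__.
class NoIterMapping:
    def __init__(self, data):
        self._data = dict(data)
        self.seen = []  # every key that __getitem__ was asked for

    def __getitem__(self, key):
        self.seen.append(key)
        if not isinstance(key, str):
            # Mimics the error text from the traceback above.
            raise TypeError("Integer keys only allowed for Recno and Queue DB's")
        return self._data[key]

    def keys(self):
        return list(self._data.keys())


d = NoIterMapping({"pics": "x", "spam": "y"})

# Without __iter__, iteration uses the sequence protocol: d[0], d[1], ...
iteration_failed = False
try:
    for k in d:
        pass
except TypeError:
    iteration_failed = True

# Going through keys() explicitly avoids the integer indexing entirely.
keys_via_method = sorted(d.keys())
```

Here the loop never yields a key at all: the first thing __getitem__ sees is the integer 0. Iterating over d.keys() sidesteps that, which is why the second attempt gets further.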
I have Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)]
on win32. Assuming that's a red herring, here's an equivalent that works
for me:
>>> def get(d, k, default):
...     try:
...         return d[k]
...     except KeyError:
...         return default
...
>>> len([k for k in d.keys() if get(d, k, None) is None])
305
So yes, the underlying database is screwed. But one token less screwed
than last time - lovely. (I now get 305 when going through shelve as
well.) I've done some training in between, which must have jiggled things
around.
[Tim]
> Gotta say, I'm half ready to declare
> that ZODB is the only database anyone should ever use (the bugs in that are
> long fixed <wink>).
I'm certainly underwhelmed by bsddb in single-file mode. One day I want
to make spambayes use full transaction mode - that really ought to work.
(Does anyone know of any simple Python code I can steal that uses bsddb in
full-on multi-everything DBEnv mode? The pybsddb docs just link to the
Sleepycat C API docs, which aren't very approachable.)
--
Richie Hindle
richie at entrian.com