[spambayes-dev] A new and altogether different bsddb breakage
Tim Peters
tim.one at comcast.net
Mon Dec 15 11:00:13 EST 2003
[Richie Hindle]
> I think we're using different versions of bsddb - your code fails for
> me:
>
> >>> d = bsddb.hashopen("/src/tests/spambayes/hammie.db")
> >>> len(d)
> 52331
> >>> len([k for k in d if d.get(k, None) is None])
> Traceback (most recent call last):
>   File "<pyshell#4>", line 1, in -toplevel-
>     len([k for k in d if d.get(k, None) is None])
>   File "C:\Python23\lib\bsddb\__init__.py", line 86, in __getitem__
>     return self.db[key]
> TypeError: Integer keys only allowed for Recno and Queue DB's
>
> I think this is because GET_ITER is creating a list-style iterator
> rather than a dict-style one. bsddb objects don't look much like
> dictionaries:
>
> >>> len([k for k in d.keys() if d.get(k, None) is None])
> Traceback (most recent call last):
>   File "<pyshell#11>", line 1, in -toplevel-
>     len([k for k in d.keys() if d.get(k, None) is None])
> AttributeError: _DBWithCursor instance has no attribute 'get'
Not here:
>>> PATH = "/WINDOWS/Application Data/SpamBayes/default_bayes_database.db"
>>> import bsddb
>>> d = bsddb.hashopen(PATH, 'r')
>>> len([k for k in d.keys() if d.get(k, None) is None])
0
>>>
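The first traceback quoted above is consistent with Python's sequence-style iteration fallback: when an object defines __getitem__ but no __iter__, a for loop asks for d[0], d[1], and so on. The sketch below illustrates that with a hypothetical stand-in class (HashLikeDB is invented for illustration and is not the real bsddb wrapper); whether this is exactly what happens inside the 2.3 bsddb module is an assumption.

```python
class HashLikeDB:
    """Hypothetical stand-in: has __getitem__ and keys(), but no __iter__."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        # Mimic a hash database that rejects integer keys.
        if isinstance(key, int):
            raise TypeError("Integer keys only allowed for Recno and Queue DB's")
        return self._data[key]

    def keys(self):
        return list(self._data)


d = HashLikeDB({"spam": "1", "ham": "2"})

# With no __iter__, iteration falls back to d[0], which fails immediately.
try:
    for k in d:
        pass
    iteration_error = None
except TypeError as exc:
    iteration_error = str(exc)

# Iterating over keys() sidesteps the integer-index fallback.
listed = sorted(d.keys())
```

Looping over d.keys() instead of d avoids the fallback, which matches the second form used in the quoted session.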
> I have Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit
> (Intel)] on win32. Assuming that's a red herring,
I wouldn't assume that -- it may be the whole ball of wax. I'm using
exactly the same, *except* I'm using 2.3.3c1 (also on Windows), and a number
of bsddb3 fixes have been checked in since Python 2.3. It would help if you
tried 2.3.3c1. If your symptoms above persist, then we've got a Major
Mystery to sort out (e.g., maybe you, or I, aren't getting the version of
bsddb the Windows installer intended us to get).
> here's an equivalent that works for me:
>
> >>> def get(d, k, default):
>         try:
>             return d[k]
>         except KeyError:
>             return default
>
> >>> len([k for k in d.keys() if get(d, k, None) is None])
> 305
>
> So yes, the underlying database is screwed. But one token less
> screwed than last time - lovely. (I now get 305 when going through
> shelve as well.) I've done some training in between, which must have
> jiggled things around.
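A minimal sketch of the check described above, run against a hypothetical stand-in for a damaged database (FlakyDB and unreadable_keys are invented for illustration, not part of bsddb or spambayes). It uses a sentinel rather than None as the default, on the assumption that a stored None should not be counted as a failed lookup.

```python
_MISSING = object()  # sentinel, so a stored None is not mistaken for a failure


def get(d, k, default=None):
    """Return d[k], or default when the lookup raises KeyError."""
    try:
        return d[k]
    except KeyError:
        return default


def unreadable_keys(d):
    """List keys that d.keys() reports but d[k] cannot fetch."""
    return [k for k in d.keys() if get(d, k, _MISSING) is _MISSING]


class FlakyDB:
    """Hypothetical damaged store: keys() lists keys whose lookup fails."""

    def __init__(self, data, broken):
        self._data = dict(data)
        self._broken = set(broken)

    def keys(self):
        return list(self._data)

    def __getitem__(self, key):
        if key in self._broken:
            raise KeyError(key)
        return self._data[key]


db = FlakyDB({"a": "1", "b": "2", "c": "3"}, broken=["b"])
bad = unreadable_keys(db)
```

Returning the keys themselves, rather than only their count, makes it possible to see whether the same tokens are affected after further training.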
...
> I'm certainly underwhelmed by bsddb in single-file mode. One day I
> want to make spambayes use full transaction mode - that really ought
> to work. (Does anyone know of any simple Python code I can steal that
> uses bsddb in full-on multi-everything DBEnv mode? The pybsddb docs
> just link to the SleepyCat C API docs, which aren't very
> approachable.)
Best I can suggest is studying Python's substantial bsddb3 test suite. ZODB
has modules to build ZODB's transaction model on top of a Berkeley database,
but I don't think I'd call that simple. I'm not a bsddb guy, though, so
those are just random things I've seen.