[Python-3000] [PATCH] Fix dumbdbm, which fixes test_shelve (for me); instrument other tests so we catch this sooner (and more directly)

Fri Aug 24 02:03:21 CEST 2007

Attached is a patch for review.

As of revision 57341 (only a couple hours old as of this writing), 
test_shelve was failing on my machine.  This was because I didn't have 
any swell databases available, so anydbm was falling back to dumbdbm, 
and dumbdbm had a bug.  In Py3k, dumbdbm's dict-like interface now 
requires bytes objects as keys, which it internally decodes as "latin-1" 
into strings and then uses with a real dict.  But dumbdbm.__contains__ 
was missing the conversion, so it was trying to use a bytes object with 
a real dict, and that failed with an error (as bytes objects are not 
hashable).  This patch fixes dumbdbm.__contains__ so it converts the key 
first, fixing test_shelve on my machine.
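
For illustration only, here is a minimal sketch of the bug and the fix; it 
is not the attached patch.  TinyDumbDB and contains_unfixed() are made-up 
names standing in for the dumbdbm class, and the sketch assumes the index 
is a plain dict of string keys with byte keys converted via "latin-1".  
Note that in current Python releases bytes objects are hashable, so the 
unfixed lookup quietly answers False instead of raising.

```python
class TinyDumbDB:
    """Hypothetical stand-in for dumbdbm's dict-like class (illustration only)."""

    def __init__(self):
        # The real index: str keys in an ordinary dict.
        self._index = {}

    def __setitem__(self, key, value):
        # Byte keys are converted to str before touching the dict.
        self._index[key.decode("latin-1")] = value

    def __getitem__(self, key):
        return self._index[key.decode("latin-1")]

    def contains_unfixed(self, key):
        # The bug: the raw bytes key goes straight to the dict.
        return key in self._index

    def __contains__(self, key):
        # The fix: convert the key first, as the other methods do.
        return key.decode("latin-1") in self._index


db = TinyDumbDB()
db[b"spam"] = b"eggs"
print(b"spam" in db)                 # lookup with the conversion
print(db.contains_unfixed(b"spam"))  # lookup without the conversion
```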

But there's more!  Neal Norwitz pointed out that test_shelve /didn't/ 
fail on his machine.  That's because dumbdbm is the last resort of 
anydbm, and he had a superior database module available.  So the 
regression test suite was really missing two things:

    * test_dumbdbm should test dumbdbm.__contains__.
    * test_anydbm should test all the database modules available, not
      merely its first choice.

So this patch also adds test_write_contains() to test_dumbdbm, and a new 
external function to test_anydbm: dbm_iterate(), which returns an 
iterator over all database modules available to anydbm, /and/ internally 
forces anydbm to use that database module, restoring anydbm to its first 
choice when it finishes iteration.  I also renamed _delete_files() to 
delete_files() so it could be the canonical dbm cleanup function for 
other tests.
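
The idea behind dbm_iterate() can be sketched roughly as follows.  This is 
an assumption-laden illustration rather than the code in the patch: 
fake_anydbm is a made-up stand-in for the anydbm module, and its 
_defaultmod and candidates attributes are hypothetical names for "the 
backend currently in use" and "the backends that imported successfully".

```python
import types

# Hypothetical stand-in for the anydbm module (illustration only).
fake_anydbm = types.SimpleNamespace(
    _defaultmod="best_db",
    candidates=["best_db", "ok_db", "dumbdbm"],
)


def dbm_iterate(dispatcher=fake_anydbm):
    """Yield each available backend, forcing the dispatcher to use it."""
    first_choice = dispatcher._defaultmod
    try:
        for module in dispatcher.candidates:
            dispatcher._defaultmod = module
            yield module
    finally:
        # Restore the first choice when iteration finishes
        # or the generator is closed.
        dispatcher._defaultmod = first_choice


for module in dbm_iterate():
    # Inside the loop the dispatcher is pinned to the backend under test.
    assert fake_anydbm._defaultmod == module

print(fake_anydbm._defaultmod)  # back to the first choice
```

A test can then wrap its body in "for module in dbm_iterate():" and 
exercise every backend, not just the first one that happens to import.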

While I was at it, I noticed that test_whichdb.py did a good job of 
testing all the databases available, but with a slightly odd approach: 
it iterated over all the possible databases, and created new test 
methods--inserting them into the class object--for each one that was 
available.  I changed it to use dbm_iterate() and delete_files() from 
test.test_anydbm, so that that logic can live in only one place.  I 
didn't preserve the setattr() approach; I simply iterate over all the 
modules and run the tests inside one conventional method.

One final thought, for the folks who defined this "in Py3k, 
database-backed dict-like objects use byte objects as keys" interface.  
dumbdbm.keys() returns self._index.keys(), which means that it's serving 
up real strings, not byte objects.  Shouldn't it return 
[k.encode("latin-1") for k in self._index.keys()] ?  (Or perhaps change 
iterkeys to return that as a generator expression, and change keys() to 
return list(self.iterkeys())?)
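
A rough sketch of that second suggestion, under the assumption that 
_index is a plain dict keyed by strings.  IndexSketch is a made-up class 
used only to make the snippet self-contained; it is not dumbdbm itself.

```python
class IndexSketch:
    """Hypothetical stand-in holding a dumbdbm-style index of str keys."""

    def __init__(self, index):
        self._index = dict(index)

    def iterkeys(self):
        # Lazily convert each internal str key back to a bytes object.
        return (k.encode("latin-1") for k in self._index.keys())

    def keys(self):
        # Build the list from the generator so the conversion lives in one place.
        return list(self.iterkeys())


db = IndexSketch({"spam": (0, 4), "caf\xe9": (4, 4)})
print(db.keys())
```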

Thanks,

/larry/
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lch.py3k.dumbdb.contains.diff.1.txt
Url: http://mail.python.org/pipermail/python-3000/attachments/20070823/d08bb467/attachment-0001.txt