increasing the page size of a dbm store?
Peter Otten
__peter__ at web.de
Wed Nov 27 06:50:54 EST 2019
Tim Chase wrote:
> Working with the dbm module (using it as a cache), I've gotten the
> following error at least twice now:
>
> HASH: Out of overflow pages. Increase page size
> Traceback (most recent call last):
>   [snip]
>   File ".py", line 83, in get_data
>     db[key] = data
> _dbm.error: cannot add item to database
>
> I've read over the py3 docs on dbm
>
> https://docs.python.org/3/library/dbm.html
>
> but don't see anything about either "page" or "size" contained
> therein.
>
> There's nothing particularly complex as far as I can tell. Nothing
> more than a straightforward
>
>     import dbm
>     with dbm.open("cache", "c") as db:
>         for thing in source:
>             key = extract_key_as_bytes(thing)
>             if key in db:
>                 data = db[key]
>             else:
>                 data = long_process(thing)
>                 db[key] = data
>
> The keys can get a bit large (think roughly book-title length), but
> not huge. I have 11k records so it seems like it shouldn't be
> overwhelming, but this is the second batch where I've had to nuke the
> cache and start afresh. Fortunately I've tooled the code so it can
> work incrementally and no more than a hundred or so requests have to
> be re-performed.
>
> How does one increase the page-size in a dbm mapping? Or are there
> limits that I should be aware of?
>
> Thanks,
>
> -tkc
>
> PS: FWIW, this is Python 3.6 on FreeBSD in case that exposes any
> germane implementation details.
I found the message here
https://github.com/lattera/freebsd/blob/master/lib/libc/db/hash/hash_page.c#L695
but it's not immediately obvious how to increase the page size, and the
readme
https://github.com/lattera/freebsd/tree/master/lib/libc/db/hash
only states
"""
"bugs" or idiosyncracies
If you have a lot of overflows, it is possible to run out of overflow
pages. Currently, this will cause a message to be printed on stderr.
Eventually, this will be indicated by a return error code.
"""
what you learned the hard way.
Python has its own "dumb and slow but simple dbm clone", dbm.dumb -- maybe
it's smart and fast enough for your purpose?
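
Here is a minimal sketch of the same caching loop on top of dbm.dumb. It is
not tested against your data: long_process() and extract_key_as_bytes() are
hypothetical stand-ins for your functions, get_data() is only assumed to
look roughly like this, and the cache goes into a temporary directory so
that it cannot clash with an existing "cache" file.

```python
import dbm
import dbm.dumb
import os
import tempfile


def extract_key_as_bytes(thing):
    # Hypothetical stand-in for the real key extraction.
    return thing.encode("utf-8")


def long_process(thing):
    # Hypothetical stand-in for the expensive computation.
    return thing.upper().encode("utf-8")


def get_data(db, thing):
    # Return the cached value if present, otherwise compute and store it.
    key = extract_key_as_bytes(thing)
    if key in db:
        return db[key]
    data = long_process(thing)
    db[key] = data
    return data


# Use a fresh directory so the sketch does not touch an existing cache.
cache_path = os.path.join(tempfile.mkdtemp(), "cache")
source = ["a title", "another title", "a title"]

# Opening via dbm.dumb.open() forces the pure-Python backend instead of
# whatever dbm.open() would pick on the platform.
with dbm.dumb.open(cache_path, "c") as db:
    results = [get_data(db, thing) for thing in source]
    cached_keys = len(db)

# After closing, whichdb() reports which backend owns the files.
backend = dbm.whichdb(cache_path)
```

Note that a cache file written by the platform's default backend cannot be
read by dbm.dumb, so switching means starting with an empty cache once more.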