dbm clone with serious specs wanted

Someone has asked me for a dbm clone that can store 16M keys of 350 bytes each and runs on Linux, HP-UX, and NT. That's 5.6 gigabytes in keys alone! I presume most classic approaches won't cut it, since total file size is typically limited by the seek system call, internal data structures, and/or the file index format to 2 GB (signed longs) or 4 GB (unsigned longs). Does anyone have an idea where to start looking? Does a Python extension already exist?

--Guido van Rossum (home page: http://www.python.org/~guido/)
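The size arithmetic in the question can be checked with a few lines of Python. This is only a rough sketch: it assumes "16M" means 16 million (decimal), counts key bytes only, and ignores values and any per-record overhead. The names are made up for illustration.

```python
# Back-of-the-envelope check of the key volume against 32-bit file offsets.
# Assumption: "16M" is read as 16 million keys, not 16 * 2**20.
NUM_KEYS = 16_000_000
KEY_SIZE_BYTES = 350

total_key_bytes = NUM_KEYS * KEY_SIZE_BYTES

# Largest offsets representable in a 32-bit signed / unsigned long.
SIGNED_32_MAX = 2**31 - 1
UNSIGNED_32_MAX = 2**32 - 1

print(f"keys alone:          {total_key_bytes:,} bytes")
print(f"signed 32-bit max:   {SIGNED_32_MAX:,} bytes")
print(f"unsigned 32-bit max: {UNSIGNED_32_MAX:,} bytes")
print("fits in signed 32-bit offset:  ", total_key_bytes <= SIGNED_32_MAX)
print("fits in unsigned 32-bit offset:", total_key_bytes <= UNSIGNED_32_MAX)
```

Under those assumptions the keys alone overflow both limits, so any format that addresses the file with a single 32-bit byte offset is ruled out before values are even counted.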

Christopher Petrilli [petrilli@amber.org] wrote:
I just did some checking. First, Robin Dunn has an interface, but it's not currently compatible with BerkeleyDB 3.x, which just came out. It shouldn't be hard to retrofit. Anyway, the limits are based on page size:

512-byte page: 2 TB
64 KB page: 256 TB

It uses 32-bit numbers for pages, so I assume that is also a reflection of the number of keys allowed, given that I believe one key must use a minimum of one page. I know that I've pushed earlier releases to around 50 GB without trouble, but you might see issues related to the number of keys. I'd ask Sleepycat directly, as they're amazingly responsive.

Chris
--
| Christopher Petrilli
| petrilli@amber.org
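The page-size limits quoted above follow from multiplying the number of addressable pages by the page size. A small sketch, assuming exactly 2**32 addressable pages and binary units (TB read as 2**40 bytes); the one-key-per-page bound is the guess from the message, not a confirmed BerkeleyDB rule, and the function name is hypothetical.

```python
# Maximum database size for a given page size, assuming 32-bit page numbers.
PAGE_NUMBER_BITS = 32
MAX_PAGES = 2**PAGE_NUMBER_BITS
TIB = 2**40  # "TB" in the message is read here as 2**40 bytes


def max_db_size_bytes(page_size_bytes):
    """Return the largest file addressable with MAX_PAGES pages."""
    return MAX_PAGES * page_size_bytes


for page_size in (512, 64 * 1024):
    size_tib = max_db_size_bytes(page_size) // TIB
    print(f"{page_size:>6}-byte pages: {size_tib} TB")

# If every key really needed a page of its own (an unverified assumption),
# the key count would be capped by the page count.
NUM_KEYS = 16_000_000
print("16M keys within page-count bound:", NUM_KEYS <= MAX_PAGES)
```

Even under the pessimistic one-key-per-page guess, 16 million keys sit far below the 2**32 page count, so the page numbering alone would not be the obstacle.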

Guido van Rossum wrote:
I'd suggest using a dbm-style wrapper around the DB-API and then trying out the many cross-platform databases. IBM DB2 comes to mind; it can certainly handle these sizes given the right hardware.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 21 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
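A dbm-style wrapper around a DB-API connection might look roughly like the sketch below. It is illustrative only: the class name `DbApiDbm` and the table name `kv` are invented, `sqlite3` stands in for whatever DB-API module is actually used (DB2 or otherwise), and both the `?` parameter style and `INSERT OR REPLACE` are SQLite-specific and would need adapting for another driver.

```python
import sqlite3
from collections.abc import MutableMapping


class DbApiDbm(MutableMapping):
    """Hypothetical dict-like (dbm-style) view of one key/value table."""

    def __init__(self, connection):
        self._conn = connection
        cur = self._conn.cursor()
        cur.execute(
            "CREATE TABLE IF NOT EXISTS kv (k BLOB PRIMARY KEY, v BLOB NOT NULL)"
        )
        self._conn.commit()

    def __getitem__(self, key):
        cur = self._conn.cursor()
        cur.execute("SELECT v FROM kv WHERE k = ?", (key,))
        row = cur.fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __setitem__(self, key, value):
        cur = self._conn.cursor()
        # SQLite-specific upsert; other databases spell this differently.
        cur.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))
        self._conn.commit()

    def __delitem__(self, key):
        cur = self._conn.cursor()
        cur.execute("DELETE FROM kv WHERE k = ?", (key,))
        if cur.rowcount == 0:
            raise KeyError(key)
        self._conn.commit()

    def __iter__(self):
        cur = self._conn.cursor()
        cur.execute("SELECT k FROM kv")
        # Fine for a demo; 16M keys would call for batched fetches instead.
        return iter([row[0] for row in cur.fetchall()])

    def __len__(self):
        cur = self._conn.cursor()
        cur.execute("SELECT COUNT(*) FROM kv")
        return cur.fetchone()[0]

    def close(self):
        self._conn.close()


# Usage: an in-memory SQLite database plays the role of the real server.
db = DbApiDbm(sqlite3.connect(":memory:"))
db[b"spam"] = b"eggs"
print(db[b"spam"], len(db), b"spam" in db)
```

Committing after every write keeps the sketch simple but would be slow at the scale discussed here; a real wrapper would presumably batch writes into larger transactions.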
