Which non SQL Database ?

Dan Stromberg drsalists at gmail.com
Sun Dec 5 16:55:35 EST 2010


On Sun, Dec 5, 2010 at 12:01 AM, John Nagle <nagle at animats.com> wrote:
> On 12/4/2010 8:44 PM, Monte Milanuk wrote:
>>
>> On 12/4/10 3:43 PM, Jorge Biquez wrote:
>>
>>> I do not see a good reason for not using Sqlite3 BUT if for some reason
>>> would not be an option.... what plain schema of files would you use?
>>
>> Would shelve work?
>
>    There are some systems for storing key-value pairs in files.
>
>    Underneath "shelve" is some primitive database, dbm, gdbm or bsddb.
> "bsddb" is deprecated and was removed from Python 3.x.  "dbm" has
> some classic problems.  "gdbm" is an improved version of "dbm".
> None of these handle access from multiple processes, or crash
> recovery.  We're looking at 1979 technology here.
>
>   SQLite works right when accessed from multiple processes.  SQLite
> is the entry-level database technology for Python today.  It handles
> the hard cases, like undoing transactions after a crash and
> locking against multiple accesses.  Lookup performance is good;
> simultaneous update by multiple processes, though, is not so
> good.  When you have a web site that has many processes hitting
> the same database, it's time to move up to MySQL or Postgres.
>
>   There's a lot of interest in "non-SQL" databases for very
> large distributed systems.  You worry about this if you're Facebook
> or Google, or are running a big game server farm.

SQLite isn't exactly no SQL.
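
To illustrate the point, here is a minimal sketch of using SQLite from Python as a simple key/value store. Everything still goes through SQL statements. The table name `kv` and the function name `kv_demo` are made up for this example.

```python
import sqlite3


def kv_demo():
    # An in-memory database keeps the example self-contained;
    # pass a filename instead to persist the data on disk.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")

    # Used as a context manager, the connection commits on success
    # and rolls back if the block raises an exception.
    with conn:
        conn.execute("INSERT INTO kv VALUES (?, ?)", ("alice", "42"))

    row = conn.execute(
        "SELECT value FROM kv WHERE key = ?", ("alice",)
    ).fetchone()
    conn.close()
    return row[0]


if __name__ == "__main__":
    print(kv_demo())
```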

I've used the bsddb and gdbm modules quite a bit.  I've found that
bsddb tables tend to get corrupted (whether used from CPython or C),
e.g. when a filesystem fills up.  I quite like the gdbm module, though,
and have been using it in my current project.
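
For anyone who hasn't used these modules, a minimal sketch of the dictionary-style interface follows. It assumes Python 3, where the old `gdbm` module is available as `dbm.gnu` and the generic `dbm.open()` picks whichever backend is installed. The function name `demo` and the sample keys are made up for illustration.

```python
import dbm
import os
import tempfile


def demo(path):
    # 'c' opens the database read-write, creating it if necessary.
    with dbm.open(path, "c") as db:
        db[b"alice"] = b"42"   # keys and values are stored as bytes
        db[b"bob"] = b"7"

    # 'r' reopens the existing database read-only.
    with dbm.open(path, "r") as db:
        return db[b"alice"], b"carol" in db


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(os.path.join(tmp, "example.db")))
```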

If you find that converting your database keys and values to/from
strings is expensive, you could check out
http://stromberg.dnsalias.org/~dstromberg/cachedb.html, which is a
caching wrapper around gdbm and other single-table database interfaces
supporting the same API.
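
The general idea of such a wrapper can be sketched as below. This is not the actual cachedb code or its API; `CachingStore` and its methods are hypothetical, and JSON stands in for whatever serialization you really use. Decoded values are kept in memory, so conversion to and from bytes happens only on a cache miss or when flushing.

```python
import json


class CachingStore:
    """Hypothetical write-back cache over a dbm-style mapping (bytes -> bytes)."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}      # key -> already-decoded Python value
        self._dirty = set()   # keys changed since the last flush

    def __getitem__(self, key):
        if key not in self._cache:
            # Cache miss: pay the decoding cost once (KeyError propagates).
            raw = self._backend[key.encode("utf-8")]
            self._cache[key] = json.loads(raw.decode("utf-8"))
        return self._cache[key]

    def __setitem__(self, key, value):
        # Defer the encoding cost until flush().
        self._cache[key] = value
        self._dirty.add(key)

    def flush(self):
        # Write every modified entry back to the underlying database.
        for key in self._dirty:
            encoded = json.dumps(self._cache[key]).encode("utf-8")
            self._backend[key.encode("utf-8")] = encoded
        self._dirty.clear()
```

Any mapping with bytes keys and values works as the backend, including an open dbm database. Note that in this sketch a cached value mutated in place is not marked dirty; it has to be assigned again before flush() will write it out.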

As for multiple processes: if I'm not mistaken, gdbm supports a single
writer and multiple readers.


