Which non SQL Database ?
Dan Stromberg
drsalists at gmail.com
Sun Dec 5 16:55:35 EST 2010
On Sun, Dec 5, 2010 at 12:01 AM, John Nagle <nagle at animats.com> wrote:
> On 12/4/2010 8:44 PM, Monte Milanuk wrote:
>>
>> On 12/4/10 3:43 PM, Jorge Biquez wrote:
>>
>>> I do not see a good reason for not using Sqlite3 BUT if for some reason
>>> would not be an option.... what plain schema of files would you use?
>>
>> Would shelve work?
>
> There are some systems for storing key-value pairs in files.
>
> Underneath "shelve" is some primitive database, dbm, gdbm or bsddb.
> "bsddb" is deprecated and was removed from Python 3.x. "dbm" has
> some classic problems. "gdbm" is an improved version of "dbm".
> None of these handle access from multiple processes, or crash
> recovery. We're looking at 1979 technology here.
>
> SQLite works right when accessed from multiple processes. SQLite
> is the entry-level database technology for Python today. It handles
> the hard cases, like undoing transactions after a crash and
> locking against multiple accesses. Lookup performance is good;
> simultaneous update by multiple processes, though, is not so
> good. When you have a web site that has many processes hitting
> the same database, it's time to move up to MySQL or Postgres.
>
> There's a lot of interest in "non-SQL" databases for very
> large distributed systems. You worry about this if you're Facebook
> or Google, or are running a big game server farm.
SQLite isn't exactly no SQL.
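To illustrate the point, here is a minimal sketch of SQLite used as a simple key-value store from Python. Even for this trivial case the access goes through SQL statements. The table name `kv` and the helpers `put` and `get` are made up for this example; they are not part of the sqlite3 module.

```python
import sqlite3

# In-memory database for illustration; pass a filename to persist to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")

def put(key, value):
    """Store a value under key (hypothetical helper)."""
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
            (key, value),
        )

def get(key, default=None):
    """Fetch the value for key, or default if absent (hypothetical helper)."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row is not None else default

put("spam", "eggs")
put("spam", "ham")            # overwrites the earlier value
print(get("spam"))
print(get("missing", "n/a"))
```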
I've used the bsddb and gdbm modules quite a bit. I've found that
bsddb tables tend to get corrupted (whether used from CPython or C),
e.g. when a filesystem fills up. I quite like the gdbm module, though,
and have been using it in my current project.
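A minimal sketch of gdbm-style usage follows. It assumes Python 3, where the old gdbm module is available as dbm.gnu, and it falls back to the pure-Python dbm.dumb when the GNU library is not installed. The file name and the keys are invented for the example. Note that keys and values are stored as bytes, so anything else has to be converted on the way in and out.

```python
import dbm.dumb
import os
import tempfile

try:
    import dbm.gnu as db_module   # Python 3 name for the old "gdbm" module
except ImportError:
    db_module = dbm.dumb          # pure-Python fallback with a similar interface

# Hypothetical database path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.db")

# "c" opens for reading and writing, creating the database if needed.
with db_module.open(path, "c") as db:
    db[b"count"] = str(42).encode("ascii")
    db[b"name"] = "Monte".encode("utf-8")

# "r" opens the existing database read-only.
with db_module.open(path, "r") as db:
    count = int(db[b"count"].decode("ascii"))
    name = db[b"name"].decode("utf-8")

print(count, name)
```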
If you find that converting your database keys and values to/from
strings is expensive, you could check out
http://stromberg.dnsalias.org/~dstromberg/cachedb.html which is a
caching wrapper around gdbm and other single-table database interfaces
supporting the same API.
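The general idea of such a caching wrapper can be sketched as below. This is not the cachedb API; the class name `CachingStore`, its methods, and the choice of pickle for value conversion are all assumptions made for illustration. The wrapper keeps already-converted objects in memory, so repeated lookups skip the bytes conversion, and it writes changed entries back only on `sync()`.

```python
import pickle

class CachingStore:
    """Hypothetical caching wrapper around a bytes-to-bytes mapping."""

    def __init__(self, backend, dumps=pickle.dumps, loads=pickle.loads):
        self.backend = backend    # e.g. an open dbm handle, or a dict
        self.dumps = dumps        # value -> bytes
        self.loads = loads        # bytes -> value
        self.cache = {}           # str key -> already-converted value
        self.dirty = set()        # keys changed since the last sync()

    def __getitem__(self, key):
        if key not in self.cache:
            # Convert once, on the first access; raises KeyError if absent.
            self.cache[key] = self.loads(self.backend[key.encode("utf-8")])
        return self.cache[key]

    def __setitem__(self, key, value):
        # Defer the conversion and the write until sync().
        self.cache[key] = value
        self.dirty.add(key)

    def sync(self):
        """Write all modified entries back to the backend."""
        for key in self.dirty:
            self.backend[key.encode("utf-8")] = self.dumps(self.cache[key])
        self.dirty.clear()

backend = {}                      # stand-in for a gdbm handle
store = CachingStore(backend)
store["point"] = (1, 2)
store.sync()

fresh = CachingStore(backend)     # a new wrapper reads from the backend
print(fresh["point"])
```

A plain dict stands in for the database here so the sketch runs anywhere; any object that maps bytes keys to bytes values, such as an open dbm handle, should work the same way.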
As far as multiple processes go, IINM, gdbm supports either a single
writer or multiple readers at any one time, but not both at once.