[Q] Databases -- Gadfly vs Metakit vs SQLite

Tim Churches tchur at optushome.com.au
Sun Mar 16 00:29:39 CET 2003


On Sun, 2003-03-16 at 06:31, achrist at easystreet.com wrote:
> I'm looking for a simple database for a single-user Windows desktop
> application.  The user's data will be text, maybe 10-20 data tables,
> with identifying keys and a few short fields plus plenty of long
> text memo-type fields.  I should like to plan for databases up to
> about 50k rows and 100 MB of data, but the typical will be far less
> than this.
> 
> It looks like Python gives me 3 good choices for this, Gadfly,
> SQLite, Metakit.  I'd like comments that compare and contrast these
> for such an application.

Don't forget Sleepycat's Berkeley DB (BSD DB)- see
http://www.sleepycat.com/ - which has a good reputation for robustness
and efficiency. It also has the advantage of being part of the standard
Python distribution on most platforms. The big problem is that the
ancient  version of BSD DB (1.8) included in versions of Python up to
2.2 has been terribly broken on many platforms. This situation has
thankfully been resolved in Python 2.3, which includes a recent release
of BSD DB and the latest Python interface code for it
(http://pybsddb.sourceforge.net).

> 
> This is a desktop productivity app, so I'd like to get snappy response,
> no 1 or 2 second delays for a single update or retrieval.  This should
> work to such a standard on a user's 300 MHz, 64 MB Windows machine.  

That might rule out Gadfly - it depends whether appropriate indexes will
be available to optimise all your queries.

> 
> Does Gadfly need to fetch the entire database into RAM?  This would be
> fine for me 99% of the time, but I don't want to worry about the user
> with the big database and the small machine.

Yes, it does, AFAIK.

> 1. Very high reliability -- Which of these can give us zero database
> failures from 1000's of users?

BSD DB is probably the most widely used, but all of the systems you have
identified are of production quality. But that doesn't mean you
shouldn't test, test, test in the context of your app.

> 3. Storage efficiency -- We don't want 10 MB of user data to take 
> 100 MB of disk space.  I know that disk is cheap, but we want the
> data to be easy to back up, send,  and transport.

I think they all support variable length text fields.

> Can anyone with experience with more than one of these advise?

Metakit is great but performance drops dramatically when the number of
records rises above about 250,000 - this is clearly acknowledged in the
Metakit documentation, and it is true. BSD DB performance when using the
hash table format also falls off dramatically with more than a few
hundred thousand records (but there are lots of tuning parameters which
might improve this). I've only just started to evaluate SQLite, but it
looks like its performance holds up well with large numbers of records.
Its main drawback is that everything is stored as text - but that
doesn't sound like a problem for your application (nor mine). The SQLite
data structure, in which records are stored on the leaves of a b-tree in
which integer record IDs are the keys, is interesting. B-trees are also
used for indexes.

Please summarise the responses you receive - embedded persistence for
Python is a topic of ongoing interest for many people.

-- 

Tim C

PGP/GnuPG Key 1024D/EAF993D0 available from keyservers everywhere
or at http://members.optushome.com.au/tchur/pubkey.asc
Key fingerprint = 8C22 BF76 33BA B3B5 1D5B  EB37 7891 46A9 EAF9 93D0

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 196 bytes
Desc: This is a digitally signed message part
URL: <http://mail.python.org/pipermail/python-list/attachments/20030316/65ffe408/attachment.pgp>


More information about the Python-list mailing list