fastest native python database?

Ethan Furman ethan at stoneleaf.us
Thu Jun 18 22:37:36 EDT 2009


Pierre Quentel wrote:
> On 18 June, 05:28, per <perfr... at gmail.com> wrote:
> 
>>hi all,
>>
>>i'm looking for a native python package to run a very simple
>>database. i was originally using cPickle with dictionaries for my problem,
>>but i was making dictionaries out of very large text files (around
>>1000MB in size) and pickling was simply too slow.
> 
> buzhug doesn't use SQL statements, but a more Pythonic syntax:
> 
> from buzhug import Base
> db = Base('foo').create(('name',str),('age',int))
> db.insert('john',33)
> # simple queries
> print db(name='john')
> # complex queries
> print [ rec.name for rec in db if rec.age > 30 ]
> # update
> rec.update(age=34)
> 
> I made a few speed comparisons with Gadfly, KirbyBase (another pure-
> Python DB, not maintained anymore) and SQLite. You can find the
> results on the buzhug home page : http://buzhug.sourceforge.net
> 
> The conclusion is that buzhug is much faster than the other pure-
> Python db engines, and (only) 3 times slower than SQLite
> 
> - Pierre

Howdy, Pierre!

I have also written a pure Python implementation of a database, one that 
uses dBase III or VFP 6 .dbf files.  Any chance you could throw it into 
the mix to see how quickly (or slowly!) it runs?
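
For what it's worth, here is a minimal sketch of how each step could be timed. It assumes nothing about how the buzhug benchmarks are actually driven: `time_step` is a hypothetical helper, and the stand-in workload is only there so the snippet runs on its own (written for Python 3).

```python
import time

def time_step(label, func, repeat=3):
    """Run func() `repeat` times and return the best wall-clock time in seconds."""
    best = None
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    print('%-10s %.6f s' % (label, best))
    return best

# stand-in workload; swap in the insert/select/delete/update steps
def fake_insert():
    rows = []
    for i in range(10000):
        rows.append((i, i * 2, 'row %d' % i))
    return rows

best = time_step('insert', fake_insert)
```

Taking the best of a few runs is just one way to damp noise from disk caching; a single run per step would work too.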

The code to run the same steps is (after an import dbf):

#insert test
table = dbf.Table('/tmp/tmptable', 'a N(6.0), b N(6.0), c C(100)')
# if recs is list of tuples
for rec in recs:
    table.append(rec)
# elif recs is list of lists
#for a, b, c in recs:
#   current = table.append()
#   current.a = a
#   current.b = b
#   current.c = c

#select1 test
for i in range(100):
    nb = len(table)
    if nb:
       avg = sum([r.b for r in table])/nb

#select2 test
for num_string in num_strings:
    recs = table.find({'c':'%s'%num_string}, contained=True)
    nb = len(recs)
    if nb:
        avg = sum([r.b for r in recs])/nb
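
For anyone porting the select2 step to another engine, here is a plain-Python sketch of what it computes, assuming `contained=True` means a substring match on field c. The sample rows and search strings are made up for illustration, and the snippet does not use the dbf module at all (Python 3).

```python
# (a, b, c) rows standing in for table records; values are illustrative only
rows = [
    (1, 10, 'one'),
    (2, 20, 'twenty one'),
    (3, 60, 'fifty'),
]
num_strings = ['one', 'fifty', 'ninety']

averages = {}
for num_string in num_strings:
    # keep the rows whose c field contains the search string
    matches = [r for r in rows if num_string in r[2]]
    nb = len(matches)
    if nb:
        # true division here; Python 2's / would truncate for ints
        averages[num_string] = sum(r[1] for r in matches) / nb

print(averages)
```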

#delete1 test
for rec in table:
    if 'fifty' in rec.c:
       rec.delete_record()
# to purge the records would then require a table.pack()

#delete2 test
for rec in table:
    if 10 < rec.a < 20000:
       rec.delete_record()
# again, permanent deletion requires a table.pack()
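
The delete-then-pack behaviour follows from the .dbf layout, where each record carries a deletion flag and stays in the file until the table is rewritten. A toy sketch of that idea in plain Python follows; the `ToyTable` class is hypothetical, it is not how the dbf module is implemented, and it makes no claim about whether the real `len(table)` counts flagged records.

```python
class ToyTable:
    """In-memory stand-in for a table with dBase-style soft deletes."""

    def __init__(self, rows):
        # each slot is [deleted_flag, row]
        self._slots = [[False, row] for row in rows]

    def __len__(self):
        # in this toy, flagged records occupy a slot until pack() runs
        return len(self._slots)

    def delete_where(self, predicate):
        # only set the flag; nothing is removed yet
        for slot in self._slots:
            if predicate(slot[1]):
                slot[0] = True

    def pack(self):
        # rewrite the storage without the flagged records
        self._slots = [slot for slot in self._slots if not slot[0]]

    def rows(self):
        return [slot[1] for slot in self._slots if not slot[0]]

table = ToyTable([(5, 'five'), (50, 'fifty'), (51, 'fifty one')])
table.delete_where(lambda row: 'fifty' in row[1])
size_before_pack = len(table)
table.pack()
size_after_pack = len(table)
```

This is why a fair timing of the delete tests probably ought to say whether the pack step is included.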

#update1 test
table.order('a')
for i in range(100):  # update description says 1000, update code is 100
    records = table.query(python='10*%d <= a < 10*%d' %(10*i,10*(i+1)))
    for rec in records:
       rec.b *= 2

#update2 test
records = table.query(python="0 <= a < 1000")
for rec in records:
    rec.c = new_c[rec.a]

Thanks, I hope!  :)

~Ethan~
http://groups.google.com/group/python-dbase
-------------- next part --------------
A non-text attachment was scrubbed...
Name: python-dbf.zip
Type: application/x-zip-compressed
Size: 23032 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20090618/30bcb34d/attachment.bin>

