sqlite3 import performance

Gerhard Häring gh at ghaering.de
Sat Sep 6 01:30:18 CEST 2008


Ben Lee wrote:
> hi folks --
> 
> a quick python and sqlite3 performance question.  i find that
> inserting a million rows of in-memory data into an in-memory database
> via a single executemany() is about 30% slower than using the sqlite3
> CLI and the .import command (reading the same data from a disk file,
> even.)  i find this surprising, executemany() i assume is using a
> prepared statement and this is exactly what the .import command does
> (based on my quick perusal of the source.)
> 
> is this discrepancy to be expected?  where is the overhead coming
> from? [...]

Ok, I'll bite.

Well, first, the a 30 % slowdown with a Python DB-API wrapper compared
to the native commandline tool of the database is to be considered still
quite reasonable, in my opinion.

About a year ago I compared the performance of pysqlite vs. the other
SQLite wrapper, APSW. At the time, APSW was a bit faster, not
excessively, but measurable. In meaningless benchmarks like yours ;-)

So I changed pysqlite here and there to get the same performance as
APSW. Only minor tweaks, nothing spectacular. And a few hardcore tricks
as well, like special-casing *not-subclassed* classes. The result was
pysqlite 2.3.5:
http://oss.itsystementwicklung.de/trac/pysqlite/wiki/2.3.5_Changelog

These optmizations are not yet in Python 2.5.x, but they can be found in
the sqlite3 module of the Python 2.6/3.0 betas.

Well, here are my results of your benchmark :-)

-- Gerhard

# with Python 2.5.2's sqlite3 module
gerhard at lara:~/tmp$ python t.py
generating data...
done!
testing 1000000 inserts...
[42.795290946960449, 44.337385892868042, 46.35642409324646]


# with pysqlite 2.5.0, which I released earlier today
gerhard at lara:~/tmp$ python t.py
generating data...
done!
testing 1000000 inserts...
[33.027599096298218, 32.73675012588501, 32.823790073394775]

# SQLite commandline

gerhard at lara:~/tmp$ time sqlite3 -init sqlcmds ':memory:' '.quit'
real    0m32.514s
gerhard at lara:~/tmp$ time sqlite3 -init sqlcmds ':memory:' '.quit'
real    0m32.576s
gerhard at lara:~/tmp$ time sqlite3 -init sqlcmds ':memory:' '.quit'
real    0m32.604s





More information about the Python-list mailing list