[DB-SIG] Python/PostgreSQL API performance comparison
Chris Cogdon
chris at cogdon.org
Sat May 24 13:55:06 EDT 2003
Hey folks! This post is more a 'for your edification' than a call for
comments, but comments are welcome :)
I'm in the process of rewriting all the database glue logic for my
rather heavily used[1] website. In the process I discovered that the
new code runs significantly slower than the old, and that worries me
since the website IS rather heavily used, and I was hoping to move to
dynamic generation for a lot of the pages. I believe a good deal of the
slowdown was my overuse of 'elegant but slow' coding, and I intend to
remedy that. But I also decided to measure how much was due to the
change from the old 'pg' API to the dbapi-2.0 compliant PgSQL.
I wrote a little program that sends through a couple of complex queries
to the DB and retrieves the values using a variety of APIs:
- D'Arcy's 'pg' module
- D'Arcy's dbapi-2.0-compliant 'pgdb' module
- PgSQL
- PgSQL with DECLARE cursors off
- PoPy
In all cases, I ran the query three times as a 'warm up', then another
10 times, and timed those 10 using os.times(). The results are as follows:
(all times in seconds)
method              user  system  child user  child system    real
pg                 0.110   0.010       0.000         0.000  29.050
pgdb               4.490   0.010       0.000         0.000  33.230
PgSQL              3.640   0.000       0.000         0.000  40.930
PgSQL (nocursor)   3.630   0.010       0.000         0.000  32.280
PoPy               0.130   0.010       0.000         0.000  29.030
The python program to do this is available at
http://onca.catsden.net/~chris/testtimes.py
You won't be able to run this without a suitable schema in the DB, of
course :) This was run on a very lightly-loaded, dual proc Pentium
III/700MHz running redhat-7.2, postgresql 7.2.3. Obviously, this isn't
the production system :)
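For anyone who wants to reproduce the method without the script, here is a minimal sketch of the warm-up-then-time approach described above. It is not the contents of testtimes.py; the names time_runs and run_query are made up for illustration, and it is written for a current Python rather than the 2003-era interpreter.

```python
import os


def time_runs(run_query, warmups=3, runs=10):
    """Call run_query() `warmups` times untimed, then `runs` times timed.

    run_query is a hypothetical zero-argument callable that executes the
    query and fetches its rows through whichever API is being tested.
    Returns the five os.times() deltas in the order
    (user, system, child user, child system, real).
    """
    # Warm-up passes: results are discarded and not timed.
    for _ in range(warmups):
        run_query()

    start = os.times()
    for _ in range(runs):
        run_query()
    end = os.times()

    # os.times() yields five floats; subtract field by field.
    return tuple(e - s for s, e in zip(start, end))


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a database.
    deltas = time_runs(lambda: sum(range(100000)))
    print("[ " + ", ".join("%.3f" % d for d in deltas) + " ]")
```

To compare APIs, pass one callable per module, each running the same SQL and fetching all rows. The real-time field depends on the platform's os.times() support.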
Observations:
- Except for the C-compiled PoPy module, the dbapi-2.0 compliant
modules add around 3.5-4.5 seconds of CPU overhead over the 10 timed runs.
- Using PgSQL with DECLARE cursors turned on (the default) adds about
8.6 seconds of real time compared with turning them off (40.93 vs.
32.28), probably due to increased IO and/or lack of IO streaming.
- The PoPy module gives you 'pg' performance with 'dbapi-2.0' standard
interface.
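The differences behind these observations can be recomputed from the results with a few lines of Python. This is only a sketch of the arithmetic: the figures are copied from the results above, and the helper name overhead is made up for illustration.

```python
# Results from the post: (user, system, child user, child system, real),
# all in seconds, for the 10 timed runs.
results = {
    "pg": (0.110, 0.010, 0.000, 0.000, 29.050),
    "pgdb": (4.490, 0.010, 0.000, 0.000, 33.230),
    "PgSQL": (3.640, 0.000, 0.000, 0.000, 40.930),
    "PgSQL (nocursor)": (3.630, 0.010, 0.000, 0.000, 32.280),
    "PoPy": (0.130, 0.010, 0.000, 0.000, 29.030),
}


def overhead(method, baseline="pg"):
    """Return (extra user CPU seconds, extra real seconds) vs. baseline."""
    user = results[method][0] - results[baseline][0]
    real = results[method][4] - results[baseline][4]
    return round(user, 2), round(real, 2)


if __name__ == "__main__":
    for name in results:
        if name != "pg":
            cpu, real = overhead(name)
            print("%-18s cpu %+.2f  real %+.2f" % (name, cpu, real))
    # Cost of DECLARE cursors: PgSQL default vs. PgSQL with cursors off.
    print("cursor cost (real): %+.2f"
          % overhead("PgSQL", "PgSQL (nocursor)")[1])
```

Dividing any of these by 10 gives a rough per-run figure, since each total covers the 10 timed runs.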
If you have any questions, such as wanting to know details, or
suggestions on extra things I can do for the tests, please do write.
For myself, I'm probably going to switch to PoPy, or 'dumb down' the
interface and just use 'pg'. 3 seconds of CPU time is not something I
can afford.
[1] Heavily used = 2 million hits/day, 20,000 visitors/day, 1.5TBytes
outbound/month. Yes, I'm somewhat showing off ;)
--
("`-/")_.-'"``-._ Chris Cogdon <chris at cogdon.org>
. . `; -._ )-;-,_`)
(v_,)' _ )`-.\ ``-'
_.- _..-_/ / ((.'
((,.-' ((,/ fL