[DB-SIG] Python/PostgreSQL API performance comparison

Chris Cogdon chris at cogdon.org
Sat May 24 13:55:06 EDT 2003


Hey folks! This post is more a 'for your edification' than a call for 
comments, but comments are welcome :)

I'm in the process of rewriting all the database glue logic for my 
rather heavily used[1]  website. In the process I discovered that the 
new code runs significantly slower than the old, and that worries me 
since the website IS rather heavily used, and I was hoping to move to 
dynamic generation for a lot of the pages. I believe a good deal of the 
slowdown was my overuse of 'elegant but slow' coding, and I intend to 
remedy that. But I also decided to find out how much was due to the 
change from the old 'pg' API to the dbapi-2.0 compliant PgSQL.

I wrote a little program that sends a couple of complex queries to the 
DB and retrieves the values using a variety of APIs:

- D'Arcy's 'pg' module
- D'Arcy's dbapi-2.0-compliant 'pgdb' module
- PgSQL
- PgSQL with DECLARE cursors off
- PoPy

In all cases, I ran the query three times as a 'warm up', then another 
10 times, and timed those 10 runs using os.times(). The results are as 
follows:

(all times in seconds)

method             user   system  child user  child system    real
pg                0.110    0.010       0.000         0.000  29.050
pgdb              4.490    0.010       0.000         0.000  33.230
PgSQL             3.640    0.000       0.000         0.000  40.930
PgSQL (nocursor)  3.630    0.010       0.000         0.000  32.280
PoPy              0.130    0.010       0.000         0.000  29.030

The python program to do this is available at 
http://onca.catsden.net/~chris/testtimes.py

You won't be able to run this without a suitable schema in the DB, of 
course :) This was run on a very lightly-loaded, dual proc Pentium 
III/700MHz running redhat-7.2, postgresql 7.2.3. Obviously, this isn't 
the production system :)
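For anyone who just wants the shape of the measurement, here is a minimal 
sketch of the warm-up-then-time approach described above. It is not the 
actual testtimes.py: time_query and fake_query are hypothetical names, and 
fake_query is only a stand-in for "send a query and fetch all the rows" so 
the example runs without a database.

```python
import os


def time_query(run_query, warmups=3, repeats=10):
    """Call run_query `warmups` times untimed, then time `repeats` more calls.

    Returns the five os.times() deltas in order:
    (user, system, child user, child system, real/elapsed).
    """
    for _ in range(warmups):
        run_query()
    start = os.times()
    for _ in range(repeats):
        run_query()
    end = os.times()
    # os.times() returns a 5-field sequence, so subtract field by field.
    return tuple(e - s for s, e in zip(start, end))


def fake_query():
    # Hypothetical stand-in for a real query; replace with a call that
    # executes the SQL and fetches every row through the API under test.
    return sum(i * i for i in range(10000))


if __name__ == "__main__":
    deltas = time_query(fake_query)
    print("fake [ %s ]" % ", ".join("%.3f" % d for d in deltas))
```

The user and system columns only count CPU burned in the Python process 
itself, while the real column also includes time spent waiting on the 
database backend.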

Observations:

- Apart from the C-compiled PoPy module, the dbapi-2.0 compliant 
modules add around 3.5-4.5 seconds of CPU (user) time compared to 'pg'.
- Using PgSQL with DECLARE cursors turned on (the default) adds about 
8.65 seconds of real time compared to PgSQL with them off (40.930 vs 
32.280), probably due to increased IO and/or lack of IO streaming.
- The PoPy module gives you 'pg' performance with 'dbapi-2.0' standard 
interface.
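The overhead figures can be rechecked from the results with a few lines of 
Python. This is only a sketch: the numbers are copied from the results 
above, and overhead_vs is a hypothetical helper name, not part of any of 
the modules tested.

```python
# (user CPU seconds, real seconds) per method, copied from the results.
results = {
    "pg": (0.110, 29.050),
    "pgdb": (4.490, 33.230),
    "PgSQL": (3.640, 40.930),
    "PgSQL (nocursor)": (3.630, 32.280),
    "PoPy": (0.130, 29.030),
}


def overhead_vs(baseline, table):
    """Return {method: (extra user seconds, extra real seconds)} vs baseline."""
    base_user, base_real = table[baseline]
    return {
        name: (round(user - base_user, 3), round(real - base_real, 3))
        for name, (user, real) in table.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (d_user, d_real) in overhead_vs("pg", results).items():
        print("%-18s user %+.3f  real %+.3f" % (name, d_user, d_real))
```

Passing "PgSQL (nocursor)" as the baseline instead isolates the cost of the 
DECLARE cursors on their own.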


If you have any questions, such as wanting to know details, or 
suggestions on extra things I can do for the tests, please do write. 
For myself, I'm probably going to switch to PoPy, or 'dumb down' the 
interface and just use 'pg'. 3 seconds of CPU time is not something I 
can afford.

[1] Heavily used = 2 million hits/day, 20,000 visitors/day, 1.5TBytes 
outbound/month. Yes, I'm somewhat showing off ;)

-- 
    ("`-/")_.-'"``-._        Chris Cogdon <chris at cogdon.org>
     . . `; -._    )-;-,_`)
    (v_,)'  _  )`-.\  ``-'
   _.- _..-_/ / ((.'
((,.-'   ((,/   fL



