Python Sybase slow compared to Perl?

Fri Sep 20 22:57:07 EDT 2002

> I have always written my database scripts in Perl.  I am very
> interested in Python, and would like to start using it for all my
> scripting.  I am not trying to start a flame war, I'd just like to
> understand why a particular script I wrote runs so much faster under
> Perl.  The script in question takes roughly 3 times as long to run
> under Python.  It is very possible that there is a faster way to do
> this under Python.

[snip]

> Here is the Perl code snippet:

[snip]

>    while ( @data = $sth->fetchrow_array() ) {

[snip]

> and here is the Python code snippet:

[snip]

>    while 1:
>        row = c.fetchone()

snip]

> No matter what table how many records there are, the Python script
> always takes longer to run (usually by a facter of 3 or more).

I suspect that some of the difference is due to the fact that you are
fetching an array of rows using Perl but only a single row at a time
in Python.

Having said that, the current implementation of the Python module does
not do array binding for cursors.  The Sybase.py module is built on
top of an extension module called sybasect which does support array
binding.

To find out if that is the problem (until I get around to fixing the
cursor to do array binding) you could try doing this:

   # output the actual data
   num_recs = 0
   if os.path.isfile( "%s.gz" % dumpfile ):
       os.remove( "%s.gz" % dumpfile )
   file = os.popen( "gzip -c > %s.gz" % dumpfile, "w" )
   result_sets = db.execute( "SELECT %s FROM %s" % ( select, table ) )
   for row in result_sets[0]:

       line = []
       for i in range( len( row ) ):
           line.append( "%-*s" % ( lengths[i], row[i] ) )

       print >> file, "".join( line )
       num_recs += 1

   file.close()
   c.close()

Note that the c.fetchone() method retrieves rows one at a time from
the server so does not consume much memory when there are a huge
number of rows returned.  The db.execute() array binds a buffer for 16
rows so retrieves the entire result 16 rows at a time before
returning.  It will use a large amount of memory if there are a large
number of rows returned.

> I am still a Python newbie, so I may be doing something wrong.  Has anyone 
> else experienced similar results?  Is this due to a poorly optimized 
> Sybase module?  I really want to use Python, and I don't mind giving up a 
> little speed, but this is more than I can accept.
> 
> I ran the Python script using the profiler and the output is below.  Any 
> help would be much appreciated.  Thanks.

> profile output:
> 
>  Ordered by: standard name
> 
>  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>       1    0.000    0.000 1230.540 1230.540 <string>:1(?)
>       4    0.000    0.000    0.000    0.000 Sybase.py:114(_get_diag)
>       2    0.000    0.000    0.000    0.000 Sybase.py:141(_ct_errors)
>       3    0.000    0.000    0.000    0.000 Sybase.py:148(_row_bind)
>  900345   96.240    0.000   96.240    0.000 Sybase.py:166(_extract_row)
>  900348  177.670    0.000  273.910    0.000 Sybase.py:180(_fetch_rows)
>       3    0.000    0.000    0.010    0.003 Sybase.py:229(__init__)
>       3    0.000    0.000    0.000    0.000 Sybase.py:246(__del__)
>  900361   83.230    0.000  267.090    0.000 Sybase.py:258(_lock)
>  900361   86.720    0.000  269.000    0.000 Sybase.py:262(_unlock)
>       3    0.000    0.000    0.010    0.003 Sybase.py:312(close)
>       3    0.000    0.000    0.000    0.000 Sybase.py:328(execute)
>  900348  178.280    0.000  988.260    0.001 Sybase.py:365(fetchone)

[snip]

The other thing which may be causing some of the slowness is the
locking.  The module allows threads to share connections which
requires a fair amount of lock maintenance.

- Dave

-- 
http://www.object-craft.com.au