My Python/MySQL code is too slow

Duncan Smith buzzard at urubu.freeserve.co.uk
Wed Jun 19 19:03:26 EDT 2002


I have already posted this problem on mailing.database.mysql as I see it as
a problem with my lack of knowledge of SQL.  But no responses so far, and
maybe Python could do more of the work.  I repost the original question
below (SQL easier to follow), and the Python code below that.  Any help
appreciated.  TIA.

---------------------------------------------------------------------
I have two tables (say dbA and dbB) and need to find the number of rows in
dbB which have exactly 2 matching rows in dbA (on the selected columns).
I'm using Python/MySQL, but the relevant SQL is below.  This is slow, and
I'm sure it's possible without creating the temporary table and without the
join.  But everything I try fails (because of my limited knowledge of SQL).
Can anyone show me an efficient way of getting the answer?  All I need is
the number of rows in dbB which have exactly 2 matching rows in dbA.  Thanks
in advance.

CREATE TEMPORARY TABLE tmp SELECT var1, var2, var3 FROM dbA GROUP BY var1,
var2, var3 HAVING COUNT(*)=2;

SELECT dbB.var1, dbB.var2, dbB.var3, tmp.var1, tmp.var2, tmp.var3 FROM dbB,
tmp WHERE dbB.var1=tmp.var1 AND dbB.var2=tmp.var2 AND dbB.var3=tmp.var3;

Cheers.

Duncan Smith

------------------------------------------------------------------
import sys
import string
import MySQLdb

def MyFunc(dbname, tblname, pert_table, variables=()):

    """'dbname' is a MySQL database containing tables
    'tblname' and 'pert_table'.  'variables' is a
    sequence listing the relevant columns"""

    variables = list(variables)

    try:
        conn = MySQLdb.connect()
        curs = conn.cursor()
        curs.execute('USE %s' % (dbname,))
        if not variables:
            curs.execute('DESCRIBE %s' % (tblname,))
            variables = [x[0] for x in list(curs.fetchall())]
        vars = string.join(variables, ', ')

        #produce a temporary table containing all records
        #of tblname which occur in pairs and find P

        curs.execute("""CREATE TEMPORARY TABLE tmp
                    SELECT %s FROM %s GROUP BY %s
                    HAVING COUNT(*)=2""" % (vars, tblname, vars))

        def f(s): return string.join([s + var for var in variables], ', ')

        P = curs.execute("""SELECT %s, %s FROM %s, tmp WHERE %s""" %
(f(pert_table+'.'), f('tmp.'), pert_table, string.join([pert_table + '.' +
variables[i] + '=' 'tmp.' + variables[i] for i in range(len(variables))], '
AND ')))

        curs.execute('DROP TABLE tmp')

        curs.close()
        conn.close()

        return P/2

    except MySQLdb.Error, e:
        print "Error %d: %s" % (e.args[0], e.args[1])
        sys.exit(1)





More information about the Python-list mailing list