Help cleaning up some code
odeits
odeits at gmail.com
Mon Mar 9 00:58:46 EDT 2009
On Mar 8, 12:31 pm, Dennis Lee Bieber <wlfr... at ix.netcom.com> wrote:
> On Sat, 7 Mar 2009 23:07:55 -0800 (PST), odeits <ode... at gmail.com>
> declaimed the following in gmane.comp.python.general:
>
> > For those of you who asked about the SQL the full function is here.
> > The connection is to a sqlite database with the row_factory set to
> > sqlite.Row
>
> > def get(self,user):
> > '''
> > Fetches an ad for USER, assigns the ad and returns a
> > dictionary that represents an ad
> > '''
>
> >         stack = []
> >         aduserquerystring = """SELECT ni, adid, rundateid, rundate,
> >                 city, state, status, priority, time
> >             FROM ads NATURAL JOIN rundates NATURAL JOIN newspapers
> >             WHERE (status IN (1, 3) AND user = ?)"""
> >         expiredadquerystring = """SELECT ni, adid, rundateid, rundate,
> >                 city, state, status, priority, time
> >             FROM ads NATURAL JOIN rundates NATURAL JOIN newspapers
> >             WHERE (status = 1 AND time < DATETIME('NOW', '-%d MINUTES'))
> >             LIMIT 0, ?""" % (MINUTES_TO_EXPIRE,)
> >         adquerystring = """SELECT ni, adid, rundateid, rundate,
> >                 city, state, status, priority, time
> >             FROM ads NATURAL JOIN rundates NATURAL JOIN newspapers
> >             WHERE (status IN (0, 2) AND priority IN
> >                 (SELECT priority FROM users NATURAL JOIN groups
> >                  WHERE user = ?))
> >             ORDER BY status DESC, priority ASC, ni ASC, time ASC, adid ASC
> >             LIMIT 0, ?"""
>
> Not knowing which fields come from which tables (nor which are
> commonly named per the definition of natural join) the first thing that
> strikes me here is that it looks like you could create a view/virtual
> table in the database to consolidate that:
>
> ... rundates natural join newspapers ...
>
> And do your tables really share that many columns? Since the
> definition of the natural join is basically an inner/equi-join over
> every column that has the same name in the two tables, you are forcing
> the SQL engine to basically loop over all the columns in first table,
> looking for matching column name in the second table, and computing a
> "t1.colname = t2.colname" for EACH matching column.
>
> If there are many such columns in common, I'd be looking to factor
> them out into a separate table with an integer primary key, and put the
> foreign keys into the original tables.
>
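A minimal sketch of the view idea, over a made-up, cut-down schema (any table or column definition beyond the names appearing in the queries above is invented):

```python
import sqlite3

# Sketch of the suggested view, using a hypothetical minimal schema.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE ads        (adid INTEGER PRIMARY KEY, ni INTEGER, status INTEGER);
    CREATE TABLE rundates   (ni INTEGER, rundateid INTEGER, rundate TEXT);
    CREATE TABLE newspapers (ni INTEGER, city TEXT, state TEXT);

    -- The view names the triple natural join once, so every query
    -- can select from it instead of repeating the joins.
    CREATE VIEW ad_details AS
        SELECT * FROM ads NATURAL JOIN rundates NATURAL JOIN newspapers;
""")
con.execute("INSERT INTO ads VALUES (1, 10, 1)")
con.execute("INSERT INTO rundates VALUES (10, 100, '2009-03-09')")
con.execute("INSERT INTO newspapers VALUES (10, 'Phoenix', 'AZ')")

rows = con.execute(
    "SELECT adid, city FROM ad_details WHERE status = 1").fetchall()
print(rows)   # [(1, 'Phoenix')]
```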
> >         rows = self.con.execute(aduserquerystring, (user,)).fetchall()
> >         if len(rows) == 0:
>
> I suspect a simple
>
> if not rows:
>
> can be used.
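A quick check of the idiom:

```python
# Empty sequences are falsy, so "if not rows:" is equivalent to
# "if len(rows) == 0:" for a list of fetched rows.
rows = []
print(not rows)        # True for an empty result set
rows = [('ad', 1)]
print(not rows)        # False once any row came back
```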
>
> >             rows = self.con.execute(expiredadquerystring,
> >                                     (STACK_SIZE,)).fetchall()
> >             if len(rows) == 0:
> >                 rows = self.con.execute(adquerystring,
> >                                         (user, STACK_SIZE)).fetchall()
>
> In the worst case you are using three separate hits on the database
> (and recomputing the same natural joins in many cases each time --
> definitely suggest creating a view for the basic joins to simplify the
> parsing -- if not a physical read-only table, though that would need to
> be refreshed at times).
>
> A one-time database hit:
>
> SQL = """select 1, ni, adid, rundateid, rundate, city,
> state, status, priority, time
> from ads natural join rundates
> natural join newspapers
> where status in (1, 3) and user = :USER
>
> UNION
> select 2, ni, adid, rundateid, rundate, city,
> state, status, priority, time
> from ads natural join rundates
> natural join newspapers
> where status = 1 and time < datetime('now', '-%d minutes')
> limit 0, :STACK
>
> UNION
> select 3, ni, adid, rundateid, rundate, city,
> state, status, priority, time
> from ads natural join rundates
> natural join newspapers
> where status in (0, 2) and priority in
> (select priority from users natural join groups
> where user = :USER)
> order by status desc, priority, ni, time, adid
> limit 0, :STACK""" % (MINUTES_TO_EXPIRE,)
>
> rows = self.con.execute(SQL,
> {"USER" : user, "STACK" : STACK_SIZE}).fetchall()
>
> I've added a column which identifies which of the three select
> groups the data came from.
>
> #filter out unwanted data
> #(if select 1 returned data, exclude select 2 and 3)
> if rows:
> grp = rows[0][0]
> rows = [r for r in rows if r[0] == grp]
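Illustrating that filter on made-up rows (the first element of each tuple stands in for the added group column):

```python
# Hypothetical result rows: the group tag is in column 0.
rows = [(2, 'ad-a'), (2, 'ad-b'), (3, 'ad-c')]
if rows:
    grp = rows[0][0]                        # group of the first row returned
    rows = [r for r in rows if r[0] == grp]
print(rows)   # only the first group survives: [(2, 'ad-a'), (2, 'ad-b')]
```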
>
> >         print user
> >         keys = ['ni', 'adid', 'rundateid', 'rundate',
> >                 'status', 'city', 'state']
> >
> >         for row in ifilter(lambda r: r['ni'] == rows[0]['ni'], rows):
> >             ad = dict()
> >             for key in keys:
> >                 if row[key] is None:
> >                     ad[key] = 'None'
> >                 else:
> >                     ad[key] = row[key]
> >             stack.append(ad)
> >             print row
>
> >         self.con.executemany("""UPDATE ads
> >             SET user = ?, status = CASE (status)
> >                 WHEN 1 THEN 1 WHEN 0 THEN 1 WHEN 2 THEN 3 END
> >             WHERE adid = ?""",
> >             [(user, ad['adid']) for ad in stack])
>
> For some reason that case statement just seems to smell to me... I
> suspect, since you have retrieved "status" earlier, I'd have used a
> dictionary look-up...
>
> update ads set
> user = ?,
> status = ?
> where adid = ?
>
> and
>
> [(user,
> {1 : 1, 0 : 1, 2 : 3}[ad["status"]],
> ad["adid"]) for ad in stack]
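A self-contained sketch of that dictionary lookup (the ad dicts and user value are invented):

```python
# Map old status -> new status, replacing the SQL CASE expression.
STATUS_MAP = {1: 1, 0: 1, 2: 3}

user = 'someuser'                                    # made-up values
stack = [{'adid': 7, 'status': 0}, {'adid': 8, 'status': 2}]

params = [(user, STATUS_MAP[ad['status']], ad['adid']) for ad in stack]
print(params)   # [('someuser', 1, 7), ('someuser', 3, 8)]
```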
>
> Question: I don't see anything taking the case when status is already 3;
> nor does anything (at this level) ever update the value to a 2...
> --
> Wulfraed Dennis Lee Bieber KD6MOG
> wlfr... at ix.netcom.com wulfr... at bestiaria.com
> HTTP://wlfraed.home.netcom.com/
> (Bestiaria Support Staff: web-a... at bestiaria.com)
> HTTP://www.bestiaria.com/
Amazingly enough, I tested your compound SELECT statement and found that
it is ~100 times slower than what I currently have. I am trying to
figure out why.
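One way to start digging is SQLite's EXPLAIN QUERY PLAN, sketched here against a made-up single table (the real schema would go in its place):

```python
import sqlite3

# Sketch: ask SQLite how it plans a query (hypothetical minimal table).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ads (adid INTEGER, status INTEGER, user TEXT)")

plan = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM ads WHERE status IN (1, 3) AND user = ?",
    ("someuser",)).fetchall()
for step in plan:
    print(step)   # each row describes one step of the chosen plan
```

Comparing the plan of the compound UNION against the plans of the three separate queries should show where the extra work goes.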