[DB-SIG] client side sub queries

Wed May 23 21:16:46 CEST 2007

Chris Clark wrote:
 > Carl Karsten wrote:
 >> Carsten Haese wrote:
 >>
 >>> On Tue, 22 May 2007 17:33:27 -0500, Carl Karsten wrote
 >>>
 >>>> Or some such abomination of results of one query as a parameter of a
 >>>> 2nd.
 >>>>
 >>>> given my use case, I can understand why this isn't in the spec, and
 >>>> why it may never be.  but it seems to come up more often that I
 >>>> would expect, so here we go.
 >>>>
 >>>> My current problem: reconcile transaction details that are off due
 >>>> to rounding errors. the 2 sets of details are stored on different
 >>>> servers, and no chance of getting one server to hit the 2nd, so the
 >>>> python client code is going to have to help by getting a list of
 >>>> keys from one and constructing "WHERE OtherKey IN ( 'key1', 'key2',
 >>>>  'key3', ...)"  which isn't 'hard' but I find annoying that I have
 >>>> to convert formats in the application layer.
 >>>>
 >>> Option 1: Create a temporary table on one server and load the data
 >>> from the
 >>> other server into it. Then, use a server side subquery or join the
 >>> tables
 >>> together.
 >>>
 >>> If you don't have write permission on either server, there is Option
 >>> 2: Fetch
 >>> all relevant data from server 1 into client memory, fetch all
 >>> relevant data
 >>> from server 2 into client memory, and do the reconciliation in client
 >>> memory.
 >>>
 >>
 >> The perms issue can be taken care of by having the table created ahead
 >> of time.
 >> but, the application level code is still 'custom' and falls into a
 >> similar pit as embedding parameters into the SQL command string.
 >>
 >
 >  From my ivory tower, this sounds more like a database problem than an
 > application problem.

Which is why I started with
 >>>> I can understand why this isn't in the spec, and
 >>>> why it may never be.

 > I agree completely that having to implement this
 > logic in the application is awkward. Is replication an option with the
 > database you are using? Either replicating from server 1 into server 2
 > (possibly vice-versa) or replicate from server 1 into server 3 and
 > replicate from server 2 into server 3 so you can perform all queries in
 > server 3.

"it depends" :)  sometimes yes, sometimes no.  in this case, no.  it is 5 or so 
developers on a team, many teams using the same servers.  many DBAs, have to get 
one of them involved with server side things.  but even when yes, it is still 
extra work that I could see being eliminated.

 >
 > With replication you tend to have options of either batch extract/load
 > (ETL) or real time replication (usually implemented by the database
 > itself, sometimes by a 3rd party tool). The downside of replication are
 > increased space requirements and possibly delays in replication if it is
 > not real time (for a reporting situation like this, collisions should
 > not be an issue).
 >

Given my case of debugging a problem, the more clutter involved in the 
investigation, the harder the investigation is.  kinda like the whole HeisenBug* 
thing, only you have to debug and understand the code used to debug and 
understand some other code.  so in this case, it would be totally wonderful if 
the code was clean, simple, robust, consistent, etc.  things taken for granted 
when you use dbapi features instead of code written quickly for this one task, 
or code written for some other task that is close to this one.

In the end, I would not recommend this for production code, but even that I see 
often enough.  I totally understand the ivory tower view.  I prefer to keep 
things in that realm too.  and I can be opposed to making it easy to write bad 
code.  but given the environments that create this problem are not going to go 
away, does it really do 'us' any good to 'ignore' them as opposed to providing a 
solution?

(*) A HeisenBug is a bug whose presence is affected by act of observing it.
http://www.c2.com/cgi/wiki?HeisenBug

Carl K