Mapping objects to tables, was Re: [DB-SIG] Python and static SQL

Tue, 27 Feb 2001 15:28:04 +0100

This is a multi-part message in MIME format.
--------------23F59D99BB240DE0D746FF14
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Ok...this mail tilted rather heavily in focus as I
wrote it. Sorry abut that...

> How do stored procedures stack up against static SQL?

As I understand it, static SQL is very similar to stored
procedures in DB2. In both cases, the SQL commands are stored
in the database, and QEP are precalculated etc. I think the
interface is the same as well. Of course it depends on your
your code.

With stored procedures, you can have loops and conditionals
etc in one database call. With embedded SQL the actual SQL
commands are in the database, but the loops etc are in your
C++ app, meaning that you will have to communicate with the
database, and transfer data, on each turn in the loop.

I think other databases might be rather different from DB2 in
how they treat dynamic and static SQL. It seems that Oracle is.

So...it depends.

Imagine that you have a piece of software that has to look at
many lines in the database and eventually return just a little
data. E.g. if you need to get the standard deviation
for a Poisson distributed set of data--I bet all databases
fail on all but normal distribution. In this case I can imagine
that a stored procedure beats all alternatives. Sure, your table
scan will be as heavy, but you won't have to transport more than
a float and some meta-data out of the database. Pulling millions
of values to the C++/C/Python app might be a heavy task. Especially
if your client program is across the network.

If it's a simple CRUD operation, I imagine the difference is
slim, and it seems from what Anthony says, that at least for
Oracle, the difference is slim also compared to Dynamic SQL.

Oracle might have a point here, when they suggest using their
tools with PL/SQL that allows you to move the same piece of
code from a client into the server as a stored procedure. Then
you can profile your code as needed... Not that I think it
outweighs the disadvantage of being tied in by their proprietary
tools...

> I am in the process of re-implementing my sorry excuse for a Sybase
> module and have been reading that the only penalty that dynamic SQL
> had over stored procedures is the initial compile stage.

Maybe we could find the time to make some actual tests if we split
the burden between us? But is there really an option for a Python
database adapter?

> How much faster than
> 
>         insert blah values (?,?,?,?,?)
> 
> is it going to be when the SQL is compiled once, then you just stream
> values for the placeholders?
> 
> Are you able to tell us the measurement results?

Not now. I'm only thinking aloud right now. Not really coding this.

I doubt that inserts will be different. Maybe joined selects will
be more noticable? If  I have two linked objects, I'd like to select
the objects linked to one object.

Basically, in my thinking, there are five operations that make
up almost all communication with the database:

INSERT INTO CLASS ...

SELECT attributes FROM CLASS WHERE OID = ?

UPDATE CLASS ... WHERE OID = ?

DELETE FROM CLASS WHERE OID = ?

SELECT attributes FROM CLASS WHERE FOREIGN_OID = 'other_class.oid'

In general (I'm shifting subject a bit now...) I've been thinking
of how to map objects to relational databases in a simple and
consistent manner. I'm 100% sold to the concept of artificial keys,
but I'm wondering how to handle associations between classes. Using
a foreign key in a table is obviously only good for one-to-many
or one-to-one assiciations.

With many-to-many associations you need a separate link table.
I was thinking that it might be good to consistently use CLASS
tables for objects, and ASSOCIATION tables for links. Thus,
something like this:

An object model of:

|---------|            |--------|
|Dept     |            |Emp     |
|---------|1_______1..*|--------|
|Name     |workplace   |Name    |
|Location |            |EmpDate |
|         |1__________1|Salary  |
|---------|     manager|--------|

Would lead to the following tables with foreign keys:

DEPT
----
oid (pk)
Name
Location
Manager (fk = emp.oid)

and 

EMP
---
oid (pk)
name
salary
empDate
workplace (fk = dept.oid)

With association tables it would instead be:

oDEPT
----
oid (pk)
Name
Location

and 

oEMP
---
oid (pk)
name
salary
empDate

and

lDEPT_MANAGER
------------
dept (unique, fk = odept.oid)
emp (unique, fk = oemp.oid)

and finally

lEMP_WORKPLACE
-------------
emp (pk, fk = oemp.oid)
dept (fk = odept.oid)

This solution leads to more tables and more joins, and it's
not really needed unless you have many-to-many relationships.
It also leads to more SQL code than above.

On the other hand, it has advantages:
- It works for all types of asscociations. You only have
  to change the unique indices on the link-tables to change
  multiplicity
- It makes it easier to change the model, for instance to
  move the association 
- I think it makes it easier to reuse tables and code.
- It maps better gainst the object model--differentiating
  between objects/classes and links/associations.

Obviously, another issue to consider is how to map classes to
tables when you take inheritage into account...

There might be some relevant comments and links at 
http://www.thinkware.se:80/cgi-bin/thinki.cgi/DatabaseDesign

Comments?
--------------23F59D99BB240DE0D746FF14
Content-Type: text/x-vcard; charset=us-ascii;
 name="magnus.vcf"
Content-Description: Card for Magnus Lycka
Content-Disposition: attachment;
 filename="magnus.vcf"
Content-Transfer-Encoding: quoted-printable
X-MIME-Autoconverted: from 8bit to quoted-printable by texas.it-center.se id f1REZ1U19860

begin:vcard=20
n:Lyck=E5;Magnus
tel;cell:+46(0)705 82 80 65
tel;fax:+46(0)706 12 80 65
x-mozilla-html:FALSE
url:http://thinkware.se/
org:Thinkware AB
adr:;;Alvans vag 99, 1tr;Umea;;907 50;Sweden
version:2.1
email;internet:magnus@thinkware.se
title:SW Developer / Consultant
fn:Magnus Lyck=E5
end:vcard

--------------23F59D99BB240DE0D746FF14--