[PYTHON DB-SIG] Objects and Relational Databases - Part 2

Joel Shprentz shprentz@bdm.com
Wed, 10 Jan 1996 14:23:32 -0500


This and the previous message are forwarded from the Coad Letter mailing
list, which is sent monthly from Object International.  In these two
messages, Peter Coad discusses how problem domain objects should interact
with data management objects.  The discussion extends the data management
appendix to Coad's object modelling book, which was published last year.

--
Joel Shprentz              (202) 863-3121
BDM Federal, Inc.          shprentz@bdm.com
1501 BDM Way
McLean, VA 22102
------------------------- Forwarded Message Follows --------------------------
Return-Path: <owner-coad-letter@oi.com>
From: <owner-coad-letter@oi.com>
Date: Tue, 9 Jan 96 14:07:06 CST
To: coad-letter@oi.com
Subject: 23b--Data Management--"Objects and RDB" Q&A (Part 2)
Sender: owner-coad-letter@oi.com
Reply-To: owner-coad-letter@oi.com

The Coad Letter
Issue 23b
Category: Data Management
Title: "Objects and RDB" Q&A (Part 2)
Date: Tuesday, January 9, 1996


Continuing from Part I --

        ----------------------------------------------------
                        "Objects and RDB:
        App Architecture and Other Practical Considerations"
        ----------------------------------------------------

                    Peter Coad and David North
              (with added insights from Ken Ritchie)
                           January 1996


Q. Do you simply add attributes and services that came from a DM
   point-of-view to PD objects?

   For example:
   - Added id's and services that are only used internally
   - Added correlation objects
   - Taking a problem-domain attribute and breaking it up into fields
   - Taking a problem-domain attribute and adding a new class.

A. [Peter] The model changes, it's true. And PD objects are affected
   by this. This is why view management is so very important.

   Responses for each example:
   - Added id's and services that are only used internally
     -- Hide these with view management.
   - Added correlation objects
     -- These are a problem. What I've done is let domain experts
        know I need them for relating objects in one class to another.
   - Taking a problem-domain attribute and breaking it up into fields
     -- Here, you are simply choosing to show more detail about
        the attributes in an object model. And that is just fine
        (not a big deal).
   - Taking a problem-domain attribute and adding a new class
     -- An improvement to the object model; might even be added
        insight that comes from building part of the app; and
        that's just fine.

   [David] Agreed. There is no need to end up with two models (both an
   object model and a data model); use an object model all the way.


Q. About the DMserver object. What are its services? It seems like
   they are:
   - Find a specific DM object.
   - Maintain a changes log.
   - Save all.

   Does a significant difference arise if one chooses to
   make each DM object global, rather than have a DM
   server watch over them?

   An HI object could tell each DM object to save itself,
   couldn't it?

A. [David] The DMserver keeps the change log in the automated version
   of DM.

   Not much change if all of the DMobjects are global. Just more
   global variables; I prefer to have fewer.

   If you are keeping a change log of all of the objects that are
   changed, then the HI might send the saveAll to the DMserver to save
   all of the changed objects.


then
Q. Would a DMserver object have some other services?

A. [David] Yes: connect/disconnect to database(s).


Q. The "change log" attribute in the DM Server object.
   What is it?

A. [Peter] It is a collection of objects that have changed.

   [David] Includes deletions -- plus additions and deletions to
   collections, too.


then
Q. If it is a collection of objects, how does the DMserver
   work with them? Does it tell each changed PD object to save
   itself? Or?

A. [David] Yes, the DMserver asks each to save -- and then each
   PD object asks its DM object to "save me."


then
Q. Should the change log be held by the DMserver -- or by each
   DM object (each responsible for the objects within a specific
   PD class)?

A. [David] It could be held by each DM object, but then a saveAll
   would have to ask each DMobject if it had any changes -- instead
   of the DMserver just knowing which ones need to be saved. Order
   of the changes can be important (integrity constraints in the DB)
   and this can be preserved in a single change log.
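
   Illustration (not from the original letter): a minimal Python sketch
   of the arrangement described in the answers above, assuming a single
   ordered change log held by the DMserver. The names DMServer,
   PersistentPD, RecordingDM, note_change and save_all are hypothetical,
   and the stand-in DM object records saves instead of issuing SQL.
   Deletions and collection changes, which David also mentions, are
   left out.

```python
class PersistentPD:
    """Generalization class for any PD class that needs persistence."""

    dm = None  # each PD class is given its own DM object

    def save(self):
        # The PD object asks its DM object to "save me."
        self.dm.save(self)


class RecordingDM:
    """Stand-in DM object: records each save instead of issuing SQL."""

    def __init__(self, journal):
        self.journal = journal

    def save(self, pd_object):
        self.journal.append(pd_object.name)


class DMServer:
    def __init__(self):
        self.connected = False
        self.change_log = []  # changed objects, in order of first change

    def connect(self):
        self.connected = True  # a real server would open the database here

    def disconnect(self):
        self.connected = False

    def note_change(self, pd_object):
        # Log each object once, keeping the order in which changes arrived.
        if not any(entry is pd_object for entry in self.change_log):
            self.change_log.append(pd_object)

    def save_all(self):
        # Ask each changed PD object to save, in change-log order.
        for pd_object in self.change_log:
            pd_object.save()
        self.change_log.clear()


class Department(PersistentPD):
    def __init__(self, name):
        self.name = name


class Employee(PersistentPD):
    def __init__(self, name):
        self.name = name


journal = []
Department.dm = RecordingDM(journal)
Employee.dm = RecordingDM(journal)

server = DMServer()
server.connect()
sales, ann = Department("Sales"), Employee("Ann")
server.note_change(sales)   # changed first, so it is saved first
server.note_change(ann)
server.note_change(sales)   # a second change to the same object
server.save_all()
server.disconnect()
```

   Because the log lives in one place, save_all walks it once and the
   department is saved before the employee that may depend on it.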


Q. Could you show us an example of a real DM object, how it is really
   done (Smalltalk would be fine)? Really, we are looking for a
   pseudo-code example of how a real DM object does its job --
   including database interaction.

A. [David] The ones that we have are like the automated version
   I discussed. Because they provide generic support for any object,
   they are very complicated. If I can work it in, I will build a
   simple one and send it to you, for you to look at.


Q. How many objects does one bring into memory at one time?
   A challenging question. Yet is there some way of deciding
   how many should be brought in at one time?

A. [Peter] That is an age-old question, when working with RDB.
   One heuristic: use problem-domain groupings.

   [David] Agreed. Usually it is one at a time or a group to support a 
   collection.


then
Q. If an RDB cursor can be used in this way, can you provide
   an example?

A. [David]  Use the cursor when you want to bring in a
   collection of objects (loadAll, loadEmpForDept)
      1) special service in the DM for this
      2) issue the SQL request
      3) process each row returned by the cursor and convert it
         to an object -- and add that object to the collection
         to be returned.
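
   Illustration (not from the original letter): the three steps above
   might look like this in Python. This is a sketch that assumes an
   in-memory SQLite database through the standard sqlite3 module; the
   employee table, its columns, and the class names Employee and
   EmployeeDM are made up for the example.

```python
import sqlite3


class Employee:
    def __init__(self, emp_id, name, dept_id):
        self.emp_id = emp_id
        self.name = name
        self.dept_id = dept_id


class EmployeeDM:
    def __init__(self, connection):
        self.connection = connection

    def load_emp_for_dept(self, dept_id):
        # 1) a special service in the DM object for this request
        cursor = self.connection.cursor()
        # 2) issue the SQL request
        cursor.execute(
            "SELECT emp_id, name, dept_id FROM employee "
            "WHERE dept_id = ? ORDER BY emp_id",
            (dept_id,),
        )
        # 3) convert each row returned by the cursor into an object
        #    and add it to the collection to be returned
        employees = []
        for row in cursor:
            employees.append(Employee(*row))
        return employees


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE employee "
    "(emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
)
connection.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)],
)

staff = EmployeeDM(connection).load_emp_for_dept(10)
names = [employee.name for employee in staff]
```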


Q. PersistentPD class?
      Is there a PD class, too?
      Are all PD classes persistent? (If not, please give examples.)

A. [Peter] A PD class? Usually, no.

   Classes with no attributes (e.g., classes added to encapsulate
   behavior across a collection, with just one object in a class)
   do not need to be persistent; the solo objects are simply created
   and initialized at startup each time.

   [David]  The PersistentPD class is the generalization class
   for any PD class that needs persistence.

   [David]  Most of the time in business applications, all of the
   PD classes are persistent. There are cases (as Peter mentioned)
   in which PD classes are not persistent. I'd add these
   special cases to the list: special window or report structures
   (often just containers of persistent objects).


Q. Triggers?
   If the database uses triggers to support its consistency,
   how does that relate to content in the PD component?

   For example, if super-org is deleted, then its sub-orgs are
   deleted. Who does this? And how is it shown in an object model?

A. [David] If you want the database to automatically do the work
   by triggers, then the only place to describe that work is in
   the service description or comments in the object model. The
   example you mentioned, the delete for super-org, does all
   the work of deleting all of the sub-orgs; only in the delete
   service description would that work be described. You need
   to be a bit careful about this, because what if you had sub-org
   objects in memory, yet they end up getting deleted in the
   database without getting removed from memory?

   Again, keeping the object model in your development means that you
   may not always make use of all of the capabilities of the RDB.
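
   Illustration (not from the original letter): a sketch of the hazard
   David describes, assuming SQLite and a hypothetical org table. The
   trigger deletes sub-orgs inside the database (one level only, to keep
   the example short), so the hypothetical OrgDM has to re-check which
   of its in-memory entries still exist after a delete.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE org (
        org_id    INTEGER PRIMARY KEY,
        name      TEXT,
        parent_id INTEGER
    );
    -- The database, not the application, removes the sub-orgs.
    CREATE TRIGGER delete_sub_orgs AFTER DELETE ON org
    BEGIN
        DELETE FROM org WHERE parent_id = OLD.org_id;
    END;
    INSERT INTO org VALUES (1, 'HQ', NULL);
    INSERT INTO org VALUES (2, 'East', 1);
    INSERT INTO org VALUES (3, 'West', 1);
    INSERT INTO org VALUES (4, 'Lab', NULL);
""")


class OrgDM:
    def __init__(self, connection):
        self.connection = connection
        self.in_memory = {}  # org_id -> name; stands in for loaded PD objects

    def load(self, org_id):
        if org_id not in self.in_memory:
            row = self.connection.execute(
                "SELECT name FROM org WHERE org_id = ?", (org_id,)
            ).fetchone()
            self.in_memory[org_id] = row[0]
        return self.in_memory[org_id]

    def delete(self, org_id):
        self.connection.execute("DELETE FROM org WHERE org_id = ?", (org_id,))
        # The trigger may have removed rows we never asked about, so
        # evict every in-memory entry whose row no longer exists.
        surviving = {
            row[0] for row in self.connection.execute("SELECT org_id FROM org")
        }
        for cached_id in list(self.in_memory):
            if cached_id not in surviving:
                del self.in_memory[cached_id]


dm = OrgDM(connection)
for org_id in (1, 2, 4):
    dm.load(org_id)
dm.delete(1)  # the trigger also deletes orgs 2 and 3

remaining_rows = [
    row[0] for row in connection.execute("SELECT org_id FROM org ORDER BY org_id")
]
```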


Q. If you are building a client-server system, where do you allocate
   the objects? Some possibilities:
   - some PD on each
   - HI to client; DM to server
   - DM on client, sending SQL to server

A. [Peter] HI-PD-DM on the client and the RDB on the server.
   SQL is passed over the LAN.

   [David] Agreed.


Q. Can you recommend a product that supports network messaging --
   using objects (e.g., an object request broker of some sort)?
   Any experience with this?

A. [Peter] Mark Mayfield is working on a banking app that
   is doing development in this area.

   [David]  We are looking at HP-Distributed Objects using
   VisualWorks, but we don't have much experience with it yet.
   Some of the ODB vendors also have support for this.


Q. HI-PD-DM is much like the three-tiered model.
   Yet is this the same three-tiered system architecture model?

A. [Peter] Good point. HI-PD-DM is a software architecture.

   The three-tiered model is most often used in the context of
   system architecture.

   These two are related (both a triad; both a separation of
   concerns), yet the mapping is not necessarily 1:1.

   [David] Conceptually they both come from the desire to separate
   the responsibilities. Usually today the three-tiered model is
   talking about the number of machines in the system architecture
   and the responsibilities of the machines.


Q. RDBs each have their strengths. The available services vary,
   product to product.

   How many of those services should one try to use?
   Use it to get the most out of the database?
   Or limit use to what fits well with the PD objects?
   What should one think of in making such decisions?

A. [David] The answers to all of these questions are based on
   application-specific tradeoffs. No strategies on this, just yet.


Q. We have heard that the SQL-3 standard has some ODB capabilities
   defined in it. Is that really so? What might be the impact on
   mapping an object model to RDB?

A. [Peter and David]  Don't know.


Q. Why are object databases not very popular for business apps?

A. [Peter] RDBs are at the heart of building business systems.
   The common technology for business apps is the use of an RDB.
   More common than any one language. More common than any one
   operating system.

   Moreover, the content of an RDB represents very significant
   value to an organization. Few organizations could survive
   any major disruption of that content.

   Yet I see increased interest in this area; time will tell.

   [David]  They are new and don't have a track record. The
   companies that offer them are (relatively) small. This all
   adds up to perceived risk. However, there are many businesses
   that are starting to use them and I expect that to increase.

   [Ken] Yes, this is a case of history repeating itself. Many
   business systems support high transaction rates (lots of
   updates) and/or large volumes of data (lots of rows in
   tables). A dozen years ago this very same concern generated
   resistance to relational systems. The new database technology
   will need time to mature and become more robust and more
   efficient (and supported with all necessary maintenance
   utilities) before it will gain significant market share.

Q. What ODBs would you recommend for business applications?

A. [Peter and David] ObjectStore, Versant, Gemstone.


Q. DMserver takes care of application connect and disconnect with
   a database.

   Who handles the open and close of each table in the database?
   Each DM object? How often -- each time a DM object needs to
   interact with the database? Or?

A. [David] In SQL you don't open or close tables. If you did, a
   DM object would handle it, though.


Q. Many-to-many object connections:
   Is adding a correlation class the only way to implement this?
   Is there another way?

A. [Peter] A transaction class (or a transaction class with no
   attributes, called a correlation class) is needed.

   [David] You have to have the table in the RDB to contain the
   information about the connection. The easiest way to use this
   table is to add a class to the object model. The only other
   choice is to make your dmObject smart enough to know about the
   table.


Q. Are there any other responsibilities for the DM object, to map
   PD objects to database tables? Can they or should they provide
   other services?

A. [David] Yes, the DM objects have the responsibility to do the
   mapping and you may implement several services to accomplish
   this work. In addition to the mapping, the DM objects do the
   actual communication with the database using SQL for load,
   save, delete. DM objects may have special services to support
   specific database requests. DM objects also make sure that
   there is only one of a given object in memory.
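
   Illustration (not from the original letter): one way a DM object
   could combine the mapping, the SQL for load/save/delete, and the
   "only one of a given object in memory" rule. This is a sketch
   assuming SQLite; the person table and the names Person and PersonDM
   are hypothetical, and INSERT OR REPLACE is SQLite-specific syntax.

```python
import sqlite3


class Person:
    def __init__(self, person_id, name):
        self.person_id = person_id
        self.name = name


class PersonDM:
    def __init__(self, connection):
        self.connection = connection
        self.loaded = {}  # person_id -> the one in-memory Person for that row

    def load(self, person_id):
        # Hand back the object already in memory, if there is one.
        if person_id in self.loaded:
            return self.loaded[person_id]
        row = self.connection.execute(
            "SELECT person_id, name FROM person WHERE person_id = ?",
            (person_id,),
        ).fetchone()
        if row is None:
            return None
        person = Person(*row)  # map the row to a PD object
        self.loaded[person_id] = person
        return person

    def save(self, person):
        # Map the PD object back to a row.
        self.connection.execute(
            "INSERT OR REPLACE INTO person (person_id, name) VALUES (?, ?)",
            (person.person_id, person.name),
        )
        self.loaded[person.person_id] = person

    def delete(self, person):
        self.connection.execute(
            "DELETE FROM person WHERE person_id = ?", (person.person_id,)
        )
        self.loaded.pop(person.person_id, None)


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT)"
)
dm = PersonDM(connection)
dm.save(Person(1, "Ann"))

first = dm.load(1)
second = dm.load(1)   # same object, not a second copy
dm.delete(first)
after_delete = dm.load(1)
```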

   [Ken] I prefer to resolve every many-to-many relationship in
   the model by injecting a correlation class into the
   relationship. Often there will be some additional responsibility
   for the correlation class, and the explicit use of such a class
   helps in the discovery. These correlation objects must also be
   persistent if the others are.

   Consider an example. A person may enroll in several workshops.
   Each workshop may have a number of persons enrolled. Thus, 
   we recognize a many-to-many relationship between "person" and
   "workshop."  We might choose the name "enrollment" for a
   correlation class, placed between the "person" and the
   "workshop" classes. The constraints on the enrollment class 
   indicate exactly one person and one workshop.

   Now imagine this possibility: We could be asked, later on, to
   also keep track of the enrollment date for each specific
   pairing of person and workshop. If we have already identified a
   correlation class, we know exactly where to place the
   responsibility for the enrollment date. The correlation becomes
   a transaction class for each specific enrollment event
   remembered.
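
   Illustration (not from the original letter): Ken's person/workshop
   example as plain Python classes. The attribute names, the enroll
   helper, and the sample date are assumptions made for the sketch.
   Each Enrollment refers to exactly one person and one workshop, and
   the enrollment date has an obvious home once the class exists.

```python
from datetime import date


class Person:
    def __init__(self, name):
        self.name = name
        self.enrollments = []


class Workshop:
    def __init__(self, title):
        self.title = title
        self.enrollments = []


class Enrollment:
    """Correlation class: exactly one person and one workshop."""

    def __init__(self, person, workshop, enrolled_on=None):
        self.person = person
        self.workshop = workshop
        # Added later; this turns the correlation into a transaction class.
        self.enrolled_on = enrolled_on


def enroll(person, workshop, enrolled_on=None):
    enrollment = Enrollment(person, workshop, enrolled_on)
    person.enrollments.append(enrollment)
    workshop.enrollments.append(enrollment)
    return enrollment


ann, bob = Person("Ann"), Person("Bob")
ooa, ood = Workshop("OOA"), Workshop("OOD")

enroll(ann, ooa, date(1996, 1, 9))
enroll(ann, ood)
enroll(bob, ooa)

ann_workshops = [e.workshop.title for e in ann.enrollments]
ooa_people = [e.person.name for e in ooa.enrollments]
```

   In the RDB, the same class would map to a table holding one row per
   person/workshop pairing.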


Q. If you have several databases, and you have to access them
   within a single application, would the DM objects take care
   of all of the concerns of dealing with multiple databases?

A. [Peter] Yes. Very helpful to use a class library that
   supports all of them consistently (e.g., Rogue Wave's
   dbTools).

   [David] Yes, this should be encapsulated in the DM objects.
   You will probably have to do special work to have connections
   to multiple databases at a time, especially if they are from
   different vendors. Peter is right; library support for this is
   valuable.


Q. If you are building a distributed system, is there a good way
   of distributing objects among the processors? Guidelines?

A. [Peter] Low coupling, high cohesion -- so distributing
   in light of groups of interacting objects is helpful.
   You make best engineering judgments; implement; and
   then adjust again and again.


Q. What is the status of distributed business systems in the US?
   In the conferences, we hear a lot about distributed object
   systems. Yet are they really used in business apps at this point?

A. [Peter] Mostly just client-server, with objects on the client
   and RDB on the server. Oracle has placed venture funding
   in a firm that is developing an object database for the client
   side, specifically supporting this concept (the startup is
   called Omniscience). Another company we're watching is Persistence
   (promising yet very pricey).

   [David] I have worked with client-server apps (multiple
   machines, multiple databases). Yet distributed is certainly
   more than client-server. Work is being done in this area by
   some of the larger companies. Many conference speakers talk
   about such matters, yet it takes a while for most businesses
   to sort things out and apply new technologies.


Q. Who checks constraints, upon exiting a field?
   A PD object? Seems reasonable.

   Yet another approach is to do type checking and format
   checking (at least, simple checking of values) within
   the human interface. In fact, certain class libraries
   support such data entry checking.

   So, it seems like HI objects could do basic field checking
   and then let PD objects do more detailed checking after that.

   Agreed?

A. [Peter and David] Let the HI object do that which is easy:
   type, size, range, and mandatory/required (for example, 
   a person's last name is: string, 20, alpha, mandatory).

   Let PD objects check for inter-attribute consistency and
   algorithmic checks (checking to make sure that the next
   assignment does not put an employee over an agreed-upon
   maximum number of active assignments).
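
   Illustration (not from the original letter): a sketch of the split
   described above. The function name check_last_name_field, the limit
   of three active assignments, and the message texts are hypothetical.
   The HI-level check looks only at the field itself; the PD object
   owns the rule that depends on its other state.

```python
def check_last_name_field(text):
    """HI-level checks for a last name: mandatory, size 20, alpha."""
    problems = []
    if not text:
        problems.append("last name is required")
    elif len(text) > 20:
        problems.append("last name is limited to 20 characters")
    elif not text.isalpha():
        problems.append("last name must be alphabetic")
    return problems


class Employee:
    MAX_ACTIVE_ASSIGNMENTS = 3  # hypothetical agreed-upon maximum

    def __init__(self, last_name):
        self.last_name = last_name
        self.active_assignments = []

    def add_assignment(self, assignment):
        # PD-level check: would this assignment exceed the maximum?
        if len(self.active_assignments) >= self.MAX_ACTIVE_ASSIGNMENTS:
            raise ValueError(
                "%s already has %d active assignments"
                % (self.last_name, self.MAX_ACTIVE_ASSIGNMENTS)
            )
        self.active_assignments.append(assignment)


employee = Employee("Smith")
for task in ("audit", "review", "training"):
    employee.add_assignment(task)

try:
    employee.add_assignment("travel")
    rejected = False
except ValueError:
    rejected = True
```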

   [Ken] When a conflict arises, an HI object will have the 
   responsibility to notify the user about the specific
   problem. You may be able to guide the user toward
   correcting it by taking some subsequent action. In some
   RDB's, you can assert referential integrities (data
   dependencies between objects). If the RDB raises an
   exception, the DM object may have to detect that by
   checking the results of each update query. When a
   transaction fails, the DM object would most likely notify
   its corresponding PD object. Then, the PD object can alert
   the HI object which had initiated the sequence.  When RDB
   integrities aren't available, put the logic into the PD
   object, where it will be most apparent (and least
   complicated).
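
   Illustration (not from the original letter): a sketch of the
   notification chain Ken describes, assuming SQLite with foreign-key
   enforcement switched on. The tables and the names SaveFailed,
   AssignmentDM, Assignment and AssignmentWindow are hypothetical. The
   DM object detects the RDB's exception, the PD object lets the
   failure travel back to its caller, and the HI object tells the user.

```python
import sqlite3


class SaveFailed(Exception):
    """Raised by a DM object when the database rejects an update."""


class AssignmentDM:
    def __init__(self, connection):
        self.connection = connection

    def save(self, assignment):
        try:
            with self.connection:  # commit on success, roll back on error
                self.connection.execute(
                    "INSERT INTO assignment (emp_id, task) VALUES (?, ?)",
                    (assignment.emp_id, assignment.task),
                )
        except sqlite3.IntegrityError as error:
            raise SaveFailed(str(error)) from error


class Assignment:
    dm = None  # set once the database connection exists

    def __init__(self, emp_id, task):
        self.emp_id = emp_id
        self.task = task

    def save(self):
        # A SaveFailed raised by the DM object propagates to the caller.
        self.dm.save(self)


class AssignmentWindow:
    def __init__(self):
        self.messages = []  # stands in for what the user would see

    def on_save_clicked(self, assignment):
        try:
            assignment.save()
            self.messages.append("Saved.")
        except SaveFailed as failure:
            self.messages.append("Could not save: %s" % failure)


connection = sqlite3.connect(":memory:")
connection.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
connection.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY);
    CREATE TABLE assignment (
        emp_id INTEGER NOT NULL REFERENCES employee (emp_id),
        task   TEXT
    );
    INSERT INTO employee VALUES (1);
""")
Assignment.dm = AssignmentDM(connection)

window = AssignmentWindow()
window.on_save_clicked(Assignment(1, "audit"))    # employee 1 exists
window.on_save_clicked(Assignment(99, "review"))  # no such employee

saved_rows = connection.execute("SELECT COUNT(*) FROM assignment").fetchone()[0]
```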

=====
_________________________________________________________________________
Peter Coad
  Object Intl, Inc. Education-Tools-Consulting. Founded in 1986
    "Helping teams deliver frequent, tangible, working results"
     8140 N MoPac 4-200, Austin TX 78759 USA
     1-512-795-0202 fax 1-512-795-0332
     direct line 1-919-851-5422 fax 1-919-851-5579
  FREE newsletter. Write majordomo@oi.com, message: subscribe coad-letter
      coad@oi.com   info@oi.com   http://www.oi.com/oi_home.html
  pgp: 3DBA 3BDD 57B6 04EB B730 9D06 A1E1 0550 (public key via finger)

