[DB-SIG] SQL storage for Python objects

Joel Shprentz shprentz@bdm.com
Wed, 02 Sep 1998 07:18:45 -0400

Since the 1997 Python conference, I have continued to use and
improve my system for storing Python objects in relational databases.
I will try to get a current version onto Starship.  For now, visit
<http://starship.skyport.net/crew/shprentz/> for my conference slides
and the 1997 code.

Some Background

I've been using Python to develop intranet applications that
connect to Sybase.  I followed the design strategies described
by Peter Coad in his book, "Object Models:  Strategies, Patterns
and Applications."  (See a description of the book on amazon.com at
or visit the Object International web site [when it gets fixed] at
<http://www.oi.com/> [hint: look at the free stuff].)  In an
appendix, Coad outlined his data access approach, which I implemented
in Python.

The Architecture

Each database table holds objects from a persistent object class.
The system uses field values from retreived database records to
instantiate Python objects.  New and updated objects are stored
in the database.  Numeric, text, date, and BLOB database field
types map well to Python object attributes.

More work is required to support attributes that contain other
Python objects.  For example, an employee object has a department
attribute that contains a department object.  All of the employees
in the same department would share the same department object.  We
want to ask the employee object, "What is your department?" and
receive a department object in response. We also want to ask the
department object, "Who are your employees?" and receive a list
of employee objects in response.

As new objects are saved to the database, they are assigned a unique
number (within their class).  These ID numbers support the persistent
storage of object linkages.  Continuing the example, the employee
table contains a departementID field.  When a employee object is saved
to the database, its database record holds only the ID number of the
employee's department, not the entire department object.  The department
object will have been previously stored in another table.

When the employee record is read from the database, the department ID
is saved in the employee object.  Later, if the department is requested,
it can be retrieved from the database by its unique ID number.  Caching
makes this operation more efficient.

A similar lazy execution is implemented for large text fields and BLOBS.
The system retrieves these from the database only when asked.


A collection of template files specify the object model.  Each template
file contains a list of object attributes and their database representation.
A "buildObject" program parses the template files and generates the Python
code to implement them.

The generated code includes these capabilities:
- Create, index, and drop database tables.
- Create, retrieve, update, and delete objects.
- Get and update object attributes.
- Defer retrieval of long text, BLOB, and object attributes until needed.
- Cache objects for improved performance.
- Support row limits, transactions, and multitasking if available.

Through class inheritance, a programmer can replace or enhance the
generated code. In practice, most of the code customizations fall
into a few categories:
- Reformat attributes for display (e.g., combine first and last name).
- Validate, restrict, or cascade updates to attributes.
- Perform summarization or calculation
- Retrieve objects based on complex criteria


At 12:05 AM 9/2/98 -0700, Greg Stein wrote:
>Joel Shprentz presented a paper at last year's conference on something
>that is probably what you're looking for. It's online at:
>Patrick Curtain wrote:
>> Has anyone developed a useful idiom or pattern for storing python
>> objects to relational databases?  Said another way, is there a
>> well-defined API that improves on simply embedding all the SQL for the
>> operations on an object?

Joel Shprentz              (202) 863-3487
BDM Federal, Inc.          shprentz@bdm.com
1501 BDM Way
McLean, VA 22102