[Pycon2005-attendees] The Database Divide

john.m.camara at comcast.net john.m.camara at comcast.net
Fri Mar 25 06:20:44 CET 2005

I know this is a bit long for a post but I’m hoping someone will benefit from this.  I’m also interested in hearing your comments.

The questions and comments that were made in today’s database sessions indicate that there are many developers who have a one track mind vision when it comes to database types.  There are those who swear by relational and then there are those who swear by object oriented ones.

So who is right?  Well that all depends on the application.  Relational databases work great for some applications and object orient ones are great for others.  On the other hand using both types are sometimes the most appropriate approach.

Typically people who write business applications tend to choose relational and those who write engineering/scientific applications use object oriented.  Fortunately both groups in general are choosing the correct types of database for their application domains.  This is a good thing as when you choose the wrong type of database you will tend to have to write a significant amount of unnecessary code and will likely have performance issues.

The problem though, is some business applications could benefit from an object orient database.  Likewise, some engineering/science applications could benefit from a relational database.  It is very likely the type that is used will be the type that a particular domain prefers to use.  The decision should be made based on what type of querying is required to select and use that data once it is stored in the database.

I will go over two trivial applications, one from each domain and will point out some of the pros and cons to storing the data in both types of database.

The first application will be one that consists of drawings that are made with lines.  The class is shown next with a few instances of Line and a list to represent the drawing.
class Line(object):
    def __init__(self, x1, y1, x2, y2, red, green, blue):
        self.x1 = x1
        self.y1 = y1
        self.x2 = x2
        self.y2 = y2
        self.red = red
        self.green = green
        self.blue = blue
l1 = Line(0, 0, 1, 1, 255, 0, 0)
l2 = Line(1, 1, 2, 2, 0, 255, 0)
l3 = Line(2, 2, 3, 3, 0, 0, 255)
drawing = [l1, l2, l3]
Object Oriented approach:
Create a table to store drawings
Pickle the drawing list
Store pickle object with a key = drawingName
Get pickle object with key = drawingName
Unpickle object
Use objects
Pros: Only a couple lines of code needed for storage and retrieval is very fast
Cons: None
Relational approach:
Create a table to store drawings
Create a table to store lines
Insert a new row of data into the drawings table storing an ID for a Primary key and name under drawingName
For each Line in drawing add a row to the lines table and insert the appropriate data
Perform a query on the drawings and lines table to retrieve all line records associated with the drawing of interest.
Create an empty drawing list
Unpack the data for each row and create a Line Object
Append each Line object to drawing list
Use objects
Pros: None (assuming there is no need to do arbitrarily queries on the endpoints and color properties)
Cons: Far more code to write than Object Oriented approach and retrieval is slower.
For the second application we will save information about some artists and instead of going through all the details of both database approaches I will just talk about the pros and cons.
class Person(objects):
    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address
p1 = Person(...)
p3 = Person(...)
lineArtists = [p1, p2, p3]
Relational Approach
Pros:  Very easy to create flexible searches
Cons:  Slower retrieval
Object Orient Approach
Pros: Fast Retrieval
Cons: It will take quite a bit of code in order to have a flexible querying capability.

The structure of the classes/objects in each application is similar yet in one case it makes sense to use an object oriented database and in the other a relational database.  The approach that was best for the application was based on the types of queries that were required.  In the first case only a query by drawingName was required and in the second case it was necessary to be able to perform flexible queries over all the fields.

Now what would you do if you had an application that needed to store data on Artists who created line drawings and you also needed to save the line drawings.  Well you might want to think about using both database types to take advantages of their strengths.  Store the artist data in a relational database as well as the drawingNames for each drawing an Artist has created.  Then save the drawing data into an object database.  To retrieve drawing data for a particular drawing from a particular Artist, you would first query the relational database first to retrieve the drawingName.  Then retrieve the drawing from the object database using the drawingName as the key.  The best of both worlds.  The least amount of code and higher performance.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/pycon2005-attendees/attachments/20050325/e42d8c47/attachment-0001.html

More information about the Pycon2005-attendees mailing list