[CentralOH] Pickling data into a database

Brian Costlow brian.costlow at gmail.com
Wed Sep 3 21:37:26 CEST 2008

On Wed, Sep 3, 2008 at 2:32 PM, Mark Erbaugh <mark at microenh.com> wrote:

> I'm working on a Python database application where the type and number
> of fields that need to be saved vary from row to row.


> However, the other day I had a "brainstorm".  Much of this data does not
> need to be queried from the database.  I could streamline my database
> design and store this variable data in Pickle'd format in a text field.
> Is this a good approach?  What are the potential pitfalls?
> One pitfall that just came to mind is that this limits the ability of
> non-Python applications to process the data.  Maybe an alternative would
> be to convert the data to XML instead of Pickling it?

I think you hit one of the biggies there. However, if all the data is as
simple as the curve pairs, XML may be overkill. You could, for instance,
represent the pairs as a simple space-delimited string. On the other hand,
if the data is going to be around for a decade, XML gives you a way to add
some semantic info so that some theoretical future developer using the next
big language knows what the data means. You need to think through these
kinds of trade-offs.
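
To make the trade-off concrete, here is a minimal sketch of both encodings.
It assumes the curve data is a list of (x, y) float pairs. The function
names and the <curve>/<point> element names are made up for illustration
and are not from the original application.

```python
import xml.etree.ElementTree as ET


def pairs_to_text(pairs):
    """Flatten (x, y) pairs into one space-delimited string: 'x1 y1 x2 y2 ...'."""
    # repr() of a float round-trips exactly through float() in Python 3.
    return " ".join(repr(float(v)) for pair in pairs for v in pair)


def text_to_pairs(text):
    """Rebuild the list of (x, y) pairs from the space-delimited string."""
    values = [float(token) for token in text.split()]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values")
    return list(zip(values[0::2], values[1::2]))


def pairs_to_xml(pairs):
    """Encode the same pairs as XML, with attribute names carrying the meaning."""
    root = ET.Element("curve")  # hypothetical element name
    for x, y in pairs:
        ET.SubElement(root, "point", {"x": repr(float(x)), "y": repr(float(y))})
    return ET.tostring(root, encoding="unicode")


def xml_to_pairs(xml_text):
    """Decode the XML form back into a list of (x, y) pairs."""
    root = ET.fromstring(xml_text)
    return [(float(p.get("x")), float(p.get("y"))) for p in root.findall("point")]


pairs = [(0.0, 1.5), (10.0, 2.25)]
print(pairs_to_text(pairs))  # 0.0 1.5 10.0 2.25
print(pairs_to_xml(pairs))
```

The string form is compact and any language can split it, but nothing in
it says which number is which. The XML form is bulkier, and the names
travel with the data.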

Also, in my experience, sooner or later someone will want to search this
data, no matter how unlikely it seems now. I built an app that ingests XML
docs, parses out the stuff the customer is interested in, and inserts it
into a postgres db. After comparing the speed of retrieving the entire XML
file (these were multi-MB files) from db fields versus disk files, I wrote
the XML itself to disk and indexed the location in the db.
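
The "file on disk, location in the db" layout could look roughly like the
sketch below. It uses sqlite3 from the standard library so that it runs
on its own, whereas the app described above used postgres. The table
name, column names, and function names are all hypothetical.

```python
import sqlite3
from pathlib import Path


def init_db(conn):
    """Create the index table: searchable fields plus the file location."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        "doc_id TEXT PRIMARY KEY, title TEXT, path TEXT NOT NULL)"
    )


def store_document(conn, doc_dir, doc_id, title, xml_text):
    """Write the XML to disk and record where it lives in the database."""
    path = Path(doc_dir) / f"{doc_id}.xml"
    path.write_text(xml_text, encoding="utf-8")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO docs (doc_id, title, path) VALUES (?, ?, ?)",
            (doc_id, title, str(path)),
        )
    return path


def load_document(conn, doc_id):
    """Look up the stored path and read the full XML back from disk."""
    row = conn.execute(
        "SELECT path FROM docs WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    return Path(row[0]).read_text(encoding="utf-8")
```

Only the columns pulled out at ingest time (here just title) can be
queried with SQL. Anything left inside the file is invisible to the db.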

Now, of course, the end users want to search on things in the XML that are
not in db fields, things they told me earlier they'd never search. So I'm
trying to decide whether to move the XML back into the db and implement text
search, or whether we can do something via XQuery (which I know very little
about).

So, my caveats: don't do something Python-centric for data going into the db
that might get used by non-Python apps. Be careful about making a 'this
won't ever be searched' decision.