A cautionary tale

Peter Otten __peter__ at web.de
Wed Dec 4 11:08:23 CET 2013


Frank Millman wrote:

> Hi all
> 
> There is no question at the end of this, it is just an account of a couple
> of days in the life of a programmer (me). I just felt like sharing it.
> Feel free to ignore.
> 
> The business/accounting system I am writing involves a lot of reading data
> from a database, and if changed, writing it back again.
> 
> There are a number of data types involved - string, integer, decimal,
> boolean, date, datetime. I currently support PostgreSQL, MS SQL Server,
> and sqlite3. In all cases, they have a DB-API 2.0-compliant adaptor which
> handles the conversion to and from python objects transparently.
> 
> Over the last year or so I have added two new types. I added a JSON type,
> to handle 'lists' and 'dicts', and an XML type to handle more complex
> structures. In both cases they are stored in the database as strings, so I
> have to handle the conversions myself.
> 
> I don't allow direct access to the objects, as they can be affected by
> various business rules, so I use getters and setters. For the new types, I
> used the getter to convert from the string to the underlying object, and
> the setter to convert it back to a string.
> 
> Then a couple of days ago I decided that this was not the correct place to
> do it - it should be done when reading from and writing to the database.
> That way the data is always represented by the underlying object, which
> can be passed around without worrying about conversions.
> 
> It was a bit of effort, as I had to add extra getters and setters to
> handle the transfer between the database and the program, and then
> over-ride them in the case of the new data types to provide the required
> functionality. But after a few hours of hunting down all the places that
> required changes, testing, fixing errors, etc, it seemed to be working
> fine, so I thought I could carry on with the meat of my program.
> 
> Then I noticed that certain changes were not being written back to the
> database. After some investigation, I found the error in a part of my
> program that I have not had to look at for ages. When reading data in from
> the database, I preserve a copy of the original value. When saving, I
> compare that to the current value when deciding which columns need
> updating. I do this in the obvious way -
> 
>     on reading -
>         orig_value = value
> 
>     on saving -
>        if value != orig_value:
>             this one needs updating
> 
> Have you spotted the deliberate mistake yet? In the case of a JSON list,
> orig_value and value point to the same, mutable, list. So when I compare
> value with orig_value, they are always the same, whether changes have been
> made or not!
> 
> The obvious answer is to store a copy of the list. It was not so obvious
> where to make the change, as there were other implications. Eventually I
> decided to over-ride the 'getter' for the JSON type, and return
> copy(value) instead of value. That way if it is changed and then put back
> using the 'setter', the two objects are no longer equal. I have made that
> change, done some more testing, and for now it seems ok.
> 
> So have the last couple of days been a waste of time? I don't think so. Is
> the program a bit cleaner and conceptually sounder? I hope so.
> 
> Why am I telling you all this? No particular reason, just thought some of
> you might find it interesting.

You might like to know how others cope with basically the same problem:

http://docs.python.org/dev/library/shelve.html#example
http://www.zodb.org/en/latest/documentation/tutorial.html#persistent-objects





More information about the Python-list mailing list