[Tutor] data storage question

Steven D'Aprano steve at pearwood.info
Mon Aug 1 22:14:04 EDT 2016


On Mon, Aug 01, 2016 at 04:47:32PM -0400, Colby Christensen wrote:

> I'm a novice programmer. I have a decent understanding of algorithms 
> but I don't have a lot of computer science/software engineering 
> experience. As a way to help me learn, I've begun a coordinate 
> geometry program similar to the COGO program developed years ago at 
> MIT. Currently, I store the points in a dictionary in the format 
> point_number : [North, East]. I eventually will add a Z component to 
> the points and possibly a description after I get enough of the 
> horizontal geometry worked through. I would like to be able to save 
> the point table so that I can have multiple projects. 

For external storage, you have lots of options. Two common and 
suitable standards are JSON and Plist.

Suppose you have a point table:

pt_table = {25: [30.1, 42.5, 2.8, 'The Shire'],
            37: [17.2, 67.2, 11.6, 'Mt Doom'],
            84: [124.0, 93.8, 65.2, 'Rivendell'],
            }

I can save that table out to a JSON file, then read it back in, like 
this:

py> import json
py> with open('/tmp/data.json', 'w') as f:  # save the table to disk
...     json.dump(pt_table, f)
...
py> with open('/tmp/data.json', 'r') as f:  # read it back in
...     new_table = json.load(f)
...
py> print new_table
{u'25': [30.1, 42.5, 2.8, u'The Shire'], u'37': [17.2, 67.2, 11.6, u'Mt Doom'], u'84': [124.0, 93.8, 65.2, u'Rivendell']}


You'll see that the JSON format has made two changes to the point table:

(1) All strings are Unicode strings instead of "byte strings" (sometimes 
called "ASCII strings").

(2) The keys were numbers (25, 37, 84) but have now been turned into 
Unicode strings too.

We can advise you how to deal with these changes. Nevertheless, JSON is 
probably the most common standard used today, and you can see how easy 
the writing and reading of the data is.
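
As a rough sketch of one way to deal with change (2), you can convert 
the keys back to integers after loading. The helper name 
restore_int_keys is made up for illustration, and it assumes every key 
in the file is the string form of an integer. The sketch is written for 
Python 3, where print is a function and all strings are already Unicode, 
so change (1) needs no handling there:

```python
import json

def restore_int_keys(table):
    """Return a copy of table with its string keys converted back to ints.

    Hypothetical helper: assumes every key looks like an integer, e.g. '25'.
    """
    return {int(key): value for key, value in table.items()}

pt_table = {25: [30.1, 42.5, 2.8, 'The Shire'],
            37: [17.2, 67.2, 11.6, 'Mt Doom'],
            84: [124.0, 93.8, 65.2, 'Rivendell'],
            }

# Round-trip through a JSON string; json.dump and json.load on a file
# change the keys in exactly the same way.
text = json.dumps(pt_table)
new_table = restore_int_keys(json.loads(text))
print(new_table == pt_table)
```

If a key is not a valid integer, int() raises ValueError, which is 
probably what you want: it tells you the file is not a point table.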

Here is an alternative: Plists. Like JSON, Plist requires the keys to be 
strings, but unlike JSON, it won't convert them for you. So you have to 
use strings in the first place, or write a quick converter:

py> pt_table = dict((str(key), value)
...                 for key, value in pt_table.items())
py> print pt_table
{'25': [30.1, 42.5, 2.8, 'The Shire'], '37': [17.2, 67.2, 11.6, 'Mt Doom'], '84': [124.0, 93.8, 65.2, 'Rivendell']}


Notice that the keys (which were numbers) are now strings. Now we 
can write to a plist, and read it back:

py> import plistlib
py> plistlib.writePlist(pt_table, '/tmp/data.plist')
py> new_table = plistlib.readPlist('/tmp/data.plist')
py> new_table == pt_table
True


Again, if you need to work with numeric keys, there are ways to work 
around that.
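
One such workaround, as a minimal sketch: wrap the saving and loading in 
a pair of functions that convert the keys on the way out and on the way 
back in. The names save_points and load_points are hypothetical, and the 
code assumes all point numbers are integers. It targets Python 3.4 or 
later, where plistlib provides dump and load (writePlist and readPlist 
were removed in Python 3.9):

```python
import os
import plistlib
import tempfile

def save_points(table, path):
    """Write the point table to a plist file, converting keys to strings."""
    as_strings = {str(key): value for key, value in table.items()}
    with open(path, 'wb') as f:  # plistlib needs a file opened in binary mode
        plistlib.dump(as_strings, f)

def load_points(path):
    """Read a plist written by save_points and restore the integer keys."""
    with open(path, 'rb') as f:
        as_strings = plistlib.load(f)
    return {int(key): value for key, value in as_strings.items()}

pt_table = {25: [30.1, 42.5, 2.8, 'The Shire'],
            37: [17.2, 67.2, 11.6, 'Mt Doom'],
            84: [124.0, 93.8, 65.2, 'Rivendell'],
            }

# Use a fresh temporary directory so the example does not clobber real data.
path = os.path.join(tempfile.mkdtemp(), 'data.plist')
save_points(pt_table, path)
new_table = load_points(path)
print(new_table == pt_table)
```

The rest of the program then only ever sees integer point numbers, and 
the string keys stay an implementation detail of the file format.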

If anything is unclear, please feel free to ask questions on the mailing 
list, and somebody will try to answer them.


-- 
Steve

