[Tutor] data storage question
Steven D'Aprano
steve at pearwood.info
Mon Aug 1 22:14:04 EDT 2016
On Mon, Aug 01, 2016 at 04:47:32PM -0400, Colby Christensen wrote:
> I'm a novice programmer. I have a decent understanding of algorithms
> but I don't have a lot of computer science/software engineering
> experience. As a way to help me learn, I've begun a coordinate
> geometry program similar to the COGO program developed years ago at
> MIT. Currently, I store the points in a dictionary in the format
> point_number : [North, East]. I eventually will add a Z component to
> the points and possibly a description after I get enough of the
> horizontal geometry worked through. I would like to be able to save
> the point table so that I can have multiple projects.
For external storage, you have lots of options. Two common and
suitable standards are JSON and Plist.
Suppose you have a point table:
pt_table = {25: [30.1, 42.5, 2.8, 'The Shire'],
            37: [17.2, 67.2, 11.6, 'Mt Doom'],
            84: [124.0, 93.8, 65.2, 'Rivendell'],
            }
I can save that table out to a JSON file, then read it back in, like
this:
py> import json
py> with open('/tmp/data.json', 'w') as f:  # save the table to disk
...     json.dump(pt_table, f)
...
py> with open('/tmp/data.json', 'r') as f:  # read it back in
...     new_table = json.load(f)
...
py> print new_table
{u'25': [30.1, 42.5, 2.8, u'The Shire'], u'37': [17.2, 67.2, 11.6, u'Mt Doom'], u'84': [124.0, 93.8, 65.2, u'Rivendell']}
You'll see that the JSON format has made two changes to the point table:
(1) All strings are Unicode strings instead of "byte strings" (sometimes
called "ASCII strings").
(2) The keys were numbers (25, 37, 84) but have now been turned into
Unicode strings too.
We can advise you how to deal with these changes. Nevertheless, JSON is
probably the most common standard used today, and you can see how easy
the writing and reading of the data is.
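As one way of dealing with change (2), here is a minimal sketch that
converts the keys back to integers after loading. The temporary
directory and the name "restored" are assumptions for illustration only;
any writable path would do.

```python
import json
import os
import tempfile

pt_table = {25: [30.1, 42.5, 2.8, 'The Shire'],
            37: [17.2, 67.2, 11.6, 'Mt Doom'],
            84: [124.0, 93.8, 65.2, 'Rivendell'],
            }

# Hypothetical location for the saved table (a fresh temporary directory).
path = os.path.join(tempfile.mkdtemp(), 'data.json')

with open(path, 'w') as f:  # save the table to disk
    json.dump(pt_table, f)

with open(path, 'r') as f:  # read it back in; the keys are now strings
    loaded = json.load(f)

# Rebuild the table, turning each string key back into an int.
restored = dict((int(key), value) for key, value in loaded.items())
```

If the string values must also be byte strings under Python 2, they would
need a separate conversion step; in many cases the Unicode strings can
simply be used as they are.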
Here is an alternative: Plists. Like JSON, Plist requires the keys to be
strings, but unlike JSON, it won't convert them for you. So you have to
use strings in the first place, or write a quick converter:
py> pt_table = dict((str(key), value) for key, value in pt_table.items())
py> print pt_table
{'25': [30.1, 42.5, 2.8, 'The Shire'], '37': [17.2, 67.2, 11.6, 'Mt Doom'], '84': [124.0, 93.8, 65.2, 'Rivendell']}
Notice that the keys (which were numbers) are now strings. Now we
can write to a plist, and read it back:
py> import plistlib
py> plistlib.writePlist(pt_table, '/tmp/data.plist')
py> new_table = plistlib.readPlist('/tmp/data.plist')
py> new_table == pt_table
True
Again, if you need to work with numeric keys, there are ways to work
around that.
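One such workaround, sketched here on the assumption of Python 3.4 or
later (where plistlib offers dump() and load() in place of writePlist()
and readPlist()), is to convert the keys to strings when saving and back
to integers when loading. The temporary path is only for illustration.

```python
import os
import plistlib
import tempfile

pt_table = {25: [30.1, 42.5, 2.8, 'The Shire'],
            37: [17.2, 67.2, 11.6, 'Mt Doom'],
            84: [124.0, 93.8, 65.2, 'Rivendell'],
            }

# Hypothetical location for the saved table (a fresh temporary directory).
path = os.path.join(tempfile.mkdtemp(), 'data.plist')

# Plist keys must be strings, so convert them on the way out.
string_keyed = dict((str(key), value) for key, value in pt_table.items())
with open(path, 'wb') as f:  # plistlib.dump needs a binary file
    plistlib.dump(string_keyed, f)

# Convert the keys back to ints on the way in.
with open(path, 'rb') as f:
    loaded = plistlib.load(f)
new_table = dict((int(key), value) for key, value in loaded.items())
```

The working table keeps its numeric keys throughout; only the copy
written to disk uses strings.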
If anything is unclear, please feel free to ask questions on the mailing
list, and somebody will try to answer them.
--
Steve