[Tutor] Storing information as attributes or in a dictionary
Steven D'Aprano
steve at pearwood.info
Wed Sep 19 06:48:46 CEST 2012
On Tue, Sep 18, 2012 at 07:14:26AM -0700, Michiel de Hoon wrote:
> Dear all,
>
> Suppose I have a parser that parses information stored in e.g. an XML file.
You mean like the XML parsers that already come with Python?
http://docs.python.org/library/markup.html
http://eli.thegreenplace.net/2012/03/15/processing-xml-in-python-with-elementtree/
Or powerful third-party libraries that already exist?
http://lxml.de/index.html
Please don't waste your time re-inventing the wheel :)
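To give a rough idea of how little code the standard library parser
needs, here is a minimal sketch using xml.etree.ElementTree. The sample
document, the element names ("records", "record", "name", "price") and
the helper parse_records are all made up for illustration; your real
file will of course look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element and field names are invented.
XML_TEXT = """<records>
  <record id="1"><name>spam</name><price>2.50</price></record>
  <record id="2"><name>eggs</name></record>
</records>"""


def parse_records(text):
    """Return one plain dict per <record> element."""
    root = ET.fromstring(text)
    records = []
    for elem in root.findall("record"):
        # Start with the element's XML attributes, then add child elements.
        record = dict(elem.attrib)
        for child in elem:
            record[child.tag] = child.text
        records.append(record)
    return records


records = parse_records(XML_TEXT)
```

Note that everything comes back as a string, and that the second record
simply has no "price" key. That already hints at why a dict is often a
comfortable fit for this kind of data.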
> I would like to design a Python class to store the information
> contained in this XML file.
>
> One option is to create a class like this:
>
> class Record(object):
>     pass
>
> and store the information in the XML file as attributes of objects of
> this class
That is perfectly fine if you have a known set of attribute names, and
none of them clash with Python reserved words (like "class", "del",
etc.) or are otherwise illegal identifiers (e.g. "2or3").
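If the names come from a file you don't control, you can test them
before deciding. The following is only a sketch: the helper name
is_safe_attribute_name is my own invention, and str.isidentifier needs
Python 3.

```python
import keyword


def is_safe_attribute_name(name):
    """True if name could be written literally as obj.name in source code."""
    return name.isidentifier() and not keyword.iskeyword(name)


class Record(object):
    pass


rec = Record()
# setattr does not check the name, so this succeeds...
setattr(rec, "class", "mammal")
# ...but the value can only be read back with getattr, because
# writing rec.class in source code is a SyntaxError.
value = getattr(rec, "class")
```

So awkward names can be stored as attributes, but you lose the
convenience of dotted access, which was the whole point of using
attributes.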
In general, I prefer to use a record-like object if and only if I have a
pre-defined set of field names, in which case I prefer to use
namedtuple:
py> from collections import namedtuple as nt
py> Record = nt("Record", "north south east west")
py> x = Record(1, 2, 3, 4)
py> print(x)
Record(north=1, south=2, east=3, west=4)
py> x.east
3
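A few more things a namedtuple gives you for free, written as a plain
script rather than an interactive session. The variable names (fields,
as_dict, y, mutable, rejected) are just for illustration.

```python
from collections import namedtuple

Record = namedtuple("Record", "north south east west")
x = Record(1, 2, 3, 4)

fields = x._fields        # the field names, in order
as_dict = x._asdict()     # mapping of field name to value
y = x._replace(east=30)   # a new record; x itself is unchanged

# namedtuples are immutable, so assigning to a field fails.
try:
    x.east = 30
except AttributeError:
    mutable = False
else:
    mutable = True

# Field names must be legal identifiers and not keywords.
try:
    namedtuple("Bad", "class 2or3")
except ValueError:
    rejected = True
else:
    rejected = False
```

The last part shows the same restriction as before: namedtuple refuses
field names that are keywords or illegal identifiers, unless you pass
rename=True and accept automatically generated names.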
> Alternatively I could subclass the dictionary class:
>
> class Record(dict):
>     pass
Why bother subclassing it? You don't add any functionality. Just return
a dict; it will be lighter-weight and faster.
> I can see some advantage to using a dictionary, because it allows me
> to use the same strings as keys in the dictionary as are used in the
> XML file itself. But are there some general guidelines for when to use
> a dictionary-like class,
Yes. You should prefer a dictionary when you have one or more of these:
- your field names could be illegal as identifiers
  (e.g. "field name", "foo!", etc.)
- you have an unknown and potentially unlimited number of field names
- each record could have a different set of field names
- or some fields may be missing
- you expect to be programmatically inspecting field names that aren't
  known until runtime, e.g.:
      name = get_name_of_field()
      value = record[name]  # is cleaner than getattr(record, name)
- you expect to iterate over all field names
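Several of the points above fit in a few lines. This is only a sketch
with invented data; the fixed string assigned to name stands in for
whatever get_name_of_field() would return.

```python
# Hypothetical records with differing and awkward field names.
records = [
    {"field name": "a", "foo!": 1},
    {"field name": "b"},  # "foo!" is missing from this record
]

# Missing fields: supply a default instead of catching an exception.
foo_values = [rec.get("foo!", 0) for rec in records]

# Field name only known at runtime.
name = "field name"
names = [rec[name] for rec in records]

# Iterate over every field of one record, in sorted key order.
pairs = sorted(records[0].items())
```

Doing the same with attributes means getattr(rec, name, default) and
vars(rec).items(), which works but reads worse.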
You might prefer to use attributes of a class if you have one or more
of these:
- all field names are guaranteed to be legal identifiers
- you have a fixed set of field names, known ahead of time
- you value the convenience of writing record.field instead of
  record['field']
> and when to use attributes to store
> information? In particular, are there any situations where there is
> some advantage in using attributes?
Not so much. Attributes are convenient, because you save three
characters:
    obj.spam
    obj['spam']
but otherwise attributes are just a more limited version of dict keys.
Anything that can be done with attributes can be done with a dict, since
attributes are usually implemented with a dict.
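You can see that dict for yourself with vars(). This is a small sketch
for an ordinary class instance; classes that define __slots__, and most
built-in types, don't have a per-instance dict, which is why I say
"usually".

```python
class Record(object):
    pass


rec = Record()
rec.spam = 1

# The instance's attributes live in an ordinary dict.
attrs = vars(rec)          # the same object as rec.__dict__
attrs["eggs"] = 2          # adding a key creates an attribute

# A key that is not a legal identifier can still be stored, but it is
# only reachable through getattr or the dict, never as rec.something.
attrs["field name"] = 3
odd = getattr(rec, "field name")
```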
--
Steven