[Tutor] Storing information as attributes or in a dictionary

Steven D'Aprano steve at pearwood.info
Wed Sep 19 06:48:46 CEST 2012

On Tue, Sep 18, 2012 at 07:14:26AM -0700, Michiel de Hoon wrote:
> Dear all,
> Suppose I have a parser that parses information stored in e.g. an XML file.

You mean like the XML parsers that already come with Python?


Or powerful third-party libraries that already exist?


Please don't waste your time re-inventing the wheel :)

> I would like to design a Python class to store the information 
> contained in this XML file.
> One option is to create a class like this:
> class Record(object):
>     pass
> and store the information in the XML file as attributes of objects of 
> this class

That is perfectly fine if you have a known set of attribute names, and 
none of them clash with Python reserved words (like "class", "del", 
etc.) or are otherwise illegal identifiers (e.g. "2or3").

In general, I prefer a record-like object if and only if I have a 
pre-defined set of field names, in which case I use namedtuple:

py> from collections import namedtuple as nt
py> Record = nt("Record", "north south east west")
py> x = Record(1, 2, 3, 4)
py> print(x)
Record(north=1, south=2, east=3, west=4)
py> x.east
3
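
As a short sketch of what else a namedtuple gives you, using the same Record fields as above: the field names are fixed when the class is created, fields can be read by name or by position, and records are immutable, so "changing" a field means building a new record.

```python
from collections import namedtuple

Record = namedtuple("Record", "north south east west")
x = Record(1, 2, 3, 4)

print(x.east)            # access by field name
print(x[2])              # the same field, by position
print(Record._fields)    # the fixed set of field names

# Records are immutable; _replace returns a new record instead.
y = x._replace(east=30)
print(y)

# Convert to a plain dict when you do need to inspect names at runtime.
print(dict(x._asdict()))
```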

> Alternatively I could subclass the dictionary class:
> class Record(dict):
>     pass

Why bother subclassing it? You don't add any functionality. Just return 
a dict, it will be lighter-weight and faster.
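
For example, here is a rough sketch of returning a plain dict from one of the parsers that ships with Python, xml.etree.ElementTree. The XML layout and the function name parse_record are assumptions made up for illustration; your real file will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one <record> element with simple child fields.
XML_TEXT = """<record>
    <north>1</north>
    <south>2</south>
    <field-name>hyphens are fine as dict keys</field-name>
</record>"""


def parse_record(xml_text):
    """Return a plain dict mapping each child tag to its text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


record = parse_record(XML_TEXT)
print(record["north"])
print(record["field-name"])
```

Note that the values come back as strings, and that a tag like "field-name" works as a key even though it could never be an attribute name.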

> I can see some advantage to using a dictionary, because it allows me 
> to use the same strings as keys in the dictionary as are used in the 
> XML file itself. But are there some general guidelines for when to use 
> a dictionary-like class, 

Yes. You should prefer a dictionary when you have one or more of these:

- your field names could be illegal as identifiers 
  (e.g. "field name", "foo!", etc.)

- you have an unknown and potentially unlimited number of field names

- each record could have a different set of field names

- or some fields may be missing

- you expect to be programmatically inspecting field names that aren't 
  known until runtime, e.g.:

  name = get_name_of_field()
  value = record[name] # is cleaner than getattr(record, name)

- you expect to iterate over all field names
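
Several of the cases above can be sketched in a few lines. The records and the body of get_name_of_field are invented for illustration; only the pattern matters.

```python
# Records with different sets of fields, one with a name that is
# not a legal identifier.
records = [
    {"name": "alpha", "field name": "has a space", "size": 3},
    {"name": "beta"},  # "size" is missing here
]


def get_name_of_field():
    # Stand-in for a name that is only known at runtime.
    return "size"


name = get_name_of_field()
for record in records:
    # dict.get supplies a default when the field is missing.
    value = record.get(name, "n/a")
    print(record["name"], value)

# Iterating over all field names of one record.
for key in sorted(records[0]):
    print(key, "=", records[0][key])
```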

You might prefer to use attributes of a class if you have one or more 
of these:

- all field names are guaranteed to be legal identifiers

- you have a fixed set of field names, known ahead of time

- you value the convenience of writing record.field instead of
  record["field"]

> and when to use attributes to store 
> information? In particular, are there any situations where there is 
> some advantage in using attributes?

Not so much. Attributes are convenient, because you save three 
characters (record.field instead of record["field"]),
but otherwise attributes are just a more limited version of dict keys. 
Anything that can be done with attributes can be done with a dict, since 
attributes are usually implemented with a dict.
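
You can see this for yourself with vars(), which returns the dict behind an ordinary instance. This is only a small sketch; the "usually" matters, because a class that defines __slots__ has no per-instance dict.

```python
class Record(object):
    pass


rec = Record()
rec.north = 1
rec.south = 2

# The attributes live in an ordinary dict on the instance.
print(vars(rec))

# Writing to that dict creates an attribute, and vice versa.
vars(rec)["east"] = 3
print(rec.east)

# Attribute lookup and dict lookup find the same object here.
print(getattr(rec, "north") == rec.__dict__["north"])
```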

