[Python-Dev] Suggestions for Improvements to namedtuple
Isaac Morland
ijmorlan at cs.uwaterloo.ca
Wed Nov 14 19:30:04 CET 2007
I was working on something very similar to namedtuple for a project of my own,
when it occurred to me that it's a generally useful idea and maybe somebody else
was working on it too. So I searched, and found Raymond Hettinger's addition
to collections.py, and it does pretty much what I want. I have a few
suggestions which I hope are improvements. I will discuss each one, and then
at the bottom I will put my version of namedtuple. It is not based on the one
in Python SVN because it was mostly written before I found out about that one.
If my suggestions meet with approval, I could check out a copy of
collections.py and make a patch for further comment and eventual submission to
somebody with checkin privileges.
1. I think it is important that there be a way to create individual namedtuple
instances from an existing sequence that doesn't involve splitting the sequence
out into individual parameters and then re-assembling a tuple to pass to the
base tuple constructor. In my application, I'm taking database rows and
creating named tuples from them, with the named tuple type being automatically
created as appropriate. That means *lots* of named tuple instances will be
created, so for efficiency I would prefer to avoid using * to break up the
sequences generated directly by the database interface. I would like to pass
those sequences directly to the base tuple constructor.
To restore to my code the feature of being able to use individual parameters as
in collections.py, I added a classmethod to the generated classes called
__fromvalues__. This uses Signature, my other idea (next message), to convert a
call matching a procedure signature of (fieldname1, ...) into a dictionary, and
passes that dictionary into another classmethod __fromdict__ which creates a
named tuple instance from the dictionary contents.
The problem I see with this is that having to say

    Point.__fromvalues__ (11, y=22)

instead of

    Point (11, y=22)

is a little verbose. Perhaps there could instead be a __fromsequence__ for
the no-unpacking method of instance creation, since I think direct-from-sequence
creation is mostly needed in more general circumstances.
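
To make the trade-off in point 1 concrete, here is a minimal sketch written for current Python 3. The Point class is a hand-written, hypothetical stand-in for a generated class, and __fromsequence__ is only the name floated above, not an existing API. It keeps the ordinary constructor for individual parameters and adds a classmethod that hands a sequence straight to the base tuple constructor.

```python
class Point(tuple):
    """Hypothetical stand-in for a generated named tuple class."""

    __fields__ = ("x", "y")

    def __new__(cls, x, y):
        # Individual parameters: a new tuple is re-assembled from them
        # before being passed to the base tuple constructor.
        return tuple.__new__(cls, (x, y))

    @classmethod
    def __fromsequence__(cls, seq):
        # Hand the sequence to the base tuple constructor as-is:
        # no * unpacking and no re-assembled argument tuple.
        result = tuple.__new__(cls, seq)
        if len(result) != len(cls.__fields__):
            raise TypeError("expected %d values, got %d"
                            % (len(cls.__fields__), len(result)))
        return result


row = [11, 22]                     # e.g. one row from a database interface
p = Point.__fromsequence__(row)    # direct-from-sequence creation
q = Point(11, y=22)                # individual parameters
```

Both calls produce equal Point instances; the length check is an addition of this sketch and could be dropped if raw speed matters more than validation.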
2. It would be nice to be able to have default values for named tuple fields.
Using Signature it's easy to do this - I just specify a dictionary of defaults
at named tuple class creation time.
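
Signature itself is described in the next message, so the following is only a rough sketch of the kind of expansion point 2 relies on, written for current Python 3. The free function expand_args is a hypothetical stand-in, not the actual Signature implementation: it maps positional and keyword arguments onto field names, then fills any gaps from a dictionary of defaults.

```python
def expand_args(fields, defaults, *args, **keys):
    """Hypothetical stand-in: turn a call into a {field: value} dictionary."""
    if len(args) > len(fields):
        raise TypeError("too many positional arguments")
    # Positional arguments are matched to the leading field names.
    values = dict(zip(fields, args))
    for name, value in keys.items():
        if name not in fields:
            raise TypeError("unexpected field %r" % name)
        if name in values:
            raise TypeError("multiple values for field %r" % name)
        values[name] = value
    # Anything still missing must come from the defaults dictionary.
    for name in fields:
        if name not in values:
            if defaults is None or name not in defaults:
                raise TypeError("missing value for field %r" % name)
            values[name] = defaults[name]
    return values


fields = ("realname", "email")
defaults = {"realname": None}
filled = expand_args(fields, defaults, email="test@example.com")
```

Here filled contains both fields, with realname taken from the defaults; a call that omits email raises TypeError because no default was given for it.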
3. In my opinion __replace__ should be able to replace multiple fields. My
version takes either two parameters, same as collections.py, or a single
dictionary containing replacements.
4. I put as much of the implementation as possible of the named tuple classes
into a base class which I've called BaseNamedTuple. This defines the
classmethods __fromvalues__ and __fromdict__, as well as the regular methods
__repr__, __asdict__, and __replace__.
5. It didn't occur to me to use "exec ... in", so I just create the new type
using the type() function. To me, exec is a last resort, but I'm a Python
newbie so I'd be interested to hear what people have to say about this.
6. Not an improvement but a concern about my code: the generated classes and
instances have all the crucial stuff like __fields__ and __signature__ fully
read-write. It feels like those should be read-only properties. I think that
would require namedtuple to be a metaclass instead of just a function (in order
for the properties of the generated classes to be read-only). On the other
hand, I'm a recovering Java programmer, so maybe it's un-Pythonic to want stuff
to be read-only. Here I would especially appreciate any guidance more
experienced hands can offer.
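
One way the read-only idea in point 6 might look, as a sketch for current Python 3 under stated assumptions: ReadOnlyFieldsMeta, make_type and the _fields storage attribute are all hypothetical names, and only __fields__ is covered. A property defined on the metaclass is consulted when the attribute is accessed on the class itself, so with no setter, assignment to it is rejected.

```python
class ReadOnlyFieldsMeta(type):
    """Hypothetical metaclass exposing __fields__ as a read-only class property."""

    @property
    def __fields__(cls):
        # Looked up on the metaclass when someone writes SomeClass.__fields__.
        return cls._fields


def make_type(name, fields):
    # Build the class through the metaclass instead of plain type().
    return ReadOnlyFieldsMeta(name, (tuple,), {"_fields": tuple(fields)})


Point = make_type("Point", ("x", "y"))
fields_seen = Point.__fields__

try:
    Point.__fields__ = ("a", "b")   # the property has no setter
    rejected = False
except AttributeError:
    rejected = True
```

This only guards against accidental assignment: the underlying _fields attribute is still writable, and attributes defined on a metaclass are not visible through instances, so instances would need a property of their own.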
And now, here is the code, together with a rudimentary example of how this
could be used to improve the "addr" functions in email.utils:
#!/usr/bin/env python

from operator import itemgetter

# Signature is described in the next message; it is not defined in this one.

class BaseNamedTuple (tuple):

    @classmethod
    def __fromvalues__ (cls, *args, **keys):
        # Map the call onto the field names, then build from the dictionary.
        return cls.__fromdict__ (cls.__signature__.expand_args (*args, **keys))

    @classmethod
    def __fromdict__ (cls, d):
        return cls ([d[name] for name in cls.__fields__])

    def __repr__ (self):
        return self.__reprtemplate__ % self

    def __asdict__ (self):
        return dict (zip (self.__fields__, self))

    def __replace__ (self, *args):
        # Accept either one dictionary of replacements or a (field, value) pair.
        slist = list (self)
        if len (args) == 1:
            sdict = args[0]
        elif len (args) == 2:
            sdict = {args[0]: args[1]}
        else:
            raise TypeError ("expected a dictionary or a (field, value) pair")
        for key in sdict:
            slist[self.__indices__[key]] = sdict[key]
        return self.__class__ (slist)

def namedtuple (name, fields, defaults=None):
    fields = tuple (fields)
    result = type (name, (BaseNamedTuple,), {})
    for i in range (len (fields)):
        # Getter-only property for position i (no setter, no deleter).
        setattr (result, fields[i], property (itemgetter (i)))
    result.__fields__ = fields
    result.__signature__ = Signature (fields, defaults=defaults)
    result.__reprtemplate__ = "%s(%s)" % (name,
        ", ".join ("%s=%%r" % field for field in fields))
    result.__indices__ = dict ((field, i) for i, field in enumerate (fields))
    return result

from email.utils import formataddr

class test (namedtuple ("test", ("realname", "email"), {'realname': None})):

    @property
    def valid (self):
        return self.email.find ("@") >= 0

    # formataddr takes a (realname, email) pair, which is what self is.
    __str__ = formataddr

if __name__ == "__main__":
    e1 = test (("Smith, John", "jsmith@example.com"))
    print "e1.realname =", e1.realname
    print "e1.email =", e1.email
    print "e1 =", repr (e1)
    print "str(e1) =", str (e1)
    e2 = test.__fromvalues__ (email="test@example.com")
    print "e2 =", repr (e2)
    print "str(e2) =", str (e2)

Isaac Morland CSCF Web Guru
DC 2554C, x36650 WWW Software Specialist