[Python-ideas] Anonymous namedtuples
Steven D'Aprano
steve at pearwood.info
Tue Apr 19 08:59:58 EDT 2016
On Tue, Apr 19, 2016 at 09:49:44AM +0000, Joseph Martinot-Lagarde wrote:
> Hi, list !
>
> namedtuples are really great. I would like to use them even more, for
> example for functions that return multiple arguments. The problem is that
> namedtuples have to be "declared" beforehand, so it would be quite tedious
> to declare a namedtuple by function, that's why I very rarely do it.
You aren't forced to declare the type ahead of time; you can just use it
as part of an expression:
py> from collections import namedtuple
py> my_tuple = namedtuple("_", "x y z")(1, 2, 3)
py> print(my_tuple)
_(x=1, y=2, z=3)
However, there is a gotcha if you do this: each time you create an
anonymous namedtuple, you create a new, distinct, class:
py> a = namedtuple("_", "x y")(1, 2)
py> b = namedtuple("_", "x y")(1, 2)
py> type(a) == type(b)
False
which could be both surprising and expensive.
One possible solution: keep your own cache of classes:
    def nt(name, fields, _cache={}):
        cls = _cache.get((name, fields), None)
        if cls is None:
            cls = _cache[(name, fields)] = namedtuple(name, fields)
        return cls
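A quick check of that cache, with the helper repeated so the snippet runs on its own. This is only a sketch: it assumes the fields are passed as a string or a tuple, since a list is unhashable and can't be part of a dict key.

```python
from collections import namedtuple

def nt(name, fields, _cache={}):
    # The mutable default argument is deliberate: it persists between
    # calls and acts as the cache. `fields` must be hashable.
    key = (name, fields)
    cls = _cache.get(key)
    if cls is None:
        cls = _cache[key] = namedtuple(name, fields)
    return cls

a = nt("_", "x y")(1, 2)
b = nt("_", "x y")(1, 2)
same_class = type(a) is type(b)  # both calls hit the same cache entry
```

Note that "x y" and "y x" are different keys, so they still give two different classes with different field orders.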
> Another thing I don't like about namedtuples is the duplication of the name.
> Typical declarations look like `Point = namedtuple('Point', ['x', 'y'])`,
> where `Point` is repeated two times. I'll go one step further and say that
> the name is useless most of the time, so let's just get rid of it.
The name is not useless. It is very useful for string representations,
debugging, introspection, and generally having a clue what the object
represents. How else do you know what kind of object you are dealing
with?
I agree that it is a little sad that we have to repeat the name twice,
but strictly speaking, you don't even need to do that. For example:
    class Point(namedtuple("AbstractPoint", "x y z")):
        def method(self):
            ...
In my opinion, avoiding having to repeat the name twice is a "nice to
have", not a "must have".
> Proposal
> ========
>
> So I thought about a new (ok, maybe it has been proposed before but I
> couldn't find it) syntax for anonymous namedtuples (I put the prints as
> comments, otherwise gmane is complaining about top-posting):
>
> my_point = (x=12, y=16)
> # (x=12, y=16)
> my_point[0]
> # 12
How do you know that the 0th item is field "x"?
Keyword arguments are not ordered. Even if you could somehow determine
the order that they are given, you can't use that information since we
should expect that:
assert (x=12, y=16) == (y=16, x=12)
will pass.
(Why? Because they're keyword arguments, and the order of keyword
arguments shouldn't matter.)
So we have a problem that the indexing order (the same order is used for
iteration and tuple unpacking) is not specified anywhere. This is a
*major* problem for an ordered type like tuple.
namedtuple avoids this problem by requiring the user to specify the
field name order before creating an instance.
SimpleNamespace avoids this problem by not being ordered or iterable.
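To make the SimpleNamespace point concrete, here is a small sketch (the variable names are only for illustration). Two namespaces built with the same keywords in a different order compare equal, and the object refuses to be iterated, so the question of "which field comes first" never arises:

```python
from types import SimpleNamespace

a = SimpleNamespace(x=12, y=16)
b = SimpleNamespace(y=16, x=12)

# Equality compares the attributes, not the order they were given in.
order_independent = (a == b)

# SimpleNamespace defines no __iter__, so iter() raises TypeError.
try:
    iter(a)
    iterable = True
except TypeError:
    iterable = False
```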
I suppose we could put the fields in sorted order. But that's going to
make life difficult for uses where we would like some other order, e.g.
to match common conventions. Consider a 3D point in spherical
coordinates:
pt = (r=3, theta=0.5, phi=0.25)
In sorted order, the fields would be (phi, r, theta), so
pt == (0.25, 3, 0.5) which goes against the standard
mathematical definition.
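A rough sketch of what "sorted order" would mean in practice. The helper `sorted_record` is hypothetical, made up for this illustration; it is not part of the proposal or of the standard library:

```python
from collections import namedtuple

def sorted_record(**fields):
    # Hypothetical helper: order the fields alphabetically by name,
    # ignoring the order the caller wrote them in.
    names = sorted(fields)
    cls = namedtuple("_", names)
    return cls(**fields)

pt = sorted_record(r=3, theta=0.5, phi=0.25)
field_order = pt._fields   # ('phi', 'r', 'theta')
values = tuple(pt)         # (0.25, 3, 0.5)
```

The caller wrote r first, but pt[0] is phi.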
Namedtuples pre-define what fields are allowed, what they are
called, and what order they appear in. Anonymous namedtuples as you
describe them don't do any of these things.
Consider that since they're tuples, we should be able to provide the
items as positional arguments to the constructor, just like regular
namedtuples:
pt = (x=12, y=16)
type(pt)(1, 2)
This works fine with namedtuple, but how will it work with your
proposal? And what happens if we do this?
type(pt)(1, 2, 3)
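For comparison, this is how a regular namedtuple behaves in both cases. The class name `Point` is just for the example. The field names and their count are fixed by the class, so positional construction is well defined and an extra argument is rejected:

```python
from collections import namedtuple

Point = namedtuple("Point", "x y")
pt = Point(x=12, y=16)

# Positional construction works because the class fixes the field order.
other = type(pt)(1, 2)

# A third positional argument does not match the declared fields.
try:
    type(pt)(1, 2, 3)
    error = None
except TypeError as exc:
    error = exc
```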
And of course, a naive implementation would suffer from the same issue
as mentioned above, where every instance is a singleton of a distinct
class. Python isn't a language where we care too much about memory use,
but surely we don't want to be quite this profligate:
py> import sys
py> a = namedtuple("Point", "x y z")(1, 2, 3)
py> sys.getsizeof(a) # reasonably small
36
py> sys.getsizeof(type(a)) # not so small
420
Obviously we don't want every instance to belong to a distinct class, so
we need some sort of cache. SimpleNamespace solves this problem by
making all instances belong to the same class. That's another difference
between namedtuple and what you seem to want.
--
Steve