[Python-ideas] Anonymous namedtuples

Steven D'Aprano steve at pearwood.info
Tue Apr 19 08:59:58 EDT 2016


On Tue, Apr 19, 2016 at 09:49:44AM +0000, Joseph Martinot-Lagarde wrote:
> Hi, list !
> 
> namedtuples are really great. I would like to use them even more, for
> example for functions that return multiple arguments. The problem is that
> namedtuples have to be "declared" beforehand, so it would be quite tedious
> to declare a namedtuple for each function, that's why I very rarely do it.

You aren't forced to declare the type ahead of time; you can create it 
and use it in a single expression:

py> from collections import namedtuple
py> my_tuple = namedtuple("_", "x y z")(1, 2, 3)
py> print(my_tuple)
_(x=1, y=2, z=3)


However, there is a gotcha if you do this: each time you create an 
anonymous namedtuple, you create a new, distinct, class:

py> a = namedtuple("_", "x y")(1, 2)
py> b = namedtuple("_", "x y")(1, 2)
py> type(a) == type(b)
False

which could be both surprising and expensive.

One possible solution: keep your own cache of classes:

def nt(name, fields, _cache={}):
    cls = _cache.get((name, fields), None)
    if cls is None:
        cls = _cache[(name, fields)] = namedtuple(name, fields)
    return cls
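A quick check of the cache, as a sketch only. It assumes the field spec is 
passed as a string, since a list is not hashable and could not be part of 
the dict key. The helper is repeated so the snippet runs on its own:

```python
from collections import namedtuple

def nt(name, fields, _cache={}):
    # The mutable default argument is deliberate: it persists between calls.
    key = (name, fields)  # fields must be hashable, e.g. "x y", not ["x", "y"]
    cls = _cache.get(key)
    if cls is None:
        cls = _cache[key] = namedtuple(name, fields)
    return cls

a = nt("_", "x y")(1, 2)
b = nt("_", "x y")(1, 2)
same_class = type(a) is type(b)   # both instances come from the cached class
other = nt("_", "y x")            # a different field spec gets its own class
```

Note that the key is the literal spec, so "x y" and "x, y" would be cached 
as two separate classes even though namedtuple treats them the same.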



> Another thing I don't like about namedtuples is the duplication of the name.
> Typical declarations look like `Point = namedtuple('Point', ['x', 'y'])`,
> where `Point` is repeated two times. I'll go one step further and say that
> the name is useless most of the time, so let's just get rid of it.

The name is not useless. It is very useful for string representations, 
debugging, introspection, and generally having a clue what the object 
represents. How else do you know what kind of object you are dealing 
with?

I agree that it is a little sad that we have to write the name twice, 
but strictly speaking, you don't even need to do that: the class name 
and the name passed to namedtuple need not match. For example:

class Point(namedtuple("AbstractPoint", "x y z")):
    def method(self):
        ...

In my opinion, avoiding having to repeat the name is a "nice to 
have", not a "must have".
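To make the subclassing example concrete, here is a minimal runnable 
sketch. The method norm_squared is hypothetical, added only for 
illustration, and the __slots__ line is an optional extra. On the Python 3 
versions I am aware of, the repr takes its name from the subclass, so 
"Point" only has to be written once:

```python
from collections import namedtuple

class Point(namedtuple("AbstractPoint", "x y z")):
    __slots__ = ()  # optional: avoids a per-instance __dict__

    def norm_squared(self):  # hypothetical method, for illustration only
        return self.x**2 + self.y**2 + self.z**2

p = Point(1, 2, 3)
point_repr = repr(p)  # uses the subclass name, not "AbstractPoint"
```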



> Proposal
> ========
> 
> So I thought about a new (ok, maybe it has been proposed before but I
> couldn't find it) syntax for anonymous namedtuples (I put the prints as
> comments, otherwise gmane is complaining about top-posting):
> 
>     my_point = (x=12, y=16)
>     # (x=12, y=16)
>     my_point[0]
>     # 12

How do you know that the 0th item is field "x"?

Keyword arguments are not ordered. Even if you could somehow determine 
the order that they are given, you can't use that information since we 
should expect that:

assert (x=12, y=16) == (y=16, x=12)

will pass. 

(Why? Because they're keyword arguments, and the order of keyword 
arguments shouldn't matter.)

So we have a problem that the indexing order (the same order is used for 
iteration and tuple unpacking) is not specified anywhere. This is a 
*major* problem for an ordered type like tuple.

namedtuple avoids this problem by requiring the user to specify the 
field name order before creating an instance.

SimpleNamespace avoids this problem by not being ordered or iterable.
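A short sketch of that SimpleNamespace behaviour: equality ignores the 
order in which the attributes were given, and there is no iteration or 
indexing, so the question of "which item is number 0" never comes up:

```python
from types import SimpleNamespace

ns1 = SimpleNamespace(x=12, y=16)
ns2 = SimpleNamespace(y=16, x=12)

# Compared by attribute names and values, not by the order they were given.
order_independent = (ns1 == ns2)

try:
    iter(ns1)
    iterable = True
except TypeError:
    iterable = False

try:
    ns1[0]
    indexable = True
except TypeError:
    indexable = False
```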

I suppose we could put the fields in sorted order. But that's going to 
make life difficult for uses where we would like some other order, e.g. 
to match common conventions. Consider a 3D point in spherical 
coordinates:

pt = (r=3, theta=0.5, phi=0.25)

In sorted order the fields are (phi, r, theta), so pt == (0.25, 3, 0.5), 
which goes against the standard mathematical definition.
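The sorted-order idea can be emulated today with a small helper. This is 
only a sketch: sorted_anon is a hypothetical name, not part of any 
proposal, and it builds a new class on every call, so it has the 
class-per-instance cost as well:

```python
from collections import namedtuple

def sorted_anon(**fields):
    # Hypothetical helper: field order is the alphabetical order of the
    # names, regardless of the order in which the keywords were written.
    names = sorted(fields)
    cls = namedtuple("_", names)
    return cls(**fields)

pt = sorted_anon(r=3, theta=0.5, phi=0.25)
field_order = pt._fields   # alphabetical: phi, r, theta
as_tuple = tuple(pt)       # values follow that order, not (r, theta, phi)
```

Swapping the keywords around in the call gives an equal tuple, which is the 
property we wanted, but the radius is no longer item 0.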


Namedtuples pre-define what fields are allowed, what they are 
called, and what order they appear in. Anonymous namedtuples as you 
describe them don't do any of these things.

Consider that since they're tuples, we should be able to provide the 
items as positional arguments to the constructor, just like regular 
namedtuples:

pt = (x=12, y=16)
type(pt)(1, 2)

This works fine with namedtuple, but how will it work with your 
proposal? And what happens if we do this?

type(pt)(1, 2, 3)
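For comparison, this is how an ordinary namedtuple answers both questions. 
The field names and their count are fixed when the class is created, so 
positional construction is well defined and an extra argument is rejected:

```python
from collections import namedtuple

pt = namedtuple("_", "x y")(12, 16)
cls = type(pt)

other = cls(1, 2)  # positional arguments map onto x, y in declared order

try:
    cls(1, 2, 3)   # one argument too many for a two-field type
    too_many_ok = True
except TypeError:
    too_many_ok = False
```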


And of course, a naive implementation would suffer from the same issue 
as mentioned above, where every instance is a singleton of a distinct 
class. Python isn't a language where we care too much about memory use, 
but surely we don't want to be quite this profligate:

py> import sys
py> a = namedtuple("Point", "x y z")(1, 2, 3)
py> sys.getsizeof(a)  # reasonably small
36
py> sys.getsizeof(type(a))  # not so small
420

Obviously we don't want every instance to belong to a distinct class, so 
we need some sort of cache. SimpleNamespace solves this problem by 
making all instances belong to the same class. That's another difference 
between namedtuple and what you seem to want.
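The comparison can be rerun on your own interpreter with the sketch below. 
It assumes CPython. The exact byte counts vary by version and platform, and 
sys.getsizeof reports only the object's own footprint, not everything the 
class refers to, so treat the numbers as a lower bound:

```python
import sys
from collections import namedtuple
from types import SimpleNamespace

a = namedtuple("Point", "x y z")(1, 2, 3)
b = namedtuple("Point", "x y z")(1, 2, 3)

instance_size = sys.getsizeof(a)     # the tuple itself
class_size = sys.getsizeof(type(a))  # the freshly created class

# Each namedtuple() call builds a new class, even for an identical spec.
distinct_classes = type(a) is not type(b)

# Every SimpleNamespace instance shares the one SimpleNamespace class.
shared_class = type(SimpleNamespace(x=1)) is type(SimpleNamespace(y=2))
```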



-- 
Steve

