[Python-ideas] namedtuple literals [Was: RE a new namedtuple]

Mon Jul 24 09:31:19 EDT 2017

On Sun, Jul 23, 2017 at 07:47:16PM +0200, Michel Desmoulin wrote:

> I'm not sure why everybody has such a grip on the type.
> 
> When we use regular tuples, no one cares, it's all tuples, no matter what.

Some people care.

This is one of the serious disadvantages of ordinary tuples as a 
record/struct type. There's no way to distinguish between (let's say) 
rectangular coordinates (1, 2) and polar coordinates (1, 2), or between 
(name, age) and (movie_title, score). They're all just 2-tuples.
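As a small illustration of that point, here is a minimal sketch. The names Rect, Polar and describe are made up for this example; only collections.namedtuple is real. Plain tuples carry no type information beyond "tuple", while two namedtuple classes can at least be told apart with isinstance:

```python
from collections import namedtuple

# Two plain 2-tuples: nothing records which coordinate system each uses.
rectangular = (1, 2)
polar = (1, 2)
print(rectangular == polar, type(rectangular) is type(polar))

# Hypothetical record types, invented for this example.
Rect = namedtuple('Rect', ['x', 'y'])
Polar = namedtuple('Polar', ['r', 'theta'])

def describe(point):
    """Report which kind of coordinates we were given, if we can tell."""
    if isinstance(point, Rect):
        return 'rectangular'
    if isinstance(point, Polar):
        return 'polar'
    return 'unknown'

print(describe(Rect(1, 2)))
print(describe(Polar(1, 2)))
print(describe((1, 2)))

# Caveat: namedtuples still compare as tuples, so equality ignores the type.
print(Rect(1, 2) == Polar(1, 2))
```

Note the caveat in the last line: the type is available for those who care to check it, but ordinary tuple equality does not take it into account.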

[...]
> The whole point of this is to make it a literal, simple and quick to
> use. If you make it more than it is, we already got everything to do
> this and don't need to modify the language.

I disagree: in my opinion, the whole point is to make namedtuple faster, 
so that Python's startup time isn't affected so badly. Creating new 
syntax for a new type of tuple is scope-creep.

Even if we had that new syntax, the problem of namedtuple slowing down 
Python startup would remain. People can't use this new syntax until they 
have dropped support for everything before 3.7, which might take many 
years. But a fast namedtuple will give them a benefit as soon as their 
users upgrade to 3.7.

I agree that there is a strong case to be made for a fast, built-in, 
easy way to make record/structs without having to pre-declare them. But 
as the Zen of Python says:

    Now is better than never.
    Although never is often better than *right* now.

Let's not rush into designing a poor record/struct builtin just because 
we have a consensus (Raymond dissenting?) that namedtuple is too slow. 
The two issues are not unrelated, but they are orthogonal. Record syntax 
would still be useful even if namedtuple were accelerated, and a faster 
namedtuple would still be necessary even if we had record syntax.

I believe that a couple of people (possibly including Guido?) are 
already thinking about a PEP for that. If that's the case, let's wait 
and see what they come up with.

In the meantime, let's get back to the original question here: how can we 
make namedtuple faster?

- Guido has ruled out using a metaclass as the implementation, 
  as that makes it hard to inherit from namedtuple and another
  class with a different metaclass.

- Backwards compatibility is a must.

- *But* maybe we can afford to bend backwards compatibility 
  a bit. Perhaps we don't need to generate the *entire* class
  using exec, just __new__.

- I don't think that the _source attribute itself makes
  namedtuple slow. That might affect the memory usage of the
  class object itself, but it's just a name binding:

    result._source = class_definition

  The expensive part is, I'm fairly sure, this:

    exec(class_definition, namespace)

(Taken from the 3.5 collections/__init__.py.)
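One rough way to check where the time goes is to compare building a class by exec'ing source text against building an equivalent class directly with type(). This is only a sketch: CLASS_TEMPLATE is a made-up, much smaller stand-in for the real template in collections, and the helper names are hypothetical. The timings will vary by machine and Python version, so no numbers are claimed here.

```python
import timeit

# Hypothetical, heavily cut-down stand-in for the real class template.
CLASS_TEMPLATE = """
class Point(tuple):
    __slots__ = ()
    _fields = ('x', 'y')
    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))
"""

def create_with_exec():
    """Compile and run the source text each time, as the exec approach does."""
    namespace = {}
    exec(CLASS_TEMPLATE, namespace)
    return namespace['Point']

def _point_new(cls, x, y):
    return tuple.__new__(cls, (x, y))

def create_with_type():
    """Build an equivalent class without compiling any source text."""
    return type('Point', (tuple,), {
        '__slots__': (),
        '_fields': ('x', 'y'),
        '__new__': _point_new,
    })

exec_time = timeit.timeit(create_with_exec, number=1000)
type_time = timeit.timeit(create_with_type, number=1000)
print('exec: %.4fs  type(): %.4fs' % (exec_time, type_time))
```

Both functions produce a working tuple subclass, so any difference in the timings is down to how the class is created rather than what it does.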

I asked on PythonList at python.org whether people made use of the _source 
attribute, and the overwhelming response was that they either didn't 
know it existed, or if they did know, they didn't use it.

https://mail.python.org/pipermail/python-list/2017-July/723888.html

*If* it is accurate to say that nobody uses _source, then perhaps we 
might be willing to make this minor backwards-incompatible change in 3.7 
(but not in a bug-fix release):

- Only the __new__ method is generated by exec (my rough tests
  suggest that may make namedtuple four times faster);

- _source only gives the source to __new__;

- or perhaps we can save backwards compatibility by making _source
  generate the rest of the template lazily, when needed, even if
  the entire template isn't used by exec.
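To make the first two options concrete, here is a minimal sketch of a factory in which only __new__ is generated by exec and everything else is built as ordinary Python objects. The name make_record and all of its details are hypothetical; this is not the collections implementation. It does no validation of field names and assumes at least one field.

```python
from operator import itemgetter

def make_record(typename, field_names):
    """Hypothetical factory: only __new__ is produced by exec."""
    field_names = tuple(field_names)
    arg_list = ', '.join(field_names)

    # The only source text that gets compiled is this one small function.
    new_source = (
        f'def __new__(_cls, {arg_list}):\n'
        f'    return _tuple_new(_cls, ({arg_list},))\n'
    )
    namespace = {'_tuple_new': tuple.__new__}
    exec(new_source, namespace)

    def __repr__(self):
        pairs = ', '.join(
            f'{name}={value!r}' for name, value in zip(self._fields, self)
        )
        return f'{type(self).__name__}({pairs})'

    class_namespace = {
        '__slots__': (),
        '_fields': field_names,
        '__new__': namespace['__new__'],
        '__repr__': __repr__,
        # Under this option, _source reports only what was exec'ed.
        '_source': new_source,
    }
    # Field accessors are plain properties; no source text is needed.
    for index, name in enumerate(field_names):
        class_namespace[name] = property(itemgetter(index))

    return type(typename, (tuple,), class_namespace)

Point = make_record('Point', ['x', 'y'])
p = Point(1, y=2)
print(p, p.x, p.y)
print(Point._source)
```

Generating __new__ from text keeps the real parameter names in its signature, so keyword arguments and error messages behave as they would for a hand-written class.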

That risks the *actual* source and the *reported* source getting out 
of sync. Maybe it's better to just break compatibility rather than risk 
introducing a discrepancy between the two.
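For what the lazy option might look like, here is a minimal sketch using a descriptor that builds the source text only on first access. LazySource and FULL_TEMPLATE are hypothetical names, and the template is a tiny stand-in for the real one. Note that the text it returns is reconstructed from a template and was never executed, which is exactly where the discrepancy could creep in.

```python
# Hypothetical stand-in for the full class template.
FULL_TEMPLATE = (
    'class {typename}(tuple):\n'
    '    __slots__ = ()\n'
    '    _fields = {fields!r}\n'
)

class LazySource:
    """Descriptor that renders the reported source on first access."""

    def __init__(self, template=FULL_TEMPLATE):
        self.template = template
        self._cache = {}

    def __get__(self, instance, owner):
        # Works for access via the class or via an instance.
        if owner not in self._cache:
            self._cache[owner] = self.template.format(
                typename=owner.__name__, fields=owner._fields
            )
        return self._cache[owner]

class Point(tuple):
    __slots__ = ()
    _fields = ('x', 'y')
    _source = LazySource()  # nothing is rendered until someone asks

print(Point._source)
```

If the class-building code changes and the template is not updated to match, nothing here would notice, so the reported source could quietly drift from what the class actually does.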

-- 
Steve