[Python-ideas] namedtuple literals [Was: RE a new namedtuple]
steve at pearwood.info
Mon Jul 24 09:31:19 EDT 2017
On Sun, Jul 23, 2017 at 07:47:16PM +0200, Michel Desmoulin wrote:
> I'm not sure why everybody has such a grip on the type.
> When we use regular tuples, no one cares, it's all tuples, no matter what.
Some people care.
This is one of the serious disadvantages of ordinary tuples as a
record/struct type. There's no way to distinguish between (let's say)
rectangular coordinates (1, 2) and polar coordinates (1, 2), or between
(name, age) and (movie_title, score). They're all just 2-tuples.
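To make that concrete, here is a small sketch. The names Rect and Polar are mine, chosen only for illustration:

```python
from collections import namedtuple

# Hypothetical record types, named here only for illustration.
Rect = namedtuple('Rect', 'x y')
Polar = namedtuple('Polar', 'r theta')

plain_rect = (1, 2)
plain_polar = (1, 2)
# Ordinary tuples carry no hint of what they are supposed to mean.
print(type(plain_rect) is type(plain_polar))      # True

a = Rect(1, 2)
b = Polar(1, 2)
# The named versions can be told apart by their type...
print(type(a) is type(b))                         # False
print(isinstance(a, Rect), isinstance(b, Rect))   # True False
# ...although they still compare equal, being tuples underneath.
print(a == b)                                     # True
```

So the type is the only thing that separates the two records; the values alone cannot.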
> The whole point of this is to make it a literal, simple and quick to
> use. If you make it more than it is, we already got everything to do
> this and don't need to modify the language.
I disagree: in my opinion, the whole point is to make namedtuple faster,
so that Python's startup time isn't affected so badly. Creating new
syntax for a new type of tuple is scope-creep.
Even if we had that new syntax, the problem of namedtuple slowing down
Python startup would remain. People can't use this new syntax until they
have dropped support for everything before 3.7, which might take many
years. But a fast namedtuple will give them a benefit as soon as their
users upgrade to 3.7.
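For anyone who wants to see the class-creation cost for themselves, the following is one rough way to measure it. It is only a sketch: the helper names are made up for this example, the plain tuple subclass is just a baseline I picked for comparison, and the numbers will vary with the interpreter version and the machine.

```python
import timeit
from collections import namedtuple

def make_namedtuple():
    # Creates a brand new class on every call, as a module would at import.
    return namedtuple('Point', 'x y')

def make_plain_class():
    # Baseline for comparison: an ordinary class statement.
    class Point(tuple):
        __slots__ = ()
    return Point

n = 1000
t_named = timeit.timeit(make_namedtuple, number=n)
t_plain = timeit.timeit(make_plain_class, number=n)
print('namedtuple:  %.1f us per class' % (t_named / n * 1e6))
print('plain class: %.1f us per class' % (t_plain / n * 1e6))
```

A module that defines many namedtuples at import time pays the first figure once per definition, which is where the startup concern comes from.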
I agree that there is a strong case to be made for a fast, built-in,
easy way to make record/structs without having to pre-declare them. But
as the Zen of Python says:
    Now is better than never.
    Although never is often better than *right* now.
Let's not rush into designing a poor record/struct builtin just because
we have a consensus (Raymond dissenting?) that namedtuple is too slow.
The two issues are not unrelated, but they are orthogonal. Record syntax
would still be useful even if namedtuple were accelerated, and a faster
namedtuple would still be necessary even if we had record syntax.
I believe that a couple of people (possibly including Guido?) are
already thinking about a PEP for that. If that's the case, let's wait
and see what they come up with.
In the meantime, let's get back to the original question here: how can we
make namedtuple faster?
- Guido has ruled out using a metaclass as the implementation,
  as that makes it hard to inherit from namedtuple and another
  class with a different metaclass.
- Backwards compatibility is a must.
- *But* maybe we can afford to bend backwards compatibility
  a bit. Perhaps we don't need to generate the *entire* class
  using exec, just __new__.
- I don't think that the _source attribute itself makes
  namedtuple slow. That might affect the memory usage of the
  class object itself, but it's just a name binding:
    result._source = class_definition
The expensive part is, I'm fairly sure, this:
    exec(class_definition, namespace)
(Taken from the 3.5 collections/__init__.py.)
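The metaclass objection in the first item can be shown with a short sketch. RecordMeta and Record below are hypothetical stand-ins for a metaclass-based namedtuple, not anything in the standard library; OtherMeta and Other stand for any unrelated class that has its own metaclass.

```python
# Hypothetical stand-in for a namedtuple implemented via a metaclass.
class RecordMeta(type):
    pass

class Record(tuple, metaclass=RecordMeta):
    pass

# Some unrelated class that happens to use a different metaclass.
class OtherMeta(type):
    pass

class Other(metaclass=OtherMeta):
    pass

conflict = None
try:
    # Neither metaclass is a subclass of the other, so Python cannot
    # pick one for the combined class and refuses to create it.
    class Both(Record, Other):
        pass
except TypeError as err:
    conflict = str(err)

print(conflict)
```

The user would have to write a third metaclass deriving from both to make this work, which is the inconvenience being avoided.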
I asked on PythonList at python.org whether people made use of the _source
attribute, and the overwhelming response was that they either didn't
know it existed, or if they did know, they didn't use it.
*If* it is accurate to say that nobody uses _source, then perhaps we
might be willing to make this minor backwards-incompatible change in 3.7
(but not in a bug-fix release):
- Only the __new__ method is generated by exec (my rough tests
  suggest that may make namedtuple four times faster);
- _source only gives the source to __new__;
- or perhaps we can save backwards compatibility by making _source
  generate the rest of the template lazily, when needed, even if
  the entire template isn't used by exec.
That last option risks the *actual* source and the *reported* source
getting out of sync. Maybe it's better to just break compatibility rather
than risk introducing a discrepancy between the two.
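To show what "only __new__ is generated by exec" might look like, here is a minimal sketch. It is not the collections implementation: make_record is a hypothetical factory invented for this example, it omits most of what namedtuple provides (_make, _replace, _asdict, renaming, and so on), and it assumes at least one field name.

```python
from keyword import iskeyword
from operator import itemgetter

def make_record(typename, field_names):
    """Hypothetical namedtuple-like factory; only __new__ goes through exec."""
    fields = tuple(field_names.split())
    if not fields:
        raise ValueError('this sketch needs at least one field')
    for name in (typename,) + fields:
        if not name.isidentifier() or iskeyword(name):
            raise ValueError('not a valid name: %r' % name)

    # The only generated source: a tiny __new__ with a real signature,
    # so that positional and keyword arguments are checked by Python itself.
    arg_list = ', '.join(fields)
    new_source = ('def __new__(_cls, %s):\n'
                  '    return _tuple_new(_cls, (%s,))\n' % (arg_list, arg_list))
    namespace = {'_tuple_new': tuple.__new__}
    exec(new_source, namespace)

    def __repr__(self):
        body = ', '.join('%s=%r' % pair for pair in zip(fields, self))
        return '%s(%s)' % (typename, body)

    # Everything else is built with ordinary Python, no exec involved.
    class_dict = {
        '__slots__': (),
        '_fields': fields,
        '__new__': namespace['__new__'],
        '__repr__': __repr__,
        '_source': new_source,   # reports the source of __new__ only
    }
    for index, name in enumerate(fields):
        class_dict[name] = property(itemgetter(index))
    return type(typename, (tuple,), class_dict)

Point = make_record('Point', 'x y')
p = Point(1, y=2)
print(p, p.x + p.y)      # Point(x=1, y=2) 3
print(Point._source)
```

Here _source is exactly the text that was passed to exec, so the actual and reported source cannot drift apart, at the price of no longer describing the whole class.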