Re: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]

July 24, 2017

On 24/07/2017 at 15:31, Steven D'Aprano wrote:
...
On Sun, Jul 23, 2017 at 07:47:16PM +0200, Michel Desmoulin wrote:
...
I'm not sure why everybody is so fixated on the type.
When we use regular tuples, no one cares; they're all tuples, no matter what.
Some people care.
This is one of the serious disadvantages of ordinary tuples as a 
record/struct type. There's no way to distinguish between (let's say) 
rectangular coordinates (1, 2) and polar coordinates (1, 2), or between 
(name, age) and (movie_title, score). They're all just 2-tuples.
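A minimal sketch of that point, assuming two hypothetical record types named `Rect` and `Polar` (neither name comes from this thread). Plain 2-tuples share one type, while namedtuple classes can at least be told apart by type:

```python
from collections import namedtuple

# Hypothetical record types, used only for illustration.
Rect = namedtuple("Rect", ["x", "y"])
Polar = namedtuple("Polar", ["r", "theta"])

# Plain tuples: nothing says which coordinate system is meant.
plain_rect = (1, 2)
plain_polar = (1, 2)
same_plain_type = type(plain_rect) is type(plain_polar)

# Namedtuples: the class records the intent.
rect = Rect(1, 2)
polar = Polar(1, 2)
same_named_type = type(rect) is type(polar)

# Both are still tuples underneath, so they compare equal by value.
equal_by_value = rect == polar

print(same_plain_type, same_named_type, equal_by_value)
```

Note that the two namedtuples still compare equal as tuples, so the distinction is available only to code that actually checks the type.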
You are just using my figure of speech as a way to make a counterargument.
It's not a very useful thing to do.

Of course some people care; there are always a few people who care about
any given thing.

But then you just create your namedtuple manually, or a namespace, and
are done with it.

Completely rejecting the literal syntax just because it doesn't improve
a use case that already worked, if a bit verbosely, is very
radical. Unless you have a very nice counter-proposal that makes
everyone happy, accepting the current one doesn't take anything away from you.
...
[...]
...
The whole point of this is to make it a literal, simple and quick to
use. If you make it more than that, we already have everything needed to do
this and don't need to modify the language.
I disagree: in my opinion, the whole point is to make namedtuple faster, 
so that Python's startup time isn't affected so badly. Creating new 
syntax for a new type of tuple is scope-creep.
You are in the wrong thread. This thread is specifically about
namedtuple literals. Making namedtuple faster can be done in many other
ways and doesn't require a literal syntax. A literal syntax, while
making things slightly faster by nature, is essentially about making things
faster to read and write.
...
Even if we had that new syntax, the problem of namedtuple slowing down 
Python startup would remain. People can't use this new syntax until they 
have dropped support for everything before 3.7, which might take many 
years. But a fast namedtuple will give them benefit immediately their 
users upgrade to 3.7.
Again, you are mixing the two things. This is why we have two threads: the
debate was split.
...
I agree that there is a strong case to be made for a fast, built-in, 
easy way to make record/structs without having to pre-declare them.
Do other languages have such a thing that can be checked against types?
...
But as the Zen of Python says:
    Now is better than never.
    Although never is often better than *right* now.
I agree. I don't think we need to rush it. I can live without it now. I
can live without it at all.
...
Let's not rush into designing a poor record/struct builtin just because 
we have a consensus (Raymond dissenting?) that namedtuple is too slow.
We don't. We can solve the slowness problem without having the
namedtuple literal. The literal is a convenience.
...
The two issues are, not unrelated, but orthogonal. Record syntax would 
be still useful even if namedtuple was accelerated, and faster 
namedtuple would still be necessary even if we have record syntax.
On that we agree.
...
I believe that a couple of people (possibly including Guido?) are 
already thinking about a PEP for that. If that's the case, let's wait 
and see what they come up with.
Yes, but it's about making classes less verbose, if I recall, or at least
about using the class syntax. It's nice, but not the same thing. Namedtuple
literals are far better suited to scripting. You really don't want to
write a class in quick scripts, when you do exploratory programming or
data analysis on the fly.
...
In the meantime, let's get back to the original question here: how can we 
make namedtuple faster?
Then go to the other thread for that.
...
- Guido has ruled out using a metaclass as the implementation, 
  as that makes it hard to inherit from namedtuple and another
  class with a different metaclass.
- Backwards compatibility is a must.
- *But* maybe we can afford to bend backwards compatibility 
  a bit. Perhaps we don't need to generate the *entire* class
  using exec, just __new__.
- I don't think that the _source attribute itself makes
  namedtuple slow. That might affect the memory usage of the
  class object itself, but it's just a name binding:
    result._source = class_definition
The expensive part is, I'm fairly sure, this:
    exec(class_definition, namespace)
(Taken from the 3.5 collections/__init__.py.)
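A rough sketch of the pattern being described, for anyone who wants to measure it. The template below is a heavily simplified stand-in that I wrote for illustration, not the real 3.5 template, and `build_with_exec` and `Scratch` are hypothetical names. It compares executing a generated class definition against merely binding an attribute:

```python
import timeit

# Simplified stand-in for the class template (an assumption, not CPython's).
CLASS_TEMPLATE = """
class {typename}(tuple):
    __slots__ = ()
    _fields = {fields!r}
    def __new__(_cls, {args}):
        return tuple.__new__(_cls, ({args},))
"""


def build_with_exec(typename, fields):
    """Hypothetical helper: build the whole class by exec'ing source text."""
    class_definition = CLASS_TEMPLATE.format(
        typename=typename, fields=tuple(fields), args=", ".join(fields)
    )
    namespace = {"__name__": "namedtuple_" + typename}
    exec(class_definition, namespace)      # parse, compile, and run the source
    result = namespace[typename]
    result._source = class_definition      # just a name binding
    return result


class Scratch:
    """Throwaway class used only to time a plain attribute binding."""


Point = build_with_exec("Point", ["x", "y"])

exec_seconds = timeit.timeit(
    lambda: build_with_exec("Point", ["x", "y"]), number=200
)
bind_seconds = timeit.timeit(
    lambda: setattr(Scratch, "_source", "text"), number=200
)
print(f"exec build: {exec_seconds:.6f}s, attribute binding: {bind_seconds:.6f}s")
```

The numbers will vary by machine and Python version, so treat them as a way to check the claim rather than as evidence on their own.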
I asked on PythonList@python.org whether people made use of the _source 
attribute, and the overwhelming response was that they either didn't 
know it existed, or if they did know, they didn't use it.
https://mail.python.org/pipermail/python-list/2017-July/723888.html
*If* it is accurate to say that nobody uses _source, then perhaps we 
might be willing to make this minor backwards-incompatible change in 3.7 
(but not in a bug-fix release):
- Only the __new__ method is generated by exec (my rough tests
  suggest that may make namedtuple four times faster);
- _source only gives the source to __new__;
- or perhaps we can save backwards compatibility by making _source
  generate the rest of the template lazily, when needed, even if
  the entire template isn't used by exec.
That risks getting the *actual* source and the *reported* source 
getting out of sync. Maybe it's better to just break compatibility rather 
than risk introducing a discrepancy between the two.
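To make the "only __new__ is generated by exec" option concrete, here is a minimal sketch under my own assumptions. `make_record` is a hypothetical factory, not the stdlib implementation, and it skips much of what namedtuple does (no `_replace`, `_asdict`, or duplicate-name checks). The class itself is built with `type()`, and `_source` holds only the source of `__new__`:

```python
from keyword import iskeyword
from operator import itemgetter


def make_record(typename, field_names):
    """Hypothetical factory: a tuple subclass where only __new__ comes from exec."""
    field_names = tuple(field_names)
    for name in field_names:
        if not name.isidentifier() or iskeyword(name) or name.startswith("_"):
            raise ValueError(f"invalid field name: {name!r}")

    # The only generated source text: a __new__ with a real signature,
    # so positional and keyword arguments are checked by Python itself.
    params = "".join(f", {name}" for name in field_names)
    values = "".join(f"{name}, " for name in field_names)
    new_source = (
        f"def __new__(_cls{params}):\n"
        f"    return _tuple_new(_cls, ({values}))\n"
    )
    namespace = {"_tuple_new": tuple.__new__}
    exec(new_source, namespace)

    def __repr__(self):
        pairs = ", ".join(f"{n}={v!r}" for n, v in zip(self._fields, self))
        return f"{type(self).__name__}({pairs})"

    class_namespace = {
        "__slots__": (),
        "_fields": field_names,
        "_source": new_source,             # reports only what was exec'ed
        "__new__": namespace["__new__"],
        "__repr__": __repr__,
    }
    # Field access is an ordinary property over the tuple index.
    for index, name in enumerate(field_names):
        class_namespace[name] = property(itemgetter(index))
    return type(typename, (tuple,), class_namespace)


Point = make_record("Point", ["x", "y"])
p = Point(1, y=2)
print(p, p.x, p.y)
```

Because `_source` here is exactly the text that was executed, the reported source and the actual source cannot drift apart, at the cost of no longer describing the whole class.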