
On 24/07/2017 at 15:31, Steven D'Aprano wrote:
On Sun, Jul 23, 2017 at 07:47:16PM +0200, Michel Desmoulin wrote:
I'm not sure why everybody has such a grip on the type.
When we use regular tuples, no one cares; it's all tuples, no matter what.
Some people care.
This is one of the serious disadvantages of ordinary tuples as a record/struct type. There's no way to distinguish between (let's say) rectangular coordinates (1, 2) and polar coordinates (1, 2), or between (name, age) and (movie_title, score). They're all just 2-tuples.
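To make the point about plain tuples concrete, here is a minimal sketch. The names Rect and Polar are made up for illustration and do not come from the thread. It shows that two plain 2-tuples share one type, while two namedtuple classes can be told apart by type even though they still compare equal as tuples.

```python
from collections import namedtuple

# Hypothetical record types, for illustration only.
Rect = namedtuple("Rect", ["x", "y"])
Polar = namedtuple("Polar", ["r", "theta"])

# Plain tuples: nothing distinguishes the two kinds of coordinates.
plain_rect = (1, 2)
plain_polar = (1, 2)
same_type = type(plain_rect) is type(plain_polar)

# namedtuples: the classes differ, so isinstance() can tell them apart.
rect = Rect(1, 2)
polar = Polar(1, 2)
distinct_types = type(rect) is not type(polar)

# Equality is still plain tuple equality, so the values compare equal.
still_equal = rect == polar
```

Note that the type distinction only helps code that checks it explicitly (isinstance, or a static type checker); comparison and unpacking behave exactly as they do for ordinary tuples.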
You are just using my figure of speech as a counter-argument, which is not a very useful thing to do. Of course some people care; there are always a few people who care about anything. But you can just create your namedtuple manually, or a namespace, and be done with it. Completely rejecting the literal syntax just because it doesn't improve a use case that already worked for you, if a bit verbosely, is very radical. Unless you have a very nice counter-proposal that makes everyone happy, accepting the current one doesn't take anything away from you.
[...]
The whole point of this is to make it a literal, simple and quick to use. If you make it more than that, we already have everything we need to do this and don't need to modify the language.
I disagree: in my opinion, the whole point is to make namedtuple faster, so that Python's startup time isn't affected so badly. Creating new syntax for a new type of tuple is scope-creep.
You are in the wrong thread. This thread is specifically about namedtuple literals. Making namedtuple faster can be done in many other ways and doesn't require a literal syntax. A literal syntax, while making things slightly faster by nature, is essentially about making things faster to read and write.
Even if we had that new syntax, the problem of namedtuple slowing down Python startup would remain. People can't use this new syntax until they have dropped support for everything before 3.7, which might take many years. But a fast namedtuple will give them a benefit as soon as their users upgrade to 3.7.
Again, you are mixing the two things. This is why we have two threads: the debate split.
I agree that there is a strong case to be made for a fast, built-in, easy way to make record/structs without having to pre-declare them.
Do other languages have such a thing that can be checked against types?
But as the Zen of Python says:
Now is better than never. Although never is often better than *right* now.
I agree. I don't think we need to rush it. I can live without it now. I can live without it at all.
Let's not rush into designing a poor record/struct builtin just because we have a consensus (Raymond dissenting?) that namedtuple is too slow.
We don't. We can solve the slowness problem without having the namedtuple literal. The literal is a convenience.
The two issues are not unrelated, but they are orthogonal. Record syntax would still be useful even if namedtuple was accelerated, and a faster namedtuple would still be necessary even if we had record syntax.
On that we agree.
I believe that a couple of people (possibly including Guido?) are already thinking about a PEP for that. If that's the case, let's wait and see what they come up with.
Yes, but it's about making classes less verbose, if I recall. Or at least it uses the class syntax. It's nice, but not the same thing. Namedtuple literals are far better suited to scripting. You really don't want to write a class in quick scripts, when you do exploratory programming or data analysis on the fly.
In the meantime, let's get back to the original question here: how can we make namedtuple faster?
Then go to the other thread for that.
- Guido has ruled out using a metaclass as the implementation, as that makes it hard to inherit from namedtuple and another class with a different metaclass.
- Backwards compatibility is a must.
- *But* maybe we can afford to bend backwards compatibility a bit. Perhaps we don't need to generate the *entire* class using exec, just __new__.
- I don't think that the _source attribute itself makes namedtuple slow. That might affect the memory usage of the class object itself, but it's just a name binding:
result._source = class_definition
The expensive part is, I'm fairly sure, this:
exec(class_definition, namespace)
(Taken from the 3.5 collections/__init__.py.)
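A rough way to check that claim is to time the two operations separately. The sketch below is only an approximation: the class_definition string is a cut-down stand-in for the real template in collections/__init__.py, and Holder, run_exec and bind_source are hypothetical names made up here.

```python
import timeit

# Cut-down stand-in for the namedtuple class template (an assumption,
# much shorter than the real one in collections/__init__.py).
class_definition = '''
class Point(tuple):
    __slots__ = ()
    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))
'''


def run_exec():
    # Compile and execute the class source, as namedtuple() does.
    namespace = {}
    exec(class_definition, namespace)
    return namespace["Point"]


class Holder:
    """Hypothetical class that only receives the _source binding."""


def bind_source():
    # The equivalent of result._source = class_definition.
    Holder._source = class_definition


exec_time = timeit.timeit(run_exec, number=1000)
bind_time = timeit.timeit(bind_source, number=1000)
print("exec: %.6fs   bind: %.6fs" % (exec_time, bind_time))
```

The absolute numbers depend on the machine and the Python version, so only the ratio between the two timings is meaningful.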
I asked on PythonList@python.org whether people made use of the _source attribute, and the overwhelming response was that they either didn't know it existed, or if they did know, they didn't use it.
https://mail.python.org/pipermail/python-list/2017-July/723888.html
*If* it is accurate to say that nobody uses _source, then perhaps we might be willing to make this minor backwards-incompatible change in 3.7 (but not in a bug-fix release):
- Only the __new__ method is generated by exec (my rough tests suggest that may make namedtuple four times faster);
- _source only gives the source to __new__;
- or perhaps we can save backwards compatibility by making _source generate the rest of the template lazily, when needed, even if the entire template isn't used by exec.
That risks the *actual* source and the *reported* source getting out of sync. Maybe it's better to just break compatibility rather than risk introducing a discrepancy between the two.
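For illustration, the "only __new__ is generated by exec" idea could look roughly like the sketch below. This is not the collections implementation and not a proposed patch: fast_namedtuple is a hypothetical name, and methods such as _make, _replace and _asdict are left out. It only shows the shape of the approach, with _source reporting nothing but the generated __new__.

```python
import keyword
from operator import itemgetter


def fast_namedtuple(typename, field_names):
    """Hypothetical sketch: build the class directly, exec only __new__."""
    field_names = tuple(field_names)
    for name in field_names:
        if (not name.isidentifier() or keyword.iskeyword(name)
                or name.startswith("_")):
            raise ValueError("invalid field name: %r" % (name,))

    # The only generated source: a __new__ with real named parameters.
    params = ", ".join(("_cls",) + field_names)
    values = "".join(name + ", " for name in field_names)
    new_source = (
        "def __new__(%s):\n"
        "    return _tuple_new(_cls, (%s))\n" % (params, values)
    )
    namespace = {"_tuple_new": tuple.__new__}
    exec(new_source, namespace)

    def __repr__(self):
        pairs = ", ".join(
            "%s=%r" % (name, value)
            for name, value in zip(field_names, self)
        )
        return "%s(%s)" % (typename, pairs)

    class_namespace = {
        "__slots__": (),
        "_fields": field_names,
        "_source": new_source,  # reports only the source of __new__
        "__new__": namespace["__new__"],
        "__repr__": __repr__,
    }
    # Field accessors are ordinary properties; no exec is needed for them.
    for index, name in enumerate(field_names):
        class_namespace[name] = property(itemgetter(index))
    return type(typename, (tuple,), class_namespace)


Point = fast_namedtuple("Point", ["x", "y"])
p = Point(1, y=2)
```

Because _source here is exactly the string that was passed to exec, the actual and reported source cannot drift apart, at the cost of no longer describing the whole class.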