[Python-ideas] namedtuple literals [Was: RE a new namedtuple]

Mon Jul 24 12:37:37 EDT 2017

On 24/07/2017 at 15:31, Steven D'Aprano wrote:
> On Sun, Jul 23, 2017 at 07:47:16PM +0200, Michel Desmoulin wrote:
> 
>> I'm not sure why everybody has such a grip on the type.
>>
>> When we use regular tuples, no one cares, it's all tuples, no matter what.
> 
> Some people care.
> 
> This is one of the serious disadvantages of ordinary tuples as a 
> record/struct type. There's no way to distinguish between (let's say) 
> rectangular coordinates (1, 2) and polar coordinates (1, 2), or between 
> (name, age) and (movie_title, score). They're all just 2-tuples.

You are just using my figure of speech as a way to make a
counterargument. It's not a very useful thing to do.

Of course some people care; there are always a few people who care
about anything.

But then you just create your namedtuple manually, or a namespace, and
are done with it.
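
A minimal sketch of that workaround, using only the standard library.
The names Rect, Polar and point are invented here for illustration and
do not come from the thread:

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical record types, made up for this example.
Rect = namedtuple("Rect", ["x", "y"])
Polar = namedtuple("Polar", ["r", "theta"])

# Plain tuples carry no type information beyond "tuple".
plain_rect = (1, 2)
plain_polar = (1, 2)
same_type_plain = type(plain_rect) is type(plain_polar)

# Named tuples give each kind of record its own class...
rect = Rect(1, 2)
polar = Polar(1, 2)
same_type_named = type(rect) is type(polar)

# ...but they still compare as tuples, so equal values compare equal.
equal_as_tuples = rect == polar

# A namespace is the other quick option when tuple behaviour is not needed.
point = SimpleNamespace(x=1, y=2)
```

Here same_type_plain is True and same_type_named is False, which is the
distinction people who care about the type are after. Note that
equal_as_tuples is still True, because tuple equality ignores the class.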

Rejecting the literal syntax completely, just because it doesn't improve
a use case that already worked for you but was a bit verbose, is very
radical. Unless you have a very nice counter-proposal that makes
everyone happy, accepting the current one doesn't take anything from you.

> 
> 
> [...]
>> The whole point of this is to make it a literal, simple and quick to
>> use. If you make it more than it is, we already got everything to do
>> this and don't need to modify the language.
> 
> I disagree: in my opinion, the whole point is to make namedtuple faster, 
> so that Python's startup time isn't affected so badly. Creating new 
> syntax for a new type of tuple is scope-creep.

You are in the wrong thread. This thread is specifically about
namedtuple literals. Making namedtuple faster can be done in many other
ways and doesn't require a literal syntax. A literal syntax, while
making things slightly faster by nature, is essentially about making
things faster to read and write.

> 
> Even if we had that new syntax, the problem of namedtuple slowing down 
> Python startup would remain. People can't use this new syntax until they 
> have dropped support for everything before 3.7, which might take many 
> years. But a fast namedtuple will give them benefit immediately their 
> users upgrade to 3.7.

Again, you are mixing the two things. This is why we have two threads:
the debate split.

> 
> I agree that there is a strong case to be made for a fast, built-in, 
> easy way to make record/structs without having to pre-declare them. 

Do other languages have such a thing that can be checked against types?

> But 
> as the Zen of Python says:
> 
>     Now is better than never.
>     Although never is often better than *right* now.
> 

I agree. I don't think we need to rush it. I can live without it now. I
can live without it at all.

> Let's not rush into designing a poor record/struct builtin just because 
> we have a consensus (Raymond dissenting?) that namedtuple is too slow. 

We don't. We can solve the slowness problem without having the
namedtuple literal. The literal is a convenience.

> The two issues are, not unrelated, but orthogonal. Record syntax would 
> be still useful even if namedtuple was accelerated, and faster 
> namedtuple would still be necessary even if we have record syntax.

On that we agree.

> 
> I believe that a couple of people (possibly including Guido?) are 
> already thinking about a PEP for that. If that's the case, let's wait 
> and see what they come up with.

Yes, but that one is about making classes less verbose, if I recall, or
at least it uses the class syntax. It's nice, but not the same thing.
Namedtuple literals are far better suited to scripting. You really don't
want to write a class in quick scripts, when you do exploratory
programming or data analysis on the fly.

> 
> In the meantime, let's get back to the original question here: how can we 
> make namedtuple faster?

Then go to the other thread for that.

> 
> - Guido has ruled out using a metaclass as the implementation, 
>   as that makes it hard to inherit from namedtuple and another
>   class with a different metaclass.
> 
> - Backwards compatibility is a must.
> 
> - *But* maybe we can afford to bend backwards compatibility 
>   a bit. Perhaps we don't need to generate the *entire* class
>   using exec, just __new__.
> 
> - I don't think that the _source attribute itself makes
>   namedtuple slow. That might affect the memory usage of the
>   class object itself, but it's just a name binding:
> 
>     result._source = class_definition
> 
>   The expensive part is, I'm fairly sure, this:
> 
>     exec(class_definition, namespace)
> 
> (Taken from the 3.5 collections/__init__.py.)
> 
> I asked on PythonList at python.org whether people made use of the _source 
> attribute, and the overwhelming response was that they either didn't 
> know it existed, or if they did know, they didn't use it.
> 
> https://mail.python.org/pipermail/python-list/2017-July/723888.html
> 
> 
> *If* it is accurate to say that nobody uses _source, then perhaps we 
> might be willing to make this minor backwards-incompatible change in 3.7 
> (but not in a bug-fix release):
> 
> - Only the __new__ method is generated by exec (my rough tests
>   suggest that may make namedtuple four times faster);
> 
> - _source only gives the source to __new__;
> 
> - or perhaps we can save backwards compatibility by making _source
>   generate the rest of the template lazily, when needed, even if
>   the entire template isn't used by exec.
> 
> That risks getting the *actual* source and the *reported* source 
> getting out of sync. Maybe it's better to just break compatibility rather 
> than risk introducing a discrepancy between the two.
> 
> 
>
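
To make the "only __new__ is generated by exec" idea above concrete,
here is a rough sketch. It is not the collections implementation and not
a proposed patch: make_record is a hypothetical helper invented for this
example, it assumes at least one field, and it skips most of what the
real namedtuple provides (_make, _replace, _asdict, __repr__, renaming).

```python
from keyword import iskeyword
from operator import itemgetter


def make_record(typename, field_names):
    """Hypothetical helper: build a tuple subclass, exec-ing only __new__."""
    field_names = tuple(field_names)
    if not field_names:
        raise ValueError("this sketch assumes at least one field")
    for name in field_names:
        # Validate before building source text that will be passed to exec.
        if not name.isidentifier() or iskeyword(name):
            raise ValueError(f"invalid field name: {name!r}")

    arg_list = ", ".join(field_names)
    # Only this one small function goes through exec, so that it gets a
    # real signature with named parameters. The rest of the class is
    # built with ordinary Python below.
    source = (
        f"def __new__(_cls, {arg_list}):\n"
        f"    return _tuple_new(_cls, ({arg_list},))\n"
    )
    namespace = {"_tuple_new": tuple.__new__}
    exec(source, namespace)

    class_dict = {
        "__slots__": (),
        "_fields": field_names,
        "__new__": namespace["__new__"],
    }
    # Field access by name is a read-only property over the tuple index.
    for index, name in enumerate(field_names):
        class_dict[name] = property(itemgetter(index))
    return type(typename, (tuple,), class_dict)


Point = make_record("Point", ["x", "y"])
p = Point(1, y=2)
```

With this, p.x and p.y read the two fields, and p is still an ordinary
tuple for comparison and unpacking. Whether this shape gives the speedup
mentioned above would have to be measured; the sketch only shows where
exec would still be needed.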