
Guido has decreed that namedtuple shall be reimplemented with speed in mind. I haven't timed it (I'm hoping somebody will volunteer to be the benchmark guru), but I'll offer my NamedTuple implementation from my aenum [1] library. It uses the same metaclass techniques as Enum, and offers docstring and default value support in the class-based form.
-- ~Ethan~
[1] https://pypi.python.org/pypi/aenum/1.4.5

Just FYI, typing.NamedTuple has been there for almost a year and already supports default values, methods, docstrings, etc. There is also ongoing work towards a dataclasses PEP (see https://github.com/ericvsmith/dataclasses). So I would keep the namedtuple API as it is, and focus only on performance improvements.
-- Ivan
On 18 July 2017 at 02:01, Ethan Furman <ethan@stoneleaf.us> wrote:

If we are worried about speed but want to keep the same API, I have a near drop-in replacement for collections.namedtuple that dramatically improves class and instance creation speed [1]. The only things missing from this implementation are `_source` and `verbose`, which could be dynamically computed to provide equivalent Python source. This project was originally proposed as a replacement for the standard namedtuple, but after talking to Raymond we decided the performance did not outweigh the simplicity of the existing implementation. Now that people seem more concerned with performance, I wanted to bring this up again.
[1] https://github.com/llllllllll/cnamedtuple
On Mon, Jul 17, 2017 at 8:04 PM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:

On Mon, Jul 17, 2017 at 6:01 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Guido has decreed that namedtuple shall be reimplemented with speed in mind.
FWIW, I'm sure that any changes to namedtuple will be kept as minimal as possible. Changes would be limited to the underlying implementation, and would not include the namedtuple() signature, or using metaclasses, etc. However, I don't presume to speak for Guido or Raymond. :) -eric

On 18 July 2017 at 14:31, Guido van Rossum <guido@python.org> wrote:
In that vein, something I'll note that *wasn't* historically possible due to the lack of keyword argument order preservation is an implementation that implicitly defines anonymous named tuple types based on the received keyword arguments. Given Python 3.6+ though, this works:
    from collections import namedtuple

    def _make_named_tuple(*fields):
        cls_name = "_ntuple_" + "_".join(fields)
        # Use the module globals as a cache for pickle compatibility
        namespace = globals()
        try:
            return namespace[cls_name]
        except KeyError:
            cls = namedtuple(cls_name, fields)
            return namespace.setdefault(cls_name, cls)

    def ntuple(**items):
        cls = _make_named_tuple(*items)
        return cls(*items.values())

    >>> p1 = ntuple(x=1, y=2)
    >>> p2 = ntuple(x=4, y=5)
    >>> type(p1) is type(p2)
    True
    >>> type(p1)
    <class '__main__._ntuple_x_y'>
That particular approach isn't *entirely* pickle friendly (since unpickling will still fail if a suitable type hasn't been defined in the destination process yet), but you can fix that by playing games with setting cls.__qualname__ to refer to an instance of a custom class that splits "_ntuple_*" back out into the component field names in __getattr__ and then calls _make_named_tuple, rather than relying solely on a factory function as I have done here.
However, it also isn't all that hard to imagine a literal syntax instead using a dedicated builtin type factory (perhaps based on structseq) that implicitly produced types that knew to rely on the appropriate builtin to handle instance creation on unpickling. The hardest part of the problem (preserving the keyword argument order) was already addressed in 3.6.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
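One consequence of the sketch above may be worth spelling out: because the type is derived from the keyword order, the same fields passed in a different order yield a different type. The following is a minimal self-contained variant, assuming a private dict (the hypothetical `_cache`) in place of the module globals, which gives up the pickle-by-name lookup in exchange for being easy to run in isolation:

```python
from collections import namedtuple

# Hypothetical cache for illustration only; the sketch in the message above
# uses globals() so that pickle can find the generated classes by name.
_cache = {}

def _make_named_tuple(*fields):
    # Key on the ordered field names; relies on Python 3.6+ keyword ordering.
    try:
        return _cache[fields]
    except KeyError:
        cls = namedtuple("_ntuple_" + "_".join(fields), fields)
        return _cache.setdefault(fields, cls)

def ntuple(**items):
    cls = _make_named_tuple(*items)
    return cls(*items.values())

p1 = ntuple(x=1, y=2)
p2 = ntuple(x=4, y=5)
p3 = ntuple(y=2, x=1)

same_type = type(p1) is type(p2)           # same fields, same order
different_type = type(p1) is not type(p3)  # same fields, different order
```

Here `p1` and `p3` hold the same values under the same names, yet compare unequal as tuples because their positional layouts differ.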

On Tue, Jul 18, 2017 at 8:56 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
It is until you try to subclass with another metaclass -- then you have a metaclass conflict. If the namedtuple had no metaclass this would not be a conflict. (This is one reason to love class decorators.) -- --Guido van Rossum (python.org/~guido)
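The metaclass conflict described here can be reproduced in a few lines. This is only an illustrative sketch: `TupleMeta`, `Record`, and `Interface` are hypothetical names, with `TupleMeta` standing in for whatever metaclass a metaclass-based named tuple might use.

```python
from abc import ABCMeta

class TupleMeta(type):
    """Hypothetical metaclass standing in for one a named tuple might use."""

class Record(tuple, metaclass=TupleMeta):
    pass

class Interface(metaclass=ABCMeta):
    pass

# Neither metaclass is a subclass of the other, so Python cannot pick one.
try:
    class Both(Record, Interface):
        pass
except TypeError as exc:
    conflict = str(exc)
else:
    conflict = None

# The usual workaround: derive a metaclass from both and name it explicitly.
class CombinedMeta(TupleMeta, ABCMeta):
    pass

class Both(Record, Interface, metaclass=CombinedMeta):
    pass
```

If `Record` had no custom metaclass, the first `class Both` statement would succeed without any workaround, which is the point being made about keeping namedtuple metaclass-free.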

On Tue, Jul 18, 2017 at 3:18 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Newer versions of the typing module do this: https://docs.python.org/3/library/typing.html#typing.NamedTuple (and indeed it's done with a metaclass). -- --Guido van Rossum (python.org/~guido)
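For readers who have not seen the class-based form, here is a short sketch of what typing.NamedTuple allows (default values, a docstring, and methods). The `Point` class and its `norm1` method are made up for illustration; the sketch assumes Python 3.6.1 or later.

```python
from typing import NamedTuple

class Point(NamedTuple):
    """A point in 2D space."""
    x: float
    y: float = 0.0  # fields with defaults must follow fields without

    def norm1(self) -> float:
        # Ordinary methods can be defined alongside the fields.
        return abs(self.x) + abs(self.y)

p = Point(3.0)
print(p, p.norm1(), Point.__doc__)
```

The result is still a tuple subclass, so unpacking and indexing work as they do for collections.namedtuple.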

On Mon, Jul 17, 2017 at 05:01:58PM -0700, Ethan Furman wrote:
With respect Ethan, if you're going to offer up NamedTuple as a faster version of namedtuple, you should at least do a quick proof of concept to demonstrate that it actually *is* faster. Full benchmarking can wait, but you should be able to do at least something like:
    python3 -m timeit --setup "from collections import namedtuple" \
        "K = namedtuple('K', 'a b c')"
versus
    python3 -m timeit --setup "from aenum import NamedTuple" \
        "K = NamedTuple('K', 'a b c')"
(or whatever the interface is). If there's only a trivial speed up, or if it's slower, then there's no point even considering it unless you speed it up first.
-- Steve
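The same quick comparison can be scripted with the timeit module rather than the command line. This is a rough sketch, not a rigorous benchmark: `time_class_creation` is a hypothetical helper, the repeat counts are arbitrary, and aenum is only timed if it happens to be installed.

```python
import timeit
from collections import namedtuple

def time_class_creation(factory, number=200, repeat=3):
    """Return the best per-call time, in microseconds, of building a 3-field class."""
    timer = timeit.Timer(lambda: factory("K", "a b c"))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6

results = {"collections.namedtuple": time_class_creation(namedtuple)}

try:
    from aenum import NamedTuple  # third-party; skipped if not installed
except ImportError:
    pass
else:
    results["aenum.NamedTuple"] = time_class_creation(NamedTuple)

for name, usec in results.items():
    print(f"{name}: {usec:.1f} usec per class creation")
```

Taking the minimum over several repeats follows the usual timeit advice, since higher readings mostly reflect interference from other processes.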

On 07/17/2017 06:34 PM, Steven D'Aprano wrote:
On Mon, Jul 17, 2017 at 05:01:58PM -0700, Ethan Furman wrote:
I suck at benchmarking, so thank you for providing those quick-and-dirty hints.
collections.namedtuple: 546 usec
aenum NamedTuple: 250 usec
So it seems to be faster! :) It is also namedtuple compatible, except for the _source attribute.
-- ~Ethan~

Can you try across a range of tuple sizes? E.g. what about with 100 items? 1000?
On Jul 17, 2017 7:56 PM, "Ethan Furman" <ethan@stoneleaf.us> wrote:
On 07/17/2017 06:34 PM, Steven D'Aprano wrote:

On 07/17/2017 09:06 PM, David Mertz wrote:
Can you try across a range of tuple sizes? E.g. what about with 100 items? 1000?
I tried 26 and 52 (which really seems unlikely for a named tuple), and NamedTuple was consistently faster by about 250 usec. Using a metaclass is off the table, but this is still interesting data for me. :) -- ~Ethan~
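A sweep over field counts like the one asked for could look roughly like this. It is a sketch for collections.namedtuple only; `creation_time_usec` is a hypothetical helper, and the sizes and repeat counts are arbitrary choices rather than the ones used in the thread.

```python
import timeit
from collections import namedtuple

def creation_time_usec(n_fields, number=50, repeat=3):
    """Best per-call time, in microseconds, of creating a class with n_fields fields."""
    field_names = [f"f{i}" for i in range(n_fields)]
    timer = timeit.Timer(lambda: namedtuple("K", field_names))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6

timings = {n: creation_time_usec(n) for n in (3, 26, 52, 100)}
for n, usec in timings.items():
    print(f"{n:>4} fields: {usec:8.1f} usec")
```

Swapping in another factory in place of `namedtuple` would show whether a fixed offset (such as the roughly 250 usec reported above) holds as the field count grows.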

In the other thread, I had mentioned my "extradict" implementation. It does have quite a few differences, as it did not try to match the namedtuple API, but it works nicely for all common use cases. These are the timeit timings:
    (env) [gwidion@caylus ]$ python3 -m timeit --setup "from collections import namedtuple" "K = namedtuple('K', 'a b c')"
    1000 loops, best of 3: 362 usec per loop
    (env) [gwidion@caylus ]$ python3 -m timeit --setup "from extradict import namedtuple" "K = namedtuple('K', 'a b c')"
    10000 loops, best of 3: 20 usec per loop
    (env) [gwidion@caylus ]$ python3 -m timeit --setup "from extradict import fastnamedtuple as namedtuple" "K = namedtuple('K', 'a b c')"
    10000 loops, best of 3: 21 usec per loop
Source at: https://github.com/jsbueno/extradict/blob/master/extradict/extratuple.py
On 17 July 2017 at 22:34, Steven D'Aprano <steve@pearwood.info> wrote:

participants (10)
- David Mertz
- Eric Snow
- Ethan Furman
- Greg Ewing
- Guido van Rossum
- Ivan Levkivskyi
- Joao S. O. Bueno
- Joseph Jevnik
- Nick Coghlan
- Steven D'Aprano