[Python-Dev] Impact of Namedtuple on startup time
Steven D'Aprano
steve at pearwood.info
Mon Jul 17 12:45:20 EDT 2017
On Mon, Jul 17, 2017 at 02:43:19PM +0200, Antoine Pitrou wrote:
>
> Hello,
>
> Cost of creating a namedtuple has been identified as a contributor to
> Python startup time. Not only Python core and the stdlib, but any
> third-party library creating namedtuple classes (there are many of
> them). An issue was created for this:
> https://bugs.python.org/issue28638
Some time ago, I needed to backport a version of namedtuple to Python
2.4, so I started with Raymond's recipe on Activestate and modified it
to only exec the code needed for __new__. The rest of the class is an
ordinary inner class:
    # a short sketch
    def namedtuple(...):
        class Inner(tuple):
            ...
        exec(source, ns)
        Inner.__new__ = ns['__new__']
        return Inner
Here's my fork of Raymond's recipe:
https://code.activestate.com/recipes/578918-yet-another-namedtuple/
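To make the idea concrete, here is a minimal runnable sketch of the same approach. It is not the recipe linked above: the validation, the _asdict and __repr__ methods, and the use of property/itemgetter for field access are assumptions made for illustration. The only generated source is the __new__ function, since its signature has to name the fields; everything else is an ordinary class body.

```python
import keyword
from operator import itemgetter


def namedtuple(typename, field_names):
    """Illustrative sketch: only __new__ is built with exec."""
    if isinstance(field_names, str):
        field_names = field_names.replace(',', ' ').split()
    field_names = tuple(field_names)

    # The names end up inside exec'd source, so validate them first.
    for name in (typename,) + field_names:
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError('invalid name: %r' % (name,))
    if len(set(field_names)) != len(field_names):
        raise ValueError('duplicate field names')

    # An ordinary inner class: no code generation needed for these parts.
    class Inner(tuple):
        __slots__ = ()
        _fields = field_names

        def __repr__(self):
            args = ', '.join('%s=%r' % pair
                             for pair in zip(self._fields, self))
            return '%s(%s)' % (type(self).__name__, args)

        def _asdict(self):
            return dict(zip(self._fields, self))

    # Only __new__ needs generated source, so that it accepts the
    # fields as real positional and keyword parameters.
    params = ', '.join(('cls',) + field_names)
    values = ', '.join(field_names)
    source = ('def __new__(%s):\n'
              '    return tuple.__new__(cls, [%s])\n' % (params, values))
    ns = {}
    exec(source, ns)
    Inner.__new__ = ns['__new__']

    # Field access by name, one property per index.
    for index, name in enumerate(field_names):
        setattr(Inner, name, property(itemgetter(index)))

    Inner.__name__ = Inner.__qualname__ = typename
    return Inner
```

Used like the std lib version, `P = namedtuple('P', 'x y')` followed by `P(1, y=2)` gives a tuple equal to `(1, 2)` with `.x` and `.y` attributes.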
Out of curiosity, I took that recipe, updated it to work in Python 3,
and compared it to the std lib version. Here are some representative
timings:
[steve at ando ~]$ python3.5 -m timeit -s "from collections import namedtuple" "K = namedtuple('K', 'a b c')"
1000 loops, best of 3: 1.02 msec per loop
[steve at ando ~]$ python3.5 -m timeit -s "from nt3 import namedtuple" "K = namedtuple('K', 'a b c')"
1000 loops, best of 3: 255 usec per loop
I think that shows this approach is viable and can lead to a big
speedup: roughly 4x in this test (1.02 msec versus 255 usec).
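The same kind of measurement can be scripted with the timeit module instead of the command line. This is only a sketch: best_usec is a hypothetical helper name, and it times just the std lib namedtuple, since nt3 is a local module that is not generally available. To compare another implementation, pass a different setup string.

```python
import timeit


def best_usec(stmt, setup, number=200, repeat=3):
    """Best per-loop time in microseconds, similar to `python -m timeit`."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e6


stdlib_usec = best_usec("K = namedtuple('K', 'a b c')",
                        "from collections import namedtuple")
print("stdlib namedtuple: %.1f usec per loop" % stdlib_usec)
```

Absolute numbers depend heavily on the machine and the Python version, so only the ratio between two implementations measured in the same run is meaningful.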
I don't think that merely dropping the _source attribute will save much
time. It might save a bit of memory, but in my experiments dropping it
only saves about 10µs more. I think the real bottleneck is the cost of
exec'ing the entire class.
--
Steve