<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">2017-07-17 9:45 GMT-07:00 Steven D'Aprano <span dir="ltr"><<a href="mailto:steve@pearwood.info" target="_blank">steve@pearwood.info</a>></span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-">On Mon, Jul 17, 2017 at 02:43:19PM +0200, Antoine Pitrou wrote:<br>
><br>
> Hello,<br>
><br>
> Cost of creating a namedtuple has been identified as a contributor to<br>
> Python startup time. Not only Python core and the stdlib, but any<br>
> third-party library creating namedtuple classes (there are many of<br>
> them). An issue was created for this:<br>
> <a href="https://bugs.python.org/issue28638" rel="noreferrer" target="_blank">https://bugs.python.org/<wbr>issue28638</a><br>
<br>
</span>Some time ago, I needed to backport a version of namedtuple to Python<br>
2.4, so I started with Raymond's recipe on Activestate and modified it<br>
to only exec the code needed for __new__. The rest of the class is an<br>
ordinary inner class:<br>
<br>
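# a short sketch<br>
def namedtuple(...):<br>
&nbsp;&nbsp;&nbsp;&nbsp;class Inner(tuple):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;...<br>
&nbsp;&nbsp;&nbsp;&nbsp;exec(source, ns)<br>
&nbsp;&nbsp;&nbsp;&nbsp;Inner.__new__ = ns['__new__']<br>
&nbsp;&nbsp;&nbsp;&nbsp;return Inner<br>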
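<br>
For illustration, the sketch above might be fleshed out roughly as follows. This is a minimal sketch, not the recipe or the stdlib code: the validation, the __repr__ and _asdict bodies, and the property-based accessors are assumptions made to keep it self-contained. Only the source for __new__ goes through exec; everything else is an ordinary class body.<br>

```python
from operator import itemgetter


def namedtuple(typename, field_names):
    """Build a tuple subclass, exec'ing only the source for __new__."""
    fields = tuple(field_names.replace(',', ' ').split())
    for name in (typename,) + fields:
        if not name.isidentifier():
            raise ValueError('not a valid identifier: %r' % (name,))

    class Inner(tuple):
        # Everything here is ordinary code compiled once with this module;
        # nothing in the class body is generated at call time.
        __slots__ = ()
        _fields = fields

        def __repr__(self):
            pairs = ', '.join('%s=%r' % item
                              for item in zip(self._fields, self))
            return '%s(%s)' % (type(self).__name__, pairs)

        def _asdict(self):
            return dict(zip(self._fields, self))

    # Only __new__ needs generated source, so that its signature carries
    # the field names as real parameters (positional or keyword).
    params = ', '.join(('cls',) + fields)
    values = ', '.join(fields)
    source = ('def __new__(%s):\n'
              '    return tuple.__new__(cls, [%s])\n' % (params, values))
    ns = {}
    exec(source, ns)
    # staticmethod mirrors what a class body does implicitly for __new__.
    Inner.__new__ = staticmethod(ns['__new__'])

    # Field accessors are plain properties over the tuple index.
    for index, name in enumerate(fields):
        setattr(Inner, name, property(itemgetter(index)))

    Inner.__name__ = Inner.__qualname__ = typename
    return Inner


K = namedtuple('K', 'a b c')
k = K(1, 2, c=3)
print(k)  # K(a=1, b=2, c=3)
```

The generated source is two lines regardless of how many methods the class has, which is the point of the approach: the cost of compiling source at call time no longer scales with the size of the class.<br>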
<br>
<br>
Here's my fork of Raymond's recipe:<br>
<br>
<a href="https://code.activestate.com/recipes/578918-yet-another-namedtuple/" rel="noreferrer" target="_blank">https://code.activestate.com/<wbr>recipes/578918-yet-another-<wbr>namedtuple/</a><br>
<br>
<br>
Out of curiosity, I took that recipe, updated it to work in Python 3,<br>
and compared it to the std lib version. Here are some representative<br>
timings:<br>
<br>
[steve@ando ~]$ python3.5 -m timeit -s "from collections import<br>
namedtuple" "K = namedtuple('K', 'a b c')"<br>
1000 loops, best of 3: 1.02 msec per loop<br>
<br>
[steve@ando ~]$ python3.5 -m timeit -s "from nt3 import namedtuple" "K =<br>
namedtuple('K', 'a b c')"<br>
1000 loops, best of 3: 255 usec per loop<br>
<br>
<br>
I think that proves that this approach is viable and can lead to a big<br>
speed up.<br>
<br></blockquote><div>I have an open pull request implementing this approach: <a href="https://github.com/python/cpython/pull/2736">https://github.com/python/cpython/pull/2736</a>. We can discuss the exact form the code should take there (Ivan already added some good suggestions).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
I don't think that merely dropping the _source attribute will save much<br>
time. It might save a bit of memory, but in my experiments dropping it<br>
only saves about 10µs more. I think the real bottleneck is the cost of<br>
exec'ing the entire class.<br>
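<br>
One rough way to check that claim is to time exec of a whole class definition against exec of just a __new__ function. The snippet below is only a sketch: both source strings are simplified stand-ins written for this example, not the actual stdlib template, and the helper name time_exec is hypothetical. Absolute numbers will vary by machine and Python version.<br>

```python
import timeit

# Simplified stand-in for a fully generated namedtuple class (assumption:
# far shorter than the real stdlib template).
CLASS_SOURCE = '''
class K(tuple):
    __slots__ = ()
    _fields = ('a', 'b', 'c')

    def __new__(cls, a, b, c):
        return tuple.__new__(cls, (a, b, c))

    def __repr__(self):
        return 'K(a=%r, b=%r, c=%r)' % self

    a = property(lambda self: self[0])
    b = property(lambda self: self[1])
    c = property(lambda self: self[2])
'''

# The only piece that has to be generated under the proposed approach.
NEW_SOURCE = '''
def __new__(cls, a, b, c):
    return tuple.__new__(cls, (a, b, c))
'''


def time_exec(source, number=200, repeat=3):
    """Return the best per-call time, in seconds, for exec'ing source."""
    def run():
        # A fresh namespace each time, as a class factory would use.
        exec(source, {'__name__': 'demo'})
    return min(timeit.repeat(run, number=number, repeat=repeat)) / number


full = time_exec(CLASS_SOURCE)
only_new = time_exec(NEW_SOURCE)
print('whole class: %.1f usec' % (full * 1e6))
print('__new__ only: %.1f usec' % (only_new * 1e6))
```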
<span class="gmail-HOEnZb"><font color="#888888"><br>
<br>
<br>
--<br>
Steve<br>
</font></span><div class="gmail-HOEnZb"><div class="gmail-h5">
</div></div></blockquote></div><br></div></div>