<br><br><div class="gmail_quote">On Tue, Feb 7, 2012 at 18:08, Antoine Pitrou <span dir="ltr">&lt;<a href="mailto:solipsis@pitrou.net">solipsis@pitrou.net</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div class="im">On Tue, 7 Feb 2012 17:16:18 -0500<br>

Brett Cannon &lt;<a href="mailto:brett@python.org">brett@python.org</a>&gt; wrote:<br>

&gt;<br>

&gt; &gt; &gt; IOW I really do not look forward to someone saying &quot;importlib is so much<br>

&gt; &gt; &gt; slower at importing a module containing ``pass``&quot; when (a) that never<br>

&gt; &gt; &gt; happens, and (b) most programs do not spend their time importing but<br>

&gt; &gt; &gt; instead doing interesting work.<br>

&gt; &gt;<br>

&gt; &gt; Well, import time is so important that the Mercurial developers have<br>

&gt; &gt; written an on-demand import mechanism, to reduce the latency of<br>

&gt; &gt; command-line operations.<br>

&gt; &gt;<br>

&gt;<br>

&gt; Sure, but they are a somewhat extreme case.<br>

<br>

</div>I don&#39;t think Mercurial is extreme. Any command-line tool written in<br>

Python applies. For example, yum (Fedora&#39;s apt-get) is written in<br>

Python. And I&#39;m sure many people do small administration scripts in<br>

Python. These tools may then be run in a loop by whatever other script.<br>

<div class="im"><br>

&gt; &gt; But it&#39;s not only important for Mercurial and the like. Even if you&#39;re<br>

&gt; &gt; developing a Web app, making imports slower will make restarts slower,<br>

&gt; &gt; and development more tedious in the first place.<br>

&gt; &gt;<br>

&gt; &gt;<br>

&gt; Fine, startup cost from a hard crash I can buy when you are getting 1000<br>

&gt; QPS, but development more tedious?<br>

<br>

</div>Well, waiting several seconds when reloading a development server is<br>

tedious. Anyway, my point was that cases other than command-line<br>

tools can also be negatively impacted by import time.<br>

<div class="im"><br>

&gt; &gt; &gt; So, if there is going to be some baseline performance target I need to hit<br>

&gt; &gt; &gt; to make people happy I would prefer to know what that (real-world)<br>

&gt; &gt; &gt; benchmark is and what the performance target is going to be on a non-debug<br>

&gt; &gt; &gt; build.<br>

&gt; &gt;<br>

&gt; &gt; - No significant slowdown in startup time.<br>

&gt; &gt;<br>

&gt;<br>

&gt; What&#39;s significant and measuring what exactly? I mean startup already has a<br>

&gt; ton of imports as it is, so this would wash out the point of measuring<br>

&gt; practically anything else for anything small.<br>

<br>

</div>I don&#39;t understand your sentence. Yes, startup has a ton of imports and<br>

that&#39;s why I&#39;m fearing it may be negatively impacted :)<br>

<br>

(&quot;a ton&quot; being a bit less than 50 currently)<br></blockquote><div><br></div><div>So you want less than a 50% startup cost on the standard startup benchmarks?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div class="im"><br>

&gt; This is why I said I want a<br>

&gt; benchmark to target which does actual work since flat-out startup time<br>

&gt; measures nothing meaningful but busy work.<br>

<br>

</div>&quot;Actual work&quot; can be very small in some cases. For example, if you run<br>

&quot;hg branch&quot; I&#39;m quite sure it doesn&#39;t do a lot of work except importing<br>

many modules and then reading a single file in .hg (the one named<br>

&quot;.hg/branch&quot; probably, but I&#39;m not a Mercurial dev).<br>

<br>

In the absence of more &quot;real world&quot; benchmarks, I think the startup<br>

benchmarks in the benchmarks repo are a good baseline.<br>

<br>

That said you could also install my 3.x port of Twisted here:<br>

<a href="https://bitbucket.org/pitrou/t3k/" target="_blank">https://bitbucket.org/pitrou/t3k/</a><br>

<br>

and then run e.g. &quot;python3 bin/trial -h&quot;.<br>

<div class="im"><br>

&gt; I would get more out of code<br>

&gt; that just stat&#39;ed every file in Lib since at least that did some work.<br>

<br>

</div>stat()ing files is not really representative of import work. There are<br>

many indirections in the import machinery.<br>

(actually, even import.c appears quite a bit slower than a bunch of stat()<br>

calls would imply)<br>

<div class="im"><br>

&gt; &gt; - Within 25% of current performance when importing, say, the &quot;struct&quot;<br>

&gt; &gt;  module (Lib/struct.py) from bytecode.<br>

&gt; &gt;<br>

&gt;<br>

&gt; Why struct? It&#39;s such a small module that it isn&#39;t really a typical module.<br>

<br>

</div>Precisely to measure the overhead. Typical module size will vary<br>

depending on development style. Some people may prefer writing many<br>

small modules. Or they may be using many small libraries, or using<br>

libraries that have adopted such a development style.<br>

<br>

Measuring the overhead on small modules will make sure we aren&#39;t overly<br>

confident.<br>

<div class="im"><br>

&gt; The median file size of Lib is 11K (e.g. tabnanny.py), not 238 bytes (which<br>

&gt; is barely past Hello World). And is this just importing struct or is this<br>

&gt; from startup, e.g. ``python -c &quot;import struct&quot;``?<br>

<br>

</div>Just importing struct, as with the timeit snippets in the other thread.</blockquote><div><br></div><div>OK, so less than a 25% slowdown when importing a very small module with pre-existing bytecode.</div><div>


<br></div><div>And here I was worrying you were going to suggest easy goals to reach for. ;)</div></div>
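<div><br></div><div>For reference, here is a rough sketch of how the two measurements discussed above could be taken: the in-process cost of importing a small module such as struct, and the whole-interpreter startup cost of ``python -c &quot;import struct&quot;``. These are not the timeit snippets from the other thread. The helper names time_cached_import and time_startup are made up for illustration, and the repeat counts are arbitrary.</div>

```python
import importlib
import subprocess
import sys
import time


def time_cached_import(name, repeat=50):
    """Return the best-of-`repeat` time (seconds) to import `name` in-process.

    The module is dropped from sys.modules before each run so the import
    machinery does the full lookup and load again instead of hitting the
    module cache. Bytecode is normally already on disk after the warm-up.
    """
    importlib.import_module(name)  # warm-up import
    best = float("inf")
    for _ in range(repeat):
        sys.modules.pop(name, None)
        start = time.perf_counter()
        importlib.import_module(name)
        best = min(best, time.perf_counter() - start)
    return best


def time_startup(code, repeat=5):
    """Return the best-of-`repeat` wall time (seconds) of `python -c code`.

    This includes process creation and interpreter startup, so it is the
    number a command-line user would actually feel.
    """
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print("import struct (in-process):", time_cached_import("struct"))
    print("python -c pass:", time_startup("pass"))
    print("python -c 'import struct':", time_startup("import struct"))
```

<div>Taking the best of several runs keeps noise from other processes out of the comparison. A 25% target would then be checked by running the same script against both import implementations on a non-debug build.</div>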