<br><br><div class="gmail_quote">On Tue, Feb 7, 2012 at 18:08, Antoine Pitrou <span dir="ltr"><<a href="mailto:solipsis@pitrou.net">solipsis@pitrou.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">On Tue, 7 Feb 2012 17:16:18 -0500<br>
Brett Cannon <<a href="mailto:brett@python.org">brett@python.org</a>> wrote:<br>
><br>
> > > IOW I really do not look forward to someone saying "importlib is so much<br>
> > > slower at importing a module containing ``pass``" when (a) that never<br>
> > > happens, and (b) most programs do not spend their time importing but<br>
> > > instead doing interesting work.<br>
> ><br>
> > Well, import time is so important that the Mercurial developers have<br>
> > written an on-demand import mechanism, to reduce the latency of<br>
> > command-line operations.<br>
> ><br>
><br>
> Sure, but they are a somewhat extreme case.<br>
<br>
</div>I don't think Mercurial is extreme. The same applies to any command-line<br>
tool written in Python. For example, yum (Fedora's apt-get) is written in<br>
Python. And I'm sure many people write small administration scripts in<br>
Python. These tools may then be run in a loop by some other script.<br>
<div class="im"><br>
> > But it's not only important for Mercurial and the like. Even if you're<br>
> > developing a Web app, making imports slower will make restarts slower,<br>
> > and development more tedious in the first place.<br>
> ><br>
> ><br>
> Fine, startup cost from a hard crash I can buy when you are getting 1000<br>
> QPS, but development more tedious?<br>
<br>
</div>Well, waiting several seconds when reloading a development server is<br>
tedious. Anyway, my point was that cases other than command-line<br>
tools can also be negatively impacted by import time.<br>
<div class="im"><br>
> > > So, if there is going to be some baseline performance target I need to<br>
> > hit<br>
> > > to make people happy I would prefer to know what that (real-world)<br>
> > > benchmark is and what the performance target is going to be on a<br>
> > non-debug<br>
> > > build.<br>
> ><br>
> > - No significant slowdown in startup time.<br>
> ><br>
><br>
> What's significant and measuring what exactly? I mean startup already has a<br>
> ton of imports as it is, so this would wash out the point of measuring<br>
> practically anything else for anything small.<br>
<br>
</div>I don't understand your sentence. Yes, startup has a ton of imports and<br>
that's why I fear it may be negatively impacted :)<br>
<br>
("a ton" being a bit less than 50 currently)<br></blockquote><div><br></div><div>So you want less than a 50% startup cost on the standard startup benchmarks?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im"><br>
> This is why I said I want a<br>
> benchmark to target which does actual work since flat-out startup time<br>
> measures nothing meaningful but busy work.<br>
<br>
</div>"Actual work" can be very small in some cases. For example, if you run<br>
"hg branch" I'm quite sure it doesn't do a lot of work except importing<br>
many modules and then reading a single file in .hg (the one named<br>
".hg/branch" probably, but I'm not a Mercurial dev).<br>
<br>
In the absence of more "real world" benchmarks, I think the startup<br>
benchmarks in the benchmarks repo are a good baseline.<br>
<br>
That said you could also install my 3.x port of Twisted here:<br>
<a href="https://bitbucket.org/pitrou/t3k/" target="_blank">https://bitbucket.org/pitrou/t3k/</a><br>
<br>
and then run e.g. "python3 bin/trial -h".<br>
<div class="im"><br>
> I would get more out of code<br>
> that just stat'ed every file in Lib since at least that did some work.<br>
<br>
</div>stat()ing files is not really representative of import work. There are<br>
many indirections in the import machinery.<br>
(actually, even import.c appears quite a bit slower than a bunch of stat()<br>
calls would imply)<br>
<div class="im"><br>
> > - Within 25% of current performance when importing, say, the "struct"<br>
> > module (Lib/struct.py) from bytecode.<br>
> ><br>
><br>
> Why struct? It's such a small module that it isn't really a typical module.<br>
<br>
</div>Precisely to measure the overhead. Typical module size will vary<br>
depending on development style. Some people may prefer writing many<br>
small modules. Or they may be using many small libraries, or using<br>
libraries that have adopted such a development style.<br>
<br>
Measuring the overhead on small modules will make sure we aren't overly<br>
confident.<br>
<div class="im"><br>
> The median file size of Lib is 11K (e.g. tabnanny.py), not 238 bytes (which<br>
> is barely past Hello World). And is this just importing struct or is this<br>
> from startup, e.g. ``python -c "import struct"``?<br>
<br>
</div>Just importing struct, as with the timeit snippets in the other thread.</blockquote><div><br></div><div> OK, so less than 25% slowdown when importing a module with pre-existing bytecode that is very small.</div><div>
<br></div><div>And here I was worrying you were going to suggest easy goals to reach for. ;)</div></div>
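<div><br></div><div>For concreteness, the small-module target could be checked with something along these lines. This is only a minimal sketch, not the timeit snippets from the other thread: the helper names <code>time_import</code> and <code>within_budget</code> are made up for illustration, and it assumes that dropping the module from <code>sys.modules</code> is enough to force a fresh import that can reuse cached bytecode where it is available.</div>

```python
import importlib
import sys
import timeit


def time_import(name, repeat=5, number=200):
    """Return the best observed time, in seconds, for one re-import of `name`.

    Hypothetical helper for illustration only.
    """
    # Warm-up import, so later imports can reuse cached bytecode where available.
    importlib.import_module(name)

    def reimport():
        # Forget the module so the import system has to find and load it again.
        sys.modules.pop(name, None)
        importlib.import_module(name)

    best = min(timeit.repeat(reimport, repeat=repeat, number=number))
    return best / number


def within_budget(baseline, candidate, budget=0.25):
    """True if `candidate` is at most `budget` (as a fraction) slower than `baseline`."""
    return candidate <= baseline * (1.0 + budget)


if __name__ == "__main__":
    per_import = time_import("struct")
    print(f"struct: {per_import * 1e6:.1f} usec per import")
```

<div>Running the script under both import implementations and passing the two numbers to <code>within_budget</code> gives a rough yes/no answer for the 25% figure; it says nothing about overall startup time, which would need a separate measurement.</div>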