[Python-Dev] Python 3.3 vs. Python 2.7 benchmark results (again, but this time more solid numbers)
Stefan Behnel
stefan_ml at behnel.de
Tue Oct 30 07:47:19 CET 2012
Tim Delaney, 28.10.2012 20:48:
> On 28 October 2012 18:22, Stefan Behnel wrote:
>>> How much of an effect would it have on startup times and these
>>> benchmarks if Cython-compiled extensions were used?
>>
>> Depends on what and how much code you use. If you compile everything into
>> one big module that "imports" all of the stdlib when it gets loaded, you'd
>> likely lose a lot of time because it would take a while to initialise all
>> that useless code on startup. If you keep it separate, it would likely be a
>> lot faster because you avoid the interpreter for most of the module startup.
>
> I was specifically thinking in terms of the tests Brett ran (that was the
> full set on speed.python.org, wasn't it?), and having each stdlib module be
> its own extension i.e. no big import module. A literal 1:1 replacement
> where possible.
There's also an intermediate solution of linking the top-N modules into the
interpreter core and leaving the rest outside, but I'd rather go for the
straightforward approach of having separate libs first.
Compiling all that can be compiled is easy enough. I fixed up a couple of
things in Cython (so you need the latest github master) and then ran this
setup.py script from the Lib directory with "build_ext -i":
"""
import sys
from distutils.core import setup

from Cython.Build import cythonize
from Cython.Compiler import Options

# improve Python compatibility by allowing some broken code
Options.error_on_unknown_names = False

setup(
    name='stuff',
    ext_modules=cythonize(
        ["**/*.py"],
        exclude=['**/test/**/*.py', '**/tests/**/*.py',
                 '**/__init__.py',
                 'idlelib/MultiCall.py'],
        exclude_failures=True,
        language_level=sys.version_info[0],
        compiler_directives=dict(auto_cpdef=True)
    ),
)
"""
Note that the extra compiler option above disables fatal compile errors on
unknown (usually mistyped) names, of which Cython hits a couple in the
stdlib. pylint should find them as well; they're worth fixing.
The directive at the end enables automatic module-internal C calls, which
usually gives a major speed-up by allowing the C compiler to see what happens.
With the above setup, Cython compiles 612 out of 620 Python modules for me,
excluding test modules and __init__.py files. The rest fails to compile due
to either compiler bugs or statically detected bugs in the Python code.
I'll look through them when I find a bit of time.
One major problem I ran into is that the new importlib bootstrap module
crashes with a RuntimeError("maximum recursion depth exceeded while calling
a Python object") when it hits compiled modules with import cycles (e.g.
shutil and tarfile, or os and posixpath). I guess that's the kind of corner
case you get when working code gets rewritten. Worth giving Py3.2 a try in
comparison.
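
For anyone who wants to see which modules are candidates for this problem,
here is a rough sketch (not something used for the run above) that reports
pairs of modules importing each other directly. The helper names and the
mod_a/mod_b/mod_c sources are made up for illustration, and it only looks at
two-module cycles, not longer chains:

```python
import ast


def imported_names(source):
    """Return the top-level module names imported anywhere in source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def mutual_imports(sources):
    """sources maps module name -> source text.

    Returns the sorted list of (a, b) pairs where a imports b and b imports a.
    """
    deps = {name: imported_names(text) for name, text in sources.items()}
    return sorted(
        (a, b)
        for a in deps
        for b in deps[a]
        if a < b and b in deps and a in deps[b]
    )


# Hypothetical stand-ins for two modules that form an import cycle.
example = {
    "mod_a": "import mod_b\n",
    "mod_b": "def f():\n    import mod_a\n",
    "mod_c": "import sys\n",
}
print(mutual_imports(example))  # prints [('mod_a', 'mod_b')]
```

Imports inside functions are counted as well, so this over-reports cycles
that never trigger at module load time. Feeding it the real source text of
the stdlib modules instead of the toy dict should give a list to check
against.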
>>> To be clear - I'm *not* suggesting Cython become part of the required build
>>> toolchain. But *if* the Cython-compiled extensions prove to be
>>> significantly faster I'm thinking maybe it could become a semi-supported
>>> option (e.g. a HOWTO with the caveat "it worked on this particular
>>> system").
>>
>> Sounds reasonable.
>
> I think a stdlib compile script
... see above ...
> + pre-packaged hints for the 3.3 release
> would likely help both 3.3 and Cython acceptance.
That would certainly be a cool feature. This can often be as easy as
putting a .pxd file next to the .py file that overrides the declarations of
functions and classes with static types.
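
To illustrate with a made-up module (not one from the stdlib; clamp.py,
clamp.pxd and clamp() are hypothetical names), the Python source stays
untouched and the static types live in the file next to it. A minimal
sketch, assuming the usual augmenting-.pxd syntax:

```python
# clamp.py: hypothetical module; plain Python that still runs uncompiled.

def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


# clamp.pxd: hypothetical companion file placed next to clamp.py.
# When Cython compiles clamp.py it picks this file up and applies the
# declaration below, so the function gets C-typed arguments and a C entry
# point. Its entire content would be the single line:
#
#     cpdef int clamp(int value, int low, int high)

print(clamp(15, 0, 10))  # prints 10
```

The trade-off is that the compiled function then only accepts values that
fit into a C int, whereas the uncompiled one keeps working on arbitrary
Python objects.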
> Putting aside my development interest and looking at it purely from the PoV
> of a Python *user*, I'd really like to see Cython on speed.python.org
> eventually (in two modes - one without hints as a baseline and one with
> hints).
I think the above setup.py script, with appropriately adapted glob
patterns, should do that trick well enough for now. Certainly better and
simpler than my initial pyximport configuration, with the obvious caveat
that it takes a bit longer to compile everything, not just the modules that
are actually used. But that's only an install-time issue.
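
When adapting the patterns, it can help to preview which files would be
picked up before starting a long compile. The following is only a
stdlib-based approximation of the include/exclude rules in the script above
(it does not use Cython's own pattern matching, and candidate_modules() is a
made-up helper name):

```python
from pathlib import Path

# Directory names whose contents are skipped, mirroring the two
# test-directory exclude patterns of the setup script.
EXCLUDED_DIRS = {"test", "tests"}


def candidate_modules(root):
    """List .py files below root, minus test directories and __init__.py."""
    root = Path(root)
    selected = []
    for path in sorted(root.rglob("*.py")):
        rel = path.relative_to(root)
        if path.name == "__init__.py":
            continue
        # rel.parts[:-1] are the directory components of the relative path
        if EXCLUDED_DIRS.intersection(rel.parts[:-1]):
            continue
        selected.append(rel.as_posix())
    return selected
```

Calling candidate_modules(".") from the Lib directory and counting the
result gives a quick sanity check on the patterns. Single-file excludes such
as idlelib/MultiCall.py are not handled here and would need an extra filter.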
Stefan