[Python-ideas] Add specialized bytecode with guards to functions

Victor Stinner victor.stinner at gmail.com
Wed Oct 21 23:11:01 CEST 2015


Hi,

2015-10-21 22:55 GMT+02:00 Kevin Modzelewski <kmod at dropbox.com>:
> I don't think there will be that many cases that len() gets called on a
> constant object, especially in performance-sensitive code since that would
> be easy to spot and get rid of.

Hum, I guess that almost no code calls len() on a constant string :-)
In my experience with astoptimizer, it becomes more common once you
combine it with the constant folding optimization. Many strings are
"constants" in Python and are commonly tested. Example: "if
sys.platform.startswith('freebsd'): ...". If sys.platform is "linux",
the body is dead code and the whole if can be removed.
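
To make the sys.platform example concrete, here is a minimal sketch of
constant folding followed by dead code removal on the AST. It is not
the astoptimizer implementation: the class name PlatformFolder and the
helper specialize_for_platform() are made up for this illustration, and
it assumes Python 3.9+ (for ast.unparse), that the name "sys" really is
the sys module, and that sys.platform never changes at runtime.

```python
import ast


class PlatformFolder(ast.NodeTransformer):
    """Hypothetical sketch: fold sys.platform tests and drop dead branches.

    Assumptions (not checked here): ``sys`` is the real sys module and
    ``sys.platform`` equals ``platform`` for the whole run.
    """

    def __init__(self, platform):
        self.platform = platform

    def visit_Attribute(self, node):
        self.generic_visit(node)
        # Replace a read of sys.platform with a string constant.
        if (isinstance(node.value, ast.Name) and node.value.id == "sys"
                and node.attr == "platform"
                and isinstance(node.ctx, ast.Load)):
            return ast.copy_location(ast.Constant(value=self.platform), node)
        return node

    def visit_Call(self, node):
        self.generic_visit(node)
        func = node.func
        # Fold "<str constant>.startswith(<str constant>)" into True/False.
        if (isinstance(func, ast.Attribute) and func.attr == "startswith"
                and isinstance(func.value, ast.Constant)
                and isinstance(func.value.value, str)
                and len(node.args) == 1 and not node.keywords
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            result = func.value.value.startswith(node.args[0].value)
            return ast.copy_location(ast.Constant(value=result), node)
        return node

    def visit_If(self, node):
        self.generic_visit(node)
        # Once the test is a constant, keep only the branch that can run.
        if isinstance(node.test, ast.Constant):
            kept = node.body if node.test.value else node.orelse
            # Keep a "pass" so an enclosing block is never left empty.
            return kept or ast.copy_location(ast.Pass(), node)
        return node


def specialize_for_platform(source, platform):
    """Return ``source`` rewritten under the assumption sys.platform == platform."""
    tree = PlatformFolder(platform).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

Calling specialize_for_platform(source, "linux") on code containing the
FreeBSD test above removes the body of the if. The rewrite is only
valid as long as the stated assumptions hold, which is exactly why
guards are needed.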

>  And without some good knowledge of the
> argument, I would guess that it's only marginally helpful to know that "len"
> is the builtin len.

I took the len("abc") example just because it's very easy to
understand :-) It's just to explain the principle. Later, if you
combine multiple optimizations, I bet that it will become really
interesting on real applications. But first we need a framework to
make these optimizations legit in Python (they must not change Python
semantics).
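
The principle of a guarded specialization can be sketched in pure
Python. This is only an illustration under stated assumptions, not the
proposed C-level design: make_guarded(), func_specialized() and
_ORIGINAL_LEN are hypothetical names, and a real implementation would
check its guards far more cheaply than this wrapper does.

```python
import builtins

# Reference to the builtin taken when the specialization is created.
_ORIGINAL_LEN = builtins.len


def make_guarded(original, specialized):
    """Hypothetical helper: call ``specialized`` only while its guards hold."""
    module_globals = original.__globals__

    def guards_hold():
        # Guard 1: builtins.len has not been replaced.
        # Guard 2: no global named "len" shadows the builtin.
        return (builtins.len is _ORIGINAL_LEN
                and "len" not in module_globals)

    def wrapper(*args, **kwargs):
        if guards_hold():
            return specialized(*args, **kwargs)
        # A guard failed: fall back to the unmodified code.
        return original(*args, **kwargs)

    return wrapper


def func():
    return len("abc")


def func_specialized():
    # len("abc") folded to a constant; valid only while the guards hold.
    return 3


fast_func = make_guarded(func, func_specialized)
```

As long as nobody replaces or shadows len, fast_func() runs the
specialized body. If a global named "len" appears later, the guard
fails and the original function runs instead, so the observable
behaviour stays the same.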

The goal is to optimize classes, not only builtin functions.

>  I'm not saying that CPython couldn't add techniques
> like this, but I think it might need to go decently far into JIT territory
> to really make use of it.  For example, which functions would have multiple
> versions generated?  There could be startup+memory overheads if it gets
> applied to all functions, so maybe there needs to be some sort of light
> profiling to determine when to produce the optimized version.  I'm biased
> but I think this should be left to the JITs :)

We will need a heuristic to estimate the theoretical speedup of the
specialized bytecode, and then decide whether it's worth keeping or
not.

In the long term, we can design a profiler for profile-guided
optimizations (PGO).

In the short term, we can use hints, such as type hints or manual
hints, to decide which functions should be optimized.

> On the other hand, I think it would be pretty interesting for the core
> python community to come up with source-level constructs that could help
> with this sort of thing.  For example, one idea is to make it canonical to
> do something like:
> def f():
>   from __builtin__ import len
>   return len("abc")
>
> and then let implementations pattern-match this to statically resolve the
> import. (...)

As written in other messages, there are already many ways to reduce
the "overhead" of the Python language. But I'm trying to write a
generic optimizer which would not require modifying the source code.

Maybe we can modify Python to make some of these optimizations easy to
use, but I'm not sure that it's worth it.

Victor

