CPython optimization

Jason Sewall jasonsewall at gmail.com
Fri Oct 23 20:53:19 CEST 2009

On Fri, Oct 23, 2009 at 2:31 PM, Olof Bjarnason
<olof.bjarnason at gmail.com> wrote:
>> >
>> > This would be a way to speed up things in an image processing algorithm:
>> > 1. divide the image into four subimages 2. let each core process each
>> > part independently 3. fix&merge (along split lines for example) into a
>> > resulting, complete image
>> Well, don't assume you're the first to think about that.
>> I'm sure that performance-conscious image processing software already has
>> this kind of tile-based optimization.
>> (actually, if you look at benchmarks of 3D rendering which are regularly
>> done by "enthusiast" websites, it shows exactly that)

This is indeed a tried-and-true method for parallelizing certain image
and other grid-based algorithms, but it is in fact not appropriate for
a wide variety of techniques. Things like median filters, where f(A|B)
!= f(A)|f(B) (with | as some sort of concatenation), will not generate
correct results under the scheme you outlined: each output pixel
depends on a neighborhood that can straddle a split line.
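
To make the f(A|B) != f(A)|f(B) point concrete, here is a minimal
sketch using a 1-D signal in place of an image. The median_filter
helper, the radius, and the sample values are all made up for
illustration; the window is simply clipped at the ends of whatever
input the filter is given.

```python
from statistics import median

def median_filter(signal, radius=1):
    """Sliding-window median; the window is clipped at the ends of the input."""
    n = len(signal)
    return [
        median(signal[max(0, i - radius): min(n, i + radius + 1)])
        for i in range(n)
    ]

# Hypothetical sample data, split into two "tiles" A and B.
signal = [1, 9, 2, 8, 3, 7, 4, 6]
mid = len(signal) // 2
a, b = signal[:mid], signal[mid:]

whole = median_filter(signal)                 # f(A|B)
tiled = median_filter(a) + median_filter(b)   # f(A)|f(B)

# Indices where filtering the tiles independently gives a different answer.
mismatches = [i for i, (w, t) in enumerate(zip(whole, tiled)) if w != t]
print(whole)
print(tiled)
print(mismatches)
```

With this input the two results disagree only at the two samples
adjacent to the split, which is exactly where the filter window needs
values from the other tile.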

> No I didn't assume I was the first to think about that - I wanted to learn
> more about how optimization at all is possible/viable with multi-core
> motherboards, when the memory speed is the bottleneck anyway, regardless of
> smart caching technologies.
> I still have not received a convincing answer :)

Give Ulrich Drepper's "What Every Programmer Should Know About Memory"
a read (http://people.redhat.com/drepper/cpumemory.pdf) and you'll
learn all you want to know (and more) about how the memory hierarchy
interacts with multi-core.

I don't contribute to CPython, but I am virtually certain that its
developers are not interested in having the compiler/interpreter try
to apply some generic threading to arbitrary code. The vast majority
of Python code wouldn't benefit from it even if it worked well, and
I'm _very_ skeptical that there is any silver bullet for parallelizing
general code. If you think of one, tell Intel; they'll hire you.

_Perhaps_ the numpy or scipy people (I am not associated with either
of them) would be interested in some threading for certain array
operations. Maybe you could write something on top of what they have
to speed up array ops.
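
As a rough sketch of what "something on top" might look like for a
purely elementwise operation (where splitting the array is safe), the
code below hands chunks of an array to a thread pool. The
threaded_sqrt name, the worker count, and the chunking are my own
assumptions, not anything numpy provides. It relies on numpy releasing
the GIL inside many of its inner loops; whether you see any speedup
depends on array size and on memory bandwidth, so measure before
trusting it.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def threaded_sqrt(values, workers=4):
    """Elementwise sqrt of a 1-D array, computed chunk by chunk on a thread pool."""
    out = np.empty(values.shape, dtype=np.float64)
    # Chunk boundaries; each worker writes to its own disjoint slice of `out`.
    bounds = np.linspace(0, values.shape[0], workers + 1, dtype=int)

    def work(lo, hi):
        # Slices are views, so the result lands directly in `out`.
        np.sqrt(values[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all chunks and re-raises any worker exception.
        list(pool.map(work, bounds[:-1], bounds[1:]))
    return out

values = np.arange(1_000_000, dtype=np.float64)
result = threaded_sqrt(values)
```

Because sqrt is applied to each element independently, the chunked
result is identical to np.sqrt(values); the same trick would not be
valid for a neighborhood operation like the median filter above
without extra handling at the chunk boundaries.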

