[Python-ideas] Type Hinting - Performance booster ?
abarnert at yahoo.com
Sat Dec 27 01:28:14 CET 2014
On Dec 26, 2014, at 23:05, David Mertz <mertz at gnosis.cx> wrote:
> On Fri, Dec 26, 2014 at 1:39 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> On Fri, 26 Dec 2014 13:11:19 -0700
>> David Mertz <mertz at gnosis.cx> wrote:
>> > I think the 5-6 year estimate is pessimistic. Take a look at
>> > http://en.wikipedia.org/wiki/Xeon_Phi for some background.
>> """Intel Many Integrated Core Architecture or Intel MIC (pronounced
>> Mick or Mike) is a *coprocessor* computer architecture"""
>> Enough said. It's not a general-purpose chip. It's meant as a
>> competitor against the computational use of GPU, not against
>> traditional general-purpose CPUs.
> Yes and no:
> The cores of Intel MIC are based on a modified version of P54C design, used in the original Pentium. The basis of the Intel MIC architecture is to leverage x86 legacy by creating a x86-compatible multiprocessor architecture that can utilize existing parallelization software tools. Programming tools include OpenMP, OpenCL, Cilk/Cilk Plus and specialised versions of Intel's Fortran, C++ and math libraries.
> x86 is pretty general purpose, but also yes it's meant to compete with GPUs too. But also, there are many projects--including Numba--that utilize GPUs for "general computation" (or at least to offload much of the computation). The distinctions seem to be blurring in my mind.
> But indeed, as many people have observed, parallelization is usually non-trivial, and the presence of many cores is a far different thing from their efficient utilization.
I think what we're eventually going to see is that optimized, explicit parallelism is very hard, but general-purpose implicit parallelism is pretty easy if you're willing to accept a lot of overhead. When people start writing a lot of code that takes 4x as much CPU but can run on 64 cores instead of 2, and can work with a dumb ring cache instead of full coherence, that's when people will start selling 128-core laptops. And it's not going to be new application programming techniques that make that happen; it's going to be things like language-level STM, implicit parallelism libraries, kernel schedulers that can migrate low-utilization processes onto low-power auxiliary cores, etc.
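To put rough numbers on the 4x-CPU, 64-cores-instead-of-2 trade, here is a back-of-the-envelope sketch. It assumes the work spreads perfectly across cores and that the overhead is a flat multiplier on total CPU time, neither of which holds in practice. The function name `effective_speedup` and its parameters are made up for illustration.

```python
def effective_speedup(cores_new, cores_old, overhead_factor):
    """Idealized wall-clock speedup from moving to more cores at a CPU cost.

    Assumptions (hypothetical, for illustration only):
    - the work divides evenly across all available cores;
    - implicit parallelism multiplies total CPU time by overhead_factor.
    """
    if cores_new <= 0 or cores_old <= 0 or overhead_factor <= 0:
        raise ValueError("all arguments must be positive")
    # Useful work per unit of wall-clock time on the new machine is
    # cores_new / overhead_factor; on the old machine it is cores_old.
    return (cores_new / overhead_factor) / cores_old


# 4x the CPU, but 64 cores instead of 2.
print(effective_speedup(64, 2, 4.0))   # 8.0

# Break-even point: the overhead equals the ratio of core counts.
print(effective_speedup(64, 2, 32.0))  # 1.0
```

Under those idealized assumptions the 4x overhead still leaves an 8x win, and the trade only stops paying off once the overhead reaches the ratio of core counts (32x here).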