[pypy-commit] extradoc extradoc: Check in the current state

Wed Apr 23 19:06:49 CEST 2014

Author: Armin Rigo <arigo at tunes.org>
Branch: extradoc
Changeset: r5197:b94e6d82d560
Date: 2014-04-23 19:06 +0200
http://bitbucket.org/pypy/extradoc/changeset/b94e6d82d560/

Log:	Check in the current state

diff --git a/talk/icooolps2014/overview.txt b/talk/icooolps2014/overview.txt
new file mode 100644
--- /dev/null
+++ b/talk/icooolps2014/overview.txt
@@ -0,0 +1,71 @@
+Position paper outline
+
+Introduction
+============
+
+
+Issue
+-----
+
+- efficiently supporting multi-CPU usage in dynamic languages that were designed with GIL semantics in mind
+  (including support for (large) atomic blocks for synchronization)
+
+
+Our Position
+------------
+
+Current solutions like STM, HTM, and fine-grained locking are slow, hard
+to implement correctly, and don't fit the specific problems of dynamic
+languages.  STM is the best way forward but performs badly today, so we
+propose to fix that.
+
+
+Discussion
+==========
+
+dynamic language VM problems:
+
+- high allocation rate (short lived objects)
+- (nothing is known about the program until it actually runs, so atomic blocks can have arbitrary size)
+
+GIL:
+
+- nice semantics
+- easy support of atomic blocks
+- no parallelism
+
+fine-grained locking:
+
+- support of atomic blocks?
+- hard to get right (deadlocks, performance, lock granularity)
+- very hard to get right for a large language
+- hard to retro-fit, as all existing code assumes GIL semantics
+- (there are some semantic differences, right? not given perfect lock-placement, but well)
+( http://www.jython.org/jythonbook/en/1.0/Concurrency.html )
+
+multiprocessing:
+
+- often needs major restructuring of programs (explicit data exchange)
+- sometimes communication overhead is too large
+- shared memory is a problem, copies of memory are too expensive
+
+HTM:
+
+- false sharing at cache-line level
+- limited capacity (bounded by the caches, undocumented)
+- random aborts (Haswell)
+- generally: limited transaction length (no atomic blocks)
+
+STM:
+
+- overhead (100-1000%) (barriers, reference resolution; kills performance with a low number of CPUs)
+  (FastLane: low overhead, not much gain)
+- unlimited transaction length (easy atomic blocks)
+
+
+Potential alternative approach
+==============================
+
+possible solution:
+- use virtual memory paging to somehow lower the STM overhead
+- tight integration with GC and JIT?
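
To make the STM overhead mentioned in the outline concrete, here is a minimal sketch of an optimistic, version-validating STM in Python. It is not the design of any particular STM system and not the approach proposed in the outline; `TVar`, `Transaction`, `atomically` and `run_demo` are hypothetical names chosen for illustration. The point is that every shared read and write goes through a barrier (`tx.read` / `tx.write`), which is where the per-access cost comes from, while transaction length is only bounded by memory.

```python
import threading

# Assumption: a single lock serializes commits only; transaction bodies
# themselves run without holding it.
_commit_lock = threading.Lock()


class Conflict(Exception):
    """Raised when a transaction read a value that changed before commit."""


class TVar:
    """A shared variable with a version number bumped on every commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version observed on first read
        self.writes = {}  # TVar -> value buffered until commit

    def read(self, tvar):
        # Read barrier: check the write buffer, then record the version.
        if tvar in self.writes:
            return self.writes[tvar]
        self.reads.setdefault(tvar, tvar.version)
        return tvar.value

    def write(self, tvar, value):
        # Write barrier: buffer the value; nothing is visible until commit.
        self.writes[tvar] = value

    def commit(self):
        with _commit_lock:
            # Validate: abort if anything we read was committed to since.
            for tvar, seen in self.reads.items():
                if tvar.version != seen:
                    raise Conflict()
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1


def atomically(fn):
    """Run fn(tx) as a transaction, retrying until it commits."""
    while True:
        tx = Transaction()
        result = fn(tx)
        try:
            tx.commit()
        except Conflict:
            continue
        return result


def run_demo(n_threads=4, n_increments=500):
    counter = TVar(0)

    def increment(tx):
        tx.write(counter, tx.read(counter) + 1)

    def worker():
        for _ in range(n_increments):
            atomically(increment)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Calling `run_demo()` increments one shared counter from several threads; conflicting transactions are retried rather than blocked. Even in this toy version, each access pays for a dictionary lookup and bookkeeping, which hints at why barrier cost dominates when few CPUs are available to offset it.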