[pypy-commit] extradoc extradoc: merge

Fri Dec 7 14:54:26 CET 2012

Author: Maciej Fijalkowski <fijall at gmail.com>
Branch: extradoc
Changeset: r4933:ee71ba86558a
Date: 2012-12-07 15:54 +0200
http://bitbucket.org/pypy/extradoc/changeset/ee71ba86558a/

Log:	merge

diff --git a/blog/draft/py3k-status-update-8.rst b/blog/draft/py3k-status-update-8.rst
new file mode 100644
--- /dev/null
+++ b/blog/draft/py3k-status-update-8.rst
@@ -0,0 +1,56 @@
+Py3k status update #8
+---------------------
+
+This is the eighth status update about our work on the `py3k branch`_, which
+we can work on thanks to all of the people who donated_ to the `py3k
+proposal`_.
+
+Just a short update on November's work: we're now passing about 194 of
+approximately 355 modules of CPython's regression test suite, up from passing
+160 last month. Many test modules only fail a small number of individual tests
+now.
+
+We'd like to thank Amaury Forgeot d'Arc for his contributions; in particular, he
+has made significant progress on updating `CPyExt`_ for Python 3 this month.
+
+Some other highlights:
+
+* ``test_marshal`` now passes, and there's been significant progress on
+  pickling (thanks `Kenny Levinsen`_ and Amaury for implementing
+  ``int.{to,from}_bytes``)
+
+* We now have a ``_posixsubprocess`` module
+
+* More encoding-related fixes, which affect many failing tests
+
+* ``_sre`` was updated and now ``test_re`` almost passes
+
+* Exception behavior is almost complete per the Python 3 specs; what's mostly
+  missing now are the new ``__context__`` and ``__traceback__`` attributes (`PEP
+  3134`_)
+
+* Fixed some crashes and deadlocks occurring during the regression tests
+
+* We merged the `unicode-strategies`_ branch both to default and to py3k: now we
+  have versions of lists, dictionaries and sets specialized for unicode
+  elements, as we already had for strings.
+
+* However, string-specialized containers are still faster in some cases
+  because there are shortcuts which have not been implemented for unicode yet
+  (e.g., constructing a set of strings from a list of strings). The plan is to
+  completely kill the shortcuts and improve the JIT to produce the fast
+  version automatically for both the string and unicode versions, to have a
+  more maintainable codebase without sacrificing the speed. The `autoreds`_
+  branch (already merged) was a first step in this direction.
+
+cheers,
+Philip&Antonio
+
+.. _donated: http://morepypy.blogspot.com/2012/01/py3k-and-numpy-first-stage-thanks-to.html
+.. _`py3k proposal`: http://pypy.org/py3donate.html
+.. _`py3k branch`: https://bitbucket.org/pypy/pypy/commits/all/tip/branch%28%22py3k%22%29
+.. _`autoreds`: https://bitbucket.org/pypy/pypy/commits/all/tip/branch%28%22autoreds%22%29
+.. _`unicode-strategies`: https://bitbucket.org/pypy/pypy/commits/all/tip/branch%28%22unicode-strategies%22%29
+.. _`CPyExt`: http://morepypy.blogspot.com/2010/04/using-cpython-extension-modules-with.html
+.. _`Kenny Levinsen`: https://twitter.com/Joushou
+.. _`PEP 3134`: http://www.python.org/dev/peps/pep-3134/
diff --git a/planning/2.0/todo.txt b/planning/2.0/todo.txt
--- a/planning/2.0/todo.txt
+++ b/planning/2.0/todo.txt
@@ -8,6 +8,6 @@
 * cffi on pypy on windows
 * raw malloc virtuals
 * bug tracker gardening
- * 1292, 1090, 1294, 1282, 1289, 1282, 1286
+ * 1090, 1282, 1289, 1286
  * numpy: 1143, 1160, 1287
 * all green buildbots
diff --git a/sprintinfo/san-francisco-2012/announce.txt b/sprintinfo/san-francisco-2012/announce.txt
new file mode 100644
--- /dev/null
+++ b/sprintinfo/san-francisco-2012/announce.txt
@@ -0,0 +1,39 @@
+PyPy San Francisco Sprint Dec 1st - Dec 2nd 2012
+================================================
+
+The next PyPy sprint will be in San Francisco, California. It is a
+public sprint, suitable for newcomers. It will run on Saturday December 1st and
+Sunday December 2nd. The goals for the sprint are continued work towards the
+2.0 release as well as code cleanup. We of course welcome any topic which
+contributors are interested in working on.
+
+Some other possible topics are:
+
+* running your software on PyPy
+
+* work on PyPy's numpy (status__)
+
+* work on STM (status__)
+
+* JIT improvements
+
+* any exciting stuff you can think of
+
+If there are newcomers, we'll run the usual introduction to hacking on
+PyPy.
+
+.. __: http://morepypy.blogspot.ch/2012/09/numpy-on-pypy-status-update.html
+.. __: http://mail.python.org/pipermail/pypy-dev/2012-September/010513.html
+
+
+Location
+--------
+
+The sprint will be held at the Rackspace Office:
+
+620 Folsom St, Ste 100
+San Francisco
+
+The doors will open at 10AM and the sprint will run until 6PM on both days.
+
+Thanks to David Reid for helping get everything set up!
diff --git a/sprintinfo/san-francisco-2012/planning.txt b/sprintinfo/san-francisco-2012/planning.txt
new file mode 100644
--- /dev/null
+++ b/sprintinfo/san-francisco-2012/planning.txt
@@ -0,0 +1,4 @@
+Planning
+========
+
+* Implement ``os.setgroups``
diff --git a/talk/dls2006/talk-long.txt b/talk/dls2006/talk-long.txt
new file mode 100644
--- /dev/null
+++ b/talk/dls2006/talk-long.txt
@@ -0,0 +1,353 @@
+.. include:: <s5defs.txt>
+
+=================================================
+PyPy's VM Approach
+=================================================
+
+:Authors: Armin Rigo, Samuele Pedroni
+:Date: 23 October 2006
+:Location: DLS'06
+
+PyPy
+========================
+
+- Python VM implementation
+  in Python (a well-chosen subset)
+- A translation tool-chain
+- Open source project (MIT license)
+
+VMs are still hard
+========================
+
+It is hard to achieve:
+
+- flexibility
+- maintainability
+- performance (needs
+  dynamic compilation techniques)
+
+Especially with limited resources.
+
+
+Python Case
+===================================
+
+CPython is a straightforward,
+portable VM.
+
+- Some decisions are pervasive:
+  reference counting, single global lock ...
+
+- No dynamic compilation.
+  Performance is limited.
+
+
+- Extensions:
+
+  * *Stackless* (heap-bound recursion,
+    coroutines, serializable continuations)
+
+  * *Psyco* (run-time specializer,
+    interesting results)
+
+
+Python Case (ii)
+===================================
+
+- Extensions...
+
+  ... need to keep track of CPython and are hard to maintain.
+  Hard to port Psyco to other architectures.
+
+- The community wants Python to run everywhere:
+  Jython (Java), IronPython (.NET).
+  Lots of effort and duplication.
+
+PyPy's approach
+=================================
+
+*Goal: generate VMs from a single
+high-level description of the language,
+in a retargetable way.*
+
+- Write an interpreter for a dynamic language (Python)
+  in a high-level language (Python)
+
+- Leave out low-level details, favour simplicity
+  and flexibility
+
+- Define a mapping to low-level targets, generating
+  VMs from the interpreter
+
+Mapping to low-level targets
+===============================
+
+- Mechanically translate the interpreter to multiple
+  lower-level targets (C-like, Java, .NET...)
+
+- Insert low-level aspects into the code as required by
+  the target (object layout, memory management...)
+
+- Optionally insert new pervasive features not expressed
+  in the source (continuations, specialization abilities...)
+
+Status of the project
+==========================
+
+Fully compliant interpreter, translatable to C,
+LLVM and the CLR.
+
+Maintainability: following the (fast-paced)
+language evolution is very easy.
+
+Flexibility: we were able to reimplement
+Stackless features without extensive
+changes to the baseline interpreter
+
+Performance: work in progress,
+2.3 times slower than CPython
+without dynamic compilation (current goal)
+
+... and many experiments at various levels
+
+Translation approach
+==========================
+
+* Refine a subset of your favourite
+  language (e.g. Python) amenable
+  to analysis but expressive enough 
+  to write interpreters in it.
+
+* Write a translation tool-chain
+  from this subset ("RPython")
+  to multiple targets (C-like, .NET, etc.)
+
+* The translation tool-chain should
+  implement (and be configurable to 
+  be) a good mapping from the interpreter
+  to reasonably efficient implementations for 
+  the various targets.
+
+Translation overview
+==========================
+
+.. raw:: html
+
+    <br>
+
+.. image:: image/arch2.png
+   :align: center
+
+
+Type Inference
+=================
+
+- based on abstract interpretation
+
+- fix-point forward propagation
+
+- extensible
+
+Targets as Type Systems
+========================
+
+- RPython types (lists, strings, dicts, instances and classes...)
+  may be too high-level for the target (e.g. in C, structs and pointers)
+
+- approach: reflect the essential aspects
+  of a target as a custom type system
+  into RPython (e.g. C-like types)
+
+::
+
+    STR = GcStruct('rpy_string',
+                      ('hash', Signed),
+                      ('chars', Array(Char)))
+
+Targets as Type Systems (ii)
+================================
+  
+- implement a simulation
+  of the types in normal Python,
+  allowing code like this to run::
+
+    def ll_char_mul(char, length):
+        newstr = malloc(STR, length)
+        newstr.hash = 0
+        for i in range(length):
+            newstr.chars[i] = char
+        return newstr
+
+
+Targets as Type Systems (iii)
+===============================
+
+- extend the type inferencer
+  to understand usages of these types
+
+- use the type system
+  to express how regular, high-level RPython types
+  should be represented 
+  at the level of the target
+
+- write implementation "helper" code (e.g. ``ll_char_mul``)
+  which is again RPython and can be type inferenced
+  and translated
+
+Translation Aspects
+=====================
+
+*Features not present in the source can be
+added during translation:*
+
+- memory management (Boehm, or reference counting
+  by transforming all control flow graphs, or our own
+  GCs - themselves written within the same framework as the
+  RPython "helper" code)
+
+.. GC Pressure blues
+
+Translation Aspects (ii)
+==========================
+
+- continuation capture, implemented by saving the low-level
+  frames' local variables into the heap and back
+
+- work in progress: turning an interpreter into a compiler
+  is a translation aspect too (based on binding-time analysis
+  and partial evaluation, extended to use the techniques of
+  Psyco)
+
+Translation Summary
+===========================
+
+*The translation tool-chain
+has proved effective:*
+
+- low-level details and
+  pervasive decisions can be
+  left out of the interpreter
+
+- it can target at the same time:
+  C, LLVM, the CLR
+  and is open for further backends (JVM in progress)
+
+- it can and has been used
+  in the context of other research
+  projects and spin-off ideas
+  (e.g. a JavaScript backend,
+  compilation of other RPython programs...)
+  
+Website etc.
+=============
+
+* http://codespeak.net/pypy
+* IST EU co-funded project in FP6
+  (7 partners)
+* Thanks
+
+Run-time Specialization
+========================
+
+Previous experience: Psyco
+
+- a "just-in-time specializer" which can transparently
+  accelerate user code
+
+- a hand-written C "generating extension", in the terminology
+  of partial evaluation
+
+- similar to conventional JITs with the additional ability
+  to suspend compilation at any point, and wait for actual
+  run-time information (e.g. type of an object):
+  **promotion**.
+
+A Specializer as an Aspect
+==========================================
+
+General idea (the devil is in the details):
+
+- Transform the flowgraphs of the interpreter
+  into a compiler, using the type inference
+  framework to do binding-time analysis (runtime/
+  compile-time) based on a few hints.
+
+- Special hints to insert and control promotion.
+
+- We think that promotion is the key to making
+  it practical for large interpreters and complex
+  semantics.
+
+This is what we are working on right now.
+
+JIT Generation Diagram
+========================
+
+.. image:: image/arch-jit-gen.png
+   :align: center
+
+Translation Diagram
+=========================
+
+.. image:: image/arch-translation.png
+   :align: center
+
+Self-hosted JITs
+===========================
+
+- they work: Jikes VM
+- the language semantics need to
+  be captured into a good compiler
+- good means the resulting VM
+  should be fast enough
+- target hardware CPUs
+- lots of effort still, and hard
+  to reuse for another language
+
+Target platform VMs (JVM, CLR)
+==============================
+
+- semantics mismatch (e.g.
+  lookup) can result in speed penalty
+  or unnatural code
+
+- how to obliviously layer dynamic
+  compilation on top of a JIT
+  is effectively an open problem
+
+- urge to tweak the underlying VM
+
+- coding in Java, C#: not expressive
+  enough, same risks of inflexibility,
+  hard to revert pervasive decisions
+
+Open Virtual Machines
+==========================
+
+Reconfigurable at run time to run 
+specific languages.
+
+- Open research area.
+
+- Large design space.
+
+- What are the best primitives?
+
+- Likely same trade-offs in
+  more acute form: need sharp tools.
+
+GC Pressure
+======================
+
+RPython is still a garbage collected language.
+
+Large allocation rate from interpreter objects
+(boxes, frames) but easily temporary objects
+too.
+
+Good allocation removal optimizations
+and memory management very much needed.
+
+.. |bullet| unicode:: U+02022
+.. footer:: DLS'06
+
diff --git a/talk/ustour2011/talk.txt b/talk/ustour2011/talk.txt
new file mode 100644
--- /dev/null
+++ b/talk/ustour2011/talk.txt
@@ -0,0 +1,69 @@
+
+* most Python benchmarks run much faster than with CPython or Psyco
+
+
+    what pypy-c is (project started in 2003, now 200KLoc + 150KLoc tests)
+    (2 years EU (~2005-2007) + 2 years Germany+Sweden (2010-running))
+
+    PyPy 1.4.1 supports Python 2.5; but we are almost done with support
+    for Python 2.7, which will be PyPy 1.5
+
+    boring demo (multi-line editing)
+
+    speeeeeeeeed
+
+    http://speed.pypy.org/
+
+    but underline *benchmarks* here: it's typically programs that repeatedly
+    do similar things for at least 10-20 seconds.
+
+    mention also memory usage
+
+
+* the real-world PyPy compiler toolchain itself (200 KLocs) runs twice as fast
+
+
+    "extreme" example: big program, very unfriendly to our approach of
+    tracing JITs
+
+
+* already supports 64bit and is in the process of supporting ARM
+
+
+    pypy-c on 64bits
+
+    (pypy-c on ARM -- jitted but slower so far (missing JIT+GC integration))
+
+
+* full compatibility with CPython (more than Jython/IronPython)
+* new "cpyext" layer which integrates existing CPython C extensions
+
+
+    the main issue is that C extension modules don't all work out of the box
+
+    but some do (slowly (which may matter or not))
+
+    the core supports "the full language", which is CPython minus some
+    small number of issues; the most visible ones are related to refcounts
+    (ends up closer than Jython/IronPython)
+
+
+* full (and JIT-ed) ctypes support to call C libraries from Python
+* supports Stackless Python (in-progress)
+* an experimental super-fast JIT-compilation of calls to C++ libraries
+
+
+    this is all experimental
+
+
+* architecture
+
+
+    interpreter written in Python (actually RPython, a subset)
+
+    gets "translated" to C code
+
+    various "aspects" are added during translation to C, like
+    the GC and the JIT
+
+    it's a tracing JIT (expand...?)
diff --git a/talk/vmil2012/jit-guards_submitted.pdf b/talk/vmil2012/jit-guards_submitted.pdf
deleted file mode 100644
Binary file talk/vmil2012/jit-guards_submitted.pdf has changed
diff --git a/talk/vmil2012/paper.tex b/talk/vmil2012/paper.tex
--- a/talk/vmil2012/paper.tex
+++ b/talk/vmil2012/paper.tex
@@ -120,18 +120,10 @@
 \conferenceinfo{VMIL'12,} {October 21, 2012, Tucson, Arizona, USA.}
 \CopyrightYear{2012}
 \copyrightdata{978-1-4503-1633-0/12/10}
-\crdata{}
+%\crdata{}
 
 \maketitle
 
-\category{D.3.4}{Programming Languages}{Processors}[code generation,
-incremental compilers, interpreters, run-time environments]
-
-\terms
-Languages, Performance, Experimentation
-
-\keywords{tracing JIT, guards, deoptimization}
-
 \begin{abstract}
 Tracing just-in-time (JIT) compilers record linear control flow paths,
 inserting operations called guards at points of possible divergence. These
@@ -144,6 +136,11 @@
 % \o/
 \end{abstract}
 
+\category{D.3.4}{Programming Languages}{Processors}[code generation,
+incremental compilers, interpreters, run-time environments]
+\terms
+Languages, Performance, Experimentation
+\keywords{tracing JIT, guards, deoptimization}
 
 %___________________________________________________________________________
 \section{Introduction}
@@ -512,7 +509,7 @@
 \label{sec:Guards in the Backend}
 
 \begin{figure}[ht]
-\includegraphics[width=0.4\textwidth]{figures/resume_data}
+\includegraphics[width=0.45\textwidth]{figures/resume_data}
 \vspace{-3mm}
 \caption{The resume data for Figure~\ref{fig:trace-log}}
 \label{fig:resume-data}
diff --git a/talk/vmil2012/vmil01-schneider.pdf b/talk/vmil2012/vmil01-schneider.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..f6355831e03ac23489f8f1a7419029d7398cdd3c
GIT binary patch

[cut]