[Python-checkins] r77617 - peps/trunk/pep-3146.txt

collin.winter python-checkins at python.org
Wed Jan 20 23:08:05 CET 2010

Author: collin.winter
Date: Wed Jan 20 23:08:04 2010
New Revision: 77617

Add PEP 3146: Merge Unladen Swallow into CPython.


Added: peps/trunk/pep-3146.txt
--- (empty file)
+++ peps/trunk/pep-3146.txt	Wed Jan 20 23:08:04 2010
@@ -0,0 +1,1315 @@
+PEP: 3146
+Title: Merging Unladen Swallow into CPython
+Version: $Revision$
+Last-Modified: $Date$
+Author: Collin Winter <collinwinter at google.com>,
+        Jeffrey Yasskin <jyasskin at google.com>,
+        Reid Kleckner <rnk at mit.edu>
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 1-Jan-2010
+Python-Version: 3.3
+This PEP proposes the merger of the Unladen Swallow project [#us]_ into
+CPython's source tree. Unladen Swallow is an open-source branch of CPython
+focused on performance. Unladen Swallow is source-compatible with valid Python
+2.6.4 applications and C extension modules.
+Unladen Swallow adds a just-in-time (JIT) compiler to CPython, allowing for the
+compilation of selected Python code to optimized machine code. Beyond classical
+static compiler optimizations, Unladen Swallow's JIT compiler takes advantage of
+data collected at runtime to make checked assumptions about code behaviour,
+allowing the production of faster machine code.
+This PEP proposes to integrate Unladen Swallow into CPython's development tree
+in a separate ``py3k-jit`` branch, targeted for eventual merger with the main
+``py3k`` branch. While Unladen Swallow is by no means finished or perfect, we
+feel that Unladen Swallow has reached sufficient maturity to warrant
+incorporation into CPython's roadmap. We have sought to create a stable platform
+that the wider CPython development team can build upon, a platform that will
+yield increasing performance for years to come.
+This PEP will detail Unladen Swallow's implementation and how it differs from
+CPython 2.6.4; the benchmarks used to measure performance; the tools used to
+ensure correctness and compatibility; the impact on CPython's current platform
+support; and the impact on the CPython core development process. The PEP
+concludes with a proposed merger plan and brief notes on possible directions
+for future work.
+We seek the following from the BDFL:
+- Approval for the overall concept of adding a just-in-time compiler to CPython,
+  following the design laid out below.
+- Permission to continue working on the just-in-time compiler in the CPython
+  source tree.
+- Permission to eventually merge the just-in-time compiler into the ``py3k``
+  branch once all blocking issues have been addressed.
+- A pony.
+Rationale, Implementation
+Many companies and individuals would like Python to be faster, to enable its
+use in more projects. Google is one such company.
+Unladen Swallow is a Google-sponsored branch of CPython, initiated to improve
+the performance of Google's numerous Python libraries, tools and applications.
+To make the adoption of Unladen Swallow as easy as possible, the project
+initially aimed at four goals:
+- A performance improvement of 5x over the baseline of CPython 2.6.4 for
+  single-threaded code.
+- 100% source compatibility with valid CPython 2.6 applications.
+- 100% source compatibility with valid CPython 2.6 C extension modules.
+- Design for eventual merger back into CPython.
+We chose 2.6.4 as our baseline because Google uses CPython 2.4 internally, and
+jumping directly from CPython 2.4 to CPython 3.x was considered infeasible.
+To achieve the desired performance, Unladen Swallow has implemented a
+just-in-time (JIT) compiler [#jit]_ in the tradition of Urs Hoelzle's work on
+Self [#urs-self]_, gathering feedback at runtime and using that to inform
+compile-time optimizations. This is similar to the approach taken by the current
+breed of JavaScript engines [#v8]_, [#squirrelfishextreme]_; most Java virtual
+machines [#hotspot]_; Rubinius [#rubinius]_, MacRuby [#macruby]_, and other Ruby
+implementations; Psyco [#psyco]_; and others.
+We explicitly reject any suggestion that our ideas are original. We have sought
+to reuse the published work of other researchers wherever possible. If we have
+done any original work, it is by accident. We have tried, as much as possible,
+to take good ideas from all corners of the academic and industrial community. A
+partial list of the research papers that have informed Unladen Swallow is
+available on the Unladen Swallow wiki [#us-relevantpapers]_.
+The key observation about optimizing dynamic languages is that they are only
+dynamic in theory; in practice, each individual function or snippet of code is
+relatively static, using a stable set of types and child functions. The current
+CPython bytecode interpreter assumes the worst about the code it is running,
+that at any moment the user might override the ``len()`` function or pass a
+never-before-seen type into a function. In practice this never happens, but user
+code pays for that support. Unladen Swallow takes advantage of the relatively
+static nature of user code to improve performance.
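+As a minimal illustration of the flexibility the interpreter must allow for,
+the Python 3 sketch below rebinds ``len()`` at runtime. The ``count`` helper is
+hypothetical, and the ``builtins`` module name is the Python 3 spelling (Python
+2.6 calls it ``__builtin__``).

```python
import builtins


def count(items):
    # The name ``len`` is resolved every time this line runs, so the
    # interpreter cannot assume it still refers to the original builtin.
    return len(items)


before = count([1, 2, 3])

original_len = builtins.len
builtins.len = lambda obj: 42  # user code overrides the builtin
try:
    after = count([1, 2, 3])
finally:
    builtins.len = original_len  # always restore the real builtin

print(before, after)
```

+Code like this is legal, so any optimization that assumes ``len`` is the
+builtin has to be able to detect when that assumption stops holding.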
+At a high level, the Unladen Swallow JIT compiler works by translating a
+function's CPython bytecode to platform-specific machine code, using data
+collected at runtime, as well as classical compiler optimizations, to improve
+the quality of the generated machine code. Because we only want to spend
+resources compiling Python code that will actually benefit the runtime of the
+program, an online heuristic is used to assess how hot a given function is. Once
+the hotness value for a function crosses a given threshold, it is selected for
+compilation and optimization. Until a function is judged hot, however, it runs
+in the standard CPython eval loop, which in Unladen Swallow has been
+instrumented to record interesting data about each bytecode executed. This
+runtime data is used to reduce the flexibility of the generated machine code,
+allowing us to optimize for the common case. For example, we collect data on:
+- Whether a branch was taken/not taken. If a branch is never taken, we will not
+  compile it to machine code.
+- Types used by operators. If we find that ``a + b`` is only ever adding
+  integers, the generated machine code for that snippet will not support adding
+  floats.
+- Functions called at each callsite. If we find that a particular ``foo()``
+  callsite is always calling the same ``foo`` function, we can optimize the
+  call or inline it away.
+Refer to [#us-llvm-notes]_ for a complete list of data points gathered and how
+they are used.
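+The hotness heuristic and the type feedback described above can be sketched in
+plain Python as follows. This is an illustration only, not Unladen Swallow's
+implementation: the ``FunctionProfile`` class, its method names, and the
+threshold and weight values are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical values; not the numbers Unladen Swallow actually uses.
HOTNESS_THRESHOLD = 1000
CALL_WEIGHT = 10


class FunctionProfile:
    """Per-function counters an instrumented eval loop might keep."""

    def __init__(self):
        self.hotness = 0
        # Maps an operator or call-site label to the operand types seen there.
        self.types_seen = defaultdict(set)

    def record_call(self):
        self.hotness += CALL_WEIGHT

    def record_operands(self, site, *operands):
        self.types_seen[site].add(tuple(type(op) for op in operands))

    def is_hot(self):
        return self.hotness >= HOTNESS_THRESHOLD

    def is_monomorphic(self, site):
        # True when exactly one combination of operand types was observed.
        return len(self.types_seen[site]) == 1


profile = FunctionProfile()
for a, b in [(1, 2), (3, 4), (5, 6)] * 40:
    profile.record_call()
    profile.record_operands("a + b", a, b)

print(profile.is_hot(), profile.is_monomorphic("a + b"))
```

+A function whose profile is both hot and monomorphic at a given site is a
+natural candidate for the kind of specialization described in the list above.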
+However, if by chance the historically-untaken branch is now taken, or some
+integer-optimized ``a + b`` snippet receives two strings, we must support this.
+We cannot change Python semantics. Each of these sections of optimized machine
+code is preceded by a `guard`, which checks whether the simplifying assumptions
+we made when optimizing still hold. If the assumptions are still valid, we run
+the optimized machine code; if they are not, we fall back to the interpreter
+and pick up where we left off.
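+The guard mechanism can be sketched in Python as a type check placed in front
+of a specialized fast path. The function names here are hypothetical, and the
+real guards are emitted as machine code rather than written in Python; the
+sketch only shows the control flow.

```python
def generic_add(a, b):
    # Stand-in for the interpreter's fully general handling of ``a + b``.
    return a + b


def make_int_specialized_add(fallback):
    """Build an ``a + b`` specialized for ints, protected by a guard."""
    stats = {"fast": 0, "bailed": 0}

    def specialized_add(a, b):
        # Guard: does the assumption made when optimizing still hold?
        if type(a) is int and type(b) is int:
            stats["fast"] += 1
            return int.__add__(a, b)  # the "optimized" path
        # Guard failed: use the general path so Python semantics are kept.
        stats["bailed"] += 1
        return fallback(a, b)

    return specialized_add, stats


add, stats = make_int_specialized_add(generic_add)
results = [add(1, 2), add("spam", "eggs"), add(1.5, 2)]
print(results, stats)
```

+The string and float calls still produce correct results; they simply take
+the slower, general path.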
+We have chosen to reuse a set of existing compiler libraries called LLVM
+[#llvm]_ for code generation and code optimization. This has saved our small
+team from needing to understand and debug code generation on multiple machine
+instruction sets and from needing to implement a large set of classical compiler
+optimizations. The project would not have been possible without such code reuse.
+We have found LLVM easy to modify and its community receptive to our suggestions
+and modifications.
+In somewhat more depth, Unladen Swallow's JIT works by compiling CPython
+bytecode to LLVM's own intermediate representation (IR) [#llvm-langref]_, taking
+into account any runtime data from the CPython eval loop. We then run a set of
+LLVM's built-in optimization passes, producing a smaller, optimized version of
+the original LLVM IR. LLVM then lowers the IR to platform-specific machine code,
+performing register allocation, instruction scheduling, and any necessary
+relocations. This arrangement of the compilation pipeline allows the LLVM-based
+JIT to be easily omitted from a compiled ``python`` binary by passing
+``--without-llvm`` to ``./configure``; various use cases for this flag are
+discussed later.
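+The input to this pipeline is ordinary CPython bytecode, which can be
+inspected with the standard library's ``dis`` module. The sketch below shows
+only that input for a small, hypothetical ``add`` function; it does not
+produce LLVM IR and does not require Unladen Swallow. Opcode names differ
+between CPython versions, so no particular output is shown.

```python
import dis


def add(a, b):
    return a + b


# Print the bytecode the eval loop would execute for ``add``.
dis.dis(add)

# The same information as a list of opcode names.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```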
+For a complete detailing of how Unladen Swallow works, consult the Unladen
+Swallow documentation [#us-projectplan]_, [#us-llvm-notes]_.
+Unladen Swallow has focused on improving the performance of single-threaded,
+pure-Python code. We have not made an effort to remove CPython's global
+interpreter lock (GIL); we feel this is separate from our work, and due to its
+sensitivity, is best done in a mainline development branch. We considered
+making GIL-removal a part of Unladen Swallow, but were concerned by the
+possibility of introducing subtle bugs when porting our work from CPython 2.6
+to 3.x.
+A JIT compiler is an extremely versatile tool, and we have by no means
+exhausted its full potential. We have tried to create a sufficiently flexible
+framework that the wider CPython development community can build upon it for
+years to come, extracting increased performance in each subsequent release.
+Unladen Swallow has developed a fairly large suite of benchmarks, ranging from
+synthetic microbenchmarks designed to test a single feature up through
+whole-application macrobenchmarks. The inspiration for these benchmarks has come
+variously from third-party contributors (in the case of the ``html5lib``
+benchmark), Google's own internal workloads (``slowspitfire``, ``pickle``,
+``unpickle``), as well as tools and libraries in heavy use throughout the wider
+Python community (``django``, ``2to3``, ``spambayes``). These benchmarks are run
+through a single interface called ``perf.py`` that takes care of collecting
+memory usage information, graphing performance, and running statistics on the
+benchmark results to ensure significance.
+The full list of available benchmarks is available on the Unladen Swallow wiki
+[#us-benchmarks]_, including instructions on downloading and running the
+benchmarks for yourself. All our benchmarks are open-source; none are
+Google-proprietary. We believe this collection of benchmarks serves as a useful
+tool to benchmark any complete Python implementation, and indeed, PyPy is
+already using these benchmarks for their own performance testing
+[#pypy-bmarks]_, [#us-wider-perf-issue]_. We welcome this, and we seek
+additional workloads for the benchmark suite from the Python community.
+We have focused our efforts on collecting macrobenchmarks and benchmarks that
+simulate real applications as well as possible, when running a whole application
+is not feasible. Along a different axis, our benchmark collection originally
+focused on the kinds of workloads seen by Google's Python code (webapps, text
+processing), though we have since expanded the collection to include workloads
+Google cares nothing about. We have so far shied away from heavily-numerical
+workloads, since NumPy [#numpy]_ already does an excellent job on such code and
+so improving numerical performance was not an initial high priority for the
+team; we have begun to incorporate such benchmarks into the collection
+[#us-nbody]_ and have started work on optimizing numerical Python code.
+Beyond these benchmarks, there are also a variety of workloads we are explicitly
+not interested in benchmarking. Unladen Swallow is focused on improving the
+performance of pure-Python code, so the performance of extension modules like
+NumPy is uninteresting since NumPy's core routines are implemented in
+C. Similarly, workloads that involve a lot of IO like GUIs, databases or
+socket-heavy applications would, we feel, fail to accurately measure interpreter
+or code generation optimizations. That said, there's certainly room to improve
+the performance of C-language extension modules in the standard library, and
+as such, we have added benchmarks for the ``cPickle`` and ``re`` modules.
+Performance vs CPython 2.6.4
+The charts below compare the arithmetic mean of multiple benchmark iterations
+for CPython 2.6.4 and Unladen Swallow. ``perf.py`` gathers more data than this,
+and indeed, arithmetic mean is not the whole story; we reproduce only the mean
+for the sake of conciseness. We include the ``t`` score from the Student's
+two-tailed T-test [#students-t-test]_ at the 95% confidence level to indicate
+the significance of the result. Most benchmarks are run for 100 iterations,
+though some longer-running whole-application benchmarks are run for fewer
+iterations.
+A description of each of these benchmarks is available on the Unladen Swallow
+wiki [#us-benchmarks]_.
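+As a rough guide to how a mean, a "Change" ratio and a ``t`` score relate, the
+sketch below computes them for two lists of timings. It assumes a pooled
+variance estimate and subtracts the changed mean from the baseline mean, so a
+positive ``t`` means the changed binary was faster. Whether ``perf.py``
+computes its statistics in exactly this way is an assumption, and the function
+names are hypothetical.

```python
import math


def mean(samples):
    return sum(samples) / len(samples)


def sample_variance(samples):
    m = mean(samples)
    return sum((x - m) ** 2 for x in samples) / (len(samples) - 1)


def t_score(baseline, changed):
    """Two-sample t statistic using a pooled variance estimate."""
    n1, n2 = len(baseline), len(changed)
    pooled = ((n1 - 1) * sample_variance(baseline)
              + (n2 - 1) * sample_variance(changed)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(baseline) - mean(changed)) / standard_error


def describe_change(baseline_mean, changed_mean):
    """Format a ratio of mean run times like the "Change" column."""
    if changed_mean <= baseline_mean:
        return "%.2fx faster" % (baseline_mean / changed_mean)
    return "%.2fx slower" % (changed_mean / baseline_mean)


# Made-up timings in seconds, for illustration only.
baseline_times = [2.0, 3.0, 4.0]
changed_times = [1.0, 2.0, 3.0]
print(describe_change(mean(baseline_times), mean(changed_times)))
print(t_score(baseline_times, changed_times))
```

+Ratios recomputed from the rounded means in the tables may differ in the last
+digit from the published "Change" values, which were derived from unrounded
+timings.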
+  ./perf.py -r -b default,apps ../a/python ../b/python
+32-bit; gcc 4.0.3; Ubuntu Dapper; Intel Core2 Duo 6600 @ 2.4GHz; 2 cores; 4MB L2 cache; 4GB RAM
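++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| Benchmark    | CPython 2.6.4 | Unladen Swallow r988 | Change       | Significance  | Timeline                   |
++==============+===============+======================+==============+===============+============================+
+| 2to3         | 25.13 s       | 24.87 s              | 1.01x faster | t=8.94        | http://tinyurl.com/yamhrpg |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| django       | 1.08 s        | 0.80 s               | 1.35x faster | t=315.59      | http://tinyurl.com/y9mrn8s |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| html5lib     | 14.29 s       | 13.20 s              | 1.08x faster | t=2.17        | http://tinyurl.com/y8tyslu |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| nbody        | 0.51 s        | 0.28 s               | 1.84x faster | t=78.007      | http://tinyurl.com/y989qhg |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| rietveld     | 0.75 s        | 0.55 s               | 1.37x faster | Insignificant | http://tinyurl.com/ye7mqd3 |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| slowpickle   | 0.75 s        | 0.55 s               | 1.37x faster | t=20.78       | http://tinyurl.com/ybrsfnd |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| slowspitfire | 0.83 s        | 0.61 s               | 1.36x faster | t=2124.66     | http://tinyurl.com/yfknhaw |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| slowunpickle | 0.33 s        | 0.26 s               | 1.26x faster | t=15.12       | http://tinyurl.com/yzlakoo |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| spambayes    | 0.31 s        | 0.34 s               | 1.10x slower | Insignificant | http://tinyurl.com/yem62ub |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+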
+64-bit; gcc 4.2.4; Ubuntu Hardy; AMD Opteron 8214 HE @ 2.2 GHz; 4 cores; 1MB L2 cache; 8GB RAM
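++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| Benchmark    | CPython 2.6.4 | Unladen Swallow r988 | Change       | Significance  | Timeline                   |
++==============+===============+======================+==============+===============+============================+
+| 2to3         | 31.98 s       | 30.41 s              | 1.05x faster | t=8.35        | http://tinyurl.com/ybcrl3b |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| django       | 1.22 s        | 0.94 s               | 1.30x faster | t=106.68      | http://tinyurl.com/ybwqll6 |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| html5lib     | 18.97 s       | 17.79 s              | 1.06x faster | t=2.78        | http://tinyurl.com/yzlyqvk |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| nbody        | 0.77 s        | 0.27 s               | 2.86x faster | t=133.49      | http://tinyurl.com/yeyqhbg |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| rietveld     | 0.74 s        | 0.80 s               | 1.08x slower | t=-2.45       | http://tinyurl.com/yzjc6ff |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| slowpickle   | 0.91 s        | 0.62 s               | 1.48x faster | t=28.04       | http://tinyurl.com/yf7en6k |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| slowspitfire | 1.01 s        | 0.72 s               | 1.40x faster | t=98.70       | http://tinyurl.com/yc8pe2o |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| slowunpickle | 0.51 s        | 0.34 s               | 1.51x faster | t=32.65       | http://tinyurl.com/yjufu4j |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+
+| spambayes    | 0.43 s        | 0.45 s               | 1.06x slower | Insignificant | http://tinyurl.com/yztbjfp |
++--------------+---------------+----------------------+--------------+---------------+----------------------------+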
+Many of these benchmarks take a hit under Unladen Swallow because the current
+version blocks execution to compile Python functions down to machine code. This
+leads to the behaviour seen in the timeline graphs for the ``html5lib`` and
+``rietveld`` benchmarks, for example, and slows down the overall performance of
+``2to3``. We have an active development branch to fix this problem
+([#us-background-thread]_, [#us-background-thread-issue]_), but working within
+the strictures of CPython's current threading system has complicated the process
+and required far more care and time than originally anticipated. We view this
+issue as critical to final merger into the ``py3k`` branch.
+We have obviously not met our initial goal of a 5x performance improvement. A
+`performance retrospective`_ follows, which addresses why we failed to meet our
+initial performance goal. We maintain a list of yet-to-be-implemented
+performance work [#us-perf-punchlist]_. 
+Memory Usage
+The following table shows maximum memory usage (in kilobytes) for each of
+Unladen Swallow's default benchmarks for both CPython 2.6.4 and Unladen Swallow
+r988, as well as a timeline of memory usage across the lifetime of the
+benchmark. We include tables for both 32- and 64-bit binaries. Memory usage was
+measured on Linux 2.6 systems by summing the ``Private_`` sections from the
+kernel's ``/proc/$pid/smaps`` pseudo-files [#smaps]_.
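+A minimal sketch of this measurement in Python is shown below. It assumes the
+usual ``smaps`` line layout of ``Name:   <number> kB`` and sums every field
+whose name begins with ``Private_``. The function names are hypothetical, and
+this is not the code ``perf.py`` itself uses.

```python
def private_kb_from_smaps(smaps_text):
    """Sum all ``Private_*`` fields (in kB) found in smaps-formatted text."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith("Private_"):
            # Expected layout: "Private_Dirty:        12 kB"
            fields = line.split()
            total += int(fields[1])
    return total


def private_kb_for_pid(pid):
    """Read /proc/<pid>/smaps (Linux only) and sum its private memory."""
    with open("/proc/%d/smaps" % pid) as smaps_file:
        return private_kb_from_smaps(smaps_file.read())


# Hand-written sample in the smaps layout, for illustration only.
sample = (
    "Size:                100 kB\n"
    "Rss:                  50 kB\n"
    "Shared_Clean:         10 kB\n"
    "Private_Clean:         4 kB\n"
    "Private_Dirty:         8 kB\n"
    "Private_Clean:         1 kB\n"
)
print(private_kb_from_smaps(sample))
```

+Sampling ``private_kb_for_pid`` periodically while a benchmark runs and taking
+the maximum would give a figure comparable in kind to those in the tables.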
+  ./perf.py -r --track_memory -b default,apps ../a/python ../b/python
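+32-bit
++--------------+---------------+----------------------+--------+----------------------------+
+| Benchmark    | CPython 2.6.4 | Unladen Swallow r988 | Change | Timeline                   |
++==============+===============+======================+========+============================+
+| 2to3         | 26396 kb      | 46896 kb             | 1.77x  | http://tinyurl.com/yhr2h4z |
++--------------+---------------+----------------------+--------+----------------------------+
+| django       | 10028 kb      | 27740 kb             | 2.76x  | http://tinyurl.com/yhan8vs |
++--------------+---------------+----------------------+--------+----------------------------+
+| html5lib     | 150028 kb     | 173924 kb            | 1.15x  | http://tinyurl.com/ybt44en |
++--------------+---------------+----------------------+--------+----------------------------+
+| nbody        | 3020 kb       | 16036 kb             | 5.31x  | http://tinyurl.com/ya8hltw |
++--------------+---------------+----------------------+--------+----------------------------+
+| rietveld     | 15008 kb      | 46400 kb             | 3.09x  | http://tinyurl.com/yhd5dra |
++--------------+---------------+----------------------+--------+----------------------------+
+| slowpickle   | 4608 kb       | 16656 kb             | 3.61x  | http://tinyurl.com/ybukyvo |
++--------------+---------------+----------------------+--------+----------------------------+
+| slowspitfire | 85776 kb      | 97620 kb             | 1.13x  | http://tinyurl.com/y9vj35z |
++--------------+---------------+----------------------+--------+----------------------------+
+| slowunpickle | 3448 kb       | 13744 kb             | 3.98x  | http://tinyurl.com/yexh4d5 |
++--------------+---------------+----------------------+--------+----------------------------+
+| spambayes    | 7352 kb       | 46480 kb             | 6.32x  | http://tinyurl.com/yem62ub |
++--------------+---------------+----------------------+--------+----------------------------+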
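+64-bit
++--------------+---------------+----------------------+--------+----------------------------+
+| Benchmark    | CPython 2.6.4 | Unladen Swallow r988 | Change | Timeline                   |
++==============+===============+======================+========+============================+
+| 2to3         | 51596 kb      | 82340 kb             | 1.59x  | http://tinyurl.com/yljg6rs |
++--------------+---------------+----------------------+--------+----------------------------+
+| django       | 16020 kb      | 38908 kb             | 2.43x  | http://tinyurl.com/ylqsebh |
++--------------+---------------+----------------------+--------+----------------------------+
+| html5lib     | 259232 kb     | 324968 kb            | 1.25x  | http://tinyurl.com/yha6oee |
++--------------+---------------+----------------------+--------+----------------------------+
+| nbody        | 4296 kb       | 23012 kb             | 5.35x  | http://tinyurl.com/yztozza |
++--------------+---------------+----------------------+--------+----------------------------+
+| rietveld     | 24140 kb      | 73960 kb             | 3.06x  | http://tinyurl.com/ybg2nq7 |
++--------------+---------------+----------------------+--------+----------------------------+
+| slowpickle   | 4928 kb       | 23300 kb             | 4.73x  | http://tinyurl.com/yk5tpbr |
++--------------+---------------+----------------------+--------+----------------------------+
+| slowspitfire | 133276 kb     | 148676 kb            | 1.11x  | http://tinyurl.com/y8bz2xe |
++--------------+---------------+----------------------+--------+----------------------------+
+| slowunpickle | 4896 kb       | 16948 kb             | 3.46x  | http://tinyurl.com/ygywwoc |
++--------------+---------------+----------------------+--------+----------------------------+
+| spambayes    | 10728 kb      | 84992 kb             | 7.92x  | http://tinyurl.com/yhjban5 |
++--------------+---------------+----------------------+--------+----------------------------+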
+The increased memory usage comes from a) LLVM code generation, analysis and
+optimization libraries; b) native code; c) memory usage issues or leaks in
+LLVM; d) data structures needed to optimize and generate machine code; e)
+as-yet uncategorized other sources.
+While we have made significant progress in reducing memory usage since the
+initial naive JIT implementation [#us-memory-issue]_, there is obviously more
+to do. We believe that there are still memory savings to be made without
+sacrificing performance. We have tended to focus on raw performance, and we
+have not yet made a concerted push to reduce memory usage. We view reducing
+memory usage as a blocking issue for final merger into the ``py3k`` branch. We
+seek guidance from the community on an acceptable level of increased memory
+usage.
+Start-up Time
+Statically linking LLVM's code generation, analysis and optimization libraries
+increases the time needed to start the Python binary. C++ static initializers
+used by LLVM also increase start-up time, as does importing the collection of
+pre-compiled C runtime routines we want to inline to Python code.
+Results from Unladen Swallow's ``startup`` benchmarks:
+  $ ./perf.py -r -b startup /tmp/cpy-26/bin/python /tmp/unladen/bin/python
+  ### normal_startup ###
+  Min: 0.219186 -> 0.352075: 1.6063x slower
+  Avg: 0.227228 -> 0.364384: 1.6036x slower
+  Significant (t=-51.879098, a=0.95)
+  Stddev: 0.00762 -> 0.02532: 3.3227x larger
+  Timeline: http://tinyurl.com/yfe8z3r
+  ### startup_nosite ###
+  Min: 0.105949 -> 0.264912: 2.5004x slower
+  Avg: 0.107574 -> 0.267505: 2.4867x slower
+  Significant (t=-703.557403, a=0.95)
+  Stddev: 0.00214 -> 0.00240: 1.1209x larger
+  Timeline: http://tinyurl.com/yajn8fa
+Unladen Swallow has made headway toward optimizing startup time, but there is
+still more work to do and further optimizations to implement. Improving start-up
+time is a high-priority item [#us-issue-startup-time]_ in Unladen Swallow's
+merger punchlist.
+Binary Size
+Statically linking LLVM's code generation, analysis and optimization libraries
+significantly increases the size of the ``python`` binary.
+32-bit; gcc 4.0.3
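++-------------+---------------+---------------+----------------------+
+| Binary size | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r988 |
++=============+===============+===============+======================+
+| Release     | 3.8M          | 4.0M          |  74M                 |
++-------------+---------------+---------------+----------------------+
+| Debug       | 3.3M          | 3.6M          | 118M                 |
++-------------+---------------+---------------+----------------------+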
+64-bit; gcc 4.2.4
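++-------------+---------------+---------------+----------------------+
+| Binary size | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r988 |
++=============+===============+===============+======================+
+| Release     | 5.5M          | 5.7M          |  89M                 |
++-------------+---------------+---------------+----------------------+
+| Debug       | 4.1M          | 4.4M          | 128M                 |
++-------------+---------------+---------------+----------------------+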
+The increased binary size is due to statically linking LLVM's code generation,
+analysis and optimization libraries into the ``python`` binary. This can be
+straightforwardly addressed by modifying LLVM to better support shared linking
+and then using that, instead of the current static linking. For the moment,
+though, static linking provides an accurate look at the cost of linking against
+LLVM.
+Unladen Swallow recently experienced a regression in binary size, going from
+19MB in Unladen's 2009Q3 release up to the current 74MB shown in the table
+above. Resolution of this issue [#us-binary-size]_ will block final merger into
+the ``py3k`` branch.
+Performance Retrospective
+Our initial goal for Unladen Swallow was a 5x performance improvement over
+CPython 2.6. We did not hit that goal, nor, to put it bluntly, did we come close. Why
+did the project not hit that goal, and can an LLVM-based JIT ever hit that goal?
+Why did Unladen Swallow not achieve its 5x goal? The primary reason was
+that LLVM required more work than we had initially anticipated. Based on the
+fact that Apple was shipping products based on LLVM [#llvm-users]_, and
+other high-level languages had successfully implemented LLVM-based JITs
+([#rubinius]_, [#macruby]_, [#hlvm]_), we had assumed that LLVM's JIT was
+relatively free of show-stopper bugs.
+That turned out to be incorrect. We had to turn our attention away from
+performance to fix a number of critical bugs in LLVM's JIT infrastructure (for
+example, [#llvm-far-call-issue]_, [#llvm-jmm-rev]_) as well as a number of
+nice-to-have enhancements that would enable further optimizations along various
+axes (for example, [#llvm-globaldce-rev]_,
+[#llvm-memleak-rev]_, [#llvm-availext-issue]_). LLVM's static code generation
+facilities, tools and optimization passes are stable and stress-tested, but the
+just-in-time infrastructure was relatively untested and buggy. We have fixed
+this.
+(Our hypothesis is that we hit these problems -- problems other projects had
+avoided -- because of the complexity and thoroughness of CPython's standard
+library test suite.)
+We also diverted engineering effort away from performance and into support tools
+such as gdb and oProfile. gdb did not work well with JIT compilers at all, and
+LLVM previously had no integration with oProfile. Having JIT-aware debuggers and
+profilers has been very valuable to the project, and we do not regret
+channeling our time in these directions. See the `Debugging`_ and `Profiling`_
+sections for more information.
+Can an LLVM-based CPython JIT ever hit the 5x performance target? The benchmark
+results for JIT-based JavaScript implementations suggest that 5x is indeed
+possible, as do the results PyPy's JIT has delivered for numeric workloads. The
+experience of Self-92 [#urs-self]_ is also instructive.
+Can LLVM deliver this? We believe that we have only begun to scratch the surface
+of what our LLVM-based JIT can deliver. The optimizations we have incorporated
+into this system thus far have borne significant fruit (for example,
+[#us-specialization-issue]_, [#us-direct-calling-issue]_,
+[#us-fast-globals-issue]_). Our experience to date is that the limiting factor
+on Unladen Swallow's performance is the engineering cycles needed to implement
+the literature. We have found LLVM easy to work with and to modify, and its
+built-in optimizations have greatly simplified the task of implementing
+Python-level optimizations.
+An overview of further performance opportunities is discussed in the
+`Future Work`_ section.
+Correctness and Compatibility
+Unladen Swallow's correctness test suite includes CPython's test suite (under
+``Lib/test/``), as well as a number of important third-party applications and
+libraries [#tested-apps]_. A full list of these applications and libraries is
+reproduced below. Any dependencies needed by these packages, such as
+``zope.interface`` [#zope-interface]_, are also tested indirectly as a part of
+testing the primary package, thus widening the corpus of tested third-party
+Python code.
+- 2to3
+- Cheetah
+- cvs2svn
+- Django
+- Nose
+- NumPy
+- PyCrypto
+- pyOpenSSL
+- PyXML
+- Setuptools
+- SQLAlchemy
+- SymPy
+- Twisted
+These applications pass all relevant tests when run under Unladen Swallow. Note
+that some tests that failed against our baseline of CPython 2.6.4 were disabled,
+as were tests that made assumptions about CPython internals such as exact
+bytecode numbers or bytecode format. Any package with disabled tests includes
+a ``README.unladen`` file that details the changes (for example,
+In addition, Unladen Swallow is tested automatically against an array of
+internal Google Python libraries and applications. These include Google's
+internal Python bindings for BigTable [#bigtable]_, the Mondrian code review
+application [#mondrian]_, and Google's Python standard library, among others.
+The changes needed to run these projects under Unladen Swallow have consistently
+fallen into one of three camps:
+- Adding CPython 2.6 C API compatibility. Since Google still primarily uses
+  CPython 2.4 internally, we have needed to convert uses of ``int`` to
+  ``Py_ssize_t`` and similar API changes.
+- Fixing or disabling explicit, incorrect tests of the CPython version number.
+- Conditionally disabling code that worked around or depended on bugs in
+  CPython 2.4 that have since been fixed.
+Testing against this wide range of public and proprietary applications and
+libraries has been instrumental in ensuring the correctness of Unladen Swallow.
+Testing has exposed bugs that we have duly corrected. Our automated regression
+testing regime has given us high confidence in our changes as we have moved
+forward.
+In addition to third-party testing, we have added further tests to CPython's
+test suite for corner cases of the language or implementation that we felt were
+untested or underspecified (for example, [#us-import-tests]_,
+[#us-tracing-tests]_). These have been especially important when implementing
+optimizations, helping make sure we have not accidentally broken the darker
+corners of Python.
+We have also constructed a test suite focused solely on the LLVM-based JIT
+compiler and the optimizations implemented for it [#us-test_llvm]_. Because of
+the complexity and subtlety inherent in writing an optimizing compiler, we have
+attempted to exhaustively enumerate the constructs, scenarios and corner cases
+we are compiling and optimizing. The JIT tests also include tests for things
+like the JIT hotness model, making it easier for future CPython developers to
+maintain and improve.
+We have recently begun using fuzz testing [#fuzz-testing]_ to stress-test the
+compiler. We have used both pyfuzz [#pyfuzz]_ and Fusil [#fusil]_ in the past,
+and we recommend they be introduced as an automated part of the CPython testing
+process.
+Known Incompatibilities
+The only application or library we know to not work with Unladen Swallow that
+does work with CPython 2.6.4 is Psyco [#psyco]_. We are aware of some libraries
+such as PyGame [#pygame]_ that work well with CPython 2.6.4, but suffer some
+degradation due to changes made in Unladen Swallow. We are tracking this issue
+[#us-background-thread-issue]_ and are working to resolve these instances of
+degradation.
+While Unladen Swallow is source-compatible with CPython 2.6.4, it is not
+binary compatible. C extension modules compiled against one will need to be
+recompiled to work with the other.
+Platform Support
+Unladen Swallow is inherently limited by the platform support provided by LLVM,
+especially LLVM's JIT compilation system [#llvm-hardware]_. LLVM's JIT has the
+best support on x86 and x86-64 systems, and these are the platforms where
+Unladen Swallow has received the most testing. We are confident in LLVM/Unladen
+Swallow's support for x86 and x86-64 hardware. PPC and ARM support exists, but
+is not widely used and may be buggy.
+Unladen Swallow is known to work on the following operating systems: Linux,
+Darwin, Windows. Unladen Swallow has received the most testing on Linux and
+Darwin, though it still builds and passes its tests on Windows.
+In order to support hardware and software platforms where LLVM's JIT does not
+work, Unladen Swallow provides a ``./configure --without-llvm`` option. This
+flag carves out any part of Unladen Swallow that depends on LLVM, yielding a
+Python binary that works and passes its tests, but has no performance
+advantages. This configuration is recommended for hardware unsupported by LLVM,
+or systems that care more about memory usage than performance.
+Impact on CPython Development
+Experimenting with Changes to Python or CPython Bytecode
+Unladen Swallow's JIT compiler operates on CPython bytecode, and as such, it is
+immune to Python language changes that only affect the parser.
+We recommend that changes to the CPython bytecode compiler or the semantics of
+individual bytecodes be prototyped in the interpreter loop first, then be ported
+to the JIT compiler once the semantics are clear. To make this easier, Unladen
+Swallow includes a ``--without-llvm`` configure-time option that strips out the
+JIT compiler and all associated infrastructure. This leaves the current burden
+of experimentation unchanged so that developers can prototype in the current
+low-barrier-to-entry interpreter loop.
+Unladen Swallow began implementing its JIT compiler by doing straightforward,
+naive translations from bytecode implementations into LLVM API calls. We found
+this process to be easily understood, and we recommend the same approach for
+CPython. We include several sample changes from the Unladen Swallow repository
+here as examples of this style of development: [#us-r359]_, [#us-r376]_,
+[#us-r417]_, [#us-r517]_.
+Debugging
+The Unladen Swallow team implemented changes to gdb to make it easier to use gdb
+to debug JIT-compiled Python code. These changes were released in gdb 7.0
+[#gdb70]_. They make it possible for gdb to identify and unwind past
+JIT-generated call stack frames. This allows gdb to continue to function as
+before for CPython development if one is changing, for example, the ``list``
+type or builtin functions.
+Example backtrace after our changes, where ``baz``, ``bar`` and ``foo`` are
+JIT-compiled:
+  Program received signal SIGSEGV, Segmentation fault.
+  0x00002aaaabe7d1a8 in baz ()
+  (gdb) bt
+  #0 0x00002aaaabe7d1a8 in baz ()
+  #1 0x00002aaaabe7d12c in bar ()
+  #2 0x00002aaaabe7d0aa in foo ()
+  #3 0x00002aaaabe7d02c in main ()
+  #4 0x0000000000b870a2 in llvm::JIT::runFunction (this=0x1405b70, F=0x14024e0, ArgValues=...)
+  at /home/rnk/llvm-gdb/lib/ExecutionEngine/JIT/JIT.cpp:395
+  #5 0x0000000000baa4c5 in llvm::ExecutionEngine::runFunctionAsMain
+  (this=0x1405b70, Fn=0x14024e0, argv=..., envp=0x7fffffffe3c0)
+  at /home/rnk/llvm-gdb/lib/ExecutionEngine/ExecutionEngine.cpp:377
+  #6 0x00000000007ebd52 in main (argc=2, argv=0x7fffffffe3a8,
+  envp=0x7fffffffe3c0) at /home/rnk/llvm-gdb/tools/lli/lli.cpp:208
+Previously, the JIT-compiled frames would have caused gdb to unwind incorrectly,
+generating lots of obviously-incorrect ``#6 0x00002aaaabe7d0aa in ?? ()``-style
+stack frames.
+- gdb 7.0 is able to correctly parse JIT-compiled stack frames, allowing full
+  use of gdb on non-JIT-compiled functions, that is, the vast majority of the
+  CPython codebase.
+- Disassembling inside a JIT-compiled stack frame automatically prints the full
+  list of instructions making up that function. This is an advance over the
+  state of gdb before our work: developers needed to guess the starting address
+  of the function and manually disassemble the assembly code.
+- The flexible underlying mechanism allows CPython to add more and more
+  information, and eventually to reach parity with C/C++ support in gdb for
+  JIT-compiled machine code.
+- gdb cannot print local variables or tell you what line you're currently
+  executing inside a JIT-compiled function. Nor can it step through
+  JIT-compiled code, except for one instruction at a time.
+- Not yet integrated with Apple's gdb or Microsoft's Visual Studio debuggers.
+The Unladen Swallow team is working with Apple to get these changes
+incorporated into their future gdb releases.
+Unladen Swallow integrates with oProfile 0.9.4 and newer [#oprofile]_ to support
+assembly-level profiling on Linux systems. This means that oProfile will
+correctly symbolize JIT-compiled functions in its reports.
+Example report, where the ``#u#``-prefixed symbol names are JIT-compiled Python
+functions::
+  $ opreport -l ./python | less
+  CPU: Core 2, speed 1600 MHz (estimated)
+  Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
+  samples % image name symbol name
+  79589 4.2329 python PyString_FromFormatV
+  62971 3.3491 python PyEval_EvalCodeEx
+  62713 3.3354 python tupledealloc
+  57071 3.0353 python _PyEval_CallFunction
+  50009 2.6597 24532.jo #u#force_unicode
+  47468 2.5246 python PyUnicodeUCS2_Decode
+  45829 2.4374 python PyFrame_New
+  45173 2.4025 python lookdict_string
+  43082 2.2913 python PyType_IsSubtype
+  39763 2.1148 24532.jo #u#render5
+  38145 2.0287 python _PyType_Lookup
+  37643 2.0020 python PyObject_GC_UnTrack
+  37105 1.9734 python frame_dealloc
+  36849 1.9598 python PyEval_EvalFrame
+  35630 1.8950 24532.jo #u#resolve
+  33313 1.7717 python PyObject_IsInstance
+  33208 1.7662 python PyDict_GetItem
+  33168 1.7640 python PyTuple_New
+  30458 1.6199 python PyCFunction_NewEx
+This support is functional, but as-yet unpolished. Unladen Swallow maintains a
+punchlist of items we feel are important to improve in our oProfile integration
+to make it more useful to core CPython developers [#us-oprofile-punchlist]_.
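+A report like the example above can be post-processed to isolate the
+JIT-compiled entries. The sketch below is a minimal illustration, assuming the
+four-column layout shown in the example (samples, percentage, image name,
+symbol name); ``jit_rows`` and ``REPORT_EXCERPT`` are hypothetical names, and
+the rows are copied from the report above.

```python
# A few rows copied from the example opreport output above.
REPORT_EXCERPT = """\
79589 4.2329 python PyString_FromFormatV
62971 3.3491 python PyEval_EvalCodeEx
50009 2.6597 24532.jo #u#force_unicode
39763 2.1148 24532.jo #u#render5
35630 1.8950 24532.jo #u#resolve
33313 1.7717 python PyObject_IsInstance
"""

JIT_PREFIX = "#u#"  # prefix marking JIT-compiled Python functions


def jit_rows(report_text):
    """Return (function name, samples, percent) for each JIT-compiled symbol."""
    rows = []
    for line in report_text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip headers and anything not in the 4-column layout
        samples, percent, _image, symbol = fields
        if symbol.startswith(JIT_PREFIX):
            name = symbol[len(JIT_PREFIX):]
            rows.append((name, int(samples), float(percent)))
    return rows


rows = jit_rows(REPORT_EXCERPT)
total_samples = sum(samples for _name, samples, _pct in rows)
total_percent = sum(pct for _name, _samples, pct in rows)

for name, samples, pct in rows:
    print(f"{name:15s} {samples:7d} {pct:7.4f}%")
print(f"JIT-compiled total: {total_samples} samples, {total_percent:.4f}%")
```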
+- Symbolization of JITted frames working in oProfile on Linux.
+- No work yet invested in improving symbolization of JIT-compiled frames for
+  Apple's Shark [#shark]_ or Microsoft's Visual Studio profiling tools.
+- Some polishing still desired for oProfile output.
+We recommend using oProfile 0.9.5 (and newer) to work around a now-fixed bug on
+x86-64 platforms in oProfile. oProfile 0.9.4 will work fine on 32-bit platforms,
+however.
+Given the ease of integrating oProfile with LLVM [#llvm-oprofile-change]_ and
+Unladen Swallow [#us-oprofile-change]_, integrating other profiling tools
+should be easy as well, provided they support a similar JIT interface
+[#oprofile-jit-interface]_.
+Addition of C++ to CPython
+In order to use LLVM, Unladen Swallow has introduced C++ into the core CPython
+tree and build process. This is an unavoidable part of depending on LLVM; though
+LLVM offers a C API [#llvm-c-api]_, it is limited and does not expose the
+functionality needed by CPython. Because of this, we have implemented the
+internal details of the Unladen Swallow JIT and its supporting infrastructure
+in C++. We do not propose converting the entire CPython codebase to C++.
+- Easy use of LLVM's full, powerful code generation and related APIs.
+- Convenient, abstract data structures simplify code.
+- C++ is limited to relatively small corners of the CPython codebase.
+- Developers must know two related languages, C and C++, to work on the full
+  range of CPython's internals.
+- A C++ style guide will need to be developed and enforced. See `Open Issues`_.
+Managing LLVM Releases, C++ API Changes
+LLVM is released on a regular six-month schedule. This means that LLVM may be
+released two or three times during the course of development of a CPython 3.x
+release. Each LLVM release brings newer and more powerful optimizations,
+improved platform support and more sophisticated code generation.
+LLVM releases usually include incompatible changes to the LLVM C++ API; the
+release notes for LLVM 2.6 [#llvm-26-whatsnew]_ include a list of
+intentionally-introduced incompatibilities. Unladen Swallow has tracked LLVM
+trunk closely over the course of development. Our experience has been
+that LLVM API changes are obvious and easily or mechanically remedied. We
+include two such changes from the Unladen Swallow tree as references here:
+[#us-llvm-r820]_, [#us-llvm-r532]_.
+Due to API incompatibilities, we recommend that an LLVM-based CPython target
+compatibility with a single version of LLVM at a time. This will lower the
+overhead on the core development team. Pegging to an LLVM version should not be
+a problem from a packaging perspective, because pre-built LLVM packages
+generally become available via standard system package managers fairly quickly
+following an LLVM release, and failing that, llvm.org itself includes binary
+releases.
+Pre-built LLVM packages are available from MacPorts [#llvm-macports]_ for
+Darwin, and from most major Linux distributions ([#llvm-ubuntu]_,
+[#llvm-debian]_, [#llvm-fedora]_). LLVM itself provides additional binaries,
+such as for MinGW [#llvm-mingw]_.
+LLVM is currently intended to be statically linked; this means that binary
+releases of CPython will include the relevant parts (not all!) of LLVM. This
+will increase the binary size, as noted above.
+Unladen Swallow has tasked a full-time engineer with fixing any remaining
+critical issues in LLVM before LLVM's 2.7 release. We would like CPython 3.x to
+be able to depend on a released version of LLVM, rather than closely tracking
+LLVM trunk as Unladen Swallow has done. We believe we will finish this work
+before the release of LLVM 2.7, expected in May 2010.
+Building CPython
+In addition to a runtime dependency on LLVM, Unladen Swallow includes a
+build-time dependency on Clang [#clang]_, an LLVM-based C/C++ compiler. We use
+this to compile parts of the C-language Python runtime to LLVM's intermediate
+representation; this allows us to perform cross-language inlining, yielding
+increased performance. Clang is not required to run Unladen Swallow. Clang
+binary packages are available from most major Linux distributions (for example,
+Debian [#clang-debian]_).
+We examined the impact of Unladen Swallow on the time needed to build Python,
+including configure, full builds and incremental builds after touching a single
+C source file.
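++-------------+---------------+---------------+----------------------+
+| ./configure | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r988 |
++=============+===============+===============+======================+
+| Run 1       | 0m20.795s     | 0m16.558s     | 0m15.477s            |
++-------------+---------------+---------------+----------------------+
+| Run 2       | 0m15.255s     | 0m16.349s     | 0m15.391s            |
++-------------+---------------+---------------+----------------------+
+| Run 3       | 0m15.228s     | 0m16.299s     | 0m15.528s            |
++-------------+---------------+---------------+----------------------+
+
++-------------+---------------+---------------+----------------------+
+| Full make   | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r988 |
++=============+===============+===============+======================+
+| Run 1       | 1m30.776s     | 1m22.367s     | 1m54.053s            |
++-------------+---------------+---------------+----------------------+
+| Run 2       | 1m21.374s     | 1m22.064s     | 1m49.448s            |
++-------------+---------------+---------------+----------------------+
+| Run 3       | 1m22.047s     | 1m23.645s     | 1m49.305s            |
++-------------+---------------+---------------+----------------------+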
+Full builds take a hit due to a) additional ``.cc`` files needed for LLVM
+interaction, b) statically linking LLVM into ``libpython``, and c) compiling
+parts of the Python runtime to LLVM IR to enable cross-language inlining.
+Incremental builds, however, are significantly slower. The table below shows
+incremental rebuild times after touching ``Objects/listobject.c``.
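++-------------+---------------+---------------+----------------------+
+| Incr make   | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r988 |
++=============+===============+===============+======================+
+| Run 1       | 0m1.854s      | 0m1.456s      | 0m24.464s            |
++-------------+---------------+---------------+----------------------+
+| Run 2       | 0m1.437s      | 0m1.442s      | 0m24.416s            |
++-------------+---------------+---------------+----------------------+
+| Run 3       | 0m1.440s      | 0m1.425s      | 0m24.352s            |
++-------------+---------------+---------------+----------------------+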
+As with full builds, this extra time comes from a) additional ``.cc`` files
+needed for LLVM interaction, and b) statically linking LLVM into ``libpython``.
+If ``libpython`` were linked shared against LLVM, this overhead would go down.
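+To put the build-time tables in perspective, the following sketch averages the
+three runs of each build and expresses Unladen Swallow r988 as a multiple of
+CPython 3.1.1. The timings are copied from the tables above; the helper names
+(``seconds``, ``mean``) are hypothetical, and the choice of CPython 3.1.1 as
+the baseline is made here only for illustration.

```python
def seconds(stamp):
    """Convert a `time`-style stamp such as '1m54.053s' into seconds."""
    minutes, rest = stamp.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


def mean(stamps):
    """Average a list of `time`-style stamps, in seconds."""
    values = [seconds(s) for s in stamps]
    return sum(values) / len(values)


# Timings copied from the "Full make" and "Incr make" tables.
full_make = {
    "CPython 3.1.1": ["1m22.367s", "1m22.064s", "1m23.645s"],
    "Unladen Swallow r988": ["1m54.053s", "1m49.448s", "1m49.305s"],
}
incr_make = {
    "CPython 3.1.1": ["0m1.456s", "0m1.442s", "0m1.425s"],
    "Unladen Swallow r988": ["0m24.464s", "0m24.416s", "0m24.352s"],
}

full_slowdown = (mean(full_make["Unladen Swallow r988"])
                 / mean(full_make["CPython 3.1.1"]))
incr_slowdown = (mean(incr_make["Unladen Swallow r988"])
                 / mean(incr_make["CPython 3.1.1"]))

print(f"full build:        {full_slowdown:.2f}x CPython 3.1.1")
print(f"incremental build: {incr_slowdown:.2f}x CPython 3.1.1")
```

+On these numbers the full build is roughly a third slower, while the
+incremental rebuild is slower by more than an order of magnitude.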
+Incremental builds of Unladen Swallow also currently (as of r988) suffer from a
+known bug in the Unladen Swallow ``Makefile`` [#rebuild-too-much]_ where too
+many ``.cc`` files are recompiled. We consider this a blocking issue for full
+merger with the ``py3k`` branch.
+Proposed Merge Plan
+We propose focusing our efforts on eventual merger with CPython's 3.x line of
+development. The BDFL has indicated that 2.7 is to be the final release of
+CPython's 2.x line of development [#bdfl-27-final]_, and since 2.7 alpha 1 has
+already been released [#cpy-27a1]_, we have missed the window. Python 3 is the
+future, and that is where we will target our performance efforts.
+We recommend the following plan for merger of Unladen Swallow into the CPython
+source tree:
+- Creation of a branch in the CPython SVN repository to work in, call it
+  ``py3k-jit`` as a strawman. This will be a branch of the CPython ``py3k``
+  branch.
+- We will keep this branch closely integrated with ``py3k``. The further we
+  deviate, the harder our work will be.
+- Any JIT-related patches will go into the ``py3k-jit`` branch.
+- Non-JIT-related patches will go into the ``py3k`` branch (once reviewed and
+  approved) and be merged back into the ``py3k-jit`` branch.
+- Potentially-contentious issues, such as the introduction of new command line
+  flags or environment variables, will be discussed on python-dev.
+Because Google uses CPython 2.x internally, Unladen Swallow is based on CPython
+2.6. We would need to port our compiler to Python 3; this would be done as
+patches are applied to the ``py3k-jit`` branch, so that the branch remains a
+consistent implementation of Python 3 at all times.
+We believe this approach will be minimally disruptive to the 3.2 or 3.3 release
+process while we iron out any remaining issues blocking final merger into
+``py3k``. Unladen Swallow maintains a punchlist of known issues needed before
+final merger [#us-punchlist]_, which includes all problems mentioned in this
+PEP; we trust the CPython community will have its own concerns.
+See the `Open Issues`_ section for questions about code review policy for the
+``py3k-jit`` branch.
+Future Work
+A JIT compiler is an extremely flexible tool, and we have by no means exhausted
+its full potential. Unladen Swallow maintains a list of yet-to-be-implemented
+performance optimizations [#us-perf-punchlist]_ that the team has not yet
+had time to fully implement. Examples:
+- Python/Python inlining [#inlining]_. Our compiler currently performs no
+  inlining between pure-Python functions. Work on this is on-going
+  [#us-inlining]_.
+- Unboxing [#unboxing]_. Unboxing is critical for numerical performance. PyPy
+  in particular has demonstrated the value of unboxing to heavily-numeric
+  workloads.
+- Recompilation, adaptation. Unladen Swallow currently only compiles a Python
+  function once, based on its usage pattern up to that point. If the usage
+  pattern changes, limitations in LLVM [#us-recompile-issue]_ prevent us from
+  recompiling the function to better serve the new usage pattern.
+- JIT-compile regular expressions. Modern JavaScript engines reuse their JIT
+  compilation infrastructure to boost regex performance [#us-regex-perf]_.
+  Unladen Swallow has developed benchmarks for Python regular expression
+  performance ([#us-bm-re-compile]_, [#us-bm-re-v8]_, [#us-bm-re-effbot]_), but
+  work on regex performance is still at an early stage [#us-regex-issue]_.
+- Trace compilation [#traces-waste-of-time]_, [#traces-explicit-pipeline]_.
+  Based on the results of PyPy and Tracemonkey [#tracemonkey]_, we believe that
+  a CPython JIT should incorporate trace compilation to some degree. We
+  initially avoided a purely-tracing JIT compiler in favor of a simpler,
+  function-at-a-time compiler. However, this function-at-a-time compiler has laid
+  the groundwork for a future tracing compiler implemented in the same terms.
+This list is by no means exhaustive. There is a vast literature on optimizations
+for dynamic languages that could and should be implemented in terms of Unladen
+Swallow's LLVM-based JIT compiler [#us-relevantpapers]_.
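+The "Recompilation, adaptation" item above can be pictured with a toy model.
+The sketch below is purely illustrative and is not Unladen Swallow's
+implementation: ``CompileOnce``, its ``threshold`` parameter and its counters
+are hypothetical. It records argument types until a function becomes "hot",
+freezes one specialization based on the usage seen so far, and afterwards only
+counts the calls that no longer match, since it never respecializes.

```python
from collections import Counter


class CompileOnce:
    """Toy model of a function that is specialized exactly once."""

    def __init__(self, func, threshold=3):
        self.func = func
        self.threshold = threshold      # calls before "compilation" (assumed)
        self.calls = 0
        self.seen_types = Counter()     # argument-type signatures observed
        self.specialized_for = None     # frozen after the first specialization
        self.guard_failures = 0         # calls that miss the frozen assumption

    def __call__(self, *args):
        signature = tuple(type(arg) for arg in args)
        if self.specialized_for is None:
            # Still gathering data: record the usage pattern.
            self.calls += 1
            self.seen_types[signature] += 1
            if self.calls >= self.threshold:
                # "Compile" once for the most common signature seen so far.
                self.specialized_for = self.seen_types.most_common(1)[0][0]
        elif signature != self.specialized_for:
            # The usage pattern changed, but the specialization stays frozen.
            self.guard_failures += 1
        return self.func(*args)


add = CompileOnce(lambda a, b: a + b, threshold=3)
for _ in range(3):
    add(1, 2)          # integer-heavy warm-up
for _ in range(5):
    add(1.5, 2.5)      # workload later shifts to floats

print(add.specialized_for, add.guard_failures)
```

+A compiler able to recompile would instead reset ``specialized_for`` once the
+guard failures became frequent, which is the adaptation the list item
+describes as future work.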
+Open Issues
+- *Code review policy for the ``py3k-jit`` branch.* How does the CPython
+  community want us to proceed with respect to checkins on the ``py3k-jit``
+  branch? Pre-commit reviews? Post-commit reviews?
+  Unladen Swallow has enforced pre-commit reviews in our trunk, but we realize
+  this may lead to long review/checkin cycles in a purely-volunteer
+  organization. We would like a non-Google-affiliated member of the CPython
+  development team to review our work for correctness and compatibility, but we
+  realize this may not be possible for every commit.
+- *How to link LLVM.* Should we change LLVM to better support shared linking,
+  and then use shared linking to link the parts of it we need into CPython?
+- *Prioritization of remaining issues.* We would like input from the CPython
+  development team on how to prioritize the remaining issues in the Unladen
+  Swallow codebase. Some issues, like memory usage, are obviously critical before
+  merger with ``py3k``, but others may fall into a "nice to have" category that
+  could be deferred for resolution in a future CPython 3.x release.
+- *Create a C++ style guide.* Should PEP 7 be extended to include C++, or
+  should a separate C++ style PEP be created? Unladen Swallow maintains its own
+  style guide [#us-styleguide]_, which may serve as a starting point; the
+  Unladen Swallow style guide is based on both LLVM's [#llvm-styleguide]_ and
+  Google's [#google-styleguide]_ C++ style guides.
+Unladen Swallow Community
+We would like to thank the community of developers who have contributed to
+Unladen Swallow, in particular: James Abbatiello, Joerg Blank, Eric Christopher,
+Alex Gaynor, Chris Lattner, Nick Lewycky, Evan Phoenix and Thomas Wouters.
+All work on Unladen Swallow is licensed to the Python Software Foundation (PSF)
+under the terms of the Python Software Foundation License v2 [#psf-lic]_ under
+the umbrella of Google's blanket Contributor License Agreement with the PSF.
+.. [#us]
+   http://code.google.com/p/unladen-swallow/
+.. [#llvm]
+   http://llvm.org/
+.. [#clang]
+   http://clang.llvm.org/
+.. [#tested-apps]
+   http://code.google.com/p/unladen-swallow/wiki/Testing
+.. [#llvm-hardware]
+   http://llvm.org/docs/GettingStarted.html#hardware
+.. [#rebuild-too-much]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=115
+.. [#llvm-c-api]
+   http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/
+.. [#llvm-26-whatsnew]
+   http://llvm.org/releases/2.6/docs/ReleaseNotes.html#whatsnew
+.. [#us-llvm-r820]
+   http://code.google.com/p/unladen-swallow/source/detail?r=820
+.. [#us-llvm-r532]
+   http://code.google.com/p/unladen-swallow/source/detail?r=532
+.. [#llvm-macports]
+   http://trac.macports.org/browser/trunk/dports/lang/llvm/Portfile
+.. [#llvm-ubuntu]
+   http://packages.ubuntu.com/karmic/llvm
+.. [#llvm-debian]
+   http://packages.debian.org/unstable/devel/llvm
+.. [#clang-debian]
+   http://packages.debian.org/sid/clang
+.. [#llvm-fedora]
+   http://koji.fedoraproject.org/koji/buildinfo?buildID=134384
+.. [#gdb70]
+   http://www.gnu.org/software/gdb/download/ANNOUNCEMENT
+.. [#oprofile]
+   http://oprofile.sourceforge.net/news/
+.. [#us-oprofile-punchlist]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=63
+.. [#shark]
+   http://developer.apple.com/tools/sharkoptimize.html
+.. [#llvm-oprofile-change]
+   http://llvm.org/viewvc/llvm-project?view=rev&revision=75279
+.. [#us-oprofile-change]
+   http://code.google.com/p/unladen-swallow/source/detail?r=986
+.. [#oprofile-jit-interface]
+   http://oprofile.sourceforge.net/doc/devel/jit-interface.html
+.. [#llvm-mingw]
+   http://llvm.org/releases/download.html
+.. [#us-r359]
+   http://code.google.com/p/unladen-swallow/source/detail?r=359
+.. [#us-r376]
+   http://code.google.com/p/unladen-swallow/source/detail?r=376
+.. [#us-r417]
+   http://code.google.com/p/unladen-swallow/source/detail?r=417
+.. [#us-r517]
+   http://code.google.com/p/unladen-swallow/source/detail?r=517
+.. [#bdfl-27-final]
+   http://mail.python.org/pipermail/python-dev/2010-January/095682.html
+.. [#cpy-27a1]
+   http://www.python.org/dev/peps/pep-0373/
+.. [#cpy-32]
+   http://www.python.org/dev/peps/pep-0392/
+.. [#us-punchlist]
+   http://code.google.com/p/unladen-swallow/issues/list?q=label:Merger
+.. [#us-binary-size]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=118
+.. [#us-issue-startup-time]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=64
+.. [#zope-interface]
+   http://www.zope.org/Products/ZopeInterface
+.. [#bigtable]
+   http://en.wikipedia.org/wiki/BigTable
+.. [#mondrian]
+   http://www.niallkennedy.com/blog/2006/11/google-mondrian.html
+.. [#us-sqlalchemy-readme]
+   http://code.google.com/p/unladen-swallow/source/browse/tests/lib/sqlalchemy/README.unladen
+.. [#us-test_llvm]
+   http://code.google.com/p/unladen-swallow/source/browse/trunk/Lib/test/test_llvm.py
+.. [#fuzz-testing]
+   http://en.wikipedia.org/wiki/Fuzz_testing
+.. [#pyfuzz]
+   http://bitbucket.org/ebo/pyfuzz/overview/
+.. [#fusil]
+   http://lwn.net/Articles/322826/
+.. [#us-memory-issue]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=68
+.. [#us-benchmarks]
+   http://code.google.com/p/unladen-swallow/wiki/Benchmarks
+.. [#students-t-test]
+   http://en.wikipedia.org/wiki/Student's_t-test
+.. [#smaps]
+   http://bmaurer.blogspot.com/2006/03/memory-usage-with-smaps.html
+.. [#us-background-thread]
+   http://code.google.com/p/unladen-swallow/source/browse/branches/background-thread
+.. [#us-background-thread-issue]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=40
+.. [#us-import-tests]
+   http://code.google.com/p/unladen-swallow/source/detail?r=888
+.. [#us-tracing-tests]
+   http://code.google.com/p/unladen-swallow/source/diff?spec=svn576&r=576&format=side&path=/trunk/Lib/test/test_trace.py
+.. [#us-perf-punchlist]
+   http://code.google.com/p/unladen-swallow/issues/list?q=label:Performance
+.. [#jit]
+   http://en.wikipedia.org/wiki/Just-in-time_compilation
+.. [#urs-self]
+   http://research.sun.com/self/papers/urs-thesis.html
+.. [#us-projectplan]
+   http://code.google.com/p/unladen-swallow/wiki/ProjectPlan
+.. [#us-relevantpapers]
+   http://code.google.com/p/unladen-swallow/wiki/RelevantPapers
+.. [#us-llvm-notes]
+   http://code.google.com/p/unladen-swallow/source/browse/trunk/Python/llvm_notes.txt
+.. [#psf-lic]
+   http://www.python.org/psf/license/
+.. [#v8]
+   http://code.google.com/p/v8/
+.. [#squirrelfishextreme]
+   http://webkit.org/blog/214/introducing-squirrelfish-extreme/
+.. [#rubinius]
+   http://rubini.us/
+.. [#parrot-on-llvm]
+   http://lists.parrot.org/pipermail/parrot-dev/2009-September/002811.html
+.. [#macruby]
+   http://www.macruby.org/
+.. [#hotspot]
+   http://en.wikipedia.org/wiki/HotSpot
+.. [#psyco]
+   http://psyco.sourceforge.net/
+.. [#pypy]
+   http://codespeak.net/pypy/dist/pypy/doc/
+.. [#inlining]
+   http://en.wikipedia.org/wiki/Inline_expansion
+.. [#unboxing]
+   http://en.wikipedia.org/wiki/Object_type_(object-oriented_programming)
+.. [#us-inlining]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=86
+.. [#us-styleguide]
+   http://code.google.com/p/unladen-swallow/wiki/StyleGuide
+.. [#llvm-styleguide]
+   http://llvm.org/docs/CodingStandards.html
+.. [#google-styleguide]
+   http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml
+.. [#us-recompile-issue]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=41
+.. [#us-regex-perf]
+   http://code.google.com/p/unladen-swallow/wiki/ProjectPlan#Regular_Expressions
+.. [#us-bm-re-compile]
+   http://code.google.com/p/unladen-swallow/source/browse/tests/performance/bm_regex_compile.py
+.. [#us-bm-re-v8]
+   http://code.google.com/p/unladen-swallow/source/browse/tests/performance/bm_regex_v8.py
+.. [#us-bm-re-effbot]
+   http://code.google.com/p/unladen-swallow/source/browse/tests/performance/bm_regex_effbot.py
+.. [#us-regex-issue]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=13
+.. [#pygame]
+   http://www.pygame.org/
+.. [#numpy]
+   http://numpy.scipy.org/
+.. [#pypy-bmarks]
+   http://codespeak.net:8099/plotsummary.html
+.. [#llvm-users]
+   http://llvm.org/Users.html
+.. [#hlvm]
+   http://www.ffconsultancy.com/ocaml/hlvm/
+.. [#llvm-far-call-issue]
+   http://llvm.org/PR5201
+.. [#llvm-jmm-rev]
+   http://llvm.org/viewvc/llvm-project?view=rev&revision=76828
+.. [#llvm-memleak-rev]
+   http://llvm.org/viewvc/llvm-project?rev=91611&view=rev
+.. [#llvm-globaldce-rev]
+   http://llvm.org/viewvc/llvm-project?rev=85182&view=rev
+.. [#llvm-availext-issue]
+   http://llvm.org/PR5735
+.. [#us-specialization-issue]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=73
+.. [#us-direct-calling-issue]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=88
+.. [#us-fast-globals-issue]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=67
+.. [#traces-waste-of-time]
+   http://www.ics.uci.edu/~franz/Site/pubs-pdf/C44Prepub.pdf
+.. [#traces-explicit-pipeline]
+   http://www.ics.uci.edu/~franz/Site/pubs-pdf/ICS-TR-07-12.pdf
+.. [#tracemonkey]
+   https://wiki.mozilla.org/JavaScript:TraceMonkey
+.. [#llvm-langref]
+   http://llvm.org/docs/LangRef.html
+.. [#us-wider-perf-issue]
+   http://code.google.com/p/unladen-swallow/issues/detail?id=120
+.. [#us-nbody]
+   http://code.google.com/p/unladen-swallow/source/browse/tests/performance/bm_nbody.py
+This document has been placed in the public domain.
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End:
