[Python-ideas] PyParallel update (was: solving multi-core Python)

Trent Nelson trent at snakebite.org
Tue Jun 23 16:03:55 CEST 2015


On Tue, Jun 23, 2015 at 09:53:01AM -0400, Trent Nelson wrote:
> On Sat, Jun 20, 2015 at 03:42:33PM -0600, Eric Snow wrote:
> > Furthermore, removing the GIL is perhaps an obvious solution but not
> > the only one.  Others include Trent Nelson's PyParallel, STM, and
> > other Python implementations.
> 
> So, I've been sprinting relentlessly on PyParallel since Christmas, and
> recently reached my v0.0 milestone of being able to handle all the TEFB
> tests, plus get the "instantaneous wiki search" thing working too.
> 
> The TEFB (TechEmpower Framework Benchmarks) implementation is here:
> 
> https://bitbucket.org/tpn/pyparallel/src/8528b11ba51003a9821ceb75683ee96ed33db28a/examples/tefb/tefb.py?at=3.3-px
> 
> (The aim was to have it compete in this:
> https://www.techempower.com/benchmarks/#section=data-r10, but
> unfortunately they broke their Windows support after round 9, so
> there's no way to get PyParallel into the official results without
> fixing that first.)
> 
> The wiki thing is here:
> 
> https://bitbucket.org/tpn/pyparallel/src/8528b11ba51003a9821ceb75683ee96ed33db28a/examples/wiki/wiki.py?at=3.3-px
> 
> I particularly like the wiki example as it leverages a lot of benefits
> afforded by PyParallel's approach to parallelism, concurrency and
> asynchronous I/O:
>     - Load a digital search trie (datrie.Trie) that contains every
>       Wikipedia title and the byte-offset within the wiki.xml where
>       the title was found.  (Once loaded the RSS of python.exe is about
>       11GB; the trie itself has about 16 million items in it.)

Oops, I was off by about 12 million; the trie actually holds close to
28 million items:

    C:\PyParallel33>python.exe
    PyParallel 3.3.5 (3.3-px:829ae345012e+, Jun 15 2015, 16:54:16) [MSC v.1600 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.chdir('examples\\wiki')
    >>> import wiki as w
    About to load titles trie, this will take a while...
    >>> len(w.titles)
    27962169
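
For anyone trying to picture the title-to-byte-offset lookup described in
the quoted list above, here is a minimal sketch.  It is not the code in
wiki.py: it uses only the standard library (a sorted list plus bisect) as
a stand-in for datrie.Trie, and SAMPLE_XML, build_index and prefix_search
are hypothetical names made up for illustration.

```python
import bisect
import re

# Hypothetical stand-in for wiki.xml; the real dump is vastly larger.
SAMPLE_XML = (
    b"<mediawiki>\n"
    b"<page><title>Python</title><text>A language.</text></page>\n"
    b"<page><title>Pyramid</title><text>A shape.</text></page>\n"
    b"<page><title>Parallel computing</title><text>Many cores.</text></page>\n"
    b"</mediawiki>\n"
)

TITLE_RE = re.compile(rb"<title>(.*?)</title>")


def build_index(xml_bytes):
    """Return (titles, offsets), sorted by title.

    offsets[i] is the byte offset of the <title> tag for titles[i].
    """
    pairs = sorted(
        (m.group(1).decode("utf-8"), m.start())
        for m in TITLE_RE.finditer(xml_bytes)
    )
    titles = [title for title, _ in pairs]
    offsets = [offset for _, offset in pairs]
    return titles, offsets


def prefix_search(titles, offsets, prefix):
    """Return [(title, offset), ...] for every title starting with prefix."""
    # In a sorted list, all strings sharing a prefix sit next to each other,
    # beginning at the insertion point of the prefix itself.
    i = bisect.bisect_left(titles, prefix)
    matches = []
    while i < len(titles) and titles[i].startswith(prefix):
        matches.append((titles[i], offsets[i]))
        i += 1
    return matches


if __name__ == "__main__":
    titles, offsets = build_index(SAMPLE_XML)
    for title, offset in prefix_search(titles, offsets, "Py"):
        # Each offset points straight at the matching <title> tag.
        print(title, offset, SAMPLE_XML[offset:offset + 20])
```

Given an offset, a server can seek directly to that position in the XML
file instead of scanning it.  A real trie does the prefix walk without a
sorted copy of every key, which matters a great deal at 28 million titles.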

