[pypy-dev] Performance, json and standard library

Bob Ippolito bob at redivi.com
Tue Sep 27 00:13:25 CEST 2011


You should also try the master branch of simplejson; the
_pypy_speedups branch is not necessarily better (which is why it is
not master).

On Mon, Sep 26, 2011 at 2:39 PM, Matthew Kaniaris <mkaniaris at gmail.com> wrote:
> I did some testing to see where we stand on JSON.  PyPy is built from
> trunk, and the simplejson used with PyPy is the _pypy_speedups
> branch.  The speedups make PyPy about 2x faster on dumps than with the
> stdlib json module and slightly slower on loads, but PyPy is still up to
> ten times slower than CPython with simplejson on the 32kb file.  I'll try
> profiling the speedups branch to see if there is any low-hanging fruit
> left, but I doubt we will get another 50% improvement out of it.
>
> -kans
>
> results:
> python using json:
>
> /home/test/3.4kb.json
> loads: 5 loops, best of 1000: 953 usec per loop
>
> dumps: 5 loops, best of 1000: 706 usec per loop
>
> /home/test/32kb.json
> loads: 5 loops, best of 1000: 10.9 msec per loop
>
> dumps: 5 loops, best of 1000: 9.13 msec per loop
>
> -------------------------
> python using simplejson:
>
> /home/test/3.4kb.json
> loads: 5 loops, best of 1000: 41.2 usec per loop
>
> dumps: 5 loops, best of 1000: 56 usec per loop
>
> /home/test/32kb.json
> loads: 5 loops, best of 1000: 604 usec per loop
>
> dumps: 5 loops, best of 1000: 391 usec per loop
>
> -------------------------
> pypy using json:
>
> /home/test/3.4kb.json
> loads: 5 loops, best of 1000: 146 usec per loop
>
> dumps: 5 loops, best of 1000: 429 usec per loop
>
> /home/test/32kb.json
> loads: 5 loops, best of 1000: 2.93 msec per loop
>
> dumps: 5 loops, best of 1000: 7.16 msec per loop
>
> -------------------------
> pypy using simplejson:
>
> /home/test/3.4kb.json
> loads: 5 loops, best of 1000: 197 usec per loop
>
> dumps: 5 loops, best of 1000: 148 usec per loop
>
> /home/test/32kb.json
> loads: 5 loops, best of 1000: 3.47 msec per loop
>
> dumps: 5 loops, best of 1000: 3.2 msec per loop
>
>
>
> On Sun, Sep 25, 2011 at 1:53 PM, Alex Gaynor <alex.gaynor at gmail.com> wrote:
>>
>>
>> On Sun, Sep 25, 2011 at 1:49 PM, Bob Ippolito <bob at redivi.com> wrote:
>>>
>>> simplejson would be a good target for changes that would not be easy
>>> to implement on top of the stdlib json. I'd be happy to accept any
>>> contributions. I failed to make big differences in performance when I
>>> tried at PyCon (at least none that didn't regress performance for some
>>> people). The other things I'm missing are a good suite of documents to
>>> benchmark with, and a good tool to run the benchmarks so it's easy to
>>> see if incremental changes are better or worse.
>>>
>>> However, if RPython is required to make it faster, maybe implementing
>>> _json for the stdlib would actually be best.
>>>
>>> On Sun, Sep 25, 2011 at 10:30 AM, Zooko O'Whielacronx <zooko at zooko.com>
>>> wrote:
>>> > But don't people who need better json performance use simplejson
>>> > explicitly instead of using the standard library's json?
>>> >
>>> > Regards,
>>> >
>>> > Zooko
>>> > _______________________________________________
>>> > pypy-dev mailing list
>>> > pypy-dev at python.org
>>> > http://mail.python.org/mailman/listinfo/pypy-dev
>>> >
>>
>> For what it's worth, I think we can get there, without needing to write any
>> RPython, through a combination of careful Python, and more JIT
>> optimizations.  For example, I'd like to get the code input[i:i+4] == "NULL"
>> to eventually generate:
>>   read str length
>>   check length >= 4
>>   read 4 bytes out of input (single MOVL)
>>   integer compare to ('N' << 0) | ('U' << 8) | ('L' << 16) | ('L' << 24)
>> In total about 7 x86 instructions.  I think this is definitely possible!
>> Alex
>>
>> --
>> "I disapprove of what you say, but I will defend to the death your right to
>> say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
>> "The people's good is the highest law." -- Cicero
>>
>>
>>
>>
>
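
The timings quoted above have the shape of timeit output ("5 loops, best
of 1000"), and the thread asks for a tool that makes it easy to see whether
an incremental change is better or worse. A minimal sketch of such a harness
follows. It is not the script used for the numbers in this thread: the names
best_usec_per_loop, bench and make_sample are hypothetical, and the synthetic
document only stands in for the 3.4kb and 32kb files, which are not available
here.

```python
import json
import timeit


def best_usec_per_loop(func, number=5, repeat=1000):
    """Return the best per-call time in microseconds, timeit-style."""
    # timeit.repeat returns one total time (in seconds) per repetition,
    # each covering `number` calls of func.
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return min(timings) / number * 1e6


def bench(module, text, number=5, repeat=1000):
    """Time module.loads and module.dumps on a single JSON document."""
    obj = module.loads(text)
    return {
        "loads": best_usec_per_loop(lambda: module.loads(text), number, repeat),
        "dumps": best_usec_per_loop(lambda: module.dumps(obj), number, repeat),
    }


def make_sample(n_records=50):
    """Build a synthetic JSON document (an assumption, not the thread's data)."""
    records = [
        {"id": i, "name": "item %d" % i, "tags": ["a", "b", None], "score": i * 0.5}
        for i in range(n_records)
    ]
    return json.dumps(records)


if __name__ == "__main__":
    modules = [json]
    try:
        import simplejson  # optional; compared only if it is installed
        modules.append(simplejson)
    except ImportError:
        pass

    text = make_sample()
    for module in modules:
        result = bench(module, text, repeat=50)
        for name in ("loads", "dumps"):
            print("%s %s: %.1f usec per loop" % (module.__name__, name, result[name]))
```

Running the same script under CPython and PyPy, and against different
simplejson branches, gives directly comparable numbers. On PyPy the JIT needs
warm-up, so taking the best of many repetitions favours the compiled path.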

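
The JIT idea quoted above (compare input[i:i+4] == "NULL" as one 4-byte
integer compare instead of a slice plus string compare) can be illustrated in
plain Python. This is only a sketch of the equivalence, assuming a bytes
input and a non-negative index; it is not PyPy's implementation, it is not
expected to be faster when written this way in Python, and the names
NULL_WORD, matches_null_slice and matches_null_word are hypothetical.

```python
import struct

# The four characters packed as a little-endian 32-bit integer, exactly as in
# the quoted expression ('N' << 0) | ('U' << 8) | ('L' << 16) | ('L' << 24).
NULL_WORD = ord("N") | (ord("U") << 8) | (ord("L") << 16) | (ord("L") << 24)


def matches_null_slice(data, i):
    """Straightforward version: slice four bytes and compare."""
    return data[i:i + 4] == b"NULL"


def matches_null_word(data, i):
    """Same test as a length check plus a single integer compare.

    Assumes i is a non-negative index into data.
    """
    # "read str length" and "check length >= 4" (relative to position i)
    if len(data) - i < 4:
        return False
    # "read 4 bytes out of input" as one little-endian unsigned 32-bit word
    (word,) = struct.unpack_from("<I", data, i)
    # "integer compare" against the precomputed constant
    return word == NULL_WORD
```

Both functions agree for every non-negative i; the second one simply spells
out the steps that the generated machine code would perform.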
