[pypy-dev] Performance, json and standard library

Maciej Fijalkowski fijall at gmail.com
Tue Sep 27 03:18:02 CEST 2011


On Mon, Sep 26, 2011 at 7:13 PM, Bob Ippolito <bob at redivi.com> wrote:
> You should also try the master branch of simplejson, the
> _pypy_speedups branch is not necessarily better (which is why it is
> not master).

You should also look at https://bugs.pypy.org/issue866 for various patches.
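
To check whether master, the _pypy_speedups branch, or any of those
patches actually helps, a small timing harness is enough. The following is
only a minimal sketch, not the script used for the numbers quoted below:
the names bench and fmt_time and the synthetic sample document are
assumptions made for illustration, and real runs should use representative
JSON files.

```python
import json
import timeit


def fmt_time(seconds):
    """Format a duration the way the quoted results do (usec / msec)."""
    if seconds >= 1.0:
        return "%.3g sec" % seconds
    if seconds >= 1e-3:
        return "%.3g msec" % (seconds * 1e3)
    return "%.3g usec" % (seconds * 1e6)


def bench(module, text, repeat=5, number=1000):
    """Return (loads, dumps) best per-call times in seconds.

    `module` is anything exposing loads/dumps, e.g. json or simplejson.
    """
    obj = module.loads(text)
    loads_best = min(timeit.repeat(lambda: module.loads(text),
                                   repeat=repeat, number=number)) / number
    dumps_best = min(timeit.repeat(lambda: module.dumps(obj),
                                   repeat=repeat, number=number)) / number
    return loads_best, dumps_best


if __name__ == "__main__":
    # Synthetic document (an assumption); swap in real files when comparing.
    sample = json.dumps({"items": [{"id": i, "name": "item %d" % i,
                                    "tags": ["a", "b"], "value": None}
                                   for i in range(100)]})
    modules = [("json", json)]
    try:
        import simplejson  # optional; skipped when not installed
        modules.append(("simplejson", simplejson))
    except ImportError:
        pass
    for name, module in modules:
        loads_t, dumps_t = bench(module, sample)
        print("%s loads: %s per loop" % (name, fmt_time(loads_t)))
        print("%s dumps: %s per loop" % (name, fmt_time(dumps_t)))
```

Running the same file under both CPython and PyPy gives directly
comparable numbers. Under PyPy the first iterations include JIT warm-up,
so taking the best of several repeats matters.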

>
> On Mon, Sep 26, 2011 at 2:39 PM, Matthew Kaniaris <mkaniaris at gmail.com> wrote:
>> I did some testing to see where we stand on JSON.  PyPy was built from
>> trunk, and the simplejson used with PyPy is the _pypy_speedups
>> branch.  Compared with the stdlib json module on PyPy, the speedups make
>> dumps about 2x faster and loads slightly slower, but PyPy with simplejson
>> is still up to ten times slower than CPython with simplejson on the 32kb
>> file.  I'll try profiling the speedups branch to see if there is any
>> low-hanging fruit left, but I doubt we will get another 50% improvement
>> out of it.
>>
>> -kans
>>
>> results:
>> python using json:
>>
>> /home/test/3.4kb.json
>> loads: 5 loops, best of 1000: 953 usec per loop
>>
>> dumps: 5 loops, best of 1000: 706 usec per loop
>>
>> /home/test/32kb.json
>> loads: 5 loops, best of 1000: 10.9 msec per loop
>>
>> dumps: 5 loops, best of 1000: 9.13 msec per loop
>>
>> -------------------------
>> python using simplejson:
>>
>> /home/test/3.4kb.json
>> loads: 5 loops, best of 1000: 41.2 usec per loop
>>
>> dumps: 5 loops, best of 1000: 56 usec per loop
>>
>> /home/test/32kb.json
>> loads: 5 loops, best of 1000: 604 usec per loop
>>
>> dumps: 5 loops, best of 1000: 391 usec per loop
>>
>> -------------------------
>> pypy using json:
>>
>> /home/test/3.4kb.json
>> loads: 5 loops, best of 1000: 146 usec per loop
>>
>> dumps: 5 loops, best of 1000: 429 usec per loop
>>
>> /home/test/32kb.json
>> loads: 5 loops, best of 1000: 2.93 msec per loop
>>
>> dumps: 5 loops, best of 1000: 7.16 msec per loop
>>
>> -------------------------
>> pypy using simplejson:
>>
>> /home/test/3.4kb.json
>> loads: 5 loops, best of 1000: 197 usec per loop
>>
>> dumps: 5 loops, best of 1000: 148 usec per loop
>>
>> /home/test/32kb.json
>> loads: 5 loops, best of 1000: 3.47 msec per loop
>>
>> dumps: 5 loops, best of 1000: 3.2 msec per loop
>>
>>
>>
>> On Sun, Sep 25, 2011 at 1:53 PM, Alex Gaynor <alex.gaynor at gmail.com> wrote:
>>>
>>>
>>> On Sun, Sep 25, 2011 at 1:49 PM, Bob Ippolito <bob at redivi.com> wrote:
>>>>
>>>> simplejson would be a good target for changes that would not be easy
>>>> to implement on top of the stdlib json. I'd be happy to accept any
>>>> contributions. I failed to make big differences in performance when I
>>>> tried at PyCon (at least none that didn't regress performance for some
>>>> people). The other things I'm missing are a good suite of documents to
>>>> benchmark with, and a good tool to run the benchmarks so it's easy to
>>>> see if incremental changes are better or worse.
>>>>
>>>> However, if RPython is required to make it faster, maybe implementing
>>>> _json for the stdlib would actually be best.
>>>>
>>>> On Sun, Sep 25, 2011 at 10:30 AM, Zooko O'Whielacronx <zooko at zooko.com>
>>>> wrote:
>>>> > But don't people who need better json performance use simplejson
>>>> > explicitly instead of using the standard library's json?
>>>> >
>>>> > Regards,
>>>> >
>>>> > Zooko
>>>> > _______________________________________________
>>>> > pypy-dev mailing list
>>>> > pypy-dev at python.org
>>>> > http://mail.python.org/mailman/listinfo/pypy-dev
>>>> >
>>>
>>> For what it's worth, I think we can get there, without needing to write any
>>> RPython, through a combination of careful Python, and more JIT
>>> optimizations.  For example, I'd like to get the code input[i:i+4] == "NULL"
>>> to eventually generate:
>>> read str length
>>> check length >= 4
>>> read 4 bytes out of input (single MOVL)
>>> integer compare to ('N' << 0) | ('U' << 8) | ('L' << 16) | ('L' << 24)
>>> in total about 7 x86 instructions.  I think this is definitely possible!
>>> Alex
>>>
>>> --
>>> "I disapprove of what you say, but I will defend to the death your right to
>>> say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
>>> "The people's good is the highest law." -- Cicero
>>>
>>>
>>>
>>>
>>
>
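
The 4-byte comparison Alex describes in the quoted text can be mimicked at
the Python level to show the idea. This is only an illustration of the
packed-integer compare, not what the JIT emits today, and in pure Python it
is not expected to beat the plain slice comparison. The names NULL_WORD and
matches_null are made up for this sketch.

```python
import struct

# Little-endian packing of the four bytes, i.e.
# ('N' << 0) | ('U' << 8) | ('L' << 16) | ('L' << 24)
NULL_WORD = ord("N") | (ord("U") << 8) | (ord("L") << 16) | (ord("L") << 24)


def matches_null(data, i):
    """Return True if the 4 bytes of `data` starting at i spell b"NULL".

    Mirrors the steps in the quoted text: check the length, read 4 bytes
    as one 32-bit integer, then do a single integer comparison.
    Negative offsets are rejected rather than wrapped like a slice would.
    """
    if i < 0 or len(data) - i < 4:  # need 4 readable bytes at offset i
        return False
    (word,) = struct.unpack_from("<I", data, i)  # one 4-byte load
    return word == NULL_WORD


if __name__ == "__main__":
    buf = b'{"a": NULL}'
    print(matches_null(buf, 6))  # offset of the N in buf
```

For non-negative offsets the function agrees with data[i:i+4] == b"NULL";
the point of the optimization is that the slice never has to be
materialized as a new string object.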

