[pypy-dev] Performance, json and standard library

Matthew Kaniaris mkaniaris at gmail.com
Mon Sep 26 23:39:10 CEST 2011


I did some testing to see where we stand on JSON.  PyPy was built from
trunk, and the simplejson used with PyPy is the _pypy_speedups branch.
With the speedups, PyPy is roughly 2x to 3x faster on dumps than with
the stdlib json module and slightly slower on loads (roughly 1.2x to
1.35x), but it is still about 6x to 8x slower than CPython with
simplejson on the 32kb file.  I'll try profiling the speedups branch to
see if there is any low-hanging fruit left, but I doubt we will get
another 50% improvement out of it.
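
For anyone who wants to run a comparable measurement, here is a minimal
sketch of a timing harness built on the stdlib timeit module.  It is not
the script that produced the numbers below; the helper names (bench,
run), the loop counts and the sample document are all assumptions made
for illustration.

```python
import json
import timeit


def bench(func, arg, number=1000, repeat=5):
    """Return the best observed time per call, in seconds.

    number and repeat are illustrative defaults, not the settings
    used for the results in this post.
    """
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def run(text, module=json, number=1000, repeat=5):
    """Time loads and dumps of one JSON document with the given module.

    Pass module=simplejson (if installed) to compare implementations.
    """
    obj = module.loads(text)
    return {
        "loads": bench(module.loads, text, number, repeat),
        "dumps": bench(module.dumps, obj, number, repeat),
    }


if __name__ == "__main__":
    # Hypothetical sample document; substitute the contents of a real file.
    sample = json.dumps(
        [{"id": n, "name": "item", "ok": None} for n in range(50)]
    )
    for op, seconds in run(sample).items():
        print("%s: %.1f usec per call" % (op, seconds * 1e6))
```

Taking the minimum over the repeats follows the usual timeit advice that
the lowest figure is the one least disturbed by other activity on the
machine.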

-kans

results:
python using json:

/home/test/3.4kb.json
loads: 5 loops, best of 1000: 953 usec per loop

dumps: 5 loops, best of 1000: 706 usec per loop

/home/test/32kb.json
loads: 5 loops, best of 1000: 10.9 msec per loop

dumps: 5 loops, best of 1000: 9.13 msec per loop

-------------------------
python using simplejson:

/home/test/3.4kb.json
loads: 5 loops, best of 1000: 41.2 usec per loop

dumps: 5 loops, best of 1000: 56 usec per loop

/home/test/32kb.json
loads: 5 loops, best of 1000: 604 usec per loop

dumps: 5 loops, best of 1000: 391 usec per loop

-------------------------
pypy using json:

/home/test/3.4kb.json
loads: 5 loops, best of 1000: 146 usec per loop

dumps: 5 loops, best of 1000: 429 usec per loop

/home/test/32kb.json
loads: 5 loops, best of 1000: 2.93 msec per loop

dumps: 5 loops, best of 1000: 7.16 msec per loop

-------------------------
pypy using simplejson:

/home/test/3.4kb.json
loads: 5 loops, best of 1000: 197 usec per loop

dumps: 5 loops, best of 1000: 148 usec per loop

/home/test/32kb.json
loads: 5 loops, best of 1000: 3.47 msec per loop

dumps: 5 loops, best of 1000: 3.2 msec per loop
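
The ratios quoted in the summary can be recomputed from the figures
above.  This is a small sketch of that arithmetic; the timings are
copied from the results and converted to microseconds, while the
dictionary layout and the ratio helper are only illustrative.

```python
# Timings in microseconds, copied from the results listed above.
TIMES = {
    "cpython+json": {
        "3.4kb": {"loads": 953, "dumps": 706},
        "32kb": {"loads": 10900, "dumps": 9130},
    },
    "cpython+simplejson": {
        "3.4kb": {"loads": 41.2, "dumps": 56},
        "32kb": {"loads": 604, "dumps": 391},
    },
    "pypy+json": {
        "3.4kb": {"loads": 146, "dumps": 429},
        "32kb": {"loads": 2930, "dumps": 7160},
    },
    "pypy+simplejson": {
        "3.4kb": {"loads": 197, "dumps": 148},
        "32kb": {"loads": 3470, "dumps": 3200},
    },
}


def ratio(slower, faster, size, op):
    """How many times longer `slower` takes than `faster` for one case."""
    return TIMES[slower][size][op] / TIMES[faster][size][op]


if __name__ == "__main__":
    for size in ("3.4kb", "32kb"):
        for op in ("loads", "dumps"):
            print(
                "%s %s: pypy+simplejson takes %.2fx the time of "
                "cpython+simplejson"
                % (size, op, ratio("pypy+simplejson", "cpython+simplejson", size, op))
            )
```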



On Sun, Sep 25, 2011 at 1:53 PM, Alex Gaynor <alex.gaynor at gmail.com> wrote:
>
>
> On Sun, Sep 25, 2011 at 1:49 PM, Bob Ippolito <bob at redivi.com> wrote:
>>
>> simplejson would be a good target for changes that would not be easy
>> to implement on top of the stdlib json. I'd be happy to accept any
>> contributions. I failed to make big differences in performance when I
>> tried at PyCon (at least that didn't regress performance for some
>> people). The other things I'm missing are a good suite of documents to
>> benchmark with, and a good tool to run the benchmarks so it's easy to
>> see if incremental changes are better or worse.
>>
>> However, if RPython is required to make it faster, maybe implementing
>> _json for the stdlib would actually be best.
>>
>> On Sun, Sep 25, 2011 at 10:30 AM, Zooko O'Whielacronx <zooko at zooko.com>
>> wrote:
>> > But don't people who need better json performance use simplejson
>> > explicitly instead of using the standard library's json?
>> >
>> > Regards,
>> >
>> > Zooko
>> > _______________________________________________
>> > pypy-dev mailing list
>> > pypy-dev at python.org
>> > http://mail.python.org/mailman/listinfo/pypy-dev
>> >
>
> For what it's worth, I think we can get there, without needing to write any
> RPython, through a combination of careful Python, and more JIT
> optimizations.  For example, I'd like to get the code input[i:i+4] == "NULL"
> to eventually generate:
> read str length
> check length >= 4
> read 4 bytes out of input (single MOVL)
> integer compare to ('N' << 0) | ('U' << 8) | ('L' << 16) | ('L' << 24)
> in total about 7 x86 instructions.  I think this is definitely possible!
> Alex
>
> --
> "I disapprove of what you say, but I will defend to the death your right to
> say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
> "The people's good is the highest law." -- Cicero
>
>
>
>
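
The comparison Alex describes (length check, one 4-byte read, one integer
compare) can be emulated in plain Python to show what the JIT would be
computing.  This is only a sketch of the idea, not PyPy's generated code
and not a speedup in itself: the helper name matches_null is
hypothetical, the input is assumed to be a bytes object, and the
little-endian layout is assumed from the shifts in the quoted
expression.

```python
import struct

# Pack the four characters into one 32-bit integer with 'N' in the low
# byte, mirroring ('N' << 0) | ('U' << 8) | ('L' << 16) | ('L' << 24).
NULL_WORD = ord("N") | (ord("U") << 8) | (ord("L") << 16) | (ord("L") << 24)


def matches_null(buf, i):
    """Hypothetical helper: same result as buf[i:i+4] == b"NULL" for i >= 0."""
    # Step 1 and 2: read the length and check that 4 bytes are available.
    if len(buf) - i < 4:
        return False
    # Step 3: read 4 bytes as a single little-endian unsigned integer.
    (word,) = struct.unpack_from("<I", buf, i)
    # Step 4: one integer comparison instead of a 4-character string compare.
    return word == NULL_WORD


if __name__ == "__main__":
    data = b'{"key": NULL}'
    print(matches_null(data, 8), data[8:12] == b"NULL")
```

The point of the single integer compare is that it avoids allocating a
temporary 4-character string for the slice; whether the JIT can reach
that form is exactly the open question in the quoted message.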

