[pypy-dev] Unicode encode/decode speed
Eleytherios Stamatogiannakis
estama at gmail.com
Tue Feb 12 19:14:13 CET 2013
On 12/02/13 11:04, Maciej Fijalkowski wrote:
>
> I would like to see some evidence about it. Did you try valgrind?
>
> Cheers,
> fijal
>
Even better, we wanted to find a way for you to test it
yourselves, so we tried to create a representative synthetic benchmark.
Surprisingly, when we retested the benchmark that we had previously
posted to this mailing list, we found that its performance profile
is very similar to that of the slow query that I've talked about in my
recent emails.
To make it easier, I'll repeat the refreshed instructions (from the old
email) for running that benchmark. Also attached is the updated (and
heavily optimized) MSPW:
--repost--
To run it you'll need the latest madIS. You can clone it using:
hg clone https://code.google.com/p/madis/
For running the test with CPython you'll need:
CPython 2.7 + APSW:
https://code.google.com/p/apsw/
For PyPy you'll need MSPW renamed to "apsw.py" (the attached MSPW is
already renamed to "apsw.py").
Move "apsw.py" to pypy's "site-packages" directory. For MSPW to work in
PyPy, you'll also need CFFI and "libsqlite3" installed.
To run the test with PyPy:
pypy mterm.py < mspw_bench.sql
or with CPython:
python mterm.py < mspw_bench.sql
The timings of "mspw_bench" that we get are:
CPython 2.7 + APSW: ~ 2.6sec
PyPy + MSPW: ~ 4sec
There are two ways to adjust the processing load of mspw_bench.
One is to change the value in "range(xxxxx)". This will in essence
create a bigger/smaller "synthetic text". This puts more pressure on
CPython's/pypy's side.
The other way is to adjust the window size of textwindow(t, xx, xx).
This puts more pressure on the wrapper (APSW/MSPW) because it changes
the number of columns that CPython/PyPy have to send to SQLite (they are
sent one value at a time).
--/repost--
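To illustrate why the window size matters for the wrapper, here is a rough sketch of a sliding text window. It is not madIS's textwindow; the names text_window and values_sent, and the padding with None, are assumptions made only for this illustration. It counts how many individual values would have to be handed over one at a time for a given window size.

```python
def text_window(text, prev, nxt):
    """Yield one row per word: `prev` words before, the word, `nxt` words after.

    Hypothetical stand-in for illustration, not madIS's textwindow.
    """
    words = text.split()
    for i in range(len(words)):
        before = words[max(0, i - prev):i]
        after = words[i + 1:i + 1 + nxt]
        # Pad so that every row has the same number of columns.
        before = [None] * (prev - len(before)) + before
        after = after + [None] * (nxt - len(after))
        yield tuple(before) + (words[i],) + tuple(after)


def values_sent(n_words, prev, nxt):
    """Count the column values produced for a synthetic text of n_words words."""
    text = " ".join(str(i) for i in range(n_words))
    return sum(len(row) for row in text_window(text, prev, nxt))
```

In this sketch the number of values grows as n_words * (prev + 1 + nxt), so widening the window multiplies the per-value calls made by the wrapper, while raising the word count grows the work on both sides.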
Attached you'll find our latest MSPW (renamed to "apsw.py") and
"mspw_bench.sql"
We are also looking into adding a special ffi.string_decode_UTF8 to
CFFI's backend to reduce the number of calls that are needed to go from
utf8_char* to PyPy's unicode.
Do you think that such an addition would be worthwhile?
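For reference, a minimal sketch of the two-step path we mean: first the C buffer is copied into a byte string, then that copy is decoded into unicode. To keep it runnable with only the standard library, ctypes.string_at stands in here for CFFI's ffi.string; the helper name utf8_ptr_to_unicode is made up for this example, and this is not MSPW's actual code.

```python
import ctypes


def utf8_ptr_to_unicode(ptr):
    """Two-step conversion: C char* -> byte string -> unicode."""
    # Step 1: copy the bytes up to the NUL terminator into a byte string
    # (ctypes.string_at is a stand-in for CFFI's ffi.string).
    raw = ctypes.string_at(ptr)
    # Step 2: decode the copied bytes into a unicode object.
    return raw.decode("utf-8")


# A NUL-terminated UTF-8 buffer, as a C library would hand it back.
text = u"\u03ba\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1"
buf = ctypes.create_string_buffer(text.encode("utf-8"))
result = utf8_ptr_to_unicode(buf)
```

A combined call would do both steps in the backend, avoiding the intermediate byte string and one call per value.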
Thank you,
lefteris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mspw_bench.sql
Type: text/x-sql
Size: 124 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/pypy-dev/attachments/20130212/db8af507/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: apsw.py
Type: text/x-python
Size: 67124 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/pypy-dev/attachments/20130212/db8af507/attachment-0001.py>