Python and the need for speed
Brecht Machiels
brecht__gmane at mos6581.org
Thu Apr 13 05:52:04 EDT 2017
Bah. My newsreader lost my reply when the WiFi connection dropped
out... attempt #2.
On 2017-04-12 18:45:16 +0000, bart4858 at gmail.com said:
> On Wednesday, 12 April 2017 16:04:53 UTC+1, Brecht Machiels wrote:
>> On 2017-04-12 14:46:45 +0000, Michael Torrie said:
>
>> It would be great if you could run the benchmark I mention in my first
>> link and share the results. Highly appreciated!
>
> Were you ever able to isolate what it was that's taking up most of the
> time? Either in general or in the bit that pypy has trouble with. Or is
> execution time spread too widely?
It's been a while since I last focused on performance, but the profile
is still pretty flat. It's easy enough to verify (see also the URL
referenced below):
python -m cProfile -o demo.prof `which rinoh` -f restructuredtext demo.rst
python -m pstats demo.prof
demo.prof% strip
demo.prof% sort tottime
demo.prof% stats 15
Thu Apr 13 10:59:19 2017    demo.prof
35193174 function calls (27868271 primitive calls) in 22.461 seconds
Ordered by: internal time
List reduced from 5499 to 15 due to restriction <15>
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 6020041/321084    2.812    0.000    2.884    0.000 layout.py:152(document_part)
         287201    1.211    0.000    6.156    0.000 style.py:645(match)
          98788    0.901    0.000    1.965    0.000 version.py:198(__init__)
  419928/232734    0.751    0.000   17.332    0.000 util.py:109(function_wrapper)
         344783    0.588    0.000    1.198    0.000 style.py:319(match)
        1302467    0.534    0.000    0.840    0.000 style.py:438(__hash__)
   128992/83504    0.459    0.000   15.477    0.000 style.py:556(get_style_recursive)
1472251/1472250    0.399    0.000    0.469    0.000 {built-in method builtins.isinstance}
         701320    0.395    0.000    0.679    0.000 parse.py:18(reader)
   306381/10913    0.389    0.000    6.546    0.000 style.py:757(find_matches)
    89622/86768    0.368    0.000    2.126    0.000 style.py:369(match)
            176    0.311    0.002    0.840    0.005 parse.py:157(check_sum)
   339968/10360    0.308    0.000    0.417    0.000 dimension.py:239(__float__)
          95312    0.301    0.000    0.347    0.000 version.py:343(_cmpkey)
           2642    0.288    0.000    3.380    0.001 __init__.py:792(resolve)
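To put a number on "pretty flat": the 15 entries listed above add up to
about 10.0 s of internal time out of 22.461 s, so roughly 45%, and the
single largest entry (2.812 s) is about 12.5% of the total. The remaining
time is spread over the other 5484 functions. The same check can be
scripted with the standard library. The following is only a minimal
sketch: busy_work is a hypothetical stand-in workload (not rinohtype),
and profile_to_file and top_share are helper names made up for this
example. It relies on the stats attribute of pstats.Stats, which maps
each function to a tuple whose third element is the internal time.

```python
import cProfile
import pstats


def busy_work(n):
    """Hypothetical stand-in workload; replace with the code to profile."""
    return sum(i * i for i in range(n))


def profile_to_file(func, path, *args):
    """Run func(*args) under cProfile and dump the raw stats to path."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    profiler.dump_stats(path)
    return result


def top_share(path, count=15):
    """Fraction of total internal time spent in the `count` costliest functions."""
    stats = pstats.Stats(path)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    tottimes = sorted((entry[2] for entry in stats.stats.values()), reverse=True)
    total = sum(tottimes)
    return sum(tottimes[:count]) / total if total else 0.0


if __name__ == "__main__":
    profile_to_file(busy_work, "demo.prof", 200_000)
    # Same view as the interactive "strip", "sort tottime", "stats 15" session.
    pstats.Stats("demo.prof").strip_dirs().sort_stats("tottime").print_stats(15)
    print(f"top 15 share of internal time: {top_share('demo.prof'):.1%}")
```

A share close to 100% would point at a few hot spots worth optimizing; a
low share, as in the listing above, means there is no single obvious
culprit.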
> (I looked at your project but it's too large, and didn't get much
> further with the github benchmark, which requires me to subscribe, but
> the .sh file extensions don't seem too promising to someone on Windows.)
GitHub benchmark? .sh file extensions?
You can easily run some benchmarks following the instructions here (pip
install):
https://bitbucket.org/pypy/pypy/issues/2365/rinohtype-much-slower-on-pypy3
As I commented on that issue, I have been able to run the benchmarks
using PyPy3 5.7.1 beta, which is now significantly faster than CPython.
That's very promising!
> Your program seems to be to do with typesetting. Is it possible to at
> least quantify the work that is being done in terms of total
> bytes (and total files) of input, and bytes of output? That might
> enable comparisons with other systems executing similar tasks, to see
> if the Python version is taking unreasonably long.
The Sphinx benchmark's source reStructuredText files add up to 584 KB.
The output PDF file is almost 3 MB (includes fonts and images). Note
that the input document is parsed into a document tree where each
paragraph is represented by an object of the Paragraph class,
containing StyledText objects and so on. The total memory used is about
1 GB!
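For anyone who wants to collect the same kind of numbers for another
document, here is a rough sketch of how they could be measured. It is not
how the figures above were obtained: total_input_bytes and
peak_memory_bytes are hypothetical helper names, and the "*.rst" pattern
is an assumption about where the sources live. Note that tracemalloc only
counts memory allocated through Python's allocators and slows the run
down, so its peak will not match what the operating system reports for
the process.

```python
import tracemalloc
from pathlib import Path


def total_input_bytes(root, pattern="*.rst"):
    """Return (file count, total bytes) for files under root matching pattern."""
    sizes = [p.stat().st_size for p in Path(root).rglob(pattern) if p.is_file()]
    return len(sizes), sum(sizes)


def peak_memory_bytes(func, *args):
    """Run func(*args); return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    count, size = total_input_bytes(".")
    print(f"{count} input files, {size / 1024:.0f} KB in total")
    # Stand-in workload: many small objects, loosely like a document tree.
    _, peak = peak_memory_bytes(lambda n: [{"text": str(i)} for i in range(n)], 100_000)
    print(f"peak traced memory: {peak / 1024 ** 2:.1f} MB")
```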
LaTeX is orders of magnitude faster, but requires multiple passes. Its
memory usage is probably much lower, since it processes its input as a
stream.
Best regards,
Brecht