[Doc-SIG] Building Python Document 30% faster.

稲田直哉 inada-n at klab.jp
Sat Apr 4 15:57:25 CEST 2009


Hi, all.

I'm a member of the project that translates the Python documentation
into Japanese. We finished translating the Python 2.5 documentation
last year and are now working on the Python 2.6 documentation.

Building the documentation feels a little slow, so I tried tuning
docutils and Sphinx.

The attached patches make the documentation build about 30% faster
(in my environment, roughly 330 sec -> 220 sec).

I posted sphinx.patch to bitbucket, but I don't know where to post
docutils.patch. Could anyone review these patches?

These patches make the following changes:

1. Use PyStemmer instead of PorterStemmer.
PorterStemmer is implemented in Python and consumes about 50 seconds
during the build.
PyStemmer <http://pypi.python.org/pypi/PyStemmer/1.0.1> is implemented
in C and consumes only 7 seconds.

Note that the searchindex.js generated with PyStemmer differs from the
one generated with PorterStemmer.
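
A minimal sketch of the idea in item 1, not the code from sphinx.patch:
prefer the C stemmer when it can be imported and fall back otherwise.
The "english" algorithm name and the stem() helper are assumptions made
for illustration, and the fallback here is only a placeholder where the
pure-Python PorterStemmer would go.

```python
try:
    import Stemmer  # PyStemmer, a C extension

    _impl = Stemmer.Stemmer("english")  # algorithm choice is an assumption

    def stem(word):
        """Stem one word with the C implementation."""
        return _impl.stemWord(word.lower())

except ImportError:

    def stem(word):
        """Placeholder fallback; a real build would call the
        pure-Python PorterStemmer here instead."""
        return word.lower()


if __name__ == "__main__":
    for w in ("Building", "documents", "faster"):
        print(w, "->", stem(w))
```

Because the two stemmers may reduce some words differently, an index
built with one of them should also be searched with the same one.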

2. Avoid building OptionParser many times.
Sphinx calls docutils.core.publish_parts() many times without the
`settings` argument. Each call builds a new
docutils.frontend.OptionParser, which consumes 29 seconds.
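
One way to avoid the repeated construction, sketched under assumptions
and not taken from sphinx.patch: build the default settings once and
pass them to every publish_parts() call. The names _SETTINGS and
render_fragment are hypothetical, and the HTML writer is chosen only as
an example. Newer docutils releases may emit a deprecation warning for
OptionParser.

```python
import docutils.core
import docutils.frontend
from docutils.parsers.rst import Parser
from docutils.readers.standalone import Reader
from docutils.writers.html4css1 import Writer

# Build the option parser and its default values a single time.
# The components must cover the reader, parser and writer that
# publish_parts() will use, so that all their settings are present.
_SETTINGS = docutils.frontend.OptionParser(
    components=(Parser, Reader, Writer)
).get_default_values()


def render_fragment(source):
    """Render a reST string to an HTML fragment, reusing _SETTINGS."""
    parts = docutils.core.publish_parts(
        source,
        writer=Writer(),
        settings=_SETTINGS,  # skips building a new OptionParser
    )
    return parts["fragment"]


if __name__ == "__main__":
    print(render_fragment("*hello* world"))
```

The settings object is shared between calls, so this only fits callers
that want the same settings every time.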

3. Avoid building NestedStateMachine many times.
NestedStateMachine is built and destroyed many times.
Recycling these state machines gives a significant performance gain.
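
The recycling idea can be sketched with a small pool. This is a
simplified illustration, not the code from docutils.patch: State,
Machine and MachinePool are hypothetical stand-ins, and the real
NestedStateMachine has more state to reset.

```python
class State:
    """Stand-in for a state object that is costly to construct."""

    def __init__(self, name):
        self.name = name


class Machine:
    """Stand-in for a nested state machine."""

    created = 0  # counts constructions, to show the effect of pooling

    def __init__(self, state_names):
        Machine.created += 1
        self.key = tuple(state_names)
        self.states = {name: State(name) for name in self.key}
        self.input_lines = None

    def run(self, lines):
        self.input_lines = lines
        return len(lines)

    def reset(self):
        # Drop per-run data but keep the already built states.
        self.input_lines = None


class MachinePool:
    """Hands out idle machines and only builds one when none is free."""

    def __init__(self):
        self._idle = {}

    def acquire(self, state_names):
        idle = self._idle.setdefault(tuple(state_names), [])
        return idle.pop() if idle else Machine(state_names)

    def release(self, machine):
        machine.reset()
        self._idle.setdefault(machine.key, []).append(machine)


if __name__ == "__main__":
    pool = MachinePool()
    for _ in range(1000):
        machine = pool.acquire(("Body", "Text"))
        machine.run(["some line"])
        pool.release(machine)
    print("machines built:", Machine.created)
```

Nested parsing still works with this scheme: if a machine is requested
while another one is in use, the pool simply builds a second one.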

== before ==
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
25720/459    0.997    0.000  134.085    0.292 tools/docutils/statemachine.py:178(run)
92281/1513    1.420    0.000  133.935    0.089 tools/docutils/statemachine.py:384(check_line)
    25720    0.184    0.000   89.628    0.003 tools/docutils/statemachine.py:129(__init__)
    25720    0.632    0.000   89.444    0.003 tools/docutils/statemachine.py:448(add_states)
   385800    1.665    0.000   88.813    0.000 tools/docutils/statemachine.py:436(add_state)
   385800    2.356    0.000   85.287    0.000 tools/docutils/statemachine.py:928(__init__)
   385800    1.793    0.000   82.931    0.000 tools/docutils/statemachine.py:566(__init__)

== after ==
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
25720/459    1.051    0.000   68.175    0.149 tools/docutils/statemachine.py:178(run)
92281/1513    1.405    0.000   68.024    0.045 tools/docutils/statemachine.py:384(check_line)
     6862    0.031    0.000   24.241    0.004 tools/docutils/statemachine.py:129(__init__)
     6862    0.174    0.000   24.210    0.004 tools/docutils/statemachine.py:448(add_states)
   102930    0.430    0.000   24.036    0.000 tools/docutils/statemachine.py:436(add_state)
   102930    0.633    0.000   23.162    0.000 tools/docutils/statemachine.py:928(__init__)
   102930    0.549    0.000   22.529    0.000 tools/docutils/statemachine.py:566(__init__)
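
The tables above have the layout that the standard pstats module
prints. For anyone who wants to collect comparable numbers, here is a
small sketch using cProfile. The build_docs function is a hypothetical
placeholder for the real build entry point, and profile_call is a
helper name invented for this example.

```python
import cProfile
import io
import pstats


def build_docs():
    """Placeholder workload; replace with the real build call."""
    return sum(i * i for i in range(100000))


def profile_call(func, *args, limit=10):
    """Run func under cProfile and return the report as text,
    sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_call(build_docs))
```

Sorting by cumulative time puts callers such as run() at the top, which
makes it easier to see how much time is spent in everything they call.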
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sphinx.patch
Type: application/octet-stream
Size: 3930 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/doc-sig/attachments/20090404/f90eab5d/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: docutils.patch
Type: application/octet-stream
Size: 1923 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/doc-sig/attachments/20090404/f90eab5d/attachment-0001.obj>

