[Doc-SIG] Building Python Document 30% faster.
Naoki INADA
inada-n at klab.jp
Sat Apr 4 18:03:25 CEST 2009
Hi Georg.
>> The attached patches make building the documentation about 30% faster.
>> (In my environment, roughly 330 sec -> 220 sec.)
>>
>> I posted sphinx.patch to bitbucket, but I don't know where to post docutils.patch.
>> Could anyone review these patches?
>
> I will, when I have a bit more time.
Thank you.
>> But the searchindex.js built with PyStemmer differs from the one built with PorterStemmer.
>
> This could be a problem. The client-side search implemented in JavaScript
> uses exactly the same stemmer (which is necessary to be able to find all
> words). In short, if you can find a C implementation of the Porter stemmer
> we could include it in Sphinx as an optional extension.
I see.
The original Porter stemmer is here:
http://tartarus.org/~martin/PorterStemmer/
It is implemented in C. I'll try to make a Python wrapper with SWIG and
compare the resulting searchindex.js. Please wait a while.
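
One rough way to see where two stemmers disagree before comparing whole
searchindex.js files is to run both over the same word list and collect the
words whose stems differ. The sketch below assumes each stemmer can be called
as a plain function from word to stem. The function names and the two toy
stemmers are hypothetical stand-ins, not the Porter or PyStemmer
implementations.

```python
def stem_differences(words, stem_a, stem_b):
    """Return {word: (stem_a(word), stem_b(word))} for words whose stems differ."""
    diffs = {}
    for word in sorted(set(words)):
        a, b = stem_a(word), stem_b(word)
        if a != b:
            diffs[word] = (a, b)
    return diffs


# Toy stand-ins for illustration only; these are NOT real stemming algorithms.
def toy_stem_a(word):
    return word[:-1] if word.endswith("s") else word


def toy_stem_b(word):
    return word[:-2] if word.endswith("es") else toy_stem_a(word)
```

An empty result for the vocabulary of the documentation would suggest that
the two stemmers are interchangeable for that corpus. Any entry is a word
that the client-side search could miss.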
>> 2. Avoid building OptionParser many times.
>> Sphinx calls docutils.core.publish_parts() many times without a `settings`
>> argument.
>> Each such call builds a new docutils.frontend.OptionParser, which consumes
>> 29 seconds.
>>
>> 3. Avoid building NestedStateMachine many times.
>> NestedStateMachine is built and destroyed many times.
>> Recycling that state machine gives a significant performance gain.
>
> I assume that both of these are in the second commit I see on bitbucket? Both
> look like worthy optimizations.
The former is on bitbucket:
http://bitbucket.org/methane/sphinx-speedup/changeset/72fa0ceefcae/
The latter is not on bitbucket, because NestedStateMachine is part of
docutils, not Sphinx.
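
Both optimizations come down to reusing an object that is costly to build.
Here is a minimal sketch of the two ideas using only the Python standard
library. Every name in it (Counters, build_default_settings, publish,
Machine, MachinePool) is a hypothetical stand-in chosen for illustration;
this is not the code in the patches, nor the docutils or Sphinx API.

```python
import optparse


class Counters:
    """Counts how often each costly object is built (for illustration only)."""
    settings_built = 0
    machines_built = 0


def build_default_settings():
    """Stand-in for the costly step: build an option parser, read its defaults."""
    Counters.settings_built += 1
    parser = optparse.OptionParser()
    parser.add_option("--tab-width", type="int", default=8)
    return parser.get_default_values()


def publish(source, settings=None):
    """Hypothetical publish call; rebuilds settings when none are passed in."""
    if settings is None:
        settings = build_default_settings()
    return source.expandtabs(settings.tab_width)


class Machine:
    """Hypothetical stand-in for a state machine that is costly to construct."""

    def __init__(self):
        Counters.machines_built += 1
        self.lines = []

    def run(self, lines):
        self.lines = list(lines)
        return len(self.lines)

    def reset(self):
        # Clear per-run state so the next user starts clean.
        self.lines = []


class MachinePool:
    """Hands out idle machines and only builds a new one when none is free."""

    def __init__(self):
        self._idle = []

    def acquire(self):
        return self._idle.pop() if self._idle else Machine()

    def release(self, machine):
        machine.reset()
        self._idle.append(machine)
```

Building the settings once and passing them to every publish() call keeps
the parser from being rebuilt each time. Likewise, a caller that acquires a
machine from the pool and releases it afterwards constructs only as many
machines as are in use at the same moment, which still allows nested use.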
--
Naoki INADA <inada-n at klab.jp>
KLab Inc. <http://www.klab.jp>