[Python-3000] Lazy strings (was Re: Py3k release schedule worries)
Larry Hastings
larry at hastings.org
Sat Jan 13 00:59:08 CET 2007
Guido van Rossum wrote:
> Finally (unrelated to the memory problem) I'd like to see some
> benchmarks to prove that this is really worth it.
Here's a first cut at some benchmarks. I gently hacked the pybench in
Tools so it'd run, and compared the full "lazy strings" patch to an
unpatched tree. In the following output, "this" is unpatched and
"other" has the lazy patch. The envelope please:
-------------------------------------------------------------------------------
PYBENCH 2.0
-------------------------------------------------------------------------------
* using Python 3.0x
* disabled garbage collection
* system check interval set to maximum: 2147483647
* using timer: time.clock
Running 10 round(s) of the suite at warp factor 10:
Test                          minimum run-time           average run-time
                              this    other     diff     this    other     diff
-------------------------------------------------------------------------------
ConcatUnicode:               185ms     46ms  +298.6%    206ms     48ms  +332.7%
CreateUnicodeWithConcat:     129ms     67ms   +93.1%    132ms     71ms   +86.5%
UnicodeSlicing:              156ms     75ms  +108.0%    161ms     77ms  +108.9%
-------------------------------------------------------------------------------
Totals:                     8350ms   8148ms    +2.5%   8586ms   8416ms    +2.0%
I'll post a zip file containing the full results and the pybench data
files to the patch page.
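A note on reading the "diff" columns in the table above: the numbers look consistent with diff = (this - other) / other, so a positive value means "this" (the unpatched tree) took longer. That formula is an assumption inferred from the Totals row, not something checked against the pybench source, and the helper names below are made up for illustration. The per-test rows will not reproduce exactly from the printed values because the millisecond figures are rounded.

```python
def percent_diff(this_ms, other_ms):
    """Extra time taken by `this` relative to `other`, in percent.

    Assumed formula, inferred from the Totals row of the table.
    """
    return (this_ms - other_ms) / other_ms * 100.0


def speedup(this_ms, other_ms):
    """How many times faster `other` ran than `this`."""
    return this_ms / other_ms


# Totals row, minimum run-time: 8350ms unpatched vs 8148ms patched.
print(f"{percent_diff(8350, 8148):+.1f}%")  # prints +2.5%

# ConcatUnicode, minimum run-time: roughly a 4x speedup.
print(f"{speedup(185, 46):.1f}x")  # prints 4.0x
```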
As I suspected, this is a bigger win than it was with 8-bit strings, as
8-bit strings have gotten a lot more TLC over the years. Once we
propagate the accumulated tweaks from stringobject.c to unicodeobject.c
the improvement will be a little less dramatic. Then again, some of
those improvements help lazy evaluation too, most notably the
concatenation speed hack in Python/ceval.c string_concatenation() (see
line 4179, comment "In the common case").
Keep in mind that "lazy slices" means more than just [:] notation; I
converted things like str.split() and str.strip() and str.partition() to
generate lazy slices too, and nearly all of unicodeobject.c will process
lazy slices directly without rendering. So the speed improvements (and
the corresponding change in memory usage) affect more than you might
suspect at first glance.
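For readers who have not looked at the patch, here is a toy pure-Python model of the idea. It is only a sketch under loud assumptions: the real change lives in C inside unicodeobject.c, and the class names here (Lazy, Leaf, LazyConcat, LazySlice) are hypothetical and do not appear in the patch. Concatenation and slicing just record what was asked for; the actual copy ("rendering") happens once, when the characters are finally needed.

```python
class Lazy:
    """Base class: render() produces the real str once and caches it."""

    def __init__(self):
        self._value = None  # None means "not rendered yet"

    def render(self):
        if self._value is None:
            self._value = self._compute()
        return self._value

    def __add__(self, other):
        if not isinstance(other, Lazy):
            other = Leaf(other)
        return LazyConcat(self, other)  # no copying here

    def __getitem__(self, key):
        if isinstance(key, slice):
            return LazySlice(self, key)  # no copying here either
        return self.render()[key]

    def __str__(self):
        return self.render()


class Leaf(Lazy):
    """An ordinary, already-rendered string."""

    def __init__(self, text):
        super().__init__()
        self._value = text


class LazyConcat(Lazy):
    """Remembers its two operands instead of building a new buffer."""

    def __init__(self, left, right):
        super().__init__()
        self.left, self.right = left, right

    def _compute(self):
        # Walk the tree iteratively so a long chain of `s = s + x`
        # does not hit the recursion limit, then join once.
        pieces, stack = [], [self]
        while stack:
            node = stack.pop()
            if isinstance(node, LazyConcat) and node._value is None:
                stack.append(node.right)
                stack.append(node.left)
            else:
                pieces.append(node.render())
        return "".join(pieces)


class LazySlice(Lazy):
    """Remembers the base string and the slice bounds.

    Note that it keeps the whole base alive, which is the memory
    trade-off of this approach.
    """

    def __init__(self, base, key):
        super().__init__()
        self.base, self.key = base, key

    def _compute(self):
        return self.base.render()[self.key]


s = Leaf("spam ") + "and " + "eggs"  # builds a small tree, copies nothing
part = s[5:8]                        # records the slice, still no copy
print(part)                          # rendering happens here; prints "and"
```

In this model, building a string in a loop costs one join at the end rather than one copy per "+", which is roughly why the concatenation benchmarks are where a lazy approach would be expected to help most.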
Cheers,
/larry/