<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<br>
I've significantly enhanced my string-concatenation patch, to the point
where that name is no longer accurate. So I've redubbed it the "lazy
strings" patch.<br>
<br>
The major new feature is that string *slices* are also represented with
a lazy-evaluation placeholder for the actual string, just as
concatenated strings were in my original patch. The lazy slice object
stores a reference to the original PyStringObject * it is sliced from,
and the desired start and stop slice markers. (It only supports step =
1.) Its ob_sval is NULL until the string is rendered--but that rarely
happens! Not only does this mean string slices are faster, but I bet
this generally reduces overall memory usage for slices too.<br>
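The idea behind the lazy slice placeholder can be sketched in pure Python (the class and names below are my illustration, not code from the patch; the real object stores C pointers and slice bounds):

```python
class LazySlice:
    """Conceptual sketch of a lazy slice: it records the source string
    and the slice bounds, and only builds the real substring on demand."""

    def __init__(self, source, start, stop):
        # Only step == 1 slices are supported, as in the patch.
        self.source = source
        self.start = start
        self.stop = stop
        self._rendered = None  # stands in for the NULL ob_sval

    def render(self):
        # Copy the characters out of the source only when first needed.
        if self._rendered is None:
            self._rendered = self.source[self.start:self.stop]
        return self._rendered
```

Until someone calls render(), no characters are copied; the placeholder costs only a reference and two integers.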
<br>
Now, one rule of the Python C API is that "all strings are
zero-terminated". That rule keeps the life of a Python extension
author sane--they don't have to deal with some exotic Python string
class; they can just assume C-style strings everywhere. Ordinarily,
this means a string slice couldn't simply point into the original
string; if it did, and you executed<br>
<tt>x = "abcde"<br>
y = x[1:4]</tt><br>
internally y->ob_sval[3] would not be 0, it would be 'e', breaking
the API's rule about strings.<br>
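The problem can be illustrated with a shared byte buffer (a hypothetical illustration, not code from the patch): a zero-copy slice of x[1:4] would point at offset 1 of x's buffer with length 3, so the byte just past its end is 'e', not the NUL a C caller expects.

```python
# Simulate x's internal buffer, including its trailing NUL.
buf = b"abcde\x00"

# A zero-copy "slice" of x[1:4] would share the region buf[1:4].
view = memoryview(buf)[1:4]
assert bytes(view) == b"bcd"

# The byte where C code expects the terminating NUL -- one past the
# slice's end within the shared buffer -- is 'e', not 0.
assert buf[4] == ord("e")
```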
<br>
However! When a PyStringObject lives out its life purely within the
Python VM, the only code that strenuously examines its internals is
stringobject.c. And that code almost never needs the trailing zero*.
So I've added a new static function in stringobject.c:<br>
<tt>char * PyString_AsUnterminatedString(PyStringObject *)</tt><br>
If you call it on a lazy-evaluation slice object, it gives you back a
pointer into the original string's ob_sval. The s->ob_size'th
element of this *might not* be zero, but if you call this function
you're saying that's a-okay, you promise not to look at it. (If the
PyStringObject * is any other variety, it calls into PyString_AsString,
which renders lazy concatenation objects and then returns ob_sval.)<br>
<br>
Again: this behavior is *never* observed by anyone outside of
stringobject.c. External users of PyStringObjects call
PyString_AS_STRING(), which renders all lazy concatenation and lazy
slices so they look just like normal zero-terminated PyStringObjects.
With my patch applied, trunk still passes all expected tests.<br>
<br>
Of course, lazy slice objects aren't just for literal slices created
with [x:y].
There are lots of string methods that return what are effectively
string slices, like lstrip() and split().<br>
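As a hypothetical illustration (not an API from the patch), a method like lstrip only needs to find the first non-whitespace index; everything after that point is just a slice of the original buffer, so it can return a placeholder instead of copying:

```python
def lazy_lstrip(s):
    """Return the (start, stop) bounds lstrip would slice to, rather
    than copying the substring immediately -- a lazy slice is just
    these bounds plus a reference to the original string."""
    i = 0
    while i < len(s) and s[i].isspace():
        i += 1
    return (i, len(s))

bounds = lazy_lstrip("   hello")
assert "   hello"[bounds[0]:bounds[1]] == "hello"
```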
<br>
With this code in place, string slices that aren't examined by modules
are very rarely rendered. I ran "pybench -n 2" (two rounds, warp 10
(whatever that means)) while collecting some statistics. When it
finished, the interpreter had created a total of 640,041 lazy slices,
of which only *19* were ever rendered.<br>
<br>
<br>
Apart from lazy slices, there's only one more enhancement when compared
with v1: string prepending now reuses lazy concatenation objects much
more often. There was an optimization in string_concatenate
(Python/ceval.c) that said: "if the left-side string has two
references, and we're about to overwrite the second reference by
storing this concatenation to an object, tell that object to drop its
reference". That often meant the reference on the string dropped to 1,
which meant PyString_Resize could just resize the left-side string in
place and append the right-side. I modified it so it drops the
reference to the right-hand operand too. With this change, even with a
reduction in the allowable stack depth for right-hand recursion (so
it's less likely to blow the stack), I was able to prepend over 86k
strings before it forced a render. (Oh, for the record: I ensure depth
limits are enforced when combining lazy slices and lazy concatenations,
so you still won't blow your stack when you mix them together.)<br>
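The depth-limit idea can be sketched in pure Python (the names, the iterative render, and the tiny limit are all mine; the real patch does its accounting in C and tolerates far deeper trees, hence the 86k prepends above):

```python
MAX_DEPTH = 16  # illustrative only; the patch's real limit is much larger

class LazyConcat:
    """Conceptual sketch: a concatenation node that defers joining its
    operands until rendered."""

    def __init__(self, left, right):
        self.left = left
        self.right = right
        # Track tree depth so repeated prepends can't blow the stack.
        ldepth = left.depth if isinstance(left, LazyConcat) else 0
        rdepth = right.depth if isinstance(right, LazyConcat) else 0
        self.depth = max(ldepth, rdepth) + 1

    def render(self):
        # Flatten the tree iteratively (no recursion) and join once.
        parts, stack = [], [self]
        while stack:
            node = stack.pop()
            if isinstance(node, LazyConcat):
                stack.append(node.right)  # pushed first, popped second
                stack.append(node.left)   # popped first: left-to-right order
            else:
                parts.append(node)
        return "".join(parts)

def concat(left, right):
    node = LazyConcat(left, right)
    if node.depth >= MAX_DEPTH:
        return node.render()  # force a render before the tree grows too deep
    return node
```

Prepending in a loop builds a left-leaning tree; the depth check periodically collapses it to a plain string, so the tree never gets deep enough to endanger the stack.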
<br>
<br>
Here are the highlights of a single apples-to-apples pybench run, 2.6
trunk revision 52413 ("this") versus that same revision with my patch
applied ("other"):<br>
<br>
<pre>
Test                        minimum run-time              average run-time
                            this    other    diff         this    other    diff
-------------------------------------------------------------------------------
ConcatStrings:             204ms     76ms  +168.4%       213ms     77ms  +177.7%
CreateStringsWithConcat:   159ms    138ms   +15.7%       163ms    142ms   +15.1%
StringSlicing:             142ms     86ms   +65.5%       145ms     88ms   +64.6%
-------------------------------------------------------------------------------
Totals:                   7976ms   7713ms    +3.4%      8257ms   7975ms    +3.5%
</pre>
I also ran this totally unfair benchmark:<br>
<pre>
x = "abcde" * (20000)   # 100k characters
for i in xrange(10000000):
    y = x[1:-1]
</pre>
and found my patched version to be 9759% faster. (You heard that
right, 98x faster.)<br>
<br>
<br>
I'm ready to post the patch. However, as a result of this work, the
description on the original patch page is really no longer accurate:<br>
<a class="moz-txt-link-freetext" href="http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470">http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470</a><br>
Shall I close/delete that patch and submit a new patch with a more
modern description? After all, there's not a lot of activity on the
old patch page...<br>
<br>
<br>
Cheers,<br>
<br>
<br>
<i>larry</i><br>
<br>
* As I recall, stringobject.c needs the trailing zero in exactly *one*
place: when comparing two zero-length strings. My patch ensures that
zero-length slices and concatenations still return nullstring, so this
still works as expected.<br>
</body>
</html>