<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<br>
I've significantly enhanced my string-concatenation patch, to the point
where that name is no longer accurate. So I've redubbed it the "lazy
strings" patch.<br>
<br>
The major new feature is that string *slices* are also represented with
a lazy-evaluation placeholder for the actual string, just as
concatenated strings were in my original patch. The lazy slice object
stores a reference to the original PyStringObject * it is sliced from,
and the desired start and stop slice markers. (It only supports step =
1.) Its ob_sval is NULL until the string is rendered--but that rarely
happens! Not only does this mean string slices are faster, but I bet
this generally reduces overall memory usage for slices too.<br>
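The idea behind the lazy slice placeholder can be sketched in pure Python (the class and names below are my illustration, not code from the patch; the real object stores C pointers and slice bounds):

```python
class LazySlice:
    """Conceptual sketch of a lazy slice: it records the source string
    and the slice bounds, and only builds the real substring on demand."""

    def __init__(self, source, start, stop):
        # Only step == 1 slices are supported, as in the patch.
        self.source = source
        self.start = start
        self.stop = stop
        self._rendered = None  # stands in for the NULL ob_sval

    def render(self):
        # Copy the characters out of the source only when first needed.
        if self._rendered is None:
            self._rendered = self.source[self.start:self.stop]
        return self._rendered
```

Until someone calls render(), no characters are copied; the placeholder costs only a reference and two integers.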
<br>
Now, one rule of the Python C API is that "all strings are
zero-terminated". That rule keeps the life of a Python extension
author sane--they don't have to deal with some exotic Python string
class; they can just assume C-style strings everywhere. Ordinarily,
this means a string slice couldn't simply point into the original
string; if it did, and you executed<br>
<tt>x = "abcde"<br>
y = x[1:4]</tt><br>
internally y->ob_sval[3] would not be 0, it would be 'e', breaking
the API's rule about strings.<br>
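The problem can be illustrated with a shared byte buffer (a hypothetical illustration, not code from the patch): a zero-copy slice of x[1:4] would point at offset 1 of x's buffer with length 3, so the byte just past its end is 'e', not the NUL a C caller expects.

```python
# Simulate x's internal buffer, including its trailing NUL.
buf = b"abcde\x00"

# A zero-copy "slice" of x[1:4] would share the region buf[1:4].
view = memoryview(buf)[1:4]
assert bytes(view) == b"bcd"

# The byte where C code expects the terminating NUL -- one past the
# slice's end within the shared buffer -- is 'e', not 0.
assert buf[4] == ord("e")
```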
<br>
However! When a PyStringObject lives out its life purely within the
Python VM, the only code that strenuously examines its internals is
stringobject.c. And that code almost never needs the trailing zero*.
So I've added a new static function in stringobject.c:<br>
<tt>char * PyString_AsUnterminatedString(PyStringObject *)</tt><br>
If you call it on a lazy-evaluation slice object, it gives you back a
pointer into the original string's ob_sval. The s->ob_size'th
element of this *might not* be zero, but if you call this function
you're saying that's a-okay, you promise not to look at it. (If the
PyStringObject * is any other variety, it calls into PyString_AsString,
which renders lazy concatenation objects and then returns ob_sval.)<br>
<br>
Again: this behavior is *never* observed by anyone outside of
stringobject.c. External users of PyStringObjects call
PyString_AS_STRING(), which renders all lazy concatenation and lazy
slices so they look just like normal zero-terminated PyStringObjects.
With my patch applied, trunk still passes all expected tests.<br>
<br>
Of course, lazy slice objects aren't just for literal slices created
with [x:y].
There are lots of string methods that return what are effectively
string slices, like lstrip() and split().<br>
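As a hypothetical illustration (not an API from the patch), a method like lstrip only needs to find the first non-whitespace index; everything after that point is just a slice of the original buffer, so it can return a placeholder instead of copying:

```python
def lazy_lstrip(s):
    """Return the (start, stop) bounds lstrip would slice to, rather
    than copying the substring immediately -- a lazy slice is just
    these bounds plus a reference to the original string."""
    i = 0
    while i < len(s) and s[i].isspace():
        i += 1
    return (i, len(s))

bounds = lazy_lstrip("   hello")
assert "   hello"[bounds[0]:bounds[1]] == "hello"
```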
<br>
With this code in place, string slices that aren't examined by modules
are very rarely rendered. I ran "pybench -n 2" (two rounds, warp 10
(whatever that means)) while collecting some statistics. When it
finished, the interpreter had created a total of 640,041 lazy slices,
of which only *19* were ever rendered.<br>
<br>
<br>
Apart from lazy slices, there's only one more enhancement when compared
with v1: string prepending now reuses lazy concatenation objects much
more often. There was an optimization in string_concatenate
(Python/ceval.c) that said: "if the left-side string has two
references, and we're about to overwrite the second reference by
storing this concatenation to an object, tell that object to drop its
reference". That often meant the reference on the string dropped to 1,
which meant PyString_Resize could just resize the left-side string in
place and append the right-side. I modified it so it drops the
reference to the right-hand operand too. With this change, even with a
reduction in the allowable stack depth for right-hand recursion (so
it's less likely to blow the stack), I was able to prepend over 86k
strings before it forced a render. (Oh, for the record: I ensure depth
limits are enforced when combining lazy slices and lazy concatenations,
so you still won't blow your stack when you mix them together.)<br>
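The depth-limit idea can be sketched in pure Python (the names, the iterative render, and the tiny limit are all mine; the real patch does its accounting in C and tolerates far deeper trees, hence the 86k prepends above):

```python
MAX_DEPTH = 16  # illustrative only; the patch's real limit is much larger

class LazyConcat:
    """Conceptual sketch: a concatenation node that defers joining its
    operands until rendered."""

    def __init__(self, left, right):
        self.left = left
        self.right = right
        # Track tree depth so repeated prepends can't blow the stack.
        ldepth = left.depth if isinstance(left, LazyConcat) else 0
        rdepth = right.depth if isinstance(right, LazyConcat) else 0
        self.depth = max(ldepth, rdepth) + 1

    def render(self):
        # Flatten the tree iteratively (no recursion) and join once.
        parts, stack = [], [self]
        while stack:
            node = stack.pop()
            if isinstance(node, LazyConcat):
                stack.append(node.right)  # pushed first, popped second
                stack.append(node.left)   # popped first: left-to-right order
            else:
                parts.append(node)
        return "".join(parts)

def concat(left, right):
    node = LazyConcat(left, right)
    if node.depth >= MAX_DEPTH:
        return node.render()  # force a render before the tree grows too deep
    return node
```

Prepending in a loop builds a left-leaning tree; the depth check periodically collapses it to a plain string, so the tree never gets deep enough to endanger the stack.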
<br>
<br>
Here are the highlights of a single apples-to-apples pybench run, 2.6
trunk revision 52413 ("this") versus that same revision with my patch
applied ("other"):<br>
<br>
<pre>
Test                        minimum run-time              average run-time
                            this    other    diff         this    other    diff
-------------------------------------------------------------------------------
ConcatStrings:             204ms     76ms  +168.4%       213ms     77ms  +177.7%
CreateStringsWithConcat:   159ms    138ms   +15.7%       163ms    142ms   +15.1%
StringSlicing:             142ms     86ms   +65.5%       145ms     88ms   +64.6%
-------------------------------------------------------------------------------
Totals:                   7976ms   7713ms    +3.4%      8257ms   7975ms    +3.5%
</pre>
I also ran this totally unfair benchmark:<br>
<pre>
x = "abcde" * (20000)   # 100k characters
for i in xrange(10000000):
    y = x[1:-1]
</pre>
and found my patched version to be 9759% faster. (You heard that
right, 98x faster.)<br>
<br>
<br>
I'm ready to post the patch. However, as a result of this work, the
description on the original patch page is really no longer accurate:<br>
<a class="moz-txt-link-freetext" href="http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470">http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470</a><br>
Shall I close/delete that patch and submit a new patch with a more
modern description? After all, there's not a lot of activity on the
old patch page...<br>
<br>
<br>
Cheers,<br>
<br>
<br>
<i>larry</i><br>
<br>
* As I recall, stringobject.c needs the trailing zero in exactly *one*
place: when comparing two zero-length strings. My patch ensures that
zero-length slices and concatenations still return nullstring, so this
still works as expected.<br>
</body>
</html>