<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Chetan Pandya wrote:
<blockquote
cite="mida65732a30610181736r79ad71aftb305e507c85a8029@mail.gmail.com"
type="cite">
<div>
<div>I don't have a patch build, since I didn't download the revision
used by the patch. </div>
However, I did look at values in the debugger and it looked like x in
your example above had a reference count of 2 or more within
string_concat even when there were no other assignments that would
account for it.</div>
</blockquote>
It could be the optimizer. If you concatenate hard-coded strings, the
peephole optimizer does constant folding. It says "hey, look, this
binary operator is performed on two constant objects". So it evaluates
the expression itself and substitutes the result, in this case swapping
(pseudotokens here) [PUSH "a" PUSH "b" PLUS] for [PUSH "ab"].<br>
<br>
Oddly, it didn't seem to optimize away the whole expression. If you
say "a" + "b" + "c" + "d" + "e", I would have expected the peephole
optimizer to turn that whole shebang into [PUSH "abcde"]. But when I
gave it a cursory glance it seemed to skip every-other; it
constant-folded "a" + "b", then + "c" and optimized ("a" + "b" + "c")
+ "d", resulting ultimately I believe in [PUSH "ab" PUSH "cd" PLUS PUSH
"e" PLUS]. But I suspect I missed something; it bears further
investigation.<br>
<br>
But this is all academic, as real-world performance of my patch is not
contingent on what the peephole optimizer does to short runs of
hard-coded strings in simple test cases.<br>
<br>
<blockquote
cite="mida65732a30610181736r79ad71aftb305e507c85a8029@mail.gmail.com"
type="cite">
<div>
<div>The recursion limit seems to be optimistic, given the default
stack limit, but of course, I haven't tried it.</div>
</div>
</blockquote>
I've tried it, on exactly one computer (running Windows XP). The depth
limit was arrived at experimentally. But it is probably too optimistic
and should be winched down.<br>
<br>
On the other hand, right
now when you do x = "a" + x ten zillion times there are always two
references to the concatenation object stored in x: the interpreter
holds one, and x itself holds the other. That means I have to build
a new concatenation object each time, so it becomes a degenerate tree
(one leaf and one
subtree) recursing down the right-hand side.<br>
<br>
I plan to fix that in
my next patch. There's already code that says "if the next instruction
is a store, and the location we're storing to holds a reference to the
left-hand side of the concatenation, make the location drop its
reference". That was an optimization for the old-style concat code;
when the left side only had one reference it would simply resize it and
memcpy() in the right side. I plan to add support
for dropping the reference when it's the *right*-hand side of the
concatenation, as that would help prepending immensely. Once that's
done, I believe it'll prepend ((depth limit) * (number of items in
ob_sstrings - 1)) + 1 strings before needing to render.<br>
<br>
<blockquote
cite="mida65732a30610181736r79ad71aftb305e507c85a8029@mail.gmail.com"
type="cite">
<div>
<div>There is probably a depth limit on the left hand side as well,
since recursiveConcatenate is recursive even on the left side.</div>
</div>
</blockquote>
Let me again stress that recursiveConcatenate is *iterative* down the
left side; it is *not* not *not* recursive. The outer loop iterates
over "s->ob_sstrings[0]"s. The nested "for" loop iterates
backwards, from the highest string used down to "s->ob_sstrings +
1", aka "&s->ob_sstrings[1]", recursing into them. It then sets
"s" to "*s->ob_sstrings", aka "s->ob_sstrings[0]" and the outer
loop repeats. This is iterative.<br>
<br>
As a personal favor to me, please step through my code before you tell
me again how my code is recursive down the left-hand side.<br>
<br>
Passing the dutchie,<br>
<br>
<br>
<i>larry</i>
</body>
</html>