[Patches] [ python-Patches-980695 ] efficient string concatenation

SourceForge.net noreply at sourceforge.net
Sun Jul 25 12:18:11 CEST 2004


Patches item #980695, was opened at 2004-06-27 13:39
Message generated for change (Comment added) made by arigo
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=980695&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Armin Rigo (arigo)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: efficient string concatenation

Initial Comment:
A wild idea that makes repeated string concatenations efficient without changing stringobject.c.

If we assume we don't want to change the representation of strings, then the problem is that string_concat(v,w) doesn't know whether v will soon be released, so it cannot resize it in place even if the refcnt is 1.

But with some hacks ceval.c can know that.  Statements like s=s+expr or s+=expr both compile to a BINARY_ADD or INPLACE_ADD followed by a STORE_FAST or STORE_NAME.  So in the attached patch ceval.c special-cases addition of two strings (in the same way as it special-cases addition of two integers already).  If moreover the addition is followed by a STORE that is about to overwrite the addition's left argument, and if the refcnt is right, then the left argument can be resized in-place (plus some obscure magic to ensure that everything is still valid even if resize moves the string in memory).
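
The compilation pattern described above can be checked from Python itself with the standard dis module. The following is only a sketch for illustration, not part of the patch: the helper names build and add_then_store_pairs are made up here, and the opcode names depend on the interpreter version (newer CPython releases fold BINARY_ADD and INPLACE_ADD into a single BINARY_OP instruction, which the sketch also accepts).

```python
import dis


def build(parts):
    # Hypothetical example function: a loop of repeated "s += expr".
    s = ""
    for p in parts:
        s += p
    return s


def add_then_store_pairs(func):
    """Return (add_opname, store_opname) pairs for every addition
    instruction that is immediately followed by a STORE_* instruction."""
    instrs = list(dis.get_instructions(func))
    pairs = []
    for cur, nxt in zip(instrs, instrs[1:]):
        # Older interpreters use BINARY_ADD / INPLACE_ADD; newer ones
        # use BINARY_OP with "+" or "+=" as the argument representation.
        is_add = cur.opname in ("BINARY_ADD", "INPLACE_ADD") or (
            cur.opname == "BINARY_OP" and cur.argrepr in ("+", "+=")
        )
        if is_add and nxt.opname.startswith("STORE_"):
            pairs.append((cur.opname, nxt.opname))
    return pairs


if __name__ == "__main__":
    for add_op, store_op in add_then_store_pairs(build):
        print(add_op, "followed by", store_op)
```

Running it prints the addition opcode and the store opcode that follows it; that adjacent pair is what the patch's special case in ceval.c looks for.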

With Python's good memory manager, repeated resizes even without manual over-allocation perform nicely.

As a side effect, other constructions like a+b+c+d+e+f also work in-place now.

The patch would do with a lot more comments.

----------------------------------------------------------------------

>Comment By: Armin Rigo (arigo)
Date: 2004-07-25 10:18

Message:
Logged In: YES 
user_id=4771

Raymond, do you have time to review it?

----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2004-07-25 10:17

Message:
Logged In: YES 
user_id=4771

Here is another patch.  This one focuses on simplicity, both
implementation-wise and from the user's point of view:

1) It only makes repeated  variable += expr  faster. It
doesn't touch the '+'.
2) It doesn't mess with the internals of strings and dicts
any more.  It is just one well-documented function now.

The goal of this new patch is to be reviewable and
maintainable, to get it in the core to stop people from
being bitten by the performance of += (I just saw yet
another example yesterday).
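
For context, the pattern that bites people looks like the first function below, and the usual workaround is the second. This is only an illustrative sketch (the function names are invented here, not taken from the patch); both return the same string, but without an in-place optimization the += loop may copy the accumulated string on every iteration.

```python
def concat_with_iadd(parts):
    # Repeated "variable += expr": the case the patch aims to speed up.
    result = ""
    for part in parts:
        result += part
    return result


def concat_with_join(parts):
    # Common workaround: collect the pieces and join them once.
    return "".join(parts)


if __name__ == "__main__":
    pieces = [str(i) for i in range(10)]
    print(concat_with_iadd(pieces) == concat_with_join(pieces))
```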

----------------------------------------------------------------------

Comment By: Michael Chermside (mcherm)
Date: 2004-06-28 11:38

Message:
Logged In: YES 
user_id=99874

Hmmm... Interesting. I kinda like it.

----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2004-06-28 09:41

Message:
Logged In: YES 
user_id=4771

Another patch, with support for STORE_DEREF (thanks to Phillip for pointing it out).

----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2004-06-27 20:39

Message:
Logged In: YES 
user_id=4771

Resubmitted as a context diff.

----------------------------------------------------------------------


