
On Mon, May 26, 2008 at 4:21 AM, Hrvoje Nikšić <hrvoje.niksic@avl.com> wrote:
On Thu, 2008-05-22 at 13:27 -0300, Facundo Batista wrote:
2008/5/22 Scott Dial <scott+python-dev@scottdial.com>:
If we changed Python to slice-by-reference, then tomorrow someone would be asking why it isn't slice-by-copy. There are pros and cons to both that are
Which are the cons of slice-by-reference of an immutable string?
You have to consider the ramifications of such a design choice. There are two cases to consider: either slices return strings, or they return a different type.
If they return strings, then all strings must grow three additional fields: start, end, and the reference to the actual string. That is 16 more bytes for *every* string, hardly a net win.
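To make the "slices return strings" case concrete, here is a minimal Python sketch of a string type that carries the three extra fields. The class name RefStr and its attribute names are hypothetical, invented for illustration; this is not how CPython's str is laid out, only a model of the bookkeeping being described.

```python
class RefStr:
    """Hypothetical string type whose slices share the original buffer."""

    # The three extra fields every such string would have to carry.
    __slots__ = ("_base", "_start", "_end")

    def __init__(self, base, start=0, end=None):
        self._base = base                          # reference to the actual string
        self._start = start                        # first character of the view
        self._end = len(base) if end is None else end

    def __len__(self):
        return self._end - self._start

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, step = key.indices(len(self))
            if step != 1:
                raise ValueError("this sketch only supports contiguous slices")
            stop = max(stop, start)
            # O(1): only the offsets change, no characters are copied.
            return RefStr(self._base, self._start + start, self._start + stop)
        if not -len(self) <= key < len(self):
            raise IndexError("string index out of range")
        if key < 0:
            key += len(self)
        return self._base[self._start + key]

    def __str__(self):
        # Materializing the text is the only place that copies characters.
        return self._base[self._start:self._end]


s = RefStr("hello world")
t = s[6:]
print(str(t))               # world
print(t._base is s._base)   # True
```

Even a RefStr that is never sliced pays for _start, _end and _base, which is the per-string overhead the paragraph above is pointing at.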
A lot of dynamic language implementations have a complex string representation, where individual bits of the string tell what the rest of the representation is. Mozilla's JavaScript implementation is like this. At the moment, a string in JavaScript is two pointer-sized words, and JavaScript has O(1) slicing and, in many cases, O(len(s2)) string concatenation.

There's a rather dense comment here explaining it:
http://hg.mozilla.org/mozilla-central/index.cgi/file/79924d3b5bba/js/src/jss...

The equivalent of PyString_AS_STRING and PyString_GET_SIZE contains a branch.

I don't think the implementation avoids the worst cases Guido was talking about; tiny substrings can keep huge strings alive.

-j
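A rough Python model of the two points above (accessors that branch on the representation, and a tiny substring pinning a huge string) might look like the following. The names FlatStr, DependentStr, get_size, as_string and substring are all hypothetical, and the layout is only loosely inspired by the description here; it is not Mozilla's actual structure. The final check relies on a weak reference to observe that the large object is still alive.

```python
import weakref


class FlatStr:
    """Owns its characters directly."""

    def __init__(self, chars):
        self.chars = chars


class DependentStr:
    """Refers to a range of a FlatStr; owns no characters itself."""

    def __init__(self, base, start, length):
        self.base = base
        self.start = start
        self.length = length


def get_size(s):
    # The accessor has to branch on which representation it was handed.
    if isinstance(s, DependentStr):
        return s.length
    return len(s.chars)


def as_string(s):
    # Same branch here; the Python slice copies, a C version would
    # return a pointer into the base buffer instead.
    if isinstance(s, DependentStr):
        return s.base.chars[s.start:s.start + s.length]
    return s.chars


def substring(s, start, length):
    """O(1) slicing: record offsets instead of copying characters."""
    if start < 0 or length < 0 or start + length > get_size(s):
        raise ValueError("substring range out of bounds")
    if isinstance(s, DependentStr):
        # Point at the ultimate base so chains of views do not build up.
        return DependentStr(s.base, s.start + start, length)
    return DependentStr(s, start, length)


huge = FlatStr("x" * 1_000_000)
probe = weakref.ref(huge)        # lets us see whether huge is still alive
tiny = substring(huge, 10, 3)
del huge                         # drop the only direct reference
print(as_string(tiny))           # xxx
print(probe() is not None)       # True
```

The three-character view is enough to keep the million-character buffer in memory, which is the worst case mentioned above.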