What does Python fix?

Mon Jan 21 11:21:20 EST 2002

"François Pinard" <pinard at iro.umontreal.ca> writes:
    ...
> > > In Lisp, there is the capability of having genuine pointers all along
a
> > > single list.  There is probably no memory-efficient equivalent in
Python.
>
> > I don't know what "genuine pointers" are in this context (as opposed to
> > Python references).  Do you mean pointers subject to address-arithmetic,
> > as in C?
>
> I meant that, given a tuple (a, b, c, d) or a list [a, b, c, d], you can
> only save a reference to the whole object starting on the first element.
> You may not reference the (b, c, d) part without having a new object with
> this contents; there is no memory sharing, at least so far that I know.

Ah, I see.  Not among the built-in types, AFAIK (it would surely be
feasible for _tuples_, given immutability).  Easy (if boilerplatey) to
synthesize, of course:

class listshare:
    def __init__(self, offset, alist):
        self._alist = alist
        self._offset = offset
    def __getitem__(self, index):
        return self._alist[index+offset]
    # etc, etc

If you wanted to make widespread use of this construct, it might be
worth your while to wrap it up as a C-coded extension, of course.

> > In any case, if for some given application you think cons cells are
> > the way to go, they're easy to make in Python.  I've chosen them as
> > an extension example, but they really work well in pure Python too.
> > See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66509 ...
>
> Someone might use the equivalent of cons-cells, with this representation:
>
>     (a, (b, (c, (d, ()))))
>
> Of course, I had this in mind while writing the original message.  When I
> say that there is probably no memory-efficient equivalent in Python, I am
> guessing that each 2-tuple is probably much more costly than the mere pair
> of memory words that would be needed for the same thing, in Lisp.

In which case, you can choose to use the referenced extension (where an
object takes, I believe, three "words" -- a type-pointer, car, and cdr).
I don't think you can boil it down to two, because of the type issue.
But this 50% overhead is just for the structuring -- each of a, b, etc,
is unaffected, so the "overall overhead" is partly amortized.

On the other hand, memory overhead for a very long Python list can be
much lower (in the same sense, i.e., just for the structuring part)
than an equivalently-long Lisp list, because the Python list is in
fact implemented as a contiguous array of single pointers -- one word
of structuring overhead per item (for long enough lists) versus two.
Performance characteristics differ wildly, of course.

> I much agree that all these considerations are a bit of rhetorical
exercise,
> as in real practice, for the examples being discussed, neither the space
> or time consumed in Python is worth the worry.  On the other hand, a
better
> understanding of things might yield better abilities to choose the correct
> representation, for when the time come to address bigger applications.

Yes, but I doubt that saving memory will be the dominating issue in any
such "bigger application" -- issues of O(N) vs O(N log N) performance,
and so on, would appear to be more likely to dominate, I think.

Alex