Does Python really follow its philosophy of "Readability counts"?

Fri Jan 23 20:41:35 EST 2009

Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> writes:

> Let's be specific here. The list implementation in CPython is an array 
> with a hidden field storing the current length. If this hidden field was 
> exposed to Python code, you could set it to a value much larger than the 
> actual size of the array and cause buffer overflows, and random Python 
> code could cause core dumps (and possibly even security exploits).

[...]

> As I see it, you have two coherent positions. On the one hand, you could 
> be like Mark Wooding, and say that Yes you want to risk buffer overflows 
> by messing with the internals

Please, point out where I said that!

I'm pretty sure that the only time I commented on this particular point
(in message <87y6x2cih0.fsf.mdw at metalzone.distorted.org.uk>), I said:

: Umm... I'm pretty sure that that's available via the `len' function,
: which is tied to list.__len__ (via the magic C-implemented-type mangler,
: in C).  Though it's read-only -- and this is a shame, 'cos it'd be nice
: to be able to adjust the length of a list in ways which are more
: convenient than
: 
:   * deleting or assigning to a trailing slice, or
:   * augmenting or assigning to a trailing zero-width slice
: 
: (Perl has supported assigning to $#ARRAY for a long time.  Maybe that's
: a good argument against it.)

While I realise I didn't spell it out, the semantics I had in mind were

        foo.len = n

means

        if n < 0:
            raise ValueError("don't be stupid")
        elif len(foo) < n:
            # grow: pad the tail with None
            foo += [None] * (n - len(foo))
        else:
            # shrink (or leave alone): drop everything from index n on
            foo[n:] = []

(I'm not fussy what the new array slots get filled with, but it seems
sensible to be clear.  Perl's semantics are more complicated: if you
decrease $#foo and then increase it again you get the old values back.
Common Lisp users will recall the idea of a fill-pointer.)

If there's anything unsafe about that, I'll be surprised.
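
A minimal runnable sketch of those semantics, written as a plain function
since lists have no settable `len' attribute.  The name set_length and its
`fill' parameter are made up for illustration; nothing like them exists in
the list API.

```python
def set_length(lst, n, fill=None):
    """Resize lst in place to exactly n items (hypothetical helper)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if len(lst) < n:
        # Grow: pad the tail with the fill value.
        lst += [fill] * (n - len(lst))
    else:
        # Shrink (or no-op): drop everything from index n onwards.
        del lst[n:]


foo = [1, 2, 3]
set_length(foo, 5)
print(foo)  # [1, 2, 3, None, None]
set_length(foo, 2)
print(foo)  # [1, 2]
```

Every step goes through ordinary list operations, so the interpreter keeps
the underlying array and its stored length in agreement throughout.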

> -- in which case I'm not sure what you see in Python, which protects
> so many internals from you. Or you can say that you made a mistake,
> that there are *some* good reasons to protect/hide internals from
> external access.

Safety is good.  Escape hatches are good, too.

> In the second case, the next question is, why should it only be code 
> written in C that is allowed that protection?

Because Python code can't cause those sorts of problems without
resorting to the escape hatches (e.g., ctypes).  And, very
significantly, because C code /needs/ that protection and Python
basically doesn't.

The basic difference is that C code is fundamentally brittle: if you
mess up its invariants, it can crash horribly and possibly allow its
brain to be taken over by evil people.  Python code is fundamentally
robust.  The worst that can happen[1] is that the interpreter raises an
exception.  This makes it ideally suited to having a more relaxed
attitude to life.  And that, in turn, makes it approachable,
interactively hackable, and fun!
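
To illustrate the point about exceptions, here is a small sketch, assuming
a toy Stack class invented for this example.  It keeps a redundant,
unprotected count field; outside code breaks the invariant on purpose, and
the result is an ordinary exception rather than a crash.

```python
class Stack(object):
    """Toy stack with a redundant, unprotected length field (hypothetical)."""

    def __init__(self):
        self.items = []
        self.count = 0  # invariant: count == len(items)

    def push(self, x):
        self.items.append(x)
        self.count += 1

    def top(self):
        # Trusts the invariant, much as C code trusts a stored length.
        return self.items[self.count - 1]


s = Stack()
s.push("a")
s.count = 1000  # deliberately break the invariant from outside

try:
    s.top()
    outcome = "no error"
except IndexError as exc:
    # The bounds check in list indexing catches the bad index.
    outcome = "IndexError: %s" % exc

print(outcome)
```

The equivalent mistake in C would read past the end of the array; here the
list's own bounds check turns it into an IndexError that the caller can
catch.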

[1] Assuming that (a) the Python implementation and C extensions are
    correct, and (b) that the code in question isn't using the escape
    hatches.

-- [mdw]