[Python-Dev] Deprecate the buffer object?

Neil Schemenauer nas-python at python.ca
Thu Oct 30 12:19:39 EST 2003


On Thu, Oct 30, 2003 at 07:21:01AM -0800, Neil Schemenauer wrote:
> I don't see any problem with that.

Okay, small problem.  The hash function for the buffer object is brain
damaged, in more ways than one actually:

    >>> import array
    >>> a = array.array('c')
    >>> b = buffer(a)
    >>> hash(b)

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 16384 (LWP 5311)]
    buffer_hash (self=0x40262d00) at Objects/bufferobject.c:241
    241             x = *p << 7;
    (gdb) l
    236                     return -1;
    237             }
    238     
    239             len = self->b_size;
    240             p = (unsigned char *) self->b_ptr;
    241             x = *p << 7;
    242             while (--len >= 0)
    243                     x = (1000003*x) ^ *p++;
    244             x ^= self->b_size;
    245             if (x == -1)
    (gdb) p len
    $1 = 0
    (gdb) p *p
    Cannot access memory at address 0x0
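
To spell out what the listing shows: for an empty array b_size is 0 and
b_ptr is NULL, and line 241 reads *p before the loop ever looks at len.
Here is a rough Python model of lines 239-244 with a guard added for the
empty case.  The name model_buffer_hash, the 32-bit wraparound and the
choice of 0 as the seed for an empty buffer are all assumptions made for
illustration; this is a sketch, not a proposed patch.

```python
def model_buffer_hash(data, bits=32):
    """Rough model of lines 239-244 of the listing; not the real C code."""
    mask = (1 << bits) - 1          # assumption: emulate a wrapping 32-bit long
    data = bytes(data)
    if not data:
        # Hypothetical guard: the C code evaluates *p at this point even
        # when b_size == 0 and b_ptr == NULL, which is the crash shown above.
        x = 0
    else:
        x = (data[0] << 7) & mask   # x = *p << 7
    for byte in data:               # while (--len >= 0)
        x = ((1000003 * x) ^ byte) & mask
    x ^= len(data)                  # x ^= self->b_size
    return x


# An empty buffer no longer touches any data at all.
empty_hash = model_buffer_hash(b"")
# The result depends only on the bytes, not on the type that holds them.
same = model_buffer_hash(b"abc") == model_buffer_hash(bytearray(b"abc"))
```

A guard like this would only stop the segfault.  It does nothing about
the hash being computed from data that may change later.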

The buffer object has 'b_readonly' and 'b_hash' fields.  If 'b_readonly'
is true then the object is considered hashable and, once computed, the
hash is stored in the 'b_hash' field.  The problem is that the buffer
API doesn't provide a way to determine whether the underlying data is
read only.  The absence of getwritebuf() is not the same thing as being
read only.  The buffer() builtin always sets the 'b_readonly' flag!

I don't think the buffer hash method can depend on the data being
pointed to.  There is nothing in the buffer interface that tells
you if the data is immutable.  The hash method could return the id
of the buffer object but I'm not sure how useful that would be.
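
To illustrate the concern, here is a small sketch of what goes wrong when
a cached hash is computed from data that can still change.  The class
ContentHashedView is a hypothetical stand-in for the buffer object (its
_hash attribute plays the role of b_hash), and a bytearray stands in for
the mutable array; none of this is the actual buffer implementation.

```python
class ContentHashedView:
    """Hypothetical view whose hash is computed from the data and cached."""

    def __init__(self, target):
        self.target = target    # shared storage that may be mutated later
        self._hash = None       # plays the role of b_hash

    def __eq__(self, other):
        return (isinstance(other, ContentHashedView)
                and bytes(self.target) == bytes(other.target))

    def __hash__(self):
        if self._hash is None:
            self._hash = hash(bytes(self.target))
        return self._hash


storage = bytearray(b"abc")
view = ContentHashedView(storage)
table = {view: "value"}          # caches the hash of b"abc"

storage[0] = ord("x")            # the underlying data changes

fresh = ContentHashedView(storage)
# fresh compares equal to view, but its hash comes from the new contents,
# so the dictionary lookup misses the entry stored under the stale hash.
found = fresh in table
```

An identity-based hash avoids the stale value, but then two buffers over
the same bytes hash differently, which is why it may not be very useful.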

  Neil


