[Python-Dev] buffer objects

Scott Gilbert xscottg@yahoo.com
Sat, 4 May 2002 21:32:19 -0700 (PDT)


--- Tim Peters <tim.one@comcast.net> wrote:
>
> Note that "the buffer object problem" is an annual pointless
> discussion. The last big round on Python-Dev was near the start
> of June 2001, under the unlikely Subject "strop vs. string".
> Nobody volunteered to do anything then either, although one
> person made a big show of agreeing to plug the holes, and then
> didn't.  That's how it usually ends.
> 
> If you want to "do something" here, I suggest reading the old
> threads to get coverage of the outstanding issues.  Nothing new
> has been said or suggested in years.
> 


Lest I become associated with other do-nothing whiners, I've submitted
a (bug-fix-only) patch that addresses the following issues:

  1) Dangling pointer problem
  2) buffer allocated by PyBuffer_New not aligned

It doesn't fix these:

  3) The hash is cached, and can be made invalid
  4) An int is not big enough for 64 bit pointers
  5) Can't handle multiple segments

It introduces:

  6) Working with the buffer object that refers
     to a separate "base" object might be slower.

The fix for issue #2 only aligns the buffer to the alignment of doubles.
It's probably not worth the effort to promise more, but I'll revisit that
(and do it the way you suggest in bug 472568) if you think it is.
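
To make the alignment point concrete, here is a minimal sketch of one way
to get a double-aligned data area when the data is allocated in the same
block as an object header: round the header size up to a multiple of the
alignment of double. This is not the code from the patch; toy_header,
DOUBLE_ALIGN, data_offset, toy_buffer_new and toy_buffer_data are all
hypothetical names made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object header; NOT the real PyBufferObject. */
typedef struct {
    long refcount;
    void *base;
    size_t size;
    char flag;
} toy_header;

/* Pre-C11 trick: the offset of d is the alignment the compiler gives double. */
struct double_probe { char c; double d; };
#define DOUBLE_ALIGN offsetof(struct double_probe, d)

/* Header size rounded up to the next multiple of DOUBLE_ALIGN. */
static size_t data_offset(void)
{
    size_t a = DOUBLE_ALIGN;
    return (sizeof(toy_header) + a - 1) / a * a;
}

/* Allocate header and nbytes of data in one block; NULL on failure. */
static toy_header *toy_buffer_new(size_t nbytes)
{
    toy_header *h = malloc(data_offset() + nbytes);
    if (h != NULL) {
        h->refcount = 1;
        h->base = NULL;
        h->size = nbytes;
        h->flag = 0;
    }
    return h;
}

/* The data area starts at the rounded-up offset, so it is double-aligned
   as long as malloc's result is (which the C standard guarantees). */
static void *toy_buffer_data(toy_header *h)
{
    return (char *)h + data_offset();
}
```

Promising a stricter alignment would mean rounding up to something larger
than DOUBLE_ALIGN, which is the part I'd rather not sign up for unasked.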

Issue #3 is easy to fix IMNSHO.  My first choice would be to just delete
the hash function in there.  My second would be to have the hash delegate
to the contained "base" object.  Either of these could probably break code
somewhere though, and that sounds like a Guido call to me...
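
For anyone who hasn't looked at issue #3, the failure mode can be sketched
in a few lines. This is a toy model and not the bufferobject.c code:
toy_view, compute_hash and toy_hash are hypothetical names, and the hash
function is an arbitrary simple one picked for the example.

```c
#include <stddef.h>

/* Toy view onto memory owned by someone else, with a cached hash. */
typedef struct {
    unsigned char *ptr;
    size_t size;
    long cached_hash;   /* -1 means "not computed yet" */
} toy_view;

/* Simple multiplicative hash, for illustration only. */
static long compute_hash(const unsigned char *p, size_t n)
{
    unsigned long h = 5381;
    size_t i;
    for (i = 0; i < n; i++)
        h = h * 33 + p[i];
    /* Drop the top bit so the result is non-negative and never
       collides with the -1 sentinel. */
    return (long)(h >> 1);
}

/* Computes the hash once, then keeps returning the cached value,
   even if the underlying memory has changed in the meantime. */
static long toy_hash(toy_view *v)
{
    if (v->cached_hash == -1)
        v->cached_hash = compute_hash(v->ptr, v->size);
    return v->cached_hash;
}
```

Once the underlying bytes are modified through any other route, toy_hash
keeps reporting the old value while compute_hash on the same memory gives
a new one. Deleting the hash or delegating to the base both avoid keeping
that stale copy around.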

Issue #4 is too big to worry about here.  PyBufferProcs don't return a
LONG_LONG, and changing that WOULD break a lot of code.
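
The practical consequence of issue #4 is that a length has to fit in an
int before it can be reported through an int-returning interface. A
minimal sketch of the kind of range check that implies, with
length_as_int being a hypothetical helper and not anything in the Python
sources:

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 and stores the length in *out if it fits in an int;
   returns 0 and leaves *out untouched otherwise. */
static int length_as_int(size_t nbytes, int *out)
{
    if (nbytes > (size_t)INT_MAX)
        return 0;
    *out = (int)nbytes;
    return 1;
}
```

On a platform where size_t is wider than int, anything past INT_MAX bytes
simply can't be described this way, and no check can fix that short of
changing the return type.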

Issue #5 is confusing for me, and probably a YAGNI for most everyone.  I
don't know what the correct behavior is.  Are you supposed to consider the
discontiguous segments as one contiguous buffer?  If so, then it could be
fixed, but it would probably double the code length...
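
If the answer is "yes, treat them as one contiguous buffer", one possible
reading is sketched below: a logical index is mapped onto whichever
segment contains it. I don't know that this is the intended behavior, so
take it as a guess. toy_segment, total_length and byte_at are hypothetical
names, and nothing here uses the real PyBufferProcs.

```c
#include <stddef.h>

/* One discontiguous piece of a larger logical buffer. */
typedef struct {
    const char *ptr;
    size_t len;
} toy_segment;

/* Length of the logical buffer: the sum of all segment lengths. */
static size_t total_length(const toy_segment *segs, size_t nsegs)
{
    size_t total = 0, s;
    for (s = 0; s < nsegs; s++)
        total += segs[s].len;
    return total;
}

/* Stores the byte at logical index i in *out and returns 0,
   or returns -1 if i is past the end of the last segment. */
static int byte_at(const toy_segment *segs, size_t nsegs, size_t i, char *out)
{
    size_t s;
    for (s = 0; s < nsegs; s++) {
        if (i < segs[s].len) {
            *out = segs[s].ptr[i];
            return 0;
        }
        i -= segs[s].len;   /* skip this segment and keep looking */
    }
    return -1;
}
```

Every operation (slicing, comparison, concatenation) would need the same
segment walk, which is where the extra code length comes from.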

Issue #6 is probably a non-issue.  I haven't profiled it, but calling
PyBufferProcs to get the pointers each time is probably not that expensive.
Since this object is really only useful from Python scripts, loop overhead
would probably dominate any true-to-life problem.
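
Issues #1 and #6 are two sides of the same trade, which the sketch below
tries to show: instead of caching a raw pointer that dangles when the base
reallocates, ask the base for its current pointer on every access. This is
a toy model rather than the patch itself. toy_base, toy_ref,
toy_get_read_buffer, toy_base_resize and toy_ref_get are hypothetical
names, and toy_get_read_buffer only stands in for a call through
PyBufferProcs.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical "base" object whose storage can move when resized. */
typedef struct {
    char *data;
    size_t size;
} toy_base;

/* Stand-in for asking the base for its current pointer and length. */
static size_t toy_get_read_buffer(toy_base *b, char **out)
{
    *out = b->data;
    return b->size;
}

/* Resize by allocating fresh storage, so the address always changes. */
static int toy_base_resize(toy_base *b, size_t n)
{
    char *p = malloc(n);
    size_t keep = b->size < n ? b->size : n;
    if (p == NULL)
        return -1;
    memset(p, 0, n);
    if (keep > 0)
        memcpy(p, b->data, keep);
    free(b->data);
    b->data = p;
    b->size = n;
    return 0;
}

/* The view keeps a reference to the base and an offset, never a pointer. */
typedef struct {
    toy_base *base;
    size_t offset;
} toy_ref;

/* Re-fetch the pointer on each access; returns -1 if out of range. */
static int toy_ref_get(toy_ref *r, size_t i, char *out)
{
    char *p;
    size_t n = toy_get_read_buffer(r->base, &p);
    if (r->offset >= n || i >= n - r->offset)
        return -1;
    *out = p[r->offset + i];
    return 0;
}
```

The cost is one extra indirect call per access, which is the slowdown
issue #6 refers to.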


Are there any other problems in PyBufferObject that I'm missing?  I
searched SourceForge, but searching for "buffer" returns a lot of false
hits, and searching for PyBufferObject only returns 2.


I hope I'm not out of line, but since you seem to be aware of the issues,
I've assigned this patch to you.  I was hoping to test it on Unix Friday at
work, but a long lunch led to seeing Spiderman led to an early happy hour
led to Cinco de Mayo led to a hangover today.  Since my reputation as a
non-whiner is on the line here, I think I'll submit it now rather than wait
til Monday to test it on Unix.  It passes the regression tests on Win32.

I'm also assigning my feature patch to you.  If you don't find any problems
with my bug fixes above, then there shouldn't be any reason why the buffer
builtin can't return a read-write PyBufferObject.


BTW: It's pretty easy to understand why this thing never gets fixed.  As
soon as you discover what the problems are, you realize that even if they
are fixed, this thing isn't very applicable to what you were hoping to use
it for.  I'll submit a PEP for what I think would be more usable...


Cheers,
    -Scott





