[Image-SIG] Proper application of the buffer interface

Mark Hammond mhammond at skippinet.com.au
Thu Aug 5 04:57:04 CEST 1999

>   The context is this:  the t1python package (an interface to a Type1
> font renderer) has, in the past, been passing bitmaps back to Python
> as string objects.  In the next release, I may create a new type in
> the C layer that provides all the interesting stuff, and would like to
> be able to convert these "glyph" objects to PIL images and GTK+/GNOME
> compatible images.  Does it make the most sense for the glyph objects
> to offer the buffer interface to make more conversions possible
> without increasing the number of memory copies, or do I misunderstand
> the application of the interface?

I believe you are correct, although _all_ these packages need to support
the interface.  The good news is that due to tricks inside Python, they
already may.

As you mention, it is common to use Python string objects to move chunks of
binary data around.  The buffer interfaces now allow us to use _any_
object; as long as it conforms to the buffer interface, you don't lose
anything by not using strings (and you obviously gain whatever
functionality your object has).

The general idea is that PIL and other such frameworks can think in terms
of buffers.  Rather than PIL saying "give me a Python string with the raw
binary data", it can say "give me an object from which I can obtain a
buffer with the raw binary data".  Strings obviously still fit the bill.
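
To make that concrete, here is a rough sketch in present-day Python 3, where
memoryview() is how Python code asks an object for its buffer.  The function
name count_set_pixels and the sample data are made up for illustration; this
is not PIL's actual API.

```python
from array import array


def count_set_pixels(data):
    """Count the 1 bits in any object that exposes a buffer of raw bytes."""
    # memoryview() asks the object for its buffer; no copy is made here.
    view = memoryview(data).cast("B")
    return sum(bin(byte).count("1") for byte in view)


# A bytes object, a bytearray and an array of unsigned bytes all fit the bill.
as_bytes = bytes([0xFF, 0x0F, 0x00])
as_bytearray = bytearray(as_bytes)
as_array = array("B", [0xFF, 0x0F, 0x00])

results = [count_set_pixels(obj) for obj in (as_bytes, as_bytearray, as_array)]
```

The function never checks the type of its argument; it only requires that a
buffer can be obtained from it.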

The good news is that it is most common for extension modules to spell
"give me a string with the raw binary data" as "PyArg_ParseTuple("s#",
...);".  PyArg_ParseTuple has been upgraded to use the buffer interfaces,
and so have Python string objects.  Thus, whenever code uses
PyArg_ParseTuple in that way, it already supports the buffer interface.

So you could implement your new object and define the buffer interfaces
for it.  This object could then be passed to any C extension function that
uses PyArg_ParseTuple, and the extension module will still think it has a
"char *" pointer into a string object.  In practice, this means that people
will automatically be able to say "file.write(your_object)" etc. with your
new object.
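
A small sketch of what that looks like from the Python side, written for
present-day Python 3.  The array object here is only a stand-in for a glyph
object (it is not t1python's type), and io.BytesIO stands in for a file
opened in binary mode.

```python
import io
from array import array

# Stand-in for a glyph object: array exposes its bytes via the buffer interface.
glyph_bits = array("B", [0x3C, 0x42, 0x81, 0x81, 0x42, 0x3C])

# BytesIO plays the role of a file opened in binary mode.
out = io.BytesIO()
written = out.write(glyph_bits)  # no conversion to a string object needed

contents = out.getvalue()
```

The write() call neither knows nor cares that it was handed something other
than a string; it simply asks the object for its buffer.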

The problem remains, of course, for extensions that use functions such as
PyString_Check() and PyString_AsString().  If they were upgraded to use the
buffer interfaces, then the transition would be complete.
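
The same contrast can be sketched at the Python level (Python 3 here; both
function names are invented for the example).  The first function insists on
one concrete type, much as a PyString_Check() call does in C, while the
second only asks for a buffer.

```python
def size_strict(data):
    """Old style: insists on one concrete type."""
    if not isinstance(data, bytes):
        raise TypeError("expected a bytes object")
    return len(data)


def size_buffer(data):
    """Buffer style: accepts anything that can hand over a buffer."""
    return memoryview(data).nbytes


payload = bytearray(b"\x00\x01\x02\x03")  # binary data, but not a bytes object

try:
    size_strict(payload)
    strict_ok = True
except TypeError:
    strict_ok = False

buffer_size = size_buffer(payload)
```

The strict version rejects perfectly good binary data purely because of its
type, which is the situation such extensions are in today.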

Just to extend my guesswork somewhat, there is a new built-in "buffer()"
function, which returns a "buffer" object.  I speculate this should be used
in preference to Python strings when you have binary data.  As buffer
objects support the buffer interfaces, they are basically as functional as
strings for this purpose, but clearly indicate the data is not really a
string!  This appears to be more a matter of style, and it also paves the
road to Unicode: e.g., it makes sense to convert any Python string object
to Unicode, but not necessarily a binary buffer.
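
For readers trying this today: the buffer() built-in no longer exists in
Python 3, and memoryview() is its closest equivalent.  The sketch below uses
memoryview() under that assumption, with made-up sample data, to show an
object that wraps binary data without copying it and without pretending to
be a string.

```python
raw = bytearray(b"\x89PNG\r\n\x1a\n")

view = memoryview(raw)  # wraps the data, does not copy it
header = view[:4]       # slicing a view is also copy-free

raw[0] = 0x00           # change the underlying data in place...
header_bytes = bytes(header)  # ...and the change is visible through the view

is_text = isinstance(view, str)
```

Because the view is not a string type, nothing will be tempted to treat the
data as text.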

>   I'd appreciate some input on this matter.  I hope to get the next
> release of t1python done before too much longer, and this is probably
> the biggest remaining question that I need to deal with.

Probably Greg and Guido are the only two with the real insight, as they
thrashed out the details.  But I'm pretty happy with my understanding (as
detailed above) of the issues.

