Improved struct module

Tim Peters tim_one at email.msn.com
Wed Oct 13 02:37:46 EDT 1999


[Tim asks why the xstruct extension
    http://www.sis.nl/python/xstruct/xstruct.html
 is written in C
]

[Robin Boerdijk]
> There are a number of reasons for implementing the xstruct module in C as
> opposed to implementing it as a Python wrapper around the current struct
> module.
>
> 1. Implementing it as a Python wrapper would be horribly inefficient and
> not as simple as you seem to think. To randomly change the value of a field
> somewhere in the middle of a packed binary data buffer, I would have to do
> something like:
>  ... [code to randomly change the value of a field somewhere in the middle
>       of a binary string] ...

But that's not a reasonable implementation -- the data only needs to be in
packed form when it's made externally visible; e.g., Just van Rossum's
sstruct module got toward a similar end by slinging ordinary class instance
dicts:

     http://starship.python.net/crew/just/code/sstruct.py

Mutations are faster that way, despite being written in Python.  Just pays
for a regular struct.pack() at the end (and/or struct.unpack() at the
start), though.
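
A minimal sketch of that approach, for illustration only: it is not taken
from sstruct.py, and the Record class, its field names and its format
string are all hypothetical.  Fields live as ordinary instance attributes,
so mutating one is a plain attribute assignment, and struct.pack() runs only
when the packed bytes are actually needed:

```python
import struct


class Record:
    """Hypothetical record whose fields are ordinary instance attributes."""

    # Assumed layout: little-endian unsigned short, unsigned long, 4-byte string.
    _format = "<HL4s"
    _names = ("kind", "length", "tag")

    def __init__(self, **fields):
        for name in self._names:
            setattr(self, name, fields[name])

    def pack(self):
        # Pay for struct.pack() only when the data must become externally visible.
        return struct.pack(self._format, *[getattr(self, n) for n in self._names])

    @classmethod
    def unpack(cls, data):
        # Pay for struct.unpack() once, at the start.
        values = struct.unpack(cls._format, data)
        return cls(**dict(zip(cls._names, values)))


rec = Record(kind=1, length=10, tag=b"abcd")
rec.length = 4096      # mutation is a plain attribute store, no repacking
packed = rec.pack()    # a single pack at the end
```

The trade-off is the one described above: cheap mutations in exchange for
one pack (and possibly one unpack) at the boundary.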

> I also feel that the xstruct's interface is the more natural interface for
> low-level access to packed binary data structures.

People will argue that both ways (the usual "wordy vs concise" and "two
steps vs one step" arguments), but I don't see that the spelling of the
interface bears one way or the other on the choice of implementation
language.

> I'm pretty sure that most people currently using the pack/unpack interface
> would not have done so if they had a more structured interface like xstruct
> provides (see another follow-up to your reply).

Just's sstruct.py has been around for about a year, but isn't mentioned
much.  I can't help but draw an analogy to Ka-Ping Yee's "nice but wordy"
regexp wrapper -- the wordier approaches have their fans (among which I
sometimes appear <wink>), but in a world overwhelmingly filled with 4-member
structs and 15-character regexps, most people vote for
concise-despite-cryptic one-step approaches most of the time.

So I disagree on this one:  given a choice, most people would continue to
use the gibberish format strings.  Wordier interfaces are wonderful when
things get complicated, though, and it's good to have a choice.

It would help if you beefed up the docs.  For example, none of the module's
magic constants are documented (whether for layout, field type or "flags");
doc strings are generally absent; and objects of type structdef and
structobject respond to dir() by returning an empty list.  Because of this,
you really have to be an expert in the use of the current struct module to
approach the xstruct module, and there's no way to figure it out except by
reading the implementation code.

> 2. The xstruct objects support the new buffer interface of Python 1.5.2.
> This makes it possible to read and write data directly from a
> file or socket into and out of a packed memory buffer. All we have to do
> is to add support for the buffer interface to other low level modules like
> cStringIO and socket (files already do !!) and we would have a perfect set
> of seamlessly fitting, complementary packages. How could I achieve this in
> native Python ?

You could not at present, although it's not clear that's a bad thing.  The
buffer interface has had a troubled history, and it's unclear whether it
will survive.  The confusions surrounding it largely account for why it's
almost wholly undocumented, and not yet widely implemented even internally.
It's not ready for prime time (e.g., it's easy to crash the interpreter with
the little that's already there, staying within pure Python).

If it becomes fully supported, ways to get at it from within Python will tag
along (note that there's already a builtin "buffer()" function -- that's the
hook).
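
For comparison, a hedged sketch of what in-place access to a packed buffer
can look like from pure Python.  It assumes facilities from later Python
releases (bytearray, struct.pack_into(), struct.unpack_from() and
readinto()), not the 1.5.2 buffer() hook mentioned above, and the layout and
field offset are made up for illustration:

```python
import io
import struct

FORMAT = "<HL4s"    # assumed layout: unsigned short, unsigned long, 4-byte string
LENGTH_OFFSET = 2   # the "L" field starts right after the 2-byte "H" field

# A mutable buffer exactly large enough for one packed record.
buf = bytearray(struct.calcsize(FORMAT))

# io.BytesIO stands in for a file or socket; readinto() fills buf directly.
source = io.BytesIO(struct.pack(FORMAT, 1, 10, b"abcd"))
source.readinto(buf)

# Overwrite one field in the middle of the packed data, in place.
struct.pack_into("<L", buf, LENGTH_OFFSET, 4096)

# Read the whole record back out of the same buffer.
kind, length, tag = struct.unpack_from(FORMAT, buf)
```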

> Another reason for providing the buffer interface is to facilitate writing
> extension modules for C APIs that make heavy use of C structs. The most
> notorious example I know is the MQI interface of IBM's MQSeries. I think it
> defines more than 20 C structs, all of which I can define in
> Python now and still interact nicely with the Python MQI extension module
> written in C.

Haven't heard of this, but don't people use SWIG anymore <0.7 wink>?

mucking-with-raw-c-structs-is-a-thing-to-avoid-ly y'rs  - tim