Working with binary data, S-records (long)

Sat Mar 22 14:48:14 EST 2003

On Fri, 21 Mar 2003 10:03:26 +0100, Thomas Heller wrote:

> You should try the struct module, this should simplify and maybe also
> speed up your code quite a bit.  Another possibility should be to write
> an extension not in plain C but with pyrex.

I don't think struct is really good at this. But maybe I'll test that,
too.

> OTOH, if you already have C code which implements the functionality of
> readrecord it's not a big deal to reuse this. You don't have to
> construct complicated Python data structures in C, simply build a tuple
> with Py_BuildValue().

Well, yes, kind of <ducking>. I wasn't all that honest: I've already used
exactly that module for an S-record-to-S-record utility, but it didn't
really use the data read (just the addresses, so it could find an unused
part). It just appended a few lines to the original. Anyway, it was kind
of kludgy. And I prefer a pure python version (as I don't have a Windows
machine with a compiler [I'm even compiling the old C program with a cross
compiler under Linux]). A Python version is just so much more portable and
maintainable.

> There are mutable string types in Python (sort of), they are just named
> differently.  The array module comes to mind, but if your data comes
> from files, mmap should also be considered.

I have now done a few profiling runs (here at home, using an S-record
file holding 330K of binary data, created with objcopy):

Original version: about 4 seconds
binascii.unhexlify instead of list comprehension: about 2.7 seconds
unhexlify and array("B") instead of lists: about 2.0 seconds

That cuts the run time by 50 % in my book, and I declare this good enough.
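
For illustration, here is a minimal sketch of the unhexlify-plus-array("B")
approach measured above. The actual code was not posted, so the names
read_record, hex_to_list and ADDR_BYTES, the checksum handling, and the
Python 3 syntax are all assumptions, not the real program:

```python
import binascii
from array import array

# Address width in bytes for the data-carrying record types S1, S2, S3.
# (Hypothetical helper table, not taken from the original program.)
ADDR_BYTES = {"1": 2, "2": 3, "3": 4}


def hex_to_list(hexstr):
    """List-comprehension decoding, roughly what the slow version might do."""
    return [int(hexstr[i:i + 2], 16) for i in range(0, len(hexstr), 2)]


def read_record(line):
    """Parse one data S-record; return (address, array('B') of data bytes).

    Non-data records (header, count, termination) and blank lines give None.
    """
    line = line.strip()
    if not line.startswith("S") or line[1:2] not in ADDR_BYTES:
        return None
    # One C-level call decodes the whole line instead of a Python loop.
    raw = array("B", binascii.unhexlify(line[2:]))
    count, body, checksum = raw[0], raw[1:-1], raw[-1]
    if count != len(raw) - 1:
        raise ValueError("byte count mismatch")
    # The checksum is the ones' complement of the low byte of the sum.
    if (count + sum(body) + checksum) & 0xFF != 0xFF:
        raise ValueError("bad checksum")
    n = ADDR_BYTES[line[1]]
    address = int.from_bytes(body[:n].tobytes(), "big")
    return address, body[n:]
```

Called as read_record("S1067AF00A0A0D6E"), this should return the address
0x7AF0 and an array holding the three data bytes. The returned array is
mutable, so single bytes can be patched in place without rebuilding a string.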

On a side note: It took me about 3 minutes to run all 3 versions,
including the 2 modifications. Just one of the reasons I love Python: It's
so easy to try various things. Not only because of the interpreter, but
because my programs tend to just work when done in Python. Even after
bigger changes, there are maybe a few small errors, then they're ok again.

A word about Windows 2k and XP on said 166 MHz machines: They're currently
running '95 or NT 4.0, but the C program only works under '95. Now, when
those machines get replaced, there'll be no more Windows 95, so it's
better to be ready for that change. But until then, this program has to
run on those boxen. Hope this clarifies this issue.

Many thanks to all of you for the good suggestions.
The friendly and helpful people in c.l.p are one of Python's biggest
assets.

Hans-Joachim