[Tutor] need help with syntax

Liam Clarke ml.cyresse at gmail.com
Wed Jan 11 21:48:58 CET 2006


>
> You mentioned earlier that you're expecting an integer, an integer, and
> then a sequence of float.  Don't count bytes if you can help it: let
> Python's struct.calcsize() do this for you.
>
>    http://www.python.org/doc/lib/module-struct.html
>
> Your machine may align bytes differently than you might expect: it may be
> best to let the machine handle that for you.


Erk, Danny is, of course, right.

Perhaps I could rewrite my earlier suggestion as -

import struct

# f is the data file, opened in binary mode ("rb")
stepValPattern = ">2i"
bytesToRead = struct.calcsize(stepValPattern)
bytes = f.read(bytesToRead)
(steps, value) = struct.unpack(stepValPattern, bytes)

# steps tells us how many floats follow
floatPattern = ">%df" % steps
bytesToRead = struct.calcsize(floatPattern)
bytes = f.read(bytesToRead)
floatList = struct.unpack(floatPattern, bytes)

f.read() works as follows (a rough explanation). Let's say
you have an 11 byte file -

00 0E A1 DC B3 09 FF AA B1 B2 00

f.read(3) will return 00 0E A1.

Your file reference is now pointing at the 4th byte of the file, like so.

00 0E A1 DC B3 09 FF AA B1 B2 00
.........^   <-current read position

So calling f.read(3) again will return

DC B3 09

Calling f.tell() will return the current read position, which would
now be 6, the 7th byte (remembering computers count from 0 up).
f.seek(x) tells Python to move the read position to offset x, i.e. the
(x+1)th byte of the file.

You can use that with text files too. E.g. if you've just called
f.readlines(), you can use f.seek(0) to return to the start of the
file.
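
To watch the read position move for yourself, here is a small sketch.
It uses io.BytesIO as an in-memory stand-in for a real file opened in
binary mode (that stand-in, and the variable names, are just for
illustration), loaded with the same bytes shown above.

```python
import io

# In-memory stand-in for a file opened in binary ("rb") mode.
f = io.BytesIO(bytes([0x00, 0x0E, 0xA1, 0xDC, 0xB3, 0x09,
                      0xFF, 0xAA, 0xB1, 0xB2, 0x00]))

first = f.read(3)    # bytes at offsets 0-2: 00 0E A1
pos1 = f.tell()      # read position is now 3 (the 4th byte)
second = f.read(3)   # bytes at offsets 3-5: DC B3 09
pos2 = f.tell()      # read position is now 6 (the 7th byte)

f.seek(0)            # jump back to the start of the file
again = f.read(3)    # the same three bytes as the first read

print(first.hex(), pos1, second.hex(), pos2, again.hex())
```

Each read() picks up where the previous one stopped, which is what
lets you pull a header and then its payload in two separate calls.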

Anyway, I digress, but I'm trying to clarify how struct and f.read()
work together.

So, let's pretend all integers are 4 bytes, ditto all floats, and
there's no byte alignment going on whatsoever. (Of course, in real
life you can't assume that, hence using struct.calcsize().)

So, you have a file like so -

00 00 00 03 00 00 00 0A 09 0B 21 CD....

That's

steps        value        floats
00 00 00 03| 00 00 00 0A| 09 0B 21 CD....
^ <-- read pos

stepValPattern = ">2i"
bytesToRead = struct.calcsize(stepValPattern)
#bytesToRead is 8 (idealised ints)
bytes = f.read(bytesToRead)

At this point, bytes is 00 00 00 03 00 00 00 0A
(As a Python string, that's "\x00\x00\x00\x03\x00\x00\x00\x0A")

And your file now looks like this -
steps        value        floats
00 00 00 03| 00 00 00 0A| 09 0B 21 CD....
..........................^ <-- read pos

Okay, so now to read the floats.

(steps, value) = struct.unpack(stepValPattern, bytes)
#Converts 00 00 00 03 00 00 00 0A to (3, 10); the ">" means
#big-endian, so the most significant byte comes first
#steps equals 3

So now we substitute steps into the floatPattern

floatPattern = ">%df" % steps
#floatPattern now equals ">3f" which is the same
#struct pattern as ">fff"

bytesToRead = struct.calcsize(floatPattern)
#bytesToRead is 12
bytes = f.read(bytesToRead)
floatList = struct.unpack(floatPattern, bytes)
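
If you want to try the whole sequence without a real data file, a
sketch along these lines should work. The sample values are made up,
io.BytesIO stands in for the file, and I've used the name data rather
than bytes because bytes is a built-in name in newer Pythons.

```python
import io
import struct

# Build sample data: two big-endian ints (steps=3, value=10)
# followed by `steps` big-endian floats. All values are made up.
payload = struct.pack(">2i", 3, 10) + struct.pack(">3f", 1.0, 2.5, -4.0)
f = io.BytesIO(payload)  # stand-in for a file opened in "rb" mode

# Read the two-integer header.
stepValPattern = ">2i"
data = f.read(struct.calcsize(stepValPattern))   # 8 bytes
(steps, value) = struct.unpack(stepValPattern, data)

# Use steps to build the pattern for the floats that follow.
floatPattern = ">%df" % steps                    # ">3f"
data = f.read(struct.calcsize(floatPattern))     # 12 bytes
floatList = struct.unpack(floatPattern, data)

print(steps, value, floatList)
```

Note that struct.unpack() always returns a tuple, so floatList is a
tuple of floats rather than a list.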


Does that make sense?

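I hope so... but suffice to say, struct.calcsize() keeps the byte
count in step with your pattern, so you never hard-code sizes. I tend
not to use the endian identifier unless I'm dealing with a data source
that will always, absolutely be a certain endianness, which saves
having to rejig your patterns for each different platform. Bear in
mind that without an identifier struct uses the machine's native byte
order, sizes and alignment, so a file written on one platform may not
unpack the same way on another.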
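
A quick way to see what the endian identifier changes is to compare
calcsize() and pack() with and without it. This is only a sketch; the
"hi" pattern (a short followed by an int) is an arbitrary example
chosen because it can expose padding, and the native result depends
on your machine.

```python
import struct

# With a byte-order prefix, struct uses standard sizes and no padding:
# a short (2 bytes) followed by an int (4 bytes) is always 6 bytes.
big = struct.calcsize(">hi")
little = struct.calcsize("<hi")

# With no prefix (the same as "@"), struct uses the machine's native
# sizes and alignment, so padding may be inserted before the int.
native = struct.calcsize("hi")

# The prefix also fixes the order of the bytes themselves.
be = struct.pack(">i", 3)   # most significant byte first
le = struct.pack("<i", 3)   # least significant byte first

print(big, little, native, be, le)
```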

And use f.read(x) for binary data.

Regards,

Liam Clarke

