struct: type registration?
John Machin
sjmachin at lexicon.net
Thu Jun 1 18:51:16 EDT 2006
On 2/06/2006 4:18 AM, Serge Orlov wrote:
> Giovanni Bajo wrote:
>> John Machin wrote:
>>> I am an idiot, so please be gentle with me: I don't understand why you
>>> are using struct.pack at all:
>> Because I want to be able to parse large chunks of binary data with custom
>> formatting. Did you miss the whole point of my message:
>>
>> struct.unpack("3liiSiiShh", data)
>
> Did you want to write struct.unpack("Sheesh", data) ? Seriously, the
> main problem of struct is that it uses ad-hoc abbreviations for
> relatively rarely[1] used function calls and that makes it hard to
> read.
Indeed. The first time I saw something like struct.pack('20H', ...) I
thought it was a FORTRAN format statement :-)
>
> If you want to parse binary data use pyconstruct
> <http://pyconstruct.wikispaces.com/>
>
Looks promising on the legibility and functionality fronts. Can you make
any comment on the speed? Reason for asking is that Microsoft Excel
files have this weird "RK" format for expressing common float values in
32 bits (see http://sc.openoffice.org, under the "Documentation"
heading). I wrote and support the xlrd module (see
http://cheeseshop.python.org/pypi/xlrd) for reading those files in
portable pure Python. Below is a function that would plug straight in as
an example of Giovanni's custom unpacker functions. Some of the files
can be very large, and reading them is rather slow.
Cheers,
John
from struct import unpack

def unpack_RK(rk_str): # arg is 4 bytes
    flags = ord(rk_str[0])
    if flags & 2:
        # There's a SIGNED 30-bit integer in there!
        i, = unpack('<i', rk_str)
        i >>= 2 # div by 4 to drop the 2 flag bits
        if flags & 1:
            return i / 100.0
        return float(i)
    else:
        # It's the most significant 30 bits
        # of an IEEE 754 64-bit FP number
        d, = unpack('<d', '\0\0\0\0' + chr(flags & 252) + rk_str[1:4])
        if flags & 1:
            return d / 100.0
        return d
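
The function above is written for the Python of 2006, where indexing a string gives a one-character string. As a rough sketch only, the same decoding under Python 3 might look like the following. It assumes the argument is a 4-byte bytes object in little-endian order; the names unpack_rk and rk_bytes are illustrative and are not taken from xlrd.

```python
from struct import pack, unpack

def unpack_rk(rk_bytes):
    """Decode a 4-byte little-endian Excel RK value to a float (sketch)."""
    flags = rk_bytes[0]  # indexing bytes already yields an int in Python 3
    if flags & 2:
        # Signed 30-bit integer stored in the upper 30 bits.
        i, = unpack('<i', rk_bytes)
        i >>= 2  # arithmetic shift drops the 2 flag bits, keeps the sign
        value = float(i)
    else:
        # Upper 30 bits of an IEEE 754 double; the lower 34 bits are zero.
        high = bytes([flags & 252]) + rk_bytes[1:4]
        value, = unpack('<d', b'\0\0\0\0' + high)
    if flags & 1:
        value /= 100.0  # flag bit 0 means the stored value is scaled by 100
    return value

# Usage: the integer 5 encoded as an RK value (flag bit 1 set, bit 0 clear).
print(unpack_rk(pack('<i', (5 << 2) | 2)))
```

Both branches share the final divide-by-100 step here, which is a layout choice in this sketch rather than a change in behaviour.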