[Python-ideas] hexdump

anatoly techtonik techtonik at gmail.com
Sat May 12 10:59:03 CEST 2012


Just an idea for a usability fix in Python 3: a hexdump module (a
function or a bytes method would be better) as a simple, easy and
intuitive way to dump binary data when writing programs in Python.

hexdump(bytes)   - produce a human-readable dump of binary data: a
byte-by-byte hex representation, separated by spaces, in 16-byte rows
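
A minimal sketch of how such a function might look, assuming Python 3
(where iterating over bytes yields integers). The name hexdump and the
width parameter are hypothetical here, not an existing stdlib API:

```python
def hexdump(data, width=16):
    """Return `data` as uppercase hex pairs, `width` bytes per row.

    Hypothetical helper illustrating the proposal; not a stdlib function.
    """
    rows = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        # Each byte becomes a two-digit uppercase hex pair.
        rows.append(' '.join('%02X' % byte for byte in chunk))
    return '\n'.join(rows)


print(hexdump(b'\xe6\xb0\x08\x04'))  # E6 B0 08 04
```

Returning a string rather than printing keeps the result usable for
logging or storage as well as for console output.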


Rationale:
1. Debug.
    Generic binary data can't be output to the console directly. A
separate helper is needed to print it, log it, or store it in a
database in human-readable form. Writing that helper takes time.
2. Usability.
    binascii is ugly: the name is no longer intuitive, it has a lot
of functions, and it is not clear how it relates to Unicode.
3. Serialization.
    It is convenient to have a format that can be displayed in a text
editor. Simple tools encourage people to use them.

Practical example:
>>> print(b)
� � � �� �� � �� �� �
 �  � �
>>> b
'\xe6\xb0\x08\x04\xe7\x9e\x08\x04\xe7\xbc\x08\x04\xe7\xd5\x08\x04\xe7\xe4\x08\x04\xe6\xb0\x08\x04\xe7\xf0\x08\x04\xe7\xff\x08\x04\xe8\x0b\x08\x04\xe8\x1a\x08\x04\xe6\xb0\x08\x04\xe6\xb0\x08\x04'
>>> print(binascii.hexlify(b))
e6b00804e79e0804e7bc0804e7d50804e7e40804e6b00804e7f00804e7ff0804e80b0804e81a0804e6b00804e6b00804
>>>
>>> data = hexdump(b)
>>> print(data)
E6 B0 08 04 E7 9E 08 04 E7 BC 08 04 E7 D5 08 04
E7 E4 08 04 E6 B0 08 04 E7 F0 08 04 E7 FF 08 04
E8 0B 08 04 E8 1A 08 04 E6 B0 08 04 E6 B0 08 04
>>>
>>> # achieving the same output with binascii is overcomplicated
>>> data_lines = [binascii.hexlify(b)[i:min(i+32, len(binascii.hexlify(b)))] for i in xrange(0, len(binascii.hexlify(b)), 32)]
>>> data_lines = [' '.join(l[i:min(i+2, len(l))] for i in xrange(0, len(l), 2)).upper() for l in data_lines]
>>> print('\n'.join(data_lines))
E6 B0 08 04 E7 9E 08 04 E7 BC 08 04 E7 D5 08 04
E7 E4 08 04 E6 B0 08 04 E7 F0 08 04 E7 FF 08 04
E8 0B 08 04 E8 1A 08 04 E6 B0 08 04 E6 B0 08 04

On the other hand, getting the rather useless binascii output back
from hexdump() is trivial:
>>> data.replace(' ','').replace('\n','').lower()
'e6b00804e79e0804e7bc0804e7d50804e7e40804e6b00804e7f00804e7ff0804e80b0804e81a0804e6b00804e6b00804'
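
The same idea extends to recovering the original bytes from a dump,
which is what makes the format usable for serialization. A small
sketch, assuming Python 3; the helper name undump is made up for
illustration:

```python
def undump(text):
    """Convert a space/newline separated hex dump back into bytes.

    Hypothetical helper; the name is an assumption, not part of the proposal.
    """
    # Strip all whitespace first, then let bytes.fromhex parse the hex pairs.
    return bytes.fromhex(''.join(text.split()))


original = undump('E6 B0 08 04\nE7 9E 08 04')
```

Removing the whitespace explicitly avoids relying on how a particular
Python version treats newlines inside bytes.fromhex().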

A more practical example would be adding offsets to the hexdump output:
>>> print( ''.join( '%05x: %s\n' % (i*16,l) for i,l in enumerate(hexdump(b).split('\n'))))
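
For readers who want to try this without a hexdump() function at hand,
here is a self-contained sketch of the offset-prefixed variant in
Python 3. The function name and the width parameter are assumptions
made for this example only:

```python
def hexdump_with_offsets(data, width=16):
    """Return a hex dump where each row starts with its byte offset.

    Hypothetical helper written for illustration; not an existing API.
    """
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_part = ' '.join('%02X' % byte for byte in chunk)
        # Offset of the row's first byte, as five lowercase hex digits.
        lines.append('%05x: %s' % (start, hex_part))
    return '\n'.join(lines)


print(hexdump_with_offsets(bytes(range(18))))
```

Computing the offset from the slice start, rather than from the row
index, keeps it correct if the row width is changed.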

Etc.

Conclusion:
By providing better building blocks at the basic level, Python will
become a better tool for more useful tasks.


References:
[1] http://stackoverflow.com/questions/2340319/python-3-1-1-string-to-hex
[2] http://en.wikipedia.org/wiki/Hex_dump

--
anatoly t.


