[Python-ideas] hexdump
Terry Reedy
tjreedy at udel.edu
Sat May 12 18:18:33 CEST 2012
On 5/12/2012 4:59 AM, anatoly techtonik wrote:
> Just an idea of usability fix for Python 3.
> hexdump module (function or bytes method is better) as simple, easy
> and intuitive way for dumping binary data when writing programs in
> Python.
>
> hexdump(bytes) - produce human readable dump of binary data,
> byte-by-byte representation, separated by space, 16-byte rows
Hexdump, as you propose it, does three things. In each case, it fixes a
parameter that could reasonably have a different value.
1. Splits the hex characters into groups of two characters, each
representing one byte. For some uses, larger chunks would be more useful.
2. Uppercases the alpha hex characters. This is a holdover from the
ancient all-uppercase world, where there was no choice. While it may
make the block visually more 'even' and 'aesthetic' when not actually
being read, it makes it harder to tell the difference between a 0-9
digit and an alpha digit. B and 8 become very similar. There is
justification for binascii.hexlify using lowercase.
3. Groups the hex-represented units into lines of 16 each. This is only
useful when the bytes come from memory with hex addresses, when the
point is to determine the specific bytes at specific addresses. For
displaying decimal-length byte strings, 25 bytes per line would be better.
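As a small illustration of point 2, the following sketch compares the lowercase output of binascii.hexlify with an uppercased copy. The sample bytes are made up for the example; they were chosen only because they mix the digits 8 and 0 with the letters b and d.

```python
import binascii

# Hypothetical sample data, picked to mix '8'/'0' with 'b'/'d'.
data = b"\x8b\xb8\x0b\xd0"

# binascii.hexlify returns lowercase hex digits as a bytes object.
lower = binascii.hexlify(data).decode("ascii")

# The uppercase form, as in the proposed hexdump.
upper = lower.upper()

print(lower)
print(upper)
```

In the uppercase line, B and 8 (and D and 0) are the pairs that are easy to confuse when reading quickly.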
What it does not do:
4. Break lines into blocks. One might want to break up multiple lines of
25 into blocks of four lines each.
5. Label the rows and columns with either hex or decimal labels.
6. Add 'dotted ascii' translation to reveal embedded ascii strings.
Output: choices are an iterator of lines, a list of lines, and a string
with embedded newlines. The second and third are easily derived from the
first, so I propose the first as the best choice. An iterator can also be
used to write to a file.
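A minimal sketch of the iterator-of-lines design, with the three fixed choices from points 1-3 turned into parameters. The function name hexdump_lines and its parameter names are hypothetical, not part of any existing module, and bytes.hex() needs Python 3.5 or later. Blocks, labels and dotted ascii (points 4-6) are left out.

```python
def hexdump_lines(data, unit=1, per_line=16, upper=False, sep=" "):
    """Yield one string per row of the dump.

    unit     -- bytes per displayed group (hypothetical parameter)
    per_line -- groups per row (hypothetical parameter)
    upper    -- uppercase the alpha hex digits if true
    """
    if unit < 1 or per_line < 1:
        raise ValueError("unit and per_line must be positive")
    row_bytes = unit * per_line
    for start in range(0, len(data), row_bytes):
        row = data[start:start + row_bytes]
        # Split the row into groups of `unit` bytes and hex-encode each.
        groups = [row[i:i + unit].hex() for i in range(0, len(row), unit)]
        line = sep.join(groups)
        yield line.upper() if upper else line


sample = bytes(range(20))

# The other two output forms are derived from the iterator.
as_list = list(hexdump_lines(sample))
as_string = "\n".join(hexdump_lines(sample))
```

Writing to a file works the same way, for example by passing the iterator (with a newline added to each line) to the file's writelines method.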
A flexible module would be a good addition to pypi if not there already.
Let's see....
hexencoder 1.0
hex encode decode and compare
This project offers 3 basic tools for manipulating binary files: 1)
flexible hexdump
Home Page: http://sourceforge.net/projects/hexencoder
I did not look to see how flexible 'flexible' is, but there it is.
> Rationale:
> 1. Debug.
> Generic binary data can't be output to console.
That depends on the console. Old IBM PCs had a character for every byte.
That was meant for line-drawing, accents, and symbols, but could also be
used for binary dumps. I believe there are Windows codepages that will
do something similar. Any bytes can be decoded as latin-1 and then printed.
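A short sketch of the latin-1 point: every byte value 0-255 maps to the code point with the same number, so decoding cannot fail and is reversible. What a given console actually displays for the control characters and the upper range is environment-dependent and is not shown here.

```python
# All 256 possible byte values.
data = bytes(range(256))

# latin-1 maps byte N to code point N, so this never raises.
text = data.decode("latin-1")

# Encoding back recovers the original bytes exactly.
round_trip = text.encode("latin-1")
```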
> A separate helper
> is needed to print, log or store its value in human readable format in
> database. This takes time.
A custom helper gives custom output.
> 2. Usability.
> binascii is ugly: name is not intuitive any more, there are a lot
> of functions, and it is not clear how it relates to unicode.
Even if there are lots of functions, one might be added.
What does 'it' refer to? hexdump or binascii? Both are about binary
bytes and not about unicode characters, so neither relates to abstract
unicode. Encoded unicode characters are binary data like any other,
though if the encoding is utf-16 or utf-32, one would want 2 or 4 bytes
dumped together, as I suggested above.
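To illustrate dumping 2 bytes together for utf-16, here is a small sketch using the separator arguments of bytes.hex(), which exist only in Python 3.8 and later (well after this message was written). The sample string is arbitrary.

```python
# Arbitrary sample text, encoded without a byte order mark.
encoded = "hi".encode("utf-16-le")

# One group per byte: each character is split across two groups.
by_byte = encoded.hex(" ")

# One group per 2-byte code unit: each character is one group.
by_code_unit = encoded.hex(" ", 2)

print(by_byte)
print(by_code_unit)
```

For utf-32 the same call with a group size of 4 would keep each code point together.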
--
Terry Jan Reedy