Forgive me if this idea has already been proposed and shot down.

Earlier today I was trying to print some binary data I received over a socket in hexadecimal format so that I could decipher the contents. The first thing I instinctively tried was hex(datastring), and I was somewhat surprised to find that it didn't work. I wandered around the documentation a bit looking for a simple and "obvious" way to do this and came up empty (I would appreciate it if someone could enlighten me if there already is a way).

For the rest of the day it kinda bothered me that my first attempt didn't work. After spending probably not enough time thinking about this, I came up with what might be a reasonable extension for hex() and oct() that would allow those functions to accept a string and produce a quoted hexadecimal or octal representation of the input string using the escape notation \xhh or \ooo. To illustrate:
>>> hex('ABC')
"'\\x41\\x42\\x43'"
>>> print hex('ABC')  # This is what I was trying to do today
'\x41\x42\x43'
>>> eval(hex('ABC'))
'ABC'

>>> oct('ABC')
"'\\101\\102\\103'"
>>> print oct('ABC')
'\101\102\103'
>>> eval(oct('ABC'))
'ABC'
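For anyone who wants to experiment with the idea, the proposed behaviour can be approximated in current Python 3 without changing the built-ins. This is only a sketch under assumptions: hex_escape and oct_escape are hypothetical helper names (nothing like them exists in the standard library), the input is assumed to be a bytes object rather than a Python 2 string, and the result carries a b prefix so that it evaluates back to bytes.

```python
import ast

def hex_escape(data: bytes) -> str:
    # Hypothetical helper: quote the input with every byte as a \xhh escape.
    body = "".join("\\x%02x" % b for b in data)
    return "b'" + body + "'"

def oct_escape(data: bytes) -> str:
    # Hypothetical helper: same idea with three-digit octal escapes.
    body = "".join("\\%03o" % b for b in data)
    return "b'" + body + "'"

print(hex_escape(b"ABC"))  # b'\x41\x42\x43'
print(oct_escape(b"ABC"))  # b'\101\102\103'

# The round trip from the proposal, using literal_eval instead of eval.
assert ast.literal_eval(hex_escape(b"ABC")) == b"ABC"
assert ast.literal_eval(oct_escape(b"ABC")) == b"ABC"
```

ast.literal_eval is used here in place of eval because it only accepts literals, which seems the safer choice when the string might come from untrusted data.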
Are there any subtle issues with supporting this that I'm not seeing?

Matthew Barnes
This is indeed elegant because it preserves the property that hex() and oct() have for numbers: eval(hex(x)) == x == eval(oct(x)). However, I'm not sure if it should be added. It adds yet another feature to document and support (think of Jython, Pychecker, etc.), and I think that the number of people who care about hexadecimal strings is becoming a vanishingly small fraction of the total number of programmers. (Nobody has cared about octal for a long time; the only use for it at this point is to show unix file permission bits.) And I think that what you really wanted is probably closer to this:
>>> import binascii
>>> binascii.hexlify('ABC')
'414243'
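In Python 3 the same call takes and returns bytes. A minimal sketch, assuming the socket data arrives as a bytes object (the name data is only illustrative):

```python
import binascii

data = b"ABC"  # stand-in for bytes read from a socket

hex_bytes = binascii.hexlify(data)       # b'414243'
hex_text = hex_bytes.decode("ascii")     # '414243'
restored = binascii.unhexlify(hex_text)  # b'ABC'

# bytes.hex() (Python 3.5+) gives the same text in one step.
assert data.hex() == hex_text
print(hex_text)
```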
--Guido van Rossum (home page: http://www.python.org/~guido/)
>>> import binascii
>>> binascii.hexlify('ABC')
'414243'
Or
>>> 'ABC'.encode('hex_codec')
'414243'
>>> '414243'.decode('hex_codec')
'ABC'
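The str.encode spelling above is Python 2 only. In Python 3, hex_codec is a bytes-to-bytes codec and is reached through the codecs module instead. A small sketch, assuming bytes input:

```python
import codecs

encoded = codecs.encode(b"ABC", "hex_codec")   # b'414243'
decoded = codecs.decode(encoded, "hex_codec")  # b'ABC'

print(encoded, decoded)
```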
The only defense I can think to offer for the hex() and oct() idea is that I felt like there was something significant about the fact that it was the first thing I thought to try. It seemed like the "obvious" way to do it. While both of these solutions are quite readable now that I see them, I'm not sure that I'd call either of them obvious.

However, I can certainly appreciate the cost of adding bells and whistles to the language, especially for something that would probably not see much use (as Guido pointed out).

Thank you both for the tips.

Matthew Barnes
Participants (3): Bob Halley, Guido van Rossum, Matthew F. Barnes