[Python-Dev] Problems with hex-conversion functions
brett at python.org
Sun Sep 6 00:39:50 CEST 2009
On Sat, Sep 5, 2009 at 14:26, Ender Wiggin<wiggin15 at gmail.com> wrote:
> Hello everyone.
> I see several problems with the two hex-conversion function pairs that
> Python offers:
> 1. binascii.hexlify and binascii.unhexlify
> 2. bytes.fromhex and bytes.hex
> Problem #1:
> bytes.hex is not implemented, although it was specified in PEP 358.
Probably an oversight.
> This means there is no symmetrical function to accompany bytes.fromhex.
> Problem #2:
> Both pairs perform the same function, although The Zen Of Python suggests
> "There should be one-- and preferably only one --obvious way to do it."
> I do not understand why PEP 358 specified the bytes function pair although
> it mentioned the binascii pair...
It's nicer to have this kind of functionality on the built-ins than in
the standard library. "Practicality beats purity".
> Problem #3:
> bytes.fromhex may receive spaces in the input string, although
> binascii.unhexlify may not.
> I see no good reason for these two functions to have different features.
Well, one allows for sloppy input while the other does not. Usually
accepting sloppy input but giving strict output is better.
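
A minimal sketch of the difference, assuming a Python 3 interpreter. The
variable names are only illustrative, and the input to unhexlify is passed
as bytes to sidestep the question of which input types it accepts:

```python
import binascii

spaced = "de ad be ef"

# bytes.fromhex tolerates spaces between byte pairs.
lenient = bytes.fromhex(spaced)

# binascii.unhexlify rejects the same digits when they contain spaces.
try:
    binascii.unhexlify(spaced.encode("ascii"))
    strict_rejected = False
except binascii.Error:
    strict_rejected = True

# Stripping the spaces first makes the two functions agree.
strict = binascii.unhexlify(spaced.replace(" ", "").encode("ascii"))
```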
> Problem #4:
> binascii.unhexlify may receive both input types: strings or bytes, whereas
> bytes.fromhex raises an exception when given a bytes parameter.
> Again there is no reason for these functions to be different.
Well, giving bytes back into bytes seems somewhat silly. That's an
error in mixing your strings and bytes.
> Problem #5:
> binascii.hexlify returns a bytes type, although ideally converting to hex
> should always return string types and converting from hex should always
> return bytes.
> IMO bytes makes no sense as the output of hexlify, since the output is a
> representation of other bytes.
> This is also the suggested behavior of bytes.hex in PEP 358.
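
For reference, a short sketch of the behavior described in problem #5,
assuming Python 3 (variable names are illustrative only):

```python
import binascii

raw = b"\xde\xad\xbe\xef"

# hexlify hands back bytes holding ASCII hex digits, not a str.
hex_bytes = binascii.hexlify(raw)

# An explicit decode is needed to get the text form.
hex_text = hex_bytes.decode("ascii")
```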
> Problems #4 and #5 call for a decision about the input and output of the
> functions being discussed:
> Option A : Strict input and output
> unhexlify (and bytes.fromhex) may only receive strings and may only return
> bytes
> hexlify (and bytes.hex) may only receive bytes and may only return strings
> Option B : Robust input and strict output
> unhexlify (and bytes.fromhex) may receive bytes and strings and may only
> return bytes
> hexlify (and bytes.hex) may receive bytes or strings and may only return
> strings
> Of course we may also consider a third option, which will allow the return
> type of all functions to be robust (perhaps specified in a keyword
> argument), but as I wrote in the description of problem #5, I see no sense
> in that.
> Note that PEP 3137 describes: "... the more strict definitions of encoding
> and decoding in Python 3000: encoding always takes a Unicode string and
> returns a bytes sequence, and decoding always takes a bytes sequence and
> returns a Unicode string." - suggesting option A.
> To repeat problems #4 and #5, the current behavior does not match any
> option:
> * The return type of binascii.hexlify should be string, and this is not the
> current behavior.
> As for the input:
> * Option A is not the current behavior because binascii.unhexlify may
> receive both input types.
> * Option B is not the current behavior because bytes.fromhex does not allow
> bytes as input.
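
To make Option B concrete for the from-hex direction, here is a rough sketch
of "robust input, strict output". The helper name unhex_any is hypothetical
and not part of any module; it assumes Python 3 and simply normalizes the
input before delegating to binascii.unhexlify:

```python
import binascii

def unhex_any(data):
    """Hypothetical helper: accept hex digits as str or bytes, return bytes."""
    if isinstance(data, str):
        # Text input must consist of plain ASCII hex digits.
        data = data.encode("ascii")
    elif not isinstance(data, (bytes, bytearray)):
        raise TypeError("expected str or bytes, got %s" % type(data).__name__)
    return binascii.unhexlify(data)
```

Either input type yields the same bytes result, e.g. unhex_any("dead") and
unhex_any(b"dead").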
> To fix these issues, three changes should be applied:
> 1. Deprecate bytes.fromhex. This fixes the following problems:
> #4 (go with option B and remove the function that does not allow bytes
> as input)
> #2 (the binascii functions will be the only way to "do it")
> #1 (bytes.hex should not be implemented)
> 2. In order to keep the functionality that bytes.fromhex has over unhexlify,
> the latter function should be able to handle spaces in its input (fix #3)
> 3. binascii.hexlify should return string as its return type (fix #5)
Or we fix bytes.fromhex(), add bytes.hex() and deprecate binascii.(un)hexlify().