[Python-Dev] Problems with hex-conversion functions
Brett Cannon
brett at python.org
Sun Sep 6 00:39:50 CEST 2009
On Sat, Sep 5, 2009 at 14:26, Ender Wiggin <wiggin15 at gmail.com> wrote:
> Hello everyone.
>
> I see several problems with the two hex-conversion function pairs that
> Python offers:
> 1. binascii.hexlify and binascii.unhexlify
> 2. bytes.fromhex and bytes.hex
>
> Problem #1:
> bytes.hex is not implemented, although it was specified in PEP 358.
Probably an oversight.
> This means there is no symmetrical function to accompany bytes.fromhex.
>
> Problem #2:
> Both pairs perform the same function, although The Zen of Python suggests
> that "There should be one-- and preferably only one --obvious way to do it."
> I do not understand why PEP 358 specified the bytes function pair although
> it mentioned the binascii pair...
>
It's nicer to have this kind of functionality on the built-ins than in
the standard library. "Practicality beats purity".
> Problem #3:
> bytes.fromhex may receive spaces in the input string, although
> binascii.unhexlify may not.
> I see no good reason for these two functions to have different features.
>
Well, one allows for sloppy input while the other does not. Usually
accepting sloppy input but giving strict output is better.
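
For illustration, a minimal sketch of the difference described in problem #3,
assuming a recent Python 3. The sample string and variable names are made up
for this example.

```python
import binascii

spaced = "de ad be ef"  # hex digits separated by spaces (illustrative input)

# bytes.fromhex skips the spaces between byte pairs.
lenient = bytes.fromhex(spaced)

# binascii.unhexlify rejects the same text. binascii.Error is a subclass
# of ValueError, so catching ValueError covers it.
try:
    binascii.unhexlify(spaced)
    strict_rejected = False
except ValueError:
    strict_rejected = True

# Stripping the spaces first makes both functions agree.
strict = binascii.unhexlify(spaced.replace(" ", ""))
```

After stripping the spaces, both calls yield the same four bytes.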
> Problem #4:
> binascii.unhexlify may receive both input types: strings or bytes, whereas
> bytes.fromhex raises an exception when given a bytes parameter.
> Again there is no reason for these functions to be different.
Well, converting bytes into bytes seems somewhat silly. Passing bytes
there is an error of mixing up your strings and bytes.
>
> Problem #5:
> binascii.hexlify returns a bytes type, although ideally converting to hex
> should always return string types and converting from hex should always
> return bytes.
> IMO there is no meaning to bytes as the output of hexlify, since the output
> is a representation of other bytes.
> This is also the suggested behavior of bytes.hex in PEP 358.
>
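
A short sketch of the behavior problem #5 refers to, assuming a recent
Python 3: binascii.hexlify hands back bytes, so getting text takes an explicit
decode step. The sample value is arbitrary.

```python
import binascii

raw = b"\x00\xffPy"  # arbitrary sample bytes

# hexlify returns the hex digits as a bytes object, not as str.
hex_bytes = binascii.hexlify(raw)

# An explicit ASCII decode is needed to obtain a text representation.
hex_text = hex_bytes.decode("ascii")
```
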
> Problems #4 and #5 call for a decision about the input and output of the
> functions being discussed:
>
> Option A : Strict input and output
> unhexlify (and bytes.fromhex) may only receive strings and may only return
> bytes
> hexlify (and bytes.hex) may only receive bytes and may only return strings
>
> Option B : Robust input and strict output
> unhexlify (and bytes.fromhex) may receive bytes and strings and may only
> return bytes
> hexlify (and bytes.hex) may receive bytes or strings and may only return
> strings
>
> Of course we may also consider a third option, which will allow the return
> type of all functions to be robust (perhaps specified in a keyword
> argument), but as I wrote in the description of problem #5, I see no sense
> in that.
>
> Note that PEP 3137 describes: "... the more strict definitions of encoding
> and decoding in Python 3000: encoding always takes a Unicode string and
> returns a bytes sequence, and decoding always takes a bytes sequence and
> returns a Unicode string." - suggesting option A.
>
> To repeat problems #4 and #5, the current behavior does not match any
> option:
> * The return type of binascii.hexlify should be string, and this is not the
> current behavior.
> As for the input:
> * Option A is not the current behavior because binascii.unhexlify may
> receive both input types.
> * Option B is not the current behavior because bytes.fromhex does not allow
> bytes as input.
>
> To fix these issues, three changes should be applied:
> 1. Deprecate bytes.fromhex. This fixes the following problems:
> #4 (go with option B and remove the function that does not allow bytes
> input)
> #2 (the binascii functions will be the only way to "do it")
> #1 (bytes.hex should not be implemented)
> 2. In order to keep the functionality that bytes.fromhex has over unhexlify,
> the latter function should be able to handle spaces in its input (fix #3)
> 3. binascii.hexlify should return a string (fix #5)
Or we fix bytes.fromhex(), add bytes.hex() and deprecate binascii.(un)hexlify().
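
As a rough sketch only, the strict typing of Option A (str in and bytes out
for decoding, bytes in and str out for encoding) could look like the pair
below. The helper names from_hex and to_hex are hypothetical and are not part
of any proposal in this thread. They are built on bytes.fromhex and
binascii.hexlify.

```python
import binascii


def from_hex(text):
    """Hypothetical strict decoder: accepts str only, returns bytes."""
    if not isinstance(text, str):
        raise TypeError("from_hex() expects str, got %s" % type(text).__name__)
    return bytes.fromhex(text)


def to_hex(data):
    """Hypothetical strict encoder: accepts bytes/bytearray only, returns str."""
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError("to_hex() expects bytes, got %s" % type(data).__name__)
    # hexlify yields bytes; decode so the result is always text.
    return binascii.hexlify(data).decode("ascii")


round_trip = to_hex(from_hex("de ad be ef"))
```

The round trip normalizes the spaced input to a plain hex string, which is
the "sloppy input, strict output" behavior mentioned earlier.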
-Brett