[Python-Dev] bytes.from_hex()
Just van Rossum
just at letterror.com
Thu Mar 2 09:57:57 CET 2006
Ron Adam wrote:
> Josiah Carlson wrote:
> > Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> >> u = unicode(b)
> >> u = unicode(b, 'utf8')
> >> b = bytes['utf8'](u)
> >> u = unicode['base64'](b) # encoding
> >> b = bytes(u, 'base64') # decoding
> >> u2 = unicode['piglatin'](u1) # encoding
> >> u1 = unicode(u2, 'piglatin') # decoding
> >
> > Your provided semantics feel cumbersome and confusing to me, as
> > compared with str/unicode.encode/decode() .
> >
> > - Josiah
>
> This uses syntax to determine the direction of encoding. It would be
> easier and clearer to just require two arguments or a tuple.
>
> u = unicode(b, 'encode', 'base64')
> b = bytes(u, 'decode', 'base64')
>
> b = bytes(u, 'encode', 'utf-8')
> u = unicode(b, 'decode', 'utf-8')
>
> u2 = unicode(u1, 'encode', 'piglatin')
> u1 = unicode(u2, 'decode', 'piglatin')
>
>
>
> It looks somewhat cleaner if you combine them in a path style string.
>
> b = bytes(u, 'encode/utf-8')
> u = unicode(b, 'decode/utf-8')
It gets from bad to worse :(
I always liked the asymmetry between
    u = unicode(s, "utf8")
and
    s = u.encode("utf8")
which I think was the original design of the unicode API. Kudos to
whoever came up with that.
When I saw
    b = bytes(u, "utf8")
mentioned for the first time, I thought: why on earth must the bytes
constructor be coupled to the unicode API?!?! It makes no sense to me
whatsoever. Bytes have so many more uses besides encoded text.
I believe (please correct me if I'm wrong) that the encoding argument of
bytes() was invented to make it easier to write byte literals. Perhaps a
true bytes literal notation is in order after all?
My preference for a bytes -> unicode -> bytes API would be this:
    u = unicode(b, "utf8")   # just like we have now
    b = u.tobytes("utf8")    # like u.encode(), but being explicit
                             # about the resulting type
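
A rough sketch of that round trip, written against Python 3, where str plays the role of unicode. The tobytes() function here is a hypothetical stand-in for the proposed method (no such method exists on str), and it simply delegates to encode():

```python
# Hypothetical helper standing in for the proposed u.tobytes() method.
# It is not part of any real API; it only makes the result type explicit.
def tobytes(u, encoding):
    result = u.encode(encoding)
    assert isinstance(result, bytes)
    return result

# bytes -> text: the constructor takes the encoding (decoding direction).
b = b"caf\xc3\xa9"
u = str(b, "utf8")

# text -> bytes: a method on the text object (encoding direction).
round_tripped = tobytes(u, "utf8")
```

The direction is determined by which object you start from, so no 'encode'/'decode' flag is needed.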
As to base64, while it works as a codec ("Why a base64 codec? Because we
can!"), I don't find it a natural API at all for such conversions.
(I do however agree with Greg Ewing that base64 encoded data is text,
not ascii-encoded bytes ;-)
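
For comparison, a minimal sketch of base64 as plain functions rather than a codec, assuming Python 3 and the standard base64 module. The two wrapper names are made up for illustration; they treat the encoded form as text, which is Greg's view and not what b64encode() returns by itself (it returns bytes):

```python
import base64

# Hypothetical wrappers: arbitrary bytes in, base64 *text* out, and back.
def to_base64_text(data: bytes) -> str:
    # b64encode() returns ASCII-only bytes; expose them as text.
    return base64.b64encode(data).decode("ascii")

def from_base64_text(text: str) -> bytes:
    # b64decode() accepts an ASCII str as well as bytes.
    return base64.b64decode(text)

encoded = to_base64_text(b"hello")
decoded = from_base64_text(encoded)
```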
Just-my-2-cts