[Python-Dev] Why does base64 return bytes?

Victor Stinner victor.stinner at gmail.com
Tue Jun 14 11:35:15 EDT 2016


To port OpenStack to Python 3, I wrote 4 (2x2) helper functions which
accept bytes *and* Unicode as input. The xxx_as_bytes() functions return
bytes; the xxx_as_text() functions return Unicode:
http://docs.openstack.org/developer/oslo.serialization/api.html
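
A minimal sketch of what such a 2x2 set of helpers could look like, using
only the standard library. This is an illustration, not the actual
oslo.serialization code: the function names, the private _to_bytes()
helper, and the UTF-8 default are assumptions made for the example.

```python
import base64


def _to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it first if it is text (hypothetical helper)."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


def encode_as_bytes(value, encoding="utf-8"):
    """Base64-encode bytes or text and return bytes."""
    return base64.b64encode(_to_bytes(value, encoding))


def encode_as_text(value, encoding="utf-8"):
    """Base64-encode bytes or text and return str."""
    # Base64 output only uses ASCII characters, so this decode cannot fail.
    return encode_as_bytes(value, encoding).decode("ascii")


def decode_as_bytes(encoded, encoding="utf-8"):
    """Base64-decode bytes or text and return bytes."""
    return base64.b64decode(_to_bytes(encoded, encoding))


def decode_as_text(encoded, encoding="utf-8"):
    """Base64-decode bytes or text and return str decoded with `encoding`."""
    return decode_as_bytes(encoded, encoding).decode(encoding)
```

With helpers like these, the caller picks the return type by name instead
of sprinkling .encode() and .decode() calls around each base64 call.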

Victor
On 14 June 2016 5:21 PM, "Steven D'Aprano" <steve at pearwood.info> wrote:

> Normally I'd take a question like this to Python-List, but this question
> has turned out to be quite divisive, with people having strong opinions
> but no definitive answer. So I thought I'd ask here and hope that some
> of the core devs would have an idea.
>
> Why does base64 encoding in Python return bytes?
>
> base64.b64encode takes bytes as input and returns bytes. Some people are
> arguing that this is wrong behaviour, as RFC 3548 specifies that Base64
> should transform bytes to characters:
>
> https://tools.ietf.org/html/rfc3548.html
>
> albeit US-ASCII characters. E.g.:
>
>     The encoding process represents 24-bit groups of input bits
>     as output strings of 4 encoded characters.
>     [...]
>     Each 6-bit group is used as an index into an array of 64 printable
>     characters.  The character referenced by the index is placed in the
>     output string.
>
> Are they misinterpreting the standard? Has Python got it wrong? Is there
> a good reason for returning bytes?
>
> I see that other languages choose different strategies. Microsoft's
> languages C#, F# and VB (plus their C++ compiler) take an array of bytes
> as input and output a UTF-16 string:
>
> https://msdn.microsoft.com/en-us/library/dhx0d524%28v=vs.110%29.aspx
>
> Java's base64 encoder takes and returns bytes:
>
> https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html
>
> and JavaScript's Base64 encoder takes input as UTF-16 encoded text and
> returns the same:
>
>
> https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding
>
> I'm not necessarily arguing that Python's strategy is the wrong one, but
> I am interested in what (if any) reasons are behind it.
>
>
> Thanks in advance,
>
> Steve

