[Python-Dev] Why does base64 return bytes?

Victor Stinner victor.stinner at gmail.com
Tue Jun 14 11:35:15 EDT 2016

To port OpenStack to Python 3, I wrote four (2x2) helper functions that
accept both bytes *and* Unicode as input. The xxx_as_bytes() functions
return bytes; the xxx_as_text() functions return Unicode:
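
A minimal sketch of what such a 2x2 set of helpers could look like, built
only on the standard base64 module. The function names below are
illustrative stand-ins for the xxx_ placeholders, and the UTF-8 default
for text input is an assumption, not something stated above:

```python
import base64


def _to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it first if it is text."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


def encode_as_bytes(value, encoding="utf-8"):
    """Base64-encode bytes or text; the result is bytes."""
    return base64.b64encode(_to_bytes(value, encoding))


def encode_as_text(value, encoding="utf-8"):
    """Base64-encode bytes or text; the result is str."""
    # The Base64 alphabet is pure ASCII, so this decode cannot fail.
    return encode_as_bytes(value, encoding).decode("ascii")


def decode_as_bytes(encoded):
    """Base64-decode bytes or text; the result is bytes."""
    if isinstance(encoded, str):
        encoded = encoded.encode("ascii")
    return base64.b64decode(encoded)


def decode_as_text(encoded, encoding="utf-8"):
    """Base64-decode bytes or text; the result is str."""
    # Assumes the decoded payload really is text in the given encoding.
    return decode_as_bytes(encoded).decode(encoding)
```

With helpers shaped like this, the caller picks the output type through
the function name instead of checking the input type at every call site.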

On 14 June 2016 at 5:21 PM, "Steven D'Aprano" <steve at pearwood.info> wrote:

> Normally I'd take a question like this to Python-List, but this question
> has turned out to be quite divisive, with people having strong opinions
> but no definitive answer. So I thought I'd ask here and hope that some
> of the core devs would have an idea.
> Why does base64 encoding in Python return bytes?
> base64.b64encode takes bytes as input and returns bytes. Some people are
> arguing that this is wrong behaviour, as RFC 3548 specifies that Base64
> should transform bytes to characters:
> https://tools.ietf.org/html/rfc3548.html
> albeit US-ASCII characters. E.g.:
>     The encoding process represents 24-bit groups of input bits
>     as output strings of 4 encoded characters.
>     [...]
>     Each 6-bit group is used as an index into an array of 64 printable
>     characters.  The character referenced by the index is placed in the
>     output string.
> Are they misinterpreting the standard? Has Python got it wrong? Is there
> a good reason for returning bytes?
> I see that other languages choose different strategies. Microsoft's
> languages C#, F# and VB (plus their C++ compiler) take an array of bytes
> as input and output a UTF-16 string:
> https://msdn.microsoft.com/en-us/library/dhx0d524%28v=vs.110%29.aspx
> Java's base64 encoder takes and returns bytes:
> https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html
> and JavaScript's Base64 encoder takes input as UTF-16 encoded text and
> returns the same:
> https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding
> I'm not necessarily arguing that Python's strategy is the wrong one, but
> I am interested in what (if any) reasons are behind it.
> Thanks in advance,
> Steve
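
To make the behaviour under discussion concrete, here is a small check
using only the standard library on Python 3. It shows one 24-bit group
(3 input bytes) becoming 4 output characters, as the RFC text quoted
above describes, with the result typed as bytes rather than str. The
variable names are just for illustration:

```python
import base64

raw = b"abc"                      # 3 bytes = one 24-bit input group
encoded = base64.b64encode(raw)   # b"YWJj"

# The result is bytes, and one 24-bit group yields 4 encoded characters.
result_is_bytes = isinstance(encoded, bytes)
group_length = len(encoded)

# The output only uses US-ASCII characters, so getting a str is a
# single, lossless decode step.
text = encoded.decode("ascii")    # "YWJj"

# Passing str as input is rejected: b64encode expects a bytes-like object.
try:
    base64.b64encode("abc")
except TypeError:
    str_input_rejected = True
else:
    str_input_rejected = False
```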
