[Python-Dev] Why does base64 return bytes?

Terry Reedy tjreedy at udel.edu
Tue Jun 14 12:38:46 EDT 2016


On 6/14/2016 11:19 AM, Steven D'Aprano wrote:
> Normally I'd take a question like this to Python-List, but this question
> has turned out to be quite divisive, with people having strong opinions
> but no definitive answer. So I thought I'd ask here and hope that some
> of the core devs would have an idea.
>
> Why does base64 encoding in Python return bytes?

Ultimately, because we never decided to change this in 3.0.

> base64.b64encode takes bytes as input and returns bytes. Some people are
> arguing that this is wrong behaviour, as RFC 3548 specifies that Base64
> should transform bytes to characters:
>
> https://tools.ietf.org/html/rfc3548.html
>
> albeit US-ASCII characters. E.g.:
>
>     The encoding process represents 24-bit groups of input bits
>     as output strings of 4 encoded characters.

One could argue that 'encoded character' means 'bytes' in Python, but I 
don't know what the standard's authors meant, since Unicode characters 
always have some internal encoding.
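
For concreteness, here is a minimal sketch of the behaviour being discussed, 
using only the standard library. The sample input b"Man" and the variable 
names are illustrative choices, not part of the original question.

```python
import base64

raw = b"Man"

# b64encode takes a bytes-like object and returns bytes, not str.
encoded = base64.b64encode(raw)

# Getting text requires an explicit decode; the Base64 alphabet is
# pure ASCII, so decoding as ASCII cannot fail.
text = encoded.decode("ascii")

print(type(encoded).__name__, encoded)
print(type(text).__name__, text)
```

Anyone who wants a str today has to add that decode step themselves.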

>     [...]
>     Each 6-bit group is used as an index into an array of 64 printable
>     characters.  The character referenced by the index is placed in the
>     output string.
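
The step quoted above can be sketched by hand for a single 24-bit group. 
This is only an illustration under stated assumptions: encode_group and 
ALPHABET are hypothetical names, padding of short final groups is not 
handled, and the function returns str purely to mirror the RFC's wording 
about characters.

```python
import string

# The 64 printable characters of the standard Base64 alphabet, in index order.
ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly one 24-bit group (3 bytes) as 4 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch handles only complete 3-byte groups")
    # Treat the 3 bytes as one 24-bit big-endian integer.
    bits = int.from_bytes(three_bytes, "big")
    # Split into four 6-bit groups, most significant first, and use each
    # one as an index into the alphabet.
    return "".join(ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0))


print(encode_group(b"Man"))
```

For b"Man" the four 6-bit indexes are 19, 22, 5 and 46, which select the 
same four characters that base64.b64encode produces as ASCII bytes.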

-- 
Terry Jan Reedy


