[Python-Dev] Why does base64 return bytes?

Paul Sokolovsky pmiscml at gmail.com
Tue Jun 14 13:19:09 EDT 2016


Hello,

On Tue, 14 Jun 2016 16:51:44 +0100
Paul Moore <p.f.moore at gmail.com> wrote:

> On 14 June 2016 at 16:19, Steven D'Aprano <steve at pearwood.info> wrote:
> > Why does base64 encoding in Python return bytes?
> 
> I seem to recall there was a debate about this around the time of the
> Python 3 move. (IIRC, it was related to the fact that there used to be
> a base64 "codec", that wasn't available in Python 3 because it wasn't
> clear whether it converted bytes to text or bytes). I don't remember
> any of the details, let alone if a conclusion was reached, but a
> search of the archives may find something.

Well, the conclusion is easy to remember: it was decided to return
bytes. The reason isn't hard to imagine either: even though base64
uses ASCII codes for digits and letters, the result is still
essentially binary data. And the most natural next step for it is to
be sent down a socket (socket.send() accepts bytes), etc.
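
As a minimal sketch of the behaviour described above (the payload value
is made up purely for illustration), base64.b64encode() takes bytes and
returns bytes, and getting a str out of it takes an explicit decode:

```python
import base64

payload = b"\x00\xffhello"            # arbitrary binary data (illustrative)
encoded = base64.b64encode(payload)   # bytes in, bytes out

print(type(encoded))                  # <class 'bytes'>

# The result can go straight to bytes-accepting APIs such as socket.send().
# Turning it into text is a separate, explicit step:
text = encoded.decode("ascii")

# b64decode() accepts both bytes and ASCII-only str.
assert base64.b64decode(encoded) == payload
assert base64.b64decode(text) == payload
```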

I find it a bit more surprising that binascii.hexlify() returns
bytes, but I personally got used to it, and consider it a matter of
consistency within the binascii module.
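
For comparison, a small sketch of the hexlify() case (the input bytes
are an arbitrary example; bytes.hex() is mentioned only as the
str-returning alternative available since Python 3.5):

```python
import binascii

data = b"\xde\xad\xbe\xef"           # arbitrary example input

hex_bytes = binascii.hexlify(data)   # returns bytes, not str
print(hex_bytes)                     # b'deadbeef'

hex_text = hex_bytes.decode("ascii") # explicit step to get a str
print(hex_text)                      # deadbeef

# bytes.hex() (Python 3.5+) returns a str directly.
assert data.hex() == hex_text
assert binascii.unhexlify(hex_bytes) == data
```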

Generally, with Python 3 using (inefficient) Unicode for strings by
default, any efficient data processing would use bytes, and then one
appreciates the fact that data encoding/decoding routines also return
bytes, avoiding an implicit, expensive conversion to strings.
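
A rough sketch of what such a bytes-only pipeline could look like (the
choice of zlib and an in-memory buffer is my assumption, just to have
something self-contained; a file or socket would work the same way):

```python
import base64
import io
import zlib

raw = b"some binary payload " * 4     # illustrative input

# Every stage takes bytes and returns bytes; no str is created on the way.
wire = base64.b64encode(zlib.compress(raw))

buf = io.BytesIO()                    # stand-in for a file or socket
buf.write(wire)

# The reverse path stays in bytes as well.
restored = zlib.decompress(base64.b64decode(buf.getvalue()))
assert restored == raw
```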


-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

