[Python-Dev] Why does base64 return bytes?
Isaac Morland
ijmorlan at uwaterloo.ca
Wed Jun 15 06:21:25 EDT 2016
On Wed, 15 Jun 2016, Greg Ewing wrote:
> Simon Cross wrote:
>> If we only support one, I would prefer it to be bytes since (bytes ->
>> bytes -> unicode) seems like less overhead and slightly conceptually
>> clearer than (bytes -> unicode -> bytes),
>
> Whereas bytes -> unicode, followed if needed by unicode -> bytes,
> seems conceptually clearer to me. IOW, base64 is conceptually a
> bytes-to-text transformation, and the usual way to represent
> text in Python 3 is unicode.
And in CPython, do I understand correctly that the output text would be
represented using one byte per character? If so, would there be a way of
encoding it into UTF-8 that reused the raw memory backing the Unicode
object, and therefore avoided almost all the inefficiency of going via
Unicode? If so, this would be a win: proper use of Unicode to represent a
text string, combined with instantaneous conversion into a bytes object
for the purpose of writing to the OS.
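
To make the question concrete, here is a rough sketch of what can be
observed from Python code. It assumes CPython 3.3 or later, where PEP 393
stores all-ASCII strings at one byte per character; the variable names are
only illustrative, and the size measurement is an implementation detail
rather than a language guarantee.

```python
import base64
import sys

raw = bytes(range(256))

# base64.b64encode returns bytes in Python 3; decoding as ASCII gives text.
encoded = base64.b64encode(raw)
text = encoded.decode("ascii")

# CPython-specific (PEP 393): an all-ASCII str stores one byte per character,
# so appending 100 ASCII characters grows the object by 100 bytes.
per_char = (sys.getsizeof(text + "A" * 100) - sys.getsizeof(text)) / 100

# Encoding back to bytes yields the same payload, but as a new bytes object
# holding its own copy of the data, not a view of the str's memory.
utf8 = text.encode("utf-8")
round_trip_ok = utf8 == encoded and base64.b64decode(text) == raw
```

So the one-byte-per-character part holds in CPython, but as far as I can
tell str.encode() still copies, because a bytes object owns its own
buffer. Sharing the memory would need support below the Python level.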
Isaac Morland CSCF Web Guru
DC 2619, x36650 WWW Software Specialist