[Python-Dev] Why can't I encode/decode base64 without importing a module?
ijmorlan at uwaterloo.ca
Thu Apr 25 19:29:32 CEST 2013
On Thu, 25 Apr 2013, Lennart Regebro wrote:
> On Thu, Apr 25, 2013 at 4:22 PM, MRAB <python at mrabarnett.plus.com> wrote:
>> The JSON specification says that it's text. Its string literals can
>> contain Unicode codepoints. It needs to be encoded to bytes for
>> transmission and storage, but JSON itself is not a bytestring format.
> OK, fair enough.
>> base64 is a way of encoding binary data as text.
> It's a way of encoding binary data using ASCII. There is a subtle but
> important difference.
It is a way of encoding arrays of 8-bit bytes as arrays of characters that
are part of the printable, non-whitespace subset of the ASCII repertoire.
Since the ASCII repertoire is now simply the first 128 code points in the
Unicode repertoire, it is equally correct to say that base64 is a way of
encoding binary data as Unicode text.
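A minimal Python 3 sketch of the claim above, assuming the standard-library base64 module and its default alphabet (letters, digits, "+", "/" and "=" padding). The names raw, encoded, alphabet and text are illustrative only. It checks that the encoded output stays inside that printable ASCII subset, and that reading it as Unicode text yields exactly the same code point values.

```python
import base64
import string

# Every possible 8-bit byte value, as the "array of 8-bit bytes" to encode.
raw = bytes(range(256))

# In Python 3, b64encode returns a bytes object.
encoded = base64.b64encode(raw)

# The standard base64 alphabet plus the padding character, as byte values.
alphabet = set((string.ascii_letters + string.digits + "+/=").encode("ascii"))
uses_only_alphabet = set(encoded) <= alphabet

# Interpreting those bytes as ASCII gives a str whose code points are
# numerically identical to the byte values.
text = encoded.decode("ascii")
same_code_points = [ord(c) for c in text] == list(encoded)
```

Both uses_only_alphabet and same_code_points should come out true, which is the sense in which the ASCII view and the Unicode view of the output coincide.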
>> In Python 3 we're trying to stop mixing binary data (bytestrings) with
>> text (Unicode strings).
> Yup. And that's why a base64 encoding shouldn't return Unicode strings.
That is exactly why it should return Unicode strings. What bytes should
get sent if base64 is used to send a byte array over an EBCDIC link? [*]
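The EBCDIC question can be sketched in Python 3 as follows. This is only an illustration under stated assumptions: cp500 (one of the EBCDIC codecs shipped with Python) stands in for whatever code page the link would really use, and payload, ascii_wire and ebcdic_wire are made-up names. If the base64 result is treated as text, the bytes on the wire depend on the link's character encoding, and the original data still survives the round trip.

```python
import base64

# Arbitrary binary data to transmit (illustrative value).
payload = b"\x00\xffhello"

# Treat the base64 result as text, i.e. a str.
text = base64.b64encode(payload).decode("ascii")

# The same text serialized for two different links.
ascii_wire = text.encode("ascii")    # what an ASCII link would carry
ebcdic_wire = text.encode("cp500")   # assumption: cp500 as the EBCDIC code page

# The receiver on the EBCDIC side decodes the characters, then the base64.
received = base64.b64decode(ebcdic_wire.decode("cp500"))
```

The two wire forms have the same length but different byte values, while received should equal payload. A bytes-returning encoder effectively fixes the ASCII wire form in advance.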
Having said that, there may be other reasons for base64 encoding to return
bytes - I can conceive of arguments involving efficiency, or practicality,
or the most common use cases. So I can't say for sure what base64
encoding actually ought to return in Python. But the purist stance should
be that base64 encoding should return text, i.e. a string, i.e. unicode.
[*] I apologize to anybody who just ate.
Isaac Morland CSCF Web Guru
DC 2554C, x36650 WWW Software Specialist