[Python-Dev] Why can't I encode/decode base64 without importing a module?

Lennart Regebro regebro at gmail.com
Thu Apr 25 12:05:01 CEST 2013

On Thu, Apr 25, 2013 at 11:25 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> Le Thu, 25 Apr 2013 08:38:12 +0200,
>> Yes it is. Base64 takes 8-bit bytes and transforms them into another
>> 8-bit stream that can be safely transmitted over various channels that
>> would mangle an unencoded 8-bit stream, such as email etc.
>> http://en.wikipedia.org/wiki/Base64
> I don't see anything in that Wikipedia page that validates your opinion.

OK, quote me the exact page text from the Wikipedia article or RFC
that explains how you map the 31-bit character space of Unicode to

> The Wikipedia page does talk about *text* and *characters* for
> the result of base64 encoding.

So are you saying that you want the Python implementation of base64
encoding to take 8-bit binary data in bytes format and return a
Unicode string containing the Base64 encoded data? I think that would
surprise most people, and be of significantly less use than a base64
encoding that returns bytes.
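
To illustrate the bytes-in, bytes-out behaviour, here is a minimal
sketch using the standard library base64 module on Python 3. The
sample payload and variable names are made up for the example; the
decode step at the end only shows that a caller who really wants a
str can get one explicitly.

```python
import base64

# Arbitrary 8-bit binary data (illustrative payload, not from the thread).
raw = b"\x00\xffhello"

# b64encode takes a bytes-like object and returns bytes, not str.
encoded = base64.b64encode(raw)

# If text is really wanted, the conversion is an explicit step.
# Base64 output only uses ASCII characters, so this cannot fail.
text = encoded.decode("ascii")

# Decoding restores the original bytes.
decoded = base64.b64decode(encoded)
```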

Python 3 still views text as Unicode only. Everything else is not
text, but binary data. This makes sense, is consistent and makes
things easier to handle. This is the whole point of making the str
into Unicode in Python 3.
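
As a rough sketch of that separation (the sample string is only an
illustration), Python 3 refuses to mix str and bytes and requires an
explicit encode or decode to move between them:

```python
# Text is a sequence of Unicode code points.
text = "caf\u00e9"

# Turning text into binary data is always an explicit, named step.
data = text.encode("utf-8")

# Mixing the two types is an error rather than an implicit conversion.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Going back to text is explicit as well.
round_tripped = data.decode("utf-8")
```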

>> No, if you explicitly use such an encoding it is because you need to
>> because you are transferring data to a system that needs the encoding
>> in question. Unicode errors are unavoidable at that point, not an
>> unexpected surprise because a conversion happened implicitly that you
>> didn't know about.
> I don't know what "implicit conversion" you are talking about. There's
> no "implicit conversion" in a scheme where the result of base64
> encoding is a text string.

I'm sorry, I thought you were arguing for a base64 encoding taking
Unicode strings and returning 8-bit bytes. That position I can
understand, although I disagree with it. The position that a base64
encoding should take 8-bit bytes and return Unicode strings is
incomprehensible to me. I have no idea why you would want that, how
you would use it, how you would implement that API in a reasonable
way, nor how you would explain why it is like that. I can't think of
any use case where you would want base64 encoded data unless you intend
to transmit it over an 8-bit channel, so why it should return a
Unicode string instead of 8-bit bytes is completely beyond my
comprehension. Sorry.

