[Tutor] Help understanding base64 decoding

Cameron Simpson cs at cskk.id.au
Thu Sep 13 18:19:11 EDT 2018


On 13Sep2018 08:23, Ryan Smith <ryan at allwegot.net> wrote:
>[...] I'm still getting familiar with all of the
>different encodings at play. For example the way I currently
>understand things is that python supports unicode which ultimately
>defaults to being encoded in UTF-8. Hence I'm guessing that is the reason
>for converting strings to a bytes object in the first place.

Yeah. "str" is text, using Unicode code points.

To store this in a file, the text must be transcribed in some encoding. In
Python 3, UTF-8 is the default for str.encode() and bytes.decode() (open()
uses the platform's preferred encoding unless you pass encoding=...). UTF-8
has some advantages: the bottom 128 values are one to one with ASCII, and it
is fairly compact when the source text lives in or near that range.
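
A quick way to see both properties at the interactive prompt. This is only a
sketch; the sample strings and variable names are made up for illustration:

```python
# ASCII-range text: each character becomes the same single byte as in ASCII.
ascii_text = "hello"
ascii_bytes = ascii_text.encode("utf-8")
print(ascii_bytes)          # b'hello'
print(len(ascii_bytes))     # 5

# A code point outside ASCII (U+00E9, e with acute accent) takes two bytes.
accented = "caf\u00e9"
accented_bytes = accented.encode("utf-8")
print(accented_bytes)       # b'caf\xc3\xa9'
print(len(accented), len(accented_bytes))   # 4 5
```

So four characters became five bytes: only the accented one cost extra.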

Windows often works with UTF-16, which is why your source bytes look the way 
they do.
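
To illustrate what that looks like, here is a small sketch. I am assuming the
little-endian form without a byte order mark ("utf-16-le"), which is what
Windows tools commonly produce; the sample string is made up:

```python
text = "hi"

# UTF-16 uses two bytes per character here; for ASCII-range text the
# second byte of each pair is zero in little-endian order.
utf16_bytes = text.encode("utf-16-le")
print(utf16_bytes)              # b'h\x00i\x00'

# The same text in UTF-8, for comparison.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)               # b'hi'
```

Those interleaved \x00 bytes are the usual giveaway that you are looking at
UTF-16 rather than UTF-8.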

So the path is:

  base64 text (which fits in a conservative subset of ASCII)
  => bytes holding a UTF-16 encoding of your target text
  => decode to a Python str
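
As a minimal sketch of that path: the sample text below is invented, and I am
assuming the bytes are UTF-16 little-endian with no byte order mark, so adjust
the codec name if your data differs.

```python
import base64

# Build some sample base64 text so the example is self-contained.
original = "Hello"
b64_text = base64.b64encode(original.encode("utf-16-le")).decode("ascii")
print(b64_text)                 # SABlAGwAbABvAA==

# Step 1: base64 text => bytes holding the UTF-16 encoding.
raw = base64.b64decode(b64_text)
print(raw)                      # b'H\x00e\x00l\x00l\x00o\x00'

# Step 2: bytes => Python str, by naming the encoding the bytes are in.
text = raw.decode("utf-16-le")
print(text)                     # Hello
```

Note that b64decode() only gets you to bytes; it knows nothing about text.
The .decode() call is where you say which encoding those bytes use.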

Cheers,
Cameron Simpson <cs at cskk.id.au>

