John Machin wrote:
> The UTF-n siblings are *external* representations.
> 2.x: a_unicode_object.decode('UTF-16') -> an_str_object
> 3.x: an_str_object.decode('UTF-16') -> a_bytes_object

That should be .encode() to bytes, which is the coded form.
.decode() is bytes => str/unicode.
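
A quick way to check the direction, as a minimal sketch for 3.x only (the sample string and variable names are just for illustration; in 2.x the corresponding call is unicode.encode() giving a str):

```python
# Python 3: str holds code points; bytes holds the encoded (external) form.
text = "caf\u00e9"                   # illustrative sample string

encoded = text.encode("utf-16")      # str -> bytes
decoded = encoded.decode("utf-16")   # bytes -> str

print(type(encoded).__name__)        # bytes
print(type(decoded).__name__)        # str
print(decoded == text)               # True

# In 3.x the wrong-direction methods do not exist at all.
print(hasattr(text, "decode"))       # False
print(hasattr(encoded, "encode"))    # False
```

So in 3.x the mistake in the quoted line fails immediately with an AttributeError rather than silently producing the wrong type.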