[Tutor] how to struct.pack a unicode string?

Albert-Jan Roskam fomcl at yahoo.com
Sun Dec 2 14:34:52 CET 2012

<snip>


> to make is that the transform formats are multibyte encodings (except
> ASCII in UTF-8), which means the expression str(len(hello)) is using
> the wrong length; it needs to use the length of the encoded string.
> Also, UTF-16 and UTF-32 typically have very many null bytes. Together,
> these two observations explain the error: "'unicode_internal' codec
> can't decode byte 0x00 in position 12: truncated input".

Hi Eryksun,

Observation #1: Yes, that makes perfect sense. I should have thought of that.

Observation #2: As I mentioned in my email to Peter Otten earlier today, I thought
unicode_internal meant UCS-2 or UCS-4, depending on the value of sys.maxunicode.
How does that relate to UTF-16 and UTF-32?
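
For reference, a minimal sketch of observation #1: the count in the struct format
should come from the encoded bytes, not from the number of characters. The helper
names pack_text and unpack_text and the 4-byte little-endian length prefix are
assumptions made for illustration only; they are not taken from the code discussed
earlier in this thread.

```python
import struct

def pack_text(text, encoding="utf-8"):
    """Pack text as a 4-byte length prefix followed by the encoded bytes.

    Hypothetical helper: the layout is an assumption for illustration.
    """
    data = text.encode(encoding)
    # The count in the format is the number of encoded bytes,
    # not the number of characters in the original string.
    fmt = "<I%ds" % len(data)
    return struct.pack(fmt, len(data), data)

def unpack_text(blob, encoding="utf-8"):
    """Reverse pack_text: read the length prefix, then decode the payload."""
    (size,) = struct.unpack_from("<I", blob, 0)
    (data,) = struct.unpack_from("%ds" % size, blob, 4)
    return data.decode(encoding)

hello = u"h\u00e9llo"  # 5 characters
for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    blob = pack_text(hello, enc)
    assert unpack_text(blob, enc) == hello
    # Character count versus encoded byte count for each encoding.
    print(enc, len(hello), len(hello.encode(enc)))
```

With len(hello) as the count instead, struct.pack would silently truncate the
encoded bytes to that many, which would be consistent with a "truncated input"
error when decoding.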

Thank you!

Best regards,
Albert-Jan


