[Mailman-Users] Encoding issues when importing archives

Eric Abrahamsen eric at ericabrahamsen.net
Mon May 14 17:16:52 EDT 2018


I'm recreating some old lists I had in a Mailman 2 installation, and
trying to import the old mboxes into Hyperkitty.

The lists were on Chinese-related subjects, and we've got both messages
that contain Chinese characters, and attachments that have Chinese
filenames and contents.

The import process blows up with a UnicodeEncodeError in
hyperkitty/lib/incoming.py#add_to_list, apparently while the
attachments are being processed:

content = content.encode(decoding)

UnicodeEncodeError: 'gb2312' codec can't encode character '\ufffd' in position 3131: illegal multibyte sequence

Apparently the offending attachments are specified as gb2312 (a common
Chinese encoding).
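
For reference, here is a minimal sketch of how a '\ufffd' can end up in
text that is later encoded as gb2312. It assumes (this is a guess, not
something confirmed from the Hyperkitty source) that the attachment bytes
were really GBK/GB18030 and were decoded leniently somewhere upstream. The
sample string is made up for illustration.

```python
# A traditional character that GBK/GB18030 can encode but GB2312 cannot.
raw = "我們".encode("gbk")

# Assumption: something upstream decodes leniently, which swaps the
# undecodable bytes for U+FFFD (the Unicode replacement character).
lenient = raw.decode("gb2312", errors="replace")

# U+FFFD has no gb2312 representation, so encoding it back fails with
# the same kind of UnicodeEncodeError as in the import.
try:
    lenient.encode("gb2312")
    roundtrip_failed = False
except UnicodeEncodeError:
    roundtrip_failed = True

# GB18030 is a superset of GB2312 and GBK, so it decodes the same bytes.
recovered = raw.decode("gb18030")
```

If that guess is right, the original bytes in the mbox may still be
recoverable with a wider codec such as gb18030.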

Is there a way to preprocess the archive mboxes, or otherwise re-encode
the attachments, before running the import?
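
One possible shape for such a preprocessing step, as a rough sketch using
only the Python standard library (mailbox and email). It rewrites text
parts declared as gb2312 to UTF-8, falling back to gb18030 when strict
gb2312 decoding fails. The function names reencode_part and convert_mbox
are hypothetical, and this has not been tested against Hyperkitty's
importer; it also leaves non-text parts and attachment filenames alone.

```python
import mailbox
from email.message import Message


def reencode_part(part: Message) -> bool:
    """Rewrite one text part declared as gb2312 to UTF-8 (hypothetical helper)."""
    if part.is_multipart() or part.get_content_maintype() != "text":
        return False
    if part.get_content_charset() != "gb2312":
        return False
    raw = part.get_payload(decode=True)  # undo base64/quoted-printable
    if raw is None:
        return False
    try:
        text = raw.decode("gb2312")
    except UnicodeDecodeError:
        # Assumption: the bytes are really GBK/GB18030. Anything still
        # undecodable becomes U+FFFD, which UTF-8 can represent.
        text = raw.decode("gb18030", errors="replace")
    # Drop the old transfer encoding so set_payload() picks a fresh one.
    del part["Content-Transfer-Encoding"]
    part.set_payload(text, charset="utf-8")
    return True


def convert_mbox(src_path: str, dst_path: str) -> int:
    """Copy src_path to dst_path, re-encoding gb2312 text parts on the way."""
    changed = 0
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    try:
        for msg in src:
            for part in msg.walk():
                if reencode_part(part):
                    changed += 1
            dst.add(msg)
        dst.flush()
    finally:
        dst.close()
        src.close()
    return changed
```

Writing to a new mbox rather than editing in place keeps the original
archive untouched, so the import can be retried against either file.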

Thanks,
Eric

