[Mailman-Users] Encoding issues when importing archives

Mark Sapiro mark at msapiro.net
Tue May 15 08:52:02 EDT 2018

On 5/14/18 2:16 PM, Eric Abrahamsen wrote:
> I'm recreating some old lists I had in a Mailman 2 installation, and
> trying to import the old mboxes into Hyperkitty.

This is not the appropriate list for Mailman 3.
mailman-users at mailman3.org
or possibly mailman-developers at python.org
<https://mail.python.org/mailman/listinfo/mailman-developers> are the
appropriate lists.

> The lists were on Chinese-related subjects, and we've got both messages
> that contain Chinese characters, and attachments that have Chinese
> filenames and contents.
> The import process is blowing up with a UnicodeEncodeError, in
> hyperkitty/lib/incoming.py#add_to_list, it looks like when the
> attachments are being processed:
> content = content.encode(encoding)
> UnicodeEncodeError: 'gb2312' codec can't encode character '\ufffd' in position 3131: illegal multibyte sequence
> Apparently the offending attachments are specified as gb2312 (a common
> Chinese encoding).
> Is there something I can do to somehow preprocess the archive mboxes, or
> otherwise re-encode the attachments?

Possibly there is, but this is a bug in the hyperkitty_import process.
It would help if you file an issue at
<https://gitlab.com/mailman/hyperkitty/issues/new> with enough
information for us to reproduce it.
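
To narrow down which messages to attach to such a report, something like
the following rough sketch may help. It assumes the failure comes from
parts whose raw bytes are not valid in their declared charset (which
would be consistent with the '\ufffd' replacement character in the error
above). It uses only the Python standard library, not any HyperKitty
API, and the function name find_bad_parts is made up for illustration.

```python
import mailbox


def find_bad_parts(mbox_path):
    """Yield (message_id, filename, charset) for every non-multipart part
    whose decoded payload bytes are not valid in the declared charset.

    Hypothetical helper for locating problem messages; not HyperKitty code.
    """
    for msg in mailbox.mbox(mbox_path):
        for part in msg.walk():
            if part.is_multipart():
                continue
            charset = part.get_content_charset()
            if charset is None:
                # No declared charset, so there is nothing to check against.
                continue
            # decode=True undoes base64/quoted-printable and returns bytes.
            payload = part.get_payload(decode=True)
            if payload is None:
                continue
            try:
                payload.decode(charset)
            except (UnicodeDecodeError, LookupError):
                # Invalid bytes for this charset, or an unknown charset name.
                yield msg.get("Message-ID"), part.get_filename(), charset
```

Running list(find_bad_parts("/path/to/list.mbox")) on a copy of the
archive (the path here is a placeholder) should give a short list of
candidate messages to include in the issue.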

Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
