Public bug reported:
After upgrading Mailman from 2.1.23 to 2.1.26-1 on Debian, the upgrade itself went smoothly and the list's mbox is still updated, but the archives are not. For every message, the error log shows:
Jun 23 19:59:03 2018 (20419) SHUNTING: 1529776742.660616+f4a3eea82ed27ce3f481064f194162863c62b280
Jun 23 21:35:24 2018 (20419) Uncaught runner exception: 'utf8' codec can't decode byte 0xaa in position 26: invalid start byte
Jun 23 21:35:24 2018 (20419) Traceback (most recent call last):
  File "/var/lib/mailman/Mailman/Queue/Runner.py", line 119, in _oneloop
    self._onefile(msg, msgdata)
  File "/var/lib/mailman/Mailman/Queue/Runner.py", line 190, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/var/lib/mailman/Mailman/Queue/ArchRunner.py", line 77, in _dispose
    mlist.ArchiveMail(msg)
  File "/var/lib/mailman/Mailman/Archiver/Archiver.py", line 214, in ArchiveMail
    h.processUnixMailbox(f)
  File "/var/lib/mailman/Mailman/Archiver/pipermail.py", line 596, in processUnixMailbox
    self.add_article(a)
  File "/var/lib/mailman/Mailman/Archiver/pipermail.py", line 640, in add_article
    author = fixAuthor(article.decoded['author'])
  File "/var/lib/mailman/Mailman/Archiver/pipermail.py", line 63, in fixAuthor
    while i>0 and (L[i-1] in lowercase or
UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 26: invalid start byte
This is always the same complaint. I have checked shunted messages and the mbox itself and I have not found any 0xaa value in them.
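For anyone repeating that check, a minimal sketch of a raw byte search is shown below. It is written for Python 3 (Mailman 2.1 itself runs on Python 2), and the function name find_byte_offsets is made up for this illustration; it is not part of Mailman.

```python
def find_byte_offsets(path, byte_value=0xAA, chunk_size=1 << 20):
    """Return every file offset at which byte_value occurs in the file at path.

    Hypothetical helper for illustration only; not part of Mailman.
    """
    needle = bytes([byte_value])
    offsets = []
    base = 0  # offset of the current chunk within the file
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            pos = chunk.find(needle)
            while pos != -1:
                offsets.append(base + pos)
                pos = chunk.find(needle, pos + 1)
            base += len(chunk)
    return offsets
```

An empty result for a shunted message file or the mbox would be consistent with the observation above that the offending byte is not in the message data. Note that this only inspects the raw file; a 0xaa byte produced by decoding an encoded header would not show up in such a search.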
** Affects: mailman
     Importance: Undecided
         Status: New
** Tags: pipermail unicode
This may have something to do with the archive database. You can use the script at https://www.msapiro.net/scripts/hddump with --verbose to dump the database for the affected period and look at the values of 'author' and 'decoded'['author']. Is there anything unusual in those, or anything containing a number, like 'Doe, John 3rd'?
If you can post one of the shunted message files, or email it to firstname.lastname@example.org if you don't want to post it, I'll see if I can duplicate this. Also, please see https://wiki.list.org/x/12812344 .
** Changed in: mailman
       Status: New => Incomplete

** Changed in: mailman
     Assignee: (unassigned) => Mark Sapiro (msapiro)
I am experiencing the same issue migrating from 2.1.16 (on Ubuntu 14.04) to 2.1.26 (18.04).
As this was a blocker for the migration, I rsynced the /usr/lib/mailman directory from the 2.1.16 installation onto the new machine, and 'bin/arch --wipe' worked again.
Nothing I looked at in the 'author' or decoded 'author' values looked out of place.
Are you saying that, given the same input mbox file, 'bin/arch --wipe' throws the UnicodeDecodeError with Mailman 2.1.26 but not with 2.1.16? If so, that's strange, as there are no changes between 2.1.16 and 2.1.26 in pipermail.py in the area of 'fixAuthor', and there don't seem to be any Debian patches in this area either.
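One detail of the traceback may be worth checking; it is not confirmed anywhere in this report. The failing expression is the membership test `L[i-1] in lowercase`, and the error names position 26, which is the first index past the 26 ASCII letters. If `lowercase` there is Python 2's `string.lowercase` (an assumption), its value is locale-dependent and under a Latin-1 style locale it can carry non-ASCII bytes after 'z'. In Python 2, testing a unicode character against such a byte string decodes the byte string implicitly, so the 0xaa would come from the locale rather than from any message. The sketch below, in Python 3, only shows that decoding such a value as UTF-8 fails at exactly that position; ASSUMED_LOWERCASE is a hypothetical value, not one taken from the affected systems.

```python
# Hypothetical value: what a locale-dependent lowercase table could look like
# under a Latin-1 locale. This is an assumption, not data from the report.
ASSUMED_LOWERCASE = b"abcdefghijklmnopqrstuvwxyz\xaa\xb5\xba\xdf"


def first_undecodable(data, encoding="utf-8"):
    """Return (position, byte value) of the first decode failure, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


result = first_undecodable(ASSUMED_LOWERCASE)
print(result)
```

If this is the cause, the difference between the two installations would lie in the locale the runners and bin/arch are started under rather than in pipermail.py, which would fit the observation that 'fixAuthor' did not change. Comparing the output of `python -c "import string; print repr(string.lowercase)"` under the same environment on both machines would be one way to test the idea.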