Mailman archive creation problem

Hello Mailman users,
We are running Mailman 2.1.22. When we try to rebuild a list archive with the command 'arch --wipe listname', we receive the following error.
#00000 <F60795D9B7521C478F6F9174A9183DC94279D62D@MAIL.cti.gr>
ÎÏίθμηÏη άÏθÏÏν ιÏÏ Î¿ÏικÏν αÏÏείÏν
2013-February
Pickling archive state into /var/mailman/archives/private/tenders/pipermail.pck
Traceback (most recent call last):
  File "/var/mailman/bin/arch", line 201, in <module>
    main()
  File "/var/mailman/bin/arch", line 189, in main
    archiver.processUnixMailbox(fp, start, end)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 596, in processUnixMailbox
Does anyone have a suggestion for, or experience with, this problem?
Thank you for your attention.
Antonis Limperis

On 09/16/2016 02:58 AM, Limperis Antonis wrote:
Hello Mailman users,
We are running Mailman 2.1.22. When we try to rebuild a list archive with the command 'arch --wipe listname', we receive the following error.
#00000 <F60795D9B7521C478F6F9174A9183DC94279D62D@MAIL.cti.gr> ÎÏίθμηÏη άÏθÏÏν ιÏÏ Î¿ÏικÏν αÏÏείÏν
There seems to be some character encoding issue here, but ...
2013-February
Pickling archive state into /var/mailman/archives/private/tenders/pipermail.pck
Traceback (most recent call last):
  File "/var/mailman/bin/arch", line 201, in <module>
    main()
  File "/var/mailman/bin/arch", line 189, in main
    archiver.processUnixMailbox(fp, start, end)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 596, in processUnixMailbox
This traceback is incomplete. Please post the complete traceback, and if possible, the correct rendering of the
ÎÏίθμηÏη άÏθÏÏν ιÏÏ ...
stuff. Or is that perhaps some garbled rendering of
Αρίθμηση άρθρων ιστορικών αρχείων
Also, is other stuff missing? I.e., is there more between "2013-February" and "Pickling archive state into ..."?
Normally, I would expect three lines like
#nnnnnn <message-id>
figuring article archives
yyyy-Month
for each message in the mbox followed by something like
Updating index files for archive [yyyy-Month]
  Date
  Subject
  Author
  Thread
Computing threaded index
Updating HTML for article n1
Updating HTML for article n2
...
for each month in the archive and then finally, the
Pickling archive state into ...
line.
I really need the complete traceback to begin to say more.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

Hello Mark,
This is the complete trace:
[root@mailman ~]# /var/mailman/bin/arch --wipe testlist
#00000 <002f01c4a17b$0e714490$c7eb3fc2@avgoulea>
Αρίθμηση άρθρων ιστορικών αρχείων
2004-September
#00001 <24BA97B6F4C0E84999709C650C255C2E604D92@amfithea.cti.gr>
Αρίθμηση άρθρων ιστορικών αρχείων
2004-September
Ανανέωση των αρχείων καταλόγου για το ιστορικό αρχείο [2004-September]
  Date
  Subject
  Author
  Ακολουθία μηνύματος
Γίνεται υπολογισμός των περιεχομένων
Ανανέωση της HTML για το άρθρο 0
Pickling archive state into /var/mailman/archives/private/testlist/pipermail.pck
Traceback (most recent call last):
  File "/var/mailman/bin/arch", line 201, in <module>
    main()
  File "/var/mailman/bin/arch", line 189, in main
    archiver.processUnixMailbox(fp, start, end)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 596, in processUnixMailbox
    self.add_article(a)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 640, in add_article
    author = fixAuthor(article.decoded['author'])
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 63, in fixAuthor
    while i>0 and (L[i-1][0] in lowercase or
UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 26: ordinal not in range(128)
I found that the reconstruction works properly if I set the locale to el_GR.utf8 with "export LC_ALL=el_GR.utf8". The original system locale environment was:
LANG=el_GR.ISO8859-7
LC_CTYPE=el_GR.ISO8859-7
LC_NUMERIC=el_GR.ISO8859-7
LC_TIME=el_GR.ISO8859-7
LC_COLLATE=el_GR.ISO8859-7
LC_MONETARY=el_GR.ISO8859-7
LC_MESSAGES=C
LC_PAPER="el_GR.ISO8859-7"
LC_NAME="el_GR.ISO8859-7"
LC_ADDRESS="el_GR.ISO8859-7"
LC_TELEPHONE="el_GR.ISO8859-7"
LC_MEASUREMENT="el_GR.ISO8859-7"
LC_IDENTIFICATION="el_GR.ISO8859-7"
LC_ALL=
Thank you for your attention. Antonis

On 09/20/2016 08:36 PM, Limperis Antonis wrote:
Traceback (most recent call last):
  File "/var/mailman/bin/arch", line 201, in <module>
    main()
  File "/var/mailman/bin/arch", line 189, in main
    archiver.processUnixMailbox(fp, start, end)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 596, in processUnixMailbox
    self.add_article(a)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 640, in add_article
    author = fixAuthor(article.decoded['author'])
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 63, in fixAuthor
    while i>0 and (L[i-1][0] in lowercase or
UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 26: ordinal not in range(128)
Pipermail is trying to canonicalize the display name in the From: header of a message into "Last, First" form, and it is checking whether the initial character of a "word" in the name is in the string of lowercase characters for the locale. At this point, the name is a unicode object, and Python is trying to decode the "lowercase" string to unicode for the comparison. For some reason, the "lowercase" string appears to be iso-8859-7 encoded, but the decoding is being done as if it were ascii, so it fails on the first non-ASCII byte (0xdc at position 26, just past the 26 ASCII letters).
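The decode failure described here can be reproduced in isolation. What follows is only a minimal sketch, written for Python 3, whereas Mailman 2.1 runs on Python 2, where the decode happens implicitly (with the default 'ascii' codec) when a unicode object is tested against a byte string. The value of locale_lowercase is an assumed stand-in for what the locale-dependent "lowercase" string might hold under an ISO-8859-7 locale; it is not taken from the actual system, and decode_lowercase is a hypothetical helper, not part of pipermail.

```python
# Assumed stand-in for the locale's "lowercase" byte string under
# el_GR.ISO8859-7: the 26 ASCII letters followed by a few accented
# Greek letters encoded as ISO-8859-7 (the first of them is byte 0xdc).
locale_lowercase = b"abcdefghijklmnopqrstuvwxyz" + "άέήί".encode("iso-8859-7")


def decode_lowercase(raw, encoding):
    """Hypothetical helper: decode the byte string, returning (text, error)."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Decoding as ASCII stops at the first non-ASCII byte, as in the traceback.
text, error = decode_lowercase(locale_lowercase, "ascii")
print(error)

# Decoding with the encoding the bytes are really in succeeds, and the
# membership test on the decoded text then works for Greek letters.
greek_text, greek_error = decode_lowercase(locale_lowercase, "iso-8859-7")
print("ά" in greek_text)
```

Under these assumptions the ASCII attempt reports byte 0xdc at position 26, which lines up with the error in the traceback.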
I found that the reconstruction works properly if I set the locale to el_GR.utf8 with "export LC_ALL=el_GR.utf8". The original system locale environment was:
LANG=el_GR.ISO8859-7
LC_CTYPE=el_GR.ISO8859-7
LC_NUMERIC=el_GR.ISO8859-7
LC_TIME=el_GR.ISO8859-7
LC_COLLATE=el_GR.ISO8859-7
LC_MONETARY=el_GR.ISO8859-7
LC_MESSAGES=C
LC_PAPER="el_GR.ISO8859-7"
LC_NAME="el_GR.ISO8859-7"
LC_ADDRESS="el_GR.ISO8859-7"
LC_TELEPHONE="el_GR.ISO8859-7"
LC_MEASUREMENT="el_GR.ISO8859-7"
LC_IDENTIFICATION="el_GR.ISO8859-7"
LC_ALL=
I'm guessing that Python is confused because most of the locale stuff is "el_GR.ISO8859-7", but LC_ALL is not. In any case, it appears you have solved the problem by "export LC_ALL=el_GR.utf8".
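As a rough illustration of why exporting LC_ALL changes the outcome, the sketch below models the standard POSIX precedence among the locale variables: a non-empty LC_ALL overrides the per-category variable, which in turn overrides LANG. The function effective_locale is hypothetical and only inspects a dictionary; it does not call the C library or Mailman, and the dictionaries list just a few of the variables reported in this thread.

```python
def effective_locale(env, category="LC_CTYPE"):
    """Hypothetical helper: return the locale name that applies to `category`.

    POSIX precedence: a non-empty LC_ALL wins, then the category's own
    variable, then LANG. If none of them is set, the "C" locale applies.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:  # an empty value is treated the same as unset
            return value
    return "C"


# A subset of the environment reported in the thread.
original_env = {
    "LANG": "el_GR.ISO8859-7",
    "LC_CTYPE": "el_GR.ISO8859-7",
    "LC_MESSAGES": "C",
    "LC_ALL": "",
}

# The same environment after "export LC_ALL=el_GR.utf8".
fixed_env = dict(original_env, LC_ALL="el_GR.utf8")

print(effective_locale(original_env))              # character-type locale before
print(effective_locale(fixed_env))                 # character-type locale after
print(effective_locale(fixed_env, "LC_MESSAGES"))  # LC_ALL overrides every category
```

In this model the empty LC_ALL in the original environment counts as unset, so the character-type locale there comes from LC_CTYPE; once LC_ALL is exported, every category follows it.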
--
Mark Sapiro <mark@msapiro.net>