Mailman 2.1.30 archives challenges
Thanks for the help with my earlier problem; I figured out how to get Apache straight and all is good.
Now I'm trying to get archiving to work properly. It worked fine under 2.1.20, but not now. This is Ubuntu 16.04 with Mailman 2.1.30.
The raw archive file is being written ok.
I have a test list (called 'test') and can see that /var/lib/mailman/archives/private/test/2020-April.txt and /var/lib/mailman/archives/private/test.mbox/test.mbox have both been created and have messages being written to them, but the HTML archive pages are not being built.
Manually running '/var/lib/mailman/bin/arch test' gives me
#00000 <1F9F4DC9-AF84-4D49-B1EE-ABDC0A612692@skylands.ibmwr.org>
figuring article archives
2020-April
Pickling archive state into /var/lib/mailman/archives/private/test/pipermail.pck
Traceback (most recent call last):
  File "bin/arch", line 201, in <module>
    main()
  File "bin/arch", line 189, in main
    archiver.processUnixMailbox(fp, start, end)
  File "/var/lib/mailman/Mailman/Archiver/pipermail.py", line 596, in processUnixMailbox
    self.add_article(a)
  File "/var/lib/mailman/Mailman/Archiver/pipermail.py", line 640, in add_article
    author = fixAuthor(article.decoded['author'])
  File "/var/lib/mailman/Mailman/Archiver/pipermail.py", line 63, in fixAuthor
    while i>0 and (L[i-1][0] in lowercase or
UnicodeDecodeError: 'ascii' codec can't decode byte 0xaa in position 26: ordinal not in range(128)
The following is what’s in /var/lib/mailman/archives/private/test at this time
drwxrwsr-x 2 tcora list 4096 Apr 24 20:30 2020-April
-rw-rw-r-- 1 tcora list  300 Apr 24 20:30 2020-April.txt
-rw-rw-r-- 1 tcora list 1048 Apr 24 20:30 index.html
-rw-rw---- 1 tcora list  588 Apr 24 20:30 pipermail.pck
Kinda at a loss here, so any insight would be much appreciated!
Regards,
— Tom Coradeschi tjcora@icloud.com
On 4/24/20 6:32 PM, Thomas Coradeschi via Mailman-Developers wrote:
> I have a test list (called 'test') and can see that /var/lib/mailman/archives/private/test/2020-April.txt and /var/lib/mailman/archives/private/test.mbox/test.mbox have both been created and have messages being written to them, but the html archive pages are not being built.
> Manually running '/var/lib/mailman/bin/arch test' gives me
Note, do not run /var/lib/mailman/bin/arch more than once on a list without the --wipe option.
> #00000 <1F9F4DC9-AF84-4D49-B1EE-ABDC0A612692@skylands.ibmwr.org>
> figuring article archives
> 2020-April
> Pickling archive state into /var/lib/mailman/archives/private/test/pipermail.pck
> Traceback (most recent call last):
>   File "bin/arch", line 201, in <module>
>     main()
>   File "bin/arch", line 189, in main
>     archiver.processUnixMailbox(fp, start, end)
>   File "/var/lib/mailman/Mailman/Archiver/pipermail.py", line 596, in processUnixMailbox
>     self.add_article(a)
>   File "/var/lib/mailman/Mailman/Archiver/pipermail.py", line 640, in add_article
>     author = fixAuthor(article.decoded['author'])
>   File "/var/lib/mailman/Mailman/Archiver/pipermail.py", line 63, in fixAuthor
>     while i>0 and (L[i-1][0] in lowercase or
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xaa in position 26: ordinal not in range(128)
This looks like a manifestation of an issue we've seen before. There are multiple threads on this issue in the archive of the mailman-users@python.org list. The bulk of it is at <https://mail.python.org/pipermail/mailman-users/2019-March/thread.html> in threads with
Subject: [Mailman-Users] Uncaught runner exception
The bottom line is in <https://mail.python.org/pipermail/mailman-users/2019-March/084280.html>. We could never figure out where it was coming from, but the import
from string import lowercase
in /var/lib/mailman/Mailman/Archiver/pipermail.py was returning a string that contained many accented characters in addition to the 26 letters a-z, namely the iso-8859-1 encoding of
'abcdefghijklmnopqrstuvwxyzµßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ'
This is some packaging quirk in (I think) Debian's Python.
If you follow all through the threads, you'll see several suggested patches for diagnosis and avoidance, and there is also a thread at <https://mail.python.org/pipermail/mailman-users/2019-May/084432.html>.
All that notwithstanding, I think this is the best patch for avoiding/fixing the issue.
=== modified file 'Mailman/Archiver/pipermail.py'
--- Mailman/Archiver/pipermail.py 2018-05-03 21:23:47 +0000
+++ Mailman/Archiver/pipermail.py 2020-04-25 02:13:46 +0000
@@ -60,7 +60,7 @@
     else:
         # Mixed case; assume that small parts of the last name will be
         # in lowercase, and check them against the list.
-        while i>0 and (L[i-1][0] in lowercase or
+        while i>0 and (L[i-1][0] in lowercase[:26] or
                        L[i-1].lower() in smallNameParts):
             i = i - 1
     author = SPACE.join(L[-1:] + L[i:-1]) + ', ' + SPACE.join(L[:i])
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
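A minimal sketch of why slicing to the first 26 characters avoids the error. It is written for Python 3 so it can be run today, and it only mimics what Python 2 does when a unicode character is tested against a byte string (the byte string is implicitly decoded as ASCII). The names POLLUTED_LOWERCASE, py2_style_contains and check are hypothetical, and the extra byte 0xaa is simply the one reported in the traceback; this is not code from pipermail.py.

```python
import string

# Hypothetical stand-in for a polluted Python 2 string.lowercase:
# the 26 ASCII letters followed by one non-ASCII byte (0xaa, the byte
# named in the traceback).
POLLUTED_LOWERCASE = string.ascii_lowercase.encode("ascii") + b"\xaa"


def py2_style_contains(char, byte_string):
    """Mimic Python 2's `u'x' in 'bytes'` test.

    Python 2 first decodes the byte string with the default (ASCII)
    codec, which fails on any byte >= 0x80.
    """
    return char in byte_string.decode("ascii")


def check(char, lowercase):
    """Return the membership result, or the error text if decoding fails."""
    try:
        return py2_style_contains(char, lowercase)
    except UnicodeDecodeError as exc:
        return str(exc)


# Unpatched test: the whole polluted string is decoded, so it fails at
# the first byte after the 26 letters.
unpatched = check("v", POLLUTED_LOWERCASE)

# Patched test: lowercase[:26] keeps only a-z, so decoding succeeds.
patched = check("v", POLLUTED_LOWERCASE[:26])

print(unpatched)
print(patched)
```

The failing position in the message is 26 because indexing starts at 0, so the first byte after a-z is the one the ASCII codec rejects. That is consistent with the patch keeping only lowercase[:26].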
On 24 Apr 2020, at 10:18 PM, Mark Sapiro <mark@msapiro.net> wrote:
> On 4/24/20 6:32 PM, Thomas Coradeschi via Mailman-Developers wrote:
>> I have a test list (called 'test') and can see that /var/lib/mailman/archives/private/test/2020-April.txt and /var/lib/mailman/archives/private/test.mbox/test.mbox have both been created and have messages being written to them, but the html archive pages are not being built.
> [...]
> This looks like a manifestation of an issue we've seen before. There are multiple threads on this issue in the archive of the mailman-users@python.org list. The bulk of it is at <https://mail.python.org/pipermail/mailman-users/2019-March/thread.html> in threads with
> Subject: [Mailman-Users] Uncaught runner exception
> The bottom line is in <https://mail.python.org/pipermail/mailman-users/2019-March/084280.html>. We could never figure out where it was coming from, but the import
> from string import lowercase
> in /var/lib/mailman/Mailman/Archiver/pipermail.py was returning a string that contained many accented characters in addition to the 26 letters a-z, namely the iso-8859-1 encoding of
> 'abcdefghijklmnopqrstuvwxyzµßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ'
> This is some packaging quirk in (I think) Debian's Python.
> If you follow all through the threads, you'll see several suggested patches for diagnosis and avoidance, and there is also a thread at <https://mail.python.org/pipermail/mailman-users/2019-May/084432.html>.
> All that notwithstanding, I think this is the best patch for avoiding/fixing the issue.
> === modified file 'Mailman/Archiver/pipermail.py'
> --- Mailman/Archiver/pipermail.py 2018-05-03 21:23:47 +0000
> +++ Mailman/Archiver/pipermail.py 2020-04-25 02:13:46 +0000
> @@ -60,7 +60,7 @@
>      else:
>          # Mixed case; assume that small parts of the last name will be
>          # in lowercase, and check them against the list.
> -        while i>0 and (L[i-1][0] in lowercase or
> +        while i>0 and (L[i-1][0] in lowercase[:26] or
>                         L[i-1].lower() in smallNameParts):
>              i = i - 1
>      author = SPACE.join(L[-1:] + L[i:-1]) + ', ' + SPACE.join(L[:i])
Bingo - thanks for the tip, Mark. I need to become more facile in using the search engines :-)
Any particular reason this hasn’t been flowed into the existing mailman distribution?
Regards,
-- Tom Coradeschi tjcora@icloud.com
On 4/25/20 5:36 AM, Tom Coradeschi via Mailman-Developers wrote:
> On 24 Apr 2020, at 10:18 PM, Mark Sapiro <mark@msapiro.net> wrote:
>> All that notwithstanding, I think this is the best patch for avoiding/fixing the issue.
>> === modified file 'Mailman/Archiver/pipermail.py'
>> --- Mailman/Archiver/pipermail.py 2018-05-03 21:23:47 +0000
>> +++ Mailman/Archiver/pipermail.py 2020-04-25 02:13:46 +0000
>> @@ -60,7 +60,7 @@
>>      else:
>>          # Mixed case; assume that small parts of the last name will be
>>          # in lowercase, and check them against the list.
>> -        while i>0 and (L[i-1][0] in lowercase or
>> +        while i>0 and (L[i-1][0] in lowercase[:26] or
>>                         L[i-1].lower() in smallNameParts):
>>              i = i - 1
>>      author = SPACE.join(L[-1:] + L[i:-1]) + ', ' + SPACE.join(L[:i])
> Bingo - thanks for the tip, Mark. I need to become more facile in using the search engines :-)
> Any particular reason this hasn’t been flowed into the existing mailman distribution?
Because it's a workaround for a Debian (or some distro) Python packaging issue which should be fixed by the packager, and because I only thought of the above fix yesterday when generating the above reply.
Also, it seems quite rare. Yours is only the third report I've seen, which is very rare for something that would affect archiving of every message.
All that said, I probably will put the above in the next patch release.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Mark Sapiro writes:
> Because it's a workaround for a Debian (or some distro) Python packaging issue which should be fixed by the packager, and because I only thought of the above fix yesterday when generating the above reply.
> Also, it seems quite rare.
I wonder if this is related to Unicode and locales, so that it only happens if your server has certain locales (LC_CTYPE). It does seem like a distro bug to me, too, though.
Steve
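One way the locale idea above might be checked, as a rough sketch only. The Python 2 documentation describes string.lowercase as locale-dependent and updated when locale.setlocale() is called, but nothing in this thread establishes that as the cause here. The helper names extra_lowercase and report are hypothetical. The script would need to be run with the same Python interpreter and environment that Mailman uses; on Python 3, where string.lowercase no longer exists, it falls back to string.ascii_lowercase and will always report nothing extra.

```python
import locale
import string


def extra_lowercase(value):
    """Return whatever follows the first 26 characters of `value`.

    For a clean lowercase constant this is empty; anything else is the
    unexpected tail that the lowercase[:26] patch cuts off.
    """
    return value[26:]


def report():
    """Collect the current LC_CTYPE setting and the lowercase constant."""
    # string.lowercase exists only on Python 2; the fallback keeps the
    # sketch runnable on Python 3, where the constant is always ASCII.
    lowercase = getattr(string, "lowercase", string.ascii_lowercase)
    return {
        # Calling setlocale() with no second argument only queries the
        # current setting; it does not change it.
        "LC_CTYPE": locale.setlocale(locale.LC_CTYPE),
        "lowercase": repr(lowercase),
        "extra": repr(extra_lowercase(lowercase)),
    }


if __name__ == "__main__":
    for key, value in sorted(report().items()):
        print("%s: %s" % (key, value))
```

An empty "extra" value in a fresh interpreter would not rule the hypothesis out, since the constant can change later in a process that calls locale.setlocale().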
participants (4)
- Mark Sapiro
- Stephen J. Turnbull
- Thomas Coradeschi
- Tom Coradeschi