[Mailman-Users] diagnosing messages missing from archives

Thu May 11 13:57:20 EDT 2017

On 05/11/2017 09:00 AM, Matt Morgan wrote:
> On Tue, May 9, 2017 at 12:58 PM, Mark Sapiro <mark at msapiro.net> wrote:
>>
>> In Mailman's directory (/usr/local/mailman in your case)
>>
>> bin/show_qfiles qfiles/shunt/*
>>
> 
> Just FYI, in case anyone's reading this in the list archives in future, you
> may need a "../" in front of "qfiles/shunt/*" there.

Are you saying that in your case Mailman's qfiles directory is
/usr/local/qfiles? That seems unusual. Or did you do

cd /usr/local/mailman/bin
show_qfiles ../qfiles/shunt/*

...
>   File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 63, in
> fixAuthor
>     while i>0 and (L[i-1][0] in lowercase or
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xb5 in position 26:
> ordinal
> not in range(128)

The full statement throwing the exception is

        while i>0 and (L[i-1][0] in lowercase or
                       L[i-1].lower() in smallNameParts):

lowercase is string.lowercase, which from below appears to be all ascii.
smallNameParts is defined in the module as a list of short ascii strings:

smallNameParts = ['van', 'von', 'der', 'de']

That leaves L as the only possible source of non-ascii. L is a list of
the 'words' in the From: display name, but at that point in the code it
should already be unicode. I suppose L may still be involved in the
issue, but then what is the XXXXXXXXXX in the anonymized address below?
Does it contain non-ascii?

> Here is the output from the dumpdb command (anonymized a little):
> 
...
> From: XXXXXXXXXX <xxxxxxxx at me.com>

...
>> What do you get if you invoke Python interactively on this server and do
>>
>> import string
>> string.lowercase
>>
>> I get 'abcdefghijklmnopqrstuvwxyz'
>>
> 
> I get the same thing! Does that make any sense?
> 
> xxxx at yyyyy:/usr/local/mailman/logs# python
> Python 2.7.5 (default, May 29 2013, 02:28:51)
> [GCC 4.8.0] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import string
>>>> string.lowercase
> 'abcdefghijklmnopqrstuvwxyz'

I'm still leaning towards non-ascii in string.lowercase, because I think
that's the only thing it could be and because the \xb5 is in 0-based
position 26, which would be the character after 'z'.
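
As a rough sketch of that reasoning: the byte string below is a made-up
stand-in for a locale-extended string.lowercase, not output from your
server, and first_non_ascii is just a helper name for illustration. In
Python 2 the decode happens implicitly when a unicode character is
tested against a byte string with "in"; the explicit decode here shows
the same error and position.

```python
def first_non_ascii(data):
    """Return (index, byte value) of the first non-ASCII byte, or None."""
    for index, value in enumerate(bytearray(data)):
        if value > 127:
            return index, value
    return None


# Assumption: a hypothetical lowercase string with one extra byte after 'z'.
suspect_lowercase = b"abcdefghijklmnopqrstuvwxyz\xb5"

try:
    suspect_lowercase.decode("ascii")
    decode_error = None
except UnicodeDecodeError as exc:
    # exc.start is the 0-based offset of the byte that could not be decoded.
    decode_error = exc

if __name__ == "__main__":
    print(first_non_ascii(suspect_lowercase))
    print(decode_error)
```

The 26 ascii letters occupy positions 0 through 25, so the first extra
byte lands at position 26, which matches the traceback.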

Is Python 2.7.5 (default, May 29 2013, 02:28:51) the Python that Mailman
is using? Look at whatever script starts/stops/restarts Mailman, find the
Python it uses, and do the above test in that Python.
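
If it helps, something like the following could be saved to a file and
run with that Python. It is only a sketch; interpreter_report is a name
I made up, and the environment variables listed are just the usual
locale ones. In Python 2, string.lowercase is locale-dependent and is
updated when locale.setlocale() is called, so the environment the
Mailman processes run under can matter.

```python
import os
import string
import sys


def interpreter_report():
    """Collect facts about the running interpreter relevant to this error."""
    # string.lowercase exists only in Python 2; the fallback lets this
    # also run under Python 3, where it will always be plain ascii.
    lowercase = getattr(string, "lowercase", string.ascii_lowercase)
    non_ascii = [i for i, ch in enumerate(lowercase) if ord(ch) > 127]
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "locale_env": dict(
            (name, os.environ.get(name)) for name in ("LANG", "LC_ALL", "LC_CTYPE")
        ),
        "lowercase": lowercase,
        "non_ascii_positions": non_ascii,
    }


if __name__ == "__main__":
    for key, value in sorted(interpreter_report().items()):
        print("%s: %r" % (key, value))
```

A non-empty non_ascii_positions list, or a version different from the
2.7.5 you tested interactively, would be worth reporting back.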

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan