[Mailman-Developers] unicode / archive problem followup (scrubber.py too)

Ron Brogden rb at islandnet.com
Wed Dec 4 01:30:06 2002

Howdy.  Since no one replied with any suggestions, I decided to take a 
chance on a crude little hack to deal with the archiver getting confused 
when the unicode type returned was "None".  That in turn exposed a bug in 
the archiver's attachment handling code, which I have also worked around 
with a crude hack.  I want to make sure I haven't done a big no-no.

First off, the specific changes:

In HyperArchive.py the following lines were changed to always decode as 
iso-8859-1 if no charset could be discerned:

450:            return unicode(subj, "iso-8859-1")
1010:            return unicode(result, "iso-8859-1") 
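
The fallback idea can be sketched on its own, outside of Mailman.  This is 
a minimal Python 3 illustration and not the HyperArchive.py code; the helper 
name to_text is hypothetical.  ISO-8859-1 maps every byte value to a 
character, so the final decode never fails, though it can yield the wrong 
characters if the real charset was something else.

```python
def to_text(raw, charset=None):
    """Decode bytes using charset, falling back to ISO-8859-1.

    Hypothetical helper for illustration; not part of Mailman.
    """
    if charset is not None:
        try:
            return raw.decode(charset)
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name, or bytes that are invalid for that codec.
            pass
    # ISO-8859-1 covers all 256 byte values, so this decode cannot raise.
    return raw.decode("iso-8859-1")
```

The trade-off is that a message in an unknown charset is archived with 
possibly mangled characters instead of stopping the archiver.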

This got me past the initial error, but it led to a new error to do with 
dates and attachment folders.  

The sanity checking for the folder name selection is not too bright.  If 
the date cannot be culled from the message due to a logic problem (it falls 
through in a funny way), the archiver dies instead of either just picking a 
date or tossing the attachment.  Since I am not totally clear on everything 
the date affects, I chickened out and just assumed that there would always 
be a date in the received header, which probably isn't wise but is better 
than what happens now.

In Scrubber.py around line 80:

def calculate_attachments_dir(mlist, msg, msgdata):
    # Calculate the directory that attachments for this message will go
    # under.  To avoid inode limitations, the scheme will be:
    # archives/private/<listname>/attachments/YYYYMMDD/<msgid-hash>/<files>
    # Start by calculating the date-based and msgid-hash components.
    msgdate = msg.get('Date')
    if msgdate is None:
        now = time.gmtime(msgdata.get('received_time', time.time()))
    else:
        now = parsedate(msgdate)

This is a problem since it appears that parsedate() is not guaranteed to 
return something useful, and so again the archiver was dying.  To get around 
this I had to add the following catchall afterwards:

    if now is None:
        now = time.gmtime(msgdata.get('received_time', time.time()))
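
Put together, the date selection with the catchall can be sketched as a 
standalone function.  This is a rough Python 3 illustration under my own 
assumptions, not the Scrubber.py code: the helper name message_time is 
hypothetical, and msgdata is treated as a plain dict that may carry a 
'received_time' key.

```python
import time
from email.message import Message
from email.utils import parsedate


def message_time(msg, msgdata):
    """Return a time tuple for msg, never None.

    Hypothetical helper for illustration; not part of Mailman.
    """
    # Fallback: the received time if recorded, otherwise the current time.
    fallback = time.gmtime(msgdata.get("received_time", time.time()))
    msgdate = msg.get("Date")
    if msgdate is None:
        return fallback
    parsed = parsedate(msgdate)
    # parsedate() returns None when it cannot parse the header value.
    return parsed if parsed is not None else fallback


# Usage: a malformed Date header falls back to the received time.
msg = Message()
msg["Date"] = "not a date"
datedir = time.strftime("%Y%m%d", message_time(msg, {"received_time": 0}))
```

Either way the caller always gets something time.strftime() can format into 
the YYYYMMDD directory component.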

So two questions.  First, is the above a reasonable thing to do (i.e. am I 
introducing anything nasty by doing this)?  Secondly, I don't want 
attachments archived at all.  How do you disable this behaviour?

Thanks for any suggestions or feedback.


