[Mailman-Developers] Scrubber.py confusion, 2.1b3

Barry A. Warsaw barry@python.org
Tue, 13 Aug 2002 11:38:59 -0400

>>>>> "MM" == Michael Meltzer <mjm@michaelmeltzer.com> writes:

    MM> Actually I am "reusing" the code from Scrubber.py in MimeDel.py
    MM> to turn attachments into links :-) I hardwired it for image
    MM> types but it is generic enough. Some sample output from my
    MM> "staging":

    MM> Name: beach.jpg Type: image/jpeg Size: 18853 bytes Desc:
    MM> not_available Url:
    MM> http://www.michaelmeltzer.com/pipermail/meltzer-list/attachments/200208/12/beach.jpg-0005.jpe

Cool.  I'm using a slightly different naming algorithm for the path.
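
As a rough illustration of the kind of text block that replaces the
attachment in the sample output above, here is a minimal sketch. The
function name and its arguments are hypothetical; this is not the code
from MimeDel.py or Scrubber.py, only the field layout copied from the
quoted sample.

```python
def attachment_summary(name, ctype, size, url, desc='not_available'):
    """Build the text that stands in for a removed attachment.

    Hypothetical helper: the field names mirror the sample output quoted
    above, nothing more.
    """
    return '\n'.join([
        'Name: %s' % name,
        'Type: %s' % ctype,
        'Size: %d bytes' % size,
        'Desc: %s' % desc,
        'Url: %s' % url,
    ])
```

The returned string would then be set as the payload of a text/plain
part in place of the original image data.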

    MM> It turned out to be a 4 line hack to filter_parts, 1 line at
    MM> the top and 10 lines to reformat the payload, the rest came
    MM> from save_attachment, very handy :-)

Can you try to update it to current cvs?  If it's really a 4 line
hack, you've got to post it. :)  I tried to write the Scrubber.py
updates with you in mind, by factoring out some other functionality
you might need.

    MM> I have to admit the environment is nice to work in.

    MM> I am not sure my code is up to patch quality :-) The next step
    MM> would be a modification to the content filter page for the
    MM> type it should react to.

    MM> I would also suggest (Scrubber.py needs this too) that the
    MM> filter pages list the extensions that it is allowed to write.
    MM> Or the converse, the extensions it should not write;
    MM> http://office.microsoft.com/Assistance/2000/Out2ksecFAQ.aspx
    MM> would be my start :-), save the masses someday :-)

I've been thinking about this.  I vaguely remember that someone did a
patch to support pass-or-block semantics to the filter, but I can't
put my finger on it now.  I want to link Dan Mick's name to that, but
does this ring a bell with anyone?
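
To make "pass-or-block semantics" concrete, here is a minimal sketch of
how an extension check might behave. The function and parameter names
are hypothetical and do not come from any existing Mailman patch; the
assumption is that a block list always wins, and that a non-empty pass
list rejects everything not named in it.

```python
import os

def extension_allowed(filename, pass_exts=None, block_exts=None):
    """Decide whether an attachment with this filename may be written.

    Hypothetical semantics: block_exts is checked first; if pass_exts is
    given and non-empty, only those extensions are accepted; with neither
    list set, everything is accepted.
    """
    # Compare case-insensitively and without the leading dot.
    ext = os.path.splitext(filename)[1].lower().lstrip('.')
    if block_exts and ext in block_exts:
        return False
    if pass_exts:
        return ext in pass_exts
    return True
```

For example, extension_allowed('run.EXE', block_exts={'exe'}) rejects
the file, while extension_allowed('beach.jpg', pass_exts={'jpg', 'jpe'})
accepts it.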

    MM> The issue with the directory is the number of files, not a
    MM> name clash

Yep, I know.

    MM> , `ls -d archives/private/listname/attachments/* |
    MM> wc -l` > 1000 I think system performance will be
    MM> affected. Above 10,000 I know it would (it would also be a
    MM> problem for the http server on access). I can understand
    MM> keeping the attachments from each email in their own
    MM> directory, but this way the "files version control" :-)
    MM> groups them together for access (assuming least recency
    MM> theory) and makes cleaning out for space/inodes simple. It
    MM> was just strftime welded on.

I'm not sure I followed all that, but the current Scrubber.py does add
the date directory to the path, so I think we're good here.
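
For readers following along, a date directory of the kind discussed
here can be built with strftime in a few lines. This is only a sketch:
the YYYYMM/DD layout is taken from the sample URL quoted earlier, not
from Scrubber.py (which uses a slightly different naming algorithm),
and the function name is made up for illustration.

```python
import os
import time

def dated_attachment_dir(base, when):
    """Return base/attachments/YYYYMM/DD for an epoch timestamp (UTC).

    Hypothetical layout modeled on the sample URL in this thread;
    the real Scrubber.py path may differ.
    """
    t = time.gmtime(when)
    return os.path.join(base, 'attachments',
                        time.strftime('%Y%m', t),
                        time.strftime('%d', t))
```

Grouping by date keeps the number of entries in any one directory
small, and removing old attachments becomes a matter of deleting whole
month directories.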