[Patches] [ python-Patches-413766 ] Reimplementation of multifile.py

noreply@sourceforge.net
Wed, 11 Apr 2001 08:30:41 -0700


Patches item #413766, was updated on 2001-04-04 11:06
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413766&group_id=5470

Category: Modules
Group: None
Status: Open
Priority: 5
Submitted By: Geoffrey T. Dairiki (dairiki)
Assigned to: Barry Warsaw (bwarsaw)
Summary: Reimplementation of multifile.py

Initial Comment:
This is a re-implementation of the stock multifile.py

The main changes:

1. Efficiency:

This version supports calling the read() method with an
argument.  (In many cases, I've found that reading a
MultiFile line by line is just too slow --- remember that
multipart messages often contain large binary attachments.)

This version also reads from the underlying input stream in
larger chunks, and uses a regular expression search to find
separator lines.
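
As an illustration of the chunked approach described above, here is a minimal sketch. It is not the code from the patch: the function name `read_until_separator`, the `chunk_size` parameter, and the overlap handling are assumptions made for this example. It reads a binary stream in blocks and uses one regular expression search per block to locate the first separator line.

```python
import io
import re


def read_until_separator(stream, boundary, chunk_size=8192):
    """Return the bytes that precede the first separator line in `stream`.

    Hypothetical helper, not part of multifile.py. `stream` is a binary
    file-like object and `boundary` is the boundary string as bytes.
    """
    # A separator is "--boundary" at the start of a line. The line break
    # in front of it is matched too, so it is not returned as content.
    sep = re.compile(rb"(?:\A|\r?\n)--" + re.escape(boundary))
    # Re-scan this many bytes of old data on each pass, so a separator
    # that straddles two chunks is still found.
    overlap = len(boundary) + 4
    buf = b""
    search_from = 0
    while True:
        chunk = stream.read(chunk_size)
        buf += chunk
        match = sep.search(buf, search_from)
        if match:
            return buf[:match.start()]
        if not chunk:
            return buf  # end of input, no separator found
        search_from = max(0, len(buf) - overlap)


data = b"line one\r\nline two\r\n--XYZ\r\nnext part\r\n--XYZ--\r\n"
content = read_until_separator(io.BytesIO(data), b"XYZ", chunk_size=16)
```

The sketch keeps the whole part in memory for simplicity; a real implementation would presumably hand data back to the caller as it goes.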

2. Buglets fixed

The original version has a buglet regarding its handling of
the newline which precedes a separator line.  According to
RFC 2046, section 5.1.1, the newline preceding a separator
is part of the separator, not part of the preceding content.
The old version of multifile.py treats the newline as part
of the content.  Thus, it introduces a spurious empty line
at the end of each content.
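
The difference can be shown with a small sketch. This is not the stock multifile.py code nor the patch; `first_part` and its `newline_in_separator` flag are hypothetical names used only to contrast the two readings of where the line break belongs.

```python
import re


def first_part(text, boundary, newline_in_separator=True):
    """Return the body between the first two separator lines, or None.

    Hypothetical helper for illustration only.
    """
    sep = "--" + re.escape(boundary)
    if newline_in_separator:
        # RFC 2046 reading: the line break before the separator
        # belongs to the separator.
        closing = r"\r?\n" + sep
    else:
        # Behaviour described for the old module: the line break is
        # left at the end of the content.
        closing = r"^" + sep
    pattern = r"^" + sep + r"[^\n]*\n(.*?)" + closing
    match = re.search(pattern, text, re.S | re.M)
    return match.group(1) if match else None


text = "preamble\r\n--XYZ\r\nHello\r\n--XYZ--\r\n"
rfc_body = first_part(text, "XYZ")                              # 'Hello'
old_body = first_part(text, "XYZ", newline_in_separator=False)  # 'Hello\r\n'
```

The extra line break in `old_body` is the spurious empty line mentioned above.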

Matching of the separators:  RFC 2046, section 5.1.1, also
states that if the beginning of a line matches the
separator, it is a separator.  The old code ignores only
trailing white space when looking for a separator line.
This code ignores trailing anything on the separator line.
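
A minimal sketch of the two matching rules follows. Both function names are made up for this example and neither is taken from the old or the new module; they only mirror the behaviour described in the paragraph above.

```python
def is_separator_whitespace_only(line, boundary):
    """Old rule as described: only trailing white space is ignored."""
    stripped = line.rstrip()
    return stripped in ("--" + boundary, "--" + boundary + "--")


def is_separator_prefix(line, boundary):
    """New rule as described: anything after the separator is ignored."""
    return line.startswith("--" + boundary)


boundary = "XYZ"
padded = "--XYZ   \r\n"       # separator followed by white space
trailing = "--XYZ junk\r\n"   # separator followed by other text

old_results = (is_separator_whitespace_only(padded, boundary),
               is_separator_whitespace_only(trailing, boundary))
new_results = (is_separator_prefix(padded, boundary),
               is_separator_prefix(trailing, boundary))
```

One consequence of the prefix rule is that a line starting with "--XYZW" also counts as a separator for the boundary "XYZ", so boundaries should be chosen so that none is a prefix of another.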


----------------------------------------------------------------------

>Comment By: Geoffrey T. Dairiki (dairiki)
Date: 2001-04-11 08:30

Message:
Logged In: YES 
user_id=45814

Oof.  I wish I had found your mimelib a couple of weeks ago.  :-)

You'll notice I've also posted a patch to Mailman (SF#413752)
which adds an option to filter MIME attachments to plain text
(delete binary attachments, convert HTML to plain text, ...).
To do that (without defining new interfaces) I subclassed
MimeWriter --- it's a bit messy.  Using mimelib probably
would have been (and will be) cleaner.

The Mailman patch includes a text/{richtext,enriched} parser
(same interface as HTMLParser) which you guys might be
interested in.

I'm about to head off for a (long) weekend of skiing, so I
won't have a chance to look carefully at mimelib until next
week.  Do expect to hear from me then, though.

-Jeff



----------------------------------------------------------------------

Comment By: Barry Warsaw (bwarsaw)
Date: 2001-04-10 22:13

Message:
Logged In: YES 
user_id=12800

I will definitely look at this -- and soon -- but obviously
not in time for the 2.1 release.  Geoffrey, have you looked
at mimelib (see url below)?  My intent is to replace all the
MIME handling stuff in the standard library with mimelib. 
I'm using mimelib extensively in Mailman, but I would love
to get some additional outside feedback about it.  E.g. how
do you think your new multifile.py would fit in with
mimelib, and how well do you think mimelib conforms to RFC
2046?

http://barry.wooz.org/software/pyware.html

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 14:31

Message:
Logged In: YES 
user_id=6380

Thanks. I'll assign this to Barry, who has been working on
another replacement for multifile, so maybe he can review
your contribution.

Barry, please don't sit on this too long -- If you've no
interest, please unassign it.

----------------------------------------------------------------------

Comment By: Geoffrey T. Dairiki (dairiki)
Date: 2001-04-04 11:09

Message:
Logged In: YES 
user_id=45814

PS. FWIW, this was developed and tested under Python 1.5.2.

----------------------------------------------------------------------
