[ python-Bugs-1112856 ] patch 1079734 broke cgi.FieldStorage w/ multipart post req.

Fri Feb 4 23:06:37 CET 2005

Bugs item #1112856, was opened at 2005-01-31 01:58
Message generated for change (Comment added) made by irmen
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1112856&group_id=5470

Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Irmen de Jong (irmen)
Assigned to: Nobody/Anonymous (nobody)
Summary: patch 1079734 broke cgi.FieldStorage w/ multipart post req.

Initial Comment:
Patch 1079734 "Make cgi.py use email instead of rfc822
or mimetools" seems to have broken the cgi.FieldStorage
in cases where the request is a multipart post (for
instance, when a file upload form field is used).
See the attached test program.

With cgi.py revision <1.83 (python 2.4 for instance) I
get the expected results;

374
FieldStorage(None, None, [FieldStorage('param1', None,
'Value of param1'), FieldStorage('param2', None, 'Value
of param2'), FieldStorage('file', '', ''),
FieldStorage(None, None, '')])

but with cgi.py rev 1.83 (current) I get this:

374
FieldStorage(None, None, [FieldStorage('param1', None,
'')])

Another thing that I observed (which isn't reproduced
by this test program) is that cgi.FieldStorage.__init__
never completes when the fp is a socket-file (and the
request is again a multipart post). It worked fine with
the old cgi.py.

----------------------------------------------------------------------

>Comment By: Irmen de Jong (irmen)
Date: 2005-02-04 23:06

Message:
Logged In: YES 
user_id=129426

Johannes: while your patch makes my cgibug.py test case run
fine, it has 2 problems:
1- it runs much slower than the python2.4 code (probably
because of the reading back thing Josh is talking about);
2- it still doesn't fix the second problem that I observed:
cgi.FieldStorage never completes when fp is a socket. I
don't have a separate test case for this yet, sorry.

So Josh: perhaps your idea doesn't have these 2 problems?

----------------------------------------------------------------------

Comment By: Josh Hoyt (joshhoyt)
Date: 2005-02-04 16:45

Message:
Logged In: YES 
user_id=693077

Johannes, your patch looks fine to me. It would be nice if
we didn't have to keep reading back each part from the
parsed message, though.

I had an idea for another approach. Use email to parse the
MIME message fully, then convert it to FieldStorage fields.
Parsing could go something like:

== CODE ==

from email.FeedParser import FeedParser
parser = FeedParser()
# Create bogus content-type header...
parser.feed('Content-type: %s ; boundary=%s \r\n\r\n' %
(self.type, self.innerboundary))
parser.feed(self.fp.read())
message = parser.close()

# Then take parsed message and convert to FieldStorage fields

== END CODE ==

This lets the email parser handle all of the complexities of
MIME, but it does mean that we have to accurately re-create
all of the necessary headers. I can cook up a full patch if
anyone thinks this would fly.

----------------------------------------------------------------------

Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2005-02-04 11:15

Message:
Logged In: YES 
user_id=469548

Here's a patch. We're interested in two things in the
patched loop: 

* the rest of the multipart/form-data, including the headers
for the current part, for consumption by HeaderParser (this
is the tail variable)
* the rest of the multipart/form-data without the headers
for the current part, for consumption by FieldStorage (this
is the message.get_payload() call)

Josh, Irmen, do you see any problems with this patch?

(BTW, this fix should be ported to the parse_multipart
function as well, when I check it in (and I should make
cgibug.py into a test))

----------------------------------------------------------------------

Comment By: Josh Hoyt (joshhoyt)
Date: 2005-02-04 08:21

Message:
Logged In: YES 
user_id=693077

The email.HeaderParser module that is now used for parsing
headers consumes the rest of the body of the message when it
is used. If there was some way of telling the parser to stop
reading when it got to the end of the headers, I think this
problem would be fixed. Unfortunately, there is no external
interface to the email module's parser that lets us do this.
I hope we can come up with a reasonable solution that does
not involve reverting to using deprecated modules.

Thanks for the test case.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1112856&group_id=5470