[ python-Bugs-1112549 ] cgi.FieldStorage memory usage can spike in line-oriented ops

Sun Apr 3 06:00:06 CEST 2005

Bugs item #1112549, was opened at 2005-01-30 08:40
Message generated for change (Comment added) made by chrism
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1112549&group_id=5470

Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Chris McDonough (chrism)
Assigned to: Nobody/Anonymous (nobody)
Summary: cgi.FieldStorage memory usage can spike in line-oriented ops

Initial Comment:
Various parts of cgi.FieldStorage call its
"read_lines_to_outerboundary", "read_lines" and
"skip_lines" methods.  These methods use the
"readline" method of the file object that represents an
input stream.  The input stream is typically data
supplied by an untrusted source (such as a user
uploading a file from a web browser).  The input data
is not required by the RFC 822/1521/1522/1867
specifications to contain any newline characters, so a
"readline" call with no size limit may read the entire
input into memory as a single line.  For
example, it is within the bounds of the specification
to supply a multipart/form-data input stream with a
"file-data" part that consists of a 2GB string composed
entirely of "x" characters (which is how I came to
notice this bug).

The simplest fix is to pass the "size" argument to the
file object's readline method everywhere FieldStorage
calls it.  A patch against the Python 2.3.4 cgi.py module
that does this is attached.
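
As a rough illustration of the idea (not the attached patch itself), a
bounded read loop might look like the sketch below.  The helper name
"iter_bounded_lines" and the 64 KB default limit are assumptions made for
this example; only the "size" argument of readline comes from the report.

```python
import io

def iter_bounded_lines(fp, limit=1 << 16):
    """Yield pieces of fp, each at most `limit` bytes long.

    Hypothetical helper: a piece ends at a newline or at the size limit,
    whichever comes first, so one huge "line" never sits in memory whole.
    """
    while True:
        chunk = fp.readline(limit)
        if not chunk:  # empty result means end of stream
            break
        yield chunk

# A newline-free "upload": an unbounded readline() would return all of it at once.
stream = io.BytesIO(b"x" * 200_000)
chunks = list(iter_bounded_lines(stream))
largest = max(len(c) for c in chunks)
total = sum(len(c) for c in chunks)
print(len(chunks), largest, total)
```

Each piece is capped at the limit, so peak memory per read stays bounded
no matter how long the sender's line is.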

----------------------------------------------------------------------

>Comment By: Chris McDonough (chrism)
Date: 2005-04-02 23:00

Message:
Logged In: YES 
user_id=32974

FYI, I'd be happy to do the merging here if you wanted to
give me checkin access.

----------------------------------------------------------------------

Comment By: Chris McDonough (chrism)
Date: 2005-04-02 22:42

Message:
Logged In: YES 
user_id=32974

An updated test_cgi.py is attached.  It tests the
readline behavior and adds a test for basic multipart parsing.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2005-03-31 23:48

Message:
Logged In: YES 
user_id=6380

Can I tweak you into uploading a unit test?

----------------------------------------------------------------------

Comment By: Chris McDonough (chrism)
Date: 2005-03-31 21:56

Message:
Logged In: YES 
user_id=32974

Re: parse_multipart: yes, it looks like there's no use
fixing that, as it just turns around and puts each line into a
list.  It is vulnerable, but because it doesn't use a
tempfile it appears doomed for large requests anyway.  I
don't know of anything that uses it.

Good catch on the boundary recognition bug; I'm uploading
another patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2005-03-31 17:13

Message:
Logged In: YES 
user_id=6380

Methinks that the fix isn't quite right: it would
incorrectly recognize as a boundary a very long line
starting with "--" followed by the appropriate random string
at offset 2**16. This could probably be taken care of by
adding a flag that is true initially and after that keeps
track of whether the previous line ended in \n.
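
A minimal sketch of that flag, assuming a hypothetical helper
"copy_until_boundary" and a deliberately tiny limit so the problem case is
easy to reproduce; it is not the code from the patch.  A chunk is treated
as a boundary only when the previous chunk ended in a newline (or nothing
has been read yet).

```python
import io

def copy_until_boundary(fp, out, boundary, limit=1 << 16):
    """Copy fp to out until a real boundary line is seen (hypothetical helper)."""
    delimiter = b"--" + boundary
    at_line_start = True  # true initially; afterwards, did the last chunk end in \n?
    while True:
        chunk = fp.readline(limit)
        if not chunk:
            break
        # Only a chunk that begins a line may be a boundary or closing boundary.
        if at_line_start and chunk.rstrip() in (delimiter, delimiter + b"--"):
            break
        out.write(chunk)
        at_line_start = chunk.endswith(b"\n")

# With limit=16, the first "--XYZ" lands exactly where a bounded read resumes
# in the middle of a long line, so it must not be taken for a boundary.
data = b"x" * 16 + b"--XYZ\n" + b"tail\n" + b"--XYZ\n" + b"after\n"
out = io.BytesIO()
copy_until_boundary(io.BytesIO(data), out, b"XYZ", limit=16)
print(out.getvalue())
```

Without the flag, the copy would stop after the first 16 bytes and lose
the rest of the part.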

Also, there's a call to fp.readline() in parse_multipart()
that you didn't patch -- it wouldn't help because that code
is saving the lines in a list anyway, but isn't that code
vulnerable as well? Or is it not used?

----------------------------------------------------------------------
