Change in cgi module's handling of POST requests
bkline at rksystems.com
Fri Feb 13 05:43:48 CET 2009
Joshua Kugler wrote:
>> We just upgraded Python to 2.6 on some of our servers and a number of our
>> CGI scripts broke because the cgi module has changed the way it handles
>> POST requests. When the 'action' attribute was not present in the form
>> element on an HTML page the module behaved as if the value of the
>> attribute was the URL which brought the user to the page with the form,
>> but without the query (?x=y...) part.
> This does not make sense. Can you give an example?
Sure. Here's a tiny repro script:
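import cgi, xml.sax.saxutils
# Quote the value as an HTML attribute; empty string when no value was sent.
def quote(me): return me and xml.sax.saxutils.quoteattr(str(me)) or ''
# Python 2 print statement: CGI header, blank line, then the form.
print """Content-type: text/html

<html><body><form method='post'><input name='x' value=%s>
<input type='submit'></form></body></html>""" % quote(cgi.FieldStorage().getvalue('x'))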
#################### end of repro script ########################
Try it out on this pre-2.6 Python page:
When the page comes up, click Submit. Click it several times. No
change in the content of the text field, which is populated when the
page first comes up from the GET request's URL, and then subsequently
from the POST request's parameters.
For comparison, here's the equivalent Perl page, which behaves the same way:
Or PHP; again, same behavior, no matter how many times you click the
Submit button:
Now try the Python script above from a server where Python has been
upgraded to version 2.6:
Notice that when you click on the Submit button, the field is populated
with the string representation of the list which FieldStorage.getvalue()
returns. Each time you click the submit button you'll see the effect
recursively snowballing. This is exactly the same script as the one
behind the first URL above, byte for byte.
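
The snowballing can be reproduced without a web server. What follows is only a rough sketch, not the cgi module's actual code: it uses Python 3's urllib.parse, the helper names getvalue_merged and simulate_submits are made up for illustration, and the order in which body and URL values are combined is an assumption.

```python
from urllib.parse import parse_qs, urlencode

def getvalue_merged(query_string, post_body, name):
    """Look up one field across both the POST body and the URL (hypothetical helper)."""
    body_values = parse_qs(post_body).get(name, [])
    url_values = parse_qs(query_string).get(name, [])
    values = body_values + url_values  # assumed order: body first, then URL
    if not values:
        return None
    # One value comes back as a string, several come back as a list.
    return values[0] if len(values) == 1 else values

def simulate_submits(query_string, name, clicks):
    """Return the text shown in the field after the first GET and after each submit."""
    shown = [str(getvalue_merged(query_string, "", name))]  # initial GET, no body
    for _ in range(clicks):
        # The browser posts whatever the field currently shows, to the same URL.
        post_body = urlencode({name: shown[-1]})
        shown.append(str(getvalue_merged(query_string, post_body, name)))
    return shown

history = simulate_submits("x=abc", "x", 2)
for text in history:
    print(text)
```

Each pass wraps the previous text in another list alongside the value still sitting in the URL, so the field contents grow with every click.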
>> Now FieldStorage.getvalue() is
>> giving the script a list of two copies of the value for some of the
>> parameters (folding in the parameters from the previous request) instead
>> of the single string it used to return for each.
> There is a function call to get only one value. I think it's get_first() or
> some such.
That's true, but risky. I have no guarantee that the value entered by
the user on the form will be first on the list. I might instead get the
initial value carried over from the URL which brought up the form to
begin with. We're working around the problem by modifying the broken
scripts to explicitly set the action attributes.
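
A different, server-side way to sidestep the ambiguity (not the workaround we actually applied, which was setting the action attribute) would be to read fields from the request body only when the request is a POST. This is a minimal sketch in Python 3 using urllib.parse instead of the cgi module; request_fields and single_value are hypothetical names, and the caller is assumed to have already read and decoded the body.

```python
from urllib.parse import parse_qs

def request_fields(environ, body_text):
    """Return {name: [values]} from the body for POST, from the URL otherwise."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        source = body_text  # ignore QUERY_STRING entirely on POST
    else:
        source = environ.get("QUERY_STRING", "")
    return parse_qs(source, keep_blank_values=True)

def single_value(fields, name, default=None):
    """Return the first value for a field, or the default when it is absent."""
    values = fields.get(name)
    return values[0] if values else default

get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "x=from-url"}
post_env = {"REQUEST_METHOD": "POST", "QUERY_STRING": "x=from-url"}

print(single_value(request_fields(get_env, ""), "x"))
print(single_value(request_fields(post_env, "x=typed+by+user"), "x"))
```

With this approach the value left over in the URL can never displace what the user typed, whatever order a merged list might have used.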
> Well, the CGI module hasn't had many changes. There was this bug fix a few
> months back:
Looks like that was where it happened.
> It is possible that by fixing a bug, they brought the behavior in line with
> what it *should* be.
That's certainly possible. I'm not contending that Perl and PHP and the
previous versions of Python all got it right and the new Python is
wrong. It could very well be the other way around. But my
expectation, based on what I've seen happen over the years with other
proposed changes to the language and the libraries, was that there would
have been some (possibly extended) discussion of the risks of breaking
existing code, and the best way to phase in the change with as little
sudden breakage as possible. I haven't been able to find that
discussion, and I was hoping some kind soul would point me in the right
direction.
> Or maybe the browser behavior changed?
Clearly not, as you will see by using the same browser to try out the
URLs above. If you look at the HTML source when the page first comes up
for each of the scripts, you'll see it's the same. It's the behavior on
the server (that is, in the Python library module) which changes.
> The server
> does not care about an "action" attribute. That only tells the browser
> where to send the data.
Well, that's a pretty good formulation of the conclusion you would come
to based on the behavior of all of Perl, PHP, and (pre-2.6) Python. And
intuitively, that's how one (or at least I) would expect things to
work. The parameters in the original URL are appropriately used to seed
initial values in the form when the form is invoked with a GET request,
but after that point it's hard to see them as anything but history. But
that's not how the new version of the cgi module is behaving. It's
taking the parameters it finds in the original URL, which it gets
from the environment's QUERY_STRING variable, and folding them in with
the fields it parses from the POST request's body.
> It is possible the browser did not properly format
> a request when there was no "action" attribute.
When the 'action' attribute is not present in the form element, the
browser implicitly assigns it the value of the original URL which first
brought up the page with the form. This browser behavior has not
changed. It's doing the same thing no matter which version of which
language and libraries are used to implement the CGI script (it has no
idea what those are). Nor, as far as I have been able to determine, is
this behavior dependent on which (version of which) browser you're using.
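
To illustrate why the query string is still present on the POST, here is a small sketch using Python 3's urllib.parse.urljoin as a stand-in for the browser's URL resolution. The page URL is hypothetical, and treating a missing action as an empty relative reference is an assumption about how browsers handle it.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical URL that first brought up the page with the form.
page_url = "http://example.com/cgi-bin/form.py?x=abc"

# No action attribute: modelled here as an empty relative reference.
implicit_target = urljoin(page_url, "")
# Explicit action='form.py': resolved against the page URL, query dropped.
explicit_target = urljoin(page_url, "form.py")

print(implicit_target)
print(explicit_target)
print(repr(urlsplit(implicit_target).query))
print(repr(urlsplit(explicit_target).query))
```

In the first case the POST goes to a URL that still carries x=abc, so QUERY_STRING is non-empty on the server; with the explicit action it is empty.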
> Can you provide more details?
I think we should have enough specifics with what I've provided above to
make it clear what's happening, but if you can think of anything I've
left out which you think would be useful, let me know and I'll try to
provide it. I haven't yet finished my attempts to parse the relevant RFCs; I
assumed that the original authors and maintainers of this module (who
include the BDFL himself) would have been more adept at that than I
am, which is one of the reasons I was hoping to find some discussion in
the mailing list archives of the proposed change in the module's
behavior.