[Tutor] reading POST method in cgi script
Luke Paireepinart
rabidpoobear at gmail.com
Tue Oct 16 06:22:49 CEST 2007
Eric Abrahamsen wrote:
> I'm trying to learn the fundamentals of cgi scripting, before moving
> on to actually using the cgi module and, eventually, mod_python. I've
> grasped that, if users submit a form to my cgi script, I can read the
> form contents from os.environ['QUERY_STRING'] if it was stuck in the
> URL via method=get, and from sys.stdin if it was sent via method=post.
>
> The get method works fine, but when it comes to post I can't actually
> read anything off sys.stdin.
Yeah, you can.
> What's particularly odd is that in my
> test script below, the returnform() function should only be called if
> len(sys.stdin.read())!=0, and yet when I submit a word and the script
> runs returnform(), it tells me that len(sys.stdin.read()) is equal to
> 0. If that's the case, how did returnform() get called in the first
> place? Why didn't it just re-run printform()?
>
Because you're confused about what sys.stdin.read() does.
Imagine stdin is a string:
"Hello, Eric!"
Now if I call
sys.stdin.read(5)
it will return
"Hello"
and what is left for the next sys.stdin.read() will be
", Eric!"
Or, to be more accurate, the current-position pointer will be set to
offset 5, so that future reads will start at that position.
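Here is a small sketch of that pointer behaviour. It uses io.StringIO as a
stand-in for stdin (my assumption, purely so the example runs anywhere; in a
real CGI script the web server feeds sys.stdin):

```python
import io

# io.StringIO is only a stand-in for sys.stdin in this sketch.
fake_stdin = io.StringIO(u"Hello, Eric!")

first = fake_stdin.read(5)   # reads 5 characters and advances the pointer
rest = fake_stdin.read()     # reads from the pointer to the end of the stream
again = fake_stdin.read()    # nothing is left, so this is an empty string

print(repr(first))
print(repr(rest))
print(repr(again))
```

The third read is the situation your script ends up in: the data was already
consumed by an earlier read().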
To understand why they would do it this way, consider you want to read
in 5 characters at a time from a file.
Say the file is 6 GB. If you were to read in the whole file and then loop
over it 5 characters at a time,
you'd probably run out of memory.
However, if you tell the OS you want to open the file for reading, and
you then read 5 characters at a time from the file,
you won't have to load everything into memory. This is accomplished
using pointers inside the file, so you know where you
last read.
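A rough sketch of reading a stream a few characters at a time. The helper
name read_in_chunks is made up for this example, and io.StringIO again stands
in for a real (possibly huge) file:

```python
import io

def read_in_chunks(stream, size=5):
    """Yield successive pieces of at most `size` characters from `stream`."""
    while True:
        chunk = stream.read(size)
        if not chunk:        # an empty string means end of stream
            break
        yield chunk

# Stand-in for a large file; only one chunk is held in memory at a time.
demo = io.StringIO(u"Hello, Eric!")
chunks = list(read_in_chunks(demo))
print(chunks)
```

Each call to read(size) picks up where the previous one stopped, which is
exactly the pointer behaviour described above.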
Even though sys.stdin in this case is not a file, it works the same way,
because it's a "File-Like Object."
Python uses Duck Typing, wherein the methods an object provides
determine how it can be used, rather than the class it was declared as.
So if any object has read(), write(), and seek() methods, you can usually
use it in place of a file object.
This is true no matter what those methods actually do.
In other words, your read() method could just count how many times
it's called, but do nothing with the value, or anything else
you may want it to do.
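To make the duck typing point concrete, here is a sketch with a made-up class
(CountingReader and stream_length are hypothetical names, not anything from
the standard library) whose read() only counts its calls:

```python
import io

class CountingReader(object):
    """Hypothetical file-like object: read() just counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def read(self, size=-1):
        self.calls += 1
        return ""            # never returns any data

def stream_length(stream):
    # Works with anything that has a read() method, file or not.
    return len(stream.read())

reader = CountingReader()
print(stream_length(io.StringIO(u"abc")))   # a real file-like object
print(stream_length(reader))                # our counting stand-in
print(reader.calls)
```

stream_length() never checks what kind of object it was given; it only
relies on read() being there.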
> At first I tried printing each line of sys.stdin to the HTML page, so
> I could see the details of how post works. Nothing was printed, and
> that's when I tried using len() to see whether sys.stdin contained
> anything. Then, thinking that stdin was getting reset somehow,
It's not getting reset, really.
> I
> tried calling returnform() and directly passing in stdin as a
> parameter. That had the same result.
Usually when you get to the point where you're trying random things
hoping something will work,
you've reached the time when you either sleep on it or ask someone for help.
Even if you reach a solution through exhaustive testing, you still
haven't learned anything, which is pretty useless to you in the long run.
> Now, I'm thoroughly confused...
>
Don't worry about it. It's good you asked.
> Any help would be much appreciated.
>
> Yours,
> Eric
>
> ************
>
> #!/Library/Frameworks/Python.framework/Versions/Current/bin/python
> import sys
> import cgitb;cgitb.enable()
>
> def main():
>     if len(sys.stdin.read())!=0:
>
This is your problem right here.
len(sys.stdin.read()) is calling len() on the string returned by
sys.stdin.read().
As mentioned earlier, calling read() moves the current-location
pointer to just past the last character read.
Since you passed no maximum to read(), the whole of stdin was read
and returned as one string, leaving
the pointer at the end of the file. Thus on subsequent calls,
sys.stdin.read() will return an empty string.
Try changing this to len(sys.stdin.read(3)) and then pass your program a
string longer than 3 characters.
Your length output in returnform should then be 3 less than your
expected value, or 0, whichever is greater.
>         returnform()
>     else:
>         printform()
>
> def printform():
> [snip printing form]
>
> def returnform():
>
This name is a bit confusing, because the function doesn't return anything.
Perhaps display_form_contents would be a better name?
>     print "Content-Type: text/html\n\n"
>
You could include this in the block print statement, if you wanted.
It's fine to keep the header separate, though,
and probably a good idea.
>     print """
> <html>
> <head></head>
> <body>
> <p>Here's what results for standard in:</p>"""
>     print "<p>Length of stdin is %s</p>" % len(sys.stdin.read())
>
So here when you're reading the length of sys.stdin.read() it's reading
in from stdin again,
and since the pointer is at the end of the file, len() is getting an
empty string.
On a seekable stream you could reset the pointer to 0 using seek(), but
stdin fed from a pipe usually can't be rewound, so I'd
say just read the data once,
in your main, and pass it as an argument to returnform.
That makes your returnform function more versatile, as well, because
then if I wanted to have a different form's contents
printed that I had saved in a file, for example, I could just pass it to
your function, without having to jump through the hoops
of redirecting sys.stdin to my file.
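A minimal sketch of that "read once, pass it along" structure. It uses the
display_form_contents name suggested earlier; the stream parameter, the
placeholder for the empty form, and the "word=hello" test data are my own
assumptions for illustration:

```python
import io
import sys

def display_form_contents(data):
    """Build the report from form data that has already been read."""
    return "<p>Length of stdin is %s</p>" % len(data)

def main(stream=None):
    if stream is None:
        stream = sys.stdin
    data = stream.read()      # read exactly once and keep the result
    if len(data) != 0:
        return display_form_contents(data)
    # Placeholder only: the real script would print the form here.
    return "(the empty form would be printed here)"

# Stand-ins for POST data, so the sketch runs without a web server.
print(main(io.StringIO(u"word=hello")))
print(main(io.StringIO(u"")))
```

Because display_form_contents() takes the data as an argument, it works just
as well on a string loaded from a saved file.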
Also I doubt it'd be a problem to have a block-printed format string.
I.E. just do it this way:
print """
<html>
<head></head>
<body>
<p>Here's what results for standard in:</p>
<p>Length of stdin is %s</p>
</body>
</html>""" % len(sys.stdin.read())
Perhaps this seems less intuitive to you. It's really just a preference
thing, but it makes for slightly more readable code in my opinion.
> main()
A neat trick is to do this:
if __name__ == "__main__":
    main()
The __name__ variable is set to "__main__" only when the script is
executed directly,
for example when I run form.py from the command line.
However, if I write my own python program, and I want to use your
printform() function, I can do
import form
which will bring your module into my namespace under the name form, and I can then do
form.printform()
If you don't have the above 2 lines, then when I import your form.py
module, main() will be run every time,
causing all the main() stuff to be executed even if I really just wanted
to use one of your functions, which in almost every case is an "unwanted
side-effect."
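Put together, a form.py laid out this way might look like the sketch below.
The body of printform() is a placeholder, since I don't know what your real
form looks like:

```python
def printform():
    # Placeholder body; the real function would produce the HTML form.
    return "(form HTML goes here)"

def main():
    print(printform())

# Runs main() only when this file is executed directly,
# not when another program does "import form".
if __name__ == "__main__":
    main()
```

Another script can then import form and call form.printform() without
main() running as a side effect.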
Hope this has helped in some way,
-Luke