[Tutor] reading POST method in cgi script

Luke Paireepinart rabidpoobear at gmail.com
Tue Oct 16 06:22:49 CEST 2007


Eric Abrahamsen wrote:
> I'm trying to learn the fundamentals of cgi scripting, before moving  
> on to actually using the cgi module and, eventually, mod_python. I've  
> grasped that, if users submit a form to my cgi script, I can read the  
> form contents from os.environ['QUERY_STRING'] if it was stuck in the  
> URL via method=get, and from sys.stdin if it was sent via method=post.
>
> The get method works fine, but when it comes to post I can't actually  
> read anything off sys.stdin. 
Yeah, you can.
> What's particularly odd is that in my  
> test script below, the returnform() function should only be called if  
> len(sys.stdin.read())!=0, and yet when I submit a word and the script  
> runs returnform(), it tells me that len(sys.stdin.read()) is equal to  
> 0. If that's the case, how did returnform() get called in the first  
> place? Why didn't it just re-run printform()?
>   
Because you're confused about what sys.stdin.read() does.
Imagine stdin is a string:
"Hello, Eric!"
now if I call
sys.stdin.read(5)
it will return
"Hello"
and the next call to sys.stdin.read() will return
", Eric!"
or, to be more accurate, the current-position pointer will be set to 
offset 5, so that future reads will start at that position.
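
Here is a minimal sketch of that behaviour. It assumes Python 3 (so 
print is a function, unlike in your script) and uses io.StringIO as a 
stand-in for sys.stdin, since it offers the same read() interface:

```python
import io

# io.StringIO stands in for sys.stdin; the variable names are illustrative.
stream = io.StringIO("Hello, Eric!")

first = stream.read(5)   # consumes the first 5 characters
rest = stream.read()     # no argument: reads everything that is left
again = stream.read()    # nothing is left, so this is an empty string

print(repr(first))   # 'Hello'
print(repr(rest))    # ', Eric!'
print(repr(again))   # ''
```

The third read is the interesting one: once the data has been consumed, 
every later read() comes back empty.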

To understand why it works this way, suppose you want to read 
5 characters at a time from a file.
Say the file is 6 GB.  If you were to read in the whole file and then 
loop over it 5 characters at a time,
you'd probably run out of memory.
However, if you tell the OS you want to open the file for reading, and 
you then read 5 characters at a time from the file,
you won't have to load everything into memory.  This works because the 
open file keeps a current-position pointer, so it knows where you
last read.
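
A rough sketch of reading in small pieces, again assuming Python 3. The 
helper name read_in_chunks is made up for illustration, and io.StringIO 
stands in for a large file opened with open():

```python
import io

def read_in_chunks(stream, size=5):
    """Yield successive pieces of at most `size` characters from stream."""
    while True:
        chunk = stream.read(size)
        if not chunk:        # an empty string means there is no more data
            break
        yield chunk

# Only one chunk is held in memory at a time inside the loop;
# list() is used here just to show all the pieces at once.
chunks = list(read_in_chunks(io.StringIO("Hello, Eric!")))
print(chunks)   # ['Hello', ', Eri', 'c!']
```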

Even though sys.stdin in this case is not a file on disk, it works the 
same way, because it's a "file-like object."
Python uses duck typing: what matters is which methods an object 
provides, not which class it was declared as.
So if an object has the methods your code calls (read(), write(), 
seek(), and so on), you can usually use it in place of a file object.
This is true no matter what those methods actually do.
In other words, your read() function could just count how many times 
it's called, but do nothing with the value, or anything else
you may want it to do.
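
To make that concrete, here is a small sketch (Python 3 assumed). The 
CountingReader class and the body_length function are hypothetical names 
invented for this example:

```python
import io

class CountingReader:
    """A made-up file-like object whose read() only counts its calls."""

    def __init__(self):
        self.calls = 0

    def read(self, size=-1):
        self.calls += 1
        return ""            # never actually produces any data

def body_length(stream):
    # Accepts anything with a read() method; its class is never checked.
    return len(stream.read())

reader = CountingReader()
body_length(reader)
body_length(reader)
print(reader.calls)                        # 2
print(body_length(io.StringIO("abc")))     # 3
```

body_length() never checks what kind of object it was given. It only 
needs read() to exist.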
> At first I tried printing each line of sys.stdin to the HTML page, so  
> I could see the details of how post works. Nothing was printed, and  
> that's when I tried using len() to see whether sys.stdin contained  
> anything. Then, thinking that stdin was getting reset somehow,
It's not getting reset, really.
>  I  
> tried calling returnform() and directly passing in stdin as a  
> parameter. That had the same result. 
Usually when you get to the point where you're trying random things 
hoping something will work,
you've reached the time when you either sleep on it or ask someone for help.
Even if you reach a solution through exhaustive testing, you still 
haven't learned anything, which is pretty useless to you in the long run.
> Now, I'm thoroughly confused...
>   
Don't worry about it.  It's good you asked.
> Any help would be much appreciated.
>
> Yours,
> Eric
>
> ************
>
> #!/Library/Frameworks/Python.framework/Versions/Current/bin/python
> import sys
> import cgitb;cgitb.enable()
>
> def main():
>      if len(sys.stdin.read())!=0:
>   
This is your problem right here.
len(sys.stdin.read()) calls len() on the string returned by 
sys.stdin.read().
As mentioned earlier, calling read() moves the current-position 
pointer to just past the last character read.
Since you passed no maximum to read(), the whole stdin contents were 
read, handed to len(), and then thrown away, leaving
the pointer at the end of the input.  Thus on subsequent calls, 
sys.stdin.read() will return an empty string.
Try changing this to len(sys.stdin.read(3)) and then pass your program a 
string longer than 3 characters.
Your length output in returnform should then be 3 less than your 
expected value, or 0, whichever is greater.
>          returnform()
>      else:
>          printform()
>
> def printform():
>      [snip printing form]
>
> def returnform():
>   
This name is a bit confusing, because the function doesn't return anything.
Perhaps display_form_contents would be a better name?
>      print "Content-Type: text/html\n\n"
>   
You could include this in the block print statement, if you wanted.  
It's fine to keep the header separate, though,
and probably a good idea.
>      print """
>      <html>
>      <head></head>
>      <body>
>      <p>Here's what results for standard in:</p>"""
>      print "<p>Length of stdin is %s</p>" % len(sys.stdin.read())
>   
So here when you're reading the length of sys.stdin.read() it's reading 
in from stdin again,
and since the pointer is at the end of the input, len() is getting an 
empty string.
With a real file you could reset the pointer to 0 using seek(), but 
stdin in a CGI script is normally a pipe, which can't be rewound, so 
I'd say just read the data once,
in your main, and pass it as an argument to returnform.

That makes your returnform function more versatile, as well, because 
then if I wanted to have a different form's contents
printed that I had saved in a file, for example, I could just pass it to 
your function, without having to jump through the hoops
of redirecting sys.stdin to my file.
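
A minimal sketch of that restructuring, assuming Python 3. The stream 
parameter on main() is an addition made so the example can be run 
without a web server, the function returns the HTML instead of printing 
it to keep the example short, and the placeholder text for the empty 
case is made up:

```python
import io
import sys

def display_form_contents(data):
    """Build the result page from data the caller has already read."""
    return "<p>Length of stdin is %d</p>" % len(data)

def main(stream=sys.stdin):
    data = stream.read()     # read exactly once and keep the result
    if data:
        return display_form_contents(data)
    return "<p>(the form would be printed here)</p>"

# "word=python" is a hypothetical POST body.
print(main(io.StringIO("word=python")))   # <p>Length of stdin is 11</p>
```

Because display_form_contents() takes the data as an argument, it can 
just as easily be handed a string read from a saved file.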

Also, I doubt it'd be a problem to have a block-printed format string,
i.e. just do it this way:

print """
<html>
<head></head>
<body>
<p>Here's what results for standard in:</p>
<p>Length of stdin is %s</p>
</body>
</html>"""  % len(sys.stdin.read())


Perhaps this seems less intuitive to you. It's really just a preference 
thing, but it makes for slightly more readable code in my opinion.
> main()
A neat trick is to do this:

if __name__ == "__main__":
    main()

The __name__ variable is set to "__main__" only when the script is 
executed directly,
for example when I run form.py myself.
However, if I write my own python program, and I want to use your 
printform() function, I can do
import form
which will bring your module into my namespace as form, and I can then do
form.printform()
If you don't have the above 2 lines, then when I import your form.py 
module, main() will be run every time,
causing all the main() stuff to be executed even if I really just wanted 
to use one of your functions, which in almost every case is an "unwanted 
side-effect."

Hope this has helped in some way,
-Luke

