[Tutor] reading POST method in cgi script

Eric Abrahamsen eric at abrahamsen.com
Tue Oct 16 09:28:50 CEST 2007


Thanks for the detailed help. I'm dumping sys.stdin into a variable  
at the very start now, and operating on that variable for everything  
else, and all is well.

Thanks again,
E


On Oct 16, 2007, at 12:22 PM, Luke Paireepinart wrote:

> Eric Abrahamsen wrote:
>> I'm trying to learn the fundamentals of cgi scripting, before moving
>> on to actually using the cgi module and, eventually, mod_python. I've
>> grasped that, if users submit a form to my cgi script, I can read the
>> form contents from os.environ['QUERY_STRING'] if it was stuck in the
>> URL via method=get, and from sys.stdin if it was sent via method=post.
>>
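The two input paths described above can be sketched roughly as follows. This is a minimal Python 3 sketch, not the script from this thread (which is Python 2). The helper name read_form_data is hypothetical, and it assumes the web server sets the standard CGI variables REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH.

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return submitted form fields as a dict of lists (hypothetical helper)."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # POST: the form body arrives on standard input; read it exactly once.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the fields are encoded in the URL's query string.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


# Simulated requests, so the sketch runs without a web server.
get_fields = read_form_data(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "word=hello"}, io.StringIO("")
)
post_fields = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "10"}, io.StringIO("word=hello")
)
print(get_fields)
print(post_fields)
```

In a real script the two arguments would presumably be os.environ and sys.stdin.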
>> The get method works fine, but when it comes to post I can't
>> actually read anything off sys.stdin.
> Yeah, you can.
>> What's particularly odd is that in my test script below, the
>> returnform() function should only be called if
>> len(sys.stdin.read())!=0, and yet when I submit a word and the script
>> runs returnform(), it tells me that len(sys.stdin.read()) is equal to
>> 0. If that's the case, how did returnform() get called in the first
>> place? Why didn't it just re-run printform()?
>>
> Because you're confused about what sys.stdin.read() does.
> Imagine stdin is a string:
> "Hello, Eric!"
> Now if I call
> sys.stdin.read(5)
> it will return
> "Hello"
> and what is left to read from sys.stdin will now be
> ", Eric!"
> or, to be more accurate, the current-position pointer will be set
> to offset 5, so that future reads will start at that position.
>
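A minimal Python 3 sketch of the read-pointer behaviour just described. io.StringIO stands in for sys.stdin here, which is an assumption made only so the example can run on its own.

```python
import io

# io.StringIO stands in for sys.stdin (an assumption for this demo).
stream = io.StringIO("Hello, Eric!")

first = stream.read(5)    # reads 5 characters and advances the position
position = stream.tell()  # current offset into the stream
rest = stream.read()      # reads everything from that offset to the end
again = stream.read()     # nothing is left, so this is an empty string

print(repr(first), position, repr(rest), repr(again))
```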
> To understand why they would do it this way, suppose you want to
> read in 5 characters at a time from a file.
> Say the file is 6 GB.  If you were to read in the whole file and
> then loop over it 5 characters at a time,
> you'd probably run out of memory.
> However, if you tell the OS you want to open the file for reading,
> and you then read 5 characters at a time from the file,
> you won't have to load everything into memory.  This is
> accomplished using a position pointer inside the file, which records
> where you last read.
>
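Reading a few characters at a time might look like the sketch below (Python 3). The helper name read_in_chunks is hypothetical, and a short in-memory stream stands in for the 6 GB file.

```python
import io


def read_in_chunks(fileobj, size=5):
    """Yield successive pieces of at most `size` characters (hypothetical helper)."""
    while True:
        chunk = fileobj.read(size)  # each call continues where the last one stopped
        if not chunk:               # an empty string means end of file
            break
        yield chunk


# A small in-memory stream stands in for the large file.
chunks = list(read_in_chunks(io.StringIO("Hello, Eric!"), 5))
print(chunks)
```

Only one chunk is held in memory at a time, whatever the size of the underlying file.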
> Even though sys.stdin in this case is not a file, it works the same
> way, because it's a "File-Like Object."
> Python uses Duck Typing, wherein the feature set of a particular
> item determines its type, rather than something arbitrary you define.
> So if any item has read(), write(), and seek() methods, you can
> usually use it in place of a file object.
> This is true no matter what those methods actually do.
> In other words, your read() method could just count how many
> times it's called, but do nothing with the value, or anything else
> you may want it to do.
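The call-counting read() mentioned above could be sketched like this (Python 3). CountingReader and body_length are hypothetical names made up for the illustration.

```python
class CountingReader:
    """Hypothetical file-like object whose read() only counts its calls."""

    def __init__(self):
        self.calls = 0

    def read(self, size=-1):
        self.calls += 1
        return ""  # always reports "no data"


def body_length(stream):
    # Works with anything that has a read() method, real file or not.
    return len(stream.read())


reader = CountingReader()
body_length(reader)
body_length(reader)
print(reader.calls)
```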
>> At first I tried printing each line of sys.stdin to the HTML page,
>> so I could see the details of how post works. Nothing was
>> printed, and that's when I tried using len() to see whether
>> sys.stdin contained anything. Then, thinking that stdin was
>> getting reset somehow,
> It's not getting reset, really.
>> I tried calling returnform() and directly passing in stdin as a
>> parameter. That had the same result.
> Usually when you get to the point where you're trying random things  
> hoping something will work,
> you've reached the time when you either sleep on it or ask someone  
> for help.
> Even if you reach a solution through exhaustive testing, you still  
> haven't learned anything, which is pretty useless to you in the  
> long run.
>> Now, I'm thoroughly confused...
>>
> Don't worry about it.  It's good you asked.
>> Any help would be much appreciated.
>>
>> Yours,
>> Eric
>>
>> ************
>>
>> #!/Library/Frameworks/Python.framework/Versions/Current/bin/python
>> import sys
>> import cgitb;cgitb.enable()
>>
>> def main():
>>      if len(sys.stdin.read())!=0:
>>
> This is your problem right here.
> len(sys.stdin.read()) is calling len() on the string returned by
> sys.stdin.read().
> As mentioned earlier, calling read() moves the current-location
> pointer to just past the last character read.
> Since you passed no maximum to read(), the whole of stdin was read
> and returned as a single string, leaving the pointer at the end of
> the file.  Thus on subsequent calls, sys.stdin.read() will return
> an empty string.
> Try changing this to len(sys.stdin.read(3)) and then pass your
> program a string longer than 3 characters.
> The length reported in returnform should then be 3 less than the
> full length of the input, or 0, whichever is greater.
>>          returnform()
>>      else:
>>          printform()
>>
>> def printform():
>>      [snip printing form]
>>
>> def returnform():
>>
> This name is a bit confusing, because the function doesn't return  
> anything.
> Perhaps display_form_contents would be a better name?
>>      print "Content-Type: text/html\n\n"
>>
> You could include this in the block print statement, if you  
> wanted.  It's fine to keep the header separate, though,
> and probably a good idea.
>>      print """
>>      <html>
>>      <head></head>
>>      <body>
>>      <p>Here's what results for standard in:</p>"""
>>      print "<p>Length of stdin is %s</p>" % len(sys.stdin.read())
>>
> So here when you're reading the length of sys.stdin.read() it's
> reading in from stdin again,
> and since the pointer is at the end of the file, len() is getting
> an empty string.
> With a regular file you could reset the pointer to 0 using seek(),
> but standard input fed by a web server is normally a pipe, which
> can't be rewound, so I'd say just read the data once,
> in your main, and pass it as an argument to returnform.
>
> That makes your returnform function more versatile, as well,  
> because then if I wanted to have a different form's contents
> printed that I had saved in a file, for example, I could just pass  
> it to your function, without having to jump through the hoops
> of redirecting sys.stdin to my file.
>
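A sketch of the "read once, pass it along" arrangement, written for Python 3 rather than the Python 2 of the original script. The functions return strings instead of printing so they are easy to try out; that choice, and the placeholder text standing in for the snipped printform(), are assumptions of this sketch.

```python
import io


def display_form_contents(data):
    """Build the page from data that has already been read."""
    return (
        "Content-Type: text/html\n\n"
        "<html><head></head><body>\n"
        "<p>Length of stdin is %s</p>\n"
        "</body></html>" % len(data)
    )


def main(stdin):
    data = stdin.read()  # read standard input exactly once
    if data:
        return display_form_contents(data)
    return "form goes here"  # placeholder for printform(), snipped in the original


# io.StringIO stands in for sys.stdin so the sketch runs anywhere.
page = main(io.StringIO("word=hello"))
print(page)
```

Because display_form_contents() takes plain data, it can just as well be handed the contents of a saved file.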
> Also I doubt it'd be a problem to have a block-printed format string.
> I.E. just do it this way:
>
> print """
> <html>
> <head></head>
> <body>
> <p>Here's what results for standard in:</p>
> <p>Length of stdin is %s</p>
> </body>
> </html>"""  % len(sys.stdin.read())
>
>
> Perhaps this seems less intuitive to you. It's really just a  
> preference thing, but it makes for slightly more readable code in  
> my opinion.
>> main()
> A neat trick is to do this:
>
> if __name__ == "__main__":
>    main()
>
> The __name__ variable is set to "__main__" only when the script is
> executed directly, for example when I run form.py itself.
> However, if I write my own python program, and I want to use your
> printform() function, I can do
> import form
> which makes your functions available through the form module, and I
> can then do
> form.printform()
> If you don't have the above 2 lines, then main() will run every time
> I import your form.py module, even if I really just wanted to use
> one of your functions, which in almost every case is an "unwanted
> side-effect."
>
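Put together, a module laid out this way might look like the sketch below (Python 3). The body of printform() is a placeholder, since the real form was snipped from the original script.

```python
def printform():
    # Placeholder markup; the real form was snipped in the original post.
    return "<form>...</form>"


def main():
    print(printform())


# Runs only when this file is executed directly, not when it is imported.
if __name__ == "__main__":
    main()
```

Saved as form.py, this lets another program do "import form" and call form.printform() without main() running as a side effect.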
> Hope this has helped in some way,
> -Luke
>


