New subject: Web servers, bytes, str, documentation, Python 3.2a4

Nov. 19, 2010

So maybe this is the wrong forum; if so, please tell me what the right 
forum is for each of the various pieces.  I'm assuming that I should 
file some bugs in the tracker, but I'm not exactly sure whether to file 
them on cgitb, http.server, or subprocess, or all of the above.  Pretty 
sure there are at least some in http.server, but maybe some of those 
will be considered "enhancement requests" since they are long 
outstanding in the predecessor code.

So I've been writing CGI scripts in Python behind Apache.  No framework, 
just raw CGI.

Got everything working on Python 2.6 (it's the newest that the hosting 
company has).  Whacked at 2.6's CGIHTTPServer.py until I got an 
environment that would actually run CGI programs in the same sort of way 
that Apache does, so I can test faster, locally.  Got the site working.  
Am happy.

Now I decided to tackle porting the code to Python 3, in hopes that 
someday the hosting company might have it, and to see what I could learn 
about the "Subject:" matters, and to altruistically see if 3.2a4 has a 
consistent story.  Um.  Well.  One of me, Python 3.2a4, or its 
documentation is missing something.  Maybe several somethings.

Here's some code to ponder.

import sys
import traceback
sys.stdout = open("sob", "wb")  # WSGI sez data should be binary, so stdout should be binary???
import cgitb
sys.stdout.write(b"out")
fhb = open("fhb", "wb")
cgitb.enable(0,"d:\temp")
fhb.write("abcdef")  # try writing non-binary to binary file.  Expect an error, of course.

Feed it to python32...

d:\temp>c:\python32\python.exe test11.py
Error in sys.excepthook:
TypeError: 'str' does not support the buffer interface

Original exception was:
Traceback (most recent call last):
   File "d:\my\py\test11.py", line 8, in <module>
     fhb.write("abcdef")  # try writing non-binary to binary file.  Expect an error, of course.
TypeError: 'str' does not support the buffer interface

So it seems that cgitb can't write to binary files to report the 
error?  Or how else should I interpret the "Error in sys.excepthook"?
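
The failure itself can be reproduced without cgitb.  The sketch below 
assumes (I have not confirmed this against the 3.2a4 source) that the 
hook passes a str to the write method of whatever sys.stdout is; 
io.BytesIO stands in for the binary stdout, and the variable names are 
only illustrative.

```python
import io

# Stand-in for a stdout that was opened in binary mode.
binary_out = io.BytesIO()

# Writing str to a binary stream raises TypeError, which is what
# a text-producing excepthook would run into.
try:
    binary_out.write("traceback text")
    str_write_failed = False
except TypeError:
    str_write_failed = True

# The same text is accepted once it has been encoded to bytes.
binary_out.write("traceback text".encode("utf-8"))

print(str_write_failed)
print(binary_out.getvalue())
```

So any reporter that produces str needs either a text-mode stream or an 
explicit encode step before the data reaches a binary one.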

So then I tweaked the code for cgitb's enjoyment:

import sys
import traceback
sys.stdout = open("sob", "w", encoding="UTF-8")  # WSGI sez data should be binary, so stdout should be binary???
import cgitb
sys.stdout.write("out")
fhb = open("fhb", "wb")
cgitb.enable(0,"d:\temp")
fhb.write("abcdef")  # try writing non-binary to binary file.  Expect an error, of course.

Now I get the following report in the stdout file:

out<!--: spam
Content-Type: text/html

<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> -->
<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> --> -->
</font> </font> </font> </script> </object> </blockquote> </pre>
</table> </table> </table> </table> </table> </font> </font> </font><p>A 
problem occurred in a Python script.

and the following error on the console:

d:\temp>c:\python32\python.exe test12.py
Error in sys.excepthook:
Traceback (most recent call last):
   File "c:\python32\lib\tempfile.py", line 209, in _mkstemp_inner
     fd = _os.open(file, flags, 0o600)
OSError: [Errno 22] Invalid argument

Original exception was:
Traceback (most recent call last):
   File "d:\my\py\test12.py", line 8, in <module>
     fhb.write("abcdef")  # try writing non-binary to binary file.  Expect an error, of course.
TypeError: 'str' does not support the buffer interface

I was expecting to see a whole cgitb report in sob, but no such luck.  
Not sure why it is trying to create a temporary file, but it seems to 
fail to do that.
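
A guess at the temporary file failure, not something verified on 
3.2a4: the cgitb documentation says that when logdir is given, reports 
are written to files in that directory, and the literal "d:\temp" in 
the test script is not a raw string, so "\t" is parsed as a TAB.  A 
directory name containing a TAB could plausibly produce "Invalid 
argument" on Windows.  The sketch only shows the string difference; 
the variable names are made up.

```python
# The logdir exactly as written in the test script: "\t" becomes a TAB.
logdir_as_written = "d:\temp"

# A raw string keeps the backslash and the letter "t" as two characters.
logdir_raw = r"d:\temp"

has_tab = "\t" in logdir_as_written
print(has_tab)
print(len(logdir_as_written), len(logdir_raw))
```

If that is the cause, passing r"d:\temp" or "d:/temp" should make the 
OSError go away, though I have not tried it.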

Of course, the next test would have been to write binary data into fhb 
and try to copy it to stdout, which would fail, because stdout has to 
not be binary to make cgitb work???

That brings me to http.server, the 3.2a4 replacement for CGIHTTPServer.  
There are definitely some improvements here, and some 
reported-but-yet-unfixed bugs.  And some pitiful missing features, 
especially on Windows.  I applied some of the whacks I had applied to 
CGIHTTPServer, and got some things working, but, per what I was trying 
to demonstrate above, there seems to be an incompatibility with the idea 
of using cgitb (which wants stdout open with some encoding provided) and 
serving binary files (which wants stdout open in binary) [this latter is 
supported by the WSGI spec too].
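
One way the two needs might coexist, offered as a sketch and not as 
what cgitb or http.server actually do: leave stdout as a text stream 
and send binary payloads through its underlying buffer attribute.  
Here an io.TextIOWrapper over io.BytesIO stands in for the real 
sys.stdout, and the header and payload are invented for illustration.

```python
import io

# Stand-in for the process's stdout: a binary stream with a text layer on top.
raw = io.BytesIO()
text_out = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

# Text producers (a traceback reporter, header printing) use the text layer.
text_out.write("Content-Type: application/octet-stream\n\n")
text_out.flush()  # push pending text down before using the binary layer

# Binary payloads bypass the encoder by going to the underlying buffer.
text_out.buffer.write(b"\x89PNG\r\n")
text_out.buffer.flush()

print(raw.getvalue())
```

In a real script the equivalent would be sys.stdout.write for text and 
sys.stdout.buffer.write for bytes, with a flush between the two so the 
output does not get reordered.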

So it seems to me that there are some problems.  Yet it seems that 
http.server can somehow accept the data sent by cgitb, which comes from 
the subprocess running my CGI script, but my CGI script fails to be 
able to copy a binary file to its stdout (a subprocess-created PIPE).  The 
subprocess documentation doesn't say what encoding is supplied to the 
PIPE-created handles, if any, but since cgitb data is accepted but 
binary file data is not, I infer it must be a non-binary handle, 
encoding unknown.  The subprocess documentation doesn't document any way 
to specify what encoding should be used on the PIPE-created handles, 
either.  So this isn't very enlightening.  In the absence of a 
specification or parameter, I would have expected the PIPEs to be 
binary, but this seems to be experimentally false.
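
For comparison, here is a small experiment on the parent side of a 
PIPE.  It is a sketch under the assumption that the child writes 
through sys.stdout.buffer; the child code is a made-up one-liner, not 
my CGI script.  With no text-mode option passed to Popen, the parent 
reads bytes.

```python
import subprocess
import sys

# Hypothetical child: writes raw bytes through the binary layer of its stdout.
child_code = (
    "import sys; "
    "sys.stdout.buffer.write(b'\\x00\\xff raw'); "
    "sys.stdout.buffer.flush()"
)

proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE)
data, _ = proc.communicate()

# The parent end of the pipe hands back bytes, untouched by any encoding.
print(type(data).__name__)
print(data)
```

This suggests the pipe itself carries bytes, and that any text layer 
lives at the ends of the pipe, inside each process.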

Yet http.server, when serving plain files, seems to open them in binary 
mode, and transfer them successfully to the browser.  And it can also 
accept the non-binary?? data from cgitb from my CGI script, and display 
it in the browser.  The former comes from a file it opens in binary 
mode, and the latter from the subprocess PIPE in unknown mode.

It seems that socketserver opens the socket in "wb" mode, and 
http.server encodes most data.  That, in turn, seems to imply that the 
binary data from SimpleHTTPServer files is reasonably returned, and I 
note the headers and such are explicitly encoded before being written 
to wfile... again, consistent with the socket, wfile, being in binary mode.
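
The pattern, as I read it, looks roughly like the sketch below.  This 
is not http.server's code: send_header here is a hypothetical helper, 
io.BytesIO stands in for wfile, and the latin-1 choice for headers is 
my assumption.

```python
import io

# Stand-in for the handler's wfile, which is a binary stream.
wfile = io.BytesIO()

def send_header(name, value):
    """Hypothetical helper: encode one header line and write it as bytes."""
    wfile.write(("%s: %s\r\n" % (name, value)).encode("latin-1"))

body = b"\x89PNG\r\n\x1a\n"  # binary payload, written without any encoding

wfile.write(b"HTTP/1.0 200 OK\r\n")
send_header("Content-Type", "image/png")
send_header("Content-Length", str(len(body)))
wfile.write(b"\r\n")  # blank line ends the headers
wfile.write(body)

print(wfile.getvalue())
```

Text is encoded at the last moment and file contents pass through 
unchanged, so one binary stream serves both.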

But the data coming back from the subprocess PIPE from my CGI script 
seems to be acceptable to be written to wfile also, implying that the 
PIPEs are binary, as the absence of specifications and parameters, and 
the knowledge that pipes are bytestreams, would lead one to expect.  But 
then, it would seem that the cgitb output should be in binary to get 
into the PIPE, but it seems that using a binary stdout makes cgitb fail, 
in the above experiment... and I can't find any code in cgitb that does 
explicit encoding.

So I'm confused, and it seems a little extra documentation might help 
decide which are the modules that have bugs or missing features, and 
which do not.

One of the cgitb outputs from my attempt to serve the binary file claims 
that my CGI script's output file (which comes from a subprocess PIPE) is 
a TextIOWrapper with encoding cp1252.  Maybe that is the default that 
comes when a new Python is launched, even though it gets a subprocess 
PIPE as stdout?
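
That guess can be checked with a small experiment.  The sketch assumes 
only that sys.executable can be launched as a child; the child code is 
an invented one-liner that reports what its own sys.stdout is.  The 
encoding it prints depends on the platform and locale, so it is 
displayed but not assumed.

```python
import subprocess
import sys

# Hypothetical child: report the type and encoding of its own sys.stdout.
child_code = (
    "import sys; "
    "print(type(sys.stdout).__name__); "
    "print(sys.stdout.encoding)"
)

out = subprocess.check_output([sys.executable, "-c", child_code])

# The parent gets bytes; the child's report is plain ASCII, so decode it.
stdout_type, stdout_encoding = out.decode("ascii").split()

print(stdout_type)
print(stdout_encoding)
```

If the child reports a TextIOWrapper even when its stdout is a pipe, 
then the encoding is chosen inside the child at startup and is not a 
property of the PIPE.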

Glenn Linderman