CGI and Unicode

Jim Hefferon jhefferon at smcvt.edu
Mon Jun 23 20:07:47 CEST 2003


Hello,

I have been struggling to get Unicode out of Python's cgi
module. A small script illustrating the problem is at the bottom,
but first I need to explain.

I want users to be able to send me material with a wide variety
of characters.

I understand from looking around the net (particularly at this
discussion:
http://216.239.39.100/search?q=cache:QbQ_esNHtswJ:mail.python.org/pipermail/python-dev/2002-April/023077.html+x-www-form-urlencoded+unicode&hl=en&ie=UTF-8
on the Python developers list) that the best I can hope for is
to serve the page containing the form in, say, UTF-8, and then
the submitted data should arrive at my site UTF-8 encoded.
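
To make the expectation above concrete, here is a minimal sketch
of how a submitted value would travel if the browser honors the
page encoding. It is written for a current Python 3 interpreter,
not the one the script at the bottom targets, and the field value
and POST body are made-up examples, not captured traffic.

```python
from urllib.parse import parse_qs, unquote_to_bytes

# Hypothetical POST body: the browser percent-escapes each UTF-8
# byte of the non-ASCII character (e-acute is the two bytes C3 A9).
body = "name=caf%C3%A9"

# Step 1: undo the percent-escapes. The result is bytes, not text.
raw = unquote_to_bytes("caf%C3%A9")

# Step 2: decode the bytes with the charset the form page declared.
text = raw.decode("utf-8")

# parse_qs performs both steps when told which encoding to use.
fields = parse_qs(body, encoding="utf-8")
```

The point of the sketch is only that the wire format carries
bytes; turning them into Unicode is a separate decoding step that
needs to know the charset.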

I think I have set the page's encoding to UTF-8 by following the
recommendation on http://www.w3.org/TR/REC-html/charset.html
about the META tag.

But when I check the type of the variable that I get from the
cgi module, it comes out as StringType, not UnicodeType.  My
browser is Galeon on the latest Debian, and I have also tested
with IE on NT.
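
For illustration, the distinction between the two types can be
sketched as follows. This assumes the value handed over by the
cgi module is a UTF-8 encoded byte string, which the type check
suggests but does not prove. The helper name as_text and the
sample value are hypothetical, not part of the cgi module.

```python
def as_text(value, charset="utf-8"):
    """Return value as Unicode text, decoding it if it is a byte string."""
    if isinstance(value, bytes):
        # Raises UnicodeDecodeError if the bytes are not valid in charset.
        return value.decode(charset)
    # Already Unicode text: hand it back unchanged.
    return value

# Hypothetical field value: "cafe" with an e-acute, as UTF-8 bytes.
raw = b"caf\xc3\xa9"
text = as_text(raw)
```

Indexing the byte string counts bytes, while indexing the decoded
text counts characters, so a test such as var[12] can give
different answers before and after decoding.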

What am I missing?  Thanks for any help,
Jim Hefferon

-------- test_cgi.py ----------------------------
#!/usr/bin/python -u
# test_cgi.py
# test CGI unicode issue
from types import *

import cgi
import cgitb
cgitb.enable()

# create the HTML document
print "Content-Type: text/html\n\n"
print "<html><head><title>CGI TEST</title>"
# the META tag belongs inside <head>; the string must stay on one line
print "<META http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">"
print "</head>\n\n"
print "<body bgcolor=\"white\">\n"

cgi_params=cgi.FieldStorage()
cgi_keys=cgi_params.keys()
try:
    var=cgi_params['name'].value
    if type(var) is UnicodeType:
        print "<p>The type of the variable is a Unicode</p>\n"
    elif type(var) is StringType:
        print "<p>The type of the variable is a regular string</p>\n"
    print "<p>Character 12 is %s</p>\n" % (var[12],)
except (KeyError, IndexError):
    # no 'name' field submitted yet, or the value has fewer than 13 characters
    pass

print "<form method=\"POST\" accept-charset=\"utf-8\">\n"
print "<input type=\"text\" name=\"name\">\n"
print "<input type=\"submit\">\n"
print "</form>\n"
print "</body></html>"



