[Python-Dev] Can the cgi module be made Unicode-aware?

Skip Montanaro skip@pobox.com
Fri, 12 Apr 2002 17:45:00 -0500


    >> I did some reading before nodding off last night.  The <form> tag
    >> takes an optional "accept-charset" attribute, which can be a list.

    Martin> No, it doesn't - that's a proprietary extension. Or, maybe I'm
    Martin> missing something: where did you find a statement that this is
    Martin> "official" in any sense?

w3.org recommendations:

    http://www.w3.org/TR/REC-html40/interact/forms.html

    >> Adding an "accept-charset" attribute to the <form> does appear to
    >> have some effect on Content-Type in some instances, but not in all.

    Martin> It might depend on the browser, since it's proprietary.

I question your assertion that it's a proprietary attribute, simply because
it is documented in the HTML 4.0 recommendation on w3.org.  The only two
browsers I tried it with (Mozilla 0.9.4 and Opera 6.0) both respect it,
though as I mentioned, Mozilla doesn't add the charset to the Content-Type
header of the form submission request.
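
To make the "use the header if present, otherwise let the programmer say"
idea concrete, here is a rough sketch.  It is not the cgi module's API: the
names form_charset and decode_urlencoded and the iso-8859-1 fallback are
assumptions made up for illustration, and it leans on the stdlib email and
urllib.parse modules rather than on cgi itself.

```python
from email.message import Message
from urllib.parse import parse_qs


def form_charset(content_type, fallback):
    """Return the charset parameter of a Content-Type value, else fallback."""
    if not content_type:
        return fallback
    msg = Message()
    msg["Content-Type"] = content_type
    # get_param returns the fallback when no charset parameter is present.
    return msg.get_param("charset", fallback)


def decode_urlencoded(body, content_type, fallback="iso-8859-1"):
    """Decode an application/x-www-form-urlencoded body (bytes) to str."""
    charset = form_charset(content_type, fallback)
    # The body is plain ASCII on the wire; the charset only matters once
    # the %XX escapes are expanded into characters.
    return parse_qs(body.decode("ascii"), encoding=charset)
```

The fallback argument is the piece the script author would supply when the
browser leaves the charset off, which is what Mozilla did in my tests.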

    >> The cgi programmer can't rely on charset information coming from the
    >> browser and will need a way to tell the cgi module what the charset
    >> of the incoming data is.  I think FieldStorage and MiniFieldStorage
    >> need optional charset parameters and I think the charset needs to be
    >> used from the Content-Type header, if present.

    Martin> Of course, if you also have uploaded files, this cannot work:
    Martin> the file data never follow the encoding - only the "text" fields
    Martin> do.

Well, yeah, but that's the multipart case, where each part could (or
should? must?) have its own Content-Type header.
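
A sketch of how per-part headers could be used, assuming the browser does
send them.  The function name decode_multipart and the default charset are
hypothetical, and the stdlib email parser stands in for whatever the cgi
module would do internally: text fields are decoded with the part's own
charset when one is given, and file uploads are left as bytes.

```python
from email.parser import BytesParser


def decode_multipart(content_type, body, default_charset="iso-8859-1"):
    """Map field names to str (text fields) or bytes (file uploads)."""
    # The email parser wants headers first, so prepend the request's
    # Content-Type (which carries the boundary) to the raw body.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = BytesParser().parsebytes(raw)
    fields = {}
    if not msg.is_multipart():
        return fields
    for part in msg.get_payload():
        name = part.get_param("name", header="content-disposition")
        data = part.get_payload(decode=True)  # always bytes
        if part.get_filename() is not None:
            # Uploaded file: never apply the form's text encoding.
            fields[name] = data
        else:
            # Text field: the part's own charset wins over the default.
            charset = part.get_content_charset(default_charset)
            fields[name] = data.decode(charset)
    return fields
```

Parts that carry no Content-Type at all fall back to default_charset, so the
caller still needs a way to state what the form was expected to send.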

Skip