[Python-Dev] Encoding detection in the standard library?

Bill Janssen janssen at parc.com
Wed Apr 23 18:16:55 CEST 2008


martin at v.loewis.de writes:
> > When a web browser POSTs data, there is no standard way of communicating
> > which encoding it's using.
> 
> That's just not true. Web browsers should and do use the encoding of the
> web page that originally contained the form.

I wonder if the discussion is confusing two different things.  Take a
look at
<http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.13.4>.

There are two prescribed ways of sending form data:
application/x-www-form-urlencoded, which can only be used with ASCII
data, and multipart/form-data.  ``The content type
"multipart/form-data" should be used for submitting forms that contain
files, non-ASCII data, and binary data.''

It's true that the page containing the form may specify which of these
two content types to use (via the form's enctype attribute), but the
character encoding is then determined by that choice.
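
To make the difference concrete, here is a rough Python sketch of what
the two content types put on the wire for the same field.  The helper
names, the field name, and the boundary string are made up for
illustration, and the per-part charset parameter is something the
multipart format allows, not something I'm claiming every browser
actually sends.

```python
from urllib.parse import urlencode


def encode_urlencoded(fields, charset="utf-8"):
    """Build an application/x-www-form-urlencoded body (hypothetical helper).

    Each value is converted to bytes in `charset` and percent-escaped, so
    the body is pure ASCII and carries no label saying which charset was used.
    """
    return urlencode(fields, encoding=charset).encode("ascii")


def encode_multipart(fields, boundary, charset="utf-8"):
    """Build a multipart/form-data body (hypothetical helper).

    Each field becomes its own part, and the part header can name the
    charset of the raw bytes that follow.
    """
    lines = []
    for name, value in fields.items():
        lines.append(b"--" + boundary.encode("ascii"))
        lines.append(
            ('Content-Disposition: form-data; name="%s"' % name).encode("ascii")
        )
        lines.append(
            ("Content-Type: text/plain; charset=%s" % charset).encode("ascii")
        )
        lines.append(b"")                     # blank line ends the part headers
        lines.append(value.encode(charset))   # raw, unescaped bytes
    lines.append(b"--" + boundary.encode("ascii") + b"--")
    lines.append(b"")
    return b"\r\n".join(lines)


if __name__ == "__main__":
    form = {"name": "caf\u00e9"}
    print(encode_urlencoded(form))
    print(encode_urlencoded(form, charset="latin-1"))
    print(encode_multipart(form, boundary="XYZ"))
```

With the urlencoded body the receiver sees only percent-escaped bytes
and has to know or guess which charset produced them; the multipart
part at least has a place to say so.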

> For web forms, I always encode the pages in UTF-8, and that always
> works.

Should work, if you use the "multipart/form-data" format.
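
On the receiving side, a minimal sketch of decoding such a body might
look like the following.  It leans on the standard email package to
split the parts; the function name and the sample body are invented for
the example, and falling back to UTF-8 when a part carries no charset
is an assumption that only holds if the page itself was served as
UTF-8.

```python
from email import message_from_bytes


def decode_form_fields(content_type, body):
    """Decode text fields from a multipart/form-data body (hypothetical helper).

    `content_type` is the request's Content-Type header value, including
    the boundary parameter; `body` is the raw request body as bytes.
    """
    # Prepend the header so the email parser can find the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = message_from_bytes(raw)
    if not msg.is_multipart():
        raise ValueError("not a multipart body")

    fields = {}
    for part in msg.get_payload():
        name = part.get_param("name", header="content-disposition")
        # Assumption: a part with no charset label uses the page's UTF-8.
        charset = part.get_content_charset() or "utf-8"
        fields[name] = part.get_payload(decode=True).decode(charset)
    return fields


if __name__ == "__main__":
    sample = (
        b"--XYZ\r\n"
        b'Content-Disposition: form-data; name="comment"\r\n'
        b"Content-Type: text/plain; charset=utf-8\r\n"
        b"\r\n"
        + "caf\u00e9".encode("utf-8")
        + b"\r\n--XYZ--\r\n"
    )
    print(decode_form_fields("multipart/form-data; boundary=XYZ", sample))
```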

Bill


