[Python-Dev] [Python-ideas] Proposed addition to urllib.parse in 3.1 (and urlparse in 2.7)
Bill Janssen
janssen at parc.com
Mon Apr 20 05:41:23 CEST 2009
Antoine Pitrou <solipsis at pitrou.net> wrote:
> Bill Janssen <janssen <at> parc.com> writes:
> >
> > ``The content type "application/x-www-form-urlencoded" is inefficient
> > for sending large quantities of binary data or text containing non-ASCII
> > characters.
>
> The fact that it's "inefficient" (i.e. takes more bytes than an optimal encoding
> scheme would) doesn't mean that it doesn't work.
Absolutely. I'm just quoting the spec to you. In any case, being able to send
multipart/form-data would be a nice thing to have, if only for file uploads.
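To put a rough number on "inefficient", here is a minimal sketch using urllib.parse.urlencode from Python 3. The helper name form_overhead and the field name "q" are made up for illustration; the point is only that each non-ASCII byte becomes a three-character %XX escape.

```python
from urllib.parse import urlencode

def form_overhead(value, encoding="utf-8"):
    """Return (raw_bytes, urlencoded_chars) for a single form value."""
    raw = value.encode(encoding)
    # "q" is a hypothetical field name; strip "q=" so only the value is counted.
    body = urlencode({"q": value}, encoding=encoding)
    return len(raw), len(body) - len("q=")

# ASCII text passes through unchanged; non-ASCII text triples in size.
for sample in ("abc", "\u00e9", "\u65e5\u672c\u8a9e"):
    print(repr(sample), form_overhead(sample))
```

multipart/form-data avoids this expansion by sending each part's bytes as they are, which is why it is the usual choice for file uploads.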
> Look out there, many Web pages specify a different character set than
> Latin-1... UTF8 is quite a common choice in the modern world.
Sure. But nowhere does a spec say that this page charset should be used
in sending the values of a FORM using application/x-www-form-urlencoded
in a new HTTP request. It's just a convention some browsers use.
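A small sketch of why this matters, assuming Python 3's urllib.parse: the same value produces different bytes on the wire depending on which charset the sender picked, and nothing in the urlencoded body tells the receiver which one it was. The sample value is arbitrary.

```python
from urllib.parse import quote_plus, unquote_plus

value = "caf\u00e9"  # "cafe" with an acute accent on the e

# The same text, percent-encoded under two different charsets.
as_latin1 = quote_plus(value, encoding="latin-1")
as_utf8 = quote_plus(value, encoding="utf-8")
print(as_latin1)  # caf%E9
print(as_utf8)    # caf%C3%A9

# A receiver that guesses the wrong charset silently gets the wrong text.
misread = unquote_plus(as_utf8, encoding="latin-1")
print(misread == value)  # False
```

A receiver has to rely on convention or out-of-band knowledge (for example, knowing which charset the page containing the form was served in) to decode correctly.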
> Also, browsers will encode those characters that cannot be encoded in the
> character set using HTML escapes ("&#1234;").
Sure, some browsers will. Others will apparently replace them with
question marks. It's undefined.
> This means you can enter any Unicode text into any form, regardless of the
> encoding of the source page. It's up to the Web application to decode the
> text, sure, but any decent Web framework or toolkit should do it for you.
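The two fallback behaviours discussed here can be imitated with Python's standard codec error handlers. This is only a sketch of the outcomes, not a model of any particular browser: "xmlcharrefreplace" yields a numeric character reference, and "replace" yields a question mark. The sample text and the choice of Latin-1 are assumptions.

```python
from urllib.parse import quote_plus

text = "x\u20ac"  # "x" followed by the euro sign, which Latin-1 cannot encode

# Fallback 1: substitute a numeric character reference for the character.
as_charref = text.encode("latin-1", errors="xmlcharrefreplace")
# Fallback 2: substitute a question mark.
as_question = text.encode("latin-1", errors="replace")

print(as_charref)   # b'x&#8364;'
print(as_question)  # b'x?'

# What would then travel in the urlencoded body in each case.
print(quote_plus(as_charref))   # x%26%238364%3B
print(quote_plus(as_question))  # x%3F
```

In the second case the original character is lost for good, and in the first the server cannot tell whether the user typed the euro sign or the literal text "&#8364;".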
Bill