Re: [docs] Remove “Content-Type: application/x-www-form-urlencoded; charset” advice (issue 25576)

Reviewers: r.david.murray,

http://bugs.python.org/review/25576/diff/15901/Doc/library/urllib.request.rs...
File Doc/library/urllib.request.rst (right):

http://bugs.python.org/review/25576/diff/15901/Doc/library/urllib.request.rs...
Doc/library/urllib.request.rst:1245: >>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
On 2015/11/08 18:24:56, r.david.murray wrote:
> We lose something from the example no longer showing a use of add_header.
> Is there any other header we could add that would make sense in context?
> I don't think it is a big deal, though.

Nothing immediately comes to mind for this particular example. It might be
better to illustrate one concept per example, rather than adding in
unrelated headers at the same time. There is already an example at line
1209 setting Referer with add_header().

Some more options:

* Perhaps there should be a link to
<https://docs.python.org/dev/howto/urllib2.html#headers>, which has
example code for setting User-Agent via the Request constructor
parameter.

* In my patch at Issue 23360 I proposed setting the Content-Type in the
examples where data is not in urlencoded format, although again using
the constructor rather than add_header().

* Perhaps we can resolve Issue 25570 with an example setting User-Agent
via add_header().

Please review this at http://bugs.python.org/review/25576/

Affected files:
  Doc/library/urllib.parse.rst
  Doc/library/urllib.request.rst
  Lib/urllib/request.py


# HG changeset patch
# Parent 4df1eaecb506507e5d613f8601b48057e7f328a8
Remove advice about setting charset with application/x-www-form-urlencoded

No charset parameter is standardized for this Content-Type value. Also
clarify that urlencode() outputs ASCII.

diff -r 4df1eaecb506 -r 6ff71efb3b5b Doc/library/urllib.parse.rst
--- a/Doc/library/urllib.parse.rst Sat Nov 07 03:15:32 2015 +0000
+++ b/Doc/library/urllib.parse.rst Sat Nov 07 08:36:34 2015 +0000
@@ -535,10 +535,11 @@
                errors=None, quote_via=quote_plus)

    Convert a mapping object or a sequence of two-element tuples, which may
-   contain :class:`str` or :class:`bytes` objects, to a "percent-encoded"
-   string. If the resultant string is to be used as a *data* for POST
-   operation with :func:`~urllib.request.urlopen` function, then it should be
-   properly encoded to bytes, otherwise it would result in a :exc:`TypeError`.
+   contain :class:`str` or :class:`bytes` objects, to a percent-encoded ASCII
+   text string. If the resultant string is to be used as a *data* for POST
+   operation with the :func:`~urllib.request.urlopen` function, then
+   it should be encoded to bytes, otherwise it would result in a
+   :exc:`TypeError`.

    The resulting string is a series of ``key=value`` pairs separated by ``'&'``
    characters, where both *key* and *value* are quoted using the *quote_via*
diff -r 4df1eaecb506 -r 6ff71efb3b5b Doc/library/urllib.request.rst
--- a/Doc/library/urllib.request.rst Sat Nov 07 03:15:32 2015 +0000
+++ b/Doc/library/urllib.request.rst Sat Nov 07 08:36:34 2015 +0000
@@ -36,13 +36,8 @@
    *data* should be a buffer in the standard
    :mimetype:`application/x-www-form-urlencoded` format. The
    :func:`urllib.parse.urlencode` function takes a mapping or sequence of
-   2-tuples and returns a string in this format. It should be encoded to bytes
-   before being used as the *data* parameter. The charset parameter in
-   ``Content-Type`` header may be used to specify the encoding. If charset
-   parameter is not sent with the Content-Type header, the server following the
-   HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
-   encoding. It is advisable to use charset parameter with encoding used in
-   ``Content-Type`` header with the :class:`Request`.
+   2-tuples and returns an ASCII text string in this format. It should
+   be encoded to bytes before being used as the *data* parameter.

    urllib.request module uses HTTP/1.1 and includes ``Connection:close`` header
    in its HTTP requests.
@@ -180,16 +175,9 @@
    the only ones that use *data*; the HTTP request will be a POST instead of a
    GET when the *data* parameter is provided. *data* should be a buffer in the
    standard :mimetype:`application/x-www-form-urlencoded` format.
-   The :func:`urllib.parse.urlencode` function takes a mapping or sequence of
-   2-tuples and returns a string in this format. It should be encoded to bytes
-   before being used as the *data* parameter. The charset parameter in
-   ``Content-Type`` header may be used to specify the encoding. If charset
-   parameter is not sent with the Content-Type header, the server following the
-   HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
-   encoding. It is advisable to use charset parameter with encoding used in
-   ``Content-Type`` header with the :class:`Request`.
-
+   2-tuples and returns an ASCII string in this format. It should be
+   encoded to bytes before being used as the *data* parameter.

    *headers* should be a dictionary, and will be treated as if
    :meth:`add_header` was called with each key and value as arguments.
@@ -202,7 +190,7 @@
    ``"Python-urllib/2.6"`` (on Python 2.6).

    An example of using ``Content-Type`` header with *data* argument would be
-   sending a dictionary like ``{"Content-Type":" application/x-www-form-urlencoded;charset=utf-8"}``.
+   sending a dictionary like ``{"Content-Type": "application/x-www-form-urlencoded"}``.

    The final two arguments are only of interest for correct handling
    of third-party HTTP cookies:
@@ -1230,7 +1218,7 @@
    opener.open('http://www.example.com/')

 Also, remember that a few standard headers (:mailheader:`Content-Length`,
-:mailheader:`Content-Type` without charset parameter and :mailheader:`Host`)
+:mailheader:`Content-Type` and :mailheader:`Host`)
 are added when the :class:`Request` is passed to :func:`urlopen` (or
 :meth:`OpenerDirector.open`).

@@ -1253,11 +1241,8 @@
    >>> import urllib.request
    >>> import urllib.parse
    >>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
-   >>> data = data.encode('utf-8')
-   >>> request = urllib.request.Request("http://requestb.in/xrbl82xr")
-   >>> # adding charset parameter to the Content-Type header.
-   >>> request.add_header("Content-Type","application/x-www-form-urlencoded;charset=utf-8")
-   >>> with urllib.request.urlopen(request, data) as f:
+   >>> data = data.encode('ascii')
+   >>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
    ...     print(f.read().decode('utf-8'))
    ...

diff -r 4df1eaecb506 -r 6ff71efb3b5b Lib/urllib/request.py
--- a/Lib/urllib/request.py Sat Nov 07 03:15:32 2015 +0000
+++ b/Lib/urllib/request.py Sat Nov 07 08:36:34 2015 +0000
@@ -149,13 +149,8 @@

     *data* should be a buffer in the standard application/x-www-form-urlencoded
     format. The urllib.parse.urlencode() function takes a mapping or sequence
-    of 2-tuples and returns a string in this format. It should be encoded to
-    bytes before being used as the data parameter. The charset parameter in
-    Content-Type header may be used to specify the encoding. If charset
-    parameter is not sent with the Content-Type header, the server following
-    the HTTP 1.1 recommendation may assume that the data is encoded in
-    ISO-8859-1 encoding. It is advisable to use charset parameter with encoding
-    used in Content-Type header with the Request.
+    of 2-tuples and returns an ASCII text string in this format. It should be
+    encoded to bytes before being used as the data parameter.

     urllib.request module uses HTTP/1.1 and includes a "Connection:close"
     header in its HTTP requests.

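
For anyone following the review, here is a minimal sketch (not part of the patch) of the two points discussed: urlencode() returns percent-encoded ASCII, so encoding the result with 'ascii' is sufficient, and a header such as User-Agent can be set either through the Request constructor or with add_header(). The URL, the field values and the 'example-client/1.0' string are placeholders I made up for illustration. Nothing is sent over the network; the Request objects are only built and inspected.

```python
import urllib.parse
import urllib.request

# urlencode() percent-encodes non-ASCII text (as UTF-8 by default), so the
# resulting str contains only ASCII characters.
fields = {'spam': 1, 'eggs': 'caf\u00e9'}  # illustrative values only
query = urllib.parse.urlencode(fields)

# Because the string is pure ASCII, encode('ascii') cannot fail here.
data = query.encode('ascii')

# Hypothetical placeholder URL; no request is actually made below.
url = 'http://www.example.com/post'

# Option 1: pass headers through the Request constructor parameter.
req_ctor = urllib.request.Request(
    url, data=data, headers={'User-Agent': 'example-client/1.0'})

# Option 2: call add_header() after construction.
req_add = urllib.request.Request(url, data=data)
req_add.add_header('User-Agent', 'example-client/1.0')

print(query)
# Supplying data makes the request a POST.
print(req_ctor.get_method())
# Request stores header names capitalized, hence 'User-agent' on lookup.
print(req_ctor.get_header('User-agent'), req_add.get_header('User-agent'))
```

Both ways of setting the header end up with the same stored value, so the choice in the documentation examples is mostly about which concept each example is meant to show.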
participants (1)
-
vadmium+py@gmail.com