[docs] [issue24460] urlencode() of dictionary not as expected

Wed Jun 17 16:27:39 CEST 2015

David Rueter added the comment:

Ah hah! Indeed, urlencode() does work on dictionaries as expected when doseq=True. Thank you for clarifying.
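For anyone else who lands here, a minimal sketch of the difference. The `params` dictionary is a made-up illustration, not the data from the original report, and the pair order relies on dicts preserving insertion order (Python 3.7+).

```python
from urllib.parse import urlencode

# Hypothetical input: one value is a list, the other is a scalar.
params = {"tag": ["a", "b"], "page": 1}

# Without doseq, the whole list is converted with str() and percent-encoded
# as a single value.
without_doseq = urlencode(params)

# With doseq=True, each element of the list gets its own key=value pair.
with_doseq = urlencode(params, doseq=True)

print(without_doseq)
print(with_doseq)  # tag=a&tag=b&page=1
```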

FWIW I had read the documentation and the referenced examples multiple times. I would like to make a few documentation suggestions for clarity.

1) Update https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlencode

Where documentation currently says: "When a sequence of two-element tuples is used as the query argument, the first element of each tuple is a key and the second is a value. The value element in itself can be a sequence and in that case, if the optional parameter doseq is evaluates to True, individual key=value pairs separated by '&' are generated for each element of the value sequence for the key. The order of parameters in the encoded string will match the order of parameter tuples in the sequence."

Perhaps the following would be clearer: "The query argument may be a sequence of two-element tuples where the first element of each tuple is a key and the second is a value. However, the optional parameter doseq must then be set to True in order to reliably generate individual key=value pairs separated by '&' for each element of the value sequence for the key, and to preserve the sequence of the elements in the query parameter."
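As a rough illustration of the tuple form, here is a sketch with a hypothetical `pairs` list whose first value is itself a sequence. It only shows what `doseq=True` produces for this input and is not proposed documentation text.

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical input: a sequence of (key, value) tuples; the first value
# is a list, the second is a plain string.
pairs = [("z", ["1", "2"]), ("a", "3")]

# doseq=True expands the list into one key=value pair per element.
encoded = urlencode(pairs, doseq=True)
print(encoded)  # z=1&z=2&a=3

# Decoding gives back one (key, value) pair per element, in the same order.
decoded = parse_qsl(encoded)
print(decoded)  # [('z', '1'), ('z', '2'), ('a', '3')]
```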

2) Update https://docs.python.org/3/library/urllib.request.html#urllib-examples

The examples are referenced from the documentation: "Refer to urllib examples to find out how urlencode method can be used for generating query string for a URL or data for POST." However, the examples page provides contradictory information and examples for this specific use case.

Currently the examples page says:  "The urllib.parse.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format. It should be encoded to bytes before being used as the data parameter. The charset parameter in Content-Type header may be used to specify the encoding. If charset parameter is not sent with the Content-Type header, the server following the HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1 encoding. It is advisable to use charset parameter with encoding used in Content-Type header with the Request."

Perhaps the following would be clearer: "The query parameter of urllib.parse.urlencode() can accept a mapping or sequence of 2-tuples, and the function returns a string in this format if the optional parameter doseq is set to True. It should be encoded to bytes before being used as the data parameter. The charset parameter in Content-Type header may be used to specify the encoding. If charset parameter is not sent with the Content-Type header, the server following the HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1 encoding. It is advisable to use charset parameter with encoding used in Content-Type header with the Request."
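To make the "encode to bytes" and charset advice concrete, a sketch that builds a request without sending it. The URL is a placeholder and the UTF-8 choice is my assumption; neither comes from the documentation text quoted above.

```python
from urllib.parse import urlencode
from urllib.request import Request

fields = {"spam": 1, "eggs": 2, "bacon": 0}

# urlencode() returns a str; encode it to bytes before passing it as data.
body = urlencode(fields, doseq=True).encode("utf-8")

# Placeholder URL; the charset in Content-Type matches the encoding used above.
req = Request(
    "http://www.example.com/cgi-bin/query",
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
)

# Supplying data makes this a POST request; nothing is sent until urlopen(req).
print(req.get_method())
print(req.data)
```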

3) Also on the example page, there are examples of urlencode operating on dictionaries where doseq is not provided. This is confusing.  It would be better to show doseq = True:

Here is an example session that uses the GET method to retrieve a URL containing parameters:
...
>>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
...
The following example uses the POST method instead. Note that params output from urlencode is encoded to bytes before it is sent to urlopen as data:
...
>>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})

I suggest that these examples read:
>>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0}, doseq=True)
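For what it is worth, a quick check, assuming the same dictionary as in the examples page: when every value is a scalar, passing doseq=True should not change the result, so the suggested edit would only make the examples more explicit.

```python
from urllib.parse import urlencode

fields = {'spam': 1, 'eggs': 2, 'bacon': 0}

# With scalar values only, both calls should produce the same query string.
plain = urlencode(fields)
expanded = urlencode(fields, doseq=True)

print(plain == expanded)
```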

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24460>
_______________________________________