[Python-ideas] Proposed addition to urllib.parse in 3.1 (and urlparse in 2.7)

Jacob Holm jh at improva.dk
Sun Apr 12 14:23:54 CEST 2009


Hi Mart

I haven't really followed this thread closely, so I apologize if some of 
my comments below have already been addressed.

Mart Sõmermaa wrote:
> The general consensus in python-ideas is that the following is needed,
> so I am bringing it to python-dev for final discussion before I file a
> feature request at bugs.python.org.
>
> Proposal: add add_query_params(), which appends query parameters to a
> URL, to urllib.parse and urlparse.
>
> Implementation: 
> http://github.com/mrts/qparams/blob/83d1ec287ec10934b5e637455819cf796b1b421c/qparams.py 
> (feel free to fork and comment).
>
> Behaviour (longish, guided by "simple things are simple, complex
> things possible"):
>
> In the simplest form, parameters can be passed via keyword arguments:
>
>     >>> add_query_params('foo', bar='baz')
>     'foo?bar=baz'
>
>     >>> add_query_params('http://example.com/a/b/c?a=b', b='d')
>     'http://example.com/a/b/c?a=b&b=d'
>
> Note that '/', if given in arguments, is encoded:
>
>     >>> add_query_params('http://example.com/a/b/c?a=b', b='d',
>     ... foo='/bar')
>     'http://example.com/a/b/c?a=b&b=d&foo=%2Fbar'
>
> Duplicates are discarded:

Why discard duplicates?  They are valid and have a well-defined meaning.

>
>     >>> add_query_params('http://example.com/a/b/c?a=b', a='b')
>     'http://example.com/a/b/c?a=b'

I would prefer: 'http://example.com/a/b/c?a=b&a=b'
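
For illustration, a minimal sketch of that append-only behaviour, written
against Python 3's urllib.parse.  The name append_query_params and its
signature are hypothetical and are not part of the proposed implementation;
the sketch only shows that keeping duplicates needs nothing more than
appending the encoded pairs to the existing query string, which is left
untouched.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def append_query_params(url, params=(), **kwargs):
    """Hypothetical helper: append parameters, leaving the existing query as-is."""
    # Accept either a mapping or an iterable of (key, value) pairs.
    pairs = list(params.items() if hasattr(params, "items") else params)
    pairs.extend(kwargs.items())
    if not pairs:
        return url  # nothing to add, so return the URL unchanged
    parts = urlsplit(url)
    # doseq=True expands a list or tuple value into one pair per item.
    extra = urlencode(pairs, doseq=True)
    query = parts.query + "&" + extra if parts.query else extra
    return urlunsplit(parts._replace(query=query))


print(append_query_params('http://example.com/a/b/c?a=b', a='b'))
# http://example.com/a/b/c?a=b&a=b
```

Because the original query is never parsed or rebuilt, duplicates and field
order both survive exactly as given.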

>
>     >>> add_query_params('http://example.com/a/b/c?a=b&c=q', a='b', b='d',
>     ...  c='q')
>     'http://example.com/a/b/c?a=b&c=q&b=d'
>

I would prefer: 'http://example.com/a/b/c?a=b&c=q&a=b&b=d'


> But different values for the same key are supported:
>
>     >>> add_query_params('http://example.com/a/b/c?a=b', a='c', b='d')
>     'http://example.com/a/b/c?a=b&a=c&b=d'
>
> Pass different values for a single key in a list (again, duplicates are
> removed):
>
>     >>> add_query_params('http://example.com/a/b/c?a=b', a=('q', 'b', 'c'),
>     ... b='d')
>     'http://example.com/a/b/c?a=b&a=q&a=c&b=d'
>
> Keys with no value are respected, pass ``None`` to create one:
>
>     >>> add_query_params('http://example.com/a/b/c?a', b=None)
>     'http://example.com/a/b/c?a&b'
>
> But if a value is given, the empty key is considered a duplicate (i.e. the
> case of a&a=b is considered nonsensical):

Again, it is a valid url and this will change its meaning.  Why?
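
As a small check of that claim, here is how Python 3's own
urllib.parse.parse_qsl reads the two query strings when blank values are
kept.  This only illustrates that a standard parser sees "a&a=b" and "a=b"
as different field lists; the variable names are chosen for the example.

```python
from urllib.parse import parse_qsl

# The query before the proposed function rewrites it: two distinct fields.
with_blank = parse_qsl("a&a=b", keep_blank_values=True)
# The query after the blank "a" has been dropped as a "duplicate".
collapsed = parse_qsl("a=b", keep_blank_values=True)

print(with_blank)  # [('a', ''), ('a', 'b')]
print(collapsed)   # [('a', 'b')]
```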

>
>     >>> add_query_params('http://example.com/a/b/c?a', a='b', c=None)
>     'http://example.com/a/b/c?a=b&c'
>
> If you need to pass in key names that are not allowed in keyword arguments,
> pass them via a dictionary in the second argument:
>
>     >>> add_query_params('foo', {"+'|äüö": 'bar'})
>     'foo?%2B%27%7C%C3%A4%C3%BC%C3%B6=bar'
>
> Order of original parameters is retained, although similar keys are
> grouped together.

Why the grouping?  Is it a side effect of your desire to discard 
duplicates?  Reordering fields like that can change the meaning of the 
URL.  A concrete case where the order of field names matters is the 
":records" converter in http://pypi.python.org/pypi/zope.httpform/1.0.1 
(a small independent package extracted from the form handling code in Zope).
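
To make the ordering point concrete, here is a toy order-sensitive consumer.
It is not zope.httpform's code; toy_records is a hypothetical function that
only assumes a rule of the same flavour, namely that a repeated field name
starts a new record.  Under that assumption, grouping similar keys together
yields different records from the same set of fields.

```python
from urllib.parse import parse_qsl


def toy_records(query):
    """Hypothetical parser: a repeated field name starts a new record."""
    records = [{}]
    for key, value in parse_qsl(query, keep_blank_values=True):
        if key in records[-1]:
            records.append({})  # key already seen, so begin the next record
        records[-1][key] = value
    return records


print(toy_records("name=a&age=1&name=b&age=2"))
# [{'name': 'a', 'age': '1'}, {'name': 'b', 'age': '2'}]
print(toy_records("name=a&name=b&age=1&age=2"))
# [{'name': 'a'}, {'name': 'b', 'age': '1'}, {'age': '2'}]
```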

> Order of keyword arguments is not (and can not be) retained:
>
>     >>> add_query_params('foo?a=b&b=c&a=b&a=d', a='b')
>     'foo?a=b&a=d&b=c'
>
>     >>> add_query_params('http://example.com/a/b/c?a=b&q=c&e=d',
>     ... x='y', e=1, o=2)
>     'http://example.com/a/b/c?a=b&q=c&e=d&e=1&x=y&o=2'
>
> If you need to retain the order of the added parameters, use an
> :class:`OrderedDict` as the second argument (*params_dict*):
>
>     >>> from collections import OrderedDict
>     >>> od = OrderedDict()
>     >>> od['xavier'] = 1
>     >>> od['abacus'] = 2
>     >>> od['janus'] = 3
>     >>> add_query_params('http://example.com/a/b/c?a=b', od)
>     'http://example.com/a/b/c?a=b&xavier=1&abacus=2&janus=3'
>
> If both *params_dict* and keyword arguments are provided, values from the
> former are used before the latter:
>
>     >>> add_query_params('http://example.com/a/b/c?a=b', od, xavier=1.1,
>     ... zorg='a', alpha='b', watt='c', borg='d')
>     'http://example.com/a/b/c?a=b&xavier=1&xavier=1.1&abacus=2&janus=3&zorg=a&borg=d&watt=c&alpha=b'
>
> Do nothing with a single argument:
>
>     >>> add_query_params('a')
>     'a'
>
>     >>> add_query_params('arbitrary strange stuff?öäüõ*()+-=42')
>     'arbitrary strange stuff?\xc3\xb6\xc3\xa4\xc3\xbc\xc3\xb5*()+-=42'

If you change it to keep duplicates and not unnecessarily mangle the 
field order I am +1, else I am -0.

- Jacob


