[Python-ideas] Proposed addition to urllib.parse in 3.1 (and urlparse in 2.7)
Jacob Holm
jh at improva.dk
Sun Apr 12 14:23:54 CEST 2009
Hi Mart
I haven't really followed this thread closely, so I apologize if some of
my comments below have already been addressed.
Mart Sõmermaa wrote:
> The general consensus in python-ideas is that the following is needed,
> so I bring it to python-dev for final discussion before I file a
> feature request at bugs.python.org.
>
> Proposal: add add_query_params() for appending query parameters to a
> URL to urllib.parse and urlparse.
>
> Implementation:
> http://github.com/mrts/qparams/blob/83d1ec287ec10934b5e637455819cf796b1b421c/qparams.py
> (feel free to fork and comment).
>
> Behaviour (longish, guided by "simple things are simple, complex
> things possible"):
>
> In the simplest form, parameters can be passed via keyword arguments:
>
> >>> add_query_params('foo', bar='baz')
> 'foo?bar=baz'
>
> >>> add_query_params('http://example.com/a/b/c?a=b', b='d')
> 'http://example.com/a/b/c?a=b&b=d'
>
> Note that '/', if given in arguments, is encoded:
>
> >>> add_query_params('http://example.com/a/b/c?a=b', b='d',
> ... foo='/bar')
> 'http://example.com/a/b/c?a=b&b=d&foo=%2Fbar'
>
> Duplicates are discarded:
Why discard duplicates? They are valid and have a well-defined meaning.
>
> >>> add_query_params('http://example.com/a/b/c?a=b', a='b')
> 'http://example.com/a/b/c?a=b'
I would prefer: 'http://example.com/a/b/c?a=b&a=b'
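
A small sketch of why duplicates are meaningful, assuming the Python 3 spelling of the standard parser (urllib.parse; in 2.x the equivalents live in urlparse or cgi). The parser reports a repeated pair twice, so a receiving application can tell one occurrence from two:

```python
from urllib.parse import parse_qs, parse_qsl

query = "a=b&a=b"

# parse_qs collects every value for a key into a list, duplicates included.
as_dict = parse_qs(query)

# parse_qsl returns the pairs one by one, in the order they appear.
as_pairs = parse_qsl(query)

print(as_dict)   # {'a': ['b', 'b']}
print(as_pairs)  # [('a', 'b'), ('a', 'b')]
```

Discarding the second a=b would hand the receiver a different list of values.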
>
> >>> add_query_params('http://example.com/a/b/c?a=b&c=q', a='b', b='d',
> ... c='q')
> 'http://example.com/a/b/c?a=b&c=q&b=d'
>
I would prefer: 'http://example.com/a/b/c?a=b&c=q&a=b&b=d'
> But different values for the same key are supported:
>
> >>> add_query_params('http://example.com/a/b/c?a=b', a='c', b='d')
> 'http://example.com/a/b/c?a=b&a=c&b=d'
>
> Pass several values for a single key as a sequence (again, duplicates
> are removed):
>
> >>> add_query_params('http://example.com/a/b/c?a=b', a=('q', 'b', 'c'),
> ... b='d')
> 'http://example.com/a/b/c?a=b&a=q&a=c&b=d'
>
> Keys with no value are respected, pass ``None`` to create one:
>
> >>> add_query_params('http://example.com/a/b/c?a', b=None)
> 'http://example.com/a/b/c?a&b'
>
> But if a value is given, the empty key is considered a duplicate (i.e. the
> case of a&a=b is considered nonsensical):
Again, it is a valid url and this will change its meaning. Why?
>
> >>> add_query_params('http://example.com/a/b/c?a', a='b', c=None)
> 'http://example.com/a/b/c?a=b&c'
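
To illustrate the change in meaning, here is a minimal sketch using the standard parser (Python 3 spelling, urllib.parse). With keep_blank_values=True a bare key is reported as a pair with an empty value, so the query a&a=b carries two fields while a=b carries one:

```python
from urllib.parse import parse_qsl

# A bare key followed by the same key with a value: two distinct fields.
before = parse_qsl("a&a=b", keep_blank_values=True)

# What remains if the bare key is treated as a duplicate and dropped.
after = parse_qsl("a=b", keep_blank_values=True)

print(before)  # [('a', ''), ('a', 'b')]
print(after)   # [('a', 'b')]
```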
>
> If you need to pass in key names that are not allowed in keyword
> arguments, pass them via a dictionary as the second argument:
>
> >>> add_query_params('foo', {"+'|äüö": 'bar'})
> 'foo?%2B%27%7C%C3%A4%C3%BC%C3%B6=bar'
>
> Order of original parameters is retained, although similar keys are
> grouped
> together.
Why the grouping? Is it a side effect of your desire to discard
duplicates? Changing the order like that changes the meaning of the
url. A concrete case where the order of field names matters is the
":records" converter in http://pypi.python.org/pypi/zope.httpform/1.0.1
(a small independent package extracted from the form handling code in zope).
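
A rough sketch of an order-sensitive consumer, to show how grouping can change the result. The function group_records and the field names are made up for illustration; this is not zope.httpform's code, only the general idea that a repeated field name starts a new record:

```python
from urllib.parse import parse_qsl

def group_records(query):
    """Hypothetical consumer: start a new record whenever a field
    name repeats within the record currently being built."""
    records, current = [], {}
    for name, value in parse_qsl(query):
        if name in current:
            records.append(current)
            current = {}
        current[name] = value
    if current:
        records.append(current)
    return records

original = "name=ann&age=30&name=bob&age=40"
grouped = "name=ann&name=bob&age=30&age=40"  # same pairs, equal keys grouped

print(group_records(original))
# [{'name': 'ann', 'age': '30'}, {'name': 'bob', 'age': '40'}]
print(group_records(grouped))
# [{'name': 'ann'}, {'name': 'bob', 'age': '30'}, {'age': '40'}]
```

Both queries contain exactly the same pairs, yet the grouped form yields three records instead of two.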
> Order of keyword arguments is not (and can not be) retained:
>
> >>> add_query_params('foo?a=b&b=c&a=b&a=d', a='b')
> 'foo?a=b&a=d&b=c'
>
> >>> add_query_params('http://example.com/a/b/c?a=b&q=c&e=d',
> ... x='y', e=1, o=2)
> 'http://example.com/a/b/c?a=b&q=c&e=d&e=1&x=y&o=2'
>
> If you need to retain the order of the added parameters, use an
> :class:`OrderedDict` as the second argument (*params_dict*):
>
> >>> from collections import OrderedDict
> >>> od = OrderedDict()
> >>> od['xavier'] = 1
> >>> od['abacus'] = 2
> >>> od['janus'] = 3
> >>> add_query_params('http://example.com/a/b/c?a=b', od)
> 'http://example.com/a/b/c?a=b&xavier=1&abacus=2&janus=3'
>
> If both *params_dict* and keyword arguments are provided, values from the
> former are used before the latter:
>
> >>> add_query_params('http://example.com/a/b/c?a=b', od, xavier=1.1,
> ... zorg='a', alpha='b', watt='c', borg='d')
> 'http://example.com/a/b/c?a=b&xavier=1&xavier=1.1&abacus=2&janus=3&zorg=a&borg=d&watt=c&alpha=b'
>
> Do nothing with a single argument:
>
> >>> add_query_params('a')
> 'a'
>
> >>> add_query_params('arbitrary strange stuff?öäüõ*()+-=42')
> 'arbitrary strange stuff?\xc3\xb6\xc3\xa4\xc3\xbc\xc3\xb5*()+-=42'
If you change it to keep duplicates and not unnecessarily mangle the
field order I am +1, else I am -0.
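
For concreteness, a minimal sketch of the append-only behaviour argued for above, written against Python 3's urllib.parse. The name append_query_params and its signature are assumptions made for this illustration, not the proposed implementation. It never rewrites the existing query, keeps duplicates, and emits a bare key for None. It relies on keyword arguments keeping their call order, which holds on Python 3.6 and later (PEP 468):

```python
from urllib.parse import quote_plus, urlsplit, urlunsplit

def append_query_params(url, params=(), **kwargs):
    """Hypothetical helper: append parameters without touching
    the query that is already present in url."""
    # Accept a mapping or an iterable of (key, value) pairs.
    pairs = list(params.items()) if hasattr(params, "items") else list(params)
    pairs.extend(kwargs.items())

    encoded = []
    for key, value in pairs:
        # A list or tuple means several values for the same key.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            if v is None:
                encoded.append(quote_plus(str(key)))  # bare key, no '='
            else:
                encoded.append(quote_plus(str(key)) + "=" + quote_plus(str(v)))

    if not encoded:
        return url  # nothing to add, leave the URL untouched

    parts = urlsplit(url)
    query = "&".join(filter(None, [parts.query] + encoded))
    return urlunsplit(parts._replace(query=query))

print(append_query_params('http://example.com/a/b/c?a=b', a='b'))
# http://example.com/a/b/c?a=b&a=b
print(append_query_params('http://example.com/a/b/c?a', a='b', c=None))
# http://example.com/a/b/c?a&a=b&c
```

Because the existing query string is copied through verbatim, the original field order and any duplicates in it survive unchanged.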
- Jacob