[issue1673007] urllib2 requests history + HEAD support

Patrick Westerhoff report at bugs.python.org
Tue Oct 11 11:01:27 CEST 2011


Patrick Westerhoff <PatrickWesterhoff at gmail.com> added the comment:

Senthil, I highly disagree with what you said:

> The next problem comes when a user has specified both data and method="GET".
> This becomes an invalid scenario, but a decision has been to taken as what
> should be given preference?
That is incorrect, RFC2616 states that the server should ignore the message body when the request method does not define any semantics for it, but there is nothing that makes the inclusion of a message body along with the GET request method invalid.

> - As the user has sent "data", should the request be considered a POST?
No, absolutely not. Sending data via other request methods, like DELETE or PUT, is semantically correct and should be supported completely if we are going to include a way to set the request method. If I set the method to PUT, and include data, I don’t want the library to override that to POST just because I set data.

> - But user specified it as "GET" (intentionally or by mistake), so should the
> data not be used and Request should do only a GET?
If I data is included and the request method is explicitely set to GET, make a GET request and include the data in the request body. It might not be a semantically good decision to send data over GET, but it still is not disallowed and as such should be possible (for whatever reasons).

> - Or should we throw an error?
We especially should’t throw an error, as this is not invalid at all.

> A person would just send data and forget about changing the method to "POST".
I agree that the library should still default to POST if data is included and the request method was not explicitely set (see also below).
> 1) should method always have priority or should 'POST' always be used whenever
>    data is passed?
> If data is passed use POST.
No, if data is passed and no special method is set, use POST, otherwise use the method the user specified, because that is what he expects.

> 2) if the method is e.g. 'GET' and data is passed, should an error be raised?
> Nope, give data the priority and do POST. (As urllib is currently doing)
Don't give data any priority if the method was set explicitely. Otherwise the ability to set a custom method is totally useless, especially with request methods where a message body is semantically useful (i.e. all other than HEAD and GET).

> 3) should Request.method be private?
> Not necessarily, it should  be public.
Depends on the way the method will be set. Looking at the way, the other request parameters are set (especially with the accessors being “deprecated”), it makes sense to leave this public.

> 4) should the value of Request.method be initialized in the __init__ or can it
>    also be None?
> My take - It should be initialized to default (GET), instead of None.
Initializing it to GET will make the implementation difficult, especially if we want to continue supporting the behaviour of setting the method to POST when data is changed (and the method was not explicitely set). Unless we override the built-in property accessors I don’t think this is possible.

> 5) if the value of Request.method is always initialized, should we deprecate
>    get_method?
> This is an interesting question. get_method essentially becomes less useful or
> it could serve as an arbiter when data and GET is sent and may be used as
> reliable way to get the Request's method. It should not be deprecated.
To my understanding, and this is also why I provided the same patch on the duplicate bug, `get_method` is used by the other libraries to get the used request method. Unless we change the other libraries to determine the method in a different way, deprecating `get_method` won’t get us anywhere.

What I tried to respect with the patch, and I think this was also Denver’s intention, is to add this functionality without breaking the current behaviour. That behaviour is that GET is default, and POST is set as default if there is any data. The functionality requires that the request method can be set (and the default behaviour can be overriden) without looking at the data (as explained above).

Ideally I would probably like to see the functionality of `get_method` done in another library, which performs the request. I.e. check `request.method` and use that if it’s set, otherwise check `request.data` and choose either POST or GET. But again this would require far too many changes in other libraries for this simple change.

And again on the `data` property: I think the name “data” is a bit confusing. Request does not provide any encoding details on that “data”, and it is actually just the request body in its original form. What I usually do in my subclass of Request is to provide a way to encode the data I pass to the constructor (often even with multipart encoding for file streams), while the `request.data` attribute to me still means “request body”.


Regarding those questions on the implementation, I agree with Ezio, and as I said this is probably the only way to add this functionality without breaking previous usages. And if we break previous usages (or restrict its functionality with weird priority rules based on the data), we better not add this functionality at all.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1673007>
_______________________________________


More information about the Python-bugs-list mailing list