[issue2193] Cookie Colon Name Bug
report at bugs.python.org
Wed Jun 29 23:58:20 CEST 2011
Luke Plant <L.Plant.98 at cantab.net> added the comment:
@ David Murray:
Thanks for taking the time to look at this - can I trouble you to keep going and read my response?
> IMO the thing that needs to be fixed here is that receiving an invalid cookie
> makes it difficult to receive the valid cookies.
I agree absolutely, and my patch implements exactly that aim. So I don't understand why the rest of your reply goes on to assume a very different goal - handling RFC-invalid cookies such that we can read their values.
> I'd love to accept your patch, but "silently ignore" sounds like a bad
> idea and is something we try to avoid in Python where practical.
"silently ignore" is what the current BaseCookie implementation does for **every other** type of invalid input, with the only exception (I can find) being the case where everything is correct apart from the name:
>>> from Cookie import SimpleCookie
>>> c = SimpleCookie()
>>> c.load('name=value; invalidattribute;')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/Cookie.py", line 632, in load
File "/usr/lib/python2.7/Cookie.py", line 665, in __ParseString
self.__set(K, rval, cval)
File "/usr/lib/python2.7/Cookie.py", line 585, in __set
M.set(key, real_value, coded_value)
File "/usr/lib/python2.7/Cookie.py", line 460, in set
raise CookieError("Illegal key value: %s" % key)
Cookie.CookieError: Illegal key value: namewith:colon
As you said, I think this ticket is about fixing that last one, which is the gross exception to the rule, and the only thing that is causing major problems for people.
I agree that handling cookies with invalid names so that we can read them would be nice, but I think it is **very** difficult to find a rationale for saying that this bug fix should have to wait for that change.
In addition, I do not think there is any sensible way to implement that suggestion given the API of BaseCookie at the moment - it's not just the implementation I baulked at, it is the API of BaseCookie that works against you every step of the way.
The API of BaseCookie works like this in the typical case:
input -> load() -> store in BaseCookie -> __getitem__()
input -> __setitem__() -> store in BaseCookie -> output()
(Of course you don't have to do it that way, but looking at the docs and examples, http://docs.python.org/library/cookie.html, it's very clear that it's meant to be used that way).
The fact that both modes involves storing in BaseCookie really messes up any attempt to make these two work differently, and means you would have a really surprising and strange API whichever way you do it.
So, option (1): you could allow BaseCookie to store invalid cookie names if they come via load, but not via __setitem__(). This means that you've got an easy work-around for BaseCookie refusing to store your value - just use 'load()' instead. It also means that simple code like this fails:
>>> c['name:val'] = c['name:val']
which can be a problem if you are trying to do some generalised processing of the contents of a BaseCookie.
This leads us to option (2): allow RFC-invalid cookies to be stored, but then quietly sanitise (i.e. discard) in output().
But this becomes a much worse example of "silently ignoring" - the former is tolerant handling of data that is outside the programmers control - Postel's law. But this would be accepting incorrect data from a programmer, and then silently discarding it, is what we should do our utmost to avoid.
Neither of these options is any good, and it is because the API of BaseCookie was just not designed for it. The only sensible things to do, given our API, is sanitise on input. We could have an 'RFC-invalid' mode, but it would have to be turned on for the whole Cookie instance, and it would be a feature addition. An alternative implementation would be a 'badmorsels' attribute which preserves the rubbish where possible, in case you want it - but again, it's a feature addition.
Finally, I think the behaviour you are aiming at is unreasonable to ask for, especially in this ticket. You are essentially asking for a tolerant kind of cookie parsing which does its best to do 'do-what-I-mean'. But:
1) You haven't specified further than that (and there are no RFCs of use)
2) in general that kind of thing is notoriously hard to get right
3) the job is never finished - there are always more cases of invalid cookies that you *could* handle
4) and in fact it is impossible to provide an implementation that pleases everyone - there will always be invalid cookies that were 'meant' to be one thing, but the implementation interprets in a different way.
Do we really have to wait for all of that to be sorted out before we fix the glaring inconsistency in the way that BaseCookie.load() deals with some kinds of invalid data?
Python tracker <report at bugs.python.org>
More information about the Python-bugs-list