[Web-SIG] URL quoting in WSGI (or the lack thereof)

Luis Bruno lbruno at 100blossoms.com
Tue Jan 22 12:25:50 CET 2008


Ian Bicking wrote:
> But relating REQUEST_URI with SCRIPT_NAME/PATH_INFO is awkward and 
> having the information in duplicate places can lead to errors and 
> unclear situations if they don't match up properly.

True, and you can apply the same reasoning to my suggestion too.

Apart from the duplication of information, there's the question of how 
and where to do the actual decoding. Not everyone is dispatching to a 
CherryPy-style tree of objects, so putting a %-decoded list of path 
segments in an environ key doesn't work -- I knew it was a bad idea! I'm 
going with CherryPy on this: don't decode "%2F". Should other characters 
be kept encoded?
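
To make the "don't decode %2F" idea concrete, here is a minimal sketch in
Python 3 (urllib.parse.unquote; in 2008 this would have been
urllib.unquote). The helper names unquote_except_slash and
split_then_decode are hypothetical, and this is not claimed to be
CherryPy's or Paste's actual code, only one way the behaviour could look.

```python
import re
from urllib.parse import unquote

# Matches an encoded slash in either case: %2F or %2f.
_QUOTED_SLASH = re.compile(r"%2F", re.IGNORECASE)

def unquote_except_slash(raw_path):
    """%-decode everything in raw_path except encoded slashes.

    The path is cut at each %2F, the pieces are decoded, and the
    pieces are rejoined with a literal "%2F", so a later split on "/"
    still sees the original segment boundaries.
    """
    pieces = _QUOTED_SLASH.split(raw_path)
    return "%2F".join(unquote(piece) for piece in pieces)

def split_then_decode(raw_path):
    """Split the still-encoded path on "/", then decode each segment."""
    return [unquote(seg) for seg in raw_path.split("/") if seg]

raw = "/Laptops/LN500%2F9DW/"
print(unquote_except_slash(raw))  # /Laptops/LN500%2F9DW/
print(split_then_decode(raw))     # ['Laptops', 'LN500/9DW']
print(unquote(raw).split("/"))    # ['', 'Laptops', 'LN500', '9DW', '']
```

The last line shows what is lost when the whole path is decoded first:
the product code "LN500/9DW" becomes two segments, and nothing downstream
can tell that apart from a real path separator.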

Also, this crystallizes my thoughts on the matter: %-decoding is the 
applications' job. Or frameworks'. *Not* the servers'.


> Luis Bruno wrote:
>> I was not amused to see egg:Paste#http urldecoding the whole PATH_INFO.
> Unfortunately this is in the WSGI spec, so it's not Paste#http so much 
> as WSGI that demands this.
Cite?

I skimmed PEP 333 before grumbling and I've just re-read it; didn't find 
it, unless you're referring to the code in the "URL Reconstruction" 
section. 
If you're referring[*] to the CGI 1.1 draft linked in "environ 
Variables", I think it supports my position that unquoting(PATH_INFO) 
was not the correct thing to do.
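
The "URL Reconstruction" code in PEP 333 passes SCRIPT_NAME and PATH_INFO
through quote() when rebuilding the URL. A small sketch of that round
trip, assuming Python 3's urllib.parse and assuming a server that has
already unquoted the whole path, shows that the re-quoted result is not
the URL the client sent. The variable names are illustrative only.

```python
from urllib.parse import quote, unquote

# Path as it appeared on the request line.
raw = "/Laptops/LN500%2F9DW/"

# What a server that unquotes everything would put in PATH_INFO.
path_info = unquote(raw)

# What re-quoting PATH_INFO yields: "/" is left alone by quote()'s
# default safe="/", so the encoded slash is not restored.
rebuilt = quote(path_info)

print(path_info)       # /Laptops/LN500/9DW/
print(rebuilt)         # /Laptops/LN500/9DW/
print(rebuilt == raw)  # False
```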

[*] I'm not sure how to spell that.


> I made note of this issue on the WSGI 2.0 ideas page 
Didn't find it here: <URL:http://wsgi.org/wsgi/WSGI_2.0>. Should I look 
elsewhere?


> [/Laptops/LN500%2F9DW/ ] would be the Right Thing, except for not 
> being WSGI.
Looks to me like a good candidate for an amendment.


What's the next step?
-- 
Luís Bruno

