I&#39;ll just add that *if* you can design your URL space (you didn&#39;t just inherit one), and you want to distinguish path segments from values that contain &#39;/&#39;, you can use URLs like:<br>  /item/{some/value}/view<br>


<br>You can then use the matching braces to work out that &quot;some/value&quot; is one path segment.  This makes it possible, for instance, to support GData, where XML namespaces can show up in the URL: they contain slashes but need to be treated as a single value.  It&#39;s not perfect, but it does work.<br>
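<br>A minimal sketch of the brace matching, in case it helps. The function name split_path and the example namespace URL are made up for illustration, and this is not taken from any particular framework. It assumes the path has already been unquoted (as PATH_INFO is) and that literal braces only appear as delimiters:<br>

```python
def split_path(path):
    """Split a URL path on '/', keeping each {...} group as one segment.

    Hypothetical helper: the outermost braces are dropped from the
    result, nested braces are kept verbatim, and empty segments
    (such as the one before a leading '/') are discarded.
    """
    segments = []
    current = []
    depth = 0  # how many '{' are currently open
    for ch in path:
        if ch == "{":
            depth += 1
            if depth == 1:
                continue  # drop the outermost opening brace
        elif ch == "}":
            if depth == 0:
                raise ValueError("unmatched '}' in path: %r" % path)
            depth -= 1
            if depth == 0:
                continue  # drop the outermost closing brace
        elif ch == "/" and depth == 0:
            # A slash outside braces ends the current segment.
            segments.append("".join(current))
            current = []
            continue
        current.append(ch)
    if depth != 0:
        raise ValueError("unmatched '{' in path: %r" % path)
    segments.append("".join(current))
    return [s for s in segments if s]


print(split_path("/item/{some/value}/view"))
print(split_path("/feed/{http://example.com/ns}/entry"))
```

<br>Because the grouping relies on the braces rather than on %2F, it still works after the server has unquoted the path.<br>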


<br><br><div class="gmail_quote">On Thu, Mar 17, 2011 at 4:02 PM, And Clover <span dir="ltr">&lt;<a href="mailto:and-py@doxdesk.com">and-py@doxdesk.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<div class="im">On Thu, 2011-03-17 at 19:10 +0100, Florian Friesdorf wrote:<br>

&gt; I think paste.httpserver.WSGIHandlerMixin.wsgi_setup should not<br>

</div>&gt; urllib.unquote the path before setting it in the wsgi environment<br>

<br>

I&#39;m afraid it must. This is something the WSGI specification inherits<br>

from CGI.<br>

<br>

Yes, it was a terrible decision to have SCRIPT_NAME and PATH_INFO<br>

automatically unescaped, as it loses the distinction between ‘%2F’ and<br>

‘/’, and has resulted in endless problems with non-ASCII characters that<br>

could otherwise have been handled perfectly well as %-sequences.<br>

<br>

But that decision was taken a couple of decades ago and there&#39;s not<br>

really much we can do about it now. CGI may be an anachronism, but it is<br>

still widely used and its assumptions are still felt through Apache, IIS<br>

and WSGI.<br>

<div class="im"><br>

&gt; By urllib.unquoting it is not possible to<br>

&gt; have urllib.quoted slashes within one path segment.<br>

<br>

</div>Correct. And neither Apache nor IIS allows %2F to be used within a path<br>

segment either, so really if you want to write a portable web app you<br>

simply have to avoid them (along with %00 and %5C). It is not currently<br>

practical to include any arbitrary byte sequence in a URL path segment,<br>

even though by the URL specification you should be able to.<br>

<br>

It&#39;s annoying, it&#39;s inelegant, it&#39;s limiting. But none of our attempts<br>

to extend or replace it for non-CGI-based servers (see past list<br>

discussion on path-info-raw or standardising REQUEST_URI) have come to<br>

any acceptable conclusion. We are stuck with it for the foreseeable.<br>

<font color="#888888"><br>

--<br>

And Clover<br>

mailto:<a href="mailto:and@doxdesk.com">and@doxdesk.com</a><br>

<a href="http://www.doxdesk.com" target="_blank">http://www.doxdesk.com</a><br>

gtalk:chat?jid=<a href="mailto:bobince@gmail.com">bobince@gmail.com</a><br>

</font><div><div></div><div class="h5"><br>

_______________________________________________<br>

Web-SIG mailing list<br>

<a href="mailto:Web-SIG@python.org">Web-SIG@python.org</a><br>

Web SIG: <a href="http://www.python.org/sigs/web-sig" target="_blank">http://www.python.org/sigs/web-sig</a><br>


</div></div></blockquote></div><br>