It&#39;s not a specific proposal, but here are my opinions on what a proposal should be:<br><br>On Tue, Sep 22, 2009 at 1:06 AM, Mark Nottingham <span dir="ltr">&lt;<a href="mailto:mnot@mnot.net">mnot@mnot.net</a>&gt;</span> wrote:<br>


<div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">OK, that&#39;s quite exhaustive.<br>

<br>

For the benefit of those of us jumping in, could you summarise your proposal in something like the following manner:<br>

<br>

1. How the request method is made available to WSGI applications<br></blockquote><div><br>Graham talked about it as bytes/unicode/native, where native is unicode on Python 3 and str on Python 2.  For instance, I think there&#39;s general consensus (though it hasn&#39;t been explicitly discussed) that environ keys should be native.<br>


<br>I think method should be native.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

2. How the request-uri is made available to WSGI applications -- in particular, whether any decoding of punycode and/or %-escapes happens<br></blockquote><div><br>Hah, didn&#39;t even think about de-punycoding HTTP_HOST.  That&#39;d be a blast.<br>


<br>I think:<br>* scheme as native<br>* HTTP_HOST as native (no decoding of punycode)<br>* path as native (no URL decoding) - big break with WSGI 1 and CGI, but what the hell.  I could easily waffle on this.<br>* query string as native - *should* be ASCII-safe currently.<br>


<br>Wow, that was easy!<br><br>Request headers, which you didn&#39;t split out... those I&#39;m not sure about.  I&#39;d *like* them to be native.  But damn, I&#39;m just not sure quite how.  surrogateescape?  Latin1?  Latin1 as a kind of poor man&#39;s surrogateescape isn&#39;t so bad.  And the headers *should* be ASCII for sane requests, so it&#39;s not a horrible compromise.  I guess libraries could lazily transcode, just like they currently lazily decode.  But it&#39;d be a bit obnoxious at the library level.  Transcoding middleware would be easier, but it adds the question of how to record that the transcoding has taken place.<br>
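<br>A minimal sketch of the Latin1 idea, assuming Python 3 (where native means unicode) and a server that decodes raw header bytes as Latin1.  The helper names header_to_native and transcode are hypothetical, not part of any WSGI spec.  Every byte maps to exactly one code point in Latin1, so the decode never fails and the original bytes can always be recovered:<br>

```python
# Hypothetical helpers illustrating Latin-1 as a lossless byte carrier.

def header_to_native(raw):
    """Decode raw header bytes as Latin-1; this cannot fail for any input."""
    return raw.decode("latin-1")


def transcode(value, encoding="utf-8"):
    """Recover the original bytes, then decode with the encoding a library expects."""
    return value.encode("latin-1").decode(encoding)


raw = "café".encode("utf-8")        # bytes a client might put on the wire
native = header_to_native(raw)      # looks like mojibake, but nothing is lost
assert native.encode("latin-1") == raw
assert transcode(native) == "café"
```

<br>A library doing this lazily would call something like transcode only when a header value is actually read.<br>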


 </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

3. How request headers are made available to WSGI apps<br></blockquote><div><br>Request handlers?  I don&#39;t understand your terminology.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">


4. How the request body is made available to WSGI apps<br></blockquote><div><br>Ugh.  wsgi.input could remain.  I think at least it should become a file-like interface (i.e., giving an empty string when the content is exhausted) and I might even ask that it implement .tell() (.seek() would be nice of course, but optional).  If there was some other idea, I think there&#39;s room for improvement on wsgi.input and the file interface.<br>
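<br>A rough sketch of that file-like behaviour, assuming Python 3 and a server that knows the declared body length.  The class name LimitedInput is hypothetical, not taken from any spec or library.  It returns empty bytes once the body is used up and tracks its position for .tell():<br>

```python
import io


class LimitedInput:
    """Hypothetical bytes-only wrapper that stops at the declared body length."""

    def __init__(self, stream, length):
        self._stream = stream
        self._remaining = length
        self._pos = 0

    def read(self, size=-1):
        # A missing, negative, or oversized request is clamped to what is left.
        if size is None or size not in range(self._remaining + 1):
            size = self._remaining
        data = self._stream.read(size) if size else b""
        self._remaining -= len(data)
        self._pos += len(data)
        return data

    def tell(self):
        # Number of body bytes handed to the application so far.
        return self._pos


body = LimitedInput(io.BytesIO(b"hello worldEXTRA"), 11)
assert body.read(5) == b"hello"
assert body.tell() == 5
assert body.read() == b" world"
assert body.read() == b""   # exhausted: empty bytes, not an error or a hang
```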


<br>wsgi.input should definitely work with bytes only.  I believe this is consensus.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">


5. Likewise for how apps should expose the response status message, headers and body to WSGI implementations.<br></blockquote><div><br>I believe there is consensus that the response body should remain an iterator that yields bytes.<br>


<br>In one way, it&#39;d be nice if we&#39;d just say that status/headers should be ASCII, because that&#39;s the reasonable choice.  But for proxying or representing &quot;HTTP as it is&quot;, it&#39;s not always the case.  And I&#39;m committed to keeping WSGI fully capable of representing arbitrary requests and responses so long as they aren&#39;t entirely diabolical.<br>


<br>But, an ASCII status is not unreasonable, especially since there&#39;s zero semantic meaning to the reason.  Which makes native strings perfectly fine.<br><br>So, headers...<br><br>Well, Latin1 is easy enough.  In theory, or at least particular theories, headers can be Latin1.  And you can represent arbitrary bytes that way.  So if you want to send crazy stuff to the browser, you can do it that way.  And if you want to stick to plain ASCII then that&#39;s easy enough as well.  So... native?  str or unicode?  I&#39;m not sure specifically for this one.<br>
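<br>To make the response side concrete, here is a small sketch assuming Python 3, the WSGI 1 start_response calling convention, and native (unicode) status and headers that a server encodes as Latin1.  The serialize_head function and the X-Odd header are made up for illustration:<br>

```python
def app(environ, start_response):
    """Native status and headers; the body is an iterable of bytes."""
    status = "200 OK"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        # Arbitrary bytes carried through a native string via Latin-1.
        ("X-Odd", b"caf\xc3\xa9".decode("latin-1")),
    ]
    start_response(status, headers)
    return [b"hello\n"]


def serialize_head(status, headers):
    """Hypothetical server step: Latin-1 turns native strings back into wire bytes."""
    lines = ["HTTP/1.1 " + status]
    lines += [name + ": " + value for name, value in headers]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1")


captured = []
body = app({}, lambda status, headers: captured.append((status, headers)))
head = serialize_head(*captured[0])
assert b"X-Odd: caf\xc3\xa9\r\n" in head
assert all(isinstance(chunk, bytes) for chunk in body)
```

<br>Plain ASCII headers pass through unchanged, so the common case costs nothing.<br>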


</div></div><br clear="all"><br>-- <br>Ian Bicking  |  <a href="http://blog.ianbicking.org">http://blog.ianbicking.org</a>  |  <a href="http://topplabs.org/civichacker">http://topplabs.org/civichacker</a><br>