[Web-SIG] resources for porting wsgi apps from python 2 to 3

Mon Oct 1 19:07:40 CEST 2012

I was at pyconuk over the weekend and came away from that all
refreshed and wanting to hack. That combined with the recent release
of Python 3.3 had me deciding it was time to start porting
TiddlyWeb[1] to Python3.

I'm making progress along some lines and a bit of a mess along others.
The major holdback right now is dependencies which are not yet
ported. I'd like to port them as well, but that is proving hard
because they have test dependencies which themselves are not yet
ported.

A medium sized issue is related to how WSGI is supposed to behave in
Python 3. TiddlyWeb is its own framework and doesn't use webob or
werkzeug, etc. It does dispatch with selector, but other than
that it handles headers and the request body itself, as it gets them
from the server. For tests it uses wsgi-intercept[2] to simulate a web
server. I've volunteered to port that (having already done some minor
work on it in the past) so I need to get clear on the disposition of
bytes or strings in headers and bodies of both requests and responses.

I have a few questions that I'm hoping people here will help me
answer, or at least point me in the right direction. I'll be happy to
summarize the results after the discussion has tailed off.[3]

I've looked over PEP 3333 and done some other reading, but I don't feel
fully confident. The question is mostly around what part of the stack
should be uptight. In the below when I say "bytes" and "str" I mean the
Python 3 types.

* Should wsgi-intercept (which fakes a server) when giving request
    info to a "fake app":

    * Use bytes or str for environ keys?
    * Use bytes or str for environ values?
      * Are all environ values created equal or would, for example,
        QUERY_STRING's value (prior to any parameter decoding)
        be handled differently from HTTP_COOKIE?
      * If str, I see that ISO-8859-1 is the assumed encoding. How much
        hurt occurs in the world if I just assume utf-8 when decoding to
        str[4]?
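
A minimal sketch of the str option, going by PEP 3333: environ keys and
values are native str, and values that arrived as bytes are decoded as
ISO-8859-1. Because that decoding never fails and loses nothing, an app
that expects UTF-8 can re-encode and decode again. The helper names
to_wsgi_str and recover_utf8 are made up for illustration and are not
part of wsgi-intercept.

```python
def to_wsgi_str(raw):
    """Decode raw request bytes to str as ISO-8859-1 (hypothetical helper)."""
    return raw.decode("iso-8859-1")


def recover_utf8(value):
    """Undo the ISO-8859-1 decoding and reinterpret the bytes as UTF-8."""
    return value.encode("iso-8859-1").decode("utf-8")


# Bytes as a client might send them, with a UTF-8 encoded e-acute.
raw_query = "name=caf\u00e9".encode("utf-8")

# What a fake server could place in environ: str keys, str values.
environ = {"QUERY_STRING": to_wsgi_str(raw_query)}

# The app side recovers the intended text.
query = recover_utf8(environ["QUERY_STRING"])
```

Decoding straight to UTF-8 in the server instead would break any app that
performs the round trip above, since the second step would then fail or
mangle non-ASCII values.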

* When wsgi-intercept is accepting data from the wsgi app:
    * Should start_response only accept bytes (and error if not), or
      should it also accept str and encode appropriately? To put it
      another way: be liberal or strict? If encoding, which encoding?
    * Should the returned iterable be rejected or encoded if not bytes?
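
A sketch of the strict option, assuming the PEP 3333 reading where the
status and header names and values are native str (limited to what
ISO-8859-1 can encode) and body chunks are bytes. strict_start_response
and checked_body are hypothetical names, not wsgi-intercept API, and the
write() callable is left as a stub.

```python
def strict_start_response(status, response_headers, exc_info=None):
    """Reject anything but native str for the status line and headers."""
    if not isinstance(status, str):
        raise TypeError("status must be str, got %s" % type(status).__name__)
    # Raises UnicodeEncodeError if the text cannot go on the wire
    # as ISO-8859-1.
    status.encode("iso-8859-1")
    for name, value in response_headers:
        if not isinstance(name, str) or not isinstance(value, str):
            raise TypeError("header name and value must both be str")
        name.encode("iso-8859-1")
        value.encode("iso-8859-1")

    def write(data):
        # Stub only: the legacy write() callable is outside this sketch.
        raise NotImplementedError("write() is not part of this sketch")

    return write


def checked_body(app_iter):
    """Yield body chunks unchanged, rejecting any that are not bytes."""
    for chunk in app_iter:
        if not isinstance(chunk, bytes):
            raise TypeError(
                "body chunks must be bytes, got %s" % type(chunk).__name__
            )
        yield chunk
```

The liberal alternative would encode str chunks instead of raising, but
then the fake server has to guess an encoding that the app never stated.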

What have I forgotten?

Thanks for any input, comments, etc. The thing at [3] has a few more
details on some of the related issues and pieces of the puzzle.

[1] http://tiddlyweb.com/
      https://github.com/tiddlyweb/tiddlyweb

[2] http://code.google.com/p/wsgi-intercept/

[3] I've started keeping notes on this project at
      http://tiddlyweb3.tiddlyspace.com/

[4] Which is what it should have been all along?
-- 
Chris Dent
http://peermore.com/

-- 
Chris Dent                                   http://burningchrome.com/
                                 [...]