From manlio_perillo at libero.it Thu Apr 1 00:36:41 2010
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 01 Apr 2010 00:36:41 +0200
Subject: [Web-SIG] WSGI safe write callable using greenlet
In-Reply-To: <4BB25F3B.1010806@libero.it>
References: <4BB25F3B.1010806@libero.it>
Message-ID: <4BB3CE79.10308@libero.it>

Manlio Perillo wrote:
> Hi.
>
> In this period I'm upgrading my WSGI implementation for Nginx:
> http://hg.mperillo.ath.cx/nginx/ngx_http_wsgi_module/
> [...]
> So, I was thinking: what about a WSGI middleware that, using greenlets,
> exposes to the application a write callable with the correct code flow?
>
>
> Here is a very first draft:
> http://pastebin.com/4k1Ep4dH
>
> It should work with every standard WSGI implementation.
>

Here is a more generic middleware and example application:
http://pastebin.com/S8c1gRfY

and here is the output:
http://pastebin.com/zzkRiRuA

The example also contains hints about features I plan to implement, like
the wsgiorg.suspend extension, and subrequests.


Regards  Manlio

From jdmain at comcast.net Sat Apr 3 18:32:01 2010
From: jdmain at comcast.net (J.D. Main)
Date: Sat, 03 Apr 2010 10:32:01 -0600
Subject: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
Message-ID: <4BB71921.6374.11014531@jdmain.comcast.net>

Hi Folks,

I hope this question hasn't already been answered...

I'm using IIS 5 and calling a python script directly in the URL of a request.
Something like:

http://someserver/myscript.py

or even

http://someserver/myscript.py?var1=something&var2=somthingelse

Using the CGI module, I can certainly see and act upon the variables that
are passed as GET or POST actions.  What I'm after is something more
low level.  I want to see the entire HTTP request with everything inside it.

Does IIS actually pass that information to the CGI application or does it just
pass the variables?

The intent is to write a "RESTFUL" CGI script.
I need to actually "see" the URI and the parameters of the incoming
request to map the appropriate action.  Without short circuiting the IIS
webserver, how would my python parse the following:

http://someserver/someapp/someuser/someupdate?var1=Charlie

Thanks in advance!

JDM

From arw1961 at yahoo.com Mon Apr 5 16:14:26 2010
From: arw1961 at yahoo.com (Aaron Watters)
Date: Mon, 5 Apr 2010 07:14:26 -0700 (PDT)
Subject: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
In-Reply-To: <4BB71921.6374.11014531@jdmain.comcast.net>
Message-ID: <370368.22226.qm@web111719.mail.gq1.yahoo.com>

I think you should consider using the WSGI interface.
The WSGI interface puts all the components of a request
into a request environment dictionary which is sent as
a parameter to the function generating the response.

For example have a look at the test application

http://whiffdoc.appspot.com/tests/misc/testDebugDump?thisVar=thatValue&thisOtherVar=ThatOtherValue

which dumps out the WSGI environment (with WHIFF
extensions) to the response.  All the information you
need is somewhere inside the environment dictionary
(but it's not always easy to find).

You could also look at WHIFF which helps combine
some of the features of the CGI module with the WSGI
interface.  http://whiffdoc.appspot.com/

Hope that helps,  -- Aaron Watters

===
% man less
less is more.

From rsyring at inteli-com.com Tue Apr 6 22:37:04 2010
From: rsyring at inteli-com.com (Randy Syring)
Date: Tue, 06 Apr 2010 16:37:04 -0400
Subject: [Web-SIG] SQLAlchemy Queries & HTML Data Grid
Message-ID: <4BBB9B70.2060007@inteli-com.com>

I am planning on building a library that will facilitate creation of
custom queries and html display of resulting datasets from SQLAlchemy
queries.  I have some basic work done here:

https://svn.rcslocal.com:8443/svn/pysmvt/pysapp/branches/0.1/pysapp/modules/datagrid/

But I don't like the API and I don't want the library to be dependent on
pysapp.
Furthermore, I would like to have a more verbose querying ability
akin to Redmine:

http://www.redmine.org/projects/redmine/issues

Including:

    * Filters
    * Column Selection
    * Grouping (multiple levels)
    * Sorting (multiple columns)
    * some kind of query saving/loading mechanism with a flexible backend

I have done some basic table generation work here:

https://svn.rcslocal.com:8443/svn/pysmvt/pysdatagrid/trunk/

with the tests being the best place to get an idea of how it works:

https://svn.rcslocal.com:8443/svn/pysmvt/pysdatagrid/trunk/pysdatagrid/tests/test_render.py

Looking for comments, pointers to other projects, and/or possibly
interest in helping with a project like this.  I am currently working in
SVN but will most likely move to hg/git if there are others who are
interested.

--
--------------------------------------
Randy Syring
Intelicom
502-644-4776

"Whether, then, you eat or drink or
whatever you do, do all to the glory
of God." 1 Cor 10:31

From jdmain at comcast.net Wed Apr 7 03:25:54 2010
From: jdmain at comcast.net (J.D. Main)
Date: Tue, 06 Apr 2010 19:25:54 -0600
Subject: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
In-Reply-To: <370368.22226.qm@web111719.mail.gq1.yahoo.com>
References: <4BB71921.6374.11014531@jdmain.comcast.net>,
 <370368.22226.qm@web111719.mail.gq1.yahoo.com>
Message-ID: <4BBB8AC2.26347.225D24F8@jdmain.comcast.net>

Thanks Aaron,

I think I will explore the WSGI interface.  However, I did learn a trick using
the OS Module:

import cgi, os

# Parsed GET/POST form variables
formfields = cgi.FieldStorage()
# CGI environment set by the web server (request method, query string, headers)
http_stuff = os.environ

print "Content-type: text/html"
print
print ""
print ""
print "Raw HTTP Test"
print ""
print ""
print formfields
print http_stuff
print ""
print ""

The os.environ variable is a big dictionary containing most (if not all) of the
values inside the original http GET or POST.

Best Regards,

JDM

From arw1961 at yahoo.com Wed Apr 7 16:03:14 2010
From: arw1961 at yahoo.com (Aaron Watters)
Date: Wed, 7 Apr 2010 07:03:14 -0700 (PDT)
Subject: [Web-SIG] SQLAlchemy Queries & HTML Data Grid
In-Reply-To: <4BBB9B70.2060007@inteli-com.com>
Message-ID: <611997.80677.qm@web111702.mail.gq1.yahoo.com>

Thanks Randy, very interesting.

My initial reaction is that you are building a stack on top of a stack.
It's not clear to me what problem you want to solve
and what your requirements are.  It's possible that
you could find it easier to abstract directly on top of SQL or alternatively
you could consider using another sort of data model like mongodb.
Building an abstraction on top of SQLAlchemy which hasn't even reached
1.0 strikes me as dubious.

Thanks again,  -- Aaron Watters

From arw1961 at yahoo.com Wed Apr 7 16:06:15 2010
From: arw1961 at yahoo.com (Aaron Watters)
Date: Wed, 7 Apr 2010 07:06:15 -0700 (PDT)
Subject: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
In-Reply-To: <4BBB8AC2.26347.225D24F8@jdmain.comcast.net>
Message-ID: <272248.5337.qm@web111711.mail.gq1.yahoo.com>

--- On Tue, 4/6/10, J.D. Main wrote:

> From: J.D. Main
> Subject: Re: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
> To: web-sig at python.org
> Date: Tuesday, April 6, 2010, 9:25 PM
> Thanks Aaron,
>
> I think I will explore the WSGI interface.
> However, I did learn a trick using
> the OS Module:
>
> import cgi, os
>
> formfields = cgi.FieldStorage()
> http_stuff = os.environ
.....

Yes, that will work too.  In fact the CGI interface to WSGI
works like this.  The advantage to using WSGI is that it
makes it possible to move your application to other configurations
more easily (in theory) and it's just a tiny bit more high level.

Best regards,  -- Aaron Watters

From rsyring at inteli-com.com Wed Apr 7 19:37:50 2010
From: rsyring at inteli-com.com (Randy Syring)
Date: Wed, 07 Apr 2010 13:37:50 -0400
Subject: [Web-SIG] SQLAlchemy Queries & HTML Data Grid
In-Reply-To: <611997.80677.qm@web111702.mail.gq1.yahoo.com>
References: <611997.80677.qm@web111702.mail.gq1.yahoo.com>
Message-ID: <4BBCC2EE.5070804@inteli-com.com>

Aaron,

Sorry, I must not really have explained clearly.  This isn't an
abstraction layer, but more like a UI component or widget that
facilitates basic reporting.  Look at these pages:

http://www.redmine.org/issues
http://trac.edgewall.org/query

Both pages have a similar structure:

    * UI Controls
          o filtering
          o grouping
          o column selection
          o sorting
    * Paged/Sortable Recordset Display

The library would do the heavy lifting and allow any application using
SQLAlchemy to easily create such query/recordset interfaces to
underlying data.  You would need to:

    * Instantiate the DataGrid class
    * Create a base SQLAlchemy query to be used for the data
    * Define the filter types associated with the columns (e.g.
      TextFieldFilter, OptionsFilter('low','medium', 'high'),
      DateSpanFilter, etc.)
    * Limit sorting, grouping to appropriate columns
    * Choose which columns of the dataset to show by default

The library would then:

    * parse GET/POST for filter/column/sort/page settings and/or use
      defaults
    * compose SQLAlchemy query to satisfy the request
    * execute the query and get the database results
    * put results into an HTML table
    * return the HTML form needed for the controls and the recordset
      table including necessary CSS and JS

Obviously, the library should be easily customizable and the rendering
of HTML should be flexible.  It should also be flexible enough to work
with the different WSGI libraries out there.

I hope that makes better sense.  If you or anyone else is interested, I
can give you some code I have working from the pysapp project which does
some basic filtering, paging, and sorting.  The API is awful, but I
think it might give you a better idea of what I am talking about.

--------------------------------------
Randy Syring
Intelicom
502-644-4776

"Whether, then, you eat or drink or
whatever you do, do all to the glory
of God." 1 Cor 10:31

From arw1961 at yahoo.com Wed Apr 7 20:54:03 2010
From: arw1961 at yahoo.com (Aaron Watters)
Date: Wed, 7 Apr 2010 11:54:03 -0700 (PDT)
Subject: [Web-SIG] SQLAlchemy Queries & HTML Data Grid
In-Reply-To: <4BBCC2EE.5070804@inteli-com.com>
Message-ID: <161344.83249.qm@web111724.mail.gq1.yahoo.com>

Randy:

It seems you want a sortable HTML table that talks to a back end
query engine.
I don't see why this needs to be specific to SQLAlchemy.

Here is a WHIFF middleware which does some of what you are talking about
(the demo formatting is basic/ugly for simplicity purposes).

Demo
  http://whiffdoc.appspot.com/tests/misc/testSortTable
Demo source
  http://whiffdoc.appspot.com/tests/showText?path=./misc/testSortTable

The documentation
  http://whiffdoc.appspot.com/docs/W1200_1400.stdMiddleware#Header83
is not extensive, but here is the source for the core middleware widget.
  http://aaron.oirt.rutgers.edu/cgi-bin/whiffRepo.cgi/file/8c031c68a5a0/whiff/middleware/sortTable.py

As written it requires the whole table as a list of dictionaries and
then does paging from the full list.  It certainly needs generalization
but maybe it's a start.  Let me know if you have questions or comments.

   -- Aaron Watters

From rsyring at inteli-com.com Wed Apr 7 21:07:14 2010
From: rsyring at inteli-com.com (Randy Syring)
Date: Wed, 07 Apr 2010 15:07:14 -0400
Subject: [Web-SIG] SQLAlchemy Queries & HTML Data Grid
In-Reply-To: <161344.83249.qm@web111724.mail.gq1.yahoo.com>
References: <161344.83249.qm@web111724.mail.gq1.yahoo.com>
Message-ID: <4BBCD7E2.4020504@inteli-com.com>

> It seems you want a sortable HTML table that talks to a back end
> query engine.  I don't see why this needs to be specific to SQLAlchemy.

Well...not just sorting though.  Sorting, filtering, grouping, column
selection, and paging.
You are right that the backend does not need to be
SQLAlchemy specific, but since that is what I use, that is what I was
going to start with.  Ideally, the library would be both SQL framework
and WSGI framework agnostic.  The main point of the library would be to
save the time/hassle in creating the HTML/CSS/JS for the query controls.

--------------------------------------
Randy Syring
Intelicom
502-644-4776

"Whether, then, you eat or drink or
whatever you do, do all to the glory
of God." 1 Cor 10:31

From manlio_perillo at libero.it Thu Apr 8 16:08:26 2010
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 08 Apr 2010 16:08:26 +0200
Subject: [Web-SIG] WSGI and start_response
Message-ID: <4BBDE35A.3050101@libero.it>

Hi.

Some time ago I objected to the decision to remove the start_response
function from the next version of WSGI, using as rationale the fact that
without start_response, asynchronous extensions are impossible to support.

Now I have found that removing start_response will also make it impossible
to support coroutines (or, at least, some coroutine usage).

Here is an example (this is the same example I posted a few days ago):
http://paste.pocoo.org/show/199202/

Forgetting about the write callable, the problem is that the application
starts to yield data when the tmpl.render_unicode function is called.

Please note that this has *nothing* to do with asynchronous applications.
The code should work with *all* WSGI implementations.


In the pasted example, the Mako render_unicode function is "turned" into
a generator, with a simple function that allows flushing the current buffer.


Can someone else confirm that this code is impossible to support in WSGI
2.0?

If my suspicion is correct, I once again object to removing
start_response.

WSGI 1.0 is really a well designed protocol, since it is able to support
both asynchronous applications (with a custom extension) and coroutines,
*even* if this was not considered during protocol design.


Thanks  Manlio

From arw1961 at yahoo.com Thu Apr 8 16:19:04 2010
From: arw1961 at yahoo.com (Aaron Watters)
Date: Thu, 8 Apr 2010 07:19:04 -0700 (PDT)
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <4BBDE35A.3050101@libero.it>
Message-ID: <165558.15790.qm@web111701.mail.gq1.yahoo.com>

someone remind me: where is the canonical WSGI 2 spec?

I assume there is a way to "wrap" WSGI 1 applications
without breaking them?  Or is this the regex-->re fiasco
all over again?

  -- Aaron Watters

From pje at telecommunity.com Thu Apr 8 16:53:35 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Thu, 08 Apr 2010 10:53:35 -0400
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <4BBDE35A.3050101@libero.it>
References: <4BBDE35A.3050101@libero.it>
Message-ID: <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com>

At 04:08 PM 4/8/2010 +0200, Manlio Perillo wrote:
>Hi.
>
>Some time ago I objected to the decision to remove the start_response
>function from the next version of WSGI, using as rationale the fact that
>without start_response, asynchronous extensions are impossible to support.
>
>Now I have found that removing start_response will also make it impossible
>to support coroutines (or, at least, some coroutine usage).
>
>Here is an example (this is the same example I posted a few days ago):
>http://paste.pocoo.org/show/199202/
>
>Forgetting about the write callable, the problem is that the application
>starts to yield data when the tmpl.render_unicode function is called.
>
>Please note that this has *nothing* to do with asynchronous applications.
>The code should work with *all* WSGI implementations.
>
>
>In the pasted example, the Mako render_unicode function is "turned" into
>a generator, with a simple function that allows flushing the current buffer.
>
>
>Can someone else confirm that this code is impossible to support in WSGI
>2.0?

I don't understand why it's a problem.  See my previous post here:

http://mail.python.org/pipermail/web-sig/2009-September/003986.html

for a sketch of a WSGI 1-to-2 converter.  It takes a WSGI 1
application callable as the input, and returns a WSGI 2 function.
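
To make the converter idea above concrete, here is a minimal sketch. It is not the code from the linked post. It assumes the WSGI 2 draft shape in which an application takes only environ and returns a (status, headers, body) tuple; the name wsgi1_to_wsgi2 is hypothetical, and the write callable and exc_info handling are deliberately left out.

```python
import itertools


def wsgi1_to_wsgi2(app):
    """Wrap a WSGI 1 callable as a WSGI 2 style function (hypothetical name).

    Assumption: a WSGI 2 application is called as app(environ) and returns
    a (status, headers, body_iterable) tuple, as in the draft discussed here.
    """
    def wsgi2_app(environ):
        captured = []  # filled with [status, headers] by start_response

        def write(data):
            # Supporting write() would need greenlets or threads.
            raise NotImplementedError("write() is not supported in this sketch")

        def start_response(status, headers, exc_info=None):
            # exc_info is accepted but ignored in this sketch.
            captured[:] = [status, headers]
            return write

        result = app(environ, start_response)
        body = iter(result)

        # A generator application only calls start_response once iteration
        # begins, so pull chunks until the status has been captured.
        first = []
        if not captured:
            for chunk in body:
                first.append(chunk)
                if captured:
                    break
        if not captured:
            raise RuntimeError("application never called start_response")

        def body_iter():
            try:
                # Replay the chunks consumed above, then the rest.
                for chunk in itertools.chain(first, body):
                    yield chunk
            finally:
                close = getattr(result, "close", None)
                if close is not None:
                    close()

        status, headers = captured
        return status, headers, body_iter()

    return wsgi2_app
```

Calling wsgi1_to_wsgi2(app)(environ) on an ordinary WSGI 1 application would then give back the status string, the header list and an iterable over the body.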
From manlio_perillo at libero.it Thu Apr 8 16:59:34 2010
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 08 Apr 2010 16:59:34 +0200
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <165558.15790.qm@web111701.mail.gq1.yahoo.com>
References: <165558.15790.qm@web111701.mail.gq1.yahoo.com>
Message-ID: <4BBDEF56.8060507@libero.it>

Aaron Watters wrote:
> someone remind me: where is the canonical WSGI 2 spec?

http://wsgi.org/wsgi/WSGI_2.0

> I assume there is a way to "wrap" WSGI 1 applications
> without breaking them?  Or is this the regex-->re fiasco
> all over again?
>

start_response can be implemented by a function that will store the
status code and response headers.

There should be a sample WSGI 2.0 implementation for CGI, and a sample
WSGI 1.0 -> 2.0 adapter.

This adapter should be able to support the coroutine example,
> http://paste.pocoo.org/show/199202/
but I would like to test.

The write callable, as far as I know, cannot be implemented.

> [...]


Regards  Manlio

From pje at telecommunity.com Thu Apr 8 17:20:44 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Thu, 08 Apr 2010 11:20:44 -0400
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <4BBDEF56.8060507@libero.it>
References: <165558.15790.qm@web111701.mail.gq1.yahoo.com>
 <4BBDEF56.8060507@libero.it>
Message-ID: <20100408152052.700413A40AA@sparrow.telecommunity.com>

At 04:59 PM 4/8/2010 +0200, Manlio Perillo wrote:
>Aaron Watters wrote:
> > someone remind me: where is the canonical WSGI 2 spec?
>
>http://wsgi.org/wsgi/WSGI_2.0
>
> > I assume there is a way to "wrap" WSGI 1 applications
> > without breaking them?  Or is this the regex-->re fiasco
> > all over again?
> >
>
>start_response can be implemented by a function that will store the
>status code and response headers.
>
>There should be a sample WSGI 2.0 implementation for CGI, and a sample
>WSGI 1.0 -> 2.0 adapter.
>
>This adapter should be able to support the coroutine example,
> > http://paste.pocoo.org/show/199202/
>but I would like to test.
>
>write callable, as far as I know, can not be implemented.

Implementing it requires greenlets or threads, but it's implementable.  See:

http://mail.python.org/pipermail/web-sig/2009-September/003986.html

(Btw, I've noticed that this early sketch of mine doesn't support the
case where an application is a generator, because start_response
won't have been called when the application returns.  This can be
fixed, but it requires the addition of a wrapper class and a few
other annoying details.  It also doesn't support exc_info properly,
so it's still a ways from being a correct WSGI 1 server
implementation.  Getting rid of all these little variations, though,
is the goal of having a WSGI 2 - it's difficult to write *any*
middleware to be completely WSGI 1 compliant.)

From manlio_perillo at libero.it Thu Apr 8 17:40:06 2010
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 08 Apr 2010 17:40:06 +0200
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <20100408152052.700413A40AA@sparrow.telecommunity.com>
References: <165558.15790.qm@web111701.mail.gq1.yahoo.com>
 <4BBDEF56.8060507@libero.it>
 <20100408152052.700413A40AA@sparrow.telecommunity.com>
Message-ID: <4BBDF8D6.60704@libero.it>

P.J. Eby wrote:
> At 04:59 PM 4/8/2010 +0200, Manlio Perillo wrote:
> [...]
>> There should be a sample WSGI 2.0 implementation for CGI, and a sample
>> WSGI 1.0 -> 2.0 adapter.
>>
>> This adapter should be able to support the coroutine example,
>> > http://paste.pocoo.org/show/199202/
>> but I would like to test.
>>
>> write callable, as far as I know, can not be implemented.
>
> Implementing it requires greenlets or threads, but it's implementable.
> See:
>
> http://mail.python.org/pipermail/web-sig/2009-September/003986.html
>

Right.
In fact, in the example I posted, I implemented the write callable using
greenlets (although the implementation is different).

> (Btw, I've noticed that this early sketch of mine doesn't support the
> case where an application is a generator, because start_response won't
> have been called when the application returns.  This can be fixed, but
> it requires the addition of a wrapper class and a few other annoying
> details.  It also doesn't support exc_info properly, so it's still a
> ways from being a correct WSGI 1 server implementation.  Getting rid of
> all these little variations, though, is the goal of having a WSGI 2 -
> it's difficult to write *any* middleware to be completely WSGI 1
> compliant.)
>

I agree that this is a good goal.
However I don't like the idea of losing support for some features.

With WSGI 2.0 we will end up with:

- WSGI 1.0, a full featured protocol, but with hard to implement
  middlewares
- WSGI 2.0, a simple protocol, with easier to implement middlewares
  but without support for some "advanced" applications

WSGI 1.0 can be implemented on top of WSGI 2.0, and WSGI 2.0 on top
of WSGI 1.0.
The latter should be easier to implement.

I would like to have a WSGI 1.1 specification without the write
callable, and a *standard* adapter that will expose a simpler API
(like WSGI 2.0) so that applications and middlewares can be implemented
using this simple API but you still have the full featured API.

This is important, IMHO.
Because with the next version of WSGI, there will also be support for
Python 3.x.  And if the next version does not have support for the
start_response function, applications that need Python 3.x and want to
use "advanced features" will not be able to rely on a standard protocol.


Regards  Manlio

From pje at telecommunity.com Thu Apr 8 19:30:32 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Thu, 08 Apr 2010 13:30:32 -0400
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <4BBDF8D6.60704@libero.it>
References: <165558.15790.qm@web111701.mail.gq1.yahoo.com>
 <4BBDEF56.8060507@libero.it>
 <20100408152052.700413A40AA@sparrow.telecommunity.com>
 <4BBDF8D6.60704@libero.it>
Message-ID: <20100408173040.24E873A40AA@sparrow.telecommunity.com>

At 05:40 PM 4/8/2010 +0200, Manlio Perillo wrote:
>With WSGI 2.0 we will end up with:
>
>- WSGI 1.0, a full featured protocol, but with hard to implement
>   middlewares
>- WSGI 2.0, a simple protocol, with more easy to implement middlewares
>   but without support for some "advanced" applications

Let me see if I understand what you're saying.  You want to support
suspending an application, without using greenlets or threads.  Under
WSGI 1, you can do this by yielding empty strings before calling
start_response.  Under WSGI 2, you can only do this by directly
suspending execution, e.g. via greenlet or eventlets or some similar
API provided by the server.  Is this your objection?

As far as I know, nobody has actually implemented an async app
facility for WSGI 1, although it sounds like perhaps you're trying to
design or implement such a thing now.  If so, then there's nothing
stopping you from implementing a WSGI 1 server and providing a WSGI 2
adapter, since as you point out, WSGI 2 is easier to implement on top
of WSGI 1 than the other way around.

(Note, however, that if you simply use a greenlet or eventlet-based
API for your async server, then the problem is neatly solved whether
you are using WSGI 1 or 2, and the effective API is a lot cleaner
than yielding empty strings.)
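
For readers unfamiliar with the "yielding empty strings" technique mentioned above, the following is a rough sketch of how it can look under WSGI 1. All names here (make_polling_app, is_ready, get_result, drive) are hypothetical and not part of any server API; drive is only a toy stand-in for a real server loop, which would do other work between empty chunks instead of counting them.

```python
def make_polling_app(is_ready, get_result):
    """Build a WSGI 1 app that pauses by yielding empty byte strings.

    is_ready and get_result are hypothetical callables supplied by the
    caller; they stand in for whatever the application is waiting on.
    """
    def app(environ, start_response):
        while not is_ready():
            # Nothing to send yet: hand control back to the server.
            yield b""
        # start_response is only called once the result is available.
        start_response("200 OK", [("Content-Type", "text/plain")])
        yield get_result()
    return app


def drive(app, environ):
    """Toy server loop: returns (status, pauses, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    pauses = 0
    chunks = []
    for chunk in app(environ, start_response):
        if chunk:
            chunks.append(chunk)
        elif "status" not in captured:
            # Empty chunk before start_response: the app is waiting.
            pauses += 1
    return captured.get("status"), pauses, b"".join(chunks)
```

The drawback noted in the message is visible here: every layer between the server and the code that waits has to be a generator that passes the empty strings along.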
From manlio_perillo at libero.it Thu Apr 8 20:06:17 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 08 Apr 2010 20:06:17 +0200 Subject: [Web-SIG] WSGI and start_response In-Reply-To: <20100408173040.24E873A40AA@sparrow.telecommunity.com> References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> Message-ID: <4BBE1B19.4000601@libero.it> P.J. Eby ha scritto: > At 05:40 PM 4/8/2010 +0200, Manlio Perillo wrote: >> With WSGI 2.0 we will end up with: >> >> - WSGI 1.0, a full featured protocol, but with hard to implement >> middlewares >> - WSGI 2.0, a simple protocol, with more easy to implement middlewares >> but without support for some "advanced" applications > > Let me see if I understand what you're saying. You want to support > suspending an application, without using greenlets or threads. What I'm trying to do is: * as in the example I posted, turn Mako render function in a generator. The reason is that I would lite to to implement support for Nginx subrequests. During a subrequest, the generated response body is sent directly to the client, so it is necessary to be able to flush the Mako buffer * implement the simple suspend/resume extension, as described here: http://comments.gmane.org/gmane.comp.python.twisted.web/632 Note that my ngx_http_wsgi_module already support asynchronous web server, since when the application returns a generator and sending a yielded buffer to the client would block, execution of WSGI application is suspended, and resumed when the socket is ready to send data. The suspend/resume extension allows an application to explicitly suspend/resume execution, so it is a nice complement for an asynchronous server. I would like to propose this extension for wsgiorg namespace. 
Not that, however, greenlets are still required, since it will make the code much more usable. > Under > WSGI 1, you can do this by yielding empty strings before calling > start_response. No, in this case this is not what I need to do. I need to call start_response, since the greenlet middleware will yield data to the caller before the application returns. > Under WSGI 2, you can only do this by directly > suspending execution, e.g. via greenlet or eventlets or some similar API > provided by the server. Is this your objection? > In WSGI 2 what I want to do is not really possible. The reason is that I don't use greenlets in the C module (I'm not even sure greenlets can be used in my ngx_http_wsgi module) Execution is suspended using the "normal" suspend extension. The problem is with the greenlet middleware that will force a different code flow. > As far as I know, nobody has actually implemented an async app facility > for WSGI 1, although it sounds like perhaps you're trying to design or > implement such a thing now. Right. My previous attempt was a failure, since the extensions have severe usability problem. It is the same problem you have with Twisted deferred. In this case every function that call a function that use the async extension must be a generator. In my new attempt I plan to: 1) Implement the simple suspend/resume extension 2) Implement a Python extension module that wraps the Nginx events system. 3) Implement a pure Python WSGI middleware that, using greenlets, will enable normal applications to take advantage of Nginx async features. This middleware will have the same purpose as the Hub available in gevent > If so, then there's nothing stopping you > from implementing a WSGI 1 server and providing a WSGI 2 adapter, since > as you point out, WSGI 2 is easier to implement on top of WSGI 1 than > the other way around. > Yes, this is what I would like to do. 
Do you think it will possible to implement all the requirements of WSGI 2 (including Python 3.x support) in a simple adapter on top of WSGI 1.0 ? And what about applications that need to use the WSGI 1.0 API but require to run with Python 3.x? Thanks Manlio From pje at telecommunity.com Thu Apr 8 21:09:39 2010 From: pje at telecommunity.com (P.J. Eby) Date: Thu, 08 Apr 2010 15:09:39 -0400 Subject: [Web-SIG] WSGI and start_response In-Reply-To: <4BBE1B19.4000601@libero.it> References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> Message-ID: <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> At 08:06 PM 4/8/2010 +0200, Manlio Perillo wrote: >What I'm trying to do is: > >* as in the example I posted, turn Mako render function in a generator. > > The reason is that I would lite to to implement support for Nginx > subrequests. By subrequest, do you mean that one request is invoking another, like one WSGI application calling multiple other WSGI applications to render one page containing contents from more than one? > During a subrequest, the generated response body is sent directly to > the client, so it is necessary to be able to flush the Mako buffer I don't quite understand this, since I don't know what Mako is, or, if it's a template engine, what flushing its buffer would have to do with WSGI buffering. > > Under > > WSGI 1, you can do this by yielding empty strings before calling > > start_response. > >No, in this case this is not what I need to do. Well, if that's not when you're needing to suspend the application, then I don't see what you're losing in WSGI 2. >I need to call start_response, since the greenlet middleware will yield >data to the caller before the application returns. I still don't understand you. 
In WSGI 1, the only way to suspend execution (without using greenlets) prior to determining the headers is to yield empty strings.

I'm beginning to wonder if maybe what you're saying is that you want to be able to write an application function in the form of a generator? If so, be aware that any WSGI 1 app written as:

    def app(environ, start_response):
        start_response(status, headers)
        yield "foo"
        yield "bar"

can be written as a WSGI 2 app thus:

    def app(environ, start_response):
        def respond():
            yield "foo"
            yield "bar"
        return status, headers, respond()

This is also a good time for people to learn that generators are usually a *very bad* way to write WSGI apps - yielding is for server push or sending blocks of large files, not tiny strings. In general, if you're yielding more than one block, you're almost certainly doing WSGI wrong. The typical HTML, XML, or JSON output that's 99% of a webapp's requests should be transmitted as a single string, rather than as a series of snippets.

IOW, the absence of generator support in WSGI 2 is a feature, not a bug.

>In my new attempt I plan to:
>
>1) Implement the simple suspend/resume extension
>2) Implement a Python extension module that wraps the Nginx events
>   system.
>3) Implement a pure Python WSGI middleware that, using greenlets, will
>   enable normal applications to take advantage of Nginx async features.

I think maybe I'm understanding a little better now -- you want to implement the WSGI gateway entirely in C, without using any Python, and without using the greenlet API directly.

I think I've been unable to understand because I'm thinking in terms of a server implemented in Python, or at least that has the WSGI part implemented in Python.

>Do you think it will possible to implement all the requirements of WSGI
>2 (including Python 3.x support) in a simple adapter on top of WSGI 1.0 ?
My practical experience with Python 3 is essentially nonexistent, but being able to implement WSGI 2 in terms of WSGI 1 is a *design requirement* for WSGI 2; it's likely that much early use and development of WSGI 2 will be done through such an adapter. >And what about applications that need to use the WSGI 1.0 API but >require to run with Python 3.x? That's a tougher nut to crack; again, my practical experience with Python 3 is essentially nonexistent. From manlio_perillo at libero.it Thu Apr 8 22:18:02 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 08 Apr 2010 22:18:02 +0200 Subject: [Web-SIG] WSGI and start_response In-Reply-To: <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> Message-ID: <4BBE39FA.2020802@libero.it> P.J. Eby ha scritto: > At 08:06 PM 4/8/2010 +0200, Manlio Perillo wrote: >> What I'm trying to do is: >> >> * as in the example I posted, turn Mako render function in a generator. >> >> The reason is that I would lite to to implement support for Nginx >> subrequests. > > By subrequest, do you mean that one request is invoking another, like > one WSGI application calling multiple other WSGI applications to render > one page containing contents from more than one? > Yes. > >> During a subrequest, the generated response body is sent directly to >> the client, so it is necessary to be able to flush the Mako buffer > > I don't quite understand this, since I don't know what Mako is, or, if > it's a template engine, what flushing its buffer would have to do with > WSGI buffering. > Ah, sorry. Mako is a template engine. Suppose I have an HTML template file, and I want to use a sub request. ...
${subrequest('/header/')}
... The problem with this code is that, since Mako will buffer all generated content, the result response body will contain incorrect data. It will first contain the response body generated by the sub request, then the content generated from the Mako template (XXX I have not checked this, but I think it is how it works). So, when executing a sub request, it is necessary to flush (that is, send to Nginx, in my case) the content generated from the template before the sub request is done. Since Mako does not return a generator (I asked the author, and it was too hard to implement), I use a greenlet in order to "turn" the Mako render function in a generator. > >> > Under >> > WSGI 1, you can do this by yielding empty strings before calling >> > start_response. >> >> No, in this case this is not what I need to do. > > Well, if that's not when you're needing to suspend the application, then > I don't see what you're losing in WSGI 2. > > >> I need to call start_response, since the greenlet middleware will yield >> data to the caller before the application returns. > > I still don't understand you. In WSGI 1, the only way to suspend > execution (without using greenlets) prior to determining the headers is > to yield empty strings. > Ah, you are right sorry. But this is not required for the Mako example (I was focusing on that example). > I'm beginning to wonder if maybe what you're saying is that you want to > be able to write an application function in the form of a generator? The greenlet middleware return a generator, in order to work. > If > so, be aware that any WSGI 1 app written as: > > def app(environ, start_response): > start_response(status, headers) > yield "foo" > yield "bar" > > can be written as a WSGI 2 app thus: > > def app(environ, start_response): > def respond(): > yield "foo" > yield "bar" > return status, headers, respond() > The problem, as I wrote, is that with the greenlet middleware, the application needs not to return a generator. 
    def app(environ):
        tmpl = ...
        body = tmpl.render(...)
        return status, headers, [body]

This is a very simple WSGI application.

But when using the greenlet middleware, and when using the function for flushing the Mako buffer, some data will be yielded *before* the application returns and status and headers are passed to Nginx.

> This is also a good time for people to learn that generators are usually
> a *very bad* way to write WSGI apps

It's the only way to be able to suspend execution when the WSGI implementation is embedded in an async web server not written in Python.

The reason is that you cannot use (XXX check me) greenlets in C code; you should probably use something like http://code.google.com/p/coev/

Greenlets can be used in gevent, for example, because scheduling is under the control of Python code. This is not the case with Nginx.

> - yielding is for server push or
> sending blocks of large files, not tiny strings.

Again, consider the use of sub requests. Yielding a "not large" block is the only choice you have.

Unless, of course, you implement sub request support in pure Python (or using SSI - Server Side Include).

Another use case is when you have a very large page, and you want to return some data as soon as possible so that the user does not abort the request if it takes some time.

Also, note that with Nginx (as with Apache, if I'm not wrong), even if the application yields small strings, the server can still do some buffering in order to increase performance. In ngx_http_wsgi_module buffering is optional (and disabled by default).

In the sub request example, it means that if both the main request response body and sub request response body are small, Nginx can buffer all the data in memory before sending it to the client (XXX I need to check this).

> In general, if you're
> yielding more than one block, you're almost certainly doing WSGI wrong.
> The typical HTML, XML, or JSON output that's 99% of a webapp's requests > should be transmitted as a single string, rather than as a series of > snippets. > > IOW, the absence of generator support in WSGI 2 is a feature, not a bug. > What do you mean by absence of generator support? WSGI 2 applications can still return a generator. > >> In my new attempt I plan to: >> >> 1) Implement the simple suspend/resume extension >> 2) Implement a Python extension module that wraps the Nginx events >> system. >> 3) Implement a pure Python WSGI middleware that, using greenlets, will >> enable normal applications to take advantage of Nginx async features. > > I think maybe I'm understanding a little better now -- you want to > implement the WSGI gateway entirely in C, without using any Python, and > without using the greenlet API directly. > Right. > I think I've been unable to understand because I'm thinking in terms of > a server implemented in Python, or at least that has the WSGI part > implemented in Python. > Yes. I had a similar problem trying to explain how ngx_http_wsgi_module works to another person (and I'm not even good at explaining things!). > [...] Thanks Manlio From pje at telecommunity.com Thu Apr 8 23:53:10 2010 From: pje at telecommunity.com (P.J. Eby) Date: Thu, 08 Apr 2010 17:53:10 -0400 Subject: [Web-SIG] WSGI and start_response In-Reply-To: <4BBE39FA.2020802@libero.it> References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> <4BBE39FA.2020802@libero.it> Message-ID: <20100408215334.2AB373A40AA@sparrow.telecommunity.com> At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote: >Suppose I have an HTML template file, and I want to use a sub request. > >... >${subrequest('/header/'} >... 
> >The problem with this code is that, since Mako will buffer all generated >content, the result response body will contain incorrect data. > >It will first contain the response body generated by the sub request, >then the content generated from the Mako template (XXX I have not >checked this, but I think it is how it works). Okay, I'm confused even more now. It seems to me like what you've just described is something that's fundamentally broken, even if you're not using WSGI at all. >So, when executing a sub request, it is necessary to flush (that is, >send to Nginx, in my case) the content generated from the template >before the sub request is done. This seems to only makes sense if you're saying that the subrequest *has to* send its output directly to the client, rather than to the parent request. If the subrequest sends its output to the parent request (as a sane implementation would), then there is no problem. Likewise, if the subrequest is sent to a buffer that's then inserted into the parent invocation. Anything else seems utterly insane to me, unless you're basically taking a bunch of legacy CGI code using 'print' statements and hacking it into something else. (Which is still insane, just differently. ;-) ) >Ah, you are right sorry. >But this is not required for the Mako example (I was focusing on that >example). As far as I can tell, that example is horribly wrong. ;-) >But when using the greenlet middleware, and when using the function for >flushing Mako buffer, some data will be yielded *before* the application >returns and status and headers are passed to Nginx. And that's probably because sharing a single output channel between the parent and child requests is a bad idea. ;-) (Specifically, it's an increase in "temporal coupling", I believe. I know it's some kind of coupling between functions that's considered bad, I just don't remember if that's the correct name for it.) 
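One way to picture a subrequest that returns its output to the invoker rather than to the client is sketched below. It is only an illustration under assumptions, written for Python 3 with bytes bodies; `call_subrequest`, `header_app` and `page_app` are hypothetical names, and this is not a description of how Nginx subrequests actually behave:

```python
def call_subrequest(app, environ):
    """Run a WSGI 1 app and hand its whole response back to the caller."""
    captured = {}
    written = []

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return written.append  # stand-in for the write callable

    result = app(environ, start_response)
    try:
        chunks = list(result)
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()
    return captured["status"], captured["headers"], b"".join(written + chunks)


def header_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<h1>header</h1>"]


def page_app(environ, start_response):
    # The parent splices the child's body into its own buffer, so the
    # ordering of the final page is decided in exactly one place.
    _status, _headers, header = call_subrequest(header_app, dict(environ))
    body = b"<html>" + header + b"<p>content</p></html>"
    start_response("200 OK", [("Content-Type", "text/html"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

With this shape the parent never shares an output channel with the child, at the cost of holding the child's body in memory until the parent is ready to emit it.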
> > This is also a good time for people to learn that generators are usually > > a *very bad* way to write WSGI apps > >It's the only way to be able to suspend execution, when the WSGI >implementation is embedded in an async web server not written in Python. It's true that dropping start_response() means you can't yield empty strings prior to determining your headers, yes. > > - yielding is for server push or > > sending blocks of large files, not tiny strings. > >Again, consider the use of sub requests. >yielding a "not large" block is the only choice you have. No, it isn't. You can buffer your output and yield empty strings until you're ready to flush. >Unless, of course, you implement sub request support in pure Python (or >using SSI - Server Side Include). I don't see why it has to be "pure", actually. It just that the subrequest needs to send data to the invoker rather than sending it straight to the client. That's the bit that's crazy in your example -- it's not a scenario that WSGI 2 should support, and I'd consider the fact that WSGI 1 lets you do it to be a bug, not a feature. ;-) That being said, I can see that removing start_response() closes a loophole that allows async apps to *potentially* exist under WSGI 1 (as long as you were able to tolerate the resulting crappy API). However, to fix that crappy API requires greenlets or threads, at which point you might as well just use WSGI 2. In the Nginx case, you can either do WSGI 1 in C and then use an adapter to provide WSGI 2, or you can expose your C API to Python and write a small greenlets-using Python wrapper to support suspending. 
It would look something like:

    def gateway(request_info, app):
        # set up environ
        run(greenlet(lambda: Finished(app(environ))))

    def run(child):
        while not child.dead:
            data = child.switch()
            if isinstance(data, Finished):
                send_status(data.status)
                send_headers(data.headers)
                send_response(data.response)
            else:
                perform_appropriate_action_on(data)
                if data.suspend:
                    # arrange for run(child) to be re-called later, then...
                    return

Suspension now works by switching back to the parent greenlet with command objects (like Finished()) to tell the run() loop what to do. The run() loop is not stateful, so when the task is unsuspended, you simply call run(child) again.

A similar structure would exist for send_response() - i.e., it's a loop over the response, can break out of the loop if it needs to suspend, and arranges for itself to be re-called at the appropriate time.

Voila - you now have asynchronous WSGI 2 support.

Now, whether you actually *want* to do that is a separate question, but as (I hope) you can see, you definitely *can* do it, and without needing any greenlet-using code to be in C. From C, you just call back into one of the Python top-level loops (run() and send_response()), which then does the appropriate task switching.

>Another use case is when you have a very large page, and you want to
>return some data as soon as possible to avoid the user to abort request
>if it takes some time.

That's the server push case -- but of course that's not a problem even in WSGI 2, since the "response" can still be a generator.

>Also, note that with Nginx (as with Apache, if I'm not wrong), even if
>application yields small strings, the server can still do some buffering
>in order to increase performance.

In which case, it's in violation of the WSGI spec. The spec requires separately-yielded strings to be flushed to OS-level buffering.

>What do you mean by absence of generator support?
>WSGI 2 applications can still return a generator.
Yes - but they can't *be* a generator - previously they could, due to the separate start_response callable.

From manlio_perillo at libero.it Fri Apr 9 13:00:12 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 09 Apr 2010 13:00:12 +0200 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web Message-ID: <4BBF08BC.7090404@libero.it>

I have started to write an asynchronous WSGI implementation for Twisted Web.

The standard implementation executes the WSGI application in a separate thread. twsgi will instead execute the application in the main Twisted thread.

The advantage is that twsgi is better integrated in Twisted, and WSGI applications will be able to use all features available in Twisted.

Code is available from a Mercurial repository: http://hg.mperillo.ath.cx/twisted/twsgi

The purpose of twsgi is to have a pure Python implementation of WSGI with support for asynchronous HTTP servers and asynchronous WSGI applications.

The implementation is similar to ngx_http_wsgi_module, and can be used to quickly test asynchronous extensions.

The write callable is not implemented (calling it will raise NotImplemented error), since the write callable cannot be implemented in an asynchronous web server without using threads (and twsgi does *not* use threads).

ngx_http_wsgi_module does the same.

TODO
----

* support for suspending iteration over the WSGI app iter when the socket is not ready to send data. Execution will be resumed when the socket is ready again.

* support for suspend/resume extension, as described here: http://comments.gmane.org/gmane.comp.python.twisted.web/632

  It will have some differences:

  - the name will be 'wsgiorg.suspend' instead of 'wsgi.pause_output'

    The wsgiorg namespace is used, since the plan is to have it standardized [1], but it can only be implemented on asynchronous servers.

  - wsgi.pause_output function will accept an optional timeout, in milliseconds.
If timeout is specified, application will be implicitly resumed when timeout expires. - resume function will return a boolean value. True: if execution was suspended and it is going to be resumed False: if execution was not suspended The return value can be used to check if timeout specified in wsgiorg.suspend expired. I'm not sure if a boolean value is the best solution. Maybe it should return -1 is execution was not suspended, and 0 otherwise. [1] unlike other proposed async extensions, suspend/resume is much more simple and easy to implement, so it is more likely to have a wide consensus over the specification. Feedbacks are welcomed. Regards Manlio From graham.dumpleton at gmail.com Fri Apr 9 13:17:47 2010 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Fri, 9 Apr 2010 21:17:47 +1000 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web In-Reply-To: <4BBF08BC.7090404@libero.it> References: <4BBF08BC.7090404@libero.it> Message-ID: On 9 April 2010 21:00, Manlio Perillo wrote: > I have started to write an asynchronous WSGI implementation for Twisted Web. > > The standard implementation execute the WSGI application in a separate > thread. > twsgi will instead execute the application in the main Twisted thread. > > The advantage is that twsgi is better integrated in Twisted, and WSGI > applications will be able to use all features available in Twisted. > > > Code is availale from a Mercurial repository: > http://hg.mperillo.ath.cx/twisted/twsgi > > > The purpose of twsgi is to have a pure Python implementation of WSGI > with support for asynchronous HTTP servers and asynchronous WSGI > applications. > > The implementation is similar to ngx_http_wsgi_module, and can be used > to quick test asynchronous extensions. 
> > write callable is not implemented (calling it will raise NotImplemented > error), since write callable can not be implemented in an asynchronous > web server without using threads (and twsgi *does* not use threads). > > ngx_http_wsgi_module does the same. > > > TODO > ---- > > * support for suspending iteration over WSGI app iter, when socket is > ?not ready to send data. > ?execution will be resumed when socked is ready again. > > * support for suspend/resume extension, as described here: > ?http://comments.gmane.org/gmane.comp.python.twisted.web/632 > > ?It will have some differences: > > ? ?- the name will be 'wsgiorg.suspend' instead of 'wsgi.pause_output' > > ? ? ?The wsgiorg namespace is used, since the plan is to have it > ? ? ?standardized [1], but it can only be implemented on asynchronous > ? ? ?servers. Please read: http://www.wsgi.org/wsgi/Specifications If a proposal is suggested, it MUST use 'x-wsgiorg.' and not 'wsgiorg.'. Only after it is officially accepted can it use the 'wsgiorg.'. I would question whether you should even be using 'x-wsgiorg.' as as far as I can see from my quick scans of emails, you aren't even supporting WSGI proper as you are dropping support for bits. As such, it isn't WSGI, only WSGIish so how can you justify using the name. Why don't you given it all a completely different name else you will just cause ongoing confusion like you did with when you felt you could reuse the 'mod_wsgi' name for your nginx version even though I asked you to use a different name. It has been an absolute pain seeing discussions on places like #django irc where people don't know when people mention mod_wsgi whether they are talking about Apache of nginx. Graham > ? ?- wsgi.pause_output function will accept an optional timeout, in > ? ? ?milliseconds. > > ? ? ?If timeout is specified, application will be implicitly resumed > ? ? ?when timeout expires. > > ? ?- resume function will return a boolean value. > ? ? 
?True: if execution was suspended and it is going to be resumed > ? ? ?False: if execution was not suspended > > ? ? ?The return value can be used to check if timeout specified in > ? ? ?wsgiorg.suspend expired. > > ? ? ?I'm not sure if a boolean value is the best solution. > ? ? ?Maybe it should return -1 is execution was not suspended, and 0 > ? ? ?otherwise. > > > [1] unlike other proposed async extensions, suspend/resume is much more > ? ?simple and easy to implement, so it is more likely to have a wide > ? ?consensus over the specification. > > > Feedbacks are welcomed. > > > Regards ?Manlio > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: http://mail.python.org/mailman/options/web-sig/graham.dumpleton%40gmail.com > From manlio_perillo at libero.it Fri Apr 9 13:29:48 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 09 Apr 2010 13:29:48 +0200 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web In-Reply-To: References: <4BBF08BC.7090404@libero.it> Message-ID: <4BBF0FAC.4060606@libero.it> Graham Dumpleton ha scritto: > [...] >> - the name will be 'wsgiorg.suspend' instead of 'wsgi.pause_output' >> >> The wsgiorg namespace is used, since the plan is to have it >> standardized [1], but it can only be implemented on asynchronous >> servers. > > Please read: > > http://www.wsgi.org/wsgi/Specifications > > If a proposal is suggested, it MUST use 'x-wsgiorg.' and not > 'wsgiorg.'. Only after it is officially accepted can it use the > 'wsgiorg.'. > Well; since the original propose was using wsgi namespace, I just suggested the use of wsgiorg namespace instead Of course, when it will be implemented I will use a different namespace, until it gots approved. > I would question whether you should even be using 'x-wsgiorg.' 
as as > far as I can see from my quick scans of emails, you aren't even > supporting WSGI proper as you are dropping support for bits. As such, > it isn't WSGI, only WSGIish so how can you justify using the name. > This is not completely correct. The twsgi implementation, as well ngx_http_wsgi_module implementation, does not implement the write callable. The reason is simple: write callable was an huge mistake in WSGI 1.0 since it can not be implemented in an asynchronous web server. But since the write callable **can** be implemented in a middleware (using greenlets) and since middlewares *can* be configured inside WSGI gateway, implementations can still claim to be WSGI 1.0 conformant. > Why don't you given it all a completely different name else you will > just cause ongoing confusion In don't really see how this can cause confusion! > like you did with when you felt you could > reuse the 'mod_wsgi' name for your nginx In fact the first thing I did during code refactoring was to rename it to ngx_http_wsgi_module. > version even though I asked > you to use a different name. It has been an absolute pain seeing > discussions on places like #django irc where people don't know when > people mention mod_wsgi whether they are talking about Apache of > nginx. > Apologies for having underestimated this. Manlio From graham.dumpleton at gmail.com Fri Apr 9 13:55:47 2010 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Fri, 9 Apr 2010 21:55:47 +1000 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web In-Reply-To: <4BBF0FAC.4060606@libero.it> References: <4BBF08BC.7090404@libero.it> <4BBF0FAC.4060606@libero.it> Message-ID: On 9 April 2010 21:29, Manlio Perillo wrote: > Graham Dumpleton ha scritto: >> [...] >>> ? ?- the name will be 'wsgiorg.suspend' instead of 'wsgi.pause_output' >>> >>> ? ? ?The wsgiorg namespace is used, since the plan is to have it >>> ? ? ?standardized [1], but it can only be implemented on asynchronous >>> ? ? 
?servers. >> >> Please read: >> >> ? http://www.wsgi.org/wsgi/Specifications >> >> If a proposal is suggested, it MUST use 'x-wsgiorg.' and not >> 'wsgiorg.'. Only after it is officially accepted can it use the >> 'wsgiorg.'. >> > > Well; since the original propose was using wsgi namespace, I just > suggested the use of wsgiorg namespace instead > > Of course, when it will be implemented I will use a different namespace, > until it gots approved. > >> I would question whether you should even be using 'x-wsgiorg.' as as >> far as I can see from my quick scans of emails, you aren't even >> supporting WSGI proper as you are dropping support for bits. As such, >> it isn't WSGI, only WSGIish so how can you justify using the name. >> > > This is not completely correct. > The twsgi implementation, as well ngx_http_wsgi_module implementation, > does not implement the write callable. So, they aren't compliant with WSGI specification. How is my statement not correct? If not compliant, how can you use WSGI in their names? > The reason is simple: write callable was an huge mistake in WSGI 1.0 > since it can not be implemented in an asynchronous web server. It may not be the preferred way of doing things now, but it was not a huge mistake. There was a lot of stuff back then that used a write() style semantic for returning response content and that was the only way of supporting them without forcing those packages to do a major rewrite to change to an iterable type response. So, it served its purpose. > But since the write callable **can** be implemented in a middleware > (using greenlets) and since middlewares *can* be configured inside WSGI > gateway, implementations can still claim to be WSGI 1.0 conformant. Then only the higher level middleware adapter can even claim to be WSGI compliant and deserve to use the WSGI name. 
Any underlying abstraction you use at the web server interface isn't WSGI and by rights should be called something else so there is no confusion and also shouldn't use 'wsgi' keys in its environ dictionary. Have your high level middleware do a completely remapping of names as appropriate. >> Why don't you given it all a completely different name else you will >> just cause ongoing confusion > > In don't really see how this can cause confusion! So, when someone goes and runs a WSGI application directly against you WSGIish web server interface which you still insist you can describe as being WSGI and it fails because the write() method isn't implemented what is your answr going to be? If something is going to use WSGI name it should implement the full WSGI specification. >> like you did with when you felt you could >> reuse the 'mod_wsgi' name for your nginx > > In fact the first thing I did during code refactoring was to rename it > to ngx_http_wsgi_module. The mod_wsgi name is still used all through http://wiki.nginx.org/NginxNgxWSGIModule that I can tell. Graham From manlio_perillo at libero.it Fri Apr 9 14:15:25 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 09 Apr 2010 14:15:25 +0200 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web In-Reply-To: References: <4BBF08BC.7090404@libero.it> <4BBF0FAC.4060606@libero.it> Message-ID: <4BBF1A5D.7060604@libero.it> Graham Dumpleton ha scritto: > [...] >> But since the write callable **can** be implemented in a middleware >> (using greenlets) and since middlewares *can* be configured inside WSGI >> gateway, implementations can still claim to be WSGI 1.0 conformant. > > Then only the higher level middleware adapter can even claim to be > WSGI compliant and deserve to use the WSGI name. 
Since the middleware is executed inside WSGI gateway, and the gateway can be configured to always execute some middleware, the final application will simply have at disposal a WSGI conformant write callable. > Any underlying > abstraction you use at the web server interface isn't WSGI and by > rights should be called something else so there is no confusion and > also shouldn't use 'wsgi' keys in its environ dictionary. Have your > high level middleware do a completely remapping of names as > appropriate. > This will add useless overhead. >>> Why don't you given it all a completely different name else you will >>> just cause ongoing confusion >> In don't really see how this can cause confusion! > > So, when someone goes and runs a WSGI application directly against you > WSGIish web server interface which you still insist you can describe > as being WSGI and it fails because the write() method isn't > implemented what is your answr going to be? If something is going to > use WSGI name it should implement the full WSGI specification. > To make people happy, I can just have the default implementation include the required middleware by default. >>> like you did with when you felt you could >>> reuse the 'mod_wsgi' name for your nginx >> In fact the first thing I did during code refactoring was to rename it >> to ngx_http_wsgi_module. > > The mod_wsgi name is still used all through > http://wiki.nginx.org/NginxNgxWSGIModule that I can tell. > I still have to update it. Manlio From graham.dumpleton at gmail.com Fri Apr 9 14:21:57 2010 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Fri, 9 Apr 2010 22:21:57 +1000 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web In-Reply-To: <4BBF1A5D.7060604@libero.it> References: <4BBF08BC.7090404@libero.it> <4BBF0FAC.4060606@libero.it> <4BBF1A5D.7060604@libero.it> Message-ID: On 9 April 2010 22:15, Manlio Perillo wrote: > Graham Dumpleton ha scritto: >> [...] 
>>> But since the write callable **can** be implemented in a middleware >>> (using greenlets) and since middlewares *can* be configured inside WSGI >>> gateway, implementations can still claim to be WSGI 1.0 conformant. >> >> Then only the higher level middleware adapter can even claim to be >> WSGI compliant and deserve to use the WSGI name. > > Since the middleware is executed inside WSGI gateway, and the gateway > can be configured to always execute some middleware, the final > application will simply have at disposal a WSGI conformant write callable. Then it isn't really a middleware at all then, but a part of your overall solution. So long as only the complete solution is exposed and is WSGI compliant then fine. But if it is going to be layered in any way such that lower level layers can be used in their own right, then the lower level layers shouldn't really be said to be WSGI if they don't implement full WSGI specification. As much as we all have our complaints about WSGI specification, it is what it is and is all we have right now. Graham >> Any underlying >> abstraction you use at the web server interface isn't WSGI and by >> rights should be called something else so there is no confusion and >> also shouldn't use 'wsgi' keys in its environ dictionary. Have your >> high level middleware do a completely remapping of names as >> appropriate. >> > > This will add useless overhead. > >>>> Why don't you given it all a completely different name else you will >>>> just cause ongoing confusion >>> In don't really see how this can cause confusion! >> >> So, when someone goes and runs a WSGI application directly against you >> WSGIish web server interface which you still insist you can describe >> as being WSGI and it fails because the write() method isn't >> implemented what is your answr going to be? If something is going to >> use WSGI name it should implement the full WSGI specification. 
>> > > To make people happy, I can just have the default implementation include > the required middleware by default. > >>>> like you did with when you felt you could >>>> reuse the 'mod_wsgi' name for your nginx >>> In fact the first thing I did during code refactoring was to rename it >>> to ngx_http_wsgi_module. >> >> The mod_wsgi name is still used all through >> http://wiki.nginx.org/NginxNgxWSGIModule that I can tell. >> > > I still have to update it. > > > Manlio > From manlio_perillo at libero.it Fri Apr 9 14:46:17 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 09 Apr 2010 14:46:17 +0200 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web In-Reply-To: References: <4BBF08BC.7090404@libero.it> <4BBF0FAC.4060606@libero.it> <4BBF1A5D.7060604@libero.it> Message-ID: <4BBF2199.2@libero.it> Graham Dumpleton ha scritto: > On 9 April 2010 22:15, Manlio Perillo wrote: >> Graham Dumpleton ha scritto: >>> [...] >>>> But since the write callable **can** be implemented in a middleware >>>> (using greenlets) and since middlewares *can* be configured inside WSGI >>>> gateway, implementations can still claim to be WSGI 1.0 conformant. >>> Then only the higher level middleware adapter can even claim to be >>> WSGI compliant and deserve to use the WSGI name. >> Since the middleware is executed inside WSGI gateway, and the gateway >> can be configured to always execute some middleware, the final >> application will simply have at disposal a WSGI conformant write callable. > > Then it isn't really a middleware at all then, but a part of your > overall solution. It is just that the gateway has support to direct execution of middlewares, since this make the implementation more flexible. > So long as only the complete solution is exposed and > is WSGI compliant then fine. 
But if it is going to be layered in any > way such that lower level layers can be used in their own right, then > the lower level layers shouldn't really be said to be WSGI if they > don't implement full WSGI specification. As much as we all have our > complaints about WSGI specification, it is what it is and is all we > have right now. > By the way, as a matter of curiosity. WSGI 1.0 states: """The start_response callable must return a write(body_data) callable that takes one positional parameter: a string to be written as part of the HTTP response body. (Note: the write() callable is provided only to support certain existing frameworks' imperative output APIs; it should not be used by new applications or frameworks if it can be avoided. See the Buffering and Streaming section for more details.)""" There is nothing that prevents the write callable to raise an exception. Of course an implementation that always raise a NotImplementedError is going to be useless (for applications that require the write callable), but it seems to me that such an implementation can still claim to conform to WSGI 1.0. > [...] Manlio From renesd at gmail.com Fri Apr 9 15:04:15 2010 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 9 Apr 2010 14:04:15 +0100 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web In-Reply-To: <4BBF2199.2@libero.it> References: <4BBF08BC.7090404@libero.it> <4BBF0FAC.4060606@libero.it> <4BBF1A5D.7060604@libero.it> <4BBF2199.2@libero.it> Message-ID: On Fri, Apr 9, 2010 at 1:46 PM, Manlio Perillo wrote: > > By the way, as a matter of curiosity. > WSGI 1.0 states: > > """The start_response callable must return a write(body_data) callable > that takes one positional parameter: a string to be written as part of > the HTTP response body. 
(Note: the write() callable is provided only to > support certain existing frameworks' imperative output APIs; it should > not be used by new applications or frameworks if it can be avoided. See > the Buffering and Streaming section for more details.)""" > > > There is nothing that prevents the write callable to raise an exception. > > Of course an implementation that always raise a NotImplementedError is > going to be useless (for applications that require the write callable), > but it seems to me that such an implementation can still claim to > conform to WSGI 1.0. > > Agreed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From and-py at doxdesk.com Fri Apr 9 17:26:56 2010 From: and-py at doxdesk.com (And Clover) Date: Fri, 09 Apr 2010 17:26:56 +0200 Subject: [Web-SIG] IIS and Python CGI - how do I see more than just the form data? In-Reply-To: <4BB71921.6374.11014531@jdmain.comcast.net> References: <4BB71921.6374.11014531@jdmain.comcast.net> Message-ID: <4BBF4740.7070805@doxdesk.com> J.D. Main wrote: > I want to see the entire HTTP request with everything inside it. You won't get that as a CGI (or WSGI) application. It is the web server's job to parse the headers of the request, choose what host and script that maps to, and make them available to you (in the environ dictionary in WSGI, or the real environment variables in CGI). The server may perform additional processing on the input/output (eg. buffering and chunking). If you really need low-level detail you'll need to write your own HTTP server, or adapt one from eg. BaseHTTPServer. You almost never need that for normal web applications. > Does IIS actually pass that information to the CGI application or does it just > pass the variables? For a query string as posted, IIS parses the initial HTTP GET command, extracts the path part of that, splits it, and puts the `?...` part in the variable `QUERY_STRING` for you. 
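
A minimal sketch of what a CGI script actually gets to see, assuming the server follows the usual CGI conventions. The helper names `describe_request` and `render` are made up for illustration; the variable names are the standard CGI ones.

```python
import os

# Standard CGI meta-variables that describe the request line.
CGI_KEYS = ("REQUEST_METHOD", "SCRIPT_NAME", "PATH_INFO", "QUERY_STRING")


def describe_request(environ=os.environ):
    """Return the request-related CGI variables (missing ones become '')."""
    return dict((key, environ.get(key, "")) for key in CGI_KEYS)


def render(environ=os.environ):
    """Build a complete plain-text CGI response: header block, then body."""
    info = describe_request(environ)
    lines = ["%s=%s" % (key, info[key]) for key in CGI_KEYS]
    return "Content-Type: text/plain\r\n\r\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    import sys
    # When run by the web server as a CGI script, dump what it handed over.
    sys.stdout.write(render())
```

Dropping something like this next to the real script is a quick way to check what a particular server puts in each variable before writing any routing code.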
> how would my python parse the following:
> http://someserver/someapp/someuser/someupdate?var1=Charlie

Many people do this with URL rewriting, to turn that into something like:

    http://someserver/someapp.py?user=someuser&action=someupdate&var1=Charlie

You don't get a standard URL rewriter in IIS 5 but there are many third-party options.

Personally I hate URL rewriting and try to avoid it wherever possible, because IMO URL format should be in the domain of the application and not a deployment issue.

Unfortunately, if you really want to get rid of the `.py` in the URL, you will need at least some rewriting, because IIS refuses to map files without an extension to script engines. You can make the extension `.p` or `.html` or something else if you like, but you can't get rid of it.

    http://someserver/someapp.py/someuser/someupdate?var1=Charlie

This URL should be parsed into environ members:

    HTTP_HOST: someserver
    SCRIPT_NAME: /someapp.py
    PATH_INFO: /someuser/someupdate
    QUERY_STRING: var1=Charlie

Unfortunately (again), IIS gets this wrong. It sets `PATH_INFO` to:

    /someapp.py/someuser/someupdate

which is contrary to the CGI/WSGI specifications. If you want to sniff path parts as an input mechanism (to do URL routing yourself without rewriting), you will have to detect this situation (probably by sniffing SERVER_SOFTWARE) and hack a fix in. Some libraries and frameworks may do this for you.

(Aside: even this is not certain. This wrong behaviour can be turned off using a little-known IIS config option. However, it's unlikely to be used in the wild, not least because the flag typically breaks ASP.)

Unfortunately (yet again), it's not reliable to send any old characters as part of the path. Because of the poor design of the original CGI standard (carried over into WSGI), any `%nn` escape sequences get decoded before being dropped into SCRIPT_NAME/PATH_INFO (though not, thankfully, QUERY_STRING).
This has the consequence that there are many characters that can't reliably be used in a path part, including slashes, backslashes, control characters, and all non-ASCII characters (since they go through a Unicode decode/encode cycle with what are almost guaranteed to be the wrong charsets). Stick with simple strings like `someuser`. Summary: IIS is a pain. -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From me at gustavonarea.net Fri Apr 9 23:14:17 2010 From: me at gustavonarea.net (Gustavo Narea) Date: Fri, 09 Apr 2010 22:14:17 +0100 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web In-Reply-To: <4BBF08BC.7090404@libero.it> References: <4BBF08BC.7090404@libero.it> Message-ID: <4BBF98A9.4000108@gustavonarea.net> Hello, Maybe I'm missing something obvious, but if the gateway doesn't support applications that return write() callables, then it's not WSGI. A callable that raises an exception does not even count. It's obvious that they must not raise exceptions -- Then what's the point of providing the callable? That said, I *think* it might be OK to disable support for the write() callable *optionally* on a per application basis. For example, the gateway could look at the "requires_write" attribute of the application callable, if any: """ def wsgi_app(environ, start_response): # ... process the request and return a response.... wsgi_app.requires_write = False """ That way, applications which don't use the write() callable can let your gateway know and thus it won't pass one on. We could even standardize this (at wsgi.org) so that any WSGI middleware which wraps an application can expose the "requires_write" attribute of the wrapped application... As long as such a middleware doesn't use write() either. On the other hand, I would avoid using "middleware" in this context for something specific to your implementation as people will believe it's a proper WSGI middleware. 
It'd certainly be *middle*ware, but I'd use something that is not confusing/misleading, like "filter". This is just a suggestion.

Cheers,

-- 
Gustavo Narea .

From graham.dumpleton at gmail.com Sat Apr 10 13:20:36 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Sat, 10 Apr 2010 21:20:36 +1000
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <20100408215334.2AB373A40AA@sparrow.telecommunity.com>
References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> <4BBE39FA.2020802@libero.it> <20100408215334.2AB373A40AA@sparrow.telecommunity.com>
Message-ID:

On 9 April 2010 07:53, P.J. Eby wrote:
>> Also, note that with Nginx (as with Apache, if I'm not wrong), even if
>> application yields small strings, the server can still do some buffering
>> in order to increase performance.
>
> In which case, it's in violation of the WSGI spec. The spec requires
> separately-yielded strings to be flushed to OS-level buffering.

True, and Apache/mod_wsgi does best effort on that. Output filters at the Apache level can be a problem with that, though, as a flush bucket in Apache bucket chains is a request only and an output filter can decide not to flush through all data. For example, mod_deflate may buffer partial data in order to get enough for the next block of compressed data. This is the exception rather than the norm, and if no such output filters exist, then separately yielded strings should be flushed right through to the socket.

So, one can try and satisfy that requirement in WSGI, but in practice it cannot always be achieved because you may have absolutely no control over the underlying web server.
Graham From chris.dent at gmail.com Sat Apr 10 15:04:07 2010 From: chris.dent at gmail.com (Chris Dent) Date: Sat, 10 Apr 2010 14:04:07 +0100 (BST) Subject: [Web-SIG] wsgi and generators (was Re: WSGI and start_response) In-Reply-To: <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> Message-ID: On Thu, 8 Apr 2010, P.J. Eby wrote: > This is also a good time for people to learn that generators are usually a > *very bad* way to write WSGI apps - yielding is for server push or sending > blocks of large files, not tiny strings. In general, if you're yielding more > than one block, you're almost certainly doing WSGI wrong. The typical HTML, > XML, or JSON output that's 99% of a webapp's requests should be transmitted > as a single string, rather than as a series of snippets. Now the thread that included the quoted bit above has died down a bit, I wanted to get back to this. I was surprised when I read this as I found it counter intuitive, different to what I'm doing in practical day to day WSGI app creation and contrary to what my old school network services thinking thinks (start getting stuff queued for the pipe as soon as possible). The apps I'm creating tend to be HTTP APIs that are trying to be RESTful and as such they have singular resources I call entities, and aggregates of those entities I call collections. The APIs provide access to GETting and PUTting entities and GETting collections. Whenever a GET request is made on an entity or collection, the entity or entities involved is serialzed to some string form. 
When there are many entities in a collection, yielding their serialized forms makes semantic sense as well as (it appears) resource utilization sense.

I realize I'm able to build up a complete string or yield via a generator, or a whole bunch of various ways to accomplish things (which is part of why I like WSGI: that content is just an iterator, that's a good thing) so I'm not looking for a statement of what is or isn't possible, but rather opinions. Why is yielding lots of moderately sized strings *very bad*? Why is it _not_ very bad (as presumably others think)?

The model I have in my mind is an application where there is a fair amount of layering and separation between the request handling, the persistence layer, and the serialization system. When a GET for a collection happens, it would call the persistence layer, which would return a generator of entities, which would be passed to the serialization, which would yield a block of output per entity.

Here's some pseudo code:

    def get_collection(environ, start_response):
        try:
            entities = store.get_collection('something')
        except NoSomething:
            start_response('404 Not Found', [])
            return ['sorry']

        start_response('200 OK', [('Content-Type', 'text/html')])
        # yield a block of html per entity
        return serializer.generate_html_from_entities(entities)

"In general, if you're yielding more than one block, you're almost certainly doing WSGI wrong."

I don't understand how this is wrong. It appears to allow nice conceptual separation between the store and serializer while still allowing the memory (and sometimes cpu) efficiencies of generators.

It may be that I'm a special case (some of the serializations can be quite expansive and expensive), but I would be surprised if that were so. So what's going on?

P.S. Speaking of these things, can anyone point me to a JSON tool that can yield a string of JSON as a series of blocks? Assuming a data structure that is a long list of anonymous dicts, json.dumps(the_list) returns a single string.
It would be nice, to fit in the model above I could yield each dict. Better if I could pass the_list as a generator. I can think of ways to create such a tool myself, but I'd like to use an existing one if it exists. From dirkjan at ochtman.nl Sat Apr 10 15:45:26 2010 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Sat, 10 Apr 2010 15:45:26 +0200 Subject: [Web-SIG] wsgi and generators (was Re: WSGI and start_response) In-Reply-To: References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> Message-ID: On Sat, Apr 10, 2010 at 15:04, Chris Dent wrote: > On Thu, 8 Apr 2010, P.J. Eby wrote: >> This is also a good time for people to learn that generators are usually a >> *very bad* way to write WSGI apps - yielding is for server push or sending >> blocks of large files, not tiny strings. ?In general, if you're yielding >> more than one block, you're almost certainly doing WSGI wrong. ?The typical >> HTML, XML, or JSON output that's 99% of a webapp's requests should be >> transmitted as a single string, rather than as a series of snippets. While I agree that it doesn't make sense to yield small strings, it seems to make perfect sense to chunk up larger buffers (e.g. starting at several kilobytes). This is something we do when transmitting Mercurial changesets, for example. Cheers, Dirkjan From pje at telecommunity.com Sat Apr 10 19:52:00 2010 From: pje at telecommunity.com (P.J. 
Eby) Date: Sat, 10 Apr 2010 13:52:00 -0400 Subject: [Web-SIG] wsgi and generators (was Re: WSGI and start_response) In-Reply-To: References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> Message-ID: <20100410175211.EF9903A40AA@sparrow.telecommunity.com> At 02:04 PM 4/10/2010 +0100, Chris Dent wrote: >I realize I'm able to build up a complete string or yield via a >generator, or a whole bunch of various ways to accomplish things >(which is part of why I like WSGI: that content is just an iterator, >that's a good thing) so I'm not looking for a statement of what is or >isn't possible, but rather opinions. Why is yielding lots of moderately >sized strings *very bad*? Why is it _not_ very bad (as presumably >others think)? How bad it is depends a lot on the specific middleware, server architecture, OS, and what else is running on the machine. The more layers of architecture you have, the worse the overhead is going to be. The main reason, though, is that alternating control between your app and the server means increased request lifetime and worsened average request completion latency. Imagine that I have five tasks to work on right now. Let us say each takes five units of time to complete. If I have five units of time right now, I can either finish one task now, or partially finish five. If I work on them in an interleaved way, *none* of the tasks will be done until twenty-five units have elapsed, and so all tasks will have a completion latency of 25 units. If I work on them one at a time, however, then one task will be done in 5 units, the next in 10, and so on -- for an average latency of only 15 units. And that is *not* counting any task switching overhead. 
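
The arithmetic above can be checked with a small sketch. It assumes strict round-robin scheduling with one-unit time slices and no switching cost; that scheduling model is an assumption, since the text only says "interleaved". Under it the interleaved tasks finish between 21 and 25 units, so 25 is the upper bound rather than the exact figure for every task. The function names are illustrative.

```python
def sequential_completion(n_tasks, units_each):
    """Completion time of each task when tasks run one after another."""
    return [units_each * (i + 1) for i in range(n_tasks)]


def round_robin_completion(n_tasks, units_each, slice_units=1):
    """Completion time of each task when they share time in fixed slices."""
    remaining = [units_each] * n_tasks
    finished = [None] * n_tasks
    clock = 0
    while any(r > 0 for r in remaining):
        for i in range(n_tasks):
            if remaining[i] > 0:
                step = min(slice_units, remaining[i])
                clock += step
                remaining[i] -= step
                if remaining[i] == 0:
                    finished[i] = clock
    return finished


def mean(values):
    """Arithmetic mean as a float."""
    return sum(values) / float(len(values))


one_at_a_time = sequential_completion(5, 5)   # [5, 10, 15, 20, 25]
interleaved = round_robin_completion(5, 5)    # [21, 22, 23, 24, 25]
print(mean(one_at_a_time), mean(interleaved)) # 15.0 23.0
```

Either way the gap is the point: finishing tasks one at a time gives a much lower average completion latency than slicing them all together.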
But it's *worse* than that, because by multitasking, my task queue has five things in it the whole time... so I am using more memory and have more management overhead, as well as task switching overhead.

If you translate this to the architecture of a web application, where the "work" is the server serving up bytes produced by the application, then you will see that if the application serves up small chunks, the web server is effectively forced to multitask, and keep more application instances simultaneously running, with worsened latency, increased memory usage, etc.

However, if the application hands its entire output to the server, then the "task" is already *done* -- the server doesn't need the thread or child process for that app anymore, and can have it do something else while the I/O is happening. The OS is in a better position to interleave its own I/O with the app's computation, and the overall request latency is reduced.

Is this a big emergency if your server's mostly idle? Nope. Is it a problem if you're writing a CGI program or some other direct API that doesn't automatically flush I/O? Not at all. I/O buffering works just fine for making sure that the tasks are handed off in bigger chunks.

But if you're coding up a WSGI framework, you don't really want to have it sending tiny chunks of data up a stack of middleware, because WSGI doesn't *have* any buffering, and each chunk is supposed to be sent *immediately*.

Well-written web frameworks usually do some degree of buffering already, for API and performance reasons, so for simplicity's sake, WSGI was spec'd assuming that applications would send data in already-buffered chunks. (Specifically, the simplicity of not needing to have an explicit flushing API, which would otherwise have been necessary if middleware and servers were allowed to buffer the data, too.)
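
One way an application can do that buffering itself, while still producing its output with a generator, is to coalesce small pieces into larger blocks before returning them to the server. This is only a sketch: `coalesce` is a hypothetical helper, the 8192-byte threshold is an arbitrary choice, and it is meant for use inside the application on its own output, not as middleware wrapped around someone else's.

```python
def coalesce(chunks, min_size=8192):
    """Yield blocks of at least min_size bytes built from smaller chunks.

    The final block may be shorter. Empty chunks are skipped.
    """
    pending = []
    pending_size = 0
    for chunk in chunks:
        if not chunk:
            continue
        pending.append(chunk)
        pending_size += len(chunk)
        if pending_size >= min_size:
            # Enough data collected: hand over one larger block.
            yield b"".join(pending)
            pending = []
            pending_size = 0
    if pending:
        # Flush whatever is left at the end of the input.
        yield b"".join(pending)
```

An application that serializes one entity at a time could return `coalesce(serialized_entities)` instead of the raw generator, keeping memory use bounded without handing the server a stream of tiny strings.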
From graham.dumpleton at gmail.com Sun Apr 11 06:32:25 2010 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Sun, 11 Apr 2010 14:32:25 +1000 Subject: [Web-SIG] wsgi and generators (was Re: WSGI and start_response) In-Reply-To: References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> Message-ID: On 10 April 2010 23:04, Chris Dent wrote: > On Thu, 8 Apr 2010, P.J. Eby wrote: > >> This is also a good time for people to learn that generators are usually a >> *very bad* way to write WSGI apps - yielding is for server push or sending >> blocks of large files, not tiny strings. ?In general, if you're yielding >> more than one block, you're almost certainly doing WSGI wrong. ?The typical >> HTML, XML, or JSON output that's 99% of a webapp's requests should be >> transmitted as a single string, rather than as a series of snippets. > > Now the thread that included the quoted bit above has died down a bit, I > wanted to get back to this. I was surprised when I read this as I found > it counter intuitive, different to what I'm doing in practical day to > day WSGI app creation and contrary to what my old school network > services thinking thinks (start getting stuff queued for the pipe as > soon as possible). > > The apps I'm creating tend to be HTTP APIs that are trying to be RESTful > and as such they have singular resources I call entities, and aggregates > of those entities I call collections. The APIs provide access to GETting > and PUTting entities and GETting collections. > > Whenever a GET request is made on an entity or collection, the entity or > entities involved is serialzed to some string form. 
When there are many
> entities in a collection, yielding their serialized forms makes semantic
> sense as well as (it appears) resource utilization sense.
>
> I realize I'm able to build up a complete string or yield via a
> generator, or a whole bunch of various ways to accomplish things
> (which is part of why I like WSGI: that content is just an iterator,
> that's a good thing) so I'm not looking for a statement of what is or
> isn't possible, but rather opinions. Why is yielding lots of moderately
> sized strings *very bad*? Why is it _not_ very bad (as presumably
> others think)?

Because for a WSGI application, you have absolutely no idea what actual web server it may run on and what the overheads are of sending a block of data, let alone many.

In Apache for example, if sent as small blocks, for which a flush has to occur between each, you have to call into the Apache output filter bucket chain on every block. This in itself is not an insubstantial overhead if done many many times. You also have the actual overheads of writing small blocks onto the actual socket.

Let us take an extreme example of a hello world program.

    import sys

    def application(environ, start_response):
        status = '200 OK'
        output = 'Hello World!'

        response_headers = [('Content-type', 'text/plain'),
                            ('Content-Length', str(len(output)))]
        start_response(status, response_headers)

        return [output]

Say for this I can reliably get:

    Requests per second: 2122.56 [#/sec] (mean)

Now change the last line of that hello world program, mirroring a common mistake you see some make, to:

    return output

so that instead of yielding a single string, it yields each character of the string.

About the best I can achieve now is:

    Requests per second: 1973.51 [#/sec] (mean)

This example is only a small string and so only a handful of flushes had to be done. If you break up a large amount of data into many small bits, the overheads will obviously become worse.
More so if you actually had Apache output filters installed such as mod_deflate which actually did work on every flush. In the case above there were no output filters installed.

So, you may get away with it, but you just have to be a bit careful about how fine grained you make it.

Also, since a lot of clients are going to be slow at reading the response, it is questionable how much it would help anyway. Delaying and sending as a complete response may work just as well or better. Certainly, if using a front end such as nginx, returning a complete response will allow the WSGI server to offload the full response quicker because of the way nginx works as a buffer. Dribbling it in bits just means the backend has to do more work.

Overall I would suggest you form complete responses and focus your effort instead on better application caching so that you can deliver responses from a cache and avoid the whole need to generate it in the first place.

Graham

From manlio_perillo at libero.it Sun Apr 11 21:54:07 2010
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Sun, 11 Apr 2010 21:54:07 +0200
Subject: [Web-SIG] [ANN] txwsgi 0.1
Message-ID: <4BC228DF.50603@libero.it>

I'm pleased to announce txwsgi, version 0.1.

txwsgi is a fork of twisted.web.wsgi that, unlike the original implementation, executes the WSGI application in the main I/O thread.

txwsgi implements the proposed x-wsgiorg.suspend extension, which enables support for asynchronous WSGI applications.

Some examples are available in the doc/examples directory of the source distribution.

The project is available on BitBucket: http://bitbucket.org/mperillo/txwsgi/

More information is available in the README file.

The x-wsgiorg.suspend extension is specified in doc/wsgiorg.suspend.rst. I will start a new thread for the official approval process.

I have tried to write as much documentation as possible, also taking into consideration feedback received in previous threads; thanks for the support.
Thanks and regards Manlio Perillo From manlio_perillo at libero.it Sun Apr 11 22:07:32 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Sun, 11 Apr 2010 22:07:32 +0200 Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension Message-ID: <4BC22C04.5050308@libero.it> I'm not sure about the correct procedure to follow, I hope it is not a problem. I here propose the x-wsgiorg.suspend to be accepted as official WSGI extension, using the wsgiorg namespace. The extension is documented in doc/wsgiorg.suspend.rst document in the txwsgi source distribution, available on: http://bitbucket.org/mperillo/txwsgi/ The direct link to the specification is: http://bitbucket.org/mperillo/txwsgi/src/tip/doc/wsgiorg.suspend.rst The extension is implemented in txwsgi implementation for Twisted Web server, and I'm going to implement it in the ngx_http_wsgi_module implementation for Nginx server. The extension is very easy to implement. It also generalize the proposed x-wsgiorg.fdevent extension. Please, see http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_fdevent.py for a comparison of the same example described in fdevent specification, implemented using suspend and Twisted reactor API. Thanks to Christopher Stawarz for writing the fdevent specification, since I was able to use it as a reference. Some additional notes. x-wsgiorg.suspend extension can be implemented in both WSGI 1.0 and the proposed WSGI 2.0. However, due to the lack of start_response support, the usability is limited. 
Thanks and regards Manlio Perillo From manlio_perillo at libero.it Sun Apr 11 22:26:33 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Sun, 11 Apr 2010 22:26:33 +0200 Subject: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web In-Reply-To: <4BBF98A9.4000108@gustavonarea.net> References: <4BBF08BC.7090404@libero.it> <4BBF98A9.4000108@gustavonarea.net> Message-ID: <4BC23079.1010600@libero.it> Gustavo Narea ha scritto: > Hello, > > Maybe I'm missing something obvious, but if the gateway doesn't support > applications that return write() callables, then it's not WSGI. > > A callable that raises an exception does not even count. It's obvious > that they must not raise exceptions -- Then what's the point of > providing the callable? > Nothing is obvious in an official specification ;-). The reason I choose to not completely remove the write callable is because it will raise a nice error message if someone even try to use my implementation to execute a WSGI application that requires the write callable. Moreover some middlewares or applications may assume the write callable exists and the value returned by start_response is not None, even if it is never used. > That said, I *think* it might be OK to disable support for the write() > callable *optionally* on a per application basis. For example, the > gateway could look at the "requires_write" attribute of the application > callable, if any: > """ > def wsgi_app(environ, start_response): > # ... process the request and return a response.... > > wsgi_app.requires_write = False > """ > > That way, applications which don't use the write() callable can let your > gateway know and thus it won't pass one on. > The problem is that applications that requires the write callable, are not aware of this extension. This is really a no problem, IMHO. If you try to execute an application, and you get a NotImplementedError extension, then you *know* that write callable is required. 
Then, you just configure the WSGI gateway to use the required adapter. See http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_write.py for a pratical example using txwsgi. With ngx_http_wsgi_module, you just have to add a wsgi_middleware txwsgi.greenlet write_adapter; directive in Nginx configuration file. > We could even standardize this (at wsgi.org) so that any WSGI middleware > which wraps an application can expose the "requires_write" attribute of > the wrapped application... As long as such a middleware doesn't use > write() either. > > On the other hand, I would avoid using "middleware" in this context for > something specific to your implementation as people will believe it's a > proper WSGI middleware. Yes. I now use the term "adapter". Regards Manlio From manlio_perillo at libero.it Sun Apr 11 22:39:51 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Sun, 11 Apr 2010 22:39:51 +0200 Subject: [Web-SIG] wsgi and generators (was Re: WSGI and start_response) In-Reply-To: <20100410175211.EF9903A40AA@sparrow.telecommunity.com> References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> <20100410175211.EF9903A40AA@sparrow.telecommunity.com> Message-ID: <4BC23397.1020707@libero.it> P.J. Eby ha scritto: > At 02:04 PM 4/10/2010 +0100, Chris Dent wrote: >> I realize I'm able to build up a complete string or yield via a >> generator, or a whole bunch of various ways to accomplish things >> (which is part of why I like WSGI: that content is just an iterator, >> that's a good thing) so I'm not looking for a statement of what is or >> isn't possible, but rather opinions. Why is yielding lots of moderately >> sized strings *very bad*? Why is it _not_ very bad (as presumably >> others think)? 
> > How bad it is depends a lot on the specific middleware, server > architecture, OS, and what else is running on the machine. The more > layers of architecture you have, the worse the overhead is going to be. > > The main reason, though, is that alternating control between your app > and the server means increased request lifetime and worsened average > request completion latency. > This is not completely true. At least this is not how things will work on an asynchronous WSGI implementation. It is true that alternating control between your app and server decrease performance. This can be verified with: http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_cooperative.py However yielding small strings in the application iterator, because the application does not want to buffer data, will usually not cause the problems you describe. Instead, the possible performance problems have been described by Graham. Moreover, when we speak about latency, we should also consider that web page are usually served to human users. In this case, latency is not the only factor to consider. Is it better for the user to wait 3 seconds for some text to appear on the browser window, and then wait for other 5 seconds for the complete page to be rendered, or having to wait 5 seconds for some text to appear on the browser window? > [...] > If you translate this to the architecture of a web application, where > the "work" is the server serving up bytes produced by the application, > then you will see that if the application serves up small chunks, the > web server is effectively forced to multitask, and keep more application > instances simultaneously running, with lowered latency, increased memory > usage, etc. > Yielding small strings *will* not force multitasking. This can be verified with: http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_producer.py WSGI application will be suspended *only* when data can not be sent to the OS socket buffer. 
Yielding several small strings will *usually* not cause socket buffer overflow, unless the client is very slow at reading data. Instead, ironically, you will have a problem when the application yields several big strings. In this case it is better to yield only one very big string, but this is not always feasible. And I'm not sure if it is worse to keep a very big buffer in memory, or to send several not small chunks to the client. > [...] Regards Manlio From graham.dumpleton at gmail.com Mon Apr 12 05:40:31 2010 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Mon, 12 Apr 2010 13:40:31 +1000 Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension In-Reply-To: <4BC22C04.5050308@libero.it> References: <4BC22C04.5050308@libero.it> Message-ID: On 12 April 2010 06:07, Manlio Perillo wrote: > I'm not sure about the correct procedure to follow, I hope it is not a > problem. > > I here propose the x-wsgiorg.suspend to be accepted as official WSGI > extension, using the wsgiorg namespace. > > The extension is documented in doc/wsgiorg.suspend.rst document in the > txwsgi source distribution, available on: > http://bitbucket.org/mperillo/txwsgi/ > > The direct link to the specification is: > http://bitbucket.org/mperillo/txwsgi/src/tip/doc/wsgiorg.suspend.rst > > The extension is implemented in txwsgi implementation for Twisted Web > server, and I'm going to implement it in the ngx_http_wsgi_module > implementation for Nginx server. > > The extension is very easy to implement. > It also generalize the proposed x-wsgiorg.fdevent extension. > > Please, see > http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_fdevent.py > for a comparison of the same example described in fdevent specification, > implemented using suspend and Twisted reactor API. > > > Thanks to Christopher Stawarz for writing the fdevent specification, > since I was able to use it as a reference. > > > Some additional notes. 
> x-wsgiorg.suspend extension can be implemented in both WSGI 1.0 and the
> proposed WSGI 2.0.  However, due to the lack of start_response support,
> the usability is limited.

In the code of demo_fdevent.py it has:

    while True:
        while True:
            ret, num_handles = m.perform()
            if ret != pycurl.E_CALL_MULTI_PERFORM:
                break
        if not num_handles:
            break

        read, write, exc = m.fdset()
        resume = environ['x-wsgiorg.suspend'](1000)
        if read:
            readable(read[0], resume)
            yield ''
        else:
            writeable(write[0], resume)
            yield ''

The registration of file descriptors doesn't occur until after the first suspend() call.

If the underlying reactor that the WSGI server is presumably also using doesn't know about the file descriptors at that point, then how does it know to return from the suspend()?

You are also calling perform() before that point. When calling that, it is presumed you have already done a select/poll to know data is available, but you haven't done that on the first pass through the loop. If you call it and data isn't ready, can't it still block?

This example also illustrates well why I am so against an asynchronous WSGI server extension.

The reason is that, with this extension, your specific application has to be bound to the specific event loop mechanism used by the underlying WSGI server.

I can't, for example, take this application and host it on a different WSGI server which implements the same WSGI extension but uses a different event loop.

If one can't do that, and the application is tied to the event loop and infrastructure of the underlying WSGI server, what is the point of defining and implementing the WSGI extension? It doesn't aid portability at all, so what service is it actually providing?
In that respect, the extension:

http://www.wsgi.org/wsgi/Specifications/fdevent/

provided more, as at least it tried to abstract out a generic interface for registering interest in file descriptor activity, and so perhaps allow the application not to be dependent on the specific event loop used by the underlying WSGI server.

From the open issues of that other specification, however, you can see that there can be problems. It only allowed an application to be interested in a single file descriptor, whereas some packages may need to express interest in more than one.

Quite often an application is never going to be that simple anyway. Some event systems allow a lot more than just watching file descriptors and timeouts. You can't come up with a generic interface for all of these, as they will not be implementable by a different event system which isn't so feature rich or which has a different style of interface. Thus applications are restricted to the lowest common denominator, and likely that is not going to be enough for most, which then have no choice but to bind to the interfaces of a specific event loop. If that is going to be the case anyway, you may as well forget about WSGI and write to that event system's specific web server interface.

So, given that one of the strengths of WSGI is that it is an interface which aids portability of applications to different hosting mechanisms, explain to me what purpose this WSGI extension has if it doesn't aid portability, given that your application still has to be aware of the underlying event loop of that specific system anyway.
Graham From manlio_perillo at libero.it Mon Apr 12 13:25:05 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 12 Apr 2010 13:25:05 +0200 Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension In-Reply-To: References: <4BC22C04.5050308@libero.it> Message-ID: <4BC30311.9070206@libero.it> Graham Dumpleton ha scritto: > On 12 April 2010 06:07, Manlio Perillo wrote: >> I'm not sure about the correct procedure to follow, I hope it is not a >> problem. >> >> I here propose the x-wsgiorg.suspend to be accepted as official WSGI >> extension, using the wsgiorg namespace. >> First of all thanks for the feedback. > [...] > In the code of demo_fdevent.py it has: > > while True: > while True: > ret, num_handles = m.perform() > if ret != pycurl.E_CALL_MULTI_PERFORM: > break > if not num_handles: > break > > read, write, exc = m.fdset() > resume = environ['x-wsgiorg.suspend'](1000) > if read: > readable(read[0], resume) > yield '' > else: > writeable(write[0], resume) > yield '' > > The registration of file descriptors doesn't occur until after the > first suspend() call. > > If the underlying reactor that the WSGI server is presumably also > using doesn't know about the file descriptors at that point, then how > does it now to return from the suspend(). > I'm not sure to understand your concern, but the execution is not suspended when you call x-wsgiorg.suspend, but only when you yield a empty string. In the example, registration of file descriptor occur before application is suspended. > You are also calling perform() before that point. When calling that, > it is presumed you have already done a select/poll to know data is > available, but you haven't done that on first pass through the loop. > If you call that and data isn't ready, can't it block still. > I have to admit that I just copied the example from fdevent specification. However the code seems correct, to me. > This example also illustrates well why I am so against an asynchronous > WSGI server extension. 
>
> The reason is that your specific application has to be with this
> extension bound to the specific event loop mechanism used by the
> underlying WSGI server.
>
> I can't for example take this application and host it on a different
> WSGI server which implements the same WSGI extension but uses a
> different event loop.
>

Instead, I think that being "agnostic" about how it is used is one of the most important features of the x-wsgiorg.suspend extension.

After all, if you think about it, how to interface with a database in a WSGI application is not specified by WSGI. This is done by a separate standard, dbapi2.

For applications that need a template engine, we don't even have a standard interface.

The lack of a standard event API is not a problem that should be discussed in WSGI. It is a problem with the Python community; in fact I would like to define a standard event API *and* a standard efficient network API (the reason is expressed at the end of the README file in txwsgi).

> If one can't do that and it is tied to the event loop and
> infrastructure of the underlying WSGI server, what is the point of
> defining and implementing the WSGI extension as it doesn't aid
> portability at all, so what service is it actually providing?
>

The service it provides is: "allow a WSGI application to suspend its execution and resume it later".

> In that respect, the extension:
>
> http://www.wsgi.org/wsgi/Specifications/fdevent/
>
> provided more as at least it tried to abstract out a generic interface
> for registering interest in file descriptor activity and so perhaps
> allow the application not to be dependent on the specific event loop
> used by the underlying WSGI server.
>

However, exposing this event interface is really something that has little to do with WSGI.

Moreover, the fdevent example is rather inefficient. Suspensions should be minimized, and this is not possible with x-wsgiorg.fdevent but it is possible with x-wsgiorg.suspend.
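As a rough illustration of the suspend/resume flow described in this thread, here is a toy sketch. It assumes only what the messages state: calling environ['x-wsgiorg.suspend'](timeout) returns a resume callable and merely sets a flag, and control goes back to the server when the application yields an empty string. ToyServer, its pending queue, and make_app are hypothetical names invented for this sketch; this is not txwsgi's implementation, and the timeout is accepted but ignored.

```python
import collections


class ToyServer(object):
    """Hypothetical single-request driver, for illustration only."""

    def __init__(self):
        self.suspended = False
        # Stand-in for events that a real event loop would deliver later.
        self.pending = collections.deque()

    def suspend(self, timeout_ms):
        # Only records the request; nothing is suspended until the app yields.
        self.suspended = True

        def resume():
            self.suspended = False
        return resume

    def run(self, app):
        environ = {'x-wsgiorg.suspend': self.suspend}
        sent = []

        def start_response(status, headers):
            sent.append(status)

        for chunk in app(environ, start_response):
            if chunk:
                sent.append(chunk)
            # The app has returned control; wait here if it asked to be suspended.
            while self.suspended:
                self.pending.popleft()()   # fire the next "event"
        return sent


def make_app(server):
    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        resume = environ['x-wsgiorg.suspend'](1000)
        server.pending.append(resume)      # pretend some I/O event will call resume
        yield ''                           # control goes back to the server here
        yield 'done'
    return app


server = ToyServer()
result = server.run(make_app(server))
```

How resume ends up being called is left open, as in the specification text; here it is simply queued by hand.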
>>From the open issues of that other specification however, you can see > that there can be problems. It only allowed an application to be > interested in a single file descriptor where some packages may need to > express interest in more than one. > > Quite often an application is never going to be that simple anyway. > Some event systems allow a lot more than just watching of file > descriptors and timeouts however. You cant come up with a generic > interface for all these as they will not be able to be implemented by > a different event system which isn't so feature rich or which has a > different style of interface. Thus applications are restricted to the > lowest common denominator and likely that is not going to be enough > for most and so have no choice but to bind it to interfaces of > specific event loop. This is the reason why x-wsgiorg.resume is a better API than the one proposed by x-wsgiorg.fdevent, IMHO. > If that is going to be the case anyway, you may > as well forget about WSGI and write to that event systems specific web > server interface. > > So, given that one of the strengths of WSGI is that it is an interface > which aids portability of applications to different hosting > mechanisms, explain to me what purpose this WSGI extension has if it > doesn't aid portability given that your application still has to be > aware of the underlying event loop of that specific system anyway. > I suspect that it is not clear what the real purpose of x-wsgiorg.suspend extension is. The purpose of the extension if to just have a standard interface that WSGI applications can use to take advantage of the possibility, offered by asynchronous server, to suspend execution and resume it later. In the specification text I have explicitly stated that how resume is called is not specified. 
Claiming that x-wsgiorg.suspend does not help in writing portable WSGI applications is similar (well, I'm exaggerating a bit here) to saying that WSGI does not allow writing portable web applications, because real-world WSGI applications need a database, a database engine, and so on.

Regards  Manlio

From graham.dumpleton at gmail.com Mon Apr 12 13:59:59 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Mon, 12 Apr 2010 21:59:59 +1000
Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension
In-Reply-To: <4BC30311.9070206@libero.it>
References: <4BC22C04.5050308@libero.it> <4BC30311.9070206@libero.it>
Message-ID:

On 12 April 2010 21:25, Manlio Perillo wrote:
> Graham Dumpleton wrote:
>> On 12 April 2010 06:07, Manlio Perillo wrote:
>>> I'm not sure about the correct procedure to follow, I hope it is not a
>>> problem.
>>>
>>> I here propose the x-wsgiorg.suspend to be accepted as official WSGI
>>> extension, using the wsgiorg namespace.
>>>
>
> First of all thanks for the feedback.
>
>> [...]
>> In the code of demo_fdevent.py it has:
>>
>>     while True:
>>         while True:
>>             ret, num_handles = m.perform()
>>             if ret != pycurl.E_CALL_MULTI_PERFORM:
>>                 break
>>         if not num_handles:
>>             break
>>
>>         read, write, exc = m.fdset()
>>         resume = environ['x-wsgiorg.suspend'](1000)
>>         if read:
>>             readable(read[0], resume)
>>             yield ''
>>         else:
>>             writeable(write[0], resume)
>>             yield ''
>>
>> The registration of file descriptors doesn't occur until after the
>> first suspend() call.
>>
>> If the underlying reactor that the WSGI server is presumably also
>> using doesn't know about the file descriptors at that point, then how
>> does it now to return from the suspend().
>> > > I'm not sure to understand your concern, but the execution is not > suspended when you call x-wsgiorg.suspend, but only when you yield a > empty string. Okay, missed that. > In the example, registration of file descriptor occur before application > is suspended. > >> You are also calling perform() before that point. When calling that, >> it is presumed you have already done a select/poll to know data is >> available, but you haven't done that on first pass through the loop. >> If you call that and data isn't ready, can't it block still. >> > > I have to admit that I just copied the example from fdevent specification. > However the code seems correct, to me. > >> This example also illustrates well why I am so against an asynchronous >> WSGI server extension. >> >> The reason is that your specific application has to be with this >> extension bound to the specific event loop mechanism used by the >> underlying WSGI server. >> >> I can't for example take this application and host it on a different >> WSGI server which implements the same WSGI extension but uses a >> different event loop. >> > > Instead I think that being "agnostic" about how it is used, in one of > the most important feature of x-wsgiorg.suspend extension. > > After all, if you think about it, how to interface with a database in a > WSGI application is not specified by WSGI. > This is done by a separate standard, dbapi2. > > For applications that need a template engine, we don't even have a > standard inteface. > > The lack of a standard event API is not a problem that should be > discussed in WSGI. > It is a problem with the Python community; in fact I would like to > define a standard event API *and* a standard efficient network API (the > reason is expressed at the end of the README file in txwsgi). 
> >> If one can't do that and it is tied to the event loop and >> infrastructure of the underlying WSGI server, what is the point of >> defining and implementing the WSGI extension as it doesn't aid >> portability at all, so what service is it actually providing? >> > > The service it provides is: "allow a WSGI application to suspend its > execution and resume it later". > >> In that respect, the extension: >> >> http://www.wsgi.org/wsgi/Specifications/fdevent/ >> >> provided more as at least it tried to abstract out a generic interface >> for registering interest in file descriptor activity and so perhaps >> allow the application not to be dependent on the specific event loop >> used by the underlying WSGI server. >> > > However exposing this event interface is really something that has > little to do with WSGI. > > Moreover, the fdevent example is rather inefficient. > Suspensions should be minimized, and this is not possible with > x-wsgiorg.fdevent but it is possible with x-wsgiorg.suspend. > >>>From the open issues of that other specification however, you can see >> that there can be problems. It only allowed an application to be >> interested in a single file descriptor where some packages may need to >> express interest in more than one. >> >> Quite often an application is never going to be that simple anyway. >> Some event systems allow a lot more than just watching of file >> descriptors and timeouts however. You cant come up with a generic >> interface for all these as they will not be able to be implemented by >> a different event system which isn't so feature rich or which has a >> different style of interface. Thus applications are restricted to the >> lowest common denominator and likely that is not going to be enough >> for most and so have no choice but to bind it to interfaces of >> specific event loop. > > This is the reason why x-wsgiorg.resume is a better API than the one > proposed by x-wsgiorg.fdevent, IMHO. 
> >> If that is going to be the case anyway, you may >> as well forget about WSGI and write to that event systems specific web >> server interface. >> >> So, given that one of the strengths of WSGI is that it is an interface >> which aids portability of applications to different hosting >> mechanisms, explain to me what purpose this WSGI extension has if it >> doesn't aid portability given that your application still has to be >> aware of the underlying event loop of that specific system anyway. >> > > I suspect that it is not clear what the real purpose of > x-wsgiorg.suspend extension is. > > The purpose of the extension if to just have a standard interface that > WSGI applications can use to take advantage of the possibility, offered > by asynchronous server, to suspend execution and resume it later. > > In the specification text I have explicitly stated that how resume is > called is not specified. > > > Claiming that x-wsgiorg.suspend does not help writing portable WSGI > application is something similar (well, I'm a bit exaggerating here) of > saying that WSGI does not allow to write portable web applications, > because real world WSGI applications needs a database, a database > engine, and so on. It is not the same. I can take code using a specific database instance and still run that WSGI application, using the same database, on a different WSGI hosting mechanism without really changing anything about how I interact with the WSGI server and its request handling. The concern here is the WSGI interface and interacting with the web server, not other non related third party packages. You are articificially adding something to the WSGI interface as an extension which is pointless. Since you are bound to the specific event loop of the underlying WSGI server or event framework being used you may just as well call a function directly on the WSGI server. 
Adding that function under a key in the WSGI environment and accessing it that way does not in itself provide any value and doesn't somehow make the code easily portable to a different WSGI hosting mechanism using a different event loop as you still have to change lots of other code in your application. In some respects this is similar to the issues between using a WSGI wrapper which injects stuff in WSGI environment versus that functionality being in a separate library. Read: http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html Graham From manlio_perillo at libero.it Mon Apr 12 16:19:56 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 12 Apr 2010 16:19:56 +0200 Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension In-Reply-To: References: <4BC22C04.5050308@libero.it> <4BC30311.9070206@libero.it> Message-ID: <4BC32C0C.6080909@libero.it> Graham Dumpleton ha scritto: > [...] >> >> Claiming that x-wsgiorg.suspend does not help writing portable WSGI >> application is something similar (well, I'm a bit exaggerating here) of >> saying that WSGI does not allow to write portable web applications, >> because real world WSGI applications needs a database, a database >> engine, and so on. > > It is not the same. I can take code using a specific database instance > and still run that WSGI application, using the same database, on a > different WSGI hosting mechanism without really changing anything > about how I interact with the WSGI server and its request handling. > The concern here is the WSGI interface and interacting with the web > server, not other non related third party packages. > This is true. However you can say the same for x-wsgorg.suspend extension. As an example, you can have an application that use a standard event API, and you can run it on several asynchronous WSGI implementations. The difference is that here we speak about event API, and not specific event implementation. 
Note however that we can also speak about specific implementations. As an example, I can implement Twisted reactor API in Nginx, so that WSGI applications using Twisted API can be executed on both Twisted and Nginx. I could do the same with libevent API. It's only a technical problem. > You are articificially adding something to the WSGI interface as an > extension which is pointless. Since you are bound to the specific > event loop of the underlying WSGI server or event framework being used You are not bound to a specific event framework, when using x-wsgiorg.suspend! > you may just as well call a function directly on the WSGI server. > Adding that function under a key in the WSGI environment and accessing > it that way does not in itself provide any value and doesn't somehow > make the code easily portable to a different WSGI hosting mechanism > using a different event loop as you still have to change lots of other > code in your application. > This is absolutely not true! > In some respects this is similar to the issues between using a WSGI > wrapper which injects stuff in WSGI environment versus that > functionality being in a separate library. Read: > > http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html > This is simply wrong. x-wsgiorg.suspend **can not** be implemented as simply library code; it **must** be accessed from environ dictionary. The reason is simple: 1) First of all, in order to suspend application, you **must** return control to the server, and this can only be done by yielding some value in the application generator. 2) In order for the implementation to know if application requested suspension, it must keep a flag in its *internal* state. The x-wsgiorg.suspend function simply sets this flag. Manlio From pje at telecommunity.com Mon Apr 12 16:26:39 2010 From: pje at telecommunity.com (P.J. 
Eby) Date: Mon, 12 Apr 2010 10:26:39 -0400 Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension In-Reply-To: <4BC30311.9070206@libero.it> References: <4BC22C04.5050308@libero.it> <4BC30311.9070206@libero.it> Message-ID: <20100412142642.D25243A411A@sparrow.telecommunity.com> At 01:25 PM 4/12/2010 +0200, Manlio Perillo wrote: >The purpose of the extension if to just have a standard interface that >WSGI applications can use to take advantage of the possibility, offered >by asynchronous server, to suspend execution and resume it later. WSGI has this ability now - it's yielding an empty string. Yielding an empty string is a hint to the server that the application is not ready to send any output, and the server is free to schedule other applications next. And WSGI does not require the application to be rescheduled any time soon. In other words, if saying "don't call me for a while" is the purpose of the extension, it is not needed. As Graham says, the thing that would actually be needed is a way to tell the server when to poll the app again. From manlio_perillo at libero.it Mon Apr 12 16:39:51 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 12 Apr 2010 16:39:51 +0200 Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension In-Reply-To: <20100412142642.D25243A411A@sparrow.telecommunity.com> References: <4BC22C04.5050308@libero.it> <4BC30311.9070206@libero.it> <20100412142642.D25243A411A@sparrow.telecommunity.com> Message-ID: <4BC330B7.3080403@libero.it> P.J. Eby ha scritto: > At 01:25 PM 4/12/2010 +0200, Manlio Perillo wrote: >> The purpose of the extension if to just have a standard interface that >> WSGI applications can use to take advantage of the possibility, offered >> by asynchronous server, to suspend execution and resume it later. > > WSGI has this ability now - it's yielding an empty string. 
Yielding an > empty string is a hint to the server that the application is not ready > to send any output, and the server is free to schedule other > applications next. And WSGI does not require the application to be > rescheduled any time soon. > > In other words, if saying "don't call me for a while" is the purpose of > the extension, it is not needed. As Graham says, the thing that would > actually be needed is a way to tell the server when to poll the app again. > Just yielding an empty string does not give the server some important informations. As an example, with x-wsgi.suspend application can specify a timeout, that tells the server that the application must be resumed before timeout milliseconds have elapsed. And x-wsgi.suspend returns a callable that, when called, tell the server to poll the app again. Regards Manlio From graham.dumpleton at gmail.com Tue Apr 13 01:03:11 2010 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Tue, 13 Apr 2010 09:03:11 +1000 Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension In-Reply-To: <4BC330B7.3080403@libero.it> References: <4BC22C04.5050308@libero.it> <4BC30311.9070206@libero.it> <20100412142642.D25243A411A@sparrow.telecommunity.com> <4BC330B7.3080403@libero.it> Message-ID: On 13 April 2010 00:39, Manlio Perillo wrote: > P.J. Eby ha scritto: >> At 01:25 PM 4/12/2010 +0200, Manlio Perillo wrote: >>> The purpose of the extension if to just have a standard interface that >>> WSGI applications can use to take advantage of the possibility, offered >>> by asynchronous server, to suspend execution and resume it later. >> >> WSGI has this ability now - it's yielding an empty string. ?Yielding an >> empty string is a hint to the server that the application is not ready >> to send any output, and the server is free to schedule other >> applications next. ?And WSGI does not require the application to be >> rescheduled any time soon. 
>>
>> In other words, if saying "don't call me for a while" is the purpose of
>> the extension, it is not needed. As Graham says, the thing that would
>> actually be needed is a way to tell the server when to poll the app again.
>>
>
> Just yielding an empty string does not give the server some important
> information.
>
> As an example, with x-wsgi.suspend application can specify a timeout,
> that tells the server that the application must be resumed before
> timeout milliseconds have elapsed.
>
> And x-wsgi.suspend returns a callable that, when called, tell the server
> to poll the app again.

There are other ways of doing that; the callable doesn't need to be in
the WSGI environment. Because the server is single threaded, the
WSGI server need only record in a global variable for that WSGI
application some state about the current request. The separate
function that notes the suspension can then look that state up and do
what it needs to. In other words, you don't need the WSGI environment to
maintain that relationship.

Having the timeout as an argument is also questionable anyway. All you
really need to do is to tell the WSGI server that I don't want to be
called until I tell it otherwise. The WSGI application could itself
handle the timeout in other ways.

Overall one could do all of this without having to do anything in the
WSGI environment. As PJE points out, it can be done by relying only on
the ability to yield an empty string. Everything else can be in the
application realm with the application normally being bound to a
specific WSGI server/event loop implementation, thus no portability.
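
The scheme described above can be sketched in a few lines. This is only an
illustration under stated assumptions: RequestState, current_request,
suspend, pending_events and serve are hypothetical names, not part of any
real WSGI server, and the deque of callbacks merely stands in for a real
event loop. It uses Python 3 byte strings. The server points a module-level
variable at the state of whichever request it is about to call into, so
suspend() needs nothing from the WSGI environment, and the application
signals "nothing to send yet" by yielding an empty string.

```python
import collections

class RequestState:
    """Per-request bookkeeping kept by the toy server (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.suspended = False

# Module-level slot: the server repoints it before each call into an
# application, so application code always sees its own request state.
current_request = None

# Stand-in for an event loop: callbacks fired one per scheduler turn.
pending_events = collections.deque()

def suspend():
    """Mark the current request as suspended; return a resume callable."""
    state = current_request
    state.suspended = True
    def resume():
        state.suspended = False
    return resume

def serve(apps):
    """Drive several (name, app) pairs round-robin in a single thread."""
    global current_request
    output = {name: [] for name, _ in apps}
    active = [(RequestState(name), iter(app({}, lambda status, headers: None)))
              for name, app in apps]
    while active:
        if pending_events:
            pending_events.popleft()()
        elif all(state.suspended for state, _ in active):
            raise RuntimeError("all requests suspended, nothing can resume them")
        for state, body in list(active):
            if state.suspended:
                continue                  # do not poll a suspended request
            current_request = state       # switch the global to this request
            try:
                chunk = next(body)
            except StopIteration:
                active.remove((state, body))
                continue
            if chunk:                     # empty string: nothing to send yet
                output[state.name].append(chunk)
    return output

def slow_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    result = []
    resume = suspend()
    def on_done():
        result.append(b"slow done")
        resume()
    pending_events.append(lambda: None)   # one idle turn before the data arrives
    pending_events.append(on_done)
    yield b""                             # hint: not ready, schedule others
    yield result[0]

def fast_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"fast 1"
    yield b"fast 2"
```

Calling serve([("slow", slow_app), ("fast", fast_app)]) keeps polling
fast_app while slow_app is suspended, and only calls back into slow_app
after its resume callable has run. Nothing here is portable across servers,
which is the limitation noted above.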
The problem of a middleware not passing through an empty string doesn't even need to be an issue in as much as the application could track when it requested to be suspended and if called into again before the required criteria had been met, it could detect a middleware that wasn't playing by the rules and at least raise an error rather than potentially go into blocking state and tight loop. One could theoretically abstract out an interface for a generic event system, but what you don't want is a general purpose one. You want one which is specifically associated with the concept of a WSGI server. That way the API for it can expose methods which specifically relate to stuff like suspension of calling into the WSGI application for data until specific events occur. That abstract interface could then be implemented as concrete implementations for specific event based WSGI servers. Because a handle to that instance would be needed by the application, including outside context of a request, then a requirement of the interface may be that this handle to the WSGI server event interface be passed as argument to the WSGI application when it is created. So, you could come up with a standard for asynchronous WSGI, but the WSGI specification itself doesn't need to change nor additional keys put in the WSGI environment. Instead, any standardised interfaces exist outside of that and relates more to the interaction between application and underlying WSGI server directly, independent of a specific request in the large part. Graham From manlio_perillo at libero.it Tue Apr 13 10:22:00 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 13 Apr 2010 10:22:00 +0200 Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension In-Reply-To: References: <4BC22C04.5050308@libero.it> <4BC30311.9070206@libero.it> <20100412142642.D25243A411A@sparrow.telecommunity.com> <4BC330B7.3080403@libero.it> Message-ID: <4BC429A8.8020709@libero.it> Graham Dumpleton ha scritto: > [...] 
>> Just yielding an empty string does not give the server some important >> informations. >> >> As an example, with x-wsgi.suspend application can specify a timeout, >> that tells the server that the application must be resumed before >> timeout milliseconds have elapsed. >> >> And x-wsgi.suspend returns a callable that, when called, tell the server >> to poll the app again. > > There are other ways of doing that, the callable doesn't need to be in > the WSGI environment. This is because since it is single threaded, the > WSGI server need only record in a global variable for that WSGI > application some state about the current request. The separate > function to note the suspension can then lookup that and does what it > needs to. In other words, you don't need the WSGI environment to > maintain that relationship. > This seems completely broken, to me; do you have looked at txwsgi implementation? It is true that the WSGI server is single threaded, but there can be multiple concurrent requests processed in this thread. What happens if one request is being suspended and a new one is being processed? As far as I can tell, the new request will note the suspend flag set to True, and will be suspended as well. > Having the timeout as argument is also questionable anyway. All you > really need to do is to tell the WSGI server that I don't want to be > called until I tell it otherwise. The WSGI application could itself > handle the timeout in other ways. > But I can't see the reason why this can not be done by x-wsgiorg.suspend, since it is a very convenient interface. > Overall one could do all of this without having to do anything in the > WSGI environment. As PJE points out, it can be done by relying only on > the ability to yield an empty string. Everything else can be in the > application realm with the application normally being bound to a > specific WSGI server/event loop implementation, thus no portability. 
>

From what I can tell, this is only possible by having a custom variable
in the WSGI environ.

But since I wrote txwsgi for precisely this reason, it should not be
hard to prove that your idea is actually possible to implement (and that
it does not make the implementation more complex than it should be;
think about an implementation written in C).

> The problem of a middleware not passing through an empty string
> doesn't even need to be an issue in as much as the application could
> track when it requested to be suspended and if called into again
> before the required criteria had been met, it could detect a
> middleware that wasn't playing by the rules and at least raise an
> error rather than potentially go into blocking state and tight loop.
>

Yes.
This is something that can be done by an implementation.
Currently txwsgi only checks for the suspend flag when an empty string is
yielded by the application.

> One could theoretically abstract out an interface for a generic event
> system, but what you don't want is a general purpose one. You want one
> which is specifically associated with the concept of a WSGI server.

Why?
This is not required at all.

> That way the API for it can expose methods which specifically relate
> to stuff like suspension of calling into the WSGI application for data
> until specific events occur.

The event API just needs to deal with events, using callbacks to report
data to the application.

Please see the demo_getpage_green.py example in txwsgi.

> [...]
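
For comparison, here is a sketch of how an application might use the
extension as it is described in this thread: the callable found under
x-wsgiorg.suspend takes a timeout in milliseconds and returns a callable
that tells the server to poll the application again. The exact signature in
txwsgi is not shown in this thread, so the one below is an assumption, as
are the toy run loop and the toy.events key, which stand in for a real
asynchronous server. Python 3 byte strings are used.

```python
import time

def make_suspend(state):
    """Build the per-request callable stored under 'x-wsgiorg.suspend'."""
    def suspend(timeout_ms=None):
        state["suspended"] = True
        state["deadline"] = (None if timeout_ms is None
                             else time.monotonic() + timeout_ms / 1000.0)
        def resume():
            state["suspended"] = False
        return resume
    return suspend

def run(app):
    """Toy single-request server loop (assumed, not txwsgi's)."""
    state = {"suspended": False, "deadline": None}
    events = []   # callbacks standing in for a real event loop
    environ = {"x-wsgiorg.suspend": make_suspend(state), "toy.events": events}
    chunks = []
    body = iter(app(environ, lambda status, headers: None))
    while True:
        if state["suspended"]:
            if events:
                events.pop(0)()                # deliver one event
            elif state["deadline"] is None:
                raise RuntimeError("suspended with no timeout and no events")
            elif time.monotonic() >= state["deadline"]:
                state["suspended"] = False     # timeout elapsed: poll again
            else:
                time.sleep(0.001)
            continue
        try:
            chunk = next(body)
        except StopIteration:
            return chunks
        if chunk:
            chunks.append(chunk)

def fetching_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    result = []
    resume = environ["x-wsgiorg.suspend"](50)  # resume within 50 ms at the latest
    def on_data():
        result.append(b"payload")
        resume()
    environ["toy.events"].append(on_data)
    yield b""                                  # suspended until resume() or timeout
    yield result[0] if result else b"timed out"

def idle_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    environ["x-wsgiorg.suspend"](10)           # nobody will call resume()
    yield b""
    yield b"timed out"
```

Because each request gets its own closure over its own state, a second
concurrent request would not see the first one's suspend flag.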
Regards Manlio From manlio_perillo at libero.it Tue Apr 13 12:41:44 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 13 Apr 2010 12:41:44 +0200 Subject: [Web-SIG] WSGI and start_response In-Reply-To: <20100408215334.2AB373A40AA@sparrow.telecommunity.com> References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> <4BBE39FA.2020802@libero.it> <20100408215334.2AB373A40AA@sparrow.telecommunity.com> Message-ID: <4BC44A68.7010709@libero.it> P.J. Eby ha scritto: > At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote: >> Suppose I have an HTML template file, and I want to use a sub request. >> >> ... >> ${subrequest('/header/'} >> ... >> >> The problem with this code is that, since Mako will buffer all generated >> content, the result response body will contain incorrect data. >> >> It will first contain the response body generated by the sub request, >> then the content generated from the Mako template (XXX I have not >> checked this, but I think it is how it works). > > Okay, I'm confused even more now. It seems to me like what you've just > described is something that's fundamentally broken, even if you're not > using WSGI at all. > If you are referring to Mako being turned in a generator, yes, this implementation is rather obscure. I wrote it as a proof of concept. Before this, I wrote a more polite implementation: http://paste.pocoo.org/show/201324/ > >> So, when executing a sub request, it is necessary to flush (that is, >> send to Nginx, in my case) the content generated from the template >> before the sub request is done. > > This seems to only makes sense if you're saying that the subrequest *has > to* send its output directly to the client, rather than to the parent > request. 
Yes, this is how subrequests work in Nginx. And I assume the same is true for Apache. > If the subrequest sends its output to the parent request (as a > sane implementation would), then there is no problem. You are forgetting that Nginx is not an application server. Why should the subrequest output returned to the parent? This would only make it less efficient. > Likewise, if the > subrequest is sent to a buffer that's then inserted into the parent > invocation. > > Anything else seems utterly insane to me, unless you're basically taking > a bunch of legacy CGI code using 'print' statements and hacking it into > something else. (Which is still insane, just differently. ;-) ) > We are talking about subrequest implementation in a efficient web server written in C, like Nginx and Apache. > >> Ah, you are right sorry. >> But this is not required for the Mako example (I was focusing on that >> example). > > As far as I can tell, that example is horribly wrong. ;-) > I agree ;-) > >> But when using the greenlet middleware, and when using the function for >> flushing Mako buffer, some data will be yielded *before* the application >> returns and status and headers are passed to Nginx. > > And that's probably because sharing a single output channel between the > parent and child requests is a bad idea. ;-) > No, this is not specific to subrequests. As an example, here you can find an up to date greenlet adapters: http://bitbucket.org/mperillo/txwsgi/src/tip/txwsgi/greenlet.py The ``write_adapter`` **needs** to yield some data before WSGI application return, because this is how the write callable workd. The exposed ``gsuspend`` function, instead, will cause an empty string to be yielded to the server, before the WSGI application returns. > (Specifically, it's an increase in "temporal coupling", I believe. I > know it's some kind of coupling between functions that's considered bad, > I just don't remember if that's the correct name for it.) 
> Nginx code contains some coupling; I assume this is done because it was designed with efficiency in mind. > [...] > It's true that dropping start_response() means you can't yield empty > strings prior to determining your headers, yes. > > >> > - yielding is for server push or >> > sending blocks of large files, not tiny strings. >> >> Again, consider the use of sub requests. >> yielding a "not large" block is the only choice you have. > > No, it isn't. You can buffer your output and yield empty strings until > you're ready to flush. > As I wrote, this will not work if you want to use subrequest support from Nginx. > > >> Unless, of course, you implement sub request support in pure Python (or >> using SSI - Server Side Include). > > I don't see why it has to be "pure", actually. It just that the > subrequest needs to send data to the invoker rather than sending it > straight to the client. > You may say this, but it is not how subrequests are implemented in Nginx ;-). > That's the bit that's crazy in your example -- it's not a scenario that > WSGI 2 should support, and I'd consider the fact that WSGI 1 lets you do > it to be a bug, not a feature. ;-) > Are you referring to the bad Mako example, or to the ``greenlet_adapter`` idea? > That being said, I can see that removing start_response() closes a > loophole that allows async apps to *potentially* exist under WSGI 1 (as > long as you were able to tolerate the resulting crappy API). > > However, to fix that crappy API requires greenlets or threads, at which > point you might as well just use WSGI 2. In the Nginx case, you can > either do WSGI 1 in C and then use an adapter to provide WSGI 2, or you > can expose your C API to Python and write a small greenlets-using Python > wrapper to support suspending. But this is already implemented using the ``greenlet_adapter`` in txwsgi, and the x-wsgiorg.suspend extension. 
And this implementation has the advantage that the greenlet_adapter
works on **every** WSGI implementation that supports the
x-wsgiorg.suspend extension.

> It would look something like:
>
>     def gateway(request_info, app):
>         # set up environ
>         run(greenlet(lambda: Finished(app(environ))))
>
>     def run(child):
>         while not child.dead:
>             data = child.switch()
>             if isinstance(data, Finished):
>                 send_status(data.status)
>                 send_headers(data.headers)
>                 send_response(data.response)
>             else:
>                 perform_appropriate_action_on(data)
>                 if data.suspend:
>                     # arrange for run(child) to be re-called later, then...
>                     return
>

I have to actually implement this to check if it works.
This can be done using my txwsgi implementation.

If it can help, I can also implement WSGI 2.0 in txwsgi.
WSGI 1.0 and WSGI 2.0 stacks will be independent, no adapter will be
used (they will just share most of the code).

> [...]


Regards Manlio

From graham.dumpleton at gmail.com Tue Apr 13 12:42:43 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Tue, 13 Apr 2010 20:42:43 +1000
Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension
In-Reply-To: <4BC429A8.8020709@libero.it>
References: <4BC22C04.5050308@libero.it> <4BC30311.9070206@libero.it> <20100412142642.D25243A411A@sparrow.telecommunity.com> <4BC330B7.3080403@libero.it> <4BC429A8.8020709@libero.it>
Message-ID:

On 13 April 2010 18:22, Manlio Perillo wrote:
> Graham Dumpleton wrote:
>> [...]
>>> Just yielding an empty string does not give the server some important
>>> information.
>>>
>>> As an example, with x-wsgi.suspend application can specify a timeout,
>>> that tells the server that the application must be resumed before
>>> timeout milliseconds have elapsed.
>>>
>>> And x-wsgi.suspend returns a callable that, when called, tell the server
>>> to poll the app again.
>>
>> There are other ways of doing that, the callable doesn't need to be in
>> the WSGI environment.
>> This is because since it is single threaded, the
>> WSGI server need only record in a global variable for that WSGI
>> application some state about the current request. The separate
>> function to note the suspension can then lookup that and does what it
>> needs to. In other words, you don't need the WSGI environment to
>> maintain that relationship.
>>
>
> This seems completely broken, to me; do you have looked at txwsgi
> implementation?
>
> It is true that the WSGI server is single threaded, but there can be
> multiple concurrent requests processed in this thread.
>
> What happens if one request is being suspended and a new one is being
> processed?
> As far as I can tell, the new request will note the suspend flag set to
> True, and will be suspended as well.

No. I said 'record in a global variable for that WSGI application some
state about the ***current request***'.

When the WSGI server switches to calling into an application for the
purposes of a concurrent request, it switches the global variable to
reference the state of that other request. So, when accessing state via
that global variable from within an application, the application is
always looking only at its own state. The WSGI server obviously will be
interrogating all those active request states to know which are not in a
suspended state and so can be called into.

>> Having the timeout as argument is also questionable anyway. All you
>> really need to do is to tell the WSGI server that I don't want to be
>> called until I tell it otherwise. The WSGI application could itself
>> handle the timeout in other ways.
>
> But I can't see the reason why this can not be done by
> x-wsgiorg.suspend, since it is a very convenient interface.
>
>> Overall one could do all of this without having to do anything in the
>> WSGI environment. As PJE points out, it can be done by relying only on
>> the ability to yield an empty string.
>> Everything else can be in the
>> application realm with the application normally being bound to a
>> specific WSGI server/event loop implementation, thus no portability.
>>
>
> From what I can tell, this is only possible by having a custom variable
> in the WSGI environ

You may not be able to see the alternatives, but they definitely exist.

> But since I wrote txwsgi for precisely this reason, it should not be
> hard to prove that your idea is actually possible to implement (and it
> does not make implementation more complex as it should be, think about
> an implementation written in C).

The notion I am describing is no more difficult in C, except to the
extent that writing against the C API for Python is more verbose than
pure Python. That is going to be the case whatever you are doing; it is
how the C API is and has nothing to do with the solution.

>> The problem of a middleware not passing through an empty string
>> doesn't even need to be an issue in as much as the application could
>> track when it requested to be suspended and if called into again
>> before the required criteria had been met, it could detect a
>> middleware that wasn't playing by the rules and at least raise an
>> error rather than potentially go into blocking state and tight loop.
>>
>
> Yes.
> This is something that can be done by an implementation.
> Currently txwsgi only checks for suspend flag when an empty string is
> yielded by application.
>
>> One could theoretically abstract out an interface for a generic event
>> system, but what you don't want is a general purpose one. You want one
>> which is specifically associated with the concept of a WSGI server.
>
> Why?
> This is not required at all.

And neither is adding your suspend function to the WSGI environment.
You obviously are just not able to grok the bigger picture.
Sure one can have an extension with a very narrow focus which sort of
helps with an issue, but if it doesn't address the bigger issues and
just perpetuates the mess, it is not a good extension.

You show me an async extension for WSGI where you can take the exact
same application code and run it on a completely different async based
WSGI hosting mechanism, then I will listen, but your current idea fails
because your application is still inextricably wedded to the event loop
of the specific underlying framework. You have no abstraction there to
allow portability and this suspend proposal is merely tinkering at the
edges, not solving the real problems and polluting the WSGI environment
when there is no reason to.

Graham

From graham.dumpleton at gmail.com Tue Apr 13 12:53:03 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Tue, 13 Apr 2010 20:53:03 +1000
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <4BC44A68.7010709@libero.it>
References: <165558.15790.qm@web111701.mail.gq1.yahoo.com> <4BBDEF56.8060507@libero.it> <20100408152052.700413A40AA@sparrow.telecommunity.com> <4BBDF8D6.60704@libero.it> <20100408173040.24E873A40AA@sparrow.telecommunity.com> <4BBE1B19.4000601@libero.it> <20100408190946.BF77B3A40AA@sparrow.telecommunity.com> <4BBE39FA.2020802@libero.it> <20100408215334.2AB373A40AA@sparrow.telecommunity.com> <4BC44A68.7010709@libero.it>
Message-ID:

On 13 April 2010 20:41, Manlio Perillo wrote:
>>> So, when executing a sub request, it is necessary to flush (that is,
>>> send to Nginx, in my case) the content generated from the template
>>> before the sub request is done.
>>
>> This seems to only makes sense if you're saying that the subrequest *has
>> to* send its output directly to the client, rather than to the parent
>> request.
>
> Yes, this is how subrequests work in Nginx. And I assume the same is
> true for Apache.

No, that is not true for Apache.

Apache content handlers write output into what is called a bucket
brigade.
For a normal sub request this may be the bucket brigade of the
parent request and so be processed by the output filters of the parent
request. You can however code the mechanics of the sub request to
override that and do something else with the data pushed into that
bucket brigade.

Although it can be done, it gets a bit complicated to have the data
written into the bucket brigade pulled back into the context of the
parent request. This is because the data is written from the context of
the sub request, whereas at the same time the parent request is going to
want to pull it. Thus you need to use threading and fire off the sub
request in its own thread, with a queue of some sort being used to
communicate between the two.

So it is messy, but technically it should be possible, with custom
Python code specific to Apache, to fire off a subrequest and have the
result of the sub request be an iterable which yields data. That data
could itself be yielded from the context of the parent application, such
that the content could then be processed and modified by a WSGI
middleware wrapper.

Graham

From bchesneau at gmail.com Tue Apr 13 12:59:51 2010
From: bchesneau at gmail.com (Benoit Chesneau)
Date: Tue, 13 Apr 2010 12:59:51 +0200
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com>
References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com>
Message-ID:

On Thu, Apr 8, 2010 at 4:53 PM, P.J. Eby wrote:
> At 04:08 PM 4/8/2010 +0200, Manlio Perillo wrote:
>>
>> Hi.
>>
>> Some time ago I objected the decision to remove start_response function
>> from next version WSGI, using as rationale the fact that without
>> start_callable, asynchronous extension are impossible to support.
>>
>> Now I have found that removing start_response will also make impossible
>> to support coroutines (or, at least, some coroutines usage).
>> >> Here is an example (this is the same example I posted few days ago): >> http://paste.pocoo.org/show/199202/ >> >> Forgetting about the write callable, the problem is that the application >> starts to yield data when tmpl.render_unicode function is called. >> >> Please note that this has *nothing* to do with asynchronus applications. >> The code should work with *all* WSGI implementations. >> >> >> In the pasted example, the Mako render_unicode function is "turned" into >> a generator, with a simple function that allows to flush the current >> buffer. >> >> >> Can someone else confirm that this code is impossible to support in WSGI >> 2.0? > > I don't understand why it's a problem. ?See my previous post here: > > http://mail.python.org/pipermail/web-sig/2009-September/003986.html > > for a sketch of a WSGI 1-to-2 converter. ?It takes a WSGI 1 application > callable as the input, and returns a WSGI 2 function. > where is WSGI 2 pep ? I would like to see it first rather than seeig different implementations. - benoit From graham.dumpleton at gmail.com Tue Apr 13 13:13:08 2010 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Tue, 13 Apr 2010 21:13:08 +1000 Subject: [Web-SIG] WSGI and start_response In-Reply-To: References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com> Message-ID: On 13 April 2010 20:59, Benoit Chesneau wrote: > On Thu, Apr 8, 2010 at 4:53 PM, P.J. Eby wrote: >> At 04:08 PM 4/8/2010 +0200, Manlio Perillo wrote: >>> >>> Hi. >>> >>> Some time ago I objected the decision to remove start_response function >>> from next version WSGI, using as rationale the fact that without >>> start_callable, asynchronous extension are impossible to support. >>> >>> Now I have found that removing start_response will also make impossible >>> to support coroutines (or, at least, some coroutines usage). 
>>> >>> Here is an example (this is the same example I posted few days ago): >>> http://paste.pocoo.org/show/199202/ >>> >>> Forgetting about the write callable, the problem is that the application >>> starts to yield data when tmpl.render_unicode function is called. >>> >>> Please note that this has *nothing* to do with asynchronus applications. >>> The code should work with *all* WSGI implementations. >>> >>> >>> In the pasted example, the Mako render_unicode function is "turned" into >>> a generator, with a simple function that allows to flush the current >>> buffer. >>> >>> >>> Can someone else confirm that this code is impossible to support in WSGI >>> 2.0? >> >> I don't understand why it's a problem. ?See my previous post here: >> >> http://mail.python.org/pipermail/web-sig/2009-September/003986.html >> >> for a sketch of a WSGI 1-to-2 converter. ?It takes a WSGI 1 application >> callable as the input, and returns a WSGI 2 function. >> > where is WSGI 2 pep ? I would like to see it first rather than seeig > different implementations. There is no such thing as a WSGI 2.0 PEP and there is no proper concensus either on what it should look like. Thus if you see anything claiming to implement WSGI 2.0, then it isn't and you should only view it as an experimental proposal. You are warned. :-) Graham From dirkjan at ochtman.nl Tue Apr 13 13:20:58 2010 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Tue, 13 Apr 2010 13:20:58 +0200 Subject: [Web-SIG] WSGI and start_response In-Reply-To: References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com> Message-ID: On Tue, Apr 13, 2010 at 13:13, Graham Dumpleton wrote: > There is no such thing as a WSGI 2.0 PEP and there is no proper > concensus either on what it should look like. Thus if you see anything > claiming to implement WSGI 2.0, then it isn't and you should only view > it as an experimental proposal. You are warned. 
:-) Do you (or someone else) have a status on where WSGI 2 is? IIRC WSGI 1 isn't really usable with Python 3.x, so it seems about time we get something going again (AIUI this is blocking Werkzeug from being ported to 3.x, for example). Cheers, Dirkjan From manlio_perillo at libero.it Tue Apr 13 13:33:47 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 13 Apr 2010 13:33:47 +0200 Subject: [Web-SIG] WSGI and start_response In-Reply-To: References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com> Message-ID: <4BC4569B.8040407@libero.it> Dirkjan Ochtman ha scritto: > On Tue, Apr 13, 2010 at 13:13, Graham Dumpleton > wrote: >> There is no such thing as a WSGI 2.0 PEP and there is no proper >> concensus either on what it should look like. Thus if you see anything >> claiming to implement WSGI 2.0, then it isn't and you should only view >> it as an experimental proposal. You are warned. :-) > > Do you (or someone else) have a status on where WSGI 2 is? IIRC WSGI 1 > isn't really usable with Python 3.x, so it seems about time we get > something going again (AIUI this is blocking Werkzeug from being > ported to 3.x, for example). > WSGI 2.0 ideas are here: http://wsgi.org/wsgi/WSGI_2.0 But it does not have support for Python 3.x. Some corrections to WSGI 1.0 are here: http://wsgi.org/wsgi/Amendments_1.0 You may add support to Python 3.x in existing WSGI 1.0 implementation, but your implementation will end up to something that is no more WSGI 1.0. 
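
As a rough illustration of what dropping start_response() means in
practice, here is a naive adapter. It assumes the draft shape in which an
application is called with environ only and returns a (status, headers,
body) triple; that shape and the name wsgi1_to_wsgi2 are assumptions made
for this sketch. It is not the converter P.J. Eby posted, and it buffers
the whole response, so it gives up exactly the streaming and early-yield
behaviour debated in this thread.

```python
def wsgi1_to_wsgi2(app):
    """Wrap a WSGI 1 app as a callable returning (status, headers, body).

    Hypothetical helper: buffers everything, so no streaming.
    """
    def wsgi2_app(environ):
        captured = {}
        written = []                      # chunks sent via the legacy write()

        def start_response(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = list(headers)
            return written.append         # stands in for the write callable

        result = app(environ, start_response)
        try:
            body = list(result)           # runs the app to completion
        finally:
            close = getattr(result, "close", None)
            if close is not None:
                close()
        return captured["status"], captured["headers"], written + body
    return wsgi2_app


def hello(environ, start_response):
    """Ordinary WSGI 1 generator application used as input."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"Hello, "
    yield b"world"


def legacy(environ, start_response):
    """WSGI 1 application that uses the write callable."""
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"written ")
    return [b"returned"]
```

With these definitions, wsgi1_to_wsgi2(hello)({}) returns the status
string, the header list and a list of body chunks in one go.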
Manlio From graham.dumpleton at gmail.com Tue Apr 13 13:39:30 2010 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Tue, 13 Apr 2010 21:39:30 +1000 Subject: [Web-SIG] WSGI and start_response In-Reply-To: References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com> Message-ID: On 13 April 2010 21:20, Dirkjan Ochtman wrote: > On Tue, Apr 13, 2010 at 13:13, Graham Dumpleton > wrote: >> There is no such thing as a WSGI 2.0 PEP and there is no proper >> concensus either on what it should look like. Thus if you see anything >> claiming to implement WSGI 2.0, then it isn't and you should only view >> it as an experimental proposal. You are warned. :-) > > Do you (or someone else) have a status on where WSGI 2 is? IIRC WSGI 1 > isn't really usable with Python 3.x, so it seems about time we get > something going again (AIUI this is blocking Werkzeug from being > ported to 3.x, for example). WSGI 2.0 isn't about Python 3.X, it is about removing start_response(). Python 3.X support can be catered for by clarifications in the WSGI 1.0 specification and to a degree how Python 3.X is implemented is dictated by existing practice in the form of what wsgiref implemented in Python 3.1. The Apache/mod_wsgi implementation has had Python 3.X support for over a year using the same interpretation. I believe latest CherryPy WSGI server code is also providing Python 3.X support. Apache/mod_wsgi tried to push the issue of a new definition/specification to cater for Python 3.X by actually identifying itself as WSGI 1.1. The attempts at ratifying that didn't happen however, but then no one has turned around either to complain about Apache/mod_wsgi identifying itself as WSGI 1.1 and so it has been left that way and not reverted to WSGI 1.0. So, in effect existing practice has determined how WSGI on Python 3.X should be implemented and given how long this has been going on, nothing is likely to change that now. 
You can however see a summary of how it is being interpreted at: http://code.google.com/p/modwsgi/wiki/SupportForPython3X Graham From dirkjan at ochtman.nl Tue Apr 13 13:46:19 2010 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Tue, 13 Apr 2010 13:46:19 +0200 Subject: [Web-SIG] WSGI and start_response In-Reply-To: References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com> Message-ID: On Tue, Apr 13, 2010 at 13:39, Graham Dumpleton wrote: > WSGI 2.0 isn't about Python 3.X, it is about removing start_response(). Okay, so it is orthogonal, right? > Python 3.X support can be catered for by clarifications in the WSGI > 1.0 specification and to a degree how Python 3.X is implemented is > dictated by existing practice in the form of what wsgiref implemented > in Python 3.1. The Apache/mod_wsgi implementation has had Python 3.X > support for over a year using the same interpretation. I believe > latest CherryPy WSGI server code is also providing Python 3.X support. > > Apache/mod_wsgi tried to push the issue of a new > definition/specification to cater for Python 3.X by actually > identifying itself as WSGI 1.1. The attempts at ratifying that didn't > happen however, but then no one has turned around either to complain > about Apache/mod_wsgi identifying itself as WSGI 1.1 and so it has > been left that way and not reverted to WSGI 1.0. > > So, in effect existing practice has determined how WSGI on Python 3.X > should be implemented and given how long this has been going on, > nothing is likely to change that now. You can however see a summary of > how it is being interpreted at: > > ?http://code.google.com/p/modwsgi/wiki/SupportForPython3X So that page has 8 points required for 3.x, which apparently wsgiref and CherryPy also adhere to? And you have 5 simplifications that, as far as we know, have only been been implemented in mod_wsgi? 
Cheers, Dirkjan

From graham.dumpleton at gmail.com Tue Apr 13 14:01:25 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Tue, 13 Apr 2010 22:01:25 +1000
Subject: [Web-SIG] WSGI and start_response
In-Reply-To:
References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com>
Message-ID:

On 13 April 2010 21:46, Dirkjan Ochtman wrote:
> On Tue, Apr 13, 2010 at 13:39, Graham Dumpleton
> wrote:
>> WSGI 2.0 isn't about Python 3.X, it is about removing start_response().
>
> Okay, so it is orthogonal, right?
>
>> Python 3.X support can be catered for by clarifications in the WSGI
>> 1.0 specification and to a degree how Python 3.X is implemented is
>> dictated by existing practice in the form of what wsgiref implemented
>> in Python 3.1. The Apache/mod_wsgi implementation has had Python 3.X
>> support for over a year using the same interpretation. I believe
>> latest CherryPy WSGI server code is also providing Python 3.X support.
>>
>> Apache/mod_wsgi tried to push the issue of a new
>> definition/specification to cater for Python 3.X by actually
>> identifying itself as WSGI 1.1. The attempts at ratifying that didn't
>> happen however, but then no one has turned around either to complain
>> about Apache/mod_wsgi identifying itself as WSGI 1.1 and so it has
>> been left that way and not reverted to WSGI 1.0.
>>
>> So, in effect existing practice has determined how WSGI on Python 3.X
>> should be implemented and given how long this has been going on,
>> nothing is likely to change that now. You can however see a summary of
>> how it is being interpreted at:
>>
>>  http://code.google.com/p/modwsgi/wiki/SupportForPython3X
>
> So that page has 8 points required for 3.x, which apparently wsgiref
> and CherryPy also adhere to?
>
> And you have 5 simplifications that, as far as we know, have only
> been implemented in mod_wsgi?

They are not simplifications. They are clarifications or just describing
existing practice.
They are not necessarily mod_wsgi specific.

(1) Implemented by everyone already, otherwise cgi.FieldStorage doesn't
work.

(2) Implemented by the more reputable WSGI servers such as CherryPy
and Paste WSGI servers. Not implemented by wsgiref. An implementation
that doesn't implement it and which doesn't discard request content
yet supports HTTP 1.1 request pipelining is arguably broken. A
correctly implemented WSGI middleware already has to cope with an
empty string as end sentinel anyway to detect premature end of input.

(3) Only implemented by Apache/mod_wsgi that I know of for sure.
Although it makes it work like a proper file-like object, as the
specification suggests wsgi.input is, it would raise potential issues
with WSGI middleware which depends on it then not being usable with
WSGI 1.0, and so the intent was that it not be made a requirement.

(4) This is a statement about WSGI middleware, not the WSGI server. A
WSGI middleware that doesn't do it is arguably broken and can generate
incorrect data. Thus it is a clarification of obligations only.

(5) A WSGI server shouldn't do this now as doing so is technically a
violation of HTTP. Thus it is a clarification of obligations only.

You can find more detail on these at:

  http://blog.dscpl.com.au/2009/10/details-on-wsgi-10-amendmentsclarificat.html

Graham

From dirkjan at ochtman.nl Tue Apr 13 14:12:07 2010
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Tue, 13 Apr 2010 14:12:07 +0200
Subject: [Web-SIG] WSGI and start_response
In-Reply-To:
References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com>
Message-ID:

On Tue, Apr 13, 2010 at 14:01, Graham Dumpleton wrote:
> They are not simplifications. They are clarifications or just describing
> existing practice. They are not necessarily mod_wsgi specific.

Sorry, I didn't mean to imply they were mod_wsgi specific, and they
definitely look sane/like an improvement to me.

Okay, so, would it be valuable to have both 1.1 and 2.0? I.e.
if we start writing a new spec now, should we aim for just a 2.0 that
copes with everything at once, or first work on a 1.1 that has
clarifications/improvements and is otherwise relatively compatible
with 1.0 (but also with Python 3.x)?

Who's up for some PEP-writing?

Cheers,

Dirkjan

From graham.dumpleton at gmail.com Tue Apr 13 14:46:19 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Tue, 13 Apr 2010 22:46:19 +1000
Subject: [Web-SIG] WSGI and start_response
In-Reply-To:
References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com>
Message-ID:

On 13 April 2010 22:12, Dirkjan Ochtman wrote:
> On Tue, Apr 13, 2010 at 14:01, Graham Dumpleton
> wrote:
>> They are not simplifications. They are clarifications or just describing
>> existing practice. They are not necessarily mod_wsgi specific.
>
> Sorry, I didn't mean to imply they were mod_wsgi specific, and they
> definitely look sane/like an improvement to me.
>
> Okay, so, would it be valuable to have both 1.1 and 2.0? I.e. if we
> start writing a new spec now, should we aim for just a 2.0 that copes
> with everything at once, or first work on a 1.1 that has
> clarifications/improvements and is otherwise relatively compatible
> with 1.0 (but also with Python 3.x)?
>
> Who's up for some PEP-writing?

The last attempt was to have WSGI 1.1 as clarifications and Python 3.X.

And when I say 'last attempt', yes there have been people who have
stepped up to try and get this to happen in the past. I think you
would be the 3rd time, excluding me in general having tried to push it
in the past and also given up.

You really should perhaps look back through the archive of WEB-SIG
posts on Google Groups to understand the history and how this always
seems to just go around in circles.
:-)

Graham

From dirkjan at ochtman.nl Tue Apr 13 15:55:09 2010
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Tue, 13 Apr 2010 15:55:09 +0200
Subject: [Web-SIG] WSGI and start_response
In-Reply-To:
References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com>
Message-ID:

On Tue, Apr 13, 2010 at 14:46, Graham Dumpleton wrote:
> The last attempt was to have WSGI 1.1 as clarifications and Python 3.X.
>
> And when I say 'last attempt', yes there have been people who have
> stepped up to try and get this to happen in the past. I think you
> would be the 3rd time, excluding me in general having tried to push it
> in the past and also given up.
>
> You really should perhaps look back through the archive of WEB-SIG
> posts on Google Groups to understand the history and how this always
> seems to just go around in circles. :-)

I've been on Web-SIG for quite a while now, exactly to keep track of
these issues.

Since there doesn't seem to be much traction, I figured it would be
time to just get a new PEP together. To limit the amount of work, I'd
go in the direction of having a single WSGI 2.0 PEP incorporating your
suggestions (maybe minus the number 3), everything required for Python
3 (as outlined by your wiki page).

The idea is that, as soon as there is a draft PEP, we can circulate it
on Web-SIG for a little bit before bringing it to python-dev. No
single person should really be able to block all this.

I'll try to write something up that incorporates all of your notes.

Cheers,

Dirkjan

From graham.dumpleton at gmail.com Wed Apr 14 03:57:35 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Wed, 14 Apr 2010 11:57:35 +1000
Subject: [Web-SIG] WSGI and start_response
In-Reply-To:
References: <4BBDE35A.3050101@libero.it>
Message-ID:

On 13 April 2010 23:55, Dirkjan Ochtman wrote:
> On Tue, Apr 13, 2010 at 14:46, Graham Dumpleton
> wrote:
>> The last attempt was to have WSGI 1.1 as clarifications and Python 3.X.
>>
>> And when I say 'last attempt', yes there have been people who have
>> stepped up to try and get this to happen in the past. I think you
>> would be the 3rd time, excluding me in general having tried to push it
>> in the past and also given up.
>>
>> You really should perhaps look back through the archive of WEB-SIG
>> posts on Google Groups to understand the history and how this always
>> seems to just go around in circles. :-)
>
> I've been on Web-SIG for quite a while now, exactly to keep track of
> these issues.
>
> Since there doesn't seem to be much traction, I figured it would be
> time to just get a new PEP together. To limit the amount of work, I'd
> go in the direction of having a single WSGI 2.0 PEP

Limiting your work by going to WSGI 2.0 and dropping start_response()
potentially causes more work for anyone implementing WSGI.

This is because if no official statement about Python 3.X is made
about WSGI 1.0 (with start_response()), and the only option is WSGI
2.0 (without start_response()), people are likely going to be less
inclined to move to Python 3.X for WSGI because to do so means
potentially more drastic changes to their code.

People talk about WSGI 2.0 -> WSGI 1.0 adapters, but if WSGI 1.0 on
Python 3.X is formalised, you can do that.

> incorporating your
> suggestions (maybe minus the number 3), everything required for Python
> 3 (as outlined by your wiki page).

The whole point of not doing (3) in WSGI 1.1 was that it was seen that
it could only be done, if it were to be done, at a point where the API
was broken anyway, ie., when start_response() was dropped in WSGI 2.0.
So, it shouldn't be ruled out in WSGI 2.0 as a possible change. If you
do, then it likely would never be able to be incorporated. So WSGI 2.0
is the only chance to clean that up.

> The idea is that, as soon as there is a draft PEP, we can circulate it
> on Web-SIG for a little bit before bringing it to python-dev. No
> single person should really be able to block all this.
>
> I'll try to write something up that incorporates all of your notes.

Graham

From manlio_perillo at libero.it Wed Apr 14 18:53:28 2010
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 14 Apr 2010 18:53:28 +0200
Subject: [Web-SIG] WSGI and start_response
In-Reply-To:
References: <4BBDE35A.3050101@libero.it> <20100408145342.B3BFD3A40AA@sparrow.telecommunity.com>
Message-ID: <4BC5F308.6080001@libero.it>

Dirkjan Ochtman wrote:
> On Tue, Apr 13, 2010 at 14:46, Graham Dumpleton
> wrote:
>> The last attempt was to have WSGI 1.1 as clarifications and Python 3.X.
>>
>> And when I say 'last attempt', yes there have been people who have
>> stepped up to try and get this to happen in the past. I think you
>> would be the 3rd time, excluding me in general having tried to push it
>> in the past and also given up.
>>
>> You really should perhaps look back through the archive of WEB-SIG
>> posts on Google Groups to understand the history and how this always
>> seems to just go around in circles. :-)
>
> I've been on Web-SIG for quite a while now, exactly to keep track of
> these issues.
>
> Since there doesn't seem to be much traction, I figured it would be
> time to just get a new PEP together. To limit the amount of work, I'd
> go in the direction of having a single WSGI 2.0 PEP incorporating your
> suggestions (maybe minus the number 3), everything required for Python
> 3 (as outlined by your wiki page).
>

If you volunteer for this task, I have some suggestions:

* target WSGI 1.1, not WSGI 2.0
* take the original WSGI 1.0 spec text
* start to integrate all changes documented by Graham
* I would really like to have changes integrated as a series of diffs,
  using <ins> and <del> HTML elements.

  Unfortunately docutils seems to not have support for this, but it
  should not be hard to implement. I can help.
* You should keep a separate, unofficial document, with the rationale of
  the changes.
  You can just copy the content of Graham's blog post and reformat
  it, if this is ok for Graham
* For each of the main changes, start a thread on this mailing list
  asking for a vote.
  If, after 1 week, there is no vote against it, consider it approved


If we are really going to approve WSGI 1.1, I have a request: remove the
``write`` callable.
Rationale:
* it was added in WSGI 1.0 only for compatibility
* new code does not use it
* this will force applications under development that still use the
  ``write`` callable to be fixed. See work on Mercurial
* it is very easy for current implementations to support both WSGI 1.0
  and WSGI 1.1
* legacy applications will continue to work
* removing the ``write`` callable will make middleware easier to
  write


Thanks

Manlio

From graham.dumpleton at gmail.com Thu Apr 15 01:35:00 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Thu, 15 Apr 2010 09:35:00 +1000
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <4BC5F308.6080001@libero.it>
References: <4BBDE35A.3050101@libero.it> <4BC5F308.6080001@libero.it>
Message-ID:

On 15 April 2010 02:53, Manlio Perillo wrote:
> Dirkjan Ochtman wrote:
>> On Tue, Apr 13, 2010 at 14:46, Graham Dumpleton
>> wrote:
>>> The last attempt was to have WSGI 1.1 as clarifications and Python 3.X.
>>>
>>> And when I say 'last attempt', yes there have been people who have
>>> stepped up to try and get this to happen in the past. I think you
>>> would be the 3rd time, excluding me in general having tried to push it
>>> in the past and also given up.
>>>
>>> You really should perhaps look back through the archive of WEB-SIG
>>> posts on Google Groups to understand the history and how this always
>>> seems to just go around in circles. :-)
>>
>> I've been on Web-SIG for quite a while now, exactly to keep track of
>> these issues.
>>
>> Since there doesn't seem to be much traction, I figured it would be
>> time to just get a new PEP together.
>> To limit the amount of work, I'd
>> go in the direction of having a single WSGI 2.0 PEP incorporating your
>> suggestions (maybe minus the number 3), everything required for Python
>> 3 (as outlined by your wiki page).
>>
>
> If you volunteer for this task, I have some suggestions:
>
> * target WSGI 1.1, not WSGI 2.0
> * take the original WSGI 1.0 spec text
> * start to integrate all changes documented by Graham
> * I would really like to have changes integrated as a series of diffs,
>   using <ins> and <del> HTML elements.
>
>   Unfortunately docutils seems to not have support for this, but it
>   should not be hard to implement. I can help.
> * You should keep a separate, unofficial document, with the rationale of
>   the changes.
>   You can just copy the content of Graham's blog post and reformat
>   it, if this is ok for Graham
> * For each of the main changes, start a thread on this mailing list
>   asking for a vote.
>   If, after 1 week, there is no vote against it, consider it approved
>
>
> If we are really going to approve WSGI 1.1, I have a request: remove the
> ``write`` callable.
> Rationale:
> * it was added in WSGI 1.0 only for compatibility
> * new code does not use it
> * this will force applications under development that still use the
>   ``write`` callable to be fixed. See work on Mercurial
> * it is very easy for current implementations to support both WSGI 1.0
>   and WSGI 1.1
> * legacy applications will continue to work
> * removing the ``write`` callable will make middleware easier to
>   write

This is in part why an attempt to come up with a new WSGI 1.X
specification, even one that covers just the obvious and justified
changes because of actual problems, keeps failing. That is, parties
with vested interests or a desire for their little pet change to be
made because it helps them, keep poking their heads up and disrupting
the process.
Given how long this has taken, all that should happen at this point is
a codification of what wsgiref implements for Python 3.X along with
readline() argument change and obvious other clarifications where
actual use shows the original WSGI specification was wrong or where it
wasn't practical.

If that isn't done, we will be here in another year still arguing
about whether some aspect of the specification should be changed or
removed based on some individual's perceived need.

Such a significant change as removing the requirement for write()
should also not be done within a minor version of the WSGI
specification because anything that works with WSGI 1.0 should still
work with WSGI 1.1 and vice versa. On that basis it can't really be
entertained until WSGI 2.0 where incompatible changes would be
allowed.

Graham

From dirkjan at ochtman.nl Thu Apr 15 08:45:47 2010
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Thu, 15 Apr 2010 08:45:47 +0200
Subject: [Web-SIG] WSGI and start_response
In-Reply-To:
References: <4BBDE35A.3050101@libero.it> <4BC5F308.6080001@libero.it>
Message-ID:

On Thu, Apr 15, 2010 at 01:35, Graham Dumpleton wrote:
> If that isn't done, we will be here in another year still arguing
> about whether some aspect of the specification should be changed or
> removed based on some individual's perceived need.

I agree, WSGI 1.1 should be more like HTML5 in that it tries to more
meticulously describe/unify existing implementations.

> Such a significant change as removing the requirement for write()
> should also not be done within a minor version of the WSGI
> specification because anything that works with WSGI 1.0 should still
> work with WSGI 1.1 and vice versa. On that basis it can't really be
> entertained until WSGI 2.0 where incompatible changes would be
> allowed.

I think it's a good idea to consider for 2.0, certainly.
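
For reference, the two response styles being debated can be sketched as
below. This is a minimal illustration, not code from the thread: the
application names and the toy ``run`` driver are hypothetical, and a real
server does considerably more (header handling, ``exc_info``, error
cases).

```python
def app_with_write(environ, start_response):
    # Legacy style: push body data through the write() callable that
    # start_response() returns.
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"hello ")
    write(b"world\n")
    return []


def app_with_iterable(environ, start_response):
    # Usual style: ignore write() and return an iterable of byte strings.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world\n"]


def run(app):
    """Hypothetical toy driver: collect every body chunk an app sends."""
    sent = []

    def start_response(status, response_headers, exc_info=None):
        # The returned callable plays the role of write(body_data).
        return sent.append

    result = app({}, start_response)
    try:
        sent.extend(result)
    finally:
        # WSGI requires close() to be called on the result if it exists.
        if hasattr(result, "close"):
            result.close()
    return b"".join(sent)
```

Both applications produce the same body. Middleware wrapping the first
one has to intercept ``write`` as well as the returned iterable, which
seems to be the extra burden Manlio's last rationale point refers to.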
Cheers,

Dirkjan

From manlio_perillo at libero.it Thu Apr 15 11:09:52 2010
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 15 Apr 2010 11:09:52 +0200
Subject: [Web-SIG] WSGI and start_response
In-Reply-To:
References: <4BBDE35A.3050101@libero.it> <4BC5F308.6080001@libero.it>
Message-ID: <4BC6D7E0.2090002@libero.it>

Dirkjan Ochtman wrote:
> [...]
>> Such a significant change as removing the requirement for write()
>> should also not be done within a minor version of the WSGI
>> specification because anything that works with WSGI 1.0 should still
>> work with WSGI 1.1 and vice versa. On that basis it can't really be
>> entertained until WSGI 2.0 where incompatible changes would be
>> allowed.
>
> I think it's a good idea to consider for 2.0, certainly.
>

Ehm, the purpose of WSGI 2.0 is precisely to remove start_response and
write callable with it...

Manlio

From dirkjan at ochtman.nl Thu Apr 15 11:17:43 2010
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Thu, 15 Apr 2010 11:17:43 +0200
Subject: [Web-SIG] WSGI and start_response
In-Reply-To: <4BC6D7E0.2090002@libero.it>
References: <4BBDE35A.3050101@libero.it> <4BC5F308.6080001@libero.it> <4BC6D7E0.2090002@libero.it>
Message-ID:

On Thu, Apr 15, 2010 at 11:09, Manlio Perillo wrote:
> Ehm, the purpose of WSGI 2.0 is precisely to remove start_response and
> write callable with it...

Right, there you go!

Cheers,

Dirkjan

From dirkjan at ochtman.nl Thu Apr 15 14:54:21 2010
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Thu, 15 Apr 2010 14:54:21 +0200
Subject: [Web-SIG] Draft PEP: WSGI 1.1
Message-ID:

Mostly taking Graham's list of issues and incorporating it into PEP 333.

Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt

Let's have comments here (comments in the form of diffs are
particularly welcome, of course).

Remember, the idea is not to change or improve WSGI right now, but
only to improve the spec, improving interoperability and enabling
Python 3 support.
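
As a rough illustration of the Python 3 string handling the draft
describes (CGI values exposed as native strings decoded as ISO-8859-1,
so the original bytes stay recoverable), here is a small sketch. The
UTF-8 request path is an assumed example, not something taken from the
draft:

```python
# Assumed example: raw bytes of a request path as received from the client.
raw_path = "/caf\u00e9".encode("utf-8")

# On Python 3, the draft has the server expose this as a native (unicode)
# string decoded as ISO-8859-1, which maps every byte to one code point.
path_info = raw_path.decode("iso-8859-1")

# The application can get the original bytes back without loss...
recovered = path_info.encode("iso-8859-1")

# ...and decode them again using whatever encoding it actually expects.
text = recovered.decode("utf-8")
```

The intermediate ``path_info`` value is not meaningful text here; it is
only a lossless carrier for the bytes until the application applies the
encoding it wants.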
Graham, I hope I did a good job with your suggestions. (Since so much
of this is yours, I've just listed you as the second author.) I tried
to clarify exactly what you meant by "native strings", can you check
that out?

Cheers,

Dirkjan

--- pep-0333.txt 2010-04-15 14:46:02.000000000 +0200
+++ wsgi-1.1.txt 2010-04-15 14:51:39.000000000 +0200
@@ -1,114 +1,124 @@
-PEP: 333
-Title: Python Web Server Gateway Interface v1.0
+PEP: 0000
+Title: Python Web Server Gateway Interface 1.1
 Version: $Revision$
 Last-Modified: $Date$
-Author: Phillip J. Eby
+Author: Dirkjan Ochtman ,
+        Graham Dumpleton
 Discussions-To: Python Web-SIG
 Status: Draft
 Type: Informational
 Content-Type: text/x-rst
-Created: 07-Dec-2003
-Post-History: 07-Dec-2003, 08-Aug-2004, 20-Aug-2004, 27-Aug-2004
+Created: 15-04-2010
+Post-History: Not yet


 Abstract
 ========

-This document specifies a proposed standard interface between web
-servers and Python web applications or frameworks, to promote web
-application portability across a variety of web servers.
+This document specifies a revision of the proposed standard interface
+between web servers and Python web applications or frameworks, to
+promote web application portability across a variety of web servers.


 Rationale and Goals
 ===================

-Python currently boasts a wide variety of web application frameworks,
-such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to
-name just a few [1]_. This wide variety of choices can be a problem
-for new Python users, because generally speaking, their choice of web
-framework will limit their choice of usable web servers, and vice
-versa.
-
-By contrast, although Java has just as many web application frameworks
-available, Java's "servlet" API makes it possible for applications
-written with any Java web application framework to run in any web
-server that supports the servlet API.
-
-The availability and widespread use of such an API in web servers for
-Python -- whether those servers are written in Python (e.g. Medusa),
-embed Python (e.g. mod_python), or invoke Python via a gateway
-protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of
-framework from choice of web server, freeing users to choose a pairing
-that suits them, while freeing framework and server developers to
-focus on their preferred area of specialization.
-
-This PEP, therefore, proposes a simple and universal interface between
-web servers and web applications or frameworks: the Python Web Server
-Gateway Interface (WSGI).
-
-But the mere existence of a WSGI spec does nothing to address the
-existing state of servers and frameworks for Python web applications.
-Server and framework authors and maintainers must actually implement
-WSGI for there to be any effect.
-
-However, since no existing servers or frameworks support WSGI, there
-is little immediate reward for an author who implements WSGI support.
-Thus, WSGI **must** be easy to implement, so that an author's initial
-investment in the interface can be reasonably low.
-
-Thus, simplicity of implementation on *both* the server and framework
-sides of the interface is absolutely critical to the utility of the
-WSGI interface, and is therefore the principal criterion for any
-design decisions.
-
-Note, however, that simplicity of implementation for a framework
-author is not the same thing as ease of use for a web application
-author. WSGI presents an absolutely "no frills" interface to the
-framework author, because bells and whistles like response objects and
-cookie handling would just get in the way of existing frameworks'
-handling of these issues. Again, the goal of WSGI is to facilitate
-easy interconnection of existing servers and applications or
-frameworks, not to create a new web framework.
-
-Note also that this goal precludes WSGI from requiring anything that
-is not already available in deployed versions of Python. Therefore,
-new standard library modules are not proposed or required by this
-specification, and nothing in WSGI requires a Python version greater
-than 2.2.2. (It would be a good idea, however, for future versions
-of Python to include support for this interface in web servers
-provided by the standard library.)
-
-In addition to ease of implementation for existing and future
-frameworks and servers, it should also be easy to create request
-preprocessors, response postprocessors, and other WSGI-based
-"middleware" components that look like an application to their
-containing server, while acting as a server for their contained
-applications.
-
-If middleware can be both simple and robust, and WSGI is widely
-available in servers and frameworks, it allows for the possibility
-of an entirely new kind of Python web application framework: one
-consisting of loosely-coupled WSGI middleware components. Indeed,
-existing framework authors may even choose to refactor their
-frameworks' existing services to be provided in this way, becoming
-more like libraries used with WSGI, and less like monolithic
-frameworks. This would then allow application developers to choose
-"best-of-breed" components for specific functionality, rather than
-having to commit to all the pros and cons of a single framework.
-
-Of course, as of this writing, that day is doubtless quite far off.
-In the meantime, it is a sufficient short-term goal for WSGI to
-enable the use of any framework with any server.
-
-Finally, it should be mentioned that the current version of WSGI
-does not prescribe any particular mechanism for "deploying" an
-application for use with a web server or server gateway. At the
-present time, this is necessarily implementation-defined by the
-server or gateway. After a sufficient number of servers and
-frameworks have implemented WSGI to provide field experience with
-varying deployment requirements, it may make sense to create
-another PEP, describing a deployment standard for WSGI servers and
-application frameworks.
+WSGI 1.0, specified in PEP 333, did a great job in making it easier
+for web applications and web servers to interface with each other.
+It has become very much the standard it was meant to be and an
+important part of the Python web development infrastructure.
+
+After several implementations were built by different developers,
+it inevitably turned out that the specification wasn't perfect. It
+left out some details that were implemented by all the web server
+interfaces because they were critical for many applications (or
+application frameworks). Additionally, the specification was written
+before Python 3.x was specified, resulting in a lack of clear
+specification on what to do with unicode strings.
+
+While there are some ideas around to improve WSGI further in less
+compatible ways, we feel that there is value to be had in first
+specifying a minor revision of the specification, which is largely
+compatible with existing implementations. Further simplification
+and experimentation are therefore deferred to a 2.0 version.
+
+
+Differences with WSGI 1.0
+=========================
+
+Descriptive changes
+-------------------
+
+The following changes were made to realign the spec with
+implementations 'in the wild'.
+
+1. The 'readline()' function of 'wsgi.input' must optionally take
+   a size hint. This is required because many applications use
+   cgi.FieldStorage, which uses this functionality.
+
+2. The 'wsgi.input' functions for reading input must return an empty
+   string as end of input stream marker. This is required for support
+   of HTTP 1.1 request pipelining. A correctly implemented WSGI
+   middleware already has to cope with an empty string as end
+   sentinel anyway to detect premature end of input.
+
+3. Any WSGI application or middleware should not itself return, or
+   consume from a wrapped WSGI component, more data than specified by
+   the Content-Length response header if defined. Middleware that
+   does this is arguably broken and can generate incorrect data.
+   This is just a clarification of obligations.
+
+4. The WSGI adapter must not pass on to the server any data above
+   what the Content-Length response header defines, if supplied.
+   Doing this is technically a violation of HTTP. This is another
+   clarification of obligations.
+
+
+String handling changes
+-----------------------
+
+The following changes were made to make WSGI work on Python 3.x.
+
+1. The application is passed an instance of a Python dictionary
+   containing what is referred to as the WSGI environment. All keys
+   in this dictionary are native strings. For CGI variables, all names
+   are going to be ISO-8859-1 and so where native strings are
+   unicode strings, that encoding is used for the names of CGI
+   variables.
+
+2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI
+   environment, the value of the variable should be a native string.
+
+3. For the CGI variables contained in the WSGI environment, the values
+   of the variables are native strings. Where native strings are
+   unicode strings, ISO-8859-1 encoding would be used such that the
+   original character data is preserved and as necessary the unicode
+   string can be converted back to bytes and thence decoded to unicode
+   again using a different encoding.
+
+4. The WSGI input stream 'wsgi.input' contained in the WSGI environment
+   and from which request content is read, should yield byte strings.
+
+5. The status line specified by the WSGI application should be a byte
+   string. Where native strings are unicode strings, the native string
+   type can also be returned in which case it would be encoded as
+   ISO-8859-1.
+
+6. The list of response headers specified by the WSGI application should
+   contain tuples consisting of two values, where each value is a byte
+   string. Where native strings are unicode strings, the native string
+   type can also be returned in which case it would be encoded as
+   ISO-8859-1.
+
+7. The iterable returned by the application and from which response
+   content is derived, should yield byte strings. Where native strings
+   are unicode strings, the native string type can also be returned in
+   which case it would be encoded as ISO-8859-1.
+
+8. The value passed to the 'write()' callback returned by
+   'start_response()' should be a byte string. Where native strings
+   are unicode strings, a native string type can also be supplied, in
+   which case it would be encoded as ISO-8859-1.


 Specification Overview
@@ -447,6 +457,13 @@
 Streaming`_ section below for more on how application output must be
 handled.)

+Further on, several places specify constraints upon string types used
+in the WSGI API. The term native string is used to mean the 'str' class
+in both Python 2.x and 3.x. The spec tries to ensure optimal
+compatibility and ease of use by allowing implementations running on
+Python 3.x to encode strings (which are Unicode strings with no
+specified encoding) as ISO-8859-1 where a 3.x string is passed in.
+
 The server or gateway should treat the yielded strings as binary byte
 sequences: in particular, it should ensure that line endings are
 not altered. The application is responsible for ensuring that the
@@ -489,12 +506,22 @@
 ``environ`` Variables
 ---------------------

+All keys in this dictionary are native strings. For CGI variables,
+all names are going to be ISO-8859-1 and so where native strings are
+unicode strings, that encoding is used for the names of CGI variables.
+
 The ``environ`` dictionary is required to contain these CGI
 environment variables, as defined by the Common Gateway Interface
 specification [2]_. The following variables **must** be present,
 unless their value would be an empty string, in which case they
 **may** be omitted, except as otherwise noted below.

+The values for CGI variables are native strings. Where native strings
+are unicode strings, ISO-8859-1 encoding would be used such that the
+original character data is preserved and as necessary the unicode
+string can be converted back to bytes and thence decoded to unicode
+again using a different encoding.
+
 ``REQUEST_METHOD``
   The HTTP request method, such as ``"GET"`` or ``"POST"``. This
   cannot ever be an empty string, and so is always required.
@@ -575,13 +602,14 @@
 ===================== ===============================================
 Variable              Value
 ===================== ===============================================
-``wsgi.version``      The tuple ``(1,0)``, representing WSGI
+``wsgi.version``      The tuple ``(1, 0)``, representing WSGI
                       version 1.0.

 ``wsgi.url_scheme``   A string representing the "scheme" portion of
                       the URL at which the application is being
                       invoked. Normally, this will have the value
-                      ``"http"`` or ``"https"``, as appropriate.
+                      ``"http"`` or ``"https"``, as appropriate. The
+                      value is a native string.

 ``wsgi.input``        An input stream (file-like object) from which
                       the HTTP request body can be read. (The server
@@ -646,7 +674,7 @@
 Method               Stream      Notes
 ===================  ==========  ========
 ``read(size)``       ``input``   1
-``readline()``       ``input``   1,2
+``readline(hint)``   ``input``   1,2
 ``readlines(hint)``  ``input``   1,3
 ``__iter__()``       ``input``
 ``flush()``          ``errors``  4
@@ -661,11 +689,12 @@
    ``Content-Length``, and is allowed to simulate an end-of-file
    condition if the application attempts to read past that point.
    The application **should not** attempt to read more data than is
-   specified by the ``CONTENT_LENGTH`` variable.
+   specified by the ``CONTENT_LENGTH`` variable. All read functions
+   are required to return an empty string as the end of input stream
+   marker. They must yield byte strings.

-2. The optional "size" argument to ``readline()`` is not supported,
-   as it may be complex for server authors to implement, and is not
-   often used in practice.
+2. The optional "size" argument to ``readline()`` is required for
+   the implementer, but optional for callers.

 3. Note that the ``hint`` argument to ``readlines()`` is optional for
    both caller and implementer. The application is free not to
@@ -692,12 +721,15 @@
 ---------------------------------

 The second parameter passed to the application object is a callable
-of the form ``start_response(status,response_headers,exc_info=None)``.
+of the form ``start_response(status, response_headers, exc_info=None)``.
 (As with all WSGI callables, the arguments must be supplied
 positionally, not by keyword.) The ``start_response`` callable is
 used to begin the HTTP response, and it must return a
 ``write(body_data)`` callable (see the `Buffering and Streaming`_
-section, below).
+section, below). Values passed to the ``write(body_data)`` callable
+should be byte strings. Where native strings are unicode strings, a
+native strings type can also be supplied, in which case it would be
+encoded as ISO-8859-1.

 The ``status`` argument is an HTTP "status" string like ``"200 OK"``
 or ``"404 Not Found"``. That is, it is a string consisting of a
@@ -705,14 +737,20 @@
 single space, with no surrounding whitespace or other characters.
 (See RFC 2616, Section 6.1.1 for more information.) The string
 **must not** contain control characters, and must not be terminated
-with a carriage return, linefeed, or combination thereof.
+with a carriage return, linefeed, or combination thereof. This
+value should be a byte string. Where native strings are unicode
+strings, the native string type can also be returned, in which
+case it would be encoded as ISO-8859-1.
The ``response_headers`` argument is a list of ``(header_name, header_value)`` tuples. It must be a Python list; i.e. -``type(response_headers) is ListType``, and the server **may** change +``type(response_headers) is list``, and the server **may** change its contents in any way it desires. Each ``header_name`` must be a valid HTTP header field-name (as defined by RFC 2616, Section 4.2), -without a trailing colon or other punctuation. +without a trailing colon or other punctuation. Both the header_name +and the header_value should be byte strings. Where native strings +are unicode strings, the native string type can also be returned, +in which case it would be encoded as ISO-8859-1. Each ``header_value`` **must not** include *any* control characters, including carriage returns or linefeeds, either embedded or at the end. @@ -809,6 +847,14 @@ Handling the ``Content-Length`` Header ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +If an application or middleware layer chooses to return a +Content-Length header, it should not return more data than specified +by the header value. Any wrapping middleware layer should not +consume more data than specified in the header value from the +wrapped component (either middleware or application). Any WSGI +adapter must similarly not pass on data above what the +Content-Length response header value defines. + If the application does not supply a ``Content-Length`` header, a server or gateway may choose one of several approaches to handling it. The simplest of these is to close the client connection when @@ -1569,55 +1615,13 @@ developers. -Proposed/Under Discussion -========================= - -These items are currently being discussed on the Web-SIG and elsewhere, -or are on the PEP author's "to-do" list: - -* Should ``wsgi.input`` be an iterator instead of a file? This would - help for asynchronous applications and chunked-encoding input - streams. 
- -* Optional extensions are being discussed for pausing iteration of an - application's ouptut until input is available or until a callback - occurs. - -* Add a section about synchronous vs. asynchronous apps and servers, - the relevant threading models, and issues/design goals in these - areas. - - Acknowledgements ================ -Thanks go to the many folks on the Web-SIG mailing list whose -thoughtful feedback made this revised draft possible. Especially: +Thanks go to many folks on the Web-SIG mailing list for helping the work +on clarifying and improving this specification. In particular: -* Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up - on the first draft as not offering any advantages over "plain old - CGI", thus encouraging me to look for a better approach. - -* Ian Bicking, who helped nag me into properly specifying the - multithreading and multiprocess options, as well as badgering me to - provide a mechanism for servers to supply custom extension data to - an application. - -* Tony Lownds, who came up with the concept of a ``start_response`` - function that took the status and headers, returning a ``write`` - function. His input also guided the design of the exception handling - facilities, especially in the area of allowing for middleware that - overrides application error messages. - -* Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython - (well before the spec was finalized) helped to shape the "supporting - older versions of Python" section, as well as the optional - ``wsgi.file_wrapper`` facility. - -* Mark Nottingham, who reviewed the spec extensively for issues with - HTTP RFC compliance, especially with regard to HTTP/1.1 features that - I didn't even know existed until he pointed them out. - +* Phillip J. Eby, for writing/editing the 1.0 specification. References ========== @@ -1643,8 +1647,6 @@ This document has been placed in the public domain. - - .. 
Local Variables: mode: indented-text From manlio_perillo at libero.it Thu Apr 15 15:57:03 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 15 Apr 2010 15:57:03 +0200 Subject: [Web-SIG] Draft PEP: WSGI 1.1 In-Reply-To: References: Message-ID: <4BC71B2F.4080400@libero.it> Dirkjan Ochtman wrote: > [...] > --- pep-0333.txt 2010-04-15 14:46:02.000000000 +0200 > +++ wsgi-1.1.txt 2010-04-15 14:51:39.000000000 +0200 > @@ -1,114 +1,124 @@ > [...] > Abstract > ======== > > [...] > -Thus, simplicity of implementation on *both* the server and framework > -sides of the interface is absolutely critical to the utility of the > -WSGI interface, and is therefore the principal criterion for any > -design decisions. > - > -Note, however, that simplicity of implementation for a framework > -author is not the same thing as ease of use for a web application > -author. WSGI presents an absolutely "no frills" interface to the > -framework author, because bells and whistles like response objects and > -cookie handling would just get in the way of existing frameworks' > -handling of these issues. Again, the goal of WSGI is to facilitate > -easy interconnection of existing servers and applications or > -frameworks, not to create a new web framework. > - This, and the rest of the abstract, should not be removed entirely, IMHO. > [...] > - > -Finally, it should be mentioned that the current version of WSGI > -does not prescribe any particular mechanism for "deploying" an > -application for use with a web server or server gateway. At the > -present time, this is necessarily implementation-defined by the > -server or gateway. After a sufficient number of servers and > -frameworks have implemented WSGI to provide field experience with > -varying deployment requirements, it may make sense to create > -another PEP, describing a deployment standard for WSGI servers and > -application frameworks. This should not be removed. > [...]
> + > +Differences with WSGI 1.0 > +========================= > + > +Descriptive changes > +------------------- > + > +The following changes were made to realign the spec with > +implementations 'in the wild'. > + This text feels wrong to me. > +1. The 'readline()' function of 'wsgi.input' must optionally take > + a size hint. This is required because many applications use > + cgi.FieldStorage, which uses this functionality. > + What values are supported for size? Are values -1 and None supported? > [...] > +3. Any WSGI application or middleware should not itself return, or > + consume from a wrapped WSGI component This is not very clear. What is the meaning of "consume from a wrapped WSGI component"? > , more data than specified by > + the Content-Length response header if defined. Middleware that > + does this is arguably broken and can generate incorrect data. > + This is just a clarification of obligations. > + > [...] > + > +String handling changes > +----------------------- > + > +The following changes were made to make WSGI work on Python 3.x. > + > +1. The application is passed an instance of a Python dictionary > + containing what is referred to as the WSGI environment. All keys > + in this dictionary are native strings. For CGI variables, all names > + are going to be ISO-8859-1 "going to be ISO-8859-1" should be expressed in more precise terms. Moreover, you should probably first define what a "native string" is, or you should add a note that it is defined later in the document. > and so where native strings are > + unicode strings, that encoding is used for the names of CGI > + variables. > + > +2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI > + environment, the value of the variable should be a native string. > + > +3. For the CGI variables contained in the WSGI environment, the values > + of the variables are native strings.
Where native strings are > + unicode strings, ISO-8859-1 encoding would be used such that the What is the precise meaning of *would*, here? > + original character data is preserved and as necessary the unicode > + string can be converted back to bytes and thence decoded to unicode > + again using a different encoding. > + > +4. The WSGI input stream 'wsgi.input' contained in the WSGI environment > + and from which request content is read, should yield byte strings. > + "yield" should be replaced with "return". And, again, why are you using *should*, here? Is an implementation allowed to return a native string? See my previous comment for "native string", about the use of "byte string" here. > [...] > @@ -575,13 +602,14 @@ > ===================== =============================================== > Variable Value > ===================== =============================================== > -``wsgi.version`` The tuple ``(1,0)``, representing WSGI > +``wsgi.version`` The tuple ``(1, 0)``, representing WSGI > version 1.0. > Should be (1, 1), not (1, 0). > [...] > > -Proposed/Under Discussion > -========================= > - I see no real reason for removing this section. > [...] Moreover, should the section "Supporting Older (<2.2) Versions of Python" be removed? > - > Acknowledgements > ================ > Since WSGI 1.1 contains only corrections for WSGI 1.0, I see no reason to remove the original contributors to WSGI 1.0. > [...] Regards Manlio From manlio_perillo at libero.it Thu Apr 15 16:46:10 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 15 Apr 2010 16:46:10 +0200 Subject: [Web-SIG] Draft PEP: WSGI 1.1 In-Reply-To: References: Message-ID: <4BC726B2.4070609@libero.it> Dirkjan Ochtman wrote: > Mostly taking Graham's list of issues and incorporating it into PEP 333. > > Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt > > Let's have comments here (comments in the form of diffs are > particularly welcome, of course).
Remember, the idea is not to change > or improve WSGI right now, but only to improve the spec, improving > interoperability and enabling Python 3 support. > > [...] Another comment. The run_with_cgi sample function should be changed, since it probably does not work correctly, on Python 3.x. I'm not sure, since sys.stdout.write accepts a native string, however how it is encoded is platform specific (with current text of WSGI 1.1, however, it seems this is allowed). I would like to do some tests with CGI, Python 3.2, IIS and Windows. Regards Manlio From and-py at doxdesk.com Thu Apr 15 17:30:59 2010 From: and-py at doxdesk.com (And Clover) Date: Thu, 15 Apr 2010 17:30:59 +0200 Subject: [Web-SIG] Draft PEP: WSGI 1.1 In-Reply-To: References: Message-ID: <4BC73133.3070202@doxdesk.com> Dirkjan Ochtman wrote: > 1. The application is passed an instance of a Python dictionary > containing what is referred to as the WSGI environment. All keys > in this dictionary are native strings. For CGI variables, all names > are going to be ISO-8859-1 and so where native strings are > unicode strings, that encoding is used for the names of CGI > variables. Perhaps explain where those ISO-8859-1 bytes might come from: ...are native strings. Where native strings are Unicode, any keys derived from byte-oriented sources (such as custom headers in the HTTP request reflected in the CGI environment variables) should be decoded using the ISO-8859-1 encoding. > 3. For the CGI variables contained in the WSGI environment, the values > of the variables are native strings. Where native strings are > unicode strings, ISO-8859-1 encoding would be used such that the > original character data is preserved and as necessary the unicode > string can be converted back to bytes and thence decoded to unicode > again using a different encoding. Good. 
The only problem that remains with this is that in certain environments (notably: all IIS use, not just CGI) a WSGI gateway cannot fully comply with this requirement. a. disallow environments that cannot be sure they are preserving the original byte data from declaring that they support wsgi.version 1.1? b. add an extra wsgi.something flag for a WSGI server to add, to specify that it is sure that the original bytes have been preserved? (ie. so wsgiref's CGI handler would have to declare it wasn't sure when running under Windows.) c. just let WSGI gateways silently ignore the ISO-8859-1 requirement if they can't honour it and let the application spend its time trying to unravel the mess (status quo). (Can wsgiref be fixed to use ISO-8859-1 in time for Python 3.2?) > 7. The iterable returned by the application and from which response > content is derived, should yield byte strings. Where native strings > are unicode strings, the native string type can also be returned in > which case it would be encoded as ISO-8859-1. > 8. The value passed to the 'write()' callback returned by > 'start_response()' should be a byte string. Where native strings > are unicode strings, a native string type can also be supplied, in > which case it would be encoded as ISO-8859-1. Weren't we going to only allow US-ASCII for the output? (These threads are always so far apart I can never remember what conclusion we reached... if any.) Whilst ISO-8859-1 is in the HTTP standard for headers, and required to preserve bytes in input, it's much, much less likely that the response body is going to be ISO-8859-1. It could maybe be cp1252, but more likely the author wanted UTF-8. If we must support Unicode strings for response body output at all, I'd prefer to be conservative here and spit a UnicodeEncodeError straight away, rather than quietly mangle characters U+0080 to U+00FF. 
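The difference between the two output policies discussed above can be sketched in a few lines of Python 3. This is only an illustration and not part of the draft: the helper names ``encode_body_latin1`` and ``encode_body_ascii`` are hypothetical, and the sample text is an assumption chosen to contain one character in the U+0080 to U+00FF range. The last part shows the request-side round trip that the ISO-8859-1 rule for CGI variables is meant to allow.

```python
def encode_body_latin1(text):
    """Policy in the draft: encode a native (unicode) string as ISO-8859-1."""
    return text.encode("iso-8859-1")


def encode_body_ascii(text):
    """Conservative policy: accept only US-ASCII and fail loudly otherwise."""
    return text.encode("us-ascii")


body = "caf\u00e9"  # U+00E9 lies in the U+0080..U+00FF range

# ISO-8859-1 succeeds silently, but the bytes are not valid UTF-8.
latin1_bytes = encode_body_latin1(body)
# What an author who forgot to encode most likely wanted instead.
utf8_bytes = body.encode("utf-8")

# US-ASCII raises immediately rather than emitting surprising bytes.
try:
    encode_body_ascii(body)
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Request side: decoding as ISO-8859-1 preserves every byte, so the
# application can recover the bytes and decode them with the encoding
# it actually expects (UTF-8 is assumed here).
raw = "caf\u00e9".encode("utf-8")
environ_value = raw.decode("iso-8859-1")
recovered = environ_value.encode("iso-8859-1").decode("utf-8")
```

With the ISO-8859-1 policy the mistake only shows up as mis-rendered characters in the client, whereas the US-ASCII policy surfaces it as an exception at the point where the string is written.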
Manlio Perillo wrote: > The run_with_cgi sample function should be changed, since it probably > does not work correctly, on Python 3.x. Yes, the 'URL Reconstruction' fragment will be wrong too, since it uses urllib.quote() to encode the path part. quote() defaults to UTF-8 rather than the ISO-8859-1 WSGI 1.1 requires. -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From manlio_perillo at libero.it Thu Apr 15 18:30:40 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 15 Apr 2010 18:30:40 +0200 Subject: [Web-SIG] Draft PEP: WSGI 1.1 In-Reply-To: <4BC73133.3070202@doxdesk.com> References: <4BC73133.3070202@doxdesk.com> Message-ID: <4BC73F30.3050109@libero.it> And Clover wrote: > [...] >> 8. The value passed to the 'write()' callback returned by >> 'start_response()' should be a byte string. Where native strings >> are unicode strings, a native string type can also be supplied, in >> which case it would be encoded as ISO-8859-1. > > Weren't we going to only allow US-ASCII for the output? (These threads > are always so far apart I can never remember what conclusion we > reached... if any.) > By the way, yesterday I wrote some tests for Python 3.x and I found a possible problem (only indirectly related to WSGI, however). The example consists of a simple client -> proxy -> server chain, where the client and server are in Python 2.5 and the proxy in Python 3.2 (compiled from tip, some time ago). Here is the proxy: http://paste.pocoo.org/show/202212/ The application fails if the cookie contains a non-ASCII character. The reason is that, for reasons I do not understand, http.client encodes request headers using us-ascii instead of iso-8859-1.
The offending code is: http://hg.python.org/cpython/file/7dcb7a2fb54d/Lib/http/client.py#l912 Regards Manlio From graham.dumpleton at gmail.com Fri Apr 16 03:41:55 2010 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Fri, 16 Apr 2010 11:41:55 +1000 Subject: [Web-SIG] Draft PEP: WSGI 1.1 In-Reply-To: References: Message-ID: I haven't read what you have done yet, but if you have not done so already, ensure you read: http://bitbucket.org/ianb/wsgi-peps/src/ This is Ian's and Armin's previous go at a new specification. It tried to go further than what you are doing, though. Also read: http://blog.dscpl.com.au/2009/09/roadmap-for-python-wsgi-specification.html I explain what I mean by native strings in that. Graham On 15 April 2010 22:54, Dirkjan Ochtman wrote: > Mostly taking Graham's list of issues and incorporating it into PEP 333. > > Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt > > Let's have comments here (comments in the form of diffs are > particularly welcome, of course). Remember, the idea is not to change > or improve WSGI right now, but only to improve the spec, improving > interoperability and enabling Python 3 support. > > Graham, I hope I did a good job with your suggestions. (Since so much > of this is yours, I've just listed you as the second author.) I tried > to clarify exactly what you meant by "native strings", can you check > that out? > > Cheers, > > Dirkjan > > --- pep-0333.txt 2010-04-15 14:46:02.000000000 +0200 > +++ wsgi-1.1.txt 2010-04-15 14:51:39.000000000 +0200 > @@ -1,114 +1,124 @@ > -PEP: 333 > -Title: Python Web Server Gateway Interface v1.0 > +PEP: 0000 > +Title: Python Web Server Gateway Interface 1.1 > Version: $Revision$ > Last-Modified: $Date$ > -Author: Phillip J. Eby > +Author: Dirkjan Ochtman , > +
Graham Dumpleton > Discussions-To: Python Web-SIG > Status: Draft > Type: Informational > Content-Type: text/x-rst > -Created: 07-Dec-2003 > -Post-History: 07-Dec-2003, 08-Aug-2004, 20-Aug-2004, 27-Aug-2004 > +Created: 15-04-2010 > +Post-History: Not yet > > > Abstract > ======== > > -This document specifies a proposed standard interface between web > -servers and Python web applications or frameworks, to promote web > -application portability across a variety of web servers. > +This document specifies a revision of the proposed standard interface > +between web servers and Python web applications or frameworks, to > +promote web application portability across a variety of web servers. > > > Rationale and Goals > =================== > > -Python currently boasts a wide variety of web application frameworks, > -such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to > -name just a few [1]_. This wide variety of choices can be a problem > -for new Python users, because generally speaking, their choice of web > -framework will limit their choice of usable web servers, and vice > -versa. > - > -By contrast, although Java has just as many web application frameworks > -available, Java's "servlet" API makes it possible for applications > -written with any Java web application framework to run in any web > -server that supports the servlet API. > - > -The availability and widespread use of such an API in web servers for > -Python -- whether those servers are written in Python (e.g. Medusa), > -embed Python (e.g. mod_python), or invoke Python via a gateway > -protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of > -framework from choice of web server, freeing users to choose a pairing > -that suits them, while freeing framework and server developers to > -focus on their preferred area of specialization.
> - > -This PEP, therefore, proposes a simple and universal interface between > -web servers and web applications or frameworks: the Python Web Server > -Gateway Interface (WSGI). > - > -But the mere existence of a WSGI spec does nothing to address the > -existing state of servers and frameworks for Python web applications. > -Server and framework authors and maintainers must actually implement > -WSGI for there to be any effect. > - > -However, since no existing servers or frameworks support WSGI, there > -is little immediate reward for an author who implements WSGI support. > -Thus, WSGI **must** be easy to implement, so that an author's initial > -investment in the interface can be reasonably low. > - > -Thus, simplicity of implementation on *both* the server and framework > -sides of the interface is absolutely critical to the utility of the > -WSGI interface, and is therefore the principal criterion for any > -design decisions. > - > -Note, however, that simplicity of implementation for a framework > -author is not the same thing as ease of use for a web application > -author. WSGI presents an absolutely "no frills" interface to the > -framework author, because bells and whistles like response objects and > -cookie handling would just get in the way of existing frameworks' > -handling of these issues. Again, the goal of WSGI is to facilitate > -easy interconnection of existing servers and applications or > -frameworks, not to create a new web framework. > - > -Note also that this goal precludes WSGI from requiring anything that > -is not already available in deployed versions of Python. Therefore, > -new standard library modules are not proposed or required by this > -specification, and nothing in WSGI requires a Python version greater > -than 2.2.2. (It would be a good idea, however, for future versions > -of Python to include support for this interface in web servers > -provided by the standard library.)
> - > -In addition to ease of implementation for existing and future > -frameworks and servers, it should also be easy to create request > -preprocessors, response postprocessors, and other WSGI-based > -"middleware" components that look like an application to their > -containing server, while acting as a server for their contained > -applications. > - > -If middleware can be both simple and robust, and WSGI is widely > -available in servers and frameworks, it allows for the possibility > -of an entirely new kind of Python web application framework: one > -consisting of loosely-coupled WSGI middleware components. Indeed, > -existing framework authors may even choose to refactor their > -frameworks' existing services to be provided in this way, becoming > -more like libraries used with WSGI, and less like monolithic > -frameworks. This would then allow application developers to choose > -"best-of-breed" components for specific functionality, rather than > -having to commit to all the pros and cons of a single framework. > - > -Of course, as of this writing, that day is doubtless quite far off. > -In the meantime, it is a sufficient short-term goal for WSGI to > -enable the use of any framework with any server. > - > -Finally, it should be mentioned that the current version of WSGI > -does not prescribe any particular mechanism for "deploying" an > -application for use with a web server or server gateway. At the > -present time, this is necessarily implementation-defined by the > -server or gateway. After a sufficient number of servers and > -frameworks have implemented WSGI to provide field experience with > -varying deployment requirements, it may make sense to create > -another PEP, describing a deployment standard for WSGI servers and > -application frameworks. > +WSGI 1.0, specified in PEP 333, did a great job in making it easier > +for web applications and web servers to interface with each other.
> +It has become very much the standard it was meant to be and an > +important part of the Python web development infrastructure. > + > +After several implementations were built by different developers, > +it inevitably turned out that the specification wasn't perfect. It > +left out some details that were implemented by all the web server > +interfaces because they were critical for many applications (or > +application frameworks). Additionally, the specification was written > +before Python 3.x was specified, resulting in a lack of clear > +specification on what to do with unicode strings. > + > +While there are some ideas around to improve WSGI further in less > +compatible ways, we feel that there is value to be had in first > +specifying a minor revision of the specification, which is largely > +compatible with existing implementations. Further simplification > +and experimentation are therefore deferred to a 2.0 version. > + > + > +Differences with WSGI 1.0 > +========================= > + > +Descriptive changes > +------------------- > + > +The following changes were made to realign the spec with > +implementations 'in the wild'. > + > +1. The 'readline()' function of 'wsgi.input' must optionally take > + a size hint. This is required because many applications use > + cgi.FieldStorage, which uses this functionality. > + > +2. The 'wsgi.input' functions for reading input must return an empty > + string as end of input stream marker. This is required for support > + of HTTP 1.1 request pipelining. A correctly implemented WSGI > + middleware already has to cope with an empty string as end > + sentinel anyway to detect premature end of input. > + > +3. Any WSGI application or middleware should not itself return, or > + consume from a wrapped WSGI component, more data than specified by > + the Content-Length response header if defined. Middleware that > + does this is arguably broken and can generate incorrect data. > +
This is just a clarification of obligations. > + > +4. The WSGI adapter must not pass on to the server any data above > + what the Content-Length response header defines, if supplied. > + Doing this is technically a violation of HTTP. This is another > + clarification of obligations. > + > + > +String handling changes > +----------------------- > + > +The following changes were made to make WSGI work on Python 3.x. > + > +1. The application is passed an instance of a Python dictionary > + containing what is referred to as the WSGI environment. All keys > + in this dictionary are native strings. For CGI variables, all names > + are going to be ISO-8859-1 and so where native strings are > + unicode strings, that encoding is used for the names of CGI > + variables. > + > +2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI > + environment, the value of the variable should be a native string. > + > +3. For the CGI variables contained in the WSGI environment, the values > + of the variables are native strings. Where native strings are > + unicode strings, ISO-8859-1 encoding would be used such that the > + original character data is preserved and as necessary the unicode > + string can be converted back to bytes and thence decoded to unicode > + again using a different encoding. > + > +4. The WSGI input stream 'wsgi.input' contained in the WSGI environment > + and from which request content is read, should yield byte strings. > + > +5. The status line specified by the WSGI application should be a byte > + string. Where native strings are unicode strings, the native string > + type can also be returned in which case it would be encoded as > + ISO-8859-1. > + > +6. The list of response headers specified by the WSGI application should > + contain tuples consisting of two values, where each value is a byte > + string. Where native strings are unicode strings, the native string > +
type can also be returned in which case it would be encoded as > + ISO-8859-1. > + > +7. The iterable returned by the application and from which response > + content is derived, should yield byte strings. Where native strings > + are unicode strings, the native string type can also be returned in > + which case it would be encoded as ISO-8859-1. > + > +8. The value passed to the 'write()' callback returned by > + 'start_response()' should be a byte string. Where native strings > + are unicode strings, a native string type can also be supplied, in > + which case it would be encoded as ISO-8859-1. > > > Specification Overview > @@ -447,6 +457,13 @@ > Streaming`_ section below for more on how application output must be > handled.) > > +Further on, several places specify constraints upon string types used > +in the WSGI API. The term native string is used to mean the 'str' class > +in both Python 2.x and 3.x. The spec tries to ensure optimal > +compatibility and ease of use by allowing implementations running on > +Python 3.x to encode strings (which are Unicode strings with no > +specified encoding) as ISO-8859-1 where a 3.x string is passed in. > + > The server or gateway should treat the yielded strings as binary byte > sequences: in particular, it should ensure that line endings are > not altered. The application is responsible for ensuring that the > @@ -489,12 +506,22 @@ > ``environ`` Variables > --------------------- > > +All keys in this dictionary are native strings. For CGI variables, > +all names are going to be ISO-8859-1 and so where native strings are > +unicode strings, that encoding is used for the names of CGI variables. > + > The ``environ`` dictionary is required to contain these CGI > environment variables, as defined by the Common Gateway Interface > specification [2]_.
The following variables **must** be present, > unless their value would be an empty string, in which case they > **may** be omitted, except as otherwise noted below. > > +The values for CGI variables are native strings. Where native strings > +are unicode strings, ISO-8859-1 encoding would be used such that the > +original character data is preserved and as necessary the unicode > +string can be converted back to bytes and thence decoded to unicode > +again using a different encoding. > + > ``REQUEST_METHOD`` > The HTTP request method, such as ``"GET"`` or ``"POST"``. This > cannot ever be an empty string, and so is always required. > @@ -575,13 +602,14 @@ > ===================== =============================================== > Variable Value > ===================== =============================================== > -``wsgi.version`` The tuple ``(1,0)``, representing WSGI > +``wsgi.version`` The tuple ``(1, 0)``, representing WSGI > version 1.0. > > ``wsgi.url_scheme`` A string representing the "scheme" portion of > the URL at which the application is being > invoked. Normally, this will have the value > - ``"http"`` or ``"https"``, as appropriate. > + ``"http"`` or ``"https"``, as appropriate. The > + value is a native string. > > ``wsgi.input`` An input stream (file-like object) from which > the HTTP request body can be read. (The server > @@ -646,7 +674,7 @@ > Method Stream Notes > =================== ========== ======== > ``read(size)`` ``input`` 1 > -``readline()`` ``input`` 1,2 > +``readline(hint)`` ``input`` 1,2 > ``readlines(hint)`` ``input`` 1,3 > ``__iter__()`` ``input`` > ``flush()`` ``errors`` 4 > @@ -661,11 +689,12 @@ >
``Content-Length``, and is allowed to simulate an end-of-file > condition if the application attempts to read past that point. > The application **should not** attempt to read more data than is > - specified by the ``CONTENT_LENGTH`` variable. > + specified by the ``CONTENT_LENGTH`` variable. All read functions > + are required to return an empty string as the end of input stream > + marker. They must yield byte strings. > > -2. The optional "size" argument to ``readline()`` is not supported, > - as it may be complex for server authors to implement, and is not > - often used in practice. > +2. The optional "size" argument to ``readline()`` is required for > + the implementer, but optional for callers. > > 3. Note that the ``hint`` argument to ``readlines()`` is optional for > both caller and implementer. The application is free not to > @@ -692,12 +721,15 @@ > --------------------------------- > > The second parameter passed to the application object is a callable > -of the form ``start_response(status,response_headers,exc_info=None)``. > +of the form ``start_response(status, response_headers, exc_info=None)``. > (As with all WSGI callables, the arguments must be supplied > positionally, not by keyword.) The ``start_response`` callable is > used to begin the HTTP response, and it must return a > ``write(body_data)`` callable (see the `Buffering and Streaming`_ > -section, below). > +section, below). Values passed to the ``write(body_data)`` callable > +should be byte strings. Where native strings are unicode strings, a > +native strings type can also be supplied, in which case it would be > +encoded as ISO-8859-1. > > The ``status`` argument is an HTTP "status" string like ``"200 OK"`` > or ``"404 Not Found"``. That is, it is a string consisting of a > @@ -705,14 +737,20 @@ > single space, with no surrounding whitespace or other characters. > (See RFC 2616, Section 6.1.1 for more information.)
?The string > ?**must not** contain control characters, and must not be terminated > -with a carriage return, linefeed, or combination thereof. > +with a carriage return, linefeed, or combination thereof. This > +value should be a byte string. Where native strings are unicode > +strings, the native string type can also be returned, in which > +case it would be encoded as ISO-8859-1. > > ?The ``response_headers`` argument is a list of ``(header_name, > ?header_value)`` tuples. ?It must be a Python list; i.e. > -``type(response_headers) is ListType``, and the server **may** change > +``type(response_headers) is list``, and the server **may** change > ?its contents in any way it desires. ?Each ``header_name`` must be a > ?valid HTTP header field-name (as defined by RFC 2616, Section 4.2), > -without a trailing colon or other punctuation. > +without a trailing colon or other punctuation. Both the header_name > +and the header_value should be byte strings. Where native strings > +are unicode strings, the native string type can also be returned, > +in which case it would be encoded as ISO-8859-1. > > ?Each ``header_value`` **must not** include *any* control characters, > ?including carriage returns or linefeeds, either embedded or at the end. > @@ -809,6 +847,14 @@ > ?Handling the ``Content-Length`` Header > ?~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > +If an application or middleware layer chooses to return a > +Content-Length header, it should not return more data than specified > +by the header value. Any wrapping middleware layer should not > +consume more data than specified in the header value from the > +wrapped component (either middleware or application). Any WSGI > +adapter must similarly not pass on data above what the > +Content-Length response header value defines. > + > ?If the application does not supply a ``Content-Length`` header, a > ?server or gateway may choose one of several approaches to handling > ?it. 
?The simplest of these is to close the client connection when > @@ -1569,55 +1615,13 @@ > ? ?developers. > > > -Proposed/Under Discussion > -========================= > - > -These items are currently being discussed on the Web-SIG and elsewhere, > -or are on the PEP author's "to-do" list: > - > -* Should ``wsgi.input`` be an iterator instead of a file? ?This would > - ?help for asynchronous applications and chunked-encoding input > - ?streams. > - > -* Optional extensions are being discussed for pausing iteration of an > - ?application's ouptut until input is available or until a callback > - ?occurs. > - > -* Add a section about synchronous vs. asynchronous apps and servers, > - ?the relevant threading models, and issues/design goals in these > - ?areas. > - > - > ?Acknowledgements > ?================ > > -Thanks go to the many folks on the Web-SIG mailing list whose > -thoughtful feedback made this revised draft possible. ?Especially: > +Thanks go to many folks on the Web-SIG mailing list for helping the work > +on clarifying and improving this specification. In particular: > > -* Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up > - ?on the first draft as not offering any advantages over "plain old > - ?CGI", thus encouraging me to look for a better approach. > - > -* Ian Bicking, who helped nag me into properly specifying the > - ?multithreading and multiprocess options, as well as badgering me to > - ?provide a mechanism for servers to supply custom extension data to > - ?an application. > - > -* Tony Lownds, who came up with the concept of a ``start_response`` > - ?function that took the status and headers, returning a ``write`` > - ?function. ?His input also guided the design of the exception handling > - ?facilities, especially in the area of allowing for middleware that > - ?overrides application error messages. 
> -
> -* Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython
> -  (well before the spec was finalized) helped to shape the "supporting
> -  older versions of Python" section, as well as the optional
> -  ``wsgi.file_wrapper`` facility.
> -
> -* Mark Nottingham, who reviewed the spec extensively for issues with
> -  HTTP RFC compliance, especially with regard to HTTP/1.1 features that
> -  I didn't even know existed until he pointed them out.
> -
> +* Phillip J. Eby, for writing/editing the 1.0 specification.
>
>  References
>  ==========
> @@ -1643,8 +1647,6 @@
>
>  This document has been placed in the public domain.
>
> -
> -
>  ..
>    Local Variables:
>    mode: indented-text
>

From graham.dumpleton at gmail.com Fri Apr 16 04:08:01 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 16 Apr 2010 12:08:01 +1000
Subject: [Web-SIG] Draft PEP: WSGI 1.1
In-Reply-To:
References:
Message-ID:

On 16 April 2010 11:41, Graham Dumpleton wrote:
> I haven't read what you have done yet

And still haven't. Don't know when I will get a chance to do so.

Two points from a quick scan of emails.

1. The following section of the PEP needs to be updated:

"""
1417 Apart from the handling of ``close()``, the semantics of returning a
1418 file wrapper from the application should be the same as if the
1419 application had returned ``iter(filelike.read, '')``.  In other words,
1420 transmission should begin at the current position within the "file"
1421 at the time that transmission begins, and continue until the end is
1422 reached.
"""

It can't say to read until the end of the file is reached, because
Content-Length must be honoured: less data must be returned if
Content-Length is smaller than what remains in the file, as per
descriptive changes (3) and (4).

On the question of whether -1 or None is allowed as a readline()
argument: I would say no, they are not. The argument must be a positive
integer, or not supplied at all.
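As a rough illustration of the input rules discussed in this thread (a size that is either a positive integer or absent, never reading past Content-Length, and an empty byte string as the end-of-input marker), here is a minimal sketch. The class name LimitedInput and the sentinel _MISSING are hypothetical and come from neither the draft nor any server. Raising ValueError for other argument values is only one possible choice, since the thread calls that case undefined.

```python
import io

# Private sentinel (hypothetical): lets readline() tell "no argument"
# apart from an explicit None or -1.
_MISSING = object()


class LimitedInput:
    """Hypothetical wsgi.input wrapper that never reads past CONTENT_LENGTH."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = content_length

    def _check(self, size):
        # Assumption: reject anything that is not a positive integer.
        if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
            raise ValueError("size must be a positive integer")
        return min(size, self._remaining)

    def read(self, size):
        data = self._stream.read(self._check(size))
        self._remaining -= len(data)
        return data  # b"" once the declared body is exhausted

    def readline(self, size=_MISSING):
        limit = self._remaining if size is _MISSING else self._check(size)
        data = self._stream.readline(limit)
        self._remaining -= len(data)
        return data
```

A server could wrap its raw socket file in such an object before placing it in the environ dictionary; bytes that follow the declared body (for example a pipelined request) then stay unread.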
Different implementations use -1 or None as the default value of the
argument to detect that no argument was supplied. One can't rely on
either of them being used, though, and so can't assume that supplying
one of them explicitly means the same thing as supplying no argument.
In other words, anything other than a positive integer or no argument
at all is undefined.

The same issue arises with read(), except that technically only a
positive integer can be supplied and the argument is not optional.
Although any implementation that provides wsgi.input as a proper
file-like object is going to accept no argument to mean read all input,
this is outside of the WSGI specification, and calling it with no
argument is undefined.

Graham

> but if you have done so
> already, ensure you read:
>
>  http://bitbucket.org/ianb/wsgi-peps/src/
>
> This is Ian's and Armin's previous go at new specification. It though
> tried to go further than what you are doing.
>
> Also read:
>
>  http://blog.dscpl.com.au/2009/09/roadmap-for-python-wsgi-specification.html
>
> I explain what I mean by native strings in that.
>
> Graham
>
> On 15 April 2010 22:54, Dirkjan Ochtman wrote:
>> Mostly taking Graham's list of issues and incorporating it into PEP 333.
>>
>> Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt
>>
>> Let's have comments here (comments in the form of diffs are
>> particularly welcome, of course). Remember, the idea is not to change
>> or improve WSGI right now, but only to improve the spec, improving
>> interoperability and enabling Python 3 support.
>>
>> Graham, I hope I did a good job with your suggestions. (Since so much
>> of this is yours, I've just listed you as the second author.) I tried
>> to clarify exactly what you meant by "native strings", can you check
>> that out?
>>
>> Cheers,
>>
>> Dirkjan
>>
>> --- pep-0333.txt        2010-04-15 14:46:02.000000000 +0200
>> +++ wsgi-1.1.txt
?2010-04-15 14:51:39.000000000 +0200 >> @@ -1,114 +1,124 @@ >> -PEP: 333 >> -Title: Python Web Server Gateway Interface v1.0 >> +PEP: 0000 >> +Title: Python Web Server Gateway Interface 1.1 >> ?Version: $Revision$ >> ?Last-Modified: $Date$ >> -Author: Phillip J. Eby >> +Author: Dirkjan Ochtman , >> + ? ? ? ?Graham Dumpleton >> ?Discussions-To: Python Web-SIG >> ?Status: Draft >> ?Type: Informational >> ?Content-Type: text/x-rst >> -Created: 07-Dec-2003 >> -Post-History: 07-Dec-2003, 08-Aug-2004, 20-Aug-2004, 27-Aug-2004 >> +Created: 15-04-2010 >> +Post-History: Not yet >> >> >> ?Abstract >> ?======== >> >> -This document specifies a proposed standard interface between web >> -servers and Python web applications or frameworks, to promote web >> -application portability across a variety of web servers. >> +This document specifies a revision of the proposed standard interface >> +between web servers and Python web applications or frameworks, to >> +promote web application portability across a variety of web servers. >> >> >> ?Rationale and Goals >> ?=================== >> >> -Python currently boasts a wide variety of web application frameworks, >> -such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to >> -name just a few [1]_. ?This wide variety of choices can be a problem >> -for new Python users, because generally speaking, their choice of web >> -framework will limit their choice of usable web servers, and vice >> -versa. >> - >> -By contrast, although Java has just as many web application frameworks >> -available, Java's "servlet" API makes it possible for applications >> -written with any Java web application framework to run in any web >> -server that supports the servlet API. >> - >> -The availability and widespread use of such an API in web servers for >> -Python -- whether those servers are written in Python (e.g. Medusa), >> -embed Python (e.g. mod_python), or invoke Python via a gateway >> -protocol (e.g. CGI, FastCGI, etc.) 
-- would separate choice of >> -framework from choice of web server, freeing users to choose a pairing >> -that suits them, while freeing framework and server developers to >> -focus on their preferred area of specialization. >> - >> -This PEP, therefore, proposes a simple and universal interface between >> -web servers and web applications or frameworks: the Python Web Server >> -Gateway Interface (WSGI). >> - >> -But the mere existence of a WSGI spec does nothing to address the >> -existing state of servers and frameworks for Python web applications. >> -Server and framework authors and maintainers must actually implement >> -WSGI for there to be any effect. >> - >> -However, since no existing servers or frameworks support WSGI, there >> -is little immediate reward for an author who implements WSGI support. >> -Thus, WSGI **must** be easy to implement, so that an author's initial >> -investment in the interface can be reasonably low. >> - >> -Thus, simplicity of implementation on *both* the server and framework >> -sides of the interface is absolutely critical to the utility of the >> -WSGI interface, and is therefore the principal criterion for any >> -design decisions. >> - >> -Note, however, that simplicity of implementation for a framework >> -author is not the same thing as ease of use for a web application >> -author. ?WSGI presents an absolutely "no frills" interface to the >> -framework author, because bells and whistles like response objects and >> -cookie handling would just get in the way of existing frameworks' >> -handling of these issues. ?Again, the goal of WSGI is to facilitate >> -easy interconnection of existing servers and applications or >> -frameworks, not to create a new web framework. >> - >> -Note also that this goal precludes WSGI from requiring anything that >> -is not already available in deployed versions of Python. 
?Therefore, >> -new standard library modules are not proposed or required by this >> -specification, and nothing in WSGI requires a Python version greater >> -than 2.2.2. ?(It would be a good idea, however, for future versions >> -of Python to include support for this interface in web servers >> -provided by the standard library.) >> - >> -In addition to ease of implementation for existing and future >> -frameworks and servers, it should also be easy to create request >> -preprocessors, response postprocessors, and other WSGI-based >> -"middleware" components that look like an application to their >> -containing server, while acting as a server for their contained >> -applications. >> - >> -If middleware can be both simple and robust, and WSGI is widely >> -available in servers and frameworks, it allows for the possibility >> -of an entirely new kind of Python web application framework: one >> -consisting of loosely-coupled WSGI middleware components. ?Indeed, >> -existing framework authors may even choose to refactor their >> -frameworks' existing services to be provided in this way, becoming >> -more like libraries used with WSGI, and less like monolithic >> -frameworks. ?This would then allow application developers to choose >> -"best-of-breed" components for specific functionality, rather than >> -having to commit to all the pros and cons of a single framework. >> - >> -Of course, as of this writing, that day is doubtless quite far off. >> -In the meantime, it is a sufficient short-term goal for WSGI to >> -enable the use of any framework with any server. >> - >> -Finally, it should be mentioned that the current version of WSGI >> -does not prescribe any particular mechanism for "deploying" an >> -application for use with a web server or server gateway. ?At the >> -present time, this is necessarily implementation-defined by the >> -server or gateway. 
?After a sufficient number of servers and >> -frameworks have implemented WSGI to provide field experience with >> -varying deployment requirements, it may make sense to create >> -another PEP, describing a deployment standard for WSGI servers and >> -application frameworks. >> +WSGI 1.0, specified in PEP 333, did a great job in making it easier >> +for web applications and web servers to interface with each other. >> +It has become very much the standard it was meant to be and an >> +important part of the Python web development infrastructure. >> + >> +After several implementations were built by different developers, >> +it inevitably turned out that the specification wasn't perfect. It >> +left out some details that were implemented by all the web server >> +interfaces because they were critical for many applications (or >> +application frameworks). Additionally, the specification was written >> +before Python 3.x was specified, resulting in a lack of clear >> +specification on what to do with unicode strings. >> + >> +While there are some ideas around to improve WSGI further in less >> +compatible ways, we feel that there is value to be had in first >> +specifying a minor revision of the specification, which is largely >> +compatible with existing implementations. Further simplification >> +and experimentation are therefore deferred to a 2.0 version. >> + >> + >> +Differences with WSGI 1.0 >> +========================= >> + >> +Descriptive changes >> +------------------- >> + >> +The following changes were made to realign the spec with >> +implementations 'in the wild'. >> + >> +1. The 'readline()' function of 'wsgi.input' must optionally take >> + ? a size hint. This is required because many applications use >> + ? cgi.FieldStorage, which uses this functionality. >> + >> +2. The 'wsgi.input' functions for reading input must return an empty >> + ? string as end of input stream marker. This is required for support >> + ? of HTTP 1.1 request pipelining. 
A correctly implemented WSGI >> + ? middleware already has to cope with an empty string as end >> + ? sentinel anyway to detect premature end of input. >> + >> +3. Any WSGI application or middleware should not itself return, or >> + ? consume from a wrapped WSGI component, more data than specified by >> + ? the Content-Length response header if defined. Middleware that >> + ? does this is arguably broken and can generate incorrect data. >> + ? This is just a clarification of obligations. >> + >> +4. The WSGI adapter must not pass on to the server any data above >> + ? what the Content-Length response header defines, if supplied. >> + ? Doing this is technically a violation of HTTP. This is another >> + ? clarification of obligations. >> + >> + >> +String handling changes >> +----------------------- >> + >> +The following changes were made to make WSGI work on Python 3.x. >> + >> +1. The application is passed an instance of a Python dictionary >> + ? containing what is referred to as the WSGI environment. All keys >> + ? in this dictionary are native strings. For CGI variables, all names >> + ? are going to be ISO-8859-1 and so where native strings are >> + ? unicode strings, that encoding is used for the names of CGI >> + ? variables. >> + >> +2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI >> + ? environment, the value of the variable should be a native string. >> + >> +3. For the CGI variables contained in the WSGI environment, the values >> + ? of the variables are native strings. Where native strings are >> + ? unicode strings, ISO-8859-1 encoding would be used such that the >> + ? original character data is preserved and as necessary the unicode >> + ? string can be converted back to bytes and thence decoded to unicode >> + ? again using a different encoding. >> + >> +4. The WSGI input stream 'wsgi.input' contained in the WSGI environment >> + ? and from which request content is read, should yield byte strings. >> + >> +5. 
The status line specified by the WSGI application should be a byte >> + ? string. Where native strings are unicode strings, the native string >> + ? type can also be returned in which case it would be encoded as >> + ? ISO-8859-1. >> + >> +6. The list of response headers specified by the WSGI application should >> + ? contain tuples consisting of two values, where each value is a byte >> + ? string. Where native strings are unicode strings, the native string >> + ? type can also be returned in which case it would be encoded as >> + ? ISO-8859-1. >> + >> +7. The iterable returned by the application and from which response >> + ? content is derived, should yield byte strings. Where native strings >> + ? are unicode strings, the native string type can also be returned in >> + ? which case it would be encoded as ISO-8859-1. >> + >> +8. The value passed to the 'write()' callback returned by >> + ? 'start_response()' should be a byte string. Where native strings >> + ? are unicode strings, a native string type can also be supplied, in >> + ? which case it would be encoded as ISO-8859-1. >> >> >> ?Specification Overview >> @@ -447,6 +457,13 @@ >> ?Streaming`_ section below for more on how application output must be >> ?handled.) >> >> +Further on, several places specify constraints upon string types used >> +in the WSGI API. The term native string is used to mean the 'str' class >> +in both Python 2.x and 3.x. The spec tries to ensure optimal >> +compatibility and ease of use by allowing implementations running on >> +Python 3.x to encode strings (which are Unicode strings with no >> +specified encoding) as ISO-8859-1 where a 3.x string is passed in. >> + >> ?The server or gateway should treat the yielded strings as binary byte >> ?sequences: in particular, it should ensure that line endings are >> ?not altered. 
?The application is responsible for ensuring that the >> @@ -489,12 +506,22 @@ >> ?``environ`` Variables >> ?--------------------- >> >> +All keys in this dictionary are native strings. For CGI variables, >> +all names are going to be ISO-8859-1 and so where native strings are >> +unicode strings, that encoding is used for the names of CGI variables. >> + >> ?The ``environ`` dictionary is required to contain these CGI >> ?environment variables, as defined by the Common Gateway Interface >> ?specification [2]_. ?The following variables **must** be present, >> ?unless their value would be an empty string, in which case they >> ?**may** be omitted, except as otherwise noted below. >> >> +The values for CGI variables are native strings. Where native strings >> +are unicode strings, ISO-8859-1 encoding would be used such that the >> +original character data is preserved and as necessary the unicode >> +string can be converted back to bytes and thence decoded to unicode >> +again using a different encoding. >> + >> ?``REQUEST_METHOD`` >> ? The HTTP request method, such as ``"GET"`` or ``"POST"``. ?This >> ? cannot ever be an empty string, and so is always required. >> @@ -575,13 +602,14 @@ >> ?===================== ?=============================================== >> ?Variable ? ? ? ? ? ? ? Value >> ?===================== ?=============================================== >> -``wsgi.version`` ? ? ? The tuple ``(1,0)``, representing WSGI >> +``wsgi.version`` ? ? ? The tuple ``(1, 0)``, representing WSGI >> ? ? ? ? ? ? ? ? ? ? ? ?version 1.0. >> >> ?``wsgi.url_scheme`` ? ?A string representing the "scheme" portion of >> ? ? ? ? ? ? ? ? ? ? ? ?the URL at which the application is being >> ? ? ? ? ? ? ? ? ? ? ? ?invoked. ?Normally, this will have the value >> - ? ? ? ? ? ? ? ? ? ? ? ``"http"`` or ``"https"``, as appropriate. >> + ? ? ? ? ? ? ? ? ? ? ? ``"http"`` or ``"https"``, as appropriate. The >> + ? ? ? ? ? ? ? ? ? ? ? value is a native string. >> >> ?``wsgi.input`` ? ? ? ? 
An input stream (file-like object) from which >> ? ? ? ? ? ? ? ? ? ? ? ?the HTTP request body can be read. ?(The server >> @@ -646,7 +674,7 @@ >> ?Method ? ? ? ? ? ? ? Stream ? ? ?Notes >> ?=================== ?========== ?======== >> ?``read(size)`` ? ? ? ``input`` ? 1 >> -``readline()`` ? ? ? ``input`` ? 1,2 >> +``readline(hint)`` ? ``input`` ? 1,2 >> ?``readlines(hint)`` ?``input`` ? 1,3 >> ?``__iter__()`` ? ? ? ``input`` >> ?``flush()`` ? ? ? ? ?``errors`` ?4 >> @@ -661,11 +689,12 @@ >> ? ?``Content-Length``, and is allowed to simulate an end-of-file >> ? ?condition if the application attempts to read past that point. >> ? ?The application **should not** attempt to read more data than is >> - ? specified by the ``CONTENT_LENGTH`` variable. >> + ? specified by the ``CONTENT_LENGTH`` variable. All read functions >> + ? are required to return an empty string as the end of input stream >> + ? marker. They must yield byte strings. >> >> -2. The optional "size" argument to ``readline()`` is not supported, >> - ? as it may be complex for server authors to implement, and is not >> - ? often used in practice. >> +2. The optional "size" argument to ``readline()`` is required for >> + ? the implementer, but optional for callers. >> >> ?3. Note that the ``hint`` argument to ``readlines()`` is optional for >> ? ?both caller and implementer. ?The application is free not to >> @@ -692,12 +721,15 @@ >> ?--------------------------------- >> >> ?The second parameter passed to the application object is a callable >> -of the form ``start_response(status,response_headers,exc_info=None)``. >> +of the form ``start_response(status, response_headers, exc_info=None)``. >> ?(As with all WSGI callables, the arguments must be supplied >> ?positionally, not by keyword.) ?The ``start_response`` callable is >> ?used to begin the HTTP response, and it must return a >> ?``write(body_data)`` callable (see the `Buffering and Streaming`_ >> -section, below). >> +section, below). 
Values passed to the ``write(body_data)`` callable >> +should be byte strings. Where native strings are unicode strings, a >> +native strings type can also be supplied, in which case it would be >> +encoded as ISO-8859-1. >> >> ?The ``status`` argument is an HTTP "status" string like ``"200 OK"`` >> ?or ``"404 Not Found"``. ?That is, it is a string consisting of a >> @@ -705,14 +737,20 @@ >> ?single space, with no surrounding whitespace or other characters. >> ?(See RFC 2616, Section 6.1.1 for more information.) ?The string >> ?**must not** contain control characters, and must not be terminated >> -with a carriage return, linefeed, or combination thereof. >> +with a carriage return, linefeed, or combination thereof. This >> +value should be a byte string. Where native strings are unicode >> +strings, the native string type can also be returned, in which >> +case it would be encoded as ISO-8859-1. >> >> ?The ``response_headers`` argument is a list of ``(header_name, >> ?header_value)`` tuples. ?It must be a Python list; i.e. >> -``type(response_headers) is ListType``, and the server **may** change >> +``type(response_headers) is list``, and the server **may** change >> ?its contents in any way it desires. ?Each ``header_name`` must be a >> ?valid HTTP header field-name (as defined by RFC 2616, Section 4.2), >> -without a trailing colon or other punctuation. >> +without a trailing colon or other punctuation. Both the header_name >> +and the header_value should be byte strings. Where native strings >> +are unicode strings, the native string type can also be returned, >> +in which case it would be encoded as ISO-8859-1. >> >> ?Each ``header_value`` **must not** include *any* control characters, >> ?including carriage returns or linefeeds, either embedded or at the end. 
>> @@ -809,6 +847,14 @@ >> ?Handling the ``Content-Length`` Header >> ?~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> +If an application or middleware layer chooses to return a >> +Content-Length header, it should not return more data than specified >> +by the header value. Any wrapping middleware layer should not >> +consume more data than specified in the header value from the >> +wrapped component (either middleware or application). Any WSGI >> +adapter must similarly not pass on data above what the >> +Content-Length response header value defines. >> + >> ?If the application does not supply a ``Content-Length`` header, a >> ?server or gateway may choose one of several approaches to handling >> ?it. ?The simplest of these is to close the client connection when >> @@ -1569,55 +1615,13 @@ >> ? ?developers. >> >> >> -Proposed/Under Discussion >> -========================= >> - >> -These items are currently being discussed on the Web-SIG and elsewhere, >> -or are on the PEP author's "to-do" list: >> - >> -* Should ``wsgi.input`` be an iterator instead of a file? ?This would >> - ?help for asynchronous applications and chunked-encoding input >> - ?streams. >> - >> -* Optional extensions are being discussed for pausing iteration of an >> - ?application's ouptut until input is available or until a callback >> - ?occurs. >> - >> -* Add a section about synchronous vs. asynchronous apps and servers, >> - ?the relevant threading models, and issues/design goals in these >> - ?areas. >> - >> - >> ?Acknowledgements >> ?================ >> >> -Thanks go to the many folks on the Web-SIG mailing list whose >> -thoughtful feedback made this revised draft possible. ?Especially: >> +Thanks go to many folks on the Web-SIG mailing list for helping the work >> +on clarifying and improving this specification. 
In particular:
>>
>> -* Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up
>> -  on the first draft as not offering any advantages over "plain old
>> -  CGI", thus encouraging me to look for a better approach.
>> -
>> -* Ian Bicking, who helped nag me into properly specifying the
>> -  multithreading and multiprocess options, as well as badgering me to
>> -  provide a mechanism for servers to supply custom extension data to
>> -  an application.
>> -
>> -* Tony Lownds, who came up with the concept of a ``start_response``
>> -  function that took the status and headers, returning a ``write``
>> -  function.  His input also guided the design of the exception handling
>> -  facilities, especially in the area of allowing for middleware that
>> -  overrides application error messages.
>> -
>> -* Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython
>> -  (well before the spec was finalized) helped to shape the "supporting
>> -  older versions of Python" section, as well as the optional
>> -  ``wsgi.file_wrapper`` facility.
>> -
>> -* Mark Nottingham, who reviewed the spec extensively for issues with
>> -  HTTP RFC compliance, especially with regard to HTTP/1.1 features that
>> -  I didn't even know existed until he pointed them out.
>> -
>> +* Phillip J. Eby, for writing/editing the 1.0 specification.
>>
>>  References
>>  ==========
>> @@ -1643,8 +1647,6 @@
>>
>>  This document has been placed in the public domain.
>>
>> -
>> -
>>  ..
>>    Local Variables:
>>    mode: indented-text
>>
>

From paul.joseph.davis at gmail.com Fri Apr 16 05:29:29 2010
From: paul.joseph.davis at gmail.com (Paul Davis)
Date: Thu, 15 Apr 2010 23:29:29 -0400
Subject: [Web-SIG] Draft PEP: WSGI 1.1
In-Reply-To:
References:
Message-ID:

On Thu, Apr 15, 2010 at 10:08 PM, Graham Dumpleton wrote:
> On 16 April 2010 11:41, Graham Dumpleton wrote:
>> I haven't read what you have done yet
>
> And still haven't. Don't know when I will get a chance to do so.
>
> Two points from a quick scan of emails.
>
> 1. The following section of PEP needs to be updated:
>
> """
>  1417 Apart from the handling of ``close()``, the semantics of returning a
>  1418 file wrapper from the application should be the same as if the
>  1419 application had returned ``iter(filelike.read, '')``.  In other words,
>  1420 transmission should begin at the current position within the "file"
>  1421 at the time that transmission begins, and continue until the end is
>  1422 reached.
> """
>
> It can't say read until 'end is reached' of file as Content-Length
> must be honoured and less returned if Content-Length is less than what
> is available in the remainder of the file as per descriptive changes
> (3) and (4).
>
> In respect of question about readline() arguments and whether -1 or
> None is allowed. I would say no they are not. Must be positive integer
> or no argument supplied at all.
>
> Different implementations use -1 or None as value of a default
> argument to know when an argument wasn't supplied. One cant rely
> though on one or the other being used and so that supplying those
> arguments explicitly means the same thing as no argument supplied. In
> other words, supplying anything but positive integer or no argument at
> all is undefined.
>
> Same issue arises with read() except that only positive integer can
> technically be supplied and argument is not optional. Although, any
> implementation which implements wsgi.input as a proper file like
> argument is going to accept no argument to mean read all input, this
> is outside of WSGI specification and calling with no argument is
> undefined.
>
> Graham

I happened to have just started hitting the body reading functions on
an HTTP parser I've been working on. I'd be interested to hear a
response on what happens when the various read functions are called
with a size hint of zero. I realize that zero is not a positive integer,
but I'm not quite sure what the recommended return value would be. I
can see None and -1 being obvious flags for "no size hint", but zero is
a tad weird. I want to say that it'd either return "" (which could sorta
kinda violate #2) or raise an exception. I really haven't got any reason
to prefer one over the other, though.

As an aside, I think that "honoring Content-Length" should probably be
rephrased as "middleware should not break HTTP", coupled with a page
that lists common ways that middleware breaks HTTP. I reckon it's the
same reasoning as for 333's dictation that hop-by-hop headers are server
only, though there are plenty of other ways I could violate RFC 2616 as
a middleware author without violating WSGI. Pie in the sky, the common
ways would be included with wsgiref's validate decorator.

Paul

>> but if you have done so
>> already, ensure you read:
>>
>>  http://bitbucket.org/ianb/wsgi-peps/src/
>>
>> This is Ian's and Armin's previous go at new specification. It though
>> tried to go further than what you are doing.
>>
>> Also read:
>>
>>  http://blog.dscpl.com.au/2009/09/roadmap-for-python-wsgi-specification.html
>>
>> I explain what I mean by native strings in that.
>>
>> Graham
>>
>> On 15 April 2010 22:54, Dirkjan Ochtman wrote:
>>> Mostly taking Graham's list of issues and incorporating it into PEP 333.
>>>
>>> Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt
>>>
>>> Let's have comments here (comments in the form of diffs are
>>> particularly welcome, of course). Remember, the idea is not to change
>>> or improve WSGI right now, but only to improve the spec, improving
>>> interoperability and enabling Python 3 support.
>>>
>>> Graham, I hope I did a good job with your suggestions. (Since so much
>>> of this is yours, I've just listed you as the second author.) I tried
>>> to clarify exactly what you meant by "native strings", can you check
>>> that out?
>>> >>> Cheers, >>> >>> Dirkjan >>> >>> --- pep-0333.txt ? ? ? ?2010-04-15 14:46:02.000000000 +0200 >>> +++ wsgi-1.1.txt ? ? ? ?2010-04-15 14:51:39.000000000 +0200 >>> @@ -1,114 +1,124 @@ >>> -PEP: 333 >>> -Title: Python Web Server Gateway Interface v1.0 >>> +PEP: 0000 >>> +Title: Python Web Server Gateway Interface 1.1 >>> ?Version: $Revision$ >>> ?Last-Modified: $Date$ >>> -Author: Phillip J. Eby >>> +Author: Dirkjan Ochtman , >>> + ? ? ? ?Graham Dumpleton >>> ?Discussions-To: Python Web-SIG >>> ?Status: Draft >>> ?Type: Informational >>> ?Content-Type: text/x-rst >>> -Created: 07-Dec-2003 >>> -Post-History: 07-Dec-2003, 08-Aug-2004, 20-Aug-2004, 27-Aug-2004 >>> +Created: 15-04-2010 >>> +Post-History: Not yet >>> >>> >>> ?Abstract >>> ?======== >>> >>> -This document specifies a proposed standard interface between web >>> -servers and Python web applications or frameworks, to promote web >>> -application portability across a variety of web servers. >>> +This document specifies a revision of the proposed standard interface >>> +between web servers and Python web applications or frameworks, to >>> +promote web application portability across a variety of web servers. >>> >>> >>> ?Rationale and Goals >>> ?=================== >>> >>> -Python currently boasts a wide variety of web application frameworks, >>> -such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to >>> -name just a few [1]_. ?This wide variety of choices can be a problem >>> -for new Python users, because generally speaking, their choice of web >>> -framework will limit their choice of usable web servers, and vice >>> -versa. >>> - >>> -By contrast, although Java has just as many web application frameworks >>> -available, Java's "servlet" API makes it possible for applications >>> -written with any Java web application framework to run in any web >>> -server that supports the servlet API. 
>>> - >>> -The availability and widespread use of such an API in web servers for >>> -Python -- whether those servers are written in Python (e.g. Medusa), >>> -embed Python (e.g. mod_python), or invoke Python via a gateway >>> -protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of >>> -framework from choice of web server, freeing users to choose a pairing >>> -that suits them, while freeing framework and server developers to >>> -focus on their preferred area of specialization. >>> - >>> -This PEP, therefore, proposes a simple and universal interface between >>> -web servers and web applications or frameworks: the Python Web Server >>> -Gateway Interface (WSGI). >>> - >>> -But the mere existence of a WSGI spec does nothing to address the >>> -existing state of servers and frameworks for Python web applications. >>> -Server and framework authors and maintainers must actually implement >>> -WSGI for there to be any effect. >>> - >>> -However, since no existing servers or frameworks support WSGI, there >>> -is little immediate reward for an author who implements WSGI support. >>> -Thus, WSGI **must** be easy to implement, so that an author's initial >>> -investment in the interface can be reasonably low. >>> - >>> -Thus, simplicity of implementation on *both* the server and framework >>> -sides of the interface is absolutely critical to the utility of the >>> -WSGI interface, and is therefore the principal criterion for any >>> -design decisions. >>> - >>> -Note, however, that simplicity of implementation for a framework >>> -author is not the same thing as ease of use for a web application >>> -author. ?WSGI presents an absolutely "no frills" interface to the >>> -framework author, because bells and whistles like response objects and >>> -cookie handling would just get in the way of existing frameworks' >>> -handling of these issues. 
?Again, the goal of WSGI is to facilitate >>> -easy interconnection of existing servers and applications or >>> -frameworks, not to create a new web framework. >>> - >>> -Note also that this goal precludes WSGI from requiring anything that >>> -is not already available in deployed versions of Python. ?Therefore, >>> -new standard library modules are not proposed or required by this >>> -specification, and nothing in WSGI requires a Python version greater >>> -than 2.2.2. ?(It would be a good idea, however, for future versions >>> -of Python to include support for this interface in web servers >>> -provided by the standard library.) >>> - >>> -In addition to ease of implementation for existing and future >>> -frameworks and servers, it should also be easy to create request >>> -preprocessors, response postprocessors, and other WSGI-based >>> -"middleware" components that look like an application to their >>> -containing server, while acting as a server for their contained >>> -applications. >>> - >>> -If middleware can be both simple and robust, and WSGI is widely >>> -available in servers and frameworks, it allows for the possibility >>> -of an entirely new kind of Python web application framework: one >>> -consisting of loosely-coupled WSGI middleware components. ?Indeed, >>> -existing framework authors may even choose to refactor their >>> -frameworks' existing services to be provided in this way, becoming >>> -more like libraries used with WSGI, and less like monolithic >>> -frameworks. ?This would then allow application developers to choose >>> -"best-of-breed" components for specific functionality, rather than >>> -having to commit to all the pros and cons of a single framework. >>> - >>> -Of course, as of this writing, that day is doubtless quite far off. >>> -In the meantime, it is a sufficient short-term goal for WSGI to >>> -enable the use of any framework with any server. 
>>> - >>> -Finally, it should be mentioned that the current version of WSGI >>> -does not prescribe any particular mechanism for "deploying" an >>> -application for use with a web server or server gateway. ?At the >>> -present time, this is necessarily implementation-defined by the >>> -server or gateway. ?After a sufficient number of servers and >>> -frameworks have implemented WSGI to provide field experience with >>> -varying deployment requirements, it may make sense to create >>> -another PEP, describing a deployment standard for WSGI servers and >>> -application frameworks. >>> +WSGI 1.0, specified in PEP 333, did a great job in making it easier >>> +for web applications and web servers to interface with each other. >>> +It has become very much the standard it was meant to be and an >>> +important part of the Python web development infrastructure. >>> + >>> +After several implementations were built by different developers, >>> +it inevitably turned out that the specification wasn't perfect. It >>> +left out some details that were implemented by all the web server >>> +interfaces because they were critical for many applications (or >>> +application frameworks). Additionally, the specification was written >>> +before Python 3.x was specified, resulting in a lack of clear >>> +specification on what to do with unicode strings. >>> + >>> +While there are some ideas around to improve WSGI further in less >>> +compatible ways, we feel that there is value to be had in first >>> +specifying a minor revision of the specification, which is largely >>> +compatible with existing implementations. Further simplification >>> +and experimentation are therefore deferred to a 2.0 version. >>> + >>> + >>> +Differences with WSGI 1.0 >>> +========================= >>> + >>> +Descriptive changes >>> +------------------- >>> + >>> +The following changes were made to realign the spec with >>> +implementations 'in the wild'. >>> + >>> +1. 
The 'readline()' function of 'wsgi.input' must optionally take >>> + ? a size hint. This is required because many applications use >>> + ? cgi.FieldStorage, which uses this functionality. >>> + >>> +2. The 'wsgi.input' functions for reading input must return an empty >>> + ? string as end of input stream marker. This is required for support >>> + ? of HTTP 1.1 request pipelining. A correctly implemented WSGI >>> + ? middleware already has to cope with an empty string as end >>> + ? sentinel anyway to detect premature end of input. >>> + >>> +3. Any WSGI application or middleware should not itself return, or >>> + ? consume from a wrapped WSGI component, more data than specified by >>> + ? the Content-Length response header if defined. Middleware that >>> + ? does this is arguably broken and can generate incorrect data. >>> + ? This is just a clarification of obligations. >>> + >>> +4. The WSGI adapter must not pass on to the server any data above >>> + ? what the Content-Length response header defines, if supplied. >>> + ? Doing this is technically a violation of HTTP. This is another >>> + ? clarification of obligations. >>> + >>> + >>> +String handling changes >>> +----------------------- >>> + >>> +The following changes were made to make WSGI work on Python 3.x. >>> + >>> +1. The application is passed an instance of a Python dictionary >>> + ? containing what is referred to as the WSGI environment. All keys >>> + ? in this dictionary are native strings. For CGI variables, all names >>> + ? are going to be ISO-8859-1 and so where native strings are >>> + ? unicode strings, that encoding is used for the names of CGI >>> + ? variables. >>> + >>> +2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI >>> + ? environment, the value of the variable should be a native string. >>> + >>> +3. For the CGI variables contained in the WSGI environment, the values >>> + ? of the variables are native strings. Where native strings are >>> + ? 
unicode strings, ISO-8859-1 encoding would be used such that the >>> + ? original character data is preserved and as necessary the unicode >>> + ? string can be converted back to bytes and thence decoded to unicode >>> + ? again using a different encoding. >>> + >>> +4. The WSGI input stream 'wsgi.input' contained in the WSGI environment >>> + ? and from which request content is read, should yield byte strings. >>> + >>> +5. The status line specified by the WSGI application should be a byte >>> + ? string. Where native strings are unicode strings, the native string >>> + ? type can also be returned in which case it would be encoded as >>> + ? ISO-8859-1. >>> + >>> +6. The list of response headers specified by the WSGI application should >>> + ? contain tuples consisting of two values, where each value is a byte >>> + ? string. Where native strings are unicode strings, the native string >>> + ? type can also be returned in which case it would be encoded as >>> + ? ISO-8859-1. >>> + >>> +7. The iterable returned by the application and from which response >>> + ? content is derived, should yield byte strings. Where native strings >>> + ? are unicode strings, the native string type can also be returned in >>> + ? which case it would be encoded as ISO-8859-1. >>> + >>> +8. The value passed to the 'write()' callback returned by >>> + ? 'start_response()' should be a byte string. Where native strings >>> + ? are unicode strings, a native string type can also be supplied, in >>> + ? which case it would be encoded as ISO-8859-1. >>> >>> >>> ?Specification Overview >>> @@ -447,6 +457,13 @@ >>> ?Streaming`_ section below for more on how application output must be >>> ?handled.) >>> >>> +Further on, several places specify constraints upon string types used >>> +in the WSGI API. The term native string is used to mean the 'str' class >>> +in both Python 2.x and 3.x. 
The spec tries to ensure optimal >>> +compatibility and ease of use by allowing implementations running on >>> +Python 3.x to encode strings (which are Unicode strings with no >>> +specified encoding) as ISO-8859-1 where a 3.x string is passed in. >>> + >>> ?The server or gateway should treat the yielded strings as binary byte >>> ?sequences: in particular, it should ensure that line endings are >>> ?not altered. ?The application is responsible for ensuring that the >>> @@ -489,12 +506,22 @@ >>> ?``environ`` Variables >>> ?--------------------- >>> >>> +All keys in this dictionary are native strings. For CGI variables, >>> +all names are going to be ISO-8859-1 and so where native strings are >>> +unicode strings, that encoding is used for the names of CGI variables. >>> + >>> ?The ``environ`` dictionary is required to contain these CGI >>> ?environment variables, as defined by the Common Gateway Interface >>> ?specification [2]_. ?The following variables **must** be present, >>> ?unless their value would be an empty string, in which case they >>> ?**may** be omitted, except as otherwise noted below. >>> >>> +The values for CGI variables are native strings. Where native strings >>> +are unicode strings, ISO-8859-1 encoding would be used such that the >>> +original character data is preserved and as necessary the unicode >>> +string can be converted back to bytes and thence decoded to unicode >>> +again using a different encoding. >>> + >>> ?``REQUEST_METHOD`` >>> ? The HTTP request method, such as ``"GET"`` or ``"POST"``. ?This >>> ? cannot ever be an empty string, and so is always required. >>> @@ -575,13 +602,14 @@ >>> ?===================== ?=============================================== >>> ?Variable ? ? ? ? ? ? ? Value >>> ?===================== ?=============================================== >>> -``wsgi.version`` ? ? ? The tuple ``(1,0)``, representing WSGI >>> +``wsgi.version`` ? ? ? The tuple ``(1, 0)``, representing WSGI >>> ? ? ? ? ? ? ? ? ? ? ? 
?version 1.0. >>> >>> ?``wsgi.url_scheme`` ? ?A string representing the "scheme" portion of >>> ? ? ? ? ? ? ? ? ? ? ? ?the URL at which the application is being >>> ? ? ? ? ? ? ? ? ? ? ? ?invoked. ?Normally, this will have the value >>> - ? ? ? ? ? ? ? ? ? ? ? ``"http"`` or ``"https"``, as appropriate. >>> + ? ? ? ? ? ? ? ? ? ? ? ``"http"`` or ``"https"``, as appropriate. The >>> + ? ? ? ? ? ? ? ? ? ? ? value is a native string. >>> >>> ?``wsgi.input`` ? ? ? ? An input stream (file-like object) from which >>> ? ? ? ? ? ? ? ? ? ? ? ?the HTTP request body can be read. ?(The server >>> @@ -646,7 +674,7 @@ >>> ?Method ? ? ? ? ? ? ? Stream ? ? ?Notes >>> ?=================== ?========== ?======== >>> ?``read(size)`` ? ? ? ``input`` ? 1 >>> -``readline()`` ? ? ? ``input`` ? 1,2 >>> +``readline(hint)`` ? ``input`` ? 1,2 >>> ?``readlines(hint)`` ?``input`` ? 1,3 >>> ?``__iter__()`` ? ? ? ``input`` >>> ?``flush()`` ? ? ? ? ?``errors`` ?4 >>> @@ -661,11 +689,12 @@ >>> ? ?``Content-Length``, and is allowed to simulate an end-of-file >>> ? ?condition if the application attempts to read past that point. >>> ? ?The application **should not** attempt to read more data than is >>> - ? specified by the ``CONTENT_LENGTH`` variable. >>> + ? specified by the ``CONTENT_LENGTH`` variable. All read functions >>> + ? are required to return an empty string as the end of input stream >>> + ? marker. They must yield byte strings. >>> >>> -2. The optional "size" argument to ``readline()`` is not supported, >>> - ? as it may be complex for server authors to implement, and is not >>> - ? often used in practice. >>> +2. The optional "size" argument to ``readline()`` is required for >>> + ? the implementer, but optional for callers. >>> >>> ?3. Note that the ``hint`` argument to ``readlines()`` is optional for >>> ? ?both caller and implementer. 
?The application is free not to >>> @@ -692,12 +721,15 @@ >>> ?--------------------------------- >>> >>> ?The second parameter passed to the application object is a callable >>> -of the form ``start_response(status,response_headers,exc_info=None)``. >>> +of the form ``start_response(status, response_headers, exc_info=None)``. >>> ?(As with all WSGI callables, the arguments must be supplied >>> ?positionally, not by keyword.) ?The ``start_response`` callable is >>> ?used to begin the HTTP response, and it must return a >>> ?``write(body_data)`` callable (see the `Buffering and Streaming`_ >>> -section, below). >>> +section, below). Values passed to the ``write(body_data)`` callable >>> +should be byte strings. Where native strings are unicode strings, a >>> +native strings type can also be supplied, in which case it would be >>> +encoded as ISO-8859-1. >>> >>> ?The ``status`` argument is an HTTP "status" string like ``"200 OK"`` >>> ?or ``"404 Not Found"``. ?That is, it is a string consisting of a >>> @@ -705,14 +737,20 @@ >>> ?single space, with no surrounding whitespace or other characters. >>> ?(See RFC 2616, Section 6.1.1 for more information.) ?The string >>> ?**must not** contain control characters, and must not be terminated >>> -with a carriage return, linefeed, or combination thereof. >>> +with a carriage return, linefeed, or combination thereof. This >>> +value should be a byte string. Where native strings are unicode >>> +strings, the native string type can also be returned, in which >>> +case it would be encoded as ISO-8859-1. >>> >>> ?The ``response_headers`` argument is a list of ``(header_name, >>> ?header_value)`` tuples. ?It must be a Python list; i.e. >>> -``type(response_headers) is ListType``, and the server **may** change >>> +``type(response_headers) is list``, and the server **may** change >>> ?its contents in any way it desires. 
?Each ``header_name`` must be a >>> ?valid HTTP header field-name (as defined by RFC 2616, Section 4.2), >>> -without a trailing colon or other punctuation. >>> +without a trailing colon or other punctuation. Both the header_name >>> +and the header_value should be byte strings. Where native strings >>> +are unicode strings, the native string type can also be returned, >>> +in which case it would be encoded as ISO-8859-1. >>> >>> ?Each ``header_value`` **must not** include *any* control characters, >>> ?including carriage returns or linefeeds, either embedded or at the end. >>> @@ -809,6 +847,14 @@ >>> ?Handling the ``Content-Length`` Header >>> ?~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> +If an application or middleware layer chooses to return a >>> +Content-Length header, it should not return more data than specified >>> +by the header value. Any wrapping middleware layer should not >>> +consume more data than specified in the header value from the >>> +wrapped component (either middleware or application). Any WSGI >>> +adapter must similarly not pass on data above what the >>> +Content-Length response header value defines. >>> + >>> ?If the application does not supply a ``Content-Length`` header, a >>> ?server or gateway may choose one of several approaches to handling >>> ?it. ?The simplest of these is to close the client connection when >>> @@ -1569,55 +1615,13 @@ >>> ? ?developers. >>> >>> >>> -Proposed/Under Discussion >>> -========================= >>> - >>> -These items are currently being discussed on the Web-SIG and elsewhere, >>> -or are on the PEP author's "to-do" list: >>> - >>> -* Should ``wsgi.input`` be an iterator instead of a file? ?This would >>> - ?help for asynchronous applications and chunked-encoding input >>> - ?streams. >>> - >>> -* Optional extensions are being discussed for pausing iteration of an >>> - ?application's ouptut until input is available or until a callback >>> - ?occurs. >>> - >>> -* Add a section about synchronous vs. 
asynchronous apps and servers, >>> - ?the relevant threading models, and issues/design goals in these >>> - ?areas. >>> - >>> - >>> ?Acknowledgements >>> ?================ >>> >>> -Thanks go to the many folks on the Web-SIG mailing list whose >>> -thoughtful feedback made this revised draft possible. ?Especially: >>> +Thanks go to many folks on the Web-SIG mailing list for helping the work >>> +on clarifying and improving this specification. In particular: >>> >>> -* Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up >>> - ?on the first draft as not offering any advantages over "plain old >>> - ?CGI", thus encouraging me to look for a better approach. >>> - >>> -* Ian Bicking, who helped nag me into properly specifying the >>> - ?multithreading and multiprocess options, as well as badgering me to >>> - ?provide a mechanism for servers to supply custom extension data to >>> - ?an application. >>> - >>> -* Tony Lownds, who came up with the concept of a ``start_response`` >>> - ?function that took the status and headers, returning a ``write`` >>> - ?function. ?His input also guided the design of the exception handling >>> - ?facilities, especially in the area of allowing for middleware that >>> - ?overrides application error messages. >>> - >>> -* Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython >>> - ?(well before the spec was finalized) helped to shape the "supporting >>> - ?older versions of Python" section, as well as the optional >>> - ?``wsgi.file_wrapper`` facility. >>> - >>> -* Mark Nottingham, who reviewed the spec extensively for issues with >>> - ?HTTP RFC compliance, especially with regard to HTTP/1.1 features that >>> - ?I didn't even know existed until he pointed them out. >>> - >>> +* Phillip J. Eby, for writing/editing the 1.0 specification. >>> >>> ?References >>> ?========== >>> @@ -1643,8 +1647,6 @@ >>> >>> ?This document has been placed in the public domain. >>> >>> - >>> - >>> ?.. >>> ? ?Local Variables: >>> ? 
mode: indented-text
>>>
>>
> _______________________________________________
> Web-SIG mailing list
> Web-SIG at python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/paul.joseph.davis%40gmail.com
>

From graham.dumpleton at gmail.com Fri Apr 16 05:53:02 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 16 Apr 2010 13:53:02 +1000
Subject: [Web-SIG] Draft PEP: WSGI 1.1
In-Reply-To:
References:
Message-ID:

On 16 April 2010 13:29, Paul Davis wrote:
> On Thu, Apr 15, 2010 at 10:08 PM, Graham Dumpleton
> wrote:
>> On 16 April 2010 11:41, Graham Dumpleton wrote:
>>> I haven't read what you have done yet
>>
>> And still haven't. Don't know when I will get a chance to do so.
>>
>> Two points from a quick scan of emails.
>>
>> 1. The following section of PEP needs to be updated:
>>
>> """
>>  1417 Apart from the handling of ``close()``, the semantics of returning a
>>  1418 file wrapper from the application should be the same as if the
>>  1419 application had returned ``iter(filelike.read, '')``.  In other words,
>>  1420 transmission should begin at the current position within the "file"
>>  1421 at the time that transmission begins, and continue until the end is
>>  1422 reached.
>> """
>>
>> It can't say read until 'end is reached' of file as Content-Length
>> must be honoured and less returned if Content-Length is less than what
>> is available in the remainder of the file as per descriptive changes
>> (3) and (4).
>>
>> In respect of question about readline() arguments and whether -1 or
>> None is allowed. I would say no they are not. Must be positive integer
>> or no argument supplied at all.
>>
>> Different implementations use -1 or None as value of a default
>> argument to know when an argument wasn't supplied. One can't rely
>> though on one or the other being used and so that supplying those
>> arguments explicitly means the same thing as no argument supplied. In
>> other words, supplying anything but positive integer or no argument at
>> all is undefined.
>>
>> Same issue arises with read() except that only positive integer can
>> technically be supplied and argument is not optional. Although, any
>> implementation which implements wsgi.input as a proper file-like
>> argument is going to accept no argument to mean read all input, this
>> is outside of WSGI specification and calling with no argument is
>> undefined.
>>
>> Graham
>
> I happened to have just started hitting the body reading functions on
> an HTTP parser I've been working on. I'd be interested to hear a
> response on what happens when the various read functions are called
> with a size hint of zero.
>
> I realize that zero is not a positive integer but I'm not quite sure
> on what the recommended return value would be. I can see None and -1
> being obvious flags for "no size hint", but zero is a tad weird. I
> want to say that it'd either return "" (which could sorta kinda
> violate #2) or raise an exception. I really haven't got any reason to
> prefer one over the other though.

I almost mentioned 0 as an argument in my previous email, but I got a
bit scared off by it also.

In all these things, one has to be guided by what a standard file-like
object does in Python. I.e.,

    >>> import sys
    >>> sys.stdin.read(0)
    ''

So, although an empty string would normally indicate no more content
can be read, an argument of 0 has to be seen as a special exception to
that rule, with no choice but that an empty string is returned.

Graham

> As an aside, I think that "honoring Content-Length" should probably be
> rephrased to a "middleware should not break HTTP" coupled with a page
> that lists common ways that middleware breaks HTTP. I reckon it's the
> same reasoning for 333's dictation that hop-by-hop headers are server
> only, though there are plenty of other ways I could violate RFC 2616
> as a middleware author without violating WSGI.
Pie in the sky, the > common ways would be included with wsgiref's validate decorator. > > Paul > >>> but if you have done so >>> already, ensure you read: >>> >>> ?http://bitbucket.org/ianb/wsgi-peps/src/ >>> >>> This is Ian's and Armin's previous go at new specification. It though >>> tried to go further than what you are doing. >>> >>> Also read: >>> >>> ?http://blog.dscpl.com.au/2009/09/roadmap-for-python-wsgi-specification.html >>> >>> I explain what I mean by native strings in that. >>> >>> Graham >>> >>> On 15 April 2010 22:54, Dirkjan Ochtman wrote: >>>> Mostly taking Graham's list of issues and incorporating it into PEP 333. >>>> >>>> Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt >>>> >>>> Let's have comments here (comments in the form of diffs are >>>> particularly welcome, of course). Remember, the idea is not to change >>>> or improve WSGI right now, but only to improve the spec, improving >>>> interoperability and enabling Python 3 support. >>>> >>>> Graham, I hope I did a good job with your suggestions. (Since so much >>>> of this is yours, I've just listed you as the second author.) I tried >>>> to clarify exactly what you meant by "native strings", can you check >>>> that out? >>>> >>>> Cheers, >>>> >>>> Dirkjan >>>> >>>> --- pep-0333.txt ? ? ? ?2010-04-15 14:46:02.000000000 +0200 >>>> +++ wsgi-1.1.txt ? ? ? ?2010-04-15 14:51:39.000000000 +0200 >>>> @@ -1,114 +1,124 @@ >>>> -PEP: 333 >>>> -Title: Python Web Server Gateway Interface v1.0 >>>> +PEP: 0000 >>>> +Title: Python Web Server Gateway Interface 1.1 >>>> ?Version: $Revision$ >>>> ?Last-Modified: $Date$ >>>> -Author: Phillip J. Eby >>>> +Author: Dirkjan Ochtman , >>>> + ? ? ? 
?Graham Dumpleton >>>> ?Discussions-To: Python Web-SIG >>>> ?Status: Draft >>>> ?Type: Informational >>>> ?Content-Type: text/x-rst >>>> -Created: 07-Dec-2003 >>>> -Post-History: 07-Dec-2003, 08-Aug-2004, 20-Aug-2004, 27-Aug-2004 >>>> +Created: 15-04-2010 >>>> +Post-History: Not yet >>>> >>>> >>>> ?Abstract >>>> ?======== >>>> >>>> -This document specifies a proposed standard interface between web >>>> -servers and Python web applications or frameworks, to promote web >>>> -application portability across a variety of web servers. >>>> +This document specifies a revision of the proposed standard interface >>>> +between web servers and Python web applications or frameworks, to >>>> +promote web application portability across a variety of web servers. >>>> >>>> >>>> ?Rationale and Goals >>>> ?=================== >>>> >>>> -Python currently boasts a wide variety of web application frameworks, >>>> -such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to >>>> -name just a few [1]_. ?This wide variety of choices can be a problem >>>> -for new Python users, because generally speaking, their choice of web >>>> -framework will limit their choice of usable web servers, and vice >>>> -versa. >>>> - >>>> -By contrast, although Java has just as many web application frameworks >>>> -available, Java's "servlet" API makes it possible for applications >>>> -written with any Java web application framework to run in any web >>>> -server that supports the servlet API. >>>> - >>>> -The availability and widespread use of such an API in web servers for >>>> -Python -- whether those servers are written in Python (e.g. Medusa), >>>> -embed Python (e.g. mod_python), or invoke Python via a gateway >>>> -protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of >>>> -framework from choice of web server, freeing users to choose a pairing >>>> -that suits them, while freeing framework and server developers to >>>> -focus on their preferred area of specialization. 
>>>> - >>>> -This PEP, therefore, proposes a simple and universal interface between >>>> -web servers and web applications or frameworks: the Python Web Server >>>> -Gateway Interface (WSGI). >>>> - >>>> -But the mere existence of a WSGI spec does nothing to address the >>>> -existing state of servers and frameworks for Python web applications. >>>> -Server and framework authors and maintainers must actually implement >>>> -WSGI for there to be any effect. >>>> - >>>> -However, since no existing servers or frameworks support WSGI, there >>>> -is little immediate reward for an author who implements WSGI support. >>>> -Thus, WSGI **must** be easy to implement, so that an author's initial >>>> -investment in the interface can be reasonably low. >>>> - >>>> -Thus, simplicity of implementation on *both* the server and framework >>>> -sides of the interface is absolutely critical to the utility of the >>>> -WSGI interface, and is therefore the principal criterion for any >>>> -design decisions. >>>> - >>>> -Note, however, that simplicity of implementation for a framework >>>> -author is not the same thing as ease of use for a web application >>>> -author. ?WSGI presents an absolutely "no frills" interface to the >>>> -framework author, because bells and whistles like response objects and >>>> -cookie handling would just get in the way of existing frameworks' >>>> -handling of these issues. ?Again, the goal of WSGI is to facilitate >>>> -easy interconnection of existing servers and applications or >>>> -frameworks, not to create a new web framework. >>>> - >>>> -Note also that this goal precludes WSGI from requiring anything that >>>> -is not already available in deployed versions of Python. ?Therefore, >>>> -new standard library modules are not proposed or required by this >>>> -specification, and nothing in WSGI requires a Python version greater >>>> -than 2.2.2. 
?(It would be a good idea, however, for future versions >>>> -of Python to include support for this interface in web servers >>>> -provided by the standard library.) >>>> - >>>> -In addition to ease of implementation for existing and future >>>> -frameworks and servers, it should also be easy to create request >>>> -preprocessors, response postprocessors, and other WSGI-based >>>> -"middleware" components that look like an application to their >>>> -containing server, while acting as a server for their contained >>>> -applications. >>>> - >>>> -If middleware can be both simple and robust, and WSGI is widely >>>> -available in servers and frameworks, it allows for the possibility >>>> -of an entirely new kind of Python web application framework: one >>>> -consisting of loosely-coupled WSGI middleware components. ?Indeed, >>>> -existing framework authors may even choose to refactor their >>>> -frameworks' existing services to be provided in this way, becoming >>>> -more like libraries used with WSGI, and less like monolithic >>>> -frameworks. ?This would then allow application developers to choose >>>> -"best-of-breed" components for specific functionality, rather than >>>> -having to commit to all the pros and cons of a single framework. >>>> - >>>> -Of course, as of this writing, that day is doubtless quite far off. >>>> -In the meantime, it is a sufficient short-term goal for WSGI to >>>> -enable the use of any framework with any server. >>>> - >>>> -Finally, it should be mentioned that the current version of WSGI >>>> -does not prescribe any particular mechanism for "deploying" an >>>> -application for use with a web server or server gateway. ?At the >>>> -present time, this is necessarily implementation-defined by the >>>> -server or gateway. 
?After a sufficient number of servers and >>>> -frameworks have implemented WSGI to provide field experience with >>>> -varying deployment requirements, it may make sense to create >>>> -another PEP, describing a deployment standard for WSGI servers and >>>> -application frameworks. >>>> +WSGI 1.0, specified in PEP 333, did a great job in making it easier >>>> +for web applications and web servers to interface with each other. >>>> +It has become very much the standard it was meant to be and an >>>> +important part of the Python web development infrastructure. >>>> + >>>> +After several implementations were built by different developers, >>>> +it inevitably turned out that the specification wasn't perfect. It >>>> +left out some details that were implemented by all the web server >>>> +interfaces because they were critical for many applications (or >>>> +application frameworks). Additionally, the specification was written >>>> +before Python 3.x was specified, resulting in a lack of clear >>>> +specification on what to do with unicode strings. >>>> + >>>> +While there are some ideas around to improve WSGI further in less >>>> +compatible ways, we feel that there is value to be had in first >>>> +specifying a minor revision of the specification, which is largely >>>> +compatible with existing implementations. Further simplification >>>> +and experimentation are therefore deferred to a 2.0 version. >>>> + >>>> + >>>> +Differences with WSGI 1.0 >>>> +========================= >>>> + >>>> +Descriptive changes >>>> +------------------- >>>> + >>>> +The following changes were made to realign the spec with >>>> +implementations 'in the wild'. >>>> + >>>> +1. The 'readline()' function of 'wsgi.input' must optionally take >>>> + ? a size hint. This is required because many applications use >>>> + ? cgi.FieldStorage, which uses this functionality. >>>> + >>>> +2. The 'wsgi.input' functions for reading input must return an empty >>>> + ? string as end of input stream marker. 
This is required for support >>>> + ? of HTTP 1.1 request pipelining. A correctly implemented WSGI >>>> + ? middleware already has to cope with an empty string as end >>>> + ? sentinel anyway to detect premature end of input. >>>> + >>>> +3. Any WSGI application or middleware should not itself return, or >>>> + ? consume from a wrapped WSGI component, more data than specified by >>>> + ? the Content-Length response header if defined. Middleware that >>>> + ? does this is arguably broken and can generate incorrect data. >>>> + ? This is just a clarification of obligations. >>>> + >>>> +4. The WSGI adapter must not pass on to the server any data above >>>> + ? what the Content-Length response header defines, if supplied. >>>> + ? Doing this is technically a violation of HTTP. This is another >>>> + ? clarification of obligations. >>>> + >>>> + >>>> +String handling changes >>>> +----------------------- >>>> + >>>> +The following changes were made to make WSGI work on Python 3.x. >>>> + >>>> +1. The application is passed an instance of a Python dictionary >>>> + ? containing what is referred to as the WSGI environment. All keys >>>> + ? in this dictionary are native strings. For CGI variables, all names >>>> + ? are going to be ISO-8859-1 and so where native strings are >>>> + ? unicode strings, that encoding is used for the names of CGI >>>> + ? variables. >>>> + >>>> +2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI >>>> + ? environment, the value of the variable should be a native string. >>>> + >>>> +3. For the CGI variables contained in the WSGI environment, the values >>>> + ? of the variables are native strings. Where native strings are >>>> + ? unicode strings, ISO-8859-1 encoding would be used such that the >>>> + ? original character data is preserved and as necessary the unicode >>>> + ? string can be converted back to bytes and thence decoded to unicode >>>> + ? again using a different encoding. >>>> + >>>> +4. 
The WSGI input stream 'wsgi.input' contained in the WSGI environment >>>> + ? and from which request content is read, should yield byte strings. >>>> + >>>> +5. The status line specified by the WSGI application should be a byte >>>> + ? string. Where native strings are unicode strings, the native string >>>> + ? type can also be returned in which case it would be encoded as >>>> + ? ISO-8859-1. >>>> + >>>> +6. The list of response headers specified by the WSGI application should >>>> + ? contain tuples consisting of two values, where each value is a byte >>>> + ? string. Where native strings are unicode strings, the native string >>>> + ? type can also be returned in which case it would be encoded as >>>> + ? ISO-8859-1. >>>> + >>>> +7. The iterable returned by the application and from which response >>>> + ? content is derived, should yield byte strings. Where native strings >>>> + ? are unicode strings, the native string type can also be returned in >>>> + ? which case it would be encoded as ISO-8859-1. >>>> + >>>> +8. The value passed to the 'write()' callback returned by >>>> + ? 'start_response()' should be a byte string. Where native strings >>>> + ? are unicode strings, a native string type can also be supplied, in >>>> + ? which case it would be encoded as ISO-8859-1. >>>> >>>> >>>> ?Specification Overview >>>> @@ -447,6 +457,13 @@ >>>> ?Streaming`_ section below for more on how application output must be >>>> ?handled.) >>>> >>>> +Further on, several places specify constraints upon string types used >>>> +in the WSGI API. The term native string is used to mean the 'str' class >>>> +in both Python 2.x and 3.x. The spec tries to ensure optimal >>>> +compatibility and ease of use by allowing implementations running on >>>> +Python 3.x to encode strings (which are Unicode strings with no >>>> +specified encoding) as ISO-8859-1 where a 3.x string is passed in. 
>>>> + >>>> ?The server or gateway should treat the yielded strings as binary byte >>>> ?sequences: in particular, it should ensure that line endings are >>>> ?not altered. ?The application is responsible for ensuring that the >>>> @@ -489,12 +506,22 @@ >>>> ?``environ`` Variables >>>> ?--------------------- >>>> >>>> +All keys in this dictionary are native strings. For CGI variables, >>>> +all names are going to be ISO-8859-1 and so where native strings are >>>> +unicode strings, that encoding is used for the names of CGI variables. >>>> + >>>> ?The ``environ`` dictionary is required to contain these CGI >>>> ?environment variables, as defined by the Common Gateway Interface >>>> ?specification [2]_. ?The following variables **must** be present, >>>> ?unless their value would be an empty string, in which case they >>>> ?**may** be omitted, except as otherwise noted below. >>>> >>>> +The values for CGI variables are native strings. Where native strings >>>> +are unicode strings, ISO-8859-1 encoding would be used such that the >>>> +original character data is preserved and as necessary the unicode >>>> +string can be converted back to bytes and thence decoded to unicode >>>> +again using a different encoding. >>>> + >>>> ?``REQUEST_METHOD`` >>>> ? The HTTP request method, such as ``"GET"`` or ``"POST"``. ?This >>>> ? cannot ever be an empty string, and so is always required. >>>> @@ -575,13 +602,14 @@ >>>> ?===================== ?=============================================== >>>> ?Variable ? ? ? ? ? ? ? Value >>>> ?===================== ?=============================================== >>>> -``wsgi.version`` ? ? ? The tuple ``(1,0)``, representing WSGI >>>> +``wsgi.version`` ? ? ? The tuple ``(1, 0)``, representing WSGI >>>> ? ? ? ? ? ? ? ? ? ? ? ?version 1.0. >>>> >>>> ?``wsgi.url_scheme`` ? ?A string representing the "scheme" portion of >>>> ? ? ? ? ? ? ? ? ? ? ? ?the URL at which the application is being >>>> ? ? ? ? ? ? ? ? ? ? ? ?invoked. 
?Normally, this will have the value >>>> - ? ? ? ? ? ? ? ? ? ? ? ``"http"`` or ``"https"``, as appropriate. >>>> + ? ? ? ? ? ? ? ? ? ? ? ``"http"`` or ``"https"``, as appropriate. The >>>> + ? ? ? ? ? ? ? ? ? ? ? value is a native string. >>>> >>>> ?``wsgi.input`` ? ? ? ? An input stream (file-like object) from which >>>> ? ? ? ? ? ? ? ? ? ? ? ?the HTTP request body can be read. ?(The server >>>> @@ -646,7 +674,7 @@ >>>> ?Method ? ? ? ? ? ? ? Stream ? ? ?Notes >>>> ?=================== ?========== ?======== >>>> ?``read(size)`` ? ? ? ``input`` ? 1 >>>> -``readline()`` ? ? ? ``input`` ? 1,2 >>>> +``readline(hint)`` ? ``input`` ? 1,2 >>>> ?``readlines(hint)`` ?``input`` ? 1,3 >>>> ?``__iter__()`` ? ? ? ``input`` >>>> ?``flush()`` ? ? ? ? ?``errors`` ?4 >>>> @@ -661,11 +689,12 @@ >>>> ? ?``Content-Length``, and is allowed to simulate an end-of-file >>>> ? ?condition if the application attempts to read past that point. >>>> ? ?The application **should not** attempt to read more data than is >>>> - ? specified by the ``CONTENT_LENGTH`` variable. >>>> + ? specified by the ``CONTENT_LENGTH`` variable. All read functions >>>> + ? are required to return an empty string as the end of input stream >>>> + ? marker. They must yield byte strings. >>>> >>>> -2. The optional "size" argument to ``readline()`` is not supported, >>>> - ? as it may be complex for server authors to implement, and is not >>>> - ? often used in practice. >>>> +2. The optional "size" argument to ``readline()`` is required for >>>> + ? the implementer, but optional for callers. >>>> >>>> ?3. Note that the ``hint`` argument to ``readlines()`` is optional for >>>> ? ?both caller and implementer. ?The application is free not to >>>> @@ -692,12 +721,15 @@ >>>> ?--------------------------------- >>>> >>>> ?The second parameter passed to the application object is a callable >>>> -of the form ``start_response(status,response_headers,exc_info=None)``. 
>>>> +of the form ``start_response(status, response_headers, exc_info=None)``. >>>> ?(As with all WSGI callables, the arguments must be supplied >>>> ?positionally, not by keyword.) ?The ``start_response`` callable is >>>> ?used to begin the HTTP response, and it must return a >>>> ?``write(body_data)`` callable (see the `Buffering and Streaming`_ >>>> -section, below). >>>> +section, below). Values passed to the ``write(body_data)`` callable >>>> +should be byte strings. Where native strings are unicode strings, a >>>> +native strings type can also be supplied, in which case it would be >>>> +encoded as ISO-8859-1. >>>> >>>> ?The ``status`` argument is an HTTP "status" string like ``"200 OK"`` >>>> ?or ``"404 Not Found"``. ?That is, it is a string consisting of a >>>> @@ -705,14 +737,20 @@ >>>> ?single space, with no surrounding whitespace or other characters. >>>> ?(See RFC 2616, Section 6.1.1 for more information.) ?The string >>>> ?**must not** contain control characters, and must not be terminated >>>> -with a carriage return, linefeed, or combination thereof. >>>> +with a carriage return, linefeed, or combination thereof. This >>>> +value should be a byte string. Where native strings are unicode >>>> +strings, the native string type can also be returned, in which >>>> +case it would be encoded as ISO-8859-1. >>>> >>>> ?The ``response_headers`` argument is a list of ``(header_name, >>>> ?header_value)`` tuples. ?It must be a Python list; i.e. >>>> -``type(response_headers) is ListType``, and the server **may** change >>>> +``type(response_headers) is list``, and the server **may** change >>>> ?its contents in any way it desires. ?Each ``header_name`` must be a >>>> ?valid HTTP header field-name (as defined by RFC 2616, Section 4.2), >>>> -without a trailing colon or other punctuation. >>>> +without a trailing colon or other punctuation. Both the header_name >>>> +and the header_value should be byte strings. 
Where native strings >>>> +are unicode strings, the native string type can also be returned, >>>> +in which case it would be encoded as ISO-8859-1. >>>> >>>> ?Each ``header_value`` **must not** include *any* control characters, >>>> ?including carriage returns or linefeeds, either embedded or at the end. >>>> @@ -809,6 +847,14 @@ >>>> ?Handling the ``Content-Length`` Header >>>> ?~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>> >>>> +If an application or middleware layer chooses to return a >>>> +Content-Length header, it should not return more data than specified >>>> +by the header value. Any wrapping middleware layer should not >>>> +consume more data than specified in the header value from the >>>> +wrapped component (either middleware or application). Any WSGI >>>> +adapter must similarly not pass on data above what the >>>> +Content-Length response header value defines. >>>> + >>>> ?If the application does not supply a ``Content-Length`` header, a >>>> ?server or gateway may choose one of several approaches to handling >>>> ?it. ?The simplest of these is to close the client connection when >>>> @@ -1569,55 +1615,13 @@ >>>> ? ?developers. >>>> >>>> >>>> -Proposed/Under Discussion >>>> -========================= >>>> - >>>> -These items are currently being discussed on the Web-SIG and elsewhere, >>>> -or are on the PEP author's "to-do" list: >>>> - >>>> -* Should ``wsgi.input`` be an iterator instead of a file? ?This would >>>> - ?help for asynchronous applications and chunked-encoding input >>>> - ?streams. >>>> - >>>> -* Optional extensions are being discussed for pausing iteration of an >>>> - ?application's ouptut until input is available or until a callback >>>> - ?occurs. >>>> - >>>> -* Add a section about synchronous vs. asynchronous apps and servers, >>>> - ?the relevant threading models, and issues/design goals in these >>>> - ?areas. 
>>>> - >>>> - >>>> ?Acknowledgements >>>> ?================ >>>> >>>> -Thanks go to the many folks on the Web-SIG mailing list whose >>>> -thoughtful feedback made this revised draft possible. ?Especially: >>>> +Thanks go to many folks on the Web-SIG mailing list for helping the work >>>> +on clarifying and improving this specification. In particular: >>>> >>>> -* Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up >>>> - ?on the first draft as not offering any advantages over "plain old >>>> - ?CGI", thus encouraging me to look for a better approach. >>>> - >>>> -* Ian Bicking, who helped nag me into properly specifying the >>>> - ?multithreading and multiprocess options, as well as badgering me to >>>> - ?provide a mechanism for servers to supply custom extension data to >>>> - ?an application. >>>> - >>>> -* Tony Lownds, who came up with the concept of a ``start_response`` >>>> - ?function that took the status and headers, returning a ``write`` >>>> - ?function. ?His input also guided the design of the exception handling >>>> - ?facilities, especially in the area of allowing for middleware that >>>> - ?overrides application error messages. >>>> - >>>> -* Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython >>>> - ?(well before the spec was finalized) helped to shape the "supporting >>>> - ?older versions of Python" section, as well as the optional >>>> - ?``wsgi.file_wrapper`` facility. >>>> - >>>> -* Mark Nottingham, who reviewed the spec extensively for issues with >>>> - ?HTTP RFC compliance, especially with regard to HTTP/1.1 features that >>>> - ?I didn't even know existed until he pointed them out. >>>> - >>>> +* Phillip J. Eby, for writing/editing the 1.0 specification. >>>> >>>> ?References >>>> ?========== >>>> @@ -1643,8 +1647,6 @@ >>>> >>>> ?This document has been placed in the public domain. >>>> >>>> - >>>> - >>>> ?.. >>>> ? ?Local Variables: >>>> ? 
mode: indented-text
>>>>
>>>
>>
>

From graham.dumpleton at gmail.com Fri Apr 16 07:25:39 2010
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 16 Apr 2010 15:25:39 +1000
Subject: [Web-SIG] Draft PEP: WSGI 1.1
In-Reply-To: <6705ADE1-F904-4F9E-8219-35A53C64FEAD@gmail.com>
References: <6705ADE1-F904-4F9E-8219-35A53C64FEAD@gmail.com>
Message-ID:

On 16 April 2010 15:19, Paul J Davis wrote:
>
>
> On Apr 15, 2010, at 11:53 PM, Graham Dumpleton wrote:
>
>> On 16 April 2010 13:29, Paul Davis wrote:
>>> On Thu, Apr 15, 2010 at 10:08 PM, Graham Dumpleton
>>> wrote:
>>>> On 16 April 2010 11:41, Graham Dumpleton wrote:
>>>>> I haven't read what you have done yet
>>>>
>>>> And still haven't. Don't know when I will get a chance to do so.
>>>>
>>>> Two points from a quick scan of emails.
>>>>
>>>> 1. The following section of PEP needs to be updated:
>>>>
>>>> """
>>>>  1417 Apart from the handling of ``close()``, the semantics of returning a
>>>>  1418 file wrapper from the application should be the same as if the
>>>>  1419 application had returned ``iter(filelike.read, '')``.  In other words,
>>>>  1420 transmission should begin at the current position within the "file"
>>>>  1421 at the time that transmission begins, and continue until the end is
>>>>  1422 reached.
>>>> """
>>>>
>>>> It can't say read until 'end is reached' of file as Content-Length
>>>> must be honoured and less returned if Content-Length is less than what
>>>> is available in the remainder of the file as per descriptive changes
>>>> (3) and (4).
>>>>
>>>> In respect of question about readline() arguments and whether -1 or
>>>> None is allowed. I would say no they are not. Must be positive integer
>>>> or no argument supplied at all.
>>>>
>>>> Different implementations use -1 or None as value of a default
>>>> argument to know when an argument wasn't supplied. One can't rely
>>>> though on one or the other being used, and so can't rely on supplying
>>>> those arguments explicitly meaning the same thing as no argument
>>>> supplied. In other words, supplying anything but a positive integer or
>>>> no argument at all is undefined.
>>>>
>>>> Same issue arises with read() except that only a positive integer can
>>>> technically be supplied and the argument is not optional. Although any
>>>> implementation which implements wsgi.input as a proper file-like
>>>> object is going to accept no argument to mean read all input, this
>>>> is outside of the WSGI specification and calling with no argument is
>>>> undefined.
>>>>
>>>> Graham
>>>
>>> I happened to have just started hitting the body reading functions on
>>> an HTTP parser I've been working on. I'd be interested to hear a
>>> response on what happens when the various read functions are called
>>> with a size hint of zero.
>>>
>>> I realize that zero is not a positive integer but I'm not quite sure
>>> what the recommended return value would be. I can see None and -1
>>> being obvious flags for "no size hint", but zero is a tad weird. I
>>> want to say that it'd either return "" (which could sorta kinda
>>> violate #2) or raise an exception. I really haven't got any reason to
>>> prefer one over the other though.
>>
>> I almost mentioned 0 as argument in my previous email, but I got a bit
>> scared off by it also.
>>
>> In all these things, one has to be guided by what a standard file-like
>> object does in Python. I.e.,
>>
>> >>> import sys
>> >>> sys.stdin.read(0)
>> ''
>>
>> So, although an empty string would normally indicate no more content
>> can be read, an argument of 0 has to be seen as a special exception to
>> that rule, with no choice but that an empty string is returned.
>>
>> Graham
>>
>
> I'm inclined to agree.
> As a quick follow up that's semi tangentially related, what of the case
> where an app doesn't consume the entire request body in a keep-alive
> context? I've been running on the assumption that the server would
> discard it, but I wonder about the possibility of silence causing more
> confusion than an exception, or worse, some sort of attack based on
> sneaking in a hidden request. I only mention it because of the
> chunked-encoding-might-mean-zero-length-body assumption.

The underlying HTTP server, or WSGI server if they are one and the
same, should really ensure that any request content not consumed is
read in and discarded if it is going to allow a follow-on request.
As far as I recall, wsgiref on top of Python basic HTTP server doesn't
do this.

Not sure whether this obligation should be part of the specification,
given that it is really an HTTP thing, and if an HTTP server implementing
HTTP/1.1 doesn't do that, it is arguably broken.

Graham

> Paul
>
>
>>> As an aside, I think that "honoring Content-Length" should probably be
>>> rephrased to a "middleware should not break HTTP" coupled with a page
>>> that lists common ways that middleware breaks HTTP. I reckon it's the
>>> same reasoning for 333's dictation that hop-by-hop headers are server
>>> only, though there are plenty of other ways I could violate RFC 2616
>>> as a middleware author without violating WSGI. Pie in the sky, the
>>> common ways would be included with wsgiref's validate decorator.
>>>
>>> Paul
>>>
>>>>> but if you have done so
>>>>> already, ensure you read:
>>>>>
>>>>>  http://bitbucket.org/ianb/wsgi-peps/src/
>>>>>
>>>>> This is Ian's and Armin's previous go at new specification. It though
>>>>> tried to go further than what you are doing.
>>>>>
>>>>> Also read:
>>>>>
>>>>>  http://blog.dscpl.com.au/2009/09/roadmap-for-python-wsgi-specification.html
>>>>>
>>>>> I explain what I mean by native strings in that.
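
A rough illustration of the native-string convention mentioned above, as
the draft quoted in this thread describes it: on Python 3 the server would
decode CGI values as ISO-8859-1 so that the original bytes are preserved,
and the application can convert back to bytes and decode again with the
encoding it actually expects. This is only a sketch for Python 3. The
helper name recode_native_string and the choice of UTF-8 are assumptions
made for the example, not anything the draft defines.

```python
def recode_native_string(value, encoding="utf-8"):
    """Hypothetical helper: recover the raw bytes of a native-string CGI
    value (decoded by the server as ISO-8859-1) and decode them again
    with the encoding the application expects."""
    raw = value.encode("iso-8859-1")  # lossless: one code point per byte
    return raw.decode(encoding)


# Simulate what a Python 3 server would do with the raw request path.
original_bytes = "/caf\u00e9".encode("utf-8")     # bytes as sent by the client
path_info = original_bytes.decode("iso-8859-1")   # native string placed in environ

# The application undoes the ISO-8859-1 step and applies its own encoding.
decoded = recode_native_string(path_info)
```

ISO-8859-1 maps every byte value to exactly one code point, which is why
the round trip back to bytes loses nothing; an application expecting a
different encoding would pass it as the second argument.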
>>>>> >>>>> Graham >>>>> >>>>> On 15 April 2010 22:54, Dirkjan Ochtman wrote: >>>>>> Mostly taking Graham's list of issues and incorporating it into PEP 333. >>>>>> >>>>>> Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt >>>>>> >>>>>> Let's have comments here (comments in the form of diffs are >>>>>> particularly welcome, of course). Remember, the idea is not to change >>>>>> or improve WSGI right now, but only to improve the spec, improving >>>>>> interoperability and enabling Python 3 support. >>>>>> >>>>>> Graham, I hope I did a good job with your suggestions. (Since so much >>>>>> of this is yours, I've just listed you as the second author.) I tried >>>>>> to clarify exactly what you meant by "native strings", can you check >>>>>> that out? >>>>>> >>>>>> Cheers, >>>>>> >>>>>> Dirkjan >>>>>> >>>>>> --- pep-0333.txt ? ? ? ?2010-04-15 14:46:02.000000000 +0200 >>>>>> +++ wsgi-1.1.txt ? ? ? ?2010-04-15 14:51:39.000000000 +0200 >>>>>> @@ -1,114 +1,124 @@ >>>>>> -PEP: 333 >>>>>> -Title: Python Web Server Gateway Interface v1.0 >>>>>> +PEP: 0000 >>>>>> +Title: Python Web Server Gateway Interface 1.1 >>>>>> ?Version: $Revision$ >>>>>> ?Last-Modified: $Date$ >>>>>> -Author: Phillip J. Eby >>>>>> +Author: Dirkjan Ochtman , >>>>>> + ? ? ? ?Graham Dumpleton >>>>>> ?Discussions-To: Python Web-SIG >>>>>> ?Status: Draft >>>>>> ?Type: Informational >>>>>> ?Content-Type: text/x-rst >>>>>> -Created: 07-Dec-2003 >>>>>> -Post-History: 07-Dec-2003, 08-Aug-2004, 20-Aug-2004, 27-Aug-2004 >>>>>> +Created: 15-04-2010 >>>>>> +Post-History: Not yet >>>>>> >>>>>> >>>>>> ?Abstract >>>>>> ?======== >>>>>> >>>>>> -This document specifies a proposed standard interface between web >>>>>> -servers and Python web applications or frameworks, to promote web >>>>>> -application portability across a variety of web servers. 
>>>>>> +This document specifies a revision of the proposed standard interface >>>>>> +between web servers and Python web applications or frameworks, to >>>>>> +promote web application portability across a variety of web servers. >>>>>> >>>>>> >>>>>> ?Rationale and Goals >>>>>> ?=================== >>>>>> >>>>>> -Python currently boasts a wide variety of web application frameworks, >>>>>> -such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to >>>>>> -name just a few [1]_. ?This wide variety of choices can be a problem >>>>>> -for new Python users, because generally speaking, their choice of web >>>>>> -framework will limit their choice of usable web servers, and vice >>>>>> -versa. >>>>>> - >>>>>> -By contrast, although Java has just as many web application frameworks >>>>>> -available, Java's "servlet" API makes it possible for applications >>>>>> -written with any Java web application framework to run in any web >>>>>> -server that supports the servlet API. >>>>>> - >>>>>> -The availability and widespread use of such an API in web servers for >>>>>> -Python -- whether those servers are written in Python (e.g. Medusa), >>>>>> -embed Python (e.g. mod_python), or invoke Python via a gateway >>>>>> -protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of >>>>>> -framework from choice of web server, freeing users to choose a pairing >>>>>> -that suits them, while freeing framework and server developers to >>>>>> -focus on their preferred area of specialization. >>>>>> - >>>>>> -This PEP, therefore, proposes a simple and universal interface between >>>>>> -web servers and web applications or frameworks: the Python Web Server >>>>>> -Gateway Interface (WSGI). >>>>>> - >>>>>> -But the mere existence of a WSGI spec does nothing to address the >>>>>> -existing state of servers and frameworks for Python web applications. >>>>>> -Server and framework authors and maintainers must actually implement >>>>>> -WSGI for there to be any effect. 
>>>>>> - >>>>>> -However, since no existing servers or frameworks support WSGI, there >>>>>> -is little immediate reward for an author who implements WSGI support. >>>>>> -Thus, WSGI **must** be easy to implement, so that an author's initial >>>>>> -investment in the interface can be reasonably low. >>>>>> - >>>>>> -Thus, simplicity of implementation on *both* the server and framework >>>>>> -sides of the interface is absolutely critical to the utility of the >>>>>> -WSGI interface, and is therefore the principal criterion for any >>>>>> -design decisions. >>>>>> - >>>>>> -Note, however, that simplicity of implementation for a framework >>>>>> -author is not the same thing as ease of use for a web application >>>>>> -author. ?WSGI presents an absolutely "no frills" interface to the >>>>>> -framework author, because bells and whistles like response objects and >>>>>> -cookie handling would just get in the way of existing frameworks' >>>>>> -handling of these issues. ?Again, the goal of WSGI is to facilitate >>>>>> -easy interconnection of existing servers and applications or >>>>>> -frameworks, not to create a new web framework. >>>>>> - >>>>>> -Note also that this goal precludes WSGI from requiring anything that >>>>>> -is not already available in deployed versions of Python. ?Therefore, >>>>>> -new standard library modules are not proposed or required by this >>>>>> -specification, and nothing in WSGI requires a Python version greater >>>>>> -than 2.2.2. ?(It would be a good idea, however, for future versions >>>>>> -of Python to include support for this interface in web servers >>>>>> -provided by the standard library.) 
>>>>>> - >>>>>> -In addition to ease of implementation for existing and future >>>>>> -frameworks and servers, it should also be easy to create request >>>>>> -preprocessors, response postprocessors, and other WSGI-based >>>>>> -"middleware" components that look like an application to their >>>>>> -containing server, while acting as a server for their contained >>>>>> -applications. >>>>>> - >>>>>> -If middleware can be both simple and robust, and WSGI is widely >>>>>> -available in servers and frameworks, it allows for the possibility >>>>>> -of an entirely new kind of Python web application framework: one >>>>>> -consisting of loosely-coupled WSGI middleware components. ?Indeed, >>>>>> -existing framework authors may even choose to refactor their >>>>>> -frameworks' existing services to be provided in this way, becoming >>>>>> -more like libraries used with WSGI, and less like monolithic >>>>>> -frameworks. ?This would then allow application developers to choose >>>>>> -"best-of-breed" components for specific functionality, rather than >>>>>> -having to commit to all the pros and cons of a single framework. >>>>>> - >>>>>> -Of course, as of this writing, that day is doubtless quite far off. >>>>>> -In the meantime, it is a sufficient short-term goal for WSGI to >>>>>> -enable the use of any framework with any server. >>>>>> - >>>>>> -Finally, it should be mentioned that the current version of WSGI >>>>>> -does not prescribe any particular mechanism for "deploying" an >>>>>> -application for use with a web server or server gateway. ?At the >>>>>> -present time, this is necessarily implementation-defined by the >>>>>> -server or gateway. ?After a sufficient number of servers and >>>>>> -frameworks have implemented WSGI to provide field experience with >>>>>> -varying deployment requirements, it may make sense to create >>>>>> -another PEP, describing a deployment standard for WSGI servers and >>>>>> -application frameworks. 
>>>>>> +WSGI 1.0, specified in PEP 333, did a great job in making it easier >>>>>> +for web applications and web servers to interface with each other. >>>>>> +It has become very much the standard it was meant to be and an >>>>>> +important part of the Python web development infrastructure. >>>>>> + >>>>>> +After several implementations were built by different developers, >>>>>> +it inevitably turned out that the specification wasn't perfect. It >>>>>> +left out some details that were implemented by all the web server >>>>>> +interfaces because they were critical for many applications (or >>>>>> +application frameworks). Additionally, the specification was written >>>>>> +before Python 3.x was specified, resulting in a lack of clear >>>>>> +specification on what to do with unicode strings. >>>>>> + >>>>>> +While there are some ideas around to improve WSGI further in less >>>>>> +compatible ways, we feel that there is value to be had in first >>>>>> +specifying a minor revision of the specification, which is largely >>>>>> +compatible with existing implementations. Further simplification >>>>>> +and experimentation are therefore deferred to a 2.0 version. >>>>>> + >>>>>> + >>>>>> +Differences with WSGI 1.0 >>>>>> +========================= >>>>>> + >>>>>> +Descriptive changes >>>>>> +------------------- >>>>>> + >>>>>> +The following changes were made to realign the spec with >>>>>> +implementations 'in the wild'. >>>>>> + >>>>>> +1. The 'readline()' function of 'wsgi.input' must optionally take >>>>>> + ? a size hint. This is required because many applications use >>>>>> + ? cgi.FieldStorage, which uses this functionality. >>>>>> + >>>>>> +2. The 'wsgi.input' functions for reading input must return an empty >>>>>> + ? string as end of input stream marker. This is required for support >>>>>> + ? of HTTP 1.1 request pipelining. A correctly implemented WSGI >>>>>> + ? middleware already has to cope with an empty string as end >>>>>> + ? 
sentinel anyway to detect premature end of input. >>>>>> + >>>>>> +3. Any WSGI application or middleware should not itself return, or >>>>>> + ? consume from a wrapped WSGI component, more data than specified by >>>>>> + ? the Content-Length response header if defined. Middleware that >>>>>> + ? does this is arguably broken and can generate incorrect data. >>>>>> + ? This is just a clarification of obligations. >>>>>> + >>>>>> +4. The WSGI adapter must not pass on to the server any data above >>>>>> + ? what the Content-Length response header defines, if supplied. >>>>>> + ? Doing this is technically a violation of HTTP. This is another >>>>>> + ? clarification of obligations. >>>>>> + >>>>>> + >>>>>> +String handling changes >>>>>> +----------------------- >>>>>> + >>>>>> +The following changes were made to make WSGI work on Python 3.x. >>>>>> + >>>>>> +1. The application is passed an instance of a Python dictionary >>>>>> + ? containing what is referred to as the WSGI environment. All keys >>>>>> + ? in this dictionary are native strings. For CGI variables, all names >>>>>> + ? are going to be ISO-8859-1 and so where native strings are >>>>>> + ? unicode strings, that encoding is used for the names of CGI >>>>>> + ? variables. >>>>>> + >>>>>> +2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI >>>>>> + ? environment, the value of the variable should be a native string. >>>>>> + >>>>>> +3. For the CGI variables contained in the WSGI environment, the values >>>>>> + ? of the variables are native strings. Where native strings are >>>>>> + ? unicode strings, ISO-8859-1 encoding would be used such that the >>>>>> + ? original character data is preserved and as necessary the unicode >>>>>> + ? string can be converted back to bytes and thence decoded to unicode >>>>>> + ? again using a different encoding. >>>>>> + >>>>>> +4. The WSGI input stream 'wsgi.input' contained in the WSGI environment >>>>>> + ? 
and from which request content is read, should yield byte strings. >>>>>> + >>>>>> +5. The status line specified by the WSGI application should be a byte >>>>>> + ? string. Where native strings are unicode strings, the native string >>>>>> + ? type can also be returned in which case it would be encoded as >>>>>> + ? ISO-8859-1. >>>>>> + >>>>>> +6. The list of response headers specified by the WSGI application should >>>>>> + ? contain tuples consisting of two values, where each value is a byte >>>>>> + ? string. Where native strings are unicode strings, the native string >>>>>> + ? type can also be returned in which case it would be encoded as >>>>>> + ? ISO-8859-1. >>>>>> + >>>>>> +7. The iterable returned by the application and from which response >>>>>> + ? content is derived, should yield byte strings. Where native strings >>>>>> + ? are unicode strings, the native string type can also be returned in >>>>>> + ? which case it would be encoded as ISO-8859-1. >>>>>> + >>>>>> +8. The value passed to the 'write()' callback returned by >>>>>> + ? 'start_response()' should be a byte string. Where native strings >>>>>> + ? are unicode strings, a native string type can also be supplied, in >>>>>> + ? which case it would be encoded as ISO-8859-1. >>>>>> >>>>>> >>>>>> ?Specification Overview >>>>>> @@ -447,6 +457,13 @@ >>>>>> ?Streaming`_ section below for more on how application output must be >>>>>> ?handled.) >>>>>> >>>>>> +Further on, several places specify constraints upon string types used >>>>>> +in the WSGI API. The term native string is used to mean the 'str' class >>>>>> +in both Python 2.x and 3.x. The spec tries to ensure optimal >>>>>> +compatibility and ease of use by allowing implementations running on >>>>>> +Python 3.x to encode strings (which are Unicode strings with no >>>>>> +specified encoding) as ISO-8859-1 where a 3.x string is passed in. 
>>>>>> + >>>>>> ?The server or gateway should treat the yielded strings as binary byte >>>>>> ?sequences: in particular, it should ensure that line endings are >>>>>> ?not altered. ?The application is responsible for ensuring that the >>>>>> @@ -489,12 +506,22 @@ >>>>>> ?``environ`` Variables >>>>>> ?--------------------- >>>>>> >>>>>> +All keys in this dictionary are native strings. For CGI variables, >>>>>> +all names are going to be ISO-8859-1 and so where native strings are >>>>>> +unicode strings, that encoding is used for the names of CGI variables. >>>>>> + >>>>>> ?The ``environ`` dictionary is required to contain these CGI >>>>>> ?environment variables, as defined by the Common Gateway Interface >>>>>> ?specification [2]_. ?The following variables **must** be present, >>>>>> ?unless their value would be an empty string, in which case they >>>>>> ?**may** be omitted, except as otherwise noted below. >>>>>> >>>>>> +The values for CGI variables are native strings. Where native strings >>>>>> +are unicode strings, ISO-8859-1 encoding would be used such that the >>>>>> +original character data is preserved and as necessary the unicode >>>>>> +string can be converted back to bytes and thence decoded to unicode >>>>>> +again using a different encoding. >>>>>> + >>>>>> ?``REQUEST_METHOD`` >>>>>> ? The HTTP request method, such as ``"GET"`` or ``"POST"``. ?This >>>>>> ? cannot ever be an empty string, and so is always required. >>>>>> @@ -575,13 +602,14 @@ >>>>>> ?===================== ?=============================================== >>>>>> ?Variable ? ? ? ? ? ? ? Value >>>>>> ?===================== ?=============================================== >>>>>> -``wsgi.version`` ? ? ? The tuple ``(1,0)``, representing WSGI >>>>>> +``wsgi.version`` ? ? ? The tuple ``(1, 0)``, representing WSGI >>>>>> ? ? ? ? ? ? ? ? ? ? ? ?version 1.0. >>>>>> >>>>>> ?``wsgi.url_scheme`` ? ?A string representing the "scheme" portion of >>>>>> ? ? ? ? ? ? ? ? ? ? ? 
?the URL at which the application is being >>>>>> ? ? ? ? ? ? ? ? ? ? ? ?invoked. ?Normally, this will have the value >>>>>> - ? ? ? ? ? ? ? ? ? ? ? ``"http"`` or ``"https"``, as appropriate. >>>>>> + ? ? ? ? ? ? ? ? ? ? ? ``"http"`` or ``"https"``, as appropriate. The >>>>>> + ? ? ? ? ? ? ? ? ? ? ? value is a native string. >>>>>> >>>>>> ?``wsgi.input`` ? ? ? ? An input stream (file-like object) from which >>>>>> ? ? ? ? ? ? ? ? ? ? ? ?the HTTP request body can be read. ?(The server >>>>>> @@ -646,7 +674,7 @@ >>>>>> ?Method ? ? ? ? ? ? ? Stream ? ? ?Notes >>>>>> ?=================== ?========== ?======== >>>>>> ?``read(size)`` ? ? ? ``input`` ? 1 >>>>>> -``readline()`` ? ? ? ``input`` ? 1,2 >>>>>> +``readline(hint)`` ? ``input`` ? 1,2 >>>>>> ?``readlines(hint)`` ?``input`` ? 1,3 >>>>>> ?``__iter__()`` ? ? ? ``input`` >>>>>> ?``flush()`` ? ? ? ? ?``errors`` ?4 >>>>>> @@ -661,11 +689,12 @@ >>>>>> ? ?``Content-Length``, and is allowed to simulate an end-of-file >>>>>> ? ?condition if the application attempts to read past that point. >>>>>> ? ?The application **should not** attempt to read more data than is >>>>>> - ? specified by the ``CONTENT_LENGTH`` variable. >>>>>> + ? specified by the ``CONTENT_LENGTH`` variable. All read functions >>>>>> + ? are required to return an empty string as the end of input stream >>>>>> + ? marker. They must yield byte strings. >>>>>> >>>>>> -2. The optional "size" argument to ``readline()`` is not supported, >>>>>> - ? as it may be complex for server authors to implement, and is not >>>>>> - ? often used in practice. >>>>>> +2. The optional "size" argument to ``readline()`` is required for >>>>>> + ? the implementer, but optional for callers. >>>>>> >>>>>> ?3. Note that the ``hint`` argument to ``readlines()`` is optional for >>>>>> ? ?both caller and implementer. 
?The application is free not to >>>>>> @@ -692,12 +721,15 @@ >>>>>> ?--------------------------------- >>>>>> >>>>>> ?The second parameter passed to the application object is a callable >>>>>> -of the form ``start_response(status,response_headers,exc_info=None)``. >>>>>> +of the form ``start_response(status, response_headers, exc_info=None)``. >>>>>> ?(As with all WSGI callables, the arguments must be supplied >>>>>> ?positionally, not by keyword.) ?The ``start_response`` callable is >>>>>> ?used to begin the HTTP response, and it must return a >>>>>> ?``write(body_data)`` callable (see the `Buffering and Streaming`_ >>>>>> -section, below). >>>>>> +section, below). Values passed to the ``write(body_data)`` callable >>>>>> +should be byte strings. Where native strings are unicode strings, a >>>>>> +native strings type can also be supplied, in which case it would be >>>>>> +encoded as ISO-8859-1. >>>>>> >>>>>> ?The ``status`` argument is an HTTP "status" string like ``"200 OK"`` >>>>>> ?or ``"404 Not Found"``. ?That is, it is a string consisting of a >>>>>> @@ -705,14 +737,20 @@ >>>>>> ?single space, with no surrounding whitespace or other characters. >>>>>> ?(See RFC 2616, Section 6.1.1 for more information.) ?The string >>>>>> ?**must not** contain control characters, and must not be terminated >>>>>> -with a carriage return, linefeed, or combination thereof. >>>>>> +with a carriage return, linefeed, or combination thereof. This >>>>>> +value should be a byte string. Where native strings are unicode >>>>>> +strings, the native string type can also be returned, in which >>>>>> +case it would be encoded as ISO-8859-1. >>>>>> >>>>>> ?The ``response_headers`` argument is a list of ``(header_name, >>>>>> ?header_value)`` tuples. ?It must be a Python list; i.e. >>>>>> -``type(response_headers) is ListType``, and the server **may** change >>>>>> +``type(response_headers) is list``, and the server **may** change >>>>>> ?its contents in any way it desires. 
?Each ``header_name`` must be a >>>>>> ?valid HTTP header field-name (as defined by RFC 2616, Section 4.2), >>>>>> -without a trailing colon or other punctuation. >>>>>> +without a trailing colon or other punctuation. Both the header_name >>>>>> +and the header_value should be byte strings. Where native strings >>>>>> +are unicode strings, the native string type can also be returned, >>>>>> +in which case it would be encoded as ISO-8859-1. >>>>>> >>>>>> ?Each ``header_value`` **must not** include *any* control characters, >>>>>> ?including carriage returns or linefeeds, either embedded or at the end. >>>>>> @@ -809,6 +847,14 @@ >>>>>> ?Handling the ``Content-Length`` Header >>>>>> ?~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>> >>>>>> +If an application or middleware layer chooses to return a >>>>>> +Content-Length header, it should not return more data than specified >>>>>> +by the header value. Any wrapping middleware layer should not >>>>>> +consume more data than specified in the header value from the >>>>>> +wrapped component (either middleware or application). Any WSGI >>>>>> +adapter must similarly not pass on data above what the >>>>>> +Content-Length response header value defines. >>>>>> + >>>>>> ?If the application does not supply a ``Content-Length`` header, a >>>>>> ?server or gateway may choose one of several approaches to handling >>>>>> ?it. ?The simplest of these is to close the client connection when >>>>>> @@ -1569,55 +1615,13 @@ >>>>>> ? ?developers. >>>>>> >>>>>> >>>>>> -Proposed/Under Discussion >>>>>> -========================= >>>>>> - >>>>>> -These items are currently being discussed on the Web-SIG and elsewhere, >>>>>> -or are on the PEP author's "to-do" list: >>>>>> - >>>>>> -* Should ``wsgi.input`` be an iterator instead of a file? ?This would >>>>>> - ?help for asynchronous applications and chunked-encoding input >>>>>> - ?streams. 
>>>>>> - >>>>>> -* Optional extensions are being discussed for pausing iteration of an >>>>>> - ?application's ouptut until input is available or until a callback >>>>>> - ?occurs. >>>>>> - >>>>>> -* Add a section about synchronous vs. asynchronous apps and servers, >>>>>> - ?the relevant threading models, and issues/design goals in these >>>>>> - ?areas. >>>>>> - >>>>>> - >>>>>> ?Acknowledgements >>>>>> ?================ >>>>>> >>>>>> -Thanks go to the many folks on the Web-SIG mailing list whose >>>>>> -thoughtful feedback made this revised draft possible. ?Especially: >>>>>> +Thanks go to many folks on the Web-SIG mailing list for helping the work >>>>>> +on clarifying and improving this specification. In particular: >>>>>> >>>>>> -* Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up >>>>>> - ?on the first draft as not offering any advantages over "plain old >>>>>> - ?CGI", thus encouraging me to look for a better approach. >>>>>> - >>>>>> -* Ian Bicking, who helped nag me into properly specifying the >>>>>> - ?multithreading and multiprocess options, as well as badgering me to >>>>>> - ?provide a mechanism for servers to supply custom extension data to >>>>>> - ?an application. >>>>>> - >>>>>> -* Tony Lownds, who came up with the concept of a ``start_response`` >>>>>> - ?function that took the status and headers, returning a ``write`` >>>>>> - ?function. ?His input also guided the design of the exception handling >>>>>> - ?facilities, especially in the area of allowing for middleware that >>>>>> - ?overrides application error messages. >>>>>> - >>>>>> -* Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython >>>>>> - ?(well before the spec was finalized) helped to shape the "supporting >>>>>> - ?older versions of Python" section, as well as the optional >>>>>> - ?``wsgi.file_wrapper`` facility. 
>>>>>> - >>>>>> -* Mark Nottingham, who reviewed the spec extensively for issues with >>>>>> - ?HTTP RFC compliance, especially with regard to HTTP/1.1 features that >>>>>> - ?I didn't even know existed until he pointed them out. >>>>>> - >>>>>> +* Phillip J. Eby, for writing/editing the 1.0 specification. >>>>>> >>>>>> ?References >>>>>> ?========== >>>>>> @@ -1643,8 +1647,6 @@ >>>>>> >>>>>> ?This document has been placed in the public domain. >>>>>> >>>>>> - >>>>>> - >>>>>> ?.. >>>>>> ? ?Local Variables: >>>>>> ? ?mode: indented-text >>>>>> >>>>> >>>> _______________________________________________ >>>> Web-SIG mailing list >>>> Web-SIG at python.org >>>> Web SIG: http://www.python.org/sigs/web-sig >>>> Unsubscribe: http://mail.python.org/mailman/options/web-sig/paul.joseph.davis%40gmail.com >>>> >>> > From manlio_perillo at libero.it Sat Apr 17 14:46:40 2010 From: manlio_perillo at libero.it (Manlio Perillo) Date: Sat, 17 Apr 2010 14:46:40 +0200 Subject: [Web-SIG] [RFC] x-wsgiorg.suspend extension In-Reply-To: <3FFB5DE7-8766-4F81-B46D-FC37557DA9EB@lericson.se> References: <4BC22C04.5050308@libero.it> <3FFB5DE7-8766-4F81-B46D-FC37557DA9EB@lericson.se> Message-ID: <4BC9ADB0.4040902@libero.it> Ludvig Ericson ha scritto: I have put web-sig in Cc. > On 11 apr 2010, at 22:07, Manlio Perillo wrote: > >> I here propose the x-wsgiorg.suspend to be accepted as official WSGI >> extension, using the wsgiorg namespace. > > I'm sorry, but I don't see how such a solution wins out over any other stab at event-based concurrency (like gevent, eventlet, etc.) > > I've made a WSGI application using gevent, and then gunicorn's gevent arbiter thing. Works like a charm. > Because eventlet, gevent and friends works *because* they have full control over the event loop, and they can use greenlets as they like. This is not possible with implementations like txwsgi (Twisted) and ngx_http_wsgi_module (Nginx). 
eventlet has support for Twisted, but, as far as I can tell, it works by running the Twisted event loop inside a greenlet. This is of course impossible with ngx_http_wsgi_module, since it is embedded in a web server written in C. > I get the point in trying to standardize something, but this solution seems rather intrusive and not something I'd adopt any time soon. > Can you suggest a less intrusive extension that works with *every* WSGI implementation? > Nice work though! > Regards Manlio From jdmain at comcast.net Sat Apr 17 17:20:34 2010 From: jdmain at comcast.net (J.D. Main) Date: Sat, 17 Apr 2010 09:20:34 -0600 Subject: [Web-SIG] IIS and Python CGI - how do I see more than just the form data? In-Reply-To: <4BBF4740.7070805@doxdesk.com> References: <4BB71921.6374.11014531@jdmain.comcast.net>, <4BBF4740.7070805@doxdesk.com> Message-ID: <4BC97D62.28488.C8C022B@jdmain.comcast.net> Thanks Andrew. It seems like URL rewriting is exactly the way to create a CGI based "RESTful" WEB service using IIS. I think one can map an .exe to a folder in IIS and thus remove the need for the .py extension in the URL. Though it would probably be fairly inefficient to execute a PY2EXE program with every web hit. I'm going to keep tinkering... Best Regards, JDM J.D. Main wrote: > I want to see the entire HTTP request with everything inside it. You won't get that as a CGI (or WSGI) application. It is the web server's job to parse the headers of the request, choose what host and script that maps to, and make them available to you (in the environ dictionary in WSGI, or the real environment variables in CGI). The server may perform additional processing on the input/output (eg. buffering and chunking). If you really need low-level detail you'll need to write your own HTTP server, or adapt one from eg. BaseHTTPServer. You almost never need that for normal web applications. > Does IIS actually pass that information to the CGI application or does it just > pass the variables? 
For a query string as posted, IIS parses the initial HTTP GET command, extracts the path part of that, splits it, and puts the part after the `?` in the variable `QUERY_STRING` for you.

> how would my python parse the following:
> http://someserver/someapp/someuser/someupdate?var1=Charlie

Many people do this with URL rewriting, to turn that into something like:

    http://someserver/someapp.py?user=someuser&action=someupdate&var1=Charlie

You don't get a standard URL rewriter in IIS 5 but there are many third-party options. Personally I hate URL rewriting and try to avoid it wherever possible, because IMO URL format should be in the domain of the application and not a deployment issue.

Unfortunately, if you really want to get rid of the `.py` in the URL, you will need at least some rewriting, because IIS refuses to map files without an extension to script engines. You can make the extension `.p` or `.html` or something else if you like, but you can't get rid of it.

    http://someserver/someapp.py/someuser/someupdate?var1=Charlie

This URL should be parsed into environ members:

    HTTP_HOST: someserver
    SCRIPT_NAME: /someapp.py
    PATH_INFO: /someuser/someupdate
    QUERY_STRING: var1=Charlie

Unfortunately (again), IIS gets this wrong. It sets `PATH_INFO` to:

    /someapp.py/someuser/someupdate

which is contrary to the CGI/WSGI specifications. If you want to sniff path parts as an input mechanism (to do URL routing yourself without rewriting), you will have to detect this situation (probably by sniffing SERVER_SOFTWARE) and hack a fix in. Some libraries and frameworks may do this for you.

(Aside: even this is not certain. This wrong behaviour can be turned off using a little-known IIS config option. However, it's unlikely to be used in the wild, not least because the flag typically breaks ASP.)

Unfortunately (yet again), it's not reliable to send any old characters as part of the path.
Because of the poor design of the original CGI standard (carried over into WSGI), any `%nn` escape sequences get decoded before being dropped into SCRIPT_NAME/PATH_INFO (though not, thankfully, QUERY_STRING). This has the consequence that there are many characters that can't reliably be used in a path part, including slashes, backslashes, control characters, and all non-ASCII characters (since they go through a Unicode decode/encode cycle with what are almost guaranteed to be the wrong charsets). Stick with simple strings like `someuser`. Summary: IIS is a pain. -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/
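
The PATH_INFO workaround described in the message above can be sketched as a small piece of WSGI middleware. This is only a minimal sketch, not something taken from the thread: the names `fix_iis_path_info` and `IISPathInfoFix` are made up for illustration, and it assumes that IIS identifies itself with a SERVER_SOFTWARE value beginning with "Microsoft-IIS" and that the little-known config option mentioned above has been left at its default.

```python
def fix_iis_path_info(environ):
    """Return a copy of environ with SCRIPT_NAME stripped from PATH_INFO.

    Hypothetical helper: only acts when SERVER_SOFTWARE looks like IIS
    (assumed to start with "Microsoft-IIS") and PATH_INFO begins with
    SCRIPT_NAME followed by "/" or nothing at all.
    """
    fixed = dict(environ)  # do not mutate the caller's dictionary
    software = fixed.get("SERVER_SOFTWARE", "")
    script_name = fixed.get("SCRIPT_NAME", "")
    path_info = fixed.get("PATH_INFO", "")

    if not (software.startswith("Microsoft-IIS") and script_name):
        return fixed

    if path_info.startswith(script_name):
        rest = path_info[len(script_name):]
        # Only strip on a path-segment boundary, so "/app.pyx" is left alone.
        if rest == "" or rest.startswith("/"):
            fixed["PATH_INFO"] = rest
    return fixed


class IISPathInfoFix:
    """WSGI middleware that applies fix_iis_path_info to every request."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        return self.app(fix_iis_path_info(environ), start_response)
```

An application wrapped as `IISPathInfoFix(app)` could then split `environ["PATH_INFO"]` on "/" to do its own routing. One caveat of this sketch: a URL whose real path info happens to begin with the script name would be stripped as well, so it is worth checking against the actual server before relying on it.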