From pje at telecommunity.com  Sat Oct  2 01:07:04 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Oct  2 01:08:40 2004
Subject: [Web-SIG] Latest WSGI revision posted; finalization soon?
Message-ID: <5.1.1.6.0.20041001185733.02147810@mail.telecommunity.com>

FYI, I've posted a new revision; mainly it's just the changes Mark proposed.

At this point, the only real open issue is what to do about the async API, 
and drafting a section on sync/async/threading, to replace the currently 
very short section on threading.

It's been a little over a week since I proposed an alternative way to 
structure an optional asynchronous API, but I haven't seen any comments on 
that API.

I'd really like to get some sort of async API finalized, just so that there 
is some "standard" way of offering the feature.  But, since I personally 
don't need it, I'd like some guidance from the community as to what 
approach is more desirable.

The other option is to merely present some ideas and alternatives in the 
PEP, and leave it to the community to try different things.

Whichever way we go, I'd ideally like to see the PEP able to move to a 
"Final" status this month, such that we don't make any further semantic 
changes to 1.0.

From ianb at colorstudy.com  Sun Oct  3 06:18:44 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun Oct  3 06:18:49 2004
Subject: [Web-SIG] WSGI Webware progress
Message-ID: <415F7DA4.3090805@colorstudy.com>

I've made quite a bit of progress with the WSGI port of Webware, running 
two real applications I've written under it, without any significant 
changes to the applications (except for an import statement or two). 
The applications weren't written with WSGI in mind, so they didn't limit 
themselves to things that seemed simple under WSGI.  OTOH, I wrote both 
of them, and I only use a subset of the Webware API.

The Webware portion of this remains fairly minimal, mostly some simple 
classes that translate the WSGI environment and general system to the 
Webware API.

In the process, I've made some reusable middleware that is 
Webware-neutral, but implements some of Webware's functionality (and I 
layer them in roughly this order):

* httpexceptions; catches particular exceptions and turns them into HTTP 
responses (e.g., HTTPMovedPermanently, HTTPNotFound, etc).  In a way I 
wish this was standard.  However, the other middleware doesn't use this 
(though some of my Webware code does).

* recursive; allows applications to forward to other URLs and to make 
recursive calls to include other URLs.  These URLs have to be under the 
location where recursive is used.

* session; implements sessions.  The persistence is simple and doesn't 
take concurrency into account (yet), but the basic structure seems 
correct to me.

* urlparser; this takes a URL and finds an application based on it. 
Currently it looks in a single directory, parses out the next part, and 
finds the application associated.  Subdirectories turn into other 
urlparser instances.  Finds .py modules, and looks for "application" 
(which is a ready-made application), or module.module_name, where the 
object must be called before it is ready to act as an application (in 
Webware's case, this is a class, instances of which are WSGI 
applications).  Also serves up static files, like .css, .html, etc.

* wsgilib; a number of generic functions for use with WSGI.  Right now 
this includes:

   * Cookie parser: get_cookies

   * Something to add a finalizing function to an iterator: add_close

   * A way to run a request in a fake environment, for interactive
     debugging and testing: interactive

   * An error response creator (for 404 messages, etc): error_response

   * An application-builder for on-disk files: send_file
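
As a rough illustration of the httpexceptions idea (this is not the code in
the repository mentioned below; the class and function names are
hypothetical, and it is written against Python 3 conventions), middleware
along these lines could catch an exception and turn it into a response:

```python
class HTTPException(Exception):
    """Hypothetical base class; `status` is the full HTTP status line."""
    status = "500 Internal Server Error"


class HTTPNotFound(HTTPException):
    status = "404 Not Found"


def catch_http_exceptions(app):
    """Wrap `app` so that HTTPException subclasses become HTTP responses."""
    def wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except HTTPException as exc:
            body = exc.status.encode("ascii")
            headers = [("Content-Type", "text/plain"),
                       ("Content-Length", str(len(body)))]
            start_response(exc.status, headers)
            return [body]
    return wrapped


def demo_app(environ, start_response):
    # Raises before start_response has been called.
    raise HTTPNotFound()
```

This sketch only catches exceptions raised while the application callable
itself runs; errors raised later, while the server iterates over the
response, or after start_response has already been called, would need
extra handling (for example via the exc_info argument).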

I still need to do more testing, and write some unit tests for these 
middleware.  But progress has gone well, and implementing a real-world 
framework on WSGI seems very doable.  This is a more aggressive use of 
WSGI than many framework ports may make; a simpler porting technique 
would be to take the whole framework and find a single entry point, 
letting the framework keep all its URL parsing and other code.  I'm 
doing this refactoring in part because I think it's the right direction 
for Webware, more so than because it's the best or easiest way to port a
framework.

Comments and suggestions welcome.  The code is located at 
svn://colorstudy.com/trunk/WSGI

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
From floydophone at gmail.com  Sun Oct  3 16:42:13 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Sun Oct  3 16:42:15 2004
Subject: [Web-SIG] WSGI Webware progress
Message-ID: <6654eac40410030742163cd370@mail.gmail.com>

Looking good! I see we've written a lot of similar code; perhaps we
could merge our two separate efforts into "wsgilib"?
From pje at telecommunity.com  Sun Oct  3 18:02:09 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Oct  3 18:02:02 2004
Subject: [Web-SIG] WSGI Webware progress
In-Reply-To: <6654eac40410030742163cd370@mail.gmail.com>
Message-ID: <5.1.1.6.0.20041003120107.03bf15c0@mail.telecommunity.com>

At 10:42 AM 10/3/04 -0400, Peter Hunt wrote:
>Looking good! I see we've written a lot of similar code; perhaps we
>could merge our two separate efforts into "wsgilib"?

Heh.  I've also started work on a "wsgilib", mainly to provide common base 
classes and utility functions for servers and gateways.  Maybe we need to 
co-ordinate in some fashion.  :)

From py-web-sig at xhaus.com  Sun Oct  3 20:15:55 2004
From: py-web-sig at xhaus.com (Alan Kennedy)
Date: Sun Oct  3 20:17:26 2004
Subject: [Web-SIG] ANN: Release 0.20.0 of modjy: a WSGI gateway for jython
	2.1 and J2EE.
Message-ID: <416041DB.30107@xhaus.com>

Dear all,

I am somewhat pleased to announce the release of version 0.20.0 of 
modjy, a WSGI-compliant gateway for Jython 2.1 and J2EE. Modjy is 
released under the Apache 2.0 License.

You can download this release, including all source and documentation, 
from the following address

http://www.xhaus.com/modjy

There are still a number of areas to be cleaned up, including the import 
mechanism. Also exception handling needs to be improved, especially with 
the introduction of the "exc_info" parameter to the 
start_response_callable.

Also, I have a fair amount of tests to develop. But I don't have a lot 
of time to spare right now, due to extensive work commitments, so it may 
be a while before those tests are developed. I decided to release 
without a full test suite because there seem to be several members of 
the Web-SIG who are developing WSGI test suites right now, so I'm hoping 
that I will be able to at least partially reuse those test suites.

Still, I'm hoping that you will find modjy a useful test bed for WSGI 
development.

Kind regards,

Alan.
From pje at telecommunity.com  Sun Oct  3 21:47:12 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Oct  3 21:47:05 2004
Subject: [Web-SIG] ANN: Release 0.20.0 of modjy: a WSGI gateway for
	jython 2.1 and J2EE.
In-Reply-To: <416041DB.30107@xhaus.com>
Message-ID: <5.1.1.6.0.20041003154110.0369c890@mail.telecommunity.com>

At 07:15 PM 10/3/04 +0100, Alan Kennedy wrote:
>Dear all,
>
>I am somewhat pleased to announce the release of version 0.20.0 of modjy, 
>a WSGI-compliant gateway for Jython 2.1 and J2EE. Modjy is released under 
>the Apache 2.0 License.
>
>You can download this release, including all source and documentation, 
>from the following address
>
>http://www.xhaus.com/modjy
>
>There are still a number of areas to be cleaned up, including the import 
>mechanism. Also exception handling needs to be improved, especially with 
>the introduction of the "exc_info" parameter to the start_response_callable.

Looks pretty good.  FYI, as far as I can tell, your 'j2ee.*' extensions 
aren't compliant, because they can bypass middleware modifications to the 
environment.

(I wonder if perhaps the current mechanism to prevent middleware bypassing 
is too heavyweight?)

I haven't had time to read all of the source code yet, so I'm not sure if 
that's the only compliance issue, but that's the only one I've seen in your 
documentation.

By the way, if you do implement pooling of application objects to bypass 
their single-threadedness, I think the only really safe way to do that is 
by having a separate Jython interpreter for each one.  A single-threaded 
application is going to assume it can use module-level globals without 
conflicts, so just creating duplicate application objects isn't going to 
resolve that issue.

From floydophone at gmail.com  Mon Oct  4 00:23:55 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Mon Oct  4 00:23:58 2004
Subject: [Web-SIG] WSGI Webware progress
In-Reply-To: <5.1.1.6.0.20041003120107.03bf15c0@mail.telecommunity.com>
References: <6654eac40410030742163cd370@mail.gmail.com>
	<5.1.1.6.0.20041003120107.03bf15c0@mail.telecommunity.com>
Message-ID: <6654eac4041003152327b03280@mail.gmail.com>

I think we could use a SVN repository for all of this stuff. Most of
my code is uploaded on http://st0rm.hopto.org/wsgi/, except I've been
working on a Twisted.web resource for running WSGI apps.

On Sun, 03 Oct 2004 12:02:09 -0400, Phillip J. Eby
<pje@telecommunity.com> wrote:
> 
> 
> At 10:42 AM 10/3/04 -0400, Peter Hunt wrote:
> >Looking good! I see we've written a lot of similar code; perhaps we
> >could merge our two separate efforts into "wsgilib"?
> 
> Heh.  I've also started work on a "wsgilib", mainly to provide common base
> classes and utility functions for servers and gateways.  Maybe we need to
> co-ordinate in some fashion.  :)
> 
>
From py-web-sig at xhaus.com  Mon Oct  4 01:00:20 2004
From: py-web-sig at xhaus.com (Alan Kennedy)
Date: Mon Oct  4 01:01:20 2004
Subject: [Web-SIG] ANN: Release 0.20.0 of modjy: a WSGI gateway for jython
	2.1 and J2EE.
In-Reply-To: <5.1.1.6.0.20041003154110.0369c890@mail.telecommunity.com>
References: <5.1.1.6.0.20041003154110.0369c890@mail.telecommunity.com>
Message-ID: <41608484.8010101@xhaus.com>

[Phillip J. Eby]
> Looks pretty good.  FYI, as far as I can tell, your 'j2ee.*' extensions 
> aren't compliant, because they can bypass middleware modifications to 
> the environment.

Indeed, I was aware of that. I meant to add something to the 
documentation which said "these modjy-specific extensions are not 
compliant with the strict wording of the spec, which forbids access to 
HTTP request and response data in a way that bypasses WSGI mechanisms".

However .....

> (I wonder if perhaps the current mechanism to prevent middleware 
> bypassing is too heavyweight?)

I'm sort of thinking that it is a little heavyweight. I think that 
anyone who wants to bypass the middleware will probably have a good 
reason for doing so. Also, they would probably be very aware that their 
application would no longer be portable.

Also, I would have to add a fair amount of extra code, just to ensure 
that the extension APIs present the same information as the standard 
WSGI interface. Which seems unnecessary, given that the WSGI information 
is already there.

More importantly, that extra code would then make it impossible for the 
application/framework author to get at the original request, which they 
might conceivably really, really, need .....

But I am concerned about the statement in the PEP which says "it is very 
important that these "safe extension" rules be followed by both 
server/gateway and middleware developers, in order to avoid a future in 
which middleware developers are forced to delete any and all extension 
APIs from environ to ensure that their mediation isn't being bypassed by 
applications using those extensions!"

I definitely don't want to bring such a future about ....

> I haven't had time to read all of the source code yet, so I'm not sure 
> if that's the only compliance issue, but that's the only one I've seen 
> in your documentation.

I am reasonably sure that there are other minor nits in the code, which 
I will incrementally fix in the coming weeks.

The reason for "releasing early, releasing often" is that I want to 
demonstrate that I am serious about publishing a production-quality 
J2EE->WSGI gateway for jython.

> By the way, if you do implement pooling of application objects to bypass 
> their single-threadedness, I think the only really safe way to do that 
> is by having a separate Jython interpreter for each one.  A 
> single-threaded application is going to assume it can use module-level 
> globals without conflicts, so just creating duplicate application 
> objects isn't going to resolve that issue.

That's true, and would indeed be quite messy to implement. I'll leave 
that one on the back burner for now.

Regards,

Alan.
From pje at telecommunity.com  Mon Oct  4 04:38:47 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Oct  4 04:38:40 2004
Subject: [Web-SIG] ANN: Release 0.20.0 of modjy: a WSGI gateway for
	jython 2.1 and J2EE.
In-Reply-To: <41608484.8010101@xhaus.com>
References: <5.1.1.6.0.20041003154110.0369c890@mail.telecommunity.com>
	<5.1.1.6.0.20041003154110.0369c890@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041003222240.036b41c0@mail.telecommunity.com>

At 12:00 AM 10/4/04 +0100, Alan Kennedy wrote:
>[Phillip J. Eby]
>>(I wonder if perhaps the current mechanism to prevent middleware 
>>bypassing is too heavyweight?)
>
>I'm sort of thinking that it is a little heavyweight. I think that anyone 
>who wants to bypass the middleware will probably have a good reason for 
>doing so. Also, they would probably be very aware that their application 
>would no longer be portable.

Well, the problem isn't portability, nor is it *intending* to bypass 
middleware.  The problem is that you can write a portable program that uses 
bypass APIs for performance when they're available, but then mysteriously 
breaks when you add middleware to the mix, because it's bypassing the 
middleware.


>Also, I would have to add a fair amount of extra code, just to ensure that 
>the extension APIs present the same information as the standard WSGI 
>interface. Which seems unnecessary, given that the WSGI information is 
>already there.

Right.  I think it's a natural first thought to say, "Oh, I'll add an 
extension API so you can get at the original server request", but given the 
purpose of WSGI, at second thought it seems rather pointless.  If the app 
author wanted something non-portable, he'd have written to the server's API 
to begin with.  If it's *extra* information you're providing, just add it 
to environ, as long as it's not information *derived* from other data in 
environ.  If it's derived, offer a function to derive it, rather than data.

If you're providing a special input feature, attach it to the input stream, 
so that if middleware replaces the input stream, it disables the feature 
automatically.  If it's a special output feature, supply an 
iterator-wrapper that can be returned by the application for special 
treatment by the server, or make it an attribute of start_response.
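
The input-stream guideline could be sketched roughly as follows. The
`fast_read_all` method and all class and function names here are made up
for illustration, and the code uses Python 3 conventions; the point is
only that the feature lives on the stream object, so it vanishes as soon
as middleware substitutes its own stream:

```python
import io


class ServerInput(io.BytesIO):
    """Hypothetical server-provided input stream with an extra feature."""

    def fast_read_all(self):
        # Stand-in for some server-specific optimisation.
        return self.getvalue()


def uppercasing_middleware(app):
    """Middleware that rewrites the request body, replacing wsgi.input."""
    def wrapped(environ, start_response):
        data = environ["wsgi.input"].read().upper()
        environ["wsgi.input"] = io.BytesIO(data)   # the feature is gone now
        return app(environ, start_response)
    return wrapped


def app(environ, start_response):
    stream = environ["wsgi.input"]
    # Use the special feature only if the stream still offers it.
    fast = getattr(stream, "fast_read_all", None)
    body = fast() if fast is not None else stream.read()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```

With this arrangement the application cannot accidentally read the
original, unmodified body: once the middleware is in the stack, the only
stream it can reach is the one the middleware supplied.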

Maybe the above guidelines should be added to the spec.


>More importantly, that extra code would then make it impossible for the 
>application/framework author to get at the original request, which they 
>might conceivably really, really, need .....

For...?


>But I am concerned about the statement in the PEP which says "it is very 
>important that these "safe extension" rules be followed by both 
>server/gateway and middleware developers, in order to avoid a future in 
>which middleware developers are forced to delete any and all extension 
>APIs from environ to ensure that their mediation isn't being bypassed by 
>applications using those extensions!"
>
>I definitely don't want to bring such a future about ....

That is the big issue, yes.  When an app behaves mysteriously when 
middleware is added, the middleware author will get the blame, even though 
the application developer did everything right, and the server author is 
the real culprit.  So, the middleware author will gripe and grumble and add 
code to delete the server's extensions...  in which case there was no point 
in the server author putting them there.


From ianb at colorstudy.com  Mon Oct  4 04:54:13 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon Oct  4 04:54:17 2004
Subject: [Web-SIG] ANN: Release 0.20.0 of modjy: a WSGI gateway for	jython
	2.1 and J2EE.
In-Reply-To: <5.1.1.6.0.20041003222240.036b41c0@mail.telecommunity.com>
References: <5.1.1.6.0.20041003154110.0369c890@mail.telecommunity.com>	<5.1.1.6.0.20041003154110.0369c890@mail.telecommunity.com>
	<5.1.1.6.0.20041003222240.036b41c0@mail.telecommunity.com>
Message-ID: <4160BB55.9050902@colorstudy.com>

Phillip J. Eby wrote:
>> Also, I would have to add a fair amount of extra code, just to ensure 
>> that the extension APIs present the same information as the standard 
>> WSGI interface. Which seems unnecessary, given that the WSGI 
>> information is already there.
> 
> 
> Right.  I think it's a natural first thought to say, "Oh, I'll add an 
> extension API so you can get at the original server request", but given 
> the purpose of WSGI, at second thought it seems rather pointless.  If 
> the app author wanted something non-portable, he'd have written to the 
> server's API to begin with.  If it's *extra* information you're 
> providing, just add it to environ, as long as it's not information 
> *derived* from other data in environ.  If it's derived, offer a function 
> to derive it, rather than data.

I think Alan might be considering a situation in which there's some 
information which he isn't aware of that's missing, and rather than have 
the application author curse him for neutering his environment, he gives 
the author a way to get around it all.

Then, ideally, the author makes a note of this and the information shows 
up in the next version of modjy.  Or, the author who uses that 
information just has to be careful about data integrity him or herself.

Maybe it would be sufficient not to provide the request or response 
immediately in the dictionary, but require the author to do something 
like j2ee_req = environ['modjy.request'](environ); then when they get 
this, you could emit a warning, or if they get the request and you 
detect that there's something weird about the environ, you return None, 
raise an exception, log a warning, or something along those lines.
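
One way this might look, as a sketch only: `make_request_accessor` is a
hypothetical helper, "something weird" is interpreted here as "any value
the gateway originally set has since changed", and returning None with a
warning is just one of the options listed above.

```python
import warnings


def make_request_accessor(raw_request, original_environ):
    """Return a callable to store under a gateway-specific environ key."""
    snapshot = dict(original_environ)   # values as the gateway first set them

    def get_request(environ):
        changed = [k for k, v in snapshot.items() if environ.get(k) != v]
        if changed:
            warnings.warn("environ modified by middleware: %s"
                          % ", ".join(sorted(changed)))
            return None
        return raw_request

    return get_request
```

The gateway would build environ first and then add the accessor, e.g.
environ['modjy.request'] = make_request_accessor(req, environ), so the
snapshot does not include the accessor itself.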

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
From py-web-sig at xhaus.com  Mon Oct  4 13:40:17 2004
From: py-web-sig at xhaus.com (Alan Kennedy)
Date: Mon Oct  4 13:41:25 2004
Subject: [Web-SIG] ANN: Release 0.20.0 of modjy: a WSGI gateway for	jython
	2.1 and J2EE.
In-Reply-To: <4160BB55.9050902@colorstudy.com>
References: <5.1.1.6.0.20041003154110.0369c890@mail.telecommunity.com>	<5.1.1.6.0.20041003154110.0369c890@mail.telecommunity.com>
	<5.1.1.6.0.20041003222240.036b41c0@mail.telecommunity.com>
	<4160BB55.9050902@colorstudy.com>
Message-ID: <416136A1.3050600@xhaus.com>

[Alan Kennedy]
>>> Also, I would have to add a fair amount of extra code, just to ensure 
>>> that the extension APIs present the same information as the standard 
>>> WSGI interface. Which seems unnecessary, given that the WSGI 
>>> information is already there.

[Phillip J. Eby]
>> Right.  I think it's a natural first thought to say, "Oh, I'll add an 
>> extension API so you can get at the original server request", but 
>> given the purpose of WSGI, at second thought it seems rather 
>> pointless.  If the app author wanted something non-portable, he'd have 
>> written to the server's API to begin with.  

Or the author may want to reuse some existing WSGI code, and minimally 
tweak it to use a server-specific API. And could explicitly check for 
relevant server-specific extensions in different servers/gateways, e.g.

if 'j2ee.request' in environ:
    pass  # Do J2EE specific processing
elif 'mod_python.request' in environ:
    pass  # Do mod_python specific processing
else:
    raise UnableToProvideError()  # application-defined exception

That said, I cannot currently think of a situation where such might be 
necessary.

[Phillip J. Eby]
>> If it's *extra* 
>> information you're providing, just add it to environ, as long as it's 
>> not information *derived* from other data in environ.  If it's 
>> derived, offer a function to derive it, rather than data.

[Ian Bicking]
> I think Alan might be considering a situation in which there's some 
> information which he isn't aware of that's missing, and rather than have 
> the application author curse him for neutering his environment, he gives 
> the author a way to get around it all.

I couldn't have said it better myself, Ian.

[Ian Bicking]
> Maybe it would be sufficient not to provide the request or response 
> immediately in the dictionary, but require the author to do something 
> like j2ee_req = environ['modjy.request'](environ); then when they get 
> this, you could emit a warning, or if they get the request and you 
> detect that there's something weird about the environ, you return None, 
> raise an exception, log a warning, or something along those lines.

I'll do whatever is necessary to comply with the spec. If bypassing 
middleware is judged to be out-of-the-question, then I will either 
eliminate the extensions or wrap them so that they are compliant.

Regards,

Alan.
From mnot at mnot.net  Mon Oct  4 19:18:47 2004
From: mnot at mnot.net (Mark Nottingham)
Date: Mon Oct  4 19:43:52 2004
Subject: [Web-SIG] Latest WSGI revision posted; finalization soon?
In-Reply-To: <5.1.1.6.0.20041001185733.02147810@mail.telecommunity.com>
References: <5.1.1.6.0.20041001185733.02147810@mail.telecommunity.com>
Message-ID: <736F563A-1629-11D9-88DC-000A95BD86C0@mnot.net>

Big +1!

On Oct 1, 2004, at 4:07 PM, Phillip J. Eby wrote:

> Whichever way we go, I'd ideally like to see the PEP able to move to a 
> "Final" status this month, such that we don't make any further 
> semantic changes to 1.0.


--
Mark Nottingham     http://www.mnot.net/

From james at pythonweb.org  Mon Oct  4 20:46:40 2004
From: james at pythonweb.org (James Gardner)
Date: Mon Oct  4 20:46:49 2004
Subject: [Web-SIG] Python Web Modules - Version 0.4.1
Message-ID: <41619A90.8060504@pythonweb.org>

Hello,

I'd like to announce the release of the Python Web Modules 0.4.1. This is the
first time the modules have been publicly announced.

http://www.pythonweb.org/     Feel free to download and have a play

Back in March before the WSGI discussions there was some talk about releasing
better standard modules in Python for developing web applications. This is my
attempt to achieve that. These modules are designed to be easily accessible to
beginners or developers currently using PHP or Perl whilst also offering lower
level APIs for experts to create powerful dynamic websites.

Key features include:

* web.auth     - Identity and identification handling. Users may have multiple
                  access levels to multiple applications. Sign in and password
                  reminder handling is built in.

* web.session  - Persistence using cookie or URL based session IDs allowing any
                  object which can be pickled to be stored using a dictionary-
                  like interface. Can be used with file or database drivers.

* web.form     - HTML Form generation and user input handling. Field objects
                  available for HTML fields and the main Python types including
                  date and time objects. Values returned as Python objects.

* web.database - Database abstraction layer supporting MySQL, SQLite, ODBC and
                  Gadfly for cross-database programming. Types are converted.
                - Multiple return formats including dict, tuple and object.
                - Object-relational mapper similar to SQLObject allowing
                  transparent database manipulation using dictionary-like objects
                  in Python code. One and many to many mappings and automatic
                  HTML form generation for editing records are supported.

* web.error    - Enhanced error handling based on the principles of the cgitb
                  module. Plain text or HTML output to a file or browser. Custom
                  extension mechanism for email notifications and more.

* web.template - Support for Cheetah, XYAPTU and Dreamweaver MX templates.

* web.mail     - Quickly send plain text or HTML emails.

* web.image    - Generate 2D pie, bar and scatter graphs in a variety of
                  image formats. Requires PIL.

* datetime     - Python 2.3 date handling compatibility module for Python 2.2

There is probably nothing too ground-breaking here (apart from perhaps the HTML
form interface being combined with a database ORM) but I have tried to make it
all as complete and intuitive as possible which is why I feel it stands out
from other modules. A sample webserver is included to test the examples.

The full module reference and examples are available at:

http://www.pythonweb.org/doc/0.4.1/

One feature which should make this package more attractive to certain developers
over Zope or Webware is that no superuser rights are needed to use the modules
since there is no application server to be run. They can be uploaded to a
shared Apache-based web server and run without compilation or installation
(although certain features are only available if you have external software).

The project plan for the next stage includes continued work on useful
applications such as user management and contact forms (which most websites
use), writing code to support the WSGI PEP, and further improving the
documentation.

http://www.pythonweb.org/project/plan.html

Any thoughts or comments would be really appreciated.

Best wishes,

James

-- 
James Gardner
james 'at' pythonweb.org
http://www.pythonweb.org

From ianb at colorstudy.com  Mon Oct  4 22:55:46 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon Oct  4 22:56:48 2004
Subject: [Web-SIG] Python Web Modules - Version 0.4.1
In-Reply-To: <41619A90.8060504@pythonweb.org>
References: <41619A90.8060504@pythonweb.org>
Message-ID: <4161B8D2.1020902@colorstudy.com>

James Gardner wrote:
> Hello,
> 
> I'd like to announce the release of the Python Web Modules 0.4.1. 
> This is the first time the modules have been publicly announced.
> 
> http://www.pythonweb.org/     Feel free to download and have a play

> Back in March before the WSGI discussions there was some talk about 
> releasing better standard modules in Python for developing web 
> applications. This is my attempt to achieve that. These modules are 
> designed to be easily accessible to beginners or developers currently
>  using PHP or Perl whilst also offering lower level APIs for experts 
> to create powerful dynamic websites.

Now with WSGI, have you thought about refactoring some of these with
that in mind?  Some of these are really WSGI-neutral libraries, but 
others aren't.

The obvious place to start would be a WSGI backend.  It doesn't seem 
like the Python Web Modules model will work well in a non-CGI environment. 
Not only does it seem to put everything in the global space (e.g., 
web.cgi), making it difficult to run in threaded environments, but all 
the examples run the request at the top level of the module, so that you 
have to reload the module to serve a second request.  This will paint 
you into a corner, as the API will be resistant to any other environments.
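
For contrast, a minimal sketch of the callable style (Python 3
conventions; the greeting logic is purely illustrative): everything that
depends on the request happens inside the function, so the module is
imported once and the function can be called for any number of requests.

```python
def application(environ, start_response):
    """All per-request work happens inside the call, not at import time."""
    name = environ.get("QUERY_STRING", "") or "world"
    body = ("Hello, %s!" % name).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The same callable can still be served as a plain CGI script; in later
Python versions the standard library offers
wsgiref.handlers.CGIHandler().run(application) for that purpose.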

There are some other parts that might be good as middleware.  A deep 
stack of middleware starts to bring up issues of configuration and 
providing hooks... but that's another issue.  Anyway...

> Key features include:
> 
> * web.auth     - Identity and identification handling. Users may have
>  multiple access levels to multiple applications. Sign in and 
> password reminder handling is built in.

This could be middleware, though obviously it requires a lot of user 
configuration.  If it's middleware you could share a single 
authentication system with different WSGI applications.  Some 
standardization in this case would be good -- starting with things as 
simple as environ['auth.username'] holding the string username.  But for 
now there's no standard, so you should use a custom prefix.
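
A sketch of what such middleware might look like. The `authenticate`
callable and the "example.auth.username" key are placeholders (the key
shows the custom-prefix idea, not a proposed standard), and the code uses
Python 3 conventions:

```python
def auth_middleware(app, authenticate, key="example.auth.username"):
    """Store the authenticated username in environ under a prefixed key.

    `authenticate` is a hypothetical callable: environ -> username or None.
    """
    def wrapped(environ, start_response):
        username = authenticate(environ)
        if username is None:
            environ.pop(key, None)     # never trust a value set upstream
        else:
            environ[key] = username
        return app(environ, start_response)
    return wrapped


def whoami(environ, start_response):
    """Example application that only reads the key the middleware sets."""
    name = environ.get("example.auth.username", "anonymous")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [name.encode("utf-8")]
```

Any number of applications can be wrapped with the same `authenticate`
function, which is where the sharing of a single authentication system
would come from.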

> * web.session  - Persistence using cookie or URL based session IDs 
> allowing any object which can be pickled to be stored using a 
> dictionary-like interface. Can be used with file or database 
> drivers.

This would be good as a WSGI middleware.  I have such a middleware at 
svn://colorstudy.com/trunk/WSGI/session.py , but the actual persistence 
and configuration is minimal.  But it might be helpful for thinking 
about how it might look as middleware.

> * web.error    - Enhanced error handling based on the principles of 
> the cgitb module. Plain text or HTML output to a file or browser. 
> Custom extension mechanism for email notifications and more.

This could also be a piece of middleware.  I feel like it's one of the 
more complicated kinds of middleware, but useful.  It could also be a 
bit of library code that applications can use, but I'd prefer it as 
middleware because you could configure it for multiple applications.

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From james at pythonweb.org  Tue Oct  5 01:07:04 2004
From: james at pythonweb.org (James Gardner)
Date: Tue Oct  5 01:07:12 2004
Subject: [Web-SIG] Python Web Modules - Version 0.4.1
Message-ID: <4161D798.6030609@pythonweb.org>

Thanks for the comments, much appreciated. I'm afraid I've got some more 
questions though :-)

> Now with WSGI, have you thought about refactoring some of these with
> that in mind?  Some of these are really WSGI-neutral libraries, but 
> others aren't.
>
> The obvious place to start would be a WSGI backend.  It doesn't seem 
> like the Python Web Modules model will work well in a non-CGI environment. 
> Not only does it seem to put everything in the global space (e.g., 
> web.cgi), making it difficult to run in threaded environments, but all 
> the examples run the request at the top level of the module, so that 
> you have to reload the module to serve a second request.  This will 
> paint you into a corner, as the API will be resistant to any other 
> environments.


Agreed, the modules are fairly CGI-orientated, and none of the examples 
show anything clever going on, but I am keen to refactor them and think 
they could be easily modified, even though I might need some advice! I 
also think the web modules and WSGI might make a good fit, and there 
would be no harm in writing the necessary glue so that they could be 
used in both environments. I am also going to look into how hard it 
would be to get them working with Jython.

I'm just trying to get my head around the best way of doing things.  My 
understanding is this: the server is constantly running and calls both 
the application and any encompassing middleware every time a request is 
made, so both are executed for each request. Consequently there is no 
speed advantage in moving code from the application to the middleware. 
The only advantage is that it makes certain bits of code more reusable 
for other applications.
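
A small sketch of that call pattern (hypothetical names, Python 3
conventions): wrapping the application happens once, when the stack is
assembled, while the wrapper's inner function runs on every request.

```python
def counting_middleware(app, counter):
    counter["wrapped"] += 1              # runs once, when the stack is built

    def wrapped(environ, start_response):
        counter["requests"] += 1         # runs on every request
        return app(environ, start_response)

    return wrapped


def hello(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


counter = {"wrapped": 0, "requests": 0}
stack = counting_middleware(hello, counter)   # done once at startup
for _ in range(3):                            # the server calls it per request
    stack({}, lambda status, headers, exc_info=None: None)
```

After the loop the counter shows one wrapping and three requests, which
matches the description: the per-request cost is the same wherever the
code lives, and the gain from middleware is reuse.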

I can see how the web.database structure or cursor can be moved to the 
server and passed as environ['web.database.cursor']  and 
environ['web.database.structure']  objects. (btw it is legal to put 
objects in the environ dictionary isn't it or are the values expected to 
be strings?) but surely there would be no advantage to moving things 
like the web.cgi object away from the application global namespace 
because it would have to be reloaded on each request anyway so it might 
as well exist in the application's global space mightn't it? I guess 
what I'm asking is: for items that have to be refreshed every request is 
there a lot to be gained by moving them away from the application's 
global namespace?

Could you possibly be more specific about which areas of the modules you 
think wouldn't work well with threading and why they wouldn't? I don't 
expect you've studied the modules too closely but I'm not sure I 
understand where the difficulties might lie?

>> Key features include:
>>
>> * web.auth     - Identity and identification handling. Users may have
>>  multiple access levels to multiple applications. Sign in and 
>> password reminder handling is built in.
>
>
>
> This could be middleware, though obviously it requires a lot of user 
> configuration.  If it's middleware you could share a single 
> authentication system with different WSGI applications.  Some 
> standardization in this case would be good -- starting with things as 
> simple as environ['auth.username'] holding the string username.  But 
> for now there's no standard, so you should use a custom prefix.


Yes, I guess the auth and session modules could be middleware and I am 
writing an application to handle the sign in and sign out so that 
wouldn't need to be included in the middleware, just the current auth 
status of the user and the access levels making the middleware thinner.

>> * web.session  - Persistence using cookie or URL based session IDs 
>> allowing any object which can be pickled to be stored using a 
>> dictionary- like interface. Can be used with file or database drivers..
>
>
> This would be good as a WSGI middleware.  I have such a middleware at 
> svn://colorstudy.com/trunk/WSGI/session.py , but the actual 
> persistence and configuration is minimal.  But it might be helpful for 
> thinking about how it might look as middleware.


I downloaded and ran your code earlier today and had a look.. certainly 
helpful.. thank you.

>> * web.error    - Enhanced error handling based on the principles of 
>> the cgitb module. Plain text or HTML output to a file or browser. 
>> Custom extension mechanism for email notifications and more.
>
>
> This could also be a piece of middleware.  I feel like it's one of the 
> more complicated kinds of middleware, but useful.  It could also be a 
> bit of library code that applications can use, but I'd prefer it as 
> middleware because you could configure it for multiple applications.


I quite like this as middleware too, but again it could go in the 
server.. how do you decide? I also find that all pages have similar 
regions like title, breadcrumbs, navigation bar, content.. I was 
planning on having some sort of templating middleware so that 
applications didn't have to worry so much about the broad page structure 
allowing easy theming of sites.

At the moment I'm thinking of refactoring as follows:
WSGI - Server:      web.database
                    web.database.object
                    any global config options
     - Middleware:  web.auth
                    web.session
                    web.error
                    theming engine
     - Application: Sign in, sign out, change password,
                    password reminder, change access levels etc
     - Library:     web.mail
                    web.image.graph
                    web.template

Does this sound like a sensible architecture to go with?

Again any thoughts would be appreciated.

Cheers then,

James
-- 
James Gardner
james 'at' pythonweb.org
http://www.pythonweb.org


From floydophone at gmail.com  Tue Oct  5 04:10:33 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Tue Oct  5 04:10:36 2004
Subject: [Web-SIG] Updated my WSGI examples
Message-ID: <6654eac40410041910182deb55@mail.gmail.com>

http://st0rm.hopto.org/wsgi/

- test_applications.py - contains a bunch of fun little test WSGI
applications which demonstrate various capabilities. It also contains
a unit test which will test all of these applications when given a
URL. WE SHOULD EXPAND ON THIS; to ensure WSGI compatibility, we should
expand this test case to be as conclusive as possible and require
framework authors to pass it.

- middleware.py - added generic encoding middleware which defaults to rot-13.

- twisted_wsgi.py - Twisted.web Resource which will export a WSGI
application. Example server is included in this file which sets up a
server which can then be tested by test_applications.py. You can run
them in async mode, which executes the WSGI app assuming it does not
block, or in sync mode, which simply executes it in a thread.

Looking forward to the bugs you will find :) I'm still not quite sure
if I'm handling errors the correct way (twisted_wsgi)...
From ianb at colorstudy.com  Tue Oct  5 06:09:17 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue Oct  5 06:09:22 2004
Subject: [Web-SIG] Python Web Modules - Version 0.4.1
In-Reply-To: <4161D798.6030609@pythonweb.org>
References: <4161D798.6030609@pythonweb.org>
Message-ID: <41621E6D.1040208@colorstudy.com>

James Gardner wrote:
> Agreed, the modules are fairly CGI-orientated.. and none of the examples 
> show anything clever going on.. but I am keen to refactor them and think 
> they could be easily modified.. even though I might need some advice! I 
> also think the web modules and the WSGI might make a good fit and there 
> would be no harm in writing the necessary glue so that they could be 
> used in both environments. I am also going to look into how hard it 
> would be getting them working with jython.
> 
> I'm just trying to get my head around the best way of doing things..  My 
> understanding is this: the server is constantly running and calls both 
> the application and any encompassing middleware every time a request is 
> made. This means that for each request the middleware and the 
> application are executed for the request. Consequently there is no speed 
> advantage in moving code from the application to the middleware. The 
> only advantage is that it makes certain bits of code more reusable for 
> other applications.

Correct.

> I can see how the web.database structure or cursor can be moved to the 
> server and passed as environ['web.database.cursor']  and 
> environ['web.database.structure']  objects. 

I'm not sure what the benefit would be?  I'd expect those modules to 
stay as libraries for the application to use, just like they are now.

> (btw it is legal to put 
> objects in the environ dictionary isn't it or are the values expected to 
> be strings?) 

Yes, it is legal.

> but surely there would be no advantage to moving things 
> like the web.cgi object away from the application global namespace 
> because it would have to be reloaded on each request anyway so it might 
> as well exist in the application's global space mightn't it? I guess 
> what I'm asking is: for items that have to be refreshed every request is 
> there a lot to be gained by moving them away from the application's 
> global namespace?

Well, they *have* to be moved away from the global namespace.  There is 
no global request object in WSGI -- the request is represented with the 
environ dictionary, and it has to be passed around.  If it's global, 
then only one request can be processed at a time.  This would make it 
incompatible with threaded environments.

> Could you possibly be more specific about which areas of the modules you 
> think wouldn't work well with threading and why they wouldn't? I don't 
> expect you've studied the modules too closely but I'm not sure I 
> understand where the difficulties might lie?

To be threadsafe, you have to move anything request-related out of 
global variables.  You don't *have* to be threadsafe; you could simply 
not support threaded environments.  That still leaves a number of other 
environments -- CGI, mod_python, and some others -- but I don't think 
it's a good idea to build in that limitation.

The other issue with your modules is that applications shouldn't be 
scripts.  They should be objects of some sort (possibly including 
functions).  The problem with scripts is that they are awkward to work 
with in Python, as you can't import them.  Because if you import them, 
then the script runs, and if you import it a second time, the script 
*won't* run.  And you *must* support an application being run more than 
one time in the same process.

You could get around this, by creating an application object that reruns 
the script every time it is called, but I think this is unnecessarily 
difficult, and there are other downsides to using scripts in this style.
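
To make the contrast with a script concrete, here is a minimal sketch of an application written as a callable object that takes everything about the request from environ rather than from module globals. The names hello_app and call_app are made up for illustration and are not part of the web.* modules or of the spec; it is written for current Python 3, where response bodies are bytes.

```python
# Sketch only: hello_app and call_app are illustrative names.

def hello_app(environ, start_response):
    # All request state comes from environ; nothing is stored in globals,
    # so two threads can run this function at the same time.
    name = environ.get('QUERY_STRING') or 'world'
    body = ('Hello, %s!\n' % name).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


def call_app(app, environ):
    """Tiny stand-in for a server: call the app once and collect its output."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], body
```

Because the application is an ordinary importable function, the same process can call it as many times as it likes, which is exactly what a top-level script cannot offer.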

>>> Key features include:
>>>
>>> * web.auth     - Identity and identification handling. Users may have
>>>  multiple access levels to multiple applications. Sign in and 
>>> password reminder handling is built in.
>>
>>
>>
>>
>> This could be middleware, though obviously it requires a lot of user 
>> configuration.  If it's middleware you could share a single 
>> authentication system with different WSGI applications.  Some 
>> standardization in this case would be good -- starting with things as 
>> simple as environ['auth.username'] holding the string username.  But 
>> for now there's no standard, so you should use a custom prefix.
> 
> 
> 
> Yes, I guess the auth and session modules could be middleware and I am 
> writing an application to handle the sign in and sign out so that 
> wouldn't need to be included in the middleware, just the current auth 
> status of the user and the access levels making the middleware thinner.

Yes, I think that's about right.  More generally, you might just include 
whatever object represents the user, and depend on the application to 
handle its own permission levels.

>>> * web.error    - Enhanced error handling based on the principles of 
>>> the cgitb module. Plain text or HTML output to a file or browser. 
>>> Custom extension mechanism for email notifications and more.
>>
>>
>>
>> This could also be a piece of middleware.  I feel like it's one of the 
>> more complicated kinds of middleware, but useful.  It could also be a 
>> bit of library code that applications can use, but I'd prefer it as 
>> middleware because you could configure it for multiple applications.
> 
> 
> 
> I quite like this as middleware too, but again it could go in the 
> server.. how do you decide? 

I'd be inclined to limit the server to the most basic issues, like 
supporting HTTP or interfacing with a web server, and with the basic 
concurrency issues of responding to multiple requests.  I'd rather leave 
other parts out, unless it's really natural to include them.  Like, you 
might include URL resolution in a server based on mod_python, because 
Apache already has URL resolution.

> I also find that all pages have similar 
> regions like title, breadcrumbs, navigation bar, content.. I was 
> planning on having some sort of templating middleware so that 
> applications didn't have to worry so much about the broad page structure 
> allowing easy theming of sites.

A filtering middleware could make sense here.  Otherwise, it might just 
make sense to think of this as configuration -- you indicate what the 
standard template is, and expect the application to select and fill the 
template appropriately.

> At the moment I'm thinking of refactoring as follows:
> WSGI - Server:      web.database
>                   web.database.object

What would you gain from putting this in the server, instead of a library?

>                   any global config options
>                      - Middleware:  web.auth
>                   web.session
>                   web.error
>                   theming engine

What's your thinking here?  Would the theming engine work for other 
kinds of WSGI applications, e.g., a Webware application?  If not, then I 
don't think there's any need to put this in the server/middleware.

>                      - Application: Sign in, sign out, change password, 
> password reminder, change access levels etc

Yes, definitely application, though there's a configuration aspect -- 
you'd probably configure the authentication middleware to know where 
some of these things were located.

>       - Library:     web.mail
>                   web.image.graph
>                   web.template
From pje at telecommunity.com  Tue Oct  5 06:15:07 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Oct  5 06:14:58 2004
Subject: [Web-SIG] Updated my WSGI examples
In-Reply-To: <6654eac40410041910182deb55@mail.gmail.com>
Message-ID: <5.1.1.6.0.20041005001007.02bf5800@mail.telecommunity.com>

At 10:10 PM 10/4/04 -0400, Peter Hunt wrote:

>Looking forward to the bugs you will find :)

Good, then I won't feel so bad about telling you that the 'wsgi.' prefix is 
reserved for WSGI-defined features, so "wsgi.field_storage" and friends are 
right out.  ;)

Technically, I think this was only implied in the spec, not explicitly 
stated, so I'll have to fix that.  Anyway, the idea of the prefix is to 
avoid name collisions between different developers, so you need to pick 
your *own* prefix that isn't the same as anybody else's.

From ianb at colorstudy.com  Tue Oct  5 06:38:53 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue Oct  5 06:38:58 2004
Subject: [Web-SIG] Updated my WSGI examples
In-Reply-To: <5.1.1.6.0.20041005001007.02bf5800@mail.telecommunity.com>
References: <5.1.1.6.0.20041005001007.02bf5800@mail.telecommunity.com>
Message-ID: <4162255D.9010607@colorstudy.com>

Phillip J. Eby wrote:
> Good, then I won't feel so bad about telling you that the 'wsgi.' prefix 
> is reserved for WSGI-defined features, so "wsgi.field_storage" and 
> friends are right out.  ;)
> 
> Technically, I think this was only implied in the spec, not explicitly 
> stated, so I'll have to fix that.  Anyway, the idea of the prefix is to 
> avoid name collisions between different developers, so you need to pick 
> your *own* prefix that isn't the same as anybody else's.

That raises a question of convention that I was thinking about.  I ended 
up giving each of my modules its own namespace.  Which probably isn't 
the right way to go.  But then, I also wasn't trying to think of them as 
a unified package.  Also, some of the extensions are meant to be opaque 
to the rest of the application; for instance, a cookie parser stores 
data in the environment to cache the parse, but that data shouldn't be 
manipulated by other applications.  Maybe I should have used a leading 
underscore.

Also, there are already things I'm starting to think of in terms of 
extensions, where we'd agree on the meaning of a second namespace.  For 
instance, I'd like a flag to indicate to applications that they should 
let their unexpected exceptions be raised.  This would be nice for 
something like a debugging server that can be run in a console and falls 
into pdb when there's an error.  Once this flag was set, middleware 
further up shouldn't catch unexpected errors; and if this flag isn't 
set, then applications should avoid letting errors escape.  Sessions and 
configuration might be other places where standardization is called for, 
just to think of some things I've encountered so far.
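
As a rough sketch of how such a flag might behave, assuming a made-up key name (nothing has been agreed, so 'x_debug.raise_errors' below is purely hypothetical): error-catching middleware steps aside when the flag is set and otherwise turns unexpected exceptions into a 500 response.

```python
import sys

# Hypothetical key name; no such convention has been standardized.
RAISE_KEY = 'x_debug.raise_errors'


class CatchErrorsMiddleware:
    """Catch unexpected errors unless the debugging flag asks us not to."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ.get(RAISE_KEY):
            # A debugging server wants the raw exception (e.g. to enter pdb).
            return self.app(environ, start_response)
        try:
            result = self.app(environ, start_response)
            try:
                # Consume the body here so errors raised while iterating
                # are caught too (this buffers the whole response).
                return list(result)
            finally:
                if hasattr(result, 'close'):
                    result.close()
        except Exception:
            start_response('500 Internal Server Error',
                           [('Content-Type', 'text/plain')],
                           sys.exc_info())
            return [b'Internal Server Error\n']
```

Buffering the whole body is a simplification to keep the sketch short; a real piece of middleware would probably want to stream.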

But then, this should probably be part of a second standard, which 
follows from WSGI.  Maybe WAI, Web Application Interface, to make up an 
acronym.  Or maybe "webapp" would be better.

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
From foom at fuhm.net  Tue Oct  5 06:52:54 2004
From: foom at fuhm.net (James Y Knight)
Date: Tue Oct  5 06:53:02 2004
Subject: [Web-SIG] A more Twisted approach to async apps in WSGI
In-Reply-To: <5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
References: <5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
Message-ID: <6B8CDF7C-168A-11D9-B112-000A95A50FB2@fuhm.net>

A bit late with the response...but better late than never I hope. ;)

On Sep 22, 2004, at 9:56 PM, Phillip J. Eby wrote:
> On the positive side of the iterator approach, it could make it easier 
> for asynchronous applications to pause waiting for input, and it could 
> in principle support "chunked" transfer encoding of the input stream.
>
> Anyway, the long and short of it is that CGI and chunked encoding are 
> quite simply incompatible, which means that relying on its 
> availability would be nonportable in a WSGI application anyway.

I do not find that a good reason to copy the mistake (not supporting 
chunking) to a new API.

However! I don't think that the file-like-object API even has a problem 
with chunked incoming data. As long as WSGI does not make 
CONTENT_LENGTH a required header, and as long as the result of read 
looks different for "more data still to come" and "data finished" (it 
does, blocking for more data to occur vs. returning ''), I think it 
should be fine (for non-async apps). Am I missing something here?
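
For a synchronous app, the read loop I have in mind is roughly the following sketch. It assumes only that read() blocks while more data may come and returns an empty string at end of input; read_all and block_size are illustrative names, and the code is written for current Python 3, where the stream yields bytes.

```python
import io


def read_all(stream, block_size=8192):
    """Read a file-like input stream until read() signals end of data."""
    chunks = []
    while True:
        data = stream.read(block_size)
        if not data:
            # An empty result means "data finished", with or without
            # a CONTENT_LENGTH header having been supplied.
            break
        chunks.append(data)
    return b''.join(chunks)
```

Nothing in the loop depends on knowing the total length up front, which is the point.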

> [...] That means that if we switch from an input stream to an 
> iterator, a lot of people are going to be trying to make sensible 
> wrappers to convert the iterator back to an input stream, and that's 
> just getting ridiculous, [...]

An iterable input stream does seem like it may be a loser for the common 
case.

> So, I'm thinking we should shift the burden to an async-specific API.  
> But, in this case, "burden" means that we get to give asynchronous 
> apps an API much more suited to their use cases.
> [...]
> The idea is that this would create an iterator that the server/gateway 
> could recognize as "special", similar to the file-wrapper trick.  But, 
> the object returned would provide an extra API for use by the 
> asynchronous application, maybe something like:
>
>     put(data) -- queue data for retrieval when the controller is 
> iterated over
>
>     finish() -- mark the iterator finished, so it raises StopIteration
>
>     on_get(length,callback) -- call 'callback(data)' when 'length' 
> bytes are available on 'wsgi.input' (but return immediately from the 
> 'on_get()' call)
>
> While this API is an optional extension, it seems it would be closer 
> to what some async fans wanted, and less of a kludge.  It won't do 
> away with the possibility that middleware might block waiting for 
> input, of course, but when no middleware is present or the middleware 
> isn't transforming the input stream, it should work out quite well.

That sounds okay. I'd specify that the on_get "length" bit is a hint, 
and may or may not be honored. put/finish is the right API for output 
(although I'd call it write/finish myself), and on_get seems like a 
fairly usable API for input. It doesn't let you pause the incoming 
data, so if you're passing it on to a slow downstream you'll 
potentially need to buffer a lot, but maybe that's too much to ask for. 
I assume callback('') is used to indicate end of incoming data: that 
should be specified.

However, interaction with middleware seems quite tricky here:
- For input modifying middleware: I guess on_get would have to just 
raise an exception if wsgi.input has been replaced. If the input stream 
was iterable, an on_get callback could just be considered notice that 
you can iterate the input stream once without blocking, assuming the 
block boundary requirements were also in effect here. Then it would 
work right even if the input stream was replaced. However, I think it 
might be the case that middleware that wants to modify the input stream 
is so rare, it doesn't really matter.
- Output. The block boundary section implies that middleware that 
follows the guidelines, and doesn't do any blocking operations of its 
own should work without worrying about the server and application being 
async or sync. If this is to work, the server cannot expect to actually 
receive an asyncwrapper iterable as the return value, even if the app 
is using it, because the middleware might be consuming that iterable 
and returning one of its own. This means the .put/.next methods should 
communicate out-of-band, effectively calling pause/resume functions in 
the server so it knows when it's safe to iterate the vanilla iterator 
the middleware returned without the middleware blocking when calling 
the asyncwrapper-iterator.

> But if this is the overall right approach, I'd like to drop the 
> current proposals to make 'wsgi.input' an iterator and add optional 
> 'pause'/'resume' APIs, since they were rather kludgy compared to 
> giving async apps their own mini-API for nonblocking I/O.

Perhaps Peter Hunt could try to implement it in his twisted wsgi 
gateway and see if it works out. :)

James

From pje at telecommunity.com  Tue Oct  5 08:37:18 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Oct  5 08:37:10 2004
Subject: [Twisted-web] Re: [Web-SIG] A more Twisted approach to
	async apps in WSGI
In-Reply-To: <6B8CDF7C-168A-11D9-B112-000A95A50FB2@fuhm.net>
References: <5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041005022421.02fae470@mail.telecommunity.com>

At 12:52 AM 10/5/04 -0400, James Y Knight wrote:
>A bit late with the response...but better late than never I hope. ;)
>
>On Sep 22, 2004, at 9:56 PM, Phillip J. Eby wrote:
>>On the positive side of the iterator approach, it could make it easier 
>>for asynchronous applications to pause waiting for input, and it could in 
>>principle support "chunked" transfer encoding of the input stream.
>>
>>Anyway, the long and short of it is that CGI and chunked encoding are 
>>quite simply incompatible, which means that relying on its availability 
>>would be nonportable in a WSGI application anyway.
>
>I do not find that a good reason to copy the mistake (not supporting 
>chunking) to a new API.

Perhaps not, but there are also lots of other reasons not to support 
chunked input, mainly that a Google search for "chunked encoding CGI" turns 
up reams of vulnerabilities that suggest existing HTTP implementations may 
leave a bit to be desired with respect to accepting a POST of chunked 
input.  :)


>However! I don't think that the file-like-object API even has a problem 
>with chunked incoming data. As long as WSGI does not make CONTENT_LENGTH a 
>required header, and as long as the result of read looks different for 
>"more data still to come" and "data finished" (it does, blocking for more 
>data to occur vs. returning ''), I think it should be fine (for non-async 
>apps). Am I missing something here?

I don't think so.  Although you probably want something more like a pipe 
error if the input times out or the connection is broken.


>>So, I'm thinking we should shift the burden to an async-specific API.
>>But, in this case, "burden" means that we get to give asynchronous apps 
>>an API much more suited to their use cases.
>>[...]
>>The idea is that this would create an iterator that the server/gateway 
>>could recognize as "special", similar to the file-wrapper trick.  But, 
>>the object returned would provide an extra API for use by the 
>>asynchronous application, maybe something like:
>>
>>     put(data) -- queue data for retrieval when the controller is 
>> iterated over
>>
>>     finish() -- mark the iterator finished, so it raises StopIteration
>>
>>     on_get(length,callback) -- call 'callback(data)' when 'length' bytes 
>> are available on 'wsgi.input' (but return immediately from the 'on_get()' call)
>>
>>While this API is an optional extension, it seems it would be closer to 
>>what some async fans wanted, and less of a kludge.  It won't do away with 
>>the possibility that middleware might block waiting for input, of course, 
>>but when no middleware is present or the middleware isn't transforming 
>>the input stream, it should work out quite well.
>
>That sounds okay. I'd specify that the on_get "length" bit is a hint, and 
>may or may not be honored. put/finish is the right API for output 
>(although I'd call it write/finish myself),

The reason for not using 'write' is to avoid confusion with the existing 
"write" callable, both in terms of knowing which one we're talking about, 
and in terms of not confusing the semantics, which may differ subtly 
between the two.


>  and on_get seems like a fairly usable API for input. It doesn't let 
> you pause the incoming data,

Actually it does; it's supposed to be a one-shot.  You have to call it 
again if you want to get called back again.


>  so if you're passing it on to a slow downstream you'll potentially need 
> to buffer a lot, but maybe that's too much to ask for. I assume 
> callback('') is used to indicate end of incoming data: that should be 
> specified.

I missed that entirely, but it sounds like a good idea.


>However, interaction with middleware seems quite tricky here:
>- For input modifying middleware: I guess on_get would have to just raise 
>an exception if wsgi.input has been replaced.

Yep.  Although it might be that the wrapper would just refuse to 
instantiate in the first place in that circumstance.


>  If the input stream was iterable, an on_get callback could just be 
> considered notice that you can iterate the input stream once without 
> blocking, assuming the block boundary requirements were also in effect here.

Yes, but this'd only work if the input were an iterator.  input.read() 
returning an empty string would mean EOF, so the boundary stuff doesn't 
work in that case.


>- Output. The block boundary section implies that middleware that follows 
>the guidelines, and doesn't do any blocking operations of its own should 
>work without worrying about the server and application being async or 
>sync. If this is to work, the server cannot expect to actually receive an 
>asyncwrapper iterable as the return value, even if the app is using it, 
>because the middleware might be consuming that iterable and returning one 
>of its own.

Correct.


>  This means the .put/.next methods should communicate out-of-band, 
> effectively calling pause/resume functions in the server so it knows when 
> it's safe to iterate the vanilla iterator the middleware returned without 
> the middleware blocking when calling the asyncwrapper-iterator.

It could do that, certainly.  But, the truth is it's *always* safe to 
iterate.  Note that the application can just use the on_get callback to set 
a flag that it's ready to continue, and just keep yielding empty strings 
till then.

More to the point, the iterator-wrapper can simply yield empty strings when 
its internal queue is empty, and a sensible async server should back off 
its iterator.next() retry attempts when an application yields empty 
strings.  This is pretty much always safe and sensible.
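
A minimal sketch of the output half of such a wrapper, just to illustrate the "yield empty strings when the queue is empty" behavior. The class name AsyncBody is made up, on_get() is left out entirely, and none of this is in the PEP; it is written for current Python 3 (bytes bodies, __next__ rather than next).

```python
from collections import deque


class AsyncBody:
    """Iterator whose output is fed by put() and ended by finish()."""

    def __init__(self):
        self._queue = deque()
        self._finished = False

    def put(self, data):
        # Queue data for retrieval when the wrapper is iterated over.
        if self._finished:
            raise ValueError('put() called after finish()')
        self._queue.append(data)

    def finish(self):
        # Mark the iterator finished; queued data is still delivered first.
        self._finished = True

    def __iter__(self):
        return self

    def __next__(self):
        if self._queue:
            return self._queue.popleft()
        if self._finished:
            raise StopIteration
        # Nothing ready yet: hand back an empty string so the server
        # can back off and retry later instead of blocking.
        return b''
```

A server that never heard of the extension can still iterate this safely; it just sees some empty strings along the way.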

However, the out-of-band communication you describe can also take place, 
since it provides better communication in the case where the extension is 
available.

From tsarna at sarna.org  Tue Oct  5 17:24:15 2004
From: tsarna at sarna.org (Ty Sarna)
Date: Tue Oct  5 17:20:15 2004
Subject: [Web-SIG] Python Web Modules - Version 0.4.1 
In-Reply-To: Message from ianb at colorstudy.com (Ian Bicking) 
	of "Mon, 04 Oct 2004 22:56:48." <4161B8D2.1020902@colorstudy.com> 
Message-ID: <20041005152415.98A3EBB980@kopernik.sarna.org>

> > * web.auth     - Identity and identification handling. Users may have
> >  multiple access levels to multiple applications. Sign in and 
> > password reminder handling is built in.
> 
> This could be middleware, though obviously it requires a lot of user 
> configuration.  If it's middleware you could share a single 
> authentication system with different WSGI applications.  Some 
> standardization in this case would be good -- starting with things as 
> simple as environ['auth.username'] holding the string username.  But for 
> now there's no standard, so you should use a custom prefix.

I think this should be environ['REMOTE_USER'], per the CGI spec, so that
same app could take auth either from the server (apache
mod_auth_whatever or equivalent in other servers) or from middleware. 
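
A minimal sketch of what that could look like on the middleware side, so the application only ever looks at REMOTE_USER. RemoteUserMiddleware, lookup_user and whoami_app are hypothetical names for illustration; how the username is actually determined (session, cookie, database) is left to the lookup_user callable.

```python
class RemoteUserMiddleware:
    """Fill in environ['REMOTE_USER'] unless the server already did."""

    def __init__(self, app, lookup_user):
        self.app = app
        # lookup_user is a hypothetical callable: environ -> username or None
        self.lookup_user = lookup_user

    def __call__(self, environ, start_response):
        # Respect a value set by the server (e.g. an Apache auth module).
        if not environ.get('REMOTE_USER'):
            user = self.lookup_user(environ)
            if user:
                environ['REMOTE_USER'] = user
        return self.app(environ, start_response)


def whoami_app(environ, start_response):
    # The app does not care whether the server or middleware set the user.
    user = environ.get('REMOTE_USER', 'anonymous')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [user.encode('utf-8')]
```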
From pje at telecommunity.com  Tue Oct  5 17:29:57 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Oct  5 17:29:46 2004
Subject: [Web-SIG] Python Web Modules - Version 0.4.1 
In-Reply-To: <20041005152415.98A3EBB980@kopernik.sarna.org>
References: <Message from ianb at colorstudy.com (Ian Bicking)  of "Mon,
	04 Oct 2004 22:56:48." <4161B8D2.1020902@colorstudy.com>
Message-ID: <5.1.1.6.0.20041005112930.02c06ec0@mail.telecommunity.com>

At 11:24 AM 10/5/04 -0400, Ty Sarna wrote:
> > > * web.auth     - Identity and identification handling. Users may have
> > >  multiple access levels to multiple applications. Sign in and
> > > password reminder handling is built in.
> >
> > This could be middleware, though obviously it requires a lot of user
> > configuration.  If it's middleware you could share a single
> > authentication system with different WSGI applications.  Some
> > standardization in this case would be good -- starting with things as
> > simple as environ['auth.username'] holding the string username.  But for
> > now there's no standard, so you should use a custom prefix.
>
>I think this should be environ['REMOTE_USER'], per the CGI spec, so that
>same app could take auth either from the server (apache
>mod_auth_whatever or equivalent in other servers) or from middleware.

+1.

From ianb at colorstudy.com  Tue Oct  5 20:12:37 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue Oct  5 20:14:09 2004
Subject: [Web-SIG] WSGI Webware progress
In-Reply-To: <5.1.1.6.0.20041003120107.03bf15c0@mail.telecommunity.com>
References: <5.1.1.6.0.20041003120107.03bf15c0@mail.telecommunity.com>
Message-ID: <4162E415.5040904@colorstudy.com>

Phillip J. Eby wrote:
> At 10:42 AM 10/3/04 -0400, Peter Hunt wrote:
> 
>> Looking good! I see we've written a lot of similar code; perhaps we
>> could merge our two separate efforts into "wsgilib"?
> 
> 
> Heh.  I've also started work on a "wsgilib", mainly to provide common 
> base classes and utility functions for servers and gateways.  Maybe we 
> need to co-ordinate in some fashion.  :)

Should we put some of this code in a common repository?  I guess there's 
actually some benefit to working separately, since this is a standard 
not an implementation.  But then we at least need to agree on module 
names and it would be convenient to agree on some of these simple, 
common functions.

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From ianb at colorstudy.com  Tue Oct  5 20:19:37 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue Oct  5 20:20:59 2004
Subject: [Web-SIG] A more Twisted approach to async apps in WSGI
In-Reply-To: <6B8CDF7C-168A-11D9-B112-000A95A50FB2@fuhm.net>
References: <5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<6B8CDF7C-168A-11D9-B112-000A95A50FB2@fuhm.net>
Message-ID: <4162E5B9.7080502@colorstudy.com>

James Y Knight wrote:
> However, interaction with middleware seems quite tricky here:
> - For input modifying middleware: I guess on_get would have to just 
> raise an exception if wsgi.input has been replaced. If the input stream 
> was iterable, an on_get callback could just be considered notice that 
> you can iterate the input stream once without blocking, assuming the 
> block boundary requirements were also in effect here. Then it would work 
> right even if the input stream was replaced. However, I think it might 
> be the case that middleware that wants to modify the input stream is so 
> rare, it doesn't really matter.

I think middleware would have to modify the input stream if it wanted to 
parse POST variables.  In that case, you might parse the input stream, 
while also constructing a replacement input stream for when the 
application tries to re-read the stream.  In effect the middleware wants 
to peek at the input stream.

I can't think of any other useful reasons to modify the input stream, 
but this one seems fairly reasonable.  For instance, a piece of 
middleware might try to detect a login attempt by looking for particular 
field names in the request.
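
Roughly, the peeking could be sketched like this. It assumes a urlencoded POST with CONTENT_LENGTH present and a body small enough to hold in memory; PeekPostMiddleware and the 'peekdemo.' environ prefix are made-up names, and the code is written for current Python 3.

```python
import io
from urllib.parse import parse_qs


class PeekPostMiddleware:
    """Look at POST fields, then hand the app a fresh copy of the input."""

    def __init__(self, app, field='login'):
        self.app = app
        self.field = field

    def __call__(self, environ, start_response):
        if environ.get('REQUEST_METHOD') == 'POST':
            length = int(environ.get('CONTENT_LENGTH') or 0)
            raw = environ['wsgi.input'].read(length)
            fields = parse_qs(raw.decode('latin-1'))
            # Record what we saw under our own (hypothetical) prefix.
            environ['peekdemo.saw_login'] = self.field in fields
            # Replace the consumed stream so the app can re-read the body.
            environ['wsgi.input'] = io.BytesIO(raw)
        return self.app(environ, start_response)
```

Note that this is exactly the case where wsgi.input has been replaced, so an on_get()-style extension could not be used beneath this middleware.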

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From floydophone at gmail.com  Tue Oct  5 20:29:02 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Tue Oct  5 20:29:06 2004
Subject: [Web-SIG] WSGI Webware progress
In-Reply-To: <4162E415.5040904@colorstudy.com>
References: <5.1.1.6.0.20041003120107.03bf15c0@mail.telecommunity.com>
	<4162E415.5040904@colorstudy.com>
Message-ID: <6654eac4041005112933ceb412@mail.gmail.com>

I was actually thinking of putting all of the wsgilib candidate code
in a SVN repository. That way you can fix all of the bugs that I write
in my code without waiting for me :)


On Tue, 05 Oct 2004 13:12:37 -0500, Ian Bicking <ianb@colorstudy.com> wrote:
> 
> 
> Phillip J. Eby wrote:
> > At 10:42 AM 10/3/04 -0400, Peter Hunt wrote:
> >
> >> Looking good! I see we've written a lot of similar code; perhaps we
> >> could merge our two separate efforts into "wsgilib"?
> >
> >
> > Heh.  I've also started work on a "wsgilib", mainly to provide common
> > base classes and utility functions for servers and gateways.  Maybe we
> > need to co-ordinate in some fashion.  :)
> 
> Should we put some of this code in a common repository?  I guess there's
> actually some benefit to working separately, since this is a standard
> not an implementation.  But then we at least need to agree on module
> names and it would be convenient to agree on some of these simple,
> common functions.
> 
> --
> Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
>
From ianb at colorstudy.com  Tue Oct  5 20:40:27 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue Oct  5 20:41:32 2004
Subject: [Web-SIG] WSGI Webware progress
In-Reply-To: <6654eac4041005112933ceb412@mail.gmail.com>
References: <5.1.1.6.0.20041003120107.03bf15c0@mail.telecommunity.com>	
	<4162E415.5040904@colorstudy.com>
	<6654eac4041005112933ceb412@mail.gmail.com>
Message-ID: <4162EA9B.70001@colorstudy.com>

Peter Hunt wrote:
> I was actually thinking of putting all of the wsgilib candidate code
> in a SVN repository. That way you can fix all of the bugs that I write
> in my code without waiting for me :)

Sure... or something like that ;)  I can offer up repository space on 
colorstudy.com or on webwareforpython.org.

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From pje at telecommunity.com  Wed Oct  6 01:26:46 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct  6 01:26:35 2004
Subject: [Web-SIG] An implementation error I just found in PEP 333
Message-ID: <5.1.1.6.0.20041005191932.03540e80@mail.telecommunity.com>

Just a quick heads-up...  there's an error in the PEP's CGI implementation, 
so if you are basing a server/gateway implementation on it, you may be 
copying this error into your own code.

Specifically, 'start_response' contains this code:

         elif headers_sent:
             raise AssertionError("Headers already sent!")

It *should* read:

         elif headers_set:
             raise AssertionError("Headers already set!")

This is apparently a typo; it leads to noncompliant behavior (allowing 
start_response() to be called multiple times without error even if exc_info 
isn't supplied).  I discovered it while working on the WSGI reference 
library (wsgiref).  FYI, the ViewCVS for wsgiref is:

     http://cvs.eby-sarna.com/wsgiref/

And you can also get it via anonymous CVS; see

     http://peak.telecommunity.com/Meta/AnonymousCVSAccess.html

for instructions, replacing 'co PEAK' with 'co wsgiref'.

At the moment, wsgiref just contains a header manipulation class, a 
FileWrapper class, and a bunch of environment manipulation functions, all 
with extensive automated tests.

I'm in the middle of working on a base class that can be used to implement 
pretty much any kind of WSGI server or gateway, and I noticed that I had 
managed to copy the above error into my new base class.  So I thought I 
should mention it to everybody so they can verify that they didn't make the 
same mistake.  Sorry about the mixup, I'll get it fixed in the next PEP 
revision.
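
For anyone checking their own code, here is a stripped-down sketch of
the corrected check in present-day Python 3. It is not the PEP's CGI
example; `make_start_response` is a hypothetical helper, and the write
callable is a stub.

```python
def make_start_response():
    """Build a start_response callable with the corrected 'headers_set' check."""
    headers_set = []   # filled in by start_response()
    headers_sent = []  # would be filled in by the first real write()

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Too late to replace the headers: re-raise the app's error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a dangling traceback reference
        elif headers_set:
            # The corrected test: a second call without exc_info is an error.
            raise AssertionError("Headers already set!")
        headers_set[:] = [status, response_headers]
        return lambda data: None  # stub write() for this sketch

    return start_response
```

With the buggy 'headers_sent' test, the second plain call would be
accepted as long as nothing had been written yet.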

In the meantime, you can now feel good about the fact that even *my* PEP 
333 implementation had a compliance bug...  ;)

From pje at telecommunity.com  Wed Oct  6 08:39:42 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct  6 08:39:31 2004
Subject: [Web-SIG] Draft of server/gateway base class now available
Message-ID: <5.1.1.6.0.20041006021659.02270150@mail.telecommunity.com>

I've just checked in a set of server/gateway base classes into the wsgiref 
library.  The main class, BaseHandler, implements the structural flow of a 
WSGI application invocation, with stub methods for creating the various 
streams, variables, and so on, including some optional extensions like 
'wsgi.file_wrapper'.  Server/gateway implementations can subclass 
BaseHandler to fill in these stubs with appropriate implementations for 
their particular architecture.

Two other classes, BaseCGIHandler and CGIHandler, are usable as-is (more or 
less) for CGI and CGI-like environments.  BaseCGIHandler instances can be 
passed the streams and environ mapping to use, while CGIHandler takes them 
direct from the 'sys' and 'os' modules, while using different defaults for 
e.g. wsgi.multiprocess and wsgi.run_once.
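
As a rough illustration of passing streams and an environ to
BaseCGIHandler, the sketch below uses the wsgiref package as it ships
in the standard library of current Python 3; the 2004 CVS version may
differ in detail. `application` and `run_once` are names made up for
this example.

```python
import io
import sys
from wsgiref.handlers import BaseCGIHandler
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Trivial WSGI application used only for this sketch."""
    body = b"Hello from WSGI\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def run_once(app):
    """Run 'app' through BaseCGIHandler with in-memory streams."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a basic CGI-style environ
    stdout = io.BytesIO()
    handler = BaseCGIHandler(io.BytesIO(b""), stdout, sys.stderr, environ,
                             multithread=False, multiprocess=True)
    handler.run(app)
    return stdout.getvalue()  # CGI-style response: headers, blank line, body
```

Capturing the output in a BytesIO like this is handy for tests; a real
CGI script would pass the process's own streams instead.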

The main things missing at the moment from BaseHandler are:

  * sensible default error handling
  * automatic addition of missing headers (e.g. Content-Length)
  * Any HTTP/1.1 support whatsoever  :)
  * a more comprehensive test suite
   (there is a simple test suite now, but it doesn't cover all code paths)

The wsgiref package comes with a small set of automated tests; they can be 
run automatically via 'python setup.py -q test'.  It also includes utility 
routines like 'setup_testing_defaults()' to populate a basic 'environ' for 
testing purposes, HTTP header manipulation support, and various other 
useful things for server and application implementors.

I've tried to write the package to work with Python 2.1 (e.g. Jython), 
though I may have missed a few idioms; if you're working with an older 
version of Python and experience any difficulties, please let me know.

Most everything in the package has moderately verbose docstrings, so using 
pydoc or 'help()' in the interpreter should help you get going.  For a 
quick start, you can run a WSGI application under CGI with:

     from wsgiref.handlers import CGIHandler
     CGIHandler().run(application)

FYI, the ViewCVS for wsgiref is:

     http://cvs.eby-sarna.com/wsgiref/

And you can also get it via anonymous CVS; see

     http://peak.telecommunity.com/Meta/AnonymousCVSAccess.html

for instructions, replacing 'co PEAK' with 'co wsgiref'.

From pje at telecommunity.com  Wed Oct  6 08:42:21 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct  6 08:42:11 2004
Subject: [Web-SIG] *Another* implementation error
In-Reply-To: <5.1.1.6.0.20041005191932.03540e80@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041006024000.0232e140@mail.telecommunity.com>

At 07:26 PM 10/5/04 -0400, Phillip J. Eby wrote:
>Just a quick heads-up...  there's an error in the PEP's CGI 
>implementation, so if you are basing a server/gateway implementation on 
>it, you may be copying this error into your own code.

This time, the culprit is:

     environ['wsgi.last_call']    = True

Which I apparently never updated when the name of the variable became 
'wsgi.run_once'.  Please check to make sure you didn't copy this error into 
your implementations.  Sorry for the inconvenience.

I've just checked in an update of the PEP to fix this and the other coding 
errors I found today.

From py-web-sig at xhaus.com  Wed Oct  6 16:13:14 2004
From: py-web-sig at xhaus.com (Alan Kennedy)
Date: Wed Oct  6 16:13:49 2004
Subject: [Web-SIG] Multipart/multiple stream file uploads.
Message-ID: <4163FD7A.5040200@xhaus.com>

[Ian Bicking]
 > I think middleware would have to modify the input stream if it wanted
 > to  parse POST variables.  In that case, you might parse the input
 > stream, while also constructing a replacement input stream for when
 > the application tries to re-read the stream.  In effect the middleware
 > wants to peek at the input stream.

Reading this put me in mind of a potential use case that any WSGI input 
API will have to cover: that of multiple streamed file uploads.

So, for example, suppose that the user is uploading a set of form 
variables, *and* multiple files, each in a MIME multipart sub-message.

Say further that on the server-side we want to stream each of the files 
into disk without buffering them to memory, as well as access the form 
variables from the first MIME multipart.

In this case, the stream for the second file (i.e. the third 
multipart) cannot be made available to the WSGI application until the 
first file has been processed/saved to disk.

How could an asynchronous API support such multiple file uploads? As 
well as process/present form data from the first part? Would it have to 
register callbacks for "a new multipart has arrived" events?

I don't have a proposed solution, I just thought it was worth raising 
the use case, for discussion purposes.

Regards,

Alan.

From py-web-sig at xhaus.com  Wed Oct  6 16:22:59 2004
From: py-web-sig at xhaus.com (Alan Kennedy)
Date: Wed Oct  6 16:23:26 2004
Subject: [Web-SIG] Modjy and external packages.
Message-ID: <4163FFC3.4050800@xhaus.com>

Dear All,

I've just had an email from a modjy user who was delighted to get lucene 
(the excellent java text indexing engine[1]) up and running under 
WSGI/modjy. Cool B-)

But there is one little trick that one needs to know to make such things 
work. This is a trick that most jythonistas know, but if one doesn't 
know it, finding why the relevant imports don't work can be infuriating.

When referencing external jars in modjy applications, it is not 
sufficient to place the jar on the classpath, or to place it in the 
WEB-INF/lib directory.

You *also* have to inform jython about the existence of the package. 
This is very simple to do, by adding a simple declaration to your 
modules, like so

#######
import sys
sys.add_package('org.apache.lucene')
#######

And that's it.

I will be releasing a micro revision to modjy at the weekend which 
supports doing this through a configuration parameter, rather than the 
mildly ugly approach outlined above.

Happy modjy'ing!

Regards,

Alan.

[1] " Jakarta Lucene is a high-performance, full-featured text search 
engine library written entirely in Java."
http://jakarta.apache.org/lucene

From paul.boddie at ementor.no  Wed Oct  6 16:46:32 2004
From: paul.boddie at ementor.no (Paul Boddie)
Date: Wed Oct  6 16:46:36 2004
Subject: [Web-SIG] Modjy and external packages.
Message-ID: <0F4BD34E02639E428B4654DCBAB4502D109266@100NOOSLMSG004.common.alpharoot.net>

Alan Kennedy wrote:
> 
> When referencing external jars in modjy applications, it is not 
> sufficient to place the jar on the classpath, or to place it in the 
> WEB-INF/lib directory.

Really? I know that Java Servlet deployment issues are infuriating enough
as it is, but all I've ever needed to do with JythonServlet (which forms
the basis of WebStack's Java/Jython support) is to make sure that relevant
libraries reside in the WEB-INF/lib directory. At least, the basic servlet
libraries have to reside there, despite them also residing in lots of other
places within Apache Tomcat (which is what I'm testing on).

Perhaps recent Tomcat developments (I'm using 4.1.27) have messed around
with the security model, but I've never seen any need for what you
suggest...

> import sys
> sys.add_package('org.apache.lucene')

modjy looks interesting, though.

Paul
From py-web-sig at xhaus.com  Wed Oct  6 16:56:30 2004
From: py-web-sig at xhaus.com (Alan Kennedy)
Date: Wed Oct  6 16:57:16 2004
Subject: [Web-SIG] Modjy and external packages.
In-Reply-To: <0F4BD34E02639E428B4654DCBAB4502D109266@100NOOSLMSG004.common.alpharoot.net>
References: <0F4BD34E02639E428B4654DCBAB4502D109266@100NOOSLMSG004.common.alpharoot.net>
Message-ID: <4164079E.3070904@xhaus.com>

[Alan Kennedy]
 >>When referencing external jars in modjy applications, it is not
 >>sufficient to place the jar on the classpath, or to place it in the
 >>WEB-INF/lib directory.

[Paul Boddie]
 > Really? I know that Java Servlet deployment issues are infuriating
 > enough as it is, but all I've ever needed to do with JythonServlet
 > (which forms the basis of WebStack's Java/Jython support) is to make
 > sure that relevant libraries reside in the WEB-INF/lib directory.

That only works because the current org.python.util.PyServlet class 
already adds the relevant packages for you behind the scenes. Take a 
look at the source for the PyServlet.java file

http://cvs.sourceforge.net/viewcvs.py/jython/jython/org/python/util/PyServlet.java?rev=1.16&view=markup

The following are the relevant lines

/* ------------------------------------- */
public class PyServlet extends HttpServlet
     {
     /* ....... */

     public void init()
         {
         /* ....... */
         PySystemState sys = Py.getSystemState();
         sys.add_package("javax.servlet");
         sys.add_package("javax.servlet.http");
         sys.add_package("javax.servlet.jsp");
         sys.add_package("javax.servlet.jsp.tagext");
         sys.add_classdir(rootPath + "WEB-INF" +
             File.separator + "classes");
         sys.add_extdir(rootPath + "WEB-INF" + File.separator + "lib",
             true);
         /* ....... */
         }
     }
/* ------------------------------------- */

Regards,

Alan.
From paul.boddie at ementor.no  Wed Oct  6 17:06:14 2004
From: paul.boddie at ementor.no (Paul Boddie)
Date: Wed Oct  6 17:06:17 2004
Subject: [Web-SIG] Modjy and external packages.
Message-ID: <0F4BD34E02639E428B4654DCBAB4502D10926A@100NOOSLMSG004.common.alpharoot.net>

Alan Kennedy wrote:
> 

[Servlet libraries in WEB-INF/lib]

> That only works because the current org.python.util.PyServlet class 
> already adds the relevant packages for you behind the scenes. Take a 
> look at the source for the PyServlet.java file

[...]

I stand corrected! Having made various changes to PyServlet, one would have
thought I might have remembered this. You've quite possibly saved me some
time in the near future, Alan!

Paul
From foom at fuhm.net  Thu Oct  7 06:59:47 2004
From: foom at fuhm.net (James Y Knight)
Date: Thu Oct  7 07:04:23 2004
Subject: [Twisted-web] Re: [Web-SIG] A more Twisted approach to async apps
	in WSGI
In-Reply-To: <5.1.1.6.0.20041005022421.02fae470@mail.telecommunity.com>
References: <5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<5.1.1.6.0.20041005022421.02fae470@mail.telecommunity.com>
Message-ID: <B6269DF6-181D-11D9-AAA6-000A95A50FB2@fuhm.net>

On Oct 5, 2004, at 2:37 AM, Phillip J. Eby wrote:
> Although you probably want something more like a pipe error if the 
> input times out or the connection is broken.

You normally only get pipe errors on writes, read just sees EOF.

But that does bring up a good point: How does the server notify the 
application that the client has gone away, and any further work is 
useless?
- For non-async apps that use the iterator model: I think the server is 
allowed to just call iterable.close() and never iterate again.
- For async applications, with the proposed API, that may not be an 
option, because the iterable returned is the special wrapper, not a 
user-created class. Although, actually, I guess the app can return its 
own iterable whose __iter__ calls through and returns the wrapper's 
__iter__.
- What about for non-async applications that use the write callable? 
Should write be allowed to raise an exception? Or should it just become 
a no-op when the client is disconnected?

>>  and on_get seems like a fairly usable API for input. It doesn't 
>> let you pause the incoming data,
>
> Actually it does; it's supposed to be a one-shot.  You have to call it 
> again if you want to get called back again.

Ah, didn't see that it was one-shot. Yeah, in that case, the server can 
stop reading if there is no registered data callback and some 
predetermined buffer size is filled. Nice.
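
A toy model of that one-shot behaviour, in Python 3. Everything here
is an assumption made for illustration: the class name `OneShotInput`,
the `feed()` method, the `max_buffered` limit, and the idea that the
callback receives the buffered bytes. The proposal under discussion
may define on_get quite differently.

```python
class OneShotInput:
    """Hypothetical input object with a one-shot 'on_get' registration."""

    def __init__(self, max_buffered=8192):
        self.max_buffered = max_buffered
        self._buffer = bytearray()
        self._callback = None

    def on_get(self, callback):
        """Ask to be called back once, the next time data is available."""
        if self._buffer:
            self._fire(callback)
        else:
            self._callback = callback

    def feed(self, data):
        """Server side: store incoming bytes.

        Returns False when the server should stop reading from the socket,
        i.e. nobody consumed the data and the buffer limit has been reached.
        """
        self._buffer.extend(data)
        if self._callback is not None:
            callback, self._callback = self._callback, None  # one-shot
            self._fire(callback)
        return len(self._buffer) < self.max_buffered

    def _fire(self, callback):
        data = bytes(self._buffer)
        del self._buffer[:]
        callback(data)
```

Because the registration is consumed on each delivery, an application
that stops calling on_get() effectively pauses the incoming data.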

>>  If the input stream was iterable, an on_get callback could just be 
>> considered notice that you can iterate the input stream once without 
>> blocking, assuming the block boundary requirements were also in 
>> effect here.
>
> Yes, but this'd only work if the input were an iterator.  input.read() 
> returning an empty string would mean EOF, so the boundary stuff 
> doesn't work in that case.

Right -- just pointing out one plus to the iterator model. :)

>>  This means the .put/.next methods should communicate out-of-band, 
>> effectively calling pause/resume functions in the server so it knows 
>> when it's safe to iterate the vanilla iterator the middleware 
>> returned without the middleware blocking when calling the 
>> asyncwrapper-iterator.
>
> It could do that, certainly.  But, the truth is it's *always* safe to 
> iterate.  Note that the application can just use the on_get callback 
> to set a flag that it's ready to continue, and just keep yielding 
> empty strings till then.
>
> More to the point, the iterator-wrapper can simply yield empty strings 
> when its internal queue is empty, and a sensible async server should 
> back off its iterator.next() retry attempts when an application yields 
> empty strings.  This is pretty much always safe and sensible.
>
> However, the out-of-band communication you describe can also take 
> place, since it provides better communication in the case where the 
> extension is available.

Hmm, yes. I totally missed the option of just yielding ''. Of course 
it's a very bad idea to repeatedly yield '' to a server if you don't 
know the server can properly handle it (by e.g. delaying longer and 
longer), but, in this case, since the server itself is providing the 
special iterable, that should be fine.
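
One way a server loop might back off when it sees empty strings,
sketched in Python 3. The function name `drain`, the delay values and
the injected `send`/`sleep` callables are all assumptions for the
sake of the example; a real async server would reschedule the iterator
on its event loop rather than sleep.

```python
def drain(iterable, send, sleep, initial_delay=0.01, max_delay=1.0):
    """Send non-empty chunks; back off exponentially on empty ones."""
    delay = initial_delay
    try:
        for chunk in iterable:
            if chunk:
                send(chunk)
                delay = initial_delay  # real data: reset the back-off
            else:
                sleep(delay)           # app has nothing yet: wait a bit
                delay = min(delay * 2, max_delay)
    finally:
        # WSGI asks the server to call close() on the iterable if present.
        close = getattr(iterable, "close", None)
        if close is not None:
            close()
```

Passing `time.sleep` as `sleep` gives a blocking version; passing a
recorder makes the back-off schedule easy to test.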

It seems like it should be possible to make a generic class that 
implements this async API for use with sync servers that do not support 
it natively. That would allow async apps to run on a sync server 
without modification, which is potentially useful. To do that, though, 
I think it'd have to spawn an extra thread per request that is 
waiting to read data, for the read() call to block on. Unless, of 
course, the app never needs to yield outgoing data while waiting for 
incoming data.

The one remaining issue I have is the required thread-safeness of 
various APIs.

The spec doesn't mention much of anything about threadsafeness: is it 
ok to call wsgi methods from a different thread than the one the server 
originally called the request on? Especially interesting for 
implementing the above sync->async adapter: 
environ['wsgi.input'].read(x) would be called from a second thread.

What thread (if there's a choice) does the on_get callback get called 
on. Etc. I haven't really thought about these thready questions much 
either, so maybe the answers are obvious, but in my experience, that's 
usually not the case when it comes to threads. That's why async apps 
are nice. ;)

James

From pje at telecommunity.com  Thu Oct  7 07:28:42 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Oct  7 07:28:28 2004
Subject: [Twisted-web] Re: [Web-SIG] A more Twisted approach to
	async apps in WSGI
In-Reply-To: <B6269DF6-181D-11D9-AAA6-000A95A50FB2@fuhm.net>
References: <5.1.1.6.0.20041005022421.02fae470@mail.telecommunity.com>
	<5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<5.1.1.6.0.20041005022421.02fae470@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041007010942.02d33c90@mail.telecommunity.com>

At 12:59 AM 10/7/04 -0400, James Y Knight wrote:
>On Oct 5, 2004, at 2:37 AM, Phillip J. Eby wrote:
>>Although you probably want something more like a pipe error if the input 
>>times out or the connection is broken.
>
>You normally only get pipe errors on writes, read just sees EOF.
>
>But that does bring up a good point: How does the server notify the 
>application that the client has gone away, and any further work is useless?
>- For non-async apps that use the iterator model: I think the server is 
>allowed to just call iterable.close() and never iterate again.

Yes.


>- For async applications, with the proposed API, that may not be an 
>option, because the iterable returned is the special wrapper, not a 
>user-created class. Although, actually, I guess the app can return its own 
>iterable whose __iter__ calls through and returns the wrapper's __iter__.

Not if the server wants to be able to handle that iterable specially.  But 
anyway, it seems that the wrapper's constructor should take a close method, 
or have a way to set one.


>- What about for non-async applications that use the write callable? 
>Should write be allowed to raise an exception? Or should it just become a 
>no-op when the client is disconnected?

It's allowed to raise an exception, though this was never explicitly put in 
the spec; I'll have to fix that.  The actual process for that scenario 
looks something like this:

    * app calls write()
    * write() raises error
    * app catches error (maybe) and calls start_response() with exc_info
    * start_response() reraises the error, because it has already sent 
headers to the client and can't restart the response
    * application error handler bombs out and returns to server/gateway
    * server/gateway logs the exception (maybe) and gets on with life in 
the big 'net
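
The sequence above can be walked through with a small simulation in
Python 3. The `ClientGone` exception, the `make_gateway_callables`
helper and the `run` function are hypothetical; a real gateway would
raise whatever error its socket layer produces.

```python
import sys


class ClientGone(IOError):
    """Hypothetical error a gateway raises when the client has disconnected."""


def make_gateway_callables():
    state = {"headers_set": None, "headers_sent": False}

    def write(data):
        state["headers_sent"] = True  # headers go out with the first write
        raise ClientGone("client disconnected")

    def start_response(status, headers, exc_info=None):
        if exc_info:
            try:
                if state["headers_sent"]:
                    # Can't restart the response: re-raise the original error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None
        elif state["headers_set"]:
            raise AssertionError("Headers already set!")
        state["headers_set"] = (status, headers)
        return write

    return start_response


def application(environ, start_response):
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    try:
        write(b"hello")
    except Exception:
        # The app's error handler tries to switch to an error page...
        start_response("500 Internal Server Error",
                       [("Content-Type", "text/plain")], sys.exc_info())
        return [b"error page"]
    return []


def run(app):
    """Server/gateway side: log the exception and move on."""
    log = []
    try:
        app({}, make_gateway_callables())
    except ClientGone as exc:
        log.append("request aborted: %s" % exc)
    return log
```

The application never gets to return its error page, because
start_response() re-raises once headers have gone out.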


>Hmm, yes. I totally missed the option of just yielding ''. Of course it's 
>a very bad idea to repeatedly yield '' to a server if you don't know the 
>server can properly handle it (by e.g. delaying longer and longer), but, 
>in this case, since the server itself is providing the special iterable, 
>that should be fine.

Yes.  Also, when we finally settle on an async API, I do want to cover the 
issue of backing off iteration when empty strings are yielded.  I'm 
actually inclined to suggest that an async application should take 
responsibility for doing the delaying if it's called repeatedly, and the 
async API isn't available.


>It seems like it should be possible to make a generic class that 
>implements this async API for use with sync servers that do not support it 
>natively. That would allow async apps to run on a sync server without 
>modification, which is potentially useful. To do that, though, I think 
>it'd have to spawn an extra thread per request that is waiting to read 
>data, for the read() call to block on. Unless, of course, the app never 
>needs to yield outgoing data while waiting for incoming data.

Well, with Twisted you could deferToThread the read() operations, though 
it's hard for me to think straight about that scenario because I keep 
finding it hard to imagine an async web app that isn't just written to the 
Twisted API to start with... ;)


>The one remaining issue I have is the required thread-safeness of various 
>APIs.
>
>The spec doesn't mention much of anything about threadsafeness: is it ok 
>to call wsgi methods from a different thread than the one the server 
>originally called the request on? Especially interesting for implementing 
>the above sync->async adapter: environ['wsgi.input'].read(x) would be 
>called from a second thread.

Excellent question; I should add the answer to the spec, as soon as I 
decide precisely what it is. :)

One point: the spec should absolutely forbid servers from using thread 
identity to identify the application/caller.  The "what can you call while 
what else is executing" part of the question is a bit trickier.


>What thread (if there's a choice) does the on_get callback get called on. Etc.

My inclination is to make threading issues symmetrical.  That is, the 
application doesn't get any thread-identity guarantees either.


>  I haven't really thought about these thready questions much either, so 
> maybe the answers are obvious, but in my experience, that's usually not 
> the case when it comes to threads.

Yep.  :)  However, the more I think about it, the more it seems to me that 
WSGI should emulate single-threadedness with respect to any 
function/method/iterator invocations associated with a given application 
invocation.  However, it is *not* guaranteed that all such invocations will 
occur from the same thread.

Basically, it means "no multitasking with the other guy's objects", and 
puts the locking burdens on whoever's trying to mix multitasking into the 
works.
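
As a sketch of what "the locking burden" could look like for the
sync->async adapter mentioned above: if the adapter introduces a
second thread, it wraps the server's input object so that calls are
serialized. `SerializedInput` is a made-up name, and this is only one
possible reading of the rule, written in Python 3.

```python
import io
import threading


class SerializedInput:
    """Hypothetical wrapper: only one thread at a time touches the stream."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def read(self, size=-1):
        with self._lock:
            return self._stream.read(size)

    def readline(self):
        with self._lock:
            return self._stream.readline()


# Usage: the adapter wraps the server-supplied stream before sharing it.
wrapped = SerializedInput(io.BytesIO(b"one\ntwo\n"))
```

The server's own object is never called from two threads at once, and
the server does not need to know that threads are involved at all.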


>That's why async apps are nice. ;)

Not to mention fork().  :)


By the way, after all this discussion...  do you think it would be better to:

1) Push towards a full async API, nailing down all these loose ends

2) Use the simple-but-kludgy "pause iteration" API idea

3) Don't make an "official" async API, and just leave it open to server 
authors to create their own extensions, and maybe cherry pick the best 
ideas for WSGI 2.0, or

4) Do something else altogether?

From carribeiro at gmail.com  Thu Oct  7 16:55:13 2004
From: carribeiro at gmail.com (Carlos Ribeiro)
Date: Thu Oct  7 16:55:33 2004
Subject: [Web-SIG] Philosophical question: publishing classes vs instances
Message-ID: <864d3709041007075558ecfac2@mail.gmail.com>

Hello all,

I've been following the Web SIG, although I only signed the list
today. I'm working out some concepts related to object-oriented web
application design. I'm sure I'm not the first to do it :-) and I
would like not to reinvent the wheel -- at least, not the _same_
wheel.

The "natural way" to implement Python web apps seems to be through
some type of object publisher -- a system that finds the correct
object, that is 'published' in some part of the site, and activates
this object upon request. I've checked a few systems, and although I
can't claim extensive experience with them, most seem to operate based
on publishing object *instances*.

I'm now working on the high-level design for an application of mine,
and I thought that the correct way to do it would be to publish object
*classes*, and let the web framework instantiate the class and then
activate it upon request. Most of the time, I can't preserve information
in the server side anyway. And even if I use some of the advanced
techniques (for example, the persistent Javascript trick that apps
such as GMail use), an object instance seems to be a better fit,
although it would need a more complex management model.

I would like to know what you think of it, and whether there are any
good resources that I can study to understand all the issues. Maybe I'm
missing something; I don't believe that performance alone justifies
such preference, and it's something that I would like to understand.
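
To make the two styles concrete, here is a small Python 3 sketch in
WSGI terms. The `Counter` class and the `publish_instance` and
`publish_class` helpers are invented for this illustration and do not
correspond to any particular framework.

```python
class Counter:
    """Toy resource that keeps state on the object itself."""

    def __init__(self):
        self.hits = 0

    def __call__(self, environ, start_response):
        self.hits += 1
        body = ("hits=%d\n" % self.hits).encode("ascii")
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]


def publish_instance(obj):
    """Instance publishing: one long-lived object serves every request."""
    return obj


def publish_class(cls):
    """Class publishing: a fresh object is created for each request."""
    def application(environ, start_response):
        return cls()(environ, start_response)
    return application
```

With publish_instance the counter grows across requests (and needs
care under threads); with publish_class every request starts from a
clean object, so nothing leaks between requests.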

Best regards,

-- 
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: carribeiro@gmail.com
mail: carribeiro@yahoo.com
From pje at telecommunity.com  Fri Oct  8 01:21:32 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Oct  8 01:21:18 2004
Subject: [Web-SIG] PEAK now provides various WSGI gateway and server options
Message-ID: <5.1.1.6.0.20041007190441.02386300@mail.telecommunity.com>

The CVS version of PEAK now offers three options for running WSGI 
applications: CGI, FastCGI, and SimpleHTTPServer.  For example, this command:

     peak launch WSGI import:my_app.application

will do this:

  1. Import 'application' from 'my_app', treating it as a WSGI application
  2. Start a SimpleHTTPServer listening to an arbitrary port on 'localhost'
  3. launch a browser window pointing to that local server

So, it's a pretty easy way to test and play with WSGI applications without 
needing to configure a web server or mess with CGI.

PEAK also includes a CGI/FastCGI gateway that auto-detects whether it's 
running under CGI or FastCGI; the equivalent command is:

     peak CGI WSGI import:my_app.application

But you would normally turn this into a shell script, e.g.:

    #!/bin/sh
    peak CGI WSGI import:my_app.application

that would then be used as the CGI or FastCGI application executable.

Finally, PEAK also offers an advanced FastCGI "supervisor" that's a 
compelling replacement for mod_fastcgi's process manager when running 
high-volume and slow-starting applications.  It handles its own forking and 
killing off of child processes when they become too idle, and it has better 
"knowledge" of when new processes should or shouldn't be started.

All of these containers are fairly stable, with some of them having been 
used in production for over a year now.  (Until now, of course, the 
interface they used was a predecessor of the current WSGI spec, and they 
now use a simple adapter (courtesy of the wsgiref library) to wrap 
WSGI-compliant objects such that they implement that older, more CGI-like 
interface.)

In addition to these server and gateway implementations, all of PEAK's 
web-based tools, including the peak.web application framework, the 'DDT' 
(Document-Driven Testing) toolkit, and various example applications, are 
now WSGI applications, and should in principle be able to run under 
other WSGI-compliant servers and gateways, once you write an appropriate 
startup script to instantiate them.

Information about PEAK can be found at 
http://peak.telecommunity.com/.  PEAK's server and gateway implementations 
are based on the 'wsgiref' library, which is distributed bundled with PEAK, 
as well as in a separate distribution.

From ianb at colorstudy.com  Wed Oct 13 22:05:42 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Oct 13 22:07:09 2004
Subject: [Web-SIG] PEAK now provides various WSGI gateway and server
	options
In-Reply-To: <5.1.1.6.0.20041007190441.02386300@mail.telecommunity.com>
References: <5.1.1.6.0.20041007190441.02386300@mail.telecommunity.com>
Message-ID: <416D8A96.7000100@colorstudy.com>

Phillip J. Eby wrote:
> The CVS version of PEAK now offers three options for running WSGI 
> applications: CGI, FastCGI, and SimpleHTTPServer.  For example, this 
> command:
> 
>     peak launch WSGI import:my_app.application
> 
> will do this:
> 
>  1. Import 'application' from 'my_app', treating it as a WSGI application
>  2. Start a SimpleHTTPServer listening to an arbitrary port on 'localhost'
>  3. launch a browser window pointing to that local server

I'm noticing that peak serve WSGI import:... does the same thing, but 
without launching a web browser.

Is there any way to start the server up on a known port and interface? 
When I do "launch" it opens itself up in "localhost.my.hostname", and 
I'm not sure where localhost.my.hostname is coming from.  Since my 
computer has several interfaces, I'm not sure which one it's starting 
on, so I haven't been able to figure it out even when I try different 
addresses.

I was able to get "peak CGI WSGI import:..." working successfully, so 
the basic system is all installed and working.  I tried FastCGI a 
little, but I got stuck on installing mod_fastcgi for the moment.  I'm 
assuming that if I create a script like:

#!/bin/sh
peak FastCGI WSGI import:...

In a .fcgi, executable script, with "AddHandler fastcgi-script .fcgi" in 
my httpd.conf, it'll just work...?

I'm also not sure what the concurrency is for these.  Multithreaded, 
multiple processes, single process?  Configurable?  Does the supervisor 
start on its own, or does that have to be configured?

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From ianb at colorstudy.com  Wed Oct 13 22:21:15 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Oct 13 22:22:41 2004
Subject: [Web-SIG] PEAK now provides various WSGI gateway and
	server	options
In-Reply-To: <416D8A96.7000100@colorstudy.com>
References: <5.1.1.6.0.20041007190441.02386300@mail.telecommunity.com>
	<416D8A96.7000100@colorstudy.com>
Message-ID: <416D8E3B.7000900@colorstudy.com>

Ian Bicking wrote:
> I was able to get "peak CGI WSGI import:..." working successfully, so 
> the basic system is all installed and working.  I tried FastCGI a 
> little, but I got stuck on installing mod_fastcgi for the moment.  

BTW, does anyone know of a CGI gateway to FastCGI?  Lots of 
FastCGI-alike protocols have these: wkcgi in Webware, scgi-cgi for SCGI, 
Zope/PCGI's Zope.cgi, etc.  Typically these are just little C CGI 
programs.  I couldn't find one for FastCGI, but then the search terms 
are woefully ambiguous (too many "cgi"s).

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From fumanchu at amor.org  Wed Oct 13 23:06:12 2004
From: fumanchu at amor.org (Robert Brewer)
Date: Wed Oct 13 23:06:56 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3022F79@exchange.hqamor.amorhq.net>

In order to test my application's WSGI interface, I wrote a quick
mod_python server interface for WSGI. It's not bulletproof, but the
parts I use work. Sorry, Phillip, I didn't subclass
wsgiref.handlers.BaseHandler yet. ;(


class ModPythonInputWrapper(object):
    
    def __init__(self, req):
        self.req = req
    
    def read(self, size=-1):
        return self.req.read(size)
    
    def readline(self):
        return self.req.readline()
    
    def readlines(self, hint=-1):
        return self.req.readlines(hint)
    
    def __iter__(self):
        return iter(self.req.readlines())


class ModPythonErrorWrapper(object):
    
    def __init__(self, req):
        self.req = req
    
    def flush(self):
        pass
    
    def write(self, content):
        self.req.log_error(content)
    
    def writelines(self, seq):
        for content in seq:
            self.req.log_error(content)


def wrap_mod_python(application, req):
    """WSGI wrapper for mod_python 3.1 (Apache 2).
    
    Write your own short handler function, obtain your application,
    and pass it and the apache Request object to this function.
    """
    
    from mod_python import apache
    
    req.add_common_vars()
    environ = dict(req.subprocess_env.items())
    environ['wsgi.input']        = ModPythonInputWrapper(req)
    environ['wsgi.errors']       = ModPythonErrorWrapper(req)
    environ['wsgi.version']      = (1, 0)
    environ['wsgi.multithread']  = True
    environ['wsgi.multiprocess'] = False
    if req.protocol.count(u'HTTPS') > 0:
        environ['wsgi.url_scheme'] = 'https'
    else:
        environ['wsgi.url_scheme'] = 'http'
    
    nested_status = [apache.OK]
    
    def start_response(status, headers):
        if status:
            if status == "200 OK":
                nested_status[0] = apache.OK
            else:
                nested_status[0] = int(status[:3])
        for key, val in headers:
            req.headers_out[key] = val
        return req.write
    
    result = application(environ, start_response)
    try:
        for data in result:
            req.write(data)
    finally:
        if hasattr(result,'close'):
            result.close()
    return nested_status[0]

-----------

Example handler (for Junct, my wiki, built on Cation, my app framework):

from cation.html import uiwsgi
import junct

def handler(req):
    ui = uiwsgi.UserInterfaceWSGI(junct.junctapp)
    ui.sandbox = junct.arena.new_sandbox()
    app = ui.request
    result = uiwsgi.wrap_mod_python(app, req)
    ui.sandbox.flush_all()
    return result


Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org
From pje at telecommunity.com  Wed Oct 13 23:31:25 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct 13 23:32:19 2004
Subject: [Web-SIG] PEAK now provides various WSGI gateway and
	server	options
In-Reply-To: <416D8E3B.7000900@colorstudy.com>
References: <416D8A96.7000100@colorstudy.com>
	<5.1.1.6.0.20041007190441.02386300@mail.telecommunity.com>
	<416D8A96.7000100@colorstudy.com>
Message-ID: <5.1.1.6.0.20041013173029.0315bdb0@mail.telecommunity.com>

At 03:21 PM 10/13/04 -0500, Ian Bicking wrote:
>Ian Bicking wrote:
>>I was able to get "peak CGI WSGI import:..." working successfully, so the 
>>basic system is all installed and working.  I tried FastCGI a little, but 
>>I got stuck on installing mod_fastcgi for the moment.
>
>BTW, does anyone know of a CGI gateway to FastCGI?  Lots of FastCGI-alike 
>protocols have these: wkcgi in Webware, scgi-cgi for SCGI, Zope/PCGI's 
>Zope.cgi, etc.  Typically these are just little C CGI programs.  I 
>couldn't find one for FastCGI, but then the search terms are woefully 
>ambiguous (too many "cgi"s).

It's called 'cgi-fcgi', and it's part of the FastCGI developer's kit:

http://www.fastcgi.com/devkit/doc/fcgi-devel-kit.htm#S4.2

From pje at telecommunity.com  Thu Oct 14 00:25:48 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Oct 14 00:26:44 2004
Subject: [Web-SIG] PEAK now provides various WSGI gateway and
	server options
In-Reply-To: <416D8A96.7000100@colorstudy.com>
References: <5.1.1.6.0.20041007190441.02386300@mail.telecommunity.com>
	<5.1.1.6.0.20041007190441.02386300@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041013172910.02b5e5e0@mail.telecommunity.com>

At 03:05 PM 10/13/04 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>The CVS version of PEAK now offers three options for running WSGI 
>>applications: CGI, FastCGI, and SimpleHTTPServer.  For example, this command:
>>     peak launch WSGI import:my_app.application
>>will do this:
>>  1. Import 'application' from 'my_app', treating it as a WSGI application
>>  2. Start a SimpleHTTPServer listening to an arbitrary port on 'localhost'
>>  3. launch a browser window pointing to that local server
>
>I'm noticing that peak serve WSGI import:... does the same thing, but 
>without launching a web browser.

Yes, but it's less convenient to use since you have to set up a 
configuration file to specify the port and hostname and such.  "peak 
launch" selects an available port and tells your web browser about 
it.  However, if you want to use 'peak serve', you can put something like this:

   [peak.tools.server]
   url = "tcp://fqdn.goes.here:8000"

in a configuration file, and then point to it with PEAK_CONFIG.  E.g.:

   PEAK_CONFIG=myserver.conf peak serve WSGI import:my_app.application

Or, if you want to just make the whole thing an easy-to-run application:

   #!invoke peak runIni

   [peak.running]
   app = commands.Alias(command=['serve','WSGI','import:my_app.application'])

   [peak.tools.server]
   url = "tcp://fqdn.goes.here:8000"

And then make the file executable, so you can run it directly.  Now, you've 
got a ready-made setup to run a specific application.  You can also use 
'launch' instead of 'serve'; it will start the web browser on the 'http' 
version of the given URL.


>Is there any way to start the server up on a known port and interface? 
>When I do "launch" it opens itself up in "localhost.my.hostname", and I'm 
>not sure where localhost.my.hostname is coming from.

 From 'socket.getfqdn(serversocket.getsockname())'.  Specifically, the 
default address is 'localhost:0', which translates to any available port on 
'localhost'.  Apparently, your local resolver considers your FQDN to be 
'localhost.my.hostname', so I'd check /etc/resolv.conf or some such if 
you're on a Unix-like machine.  If you're on a Windows or OS X machine, I 
have no idea what to do.

Your problem does suggest that maybe I should change local_server to 
consider its address to be whatever it was configured to be, and not ask 
for the "official" socket address.  That way, it won't rely on a properly 
configured resolver, just to set up a localhost server.
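
To see where a name like 'localhost.my.hostname' can come from, here is a 
minimal sketch in present-day Python.  It is not PEAK's actual code, and the 
helper name 'local_server_name' is made up for illustration.  It binds to 
port 0 on 'localhost' and then asks the resolver for the FQDN of the bound 
address, which is roughly the lookup described above:

```python
import socket


def local_server_name(host="localhost", port=0):
    """Bind a socket and return (fqdn, port) as the local resolver reports it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))  # port 0 means "any available port"
        bound_host, bound_port = sock.getsockname()
        # getfqdn() consults the local resolver, so the result depends on
        # /etc/hosts, DNS search domains, and similar machine settings.
        return socket.getfqdn(bound_host), bound_port
    finally:
        sock.close()
```

If the name it returns is not one your browser can reach, the resolver 
configuration is the place to look.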


>I was able to get "peak CGI WSGI import:..." working successfully, so the 
>basic system is all installed and working.  I tried FastCGI a little, but 
>I got stuck on installing mod_fastcgi for the moment.  I'm assuming that 
>if I create a script like:
>
>#!/bin/sh
>peak FastCGI WSGI import:...
>
>In a .fcgi, executable script, with "AddHandler fastcgi-script .fcgi" in 
>my httpd.conf, it'll just work...?

Something like that, yes.  It's been a while since I used that 
approach;  I've mainly used stuff that's more like:

    <Files some_app>
    SetHandler fastcgi-script
    </Files>

For the most part, mod_fastcgi is a bitch to set up for non-trivial 
applications, even *with* the PEAK supervisor tool, as many of its options 
are either poorly documented, or buggy, depending on whether you consider 
the code or documentation to be the thing that's wrong.  :)


>I'm also not sure what the concurrency is for these.  Multithreaded, 
>multiple processes, single process?  Configurable?

CGI/FastCGI are both single thread, multi-process.  The supervisor is also 
multi-process, but forking.   If your application module wants to set up 
caches, import lots of modules, etc., this will be done in the parent 
process, so that child processes will already have the work done.
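
The benefit of forking after the expensive setup can be shown with plain 
os.fork() (Unix only).  This is a sketch of the general technique, not the 
supervisor's code; EXPENSIVE_CACHE and child_sees_cache are hypothetical 
names:

```python
import os

# Work done once in the parent, before any child exists.
EXPENSIVE_CACHE = {"loaded": True}


def child_sees_cache():
    """Fork a child and report whether it inherited the parent's cache."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the cache is already in memory, nothing is re-imported.
        try:
            os.close(read_fd)
            os.write(write_fd, b"warm" if EXPENSIVE_CACHE.get("loaded") else b"cold")
        finally:
            os._exit(0)  # never fall back into the parent's code path
    # Parent: collect the child's answer and reap it.
    os.close(write_fd)
    answer = os.read(read_fd, 16)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return answer
```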


>   Does the supervisor start on its own, or does that have to be configured?

mod_fastcgi starts its own process manager as needed.  Based on the 
settings in httpd.conf, it will start multiple processes for you, up to the 
maximum you specify.  It will also kill them off by signalling them when 
they become idle.  (The PEAK FastCGI implementations detect this and shut 
down gracefully.)

If you are using PEAK's process supervisor tool (peak.tools.supervisor) to 
manage an application, then you should configure mod_fastcgi to start one 
and only one process for that application.  Or, you can have the 
application start independently, listening on a known socket (e.g. 
/tmp/myapp.sock), and configure mod_fastcgi not to manage the start/stop of 
processes.  The process supervisor will take care of the rest for you.  If 
you start a supervised application that's already running, the new copy 
will get ready to run, and then signal the old copy to terminate 
gracefully, allowing currently-running requests to finish.  This is 
intended to make it easy to do a "warm restart" of your application to e.g. 
upgrade the code of a production application.  Alternately, you can simply 
issue a soft kill signal to the running parent process, and mod_fastcgi 
will take care of restarting it, if you've used the "start exactly one" 
approach.

To run an app under the PEAK "supervisor" tool, you need to create a 
configuration file, at minimum, something like:

     #!invoke peak supervise
     Command FastCGI fd.socket:stdin WSGI import:my_app.application
     PidFile /var/run/my_app.pid

This assumes that your OS supports using PATH to interpret "#!" lines; if 
not, you'll need an absolute path to 'invoke'.  ('invoke' is a C program 
that comes with PEAK in the 'scripts' directory, that you can install to 
more easily use PEAK tools as interpreters.)

The 'FastCGI fd.socket:stdin' means to use standard input as the connect 
socket for FastCGI; if you are using the "standalone" configuration, you'll 
want to replace that with a 'unix:/path/to/a_socket' or 
'tcp://localhost:1234' URL, as appropriate.  (For more detailed info on 
PEAK socket URLs, see the 'peak.net.sockets' module.)

The 'PidFile' spec is required; it's how the supervisor ensures that 
there's only one "master" process for the application at a given time, and 
it also makes it easy to shut down the application.  (There are also some 
other files used, whose names default to variations on the PidFile's 
filename, such as the "startup lock" file and the "pid lock" file; see the 
"Supervisor.xml" file in the 'peak.tools.supervisor' package directory for 
detailed info on these and all other configuration options for the 
"supervise" tool.)

Anyway, the configuration file can contain other options, like:

   MinProcesses 1       # Always have one request-handling process
   MaxProcesses 4       # and up to 4 if needed
   StartInterval 15s    # Don't start children more often than 1 per 15 seconds
   Import some.module   # Force import in parent of a module a child might need

Note that 'Import' directives do not do anything with the contents of the 
named module; they just ensure the module is imported before the supervisor 
considers itself "started".  This is useful if your application's initial 
import doesn't load all the modules it's going to use, and you don't want 
to slow down the startup of new child processes by making them import the 
module.

Whew.  Anyway, so, the minimum to use PEAK's supervise tool in place of the 
mod_fastcgi process supervisor is to make a configuration file specifying 
the command and pidfile, and it should be run using 'peak 
supervise'.  Ideally, you can do that with a '#!' line as shown above, but 
you can also do it with a shell script, e.g.:

   #!/bin/sh
   peak supervise config_file_for_my_app

Note that you can probably get by for a while without PEAK's supervise 
tool; it's fairly "industrial strength" and exists mainly to work around 
performance flaws in mod_fastcgi's process manager that affect 
slow-starting applications that need multiple processes in order to handle 
the server's request volume, and to make it easier to control a running 
application (e.g. easy warm restart).  If you don't have an application 
that costs measurable amounts of money for every second of delayed 
response, you may not need "peak supervise".

Finally, note that there's a very nice tutorial at

    http://peak.telecommunity.com/DevCenter/IntroToPeak

that covers lots of basic "how to set up configuration files and make them 
executable" stuff for PEAK.  There's also some useful information in 
INSTALL.txt, under "SCRIPTS, BATCH FILES, and #!":

    http://peak.telecommunity.com/doc/INSTALL.txt.html


From pje at telecommunity.com  Thu Oct 14 01:01:31 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Oct 14 01:02:38 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3022F79@exchange.hqamor.amo
	rhq.net>
Message-ID: <5.1.1.6.0.20041013182934.02473c70@mail.telecommunity.com>

At 02:06 PM 10/13/04 -0700, Robert Brewer wrote:
>In order to test my application's WSGI interface, I wrote a quick
>mod_python server interface for WSGI. It's not bulletproof, but the
>parts I use work. Sorry, Phillip, I didn't subclass
>wsgiref.handlers.BaseHandler yet. ;(

That's okay; you've given me several of the pieces I would need to do it 
myself.  :)  Although, I still would want a better way to find out what to 
set the multithread/multiprocess flags to, as some Apache builds are 
multithreaded and some are not, and some are multi-process and some are 
not.  To be compliant, the gateway has to report those flags accurately.

There are, however, numerous other issues in your code, from a 
WSGI-compliance perspective.  For example, your start_response() doesn't 
support WSGI error handling.
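
As a rough illustration of what that error handling involves, here is a 
sketch of a start_response() that accepts the optional third 'exc_info' 
argument.  It is written for present-day Python 3, and the ResponseState 
class is a hypothetical holder invented for this example, not part of 
wsgiref or mod_python:

```python
import sys


class ResponseState(object):
    """Hypothetical holder for status and headers until the first write."""

    def __init__(self):
        self.status = None
        self.headers = None
        self.headers_sent = False
        self.body = []

    def start_response(self, status, headers, exc_info=None):
        if exc_info is not None:
            try:
                if self.headers_sent:
                    # Too late to replace the response: re-raise the error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a traceback reference cycle
        elif self.status is not None:
            # A second call is only legal when reporting an error.
            raise AssertionError("Headers already set!")
        self.status, self.headers = status, list(headers)
        return self.write

    def write(self, data):
        self.headers_sent = True  # headers go out with the first body data
        self.body.append(data)
```

The key point is that a second call replaces the pending status and headers 
only while nothing has been sent; after that, the original exception is 
re-raised so the server can abort the response.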

Anyway, a mod_python handler would probably look something like:


     from wsgiref.handlers import BaseCGIHandler

     class ModPyHandler(BaseCGIHandler):

         def __init__(self,req):
             req.add_common_vars()
             BaseCGIHandler.__init__(self,
                 stdin = ModPythonInputWrapper(req),
                 stdout = None,
                 stderr = ModPythonErrorWrapper(req),
                 environ = dict(req.subprocess_env.items()),
                 multiprocess = True,  # XXX
                 multithread  = True,  # XXX
             )
             self.request = req
             self._write = req.write

         def _flush(self):
             pass

         def send_headers(self):
             self.cleanup_headers()
             self.headers_sent = True
             self.request.status = int(self.status[:3])
             for key, val in self.headers.items():
                 self.request.headers_out[key] = val

     def wsgi_handler(req):
         handler = ModPyHandler(req)
         options = req.get_options()
         appmod,appname = options['application'].split('::')
         d = {}
         exec ("from %(appmod)s import %(appname) as application"
               % locals()) in d
         handler.run(d[application])
         from mod_python import apache
         return apache.OK


But note that this is just a draft off the top of my head, and may be 
deficient with respect to how it uses the mod_python API (especially since 
I've never used mod_python even once).  Anyway, to use it, one would 
configure something like:

    PythonHandler somewhere::wsgi_handler
    PythonOption application myapp::wsgi_app_func

In other words, it uses a PythonOption called "application" to indicate the 
application to be run, thus simplifying the launch configuration.

Let me know if this code works for you, and if so I'll add it to the 
wsgiref library.

From pje at telecommunity.com  Thu Oct 14 01:09:21 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Oct 14 01:10:18 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
In-Reply-To: <5.1.1.6.0.20041013182934.02473c70@mail.telecommunity.com>
References: <3A81C87DC164034AA4E2DDFE11D258E3022F79@exchange.hqamor.amo
	rhq.net>
Message-ID: <5.1.1.6.0.20041013190831.023e1ca0@mail.telecommunity.com>

At 07:01 PM 10/13/04 -0400, Phillip J. Eby wrote:
>         exec ("from %(appmod)s import %(appname) as application" % 
> locals()) in d
>         handler.run(d[application])

Oops, typos.  There should be an 's' after '%(appname)', and that should be 
"d['application']".  Those are probably not the only mistakes I made in 
that code, but they're the first I've seen so far.  :)

I'm almost tempted to go build mod_python so I can see what the rest of the 
errors are.  :)

From fumanchu at amor.org  Thu Oct 14 07:13:21 2004
From: fumanchu at amor.org (Robert Brewer)
Date: Thu Oct 14 07:14:06 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3022F82@exchange.hqamor.amorhq.net>

Phillip J. Eby wrote:
> 
> At 02:06 PM 10/13/04 -0700, Robert Brewer wrote:
> >In order to test my application's WSGI interface, I wrote a quick
> >mod_python server interface for WSGI. It's not bulletproof, but the
> >parts I use work. Sorry, Phillip, I didn't subclass
> >wsgiref.handlers.BaseHandler yet. ;(
> 
> That's okay; you've given me several of the pieces I would 
> need to do it myself.  :)

I was hoping someone would say that. :)

> Anyway, a mod_python handler would probably look something like:
> 
>      from wsgiref.handlers import BaseCGIHandler
> 
>      class ModPyHandler(BaseCGIHandler):
> 
>          def __init__(self,req):
>              req.add_common_vars()
>              BaseCGIHandler.__init__(self,
>                  stdin = ModPythonInputWrapper(req),
>                  stdout = None,
>                  stderr = ModPythonErrorWrapper(req),
>                  environ = dict(req.subprocess_env.items()),
>                  multiprocess = True,  # XXX
>                  multithread  = True,  # XXX
>              )

1. I found apache.build_cgi_env(req) tonight, which does the
add_common_vars() and dict() shoving for you. Unfortunately, it's got a
bug. So I just stole code from it.

2. I think apache.mpm_query() is what we want for
multithreading/process. But it was introduced in version 3.1, so that
needs to be trapped. I went with optional arguments to
ModPyHandler.__init__

>      def wsgi_handler(req):
>          handler = ModPyHandler(req)
>          options = req.get_options()
>          appmod,appname = options['application'].split('::')
>          d = {}
>          exec ("from %(appmod)s import %(appname) as application" % 
> locals()) in d
>          handler.run(d[application])
>          from mod_python import apache
>          return apache.OK

Eeew. exec. Smelly. :) I'll stick with plain Python code over
PythonOption, thanks, and make my app developers do a few lines of extra
work *once* instead of every deployer on every install. To each his
own... :P

> Let me know if this code works for you, and if so I'll add it to the 
> wsgiref library.

Here's the revised version. I haven't tested everything; for example,
reading straight from wsgi.input or writing to .errors. I'll wait for
the bug reports. :)


class ModPythonInputWrapper(object):
    
    def __init__(self, req):
        self.req = req
    
    def read(self, size=-1):
        return self.req.read(size)
    
    def readline(self):
        return self.req.readline()
    
    def readlines(self, hint=-1):
        return self.req.readlines(hint)
    
    def __iter__(self):
        return iter(self.req.readlines())


class ModPythonErrorWrapper(object):
    
    def __init__(self, req):
        self.req = req
    
    def flush(self):
        pass
    
    def write(self, content):
        self.req.log_error(content)
    
    def writelines(self, seq):
        for content in seq:
            self.req.log_error(content)


from wsgiref.handlers import BaseCGIHandler

class ModPyHandler(BaseCGIHandler):
    
    def __init__(self, req, threaded=None, forked=None):
        from mod_python import apache
        try:
            q = apache.mpm_query
        except AttributeError:
            if (threaded is None) or (forked is None):
                m = ("You must provide 'threaded' and 'forked' args to "
                     "ModPyHandler when running mod_python < 3.1")
                raise ValueError(m)
        else:
            threaded = q(apache.AP_MPMQ_IS_THREADED)
            forked = q(apache.AP_MPMQ_IS_FORKED)
        
        req.add_common_vars()
        env = req.subprocess_env.copy()
        
        if req.path_info:
            env["SCRIPT_NAME"] = req.uri[:-len(req.path_info)]
        else:
            env["SCRIPT_NAME"] = req.uri
        
        env["GATEWAY_INTERFACE"] = "Python-CGI/1.1"
        
        # you may want to comment this out for better security
        if req.headers_in.has_key("authorization"):
            env["HTTP_AUTHORIZATION"] = req.headers_in["authorization"]
        
        BaseCGIHandler.__init__(self,
                                stdin=ModPythonInputWrapper(req),
                                stdout=None,
                                stderr=ModPythonErrorWrapper(req),
                                environ=env,
                                multiprocess=forked,
                                multithread=threaded
                                )
        self.request = req
        self._write = req.write
    
    def _flush(self):
        pass
    
    def send_headers(self):
        self.cleanup_headers()
        self.headers_sent = True
        self.request.status = int(self.status[:3])
        for key, val in self.headers.items():
            self.request.headers_out[key] = val


Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org
From fumanchu at amor.org  Thu Oct 14 08:20:22 2004
From: fumanchu at amor.org (Robert Brewer)
Date: Thu Oct 14 08:21:07 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3022F83@exchange.hqamor.amorhq.net>

I wrote:
> Here's the revised version. I haven't tested everything; for example,
> reading straight from wsgi.input or writing to .errors. I'll wait for
> the bug reports. :)

Okay. I just did file uploads (with cgi.FieldStorage) and naturally
encountered errors ;) which were printed to the apache2 error.log.
However, this did *not* happen in BaseCGIHandler.log_exception because
the headers had already been set, which raised another error, which also
got printed to error.log. Not sure what to do about that, if anything.
:/

I also confirmed that multiprocess and multithreaded were set correctly,
at least for mpm_winnt.


Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org
From pje at telecommunity.com  Thu Oct 14 19:24:57 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Oct 14 19:24:32 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3022F82@exchange.hqamor.amo
	rhq.net>
Message-ID: <5.1.1.6.0.20041014131210.03468640@mail.telecommunity.com>

At 10:13 PM 10/13/04 -0700, Robert Brewer wrote:
>Phillip J. Eby wrote:
> >      def wsgi_handler(req):
> >          handler = ModPyHandler(req)
> >          options = req.get_options()
> >          appmod,appname = options['application'].split('::')
> >          d = {}
> >          exec ("from %(appmod)s import %(appname) as application" %
> > locals()) in d
> >          handler.run(d[application])
> >          from mod_python import apache
> >          return apache.OK
>
>Eeew. exec. Smelly. :)

The "correct" way to do it would be to swipe whatever code mod_python 
itself uses for that, although I wouldn't be surprised if it uses exec 
also.  :)

More likely, it uses '__import__', but for the prototype version, why bother?
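
For what it's worth, an exec-free version is short.  The sketch below uses 
importlib from present-day Python rather than anything in mod_python, and 
'resolve_application' is a made-up name; it only assumes the 
'module::object' spelling used in the PythonOption example:

```python
import importlib


def resolve_application(spec):
    """Resolve a 'module::object' string to the object it names."""
    module_name, _, object_name = spec.partition("::")
    if not module_name or not object_name:
        raise ValueError("expected 'module::object', got %r" % (spec,))
    module = importlib.import_module(module_name)
    obj = module
    # Allow dotted object names such as 'SomeClass.some_attribute'.
    for part in object_name.split("."):
        obj = getattr(obj, part)
    return obj
```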


>I'll stick with plain Python code over
>PythonOption, thanks, and make my app developers do a few lines of extra
>work *once* instead of every deployer on every install.

I'm confused.  One of the main points of WSGI is to "write once, run 
anywhere".  Assuming most WSGI apps end up as a callable that can be 
imported from somewhere, then the path of least resistance for a deployer 
is to be able to pop an extra line or two in an .htaccess or 
httpd.conf.  They're going to have to touch that file anyway, even to set 
up a wrapper script.  Why should they have to edit the configuration *and* 
write a script?  Especially if they're just deploying the app.  That makes 
no sense to me at all.  Likewise, it makes no sense to have the application 
developer have to write a mod_python wrapper for their WSGI applications, 
since they might not have or care about mod_python specifically.

Perhaps I'm misunderstanding what you're saying, because I don't "get 
it".  Or maybe you misunderstood the intent of my code.  I was assuming 
that the 'wsgi_handler' function would be bundled with the *gateway*, not 
added to every application.  So, you would always have, e.g.:

     PythonHandler wsgiref.handlers::wsgi_handler

as part of the handler setup for a WSGI application.  Thus, deploying a 
WSGI app on mod_python should be as simple as having wsgiref and the 
application itself on the server's PYTHONPATH, and then setting a couple of 
configuration options.


>         if req.path_info:
>             env["SCRIPT_NAME"] = req.uri[:-len(req.path_info)]
>         else:
>             env["SCRIPT_NAME"] = req.uri

Does the 'req.uri' attribute include a query string?
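
The reason it matters: if a query string were present, slicing PATH_INFO 
off the end of the URI would cut in the wrong place.  A defensive sketch 
(plain Python, no mod_python calls; 'split_script_name' is a hypothetical 
helper) that works either way:

```python
def split_script_name(uri, path_info):
    """Return SCRIPT_NAME given the request URI and PATH_INFO."""
    path = uri.split("?", 1)[0]  # drop a query string if one is present
    if path_info and path.endswith(path_info):
        return path[:-len(path_info)]
    return path
```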


>         # you may want to comment this out for better security

No, you don't want to.  :)  If you don't trust the WSGI app, you shouldn't 
run it.  It would be trivial for it to inspect Python stack frames until it 
finds the request object and pull out the authorization on its own.  So, 
while it might give someone a warm fuzzy feeling to take it out, it won't 
really help anything.  :)

From pje at telecommunity.com  Thu Oct 14 19:39:27 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Oct 14 19:39:01 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3022F83@exchange.hqamor.amo
	rhq.net>
Message-ID: <5.1.1.6.0.20041014132635.03467e50@mail.telecommunity.com>

At 11:20 PM 10/13/04 -0700, Robert Brewer wrote:
>I wrote:
> > Here's the revised version. I haven't tested everything; for example,
> > reading straight from wsgi.input or writing to .errors. I'll wait for
> > the bug reports. :)
>
>Okay. I just did file uploads (with cgi.FieldStorage) and naturally
>encountered errors ;) which were printed to the apache2 error.log.
>However, this did *not* happen in BaseCGIHandler.log_exception because
>the headers had already been set, which raised another error, which also
>got printed to error.log. Not sure what to do about that, if anything.

Send me both tracebacks.  :)

One quick question: what is 'sys.stderr' for Python under mod_python?  If 
it prints to the error log, there's no reason (at least from a compliance 
POV) not to simply use it as the handler's stderr.

From fumanchu at amor.org  Thu Oct 14 23:20:07 2004
From: fumanchu at amor.org (Robert Brewer)
Date: Thu Oct 14 23:20:53 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3022F86@exchange.hqamor.amorhq.net>

Phillip J. Eby wrote:
> At 11:20 PM 10/13/04 -0700, Robert Brewer wrote:
> >I wrote:
> > > Here's the revised version. I haven't tested everything; for example,
> > > reading straight from wsgi.input or writing to .errors. I'll wait for
> > > the bug reports. :)
> >
> >Okay. I just did file uploads (with cgi.FieldStorage) and naturally
> >encountered errors ;) which were printed to the apache2 error.log.
> >However, this did *not* happen in BaseCGIHandler.log_exception because
> >the headers had already been set, which raised another error, which also
> >got printed to error.log. Not sure what to do about that, if anything.
> 
> Send me both tracebacks.  :)

Traceback (most recent call last):
  File "C:\Python23\lib\site-packages\mod_python\apache.py", line 299, in HandlerDispatch
    result = object(req)
  File "C:\Python23\lib\site-packages\cation\html\uiwsgi.py", line 144, in run_app
    self.run(self.app)
  File "C:\Python23\lib\site-packages\wsgiref\handlers.py", line 96, in run
    self.handle_error()
  File "C:\Python23\lib\site-packages\wsgiref\handlers.py", line 307, in handle_error
    self.result = self.error_output(self.environ, self.start_response)
  File "C:\Python23\lib\site-packages\wsgiref\handlers.py", line 325, in error_output
    start_response(self.error_status, self.error_headers[:])
  File "C:\Python23\lib\site-packages\wsgiref\handlers.py", line 176, in start_response
    raise AssertionError("Headers already set!")
AssertionError: Headers already set!

> One quick question: what is 'sys.stderr' for Python under mod_python?  If 
> it prints to the error log, there's no reason (at least from a compliance 
> POV) not to simply use it as the handler's stderr.

sys.stderr -> apache's error log. See
http://www.modpython.org/FAQ/faqw.py?req=show&file=faq02.003.htp


Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org
From pje at telecommunity.com  Thu Oct 14 23:33:52 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Oct 14 23:33:27 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3022F86@exchange.hqamor.amo
	rhq.net>
Message-ID: <5.1.1.6.0.20041014172738.022c1b40@mail.telecommunity.com>

At 02:20 PM 10/14/04 -0700, Robert Brewer wrote:

>   File "C:\Python23\lib\site-packages\wsgiref\handlers.py", line 325, in error_output
>     start_response(self.error_status, self.error_headers[:])

Aha.  That's a wsgiref bug; it should be passing 'sys.exc_info()' as the 
third argument here.  As a result, it doesn't work if start_response has 
already been called.  I've fixed this in CVS now.  Apparently the tests 
don't yet cover a scenario of "call start_response(), then raise an 
exception before the headers are actually sent."
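
That scenario is easy to reproduce.  The sketch below runs against the 
wsgiref that ships in today's Python 3 standard library (not the 2004 CVS 
version discussed here); 'failing_app' and 'run_failing_app' are names made 
up for the example:

```python
import io
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def failing_app(environ, start_response):
    """Call start_response(), then fail before any body is produced."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    raise RuntimeError("failed after start_response")


def run_failing_app():
    """Run failing_app under SimpleHandler; return (raw response, error log)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in SERVER_PROTOCOL and friends
    stdout, stderr = io.BytesIO(), io.StringIO()
    handler = SimpleHandler(io.BytesIO(), stdout, stderr, environ)
    handler.run(failing_app)
    return stdout.getvalue(), stderr.getvalue()
```

Because the handler passes exc_info on the second start_response() call, 
the pending "200 OK" is replaced by an error response and the traceback 
goes to the handler's stderr.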


> > One quick question: what is 'sys.stderr' for Python under
> > mod_python?  If
> > it prints to the error log, there's no reason (at least from
> > a compliance
> > POV) not to simply use it as the handler's stderr.
>
>sys.stderr -> apache's error log. See
>http://www.modpython.org/FAQ/faqw.py?req=show&file=faq02.003.htp

Ah.  So it should suffice to use sys.stderr, as long as the output is 
flushed from time to time.  I've changed wsgiref to flush stderr after 
writing exception output, since it really should be doing that for other 
platforms as well.

From fumanchu at amor.org  Thu Oct 14 23:49:29 2004
From: fumanchu at amor.org (Robert Brewer)
Date: Thu Oct 14 23:50:15 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3022F87@exchange.hqamor.amorhq.net>

Phillip J. Eby wrote:
> At 10:13 PM 10/13/04 -0700, Robert Brewer wrote:
> >Phillip J. Eby wrote:
> > >      def wsgi_handler(req):
> > >          handler = ModPyHandler(req)
> > >          options = req.get_options()
> > >          appmod,appname = options['application'].split('::')
> > >          d = {}
> > >          exec ("from %(appmod)s import %(appname)s as application"
> > >                % locals()) in d
> > >          handler.run(d['application'])
> > >          from mod_python import apache
> > >          return apache.OK
> >
> >Eeew. exec. Smelly. :)
> 
> The "correct" way to do it would be to swipe whatever code mod_python 
> itself uses for that, although I wouldn't be surprised if it 
> uses exec also.  :)
> 
> More likely, it uses '__import__', but for the prototype 
> version, why bother?

Because it's easy:

def wsgi_handler(req):
    from mod_python import apache
    
    handler = ModPyHandler(req)
    options = req.get_options()
    # 'debug' follows the PythonDebug directive
    debug = int(req.get_config().get("PythonDebug", 0))
    modname, objname = options['application'].split('::')
    module = apache.import_module(modname, autoreload=False, log=debug)
    app = apache.resolve_object(module, objname, arg=None, silent=False)
    handler.run(app)
    return apache.OK

> >I'll stick with plain Python code over
> >PythonOption, thanks, and make my app developers do a few 
> lines of extra
> >work *once* instead of every deployer on every install.
> 
> I'm confused.  One of the main points of WSGI is to "write once, run 
> anywhere".  Assuming most WSGI apps end up as a callable that can be 
> imported from somewhere, then the path of least resistance 
> for a deployer 
> is to be able to pop an extra line or two in an .htaccess or 
> httpd.conf.  They're going to have to touch that file anyway, 
> even to set 
> up a wrapper script.  Why should they have to edit the 
> configuration *and* 
> write a script?  Especially if they're just deploying the 
> app.  That makes 
> no sense to me at all.  Likewise, it makes no sense to have 
> the application 
> developer have to write a mod_python wrapper for their WSGI 
> applications, 
> since they might not have or care about mod_python specifically.

You're right. It was just one more level of indirection and my brain was
on overload with all the callbacks, etc. Turns out I can take what would
have been a handler:

def handler(req):

...and change it to:

def get_wsgi_app(environ, start_response):

...and make that my "application" callable.
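
For reference, a minimal sketch of what such a callable can look like.
The body is purely illustrative (it is not the real myproggie code), and
it is written for current Python 3, where the response body is bytes:

```python
def get_wsgi_app(environ, start_response):
    """Minimal WSGI application callable: one plain-text response."""
    body = ("Hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```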

> >         if req.path_info:
> >             env["SCRIPT_NAME"] = req.uri[:-len(req.path_info)]
> >         else:
> >             env["SCRIPT_NAME"] = req.uri
> 
> Does the 'req.uri' attribute include a query string?

The docs say "uri: The path portion of the URI." Helpful.

I'd guess req.uri does not include query string, since path_info comes
before query args. I copied the above 4 lines from mod_python/apache.py.
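
The same slicing as a standalone function, as a small sketch (the name
split_script_name is made up for the example; it assumes, as guessed
above, that the uri carries no query string):

```python
def split_script_name(uri, path_info):
    """Return SCRIPT_NAME given the request path and PATH_INFO."""
    if path_info:
        # Strip the trailing PATH_INFO portion off the request path.
        return uri[:-len(path_info)]
    # Without this branch, uri[:-0] would give the empty string.
    return uri
```

For example, split_script_name("/app/wsgi/foo/bar", "/foo/bar") gives
"/app/wsgi".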

> 
> >         # you may want to comment this out for better security
> 
> No, you don't want to.  :)  If you don't trust the WSGI app, 
> you shouldn't run it.  It would be trivial for it to inspect
> Python stack frames until it finds the request object and pull
> out the authorization on its own.  So, while it might give someone a
> warm fuzzy feeling to take it out, it won't really help anything.

That comment was also copied from mod_python.

I realized you don't need a separate handler function called
wsgi_handler; mod_python is smart enough to notice when your handler is
an unbound class method, and automatically forms an instance of your
class (passing the request object), and then calls the bound method
(again, passing the request). So I folded the handler code directly into
ModPyHandler. Here's the latest version:


class ModPythonInputWrapper(object):
    
    def __init__(self, req):
        self.req = req
    
    def read(self, size=-1):
        return self.req.read(size)
    
    def readline(self):
        return self.req.readline()
    
    def readlines(self, hint=-1):
        return self.req.readlines(hint)
    
    def __iter__(self):
        return iter(self.req.readlines())

import sys
from wsgiref.handlers import BaseCGIHandler

class ModPyHandler(BaseCGIHandler):
    
    def __init__(self, req):
        from mod_python import apache
        options = req.get_options()
        
        try:
            q = apache.mpm_query
        except AttributeError:
            # Threading and forking
            threaded = options.get('multithread', '')
            forked = options.get('multiprocess', '')
            if not (threaded and forked):
                raise ValueError("You must provide 'multithread' and "
                                 "'multiprocess' PythonOptions when "
                                 "running mod_python < 3.1")
            threaded = threaded.lower() in ('on', 't', 'true', '1')
            forked = forked.lower() in ('on', 't', 'true', '1')
        else:
            threaded = q(apache.AP_MPMQ_IS_THREADED)
            forked = q(apache.AP_MPMQ_IS_FORKED)
        
        req.add_common_vars()
        env = req.subprocess_env.copy()
        
        if req.path_info:
            env["SCRIPT_NAME"] = req.uri[:-len(req.path_info)]
        else:
            env["SCRIPT_NAME"] = req.uri
        
        env["GATEWAY_INTERFACE"] = "Python-CGI/1.1"
        
        if req.headers_in.has_key("authorization"):
            env["HTTP_AUTHORIZATION"] = req.headers_in["authorization"]
        
        BaseCGIHandler.__init__(self,
                                stdin=ModPythonInputWrapper(req),
                                stdout=None,
                                stderr=sys.stderr,
                                environ=env,
                                multiprocess=forked,
                                multithread=threaded
                                )
        self.request = req
        self._write = req.write
        
        config = req.get_config()
        debug = int(config.get("PythonDebug", 0))
        
        modname, objname = options['application'].split('::')
        module = apache.import_module(modname, autoreload=False,
                                      log=debug)
        self.app = apache.resolve_object(module, objname, arg=None,
                                         silent=False)
    
    def run_app(self, req):
        self.run(self.app)
        return 0 # = apache.OK
    
    def _flush(self):
        pass
    
    def send_headers(self):
        self.cleanup_headers()
        self.headers_sent = True
        self.request.status = int(self.status[:3])
        for key, val in self.headers.items():
            self.request.headers_out[key] = val

------------

and a sample .conf:


<Directory D:\htdocs\myprog>
  PythonHandler wsgiref.handlers::ModPyHandler.run_app
  PythonOption application myproggie.startup::get_wsgi_app
  # These options are required if you're using a version of mod_python < 3.1
  # PythonOption multithread On
  # PythonOption multiprocess Off
</Directory>


Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org
From floydophone at gmail.com  Fri Oct 15 02:59:45 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Fri Oct 15 02:59:47 2004
Subject: [Web-SIG] WSGI async API
Message-ID: <6654eac4041014175977291ff4@mail.gmail.com>

Can someone briefly outline how the WSGI async API works? Sorry to
reiterate, but I don't know the agreement we finally reached.
From pje at telecommunity.com  Fri Oct 15 07:46:42 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Oct 15 07:46:18 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <6654eac4041014175977291ff4@mail.gmail.com>
Message-ID: <5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>

At 08:59 PM 10/14/04 -0400, Peter Hunt wrote:
>Can someone briefly outline how the WSGI async API works? Sorry to
>reiterate, but I don't know the agreement we finally reached.

That's because no agreement was reached.  There are two (moderately vague) 
proposals still at large:

1) Have server extension APIs to pause iteration until further notice, or 
until input is available

2) Have a server extension API that returns an iterable that the 
application then returns after registering callbacks with it.  This object 
would provide a more continuation-like API.

Another alternative is not to bless an official async API at this time, and 
leave it open for server developers to innovate.  Then, come back later and 
extend the PEP once there's more user/developer experience with the various 
innovations out there.  Both of the above approaches could be implemented 
in various ways, according to developer interest, but would be considered 
server-specific extensions until/unless there was consensus to formalize 
them as optional extensions to the current spec.

Given that none of the proposals appear to require making any further 
changes to the base API, and that traffic discussing the existing proposals 
has been slim, this latter alternative is beginning to look pretty 
attractive to me.

From pje at telecommunity.com  Fri Oct 15 07:47:22 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Oct 15 07:46:57 2004
Subject: [Web-SIG] [WSGI] mod_python wrapper: minimal first attempt
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3022F87@exchange.hqamor.amo
	rhq.net>
Message-ID: <5.1.1.6.0.20041015013257.02c1aec0@mail.telecommunity.com>

At 02:49 PM 10/14/04 -0700, Robert Brewer wrote:
>Phillip J. Eby wrote:
> > The "correct" way to do it would be to swipe whatever code mod_python
> > itself uses for that, although I wouldn't be surprised if it
> > uses exec also.  :)
> >
> > More likely, it uses '__import__', but for the prototype
> > version, why bother?
>
>Because it's easy:

Not if you haven't installed or even downloaded mod_python.  ;)


>I realized you don't need a separate handler function called
>wsgi_handler; mod_python is smart enough to notice when your handler is
>an unbound class method, and automatically forms an instance of your
>class (passing the request object), and then calls the bound method
>(again, passing the request). So I folded the handler code directly into
>ModPyHandler. Here's the latest version:

I think I'll stick with the version where they're separate.  It's easier to 
implement unit tests on the handler class if its __init__ method doesn't 
run the application.  Still, this looks like it's in pretty good shape to 
pop into wsgiref.  Thanks for your help in fleshing it out.

From floydophone at gmail.com  Fri Oct 15 12:57:57 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Fri Oct 15 12:58:01 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
References: <6654eac4041014175977291ff4@mail.gmail.com>
	<5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
Message-ID: <6654eac404101503573c8cfa7a@mail.gmail.com>

So if I'm implementing a Twisted gateway, where should
request.finish() go? This has been puzzling me for some time...


On Fri, 15 Oct 2004 01:46:42 -0400, Phillip J. Eby
<pje@telecommunity.com> wrote:
> At 08:59 PM 10/14/04 -0400, Peter Hunt wrote:
> 
> 
> >Can someone briefly outline how the WSGI async API works? Sorry to
> >reiterate, but I don't know the agreement we finally reached.
> 
> That's because no agreement was reached.  There are two (moderately vague)
> proposals still at large:
> 
> 1) Have server extension APIs to pause iteration until further notice, or
> until input is available
> 
> 2) Have a server extension API that returns an iterable that the
> application then returns after registering callbacks with it.  This object
> would provide a more continuation-like API.
> 
> Another alternative is not to bless an official async API at this time, and
> leave it open for server developers to innovate.  Then, come back later and
> extend the PEP once there's more user/developer experience with the various
> innovations out there.  Both of the above approaches could be implemented
> in various ways, according to developer interest, but would be considered
> server-specific extensions until/unless there was consensus to formalize
> them as optional extensions to the current spec.
> 
> Given that none of the proposals appear to require making any further
> changes to the base API, and that traffic discussing the existing proposals
> has been slim, this latter alternative is beginning to look pretty
> attractive to me.
> 
>
From irmen at xs4all.nl  Fri Oct 15 13:55:16 2004
From: irmen at xs4all.nl (Irmen de Jong)
Date: Fri Oct 15 13:55:21 2004
Subject: [Web-SIG] http content-location header, and different browsers
Message-ID: <416FBAA4.6060502@xs4all.nl>

Hello all,
I was just trying some new code I was writing for Snakelets with
different browsers, and stumbled across something weird.
It has to do with the HTTP Content-Location header.

What I used to do was adding a Content-Location header in the
reply, when the page was internally redirected in Snakelets.
(I thought this was a good idea, based on what I knew about
the meaning of that header). Everything worked fine. Until I
opened my website with Opera, instead of Firefox or IE....:
a few of my links had totally wrong URLs in Opera!

After a bit of searching I now know that at least Opera implements
the HTTP specification, which says in
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.14
that "The value of Content-Location also defines the base URI
for the entity." So Opera was -rightfully so- using the value
of the content-location header as the new base URI, and the
other browsers I tried *do not do that*. Firefox has a WONTFIX-bug
on this (bugzilla #109553) because they feel that it would break
a lot of websites that supply faulty content-location headers.
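
The difference can be reproduced with the standard library's URL
resolution.  This is only a sketch of the two behaviours; the URLs and the
helper name resolve_link are invented for the example, and the code is
written for current Python 3:

```python
from urllib.parse import urljoin


def resolve_link(request_url, link, content_location=None):
    """Resolve a relative link, optionally honouring Content-Location."""
    base = request_url
    if content_location:
        # The header value may itself be relative to the request URL.
        base = urljoin(request_url, content_location)
    return urljoin(base, link)


request_url = "http://example.com/app/start"
# Browser ignoring the header (base is the requested URL):
ignoring = resolve_link(request_url, "img/logo.png")
# Browser honouring the header (base is the Content-Location value):
honouring = resolve_link(request_url, "img/logo.png",
                         content_location="/app/pages/index.html")
```

Here `ignoring` is http://example.com/app/img/logo.png while `honouring`
is http://example.com/app/pages/img/logo.png, so the same relative link
points at two different places.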

In the end, I decided to just not generate this header anymore.
And my site started working in Opera too ;-)

What do you think of this?

--Irmen de Jong.
From foom at fuhm.net  Fri Oct 15 17:20:34 2004
From: foom at fuhm.net (James Y Knight)
Date: Fri Oct 15 17:20:40 2004
Subject: [Twisted-web] Re: [Web-SIG] A more Twisted approach to async apps
	in WSGI
In-Reply-To: <5.1.1.6.0.20041007010942.02d33c90@mail.telecommunity.com>
References: <5.1.1.6.0.20041005022421.02fae470@mail.telecommunity.com>
	<5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<5.1.1.6.0.20041005022421.02fae470@mail.telecommunity.com>
	<5.1.1.6.0.20041007010942.02d33c90@mail.telecommunity.com>
Message-ID: <C2ABAA9E-1EBD-11D9-AAA6-000A95A50FB2@fuhm.net>


On Oct 7, 2004, at 1:28 AM, Phillip J. Eby wrote:
>> - For async applications, with the proposed API, that may not be an 
>> option, because the iterable returned is the special wrapper, not a 
>> user-created class. Although, actually, I guess the app can return 
>> its own iterable whose __iter__ calls through and returns the 
>> wrapper's __iter__.
>
> Not if the server wants to be able to handle that iterable specially.  
> But anyway, it seems that the wrapper's constructor should take a 
> close method, or have a way to set one.

As already discussed, the server cannot really expect to actually get 
the iterable back anyhow. But yes, I'd say either the init should take 
a close argument, or else the use of something like "wrapper.close = 
myCloseFunction" should be part of the API.


>> Hmm, yes. I totally missed the option of just yielding ''. Of course 
>> it's a very bad idea to repeatedly yield '' to a server if you don't 
>> know the server can properly handle it (by e.g. delaying longer and 
>> longer), but, in this case, since the server itself is providing the 
>> special iterable, that should be fine.
>
> Yes.  Also, when we finally settle on an async API, I do want to cover 
> the issue of backing off iteration when empty strings are yielded.  
> I'm actually inclined to suggest that an async application should take 
> responsibility for doing the delaying if it's called repeatedly, and 
> the async API isn't available.

If the async API isn't available, and I'm an async application, I would 
assume I'm running on a synch server, and thus am allowed to block the 
request thread indefinitely, and do so, waiting for a wakeup 
notification from the reactor loop. It doesn't seem to me that any 
iterator back-off behavior is needed, or desirable. I can fabricate an 
async wrapper that uses threads.
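
A rough sketch of that blocking approach, using only the standard library
(current Python 3).  The start_async_work(callback) hook is hypothetical:
it stands for whatever arranges for the callback to be invoked later from
another thread, such as the one running the event loop.

```python
import threading


def make_blocking_app(start_async_work):
    """Wrap an event-driven producer as an ordinary blocking WSGI app."""

    def app(environ, start_response):
        done = threading.Event()
        result = []

        def on_result(body):
            result.append(body)
            done.set()  # wake up the request thread

        start_async_work(on_result)
        done.wait()  # block this request thread until the callback fires
        start_response("200 OK", [("Content-Type", "text/plain")])
        return result

    return app
```

Because the request thread simply sleeps on the event, the server never
has to iterate repeatedly over an iterable that has nothing to yield.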

>> It seems like it should be possible to make a generic class that 
>> implements this async API for use with sync servers that do not 
>> support it natively. That would allow async apps to run on a sync 
>> server without modification, which is potentially useful. To do that, 
>> though, I think it'd have to spawn an extra thread per request 
>> that is waiting to read data, for the read() call to block on. 
>> Unless, of course, the app never needs to yield outgoing data while 
>> waiting for incoming data.
>
> Well, with Twisted you could deferToThread the read() operations, 
> though it's hard for me to think straight about that scenario because 
> I keep finding it hard to imagine an async web app that isn't just 
> written to the Twisted API to start with... ;)

Right -- but deferToThread'ing a read() operation is essentially the 
same as spawning an extra thread per request to read the data, just 
with nicer thread management.

> [thread stuff]
>
>>  I haven't really thought about these thready questions much either, 
>> so maybe the answers are obvious, but in my experience, that's 
>> usually not the case when it comes to threads.
>
> Yep.  :)  However, the more I think about it, the more it seems to me 
> that WSGI should emulate single-threadedness with respect to any 
> function/method/iterator invocations associated with a given 
> application invocation.  However, it is *not* guaranteed that all such 
> invocations will occur from the same thread.
>
> Basically, it means "no multitasking with the other guy's objects", 
> and puts the locking burdens on whoever's trying to mix multitasking 
> into the works.

That does sound good. No multitasking means it's impossible to write a 
response while already waiting for incoming data. But actually I think 
it's probably fine for an async app running on a sync server to not be 
able to simultaneously read data and write data, so I take back 
anything about needing to call wsgi server methods from more than one 
thread. In the compat wrapper, calling on_get can just block writing 
until the read has occurred; in that case, all wsgi methods can be 
called from the server's request thread.

> By the way, after all this discussion...  do you think it would be 
> better to:
>
> 1) Push towards a full async API, nailing down all these loose ends
>
> 2) Use the simple-but-kludgy "pause iteration" API idea
>
> 3) Don't make an "official" async API, and just leave it open to 
> server authors to create their own extensions, and maybe cherry pick 
> the best ideas for WSGI 2.0, or
>
> 4) Do something else altogether?

I think the API you've outlined sounds good. I can imagine ways to 
implement it both for an async server like twisted, and as a 
compatibility layer for an async-requiring application on a sync 
server. I think it's easier to make the compatibility layer with this 
API than with the pause/resume API.  However, I would be quite wary of 
including it in the final spec without it being implemented first.

Another question is: what is the current use for it? Does anyone want 
to write untwisted async web applications?

My current interest in WSGI is basically on the "plug twisted web into 
another webserver as an application" side of things. I wouldn't want to 
write an application to WSGI (without a framework on top)... If 
everyone else feels that way, an async API may not be actually useful 
until there is some other Async-WSGI web server that you could plug 
twisted framework stuff on top of, or some other async framework you 
can plug on top of the twisted server.

As for postponing until WSGI 2.0, I would hope there doesn't need to be 
a WSGI 2.0, though, since the interface is so darn simple. ;) But it 
could be in a separate WSGI async addons.

James

From pje at telecommunity.com  Fri Oct 15 17:31:56 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Oct 15 17:31:31 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <6654eac404101503573c8cfa7a@mail.gmail.com>
References: <5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
	<6654eac4041014175977291ff4@mail.gmail.com>
	<5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041015113034.02c7a7e0@mail.telecommunity.com>

At 06:57 AM 10/15/04 -0400, Peter Hunt wrote:
>So if I'm implementing a Twisted gateway, where should
>request.finish() go? This has been puzzling me for some time...

What's request.finish()?  I've never done anything with Twisted at a higher 
level than the raw reactor interface, and a bit with Deferreds.  So I'm not 
sure what you're talking about here.

From pje at telecommunity.com  Fri Oct 15 17:52:58 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Oct 15 17:52:36 2004
Subject: [Twisted-web] Re: [Web-SIG] A more Twisted approach to
	async apps in WSGI
In-Reply-To: <C2ABAA9E-1EBD-11D9-AAA6-000A95A50FB2@fuhm.net>
References: <5.1.1.6.0.20041007010942.02d33c90@mail.telecommunity.com>
	<5.1.1.6.0.20041005022421.02fae470@mail.telecommunity.com>
	<5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<5.1.1.6.0.20040922204838.024f61c0@mail.telecommunity.com>
	<5.1.1.6.0.20041005022421.02fae470@mail.telecommunity.com>
	<5.1.1.6.0.20041007010942.02d33c90@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041015113209.02155ba0@mail.telecommunity.com>

At 11:20 AM 10/15/04 -0400, James Y Knight wrote:

>On Oct 7, 2004, at 1:28 AM, Phillip J. Eby wrote:
>>By the way, after all this discussion...  do you think it would be better to:
>>
>>1) Push towards a full async API, nailing down all these loose ends
>>
>>2) Use the simple-but-kludgy "pause iteration" API idea
>>
>>3) Don't make an "official" async API, and just leave it open to server 
>>authors to create their own extensions, and maybe cherry pick the best 
>>ideas for WSGI 2.0, or
>>
>>4) Do something else altogether?
>
>I think the API you've outlined sounds good. I can imagine ways to 
>implement it both for an async server like twisted, and as a compatibility 
>layer for an async-requiring application on a sync server. I think it's 
>easier to make the compatibility layer with this API than with the 
>pause/resume API.  However, I would be quite wary of including it in the 
>final spec without it being implemented first.

Right, this is one reason I'm thinking that #3 might be a good idea, 
although it'd probably be more like 1.1 than 2.0.  Or really, it would just 
be an optional extension available under 1.0.  Even if we finalize the 1.0 
spec, nothing stops us from adding optional extensions that don't alter the 
existing required semantics.


>Another question is: what is the current use for it? Does anyone want to 
>write untwisted async web applications?

Right.  That's the really big issue, and another reason why saying, "let's 
wait for implementations" might be a good idea.  That is, if people 
implement something, there's clearly a market for it.  If they don't, maybe 
we don't need it.


>My current interest in WSGI is basically on the "plug twisted web into 
>another webserver as an application" side of things. I wouldn't want to 
>write an application to WSGI (without a framework on top)... If everyone 
>else feels that way, an async API may not be actually useful until there 
>is some other Async-WSGI web server that you could plug twisted framework 
>stuff on top of, or some other async framework you can plug on top of the 
>twisted server.

Yep, that's the issue alright.  It seems that the common usecase for an 
async web app is going to boil down to: "do you want to proxy your Twisted 
app from some other web server?"  Because let's face it, Twisted's process 
model isn't really a match for say, the Apache prefork model, or CGI.

ISTM, then, that the useful thing to write would be a synchronous 
WSGI->HTTP "application" object.  That would allow Twisted or any other 
async server (or really any HTTP server at all) to be treated as a WSGI 
application, thus letting async apps join the WSGI party without forcing 
them to give up any asyncness or to have to do other really horrid things 
to fit.

With a little more sophistication, such an application component could 
perhaps actually spawn the async server if it's not running, by checking a 
pid file or some such.  Or that could be middleware; you have a "server 
starter" middleware that just ensures the server is running before it 
passes the request down to the proxy middleware.
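
A rough sketch of such a proxy "application", written for current
Python 3 with only the standard library.  The names environ_to_request,
make_proxy_app and HOP_BY_HOP are made up for the example; it buffers the
whole response in memory and leaves out the "server starter" part
entirely.

```python
import http.client

# Response headers that describe the backend connection, not the content.
HOP_BY_HOP = {"connection", "keep-alive", "proxy-authenticate",
              "proxy-authorization", "te", "trailers",
              "transfer-encoding", "upgrade"}


def environ_to_request(environ):
    """Translate a WSGI environ into (method, path, headers)."""
    path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
    if environ.get("QUERY_STRING"):
        path += "?" + environ["QUERY_STRING"]
    headers = {}
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            # HTTP_USER_AGENT -> User-Agent
            headers[key[5:].replace("_", "-").title()] = value
    for key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        if environ.get(key):
            headers[key.replace("_", "-").title()] = environ[key]
    return environ.get("REQUEST_METHOD", "GET"), path or "/", headers


def make_proxy_app(host, port):
    """Return a WSGI app that forwards each request to host:port."""

    def proxy_app(environ, start_response):
        method, path, headers = environ_to_request(environ)
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length) if length else None
        conn = http.client.HTTPConnection(host, port)
        try:
            conn.request(method, path, body=body, headers=headers)
            resp = conn.getresponse()
            data = resp.read()
            status = "%d %s" % (resp.status, resp.reason)
            out_headers = [(k, v) for k, v in resp.getheaders()
                           if k.lower() not in HOP_BY_HOP]
        finally:
            conn.close()
        start_response(status, out_headers)
        return [data]

    return proxy_app
```

The async server behind it stays an ordinary HTTP server; only this
synchronous shim needs to know anything about WSGI.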


>As for postponing until WSGI 2.0, I would hope there doesn't need to be a 
>WSGI 2.0, though, since the interface is so darn simple. ;) But it could 
>be in a separate WSGI async addons.

Technically, I don't think finalizing the base specification would prevent 
us from amending the PEP to add optional features even to 1.0.

From floydophone at gmail.com  Fri Oct 15 19:24:19 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Fri Oct 15 19:26:47 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <5.1.1.6.0.20041015113034.02c7a7e0@mail.telecommunity.com>
References: <6654eac4041014175977291ff4@mail.gmail.com>
	<5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
	<6654eac404101503573c8cfa7a@mail.gmail.com>
	<5.1.1.6.0.20041015113034.02c7a7e0@mail.telecommunity.com>
Message-ID: <6654eac404101510246ffd970@mail.gmail.com>

Essentially, Twisted.Web gives you something like this:

class MyResource(resource.Resource):
    def render(self, request):
        return "content here" # you could also do request.write("content here")

If you do an async call, you have to use request.write() to write the
data, return server.NOT_DONE_YET from the render() method, and call
request.finish() to finish the request.


On Fri, 15 Oct 2004 11:31:56 -0400, Phillip J. Eby
<pje@telecommunity.com> wrote:
> At 06:57 AM 10/15/04 -0400, Peter Hunt wrote:
> >So if I'm implementing a Twisted gateway, where should
> >request.finish() go? This has been puzzling me for some time...
> 
> What's request.finish()?  I've never done anything with Twisted at a higher
> level than the raw reactor interface, and a bit with Deferreds.  So I'm not
> sure what you're talking about here.
> 
>
From carribeiro at gmail.com  Fri Oct 15 19:51:54 2004
From: carribeiro at gmail.com (Carlos Ribeiro)
Date: Fri Oct 15 19:52:33 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <6654eac404101510246ffd970@mail.gmail.com>
References: <6654eac4041014175977291ff4@mail.gmail.com>
	<5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
	<6654eac404101503573c8cfa7a@mail.gmail.com>
	<5.1.1.6.0.20041015113034.02c7a7e0@mail.telecommunity.com>
	<6654eac404101510246ffd970@mail.gmail.com>
Message-ID: <864d3709041015105131652057@mail.gmail.com>

On Fri, 15 Oct 2004 13:24:19 -0400, Peter Hunt <floydophone@gmail.com> wrote:
> Essentially, Twisted.Web gives you something like this:
> 
> class MyResource(resource.Resource):
>     def render(self, request):
>         return "content here" # you could also do request.write("content here")
> 
> If you do an async call, you have to use request.write() to write the
> data, return server.NOT_DONE_YET from the render() method, and call
> request.finish() to finish the request.

Just curious, so forgive me for jumping into the middle of the
discussion. Isn't this one of the scenarios where output generators
are most useful? Assuming that Twisted supported it, you could yield
lines until there was nothing else to write. Did I get it right?


-- 
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: carribeiro@gmail.com
mail: carribeiro@yahoo.com
From exarkun at divmod.com  Fri Oct 15 20:07:46 2004
From: exarkun at divmod.com (exarkun@divmod.com)
Date: Fri Oct 15 20:07:48 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <864d3709041015105131652057@mail.gmail.com>
Message-ID: <20041015180746.4730.745244974.divmod.quotient.126@ohm>

On Fri, 15 Oct 2004 14:51:54 -0300, Carlos Ribeiro <carribeiro@gmail.com> wrote:
>On Fri, 15 Oct 2004 13:24:19 -0400, Peter Hunt <floydophone@gmail.com> wrote:
> > Essentially, Twisted.Web gives you something like this:
> > 
> > class MyResource(resource.Resource):
> >     def render(self, request):
> >         return "content here" # you could also do request.write("content here")
> > 
> > If you do an async call, you have to use request.write() to write the
> > data, return server.NOT_DONE_YET from the render() method, and call
> > request.finish() to finish the request.
> 
> Just curious, so forgive me for jumping into the middle of the
> discussion. Isn't this one of the scenarios where output generators
> are most useful? Assuming that Twisted supported it, you could yield
> lines until there was nothing else to write. Did I get it right?
> 

  Only if you can also signal to the code which is iterating the
  generator that it should stop iterating it for a while; otherwise user
  code might be called upon for bytes before they are available.
  
  If I have understood the conversation on the matter, then this caveat
  is a main stumbling block for the async wsgi api.
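
  One stop-gap mentioned earlier in the thread is for the application to
  yield an empty string when it has nothing ready, and for the caller to
  delay longer and longer before asking again.  A sketch of such a loop
  follows (current Python 3; the drain() function and its parameters are
  invented for the example, and this is polling, not a real "pause"
  signal):

```python
import time


def drain(app_iter, write, first_delay=0.01, max_delay=1.0,
          sleep=time.sleep):
    """Iterate a WSGI result, backing off while it yields empty strings."""
    delay = first_delay
    for chunk in app_iter:
        if chunk:
            write(chunk)
            delay = first_delay  # data arrived: reset the back-off
        else:
            sleep(delay)  # nothing ready: wait before asking again
            delay = min(delay * 2, max_delay)
```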
  
  Jp
From carribeiro at gmail.com  Fri Oct 15 20:17:02 2004
From: carribeiro at gmail.com (Carlos Ribeiro)
Date: Fri Oct 15 20:17:52 2004
Subject: [Web-SIG] http content-location header, and different browsers
In-Reply-To: <416FBAA4.6060502@xs4all.nl>
References: <416FBAA4.6060502@xs4all.nl>
Message-ID: <864d370904101511173de0ac@mail.gmail.com>

On Fri, 15 Oct 2004 13:55:16 +0200, Irmen de Jong <irmen@xs4all.nl> wrote:
> Hello all,
> I was just trying some new code I was writing for Snakelets with
> different browsers, and stumbled across something weird.
> It has to do with the HTTP Content-Location header.
> 
> What I used to do was adding a Content-Location header in the
> reply, when the page was internally redirected in Snakelets.
> (I thought this was a good idea, based on what I knew about
> the meaning of that header). Everything worked fine. Until I
> opened my website with Opera, instead of Firefox or IE....:
> a few of my links had totally wrong URLs in Opera!
> 
> After a bit of searching I now know that at least Opera implements
> the HTTP specification, which says in
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.14
> that "The value of Content-Location also defines the base URI
> for the entity." So Opera was -rightfully so- using the value
> of the content-location header as the new base URI, and the
> other browsers I tried *do not do that*. Firefox has a WONTFIX-bug
> on this (bugzilla #109553) because they feel that it would break
> a lot of websites that supply faulty content-location headers.
> 
> In the end, I decided to just not generate this header anymore.
> And my site started working in Opera too ;-)
> 
> What do you think of this?

I have limited experience with this. But if Firefox guys decided it
wasn't worth fixing, they're probably correct. God knows how much email
(and bug tickets) they get when something they do works differently
from IE or other 'mainstream' browsers.

BTW... did you try it in Opera using their IE-emulation mode?

-- 
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: carribeiro@gmail.com
mail: carribeiro@yahoo.com
From foom at fuhm.net  Fri Oct 15 20:19:51 2004
From: foom at fuhm.net (James Y Knight)
Date: Fri Oct 15 20:19:55 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <6654eac404101503573c8cfa7a@mail.gmail.com>
References: <6654eac4041014175977291ff4@mail.gmail.com>
	<5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
	<6654eac404101503573c8cfa7a@mail.gmail.com>
Message-ID: <CE243198-1ED6-11D9-AAA6-000A95A50FB2@fuhm.net>

On Oct 15, 2004, at 6:57 AM, Peter Hunt wrote:
> So if I'm implementing a Twisted gateway, where should
> request.finish() go? This has been puzzling me for some time...

You'd call finish when the iterator from the iterable returned by the 
WSGI app is exhausted and raises StopIteration, I think?

James

From pje at telecommunity.com  Fri Oct 15 20:21:40 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Oct 15 20:21:16 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <20041015180746.4730.745244974.divmod.quotient.126@ohm>
References: <864d3709041015105131652057@mail.gmail.com>
Message-ID: <5.1.1.6.0.20041015141622.027125c0@mail.telecommunity.com>

At 06:07 PM 10/15/04 +0000, exarkun@divmod.com wrote:
>On Fri, 15 Oct 2004 14:51:54 -0300, Carlos Ribeiro <carribeiro@gmail.com> 
>wrote:
> >On Fri, 15 Oct 2004 13:24:19 -0400, Peter Hunt <floydophone@gmail.com> 
> wrote:
> > > Essentially, Twisted.Web gives you something like this:
> > >
> > > class MyResource(resource.Resource):
> > >     def render(self, request):
> > >         return "content here" # you could also do 
> request.write("content here")
> > >
> > > If you do an async call, you have to use request.write() to write the
> > > data, return server.NOT_DONE_YET from the render() method, and call
> > > request.finish() to finish the request.
> >
> > Just curious, so forgive me for jumping into the middle of the
> > discussion. Isn't this one of the scenarios where output generators
> > are most useful? Assuming that Twisted supported it, you could yield
> > lines until there was nothing else to write. Did I get it right?
> >
>
>   Only if you can also signal to the code which is iterating the 
> generator that it should stop iterating it for a while, otherwise user 
> code might be called upon for bytes before they are available.
>
>   If I have understood the conversation on the matter, then this caveat is 
> a main stumbling block for the async wsgi api.

Yes.

Essentially, in order to write a Twisted WSGI gateway (for running WSGI 
apps under Twisted), you *must* use threads (e.g. deferToThread) for 
invoking the WSGI application and iterating over its result, because a 
synchronous WSGI app might block during either operation.

However, note that an asynchronous server/gateway is free to delay 
requesting another iteration, if the application yields an empty 
string.  So, the minimum "asynchronous API" is simply backing off the 
iteration rate when the application yields empty strings.

But, a more sophisticated API would of course only iterate when there was 
data available to be iterated.
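
A minimal sketch of that back-off idea, for illustration only. The names
`drive`, `send`, and the delay values are hypothetical and not part of WSGI
or Twisted; a real asynchronous gateway would schedule the next iteration
with its event loop's timer (e.g. reactor.callLater) rather than sleeping.

```python
import time


def drive(app_iter, send, base_delay=0.01, max_delay=0.5, sleep=time.sleep):
    """Iterate a WSGI result, backing off while the app yields empty strings.

    Hypothetical helper: `send` delivers a chunk to the client, `sleep`
    stands in for "ask the event loop to call me back later".
    """
    delay = base_delay
    for chunk in app_iter:
        if chunk:
            send(chunk)
            delay = base_delay              # data arrived: reset the back-off
        else:
            sleep(delay)                    # nothing yet: wait before asking again
            delay = min(delay * 2, max_delay)


def slow_app_iter():
    yield b"first\n"
    yield b""       # the application's way of saying "not ready yet"
    yield b""
    yield b"second\n"


sent, naps = [], []
# Record the requested delays instead of really sleeping.
drive(slow_app_iter(), sent.append, sleep=naps.append)
```

Only non-empty chunks reach `send`, and each consecutive empty string
doubles the wait before the next iteration.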

From pje at telecommunity.com  Fri Oct 15 20:28:25 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Oct 15 20:28:00 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <CE243198-1ED6-11D9-AAA6-000A95A50FB2@fuhm.net>
References: <6654eac404101503573c8cfa7a@mail.gmail.com>
	<6654eac4041014175977291ff4@mail.gmail.com>
	<5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
	<6654eac404101503573c8cfa7a@mail.gmail.com>
Message-ID: <5.1.1.6.0.20041015142243.022869f0@mail.telecommunity.com>

At 02:19 PM 10/15/04 -0400, James Y Knight wrote:
>On Oct 15, 2004, at 6:57 AM, Peter Hunt wrote:
>>So if I'm implementing a Twisted gateway, where should
>>request.finish() go? This has been puzzling me for some time...
>
>You'd call finish when the iterator from the iterable returned by the WSGI 
>app is exhausted and raises StopIteration, I think?

Yes.  A Twisted gateway, to avoid blocking, would need to deferToThread() 
the initial invocation of the WSGI app, and immediately return 
server.NOT_DONE_YET.  A callback on the deferred would then deferToThread 
an iteration on the return iterable, which would in turn defer to the next 
iteration, and so on.  When you get an errback() of StopIteration instead 
of a callback, you could finish().

But all invocations of the application or of any method of any object 
provided by the application *have* to be in a non-reactor thread, so as not 
to block the reactor.  For example, there's no guarantee that simply 
calling 'iter(result)' on the result returned by the application, won't 
e.g. open a database connection or something.
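
The chain described above can be sketched without Twisted. This version
swaps in the standard library's ThreadPoolExecutor for deferToThread and
plain callbacks for Deferreds, so it shows only the shape of the idea;
`run_in_workers`, `write`, and `finish` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_in_workers(app, environ, start_response, write, finish, pool):
    """Call app(), iter() and every next() in pool threads, never in the caller."""

    def start():
        # Both the application call and iter() may block, so do both here.
        return iter(app(environ, start_response))

    def on_started(future):
        step(future.result())

    def step(iterator):
        # Defer one iteration; its completion schedules the next one.
        pool.submit(next, iterator).add_done_callback(
            lambda fut: on_chunk(fut, iterator))

    def on_chunk(future, iterator):
        try:
            chunk = future.result()
        except StopIteration:
            finish()        # iterator exhausted: the request is done
            return
        # A real gateway would hand the chunk back to its I/O thread here.
        write(chunk)
        step(iterator)

    pool.submit(start).add_done_callback(on_started)


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world"]


chunks, done = [], threading.Event()
with ThreadPoolExecutor(max_workers=2) as pool:
    run_in_workers(demo_app, {}, lambda status, headers: None,
                   chunks.append, done.set, pool)
    done.wait(timeout=5)
```

Error handling is left out: an exception other than StopIteration would
need an errback-style path that aborts the response.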

From exarkun at divmod.com  Fri Oct 15 20:32:39 2004
From: exarkun at divmod.com (exarkun@divmod.com)
Date: Fri Oct 15 20:32:41 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <5.1.1.6.0.20041015142243.022869f0@mail.telecommunity.com>
Message-ID: <20041015183239.4730.1265718043.divmod.quotient.138@ohm>

On Fri, 15 Oct 2004 14:28:25 -0400, "Phillip J. Eby" <pje@telecommunity.com> wrote:
>At 02:19 PM 10/15/04 -0400, James Y Knight wrote:
> >On Oct 15, 2004, at 6:57 AM, Peter Hunt wrote:
> >>So if I'm implementing a Twisted gateway, where should
> >>request.finish() go? This has been puzzling me for some time...
> >
> >You'd call finish when the iterator from the iterable returned by the WSGI 
> >app is exhausted and raises StopIteration, I think?
> 
> Yes.  A Twisted gateway, to avoid blocking, would need to deferToThread() 
> the initial invocation of the WSGI app, and immediately return 
> server.NOT_DONE_YET.  A callback on the deferred would then deferToThread 
> an iteration on the return iterable, which would in turn defer to the next 
> iteration, and so on.  When you get an errback() of StopIteration instead 
> of a callback, you could finish().
> 
> But all invocations of the application or any  method of any object 
> provided by the application *has* to be in a non-reactor thread, so as not 
> to block the reactor.  For example, there's no guarantee that simply 
> calling 'iter(result)' on the result returned by the application, won't 
> e.g. open a database connection or something.
>

  Does WSGI enforce any requirements about which thread the function is first invoked in, and which thread(s) it is iterated in?  The scenario you described above would lead to an arbitrary thread being used for each iteration.  I could see this being a problem for WSGI applications which attempted to use thread local storage, assuming that they would always be run in the same non-IO thread.

  Jp
From pje at telecommunity.com  Fri Oct 15 20:59:16 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Oct 15 20:58:52 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <20041015183239.4730.1265718043.divmod.quotient.138@ohm>
References: <5.1.1.6.0.20041015142243.022869f0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041015145625.02b28210@mail.telecommunity.com>

At 06:32 PM 10/15/04 +0000, exarkun@divmod.com wrote:
>On Fri, 15 Oct 2004 14:28:25 -0400, "Phillip J. Eby" 
><pje@telecommunity.com> wrote:
> >At 02:19 PM 10/15/04 -0400, James Y Knight wrote:
> > >On Oct 15, 2004, at 6:57 AM, Peter Hunt wrote:
> > >>So if I'm implementing a Twisted gateway, where should
> > >>request.finish() go? This has been puzzling me for some time...
> > >
> > >You'd call finish when the iterator from the iterable returned by the 
> WSGI
> > >app is exhausted and raises StopIteration, I think?
> >
> > Yes.  A Twisted gateway, to avoid blocking, would need to deferToThread()
> > the initial invocation of the WSGI app, and immediately return
> > server.NOT_DONE_YET.  A callback on the deferred would then deferToThread
> > an iteration on the return iterable, which would in turn defer to the next
> > iteration, and so on.  When you get an errback() of StopIteration instead
> > of a callback, you could finish().
> >
> > But all invocations of the application or any  method of any object
> > provided by the application *has* to be in a non-reactor thread, so as not
> > to block the reactor.  For example, there's no guarantee that simply
> > calling 'iter(result)' on the result returned by the application, won't
> > e.g. open a database connection or something.
> >
>
>   Does WSGI enforce any requirements about which thread the function is 
> first invoked in, and which thread(s) it is iterated in?

Not currently.


>   The scenario you described above would lead to an arbitrary thread 
> being used for each iteration.  I could see this being a problem for WSGI 
> applications which attempted to use thread local storage, assuming that 
> they would always be run in the same non-IO thread.

The discussion so far has been that the spec should prohibit applications 
and servers from depending on what thread a callable is invoked from, the 
result is iterated over, etc., as long as only one thread at a time does 
these things.  In other words, servers and applications may not use 
thread-local storage to determine invocation context, but they do not have 
to do any locking (except for the 'wsgi.multithread' case).
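
A small sketch of why thread-local storage is unreliable under such a
gateway, assuming a server that invokes the application in one thread and
iterates over the result in another. The 'myapp.user' key is a made-up
example of per-request state carried in environ instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    _local.user = "alice"               # fragile: tied to the calling thread
    environ["myapp.user"] = "alice"     # portable: travels with the request

    def body():
        from_local = getattr(_local, "user", None)
        yield ("thread-local: %r\n" % (from_local,)).encode("ascii")
        yield ("environ: %r\n" % (environ["myapp.user"],)).encode("ascii")

    return body()


# A gateway that invokes the app in one thread and iterates in another.
environ = {}
with ThreadPoolExecutor(max_workers=1) as calling:
    result = calling.submit(app, environ, lambda status, headers: None).result()
with ThreadPoolExecutor(max_workers=1) as iterating:
    output = iterating.submit(lambda: b"".join(result)).result()
```

The value stored in the thread-local is invisible to the iterating thread,
while the value stored in environ survives the thread switch.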

From floydophone at gmail.com  Fri Oct 15 21:06:37 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Fri Oct 15 21:07:23 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <5.1.1.6.0.20041015142243.022869f0@mail.telecommunity.com>
References: <6654eac4041014175977291ff4@mail.gmail.com>
	<5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
	<6654eac404101503573c8cfa7a@mail.gmail.com>
	<CE243198-1ED6-11D9-AAA6-000A95A50FB2@fuhm.net>
	<5.1.1.6.0.20041015142243.022869f0@mail.telecommunity.com>
Message-ID: <6654eac4041015120665163b25@mail.gmail.com>

Okay. How will the gateway know to go to the next iteration of the
application? Constantly iterating over a bunch of empty strings while
waiting for output seems like a waste of cycles to me. Perhaps, for
async apps, there can be an environ["async.wakeup"]() method which
will tell the gateway to iterate until the next empty string?


On Fri, 15 Oct 2004 14:28:25 -0400, Phillip J. Eby
<pje@telecommunity.com> wrote:
> At 02:19 PM 10/15/04 -0400, James Y Knight wrote:
> 
> 
> >On Oct 15, 2004, at 6:57 AM, Peter Hunt wrote:
> >>So if I'm implementing a Twisted gateway, where should
> >>request.finish() go? This has been puzzling me for some time...
> >
> >You'd call finish when the iterator from the iterable returned by the WSGI
> >app is exhausted and raises StopIteration, I think?
> 
> Yes.  A Twisted gateway, to avoid blocking, would need to deferToThread()
> the initial invocation of the WSGI app, and immediately return
> server.NOT_DONE_YET.  A callback on the deferred would then deferToThread
> an iteration on the return iterable, which would in turn defer to the next
> iteration, and so on.  When you get an errback() of StopIteration instead
> of a callback, you could finish().
> 
> But all invocations of the application or any  method of any object
> provided by the application *has* to be in a non-reactor thread, so as not
> to block the reactor.  For example, there's no guarantee that simply
> calling 'iter(result)' on the result returned by the application, won't
> e.g. open a database connection or something.
> 
>
From pje at telecommunity.com  Sat Oct 16 00:07:05 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Oct 16 00:06:42 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <6654eac4041015120665163b25@mail.gmail.com>
References: <5.1.1.6.0.20041015142243.022869f0@mail.telecommunity.com>
	<6654eac4041014175977291ff4@mail.gmail.com>
	<5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
	<6654eac404101503573c8cfa7a@mail.gmail.com>
	<CE243198-1ED6-11D9-AAA6-000A95A50FB2@fuhm.net>
	<5.1.1.6.0.20041015142243.022869f0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041015175958.023acd60@mail.telecommunity.com>

At 03:06 PM 10/15/04 -0400, Peter Hunt wrote:
>Okay. How will the gateway know to go to the next iteration of the
>application? Constantly iterating over a bunch of empty strings while
>waiting for output seems like a waste of cycles to me. Perhaps, for
>async apps, there can be an environ["async.wakeup"]() method which
>will tell the gateway to iterate until the next empty string?

That's close to the first outstanding proposal for an async API, which went 
something like:

    resume = environ["wsgi.pause_iteration"]()

Which would pause subsequent iteration until 'resume()' was called.
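
A rough, single-threaded sketch of how a gateway might honour that key.
'wsgi.pause_iteration' is only the proposal under discussion here, and
`TinyLoop` and `serve` are hypothetical stand-ins for a real reactor and
gateway.

```python
from collections import deque


class TinyLoop:
    """Hypothetical stand-in for a reactor: a queue of callables."""

    def __init__(self):
        self.ready = deque()

    def call_soon(self, fn):
        self.ready.append(fn)

    def run(self):
        while self.ready:
            self.ready.popleft()()


def serve(app, loop, write):
    paused = False

    def pause_iteration():
        nonlocal paused
        paused = True

        def resume():
            nonlocal paused
            paused = False
            loop.call_soon(pump)    # start iterating again

        return resume

    environ = {"wsgi.pause_iteration": pause_iteration}
    iterator = iter(app(environ, lambda status, headers: None))

    def pump():
        for chunk in iterator:
            if chunk:
                write(chunk)
            if paused:
                return              # stop until resume() reschedules us

    loop.call_soon(pump)


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"before\n"
    resume = environ["wsgi.pause_iteration"]()
    loop.call_soon(resume)          # pretend some event completes later
    yield b""
    yield b"after\n"


loop, out = TinyLoop(), []
serve(demo_app, loop, out.append)
loop.run()
```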

By the way, if you're trying to implement async applications under WSGI, 
I'd really like to know more about what you have in mind, what your goals 
are, etc.  One of the problems in formulating a good WSGI API for async 
applications is that it's hard to envision use cases where somebody wants 
to write an async web application, and yet doesn't want to run it in a 
dedicated process.  So anything you could add to enlighten me on this point 
would make it easier for me to finalize an async API.  I've been leaving it 
up to the SIG so far, because I don't have as strong a vision of the use 
cases for async apps as I do for async servers.

From irmen at xs4all.nl  Sat Oct 16 01:59:04 2004
From: irmen at xs4all.nl (Irmen de Jong)
Date: Sat Oct 16 01:59:06 2004
Subject: [Web-SIG] http content-location header, and different browsers
In-Reply-To: <864d370904101511173de0ac@mail.gmail.com>
References: <416FBAA4.6060502@xs4all.nl>
	<864d370904101511173de0ac@mail.gmail.com>
Message-ID: <41706448.3020506@xs4all.nl>

Carlos Ribeiro wrote:
[....about Content-Location header...]
> I have limited experience with this. But if Firefox guys decided it
> wasnt worth fixing, they're probably correct. God knows how much email
> (and bug tickets) they get when something they do works differently
> from IE or other 'mainstream' browsers.

It was news to me too. I always thought that Mozilla (/Firefox)
followed the RFCs to the letter. But this was the first one that
I encountered that they deliberately chose *not* to implement,
because if they did, it would (apparently) break a lot of sites
and people would start to blame Mozilla.

I wonder what Opera users do with this. Because Opera
will break those sites...


> BTW... did you try it in Opera using their IE-emulation mode?

Ehm, isn't it just a change in the User-Agent string?
That wouldn't make any difference...

--Irmen


From floydophone at gmail.com  Sat Oct 16 02:05:08 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Sat Oct 16 02:05:10 2004
Subject: [Web-SIG] WSGI async API
In-Reply-To: <5.1.1.6.0.20041015175958.023acd60@mail.telecommunity.com>
References: <6654eac4041014175977291ff4@mail.gmail.com>
	<5.1.1.6.0.20041015014108.02c1ec00@mail.telecommunity.com>
	<6654eac404101503573c8cfa7a@mail.gmail.com>
	<CE243198-1ED6-11D9-AAA6-000A95A50FB2@fuhm.net>
	<5.1.1.6.0.20041015142243.022869f0@mail.telecommunity.com>
	<6654eac4041015120665163b25@mail.gmail.com>
	<5.1.1.6.0.20041015175958.023acd60@mail.telecommunity.com>
Message-ID: <6654eac404101517055b4d4758@mail.gmail.com>

Here's what I was thinking.

To install an application on Twisted, you'd provide the application
callable and an optional boolean parameter that says whether to run it
async or sync (defaults to sync). If it runs sync, the gateway launches
the WSGI app in a new thread and does business as usual. If it runs
async, the app *cannot* block; there needs to be a way around that.
How about I write a simple demo implementation of "wakeup" and you
guys can try it out?


On Fri, 15 Oct 2004 18:07:05 -0400, Phillip J. Eby
<pje@telecommunity.com> wrote:
> At 03:06 PM 10/15/04 -0400, Peter Hunt wrote:
> >Okay. How will the gateway know to go to the next iteration of the
> >application? Constantly iterating over a bunch of empty strings while
> >waiting for output seems like a waste of cycles to me. Perhaps, for
> >async apps, there can be an environ["async.wakeup"]() method which
> >will tell the gateway to iterate until the next empty string?
> 
> That's close to the first outstanding proposal for an async API, which went
> something like:
> 
>    resume = environ["wsgi.pause_iteration"]()
> 
> Which would pause subsequent iteration until 'resume()' was called.
> 
> By the way, if you're trying to implement async applications under WSGI,
> I'd really like to know more about what you have in mind, what your goals
> are, etc.  One of the problems in formulating a good WSGI API for async
> applications is that it's hard to envision use cases where somebody wants
> to write an async web application, and yet doesn't want to run it in a
> dedicated process.  So anything you could add to enlighten me on this point
> would make it easier for me to finalize an async API.  I've been leaving it
> up to the SIG so far, because I don't have as strong a vision of the use
> cases for async apps as I do for async servers.
> 
>
From foom at fuhm.net  Sat Oct 16 02:37:00 2004
From: foom at fuhm.net (James Y Knight)
Date: Sat Oct 16 02:37:07 2004
Subject: [Web-SIG] http content-location header, and different browsers
In-Reply-To: <41706448.3020506@xs4all.nl>
References: <416FBAA4.6060502@xs4all.nl>
	<864d370904101511173de0ac@mail.gmail.com>
	<41706448.3020506@xs4all.nl>
Message-ID: <7E0632B6-1F0B-11D9-AAA6-000A95A50FB2@fuhm.net>


On Oct 15, 2004, at 7:59 PM, Irmen de Jong wrote:

> Carlos Ribeiro wrote:
> [....about Content-Location header...]
>> I have limited experience with this. But if Firefox guys decided it
>> wasnt worth fixing, they're probably correct. God knows how much email
>> (and bug tickets) they get when something they do works differently
>> from IE or other 'mainstream' browsers.
>
> It was news for me too. I always thought that Mozilla(/firefox)
> followed the RFCs to the letter. But this was the first one that
> I encountered that they deliberatly chose *not* to implement.
> Because if they did, it would break a lot of sites (apparently)
> and people start to blame Mozilla.
>
> I wonder what Opera users do with this. Because Opera
> will break those sites...

Actually, Opera also doesn't follow the RFC, as I recall. It only 
listens to Content-Location if the host and port match those of the 
real URL. This fixes the problems with IIS servers, which are most of 
the broken sites.
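
The rule as described can be sketched as follows. This is only an
illustration of "use Content-Location as the base when host and port
match"; whether any browser implements exactly this is an assumption, and
`effective_base` is a made-up helper name.

```python
from urllib.parse import urljoin, urlsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def _host_port(parts):
    # Treat a missing port as the scheme's default port.
    return parts.hostname, parts.port or _DEFAULT_PORTS.get(parts.scheme)


def effective_base(request_url, content_location=None):
    """Return the base URI a client following this heuristic would use."""
    if not content_location:
        return request_url
    candidate = urljoin(request_url, content_location)   # header may be relative
    if _host_port(urlsplit(candidate)) == _host_port(urlsplit(request_url)):
        return candidate
    return request_url          # ignore headers pointing at another host/port


page = "http://example.com/docs/index"
same = effective_base(page, "/archive/2004/index.html")
other = effective_base(page, "http://10.0.0.5/archive/index.html")
link_same = urljoin(same, "pic.png")
link_other = urljoin(other, "pic.png")
```

With a matching host the relative link resolves against the
Content-Location value; with a mismatched host it resolves against the
requested URL, which is what the other browsers do in every case.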

James

From carribeiro at gmail.com  Sat Oct 16 02:51:40 2004
From: carribeiro at gmail.com (Carlos Ribeiro)
Date: Sat Oct 16 02:51:42 2004
Subject: [Web-SIG] http content-location header, and different browsers
In-Reply-To: <41706448.3020506@xs4all.nl>
References: <416FBAA4.6060502@xs4all.nl>
	<864d370904101511173de0ac@mail.gmail.com> <41706448.3020506@xs4all.nl>
Message-ID: <864d37090410151751316a19ab@mail.gmail.com>

On Sat, 16 Oct 2004 01:59:04 +0200, Irmen de Jong <irmen@xs4all.nl> wrote:
> > BTW... did you try it in Opera using their IE-emulation mode?
> 
> Ehm, isn't it just a change in the User-Agent string?
> That wouldn't make any difference...

I'm really not sure if it's only a User-Agent 'hack' or if it affects
other aspects of the browser. I've read a *lot* of CSS-specific tricks
over the past few days, and it's amazing how many things a modern
browser has to do to properly render real-world web sites. In the end
I was under the impression that Opera did more than just mimic the IE
User-Agent string (which it does just to fool JavaScript code) - it
actually has to use other 'hints' to be able to establish how it is
supposed to behave in certain situations. But I'm not an Opera user,
and it's possible that I just got it wrong.

-- 
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: carribeiro@gmail.com
mail: carribeiro@yahoo.com
From floydophone at gmail.com  Sat Oct 16 04:16:11 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Sat Oct 16 04:16:13 2004
Subject: [Web-SIG] Async API - example of my implementation
Message-ID: <6654eac40410151916c0434dd@mail.gmail.com>

I'm working on getting subversion running again, but for now, take a
look at how I write my Twisted WSGI async apps.

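import time

from twisted.internet import defer, reactor

def blocking_call():
    # Stand-in for a slow operation: a Deferred that fires after 2 seconds.
    d = defer.Deferred()
    reactor.callLater(2, d.callback, None)
    return d

def phase2(result, environ):
    environ["thetime"] = time.time()
    environ["twisted.wsgi.resume"]()  # tell the gateway to iterate again

def blocking_async_app(environ, start_response):
    write = start_response("200 OK", [("Content-type","text/plain")])
    yield "the time right now is " + repr(time.time()) + "\n"
    blocking_call().addCallback(phase2, environ)
    yield ""  # the gateway stops iterating here until resume is called
    yield "the time now is " + repr(environ["thetime"])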

Is this acceptable?

Basically, when in the special async mode, the gateway iterates over
the application iterator until it hits a "". It then lets the app do
its thing until environ["twisted.wsgi.resume"]() is called, at which
point it repeats this process until StopIteration.

What do you think?
From pje at telecommunity.com  Sat Oct 16 07:47:19 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Oct 16 07:46:59 2004
Subject: [Web-SIG] Async API - example of my implementation
In-Reply-To: <6654eac40410151916c0434dd@mail.gmail.com>
Message-ID: <5.1.1.6.0.20041016014138.03381ec0@mail.telecommunity.com>

At 10:16 PM 10/15/04 -0400, Peter Hunt wrote:
>I'm working on getting subversion running again, but for now, take a
>look at how I write my Twisted WSGI async apps.
>
>def blocking_call():
>     d = defer.Deferred()
>     reactor.callLater(2, d.callback, None)
>     return d
>
>def phase2(result, environ):
>     environ["thetime"] = time.time()
>     environ["twisted.wsgi.resume"]()
>
>def blocking_async_app(environ, start_response):
>     write = start_response("200 OK", [("Content-type","text/plain")])
>     yield "the time right now is " + `time.time()` + "\n"
>     blocking_call().addCallback(phase2, environ)
>     yield ""
>     yield "the time now is " + `environ["thetime"]`
>
>Is this acceptible?
>
>Basically, when in the special async mode, the gateway iterates over
>the application iterator until it hits a "". It then lets the app do
>its thing until environ["twisted.wsgi.resume"]() is called, at which
>point it repeats this process until StopIteration.
>
>What do you think?

I think an explicit pause operation is better, e.g.:

     def blocking_async_app(environ,start_response):
         start_response("200 OK", [("Content-type","text/plain")])
         yield "doing something"

         resume = environ['wsgi.pause_iteration']()
         def phase2(result):
             environ["thetime"] = time.time()
             resume()

         blocking_call().addCallback(phase2)
         yield ""

         # Won't get here till 'resume' is called
         yield "the time now is " + repr(environ["thetime"])


This is basically the first of the two alternative API proposals that are 
currently outstanding.  One issue that is not addressed either in your 
example or in the previous proposal is error handling/timeouts.  Suppose 
resume() is never called?   How do we define what "never" is?  This is just 
one open issue with the current async API proposals.
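
One possible answer, sketched for a threaded (blocking) gateway: treat
"never" as a server-chosen timeout and abort the response when it expires.
`iterate_with_pause`, `ResumeTimeout`, and the 30-second default are all
hypothetical; nothing in the proposals fixes these details.

```python
import threading


class ResumeTimeout(Exception):
    """Raised by this sketch when resume() is not called in time."""


def iterate_with_pause(app, write, timeout=30.0):
    resumed = threading.Event()
    resumed.set()                           # not paused initially

    def pause_iteration():
        resumed.clear()
        return resumed.set                  # calling the result resumes iteration

    environ = {"wsgi.pause_iteration": pause_iteration}
    result = app(environ, lambda status, headers: None)
    try:
        for chunk in result:
            if chunk:
                write(chunk)
            if not resumed.wait(timeout):   # blocks only while paused
                raise ResumeTimeout("no resume() within %.1fs" % timeout)
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()                         # let the app release its resources


def polite_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"start\n"
    resume = environ["wsgi.pause_iteration"]()
    threading.Timer(0.05, resume).start()   # resume shortly afterwards
    yield b""
    yield b"done\n"


def forgetful_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"start\n"
    environ["wsgi.pause_iteration"]()       # resume() is never called
    yield b""
    yield b"never sent\n"


ok, stuck = [], []
iterate_with_pause(polite_app, ok.append, timeout=2.0)
try:
    iterate_with_pause(forgetful_app, stuck.append, timeout=0.1)
    timed_out = False
except ResumeTimeout:
    timed_out = True
```

An asynchronous gateway would set a timer instead of blocking in wait(),
but the policy question (how long, and what the client sees) is the same.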

From pje at telecommunity.com  Sun Oct 17 15:47:03 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Oct 17 15:46:54 2004
Subject: [Web-SIG] FYI: Changes to PEP 333 and wsgiref
Message-ID: <5.1.1.6.0.20041017093759.02428ec0@mail.telecommunity.com>

I noticed today that the "URL Reconstruction" algorithm in the PEP (which I 
also copied into wsgiref) is incorrect.  HTTP_HOST (aka the 'Host:' header) 
can contain a port, if it's not the default port for the corresponding 
protocol.  So, SERVER_PORT should not be appended to it in that case.  I've 
fixed the PEP and wsgiref.  (The PEP update should be visible on python.org 
within an hour or two.)
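
For reference, a sketch of the corrected logic in present-day Python. It
follows the reasoning above (prefer HTTP_HOST as given, otherwise fall back
to SERVER_NAME plus a non-default SERVER_PORT); it is not the PEP's exact
text, and `reconstruct_url` is just a name chosen for this example.

```python
from urllib.parse import quote


def reconstruct_url(environ):
    scheme = environ["wsgi.url_scheme"]
    url = scheme + "://"
    if environ.get("HTTP_HOST"):
        # The Host header already carries ":port" when it is non-default,
        # so SERVER_PORT must not be appended again.
        url += environ["HTTP_HOST"]
    else:
        url += environ["SERVER_NAME"]
        default_port = "443" if scheme == "https" else "80"
        if environ["SERVER_PORT"] != default_port:
            url += ":" + environ["SERVER_PORT"]
    url += quote(environ.get("SCRIPT_NAME", ""))
    url += quote(environ.get("PATH_INFO", ""))
    if environ.get("QUERY_STRING"):
        url += "?" + environ["QUERY_STRING"]
    return url


with_host = reconstruct_url({
    "wsgi.url_scheme": "http", "HTTP_HOST": "example.com:8080",
    "SERVER_NAME": "internal", "SERVER_PORT": "8080",
    "SCRIPT_NAME": "/app", "PATH_INFO": "/a b", "QUERY_STRING": "x=1",
})
without_host = reconstruct_url({
    "wsgi.url_scheme": "http",
    "SERVER_NAME": "internal", "SERVER_PORT": "80",
    "SCRIPT_NAME": "/app", "PATH_INFO": "/a b", "QUERY_STRING": "x=1",
})
```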

In addition, I've also updated the PEP to make SERVER_PROTOCOL a required 
environ variable.  It's impossible to comply with the HTTP RFCs if you 
don't know what HTTP version the client is using.  (Despite its name, 
SERVER_PROTOCOL is actually the *client* protocol: "the name and revision 
of the information protocol with which the request arrived", according to 
the CGI spec.)

Finally, while making the updates, I also added a notation to the effect 
that 'wsgi.errors' is intended to be a "text mode" file.  This was always 
the intent, but the fact wasn't documented.

From floydophone at gmail.com  Mon Oct 18 03:46:01 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Mon Oct 18 03:46:03 2004
Subject: [Web-SIG] Exciting new developments :)
Message-ID: <6654eac404101718463d56cc3c@mail.gmail.com>

- My Twisted WSGI implementation is now fully functional and tested
synchronously. The async API is broken. It's also now built upon
Phillip's wsgiref library.

- I've written a WSGI object publisher, similar to Zope's ZPublisher.
It's extremely simple, but rather nice I'd say:

def publisher_application(root):
    """
    I'm a ZPublisher-like application, except I run everything as a
    WSGI application or coerce it to a string.
    If you don't want something published, start it with an underscore.
    Possible TODOs: security, list of what attributes are accessible
                    via the web.
                    insert base href optionally
                    should str() objects Content-type be text/plain or
                    text/html?
    """
    def app(environ, start_response):
        o = root  # start at the root
        # iterate through every item in the path coming after this script
        for elem in environ.get("PATH_INFO", "").split("/"):
            # if the element isn't blank and doesn't begin with an underscore...
            if len(elem) > 0 and elem[0] != "_":
                try:
                    o = getattr(o, elem)  # try to get the next part of the path
                except AttributeError:  # if it's not found...
                    # return a 404 and a nice little message
                    start_response("404 Not Found",
                                   [("Content-type", "text/plain")])
                    return ["Resource not found."]
        if callable(o):  # if the final object is callable...
            return o(environ, start_response)  # call it as a WSGI application
        else:
            # otherwise, assume it's just a string.
            start_response("200 OK", [("Content-type", "text/html")])
            return [str(o)]
    return app

- If you've heard of FlowScript, I've implemented something very
similar for WSGI on Stackless. It lets you write applications without
worrying about writing FSM's. Once I get a good example, I'll post it.

- I fixed up Ian Bicking's session middleware a bit to be more
browser, OS, and machine friendly. I also removed all of its external
dependencies and integrated it with my cookie middleware

- My cookie middleware is now stable

- I've started putting together a WSGI unit tests library...would
anyone like to contribute?

Since I have no hosting as of right now, I can't post any of this cool
stuff. However, once it's back up, I'll send a message to the list.
From pje at telecommunity.com  Wed Oct 20 18:50:08 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct 20 18:49:40 2004
Subject: [Web-SIG] Re: PEP 333 / unittest
In-Reply-To: <Pine.GSO.4.58.0410201157100.4581@qew.cs>
Message-ID: <5.1.1.6.0.20041020124548.022a59a0@mail.telecommunity.com>

At 11:58 AM 10/20/04 -0400, Greg Wilson wrote:
>Hi Phillip.  Hope you don't mind mail out of the blue, but I was wondering
>if anyone had already done work to integrate WSGI and the unit test
>framework, i.e. built a mock-WSGI that could be dropped directly into
>unittest?

Check the SIG list archives; there are people who have talked about various 
tests they've done.  I don't know if any of their work qualifies as what 
you're talking about.

'wsgiref' also has some simple unit tests that run simple WSGI applications 
under a "server" to test the server handlers, but I didn't really make any 
effort for them to be generic beyond the scope of server implementations 
based on wsgiref.  And the wsgiref handlers have lots of 'assert' 
statements in them designed to cause a crash if you run a broken 
application under a wsgiref-based server.  That's about all I've done in 
the area of testing.

I seem to recall Ian Bicking created a few WSGI test programs, including a 
'lint' middleware to run between a server and an application, testing both 
for compliance, and an 'echo' application to be used by an external test 
script verifying a server's compliance.

At this point, I would say that all of these various tests are preliminary, 
and there has been little or no interoperability testing to verify that the 
tests themselves are correct.

From ianb at colorstudy.com  Wed Oct 20 19:13:40 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Oct 20 19:14:07 2004
Subject: [Web-SIG] Re: PEP 333 / unittest
In-Reply-To: <5.1.1.6.0.20041020124548.022a59a0@mail.telecommunity.com>
References: <5.1.1.6.0.20041020124548.022a59a0@mail.telecommunity.com>
Message-ID: <41769CC4.5080102@colorstudy.com>

Phillip J. Eby wrote:
> At 11:58 AM 10/20/04 -0400, Greg Wilson wrote:
> 
>> Hi Phillip.  Hope you don't mind mail out of the blue, but I was 
>> wondering
>> if anyone had already done work to integrate WSGI and the unit test
>> framework, i.e. built a mock-WSGI that could be dropped directly into
>> unittest?

Kind of, depending on which part of WSGI is "mock".  The echo 
application is intended to be, essentially, a mock application.  There 
are unittests that work against that application, implicitly testing the 
WSGI server.

lint checks for more compliance issues, mostly trying to determine that 
errors don't quietly pass through (e.g., not supplying a content type, a 
problem which many servers and browsers will cover up).
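
To give a flavour of the approach, here is a much-reduced sketch of a
lint-style middleware. It is not the actual lint code; the function name
and the two checks are chosen only for illustration, and a real checker
would need exceptions for responses such as 204 and 304 that carry no body.

```python
def lint(app):
    """Hypothetical, much-reduced 'lint' middleware: checks only two things."""

    def checked_app(environ, start_response):
        def checked_start_response(status, headers, exc_info=None):
            # Fail loudly instead of letting the server paper over mistakes.
            if not (status[:3].isdigit() and status[3:4] == " "):
                raise AssertionError(
                    "status must look like '200 OK', got %r" % (status,))
            names = [name.lower() for name, _ in headers]
            if "content-type" not in names:
                raise AssertionError("no Content-Type header supplied")
            if exc_info is None:
                return start_response(status, headers)
            return start_response(status, headers, exc_info)

        return app(environ, checked_start_response)

    return checked_app


def sloppy_app(environ, start_response):
    start_response("200 OK", [])        # forgot the content type
    return [b"hi"]


try:
    lint(sloppy_app)({}, lambda status, headers, exc_info=None: None)
    caught = None
except AssertionError as exc:
    caught = str(exc)
```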

wsgilib.interactive is something I've created for inspecting 
applications.  I think wsgiref has something similar -- just creating a 
mock request, and providing a response.  It would be nice to make a 
response object that was appropriate for testing -- that might mean easy 
methods to test for a string in the response, check for general success 
(e.g., 200 status code, no application-generated errors, etc.), maybe 
check what shows up in the error log.

 From there, you could make a unittest.TestCase subclass that automated 
this a bit further, so you could quickly write acceptance/functional tests.
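
A sketch of what such a TestCase subclass might look like, written against
today's standard library. `WSGITestCase`, its `get` method, and `hello_app`
are hypothetical names; only wsgiref.util.setup_testing_defaults and
unittest are real library pieces.

```python
import io
import unittest
from wsgiref.util import setup_testing_defaults


class WSGITestCase(unittest.TestCase):
    """Base class: set `app` in a subclass, then call self.get(path)."""

    app = None

    def get(self, path="/"):
        environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
        setup_testing_defaults(environ)     # fill in the other required keys
        errors = io.StringIO()
        environ["wsgi.errors"] = errors     # capture the app's error log
        captured, parts = {}, []

        def start_response(status, headers, exc_info=None):
            captured["status"], captured["headers"] = status, headers
            return parts.append             # the legacy write() callable

        result = self.app(environ, start_response)
        try:
            parts.extend(result)
        finally:
            if hasattr(result, "close"):
                result.close()
        return (captured["status"], dict(captured["headers"]),
                b"".join(parts), errors.getvalue())


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from " + environ["PATH_INFO"].encode("latin-1")]


class HelloTest(WSGITestCase):
    app = staticmethod(hello_app)

    def test_hello(self):
        status, headers, body, errors = self.get("/greet")
        self.assertTrue(status.startswith("200"))
        self.assertEqual(headers["Content-Type"], "text/plain")
        self.assertIn(b"/greet", body)
        self.assertEqual(errors, "")        # nothing written to the error log


suite = unittest.defaultTestLoader.loadTestsFromTestCase(HelloTest)
outcome = unittest.TestResult()
suite.run(outcome)
```

Because the application is called directly, no server or port needs to be
configured for the test run.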

But, a lot of the acceptance test work could be done through HTTP 
directly, and wouldn't be much more difficult to implement.  The 
advantage to using WSGI instead of HTTP would be in saving some work 
doing configuration.  That's very possibly worth it, since configuring a 
test environment is annoying (since you'll never actively use it).  But 
since HTTP and WSGI are so close, it might be nice to allow either to be 
tested using the same framework.

My code is in svn://colorstudy.com/trunk/WSGI ; Peter Hunt has also done 
some stuff similar to the echo tests, in 
svn://colorstudy.com/trunk/WSGI/phunt/test_applications.py

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From floydophone at gmail.com  Sun Oct 24 20:39:56 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Sun Oct 24 20:39:58 2004
Subject: [Web-SIG] I put up my WSGI code again
Message-ID: <6654eac404102411392020722f@mail.gmail.com>

http://st0rm.hopto.org:8080/wsgi/

Apache died on me...so I put up a Zope3 server for the time being.
From titus at caltech.edu  Sun Oct 24 20:49:40 2004
From: titus at caltech.edu (Titus Brown)
Date: Sun Oct 24 20:49:43 2004
Subject: [Web-SIG] I put up my WSGI code again
In-Reply-To: <6654eac404102411392020722f@mail.gmail.com>
References: <6654eac404102411392020722f@mail.gmail.com>
Message-ID: <20041024184940.GB21864@caltech.edu>

-> http://st0rm.hopto.org:8080/wsgi/
-> 
-> Apache died on me...so I put up a Zope3 server for the time being.

Hi all,

I've noticed that a few people seem to lack stable Web hosting setups.
I have a co-located server that is nowhere near to capacity; I'd be
happy to set up individual WebDAV access for people posting Python+WWW
code.  I can also give you virtual domains etc., either under idyll.org
or whatever domain(s) you own.

Just drop me a private line & I'll set you up...

cheers,
--titus
From neel at mediapulse.com  Thu Oct 28 22:41:31 2004
From: neel at mediapulse.com (Michael C. Neel)
Date: Thu Oct 28 22:35:56 2004
Subject: [Web-SIG] [ANNOUNCE] SnakeSkin 1.0
Message-ID: <1098996090.3838.142.camel@mike.mediapulse.com>

We are proud to announce the release of version 1.0 of SnakeSkin, a
python application toolkit released under an Open Source BSD-Style
license, available at http://snakeskin.pseudocode.net/  Along with this
release, we have updated the CGI Demo to be easier to install and
follow.

Both of these releases can be found at (along with more information
including change logs):
http://sourceforge.net/project/showfiles.php?group_id=118346

Support for SnakeSkin is handled through SourceForge.Net.  The project
information page is at http://www.sourceforge.net/projects/snakeskin-tools
There you will find the bug tracking system, a feature request system, and
the main method of support, the SnakeSkin mailing list.

About SnakeSkin

In SnakeSkin, developers can customize the framework to the application,
unlike in traditional frameworks, such as PHP. For example, adding 
custom tags to the templating system is quick and easy. The goal of the 
project is to have a framework that scales down as well as up--a 
"Zope-lite" framework. SnakeSkin can scale down to be useful in a simple 
form-to-email or just to apply a clean-cut design skin. The toolkit can 
just as easily scale up to handle complex content management systems, B2B 
extranets, and full-fledged e-commerce engines.  We do it all the time.

SnakeSkin, based upon the existing Albatross project maintained by 
Object Craft, runs under several webservers, including CGI based, 
Apache, FastCGI, and its own included webserver (used mainly for 
development).

SnakeSkin has several built in capabilities:

* Dynamic Macro Features (think server-side includes on steroids)
* SQL support in both the application and the template
* Support for Apache 2.0 Filters

... and includes Albatross features ...

* Clean separation of logic and design
* A simple-yet-robust templating system that is Web Designer-friendly 
(Plays nice with Dreamweaver)
* Secure Session Management in hidden fields, server-side data-stores, 
or through a session server

The SnakeSkin team.
http://snakeskin.pseudocode.net/