From paul at boddie.org.uk  Wed Dec  1 00:20:42 2004
From: paul at boddie.org.uk (Paul Boddie)
Date: Wed Dec  1 00:21:45 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <20041130190154.GA12058@caltech.edu>
References: <20041130190154.GA12058@caltech.edu>
Message-ID: <200412010020.42195.paul@boddie.org.uk>

On Tuesday 30 November 2004 20:01, Titus Brown wrote:
>
> My experience highlights an issue that needs to be dealt with by any
> WSGI server code.  Several app frameworks -- Quixote, Webware, and Zope,
> for example -- expect to be handed control of an entire URL tree.

I handled this control issue using the following code:

# Magic dictionary for WSGIServer.

class MagicDict:
    def __init__(self, handler):
        self.handler = handler
    def has_key(self, name):
        return 1
    def __getitem__(self, name):
        return self.handler

When such an object is passed to WSGIServer, the specified handler always gets 
control, although it'd surely be preferable for so-called "WSGI middleware" 
to manage the URL space.
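
For reference, a minimal sketch of the same catch-all idea in present-day
Python, assuming the server only ever asks the mapping "is this path
registered?" and "which application handles it?".  The names CatchAllApps,
lookup and demo_app are made up for illustration; none of them is part of
WSGIServer.

```python
class CatchAllApps:
    """Mapping-like object that answers every path with one handler."""

    def __init__(self, handler):
        self.handler = handler

    def __contains__(self, path):
        # Modern spelling of has_key(): every path counts as registered.
        return True

    def __getitem__(self, path):
        return self.handler

    def get(self, path, default=None):
        return self.handler


def lookup(apps, path):
    """Hypothetical stand-in for a server's path-to-application lookup."""
    if path in apps:
        return apps[path]
    return None


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


apps = CatchAllApps(demo_app)
```

Whatever path the lookup is given, it comes back with demo_app.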

Paul
From titus at caltech.edu  Wed Dec  1 01:04:50 2004
From: titus at caltech.edu (Titus Brown)
Date: Wed Dec  1 01:05:02 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <200412010020.42195.paul@boddie.org.uk>
References: <20041130190154.GA12058@caltech.edu>
	<200412010020.42195.paul@boddie.org.uk>
Message-ID: <20041201000450.GB12543@caltech.edu>

-> > My experience highlights an issue that needs to be dealt with by any
-> > WSGI server code.  Several app frameworks -- Quixote, Webware, and Zope,
-> > for example -- expect to be handed control of an entire URL tree.
-> 
-> I handled this control issue using the following code:
-> 
-> # Magic dictionary for WSGIServer.
-> 
-> class MagicDict:
->     def __init__(self, handler):
->         self.handler = handler
->     def has_key(self, name):
->         return 1
->     def __getitem__(self, name):
->         return self.handler
-> 
-> When such an object is passed to WSGIServer, the specified handler always gets 
-> control, although it'd surely be preferable for so-called "WSGI middleware" 
-> to manage the URL space.

this still has the problem that env["SCRIPT_NAME"] and env["PATH_INFO"]
aren't munged appropriately, no?  I know this would be a problem with
Quixote, not sure about the rest.
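
One way to do that munging for the "one handler owns everything" case is a
thin wrapper around the application rather than a change to the server.  A
rough sketch, assuming the application is meant to sit at the root of the
URL space and that the server has already split the request path between
SCRIPT_NAME and PATH_INFO in some way; own_whole_tree is an invented name.

```python
def own_whole_tree(app):
    """Wrap app so it sees SCRIPT_NAME == '' and the full path in PATH_INFO."""

    def wrapper(environ, start_response):
        # Rejoin whatever split the server made, then hand the whole
        # path to the application as PATH_INFO.
        full_path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
        environ["SCRIPT_NAME"] = ""
        environ["PATH_INFO"] = full_path
        return app(environ, start_response)

    return wrapper
```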

--titus
From colin at owlfish.com  Wed Dec  1 02:06:34 2004
From: colin at owlfish.com (Colin Stewart)
Date: Wed Dec  1 02:06:46 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
Message-ID: <1101863195.15522.42.camel@rock>

Hi,

(I've subscribed to the list so we can continue discussion purely
on-list)


> >The only real problem in getting this to work was that wsgiServer.py
> >expected *every* URL under /demo to be registered to demo_obj.  I
> >changed the wsgiServer.py code to allow for partial matches & munged
> >the SCRIPT_NAME and PATH_INFO variables appropriately.  I also added
> >REQUEST_URI because Quixote uses it for a few things; this should
> >probably be moved into QWIP.
> 
> I think I'm going to have to call that point out in the PEP 
> somewhere.  Technically, the PEP requires that SCRIPT_NAME and PATH_INFO be 
> set, but I think perhaps some folks have missed the implications of that 
> for the URL path space.


The clarification is good - it certainly wasn't clear to me the first
time I read it!

A quick question about the SCRIPT_NAME: If an application registers for
the path '/testapp/' should SCRIPT_NAME be set to '/testapp',
'/testapp/', or even 'testapp'?  I've implemented the first one in my
latest version of wsgiServer, but I want to make sure that's correct.

Colin.

From colin at owlfish.com  Wed Dec  1 02:16:45 2004
From: colin at owlfish.com (Colin Stewart)
Date: Wed Dec  1 02:16:54 2004
Subject: [Web-SIG] Re: WSGI Utils & SCGI/Quixote.
In-Reply-To: <20041130190154.GA12058@caltech.edu>
References: <20041130190154.GA12058@caltech.edu>
Message-ID: <1101863806.15522.50.camel@rock>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: path_fix.patch
Type: text/x-patch
Size: 3290 bytes
Desc: not available
Url : http://mail.python.org/pipermail/web-sig/attachments/20041130/3e4f1652/path_fix-0001.bin
From titus at caltech.edu  Wed Dec  1 02:29:25 2004
From: titus at caltech.edu (Titus Brown)
Date: Wed Dec  1 02:29:30 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <1101863195.15522.42.camel@rock>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
	<1101863195.15522.42.camel@rock>
Message-ID: <20041201012925.GA20972@caltech.edu>

-> > >The only real problem in getting this to work was that wsgiServer.py
-> > >expected *every* URL under /demo to be registered to demo_obj.  I
-> > >changed the wsgiServer.py code to allow for partial matches & munged
-> > >the SCRIPT_NAME and PATH_INFO variables appropriately.  I also added
-> > >REQUEST_URI because Quixote uses it for a few things; this should
-> > >probably be moved into QWIP.
-> > 
-> > I think I'm going to have to call that point out in the PEP 
-> > somewhere.  Technically, the PEP requires that SCRIPT_NAME and PATH_INFO be 
-> > set, but I think perhaps some folks have missed the implications of that 
-> > for the URL path space.
-> 
-> 
-> The clarification is good - it certainly wasn't clear to me the first
-> time I read it!
-> 
-> A quick question about the SCRIPT_NAME: If an application registers for
-> the path '/testapp/' should SCRIPT_NAME be set to '/testapp',
-> '/testapp/', or even 'testapp'?  I've implemented the first one in my
-> latest version of wsgiServer, but I want to make sure that's correct.

Well, 'testapp' would be ruled out because of the requirement that

	SCRIPT_NAME + PATH_INFO == REQUEST_URI

(where REQUEST_URI is everything after the host/port info, leaving aside
any query string).  I'd be
happy with the literal case, myself, but I'm not sure how anything
other than Quixote deals with the URLs.

--titus
From ianb at colorstudy.com  Wed Dec  1 02:42:05 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Dec  1 02:42:04 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <1101863195.15522.42.camel@rock>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
	<1101863195.15522.42.camel@rock>
Message-ID: <41AD216D.2060107@colorstudy.com>

Colin Stewart wrote:
> (I've subscribed to the list so we can continue discussion purely on-list)
> 
>>>The only real problem in getting this to work was that wsgiServer.py
>>>expected *every* URL under /demo to be registered to demo_obj.  I
>>>changed the wsgiServer.py code to allow for partial matches & munged
>>>the SCRIPT_NAME and PATH_INFO variables appropriately.  I also added
>>>REQUEST_URI because Quixote uses it for a few things; this should
>>>probably be moved into QWIP.
>>
>>I think I'm going to have to call that point out in the PEP 
>>somewhere.  Technically, the PEP requires that SCRIPT_NAME and PATH_INFO be 
>>set, but I think perhaps some folks have missed the implications of that 
>>for the URL path space.
>>
> 
> The clarification is good - it certainly wasn't clear to me the first 
> time I read it!
> 
> A quick question about the SCRIPT_NAME: If an application registers for 
> the path '/testapp/' should SCRIPT_NAME be set to '/testapp', 
> '/testapp/', or even 'testapp'?  I've implemented the first one in my
> latest version of wsgiServer, but I want to make sure that's correct.

Because PATH_INFO must either be empty or start with a /, SCRIPT_NAME 
should be "/testapp" (no trailing /).  If the script registers for the 
root (i.e., all URLs), SCRIPT_NAME should be "", and PATH_INFO contains 
the entire URL.
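
A small sketch of that rule, assuming the server normalizes whatever path
the application registered before using it as SCRIPT_NAME.  The helper name
normalize_mount_point is hypothetical and is not taken from wsgiServer.py.

```python
def normalize_mount_point(registered):
    """Turn a registered path into a valid SCRIPT_NAME.

    The result is either '' (the application owns the root) or a string
    that starts with '/' and has no trailing '/'.
    """
    stripped = registered.strip("/")
    return "/" + stripped if stripped else ""
```

With this, '/testapp/', '/testapp' and 'testapp' all normalize to
'/testapp', and a registration of '/' normalizes to the empty string.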

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
From pje at telecommunity.com  Wed Dec  1 02:53:12 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Dec  1 02:55:35 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <1101863195.15522.42.camel@rock>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
	<5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041130204832.0243c770@mail.telecommunity.com>

At 08:06 PM 11/30/04 -0500, Colin Stewart wrote:
>A quick question about the SCRIPT_NAME: If an application registers for 
>the path '/testapp/' should SCRIPT_NAME be set to '/testapp', '/testapp/', 
>or even 'testapp'?  I've implemented the first one in my latest version
>of wsgiServer, but I want to make sure that's correct.

Yes, the first one is correct.  SCRIPT_NAME and PATH_INFO must both 
*always* either begin with a '/', or be empty strings.

Technically, I would recommend that an app register as '/testapp' rather
than '/testapp/', but it's not a compliance issue, provided that you
generate a valid SCRIPT_NAME and PATH_INFO and your web server can
distinguish between:

     GET /testapp/

and

     GET /testapp

which should produce a PATH_INFO of '/' in the first case, and an empty 
string in the second.  It may be that some web servers exist that are 
biased towards physical URL mappings and don't pay attention to 
this.  Perhaps I should add some clarification to the PEP on this point, 
although maybe instead some kind soul will volunteer to write a nice 
article about tips and traps for WSGI server implementors.  :)
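
The two cases can be checked with wsgiref.util.shift_path_info, from the
wsgiref package that joined the standard library later (so not something
available to the servers discussed here).  A small sketch; path_after_shift
is just an illustrative helper, and it assumes the application is mounted
one path segment deep.

```python
from wsgiref.util import shift_path_info


def path_after_shift(request_path):
    """Move the first path segment from PATH_INFO to SCRIPT_NAME."""
    environ = {"SCRIPT_NAME": "", "PATH_INFO": request_path}
    shift_path_info(environ)
    return environ["SCRIPT_NAME"], environ["PATH_INFO"]


with_slash = path_after_shift("/testapp/")    # PATH_INFO is "/"
without_slash = path_after_shift("/testapp")  # PATH_INFO is ""
```

In both cases SCRIPT_NAME comes out as '/testapp'; only PATH_INFO differs.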

From janssen at parc.com  Wed Dec  1 03:57:33 2004
From: janssen at parc.com (Bill Janssen)
Date: Wed Dec  1 03:57:58 2004
Subject: [Web-SIG] WSGI configuration and character encoding. 
In-Reply-To: Your message of "Tue, 30 Nov 2004 14:07:53 PST."
	<5.1.1.6.0.20041130170653.038ef0d0@mail.telecommunity.com> 
Message-ID: <04Nov30.185740pst."58617"@synergy1.parc.xerox.com>

I think you're either dreaming, or have a much different idea of
"non-technical user" than I have.

Bill

Phillip Eby wrote:
> At 12:10 PM 11/30/04 -0800, Bill Janssen wrote:
> > > I don't think I could honestly expect non-technical users to be able to
> > > get their head around RFC 2047.
> >
> >I wouldn't be expecting non-technical users to be editing WSGI config
> >files in the first place.
> 
> That's one of our explicit requirements, actually.  We don't need them to 
> be able to *create* a deployment file, but they should be able to edit one 
> to tweak file paths and such.

From pje at telecommunity.com  Wed Dec  1 04:19:21 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Dec  1 04:17:48 2004
Subject: [Web-SIG] WSGI configuration and character encoding. 
In-Reply-To: <04Nov30.185740pst."58617"@synergy1.parc.xerox.com>
References: <Your message of "Tue, 30 Nov 2004 14:07:53 PST."
	<5.1.1.6.0.20041130170653.038ef0d0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>

At 06:57 PM 11/30/04 -0800, Bill Janssen wrote:
>I think you're either dreaming, or have a much different idea of
>"non-technical user" than I have.

Well, by definition in this context, they need to be somebody who can edit 
a simple configuration file.  If not, then it doesn't matter how simple a 
configuration file we make it!  (Also, presumably they're not going to be 
able to configure their web server, either.)  The point is to require as 
few skills as possible beyond "can edit a configuration file".  :)

As to the "non-technical user" part, I am thinking of a person who is not 
technically inclined.  That is, someone who may do technical-ish things 
when forced to, but has no inherent interest in them, and little patience 
for them.  Someone who's maybe edited an HTML file or PHP script in order 
to change something, but doesn't actually *know* any HTML or PHP; they just
figure it out as they go.  They are not technophobic, just not 
techno-interested.  :)

If we have to tell this person how to get into Notepad in order to do the 
editing, or explain to them what a filename is, then they are not 
reasonably within the scope of this effort, and the application author 
should attempt to charge them an installation fee.  :)  On the other hand, 
if we have to tell them about \u escapes or RFC 2047 or XML entities, then 
it is us who have gone out of scope, and we do not deserve to get any of 
their money.  :)

From janssen at parc.com  Wed Dec  1 04:46:19 2004
From: janssen at parc.com (Bill Janssen)
Date: Wed Dec  1 04:46:51 2004
Subject: [Web-SIG] WSGI configuration and character encoding. 
In-Reply-To: Your message of "Tue, 30 Nov 2004 19:19:21 PST."
	<5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> 
Message-ID: <04Nov30.194625pst."58617"@synergy1.parc.xerox.com>

So maybe the charset of the contents of the config file should just be
whatever the locale of the machine says it is.  Presumably that's what
will drive the simple text editor that the user will be using to
create/edit the file.
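
In present-day Python that suggestion might look roughly like the sketch
below.  It assumes the deployment file is plain text on disk;
read_config_text is an invented name, not part of any proposal.

```python
import locale


def read_config_text(path):
    """Read a config file using the machine's preferred locale encoding."""
    # False: report the current locale's encoding without changing it.
    encoding = locale.getpreferredencoding(False)
    with open(path, encoding=encoding) as handle:
        return handle.read()
```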

Bill
From ianb at colorstudy.com  Wed Dec  1 04:55:11 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Dec  1 04:55:12 2004
Subject: [Web-SIG] WSGI configuration and character encoding.
In-Reply-To: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>
References: <Your message of "Tue, 30 Nov 2004 14:07:53
	PST."	<5.1.1.6.0.20041130170653.038ef0d0@mail.telecommunity.com>
	<5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>
Message-ID: <41AD409F.7000507@colorstudy.com>

Phillip J. Eby wrote:
> At 06:57 PM 11/30/04 -0800, Bill Janssen wrote:
> 
>> I think you're either dreaming, or have a much different idea of
>> "non-technical user" than I have.
> 
> 
> Well, by definition in this context, they need to be somebody who can 
> edit a simple configuration file.  If not, then it doesn't matter how 
> simple a configuration file we make it!  (Also, presumably they're not 
> going to be able to configure their web server, either.)  The point is 
> to require as few skills as possible beyond "can edit a configuration 
> file".  :)

FWIW, a lot of PHP applications these days use through-the-web 
configuration; dump the files somewhere web-accessible, make sure at 
least a few select files are writable by Apache, and the rest has a GUI 
(of sorts).  Even I find this quite convenient.  Though I just 
encountered an application that took this too far, and stored preference 
information in the database, including the database connection 
information.  It confused me greatly when the two weren't in sync, and 
it tried to reconnect to a database that no longer existed after I moved 
the application to another server.  But I digress.

We aren't where (mindful) PHP is (or even close), but it's something to 
shoot for.  This may not actually apply to deployment configuration 
files, except that it would be nice if cooperative software could be 
packaged with a deployment configuration file that didn't need editing.
At which point it might as well be a Python script that sets up the
necessary objects.  Python can be much smarter about this than any 
configuration file.

Which is why I don't really think deployment configuration is all that 
important.  It doesn't hurt, but I don't think it should hold up the PEP 
in any way -- I think the PEP is entirely sufficient as it is, and we 
can figure out deployment or async or whatever in other PEPs, or in a 
later revision to WSGI.

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
From pje at telecommunity.com  Wed Dec  1 05:12:12 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Dec  1 05:10:39 2004
Subject: [Web-SIG] WSGI configuration and character encoding. 
In-Reply-To: <04Nov30.194625pst."58617"@synergy1.parc.xerox.com>
References: <Your message of "Tue, 30 Nov 2004 19:19:21 PST."
	<5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041130231033.024f6620@mail.telecommunity.com>

At 07:46 PM 11/30/04 -0800, Bill Janssen wrote:
>So maybe the charset of the contents of the config file should just be
>whatever the locale of the machine says it is.  Presumably that's what
>will drive the simple text editor that the user will be using to
>create/edit the file.

For some platforms, that certainly would make sense.  Of course, this is 
also why I was just thinking plain ASCII, at least until somebody pointed 
out that some OSes have Unicode filenames.  IMO, this is really the only 
use case that supports having Unicode support at all; everything we need 
for HTTP itself is either ISO-Latin-1 or "byte strings".

From pje at telecommunity.com  Wed Dec  1 05:22:56 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Dec  1 05:21:23 2004
Subject: [Web-SIG] WSGI configuration and character encoding.
In-Reply-To: <41AD409F.7000507@colorstudy.com>
References: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>
	<Your message of "Tue, 30 Nov 2004 14:07:53
	PST."	<5.1.1.6.0.20041130170653.038ef0d0@mail.telecommunity.com>
	<5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com>

At 09:55 PM 11/30/04 -0600, Ian Bicking wrote:

>We aren't where (mindful) PHP is (or even close), but it's something to 
>shoot for.  This may not actually apply to deployment configuration files, 
>except that it would be nice if cooperative software could be packaged 
>with a deployment configuration file that didn't need editing.  At which 
>point it might as well be a Python script that sets up the necessary 
>objects.  Python can be much smarter about this than any configuration file.

Here's what I'm thinking: paths in the file should be allowed to be 
relative to the directory containing the deployment file, and the 
configuration passed to the application or its setup should include the 
path to the deployment file.  The combination of these two things would 
suffice to allow distribution of an application in a largely 
ready-to-deploy form.  The application could always provide facilities to 
edit its own configuration file(s) or the deployment configuration.

This doesn't mean that some simple apps or middleware won't end up using 
the deployment file for all their configuration needs, but that may well be 
okay for their target audiences.  The key is to have a path to near-turnkey 
installation, if possible.
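
A sketch of the relative-path rule, assuming the loader knows the location
of the deployment file it is reading.  The function name and the example
paths are illustrative only; no deployment format has been proposed yet.

```python
import os.path


def resolve_deployment_path(deployment_file, path):
    """Resolve path relative to the directory holding deployment_file."""
    base_dir = os.path.dirname(os.path.abspath(deployment_file))
    # os.path.join() discards base_dir when path is already absolute.
    return os.path.normpath(os.path.join(base_dir, path))
```

For a deployment file at /srv/wiki/deploy.conf, the entry "templates"
resolves to /srv/wiki/templates on a POSIX system, so the whole directory
can be moved without editing the file.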


>Which is why I don't really think deployment configuration is all that 
>important.  It doesn't hurt, but I don't think it should hold up the PEP 
>in any way -- I think the PEP is entirely sufficient as it is, and we can 
>figure out deployment or async or whatever in other PEPs, or in a later 
>revision to WSGI.

Hmm, I seem to recall you arguing almost the opposite about a year 
ago...  ;)  For example, that it was really important for apps to know what 
web server they were running in, and conversely that they expose lots of 
configuration data to the web server.

Anyway, there's nothing really "holding up" the PEP; people are making 
implementations, and we're so far only finding things that need 
clarification, not fixing.  So clearly the PEP itself is in fairly good 
shape.  I probably should block out some time in the next week or two to 
apply the pending updates and write that sync/async/threading primer.

I'd also still like to see a solid async API proposal, and I'd like to 
*make* a deployment format proposal, once I get a few other things taken 
care of.

From ianb at colorstudy.com  Wed Dec  1 19:41:48 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Dec  1 19:45:21 2004
Subject: [Web-SIG] WSGI configuration and character encoding.
In-Reply-To: <5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com>
References: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> <Your
	message of "Tue, 30 Nov 2004 14:07:53
	PST."	<5.1.1.6.0.20041130170653.038ef0d0@mail.telecommunity.com>
	<5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>
	<5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com>
Message-ID: <41AE106C.4050403@colorstudy.com>

Phillip J. Eby wrote:
> At 09:55 PM 11/30/04 -0600, Ian Bicking wrote:
> 
>> We aren't where (mindful) PHP is (or even close), but it's something 
>> to shoot for.  This may not actually apply to deployment configuration 
>> files, except that it would be nice if cooperative software could be 
>> packaged with a deployment configuration file that didn't need 
>> editing.  At which point it might as well be a Python script that sets 
>> up the necessary objects.  Python can be much smarter about this than 
>> any configuration file.
> 
> 
> Here's what I'm thinking: paths in the file should be allowed to be 
> relative to the directory containing the deployment file, and the 
> configuration passed to the application or its setup should include the 
> path to the deployment file.  The combination of these two things would 
> suffice to allow distribution of an application in a largely 
> ready-to-deploy form.  The application could always provide facilities 
> to edit its own configuration file(s) or the deployment configuration.

This leads to the question: when would you edit the deployment file? 
Besides just using a different directory prefix?

A given application has a fixed set of requirements; why not just code
them up in Python?  Well, I can imagine reasons, but I think we need to 
start from use cases.

So here's a use case:

I have a Wiki application, written for Webware.  I can expose it at a 
few different levels -- the many Webware servlets (which are 
applications), or a single application, and that application can have 
more or less functionality (depending on how much functionality I expect 
the parent to have -- e.g., session support).

I also require configuration for the Wiki, though it probably could be 
installed with no configuration and reasonable defaults.  (Should it 
create a template configuration file with defaults in this case?  Where 
to put it?)

Because it is based on Webware (WSGIKit), it requires a bunch of 
middleware.  If there are other Webware applications in the stack, 
*maybe* it would be useful to share that middleware.  Maybe not, maybe 
just sharing configuration would be sufficient.

So, where does deployment fit in here?  Probably all you'd need would be 
to give the path to the application (maybe an importable package name), 
and an optional path to the configuration file for the application.  I 
don't need the configuration until runtime, though.

Or, it could be inverted.  The Wiki application is the front-facing 
object, and you tell it what server you want to use.  Both could even 
coexist fairly easily.  Maybe it would be smart enough to tell if it was 
being run as a CGI script (just by looking at the environment), and if 
not it would have options to start up some kind of server.  And it would 
export some conventional name, so you could point some other server at 
it, using whatever mechanisms that server uses (which probably includes 
giving the application a URL space).  As an application distributor this 
seems like the easiest thing to describe and support.
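
A rough sketch of the inverted arrangement, written against the wsgiref
package that entered the standard library later.  The placeholder
application, the host and the port are assumptions made for illustration;
"application" stands in for the conventional exported name.

```python
import os
from wsgiref.handlers import CGIHandler
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """The conventional name another server could be pointed at."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"wiki placeholder\n"]


def running_as_cgi(environ):
    # CGI servers set GATEWAY_INTERFACE (e.g. "CGI/1.1") for the script.
    return environ.get("GATEWAY_INTERFACE", "").startswith("CGI/")


def main(host="localhost", port=8080):
    if running_as_cgi(os.environ):
        # Answer the single CGI request and exit.
        CGIHandler().run(application)
    else:
        # Not CGI: start a simple server of our own.
        make_server(host, port, application).serve_forever()
```

A launcher script would call main(); any other server can import
"application" directly and give it whatever URL space it likes.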

> This doesn't mean that some simple apps or middleware won't end up using 
> the deployment file for all their configuration needs, but that may well 
> be okay for their target audiences.  The key is to have a path to 
> near-turnkey installation, if possible.
> 
> 
>> Which is why I don't really think deployment configuration is all that 
>> important.  It doesn't hurt, but I don't think it should hold up the 
>> PEP in any way -- I think the PEP is entirely sufficient as it is, and 
>> we can figure out deployment or async or whatever in other PEPs, or in 
>> a later revision to WSGI.
> 
> 
> Hmm, I seem to recall you arguing almost the opposite about a year 
> ago...  ;)  For example, that it was really important for apps to know 
> what web server they were running in, and conversely that they expose 
> lots of configuration data to the web server.

Consistency is the hobgoblin of little minds!  Well, I don't know if 
I've been consistent or not, but I don't place much weight in it either 
way ;)

I guess I don't really want WSGI to be exposed to less-technical web 
developers, or to people who install applications based on it.  So I'd 
like the pieces to communicate with each other fairly freely and 
completely.  If we can automate something, that's great -- like 
including process information in the WSGI environment.  But 
configuration isn't automation, so it doesn't excite me a lot.

But of course configuration exists, so if it exists then I'd like to 
keep it together, because I find lots of configuration files to be hard 
to navigate (as a user).  I also want configuration to be optional, and 
deployment configuration isn't very optional.

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From colin at owlfish.com  Thu Dec  2 06:20:54 2004
From: colin at owlfish.com (Colin Stewart)
Date: Thu Dec  2 06:21:04 2004
Subject: [Web-SIG] ANN: WSGIUtils 0.3
Message-ID: <1101964854.26020.5.camel@rock>

Hi,

Following on from the discussion here regarding the handling of URLs
I've uploaded a new version of WSGIUtils
(http://www.owlfish.com/software/wsgiutils/) that should behave as the
spec intended.  Any feedback or suggestions are welcome...

Colin.

From titus at caltech.edu  Thu Dec  2 08:22:07 2004
From: titus at caltech.edu (Titus Brown)
Date: Thu Dec  2 08:22:10 2004
Subject: [Web-SIG] ANN: WSGIUtils 0.3
In-Reply-To: <1101964854.26020.5.camel@rock>
References: <1101964854.26020.5.camel@rock>
Message-ID: <20041202072207.GA26907@caltech.edu>

-> Following on from the discussion here regarding the handling of URLs
-> I've uploaded a new version of WSGIUtils
-> (http://www.owlfish.com/software/wsgiutils/) that should behave as the
-> spec intended.  Any feedback or suggestions are welcome...

Hi, Colin,

sorry I waited 'til you cut a new release to test out your patch!

Everything now works as expected, except for an omission in
wsgiAdaptor.py where 'setContentType' isn't defined on the Request
class.

One other request -- could you omit (or replace) the space in the
directory name, e.g. rather than "WSGI Utils-0.3" make it
"WSGIUtils-0.3" or something similar?  Ugly, I know, but that space can
be awkward in a command-line environment.

cheers,
--titus
From ianb at colorstudy.com  Thu Dec  2 20:20:26 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu Dec  2 20:23:57 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
Message-ID: <41AF6AFA.8010205@colorstudy.com>

Phillip J. Eby wrote:
> At 11:01 AM 11/30/04 -0800, Titus Brown wrote:
> 
>> The only real problem in getting this to work was that wsgiServer.py
>> expected *every* URL under /demo to be registered to demo_obj.  I
>> changed the wsgiServer.py code to allow for partial matches & munged
>> the SCRIPT_NAME and PATH_INFO variables appropriately.  I also added
>> REQUEST_URI because Quixote uses it for a few things; this should
>> probably be moved into QWIP.
> 
> 
> I think I'm going to have to call that point out in the PEP somewhere.  
> Technically, the PEP requires that SCRIPT_NAME and PATH_INFO be set, but 
> I think perhaps some folks have missed the implications of that for the 
> URL path space.
> 
> Perhaps something like this would do the trick:
> 
> """
> Application Placement in Server URL Space
> -----------------------------------------
> 
> In order to generate correct SCRIPT_NAME and PATH_INFO variables, 
> servers and gateways MUST treat an application's location as a URL path 
> prefix.  That is, servers and gateways:
> 
> * MUST determine the target application using a matching prefix of the 
> request path (which then determines the value of SCRIPT_NAME).
> 
> * MUST take the remaining portion of the request path, and use it to 
> determine PATH_INFO. (Note that the remainder must be empty or begin 
> with a '/', otherwise the prefix match was invalid!)
> 
> * MUST assume that there are an infinite number of possible URL paths 
> that may appear as a PATH_INFO suffix "beneath" the application's base URL

I think this is too restrictive.  It's the natural way to do things in 
most cases, but there's no reason to enforce it.  E.g., a 
mod_rewrite-like middleware might do any number of things; it's a 
use-at-your-own-risk proposition (with considerable risk, at least from 
my own mod_rewrite experiences), but it shouldn't be disallowed, and 
this appears to disallow that kind of code.

A particular use case came to my mind today.  Imagine a login middleware 
-- it wants to allow login and logout, but otherwise interrupt the 
request cycle as little as possible.  So, let's say an application
requires login; maybe it sends a 401.  The login middleware catches it, 
sees that it's configured for cookie-based (form) login, and turns it 
into a 200 with a login form.  The user logs in, and goes to their 
original page.  You want to customize the login form, so the form might 
be an application that doesn't belong to the login middleware (but uses 
conventional keys); the URL belongs to the originally-requested 
application, but the application being served is some other application.
Or later, if they try to log in but fail, their URL may still be
pointing at the original application (useful if they were submitting a 
POST form, which you want to pass through to the original URL, and it's 
difficult to do that with a redirect-after-submit).
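
A minimal sketch of such a middleware, assuming the form is a separate WSGI
application handed in by whoever configures the stack.  It buffers the whole
response and ignores the legacy write() callable, so it is an illustration
rather than a production design; login_middleware and login_form_app are
invented names.

```python
def login_middleware(app, login_form_app):
    """Serve login_form_app in place of any 401 response from app."""

    def wrapper(environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold the response start back until we know the status.
            captured["start"] = (status, headers, exc_info)
            return lambda data: None  # legacy write() is ignored here

        result = app(environ, capture)
        try:
            body = list(result)  # buffers the whole response
        finally:
            if hasattr(result, "close"):
                result.close()

        status, headers, exc_info = captured["start"]
        if status.startswith("401"):
            # Same URL, same SCRIPT_NAME and PATH_INFO, different application.
            return login_form_app(environ, start_response)
        start_response(status, headers, exc_info)
        return body

    return wrapper
```

The form application runs with the environ of the original request, so
neither SCRIPT_NAME nor PATH_INFO says anything about which application
actually produced the page.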

There's a bunch of other ways this could be factored, but a number of 
them involve dispatching to an application based on query string, or in 
some way where SCRIPT_NAME and PATH_INFO don't have any relation to the 
application at all.

So I'd say these should all be SHOULDs, not MUSTs.  Or they should 
simply be put in as implementation recommendations.  In general I don't 
think this should be a problem, because implementors will respond to 
feedback, and if it's really a problem it will be addressed (and 
probably fairly quickly).  There's a lot of weird use cases for 
application dispatching, and I don't see any reason to restrict that by 
formalizing how dispatching should work.

> Notice that these requirements imply that servers and gateways:
> 
> * MUST NOT use query string contents, fragment identifiers, or URL 
> parameters to determine the application object that a request should be 
> sent to.
> 
> * MUST NOT require that every URL path used by the application be 
> preconfigured or pre-registered with the server, or have some required 
> mapping to existing files, or any other requirement that would make 
> dynamic URLs impractical.
> 
> A server or gateway that cannot meet these requirements IS NOT COMPLIANT 
> with this specification; it would be completely unusable for 
> applications from many popular Python web frameworks including at least 
> Zope, Webware, and Quixote, and many standalone Python web applications 
> as well.
> """
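
For concreteness, the prefix-matching rule in the proposed wording could be
sketched as below.  This assumes prefixes are registered as '' (root) or
'/name' with no trailing slash; make_dispatcher and the mounts mapping are
hypothetical names, not taken from the PEP or from any server.

```python
def make_dispatcher(mounts):
    """Build a WSGI app that dispatches on the longest matching path prefix.

    mounts maps a prefix ('' for the root, otherwise '/name' with no
    trailing slash) to a WSGI application.
    """
    # Try longer prefixes first so '/a/b' wins over '/a'.
    ordered = sorted(mounts.items(), key=lambda item: len(item[0]), reverse=True)

    def dispatcher(environ, start_response):
        base = environ.get("SCRIPT_NAME", "")
        path = environ.get("PATH_INFO", "")
        for prefix, app in ordered:
            remainder = path[len(prefix):]
            # A valid match leaves a remainder that is empty or starts with '/'.
            if path.startswith(prefix) and remainder[:1] in ("", "/"):
                environ["SCRIPT_NAME"] = base + prefix
                environ["PATH_INFO"] = remainder
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found\n"]

    return dispatcher
```

Only the path takes part in the match; the query string is never consulted,
and no URL beneath a prefix has to be registered in advance.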
From ianb at colorstudy.com  Thu Dec  2 20:27:47 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu Dec  2 20:31:17 2004
Subject: [Web-SIG] WSGI configuration and character encoding.
In-Reply-To: <41AF600B.3010006@jdiworks.net>
References: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>
	<Your	message of "Tue, 30 Nov 2004
	14:07:53	PST."	<5.1.1.6.0.20041130170653.038ef0d0@mail.telecommunity.com>	<5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>	<5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com>
	<41AE106C.4050403@colorstudy.com> <41AF600B.3010006@jdiworks.net>
Message-ID: <41AF6CB3.3020908@colorstudy.com>

Terrel Shumway wrote:
> I haven't been following this thread closely, but here is my $.02 based 
> on my continuing experience in implementing a Cheetah framework over 
> WSGI and deploying it via CGI and mod_python.
> 
> Ian Bicking wrote:
> 
>> Phillip J. Eby wrote:
>>
>>> Here's what I'm thinking: paths in the file should be allowed to be 
>>> relative to the directory containing the deployment file, and the 
>>> configuration passed to the application or its setup should include 
>>> the path to the deployment file.  The combination of these two things 
>>> would suffice to allow distribution of an application in a largely 
>>> ready-to-deploy form.  The application could always provide 
>>> facilities to edit its own configuration file(s) or the deployment 
>>> configuration.
>>
>>
>>
>> This leads to the question: when would you edit the deployment file? 
>> Besides just using a different directory prefix?
> 
> 
> The "packager" edits the <deployment descriptor> and bundles the code, 
> templates, etc. that the app requires.  This is often a different person 
> from the "programmer" who creates the components.

I expect the packager to have programming skills, even if they aren't
the same person as the application programmer.

> The server administrator or "webmaster" is the person who edits the 
> <server configuration> file to assign a URL space to each app.  The 
> webmaster might have his own middleware to add, e.g. for extra logging, 
> or performance monitoring.

The webmaster may not have programming skills (or at least not Python).
Though depending on the sophistication of the integration, I'd be okay
with some programming required.  For instance, if you are integrating
your login method with the application's, it might be necessary to do
some programming -- simple login sharing should be easy, but sharing
user metadata and administrative operations (e.g., adding users and the
like) will probably require programming (unless the systems are
specifically meant to work with each other -- i.e., another standardized
interface).

> A given application has a fixed set of requirements, why not just code 
> them up in Python?  Well, I can imagine reasons, but I think we need to 
> start from use cases.
> 
> So here's a use case:
> 
>  > that application can have more or less functionality (depending on 
> how much functionality I expect the parent to have -- e.g., session 
> support).

I'd like this to be automated.  If, for instance, we can standardize the
session interface this should be doable.  The application looks for,
say, session.api_1 (standard session API, version 1).  If it finds it,
it uses it, knowing the interface.  If not, it puts in its own piece of
middleware that provides the API.
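
A minimal sketch of that lookup, under assumptions: 'session.api_1' is
only the example key named above, and DictSession and
FallbackSessionMiddleware are invented names, not part of any agreed
interface.  The middleware supplies a session only when nothing further
up the stack already has.

```python
# Sketch only: 'session.api_1' is the hypothetical key from the text, and
# DictSession stands in for whatever a standard session API would look like.

SESSION_KEY = 'session.api_1'


class DictSession(dict):
    """Placeholder session object; a real one would persist its data."""


class FallbackSessionMiddleware:
    """Provide a session under SESSION_KEY only if no parent already did."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if SESSION_KEY not in environ:
            # Nothing upstream supplies the API, so supply our own.
            environ[SESSION_KEY] = DictSession()
        return self.app(environ, start_response)


def application(environ, start_response):
    # The application relies on the key being present, whoever provided it.
    session = environ[SESSION_KEY]
    session['hits'] = session.get('hits', 0) + 1
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [('hits: %d\n' % session['hits']).encode('ascii')]
```

Wrapping is then FallbackSessionMiddleware(application); if a parent
has already put a session in the environ, that object is used untouched.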

Until we standardize that, we'll be doing this stuff ad hoc, but that's
okay -- this is an ongoing process.

> Two core features that are "required" by a Java servlet container are 
> session support, and login support. And every existing container that I 
> know of also supports JSP.

JSP is funny, and not a model widely used for Python (I think).  It
certainly doesn't seem as fundamental in a WSGI model, where URLs aren't
necessarily mapped to files.

That is, I don't think there's any kind of file you can just plop into a
WSGI container and it will display; not even Python source.  There's no
file-like container, and there's no standard URL->object mapping or
configuration.  I'm not entirely sure there should be a standard, at
least not one we expect most people to use... certainly several
frameworks could share implementations, to the degree they act
similarly.  But object publishers (e.g., Quixote, Zope) and file-based
systems (e.g., Webware, Spyce) are going to remain fairly separate.

> These three features -- session, login, templates -- are needed by 
> enough people that I think they should be standard.  (e.g. how difficult 
> would it be to create a wiki if you could rely on the framework for 
> these? -- 80% of the work is done.)
> 
> Another kit that might have broad application is a formkit --  a 
> higher-level way to manage posted form data  -- but that probably 
> doesn't belong in WSGI (PEP), because there are a lot of different ways 
> people want to do it.

I don't think this needs to be part of the request cycle at all -- the
application is always an intermediary there.  It's simply a library.

> Probably ditto for templates. But in python, "you really only have to do 
> it one way".  There should be one (1) easy way to say how a container 
> interacts with a template engine. I'm not sure what that means yet, but 
> I'll think about it.

At first I thought templates should just be a library as well, though it
would be nice if applications could share templating configuration.
I.e., you could indicate a template search path, maybe adding more paths
on a per-application basis, or otherwise fiddling with that path (e.g.,
skinning an application based on URL).  But we can handle that in a
neutral way, i.e., providing a generic configuration system that
template users can use as they wish (though they'll want to form
conventions about key naming; or we can provide conventions about how to
adapt configurations to different naming schemes).

> e.g. the framework I am building can get template files from different 
> places to easily support skinning. It would be nice to say "get template 
> X and fill it from these variables" without worrying about where X 
> resides in a filesystem or .zip archive. (along the lines of the java 
> ServletContext.getResource*() methods)  If I deploy four applications 
> (Contexts), I want them to share template files 
> (/var/www/sitename/templates/) so the designer can change the look and 
> feel of the whole site at once.  In my case, I also want a set of 
> templates shared among many sites on the same server (/var/www/templates/)
> There should be a standard way for servlet authors to say "this is the 
> 'content' piece that I care about, and here are some styles and <head> 
> content. Now you put it together inside the site-wide templates to 
> create the page."  And it shouldn't matter to the developer whether that 
> sitewide template is implemented with Cheetah or CherryPy or Quixote or 
> ZTP or whatever.

This does make me think templating could participate in the request
cycle, as a filter of sorts.  Right now we're trying to move to SSIs as
a shared templating scheme, at least when we move to Apache 2, because
all our scripts can output SSIs and Apache will evaluate them.  Maybe
not everyone will want SSIs (obviously), but maybe this general pattern
can be used -- one of filtering text.  I don't know what else we can
agree on, especially in environments where everything isn't Python.
This is akin to an XSLT-based templating approach, but of course there
are much better languages than XSLT that we can come up with ;)

If we do it as filtering, we don't have to agree nearly as much about
templating languages or even interfaces.  We just have to agree on a
document format, which somehow seems easier.  We wouldn't even have to
agree that much on document format; if we were using SSIs, we could make
something that transforms document type X into SSIs, and then Apache
does the next step.  This is an N^2 problem, given N kinds of
data/template languages, but at least it offers some kind of solution.
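
As a rough illustration of the filtering idea, here is a minimal sketch
of a WSGI middleware that passes the wrapped application's output
through a text filter.  The names TextFilterMiddleware,
wrap_in_site_template and content_app are invented for the example, the
"template" is just a fixed header and footer, and the sketch buffers the
whole response and ignores the write() callable, so it is not a general
solution.

```python
class TextFilterMiddleware:
    """Buffer the wrapped app's body and pass it through a filter function."""

    def __init__(self, app, filter_func):
        self.app = app
        self.filter_func = filter_func  # takes bytes, returns bytes

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold the status and headers until the body has been filtered.
            # (No write() callable is returned; apps that need it won't work.)
            captured['status'] = status
            captured['headers'] = headers

        result = self.app(environ, capture)
        try:
            body = b''.join(result)
        finally:
            if hasattr(result, 'close'):
                result.close()

        body = self.filter_func(body)
        # Content-Length must describe the filtered body, not the original.
        headers = [(k, v) for k, v in captured['headers']
                   if k.lower() != 'content-length']
        headers.append(('Content-Length', str(len(body))))
        start_response(captured['status'], headers)
        return [body]


def wrap_in_site_template(content):
    # Hypothetical site-wide "template": a fixed header and footer.
    return b'<html><body>' + content + b'</body></html>'


def content_app(environ, start_response):
    # The application only produces the 'content' piece it cares about.
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b'<p>the content piece</p>']


application = TextFilterMiddleware(content_app, wrap_in_site_template)
```

The filter function is the only part that knows about the document
format; swapping in something that emits SSIs would leave the
application and the middleware unchanged.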

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Thu Dec  2 20:40:23 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Dec  2 20:38:54 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <41AF6AFA.8010205@colorstudy.com>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
	<5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041202143501.02b80480@mail.telecommunity.com>

At 01:20 PM 12/2/04 -0600, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>Application Placement in Server URL Space
>>-----------------------------------------
>>In order to generate correct SCRIPT_NAME and PATH_INFO variables, servers 
>>and gateways MUST treat an application's location as a URL path 
>>prefix.  That is, servers and gateways:
>>* MUST determine the target application using a matching prefix of the 
>>request path (which then determines the value of SCRIPT_NAME).
>>* MUST take the remaining portion of the request path, and use it to 
>>determine PATH_INFO. (Note that the remainder must be empty or begin with 
>>a '/', otherwise the prefix match was invalid!)
>>* MUST assume that there are an infinite number of possible URL paths 
>>that may appear as a PATH_INFO suffix "beneath" the application's base URL
>
>I think this is too restrictive.  It's the natural way to do things in 
>most cases, but there's no reason to enforce it.  E.g., a mod_rewrite-like 
>middleware might do any number of things; it's a use-at-your-own-risk 
>proposition (with considerable risk, at least from my own mod_rewrite 
>experiences), but it shouldn't be disallowed, and this appears to disallow 
>that kind of code.
>
>A particular use case came to my mind today.  Imagine a login middleware 
>-- it wants to allow login and logout, but otherwise interrupt the request 
>cycle as little as possible.  So, let's say an application requires login; 
>maybe it sends a 401.  The login middleware catches it, sees that it's 
>configured for cookie-based (form) login, and turns it into a 200 with a 
>login form.

You're focusing here on middleware; IMO the above is valid as long as it's 
applied only to servers and gateways, rather than middleware.  It just 
needs a parenthetical to indicate that these restrictions don't apply to 
middleware.
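
For concreteness, the prefix rule for servers and gateways could be
sketched roughly as below.  This is only an illustration: split_path and
the mount_points mapping are hypothetical names, and choosing the
longest matching prefix is an assumption of the sketch, not something
the proposed text requires.

```python
def split_path(request_path, mount_points):
    """Return (app, script_name, path_info) for the longest matching prefix.

    mount_points maps URL path prefixes (without a trailing '/') to
    application objects; '' is the root mount.  Returns None when no
    prefix matches.
    """
    for prefix in sorted(mount_points, key=len, reverse=True):
        if not request_path.startswith(prefix):
            continue
        remainder = request_path[len(prefix):]
        # The remainder must be empty or begin with '/'; otherwise the
        # match is invalid ('/blog' must not claim '/blogger').
        if remainder == '' or remainder.startswith('/'):
            return mount_points[prefix], prefix, remainder
    return None
```

The second and third items of the result are what the server would
place in SCRIPT_NAME and PATH_INFO; any path at all beneath the prefix
is handed to the application.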

From pje at telecommunity.com  Thu Dec  2 22:35:13 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Dec  2 22:33:45 2004
Subject: [Web-SIG] WSGI configuration and character encoding.
In-Reply-To: <41AF6CB3.3020908@colorstudy.com>
References: <41AF600B.3010006@jdiworks.net>
	<5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>
	<Your	message of "Tue, 30 Nov 2004
	14:07:53	PST."	<5.1.1.6.0.20041130170653.038ef0d0@mail.telecommunity.com>
	<5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com>
	<5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com>
	<41AE106C.4050403@colorstudy.com> <41AF600B.3010006@jdiworks.net>
Message-ID: <5.1.1.6.0.20041202163249.0287aa30@mail.telecommunity.com>

At 01:27 PM 12/2/04 -0600, Ian Bicking wrote:
>That is, I don't think there's any kind of file you can just plop into a
>WSGI container and it will display;

That's actually one use case for the deployment file.  That is, if you can 
just plop a .wsgi file in there with the deployment information, and have 
the application's virtual URL space simply be "beneath" the URL of the 
.wsgi file.  (Or whatever filename it has.)

From terrel at terrelshumway.com  Fri Dec  3 01:57:38 2004
From: terrel at terrelshumway.com (Terrel Shumway)
Date: Fri Dec  3 01:57:50 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <41AF6AFA.8010205@colorstudy.com>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
	<41AF6AFA.8010205@colorstudy.com>
Message-ID: <41AFBA02.4060409@terrelshumway.com>

Ian Bicking wrote:

> Phillip J. Eby wrote:
>
>> I think I'm going to have to call that point out in the PEP 
>> somewhere.  Technically, the PEP requires that SCRIPT_NAME and 
>> PATH_INFO be set, but I think perhaps some folks have missed the 
>> implications of that for the URL path space.
>>
>> Perhaps something like this would do the trick:
>>
>> """
>> Application Placement in Server URL Space
>> -----------------------------------------
>>
>> In order to generate correct SCRIPT_NAME and PATH_INFO variables, 
>> servers and gateways MUST treat an application's location as a URL 
>> path prefix.  That is, servers and gateways:
>>
>> * MUST determine the target application using a matching prefix of 
>> the request path (which then determines the value of SCRIPT_NAME).
>>
>> * MUST take the remaining portion of the request path, and use it to 
>> determine PATH_INFO. (Note that the remainder must be empty or begin 
>> with a '/', otherwise the prefix match was invalid!)
>>
>> * MUST assume that there are an infinite number of possible URL paths 
>> that may appear as a PATH_INFO suffix "beneath" the application's 
>> base URL
>
>
> I think this is too restrictive.  It's the natural way to do things in 
> most cases, 

It is the natural way, and it is not very restrictive.

> but there's no reason to enforce it.  

Reason #1: "You really only need to do it one way"  which is the entire 
point of the WEB-SIG.
Reason #2:  If you don't specify one well-documented, easily-implemented 
way, you will get a dozen poorly-implemented, poorly-documented ways.

> E.g., a mod_rewrite-like middleware might do any number of things; 
> it's a use-at-your-own-risk proposition (with considerable risk, at 
> least from my own mod_rewrite experiences), but it shouldn't be 
> disallowed, and this appears to disallow that kind of code.

Reason #3: mod_rewrite is the problem. An understandable mapping 
convention is the solution.

[snip]

> The login middleware catches it, sees that it's configured for 
> cookie-based (form) login, and turns it into a 200 with a login form.  

That should be a "303 See Other" pointing to the login form. 

> Or later, if they try to login but fail, their URL may still be 
> pointing at the original application (useful if they were submitting a 
> POST form,

Not really, because you still lost the original POST data.

... Unless the login middleware also saved that to a "conditional post" 
queue like Fastmail.FM does if your session times out while you are 
composing a message. 

(IMO, every successful POST SHOULD respond with 303 -- avoiding 90% of 
all double posts. Unsuccessful POSTs should send 200, with the original 
form already filled out with the info that was correct, and error 
messages where it was not.)
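
A minimal sketch of that POST convention as a WSGI application, under
assumptions: the single 'title' field, its 1-to-40-character rule, and
the '/thanks' URL are all made up for illustration.

```python
import html
from urllib.parse import parse_qs

FORM = ('<form method="post">'
        '<input name="title" value="%s"> %s'
        '<input type="submit"></form>')


def render_form(title='', error=''):
    # Refill the field with what the user sent, escaped for HTML.
    page = FORM % (html.escape(title, quote=True), html.escape(error))
    return [page.encode('utf-8')]


def form_app(environ, start_response):
    html_headers = [('Content-Type', 'text/html; charset=utf-8')]
    if environ['REQUEST_METHOD'] != 'POST':
        start_response('200 OK', html_headers)
        return render_form()

    length = int(environ.get('CONTENT_LENGTH') or 0)
    body = environ['wsgi.input'].read(length).decode('utf-8')
    title = parse_qs(body).get('title', [''])[0].strip()

    if not title or len(title) > 40:
        # Unsuccessful POST: 200, with the form refilled and an error shown.
        start_response('200 OK', html_headers)
        return render_form(title, 'Title must be 1 to 40 characters.')

    # Successful POST: the unsafe action would happen here, then a 303 so
    # that a reload repeats the GET of /thanks (hypothetical), not the POST.
    start_response('303 See Other', [('Location', '/thanks')])
    return [b'']
```

Because the browser ends up on the result of a GET, reloading or going
back does not resubmit the form.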

> which you want to pass through to the original URL, and it's difficult 
> to do that with a redirect-after-submit).

If you use cookie-based authentication, the user can usually just hit 
the back button twice and POST again. (not fun if they were uploading 
big files, but otherwise harmless, because the original post "failed" and 
no unsafe action was taken.)


> There's a bunch of other ways this could be factored, but a number of 
> them involve dispatching to an application based on query string, or 
> in some way where SCRIPT_NAME and PATH_INFO don't have any relation to 
> the application at all.

And those other ways create UGLY urls, which enticed someone to create 
mod_rewrite to make them pretty. Search engines are getting better at 
making sense of that ugliness, but the URL space is still not very RESTful.

Keep in mind we are talking about the *container* doing the 
dispatching.  Once the servlet is selected, it can do anything it wants 
with the PATH_INFO and query string, including forwards and includes and 
redirects.  If the application wants to do crazy dispatching within its 
own URL space, that's fine, but the container shouldn't need to deal 
with that.

>
> So I'd say these should all be SHOULDs, not MUSTs.  Or they should 
> simply be put in as implementation recommendations.  

That's what the Java people said sometime before Servlet Version 2.2.  
But they tightened it up based on experience:

--------------SRV.10 (v.2.2)--------------
Previous versions of this specification have allowed servlet containers 
a great deal of flexibility in mapping client requests to servlets only 
defining a set of suggested mapping techniques. This specification *now 
requires* a set of mapping techniques to be used for web applications 
which are deployed via the Web Application Deployment mechanism. Just as 
it is highly recommended that servlet containers use the deployment 
representations as their runtime representation, it is highly 
recommended that they use these path mapping rules in their servers for 
all purposes and not just as part of deploying a web application.
--------------
--------------SRV.11 (v.2.4)--------------
The mapping techniques described in this chapter are *required* for Web 
containers mapping client requests to servlets.

(Previous versions of this specification made use of these mapping 
techniques as a suggestion rather than a requirement, allowing servlet 
containers to each have their different schemes for mapping client 
requests to servlets.)
--------------

http://jdiworks.net/projects/servlet/SRV.11.html
http://jdiworks.net/projects/servlet/SRV.4.4.html

Let's learn from their experience.

---
Terrel Shumway
"That Web Guy Who Knows Marketing"
http://jdiworks.net/


From floydophone at gmail.com  Sat Dec  4 05:12:30 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Sat Dec  4 05:12:33 2004
Subject: [Web-SIG] WSGI-ISAPI
Message-ID: <6654eac404120320126b9d9456@mail.gmail.com>

After installing Python 2.4 and the latest Pythonwin, I discovered a
new cool ISAPI module. If anyone wants to assist me with WSGI-ISAPI,
I'd be glad for the help. I'm still at the "Hello, world" stage, but I
think this (along with mod_python) will be a huge selling point for
WSGI.
From jlowery at m2is.com  Wed Dec  8 01:16:55 2004
From: jlowery at m2is.com (Jeff Lowery)
Date: Wed Dec  8 01:16:59 2004
Subject: [Web-SIG] Running python cgi script
Message-ID: <006f01c4dcbb$39b8e470$2600a8c0@Folderal>

I'm trying to get MoinMoin Wiki server installed on IIS 6.0, but am hung up on getting the moin.cgi script to execute.  Yes, I have read and followed the directions in the INSTALL.html document, including:

1) appended the site-packages directory to sys.path
2) added virtual directory 'wiki', pointing to htdocs directory
3) added virtual directory 'mywiki', pointing to wiki instance directory
4) configured the virtual directory in IIS above to run "c:\python23\python.exe" -u %s %s on .cgi extensions
5) set Web Service Extensions to allow unknown cgi extensions

Added some logging statements to a log file at the top of moin.cgi, just to see if it was executing at all (runs fine from the command line, btw). Apparently not: getting a "CGI Error: The specified CGI application misbehaved by not returning a complete set of HTTP headers", and no log is generated. Funny thing is that if I remove the "-u %s %s" from the cgi extension setup (4), I get a timeout error instead. Looks like IIS knows about the CGI mapping, but is not running the python interpreter.

Any ideas? 
From amk at amk.ca  Wed Dec 15 15:36:08 2004
From: amk at amk.ca (A.M. Kuchling)
Date: Wed Dec 15 15:36:53 2004
Subject: [Web-SIG] WSGI presentation
Message-ID: <20041215143608.GA8049@rogue.amk.ca>

A WSGI presentation at PyCon would probably be a good idea; anyone
want to give one? (Proposal deadline is Dec. 31...)

--amk
From ianb at colorstudy.com  Wed Dec 15 17:04:34 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Dec 15 17:04:27 2004
Subject: [Web-SIG] WSGI presentation
In-Reply-To: <20041215143608.GA8049@rogue.amk.ca>
References: <20041215143608.GA8049@rogue.amk.ca>
Message-ID: <41C06092.8090504@colorstudy.com>

A.M. Kuchling wrote:
> A WSGI presentation at PyCon would probably be a good idea; anyone
> want to give one? (Proposal deadline is Dec. 31...)

I was planning on submitting something about WSGIKit, though mostly 
focused on WSGI and decomposing a framework into a set of WSGI 
middleware components.

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org