From paul at boddie.org.uk Wed Dec 1 00:20:42 2004
From: paul at boddie.org.uk (Paul Boddie)
Date: Wed Dec 1 00:21:45 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <20041130190154.GA12058@caltech.edu>
References: <20041130190154.GA12058@caltech.edu>
Message-ID: <200412010020.42195.paul@boddie.org.uk>

On Tuesday 30 November 2004 20:01, Titus Brown wrote:
>
> My experience highlights an issue that needs to be dealt with by any
> WSGI server code. Several app frameworks -- Quixote, Webware, and Zope,
> for example -- expect to be handed control of an entire URL tree.

I handled this control issue using the following code:

# Magic dictionary for WSGIServer.

class MagicDict:
    def __init__(self, handler):
        self.handler = handler
    def has_key(self, name):
        return 1
    def __getitem__(self, name):
        return self.handler

When such an object is passed to WSGIServer, the specified handler always gets
control, although it'd surely be preferable for so-called "WSGI middleware"
to manage the URL space.

Paul

From titus at caltech.edu Wed Dec 1 01:04:50 2004
From: titus at caltech.edu (Titus Brown)
Date: Wed Dec 1 01:05:02 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <200412010020.42195.paul@boddie.org.uk>
References: <20041130190154.GA12058@caltech.edu> <200412010020.42195.paul@boddie.org.uk>
Message-ID: <20041201000450.GB12543@caltech.edu>

-> > My experience highlights an issue that needs to be dealt with by any
-> > WSGI server code. Several app frameworks -- Quixote, Webware, and Zope,
-> > for example -- expect to be handed control of an entire URL tree.
->
-> I handled this control issue using the following code:
->
-> # Magic dictionary for WSGIServer.
->
-> class MagicDict:
->     def __init__(self, handler):
->         self.handler = handler
->     def has_key(self, name):
->         return 1
->     def __getitem__(self, name):
->         return self.handler
->
-> When such an object is passed to WSGIServer, the specified handler always gets
-> control, although it'd surely be preferable for so-called "WSGI middleware"
-> to manage the URL space.

this still has the problem that env["SCRIPT_NAME"] and env["PATH_INFO"]
aren't munged appropriately, no? I know this would be a problem with
Quixote, not sure about the rest.

--titus

From colin at owlfish.com Wed Dec 1 02:06:34 2004
From: colin at owlfish.com (Colin Stewart)
Date: Wed Dec 1 02:06:46 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
Message-ID: <1101863195.15522.42.camel@rock>

Hi,

(I've subscribed to the list so we can continue discussion purely on-list)

> >The only real problem in getting this to work was that wsgiServer.py
> >expected *every* URL under /demo to be registered to demo_obj. I
> >changed the wsgiServer.py code to allow for partial matches & munged
> >the SCRIPT_NAME and PATH_INFO variables appropriately. I also added
> >REQUEST_URI because Quixote uses it for a few things; this should
> >probably be moved into QWIP.
>
> I think I'm going to have to call that point out in the PEP
> somewhere. Technically, the PEP requires that SCRIPT_NAME and PATH_INFO be
> set, but I think perhaps some folks have missed the implications of that
> for the URL path space.

The clarification is good - it certainly wasn't clear to me the first
time I read it!

A quick question about the SCRIPT_NAME: If an application registers for
the path '/testapp/' should SCRIPT_NAME be set to '/testapp',
'/testapp/', or even 'testapp'?. I've implemented the first one in my
latest version of wsgiServer, but I want to make sure that's correct.

Colin.
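
[Editorial sketch, not part of the original message.] The three candidate
values in Colin's question can be compared mechanically. The replies that
follow in this thread state the rule that SCRIPT_NAME and PATH_INFO must
each be empty or begin with '/'. The helper names is_valid_cgi_path and
check_candidate below are hypothetical, chosen only for this illustration,
and the code assumes Python 3 rather than the Python of 2004.

```python
def is_valid_cgi_path(value):
    """SCRIPT_NAME and PATH_INFO must each be empty or begin with '/'."""
    return value == "" or value.startswith("/")


def check_candidate(script_name, request_path):
    """Return the PATH_INFO implied by using script_name for request_path,
    or None if the candidate cannot be a valid SCRIPT_NAME for that path."""
    if not request_path.startswith(script_name):
        return None
    # Whatever follows the candidate prefix would have to become PATH_INFO.
    path_info = request_path[len(script_name):]
    if is_valid_cgi_path(script_name) and is_valid_cgi_path(path_info):
        return path_info
    return None


# Try each of the three candidates against a request below the application.
for candidate in ("/testapp", "/testapp/", "testapp"):
    print(repr(candidate), "->", repr(check_candidate(candidate, "/testapp/page")))
# '/testapp' -> '/page'
# '/testapp/' -> None
# 'testapp' -> None
```

Only '/testapp' survives: with '/testapp/' the remainder 'page' would not
begin with '/', and 'testapp' is not a prefix of the request path at all.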
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/web-sig/attachments/20041130/c14135d5/attachment.html

From colin at owlfish.com Wed Dec 1 02:16:45 2004
From: colin at owlfish.com (Colin Stewart)
Date: Wed Dec 1 02:16:54 2004
Subject: [Web-SIG] Re: WSGI Utils & SCGI/Quixote.
In-Reply-To: <20041130190154.GA12058@caltech.edu>
References: <20041130190154.GA12058@caltech.edu>
Message-ID: <1101863806.15522.50.camel@rock>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: path_fix.patch
Type: text/x-patch
Size: 3290 bytes
Desc: not available
Url : http://mail.python.org/pipermail/web-sig/attachments/20041130/3e4f1652/path_fix-0001.bin

From titus at caltech.edu Wed Dec 1 02:29:25 2004
From: titus at caltech.edu (Titus Brown)
Date: Wed Dec 1 02:29:30 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <1101863195.15522.42.camel@rock>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com> <1101863195.15522.42.camel@rock>
Message-ID: <20041201012925.GA20972@caltech.edu>

-> > >The only real problem in getting this to work was that wsgiServer.py
-> > >expected *every* URL under /demo to be registered to demo_obj. I
-> > >changed the wsgiServer.py code to allow for partial matches & munged
-> > >the SCRIPT_NAME and PATH_INFO variables appropriately. I also added
-> > >REQUEST_URI because Quixote uses it for a few things; this should
-> > >probably be moved into QWIP.
-> >
-> > I think I'm going to have to call that point out in the PEP
-> > somewhere. Technically, the PEP requires that SCRIPT_NAME and PATH_INFO be
-> > set, but I think perhaps some folks have missed the implications of that
-> > for the URL path space.
->
->
-> The clarification is good - it certainly wasn't clear to me the first
-> time I read it!
-> -> A quick question about the SCRIPT_NAME: If an application registers for -> the path '/testapp/' should SCRIPT_NAME be set to '/testapp', -> '/testapp/', or even 'testapp'?. I've implemented the first one in my -> latest version of wsgiServer, but I want to make sure that's correct. Well, 'testapp' would be ruled out because of the requirement that SCRIPT_NAME + PATH_INFO == REQUEST_URI (where REQUEST_URI is everything after the host/port info). I'd be happy with the literal case, myself, but I'm not sure how anything other than Quixote deals with the URLs. --titus From ianb at colorstudy.com Wed Dec 1 02:42:05 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Dec 1 02:42:04 2004 Subject: [Web-SIG] WSGI Utils & SCGI/Quixote. In-Reply-To: <1101863195.15522.42.camel@rock> References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com> <1101863195.15522.42.camel@rock> Message-ID: <41AD216D.2060107@colorstudy.com> Colin Stewart wrote: > (I've subscribed to the list so we can continue discussion purely on-list) > >>>The only real problem in getting this to work was that wsgiServer.py >>>expected *every* URL under /demo to be registered to demo_obj. I >>>changed the wsgiServer.py code to allow for partial matches & munged >>>the SCRIPT_NAME and PATH_INFO variables appropriately. I also added >>>REQUEST_URI because Quixote uses it for a few things; this should >>>probably be moved into QWIP. >> >>I think I'm going to have to call that point out in the PEP >>somewhere. Technically, the PEP requires that SCRIPT_NAME and PATH_INFO be >>set, but I think perhaps some folks have missed the implications of that >>for the URL path space. >> > > The clarification is good - it certainly wasn't clear to me the first > time I read it! > > A quick question about the SCRIPT_NAME: If an application registers for > the path '/testapp/' should SCRIPT_NAME be set to '/testapp', > '/testapp/', or even 'testapp'?. 
I've implemented the first one in my > latest version of wsgiServer, but I want to make sure that's correct. Because PATH_INFO must either be empty or start with a /, SCRIPT_NAME should be "/testapp" (no trailing /). If the script registers for the root (i.e., all URLs), SCRIPT_NAME should be "", and PATH_INFO contains the entire URL. -- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Wed Dec 1 02:53:12 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Dec 1 02:55:35 2004 Subject: [Web-SIG] WSGI Utils & SCGI/Quixote. In-Reply-To: <1101863195.15522.42.camel@rock> References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com> <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20041130204832.0243c770@mail.telecommunity.com> At 08:06 PM 11/30/04 -0500, Colin Stewart wrote: >A quick question about the SCRIPT_NAME: If an application registers for >the path '/testapp/' should SCRIPT_NAME be set to '/testapp', '/testapp/', >or even 'testapp'?. I've implemented the first one in my latest version >of wsgiServer, but I want to make sure that's correct. Yes, the first one is correct. SCRIPT_NAME and PATH_INFO must both *always* either begin with a '/', or be empty strings. Technically, I would recommend that an app register as '/testapp' rather than '/testapp/', but as long as you generate a valid SCRIPT_NAME and PATH_INFO, it's not a compliance issue, as long as your web server can distinguish between: GET /testapp/ and GET /testapp which should produce a PATH_INFO of '/' in the first case, and an empty string in the second. It may be that some web servers exist that are biased towards physical URL mappings and don't pay attention to this. Perhaps I should add some clarification to the PEP on this point, although maybe instead some kind soul will volunteer to write a nice article about tips and traps for WSGI server implementors. 
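
[Editorial sketch, not part of the original message.] As a rough
illustration of the rules described above, here is a minimal prefix-matching
dispatcher written as WSGI middleware. It is only a sketch under stated
assumptions: the names PrefixDispatcher, show_paths and call are invented
for this example and are not taken from WSGIUtils or the PEP; mount points
are assumed to be written without a trailing slash ('' for the root); and
the code targets Python 3.

```python
class PrefixDispatcher:
    """Route requests to WSGI applications by URL path prefix (illustrative)."""

    def __init__(self, mounts):
        # mounts maps a prefix such as '/testapp' ('' for the root) to an
        # application. Longest prefixes are tried first.
        self.mounts = sorted(mounts.items(), key=lambda item: len(item[0]), reverse=True)

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in self.mounts:
            rest = path[len(prefix):]
            # Match only at a segment boundary: the remainder must be empty
            # or begin with '/', so '/testapp' never claims '/testapplication'.
            if path.startswith(prefix) and rest[:1] in ("", "/"):
                # Move the matched prefix from PATH_INFO onto SCRIPT_NAME.
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = rest
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]


def show_paths(environ, start_response):
    """Demo application: report the two path variables it was handed."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    text = "SCRIPT_NAME=%r PATH_INFO=%r" % (environ["SCRIPT_NAME"], environ["PATH_INFO"])
    return [text.encode("utf-8")]


def call(app, path):
    """Invoke app for path and return (status, body text)."""
    statuses = []
    body = app({"SCRIPT_NAME": "", "PATH_INFO": path},
               lambda status, headers: statuses.append(status))
    return statuses[0], b"".join(body).decode("utf-8")


dispatcher = PrefixDispatcher({"/testapp": show_paths})
print(call(dispatcher, "/testapp/"))         # ('200 OK', "SCRIPT_NAME='/testapp' PATH_INFO='/'")
print(call(dispatcher, "/testapp"))          # ('200 OK', "SCRIPT_NAME='/testapp' PATH_INFO=''")
print(call(dispatcher, "/testapplication"))  # ('404 Not Found', 'Not Found')
```

The first two calls show the distinction between GET /testapp/ and
GET /testapp; in both cases SCRIPT_NAME + PATH_INFO reproduces the
request path.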
:) From janssen at parc.com Wed Dec 1 03:57:33 2004 From: janssen at parc.com (Bill Janssen) Date: Wed Dec 1 03:57:58 2004 Subject: [Web-SIG] WSGI configuration and character encoding. In-Reply-To: Your message of "Tue, 30 Nov 2004 14:07:53 PST." <5.1.1.6.0.20041130170653.038ef0d0@mail.telecommunity.com> Message-ID: <04Nov30.185740pst."58617"@synergy1.parc.xerox.com> I think you're either dreaming, or have a much different idea of "non-technical user" than I have. Bill Phillip Eby wrote: > At 12:10 PM 11/30/04 -0800, Bill Janssen wrote: > > > I don't think I could honestly expect non-technical users to be able to > > > get their head around RFC 2047. > > > >I wouldn't be expecting non-technical users to be editing WSGI config > >files in the first place. > > That's one of our explicit requirements, actually. We don't need them to > be able to *create* a deployment file, but they should be able to edit one > to tweak file paths and such. From pje at telecommunity.com Wed Dec 1 04:19:21 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Dec 1 04:17:48 2004 Subject: [Web-SIG] WSGI configuration and character encoding. In-Reply-To: <04Nov30.185740pst."58617"@synergy1.parc.xerox.com> References: Message-ID: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> At 06:57 PM 11/30/04 -0800, Bill Janssen wrote: >I think you're either dreaming, or have a much different idea of >"non-technical user" than I have. Well, by definition in this context, they need to be somebody who can edit a simple configuration file. If not, then it doesn't matter how simple a configuration file we make it! (Also, presumably they're not going to be able to configure their web server, either.) The point is to require as few skills as possible beyond "can edit a configuration file". :) As to the "non-technical user" part, I am thinking of a person who is not technically inclined. 
That is, someone who may do technical-ish things when forced to, but has no inherent interest in them, and little patience for them. Someone who's maybe edited an HTML file or PHP script in order to change something, but doesn't actually *know* any HTML or PHP, they just figure it out as they go. They are not technophobic, just not techno-interested. :) If we have to tell this person how to get into Notepad in order to do the editing, or explain to them what a filename is, then they are not reasonably within the scope of this effort, and the application author should attempt to charge them an installation fee. :) On the other hand, if we have to tell them about \u escapes or RFC 2047 or XML entities, then it is us who have gone out of scope, and we do not deserve to get any of their money. :) From janssen at parc.com Wed Dec 1 04:46:19 2004 From: janssen at parc.com (Bill Janssen) Date: Wed Dec 1 04:46:51 2004 Subject: [Web-SIG] WSGI configuration and character encoding. In-Reply-To: Your message of "Tue, 30 Nov 2004 19:19:21 PST." <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> Message-ID: <04Nov30.194625pst."58617"@synergy1.parc.xerox.com> So maybe the charset of the contents of the config file should just be whatever the locale of the machine says it is. Presumably that's what will drive the simple text editor that the user will be using to create/edit the file. Bill From ianb at colorstudy.com Wed Dec 1 04:55:11 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Dec 1 04:55:12 2004 Subject: [Web-SIG] WSGI configuration and character encoding. In-Reply-To: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> References: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> Message-ID: <41AD409F.7000507@colorstudy.com> Phillip J. Eby wrote: > At 06:57 PM 11/30/04 -0800, Bill Janssen wrote: > >> I think you're either dreaming, or have a much different idea of >> "non-technical user" than I have. 
> > > Well, by definition in this context, they need to be somebody who can > edit a simple configuration file. If not, then it doesn't matter how > simple a configuration file we make it! (Also, presumably they're not > going to be able to configure their web server, either.) The point is > to require as few skills as possible beyond "can edit a configuration > file". :) FWIW, a lot of PHP applications these days use through-the-web configuration; dump the files somewhere web-accessible, make sure at least a few select files are writable by Apache, and the rest has a GUI (of sorts). Even I find this quite convenient. Though I just encountered an application that took this too far, and stored preference information in the database, including the database connection information. It confused me greatly when the two weren't in sync, and it tried to reconnect to a database that no longer existed after I moved the application to another server. But I digress. We aren't where (mindful) PHP is (or even close), but it's something to shoot for. This may not actually apply to deployment configuration files, except that it would be nice if cooperative software could be packaged with a deployment configuration file that didn't need editing. At which point it might as well be a Python script that sets up the necessary objects. Python can be much smarter about this than any configuration file. Which is why I don't really think deployment configuration is all that important. It doesn't hurt, but I don't think it should hold up the PEP in any way -- I think the PEP is entirely sufficient as it is, and we can figure out deployment or async or whatever in other PEPs, or in a later revision to WSGI. -- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Wed Dec 1 05:12:12 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Dec 1 05:10:39 2004 Subject: [Web-SIG] WSGI configuration and character encoding. 
In-Reply-To: <04Nov30.194625pst."58617"@synergy1.parc.xerox.com> References: Message-ID: <5.1.1.6.0.20041130231033.024f6620@mail.telecommunity.com> At 07:46 PM 11/30/04 -0800, Bill Janssen wrote: >So maybe the charset of the contents of the config file should just be >whatever the locale of the machine says it is. Presumably that's what >will drive the simple text editor that the user will be using to >create/edit the file. For some platforms, that certainly would make sense. Of course, this is also why I was just thinking plain ASCII, at least until somebody pointed out that some OSes have Unicode filenames. IMO, this is really the only use case that supports having Unicode support at all; everything we need for HTTP itself is either ISO-Latin-1 or "byte strings". From pje at telecommunity.com Wed Dec 1 05:22:56 2004 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Dec 1 05:21:23 2004 Subject: [Web-SIG] WSGI configuration and character encoding. In-Reply-To: <41AD409F.7000507@colorstudy.com> References: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com> At 09:55 PM 11/30/04 -0600, Ian Bicking wrote: >We aren't where (mindful) PHP is (or even close), but it's something to >shoot for. This may not actually apply to deployment configuration files, >except that it would be nice if cooperative software could be packaged >with a deployment configuration file that didn't need editing. At which >point it might as well be a Python script that sets up the necessary >objects. Python can be much smarter about this than any configuration file. Here's what I'm thinking: paths in the file should be allowed to be relative to the directory containing the deployment file, and the configuration passed to the application or its setup should include the path to the deployment file. 
The combination of these two things would suffice to allow distribution of an application in a largely ready-to-deploy form. The application could always provide facilities to edit its own configuration file(s) or the deployment configuration. This doesn't mean that some simple apps or middleware won't end up using the deployment file for all their configuration needs, but that may well be okay for their target audiences. The key is to have a path to near-turnkey installation, if possible. >Which is why I don't really think deployment configuration is all that >important. It doesn't hurt, but I don't think it should hold up the PEP >in any way -- I think the PEP is entirely sufficient as it is, and we can >figure out deployment or async or whatever in other PEPs, or in a later >revision to WSGI. Hmm, I seem to recall you arguing almost the opposite about a year ago... ;) For example, that it was really important for apps to know what web server they were running in, and conversely that they expose lots of configuration data to the web server. Anyway, there's nothing really "holding up" the PEP; people are making implementations, and we're so far only finding things that need clarification, not fixing. So clearly the PEP itself is in fairly good shape. I probably should block out some time in the next week or two to apply the pending updates and write that sync/async/threading primer. I'd also still like to see a solid async API proposal, and I'd like to *make* a deployment format proposal, once I get a few other things taken care of. From ianb at colorstudy.com Wed Dec 1 19:41:48 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Dec 1 19:45:21 2004 Subject: [Web-SIG] WSGI configuration and character encoding. 
In-Reply-To: <5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com> References: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> <5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com> Message-ID: <41AE106C.4050403@colorstudy.com> Phillip J. Eby wrote: > At 09:55 PM 11/30/04 -0600, Ian Bicking wrote: > >> We aren't where (mindful) PHP is (or even close), but it's something >> to shoot for. This may not actually apply to deployment configuration >> files, except that it would be nice if cooperative software could be >> packaged with a deployment configuration file that didn't need >> editing. At which point it might as well be a Python script that sets >> up the necessary objects. Python can be much smarter about this than >> any configuration file. > > > Here's what I'm thinking: paths in the file should be allowed to be > relative to the directory containing the deployment file, and the > configuration passed to the application or its setup should include the > path to the deployment file. The combination of these two things would > suffice to allow distribution of an application in a largely > ready-to-deploy form. The application could always provide facilities > to edit its own configuration file(s) or the deployment configuration. This leads to the question: when would you edit the deployment file? Besides just using a different directory prefix? A given application has a fixed set of requirements, why not just code them up in Python? Well, I can imagine reasons, but I think we need to start from use cases. So here's a use case: I have a Wiki application, written for Webware. I can expose it at a few different levels -- the many Webware servlets (which are applications), or a single application, and that application can have more or less functionality (depending on how much functionality I expect the parent to have -- e.g., session support). 
I also require configuration for the Wiki, though it probably could be installed with no configuration and reasonable defaults. (Should it create a template configuration file with defaults in this case? Where to put it?) Because it is based on Webware (WSGIKit), it requires a bunch of middleware. If there are other Webware applications in the stack, *maybe* it would be useful to share that middleware. Maybe not, maybe just sharing configuration would be sufficient. So, where does deployment fit in here? Probably all you'd need would be to give the path to the application (maybe an importable package name), and an optional path to the configuration file for the application. I don't need the configuration until runtime, though. Or, it could be inverted. The Wiki application is the front-facing object, and you tell it what server you want to use. Both could even coexist fairly easily. Maybe it would be smart enough to tell if it was being run as a CGI script (just by looking at the environment), and if not it would have options to start up some kind of server. And it would export some conventional name, so you could point some other server at it, using whatever mechanisms that server uses (which probably includes giving the application a URL space). As an application distributor this seems like the easiest thing to describe and support. > This doesn't mean that some simple apps or middleware won't end up using > the deployment file for all their configuration needs, but that may well > be okay for their target audiences. The key is to have a path to > near-turnkey installation, if possible. > > >> Which is why I don't really think deployment configuration is all that >> important. It doesn't hurt, but I don't think it should hold up the >> PEP in any way -- I think the PEP is entirely sufficient as it is, and >> we can figure out deployment or async or whatever in other PEPs, or in >> a later revision to WSGI. 
> > > Hmm, I seem to recall you arguing almost the opposite about a year > ago... ;) For example, that it was really important for apps to know > what web server they were running in, and conversely that they expose > lots of configuration data to the web server. Consistency is the hobgoblin of little minds! Well, I don't know if I've been consistent or not, but I don't place much weight in it either way ;) I guess I don't really want WSGI to be exposed to less-technical web developers, or to people who install applications based on it. So I'd like the pieces to communicate with each other fairly freely and completely. If we can automate something, that's great -- like including process information in the WSGI environment. But configuration isn't automation, so it doesn't excite me a lot. But of course configuration exists, so if it exists then I'd like to keep it together, because I find lots of configuration files to be hard to navigate (as a user). I also want configuration to be optional, and deployment configuration isn't very optional. -- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org From colin at owlfish.com Thu Dec 2 06:20:54 2004 From: colin at owlfish.com (Colin Stewart) Date: Thu Dec 2 06:21:04 2004 Subject: [Web-SIG] ANN: WSGIUtils 0.3 Message-ID: <1101964854.26020.5.camel@rock> Hi, Following on from the discussion here regarding the handling of URLs I've uploaded a new version of WSGIUtils (http://www.owlfish.com/software/wsgiutils/) that should behave as the spec intended. Any feedback or suggestions are welcome... Colin. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/web-sig/attachments/20041202/47505a76/attachment.htm From titus at caltech.edu Thu Dec 2 08:22:07 2004 From: titus at caltech.edu (Titus Brown) Date: Thu Dec 2 08:22:10 2004 Subject: [Web-SIG] ANN: WSGIUtils 0.3 In-Reply-To: <1101964854.26020.5.camel@rock> References: <1101964854.26020.5.camel@rock> Message-ID: <20041202072207.GA26907@caltech.edu> -> Following on from the discussion here regarding the handling of URLs -> I've uploaded a new version of WSGIUtils -> (http://www.owlfish.com/software/wsgiutils/) that should behave as the -> spec intended. Any feedback or suggestions are welcome... Hi, Colin, sorry I waited 'til you cut a new release to test out your patch! Everything now works as expected, except for an omission in wsgiAdaptor.py where 'setContentType' isn't defined on the Request class. One other request -- could you omit (or replace) the space in the directory name, e.g. rather than "WSGI Utils-0.3" make it "WSGIUtils-0.3" or something similar? Ugly, I know, but that space can be awkward in a command-line environment. cheers, --titus From ianb at colorstudy.com Thu Dec 2 20:20:26 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Thu Dec 2 20:23:57 2004 Subject: [Web-SIG] WSGI Utils & SCGI/Quixote. In-Reply-To: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com> References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com> Message-ID: <41AF6AFA.8010205@colorstudy.com> Phillip J. Eby wrote: > At 11:01 AM 11/30/04 -0800, Titus Brown wrote: > >> The only real problem in getting this to work was that wsgiServer.py >> expected *every* URL under /demo to be registered to demo_obj. I >> changed the wsgiServer.py code to allow for partial matches & munged >> the SCRIPT_NAME and PATH_INFO variables appropriately. I also added >> REQUEST_URI because Quixote uses it for a few things; this should >> probably be moved into QWIP. > > > I think I'm going to have to call that point out in the PEP somewhere. 
> Technically, the PEP requires that SCRIPT_NAME and PATH_INFO be set, but
> I think perhaps some folks have missed the implications of that for the
> URL path space.
>
> Perhaps something like this would do the trick:
>
> """
> Application Placement in Server URL Space
> -----------------------------------------
>
> In order to generate correct SCRIPT_NAME and PATH_INFO variables,
> servers and gateways MUST treat an application's location as a URL path
> prefix. That is, servers and gateways:
>
> * MUST determine the target application using a matching prefix of the
>   request path (which then determines the value of SCRIPT_NAME).
>
> * MUST take the remaining portion of the request path, and use it to
>   determine PATH_INFO. (Note that the remainder must be empty or begin
>   with a '/', otherwise the prefix match was invalid!)
>
> * MUST assume that there are an infinite number of possible URL paths
>   that may appear as a PATH_INFO suffix "beneath" the application's base URL

I think this is too restrictive. It's the natural way to do things in
most cases, but there's no reason to enforce it. E.g., a
mod_rewrite-like middleware might do any number of things; it's a
use-at-your-own-risk proposition (with considerable risk, at least from
my own mod_rewrite experiences), but it shouldn't be disallowed, and
this appears to disallow that kind of code.

A particular use case came to my mind today. Imagine a login middleware
-- it wants to allow login and logout, but otherwise interrupt the
request cycle as little as possible. So, let's say an application
requires login; maybe it sends a 401. The login middleware catches it,
sees that it's configured for cookie-based (form) login, and turns it
into a 200 with a login form. The user logs in, and goes to their
original page.
You want to customize the login form, so the form might be an
application that doesn't belong to the login middleware (but uses
conventional keys); the URL belongs to the originally-requested
application, but the application being served is some other
application. Or later, if they try to log in but fail, their URL may
still be pointing at the original application (useful if they were
submitting a POST form, which you want to pass through to the original
URL, and it's difficult to do that with a redirect-after-submit).

There's a bunch of other ways this could be factored, but a number of
them involve dispatching to an application based on query string, or in
some way where SCRIPT_NAME and PATH_INFO don't have any relation to the
application at all.

So I'd say these should all be SHOULDs, not MUSTs. Or they should
simply be put in as implementation recommendations. In general I don't
think this should be a problem, because implementors will respond to
feedback, and if it's really a problem it will be addressed (and
probably fairly quickly). There's a lot of weird use cases for
application dispatching, and I don't see any reason to restrict that by
formalizing how dispatching should work.

> Notice that these requirements imply that servers and gateways:
>
> * MUST NOT use query string contents, fragment identifiers, or URL
>   parameters to determine the application object that a request should be
>   sent to.
>
> * MUST NOT require that every URL path used by the application be
>   preconfigured or pre-registered with the server, or have some required
>   mapping to existing files, or any other requirement that would make
>   dynamic URLs impractical.
>
> A server or gateway that cannot meet these requirements IS NOT COMPLIANT
> with this specification; it would be completely unusable for
> applications from many popular Python web frameworks including at least
> Zope, Webware, and Quixote, and many standalone Python web applications
> as well.
> """ From ianb at colorstudy.com Thu Dec 2 20:27:47 2004 From: ianb at colorstudy.com (Ian Bicking) Date: Thu Dec 2 20:31:17 2004 Subject: [Web-SIG] WSGI configuration and character encoding. In-Reply-To: <41AF600B.3010006@jdiworks.net> References: <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> <5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com> <41AE106C.4050403@colorstudy.com> <41AF600B.3010006@jdiworks.net> Message-ID: <41AF6CB3.3020908@colorstudy.com> Terrel Shumway wrote: > I haven't been following this thread closely, but here is my $.02 based > on my continuing experience in implementing a Cheetah framework over > WSGI and deploying it via CGI and mod_python. > > Ian Bicking wrote: > >> Phillip J. Eby wrote: >> >>> Here's what I'm thinking: paths in the file should be allowed to be >>> relative to the directory containing the deployment file, and the >>> configuration passed to the application or its setup should include >>> the path to the deployment file. The combination of these two things >>> would suffice to allow distribution of an application in a largely >>> ready-to-deploy form. The application could always provide >>> facilities to edit its own configuration file(s) or the deployment >>> configuration. >> >> >> >> This leads to the question: when would you edit the deployment file? >> Besides just using a different directory prefix? > > > The "packager" edits the and bundles the code, > templates, etc. that the app requires. This is often a different person > from the "programmer" who creates the components. I expect the packager to have programming skills, even if they aren't the same person as the application programmer. > The server administrator or "webmaster" is the person who edits the > file to assign a URL space to each app. The > webmaster might have his own middleware to add, e.g. for extra logging, > or performance monitoring. 
The webmaster may not have programming skills (or at least not Python). Though depending on the sophistication of the integration, I'd be okay with some programming required. For instance, if you are integrating your login method with the application's, it might be necessary to do some programming -- simple login sharing should be easy, but sharing user metadata and administrative operations (e.g., adding users and the like) will probably require programming (unless the systems are specifically meant to work with each other -- i.e., another standardized interface). > A given application has a fixed set of requirements, why not just code > them up in Python? Well, I can imagine reasons, but I think we need to > start from use cases. > > So here's a use case: > > > that application can have more or less functionality (depending on > how much functionality I expect the parent to have -- e.g., session > support). I'd like this to be automated. If, for instance, we can standardize the session interface this should be doable. The application looks for, say, session.api_1 (standard session API, version 1). If it finds it, it uses it, knowing the interface. If not, it puts in its own piece of middleware that provides the API. Until we standardize that, we'll be doing this stuff ad hoc, but that's okay -- this is an ongoing process. > Two core features that are "required" by a Java servlet container are > session support, and login support. And every existing container that I > know of also supports JSP. JSP is funny, and not a model widely used for Python (I think). It certainly doesn't seem as fundamental in a WSGI model, where URLs aren't necessarily mapped to files. That is, I don't think there's any kind of file you can just plop into a WSGI container and it will display; not even Python source. There's no file-like container, and there's no standard URL->object mapping or configuration. 
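
To make the session.api_1 idea above a little more concrete, here is a minimal sketch. Only the key name "session.api_1" comes from the message; the class name FallbackSessionMiddleware, the dict-like session interface, and the single in-memory store are assumptions for illustration (a real session layer would key per user, e.g. by cookie).

```python
SESSION_KEY = "session.api_1"  # key name from the message; the rest is hypothetical


class FallbackSessionMiddleware:
    """Supply a dict-like session under SESSION_KEY only if the parent did not."""

    def __init__(self, app):
        self.app = app
        # One shared in-memory dict stands in for a real per-user session store.
        self.store = {}

    def __call__(self, environ, start_response):
        if SESSION_KEY not in environ:
            # The parent server/middleware offers no session, so provide our own.
            environ[SESSION_KEY] = self.store
        return self.app(environ, start_response)


def counter_app(environ, start_response):
    """Application that relies only on the agreed key, not on who provides it."""
    session = environ[SESSION_KEY]
    session["hits"] = session.get("hits", 0) + 1
    body = ("hits: %d" % session["hits"]).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


application = FallbackSessionMiddleware(counter_app)
```

If a parent container already places an object under that key, the wrapper leaves it alone, which is the "use it if found, otherwise install our own middleware" behaviour described above.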
I'm not entirely sure there should be a standard, at least not one we expect most people to use... certainly several frameworks could share implementations, to the degree they act similarly. But object publishers (e.g., Quixote, Zope) and file-based systems (e.g., Webware, Spyce) are going to remain fairly separate.

> These three features -- session, login, templates -- are needed by
> enough people that I think they should be standard. (e.g. how difficult
> would it be to create a wiki if you could rely on the framework for
> these? -- 80% of the work is done.)
>
> Another kit that might have broad application is a formkit -- a
> higher-level way to manage posted form data -- but that probably
> doesn't belong in WSGI (PEP), because there are a lot of different ways
> people want to do it.

I don't think this needs to be part of the request cycle at all -- the application is always an intermediary there. It's simply a library.

> Probably ditto for templates. But in python, "you really only have to do
> it one way". There should be one (1) easy way to say how a container
> interacts with a template engine. I'm not sure what that means yet, but
> I'll think about it.

At first I thought templates should just be a library as well, though it would be nice if applications could share templating configuration. I.e., you could indicate a template search path, maybe adding more paths on a per-application basis, or otherwise fiddling with that path (e.g., skinning an application based on URL). But we can handle that in a neutral way, i.e., providing a generic configuration system that template users can use as they wish (though they'll want to form conventions about key naming; or we can provide conventions about how to adapt configurations to different naming schemes).

> e.g. the framework I am building can get template files from different
> places to easily support skinning.
It would be nice to say "get template > X and fill it from these variables" without worrying about where X > resides in a filesystem or .zip archive. (along the lines of the java > ServletContext.getResource*() methods) If I deploy four applications > (Contexts), I want them to share template files > (/var/www/sitename/templates/) so the designer can change the look and > feel of the whole site at once. In my case, I also want a set of > templates shared among many sites on the same server (/var/www/templates/) > There should be a standard way for servlet authors to say "this is the > 'content' piece that I care about, and here are some styles and > content. Now you put it together inside the site-wide templates to > create the page." And it shouldn't matter to the developer whether that > sitewide template is implemented with Cheetah or CherryPy or Quixote or > ZTP or whatever. This does make me think templating could participate in the request cycle, as a filter of sorts. Right now we're trying to move to SSIs as a shared templating scheme, at least when we move to Apache 2, because all our scripts can output SSIs and Apache will evaluate them. Maybe not everyone will want SSIs (obviously), but maybe this general pattern can be used -- one of filtering text. I don't know what else we can agree on, especially in environments where everything isn't Python. This is akin to an XSLT-based templating approach, but of course there are much better languages than XSLT that we can come up with ;) If we do it as filtering, we don't have to agree nearly as much about templating languages or even interfaces. We just have to agree on a document format, which somehow seems easier. We wouldn't even have to agree that much on document format; if we were using SSIs, we could make something that transforms document type X into SSIs, and then Apache does the next step. This is an N^2 problem, given N kinds of data/template languages, but at least it offers some kind of solution. 
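
A rough sketch of the filtering idea, under stated assumptions: SkinFilter, SITE_TEMPLATE and the {content} placeholder are hypothetical names, and this is neither SSI nor any framework's API. It is plain WSGI middleware that buffers an HTML fragment from the application and wraps it in a site-wide page, so the application and the site template only have to agree on the document format.

```python
# Hypothetical site-wide wrapper; {content} marks where the fragment goes.
SITE_TEMPLATE = b"<html><body><div id='site'>{content}</div></body></html>"


class SkinFilter:
    """Buffer the wrapped app's output and, for HTML, embed it in a template."""

    def __init__(self, app, template=SITE_TEMPLATE):
        self.app = app
        self.template = template

    def __call__(self, environ, start_response):
        captured = {}
        chunks = []

        def capture(status, headers, exc_info=None):
            # Hold back the real start_response until the body is known.
            captured["status"] = status
            captured["headers"] = list(headers)
            return chunks.append  # stands in for the legacy write() callable

        result = self.app(environ, capture)
        try:
            chunks.extend(result)
        finally:
            if hasattr(result, "close"):
                result.close()

        body = b"".join(chunks)
        headers = captured["headers"]
        ctype = dict((k.lower(), v) for k, v in headers).get("content-type", "")
        if ctype.startswith("text/html"):
            body = self.template.replace(b"{content}", body)

        # The length may have changed, so recompute Content-Length.
        headers = [(k, v) for k, v in headers if k.lower() != "content-length"]
        headers.append(("Content-Length", str(len(body))))
        start_response(captured["status"], headers)
        return [body]
```

Buffering the whole response is the simplest choice for a sketch; a streaming filter would need more care. Non-HTML responses pass through with only the Content-Length header recomputed.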
-- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org

From pje at telecommunity.com Thu Dec 2 20:40:23 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Dec 2 20:38:54 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <41AF6AFA.8010205@colorstudy.com>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com> <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20041202143501.02b80480@mail.telecommunity.com>

At 01:20 PM 12/2/04 -0600, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>Application Placement in Server URL Space
>>-----------------------------------------
>>In order to generate correct SCRIPT_NAME and PATH_INFO variables, servers
>>and gateways MUST treat an application's location as a URL path
>>prefix. That is, servers and gateways:
>>* MUST determine the target application using a matching prefix of the
>>request path (which then determines the value of SCRIPT_NAME).
>>* MUST take the remaining portion of the request path, and use it to
>>determine PATH_INFO. (Note that the remainder must be empty or begin with
>>a '/', otherwise the prefix match was invalid!)
>>* MUST assume that there are an infinite number of possible URL paths
>>that may appear as a PATH_INFO suffix "beneath" the application's base URL
>
>I think this is too restrictive. It's the natural way to do things in
>most cases, but there's no reason to enforce it. E.g., a mod_rewrite-like
>middleware might do any number of things; it's a use-at-your-own-risk
>proposition (with considerable risk, at least from my own mod_rewrite
>experiences), but it shouldn't be disallowed, and this appears to disallow
>that kind of code.
>
>A particular use case came to my mind today. Imagine a login middleware
>-- it wants to allow login and logout, but otherwise interrupt the request
>cycle as little as possible. So, let's say an application requires login;
>maybe it sends a 401.
The login middleware catches it, sees that it's
>configured for cookie-based (form) login, and turns it into a 200 with a
>login form.

You're focusing here on middleware; IMO the above is valid as long as it's applied only to servers and gateways, rather than middleware. It just needs a parenthetical to indicate that these restrictions don't apply to middleware.

From pje at telecommunity.com Thu Dec 2 22:35:13 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Dec 2 22:33:45 2004
Subject: [Web-SIG] WSGI configuration and character encoding.
In-Reply-To: <41AF6CB3.3020908@colorstudy.com>
References: <41AF600B.3010006@jdiworks.net> <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> <5.1.1.6.0.20041130221022.0278aaf0@mail.telecommunity.com> <5.1.1.6.0.20041130231218.024f1e60@mail.telecommunity.com> <41AE106C.4050403@colorstudy.com> <41AF600B.3010006@jdiworks.net>
Message-ID: <5.1.1.6.0.20041202163249.0287aa30@mail.telecommunity.com>

At 01:27 PM 12/2/04 -0600, Ian Bicking wrote:
>That is, I don't think there's any kind of file you can just plop into a
>WSGI container and it will display;

That's actually one use case for the deployment file. That is, if you can just plop a .wsgi file in there with the deployment information, and have the application's virtual URL space simply be "beneath" the URL of the .wsgi file. (Or whatever filename it has.)

From terrel at terrelshumway.com Fri Dec 3 01:57:38 2004
From: terrel at terrelshumway.com (Terrel Shumway)
Date: Fri Dec 3 01:57:50 2004
Subject: [Web-SIG] WSGI Utils & SCGI/Quixote.
In-Reply-To: <41AF6AFA.8010205@colorstudy.com>
References: <5.1.1.6.0.20041130142125.02101070@mail.telecommunity.com> <41AF6AFA.8010205@colorstudy.com>
Message-ID: <41AFBA02.4060409@terrelshumway.com>

Ian Bicking wrote:
> Phillip J. Eby wrote:
>
>> I think I'm going to have to call that point out in the PEP
>> somewhere.
Technically, the PEP requires that SCRIPT_NAME and
>> PATH_INFO be set, but I think perhaps some folks have missed the
>> implications of that for the URL path space.
>>
>> Perhaps something like this would do the trick:
>>
>> """
>> Application Placement in Server URL Space
>> -----------------------------------------
>>
>> In order to generate correct SCRIPT_NAME and PATH_INFO variables,
>> servers and gateways MUST treat an application's location as a URL
>> path prefix. That is, servers and gateways:
>>
>> * MUST determine the target application using a matching prefix of
>> the request path (which then determines the value of SCRIPT_NAME).
>>
>> * MUST take the remaining portion of the request path, and use it to
>> determine PATH_INFO. (Note that the remainder must be empty or begin
>> with a '/', otherwise the prefix match was invalid!)
>>
>> * MUST assume that there are an infinite number of possible URL paths
>> that may appear as a PATH_INFO suffix "beneath" the application's
>> base URL
>
>
> I think this is too restrictive. It's the natural way to do things in
> most cases,

It is the natural way, and it is not very restrictive.

> but there's no reason to enforce it.

Reason #1: "You really only need to do it one way" which is the entire point of the WEB-SIG.

Reason #2: If you don't specify one well-documented, easily-implemented way, you will get a dozen poorly-implemented, poorly-documented ways.

> E.g., a mod_rewrite-like middleware might do any number of things;
> it's a use-at-your-own-risk proposition (with considerable risk, at
> least from my own mod_rewrite experiences), but it shouldn't be
> disallowed, and this appears to disallow that kind of code.

Reason #3: mod_rewrite is the problem. An understandable mapping convention is the solution.

[snip]

> The login middleware catches it, sees that it's configured for
> cookie-based (form) login, and turns it into a 200 with a login form.

That should be a "303 See Other" pointing to the login form.
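
For reference, a minimal sketch of the prefix rules in the proposed PEP text quoted above, written as a dispatching WSGI callable. The names match_prefix and PrefixDispatcher are hypothetical. Trying the longest registered prefix first, and registering prefixes without a trailing slash ('/testapp' rather than '/testapp/'), are assumptions of this sketch; the quoted proposal only asks for a prefix match whose remainder is empty or begins with '/'.

```python
def match_prefix(prefixes, path):
    """Return (script_name, path_info) for the first usable prefix, else None.

    Longest prefixes are tried first (an assumption of this sketch).
    """
    for prefix in sorted(prefixes, key=len, reverse=True):
        if path.startswith(prefix):
            rest = path[len(prefix):]
            # The remainder must be empty or begin with '/'; otherwise
            # '/testapp' would wrongly claim '/testapple'.
            if rest == "" or rest.startswith("/"):
                return prefix, rest
    return None


class PrefixDispatcher:
    """Hypothetical dispatcher: picks an application by URL path prefix only."""

    def __init__(self, mounts):
        self.mounts = dict(mounts)  # e.g. {"/testapp": app}

    def __call__(self, environ, start_response):
        found = match_prefix(self.mounts, environ.get("PATH_INFO", ""))
        if found is None:
            body = b"Not Found"
            start_response("404 Not Found",
                           [("Content-Type", "text/plain"),
                            ("Content-Length", str(len(body)))])
            return [body]
        prefix, rest = found
        child = dict(environ)  # leave the caller's environ untouched
        child["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
        child["PATH_INFO"] = rest
        # QUERY_STRING is passed through but never consulted for dispatch.
        return self.mounts[prefix](child, start_response)
```

Nothing is pre-registered beneath a prefix: any path under '/testapp' reaches the application with SCRIPT_NAME '/testapp' and the remainder in PATH_INFO, and the application is free to dispatch further however it likes.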
> Or later, if they try to login but fail, their URL may still be
> pointing at the original application (useful if they were submitting
> a POST form,

not really, because you still lost the original POST data. ... Unless the login middleware also saved that to a "conditional post" queue like Fastmail.FM does if your session times out while you are composing a message.

(IMO, every successful POST SHOULD respond with 303 -- avoiding 90% of all double posts. Unsuccessful POSTs should send 200, with the original form already filled out with the info that was correct, and error messages where it was not.)

> which you want to pass through to the original URL, and it's difficult
> to do that with a redirect-after-submit).

If you use cookie-based authentication, the user can usually just hit the back button twice and POST again. (not fun if they were uploading big files, but otherwise harmless, because the original post "failed" and no unsafe action was taken.)

> There's a bunch of other ways this could be factored, but a number of
> them involve dispatching to an application based on query string, or
> in some way where SCRIPT_NAME and PATH_INFO don't have any relation to
> the application at all.

And those other ways create UGLY urls, which enticed someone to create mod_rewrite to make them pretty. Search engines are getting better at making sense of that ugliness, but the URL space is still not very RESTful.

Keep in mind we are talking about the *container* doing the dispatching. Once the servlet is selected, it can do anything it wants with the PATH_INFO and query string, including forwards and includes and redirects. If the application wants to do crazy dispatching within its own URL space, that's fine, but the container shouldn't need to deal with that.

>
> So I'd say these should all be SHOULDs, not MUSTs. Or they should
> simply be put in as implementation recommendations.

That's what the Java people said sometime before Servlet Version 2.2.
But they tightened it up based on experience:

--------------SRV.10 (v.2.2)--------------
Previous versions of this specification have allowed servlet containers a great deal of flexibility in mapping client requests to servlets only defining a set of suggested mapping techniques. This specification *now requires* a set of mapping techniques to be used for web applications which are deployed via the Web Application Deployment mechanism. Just as it is highly recommended that servlet containers use the deployment representations as their runtime representation, it is highly recommended that they use these path mapping rules in their servers for all purposes and not just as part of deploying a web application.
--------------

--------------SRV.11 (v.2.4)--------------
The mapping techniques described in this chapter are *required* for Web containers mapping client requests to servlets. (Previous versions of this specification made use of these mapping techniques as a suggestion rather than a requirement, allowing servlet containers to each have their different schemes for mapping client requests to servlets.)
--------------

http://jdiworks.net/projects/servlet/SRV.11.html
http://jdiworks.net/projects/servlet/SRV.4.4.html

Let's learn from their experience.

---
Terrel Shumway
"That Web Guy Who Knows Marketing"
http://jdiworks.net/

From floydophone at gmail.com Sat Dec 4 05:12:30 2004
From: floydophone at gmail.com (Peter Hunt)
Date: Sat Dec 4 05:12:33 2004
Subject: [Web-SIG] WSGI-ISAPI
Message-ID: <6654eac404120320126b9d9456@mail.gmail.com>

After installing Python 2.4 and the latest Pythonwin, I discovered a cool new ISAPI module. If anyone wants to assist me with WSGI-ISAPI, I'd be glad for the help. I'm still at the "Hello, world" stage, but I think this (along with mod_python) will be a huge selling point for WSGI.

From jlowery at m2is.com Wed Dec 8 01:16:55 2004
From: jlowery at m2is.com (Jeff Lowery)
Date: Wed Dec 8 01:16:59 2004
Subject: [Web-SIG] Running python cgi script
Message-ID: <006f01c4dcbb$39b8e470$2600a8c0@Folderal>

I'm trying to get MoinMoin Wiki server installed on IIS 6.0, but am hung up on getting the moin.cgi script to execute. Yes, I have read and followed the directions in the INSTALL.html document, including:

1) appended the site-packages directory to sys.path
2) added virtual directory 'wiki', pointing to htdocs directory
3) added virtual directory 'mywiki', pointing to wiki instance directory
4) configured the virtual directory in IIS above to run "c:\python23\python.exe" -u %s %s on .cgi extensions
5) set Web Service Extensions to allow unknown cgi extensions

Added some logging statements to a log file at the top of moin.cgi, just to see if it was executing at all (runs fine from the command line, btw). Apparently not: getting a "CGI Error: The specified CGI application misbehaved by not returning a complete set of HTTP headers", and no log is generated.

Funny thing is that if I remove the "-u %s %s" from the cgi extension setup (4), I get a timeout error instead. Looks like IIS knows about the CGI mapping, but is not running the python interpreter.

Any ideas?

From amk at amk.ca Wed Dec 15 15:36:08 2004
From: amk at amk.ca (A.M. Kuchling)
Date: Wed Dec 15 15:36:53 2004
Subject: [Web-SIG] WSGI presentation
Message-ID: <20041215143608.GA8049@rogue.amk.ca>

A WSGI presentation at PyCon would probably be a good idea; anyone want to give one? (Proposal deadline is Dec. 31...)
--amk

From ianb at colorstudy.com Wed Dec 15 17:04:34 2004
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Dec 15 17:04:27 2004
Subject: [Web-SIG] WSGI presentation
In-Reply-To: <20041215143608.GA8049@rogue.amk.ca>
References: <20041215143608.GA8049@rogue.amk.ca>
Message-ID: <41C06092.8090504@colorstudy.com>

A.M. Kuchling wrote:
> A WSGI presentation at PyCon would probably be a good idea; anyone
> want to give one? (Proposal deadline is Dec. 31...)

I was planning on submitting something about WSGIKit, though mostly focused on WSGI and decomposing a framework into a set of WSGI middleware components.

-- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org